955 resultados para Open source information retrieval


Relevância:

100.00% 100.00%

Publicador:

Resumo:

In the article the theoretical aspects of planning of the systems of the controlled from distance diagnosing of level of know ledges of students are resulted on the basis of modern pedagogical theoretical and technological approaches. The practical results of creation of the systems of this type are resulted for organization of testing both in the structure of local networks of higher educational establishments and with access through the global network of Internet.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Доклад, поместен в сборника на Националната конференция "Образованието в информационното общество", Пловдив, май 2011 г.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The possibility to analyze, quantify and forecast epidemic outbreaks is fundamental when devising effective disease containment strategies. Policy makers are faced with the intricate task of drafting realistically implementable policies that strike a balance between risk management and cost. Two major techniques policy makers have at their disposal are: epidemic modeling and contact tracing. Models are used to forecast the evolution of the epidemic both globally and regionally, while contact tracing is used to reconstruct the chain of people who have been potentially infected, so that they can be tested, isolated and treated immediately. However, both techniques might provide limited information, especially during an already advanced crisis when the need for action is urgent. In this paper we propose an alternative approach that goes beyond epidemic modeling and contact tracing, and leverages behavioral data generated by mobile carrier networks to evaluate contagion risk on a per-user basis. The individual risk represents the loss incurred by not isolating or treating a specific person, both in terms of how likely it is for this person to spread the disease as well as how many secondary infections it will cause. To this aim, we develop a model, named Progmosis, which quantifies this risk based on movement and regional aggregated statistics about infection rates. We develop and release an open-source tool that calculates this risk based on cellular network events. We simulate a realistic epidemic scenarios, based on an Ebola virus outbreak; we find that gradually restricting the mobility of a subset of individuals reduces the number of infected people after 30 days by 24%.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

GitHub is the most popular repository for open source code (Finley 2011). It has more than 3.5 million users, as the company declared in April 2013, and more than 10 million repositories, as of December 2013. It has a publicly accessible API and, since March 2012, it also publishes a stream of all the events occurring on public projects. Interactions among GitHub users are of a complex nature and take place in different forms. Developers create and fork repositories, push code, approve code pushed by others, bookmark their favorite projects and follow other developers to keep track of their activities. In this paper we present a characterization of GitHub, as both a social network and a collaborative platform. To the best of our knowledge, this is the first quantitative study about the interactions happening on GitHub. We analyze the logs from the service over 18 months (between March 11, 2012 and September 11, 2013), describing 183.54 million events and we obtain information about 2.19 million users and 5.68 million repositories, both growing linearly in time. We show that the distributions of the number of contributors per project, watchers per project and followers per user show a power-law-like shape. We analyze social ties and repository-mediated collaboration patterns, and we observe a remarkably low level of reciprocity of the social connections. We also measure the activity of each user in terms of authored events and we observe that very active users do not necessarily have a large number of followers. Finally, we provide a geographic characterization of the centers of activity and we investigate how distance influences collaboration.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Report published in the Proceedings of the National Conference on "Education and Research in the Information Society", Plovdiv, May, 2015

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The Digital Observatory for Protected Areas (DOPA) has been developed to support the European Union’s efforts in strengthening our capacity to mobilize and use biodiversity data, information and forecasts so that they are readily accessible to policymakers, managers, experts and other users. Conceived as a set of web based services, DOPA provides a broad set of free and open source tools to assess, monitor and even forecast the state of and pressure on protected areas at local, regional and global scale. DOPA Explorer 1.0 is a web based interface available in four languages (EN, FR, ES, PT) providing simple means to explore the nearly 16,000 protected areas that are at least as large as 100 km2. Distinguishing between terrestrial, marine and mixed protected areas, DOPA Explorer 1.0 can help end users to identify those with most unique ecosystems and species, and assess the pressures they are exposed to because of human development. Recognized by the UN Convention on Biological Diversity (CBD) as a reference information system, DOPA Explorer is based on the best global data sets available and provides means to rank protected areas at the country and ecoregion levels. Inversely, DOPA Explorer indirectly highlights the protected areas for which information is incomplete. We finally invite the end-users of DOPA to engage with us through the proposed communication platforms to help improve our work to support the safeguarding of biodiversity.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The miniaturization, sophistication, proliferation, and accessibility of technologies are enabling the capture of more and previously inaccessible phenomena in Parkinson's disease (PD). However, more information has not translated into a greater understanding of disease complexity to satisfy diagnostic and therapeutic needs. Challenges include noncompatible technology platforms, the need for wide-scale and long-term deployment of sensor technology (among vulnerable elderly patients in particular), and the gap between the "big data" acquired with sensitive measurement technologies and their limited clinical application. Major opportunities could be realized if new technologies are developed as part of open-source and/or open-hardware platforms that enable multichannel data capture sensitive to the broad range of motor and nonmotor problems that characterize PD and are adaptable into self-adjusting, individualized treatment delivery systems. The International Parkinson and Movement Disorders Society Task Force on Technology is entrusted to convene engineers, clinicians, researchers, and patients to promote the development of integrated measurement and closed-loop therapeutic systems with high patient adherence that also serve to (1) encourage the adoption of clinico-pathophysiologic phenotyping and early detection of critical disease milestones, (2) enhance the tailoring of symptomatic therapy, (3) improve subgroup targeting of patients for future testing of disease-modifying treatments, and (4) identify objective biomarkers to improve the longitudinal tracking of impairments in clinical care and research. This article summarizes the work carried out by the task force toward identifying challenges and opportunities in the development of technologies with potential for improving the clinical management and the quality of life of individuals with PD. © 2016 International Parkinson and Movement Disorder Society.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Good estimates of ecosystem complexity are essential for a number of ecological tasks: from biodiversity estimation, to forest structure variable retrieval, to feature extraction by edge detection and generation of multifractal surface as neutral models for e.g. feature change assessment. Hence, measuring ecological complexity over space becomes crucial in macroecology and geography. Many geospatial tools have been advocated in spatial ecology to estimate ecosystem complexity and its changes over space and time. Among these tools, free and open source options especially offer opportunities to guarantee the robustness of algorithms and reproducibility. In this paper we will summarize the most straightforward measures of spatial complexity available in the Free and Open Source Software GRASS GIS, relating them to key ecological patterns and processes.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Starting from the Schumpeterian producer-driven understanding of innovation, followed by user-generated solutions and understanding of collaborative forms of co-creation, scholars investigated the drivers and the nature of interactions underpinning success in various ways. Innovation literature has gone a long way, where open innovation has attracted researchers to investigate problems like compatibilities of external resources, networks of innovation, or open source collaboration. Openness itself has gained various shades in the different strands of literature. In this paper the author provides with an overview and a draft evaluation of the different models of open innovation, illustrated with some empirical findings from various fields drawn from the literature. She points to the relevance of transaction costs affecting viable forms of (open) innovation strategies of firms, and the importance to define the locus of innovation for further analyses of different firm and interaction level formations.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Methods for accessing data on the Web have been the focus of active research over the past few years. In this thesis we propose a method for representing Web sites as data sources. We designed a Data Extractor data retrieval solution that allows us to define queries to Web sites and process resulting data sets. Data Extractor is being integrated into the MSemODB heterogeneous database management system. With its help database queries can be distributed over both local and Web data sources within MSemODB framework. ^ Data Extractor treats Web sites as data sources, controlling query execution and data retrieval. It works as an intermediary between the applications and the sites. Data Extractor utilizes a twofold “custom wrapper” approach for information retrieval. Wrappers for the majority of sites are easily built using a powerful and expressive scripting language, while complex cases are processed using Java-based wrappers that utilize specially designed library of data retrieval, parsing and Web access routines. In addition to wrapper development we thoroughly investigate issues associated with Web site selection, analysis and processing. ^ Data Extractor is designed to act as a data retrieval server, as well as an embedded data retrieval solution. We also use it to create mobile agents that are shipped over the Internet to the client's computer to perform data retrieval on behalf of the user. This approach allows Data Extractor to distribute and scale well. ^ This study confirms feasibility of building custom wrappers for Web sites. This approach provides accuracy of data retrieval, and power and flexibility in handling of complex cases. ^

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Background As the use of electronic health records (EHRs) becomes more widespread, so does the need to search and provide effective information discovery within them. Querying by keyword has emerged as one of the most effective paradigms for searching. Most work in this area is based on traditional Information Retrieval (IR) techniques, where each document is compared individually against the query. We compare the effectiveness of two fundamentally different techniques for keyword search of EHRs. Methods We built two ranking systems. The traditional BM25 system exploits the EHRs' content without regard to association among entities within. The Clinical ObjectRank (CO) system exploits the entities' associations in EHRs using an authority-flow algorithm to discover the most relevant entities. BM25 and CO were deployed on an EHR dataset of the cardiovascular division of Miami Children's Hospital. Using sequences of keywords as queries, sensitivity and specificity were measured by two physicians for a set of 11 queries related to congenital cardiac disease. Results Our pilot evaluation showed that CO outperforms BM25 in terms of sensitivity (65% vs. 38%) by 71% on average, while maintaining the specificity (64% vs. 61%). The evaluation was done by two physicians. Conclusions Authority-flow techniques can greatly improve the detection of relevant information in EHRs and hence deserve further study.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The increasing amount of available semistructured data demands efficient mechanisms to store, process, and search an enormous corpus of data to encourage its global adoption. Current techniques to store semistructured documents either map them to relational databases, or use a combination of flat files and indexes. These two approaches result in a mismatch between the tree-structure of semistructured data and the access characteristics of the underlying storage devices. Furthermore, the inefficiency of XML parsing methods has slowed down the large-scale adoption of XML into actual system implementations. The recent development of lazy parsing techniques is a major step towards improving this situation, but lazy parsers still have significant drawbacks that undermine the massive adoption of XML. Once the processing (storage and parsing) issues for semistructured data have been addressed, another key challenge to leverage semistructured data is to perform effective information discovery on such data. Previous works have addressed this problem in a generic (i.e. domain independent) way, but this process can be improved if knowledge about the specific domain is taken into consideration. This dissertation had two general goals: The first goal was to devise novel techniques to efficiently store and process semistructured documents. This goal had two specific aims: We proposed a method for storing semistructured documents that maps the physical characteristics of the documents to the geometrical layout of hard drives. We developed a Double-Lazy Parser for semistructured documents which introduces lazy behavior in both the pre-parsing and progressive parsing phases of the standard Document Object Model's parsing mechanism. The second goal was to construct a user-friendly and efficient engine for performing Information Discovery over domain-specific semistructured documents. This goal also had two aims: We presented a framework that exploits the domain-specific knowledge to improve the quality of the information discovery process by incorporating domain ontologies. We also proposed meaningful evaluation metrics to compare the results of search systems over semistructured documents.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Methods for accessing data on the Web have been the focus of active research over the past few years. In this thesis we propose a method for representing Web sites as data sources. We designed a Data Extractor data retrieval solution that allows us to define queries to Web sites and process resulting data sets. Data Extractor is being integrated into the MSemODB heterogeneous database management system. With its help database queries can be distributed over both local and Web data sources within MSemODB framework. Data Extractor treats Web sites as data sources, controlling query execution and data retrieval. It works as an intermediary between the applications and the sites. Data Extractor utilizes a two-fold "custom wrapper" approach for information retrieval. Wrappers for the majority of sites are easily built using a powerful and expressive scripting language, while complex cases are processed using Java-based wrappers that utilize specially designed library of data retrieval, parsing and Web access routines. In addition to wrapper development we thoroughly investigate issues associated with Web site selection, analysis and processing. Data Extractor is designed to act as a data retrieval server, as well as an embedded data retrieval solution. We also use it to create mobile agents that are shipped over the Internet to the client's computer to perform data retrieval on behalf of the user. This approach allows Data Extractor to distribute and scale well. This study confirms feasibility of building custom wrappers for Web sites. This approach provides accuracy of data retrieval, and power and flexibility in handling of complex cases.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The main objective of this work was to enable the recognition of human gestures through the development of a computer program. The program created captures the gestures executed by the user through a camera attached to the computer and sends it to the robot command referring to the gesture. They were interpreted in total ve gestures made by human hand. The software (developed in C ++) widely used the computer vision concepts and open source library OpenCV that directly impact the overall e ciency of the control of mobile robots. The computer vision concepts take into account the use of lters to smooth/blur the image noise reduction, color space to better suit the developer's desktop as well as useful information for manipulating digital images. The OpenCV library was essential in creating the project because it was possible to use various functions/procedures for complete control lters, image borders, image area, the geometric center of borders, exchange of color spaces, convex hull and convexity defect, plus all the necessary means for the characterization of imaged features. During the development of the software was the appearance of several problems, as false positives (noise), underperforming the insertion of various lters with sizes oversized masks, as well as problems arising from the choice of color space for processing human skin tones. However, after the development of seven versions of the control software, it was possible to minimize the occurrence of false positives due to a better use of lters combined with a well-dimensioned mask size (tested at run time) all associated with a programming logic that has been perfected over the construction of the seven versions. After all the development is managed software that met the established requirements. After the completion of the control software, it was observed that the overall e ectiveness of the various programs, highlighting in particular the V programs: 84.75 %, with VI: 93.00 % and VII with: 94.67 % showed that the nal program performed well in interpreting gestures, proving that it was possible the mobile robot control through human gestures without the need for external accessories to give it a better mobility and cost savings for maintain such a system. The great merit of the program was to assist capacity in demystifying the man set/machine therefore uses an easy and intuitive interface for control of mobile robots. Another important feature observed is that to control the mobile robot is not necessary to be close to the same, as to control the equipment is necessary to receive only the address that the Robotino passes to the program via network or Wi-Fi.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The overall aim of our research is to develop a clinical information retrieval system that retrieves systematic reviews and underlying clinical studies from the Cochrane Library to support physician decision making. We believe that in order to accomplish this goal we need to develop a mechanism for effectively representing documents that will be retrieved by the application. Therefore, as a first step in developing the retrieval application we have developed a methodology that semi-automatically generates high quality indices and applies them as descriptors to documents from The Cochrane Library. In this paper we present a description and implementation of the automatic indexing methodology and an evaluation that demonstrates that enhanced document representation results in the retrieval of relevant documents for clinical queries. We argue that the evaluation of information retrieval applications should also include an evaluation of the quality of the representation of documents that may be retrieved. ©2010 IEEE.