955 results for Open source information retrieval
Abstract:
The municipality of San Juan La Laguna, Guatemala, is home to approximately 5,200 people and is located on the western side of the Lake Atitlán caldera. Steep slopes surround all but the eastern side of San Juan. The Lake Atitlán watershed is susceptible to many natural hazards, but the most predictable are the landslides that can occur annually with each rainy season, especially during high-intensity events. Hurricane Stan hit Guatemala in October 2005; the resulting flooding and landslides devastated the Atitlán region. Locations of landslide and non-landslide points were obtained from field observations and orthophotos taken following Hurricane Stan. This study used data on multiple attributes at every landslide and non-landslide point and applied different multivariate analyses to optimize a model for landslide prediction during high-intensity precipitation events like Hurricane Stan. The attributes considered in this study are: geology, geomorphology, distance to faults and streams, land use, slope, aspect, curvature, plan curvature, profile curvature and topographic wetness index. The attributes were pre-evaluated for their ability to predict landslides using four different attribute evaluators, all available in the open source data mining software Weka: filtered subset, information gain, gain ratio and chi-squared. Three multivariate algorithms (decision tree J48, logistic regression and BayesNet) were optimized for landslide prediction using different attributes. The following statistical parameters were used to evaluate model accuracy: precision, recall, F-measure and area under the receiver operating characteristic (ROC) curve. The BayesNet algorithm yielded the most accurate model and was used to build a probability map of landslide initiation points. The probability map developed in this study was also compared to the results of a bivariate landslide susceptibility analysis conducted for the watershed encompassing Lake Atitlán and San Juan. Landslides from Tropical Storm Agatha (2010) were used to independently validate this study's multivariate model and the bivariate model. The ultimate aim of this study is to share the methodology and results with municipal contacts from the author's time as a U.S. Peace Corps volunteer, to facilitate more effective future landslide hazard planning and mitigation.
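The study itself carried out this workflow with Weka's Java tools; purely as an illustrative sketch, the Python snippet below mirrors the described steps with scikit-learn stand-ins (mutual information in place of the information-gain evaluator, a naive Bayes classifier in place of BayesNet) on hypothetical column and file names, reporting the same accuracy measures.

```python
# Rough Python analogue of the Weka workflow described above; not the study's
# actual implementation. The CSV file and column names are hypothetical.
import pandas as pd
from sklearn.feature_selection import mutual_info_classif
from sklearn.model_selection import cross_validate
from sklearn.naive_bayes import GaussianNB

df = pd.read_csv("landslide_points.csv")          # hypothetical export of the point data
X = df[["slope", "aspect", "curvature", "twi", "dist_faults", "dist_streams"]]
y = df["landslide"]                               # 1 = landslide point, 0 = non-landslide point

# Attribute pre-evaluation: rank attributes by information gain (mutual information here).
gain = mutual_info_classif(X, y, random_state=0)
for name, score in sorted(zip(X.columns, gain), key=lambda t: -t[1]):
    print(f"{name}: {score:.3f}")

# Evaluate a simple Bayesian classifier (stand-in for Weka's BayesNet) with the
# accuracy measures used in the study: precision, recall, F-measure, ROC AUC.
scores = cross_validate(GaussianNB(), X, y, cv=10,
                        scoring=["precision", "recall", "f1", "roc_auc"])
for metric in ["test_precision", "test_recall", "test_f1", "test_roc_auc"]:
    print(metric, scores[metric].mean().round(3))
```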
Abstract:
As more and more open-source software components become available on the internet, we need automatic ways to label and compare them. For example, a developer who searches for reusable software must be able to quickly gain an understanding of the retrieved components. This understanding cannot be gained at the level of source code due to the semantic gap between source code and the domain model. In this paper we present a lexical approach that uses the log-likelihood ratios of word frequencies to automatically provide labels for software components. We present a prototype implementation of our labeling/comparison algorithm and provide examples of its application. In particular, we apply the approach to detect trends in the evolution of a software system.
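The abstract does not spell out the exact computation; as a minimal sketch under that caveat, the snippet below scores each word by Dunning's log-likelihood ratio (G²) of its frequency in a component's vocabulary versus a reference corpus and keeps the top-scoring words as labels. The word lists are made-up placeholders.

```python
# Minimal sketch: score words by Dunning's log-likelihood ratio (G^2) of their
# frequency in one component's vocabulary versus a reference corpus, then use
# the top-scoring words as labels. Not the paper's exact procedure.
import math
from collections import Counter

def llr(k11, k12, k21, k22):
    """Dunning's G^2 statistic for a 2x2 contingency table of counts."""
    def h(*ks):                       # sum of k*log(k/N) over the non-zero counts
        total = sum(ks)
        return sum(k * math.log(k / total) for k in ks if k > 0)
    return 2 * (h(k11, k12, k21, k22) - h(k11 + k12, k21 + k22) - h(k11 + k21, k12 + k22))

def label_component(component_words, reference_words, top_n=10):
    comp, ref = Counter(component_words), Counter(reference_words)
    n_comp, n_ref = sum(comp.values()), sum(ref.values())
    scores = {
        w: llr(comp[w], n_comp - comp[w], ref.get(w, 0), n_ref - ref.get(w, 0))
        for w in comp
    }
    return sorted(scores, key=scores.get, reverse=True)[:top_n]

# Usage: words extracted from one component versus a large mixed-project corpus.
print(label_component(["parser", "token", "grammar", "token"],
                      ["file", "token", "socket", "thread", "file"]))
```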
Abstract:
This article describes the design, feature set, and practical experience gained with the open-source e-learning platform Stud.IP. For each individual course, the features include schedules, upload of term papers, discussion forums, personal homepages, chat rooms, and much more. The goal is to provide an infrastructure for teaching and learning that reflects the state of the art. Academic institutions also find a powerful environment for managing their staff, maintaining their web pages, and automatically generating course or staff lists. Operators can rely on a dependable support system that lets them take part in further development through the developer and operator community.
Abstract:
The aim of the web-based course “Advertising Psychology – The Blog Seminar” was to offer a contemporary teaching design using typical Web 2.0 features such as comments, discussions and social media integration, including Facebook and Twitter support, since these are now a common part of students’ everyday lives. This weblog (blog)-based seminar for Advertising Psychology was set up in order to make the course accessible to students from different campuses in the Ruhr metropolitan area. The technical realization with the open-source content management system Drupal 6.0 and the didactical course structure, based on Merrill’s five First Principles of Instruction, are introduced. To date, this blog seminar has been conducted three times with a total of 84 participants, who were asked to rate the course according to the benefits of its different didactical elements and with regard to Kirkpatrick’s levels-of-evaluation model. This model covers a) reactions such as reported enjoyment, perceived usefulness and perceived difficulty, and b) effects on learning through the subjectively reported increase in knowledge and attitude towards the seminar. Overall, the blog seminar was evaluated very positively and can be considered as providing support for achieving the learning objectives. However, a successful blended learning approach should always be tailored to the learning contents and the environment.
Abstract:
The usage of social media in leisure time settings has become a prominent research topic. However, less research has been done on the design of social media for collaboration settings. In this study, we investigate how social media can support asynchronous collaboration in virtual teams and, specifically, how they can increase activity awareness. On the basis of an open source social networking platform, we present two prototype designs: a standard platform with basic support for information processing, communication and process – as suggested by Zigurs and Buckland (1998) – and an advanced platform with additional support for activity awareness via special feed functions. We argue that the standard platform already conveys activity awareness to a certain extent, but that this awareness can be increased even further by the feeds in the advanced platform. Both prototypes are tested in a field experiment and evaluated with respect to their impact on perceived activity awareness, coordination and satisfaction. We show that the advanced design increases coordination and satisfaction through increased perceived activity awareness.
Abstract:
Web-scale knowledge retrieval can be enabled by distributed information retrieval, clustering Web clients into a large-scale computing infrastructure for knowledge discovery from Web documents. Based on this infrastructure, we propose to apply semiotic (i.e., sub-syntactical) and inductive (i.e., probabilistic) methods for inferring concept associations in human knowledge. These associations can be combined to form a fuzzy (i.e., gradual) semantic net representing a map of the knowledge in the Web. Thus, we propose to provide interactive visualizations of these cognitive concept maps to end users, who can browse and search the Web through a human-oriented, visual, and associative interface.
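The semiotic and inductive methods are only named, not specified, in the abstract; as one hedged illustration of a "gradual" association measure, the sketch below derives a fuzzy weight for each concept pair from document co-occurrence (a Jaccard-style conditional probability) and collects the weights into a simple semantic net. The documents and concepts are invented.

```python
# Illustrative sketch only: a gradual ("fuzzy") association weight between two
# concepts, estimated from document co-occurrence, collected into a weighted
# semantic net that a visual, associative interface could render.
from collections import defaultdict
from itertools import combinations

docs = [                        # hypothetical crawled Web documents, reduced to concept sets
    {"jaguar", "cat", "speed"},
    {"jaguar", "car", "speed"},
    {"cat", "pet"},
]

occurrence = defaultdict(int)       # concept -> number of documents containing it
cooccurrence = defaultdict(int)     # unordered pair -> number of documents containing both
for concepts in docs:
    for c in concepts:
        occurrence[c] += 1
    for a, b in combinations(sorted(concepts), 2):
        cooccurrence[(a, b)] += 1

def association(a, b):
    """Degree of association in [0, 1]: P(both appear | at least one appears)."""
    both = cooccurrence.get(tuple(sorted((a, b))), 0)
    either = occurrence[a] + occurrence[b] - both
    return both / either if either else 0.0

semantic_net = {pair: association(*pair) for pair in cooccurrence}
print(association("jaguar", "speed"))   # strong, gradual link
print(association("jaguar", "pet"))     # no observed link
```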
Abstract:
OBJECTIVE: To determine whether algorithms developed for the World Wide Web can be applied to the biomedical literature in order to identify articles that are important as well as relevant. DESIGN AND MEASUREMENTS: A direct comparison of eight algorithms: simple PubMed queries, clinical queries (sensitive and specific versions), vector cosine comparison, citation count, journal impact factor, PageRank, and machine learning based on polynomial support vector machines. The objective was to prioritize important articles, defined as being included in a pre-existing bibliography of important literature in surgical oncology. RESULTS: Citation-based algorithms were more effective than noncitation-based algorithms at identifying important articles. The most effective strategies were simple citation count and PageRank, which on average identified over six important articles in the first 100 results, compared to 0.85 for the best noncitation-based algorithm (p < 0.001). The authors saw similar differences between citation-based and noncitation-based algorithms at 10, 20, 50, 200, 500, and 1,000 results (p < 0.001). Citation lag affects the performance of PageRank more than that of simple citation count. However, in spite of citation lag, citation-based algorithms remain more effective than noncitation-based algorithms. CONCLUSION: Algorithms that have proved successful on the World Wide Web can be applied to biomedical information retrieval. Citation-based algorithms can help identify important articles within large sets of relevant results. Further studies are needed to determine whether citation-based algorithms can effectively meet actual user information needs.
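For the two best-performing strategies named above, the sketch below shows how a toy citation graph could be ranked by raw citation count and by PageRank (computed here with plain power iteration and damping 0.85); the articles and links are invented, and this is not the study's actual implementation.

```python
# Toy illustration of the two strongest strategies in the abstract: ranking
# articles by raw citation count and by PageRank over a citation graph.
def pagerank(links, damping=0.85, iters=50):
    """links: article -> list of articles it cites."""
    nodes = set(links) | {t for ts in links.values() for t in ts}
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iters):
        new = {n: (1.0 - damping) / len(nodes) for n in nodes}
        for src, targets in links.items():
            if targets:
                share = damping * rank[src] / len(targets)
                for t in targets:
                    new[t] += share
            else:                               # dangling article: spread its mass evenly
                for n in nodes:
                    new[n] += damping * rank[src] / len(nodes)
        rank = new
    return rank

citations = {"A": ["C"], "B": ["C", "D"], "C": ["D"], "D": []}   # hypothetical articles
citation_count = {n: sum(n in t for t in citations.values()) for n in set(citations)}
print(sorted(citation_count, key=citation_count.get, reverse=True))   # ranked by citation count
pr = pagerank(citations)
print(sorted(pr, key=pr.get, reverse=True))                           # ranked by PageRank
```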
Abstract:
Molecular beacons (MBs) are stem-loop DNA probes used for identifying and reporting the presence and localization of nucleic acid targets in vitro and in vivo via target-dependent dequenching of fluorescence. A drawback of conventional MB design lies in the stem sequence, which is necessary to keep the MBs in a closed conformation in the absence of a target but can participate in target binding in the open (target-on) conformation, giving rise to the possibility of false-positive results. In order to circumvent these problems, we designed MBs in which the stem was replaced by an orthogonal DNA analog that does not cross-pair with natural nucleic acids. Homo-DNA seemed especially suited, as it forms stable adenine-adenine base pairs of the reversed Hoogsteen type, potentially reducing the number of building blocks necessary for stem design to one. We found that MBs in which the stem part was replaced by homo-adenylate residues can easily be synthesized using conventional automated DNA synthesis. Like conventional MBs, such hybrid MBs show cooperative hairpin-to-coil transitions in the absence of a DNA target, indicating stable homo-DNA base pair formation in the closed conformation. Furthermore, our results show that the homo-adenylate stem is excluded from DNA target binding, which leads to a significant increase in target binding selectivity.
Abstract:
The biological effect of oxidatively damaged RNA, unlike that of oxidatively damaged DNA, has rarely been investigated, although it poses a threat to any living cell. Here we report on the effect of the commonly known RNA base lesions 8-oxo-rG, 8-oxo-rA, ε-rC, ε-rA, 5-HO-rC, 5-HO-rU and the RNA abasic site (rAS) on ribosomal translation. To this end we have developed an in vitro translation assay based on the mRNA display methodology. A short synthetic mRNA construct containing the base lesion in a predefined position of the open reading frame was ³²P-labeled at the 5′-end and equipped with a puromycin unit at the 3′-end. Upon in vitro translation in rabbit reticulocyte lysates, the encoded peptide chain is transferred to the puromycin unit and the products are analyzed by gel electrophoresis. Alternatively, the unlabeled mRNA construct was used and incubated with ³⁵S-methionine to confirm peptide elongation along the message. We find that all base lesions interfere substantially with ribosomal translation. We identified two classes: the first contains modifications at the base-coding edge (ε-rC, ε-rA and rAS), which completely abolish peptide synthesis at the site of modification; the second consists of 8-oxo-rG, 8-oxo-rA, 5-HO-rC and 5-HO-rU, which significantly retard full-length peptide synthesis, leading to some abortive peptides at the site of modification.
Abstract:
Previous studies on issue tracking systems for open source software (OSS) have focused mainly on requests for bug fixes. However, requests to add a new feature or an improvement to an OSS project are often also made in an issue tracking system. These inquiries are particularly important because they determine the further development of the software. This study examines whether requests from the IBM developer community and requests from other sources differ in their likelihood of successful implementation. Our study consists of a case study of the issue tracking system Bugzilla in the Eclipse integrated development environment (IDE). Our hypothesis that feature requests from outsiders have a lower chance of being implemented than feature requests from IBM developers was confirmed.
Abstract:
Early Employee Assistance Programs (EAPs) had their origin in humanitarian motives, and there was little concern for their cost/benefit ratios; however, as some programs began accumulating data and analyzing it over time, even with single variables such as absenteeism, it became apparent that the humanitarian reasons for a program could be reinforced by cost savings, particularly when the existence of the program was subject to justification. Today there is general agreement that cost/benefit analyses of EAPs are desirable, but the specific models for such analyses, particularly those making use of sophisticated but simple computer-based data management systems, are few. The purpose of this research and development project was to develop a method, a design, and a prototype for gathering, managing and presenting information about EAPs. This scheme provides information retrieval and analyses relevant to such aspects of EAP operations as: (1) EAP personnel activities, (2) Supervisory training effectiveness, (3) Client population demographics, (4) Assessment and referral effectiveness, (5) Treatment network efficacy, and (6) Economic worth of the EAP. This scheme has been implemented and made operational at The University of Texas Employee Assistance Programs for more than three years. Application of the scheme in the various programs has defined certain variables which remained necessary in all programs. Depending on the degree of aggressiveness for data acquisition maintained by program personnel, other program-specific variables are also defined.
Abstract:
The thesis represents the first part of a reference book on the Tertiary flora of Saxony. All taxa based on leaves of angiosperms and on Ginkgo are included in this compendium. After an overview of the geological state of knowledge on the Tertiary in Saxony, phytostratigraphic concepts are introduced and a historical survey of Tertiary palaeobotanical research in Saxony is given. All plant macrofossils published from the Saxonian Tertiary up to the end of 2013, together with their sites of discovery (primary data), were recorded. These data were supplemented with additional attributes and unified through project-based M.Sc. theses. Subsequently, taxa of fossil leaves were selected and their data evaluated and brought to a consistent state of research. Data sheets for 187 of the 235 examined taxa were established for a determination atlas. Macro- and micromorphological attributes are described in this atlas, and information is given on systematics, synonymy, palaeoecology, and spatial and temporal distribution. The descriptive part is illustrated by images and instructive drawings. In the results part, the documented data were surveyed and their quality discussed in relation to the literature. A bibliography of the extensive palaeobotanical literature on plant fossils of Saxony completes the work. The taxon- and locality-related data are implemented in an open source geographical information system (GIS) in order to visualize and manage them effectively. For the first time, the results of this thesis implemented in the GIS allow the generation of distribution maps for the leaf taxa of Tertiary angiosperms and Ginkgo in Saxony. Furthermore, it enables queries of topographical, geological and palaeobotanical information about the fossil sites. A determination key was developed for the fossil material that allows a rough determination of findings in the field. The compendium will be freely available in both a printed and a digital version.
Abstract:
Software evolution, and particularly its growth, has mainly been studied at the file level (also sometimes referred to as the module level). In this paper we propose to move from this physical level towards a level that includes semantic information, by using functions or methods as the unit for measuring the evolution of a software system. We point out that the use of function-based metrics has many advantages over the use of files or lines of code. We demonstrate our approach with an empirical study of two Free/Open Source projects: a community-driven project, Apache, and a company-led project, Novell Evolution. We discovered that most functions never change; when they do, the number of modifications is correlated with their size; and each function is modified by very few authors. Finally, we show that the departure of a developer from a software project slows the evolution of the functions that she authored.
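As a minimal sketch of what function-level metrics can look like, assuming the version history has already been mapped to the functions each commit touches, the snippet below counts modifications and distinct authors per function and checks a size-versus-change-count correlation of the kind reported above; all records, names and sizes are hypothetical.

```python
# Minimal sketch of function-level evolution metrics over hypothetical,
# pre-parsed history records (commit -> author and functions touched).
from collections import defaultdict
from statistics import correlation   # Pearson correlation, Python 3.10+

commits = [
    {"author": "alice", "functions": ["parse_msg", "send_mail"]},
    {"author": "alice", "functions": ["parse_msg"]},
    {"author": "bob",   "functions": ["render_ui"]},
]
size_in_loc = {"parse_msg": 120, "send_mail": 45, "render_ui": 300}   # lines of code per function

changes, authors = defaultdict(int), defaultdict(set)
for c in commits:
    for fn in c["functions"]:
        changes[fn] += 1
        authors[fn].add(c["author"])

for fn in size_in_loc:
    print(fn, "changes:", changes[fn], "distinct authors:", len(authors[fn]))

funcs = sorted(size_in_loc)
print("size vs. change count:",
      correlation([size_in_loc[f] for f in funcs], [changes[f] for f in funcs]))
```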
A repository for integration of software artifacts with dependency resolution and federation support
Abstract:
While developing new IT products, reusability of existing components is a key aspect that can considerably improve the success rate. This has become even more important with the rise of the open source paradigm. However, integrating different products and technologies is not always an easy task. Different communities employ different standards and tools, and most of the time it is not clear which dependencies a particular piece of software has. This is exacerbated by the transitive nature of these dependencies, making component integration a complicated affair. To help reduce this complexity we propose a model-based repository capable of automatically resolving the required dependencies. This repository needs to be expandable, so that new constraints can be analyzed, and also needs federation support for integration with other sources of artifacts. The solution we propose achieves this by working with OSGi components and using OSGi itself.
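The paper's repository builds on OSGi and its resolver; purely to illustrate the transitive resolution problem it automates, the standalone sketch below walks hypothetical declared dependencies depth-first and produces an install order or reports a missing artifact.

```python
# Rough illustration of transitive dependency resolution, the problem the
# proposed repository automates. The real solution works with OSGi bundles and
# OSGi's own resolver; this sketch only walks declared dependencies depth-first.
declared = {                        # hypothetical artifact metadata held by the repository
    "app":         ["web-ui", "persistence"],
    "web-ui":      ["http-client"],
    "persistence": ["http-client"],
    "http-client": [],
}

def resolve(artifact, declared, resolved=None, seen=None):
    resolved = [] if resolved is None else resolved
    seen = set() if seen is None else seen
    if artifact in seen:
        return resolved                     # already handled (also breaks cycles)
    seen.add(artifact)
    if artifact not in declared:
        raise KeyError(f"artifact {artifact!r} is not available in the repository")
    for dep in declared[artifact]:
        resolve(dep, declared, resolved, seen)
    resolved.append(artifact)               # dependencies first, then the artifact itself
    return resolved

print(resolve("app", declared))   # e.g. ['http-client', 'web-ui', 'persistence', 'app']
```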
Abstract:
ImageCLEF is a pilot experiment run at CLEF 2003 for cross-language image retrieval using textual captions related to image contents. In this paper, we describe the participation of the MIRACLE research team (Multilingual Information RetrievAl at CLEF), detailing the different experiments and discussing their preliminary results.
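The MIRACLE experiments themselves are only summarized here; as a generic, hedged illustration of retrieving images through their textual captions, the snippet below indexes invented captions with TF-IDF and ranks them by cosine similarity against a query assumed to have already been translated into the caption language.

```python
# Generic illustration of caption-based image retrieval (not MIRACLE's actual
# system): captions are indexed with TF-IDF and ranked by cosine similarity.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

captions = {                                   # hypothetical image captions
    "img001.jpg": "fishing boats in a small harbour at sunset",
    "img002.jpg": "a stone bridge crossing a river in the highlands",
    "img003.jpg": "children playing football on a sandy beach",
}

vectorizer = TfidfVectorizer()
matrix = vectorizer.fit_transform(captions.values())

def search(query, top_n=2):
    scores = cosine_similarity(vectorizer.transform([query]), matrix)[0]
    ranked = sorted(zip(captions, scores), key=lambda t: -t[1])
    return ranked[:top_n]

print(search("boats in the harbour"))          # query already translated to English
```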