871 resultados para XML Mining
Resumo:
The use of XML for representation of eLearning-content and for automatic generation of different kinds of teaching media from this material is with all its advantages nowadays stateof-the-art. In the last years there have been numerous projects that leveraged XML-based production environments. At the end of the financial advancement the created materials have to be maintained with limited (human) resources. In the majority of cases this is only possible, if the authors care for their teaching material without extensive IT-support. From our point of view there has so far been a lack of intuitive usable XML editors. The prototype of such an XML editor “aXess” is introduced with the intention to encourage a broad discussion about the required features to manage the content of eLearning-materials.
Resumo:
One of the main roles of the Neural Open Markup Language, NeuroML, is to facilitate cooperation in building, simulating, testing and publishing models of channels, neurons and networks of neurons. MorphML, which was developed as a common format for exchange of neural morphology data, is distributed as part of NeuroML but can be used as a stand-alone application. In this collection of tutorials and workshop summary, we provide an overview of these XML schemas and provide examples of their use in down-stream applications. We also summarize plans for the further development of XML specifications for modeling channels, channel distributions, and network connectivity.
Resumo:
This paper is a report about the FuXML project carried out at the FernUniversität Hagen. FuXML is a Learning Content Management System (LCMS) aimed at providing a practical and efficient solution for the issues attributed to authoring, maintenance, production and distribution of online and offline distance learning material. The paper presents the environment for which the system was conceived and describes the technical realisation. We discuss the reasons for specific implementation decisions and also address the integration of the system within the organisational and technical infrastructure of the university.
Resumo:
Unterstützungssysteme für die Programmierausbildung sind weit verbreitet, doch gängige Standards für den Austausch von allgemeinen (Lern-) Inhalten und Tests erfüllen nicht die speziellen Anforderungen von Programmieraufgaben wie z. B. den Umgang mit komplexen Einreichungen aus mehreren Dateien oder die Kombination verschiedener (automatischer) Bewertungsverfahren. Dadurch können Aufgaben nicht zwischen Systemen ausgetauscht werden, was aufgrund des hohen Aufwands für die Entwicklung guter Aufgaben jedoch wünschenswert wäre. In diesem Beitrag wird ein erweiterbares XML-basiertes Format zum Austausch von Programmieraufgaben vorgestellt, das bereits von mehreren Systemen prototypisch genutzt wird. Die Spezifikation des Austauschformats ist online verfügbar [PFMA].
Resumo:
I work in the field of Armenian historiography. This means I get to play with medieval manuscripts. The things I'm doing with the manuscripts are theoretically interesting, but pretty boring in practice, so I'm using Perl to program away the most boring bits. I will talk about the problems of text criticism in general, what sorts of things can and can't be done by the computer, my initial aversion to XML, how I was shown (some of) the error of my ways, and how I'm combining a bunch of isolated pieces of technology that were mostly already in use to achieve fame and fortune in the world of Armenian studies.
Resumo:
Large amounts of animal health care data are present in veterinary electronic medical records (EMR) and they present an opportunity for companion animal disease surveillance. Veterinary patient records are largely in free-text without clinical coding or fixed vocabulary. Text-mining, a computer and information technology application, is needed to identify cases of interest and to add structure to the otherwise unstructured data. In this study EMR's were extracted from veterinary management programs of 12 participating veterinary practices and stored in a data warehouse. Using commercially available text-mining software (WordStat™), we developed a categorization dictionary that could be used to automatically classify and extract enteric syndrome cases from the warehoused electronic medical records. The diagnostic accuracy of the text-miner for retrieving cases of enteric syndrome was measured against human reviewers who independently categorized a random sample of 2500 cases as enteric syndrome positive or negative. Compared to the reviewers, the text-miner retrieved cases with enteric signs with a sensitivity of 87.6% (95%CI, 80.4-92.9%) and a specificity of 99.3% (95%CI, 98.9-99.6%). Automatic and accurate detection of enteric syndrome cases provides an opportunity for community surveillance of enteric pathogens in companion animals.
Resumo:
Detecting bugs as early as possible plays an important role in ensuring software quality before shipping. We argue that mining previous bug fixes can produce good knowledge about why bugs happen and how they are fixed. In this paper, we mine the change history of 717 open source projects to extract bug-fix patterns. We also manually inspect many of the bugs we found to get insights into the contexts and reasons behind those bugs. For instance, we found out that missing null checks and missing initializations are very recurrent and we believe that they can be automatically detected and fixed.
Resumo:
Dynamically typed languages lack information about the types of variables in the source code. Developers care about this information as it supports program comprehension. Ba- sic type inference techniques are helpful, but may yield many false positives or negatives. We propose to mine information from the software ecosys- tem on how frequently given types are inferred unambigu- ously to improve the quality of type inference for a single system. This paper presents an approach to augment existing type inference techniques by supplementing the informa- tion available in the source code of a project with data from other projects written in the same language. For all available projects, we track how often messages are sent to instance variables throughout the source code. Predictions for the type of a variable are made based on the messages sent to it. The evaluation of a proof-of-concept prototype shows that this approach works well for types that are sufficiently popular, like those from the standard librarie, and tends to create false positives for unpopular or domain specific types. The false positives are, in most cases, fairly easily identifiable. Also, the evaluation data shows a substantial increase in the number of correctly inferred types when compared to the non-augmented type inference.
Artisanal and small scale mining in Mongolia: Statistical overview based on survey data by suom 2012
Resumo:
Background Simple Sequence Repeats (SSRs) are widely used in population genetic studies but their classical development is costly and time-consuming. The ever-increasing available DNA datasets generated by high-throughput techniques offer an inexpensive alternative for SSRs discovery. Expressed Sequence Tags (ESTs) have been widely used as SSR source for plants of economic relevance but their application to non-model species is still modest. Methods Here, we explored the use of publicly available ESTs (GenBank at the National Center for Biotechnology Information-NCBI) for SSRs development in non-model plants, focusing on genera listed by the International Union for the Conservation of Nature (IUCN). We also search two model genera with fully annotated genomes for EST-SSRs, Arabidopsis and Oryza, and used them as controls for genome distribution analyses. Overall, we downloaded 16 031 555 sequences for 258 plant genera which were mined for SSRsand their primers with the help of QDD1. Genome distribution analyses in Oryza and Arabidopsis were done by blasting the sequences with SSR against the Oryza sativa and Arabidopsis thaliana reference genomes implemented in the Basal Local Alignment Tool (BLAST) of the NCBI website. Finally, we performed an empirical test to determine the performance of our EST-SSRs in a few individuals from four species of two eudicot genera, Trifolium and Centaurea. Results We explored a total of 14 498 726 EST sequences from the dbEST database (NCBI) in 257 plant genera from the IUCN Red List. We identify a very large number (17 102) of ready-to-test EST-SSRs in most plant genera (193) at no cost. Overall, dinucleotide and trinucleotide repeats were the prevalent types but the abundance of the various types of repeat differed between taxonomic groups. Control genomes revealed that trinucleotide repeats were mostly located in coding regions while dinucleotide repeats were largely associated with untranslated regions. Our results from the empirical test revealed considerable amplification success and transferability between congenerics. Conclusions The present work represents the first large-scale study developing SSRs by utilizing publicly accessible EST databases in threatened plants. Here we provide a very large number of ready-to-test EST-SSR (17 102) for 193 genera. The cross-species transferability suggests that the number of possible target species would be large. Since trinucleotide repeats are abundant and mainly linked to exons they might be useful in evolutionary and conservation studies. Altogether, our study highly supports the use of EST databases as an extremely affordable and fast alternative for SSR developing in threatened plants.
Resumo:
The rapid expansion of the mineral and metal mining sector in the past decade was accompanied by an increase in social conflicts. What are the impacts of large-scale mining operations? What are the strategies used by transnational corporations to gain access to underground resources and legitimize their activities? And how do local and indigenous communities confronted with mining react to, negotiate with and resist these activities? This book covers 13 case studies of copper, gold, uranium and other mining operations, situated in Latin America, Africa, Asia, Australia and Switzerland. With an extensive introduction to the subject and a systematic comparison across mining operations in different phases of development and social contexts, it serves as a primer and reference book for activists, students and researchers alike.