21 results for Data stream mining


Relevance:

30.00%

Publisher:

Abstract:

We report a trace element and Pb isotope analysis (LIA) database on “Singen Copper”, a peculiar type of copper found in the North Alpine realm, from its type locality, the Early Bronze Age Singen Cemetery (Germany). What distinguishes “Singen Copper” from other coeval copper types? (i) Is it a discrete metal lot with a uniform provenance (and if so, can its provenance be constrained)? (ii) Was it manufactured by a special, unique metallurgical process that can be discriminated from others? Trace element concentrations can give clues to the ore types that were mined, but they can be modified (more or less intentionally) by metallurgical operations. More robust indicators are the ratios of chemically similar elements (e.g. Co/Ni, Bi/Sb, etc.), since they should remain nearly constant during metallurgical operations and are expected to behave homogeneously in each mineral of a given mining area, but their partition amongst the different mineral species is known to cause strong inter-element fractionations. We tested the trace element ratio pattern predicted by geochemical arguments on the Brixlegg mining area. Brixlegg itself is not compatible with the Singen Copper objects, and we only report it because it is a rare instance of a mining area for which sufficient trace element analyses are available in the literature. We observe that As/Sb in fahlerz varies by a factor of 1.8 above/below the median; As/Sb in enargite varies by a factor of 2.5 with a 10 times higher median. Most of the 102 analyzed metal objects from Singen are Sb-Ni-rich, corresponding to the “antimony-nickel copper” of the literature. Other trace element concentrations vary by factors > 100, ratios by factors > 50. Pb isotopic compositions are all significantly different from each other. They do not form a single linear array and require > 3 ore batches that certainly do not derive from one single mining area. Our data suggest a heterogeneous provenance of “Singen Copper”.
Archaeological information limits the scope to Central European sources. LIA requires a diverse supply network from many mining localities, including possibly Brittany. Trace element ratios show more heterogeneity than LIA; this can be explained either by deliberate selection of one particular ore mineral (from very many sources) or by processing of assorted ore minerals from a smaller number of sources, with the unintentional effect that the quality of the copper would not be constant, as the metallurgical properties of alloys would vary with trace element concentrations.

Relevance:

30.00%

Publisher:

Abstract:

Large amounts of animal health care data are present in veterinary electronic medical records (EMRs), and they present an opportunity for companion animal disease surveillance. Veterinary patient records are largely in free text without clinical coding or fixed vocabulary. Text mining, a computer and information technology application, is needed to identify cases of interest and to add structure to the otherwise unstructured data. In this study, EMRs were extracted from veterinary management programs of 12 participating veterinary practices and stored in a data warehouse. Using commercially available text-mining software (WordStat™), we developed a categorization dictionary that could be used to automatically classify and extract enteric syndrome cases from the warehoused electronic medical records. The diagnostic accuracy of the text-miner for retrieving cases of enteric syndrome was measured against human reviewers who independently categorized a random sample of 2500 cases as enteric syndrome positive or negative. Compared to the reviewers, the text-miner retrieved cases with enteric signs with a sensitivity of 87.6% (95% CI, 80.4-92.9%) and a specificity of 99.3% (95% CI, 98.9-99.6%). Automatic and accurate detection of enteric syndrome cases provides an opportunity for community surveillance of enteric pathogens in companion animals.
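
As a rough illustration of the evaluation described in this abstract, the sketch below classifies free-text records with a keyword dictionary and scores the result against reviewer labels. The terms in `ENTERIC_TERMS`, the sample records, and both function names are hypothetical; the abstract does not disclose the study's WordStat dictionary or its data.

```python
import re

# Hypothetical keyword dictionary; the study's actual dictionary is not given.
ENTERIC_TERMS = {"vomiting", "diarrhea", "diarrhoea", "gastroenteritis"}


def is_enteric(record: str) -> bool:
    """Flag a free-text record if it contains any dictionary term."""
    words = set(re.findall(r"[a-z]+", record.lower()))
    return bool(words & ENTERIC_TERMS)


def sensitivity_specificity(predicted, reference):
    """Compare predicted flags with reviewer labels (the reference standard)."""
    pairs = list(zip(predicted, reference))
    tp = sum(1 for p, r in pairs if p and r)
    tn = sum(1 for p, r in pairs if not p and not r)
    fp = sum(1 for p, r in pairs if p and not r)
    fn = sum(1 for p, r in pairs if not p and r)
    return tp / (tp + fn), tn / (tn + fp)


# Made-up records paired with made-up reviewer labels.
records = [
    ("Acute vomiting and diarrhea for two days", True),
    ("Annual vaccination, no concerns", False),
    ("Soft stool reported by owner", True),  # missed: no dictionary term matches
    ("Limping on left hind leg", False),
]
predicted = [is_enteric(text) for text, _ in records]
reference = [label for _, label in records]
sens, spec = sensitivity_specificity(predicted, reference)
print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
```

A plain keyword match like this ignores negation ("no vomiting") and misspellings, which is one reason a curated categorization dictionary needs validation against human reviewers.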

Relevance:

30.00%

Publisher:

Abstract:

Dynamically typed languages lack information about the types of variables in the source code. Developers care about this information as it supports program comprehension. Basic type inference techniques are helpful, but may yield many false positives or negatives. We propose to mine information from the software ecosystem on how frequently given types are inferred unambiguously to improve the quality of type inference for a single system. This paper presents an approach to augment existing type inference techniques by supplementing the information available in the source code of a project with data from other projects written in the same language. For all available projects, we track how often messages are sent to instance variables throughout the source code. Predictions for the type of a variable are made based on the messages sent to it. The evaluation of a proof-of-concept prototype shows that this approach works well for types that are sufficiently popular, like those from the standard libraries, and tends to create false positives for unpopular or domain-specific types. The false positives are, in most cases, fairly easily identifiable. Also, the evaluation data shows a substantial increase in the number of correctly inferred types when compared to the non-augmented type inference.
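
The idea of predicting a variable's type from the messages sent to it can be sketched as a frequency lookup. This is a minimal illustration under stated assumptions, not the paper's prototype: the selectors, type names, and counts in `observations` are invented, and summing raw counts is just one plausible scoring rule.

```python
from collections import Counter, defaultdict

# Hypothetical ecosystem data: each pair records a message selector that was
# sent to a variable whose type could be inferred unambiguously.
observations = [
    ("add:", "OrderedCollection"), ("add:", "OrderedCollection"), ("add:", "Set"),
    ("at:put:", "Dictionary"), ("at:put:", "Dictionary"), ("at:put:", "Array"),
    ("do:", "OrderedCollection"), ("do:", "Array"), ("do:", "Set"),
]


def build_index(observations):
    """Count, per selector, how often each receiver type was observed."""
    index = defaultdict(Counter)
    for selector, type_name in observations:
        index[selector][type_name] += 1
    return index


def predict_types(index, selectors):
    """Rank candidate types by summed counts over the messages sent to a variable."""
    scores = Counter()
    for selector in selectors:
        scores.update(index.get(selector, Counter()))
    return scores.most_common()


index = build_index(observations)
ranking = predict_types(index, ["add:", "do:"])
print(ranking)
```

Because popular types dominate the counts, a scheme like this favours them, which matches the abstract's remark that unpopular or domain-specific types tend to produce false positives.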

Relevance:

30.00%

Publisher:

Abstract:

Smart homes for the aging population have recently started attracting the attention of the research community. The "health state" of smart homes comprises many different levels; starting with the physical health of citizens, it also includes longer-term health norms and outcomes, as well as the arena of positive behavior changes. One of the problems of interest is to monitor the activities of daily living (ADL) of the elderly, aiming at their protection and well-being. For this purpose, we installed passive infrared (PIR) sensors to detect motion in a specific area inside a smart apartment and used them to collect a set of ADL. In a novel approach, we describe a technology that allows the ground truth collected in one smart home to train activity recognition systems for other smart homes. We asked the users to label all instances of all ADL only once and subsequently applied data mining techniques to cluster in-home sensor firings. Each cluster would therefore represent the instances of the same activity. Once the clusters were associated with their corresponding activities, our system was able to recognize future activities. To improve the activity recognition accuracy, our system preprocessed raw sensor data by identifying overlapping activities. To evaluate the recognition performance on a 200-day dataset, we implemented three different active learning classification algorithms and compared their performance: naive Bayesian (NB), support vector machine (SVM) and random forest (RF). Based on our results, the RF classifier recognized activities with an average specificity of 96.53%, a sensitivity of 68.49%, a precision of 74.41% and an F-measure of 71.33%, outperforming both the NB and SVM classifiers. Further clustering markedly improved the results of the RF classifier. An activity recognition system based on PIR sensors in conjunction with a clustering classification approach was able to detect ADL from datasets collected from different homes.
Thus, our PIR-based smart home technology could improve care and provide valuable information to better understand the functioning of our societies, as well as to inform both individual and collective action in a smart city scenario.
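
The reported F-measure can be checked against the reported precision and sensitivity, assuming it is the standard F1 score (the harmonic mean of precision and recall, with sensitivity taken as recall). The abstract does not state which F-measure variant was used, so that is an assumption of this sketch.

```python
def f_measure(precision: float, recall: float) -> float:
    """F1 score: harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Values reported for the RF classifier in the abstract.
precision = 0.7441
sensitivity = 0.6849  # sensitivity is the same quantity as recall

f1 = f_measure(precision, sensitivity)
print(f"F-measure: {f1 * 100:.2f}%")
```

Under that assumption the computed value rounds to the 71.33% given in the abstract, so the three reported figures are mutually consistent.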

Relevance:

30.00%

Publisher:

Abstract:

When genetic constraints restrict phenotypic evolution, diversification can be predicted to evolve along so-called lines of least resistance. To address the importance of such constraints and their resolution, studies of parallel phenotypic divergence in systems that differ in age are valuable. Here, we investigate the parapatric evolution of six lake and stream threespine stickleback systems from Iceland and Switzerland, ranging in age from a few decades to several millennia. Using phenotypic data, we test for parallelism in ecotypic divergence between parapatric lake and stream populations and compare the observed patterns to an ancestral-like marine population. We find strong and consistent phenotypic divergence, both among lake and stream populations and between our freshwater populations and the marine population. Interestingly, ecotypic divergence in low-dimensional phenotype space (i.e. single traits) is rapid and often seems to be completed within 100 years. Yet the dimensionality of ecotypic divergence was highest in our oldest systems, and only there was parallel evolution of unrelated ecotypes strong enough to overwrite phylogenetic contingency. Moreover, the dimensionality of divergence in different systems varies between trait complexes, suggesting different constraints and evolutionary pathways to their resolution among freshwater systems.