920 results for Hierarchical multi-label classification


Relevance:

30.00%

Abstract:

Rare species have restricted geographic ranges, habitat specialization, and/or small population sizes. Datasets on rare species distribution usually have few observations, limited spatial accuracy and lack of valid absences; conversely they provide comprehensive views of species distributions, making it possible to realistically capture most of their realized environmental niche. Rare species are the most in need of predictive distribution modelling but also the most difficult to model. We refer to this contrast as the "rare species modelling paradox" and propose as a solution developing modelling approaches that deal with a sufficiently large set of predictors while ensuring that statistical models are not overfitted. Our novel approach fulfils this condition by fitting a large number of bivariate models and averaging them with a weighted ensemble approach. We further propose that this ensemble forecasting be conducted within a hierarchical multi-scale framework. We present two ensemble models for a test species, one at regional and one at local scale, each based on the combination of 630 models. In both cases, we obtained excellent spatial projections, unusual when modelling rare species. Model results highlight, from a statistically sound approach, the effects of multiple drivers within a single modelling framework and at two distinct scales. From this added information, regional models can support accurate forecasts of range dynamics under climate change scenarios, whereas local models allow the assessment of isolated or synergistic impacts of changes in multiple predictors. This novel framework provides a baseline for adaptive conservation, management and monitoring of rare species at distinct spatial and temporal scales.
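
The weighted averaging of many bivariate models described in the abstract above might be sketched as follows. This is only an illustration under stated assumptions: the predictor names, the placeholder scores, and the choice to weight each model by a validation score are hypothetical, since the abstract names neither the predictors nor the weighting scheme. As a side note, 630 equals the number of unordered pairs that can be drawn from 36 items, although the abstract does not say how its 630 models were obtained.

```python
from itertools import combinations


def weighted_ensemble(predictions, weights):
    """Weighted average of per-model predictions for a single site."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(p * w for p, w in zip(predictions, weights)) / total


# Hypothetical predictor names; the abstract does not list them.
predictors = ["temperature", "precipitation", "slope", "soil_ph"]

# One bivariate model per unordered pair of predictors.
pairs = list(combinations(predictors, 2))

# Placeholder outputs per model: (predicted suitability, validation score).
# In a real workflow each pair would get its own fitted model.
toy_outputs = {
    pair: (0.5 + 0.05 * i, 0.6 + 0.05 * i) for i, pair in enumerate(pairs)
}

suitability = weighted_ensemble(
    [p for p, _ in toy_outputs.values()],
    [w for _, w in toy_outputs.values()],
)

# Number of unordered pairs among 36 items (arithmetic check only).
n_pairs_36 = len(list(combinations(range(36), 2)))
```

Keeping each model to two predictors limits the number of fitted parameters per model, which is how the approach is said to avoid overfitting when observations are few.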

Relevance:

30.00%

Abstract:

Automatic classification of makams from symbolic data is a rarely studied topic. In this paper, a review of an n-gram based approach is first presented using various representations of the symbolic data. While a high degree of precision can be obtained, confusion happens mainly for makams that use (almost) the same scale and pitch hierarchy but differ in overall melodic progression (seyir). To further improve the system, n-gram based classification is first tested on various sections of the piece, to take into account a feature of the seyir: the melodic progression starts in a certain region of the scale. In a second test, a hierarchical classification structure is designed which uses n-grams and seyir features at different levels to further improve the system.
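
An n-gram based classifier of symbolic sequences, as mentioned in the abstract above, could look roughly like the sketch below. It is not the authors' system: the use of bigram count profiles compared by cosine similarity, the toy note sequences, and the class names `class_a` and `class_b` are all assumptions made for illustration.

```python
from collections import Counter
from math import sqrt


def ngrams(seq, n=2):
    """Count the n-grams (as tuples) occurring in a symbol sequence."""
    return Counter(tuple(seq[i:i + n]) for i in range(len(seq) - n + 1))


def cosine(a, b):
    """Cosine similarity between two n-gram count profiles."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def train(pieces_by_class, n=2):
    """Build one aggregated n-gram profile per class."""
    profiles = {}
    for label, pieces in pieces_by_class.items():
        profile = Counter()
        for piece in pieces:
            profile.update(ngrams(piece, n))
        profiles[label] = profile
    return profiles


def classify(piece, profiles, n=2):
    """Return the class whose profile is most similar to the piece."""
    grams = ngrams(piece, n)
    return max(profiles, key=lambda label: cosine(grams, profiles[label]))


# Toy symbolic data with hypothetical class names.
training = {
    "class_a": [["C", "D", "E", "D", "C"], ["C", "D", "E", "F", "E"]],
    "class_b": [["G", "F", "E", "F", "G"], ["G", "F", "G", "A", "G"]],
}
profiles = train(training)
predicted = classify(["C", "D", "E", "D"], profiles)
```

Restricting `piece` to a slice, such as its opening section, would be one simple way to test section-wise classification of the kind the abstract describes.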

Relevance:

30.00%

Abstract:

The potential of type-2 fuzzy sets for managing high levels of uncertainty in the subjective knowledge of experts or in numerical information has been a focus of work on control and pattern classification systems in recent years. One of the main challenges in designing a type-2 fuzzy logic system is how to estimate the parameters of the type-2 fuzzy membership function (T2MF) and the Footprint of Uncertainty (FOU) from imperfect and noisy datasets. This paper presents an automatic approach for learning and tuning Gaussian interval type-2 membership functions (IT2MFs) with application to multi-dimensional pattern classification problems. T2MFs and their FOUs are tuned according to the uncertainties in the training dataset by a combination of genetic algorithm (GA) and cross-validation techniques. In our GA-based approach, the structure of the chromosome has fewer genes than other GA methods and chromosome initialization is more precise. The proposed approach addresses the application of the interval type-2 fuzzy logic system (IT2FLS) for the problem of nodule classification in a lung Computer Aided Detection (CAD) system. The designed IT2FLS is compared with its type-1 fuzzy logic system (T1FLS) counterpart. The results demonstrate that the IT2FLS outperforms the T1FLS by more than 30% in terms of classification accuracy.

Relevance:

30.00%

Abstract:

The taxonomy of Bambusoideae is in a state of flux and phylogenetic studies are required to help resolve systematic issues. Over 60 taxa, representing all subtribes of Bambuseae and related non-bambusoid grasses were sampled. A combined analysis of five plastid DNA regions, trnL intron, trnL-F intergenic spacer, atpB-rbcL intergenic spacer, rps16 intron, and matK, was used to study the phylogenetic relationships among the bamboos in general and the woody bamboos in particular. Within the BEP clade (Bambusoideae s.s., Ehrhartoideae, Pooideae), Pooideae were resolved as sister to Bambusoideae s.s. Tribe Bambuseae, the woody bamboos, as currently recognized were not monophyletic because Olyreae, the herbaceous bamboos, were sister to tropical Bambuseae. Temperate Bambuseae were sister to the group consisting of tropical Bambuseae and Olyreae. Thus, the temperate Bambuseae would be better treated as their own tribe Arundinarieae than as a subgroup of Bambuseae. Within the tropical Bambuseae, neotropical Bambuseae were sister to the palaeotropical and Austral Bambuseae. In addition, Melocanninae were found to be sister to the remaining palaeotropical and Austral Bambuseae. We discuss phylogenetic and morphological patterns of diversification and interpret them in a biogeographic context.

Relevance:

30.00%

Abstract:

BACKGROUND: The majority of Haemosporida species infect birds or reptiles, but many important genera, including Plasmodium, infect mammals. Dipteran vectors shared by avian, reptilian and mammalian Haemosporida suggest multiple invasions of Mammalia during haemosporidian evolution; yet, phylogenetic analyses have detected only a single invasion event. Until now, several important mammal-infecting genera have been absent in these analyses. This study focuses on the evolutionary origin of Polychromophilus, a unique malaria genus that only infects bats (Microchiroptera) and is transmitted by bat flies (Nycteribiidae). METHODS: Two species of Polychromophilus were obtained from wild bats caught in Switzerland. These were molecularly characterized using four genes (asl, clpc, coI, cytb) from the three different genomes (nucleus, apicoplast, mitochondrion). These data were then combined with data of 60 taxa of Haemosporida available in GenBank. Bayesian inference, maximum likelihood and a range of rooting methods were used to test specific hypotheses concerning the phylogenetic relationships between Polychromophilus and the other haemosporidian genera. RESULTS: The Polychromophilus melanipherus and Polychromophilus murinus samples show genetically distinct patterns and group according to species. The Bayesian tree topology suggests that the monophyletic clade of Polychromophilus falls within the avian/saurian clade of Plasmodium and directed hypothesis testing confirms the Plasmodium origin. CONCLUSION: Polychromophilus' ancestor was most likely a bird- or reptile-infecting Plasmodium before it switched to bats. The invasion of mammals as hosts has, therefore, not been a unique event in the evolutionary history of Haemosporida, despite the suspected costs of adapting to a new host. This was, moreover, accompanied by a switch in dipteran host.

Relevance:

30.00%

Abstract:

SUMMARY: A top scoring pair (TSP) classifier consists of a pair of variables whose relative ordering can be used for accurately predicting the class label of a sample. This classification rule has the advantage of being easily interpretable and more robust against technical variations in data, such as those due to different microarray platforms. Here we describe a parallel implementation of this classifier which significantly reduces the training time, and a number of extensions, including a multi-class approach, which has the potential of improving the classification performance. AVAILABILITY AND IMPLEMENTATION: Full C++ source code and R package Rgtsp are freely available from http://lausanne.isb-sib.ch/~vpopovic/research/. The implementation relies on existing OpenMP libraries.
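
The top scoring pair rule summarized above can be illustrated with a minimal, serial, two-class sketch. It is not the parallel C++ implementation or the Rgtsp API: the function names `fit_tsp` and `predict_tsp`, the 0/1 label encoding, and the toy data are assumptions. Each pair of variables is scored by how much the probability of the ordering `x[i] < x[j]` differs between the two classes, and the pair with the largest difference is kept.

```python
from itertools import combinations


def fit_tsp(samples, labels):
    """Find the top scoring pair for binary labels (0 or 1).

    Returns (i, j, class_if_less): predict class_if_less when
    sample[i] < sample[j], and the other class otherwise.
    Both classes must be present in the training data.
    """
    groups = {
        c: [s for s, y in zip(samples, labels) if y == c] for c in (0, 1)
    }
    if not groups[0] or not groups[1]:
        raise ValueError("both classes must have at least one sample")

    best = None
    for i, j in combinations(range(len(samples[0])), 2):
        # Per-class frequency of the ordering x[i] < x[j].
        p = {
            c: sum(s[i] < s[j] for s in groups[c]) / len(groups[c])
            for c in (0, 1)
        }
        score = abs(p[0] - p[1])
        if best is None or score > best[0]:
            best = (score, i, j, 0 if p[0] > p[1] else 1)

    _, i, j, class_if_less = best
    return i, j, class_if_less


def predict_tsp(model, sample):
    """Apply the ordering rule of a fitted pair to one sample."""
    i, j, class_if_less = model
    return class_if_less if sample[i] < sample[j] else 1 - class_if_less


# Toy data: three variables, three samples per class.
X = [[1, 5, 3], [2, 6, 1], [0, 4, 9], [5, 1, 3], [7, 2, 8], [6, 0, 2]]
y = [0, 0, 0, 1, 1, 1]
model = fit_tsp(X, y)
```

Because the rule depends only on the ordering of two values within a sample, any monotone rescaling applied to a whole sample leaves the prediction unchanged, which is consistent with the robustness claim in the abstract.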

Relevance:

30.00%

Abstract:

MOTIVATION: Analysis of millions of pyro-sequences is currently playing a crucial role in the advance of environmental microbiology. Taxonomy-independent, i.e. unsupervised, clustering of these sequences is essential for the definition of Operational Taxonomic Units. For this application, reproducibility and robustness should be the most sought after qualities, but have thus far largely been overlooked. RESULTS: More than 1 million hyper-variable internal transcribed spacer 1 (ITS1) sequences of fungal origin have been analyzed. The ITS1 sequences were first properly extracted from 454 reads using generalized profiles. Then, otupipe, cd-hit-454, ESPRIT-Tree and DBC454, a new algorithm presented here, were used to analyze the sequences. A numerical assay was developed to measure the reproducibility and robustness of these algorithms. DBC454 was the most robust, closely followed by ESPRIT-Tree. DBC454 features density-based hierarchical clustering, which complements the other methods by providing insights into the structure of the data. AVAILABILITY: An executable is freely available for non-commercial users at ftp://ftp.vital-it.ch/tools/dbc454. It is designed to run under MPI on a cluster of 64-bit Linux machines running Red Hat 4.x, or on a multi-core OSX system. CONTACT: dbc454@vital-it.ch or nicolas.guex@isb-sib.ch.

Relevance:

30.00%

Abstract:

In this study we propose an evaluation of the angular effects altering the spectral response of the land-cover over multi-angle remote sensing image acquisitions. The shift in the statistical distribution of the pixels observed in an in-track sequence of WorldView-2 images is analyzed by means of a kernel-based measure of distance between probability distributions. Afterwards, the portability of supervised classifiers across the sequence is investigated by looking at the evolution of the classification accuracy with respect to the changing observation angle. In this context, the efficiency of various physically and statistically based preprocessing methods in obtaining angle-invariant data spaces is compared and possible synergies are discussed.
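
The abstract above does not name its kernel-based distance between probability distributions. One widely used measure of this kind is the maximum mean discrepancy (MMD), sketched below purely as an assumption about what such a comparison could look like. The RBF kernel, the `gamma` value, and the toy two-band pixel vectors are illustrative choices, not details from the study.

```python
from math import exp


def rbf(x, y, gamma=1.0):
    """Gaussian (RBF) kernel between two feature vectors."""
    return exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def mmd2(xs, ys, gamma=1.0):
    """Biased estimate of the squared MMD between two samples."""
    def mean_kernel(a, b):
        return sum(rbf(u, v, gamma) for u in a for v in b) / (len(a) * len(b))

    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)


# Toy pixel vectors (two hypothetical spectral bands) for a reference
# acquisition, and the same pixels under a small and a large shift.
reference = [(0.10, 0.20), (0.20, 0.10), (0.15, 0.25)]
small_shift = [(a + 0.05, b + 0.05) for a, b in reference]
large_shift = [(a + 1.0, b + 1.0) for a, b in reference]

d_small = mmd2(reference, small_shift)
d_large = mmd2(reference, large_shift)
```

Computing such a value for each image in a sequence against a reference image gives one number per acquisition, which can then be compared with classification accuracy as the observation angle changes.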

Relevance:

30.00%

Abstract:

Focused initially on formalism and methods, this thesis is built on three formalized concepts: a contingency table, a matrix of Euclidean dissimilarities, and an exchange matrix. From these, several data analysis and machine learning methods are expressed and developed: correspondence analysis (CA), viewed as a particular case of multidimensional scaling; supervised or unsupervised classification, combined with Schoenberg transformations; and autocorrelation and cross-autocorrelation indices, adapted to multivariate analyses and allowing various families of neighbourhoods to be considered. These methods then lead to an exploratory analysis of various textual and musical data. For the textual data, we consider the unsupervised classification (clustering) of uttered clauses into discourse types, based on the morphosyntactic categories (MSCs) they contain. Although the statistical link between MSCs and discourse types is confirmed, the clustering results obtained with the K-means method combined with a Schoenberg transformation, as well as with a fuzzy variant of the K-means algorithm, are more difficult to interpret. We also address the supervised multi-label classification of speaking turns into dialogue acts, again based on the MSCs they contain, but also on the lemmas and the meaning of the verbs. The results obtained through discriminant analysis combined with a Schoenberg transformation are promising. Finally, we examine textual autocorrelation from the angle of similarities between various positions of a text, regarded as a sequence of units. In particular, the alternation of word lengths in a text is observed for neighbourhoods of variable span. We also study similarities according to the presence or absence of certain parts of speech, as well as the semantic similarities between the various positions of a text. Concerning the musical data, we propose a representation of a musical score in the form of a contingency table. We begin by using CA and the autocorrelation index to discover the structures existing in each score. We then apply the same type of approach to the different voices of a score, using multiple correspondence analysis, in a fuzzy variant, and the cross-autocorrelation index. Whether for the complete score or for the different voices it contains, repeated structures are indeed detected, provided that they are not transposed. Finally, we propose to automatically classify twenty scores by four different composers, each represented by a contingency table, by means of an index measuring the similarity of two configurations. The results obtained make it possible to group most of the works successfully by composer.

Relevance:

30.00%

Abstract:

The objective of this work was to evaluate the biochemical composition of six berry types belonging to the genera Fragaria, Rubus, Vaccinium and Ribes. Fruit samples were collected in triplicate (50 fruit each) from 18 different species or cultivars of the mentioned genera, during three years (2008 to 2010). Contents of individual sugars, organic acids, flavonols, and phenolic acids were determined by high performance liquid chromatography (HPLC) analysis, while total phenolics (TPC) and total antioxidant capacity (TAC) were determined by spectrophotometry. Principal component analysis (PCA) and hierarchical cluster analysis (CA) were performed to evaluate the differences in fruit biochemical profile. The highest contents of bioactive components were found in Ribes nigrum and in Fragaria vesca, Rubus plicatus, and Vaccinium myrtillus. PCA and CA were able to partially discriminate between berries on the basis of their biochemical composition. Individual and total sugars, myricetin, ellagic acid, TPC and TAC showed the highest impact on biochemical composition of the berry fruits. CA separated blackberry, raspberry, and blueberry as isolated groups, while strawberry, black currant and red currant were not classified into a specific group. There is a large variability both between and within the different types of berries. Metabolite fingerprinting of the evaluated berries showed unique biochemical profiles and specific combinations of bioactive compound contents.

Relevance:

30.00%

Abstract:

The main objective of this study was to perform a statistical analysis of ecological type from optical satellite data, using Tipping's sparse Bayesian algorithm. This thesis uses the Relevance Vector Machine algorithm for ecological classification between forestland and wetland. Further, this binary classification technique was used to classify many other tree species and to produce a hierarchical classification of the entire set of subclasses given as a target class. We also attempted to use an airborne image of the same forest area. Combining it with image analysis, using different image processing operations, we tried to extract good features and later used them to classify forestland and wetland.

Relevance:

30.00%

Abstract:

OBJECTIVES: The aim of this study was to investigate pathological mechanisms underlying brain tissue alterations in mild cognitive impairment (MCI) using multi-contrast 3 T magnetic resonance imaging (MRI). METHODS: Forty-two MCI patients and 77 healthy controls (HC) underwent T1/T2* relaxometry as well as Magnetization Transfer (MT) MRI. Between-groups comparisons in MRI metrics were performed using permutation-based tests. Using MRI data, a generalized linear model (GLM) was computed to predict clinical performance and a support-vector machine (SVM) classification was used to classify MCI and HC subjects. RESULTS: Multi-parametric MRI data showed microstructural brain alterations in MCI patients vs HC that might be interpreted as: (i) a broad loss of myelin/cellular proteins and tissue microstructure in the hippocampus (p ≤ 0.01) and global white matter (p < 0.05); and (ii) iron accumulation in the pallidus nucleus (p ≤ 0.05). MRI metrics accurately predicted memory and executive performances in patients (p ≤ 0.005). SVM classification reached an accuracy of 75% to separate MCI and HC, and performed best using both volumes and T1/T2*/MT metrics. CONCLUSION: Multi-contrast MRI appears to be a promising approach to infer pathophysiological mechanisms leading to brain tissue alterations in MCI. Likewise, parametric MRI data provide powerful correlates of cognitive deficits and improve automatic disease classification based on morphometric features.

Relevance:

30.00%

Abstract:

OBJECTIVES: Specifically, we aim to demonstrate that the results of our earlier safety data hold true in this much larger multi-national and multi-ethnical population. BACKGROUND: We sought to re-evaluate the frequency, manifestations, and severity of acute adverse reactions associated with administration of several gadolinium-based contrast agents during routine CMR on a European level. METHODS: Multi-centre, multi-national, and multi-ethnical registry with consecutive enrolment of patients in 57 European centres. RESULTS: During the current observation 37,788 doses of gadolinium-based contrast agent were administered to 37,788 patients. The mean dose was 24.7 ml (range 5-80 ml), which is equivalent to 0.123 mmol/kg (range 0.01-0.3 mmol/kg). Forty-five acute adverse reactions due to contrast administration occurred (0.12%). Most reactions were classified as mild (43 of 45) according to the American College of Radiology definition. The most frequent complaints following contrast administration were rashes and hives (15 of 45), followed by nausea (10 of 45) and flushes (10 of 45). The event rate ranged from 0.05% (linear non-ionic agent gadodiamide) to 0.42% (linear ionic agent gadobenate dimeglumine). Interestingly, we also found different event rates between the three main indications for CMR ranging from 0.05% (risk stratification in suspected CAD) to 0.22% (viability in known CAD). CONCLUSIONS: The current data indicate that the results of the earlier safety data hold true in this much larger multi-national and multi-ethnical population. Thus, the "off-label" use of gadolinium-based contrast in cardiovascular MR should be regarded as safe concerning the frequency, manifestation and severity of acute events.

Relevance:

30.00%

Abstract:

We have studied how leaders emerge in a group as a consequence of interactions among its members. We propose that leaders can emerge as a consequence of a self-organized process based on local rules of dyadic interactions among individuals. Flocks are an example of self-organized behaviour in a group and properties similar to those observed in flocks might also explain some of the dynamics and organization of human groups. We developed an agent-based model that generated flocks in a virtual world and implemented it in a multi-agent simulation computer program that computed indices at each time step of the simulation to quantify the degree to which a group moved in a coordinated way (index of flocking behaviour) and the degree to which specific individuals led the group (index of hierarchical leadership). We ran several series of simulations in order to test our model and determine how these indices behaved under specific agent and world conditions. We identified the agent, world property, and model parameters that made stable, compact flocks emerge, and explored possible environmental properties that predicted the probability of becoming a leader.

Relevance:

30.00%

Abstract:

In this paper we present a multi-stage classifier for magnetic resonance spectra of human brain tumours which is being developed as part of a decision support system for radiologists. The basic idea is to decompose a complex classification scheme into a sequence of classifiers, each specialising in different classes of tumours and trying to reproduce part of the WHO classification hierarchy. Each stage uses a particular set of classification features, which are selected using a combination of classical statistical analysis, splitting performance and previous knowledge. Classifiers with different behaviour are combined using a simple voting scheme in order to extract different error patterns: LDA, decision trees and the k-NN classifier. A special label named "unknown" is used when the outcomes of the different classifiers disagree. Cascading is also used to incorporate class distances computed using LDA into decision trees. Both cascading and voting are effective tools to improve classification accuracy. Experiments also show that it is possible to extract useful information from the classification process itself in order to help users (clinicians and radiologists) to make more accurate predictions and reduce the number of possible classification mistakes.
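
The voting step with an "unknown" label described above might be sketched as follows. The abstract does not say how much disagreement triggers "unknown", so the default here (all classifiers must agree) and the optional `min_agreement` parameter are assumptions, as are the function name `vote` and the placeholder class labels.

```python
from collections import Counter


def vote(predictions, min_agreement=None):
    """Combine the labels predicted by several classifiers for one sample.

    Returns the most common label if at least min_agreement classifiers
    chose it, and "unknown" otherwise. By default every classifier must
    agree (an assumption; the abstract does not give the exact rule).
    """
    if not predictions:
        raise ValueError("at least one prediction is required")
    needed = len(predictions) if min_agreement is None else min_agreement
    label, count = Counter(predictions).most_common(1)[0]
    return label if count >= needed else "unknown"


# Hypothetical outputs of three classifiers (for example LDA, a decision
# tree and k-NN) for two spectra; the class names are placeholders.
agreed = vote(["class_a", "class_a", "class_a"])
disputed = vote(["class_a", "class_a", "class_b"])
majority = vote(["class_a", "class_a", "class_b"], min_agreement=2)
```

Returning "unknown" instead of forcing a decision leaves the disputed cases to the clinician, which fits the decision-support role described in the abstract.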