923 results for classification aided by clustering
Abstract:
Among the soils of Mato Grosso do Sul, Spodosols stand out in the Pantanal biome. Although they have been recorded over considerable areas, few studies have sought to characterize and classify these soils. The purpose of this study was to characterize and classify soils in three areas of two physiographic types in the Taquari river basin: bay and flooded fields. Two trenches were opened in the bay area (P1 and P2) and two in the flooded field (P3 and P4). The third area (saline), with high sodium levels, was sampled for further studies. In the soils of both areas the sand fraction was predominant and the texture ranged from sand to sandy loam, with quartz as the main constituent. In the bay area, the soil organic carbon (OC) content in the surface layer of P1 was > 80 g kg-1, and the layer was diagnosed as a Histic epipedon. In the other profiles the surface horizons had low OC levels which, associated with other properties, classified them as Ochric epipedons. In the soils of the bay area (P1 and P2), the pH ranged from 5.0 to 7.5, associated with dominance of Ca2+ and Mg2+, with base saturation above 50 % in some horizons. In the flooded fields (P3 and P4) the soil pH ranged from 4.9 to 5.9, H+ contents were high in the surface horizons (0.8-10.5 cmolc kg-1), Ca2+ and Mg2+ contents ranged from 0.4 to 0.8 cmolc kg-1 and base saturation was < 50 %. In the soils of the bay area (P1 and P2), iron (extracted by dithionite, Fed) and OC accumulated in the spodic horizon; in the P3 and P4 soils only Fed accumulated (in the subsurface layers). According to the criteria adopted by the Brazilian System of Soil Classification (SiBCS) at the subgroup level, the soils were classified as: P1: Organic Hydromorphic Ferrohumiluvic Spodosol. P2: Typical Orthic Ferrohumiluvic Spodosol. P3: Typical Hydromorphic Ferroluvic Spodosol. P4: Arenic Orthic Ferroluvic Spodosol.
Abstract:
Myasthenia gravis (MG) can be difficult to treat despite an available therapeutic armamentarium. Our aim was to analyze the factors leading to unsatisfactory outcome (UO). To this end we used the Myasthenia Gravis Foundation of America classification system. Forty-one patients with autoimmune MG were followed prospectively from January 2003 to December 2007. Outcomes were assessed throughout follow-up and at a final visit. 'Unchanged', 'worse', 'exacerbation' and 'died of MG' post-intervention status were considered UOs. During follow-up, UO rates reached 54% and were related to undertreatment (41%), poor treatment compliance (23%), infections (23%), and adverse drug effects (13%). The UO rate at final study assessment was 20%. UO during follow-up was significantly (P = 0.004) predictive of UOs at final assessment. When care was provided by neuromuscular (NM) specialists, patients had significantly better follow-up scores (P = 0.01). At final assessment, UO rates were significantly lower in patients treated by NM specialists (7%) than in those treated by other physicians (27%). UO was a frequent finding, occurring in more than half of our patients during follow-up. Nearly two-thirds of the UOs could have been prevented by appropriate therapeutic adjustments and improved compliance. The differential UO rates at follow-up, their dependency on the degree to which the management was specialized and their correlation with final outcomes suggest that specialized MG care improves outcomes.
Abstract:
Soil surveys are the main source of spatial information on soils and have a range of different applications, mainly in agriculture. The continuity of this activity has, however, been severely compromised, mainly due to a lack of governmental funding. The purpose of this study was to evaluate the feasibility of two different classifiers (artificial neural networks and a maximum likelihood algorithm) in the prediction of soil classes in the northwest of the state of Rio de Janeiro. Terrain attributes such as elevation, slope, aspect, plan curvature and compound topographic index (CTI) and indices of clay minerals, iron oxide and Normalized Difference Vegetation Index (NDVI), derived from Landsat 7 ETM+ sensor imagery, were used as discriminating variables. The two classifiers were trained and validated for each soil class using 300 and 150 samples respectively, representing the characteristics of these classes in terms of the discriminating variables. According to the statistical tests, the accuracy of the classifier based on artificial neural networks (ANNs) was greater than that of the classic Maximum Likelihood Classifier (MLC). Comparison with 126 reference points showed that the resulting ANN map (73.81 %) was superior to the MLC map (57.94 %). The main errors when using the two classifiers were caused by: a) the geological heterogeneity of the area coupled with problems related to the geological map; b) the depth of lithic contact and/or rock exposure, and c) problems with the environmental correlation model used due to the polygenetic nature of the soils. This study confirms that the use of terrain attributes together with remote sensing data by an ANN approach can be a tool to facilitate soil mapping in Brazil, primarily due to the availability of low-cost remote sensing data and the ease with which terrain attributes can be obtained.
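As a quick arithmetic check on the validation figures above: with 126 reference points, 73.81 % and 57.94 % correspond to 93 and 73 matching points. Those counts are inferred from the percentages and are not stated in the abstract; the function name is hypothetical. A minimal Python sketch:

```python
def overall_accuracy(predicted, reference):
    """Fraction of reference points whose predicted soil class matches."""
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    if not reference:
        raise ValueError("at least one reference point is required")
    hits = sum(p == r for p, r in zip(predicted, reference))
    return hits / len(reference)


# Counts below are inferred from the reported percentages (assumption).
N_POINTS = 126
ann_percent = round(100 * 93 / N_POINTS, 2)  # ANN map
mlc_percent = round(100 * 73 / N_POINTS, 2)  # MLC map
```

Overall accuracy says nothing about which classes are confused with which; a per-class confusion matrix would be needed for that.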
Abstract:
Since different pedologists will draw different soil maps of the same area, it is important to compare mapping by specialists with mapping techniques such as Digital Soil Mapping, which is currently the subject of intense discussion. Four detailed soil maps (scale 1:10,000) of a 182-ha sugarcane farm in the county of Rafard, São Paulo State, Brazil, were compared. The area has a large variation of soil formation factors. The maps were drawn independently by four soil scientists and compared with a fifth map obtained by a digital soil mapping technique. All pedologists were given the same set of information. As many field expeditions and soil pits as required by each surveyor were provided to define the mapping units (MUs). For the Digital Soil Map (DSM), spectral data were extracted from Landsat 5 Thematic Mapper (TM) imagery as well as six terrain attributes from the topographic map of the area. These data were summarized by principal component analysis, and fuzzy k-means clustering was then used to delineate the map groups. Field observations were made to identify the soils in the MUs and classify them according to the Brazilian Soil Classification System (BSCS). To compare the conventional and digital (DSM) soil maps, they were crossed pairwise to generate confusion matrices that were mapped. The categorical analysis at each classification level of the BSCS showed that the agreement between the maps decreased towards the lower levels of classification and revealed the great influence of the surveyor on both the mapping and the definition of MUs in the soil map. The average correspondence between the conventional and DSM maps was similar. Therefore, the method used to obtain the DSM yielded similar results to those obtained by the conventional technique, while providing additional information about the landscape of each soil, useful for applications in future surveys of similar areas.
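The pairwise crossing of two maps into a confusion matrix can be sketched as follows. This is an illustration only, not the authors' workflow: it assumes both maps are rasterized to the same cells and share one legend, and the class codes and function names are hypothetical.

```python
from collections import Counter


def cross_tabulate(map_a, map_b):
    """Count how often each pair of class labels occurs at the same cell."""
    if len(map_a) != len(map_b):
        raise ValueError("both maps must cover the same cells")
    return Counter(zip(map_a, map_b))


def overall_agreement(table):
    """Share of cells where both maps assign the same class."""
    total = sum(table.values())
    same = sum(n for (a, b), n in table.items() if a == b)
    return same / total


# Hypothetical class codes for six cells of two maps.
surveyor_map = ["LV", "LV", "PV", "CX", "PV", "LV"]
digital_map = ["LV", "PV", "PV", "CX", "PV", "CX"]
table = cross_tabulate(surveyor_map, digital_map)
agreement = overall_agreement(table)
```

Repeating the comparison with labels truncated to a higher classification level is one way to see agreement fall towards the lower levels, as the abstract reports.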
Abstract:
Considering that information from soil reflectance spectra is underutilized in soil classification, this paper aimed to evaluate the relationship between soil physical and chemical properties and their spectra, to identify spectral patterns for soil classes, and to evaluate the use of numerical classification of profiles combined with spectral data for soil classification. We studied 20 soil profiles from the municipality of Piracicaba, State of São Paulo, Brazil, which were morphologically described and classified up to the 3rd category level of the Brazilian Soil Classification System (SiBCS). Subsequently, soil samples were collected from pedogenetic horizons and subjected to soil particle size and chemical analyses. Their Vis-NIR spectra were measured, followed by principal component analysis. Pearson's linear correlation coefficients were determined among the four principal components and the following soil properties: pH, organic matter, P, K, Ca, Mg, Al, CEC, base saturation, and Al saturation. We also carried out interpretation of the first three principal components and their relationships with soil classes defined by SiBCS. In addition, numerical classification of the profiles based on the OSACA algorithm was performed using spectral data as a basis. We determined the Normalized Mutual Information (NMI) and Uncertainty Coefficient (U). These coefficients represent the similarity between the numerical classification and the soil classes from SiBCS. Pearson's correlation coefficients were significant for the principal components when compared to sand, clay, Al content and soil color. Visual analysis of the principal component scores showed differences in the spectral behavior of the soil classes, mainly between Argissolos and the other soils. The NMI and U similarity coefficients showed values of 0.74 and 0.64, respectively, suggesting good similarity between the numerical and SiBCS classes.
For example, numerical classification correctly distinguished Argissolos from Latossolos and Nitossolos. However, this mathematical technique was not able to distinguish Latossolos from Nitossolos Vermelho férricos, but the Cambissolos were well differentiated from other soil classes. The numerical technique proved to be effective and applicable to the soil classification process.
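To make the two similarity coefficients concrete, here is a minimal Python sketch of NMI and the uncertainty coefficient for two labelings of the same profiles. NMI has several normalizations; the arithmetic-mean form used here is an assumption, since the abstract does not say which one was applied. The labels are illustrative, not the study's data.

```python
from collections import Counter
from math import log


def entropy(labels):
    """Shannon entropy (natural log) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def mutual_information(a, b):
    """Mutual information between two labelings of the same items."""
    n = len(a)
    pa, pb, pab = Counter(a), Counter(b), Counter(zip(a, b))
    return sum((c / n) * log(c * n / (pa[x] * pb[y]))
               for (x, y), c in pab.items())


def nmi(a, b):
    """NMI with arithmetic-mean normalization (assumed variant)."""
    ha, hb = entropy(a), entropy(b)
    return 2 * mutual_information(a, b) / (ha + hb) if ha + hb > 0 else 1.0


def uncertainty_coefficient(target, given):
    """U(target | given): share of target's entropy explained by given."""
    ht = entropy(target)
    return mutual_information(target, given) / ht if ht > 0 else 1.0


# Illustrative case: a numerical grouping that merges two reference classes.
reference = ["Argissolo", "Argissolo", "Latossolo", "Latossolo",
             "Nitossolo", "Nitossolo"]
numerical = [1, 1, 2, 2, 2, 2]
u_ref_given_num = uncertainty_coefficient(reference, numerical)
u_num_given_ref = uncertainty_coefficient(numerical, reference)
```

Unlike NMI, U is asymmetric: merging two reference classes lowers `u_ref_given_num` while `u_num_given_ref` stays at 1.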
Abstract:
We present a generator of random networks where both the degree-dependent clustering coefficient and the degree distribution are tunable. Following the same philosophy as in the configuration model, the degree distribution and the clustering coefficient for each class of nodes of degree k are fixed ad hoc and a priori. The algorithm generates corresponding topologies by applying first a closure of triangles and second the classical closure of remaining free stubs. The procedure unveils a universal relation between clustering and degree-degree correlations for all networks, where the level of assortativity establishes an upper limit to the level of clustering. Maximum assortativity ensures no restriction on the decay of the clustering coefficient whereas disassortativity sets a stronger constraint on its behavior. Correlation measures in real networks are seen to observe this structural bound.
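The quantity the generator takes as input, the clustering coefficient averaged over nodes of each degree k, can be measured on an existing graph as sketched below. This is not the generator itself. The adjacency-dictionary representation, the function names, and the convention that nodes with fewer than two neighbours have clustering 0 are all assumptions.

```python
from collections import defaultdict
from itertools import combinations


def local_clustering(adj, node):
    """Fraction of a node's neighbour pairs that are themselves linked."""
    neighbours = adj[node]
    k = len(neighbours)
    if k < 2:
        return 0.0  # convention assumed here for degree 0 and 1
    links = sum(1 for u, v in combinations(neighbours, 2) if v in adj[u])
    return 2 * links / (k * (k - 1))


def clustering_by_degree(adj):
    """Average local clustering for each degree class k."""
    sums, counts = defaultdict(float), defaultdict(int)
    for node, neighbours in adj.items():
        k = len(neighbours)
        sums[k] += local_clustering(adj, node)
        counts[k] += 1
    return {k: sums[k] / counts[k] for k in sums}


# A triangle (0, 1, 2) with one pendant node (3) attached to node 2.
graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
c_of_k = clustering_by_degree(graph)
```

In this toy graph the degree-2 nodes are fully clustered, while the degree-3 node has only one of its three neighbour pairs linked.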
Abstract:
Summary: Following recent technological advances, digital image archives have undergone unprecedented qualitative and quantitative growth. Despite the enormous possibilities they offer, these advances raise new questions about how to process the masses of data acquired. This question is the basis of this Thesis: problems in processing digital information of very high spatial and/or spectral resolution are addressed using statistical learning approaches, namely kernel methods. This Thesis studies image classification problems, i.e. the categorization of pixels into a reduced number of classes reflecting the spectral and contextual properties of the objects they represent. The emphasis is on the efficiency of the algorithms, as well as on their simplicity, so as to increase their potential for implementation by users. In addition, the challenge of this Thesis is to remain close to the concrete problems of satellite image users without losing sight of the interest of the proposed methods for the machine learning community from which they originate. In this sense, this work takes a transdisciplinary stance, maintaining a strong link between the two fields in all the developments proposed. Four models are proposed: the first addresses the problem of high dimensionality and data redundancy with a model that optimizes classification performance by adapting to the particularities of the image. This is made possible by a ranking system for the variables (the bands) that is optimized together with the base model: as a result, only the variables that are important for solving the problem are used by the classifier.
The lack of labeled information and the uncertainty about its relevance to the problem motivate the next two models, based respectively on active learning and semi-supervised methods: the first improves the quality of a training set through direct interaction between the user and the machine, while the second uses unlabeled pixels to improve the description of the available data and the robustness of the model. Finally, the last model proposed considers the more theoretical question of the structure among the outputs: the integration of this source of information, never before considered in remote sensing, opens up new research challenges. Advanced kernel methods for remote sensing image classification Devis Tuia Institut de Géomatique et d'Analyse du Risque September 2009 Abstract The technical developments in recent years have brought the quantity and quality of digital information to an unprecedented level, as enormous archives of satellite images are available to the users. However, even if these advances open more and more possibilities in the use of digital imagery, they also raise several problems of storage and treatment. The latter is considered in this Thesis: the processing of very high spatial and spectral resolution images is treated with approaches based on data-driven algorithms relying on kernel methods. In particular, the problem of image classification, i.e. the categorization of the image's pixels into a reduced number of classes reflecting spectral and contextual properties, is studied through the different models presented. The accent is put on algorithmic efficiency and the simplicity of the approaches proposed, to avoid overly complex models that would not be adopted by users.
The major challenge of the Thesis is to remain close to concrete remote sensing problems, without losing the methodological interest from the machine learning viewpoint: in this sense, this work aims at building a bridge between the machine learning and remote sensing communities and all the models proposed have been developed keeping in mind the need for such a synergy. Four models are proposed: first, an adaptive model learning the relevant image features has been proposed to solve the problem of high dimensionality and collinearity of the image features. This model automatically provides an accurate classifier and a ranking of the relevance of the single features. The scarcity and unreliability of labeled information were the common root of the second and third models proposed: when confronted with such problems, the user can either construct the labeled set iteratively by direct interaction with the machine or use the unlabeled data to increase robustness and quality of the description of data. Both solutions have been explored, resulting in two methodological contributions, based respectively on active learning and semisupervised learning. Finally, the more theoretical issue of structured outputs has been considered in the last model, which, by integrating output similarity into a model, opens new challenges and opportunities for remote sensing image processing.
Abstract:
ABSTRACT Preservation of mangroves, a very significant ecosystem from a social, economic, and environmental viewpoint, requires knowledge of soil composition, genesis, morphology, and classification. These aspects are of paramount importance to understand the dynamics of sustainability and preservation of this natural resource. In this study mangrove soils in the Subaé river basin were described and classified and inorganic waste concentrations evaluated. Seven pedons of mangrove soil were chosen, five under fluvial influence and two under marine influence, and analyzed for morphology. Samples of horizons and layers were collected for physical and chemical analyses, including heavy metals (Pb, Cd, Mn, Zn, and Fe). The moist soils were suboxidic, with Eh values below 350 mV. The pH level of the pedons under fluvial influence ranged from moderately acid to alkaline, while the pH in pedons under marine influence was around 7.0 throughout the profile. The concentration of cations in the sorption complex for all pedons, independent of fluvial or marine influence, indicated the following order: Na+>Mg2+>Ca2+>K+. Mangrove soils from the Subaé river basin under fluvial and marine influence had different morphological, physical, and chemical characteristics. The highest Pb and Cd concentrations were found in the pedons under fluvial influence, perhaps due to their closeness to the mining company Plumbum, while the concentrations in pedon P7 were lowest, due to greater distance from the factory. Because they contained at least one metal above the reference levels established by the National Oceanic and Atmospheric Administration (United States Environmental Protection Agency), the pedons were classified as potentially toxic.
The soils were classified as Gleissolos Tiomórficos Órticos (sálicos) sódico neofluvissólico according to the Brazilian Soil Classification System, indicating potential toxicity and very poor drainage, except for pedon P7, which was classified in the same subgroup as the others but differed in that its metal concentrations met acceptable standards.
Abstract:
PURPOSE: To objectively characterize different heart tissues from functional and viability images provided by composite-strain-encoding (C-SENC) MRI. MATERIALS AND METHODS: C-SENC is a new MRI technique for simultaneously acquiring cardiac functional and viability images. In this work, an unsupervised multi-stage fuzzy clustering method is proposed to identify different heart tissues in the C-SENC images. The method is based on sequential application of the fuzzy c-means (FCM) and iterative self-organizing data (ISODATA) clustering algorithms. The proposed method is tested on simulated heart images and on images from nine patients with and without myocardial infarction (MI). The resulting clustered images are compared with MRI delayed-enhancement (DE) viability images for determining MI. Also, Bland-Altman analysis is conducted between the two methods. RESULTS: Normal myocardium, infarcted myocardium, and blood are correctly identified using the proposed method. The clustered images correctly identified 90 +/- 4% of the pixels defined as infarct in the DE images. In addition, 89 +/- 5% of the pixels defined as infarct in the clustered images were also defined as infarct in DE images. The Bland-Altman results show no bias between the two methods in identifying MI. CONCLUSION: The proposed technique allows for objectively identifying divergent heart tissues, which would be potentially important for clinical decision-making in patients with MI.
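The fuzzy c-means (FCM) step named above alternates between updating soft memberships and updating cluster centers. The sketch below shows that loop for one-dimensional intensities only. It is not the multi-stage FCM plus ISODATA method of the study; the fuzzifier m = 2, the fixed iteration count, the starting centers and the sample values are all assumptions.

```python
def memberships(values, centers, m=2.0):
    """Soft membership of each value in each cluster (rows sum to 1)."""
    rows = []
    for x in values:
        dists = [abs(x - c) for c in centers]
        if 0.0 in dists:
            # A value sitting exactly on a center belongs fully to it.
            hits = [1.0 if d == 0.0 else 0.0 for d in dists]
            rows.append([h / sum(hits) for h in hits])
        else:
            weights = [d ** (-2.0 / (m - 1.0)) for d in dists]
            total = sum(weights)
            rows.append([w / total for w in weights])
    return rows


def fuzzy_c_means_1d(values, centers, m=2.0, iterations=50):
    """Plain FCM on scalars: returns final centers and memberships."""
    centers = list(centers)
    for _ in range(iterations):
        u = memberships(values, centers, m)
        centers = [
            sum((row[j] ** m) * x for row, x in zip(u, values))
            / sum(row[j] ** m for row in u)
            for j in range(len(centers))
        ]
    return centers, memberships(values, centers, m)


# Hypothetical normalized pixel intensities forming two groups.
pixels = [0.10, 0.20, 0.15, 0.90, 1.00, 0.95]
final_centers, final_u = fuzzy_c_means_1d(pixels, centers=[0.0, 1.0])
```

A hard label per pixel can be taken as the cluster with the largest membership; a practical implementation would also stop once the centers change by less than a tolerance.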
Abstract:
MOTIVATION: Analysis of millions of pyro-sequences is currently playing a crucial role in the advance of environmental microbiology. Taxonomy-independent, i.e. unsupervised, clustering of these sequences is essential for the definition of Operational Taxonomic Units. For this application, reproducibility and robustness should be the most sought after qualities, but have thus far largely been overlooked. RESULTS: More than 1 million hyper-variable internal transcribed spacer 1 (ITS1) sequences of fungal origin have been analyzed. The ITS1 sequences were first properly extracted from 454 reads using generalized profiles. Then, otupipe, cd-hit-454, ESPRIT-Tree and DBC454, a new algorithm presented here, were used to analyze the sequences. A numerical assay was developed to measure the reproducibility and robustness of these algorithms. DBC454 was the most robust, closely followed by ESPRIT-Tree. DBC454 features density-based hierarchical clustering, which complements the other methods by providing insights into the structure of the data. AVAILABILITY: An executable is freely available for non-commercial users at ftp://ftp.vital-it.ch/tools/dbc454. It is designed to run under MPI on a cluster of 64-bit Linux machines running Red Hat 4.x, or on a multi-core OSX system. CONTACT: dbc454@vital-it.ch or nicolas.guex@isb-sib.ch.
Abstract:
Determination of the precise composition and variation of microbiota in cystic fibrosis lungs is crucial since chronic inflammation due to microorganisms leads to lung damage and ultimately, death. However, this constitutes a major technical challenge. Culturing of microorganisms does not provide a complete representation of a microbiota, even when using culturomics (high-throughput culture). So far, only PCR-based metagenomics have been investigated. However, these methods are biased towards certain microbial groups, and suffer from uncertain quantification of the different microbial domains. We have explored whole genome sequencing (WGS) using the Illumina high-throughput technology applied directly to DNA extracted from sputa obtained from two cystic fibrosis patients. To detect all microorganism groups, we used four procedures for DNA extraction, each with a different lysis protocol. We avoided biases due to whole DNA amplification thanks to the high efficiency of current Illumina technology. Phylogenomic classification of the reads by three different methods produced similar results. Our results suggest that WGS provides, in a single analysis, a better qualitative and quantitative assessment of microbiota compositions than cultures and PCRs. WGS identified a high quantity of Haemophilus spp. (patient 1) or Staphylococcus spp. plus Streptococcus spp. (patient 2) together with low amounts of anaerobic (Veillonella, Prevotella, Fusobacterium) and aerobic bacteria (Gemella, Moraxella, Granulicatella). WGS suggested that fungal members represented very low proportions of the microbiota, which were detected by cultures and PCRs because of their selectivity. The future increase of reads' sizes and decrease in cost should ensure the usefulness of WGS for the characterisation of microbiota.
Abstract:
Map produced by Iowa Department of Transportation of System Classification.
Abstract:
For several years, the lack of consensus on definition, nomenclature, natural history, and biology of serrated polyps (SPs) of the colon has created considerable confusion among pathologists. According to the latest WHO classification, the family of SPs comprises hyperplastic polyps (HPs), sessile serrated adenomas/polyps (SSA/Ps), and traditional serrated adenomas (TSAs). The term SSA/P with dysplasia has replaced the category of mixed hyperplastic/adenomatous polyps (MPs). The present study aimed to evaluate the reproducibility of the diagnosis of SPs based on currently available diagnostic criteria and interactive consensus development. In an initial round, H&E slides of 70 cases of SPs were circulated among participating pathologists across Europe. This round was followed by a consensus discussion on diagnostic criteria. A second round was performed on the same 70 cases using the revised criteria and definitions according to the recent WHO classification. Data were evaluated for inter-observer agreement using Kappa statistics. In the initial round, for the total of 70 cases, a fair overall kappa value of 0.318 was reached, while in the second round overall kappa value improved to moderate (kappa = 0.557; p < 0.001). Overall kappa values for each diagnostic category also significantly improved in the final round, reaching 0.977 for HP, 0.912 for SSA/P, and 0.845 for TSA (p < 0.001). The diagnostic reproducibility of SPs improves when strictly defined, standardized diagnostic criteria adopted by consensus are applied.
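Kappa corrects raw agreement for the agreement expected by chance. The abstract does not say which kappa variant was used for its multi-rater design, so the sketch below shows the simplest case, Cohen's kappa for two raters, as an assumption. The diagnoses are made up for illustration.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same cases."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("raters must label the same non-empty set of cases")
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement from each rater's marginal label frequencies.
    expected = sum(count_a[c] * count_b[c] for c in count_a) / n ** 2
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical diagnoses for six slides by two pathologists.
pathologist_1 = ["HP", "HP", "HP", "SSA/P", "SSA/P", "TSA"]
pathologist_2 = ["HP", "HP", "SSA/P", "SSA/P", "SSA/P", "TSA"]
kappa = cohen_kappa(pathologist_1, pathologist_2)
```

The words "fair" and "moderate" in the abstract match the commonly used Landis and Koch bands (0.21 to 0.40 and 0.41 to 0.60).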
Abstract:
The use of multiple legal and illegal substances by adolescents is a growing concern in all countries, but since no consensus on a taxonomy has yet emerged, it is difficult to understand the different patterns of consumption and to implement tailored prevention and treatment programs directed towards specific subgroups of the adolescent population. Using data from a Swiss survey on adolescent health, we analyzed the age at which ten legal and illegal substances were consumed for the first time ever by applying a method combining the strengths of automatic clustering and input from substance experts. Results were then compared to 30 socio-economic factors to establish the usefulness of our taxonomy and to validate it. We also analyzed the succession of substance first use for each group. The final taxonomy consists of eight groups ranging from non-consumers to heavy drug addicts. All but four socio-economic factors were significantly associated with the taxonomy, the strongest associations being observed with health, behavior, and sexuality factors. Numerous factors influence adolescents in their decision to first try substances or to use them on a regular basis, and no factor alone can be considered as an absolute marker of problematic behavior regarding substance use. Different processes of experimentation with substances are associated with different behaviors; therefore, focusing on only one substance or only one factor is not efficient. Prevention and treatment programs can then be tailored to address specific issues related to different youth subgroups.
Abstract:
This document, Classifications and Pay Plans, is produced by the State of Iowa Executive Branch, Department of Administrative Services. It is an informational document about the pay plan codes and classification codes and how to use them.