820 results for Data classification
Abstract:
The glasses of the rosette forming the main window of the transept of the Gothic Cathedral of Tarragona have been characterised by means of SEM/EDS, XRD, FTIR and electron microprobe. Multivariate statistical treatment of these data allows a classification of the samples into groups that have a historical significance and reflect ancient restorations. Furthermore, the decay patterns and mechanisms have been determined and the weathering by-products characterised. A clear influence of biological activity on the decay of these glasses has been demonstrated; this activity is partially controlled by the chemical composition of the glasses.
Abstract:
This paper presents a validation study on statistical nonsupervised brain tissue classification techniques in magnetic resonance (MR) images. Several image models assuming different hypotheses regarding the intensity distribution model, the spatial model and the number of classes are assessed. The methods are tested on simulated data for which the classification ground truth is known. Different noise and intensity nonuniformities are added to simulate real imaging conditions. No enhancement of the image quality is considered either before or during the classification process; this way, the accuracy of the methods and their robustness against image artifacts are tested. Classification is also performed on real data, where a quantitative validation compares the methods' results with a ground truth estimated from manual segmentations by experts. The validity of the various classification methods, both in image labeling and in tissue volume estimation, is assessed with different local and global measures. Results demonstrate that methods relying on both intensity and spatial information are more robust to noise and field inhomogeneities. We also demonstrate that partial volume is not perfectly modeled, even though methods that account for mixture classes outperform methods that only consider pure Gaussian classes. Finally, we show that simulated data results can also be extended to real data.
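The intensity-distribution side of such models can be illustrated with a Gaussian mixture fitted to voxel intensities. The sketch below is a generic stand-in with simulated 1-D intensities and hypothetical class means, not the specific models validated in the study:

```python
# Generic sketch: unsupervised intensity-based tissue classification with
# a Gaussian mixture model. Class means and variances are hypothetical.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Simulated 1-D intensity samples for three tissue classes (e.g. CSF, GM, WM)
intensities = np.concatenate([
    rng.normal(50, 5, 1000),   # hypothetical CSF intensities
    rng.normal(110, 8, 1000),  # hypothetical grey matter
    rng.normal(160, 6, 1000),  # hypothetical white matter
]).reshape(-1, 1)

# Fit a 3-component mixture and assign each voxel to its most likely class
gmm = GaussianMixture(n_components=3, random_state=0).fit(intensities)
labels = gmm.predict(intensities)
```

Spatial models of the kind the paper compares would additionally regularise these labels with neighbourhood information (e.g. a Markov random field), which this sketch omits.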
Abstract:
The objective of this work was to develop a procedure to estimate soybean crop areas in Rio Grande do Sul state, Brazil. Estimations were made based on the temporal profiles of the enhanced vegetation index (EVI) calculated from moderate resolution imaging spectroradiometer (MODIS) images. The methodology developed for soybean classification was named the MODIS crop detection algorithm (MCDA). The MCDA provides soybean area estimates in December (first forecast), using images from the sowing period, and in March (second forecast), using images from the sowing and maximum crop development periods. The results obtained by the MCDA were compared with the official soybean area estimates of the Instituto Brasileiro de Geografia e Estatística. The coefficients of determination ranged from 0.91 to 0.95, indicating good agreement between the estimates. For the 2000/2001 crop year, the MCDA soybean crop map was evaluated against a soybean crop map derived from Landsat images; the overall map accuracy was approximately 82%, with similar commission and omission errors. The MCDA was able to estimate soybean crop areas in Rio Grande do Sul state and to generate an annual thematic map with the geographic position of the soybean fields. The soybean crop area estimates produced by the MCDA are in good agreement with the official agricultural statistics.
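The idea of classifying pixels by their EVI temporal profile can be sketched as follows. The reflectance values, amplitude threshold, and decision rule below are invented for illustration and are not the actual MCDA:

```python
import numpy as np

def evi(nir, red, blue):
    # Standard MODIS EVI formula (coefficients G=2.5, C1=6, C2=7.5, L=1)
    return 2.5 * (nir - red) / (nir + 6.0 * red - 7.5 * blue + 1.0)

# Hypothetical reflectance time series for two pixels over a season.
# Rows: pixels; columns: composites from sowing to peak crop development.
nir  = np.array([[0.25, 0.35, 0.50, 0.55],
                 [0.30, 0.31, 0.32, 0.31]])
red  = np.array([[0.20, 0.15, 0.08, 0.06],
                 [0.12, 0.12, 0.12, 0.12]])
blue = np.array([[0.05, 0.05, 0.04, 0.04],
                 [0.05, 0.05, 0.05, 0.05]])

profiles = evi(nir, red, blue)
# Toy decision rule: a crop pixel shows a large EVI amplitude between
# sowing (low vegetation) and peak development (dense canopy).
amplitude = profiles.max(axis=1) - profiles.min(axis=1)
is_soybean = amplitude > 0.3   # hypothetical threshold
```

Here the first pixel's profile rises sharply over the season and is flagged as soybean, while the second stays flat and is not.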
Abstract:
Land use/cover classification is one of the most important applications in remote sensing. However, mapping accurate land use/cover spatial distribution is a challenge, particularly in moist tropical regions, due to the complex biophysical environment and the limitations of remote sensing data per se. This paper reviews a decade of experiments related to land use/cover classification in the Brazilian Amazon. Through comprehensive analysis of the classification results, it is concluded that spatial information inherent in remote sensing data plays an essential role in improving land use/cover classification. Incorporating suitable textural images into multispectral bands and using segmentation-based methods are valuable ways to improve land use/cover classification, especially for high spatial resolution images. Data fusion of multi-resolution images within optical sensor data is vital for visual interpretation, but may not improve classification performance. In contrast, integration of optical and radar data did improve classification performance when the proper data fusion method was used. Among the available classification algorithms, the maximum likelihood classifier is still an important method for providing reasonably good accuracy, but nonparametric algorithms, such as classification tree analysis, have the potential to provide better results, although they often require more time for parameter optimization. Proper use of hierarchical methods is fundamental for developing accurate land use/cover classification, particularly from historical remotely sensed data.
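For reference, a minimal Gaussian maximum-likelihood classifier of the kind long used for multispectral land-cover mapping; the two-band training data here are simulated, not the review's experiments:

```python
import numpy as np

def fit_ml(X, y):
    """Estimate per-class mean and covariance from training pixels."""
    return {c: (X[y == c].mean(axis=0), np.cov(X[y == c].T))
            for c in np.unique(y)}

def predict_ml(model, X):
    """Assign each pixel to the class with the highest Gaussian log-likelihood."""
    scores = []
    for c, (mu, cov) in sorted(model.items()):
        inv = np.linalg.inv(cov)
        _, logdet = np.linalg.slogdet(cov)
        d = X - mu
        # Mahalanobis term per pixel plus the log-determinant penalty
        scores.append(-0.5 * (np.einsum('ij,jk,ik->i', d, inv, d) + logdet))
    return np.array(sorted(model))[np.argmax(scores, axis=0)]

rng = np.random.default_rng(1)
# Simulated 2-band reflectances for two hypothetical classes
X_train = np.vstack([rng.normal([0.1, 0.4], 0.03, (50, 2)),   # e.g. water
                     rng.normal([0.3, 0.6], 0.03, (50, 2))])  # e.g. forest
y_train = np.array([0] * 50 + [1] * 50)

model = fit_ml(X_train, y_train)
pred = predict_ml(model, np.array([[0.11, 0.41], [0.29, 0.61]]))
```

The nonparametric alternatives the review mentions (e.g. classification trees) drop the Gaussian assumption at the cost of the extra tuning time noted above.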
Abstract:
In this paper, we consider active sampling to label pixels grouped by hierarchical clustering. The objective of the method is to match the data relationships discovered by the clustering algorithm with the user's desired class semantics. The former is represented as a complete tree to be pruned; the latter is iteratively provided by the user. The proposed active learning algorithm searches for the pruning of the tree that best matches the labels of the sampled points. By choosing the part of the tree to sample from according to the current pruning's uncertainty, sampling is focused on the most uncertain clusters. This way, large clusters whose class membership is already fixed are no longer queried, and sampling concentrates on dividing clusters that show mixed labels. The model is tested on a VHR image in a multiclass classification setting. The method clearly outperforms random sampling in a transductive setting, but cannot generalize to unseen data, since it aims at optimizing the classification of a given cluster structure.
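The sampling criterion can be illustrated with a toy entropy computation over the labels observed so far in each cluster. The cluster contents below are hypothetical, and the real method operates on prunings of a full cluster tree rather than a flat set of clusters:

```python
import numpy as np

def cluster_entropy(label_counts):
    """Shannon entropy of the observed class labels in one cluster."""
    p = np.asarray(label_counts, dtype=float)
    p = p[p > 0] / p.sum()
    return float(-(p * np.log(p)).sum())

# Observed label counts per cluster: [count of class A, count of class B]
observed = {"c1": [12, 0],   # pure: class membership already fixed
            "c2": [5, 6],    # mixed labels: worth splitting further
            "c3": [9, 1]}    # nearly pure

# Query the most label-mixed (highest-entropy) cluster next
next_query = max(observed, key=lambda c: cluster_entropy(observed[c]))
```

The pure cluster `c1` is never queried again, while the mixed cluster `c2` attracts the next samples, mirroring how the method focuses effort on uncertain parts of the tree.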
Abstract:
Mature T-cell and T/NK-cell neoplasms are uncommon and heterogeneous entities within the broad category of non-Hodgkin's lymphomas. Due to the lack of specific genetic alterations in the vast majority of cases, most currently defined entities show overlapping morphologic and immunophenotypic features and therefore pose a challenge to the diagnostic pathologist. The goal of the symposium is to address current criteria for the recognition of specific subtypes of T-cell lymphoma, and to highlight new data regarding emerging immunophenotypic or molecular markers. This activity has been designed to meet the needs of practicing pathologists, and residents and fellows enrolled in training programs in anatomic and clinical pathology. It should be of particular benefit to those with an interest in hematopathology. Upon completion of this activity, participants should be better able to: state the basis for the classification of mature T-cell malignancies involving nodal and extranodal sites; recognize and accurately diagnose the various subtypes of nodal and extranodal peripheral T-cell lymphomas; utilize immunohistochemical and molecular tests to characterize atypical T-cell proliferations; recognize and accurately diagnose T-cell lymphoproliferative lesions involving the skin and gastrointestinal tract, and provide guidance regarding their clinical aggressiveness and management; and utilize flow cytometric data to identify diverse functional T-cell subsets.
Abstract:
This report presents a project whose main objective is the creation of an application that, given an input image, returns the decade in which the image was acquired. It also aims to improve on the results obtained in other studies on classifying images by their acquisition date. The report describes the earlier study on the subject and the method chosen to improve its results. It also explains how we built the application and the steps it follows during execution.
Abstract:
When designing a classification system, the aim is to build a system that solves the problem domain under study as accurately as possible. In pattern recognition, the core of the recognition system is the classifier. The field of application for classification is very broad: a classifier is needed, for example, in pattern recognition systems, of which image processing is a good example. Accurate classification is also needed extensively in medicine. For example, diagnosing a patient's symptoms requires a classifier that can infer from measurement results, as accurately as possible, whether the patient has a given symptom or not. The dissertation develops a classifier based on similarity measures and examines its performance on, among others, medical data sets in which the classification task is to identify the nature of the patient's symptom. One advantage of the classifier presented in the dissertation is its simple structure, which makes it easy both to implement and to understand. Another advantage is its accuracy: the classifier can be made to classify many different problems very accurately, which is especially important in medicine, where even a small improvement in classification accuracy is highly valuable. The dissertation investigates several measures of similarity. The measures also have several parameters, for which values suited to the particular classification problem can be sought. This optimization of the parameters to the problem domain can be performed using, for example, evolutionary algorithms; in this work, a genetic algorithm and the differential evolution algorithm were used. A further advantage of the classifier is its flexibility: the similarity measure is easy to replace if it does not suit the problem domain under study, and optimizing the parameters of the different measures can improve the results considerably. The results can be improved further by applying various preprocessing methods before classification.
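A minimal sketch of a similarity-measure classifier with a tunable distance parameter: a nearest-class-mean rule under a Minkowski-style distance, with a simple grid search standing in for the genetic and differential-evolution optimization used in the dissertation. The data and parameter grid are invented:

```python
import numpy as np

def fit_means(X, y):
    """Class prototypes: the mean vector of each class."""
    return {c: X[y == c].mean(axis=0) for c in np.unique(y)}

def predict(means, X, p):
    """Assign each sample to the class whose prototype is nearest
    under a Minkowski-style dissimilarity with parameter p."""
    classes = sorted(means)
    d = np.array([np.sum(np.abs(X - means[c]) ** p, axis=1) for c in classes])
    return np.array(classes)[np.argmin(d, axis=0)]

rng = np.random.default_rng(2)
# Simulated two-class, three-feature data (e.g. patient measurements)
X = np.vstack([rng.normal(0, 1, (40, 3)), rng.normal(3, 1, (40, 3))])
y = np.array([0] * 40 + [1] * 40)
means = fit_means(X, y)

# Grid search over the distance parameter, a stand-in for evolutionary search
best_p = max([1.0, 1.5, 2.0, 3.0],
             key=lambda p: float((predict(means, X, p) == y).mean()))
```

Swapping in a different similarity measure only requires replacing `predict`, which reflects the flexibility the dissertation emphasises.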
Abstract:
In this thesis, the author approaches the problem of automated text classification, which is one of the basic tasks in building an Intelligent Internet Search Agent. The work discusses various approaches to solving sub-problems of automated text classification, such as feature extraction and machine learning on text sources. The author also describes her own multiword approach to feature extraction and presents the results of testing this approach using a linear-discriminant-analysis-based classifier, and a classifier combining unsupervised learning for etalon extraction with supervised learning using the common backpropagation algorithm for a multilayer perceptron.
Abstract:
Purpose: Wolfram syndrome is a degenerative, recessive rare disease with an onset in childhood. It is caused by mutations in the WFS1 or CISD2 genes. More than 200 different variations in WFS1 have been described in patients with Wolfram syndrome, which complicates the establishment of a clear genotype-phenotype correlation. The purpose of this study was to elucidate the role of WFS1 mutations and update the natural history of the disease. Methods: This study analyzed clinical and genetic data of 412 patients with Wolfram syndrome published in the last 15 years. Results: (i) 15% of published patients do not fulfill the current inclusion criterion; (ii) genotypic prevalence differences may exist among countries; (iii) diabetes mellitus and optic atrophy might not be the first two clinical features in some patients; (iv) mutations are nonuniformly distributed in WFS1; (v) age at onset of diabetes mellitus, hearing defects, and diabetes insipidus may depend on the patient's genotypic class; and (vi) disease progression rate might depend on genotypic class. Conclusion: New genotype-phenotype correlations were established, the disease progression rate for the general population and for the genotypic classes has been calculated, and new diagnostic criteria have been proposed. These conclusions could be important for patient management and counseling as well as for the development of treatments for Wolfram syndrome.
Abstract:
The increase in publicly available sequencing data has allowed rapid progress in our understanding of genome composition. As new information becomes available, we should constantly be updating and reanalyzing existing and newly acquired data. In this report we focus on transposable elements (TEs), which make up a significant portion of nearly all sequenced genomes. Our ability to accurately identify and classify these sequences is critical to understanding their impact on host genomes. At the same time, as we demonstrate in this report, problems with existing classification schemes have led to significant misunderstandings of the evolution of both TE sequences and their host genomes. In a pioneering publication, Finnegan (1989) proposed classifying all TE sequences into two classes based on transposition mechanisms and structural features: the retrotransposons (class I) and the DNA transposons (class II). We have retraced how ideas regarding TE classification and annotation in both the prokaryotic and eukaryotic scientific communities have changed over time. This has led us to observe that: (1) a number of TEs have convergent structural features and/or transposition mechanisms that have led to misleading conclusions regarding their classification; (2) the evolution of TEs is similar to that of viruses in having several unrelated origins; and (3) there might be at least 8 classes and 12 orders of TEs, including 10 novel orders. To address these classification issues we propose: (1) the outline of a universal TE classification; (2) a set of methods and classification rules that could be used by all scientific communities involved in the study of TEs; and (3) a 5-year schedule for the establishment of an International Committee for Taxonomy of Transposable Elements (ICTTE).
Abstract:
PURPOSE: To develop a consensus opinion regarding capturing diagnosis timing in coded hospital data. METHODS: As part of the World Health Organization International Classification of Diseases-11th Revision initiative, the Quality and Safety Topic Advisory Group is charged with enhancing the capture of quality and patient safety information in morbidity data sets. One such feature is a diagnosis-timing flag. The Group has undertaken a narrative literature review, scanned national experiences focusing on countries currently using timing flags, and held a series of meetings to derive formal recommendations regarding diagnosis-timing reporting. RESULTS: The completeness of diagnosis-timing reporting continues to improve with experience and use; studies indicate that it enhances risk adjustment and may have a substantial impact on hospital performance estimates, especially for conditions and procedures that involve acutely ill patients. However, studies suggest that its reliability varies: it is better for surgical than for medical patients (kappa of 0.7-1.0 in hip fracture patients versus 0.2-0.6 in pneumonia) and depends on coder training and setting. It may allow simpler and more precise specification of quality indicators. CONCLUSIONS: As the evidence indicates that a diagnosis-timing flag improves the ability of routinely collected, coded hospital data to support outcomes research and the development of quality and safety indicators, the Group recommends a classification of 'arising after admission' (yes/no), with permitted designations of 'unknown or clinically undetermined'; this will facilitate coding while providing flexibility when there is uncertainty. Clear coding standards and guidelines with ongoing coder education will be necessary to ensure the reliability of the diagnosis-timing flag.
Abstract:
The Commission on Classification and Terminology and the Commission on Epidemiology of the International League Against Epilepsy (ILAE) have charged a Task Force to revise the concepts, definition, and classification of status epilepticus (SE). The proposed new definition of SE is as follows: status epilepticus is a condition resulting either from the failure of the mechanisms responsible for seizure termination or from the initiation of mechanisms which lead to abnormally prolonged seizures (after time point t1). It is a condition that can have long-term consequences (after time point t2), including neuronal death, neuronal injury, and alteration of neuronal networks, depending on the type and duration of seizures. This definition is conceptual, with two operational dimensions: the first is the length of the seizure and the time point (t1) beyond which the seizure should be regarded as "continuous seizure activity." The second time point (t2) is the time of ongoing seizure activity after which there is a risk of long-term consequences. In the case of convulsive (tonic-clonic) SE, both time points (t1 at 5 min and t2 at 30 min) are based on animal experiments and clinical research. This evidence is incomplete, and there is furthermore considerable variation, so these time points should be considered the best estimates currently available. Data are not yet available for other forms of SE, but as knowledge and understanding increase, time points can be defined for specific forms of SE based on scientific evidence and incorporated into the definition, without changing the underlying concepts. A new diagnostic classification system of SE is proposed, which will provide a framework for clinical diagnosis, investigation, and therapeutic approaches for each patient. There are four axes: (1) semiology; (2) etiology; (3) electroencephalography (EEG) correlates; and (4) age.
Axis 1 (semiology) lists different forms of SE, divided into those with prominent motor symptoms, those without prominent motor symptoms, and currently indeterminate conditions (such as acute confusional states with epileptiform EEG patterns). Axis 2 (etiology) is divided into subcategories of known and unknown causes. Axis 3 (EEG correlates) adopts the latest consensus-panel recommendations to use the following descriptors for the EEG: name of pattern, morphology, location, time-related features, modulation, and effect of intervention. Finally, axis 4 divides age groups into neonatal, infancy, childhood, adolescence and adulthood, and elderly.
Abstract:
An important aspect of immune monitoring for vaccine development, clinical trials, and research is the detection, measurement, and comparison of antigen-specific T-cells from subject samples under different conditions. Antigen-specific T-cells compose a very small fraction of total T-cells. Developments in cytometry technology over the past five years have enabled the measurement of single cells in a multivariate and high-throughput manner. This growth in both the dimensionality and quantity of data continues to pose a challenge for the effective identification and visualization of rare cell subsets, such as antigen-specific T-cells. Dimension reduction and feature extraction play pivotal roles in both identifying and visualizing cell populations of interest in large, multi-dimensional cytometry datasets. However, the automated identification and visualization of rare, high-dimensional cell subsets remains challenging. Here we demonstrate how a systematic and integrated approach combining targeted feature extraction with dimension reduction can be used to identify and visualize biological differences in rare, antigen-specific cell populations. By using OpenCyto to perform semi-automated gating and feature extraction of flow cytometry data, followed by dimensionality reduction with t-SNE, we are able to identify polyfunctional subpopulations of antigen-specific T-cells and visualize treatment-specific differences between them.
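The dimension-reduction step can be sketched with simulated data; the marker count, population sizes, and perplexity value below are arbitrary choices for illustration, not the study's settings:

```python
import numpy as np
from sklearn.manifold import TSNE

# Sketch of the reduction step only: embed high-dimensional "marker"
# measurements into 2-D with t-SNE so that a rare subpopulation becomes
# visually separable. Values are simulated, not real cytometry data.
rng = np.random.default_rng(0)
bulk = rng.normal(0.0, 1.0, (95, 12))   # abundant cells, 12 markers each
rare = rng.normal(4.0, 0.5, (5, 12))    # rare antigen-specific-like subset
cells = np.vstack([bulk, rare])

# Perplexity must stay below the number of samples; small here because
# the simulated dataset is tiny.
embedding = TSNE(n_components=2, perplexity=10,
                 random_state=0).fit_transform(cells)
```

In the full pipeline described above, gating and per-population feature extraction (via OpenCyto) would precede this embedding, so that only the targeted features enter the reduction.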
Abstract:
The main objective of the study is to form a framework that provides tools to recognise and classify items whose demand is not smooth but varies greatly in size and/or frequency. The framework is then combined with two other classification methods to form a three-dimensional classification model. Forecasting and inventory control of these abnormal-demand items is difficult; therefore, another objective of this study is to find out which statistical forecasting method is most suitable for forecasting abnormal-demand items. The accuracy of the different methods is measured by comparing the forecasts with the actual demand. Moreover, the study also aims at finding proper alternatives for the inventory control of abnormal-demand items. The study is quantitative and the methodology is a case study; the research methods consist of theory, numerical data, a current-state analysis, and testing of the framework in a case company. The results of the study show that the framework makes it possible to recognise and classify abnormal-demand items. It is also observed that the inventory performance of abnormal-demand items differs significantly from that of smoothly demanded items, which makes the recognition of abnormal-demand items very important.
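Abnormal (non-smooth) demand is often recognised with a two-dimensional cut on the average inter-demand interval (ADI) and the squared coefficient of variation of demand sizes (CV2), using the Syntetos-Boylan cutoffs ADI = 1.32 and CV2 = 0.49. A minimal sketch of that widely used scheme, a generic stand-in rather than the thesis's own framework:

```python
import numpy as np

def classify_demand(series, adi_cut=1.32, cv2_cut=0.49):
    """Classify a demand history as smooth, erratic, intermittent, or lumpy
    using the Syntetos-Boylan ADI / CV^2 cutoffs."""
    series = np.asarray(series, dtype=float)
    nonzero = series[series > 0]
    adi = len(series) / len(nonzero)             # average inter-demand interval
    cv2 = (nonzero.std() / nonzero.mean()) ** 2  # variability of demand sizes
    if adi < adi_cut:
        return "smooth" if cv2 < cv2_cut else "erratic"
    return "intermittent" if cv2 < cv2_cut else "lumpy"

# Hypothetical monthly demand histories for two items
smooth_item = [10, 11, 9, 10, 12, 10, 11, 9]   # regular, stable sizes
lumpy_item = [0, 0, 40, 0, 0, 1, 0, 0]         # sporadic, highly variable
```

Classifications like this, extended with further dimensions (for example value or criticality), give the kind of multi-dimensional model the study describes, and the resulting classes then determine which forecasting and inventory-control policies are tested per item.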