930 results for classification algorithm
Summary:
Résumé: Following recent technological advances, digital image archives have seen unprecedented qualitative and quantitative growth. Despite the enormous possibilities they offer, these advances raise new questions about how to process the masses of data acquired. This question is the basis of this Thesis: problems of processing digital information at very high spatial and/or spectral resolution are addressed using statistical learning approaches, namely kernel methods. This Thesis studies image classification problems, that is, the categorization of pixels into a reduced number of classes reflecting the spectral and contextual properties of the objects they represent. The emphasis is on the efficiency of the algorithms and on their simplicity, so as to increase their potential for implementation by users. In addition, the challenge of this Thesis is to stay close to the concrete problems of satellite image users without losing sight of the interest of the proposed methods for the machine learning community from which they originate. In this sense, the work is deliberately transdisciplinary, maintaining a strong link between the two disciplines in all the developments proposed. Four models are proposed: the first addresses the problem of high dimensionality and data redundancy with a model that optimizes classification performance by adapting to the particularities of the image. This is made possible by a ranking system for the variables (the bands) that is optimized at the same time as the base model: in this way, only the variables important for solving the problem are used by the classifier.
The lack of labeled information and the uncertainty about its relevance to the problem motivate the next two models, based respectively on active learning and semi-supervised methods: the first improves the quality of a training set through direct interaction between the user and the machine, while the second uses unlabeled pixels to improve the description of the available data and the robustness of the model. Finally, the last model proposed considers the more theoretical question of structure among the outputs: integrating this source of information, never before considered in remote sensing, opens new research challenges.
Advanced kernel methods for remote sensing image classification. Devis Tuia, Institut de Géomatique et d'Analyse du Risque, September 2009.
Abstract: The technical developments of recent years have brought the quantity and quality of digital information to an unprecedented level, as enormous archives of satellite images are now available to users. However, even if these advances open more and more possibilities in the use of digital imagery, they also raise several problems of storage and processing. The latter is considered in this Thesis: the processing of very high spatial and spectral resolution images is treated with approaches based on data-driven algorithms relying on kernel methods. In particular, the problem of image classification, i.e. the categorization of the image's pixels into a reduced number of classes reflecting spectral and contextual properties, is studied through the different models presented. The emphasis is on algorithmic efficiency and on the simplicity of the approaches proposed, to avoid overly complex models that users would not adopt.
The major challenge of the Thesis is to remain close to concrete remote sensing problems without losing the methodological interest from the machine learning viewpoint: in this sense, this work aims at building a bridge between the machine learning and remote sensing communities, and all the models proposed have been developed keeping in mind the need for such a synergy. Four models are proposed. First, an adaptive model that learns the relevant image features addresses the problem of high dimensionality and collinearity of the image features. This model automatically provides an accurate classifier and a ranking of the relevance of the individual features. The scarcity and unreliability of labeled information were the common root of the second and third models proposed: when confronted with such problems, the user can either construct the labeled set iteratively by direct interaction with the machine or use the unlabeled data to increase the robustness and quality of the description of the data. Both solutions have been explored, resulting in two methodological contributions based respectively on active learning and semi-supervised learning. Finally, the more theoretical issue of structured outputs has been considered in the last model, which, by integrating output similarity into a model, opens new challenges and opportunities for remote sensing image processing.
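The iterative, user-in-the-loop construction of the labeled set can be illustrated with one common active learning heuristic, margin sampling: the pixels whose two most probable classes are closest are sent to the user for labeling. The abstract does not say which selection criterion the thesis uses, so the heuristic, the function name `margin_sampling`, and the toy probabilities below are assumptions made for illustration only.

```python
def margin_sampling(probs, batch_size):
    """Pick the pool samples the classifier is least sure about.

    probs: list of per-sample class-probability lists (hypothetical
           classifier output for the unlabeled pool).
    batch_size: number of samples to send to the user for labeling.
    Returns the indices of the samples with the smallest margin between
    the two most probable classes.
    """
    margins = []
    for index, sample_probs in enumerate(probs):
        top = sorted(sample_probs, reverse=True)
        # Small margin = the two best classes are nearly tied.
        margins.append((top[0] - top[1], index))
    margins.sort()
    return [index for _, index in margins[:batch_size]]


# Toy pool of three pixels and two classes (illustrative values).
pool_probs = [[0.9, 0.1], [0.55, 0.45], [0.6, 0.4]]
to_label = margin_sampling(pool_probs, batch_size=2)
```

In a full loop, the selected pixels would be labeled by the user, added to the training set, and the classifier retrained before the next selection round.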
Summary:
BACKGROUND: To compare the prognostic relevance of Masaoka and Müller-Hermelink classifications. METHODS: We treated 71 patients with thymic tumors at our institution between 1980 and 1997. Complete follow-up was achieved in 69 patients (97%) with a mean follow-up time of 8.3 years (range, 9 months to 17 years). RESULTS: Masaoka stage I was found in 31 patients (44.9%), stage II in 17 (24.6%), stage III in 19 (27.6%), and stage IV in 2 (2.9%). The 10-year overall survival rate was 83.5% for stage I, 100% for stage IIa, 58% for stage IIb, 44% for stage III, and 0% for stage IV. The disease-free survival rates were 100%, 70%, 40%, 38%, and 0%, respectively. Histologic classification according to Müller-Hermelink found medullary tumors in 7 patients (10.1%), mixed in 18 (26.1%), organoid in 14 (20.3%), cortical in 11 (15.9%), well-differentiated thymic carcinoma in 14 (20.3%), and endocrine carcinoma in 5 (7.3%), with 10-year overall survival rates of 100%, 75%, 92%, 87.5%, 30%, and 0%, respectively, and 10-year disease-free survival rates of 100%, 100%, 77%, 75%, 37%, and 0%, respectively. Medullary, mixed, and well-differentiated organoid tumors were correlated with stage I and II, and well-differentiated thymic carcinoma and endocrine carcinoma with stage III and IV (p < 0.001). Multivariate analysis showed age, gender, myasthenia gravis, and postoperative adjuvant therapy not to be significant predictors of overall and disease-free survival after complete resection, whereas the Müller-Hermelink and Masaoka classifications were independent significant predictors for overall (p < 0.05) and disease-free survival (p < 0.004; p < 0.0001). CONCLUSIONS: The consideration of staging and histology in thymic tumors has the potential to improve recurrence prediction and patient selection for combined treatment modalities.
Summary:
ABSTRACT Preservation of mangroves, a very significant ecosystem from a social, economic, and environmental viewpoint, requires knowledge of soil composition, genesis, morphology, and classification. These aspects are of paramount importance to understand the dynamics of sustainability and preservation of this natural resource. In this study, mangrove soils in the Subaé river basin were described and classified, and inorganic waste concentrations were evaluated. Seven pedons of mangrove soil were chosen, five under fluvial influence and two under marine influence, and analyzed for morphology. Samples of horizons and layers were collected for physical and chemical analyses, including heavy metals (Pb, Cd, Mn, Zn, and Fe). The moist soils were suboxidic, with Eh values below 350 mV. The pH level of the pedons under fluvial influence ranged from moderately acid to alkaline, while the pH in pedons under marine influence was around 7.0 throughout the profile. The concentration of cations in the sorption complex for all pedons, independent of fluvial or marine influence, indicated the following order: Na+>Mg2+>Ca2+>K+. Mangrove soils from the Subaé river basin under fluvial and marine influence had different morphological, physical, and chemical characteristics. The highest Pb and Cd concentrations were found in the pedons under fluvial influence, perhaps due to their closeness to the mining company Plumbum, while the concentrations in pedon P7 were lowest, due to greater distance from the factory. Because they contained at least one metal above the reference levels established by the National Oceanic and Atmospheric Administration (United States Environmental Protection Agency), the pedons were classified as potentially toxic.
The soils were classified as Gleissolos Tiomórficos Órticos (sálicos) sódico neofluvissólico according to the Brazilian Soil Classification System, indicating potential toxicity and very poor drainage, except for pedon P7, which was classified in the same subgroup as the others but differed in that its metal concentrations met acceptable standards.
Summary:
As part of its 2006 systemic evaluation of DOC’s facilities, operations and programming, the Durrant/PBA consulting group found several shortcomings with the Department’s inmate custody classification system. Specifically, the consultants found that the system:
Summary:
Abstract: This work is concerned with the development and application of novel unsupervised learning methods, having in mind two target applications: the analysis of forensic case data and the classification of remote sensing images. First, a method based on a symbolic optimization of the inter-sample distance measure is proposed to improve the flexibility of spectral clustering algorithms, and applied to the problem of forensic case data. This distance is optimized using a loss function related to the preservation of neighborhood structure between the input space and the space of principal components, and solutions are found using genetic programming. Results are compared to a variety of state-of-the-art clustering algorithms. Subsequently, a new large-scale clustering method based on a joint optimization of feature extraction and classification is proposed and applied to various databases, including two hyperspectral remote sensing images. The algorithm makes use of a functional model (e.g., a neural network) for clustering, which is trained by stochastic gradient descent. Results indicate that such a technique can easily scale to huge databases, can avoid the so-called out-of-sample problem, and can compete with or even outperform existing clustering algorithms on both artificial data and real remote sensing images. This is verified on small databases as well as very large problems.
Résumé: This research concerns the development and application of so-called unsupervised learning methods. The applications targeted by these methods are the analysis of forensic data and the classification of hyperspectral remote sensing images. First, an unsupervised classification methodology based on the symbolic optimization of an inter-sample distance measure is proposed. This measure is obtained by optimizing a cost function related to the preservation of the neighborhood structure of a point between the space of the initial variables and the space of the principal components. The method is applied to the analysis of forensic data and compared with a range of existing methods. Second, a method based on the joint optimization of variable selection and classification is implemented in a neural network and applied to various databases, including two hyperspectral images. The neural network is trained with a stochastic gradient algorithm, which makes the technique applicable to very high resolution images. The results show that such a technique can classify very large databases without difficulty and compares favorably with existing methods.
Summary:
Map produced by Iowa Department of Transportation of System Classification.
Summary:
For the last 2 decades, supertree reconstruction has been an active field of research and has seen the development of a large number of major algorithms. Because of the growing popularity of the supertree methods, it has become necessary to evaluate the performance of these algorithms to determine which are the best options (especially with regard to the supermatrix approach that is widely used). In this study, seven of the most commonly used supertree methods are investigated by using a large empirical data set (in terms of number of taxa and molecular markers) from the worldwide flowering plant family Sapindaceae. Supertree methods were evaluated using several criteria: similarity of the supertrees with the input trees, similarity between the supertrees and the total evidence tree, level of resolution of the supertree and computational time required by the algorithm. Additional analyses were also conducted on a reduced data set to test if the performance levels were affected by the heuristic searches rather than the algorithms themselves. Based on our results, two main groups of supertree methods were identified: on one hand, the matrix representation with parsimony (MRP), MinFlip, and MinCut methods performed well according to our criteria, whereas the average consensus, split fit, and most similar supertree methods showed a poorer performance or at least did not behave the same way as the total evidence tree. Results for the super distance matrix, that is, the most recent approach tested here, were promising with at least one derived method performing as well as MRP, MinFlip, and MinCut. The output of each method was only slightly improved when applied to the reduced data set, suggesting a correct behavior of the heuristic searches and a relatively low sensitivity of the algorithms to data set sizes and missing data. 
Results also showed that the MRP analyses could reach a high level of quality even when using a simple heuristic search strategy, with the exception of MRP with the Purvis coding scheme and reversible parsimony. The future of supertrees lies in the implementation of a standardized heuristic search for all methods and the increase in computing power to handle large data sets. The latter would prove particularly useful for promising approaches such as the maximum quartet fit method, which still requires substantial computing power.
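To make the matrix representation step of MRP concrete, the sketch below encodes input trees with the standard Baum-Ragan scheme: each clade of each input tree becomes one binary character, scored 1 for taxa inside the clade, 0 for other taxa present in that tree, and ? for taxa missing from it. The input format (a taxon set plus a list of clades per tree), the function name `mrp_matrix`, and the toy trees are assumptions for illustration; the parsimony search that follows in a real analysis is not shown, and the Purvis coding mentioned above differs from this scheme.

```python
def mrp_matrix(trees):
    """Build a Baum-Ragan MRP matrix.

    trees: list of (taxa, clades) pairs, where taxa is the set of taxa in
           one input tree and clades is a list of sets, one per internal
           clade of that tree (hypothetical input format).
    Returns a dict mapping each taxon to its character string.
    """
    all_taxa = sorted(set().union(*(taxa for taxa, _ in trees)))
    rows = {taxon: [] for taxon in all_taxa}
    for taxa, clades in trees:
        for clade in clades:
            for taxon in all_taxa:
                if taxon in clade:
                    rows[taxon].append("1")  # member of this clade
                elif taxon in taxa:
                    rows[taxon].append("0")  # in the tree, outside the clade
                else:
                    rows[taxon].append("?")  # absent from this input tree
    return {taxon: "".join(chars) for taxon, chars in rows.items()}


# Two toy input trees with partially overlapping taxa (illustrative only).
toy_trees = [
    ({"A", "B", "C"}, [{"A", "B"}]),
    ({"B", "C", "D"}, [{"C", "D"}]),
]
matrix = mrp_matrix(toy_trees)
```

The resulting matrix would then be passed to a parsimony program, usually with an all-zero outgroup row added to root the supertree.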
Summary:
When dealing with multi-angular image sequences, problems of reflectance changes due either to illumination and acquisition geometry, or to interactions with the atmosphere, naturally arise. These phenomena interplay with the scene and lead to a modification of the measured radiance: for example, according to the angle of acquisition, tall objects may be seen from the top or from the side, and different light scatterings may affect the surfaces. This results in shifts in the acquired radiance, which make the problem of multi-angular classification harder and might lead to catastrophic results, since surfaces with the same reflectance return significantly different signals. In this paper, rather than performing atmospheric or bidirectional reflectance distribution function (BRDF) correction, a non-linear manifold learning approach is used to align data structures. This method maximizes the similarity between the different acquisitions by deforming their manifold, thus enhancing the transferability of classification models among the images of the sequence.
Summary:
For several years, the lack of consensus on definition, nomenclature, natural history, and biology of serrated polyps (SPs) of the colon has created considerable confusion among pathologists. According to the latest WHO classification, the family of SPs comprises hyperplastic polyps (HPs), sessile serrated adenomas/polyps (SSA/Ps), and traditional serrated adenomas (TSAs). The term SSA/P with dysplasia has replaced the category of mixed hyperplastic/adenomatous polyps (MPs). The present study aimed to evaluate the reproducibility of the diagnosis of SPs based on currently available diagnostic criteria and interactive consensus development. In an initial round, H&E slides of 70 cases of SPs were circulated among participating pathologists across Europe. This round was followed by a consensus discussion on diagnostic criteria. A second round was performed on the same 70 cases using the revised criteria and definitions according to the recent WHO classification. Data were evaluated for inter-observer agreement using Kappa statistics. In the initial round, for the total of 70 cases, a fair overall kappa value of 0.318 was reached, while in the second round overall kappa value improved to moderate (kappa = 0.557; p < 0.001). Overall kappa values for each diagnostic category also significantly improved in the final round, reaching 0.977 for HP, 0.912 for SSA/P, and 0.845 for TSA (p < 0.001). The diagnostic reproducibility of SPs improves when strictly defined, standardized diagnostic criteria adopted by consensus are applied.
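The abstract reports kappa statistics without naming the variant. As a minimal sketch, assuming Fleiss' kappa (a common choice when several raters label every case) and the Landis and Koch verbal scale, which is consistent with the abstract calling 0.318 "fair" and 0.557 "moderate", inter-observer agreement could be computed as follows. The function names and the toy ratings are illustrative and are not taken from the study.

```python
from collections import Counter


def fleiss_kappa(ratings):
    """Fleiss' kappa for a list of cases, each a list of rater labels.

    Every case must be rated by the same number of raters (at least two).
    """
    n_cases = len(ratings)
    n_raters = len(ratings[0])
    counts = [Counter(case) for case in ratings]
    categories = sorted({label for case in ratings for label in case})
    # Observed agreement: proportion of agreeing rater pairs per case.
    per_case = [
        (sum(c[cat] ** 2 for cat in categories) - n_raters)
        / (n_raters * (n_raters - 1))
        for c in counts
    ]
    observed = sum(per_case) / n_cases
    # Chance agreement from the overall category proportions.
    proportions = [
        sum(c[cat] for c in counts) / (n_cases * n_raters) for cat in categories
    ]
    expected = sum(p * p for p in proportions)
    if expected == 1.0:
        return 1.0  # only one category was ever used
    return (observed - expected) / (1.0 - expected)


def landis_koch(kappa):
    """Verbal label for a kappa value on the Landis and Koch scale."""
    if kappa < 0:
        return "poor"
    for upper, label in [(0.20, "slight"), (0.40, "fair"), (0.60, "moderate"),
                         (0.80, "substantial")]:
        if kappa <= upper:
            return label
    return "almost perfect"


# Toy example: three hypothetical raters, two cases, full agreement.
toy = [["HP", "HP", "HP"], ["TSA", "TSA", "TSA"]]
toy_kappa = fleiss_kappa(toy)
```

On this scale, the per-category values reported for the final round (0.977, 0.912, 0.845) all fall in the "almost perfect" band.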
Summary:
The primary goal of this project is to demonstrate the accuracy and utility of a freezing drizzle algorithm that can be implemented on roadway environmental sensing systems (ESSs). The types of problems related to the occurrence of freezing precipitation range from simple traffic delays to major accidents that involve fatalities. Freezing drizzle can also lead to economic impacts in communities with lost work hours, vehicular damage, and downed power lines. There are means for transportation agencies to perform preventive and reactive treatments to roadways, but freezing drizzle can be difficult to forecast accurately or even detect as weather radar and surface observation networks poorly observe these conditions. The detection of freezing precipitation is problematic and requires special instrumentation and analysis. The Federal Aviation Administration (FAA) development of aircraft anti-icing and deicing technologies has led to the development of a freezing drizzle algorithm that utilizes air temperature data and a specialized sensor capable of detecting ice accretion. However, at present, roadway ESSs are not capable of reporting freezing drizzle. This study investigates the use of the methods developed for the FAA and the National Weather Service (NWS) within a roadway environment to detect the occurrence of freezing drizzle using a combination of icing detection equipment and available ESS sensors. The work performed in this study incorporated the algorithm developed initially and further modified for work with the FAA for aircraft icing. The freezing drizzle algorithm developed for the FAA was applied using data from standard roadway ESSs. 
The work performed in this study lays the foundation for addressing the central question of interest to winter maintenance professionals as to whether it is possible to use roadside freezing precipitation detection (e.g., icing detection) sensors to determine the occurrence of pavement icing during freezing precipitation events and the rates at which this occurs.