985 results for Patterns recognition
Abstract:
Biometric system performance can be improved by means of data fusion. Several kinds of information can be fused in order to obtain a more accurate classification (identification or verification) of an input sample. In this paper we present a method for computing the weights in a weighted-sum fusion of scores by means of a likelihood model. The maximum likelihood estimation is posed as a linear programming problem. Each score is derived from a GMM classifier working on a different feature extractor. Our experimental results assessed the robustness of the system to changes over time (different sessions) and to a change of microphone. The improvements obtained were significantly better (error bars of two standard deviations) than a uniform weighted sum, a uniform weighted product, or the best single classifier. The computational cost of the proposed method scales with the number of scores to be fused in the same way as the simplex method for linear programming.
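The weighted-sum fusion described in this abstract can be illustrated with a minimal sketch. The function name `fuse_scores`, the three scores, and both weight vectors are hypothetical values chosen for illustration; the abstract's maximum likelihood weight estimation via linear programming is not reproduced here.

```python
def fuse_scores(scores, weights):
    """Weighted-sum fusion: combine per-classifier scores into one score."""
    if len(scores) != len(weights):
        raise ValueError("one weight per score is required")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    # Normalise so the weights sum to 1, which keeps the fused score
    # on the same scale as the input scores.
    return sum(w * s for w, s in zip(weights, scores)) / total


# Hypothetical scores from three classifiers for one test sample.
scores = [0.9, 0.4, 0.7]

# Baseline mentioned in the abstract: uniform weighted sum.
uniform = fuse_scores(scores, [1, 1, 1])

# Illustrative non-uniform weights (assumed, not estimated from data).
weighted = fuse_scores(scores, [0.6, 0.1, 0.3])
```

With non-uniform weights, a classifier judged more reliable contributes more to the fused score, which is the property the estimated weights are meant to exploit.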
Abstract:
In this paper we propose the inversion of nonlinear distortions in order to improve the recognition rates of a speaker recognition system. We study the effect of saturation on the test signals, to account for real situations in which the training material has been recorded under controlled conditions but the test signals present some mismatch in input signal level (saturation). The experimental results for speaker recognition show that a combination of several strategies can improve the recognition rates with saturated test sentences from 80% to 89.39%, while the result with clean speech (without saturation) is 87.76% for one microphone; for speaker identification, it can reduce the minimum detection cost function with saturated test sentences from 6.42% to 4.15%, while the results with clean speech (without saturation) are 5.74% for one microphone and 7.02% for the other.
Abstract:
In this work we present a simulation of a recognition process that uses the perimeter characterization of simple plant leaves as the sole discriminating parameter. Data coding that allows for independence of leaf size and orientation may penalize recognition performance for some varieties. Border description sequences are then used, and Principal Component Analysis (PCA) is applied in order to study the best number of components for the classification task, which is implemented by means of a Support Vector Machine (SVM) system. The results obtained are satisfactory, and compared with [4] our system improves the recognition success while reducing the variance.
Abstract:
In this work we present a simulation of a recognition process that uses the perimeter characterization of simple plant leaves as the sole discriminating parameter. Data coding that allows for independence of leaf size and orientation may penalize recognition performance for some varieties. Border description sequences are then used to characterize the leaves. Independent Component Analysis (ICA) is then applied in order to study the best number of components to consider for the classification task, which is implemented by means of an Artificial Neural Network (ANN). The results obtained with ICA as a pre-processing tool are satisfactory, and compared with some references our system improves the recognition success up to 80.8%, depending on the number of independent components considered.
Abstract:
In this work we explore multivariate empirical mode decomposition (mEMD) combined with a neural network classifier as a technique for face recognition tasks. Images are simultaneously decomposed by means of EMD, and then the distance between the modes of the image and the modes of the representative image of each class is calculated using three different distance measures. Then, a neural network is trained using 10-fold cross-validation in order to derive a classifier. Preliminary results (a classification rate of over 98%) are satisfactory and justify a deeper investigation of how to apply mEMD to face recognition.
Abstract:
New plate-tectonic reconstructions of the Gondwana margin suggest that the location of Gondwana-derived terranes should not only be guided by the models, but should also consider the possible detrital input from some Asian blocks (Hunia), supposed to have been located along the Cambrian Gondwana margin, and accreted in the Silurian to the North-Chinese block. Consequently, the Gondwana margin has to be subdivided into a more western domain, where the future Avalonian blocks will be separated from Gondwana by the opening Rheic Ocean, whereas in its eastern continuation, hosting the future basement areas of Central Europe, different periods of crustal extension should be distinguished. Instead of applying a rather cylindrical model, it is supposed that crustal extension follows a much more complex pattern, where local back-arcs or intra-continental rifts are involved. Guided by the age data of magmatic rocks and the pattern of subsidence curves, the following extensional events can be distinguished: During the early to middle Cambrian, a back-arc setting guided the evolution at the Gondwana margin. Contemporaneous intra-continental rift basins developed at other places related to a general post-PanAfrican extensional phase affecting Africa. Upper Cambrian formation of oceanic crust is manifested in the Chamrousse area, and may have lateral cryptic relics preserved in other places. This is regarded as the oceanisation of some marginal basins in a context of back-arc rifting. These basins were closed in a mid-Ordovician tectonic phase, related to the subduction of buoyant material (mid-ocean ridge?). Since the Early Ordovician, a new phase of extension has been observed, accompanied by large-scale volcanic activity; erosion of the rift shoulders generated detritus (Armorican Quartzite), and the rift basins collected detrital zircons from a wide hinterland. 
This phase heralded the opening of Palaeotethys, but it failed due to the Silurian collision (Eo-Variscan phase) of an intra-oceanic arc with the Gondwana margin. During this time period, the drift of the future Hunia microcontinents began at the eastern wing of the Gondwana margin, through the opening of an eastern prolongation of the already existing Rheic Ocean. The passive margin of the remaining Gondwana was composed of the Galatian superterranes, constituents of the future Variscan basement areas. Remaining under the influence of crustal extension, they started their drift towards Laurussia in the earliest Devonian, during the opening of the Palaeotethys Ocean. (C) 2008 Elsevier B.V. All rights reserved.
Abstract:
The infiltration of river water into aquifers is of high relevance to drinking-water production and is a key driver of biogeochemical processes in the hyporheic and riparian zone, but the distribution and quantification of the infiltrating water are difficult to determine using conventional hydrological methods (e.g., borehole logging and tracer tests). By time-lapse inverting crosshole ERT (electrical resistivity tomography) monitoring data, we imaged groundwater flow patterns driven by river water infiltrating a perialpine gravel aquifer in northeastern Switzerland. This was possible because the electrical resistivity of the infiltrating water changed during rainfall-runoff events. Our time-lapse resistivity models indicated rather complex flow patterns as a result of spatially heterogeneous bank filtration and aquifer heterogeneity. The upper part of the aquifer was most affected by the river infiltrate, and the highest groundwater velocities and possible preferential flow occurred at shallow to intermediate depths. Time series of the reconstructed resistivity models matched groundwater electrical resistivity data recorded on borehole loggers in the upper and middle parts of the aquifer, whereas the resistivity models displayed smaller variations and delayed responses with respect to the logging data in the lower part. This study demonstrated that crosshole ERT monitoring of natural electrical resistivity variations of river infiltrate could be used to image and quantify 3D bank filtration and aquifer dynamics at a high spatial resolution.
Abstract:
The value of earmarks as an efficient means of personal identification is still subject to debate. It has been argued that the field lacks a firm, systematic and structured data basis to help practitioners form their conclusions. Typically, there is a paucity of research on the selectivity of the features used in the comparison process between an earmark and reference earprints taken from an individual. This study proposes a system for the automatic comparison of earprints and earmarks, operating without any manual extraction of key-points or manual annotations. For each donor, a model is created using multiple reference prints, hence capturing the donor's within-source variability. For each comparison between a mark and a model, images are automatically aligned and a proximity score, based on a normalized 2D correlation coefficient, is calculated. Appropriate use of this score allows a likelihood ratio to be derived that can be explored under known states of affairs (both in cases where it is known that the mark has been left by the donor that gave the model and, conversely, in cases where it is established that the mark originates from a different source). To assess the system performance, a first dataset containing 1229 donors, compiled during the FearID research project, was used. Based on these data, for mark-to-print comparisons, the system performed with an equal error rate (EER) of 2.3%, and about 88% of marks were found in the first 3 positions of a hitlist. For print-to-print comparisons, results show an equal error rate of 0.5%. The system was then tested using real-case data obtained from police forces.
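The proximity score mentioned in this abstract is based on a normalized 2D correlation coefficient. A minimal sketch of that coefficient for two already-aligned, equal-size grayscale images is given below; the function name `corr2` and the tiny test images are assumptions for illustration, and the abstract's automatic alignment step and likelihood-ratio derivation are not covered.

```python
import math


def corr2(image_a, image_b):
    """Normalized 2D correlation coefficient of two equal-size images.

    Each image is a list of rows of pixel intensities. The result lies in
    [-1, 1]; 1 means the images match up to brightness and contrast.
    """
    flat_a = [value for row in image_a for value in row]
    flat_b = [value for row in image_b for value in row]
    if not flat_a or len(flat_a) != len(flat_b):
        raise ValueError("images must be non-empty and of equal size")

    mean_a = sum(flat_a) / len(flat_a)
    mean_b = sum(flat_b) / len(flat_b)

    # Covariance-like numerator and the product of the two energy terms.
    numerator = sum((a - mean_a) * (b - mean_b) for a, b in zip(flat_a, flat_b))
    energy_a = sum((a - mean_a) ** 2 for a in flat_a)
    energy_b = sum((b - mean_b) ** 2 for b in flat_b)

    denominator = math.sqrt(energy_a * energy_b)
    if denominator == 0.0:
        # A constant image carries no pattern to correlate against.
        return 0.0
    return numerator / denominator


# Hypothetical 2x2 "images" used only to exercise the function.
mark = [[0, 1], [2, 3]]
inverted = [[3, 2], [1, 0]]
same_score = corr2(mark, mark)
opposite_score = corr2(mark, inverted)
```

Because the means are subtracted and the result is divided by the energy terms, the score is insensitive to uniform changes in brightness and contrast between the two images.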
Abstract:
The objective of this study was to evaluate the performance of stacked species distribution models in predicting the alpha and gamma species diversity patterns of two important plant clades along elevation in the Andes. We modelled the distribution of the species in the Anthurium genus (53 species) and the Bromeliaceae family (89 species) using six modelling techniques. We combined all of the predictions for the same species in ensemble models based on two different criteria: the average of the rescaled predictions by all techniques and the average of the best techniques. The rescaled predictions were then reclassified into binary predictions (presence/absence). By stacking either the original predictions or binary predictions for both ensemble procedures, we obtained four different species richness models per taxa. The gamma and alpha diversity per elevation band (500 m) was also computed. To evaluate the prediction abilities for the four predictions of species richness and gamma diversity, the models were compared with the real data along an elevation gradient that was independently compiled by specialists. Finally, we also tested whether our richness models performed better than a null model of altitudinal changes of diversity based on the literature. Stacking of the ensemble prediction of the individual species models generated richness models that proved to be well correlated with the observed alpha diversity richness patterns along elevation and with the gamma diversity derived from the literature. Overall, these models tend to overpredict species richness. The use of the ensemble predictions from the species models built with different techniques seems very promising for modelling of species assemblages. Stacking of the binary models reduced the over-prediction, although more research is needed. The randomisation test proved to be a promising method for testing the performance of the stacked models, but other implementations may still be developed.
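The stacking step described in this abstract, summing per-species predictions to obtain species richness, can be sketched as follows. The species names, suitability values, and the 0.5 threshold are hypothetical; the abstract does not state how the binary threshold was chosen, and the ensemble modelling itself is not reproduced.

```python
def stack_richness(predictions, threshold=None):
    """Stack per-species predictions into a richness estimate per site.

    predictions maps a species name to a list of predicted suitability
    values, one per site (for example, one per elevation band).
    With threshold=None the continuous predictions are summed; otherwise
    each prediction is first reclassified to presence (1) or absence (0).
    """
    site_counts = {len(values) for values in predictions.values()}
    if len(site_counts) != 1:
        raise ValueError("all species need predictions for the same sites")
    n_sites = site_counts.pop()

    richness = [0.0] * n_sites
    for values in predictions.values():
        for site, suitability in enumerate(values):
            if threshold is None:
                richness[site] += suitability
            elif suitability >= threshold:
                richness[site] += 1.0
    return richness


# Hypothetical ensemble predictions for three species at three sites.
predictions = {
    "species_a": [0.9, 0.6, 0.2],
    "species_b": [0.7, 0.4, 0.1],
    "species_c": [0.6, 0.4, 0.3],
}

continuous_richness = stack_richness(predictions)
binary_richness = stack_richness(predictions, threshold=0.5)
```

In this toy example the two stacks differ at every site, which shows why the choice between stacking continuous and binary predictions affects the predicted richness.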
Abstract:
In contrast with the low frequency of most single epitope reactive T cells in the preimmune repertoire, up to 1 of 1,000 naive CD8(+) T cells from A2(+) individuals specifically bind fluorescent A2/peptide multimers incorporating the A27L analogue of the immunodominant 26-35 peptide from the melanocyte differentiation and melanoma associated antigen Melan-A. This represents the only naive antigen-specific T cell repertoire accessible to direct analysis in humans up to date. To get insight into the molecular basis for the selection and maintenance of such an abundant repertoire, we analyzed the functional diversity of T cells composing this repertoire ex vivo at the clonal level. Surprisingly, we found a significant proportion of multimer(+) clonotypes that failed to recognize both Melan-A analogue and parental peptides in a functional assay but efficiently recognized peptides from proteins of self- or pathogen origin selected for their potential functional cross-reactivity with Melan-A. Consistent with these data, multimers incorporating some of the most frequently recognized peptides specifically stained a proportion of naive CD8(+) T cells similar to that observed with Melan-A multimers. Altogether these results indicate that the high frequency of Melan-A multimer(+) T cells can be explained by the existence of largely cross-reactive subsets of naive CD8(+) T cells displaying multiple specificities.
Abstract:
This study investigates the potential stages of drug use. Data from the longitudinal Cohort Study on Substance Use Risk Factors were used (N = 5,116). Drug use (alcohol, tobacco, and 16 illicit drugs) over the previous 12 months was assessed at two time points. Patterns and trajectories of drug use were studied using latent transition analysis (LTA). This study's substantive contributions are twofold. First, the pattern of drug use displayed the well-known sequence of drug involvement (licit drugs to cannabis to other illicit drugs), but with an added distinction between two kinds of illicit drugs ("middle-stage" drugs: uppers, hallucinogens, inhaled drugs; and "final-stage" drugs: heroin, ketamine, GHB/GBL, research chemicals, crystal meth, and spice). Second, subgroup membership was stable over time, as the most likely transition was remaining in the same latent class.
The CD8 beta polypeptide is required for the recognition of an altered peptide ligand as an agonist.
Abstract:
T cell activation is triggered by the specific recognition of cognate peptides presented by MHC molecules. Altered peptide ligands are analogs of cognate peptides which have a high affinity for MHC molecules. Some of them induce complete T cell responses, i.e. they act as agonists, whereas others behave as partial agonists or even as antagonists. Here, we analyzed both early (intracellular Ca2+ mobilization) and late (interleukin-2 production) signal transduction events induced by a cognate peptide or a corresponding altered peptide ligand, using T cell hybridomas that did or did not express the CD8 alpha and beta chains. With a video imaging system, we showed that the intracellular Ca2+ response to an altered peptide ligand induces the appearance of a characteristic sustained intracellular Ca2+ concentration gradient which can be detected shortly after T cell interaction with antigen-presenting cells. We also provide evidence that the same altered peptide ligand can be seen either as an agonist or a partial agonist, depending on the presence of CD8beta in the CD8 co-receptor dimers expressed at the T cell surface.
Abstract:
Summary: This thesis is devoted to the analysis, modelling and visualisation of spatially referenced environmental data using machine learning algorithms. In a broad sense, machine learning can be considered a subfield of artificial intelligence that is particularly concerned with the development of techniques and algorithms that allow a machine to learn from data. In this thesis, machine learning algorithms are adapted for application to environmental data and spatial prediction. Why machine learning? Because most machine learning algorithms are universal, adaptive, nonlinear, robust and efficient for modelling. They can solve classification, regression and probability density modelling problems in high-dimensional spaces composed of spatially referenced informative variables ("geo-features") in addition to the geographical coordinates. Moreover, they are ideal for implementation as decision-support tools for environmental questions ranging from pattern recognition to modelling and prediction, including automatic mapping. Their efficiency is comparable to that of geostatistical models in the space of geographical coordinates, but they are indispensable for high-dimensional data that include geo-features. The most important and most popular machine learning algorithms are presented theoretically and implemented as software for the environmental sciences.
The main algorithms described are the multilayer perceptron (MLP), the best-known algorithm in artificial intelligence; general regression neural networks (GRNN); probabilistic neural networks (PNN); self-organising maps (SOM); Gaussian mixture models (GMM); radial basis function networks (RBF); and mixture density networks (MDN). This range of algorithms covers varied tasks such as classification, regression and probability density estimation. Exploratory data analysis (EDA) is the first step of any data analysis. In this thesis, the concepts of exploratory spatial data analysis (ESDA) are treated both according to the traditional geostatistical approach, with experimental variography, and according to the principles of machine learning. Experimental variography, which studies the relationships between pairs of points, is a basic tool for the geostatistical analysis of anisotropic spatial correlations and makes it possible to detect the presence of spatial patterns describable by a statistic. The machine learning approach to ESDA is presented through the application of the k-nearest neighbours method, which is very simple and has excellent interpretation and visualisation qualities. An important part of the thesis deals with topical subjects such as the automatic mapping of spatial data. The general regression neural network is proposed to solve this task efficiently.
The performance of the GRNN is demonstrated on the Spatial Interpolation Comparison (SIC) 2004 data, on which the GRNN significantly outperforms all other methods, particularly in emergency situations. The thesis is composed of four chapters: theory, applications, software tools and guided examples. An important part of the work consists of a collection of software: Machine Learning Office. This software collection was developed over the last 15 years and has been used for teaching numerous courses, including international workshops in China, France, Italy, Ireland and Switzerland, as well as in fundamental and applied research projects. The case studies considered cover a broad spectrum of real low- and high-dimensional geo-environmental problems, such as air, soil and water pollution by radioactive products and heavy metals, the classification of soil types and hydrogeological units, uncertainty mapping for decision support, and the assessment of natural hazards (landslides, avalanches). Complementary tools for exploratory data analysis and visualisation were also developed, with care taken to create a user-friendly, easy-to-use interface.
Machine Learning for geospatial data: algorithms, software tools and case studies
Abstract: The thesis is devoted to the analysis, modeling and visualisation of spatial environmental data using machine learning algorithms. In a broad sense machine learning can be considered as a subfield of artificial intelligence. It is mainly concerned with the development of techniques and algorithms that allow computers to learn from data. In this thesis machine learning algorithms are adapted to learn from spatial environmental data and to make spatial predictions. Why machine learning? 
In short, most machine learning algorithms are universal, adaptive, nonlinear, robust and efficient modeling tools. They can find solutions to classification, regression, and probability density modeling problems in high-dimensional geo-feature spaces, composed of geographical space and additional relevant spatially referenced features. They are well suited to be implemented as predictive engines in decision support systems, for the purposes of environmental data mining including pattern recognition, modeling and prediction as well as automatic data mapping. Their efficiency is competitive with that of geostatistical models in low-dimensional geographical spaces, but they are indispensable in high-dimensional geo-feature spaces. The most important and popular machine learning algorithms and models of interest for geo- and environmental sciences are presented in detail, from theoretical description of the concepts to the software implementation. The main algorithms and models considered are the following: multi-layer perceptron (a workhorse of machine learning), general regression neural networks, probabilistic neural networks, self-organising (Kohonen) maps, Gaussian mixture models, radial basis function networks, and mixture density networks. This set of models covers machine learning tasks such as classification, regression, and density estimation. Exploratory data analysis (EDA) is an initial and very important part of data analysis. In this thesis the concepts of exploratory spatial data analysis (ESDA) are considered using both a traditional geostatistical approach, such as experimental variography, and machine learning. Experimental variography is a basic tool for the geostatistical analysis of anisotropic spatial correlations, which helps to detect the presence of spatial patterns, at least those described by two-point statistics. 
A machine learning approach to ESDA is presented by applying the k-nearest neighbors (k-NN) method, which is simple and has very good interpretation and visualization properties. An important part of the thesis deals with a current hot topic, namely the automatic mapping of geospatial data. The general regression neural network (GRNN) is proposed as an efficient model to solve this task. The performance of the GRNN model is demonstrated on Spatial Interpolation Comparison (SIC) 2004 data, where the GRNN model significantly outperformed all other approaches, especially under emergency conditions. The thesis consists of four chapters and has the following structure: theory, applications, software tools, and how-to-do-it examples. An important part of the work is a collection of software tools, Machine Learning Office. The Machine Learning Office tools were developed during the last 15 years and were used both for many teaching courses, including international workshops in China, France, Italy, Ireland and Switzerland, and for carrying out fundamental and applied research projects. The case studies considered cover a wide spectrum of real-life low- and high-dimensional geo- and environmental problems, such as air, soil and water pollution by radionuclides and heavy metals, classification of soil types and hydro-geological units, decision-oriented mapping with uncertainties, and natural hazard (landslides, avalanches) assessment and susceptibility mapping. Complementary tools useful for exploratory data analysis and visualisation were developed as well. The software is user-friendly and easy to use.
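The general regression neural network named in this abstract is commonly formulated as Gaussian-kernel weighted averaging of the training values (Nadaraya-Watson regression). The sketch below follows that standard formulation and is not taken from Machine Learning Office; the function name `grnn_predict`, the sample coordinates, the measured values, and the kernel width `sigma` are all assumptions for illustration.

```python
import math


def grnn_predict(train_points, train_values, query, sigma):
    """Predict a value at `query` as a Gaussian-weighted average.

    train_points: list of coordinate tuples (for example, (x, y) locations).
    train_values: measured value at each training point.
    sigma: kernel width; the single tunable parameter in this sketch.
    """
    if sigma <= 0:
        raise ValueError("sigma must be positive")

    weights = []
    for point in train_points:
        squared_distance = sum((p - q) ** 2 for p, q in zip(point, query))
        weights.append(math.exp(-squared_distance / (2.0 * sigma ** 2)))

    total_weight = sum(weights)
    if total_weight == 0.0:
        # Every kernel weight underflowed: the query is too far from the
        # data for this sigma.
        raise ValueError("sigma too small for this query location")

    weighted_sum = sum(w * v for w, v in zip(weights, train_values))
    return weighted_sum / total_weight


# Hypothetical measurements at two spatial locations.
locations = [(0.0, 0.0), (2.0, 0.0)]
measurements = [1.0, 3.0]

# Midway between the two points, both get equal weight.
midpoint_estimate = grnn_predict(locations, measurements, (1.0, 0.0), sigma=1.0)

# With a narrow kernel, the estimate at a training point is close to
# that point's own measurement.
local_estimate = grnn_predict(locations, measurements, (0.0, 0.0), sigma=0.1)
```

The kernel width controls the smoothness of the resulting map: a large `sigma` averages over many neighbours, while a small one reproduces local measurements closely.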