150 results for Recognition algorithms

at Université de Lausanne, Switzerland


Relevance:

30.00%

Publisher:

Abstract:

This paper presents general problems and approaches for spatial data analysis using machine learning algorithms. Machine learning is a very powerful approach to adaptive data analysis, modelling and visualisation. The key feature of machine learning algorithms is that they learn from empirical data and can be used in cases where the modelled environmental phenomena are hidden, nonlinear, noisy and highly variable in space and in time. Most machine learning algorithms are universal and adaptive modelling tools developed to solve basic problems of learning from data: classification/pattern recognition, regression/mapping and probability density modelling. In the present report some of the widely used machine learning algorithms, namely artificial neural networks (ANN) of different architectures and Support Vector Machines (SVM), are adapted to the problems of the analysis and modelling of geo-spatial data. Machine learning algorithms have an important advantage over traditional models of spatial statistics when problems are considered in high-dimensional geo-feature spaces, where the dimension of the space exceeds 5. Such features are usually generated, for example, from digital elevation models, remote sensing images, etc. An important extension of the models concerns the consideration of real-space constraints such as geomorphology, networks and other natural structures. Recent developments in semi-supervised learning can improve the modelling of environmental phenomena by taking geo-manifolds into account. An important part of the study deals with the analysis of relevant variables and model inputs. This problem is approached using different nonlinear feature selection/feature extraction tools. 
To demonstrate the application of machine learning algorithms several interesting case studies are considered: digital soil mapping using SVM, automatic mapping of soil and water system pollution using ANN; natural hazards risk analysis (avalanches, landslides), assessments of renewable resources (wind fields) with SVM and ANN models, etc. The dimensionality of spaces considered varies from 2 to more than 30. Figures 1, 2, 3 demonstrate some results of the studies and their outputs. Finally, the results of environmental mapping are discussed and compared with traditional models of geostatistics.

Relevance:

30.00%

Publisher:

Abstract:

In the first part of this research, three stages were stated for a program to increase the information extracted from ink evidence and maximise its usefulness to the criminal and civil justice system. These stages are (a) to develop a standard methodology for analysing ink samples by high-performance thin layer chromatography (HPTLC) in a reproducible way, when ink samples are analysed at different times and locations and by different examiners; (b) to compare ink samples automatically and objectively; and (c) to define and evaluate a theoretical framework for the use of ink evidence in a forensic context. This report focuses on the second of the three stages. Using the calibration and acquisition process described in the previous report, mathematical algorithms are proposed to compare ink samples automatically and objectively. The performance of these algorithms is systematically studied for various chemical and forensic conditions using standard performance tests commonly used in biometric studies. The results show that different algorithms are best suited for different tasks. Finally, this report demonstrates how modern analytical and computer technology can be used in the field of ink examination and how tools developed and successfully applied in other fields of forensic science can help maximise its impact within the field of questioned documents.

Relevance:

30.00%

Publisher:

Abstract:

Dendritic cell (DC) populations consist of multiple subsets that are essential orchestrators of the immune system. Technological limitations have so far prevented systems-wide accurate proteome comparison of rare cell populations in vivo. Here, we used high-resolution mass spectrometry-based proteomics, combined with label-free quantitation algorithms, to determine the proteome of mouse splenic conventional and plasmacytoid DC subsets to a depth of 5,780 and 6,664 proteins, respectively. We found mutually exclusive expression of pattern recognition pathways not previously known to be different among conventional DC subsets. Our experiments assigned key viral recognition functions to be exclusively expressed in CD4(+) and double-negative DCs. The CD8alpha(+) DCs largely lack the receptors required to sense certain viruses in the cytoplasm. By avoiding activation via cytoplasmic receptors, including retinoic acid-inducible gene I, CD8alpha(+) DCs likely gain a window of opportunity to process and present viral antigens before activation-induced shutdown of antigen presentation pathways occurs.

Relevance:

30.00%

Publisher:

Abstract:

Recognition systems play a key role in a range of biological processes, including mate choice, immune defence and altruistic behaviour. Social insects provide an excellent model for studying recognition systems because workers need to discriminate between nestmates and non-nestmates, enabling them to direct altruistic behaviour towards closer kin and to repel potential invaders. However, the level of aggression directed towards conspecific intruders can vary enormously, even among workers within the same colony. This is usually attributed to differences in the aggression thresholds of individuals or to workers having different roles within the colony. Recent evidence from the weaver ant Oecophylla smaragdina suggests that this does not tell the whole story. Here I propose a new model for nestmate recognition based on a vector template derived from both the individual's innate odour and the shared colony odour. This model accounts for the recent findings concerning weaver ants, and also provides an alternative explanation for why the level of aggression expressed by a colony decreases as the diversity within the colony increases, even when odour is well-mixed. The model makes additional predictions that are easily tested, and represents a significant advance in our conceptualisation of recognition systems.

Relevance:

30.00%

Publisher:

Abstract:

Introduction: Difficult tracheal intubation remains a constant and significant source of morbidity and mortality in anaesthetic practice. Insufficient airway assessment in the preoperative period continues to be a major cause of unanticipated difficult intubation. Although many risk factors have already been identified, preoperative airway evaluation is not always regarded as a standard procedure and the respective weight of each risk factor remains unclear. Moreover, the available predictive scores have low sensitivity, are only moderately specific and are often operator-dependent. In order to improve the preoperative detection of patients at risk of difficult intubation, we developed a system for automated and objective evaluation of morphological criteria of the face and neck using video recordings and advanced techniques borrowed from face recognition. Method and results: Frontal video sequences were recorded in 5 healthy volunteers. During the video recording, subjects were asked to perform maximal flexion-extension of the neck and to open the mouth wide with the tongue pulled out. A robust, real-time face tracking system was then applied, which automatically identified and mapped a grid of 55 control points on the face and tracked them during head motion. These points located important features of the face, such as the eyebrows, the nose, the contours of the eyes and mouth, and the external contours, including the chin. Moreover, based on this face tracking, the orientation of the head could also be estimated at each frame of the video sequence. Thus, we could infer for each frame the pitch angle of the head pose (related to the vertical rotation of the head) and obtain the degree of head extension. Morphological criteria used in the most frequently cited predictive scores were also extracted, such as mouth opening, degree of visibility of the uvula and thyromental distance. Discussion and conclusion: Preliminary results suggest that the technique is highly feasible. 
The next step will be to apply the same automated and objective evaluation to patients undergoing tracheal intubation. The difficulties related to intubation will then be correlated with the biometric characteristics of the patients. The aim is to analyse the biometric data with artificial intelligence algorithms in order to build a highly sensitive and specific predictive test.

Relevance:

30.00%

Publisher:

Abstract:

Summary This thesis is devoted to the analysis, modelling and visualisation of spatially referenced environmental data using machine learning algorithms. Machine learning can be considered, in a broad sense, as a subfield of artificial intelligence concerned in particular with the development of techniques and algorithms that allow a machine to learn from data. In this thesis, machine learning algorithms are adapted for application to environmental data and spatial prediction. Why machine learning? Because most machine learning algorithms are universal, adaptive, nonlinear, robust and efficient modelling tools. They can solve classification, regression and probability density modelling problems in high-dimensional spaces composed of spatially referenced informative variables ("geo-features") in addition to the geographical coordinates. Moreover, they are ideally suited to implementation as decision-support tools for environmental questions ranging from pattern recognition to modelling and prediction, including automatic mapping. Their efficiency is comparable to that of geostatistical models in the space of geographical coordinates, but they are indispensable for high-dimensional data that include geo-features. The most important and most popular machine learning algorithms are presented theoretically and implemented as software for the environmental sciences. 
The main algorithms described are the multilayer perceptron (MLP), the best-known algorithm in artificial intelligence; general regression neural networks (GRNN); probabilistic neural networks (PNN); self-organising maps (SOM); Gaussian mixture models (GMM); radial basis function networks (RBF); and mixture density networks (MDN). This range of algorithms covers varied tasks such as classification, regression and probability density estimation. Exploratory data analysis (EDA) is the first step of any data analysis. In this thesis the concepts of exploratory spatial data analysis (ESDA) are treated both with the traditional geostatistical approach, using experimental variography, and according to the principles of machine learning. Experimental variography, which studies the relationships between pairs of points, is a basic tool for the geostatistical analysis of anisotropic spatial correlations and makes it possible to detect the presence of spatial patterns that can be described by a statistic. The machine learning approach to ESDA is presented through the application of the k-nearest neighbours method, which is very simple and has excellent interpretation and visualisation properties. An important part of the thesis deals with topical subjects such as the automatic mapping of spatial data. The general regression neural network is proposed to solve this task efficiently. 
The performance of the GRNN is demonstrated on the Spatial Interpolation Comparison (SIC) 2004 data, for which the GRNN significantly outperforms all other methods, particularly in emergency situations. The thesis is composed of four chapters: theory, applications, software tools and guided examples. An important part of the work consists of a collection of software: Machine Learning Office. This software collection has been developed over the last 15 years and has been used for teaching numerous courses, including international workshops in China, France, Italy, Ireland and Switzerland, as well as in fundamental and applied research projects. The case studies considered cover a broad spectrum of real low- and high-dimensional geo-environmental problems, such as the pollution of air, soil and water by radioactive products and heavy metals, the classification of soil types and hydrogeological units, uncertainty mapping for decision support and the estimation of natural hazards (landslides, avalanches). Complementary tools for exploratory data analysis and visualisation have also been developed, with care taken to create a user-friendly, easy-to-use interface.
Machine Learning for geospatial data: algorithms, software tools and case studies
Abstract The thesis is devoted to the analysis, modeling and visualisation of spatial environmental data using machine learning algorithms. In a broad sense machine learning can be considered as a subfield of artificial intelligence. It is mainly concerned with the development of techniques and algorithms that allow computers to learn from data. In this thesis machine learning algorithms are adapted to learn from spatial environmental data and to make spatial predictions. Why machine learning? 
In short, most machine learning algorithms are universal, adaptive, nonlinear, robust and efficient modeling tools. They can find solutions to classification, regression and probability density modeling problems in high-dimensional geo-feature spaces, composed of geographical space and additional relevant spatially referenced features. They are well suited to implementation as predictive engines in decision support systems, for the purposes of environmental data mining including pattern recognition, modeling and prediction as well as automatic data mapping. Their efficiency is competitive with that of geostatistical models in low-dimensional geographical spaces, but they are indispensable in high-dimensional geo-feature spaces. The most important and popular machine learning algorithms and models of interest for geo- and environmental sciences are presented in detail, from the theoretical description of the concepts to the software implementation. The main algorithms and models considered are the following: multi-layer perceptron (a workhorse of machine learning), general regression neural networks, probabilistic neural networks, self-organising (Kohonen) maps, Gaussian mixture models, radial basis function networks and mixture density networks. This set of models covers machine learning tasks such as classification, regression and density estimation. Exploratory data analysis (EDA) is an initial and very important part of data analysis. In this thesis the concepts of exploratory spatial data analysis (ESDA) are considered using both the traditional geostatistical approach, such as experimental variography, and machine learning. Experimental variography is a basic tool for the geostatistical analysis of anisotropic spatial correlations, which helps to reveal the presence of spatial patterns, at least those described by two-point statistics. 
A machine learning approach to ESDA is presented by applying the k-nearest neighbors (k-NN) method, which is simple and has very good interpretation and visualization properties. An important part of the thesis deals with a currently active topic, namely the automatic mapping of geospatial data. The general regression neural network (GRNN) is proposed as an efficient model to solve this task. The performance of the GRNN model is demonstrated on the Spatial Interpolation Comparison (SIC) 2004 data, where the GRNN model significantly outperformed all other approaches, especially under emergency conditions. The thesis consists of four chapters and has the following structure: theory, applications, software tools and how-to-do-it examples. An important part of the work is a collection of software tools, Machine Learning Office. The Machine Learning Office tools were developed over the last 15 years and have been used both for many teaching courses, including international workshops in China, France, Italy, Ireland and Switzerland, and for fundamental and applied research projects. The case studies considered cover a wide spectrum of real-life low- and high-dimensional geo- and environmental problems, such as air, soil and water pollution by radionuclides and heavy metals, classification of soil types and hydro-geological units, decision-oriented mapping with uncertainties, and natural hazard (landslides, avalanches) assessment and susceptibility mapping. Complementary tools useful for exploratory data analysis and visualisation were developed as well. The software is user-friendly and easy to use.
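
As a rough illustration of the GRNN mentioned in this abstract: in its textbook form a general regression neural network is a Nadaraya-Watson kernel regression with a Gaussian kernel. The sketch below is not the thesis's Machine Learning Office implementation; the function name `grnn_predict`, the isotropic bandwidth `sigma` and the toy data are all assumptions made for illustration.

```python
import math

def grnn_predict(train_xy, train_z, query, sigma):
    """Kernel-weighted average of training values (Nadaraya-Watson form).

    train_xy: list of coordinate tuples; train_z: measured values;
    query: coordinate tuple to predict at; sigma: kernel bandwidth (assumed isotropic).
    """
    weights = []
    for point in train_xy:
        sq_dist = sum((a - b) ** 2 for a, b in zip(point, query))
        weights.append(math.exp(-sq_dist / (2.0 * sigma ** 2)))
    total = sum(weights)
    if total == 0.0:
        raise ValueError("all kernel weights underflowed; increase sigma")
    return sum(w * z for w, z in zip(weights, train_z)) / total

# Toy data, invented for this sketch: four measurements on a unit square.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
values = [1.0, 2.0, 3.0, 4.0]

centre = grnn_predict(points, values, (0.5, 0.5), sigma=0.5)   # equidistant: plain mean
corner = grnn_predict(points, values, (0.0, 0.0), sigma=0.1)   # dominated by nearest point
```

The only free parameter in this form is the bandwidth, which is one reason the model lends itself to automatic mapping: a single value can be tuned by cross-validation without manual intervention.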

Relevance:

30.00%

Publisher:

Abstract:

BACKGROUND: Lung clearance index (LCI), a marker of ventilation inhomogeneity, is elevated early in children with cystic fibrosis (CF). However, in infants with CF, LCI values are found to be normal, although structural lung abnormalities are often detectable. We hypothesized that this discrepancy is due to inadequate algorithms of the available software package. AIM: Our aim was to challenge the validity of these software algorithms. METHODS: We compared multiple breath washout (MBW) results of current software algorithms (automatic modus) to refined algorithms (manual modus) in 17 asymptomatic infants with CF, and 24 matched healthy term-born infants. The main difference between these two analysis methods lies in the calculation of the molar mass differences that the system uses to define the completion of the measurement. RESULTS: In infants with CF the refined manual modus revealed clearly elevated LCI above 9 in 8 out of 35 measurements (23%), all showing LCI values below 8.3 using the automatic modus (paired t-test comparing the means, P < 0.001). Healthy infants showed normal LCI values using both analysis methods (n = 47, paired t-test, P = 0.79). The most relevant reason for false normal LCI values in infants with CF using the automatic modus was the incorrect recognition of the end-of-test too early during the washout. CONCLUSION: We recommend the use of the manual modus for the analysis of MBW outcomes in infants in order to obtain more accurate results. This will allow appropriate use of infant lung function results for clinical and scientific purposes. Pediatr Pulmonol. 2015; 50:970-977. © 2015 Wiley Periodicals, Inc.

Relevance:

20.00%

Publisher:

Abstract:

Inbreeding avoidance is predicted to induce sex biases in dispersal. But which sex should disperse? In polygynous species, females pay higher costs of inbreeding and thus might be expected to disperse more, but empirical evidence consistently reveals male biases. Here, we show that theoretical expectations change drastically if females are allowed to avoid inbreeding via kin recognition. At high inbreeding loads, females should prefer immigrants over residents, thereby boosting male dispersal. At lower inbreeding loads, by contrast, inclusive fitness benefits should induce females to prefer relatives, thereby promoting male philopatry. This result points to disruptive effects of sexual selection. The inbreeding load that females are ready to accept is surprisingly high. In the absence of search costs, females should prefer related partners as long as delta < r/(1+r), where r is relatedness and delta is the fecundity loss relative to an outbred mating. This amounts to fitness losses of up to one-fifth for a half-sib mating and one-third for a full-sib mating, which lie in the upper range of inbreeding depression values currently reported in natural populations. The observation of active inbreeding avoidance in a polygynous species thus suggests that inbreeding depression exceeds this threshold in the species under scrutiny or that inbred matings at least partly forfeit other mating opportunities for males. Our model also shows that female choosiness should decline rapidly with search costs, stemming from, for example, reproductive delays. Species under strong time constraints on reproduction should thus be tolerant of inbreeding.
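
The one-fifth and one-third figures follow directly from the threshold r/(1+r) and can be checked with exact arithmetic. In the sketch below the function name is hypothetical, and the relatedness values of 1/4 for half-sibs and 1/2 for full-sibs are the standard coefficients, assumed here because the abstract does not state them.

```python
from fractions import Fraction

def max_tolerated_inbreeding_load(r):
    """Upper bound on delta below which a related partner is preferred: r / (1 + r)."""
    return r / (1 + r)

# Assumed relatedness coefficients: half-sibs 1/4, full-sibs 1/2.
half_sib = max_tolerated_inbreeding_load(Fraction(1, 4))
full_sib = max_tolerated_inbreeding_load(Fraction(1, 2))
print(half_sib, full_sib)  # 1/5 1/3
```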

Relevance:

20.00%

Publisher:

Abstract:

Among the largest resources for biological sequence data is the large amount of expressed sequence tags (ESTs) available in public and proprietary databases. ESTs provide information on transcripts but for technical reasons they often contain sequencing errors. Therefore, when analyzing EST sequences computationally, such errors must be taken into account. Earlier attempts to model error prone coding regions have shown good performance in detecting and predicting these while correcting sequencing errors using codon usage frequencies. In the research presented here, we improve the detection of translation start and stop sites by integrating a more complex mRNA model with codon usage bias based error correction into one hidden Markov model (HMM), thus generalizing this error correction approach to more complex HMMs. We show that our method maintains the performance in detecting coding sequences.

Relevance:

20.00%

Publisher:

Relevance:

20.00%

Publisher:

Abstract:

The algorithmic approach to data modelling has developed rapidly in recent years; in particular, methods based on data mining and machine learning have been used in a growing number of applications. These methods follow a data-driven methodology, aiming at providing the best possible generalization and predictive abilities instead of concentrating on the properties of the data model. One of the most successful groups of such methods is known as Support Vector algorithms. Following the fruitful developments in applying Support Vector algorithms to spatial data, this paper introduces a new extension of the traditional support vector regression (SVR) algorithm. This extension allows for the simultaneous modelling of environmental data at several spatial scales. The joint influence of environmental processes presenting different patterns at different scales is learned automatically from the data, providing the optimum mixture of short- and large-scale models. The method is adaptive to the spatial scale of the data. With this advantage, it can provide an efficient means of modelling local anomalies that may typically arise at an early phase of an environmental emergency. However, the proposed approach still requires some prior knowledge of the possible existence of such short-scale patterns. This is a possible limitation of the method for its implementation in early warning systems. The purpose of this paper is to present the multi-scale SVR model and to illustrate its use with an application to the mapping of Cs137 activity given the measurements taken in the region of Briansk following the Chernobyl accident.
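
One common way to combine spatial scales in a kernel method is a convex combination of two Gaussian kernels with different bandwidths. The sketch below shows only that building block and is not the paper's algorithm: the abstract says the mixture is learned from the data, whereas here the mixing weight is fixed by hand, and the function name, parameter names and values are all assumptions.

```python
import math

def multiscale_kernel(x, y, short_scale, large_scale, weight):
    """Weighted sum of a short-range and a long-range Gaussian kernel.

    weight is the share given to the short-scale kernel (0 <= weight <= 1);
    it is a hand-set assumption here, not a learned quantity.
    """
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    k_short = math.exp(-sq_dist / (2.0 * short_scale ** 2))
    k_large = math.exp(-sq_dist / (2.0 * large_scale ** 2))
    return weight * k_short + (1.0 - weight) * k_large

# Two locations 5 units apart: the short-scale term has almost vanished,
# so the similarity comes almost entirely from the large-scale term.
same = multiscale_kernel((0.0, 0.0), (0.0, 0.0), short_scale=1.0, large_scale=10.0, weight=0.3)
far = multiscale_kernel((0.0, 0.0), (5.0, 0.0), short_scale=1.0, large_scale=10.0, weight=0.3)
```

Because a non-negative weighted sum of valid kernels is itself a valid kernel, a function of this shape can in principle be passed to any kernel-based regressor that accepts a user-defined kernel.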

Relevance:

20.00%

Publisher:

Abstract:

Brittle cornea syndrome (BCS) is an autosomal recessive disorder characterised by extreme corneal thinning and fragility. Corneal rupture can therefore occur either spontaneously or following minimal trauma in affected patients. Two genes, ZNF469 and PRDM5, have now been identified, in which causative pathogenic mutations collectively account for the condition in nearly all patients with BCS ascertained to date. Therefore, effective molecular diagnosis is now available for affected patients, and those at risk of being heterozygous carriers for BCS. We have previously identified mutations in ZNF469 in 14 families (in addition to 6 reported by others in the literature), and in PRDM5 in 8 families (with 1 further family now published by others). Clinical features include extreme corneal thinning with rupture, high myopia, blue sclerae, deafness of mixed aetiology with hypercompliant tympanic membranes, and variable skeletal manifestations. Corneal rupture may be the presenting feature of BCS, and it is possible that this may be incorrectly attributed to non-accidental injury. Mainstays of management include the prevention of ocular rupture by provision of protective polycarbonate spectacles, careful monitoring of visual and auditory function, and assessment for skeletal complications such as developmental dysplasia of the hip. Effective management depends upon appropriate identification of affected individuals, which may be challenging given the phenotypic overlap of BCS with other connective tissue disorders.

Relevance:

20.00%

Publisher:

Abstract:

Interleukin 1 beta (IL-1 beta) is a potent proinflammatory factor during viral infection. Its production is tightly controlled by transcription of Il1b dependent on the transcription factor NF-kappaB and subsequent processing of pro-IL-1 beta by an inflammasome. However, the sensors and mechanisms that facilitate RNA virus-induced production of IL-1 beta are not well defined. Here we report a dual role for the RNA helicase RIG-I in RNA virus-induced proinflammatory responses. Whereas RIG-I-mediated activation of NF-kappaB required the signaling adaptor MAVS and a complex of the adaptors CARD9 and Bcl-10, RIG-I also bound to the adaptor ASC to trigger caspase-1-dependent inflammasome activation by a mechanism independent of MAVS, CARD9 and the Nod-like receptor protein NLRP3. Our results identify the CARD9-Bcl-10 module as an essential component of the RIG-I-dependent proinflammatory response and establish RIG-I as a sensor able to activate the inflammasome in response to certain RNA viruses.

Relevance:

20.00%

Publisher:

Abstract:

This work compares the structural/dynamics features of the wild-type alpha 1b-adrenergic receptor (AR) with those of the D142A active mutant and the agonist-bound state. The two active receptor forms were compared in their isolated states as well as in their ability to form homodimers and to recognize the G alpha q beta 1 gamma 2 heterotrimer. The analysis of the isolated structures revealed that, although the mutation- and agonist-induced active states of the alpha 1b-AR are different, they share several structural peculiarities including (a) the release of some constraining interactions found in the wild-type receptor and (b) the opening of a cytosolic crevice formed by the second and third intracellular loops and the cytosolic extensions of helices 5 and 6. Accordingly, their tendency to form homodimers also shows commonalities and differences. In fact, in both active receptor forms, helix 6 plays a crucial role in mediating homodimerization. However, the homodimeric models result from different interhelical assemblies. Along the same lines, in both active receptor forms, the open cytosolic crevice recognizes similar domains on the G protein. However, the docking solutions are differently populated and the receptor-G protein preorientation models suggest that the final complexes should be characterized by different interaction patterns.

Relevance:

20.00%

Publisher:

Abstract:

Defining an efficient training set is one of the most delicate phases for the success of remote sensing image classification routines. The complexity of the problem, the limited temporal and financial resources, as well as the high intraclass variance can make an algorithm fail if it is trained with a suboptimal dataset. Active learning aims at building efficient training sets by iteratively improving the model performance through sampling. A user-defined heuristic ranks the unlabeled pixels according to a function of the uncertainty of their class membership, and the user is then asked to provide labels for the most uncertain pixels. This paper reviews and tests the main families of active learning algorithms: committee, large margin, and posterior probability-based. For each of them, the most recent advances in the remote sensing community are discussed and some heuristics are detailed and tested. Several challenging remote sensing scenarios are considered, including very high spatial resolution and hyperspectral image classification. Finally, guidelines for choosing a suitable architecture are provided for new and/or inexperienced users.
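
The ranking step described in this abstract can be sketched for the posterior probability-based family. The example below is a minimal illustration, not one of the specific heuristics tested in the paper: it assumes uncertainty is measured as the gap between the two largest class posteriors, and the function name and the posterior values are invented.

```python
def rank_by_uncertainty(posteriors):
    """Return pixel indices ordered from most to least uncertain.

    posteriors: one list of class probabilities per unlabeled pixel.
    A small gap between the two most probable classes means high uncertainty.
    """
    def margin(probs):
        top_two = sorted(probs, reverse=True)[:2]
        return top_two[0] - top_two[1]

    return sorted(range(len(posteriors)), key=lambda i: margin(posteriors[i]))

# Invented posteriors for three unlabeled pixels and three classes.
posteriors = [
    [0.90, 0.05, 0.05],  # confident
    [0.40, 0.35, 0.25],  # very uncertain
    [0.55, 0.30, 0.15],  # moderately uncertain
]
query_order = rank_by_uncertainty(posteriors)  # pixels to show the user first
```

In an active learning loop, the first few indices in `query_order` would be labeled by the user, added to the training set, and the model retrained before ranking again.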