877 resultados para Naïve Bayes classifier


Relevância:

100.00% 100.00%

Publicador:

Resumo:

Mémoire numérisé par la Direction des bibliothèques de l'Université de Montréal.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Mémoire numérisé par la Direction des bibliothèques de l'Université de Montréal.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Hebb proposed that synapses between neurons that fire synchronously are strengthened, forming cell assemblies and phase sequences. The former, on a shorter scale, are ensembles of synchronized cells that function transiently as a closed processing system; the latter, on a larger scale, correspond to the sequential activation of cell assemblies able to represent percepts and behaviors. Nowadays, the recording of large neuronal populations allows for the detection of multiple cell assemblies. Within Hebb's theory, the next logical step is the analysis of phase sequences. Here we detected phase sequences as consecutive assembly activation patterns, and then analyzed their graph attributes in relation to behavior. We investigated action potentials recorded from the adult rat hippocampus and neocortex before, during and after novel object exploration (experimental periods). Within assembly graphs, each assembly corresponded to a node, and each edge corresponded to the temporal sequence of consecutive node activations. The sum of all assembly activations was proportional to firing rates, but the activity of individual assemblies was not. Assembly repertoire was stable across experimental periods, suggesting that novel experience does not create new assemblies in the adult rat. Assembly graph attributes, on the other hand, varied significantly across behavioral states and experimental periods, and were separable enough to correctly classify experimental periods (Naïve Bayes classifier; maximum AUROCs ranging from 0.55 to 0.99) and behavioral states (waking, slow wave sleep, and rapid eye movement sleep; maximum AUROCs ranging from 0.64 to 0.98). Our findings agree with Hebb's view that assemblies correspond to primitive building blocks of representation, nearly unchanged in the adult, while phase sequences are labile across behavioral states and change after novel experience. The results are compatible with a role for phase sequences in behavior and cognition.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Hebb proposed that synapses between neurons that fire synchronously are strengthened, forming cell assemblies and phase sequences. The former, on a shorter scale, are ensembles of synchronized cells that function transiently as a closed processing system; the latter, on a larger scale, correspond to the sequential activation of cell assemblies able to represent percepts and behaviors. Nowadays, the recording of large neuronal populations allows for the detection of multiple cell assemblies. Within Hebb's theory, the next logical step is the analysis of phase sequences. Here we detected phase sequences as consecutive assembly activation patterns, and then analyzed their graph attributes in relation to behavior. We investigated action potentials recorded from the adult rat hippocampus and neocortex before, during and after novel object exploration (experimental periods). Within assembly graphs, each assembly corresponded to a node, and each edge corresponded to the temporal sequence of consecutive node activations. The sum of all assembly activations was proportional to firing rates, but the activity of individual assemblies was not. Assembly repertoire was stable across experimental periods, suggesting that novel experience does not create new assemblies in the adult rat. Assembly graph attributes, on the other hand, varied significantly across behavioral states and experimental periods, and were separable enough to correctly classify experimental periods (Naïve Bayes classifier; maximum AUROCs ranging from 0.55 to 0.99) and behavioral states (waking, slow wave sleep, and rapid eye movement sleep; maximum AUROCs ranging from 0.64 to 0.98). Our findings agree with Hebb's view that assemblies correspond to primitive building blocks of representation, nearly unchanged in the adult, while phase sequences are labile across behavioral states and change after novel experience. The results are compatible with a role for phase sequences in behavior and cognition.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Our ability to infer the protein quaternary structure automatically from atom and lattice information is inadequate, especially for weak complexes, and heteromeric quaternary structures. Several approaches exist, but they have limited performance. Here, we present a new scheme to infer protein quaternary structure from lattice and protein information, with all-around coverage for strong, weak and very weak affinity homomeric and heteromeric complexes. The scheme combines naive Bayes classifier and point group symmetry under Boolean framework to detect quaternary structures in crystal lattice. It consistently produces >= 90% coverage across diverse benchmarking data sets, including a notably superior 95% coverage for recognition heteromeric complexes, compared with 53% on the same data set by current state-of-the-art method. The detailed study of a limited number of prediction-failed cases offers interesting insights into the intriguing nature of protein contacts in lattice. The findings have implications for accurate inference of quaternary states of proteins, especially weak affinity complexes.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

In this paper a custom classification algorithm based on linear discriminant analysis and probability-based weights is implemented and applied to the hippocampus measurements of structural magnetic resonance images from healthy subjects and Alzheimer’s Disease sufferers; and then attempts to diagnose them as accurately as possible. The classifier works by classifying each measurement of a hippocampal volume as healthy controlsized or Alzheimer’s Disease-sized, these new features are then weighted and used to classify the subject as a healthy control or suffering from Alzheimer’s Disease. The preliminary results obtained reach an accuracy of 85.8% and this is a similar accuracy to state-of-the-art methods such as a Naive Bayes classifier and a Support Vector Machine. An advantage of the method proposed in this paper over the aforementioned state of the art classifiers is the descriptive ability of the classifications it produces. The descriptive model can be of great help to aid a doctor in the diagnosis of Alzheimer’s Disease, or even further the understand of how Alzheimer’s Disease affects the hippocampus.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Equipment maintenance is the major cost factor in industrial plants, it is very important the development of fault predict techniques. Three-phase induction motors are key electrical equipments used in industrial applications mainly because presents low cost and large robustness, however, it isn t protected from other fault types such as shorted winding and broken bars. Several acquisition ways, processing and signal analysis are applied to improve its diagnosis. More efficient techniques use current sensors and its signature analysis. In this dissertation, starting of these sensors, it is to make signal analysis through Park s vector that provides a good visualization capability. Faults data acquisition is an arduous task; in this way, it is developed a methodology for data base construction. Park s transformer is applied into stationary reference for machine modeling of the machine s differential equations solution. Faults detection needs a detailed analysis of variables and its influences that becomes the diagnosis more complex. The tasks of pattern recognition allow that systems are automatically generated, based in patterns and data concepts, in the majority cases undetectable for specialists, helping decision tasks. Classifiers algorithms with diverse learning paradigms: k-Neighborhood, Neural Networks, Decision Trees and Naïves Bayes are used to patterns recognition of machines faults. Multi-classifier systems are used to improve classification errors. It inspected the algorithms homogeneous: Bagging and Boosting and heterogeneous: Vote, Stacking and Stacking C. Results present the effectiveness of constructed model to faults modeling, such as the possibility of using multi-classifiers algorithm on faults classification

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Nowadays, classifying proteins in structural classes, which concerns the inference of patterns in their 3D conformation, is one of the most important open problems in Molecular Biology. The main reason for this is that the function of a protein is intrinsically related to its spatial conformation. However, such conformations are very difficult to be obtained experimentally in laboratory. Thus, this problem has drawn the attention of many researchers in Bioinformatics. Considering the great difference between the number of protein sequences already known and the number of three-dimensional structures determined experimentally, the demand of automated techniques for structural classification of proteins is very high. In this context, computational tools, especially Machine Learning (ML) techniques, have become essential to deal with this problem. In this work, ML techniques are used in the recognition of protein structural classes: Decision Trees, k-Nearest Neighbor, Naive Bayes, Support Vector Machine and Neural Networks. These methods have been chosen because they represent different paradigms of learning and have been widely used in the Bioinfornmatics literature. Aiming to obtain an improvment in the performance of these techniques (individual classifiers), homogeneous (Bagging and Boosting) and heterogeneous (Voting, Stacking and StackingC) multiclassification systems are used. Moreover, since the protein database used in this work presents the problem of imbalanced classes, artificial techniques for class balance (Undersampling Random, Tomek Links, CNN, NCL and OSS) are used to minimize such a problem. In order to evaluate the ML methods, a cross-validation procedure is applied, where the accuracy of the classifiers is measured using the mean of classification error rate, on independent test sets. These means are compared, two by two, by the hypothesis test aiming to evaluate if there is, statistically, a significant difference between them. With respect to the results obtained with the individual classifiers, Support Vector Machine presented the best accuracy. In terms of the multi-classification systems (homogeneous and heterogeneous), they showed, in general, a superior or similar performance when compared to the one achieved by the individual classifiers used - especially Boosting with Decision Tree and the StackingC with Linear Regression as meta classifier. The Voting method, despite of its simplicity, has shown to be adequate for solving the problem presented in this work. The techniques for class balance, on the other hand, have not produced a significant improvement in the global classification error. Nevertheless, the use of such techniques did improve the classification error for the minority class. In this context, the NCL technique has shown to be more appropriated

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Los sensores inerciales (acelerómetros y giróscopos) se han ido introduciendo poco a poco en dispositivos que usamos en nuestra vida diaria gracias a su minituarización. Hoy en día todos los smartphones contienen como mínimo un acelerómetro y un magnetómetro, siendo complementados en losmás modernos por giróscopos y barómetros. Esto, unido a la proliferación de los smartphones ha hecho viable el diseño de sistemas basados en las medidas de sensores que el usuario lleva colocados en alguna parte del cuerpo (que en un futuro estarán contenidos en tejidos inteligentes) o los integrados en su móvil. El papel de estos sensores se ha convertido en fundamental para el desarrollo de aplicaciones contextuales y de inteligencia ambiental. Algunos ejemplos son el control de los ejercicios de rehabilitación o la oferta de información referente al sitio turístico que se está visitando. El trabajo de esta tesis contribuye a explorar las posibilidades que ofrecen los sensores inerciales para el apoyo a la detección de actividad y la mejora de la precisión de servicios de localización para peatones. En lo referente al reconocimiento de la actividad que desarrolla un usuario, se ha explorado el uso de los sensores integrados en los dispositivos móviles de última generación (luz y proximidad, acelerómetro, giróscopo y magnetómetro). Las actividades objetivo son conocidas como ‘atómicas’ (andar a distintas velocidades, estar de pie, correr, estar sentado), esto es, actividades que constituyen unidades de actividades más complejas como pueden ser lavar los platos o ir al trabajo. De este modo, se usan algoritmos de clasificación sencillos que puedan ser integrados en un móvil como el Naïve Bayes, Tablas y Árboles de Decisión. Además, se pretende igualmente detectar la posición en la que el usuario lleva el móvil, no sólo con el objetivo de utilizar esa información para elegir un clasificador entrenado sólo con datos recogidos en la posición correspondiente (estrategia que mejora los resultados de estimación de la actividad), sino también para la generación de un evento que puede producir la ejecución de una acción. Finalmente, el trabajo incluye un análisis de las prestaciones de la clasificación variando el tipo de parámetros y el número de sensores usados y teniendo en cuenta no sólo la precisión de la clasificación sino también la carga computacional. Por otra parte, se ha propuesto un algoritmo basado en la cuenta de pasos utilizando informaiii ción proveniente de un acelerómetro colocado en el pie del usuario. El objetivo final es detectar la actividad que el usuario está haciendo junto con la estimación aproximada de la distancia recorrida. El algoritmo de cuenta pasos se basa en la detección de máximos y mínimos usando ventanas temporales y umbrales sin requerir información específica del usuario. El ámbito de seguimiento de peatones en interiores es interesante por la falta de un estándar de localización en este tipo de entornos. Se ha diseñado un filtro extendido de Kalman centralizado y ligeramente acoplado para fusionar la información medida por un acelerómetro colocado en el pie del usuario con medidas de posición. Se han aplicado también diferentes técnicas de corrección de errores como las de velocidad cero que se basan en la detección de los instantes en los que el pie está apoyado en el suelo. Los resultados han sido obtenidos en entornos interiores usando las posiciones estimadas por un sistema de triangulación basado en la medida de la potencia recibida (RSS) y GPS en exteriores. Finalmente, se han implementado algunas aplicaciones que prueban la utilidad del trabajo desarrollado. En primer lugar se ha considerado una aplicación de monitorización de actividad que proporciona al usuario información sobre el nivel de actividad que realiza durante un período de tiempo. El objetivo final es favorecer el cambio de comportamientos sedentarios, consiguiendo hábitos saludables. Se han desarrollado dos versiones de esta aplicación. En el primer caso se ha integrado el algoritmo de cuenta pasos en una plataforma OSGi móvil adquiriendo los datos de un acelerómetro Bluetooth colocado en el pie. En el segundo caso se ha creado la misma aplicación utilizando las implementaciones de los clasificadores en un dispositivo Android. Por otro lado, se ha planteado el diseño de una aplicación para la creación automática de un diario de viaje a partir de la detección de eventos importantes. Esta aplicación toma como entrada la información procedente de la estimación de actividad y de localización además de información almacenada en bases de datos abiertas (fotos, información sobre sitios) e información sobre sensores reales y virtuales (agenda, cámara, etc.) del móvil. Abstract Inertial sensors (accelerometers and gyroscopes) have been gradually embedded in the devices that people use in their daily lives thanks to their miniaturization. Nowadays all smartphones have at least one embedded magnetometer and accelerometer, containing the most upto- date ones gyroscopes and barometers. This issue, together with the fact that the penetration of smartphones is growing steadily, has made possible the design of systems that rely on the information gathered by wearable sensors (in the future contained in smart textiles) or inertial sensors embedded in a smartphone. The role of these sensors has become key to the development of context-aware and ambient intelligent applications. Some examples are the performance of rehabilitation exercises, the provision of information related to the place that the user is visiting or the interaction with objects by gesture recognition. The work of this thesis contributes to explore to which extent this kind of sensors can be useful to support activity recognition and pedestrian tracking, which have been proven to be essential for these applications. Regarding the recognition of the activity that a user performs, the use of sensors embedded in a smartphone (proximity and light sensors, gyroscopes, magnetometers and accelerometers) has been explored. The activities that are detected belong to the group of the ones known as ‘atomic’ activities (e.g. walking at different paces, running, standing), that is, activities or movements that are part of more complex activities such as doing the dishes or commuting. Simple, wellknown classifiers that can run embedded in a smartphone have been tested, such as Naïve Bayes, Decision Tables and Trees. In addition to this, another aim is to estimate the on-body position in which the user is carrying the mobile phone. The objective is not only to choose a classifier that has been trained with the corresponding data in order to enhance the classification but also to start actions. Finally, the performance of the different classifiers is analysed, taking into consideration different features and number of sensors. The computational and memory load of the classifiers is also measured. On the other hand, an algorithm based on step counting has been proposed. The acceleration information is provided by an accelerometer placed on the foot. The aim is to detect the activity that the user is performing together with the estimation of the distance covered. The step counting strategy is based on detecting minima and its corresponding maxima. Although the counting strategy is not innovative (it includes time windows and amplitude thresholds to prevent under or overestimation) no user-specific information is required. The field of pedestrian tracking is crucial due to the lack of a localization standard for this kind of environments. A loosely-coupled centralized Extended Kalman Filter has been proposed to perform the fusion of inertial and position measurements. Zero velocity updates have been applied whenever the foot is detected to be placed on the ground. The results have been obtained in indoor environments using a triangulation algorithm based on RSS measurements and GPS outdoors. Finally, some applications have been designed to test the usefulness of the work. The first one is called the ‘Activity Monitor’ whose aim is to prevent sedentary behaviours and to modify habits to achieve desired objectives of activity level. Two different versions of the application have been implemented. The first one uses the activity estimation based on the step counting algorithm, which has been integrated in an OSGi mobile framework acquiring the data from a Bluetooth accelerometer placed on the foot of the individual. The second one uses activity classifiers embedded in an Android smartphone. On the other hand, the design of a ‘Travel Logbook’ has been planned. The input of this application is the information provided by the activity and localization modules, external databases (e.g. pictures, points of interest, weather) and mobile embedded and virtual sensors (agenda, camera, etc.). The aim is to detect important events in the journey and gather the information necessary to store it as a journal page.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Objective To develop and evaluate machine learning techniques that identify limb fractures and other abnormalities (e.g. dislocations) from radiology reports. Materials and Methods 99 free-text reports of limb radiology examinations were acquired from an Australian public hospital. Two clinicians were employed to identify fractures and abnormalities from the reports; a third senior clinician resolved disagreements. These assessors found that, of the 99 reports, 48 referred to fractures or abnormalities of limb structures. Automated methods were then used to extract features from these reports that could be useful for their automatic classification. The Naive Bayes classification algorithm and two implementations of the support vector machine algorithm were formally evaluated using cross-fold validation over the 99 reports. Result Results show that the Naive Bayes classifier accurately identifies fractures and other abnormalities from the radiology reports. These results were achieved when extracting stemmed token bigram and negation features, as well as using these features in combination with SNOMED CT concepts related to abnormalities and disorders. The latter feature has not been used in previous works that attempted classifying free-text radiology reports. Discussion Automated classification methods have proven effective at identifying fractures and other abnormalities from radiology reports (F-Measure up to 92.31%). Key to the success of these techniques are features such as stemmed token bigrams, negations, and SNOMED CT concepts associated with morphologic abnormalities and disorders. Conclusion This investigation shows early promising results and future work will further validate and strengthen the proposed approaches.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Description of a patient's injuries is recorded in narrative text form by hospital emergency departments. For statistical reporting, this text data needs to be mapped to pre-defined codes. Existing research in this field uses the Naïve Bayes probabilistic method to build classifiers for mapping. In this paper, we focus on providing guidance on the selection of a classification method. We build a number of classifiers belonging to different classification families such as decision tree, probabilistic, neural networks, and instance-based, ensemble-based and kernel-based linear classifiers. An extensive pre-processing is carried out to ensure the quality of data and, in hence, the quality classification outcome. The records with a null entry in injury description are removed. The misspelling correction process is carried out by finding and replacing the misspelt word with a soundlike word. Meaningful phrases have been identified and kept, instead of removing the part of phrase as a stop word. The abbreviations appearing in many forms of entry are manually identified and only one form of abbreviations is used. Clustering is utilised to discriminate between non-frequent and frequent terms. This process reduced the number of text features dramatically from about 28,000 to 5000. The medical narrative text injury dataset, under consideration, is composed of many short documents. The data can be characterized as high-dimensional and sparse, i.e., few features are irrelevant but features are correlated with one another. Therefore, Matrix factorization techniques such as Singular Value Decomposition (SVD) and Non Negative Matrix Factorization (NNMF) have been used to map the processed feature space to a lower-dimensional feature space. Classifiers with these reduced feature space have been built. In experiments, a set of tests are conducted to reflect which classification method is best for the medical text classification. The Non Negative Matrix Factorization with Support Vector Machine method can achieve 93% precision which is higher than all the tested traditional classifiers. We also found that TF/IDF weighting which works well for long text classification is inferior to binary weighting in short document classification. Another finding is that the Top-n terms should be removed in consultation with medical experts, as it affects the classification performance.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Being able to accurately predict the risk of falling is crucial in patients with Parkinson’s dis- ease (PD). This is due to the unfavorable effect of falls, which can lower the quality of life as well as directly impact on survival. Three methods considered for predicting falls are decision trees (DT), Bayesian networks (BN), and support vector machines (SVM). Data on a 1-year prospective study conducted at IHBI, Australia, for 51 people with PD are used. Data processing are conducted using rpart and e1071 packages in R for DT and SVM, con- secutively; and Bayes Server 5.5 for the BN. The results show that BN and SVM produce consistently higher accuracy over the 12 months evaluation time points (average sensitivity and specificity > 92%) than DT (average sensitivity 88%, average specificity 72%). DT is prone to imbalanced data so needs to adjust for the misclassification cost. However, DT provides a straightforward, interpretable result and thus is appealing for helping to identify important items related to falls and to generate fallers’ profiles.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

A technique is proposed for classifying respiratory volume waveforms(RVW) into normal and abnormal categories of respiratory pathways. The proposed method transforms the temporal sequence into frequency domain by using an orthogonal transform, namely discrete cosine transform (DCT) and the transformed signal is pole-zero modelled. A Bayes classifier using model pole angles as the feature vector performed satisfactorily when a limited number of RVWs recorded under deep and rapid (DR) manoeuvre are classified.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

There are many popular models available for classification of documents like Naïve Bayes Classifier, k-Nearest Neighbors and Support Vector Machine. In all these cases, the representation is based on the “Bag of words” model. This model doesn't capture the actual semantic meaning of a word in a particular document. Semantics are better captured by proximity of words and their occurrence in the document. We propose a new “Bag of Phrases” model to capture this discriminative power of phrases for text classification. We present a novel algorithm to extract phrases from the corpus using the well known topic model, Latent Dirichlet Allocation(LDA), and to integrate them in vector space model for classification. Experiments show a better performance of classifiers with the new Bag of Phrases model against related representation models.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

With the preponderance of multidomain proteins in eukaryotic genomes, it is essential to recognize the constituent domains and their functions. Often function involves communications across the domain interfaces, and the knowledge of the interacting sites is essential to our understanding of the structure-function relationship. Using evolutionary information extracted from homologous domains in at least two diverse domain architectures (single and multidomain), we predict the interface residues corresponding to domains from the two-domain proteins. We also use information from the three-dimensional structures of individual domains of two-domain proteins to train naive Bayes classifier model to predict the interfacial residues. Our predictions are highly accurate (approximate to 85%) and specific (approximate to 95%) to the domain-domain interfaces. This method is specific to multidomain proteins which contain domains in at least more than one protein architectural context. Using predicted residues to constrain domain-domain interaction, rigid-body docking was able to provide us with accurate full-length protein structures with correct orientation of domains. We believe that these results can be of considerable interest toward rational protein and interaction design, apart from providing us with valuable information on the nature of interactions. Proteins 2014; 82:1219-1234. (c) 2013 Wiley Periodicals, Inc.