919 results for Statistical classifiers


Relevance:

60.00%

Abstract:

Recent studies have demonstrated that spatial patterns of fMRI BOLD activity distribution over the brain may be used to classify different groups or mental states. These studies are based on the application of advanced pattern recognition approaches and multivariate statistical classifiers. Most published articles in this field focus on improving accuracy rates, and many approaches have been proposed to accomplish this task. Nevertheless, a point inherent to most machine learning methods (and still relatively unexplored in neuroimaging) is how the discriminative information can be used to characterize groups and their differences. In this work, we introduce the Maximum Uncertainty Linear Discriminant Analysis (MLDA) and show how it can be applied to infer group patterns by discriminant hyperplane navigation. In addition, we show that it naturally defines a behavioral score, i.e., an index quantifying the distance between the states of a subject and predefined groups. We validate and illustrate this approach using data from a motor block-design fMRI experiment with 35 subjects. (C) 2008 Elsevier Inc. All rights reserved.

Relevance:

60.00%

Abstract:

A new interaction paradigm is currently emerging, known as the Natural User Interface (NUI), for recognizing gestures made with the user's body. The Microsoft Kinect interaction device was originally designed for controlling video games on the Xbox 360 console. The device has proved to be a viable option for exploring other areas, such as supporting teaching and learning for primary-school children. The prototype developed here aims to define a mode of interaction based on drawing letters in the air and to interpret the drawn symbols using the pattern recognizers Kernel Discriminant Analysis (KDA), Support Vector Machines (SVM) and $N. The project was based on a study of the different NUI devices available on the market, NUI development libraries for this type of device, and pattern recognition algorithms. The first two elements gave a more concrete view of which hardware and software were available and suited to the intended goal. Pattern recognition is a very broad and complex topic, so it was necessary to select a limited set of these algorithms and test them to determine which best suited the intended goal. Applying the same conditions to the three pattern recognition algorithms made it possible to assess their capabilities and to identify $N as the one with the highest recognition effectiveness. Finally, the viability of the prototype was assessed by testing it with participants from two age groups to determine each group's capacity for adaptation and learning. In this study, the older group showed better initial performance with this mode of interaction. However, the younger group showed a growing ability to adapt to it, progressively improving its results.

Relevance:

60.00%

Abstract:

Remote sensing is a technology of great importance, allowing the capture of data from the Earth's surface for various purposes, including environmental monitoring, tracking the use of natural resources, geological prospecting and disaster monitoring. One of the main applications of remote sensing is the generation of thematic maps and the subsequent survey of areas from images generated by orbital or sub-orbital sensors. Pattern classification methods are used to implement computational routines that automate this activity. Artificial neural networks are viable alternatives to traditional statistical classifiers, mainly for applications whose data have high dimensionality, such as those from hyperspectral sensors. The main goal of this work is to develop a classifier based on radial basis function neural networks and Growing Neural Gas, which offers some advantages over using the individual neural networks. The main idea is to use the incremental characteristics of Growing Neural Gas to determine the number and choice of centers of the radial basis function network in order to obtain a highly effective classifier. To demonstrate the performance of the classifier, three case studies are presented along with the results.

Relevance:

60.00%

Abstract:

The heterogeneity of the geological medium introduces a high degree of uncertainty into underground construction projects, which must be properly managed in order to reduce the associated risks, which are mainly geotechnical. Among the main problems facing modern Rock Mechanics in underground construction are rock squeezing in tunnels and the failure of coal pillars. Their occurrence is widely known to harm the cost and safety of projects, so their study has traditionally been linked to predicting their occurrence. Existing solutions to these problems include those based on analytical and numerical methods. These methodologies can represent the real geotechnical behavior with high fidelity; however, they can only be used when sufficient geotechnical characterization is available, and therefore a detailed definition of the parameters that feed the complex constitutive models and failure criteria that the studied phenomena require. Naturally, this level of definition is only possible at advanced stages of the project, or even during construction itself, when the parameters entered in the models can be properly calibrated. This limits their use in the early stages, when prediction is most meaningful. Empirical methods, by contrast, provide solutions to these complex problems in a simple way, with low parameterization and, given their eminently observational approach, with high reliability when applied to boundary conditions similar to the original ones. The simplicity and small number of the parameters used allow these methodologies to be applied from the preliminary phases of the project, since the parameters are generally routine information that is easy and cheap to obtain. Prediction can therefore be incorporated from the beginning of the design process, anticipating the risk at its origin. This doctoral thesis presents a new empirical methodology for predicting the occurrence of squeezing and the failure of coal pillars, based on an extensive compilation of real case histories of tunnels and mines in which both phenomena were evaluated. This information, gathered from prestigious bibliographic references, has made it possible to compile one of the most extensive databases to date on these phenomena, which is in itself an important contribution to the state of the art. With all this information, and with the aid of the theory of statistical classifiers, a linear classifier of the logistic regression type has been implemented on the databases. It yields predictions of the occurrence of both phenomena in terms of probability, and therefore allows the uncertainty associated with the heterogeneity of the geological medium to be weighted. This aspect of the development is the real added value of the thesis and the main advantage of the proposed solution over other empirical methodologies. This capacity for probabilistic weighting makes the classifier a very attractive methodology for geotechnical risk assessment and decision making. Indeed, as a practical validation exercise, the solution has been implemented within a cost-benefit model for optimizing the design of the pillars of a "virtual" longwall mine. The classifier's ability to quantify the probability of failure of the design, together with a proper quantification of the consequences of that failure, has made it possible to define a risk law that is incorporated into the balance of costs and benefits. Through iterative resizing of the pillar system and of the mine layout itself, the model can maximize the economic result of the mining project under acceptable safety conditions fixed beforehand.
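
The idea of a logistic regression classifier that outputs a failure probability, which is then multiplied by a consequence cost to give a risk figure, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the two input features, the toy case histories, the learning rate and the cost value are hypothetical and are not taken from the thesis, and the risk is taken as a simple probability-times-cost product rather than the thesis's own risk law.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Fit weights and bias by batch gradient descent on the mean log-loss."""
    n, n_features = len(X), len(X[0])
    w, b = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def failure_probability(w, b, x):
    """Predicted probability that a design with features x fails."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def expected_risk(p_fail, consequence_cost):
    """Risk as probability of failure times the cost of that failure."""
    return p_fail * consequence_cost


# Hypothetical case histories: [width-to-height ratio, strength-to-stress ratio]
X = [[0.5, 0.8], [0.8, 0.9], [1.0, 1.0], [1.5, 1.4], [2.0, 1.8], [2.5, 2.2]]
y = [1, 1, 1, 0, 0, 0]  # 1 = pillar failed, 0 = pillar stable

w, b = fit_logistic(X, y)
p_slender = failure_probability(w, b, [0.6, 0.8])
p_stout = failure_probability(w, b, [2.2, 2.0])
print(f"slender pillar: p_fail={p_slender:.2f}, risk={expected_risk(p_slender, 1e6):.0f}")
print(f"stout pillar:   p_fail={p_stout:.2f}, risk={expected_risk(p_stout, 1e6):.0f}")
```

In a cost-benefit loop of the kind described, a risk term like this would be subtracted from the revenue of each candidate pillar layout, and layouts whose failure probability exceeds a preset safety threshold would be discarded.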

Relevance:

60.00%

Abstract:

There has been considerable recent research into the connection between Parkinson's disease (PD) and speech impairment. Recently, a wide range of speech signal processing algorithms (dysphonia measures) aiming to predict PD symptom severity using speech signals have been introduced. In this paper, we test how accurately these novel algorithms can be used to discriminate PD subjects from healthy controls. In total, we compute 132 dysphonia measures from sustained vowels. Then, we select four parsimonious subsets of these dysphonia measures using four feature selection algorithms, and map these feature subsets to a binary classification response using two statistical classifiers: random forests and support vector machines. We use an existing database consisting of 263 samples from 43 subjects, and demonstrate that these new dysphonia measures can outperform state-of-the-art results, reaching almost 99% overall classification accuracy using only ten dysphonia features. We find that some of the recently proposed dysphonia measures complement existing algorithms in maximizing the ability of the classifiers to discriminate healthy controls from PD subjects. We see these results as an important step toward noninvasive diagnostic decision support in PD.

Relevance:

60.00%

Abstract:

When remote sensing imagery is combined with statistical classifiers to obtain categorical thematic maps, it is not usual to provide data about the spatial distribution of the error and uncertainty of the resulting maps. This paper describes, in the context of the GeoViQua FP7 project, feasible approaches for methods based on several steps, such as hybrid classifiers. For both "per pixel" and "per polygon" strategies, the proposal relies on the available ground truth, which is used to properly model the spatial distribution of the errors. The results allow the classification success to be mapped with a very high level of reliability (R² > 0.94), giving users sound knowledge of the accuracy in every area of the map.

Relevance:

40.00%

Abstract:

Using techniques from Statistical Physics, the annealed VC entropy for hyperplanes in high dimensional spaces is calculated as a function of the margin for a spherical Gaussian distribution of inputs.

Relevance:

30.00%

Abstract:

Among the types of remote sensing acquisitions, optical images are certainly one of the most widely relied upon data sources for Earth observation. They provide detailed measurements of the electromagnetic radiation reflected or emitted by each pixel in the scene. Through a process termed supervised land-cover classification, this makes it possible to distinguish objects at the surface of our planet automatically yet accurately. In this respect, when producing a land-cover map of the surveyed area, the availability of training examples representative of each thematic class is crucial for the success of the classification procedure. However, in real applications, due to several constraints on the sample collection process, labeled pixels are usually scarce. When analyzing an image for which those key samples are unavailable, a viable solution consists in resorting to the ground truth data of other previously acquired images. This option is attractive, but several factors such as atmospheric, ground and acquisition conditions can cause radiometric differences between the images, thereby hindering the transfer of knowledge from one image to another. The goal of this thesis is to supply remote sensing image analysts with suitable processing techniques to ensure a robust portability of the classification models across different images. The ultimate purpose is to map the land-cover classes over large spatial and temporal extents with minimal ground information. To overcome, or simply quantify, the observed shifts in the statistical distribution of the spectra of the materials, we study four approaches drawn from the field of machine learning. First, we propose a strategy to intelligently sample the image of interest so that labels are collected only at the most useful pixels. This iterative routine is based on a constant evaluation of how pertinent the initial training data, which actually belong to a different image, are to the new image. Second, we present an approach to reduce the radiometric differences among the images by projecting the respective pixels into a common new data space. We analyze a kernel-based feature extraction framework suited for such problems, showing that, after this relative normalization, the cross-image generalization abilities of a classifier are greatly increased. Third, we test a new data-driven measure of distance between probability distributions to assess the distortions caused by differences in the acquisition geometry affecting series of multi-angle images. We also gauge the portability of classification models through the sequences. In both exercises, the efficacy of classic physically and statistically based normalization methods is discussed. Finally, we explore a new family of approaches based on sparse representations of the samples to reciprocally convert the data space of two images. The projection function bridging the images allows new pixels with more similar characteristics to be synthesized, ultimately facilitating land-cover mapping across images.

Relevance:

30.00%

Abstract:

In this study we propose an evaluation of the angular effects altering the spectral response of the land-cover over multi-angle remote sensing image acquisitions. The shift in the statistical distribution of the pixels observed in an in-track sequence of WorldView-2 images is analyzed by means of a kernel-based measure of distance between probability distributions. Afterwards, the portability of supervised classifiers across the sequence is investigated by looking at the evolution of the classification accuracy with respect to the changing observation angle. In this context, the efficiency of various physically and statistically based preprocessing methods in obtaining angle-invariant data spaces is compared and possible synergies are discussed.
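
As an illustration of what a kernel-based distance between probability distributions can look like, the sketch below computes a biased estimate of the squared Maximum Mean Discrepancy (MMD) with a Gaussian kernel. The abstract does not name the specific measure it uses, so the choice of MMD is an assumption, as are the kernel width `gamma` and the synthetic two-band "pixels" that stand in for real image spectra.

```python
import math
import random


def rbf(x, y, gamma):
    """Gaussian (RBF) kernel between two feature vectors."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def mmd2(X, Y, gamma=1.0):
    """Biased estimate of the squared MMD between samples X and Y."""
    kxx = sum(rbf(a, b, gamma) for a in X for b in X) / (len(X) ** 2)
    kyy = sum(rbf(a, b, gamma) for a in Y for b in Y) / (len(Y) ** 2)
    kxy = sum(rbf(a, b, gamma) for a in X for b in Y) / (len(X) * len(Y))
    return kxx + kyy - 2.0 * kxy


def sample(mean, n=100):
    """Hypothetical two-band pixels drawn around a common mean."""
    return [[random.gauss(mean, 1.0), random.gauss(mean, 1.0)] for _ in range(n)]


random.seed(0)
reference = sample(0.0)    # stand-in for the reference acquisition
small_shift = sample(0.2)  # stand-in for a slightly different view angle
large_shift = sample(1.5)  # stand-in for a strongly off-nadir view

print("MMD^2, small shift:", mmd2(reference, small_shift))
print("MMD^2, large shift:", mmd2(reference, large_shift))
```

A larger value indicates a larger shift between the two pixel distributions, so tracking it along an image sequence gives one way to relate distribution shift to observation angle.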

Relevance:

30.00%

Abstract:

In this research, the effectiveness of Naive Bayes and Gaussian Mixture Model classifiers in segmenting exudates in retinal images is studied, and the results are evaluated with metrics commonly used in medical imaging. A color variation analysis of retinal images is also carried out to find how effectively retinal images can be segmented using only the color information of the pixels.
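
A per-pixel Gaussian Naive Bayes classifier on color values can be sketched as follows. This is a minimal illustration, not the study's implementation: the RGB values, the two class names and the small variance floor are hypothetical choices made so the example is self-contained.

```python
import math


def fit_gnb(pixels, labels):
    """Estimate the prior and per-channel mean and variance for each class."""
    model = {}
    for c in set(labels):
        rows = [p for p, l in zip(pixels, labels) if l == c]
        n = len(rows)
        means = [sum(col) / n for col in zip(*rows)]
        # Small floor keeps the variance strictly positive.
        variances = [
            sum((v - m) ** 2 for v in col) / n + 1e-6
            for col, m in zip(zip(*rows), means)
        ]
        model[c] = (n / len(pixels), means, variances)
    return model


def predict_gnb(model, pixel):
    """Return the class with the highest log-posterior for one pixel."""
    def log_posterior(c):
        prior, means, variances = model[c]
        total = math.log(prior)
        for v, m, s2 in zip(pixel, means, variances):
            total += -0.5 * math.log(2 * math.pi * s2) - (v - m) ** 2 / (2 * s2)
        return total
    return max(model, key=log_posterior)


# Hypothetical labeled RGB pixels: bright yellowish exudates vs. darker background
pixels = [(230, 210, 90), (240, 220, 110), (225, 205, 100),
          (120, 50, 30), (140, 60, 40), (110, 45, 25)]
labels = ["exudate"] * 3 + ["background"] * 3

model = fit_gnb(pixels, labels)
print(predict_gnb(model, (235, 215, 100)))
print(predict_gnb(model, (125, 55, 35)))
```

Applying `predict_gnb` to every pixel of an image yields a binary segmentation mask that can then be scored against a manually annotated mask.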

Relevance:

30.00%

Abstract:

Mobile malware is increasing with the growing number of mobile users. Mobile malware can perform several operations that lead to cybersecurity threats, such as stealing financial or personal information, installing malicious applications, sending premium SMS, creating backdoors, keylogging and crypto-ransomware attacks. Although many illegitimate applications are available on the app stores, most mobile users remain careless about the security of their mobile devices and become potential victims of these threats. Previous studies have shown that not every antivirus is capable of detecting all threats, because mobile malware uses advanced techniques to avoid detection. A network-based IDS at the operator side will bring an extra layer of security to subscribers and can detect many advanced threats by analyzing their traffic patterns. Machine Learning (ML) will give these systems the ability to detect unknown threats for which signatures are not yet known. This research focuses on the evaluation of Machine Learning classifiers in network-based intrusion detection systems (NIDS) for mobile networks. Different techniques of network-based intrusion detection are discussed, with their advantages and disadvantages and the state of the art in hybrid solutions. Finally, an ML-based NIDS is proposed that will work as a subsystem of the network-based IDS deployed by mobile operators and can help in detecting unknown threats and reducing false positives. Several ML classifiers were implemented and evaluated. The study focuses on Android-based malware, as Android is the most popular OS among users and hence the most targeted by cyber criminals. Classifiers based on supervised ML algorithms were built using a dataset containing labeled instances of relevant features. These features were extracted from the traffic generated by samples of several malware families and benign applications. The classifiers were able to detect malicious traffic patterns with a TPR of up to 99.6% during the cross-validation test. Several experiments were also conducted to detect unknown malware traffic and to detect false positives. The classifiers were able to detect unknown threats with an accuracy of 97.5%. They could be integrated with current NIDS, which use signature, statistical or knowledge-based techniques to detect malicious traffic. A technique to integrate the output of the ML classifier with a traditional NIDS is discussed and proposed for future work.
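
The true positive rate (TPR), false positive rate and accuracy reported for classifiers of this kind are all derived from confusion-matrix counts. The short sketch below shows the standard definitions; the label vectors are hypothetical and do not reproduce the study's data.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives (1 = malicious, 0 = benign)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fn, fp, tn


def tpr(tp, fn):
    """Share of malicious flows that were flagged."""
    return tp / (tp + fn)


def fpr(fp, tn):
    """Share of benign flows that were wrongly flagged."""
    return fp / (fp + tn)


def accuracy(tp, fn, fp, tn):
    """Share of all flows that were classified correctly."""
    return (tp + tn) / (tp + fn + fp + tn)


# Hypothetical ground truth and classifier output for ten traffic flows
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]

tp, fn, fp, tn = confusion_counts(y_true, y_pred)
print("TPR:", tpr(tp, fn))
print("FPR:", fpr(fp, tn))
print("Accuracy:", accuracy(tp, fn, fp, tn))
```

Reporting the false positive rate alongside TPR matters for an operator-side NIDS, because benign traffic usually far outnumbers malicious traffic and even a small FPR can produce many alerts.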

Relevance:

30.00%

Abstract:

Learning Disability (LD) is a general term that describes specific kinds of learning problems. It is a neurological condition that affects a child's brain and impairs the ability to carry out one or more specific tasks. Learning disabled children are neither slow nor mentally retarded. The disorder can make it hard for a child to learn as quickly, or in the same way, as a child who is not affected by a learning disability. An affected child can have normal or above average intelligence, but may have difficulty paying attention, with reading or letter recognition, or with mathematics. In fact, many children who have learning disabilities are more intelligent than an average child. Learning disabilities vary from child to child: one child with LD may not have the same kind of learning problems as another. There is no cure for learning disabilities and they are life-long. However, children with LD can be high achievers and can be taught ways to get around the learning disability. In this research work, data mining using machine learning techniques is used to analyze the symptoms of LD, establish interrelationships between them and evaluate the relative importance of these symptoms. To increase the diagnostic accuracy of learning disability prediction, a highly accurate knowledge-based tool is proposed, built on statistical machine learning and data mining techniques applied to knowledge obtained from clinical information. The basic idea of the tool is to increase the accuracy of learning disability assessment and reduce the time it takes. Different statistical machine learning techniques in data mining are used in the study. The objectives of this research work also include identifying the important parameters for LD prediction using data mining techniques, identifying hidden relationships between the symptoms of LD, and estimating the relative significance of each symptom. The developed tool has many advantages over the traditional method of using checklists to determine learning disabilities. To improve the performance of various classifiers, we developed some preprocessing methods for the LD prediction system. A new system based on fuzzy and rough set models is also developed for LD prediction, and the importance of preprocessing is studied here as well. A Graphical User Interface (GUI) is designed to provide an integrated knowledge-based tool for predicting LD as well as its degree. The tool stores the details of the children in a student database and retrieves their LD reports as and when required. The study demonstrates the effectiveness of the tool, which is based on various machine learning techniques. It also identifies the important parameters of LD and accurately predicts learning disability in school-age children. This thesis makes several contributions in technical, general and social areas. The results are found to be very beneficial to parents, teachers and institutions, who are able to diagnose a child's problem at an early stage and seek proper treatment or counseling at the right time so as to avoid academic and social losses.

Relevance:

30.00%

Abstract:

We present distribution-independent bounds on the generalization misclassification performance of a family of kernel classifiers with margin. Support Vector Machine (SVM) classifiers belong to this class of machines. The bounds are derived through computations of the $V_\gamma$ dimension of a family of loss functions to which the SVM loss belongs. Bounds that use functions of margin distributions (i.e. functions of the slack variables of SVM) are also derived.

Relevance:

30.00%

Abstract:

The Support Vector (SV) machine is a novel type of learning machine, based on statistical learning theory, which contains polynomial classifiers, neural networks, and radial basis function (RBF) networks as special cases. In the RBF case, the SV algorithm automatically determines centers, weights and threshold so as to minimize an upper bound on the expected test error. The present study is devoted to an experimental comparison of these machines with a classical approach, where the centers are determined by $k$-means clustering and the weights are found using error backpropagation. We consider three machines, namely a classical RBF machine, an SV machine with Gaussian kernel, and a hybrid system with the centers determined by the SV method and the weights trained by error backpropagation. Our results show that on the US Postal Service database of handwritten digits, the SV machine achieves the highest test accuracy, followed by the hybrid approach. The SV approach is thus not only theoretically well-founded, but also superior in a practical application.

Relevance:

30.00%

Abstract:

The aim of the thesis is to analyze and test the main Machine Learning approaches applicable in semantic contexts, starting from Statistical Relational Learning algorithms such as Relational Probability Trees, Relational Bayesian Classifiers and Relational Dependency Networks, and then moving on to approaches based on tensor factorization, in particular CANDECOMP/PARAFAC, Tucker and RESCAL.