870 results for Classifier Generalization Ability
Abstract:
The problem of recognition on a finite set of events is considered. The generalization ability of classifiers for this problem is studied within the Bayesian approach. A method for specifying a non-uniform prior distribution on recognition tasks is suggested. It takes into account the assumed degree of intersection between classes. The results of the analysis are applied to the pruning of classification trees.
Abstract:
BACKGROUND Functional brain images such as Single-Photon Emission Computed Tomography (SPECT) and Positron Emission Tomography (PET) have been widely used to guide clinicians in the diagnosis of Alzheimer's Disease (AD). However, the subjectivity involved in their evaluation has favoured the development of Computer Aided Diagnosis (CAD) systems. METHODS A novel combination of feature extraction techniques is proposed to improve the diagnosis of AD. Firstly, Regions of Interest (ROIs) are selected by means of a t-test carried out on 3D Normalised Mean Square Error (NMSE) features restricted to lie within a predefined brain activation mask. In order to address the small sample-size problem, the dimension of the feature space was further reduced by Large Margin Nearest Neighbours using a rectangular matrix (LMNN-RECT), Principal Component Analysis (PCA) or Partial Least Squares (PLS) (the latter two also analysed with an LMNN transformation). Regarding the classifiers, kernel Support Vector Machines (SVMs) and LMNN using Euclidean, Mahalanobis and Energy-based metrics were compared. RESULTS Several experiments were conducted in order to evaluate the proposed LMNN-based feature extraction algorithms and their benefits as: i) a linear transformation of the PLS- or PCA-reduced data, ii) a feature reduction technique, and iii) a classifier (with Euclidean, Mahalanobis or Energy-based methodology). The system was evaluated by means of k-fold cross-validation, yielding accuracy, sensitivity and specificity values of 92.78%, 91.07% and 95.12% (for SPECT) and 90.67%, 88% and 93.33% (for PET), respectively, when an NMSE-PLS-LMNN feature extraction method was used in combination with an SVM classifier, thus outperforming recently reported baseline methods. CONCLUSIONS All the proposed methods turned out to be valid solutions for the presented problem.
One of the advances is the robustness of the LMNN algorithm, which not only provides a higher separation rate between the classes but also (in combination with NMSE and PLS) makes the variation of this rate more stable. In addition, their generalization ability is another advance, since several experiments were performed on two image modalities (SPECT and PET).
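The accuracy, sensitivity and specificity figures quoted in this abstract are all derived from a binary confusion matrix. The sketch below shows the arithmetic only. The function name and the counts are hypothetical: the counts were picked because they happen to reproduce the SPECT percentages, and the abstract itself does not report a confusion matrix.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Return (accuracy, sensitivity, specificity) as percentages."""
    accuracy = 100 * (tp + tn) / (tp + fn + tn + fp)
    sensitivity = 100 * tp / (tp + fn)  # share of patients correctly flagged
    specificity = 100 * tn / (tn + fp)  # share of controls correctly cleared
    return accuracy, sensitivity, specificity


# Hypothetical counts (an assumption, not taken from the abstract).
acc, sens, spec = diagnostic_metrics(tp=51, fn=5, tn=39, fp=2)
print(f"accuracy={acc:.2f}% sensitivity={sens:.2f}% specificity={spec:.2f}%")
```

In a k-fold setting the counts would normally be accumulated over the held-out folds before the ratios are taken.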
Abstract:
This paper discusses ECG classification after parametrizing the ECG waveforms in the wavelet domain. The aim of the work is to develop an accurate classification algorithm that can be used to diagnose cardiac beat abnormalities detected using a mobile platform such as a smartphone. Continuous-time recurrent neural network classifiers are considered for this task. Records from the European ST-T Database are decomposed in the wavelet domain using discrete wavelet transform (DWT) filter banks, and the resulting DWT coefficients are filtered and used as inputs for training the neural network classifier. Advantages of the proposed methodology are the reduced memory requirement for the signals, which is of relevance to mobile applications, as well as an improvement in the generalization ability of the neural network due to the more parsimonious representation of the signal at its inputs.
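As a rough illustration of how a DWT filter bank turns a waveform into a compact set of coefficients, the sketch below applies a two-level Haar decomposition. The Haar wavelet, the function names and the sample values are all assumptions made for illustration; the abstract does not say which wavelet or how many levels were used.

```python
import math


def haar_dwt(signal):
    """One level of the orthonormal Haar DWT: (approximation, detail)."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    s = 1 / math.sqrt(2)
    pairs = list(zip(signal[0::2], signal[1::2]))
    approx = [(a + b) * s for a, b in pairs]  # low-pass branch, downsampled
    detail = [(a - b) * s for a, b in pairs]  # high-pass branch, downsampled
    return approx, detail


def haar_decompose(signal, levels):
    """Repeat the split on the approximation band.

    Returns [final approximation, coarsest detail, ..., finest detail].
    """
    approx, bands = list(signal), []
    for _ in range(levels):
        approx, detail = haar_dwt(approx)
        bands.insert(0, detail)
    return [approx] + bands


# Hypothetical 8-sample segment standing in for an ECG beat.
beat = [4.0, 4.0, 2.0, 0.0, 1.0, 1.0, 3.0, 5.0]
coeffs = haar_decompose(beat, levels=2)
print([len(band) for band in coeffs])
```

Because the transform is orthonormal, the coefficients carry the same energy as the original samples, so small detail coefficients can be dropped to obtain a shorter input vector.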
Abstract:
Objective: To investigate whether spirography-based objective measures are able to effectively characterize the severity of unwanted symptom states (Off and dyskinesia) and discriminate them from the motor state of healthy elderly subjects. Background: Sixty-five patients with advanced Parkinson’s disease (PD) and 10 healthy elderly (HE) subjects performed repeated assessments of spirography, using a touch screen telemetry device in their home environments. On inclusion, the patients were either treated with levodopa-carbidopa intestinal gel or were candidates for switching to this treatment. On each test occasion, the subjects were asked to trace a pre-drawn Archimedes spiral shown on the screen, using an ergonomic pen stylus. The test was repeated three times and was performed using the dominant hand. A clinician used a web interface which animated the spiral drawings, allowing him to observe different kinematic features, like accelerations and spatial changes, during the drawing process and to rate different motor impairments. Initially, the motor impairments of drawing speed, irregularity and hesitation were rated on a scale from 0 (normal) to 4 (extremely severe), followed by marking the momentary motor state of the patient as one of two categories, Off or dyskinesia. A sample of spirals drawn by HE subjects was randomly selected and used in subsequent analysis. Methods: The raw spiral data, consisting of stylus position and timestamp, were processed using time series analysis techniques like discrete wavelet transform, approximate entropy and dynamic time warping in order to extract 13 quantitative measures for representing meaningful motor impairment information. A principal component analysis (PCA) was used to reduce the dimensions of the quantitative measures into 4 principal components (PC).
In order to classify the motor states into three categories (Off, HE and dyskinesia), a logistic regression model was used as a classifier to map the 4 PCs to the corresponding clinically assigned motor state categories. A stratified 10-fold cross-validation (also known as rotation estimation) was applied to assess the generalization ability of the logistic regression classifier to future independent data sets. To investigate mean differences of the 4 PCs across the three categories, a one-way ANOVA test followed by Tukey multiple comparisons was used. Results: The agreement between computed and clinician ratings was very good, with a weighted area under the receiver operating characteristic curve (AUC) coefficient of 0.91. The mean PC scores differed across the three motor state categories, although at different levels. The first 2 PCs were good at discriminating between the motor states, whereas PC3 was good at discriminating between HE subjects and PD patients. The mean scores of PC4 showed a trend across the three states but without significant differences. The Spearman’s rank correlations between the first 2 PCs and clinically assessed motor impairments were as follows: drawing speed (PC1, 0.34; PC2, 0.83), irregularity (PC1, 0.17; PC2, 0.17), and hesitation (PC1, 0.27; PC2, 0.77). Conclusions: These findings suggest that spirography-based objective measures are valid measures of spatial- and time-dependent deficits and can be used to distinguish drug-related motor dysfunctions between Off and dyskinesia in PD. These measures can be potentially useful during clinical evaluation of individualized drug-related complications such as over- and under-medication, thus maximizing the amount of time the patients spend in the On state.
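Stratified k-fold cross-validation, as mentioned in this abstract, keeps the class proportions roughly equal in every fold. The sketch below shows one simple way to build such folds. The function name and the label counts are hypothetical, and the study's actual software and class sizes are not stated in the abstract.

```python
from collections import defaultdict


def stratified_folds(labels, k):
    """Assign sample indices to k folds with near-equal class proportions."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    folds = [[] for _ in range(k)]
    position = 0  # running counter so fold sizes stay balanced across classes
    for indices in by_class.values():
        for index in indices:
            folds[position % k].append(index)
            position += 1
    return folds


# Hypothetical label vector (class sizes are an assumption).
labels = ["Off"] * 20 + ["HE"] * 10 + ["Dyskinesia"] * 20
folds = stratified_folds(labels, k=10)
for fold in folds[:2]:
    print(sorted(labels[i] for i in fold))
```

Each fold then serves once as the held-out test set while the classifier is fitted on the remaining nine, and the held-out scores are averaged.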
Abstract:
Usually, generalization is considered as a function of learning from a set of examples. In the present work, on the basis of the recent neural network assembly memory model (NNAMM), a biologically plausible 'grandmother' model for vision, in which each separate memory unit can itself generalize, is proposed. For such generalization by computation through memory, analytical formulae and a numerical procedure are found that calculate exactly the generalization ability of a perfectly learned memory unit. The model's memory has a complex hierarchical structure, can be learned from one example by a one-step process, and may be considered a semi-representational one. A simple binary neural network for bell-shaped tuning is described.
Abstract:
In a wastewater treatment plant (WWTP), the costs of both treating the wastewater and maintaining the equipment are high; for this reason, processes capable of transforming waste into useful products are sought. Anaerobic digestion (AD) is a currently available process that can help reduce environmental pollution while adding value to the by-products generated. During the AD process a gas, biogas, is produced that can be used as an energy source, thereby reducing the plant's energy dependence and the emission of greenhouse gases into the atmosphere. Optimizing the AD of sludge is essential for increasing biogas production, but the complexity of the process is an obstacle to its optimization. In this work, artificial neural networks (ANNs) were applied to the AD process of WWTP sludge. ANNs are simplified models inspired by the functioning of human neuronal cells that acquire knowledge through experience. Once an ANN has been created and trained, it produces approximately correct output values for the inputs supplied. This was the reason for using ANNs to optimize biogas production in digester I of the ETAR Norte plant of SIMRIA, using the NeuralTools™ program from Palisade™ to develop the ANNs. To this end, data from the last four years of operation of the digester were analyzed and processed. The results obtained led to the conclusion that the modeled ANNs show good generalization ability for the AD process. This case study is considered promising, providing a good basis for developing potentially more general ANN models which, applied together with the operating characteristics of a digester and the AD process, will make it possible to optimize biogas production in WWTPs.
Abstract:
PURPOSE: Statistical shape and appearance models play an important role in reducing the segmentation processing time of a vertebra and in improving results for 3D model development. Here, we describe the different steps in generating a statistical shape model (SSM) of the second cervical vertebra (C2) and provide the shape model for general use by the scientific community. The main difficulties in its construction are the morphological complexity of the C2 and its variability in the population. METHODS: The input dataset is composed of manually segmented anonymized patient computerized tomography (CT) scans. The alignment of the different datasets is done with Procrustes alignment on surface models, and the registration is then cast as a model-fitting problem using a Gaussian process. A principal component analysis (PCA)-based model is generated which includes the variability of the C2. RESULTS: The SSM was generated using 92 CT scans. The resulting SSM was evaluated for specificity, compactness and generalization ability. The SSM of the C2 is freely available to the scientific community in Slicer (open-source software for image analysis and scientific visualization) with a module created to visualize the SSM using Statismo, a framework for statistical shape modeling. CONCLUSION: The SSM of the vertebra allows the shape variability of the C2 to be represented. Moreover, the SSM will enable semi-automatic segmentation and 3D model generation of the vertebra, which would greatly benefit surgery planning.
Abstract:
Personalized medicine will revolutionize our capabilities to combat disease. Working toward this goal, a fundamental task is the deciphering of genetic variants that are predictive of complex diseases. Modern studies, in the form of genome-wide association studies (GWAS), have afforded researchers the opportunity to reveal new genotype-phenotype relationships through the extensive scanning of genetic variants. These studies typically contain over half a million genetic features for thousands of individuals. Examining this with methods other than univariate statistics is a challenging task requiring advanced algorithms that are scalable to the genome-wide level. In the future, next-generation sequencing (NGS) studies will contain an even larger number of common and rare variants. Machine learning-based feature selection algorithms have been shown to be able to create effective predictive models for various genotype-phenotype relationships. This work explores the problem of selecting genetic variant subsets that are the most predictive of complex disease phenotypes through various feature selection methodologies, including filter, wrapper and embedded algorithms. The examined machine learning algorithms were demonstrated not only to be effective at predicting the disease phenotypes, but also to do so efficiently through the use of computational shortcuts. While much of the work could be run on high-end desktops, some work was further extended so that it could be implemented on parallel computers, helping to ensure that the methods will also scale to NGS data sets. Further, these studies analyzed the relationships between various feature selection methods and demonstrated the need for careful testing when selecting an algorithm.
It was shown that there is no universally optimal algorithm for variant selection in GWAS, but rather methodologies need to be selected based on the desired outcome, such as the number of features to be included in the prediction model. It was also demonstrated that without proper model validation, for example using nested cross-validation, the models can result in overly optimistic prediction accuracies and decreased generalization ability. It is through the implementation and application of machine learning methods that one can extract predictive genotype-phenotype relationships and biological insights from genetic data sets.
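Nested cross-validation, which this abstract names as an example of proper model validation, tunes hyperparameters in an inner loop using training data only and scores the tuned model on an outer fold that played no part in the tuning. The sketch below is a minimal illustration under stated assumptions: the 1-D k-nearest-neighbour classifier, the toy data and all function names are hypothetical and are not the feature selection methods studied in the work.

```python
from statistics import mean


def folds(n, k):
    """Split indices 0..n-1 into k interleaved folds."""
    return [list(range(i, n, k)) for i in range(k)]


def knn_predict(train, x, k):
    """Majority vote over the k nearest 1-D points (labels are 0 or 1)."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    return int(2 * sum(label for _, label in nearest) > k)


def accuracy(train, test, k):
    hits = sum(knn_predict(train, x, k) == y for x, y in test)
    return hits / len(test)


def split(data, held_out):
    """Return (kept, held) lists for the given held-out indices."""
    held = set(held_out)
    kept = [d for i, d in enumerate(data) if i not in held]
    return kept, [data[i] for i in held_out]


def nested_cv(data, ks, outer=5, inner=3):
    scores = []
    for test_idx in folds(len(data), outer):
        train, test = split(data, test_idx)

        def inner_score(k):
            # Inner loop: only the outer-training data is visible here.
            parts = (split(train, idx) for idx in folds(len(train), inner))
            return mean(accuracy(fit, val, k) for fit, val in parts)

        best_k = max(ks, key=inner_score)
        # Outer loop: score the tuned model on data unseen during tuning.
        scores.append(accuracy(train, test, best_k))
    return mean(scores)


# Hypothetical, well-separated toy data: (feature, label) pairs.
data = [(float(i), 0) for i in range(10)] + [(float(i), 1) for i in range(20, 30)]
score = nested_cv(data, ks=[1, 3])
print(score)
```

Reporting the best inner-loop score instead of the outer-loop score is the shortcut that tends to produce optimistic estimates.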
Abstract:
The objective of this study was to predict, by means of artificial neural networks (ANNs), specifically multilayer perceptrons, the texture attributes of light cheese curds perceived by trained judges, based on instrumental texture measurements. Inputs to the network were the instrumental texture measurements of the light cheese curd (imitative and fundamental parameters). Output variables were the sensory attributes consistency and spreadability. Nine light cheese curd formulations composed of different combinations of fat and water were evaluated. The measurements obtained by the instrumental and sensory analyses of these formulations constituted the data set used for training and validation of the network. Network training was performed using a back-propagation algorithm. The network architecture selected was composed of 8-3-9-2 neurons in its layers, which quickly and accurately predicted the sensory texture attributes studied, showing a high correlation between the predicted and experimental values for the validation data set and excellent generalization ability, with a validation RMSE of 0.0506.
Abstract:
Analyzes the use of linear and neural network models for financial distress classification, with emphasis on the issues of input variable selection and model pruning. A data-driven method for selecting input variables (financial ratios, in this case) is proposed. A case study involving 60 British firms in the period 1997-2000 is used for illustration. It is shown that the use of the Optimal Brain Damage pruning technique can considerably improve the generalization ability of a neural model. Moreover, the set of financial ratios obtained with the proposed selection procedure is shown to be an appropriate alternative to the ratios usually employed by practitioners.
Abstract:
In recent decades, neural networks have been established as a major tool for the identification of nonlinear systems. Among the various types of networks used in identification, one that stands out is the wavelet neural network (WNN). This network combines the characteristics of wavelet multiresolution theory with the learning and generalization ability of neural networks, usually providing more accurate models than those obtained by traditional networks. An extension of WNNs combines the neuro-fuzzy ANFIS (Adaptive Network Based Fuzzy Inference System) structure with wavelets, giving rise to the Fuzzy Wavelet Neural Network (FWNN) structure. This network is very similar to ANFIS networks, with the difference that the traditional polynomials present in the consequent of the network are replaced by WNNs. This paper proposes the identification of nonlinear dynamical systems using a modified FWNN. In the proposed structure, only wavelet functions are used in the consequent. This simplifies the structure and reduces the number of adjustable parameters of the network. To evaluate the performance of the FWNN with this modification, its advantages, disadvantages and cost-effectiveness are analyzed in comparison with other FWNN structures in the literature. The evaluations are carried out via the identification of two simulated systems traditionally found in the literature and a real nonlinear system, consisting of a nonlinear multi-section tank. Finally, the network is used to infer values of temperature and humidity inside a neonatal incubator. These analyses are based on various criteria, such as mean squared error, number of training epochs, number of adjustable parameters and the variation of the mean squared error, among others. The results show the generalization ability of the modified structure, despite the simplification performed.
Abstract:
Static and cyclic tests are commonly used to test structural materials. Cyclic tests assess the fatigue behavior of the material and yield S-N curves, which are used to construct constant-life diagrams. However, when these diagrams are constructed from a small number of S-N curves, they underestimate or overestimate the actual behavior of the composite, so more testing is needed to obtain more accurate results. One way of reducing costs is therefore the statistical analysis of fatigue behavior. The aim of this research was to evaluate the probabilistic fatigue behavior of composite materials. The research was conducted in three parts. The first part consisted of combining the Weibull probability equation with the equations commonly used to model the S-N curves of composite materials, namely the exponential equation and the power law, and their generalizations. In the second part, the results obtained with the equation that best represents the probabilistic S-N curves were used to train a modular network for a 5% failure probability. In the third part, a comparative study was carried out between the results obtained using the piecewise nonlinear model (PNL) and the results of a modular network (MN) architecture in the analysis of fatigue behavior. For this purpose, a database of ten materials obtained from the literature was used to assess the generalization ability of the modular network as well as its robustness. The results showed that the generalized probabilistic power law best represents the probabilistic fatigue behavior of composites, and that although the generalization ability of the MN was not robust when trained with the 5% failure rate, for mean values the MN gave more accurate results than the PNL model.
Abstract:
One objective of the feeder reconfiguration problem in distribution systems is to minimize the power losses for a specific load. The mathematical model of this problem is a mixed-integer nonlinear problem that is generally hard to solve. This paper proposes an algorithm based on artificial neural network theory. In this context, clustering techniques to determine the best training set for a single neural network with generalization ability are also presented. The proposed methodology was applied to two electrical systems and gave good results. Moreover, the methodology can be employed for large-scale systems in a real-time environment.
Abstract:
Statistical shape models (SSMs) have been used widely as a basis for segmenting and interpreting complex anatomical structures. The robustness of these models is sensitive to the registration procedures, i.e., the establishment of a dense correspondence across a training data set. In this work, two SSMs based on the same training data set of scoliotic vertebrae and the same registration procedures were compared. The first model was constructed based on the original binary masks without applying any image pre- and post-processing, and the second was obtained by means of a feature-preserving smoothing method applied to the original training data set, followed by a standard rasterization algorithm. The accuracies of the correspondences were assessed quantitatively by means of the maximum of the mean minimum distance (MMMD) and the Hausdorff distance (H(D)). Anatomical validity of the models was quantified by means of three different criteria, i.e., compactness, specificity, and model generalization ability. The objective of this study was to compare quasi-identical models based on standard metrics. Preliminary results suggest that the MMMD distance and eigenvalues are not sensitive metrics for evaluating the performance and robustness of SSMs.
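The two correspondence metrics named in this abstract are both built from nearest-point distances between two surfaces. The sketch below computes the symmetric Hausdorff distance and a mean minimum distance for small point sets. The function names and sample points are hypothetical, and how the study aggregates the mean minimum distance into the MMMD is not specified in the abstract, so only the basic ingredients are shown.

```python
import math


def min_distances(a, b):
    """For each point in a, the distance to its nearest point in b."""
    return [min(math.dist(p, q) for q in b) for p in a]


def hausdorff(a, b):
    """Symmetric Hausdorff distance between two finite point sets."""
    return max(max(min_distances(a, b)), max(min_distances(b, a)))


def mean_min_distance(a, b):
    """Average nearest-point distance from a to b (not symmetric)."""
    distances = min_distances(a, b)
    return sum(distances) / len(distances)


# Hypothetical 2-D point sets standing in for surface vertices.
shape_a = [(0.0, 0.0), (1.0, 0.0)]
shape_b = [(0.0, 0.0), (3.0, 0.0)]
print(hausdorff(shape_a, shape_b), mean_min_distance(shape_a, shape_b))
```

The Hausdorff distance reacts to the single worst-matched point, whereas the mean minimum distance averages over all points, which is one reason the two can rank models differently. This brute-force version is quadratic in the number of points, so real meshes would usually need a spatial index.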
Abstract:
This thesis contributes to research toward artificial intelligence using connectionist methods. Recurrent neural networks are an increasingly popular family of sequential models capable, in principle, of learning arbitrary algorithms. These models perform deep learning, a type of machine learning. Their generality and empirical success make them an interesting subject for research and a promising tool for creating more general artificial intelligence. The first chapter of this thesis gives a brief overview of the background topics: artificial intelligence, machine learning, deep learning and recurrent neural networks. The next three chapters cover these topics in increasingly specific detail. Finally, we present some contributions to recurrent neural networks. Chapter \ref{arxiv1} presents our work on regularizing recurrent neural networks. Regularization aims to improve the generalization ability of the model and plays a key role in the performance of several applications of recurrent neural networks, particularly speech recognition. Our approach achieves state-of-the-art results on TIMIT, a standard benchmark for this task. Chapter \ref{cpgp} presents a second line of work, still in progress, which explores a new architecture for recurrent neural networks. Recurrent neural networks maintain a hidden state that represents their previous observations. The idea of this work is to encode certain abstract dynamics in the hidden state, giving the network a natural way to encode consistent trends in the state of its environment. Our work is based on an existing model; we describe that work and our contributions, including a preliminary experiment.