118 results for Feature Selection
in Consorci de Serveis Universitaris de Catalunya (CSUC), Spain
Abstract:
We are going to implement the "GA-SEFS" by Tsymbal and analyse experimentally its performance depending on the classifier algorithms used in the fitness function (NB, MNge, SMO). We are also going to study the effect of adding to the fitness function a measure to control complexity of the base classifiers.
Abstract:
Diagnosis of community-acquired legionella pneumonia (CALP) is currently performed by means of laboratory techniques, which may delay diagnosis by several hours. To determine whether an artificial neural network (ANN) can distinguish CALP from non-legionella community-acquired pneumonia (NLCAP) and serve as a standard tool for clinicians, we prospectively studied 203 patients with community-acquired pneumonia (CAP) diagnosed by laboratory tests. Twenty-one clinical and analytical variables were recorded to train a neural net with two classes (LCAP or NLCAP class). In this paper we deal with the problem of diagnosis, feature selection, and ranking of the features as a function of their classification importance, and with the design of a classifier under the criterion of maximizing the ROC (receiver operating characteristic) area, which gives a good trade-off between true positives and false negatives. To guarantee the validity of the statistics, the train-validation-test databases were rotated by the jackknife technique, and a multistart procedure was used to make the system insensitive to local maxima.
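The ROC area used as the design criterion above can be computed directly from classifier scores. The following is a minimal sketch, assuming binary labels (1 for the positive class, 0 for the negative class) and scores where higher means "more likely positive"; the function name `roc_auc` is illustrative and is not taken from the paper.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via the rank-sum (Mann-Whitney) identity.

    scores: classifier outputs, higher means "more likely positive".
    labels: 1 for positive cases, 0 for negative cases.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    # Count positive/negative pairs that are ranked correctly; ties count one half.
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))
```

The pairwise count is quadratic in the number of cases, which is acceptable for a cohort of a few hundred patients; a sort-based version would be preferable for much larger data sets.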
Abstract:
Peer-reviewed
Abstract:
Alzheimer's disease (AD) is the most common type of dementia among the elderly. This work is part of a larger study that aims to identify novel technologies and biomarkers or features for the early detection of AD and its degree of severity. The diagnosis is made by analyzing several biomarkers and conducting a variety of tests (although only a post-mortem examination of the patients’ brain tissue is considered to provide definitive confirmation). Non-invasive intelligent diagnosis techniques would be a very valuable diagnostic aid. This paper concerns the Automatic Analysis of Emotional Response (AAER) in spontaneous speech based on classical and new emotional speech features: Emotional Temperature (ET) and fractal dimension (FD). This is a pre-clinical study aiming to validate tests and biomarkers for future diagnostic use. The method has the great advantage of being non-invasive, low cost, and without any side effects. The AAER shows very promising results for the definition of features useful in the early diagnosis of AD.
Abstract:
Objective. Recently, significant advances have been made in the early diagnosis of Alzheimer’s disease from EEG. However, choosing suitable measures is a challenging task. Among other measures, frequency Relative Power and loss of complexity have been used with promising results. In the present study we investigate the early diagnosis of AD using synchrony measures and frequency Relative Power on EEG signals, examining the changes found in different frequency ranges. Approach. We first explore the use of a single feature for computing the classification rate, looking for the best frequency range. Then, we present a multiple feature classification system that outperforms all previous results using a feature selection strategy. These two approaches are tested in two different databases, one containing MCI and healthy subjects (patients age: 71.9 ± 10.2; healthy subjects age: 71.7 ± 8.3), and the other containing Mild AD and healthy subjects (patients age: 77.6 ± 10.0; healthy subjects age: 69.4 ± 11.5). Main Results. Using a single feature to compute classification rates we achieve a performance of 78.33% for the MCI data set and of 97.56% for Mild AD. Results are clearly improved using the multiple feature classification, where a classification rate of 95% is found for the MCI data set using 11 features, and 100% for the Mild AD data set using 4 features. Significance. The new feature selection method described in this work may be a reliable tool that could help to design a realistic system that does not require prior knowledge of a patient's status. With that aim, we explore the standardization of features for MCI and Mild AD data sets with promising results.
Abstract:
The work presented here is part of a larger study to identify novel technologies and biomarkers for early Alzheimer's disease (AD) detection, and it focuses on evaluating the suitability of a new approach for early AD diagnosis by non-invasive methods. The purpose is to examine, in a pilot study, the potential of applying intelligent algorithms to speech features obtained from suspected patients in order to improve the diagnosis of AD and of its degree of severity. To this end, Artificial Neural Networks (ANN) have been used for the automatic classification of the two classes (AD and control subjects). Two human issues have been analyzed for feature selection: Spontaneous Speech and Emotional Response. Not only linear features but also non-linear ones, such as Fractal Dimension, have been explored. The approach is non-invasive, low cost and without any side effects. The experimental results obtained were very satisfactory and promising for early diagnosis and classification of AD patients.
Abstract:
A parts-based model is a parametrization of an object class using a collection of landmarks following the object structure. The matching of parts-based models is one of the problems where pairwise Conditional Random Fields have been successfully applied. The main reason for their effectiveness is tractable inference and learning due to the simplicity of the graphs involved, usually trees. However, these models do not consider possible patterns of statistics among sets of landmarks, and thus they suffer from using too myopic information. To overcome this limitation, we propose a novel structure based on hierarchical Conditional Random Fields, which we explain in the first part of this memory. We build a hierarchy of combinations of landmarks, where matching is performed taking into account the whole hierarchy. To preserve tractable inference we effectively sample the label set. We test our method on facial feature selection and human pose estimation on two challenging datasets: Buffy and MultiPIE. In the second part of this memory, we present a novel approach to multiple kernel combination that relies on stacked classification. This method can be used to evaluate the landmarks of the parts-based model approach. Our method is based on combining responses of a set of independent classifiers for each individual kernel. Unlike earlier approaches that linearly combine kernel responses, our approach uses them as inputs to another set of classifiers. We will show that we outperform state-of-the-art methods on most of the standard benchmark datasets.
Abstract:
Photo-mosaicing techniques have become popular for seafloor mapping in various marine science applications. However, the common methods cannot accurately map regions with high relief and topographical variations. Ortho-mosaicing, borrowed from photogrammetry, is an alternative technique that takes the 3-D shape of the terrain into account. A serious bottleneck is the volume of elevation information that needs to be estimated from the video data, fused, and processed for the generation of a composite ortho-photo that covers a relatively large seafloor area. We present a framework that combines the advantages of dense depth-map and 3-D feature estimation techniques based on visual motion cues. The main goal is to identify and reconstruct certain key terrain feature points that adequately represent the surface with minimal complexity in the form of piecewise planar patches. The proposed implementation utilizes local depth maps for feature selection, while tracking over several views enables 3-D reconstruction by bundle adjustment. Experimental results with synthetic and real data validate the effectiveness of the proposed approach.
Abstract:
Mosaics have been commonly used as visual maps for undersea exploration and navigation. The position and orientation of an underwater vehicle can be calculated by integrating the apparent motion of the images which form the mosaic. A feature-based mosaicking method is proposed in this paper. The creation of the mosaic is accomplished in four stages: feature selection and matching, detection of points describing the dominant motion, homography computation and mosaic construction. In this work we demonstrate that the use of color and textures as discriminative properties of the image can improve, to a large extent, the accuracy of the constructed mosaic. The system is able to provide 3D metric information concerning the vehicle motion using the knowledge of the intrinsic parameters of the camera while integrating the measurements of an ultrasonic sensor. The method has been tested on real images from the GARBI underwater vehicle.
Abstract:
In this work we present the results of experimental work on the development of lexical class-based lexica by automatic means. Our purpose is to assess the use of linguistic lexical-class based information as a feature selection methodology for the use of classifiers in quick lexical development. The results show that the approach can help reduce the human effort required in the development of language resources significantly.
Abstract:
Alzheimer's disease is the most prevalent form of progressive degenerative dementia; it has a high socio-economic impact in Western countries. Therefore it is one of the most active research areas today. Alzheimer's is sometimes diagnosed by excluding other dementias, and definitive confirmation is only obtained through a post-mortem study of the brain tissue of the patient. The work presented here is part of a larger study that aims to identify novel technologies and biomarkers for early Alzheimer's disease detection, and it focuses on evaluating the suitability of a new approach for early diagnosis of Alzheimer’s disease by non-invasive methods. The purpose is to examine, in a pilot study, the potential of applying Machine Learning algorithms to speech features obtained from suspected Alzheimer sufferers in order to help diagnose this disease and determine its degree of severity. Two human capabilities relevant in communication have been analyzed for feature selection: Spontaneous Speech and Emotional Response. The experimental results obtained were very satisfactory and promising for the early diagnosis and classification of Alzheimer’s disease patients.
Abstract:
In this study, a wrapper approach was applied to objectively select the most important variables related to two different anaerobic digestion imbalances, acidogenic states and foaming. This feature selection method, implemented in artificial neural networks (ANN), was performed using input and output data from a fully instrumented pilot plant (1 m³ upflow fixed bed digester). Results for acidogenic states showed that pH, volatile fatty acids, and inflow rate were the most relevant variables. Results for foaming showed that inflow rate and total organic carbon were among the relevant variables, both of which were related to the feed loading of the digester. Because there is no complete agreement on the causes of foaming, these results highlight the role of digester feeding patterns in the development of foaming.
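A wrapper approach scores candidate feature subsets by training and evaluating the classifier itself on each subset. The sketch below illustrates the idea with greedy forward selection. It rests on assumptions that are not from the study: a nearest-centroid classifier stands in for the ANN, leave-one-out accuracy is the score, and the names `loo_accuracy` and `forward_select` are hypothetical.

```python
def loo_accuracy(rows, labels, features):
    """Leave-one-out accuracy of a nearest-centroid classifier (a stand-in
    for the ANN, assumed here) restricted to the given feature indices."""
    correct = 0
    for i in range(len(rows)):
        # Build one centroid per class from every sample except sample i.
        sums, counts = {}, {}
        for j, (row, y) in enumerate(zip(rows, labels)):
            if j == i:
                continue
            acc = sums.setdefault(y, [0.0] * len(features))
            for k, f in enumerate(features):
                acc[k] += row[f]
            counts[y] = counts.get(y, 0) + 1

        def dist(y):
            # Squared Euclidean distance from sample i to the centroid of class y.
            return sum((rows[i][f] - sums[y][k] / counts[y]) ** 2
                       for k, f in enumerate(features))

        predicted = min(sums, key=dist)
        correct += predicted == labels[i]
    return correct / len(rows)


def forward_select(rows, labels):
    """Greedy wrapper: repeatedly add the feature that most improves the
    score, and stop as soon as no remaining feature improves it."""
    remaining = set(range(len(rows[0])))
    selected, best = [], 0.0
    while remaining:
        scored = [(loo_accuracy(rows, labels, selected + [f]), f)
                  for f in sorted(remaining)]
        score, feat = max(scored, key=lambda t: t[0])
        if score <= best:
            break
        selected.append(feat)
        remaining.remove(feat)
        best = score
    return selected, best
```

Called as `forward_select(rows, labels)`, with `rows` a list of equal-length numeric feature vectors, it returns the selected column indices and the score they reach. Greedy search evaluates far fewer subsets than an exhaustive search, at the cost of possibly missing features that are only useful in combination.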
Abstract:
This paper deals with the potential and limitations of using voice and speech processing to detect Obstructive Sleep Apnea (OSA). An extensive body of voice features has been extracted from patients who present various degrees of OSA as well as healthy controls. We analyse the utility of a reduced set of features for detecting OSA. We apply various feature selection and reduction schemes (statistical ranking, Genetic Algorithms, PCA, LDA) and compare various classifiers (Bayesian Classifiers, kNN, Support Vector Machines, neural networks, Adaboost). S-fold cross-validation performed on 248 subjects shows that in the extreme cases (that is, 127 controls and 121 patients with severe OSA) voice alone is able to discriminate quite well between the presence and absence of OSA. However, this is not the case with mild OSA and healthy snoring patients, where voice seems to play a secondary role. We found that the best classification schemes are achieved using a Genetic Algorithm for feature selection/reduction.
Abstract:
Given $n$ independent replicates of a jointly distributed pair $(X,Y)\in {\cal R}^d \times {\cal R}$, we wish to select from a fixed sequence of model classes ${\cal F}_1, {\cal F}_2, \ldots$ a deterministic prediction rule $f: {\cal R}^d \to {\cal R}$ whose risk is small. We investigate the possibility of empirically assessing the {\em complexity} of each model class, that is, the actual difficulty of the estimation problem within each class. The estimated complexities are in turn used to define an adaptive model selection procedure, which is based on complexity penalized empirical risk. The available data are divided into two parts. The first is used to form an empirical cover of each model class, and the second is used to select a candidate rule from each cover based on empirical risk. The covering radii are determined empirically to optimize a tight upper bound on the estimation error. An estimate is chosen from the list of candidates in order to minimize the sum of class complexity and empirical risk. A distinguishing feature of the approach is that the complexity of each model class is assessed empirically, based on the size of its empirical cover. Finite sample performance bounds are established for the estimates, and these bounds are applied to several non-parametric estimation problems. The estimates are shown to achieve a favorable tradeoff between approximation and estimation error, and to perform as well as if the distribution-dependent complexities of the model classes were known beforehand. In addition, it is shown that the estimate can be consistent, and even possess near optimal rates of convergence, when each model class has an infinite VC or pseudo dimension. For regression estimation with squared loss we modify our estimate to achieve a faster rate of convergence.
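The final selection step described above can be written schematically as follows. The notation is an assumption made for illustration and does not come from the abstract: $\hat f_k$ denotes the candidate rule chosen from the empirical cover of ${\cal F}_k$, $\hat R(\hat f_k)$ its empirical risk, and $\hat C_k$ the empirically assessed complexity of ${\cal F}_k$.

$$\hat k = \arg\min_{k \ge 1} \left\{ \hat R(\hat f_k) + \hat C_k \right\}, \qquad \hat f = \hat f_{\hat k}.$$

The exact form of $\hat C_k$ (how it depends on the size of the empirical cover and on the sample size) is given in the paper and is not reproduced here.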
Abstract:
Current technology trends in the medical device industry call for fabrication of massive arrays of microfeatures, such as microchannels, onto nonsilicon material substrates with high accuracy, superior precision, and high throughput. Microchannels are typical features used in medical devices for medication dosing into the human body and for analyzing DNA arrays or cell cultures. In this study, the capabilities of machining systems for micro-end milling have been evaluated by conducting experiments, regression modeling, and response surface methodology. In machining experiments using micromilling, arrays of microchannels are fabricated on aluminium and titanium plates, and the feature size and accuracy (width and depth) and surface roughness are measured. Multicriteria decision making for the selection of material and process parameters for a desired accuracy is investigated using the particle swarm optimization (PSO) method, which is an evolutionary computation method inspired by genetic algorithms (GA). Appropriate regression models are utilized within the PSO, and optimum selection of micromilling parameters for microchannel feature accuracy and surface roughness is performed. An analysis of optimal micromachining parameters in decision variable space is also conducted. This study demonstrates the advantages of evolutionary computing algorithms in micromilling decision making and process optimization investigations and can be expanded to other applications.
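As an illustration of the optimizer named above, here is a minimal sketch of a basic global-best PSO that minimizes a user-supplied objective over box bounds. Everything in it is assumed for illustration: the function name `pso_minimize`, the inertia and acceleration weights, the swarm size, and the single-objective form. In the study the objective would come from the fitted regression models, which are not reproduced here.

```python
import random


def pso_minimize(objective, bounds, n_particles=20, n_iters=100, seed=0):
    """Basic global-best particle swarm optimization (illustrative sketch).

    objective: function mapping a list of floats to a float to be minimized.
    bounds: list of (low, high) pairs, one per decision variable.
    """
    rng = random.Random(seed)
    w, c1, c2 = 0.7, 1.5, 1.5  # inertia, cognitive, social weights (assumed values)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]               # best position seen by each particle
    best_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: best_val[i])
    g_pos, g_val = best_pos[g][:], best_val[g]   # best position seen by the swarm

    for _ in range(n_iters):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (best_pos[i][d] - pos[i][d])
                             + c2 * r2 * (g_pos[d] - pos[i][d]))
                # Clamp so the particle stays inside the search box.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < best_val[i]:
                best_pos[i], best_val[i] = pos[i][:], val
                if val < g_val:
                    g_pos, g_val = pos[i][:], val
    return g_pos, g_val
```

For example, `pso_minimize(lambda p: (p[0] - 1.0) ** 2 + (p[1] + 2.0) ** 2, [(-5.0, 5.0), (-5.0, 5.0)])` searches a two-variable box for the minimizer of a simple quadratic. A multicriteria problem such as the one in the abstract would need either a weighted combination of the criteria or a multi-objective PSO variant.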