853 results for Learning machine
Abstract:
This paper reports an empirical comparison of seven machine learning algorithms for texture classification, with application to vegetation management in power line corridors. To classify tree species in power line corridors, an object-based method is employed. Individual tree crowns are segmented as the basic classification units, and three classic texture features are extracted as input to the classification algorithms. Several widely used performance metrics are used to evaluate the classification algorithms. The experimental results demonstrate that classification performance depends on the performance metric, the characteristics of the datasets, and the features used.
Abstract:
A diagnostic method based on Bayesian Networks (probabilistic graphical models) is presented. Unlike conventional diagnostic approaches, in this method instead of focusing on system residuals at one or a few operating points, diagnosis is done by analyzing system behavior patterns over a window of operation. It is shown how this approach can loosen the dependency of diagnostic methods on precise system modeling while maintaining the desired characteristics of fault detection and diagnosis (FDD) tools (fault isolation, robustness, adaptability, and scalability) at a satisfactory level. As an example, the method is applied to fault diagnosis in HVAC systems, an area with considerable modeling and sensor network constraints.
Abstract:
The primary genetic risk factor in multiple sclerosis (MS) is the HLA-DRB1*1501 allele; however, much of the remaining genetic contribution to MS has yet to be elucidated. Several lines of evidence support a role for neuroendocrine system involvement in autoimmunity which may, in part, be genetically determined. Here, we comprehensively investigated variation within eight candidate hypothalamic-pituitary-adrenal (HPA) axis genes and susceptibility to MS. A total of 326 SNPs were investigated in a discovery dataset of 1343 MS cases and 1379 healthy controls of European ancestry using a multi-analytical strategy. Random Forests, a supervised machine-learning algorithm, identified eight intronic SNPs within the corticotrophin-releasing hormone receptor 1 or CRHR1 locus on 17q21.31 as important predictors of MS. On the basis of univariate analyses, six CRHR1 variants were associated with decreased risk for disease following a conservative correction for multiple tests. Independent replication was observed for CRHR1 in a large meta-analysis comprising 2624 MS cases and 7220 healthy controls of European ancestry. Results from a combined meta-analysis of all 3967 MS cases and 8599 controls provide strong evidence for the involvement of CRHR1 in MS. The strongest association was observed for rs242936 (OR = 0.82, 95% CI = 0.74-0.90, P = 9.7 × 10⁻⁵). Replicated CRHR1 variants appear to exist on a single associated haplotype. Further investigation of mechanisms involved in HPA axis regulation and response to stress in MS pathogenesis is warranted. © The Author 2010. Published by Oxford University Press. All rights reserved.
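The odds ratio and confidence interval quoted above come from the authors' own meta-analysis, whose exact procedure the abstract does not describe. Purely as an illustration of how an odds ratio and an approximate 95% interval can be obtained from a single 2×2 table, the sketch below uses the standard log-odds (Woolf) approximation. The function name and all counts are hypothetical and are not taken from the study.

```python
import math


def odds_ratio_with_ci(carrier_cases, noncarrier_cases,
                       carrier_controls, noncarrier_controls, z=1.96):
    """Return (odds ratio, lower bound, upper bound) for a 2x2 table.

    Uses the Woolf (log-odds) approximation; z=1.96 gives a ~95% interval.
    This is an illustrative stand-in, not the study's meta-analytic method.
    """
    odds_ratio = (carrier_cases * noncarrier_controls) / (
        noncarrier_cases * carrier_controls
    )
    # Standard error of log(OR) from the four cell counts.
    se_log_or = math.sqrt(
        1 / carrier_cases + 1 / noncarrier_cases
        + 1 / carrier_controls + 1 / noncarrier_controls
    )
    log_or = math.log(odds_ratio)
    lower = math.exp(log_or - z * se_log_or)
    upper = math.exp(log_or + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, chosen only to exercise the function.
estimate, low, high = odds_ratio_with_ci(10, 20, 20, 10)
print(f"OR = {estimate:.2f}, 95% CI = {low:.2f}-{high:.2f}")
```

An odds ratio below 1 whose interval excludes 1, as reported for rs242936, indicates an association with decreased risk.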
Abstract:
The discovery of protein variation is an important strategy in disease diagnosis within the biological sciences. The current benchmark for elucidating information from multiple biological variables is the so-called “omics” disciplines of the biological sciences. Such variability is uncovered by implementing multivariable data mining techniques, which fall into two primary categories: machine learning strategies and statistically based approaches. Typically, proteomic studies can produce hundreds or thousands of variables, p, per observation, n, depending on the analytical platform or method employed to generate the data. Many classification methods are limited by an n≪p constraint and, as such, require pre-treatment to reduce the dimensionality prior to classification. Recently, machine learning techniques have gained popularity in the field for their ability to successfully classify unknown samples. One limitation of such methods is the lack of a functional model allowing meaningful interpretation of results in terms of the features used for classification. This problem might be solved using a statistical model-based approach in which not only is the importance of each individual protein explicit, but the proteins are also combined into a readily interpretable classification rule without relying on a black box approach. Here we incorporate the statistical dimension reduction techniques Partial Least Squares (PLS) and Principal Components Analysis (PCA), followed by both statistical and machine learning classification methods, and compare them to a popular machine learning technique, Support Vector Machines (SVM). Both PLS and SVM demonstrate strong utility for proteomic classification problems.
Abstract:
Objective: To develop and evaluate machine learning techniques that identify limb fractures and other abnormalities (e.g. dislocations) from radiology reports. Materials and Methods: 99 free-text reports of limb radiology examinations were acquired from an Australian public hospital. Two clinicians were employed to identify fractures and abnormalities from the reports; a third senior clinician resolved disagreements. These assessors found that, of the 99 reports, 48 referred to fractures or abnormalities of limb structures. Automated methods were then used to extract features from these reports that could be useful for their automatic classification. The Naive Bayes classification algorithm and two implementations of the support vector machine algorithm were formally evaluated using cross-fold validation over the 99 reports. Results: The Naive Bayes classifier accurately identifies fractures and other abnormalities from the radiology reports. These results were achieved when extracting stemmed token bigram and negation features, as well as when using these features in combination with SNOMED CT concepts related to abnormalities and disorders. The latter feature has not been used in previous work that attempted to classify free-text radiology reports. Discussion: Automated classification methods have proven effective at identifying fractures and other abnormalities from radiology reports (F-measure up to 92.31%). Key to the success of these techniques are features such as stemmed token bigrams, negations, and SNOMED CT concepts associated with morphologic abnormalities and disorders. Conclusion: This investigation shows promising early results; future work will further validate and strengthen the proposed approaches.
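To make the feature types named in this abstract more concrete, the following minimal sketch extracts token bigrams and a single negation flag from a report. It is only an assumption about what such features might look like. The abstract does not say how stemming or negation detection was done, so stemming is omitted here, and the cue list, the `HAS_NEGATION` feature name and the function names are hypothetical.

```python
import re

# Hypothetical negation cues; the study's actual negation handling is not
# described in the abstract.
NEGATION_CUES = {"no", "not", "without"}


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


def bigram_features(text):
    """Return a set of token-bigram features plus an optional negation flag.

    Stemming is deliberately left out to keep the sketch dependency-free.
    """
    tokens = tokenize(text)
    features = {f"{first}_{second}" for first, second in zip(tokens, tokens[1:])}
    if any(token in NEGATION_CUES for token in tokens):
        features.add("HAS_NEGATION")
    return features


print(sorted(bigram_features("No fracture seen")))
```

Feature sets of this kind could then be passed to any bag-of-features classifier, such as Naive Bayes.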
Abstract:
Background: Cancer monitoring and prevention rely on the timely notification of cancer cases. However, abstracting and classifying cancer from the free text of pathology reports and other relevant documents, such as death certificates, are complex and time-consuming activities. Aims: This paper investigates approaches for automatically detecting notifiable cancer cases as the cause of death from free-text death certificates supplied to Cancer Registries. Method: A number of machine learning classifiers were studied. Features were extracted using natural language techniques and the Medtex toolkit. The features encompassed stemmed words, bi-grams, and concepts from the SNOMED CT medical terminology. The baseline consisted of a keyword spotter using keywords extracted from the long description of ICD-10 cancer-related codes. Results: Death certificates with notifiable cancer listed as the cause of death can be effectively identified with the methods studied in this paper. A Support Vector Machine (SVM) classifier achieved the best performance, with an overall F-measure of 0.9866 when evaluated on a set of 5,000 free-text death certificates using the token stem feature set. The SNOMED CT concept plus token stem feature set reached the lowest variance (0.0032) and false negative rate (0.0297) while achieving an F-measure of 0.9864. The SVM classifier accounts for the first 18 of the top 40 evaluated runs and is the most robust classifier, with a variance of 0.001141, half the variance of the other classifiers. Conclusion: The selection of features had the greatest influence on classifier performance, although the type of classifier employed also affects performance. In contrast, the feature weighting schema had a negligible effect on performance. Specifically, it is found that stemmed tokens, with or without SNOMED CT concepts, form the most effective feature set when combined with an SVM classifier.
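The F-measure and false negative rate reported in this abstract follow from true positive, false positive and false negative counts. The sketch below shows the usual definitions, assuming the balanced F1 form of the F-measure, which the abstract does not state explicitly. The function names and example counts are hypothetical.

```python
def f_measure(true_pos, false_pos, false_neg):
    """Balanced F-measure (F1): harmonic mean of precision and recall."""
    if true_pos == 0:
        return 0.0  # no correct positives, so precision and recall are zero
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2 * precision * recall / (precision + recall)


def false_negative_rate(true_pos, false_neg):
    """Fraction of actual positives that the classifier missed."""
    actual_positives = true_pos + false_neg
    if actual_positives == 0:
        return 0.0
    return false_neg / actual_positives


# Hypothetical counts, not taken from the study.
print(f_measure(8, 2, 2), false_negative_rate(8, 2))
```

For cancer notification, the false negative rate matters because each false negative is a notifiable case that goes undetected.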
Abstract:
In this study, a machine learning technique called anomaly detection is employed for wind turbine bearing fault detection. The anomaly detection algorithm, which comprises a training phase and a testing phase, is used to recognize the presence of unusual and potentially faulty data in a dataset. Two bearing datasets were used to validate the proposed technique: fault-seeded bearing data from a test rig located at Case Western Reserve University, used to validate the accuracy of the anomaly detection method, and test-to-failure bearing data from the NSF I/UCR Center for Intelligent Maintenance Systems (IMS). The latter dataset was used to compare anomaly detection with SVM, a well-known and previously applied method, in rapidly finding incipient faults.
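The two-phase structure described here can be sketched with a very simple detector. This is an assumption-laden example: the abstract does not specify the underlying model, so a univariate Gaussian fitted to healthy data with a z-score threshold is swapped in. The function names, the threshold and the sample values are all hypothetical.

```python
import statistics


def train(healthy_values):
    """Training phase: summarise healthy data by its mean and sample std."""
    mean = statistics.mean(healthy_values)
    std = statistics.stdev(healthy_values)
    if std == 0:
        raise ValueError("healthy data must show some variation")
    return mean, std


def detect(model, values, z_threshold=3.0):
    """Testing phase: flag values lying more than z_threshold stds from the mean."""
    mean, std = model
    return [abs(value - mean) / std > z_threshold for value in values]


# Hypothetical vibration feature values (e.g. RMS amplitude per recording).
model = train([1.0, 1.1, 0.9, 1.0, 1.05, 0.95])
print(detect(model, [1.02, 2.0]))
```

Because only healthy data are needed for training, a detector of this kind requires no labelled fault examples, unlike a supervised classifier such as an SVM.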
Abstract:
Brain decoding of functional Magnetic Resonance Imaging data is a pattern analysis task that links brain activity patterns to the experimental conditions. Classifiers predict the neural states from the spatial and temporal pattern of brain activity extracted from multiple voxels in the functional images over a certain period of time. The prediction results offer insight into the nature of neural representations and cognitive mechanisms, and the classification accuracy determines our confidence in understanding the relationship between brain activity and stimuli. In this paper, we compared the efficacy of three machine learning algorithms (neural networks, support vector machines, and conditional random fields) in decoding the visual stimuli or neural cognitive states from functional Magnetic Resonance data. Leave-one-out cross validation was performed to quantify the generalization accuracy of each algorithm on unseen data. The results indicated that support vector machines and conditional random fields have comparable performance, and that the potential of the latter is worthy of further investigation.
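Leave-one-out cross validation, as used in this study, holds out each sample in turn, trains on the remainder and checks the prediction for the held-out sample. The sketch below illustrates the procedure with a nearest-centroid classifier as a simple stand-in. It is not one of the three algorithms compared in the paper, and the function names and toy data are hypothetical.

```python
def nearest_centroid_predict(train_x, train_y, sample):
    """Predict the label whose class centroid is closest to the sample."""
    sums, counts = {}, {}
    for features, label in zip(train_x, train_y):
        totals = sums.setdefault(label, [0.0] * len(features))
        for index, value in enumerate(features):
            totals[index] += value
        counts[label] = counts.get(label, 0) + 1

    def squared_distance(label):
        return sum(
            (total / counts[label] - value) ** 2
            for total, value in zip(sums[label], sample)
        )

    return min(sums, key=squared_distance)


def leave_one_out_accuracy(samples, labels, predict=nearest_centroid_predict):
    """Hold out each sample once, train on the rest, and score the prediction."""
    correct = 0
    for i in range(len(samples)):
        train_x = samples[:i] + samples[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        if predict(train_x, train_y, samples[i]) == labels[i]:
            correct += 1
    return correct / len(samples)


# Toy one-dimensional data with two well-separated classes.
toy_x = [[0.0], [0.1], [0.2], [1.0], [1.1], [1.2]]
toy_y = ["a", "a", "a", "b", "b", "b"]
print(leave_one_out_accuracy(toy_x, toy_y))
```

Any classifier with the same `predict(train_x, train_y, sample)` shape could be passed in place of the stand-in.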
Abstract:
Problem addressed: Wrist-worn accelerometers are associated with greater compliance. However, validated algorithms for predicting activity type from wrist-worn accelerometer data are lacking. This study compared the activity recognition rates of an activity classifier trained on acceleration signals collected at the wrist and at the hip. Methodology: 52 children and adolescents (mean age 13.7 +/- 3.1 years) completed 12 activity trials that were categorized into 7 activity classes: lying down, sitting, standing, walking, running, basketball, and dancing. During each trial, participants wore an ActiGraph GT3X+ tri-axial accelerometer on the right hip and the non-dominant wrist. Features were extracted from 10-s windows and input into a regularized logistic regression model using R (Glmnet + L1). Results: Classification accuracy for the hip and wrist was 91.0% +/- 3.1% and 88.4% +/- 3.0%, respectively. The hip model exhibited excellent classification accuracy for sitting (91.3%), standing (95.8%), walking (95.8%), and running (96.8%); acceptable classification accuracy for lying down (88.3%) and basketball (81.9%); and modest accuracy for dance (64.1%). The wrist model exhibited excellent classification accuracy for sitting (93.0%), standing (91.7%), and walking (95.8%); acceptable classification accuracy for basketball (86.0%); and modest accuracy for running (78.8%), lying down (74.6%) and dance (69.4%). Potential Impact: Both the hip and wrist algorithms achieved acceptable classification accuracy, allowing researchers to use either placement for activity recognition.
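The 10-s windowing step mentioned in this abstract can be sketched as follows. The abstract gives neither the sampling rate nor the window overlap, so the sketch assumes non-overlapping windows and takes the sampling rate as a parameter. The function name and the toy signal are hypothetical.

```python
def split_into_windows(samples, sample_rate_hz, window_seconds=10):
    """Split a signal into consecutive, non-overlapping fixed-length windows.

    A trailing partial window is dropped so every window has the same length.
    """
    window_size = int(sample_rate_hz * window_seconds)
    if window_size <= 0:
        raise ValueError("window must contain at least one sample")
    last_start = len(samples) - window_size
    return [
        samples[start:start + window_size]
        for start in range(0, last_start + 1, window_size)
    ]


# Toy signal: 25 samples at an assumed 1 Hz, giving two full 10-s windows.
windows = split_into_windows(list(range(25)), sample_rate_hz=1)
print(len(windows), [sum(w) / len(w) for w in windows])
```

Summary features computed per window, such as the mean shown here, would then form the inputs to the classifier.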
Abstract:
Objectives: Recent research has shown that machine learning techniques can accurately predict activity classes from accelerometer data in adolescents and adults. The purpose of this study is to develop and test machine learning models for predicting activity type in preschool-aged children. Design: Participants completed 12 standardised activity trials (TV, reading, tablet game, quiet play, art, treasure hunt, cleaning up, active game, obstacle course, bicycle riding) over two laboratory visits. Methods: Eleven children aged 3–6 years (mean age = 4.8 ± 0.87; 55% girls) completed the activity trials while wearing an ActiGraph GT3X+ accelerometer on the right hip. Activities were categorised into five activity classes: sedentary activities, light activities, moderate-to-vigorous activities, walking, and running. A standard feed-forward Artificial Neural Network and a Deep Learning Ensemble Network were trained on features in the accelerometer data used in previous investigations (10th, 25th, 50th, 75th and 90th percentiles and the lag-one autocorrelation). Results: Overall recognition accuracy for the standard feed-forward Artificial Neural Network was 69.7%. Recognition accuracy for sedentary activities, light activities and games, moderate-to-vigorous activities, walking, and running was 82%, 79%, 64%, 36% and 46%, respectively. In comparison, overall recognition accuracy for the Deep Learning Ensemble Network was 82.6%. For sedentary activities, light activities and games, moderate-to-vigorous activities, walking, and running, recognition accuracy was 84%, 91%, 79%, 73% and 73%, respectively. Conclusions: Ensemble machine learning approaches such as the Deep Learning Ensemble Network can accurately predict activity type from accelerometer data in preschool children.
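The features named in this abstract (five percentiles plus the lag-one autocorrelation) can be computed from a window of accelerometer values roughly as shown below. This is a sketch under assumptions: the abstract does not say which percentile interpolation rule or autocorrelation estimator was used, so linear interpolation and the common biased estimator are assumed. Function names and the toy window are hypothetical.

```python
def percentile(sorted_values, percent):
    """Percentile by linear interpolation between the two closest ranks."""
    if len(sorted_values) == 1:
        return sorted_values[0]
    rank = percent / 100 * (len(sorted_values) - 1)
    lower = int(rank)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = rank - lower
    return sorted_values[lower] + fraction * (
        sorted_values[upper] - sorted_values[lower]
    )


def lag_one_autocorrelation(values):
    """Correlation between the signal and itself shifted by one sample."""
    mean = sum(values) / len(values)
    denominator = sum((value - mean) ** 2 for value in values)
    if denominator == 0:
        return 0.0  # a constant signal has no defined autocorrelation
    numerator = sum(
        (current - mean) * (following - mean)
        for current, following in zip(values, values[1:])
    )
    return numerator / denominator


def window_features(window):
    """Return [p10, p25, p50, p75, p90, lag-one autocorrelation]."""
    ordered = sorted(window)
    features = [percentile(ordered, p) for p in (10, 25, 50, 75, 90)]
    features.append(lag_one_autocorrelation(window))
    return features


print(window_features([1, 2, 3, 4, 5]))
```

Each window thus yields a six-element feature vector that a neural network or any other classifier can take as input.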
Abstract:
The commercialization of aerial image processing is highly dependent on platforms such as UAVs (Unmanned Aerial Vehicles). However, the lack of an automated UAV forced landing site detection system has been identified as one of the main impediments to allowing UAV flight over populated areas in civilian airspace. This article proposes a UAV forced landing site detection system that is based on machine learning approaches including the Gaussian Mixture Model and the Support Vector Machine. A range of learning parameters are analysed, including the number of Gaussian mixtures; the support vector kernels, namely the linear, radial basis function (RBF) and polynomial (poly) kernels; and the order of the RBF and polynomial kernels. Moreover, a modified footprint operator is employed during feature extraction to better describe the geometric characteristics of the local area surrounding a pixel. The performance of the presented system is compared to a baseline UAV forced landing site detection system which uses edge features and an Artificial Neural Network (ANN) region type classifier. Experiments conducted on aerial image datasets captured over typical urban environments reveal that improved landing site detection can be achieved with an SVM classifier with an RBF kernel using a combination of colour and texture features. Compared to the baseline system, the proposed system provides significant improvement in terms of the chance of detecting a safe landing area, and its performance is more stable than the baseline in the presence of changes to the UAV altitude.
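The three support vector kernels compared in this abstract have standard textbook forms, sketched below for plain Python lists. The parameter names (`gamma`, `coef0`, `degree`) and their default values are assumptions following common convention, not settings reported by the authors.

```python
import math


def linear_kernel(x, y):
    """Inner product of the two feature vectors."""
    return sum(a * b for a, b in zip(x, y))


def rbf_kernel(x, y, gamma=1.0):
    """Radial basis function kernel: exp(-gamma * ||x - y||^2)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


def polynomial_kernel(x, y, degree=2, gamma=1.0, coef0=1.0):
    """Polynomial kernel: (gamma * <x, y> + coef0) ** degree."""
    return (gamma * linear_kernel(x, y) + coef0) ** degree


# Hypothetical two-dimensional colour/texture feature vectors.
u, v = [1.0, 2.0], [3.0, 4.0]
print(linear_kernel(u, v), rbf_kernel(u, u), polynomial_kernel(u, v))
```

The RBF kernel depends only on the distance between two feature vectors, so it equals 1 for identical inputs and decays towards 0 as they move apart.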
Abstract:
This thesis is concerned with the detection and prediction of rain in environmental recordings using different machine learning algorithms. The results obtained in this research will help ecologists to efficiently analyse environmental data and monitor biodiversity.
Abstract:
Lateralization of temporal lobe epilepsy (TLE) is critical for a successful outcome of surgery to relieve seizures. TLE affects brain regions beyond the temporal lobes and has been associated with aberrant brain networks, based on evidence from functional magnetic resonance imaging. We present here a machine learning-based method for determining the laterality of TLE, using features extracted from resting-state functional connectivity of the brain. A comprehensive feature space was constructed to include network properties within local brain regions, between brain regions, and across the whole network. Feature selection was performed using random forest, and a support vector machine was employed to train a linear model to predict the laterality of TLE in unseen patients. Leave-one-patient-out cross validation was carried out on 12 patients, and a prediction accuracy of 83% was achieved. The importance of the selected features was analyzed to demonstrate the contribution of resting-state connectivity attributes at the voxel, region, and network levels to TLE lateralization.