32 resultados para automated text classification


Relevância:

30.00% 30.00%

Publicador:

Resumo:

This technique paper describes a novel method for quantitatively and routinely identifying auroral breakup following substorm onset using the Time History of Events and Macroscale Interactions During Substorms (THEMIS) all-sky imagers (ASIs). Substorm onset is characterised by a brightening of the aurora that is followed by auroral poleward expansion and auroral breakup. This breakup can be identified by a sharp increase in the auroral intensity i(t) and the time derivative of auroral intensity i'(t). Utilising both i(t) and i'(t) we have developed an algorithm for identifying the time interval and spatial location of auroral breakup during the substorm expansion phase within the field of view of ASI data based solely on quantifiable characteristics of the optical auroral emissions. We compare the time interval determined by the algorithm to independently identified auroral onset times from three previously published studies. In each case the time interval determined by the algorithm is within error of the onset independently identified by the prior studies. We further show the utility of the algorithm by comparing the breakup intervals determined using the automated algorithm to an independent list of substorm onset times. We demonstrate that up to 50% of the breakup intervals characterised by the algorithm are within the uncertainty of the times identified in the independent list. The quantitative description and routine identification of an interval of auroral brightening during the substorm expansion phase provides a foundation for unbiased statistical analysis of the aurora to probe the physics of the auroral substorm as a new scientific tool for aiding the identification of the processes leading to auroral substorm onset.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Recent studies showed that features extracted from brain MRIs can well discriminate Alzheimer’s disease from Mild Cognitive Impairment. This study provides an algorithm that sequentially applies advanced feature selection methods for findings the best subset of features in terms of binary classification accuracy. The classifiers that provided the highest accuracies, have been then used for solving a multi-class problem by the one-versus-one strategy. Although several approaches based on Regions of Interest (ROIs) extraction exist, the prediction power of features has not yet investigated by comparing filter and wrapper techniques. The findings of this work suggest that (i) the IntraCranial Volume (ICV) normalization can lead to overfitting and worst the accuracy prediction of test set and (ii) the combined use of a Random Forest-based filter with a Support Vector Machines-based wrapper, improves accuracy of binary classification.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Various complex oscillatory processes are involved in the generation of the motor command. The temporal dynamics of these processes were studied for movement detection from single trial electroencephalogram (EEG). Autocorrelation analysis was performed on the EEG signals to find robust markers of movement detection. The evolution of the autocorrelation function was characterised via the relaxation time of the autocorrelation by exponential curve fitting. It was observed that the decay constant of the exponential curve increased during movement, indicating that the autocorrelation function decays slowly during motor execution. Significant differences were observed between movement and no moment tasks. Additionally, a linear discriminant analysis (LDA) classifier was used to identify movement trials with a peak accuracy of 74%.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Information was collated on the seed storage behaviour of 67 tree species native to the Amazon rainforest of Brazil; 38 appeared to show orthodox, 23 recalcitrant and six intermediate seed storage behaviour. A double-criteria key based on thousand-seed weight and seed moisture content at shedding to estimate likely seed storage behaviour, developed previously, showed good agreement with the above classifications. The key can aid seed storage behaviour identification considerably.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This paper discusses ECG classification after parametrizing the ECG waveforms in the wavelet domain. The aim of the work is to develop an accurate classification algorithm that can be used to diagnose cardiac beat abnormalities detected using a mobile platform such as smart-phones. Continuous time recurrent neural network classifiers are considered for this task. Records from the European ST-T Database are decomposed in the wavelet domain using discrete wavelet transform (DWT) filter banks and the resulting DWT coefficients are filtered and used as inputs for training the neural network classifier. Advantages of the proposed methodology are the reduced memory requirement for the signals which is of relevance to mobile applications as well as an improvement in the ability of the neural network in its generalization ability due to the more parsimonious representation of the signal to its inputs.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This paper discusses ECG signal classification after parametrizing the ECG waveforms in the wavelet domain. Signal decomposition using perfect reconstruction quadrature mirror filter banks can provide a very parsimonious representation of ECG signals. In the current work, the filter parameters are adjusted by a numerical optimization algorithm in order to minimize a cost function associated to the filter cut-off sharpness. The goal consists of achieving a better compromise between frequency selectivity and time resolution at each decomposition level than standard orthogonal filter banks such as those of the Daubechies and Coiflet families. Our aim is to optimally decompose the signals in the wavelet domain so that they can be subsequently used as inputs for training to a neural network classifier.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Advances in hardware and software technologies allow to capture streaming data. The area of Data Stream Mining (DSM) is concerned with the analysis of these vast amounts of data as it is generated in real-time. Data stream classification is one of the most important DSM techniques allowing to classify previously unseen data instances. Different to traditional classifiers for static data, data stream classifiers need to adapt to concept changes (concept drift) in the stream in real-time in order to reflect the most recent concept in the data as accurately as possible. A recent addition to the data stream classifier toolbox is eRules which induces and updates a set of expressive rules that can easily be interpreted by humans. However, like most rule-based data stream classifiers, eRules exhibits a poor computational performance when confronted with continuous attributes. In this work, we propose an approach to deal with continuous data effectively and accurately in rule-based classifiers by using the Gaussian distribution as heuristic for building rule terms on continuous attributes. We show on the example of eRules that incorporating our method for continuous attributes indeed speeds up the real-time rule induction process while maintaining a similar level of accuracy compared with the original eRules classifier. We termed this new version of eRules with our approach G-eRules.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Advances in hardware technologies allow to capture and process data in real-time and the resulting high throughput data streams require novel data mining approaches. The research area of Data Stream Mining (DSM) is developing data mining algorithms that allow us to analyse these continuous streams of data in real-time. The creation and real-time adaption of classification models from data streams is one of the most challenging DSM tasks. Current classifiers for streaming data address this problem by using incremental learning algorithms. However, even so these algorithms are fast, they are challenged by high velocity data streams, where data instances are incoming at a fast rate. This is problematic if the applications desire that there is no or only a very little delay between changes in the patterns of the stream and absorption of these patterns by the classifier. Problems of scalability to Big Data of traditional data mining algorithms for static (non streaming) datasets have been addressed through the development of parallel classifiers. However, there is very little work on the parallelisation of data stream classification techniques. In this paper we investigate K-Nearest Neighbours (KNN) as the basis for a real-time adaptive and parallel methodology for scalable data stream classification tasks.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Involuntary musical imagery (INMI) is the subject of much recent research interest. INMI covers a number of experience types such as musical obsessions and musical hallucinations. One type of experience has been called earworms, for which the literature provides a number of definitions. In this paper we consider the origins of the term earworm in the German language literature and compare that usage with the English language literature. We consider the published literature on earworms and conclude that there is merit in distinguishing between earworms and other types of types of involuntary musical imagery described in the scientific literature: e.g. musical hallucinations, musical obsessions. We also describe other experiences that can be considered under the term INMI. The aim of future research could be to ascertain similarities and differences between types of INMI with a view to refining the classification scheme proposed here.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The objective of this article is to study the problem of pedestrian classification across different light spectrum domains (visible and far-infrared (FIR)) and modalities (intensity, depth and motion). In recent years, there has been a number of approaches for classifying and detecting pedestrians in both FIR and visible images, but the methods are difficult to compare, because either the datasets are not publicly available or they do not offer a comparison between the two domains. Our two primary contributions are the following: (1) we propose a public dataset, named RIFIR , containing both FIR and visible images collected in an urban environment from a moving vehicle during daytime; and (2) we compare the state-of-the-art features in a multi-modality setup: intensity, depth and flow, in far-infrared over visible domains. The experiments show that features families, intensity self-similarity (ISS), local binary patterns (LBP), local gradient patterns (LGP) and histogram of oriented gradients (HOG), computed from FIR and visible domains are highly complementary, but their relative performance varies across different modalities. In our experiments, the FIR domain has proven superior to the visible one for the task of pedestrian classification, but the overall best results are obtained by a multi-domain multi-modality multi-feature fusion.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

An important application of Big Data Analytics is the real-time analysis of streaming data. Streaming data imposes unique challenges to data mining algorithms, such as concept drifts, the need to analyse the data on the fly due to unbounded data streams and scalable algorithms due to potentially high throughput of data. Real-time classification algorithms that are adaptive to concept drifts and fast exist, however, most approaches are not naturally parallel and are thus limited in their scalability. This paper presents work on the Micro-Cluster Nearest Neighbour (MC-NN) classifier. MC-NN is based on an adaptive statistical data summary based on Micro-Clusters. MC-NN is very fast and adaptive to concept drift whilst maintaining the parallel properties of the base KNN classifier. Also MC-NN is competitive compared with existing data stream classifiers in terms of accuracy and speed.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Observations from the Heliospheric Imager (HI) instruments aboard the twin STEREO spacecraft have enabled the compilation of several catalogues of coronal mass ejections (CMEs), each characterizing the propagation of CMEs through the inner heliosphere. Three such catalogues are the Rutherford Appleton Laboratory (RAL)-HI event list, the Solar Stormwatch CME catalogue, and, presented here, the J-tracker catalogue. Each catalogue uses a different method to characterize the location of CME fronts in the HI images: manual identification by an expert, the statistical reduction of the manual identifications of many citizen scientists, and an automated algorithm. We provide a quantitative comparison of the differences between these catalogues and techniques, using 51 CMEs common to each catalogue. The time-elongation profiles of these CME fronts are compared, as are the estimates of the CME kinematics derived from application of three widely used single-spacecraft-fitting techniques. The J-tracker and RAL-HI profiles are most similar, while the Solar Stormwatch profiles display a small systematic offset. Evidence is presented that these differences arise because the RAL-HI and J-tracker profiles follow the sunward edge of CME density enhancements, while Solar Stormwatch profiles track closer to the antisunward (leading) edge. We demonstrate that the method used to produce the time-elongation profile typically introduces more variability into the kinematic estimates than differences between the various single-spacecraft-fitting techniques. This has implications for the repeatability and robustness of these types of analyses, arguably especially so in the context of space weather forecasting, where it could make the results strongly dependent on the methods used by the forecaster.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This work investigates the problem of feature selection in neuroimaging features from structural MRI brain images for the classification of subjects as healthy controls, suffering from Mild Cognitive Impairment or Alzheimer’s Disease. A Genetic Algorithm wrapper method for feature selection is adopted in conjunction with a Support Vector Machine classifier. In very large feature sets, feature selection is found to be redundant as the accuracy is often worsened when compared to an Support Vector Machine with no feature selection. However, when just the hippocampal subfields are used, feature selection shows a significant improvement of the classification accuracy. Three-class Support Vector Machines and two-class Support Vector Machines combined with weighted voting are also compared with the former and found more useful. The highest accuracy achieved at classifying the test data was 65.5% using a genetic algorithm for feature selection with a three-class Support Vector Machine classifier.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Background Pseudomonas syringae can cause stem necrosis and canker in a wide range of woody species including cherry, plum, peach, horse chestnut and ash. The detection and quantification of lesion progression over time in woody tissues is a key trait for breeders to select upon for resistance. Results In this study a general, rapid and reliable approach to lesion quantification using image recognition and an artificial neural network model was developed. This was applied to screen both the virulence of a range of P. syringae pathovars and the resistance of a set of cherry and plum accessions to bacterial canker. The method developed was more objective than scoring by eye and allowed the detection of putatively resistant plant material for further study. Conclusions Automated image analysis will facilitate rapid screening of material for resistance to bacterial and other phytopathogens, allowing more efficient selection and quantification of resistance responses.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The personalised conditioning system (PCS) is widely studied. Potentially, it is able to reduce energy consumption while securing occupants’ thermal comfort requirements. It has been suggested that automatic optimised operation schemes for PCS should be introduced to avoid energy wastage and discomfort caused by inappropriate operation. In certain automatic operation schemes, personalised thermal sensation models are applied as key components to help in setting targets for PCS operation. In this research, a novel personal thermal sensation modelling method based on the C-Support Vector Classification (C-SVC) algorithm has been developed for PCS control. The personal thermal sensation modelling has been regarded as a classification problem. During the modelling process, the method ‘learns’ an occupant’s thermal preferences from his/her feedback, environmental parameters and personal physiological and behavioural factors. The modelling method has been verified by comparing the actual thermal sensation vote (TSV) with the modelled one based on 20 individual cases. Furthermore, the accuracy of each individual thermal sensation model has been compared with the outcomes of the PMV model. The results indicate that the modelling method presented in this paper is an effective tool to model personal thermal sensations and could be integrated within the PCS for optimised system operation and control.