930 resultados para Classification algorithms


Relevância:

100.00% 100.00%

Publicador:

Resumo:

Remote sensing - the acquisition of information about an object or phenomenon without making physical contact with the object - is applied in a multitude of different areas, ranging from agriculture, forestry, cartography, hydrology, geology, meteorology, aerial traffic control, among many others. Regarding agriculture, an example of application of this information is regarding crop detection, to monitor existing crops easily and help in the region’s strategic planning. In any of these areas, there is always an ongoing search for better methods that allow us to obtain better results. For over forty years, the Landsat program has utilized satellites to collect spectral information from Earth’s surface, creating a historical archive unmatched in quality, detail, coverage, and length. The most recent one was launched on February 11, 2013, having a number of improvements regarding its predecessors. This project aims to compare classification methods in Portugal’s Ribatejo region, specifically regarding crop detection. The state of the art algorithms will be used in this region and their performance will be analyzed.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

We present a Bayesian image classification scheme for discriminating cloud, clear and sea-ice observations at high latitudes to improve identification of areas of clear-sky over ice-free ocean for SST retrieval. We validate the image classification against a manually classified dataset using Advanced Along Track Scanning Radiometer (AATSR) data. A three way classification scheme using a near-infrared textural feature improves classifier accuracy by 9.9 % over the nadir only version of the cloud clearing used in the ATSR Reprocessing for Climate (ARC) project in high latitude regions. The three way classification gives similar numbers of cloud and ice scenes misclassified as clear but significantly more clear-sky cases are correctly identified (89.9 % compared with 65 % for ARC). We also demonstrate the poetential of a Bayesian image classifier including information from the 0.6 micron channel to be used in sea-ice extent and ice surface temperature retrieval with 77.7 % of ice scenes correctly identified and an overall classifier accuracy of 96 %.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Intelligent Transport Systems (ITS) consists in the application of ICT to transport to offer new and improved services to the mobility of people and freights. While using ITS, travellers produce large quantities of data that can be collected and analysed to study their behaviour and to provide information to decision makers and planners. The thesis proposes innovative deployments of classification algorithms for Intelligent Transport System with the aim to support the decisions on traffic rerouting, bus transport demand and behaviour of two wheelers vehicles. The first part of this work provides an overview and a classification of a selection of clustering algorithms that can be implemented for the analysis of ITS data. The first contribution of this thesis is an innovative use of the agglomerative hierarchical clustering algorithm to classify similar travels in terms of their origin and destination, together with the proposal for a methodology to analyse drivers’ route choice behaviour using GPS coordinates and optimal alternatives. The clusters of repetitive travels made by a sample of drivers are then analysed to compare observed route choices to the modelled alternatives. The results of the analysis show that drivers select routes that are more reliable but that are more expensive in terms of travel time. Successively, different types of users of a service that provides information on the real time arrivals of bus at stop are classified using Support Vector Machines. The results shows that the results of the classification of different types of bus transport users can be used to update or complement the census on bus transport flows. Finally, the problem of the classification of accidents made by two wheelers vehicles is presented together with possible future application of clustering methodologies aimed at identifying and classifying the different types of accidents.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The early detection of subjects with probable Alzheimer's disease (AD) is crucial for effective appliance of treatment strategies. Here we explored the ability of a multitude of linear and non-linear classification algorithms to discriminate between the electroencephalograms (EEGs) of patients with varying degree of AD and their age-matched control subjects. Absolute and relative spectral power, distribution of spectral power, and measures of spatial synchronization were calculated from recordings of resting eyes-closed continuous EEGs of 45 healthy controls, 116 patients with mild AD and 81 patients with moderate AD, recruited in two different centers (Stockholm, New York). The applied classification algorithms were: principal component linear discriminant analysis (PC LDA), partial least squares LDA (PLS LDA), principal component logistic regression (PC LR), partial least squares logistic regression (PLS LR), bagging, random forest, support vector machines (SVM) and feed-forward neural network. Based on 10-fold cross-validation runs it could be demonstrated that even tough modern computer-intensive classification algorithms such as random forests, SVM and neural networks show a slight superiority, more classical classification algorithms performed nearly equally well. Using random forests classification a considerable sensitivity of up to 85% and a specificity of 78%, respectively for the test of even only mild AD patients has been reached, whereas for the comparison of moderate AD vs. controls, using SVM and neural networks, values of 89% and 88% for sensitivity and specificity were achieved. Such a remarkable performance proves the value of these classification algorithms for clinical diagnostics.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Today, due to globalization of the world the size of data set is increasing, it is necessary to discover the knowledge. The discovery of knowledge can be typically in the form of association rules, classification rules, clustering, discovery of frequent episodes and deviation detection. Fast and accurate classifiers for large databases are an important task in data mining. There is growing evidence that integrating classification and association rules mining, classification approaches based on heuristic, greedy search like decision tree induction. Emerging associative classification algorithms have shown good promises on producing accurate classifiers. In this paper we focus on performance of associative classification and present a parallel model for classifier building. For classifier building some parallel-distributed algorithms have been proposed for decision tree induction but so far no such work has been reported for associative classification.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

Grasslands in semi-arid regions, like Mongolian steppes, are facing desertification and degradation processes, due to climate change. Mongolia’s main economic activity consists on an extensive livestock production and, therefore, it is a concerning matter for the decision makers. Remote sensing and Geographic Information Systems provide the tools for advanced ecosystem management and have been widely used for monitoring and management of pasture resources. This study investigates which is the higher thematic detail that is possible to achieve through remote sensing, to map the steppe vegetation, using medium resolution earth observation imagery in three districts (soums) of Mongolia: Dzag, Buutsagaan and Khureemaral. After considering different thematic levels of detail for classifying the steppe vegetation, the existent pasture types within the steppe were chosen to be mapped. In order to investigate which combination of data sets yields the best results and which classification algorithm is more suitable for incorporating these data sets, a comparison between different classification methods were tested for the study area. Sixteen classifications were performed using different combinations of estimators, Landsat-8 (spectral bands and Landsat-8 NDVI-derived) and geophysical data (elevation, mean annual precipitation and mean annual temperature) using two classification algorithms, maximum likelihood and decision tree. Results showed that the best performing model was the one that incorporated Landsat-8 bands with mean annual precipitation and mean annual temperature (Model 13), using the decision tree. For maximum likelihood, the model that incorporated Landsat-8 bands with mean annual precipitation (Model 5) and the one that incorporated Landsat-8 bands with mean annual precipitation and mean annual temperature (Model 13), achieved the higher accuracies for this algorithm. The decision tree models consistently outperformed the maximum likelihood ones.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

Introduction: As part of the MicroArray Quality Control (MAQC)-II project, this analysis examines how the choice of univariate feature-selection methods and classification algorithms may influence the performance of genomic predictors under varying degrees of prediction difficulty represented by three clinically relevant endpoints. Methods: We used gene-expression data from 230 breast cancers (grouped into training and independent validation sets), and we examined 40 predictors (five univariate feature-selection methods combined with eight different classifiers) for each of the three endpoints. Their classification performance was estimated on the training set by using two different resampling methods and compared with the accuracy observed in the independent validation set. Results: A ranking of the three classification problems was obtained, and the performance of 120 models was estimated and assessed on an independent validation set. The bootstrapping estimates were closer to the validation performance than were the cross-validation estimates. The required sample size for each endpoint was estimated, and both gene-level and pathway-level analyses were performed on the obtained models. Conclusions: We showed that genomic predictor accuracy is determined largely by an interplay between sample size and classification difficulty. Variations on univariate feature-selection methods and choice of classification algorithm have only a modest impact on predictor performance, and several statistically equally good predictors can be developed for any given classification problem.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

Land use/cover classification is one of the most important applications in remote sensing. However, mapping accurate land use/cover spatial distribution is a challenge, particularly in moist tropical regions, due to the complex biophysical environment and limitations of remote sensing data per se. This paper reviews experiments related to land use/cover classification in the Brazilian Amazon for a decade. Through comprehensive analysis of the classification results, it is concluded that spatial information inherent in remote sensing data plays an essential role in improving land use/cover classification. Incorporation of suitable textural images into multispectral bands and use of segmentation‑based method are valuable ways to improve land use/cover classification, especially for high spatial resolution images. Data fusion of multi‑resolution images within optical sensor data is vital for visual interpretation, but may not improve classification performance. In contrast, integration of optical and radar data did improve classification performance when the proper data fusion method was used. Among the classification algorithms available, the maximum likelihood classifier is still an important method for providing reasonably good accuracy, but nonparametric algorithms, such as classification tree analysis, have the potential to provide better results. However, they often require more time to achieve parametric optimization. Proper use of hierarchical‑based methods is fundamental for developing accurate land use/cover classification, mainly from historical remotely sensed data.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

Among the challenges of pig farming in today's competitive market, there is factor of the product traceability that ensures, among many points, animal welfare. Vocalization is a valuable tool to identify situations of stress in pigs, and it can be used in welfare records for traceability. The objective of this work was to identify stress in piglets using vocalization, calling this stress on three levels: no stress, moderate stress, and acute stress. An experiment was conducted on a commercial farm in the municipality of Holambra, São Paulo State , where vocalizations of twenty piglets were recorded during the castration procedure, and separated into two groups: without anesthesia and local anesthesia with lidocaine base. For the recording of acoustic signals, a unidirectional microphone was connected to a digital recorder, in which signals were digitized at a frequency of 44,100 Hz. For evaluation of sound signals, Praat® software was used, and different data mining algorithms were applied using Weka® software. The selection of attributes improved model accuracy, and the best attribute selection was used by applying Wrapper method, while the best classification algorithms were the k-NN and Naive Bayes. According to the results, it was possible to classify the level of stress in pigs through their vocalization.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

Dans l'apprentissage machine, la classification est le processus d’assigner une nouvelle observation à une certaine catégorie. Les classifieurs qui mettent en œuvre des algorithmes de classification ont été largement étudié au cours des dernières décennies. Les classifieurs traditionnels sont basés sur des algorithmes tels que le SVM et les réseaux de neurones, et sont généralement exécutés par des logiciels sur CPUs qui fait que le système souffre d’un manque de performance et d’une forte consommation d'énergie. Bien que les GPUs puissent être utilisés pour accélérer le calcul de certains classifieurs, leur grande consommation de puissance empêche la technologie d'être mise en œuvre sur des appareils portables tels que les systèmes embarqués. Pour rendre le système de classification plus léger, les classifieurs devraient être capable de fonctionner sur un système matériel plus compact au lieu d'un groupe de CPUs ou GPUs, et les classifieurs eux-mêmes devraient être optimisés pour ce matériel. Dans ce mémoire, nous explorons la mise en œuvre d'un classifieur novateur sur une plate-forme matérielle à base de FPGA. Le classifieur, conçu par Alain Tapp (Université de Montréal), est basé sur une grande quantité de tables de recherche qui forment des circuits arborescents qui effectuent les tâches de classification. Le FPGA semble être un élément fait sur mesure pour mettre en œuvre ce classifieur avec ses riches ressources de tables de recherche et l'architecture à parallélisme élevé. Notre travail montre que les FPGAs peuvent implémenter plusieurs classifieurs et faire les classification sur des images haute définition à une vitesse très élevée.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

In this paper, various types of fault detection methods for fuel cells are compared. For example, those that use a model based approach or a data driven approach or a combination of the two. The potential advantages and drawbacks of each method are discussed and comparisons between methods are made. In particular, classification algorithms are investigated, which separate a data set into classes or clusters based on some prior knowledge or measure of similarity. In particular, the application of classification methods to vectors of reconstructed currents by magnetic tomography or to vectors of magnetic field measurements directly is explored. Bases are simulated using the finite integration technique (FIT) and regularization techniques are employed to overcome ill-posedness. Fisher's linear discriminant is used to illustrate these concepts. Numerical experiments show that the ill-posedness of the magnetic tomography problem is a part of the classification problem on magnetic field measurements as well. This is independent of the particular working mode of the cell but influenced by the type of faulty behavior that is studied. The numerical results demonstrate the ill-posedness by the exponential decay behavior of the singular values for three examples of fault classes.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

An important application of Big Data Analytics is the real-time analysis of streaming data. Streaming data imposes unique challenges to data mining algorithms, such as concept drifts, the need to analyse the data on the fly due to unbounded data streams and scalable algorithms due to potentially high throughput of data. Real-time classification algorithms that are adaptive to concept drifts and fast exist, however, most approaches are not naturally parallel and are thus limited in their scalability. This paper presents work on the Micro-Cluster Nearest Neighbour (MC-NN) classifier. MC-NN is based on an adaptive statistical data summary based on Micro-Clusters. MC-NN is very fast and adaptive to concept drift whilst maintaining the parallel properties of the base KNN classifier. Also MC-NN is competitive compared with existing data stream classifiers in terms of accuracy and speed.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

Breast cancer is the most common cancer among women. In CAD systems, several studies have investigated the use of wavelet transform as a multiresolution analysis tool for texture analysis and could be interpreted as inputs to a classifier. In classification, polynomial classifier has been used due to the advantages of providing only one model for optimal separation of classes and to consider this as the solution of the problem. In this paper, a system is proposed for texture analysis and classification of lesions in mammographic images. Multiresolution analysis features were extracted from the region of interest of a given image. These features were computed based on three different wavelet functions, Daubechies 8, Symlet 8 and bi-orthogonal 3.7. For classification, we used the polynomial classification algorithm to define the mammogram images as normal or abnormal. We also made a comparison with other artificial intelligence algorithms (Decision Tree, SVM, K-NN). A Receiver Operating Characteristics (ROC) curve is used to evaluate the performance of the proposed system. Our system is evaluated using 360 digitized mammograms from DDSM database and the result shows that the algorithm has an area under the ROC curve Az of 0.98 ± 0.03. The performance of the polynomial classifier has proved to be better in comparison to other classification algorithms. © 2013 Elsevier Ltd. All rights reserved.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

The standard reference clinical score quantifying average Parkinson's disease (PD) symptom severity is the Unified Parkinson's Disease Rating Scale (UPDRS). At present, UPDRS is determined by the subjective clinical evaluation of the patient's ability to adequately cope with a range of tasks. In this study, we extend recent findings that UPDRS can be objectively assessed to clinically useful accuracy using simple, self-administered speech tests, without requiring the patient's physical presence in the clinic. We apply a wide range of known speech signal processing algorithms to a large database (approx. 6000 recordings from 42 PD patients, recruited to a six-month, multi-centre trial) and propose a number of novel, nonlinear signal processing algorithms which reveal pathological characteristics in PD more accurately than existing approaches. Robust feature selection algorithms select the optimal subset of these algorithms, which is fed into non-parametric regression and classification algorithms, mapping the signal processing algorithm outputs to UPDRS. We demonstrate rapid, accurate replication of the UPDRS assessment with clinically useful accuracy (about 2 UPDRS points difference from the clinicians' estimates, p < 0.001). This study supports the viability of frequent, remote, cost-effective, objective, accurate UPDRS telemonitoring based on self-administered speech tests. This technology could facilitate large-scale clinical trials into novel PD treatments.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

Résumé : Face à l’accroissement de la résolution spatiale des capteurs optiques satellitaires, de nouvelles stratégies doivent être développées pour classifier les images de télédétection. En effet, l’abondance de détails dans ces images diminue fortement l’efficacité des classifications spectrales; de nombreuses méthodes de classification texturale, notamment les approches statistiques, ne sont plus adaptées. À l’inverse, les approches structurelles offrent une ouverture intéressante : ces approches orientées objet consistent à étudier la structure de l’image pour en interpréter le sens. Un algorithme de ce type est proposé dans la première partie de cette thèse. Reposant sur la détection et l’analyse de points-clés (KPC : KeyPoint-based Classification), il offre une solution efficace au problème de la classification d’images à très haute résolution spatiale. Les classifications effectuées sur les données montrent en particulier sa capacité à différencier des textures visuellement similaires. Par ailleurs, il a été montré dans la littérature que la fusion évidentielle, reposant sur la théorie de Dempster-Shafer, est tout à fait adaptée aux images de télédétection en raison de son aptitude à intégrer des concepts tels que l’ambiguïté et l’incertitude. Peu d’études ont en revanche été menées sur l’application de cette théorie à des données texturales complexes telles que celles issues de classifications structurelles. La seconde partie de cette thèse vise à combler ce manque, en s’intéressant à la fusion de classifications KPC multi-échelle par la théorie de Dempster-Shafer. Les tests menés montrent que cette approche multi-échelle permet d’améliorer la classification finale dans le cas où l’image initiale est de faible qualité. De plus, l’étude effectuée met en évidence le potentiel d’amélioration apporté par l’estimation de la fiabilité des classifications intermédiaires, et fournit des pistes pour mener ces estimations.