898 resultados para Classification accuracy
Resumo:
In research on Silent Speech Interfaces (SSI), different sources of information (modalities) have been combined, aiming at obtaining better performance than the individual modalities. However, when combining these modalities, the dimensionality of the feature space rapidly increases, yielding the well-known "curse of dimensionality". As a consequence, in order to extract useful information from this data, one has to resort to feature selection (FS) techniques to lower the dimensionality of the learning space. In this paper, we assess the impact of FS techniques for silent speech data, in a dataset with 4 non-invasive and promising modalities, namely: video, depth, ultrasonic Doppler sensing, and surface electromyography. We consider two supervised (mutual information and Fisher's ratio) and two unsupervised (meanmedian and arithmetic mean geometric mean) FS filters. The evaluation was made by assessing the classification accuracy (word recognition error) of three well-known classifiers (knearest neighbors, support vector machines, and dynamic time warping). The key results of this study show that both unsupervised and supervised FS techniques improve on the classification accuracy on both individual and combined modalities. For instance, on the video component, we attain relative performance gains of 36.2% in error rates. FS is also useful as pre-processing for feature fusion. Copyright © 2014 ISCA.
Resumo:
Many learning problems require handling high dimensional datasets with a relatively small number of instances. Learning algorithms are thus confronted with the curse of dimensionality, and need to address it in order to be effective. Examples of these types of data include the bag-of-words representation in text classification problems and gene expression data for tumor detection/classification. Usually, among the high number of features characterizing the instances, many may be irrelevant (or even detrimental) for the learning tasks. It is thus clear that there is a need for adequate techniques for feature representation, reduction, and selection, to improve both the classification accuracy and the memory requirements. In this paper, we propose combined unsupervised feature discretization and feature selection techniques, suitable for medium and high-dimensional datasets. The experimental results on several standard datasets, with both sparse and dense features, show the efficiency of the proposed techniques as well as improvements over previous related techniques.
Resumo:
Feature discretization (FD) techniques often yield adequate and compact representations of the data, suitable for machine learning and pattern recognition problems. These representations usually decrease the training time, yielding higher classification accuracy while allowing for humans to better understand and visualize the data, as compared to the use of the original features. This paper proposes two new FD techniques. The first one is based on the well-known Linde-Buzo-Gray quantization algorithm, coupled with a relevance criterion, being able perform unsupervised, supervised, or semi-supervised discretization. The second technique works in supervised mode, being based on the maximization of the mutual information between each discrete feature and the class label. Our experimental results on standard benchmark datasets show that these techniques scale up to high-dimensional data, attaining in many cases better accuracy than existing unsupervised and supervised FD approaches, while using fewer discretization intervals.
Resumo:
Dissertation submitted in partial fulfilment of the requirements for the Degree of Master of Science in Geospatial Technologies.
Resumo:
Dissertation submitted in partial fulfillment of the requirements for the Degree of Master of Science in Geospatial Technologies.
Resumo:
The rapid growth of big cities has been noticed since 1950s when the majority of world population turned to live in urban areas rather than villages, seeking better job opportunities and higher quality of services and lifestyle circumstances. This demographic transition from rural to urban is expected to have a continuous increase. Governments, especially in less developed countries, are going to face more challenges in different sectors, raising the essence of understanding the spatial pattern of the growth for an effective urban planning. The study aimed to detect, analyse and model the urban growth in Greater Cairo Region (GCR) as one of the fast growing mega cities in the world using remote sensing data. Knowing the current and estimated urbanization situation in GCR will help decision makers in Egypt to adjust their plans and develop new ones. These plans should focus on resources reallocation to overcome the problems arising in the future and to achieve a sustainable development of urban areas, especially after the high percentage of illegal settlements which took place in the last decades. The study focused on a period of 30 years; from 1984 to 2014, and the major transitions to urban were modelled to predict the future scenarios in 2025. Three satellite images of different time stamps (1984, 2003 and 2014) were classified using Support Vector Machines (SVM) classifier, then the land cover changes were detected by applying a high level mapping technique. Later the results were analyzed for higher accurate estimations of the urban growth in the future in 2025 using Land Change Modeler (LCM) embedded in IDRISI software. Moreover, the spatial and temporal urban growth patterns were analyzed using statistical metrics developed in FRAGSTATS software. The study resulted in an overall classification accuracy of 96%, 97.3% and 96.3% for 1984, 2003 and 2014’s map, respectively. Between 1984 and 2003, 19 179 hectares of vegetation and 21 417 hectares of desert changed to urban, while from 2003 to 2014, the transitions to urban from both land cover classes were found to be 16 486 and 31 045 hectares, respectively. The model results indicated that 14% of the vegetation and 4% of the desert in 2014 will turn into urban in 2025, representing 16 512 and 24 687 hectares, respectively.
Resumo:
With the recent advances in technology and miniaturization of devices such as GPS or IMU, Unmanned Aerial Vehicles became a feasible platform for a Remote Sensing applications. The use of UAVs compared to the conventional aerial platforms provides a set of advantages such as higher spatial resolution of the derived products. UAV - based imagery obtained by a user grade cameras introduces a set of problems which have to be solved, e. g. rotational or angular differences or unknown or insufficiently precise IO and EO camera parameters. In this work, UAV - based imagery of RGB and CIR type was processed using two different workflows based on PhotoScan and VisualSfM software solutions resulting in the DSM and orthophoto products. Feature detection and matching parameters influence on the result quality as well as a processing time was examined and the optimal parameter setup was presented. Products of the both workflows were compared in terms of a quality and a spatial accuracy. Both workflows were compared by presenting the processing times and quality of the results. Finally, the obtained products were used in order to demonstrate vegetation classification. Contribution of the IHS transformations was examined with respect to the classification accuracy.
Resumo:
Many texture measures have been developed and used for improving land-cover classification accuracy, but rarely has research examined the role of textures in improving the performance of aboveground biomass estimations. The relationship between texture and biomass is poorly understood. This paper used Landsat Thematic Mapper (TM) data to explore relationships between TM image textures and aboveground biomass in Rondônia, Brazilian Amazon. Eight grey level co-occurrence matrix (GLCM) based texture measures (i.e., mean, variance, homogeneity, contrast, dissimilarity, entropy, second moment, and correlation), associated with seven different window sizes (5x5, 7x7, 9x9, 11x11, 15x15, 19x19, and 25x25), and five TM bands (TM 2, 3, 4, 5, and 7) were analyzed. Pearson's correlation coefficient was used to analyze texture and biomass relationships. This research indicates that most textures are weakly correlated with successional vegetation biomass, but some textures are significantly correlated with mature forest biomass. In contrast, TM spectral signatures are significantly correlated with successional vegetation biomass, but weakly correlated with mature forest biomass. Our findings imply that textures may be critical in improving mature forest biomass estimation, but relatively less important for successional vegetation biomass estimation.
Resumo:
ABSTRACTThe Amazon várzeas are an important component of the Amazon biome, but anthropic and climatic impacts have been leading to forest loss and interruption of essential ecosystem functions and services. The objectives of this study were to evaluate the capability of the Landsat-based Detection of Trends in Disturbance and Recovery (LandTrendr) algorithm to characterize changes in várzeaforest cover in the Lower Amazon, and to analyze the potential of spectral and temporal attributes to classify forest loss as either natural or anthropogenic. We used a time series of 37 Landsat TM and ETM+ images acquired between 1984 and 2009. We used the LandTrendr algorithm to detect forest cover change and the attributes of "start year", "magnitude", and "duration" of the changes, as well as "NDVI at the end of series". Detection was restricted to areas identified as having forest cover at the start and/or end of the time series. We used the Support Vector Machine (SVM) algorithm to classify the extracted attributes, differentiating between anthropogenic and natural forest loss. Detection reliability was consistently high for change events along the Amazon River channel, but variable for changes within the floodplain. Spectral-temporal trajectories faithfully represented the nature of changes in floodplain forest cover, corroborating field observations. We estimated anthropogenic forest losses to be larger (1.071 ha) than natural losses (884 ha), with a global classification accuracy of 94%. We conclude that the LandTrendr algorithm is a reliable tool for studies of forest dynamics throughout the floodplain.
Resumo:
The Symbolic Aggregate Approximation (iSAX) is widely used in time series data mining. Its popularity arises from the fact that it largely reduces time series size, it is symbolic, allows lower bounding and is space efficient. However, it requires setting two parameters: the symbolic length and alphabet size, which limits the applicability of the technique. The optimal parameter values are highly application dependent. Typically, they are either set to a fixed value or experimentally probed for the best configuration. In this work we propose an approach to automatically estimate iSAX’s parameters. The approach – AutoiSAX – not only discovers the best parameter setting for each time series in the database, but also finds the alphabet size for each iSAX symbol within the same word. It is based on simple and intuitive ideas from time series complexity and statistics. The technique can be smoothly embedded in existing data mining tasks as an efficient sub-routine. We analyze its impact in visualization interpretability, classification accuracy and motif mining. Our contribution aims to make iSAX a more general approach as it evolves towards a parameter-free method.
Resumo:
We propose and validate a multivariate classification algorithm for characterizing changes in human intracranial electroencephalographic data (iEEG) after learning motor sequences. The algorithm is based on a Hidden Markov Model (HMM) that captures spatio-temporal properties of the iEEG at the level of single trials. Continuous intracranial iEEG was acquired during two sessions (one before and one after a night of sleep) in two patients with depth electrodes implanted in several brain areas. They performed a visuomotor sequence (serial reaction time task, SRTT) using the fingers of their non-dominant hand. Our results show that the decoding algorithm correctly classified single iEEG trials from the trained sequence as belonging to either the initial training phase (day 1, before sleep) or a later consolidated phase (day 2, after sleep), whereas it failed to do so for trials belonging to a control condition (pseudo-random sequence). Accurate single-trial classification was achieved by taking advantage of the distributed pattern of neural activity. However, across all the contacts the hippocampus contributed most significantly to the classification accuracy for both patients, and one fronto-striatal contact for one patient. Together, these human intracranial findings demonstrate that a multivariate decoding approach can detect learning-related changes at the level of single-trial iEEG. Because it allows an unbiased identification of brain sites contributing to a behavioral effect (or experimental condition) at the level of single subject, this approach could be usefully applied to assess the neural correlates of other complex cognitive functions in patients implanted with multiple electrodes.
Resumo:
Voxel-based morphometry from conventional T1-weighted images has proved effective to quantify Alzheimer's disease (AD) related brain atrophy and to enable fairly accurate automated classification of AD patients, mild cognitive impaired patients (MCI) and elderly controls. Little is known, however, about the classification power of volume-based morphometry, where features of interest consist of a few brain structure volumes (e.g. hippocampi, lobes, ventricles) as opposed to hundreds of thousands of voxel-wise gray matter concentrations. In this work, we experimentally evaluate two distinct volume-based morphometry algorithms (FreeSurfer and an in-house algorithm called MorphoBox) for automatic disease classification on a standardized data set from the Alzheimer's Disease Neuroimaging Initiative. Results indicate that both algorithms achieve classification accuracy comparable to the conventional whole-brain voxel-based morphometry pipeline using SPM for AD vs elderly controls and MCI vs controls, and higher accuracy for classification of AD vs MCI and early vs late AD converters, thereby demonstrating the potential of volume-based morphometry to assist diagnosis of mild cognitive impairment and Alzheimer's disease.
Resumo:
Although cross-sectional diffusion tensor imaging (DTI) studies revealed significant white matter changes in mild cognitive impairment (MCI), the utility of this technique in predicting further cognitive decline is debated. Thirty-five healthy controls (HC) and 67 MCI subjects with DTI baseline data were neuropsychologically assessed at one year. Among them, there were 40 stable (sMCI; 9 single domain amnestic, 7 single domain frontal, 24 multiple domain) and 27 were progressive (pMCI; 7 single domain amnestic, 4 single domain frontal, 16 multiple domain). Fractional anisotropy (FA) and longitudinal, radial, and mean diffusivity were measured using Tract-Based Spatial Statistics. Statistics included group comparisons and individual classification of MCI cases using support vector machines (SVM). FA was significantly higher in HC compared to MCI in a distributed network including the ventral part of the corpus callosum, right temporal and frontal pathways. There were no significant group-level differences between sMCI versus pMCI or between MCI subtypes after correction for multiple comparisons. However, SVM analysis allowed for an individual classification with accuracies up to 91.4% (HC versus MCI) and 98.4% (sMCI versus pMCI). When considering the MCI subgroups separately, the minimum SVM classification accuracy for stable versus progressive cognitive decline was 97.5% in the multiple domain MCI group. SVM analysis of DTI data provided highly accurate individual classification of stable versus progressive MCI regardless of MCI subtype, indicating that this method may become an easily applicable tool for early individual detection of MCI subjects evolving to dementia.
Resumo:
In this study we propose an evaluation of the angular effects altering the spectral response of the land-cover over multi-angle remote sensing image acquisitions. The shift in the statistical distribution of the pixels observed in an in-track sequence of WorldView-2 images is analyzed by means of a kernel-based measure of distance between probability distributions. Afterwards, the portability of supervised classifiers across the sequence is investigated by looking at the evolution of the classification accuracy with respect to the changing observation angle. In this context, the efficiency of various physically and statistically based preprocessing methods in obtaining angle-invariant data spaces is compared and possible synergies are discussed.
Resumo:
The objective of this work was to evaluate the application of the spectral-temporal response surface (STRS) classification method on Moderate Resolution Imaging Spectroradiometer (MODIS, 250 m) sensor images in order to estimate soybean areas in Mato Grosso state, Brazil. The classification was carried out using the maximum likelihood algorithm (MLA) adapted to the STRS method. Thirty segments of 30x30 km were chosen along the main agricultural regions of Mato Grosso state, using data from the summer season of 2005/2006 (from October to March), and were mapped based on fieldwork data, TM/Landsat-5 and CCD/CBERS-2 images. Five thematic classes were considered: Soybean, Forest, Cerrado, Pasture and Bare Soil. The classification by the STRS method was done over an area intersected with a subset of 30x30-km segments. In regions with soybean predominance, STRS classification overestimated in 21.31% of the reference values. In regions where soybean fields were less prevalent, the classifier overestimated 132.37% in the acreage of the reference. The overall classification accuracy was 80%. MODIS sensor images and the STRS algorithm showed to be promising for the classification of soybean areas in regions with the predominance of large farms. However, the results for fragmented areas and smaller farms were less efficient, overestimating soybean areas.