966 results for Hierarchical Spatial Classification
Abstract:
Automatic classification of makams from symbolic data is a rarely studied topic. In this paper, an n-gram based approach is first reviewed using various representations of the symbolic data. While a high degree of precision can be obtained, confusion arises mainly between makams that use (almost) the same scale and pitch hierarchy but differ in overall melodic progression, seyir. To further improve the system, n-gram based classification is first tested on various sections of the piece, to take into account one feature of the seyir: the melodic progression starts in a certain region of the scale. In a second test, a hierarchical classification structure is designed that uses n-grams and seyir features at different levels.
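The n-gram idea in this abstract can be illustrated with a minimal sketch. It assumes pieces are given as lists of pitch symbols, uses bigram counts with add-one smoothing, and picks the class with the highest log-likelihood. The class labels, the toy sequences, and the smoothing choice are assumptions made for illustration; the paper's actual representations and scoring are not specified here.

```python
from collections import Counter
import math


def ngrams(seq, n):
    """Return all contiguous n-grams of a symbol sequence as tuples."""
    return [tuple(seq[i:i + n]) for i in range(len(seq) - n + 1)]


def train(pieces_by_class, n=2):
    """Count n-grams per class. pieces_by_class maps label -> list of pieces."""
    models = {}
    for label, pieces in pieces_by_class.items():
        counts = Counter()
        for piece in pieces:
            counts.update(ngrams(piece, n))
        models[label] = counts
    return models


def classify(piece, models, n=2):
    """Pick the class whose smoothed n-gram model gives the highest log-likelihood."""
    vocab = set().union(*models.values())
    v = len(vocab) + 1  # one extra slot for unseen n-grams
    best_label, best_score = None, -math.inf
    for label, counts in models.items():
        total = sum(counts.values())
        score = sum(math.log((counts[g] + 1) / (total + v))
                    for g in ngrams(piece, n))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy data: two classes with different melodic movement.
training = {
    "makam_A": [["C", "D", "E", "D", "C", "D", "E"], ["D", "E", "D", "C", "D"]],
    "makam_B": [["G", "F", "E", "F", "G", "F"], ["F", "G", "F", "E", "F"]],
}
models = train(training)
print(classify(["C", "D", "E", "D"], models))
```

To mimic the section-wise test mentioned in the abstract, one could pass only a slice of each piece (for example its opening) to `train` and `classify`.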
Abstract:
Since the last century, Cabo Verde has invested in afforestation, especially after 1975, to mitigate the effects of drought and desertification, thereby creating large wooded areas. However, as forest resources were created, the problem of their assessment and sustainable management began to receive greater attention from the national authorities. The forestry law, enacted in 1998, defines as one of the duties and actions of the State, through the forest services, the preparation of management plans for forest zones. Such a management plan requires the analysis and appraisal of concrete, up-to-date data on the actual condition of the forest zones, which is possible only through the national forest inventory (IFN). This work proposes a methodology for processing the IFN that draws on the capabilities of Geographic Information Systems (GIS). The following software was used: ArcGis 9.1, for map production, geoprocessing and spatial analysis, and Field-Map 8.1, for classifying orthophotos under a five-level hierarchical classification scheme adapted to Cabo Verde (land-use classes adapted from the classification scheme for European territory, CORINE Land Cover, and from that of the Food and Agriculture Organization of the United Nations, FAO). The data used were compiled within the forest inventory project. The results obtained for the island of Santiago constitute a cartographic base for the IFN with several map themes, namely maps of forested zones, land-cover maps and maps of inventoriable samples, produced with a methodology that can easily be replicated for the remaining islands of the archipelago.
Abstract:
In this paper we address the issue of locating hierarchical facilities in the presence of congestion. Two hierarchical models are presented, in which lower-level servers attend requests first and some of the served customers are then referred to higher-level servers. In the first model, the objective is to find the minimum number of servers, and their locations, that will cover a given region within a distance or time standard. The second model is cast as a Maximal Covering Location formulation. A heuristic procedure is then presented together with computational experience. Finally, some extensions of these models that address other types of spatial configurations are offered.
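As a rough illustration of what a Maximal Covering Location formulation optimizes, the sketch below applies a plain greedy heuristic to the basic, single-level problem: choose `p` sites so that the demand within the coverage standard is as large as possible. It ignores hierarchy and congestion and is not the heuristic of the paper. The site names, demand weights, and coverage sets are hypothetical.

```python
def greedy_max_cover(demand, coverage, p):
    """Greedy heuristic for the basic maximal covering location problem.

    demand:   dict node -> demand weight
    coverage: dict candidate site -> set of nodes within the distance/time standard
    p:        number of facilities to open
    Returns (chosen sites in order, total demand covered).
    """
    chosen, covered = [], set()
    for _ in range(p):
        best_site, best_gain = None, 0
        for site, nodes in coverage.items():
            if site in chosen:
                continue
            # Marginal gain: demand of nodes not yet covered.
            gain = sum(demand[i] for i in nodes - covered)
            if gain > best_gain:
                best_site, best_gain = site, gain
        if best_site is None:  # nothing left to gain
            break
        chosen.append(best_site)
        covered |= coverage[best_site]
    return chosen, sum(demand[i] for i in covered)


# Hypothetical instance: four demand nodes, four candidate sites.
demand = {1: 10, 2: 20, 3: 30, 4: 40}
coverage = {"a": {1, 2}, "b": {2, 3}, "c": {3, 4}, "d": {1}}
print(greedy_max_cover(demand, coverage, p=2))
```

Greedy selection is a common baseline for covering problems, but it does not guarantee an optimal solution in general.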
Abstract:
In this paper, we propose two active learning algorithms for semiautomatic definition of training samples in remote sensing image classification. Based on predefined heuristics, the classifier ranks the unlabeled pixels and automatically chooses those that are considered the most valuable for its improvement. Once the pixels have been selected, the analyst labels them manually and the process is iterated. Starting with a small and nonoptimal training set, the model itself builds the optimal set of samples which minimizes the classification error. We have applied the proposed algorithms to a variety of remote sensing data, including very high resolution and hyperspectral images, using support vector machines. Experimental results confirm the consistency of the methods. The required number of training samples can be reduced to 10% using the methods proposed, reaching the same level of accuracy as larger data sets. A comparison with a state-of-the-art active learning method, margin sampling, is provided, highlighting advantages of the methods proposed. The effect of spatial resolution and separability of the classes on the quality of the selection of pixels is also discussed.
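The baseline that this abstract compares against, margin sampling, can be sketched in a few lines: the unlabeled samples closest to the decision boundary are ranked first and handed to the analyst for labeling. The sketch assumes a binary classifier exposed as a signed decision function. The linear function and the toy pixels are hypothetical stand-ins for a trained SVM, and the two algorithms proposed in the paper are not reproduced here.

```python
def margin_sampling(unlabeled, decision_function, batch_size):
    """Return indices of the batch_size samples closest to the decision boundary.

    decision_function(x) is assumed to return a signed distance-like score,
    as a binary SVM would; small absolute values mean high uncertainty.
    """
    ranked = sorted(range(len(unlabeled)),
                    key=lambda i: abs(decision_function(unlabeled[i])))
    return ranked[:batch_size]


def toy_decision(x):
    """Hypothetical stand-in for a trained classifier: a fixed linear boundary."""
    return x[0] - x[1]


# Hypothetical two-band pixel values.
pixels = [(0.9, 0.1), (0.5, 0.45), (0.2, 0.8), (0.4, 0.42)]
to_label = margin_sampling(pixels, toy_decision, batch_size=2)
print(to_label)  # indices of the pixels the analyst would label next
```

In an iterative loop, the selected pixels would be labeled, added to the training set, and the classifier retrained before the next ranking.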
Abstract:
The loss of biodiversity has become a matter of urgent concern and a better understanding of local drivers is crucial for conservation. Although environmental heterogeneity is recognized as an important determinant of biodiversity, this has rarely been tested using field data at management scale. We propose and provide evidence for the simple hypothesis that local species diversity is related to spatial environmental heterogeneity. Species partition the environment into habitats. Biodiversity is therefore expected to be influenced by two aspects of spatial heterogeneity: 1) the variability of environmental conditions, which will affect the number of types of habitat, and 2) the spatial configuration of habitats, which will affect the rates of ecological processes, such as dispersal or competition. Earlier, simulation experiments predicted that both aspects of heterogeneity will influence plant species richness at a particular site. For the first time, these predictions were tested for plant communities using field data, which we collected in a wooded pasture in the Swiss Jura mountains using a four-level hierarchical sampling design. Richness generally increased with increasing environmental variability and "roughness" (i.e. decreasing spatial aggregation). Effects occurred at all scales, but the nature of the effect changed with scale, suggesting a change in the underlying mechanisms, which will need to be taken into account if scaling up to larger landscapes. Although we found significant effects of environmental heterogeneity, other factors such as history could also be important determinants. If a relationship between environmental heterogeneity and species richness can be shown to be general, recently available high-resolution environmental data can be used to complement the assessment of patterns of local richness and improve the prediction of the effects of land use change based on mean site conditions or land use history.
Abstract:
The research considers the problem of spatial data classification using machine learning algorithms: probabilistic neural networks (PNN) and support vector machines (SVM). A simple k-nearest neighbor algorithm is considered as a benchmark model. PNN is a neural network reformulation of well-known nonparametric principles of probability density modeling, using a kernel density estimator and Bayesian optimal or maximum a posteriori decision rules. PNN is well suited to problems where not only predictions but also quantification of accuracy and integration of prior information are necessary. An important property of PNNs is that they can be easily used in decision support systems dealing with problems of automatic classification. The support vector machine is an implementation of the principles of statistical learning theory for classification tasks. SVMs have recently been applied successfully to a range of environmental topics: classification of soil types and hydro-geological units, optimization of monitoring networks, and susceptibility mapping of natural hazards. In the present paper both simulated and real data case studies (low and high dimensional) are considered. Particular attention is paid to the detection and learning of spatial patterns by the algorithms applied.
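The description of PNN as a kernel density estimator combined with a maximum a posteriori rule can be made concrete with a small sketch. It assumes an isotropic Gaussian kernel with a single bandwidth `sigma` shared by all classes (so the kernel normalization constant cancels) and, unless priors are supplied, class priors proportional to class size. The class names, coordinates, and bandwidth are hypothetical.

```python
import math


def pnn_posteriors(x, train, sigma=0.5, priors=None):
    """Class posteriors at point x from per-class Gaussian kernel density estimates.

    train:  dict label -> list of training points (tuples of coordinates)
    priors: optional dict label -> prior probability; defaults to class frequencies
    """
    if priors is None:
        n_total = sum(len(pts) for pts in train.values())
        priors = {label: len(pts) / n_total for label, pts in train.items()}
    scores = {}
    for label, pts in train.items():
        # Average kernel value over the class's training points (density up to a constant).
        density = sum(
            math.exp(-sum((a - b) ** 2 for a, b in zip(x, p)) / (2 * sigma ** 2))
            for p in pts
        ) / len(pts)
        scores[label] = priors[label] * density
    total = sum(scores.values())
    return {label: s / total for label, s in scores.items()}


def pnn_classify(x, train, sigma=0.5, priors=None):
    """Maximum a posteriori decision."""
    post = pnn_posteriors(x, train, sigma, priors)
    return max(post, key=post.get)


# Hypothetical spatial samples of two soil classes.
samples = {
    "sand": [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0)],
    "clay": [(5.0, 5.0), (5.0, 6.0), (6.0, 5.0)],
}
print(pnn_classify((0.5, 0.5), samples))
print(pnn_posteriors((3.0, 3.0), samples, sigma=2.0))
```

The returned posteriors are what allow accuracy to be quantified alongside the prediction, and the `priors` argument is where prior information enters.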
Abstract:
Although the influence of clay mineralogy on soil physical properties has been widely studied, spatial relationships between these features in Alfisols have rarely been examined. The purpose of this work was to relate the clay minerals and physical properties of an Alfisol of sandstone origin in two slope curvatures. The crystallographic properties such as mean crystallite size (MCS) and width at half height (WHH) of hematite, goethite, kaolinite and gibbsite; contents of hematite and goethite; aluminium substitution (AS) and specific surface area (SSA) of hematite and goethite; the goethite/(goethite+hematite) and kaolinite/(kaolinite+gibbsite) ratios; and the citrate/bicarbonate/dithionite extractable Fe (Fe_d) were correlated with the soil physical properties through Pearson correlation coefficients and cross-semivariograms. The correlations found between aluminium substitution in goethite and the soil physical properties suggest that the degree of crystallinity of this mineral influences soil properties used as soil quality indicators. Thus, goethite with a high aluminium substitution resulted in large aggregate sizes and a high porosity, and also in a low bulk density and soil penetration resistance. The presence of highly crystalline gibbsite resulted in a high density and micropore content, as well as in smaller aggregates. Interpretation of the cross-semivariogram and classification of landscape compartments in terms of the spatial dependence pattern for the relief-dependent physical and mineralogical properties of the soil proved an effective supplementary method for assessing Pearson correlations between the soil physical and mineralogical properties.
Abstract:
Soil surveys are the main source of spatial information on soils and have a range of different applications, mainly in agriculture. The continuity of this activity has, however, been severely compromised, mainly due to a lack of governmental funding. The purpose of this study was to evaluate the feasibility of two different classifiers (artificial neural networks and a maximum likelihood algorithm) in the prediction of soil classes in the northwest of the state of Rio de Janeiro. Terrain attributes such as elevation, slope, aspect, plan curvature and compound topographic index (CTI), together with indices of clay minerals, iron oxide and the Normalized Difference Vegetation Index (NDVI) derived from Landsat 7 ETM+ sensor imagery, were used as discriminating variables. The two classifiers were trained and validated for each soil class using 300 and 150 samples respectively, representing the characteristics of these classes in terms of the discriminating variables. According to the statistical tests, the accuracy of the classifier based on artificial neural networks (ANNs) was greater than that of the classic Maximum Likelihood Classifier (MLC). Comparing the results with 126 reference points showed that the resulting ANN map (73.81%) was superior to the MLC map (57.94%). The main errors when using the two classifiers were caused by: a) the geological heterogeneity of the area coupled with problems related to the geological map; b) the depth of lithic contact and/or rock exposure; and c) problems with the environmental correlation model used, due to the polygenetic nature of the soils. This study confirms that the use of terrain attributes together with remote sensing data in an ANN approach can be a tool to facilitate soil mapping in Brazil, primarily due to the availability of low-cost remote sensing data and the ease with which terrain attributes can be obtained.
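Two quantities in this abstract follow simple formulas, sketched below: NDVI from near-infrared and red reflectance, and overall map accuracy as the share of reference points classified correctly. The reflectance values are made up. The counts of 93 and 73 correct points are not stated in the abstract; they are back-computed here as the integers consistent with the reported 73.81% and 57.94% of 126 points, and should be read as an assumption.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index: (NIR - Red) / (NIR + Red)."""
    denom = nir + red
    return 0.0 if denom == 0 else (nir - red) / denom


def overall_accuracy(n_correct, n_total):
    """Percentage of reference points whose predicted class matches the reference."""
    return 100.0 * n_correct / n_total


# Hypothetical reflectances for one pixel (NIR band and red band).
print(round(ndvi(0.5, 0.1), 3))

# Counts inferred from the reported percentages (assumption, see lead-in).
print(round(overall_accuracy(93, 126), 2))  # ANN map
print(round(overall_accuracy(73, 126), 2))  # MLC map
```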
Abstract:
Summary Following recent technological advances, digital image archives have grown in quality and quantity at an unprecedented rate. Despite the enormous possibilities they offer, these advances raise new questions about how to process the masses of data acquired. This question is the basis of this Thesis: problems of processing digital information at very high spatial and/or spectral resolution are addressed using statistical learning approaches, namely kernel methods. This Thesis studies image classification problems, that is, the categorization of pixels into a small number of classes reflecting the spectral and contextual properties of the objects they represent. The emphasis is on the efficiency of the algorithms, as well as on their simplicity, so as to increase their potential for implementation by users. Moreover, the challenge of this Thesis is to stay close to the concrete problems of satellite image users without losing sight of the interest of the proposed methods for the machine learning community from which they originate. In this sense, this work is deliberately transdisciplinary, maintaining a strong link between the two sciences in all the developments proposed. Four models are proposed: the first addresses the problem of high dimensionality and data redundancy with a model that optimizes classification performance by adapting to the particularities of the image. This is made possible by a ranking of the variables (the bands) that is optimized together with the base model: in doing so, only the variables important for solving the problem are used by the classifier.
The lack of labeled information and the uncertainty about its relevance to the problem motivate the next two models, based respectively on active learning and semi-supervised methods: the first improves the quality of a training set through direct interaction between the user and the machine, while the second uses unlabeled pixels to improve the description of the available data and the robustness of the model. Finally, the last model proposed considers the more theoretical question of structure among the outputs: integrating this source of information, never before considered in remote sensing, opens new research challenges.
Advanced kernel methods for remote sensing image classification. Devis Tuia, Institut de Géomatique et d'Analyse du Risque, September 2009.
Abstract The technical developments of recent years have brought the quantity and quality of digital information to an unprecedented level, as enormous archives of satellite images are available to users. However, even if these advances open more and more possibilities for the use of digital imagery, they also raise several problems of storage and processing. The latter is considered in this Thesis: the processing of very high spatial and spectral resolution images is treated with approaches based on data-driven algorithms relying on kernel methods. In particular, the problem of image classification, i.e. the categorization of the image's pixels into a reduced number of classes reflecting spectral and contextual properties, is studied through the different models presented. The emphasis is on algorithmic efficiency and on the simplicity of the proposed approaches, to avoid overly complex models that users would not adopt.
The major challenge of the Thesis is to remain close to concrete remote sensing problems without losing the methodological interest from the machine learning viewpoint: in this sense, this work aims at building a bridge between the machine learning and remote sensing communities, and all the models proposed have been developed keeping in mind the need for such a synergy. Four models are proposed: first, an adaptive model that learns the relevant image features is proposed to solve the problem of high dimensionality and collinearity of the image features. This model automatically provides an accurate classifier and a ranking of the relevance of the individual features. The scarcity and unreliability of labeled information were the common root of the second and third models proposed: when confronted with such problems, the user can either construct the labeled set iteratively by direct interaction with the machine or use the unlabeled data to increase the robustness and quality of the description of the data. Both solutions have been explored, resulting in two methodological contributions based respectively on active learning and semisupervised learning. Finally, the more theoretical issue of structured outputs has been considered in the last model, which, by integrating output similarity into a model, opens new challenges and opportunities for remote sensing image processing.
Abstract:
Radioactive soil-contamination mapping and risk assessment is a vital issue for decision makers. Traditional approaches for mapping the spatial concentration of radionuclides employ various regression-based models, which usually provide a single-value prediction realization accompanied (in some cases) by estimation error. Such approaches do not provide the capability for rigorous uncertainty quantification or probabilistic mapping. Machine learning is a recent and fast-developing approach based on learning patterns and information from data. Artificial neural networks for prediction mapping have been especially powerful in combination with spatial statistics. A data-driven approach provides the opportunity to integrate additional relevant information about spatial phenomena into a prediction model for more accurate spatial estimates and associated uncertainty. Machine-learning algorithms can also be used for a wider spectrum of problems than before: classification, probability density estimation, and so forth. Stochastic simulations are used to model spatial variability and uncertainty. Unlike regression models, they provide multiple realizations of a particular spatial pattern that allow uncertainty and risk quantification. This paper reviews the most recent methods of spatial data analysis, prediction, and risk mapping, based on machine learning and stochastic simulations in comparison with more traditional regression models. The radioactive fallout from the Chernobyl Nuclear Power Plant accident is used to illustrate the application of the models for prediction and classification problems. This fallout is a unique case study that provides the challenging task of analyzing huge amounts of data ('hard' direct measurements, as well as supplementary information and expert estimates) and solving particular decision-oriented problems.
Abstract:
The present research deals with an important public health threat: the pollution created by radon gas accumulating inside dwellings. The spatial modeling of indoor radon in Switzerland is particularly complex and challenging because many influencing factors must be taken into account. Indoor radon data analysis must be addressed from both a statistical and a spatial point of view. Because indoor radon is a multivariate process, it was important first to define the influence of each factor. In particular, it was important to define the influence of geology, which is closely associated with indoor radon. This association was indeed observed for the Swiss data but was not proved to be the sole determinant for the spatial modeling. The statistical analysis of data, at both the univariate and multivariate levels, was followed by an exploratory spatial analysis. Many tools proposed in the literature were tested and adapted, including fractality, declustering and moving windows methods. The use of the Quantité Morisita Index (QMI) as a procedure to evaluate data clustering as a function of the radon level was proposed. The existing methods of declustering were revised and applied in an attempt to approach the global histogram parameters. The exploratory phase comes along with the definition of multiple scales of interest for indoor radon mapping in Switzerland. The analysis was done with a top-down resolution approach, from regional to local levels, in order to find the appropriate scales for modeling. In this sense, data partition was optimized in order to cope with the stationarity conditions of geostatistical models. Common methods of spatial modeling such as K Nearest Neighbors (KNN), variography and General Regression Neural Networks (GRNN) were proposed as exploratory tools. In the following section, different spatial interpolation methods were applied to a particular dataset.
A bottom-up approach in terms of method complexity was adopted, and the results were analyzed together in order to find common definitions of continuity and neighborhood parameters. Additionally, a data filter based on cross-validation (the CVMF) was tested with the purpose of reducing noise at the local scale. At the end of the chapter, a series of tests for data consistency and method robustness was performed. This led to conclusions about the importance of data splitting and the limitations of generalization methods for reproducing statistical distributions. The last section was dedicated to modeling methods with probabilistic interpretations. Data transformation and simulations thus allowed the use of multigaussian models and helped take the uncertainty of the indoor radon pollution data into consideration. The categorization transform was presented as a solution for modeling extreme values through classification. Simulation scenarios were proposed, including an alternative proposal for reproducing the global histogram based on the sampling domain. Sequential Gaussian simulation (SGS) was presented as the method giving the most complete information, while classification performed in a more robust way. An error measure was defined in relation to the decision function for data classification hardening. Among the classification methods, probabilistic neural networks (PNN) were shown to be better adapted to modeling high-threshold categorization and to automation. Support vector machines (SVM), by contrast, performed well under balanced category conditions. In general, it was concluded that no particular prediction or estimation method is better under all conditions of scale and neighborhood definition. Simulations should be the basis, while other methods can provide complementary information to support efficient decision making on indoor radon.
Abstract:
This paper examines the complex age structure of the population of Catalan municipalities for the year 2009. Catalonia is a very heterogeneous territory, and age pyramids vary considerably across its different areas, with geographical factors shaping the municipalities' age distributions. Using spatial statistics methodologies, this research assesses which spatial factors determine the location, scale and shape of local distributions. The results show that different distributional patterns exist across the territory according to specific local determinants. Keywords: Spatial Models. JEL Classification: C21.
Abstract:
Background: Conventional magnetic resonance imaging (MRI) techniques are highly sensitive in detecting multiple sclerosis (MS) plaques, enabling a quantitative assessment of inflammatory activity and lesion load. In quantitative analyses of focal lesions, manual or semi-automated segmentations have been widely used to compute the total number of lesions and the total lesion volume. These techniques, however, are both challenging and time-consuming, and are also prone to intra-observer and inter-observer variability. Aim: To develop an automated approach to segment brain tissues and MS lesions from brain MRI images. The goal is to reduce user interaction and to provide an objective tool that eliminates inter- and intra-observer variability. Methods: Based on the recent methods developed by Souplet et al. and de Boer et al., we propose a novel pipeline which includes the following steps: bias correction, skull stripping, atlas registration, tissue classification, and lesion segmentation. After the initial pre-processing steps, an MRI scan is automatically segmented into 4 classes: white matter (WM), grey matter (GM), cerebrospinal fluid (CSF) and partial volume. An expectation maximisation method which fits a multivariate Gaussian mixture model to T1-w, T2-w and PD-w images is used for this purpose. Based on the obtained tissue masks and using the estimated GM mean and variance, we apply an intensity threshold to the FLAIR image, which provides the lesion segmentation. To improve this initial result, spatial information from the neighbouring tissue labels is used to refine the final lesion segmentation. Results: The experimental evaluation was performed using real 1.5T data sets and the corresponding ground truth annotations provided by expert radiologists. The following values were obtained: 64% of true positive (TP) fraction, 80% of false positive (FP) fraction, and an average surface distance of 7.89 mm.
The results of our approach were quantitatively compared to our implementations of the works of Souplet et al. and de Boer et al., obtaining higher TP and lower FP values. Conclusion: Promising MS lesion segmentation results have been obtained in terms of TP. However, the high number of FP, which is still a well-known problem of all automated MS lesion segmentation approaches, has to be reduced before these approaches can be used in standard clinical practice. Our future work will focus on tackling this issue.
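The thresholding step described in the Methods (an intensity threshold on the FLAIR image derived from the estimated GM mean and variance) can be sketched as follows. The rule `mean + k * std` and the value of `k` are assumptions, since the abstract does not give the exact threshold, and the flat intensity list stands in for a 3D volume. The bias correction, registration, EM tissue classification, and neighbourhood refinement steps are not covered.

```python
import math


def lesion_candidates(flair, gm_mask, k=3.0):
    """Flag voxels whose FLAIR intensity exceeds mean + k * std of grey matter.

    flair:   list of voxel intensities (a flattened image, for illustration)
    gm_mask: list of 0/1 flags marking voxels classified as grey matter
    k:       hypothetical multiplier; the abstract does not state the value used
    Returns (boolean mask of candidate lesion voxels, threshold).
    """
    gm_values = [v for v, m in zip(flair, gm_mask) if m]
    mean = sum(gm_values) / len(gm_values)
    variance = sum((v - mean) ** 2 for v in gm_values) / len(gm_values)
    threshold = mean + k * math.sqrt(variance)
    return [v > threshold for v in flair], threshold


# Hypothetical intensities: six grey-matter voxels and two hyperintense voxels.
flair = [100, 102, 98, 101, 99, 160, 97, 150]
gm_mask = [1, 1, 1, 1, 1, 0, 1, 0]
mask, threshold = lesion_candidates(flair, gm_mask)
print(mask, round(threshold, 2))
```

A refinement pass like the one the abstract mentions would then discard candidates whose neighbouring tissue labels make a lesion implausible.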
Abstract:
The book presents the state of the art in machine learning algorithms (artificial neural networks of different architectures, support vector machines, etc.) as applied to the classification and mapping of spatially distributed environmental data. Basic geostatistical algorithms are presented as well. New trends in machine learning and their application to spatial data are described, and real case studies based on environmental and pollution data are carried out. The book includes a CD-ROM with the Machine Learning Office software and sample data sets, allowing both students and researchers to put the concepts into practice rapidly.
Abstract:
This letter presents advanced classification methods for very high resolution images. Efficient multisource information, both spectral and spatial, is exploited through the use of composite kernels in support vector machines. Weighted summations of kernels accounting for separate sources of spectral and spatial information are analyzed and compared to classical approaches such as pure spectral classification or stacked approaches using all the features in a single vector. Model selection problems are addressed, as well as the importance of the different kernels in the weighted summation.
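The weighted summation of kernels described here can be written as K = mu * K_spectral + (1 - mu) * K_spatial. The sketch below builds such a composite Gram matrix from two RBF kernels, one on spectral features and one on spatial features. The feature values, the kernel widths, and the weight `mu` are hypothetical, and the choice of RBF kernels is an assumption rather than something the abstract states.

```python
import math


def rbf(u, v, gamma):
    """Gaussian RBF kernel between two feature vectors."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(u, v)))


def composite_kernel(spectral, spatial, mu=0.5, gamma_spec=1.0, gamma_spat=1.0):
    """Weighted-summation kernel: mu * K_spectral + (1 - mu) * K_spatial.

    spectral, spatial: lists of per-pixel feature vectors, in the same pixel order
    mu:                weight in [0, 1] balancing the two information sources
    Returns the Gram matrix as a list of lists.
    """
    n = len(spectral)
    return [
        [
            mu * rbf(spectral[i], spectral[j], gamma_spec)
            + (1.0 - mu) * rbf(spatial[i], spatial[j], gamma_spat)
            for j in range(n)
        ]
        for i in range(n)
    ]


# Hypothetical pixels: spectral bands and a spatial (e.g. neighbourhood mean) feature.
spectral = [(0.2, 0.4), (0.25, 0.38), (0.8, 0.1)]
spatial = [(0.3,), (0.7,), (0.75,)]
K = composite_kernel(spectral, spatial, mu=0.4)
for row in K:
    print([round(v, 3) for v in row])
```

Because a convex combination of valid kernels is itself a valid kernel, a matrix like `K` can be passed to any SVM implementation that accepts a precomputed kernel, and `mu` can be tuned during model selection.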