953 results for Semi-supervised classification
Abstract:
This work aims to characterize the typical spatial distribution of rice cultivation in the irrigated lands of the Ebro valley, where the presence of the crop is linked to the existence of saline-sodic soils. This characterization should make it possible to identify the areas where the crop is typically present year after year and the areas where it fluctuates frequently, owing both to variable soil salinity conditions and to variability in market conditions. To this end, we analyzed a time series of crop maps (7 years) derived from the supervised classification of Landsat TM images. The typical and fluctuating rice-growing areas are then determined from the statistical analysis of classes and by spatial overlay of coverages in a raster GIS environment.
Abstract:
The "Serra do Mar" region comprises the largest remnant of the Brazilian Atlantic Forest. The coast of Paraná State is part of the core area of the "Serra do Mar" corridor, where actions for biodiversity conservation must be planned. In this study we aimed to characterize the landscape structure of the APA-Guaraqueçaba, the largest protected area in this region, in order to support the region's environmental policies. Based on a supervised classification of a mosaic of LANDSAT-5-TM satellite images (from March 2009), we developed a map (1:75,000 scale) with seven classes of land use and land cover and analyzed the relative quantities of forests and modified areas on slopes and in lowlands. The APA-Guaraqueçaba is composed mainly of Dense Ombrophilous Forest (68.6% of total area) and secondary forests (9.1%), indicating a forested landscape matrix; anthropogenic and bare soil areas (0.8%) and the Pasture/Grasslands class (4.2%) were less representative. Slopes were less fragmented and better preserved (96.3% Dense Ombrophilous Forest and secondary forest) than lowlands (71.3%), suggesting that restoration initiatives in the lowlands should be encouraged in this region. We concluded that most of the region sustains well-conserved ecosystems, highlighting the importance of the northern coast of Paraná for maintaining the biodiversity of the Atlantic Forest.
Abstract:
The objective of this study was to map land use and occupation and to evaluate the quality of the irrigation water used in Salto do Lontra, in the state of Paraná, Brazil. SPOT-5 satellite images were classified with a supervised Maximum Likelihood algorithm (MAXVER), and the water quality parameters analyzed were pH, EC, HCO3-, Cl-, PO4(3-), NO3-, turbidity, temperature and thermotolerant coliforms in two distinct rainfall periods. The water quality data were subjected to statistical analysis by PCA and FA to identify the most relevant variables for assessing irrigation water quality. The characterization of land use and occupation by the MAXVER classifier allowed the identification of the following classes: crops, bare soil/stubble, forests and urban area. PCA applied to the irrigation water quality data explained 53.27% of the variation in water quality among the sampled points. Nitrate, thermotolerant coliforms, temperature, electrical conductivity and bicarbonate were the parameters that best explained the spatial variation of water quality.
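As a rough illustration of the kind of supervised maximum likelihood classification named above, the following sketch fits one multivariate Gaussian per class from labelled training pixels and assigns each pixel to the class with the highest log-likelihood. It is not the MAXVER implementation used in the study; the function names, the array layout (one row per pixel, one column per band) and the optional use of class priors are assumptions made for this example.

```python
import numpy as np

def fit_gaussian_classes(X, y):
    """Estimate mean, covariance and prior for each class.

    X: (n_pixels, n_bands) training spectra (hypothetical layout).
    y: (n_pixels,) integer class labels.
    """
    params = {}
    for label in np.unique(y):
        Xc = X[y == label]
        params[label] = (Xc.mean(axis=0), np.cov(Xc, rowvar=False), len(Xc) / len(X))
    return params

def classify_max_likelihood(X, params, use_priors=False):
    """Assign each row of X to the class with the largest Gaussian log-likelihood."""
    labels = sorted(params)
    scores = []
    for label in labels:
        mean, cov, prior = params[label]
        diff = X - mean
        inv = np.linalg.inv(cov)
        _, logdet = np.linalg.slogdet(cov)
        # Squared Mahalanobis distance of every pixel to the class mean.
        maha = np.einsum("ij,jk,ik->i", diff, inv, diff)
        score = -0.5 * (logdet + maha)
        if use_priors:
            score = score + np.log(prior)
        scores.append(score)
    return np.array(labels)[np.argmax(scores, axis=0)]
```

Each class needs more training pixels than bands for its covariance matrix to be invertible; with very small training sets a regularized covariance would be a safer choice.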
Abstract:
Coffee production was closely linked to the economic development of Brazil and, even today, coffee is an important product of the national agriculture. The State of Minas Gerais currently accounts for 52% of the whole coffee area in Brazil. Remote sensing data can provide information for monitoring and mapping of coffee crops, faster and cheaper than conventional methods. In this context, the objective of this study was to assess the effectiveness of coffee crop mapping in Monte Santo de Minas municipality, Minas Gerais State, Brazil, from fraction images derived from MODIS data, in both dry and rainy seasons. The Spectral Linear Mixing Model was used to derive fraction images of soil, coffee, and water/shade. These fraction images served as input data for the supervised automatic classification using the SVM - Support Vector Machine approach. The best results concerning Overall Accuracy and Kappa Index were obtained in the classification of the dry season, with 67% and 0.41, respectively.
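The two accuracy figures reported above, Overall Accuracy and the Kappa Index, are both computed from a confusion matrix. The sketch below shows the standard formulas; the function name and the example matrix in the usage note are hypothetical and are not data from the study.

```python
import numpy as np

def overall_accuracy_and_kappa(confusion):
    """Return (overall accuracy, Cohen's kappa) for a square confusion matrix.

    Rows and columns index the same classes (reference vs. classified).
    """
    confusion = np.asarray(confusion, dtype=float)
    n = confusion.sum()
    # Observed agreement: fraction of samples on the diagonal.
    po = np.trace(confusion) / n
    # Agreement expected by chance from the row and column totals.
    pe = (confusion.sum(axis=0) * confusion.sum(axis=1)).sum() / n**2
    kappa = (po - pe) / (1.0 - pe)
    return po, kappa
```

For a hypothetical two-class matrix `[[40, 10], [10, 40]]`, the observed agreement is 0.8 and the chance agreement is 0.5, so kappa is 0.6. Kappa is lower than overall accuracy because it discounts the agreement expected by chance.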
Abstract:
The goal of most clustering algorithms is to find the optimal number of clusters (i.e., the smallest number of clusters). However, analysis of molecular conformations of biological macromolecules obtained from computer simulations may benefit from a larger array of clusters. The Self-Organizing Map (SOM) clustering method has the advantage of generating large numbers of clusters, but often gives ambiguous results. In this work, SOMs have been shown to be reproducible when the same conformational dataset is independently clustered multiple times (~100), with the help of Cramér's V-index (C_v). The ability of C_v to determine which SOMs are reproduced is generalizable across different SOM source codes. The conformational ensembles produced from MD (molecular dynamics) and REMD (replica exchange molecular dynamics) simulations of the pentapeptide Met-enkephalin (MET) and the 34-amino-acid protein human Parathyroid Hormone (hPTH) were used to evaluate SOM reproducibility. The training length for the SOM has a large impact on the reproducibility. Analysis of MET conformational data determined that toroidal SOMs cluster data better than bordered maps because toroidal maps do not have an edge effect. For the source code from MATLAB, it was determined that the learning rate function should be LINEAR with an initial learning rate factor of 0.05 and that the SOM should be trained by a sequential algorithm. The trained SOMs can be used as a supervised classifier for another dataset. The toroidal 10×10 hexagonal SOMs produced from the MATLAB program for hPTH conformational data produced three sets of reproducible clusters (27%, 15%, and 13% of 100 independent runs), which find partitionings similar to those of smaller 6×6 SOMs. The χ^2 values produced as part of the C_v calculation were used to locate clusters with identical conformational memberships on independently trained SOMs, even those with different dimensions. The χ^2 values could relate the different SOM partitionings to each other.
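To make the comparison of two independent clusterings concrete, here is a minimal sketch of Cramér's V computed from the contingency table of two label assignments over the same set of conformations. This is the textbook definition and not necessarily the exact procedure of the study; the function name and the input format (one cluster label per conformation for each run) are assumptions.

```python
import numpy as np

def cramers_v(labels_a, labels_b):
    """Return (Cramér's V, chi-squared) for two clusterings of the same items."""
    a_vals, a_idx = np.unique(labels_a, return_inverse=True)
    b_vals, b_idx = np.unique(labels_b, return_inverse=True)
    # Contingency table: rows are clusters of run A, columns are clusters of run B.
    table = np.zeros((len(a_vals), len(b_vals)))
    np.add.at(table, (a_idx, b_idx), 1)
    n = table.sum()
    expected = np.outer(table.sum(axis=1), table.sum(axis=0)) / n
    chi2 = ((table - expected) ** 2 / expected).sum()
    # Undefined when either run has a single cluster (k would be 0).
    k = min(table.shape) - 1
    return np.sqrt(chi2 / (n * k)), chi2
```

A value of 1 means the two runs produce the same partition up to a relabelling of clusters, and a value of 0 means the two assignments are statistically independent.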
Abstract:
We study the application of matrix decomposition algorithms, such as Non-negative Matrix Factorization (NMF), to frequency-domain representations of musical audio signals. These algorithms, driven by a reconstruction error function, learn a set of basis functions and a set of corresponding coefficients that approximate the input signal. We compare the use of three reconstruction error functions when NMF is applied to monophonic and harmonized scales: least squares, Kullback-Leibler divergence, and a recently introduced phase-dependent divergence measure. New methods for interpreting the resulting decompositions are presented and compared with previously used methods that require knowledge of the acoustic domain. Finally, we analyze the generalization ability of the learned basis functions with respect to three musical parameters: amplitude, duration, and instrument type. To this end, we introduce two algorithms for labeling the basis functions that outperform the previous approach in the majority of our tests, the instrument task with monophonic audio being the only notable exception.
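For readers unfamiliar with NMF, the sketch below factorizes a non-negative matrix V (for example a magnitude spectrogram, frequencies by frames) into basis functions W and coefficients H using the classic multiplicative updates for the least-squares error. It covers only the least-squares case among the three error functions mentioned; the function name, the default parameters and the random initialization are assumptions for illustration, not the thesis's code.

```python
import numpy as np

def nmf_least_squares(V, rank, n_iter=200, seed=0, eps=1e-9):
    """Approximate V (non-negative) as W @ H with W, H >= 0.

    Uses multiplicative updates that do not increase ||V - W @ H||_F.
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    # Random positive initialization (an arbitrary choice for this sketch).
    W = rng.random((n_freq, rank)) + eps
    H = rng.random((rank, n_frames)) + eps
    for _ in range(n_iter):
        # Update coefficients, then basis functions; eps avoids division by zero.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H
```

Because the updates only multiply by non-negative ratios, W and H stay non-negative throughout, which is what lets the columns of W be read as additive spectral templates.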
Abstract:
Documents published by companies, such as press releases, contain a wealth of information about the companies' various activities. They are a valuable source for business intelligence analyses. However, tools must be developed to exploit this source automatically, given its large volume. This thesis describes work that falls within one strand of business intelligence, namely the detection of business relationships between the companies described in press releases. In this thesis, we propose a classification-based approach. Existing classification methods do not yield satisfactory performance. This is due in particular to two problems: the representation of the text by all of its words, which does not necessarily help to specify a business relationship, and the imbalance between classes. To address the first problem, we propose a representation approach based on pivot words, that is, the names of the companies concerned, in order to better identify the words likely to describe them. For the second problem, we propose a two-stage classification. This method proves more appropriate than traditional resampling methods. We tested our approaches on a collection of press releases from the automotive domain. Our experiments show that the proposed approaches can improve classification performance. In particular, the pivot-word-based document representation lets us focus better on the words useful for detecting business relationships. The two-stage classification provides an effective solution to the class imbalance problem. This work shows that the automatic detection of business relationships is a feasible task. The result of this detection could be used in a business intelligence analysis.
Abstract:
Despite steady progress in computing power, memory, and the amount of available data, machine learning algorithms must use these resources efficiently. Minimizing costs is obviously an important factor, but another motivation is the search for learning mechanisms capable of reproducing the behavior of intelligent beings. This thesis addresses the problem of efficiency through several articles dealing with a variety of learning algorithms: the problem is viewed not only from the standpoint of computational efficiency (computation time and memory used), but also from that of statistical efficiency (the number of examples required to accomplish a given task). A first contribution of this thesis is to bring to light statistical inefficiencies in existing algorithms. We show that decision trees generalize poorly on certain types of tasks (chapter 3), as do classical graph-based semi-supervised learning algorithms (chapter 5), each being affected by a particular form of the curse of dimensionality. For a certain class of neural networks, called sum-product networks, we show that representing certain functions with networks that have a single hidden layer can be exponentially less efficient than representing them with deep networks (chapter 4). Our analyses provide a better understanding of certain intrinsic problems of these algorithms and point research in directions that could help solve them. We also identify computational inefficiencies in graph-based semi-supervised learning algorithms (chapter 5) and in the learning of Gaussian mixtures in the presence of missing values (chapter 6). In both cases, we propose new algorithms capable of handling significantly larger datasets. The last two chapters address computational efficiency from a different angle. In chapter 7, we theoretically analyze an existing algorithm for efficient learning in restricted Boltzmann machines (contrastive divergence), in order to better understand the reasons for its success. Finally, in chapter 8 we present an application of machine learning in the field of video games, for which the problem of computational efficiency is tied to software and hardware engineering considerations, which are often ignored in research but are extremely important in practice.
Abstract:
mbikulam Tiger Reserve of Western Ghats using geospatial technology. The major objectives of the study are land use/land cover (LULC) mapping and phytodiversity analysis. Satellite data were used to map land use/land cover with supervised classification techniques in ERDAS IMAGINE. The change over a period of 32 years was assessed using multi-temporal satellite datasets from Landsat MSS (1973), Landsat TM (1990), and IRS P6 LISS III (2005). A geospatial approach was used for the land cover analysis. Digital elevation models, satellite imagery and SOI topo sheets were the datasets used in the analysis. Vegetation sampling plots distributed over the different forest types were enumerated and studied for the phytodiversity analysis.
Abstract:
An analysis of historical Corona images, Landsat images, recent radar and Google Earth® images was conducted to determine land use and land cover changes of oasis settlements and surrounding rangelands at the fringe of the Altay Mountains from 1964 to 2008. For the Landsat datasets, supervised classification methods were used to test the suitability of the Maximum Likelihood Classifier with subsequent smoothing and the Sequential Maximum A Posteriori Classifier (SMAPC). The results show a trend typical for the steppe and desert regions of northern China. From 1964 to 2008 farmland strongly increased (+ 61%), while the area of grassland and forest in the floodplains decreased (- 43%). The urban areas increased threefold and 400 ha of former agricultural land were abandoned. Farmland apparently affected by soil salinity decreased in size from 1990 (1180 ha) to 2008 (630 ha). The vegetated areas of the surrounding rangelands decreased, mainly as a result of overgrazing and drought events. The SMAPC with subsequent post-processing yielded the highest classification accuracy. However, the specific landscape characteristics of mountain oasis systems required labour-intensive post-processing. Further research is needed to test the use of ancillary information for an automated classification of the examined landscape features.
Abstract:
At many locations in Myanmar, ongoing changes in land use have negative environmental impacts and threaten natural ecosystems at local, regional and national scales. In particular, the watershed area of Inle Lake in eastern Myanmar is strongly affected by the environmental effects of deforestation and soil erosion caused by agricultural intensification and expansion of agricultural land, which are exacerbated by the increasing population pressure and the growing number of tourists. This thesis, therefore, focuses on land use changes in traditional farming systems and their effects on socio-economic and biophysical factors to improve our understanding of sustainable natural resource management of this wetland ecosystem. The main objectives of this research were to: (1) assess the noticeable land transformations in space and time, (2) identify the typical farming systems as well as the divergent livelihood strategies, and finally, (3) estimate soil erosion risk in the different agro-ecological zones surrounding the Inle Lake watershed area. GIS and remote sensing techniques made it possible to identify the dynamic land use and land cover changes (LUCC) during the past 40 years based on historical Corona images (1968) and Landsat images (1989, 2000 and 2009). In this study, 12 land cover classes were identified and a supervised classification was used for the Landsat datasets, whereas a visual interpretation approach was used for the Corona images. Within the past 40 years, the main landscape transformation processes were deforestation (- 49%), urbanization (+ 203%), agricultural expansion (+ 34%) with a notable increase in floating gardens (+ 390%), land abandonment (+ 167%), and marshland losses in wetland area (- 83%) and water bodies (- 16%). 
The main driving forces of LUCC appeared to be high population growth, urbanization and settlements, a lack of sustainable land use and environmental management policies, widespread rural poverty, an open market economy and changes in market prices and access. To identify the diverse livelihood strategies in the Inle Lake watershed area and the diversity of income-generating activities, household surveys were conducted (total: 301 households) using a stratified random sampling design in three different agro-ecological zones: floating gardens (FG), lowland cultivation (LL) and upland cultivation (UP). A cluster and discriminant analysis revealed that livelihood strategies and socio-economic situations of local communities differed significantly among the zones. For all three zones, different livelihood strategies were identified, which differed mainly in the amount of on-farm and off-farm income and the level of income diversification. The gross margin for each household from agricultural production in the floating garden, lowland and upland cultivation was US$ 2108, 892 and 619 ha-1, respectively. Among the typical farming systems in these zones, tomato (Lycopersicon esculentum L.) plantation in the floating gardens yielded the highest net benefits, but caused negative environmental impacts given the overuse of inorganic fertilizers and pesticides. The Revised Universal Soil Loss Equation (RUSLE) and spatial analysis within GIS were applied to estimate soil erosion risk in the different agricultural zones and for the main cropping systems of the study region. The results revealed that the average soil losses in the years 1989, 2000 and 2009 amounted to 20, 10 and 26 t ha-1, respectively, and that barren land along the steep slopes had the highest soil erosion risk, with 85% of the total soil losses in the study area. 
Yearly fluctuations were mainly caused by changes in the amount of annual precipitation and the dynamics of LUCC such as deforestation and agricultural expansion with inappropriate land use and unsustainable cropping systems. Among the typical cropping systems, upland rainfed rice (Oryza sativa L.) cultivation had the highest rate of soil erosion (20 t ha-1 yr-1), followed by sebesten (Cordia dichotoma) and turmeric (Curcuma longa) plantation in the UP zone. This study indicated that the hotspot regions of soil erosion risk were upland mountain areas, especially in the western part of Inle Lake. Soil conservation practices are thus urgently needed to control soil erosion and lake sedimentation and to conserve the wetland ecosystem. Most farmers have not yet implemented soil conservation measures to reduce soil erosion impacts such as land degradation, sedimentation and water pollution in Inle Lake, which is partly due to the low economic development and poverty in the region. Key challenges of agriculture in the hilly landscapes can be summarized as follows: fostering the sustainable land use of farming systems for the maintenance of ecosystem services and functions while improving the social and economic well-being of the population, integrating natural resources management policies, and increasing the diversification of income opportunities to reduce pressure on forest and natural resources.
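The RUSLE estimate mentioned in this abstract is the product of five factors, A = R × K × LS × C × P, evaluated per raster cell in a GIS. The sketch below shows that cell-by-cell product with NumPy; the function name and all factor values in the usage note are hypothetical and are not taken from the thesis.

```python
import numpy as np

def rusle_soil_loss(R, K, LS, C, P):
    """Mean annual soil loss A = R * K * LS * C * P.

    R: rainfall erosivity, K: soil erodibility, LS: slope length and
    steepness, C: cover management, P: support practice. Inputs may be
    scalars or rasters (arrays) of the same shape.
    """
    return (np.asarray(R, dtype=float) * np.asarray(K, dtype=float)
            * np.asarray(LS, dtype=float) * np.asarray(C, dtype=float)
            * np.asarray(P, dtype=float))
```

With hypothetical values R = 100, K = 0.3, LS = 2.0, C = 0.5 and P = 1.0, the function returns a soil loss of about 30 in whatever units R and K imply (t ha-1 yr-1 when the usual SI units are used). Passing arrays instead of scalars yields an erosion risk raster.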
Abstract:
We used ground surveys to identify breeding habitat for Whimbrel (Numenius phaeopus) in the outer Mackenzie Delta, Northwest Territories, and to test the value of high-resolution IKONOS imagery for mapping additional breeding habitat in the Delta. During ground surveys, we found Whimbrel nests (n = 28) in extensive areas of wet-sedge low-centered polygon (LCP) habitat on two islands in the Delta (Taglu and Fish islands) in 2006 and 2007. Supervised classification using spectral analysis of IKONOS imagery successfully identified additional areas of wet-sedge habitat in the region. However, ground surveys to test this classification found that many areas of wet-sedge habitat had dense shrubs, no standing water, and/or lacked polygon structure and did not support breeding Whimbrel. Visual examination of the IKONOS imagery was necessary to determine which areas exhibited LCP structure. Much lower densities of nesting Whimbrel were also found in upland habitats near wetlands. We used habitat maps developed from a combination of methods to perform scenario analyses estimating the potential effects of the Mackenzie Gas Project on Whimbrel habitat. Assuming effectively complete habitat loss within 20 m, 50 m, or 250 m of any infrastructure or pipeline, the currently proposed pipeline development would result in loss of 8%, 12%, or 30% of existing Whimbrel habitat. If subsidence were to occur, most Whimbrel habitat could become unsuitable. If the facility is developed, follow-up surveys will be required to test these models.
Abstract:
The design of binary morphological operators that are translation-invariant and locally defined by a finite neighborhood window corresponds to the problem of designing Boolean functions. As in any supervised classification problem, morphological operators designed from a training sample also suffer from overfitting. Large neighborhoods tend to degrade the performance of the designed operator. This work proposes a multilevel design approach to deal with the issue of designing large neighborhood-based operators. The main idea is inspired by stacked generalization (a multilevel classifier design approach) and consists of, at each training level, combining the outcomes of the previous level operators. The final operator is a multilevel operator that ultimately depends on a larger neighborhood than that of the individual operators that have been combined. Experimental results show that two-level operators obtained by combining operators designed on subwindows of a large window consistently outperform the single-level operators designed on the full window. They also show that iterating two-level operators is an effective multilevel approach to obtain better results.
Abstract:
World Heritage sites provide a glimpse into the stories and civilizations of the past. There are currently 1007 unique World Heritage properties with 779 being classified as cultural sites, 197 as natural sites, and 31 falling into the categories of both cultural and natural sites (UNESCO & World Heritage Centre, 1992-2015). However, of these 1007 World Heritage sites, at least 46 are categorized as in danger and this number continues to grow. These unique and irreplaceable sites are exceptional because of their universality. Consequently, since World Heritage sites belong to all the people of the world and provide inspiration and admiration to all who visit them, it is our responsibility to help preserve these sites. The key form of preservation involves the individual monitoring of each site over time. While traditional methods are still extremely valuable, more recent advances in the field of geographic and spatial technologies including geographic information systems (GIS), laser scanning, and remote sensing, are becoming more beneficial for the monitoring and overall safeguarding of World Heritage sites. Through the employment and analysis of more accurately detailed spatial data, World Heritage sites can be better managed. There is a strong urgency to protect these sites. The purpose of this thesis is to describe the importance of taking care of World Heritage sites and to depict a way in which spatial technologies can be used to monitor and in effect preserve World Heritage sites through the utilization of remote sensing imagery. The research conducted in this thesis centers on the Everglades National Park, a World Heritage site that is continually affected by changes in vegetation. Data used include Landsat satellite imagery that dates from 2001-2003, the Everglades' boundaries shapefile, and Google Earth imagery. 
In order to conduct the in-depth analysis of vegetation change within the selected World Heritage site, three main techniques were applied to study changes found within the imagery. These techniques consist of conducting supervised classification for each image, incorporating a vegetation index known as the Normalized Difference Vegetation Index (NDVI), and utilizing the change detection tool available in the Environment for Visualizing Images (ENVI) software. The research and analysis conducted throughout this thesis show that within the three-year time span (2001-2003), there was an overall increase in both areas of barren soil (5.760%) and areas of vegetation (1.263%), with a decrease in the percentage of areas classified as sparsely vegetated (-6.987%). These results were gathered through the use of the maximum likelihood classification process available in the ENVI software. The results produced by the change detection tool, which further analyzed vegetation change, correlate with the results produced by the classification method. In addition, by using the NDVI method, one is able to locate changes by selecting a specific area and comparing the vegetation index generated for each date. It has been found that through the utilization of remote sensing technology, it is possible to monitor and observe changes featured within a World Heritage site. Remote sensing is an extraordinary tool that can and should be used by all site managers and organizations whose goal it is to preserve and protect World Heritage sites. Remote sensing can be used not only to observe changes over time, but also to pinpoint threats within a World Heritage site. World Heritage sites are irreplaceable sources of beauty, culture, and inspiration. It is our responsibility, as citizens of this world, to guard these treasures.
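The vegetation index used in this thesis has a simple closed form, NDVI = (NIR - Red) / (NIR + Red), so comparing two dates amounts to differencing two NDVI rasters. The sketch below assumes reflectance arrays for the near-infrared and red bands are already loaded; the function names are hypothetical, and this is not the ENVI workflow itself.

```python
import numpy as np

def ndvi(nir, red):
    """Normalized Difference Vegetation Index, in [-1, 1].

    Pixels where nir + red == 0 are set to 0 to avoid division by zero
    (an arbitrary choice for this sketch).
    """
    nir = np.asarray(nir, dtype=float)
    red = np.asarray(red, dtype=float)
    denom = nir + red
    out = np.zeros_like(denom)
    np.divide(nir - red, denom, out=out, where=denom != 0)
    return out

def ndvi_change(nir_t1, red_t1, nir_t2, red_t2):
    """Per-pixel NDVI difference between a later date (t2) and an earlier date (t1)."""
    return ndvi(nir_t2, red_t2) - ndvi(nir_t1, red_t1)
```

Positive values of the difference indicate greening between the two dates and negative values indicate vegetation loss, which is the kind of per-area comparison the abstract describes.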
Abstract:
The purpose of this paper was to evaluate attributes derived from fully polarimetric PALSAR data to discriminate and map macrophyte species in the Amazon floodplain wetlands. Fieldwork was carried out almost simultaneously with the radar acquisition, and macrophyte biomass and morphological variables were measured in the field. Attributes were calculated from the covariance matrix [C] derived from the single-look complex data. Image attributes and macrophyte variables were compared and analyzed to investigate the sensitivity of the attributes for discriminating among species. Based on these analyses, a rule-based classification was applied to map macrophyte species. Other classification approaches were tested and compared to the rule-based method: a classification based on the Freeman-Durden and Cloude-Pottier decomposition models, a hybrid classification (Wishart classifier with the input classes based on the H/α plane), and a statistically based classification (supervised classification using Wishart distance measures). The findings show that attributes derived from fully polarimetric L-band data have good potential for discriminating herbaceous plant species based on morphology and that estimation of plant biomass and productivity could be improved by using these polarimetric attributes.