938 results for image processing, microscopy, histopathology, classification, K-means


Abstract:

Autism Spectrum Disorder (ASD) is growing at a staggering rate, but little is known about the cause of this condition. Inferring learning patterns from therapeutic performance data, and subsequently clustering children with ASD into subgroups, is important for understanding this domain and, more importantly, for informing evidence-based intervention. However, this data-driven task was difficult in the past because there was insufficient data for reliable analysis. For the first time, using data from a recent application for early intervention in autism (TOBY Playpad), whose download count now exceeds 4500, we present the automatic discovery of learning patterns across 32 skills in the sensory, imitation and language areas. We use unsupervised learning methods for this task, but a notorious problem with existing methods is that the number of patterns must be specified correctly in advance, which in our case is even more difficult because of the complexity of the data. To this end, we appeal to recent Bayesian nonparametric methods, in particular Bayesian Nonparametric Factor Analysis. This model uses the Indian Buffet Process (IBP) as a prior on a binary matrix with infinitely many columns to allocate groups of intervention skills to children. The optimal number of learning patterns and the subgroup assignments are inferred automatically from the data. Our experimental results follow an exploratory approach and present several newly discovered learning patterns. To provide quantitative results, we also report a clustering evaluation against K-means and nonnegative matrix factorization (NMF). In addition to the novelty of this problem, we demonstrate the suitability of Bayesian nonparametric models over parametric rivals.

Abstract:

Courvisanos J., Jain A. and Mardaneh K. Economic resilience of regions under crises: a study of the Australian economy, Regional Studies. Identifying patterns of economic resilience in regions by industry category is the focus of this paper. Patterns emerge from adaptive capacity in four distinct functional groups of local government regions in Australia, with respect to their resilience to shocks to specific industries. A model of regional adaptive cycles around four sequential phases (reorganization, exploitation, conservation and release) is adopted as the framework for recognizing such patterns. A data-mining method uses a k-means algorithm to evaluate the impact of two major shocks (a 13-year drought and the Global Financial Crisis) on four functional groups of regions, using census data from 2001, 2006 and 2011.

Abstract:

When no prior knowledge is available, clustering is a useful technique for categorizing data into meaningful groups or clusters. In this paper, a modified fuzzy min-max (MFMM) clustering neural network is proposed. Its efficacy for tackling power quality monitoring tasks is demonstrated. A literature review on various clustering techniques is first presented. To evaluate the proposed MFMM model, a performance comparison study using benchmark data sets pertaining to clustering problems is conducted. The results obtained are comparable with those reported in the literature. Then, a real-world case study on power quality monitoring tasks is performed. The results are compared with those from the fuzzy c-means and k-means clustering methods. The experimental outcome positively indicates the potential of MFMM in undertaking data clustering tasks and its applicability to the power systems domain.

Abstract:

Users often have additional knowledge when Bayesian nonparametric (BNP) models are employed. For clustering, for example, there may be prior knowledge that some data instances should be in the same cluster (must-link constraint) or in different clusters (cannot-link constraint); similarly, for topic modeling, some words should be grouped together or kept apart because of their underlying semantics. This can be achieved by imposing appropriate sampling probabilities based on such constraints. However, the traditional inference technique for BNP models, Gibbs sampling, is time consuming and does not scale to large data. Variational approximations are faster but often do not offer good solutions. To address this, we present a small-variance asymptotic analysis of the MAP estimates of BNP models with constraints. We derive the objective function for the Dirichlet process mixture model with constraints and devise a simple and efficient K-means type algorithm. We further extend the small-variance analysis to hierarchical BNP models with constraints and devise a similarly simple objective function. Experiments on synthetic and real data sets demonstrate the efficiency and effectiveness of our algorithms.

Abstract:

The purpose of this dissertation is to evaluate, from a geographical perspective, the industrial sectors in Brazil over the last three decades. First, the objective is to assess the level of industrial specialization and concentration of the Brazilian states, using the Krugman and Gini indices, respectively. Based on the results of these two indices, the Brazilian states are separated into four groups using the K-means clustering method. Through a standard inner product between the vector of the distribution of industrial production across sectors in the states and vectors of certain characteristics of those sectors (called the Industry Characteristics Bias, VCI), we identify the types of industries in which the states are specializing and/or concentrating. A multivariate principal component analysis is performed on the VCIs, and the resulting principal components are used to assess the similarity of the states. From another perspective, we investigate the level of geographic concentration of the Brazilian industrial sectors. To this end, the Gini index and the Venables index are used. In the latter, the distance between states is not neglected when measuring concentration. The industrial sectors are separated into three groups by the K-means clustering method, in which the variables used are the principal components of the industry characteristics. Using another inner product, the State Characteristics Bias (VCE), we observe the types of states in which the industrial sectors are or are not concentrating. To visualize how these two perspectives, that is, how the characteristics of the states and of the industries, influence the location of industrial sectors in Brazilian territory, the cross-sectional econometric model of Midelfart-Knarvik et al. (2000) is established for the Brazilian case. In this econometric model, it is possible to investigate how the interaction of industry and state characteristics can determine where industry locates. The main results show that the heavy infrastructure investments of the 1970s and the trade liberalization of the 1990s were decisive for the location of Brazilian industry.
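
The specialization measure mentioned in this abstract can be illustrated with a short sketch. The abstract does not give the exact formula used in the dissertation, so the code below assumes one common form of the Krugman index: the sum of absolute differences between a state's sector shares and the national sector shares. The function name `krugman_index`, the matrix layout and the toy numbers are hypothetical.

```python
import numpy as np

def krugman_index(output):
    """Krugman-style specialization index per state (assumed form).

    output[i, k] is the industrial output of sector k in state i
    (hypothetical layout chosen for this example).
    """
    output = np.asarray(output, dtype=float)
    # Share of each sector within each state's industrial output.
    state_shares = output / output.sum(axis=1, keepdims=True)
    # Share of each sector in national industrial output (the reference).
    national_shares = output.sum(axis=0) / output.sum()
    # Sum of absolute deviations from the national sector mix.
    return np.abs(state_shares - national_shares).sum(axis=1)

# Toy data: two states, two sectors.
identical_mix = krugman_index([[50, 50], [50, 50]])
opposite_mix = krugman_index([[100, 0], [0, 100]])
```

A value of zero means a state mirrors the national sector mix; larger values mean a more distinctive mix. Other versions of the index compare each state with the average of the remaining states instead of the national total.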

Abstract:

The main objective of this study is to apply recently developed methods of statistical physics to time series analysis, in particular to electrical induction profiles from oil wells, in order to study the petrophysical similarity of those wells in a spatial distribution. To this end, we used the DFA (detrended fluctuation analysis) method to determine whether this technique can be used to characterize the fields spatially. After obtaining the DFA values for all wells, we applied cluster analysis, using the non-hierarchical method known as K-means. Usually based on the Euclidean distance, K-means divides the elements of a data matrix N into k groups so that the similarities among elements belonging to different groups are as small as possible. To test whether a dataset generated by the K-means method, or a randomly generated dataset, forms spatial patterns, we created the parameter Ω (neighborhood index). High values of Ω reveal more aggregated data, and low values of Ω indicate scattered data or data without spatial correlation. We thus concluded that the DFA data from the 54 wells are grouped and can be used to characterize spatial fields. Applying a contour-level technique, we confirmed the results obtained by K-means, which confirms that DFA is effective for spatial analysis.
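
The K-means step described in this abstract can be sketched as follows. This is a minimal NumPy illustration of Lloyd's algorithm with Euclidean distance, not the authors' implementation; the names `assign` and `kmeans`, the parameters and the toy points are assumptions made for the example.

```python
import numpy as np

def assign(X, centroids):
    """Label each row of X with the index of its nearest centroid."""
    # Squared Euclidean distance from every point to every centroid.
    d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def kmeans(X, k, n_iter=100, seed=0):
    """Partition the rows of X into k groups with Lloyd's algorithm."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    # Start from k distinct data points chosen at random.
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        labels = assign(X, centroids)
        # Move each centroid to the mean of its members; keep it if empty.
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return assign(X, centroids), centroids

# Toy data: two well-separated groups of three points each.
points = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
labels, centroids = kmeans(points, k=2)
```

The result depends on the random initialisation, so in practice the algorithm is usually run from several starting points and the partition with the lowest within-group distance is kept.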

Abstract:

In recent years, detrended fluctuation analysis (DFA), introduced by Peng, has become established as an important tool capable of detecting long-range autocorrelation in non-stationary time series. This technique has been successfully applied in various areas, such as econophysics, biophysics, medicine, physics and climatology. In this study, we used the DFA technique to obtain the Hurst exponent (H) of the density profile (RHOB) of 53 wells from the Namorados School Field. We want to know whether H can be used to spatially characterize the field data. Two cases arise. In the first, the set of H values reflects the local geology, with geographically closer wells showing similar H, so that H can be used in geostatistical procedures. In the second, each well has its own H and the information from the wells is uncorrelated: the profiles show only random fluctuations in H, without any spatial structure. Cluster analysis is a widely used method of statistical analysis. In this work we use the non-hierarchical k-means method. To verify whether a set of data generated by the k-means method shows spatial patterns, we created the parameter Ω (neighborhood index). High Ω indicates more aggregated data; low Ω indicates dispersed data or data without spatial correlation. With the help of this index and the Monte Carlo method, we verify that randomly clustered data show a distribution of Ω that is lower than the Ω of the actual clusters. We thus conclude that the H values obtained in the 53 wells are grouped and can be used to characterize spatial patterns. Contour-level analysis confirmed the k-means results.
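
The DFA step in this abstract can be illustrated with a short sketch. It assumes first-order (linear) detrending in non-overlapping windows, which is the textbook form of the method; the abstract does not state the detrending order or window sizes used in the study, so the function name `dfa_exponent`, the default `scales` and the synthetic series are assumptions.

```python
import numpy as np

def dfa_exponent(x, scales=(8, 16, 32, 64, 128)):
    """Estimate the DFA scaling exponent of a 1-D series x."""
    x = np.asarray(x, dtype=float)
    # Integrated, mean-removed series (the "profile").
    profile = np.cumsum(x - x.mean())
    fluctuations = []
    for n in scales:
        n_windows = len(profile) // n
        # Non-overlapping windows of length n, one per column.
        windows = profile[: n_windows * n].reshape(n_windows, n).T
        t = np.arange(n)
        # Least-squares straight line fitted in every window at once.
        slope, intercept = np.polyfit(t, windows, deg=1)
        trend = np.outer(t, slope) + intercept
        # Root-mean-square fluctuation around the local trends.
        fluctuations.append(np.sqrt(np.mean((windows - trend) ** 2)))
    # The exponent is the slope of log F(n) against log n.
    exponent, _ = np.polyfit(np.log(scales), np.log(fluctuations), deg=1)
    return exponent

# Synthetic check: uncorrelated noise and its running sum.
rng = np.random.default_rng(0)
noise = rng.standard_normal(8192)
alpha_noise = dfa_exponent(noise)
alpha_walk = dfa_exponent(np.cumsum(noise))
```

For uncorrelated noise the exponent is expected to be close to 0.5, and for a random walk close to 1.5; values between these indicate persistent long-range correlation.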

Abstract:

The extent of the Brazilian Atlantic rainforest, a global biodiversity hotspot, has been reduced to less than 7% of its original range. Yet it contains one of the richest butterfly faunas in the world. Butterflies are commonly used as environmental indicators, mostly because of their strict association with host plants, microclimate and resource availability. This research describes the diversity, composition and species richness of frugivorous butterflies in a forest fragment in the Brazilian Northeast. It compares communities in different physiognomies and seasons. The climate in the study area is classified as tropical rainy, with two well-defined seasons. Butterflies were captured with 60 Van Someren-Rydon traps, randomly located within six different habitat units (10 traps per unit) that varied from very open (e.g. coconut plantation) to forest interior. Sampling was carried out between January and December 2008, for five days each month. I captured 12090 individuals from 32 species. The most abundant species were Taygetis laches, Opsiphanes invirae and Hamadryas februa, which accounted for 70% of all captures. Similarity analysis identified two main groups, one of species associated with open or disturbed areas and a second of species associated with shaded areas. There was a strong seasonal component in species composition, with fewer species and lower abundance in the dry season and more species and higher abundance in the rainy season. K-means analysis indicates that the choice of habitat units overestimated faunal perception, suggesting that the units are less distinct. The species Taygetis virgilia, Hamadryas chloe, Callicore pygas and Morpho achilles were associated with less disturbed habitats, while Yphthimoides sp., Historis odius, H. acheronta, Hamadryas feronia and Siderone marthesia likely indicate open or disturbed habitats. This research provides important information for the conservation of frugivorous butterflies and will serve as a baseline for future environmental monitoring projects.

Abstract:

Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq)

Abstract:

Soils under no-tillage production systems are subject to compaction caused by machinery traffic, which makes it necessary to monitor changes in the physical environment; when unfavorable, this environment restricts root growth and may reduce crop yield. The objective of this work was to evaluate the effect of different compaction intensities on the physical quality of a medium-textured Red Latosol (Latossolo Vermelho), located in Jaboticabal (SP), under maize cultivation, using multivariate statistical methods. The experimental design was completely randomized, with six compaction intensities and four replicates. Undisturbed soil samples were collected from the 0.02-0.05, 0.08-0.11 and 0.15-0.18 m layers to determine soil bulk density (Ds) in the 0-0.20 m layer. The crop characteristics evaluated were: root density, root diameter, root dry matter, plant height, height of insertion of the first ear, stalk diameter and plant dry matter. Cluster and principal component analyses made it possible to identify three groups of high, medium and low maize plant productivity, according to soil, root system and shoot variables. The classification of the accessions into groups was carried out by three methods: hierarchical clustering, non-hierarchical k-means clustering and principal component analysis. The principal components showed that high maize yields are correlated with good shoot growth under conditions of lower soil bulk density, which provides high root dry matter production, although with small root diameter. The physical quality of the Red Latosol for maize cultivation was ensured up to a soil bulk density of 1.38 Mg m⁻³.

Abstract:

Erodibility is an extremely important factor in characterizing soil loss, representing the processes that regulate water infiltration and the soil's resistance to disaggregation and particle transport. Thus, by analyzing the spatial dependence of the principal components of erodibility (K factor), the aim was to estimate soil erodibility in a headwater area of the Córrego do Tijuco watershed, Monte Alto-SP, and to analyze the spatial variability of the soil's granulometric variables along the relief. The mean erodibility of the area was considered high, and k-means cluster analysis pointed to the formation of five groups: in the first, high contents of coarse sand (AG) and medium sand (AM) determined its distribution over flat areas; the second, characterized by a high content of fine sand (AF), is distributed over the most convex slopes; the third, with high contents of silt and very fine sand (AMF), was concentrated on the steepest slopes and concavities; the fourth, with the highest clay content, followed the water flow zones; and the fifth, with high contents of organic matter (MO) and coarse sand (AG), is distributed near the urban area. Principal component analysis (PCA) yielded four components accounting for 87.4% of the information: the first principal component (CP1), discriminated by selective particle transport, occurs mainly in localized zones of steeper slope and sediment accumulation; the second (CP2), discriminated by low cohesion between particles, shows fine sand accumulating in the lower-elevation areas throughout the water concentration area; the third (CP3), discriminated by greater soil aggregation, is concentrated mainly at the bases of steep slopes; and the fourth (CP4), discriminated by very fine sand, is distributed along the slopes at the highest altitudes. The results suggest the granulometric behavior of the soil, which proves susceptible to erosion because of its surface textural conditions and the variation of the relief.

Abstract:

The use of maps obtained from digitally processed orbital remote sensing images has become fundamental to optimizing the conservation and monitoring of coral reefs. However, the accuracy achieved in mapping submerged areas is limited by variation in the water column, which degrades the signal received by the orbital sensor and introduces errors into the final classification result. The limited capacity of traditional methods based on conventional statistical techniques to solve inter-class problems led to the search for alternative strategies in the area of Computational Intelligence. In this work, an ensemble of classifiers was built by combining Support Vector Machines and a Minimum Distance Classifier, with the objective of classifying remotely sensed images of coral reef ecosystems. The system is composed of three stages, through which the classification is progressively refined. Patterns that received an ambiguous classification at a given stage were re-evaluated in the subsequent stage. Unambiguous prediction for all the data was achieved through the reduction or elimination of false positives. The images were classified into five bottom types: deep water, underwater corals, intertidal corals, algal bottom and sandy bottom. The highest overall accuracy (89%) was obtained with the SVM using a polynomial kernel. The accuracy of the classified image was compared, by means of an error matrix, with the results obtained from other classification methods based on a single classifier (a neural network and the k-means algorithm). Finally, the comparison of results demonstrated the potential of ensemble classifiers as a tool for classifying images of submerged areas subject to noise caused by atmospheric effects and the water column.
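
The overall accuracy reported in this abstract is derived from an error (confusion) matrix. As a minimal sketch, assuming the usual definition (correctly classified samples on the diagonal divided by all samples), it can be computed as below. The function name `overall_accuracy` and the two-class counts are hypothetical and are not taken from the study.

```python
import numpy as np

def overall_accuracy(error_matrix):
    """Fraction of samples on the diagonal of a square error matrix.

    error_matrix[i, j] counts samples of reference class i
    that were assigned to class j.
    """
    m = np.asarray(error_matrix, dtype=float)
    return np.trace(m) / m.sum()

# Hypothetical two-class counts, for illustration only.
toy_matrix = [[8, 2],
              [1, 9]]
toy_accuracy = overall_accuracy(toy_matrix)  # 17 correct out of 20
```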

Abstract:

In this work, the implementation of the SOM (Self-Organizing Map) algorithm, or Kohonen neural network, is presented in the form of hierarchical structures applied to image compression. The main objective of this approach is to develop a hierarchical SOM algorithm with a static structure and another with a dynamic structure to generate codebooks in the process of image Vector Quantization (VQ), reducing processing time and obtaining a good compression rate with minimal degradation of quality relative to the original image. The two self-organizing neural networks developed here are called HSOM, for the static case, and DHSOM, for the dynamic case. In the former, the hierarchical structure is defined in advance; in the latter, the structure grows automatically according to heuristic rules that explore the training data without the use of external parameters. For this network, the heuristic rules determine the growth dynamics, the criteria for pruning branches, the flexibility and the size of the child maps. The LBG (Linde-Buzo-Gray) algorithm, or K-means, one of the most widely used algorithms for building codebooks for Vector Quantization, was used together with the Kohonen algorithm in its basic, non-hierarchical form as a reference against which to compare the performance of the proposed algorithms. A performance analysis comparing the two hierarchical structures is also carried out. The efficiency of the proposed processing is verified by the reduction in computational complexity compared with the traditional algorithms, as well as by quantitative analysis of the reconstructed images in terms of peak signal-to-noise ratio (PSNR) and mean squared error (MSE).
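
The two quality measures named in this abstract can be sketched as follows. The code assumes 8-bit grayscale images, so the peak value is 255; the abstract does not state the bit depth, and the function names `mse` and `psnr` and the toy arrays are assumptions made for the example.

```python
import numpy as np

def mse(original, reconstructed):
    """Mean squared error between two images of the same shape."""
    a = np.asarray(original, dtype=float)
    b = np.asarray(reconstructed, dtype=float)
    return np.mean((a - b) ** 2)

def psnr(original, reconstructed, peak=255.0):
    """Peak signal-to-noise ratio in decibels (peak=255 assumes 8-bit data)."""
    err = mse(original, reconstructed)
    if err == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(peak ** 2 / err)

# Toy 4x4 "images": every pixel of the reconstruction is off by 5.
original = np.zeros((4, 4))
reconstructed = np.full((4, 4), 5.0)
toy_mse = mse(original, reconstructed)
toy_psnr = psnr(original, reconstructed)
```

A lower MSE and a higher PSNR both indicate that the reconstructed image is closer to the original, which is how a codebook produced by vector quantization is usually judged.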

Abstract:

Image segmentation is one of the image processing problems that deserves special attention from the scientific community. This work studies unsupervised methods for clustering and pattern recognition applicable to medical image segmentation. Methods based on Natural Computing have proven very attractive for such tasks and are studied here in order to verify their applicability to medical image segmentation. This work implements the following methods: GKA (Genetic K-means Algorithm), GFCMA (Genetic FCM Algorithm), PSOKA (PSO and K-means based Clustering Algorithm) and PSOFCM (PSO and FCM based Clustering Algorithm). In addition, clustering validity indexes are used as a quantitative measure to evaluate the results given by the algorithms. Visual and qualitative evaluations are also performed, mainly using data from the BrainWeb brain simulator as ground truth.