891 results for Fuzzy C-Means clustering
Abstract:
Worldwide, breast cancer is the most common type of cancer and one of the leading causes of death among women. Mammography is currently the most effective method for detecting breast lesions at an early stage, and it contributes decisively to the early diagnosis of a disease that has a very high probability of cure when detected in time. Microcalcifications are among the main and most frequent findings in a mammogram and are considered an important indicator of breast cancer. Their detection and classification remain difficult, however, because mammograms show poor contrast between microcalcifications and the surrounding tissue. In addition, factors such as viewing conditions, fatigue or the professional experience of the radiologist increase the risk of missing lesions that are present. Alternatives that reduce this risk include a second opinion from another specialist or a double analysis by the same specialist. The first option raises the cost, and both lengthen the time to diagnosis. This strongly motivates the development of systems that support or assist decision making.

This thesis proposes, develops and justifies a system that detects microcalcifications in regions of interest extracted from digitized mammograms, in order to contribute to the early detection of breast cancer. The system is based on digital image processing, pattern recognition and artificial intelligence techniques. Its development takes the following into account:
1. To train and test the proposed system, a database of images is created from regions of interest extracted from digitized mammograms.
2. The top-hat transform, a digital image processing technique based on mathematical morphology operations, is applied to improve the contrast between the microcalcifications and the tissue present in the image.
3. A novel algorithm called sub-segmentation is proposed. It is based on pattern recognition techniques and applies an unsupervised clustering algorithm, PFCM (Possibilistic Fuzzy c-Means), to find the regions corresponding to microcalcifications and distinguish them from healthy tissue. To show its advantages and disadvantages, the proposed algorithm is compared with two algorithms of the same type: k-means and FCM (Fuzzy c-Means). This work is the first to use sub-segmentation to detect regions belonging to microcalcifications in mammography images.
4. Finally, a classifier based on an artificial neural network, specifically an MLP (Multi-layer Perceptron), is used. It performs a binary discrimination of patterns built from the gray-level intensities of the original image, distinguishing between microcalcification and healthy tissue.
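The contrast-enhancement step can be illustrated with a white top-hat transform, which subtracts the morphological opening of an image from the image itself so that bright details smaller than the structuring element stand out. The following is a minimal sketch in plain Python, assuming a flat square structuring element and a made-up 5 x 5 region of interest; the function names, the element size `k` and the pixel values are illustrative assumptions, not the thesis's implementation.

```python
def _window(img, y, x, k):
    """Yield the pixel values in the (2k+1) x (2k+1) window around (y, x), clipped at the borders."""
    h, w = len(img), len(img[0])
    for yy in range(max(0, y - k), min(h, y + k + 1)):
        for xx in range(max(0, x - k), min(w, x + k + 1)):
            yield img[yy][xx]


def erode(img, k):
    """Grayscale erosion: each pixel becomes the minimum of its window."""
    return [[min(_window(img, y, x, k)) for x in range(len(img[0]))]
            for y in range(len(img))]


def dilate(img, k):
    """Grayscale dilation: each pixel becomes the maximum of its window."""
    return [[max(_window(img, y, x, k)) for x in range(len(img[0]))]
            for y in range(len(img))]


def white_top_hat(img, k):
    """Image minus its opening (erosion followed by dilation)."""
    opened = dilate(erode(img, k), k)
    return [[p - o for p, o in zip(row, opened_row)]
            for row, opened_row in zip(img, opened)]


# Hypothetical region of interest: uniform tissue (10) with one bright spot (50).
roi = [[10] * 5 for _ in range(5)]
roi[2][2] = 50
enhanced = white_top_hat(roi, k=1)
```

In this toy case the uniform background is removed by the opening and only the small bright spot survives in `enhanced`, which is the behaviour the abstract relies on to separate microcalcifications from tissue.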
Abstract:
The K-means algorithm is one of the most popular clustering algorithms in current use, as it is relatively fast yet simple to understand and deploy in practice. Nevertheless, its use entails certain restrictive assumptions about the data, the negative consequences of which are not always immediately apparent, as we demonstrate. While more flexible algorithms have been developed, their widespread use has been hindered by their computational and technical complexity. Motivated by these considerations, we present a flexible alternative to K-means that relaxes most of the assumptions, whilst remaining almost as fast and simple. This novel algorithm, which we call MAP-DP (maximum a-posteriori Dirichlet process mixtures), is statistically rigorous as it is based on nonparametric Bayesian Dirichlet process mixture modeling. This approach allows us to overcome most of the limitations imposed by K-means. The number of clusters K is estimated from the data instead of being fixed a priori as in K-means. In addition, while K-means is restricted to continuous data, the MAP-DP framework can be applied to many kinds of data, for example, binary, count or ordinal data. Also, it can efficiently separate outliers from the data. This additional flexibility does not incur a significant computational overhead compared to K-means, with MAP-DP convergence typically achieved in the order of seconds for many practical problems. Finally, in contrast to K-means, since the algorithm is based on an underlying statistical model, the MAP-DP framework can deal with missing data and enables model testing such as cross validation in a principled way. We demonstrate the simplicity and effectiveness of this algorithm on the health informatics problem of clinical sub-typing in a cluster of diseases known as parkinsonism.
Abstract:
Since different pedologists will draw different soil maps of the same area, it is important to compare the differences between mapping by specialists and mapping techniques such as Digital Soil Mapping, which is currently under intensive discussion. Four detailed soil maps (scale 1:10,000) of a 182-ha sugarcane farm in the county of Rafard, São Paulo State, Brazil, were compared. The area has a large variation of soil formation factors. The maps were drawn independently by four soil scientists and compared with a fifth map obtained by a digital soil mapping technique. All pedologists were given the same set of information. As many field expeditions and soil pits as required by each surveyor were provided to define the mapping units (MUs). For the Digital Soil Map (DSM), spectral data were extracted from Landsat 5 Thematic Mapper (TM) imagery, and six terrain attributes were derived from the topographic map of the area. These data were summarized by principal component analysis, and the groups for the map design were generated by Fuzzy K-means clustering. Field observations were made to identify the soils in the MUs and classify them according to the Brazilian Soil Classification System (BSCS). To compare the conventional and digital (DSM) soil maps, they were crossed pairwise to generate confusion matrices that were mapped. The categorical analysis at each classification level of the BSCS showed that the agreement between the maps decreased towards the lower levels of classification and that the surveyor had a great influence on both the mapping and the definition of MUs in the soil map. The average correspondence between the conventional and DSM maps was similar. Therefore, the method used to obtain the DSM yielded similar results to those obtained by the conventional technique, while providing additional information about the landscape of each soil, useful for applications in future surveys of similar areas.
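The pairwise crossing of two maps can be sketched as a confusion matrix over co-located cells. The example below assumes each map has been rasterized to the same grid and flattened to a sequence of class labels; the labels "A", "B", "C" and the function names are hypothetical, and the overall-agreement figure is one simple summary rather than the specific statistic used in the study.

```python
from collections import Counter


def confusion_matrix(map_a, map_b):
    """Count how often class a in map_a coincides with class b in map_b, cell by cell."""
    if len(map_a) != len(map_b):
        raise ValueError("both maps must cover the same cells")
    return Counter(zip(map_a, map_b))


def overall_agreement(map_a, map_b):
    """Fraction of cells to which both maps assign the same class."""
    counts = confusion_matrix(map_a, map_b)
    same = sum(n for (a, b), n in counts.items() if a == b)
    return same / len(map_a)


# Hypothetical flattened maps from two surveyors over the same four cells.
surveyor_1 = ["A", "A", "B", "C"]
surveyor_2 = ["A", "B", "B", "C"]
matrix = confusion_matrix(surveyor_1, surveyor_2)
agreement = overall_agreement(surveyor_1, surveyor_2)
```

Repeating the comparison after truncating the labels to a higher classification level would show how agreement changes across levels, which is the kind of analysis the abstract describes.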
Abstract:
The objective of this work was to evaluate the potential of VIS-NIR-SWIR reflectance spectroscopy for the particle-size characterization of soil samples from different textural classes, and to obtain prediction models for clay, silt and sand contents in the soil. A set of samples representative of Latossolos and an Argissolo from five sites in the State of Mato Grosso do Sul was used. Spectra of the samples from the visible and near infrared to the shortwave infrared (350 to 2,500 nm) were obtained and analyzed. Principal component analysis (PCA), fuzzy c-means clustering, multinomial logistic regression (MLR) and partial least squares regression were used. Characteristic spectra for the different textural classes, and the segregation of samples from textural classes and collection sites with distinct characteristics by PCA, fuzzy c-means and MLR, show the semiquantitative potential of VIS-NIR-SWIR reflectance data. Satisfactory quantification was obtained for clay (R²=0.92, RPD=3.59), silt (R²=0.80, RPD=2.15) and sand (R²=0.87, RPD=2.62). Reflectance spectroscopy techniques can assist in determining soil texture and spatial variability with semiquantitative or quantitative methodologies.
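The two reported figures of merit can be computed as follows. This sketch assumes the common definitions, since the abstract does not spell them out: R² as the coefficient of determination and RPD as the standard deviation of the reference values divided by the root-mean-square error of prediction. The clay values below are made up for illustration.

```python
import math
from statistics import stdev


def r2_and_rpd(observed, predicted):
    """Return (R^2, RPD) for paired reference and predicted values."""
    n = len(observed)
    mean_obs = sum(observed) / n
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    rmse = math.sqrt(ss_res / n)
    # RPD: sample standard deviation of the reference data over the prediction error.
    return 1.0 - ss_res / ss_tot, stdev(observed) / rmse


# Hypothetical clay contents (reference) and model predictions.
clay_reference = [10, 20, 30, 40]
clay_predicted = [12, 18, 33, 39]
r2, rpd = r2_and_rpd(clay_reference, clay_predicted)
```

Some studies report R² as the squared correlation between observed and predicted values instead, so the exact numbers in a given paper depend on which convention it follows.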
Abstract:
Precision agriculture (PA) allows farmers to identify and address variations in an agricultural field. Management zones (MZs) make PA more feasible and economical. The most important method for defining MZs is a fuzzy C-means algorithm, but selecting the variables to use as the input layer in the fuzzy process is problematic. Bazzi et al. (2013) used Moran's bivariate spatial autocorrelation statistic to identify variables that are spatially correlated with yield while employing spatial autocorrelation. Bazzi et al. (2013) proposed that all redundant variables be eliminated and that the remaining variables be considered appropriate for the MZ generation process. Thus, the objective of this work, a case study, was to test the hypothesis that redundant variables can harm the MZ delineation process. This work was conducted in a 19.6-ha commercial field, and 15 MZ designs were generated by a fuzzy C-means algorithm and divided into two to five classes. Each design used a different composition of variables, including copper, silt, clay, and altitude. Some combinations of these variables produced superior MZs. None of the variable combinations produced statistically better performance than the MZ generated with no redundant variables. Thus, the other redundant variables can be discredited. The design with all variables did not provide a greater separation and organization of data among MZ classes and was not recommended.
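For readers unfamiliar with the clustering step, the following is a minimal sketch of the standard fuzzy C-means iteration in plain Python, alternating membership and center updates. It assumes Euclidean distance, fuzzifier `m = 2`, and a simple deterministic initialization; the sample values, function names and parameters are illustrative assumptions and not the configuration used in the study.

```python
def memberships(points, centers, m):
    """Fuzzy membership of every point in every cluster for fixed centers."""
    u = []
    for p in points:
        d2 = [sum((a - b) ** 2 for a, b in zip(p, v)) for v in centers]
        if min(d2) == 0:
            # Point sits exactly on one or more centers: share membership among them.
            hits = [1.0 if d == 0 else 0.0 for d in d2]
            u.append([h / sum(hits) for h in hits])
        else:
            inv = [d ** (-1.0 / (m - 1.0)) for d in d2]
            total = sum(inv)
            u.append([x / total for x in inv])
    return u


def fuzzy_c_means(points, c, m=2.0, max_iter=100, tol=1e-6):
    """Return (centers, membership matrix) after alternating the two FCM updates."""
    n, dim = len(points), len(points[0])
    # Deterministic start (an assumption of this sketch): c points spread through the data order.
    centers = [list(points[(i * n) // c]) for i in range(c)]
    for _ in range(max_iter):
        u = memberships(points, centers, m)
        new_centers = []
        for i in range(c):
            w = [row[i] ** m for row in u]
            total = sum(w)
            new_centers.append([sum(wk * p[j] for wk, p in zip(w, points)) / total
                                for j in range(dim)])
        shift = max(abs(a - b) for old, new in zip(centers, new_centers)
                    for a, b in zip(old, new))
        centers = new_centers
        if shift < tol:
            break
    return centers, memberships(points, centers, m)


# Made-up samples with two attributes each (for example two normalized soil variables).
samples = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centers, u = fuzzy_c_means(samples, c=2)
# Harden the fuzzy partition: each sample goes to its highest-membership zone.
zones = [max(range(len(row)), key=row.__getitem__) for row in u]
```

Running the same function with different subsets of input variables and with `c` from two to five would mirror the kind of comparison of MZ designs the abstract describes.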
Abstract:
Multi-parametric and quantitative magnetic resonance imaging (MRI) techniques have come into the focus of interest, both as a research and diagnostic modality for the evaluation of patients suffering from mild cognitive decline and overt dementia. In this study we address the question of whether disease-related quantitative magnetization transfer (qMT) effects within the intra- and extracellular matrices of the hippocampus may aid in the differentiation between clinically diagnosed patients with Alzheimer disease (AD), patients with mild cognitive impairment (MCI) and healthy controls. We evaluated 22 patients with AD (n=12) and MCI (n=10) and 22 healthy elderly (n=12) and younger (n=10) controls with multi-parametric MRI. Neuropsychological testing was performed in patients and elderly controls (n=34). In order to quantify the qMT effects, the absorption spectrum was sampled at relevant off-resonance frequencies. The qMT parameters were calculated according to a two-pool spin-bath model including the T1 and T2 relaxation parameters of the free pool, determined in separate experiments. Histograms (fixed bin size) of the normalized qMT parameter values (z-scores) within the anterior and posterior hippocampus (hippocampal head and body) were subjected to a fuzzy c-means classification algorithm with downstream PCA projection. The within-cluster sums of point-to-centroid distances were used to examine the effects of qMT and diffusion anisotropy parameters on the discrimination of healthy volunteers, patients with AD and patients with MCI. The qMT parameters T2(r) (T2 of the restricted pool) and F (fractional pool size) differentiated between the three groups (control, MCI and AD) in the anterior hippocampus. In our cohort, the MT ratio, as proposed in previous reports, did not differentiate between MCI and AD or between healthy controls and MCI, but did differentiate between healthy controls and AD.
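The within-cluster sum of point-to-centroid distances mentioned above can be computed as in the sketch below. It assumes hard cluster labels and plain Euclidean distance; the abstract does not say whether squared distances were used, so that choice, the function name and the toy data are assumptions.

```python
import math


def within_cluster_distance_sums(points, labels, centroids):
    """Sum, per cluster, the Euclidean distances from member points to the cluster centroid."""
    sums = [0.0] * len(centroids)
    for p, lab in zip(points, labels):
        sums[lab] += math.dist(p, centroids[lab])
    return sums


# Toy feature vectors, their cluster labels, and the cluster centroids.
features = [(0, 0), (0, 2), (5, 5)]
labels = [0, 0, 1]
centroids = [(0, 1), (5, 5)]
sums = within_cluster_distance_sums(features, labels, centroids)
```

Smaller sums indicate tighter clusters, which is why the quantity can serve as a rough measure of how well a given set of parameters separates the groups.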
Benthic foraminifera, stable isotope record and sedimentology of Holocene sediments in the Skagerrak
Abstract:
A high-resolution multi-proxy study of core MD99-2286 reveals a highly variable hydrographic environment in the Skagerrak from 9300 cal. yr BP to the present. The study includes foraminiferal faunas, stable isotopes and sedimentary parameters, as well as temperature and salinity reconstructions of a ca. 29 m long radiocarbon-dated core record. The multivariate technique fuzzy c-means was applied to the foraminiferal counts, and it was extremely valuable in defining subtle heterogeneities in the foraminiferal fauna data corresponding to hydrographic changes. The major mid-Holocene (Littorina) transgression led to flooding of large former land areas in the North Sea, the opening of the English Channel and Danish straits and initiation of the modern circulation system. This is reflected by fluctuating C/N values and an explosive bloom of Hyalinea balthica. A slight indication of ameliorated conditions between 8000 and 5750 cal. yr BP is related to the Holocene Thermal Maximum. A subsequent increase in fresh water/Baltic water influence between 5750 and 4350 cal. yr BP is reflected by dominance of Bulimina marginata and depleted δ18O values. The Neoglacial cooling (after 4350 cal. yr BP) is seen in the Skagerrak as enhanced turbidity, increasing TOC values and short-term changes in an overall Cassidulina laevigata dominated fauna, suggesting a prevailing influence of Atlantic waters. This is in agreement with increased strength of westerly winds, as recorded for this period. The last 2000 years were also dominated by Atlantic Water conditions with generally abundant nutrient supply. However, during warm periods, particularly the Medieval Warm Period and the modern warming, the area was subject to a restriction in the supply of nutrients and/or the nutrient supply had a more refractory character.
Abstract:
This work addresses the problem of characterizing the canopy of fruit trees for the site-specific application of plant protection products. The proposal uses a combined depth map (depth image) and RGB image (RGB-D), provided by the Microsoft Kinect sensor, to apply pesticides locally. The depth map makes it possible to estimate canopy density and, from this information, to determine which nozzles should be open at each moment. Algorithms implemented in Matlab were developed that, in addition to acquiring the RGB-D images, allow pesticides to be applied only to leaves and/or fruits, as desired. These algorithms were implemented in software that communicates with the "Kinect Windows SDK" development environment, which is responsible for extracting the images from the Kinect sensor. To identify leaves, classification and identification algorithms were implemented. The classification algorithms used were "Fuzzy C-Means with Gustafson Kessel" (FCM-GK) and "K-Means". The centroids, or prototypes, of each class generated by FCM-GK were used as seeds for K-Means, to speed up convergence of the algorithm and to maintain temporal coherence in the groups generated by K-Means. The classification algorithms were applied to images transformed to the L*a*b* color space; specifically, the a* and b* channels (the chromatic channels) were used in order to reduce the effect of light on the colors. The classification algorithms were configured to find four groups: leaves, porosity, fruits and trunk. Once the classifier has generated the group prototypes, a Support Vector Machine classifier with a Gaussian radial basis function kernel identifies the class of interest (leaves). The combination of these algorithms has shown low classification errors, with a 4% error in leaf identification. In addition, these algorithms process up to 8.4 images per second, which allows real-time application. The results demonstrate the feasibility of using the "Kinect" sensor to determine where and when to apply pesticides. They also show that its use has limitations imposed by lighting conditions. In other words, "Kinect" can be used outdoors, but on cloudy days, early in the morning, or at night with artificial lighting, or with a sunshade added under intense light.
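The seeding idea can be sketched as a K-means run that starts from given prototypes instead of random centers, so that cluster index i keeps referring to the same class from frame to frame. The example below is a plain-Python illustration on made-up (a*, b*) pairs with two clusters; the seed values stand in for FCM-GK prototypes and are hypothetical, as are the function and variable names.

```python
def kmeans_seeded(points, seeds, max_iter=50):
    """K-means (Lloyd iterations) started from the given seed centers."""
    centers = [tuple(s) for s in seeds]
    labels = []
    for _ in range(max_iter):
        # Assign each point to the nearest center (squared Euclidean distance).
        new_labels = [
            min(range(len(centers)),
                key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centers[i])))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Move each center to the mean of its members; empty clusters keep their seed.
        for i in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == i]
            if members:
                centers[i] = tuple(sum(col) / len(members) for col in zip(*members))
    return centers, labels


# Made-up chromatic pixel values (a*, b*) and hypothetical prototypes used as seeds.
pixels_ab = [(-20, 30), (-22, 28), (40, 20), (42, 22)]
prototypes = [(-15, 25), (35, 15)]
centers, labels = kmeans_seeded(pixels_ab, prototypes)
```

Because the seeds are already close to the final centers, few iterations are needed, and the cluster order matches the order of the prototypes, which is what keeps the groups coherent over time.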
Abstract:
Abdominal Aortic Aneurysm is a disease related to a weakening of the aortic wall that can cause rupture of the aorta and death. An unusual dilatation of a section of the aorta is indicative of this disease. However, it is difficult to diagnose because diagnosis requires imaging by computed tomography or magnetic resonance. An automatic diagnosis system would make it possible to analyze abdominal magnetic resonance images and to warn doctors if any anomaly is detected. We focus our research on magnetic resonance images because of the absence of ionizing radiation. Although there are proposals to identify this disease in magnetic resonance images, they need intervention from clinicians to be precise, and some of them are computationally expensive. In this paper we develop a novel approach to analyze abdominal magnetic resonance images and detect the lumen and the aortic wall. The method combines different algorithms in two stages to improve detection and segmentation, so it can be applied to similar problems with other types of images or structures. In the first stage, we use a spatial fuzzy C-means algorithm with morphological image analysis to detect and segment the lumen; in the second stage, we apply a graph cut algorithm to segment the aortic wall. The results on the analyzed images are successful, with an average overlap of 79% between the automatic segmentation provided by our method and the aortic wall identified by a medical specialist. The main impact of the proposed method is that it works in a completely automatic way with a low computational cost, which is of great significance for any expert and intelligent system.
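The abstract does not state which overlap measure lies behind the 79% figure. As an illustration only, the sketch below computes two common choices, the Jaccard index and the Dice coefficient, on segmentations represented as sets of pixel coordinates; the masks and function names are made up.

```python
def jaccard(mask_a, mask_b):
    """Intersection over union of two sets of pixel coordinates."""
    union = mask_a | mask_b
    return len(mask_a & mask_b) / len(union) if union else 1.0


def dice(mask_a, mask_b):
    """Twice the intersection divided by the sum of the two mask sizes."""
    total = len(mask_a) + len(mask_b)
    return 2 * len(mask_a & mask_b) / total if total else 1.0


# Hypothetical automatic and manual segmentations of the same structure.
automatic = {(0, 0), (0, 1), (1, 0), (1, 1)}
manual = {(0, 1), (1, 1), (2, 1)}
overlap_jaccard = jaccard(automatic, manual)
overlap_dice = dice(automatic, manual)
```

The two measures give different numbers for the same pair of masks, so a reported overlap percentage is only comparable across studies when the measure is known.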
Abstract:
Surface sediment samples representative for the tropical and subtropical South Atlantic (15°N to 40°S) were investigated by isothermal magnetic methods to delineate magnetic mineral distribution patterns and to identify their predominant Holocene climatic and oceanographic controls. Individual parameters reveal distinct, yet frequently overlapping, regional sedimentation characteristics. A probabilistic ('fuzzy c-means') cluster analysis was applied to five concentration independent magnetic properties assessing magnetite to hematite ratios and diagnostic of bulk and fine-particle magnetite grain size and coercivity spectra. The resultant 10 cluster structures establish an oceanwide magnetic sediment classification scheme tracing the major terrigenous eolian and fluvial fluxes, authigenic biogenic magnetite accumulation in high-productivity areas, transport by ocean current systems, and effects of bottom water velocity on depositional regimes. Distinct dissimilarities in magnetic mineral inventories between the eastern and western basins of the South Atlantic reflect prominent contrasts of both oceanic and continental influences.
Abstract:
Based on a well-established stratigraphic framework and 47 AMS-14C dated sediment cores, the distribution of facies types on the NW Iberian margin is analysed in response to the last deglacial sea-level rise, thus providing a case study on the sedimentary evolution of a high-energy, low-accumulation shelf system. Altogether, four main types of sedimentary facies are defined. (1) A gravel-dominated facies occurs mostly as time-transgressive ravinement beds, which initially developed as shoreface and storm deposits in shallow waters on the outer shelf during the last sea-level lowstand; (2) A widespread, time-transgressive mixed siliceous/biogenic-carbonaceous sand facies indicates areas of moderate hydrodynamic regimes, high contribution of reworked shelf material, and fluvial supply to the shelf; (3) A glaucony-containing sand facies in a stationary position on the outer shelf formed mostly during the last-glacial sea-level rise by reworking of older deposits as well as authigenic mineral formation; and (4) A mud facies is mostly restricted to confined Holocene fine-grained depocentres, which are located in mid-shelf position. The observed spatial and temporal distribution of these facies types on the high-energy, low-accumulation NW Iberian shelf was essentially controlled by the local interplay of sediment supply, shelf morphology, and strength of the hydrodynamic system. These patterns are in contrast to high-accumulation systems where extensive sediment supply is the dominant factor on the facies distribution. This study emphasises the importance of large-scale erosion and material recycling on the sedimentary buildup during the deglacial drowning of the shelf. The presence of a homogenous and up to 15-m thick transgressive cover above a lag horizon contradicts the common assumption of sparse and laterally confined sediment accumulation on high-energy shelf systems during deglacial sea-level rise. 
In contrast to this extensive sand cover, laterally very confined mud depocentres with a maximum thickness of 4 m developed during the Holocene sea-level highstand. This restricted formation of fine-grained depocentres was related to the combination of: (1) frequently occurring high-energy hydrodynamic conditions; (2) low overall terrigenous input by the adjacent rivers; and (3) the large distance of the Galicia Mud Belt to its main sediment supplier.
Abstract:
This paper tackles the problem of showing that evolutionary algorithms for fuzzy clustering can be more efficient than systematic (i.e. repetitive) approaches when the number of clusters in a data set is unknown. To do so, a fuzzy version of an Evolutionary Algorithm for Clustering (EAC) is introduced. A fuzzy cluster validity criterion and a fuzzy local search algorithm are used instead of their hard counterparts employed by EAC. Theoretical complexity analyses for both the systematic and evolutionary algorithms of interest are provided. Examples with computational experiments and statistical analyses are also presented.