33 results for Fuzzy C-Means clustering
in Consorci de Serveis Universitaris de Catalunya (CSUC), Spain
Abstract:
In an earlier investigation (Burger et al., 2000) five sediment cores near the Rodrigues Triple Junction in the Indian Ocean were studied applying classical statistical methods (fuzzy c-means clustering, linear mixing model, principal component analysis) for the extraction of endmembers and evaluating the spatial and temporal variation of geochemical signals. Three main factors of sedimentation were expected by the marine geologists: a volcano-genetic, a hydro-hydrothermal and an ultra-basic factor. The display of fuzzy membership values and/or factor scores versus depth provided consistent results for two factors only; the ultra-basic component could not be identified. The reason for this may be that only traditional statistical methods were applied, i.e. the untransformed components were used and the cosine-theta coefficient as similarity measure. During the last decade considerable progress in compositional data analysis was made and many case studies were published using new tools for exploratory analysis of these data. Therefore it makes sense to check if the application of suitable data transformations, reduction of the D-part simplex to two or three factors and visual interpretation of the factor scores would lead to a revision of earlier results and to answers to open questions. In this paper we follow the lines of a paper of R. Tolosana-Delgado et al. (2005) starting with a problem-oriented interpretation of the biplot scattergram, extracting compositional factors, ilr-transformation of the components and visualization of the factor scores in a spatial context: The compositional factors will be plotted versus depth (time) of the core samples in order to facilitate the identification of the expected sources of the sedimentary process. Key words: compositional data analysis, biplot, deep sea sediments
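The ilr-transformation named in this abstract can be illustrated with a minimal sketch. It assumes one particular orthonormal basis (pivot-style balances), which the abstract does not specify. The function names `closure` and `ilr` and the sample values are hypothetical and are not taken from the paper.

```python
import math

def closure(parts):
    """Rescale positive parts so that they sum to 1 (a composition)."""
    total = sum(parts)
    return [p / total for p in parts]

def ilr(parts):
    """Isometric log-ratio coordinates of a D-part composition.

    Assumed basis: coordinate i contrasts the geometric mean of the
    first i parts with part i+1, scaled by sqrt(i / (i + 1)).
    Returns D - 1 unconstrained real coordinates.
    """
    logs = [math.log(p) for p in closure(parts)]
    coords = []
    for i in range(1, len(logs)):
        mean_log = sum(logs[:i]) / i  # log of the geometric mean of parts 1..i
        coords.append(math.sqrt(i / (i + 1)) * (mean_log - logs[i]))
    return coords

# Hypothetical 3-part geochemical sample, given in two different units.
sample_percent = [20.0, 30.0, 50.0]
sample_fraction = [0.2, 0.3, 0.5]
z_percent = ilr(sample_percent)
z_fraction = ilr(sample_fraction)
```

Because the coordinates depend only on ratios between parts, the two versions of the sample map to the same point, and ordinary multivariate tools (factor extraction, plotting against depth) can then be applied to the coordinates.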
Abstract:
Zonal management in vineyards requires the prior delineation of stable yield zones within the parcel. Among the different methodologies used for zone delineation, cluster analysis of yield data from several years is one of the possibilities cited in scientific literature. However, there exist reasonable doubts concerning the cluster algorithm to be used and the number of zones that have to be delineated within a field. In this paper two different cluster algorithms have been compared (k-means and fuzzy c-means) using the grape yield data corresponding to three successive years (2002, 2003 and 2004), for a ‘Pinot Noir’ vineyard parcel. Final choice of the most recommendable algorithm has been linked to obtaining a stable pattern of spatial yield distribution and to allowing for the delineation of compact and average sized areas. The general recommendation is to use reclassified maps of two clusters or yield classes (low yield zone and high yield zone) and, consequently, the site-specific vineyard management should be based on the prior delineation of just two different zones or sub-parcels. The two tested algorithms are good options for this purpose. However, the fuzzy c-means algorithm allows for a better zoning of the parcel, forming more compact areas and with more equilibrated zonal differences over time.
Abstract:
This project presents a scientific study of methods for generating synthetic data within the area of data privacy. These methods make it possible to control both the transfer of sensitive data to third parties and the statistical utility of the synthetically generated data. All the basic concepts needed to orient the reader are introduced, and one of the most widely used existing methods (IPSO) is analyzed. A new method for synthetic data generation (FCRM) is then proposed; it is based on Fuzzy c-Regression and allows the balance between information loss and disclosure risk to be controlled through a parameter c.
Diagnostic improvement of diffuse liver diseases using artificial intelligence techniques (Mejora diagnóstica de hepatopatías de afectación difusa mediante técnicas de inteligencia artificial)
Abstract:
Automatic diagnostic discrimination is an application of artificial intelligence techniques that can solve clinical cases based on imaging. Diffuse liver diseases are highly prevalent in the population and follow an insidious course, particularly early in their progression. Early and effective diagnosis is necessary because many of these diseases progress to cirrhosis and liver cancer. The usual technique of choice for accurate diagnosis is liver biopsy, which is invasive and not free of contraindications. This project proposes an alternative method, non-invasive and free of contraindications, based on liver ultrasonography. The images are digitized and then analyzed using statistical techniques and texture analysis. The results are validated against the pathology report. Finally, we apply artificial intelligence techniques such as Fuzzy k-Means or Support Vector Machines and compare their significance with that of the statistical analysis and the clinician's report. The results show that this technique is significantly valid and a promising noninvasive alternative for diagnosing chronic liver disease with diffuse involvement. Artificial intelligence classification techniques significantly improve diagnostic discrimination compared with the other statistical methods.
Abstract:
In this paper we present the results of applying several visual methods to a group of locations, dated between the 6th and 1st centuries BC, in the ager Tarraconensis (Tarragona, Spain), the hinterland of the Roman colony of Tarraco. The difficulty of interpreting the diverse results in a combined way has been resolved by means of statistical methods, such as Principal Components Analysis (PCA) and K-means clustering analysis. These methods have allowed us to classify the sites according to the visual structure of the landscape that contains them and the visual relationships that could exist among them.
Abstract:
Our purpose is to provide a set-theoretical frame for clustering fuzzy relational data, based essentially on the cardinality of the fuzzy subsets that represent objects and of their complements, without applying any crisp property. From this perspective we define a family of fuzzy similarity indexes which includes a set of fuzzy indexes introduced by Tolias et al., and we analyze under which conditions it defines a fuzzy proximity relation. Following an original idea due to S. Miyamoto, we evaluate the similarity between objects and features by means of the same mathematical procedure. Joining these concepts and methods, we establish an algorithm for clustering fuzzy relational data. Finally, we present an example to make the whole process clear.
Abstract:
HEMOLIA (a project under the European Community's 7th Framework Programme) is a new-generation Anti-Money Laundering (AML) intelligent multi-agent alert and investigation system which, in addition to the traditional financial data, makes extensive use of modern society's huge telecom data source, thereby opening up a new dimension of capabilities to all Money Laundering fighters (FIUs, LEAs) and Financial Institutes (Banks, Insurance Companies, etc.). This Master's thesis project was done at AIA, one of the partners in the HEMOLIA project, in Barcelona. The objective of this thesis is to find the clusters in a network drawn from the financial data. An extensive literature survey has been carried out and several standard algorithms related to networks have been studied and implemented. The clustering problem is an NP-hard problem, and several algorithms such as K-Means and Hierarchical clustering are being implemented for studying problems relating to sociology, evolution, anthropology, etc. However, these algorithms have certain drawbacks which make them very difficult to implement. The thesis suggests (a) a possible improvement to the K-Means algorithm, (b) a novel approach to the clustering problem using Genetic Algorithms and (c) a new algorithm for finding the cluster of a node using the Genetic Algorithm.
Abstract:
In 2000 the European Statistical Office published the guidelines for developing the Harmonized European Time Use Surveys system. Under such a unified framework, the first Time Use Survey of national scope was conducted in Spain during 2002–03. The aim of these surveys is to understand human behavior and the lifestyle of people. Time allocation data are of compositional nature in origin, that is, they are subject to non-negativity and constant-sum constraints. Thus, standard multivariate techniques cannot be directly applied to analyze them. The goal of this work is to identify homogeneous Spanish Autonomous Communities with regard to the typical activity pattern of their respective populations. To this end, a fuzzy clustering approach is followed. Rather than the hard partitioning of classical clustering, where objects are allocated to only a single group, fuzzy methods identify overlapping groups of objects by allowing them to belong to more than one group. Concretely, the probabilistic fuzzy c-means algorithm is conveniently adapted to deal with the Spanish Time Use Survey microdata. As a result, a map distinguishing Autonomous Communities with similar activity pattern is drawn. Key words: Time use data; Fuzzy clustering; FCM; simplex space; Aitchison distance
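One way to picture fuzzy c-means under the Aitchison distance is the minimal sketch below. It is an illustration under stated assumptions, not the adaptation used in the paper: compositions are mapped to clr coordinates, where the Aitchison distance becomes the ordinary Euclidean distance, and the standard probabilistic fuzzy c-means updates are then run with fuzzifier m = 2. The deterministic choice of initial centers, the function names and the time-use shares are all hypothetical.

```python
import math

def clr(parts):
    """Centered log-ratio transform; Euclidean distance between clr
    vectors equals the Aitchison distance between compositions."""
    logs = [math.log(p) for p in parts]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]

def sq_dist(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b))

def memberships(points, centers, m):
    """u[k][i] is proportional to (squared distance)^(-1/(m-1)); rows sum to 1."""
    rows = []
    for x in points:
        inv = [max(sq_dist(x, v), 1e-12) ** (-1.0 / (m - 1.0)) for v in centers]
        total = sum(inv)
        rows.append([w / total for w in inv])
    return rows

def fuzzy_c_means(points, c, m=2.0, iters=50):
    n, dim = len(points), len(points[0])
    # Assumption: evenly spaced data points serve as initial centers.
    centers = [list(points[i * n // c]) for i in range(c)]
    for _ in range(iters):
        u = memberships(points, centers, m)
        for i in range(c):
            w = [row[i] ** m for row in u]
            total = sum(w)
            centers[i] = [sum(wk * x[j] for wk, x in zip(w, points)) / total
                          for j in range(dim)]
    return memberships(points, centers, m), centers

# Hypothetical shares of the day spent on (work, leisure, other) in six regions.
shares = [
    [0.60, 0.30, 0.10], [0.58, 0.32, 0.10], [0.62, 0.28, 0.10],
    [0.10, 0.30, 0.60], [0.12, 0.28, 0.60], [0.10, 0.32, 0.58],
]
u, centers = fuzzy_c_means([clr(s) for s in shares], c=2)
```

Each row of `u` holds the degrees to which one region belongs to each group, which is what allows overlapping groups instead of a hard partition.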
Abstract:
Many classification systems rely on clustering techniques in which a collection of training examples is provided as an input, and a number of clusters c1,...cm modelling some concept C results as an output, such that every cluster ci is labelled as positive or negative. Given a new, unlabelled instance enew, the above classification is used to determine to which particular cluster ci this new instance belongs. In such a setting clusters can overlap, and a new unlabelled instance can be assigned to more than one cluster with conflicting labels. In the literature, such a case is usually solved non-deterministically by making a random choice. This paper presents a novel, hybrid approach to solve this situation by combining a neural network for classification along with a defeasible argumentation framework which models preference criteria for performing clustering.
Abstract:
PLFC is a first-order possibilistic logic dealing with fuzzy constants and fuzzily restricted quantifiers. The refutation proof method in PLFC is mainly based on a generalized resolution rule which allows an implicit graded unification among fuzzy constants. However, unification for precise object constants is classical. In order to use PLFC for similarity-based reasoning, in this paper we extend a Horn-rule sublogic of PLFC with similarity-based unification of object constants. The Horn-rule sublogic of PLFC we consider deals only with disjunctive fuzzy constants and it is equipped with a simple and efficient version of PLFC proof method. At the semantic level, it is extended by equipping each sort with a fuzzy similarity relation, and at the syntactic level, by fuzzily “enlarging” each non-fuzzy object constant in the antecedent of a Horn-rule by means of a fuzzy similarity relation.
Abstract:
The generator problem was posed by Kadison in 1967, and it remains open to this day. We provide a solution for the class of C*-algebras absorbing the Jiang-Su algebra Z tensorially. More precisely, we show that every unital, separable, Z-stable C*-algebra A is singly generated, which means that there exists an element x ∈ A that is not contained in any proper sub-C*-algebra of A. To give applications of our result, we observe that Z can be embedded into the reduced group C*-algebra of a discrete group that contains a non-cyclic, free subgroup. It follows that certain tensor products with reduced group C*-algebras are singly generated. In particular, C*_r(F_∞) ⊗ C*_r(F_∞) is singly generated.
Abstract:
In image segmentation, clustering algorithms are very popular because they are intuitive and, some of them, easy to implement. For instance, k-means is one of the most used in the literature, and many authors successfully compare their new proposal with the results achieved by k-means. However, it is well known that clustering-based image segmentation has many problems. For instance, the number of regions of the image has to be known a priori, and different initial seed placements (initial clusters) can produce different segmentation results. Most of these algorithms could be slightly improved by considering the coordinates of the image as features in the clustering process (to take spatial region information into account). In this paper we propose a significant improvement of clustering algorithms for image segmentation. The method is qualitatively and quantitatively evaluated over a set of synthetic and real images, and compared with classical clustering approaches. Results demonstrate the validity of this new approach.
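The idea of using pixel coordinates as extra clustering features, mentioned in this abstract as a baseline refinement, can be sketched as follows. This is not the improved method the paper proposes, which the abstract does not describe. The plain Lloyd-style k-means, the `spatial_weight` parameter, the deterministic initial centers and the tiny image are assumptions made for illustration.

```python
def pixel_features(image, spatial_weight=0.5):
    """One feature vector per pixel: intensity plus weighted (row, col)."""
    return [[float(value), spatial_weight * r, spatial_weight * c]
            for r, row in enumerate(image)
            for c, value in enumerate(row)]

def sq_dist(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b))

def k_means(points, k, iters=20):
    n = len(points)
    # Assumption: evenly spaced data points serve as initial centers.
    centers = [list(points[i * n // k]) for i in range(k)]
    labels = [0] * n
    for _ in range(iters):
        # Assignment step: nearest center for every point.
        labels = [min(range(k), key=lambda i: sq_dist(p, centers[i]))
                  for p in points]
        # Update step: each center moves to the mean of its members.
        for i in range(k):
            members = [p for p, lab in zip(points, labels) if lab == i]
            if members:
                centers[i] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Hypothetical 2x4 grayscale image: dark left half, bright right half.
image = [[10, 12, 200, 205],
         [11, 9, 198, 202]]
labels = k_means(pixel_features(image), k=2)
```

The labels come back in row-major order, one per pixel. Raising `spatial_weight` makes nearby pixels more likely to share a cluster, at the cost of weakening the intensity contrast between regions.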
Abstract:
Our essay aims at studying suitable statistical methods for the clustering of compositional data in situations where observations are constituted by trajectories of compositional data, that is, by sequences of composition measurements along a domain. Observed trajectories are known as “functional data” and several methods have been proposed for their analysis. In particular, methods for clustering functional data, known as Functional Cluster Analysis (FCA), have been applied by practitioners and scientists in many fields. To our knowledge, FCA techniques have not been extended to cope with the problem of clustering compositional data trajectories. In order to extend FCA techniques to the analysis of compositional data, FCA clustering techniques have to be adapted by using a suitable compositional algebra. The present work centres on the following question: given a sample of compositional data trajectories, how can we formulate a segmentation procedure giving homogeneous classes? To address this problem we follow the steps described below. First of all we adapt the well-known spline smoothing techniques in order to cope with the smoothing of compositional data trajectories. In fact, an observed curve can be thought of as the sum of a smooth part plus some noise due to measurement errors. Spline smoothing techniques are used to isolate the smooth part of the trajectory: clustering algorithms are then applied to these smooth curves. The second step consists in building suitable metrics for measuring the dissimilarity between trajectories: we propose a metric that accounts for differences in both shape and level, and a metric accounting for differences in shape only. A simulation study is performed in order to evaluate the proposed methodologies, using both hierarchical and partitional clustering algorithms. The quality of the obtained results is assessed by means of several indices.
Abstract:
Background: Non-invasive monitoring of respiratory muscle function is an area of increasing research interest, resulting in the appearance of new monitoring devices, one of these being piezoelectric contact sensors. The present study was designed to test whether the use of piezoelectric contact (non-invasive) sensors could be useful in respiratory monitoring, in particular in measuring the timing of diaphragmatic contraction. Methods: Experiments were performed in an animal model: three pentobarbital anesthetized mongrel dogs. The motion of the thoracic cage was acquired by means of a piezoelectric contact sensor placed on the costal wall. This signal is compared with direct measurements of the diaphragmatic muscle length, made by sonomicrometry. Furthermore, to assess the diaphragmatic function other respiratory signals were acquired: respiratory airflow and transdiaphragmatic pressure. Diaphragm contraction time was estimated with these four signals. Using diaphragm length signal as reference, contraction times estimated with the other three signals were compared with the contraction time estimated with diaphragm length signal. Results: The contraction time estimated with the TM signal tends to give a reading 0.06 seconds lower than the measure made with the DL signal (-0.21 and 0.00 for FL and DP signals, respectively), with a standard deviation of 0.05 seconds (0.08 and 0.06 for FL and DP signals, respectively). Correlation coefficients indicated a close link between time contraction estimated with TM signal and contraction time estimated with DL signal (a Pearson correlation coefficient of 0.98, a reliability coefficient of 0.95, a slope of 1.01 and a Spearman's rank-order coefficient of 0.98). In general, correlation coefficients and mean and standard deviation of the difference were better in the inspiratory load respiratory test than in spontaneous ventilation tests. Conclusion: The technique presented in this work provides a non-invasive method to assess the timing of diaphragmatic contraction in canines, using a piezoelectric contact sensor placed on the costal wall.
Abstract:
In the present work, an analysis of the dark and optical capacitance transients obtained from Schottky Au:GaAs barriers implanted with boron has been carried out by means of the isothermal transient spectroscopy (ITS) and differential and optical ITS techniques. Unlike deep level transient spectroscopy, the use of these techniques allows one to easily distinguish contributions to the transients different from those of the usual deep trap emission kinetics. The results obtained show the artificial creation of the EL2, EL6, and EL5 defects by the boron implantation process. Moreover, the interaction mechanism between the EL2 and other defects, which gives rise to the U band, has been analyzed. The existence of a reorganization process of the defects involved has been observed, which prevents the interaction as the temperature increases. The activation energy of this process has been found to be dependent on the temperature of the annealing treatment after implantation, with values of 0.51 and 0.26 eV for the as‐implanted and 400 °C annealed samples, respectively. The analysis of the optical data has corroborated the existence of such interactions involving all the observed defects that affect their optical parameters