21 resultados para clustering techniques

em Universitat de Girona, Spain


Relevância:

70.00% 70.00%

Publicador:

Resumo:

Our essay aims at studying suitable statistical methods for the clustering of compositional data in situations where observations are constituted by trajectories of compositional data, that is, by sequences of composition measurements along a domain. Observed trajectories are known as “functional data” and several methods have been proposed for their analysis. In particular, methods for clustering functional data, known as Functional Cluster Analysis (FCA), have been applied by practitioners and scientists in many fields. To our knowledge, FCA techniques have not been extended to cope with the problem of clustering compositional data trajectories. In order to extend FCA techniques to the analysis of compositional data, FCA clustering techniques have to be adapted by using a suitable compositional algebra. The present work centres on the following question: given a sample of compositional data trajectories, how can we formulate a segmentation procedure giving homogeneous classes? To address this problem we follow the steps described below. First of all we adapt the well-known spline smoothing techniques in order to cope with the smoothing of compositional data trajectories. In fact, an observed curve can be thought of as the sum of a smooth part plus some noise due to measurement errors. Spline smoothing techniques are used to isolate the smooth part of the trajectory: clustering algorithms are then applied to these smooth curves. The second step consists in building suitable metrics for measuring the dissimilarity between trajectories: we propose a metric that accounts for difference in both shape and level, and a metric accounting for differences in shape only. A simulation study is performed in order to evaluate the proposed methodologies, using both hierarchical and partitional clustering algorithm. The quality of the obtained results is assessed by means of several indices

Relevância:

30.00% 30.00%

Publicador:

Resumo:

In 2000 the European Statistical Office published the guidelines for developing the Harmonized European Time Use Surveys system. Under such a unified framework, the first Time Use Survey of national scope was conducted in Spain during 2002– 03. The aim of these surveys is to understand human behavior and the lifestyle of people. Time allocation data are of compositional nature in origin, that is, they are subject to non-negativity and constant-sum constraints. Thus, standard multivariate techniques cannot be directly applied to analyze them. The goal of this work is to identify homogeneous Spanish Autonomous Communities with regard to the typical activity pattern of their respective populations. To this end, fuzzy clustering approach is followed. Rather than the hard partitioning of classical clustering, where objects are allocated to only a single group, fuzzy method identify overlapping groups of objects by allowing them to belong to more than one group. Concretely, the probabilistic fuzzy c-means algorithm is conveniently adapted to deal with the Spanish Time Use Survey microdata. As a result, a map distinguishing Autonomous Communities with similar activity pattern is drawn. Key words: Time use data, Fuzzy clustering; FCM; simplex space; Aitchison distance

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Estudi, disseny i implementació de diferents tècniques d’agrupament de fibres (clustering) per tal d’integrar a la plataforma DTIWeb diferents algorismes de clustering i tècniques de visualització de clústers de fibres de forma que faciliti la interpretació de dades de DTI als especialistes

Relevância:

30.00% 30.00%

Publicador:

Resumo:

En aquesta tesi s’estudia el problema de la segmentació del moviment. La tesi presenta una revisió dels principals algoritmes de segmentació del moviment, s’analitzen les característiques principals i es proposa una classificació de les tècniques més recents i importants. La segmentació es pot entendre com un problema d’agrupament d’espais (manifold clustering). Aquest estudi aborda alguns dels reptes més difícils de la segmentació de moviment a través l’agrupament d’espais. S’han proposat nous algoritmes per a l’estimació del rang de la matriu de trajectòries, s’ha presenta una mesura de similitud entre subespais, s’han abordat problemes relacionats amb el comportament dels angles canònics i s’ha desenvolupat una eina genèrica per estimar quants moviments apareixen en una seqüència. L´ultima part de l’estudi es dedica a la correcció de l’estimació inicial d’una segmentació. Aquesta correcció es du a terme ajuntant els problemes de la segmentació del moviment i de l’estructura a partir del moviment.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The main instrument used in psychological measurement is the self-report questionnaire. One of its major drawbacks however is its susceptibility to response biases. A known strategy to control these biases has been the use of so-called ipsative items. Ipsative items are items that require the respondent to make between-scale comparisons within each item. The selected option determines to which scale the weight of the answer is attributed. Consequently in questionnaires only consisting of ipsative items every respondent is allotted an equal amount, i.e. the total score, that each can distribute differently over the scales. Therefore this type of response format yields data that can be considered compositional from its inception. Methodological oriented psychologists have heavily criticized this type of item format, since the resulting data is also marked by the associated unfavourable statistical properties. Nevertheless, clinicians have kept using these questionnaires to their satisfaction. This investigation therefore aims to evaluate both positions and addresses the similarities and differences between the two data collection methods. The ultimate objective is to formulate a guideline when to use which type of item format. The comparison is based on data obtained with both an ipsative and normative version of three psychological questionnaires, which were administered to 502 first-year students in psychology according to a balanced within-subjects design. Previous research only compared the direct ipsative scale scores with the derived ipsative scale scores. The use of compositional data analysis techniques also enables one to compare derived normative score ratios with direct normative score ratios. The addition of the second comparison not only offers the advantage of a better-balanced research strategy. In principle it also allows for parametric testing in the evaluation

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In order to obtain a high-resolution Pleistocene stratigraphy, eleven continuously cored boreholes, 100 to 220m deep were drilled in the northern part of the Po Plain by Regione Lombardia in the last five years. Quantitative provenance analysis (QPA, Weltje and von Eynatten, 2004) of Pleistocene sands was carried out by using multivariate statistical analysis (principal component analysis, PCA, and similarity analysis) on an integrated data set, including high-resolution bulk petrography and heavy-mineral analyses on Pleistocene sands and of 250 major and minor modern rivers draining the southern flank of the Alps from West to East (Garzanti et al, 2004; 2006). Prior to the onset of major Alpine glaciations, metamorphic and quartzofeldspathic detritus from the Western and Central Alps was carried from the axial belt to the Po basin longitudinally parallel to the SouthAlpine belt by a trunk river (Vezzoli and Garzanti, 2008). This scenario rapidly changed during the marine isotope stage 22 (0.87 Ma), with the onset of the first major Pleistocene glaciation in the Alps (Muttoni et al, 2003). PCA and similarity analysis from core samples show that the longitudinal trunk river at this time was shifted southward by the rapid southward and westward progradation of transverse alluvial river systems fed from the Central and Southern Alps. Sediments were transported southward by braided river systems as well as glacial sediments transported by Alpine valley glaciers invaded the alluvial plain. Kew words: Detrital modes; Modern sands; Provenance; Principal Components Analysis; Similarity, Canberra Distance; palaeodrainage

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Often practical performance of analytical redundancy for fault detection and diagnosis is decreased by uncertainties prevailing not only in the system model, but also in the measurements. In this paper, the problem of fault detection is stated as a constraint satisfaction problem over continuous domains with a big number of variables and constraints. This problem can be solved using modal interval analysis and consistency techniques. Consistency techniques are then shown to be particularly efficient to check the consistency of the analytical redundancy relations (ARRs), dealing with uncertain measurements and parameters. Through the work presented in this paper, it can be observed that consistency techniques can be used to increase the performance of a robust fault detection tool, which is based on interval arithmetic. The proposed method is illustrated using a nonlinear dynamic model of a hydraulic system

Relevância:

20.00% 20.00%

Publicador:

Resumo:

One of the major problems in machine vision is the segmentation of images of natural scenes. This paper presents a new proposal for the image segmentation problem which has been based on the integration of edge and region information. The main contours of the scene are detected and used to guide the posterior region growing process. The algorithm places a number of seeds at both sides of a contour allowing stating a set of concurrent growing processes. A previous analysis of the seeds permits to adjust the homogeneity criterion to the regions's characteristics. A new homogeneity criterion based on clustering analysis and convex hull construction is proposed

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This work provides a general description of the multi sensor data fusion concept, along with a new classification of currently used sensor fusion techniques for unmanned underwater vehicles (UUV). Unlike previous proposals that focus the classification on the sensors involved in the fusion, we propose a synthetic approach that is focused on the techniques involved in the fusion and their applications in UUV navigation. We believe that our approach is better oriented towards the development of sensor fusion systems, since a sensor fusion architecture should be first of all focused on its goals and then on the fused sensors

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In image segmentation, clustering algorithms are very popular because they are intuitive and, some of them, easy to implement. For instance, the k-means is one of the most used in the literature, and many authors successfully compare their new proposal with the results achieved by the k-means. However, it is well known that clustering image segmentation has many problems. For instance, the number of regions of the image has to be known a priori, as well as different initial seed placement (initial clusters) could produce different segmentation results. Most of these algorithms could be slightly improved by considering the coordinates of the image as features in the clustering process (to take spatial region information into account). In this paper we propose a significant improvement of clustering algorithms for image segmentation. The method is qualitatively and quantitative evaluated over a set of synthetic and real images, and compared with classical clustering approaches. Results demonstrate the validity of this new approach

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Obtaining automatic 3D profile of objects is one of the most important issues in computer vision. With this information, a large number of applications become feasible: from visual inspection of industrial parts to 3D reconstruction of the environment for mobile robots. In order to achieve 3D data, range finders can be used. Coded structured light approach is one of the most widely used techniques to retrieve 3D information of an unknown surface. An overview of the existing techniques as well as a new classification of patterns for structured light sensors is presented. This kind of systems belong to the group of active triangulation method, which are based on projecting a light pattern and imaging the illuminated scene from one or more points of view. Since the patterns are coded, correspondences between points of the image(s) and points of the projected pattern can be easily found. Once correspondences are found, a classical triangulation strategy between camera(s) and projector device leads to the reconstruction of the surface. Advantages and constraints of the different patterns are discussed

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The absolute necessity of obtaining 3D information of structured and unknown environments in autonomous navigation reduce considerably the set of sensors that can be used. The necessity to know, at each time, the position of the mobile robot with respect to the scene is indispensable. Furthermore, this information must be obtained in the least computing time. Stereo vision is an attractive and widely used method, but, it is rather limited to make fast 3D surface maps, due to the correspondence problem. The spatial and temporal correspondence among images can be alleviated using a method based on structured light. This relationship can be directly found codifying the projected light; then each imaged region of the projected pattern carries the needed information to solve the correspondence problem. We present the most significant techniques, used in recent years, concerning the coded structured light method

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The speed of fault isolation is crucial for the design and reconfiguration of fault tolerant control (FTC). In this paper the fault isolation problem is stated as a constraint satisfaction problem (CSP) and solved using constraint propagation techniques. The proposed method is based on constraint satisfaction techniques and uncertainty space refining of interval parameters. In comparison with other approaches based on adaptive observers, the major advantage of the presented method is that the isolation speed is fast even taking into account uncertainty in parameters, measurements and model errors and without the monotonicity assumption. In order to illustrate the proposed approach, a case study of a nonlinear dynamic system is presented

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Our purpose is to provide a set-theoretical frame to clustering fuzzy relational data basically based on cardinality of the fuzzy subsets that represent objects and their complementaries, without applying any crisp property. From this perspective we define a family of fuzzy similarity indexes which includes a set of fuzzy indexes introduced by Tolias et al, and we analyze under which conditions it is defined a fuzzy proximity relation. Following an original idea due to S. Miyamoto we evaluate the similarity between objects and features by means the same mathematical procedure. Joining these concepts and methods we establish an algorithm to clustering fuzzy relational data. Finally, we present an example to make clear all the process

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Pantoea agglomerans strains are among the most promising biocontrol agents for a variety of bacterial and fungal plant diseases, particularly fire blight of apple and pear. However, commercial registration of P. agglomerans biocontrol products is hampered because this species is currently listed as a biosafety level 2 (BL2) organism due to clinical reports as an opportunistic human pathogen. This study compares plant-origin and clinical strains in a search for discriminating genotypic/phenotypic markers using multi-locus phylogenetic analysis and fluorescent amplified fragment length polymorphisms (fAFLP) fingerprinting. Results: Majority of the clinical isolates from culture collections were found to be improperly designated as P. agglomerans after sequence analysis. The frequent taxonomic rearrangements underwent by the Enterobacter agglomerans/Erwinia herbicola complex may be a major problem in assessing clinical associations within P. agglomerans. In the P. agglomerans sensu stricto (in the stricter sense) group, there was no discrete clustering of clinical/biocontrol strains and no marker was identified that was uniquely associated to clinical strains. A putative biocontrol-specific fAFLP marker was identified only in biocontrol strains. The partial ORF located in this band corresponded to an ABC transporter that was found in all P. agglomerans strains. Conclusion: Taxonomic mischaracterization was identified as a major problem with P. agglomerans, and current techniques removed a majority of clinical strains from this species. Although clear discrimination between P. agglomerans plant and clinical strains was not obtained with phylogenetic analysis, a single marker characteristic of biocontrol strains was identified which may be of use in strain biosafety determinations. In addition, the lack of Koch's postulate fulfilment, rare retention of clinical strains for subsequent confirmation, and the polymicrobial nature of P. agglomerans clinical reports should be considered in biosafety assessment of beneficial strains in this species