4 resultados para Automatic Gridding of microarray images
em AMS Tesi di Dottorato - Alm@DL - Università di Bologna
Resumo:
The main aim of this Ph.D. dissertation is the study of clustering dependent data by means of copula functions with particular emphasis on microarray data. Copula functions are a popular multivariate modeling tool in each field where the multivariate dependence is of great interest and their use in clustering has not been still investigated. The first part of this work contains the review of the literature of clustering methods, copula functions and microarray experiments. The attention focuses on the K–means (Hartigan, 1975; Hartigan and Wong, 1979), the hierarchical (Everitt, 1974) and the model–based (Fraley and Raftery, 1998, 1999, 2000, 2007) clustering techniques because their performance is compared. Then, the probabilistic interpretation of the Sklar’s theorem (Sklar’s, 1959), the estimation methods for copulas like the Inference for Margins (Joe and Xu, 1996) and the Archimedean and Elliptical copula families are presented. In the end, applications of clustering methods and copulas to the genetic and microarray experiments are highlighted. The second part contains the original contribution proposed. A simulation study is performed in order to evaluate the performance of the K–means and the hierarchical bottom–up clustering methods in identifying clusters according to the dependence structure of the data generating process. Different simulations are performed by varying different conditions (e.g., the kind of margins (distinct, overlapping and nested) and the value of the dependence parameter ) and the results are evaluated by means of different measures of performance. In light of the simulation results and of the limits of the two investigated clustering methods, a new clustering algorithm based on copula functions (‘CoClust’ in brief) is proposed. The basic idea, the iterative procedure of the CoClust and the description of the written R functions with their output are given. The CoClust algorithm is tested on simulated data (by varying the number of clusters, the copula models, the dependence parameter value and the degree of overlap of margins) and is compared with the performance of model–based clustering by using different measures of performance, like the percentage of well–identified number of clusters and the not rejection percentage of H0 on . It is shown that the CoClust algorithm allows to overcome all observed limits of the other investigated clustering techniques and is able to identify clusters according to the dependence structure of the data independently of the degree of overlap of margins and the strength of the dependence. The CoClust uses a criterion based on the maximized log–likelihood function of the copula and can virtually account for any possible dependence relationship between observations. Many peculiar characteristics are shown for the CoClust, e.g. its capability of identifying the true number of clusters and the fact that it does not require a starting classification. Finally, the CoClust algorithm is applied to the real microarray data of Hedenfalk et al. (2001) both to the gene expressions observed in three different cancer samples and to the columns (tumor samples) of the whole data matrix.
Resumo:
This study provides a comprehensive genetic overview on the endangered Italian wolf population. In particular, it focuses on two research lines. On one hand, we focalised on melanism in wolf in order to isolate a mutation related with black coat colour in canids. With several reported black individuals (an exception at European level), the Italian wolf population constituted a challenging research field posing many unanswered questions. As found in North American wolf, we reported that melanism in the Italian population is caused by a different melanocortin pathway component, the K locus, in which a beta-defensin protein acts as an alternative ligand for the Mc1r. This research project was conducted in collaboration with Prof. Gregory Barsh, Department of Genetics and Paediatrics, Stanford University. On the other hand, we performed analysis on a high number of SNPs thanks to a customized Canine microarray useful to integrate or substitute the STR markers for genotyping individuals and detecting wolf-dog hybrids. Thanks to DNA microchip technology, we obtained an impressive amount of genetic data which provides a solid base for future functional genomic studies. This study was undertaken in collaboration with Prof. Robert K. Wayne, Department of Ecology and Evolutionary Biology, University of California, Los Angeles (UCLA).
Resumo:
The identification of people by measuring some traits of individual anatomy or physiology has led to a specific research area called biometric recognition. This thesis is focused on improving fingerprint recognition systems considering three important problems: fingerprint enhancement, fingerprint orientation extraction and automatic evaluation of fingerprint algorithms. An effective extraction of salient fingerprint features depends on the quality of the input fingerprint. If the fingerprint is very noisy, we are not able to detect a reliable set of features. A new fingerprint enhancement method, which is both iterative and contextual, is proposed. This approach detects high-quality regions in fingerprints, selectively applies contextual filtering and iteratively expands like wildfire toward low-quality ones. A precise estimation of the orientation field would greatly simplify the estimation of other fingerprint features (singular points, minutiae) and improve the performance of a fingerprint recognition system. The fingerprint orientation extraction is improved following two directions. First, after the introduction of a new taxonomy of fingerprint orientation extraction methods, several variants of baseline methods are implemented and, pointing out the role of pre- and post- processing, we show how to improve the extraction. Second, the introduction of a new hybrid orientation extraction method, which follows an adaptive scheme, allows to improve significantly the orientation extraction in noisy fingerprints. Scientific papers typically propose recognition systems that integrate many modules and therefore an automatic evaluation of fingerprint algorithms is needed to isolate the contributions that determine an actual progress in the state-of-the-art. The lack of a publicly available framework to compare fingerprint orientation extraction algorithms, motivates the introduction of a new benchmark area called FOE (including fingerprints and manually-marked orientation ground-truth) along with fingerprint matching benchmarks in the FVC-onGoing framework. The success of such framework is discussed by providing relevant statistics: more than 1450 algorithms submitted and two international competitions.
Resumo:
In distributed systems like clouds or service oriented frameworks, applications are typically assembled by deploying and connecting a large number of heterogeneous software components, spanning from fine-grained packages to coarse-grained complex services. The complexity of such systems requires a rich set of techniques and tools to support the automation of their deployment process. By relying on a formal model of components, a technique is devised for computing the sequence of actions allowing the deployment of a desired configuration. An efficient algorithm, working in polynomial time, is described and proven to be sound and complete. Finally, a prototype tool implementing the proposed algorithm has been developed. Experimental results support the adoption of this novel approach in real life scenarios.