5 resultados para Experimental Problems

em AMS Tesi di Dottorato - Alm@DL - Università di Bologna


Relevância:

30.00% 30.00%

Publicador:

Resumo:

In the past decade, the advent of efficient genome sequencing tools and high-throughput experimental biotechnology has lead to enormous progress in the life science. Among the most important innovations is the microarray tecnology. It allows to quantify the expression for thousands of genes simultaneously by measurin the hybridization from a tissue of interest to probes on a small glass or plastic slide. The characteristics of these data include a fair amount of random noise, a predictor dimension in the thousand, and a sample noise in the dozens. One of the most exciting areas to which microarray technology has been applied is the challenge of deciphering complex disease such as cancer. In these studies, samples are taken from two or more groups of individuals with heterogeneous phenotypes, pathologies, or clinical outcomes. these samples are hybridized to microarrays in an effort to find a small number of genes which are strongly correlated with the group of individuals. Eventhough today methods to analyse the data are welle developed and close to reach a standard organization (through the effort of preposed International project like Microarray Gene Expression Data -MGED- Society [1]) it is not unfrequant to stumble in a clinician's question that do not have a compelling statistical method that could permit to answer it.The contribution of this dissertation in deciphering disease regards the development of new approaches aiming at handle open problems posed by clinicians in handle specific experimental designs. In Chapter 1 starting from a biological necessary introduction, we revise the microarray tecnologies and all the important steps that involve an experiment from the production of the array, to the quality controls ending with preprocessing steps that will be used into the data analysis in the rest of the dissertation. While in Chapter 2 a critical review of standard analysis methods are provided stressing most of problems that In Chapter 3 is introduced a method to adress the issue of unbalanced design of miacroarray experiments. In microarray experiments, experimental design is a crucial starting-point for obtaining reasonable results. In a two-class problem, an equal or similar number of samples it should be collected between the two classes. However in some cases, e.g. rare pathologies, the approach to be taken is less evident. We propose to address this issue by applying a modified version of SAM [2]. MultiSAM consists in a reiterated application of a SAM analysis, comparing the less populated class (LPC) with 1,000 random samplings of the same size from the more populated class (MPC) A list of the differentially expressed genes is generated for each SAM application. After 1,000 reiterations, each single probe given a "score" ranging from 0 to 1,000 based on its recurrence in the 1,000 lists as differentially expressed. The performance of MultiSAM was compared to the performance of SAM and LIMMA [3] over two simulated data sets via beta and exponential distribution. The results of all three algorithms over low- noise data sets seems acceptable However, on a real unbalanced two-channel data set reagardin Chronic Lymphocitic Leukemia, LIMMA finds no significant probe, SAM finds 23 significantly changed probes but cannot separate the two classes, while MultiSAM finds 122 probes with score >300 and separates the data into two clusters by hierarchical clustering. We also report extra-assay validation in terms of differentially expressed genes Although standard algorithms perform well over low-noise simulated data sets, multi-SAM seems to be the only one able to reveal subtle differences in gene expression profiles on real unbalanced data. In Chapter 4 a method to adress similarities evaluation in a three-class prblem by means of Relevance Vector Machine [4] is described. In fact, looking at microarray data in a prognostic and diagnostic clinical framework, not only differences could have a crucial role. In some cases similarities can give useful and, sometimes even more, important information. The goal, given three classes, could be to establish, with a certain level of confidence, if the third one is similar to the first or the second one. In this work we show that Relevance Vector Machine (RVM) [2] could be a possible solutions to the limitation of standard supervised classification. In fact, RVM offers many advantages compared, for example, with his well-known precursor (Support Vector Machine - SVM [3]). Among these advantages, the estimate of posterior probability of class membership represents a key feature to address the similarity issue. This is a highly important, but often overlooked, option of any practical pattern recognition system. We focused on Tumor-Grade-three-class problem, so we have 67 samples of grade I (G1), 54 samples of grade 3 (G3) and 100 samples of grade 2 (G2). The goal is to find a model able to separate G1 from G3, then evaluate the third class G2 as test-set to obtain the probability for samples of G2 to be member of class G1 or class G3. The analysis showed that breast cancer samples of grade II have a molecular profile more similar to breast cancer samples of grade I. Looking at the literature this result have been guessed, but no measure of significance was gived before.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This work presents hybrid Constraint Programming (CP) and metaheuristic methods for the solution of Large Scale Optimization Problems; it aims at integrating concepts and mechanisms from the metaheuristic methods to a CP-based tree search environment in order to exploit the advantages of both approaches. The modeling and solution of large scale combinatorial optimization problem is a topic which has arisen the interest of many researcherers in the Operations Research field; combinatorial optimization problems are widely spread in everyday life and the need of solving difficult problems is more and more urgent. Metaheuristic techniques have been developed in the last decades to effectively handle the approximate solution of combinatorial optimization problems; we will examine metaheuristics in detail, focusing on the common aspects of different techniques. Each metaheuristic approach possesses its own peculiarities in designing and guiding the solution process; our work aims at recognizing components which can be extracted from metaheuristic methods and re-used in different contexts. In particular we focus on the possibility of porting metaheuristic elements to constraint programming based environments, as constraint programming is able to deal with feasibility issues of optimization problems in a very effective manner. Moreover, CP offers a general paradigm which allows to easily model any type of problem and solve it with a problem-independent framework, differently from local search and metaheuristic methods which are highly problem specific. In this work we describe the implementation of the Local Branching framework, originally developed for Mixed Integer Programming, in a CP-based environment. Constraint programming specific features are used to ease the search process, still mantaining an absolute generality of the approach. We also propose a search strategy called Sliced Neighborhood Search, SNS, that iteratively explores slices of large neighborhoods of an incumbent solution by performing CP-based tree search and encloses concepts from metaheuristic techniques. SNS can be used as a stand alone search strategy, but it can alternatively be embedded in existing strategies as intensification and diversification mechanism. In particular we show its integration within the CP-based local branching. We provide an extensive experimental evaluation of the proposed approaches on instances of the Asymmetric Traveling Salesman Problem and of the Asymmetric Traveling Salesman Problem with Time Windows. The proposed approaches achieve good results on practical size problem, thus demonstrating the benefit of integrating metaheuristic concepts in CP-based frameworks.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

In this thesis we made the first steps towards the systematic application of a methodology for automatically building formal models of complex biological systems. Such a methodology could be useful also to design artificial systems possessing desirable properties such as robustness and evolvability. The approach we follow in this thesis is to manipulate formal models by means of adaptive search methods called metaheuristics. In the first part of the thesis we develop state-of-the-art hybrid metaheuristic algorithms to tackle two important problems in genomics, namely, the Haplotype Inference by parsimony and the Founder Sequence Reconstruction Problem. We compare our algorithms with other effective techniques in the literature, we show strength and limitations of our approaches to various problem formulations and, finally, we propose further enhancements that could possibly improve the performance of our algorithms and widen their applicability. In the second part, we concentrate on Boolean network (BN) models of gene regulatory networks (GRNs). We detail our automatic design methodology and apply it to four use cases which correspond to different design criteria and address some limitations of GRN modeling by BNs. Finally, we tackle the Density Classification Problem with the aim of showing the learning capabilities of BNs. Experimental evaluation of this methodology shows its efficacy in producing network that meet our design criteria. Our results, coherently to what has been found in other works, also suggest that networks manipulated by a search process exhibit a mixture of characteristics typical of different dynamical regimes.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Heavy pig breeding in Italy is mainly oriented for the production of high quality processed products. Of particular importance is the dry cured ham production, which is strictly regulated and requires specific carcass characteristics correlated with green leg characteristics. Furthermore, as pigs are slaughtered at about 160 kg live weight, the Italian pig breeding sector faces severe problems of production efficiency that are related to all biological aspects linked to growth, feed conversion, fat deposition and so on. It is well known that production and carcass traits are in part genetically determined. Therefore, as a first step to understand genetic basis of traits that could have a direct or indirect impact on dry cured ham production, a candidate gene approach can be used to identify DNA markers associated with parameters of economic importance. In this thesis, we investigated three candidate genes for carcass and production traits (TRIB3, PCSK1, MUC4) in pig breeds used for dry cured ham production, using different experimental approaches in order to find molecular markers associated with these parameters.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Different aspects of the electrochemistry of oxygen are examined through four experimental examples: corrosion, passivation via organic thin films, oxygen reduction and water oxidation catalysis are outlined in order to outline the very different ways and circumstances in which oxygen plays a major role in electrochemistry.