13 results for microarray profiling

in AMS Tesi di Dottorato - Alm@DL - Università di Bologna


Relevance:

20.00%

Publisher:

Abstract:

The main aim of this Ph.D. dissertation is the study of clustering dependent data by means of copula functions, with particular emphasis on microarray data. Copula functions are a popular multivariate modeling tool in every field where multivariate dependence is of great interest, yet their use in clustering has not yet been investigated. The first part of this work reviews the literature on clustering methods, copula functions and microarray experiments. Attention focuses on the K-means (Hartigan, 1975; Hartigan and Wong, 1979), hierarchical (Everitt, 1974) and model-based (Fraley and Raftery, 1998, 1999, 2000, 2007) clustering techniques, because their performance is compared. Then the probabilistic interpretation of Sklar’s theorem (Sklar, 1959), estimation methods for copulas such as Inference for Margins (Joe and Xu, 1996), and the Archimedean and Elliptical copula families are presented. Finally, applications of clustering methods and copulas to genetic and microarray experiments are highlighted. The second part contains the original contribution. A simulation study is performed to evaluate the performance of the K-means and hierarchical bottom-up clustering methods in identifying clusters according to the dependence structure of the data generating process. Simulations are performed under varying conditions (e.g., the kind of margins (distinct, overlapping and nested) and the value of the dependence parameter), and the results are evaluated by means of different measures of performance. In light of the simulation results and of the limits of the two clustering methods investigated, a new clustering algorithm based on copula functions (‘CoClust’ for short) is proposed. The basic idea, the iterative procedure of CoClust, and a description of the R functions written for it, with their output, are given. 
The CoClust algorithm is tested on simulated data (by varying the number of clusters, the copula model, the value of the dependence parameter and the degree of overlap of the margins) and is compared with model-based clustering using different measures of performance, such as the percentage of correctly identified numbers of clusters and the non-rejection rate of H0. It is shown that the CoClust algorithm overcomes all the observed limits of the other clustering techniques investigated and is able to identify clusters according to the dependence structure of the data, independently of the degree of overlap of the margins and the strength of the dependence. CoClust uses a criterion based on the maximized log-likelihood function of the copula and can in principle account for any dependence relationship between observations. Several distinctive characteristics of CoClust are shown, e.g. its capability of identifying the true number of clusters and the fact that it does not require a starting classification. Finally, the CoClust algorithm is applied to the real microarray data of Hedenfalk et al. (2001), both to the gene expressions observed in three different cancer samples and to the columns (tumor samples) of the whole data matrix.
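
The abstract says that CoClust relies on the maximized log-likelihood of a copula, without giving the procedure itself. The sketch below is not CoClust. It only illustrates, under stated assumptions, how such a criterion can distinguish dependent from independent observations. The choice of a bivariate Gaussian copula, the rank-based pseudo-observations, the grid search over the correlation, and all function names are assumptions made for illustration.

```python
import math
import random
from statistics import NormalDist


def pseudo_observations(values):
    """Map a sample into (0, 1) through its ranks: rank / (n + 1)."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    u = [0.0] * n
    for rank, i in enumerate(order, start=1):
        u[i] = rank / (n + 1)
    return u


def gaussian_copula_loglik(u, v, rho):
    """Log-likelihood of a bivariate Gaussian copula with correlation rho."""
    inv = NormalDist().inv_cdf
    total = 0.0
    for ui, vi in zip(u, v):
        x, y = inv(ui), inv(vi)
        total += (-0.5 * math.log(1 - rho ** 2)
                  - (rho ** 2 * (x * x + y * y) - 2 * rho * x * y)
                  / (2 * (1 - rho ** 2)))
    return total


def max_loglik(u, v, grid_size=199):
    """Maximize the log-likelihood over a grid of rho values in (-1, 1).

    Returns the pair (maximized log-likelihood, maximizing rho).
    """
    grid = [-0.99 + 1.98 * k / (grid_size - 1) for k in range(grid_size)]
    return max((gaussian_copula_loglik(u, v, r), r) for r in grid)


# Simulated example (assumption): one dependent pair and one independent pair.
rng = random.Random(0)
n = 200
x = [rng.gauss(0, 1) for _ in range(n)]
noise = [rng.gauss(0, 1) for _ in range(n)]
y_dep = [0.8 * a + 0.6 * b for a, b in zip(x, noise)]  # correlation about 0.8
y_ind = [rng.gauss(0, 1) for _ in range(n)]            # independent of x

ll_dep, rho_dep = max_loglik(pseudo_observations(x), pseudo_observations(y_dep))
ll_ind, rho_ind = max_loglik(pseudo_observations(x), pseudo_observations(y_ind))
print("dependent pair:   loglik=%.1f rho=%.2f" % (ll_dep, rho_dep))
print("independent pair: loglik=%.1f rho=%.2f" % (ll_ind, rho_ind))
```

The dependent pair should yield a clearly larger maximized log-likelihood than the independent pair. That gap is the kind of signal a likelihood-based criterion can use when deciding which observations belong together.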

Relevance:

20.00%

Publisher:

Abstract:

In the past decade, the advent of efficient genome sequencing tools and high-throughput experimental biotechnology has led to enormous progress in the life sciences. Among the most important innovations is microarray technology. It makes it possible to quantify the expression of thousands of genes simultaneously by measuring the hybridization from a tissue of interest to probes on a small glass or plastic slide. The characteristics of these data include a fair amount of random noise, a predictor dimension in the thousands, and a sample size in the dozens. One of the most exciting areas to which microarray technology has been applied is the challenge of deciphering complex diseases such as cancer. In these studies, samples are taken from two or more groups of individuals with heterogeneous phenotypes, pathologies, or clinical outcomes. These samples are hybridized to microarrays in an effort to find a small number of genes that are strongly correlated with the group of individuals. Even though methods for analysing the data are today well developed and close to reaching a standard organization (through the efforts of dedicated international projects such as the Microarray Gene Expression Data (MGED) Society [1]), it is not infrequent to come across a clinician's question for which no compelling statistical method exists. The contribution of this dissertation to deciphering disease is the development of new approaches aimed at handling open problems posed by clinicians in specific experimental designs. In Chapter 1, starting from a necessary biological introduction, we review microarray technologies and all the important steps of an experiment, from the production of the array to quality control, ending with the preprocessing steps used in the data analysis in the rest of the dissertation. 
Chapter 2 provides a critical review of standard analysis methods, stressing most of the problems that [...]. Chapter 3 introduces a method to address the issue of unbalanced design in microarray experiments. In microarray experiments, experimental design is a crucial starting point for obtaining reasonable results. In a two-class problem, an equal or similar number of samples should be collected for the two classes. However, in some cases, e.g. rare pathologies, the approach to be taken is less evident. We propose to address this issue by applying a modified version of SAM [2]. MultiSAM consists of a reiterated application of a SAM analysis, comparing the less populated class (LPC) with 1,000 random samplings of the same size from the more populated class (MPC). A list of the differentially expressed genes is generated for each SAM application. After 1,000 reiterations, each probe is given a "score" ranging from 0 to 1,000, based on its recurrence in the 1,000 lists as differentially expressed. The performance of MultiSAM was compared with that of SAM and LIMMA [3] over two data sets simulated from beta and exponential distributions. The results of all three algorithms over low-noise data sets seem acceptable. However, on a real unbalanced two-channel data set regarding chronic lymphocytic leukemia, LIMMA finds no significant probe, SAM finds 23 significantly changed probes but cannot separate the two classes, while MultiSAM finds 122 probes with score >300 and separates the data into two clusters by hierarchical clustering. We also report extra-assay validation in terms of differentially expressed genes. Although standard algorithms perform well over low-noise simulated data sets, MultiSAM seems to be the only one able to reveal subtle differences in gene expression profiles on real unbalanced data. In Chapter 4, a method to address the evaluation of similarities in a three-class problem by means of the Relevance Vector Machine [4] is described. 
In fact, when microarray data are considered in a prognostic and diagnostic clinical framework, differences are not the only thing that can play a crucial role. In some cases similarities can give useful, and sometimes even more important, information. The goal, given three classes, could be to establish, with a certain level of confidence, whether the third one is similar to the first or to the second. In this work we show that the Relevance Vector Machine (RVM) [2] could be a possible solution to this limitation of standard supervised classification. In fact, RVM offers many advantages compared, for example, with its well-known precursor, the Support Vector Machine (SVM) [3]. Among these advantages, the estimate of the posterior probability of class membership is a key feature for addressing the similarity issue. This is a highly important, but often overlooked, option of any practical pattern recognition system. We focused on a three-class tumor grade problem, with 67 samples of grade 1 (G1), 54 samples of grade 3 (G3) and 100 samples of grade 2 (G2). The goal is to find a model able to separate G1 from G3, and then to evaluate the third class, G2, as a test set to obtain the probability that G2 samples belong to class G1 or class G3. The analysis showed that breast cancer samples of grade 2 have a molecular profile more similar to breast cancer samples of grade 1. This result had been conjectured in the literature, but no measure of significance had been given before.
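
The resampling score that this abstract describes for MultiSAM can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. In particular, a Welch t-statistic with a fixed cutoff stands in for the SAM analysis, which is not reproduced here. The function names, the cutoff, and the simulated data are all hypothetical.

```python
import random
from statistics import mean, variance


def welch_t(a, b):
    """Welch t-statistic for two samples (stand-in for SAM, an assumption)."""
    se = (variance(a) / len(a) + variance(b) / len(b)) ** 0.5
    return (mean(a) - mean(b)) / se if se > 0 else 0.0


def resampling_scores(lpc, mpc, n_iter=1000, t_cut=3.0, seed=0):
    """Count, per probe, how often it is called differentially expressed.

    lpc, mpc: lists of samples, each sample a list of probe values.
    At every iteration a random subset of the more populated class (mpc),
    of the same size as the less populated class (lpc), is compared with lpc.
    """
    rng = random.Random(seed)
    n_probes = len(lpc[0])
    lpc_by_probe = [[s[p] for s in lpc] for p in range(n_probes)]
    scores = [0] * n_probes
    for _ in range(n_iter):
        subset = rng.sample(mpc, len(lpc))
        for p in range(n_probes):
            t = welch_t(lpc_by_probe[p], [s[p] for s in subset])
            if abs(t) > t_cut:
                scores[p] += 1  # probe p appears in this iteration's list
    return scores


# Simulated unbalanced design (assumption): 6 vs 30 samples, 5 probes,
# with only probe 0 truly shifted in the less populated class.
rng = random.Random(1)
n_probes = 5


def make_sample(shift):
    return [rng.gauss(shift if p == 0 else 0.0, 1.0) for p in range(n_probes)]


lpc = [make_sample(5.0) for _ in range(6)]
mpc = [make_sample(0.0) for _ in range(30)]
scores = resampling_scores(lpc, mpc, n_iter=200)
print(scores)
```

Each score ranges from 0 to `n_iter`, mirroring the 0 to 1,000 score in the abstract. Probes can then be ranked by score or filtered against a score threshold.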

Relevance:

20.00%

Publisher:

Abstract:

The study of protein expression profiles for biomarker discovery in serum and in mammalian cell populations requires the continuous improvement and combination of protein/peptide separation techniques, mass spectrometry, and statistical and bioinformatic approaches. In this thesis, two different mass spectrometry-based protein profiling strategies were developed and applied to liver diseases and inflammatory bowel diseases (IBDs) for the discovery of new biomarkers. The first, based on bulk solid-phase extraction combined with matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS) and chemometric analysis of serum samples, was applied to the study of serum protein expression profiles both in IBDs (Crohn’s disease and ulcerative colitis) and in liver diseases (cirrhosis, hepatocellular carcinoma, viral hepatitis). The approach allowed the enrichment of serum proteins/peptides, owing to the large interaction surface between analytes and solid phase, and gave high recovery, owing to the elution step performed directly on the MALDI target plate. Furthermore, the use of a chemometric algorithm to select the variables with the highest discriminant power made it possible to evaluate patterns of 20-30 proteins involved in the differentiation and classification of serum samples from healthy donors and diseased patients. These protein profiles discriminate among the pathologies with very good classification and prediction abilities. In particular, in the study of inflammatory bowel diseases, the analysis using C18 of 129 serum samples from healthy donors, Crohn’s disease patients, ulcerative colitis patients and inflammatory controls gave a classification ability of 90.7% and a prediction ability of 72.9%. In the study of liver diseases (hepatocellular carcinoma, viral hepatitis and cirrhosis), a prediction ability of 80.6% was achieved using IDA-Cu(II) as the extraction procedure. 
The identification of the selected proteins by MALDI-TOF/TOF MS analysis, or by their selective enrichment followed by enzymatic digestion and MS/MS analysis, may give useful information for identifying new biomarkers involved in the diseases. The second mass spectrometry-based protein profiling strategy was based on a label-free liquid chromatography electrospray ionization quadrupole time-of-flight (LC ESI-QTOF MS) differential analysis approach, combined with targeted MS/MS analysis of only the identified differences. The strategy was used for biomarker discovery in IBDs, in particular Crohn’s disease. The enriched serum peptidome and the subcellular fractions of intestinal epithelial cells (IECs) from healthy donors and Crohn’s disease patients were analysed. Combining the enrichment step for low molecular weight serum proteins with the LC-MS approach made it possible to evaluate a pattern of peptides derived from specific exoprotease activity in the coagulation and complement activation pathways. Among these peptides, the discovery of clusters of peptides from fibrinopeptide A, apolipoprotein E and A4, and complement C3 and C4 was particularly interesting. Further studies are needed to evaluate the specificity of these clusters and validate the results, in order to develop a rapid serum diagnostic test. Label-free LC ESI-QTOF MS differential analysis of the subcellular fractions of IECs from Crohn’s disease patients and healthy donors revealed many proteins that could be involved in the inflammation process. Among them, heat shock protein 70, tryptase alpha-1 precursor and proteins whose upregulation can be explained by the increased activity of IECs in Crohn’s disease were identified. Follow-up studies will be performed to validate the results and to investigate in depth the inflammation pathways involved in the disease. 
Both mass spectrometry-based protein profiling strategies developed here proved to be useful tools for the discovery of disease biomarkers, which need to be validated in further studies.

Relevance:

20.00%

Publisher:

Abstract:

A systematic characterization of the composition and structure of the bacterial cell-surface proteome and its complexes can provide an invaluable tool for its comprehensive understanding. Knowledge of the composition and structure of protein complexes could offer new, more effective targets for a more specific, and consequently more effective, immune response against a complex instead of a single protein. Large-scale protein-protein interaction screens are the first step towards the identification of complexes and their attribution to specific pathways. Several methods currently exist for identifying protein interactions, and protein microarrays provide the most appealing alternative to existing techniques for high-throughput screening of protein-protein interactions in vitro under reasonably straightforward conditions. In this study, approximately 100 proteins of Group A Streptococcus (GAS), predicted by genomic and proteomic approaches to be secreted or surface exposed, were purified in a His-tagged form and used to generate protein microarrays on nitrocellulose-coated slides. To identify protein-protein interactions, each purified protein was then labeled with biotin and hybridized to the microarray, and interactions were detected with Cy3-labelled streptavidin. Only reciprocal interactions, i.e. binding of the same two interactors irrespective of which of the two partners is in solid phase or in solution, were taken as bona fide protein-protein interactions. Using this approach, we identified 20 interactors of one of the potent toxins secreted by GAS, known as superantigens. Several of these interactors belong to the molecular chaperone or protein folding catalyst families and are presumably involved in the secretion and folding of the superantigen. In addition, a very interesting interaction was found between the superantigen and the substrate-binding subunit of a well-characterized ABC transporter. 
This finding opens a new perspective on the current understanding of how superantigens are modified by the bacterial cell in order to become major players in causing disease.
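
The reciprocity rule used in this abstract (an interaction counts only if it is seen with either partner in solution and the other on the array) can be sketched as a simple filter. The input format, the function name, and the protein labels below are hypothetical. Excluding self-interactions is also an assumption, made because a self-pair is trivially reciprocal.

```python
def reciprocal_interactions(hits):
    """Keep only interactions observed in both orientations.

    hits: iterable of (protein_in_solution, protein_on_array) pairs,
    one pair per detected signal.
    Returns a set of unordered pairs (frozensets of two protein names).
    """
    observed = set(hits)
    pairs = set()
    for a, b in observed:
        # Require the swapped orientation too; skip self-interactions
        # (assumption: they cannot be checked for reciprocity).
        if a != b and (b, a) in observed:
            pairs.add(frozenset((a, b)))
    return pairs


# Hypothetical screen results: P1/P2 bind in both orientations,
# P1/P3 only in one, and P4 gives a self-signal.
hits = [("P1", "P2"), ("P2", "P1"), ("P1", "P3"), ("P4", "P4")]
confirmed = reciprocal_interactions(hits)
print(sorted(sorted(pair) for pair in confirmed))
```

Only the P1/P2 pair survives the filter. The one-directional P1/P3 signal is discarded, which is the conservative behaviour the abstract describes.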

Relevance:

20.00%

Publisher:

Abstract:

This study provides a comprehensive genetic overview of the endangered Italian wolf population, focusing on two lines of research. First, we focused on melanism in wolves in order to isolate a mutation associated with black coat colour in canids. With several reported black individuals (an exception at the European level), the Italian wolf population was a challenging research field posing many unanswered questions. As found in North American wolves, we report that melanism in the Italian population is caused by a different melanocortin pathway component, the K locus, in which a beta-defensin protein acts as an alternative ligand for Mc1r. This research project was conducted in collaboration with Prof. Gregory Barsh, Department of Genetics and Paediatrics, Stanford University. Second, we analysed a large number of SNPs using a customized canine microarray, which can complement or replace STR markers for genotyping individuals and detecting wolf-dog hybrids. Thanks to DNA microchip technology, we obtained a large amount of genetic data, which provides a solid base for future functional genomic studies. This study was undertaken in collaboration with Prof. Robert K. Wayne, Department of Ecology and Evolutionary Biology, University of California, Los Angeles (UCLA).

Relevance:

20.00%

Publisher:

Abstract:

Jasmonates (JAs) and spermidine (Sd) influence fruit (and seed) development and ripening. In order to unravel their effects in peach fruit at the molecular level, field applications of methyl jasmonate (MJ), propyl dihydrojasmonate (PDJ) and Sd were performed at an early developmental stage (late S1). At commercial harvest, JA-treated fruit were less ripe than controls. Real-time RT-PCR analyses confirmed a down-regulation of ethylene biosynthetic, perception and signaling genes, and of flesh softening-related genes. The expression of cell wall-related genes and of a sugar transporter, as well as hormone-related transcript levels, were also affected by JAs. Seeds from JA-treated fruit showed a shift in the expression of developmental marker genes, suggesting that the developmental program was probably slowed down, in agreement with the contention that JAs divert resources from growth to defense. JAs also affected phenolic content and biosynthetic gene expression in the mesocarp. Levels of hydroxycinnamic acids, as well as those of flavan-3-ols, were enhanced in S2, mainly by MJ. Transcript levels of phenylpropanoid pathway genes were up-regulated by MJ, in agreement with phenolic content. Sd-treated fruit showed reduced ethylene production and flesh softening at harvest. Sd induced short-term and long-term response patterns in endogenous polyamines. At ripening, the up-regulation of the ethylene biosynthetic genes was dramatically counteracted by Sd, leading to a down-regulation of softening-related genes. Hormone-related gene expression was also altered in both the short and the long term. Gene expression analyses suggest that Sd interfered with fruit development/ripening by interacting with multiple hormonal pathways, and that the expression of fruit developmental marker genes was shifted ahead, in accord with a developmental slowing down. 24-Epibrassinolide was applied to Flaminia peaches under field conditions early (S1) or later (S3) during development. 
Preliminary results showed that, at harvest, treated fruit tended to be larger and less mature though quality parameters did not change relative to controls.

Relevance:

20.00%

Publisher:

Abstract:

Peripheral T-cell lymphomas account for about 12% of all lymphoid neoplasms. In this study, we performed miRNA profiling (TaqMan Array MicroRNA Cards A) on 60 FFPE samples divided into: PTCLs/NOS (N=25), AITLs (N=10), ALCLs (N=12) and normal T cells (N=13). We identified 4 miRNAs differentially expressed between PTCLs and normal T cells. In addition, we identified three sets of miRNAs that discriminate the three nodal PTCL entities.

Relevance:

20.00%

Publisher:

Abstract:

Food suppliers currently measure apple quality using basic pomological descriptors. Sensory analysis is expensive, does not allow many samples to be analysed, and cannot be implemented for measuring quality properties in real time. However, sensory analysis is the best way to describe food eating quality precisely, since it can define, measure and explain what is really perceivable by human senses, using a language that closely reflects consumers’ perception. On the basis of these observations, we developed a detailed protocol for apple sensory profiling by descriptive sensory analysis and instrumental measurements. The collected sensory data were validated by applying rigorous scientific criteria for sensory analysis. The method was then applied to study the sensory properties of apples and their changes in relation to different pre- and post-harvest factors affecting fruit quality, and proved able to discriminate fruit varieties and to highlight differences in sensory properties. The instrumental measurements confirmed these results. Moreover, the correlation between sensory and instrumental data was studied, and a new, effective approach was defined for the reliable prediction of sensory properties by instrumental characterisation. It is therefore possible to propose this sensory-instrumental tool to all the stakeholders involved in apple production and marketing, as a means of obtaining a reliable description of apple fruit quality.

Relevance:

20.00%

Publisher:

Abstract:

Adhesion, immune evasion and invasion are key determinants of bacterial pathogenesis. Pathogenic bacteria possess a wide variety of surface-exposed and secreted proteins which allow them to adhere to tissues, escape the immune system and spread throughout the human body. Therefore, extensive contacts between the human and the bacterial extracellular proteomes take place at the host-pathogen interface at the protein level. Recent research has emphasized the importance of a global and deeper understanding of the molecular mechanisms which underlie bacterial immune evasion and pathogenesis. Through the use of a large-scale, unbiased, protein microarray-based approach and of wide libraries of purified human and bacterial proteins, novel host-pathogen interactions were identified. This approach was first applied to Staphylococcus aureus, the cause of a wide variety of diseases ranging from skin infections to endocarditis and sepsis. The screening led to the identification of several novel interactions between the human and the S. aureus extracellular proteomes. The interaction between the S. aureus immune evasion protein FLIPr (formyl peptide receptor-like 1 inhibitory protein) and the human complement component C1q, key players in the offense-defense struggle, was characterized using label-free techniques and functional assays. The same approach was also applied to Neisseria meningitidis, a major cause of bacterial meningitis and fulminant sepsis worldwide. The screening led to the identification of several potential human receptors for the neisserial adhesin A (NadA), an important adhesion protein and key determinant of meningococcal interactions with the human host at various stages. The interaction between NadA and human LOX-1 (lectin-like oxidized low-density lipoprotein receptor 1) was confirmed using label-free technologies and cell binding experiments in vitro. Taken together, these two examples provided concrete insights into S. aureus and N. meningitidis pathogenesis, and identified protein microarrays coupled with appropriate validation methodologies as a powerful large-scale tool for host-pathogen interaction studies.