33 results for Microarray data
in QUB Research Portal - Research Directory and Institutional Repository for Queen's University Belfast
Abstract:
Identifying differential expression of genes in psoriatic and healthy skin by microarray data analysis is a key approach to understand the pathogenesis of psoriasis. Analysis of more than one dataset to identify genes commonly upregulated reduces the likelihood of false positives and narrows down the possible signature genes. Genes controlling the critical balance between T helper 17 and regulatory T cells are of special interest in psoriasis. Our objectives were to identify genes that are consistently upregulated in lesional skin from three published microarray datasets. We carried out a reanalysis of gene expression data extracted from three experiments on samples from psoriatic and nonlesional skin using the same stringency threshold and software and further compared the expression levels of 92 genes related to the T helper 17 and regulatory T cell signaling pathways. We found 73 probe sets representing 57 genes commonly upregulated in lesional skin from all datasets. These included 26 probe sets representing 20 genes that have no previous link to the etiopathogenesis of psoriasis. These genes may represent novel therapeutic targets and surely need more rigorous experimental testing to be validated. Our analysis also identified 12 of 92 genes known to be related to the T helper 17 and regulatory T cell signaling pathways, and these were found to be differentially expressed in the lesional skin samples.
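The idea of keeping only probe sets that are upregulated in every dataset can be sketched as a set intersection. This is a minimal illustration, not the authors' pipeline: the function name, the probe identifiers, the log2 fold-change values and the threshold of 1.0 are all hypothetical.

```python
def commonly_upregulated(datasets, min_log2_fc=1.0):
    """Return probe IDs whose log2 fold change meets the threshold in every dataset.

    `datasets` is a list of dicts mapping probe ID -> log2 fold change
    (lesional vs. nonlesional). The same threshold is applied to each
    dataset, mirroring the "same stringency threshold" idea in the abstract.
    The threshold value itself is an assumption made for illustration.
    """
    hits_per_dataset = [
        {probe for probe, lfc in data.items() if lfc >= min_log2_fc}
        for data in datasets
    ]
    return set.intersection(*hits_per_dataset)


# Hypothetical toy data: three datasets, four made-up probe IDs.
study_1 = {"probe_A": 2.1, "probe_B": 0.3, "probe_C": 1.7, "probe_D": 1.2}
study_2 = {"probe_A": 1.5, "probe_B": 1.9, "probe_C": 1.1, "probe_D": 0.2}
study_3 = {"probe_A": 3.0, "probe_B": 0.1, "probe_C": 1.4, "probe_D": 1.8}

common = commonly_upregulated([study_1, study_2, study_3])
print(sorted(common))
```

A probe that passes in only one or two datasets (here `probe_B` and `probe_D`) is dropped, which is how requiring agreement across datasets reduces false positives.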
Abstract:
This paper investigates the gene selection problem for microarray data with small samples and variant correlation. Most existing algorithms require expensive computational effort, especially when thousands of genes are involved. The main objective of this paper is to effectively select the most informative genes from microarray data while keeping the computational expense affordable. This is achieved by proposing a novel forward gene selection algorithm (FGSA). To overcome the small-sample problem, the augmented data technique is first employed to produce an augmented data set. Taking inspiration from other gene selection methods, the L2-norm penalty is then introduced into the recently proposed fast regression algorithm to achieve the group selection ability. Finally, by defining a proper regression context, the proposed method can be implemented efficiently in software, which significantly reduces the computational burden. Both computational complexity analysis and simulation results confirm the effectiveness of the proposed algorithm in comparison with other approaches.
Abstract:
Quantile normalization (QN) is a technique for microarray data processing and is the default normalization method in the Robust Multi-array Average (RMA) procedure, which was primarily designed for analysing gene expression data from Affymetrix arrays. Given the abundance of Affymetrix microarrays and the popularity of the RMA method, it is crucially important that the normalization procedure is applied appropriately. In this study we carried out simulation experiments and also analysed real microarray data to investigate the suitability of RMA when it is applied to datasets with different groups of biological samples. From our experiments, we showed that RMA with QN does not preserve the biological signal included in each group, but rather mixes the signals between the groups. We also showed that the Median Polish method in the summarization step of RMA has a similar mixing effect. RMA is one of the most widely used methods in microarray data processing and has been applied to a vast volume of data in biomedical research. The problematic behaviour of this method suggests that previous studies employing RMA could have been misadvised or adversely affected. Therefore we think it is crucially important that the research community recognizes the issue and starts to address it. The two core elements of the RMA method, quantile normalization and Median Polish, both have the undesirable effect of mixing biological signals between different sample groups, which can be detrimental to drawing valid biological conclusions and to any subsequent analyses. Based on the evidence presented here and that in the literature, we recommend exercising caution when using RMA as a method of processing microarray gene expression data, particularly in situations where there are likely to be unknown subgroups of samples.
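A minimal sketch of plain quantile normalization may help show how signal can be mixed between arrays. It is a toy illustration rather than the RMA implementation or the study's experiments: the function name and the intensity values are hypothetical, and ties are broken by position instead of being averaged.

```python
def quantile_normalize(columns):
    """Quantile-normalize a list of equal-length columns (one column per array).

    Each value is replaced by the mean, across arrays, of the values that
    share its rank. Ties within a column are broken by position (a
    simplification made for this sketch).
    """
    n_probes = len(columns[0])
    n_arrays = len(columns)
    sorted_cols = [sorted(col) for col in columns]
    # Reference distribution: mean of the r-th smallest value over all arrays.
    rank_means = [
        sum(sc[r] for sc in sorted_cols) / n_arrays for r in range(n_probes)
    ]
    normalized = []
    for col in columns:
        order = sorted(range(n_probes), key=lambda i: col[i])
        out = [0.0] * n_probes
        for rank, idx in enumerate(order):
            out[idx] = rank_means[rank]
        normalized.append(out)
    return normalized


# Hypothetical intensities for four probes on two arrays from different groups.
# The last probe is strongly elevated on the first array only.
array_group_a = [1, 2, 3, 10]
array_group_b = [1, 2, 3, 4]

result = quantile_normalize([array_group_a, array_group_b])
print(result)
```

In this toy case both arrays end up with identical values, so the difference in the last probe between the two groups is averaged away. Real data are far less extreme, but the mechanism is the same: every array is forced onto one shared distribution.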
Abstract:
High-dimensional gene expression data provide a rich source of information because they capture the expression level of genes in dynamic states that reflect the biological functioning of a cell. For this reason, such data are suitable to reveal systems related properties inside a cell, e.g., in order to elucidate molecular mechanisms of complex diseases like breast or prostate cancer. However, this is not only strongly dependent on the sample size and the correlation structure of a data set, but also on the statistical hypotheses tested. Many different approaches have been developed over the years to analyze gene expression data to (I) identify changes in single genes, (II) identify changes in gene sets or pathways, and (III) identify changes in the correlation structure in pathways. In this paper, we review statistical methods for all three types of approaches, including subtypes, in the context of cancer data and provide links to software implementations and tools and address also the general problem of multiple hypotheses testing. Further, we provide recommendations for the selection of such analysis methods.
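The multiple hypotheses testing problem mentioned above is often handled by controlling the false discovery rate. The sketch below implements the Benjamini-Hochberg adjustment as one common choice; whether the review recommends this particular procedure is not stated in the abstract, and the function name and p-values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    For the p-value with ascending rank k out of m tests, the raw
    adjustment is p * m / k. Walking from the largest p-value down and
    keeping a running minimum makes the adjusted values monotone.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical per-gene p-values.
raw = [0.01, 0.04, 0.03, 0.005]
adj = benjamini_hochberg(raw)
print(adj)
```

Genes whose adjusted value falls below the chosen false discovery rate (for example 0.05) would then be reported as changed.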
Abstract:
In the study of complex genetic diseases, the identification of subgroups of patients sharing similar genetic characteristics represents a challenging task, for example, to improve treatment decisions. One type of genetic lesion, frequently investigated in such disorders, is the change of the DNA copy number (CN) at specific genomic traits. Non-negative Matrix Factorization (NMF) is a standard technique to reduce the dimensionality of a data set and to cluster data samples, while keeping its most relevant information in meaningful components. Thus, it can be used to discover subgroups of patients from CN profiles. It is however computationally impractical for very high dimensional data, such as CN microarray data. Deciding the most suitable number of subgroups is also a challenging problem. The aim of this work is to derive a procedure to compact high dimensional data, in order to improve NMF applicability without compromising the quality of the clustering. This is particularly important for analyzing high-resolution microarray data. Many commonly used quality measures, as well as our own measures, are employed to decide the number of subgroups and to assess the quality of the results. Our measures are based on the idea of identifying robust subgroups, inspired by biological/clinical relevance instead of simply aiming at well-separated clusters. We evaluate our procedure using four real independent data sets. In these data sets, our method was able to find accurate subgroups with individual molecular and clinical features and outperformed the standard NMF in terms of accuracy in the factorization fitness function. Hence, it can be useful for the discovery of subgroups of patients with similar CN profiles in the study of heterogeneous diseases.
Abstract:
Anillin is an actin-binding protein that can bind septins and is a component of the cytokinetic ring. We assessed anillin expression in 7,579 human tissue samples and cell lines by DNA microarray analysis. Anillin is expressed ubiquitously but with variable levels of expression, being highest in the central nervous system. The median level of anillin mRNA expression was higher in tumors than normal tissues (median fold increase 2.58; 95% confidence intervals, 2.19-5.68, P < 0.0001) except in the central nervous system, where anillin mRNA levels were lower in tumors. We developed a sensitive reverse transcription-PCR strategy to show that anillin mRNA is expressed in cell lines and in cDNA panels derived from fetal and adult tissues, thus validating the microarray data. We compared anillin with Ki67 mRNA expression and found a significant linear relationship between anillin and Ki67 mRNA expression (Spearman r ≈ 0.6, P < 0.0001). Anillin mRNA expression was analyzed during tumor progression in breast, ovarian, kidney, colorectal, hepatic, lung, endometrial, and pancreatic tumors, and in all tissues there was a progressive increase in anillin mRNA expression from normal to benign to malignant to metastatic disease. Finally, we used anti-anillin sera and found nuclear anillin immunoreactivity to be widespread in normal tissues, often not correlating with proliferative compartments. These data provide insight into the existence of non-proliferation-associated activities of anillin and roles in interphase nuclei. Thus, anillin is overexpressed in diverse common human tumors, but not simply as a consequence of being a proliferation marker. Anillin may have potential as a novel biomarker.
Abstract:
BACKGROUND: MicroRNAs (miRNAs) are oligoribonucleotides with an important role in regulation of gene expression at the level of translation. Despite imperfect target complementarity, they can also significantly reduce mRNA levels. The validity of miRNA target gene predictions is difficult to assess at the protein level. We sought, therefore, to determine whether a general lowering of predicted target gene mRNA expression by endogenous miRNAs was detectable within microarray gene expression profiles. RESULTS: The target gene sets predicted for each miRNA were mapped onto known gene expression data from a range of tissues. Whether considering mean absolute target gene expression, rank sum tests or 'ranked ratios', many miRNAs with significantly reduced target gene expression corresponded to those known to be expressed in the cognate tissue. Expression levels of miRNAs with reduced target mRNA levels were higher than those of miRNAs with no detectable effect on mRNA expression. Analysis of microarray data gathered after artificial perturbation of expression of a specific miRNA confirmed the predicted increase or decrease in influence of the altered miRNA upon mRNA levels. Strongest associations were observed with targets predicted by TargetScan. CONCLUSION: We have demonstrated that the effect of a miRNA on its target mRNAs' levels can be measured within a single gene expression profile. This emphasizes the extent of this mode of regulation in vivo and confirms that many of the predicted miRNA-mRNA interactions are correct. The success of this approach has revealed the vast potential for extracting information about miRNA function from gene expression profiles.
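The rank sum comparison mentioned in the results can be sketched with a Mann-Whitney U statistic that compares the expression of a miRNA's predicted targets against all other genes in one profile. This is an illustrative sketch only, not the authors' code: the function names and expression values are hypothetical, and no p-value is computed.

```python
def mann_whitney_u(targets, others):
    """U statistic for `targets`: pairs (t, o) with t > o count 1, ties count 0.5."""
    u = 0.0
    for t in targets:
        for o in others:
            if t > o:
                u += 1.0
            elif t == o:
                u += 0.5
    return u


def probability_target_higher(targets, others):
    """U scaled to [0, 1]: estimated chance a random target exceeds a random non-target."""
    return mann_whitney_u(targets, others) / (len(targets) * len(others))


# Hypothetical expression values from a single tissue profile.
predicted_targets = [1, 2, 3]
non_targets = [2, 4, 5, 6]

u_stat = mann_whitney_u(predicted_targets, non_targets)
effect = probability_target_higher(predicted_targets, non_targets)
print(u_stat, effect)
```

A scaled value well below 0.5 indicates that the predicted targets tend to be expressed at lower levels than other genes, which is the pattern expected when the miRNA is active in that tissue.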
Abstract:
This study examined variations in gene expression between FFPE blocks within tumors of individual patients. Microarray data were used to measure tumor heterogeneity within and between patients and disease states. Data were used to determine the number of samples needed to power biomarker discovery studies. Bias and variation in gene expression were assessed at the intrapatient and interpatient levels and between adenocarcinoma and squamous samples. A mixed-model analysis of variance was fitted to gene expression data and model signatures to assess the statistical significance of observed variations within and between samples and disease states. Sample size analysis, adjusted for sample heterogeneity, was used to determine the number of samples required to support biomarker discovery studies. Variation in gene expression was observed between blocks taken from a single patient. However, this variation was considerably less than differences between histological characteristics. This degree of block-to-block variation still permits biomarker discovery using either macrodissected tumors or whole FFPE sections, provided that intratumor heterogeneity is taken into account. Failure to consider intratumor heterogeneity may result in underpowered biomarker studies that may result in either the generation of longer gene signatures or the inability to identify a viable biomarker. Moreover, the results of this study indicate that a single biopsy sample is suitable for applying a biomarker in nonsmall-cell lung cancer. © 2012 American Society for Investigative Pathology and the Association for Molecular Pathology.
Abstract:
Clinically, our ability to predict disease outcome for patients with early stage lung cancer is currently poor. To address this issue, tumour specimens were collected at surgery from non-small cell lung cancer (NSCLC) patients as part of the European Early Lung Cancer (EUELC) consortium. The patients were followed-up for three years post-surgery and patients who suffered progressive disease (PD, tumour recurrence, metastasis or a second primary) or remained disease-free (DF) during follow-up were identified. RNA from both tumour and adjacent-normal lung tissue was extracted from patients and subjected to microarray expression profiling. These samples included 36 adenocarcinomas and 23 squamous cell carcinomas from both PD and DF patients. The microarray data was subject to a series of systematic bioinformatics analyses at gene, network and transcription factor levels. The focus of these analyses was 2-fold: firstly to determine whether there were specific biomarkers capable of differentiating between PD and DF patients, and secondly, to identify molecular networks which may contribute to the progressive tumour phenotype. The experimental design and analyses performed permitted the clear differentiation between PD and DF patients using a set of biomarkers implicated in neuroendocrine signalling and allowed the inference of a set of transcription factors whose activity may differ according to disease outcome. Potential links between the biomarkers, the transcription factors and the genes p21/CDKN1A and Myc, which have previously been implicated in NSCLC development, were revealed by a combination of pathway analysis and microarray meta-analysis. These findings suggest that neuroendocrine-related genes, potentially driven through p21/CDKN1A and Myc, are closely linked to whether or not a NSCLC patient will have poor clinical outcome.