965 results for Microarray Data
Abstract:
With recent advances in mass spectrometry techniques, it is now possible to investigate proteins over a wide range of molecular weights in small biological specimens. This advance has generated data-analytic challenges in proteomics, similar to those created by microarray technologies in genetics, namely, discovery of "signature" protein profiles specific to each pathologic state (e.g., normal vs. cancer) or differential profiles between experimental conditions (e.g., treated by a drug of interest vs. untreated) from high-dimensional data. We propose a data analytic strategy for discovering protein biomarkers based on such high-dimensional mass-spectrometry data. A real biomarker-discovery project on prostate cancer is taken as a concrete example throughout the paper: the project aims to identify proteins in serum that distinguish cancer, benign hyperplasia, and normal states of prostate using the Surface Enhanced Laser Desorption/Ionization (SELDI) technology, a recently developed mass spectrometry technique. Our data analytic strategy takes properties of the SELDI mass-spectrometer into account: the SELDI output of a specimen contains about 48,000 (x, y) points where x is the protein mass divided by the number of charges introduced by ionization and y is the protein intensity of the corresponding mass per charge value, x, in that specimen. Given high coefficients of variation and other characteristics of protein intensity measures (y values), we reduce the measures of protein intensities to a set of binary variables that indicate peaks in the y-axis direction in the nearest neighborhoods of each mass per charge point in the x-axis direction. We then account for a shifting (measurement error) problem of the x-axis in SELDI output. After this pre-analysis processing of the data, we combine the binary predictors to generate classification rules for cancer, benign hyperplasia, and normal states of prostate.
Our approach is to apply the boosting algorithm to select binary predictors and construct a summary classifier. We empirically evaluate sensitivity and specificity of the resulting summary classifiers with a test dataset that is independent from the training dataset used to construct the summary classifiers. The proposed method performed nearly perfectly in distinguishing cancer and benign hyperplasia from normal. In the classification of cancer vs. benign hyperplasia, however, an appreciable proportion of the benign specimens were classified incorrectly as cancer. We discuss practical issues associated with our proposed approach to the analysis of SELDI output and its application in cancer biomarker discovery.
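The two steps described above, reducing intensities to binary peak indicators and then boosting over those indicators, can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names, the neighborhood half-width, and the choice of discrete AdaBoost with single-feature stumps are not taken from the abstract, which does not specify the exact peak rule or boosting variant. The x-axis shift correction is not modeled here.

```python
import math

def peak_indicators(intensities, half_window=2):
    """Return 1 where a point is the strict maximum of its neighborhood, else 0.

    `half_window` (an assumed parameter) is the number of points inspected
    on each side along the mass-per-charge axis.
    """
    n = len(intensities)
    flags = []
    for i in range(n):
        lo, hi = max(0, i - half_window), min(n, i + half_window + 1)
        neighbours = intensities[lo:i] + intensities[i + 1:hi]
        flags.append(1 if all(intensities[i] > v for v in neighbours) else 0)
    return flags

def stump_output(x, j, polarity):
    """A stump votes `polarity` when binary feature j is 1, else `-polarity`."""
    return polarity if x[j] else -polarity

def adaboost_binary(X, y, rounds=3):
    """Discrete AdaBoost over binary features; labels y must be -1 or +1."""
    n, p = len(X), len(X[0])
    w = [1.0 / n] * n
    model = []  # list of (feature index, polarity, alpha)
    for _ in range(rounds):
        best = None
        for j in range(p):
            for polarity in (1, -1):
                # Weighted error of this stump on the training specimens.
                err = sum(wi for wi, xi, yi in zip(w, X, y)
                          if stump_output(xi, j, polarity) != yi)
                if best is None or err < best[0]:
                    best = (err, j, polarity)
        err, j, polarity = best
        err = min(max(err, 1e-10), 1 - 1e-10)  # avoid log(0)
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((j, polarity, alpha))
        # Up-weight misclassified specimens, then renormalize.
        w = [wi * math.exp(-alpha * yi * stump_output(xi, j, polarity))
             for wi, xi, yi in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model

def predict(model, x):
    """Sign of the alpha-weighted vote of all selected stumps."""
    score = sum(alpha * stump_output(x, j, polarity)
                for j, polarity, alpha in model)
    return 1 if score >= 0 else -1

# Toy spectrum: local maxima at positions 1 and 3.
print(peak_indicators([1, 3, 2, 5, 4], half_window=1))  # -> [0, 1, 0, 1, 0]
```

In practice each row of `X` would hold the peak indicators of one specimen, and sensitivity and specificity would be estimated on a separate test set, as the abstract describes.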
Abstract:
We derive the additive-multiplicative error model for microarray intensities, and describe two applications. For the detection of differentially expressed genes, we obtain a statistic whose variance is approximately independent of the mean intensity. For the post hoc calibration (normalization) of data with respect to experimental factors, we describe a method for parameter estimation.
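The abstract does not state the model's formula. One common way to write an additive-multiplicative error model, and the variance-stabilizing transformation that follows from it, is sketched below. The notation ($y$, $\mu$, $a$, $b$, $\eta$, $\varepsilon$, $c$) is assumed here for illustration and may differ from the paper's.

```latex
% Assumed notation: y = measured intensity, \mu = true abundance,
% a = offset, b = gain, \eta and \varepsilon = zero-mean multiplicative
% and additive error terms.
y = a + b\,\mu\,e^{\eta} + \varepsilon,
\qquad \operatorname{Var}(\eta) = \sigma_\eta^2,
\quad \operatorname{Var}(\varepsilon) = \sigma_\varepsilon^2

% With u = E[y], the variance is quadratic in the mean
% (for normal \eta, c^2 = e^{\sigma_\eta^2} - 1 \approx \sigma_\eta^2):
\operatorname{Var}(y) = c^2 (u - a)^2 + \sigma_\varepsilon^2

% The delta method then yields a transformation whose variance is
% approximately independent of the mean:
h(y) = \int^{y} \frac{du}{\sqrt{c^2 (u - a)^2 + \sigma_\varepsilon^2}}
     = \frac{1}{c}\,\operatorname{arsinh}\!\left(\frac{c\,(y - a)}{\sigma_\varepsilon}\right)
```

Under these assumptions, a difference $h(y_1) - h(y_2)$ between two conditions behaves like a log-ratio at high intensities and like a scaled plain difference near the background level.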
Abstract:
In most microarray technologies, a number of critical steps are required to convert raw intensity measurements into the data relied upon by data analysts, biologists and clinicians. These data manipulations, referred to as preprocessing, can influence the quality of the ultimate measurements. In the last few years, the high-throughput measurement of gene expression has been the most popular application of microarray technology. For this application, various groups have demonstrated that the use of modern statistical methodology can substantially improve accuracy and precision of gene expression measurements, relative to ad-hoc procedures introduced by designers and manufacturers of the technology. Currently, other applications of microarrays are becoming more and more popular. In this paper we describe a preprocessing methodology for a technology designed for the identification of DNA sequence variants in specific genes or regions of the human genome that are associated with phenotypes of interest such as disease. In particular we describe methodology useful for preprocessing Affymetrix SNP chips and obtaining genotype calls with the preprocessed data. We demonstrate how our procedure improves existing approaches using data from three relatively large studies, including one in which a large number of independent calls are available. Software implementing these ideas is available in the Bioconductor oligo package.
Abstract:
Cholangiocarcinoma is the second most common malignant tumor of the liver. We analyzed, immunohistochemically, the significance of cell cycle- and apoptosis-related markers in 128 cholangiocarcinomas (42 intrahepatic, 70 extrahepatic, and 16 gallbladder carcinomas) combined in a tissue microarray. Follow-up was available for 57 patients (44.5%). In comparison with normal tissue (29 specimens), cholangiocarcinomas expressed p53, bcl-2, bax, and COX-2 significantly more frequently (P < .05). Intrahepatic tumors were significantly more frequently bcl-2+ and p16+, whereas extrahepatic tumors were more often p53+ (P < .05). Loss of p16 expression was associated with reduced survival of patients. Our data show that p53, bcl-2, bax, and COX-2 have an important role in the pathogenesis of cholangiocarcinomas. The differential expression of p16, bcl-2, and p53 between intrahepatic and extrahepatic tumors demonstrates that there are location-related differences in the phenotype and the genetic profiles of these tumors. Moreover, p16 was identified as an important prognostic marker in cholangiocarcinomas.
Abstract:
BACKGROUND AND OBJECTIVE Connective tissue grafts are frequently applied, together with Emdogain®, for root coverage. However, it is unknown whether fibroblasts from the gingiva and from the palate respond similarly to Emdogain®. The aim of this study was therefore to evaluate the effect of Emdogain® on fibroblasts from palatal and gingival connective tissue using a genome-wide microarray approach. MATERIAL AND METHODS Human palatal and gingival fibroblasts were exposed to Emdogain® and RNA was subjected to microarray analysis followed by gene ontology screening with Database for Annotation, Visualization and Integrated Discovery functional annotation clustering, Kyoto Encyclopedia of Genes and Genomes pathway analysis and the Search Tool for the Retrieval of Interacting Genes/Proteins functional protein association network. Microarray results were confirmed by quantitative RT-PCR analysis. RESULTS The transcription levels of 106 genes were up-/down-regulated by at least five-fold in both gingival and palatal fibroblasts upon exposure to Emdogain®. Gene ontology screening assigned the respective genes into 118 biological processes, six cellular components, eight molecular functions and five pathways. Among the striking patterns observed were the changing expression of ligands targeting the transforming growth factor-beta and gp130 receptor family as well as the transition of mesenchymal epithelial cells. Moreover, Emdogain® caused changes in expression of receptors for chemokines, lipids and hormones, and for transcription factors such as SMAD3, peroxisome proliferator-activated receptor gamma and those of the ETS family. CONCLUSION The present data suggest that Emdogain® causes substantial alterations in gene expression, with similar patterns observed in palatal and gingival fibroblasts.
High-resolution microarray analysis of chromosome 20q in human colon cancer metastasis model systems
Abstract:
Amplification of human chromosome 20q DNA is the most frequently occurring chromosomal abnormality detected in sporadic colorectal carcinomas and shows significant correlation with liver metastases. Through comprehensive high-resolution microarray comparative genomic hybridization and microarray gene expression profiling, we have characterized chromosome 20q amplicon genes associated with human colorectal cancer metastasis in two in vitro metastasis model systems. The results revealed increasing complexity of the 20q genomic profile from the primary tumor-derived cell lines to the lymph node and liver metastasis-derived cell lines. Expression analysis of chromosome 20q revealed a subset of overexpressed genes residing within the regions of genomic copy number gain in all the tumor cell lines, suggesting these are chromosome 20q copy number-responsive genes. Based on their preferential expression levels in the model system cell lines and known biological function, four of the overexpressed genes mapping to the common intervals of genomic copy gain were considered the most promising candidate colorectal metastasis-associated genes. Validation of genomic copy number and expression array data was carried out on these genes, with one gene, DNMT3B, standing out as expressed at relatively higher levels in the metastasis-derived cell lines compared with their primary-derived counterparts in both model systems analyzed. The data provide evidence for the role of chromosome 20q genes with low copy gain and elevated expression in the clonal evolution of metastatic cells and suggest that such genes may serve as early biomarkers of metastatic potential. The data also support the utility of the combined microarray comparative genomic hybridization and expression array analysis for identifying copy number-responsive genes in areas of low DNA copy gain in cancer cells.
Abstract:
Most studies of differential gene expression have been conducted between two given conditions. The two-condition experimental (TCE) approach is simple in that all genes detected display a common differential expression pattern responsive to a common two-condition difference. Therefore, genes that are differentially expressed under conditions other than the given two are undetectable with the TCE approach. To address this problem, we propose a new approach called multiple-condition experiment (MCE) without replication and develop corresponding statistical methods including inference of pairs of conditions for genes, new t-statistics, and a generalized multiple-testing method for any multiple-testing procedure via a control parameter C. We applied these statistical methods to analyze our real MCE data from breast cancer cell lines and found that 85 percent of gene-expression variations were caused by genotypic effects and genotype-ANAX1 overexpression interactions, which agrees well with our expected results. We also applied our methods to the adenoma dataset of Notterman et al. and identified 93 differentially expressed genes that could not be found in TCE. The MCE approach is a conceptual breakthrough in many aspects: (a) many conditions of interest can be examined simultaneously; (b) study of association between differential expressions of genes and conditions becomes easy; (c) it can provide more precise information for molecular classification and diagnosis of tumors; (d) it can save a lot of experimental resources and time for investigators.
Abstract:
Microarray-based global gene expression profiling, with the use of sophisticated statistical algorithms, is providing new insights into the pathogenesis of autoimmune diseases. We have applied a novel statistical technique for gene selection based on machine learning approaches to analyze microarray expression data gathered from patients with systemic lupus erythematosus (SLE) and primary antiphospholipid syndrome (PAPS), two autoimmune diseases of unknown genetic origin that share many common features. The methodology included a combination of three data discretization policies, a consensus gene selection method, and a multivariate correlation measurement. A set of 150 genes was found to discriminate SLE and PAPS patients from healthy individuals. Statistical validations demonstrate the relevance of this gene set from a univariate and multivariate perspective. Moreover, functional characterization of these genes identified an interferon-regulated gene signature, consistent with previous reports. It also revealed the existence of other regulatory pathways, including those regulated by PTEN, TNF, and BCL-2, which are altered in SLE and PAPS. Remarkably, a significant number of these genes carry E2F binding motifs in their promoters, projecting a role for E2F in the regulation of autoimmunity.
Abstract:
Cross-reactivity of plant foods is an important phenomenon in allergy, with geographical variations with respect to the number and prevalence of the allergens involved in this process, whose complexity requires detailed studies. We have addressed the role of thaumatin-like proteins (TLPs) in cross-reactivity between fruit and pollen allergies. A representative panel of 16 purified TLPs was printed onto an allergen microarray. The proteins selected belonged to the sources most frequently associated with peach allergy in representative regions of Spain. Sera from two groups of well characterized patients, one with allergy to Rosaceae fruit (FAG) and another with allergy to pollens but tolerant to food-plant allergens (PAG), were obtained from seven geographical areas with different environmental pollen profiles. Cross-reactivity between members of this family was demonstrated by inhibition assays. Only 6 out of 16 purified TLPs showed noticeable allergenic activity in the studied populations. Pru p 2.0201, the peach TLP (41%), chestnut TLP (24%) and plane pollen TLP (22%) proved to be allergens of probable relevance to fruit allergy, being mainly associated with pollen sensitization, and strongly linked to specific geographical areas such as Barcelona, Bilbao, the Canary Islands and Madrid. The patients exhibited a >50% positive response to Pru p 2.0201 and to chestnut TLP in these specific areas. Therefore, their recognition patterns were associated with the geographical area, suggesting a role for pollen in the sensitization of these allergens. Finally, the co-sensitizations of patients considering pairs of TLP allergens were analyzed by using the co-sensitization graph associated with an allergen microarray immunoassay. Our data indicate that TLPs are significant allergens in plant food allergy and should be considered when diagnosing and treating pollen-food allergy.
Abstract:
We sought to create a comprehensive catalog of yeast genes whose transcript levels vary periodically within the cell cycle. To this end, we used DNA microarrays and samples from yeast cultures synchronized by three independent methods: α factor arrest, elutriation, and arrest of a cdc15 temperature-sensitive mutant. Using periodicity and correlation algorithms, we identified 800 genes that meet an objective minimum criterion for cell cycle regulation. In separate experiments, designed to examine the effects of inducing either the G1 cyclin Cln3p or the B-type cyclin Clb2p, we found that the mRNA levels of more than half of these 800 genes respond to one or both of these cyclins. Furthermore, we analyzed our set of cell cycle–regulated genes for known and new promoter elements and show that several known elements (or variations thereof) contain information predictive of cell cycle regulation. A full description and complete data sets are available at http://cellcycle-www.stanford.edu
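The abstract does not say which periodicity algorithm was used. As one illustrative possibility, the strength of a gene's oscillation at a given period can be scored by the magnitude of its Fourier component at that period. In the Python sketch below, the function name, the sampling times, and the 60-minute period are all assumptions chosen for the example.

```python
import math

def fourier_score(times, values, period):
    """Magnitude of the Fourier component of `values` at `period`.

    The series is mean-centered first, so a flat profile scores zero.
    """
    omega = 2 * math.pi / period
    mean = sum(values) / len(values)
    a = sum((v - mean) * math.cos(omega * t) for t, v in zip(times, values))
    b = sum((v - mean) * math.sin(omega * t) for t, v in zip(times, values))
    return math.hypot(a, b)

# Hypothetical time course: samples every 10 minutes over two 60-minute cycles.
times = list(range(0, 120, 10))
periodic = [math.cos(2 * math.pi * t / 60) for t in times]
flat = [1.0] * len(times)

print(fourier_score(times, periodic, period=60) > fourier_score(times, flat, period=60))
```

A score like this only ranks genes; deciding which ones count as cell cycle regulated still requires a threshold, which the abstract describes as an objective minimum criterion.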
Abstract:
We present statistical methods for analyzing replicated cDNA microarray expression data and report the results of a controlled experiment. The study was conducted to investigate inherent variability in gene expression data and the extent to which replication in an experiment produces more consistent and reliable findings. We introduce a statistical model to describe the probability that mRNA is contained in the target sample tissue, converted to probe, and ultimately detected on the slide. We also introduce a method to analyze the combined data from all replicates. Of the 288 genes considered in this controlled experiment, 32 would be expected to produce strong hybridization signals because of the known presence of repetitive sequences within them. Results based on individual replicates, however, show that there are 55, 36, and 58 highly expressed genes in replicates 1, 2, and 3, respectively. On the other hand, an analysis using the combined data from all 3 replicates reveals that only 2 of the 288 genes are incorrectly classified as expressed. Our experiment shows that any single microarray output is subject to substantial variability. By pooling data from replicates, we can provide a more reliable analysis of gene expression data. Therefore, we conclude that designing experiments with replications will greatly reduce misclassification rates. We recommend that at least three replicates be used in designing experiments with cDNA microarrays, particularly when gene expression data from single specimens are being analyzed.
Abstract:
We describe the time evolution of gene expression levels by using a time translation matrix to predict future expression levels of genes based on their expression levels at some initial time. We deduce the time translation matrix for previously published DNA microarray gene expression data sets by modeling them within a linear framework by using the characteristic modes obtained by singular value decomposition. The resulting time translation matrix provides a measure of the relationships among the modes and governs their time evolution. We show that a truncated matrix linking just a few modes is a good approximation of the full time translation matrix. This finding suggests that the number of essential connections among the genes is small.
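The linear framework described above can be written compactly as follows. The notation ($X$, $U_r$, $y$, $M$, $Y_0$, $Y_1$) is assumed for illustration, equally spaced time points are assumed, and the least-squares estimate in the last line is one possible choice; the abstract does not say how the matrix is actually deduced.

```latex
% X: genes-by-time expression matrix with columns x(t_1), ..., x(t_T).
X = U \Sigma V^{\top}

% Amplitudes of the first r characteristic modes at time t_j
% (U_r holds the first r columns of U):
y(t_j) = U_r^{\top} x(t_j) \in \mathbb{R}^{r}

% Linear time evolution governed by the r-by-r time translation matrix M:
y(t_{j+1}) = M \, y(t_j)

% One possible least-squares estimate, with
% Y_0 = [\, y(t_1) \cdots y(t_{T-1}) \,], \quad Y_1 = [\, y(t_2) \cdots y(t_T) \,]
% and Y_0^{\dagger} the Moore-Penrose pseudoinverse:
\hat{M} = Y_1 \, Y_0^{\dagger}
```

Truncating to a small $r$ corresponds to the abstract's observation that a matrix linking just a few modes approximates the full time translation matrix well.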
Abstract:
Since the completion of the Saccharomyces cerevisiae genomic sequence in 1996 [Goffeau,A. et al. (1997) Nature, 387, 5], several creative and ambitious projects have been initiated to explore the functions of gene products or gene expression on a genome-wide scale. To help researchers take advantage of these projects, the Saccharomyces Genome Database (SGD) has created two new tools, Function Junction and Expression Connection. Together, the tools form a central resource for querying multiple large-scale analysis projects for data about individual genes. Function Junction provides information from diverse projects that shed light on the role a gene product plays in the cell, while Expression Connection delivers information produced by the ever-increasing number of microarray projects. WWW access to SGD is available at genome-www.stanford.edu/Saccharomyces/.