3 resultados para OLIGONUCLEOTIDE ARRAYS

em Collection Of Biostatistics Research Archive


Relevância:

30.00% 30.00%

Publicador:

Resumo:

High density oligonucleotide expression arrays are a widely used tool for the measurement of gene expression on a large scale. Affymetrix GeneChip arrays appear to dominate this market. These arrays use short oligonucleotides to probe for genes in an RNA sample. Due to optical noise, non-specific hybridization, probe-specific effects, and measurement error, ad-hoc measures of expression, that summarize probe intensities, can lead to imprecise and inaccurate results. Various researchers have demonstrated that expression measures based on simple statistical models can provide great improvements over the ad-hoc procedure offered by Affymetrix. Recently, physical models based on molecular hybridization theory, have been proposed as useful tools for prediction of, for example, non-specific hybridization. These physical models show great potential in terms of improving existing expression measures. In this paper we demonstrate that the system producing the measured intensities is too complex to be fully described with these relatively simple physical models and we propose empirically motivated stochastic models that compliment the above mentioned molecular hybridization theory to provide a comprehensive description of the data. We discuss how the proposed model can be used to obtain improved measures of expression useful for the data analysts.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Submicroscopic changes in chromosomal DNA copy number dosage are common and have been implicated in many heritable diseases and cancers. Recent high-throughput technologies have a resolution that permits the detection of segmental changes in DNA copy number that span thousands of basepairs across the genome. Genome-wide association studies (GWAS) may simultaneously screen for copy number-phenotype and SNP-phenotype associations as part of the analytic strategy. However, genome-wide array analyses are particularly susceptible to batch effects as the logistics of preparing DNA and processing thousands of arrays often involves multiple laboratories and technicians, or changes over calendar time to the reagents and laboratory equipment. Failure to adjust for batch effects can lead to incorrect inference and requires inefficient post-hoc quality control procedures that exclude regions that are associated with batch. Our work extends previous model-based approaches for copy number estimation by explicitly modeling batch effects and using shrinkage to improve locus-specific estimates of copy number uncertainty. Key features of this approach include the use of diallelic genotype calls from experimental data to estimate batch- and locus-specific parameters of background and signal without the requirement of training data. We illustrate these ideas using a study of bipolar disease and a study of chromosome 21 trisomy. The former has batch effects that dominate much of the observed variation in quantile-normalized intensities, while the latter illustrates the robustness of our approach to datasets where as many as 25% of the samples have altered copy number. Locus-specific estimates of copy number can be plotted on the copy-number scale to investigate mosaicism and guide the choice of appropriate downstream approaches for smoothing the copy number as a function of physical position. The software is open source and implemented in the R package CRLMM available at Bioconductor (http:www.bioconductor.org).

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In most microarray technologies, a number of critical steps are required to convert raw intensity measurements into the data relied upon by data analysts, biologists and clinicians. These data manipulations, referred to as preprocessing, can influence the quality of the ultimate measurements. In the last few years, the high-throughput measurement of gene expression is the most popular application of microarray technology. For this application, various groups have demonstrated that the use of modern statistical methodology can substantially improve accuracy and precision of gene expression measurements, relative to ad-hoc procedures introduced by designers and manufacturers of the technology. Currently, other applications of microarrays are becoming more and more popular. In this paper we describe a preprocessing methodology for a technology designed for the identification of DNA sequence variants in specific genes or regions of the human genome that are associated with phenotypes of interest such as disease. In particular we describe methodology useful for preprocessing Affymetrix SNP chips and obtaining genotype calls with the preprocessed data. We demonstrate how our procedure improves existing approaches using data from three relatively large studies including one in which large number independent calls are available. Software implementing these ideas are avialble from the Bioconductor oligo package.