939 results for DNA data banks


Relevance:

40.00%

Abstract:

This article gives an overview of the methods used in the low-level analysis of gene expression data generated using DNA microarrays. This type of experiment makes it possible to determine relative levels of nucleic acid abundance in a set of tissues or cell populations for thousands of transcripts or loci simultaneously. Careful statistical design and analysis are essential to improve the efficiency and reliability of microarray experiments throughout the data acquisition and analysis process. This includes the design of probes, the experimental design, the image analysis of scanned microarray images, the normalization of fluorescence intensities, the assessment of the quality of microarray data and the incorporation of quality information in subsequent analyses, the combination of information across arrays and across sets of experiments, the discovery and recognition of patterns in expression at the single-gene and multiple-gene levels, and the assessment of the significance of these findings, given that the data are noisy and therefore contain random features. For all of these components, access to a flexible and efficient statistical computing environment is essential.
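As a rough illustration of the "normalization of fluorescence intensities" step mentioned above, the sketch below median-centers the log2 ratios of a two-channel array. This is one simple normalization chosen here for illustration, not a method taken from the article; the function name and the red/green input lists are assumptions.

```python
import math
from statistics import median


def median_center_log_ratios(red, green):
    """Return median-centered log2(red/green) ratios for one array.

    red, green: hypothetical lists of positive, background-corrected
    spot intensities from the two channels, in the same spot order.
    """
    # Log ratio per spot (often called M in the microarray literature).
    log_ratios = [math.log2(r / g) for r, g in zip(red, green)]
    # Subtracting the array-wide median assumes most spots are unchanged.
    center = median(log_ratios)
    return [m - center for m in log_ratios]
```

Centering on the median rests on the assumption that most transcripts are not differentially expressed between the two samples; intensity-dependent methods are usually preferred when that assumption is doubtful.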

Relevance:

40.00%

Abstract:

(1) A mathematical theory for computing the probabilities of various nucleotide configurations is developed, and the probability of obtaining the correct phylogenetic tree (model tree) from sequence data is evaluated for six phylogenetic tree-making methods (UPGMA, distance Wagner method, transformed distance method, Fitch-Margoliash's method, maximum parsimony method, and compatibility method). The number of nucleotides (m*) necessary to obtain the correct tree with a probability of 95% is estimated with special reference to the human, chimpanzee, and gorilla divergence. m* is at least 4,200, but the availability of outgroup species greatly reduces m* for all methods except UPGMA. m* increases if transitions occur more frequently than transversions, as in the case of mitochondrial DNA. (2) A new tree-making method called the neighbor-joining method is proposed. This method is applicable to either distance data or character state data. Computer simulation has shown that the neighbor-joining method is generally better than UPGMA, Farris' method, Li's method, and the modified Farris method at recovering the true topology when distance data are used. A related method, the simultaneous partitioning method, is also discussed. (3) The maximum likelihood (ML) method for phylogeny reconstruction under the assumption of both constant and varying evolutionary rates is studied, and a new algorithm for obtaining the ML tree is presented. This method gives a tree similar to that obtained by UPGMA when a constant evolutionary rate is assumed, whereas it gives a tree similar to those obtained by the maximum parsimony method and the neighbor-joining method when a varying evolutionary rate is assumed.
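To make the neighbor-joining idea in (2) concrete, here is a minimal sketch of a single joining step on distance data. It uses the widely taught Q-matrix form of the selection criterion, which is not necessarily the formulation used in the work summarized above; the function name and the list-of-lists input are assumptions.

```python
def nj_first_join(dist):
    """Pick the first pair of taxa to join under neighbor joining.

    dist: symmetric matrix (list of lists) of pairwise distances
    with a zero diagonal and at least three taxa.
    Returns ((i, j), (len_i, len_j)): the chosen pair and the branch
    lengths from each member of the pair to the new internal node.
    """
    n = len(dist)
    row_sums = [sum(row) for row in dist]
    best_pair, best_q = None, float("inf")
    for i in range(n):
        for j in range(i + 1, n):
            # Q-criterion: small values mark pairs that are close to each
            # other and far from everything else.
            q = (n - 2) * dist[i][j] - row_sums[i] - row_sums[j]
            if q < best_q:
                best_pair, best_q = (i, j), q
    i, j = best_pair
    # Branch lengths from the joined taxa to their new parent node.
    len_i = dist[i][j] / 2 + (row_sums[i] - row_sums[j]) / (2 * (n - 2))
    len_j = dist[i][j] - len_i
    return best_pair, (len_i, len_j)
```

A full implementation would replace the joined pair with the new node, recompute distances to it, and repeat until only three nodes remain. Unlike UPGMA, this step does not assume a constant evolutionary rate, which is why the two branch lengths can differ.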

Relevance:

40.00%

Abstract:

Current methods for detection of copy number variants (CNV) and aberrations (CNA) from targeted sequencing data are based on the depth of coverage of captured exons. Accurate CNA determination is complicated by uneven genomic distribution and non-uniform capture efficiency of targeted exons. Here we present CopywriteR, which avoids these problems by exploiting 'off-target' sequence reads. CopywriteR allows for extracting uniformly distributed copy number information, can be used without a reference, and can be applied to sequencing data obtained from various techniques, including chromatin immunoprecipitation and target enrichment on small gene panels. CopywriteR outperforms existing methods and constitutes a widely applicable alternative to available tools.
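The general depth-of-coverage idea behind such tools can be sketched as a per-bin log2 ratio of library-size-normalized read counts. This is a generic illustration under stated assumptions, not CopywriteR's actual algorithm or API: the function name, the sample-versus-control setup, and the pseudocount are all hypothetical.

```python
import math


def binned_log2_ratios(sample_counts, control_counts, pseudocount=0.5):
    """Return per-bin log2 copy number ratios (sample vs. control).

    sample_counts, control_counts: hypothetical read counts per
    fixed-size genomic bin, in the same bin order, with nonzero totals.
    pseudocount: small value added to each bin to avoid log2(0).
    """
    sample_total = sum(sample_counts)
    control_total = sum(control_counts)
    ratios = []
    for s, c in zip(sample_counts, control_counts):
        # Scale each bin by its library size so sequencing depth cancels.
        s_norm = (s + pseudocount) / sample_total
        c_norm = (c + pseudocount) / control_total
        ratios.append(math.log2(s_norm / c_norm))
    return ratios
```

Bins with ratios well above or below zero are candidates for gains or losses; real tools add steps such as GC-content and mappability correction and segmentation, which are omitted here.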

Relevance:

40.00%

Abstract:

The purpose of this research is to develop a new statistical method to determine the minimum set of rows (R) in an R x C contingency table of discrete data that explains the dependence of observations. The statistical power of the method will be empirically determined by computer simulation to judge its efficiency relative to existing methods. The method will be applied to data on DNA fragment length variation at six VNTR loci in over 72 populations from five major racial groups of humans (total sample size is over 15,000 individuals, with each sample having at least 50 individuals). DNA fragment lengths grouped in bins will form the basis for studying inter-population DNA variation within the racial groups. If these variations are significant, the method will provide a rigorous re-binning procedure for forensic computation of DNA profile frequencies that takes into account intra-racial DNA variation among populations.
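For context, the standard starting point for testing dependence in an R x C table is Pearson's chi-square statistic, sketched below. This is the classical test, not the new row-selection method proposed in the abstract; the function name and the input layout (populations as rows, fragment-length bins as columns) are assumptions.

```python
def chi_square_statistic(table):
    """Return (statistic, degrees_of_freedom) for an R x C table.

    table: list of rows of observed counts. Every row total and
    column total is assumed to be positive.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if rows and columns were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(col_totals) - 1)
    return statistic, dof
```

A large statistic relative to the chi-square distribution with the returned degrees of freedom indicates dependence, but it does not say which rows drive it; that is the gap the proposed method is meant to address.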

Relevance:

40.00%

Abstract:

Genetic investigations of eukaryotic plankton have confirmed the existence of modern biogeographic patterns, but analyses of palaeoecological data exploring the temporal variability of these patterns have rarely been presented. Ancient sedimentary DNA has proved suitable for investigating past assemblage turnover in the course of environmental change, but the genetic relatedness of the identified lineages has not yet been investigated. Here, we investigate the relatedness of diatom lineages in Siberian lakes along environmental gradients (i.e. across treeline transects), over geographic distance and through time (i.e. the last 7000 years) using modern and ancient sedimentary DNA. Our results indicate that closely related Staurosira lineages occur in similar environments and less-related lineages in dissimilar environments, in our case different vegetation and co-varying climatic and limnic variables across treeline transects. Thus our study reveals that environmental conditions rather than geographic distance are reflected by diatom-relatedness patterns in space and time. We tentatively speculate that the detected relatedness pattern in Staurosira across the treeline could be a result of adaptation to diverse environmental conditions across the arctic boreal treeline. However, a geographically driven divergence and subsequent repopulation of ecologically different habitats might also explain the observed pattern.

Relevance:

40.00%

Abstract:

This paper describes seagrass species and percentage cover point-based field data sets derived from georeferenced photo transects. Annually or biannually over a ten-year period (2004-2015), data sets were collected using 30-50 transects, 500-800 m in length, distributed across a 142 km² shallow, clear-water seagrass habitat, the Eastern Banks, Moreton Bay, Australia. Each of the eight data sets includes seagrass property information derived from approximately 3000 georeferenced, downward-looking photographs captured at 2-4 m intervals along the transects. Photographs were manually interpreted to estimate seagrass species composition and percentage cover (Coral Point Count with Excel extensions; CPCe). Understanding seagrass biology, ecology and dynamics for scientific and management purposes requires point-based data on species composition and cover. This data set, and the methods used to derive it, are a globally unique example for seagrass ecological applications. It provides the basis for multiple further studies at this site, regional to global comparative studies, and the design of similar monitoring programs elsewhere.