5 resultados para Geo-statistical model
em National Center for Biotechnology Information - NCBI
Resumo:
A “most probable state” equilibrium statistical theory for random distributions of hetons in a closed basin is developed here in the context of two-layer quasigeostrophic models for the spreading phase of open-ocean convection. The theory depends only on bulk conserved quantities such as energy, circulation, and the range of values of potential vorticity in each layer. The simplest theory is formulated for a uniform cooling event over the entire basin that triggers a homogeneous random distribution of convective towers. For a small Rossby deformation radius typical for open-ocean convection sites, the most probable states that arise from this theory strongly resemble the saturated baroclinic states of the spreading phase of convection, with a stabilizing barotropic rim current and localized temperature anomaly.
Resumo:
Structural genomics aims to solve a large number of protein structures that represent the protein space. Currently an exhaustive solution for all structures seems prohibitively expensive, so the challenge is to define a relatively small set of proteins with new, currently unknown folds. This paper presents a method that assigns each protein with a probability of having an unsolved fold. The method makes extensive use of protomap, a sequence-based classification, and scop, a structure-based classification. According to protomap, the protein space encodes the relationship among proteins as a graph whose vertices correspond to 13,354 clusters of proteins. A representative fold for a cluster with at least one solved protein is determined after superposition of all scop (release 1.37) folds onto protomap clusters. Distances within the protomap graph are computed from each representative fold to the neighboring folds. The distribution of these distances is used to create a statistical model for distances among those folds that are already known and those that have yet to be discovered. The distribution of distances for solved/unsolved proteins is significantly different. This difference makes it possible to use Bayes' rule to derive a statistical estimate that any protein has a yet undetermined fold. Proteins that score the highest probability to represent a new fold constitute the target list for structural determination. Our predicted probabilities for unsolved proteins correlate very well with the proportion of new folds among recently solved structures (new scop 1.39 records) that are disjoint from our original training set.
Resumo:
We present statistical methods for analyzing replicated cDNA microarray expression data and report the results of a controlled experiment. The study was conducted to investigate inherent variability in gene expression data and the extent to which replication in an experiment produces more consistent and reliable findings. We introduce a statistical model to describe the probability that mRNA is contained in the target sample tissue, converted to probe, and ultimately detected on the slide. We also introduce a method to analyze the combined data from all replicates. Of the 288 genes considered in this controlled experiment, 32 would be expected to produce strong hybridization signals because of the known presence of repetitive sequences within them. Results based on individual replicates, however, show that there are 55, 36, and 58 highly expressed genes in replicates 1, 2, and 3, respectively. On the other hand, an analysis by using the combined data from all 3 replicates reveals that only 2 of the 288 genes are incorrectly classified as expressed. Our experiment shows that any single microarray output is subject to substantial variability. By pooling data from replicates, we can provide a more reliable analysis of gene expression data. Therefore, we conclude that designing experiments with replications will greatly reduce misclassification rates. We recommend that at least three replicates be used in designing experiments by using cDNA microarrays, particularly when gene expression data from single specimens are being analyzed.
Resumo:
A statistical modeling approach is proposed for use in searching large microarray data sets for genes that have a transcriptional response to a stimulus. The approach is unrestricted with respect to the timing, magnitude or duration of the response, or the overall abundance of the transcript. The statistical model makes an accommodation for systematic heterogeneity in expression levels. Corresponding data analyses provide gene-specific information, and the approach provides a means for evaluating the statistical significance of such information. To illustrate this strategy we have derived a model to depict the profile expected for a periodically transcribed gene and used it to look for budding yeast transcripts that adhere to this profile. Using objective criteria, this method identifies 81% of the known periodic transcripts and 1,088 genes, which show significant periodicity in at least one of the three data sets analyzed. However, only one-quarter of these genes show significant oscillations in at least two data sets and can be classified as periodic with high confidence. The method provides estimates of the mean activation and deactivation times, induced and basal expression levels, and statistical measures of the precision of these estimates for each periodic transcript.
Resumo:
Understanding the mechanism of protein secondary structure formation is an essential part of the protein-folding puzzle. Here, we describe a simple statistical mechanical model for the formation of a β-hairpin, the minimal structural element of the antiparallel β-pleated sheet. The model accurately describes the thermodynamic and kinetic behavior of a 16-residue, β-hairpin-forming peptide, successfully explaining its two-state behavior and apparent negative activation energy for folding. The model classifies structures according to their backbone conformation, defined by 15 pairs of dihedral angles, and is further simplified by considering only the 120 structures with contiguous stretches of native pairs of backbone dihedral angles. This single sequence approximation is tested by comparison with a more complete model that includes the 215 possible conformations and 15 × 215 possible kinetic transitions. Finally, we use the model to predict the equilibrium unfolding curves and kinetics for several variants of the β-hairpin peptide.