950 results for Geo-statistical model
Abstract:
The availability of complete genome sequences and mRNA expression data for all genes creates new opportunities and challenges for identifying DNA sequence motifs that control gene expression. An algorithm, “MobyDick,” is presented that decomposes a set of DNA sequences into the most probable dictionary of motifs or words. This method is applicable to any set of DNA sequences: for example, all upstream regions in a genome or all genes expressed under certain conditions. Identification of words is based on a probabilistic segmentation model in which the significance of longer words is deduced from the frequency of shorter ones of various lengths, eliminating the need for a separate set of reference data to define probabilities. We have built a dictionary with 1,200 words for the 6,000 upstream regulatory regions in the yeast genome; the 500 most significant words (some with as few as 10 copies in all of the upstream regions) match 114 of 443 experimentally determined sites (a significance level of 18 standard deviations). When analyzing all of the genes up-regulated during sporulation as a group, we find many motifs in addition to the few previously identified by analyzing the expression subclusters individually. Applying MobyDick to the genes derepressed when the general repressor Tup1 is deleted, we find known as well as putative binding sites for its regulatory partners.
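As a rough illustration of the probabilistic segmentation idea in this abstract, the Python sketch below computes the likelihood of a sequence as the sum, over every way of cutting it into dictionary words, of the product of the word probabilities. It assumes words are drawn independently. The function name, the toy dictionary and its probabilities are hypothetical and are not taken from MobyDick itself.

```python
def segmentation_likelihood(seq, word_probs):
    """Sum over all segmentations of seq into dictionary words.

    z[i] holds the total probability of all segmentations of seq[:i].
    """
    z = [0.0] * (len(seq) + 1)
    z[0] = 1.0  # the empty prefix has exactly one (empty) segmentation
    for end in range(1, len(seq) + 1):
        for word, prob in word_probs.items():
            start = end - len(word)
            # A word contributes if it matches the text ending at `end`.
            if start >= 0 and seq[start:end] == word:
                z[end] += prob * z[start]
    return z[len(seq)]


# Hypothetical toy dictionary: four single letters plus one longer "motif".
toy_dictionary = {"A": 0.3, "C": 0.2, "G": 0.2, "T": 0.2, "TATA": 0.1}

likelihood = segmentation_likelihood("TATA", toy_dictionary)
```

Here "TATA" can be read either as four single letters or as one motif, and the likelihood adds both readings. In a full method the word probabilities would be fitted to the data rather than fixed by hand.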
Abstract:
The present work develops and implements a biomathematical statement of how reciprocal connectivity drives stress-adaptive homeostasis in the corticotropic (hypothalamo-pituitary-adrenal) axis. In initial analyses with this interactive construct, we test six specific a priori hypotheses of mechanisms linking circadian (24-h) rhythmicity to pulsatile secretory output. This formulation offers a dynamic framework for later statistical estimation of unobserved in vivo neurohormone secretion and within-axis, dose-responsive interfaces in health and disease. Explication of the core dynamics of the stress-responsive corticotropic axis based on secure physiological precepts should help to unveil new biomedical hypotheses of stressor-specific system failure.
Abstract:
A model of interdependent decision making has been developed to understand group differences in socioeconomic behavior such as nonmarital fertility, school attendance, and drug use. The statistical mechanical structure of the model illustrates how the physical sciences contain useful tools for the study of socioeconomic phenomena.
Abstract:
A molecular model of poorly understood hydrophobic effects is heuristically developed using the methods of information theory. Because primitive hydrophobic effects can be tied to the probability of observing a molecular-sized cavity in the solvent, the probability distribution of the number of solvent centers in a cavity volume is modeled on the basis of the two moments available from the density and radial distribution of oxygen atoms in liquid water. The modeled distribution then yields the probability that no solvent centers are found in the cavity volume. This model is shown to account quantitatively for the central hydrophobic phenomena of cavity formation and association of inert gas solutes. The connection of information theory to statistical thermodynamics provides a basis for clarification of hydrophobic effects. The simplicity and flexibility of the approach suggest that it should permit applications to conformational equilibria of nonpolar solutes and hydrophobic residues in biopolymers.
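The two-moment construction in this abstract can be sketched in Python as follows. This is a minimal sketch that assumes a maximum-entropy distribution with a flat prior, p_n proportional to exp(λ1·n + λ2·n²), fitted to a given mean and variance of the occupancy number by Newton iteration. The function name, the cutoff `n_max` and the numerical moments are illustrative assumptions, not values from the paper.

```python
import math


def maxent_occupancy(mean, variance, n_max=40, iterations=50, tol=1e-12):
    """Return p_0..p_n_max with p_n ~ exp(lam1*n + lam2*n^2) matching two moments."""
    second = variance + mean * mean      # target <n^2>
    lam1 = mean / variance               # Gaussian-like starting guess
    lam2 = -0.5 / variance
    ns = list(range(n_max + 1))
    p = []
    for _ in range(iterations):
        exps = [lam1 * n + lam2 * n * n for n in ns]
        top = max(exps)                  # subtract the maximum for stability
        w = [math.exp(e - top) for e in exps]
        z = sum(w)
        p = [x / z for x in w]
        m1, m2, m3, m4 = (sum(n ** k * q for n, q in zip(ns, p)) for k in (1, 2, 3, 4))
        g1, g2 = m1 - mean, m2 - second  # moment mismatches
        if abs(g1) < tol and abs(g2) < tol:
            break
        # Newton step: the Jacobian is the covariance matrix of (n, n^2).
        h11, h12, h22 = m2 - m1 * m1, m3 - m1 * m2, m4 - m2 * m2
        det = h11 * h22 - h12 * h12
        lam1 -= (h22 * g1 - h12 * g2) / det
        lam2 -= (h11 * g2 - h12 * g1) / det
    return p


# Hypothetical moments for the number of solvent centers in a cavity volume.
p = maxent_occupancy(mean=4.0, variance=2.0)
p_empty = p[0]                            # probability of an empty cavity
cavity_free_energy_kT = -math.log(p_empty)
```

The last line uses the standard relation between the probability of finding the volume empty and the free energy of forming the cavity, expressed in units of kT.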
Abstract:
Geographic knowledge discovery (GKD) is the process of extracting information and knowledge from massive georeferenced databases. Usually the process is accomplished by two different systems, Geographic Information Systems (GIS) and data mining engines. However, developing those systems is a complex task because it does not follow a systematic, integrated and standard methodology. To overcome these pitfalls, we propose in this paper a modeling framework that addresses the development of the different parts of a multilayer GKD process. The main advantages of our framework are that: (i) it reduces the design effort, (ii) it improves the quality of the resulting systems, (iii) it is independent of platforms, (iv) it facilitates the use of data mining techniques on georeferenced data, and finally, (v) it improves communication between different users.
Abstract:
To build dynamic models for the prediction and management of degraded Mediterranean forest areas, it was necessary to build the MARIOLA model, which is a computer calculation program. The model includes the following subprograms: 1) the bioshrub program, which calculates total, green and woody shrub biomass and establishes the time differences needed to calculate growth; 2) the selego program, which builds the flow equations from the experimental data using advanced multiple-regression procedures; 3) the VEGETATION program, which solves the state equations with Euler or Runge-Kutta integration methods. Each of these subprograms can run independently or linked to the others.
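The two integration methods named for the VEGETATION program can be sketched in Python as follows. This is a minimal sketch that uses a logistic growth equation as a stand-in state equation; the equation, the parameter values and all function names are assumptions for illustration and are not taken from MARIOLA.

```python
import math

RATE = 0.5        # hypothetical growth rate
CAPACITY = 100.0  # hypothetical carrying capacity (biomass units)


def logistic_growth(t, biomass):
    """Stand-in state equation: d(biomass)/dt for logistic growth."""
    return RATE * biomass * (1.0 - biomass / CAPACITY)


def euler_step(f, t, y, h):
    """One explicit Euler step of size h."""
    return y + h * f(t, y)


def rk4_step(f, t, y, h):
    """One classical fourth-order Runge-Kutta step of size h."""
    k1 = f(t, y)
    k2 = f(t + h / 2, y + h / 2 * k1)
    k3 = f(t + h / 2, y + h / 2 * k2)
    k4 = f(t + h, y + h * k3)
    return y + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)


def integrate(f, y0, t0, t1, steps, step_fn):
    """Advance y from t0 to t1 in equal steps using the chosen method."""
    h = (t1 - t0) / steps
    t, y = t0, y0
    for _ in range(steps):
        y = step_fn(f, t, y, h)
        t += h
    return y


def exact_logistic(t, y0):
    """Closed-form logistic solution, used only to compare accuracy."""
    return CAPACITY / (1.0 + (CAPACITY / y0 - 1.0) * math.exp(-RATE * t))


euler_result = integrate(logistic_growth, 10.0, 0.0, 10.0, 100, euler_step)
rk4_result = integrate(logistic_growth, 10.0, 0.0, 10.0, 100, rk4_step)
exact_result = exact_logistic(10.0, 10.0)
```

With the same step size, the Runge-Kutta result lies much closer to the closed-form solution than the Euler result, at the cost of four evaluations of the state equation per step instead of one.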
Abstract:
Statistical machine translation (SMT) is an approach to Machine Translation (MT) that uses statistical models whose parameter estimation is based on the analysis of existing human translations (contained in bilingual corpora). From a translation student’s standpoint, this dissertation aims to explain how a phrase-based SMT system works, to determine the role of the statistical models it uses in the translation process and to assess the quality of the translations, provided that the system is trained with in-domain, good-quality corpora. To that end, a phrase-based SMT system based on Moses has been trained and subsequently used for the English to Spanish translation of two texts related in topic to the training data. Finally, the quality of the output texts produced by the system has been assessed through a quantitative evaluation carried out with three different automatic evaluation measures and a qualitative evaluation based on the Multidimensional Quality Metrics (MQM).
Abstract:
Pspline uses xtmixed to fit a penalized spline regression and plots the smoothed function. Additional covariates can be specified to adjust the smooth and plot partial residuals.
Abstract:
nlcheck is a simple diagnostic tool that can be used after fitting a model to quickly check the linearity assumption for a given predictor. nlcheck categorizes the predictor into bins, refits the model including dummy variables for the bins, and then performs a joint Wald test for the added parameters. Alternatively, nlcheck uses linear splines for the adaptive model. Support for discrete variables is also provided. Optionally, nlcheck also displays a graph of the adjusted linear predictions from the original model and the adaptive model.
Abstract:
rrreg fits a linear probability model for randomized response data.
Abstract:
Soupy and mousse-like fabrics are disturbance sedimentary features that result from the dissociation of gas hydrate, a process that releases water. During the core retrieval process, soupy and mousse-like fabrics are produced in the gas hydrate-bearing sediments due to changes in pressure and temperature conditions. Therefore, the identification of soupy and mousse-like fabrics can be used as a proxy for the presence of gas hydrate in addition to other evidence, such as pore water freshening or anomalously cool temperature. We present here grain-size results, mineralogical composition and magnetic susceptibility data of soupy and mousse-like samples from the southern Hydrate Ridge (Cascadia accretionary complex) acquired during Leg 204 of the Ocean Drilling Program. In order to study the relationship between sedimentary texture and the presence of gas hydrates, we have compared these results with the main textural and compositional data available from the same area. Most of the analyzed disturbed samples from the summit and the western flank of southern Hydrate Ridge show a mean grain size coarser than the average mean grain size of the hemipelagic samples from the same area. The depositional features of the sediments are not recognised due to disturbance. However, their granulometric statistical parameters and distribution curves, and magnetic susceptibility logs indicate that they correspond to a turbidite facies. These results suggest that gas hydrates in the southern Hydrate Ridge could form preferentially in coarser grain-size layers that could act as conduits feeding gas from below the BSR. Two samples from the uppermost metres near the seafloor at the summit of the southern Hydrate Ridge show a finer mean grain-size value than the average of hemipelagic samples. They were located where the highest amount of gas hydrates was detected, suggesting that in this area the availability of methane gas was high enough to generate gas hydrates, even within low-permeability layers. The mineralogical composition of the soupy and mousse-like sediments does not show any specific characteristic with respect to the other samples from the southern Hydrate Ridge.
Abstract:
Transportation Department, Office of University Research, Washington, D.C.
Abstract:
Mode of access: Internet.
Abstract:
Thesis (Ph.D.)--University of Washington, 2016-06
Abstract:
Mixture models implemented via the expectation-maximization (EM) algorithm are being increasingly used in a wide range of problems in pattern recognition such as image segmentation. However, the EM algorithm requires considerable computational time in its application to huge data sets such as a three-dimensional magnetic resonance (MR) image of over 10 million voxels. Recently, it was shown that a sparse, incremental version of the EM algorithm could improve its rate of convergence. In this paper, we show how this modified EM algorithm can be sped up further by adopting a multiresolution kd-tree structure in performing the E-step. The proposed algorithm outperforms some other variants of the EM algorithm for segmenting MR images of the human brain. (C) 2004 Pattern Recognition Society. Published by Elsevier Ltd. All rights reserved.
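For orientation, the baseline that this abstract sets out to accelerate can be sketched in Python as follows. This is a minimal sketch of the standard EM algorithm for a two-component, one-dimensional Gaussian mixture. It is not the sparse, incremental or kd-tree variant the paper proposes, and the function names, the initialisation and the synthetic data are assumptions for illustration only.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def em_two_gaussians(data, iterations=50):
    """Fit a two-component 1-D Gaussian mixture with plain EM."""
    # Crude initialisation: means at the extremes, shared overall variance.
    overall_mean = sum(data) / len(data)
    overall_var = sum((x - overall_mean) ** 2 for x in data) / len(data)
    means = [min(data), max(data)]
    variances = [overall_var, overall_var]
    weights = [0.5, 0.5]
    for _ in range(iterations):
        # E-step: posterior probability of each component for each point.
        resp = []
        for x in data:
            joint = [weights[k] * normal_pdf(x, means[k], variances[k]) for k in range(2)]
            total = sum(joint)
            resp.append([j / total for j in joint])
        # M-step: re-estimate weights, means and variances from the posteriors.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            weights[k] = nk / len(data)
            means[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            var_k = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, data)) / nk
            variances[k] = max(var_k, 1e-6)  # guard against a collapsed component
    return weights, means, variances


# Synthetic stand-in for voxel intensities drawn from two tissue classes.
random.seed(0)
sample = [random.gauss(0.0, 1.0) for _ in range(200)]
sample += [random.gauss(6.0, 1.0) for _ in range(200)]
weights, means, variances = em_two_gaussians(sample)
```

Every iteration of the E-step visits every data point, which is the cost that grows with image size and that a tree-based E-step aims to reduce.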