917 resultados para vector error correction model
Resumo:
The scope of this study was to estimate calibrated values for dietary data obtained by the Food Frequency Questionnaire for Adolescents (FFQA) and illustrate the effect of this approach on food consumption data. The adolescents were assessed on two occasions, with an average interval of twelve months. In 2004, 393 adolescents participated, and 289 were then reassessed in 2005. Dietary data obtained by the FFQA were calibrated using the regression coefficients estimated from the average of two 24-hour recalls (24HR) of the subsample. The calibrated values were similar to the the 24HR reference measurement in the subsample. In 2004 and 2005 a significant difference was observed between the average consumption levels of the FFQA before and after calibration for all nutrients. With the use of calibrated data the proportion of schoolchildren who had fiber intake below the recommended level increased. Therefore, it is seen that calibrated data can be used to obtain adjusted associations due to reclassification of subjects within the predetermined categories.
Resumo:
Abstract Background To understand the molecular mechanisms underlying important biological processes, a detailed description of the gene products networks involved is required. In order to define and understand such molecular networks, some statistical methods are proposed in the literature to estimate gene regulatory networks from time-series microarray data. However, several problems still need to be overcome. Firstly, information flow need to be inferred, in addition to the correlation between genes. Secondly, we usually try to identify large networks from a large number of genes (parameters) originating from a smaller number of microarray experiments (samples). Due to this situation, which is rather frequent in Bioinformatics, it is difficult to perform statistical tests using methods that model large gene-gene networks. In addition, most of the models are based on dimension reduction using clustering techniques, therefore, the resulting network is not a gene-gene network but a module-module network. Here, we present the Sparse Vector Autoregressive model as a solution to these problems. Results We have applied the Sparse Vector Autoregressive model to estimate gene regulatory networks based on gene expression profiles obtained from time-series microarray experiments. Through extensive simulations, by applying the SVAR method to artificial regulatory networks, we show that SVAR can infer true positive edges even under conditions in which the number of samples is smaller than the number of genes. Moreover, it is possible to control for false positives, a significant advantage when compared to other methods described in the literature, which are based on ranks or score functions. By applying SVAR to actual HeLa cell cycle gene expression data, we were able to identify well known transcription factor targets. Conclusion The proposed SVAR method is able to model gene regulatory networks in frequent situations in which the number of samples is lower than the number of genes, making it possible to naturally infer partial Granger causalities without any a priori information. In addition, we present a statistical test to control the false discovery rate, which was not previously possible using other gene regulatory network models.
Resumo:
We employ the approach of stochastic dynamics to describe the dissemination of vector-borne diseases such as dengue, and we focus our attention on the characterization of the threshold of the epidemic. The coexistence space comprises two representative spatial structures for both human and mosquito populations. The human population has its evolution described by a process that is similar to the Susceptible-Infected-Recovered (SIR) dynamics. The population of mosquitoes follows a dynamic of the type of the Susceptible Infected-Susceptible (SIS) model. The coexistence space is a bipartite lattice constituted by two structures representing the human and mosquito populations. We develop a truncation scheme to solve the evolution equations for the densities and the two-site correlations from which we get the threshold of the disease and the reproductive ratio. We present a precise deØnition of the reproductive ratio which reveals the importance of the correlations developed in the early stage of the disease. According to our deØnition, the reproductive rate is directed related to the conditional probability of the occurrence of a susceptible human (mosquito) given the presence in the neighborhood of an infected mosquito (human). The threshold of the epidemic as well as the phase transition between the epidemic and the non-epidemic states are also obtained by performing Monte Carlo simulations. References: [1] David R. de Souza, T^ania Tom∂e, , Suani R. T. Pinho, Florisneide R. Barreto and M∂ario J. de Oliveira, Phys. Rev. E 87, 012709 (2013). [2] D. R. de Souza, T. Tom∂e and R. M. ZiÆ, J. Stat. Mech. P03006 (2011).
Resumo:
The quality of temperature and humidity retrievals from the infrared SEVIRI sensors on the geostationary Meteosat Second Generation (MSG) satellites is assessed by means of a one dimensional variational algorithm. The study is performed with the aim of improving the spatial and temporal resolution of available observations to feed analysis systems designed for high resolution regional scale numerical weather prediction (NWP) models. The non-hydrostatic forecast model COSMO (COnsortium for Small scale MOdelling) in the ARPA-SIM operational configuration is used to provide background fields. Only clear sky observations over sea are processed. An optimised 1D–VAR set-up comprising of the two water vapour and the three window channels is selected. It maximises the reduction of errors in the model backgrounds while ensuring ease of operational implementation through accurate bias correction procedures and correct radiative transfer simulations. The 1D–VAR retrieval quality is firstly quantified in relative terms employing statistics to estimate the reduction in the background model errors. Additionally the absolute retrieval accuracy is assessed comparing the analysis with independent radiosonde and satellite observations. The inclusion of satellite data brings a substantial reduction in the warm and dry biases present in the forecast model. Moreover it is shown that the retrieval profiles generated by the 1D–VAR are well correlated with the radiosonde measurements. Subsequently the 1D–VAR technique is applied to two three–dimensional case–studies: a false alarm case–study occurred in Friuli–Venezia–Giulia on the 8th of July 2004 and a heavy precipitation case occurred in Emilia–Romagna region between 9th and 12th of April 2005. The impact of satellite data for these two events is evaluated in terms of increments in the integrated water vapour and saturation water vapour over the column, in the 2 meters temperature and specific humidity and in the surface temperature. To improve the 1D–VAR technique a method to calculate flow–dependent model error covariance matrices is also assessed. The approach employs members from an ensemble forecast system generated by perturbing physical parameterisation schemes inside the model. The improved set–up applied to the case of 8th of July 2004 shows a substantial neutral impact.
Resumo:
The objective of this work is to characterize the genome of the chromosome 1 of A.thaliana, a small flowering plants used as a model organism in studies of biology and genetics, on the basis of a recent mathematical model of the genetic code. I analyze and compare different portions of the genome: genes, exons, coding sequences (CDS), introns, long introns, intergenes, untranslated regions (UTR) and regulatory sequences. In order to accomplish the task, I transformed nucleotide sequences into binary sequences based on the definition of the three different dichotomic classes. The descriptive analysis of binary strings indicate the presence of regularities in each portion of the genome considered. In particular, there are remarkable differences between coding sequences (CDS and exons) and non-coding sequences, suggesting that the frame is important only for coding sequences and that dichotomic classes can be useful to recognize them. Then, I assessed the existence of short-range dependence between binary sequences computed on the basis of the different dichotomic classes. I used three different measures of dependence: the well-known chi-squared test and two indices derived from the concept of entropy i.e. Mutual Information (MI) and Sρ, a normalized version of the “Bhattacharya Hellinger Matusita distance”. The results show that there is a significant short-range dependence structure only for the coding sequences whose existence is a clue of an underlying error detection and correction mechanism. No doubt, further studies are needed in order to assess how the information carried by dichotomic classes could discriminate between coding and noncoding sequence and, therefore, contribute to unveil the role of the mathematical structure in error detection and correction mechanisms. Still, I have shown the potential of the approach presented for understanding the management of genetic information.
Resumo:
The construction of a reliable, practically useful prediction rule for future response is heavily dependent on the "adequacy" of the fitted regression model. In this article, we consider the absolute prediction error, the expected value of the absolute difference between the future and predicted responses, as the model evaluation criterion. This prediction error is easier to interpret than the average squared error and is equivalent to the mis-classification error for the binary outcome. We show that the distributions of the apparent error and its cross-validation counterparts are approximately normal even under a misspecified fitted model. When the prediction rule is "unsmooth", the variance of the above normal distribution can be estimated well via a perturbation-resampling method. We also show how to approximate the distribution of the difference of the estimated prediction errors from two competing models. With two real examples, we demonstrate that the resulting interval estimates for prediction errors provide much more information about model adequacy than the point estimates alone.
Resumo:
OBJECTIVES Spinal muscular atrophy (SMA) is caused by reduced levels of survival motor neuron (SMN) protein, which results in motoneuron loss. Therapeutic strategies to increase SMN levels including drug compounds, antisense oligonucleotides, and scAAV9 gene therapy have proved effective in mice. We wished to determine whether reduction of SMN in postnatal motoneurons resulted in SMA in a large animal model, whether SMA could be corrected after development of muscle weakness, and the response of clinically relevant biomarkers. METHODS Using intrathecal delivery of scAAV9 expressing an shRNA targeting pig SMN1, SMN was knocked down in motoneurons postnatally to SMA levels. This resulted in an SMA phenotype representing the first large animal model of SMA. Restoration of SMN was performed at different time points with scAAV9 expressing human SMN (scAAV9-SMN), and electrophysiology measurements and pathology were performed. RESULTS Knockdown of SMN in postnatal motoneurons results in overt proximal weakness, fibrillations on electromyography indicating active denervation, and reduced compound muscle action potential (CMAP) and motor unit number estimation (MUNE), as in human SMA. Neuropathology showed loss of motoneurons and motor axons. Presymptomatic delivery of scAAV9-SMN prevented SMA symptoms, indicating that all changes are SMN dependent. Delivery of scAAV9-SMN after symptom onset had a marked impact on phenotype, electrophysiological measures, and pathology. INTERPRETATION High SMN levels are critical in postnatal motoneurons, and reduction of SMN results in an SMA phenotype that is SMN dependent. Importantly, clinically relevant biomarkers including CMAP and MUNE are responsive to SMN restoration, and abrogation of phenotype can be achieved even after symptom onset.
Resumo:
In regression analysis, covariate measurement error occurs in many applications. The error-prone covariates are often referred to as latent variables. In this proposed study, we extended the study of Chan et al. (2008) on recovering latent slope in a simple regression model to that in a multiple regression model. We presented an approach that applied the Monte Carlo method in the Bayesian framework to the parametric regression model with the measurement error in an explanatory variable. The proposed estimator applied the conditional expectation of latent slope given the observed outcome and surrogate variables in the multiple regression models. A simulation study was presented showing that the method produces estimator that is efficient in the multiple regression model, especially when the measurement error variance of surrogate variable is large.^
Resumo:
Geostrophic surface velocities can be derived from the gradients of the mean dynamic topography-the difference between the mean sea surface and the geoid. Therefore, independently observed mean dynamic topography data are valuable input parameters and constraints for ocean circulation models. For a successful fit to observational dynamic topography data, not only the mean dynamic topography on the particular ocean model grid is required, but also information about its inverse covariance matrix. The calculation of the mean dynamic topography from satellite-based gravity field models and altimetric sea surface height measurements, however, is not straightforward. For this purpose, we previously developed an integrated approach to combining these two different observation groups in a consistent way without using the common filter approaches (Becker et al. in J Geodyn 59(60):99-110, 2012, doi:10.1016/j.jog.2011.07.0069; Becker in Konsistente Kombination von Schwerefeld, Altimetrie und hydrographischen Daten zur Modellierung der dynamischen Ozeantopographie, 2012, http://nbn-resolving.de/nbn:de:hbz:5n-29199). Within this combination method, the full spectral range of the observations is considered. Further, it allows the direct determination of the normal equations (i.e., the inverse of the error covariance matrix) of the mean dynamic topography on arbitrary grids, which is one of the requirements for ocean data assimilation. In this paper, we report progress through selection and improved processing of altimetric data sets. We focus on the preprocessing steps of along-track altimetry data from Jason-1 and Envisat to obtain a mean sea surface profile. During this procedure, a rigorous variance propagation is accomplished, so that, for the first time, the full covariance matrix of the mean sea surface is available. The combination of the mean profile and a combined GRACE/GOCE gravity field model yields a mean dynamic topography model for the North Atlantic Ocean that is characterized by a defined set of assumptions. We show that including the geodetically derived mean dynamic topography with the full error structure in a 3D stationary inverse ocean model improves modeled oceanographic features over previous estimates.