991 results for covariance analysis
Abstract:
This paper discusses an important issue related to the implementation and interpretation of the analysis scheme in the ensemble Kalman filter. It is shown that the observations must be treated as random variables at the analysis steps. That is, one should add random perturbations with the correct statistics to the observations and generate an ensemble of observations that then is used in updating the ensemble of model states. Traditionally, this has not been done in previous applications of the ensemble Kalman filter and, as will be shown, this has resulted in an updated ensemble with a variance that is too low. This simple modification of the analysis scheme results in a completely consistent approach if the covariance of the ensemble of model states is interpreted as the prediction error covariance, and there are no further requirements on the ensemble Kalman filter method, except for the use of an ensemble of sufficient size. Thus, there is a unique correspondence between the error statistics from the ensemble Kalman filter and the standard Kalman filter approach.
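The analysis step described in this abstract can be sketched as follows. This is a minimal illustration, not the paper's code: it assumes a linear observation operator H and Gaussian observation errors with covariance R, and the function name `enkf_analysis`, the flag `perturb_obs`, and the scalar test problem are hypothetical choices made for the example.

```python
import numpy as np

def enkf_analysis(ensemble, y, H, R, rng, perturb_obs=True):
    """One ensemble Kalman filter analysis step.

    ensemble: (n_state, n_members) forecast ensemble
    y: (n_obs,) observation vector
    H: (n_obs, n_state) linear observation operator (assumed linear here)
    R: (n_obs, n_obs) observation error covariance
    """
    n_members = ensemble.shape[1]
    anomalies = ensemble - ensemble.mean(axis=1, keepdims=True)
    P = anomalies @ anomalies.T / (n_members - 1)      # sample forecast covariance
    K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)       # Kalman gain
    obs = np.repeat(y[:, None], n_members, axis=1)     # same observation for every member
    if perturb_obs:
        # Treat the observations as random variables: one draw from N(y, R) per member.
        noise = rng.multivariate_normal(np.zeros(len(y)), R, size=n_members).T
        obs = obs + noise
    return ensemble + K @ (obs - H @ ensemble)

# Hypothetical scalar test problem: prior variance 1, observation variance 1.
rng = np.random.default_rng(0)
prior = rng.normal(0.0, 1.0, size=(1, 20000))
y = np.array([0.5])
H = np.eye(1)
R = np.eye(1)
perturbed = enkf_analysis(prior, y, H, R, rng, perturb_obs=True)
unperturbed = enkf_analysis(prior, y, H, R, rng, perturb_obs=False)
print(perturbed.var(ddof=1), unperturbed.var(ddof=1))
```

For this scalar case the standard Kalman filter gives a posterior variance of 0.5. Without perturbed observations the updated ensemble variance is (1 - K)^2 times the prior variance, about 0.25, which is the "too low" variance the abstract refers to.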
Abstract:
With the development of convection-permitting numerical weather prediction, the efficient use of high-resolution observations in data assimilation is becoming increasingly important. The operational assimilation of these observations, such as Doppler radar radial winds, is now common, though to avoid violating the assumption of uncorrelated observation errors the observation density is severely reduced. Improving the quantity of observations used and the impact that they have on the forecast will require the introduction of the full, potentially correlated, error statistics. In this work, observation error statistics are calculated for the Doppler radar radial winds that are assimilated into the Met Office high-resolution UK model using a diagnostic that makes use of statistical averages of observation-minus-background and observation-minus-analysis residuals. This is the first in-depth study using the diagnostic to estimate both horizontal and along-beam correlated observation errors. The new results show that the Doppler radar radial wind error standard deviations are similar to those used operationally and increase as the observation height increases. Surprisingly, the estimated observation error correlation length scales are longer than the operational thinning distance. They depend on both the height of the observation and the distance of the observation from the radar. Further tests show that the long correlations cannot be attributed to the use of superobservations or to the background error covariance matrix used in the assimilation. The large horizontal correlation length scales are, however, in part a result of using a simplified observation operator.
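A residual-based diagnostic of the kind mentioned in this abstract can be sketched as below. The abstract does not give the formula; the sketch assumes the commonly used form R ≈ E[d_oa d_ob^T], where d_ob is the observation-minus-background residual and d_oa the observation-minus-analysis residual. The function name and the synthetic matrices B and R_true are hypothetical, and the observation operator is taken to be the identity.

```python
import numpy as np

def estimate_obs_error_cov(d_ob, d_oa):
    """Average the outer products d_oa d_ob^T over samples (rows are samples)."""
    return d_oa.T @ d_ob / d_ob.shape[0]

# Synthetic check with identity observation operator and known statistics.
rng = np.random.default_rng(1)
n_samples = 200_000
B = np.array([[1.0, 0.3, 0.0],
              [0.3, 1.0, 0.3],
              [0.0, 0.3, 1.0]])           # assumed background error covariance
R_true = np.array([[0.5, 0.2, 0.05],
                   [0.2, 0.5, 0.2],
                   [0.05, 0.2, 0.5]])     # assumed correlated observation errors

eps_b = rng.multivariate_normal(np.zeros(3), B, size=n_samples)       # background errors
eps_o = rng.multivariate_normal(np.zeros(3), R_true, size=n_samples)  # observation errors

K = B @ np.linalg.inv(B + R_true)   # gain built from the true statistics
d_ob = eps_o - eps_b                # observation minus background
d_oa = d_ob - d_ob @ K.T            # observation minus analysis, (I - K) d_ob

R_est = estimate_obs_error_cov(d_ob, d_oa)
print(np.round(R_est, 2))
```

The estimate recovers R_true here only because the gain uses the exact B and R. With misspecified statistics in the assimilation, the diagnostic returns a biased estimate, which is why the abstract tests sensitivity to the background error covariance matrix.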
Abstract:
A new sparse kernel density estimator is introduced, based on the minimum integrated square error criterion combined with local component analysis for the finite mixture model. We start with a Parzen window estimator that uses Gaussian kernels with a common covariance matrix; local component analysis is first applied to find this covariance matrix using the expectation-maximization algorithm. Since the constraint on the mixing coefficients of a finite mixture model lies on the multinomial manifold, we then use the well-known Riemannian trust-region algorithm to find the set of sparse mixing coefficients. The first- and second-order Riemannian geometry of the multinomial manifold is utilized in the Riemannian trust-region algorithm. Numerical examples demonstrate that the proposed approach is effective in constructing sparse kernel density estimators with accuracy competitive with existing kernel density estimators.
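The estimator family discussed in this abstract can be written as p(x) = sum_i w_i N(x; c_i, Sigma), with weights on the probability simplex. The sketch below only evaluates such a density. It does not implement local component analysis or the Riemannian trust-region step, and the sparse weights shown are hand-picked for illustration rather than optimized. All names and values are hypothetical.

```python
import numpy as np

def mixture_kde(x, centers, weights, cov):
    """Evaluate sum_i weights[i] * N(x; centers[i], cov) at each row of x.

    x: (m, d) evaluation points; centers: (n, d) kernel centres;
    weights: (n,) non-negative, summing to one; cov: (d, d) shared covariance.
    """
    d = centers.shape[1]
    diff = x[:, None, :] - centers[None, :, :]            # (m, n, d)
    inv_cov = np.linalg.inv(cov)
    maha = np.einsum("mnd,de,mne->mn", diff, inv_cov, diff)  # squared Mahalanobis distances
    norm = np.sqrt((2.0 * np.pi) ** d * np.linalg.det(cov))
    return (np.exp(-0.5 * maha) / norm) @ weights

centers = np.array([[-2.0], [-1.0], [0.0], [1.0], [2.0]])
cov = np.array([[0.25]])
parzen_weights = np.full(5, 0.2)                        # Parzen window: equal weights
sparse_weights = np.array([0.5, 0.0, 0.0, 0.0, 0.5])    # illustrative sparse weights

grid = np.linspace(-8.0, 8.0, 3201)[:, None]
dx = grid[1, 0] - grid[0, 0]
p_parzen = mixture_kde(grid, centers, parzen_weights, cov)
p_sparse = mixture_kde(grid, centers, sparse_weights, cov)
print(p_parzen.sum() * dx, p_sparse.sum() * dx)
```

Any weight vector on the simplex yields a valid density, so sparsity only removes kernels whose weight is exactly zero. How many kernels can be removed without losing accuracy is what the optimization in the paper addresses.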
Abstract:
The study of the genetic variance/covariance matrix (G-matrix) is a recent and fruitful approach in evolutionary biology, providing a window for investigating the evolution of complex characters. Although G-matrix studies were originally conducted for microevolutionary timescales, they could be extrapolated to macroevolution as long as the G-matrix remains relatively constant, or proportional, along the period of interest. A promising approach to investigating the constancy of G-matrices is to compare their phenotypic counterparts (P-matrices) in a large group of related species; if significant similarity is found among several taxa, it is very likely that the underlying G-matrices are also equivalent. Here we study the similarity of covariance and correlation structure in a broad sample of Old World monkeys and apes (Catarrhini). We made phylogenetically structured comparisons of correlation and covariance matrices derived from 39 skull traits, ranging from between species to the superfamily level. We also compared the overall magnitude of integration between skull traits (r²) for all Catarrhini genera. Our results show that P-matrices were not strictly constant among catarrhines, but the amount of divergence observed among taxa was generally low. There was a significant and positive correlation between the amount of divergence in correlation and covariance patterns among the 30 genera and their phylogenetic distances derived from a recently proposed phylogenetic hypothesis. Our data demonstrate that the P-matrices remained relatively similar along the evolutionary history of catarrhines, and comparisons with the G-matrix available for a New World monkey genus (Saguinus) suggest that the same holds for all anthropoids. The magnitude of integration, in contrast, varied considerably among genera, indicating that evolution of the magnitude, rather than the pattern of inter-trait correlations, might have played an important role in the diversification of the catarrhine skull. © 2009 Elsevier Ltd. All rights reserved.
Abstract:
Modeling the spatial dependence structure with a geostatistical approach is indispensable for determining the parameters that define this structure, which are then used to interpolate values at unsampled locations by kriging. However, parameter estimation can be strongly affected by atypical observations in the sampled data. This study therefore aimed to use local influence diagnostic techniques in spatial linear Gaussian models, applied in geostatistics, to evaluate the sensitivity of the maximum likelihood and restricted maximum likelihood estimators to small perturbations in the data. Studies with simulated and experimental data were performed. The results obtained from the real data allowed us to conclude that atypical values among the sampled data can strongly influence thematic maps by changing the estimated spatial dependence. Local influence diagnostics should be part of any geostatistical analysis, ensuring that the information from thematic maps is of better quality and can be used with greater confidence by farmers.
Abstract:
We examine bivariate extensions of Aït-Sahalia’s approach to the estimation of univariate diffusions. Our message is that extending his idea to a bivariate setting is not straightforward. In higher dimensions, as opposed to the univariate case, the elements of the Itô and Fokker-Planck representations do not coincide, and even imposing sensible assumptions on the marginal drifts and volatilities is not sufficient to obtain direct generalisations. We develop exploratory estimation and testing procedures by parametrizing the drifts of both component processes and setting restrictions on the terms of either the Itô or the Fokker-Planck covariance matrices. This may lead to highly nonlinear ordinary differential equations, where the definition of boundary conditions is crucial. For the methods developed, the Fokker-Planck representation seems more tractable than the Itô representation. Questions for further research include the design of regularity conditions on the time series dependence in the data, the kernels actually used and the bandwidths, to obtain asymptotic properties for the estimators proposed. A particular case seems promising: “causal bivariate models” in which only one of the diffusions contributes to the volatility of the other. Hedging strategies which estimate separately the univariate diffusions at stake may thus be improved.
Abstract:
This paper uses an output-oriented Data Envelopment Analysis (DEA) measure of technical efficiency to assess the technical efficiencies of the Brazilian banking system. Four approaches to estimation are compared in order to assess the significance of factors affecting inefficiency. These are nonparametric Analysis of Covariance, maximum likelihood using a family of exponential distributions, maximum likelihood using a family of truncated normal distributions, and the normal Tobit model. The sole focus of the paper is on a combined measure of output, and the data analyzed refer to the year 2001. The factors of interest in the analysis and likely to affect efficiency are bank nature (multiple and commercial), bank type (credit, business, bursary and retail), bank size (large, medium, small and micro), bank control (private and public), bank origin (domestic and foreign), and non-performing loans. The latter is a measure of bank risk. All quantitative variables, including non-performing loans, are measured on a per-employee basis. The best fits to the data are provided by the exponential family and the nonparametric Analysis of Covariance. The significance of a factor, however, varies according to the model fitted, although there is some agreement between the best models. A highly significant association in all models fitted is observed only for non-performing loans. The nonparametric Analysis of Covariance is more consistent with the inefficiency median responses observed for the qualitative factors. The findings of the analysis reinforce the significant association of the level of bank inefficiency, measured by DEA residuals, with the risk of bank failure.
Abstract:
Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq)
Abstract:
Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq)
Abstract:
The objective of the present study was to investigate the effect of data structure on estimated genetic parameters and predicted breeding values of direct and maternal genetic effects for weaning weight (WW) and weight gain from birth to weaning (BWG), with or without the genetic covariance between direct and maternal effects. Records of 97,490 Nellore animals born between 1993 and 2006, from the Jacarezinho cattle raising farm, were used. Two different data sets were analyzed: DI_all, which included all available progenies of dams without their own performance; DII_all, which included DI_all + 20% of recorded progenies with maternal phenotypes. Two subsets were obtained from each data set (DI_all and DII_all): DI_1 and DII_1, which included only dams with three or fewer progenies; DI_5 and DII_5, which included only dams with five or more progenies. (Co)variance components and heritabilities were estimated by Bayesian inference through Gibbs sampling using univariate animal models. In general, for the population and traits studied, the proportion of dams with known phenotypic information and the number of progenies per dam influenced direct and maternal heritabilities, as well as the contribution of maternal permanent environmental variance to phenotypic variance. Only small differences were observed in the genetic and environmental parameters when the genetic covariance between direct and maternal effects was set to zero in the data sets studied. Thus, including or excluding the genetic covariance between direct and maternal effects had little effect on the ranking of animals according to their breeding values for WW and BWG. Accurate estimation of genetic correlations between direct and maternal genetic effects depends on the data structure. Thus, this covariance should be set to zero in Nellore data sets in which the proportion of dams with phenotypic information is low, the number of progenies per dam is small, and pedigree relationships are poorly known. © 2012 Elsevier B.V. All rights reserved.
Abstract:
Linear mixed effects models are frequently used to analyse longitudinal data, due to their flexibility in modelling the covariance structure between and within observations. Further, it is easy to deal with unbalanced data, either with respect to the number of observations per subject or per time period, and with varying time intervals between observations. In most applications of mixed models to biological sciences, a normal distribution is assumed both for the random effects and for the residuals. This, however, makes inferences vulnerable to the presence of outliers. Here, linear mixed models employing thick-tailed distributions for robust inferences in longitudinal data analysis are described. Specific distributions discussed include the Student-t, the slash and the contaminated normal. A Bayesian framework is adopted, and the Gibbs sampler and the Metropolis-Hastings algorithms are used to carry out the posterior analyses. An example with data on orthodontic distance growth in children is discussed to illustrate the methodology. Analyses based on either the Student-t distribution or on the usual Gaussian assumption are contrasted. The thick-tailed distributions provide an appealing robust alternative to the Gaussian process for modelling distributions of the random effects and of residuals in linear mixed models, and the MCMC implementation allows the computations to be performed in a flexible manner.
Abstract:
Weight records of Brazilian Nelore cattle, from birth to 630 d of age, recorded every 3 mo, were analyzed using random regression models. Independent variables were Legendre polynomials of age at recording. The model of analysis included contemporary groups as fixed effects and age of dam as a linear and quadratic covariable. Mean trends were modeled through a cubic regression on orthogonal polynomials of age. Up to four sets of random regression coefficients were fitted for animals' direct and maternal, additive genetic, and permanent environmental effects. Changes in measurement error variances with age were modeled through a variance function. Orders of polynomial fit from three to six were considered, resulting in up to 77 parameters to be estimated. Models fitting random regressions modeled the pattern of variances in the data adequately, with estimates similar to those from corresponding univariate analysis. Direct heritability estimates decreased after birth and tended to be lowest at ages at which maternal effect estimates tended to be highest. Maternal heritability estimates increased after birth to a peak around 110 to 120 d of age and decreased thereafter. Additive genetic direct correlation estimates between weights at standard ages (birth, weaning, yearling, and final weight) were moderate to high and maternal genetic and environmental correlations were consistently high. © 2001 American Society of Animal Science. All rights reserved.
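In a random regression model on Legendre polynomials, the covariance between records at two ages follows from the covariance matrix of the regression coefficients: G = Phi K Phi^T, where Phi holds the polynomials evaluated at the standardized ages. The sketch below illustrates this for an order-three fit. It assumes normalized Legendre polynomials, and the coefficient covariance `coef_cov` and the ages in days are made-up values, not estimates from the paper.

```python
import numpy as np
from numpy.polynomial import legendre

def legendre_basis(ages, age_min, age_max, order):
    """Normalized Legendre polynomials of standardized age, one column per term."""
    # Map ages onto [-1, 1], the interval on which the polynomials are orthogonal.
    x = 2.0 * (np.asarray(ages, dtype=float) - age_min) / (age_max - age_min) - 1.0
    raw = legendre.legvander(x, order - 1)                   # columns P_0 .. P_{order-1}
    scale = np.sqrt((2.0 * np.arange(order) + 1.0) / 2.0)    # normalization factors
    return raw * scale

def covariance_function(ages, coef_cov, age_min=0.0, age_max=630.0):
    """Covariances between the given ages implied by the coefficient covariance."""
    phi = legendre_basis(ages, age_min, age_max, coef_cov.shape[0])
    return phi @ coef_cov @ phi.T

# Hypothetical covariance matrix of three random regression coefficients.
coef_cov = np.array([[30.0, 5.0, -1.0],
                     [5.0, 4.0, 0.5],
                     [-1.0, 0.5, 1.0]])
ages = np.array([0.0, 205.0, 365.0, 550.0])   # illustrative ages in days

G = covariance_function(ages, coef_cov)
sd = np.sqrt(np.diag(G))
correlations = G / np.outer(sd, sd)
print(np.round(correlations, 2))
```

Because every age-to-age covariance is generated from the same small coefficient matrix, a fit of order k needs only k(k + 1)/2 covariance parameters per random effect, regardless of how many ages are recorded.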
Abstract:
A total of 20,065 weights recorded on 3016 Nelore animals were used to estimate covariance functions for growth from birth to 630 days of age, assuming a parametric correlation structure to model within-animal correlations. The model of analysis included fixed effects of contemporary groups and age of dam as a quadratic covariable. Mean trends were taken into account by a cubic regression on orthogonal polynomials of animal age. Genetic effects of the animal and its dam and maternal permanent environmental effects were modelled by random regressions on Legendre polynomials of age at recording. Changes in direct permanent environmental effect variances were modelled by a polynomial variance function, together with a parametric correlation function to account for correlations between ages. Stationary and nonstationary models were used to model within-animal correlations between different ages. Residual variances were considered homogeneous or heterogeneous, with changes modelled by a step or polynomial function of age at recording. Based on the Bayesian information criterion, a model with a cubic variance function combined with a nonstationary correlation function for permanent environmental effects, with 49 parameters to be estimated, fitted best. Modelling within-animal correlations through a parametric correlation structure can describe the variation pattern adequately. Moreover, the number of parameters to be estimated can be decreased substantially compared to a model fitting random regressions on Legendre polynomials of age. © 2004 Elsevier B.V. All rights reserved.
Abstract:
Pós-graduação em Genética e Melhoramento Animal - FCAV
Abstract:
Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)