885 results for Asymptotic covariance matrix
Abstract:
We systematically compare the performance of ETKF-4DVAR, 4DVAR-BEN and 4DENVAR with respect to two traditional methods (4DVAR and ETKF) and an ensemble transform Kalman smoother (ETKS) on the Lorenz 1963 model. We specifically investigated this performance with increasing nonlinearity and using a quasi-static variational assimilation algorithm as a comparison. Using the analysis root mean square error (RMSE) as a metric, these methods have been compared considering (1) assimilation window length and observation interval size and (2) ensemble size, to investigate the influence of hybrid background error covariance matrices and nonlinearity on the performance of the methods. For short assimilation windows with close to linear dynamics, all hybrid methods show an improvement in RMSE compared to the traditional methods. For long assimilation window lengths in which nonlinear dynamics are substantial, the variational framework can have difficulties finding the global minimum of the cost function, so we explore a quasi-static variational assimilation (QSVA) framework. Among the hybrid methods, under certain parameter settings those that do not use a climatological background error covariance do not need QSVA to perform accurately. Generally, results show that the ETKS and the hybrid methods that do not use a climatological background error covariance matrix, combined with QSVA, outperform all other methods, owing to the full flow dependency of the background error covariance matrix, which also accommodates the most nonlinearity.
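As a concrete anchor for the testbed and metric above, here is a minimal Python sketch of the Lorenz 1963 model and the analysis RMSE; the standard chaotic parameter values and the RK4 integrator are assumptions, not details taken from the paper.

```python
import numpy as np

def lorenz63(state, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Tendency of the Lorenz 1963 system (standard chaotic parameters)."""
    x, y, z = state
    return np.array([sigma * (y - x), x * (rho - z) - y, x * y - beta * z])

def rk4_step(state, dt=0.01):
    """Advance the model one step with fourth-order Runge-Kutta."""
    k1 = lorenz63(state)
    k2 = lorenz63(state + 0.5 * dt * k1)
    k3 = lorenz63(state + 0.5 * dt * k2)
    k4 = lorenz63(state + dt * k3)
    return state + dt / 6.0 * (k1 + 2.0 * k2 + 2.0 * k3 + k4)

def analysis_rmse(analyses, truths):
    """RMSE between analyses and true states, both of shape (T, 3)."""
    return np.sqrt(np.mean((analyses - truths) ** 2))
```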
Abstract:
The disadvantage of the majority of data assimilation schemes is the assumption that the conditional probability density function of the state of the system given the observations [the posterior probability density function (PDF)] is distributed either locally or globally as a Gaussian. The advantage, however, is that through various mechanisms they ensure initial conditions that are predominantly in linear balance, so spurious gravity wave generation is suppressed. The equivalent-weights particle filter is a data assimilation scheme that allows for a representation of a potentially multimodal posterior PDF. It does this via proposal densities that lead to extra terms being added to the model equations, which means that the advantage of the traditional data assimilation schemes, generating predominantly balanced initial conditions, is no longer guaranteed. This paper looks in detail at the impact the equivalent-weights particle filter has on dynamical balance and gravity wave generation in a primitive equation model. The primary conclusions are that (i) provided the model error covariance matrix imposes geostrophic balance, each additional term required by the equivalent-weights particle filter is also geostrophically balanced; (ii) the relaxation term required to keep the particles in the locality of the observations has little effect on gravity waves and actually induces a reduction in gravity wave energy if sufficiently large; and (iii) the equivalent-weights term, which leads to the particles having equivalent significance in the posterior PDF, produces a change in gravity wave energy comparable to the stochastic model error. Thus, the scheme does not produce significant spurious gravity wave energy and so has potential for application in real high-dimensional geophysical systems.
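The two extra terms named in the abstract, a relaxation term pulling each particle toward the observations and an importance-weight update, can be sketched generically; this is a deliberately simplified stand-in (linear observation operator H, scalar relaxation strength tau, additive model error), not the paper's primitive equation scheme, and it omits the proposal-density weight corrections of the full equivalent-weights filter.

```python
import numpy as np

rng = np.random.default_rng(0)

def relaxed_step(particles, obs, H, model_step, tau, q_std):
    """Propagate particles with a relaxation (nudging) term toward the
    observations plus stochastic model error -- the extra terms a proposal
    density adds to the model equations."""
    propagated = np.array([model_step(p) for p in particles])
    innovation = obs - propagated @ H.T  # observation-space mismatch per particle
    noise = q_std * rng.standard_normal(propagated.shape)
    return propagated + tau * innovation @ H + noise

def reweight(particles, obs, H, r_std, weights):
    """Scale weights by the Gaussian observation likelihood and renormalize."""
    resid = obs - particles @ H.T
    log_w = -0.5 * np.sum(resid ** 2, axis=1) / r_std ** 2
    w = weights * np.exp(log_w - log_w.max())
    return w / w.sum()
```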
Abstract:
To improve the quantity and impact of observations used in data assimilation, it is necessary to take into account the full, potentially correlated, observation error statistics. A number of methods for estimating correlated observation errors exist, but a popular one is a diagnostic that makes use of statistical averages of observation-minus-background and observation-minus-analysis residuals. The accuracy of the results it yields is unknown, as the diagnostic is sensitive to the difference between the exact background and observation error covariances and those chosen for use within the assimilation. It has often been stated in the literature that the results of this diagnostic are only valid when the background and observation error correlation length scales are well separated. Here we develop new theory relating to the diagnostic. For observations on a 1D periodic domain we are able to show the effect of changes in the assumed error statistics used in the assimilation on the estimated observation error covariance matrix. We also provide bounds for the estimated observation error variance and the eigenvalues of the estimated observation error correlation matrix. We demonstrate that it is still possible to obtain useful results from the diagnostic when the background and observation error length scales are similar. In general, our results suggest that when correlated observation errors are treated as uncorrelated in the assimilation, the diagnostic will underestimate the correlation length scale. We support our theoretical results with simple illustrative examples. These results have potential use for interpreting the derived covariances estimated using an operational system.
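The diagnostic described here is commonly attributed to Desroziers et al. and amounts to averaging products of the two residual types; a minimal sketch, assuming the O-B and O-A residuals have already been collected over many assimilation cycles:

```python
import numpy as np

def estimate_R(d_ob, d_oa):
    """Diagnostic estimate of the observation error covariance matrix.

    d_ob : (T, p) observation-minus-background residuals over T cycles
    d_oa : (T, p) observation-minus-analysis residuals
    Returns the p x p sample estimate R_hat = E[d_oa d_ob^T], symmetrized.
    """
    R_hat = d_oa.T @ d_ob / d_ob.shape[0]
    return 0.5 * (R_hat + R_hat.T)  # the raw sample estimate need not be symmetric

def split_covariance(R_hat):
    """Standard deviations and the correlation matrix implied by R_hat."""
    sd = np.sqrt(np.diag(R_hat))
    return sd, R_hat / np.outer(sd, sd)
```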
Abstract:
With the development of convection-permitting numerical weather prediction, the efficient use of high resolution observations in data assimilation is becoming increasingly important. The operational assimilation of these observations, such as Doppler radar radial winds, is now common, though to avoid violating the assumption of uncorrelated observation errors the observation density is severely reduced. Improving the quantity of observations used and the impact they have on the forecast will require the introduction of the full, potentially correlated, error statistics. In this work, observation error statistics are calculated for the Doppler radar radial winds assimilated into the Met Office high resolution UK model, using a diagnostic that makes use of statistical averages of observation-minus-background and observation-minus-analysis residuals. This is the first in-depth study using the diagnostic to estimate both horizontal and along-beam correlated observation errors. The results show that the Doppler radar radial wind error standard deviations are similar to those used operationally and increase with observation height. Surprisingly, the estimated observation error correlation length scales are longer than the operational thinning distance. They depend both on the height of the observation and on the distance of the observation from the radar. Further tests show that the long correlations cannot be attributed to the use of superobservations or to the background error covariance matrix used in the assimilation. The large horizontal correlation length scales are, however, in part a result of using a simplified observation operator.
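The comparison between estimated length scales and the thinning distance presupposes a definition of "length scale"; one common convention, assumed here rather than taken from the paper, is the separation at which the correlation estimated in distance bins first falls below a fixed threshold.

```python
import numpy as np

def correlation_length(separations, correlations, threshold=0.2):
    """Length scale as the first separation at which the binned correlation
    drops below `threshold` (a common, not universal, choice)."""
    order = np.argsort(separations)
    seps = np.asarray(separations)[order]
    corrs = np.asarray(correlations)[order]
    below = np.nonzero(corrs < threshold)[0]
    return seps[below[0]] if below.size else np.inf
```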
Abstract:
A new sparse kernel density estimator is introduced, based on the minimum integrated square error criterion combined with local component analysis for the finite mixture model. We start with a Parzen window estimator that uses Gaussian kernels with a common covariance matrix; local component analysis is first applied to find this covariance matrix using the expectation-maximization algorithm. Since the mixing coefficients of a finite mixture model are constrained to the multinomial manifold, we then use the well-known Riemannian trust-region algorithm to find the set of sparse mixing coefficients. The first- and second-order Riemannian geometry of the multinomial manifold is utilized in the Riemannian trust-region algorithm. Numerical examples demonstrate that the proposed approach is effective in constructing sparse kernel density estimators with accuracy competitive with existing kernel density estimators.
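The starting point, a Parzen window estimator with one shared Gaussian kernel covariance, is easy to make concrete; the sparse weight vector found by the trust-region step can then reuse the same evaluation. A minimal sketch (the uniform-weight default is the plain Parzen estimator):

```python
import numpy as np
from scipy.stats import multivariate_normal

def parzen_density(x, samples, cov, weights=None):
    """Mixture density with a shared Gaussian kernel covariance.
    Uniform weights give the Parzen window estimator; a sparse weight
    vector gives a reduced (sparse) kernel density estimator."""
    if weights is None:
        weights = np.full(len(samples), 1.0 / len(samples))
    return sum(w * multivariate_normal.pdf(x, mean=s, cov=cov)
               for w, s in zip(weights, samples))
```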
Abstract:
Morphological integration refers to the modular structuring of inter-trait relationships in an organism, which can bias the direction and rate of morphological change, either constraining or facilitating evolution along certain dimensions of the morphospace. Therefore, describing patterns and magnitudes of morphological integration and analyzing their evolutionary consequences is central to understanding the evolution of complex traits. Here we analyze morphological integration in the skull of several mammalian orders, addressing the following questions: are there common patterns of inter-trait relationships? Are these patterns compatible with hypotheses based on shared development and function? Do morphological integration patterns and magnitudes vary in the same way across groups? We digitized more than 3,500 specimens spanning 15 mammalian orders, estimated the corresponding pooled within-group correlation and variance/covariance matrices for 35 skull traits, and compared those matrices among the orders. We also compared observed patterns of integration to theoretical expectations based on common development and function. Our results point to a largely shared pattern of inter-trait correlations, implying that mammalian skull diversity has been produced upon a common covariance structure that has remained similar for at least 65 million years. Comparisons with a rodent genetic variance/covariance matrix suggest that this broad similarity extends to the genetic factors underlying phenotypic variation. In contrast to the relative constancy of inter-trait correlation/covariance patterns, magnitudes varied markedly across groups. Several morphological modules hypothesized from shared development and function were detected in the mammalian taxa studied. Our data provide evidence that mammalian skull evolution can be viewed as a history of inter-module parcellation, with the modules themselves being more clearly marked in those lineages with lower overall magnitude of integration. The implication of these findings is that the main evolutionary trend in the mammalian skull was one of decreasing constraints on evolution by promoting a more modular architecture.
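Two of the quantities at the core of this comparison, the pooled within-group covariance matrix and a similarity measure between matrices, can be sketched directly; the element-wise matrix correlation below is one simple choice, not necessarily the comparison statistic used in the study.

```python
import numpy as np

def pooled_within_group_cov(groups):
    """Pooled within-group covariance: center each group on its own mean,
    stack the residuals, and divide by the pooled degrees of freedom."""
    centered = [g - g.mean(axis=0) for g in groups]  # each g is (n_i, traits)
    stacked = np.vstack(centered)
    dof = sum(len(g) - 1 for g in groups)
    return stacked.T @ stacked / dof

def matrix_correlation(A, B):
    """Pearson correlation between off-diagonal elements of two matrices."""
    mask = ~np.eye(A.shape[0], dtype=bool)
    return np.corrcoef(A[mask], B[mask])[0, 1]
```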
Abstract:
In this paper we deal with robust inference in heteroscedastic measurement error models. Rather than the normal distribution, we postulate a Student t distribution for the observed variables. Maximum likelihood estimates are computed numerically. Consistent estimation of the asymptotic covariance matrices of the maximum likelihood and generalized least squares estimators is also discussed. Three test statistics are proposed for testing hypotheses of interest; their asymptotic chi-square distribution guarantees correct asymptotic significance levels. Results of simulations and an application to a real data set are also reported. (C) 2009 The Korean Statistical Society. Published by Elsevier B.V. All rights reserved.
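The route from a numerically maximized Student-t likelihood to an asymptotic covariance matrix (the inverse of the observed information) can be illustrated on a much simpler location-scale model; the heteroscedastic measurement error structure of the paper is not reproduced here.

```python
import numpy as np
from scipy import optimize, stats

def fit_t_location_scale(y, df=4.0):
    """ML fit of a location-scale Student t with fixed degrees of freedom.
    Returns the estimates of (mu, log sigma) and the BFGS inverse-Hessian
    approximation to their asymptotic covariance matrix."""
    def negloglik(theta):
        mu, log_s = theta
        return -np.sum(stats.t.logpdf(y, df, loc=mu, scale=np.exp(log_s)))
    res = optimize.minimize(negloglik, x0=[np.median(y), 0.0], method="BFGS")
    return res.x, res.hess_inv  # covariance is in the (mu, log sigma) scale
```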
Abstract:
Mixed linear models are commonly used in repeated measures studies. They account for the dependence among observations obtained from the same experimental unit. Often the number of observations is small, so it is important to use inference strategies that incorporate small sample corrections. In this paper, we develop modified versions of the likelihood ratio test for fixed effects inference in mixed linear models. In particular, we derive a Bartlett correction to such a test, and also to a test obtained from a modified profile likelihood function. Our results generalize those in [Zucker, D.M., Lieberman, O., Manor, O., 2000. Improved small sample inference in the mixed linear model: Bartlett correction and adjusted likelihood. Journal of the Royal Statistical Society B, 62, 827-838] by allowing the parameter of interest to be vector-valued. Additionally, our Bartlett corrections allow for a nonlinear covariance matrix structure for the random effects. We report simulation results which show that the proposed tests display superior finite sample behavior relative to the standard likelihood ratio test. An application is also presented and discussed. (C) 2008 Elsevier B.V. All rights reserved.
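The mechanics of a Bartlett correction are simple once the null expectation of the LR statistic is available (from theory, as derived in papers like this one, or from simulation): rescale the statistic so that its mean matches the chi-square reference. A generic sketch:

```python
from scipy import stats

def bartlett_corrected_lr(lr_obs, expected_lr, q):
    """Bartlett-corrected likelihood ratio test with q restrictions.
    expected_lr is E[LR] under the null; the correction rescales LR so its
    mean matches the chi-square(q) mean, i.e. LR / (1 + b) with
    b = expected_lr / q - 1."""
    lr_corr = lr_obs * q / expected_lr
    return lr_corr, stats.chi2.sf(lr_corr, df=q)  # corrected statistic, p-value
```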
Abstract:
This paper derives the second-order biases of maximum likelihood estimates from a multivariate normal model where the mean vector and the covariance matrix have parameters in common. We show that the second-order bias can always be obtained by means of ordinary weighted least-squares regressions. We conduct simulation studies which indicate that the bias correction scheme yields nearly unbiased estimators. (C) 2009 Elsevier B.V. All rights reserved.
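To make "second-order bias" concrete without reproducing the paper's weighted least-squares machinery, consider the textbook case of the normal variance MLE, whose O(1/n) bias is known in closed form; subtracting the plug-in bias estimate gives an estimator unbiased to O(1/n^2).

```python
import numpy as np

rng = np.random.default_rng(1)

def corrected_variance_mle(y):
    """The normal-variance MLE s2 = SSE/n has second-order bias -sigma^2/n;
    plugging in s2 for sigma^2 and subtracting yields s2 * (1 + 1/n)."""
    n = len(y)
    s2 = np.mean((y - y.mean()) ** 2)
    return s2 * (1.0 + 1.0 / n)

# quick simulation: true variance is 4.0, sample size 10
samples = [rng.normal(0.0, 2.0, size=10) for _ in range(20000)]
print(np.mean([np.mean((y - y.mean()) ** 2) for y in samples]))  # approx 3.6
print(np.mean([corrected_variance_mle(y) for y in samples]))     # approx 3.96
```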
Abstract:
In natural scenes, spectrally very similar classes occur fairly often, i.e., their mean vectors are very close. In such situations, low-dimensional data (LandSat-TM, Spot) do not allow an accurate classification of the scene. On the other hand, it is known that high-dimensional data [FUK 90] make it possible to separate these classes, provided the covariance matrices are sufficiently distinct. The practical problem that then arises is the estimation of the parameters that characterize the distribution of each class. As the dimensionality of the data grows, the number of parameters to be estimated increases, especially in the covariance matrix. However, it is well known that in practice the number of available training samples is often very limited, which hampers the estimation of the parameters required by the classifier and thus degrades the accuracy of the classification process as the dimensionality of the data increases. This phenomenon, known as the Hughes effect, is well established in the literature, and studies have sought to mitigate it. Among the alternatives proposed for this purpose are covariance matrix regularization techniques. Regularization techniques for estimating the class covariance matrices are therefore an interesting topic of study, as is their behavior on high-dimensional digital image data in remote sensing, such as the data provided by the AVIRIS sensor. This study provides background on remote sensing, describes the AVIRIS sensor system, presents the principles of linear (LDA), quadratic (QDA) and regularized (RDA) discriminant analysis, and reports practical experiments with these methods using real sensor data. The results show that, with a limited number of training samples, covariance matrix regularization techniques were effective in reducing the Hughes effect. Regarding accuracy, in some cases the quadratic model remains the best despite the Hughes effect, while in other cases the regularized method is superior, in addition to attenuating this effect. This dissertation is organized as follows. The first chapter introduces remote sensing (electromagnetic radiation, the electromagnetic spectrum, spectral bands, spectral signatures), describes the concepts and operation of the AVIRIS hyperspectral sensor, and covers the basics of pattern recognition and the statistical approach. The second chapter reviews the literature on the problems associated with data dimensionality, describes the parametric techniques mentioned above (QDA, LDA and RDA), and reports tests performed with other types of data and their results. The third chapter covers the methodology applied to the available hyperspectral data. The fourth chapter presents the tests and experiments with Regularized Discriminant Analysis (RDA) on hyperspectral images acquired by the AVIRIS sensor. The fifth chapter presents the conclusions and final analysis.
The scientific contribution of this study is the application of covariance matrix regularization methods, originally proposed by Friedman [FRI 89] for classifying high-dimensional data (synthetic data, enology data), to the specific case of high-dimensional remote sensing data (hyperspectral images). The main conclusion of this dissertation is that the RDA method is useful for classifying images with high-dimensional data and classes with very similar spectral characteristics.
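Friedman's regularization, as applied in this dissertation, combines two shrinkage steps: a mixing parameter pulls each class covariance toward the pooled covariance, and a second parameter shrinks the result toward a scaled identity. A minimal sketch (the sample-size weighting present in Friedman's original mixing step is omitted here):

```python
import numpy as np

def rda_covariance(S_k, S_pooled, lam, gamma):
    """Two-parameter regularized class covariance in the spirit of RDA.
    lam = 0, gamma = 0 recovers QDA; lam = 1, gamma = 0 recovers LDA;
    gamma > 0 shrinks toward a scaled identity, stabilizing the estimate
    when training samples are scarce (the Hughes effect regime)."""
    p = S_k.shape[0]
    S_lam = (1.0 - lam) * S_k + lam * S_pooled
    return (1.0 - gamma) * S_lam + gamma * (np.trace(S_lam) / p) * np.eye(p)
```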
Abstract:
This paper develops a general method for constructing similar tests based on the conditional distribution of nonpivotal statistics in a simultaneous equations model with normal errors and known reduced-form covariance matrix. The test based on the likelihood ratio statistic is particularly simple and has good power properties. When identification is strong, the power curve of this conditional likelihood ratio test is essentially equal to the power envelope for similar tests. Monte Carlo simulations also suggest that this test dominates the Anderson-Rubin test and the score test. Dropping the restrictive assumption of disturbances normally distributed with known covariance matrix, approximate conditional tests are found that behave well in small samples even when identification is weak.
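In the notation that later became standard in the weak-instrument literature (an assumption here, with $S$ and $T$ the independent standardized statistics underlying the conditioning argument), the likelihood ratio statistic can be written as

$$
\mathrm{LR} = \tfrac{1}{2}\left( Q_S - Q_T + \sqrt{\left(Q_S + Q_T\right)^2 - 4\left(Q_S Q_T - Q_{ST}^2\right)} \right),
\qquad Q_S = S'S,\quad Q_T = T'T,\quad Q_{ST} = S'T,
$$

with critical values taken from the distribution of LR conditional on the observed value of $Q_T$, which is what makes the test similar.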
Abstract:
This work analyzes the performance of regularized portfolio optimization models using financial assets from the Brazilian market. In particular, we regularize the portfolios through constraints on the norm of the asset weights, as in DeMiguel et al. (2009). Additionally, we analyze the performance of portfolios that take into account information about the structure of groups of assets with similar characteristics, as proposed by Fernandes, Rocha and Souza (2011). While the covariance matrix used in the analyses is estimated from sample data, the expected returns are obtained by reverse optimization of the market equilibrium portfolio proposed by Black and Litterman (1992). The out-of-sample empirical analysis for the period between January 2010 and October 2014 indicates that, in line with previous studies, penalizing the norms of the weights can lead (depending on the chosen norm and the intensity of the constraint) to better performance in terms of Sharpe ratio and average return relative to portfolios obtained via the traditional Markowitz model. Moreover, including information about asset groups can also benefit the computation of optimal portfolios, both relative to the traditional methods and relative to the cases that do not use the group structure.
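The norm-constrained problem of DeMiguel et al. (2009) referenced above is a small convex program; a sketch of the 1-norm variant using cvxpy (an assumed dependency, with an illustrative norm budget c):

```python
import cvxpy as cp

def norm_constrained_min_variance(Sigma, c=1.5):
    """Minimum-variance weights under full investment and a 1-norm budget.
    c = 1 together with sum(w) = 1 forbids short selling; larger c relaxes
    the constraint toward the unconstrained Markowitz solution."""
    n = Sigma.shape[0]
    w = cp.Variable(n)
    problem = cp.Problem(
        cp.Minimize(cp.quad_form(w, Sigma)),
        [cp.sum(w) == 1, cp.norm(w, 1) <= c],
    )
    problem.solve()
    return w.value
```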
Abstract:
Parent, L. E., Natale, W. and Ziadi, N. 2009. Compositional nutrient diagnosis of corn using the Mahalanobis distance as nutrient imbalance index. Can. J. Soil Sci. 89: 383-390. Compositional nutrient diagnosis (CND) provides a plant nutrient imbalance index (CND-r²) with assumed χ² distribution. The Mahalanobis distance D², which detects outliers in compositional data sets, also has a χ² distribution. The objective of this paper was to compare the D² and CND-r² nutrient imbalance indexes in corn (Zea mays L.). We measured grain yield as well as N, P, K, Ca, Mg, Cu, Fe, Mn, and Zn concentrations in the ear leaf at silk stage for 210 calibration sites in the St. Lawrence Lowlands [2300-2700 corn thermal units (CTU)] as well as 30 phosphorus (2300-2700 CTU; 10 sites) and 10 nitrogen (1900-2100 CTU; one site) replicated fertilizer treatments for validation. We derived CND norms as the mean, standard deviation, and inverse covariance matrix of centred log ratios (clr) for high-yielding specimens (≥ 9.0 Mg grain ha⁻¹ at 150 g H₂O kg⁻¹ moisture content) in the 2300-2700 CTU zone. Using χ² = 17 (P < 0.05) with nine degrees of freedom (i.e., nine nutrients) as a rejection criterion for outliers and a yield threshold of 8.6 Mg ha⁻¹ after Cate-Nelson partitioning between low- and high-yielders in the P validation data set, D² misclassified two specimens compared with nine for CND-r². The D² classification was not significantly different from a χ² classification (P > 0.05), but the CND-r² classification differed significantly from χ² or D² (P < 0.001). A threshold value for nutrient imbalance could thus be derived probabilistically for conducting D² diagnosis, while the CND-r² nutrient imbalance threshold must be calibrated using fertilizer trials. In the proposed CND-D² procedure, D² is first computed to classify the specimen as a possible outlier. Thereafter, nutrient indices are ranked in their order of limitation. The D² norms appeared less effective in the 1900-2100 CTU zone.
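The proposed procedure (clr-transform the composition, compute D² against the CND norms, compare with a chi-square cutoff) is straightforward to sketch; the norms (mean and inverse covariance) are assumed to be available from the calibration step.

```python
import numpy as np
from scipy.stats import chi2

def clr(x):
    """Centred log-ratio transform of a composition with positive parts."""
    log_x = np.log(x)
    return log_x - log_x.mean()

def mahalanobis_d2(z, norm_mean, norm_inv_cov):
    """Squared Mahalanobis distance of a clr vector from the CND norms."""
    d = z - norm_mean
    return d @ norm_inv_cov @ d

# rejection criterion for outliers: P < 0.05 with nine nutrients
threshold = chi2.ppf(0.95, df=9)  # about 16.9, the "chi-square = 17" above
```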