956 results for Variance-covariance Matrices


Relevance: 80.00%

Abstract:

Generally, in multiple-hypothesis settings one seeks to reject either all of the hypotheses or just one of them. For some time now, however, the need has emerged to answer the question: "Can we reject at least r hypotheses?". Statistical tools for answering this question are scarce in the literature. We have therefore undertaken to develop general power formulas for the most widely used procedures, namely those of Bonferroni, Hochberg and Holm. We have developed an R package for sample size calculation in multiple-endpoint testing, where one requires that at least r of the m hypotheses be significant. We restrict ourselves to the case where all variables are continuous and present four different situations that depend on the structure of the variance-covariance matrix of the data.
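
As a rough illustration of the power question the abstract poses, the Python sketch below estimates by Monte Carlo the probability that a Bonferroni procedure rejects at least r of m correlated one-sided hypotheses, then searches for the smallest sample size reaching 80% power. The package itself provides closed-form formulas; every parameter value and function name here is illustrative.

```python
import numpy as np
from scipy import stats

def power_at_least_r(n, delta, sigma, r, alpha=0.05, n_sim=10_000, rng=None):
    """Monte Carlo power of Bonferroni: P(reject at least r of m hypotheses).
    delta: true effect sizes; sigma: variance-covariance matrix of endpoints."""
    if rng is None:
        rng = np.random.default_rng(0)
    m = len(delta)
    sd = np.sqrt(np.diag(sigma))
    ncp = np.sqrt(n) * np.asarray(delta) / sd          # non-centrality of z-stats
    corr = sigma / np.outer(sd, sd)                    # correlation of z-stats
    z = rng.multivariate_normal(ncp, corr, size=n_sim) # correlated test statistics
    rejected = (z > stats.norm.ppf(1 - alpha / m)).sum(axis=1)
    return np.mean(rejected >= r)

# Smallest n with 80% power to reject at least 2 of 3 correlated endpoints
sigma = np.array([[1.0, 0.5, 0.5], [0.5, 1.0, 0.5], [0.5, 0.5, 1.0]])
for n in range(10, 200, 5):
    if power_at_least_r(n, delta=[0.4, 0.4, 0.4], sigma=sigma, r=2) >= 0.8:
        print("n =", n)
        break
```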

Relevance: 80.00%

Abstract:

The Dirichlet family owes its privileged status within simplex distributions to its ease of interpretation and good mathematical properties. In particular, we recall fundamental properties for the analysis of compositional data, such as closure under amalgamation and subcomposition. From a probabilistic point of view, it is uniquely characterised by a variety of independence relationships, which makes it indisputably the reference model for expressing the non-trivial idea of substantial independence for compositions. Indeed, its well-known inadequacy as a general model for compositional data stems from this independence structure together with the poverty of its parametrisation. In this paper a new class of distributions (called the Flexible Dirichlet), capable of handling various dependence structures and containing the Dirichlet as a special case, is presented. The new model exhibits a considerably richer parametrisation which, for example, makes it possible to model the means and (part of) the variance-covariance matrix separately. Moreover, the model preserves some good mathematical properties of the Dirichlet, i.e. closure under amalgamation and subcomposition, with new parameters simply related to the parent composition parameters. Furthermore, the joint and conditional distributions of subcompositions and relative totals can be expressed as simple mixtures of two Flexible Dirichlet distributions. The basis generating the Flexible Dirichlet, though retaining compositional invariance, shows a dependence structure which allows various forms of partitional dependence to be accommodated by the model (e.g. non-neutrality, subcompositional dependence and subcompositional non-invariance), with independence cases identified by suitable parameter configurations. In particular, within this model substantial independence among subsets of components of the composition naturally occurs when the subsets have a Dirichlet distribution.
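
The Flexible Dirichlet admits a finite-mixture construction in which each component is a Dirichlet with one shape parameter shifted; the minimal Python sampler below assumes that representation (the mixture weights p and shift tau are our names for the extra parameters, not necessarily the paper's notation).

```python
import numpy as np

def sample_flexible_dirichlet(alpha, p, tau, size=1, rng=None):
    """Draw from a mixture sum_i p[i] * Dirichlet(alpha + tau * e_i),
    an assumed representation of the Flexible Dirichlet (sketch only)."""
    if rng is None:
        rng = np.random.default_rng(0)
    alpha, p = np.asarray(alpha, float), np.asarray(p, float)
    labels = rng.choice(len(alpha), size=size, p=p)   # mixture component labels
    out = np.empty((size, len(alpha)))
    for k, i in enumerate(labels):
        shape = alpha.copy()
        shape[i] += tau                               # shift the i-th shape parameter
        out[k] = rng.dirichlet(shape)
    return out

x = sample_flexible_dirichlet(alpha=[2.0, 3.0, 4.0], p=[0.2, 0.3, 0.5], tau=1.5, size=5)
print(x, x.sum(axis=1))  # rows lie on the simplex
```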

Relevance: 80.00%

Abstract:

This paper decomposes the term structure of interest rates on U.S. and Colombian sovereign bonds. We use a four-factor affine model, where the first factor is a return-forecasting factor and the remaining three are the first three principal components of the variance-covariance matrix of interest rates. For the decomposition of Colombian rates, the U.S. forecasting factor is included to capture spillover effects. We conclude that U.S. rates do not affect the level of Colombian rates, but they do influence the expected excess returns on bonds, and there are also effects on the local factors, although the factor that drives the dynamics of local rates is the "level". The decomposition yields the expectations of the short rate and the term premium. We observe that the value of the term premium and its volatility increase with maturity, and that this value has been declining over time.
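
A minimal Python sketch of the principal-component step, extracting "level, slope, curvature"-type factors from the variance-covariance matrix of a yield panel (synthetic data here; the paper's affine model adds the return-forecasting factor on top of these):

```python
import numpy as np

def yield_pca_factors(yields, k=3):
    """yields: (T, n_maturities) panel; returns (T, k) factor scores from
    the top k eigenvectors of the variance-covariance matrix."""
    demeaned = yields - yields.mean(axis=0)
    cov = np.cov(demeaned, rowvar=False)        # variance-covariance matrix
    eigval, eigvec = np.linalg.eigh(cov)
    order = np.argsort(eigval)[::-1][:k]        # largest eigenvalues first
    return demeaned @ eigvec[:, order]

rng = np.random.default_rng(1)
T, n = 500, 8
# Fake yields: a common random-walk "level" plus maturity-specific noise
fake_yields = np.cumsum(rng.normal(0, 0.02, (T, 1)), axis=0) + rng.normal(0, 0.05, (T, n))
factors = yield_pca_factors(fake_yields)
print(factors.shape)  # (500, 3): level, slope, curvature proxies
```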

Relevance: 80.00%

Abstract:

Data assimilation is a sophisticated mathematical technique for combining observational data with model predictions to produce state and parameter estimates that most accurately approximate the current and future states of the true system. The technique is commonly used in atmospheric and oceanic modelling, combining empirical observations with model predictions to produce more accurate and well-calibrated forecasts. Here, we consider a novel application within a coastal environment and describe how the method can also be used to deliver improved estimates of uncertain morphodynamic model parameters. This is achieved using a technique known as state augmentation. Earlier applications of state augmentation have typically employed the 4D-Var, Kalman filter or ensemble Kalman filter assimilation schemes. Our new method is based on a computationally inexpensive 3D-Var scheme, where the specification of the error covariance matrices is crucial for success. A simple 1D model of bed-form propagation is used to demonstrate the method. The scheme is capable of recovering near-perfect parameter values and, therefore, improves the capability of our model to predict future bathymetry. Such positive results suggest the potential for application to more complex morphodynamic models.
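
A minimal sketch of the state augmentation idea in a linear, 3D-Var-like update, assuming a toy state of five grid points and one uncertain parameter: the parameter is not observed directly, so it is corrected only through the state-parameter cross-covariances in B, which is one way to see why the specification of the error covariance matrices is crucial. All matrices below are illustrative choices, not the paper's operational settings.

```python
import numpy as np

def augmented_3dvar_update(xb, theta_b, y, H, B, R):
    """One linear (BLUE) analysis step on the augmented state z = [x; theta]."""
    zb = np.concatenate([xb, [theta_b]])
    K = B @ H.T @ np.linalg.inv(H @ B @ H.T + R)   # gain from error covariances
    za = zb + K @ (y - H @ zb)                     # correct with the innovation
    return za[:-1], za[-1]                         # analysed state, parameter

n = 5
H = np.hstack([np.eye(n), np.zeros((n, 1))])       # the parameter is unobserved
B = np.zeros((n + 1, n + 1))
B[:n, :n] = 0.1 * np.eye(n)                        # state background errors
B[n, n] = 0.5                                      # parameter background error
B[:n, n] = B[n, :n] = 0.05                         # state-parameter cross terms
R = 0.05 * np.eye(n)                               # observation errors
xa, theta_a = augmented_3dvar_update(np.zeros(n), 1.0, np.full(n, 0.2), H, B, R)
print(theta_a)                                     # moves off 1.0 via the cross terms
```

Setting the cross terms to zero leaves theta_a at its background value, which mirrors the paper's point that the covariance specification determines whether parameter estimation succeeds.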

Relevance: 80.00%

Abstract:

The influence matrix is used in ordinary least-squares applications for monitoring statistical multiple-regression analyses. Concepts related to the influence matrix provide diagnostics on the influence of individual data on the analysis: the analysis change that would occur by leaving one observation out, and the effective information content (degrees of freedom for signal) in any subset of the analysed data. In this paper, the corresponding concepts have been derived in the context of linear statistical data assimilation in numerical weather prediction. An approximate method to compute the diagonal elements of the influence matrix (the self-sensitivities) has been developed for a large-dimension variational data assimilation system (the four-dimensional variational system of the European Centre for Medium-Range Weather Forecasts). Results show that, in the boreal spring 2003 operational system, 15% of the global influence is due to the assimilated observations in any one analysis, and the complementary 85% is the influence of the prior (background) information, a short-range forecast containing information from earlier assimilated observations. About 25% of the observational information is currently provided by surface-based observing systems, and 75% by satellite systems. Low-influence data points usually occur in data-rich areas, while high-influence data points are in data-sparse areas or in dynamically active regions. Background-error correlations also play an important role: high correlation diminishes the observation influence and amplifies the importance of the surrounding real and pseudo observations (prior information in observation space). Incorrect specifications of background and observation-error covariance matrices can be identified, interpreted and better understood by the use of influence-matrix diagnostics for the variety of observation types and observed variables used in the data assimilation system.
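
For a linear analysis the influence matrix can be formed explicitly in small dimensions; the sketch below (toy matrices, not the ECMWF system, where the diagonal can only be approximated) computes S = HK, whose diagonal gives the self-sensitivities and whose trace gives the degrees of freedom for signal:

```python
import numpy as np

def observation_influence(H, B, R):
    """Influence matrix S = HK mapping observations to the analysis in
    observation space; diag(S) are self-sensitivities, trace(S) the DFS."""
    K = B @ H.T @ np.linalg.inv(H @ B @ H.T + R)   # Kalman gain
    return H @ K

rng = np.random.default_rng(0)
n, m = 20, 8                                        # state / observation sizes
H = rng.normal(size=(m, n))
B = np.eye(n)                                       # background error covariance
R = 0.5 * np.eye(m)                                 # observation error covariance
S = observation_influence(H, B, R)
print("self-sensitivities:", np.round(np.diag(S), 3))
print("observation influence fraction:", np.trace(S) / m)  # rest is background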

Relevance: 80.00%

Abstract:

We show that the four-dimensional variational data assimilation method (4DVar) can be interpreted as a form of Tikhonov regularization, a very familiar method for solving ill-posed inverse problems. It is known from image restoration problems that L1-norm penalty regularization recovers sharp edges in the image more accurately than Tikhonov, or L2-norm, penalty regularization. We apply this idea from stationary inverse problems to 4DVar, a dynamical inverse problem, and give examples for an L1-norm penalty approach and a mixed total variation (TV) L1–L2-norm penalty approach. For problems with model error where sharp fronts are present and the background and observation error covariances are known, the mixed TV L1–L2-norm penalty performs better than either the L1-norm method or the strong constraint 4DVar (L2-norm) method. A strength of the mixed TV L1–L2-norm regularization is that, in the case where a simplified form of the background error covariance matrix is used, it produces a much more accurate analysis than 4DVar. The method thus has the potential in numerical weather prediction to overcome operational problems with poorly tuned background error covariance matrices.
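
A toy stationary analogue of the comparison, with made-up data and penalty weights (not 4DVar itself): recovering a signal containing a sharp front from noisy observations, under a Tikhonov (squared-gradient, L2) penalty versus a total-variation (absolute-gradient, L1) penalty. The TV term is smoothed slightly so a gradient-based optimizer can handle it.

```python
import numpy as np
from scipy.optimize import minimize

def recover(y, lam, penalty):
    def cost(x):
        dx = np.diff(x)
        # Smoothed |dx| for the TV case keeps the objective differentiable
        reg = np.sum(dx ** 2) if penalty == "L2" else np.sum(np.sqrt(dx ** 2 + 1e-8))
        return np.sum((x - y) ** 2) + lam * reg
    return minimize(cost, y, method="L-BFGS-B").x

rng = np.random.default_rng(0)
truth = np.where(np.arange(100) < 50, 0.0, 1.0)     # a sharp front
y = truth + rng.normal(0, 0.1, 100)
x_l2 = recover(y, lam=5.0, penalty="L2")            # smears the edge
x_tv = recover(y, lam=0.5, penalty="TV")            # keeps the edge sharp
print(np.abs(x_l2 - truth).max(), np.abs(x_tv - truth).max())
```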

Relevance: 80.00%

Abstract:

We systematically compare the performance of ETKF-4DVAR, 4DVAR-BEN and 4DENVAR with respect to two traditional methods (4DVAR and ETKF) and an ensemble transform Kalman smoother (ETKS) on the Lorenz 1963 model. We specifically investigate this performance with increasing nonlinearity, using a quasi-static variational assimilation (QSVA) algorithm as a comparison. Using the analysis root mean square error (RMSE) as a metric, the methods are compared with respect to (1) assimilation window length and observation interval size and (2) ensemble size, in order to investigate the influence of hybrid background error covariance matrices and nonlinearity on their performance. For short assimilation windows, where the dynamics are close to linear, all hybrid methods show an improvement in RMSE over the traditional methods. For long assimilation windows, in which nonlinear dynamics are substantial, the variational framework can have difficulties finding the global minimum of the cost function, so we explore the QSVA framework. Among the hybrid methods, under certain parameter settings those that do not use a climatological background error covariance do not need QSVA to perform accurately. Generally, the results show that the ETKS, and the hybrid methods that do not use a climatological background error covariance matrix combined with QSVA, outperform all other methods, owing to the full flow dependency of the background error covariance matrix, which also accommodates the most nonlinearity.
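
For reference, the Lorenz 1963 testbed is small enough to integrate in a few lines; a standard fourth-order Runge-Kutta sketch with the usual parameter values (the assimilation methods themselves are beyond a short example):

```python
import numpy as np

def lorenz63(state, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Right-hand side of the Lorenz (1963) system."""
    x, y, z = state
    return np.array([sigma * (y - x), x * (rho - z) - y, x * y - beta * z])

def rk4_step(f, state, dt):
    """One fourth-order Runge-Kutta step."""
    k1 = f(state)
    k2 = f(state + 0.5 * dt * k1)
    k3 = f(state + 0.5 * dt * k2)
    k4 = f(state + dt * k3)
    return state + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)

state = np.array([1.0, 1.0, 1.0])
trajectory = [state]
for _ in range(2000):                      # 20 time units at dt = 0.01
    state = rk4_step(lorenz63, state, 0.01)
    trajectory.append(state)
print(np.array(trajectory).shape)
```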

Relevance: 80.00%

Abstract:

This paper investigates the effect on balance of a number of Schur product-type localization schemes which have been designed with the primary function of reducing spurious far-field correlations in forecast error statistics. The localization schemes studied comprise a non-adaptive scheme (where the moderation matrix is decomposed in a spectral basis), and two adaptive schemes, namely a simplified version of SENCORP (Smoothed ENsemble COrrelations Raised to a Power) and ECO-RAP (Ensemble COrrelations Raised to A Power). The paper shows, we believe for the first time, how the degree of balance (geostrophic and hydrostatic) implied by the error covariance matrices localized by these schemes can be diagnosed. Here it is considered that an effective localization scheme is one that reduces spurious correlations adequately but also minimizes disruption of balance (where the 'correct' degree of balance or imbalance is assumed to be possessed by the unlocalized ensemble). By varying free parameters that describe each scheme (e.g. the degree of truncation in the schemes that use the spectral basis, the 'order' of each scheme, and the degree of ensemble smoothing), it is found that a particular configuration of the ECO-RAP scheme is best suited to the convective-scale system studied. According to our diagnostics this ECO-RAP configuration still weakens geostrophic and hydrostatic balance, but overall this is less so than for other schemes.
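
The common core of these schemes is an element-wise (Schur) product of the raw ensemble covariance with a moderation matrix. The sketch below uses a simple Gaussian taper rather than the SENCORP or ECO-RAP constructions, purely to illustrate how far-field correlations are damped:

```python
import numpy as np

def localize(ens_cov, distances, length_scale):
    """Schur-product localization: taper the raw ensemble covariance with
    a distance-based moderation matrix (Gaussian taper, for illustration)."""
    moderation = np.exp(-0.5 * (distances / length_scale) ** 2)
    return ens_cov * moderation                 # element-wise (Schur) product

rng = np.random.default_rng(0)
n, n_ens = 40, 10
ensemble = rng.normal(size=(n_ens, n))
raw_cov = np.cov(ensemble, rowvar=False)        # noisy, rank-deficient estimate
idx = np.arange(n)
distances = np.abs(idx[:, None] - idx[None, :])
loc_cov = localize(raw_cov, distances, length_scale=5.0)
print(abs(raw_cov[0, -1]), abs(loc_cov[0, -1]))  # spurious far-field term damped
```

Since the Gaussian taper is itself a valid covariance, the Schur product theorem guarantees the localized matrix stays positive semi-definite; the balance question the paper studies is what this element-wise operation does to geostrophic and hydrostatic relationships encoded in the original covariance.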

Relevance: 80.00%

Abstract:

Evolutionary change in New World Monkey (NWM) skulls occurred primarily along the line of least resistance defined by size (including allometric) variation (g_max). Although the direction of evolution was aligned with this axis, it was not clear whether this macroevolutionary pattern results from the conservation of within-population genetic covariance patterns (long-term constraint), from long-term selection along a size dimension, or whether both constraints and selection were inextricably involved. Furthermore, G-matrix stability can itself be a consequence of selection, which implies that both the constraints embodied in g_max and the evolutionary changes observed in the trait averages could be influenced by selection. Here, we describe a combination of approaches that allows one to test whether any particular instance of size evolution is a correlated by-product of constraints (g_max) or is due to direct selection on size, and we apply it to NWM lineages as a case study. The approach is based on comparing the direction and amount of evolutionary change produced by two different simulated sets of net-selection gradients (beta): a size set (isometric and allometric size) and a non-size set. With this approach it is possible to distinguish between the two hypotheses (indirect size evolution due to constraints versus direct selection on size) because, although both may produce an evolutionary response aligned with g_max, the amount of change produced by random selection operating through the variance/covariance patterns (constraints hypothesis) will be much smaller than that produced by selection on size (selection hypothesis). Furthermore, the alignment of simulated evolutionary changes with g_max when selection is not on size is not as tight as when selection is actually on size, allowing a statistical test of whether a particular observed case of evolution along the line of least resistance is the result of selection along it. Also, with matrix diagonalization (principal components [PC]), it is possible to calculate the net-selection gradient on size alone (the first PC [PC1]) directly, by dividing the amount of phenotypic difference between any two populations by the amount of variation in PC1, which allows one to benchmark whether or not selection was on size.
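
The core of the simulation logic can be sketched with the multivariate response equation Δz = Gβ and an illustrative G matrix (not the NWM data): responses to random non-size gradients align with g_max less tightly, and are smaller, than the response to an isometric size gradient.

```python
import numpy as np

rng = np.random.default_rng(0)
p = 10
A = rng.normal(size=(p, p))
G = A @ A.T + 5.0 * np.ones((p, p))            # covariance with a strong size factor
g_max = np.linalg.eigh(G)[1][:, -1]            # line of least resistance

def align_and_size(beta):
    dz = G @ beta                              # response to selection: dz = G beta
    return abs(dz @ g_max) / np.linalg.norm(dz), np.linalg.norm(dz)

beta_size = np.ones(p) / np.sqrt(p)            # isometric size gradient (unit length)
rand = [align_and_size(rng.normal(size=p) / np.sqrt(p)) for _ in range(1000)]
print("selection on size (alignment, |dz|):", align_and_size(beta_size))
print("random selection, mean (alignment, |dz|):", tuple(np.mean(rand, axis=0)))
```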

Relevance: 80.00%

Abstract:

Changes in patterns and magnitudes of integration may influence the ability of a species to respond to selection. Consequently, modularity has often been linked to the concept of evolvability, but their relationship has rarely been tested empirically. One possible explanation is the lack of analytical tools to compare patterns and magnitudes of integration among diverse groups that explicitly relate these aspects to the quantitative genetics framework. We apply such a framework here, using the multivariate response to selection equation to simulate the evolutionary behavior of several mammalian orders in terms of their flexibility, evolvability and constraints in the skull. We interpreted these simulation results in light of the integration patterns and magnitudes of the same mammalian groups, described in a companion paper. We found that larger magnitudes of integration were associated with a blurring of the modules in the skull and with larger portions of the total variation explained by size variation, which in turn can exert a strong evolutionary constraint, thus decreasing evolutionary flexibility. Conversely, lower overall magnitudes of integration were associated with distinct modules in the skull, with a smaller fraction of the total variation associated with size and, consequently, with weaker constraints and more evolutionary flexibility. Flexibility and constraints are, therefore, two sides of the same coin, and we found them to be quite variable among mammals. Neither the overall magnitude of morphological integration, nor modularity itself, nor their consequences in terms of constraints and flexibility, were associated with the absolute size of the organisms; all were strongly associated with the proportion of the total variation in skull morphology captured by size. Therefore, the history of the mammalian skull is marked by a trade-off between modularity and evolvability. Our data provide evidence that, despite the stasis in integration patterns, the plasticity in the magnitude of integration in the skull had important consequences in terms of evolutionary flexibility of the mammalian lineages.
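
The flexibility and evolvability simulations can be summarized with random selection gradients passed through the multivariate response equation; the hedged sketch below uses an arbitrary covariance matrix and metrics in the spirit of this framework (not the paper's exact protocol):

```python
import numpy as np

rng = np.random.default_rng(1)
p = 8
A = rng.normal(size=(p, p))
G = A @ A.T                                   # illustrative covariance matrix

def flexibility_and_evolvability(G, n_sim=1000):
    flex, evol = [], []
    for _ in range(n_sim):
        beta = rng.normal(size=G.shape[0])
        beta /= np.linalg.norm(beta)          # unit-length selection gradient
        dz = G @ beta                         # response: delta_z = G beta
        flex.append(dz @ beta / np.linalg.norm(dz))  # cos(beta, dz): flexibility
        evol.append(beta @ G @ beta)                 # response along beta: evolvability
    return np.mean(flex), np.mean(evol)

print(flexibility_and_evolvability(G))
```

A G dominated by a single size factor would drive the mean cosine down for most gradients, which is the constraint-versus-flexibility trade-off the abstract describes.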

Relevance: 80.00%

Abstract:

In this paper we deal with robust inference in heteroscedastic measurement error models. Rather than the normal distribution, we postulate a Student t distribution for the observed variables. Maximum likelihood estimates are computed numerically. Consistent estimation of the asymptotic covariance matrices of the maximum likelihood and generalized least squares estimators is also discussed. Three test statistics are proposed for testing hypotheses of interest, with the asymptotic chi-square distribution, which guarantees correct asymptotic significance levels. Results of simulations and an application to a real data set are also reported.
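
A stripped-down illustration of the robustness argument (a plain regression with Student t errors, not the paper's heteroscedastic measurement error model): the t likelihood downweights gross outliers that would distort a normal-likelihood fit.

```python
import numpy as np
from scipy import optimize, stats

def neg_loglik(params, x, y, df=4):
    """Negative log-likelihood of a line with Student t errors."""
    a, b, log_s = params
    return -np.sum(stats.t.logpdf(y - a - b * x, df, scale=np.exp(log_s)))

rng = np.random.default_rng(0)
x = np.linspace(0, 10, 50)
y = 1.0 + 2.0 * x + rng.standard_t(4, size=50)
y[::10] += 15                                   # inject gross outliers
res = optimize.minimize(neg_loglik, x0=[0.0, 1.0, 0.0], args=(x, y))
print(res.x[:2])                                # intercept/slope near (1, 2)
```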

Relevance: 80.00%

Abstract:

In natural scenes, spectrally very similar classes occur fairly frequently, that is, classes whose mean vectors are very close. In such situations, low-dimensional data (LandSat-TM, Spot) do not permit an accurate classification of the scene. On the other hand, it is known that high-dimensional data [FUK 90] make it possible to separate these classes, provided that the covariance matrices are sufficiently distinct. In this case, the practical problem that arises is the estimation of the parameters that characterize the distribution of each class. As the dimensionality of the data grows, the number of parameters to be estimated increases, especially in the covariance matrix. However, it is well known that in the real world the number of available training samples is often very limited, which causes problems in the estimation of the parameters required by the classifier and therefore degrades the accuracy of the classification process as the dimensionality of the data increases. This phenomenon, known as the Hughes effect, is well established in the scientific community, and studies have been carried out with the aim of mitigating it. Among the alternatives proposed to mitigate the Hughes effect are covariance matrix regularization techniques. Regularization techniques for estimating the class covariance matrices are therefore an interesting topic of study, as is the behaviour of these techniques on high-dimensional digital image data in remote sensing, for example the data provided by the AVIRIS sensor. In this study, remote sensing is placed in context, the AVIRIS sensor system is described, and the principles of linear (LDA), quadratic (QDA) and regularized (RDA) discriminant analysis are presented, together with practical experiments using real data from the sensor. The results show that, with a limited number of training samples, covariance matrix regularization techniques were effective in reducing the Hughes effect. As for accuracy, in some cases the quadratic model remains the best despite the Hughes effect, while in other cases the regularization method is superior, in addition to attenuating this effect. This dissertation is organized as follows. The first chapter introduces remote sensing (electromagnetic radiation, the electromagnetic spectrum, spectral bands, spectral signatures) and describes the concepts and operation of the AVIRIS hyperspectral sensor, as well as the basics of pattern recognition and the statistical approach. The second chapter reviews the literature on the problems associated with data dimensionality, describes the parametric techniques mentioned above (QDA, LDA and RDA), and reports tests carried out with other types of data and their results. The third chapter covers the methodology applied to the available hyperspectral data. The fourth chapter presents the tests and experiments with regularized discriminant analysis (RDA) on hyperspectral images obtained by the AVIRIS sensor. The fifth chapter presents the conclusions and final analysis.
The scientific contribution of this study is the application of covariance matrix regularization methods, originally proposed by Friedman [FRI 89] for the classification of high-dimensional data (synthetic data, oenology data), to the specific case of high-dimensional remote sensing data (hyperspectral images). The main conclusion of this dissertation is that the RDA method is useful for classifying images with high-dimensional data and classes with very similar spectral characteristics.
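
Friedman-style regularization, which the dissertation applies to hyperspectral data, shrinks each class covariance towards the pooled covariance and then towards a scaled identity; a compact sketch with hypothetical shrinkage weights lam and gamma:

```python
import numpy as np

def rda_covariance(S_class, S_pooled, lam, gamma):
    """Two-stage shrinkage in the spirit of Friedman's RDA: towards the
    pooled covariance (lam), then towards a scaled identity (gamma)."""
    S = (1 - lam) * S_class + lam * S_pooled
    p = S.shape[0]
    return (1 - gamma) * S + gamma * (np.trace(S) / p) * np.eye(p)

rng = np.random.default_rng(0)
p, n_small = 50, 30                                 # more bands than samples
X = rng.normal(size=(n_small, p))
S_class = np.cov(X, rowvar=False)                   # singular: rank at most n-1
S_pooled = np.eye(p)                                # stand-in pooled estimate
S_reg = rda_covariance(S_class, S_pooled, lam=0.5, gamma=0.1)
print(np.linalg.matrix_rank(S_class), np.linalg.matrix_rank(S_reg))  # 29, 50
```

The rank repair shown in the last line is exactly what makes the quadratic classifier usable when training samples are scarce relative to the number of spectral bands.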

Relevance: 80.00%

Abstract:

We examine bivariate extensions of Aït-Sahalia's approach to the estimation of univariate diffusions. Our message is that extending his idea to a bivariate setting is not straightforward. In higher dimensions, as opposed to the univariate case, the elements of the Itô and Fokker-Planck representations do not coincide, and even imposing sensible assumptions on the marginal drifts and volatilities is not sufficient to obtain direct generalisations. We develop exploratory estimation and testing procedures by parametrizing the drifts of both component processes and setting restrictions on the terms of either the Itô or the Fokker-Planck covariance matrices. This may lead to highly nonlinear ordinary differential equations, where the definition of boundary conditions is crucial. For the methods developed, the Fokker-Planck representation seems more tractable than Itô's. Questions for further research include the design of regularity conditions on the time-series dependence in the data, on the kernels actually used and on the bandwidths, so as to obtain asymptotic properties for the proposed estimators. One particular case seems promising: "causal bivariate models", in which only one of the diffusions contributes to the volatility of the other. Hedging strategies that estimate the univariate diffusions at stake separately may thus be improved.
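
An Euler-Maruyama simulation of the "causal bivariate model" mentioned at the end, in which X evolves autonomously while the volatility of Y is driven by X; all drift and volatility functions below are invented purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
T, dt = 10.0, 0.001
n = int(T / dt)
x, y = np.empty(n), np.empty(n)
x[0], y[0] = 1.0, 0.0
for t in range(n - 1):
    dWx, dWy = rng.normal(0, np.sqrt(dt), 2)        # independent increments
    x[t + 1] = x[t] + 0.5 * (1.0 - x[t]) * dt + 0.2 * dWx        # autonomous X
    y[t + 1] = y[t] - 0.3 * y[t] * dt + 0.4 * abs(x[t]) * dWy    # vol of Y driven by X
print(x[-1], y[-1])
```

Under this one-way structure, X can be estimated on its own and then plugged into the estimation of Y, which is the separability the abstract suggests hedging strategies could exploit.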

Relevance: 80.00%

Abstract:

Parent, L. E., Natale, W. and Ziadi, N. 2009. Compositional nutrient diagnosis of corn using the Mahalanobis distance as nutrient imbalance index. Can. J. Soil Sci. 89: 383-390. Compositional nutrient diagnosis (CND) provides a plant nutrient imbalance index (CND-r²) with assumed χ² distribution. The Mahalanobis distance D², which detects outliers in compositional data sets, also has a χ² distribution. The objective of this paper was to compare the D² and CND-r² nutrient imbalance indexes in corn (Zea mays L.). We measured grain yield as well as N, P, K, Ca, Mg, Cu, Fe, Mn, and Zn concentrations in the ear leaf at silk stage for 210 calibration sites in the St. Lawrence Lowlands [2300-2700 corn thermal units (CTU)], as well as 30 phosphorus (2300-2700 CTU; 10 sites) and 10 nitrogen (1900-2100 CTU; one site) replicated fertilizer treatments for validation. We derived CND norms as the mean, standard deviation, and inverse covariance matrix of centred log ratios (clr) for high-yielding specimens (≥ 9.0 Mg grain ha⁻¹ at 150 g H₂O kg⁻¹ moisture content) in the 2300-2700 CTU zone. Using χ² = 17 (P < 0.05) with nine degrees of freedom (i.e., nine nutrients) as a rejection criterion for outliers, and a yield threshold of 8.6 Mg ha⁻¹ after Cate-Nelson partitioning between low- and high-yielders in the P validation data set, D² misclassified two specimens compared with nine for CND-r². The D² classification was not significantly different from a χ² classification (P > 0.05), but the CND-r² classification differed significantly from χ² or D² (P < 0.001). A threshold value for nutrient imbalance could thus be derived probabilistically for conducting D² diagnosis, while the CND-r² nutrient imbalance threshold must be calibrated using fertilizer trials. In the proposed CND-D² procedure, D² is first computed to classify the specimen as a possible outlier; thereafter, nutrient indices are ranked in their order of limitation. The D² norms appeared less effective in the 1900-2100 CTU zone.
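
A sketch of the CND-D² computation, assuming made-up calibration data: clr-transform the tissue composition, compute D² against the high-yield norms, and compare with the χ² cutoff (about 17 at P < 0.05 with nine nutrients, as in the paper).

```python
import numpy as np
from scipy.stats import chi2

def clr(x):
    """Centred log-ratio transform of a composition."""
    x = np.asarray(x, float)
    return np.log(x) - np.log(x).mean()

rng = np.random.default_rng(0)
p = 9                                           # N, P, K, Ca, Mg, Cu, Fe, Mn, Zn
calib = rng.dirichlet(np.arange(2.0, 2.0 + p), size=210)  # stand-in calibration set
clr_calib = np.apply_along_axis(clr, 1, calib)
mean_clr = clr_calib.mean(axis=0)
# clr covariance is singular (components sum to zero), so use a pseudo-inverse
inv_cov = np.linalg.pinv(np.cov(clr_calib, rowvar=False))

def d2(sample):
    diff = clr(sample) - mean_clr
    return diff @ inv_cov @ diff                # Mahalanobis distance D^2

cutoff = chi2.ppf(0.95, df=p)                   # roughly 17, matching the paper
print(d2(calib[0]), "outlier cutoff:", cutoff)
```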