53 results for Genetic Variance-covariance Matrix
Abstract:
The problem of spurious excitation of gravity waves in the context of four-dimensional data assimilation is investigated using a simple model of balanced dynamics. The model admits a chaotic vortical mode coupled to a comparatively fast gravity wave mode, and can be initialized such that the model evolves on a so-called slow manifold, where the fast motion is suppressed. Identical twin assimilation experiments are performed, comparing the extended and ensemble Kalman filters (EKF and EnKF, respectively). The EKF uses a tangent linear model (TLM) to estimate the evolution of forecast error statistics in time, whereas the EnKF uses the statistics of an ensemble of nonlinear model integrations. Specifically, the case is examined where the true state is balanced, but observation errors project onto all degrees of freedom, including the fast modes. It is shown that the EKF and EnKF will assimilate observations in a balanced way only if certain assumptions hold, and that, outside of ideal cases (i.e., with very frequent observations), dynamical balance can easily be lost in the assimilation. For the EKF, the repeated adjustment of the covariances by the assimilation of observations can easily unbalance the TLM, and destroy the assumptions on which balanced assimilation rests. It is shown that an important factor is the choice of initial forecast error covariance matrix. A balance-constrained EKF is described and compared to the standard EKF, and shown to offer significant improvement for observation frequencies where balance in the standard EKF is lost. The EnKF is advantageous in that balance in the error covariances relies only on a balanced forecast ensemble, and that the analysis step is an ensemble-mean operation. Numerical experiments show that the EnKF may be preferable to the EKF in terms of balance, though its validity is limited by ensemble size. 
It is also found that overobserving can lead to a more unbalanced forecast ensemble and thus to an unbalanced analysis.
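The ensemble-based covariance estimate and the ensemble-mean analysis step described in this abstract can be illustrated with a minimal sketch. It assumes a two-variable state, a single direct observation of the first variable, and made-up numbers; the function names are hypothetical and this is not the paper's model or code.

```python
def sample_covariance(ensemble):
    """Return the mean and the sample covariance of a list of state vectors."""
    n_members = len(ensemble)
    n_state = len(ensemble[0])
    mean = [sum(m[i] for m in ensemble) / n_members for i in range(n_state)]
    cov = [
        [
            sum((m[i] - mean[i]) * (m[j] - mean[j]) for m in ensemble) / (n_members - 1)
            for j in range(n_state)
        ]
        for i in range(n_state)
    ]
    return mean, cov


def enkf_mean_update(ensemble, obs, obs_var, obs_index=0):
    """Kalman update of the ensemble mean for one direct observation of state[obs_index]."""
    mean, cov = sample_covariance(ensemble)
    innovation = obs - mean[obs_index]
    # Gain K = P H^T / (H P H^T + R), where H selects component obs_index.
    denom = cov[obs_index][obs_index] + obs_var
    gain = [cov[i][obs_index] / denom for i in range(len(mean))]
    return [mean[i] + gain[i] * innovation for i in range(len(mean))]


# Hypothetical three-member forecast ensemble with perfectly correlated components.
forecast = [[0.0, 0.0], [2.0, 2.0], [4.0, 4.0]]
analysis_mean = enkf_mean_update(forecast, obs=4.0, obs_var=4.0)
print(analysis_mean)
```

Because the increment to the unobserved variable comes entirely from the ensemble covariance, an ensemble whose members are unbalanced will pass that imbalance on to the analysis.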
Abstract:
Remote sensing observations often have correlated errors, but the correlations are typically ignored in data assimilation for numerical weather prediction. The assumption of zero correlations is often used with data thinning methods, resulting in a loss of information. As operational centres move towards higher-resolution forecasting, there is a requirement to retain data providing detail on appropriate scales. Thus an alternative approach to dealing with observation error correlations is needed. In this article, we consider several approaches to approximating observation error correlation matrices: diagonal approximations, eigendecomposition approximations and Markov matrices. These approximations are applied in incremental variational assimilation experiments with a 1-D shallow water model using synthetic observations. Our experiments quantify analysis accuracy in comparison with a reference or ‘truth’ trajectory, as well as with analyses using the ‘true’ observation error covariance matrix. We show that it is often better to include an approximate correlation structure in the observation error covariance matrix than to incorrectly assume error independence. Furthermore, by choosing a suitable matrix approximation, it is feasible and computationally cheap to include error correlation structure in a variational data assimilation algorithm.
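Two of the approximations named in this abstract can be sketched as follows. The sketch assumes that a Markov matrix has entries exp(-|i - j| * spacing / length_scale), which is one common definition and may differ in detail from the one used in the article; the function names and parameter values are hypothetical.

```python
import math


def markov_correlation(n_obs, spacing, length_scale):
    """Correlation matrix with entries rho**|i - j|, rho = exp(-spacing / length_scale)."""
    rho = math.exp(-spacing / length_scale)
    return [[rho ** abs(i - j) for j in range(n_obs)] for i in range(n_obs)]


def diagonal_approximation(matrix):
    """Keep the diagonal only, i.e. assume independent observation errors."""
    n = len(matrix)
    return [[matrix[i][j] if i == j else 0.0 for j in range(n)] for i in range(n)]


# Hypothetical setup: four equally spaced observations.
C = markov_correlation(4, spacing=1.0, length_scale=2.0)
D = diagonal_approximation(C)
for row in C:
    print([round(value, 3) for value in row])
```

A matrix of this form has a tridiagonal inverse, which is one reason such an approximation can be cheap to use inside a variational cost function.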
Abstract:
Calcium (Ca) and magnesium (Mg) are the most abundant group II elements in both plants and animals. Genetic variation in shoot Ca and shoot Mg concentration (shoot Ca and Mg) in plants can be exploited to biofortify food crops and thereby increase dietary Ca and Mg intake for humans and livestock. We present a comprehensive analysis of within-species genetic variation for shoot Ca and Mg, demonstrating that shoot mineral concentration differs significantly between subtaxa (varietas). We established a structured diversity foundation set of 376 accessions to capture a high proportion of species-wide allelic diversity within domesticated Brassica oleracea, including representation of wild relatives (C genome, 1n = 9) from natural populations. These accessions and 74 modern F1 hybrid cultivars were grown in glasshouse and field environments. Shoot Ca and Mg varied 2- and 2.3-fold, respectively, and were typically not inversely correlated with shoot biomass, within most subtaxa. The closely related capitata (cabbage) and sabauda (Savoy cabbage) subtaxa consistently had the highest mean shoot Ca and Mg. Shoot Ca and Mg in glasshouse-grown plants were highly correlated with data from the field. To understand and dissect the genetic basis of variation in shoot Ca and Mg, we studied homozygous lines from a segregating B. oleracea mapping population. Shoot Ca and Mg were highly heritable (up to 40%). Quantitative trait loci (QTL) for shoot Ca and Mg were detected on chromosomes C2, C6, C7, C8, and, in particular, C9, where QTL accounted for 14% to 55% of the total genetic variance. The presence of QTL on C9 was substantiated by scoring recurrent backcross substitution lines, derived from the same parents. This also greatly increased the map resolution, with strong evidence that a 4-cM region on C9 influences shoot Ca. This region corresponds to a 0.41-Mb region on Arabidopsis (Arabidopsis thaliana) chromosome 5 that includes 106 genes. 
There is also evidence that pleiotropic loci on C8 and C9 affect shoot Ca and Mg. Map-based cloning of these loci will reveal how shoot-level phenotypes relate to Ca²⁺ and Mg²⁺ uptake and homeostasis at the molecular level.
Abstract:
Cereal grains are the dominant source of cadmium in the human diet, with rice being to the fore. Here we explore the effect of geographic, genetic, and processing (milling) factors on rice grain cadmium and rice consumption rates that lead to dietary variance in cadmium intake. From a survey of 12 countries on four continents, cadmium levels in rice grain were the highest in Bangladesh and Sri Lanka, with both these countries also having high per capita rice intakes. For Bangladesh and Sri Lanka, there was high weekly intake of cadmium from rice, leading to intakes deemed unsafe by international and national regulators. While genetic variance, and to a lesser extent milling, provide strategies for reducing cadmium in rice, caution has to be used, as there is environmental regulation as well as genetic regulation of cadmium accumulation within rice grains. For countries that import rice, grain cadmium can be controlled by where that rice is sourced, but for countries with subsistence rice economies that have high levels of cadmium in rice grain, agronomic and breeding strategies are required to lower grain cadmium.
Abstract:
The observation-error covariance matrix used in data assimilation contains contributions from instrument errors, representativity errors and errors introduced by the approximated observation operator. Forward model errors arise when the observation operator does not correctly model the observations or when observations can resolve spatial scales that the model cannot. Previous work to estimate the observation-error covariance matrix for particular observing instruments has shown that it contains significant correlations. In particular, correlations for humidity data are more significant than those for temperature. However, it is not known what proportion of these correlations can be attributed to the representativity errors. In this article we apply an existing method for calculating representativity error, previously applied to an idealised system, to NWP data. We calculate horizontal errors of representativity for temperature and humidity using data from the Met Office high-resolution UK variable resolution model. Our results show that errors of representativity are correlated and more significant for specific humidity than temperature. We also find that representativity error varies with height. This suggests that the assimilation scheme may be improved if these errors are explicitly included in a data assimilation scheme. This article is published with the permission of the Controller of HMSO and the Queen's Printer for Scotland.
Abstract:
For certain observing types, such as those that are remotely sensed, the observation errors are correlated and these correlations are state- and time-dependent. In this work, we develop a method for diagnosing and incorporating spatially correlated and time-dependent observation error in an ensemble data assimilation system. The method combines an ensemble transform Kalman filter with a method that uses statistical averages of background and analysis innovations to provide an estimate of the observation error covariance matrix. To evaluate the performance of the method, we perform identical twin experiments using the Lorenz ’96 and Kuramoto-Sivashinsky models. Using our approach, a good approximation to the true observation error covariance can be recovered in cases where the initial estimate of the error covariance is incorrect. Spatial observation error covariances where the length scale of the true covariance changes slowly in time can also be captured. We find that using the estimated correlated observation error in the assimilation improves the analysis.
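The idea of estimating the observation error covariance from statistical averages of background and analysis innovations can be sketched as below. The sketch assumes the estimate takes the form R ≈ E[d_a d_b^T], where d_b is observation minus background and d_a is observation minus analysis, both in observation space. This is a common form of such diagnostics and is an assumption here, as are the function name and the toy numbers.

```python
def estimate_obs_error_cov(analysis_residuals, background_innovations):
    """Average the outer products d_a d_b^T over all samples."""
    n_samples = len(background_innovations)
    n_obs = len(background_innovations[0])
    pairs = list(zip(analysis_residuals, background_innovations))
    return [
        [sum(d_a[i] * d_b[j] for d_a, d_b in pairs) / n_samples for j in range(n_obs)]
        for i in range(n_obs)
    ]


# Scalar toy case with a directly observed state (H = 1) and an optimal gain.
background_var, obs_var = 5.0, 4.0
gain = background_var / (background_var + obs_var)

# Hypothetical innovations whose mean square equals background_var + obs_var = 9.
d_b = [[3.0], [-3.0], [3.0], [-3.0]]
# With H = 1 the analysis residual is d_a = (1 - K) * d_b.
d_a = [[(1.0 - gain) * d[0]] for d in d_b]

R_est = estimate_obs_error_cov(d_a, d_b)
print(R_est)
```

In this idealised case the estimate recovers the observation error variance that was used to build the gain. When the assumed statistics are wrong, the estimate is only approximate and would typically be cycled or averaged over time.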
Abstract:
Satellite-based (e.g., Synthetic Aperture Radar [SAR]) water level observations (WLOs) of the floodplain can be sequentially assimilated into a hydrodynamic model to decrease forecast uncertainty. This has the potential to keep the forecast on track, thus providing an Earth Observation (EO) based flood forecast system. However, the operational applicability of such a system for floods developed over river networks requires further testing. One of the promising techniques for assimilation in this field is the family of ensemble Kalman filters (EnKF). These filters use a limited-size ensemble representation of the forecast error covariance matrix. This representation tends to develop spurious correlations as the forecast-assimilation cycle proceeds, which is a further complication for dealing with floods in either urban areas or river junctions in rural environments. Here we evaluate the assimilation of WLOs obtained from a sequence of real SAR overpasses (the X-band COSMO-Skymed constellation) in a case study. We show that a direct application of a global Ensemble Transform Kalman Filter (ETKF) suffers from filter divergence caused by spurious correlations. However, a spatially based filter localization provides a substantial moderation in the development of the forecast error covariance matrix, directly improving the forecast and also making it possible to further benefit from a simultaneous online inflow error estimation and correction. Additionally, we propose and evaluate a novel along-network metric for filter localization, which is physically meaningful for the flood-over-a-network problem. Using this metric, we further evaluate the simultaneous estimation of channel friction and spatially variable channel bathymetry, for which the filter seems able to converge simultaneously to sensible values. Results also indicate that friction is a second-order effect in flood inundation models applied to gradually varied flow in large rivers. 
The study is not conclusive regarding whether in an operational situation the simultaneous estimation of friction and bathymetry helps the current forecast. Overall, the results indicate the feasibility of stand-alone EO-based operational flood forecasting.
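Covariance localization of the kind mentioned in this abstract is often implemented as an element-wise (Schur) product between the ensemble covariance and a distance-dependent taper. The sketch below uses the Gaspari-Cohn fifth-order taper, which is an assumption since the abstract does not name a taper function. The distance matrix is left generic, so it could hold straight-line or along-network distances. All names and numbers are hypothetical.

```python
def gaspari_cohn(distance, half_width):
    """Compactly supported taper: 1 at zero distance, 0 beyond 2 * half_width."""
    r = abs(distance) / half_width
    if r <= 1.0:
        return -0.25 * r**5 + 0.5 * r**4 + 0.625 * r**3 - (5.0 / 3.0) * r**2 + 1.0
    if r < 2.0:
        return (
            r**5 / 12.0 - 0.5 * r**4 + 0.625 * r**3 + (5.0 / 3.0) * r**2
            - 5.0 * r + 4.0 - 2.0 / (3.0 * r)
        )
    return 0.0


def localize(cov, distances, half_width):
    """Schur product of a covariance matrix with the taper evaluated on a distance matrix."""
    n = len(cov)
    return [
        [cov[i][j] * gaspari_cohn(distances[i][j], half_width) for j in range(n)]
        for i in range(n)
    ]


# Hypothetical case: three points at 0, 1 and 5 km measured along a channel.
cov = [[1.0, 0.8, 0.4], [0.8, 1.0, 0.5], [0.4, 0.5, 1.0]]
distances = [[0.0, 1.0, 5.0], [1.0, 0.0, 4.0], [5.0, 4.0, 0.0]]
localized = localize(cov, distances, half_width=2.0)
for row in localized:
    print([round(value, 3) for value in row])
```

With a half-width of 2 km the taper vanishes at 4 km and beyond, so the covariances between the distant third point and the other two are removed, while the variances on the diagonal are unchanged.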
Abstract:
We systematically compare the performance of ETKF-4DVAR, 4DVAR-BEN and 4DENVAR with respect to two traditional methods (4DVAR and ETKF) and an ensemble transform Kalman smoother (ETKS) on the Lorenz 1963 model. We specifically investigated this performance with increasing nonlinearity and using a quasi-static variational assimilation algorithm as a comparison. Using the analysis root mean square error (RMSE) as a metric, these methods have been compared considering (1) assimilation window length and observation interval size and (2) ensemble size to investigate the influence of hybrid background error covariance matrices and nonlinearity on the performance of the methods. For short assimilation windows with close to linear dynamics, it has been shown that all hybrid methods show an improvement in RMSE compared to the traditional methods. For long assimilation window lengths in which nonlinear dynamics are substantial, the variational framework can have difficulties finding the global minimum of the cost function, so we explore a quasi-static variational assimilation (QSVA) framework. Of the hybrid methods, it is seen that under certain parameters, hybrid methods which do not use a climatological background error covariance do not need QSVA to perform accurately. Generally, results show that the ETKS and hybrid methods that do not use a climatological background error covariance matrix with QSVA outperform all other methods due to the full flow dependency of the background error covariance matrix which also allows for the most nonlinearity.
Abstract:
The disadvantage of the majority of data assimilation schemes is the assumption that the conditional probability density function of the state of the system given the observations [posterior probability density function (PDF)] is distributed either locally or globally as a Gaussian. The advantage, however, is that through various different mechanisms they ensure initial conditions that are predominantly in linear balance and therefore spurious gravity wave generation is suppressed. The equivalent-weights particle filter is a data assimilation scheme that allows for a representation of a potentially multimodal posterior PDF. It does this via proposal densities that lead to extra terms being added to the model equations and means the advantage of the traditional data assimilation schemes, in generating predominantly balanced initial conditions, is no longer guaranteed. This paper looks in detail at the impact the equivalent-weights particle filter has on dynamical balance and gravity wave generation in a primitive equation model. The primary conclusions are that (i) provided the model error covariance matrix imposes geostrophic balance, then each additional term required by the equivalent-weights particle filter is also geostrophically balanced; (ii) the relaxation term required to ensure the particles are in the locality of the observations has little effect on gravity waves and actually induces a reduction in gravity wave energy if sufficiently large; and (iii) the equivalent-weights term, which leads to the particles having equivalent significance in the posterior PDF, produces a change in gravity wave energy comparable to the stochastic model error. Thus, the scheme does not produce significant spurious gravity wave energy and so has potential for application in real high-dimensional geophysical applications.
Abstract:
With the development of convection-permitting numerical weather prediction the efficient use of high resolution observations in data assimilation is becoming increasingly important. The operational assimilation of these observations, such as Doppler radar radial winds, is now common, though to avoid violating the assumption of uncorrelated observation errors the observation density is severely reduced. Improving the quantity of observations used and the impact that they have on the forecast will require the introduction of the full, potentially correlated, error statistics. In this work, observation error statistics are calculated for the Doppler radar radial winds that are assimilated into the Met Office high resolution UK model using a diagnostic that makes use of statistical averages of observation-minus-background and observation-minus-analysis residuals. This is the first in-depth study using the diagnostic to estimate both horizontal and along-beam correlated observation errors. By considering the new results obtained it is found that the Doppler radar radial wind error standard deviations are similar to those used operationally and increase as the observation height increases. Surprisingly, the estimated observation error correlation length scales are longer than the operational thinning distance. They are dependent on both the height of the observation and on the distance of the observation away from the radar. Further tests show that the long correlations cannot be attributed to the use of superobservations or the background error covariance matrix used in the assimilation. The large horizontal correlation length scales are, however, in part, a result of using a simplified observation operator.
Abstract:
A new sparse kernel density estimator is introduced based on the minimum integrated square error criterion combining local component analysis for the finite mixture model. We start with a Parzen window estimator that uses Gaussian kernels with a common covariance matrix; local component analysis is initially applied to find the covariance matrix using the expectation-maximization algorithm. Since the constraint on the mixing coefficients of a finite mixture model is on the multinomial manifold, we then use the well-known Riemannian trust-region algorithm to find the set of sparse mixing coefficients. The first- and second-order Riemannian geometry of the multinomial manifold is utilized in the Riemannian trust-region algorithm. Numerical examples are employed to demonstrate that the proposed approach is effective in constructing sparse kernel density estimators with competitive accuracy to existing kernel density estimators.
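The starting point named in this abstract, a Parzen window estimator with Gaussian kernels, can be sketched in one dimension as a uniformly weighted mixture. A sparse estimator keeps the same form but sets most mixing coefficients to zero. The sketch does not implement local component analysis or the Riemannian trust-region step; the function names, the shared kernel width, and the sample values are assumptions.

```python
import math


def gaussian_kernel(x, centre, sigma):
    """One-dimensional Gaussian density centred on a data point."""
    z = (x - centre) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def mixture_density(x, centres, sigma, weights=None):
    """Weighted sum of kernels; uniform weights give the Parzen window estimate."""
    if weights is None:
        weights = [1.0 / len(centres)] * len(centres)
    return sum(w * gaussian_kernel(x, c, sigma) for w, c in zip(weights, centres))


# Hypothetical sample and kernel width.
samples = [-1.0, 0.0, 0.2, 1.5]
parzen_at_zero = mixture_density(0.0, samples, sigma=0.5)

# A sparse alternative: non-negative weights that sum to one, most of them zero.
sparse_at_zero = mixture_density(0.0, samples, sigma=0.5, weights=[0.0, 0.7, 0.0, 0.3])
print(parzen_at_zero, sparse_at_zero)
```

The mixing weights are non-negative and sum to one, which is the constraint the abstract describes as lying on the multinomial manifold.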
Abstract:
Genetic parameters and breeding values for dairy cow fertility were estimated from 62 443 lactation records. Two-trait analysis of fertility and milk yield was investigated as a method to estimate fertility breeding values when culling or selection based on milk yield in early lactation determines presence or absence of fertility observations in later lactations. Fertility traits were calving interval, intervals from calving to first service, calving to conception and first to last service, conception success to first service and number of services per conception. Milk production traits were 305-day milk, fat and protein yield. For fertility traits, the range of estimates of heritability (h²) was 0.012 to 0.028 and of permanent environmental variance (c²) was 0.016 to 0.032. Genetic correlations (r_g) among fertility traits were generally high (> 0.70). Genetic correlations of fertility with milk production traits were unfavourable (range -0.11 to 0.46). Single- and two-trait analyses of fertility were compared using the same data set. The estimates of h² and c² were similar for the two types of analyses. However, there were differences between estimated breeding values and rankings for the same trait from single- versus multi-trait analyses. The range for rank correlation was 0.69-0.83 for all animals in the pedigree and 0.89-0.96 for sires with more than 25 daughters. As the single-trait method is biased due to selection on milk yield, a multi-trait evaluation of fertility with milk yield is recommended. © 2002 Elsevier Science B.V. All rights reserved.
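The ratios reported in this abstract follow from variance and covariance components. The sketch below assumes a repeatability-type model in which phenotypic variance is the sum of additive genetic, permanent environmental and residual variance. The component values are invented for illustration and are not taken from the study.

```python
def variance_ratios(var_additive, var_perm_env, var_residual):
    """Return (h2, c2): additive and permanent environmental shares of phenotypic variance."""
    total = var_additive + var_perm_env + var_residual
    return var_additive / total, var_perm_env / total


def genetic_correlation(cov_genetic, var_genetic_1, var_genetic_2):
    """Genetic covariance between two traits scaled by their genetic standard deviations."""
    return cov_genetic / (var_genetic_1 * var_genetic_2) ** 0.5


# Hypothetical variance components for a low-heritability fertility trait.
h2, c2 = variance_ratios(var_additive=2.0, var_perm_env=3.0, var_residual=95.0)

# Hypothetical entries of a 2 x 2 genetic variance-covariance matrix for two traits.
r_g = genetic_correlation(cov_genetic=1.5, var_genetic_1=2.0, var_genetic_2=4.5)
print(h2, c2, r_g)
```

The off-diagonal element of the genetic variance-covariance matrix is what links the two traits in a multi-trait evaluation, so information on one trait can inform breeding values for the other.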
Abstract:
The Sardinian mountain newt Euproctus platycephalus, endemic to the island of Sardinia (Italy), is considered a rare and threatened species and is classed as critically endangered by IUCN. It inhabits streams, small lakes and pools on the main mountain systems of the island. Threats from climatic and anthropogenic factors have raised concerns for the long-term survival of newt populations on the island. MtDNA sequencing was used to investigate the genetic population structure and phylogeography of this endemic species. Patterns of genetic variation were assessed by sequencing the complete D-loop region and part of the 12S rRNA, from 74 individuals representing four different populations. Analyses of molecular variance suggest that populations are significantly differentiated, and the distribution of haplotypes across the island shows strong geographical structuring. However, phylogenetic analyses also suggest that the Sardinian population consists of two distinct mtDNA groups, which may reflect ancient isolation and expansion events. Population structure, evolutionary history of the species and implications for the conservation of newt populations are discussed.
Abstract:
Cedrus atlantica (Pinaceae) is a large and exceptionally long-lived conifer native to the Rif and Atlas Mountains of North Africa. To assess levels and patterns of genetic diversity of this species, samples were obtained throughout the natural range in Morocco and from a forest plantation in Arbucies, Girona (Spain) and analyzed using RAPD markers. Within-population genetic diversity was high and comparable to that revealed by isozymes. Managed populations harbored levels of genetic variation similar to those found in their natural counterparts. Genotypic analyses of molecular variance (AMOVA) found that most variation was within populations, but significant differentiation was also found between populations, particularly in Morocco. Bayesian estimates of F_ST corroborated the AMOVA partitioning and provided evidence for population differentiation in C. atlantica. Both distance- and Bayesian-based clustering methods revealed that Moroccan populations comprise two genetically distinct groups. Within each group, estimates of population differentiation were close to those previously reported in other gymnosperms. These results are interpreted in the context of the postglacial history of the species and human impact. The high degree of among-group differentiation recorded here highlights the need for additional conservation measures for some Moroccan populations of C. atlantica.
Abstract:
The influence matrix is used in ordinary least-squares applications for monitoring statistical multiple-regression analyses. Concepts related to the influence matrix provide diagnostics on the influence of individual data on the analysis - the analysis change that would occur by leaving one observation out, and the effective information content (degrees of freedom for signal) in any sub-set of the analysed data. In this paper, the corresponding concepts have been derived in the context of linear statistical data assimilation in numerical weather prediction. An approximate method to compute the diagonal elements of the influence matrix (the self-sensitivities) has been developed for a large-dimension variational data assimilation system (the four-dimensional variational system of the European Centre for Medium-Range Weather Forecasts). Results show that, in the boreal spring 2003 operational system, 15% of the global influence is due to the assimilated observations in any one analysis, and the complementary 85% is the influence of the prior (background) information, a short-range forecast containing information from earlier assimilated observations. About 25% of the observational information is currently provided by surface-based observing systems, and 75% by satellite systems. Low-influence data points usually occur in data-rich areas, while high-influence data points are in data-sparse areas or in dynamically active regions. Background-error correlations also play an important role: high correlation diminishes the observation influence and amplifies the importance of the surrounding real and pseudo observations (prior information in observation space). Incorrect specifications of background and observation-error covariance matrices can be identified, interpreted and better understood by the use of influence-matrix diagnostics for the variety of observation types and observed variables used in the data assimilation system. 
Copyright © 2004 Royal Meteorological Society
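The influence-matrix concepts in this abstract can be illustrated at toy scale. For simple linear regression the diagonal of the influence (hat) matrix has a closed form, and its trace equals the number of fitted parameters. For a single direct observation of a scalar state, the self-sensitivity reduces to the Kalman gain. Both functions below are a sketch with hypothetical names and data, not the approximate method developed for the operational system.

```python
def leverages(x):
    """Diagonal of the hat matrix for a straight-line fit with intercept."""
    n = len(x)
    mean = sum(x) / n
    sxx = sum((xi - mean) ** 2 for xi in x)
    return [1.0 / n + (xi - mean) ** 2 / sxx for xi in x]


def scalar_self_sensitivity(background_var, obs_var):
    """Sensitivity of the analysis to one direct observation of a scalar state."""
    return background_var / (background_var + obs_var)


# Hypothetical predictor values: four clustered points and one isolated point.
x = [0.0, 1.0, 2.0, 3.0, 10.0]
h = leverages(x)
print([round(value, 3) for value in h], round(sum(h), 3))

# Hypothetical error variances: an accurate background gives a low observation influence.
print(scalar_self_sensitivity(background_var=1.0, obs_var=3.0))
```

The isolated point at x = 10 has by far the largest leverage, which mirrors the abstract's remark that high-influence data points tend to lie in data-sparse areas. The complement of the scalar self-sensitivity is the influence of the background.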