99 results for COVARIANCE
Abstract:
A new spectral-based approach is presented to find orthogonal patterns from gridded weather/climate data. The method is based on optimizing the interpolation error variance. The optimally interpolated patterns (OIP) are then given by the eigenvectors of the interpolation error covariance matrix, obtained using the cross-spectral matrix. The formulation of the approach is presented and applied to low-dimensional stochastic toy models and to various reanalysis datasets. In particular, it is found that the lowest-frequency patterns correspond to the largest eigenvalues, that is, variances, of the interpolation error matrix. The approach has been applied to the Northern Hemispheric (NH) and tropical sea level pressure (SLP) and to the Indian Ocean sea surface temperature (SST). Two main OIP patterns are found for the NH SLP, representing respectively the North Atlantic Oscillation and the North Pacific pattern. The leading tropical SLP OIP represents the Southern Oscillation. For the Indian Ocean SST, the leading OIP pattern shows a tripole-like structure having one sign over the eastern and north- and southwestern parts and an opposite sign in the remaining parts of the basin. The pattern is also found to have a high lagged correlation with the Niño-3 index at a 6-month lag.
Abstract:
Accurate estimation of the soil water balance (SWB) is important for a number of applications (e.g. environmental, meteorological, agronomical and hydrological). The objective of this study was to develop and test techniques for the estimation of soil water fluxes and SWB components (particularly infiltration, evaporation and drainage below the root zone) from soil water records. The work presented here is based on profile soil moisture data measured using dielectric methods, at 30-min resolution, at an experimental site with different vegetation covers (barley, sunflower and bare soil). Estimates of infiltration were derived by assuming that observed gains in the soil profile water content during rainfall were due to infiltration. Inaccuracies related to diurnal fluctuations present in the dielectric-based soil water records are resolved by filtering the data with adequate threshold values. Inconsistencies caused by the redistribution of water after rain events were corrected by allowing for a redistribution period before computing water gains. Estimates of evaporation and drainage were derived from water losses above and below the deepest zero flux plane (ZFP), respectively. The evaporation estimates for the sunflower field were compared to evaporation data obtained with an eddy covariance (EC) system located elsewhere in the field. The EC estimate of total evaporation for the growing season was about 25% larger than that derived from the soil water records. This was consistent with differences in crop growth (based on direct measurements of biomass, and field mapping of vegetation using laser altimetry) between the EC footprint and the area of the field used for soil moisture monitoring. Copyright (c) 2007 John Wiley & Sons, Ltd.
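As a rough illustration of the infiltration step described in this abstract, the sketch below sums gains in profile water storage that occur during rainfall and ignores gains below a noise threshold. The function name, the threshold value and the sample numbers are hypothetical, and the redistribution-period correction mentioned in the abstract is not modelled.

```python
import numpy as np

def estimate_infiltration(storage_mm, raining, threshold_mm=0.2):
    """Sum positive changes in profile water storage during rainfall.

    storage_mm   : profile water storage at each time step (mm)
    raining      : boolean flag per time step, True when rain was recorded
    threshold_mm : hypothetical noise threshold; smaller gains are ignored
    """
    storage_mm = np.asarray(storage_mm, dtype=float)
    raining = np.asarray(raining, dtype=bool)
    gains = np.diff(storage_mm)           # change between consecutive time steps
    wet = raining[1:]                     # rain flag at the end of each interval
    keep = wet & (gains > threshold_mm)   # gains during rain above the threshold
    return float(gains[keep].sum())

# Made-up 30-min record: storage rises only while it rains.
storage = [100.0, 100.1, 102.0, 104.5, 104.4, 104.0]
rain = [False, False, True, True, False, False]
total_infiltration = estimate_infiltration(storage, rain)
```

In this toy record only the two rainy intervals contribute, so the small dry-period fluctuation of 0.1 mm is excluded from the total.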
Abstract:
It has been generally accepted that the method of moments (MoM) variogram, which has been widely applied in soil science, requires about 100 sites at an appropriate interval apart to describe the variation adequately. This sample size is often larger than can be afforded for soil surveys of agricultural fields or contaminated sites. Furthermore, it might be a much larger sample size than is needed where the scale of variation is large. A possible alternative in such situations is the residual maximum likelihood (REML) variogram because fewer data appear to be required. The REML method is parametric and is considered reliable where there is trend in the data because it is based on generalized increments that filter trend out and only the covariance parameters are estimated. Previous research has suggested that fewer data are needed to compute a reliable variogram using a maximum likelihood approach such as REML; however, the results can vary according to the nature of the spatial variation. There remain issues to examine: how many fewer data can be used, how should the sampling sites be distributed over the site of interest, and how do different degrees of spatial variation affect the data requirements? The soil of four field sites of different size, physiography, parent material and soil type was sampled intensively, and MoM and REML variograms were calculated for clay content. The data were then sub-sampled to give different sample sizes and distributions of sites and the variograms were computed again. The model parameters for the sets of variograms for each site were used for cross-validation. Predictions based on REML variograms were generally more accurate than those from MoM variograms with fewer than 100 sampling sites. A sample size of around 50 sites at an appropriate distance apart, possibly determined from variograms of ancillary data, appears adequate to compute REML variograms for kriging soil properties for precision agriculture and contaminated sites. (C) 2007 Elsevier B.V. All rights reserved.
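For readers unfamiliar with the MoM variogram referred to above, the following minimal sketch computes the usual method-of-moments (Matheron) estimate: half the mean squared difference between all pairs of observations whose separation falls in a lag bin. The function name, the binning scheme and the four-point transect are assumptions made for illustration; REML estimation is not shown.

```python
import numpy as np

def mom_variogram(coords, values, bin_edges):
    """Method-of-moments semivariance per lag bin.

    coords    : (n, d) array of site coordinates
    values    : (n,) array of the measured property (e.g. clay content)
    bin_edges : increasing lag-distance edges; bin b is [edge[b], edge[b+1])
    """
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    i, j = np.triu_indices(len(values), k=1)            # every pair once
    dist = np.linalg.norm(coords[i] - coords[j], axis=1)
    sqdiff = (values[i] - values[j]) ** 2
    n_bins = len(bin_edges) - 1
    gamma = np.full(n_bins, np.nan)                     # NaN where a bin is empty
    counts = np.zeros(n_bins, dtype=int)
    for b in range(n_bins):
        in_bin = (dist >= bin_edges[b]) & (dist < bin_edges[b + 1])
        counts[b] = in_bin.sum()
        if counts[b] > 0:
            gamma[b] = 0.5 * sqdiff[in_bin].mean()
    return gamma, counts

# Hypothetical transect of four sites at unit spacing.
coords = [[0, 0], [1, 0], [2, 0], [3, 0]]
values = [1.0, 2.0, 4.0, 7.0]
gamma, counts = mom_variogram(coords, values, [0.5, 1.5, 2.5, 3.5])
```

The pair counts per bin shrink quickly at longer lags, which is one reason the MoM estimate becomes unreliable for small samples.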
Abstract:
An efficient method is described for the approximate calculation of the intensity of multiply scattered lidar returns. It divides the outgoing photons into three populations, representing those that have experienced zero, one, and more than one forward-scattering event. Each population is parameterized at each range gate by its total energy, its spatial variance, the variance of photon direction, and the covariance of photon direction and position. The result is that for an N-point profile the calculation is O(N^2) efficient and implicitly includes up to N-order scattering, making it ideal for use in iterative retrieval algorithms for which speed is crucial. In contrast, models that explicitly consider each scattering order separately are at best O(N^m/m!) efficient for m-order scattering and often cannot be performed to more than the third or fourth order in retrieval algorithms. For typical cloud profiles and a wide range of lidar fields of view, the new algorithm is as accurate as an explicit calculation truncated at the fifth or sixth order but faster by several orders of magnitude. (C) 2006 Optical Society of America.
Abstract:
Indirect and direct models of sexual selection make different predictions regarding the quantitative genetic relationships between sexual ornaments and fitness. Indirect models predict that ornaments should have a high heritability and that strong positive genetic covariance should exist between fitness and the ornament. Direct models, on the other hand, make no such assumptions about the level of genetic variance in fitness and the ornament, and are therefore likely to be more important when environmental sources of variation are large. Here we test these predictions in a wild population of the blue tit (Parus caeruleus), a species in which plumage coloration has been shown to be under sexual selection. Using 3 years of cross-fostering data from over 250 breeding attempts, we partition the covariance between parental coloration and aspects of nestling fitness into a genetic and environmental component. Contrary to indirect models of sexual selection, but in agreement with direct models, we show that variation in coloration is only weakly heritable (h² < 0.11), and that two components of offspring fitness (nestling size and fledgling recruitment) are strongly dependent on parental effects, rather than genetic effects. Furthermore, there was no evidence of significant positive genetic covariation between parental colour and offspring traits. Contrary to direct benefit models, however, we find little evidence that variation in colour reliably indicates the level of parental care provided by either males or females. Taken together, these results indicate that the assumptions of indirect models of sexual selection are not supported by the genetic basis of the traits reported on here.
Abstract:
In clinical trials, situations often arise where more than one response from each patient is of interest, and it is required that any decision to stop the study be based upon some or all of these measures simultaneously. Theory for the design of sequential experiments with simultaneous bivariate responses is described by Jennison and Turnbull (Jennison, C., Turnbull, B. W. (1993). Group sequential tests for bivariate response: interim analyses of clinical trials with both efficacy and safety endpoints. Biometrics 49:741-752) and Cook and Farewell (Cook, R. J., Farewell, V. T. (1994). Guidelines for monitoring efficacy and toxicity responses in clinical trials. Biometrics 50:1146-1152) in the context of one efficacy and one safety response. These expositions are in terms of normally distributed data with known covariance. The methods proposed require specification of the correlation, ρ, between test statistics monitored as part of the sequential test. It can be difficult to quantify ρ, and previous authors have suggested simply taking the lowest plausible value, as this will guarantee power. This paper begins with an illustration of the effect that inappropriate specification of ρ can have on the preservation of trial error rates. It is shown that both the type I error and the power can be adversely affected. As a possible solution to this problem, formulas are provided for the calculation of correlation from data collected as part of the trial. An adaptive approach is proposed and evaluated that makes use of these formulas and an example is provided to illustrate the method. Attention is restricted to the bivariate case for ease of computation, although the formulas derived are applicable in the general multivariate case.
Abstract:
The influence matrix is used in ordinary least-squares applications for monitoring statistical multiple-regression analyses. Concepts related to the influence matrix provide diagnostics on the influence of individual data on the analysis - the analysis change that would occur by leaving one observation out, and the effective information content (degrees of freedom for signal) in any sub-set of the analysed data. In this paper, the corresponding concepts have been derived in the context of linear statistical data assimilation in numerical weather prediction. An approximate method to compute the diagonal elements of the influence matrix (the self-sensitivities) has been developed for a large-dimension variational data assimilation system (the four-dimensional variational system of the European Centre for Medium-Range Weather Forecasts). Results show that, in the boreal spring 2003 operational system, 15% of the global influence is due to the assimilated observations in any one analysis, and the complementary 85% is the influence of the prior (background) information, a short-range forecast containing information from earlier assimilated observations. About 25% of the observational information is currently provided by surface-based observing systems, and 75% by satellite systems. Low-influence data points usually occur in data-rich areas, while high-influence data points are in data-sparse areas or in dynamically active regions. Background-error correlations also play an important role: high correlation diminishes the observation influence and amplifies the importance of the surrounding real and pseudo observations (prior information in observation space). Incorrect specifications of background and observation-error covariance matrices can be identified, interpreted and better understood by the use of influence-matrix diagnostics for the variety of observation types and observed variables used in the data assimilation system. 
Copyright © 2004 Royal Meteorological Society
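The ordinary least-squares concepts that this abstract starts from can be sketched as follows: the influence (hat) matrix maps observations to fitted values, its diagonal gives the self-sensitivities, its trace gives the degrees of freedom for signal, and the change in a fitted value from leaving that observation out follows from the diagonal without refitting. The synthetic data and variable names are assumptions for illustration; this is the textbook regression case, not the approximate method developed for the variational assimilation system.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 20, 3

# Synthetic regression problem: intercept plus two random predictors.
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
y = X @ np.array([1.0, 2.0, -0.5]) + 0.1 * rng.normal(size=n)

# Influence (hat) matrix H = X (X^T X)^{-1} X^T, so that y_hat = H y.
H = X @ np.linalg.solve(X.T @ X, X.T)
self_sens = np.diag(H)      # self-sensitivity of each observation
dfs = np.trace(H)           # degrees of freedom for signal (equals p here)

y_hat = H @ y
resid = y - y_hat

# Change in the fitted value at point i if observation i is left out:
# y_hat_i - y_hat_i(without i) = h_ii * e_i / (1 - h_ii)
loo_change = self_sens * resid / (1.0 - self_sens)
```

Observations with a self-sensitivity close to 1 largely determine their own fitted value, which is the regression analogue of the high-influence data points in data-sparse areas described above.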
Abstract:
Fast-growing poplar trees may in future be used as a source of renewable energy for heat, electricity and biofuels such as bioethanol. Water use in Populus x euramericana (clone I214), following long-term exposure to elevated CO2 in the POPFACE (poplar free-air carbon dioxide enrichment) experiment, is quantified here. Stomatal conductance was measured and, during two measurement campaigns made before and after coppicing, whole-tree water use was determined using heat-balance sap-flow gauges, first validated using eddy covariance measurements of latent heat flux. Water use was determined by the balance between leaf-level reductions in stomatal conductance and tree-level stimulations in transpiration. Reductions in stomatal conductance were found that varied between 16 and 39% relative to ambient air. Whole-tree sap flow was increased in plants growing under elevated CO2, on average, by 12 and 23%, respectively, in the first and in the second measurement campaigns. These results suggest that future CO2 concentrations may result in an increase in seasonal water use in fast-growing, short-rotation Populus plantations.
Abstract:
This research is associated with the goal of the horticultural sector of the Colombian southwest, which is to obtain climatic information, specifically, to predict the monthly average temperature in sites where it has not been measured. The data correspond to monthly average temperature, and were recorded in meteorological stations at Valle del Cauca, Colombia, South America. Two components are identified in the data of this research: (1) a component due to the temporal aspects, determined by characteristics of the time series, distribution of the monthly average temperature through the months and the temporal phenomena, which increased (El Niño) and decreased (La Niña) the temperature values, and (2) a component due to the sites, which is determined by the clear differentiation of two populations, the valley and the mountains, which are associated with the pattern of monthly average temperature and with the altitude. Finally, due to the closeness between meteorological stations it is possible to find spatial correlation between data from nearby sites. In the first instance a random coefficient model without spatial covariance structure in the errors is obtained by month and geographical location (mountains and valley, respectively). Models for wet periods in mountains show a normal distribution in the errors; models for the valley and dry periods in mountains do not exhibit a normal pattern in the errors. In models of mountains and wet periods, omni-directional weighted variograms for residuals show spatial continuity. The random coefficient model without spatial covariance structure in the errors and the random coefficient model with spatial covariance structure in the errors both capture the influence of the El Niño and La Niña phenomena, which indicates that the inclusion of the random part in the model is appropriate. The altitude variable contributes significantly in the models for mountains. In general, the cross-validation process indicates that the random coefficient model with spherical spatial covariance structure and the random coefficient model with Gaussian spatial covariance structure are the best models for the wet periods in mountains, and the worst model is the model used by the Colombian Institute for Meteorology, Hydrology and Environmental Studies (IDEAM) to predict temperature.
Abstract:
Event-related functional magnetic resonance imaging (efMRI) has emerged as a powerful technique for detecting the brain's responses to presented stimuli. A primary goal in efMRI data analysis is to estimate the Hemodynamic Response Function (HRF) and to locate activated regions in the human brain when specific tasks are performed. This paper develops new methodologies that are important improvements not only to parametric but also to nonparametric estimation and hypothesis testing of the HRF. First, an effective and computationally fast scheme for estimating the error covariance matrix for efMRI is proposed. Second, methodologies for estimation and hypothesis testing of the HRF are developed. Simulations support the effectiveness of our proposed methods. When applied to an efMRI dataset from an emotional control study, our method reveals more meaningful findings than the popular methods offered by AFNI and FSL. (C) 2008 Elsevier B.V. All rights reserved.
Abstract:
The comparison of cognitive and linguistic skills in individuals with developmental disorders is fraught with methodological and psychometric difficulties. In this paper, we illustrate some of these issues by comparing the receptive vocabulary knowledge and non-verbal reasoning abilities of 41 children with Williams syndrome, a genetic disorder in which language abilities are often claimed to be relatively strong. Data from this group were compared with data from typically developing children, children with Down syndrome, and children with non-specific learning difficulties using a number of approaches including comparison of age-equivalent scores, matching, analysis of covariance, and regression-based standardization. Across these analyses children with Williams syndrome consistently demonstrated relatively good receptive vocabulary knowledge, although this effect appeared strongest in the oldest children.
Abstract:
A tunable radial basis function (RBF) network model is proposed for nonlinear system identification using particle swarm optimisation (PSO). At each stage of orthogonal forward regression (OFR) model construction, PSO optimises one RBF unit's centre vector and diagonal covariance matrix by minimising the leave-one-out (LOO) mean square error (MSE). This PSO aided OFR automatically determines how many tunable RBF nodes are sufficient for modelling. Compared with the state-of-the-art local regularisation assisted orthogonal least squares algorithm based on the LOO MSE criterion for constructing fixed-node RBF network models, the PSO tuned RBF model construction produces more parsimonious RBF models with better generalisation performance and is computationally more efficient.
Nonlinear system identification using particle swarm optimisation tuned radial basis function models
Abstract:
A novel particle swarm optimisation (PSO) tuned radial basis function (RBF) network model is proposed for identification of non-linear systems. At each stage of the orthogonal forward regression (OFR) model construction process, PSO is adopted to tune one RBF unit's centre vector and diagonal covariance matrix by minimising the leave-one-out (LOO) mean square error (MSE). This PSO aided OFR automatically determines how many tunable RBF nodes are sufficient for modelling. Compared with the state-of-the-art local regularisation assisted orthogonal least squares algorithm based on the LOO MSE criterion for constructing fixed-node RBF network models, the PSO tuned RBF model construction produces more parsimonious RBF models with better generalisation performance and is often more efficient in model construction. The effectiveness of the proposed PSO aided OFR algorithm for constructing tunable node RBF models is demonstrated using three real data sets.
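To make the criterion in this abstract concrete, the sketch below evaluates a Gaussian RBF unit with its own centre vector and diagonal covariance matrix, and computes the LOO MSE of a least-squares fit in closed form from the hat matrix. The Gaussian form with a factor of 0.5, the function names and the synthetic data are assumptions; the PSO search and the orthogonal forward regression steps of the paper are not reproduced, only the quantity they would minimise.

```python
import numpy as np

def rbf_unit(X, centre, diag_cov):
    """Gaussian RBF response; diag_cov holds the diagonal of the covariance."""
    d2 = ((X - centre) ** 2 / diag_cov).sum(axis=1)   # scaled squared distance
    return np.exp(-0.5 * d2)

def loo_mse(Phi, y):
    """Leave-one-out MSE of a least-squares fit, without refitting n times."""
    H = Phi @ np.linalg.pinv(Phi)                # hat matrix of the regression
    resid = y - H @ y
    loo_resid = resid / (1.0 - np.diag(H))       # standard LOO residual identity
    return float(np.mean(loo_resid ** 2))

# Synthetic two-input data and two hand-picked RBF units (all hypothetical).
rng = np.random.default_rng(1)
X = rng.uniform(-1, 1, size=(30, 2))
y = np.sin(2 * X[:, 0]) + 0.5 * X[:, 1]
Phi = np.column_stack([
    rbf_unit(X, np.array([-0.5, 0.0]), np.array([0.3, 0.8])),
    rbf_unit(X, np.array([0.5, 0.2]), np.array([0.5, 0.2])),
])
score = loo_mse(Phi, y)
```

An optimiser such as PSO would treat the centre and diagonal covariance entries of a candidate unit as search variables and call a function like `loo_mse` as its cost.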
Abstract:
Nonlinear system identification is considered using a generalized kernel regression model. Unlike the standard kernel model, which employs a fixed common variance for all the kernel regressors, each kernel regressor in the generalized kernel model has an individually tuned diagonal covariance matrix that is determined by maximizing the correlation between the training data and the regressor using a repeated guided random search based on boosting optimization. An efficient construction algorithm based on orthogonal forward regression with leave-one-out (LOO) test statistic and local regularization (LR) is then used to select a parsimonious generalized kernel regression model from the resulting full regression matrix. The proposed modeling algorithm is fully automatic and the user is not required to specify any criterion to terminate the construction procedure. Experimental results involving two real data sets demonstrate the effectiveness of the proposed nonlinear system identification approach.
Abstract:
An orthogonal forward selection (OFS) algorithm based on leave-one-out (LOO) criteria is proposed for the construction of radial basis function (RBF) networks with tunable nodes. Each stage of the construction process determines an RBF node, namely, its center vector and diagonal covariance matrix, by minimizing the LOO statistics. For regression application, the LOO criterion is chosen to be the LOO mean-square error, while the LOO misclassification rate is adopted in two-class classification application. This OFS-LOO algorithm is computationally efficient, and it is capable of constructing parsimonious RBF networks that generalize well. Moreover, the proposed algorithm is fully automatic, and the user does not need to specify a termination criterion for the construction process. The effectiveness of the proposed RBF network construction procedure is demonstrated using examples taken from both regression and classification applications.