158 results for sampling error
in CentAUR: Central Archive University of Reading - UK
Abstract:
Cloud cover is conventionally estimated from satellite images as the observed fraction of cloudy pixels. Active instruments such as radar and Lidar observe in narrow transects that sample only a small percentage of the area over which the cloud fraction is estimated. As a consequence, the fraction estimate has an associated sampling uncertainty, which usually remains unspecified. This paper extends a Bayesian method of cloud fraction estimation, which also provides an analytical estimate of the sampling error. This method is applied to test the sensitivity of this error to sampling characteristics, such as the number of observed transects and the variability of the underlying cloud field. The dependence of the uncertainty on these characteristics is investigated using synthetic data simulated to have properties closely resembling observations of the spaceborne Lidar NASA-LITE mission. Results suggest that the variance of the cloud fraction is greatest for medium cloud cover and least when conditions are mostly cloudy or clear. However, there is a bias in the estimation, which is greatest around 25% and 75% cloud cover. The sampling uncertainty is also affected by the mean lengths of clouds and of clear intervals; shorter lengths decrease uncertainty, primarily because there are more cloud observations in a transect of a given length. Uncertainty also falls with increasing number of transects. Therefore a sampling strategy aimed at minimizing the uncertainty in transect-derived cloud fraction will have to take into account both the cloud and clear sky length distributions as well as the cloud fraction of the observed field. These conclusions have implications for the design of future satellite missions. This paper describes the first integrated methodology for the analytical assessment of sampling uncertainty in cloud fraction observations from forthcoming spaceborne radar and Lidar missions such as NASA's Calipso and CloudSat.
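The dependence of the sampling uncertainty on transect count and on cloud and clear-interval lengths can be illustrated with a small Monte Carlo experiment. The sketch below is not the Bayesian method of the paper. It assumes, purely for illustration, that cloud and clear intervals alternate along a transect with exponentially distributed lengths, and every function and parameter name is hypothetical.

```python
import random
import statistics


def transect_cloud_fraction(length, mean_cloud, mean_clear, rng):
    """Cloud fraction seen along one transect of alternating cloudy/clear intervals.

    Assumption: interval lengths are exponential with the given means.
    """
    # Start in cloud with probability equal to the long-run cloud fraction.
    true_fraction = mean_cloud / (mean_cloud + mean_clear)
    cloudy = rng.random() < true_fraction
    position = 0.0
    cloud_total = 0.0
    while position < length:
        mean = mean_cloud if cloudy else mean_clear
        # Truncate the last interval at the end of the transect.
        segment = min(rng.expovariate(1.0 / mean), length - position)
        if cloudy:
            cloud_total += segment
        position += segment
        cloudy = not cloudy
    return cloud_total / length


def sampling_std(length, mean_cloud, mean_clear, n_transects, trials=2000, seed=0):
    """Standard deviation of the cloud fraction averaged over n_transects transects."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(trials):
        fractions = [
            transect_cloud_fraction(length, mean_cloud, mean_clear, rng)
            for _ in range(n_transects)
        ]
        estimates.append(statistics.fmean(fractions))
    return statistics.pstdev(estimates)
```

Under these assumptions, calling `sampling_std` with more transects, or with shorter mean interval lengths at the same cloud fraction, should return a smaller spread, in line with the qualitative findings reported in the abstract.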
Abstract:
There are now considerable expectations that semi-distributed models are useful tools for supporting catchment water quality management. However, insufficient attention has been given to evaluating the uncertainties inherent to this type of model, especially those associated with the spatial disaggregation of the catchment. The Integrated Nitrogen in Catchments model (INCA) is subjected to an extensive regionalised sensitivity analysis in application to the River Kennet, part of the groundwater-dominated upper Thames catchment, UK. The main results are: (1) model output was generally insensitive to land-phase parameters, very sensitive to groundwater parameters, including initial conditions, and significantly sensitive to in-river parameters; (2) INCA was able to produce good fits simultaneously to the available flow, nitrate and ammonium in-river data sets; (3) representing parameters as heterogeneous over the catchment (206 calibrated parameters) rather than homogeneous (24 calibrated parameters) produced a significant improvement in fit to nitrate but no significant improvement to flow and caused a deterioration in ammonium performance; (4) the analysis indicated that calibrating the flow-related parameters first, then calibrating the remaining parameters (as opposed to calibrating all parameters together) was not a sensible strategy in this case; (5) even the parameters to which the model output was most sensitive suffered from high uncertainty due to spatial inconsistencies in the estimated optimum values, parameter equifinality and the sampling error associated with the calibration method; (6) soil and groundwater nutrient and flow data are needed to reduce uncertainty in initial conditions, residence times and nitrogen transformation parameters, and long-term historic data are needed so that key responses to changes in land-use management can be assimilated.
The results indicate the general difficulty of reconciling the questions which catchment nutrient models are expected to answer with typically limited data sets and limited knowledge about suitable model structures. The results demonstrate the importance of analysing semi-distributed model uncertainties prior to model application, and illustrate the value and limitations of using Monte Carlo-based methods for doing so. (c) 2005 Elsevier B.V. All rights reserved.
Abstract:
This paper introduces a method for simulating multivariate samples that have exact means, covariances, skewness and kurtosis. We introduce a new class of rectangular orthogonal matrix which is fundamental to the methodology and we call these matrices L matrices. They may be deterministic, parametric or data specific in nature. The target moments determine the L matrix; infinitely many random samples with the same exact moments may then be generated by multiplying the L matrix by arbitrary random orthogonal matrices. This methodology is thus termed “ROM simulation”. Considering certain elementary types of random orthogonal matrices we demonstrate that they generate samples with different characteristics. ROM simulation has applications to many problems that are resolved using standard Monte Carlo methods. But no parametric assumptions are required (unless parametric L matrices are used) so there is no sampling error caused by the discrete approximation of a continuous distribution, which is a major source of error in standard Monte Carlo simulations. For illustration, we apply ROM simulation to determine the value-at-risk of a stock portfolio.
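The idea of a sample whose moments match their targets exactly, rather than only in expectation, can be shown in a reduced form. The sketch below is an assumption-laden simplification, not the paper's L matrices. It handles two variables, matches only the mean and the covariance (computed with divisor m), and ignores skewness and kurtosis. It builds two zero-sum orthonormal columns by Gram-Schmidt and scales them with a Cholesky factor. The function name is hypothetical.

```python
import math
import random


def _normalise(vector):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(v * v for v in vector))
    return [v / norm for v in vector]


def exact_moment_sample(m, mean, cov, seed=0):
    """Return m bivariate points whose sample mean and covariance (divisor m)
    equal the targets up to floating-point rounding. Requires m >= 3."""
    rng = random.Random(seed)
    # Two random columns, centred so that each sums to zero.
    columns = []
    for _ in range(2):
        column = [rng.gauss(0.0, 1.0) for _ in range(m)]
        average = sum(column) / m
        columns.append([v - average for v in column])
    # Gram-Schmidt: orthonormal columns that still sum to zero.
    q1 = _normalise(columns[0])
    overlap = sum(a * b for a, b in zip(q1, columns[1]))
    q2 = _normalise([b - overlap * a for a, b in zip(q1, columns[1])])
    # Cholesky factor of the 2x2 target covariance matrix.
    l11 = math.sqrt(cov[0][0])
    l21 = cov[0][1] / l11
    l22 = math.sqrt(cov[1][1] - l21 * l21)
    scale = math.sqrt(m)
    return [
        (mean[0] + scale * l11 * a, mean[1] + scale * (l21 * a + l22 * b))
        for a, b in zip(q1, q2)
    ]
```

Changing `seed` gives a different sample with the same mean and covariance, which loosely mirrors the abstract's point that many samples can share the same exact moments.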
Abstract:
This paper provides evidence regarding the risk-adjusted performance of 19 real estate funds in the UK over the period 1991-2001. Using Jensen’s alpha, the results are generally favourable towards the hypothesis that real estate fund managers showed superior risk-adjusted performance over this period. However, using three widely known parametric statistical procedures to jointly test for timing and selection ability, the results are less conclusive. The paper then utilises the meta-analysis technique to further examine the regression results in an attempt to estimate the proportion of variation in results attributable to sampling error. The meta-analysis results reveal strong evidence, across all models, that the variation in findings is real and may not be attributed to sampling error. Thus, the meta-analysis results provide strong evidence that on average the sample of real estate funds analysed in this study delivered significant risk-adjusted performance over this period. The meta-analysis for the three timing and selection models strongly indicates that this outperformance of the benchmark resulted from superior selection ability, while the evidence for the ability of real estate fund managers to time the market is at best weak. Thus, we can say that although real estate fund managers are unable to outperform a passive buy and hold strategy through timing, they are able to improve their risk-adjusted performance through selection ability.
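The abstract does not spell out how the proportion of variation attributable to sampling error was estimated. One common variance-decomposition approach, shown here only as a hedged sketch, compares the average squared standard error of the individual estimates with the variance observed across them. The function name and its inputs are hypothetical.

```python
from statistics import fmean, variance


def share_due_to_sampling_error(estimates, std_errors):
    """Fraction of the between-study variance that sampling error alone could explain.

    estimates:  per-fund coefficient estimates (for example, alphas).
    std_errors: the standard error of each estimate.
    The observed variance must be non-zero. The result is capped at 1.0,
    meaning sampling error can account for all of the observed spread.
    """
    observed = variance(estimates)  # spread actually seen across funds
    expected = fmean([se * se for se in std_errors])  # spread expected from noise
    return min(1.0, expected / observed)
```

A value near 1 suggests the differences between funds could be noise, while a small value suggests, as the abstract concludes, that the variation is real.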
Abstract:
Following a malicious or accidental atmospheric release in an outdoor environment it is essential for first responders to ensure safety by identifying areas where human life may be in danger. For this to happen quickly, reliable information is needed on the source strength and location, and the type of chemical agent released. We present here an inverse modelling technique that estimates the source strength and location of such a release, together with the uncertainty in those estimates, using a limited number of measurements of concentration from a network of chemical sensors considering a single, steady, ground-level source. The technique is evaluated using data from a set of dispersion experiments conducted in a meteorological wind tunnel, where simultaneous measurements of concentration time series were obtained in the plume from a ground-level point-source emission of a passive tracer. In particular, we analyze the response to the number of sensors deployed and their arrangement, and to sampling and model errors. We find that the inverse algorithm can generate acceptable estimates of the source characteristics with as few as four sensors, provided these are well-placed and that the sampling error is controlled. Configurations with at least three sensors in a profile across the plume were found to be superior to other arrangements examined. Analysis of the influence of sampling error due to the use of short averaging times showed that the uncertainty in the source estimates grew as the sampling time decreased. This demonstrated that averaging times greater than about 5 min (full scale time) lead to acceptable accuracy.
Abstract:
In the Radiative Atmospheric Divergence Using ARM Mobile Facility GERB and AMMA Stations (RADAGAST) project we calculate the divergence of radiative flux across the atmosphere by comparing fluxes measured at each end of an atmospheric column above Niamey, in the African Sahel region. The combination of broadband flux measurements from geostationary orbit and the deployment for over 12 months of a comprehensive suite of active and passive instrumentation at the surface eliminates a number of sampling issues that could otherwise affect divergence calculations of this sort. However, one sampling issue that challenges the project is the fact that the surface flux data are essentially measurements made at a point, while the top-of-atmosphere values are taken over a solid angle that corresponds to an area at the surface of some 2500 km². Variability of cloud cover and aerosol loading in the atmosphere means that the downwelling fluxes, even when averaged over a day, will not be an exact match to the area-averaged value over that larger area, although we might expect that it is an unbiased estimate thereof. The heterogeneity of the surface, for example, fixed variations in albedo, further means that there is a likely systematic difference in the corresponding upwelling fluxes. In this paper we characterize and quantify this spatial sampling problem. We bound the root-mean-square error in the downwelling fluxes by exploiting a second set of surface flux measurements from a site that was run in parallel with the main deployment. The differences in the two sets of fluxes lead us to an upper bound to the sampling uncertainty, and their correlation leads to another which is probably optimistic as it requires certain other conditions to be met.
For the upwelling fluxes we use data products from a number of satellite instruments to characterize the relevant heterogeneities and so estimate the systematic effects that arise from the flux measurements having to be taken at a single point. The sampling uncertainties vary with the season, being higher during the monsoon period. We find that the sampling errors for the daily average flux are small for the shortwave irradiance, generally less than 5 W m−2, under relatively clear skies, but these increase to about 10 W m−2 during the monsoon. For the upwelling fluxes, again taking daily averages, systematic errors are of order 10 W m−2 as a result of albedo variability. The uncertainty on the longwave component of the surface radiation budget is smaller than that on the shortwave component, in all conditions, but a bias of 4 W m−2 is calculated to exist in the surface leaving longwave flux.
Abstract:
Models developed to identify the rates and origins of nutrient export from land to stream require an accurate assessment of the nutrient load present in the water body in order to calibrate model parameters and structure. These data are rarely available at a representative scale and in an appropriate chemical form except in research catchments. Observational errors associated with nutrient load estimates based on these data lead to a high degree of uncertainty in modelling and nutrient budgeting studies. Here, daily paired instantaneous P and flow data for 17 UK research catchments covering a total of 39 water years (WY) have been used to explore the nature and extent of the observational error associated with nutrient flux estimates based on partial fractions and infrequent sampling. The daily records were artificially decimated to create 7 stratified sampling records, 7 weekly records, and 30 monthly records from each WY and catchment. These were used to evaluate the impact of sampling frequency on load estimate uncertainty. The analysis underlines the high uncertainty of load estimates based on monthly data and individual P fractions rather than total P. Catchments with a high baseflow index and/or low population density were found to return a lower RMSE on load estimates when sampled infrequently than those with a low baseflow index and high population density. Catchment size was not shown to be important, though a limitation of this study is that daily records may fail to capture the full range of P export behaviour in smaller catchments with flashy hydrographs, leading to an underestimate of uncertainty in load estimates for such catchments. Further analysis of sub-daily records is needed to investigate this fully.
Here, recommendations are given on load estimation methodologies for different catchment types sampled at different frequencies, and the ways in which this analysis can be used to identify observational error and uncertainty for model calibration and nutrient budgeting studies. (c) 2006 Elsevier B.V. All rights reserved.
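The decimation experiment described above can be sketched in a few lines. The estimator used here, the mean of the sampled concentration times flow products scaled up to the full record length, is an assumption chosen for simplicity. The abstract does not say which load estimators the study compared, and all function names are hypothetical.

```python
import math


def true_load(conc, flow):
    """Reference load from the complete daily record (sum of daily fluxes)."""
    return sum(c * q for c, q in zip(conc, flow))


def decimated_load(conc, flow, step, offset):
    """Load estimated from every step-th day, starting at day index offset."""
    sampled = [conc[i] * flow[i] for i in range(offset, len(conc), step)]
    # Scale the mean sampled flux up to the number of days in the record.
    return len(conc) * sum(sampled) / len(sampled)


def load_rmse(conc, flow, step):
    """RMSE of the decimated estimate over all possible start days.

    The record must contain at least step days.
    """
    reference = true_load(conc, flow)
    errors = [
        decimated_load(conc, flow, step, offset) - reference
        for offset in range(step)
    ]
    return math.sqrt(sum(e * e for e in errors) / len(errors))
```

With `step=7` this mimics the seven possible weekly records per water year. A record containing a single high-flow day gives a large RMSE because most weekly records miss the event, which illustrates why flashy catchments are harder to sample.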
Abstract:
Long-term monitoring of forest soils as part of a pan-European network to detect environmental change depends on an accurate determination of the mean of the soil properties at each monitoring event. Forest soil is known to be very variable spatially, however. A study was undertaken to explore and quantify this variability at three forest monitoring plots in Britain. Detailed soil sampling was carried out, and the data from the chemical analyses were analysed by classical statistics and geostatistics. An analysis of variance showed that there were no consistent effects from the sample sites in relation to the position of the trees. The variogram analysis showed that there was spatial dependence at each site for several variables and some varied in an apparently periodic way. An optimal sampling analysis based on the multivariate variogram for each site suggested that a bulked sample from 36 cores would reduce error to an acceptable level. Future sampling should be designed so that it neither targets nor avoids trees and disturbed ground. This can be achieved best by using a stratified random sampling design.
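As a pointer to what the variogram analysis involves, the sketch below computes the classical (Matheron) experimental semivariogram for evenly spaced measurements along a single line. The study itself used a multivariate variogram on plot data, so this one-dimensional, single-variable version is a deliberate simplification, and the function name is hypothetical.

```python
def empirical_variogram(values, max_lag):
    """Experimental semivariogram for evenly spaced 1-D measurements.

    Returns a dict mapping each lag (in sample spacings) to half the mean
    squared difference between all pairs of values separated by that lag.
    max_lag must be smaller than len(values).
    """
    gamma = {}
    for lag in range(1, max_lag + 1):
        squared_diffs = [
            (values[i + lag] - values[i]) ** 2
            for i in range(len(values) - lag)
        ]
        gamma[lag] = sum(squared_diffs) / (2 * len(squared_diffs))
    return gamma
```

A semivariance that rises with lag indicates spatial dependence, while one that rises and falls repeatedly points to the kind of apparently periodic variation mentioned in the abstract.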
Abstract:
The Representative Soil Sampling Scheme of England and Wales has recorded information on the soil of agricultural land in England and Wales since 1969. It is a valuable source of information about the soil in the context of monitoring for sustainable agricultural development. Changes in soil nutrient status and pH were examined over the period 1971-2001. Several methods of statistical analysis were applied to data from the surveys during this period. The main focus here is on the data for 1971, 1981, 1991 and 2001. The results of examining change over time in general show that levels of potassium in the soil have increased, those of magnesium have remained fairly constant, those of phosphorus have declined and pH has changed little. Future sampling needs have been assessed in the context of monitoring, to determine the mean at a given level of confidence and tolerable error and to detect change in the mean over time at these same levels over periods of 5 and 10 years. The results of a non-hierarchical multivariate classification suggest that England and Wales could be stratified to optimize future sampling and analysis. To monitor soil quality and health more generally than for agriculture, more of the country should be sampled and a wider range of properties recorded.
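Determining the mean "at a given level of confidence and tolerable error" is usually a sample-size calculation. The sketch below uses the textbook normal-approximation formula n = (z·s/d)², which assumes independent samples and a known standard deviation. It may differ from the calculation used in the study, and the function name is hypothetical.

```python
import math
from statistics import NormalDist


def samples_needed(std_dev, tolerable_error, confidence=0.95):
    """Smallest n for which the confidence interval half-width of the mean
    is at most tolerable_error, under a normal approximation."""
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    return math.ceil((z * std_dev / tolerable_error) ** 2)
```

Because n grows with the square of the ratio of standard deviation to tolerable error, halving the tolerable error roughly quadruples the required number of samples.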
Abstract:
In this paper, we generalise a previously-described model of the error-prone polymerase chain reaction (PCR) to conditions of arbitrarily variable amplification efficiency and initial population size. Generalisation of the model to these conditions improves the correspondence to observed and expected behaviours of PCR, and restricts the extent to which the model may explore sequence space for a prescribed set of parameters. Error-prone PCR in realistic reaction conditions is predicted to be less effective at generating grossly divergent sequences than the original model. The estimate of mutation rate per cycle by sampling sequences from an in vitro PCR experiment is correspondingly affected by the choice of model and parameters. (c) 2005 Elsevier Ltd. All rights reserved.
Abstract:
Numerical climate models constitute the best available tools to tackle the problem of climate prediction. Two assumptions lie at the heart of their suitability: (1) a climate attractor exists, and (2) the numerical climate model's attractor lies on the actual climate attractor, or at least on the projection of the climate attractor on the model's phase space. In this contribution, the Lorenz '63 system is used both as a prototype system and as an imperfect model to investigate the implications of the second assumption. By comparing results drawn from the Lorenz '63 system and from numerical weather and climate models, the implications of using imperfect models for the prediction of weather and climate are discussed. It is shown that the imperfect model's orbit and the system's orbit are essentially different, purely due to model error and not to sensitivity to initial conditions. Furthermore, if a model is a perfect model, then the attractor, reconstructed by sampling a collection of initialised model orbits (forecast orbits), will be invariant to forecast lead time. This conclusion provides an alternative method for the assessment of climate models.
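The claim that an imperfect model's orbit departs from the system's orbit through model error alone can be illustrated with the Lorenz '63 equations. In the sketch below the "system" uses the standard parameters and the "imperfect model" uses a slightly different ρ. That choice of perturbation, the fourth-order Runge-Kutta integrator, the step size and all function names are assumptions made for illustration and are not taken from the paper.

```python
import math


def lorenz63(state, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Time derivative of the Lorenz '63 system at the given (x, y, z) state."""
    x, y, z = state
    return (sigma * (y - x), x * (rho - z) - y, x * y - beta * z)


def rk4_step(state, dt, **params):
    """Advance the state by one classical fourth-order Runge-Kutta step."""
    def shifted(base, slope, h):
        return tuple(b + h * s for b, s in zip(base, slope))

    k1 = lorenz63(state, **params)
    k2 = lorenz63(shifted(state, k1, dt / 2.0), **params)
    k3 = lorenz63(shifted(state, k2, dt / 2.0), **params)
    k4 = lorenz63(shifted(state, k3, dt), **params)
    return tuple(
        s + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def orbit(state, steps, dt=0.01, **params):
    """List of states visited from the initial state, including the start."""
    points = [state]
    for _ in range(steps):
        state = rk4_step(state, dt, **params)
        points.append(state)
    return points


def max_separation(orbit_a, orbit_b):
    """Largest Euclidean distance between two orbits at matching times."""
    return max(math.dist(a, b) for a, b in zip(orbit_a, orbit_b))
```

Starting both runs from the same initial state, for example `orbit((1.0, 1.0, 1.0), 2000)` and `orbit((1.0, 1.0, 1.0), 2000, rho=28.5)`, removes initial-condition error, so any separation between the two orbits comes from the parameter difference alone.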
Abstract:
Two wavelet-based control variable transform schemes are described and are used to model some important features of forecast error statistics for use in variational data assimilation. The first is a conventional wavelet scheme and the other is an approximation of it. Their ability to capture the position and scale-dependent aspects of covariance structures is tested in a two-dimensional latitude-height context. This is done by comparing the covariance structures implied by the wavelet schemes with those found from the explicit forecast error covariance matrix, and with a non-wavelet-based covariance scheme used currently in an operational assimilation scheme. Qualitatively, the wavelet-based schemes show potential at modeling forecast error statistics well without giving preference to either position or scale-dependent aspects. The degree of spectral representation can be controlled by changing the number of spectral bands in the schemes, and the smallest number of bands that achieves adequate results is found for the model domain used. Evidence is found of a trade-off between the localization of features in positional and spectral spaces when the number of bands is changed. By examining implied covariance diagnostics, the wavelet-based schemes are found, on the whole, to give results that are closer to diagnostics found from the explicit matrix than from the non-wavelet scheme. Even though the nature of the covariances has the right qualities in spectral space, variances are found to be too low at some wavenumbers and vertical correlation length scales are found to be too long at most scales. The wavelet schemes are found to be good at resolving variations in position and scale-dependent horizontal length scales, although the length scales reproduced are usually too short. The second of the wavelet-based schemes is often found to be better than the first in some important respects, but, unlike the first, it has no exact inverse transform.
Abstract:
Turbulence statistics obtained by direct numerical simulations are analysed to investigate spatial heterogeneity within regular arrays of building-like cubical obstacles. Two different array layouts are studied, staggered and square, both at a packing density of $\lambda_p = 0.25$. The flow statistics analysed are mean streamwise velocity ($\overline{u}$), shear stress ($\overline{u'w'}$), turbulent kinetic energy ($k$) and dispersive stress fraction ($\tilde{u}\tilde{w}$). The spatial flow patterns and spatial distribution of these statistics in the two arrays are found to be very different. Local regions of high spatial variability are identified. The overall spatial variances of the statistics are shown to be generally very significant in comparison with their spatial averages within the arrays. Above the arrays the spatial variances as well as dispersive stresses decay rapidly to zero. The heterogeneity is explored further by separately considering six different flow regimes identified within the arrays, described here as: channelling region, constricted region, intersection region, building wake region, canyon region and front-recirculation region. It is found that the flow in the first three regions is relatively homogeneous, but that spatial variances in the latter three regions are large, especially in the building wake and canyon regions. The implication is that, in general, the flow immediately behind (and, to a lesser extent, in front of) a building is much more heterogeneous than elsewhere, even in the relatively dense arrays considered here. Most of the dispersive stress is concentrated in these regions. Considering the experimental difficulties of obtaining enough point measurements to form a representative spatial average, the error incurred by degrading the sampling resolution is investigated. It is found that a good estimate for both area and line averages can be obtained using a relatively small number of strategically located sampling points.