74 results for Stratified sampling
in CentAUR: Central Archive University of Reading - UK
Abstract:
It is common practice to design a survey with a large number of strata. However, in this case the usual techniques for variance estimation can be inaccurate. This paper proposes a variance estimator for estimators of totals. The method proposed can be implemented with standard statistical packages without any specific programming, as it involves simple techniques of estimation, such as regression fitting.
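For context, here is a minimal sketch of the textbook baseline this kind of work builds on: the standard stratified expansion estimator of a total and its usual variance estimator, which becomes noisy when a design has many strata with only a few sampled units each. This is not the paper's proposed estimator, and all data and stratum sizes below are hypothetical.

```python
# A minimal sketch (not the paper's method): stratified expansion estimator of a
# total and the usual variance estimate, which is unstable when each stratum
# contains only a few sampled units.
import numpy as np

def stratified_total(samples, N_h):
    """samples: list of 1-D arrays, one per stratum (hypothetical data);
    N_h: list of stratum population sizes."""
    t_hat, var_hat = 0.0, 0.0
    for y, N in zip(samples, N_h):
        n = len(y)
        t_hat += N * y.mean()
        # usual within-stratum variance term; noisy (or undefined) for n close to 1
        var_hat += N**2 * (1 - n / N) * y.var(ddof=1) / n
    return t_hat, var_hat

rng = np.random.default_rng(0)
samples = [rng.normal(10, 2, size=3) for _ in range(50)]  # many strata, few units each
print(stratified_total(samples, N_h=[100] * 50))
```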
Abstract:
Models developed to identify the rates and origins of nutrient export from land to stream require an accurate assessment of the nutrient load present in the water body in order to calibrate model parameters and structure. These data are rarely available at a representative scale and in an appropriate chemical form except in research catchments. Observational errors associated with nutrient load estimates based on these data lead to a high degree of uncertainty in modelling and nutrient budgeting studies. Here, daily paired instantaneous P and flow data for 17 UK research catchments covering a total of 39 water years (WY) have been used to explore the nature and extent of the observational error associated with nutrient flux estimates based on partial fractions and infrequent sampling. The daily records were artificially decimated to create 7 stratified sampling records, 7 weekly records, and 30 monthly records from each WY and catchment. These were used to evaluate the impact of sampling frequency on load estimate uncertainty. The analysis underlines the high uncertainty of load estimates based on monthly data and individual P fractions rather than total P. Catchments with a high baseflow index and/or low population density were found to return a lower RMSE on load estimates when sampled infrequently than those with a low baseflow index and high population density. Catchment size was not shown to be important, though a limitation of this study is that daily records may fail to capture the full range of P export behaviour in smaller catchments with flashy hydrographs, leading to an underestimate of uncertainty in load estimates for such catchments. Further analysis of sub-daily records is needed to investigate this fully. Here, recommendations are given on load estimation methodologies for different catchment types sampled at different frequencies, and the ways in which this analysis can be used to identify observational error and uncertainty for model calibration and nutrient budgeting studies. (c) 2006 Elsevier B.V. All rights reserved.
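A rough sketch of the decimation idea follows, using synthetic daily concentration and flow data and a simple flow-weighted ratio estimator; the data, estimator choice, and error metric are illustrative only, not those of the study.

```python
# A rough sketch (not the paper's exact procedure): decimate a daily
# concentration/flow record to roughly monthly sampling and compare the
# resulting flow-weighted load estimates against the load from the full record.
import numpy as np

rng = np.random.default_rng(1)
days = 365
flow = rng.lognormal(mean=1.0, sigma=0.8, size=days)        # hypothetical daily flow
conc = 0.05 + 0.02 * flow + rng.normal(0, 0.01, size=days)  # hypothetical daily P concentration

true_load = np.sum(conc * flow)                              # "true" load from the full record

errors = []
for start in range(30):                                      # 30 possible monthly records
    idx = np.arange(start, days, 30)                         # roughly monthly sampling dates
    # flow-weighted ratio estimator scaled by the (assumed known) total flow
    est = np.sum(conc[idx] * flow[idx]) / np.sum(flow[idx]) * np.sum(flow)
    errors.append(est - true_load)

print("RMSE of monthly load estimates:", np.sqrt(np.mean(np.square(errors))))
```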
Abstract:
The sampling of a certain solid angle is a fundamental operation in realistic image synthesis, where the rendering equation describing the light propagation in closed domains is solved. Monte Carlo methods for solving the rendering equation use sampling of the solid angle subtended by the unit hemisphere or unit sphere in order to perform the numerical integration of the rendering equation. In this work we consider the problem of generating uniformly distributed random samples over the hemisphere and sphere. Our aim is to construct and study a parallel sampling scheme for the hemisphere and sphere. First we apply the symmetry property for partitioning of the hemisphere and sphere. The domain of the solid angle subtended by a hemisphere is divided into a number of equal sub-domains. Each sub-domain represents the solid angle subtended by an orthogonal spherical triangle with fixed vertices and computable parameters. Then we introduce two new algorithms for sampling of orthogonal spherical triangles. Both algorithms are based on a transformation of the unit square. Similarly to Arvo's algorithm for sampling an arbitrary spherical triangle, the suggested algorithms accommodate stratified sampling. We derive the necessary transformations for the algorithms. The first sampling algorithm generates a sample by mapping the unit square onto the orthogonal spherical triangle. The second algorithm directly computes the unit radius vector of a sampling point inside the orthogonal spherical triangle. The sampling of the total hemisphere and sphere is performed in parallel for all sub-domains simultaneously by using the symmetry property of the partitioning. The applicability of the corresponding parallel sampling scheme to Monte Carlo and quasi-Monte Carlo solution of the rendering equation is discussed.
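A minimal sketch of the general unit-square-mapping idea, not the paper's orthogonal-spherical-triangle algorithms: uniform hemisphere directions generated from a stratified (jittered) unit square via cos θ = u and φ = 2πv.

```python
# A minimal sketch of stratified uniform hemisphere sampling: each cell of an
# n x n stratification of the unit square is jittered and mapped to a direction
# using z = u (uniform in solid angle on the hemisphere) and phi = 2*pi*v.
import numpy as np

def stratified_hemisphere_samples(n_per_axis, rng):
    """Return unit vectors on the upper hemisphere, one per stratum cell."""
    i, j = np.meshgrid(np.arange(n_per_axis), np.arange(n_per_axis), indexing="ij")
    u = (i + rng.random(i.shape)) / n_per_axis      # jittered stratum coordinates
    v = (j + rng.random(j.shape)) / n_per_axis
    z = u                                           # cos(theta), uniform in [0, 1)
    r = np.sqrt(np.maximum(0.0, 1.0 - z**2))
    phi = 2.0 * np.pi * v
    return np.stack([r * np.cos(phi), r * np.sin(phi), z], axis=-1).reshape(-1, 3)

dirs = stratified_hemisphere_samples(8, np.random.default_rng(2))
print(dirs.shape, np.allclose(np.linalg.norm(dirs, axis=1), 1.0))
```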
Abstract:
Long-term monitoring of forest soils as part of a pan-European network to detect environmental change depends on an accurate determination of the mean of the soil properties at each monitoring event. Forest soil is known to be very variable spatially, however. A study was undertaken to explore and quantify this variability at three forest monitoring plots in Britain. Detailed soil sampling was carried out, and the data from the chemical analyses were analysed by classical statistics and geostatistics. An analysis of variance showed that there were no consistent effects from the sample sites in relation to the position of the trees. The variogram analysis showed that there was spatial dependence at each site for several variables and some varied in an apparently periodic way. An optimal sampling analysis based on the multivariate variogram for each site suggested that a bulked sample from 36 cores would reduce error to an acceptable level. Future sampling should be designed so that it neither targets nor avoids trees and disturbed ground. This can be achieved best by using a stratified random sampling design.
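As a hedged illustration of the stratified random design recommended above (not the study's protocol): one core location drawn at random in each cell of a 6 × 6 grid over a square plot gives 36 cores for bulking; the plot side length below is an assumed value.

```python
# Illustrative only: 36 bulked-core locations chosen by stratified random
# sampling, one core per cell of a 6 x 6 grid over a square monitoring plot.
import numpy as np

def stratified_core_locations(plot_side=50.0, n_per_axis=6, rng=None):
    """plot_side: assumed plot side length in metres (hypothetical value)."""
    if rng is None:
        rng = np.random.default_rng()
    cell = plot_side / n_per_axis
    ix, iy = np.meshgrid(np.arange(n_per_axis), np.arange(n_per_axis), indexing="ij")
    x = (ix + rng.random(ix.shape)) * cell           # one random point per grid cell
    y = (iy + rng.random(iy.shape)) * cell
    return np.column_stack([x.ravel(), y.ravel()])   # 36 (x, y) core positions

print(stratified_core_locations(rng=np.random.default_rng(3))[:5])
```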
Abstract:
The Representative Soil Sampling Scheme of England and Wales has recorded information on the soil of agricultural land in England and Wales since 1969. It is a valuable source of information about the soil in the context of monitoring for sustainable agricultural development. Changes in soil nutrient status and pH were examined over the period 1971-2001. Several methods of statistical analysis were applied to data from the surveys during this period. The main focus here is on the data for 1971, 1981, 1991 and 2001. The results of examining change over time in general show that levels of potassium in the soil have increased, those of magnesium have remained fairly constant, those of phosphorus have declined and pH has changed little. Future sampling needs have been assessed in the context of monitoring, to determine the mean at a given level of confidence and tolerable error and to detect change in the mean over time at these same levels over periods of 5 and 10 years. The results of a non-hierarchical multivariate classification suggest that England and Wales could be stratified to optimize future sampling and analysis. To monitor soil quality and health more generally than for agriculture, more of the country should be sampled and a wider range of properties recorded.
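A minimal sketch of the textbook sample-size calculation that underlies "determining the mean at a given level of confidence and tolerable error"; the standard deviation and tolerable error used below are hypothetical, not values from the survey.

```python
# Textbook calculation: number of samples n = (z * sigma / d)^2 so that the
# sample mean lies within +/- d of the true mean with ~95% confidence (z = 1.96).
import math

def sample_size(sigma, tolerable_error, z=1.96):
    return math.ceil((z * sigma / tolerable_error) ** 2)

# e.g. a hypothetical topsoil property with standard deviation 0.6 and tolerable error 0.1
print(sample_size(sigma=0.6, tolerable_error=0.1))   # -> 139
```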
Abstract:
This paper is devoted to advanced Monte Carlo methods for realistic image creation. It offers a new stratified approach for solving the rendering equation. We consider the numerical solution of the rendering equation by separation of the integration domain. The hemispherical integration domain is symmetrically separated into 16 parts. The first 9 sub-domains are orthogonal spherical triangles of equal size. They are symmetric to one another and grouped, with a common vertex, around the normal vector to the surface. The hemispherical integration domain is completed with 8 more sub-domains, spherical quadrangles of equal size that are also symmetric to one another. All sub-domains have fixed vertices and computable parameters. The bijections of the unit square onto an orthogonal spherical triangle and onto a spherical quadrangle are derived and used to generate the sampling points. Then the symmetric sampling scheme is applied to generate sampling points distributed over the hemispherical integration domain. The necessary transformations are made and the stratified Monte Carlo estimator is presented. The rate of convergence is obtained, and one can see that the algorithm is of super-convergent type.
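A generic illustration of a stratified Monte Carlo estimator over equal sub-domains, shown in one dimension rather than over the paper's hemispherical partition; the integrand and stratum count are arbitrary.

```python
# Generic stratified Monte Carlo: split the domain into equal sub-domains,
# sample each separately, and sum sub-domain size times local sample average.
import numpy as np

def stratified_mc(f, n_strata, n_per_stratum, rng):
    edges = np.linspace(0.0, 1.0, n_strata + 1)
    total = 0.0
    for a, b in zip(edges[:-1], edges[1:]):
        x = rng.uniform(a, b, size=n_per_stratum)
        total += (b - a) * f(x).mean()
    return total

rng = np.random.default_rng(4)
f = lambda x: np.sin(np.pi * x)                 # exact integral over [0, 1] is 2/pi
print(stratified_mc(f, n_strata=16, n_per_stratum=4, rng=rng), 2 / np.pi)
```

For a smooth integrand, sampling each stratum separately removes the between-stratum part of the variance, which is the basic mechanism the stratified rendering estimator exploits.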
Abstract:
The application of forecast ensembles to probabilistic weather prediction has spurred considerable interest in their evaluation. Such ensembles are commonly interpreted as Monte Carlo ensembles meaning that the ensemble members are perceived as random draws from a distribution. Under this interpretation, a reasonable property to ask for is statistical consistency, which demands that the ensemble members and the verification behave like draws from the same distribution. A widely used technique to assess statistical consistency of a historical dataset is the rank histogram, which uses as a criterion the number of times that the verification falls between pairs of members of the ordered ensemble. Ensemble evaluation is rendered more specific by stratification, which means that ensembles that satisfy a certain condition (e.g., a certain meteorological regime) are evaluated separately. Fundamental relationships between Monte Carlo ensembles, their rank histograms, and random sampling from the probability simplex according to the Dirichlet distribution are pointed out. Furthermore, the possible benefits and complications of ensemble stratification are discussed. The main conclusion is that a stratified Monte Carlo ensemble might appear inconsistent with the verification even though the original (unstratified) ensemble is consistent. The apparent inconsistency is merely a result of stratification. Stratified rank histograms are thus not necessarily flat. This result is demonstrated by perfect ensemble simulations and supplemented by mathematical arguments. Possible methods to avoid or remove artifacts that stratification induces in the rank histogram are suggested.
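A minimal sketch of the rank histogram computation described above, using a synthetic "perfect" ensemble; ties are ignored, which is adequate for continuous variables.

```python
# Rank histogram: for each case, count how many ensemble members fall below the
# verification; a consistent ensemble gives a roughly flat histogram.
import numpy as np

def rank_histogram(ensembles, verifications):
    """ensembles: (n_cases, n_members) array; verifications: (n_cases,) array."""
    n_members = ensembles.shape[1]
    ranks = (ensembles < verifications[:, None]).sum(axis=1)   # rank in 0..n_members
    return np.bincount(ranks, minlength=n_members + 1)

rng = np.random.default_rng(5)
ens = rng.normal(size=(10000, 9))   # synthetic ensemble drawn from the verifying distribution
obs = rng.normal(size=10000)
print(rank_histogram(ens, obs))     # counts should be roughly equal across the 10 bins
```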
Abstract:
In the Radiative Atmospheric Divergence Using ARM Mobile Facility GERB and AMMA Stations (RADAGAST) project we calculate the divergence of radiative flux across the atmosphere by comparing fluxes measured at each end of an atmospheric column above Niamey, in the African Sahel region. The combination of broadband flux measurements from geostationary orbit and the deployment for over 12 months of a comprehensive suite of active and passive instrumentation at the surface eliminates a number of sampling issues that could otherwise affect divergence calculations of this sort. However, one sampling issue that challenges the project is the fact that the surface flux data are essentially measurements made at a point, while the top-of-atmosphere values are taken over a solid angle that corresponds to an area at the surface of some 2500 km². Variability of cloud cover and aerosol loading in the atmosphere means that the downwelling fluxes, even when averaged over a day, will not be an exact match to the area-averaged value over that larger area, although we might expect that it is an unbiased estimate thereof. The heterogeneity of the surface, for example, fixed variations in albedo, further means that there is a likely systematic difference in the corresponding upwelling fluxes. In this paper we characterize and quantify this spatial sampling problem. We bound the root-mean-square error in the downwelling fluxes by exploiting a second set of surface flux measurements from a site that was run in parallel with the main deployment. The differences in the two sets of fluxes lead us to an upper bound on the sampling uncertainty, and their correlation leads to another, which is probably optimistic as it requires certain other conditions to be met. For the upwelling fluxes we use data products from a number of satellite instruments to characterize the relevant heterogeneities and so estimate the systematic effects that arise from the flux measurements having to be taken at a single point. The sampling uncertainties vary with the season, being higher during the monsoon period. We find that the sampling errors for the daily average flux are small for the shortwave irradiance, generally less than 5 W m−2, under relatively clear skies, but these increase to about 10 W m−2 during the monsoon. For the upwelling fluxes, again taking daily averages, systematic errors are of order 10 W m−2 as a result of albedo variability. The uncertainty on the longwave component of the surface radiation budget is smaller than that on the shortwave component, in all conditions, but a bias of 4 W m−2 is calculated to exist in the surface-leaving longwave flux.
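A rough, purely schematic illustration of the paired-site idea on synthetic numbers: if two point sites deviate independently from the area-mean flux, the RMS site-to-site difference exceeds the RMS sampling error of either site alone and so gives a simple empirical upper bound. This is not the paper's derivation or data.

```python
# Schematic only: RMS difference between daily-mean fluxes at two nearby sites
# as an upper-bound proxy for the point-to-area sampling error of a single site.
import numpy as np

rng = np.random.default_rng(7)
area_mean = 200 + 30 * rng.standard_normal(365)       # hypothetical daily area-mean flux (W m-2)
site_a = area_mean + 5 * rng.standard_normal(365)     # two point sites, each deviating
site_b = area_mean + 5 * rng.standard_normal(365)     # independently from the area mean

rms_diff = np.sqrt(np.mean((site_a - site_b) ** 2))
print("RMS site-to-site difference (upper-bound proxy):", round(rms_diff, 1), "W m-2")
```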
Abstract:
Inertia-gravity waves exist ubiquitously throughout the stratified parts of the atmosphere and ocean. They are generated by local velocity shears, interactions with topography, and as geostrophic (or spontaneous) adjustment radiation. Relatively little is known about the details of their interaction with the large-scale flow, however. We report on a joint model/laboratory study of a flow in which inertia-gravity waves are generated as spontaneous adjustment radiation by an evolving large-scale mode. We show that their subsequent impact upon the large-scale dynamics is generally small. However, near a potential transition from one large-scale mode to another, in a flow which is simultaneously baroclinically-unstable to more than one mode, the inertia-gravity waves may strongly influence the selection of the mode which actually occurs.
Abstract:
We report on a numerical study of the impact of short, fast inertia-gravity waves on the large-scale, slowly-evolving flow with which they co-exist. A nonlinear quasi-geostrophic numerical model of a stratified shear flow is used to simulate, at reasonably high resolution, the evolution of a large-scale mode which grows due to baroclinic instability and equilibrates at finite amplitude. Ageostrophic inertia-gravity modes are filtered out of the model by construction, but their effects on the balanced flow are incorporated using a simple stochastic parameterization of the potential vorticity anomalies which they induce. The model simulates a rotating, two-layer annulus laboratory experiment, in which we recently observed systematic inertia-gravity wave generation by an evolving, large-scale flow. We find that the impact of the small-amplitude stochastic contribution to the potential vorticity tendency, on the model balanced flow, is generally small, as expected. In certain circumstances, however, the parameterized fast waves can exert a dominant influence. In a flow which is baroclinically-unstable to a range of zonal wavenumbers, and in which there is a close match between the growth rates of the multiple modes, the stochastic waves can strongly affect wavenumber selection. This is illustrated by a flow in which the parameterized fast modes dramatically re-partition the probability-density function for equilibrated large-scale zonal wavenumber. In a second case study, the stochastic perturbations are shown to force spontaneous wavenumber transitions in the large-scale flow, which do not occur in their absence. These phenomena are due to a stochastic resonance effect. They add to the evidence that deterministic parameterizations in general circulation models, of subgrid-scale processes such as gravity wave drag, cannot always adequately capture the full details of the nonlinear interaction.
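A highly simplified sketch of a stochastic potential-vorticity tendency parameterization of the kind described, assuming a generic time-stepping loop; the field size, noise amplitude, and deterministic tendency below are placeholders, not the study's quasi-geostrophic model.

```python
# Schematic stochastic parameterization: a small-amplitude random increment,
# standing in for unresolved inertia-gravity-wave PV anomalies, is added to the
# balanced potential-vorticity tendency at each time step.
import numpy as np

def step_pv(q, balanced_tendency, dt, noise_amplitude, rng):
    """Advance a PV field q by one step; balanced_tendency(q) is the deterministic part."""
    stochastic = noise_amplitude * rng.standard_normal(q.shape)
    return q + dt * (balanced_tendency(q) + stochastic)

rng = np.random.default_rng(6)
q = np.zeros((64, 64))                                   # placeholder PV field
q = step_pv(q, balanced_tendency=lambda q: -0.1 * q,     # placeholder balanced tendency
            dt=0.01, noise_amplitude=1e-3, rng=rng)
print(q.std())
```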
Abstract:
In this paper, the available potential energy (APE) framework of Winters et al. (J. Fluid Mech., vol. 289, 1995, p. 115) is extended to the fully compressible Navier–Stokes equations, with the aims of clarifying (i) the nature of the energy conversions taking place in turbulent thermally stratified fluids; and (ii) the role of surface buoyancy fluxes in the Munk & Wunsch (Deep-Sea Res., vol. 45, 1998, p. 1977) constraint on the mechanical energy sources of stirring required to maintain diapycnal mixing in the oceans. The new framework reveals that the observed turbulent rate of increase in the background gravitational potential energy GPE_r, commonly thought to occur at the expense of the diffusively dissipated APE, actually occurs at the expense of internal energy, as in the laminar case. The APE dissipated by molecular diffusion, on the other hand, is found to be converted into internal energy (IE), similar to the viscously dissipated kinetic energy KE. Turbulent stirring, therefore, does not introduce a new APE/GPE_r mechanical-to-mechanical energy conversion, but simply enhances the existing IE/GPE_r conversion rate, in addition to enhancing the viscous dissipation and the entropy production rates. This, in turn, implies that molecular diffusion contributes to the dissipation of the available mechanical energy ME = APE + KE, along with viscous dissipation. This result has important implications for the interpretation of the concepts of mixing efficiency γ_mixing and flux Richardson number R_f, for which new physically based definitions are proposed and contrasted with previous definitions. The new framework allows for a more rigorous and general re-derivation from first principles of the Munk & Wunsch (1998, hereafter MW98) constraint, also valid for a non-Boussinesq ocean: G(KE) ≈ [(1 − ξR_f)/(ξR_f)] W_{r,forcing} = [(1 + (1 − ξ)γ_mixing)/(ξγ_mixing)] W_{r,forcing}, where G(KE) is the work rate done by the mechanical forcing, W_{r,forcing} is the rate of loss of GPE_r due to high-latitude cooling, and ξ is a nonlinearity parameter such that ξ = 1 for a linear equation of state (as considered by MW98), but ξ < 1 otherwise. The most important result is that G(APE), the work rate done by the surface buoyancy fluxes, must be numerically as large as W_{r,forcing} and, therefore, as important as the mechanical forcing in stirring and driving the oceans. As a consequence, the overall mixing efficiency of the oceans is likely to be larger than the value γ_mixing = 0.2 presently used, thereby possibly eliminating the apparent shortfall in mechanical stirring energy that results from using γ_mixing = 0.2 in the above formula.
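As an illustration only (not a result quoted from the abstract), evaluating the constraint for a linear equation of state (ξ = 1) with the conventional γ_mixing = 0.2, which by the formula itself corresponds to R_f = γ_mixing/(1 + γ_mixing) = 1/6:

```latex
% Worked instance of the MW98 constraint above, assuming xi = 1 and
% gamma_mixing = 0.2 (so R_f = 1/6); both forms give the same factor of 5.
G(KE) \approx \frac{1 - \xi R_f}{\xi R_f}\, W_{r,\mathrm{forcing}}
            = \frac{1 - 1/6}{1/6}\, W_{r,\mathrm{forcing}}
            = 5\, W_{r,\mathrm{forcing}},
\qquad
\frac{1 + (1 - \xi)\,\gamma_{\mathrm{mixing}}}{\xi\,\gamma_{\mathrm{mixing}}}
            = \frac{1}{0.2} = 5 .
```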
Abstract:
The variogram is essential for local estimation and mapping of any variable by kriging. The variogram itself must usually be estimated from sample data. The sampling density is a compromise between precision and cost, but it must be sufficiently dense to encompass the principal spatial sources of variance. A nested, multi-stage sampling scheme with separating distances increasing in geometric progression from stage to stage will do that. The data may then be analyzed by a hierarchical analysis of variance to estimate the components of variance for every stage, and hence lag. By accumulating the components, starting from the shortest lag, one obtains a rough variogram for modest effort. For balanced designs the analysis of variance is optimal; for unbalanced ones, however, these estimators are not necessarily the best, and analysis by residual maximum likelihood (REML) will usually be preferable. The paper summarizes the underlying theory and illustrates its application with data from three surveys: one in which the design had four stages and was balanced, and two implemented with unbalanced designs to economize when there were more stages. A Fortran program is available for the analysis of variance, and code for the REML analysis is listed in the paper. (c) 2005 Elsevier Ltd. All rights reserved.
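A minimal sketch of the accumulation step described above, assuming the per-stage variance components have already been estimated by a balanced nested ANOVA; the components and separating distances below are made-up values, not from the surveys.

```python
# Accumulate nested-ANOVA variance components from the shortest separating
# distance upwards to obtain a rough, stepped variogram.
import numpy as np

lags = np.array([0.1, 1.0, 10.0, 100.0])        # separating distance per stage (m), illustrative
components = np.array([0.8, 1.5, 2.1, 0.6])     # hypothetical variance components per stage

semivariance = np.cumsum(components)            # accumulate starting from the shortest lag
for h, g in zip(lags, semivariance):
    print(f"lag {h:6.1f} m  ->  semivariance {g:.2f}")
```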