909 results for stratified sampling
Abstract:
Milkfish and prawn pond operation in the Philippines is often associated with lab-lab culture. Lab-lab is a biological complex of blue-green algae, diatoms, bacteria and various animals which forms a mat at the bottom of nursery ponds or floating patches along the margins of ponds. This complex is considered the most favorable food of milkfish in brackishwater ponds. Variations in the quantity and quality of lab-lab between and within areas of a 1,000 sq. m. pond were determined over 2 culture periods (6-month duration), and the applicability and suitability of stratified random sampling as a method of sampling lab-lab were evaluated.
Abstract:
Long-term monitoring of forest soils as part of a pan-European network to detect environmental change depends on an accurate determination of the mean of the soil properties at each monitoring event. Forest soil is known to be very variable spatially, however. A study was undertaken to explore and quantify this variability at three forest monitoring plots in Britain. Detailed soil sampling was carried out, and the data from the chemical analyses were analysed by classical statistics and geostatistics. An analysis of variance showed that there were no consistent effects from the sample sites in relation to the position of the trees. The variogram analysis showed that there was spatial dependence at each site for several variables and some varied in an apparently periodic way. An optimal sampling analysis based on the multivariate variogram for each site suggested that a bulked sample from 36 cores would reduce error to an acceptable level. Future sampling should be designed so that it neither targets nor avoids trees and disturbed ground. This can be achieved best by using a stratified random sampling design.
Abstract:
The Representative Soil Sampling Scheme of England and Wales has recorded information on the soil of agricultural land in England and Wales since 1969. It is a valuable source of information about the soil in the context of monitoring for sustainable agricultural development. Changes in soil nutrient status and pH were examined over the period 1971-2001. Several methods of statistical analysis were applied to data from the surveys during this period. The main focus here is on the data for 1971, 1981, 1991 and 2001. The results of examining change over time in general show that levels of potassium in the soil have increased, those of magnesium have remained fairly constant, those of phosphorus have declined and pH has changed little. Future sampling needs have been assessed in the context of monitoring, to determine the mean at a given level of confidence and tolerable error and to detect change in the mean over time at these same levels over periods of 5 and 10 years. The results of a non-hierarchical multivariate classification suggest that England and Wales could be stratified to optimize future sampling and analysis. To monitor soil quality and health more generally than for agriculture, more of the country should be sampled and a wider range of properties recorded.
Abstract:
This paper addresses advanced Monte Carlo methods for realistic image creation. It offers a new stratified approach for solving the rendering equation. We consider the numerical solution of the rendering equation by separation of the integration domain. The hemispherical integration domain is symmetrically separated into 16 parts. The first 8 sub-domains are orthogonal spherical triangles of equal size. They are symmetric to each other and grouped with a common vertex around the normal vector to the surface. The hemispherical integration domain is completed with 8 more sub-domains, spherical quadrangles of equal size, also symmetric to each other. All sub-domains have fixed vertices and computable parameters. The bijections of the unit square into an orthogonal spherical triangle and into a spherical quadrangle are derived and used to generate sampling points. Then, the symmetric sampling scheme is applied to generate the sampling points distributed over the hemispherical integration domain. The necessary transformations are made and the stratified Monte Carlo estimator is presented. The rate of convergence is obtained, and the algorithm is shown to be of super-convergent type.
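The general idea of a stratified Monte Carlo estimator can be illustrated with a minimal sketch. It is not the paper's 16-part hemispherical scheme: here the unit square stands in for the integration domain, it is split into a k x k grid of equal-area strata, and one uniform point is drawn in each stratum. The integrand `f`, the grid size and all function names are illustrative assumptions.

```python
import random

def crude_mc(f, n, rng):
    """Plain Monte Carlo estimate of the integral of f over [0, 1]^2."""
    return sum(f(rng.random(), rng.random()) for _ in range(n)) / n

def stratified_mc(f, k, rng):
    """Stratified estimate: one uniform sample in each of k*k equal cells."""
    total = 0.0
    for i in range(k):
        for j in range(k):
            u = (i + rng.random()) / k  # uniform inside column i
            v = (j + rng.random()) / k  # uniform inside row j
            total += f(u, v)
    # Every cell has area 1/k^2, so the estimator is the plain average.
    return total / (k * k)

def f(u, v):
    """Hypothetical test integrand; its exact integral is 1/3 + 1/2 = 5/6."""
    return u * u + v

exact = 5.0 / 6.0
rng = random.Random(0)
crude_error = abs(crude_mc(f, 256, rng) - exact)
stratified_error = abs(stratified_mc(f, 16, rng) - exact)
print(crude_error, stratified_error)
```

Both estimators use 256 evaluations of `f`. For a smooth integrand such as this one, the variance inside each small cell is much lower than over the whole square, which is why the stratified error is typically the smaller of the two.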
Abstract:
The application of forecast ensembles to probabilistic weather prediction has spurred considerable interest in their evaluation. Such ensembles are commonly interpreted as Monte Carlo ensembles meaning that the ensemble members are perceived as random draws from a distribution. Under this interpretation, a reasonable property to ask for is statistical consistency, which demands that the ensemble members and the verification behave like draws from the same distribution. A widely used technique to assess statistical consistency of a historical dataset is the rank histogram, which uses as a criterion the number of times that the verification falls between pairs of members of the ordered ensemble. Ensemble evaluation is rendered more specific by stratification, which means that ensembles that satisfy a certain condition (e.g., a certain meteorological regime) are evaluated separately. Fundamental relationships between Monte Carlo ensembles, their rank histograms, and random sampling from the probability simplex according to the Dirichlet distribution are pointed out. Furthermore, the possible benefits and complications of ensemble stratification are discussed. The main conclusion is that a stratified Monte Carlo ensemble might appear inconsistent with the verification even though the original (unstratified) ensemble is consistent. The apparent inconsistency is merely a result of stratification. Stratified rank histograms are thus not necessarily flat. This result is demonstrated by perfect ensemble simulations and supplemented by mathematical arguments. Possible methods to avoid or remove artifacts that stratification induces in the rank histogram are suggested.
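A rank histogram as described above can be sketched in a few lines. The example below builds a synthetic "perfect" ensemble, in which members and verification are drawn from the same standard normal distribution, and counts how many members fall below each verification. The ensemble size, the number of cases and the function name are assumptions chosen for illustration. It covers only the unstratified case and does not reproduce the paper's simulations.

```python
import random

def rank_histogram(ensembles, verifications):
    """Histogram of the verification's rank within each ensemble.

    Rank r means exactly r members lie below the verification, so an
    ensemble of m members gives m + 1 possible ranks.
    """
    n_members = len(ensembles[0])
    counts = [0] * (n_members + 1)
    for members, obs in zip(ensembles, verifications):
        rank = sum(1 for m in members if m < obs)
        counts[rank] += 1
    return counts

rng = random.Random(42)
n_cases, n_members = 5000, 9
ensembles = [[rng.gauss(0.0, 1.0) for _ in range(n_members)]
             for _ in range(n_cases)]
verifications = [rng.gauss(0.0, 1.0) for _ in range(n_cases)]

hist = rank_histogram(ensembles, verifications)
print(hist)
```

With members and verification exchangeable, each of the 10 ranks is equally likely, so the counts scatter around 500. Evaluating subsets selected by a condition that depends on the ensemble itself is the kind of stratification that, according to the abstract, can break this flatness.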
Abstract:
1. Distance sampling is a widely used technique for estimating the size or density of biological populations. Many distance sampling designs and most analyses use the software Distance. 2. We briefly review distance sampling and its assumptions, outline the history, structure and capabilities of Distance, and provide hints on its use. 3. Good survey design is a crucial prerequisite for obtaining reliable results. Distance has a survey design engine, with a built-in geographic information system, that allows properties of different proposed designs to be examined via simulation, and survey plans to be generated. 4. A first step in analysis of distance sampling data is modeling the probability of detection. Distance contains three increasingly sophisticated analysis engines for this: conventional distance sampling, which models detection probability as a function of distance from the transect and assumes all objects at zero distance are detected; multiple-covariate distance sampling, which allows covariates in addition to distance; and mark–recapture distance sampling, which relaxes the assumption of certain detection at zero distance. 5. All three engines allow estimation of density or abundance, stratified if required, with associated measures of precision calculated either analytically or via the bootstrap. 6. Advanced analysis topics covered include the use of multipliers to allow analysis of indirect surveys (such as dung or nest surveys), the density surface modeling analysis engine for spatial and habitat-modeling, and information about accessing the analysis engines directly from other software. 7. Synthesis and applications. Distance sampling is a key method for producing abundance and density estimates in challenging field conditions. The theory underlying the methods continues to expand to cope with realistic estimation situations. In step with theoretical developments, state-of-the-art software that implements these methods is described that makes the methods accessible to practicing ecologists.
Abstract:
BACKGROUND: In order to optimise the cost-effectiveness of active surveillance to substantiate freedom from disease, a new approach using targeted sampling of farms was developed and applied to infectious bovine rhinotracheitis (IBR) and enzootic bovine leucosis (EBL) in Switzerland. Relevant risk factors (RF) for the introduction of IBR and EBL into Swiss cattle farms were identified and their relative risks defined based on literature review and expert opinion. A quantitative model based on the scenario tree method was subsequently used to calculate the required sample size of a targeted sampling approach (TS) for a given sensitivity. We compared the sample size with that of a stratified random sample (sRS) with regard to efficiency. RESULTS: The required sample sizes to substantiate disease freedom were 1,241 farms for IBR and 1,750 farms for EBL to detect 0.2% herd prevalence with 99% sensitivity. Using conventional sRS, the required sample sizes were 2,259 farms for IBR and 2,243 for EBL. Considering the additional administrative expenses required for the planning of TS, the risk-based approach was still more cost-effective than an sRS (40% reduction in the full survey costs for IBR and 8% for EBL) due to the considerable reduction in sample size. CONCLUSIONS: As the model depends on RF selected through literature review and was parameterised with values estimated by experts, it is subject to some degree of uncertainty. Nevertheless, this approach provides the veterinary authorities with a promising tool for future cost-effective sampling designs.
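To make the link between design prevalence, sensitivity and sample size concrete, the sketch below uses the textbook approximation for simple random sampling: the smallest n for which the chance of picking at least one infected herd reaches the target sensitivity. It is not the scenario-tree model or the stratified design of the study. It assumes an effectively infinite population, independent herds and perfect specificity, and the function and parameter names are hypothetical.

```python
import math

def herds_needed(design_prevalence, survey_sensitivity,
                 herd_test_sensitivity=1.0):
    """Smallest n with 1 - (1 - p * se)^n >= survey_sensitivity.

    Assumptions of this sketch: infinite population, independent herds,
    perfect specificity, simple random sampling.
    """
    # Probability that one randomly chosen herd is infected AND detected.
    p_detect_one = design_prevalence * herd_test_sensitivity
    n = math.log(1.0 - survey_sensitivity) / math.log(1.0 - p_detect_one)
    return math.ceil(n)

n_simple = herds_needed(design_prevalence=0.002, survey_sensitivity=0.99)
print(n_simple)
```

For 0.2% herd prevalence and 99% sensitivity this approximation gives 2,301 herds, the same order of magnitude as the sRS sample sizes quoted above. It is not expected to reproduce them exactly, since the study's model and population sizes are not specified here.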
Abstract:
In this paper, we consider estimation of the causal effect of a treatment on an outcome from observational data collected in two phases. In the first phase, a simple random sample of individuals is drawn from a population. On these individuals, information is obtained on treatment, outcome, and a few low-dimensional confounders. These individuals are then stratified according to these factors. In the second phase, a random sub-sample of individuals is drawn from each stratum, with known, stratum-specific selection probabilities. On these individuals, a rich set of confounding factors is collected. In this setting, we introduce four estimators: (1) simple inverse weighted, (2) locally efficient, (3) doubly robust and (4) enriched inverse weighted. We evaluate the finite-sample performance of these estimators in a simulation study. We also use our methodology to estimate the causal effect of trauma care on in-hospital mortality using data from the National Study of Cost and Outcomes of Trauma.
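The role of the known, stratum-specific selection probabilities can be illustrated with a small simulation. The sketch below is not one of the paper's four causal estimators. It shows only the underlying inverse-weighting idea, applied to a plain mean: each phase-two unit is weighted by the reciprocal of its selection probability, which undoes the over-sampling of some strata. The strata labels, probabilities and data are made up for illustration.

```python
import random

def inverse_weighted_mean(values, strata, selection_prob):
    """Weighted mean with weights 1 / P(selected in phase two | stratum)."""
    weights = [1.0 / selection_prob[s] for s in strata]
    return sum(w * y for w, y in zip(weights, values)) / sum(weights)

rng = random.Random(7)

# Phase one: hypothetical sample with two strata that differ in mean.
phase_one = ([("A", rng.gauss(10.0, 1.0)) for _ in range(8000)]
             + [("B", rng.gauss(20.0, 1.0)) for _ in range(2000)])

# Phase two: sub-sample with known selection probabilities; B is over-sampled.
selection_prob = {"A": 0.1, "B": 0.8}
phase_two = [(s, y) for s, y in phase_one if rng.random() < selection_prob[s]]

strata = [s for s, _ in phase_two]
values = [y for _, y in phase_two]

full_mean = sum(y for _, y in phase_one) / len(phase_one)
naive_mean = sum(values) / len(values)  # ignores the sampling design
weighted_mean = inverse_weighted_mean(values, strata, selection_prob)
print(full_mean, naive_mean, weighted_mean)
```

The unweighted phase-two mean is pulled towards stratum B because B is over-represented, while the inverse-weighted mean stays close to the phase-one mean.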
Abstract:
This data set contains soil carbon measurements (Organic carbon, inorganic carbon, and total carbon; all measured in dried soil samples) from the main experiment plots of a large grassland biodiversity experiment (the Jena Experiment; see further details below). In the main experiment, 82 grassland plots of 20 x 20 m were established from a pool of 60 species belonging to four functional groups (grasses, legumes, tall and small herbs). In May 2002, varying numbers of plant species from this species pool were sown into the plots to create a gradient of plant species richness (1, 2, 4, 8, 16 and 60 species) and functional richness (1, 2, 3, 4 functional groups). Plots were maintained by bi-annual weeding and mowing. Stratified soil sampling to a depth of 1 m was repeated in April 2007 (as had been done before sowing in April 2002). Three independent samples per plot were taken of all plots in block 2 using a motor-driven soil column cylinder (Cobra, Eijkelkamp, 8.3 cm in diameter). Soil samples were dried at 40°C and segmented to a depth resolution of 5 cm giving 20 depth subsamples per core. All samples were analyzed independently. All soil samples were passed through a sieve with a mesh size of 2 mm. Because of much higher proportions of roots in the soil, the samples in 2007 were further sieved to 1 mm according to common root removal methods. No additional mineral particles were removed by this procedure. Total carbon concentration was analyzed on ball-milled subsamples (time 4 min, frequency 30 s**-1) by an elemental analyzer at 1150°C (Elementaranalysator vario Max CN; Elementar Analysensysteme GmbH, Hanau, Germany). We measured inorganic carbon concentration by elemental analysis at 1150°C after removal of organic carbon for 16 h at 450°C in a muffle furnace. Organic carbon concentration was calculated as the difference between both measurements of total and inorganic carbon.
Abstract:
This data set contains soil carbon measurements (Organic carbon, inorganic carbon, and total carbon; all measured in dried soil samples) from the main experiment plots of a large grassland biodiversity experiment (the Jena Experiment; see further details below). In the main experiment, 82 grassland plots of 20 x 20 m were established from a pool of 60 species belonging to four functional groups (grasses, legumes, tall and small herbs). In May 2002, varying numbers of plant species from this species pool were sown into the plots to create a gradient of plant species richness (1, 2, 4, 8, 16 and 60 species) and functional richness (1, 2, 3, 4 functional groups). Plots were maintained by bi-annual weeding and mowing. Stratified soil sampling to a depth of 1 m was performed before sowing in April 2002. Three independent samples per plot were taken of all plots in block 2 using a motor-driven soil column cylinder (Cobra, Eijkelkamp, 8.3 cm in diameter). Soil samples were dried at 40°C and segmented to a depth resolution of 5 cm giving 20 depth subsamples per core. All samples were analyzed independently. All soil samples were passed through a sieve with a mesh size of 2 mm. Rarely present visible plant remains were removed using tweezers. Total carbon concentration was analyzed on ball-milled subsamples (time 4 min, frequency 30 s**-1) by an elemental analyzer at 1150°C (Elementaranalysator vario Max CN; Elementar Analysensysteme GmbH, Hanau, Germany). We measured inorganic carbon concentration by elemental analysis at 1150°C after removal of organic carbon for 16 h at 450°C in a muffle furnace. Organic carbon concentration was calculated as the difference between both measurements of total and inorganic carbon.
Abstract:
Background: There is a paucity of data describing the prevalence of childhood refractive error in the United Kingdom. The Northern Ireland Childhood Errors of Refraction study and its sister study, the Aston Eye Study, are the first population-based surveys of children using both random cluster sampling and cycloplegic autorefraction to quantify levels of refractive error in the United Kingdom. Methods: Children aged 6–7 years and 12–13 years were recruited from a stratified random sample of primary and post-primary schools, representative of the population of Northern Ireland as a whole. Measurements included assessment of visual acuity, oculomotor balance, ocular biometry and cycloplegic binocular open-field autorefraction. Questionnaires were used to identify putative risk factors for refractive error. Results: 399 (57%) of 6–7-year-olds and 669 (60%) of 12–13-year-olds participated. School participation rates did not vary statistically significantly with the size of the school, whether the school is urban or rural, or whether it is in a deprived/non-deprived area. The gender balance, ethnicity and type of schooling of participants are reflective of the Northern Ireland population. Conclusions: The study design, sample size and methodology will ensure accurate measures of the prevalence of refractive errors in the target population and will facilitate comparisons with other population-based refractive data.
Abstract:
Sustainability can be indicated by a number of factors. Populations need to be evenly aged, ensuring a healthy equilibrium. Job opportunities must be numerous and varied to balance incomes from different employment sectors. Regions must also sustain the vital natural resources in the area, which are directly related to a place being self-sustaining. These indicators prove to be true, especially in Newfoundland, where people have struggled to remain in the small traditional communities that they consider to be their 'home.' The population of Corner Brook and the surrounding areas can be stratified according to the values people attach to their special place. Even though people in western Newfoundland hold strong ties to their home, some parts of the region struggle with employment, low incomes, out-migration, and dependency on declining natural resources. The aim of this paper is to present the process of designing a sampling strategy for a human values pilot survey conducted in the city of Corner Brook. It will present a theoretical background over the period 2002-2006 to be used for the sampling strategy.
Abstract:
With Tweet volumes reaching 500 million a day, sampling is inevitable for any application using Twitter data. Realizing this, data providers such as Twitter, Gnip and Boardreader license sampled data streams priced in accordance with the sample size. Big Data applications working with sampled data would be interested in working with a large enough sample that is representative of the universal dataset. Previous work focusing on the representativeness issue has considered ensuring that the global occurrence rates of key terms can be reliably estimated from the sample. Present technology allows sample size estimation in accordance with probabilistic bounds on occurrence rates for the case of uniform random sampling. In this paper, we consider the problem of further improving sample size estimates by leveraging stratification in Twitter data. We analyze our estimates through an extensive study using simulations and real-world data, establishing the superiority of our method over uniform random sampling. Our work provides the technical know-how for data providers to expand their portfolio to include stratified sampled datasets, whereas applications benefit from being able to monitor more topics/events at the same data and computing cost.