954 resultados para Sampling design
Resumo:
The Biogeography Branch’s Sampling Design Tool for ArcGIS provides a means to effectively develop sampling strategies in a geographic information system (GIS) environment. The tool was produced as part of an iterative process of sampling design development, whereby existing data informs new design decisions. The objective of this process, and hence a product of this tool, is an optimal sampling design which can be used to achieve accurate, highprecision estimates of population metrics at a minimum of cost. Although NOAA’s Biogeography Branch focuses on marine habitats and some examples reflects this, the tool can be used to sample any type of population defined in space, be it coral reefs or corn fields.
Resumo:
The Flower Garden Banks National Marine Sanctuary (FGBNMS) is located in the northwestern Gulf of Mexico approximately 180 km south of Galveston, Texas. The sanctuary’s distance from shore combined with its depth (the coral caps reach to within approximately 17 m of the surface) result in limited exposure of this coral reef ecosystem to natural and human-induced impacts compared to other coral reefs of the western Atlantic. In spite of this, the sanctuary still confronts serious impacts including hurricanes events, recent outbreaks of coral disease, an increase in the frequency of coral bleaching and the massive Diadema antillarum die-off during the mid-1980s. Anthropogenic impacts include large vessel anchoring, commercial and recreational fishing, recreational scuba diving, and oil and gas related activities. The FGBNMS was designated in 1992 to help protect against some of these impacts. Basic monitoring and research efforts have been conducted on the banks since the 1970s. Early on, these efforts focused primarily on describing the benthic communities (corals, sponges) and providing qualitative characterizations of the fish community. Subsequently, more quantitative work has been conducted; however, it has been limited in spatial scope. To complement these efforts, the current study addresses the following two goals put forth by sanctuary management: 1) to develop a sampling design for monitoring benthic fish communities across the coral caps; and 2) to obtain a spatial and quantitative characterization of those communities and their associated habitats.
Resumo:
The Biogeography Branch’s Sampling Design Tool for ArcGIS provides a means to effectively develop sampling strategies in a geographic information system (GIS) environment. The tool was produced as part of an iterative process of sampling design development, whereby existing data informs new design decisions. The objective of this process, and hence a product of this tool, is an optimal sampling design which can be used to achieve accurate, high-precision estimates of population metrics at a minimum of cost. Although NOAA’s Biogeography Branch focuses on marine habitats and some examples reflects this, the tool can be used to sample any type of population defined in space, be it coral reefs or corn fields.
Resumo:
There is no abstract for this record.
Resumo:
Mathematical models and statistical analysis are key instruments in soil science scientific research as they can describe and/or predict the current state of a soil system. These tools allow us to explore the behavior of soil related processes and properties as well as to generate new hypotheses for future experimentation. A good model and analysis of soil properties variations, that permit us to extract suitable conclusions and estimating spatially correlated variables at unsampled locations, is clearly dependent on the amount and quality of data and of the robustness techniques and estimators. On the other hand, the quality of data is obviously dependent from a competent data collection procedure and from a capable laboratory analytical work. Following the standard soil sampling protocols available, soil samples should be collected according to key points such as a convenient spatial scale, landscape homogeneity (or non-homogeneity), land color, soil texture, land slope, land solar exposition. Obtaining good quality data from forest soils is predictably expensive as it is labor intensive and demands many manpower and equipment both in field work and in laboratory analysis. Also, the sampling collection scheme that should be used on a data collection procedure in forest field is not simple to design as the sampling strategies chosen are strongly dependent on soil taxonomy. In fact, a sampling grid will not be able to be followed if rocks at the predicted collecting depth are found, or no soil at all is found, or large trees bar the soil collection. Considering this, a proficient design of a soil data sampling campaign in forest field is not always a simple process and sometimes represents a truly huge challenge. In this work, we present some difficulties that have occurred during two experiments on forest soil that were conducted in order to study the spatial variation of some soil physical-chemical properties. Two different sampling protocols were considered for monitoring two types of forest soils located in NW Portugal: umbric regosol and lithosol. Two different equipments for sampling collection were also used: a manual auger and a shovel. Both scenarios were analyzed and the results achieved have allowed us to consider that monitoring forest soil in order to do some mathematical and statistical investigations needs a sampling procedure to data collection compatible to established protocols but a pre-defined grid assumption often fail when the variability of the soil property is not uniform in space. In this case, sampling grid should be conveniently adapted from one part of the landscape to another and this fact should be taken into consideration of a mathematical procedure.
Resumo:
Despite widespread use of species-area relationships (SARs), dispute remains over the most representative SAR model. Using data of small-scale SARs of Estonian dry grassland communities, we address three questions: (1) Which model describes these SARs best when known artifacts are excluded? (2) How do deviating sampling procedures (marginal instead of central position of the smaller plots in relation to the largest plot; single values instead of average values; randomly located subplots instead of nested subplots) influence the properties of the SARs? (3) Are those effects likely to bias the selection of the best model? Our general dataset consisted of 16 series of nested-plots (1 cm(2)-100 m(2), any-part system), each of which comprised five series of subplots located in the four corners and the centre of the 100-m(2) plot. Data for the three pairs of compared sampling designs were generated from this dataset by subsampling. Five function types (power, quadratic power, logarithmic, Michaelis-Menten, Lomolino) were fitted with non-linear regression. In some of the communities, we found extremely high species densities (including bryophytes and lichens), namely up to eight species in 1 cm(2) and up to 140 species in 100 m(2), which appear to be the highest documented values on these scales. For SARs constructed from nested-plot average-value data, the regular power function generally was the best model, closely followed by the quadratic power function, while the logarithmic and Michaelis-Menten functions performed poorly throughout. However, the relative fit of the latter two models increased significantly relative to the respective best model when the single-value or random-sampling method was applied, however, the power function normally remained far superior. These results confirm the hypothesis that both single-value and random-sampling approaches cause artifacts by increasing stochasticity in the data, which can lead to the selection of inappropriate models.
Resumo:
Tree-rings offer one of the few possibilities to empirically quantify and reconstruct forest growth dynamics over years to millennia. Contemporaneously with the growing scientific community employing tree-ring parameters, recent research has suggested that commonly applied sampling designs (i.e. how and which trees are selected for dendrochronological sampling) may introduce considerable biases in quantifications of forest responses to environmental change. To date, a systematic assessment of the consequences of sampling design on dendroecological and-climatological conclusions has not yet been performed. Here, we investigate potential biases by sampling a large population of trees and replicating diverse sampling designs. This is achieved by retroactively subsetting the population and specifically testing for biases emerging for climate reconstruction, growth response to climate variability, long-term growth trends, and quantification of forest productivity. We find that commonly applied sampling designs can impart systematic biases of varying magnitude to any type of tree-ring-based investigations, independent of the total number of samples considered. Quantifications of forest growth and productivity are particularly susceptible to biases, whereas growth responses to short-term climate variability are less affected by the choice of sampling design. The world's most frequently applied sampling design, focusing on dominant trees only, can bias absolute growth rates by up to 459% and trends in excess of 200%. Our findings challenge paradigms, where a subset of samples is typically considered to be representative for the entire population. The only two sampling strategies meeting the requirements for all types of investigations are the (i) sampling of all individuals within a fixed area; and (ii) fully randomized selection of trees. This result advertises the consistent implementation of a widely applicable sampling design to simultaneously reduce uncertainties in tree-ring-based quantifications of forest growth and increase the comparability of datasets beyond individual studies, investigators, laboratories, and geographical boundaries.
Resumo:
Mode of access: Internet.
Resumo:
Activity of 7-ethoxyresorufin-O-deethylase (EROD) in fish is certainly the best-studied biomarker of exposure applied in the field to evaluate biological effects of contamination in the marine environment. Since 1991, a feasibility study for a monitoring network using this biomarker of exposure has been conducted along French coasts. Using data obtained during several cruises, this study aims to determine the number of fish required to detect a given difference between 2 mean EROD activities, i.e. to achieve an a priori fixed statistical power (l-beta) given significance level (alpha), variance estimations and projected ratio of unequal sample sizes (k). Mean EROD activity and standard error were estimated at each of 82 sampling stations. The inter-individual variance component was dominant in estimating the variance of mean EROD activity. Influences of alpha, beta, k and variability on sample sizes are illustrated and discussed in terms of costs. In particular, sample sizes do not have to be equal, especially if such a requirement would lead to a significant cost in sampling extra material. Finally, the feasibility of longterm monitoring is discussed.
Resumo:
Monitoring stream networks through time provides important ecological information. The sampling design problem is to choose locations where measurements are taken so as to maximise information gathered about physicochemical and biological variables on the stream network. This paper uses a pseudo-Bayesian approach, averaging a utility function over a prior distribution, in finding a design which maximizes the average utility. We use models for correlations of observations on the stream network that are based on stream network distances and described by moving average error models. Utility functions used reflect the needs of the experimenter, such as prediction of location values or estimation of parameters. We propose an algorithmic approach to design with the mean utility of a design estimated using Monte Carlo techniques and an exchange algorithm to search for optimal sampling designs. In particular we focus on the problem of finding an optimal design from a set of fixed designs and finding an optimal subset of a given set of sampling locations. As there are many different variables to measure, such as chemical, physical and biological measurements at each location, designs are derived from models based on different types of response variables: continuous, counts and proportions. We apply the methodology to a synthetic example and the Lake Eacham stream network on the Atherton Tablelands in Queensland, Australia. We show that the optimal designs depend very much on the choice of utility function, varying from space filling to clustered designs and mixtures of these, but given the utility function, designs are relatively robust to the type of response variable.
Resumo:
We examine some variations of standard probability designs that preferentially sample sites based on how easy they are to access. Preferential sampling designs deliver unbiased estimates of mean and sampling variance and will ease the burden of data collection but at what cost to our design efficiency? Preferential sampling has the potential to either increase or decrease sampling variance depending on the application. We carry out a simulation study to gauge what effect it will have when sampling Soil Organic Carbon (SOC) values in a large agricultural region in south-eastern Australia. Preferential sampling in this region can reduce the distance to travel by up to 16%. Our study is based on a dataset of predicted SOC values produced from a datamining exercise. We consider three designs and two ways to determine ease of access. The overall conclusion is that sampling performance deteriorates as the strength of preferential sampling increases, due to the fact the regions of high SOC are harder to access. So our designs are inadvertently targeting regions of low SOC value. The good news, however, is that Generalised Random Tessellation Stratification (GRTS) sampling designs are not as badly affected as others and GRTS remains an efficient design compared to competitors.
Resumo:
Sampling strategies are developed based on the idea of ranked set sampling (RSS) to increase efficiency and therefore to reduce the cost of sampling in fishery research. The RSS incorporates information on concomitant variables that are correlated with the variable of interest in the selection of samples. For example, estimating a monitoring survey abundance index would be more efficient if the sampling sites were selected based on the information from previous surveys or catch rates of the fishery. We use two practical fishery examples to demonstrate the approach: site selection for a fishery-independent monitoring survey in the Australian northern prawn fishery (NPF) and fish age prediction by simple linear regression modelling a short-lived tropical clupeoid. The relative efficiencies of the new designs were derived analytically and compared with the traditional simple random sampling (SRS). Optimal sampling schemes were measured by different optimality criteria. For the NPF monitoring survey, the efficiency in terms of variance or mean squared errors of the estimated mean abundance index ranged from 114 to 199% compared with the SRS. In the case of a fish ageing study for Tenualosa ilisha in Bangladesh, the efficiency of age prediction from fish body weight reached 140%.
Resumo:
Whether a statistician wants to complement a probability model for observed data with a prior distribution and carry out fully probabilistic inference, or base the inference only on the likelihood function, may be a fundamental question in theory, but in practice it may well be of less importance if the likelihood contains much more information than the prior. Maximum likelihood inference can be justified as a Gaussian approximation at the posterior mode, using flat priors. However, in situations where parametric assumptions in standard statistical models would be too rigid, more flexible model formulation, combined with fully probabilistic inference, can be achieved using hierarchical Bayesian parametrization. This work includes five articles, all of which apply probability modeling under various problems involving incomplete observation. Three of the papers apply maximum likelihood estimation and two of them hierarchical Bayesian modeling. Because maximum likelihood may be presented as a special case of Bayesian inference, but not the other way round, in the introductory part of this work we present a framework for probability-based inference using only Bayesian concepts. We also re-derive some results presented in the original articles using the toolbox equipped herein, to show that they are also justifiable under this more general framework. Here the assumption of exchangeability and de Finetti's representation theorem are applied repeatedly for justifying the use of standard parametric probability models with conditionally independent likelihood contributions. It is argued that this same reasoning can be applied also under sampling from a finite population. The main emphasis here is in probability-based inference under incomplete observation due to study design. This is illustrated using a generic two-phase cohort sampling design as an example. The alternative approaches presented for analysis of such a design are full likelihood, which utilizes all observed information, and conditional likelihood, which is restricted to a completely observed set, conditioning on the rule that generated that set. Conditional likelihood inference is also applied for a joint analysis of prevalence and incidence data, a situation subject to both left censoring and left truncation. Other topics covered are model uncertainty and causal inference using posterior predictive distributions. We formulate a non-parametric monotonic regression model for one or more covariates and a Bayesian estimation procedure, and apply the model in the context of optimal sequential treatment regimes, demonstrating that inference based on posterior predictive distributions is feasible also in this case.
Resumo:
A spatial sampling design that uses pair-copulas is presented that aims to reduce prediction uncertainty by selecting additional sampling locations based on both the spatial configuration of existing locations and the values of the observations at those locations. The novelty of the approach arises in the use of pair-copulas to estimate uncertainty at unsampled locations. Spatial pair-copulas are able to more accurately capture spatial dependence compared to other types of spatial copula models. Additionally, unlike traditional kriging variance, uncertainty estimates from the pair-copula account for influence from measurement values and not just the configuration of observations. This feature is beneficial, for example, for more accurate identification of soil contamination zones where high contamination measurements are located near measurements of varying contamination. The proposed design methodology is applied to a soil contamination example from the Swiss Jura region. A partial redesign of the original sampling configuration demonstrates the potential of the proposed methodology.
Resumo:
Atlantic menhaden, Brrvoortia tyrannus, the object of a major purse-seine fishery along the U.S. east coast, are landed at plants from northern Florida to central Maine. The National Marine Fisheries Service has sampled these landings since 1955 for length, weight, and age. Together with records of landings at each plant, the samples are used to estimate numbers of fish landed at each age. This report analyzes the sampling design in terms of probablity sampling theory. The design is c1assified as two-stage cluster sampling, the first stage consisting of purse-seine sets randomly selected from the population of all sets landed, and the second stage consisting of fish randomly selected from each sampled set. Implicit assumptions of this design are discussed with special attention to current sampling procedures. Methods are developed for estimating mean fish weight, numbers of fish landed, and age composition of the catch, with approximate 95% confidence intervals. Based on specific results from three ports (port Monmouth, N.J., Reedville, Va., and Beaufort, N.C.) for the 1979 fishing season, recommendations are made for improving sampling procedures to comply more exactly with assumptions of the sampling design. These recommendatlons include adopting more formal methods for randomizing set and fish selection, increasing the number of sets sampled, considering the bias introduced by unequal set sizes, and developing methods to optimize the use of funds and personnel. (PDF file contains 22 pages.)