3 resultados para Small Area Estimation
em DigitalCommons@University of Nebraska - Lincoln
Resumo:
Classical sampling methods can be used to estimate the mean of a finite or infinite population. Block kriging also estimates the mean, but of an infinite population in a continuous spatial domain. In this paper, I consider a finite population version of block kriging (FPBK) for plot-based sampling. The data are assumed to come from a spatial stochastic process. Minimizing mean-squared-prediction errors yields best linear unbiased predictions that are a finite population version of block kriging. FPBK has versions comparable to simple random sampling and stratified sampling, and includes the general linear model. This method has been tested for several years for moose surveys in Alaska, and an example is given where results are compared to stratified random sampling. In general, assuming a spatial model gives three main advantages over classical sampling: (1) FPBK is usually more precise than simple or stratified random sampling, (2) FPBK allows small area estimation, and (3) FPBK allows nonrandom sampling designs.
Resumo:
We consider a fully model-based approach for the analysis of distance sampling data. Distance sampling has been widely used to estimate abundance (or density) of animals or plants in a spatially explicit study area. There is, however, no readily available method of making statistical inference on the relationships between abundance and environmental covariates. Spatial Poisson process likelihoods can be used to simultaneously estimate detection and intensity parameters by modeling distance sampling data as a thinned spatial point process. A model-based spatial approach to distance sampling data has three main benefits: it allows complex and opportunistic transect designs to be employed, it allows estimation of abundance in small subregions, and it provides a framework to assess the effects of habitat or experimental manipulation on density. We demonstrate the model-based methodology with a small simulation study and analysis of the Dubbo weed data set. In addition, a simple ad hoc method for handling overdispersion is also proposed. The simulation study showed that the model-based approach compared favorably to conventional distance sampling methods for abundance estimation. In addition, the overdispersion correction performed adequately when the number of transects was high. Analysis of the Dubbo data set indicated a transect effect on abundance via Akaike’s information criterion model selection. Further goodness-of-fit analysis, however, indicated some potential confounding of intensity with the detection function.
Resumo:
In active learning, a machine learning algorithmis given an unlabeled set of examples U, and is allowed to request labels for a relatively small subset of U to use for training. The goal is then to judiciously choose which examples in U to have labeled in order to optimize some performance criterion, e.g. classification accuracy. We study how active learning affects AUC. We examine two existing algorithms from the literature and present our own active learning algorithms designed to maximize the AUC of the hypothesis. One of our algorithms was consistently the top performer, and Closest Sampling from the literature often came in second behind it. When good posterior probability estimates were available, our heuristics were by far the best.