973 results for MISSING VALUE ESTIMATION
Abstract:
1. The establishment of grassy strips at the margins of arable fields is an agri-environment scheme that aims to provide resources for native flora and fauna and thus increase farmland biodiversity. These margins can be managed to target certain groups, such as farmland birds and pollinators, but the impact of such management on the soil fauna has been poorly studied. This study assessed the effect of seed mix and management on the biodiversity, conservation and functional value of field margins for soil macrofauna.
2. Experimental margin plots were established in 2001 in a winter wheat field in Cambridgeshire, UK, using a factorial design of three seed mixes and three management practices [spring cut, herbicide application and soil disturbance (scarification)]. In spring and autumn 2005, soil cores taken from the margin plots and the crop were hand-sorted for soil macrofauna. The Lumbricidae, Isopoda, Chilopoda, Diplopoda, Carabidae and Staphylinidae were identified to species and classified according to feeding type.
3. Diversity in the field margins was generally higher than in the crop, with the Lumbricidae, Isopoda and Coleoptera having significantly more species and/or higher abundances in the margins. Within the margins, management had a significant effect on the soil macrofauna, with scarified plots containing lower abundances and fewer species of Isopods. The species composition of the scarified plots was similar to that of the crop.
4. Scarification also reduced soil- and litter-feeder abundances and predator species densities, although populations appeared to recover by the autumn, probably as a result of dispersal from neighbouring plots and boundary features. The implications of the responses of these feeding groups for ecosystem services are discussed.
5. Synthesis and applications. This study shows that the management of agri-environment schemes can significantly influence their value for soil macrofauna. In order to encourage the litter-dwelling invertebrates that tend to be missing from arable systems, agri-environment schemes should aim to minimize soil cultivation and develop a substantial surface litter layer. However, this may conflict with other aims of these schemes, such as enhancing floristic and pollinator diversity.
Abstract:
Estimation of population size with missing zero-class is an important problem that is encountered in epidemiological assessment studies. Fitting a Poisson model to the observed data by the method of maximum likelihood and estimation of the population size based on this fit is an approach that has been widely used for this purpose. In practice, however, the Poisson assumption is seldom satisfied. Zelterman (1988) has proposed a robust estimator for unclustered data that works well in a wide class of distributions applicable for count data. In the work presented here, we extend this estimator to clustered data. The estimator requires fitting a zero-truncated homogeneous Poisson model by maximum likelihood and thereby using a Horvitz-Thompson estimator of population size. This was found to work well when the data follow the hypothesized homogeneous Poisson model. However, when the true distribution deviates from the hypothesized model, the population size was found to be underestimated. In search of a more robust estimator, we focused on three models that use all clusters with exactly one case, those clusters with exactly two cases and those with exactly three cases to estimate the probability of the zero-class and thereby use data collected on all the clusters in the Horvitz-Thompson estimator of population size. Loss in efficiency associated with gain in robustness was examined based on a simulation study. As a trade-off between gain in robustness and loss in efficiency, the model that uses data collected on clusters with at most three cases to estimate the probability of the zero-class was found to be preferred in general. In applications, we recommend obtaining estimates from all three models and making a choice considering the estimates from the three models, robustness and the loss in efficiency. (© 2008 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim)
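For orientation, the unclustered estimator mentioned in this abstract can be sketched as follows. This is the commonly cited form of Zelterman's estimator (the Poisson rate is estimated from the numbers of units observed exactly once and exactly twice, then plugged into a Horvitz-Thompson estimator); it is not the clustered extension developed in the paper, and the function and variable names are illustrative assumptions.

```python
import math


def zelterman_estimate(counts):
    """Estimate the population size from zero-truncated counts.

    `counts` holds one positive integer per observed unit (how often it
    was seen); units seen zero times are, by definition, missing.
    """
    n = len(counts)
    f1 = sum(1 for c in counts if c == 1)  # units seen exactly once
    f2 = sum(1 for c in counts if c == 2)  # units seen exactly twice
    if f1 == 0 or f2 == 0:
        raise ValueError("need at least one unit seen once and one seen twice")
    lam = 2.0 * f2 / f1          # rate estimate based on f1 and f2 only
    p_zero = math.exp(-lam)      # estimated probability of the zero class
    return n / (1.0 - p_zero)    # Horvitz-Thompson estimate of N


# Example: seven observed units, four seen once, two twice, one three times.
print(zelterman_estimate([1, 1, 1, 1, 2, 2, 3]))
```

Because only the two lowest frequencies enter the rate estimate, heterogeneity among frequently observed units has little influence on it, which is the usual informal explanation for its robustness.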
Abstract:
This paper considers the problem of estimation when one of a number of populations, assumed normal with known common variance, is selected on the basis of it having the largest observed mean. Conditional on selection of the population, the observed mean is a biased estimate of the true mean. This problem arises in the analysis of clinical trials in which selection is made between a number of experimental treatments that are compared with each other either with or without an additional control treatment. Attempts to obtain approximately unbiased estimates in this setting have been proposed by Shen [2001. An improved method of evaluating drug effect in a multiple dose clinical trial. Statist. Medicine 20, 1913–1929] and Stallard and Todd [2005. Point estimates and confidence regions for sequential trials involving selection. J. Statist. Plann. Inference 135, 402–419]. This paper explores the problem in the simple setting in which two experimental treatments are compared in a single analysis. It is shown that in this case the estimate of Stallard and Todd is the maximum-likelihood estimate (m.l.e.), and this is compared with the estimate proposed by Shen. In particular, it is shown that the m.l.e. has infinite expectation whatever the true value of the mean being estimated. We show that there is no conditionally unbiased estimator, and propose a new family of approximately conditionally unbiased estimators, comparing these with the estimators suggested by Shen.
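The selection bias described in this abstract can be illustrated numerically. The sketch below is not one of the estimators discussed in the paper; it only uses the standard result for the maximum of two independent normal variables with common standard error `se` to give the bias of the selected observed mean, averaged over which arm is selected, and checks it by simulation. All names are illustrative assumptions.

```python
import math
import random


def mean_selection_bias(mu1, mu2, se):
    """Average bias E[X_selected - mu_selected] for two normal arms."""
    theta = se * math.sqrt(2.0)          # std. dev. of X1 - X2
    delta = (mu1 - mu2) / theta
    phi = math.exp(-0.5 * delta * delta) / math.sqrt(2.0 * math.pi)
    return theta * phi


def simulated_selection_bias(mu1, mu2, se, n_rep=200_000, seed=0):
    """Monte Carlo estimate of the same quantity."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_rep):
        x1 = rng.gauss(mu1, se)
        x2 = rng.gauss(mu2, se)
        # The arm with the larger observed mean is selected.
        if x1 >= x2:
            total += x1 - mu1
        else:
            total += x2 - mu2
    return total / n_rep


print(mean_selection_bias(0.0, 0.0, 1.0))       # equals 1 / sqrt(pi)
print(simulated_selection_bias(0.0, 0.0, 1.0))  # close to the value above
```

The bias is largest when the two true means are equal and shrinks as they move apart, since selection then becomes nearly deterministic.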
Abstract:
This paper compares a number of different extreme value models for determining the value at risk (VaR) of three LIFFE futures contracts. A semi-nonparametric approach is also proposed, where the tail events are modeled using the generalised Pareto distribution, and normal market conditions are captured by the empirical distribution function. The value at risk estimates from this approach are compared with those of standard nonparametric extreme value tail estimation approaches, with a small sample bias-corrected extreme value approach, and with those calculated from bootstrapping the unconditional density and bootstrapping from a GARCH(1,1) model. The results indicate that, for a holdout sample, the proposed semi-nonparametric extreme value approach yields superior results to other methods, but the small sample tail index technique is also accurate.
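A minimal sketch of the general idea (empirical distribution for the body, generalised Pareto distribution for the tail) is given below. It uses the standard peaks-over-threshold quantile formula and assumes that the threshold and the GPD shape `xi` and scale `beta` have already been chosen or fitted elsewhere; it is not necessarily the exact procedure of the paper, and all names are hypothetical.

```python
import math


def empirical_quantile(sorted_losses, p):
    """Smallest observation whose empirical CDF is at least p."""
    n = len(sorted_losses)
    k = max(1, math.ceil(p * n))
    return sorted_losses[k - 1]


def semi_parametric_var(losses, threshold, xi, beta, p):
    """Value at risk at confidence level p (0 < p < 1).

    Below the threshold the empirical distribution is used; above it,
    the exceedances are assumed to follow a GPD with shape xi, scale beta.
    """
    data = sorted(losses)
    n = len(data)
    n_exceed = sum(1 for x in data if x > threshold)
    if n_exceed == 0:
        raise ValueError("no observations exceed the threshold")
    tail_prob = n_exceed / n
    if p <= 1.0 - tail_prob:
        return empirical_quantile(data, p)     # normal market conditions
    ratio = (1.0 - p) / tail_prob
    if abs(xi) < 1e-12:
        return threshold - beta * math.log(ratio)  # exponential-tail limit
    return threshold + (beta / xi) * (ratio ** (-xi) - 1.0)


losses = list(range(1, 101))                   # toy data: 1, 2, ..., 100
print(semi_parametric_var(losses, 90, 0.5, 2.0, 0.50))
print(semi_parametric_var(losses, 90, 0.5, 2.0, 0.99))
```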
Abstract:
Statistical graphics are a fundamental, yet often overlooked, set of components in the repertoire of data analytic tools. Graphs are quick, efficient and simple instruments for the preliminary exploration of a dataset, helping to understand its structure and providing insight into influential aspects of inference such as departures from assumptions and latent patterns. In this paper, we present and assess a graphical device for choosing a method for estimating population size in capture-recapture studies of closed populations. The basic concept is derived from a homogeneous Poisson distribution where the ratios of neighboring Poisson probabilities multiplied by the value of the larger neighbor count are constant. This property extends to the zero-truncated Poisson distribution, which is of fundamental importance in capture-recapture studies. In practice, however, this distributional property is often violated. The graphical device developed here, the ratio plot, can be used for assessing specific departures from a Poisson distribution. For example, simple contaminations of an otherwise homogeneous Poisson model can be easily detected and a robust estimator for the population size can be suggested. Several robust estimators are developed and a simulation study is provided to give some guidance on which should be used in practice. More systematic departures can also easily be detected using the ratio plot. In this paper, the focus is on Gamma mixtures of the Poisson distribution, which lead to a linear pattern (called structured heterogeneity) in the ratio plot. More generally, the paper shows that the ratio plot is monotone for arbitrary mixtures of power series densities.
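The property behind the ratio plot can be sketched in a few lines: for a Poisson distribution with rate λ, the quantity (x + 1) p(x + 1) / p(x) equals λ for every x, so the same ratio computed from observed frequencies should be roughly flat. The helper below is an illustrative sketch with hypothetical names, not the authors' implementation.

```python
import math


def ratio_plot_values(freq):
    """Return {x: (x + 1) * f[x + 1] / f[x]} for neighbouring counts.

    `freq` maps a count x to the number of units observed exactly x times.
    Pairs with a missing or zero frequency are skipped.
    """
    ratios = {}
    for x in sorted(freq):
        fx = freq[x]
        fx1 = freq.get(x + 1, 0)
        if fx > 0 and fx1 > 0:
            ratios[x] = (x + 1) * fx1 / fx
    return ratios


# Exact zero-truncated Poisson probabilities with rate 1.5 (x = 1..7):
lam = 1.5
probs = {x: math.exp(-lam) * lam ** x / math.factorial(x) for x in range(1, 8)}
print(ratio_plot_values(probs))  # every ratio is 1.5 up to rounding
```

Plotting these values against x gives the ratio plot; a horizontal pattern is consistent with a homogeneous Poisson model, while the abstract notes that Gamma mixtures produce a linear pattern.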
Abstract:
In this paper we perform an analytical and numerical study of Extreme Value distributions in discrete dynamical systems. In this setting, recent works have shown how to obtain statistics of extremes in agreement with the classical Extreme Value Theory. We pursue these investigations by giving analytical expressions of Extreme Value distribution parameters for maps that have an absolutely continuous invariant measure. We compare these analytical results with numerical experiments in which we study the convergence to limiting distributions using the so-called block-maxima approach, pointing out in which cases we obtain robust estimation of parameters. In regular maps for which mixing properties do not hold, we show that the fitting procedure to the classical Extreme Value Distribution fails, as expected. However, we obtain an empirical distribution that can be explained starting from a different observable function for which Nicolis et al. (Phys. Rev. Lett. 97(21): 210602, 2006) have found analytical results.
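The block-maxima step can be sketched as below. The choice of map (the fully chaotic logistic map, which has an absolutely continuous invariant measure), the observable (minus the logarithm of the distance to a reference point) and the reference point itself are assumptions made for illustration only; fitting an Extreme Value distribution to the resulting maxima is left out.

```python
import math


def logistic_orbit(x0, n):
    """Iterate x -> 4 x (1 - x) n times, starting from x0."""
    xs = []
    x = x0
    for _ in range(n):
        x = 4.0 * x * (1.0 - x)
        xs.append(x)
    return xs


def distance_observable(x, target):
    """Large values correspond to close approaches to `target`."""
    # Guard against log(0) if the orbit hits the target exactly.
    return -math.log(max(abs(x - target), 1e-300))


def block_maxima(values, block_size):
    """Split the series into consecutive blocks and keep each maximum."""
    n_blocks = len(values) // block_size
    return [
        max(values[i * block_size:(i + 1) * block_size])
        for i in range(n_blocks)
    ]


orbit = logistic_orbit(0.3, 10_000)
observed = [distance_observable(x, 0.6) for x in orbit]
maxima = block_maxima(observed, 100)   # 100 blocks of length 100
print(len(maxima), max(maxima))
```

In practice there is a trade-off between block length (longer blocks bring each maximum closer to the limiting regime) and the number of blocks available for the fit.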
Abstract:
This paper presents novel observer-based techniques for the estimation of flow demands in gas networks, from sparse pressure telemetry. A completely observable model is explored, constructed by incorporating difference equations that assume the flow demands are steady. Since the flow demands usually vary slowly with time, this is a reasonable approximation. Two techniques for constructing robust observers are employed: robust eigenstructure assignment and singular value assignment. These techniques help to reduce the effects of the system approximation. Modelling error may be further reduced by making use of known profiles for the flow demands. The theory is extended to deal successfully with the problem of measurement bias. The pressure measurements available are subject to constant biases which degrade the flow demand estimates, and such biases need to be estimated. This is achieved by constructing a further model variation that incorporates the biases into an augmented state vector, but now includes information about the flow demand profiles in a new form.
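The state-augmentation idea in this abstract can be written schematically as follows. The notation is a generic discrete-time linear sketch and an assumption, not the paper's own model: x_k denotes the network state, d_k the flow demands assumed steady, y_k the pressure measurements, and G an observer gain to be designed (for example by the robust assignment techniques named above).

```latex
% Network model with steady (slowly varying) demands, generic notation
x_{k+1} = A x_k + B d_k, \qquad d_{k+1} = d_k, \qquad y_k = C x_k

% Augmented state z_k = (x_k, d_k)
z_{k+1} =
\underbrace{\begin{pmatrix} A & B \\ 0 & I \end{pmatrix}}_{\tilde{A}} z_k,
\qquad
y_k = \underbrace{\begin{pmatrix} C & 0 \end{pmatrix}}_{\tilde{C}} z_k

% Observer and its error dynamics, with e_k = z_k - \hat{z}_k
\hat{z}_{k+1} = \tilde{A} \hat{z}_k + G \left( y_k - \tilde{C} \hat{z}_k \right),
\qquad
e_{k+1} = \left( \tilde{A} - G \tilde{C} \right) e_k
```

Constant measurement biases can be handled in the same spirit by appending a bias vector b_k with b_{k+1} = b_k to the state and adding it to the measurement equation, provided the augmented system remains observable.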
Abstract:
The general focus of this paper is the regional estimation of marginal benefits of targeted water pollution abatement to instream uses. Benefit estimates are derived from actual consumer choices of recreational fishing activities and the implied expenditures for various levels of water quality. The methodology is applied to measuring the benefits accruing to recreational anglers in Indiana from the abatement of pollutants that are by-products of agricultural crop production.
Abstract:
A method has been developed to estimate Aerosol Optical Depth (AOD), Fine Mode Fraction (FMF) and Single Scattering Albedo (SSA) over land surfaces using simulated Sentinel-3 data. The method uses inversion of a coupled surface/atmosphere radiative transfer model, and includes a general physical model of angular surface reflectance. An iterative process is used to determine the optimum value of the aerosol properties providing the best fit of the corrected reflectance values for a number of view angles and wavelengths with those provided by the physical model. A method of estimating AOD using only angular retrieval has previously been demonstrated on data from the ENVISAT and PROBA-1 satellite instruments, and is extended here to the synergistic spectral and angular sampling of Sentinel-3 and the additional aerosol properties. The method is tested using hyperspectral, multi-angle Compact High Resolution Imaging Spectrometer (CHRIS) images. The values obtained from these CHRIS observations are validated using ground based sun-photometer measurements. Results from 22 image sets using the synergistic retrieval and improved aerosol models show an RMSE of 0.06 in AOD, reduced to 0.03 over vegetated targets.
Abstract:
A method has been developed to estimate aerosol optical depth (AOD) over land surfaces using high spatial resolution, hyperspectral, and multiangle Compact High Resolution Imaging Spectrometer (CHRIS)/Project for On Board Autonomy (PROBA) images. The CHRIS instrument is mounted aboard the PROBA satellite and provides up to 62 bands. The PROBA satellite allows pointing to obtain imagery from five different view angles within a short time interval. The method uses inversion of a coupled surface/atmosphere radiative transfer model and includes a general physical model of angular surface reflectance. An iterative process is used to determine the optimum value providing the best fit of the corrected reflectance values for a number of view angles and wavelengths with those provided by the physical model. This method has previously been demonstrated on data from the Advanced Along-Track Scanning Radiometer and is extended here to the spectral and angular sampling of CHRIS/PROBA. The values obtained from these observations are validated using ground-based sun-photometer measurements. Results from 22 image sets show an rms error of 0.11 in AOD at 550 nm, which is reduced to 0.06 after an automatic screening procedure.
Abstract:
We develop a method to derive aerosol properties over land surfaces using combined spectral and angular information, such as that available from the ESA Sentinel-3 mission, to be launched in 2015. A method of estimating aerosol optical depth (AOD) using only angular retrieval has previously been demonstrated on data from the ENVISAT and PROBA-1 satellite instruments, and is extended here to the synergistic spectral and angular sampling of Sentinel-3. The method aims to improve the estimation of AOD, and to explore the estimation of fine mode fraction (FMF) and single scattering albedo (SSA) over land surfaces by inversion of a coupled surface/atmosphere radiative transfer model. The surface model includes a general physical model of angular and spectral surface reflectance. An iterative process is used to determine the optimum value of the aerosol properties providing the best fit of the corrected reflectance values to the physical model. The method is tested using hyperspectral, multi-angle Compact High Resolution Imaging Spectrometer (CHRIS) images. The values obtained from these CHRIS observations are validated using ground-based sun photometer measurements. Results from 22 image sets using the synergistic retrieval and improved aerosol models show an RMSE of 0.06 in AOD, reduced to 0.03 over vegetated targets.
Abstract:
In numerical weather prediction, parameterisations are used to simulate missing physics in the model. These can be due to a lack of scientific understanding or a lack of computing power available to address all the known physical processes. Parameterisations are sources of large uncertainty in a model as parameter values used in these parameterisations cannot be measured directly and hence are often not well known; and the parameterisations themselves are also approximations of the processes present in the true atmosphere. Whilst there are many efficient and effective methods for combined state/parameter estimation in data assimilation (DA), such as state augmentation, these are not effective at estimating the structure of parameterisations. A new method of parameterisation estimation is proposed that uses sequential DA methods to estimate errors in the numerical models at each space-time point for each model equation. These errors are then fitted to pre-determined functional forms of missing physics or parameterisations that are based upon prior information. We applied the method to a one-dimensional advection model with additive model error, and it is shown that the method can accurately estimate parameterisations, with consistent error estimates. Furthermore, it is shown how the method depends on the quality of the DA results. The results indicate that this new method is a powerful tool in systematic model improvement.
Abstract:
Densities and viscosities of five vegetable oils (Babassu oil, Buriti oil, Brazil nut oil, macadamia oil, and grape seed oil) and of three blends of Buriti oil and soybean oil were measured as a function of temperature and correlated by empirical equations. The estimation capability of two types of predictive methodologies was tested using the measured data. The first group of methods was based on the fatty acid composition of the oils, while the other was based on their triacylglycerol composition, as a multicomponent system. In general, the six models tested presented a good representation of the physical properties considered in this work. A simple method of calculation is also proposed to predict the dynamic viscosity of methyl and ethyl ester biodiesels, based on the fatty acid composition of the original oil. Data presented in this work and the developed model can be valuable for designing processes and equipment for the edible oil industry and for biodiesel production.
Abstract:
In this paper, we propose a new two-parameter lifetime distribution with increasing failure rate, the complementary exponential geometric distribution, which is complementary to the exponential geometric model proposed by Adamidis and Loukas (1998). The new distribution arises in a latent complementary risks scenario, in which the lifetime associated with a particular risk is not observable; rather, we observe only the maximum lifetime value among all risks. The properties of the proposed distribution are discussed, including a formal proof of its probability density function and explicit algebraic formulas for its reliability and failure rate functions, moments, including the mean and variance, variation coefficient, and modal value. The parameter estimation is based on the usual maximum likelihood approach. We report the results of a misspecification simulation study performed in order to assess the extent of misspecification errors when testing the exponential geometric distribution against our complementary one in the presence of different sample sizes and censoring percentages. The methodology is illustrated on four real datasets; we also make a comparison between both modeling approaches. (C) 2011 Elsevier B.V. All rights reserved.
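The latent mechanism described in this abstract (the maximum of a random number of exponential lifetimes) leads to closed forms that can be sketched directly. The sketch assumes the number of risks N is geometric on {1, 2, ...} with P(N = n) = θ(1 − θ)^(n − 1) and that each latent lifetime is exponential with rate λ; this parameterisation and all function names are assumptions, so the formulas may differ in notation from those in the paper.

```python
import math
import random


def ceg_cdf(x, lam, theta):
    """CDF of the maximum of N exponential(lam) lifetimes, N geometric."""
    if x <= 0:
        return 0.0
    e = math.exp(-lam * x)
    return theta * (1.0 - e) / (theta + (1.0 - theta) * e)


def ceg_pdf(x, lam, theta):
    """Density obtained by differentiating ceg_cdf with respect to x."""
    if x < 0:
        return 0.0
    e = math.exp(-lam * x)
    return lam * theta * e / (theta + (1.0 - theta) * e) ** 2


def ceg_sample_latent(lam, theta, rng):
    """Simulate one lifetime through the latent-risk mechanism itself."""
    n_risks = 1
    while rng.random() >= theta:       # each trial stops with probability theta
        n_risks += 1
    return max(rng.expovariate(lam) for _ in range(n_risks))


rng = random.Random(1)
draws = [ceg_sample_latent(0.7, 0.4, rng) for _ in range(20_000)]
empirical = sum(1 for d in draws if d <= 1.0) / len(draws)
print(empirical, ceg_cdf(1.0, 0.7, 0.4))   # the two values should be close
```

Under this parameterisation the failure rate pdf / (1 − cdf) simplifies to λθ / (θ + (1 − θ) exp(−λx)), which increases in x for θ < 1, consistent with the increasing failure rate stated in the abstract.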
Abstract:
When missing data occur in studies designed to compare the accuracy of diagnostic tests, a common, though naive, practice is to base the comparison of sensitivity, specificity, as well as of positive and negative predictive values on some subset of the data that fits into methods implemented in standard statistical packages. Such methods are usually valid only under the strong missing completely at random (MCAR) assumption and may generate biased and less precise estimates. We review some models that use the dependence structure of the completely observed cases to incorporate the information of the partially categorized observations into the analysis and show how they may be fitted via a two-stage hybrid process involving maximum likelihood in the first stage and weighted least squares in the second. We indicate how computational subroutines written in R may be used to fit the proposed models and illustrate the different analysis strategies with observational data collected to compare the accuracy of three distinct non-invasive diagnostic methods for endometriosis. The results indicate that even when the MCAR assumption is plausible, the naive partial analyses should be avoided.