891 results for sampling error


Relevance:

30.00%

Publisher:

Abstract:

Analysis of a major multi-site epidemiologic study of heart disease has required estimation of the pairwise correlation of several measurements across sub-populations. Because the measurements from each sub-population were subject to sampling variability, the Pearson product moment estimator of these correlations produces biased estimates. This paper proposes a model that takes into account within and between sub-population variation, provides algorithms for obtaining maximum likelihood estimates of these correlations and discusses several approaches for obtaining interval estimates. (C) 1997 by John Wiley & Sons, Ltd.

Relevance:

30.00%

Publisher:

Abstract:

Combinatorial optimization problems share an interesting property with spin glass systems in that their state spaces can exhibit ultrametric structure. We use sampling methods to analyse the error surfaces of feedforward multi-layer perceptron neural networks learning encoder problems. The third order statistics of these points of attraction are examined and found to be arranged in a highly ultrametric way. This is a unique result for a finite, continuous parameter space. The implications of this result are discussed.

Relevance:

30.00%

Publisher:

Abstract:

The choice of genotyping families vs unrelated individuals is a critical factor in any large-scale linkage disequilibrium (LD) study. The use of unrelated individuals for such studies is promising, but in contrast to family designs, unrelated samples do not facilitate detection of genotyping errors, which have been shown to be of great importance for LD and linkage studies and may be even more important in genotyping collaborations across laboratories. Here we employ some of the most commonly used analysis methods to examine the relative accuracy of haplotype estimation using families vs unrelated individuals in the presence of genotyping error. The results suggest that even slight amounts of genotyping error can significantly decrease haplotype frequency and reconstruction accuracy, and that the ability to detect such errors in large families is essential when the number/complexity of haplotypes is high (low LD/common alleles). In contrast, in situations of low haplotype complexity (high LD and/or many rare alleles) unrelated individuals offer such a high degree of accuracy that there is little reason for less efficient family designs. Moreover, parent-child trios, which comprise the most popular family design and the most efficient in terms of the number of founder chromosomes per genotype but which contain little information for error detection, offer little or no gain over unrelated samples in nearly all cases, and thus do not seem a useful sampling compromise between unrelated individuals and large families. The implications of these results are discussed in the context of large-scale LD mapping projects such as the proposed genome-wide haplotype map.

Relevance:

30.00%

Publisher:

Abstract:

Forest regrowth occupies an extensive and increasing area in the Amazon basin, but accurate assessment of the impact of regrowth on carbon and nutrient cycles has been hampered by a paucity of available allometric equations. We develop pooled and species-specific equations for total aboveground biomass for a study site in the eastern Amazon that had been abandoned for 15 years. Field work was conducted using randomized branch sampling, a rapid technique that has seen little use in tropical forests. High consistency of sample paths in randomized branch sampling, as measured by the standard error of individual paths (14%), suggests the method may provide substantial efficiencies when compared to traditional procedures. The best fitting equations in this study used the traditional form Y = a × DBH^b, where Y is biomass, DBH is diameter at breast height, and a and b are both species-specific parameters. Species-specific equations of the form Y = a(BA × H), where Y is biomass, BA is tree basal area, H is tree height, and a is a species-specific parameter, fit almost as well. Comparison with previously published equations indicated errors from -33% to +29% would have occurred using off-site relationships. We also present equations for stemwood, twigs, and foliage as biomass components.
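
As an illustration of the power-law form Y = a × DBH^b mentioned in this abstract, the sketch below fits a and b by ordinary least squares on log-transformed data. This is a common way to fit allometric equations, but it is an assumption here: the abstract does not state the fitting procedure. The function name `fit_power_law` and the data values are hypothetical.

```python
import math

def fit_power_law(dbh, biomass):
    """Fit Y = a * DBH**b by least squares on ln(Y) = ln(a) + b * ln(DBH)."""
    xs = [math.log(d) for d in dbh]
    ys = [math.log(y) for y in biomass]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                       # slope in log-log space
    a = math.exp(mean_y - b * mean_x)   # back-transformed intercept
    return a, b

# Hypothetical, noise-free data generated with a = 0.1 and b = 2.5
dbh = [5.0, 10.0, 15.0, 20.0, 30.0]
biomass = [0.1 * d ** 2.5 for d in dbh]
a, b = fit_power_law(dbh, biomass)
print(f"a = {a:.4f}, b = {b:.4f}")
```

With noisy field data, back-transforming from the log scale can bias the estimate of a, so a correction factor is often considered in practice.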

Relevance:

30.00%

Publisher:

Abstract:

Several methods have been suggested to estimate non-linear models with interaction terms in the presence of measurement error. Structural equation models eliminate measurement error bias, but require large samples. Ordinary least squares regression on summated scales, regression on factor scores and partial least squares are appropriate for small samples but do not correct measurement error bias. Two stage least squares regression does correct measurement error bias but the results strongly depend on the instrumental variable choice. This article discusses the old disattenuated regression method as an alternative for correcting measurement error in small samples. The method is extended to the case of interaction terms and is illustrated on a model that examines the interaction effect of innovation and style of use of budgets on business performance. Alternative reliability estimates that can be used to disattenuate the estimates are discussed. A comparison is made with the alternative methods. Methods that do not correct for measurement error bias perform very similarly and considerably worse than disattenuated regression.
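
The idea behind disattenuation can be sketched with the classical correction for attenuation: an observed correlation is divided by the square root of the product of the two reliabilities, and a simple regression slope is divided by the reliability of the predictor. The sketch below covers only this simple case, not the article's extension to interaction terms. The function names and numeric values are hypothetical.

```python
import math

def disattenuate_correlation(r_xy, rel_x, rel_y):
    """Classical correction for attenuation of an observed correlation."""
    return r_xy / math.sqrt(rel_x * rel_y)

def disattenuate_slope(beta_obs, rel_x):
    """Correct a simple-regression slope for measurement error in the predictor."""
    return beta_obs / rel_x

# Hypothetical observed values and reliability estimates
r_corrected = disattenuate_correlation(r_xy=0.4, rel_x=0.8, rel_y=0.5)
beta_corrected = disattenuate_slope(beta_obs=0.3, rel_x=0.75)
print(round(r_corrected, 4), round(beta_corrected, 4))
```

Because the reliabilities are themselves estimates, the corrected values inherit their uncertainty, which matters most in small samples.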

Relevance:

30.00%

Publisher:

Abstract:

Therapeutic drug monitoring is recommended for adjusting the dose of immunosuppressive agents. A growing number of studies support the use of the area under the curve (AUC) as a biomarker for therapeutic monitoring of cyclosporine (CsA) in hematopoietic stem cell transplantation. However, for reasons intrinsic to the way the AUC is calculated, its use in clinical settings is impractical. Limited sampling strategies, based on regression approaches (R-LSS) or Bayesian approaches (B-LSS), are practical alternatives for a satisfactory estimation of the AUC. For these methodologies to be applied effectively, however, their design must accommodate clinical reality, notably by requiring a minimal number of concentrations spread over a short sampling duration. In addition, particular attention should be paid to ensuring their adequate development and validation. It is also important to note that irregularity in the timing of blood sample collection can have a non-negligible impact on the predictive performance of R-LSS. Yet, to date, this impact has not been studied. This doctoral thesis addresses these issues in order to enable accurate and practical estimation of the AUC. The studies were carried out in the context of CsA use in pediatric patients who had undergone hematopoietic stem cell transplantation. First, multiple regression approaches and population pharmacokinetic (Pop-PK) analysis were used constructively to develop and adequately validate LSS. Next, several Pop-PK models were evaluated, keeping in mind their intended use in estimating the AUC.
The performance of B-LSS targeting different versions of the AUC was also studied. Finally, the impact of deviations between actual blood sampling times and planned nominal times on the predictive performance of R-LSS was quantified using a simulation approach that considers diverse and realistic scenarios representing potential errors in the blood sampling schedule. This study first led to the development of R-LSS and B-LSS with satisfactory clinical performance that are practical because they involve 4 or fewer sampling points obtained within 4 hours post-dose. Following the Pop-PK analysis, a two-compartment structural model with a lag time was retained. However, the final model, notably with covariates, did not improve the performance of B-LSS compared with the structural models (without covariates). Furthermore, we showed that B-LSS perform better for the AUC derived from simulated concentrations that exclude residual errors, which we named the "underlying AUC", than for the observed AUC, which is calculated directly from measured concentrations. Finally, our results showed that irregularity in blood sample collection times has an important impact on the predictive performance of R-LSS; this impact depends on the number of samples required, but even more on the duration of the sampling process involved. We also showed that sampling errors made at times when the concentration changes rapidly are those that most affect the predictive power of R-LSS. More interestingly, we highlighted that even though different R-LSS may perform similarly when based on nominal times, their tolerance to sampling time errors can differ widely.
In fact, adequate consideration of the impact of these errors can lead to more reliable selection and use of R-LSS. Through an in-depth investigation of different aspects underlying limited sampling strategies, this thesis provides notable methodological improvements and proposes new avenues for ensuring their reliable and informed use, while promoting their suitability for clinical practice.

Relevance:

30.00%

Publisher:

Abstract:

In the Radiative Atmospheric Divergence Using ARM Mobile Facility GERB and AMMA Stations (RADAGAST) project we calculate the divergence of radiative flux across the atmosphere by comparing fluxes measured at each end of an atmospheric column above Niamey, in the African Sahel region. The combination of broadband flux measurements from geostationary orbit and the deployment for over 12 months of a comprehensive suite of active and passive instrumentation at the surface eliminates a number of sampling issues that could otherwise affect divergence calculations of this sort. However, one sampling issue that challenges the project is the fact that the surface flux data are essentially measurements made at a point, while the top-of-atmosphere values are taken over a solid angle that corresponds to an area at the surface of some 2500 km². Variability of cloud cover and aerosol loading in the atmosphere means that the downwelling fluxes, even when averaged over a day, will not be an exact match to the area-averaged value over that larger area, although we might expect that it is an unbiased estimate thereof. The heterogeneity of the surface, for example, fixed variations in albedo, further means that there is a likely systematic difference in the corresponding upwelling fluxes. In this paper we characterize and quantify this spatial sampling problem. We bound the root-mean-square error in the downwelling fluxes by exploiting a second set of surface flux measurements from a site that was run in parallel with the main deployment. The differences in the two sets of fluxes lead us to an upper bound to the sampling uncertainty, and their correlation leads to another which is probably optimistic as it requires certain other conditions to be met.
For the upwelling fluxes we use data products from a number of satellite instruments to characterize the relevant heterogeneities and so estimate the systematic effects that arise from the flux measurements having to be taken at a single point. The sampling uncertainties vary with the season, being higher during the monsoon period. We find that the sampling errors for the daily average flux are small for the shortwave irradiance, generally less than 5 W m⁻², under relatively clear skies, but these increase to about 10 W m⁻² during the monsoon. For the upwelling fluxes, again taking daily averages, systematic errors are of order 10 W m⁻² as a result of albedo variability. The uncertainty on the longwave component of the surface radiation budget is smaller than that on the shortwave component, in all conditions, but a bias of 4 W m⁻² is calculated to exist in the surface-leaving longwave flux.

Relevance:

30.00%

Publisher:

Abstract:

Models developed to identify the rates and origins of nutrient export from land to stream require an accurate assessment of the nutrient load present in the water body in order to calibrate model parameters and structure. These data are rarely available at a representative scale and in an appropriate chemical form except in research catchments. Observational errors associated with nutrient load estimates based on these data lead to a high degree of uncertainty in modelling and nutrient budgeting studies. Here, daily paired instantaneous P and flow data for 17 UK research catchments covering a total of 39 water years (WY) have been used to explore the nature and extent of the observational error associated with nutrient flux estimates based on partial fractions and infrequent sampling. The daily records were artificially decimated to create 7 stratified sampling records, 7 weekly records, and 30 monthly records from each WY and catchment. These were used to evaluate the impact of sampling frequency on load estimate uncertainty. The analysis underlines the high uncertainty of load estimates based on monthly data and individual P fractions rather than total P. Catchments with a high baseflow index and/or low population density were found to return a lower RMSE on load estimates when sampled infrequently than those with a low baseflow index and high population density. Catchment size was not shown to be important, though a limitation of this study is that daily records may fail to capture the full range of P export behaviour in smaller catchments with flashy hydrographs, leading to an underestimate of uncertainty in load estimates for such catchments. Further analysis of sub-daily records is needed to investigate this fully.
Here, recommendations are given on load estimation methodologies for different catchment types sampled at different frequencies, and the ways in which this analysis can be used to identify observational error and uncertainty for model calibration and nutrient budgeting studies. (c) 2006 Elsevier B.V. All rights reserved.
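
The decimation experiment described in this abstract can be sketched roughly as follows: a daily load record is subsampled at a fixed interval from every possible start day, each subsample is scaled up to an annual load estimate, and the estimates are compared with the load from the full record. The synthetic flow and concentration series, the simple mean-scaling estimator, and all names below are assumptions for illustration; the study's own data and estimators are not given in the abstract.

```python
import math

def load_estimate(daily_load, start, step):
    """Scale the mean of every `step`-th daily load up to the full record length."""
    sampled = daily_load[start::step]
    return sum(sampled) / len(sampled) * len(daily_load)

def relative_rmse(daily_load, step):
    """RMSE of the estimates over all `step` start days, as a fraction of the true load."""
    true_load = sum(daily_load)
    errors = [load_estimate(daily_load, k, step) - true_load for k in range(step)]
    return math.sqrt(sum(e * e for e in errors) / len(errors)) / true_load

# Synthetic water year (assumed): seasonal flow plus a storm spike every 17 days
flow = [1.0 + 0.5 * math.sin(2 * math.pi * d / 365) + (3.0 if d % 17 == 0 else 0.0)
        for d in range(365)]
conc = [0.05 + 0.02 * q for q in flow]       # concentration rises with flow
daily_load = [c * q for c, q in zip(conc, flow)]

rmse_daily = relative_rmse(daily_load, 1)     # full record, no decimation
rmse_weekly = relative_rmse(daily_load, 7)    # 7 possible weekly records
rmse_monthly = relative_rmse(daily_load, 30)  # 30 possible monthly records
print(rmse_daily, rmse_weekly, rmse_monthly)
```

Using every possible start day mirrors the 7 weekly and 30 monthly records per water year mentioned above, so the RMSE reflects the spread of estimates that an arbitrary choice of sampling day could produce.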

Relevance:

30.00%

Publisher:

Abstract:

Long-term monitoring of forest soils as part of a pan-European network to detect environmental change depends on an accurate determination of the mean of the soil properties at each monitoring event. Forest soil is known to be very variable spatially, however. A study was undertaken to explore and quantify this variability at three forest monitoring plots in Britain. Detailed soil sampling was carried out, and the data from the chemical analyses were analysed by classical statistics and geostatistics. An analysis of variance showed that there were no consistent effects from the sample sites in relation to the position of the trees. The variogram analysis showed that there was spatial dependence at each site for several variables and some varied in an apparently periodic way. An optimal sampling analysis based on the multivariate variogram for each site suggested that a bulked sample from 36 cores would reduce error to an acceptable level. Future sampling should be designed so that it neither targets nor avoids trees and disturbed ground. This can be achieved best by using a stratified random sampling design.

Relevance:

30.00%

Publisher:

Abstract:

The Representative Soil Sampling Scheme of England and Wales has recorded information on the soil of agricultural land in England and Wales since 1969. It is a valuable source of information about the soil in the context of monitoring for sustainable agricultural development. Changes in soil nutrient status and pH were examined over the period 1971-2001. Several methods of statistical analysis were applied to data from the surveys during this period. The main focus here is on the data for 1971, 1981, 1991 and 2001. The results of examining change over time in general show that levels of potassium in the soil have increased, those of magnesium have remained fairly constant, those of phosphorus have declined and pH has changed little. Future sampling needs have been assessed in the context of monitoring, to determine the mean at a given level of confidence and tolerable error and to detect change in the mean over time at these same levels over periods of 5 and 10 years. The results of a non-hierarchical multivariate classification suggest that England and Wales could be stratified to optimize future sampling and analysis. To monitor soil quality and health more generally than for agriculture, more of the country should be sampled and a wider range of properties recorded.

Relevance:

30.00%

Publisher:

Abstract:

In this paper, we generalise a previously described model of the error-prone polymerase chain reaction (PCR) to conditions of arbitrarily variable amplification efficiency and initial population size. Generalisation of the model to these conditions improves the correspondence to observed and expected behaviours of PCR, and restricts the extent to which the model may explore sequence space for a prescribed set of parameters. Error-prone PCR in realistic reaction conditions is predicted to be less effective at generating grossly divergent sequences than the original model. The estimate of mutation rate per cycle by sampling sequences from an in vitro PCR experiment is correspondingly affected by the choice of model and parameters. (c) 2005 Elsevier Ltd. All rights reserved.

Relevance:

30.00%

Publisher:

Abstract:

Numerical climate models constitute the best available tools to tackle the problem of climate prediction. Two assumptions lie at the heart of their suitability: (1) a climate attractor exists, and (2) the numerical climate model's attractor lies on the actual climate attractor, or at least on the projection of the climate attractor on the model's phase space. In this contribution, the Lorenz '63 system is used both as a prototype system and as an imperfect model to investigate the implications of the second assumption. By comparing results drawn from the Lorenz '63 system and from numerical weather and climate models, the implications of using imperfect models for the prediction of weather and climate are discussed. It is shown that the imperfect model's orbit and the system's orbit are essentially different, purely due to model error and not to sensitivity to initial conditions. Furthermore, if a model is a perfect model, then the attractor, reconstructed by sampling a collection of initialised model orbits (forecast orbits), will be invariant to forecast lead time. This conclusion provides an alternative method for the assessment of climate models.
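
The claim that an imperfect model's orbit departs from the system's orbit because of model error alone can be illustrated with a small sketch. Below, the "system" is Lorenz '63 with the standard parameters (sigma = 10, rho = 28, beta = 8/3) and the "imperfect model" differs only in rho. Both start from exactly the same initial condition, so any separation comes from the parameter error. The perturbed value rho = 28.5, the RK4 integrator, the step size, and the function names are assumptions, not details taken from the paper.

```python
import math

def lorenz63(state, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Right-hand side of the Lorenz '63 equations."""
    x, y, z = state
    return (sigma * (y - x), x * (rho - z) - y, x * y - beta * z)

def rk4_step(state, dt, rho):
    """Advance the state by one fourth-order Runge-Kutta step."""
    def shifted(k, h):
        return tuple(s + h * ki for s, ki in zip(state, k))
    k1 = lorenz63(state, rho=rho)
    k2 = lorenz63(shifted(k1, dt / 2), rho=rho)
    k3 = lorenz63(shifted(k2, dt / 2), rho=rho)
    k4 = lorenz63(shifted(k3, dt), rho=rho)
    return tuple(s + dt / 6 * (a + 2 * b + c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def final_separation(rho_model, steps=2000, dt=0.01):
    """Distance between system and model orbits after `steps` steps from one shared start."""
    system = model = (1.0, 1.0, 1.0)
    for _ in range(steps):
        system = rk4_step(system, dt, rho=28.0)
        model = rk4_step(model, dt, rho=rho_model)
    return math.sqrt(sum((s - m) ** 2 for s, m in zip(system, model)))

sep_perfect = final_separation(28.0)     # perfect model: identical orbit
sep_imperfect = final_separation(28.5)   # imperfect model: assumed parameter error
print(sep_perfect, sep_imperfect)
```

The perfect-model run stays on the system's orbit exactly, while the run with the perturbed parameter does not, even though no initial-condition error was introduced.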

Relevance:

30.00%

Publisher:

Abstract:

Mixed models may be defined with or without reference to sampling, and can be used to predict realized random effects, as when estimating the latent values of study subjects measured with response error. When the model is specified without reference to sampling, a simple mixed model includes two random variables, one stemming from an exchangeable distribution of latent values of study subjects and the other from the study subjects' response error distributions. Positive probabilities are assigned to both potentially realizable responses and artificial responses that are not potentially realizable, resulting in artificial latent values. In contrast, finite population mixed models represent the two-stage process of sampling subjects and measuring their responses, where positive probabilities are only assigned to potentially realizable responses. A comparison of the estimators over the same potentially realizable responses indicates that the optimal linear mixed model estimator (the usual best linear unbiased predictor, BLUP) is often (but not always) more accurate than the comparable finite population mixed model estimator (the FPMM BLUP). We examine a simple example and provide the basis for a broader discussion of the role of conditioning, sampling, and model assumptions in developing inference.

Relevance:

30.00%

Publisher:

Abstract:

The main object of this paper is to discuss the Bayes estimation of the regression coefficients in the elliptically distributed simple regression model with measurement errors. The posterior distribution for the line parameters is obtained in a closed form, considering the following: the ratio of the error variances is known, informative prior distribution for the error variance, and non-informative prior distributions for the regression coefficients and for the incidental parameters. We proved that the posterior distribution of the regression coefficients has at most two real modes. Situations with a single mode are more likely than those with two modes, especially in large samples. The precision of the modal estimators is studied by deriving the Hessian matrix, which although complicated can be computed numerically. The posterior mean is estimated by using the Gibbs sampling algorithm and approximations by normal distributions. The results are applied to a real data set and connections with results in the literature are reported. (C) 2011 Elsevier B.V. All rights reserved.