998 results for biased estimation


Relevance:

100.00%

Publisher:

Abstract:

Biased estimation has the advantage of reducing the mean squared error (MSE) of an estimator. The question of interest is how biased estimation affects model selection. In this paper, we introduce biased estimation to a range of model selection criteria. Specifically, we analyze the performance of the minimum description length (MDL) criterion based on biased and unbiased estimation and compare it against modern model selection criteria such as Kay's conditional model order estimator (CME), the bootstrap and the more recently proposed hook-and-loop resampling based model selection. The advantages and limitations of the considered techniques are discussed. The results indicate that, in some cases, biased estimators can slightly improve the selection of the correct model. We also give an example in which the CME fails with an unbiased estimator but regains its power when a biased estimator is used.
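
As a standalone illustration of the premise (not code from the paper), the classic variance-estimator example shows how accepting bias can lower MSE: for normal samples, dividing the sum of squared deviations by n+1 instead of the unbiased n-1 yields a biased estimator with strictly smaller mean squared error. A minimal Monte-Carlo sketch:

```python
import random

def mse_of_variance_estimator(divisor, n=10, trials=20000, sigma=1.0, seed=1):
    # Monte-Carlo MSE of the variance estimator S/divisor, where
    # S = sum((x - mean)^2) over a normal sample of size n (true variance sigma^2).
    rng = random.Random(seed)
    true_var = sigma * sigma
    sq_errors = []
    for _ in range(trials):
        xs = [rng.gauss(0.0, sigma) for _ in range(n)]
        m = sum(xs) / n
        s = sum((x - m) ** 2 for x in xs)
        sq_errors.append((s / divisor - true_var) ** 2)
    return sum(sq_errors) / trials

mse_unbiased = mse_of_variance_estimator(divisor=9)   # 1/(n-1): unbiased
mse_biased = mse_of_variance_estimator(divisor=11)    # 1/(n+1): biased, lower MSE
```

For n = 10 and sigma^2 = 1 the theoretical MSE of S/c is 2(n-1)/c^2 + ((n-1)/c - 1)^2, about 0.222 for c = 9 and 0.182 for c = 11, and the simulation reproduces that ordering.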

Relevance:

70.00%

Publisher:

Abstract:

The Macroscopic Fundamental Diagram (MFD) relates space-mean density and flow. Since the MFD represents the area-wide network traffic performance, studies on perimeter control strategies and network-wide traffic state estimation utilising the MFD concept have been reported. Most previous works have utilised data from fixed sensors, such as inductive loops, to estimate the MFD, which can cause biased estimation in urban networks due to queue spillovers at intersections. To overcome this limitation, recent literature reports the use of trajectory data obtained from probe vehicles. However, these studies have been conducted using simulated datasets; limited works have discussed the limitations of real datasets and their impact on the variable estimation. This study compares two methods for estimating traffic state variables of signalised arterial sections: a method based on cumulative vehicle counts (CUPRITE), and one based on vehicle trajectories from taxi Global Positioning System (GPS) logs. The comparisons reveal some characteristics of the taxi trajectory data available in Brisbane, Australia. The current trajectory data have limitations in quantity (i.e., the penetration rate), due to which the traffic state variables tend to be underestimated. Nevertheless, the trajectory-based method successfully captures the features of traffic states, which suggests that taxi trajectories can be a good estimator of network-wide traffic states.
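
Trajectory-based estimation of area-wide flow and density typically rests on Edie's generalized definitions: flow is total distance travelled divided by the area of the time-space region, and density is total travel time divided by the same area. A minimal sketch (illustrative, not the CUPRITE or Brisbane pipeline; with probe data these totals cover only the sampled vehicles, which is exactly the penetration-rate underestimation noted in the abstract):

```python
def edie_traffic_state(trajectories, section_length, period):
    # trajectories: list of [(t, x), ...] position samples for each probe vehicle,
    # with 0 <= t <= period and 0 <= x <= section_length (assumed already clipped).
    total_distance = 0.0   # TTD: total travel distance in the time-space region
    total_time = 0.0       # TTT: total travel time in the region
    for traj in trajectories:
        for (t0, x0), (t1, x1) in zip(traj, traj[1:]):
            total_distance += x1 - x0
            total_time += t1 - t0
    area = section_length * period
    flow = total_distance / area       # vehicles per unit time
    density = total_time / area        # vehicles per unit length
    speed = flow / density if density > 0 else float("nan")
    return flow, density, speed
```

For example, two vehicles each crossing a 100 m section in 10 s within a 60 s window give a space-mean speed of 10 m/s.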

Relevance:

60.00%

Publisher:

Abstract:

Estimating rare events from zero-heavy data (data with many zero values) is a common challenge in fisheries science and ecology. For example, loggerhead sea turtles (Caretta caretta) and leatherback sea turtles (Dermochelys coriacea) account for less than 1% of total catch in the U.S. Atlantic pelagic longline fishery. Nevertheless, the Southeast Fisheries Science Center (SEFSC) of the National Marine Fisheries Service (NMFS) is charged with assessing the effect of this fishery on these federally protected species. Annual estimates of loggerhead and leatherback bycatch in a fishery can affect fishery management and species conservation decisions. However, current estimates have wide confidence intervals, and their accuracy is unknown. We evaluate 3 estimation methods, each at 2 spatiotemporal scales, in simulations of 5 spatial scenarios representing incidental capture of sea turtles by the U.S. Atlantic pelagic longline fishery. The delta-log normal method of estimating bycatch for calendar quarter and fishing area strata was the least biased estimation method in the spatial scenarios believed to be most realistic. This result supports the current estimation procedure used by the SEFSC.
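
The delta-lognormal estimator mentioned here combines a binomial model for the probability of a nonzero set with a lognormal model for the positive catches. A minimal unstratified sketch (the SEFSC procedure applies this within calendar-quarter and fishing-area strata, and its details may differ):

```python
import math

def delta_lognormal_mean(catches):
    # catches: per-set bycatch values, many of them zero.
    # Assumes at least two positive observations (for the variance estimate).
    positives = [c for c in catches if c > 0]
    n = len(catches)
    p_hat = len(positives) / n                   # probability of a nonzero set
    logs = [math.log(c) for c in positives]
    mu = sum(logs) / len(logs)
    var = sum((v - mu) ** 2 for v in logs) / (len(logs) - 1)
    # Lognormal mean via the usual exp(mu + var/2) back-transform.
    return p_hat * math.exp(mu + var / 2.0)
```

The stratum total is then the estimated mean per set times the observed effort in that stratum.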

Relevance:

60.00%

Publisher:

Abstract:

A National Frog Survey of Ireland is planned for spring 2011. We conducted a pilot survey of 25 water bodies in ten 0.25 km2 survey squares in Co. Mayo during spring 2010. Drainage ditches were the most commonly available site for breeding and, generally, two 100 m stretches of ditch were surveyed in each square. The restricted period for peak spawning activity renders any methodology utilizing only one site visit inherently risky. Consequently, each site was visited three times from late March to early April. Occurrence of spawn declined significantly from 72 % to 44 % between the first and third visit whilst the overall occurrence of spawn at all sites was 76 %. As the breeding season advanced, spawn either hatched or was predated and, therefore, disappeared. In those water bodies where spawning was late, however, greater densities of spawn were deposited than in those sites where breeding was early. Consequently, spawn density and estimated frog density did not differ significantly between site visits. Future surveys should nevertheless include multiple site visits to avoid biased estimation of species occurrence and distribution. Ecological succession was identified as the main threat present at 44 % of sites.
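
The value of multiple visits can be made concrete with a simple detection model: if spawn at an occupied site is detected on a single visit with probability p, and visits are independent (an assumption; the p below is illustrative, not estimated from this survey), then k visits detect it with probability 1 - (1 - p)^k:

```python
def cumulative_detection(p_single, visits):
    # Probability of detecting spawn at least once over `visits` independent
    # surveys, given per-visit detection probability p_single at an occupied site.
    return 1.0 - (1.0 - p_single) ** visits
```

With p = 0.5, three visits raise the detection probability from 0.5 to 0.875, which is why single-visit protocols risk biased occurrence estimates.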

Relevance:

60.00%

Publisher:

Abstract:

Obesity has been linked with elevated levels of C-reactive protein (CRP), and both have been associated with increased risk of mortality and cardiovascular disease (CVD). Previous studies have used a single ‘baseline’ measurement, and such analyses cannot account for possible changes in these factors, which may lead to a biased estimation of risk. Using four cohorts from CHANCES with repeated measures in participants aged 50 years and older, multivariate time-dependent Cox proportional hazards regression was used to estimate hazard ratios (HR) and 95 % confidence intervals (CI) for the associations of body mass index (BMI) and CRP with all-cause mortality and CVD. Being overweight (≥25–<30 kg/m2) or moderately obese (≥30–<35) tended to be associated with a lower risk of mortality compared to normal weight (≥18.5–<25): ESTHER, HR (95 % CI) 0.69 (0.58–0.82) and 0.78 (0.63–0.97); Rotterdam, 0.86 (0.79–0.94) and 0.80 (0.72–0.89). A similar relationship was found, but only for overweight in Glostrup, HR (95 % CI) 0.88 (0.76–1.02), and for moderate obesity in Tromsø, HR (95 % CI) 0.79 (0.62–1.01). Associations were not evident between repeated measures of BMI and CVD. Conversely, increasing CRP concentrations, measured on more than one occasion, were associated with an increasing risk of mortality and CVD. Being overweight or moderately obese is associated with a lower risk of mortality, while CRP, independent of BMI, is positively associated with mortality and CVD risk. If inflammation links CRP and BMI, they may participate in distinct/independent pathways. Accounting for independent changes in risk factors over time may be crucial for unveiling their effects on mortality and disease morbidity.

Relevance:

60.00%

Publisher:

Abstract:

The objectives of this work were to compare estimates of genetic parameters obtained with two models, one containing only additive and dominance effects and another that also included joint additive (complementarity) and epistatic effects, and to test alternative objective criteria for determining the lambda coefficient when applying ridge regression. The results showed that the choice of a criterion for determining the lambda coefficient in ridge regression depends not only on the dataset and the model used but, above all, on prior knowledge of the phenomenon under study and on the practical meaning and interpretation of the estimated parameters. By using more complete models for the evaluation of genetic effects in beef cattle, it is possible to identify the contribution of the joint additive and epistatic effects, which are embedded in the heterosis effect estimated by simpler models. Ridge regression is a tool that makes these estimates feasible even in the presence of strong multicollinearity.
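
Ridge regression shrinks coefficient estimates by adding the lambda coefficient to the diagonal of X'X, which is what keeps the estimates stable under strong multicollinearity. A minimal two-predictor closed-form sketch (illustrative; the lambda-selection criteria discussed in the abstract are not implemented here):

```python
def ridge_2predictor(x1, x2, y, lam):
    # Ridge solution beta = (X'X + lam*I)^(-1) X'y for two (assumed centred)
    # predictors, solved with the explicit 2x2 inverse.
    a = sum(v * v for v in x1) + lam          # X'X[0,0] + lambda
    b = sum(u * v for u, v in zip(x1, x2))    # X'X[0,1] = X'X[1,0]
    d = sum(v * v for v in x2) + lam          # X'X[1,1] + lambda
    g1 = sum(u * v for u, v in zip(x1, y))    # X'y[0]
    g2 = sum(u * v for u, v in zip(x2, y))    # X'y[1]
    det = a * d - b * b
    return ((d * g1 - b * g2) / det, (a * g2 - b * g1) / det)
```

With lam = 0 this reduces to ordinary least squares; increasing lam shrinks both coefficients toward zero, trading bias for a large reduction in variance when the predictors are nearly collinear.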

Relevance:

60.00%

Publisher:

Abstract:

Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)

Relevance:

60.00%

Publisher:

Abstract:

Approximate models (proxies) can be employed to reduce the computational costs of estimating uncertainty. The price to pay is that the approximations introduced by the proxy model can lead to a biased estimation. To avoid this problem and ensure a reliable uncertainty quantification, we propose to combine functional data analysis and machine learning to build error models that allow us to obtain an accurate prediction of the exact response without solving the exact model for all realizations. We build the relationship between proxy and exact model on a learning set of geostatistical realizations for which both exact and approximate solvers are run. Functional principal components analysis (FPCA) is used to investigate the variability in the two sets of curves and reduce the dimensionality of the problem while maximizing the retained information. Once obtained, the error model can be used to predict the exact response of any realization on the basis of the sole proxy response. This methodology is purpose-oriented as the error model is constructed directly for the quantity of interest, rather than for the state of the system. Also, the dimensionality reduction performed by FPCA allows a diagnostic of the quality of the error model to assess the informativeness of the learning set and the fidelity of the proxy to the exact model. The possibility of obtaining a prediction of the exact response for any newly generated realization suggests that the methodology can be effectively used beyond the context of uncertainty quantification, in particular for Bayesian inference and optimization.
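
The core idea, an error model trained on a learning set where both solvers are run and then applied to proxy-only realizations, can be caricatured with a scalar response and a plain least-squares error model (the paper's FPCA step, which handles curve-valued responses, is omitted here):

```python
def fit_error_model(proxy_learn, exact_learn):
    # Least-squares fit of exact = a + b * proxy on the learning set: a scalar
    # stand-in for the FPCA-based error model described in the abstract.
    n = len(proxy_learn)
    mx = sum(proxy_learn) / n
    my = sum(exact_learn) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(proxy_learn, exact_learn))
    sxx = sum((x - mx) ** 2 for x in proxy_learn)
    b = sxy / sxx
    a = my - b * mx
    return lambda proxy: a + b * proxy   # predict the exact response from the proxy alone
```

Once fitted on the realizations for which both solvers were run, the returned model predicts the exact response for any new realization from its (cheap) proxy response only.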

Relevance:

30.00%

Publisher:

Abstract:

Gradient-based approaches to direct policy search in reinforcement learning have received much recent attention as a means to solve problems of partial observability and to avoid some of the problems associated with policy degradation in value-function methods. In this paper we introduce GPOMDP, a simulation-based algorithm for generating a biased estimate of the gradient of the average reward in Partially Observable Markov Decision Processes (POMDPs) controlled by parameterized stochastic policies. A similar algorithm was proposed by Kimura, Yamamura, and Kobayashi (1995). The algorithm's chief advantages are that it requires storage of only twice the number of policy parameters, uses one free parameter β ∈ [0,1) (which has a natural interpretation in terms of bias-variance trade-off), and requires no knowledge of the underlying state. We prove convergence of GPOMDP, and show how the correct choice of the parameter β is related to the mixing time of the controlled POMDP. We briefly describe extensions of GPOMDP to controlled Markov chains, continuous state, observation and control spaces, multiple agents, higher-order derivatives, and a version for training stochastic policies with internal states. In a companion paper (Baxter, Bartlett, & Weaver, 2001) we show how the gradient estimates generated by GPOMDP can be used in both a traditional stochastic gradient algorithm and a conjugate-gradient procedure to find local optima of the average reward. ©2001 AI Access Foundation and Morgan Kaufmann Publishers. All rights reserved.
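
The GPOMDP estimate can be sketched on the simplest possible POMDP, a two-armed bandit with a single parameter: maintain a discounted eligibility trace of score functions and average reward-times-trace. This toy (not the paper's experiments) uses a Bernoulli policy pi(a=1) = sigmoid(theta), for which the true gradient of the average reward at theta = 0 is 0.25:

```python
import math
import random

def gpomdp_bandit(theta, beta, steps, seed=0):
    # GPOMDP gradient estimate for a one-state POMDP (a two-armed bandit):
    # policy pi(a=1) = sigmoid(theta); reward 1 for arm 1, 0 for arm 0.
    rng = random.Random(seed)
    p = 1.0 / (1.0 + math.exp(-theta))
    z, delta = 0.0, 0.0                       # eligibility trace, running average
    for t in range(1, steps + 1):
        a = 1 if rng.random() < p else 0
        score = (1.0 - p) if a == 1 else -p   # d/dtheta of log pi(a)
        z = beta * z + score                  # discounted eligibility trace
        r = float(a)                          # reward of the chosen arm
        delta += (r * z - delta) / t          # running average of r_t * z_t
    return delta

grad = gpomdp_bandit(theta=0.0, beta=0.5, steps=20000)
# True gradient at theta = 0 is sigma(0) * (1 - sigma(0)) = 0.25.
```

In this i.i.d. setting any beta gives an unbiased estimate; in a genuinely temporal POMDP, beta trades bias against variance as the abstract describes.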

Relevance:

30.00%

Publisher:

Abstract:

We investigate the utility to computational Bayesian analyses of a particular family of recursive marginal likelihood estimators characterized by the (equivalent) algorithms known as "biased sampling" or "reverse logistic regression" in the statistics literature and "the density of states" in physics. Through a pair of numerical examples (including mixture modeling of the well-known galaxy dataset) we highlight the remarkable diversity of sampling schemes amenable to such recursive normalization, as well as the notable efficiency of the resulting pseudo-mixture distributions for gauging prior-sensitivity in the Bayesian model selection context. Our key theoretical contributions are to introduce a novel heuristic ("thermodynamic integration via importance sampling") for qualifying the role of the bridging sequence in this procedure, and to reveal various connections between these recursive estimators and the nested sampling technique.
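
A single-ratio special case of such normalizing-constant estimation is plain importance sampling between two unnormalized densities; the recursive biased-sampling estimators generalize this across a whole bridging sequence. An illustrative sketch with Gaussian kernels, where the true ratio of normalizers is 1/sqrt(2):

```python
import math
import random

def ratio_of_normalizers(log_q_num, log_q_den, sampler_den, n, seed=0):
    # Estimate Z_num / Z_den by importance sampling: draw x from the normalized
    # denominator density and average q_num(x) / q_den(x).  A single-bridge
    # version of the recursive (biased-sampling) estimators discussed above.
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        x = sampler_den(rng)
        total += math.exp(log_q_num(x) - log_q_den(x))
    return total / n

# Unnormalized kernels: q_num = exp(-x^2), q_den = exp(-x^2 / 2) (standard normal).
# True ratio: sqrt(pi) / sqrt(2 * pi) = 1 / sqrt(2) ~ 0.7071.
est = ratio_of_normalizers(
    log_q_num=lambda x: -x * x,
    log_q_den=lambda x: -0.5 * x * x,
    sampler_den=lambda rng: rng.gauss(0.0, 1.0),
    n=50000,
)
```

The quality of such estimates degrades as the two densities separate, which is exactly why a bridging sequence (and a heuristic for qualifying it) matters in the recursive setting.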

Relevance:

30.00%

Publisher:

Abstract:

This paper presents a method for the estimation of thrust model parameters of uninhabited airborne systems using specific flight tests. Particular tests are proposed to simplify the estimation. The proposed estimation method is based on three steps. The first step uses a regression model in which the thrust is assumed constant. This allows us to obtain biased initial estimates of the aerodynamic coefficients of the surge model. In the second step, a robust nonlinear state estimator is implemented using the initial parameter estimates, and the model is augmented by considering the thrust as a random walk. In the third step, the estimate of the thrust obtained by the observer is used to fit a polynomial model in terms of the propeller advance ratio. We consider a numerical example based on Monte-Carlo simulations to quantify the sampling properties of the proposed estimator under realistic flight conditions.

Relevance:

30.00%

Publisher:

Abstract:

We consider the development of statistical models for prediction of constituent concentrations of riverine pollutants, which is a key step in load estimation from frequent flow rate data and less frequently collected concentration data. We consider how to capture the impacts of past flow patterns via the average discounted flow (ADF), which discounts the past flux based on the time elapsed - more recent fluxes are given more weight. However, the effectiveness of the ADF depends critically on the choice of the discount factor, which reflects the unknown environmental cumulating process of the concentration compounds. We propose to choose the discount factor by maximizing the adjusted R2 value or the Nash-Sutcliffe model efficiency coefficient. The R2 values are adjusted to take account of the number of parameters in the model fit. The resulting optimal discount factor can be interpreted as a measure of the constituent exhaustion rate during flood events. To evaluate the performance of the proposed regression estimators, we examine two different sampling scenarios by resampling fortnightly and opportunistically from two real daily datasets, which come from two United States Geological Survey (USGS) gaging stations located in the Des Plaines River and Illinois River basins. The generalized rating-curve approach produces biased estimates of the total sediment loads by -30% to 83%, whereas the new approaches produce much lower biases, ranging from -24% to 35%. This substantial improvement in the estimates of the total load is due to the fact that the predictability of concentration is greatly improved by the additional predictors.
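
The ADF has a simple recursive form; a sketch under the assumption of an exponentially discounted, normalized weighting (the paper's exact normalization may differ, and the discount-factor search against adjusted R2 is not shown):

```python
def average_discounted_flow(flows, discount):
    # ADF_t = (1 - d) * sum_{k >= 0} d^k * q_{t-k}, computed recursively:
    # more recent fluxes get more weight; `discount` d lies in (0, 1).
    adf, out = 0.0, []
    for q in flows:
        adf = discount * adf + (1.0 - discount) * q
        out.append(adf)
    return out
```

A small discount factor makes the ADF track recent flow closely (fast exhaustion); a discount near 1 averages over a long flow history.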

Relevance:

30.00%

Publisher:

Abstract:

We consider estimating the total load from frequent flow data but less frequent concentration data. There are numerous load estimation methods available, some of which are captured in various online tools. However, most estimators are subject to large statistical biases, and their associated uncertainties are often not reported. This makes interpretation difficult and makes it impossible to assess trends or determine optimal sampling regimes. In this paper, we first propose two indices for measuring the extent of sampling bias, and then provide steps for obtaining reliable load estimates that minimize the biases and make use of informative predictive variables. The key step in this approach is the development of an appropriate predictive model for concentration. This is achieved using a generalized rating-curve approach with additional predictors that capture unique features in the flow data, such as the concept of the first flush, the location of the event on the hydrograph (e.g. rise or fall) and the discounted flow. The latter may be thought of as a measure of constituent exhaustion occurring during flood events. Forming this additional information can significantly improve the predictability of concentration, and ultimately the precision with which the pollutant load is estimated. We also provide a measure of the standard error of the load estimate which incorporates model, spatial and/or temporal errors. This method also has the capacity to incorporate measurement error incurred through the sampling of flow. We illustrate this approach for two rivers delivering to the Great Barrier Reef, Queensland, Australia. One is a dataset from the Burdekin River, consisting of total suspended sediment (TSS), nitrogen oxides (NO(x)) and gauged flow for 1997. The other dataset is from the Tully River, for the period of July 2000 to June 2008. For NO(x) in the Burdekin, the new estimates are very similar to the ratio estimates even when there is no relationship between the concentration and the flow. However, for the Tully dataset, by incorporating the additional predictive variables, namely the discounted flow and flow phases (rising or recessing), we substantially improved the model fit, and thus the certainty with which the load is estimated.
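
The retransformation bias of log-log rating curves, one source of the large biases discussed above, is commonly reduced with the lognormal correction factor exp(s^2/2) applied to the back-transformed predictions. A minimal sketch (illustrative; the paper's model adds further predictors such as discounted flow and hydrograph phase):

```python
import math

def rating_curve_load(flow_obs, conc_obs, flow_all):
    # Fit log(C) = a + b * log(Q) on the sampled days, predict concentration for
    # every day, and sum C_hat * Q as the load.  The naive back-transform
    # exp(a + b * log Q) is biased low; exp(s2 / 2) is the usual lognormal correction.
    lx = [math.log(q) for q in flow_obs]
    ly = [math.log(c) for c in conc_obs]
    n = len(lx)
    mx, my = sum(lx) / n, sum(ly) / n
    b = sum((x - mx) * (y - my) for x, y in zip(lx, ly)) / sum((x - mx) ** 2 for x in lx)
    a = my - b * mx
    resid = [y - (a + b * x) for x, y in zip(lx, ly)]
    s2 = sum(r * r for r in resid) / max(n - 2, 1)   # residual variance on the log scale
    correction = math.exp(s2 / 2.0)                  # lognormal retransformation factor
    return sum(math.exp(a + b * math.log(q)) * correction * q for q in flow_all)
```

When the log-log relationship is exact the correction is 1 and the estimate is unbiased; with noisy concentrations the correction grows with the residual variance.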

Relevance:

30.00%

Publisher:

Abstract:

Previous studies have shown that the external growth records of the posterior adductor muscle scar (PAMS) of the bivalve Pinna nobilis are incomplete and do not produce accurate age estimations. We have developed a new methodology to study age and growth using the inner record of the PAMS, which avoids the necessity of costly in situ shell measurements or isotopic studies. Using the inner record we identified the positions of PAMS previously obscured by nacre and estimated the number of missing records in adult specimens with strong abrasion of the calcite layer in the anterior portion of the shell. In two shells that were 6 years old when collected, only 2 and 3 PAMS were observed externally, while 6 inner records could be counted, confirming our working methodology. Growth parameters of a P. nobilis population located in Moraira, Spain (western Mediterranean) were estimated with the new methodology and compared to those obtained using PAMS data and in situ measurements. For the comparisons, we applied different models treating the data alternatively as length-at-age (LA) and tag-recapture (TR). Of all the methods we tested to fit the von Bertalanffy growth model, fitting LA data from the inner record using non-linear mixed effects, with missing records estimated from the calcite width, was the most appropriate. The equation obtained with this method, L = 573(1 − e^(−0.16(t − 0.02))), is very similar to that calculated previously from in situ measurements for the same population.
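
The reported growth curve can be evaluated directly; lengths are in the units of the reported asymptote L_inf = 573:

```python
import math

def von_bertalanffy_length(t, l_inf=573.0, k=0.16, t0=0.02):
    # Von Bertalanffy growth curve with the parameters reported for the
    # Moraira population: L(t) = L_inf * (1 - exp(-k * (t - t0))).
    return l_inf * (1.0 - math.exp(-k * (t - t0)))
```

At age 6, the age of the two reference shells, the curve predicts a length of about 353 (same units as L_inf).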

Relevance:

30.00%

Publisher:

Abstract:

In the analysis of tagging data, it has been found that the least-squares method, based on the increment function known as the Fabens method, produces biased estimates because individual variability in growth is not allowed for. This paper modifies the Fabens method to account for individual variability in the length asymptote. Significance tests using t-statistics or log-likelihood ratio statistics may be applied to show the level of individual variability. Simulation results indicate that the modified method reduces the biases in the estimates to negligible proportions. Tagging data from tiger prawns (Penaeus esculentus and Penaeus semisulcatus) and rock lobster (Panulirus ornatus) are analysed as an illustration.
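
The unmodified Fabens fit that the paper starts from minimizes the squared error of predicted increments dL = (L_inf - L1) * (1 - e^(-K * dt)). A coarse grid-search sketch (the paper's modification, a random length asymptote per individual, is not included; the grid ranges are arbitrary):

```python
import math

def fabens_fit(releases, recaptures, dts):
    # Least-squares Fabens fit of (L_inf, K) to tagging data: release length L1,
    # recapture length L2, and time at liberty dt for each animal.
    # Coarse grid search; individual variability in L_inf is NOT modelled here.
    best = (None, None, float("inf"))
    for l_inf in [400.0 + 2.0 * i for i in range(101)]:    # L_inf grid: 400..600
        for k in [0.05 + 0.01 * j for j in range(46)]:     # K grid: 0.05..0.50
            sse = 0.0
            for l1, l2, dt in zip(releases, recaptures, dts):
                pred = l1 + (l_inf - l1) * (1.0 - math.exp(-k * dt))
                sse += (l2 - pred) ** 2
            if sse < best[2]:
                best = (l_inf, k, sse)
    return best[:2]
```

On noise-free synthetic increments the grid search recovers the generating parameters exactly; with real tagging data, ignoring individual variability in L_inf is what produces the bias the paper corrects.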