997 resultados para Traffic estimation.
Resumo:
We derive a new method for determining size-transition matrices (STMs) that eliminates probabilities of negative growth and accounts for individual variability. STMs are an important part of size-structured models, which are used in the stock assessment of aquatic species. The elements of STMs represent the probability of growth from one size class to another, given a time step. The growth increment over this time step can be modelled with a variety of methods, but when a population construct is assumed for the underlying growth model, the resulting STM may contain entries that predict negative growth. To solve this problem, we use a maximum likelihood method that incorporates individual variability in the asymptotic length, relative age at tagging, and measurement error to obtain von Bertalanffy growth model parameter estimates. The statistical moments for the future length given an individual's previous length measurement and time at liberty are then derived. We moment match the true conditional distributions with skewed-normal distributions and use these to accurately estimate the elements of the STMs. The method is investigated with simulated tag-recapture data and tag-recapture data gathered from the Australian eastern king prawn (Melicertus plebejus).
Resumo:
We consider the development of statistical models for prediction of constituent concentration of riverine pollutants, which is a key step in load estimation from frequent flow rate data and less frequently collected concentration data. We consider how to capture the impacts of past flow patterns via the average discounted flow (ADF) which discounts the past flux based on the time lapsed - more recent fluxes are given more weight. However, the effectiveness of ADF depends critically on the choice of the discount factor which reflects the unknown environmental cumulating process of the concentration compounds. We propose to choose the discount factor by maximizing the adjusted R-2 values or the Nash-Sutcliffe model efficiency coefficient. The R2 values are also adjusted to take account of the number of parameters in the model fit. The resulting optimal discount factor can be interpreted as a measure of constituent exhaustion rate during flood events. To evaluate the performance of the proposed regression estimators, we examine two different sampling scenarios by resampling fortnightly and opportunistically from two real daily datasets, which come from two United States Geological Survey (USGS) gaging stations located in Des Plaines River and Illinois River basin. The generalized rating-curve approach produces biased estimates of the total sediment loads by -30% to 83%, whereas the new approaches produce relatively much lower biases, ranging from -24% to 35%. This substantial improvement in the estimates of the total load is due to the fact that predictability of concentration is greatly improved by the additional predictors.
Resumo:
Rank-based inference is widely used because of its robustness. This article provides optimal rank-based estimating functions in analysis of clustered data with random cluster effects. The extensive simulation studies carried out to evaluate the performance of the proposed method demonstrate that it is robust to outliers and is highly efficient given the existence of strong cluster correlations. The performance of the proposed method is satisfactory even when the correlation structure is misspecified, or when heteroscedasticity in variance is present. Finally, a real dataset is analyzed for illustration.
Resumo:
We consider estimating the total load from frequent flow data but less frequent concentration data. There are numerous load estimation methods available, some of which are captured in various online tools. However, most estimators are subject to large biases statistically, and their associated uncertainties are often not reported. This makes interpretation difficult and the estimation of trends or determination of optimal sampling regimes impossible to assess. In this paper, we first propose two indices for measuring the extent of sampling bias, and then provide steps for obtaining reliable load estimates that minimizes the biases and makes use of informative predictive variables. The key step to this approach is in the development of an appropriate predictive model for concentration. This is achieved using a generalized rating-curve approach with additional predictors that capture unique features in the flow data, such as the concept of the first flush, the location of the event on the hydrograph (e.g. rise or fall) and the discounted flow. The latter may be thought of as a measure of constituent exhaustion occurring during flood events. Forming this additional information can significantly improve the predictability of concentration, and ultimately the precision with which the pollutant load is estimated. We also provide a measure of the standard error of the load estimate which incorporates model, spatial and/or temporal errors. This method also has the capacity to incorporate measurement error incurred through the sampling of flow. We illustrate this approach for two rivers delivering to the Great Barrier Reef, Queensland, Australia. One is a data set from the Burdekin River, and consists of the total suspended sediment (TSS) and nitrogen oxide (NO(x)) and gauged flow for 1997. The other dataset is from the Tully River, for the period of July 2000 to June 2008. For NO(x) Burdekin, the new estimates are very similar to the ratio estimates even when there is no relationship between the concentration and the flow. However, for the Tully dataset, by incorporating the additional predictive variables namely the discounted flow and flow phases (rising or recessing), we substantially improved the model fit, and thus the certainty with which the load is estimated.
Resumo:
The method of generalized estimating equations (GEEs) provides consistent estimates of the regression parameters in a marginal regression model for longitudinal data, even when the working correlation model is misspecified (Liang and Zeger, 1986). However, the efficiency of a GEE estimate can be seriously affected by the choice of the working correlation model. This study addresses this problem by proposing a hybrid method that combines multiple GEEs based on different working correlation models, using the empirical likelihood method (Qin and Lawless, 1994). Analyses show that this hybrid method is more efficient than a GEE using a misspecified working correlation model. Furthermore, if one of the working correlation structures correctly models the within-subject correlations, then this hybrid method provides the most efficient parameter estimates. In simulations, the hybrid method's finite-sample performance is superior to a GEE under any of the commonly used working correlation models and is almost fully efficient in all scenarios studied. The hybrid method is illustrated using data from a longitudinal study of the respiratory infection rates in 275 Indonesian children.
Resumo:
Despite great advances in very large scale integrated-circuit design and manufacturing, performance of even the best available high-speed, high-resolution analog-to-digital converter (ADC) is known to deteriorate while acquiring fast-rising, high-frequency, and nonrepetitive waveforms. Waveform digitizers (ADCs) used in high-voltage impulse recordings and measurements are invariably subjected to such waveforms. Errors resulting from a lowered ADC performance can be unacceptably high, especially when higher accuracies have to be achieved (e.g., when part of a reference measuring system). Static and dynamic nonlinearities (estimated independently) are vital indices for evaluating performance and suitability of ADCs to be used in such environments. Typically, the estimation of static nonlinearity involves 10-12 h of time or more (for a 12-b ADC) and the acquisition of millions of samples at high input frequencies for dynamic characterization. ADCs with even higher resolution and faster sampling speeds will soon become available. So, there is a need to reduce testing time for evaluating these parameters. This paper proposes a novel and time-efficient method for the simultaneous estimation of static and dynamic nonlinearity from a single test. This is achieved by conceiving a test signal, comprised of a high-frequency sinusoid (which addresses dynamic assessment) modulated by a low-frequency ramp (relevant to the static part). Details of implementation and results on two digitizers are presented and compared with nonlinearities determined by the existing standardized approaches. Good agreement in results and time savings achievable indicates its suitability.
Resumo:
Consider a general regression model with an arbitrary and unknown link function and a stochastic selection variable that determines whether the outcome variable is observable or missing. The paper proposes U-statistics that are based on kernel functions as estimators for the directions of the parameter vectors in the link function and the selection equation, and shows that these estimators are consistent and asymptotically normal.
Resumo:
We propose a simple method of constructing quasi-likelihood functions for dependent data based on conditional-mean-variance relationships, and apply the method to estimating the fractal dimension from box-counting data. Simulation studies were carried out to compare this method with the traditional methods. We also applied this technique to real data from fishing grounds in the Gulf of Carpentaria, Australia
Resumo:
Robust estimation often relies on a dispersion function that is more slowly varying at large values than the square function. However, the choice of tuning constant in dispersion functions may impact the estimation efficiency to a great extent. For a given family of dispersion functions such as the Huber family, we suggest obtaining the "best" tuning constant from the data so that the asymptotic efficiency is maximized. This data-driven approach can automatically adjust the value of the tuning constant to provide the necessary resistance against outliers. Simulation studies show that substantial efficiency can be gained by this data-dependent approach compared with the traditional approach in which the tuning constant is fixed. We briefly illustrate the proposed method using two datasets.
Resumo:
We consider estimation of mortality rates and growth parameters from length-frequency data of a fish stock and derive the underlying length distribution of the population and the catch when there is individual variability in the von Bertalanffy growth parameter L-infinity. The model is flexible enough to accommodate 1) any recruitment pattern as a function of both time and length, 2) length-specific selectivity, and 3) varying fishing effort over time. The maximum likelihood method gives consistent estimates, provided the underlying distribution for individual variation in growth is correctly specified. Simulation results indicate that our method is reasonably robust to violations in the assumptions. The method is applied to tiger prawn data (Penaeus semisulcatus) to obtain estimates of natural and fishing mortality.
Resumo:
This article develops a method for analysis of growth data with multiple recaptures when the initial ages for all individuals are unknown. The existing approaches either impute the initial ages or model them as random effects. Assumptions about the initial age are not verifiable because all the initial ages are unknown. We present an alternative approach that treats all the lengths including the length at first capture as correlated repeated measures for each individual. Optimal estimating equations are developed using the generalized estimating equations approach that only requires the first two moment assumptions. Explicit expressions for estimation of both mean growth parameters and variance components are given to minimize the computational complexity. Simulation studies indicate that the proposed method works well. Two real data sets are analyzed for illustration, one from whelks (Dicathais aegaota) and the other from southern rock lobster (Jasus edwardsii) in South Australia.
Resumo:
The method of generalised estimating equations for regression modelling of clustered outcomes allows for specification of a working matrix that is intended to approximate the true correlation matrix of the observations. We investigate the asymptotic relative efficiency of the generalised estimating equation for the mean parameters when the correlation parameters are estimated by various methods. The asymptotic relative efficiency depends on three-features of the analysis, namely (i) the discrepancy between the working correlation structure and the unobservable true correlation structure, (ii) the method by which the correlation parameters are estimated and (iii) the 'design', by which we refer to both the structures of the predictor matrices within clusters and distribution of cluster sizes. Analytical and numerical studies of realistic data-analysis scenarios show that choice of working covariance model has a substantial impact on regression estimator efficiency. Protection against avoidable loss of efficiency associated with covariance misspecification is obtained when a 'Gaussian estimation' pseudolikelihood procedure is used with an AR(1) structure.
Resumo:
We consider estimation of mortality rates and growth parameters from length-frequency data of a fish stock when there is individual variability in the von Bertalanffy growth parameter L-infinity and investigate the possible bias in the estimates when the individual variability is ignored. Three methods are examined: (i) the regression method based on the Beverton and Holt's (1956, Rapp. P.V. Reun. Cons. Int. Explor. Mer, 140: 67-83) equation; (ii) the moment method of Powell (1979, Rapp. PV. Reun. Int. Explor. Mer, 175: 167-169); and (iii) a generalization of Powell's method that estimates the individual variability to be incorporated into the estimation. It is found that the biases in the estimates from the existing methods are, in general, substantial, even when individual variability in growth is small and recruitment is uniform, and the generalized method performs better in terms of bias but is subject to a larger variation. There is a need to develop robust and flexible methods to deal with individual variability in the analysis of length-frequency data.
Resumo:
In the analysis of tagging data, it has been found that the least-squares method, based on the increment function known as the Fabens method, produces biased estimates because individual variability in growth is not allowed for. This paper modifies the Fabens method to account for individual variability in the length asymptote. Significance tests using t-statistics or log-likelihood ratio statistics may be applied to show the level of individual variability. Simulation results indicate that the modified method reduces the biases in the estimates to negligible proportions. Tagging data from tiger prawns (Penaeus esculentus and Penaeus semisulcatus) and rock lobster (Panulirus ornatus) are analysed as an illustration.
Resumo:
The von Bertalanffy growth model is extended to incorporate explanatory variables. The generalized model includes the switched growth model and the seasonal growth model as special cases, and can also be used to assess the tagging effect on growth. Distribution-free and consistent estimating functions are constructed for estimation of growth parameters from tag-recapture data in which age at release is unknown. This generalizes the work of James (1991, Biometrics 47 1519-1530) who considered the classical model and allowed for individual variability in growth. A real dataset from barramundi (Lates calcarifer) is analysed to estimate the growth parameters and possible effect of tagging on growth.