966 resultados para density estimation
Resumo:
We consider estimating the total load from frequent flow data but less frequent concentration data. There are numerous load estimation methods available, some of which are captured in various online tools. However, most estimators are subject to large biases statistically, and their associated uncertainties are often not reported. This makes interpretation difficult and the estimation of trends or determination of optimal sampling regimes impossible to assess. In this paper, we first propose two indices for measuring the extent of sampling bias, and then provide steps for obtaining reliable load estimates that minimizes the biases and makes use of informative predictive variables. The key step to this approach is in the development of an appropriate predictive model for concentration. This is achieved using a generalized rating-curve approach with additional predictors that capture unique features in the flow data, such as the concept of the first flush, the location of the event on the hydrograph (e.g. rise or fall) and the discounted flow. The latter may be thought of as a measure of constituent exhaustion occurring during flood events. Forming this additional information can significantly improve the predictability of concentration, and ultimately the precision with which the pollutant load is estimated. We also provide a measure of the standard error of the load estimate which incorporates model, spatial and/or temporal errors. This method also has the capacity to incorporate measurement error incurred through the sampling of flow. We illustrate this approach for two rivers delivering to the Great Barrier Reef, Queensland, Australia. One is a data set from the Burdekin River, and consists of the total suspended sediment (TSS) and nitrogen oxide (NO(x)) and gauged flow for 1997. The other dataset is from the Tully River, for the period of July 2000 to June 2008. For NO(x) Burdekin, the new estimates are very similar to the ratio estimates even when there is no relationship between the concentration and the flow. However, for the Tully dataset, by incorporating the additional predictive variables namely the discounted flow and flow phases (rising or recessing), we substantially improved the model fit, and thus the certainty with which the load is estimated.
Resumo:
The method of generalized estimating equations (GEEs) provides consistent estimates of the regression parameters in a marginal regression model for longitudinal data, even when the working correlation model is misspecified (Liang and Zeger, 1986). However, the efficiency of a GEE estimate can be seriously affected by the choice of the working correlation model. This study addresses this problem by proposing a hybrid method that combines multiple GEEs based on different working correlation models, using the empirical likelihood method (Qin and Lawless, 1994). Analyses show that this hybrid method is more efficient than a GEE using a misspecified working correlation model. Furthermore, if one of the working correlation structures correctly models the within-subject correlations, then this hybrid method provides the most efficient parameter estimates. In simulations, the hybrid method's finite-sample performance is superior to a GEE under any of the commonly used working correlation models and is almost fully efficient in all scenarios studied. The hybrid method is illustrated using data from a longitudinal study of the respiratory infection rates in 275 Indonesian children.
Resumo:
We consider ranked-based regression models for clustered data analysis. A weighted Wilcoxon rank method is proposed to take account of within-cluster correlations and varying cluster sizes. The asymptotic normality of the resulting estimators is established. A method to estimate covariance of the estimators is also given, which can bypass estimation of the density function. Simulation studies are carried out to compare different estimators for a number of scenarios on the correlation structure, presence/absence of outliers and different correlation values. The proposed methods appear to perform well, in particular, the one incorporating the correlation in the weighting achieves the highest efficiency and robustness against misspecification of correlation structure and outliers. A real example is provided for illustration.
Resumo:
Despite great advances in very large scale integrated-circuit design and manufacturing, performance of even the best available high-speed, high-resolution analog-to-digital converter (ADC) is known to deteriorate while acquiring fast-rising, high-frequency, and nonrepetitive waveforms. Waveform digitizers (ADCs) used in high-voltage impulse recordings and measurements are invariably subjected to such waveforms. Errors resulting from a lowered ADC performance can be unacceptably high, especially when higher accuracies have to be achieved (e.g., when part of a reference measuring system). Static and dynamic nonlinearities (estimated independently) are vital indices for evaluating performance and suitability of ADCs to be used in such environments. Typically, the estimation of static nonlinearity involves 10-12 h of time or more (for a 12-b ADC) and the acquisition of millions of samples at high input frequencies for dynamic characterization. ADCs with even higher resolution and faster sampling speeds will soon become available. So, there is a need to reduce testing time for evaluating these parameters. This paper proposes a novel and time-efficient method for the simultaneous estimation of static and dynamic nonlinearity from a single test. This is achieved by conceiving a test signal, comprised of a high-frequency sinusoid (which addresses dynamic assessment) modulated by a low-frequency ramp (relevant to the static part). Details of implementation and results on two digitizers are presented and compared with nonlinearities determined by the existing standardized approaches. Good agreement in results and time savings achievable indicates its suitability.
Resumo:
Consider a general regression model with an arbitrary and unknown link function and a stochastic selection variable that determines whether the outcome variable is observable or missing. The paper proposes U-statistics that are based on kernel functions as estimators for the directions of the parameter vectors in the link function and the selection equation, and shows that these estimators are consistent and asymptotically normal.
Resumo:
We propose a simple method of constructing quasi-likelihood functions for dependent data based on conditional-mean-variance relationships, and apply the method to estimating the fractal dimension from box-counting data. Simulation studies were carried out to compare this method with the traditional methods. We also applied this technique to real data from fishing grounds in the Gulf of Carpentaria, Australia
Resumo:
Robust estimation often relies on a dispersion function that is more slowly varying at large values than the square function. However, the choice of tuning constant in dispersion functions may impact the estimation efficiency to a great extent. For a given family of dispersion functions such as the Huber family, we suggest obtaining the "best" tuning constant from the data so that the asymptotic efficiency is maximized. This data-driven approach can automatically adjust the value of the tuning constant to provide the necessary resistance against outliers. Simulation studies show that substantial efficiency can be gained by this data-dependent approach compared with the traditional approach in which the tuning constant is fixed. We briefly illustrate the proposed method using two datasets.
Resumo:
We consider estimation of mortality rates and growth parameters from length-frequency data of a fish stock and derive the underlying length distribution of the population and the catch when there is individual variability in the von Bertalanffy growth parameter L-infinity. The model is flexible enough to accommodate 1) any recruitment pattern as a function of both time and length, 2) length-specific selectivity, and 3) varying fishing effort over time. The maximum likelihood method gives consistent estimates, provided the underlying distribution for individual variation in growth is correctly specified. Simulation results indicate that our method is reasonably robust to violations in the assumptions. The method is applied to tiger prawn data (Penaeus semisulcatus) to obtain estimates of natural and fishing mortality.
Resumo:
The effects of fish density distribution and effort distribution on the overall catchability coefficient are examined. Emphasis is also on how aggregation and effort distribution interact to affect overall catch rate [catch per unit effort (cpue)]. In particular, it is proposed to evaluate three indices, the catchability index, the knowledge parameter, and the aggregation index, to describe the effectiveness of targeting and the effects on overall catchability in the stock area. Analytical expressions are provided so that these indices can easily be calculated. The average of the cpue calculated from small units where fishing is random is a better index for measuring the stock abundance. The overall cpue, the ratio of lumped catch and effort, together with the average cpue, can be used to assess the effectiveness of targeting. The proposed methods are applied to the commercial catch and effort data from the Australian northern prawn fishery. The indices are obtained assuming a power law for the effort distribution as an approximation of targeting during the fishing operation. Targeting increased catchability in some areas by 10%, which may have important implications on management advice.
Resumo:
This article develops a method for analysis of growth data with multiple recaptures when the initial ages for all individuals are unknown. The existing approaches either impute the initial ages or model them as random effects. Assumptions about the initial age are not verifiable because all the initial ages are unknown. We present an alternative approach that treats all the lengths including the length at first capture as correlated repeated measures for each individual. Optimal estimating equations are developed using the generalized estimating equations approach that only requires the first two moment assumptions. Explicit expressions for estimation of both mean growth parameters and variance components are given to minimize the computational complexity. Simulation studies indicate that the proposed method works well. Two real data sets are analyzed for illustration, one from whelks (Dicathais aegaota) and the other from southern rock lobster (Jasus edwardsii) in South Australia.
Resumo:
The method of generalised estimating equations for regression modelling of clustered outcomes allows for specification of a working matrix that is intended to approximate the true correlation matrix of the observations. We investigate the asymptotic relative efficiency of the generalised estimating equation for the mean parameters when the correlation parameters are estimated by various methods. The asymptotic relative efficiency depends on three-features of the analysis, namely (i) the discrepancy between the working correlation structure and the unobservable true correlation structure, (ii) the method by which the correlation parameters are estimated and (iii) the 'design', by which we refer to both the structures of the predictor matrices within clusters and distribution of cluster sizes. Analytical and numerical studies of realistic data-analysis scenarios show that choice of working covariance model has a substantial impact on regression estimator efficiency. Protection against avoidable loss of efficiency associated with covariance misspecification is obtained when a 'Gaussian estimation' pseudolikelihood procedure is used with an AR(1) structure.
Resumo:
We consider estimation of mortality rates and growth parameters from length-frequency data of a fish stock when there is individual variability in the von Bertalanffy growth parameter L-infinity and investigate the possible bias in the estimates when the individual variability is ignored. Three methods are examined: (i) the regression method based on the Beverton and Holt's (1956, Rapp. P.V. Reun. Cons. Int. Explor. Mer, 140: 67-83) equation; (ii) the moment method of Powell (1979, Rapp. PV. Reun. Int. Explor. Mer, 175: 167-169); and (iii) a generalization of Powell's method that estimates the individual variability to be incorporated into the estimation. It is found that the biases in the estimates from the existing methods are, in general, substantial, even when individual variability in growth is small and recruitment is uniform, and the generalized method performs better in terms of bias but is subject to a larger variation. There is a need to develop robust and flexible methods to deal with individual variability in the analysis of length-frequency data.
Resumo:
In the analysis of tagging data, it has been found that the least-squares method, based on the increment function known as the Fabens method, produces biased estimates because individual variability in growth is not allowed for. This paper modifies the Fabens method to account for individual variability in the length asymptote. Significance tests using t-statistics or log-likelihood ratio statistics may be applied to show the level of individual variability. Simulation results indicate that the modified method reduces the biases in the estimates to negligible proportions. Tagging data from tiger prawns (Penaeus esculentus and Penaeus semisulcatus) and rock lobster (Panulirus ornatus) are analysed as an illustration.
Resumo:
The von Bertalanffy growth model is extended to incorporate explanatory variables. The generalized model includes the switched growth model and the seasonal growth model as special cases, and can also be used to assess the tagging effect on growth. Distribution-free and consistent estimating functions are constructed for estimation of growth parameters from tag-recapture data in which age at release is unknown. This generalizes the work of James (1991, Biometrics 47 1519-1530) who considered the classical model and allowed for individual variability in growth. A real dataset from barramundi (Lates calcarifer) is analysed to estimate the growth parameters and possible effect of tagging on growth.
Resumo:
Records of shrimp growth and water quality made during 12 crops from each of 48 ponds, over a period of 6.5 years, were provided by a Queensland, Australia, commercial shrimp farm, These data were analysed with a new growth model derived from the Gompertz model. The results indicate that water temperature, mortality and pond age significantly affect growth rates. After 180 days, shrimp reach 34 g at constant 30 degrees C, but only 15 g after the same amount of time at 20 degrees C. Mortality, through thinning the density of shrimp in the ponds, increased the growth rate, but the effect is small. With continual production, growth rates at first remained steady, then appeared to decrease for the sixth and seventh crop, after which they have increased steadily with each crop. It appears that conservative pond management, together with a gradual improvement in husbandry techniques, particularly feed management, brought about this change. This has encouraging implications for the long-term sustainability of the farming methods used. The growth model can be used to predict productivity, and hence, profitability, of new aquaculture locations or new production strategies.