57 resultados para WLT Estimators
Resumo:
We consider ranked-based regression models for clustered data analysis. A weighted Wilcoxon rank method is proposed to take account of within-cluster correlations and varying cluster sizes. The asymptotic normality of the resulting estimators is established. A method to estimate covariance of the estimators is also given, which can bypass estimation of the density function. Simulation studies are carried out to compare different estimators for a number of scenarios on the correlation structure, presence/absence of outliers and different correlation values. The proposed methods appear to perform well, in particular, the one incorporating the correlation in the weighting achieves the highest efficiency and robustness against misspecification of correlation structure and outliers. A real example is provided for illustration.
Resumo:
In analysis of longitudinal data, the variance matrix of the parameter estimates is usually estimated by the 'sandwich' method, in which the variance for each subject is estimated by its residual products. We propose smooth bootstrap methods by perturbing the estimating functions to obtain 'bootstrapped' realizations of the parameter estimates for statistical inference. Our extensive simulation studies indicate that the variance estimators by our proposed methods can not only correct the bias of the sandwich estimator but also improve the confidence interval coverage. We applied the proposed method to a data set from a clinical trial of antibiotics for leprosy.
Resumo:
We consider rank-based regression models for repeated measures. To account for possible withinsubject correlations, we decompose the total ranks into between- and within-subject ranks and obtain two different estimators based on between- and within-subject ranks. A simple perturbation method is then introduced to generate bootstrap replicates of the estimating functions and the parameter estimates. This provides a convenient way for combining the corresponding two types of estimating function for more efficient estimation.
Resumo:
This paper considers the one-sample sign test for data obtained from general ranked set sampling when the number of observations for each rank are not necessarily the same, and proposes a weighted sign test because observations with different ranks are not identically distributed. The optimal weight for each observation is distribution free and only depends on its associated rank. It is shown analytically that (1) the weighted version always improves the Pitman efficiency for all distributions; and (2) the optimal design is to select the median from each ranked set.
Resumo:
We consider the analysis of longitudinal data when the covariance function is modeled by additional parameters to the mean parameters. In general, inconsistent estimators of the covariance (variance/correlation) parameters will be produced when the "working" correlation matrix is misspecified, which may result in great loss of efficiency of the mean parameter estimators (albeit the consistency is preserved). We consider using different "Working" correlation models for the variance and the mean parameters. In particular, we find that an independence working model should be used for estimating the variance parameters to ensure their consistency in case the correlation structure is misspecified. The designated "working" correlation matrices should be used for estimating the mean and the correlation parameters to attain high efficiency for estimating the mean parameters. Simulation studies indicate that the proposed algorithm performs very well. We also applied different estimation procedures to a data set from a clinical trial for illustration.
Resumo:
Consider a general regression model with an arbitrary and unknown link function and a stochastic selection variable that determines whether the outcome variable is observable or missing. The paper proposes U-statistics that are based on kernel functions as estimators for the directions of the parameter vectors in the link function and the selection equation, and shows that these estimators are consistent and asymptotically normal.
Resumo:
Robust methods are useful in making reliable statistical inferences when there are small deviations from the model assumptions. The widely used method of the generalized estimating equations can be "robustified" by replacing the standardized residuals with the M-residuals. If the Pearson residuals are assumed to be unbiased from zero, parameter estimators from the robust approach are asymptotically biased when error distributions are not symmetric. We propose a distribution-free method for correcting this bias. Our extensive numerical studies show that the proposed method can reduce the bias substantially. Examples are given for illustration.
Resumo:
The approach of generalized estimating equations (GEE) is based on the framework of generalized linear models but allows for specification of a working matrix for modeling within-subject correlations. The variance is often assumed to be a known function of the mean. This article investigates the impacts of misspecifying the variance function on estimators of the mean parameters for quantitative responses. Our numerical studies indicate that (1) correct specification of the variance function can improve the estimation efficiency even if the correlation structure is misspecified; (2) misspecification of the variance function impacts much more on estimators for within-cluster covariates than for cluster-level covariates; and (3) if the variance function is misspecified, correct choice of the correlation structure may not necessarily improve estimation efficiency. We illustrate impacts of different variance functions using a real data set from cow growth.
Resumo:
Statistical methods are often used to analyse commercial catch and effort data to provide standardised fishing effort and/or a relative index of fish abundance for input into stock assessment models. Achieving reliable results has proved difficult in Australia's Northern Prawn Fishery (NPF), due to a combination of such factors as the biological characteristics of the animals, some aspects of the fleet dynamics, and the changes in fishing technology. For this set of data, we compared four modelling approaches (linear models, mixed models, generalised estimating equations, and generalised linear models) with respect to the outcomes of the standardised fishing effort or the relative index of abundance. We also varied the number and form of vessel covariates in the models. Within a subset of data from this fishery, modelling correlation structures did not alter the conclusions from simpler statistical models. The random-effects models also yielded similar results. This is because the estimators are all consistent even if the correlation structure is mis-specified, and the data set is very large. However, the standard errors from different models differed, suggesting that different methods have different statistical efficiency. We suggest that there is value in modelling the variance function and the correlation structure, to make valid and efficient statistical inferences and gain insight into the data. We found that fishing power was separable from the indices of prawn abundance only when we offset the impact of vessel characteristics at assumed values from external sources. This may be due to the large degree of confounding within the data, and the extreme temporal changes in certain aspects of individual vessels, the fleet and the fleet dynamics.
Resumo:
Purpose – Preliminary cost estimates for construction projects are often the basis of financial feasibility and budgeting decisions in the early stages of planning and for effective project control, monitoring and execution. The purpose of this paper is to identify and better understand the cost drivers and factors that contribute to the accuracy of estimates in residential construction projects from the developers’ perspective. Design/methodology/approach – The paper uses a literature review to determine the drivers that affect the accuracy of developers’ early stage cost estimates and the factors influencing the construction costs of residential construction projects. It used cost variance data and other supporting documentation collected from two case study projects in South East Queensland, Australia, along with semi-structured interviews conducted with the practitioners involved. Findings – It is found that many cost drivers or factors of cost uncertainty identified in the literature for large-scale projects are not as apparent and relevant for developers’ small-scale residential construction projects. Specifically, the certainty and completeness of project-specific information, suitability of historical cost data, contingency allowances, methods of estimating and the estimator’s level of experience significantly affect the accuracy of cost estimates. Developers of small-scale residential projects use pre-established and suitably priced bills of quantities as the prime estimating method, which is considered to be the most efficient and accurate method for standard house designs. However, this method needs to be backed with the expertise and experience of the estimator. Originality/value – There is a lack of research on the accuracy of developers’ early stage cost estimates and the relevance and applicability of cost drivers and factors in the residential construction projects. This research has practical significance for improving the accuracy of such preliminary cost estimates.
Resumo:
Cyclostationary analysis has proven effective in identifying signal components for diagnostic purposes. A key descriptor in this framework is the cyclic power spectrum, traditionally estimated by the averaged cyclic periodogram and the smoothed cyclic periodogram. A lengthy debate about the best estimator finally found a solution in a cornerstone work by Antoni, who proposed a unified form for the two families, thus allowing a detailed statistical study of their properties. Since then, the focus of cyclostationary research has shifted towards algorithms, in terms of computational efficiency and simplicity of implementation. Traditional algorithms have proven computationally inefficient and the sophisticated "cyclostationary" definition of these estimators slowed their spread in the industry. The only attempt to increase the computational efficiency of cyclostationary estimators is represented by the cyclic modulation spectrum. This indicator exploits the relationship between cyclostationarity and envelope analysis. The link with envelope analysis allows a leap in computational efficiency and provides a "way in" for the understanding by industrial engineers. However, the new estimator lies outside the unified form described above and an unbiased version of the indicator has not been proposed. This paper will therefore extend the analysis of envelope-based estimators of the cyclic spectrum, proposing a new approach to include them in the unified form of cyclostationary estimators. This will enable the definition of a new envelope-based algorithm and the detailed analysis of the properties of the cyclic modulation spectrum. The computational efficiency of envelope-based algorithms will be also discussed quantitatively for the first time in comparison with the averaged cyclic periodogram. Finally, the algorithms will be validated with numerical and experimental examples.
Resumo:
State-of-the-art image-set matching techniques typically implicitly model each image-set with a Gaussian distribution. Here, we propose to go beyond these representations and model image-sets as probability distribution functions (PDFs) using kernel density estimators. To compare and match image-sets, we exploit Csiszar´ f-divergences, which bear strong connections to the geodesic distance defined on the space of PDFs, i.e., the statistical manifold. Furthermore, we introduce valid positive definite kernels on the statistical manifold, which let us make use of more powerful classification schemes to match image-sets. Finally, we introduce a supervised dimensionality reduction technique that learns a latent space where f-divergences reflect the class labels of the data. Our experiments on diverse problems, such as video-based face recognition and dynamic texture classification, evidence the benefits of our approach over the state-of-the-art image-set matching methods.