996 results for quantile function
Abstract:
Hot spot identification (HSID) aims to identify potential sites (roadway segments, intersections, crosswalks, interchanges, ramps, etc.) with disproportionately high crash risk relative to similar sites. An inefficient HSID methodology might result in either identifying a safe site as high risk (false positive) or a high-risk site as safe (false negative), and consequently lead to the misuse of available public funds, to poor investment decisions, and to inefficient risk management practice. Current HSID methods suffer from issues such as underreporting of minor injury and property damage only (PDO) crashes, the difficulty of incorporating crash severity into the methodology, and the selection of a proper safety performance function to model crash data that is often heavily skewed by a preponderance of zeros. Addressing these challenges, this paper proposes a combination of a PDO equivalency calculation and a quantile regression technique to identify hot spots in a transportation network. In particular, issues related to underreporting and crash severity are tackled by incorporating equivalent PDO crashes, whilst the concerns related to the non-count nature of equivalent PDO crashes and the skewness of crash data are addressed by the non-parametric quantile regression technique. The proposed method identifies covariate effects on various quantiles of a population, rather than on the population mean like most methods in practice, which more closely corresponds with how black spots are identified in practice. The proposed methodology is illustrated using rural road segment data from Korea and compared against the traditional empirical Bayes (EB) method with negative binomial (NB) regression.
Applying a quantile regression model to equivalent PDO crashes enables identification of a set of high-risk sites that reflect the true safety costs to society, simultaneously reduces the influence of under-reported PDO and minor injury crashes, and overcomes the limitation of the traditional NB model in dealing with a preponderance of zeros or a right-skewed dataset.
Abstract:
We aim to design strategies for sequential decision making that adjust to the difficulty of the learning problem. We study this question both in the setting of prediction with expert advice, and for more general combinatorial decision tasks. We are not satisfied with just guaranteeing minimax regret rates, but we want our algorithms to perform significantly better on easy data. Two popular ways to formalize such adaptivity are second-order regret bounds and quantile bounds. The underlying notions of 'easy data', which may be paraphrased as "the learning problem has small variance" and "multiple decisions are useful", are synergetic. But even though there are sophisticated algorithms that exploit one of the two, no existing algorithm is able to adapt to both. In this paper we outline a new method for obtaining such adaptive algorithms, based on a potential function that aggregates a range of learning rates (which are essential tuning parameters). By choosing the right prior we construct efficient algorithms and show that they reap both benefits by proving the first bounds that are both second-order and incorporate quantiles.
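For orientation, the prediction-with-expert-advice setting can be sketched with the classical exponential-weights (Hedge) forecaster. This is a baseline with a single fixed learning rate, not the method of the abstract, which aggregates a range of learning rates. The function name, the toy losses and the choice of `eta` are illustrative assumptions.

```python
import math

def hedge(loss_rounds, eta):
    """Exponential-weights forecaster with one fixed learning rate eta.

    loss_rounds: list of per-round loss vectors, one loss in [0, 1] per expert.
    Returns (learner's cumulative expected loss, cumulative loss per expert).
    """
    n_experts = len(loss_rounds[0])
    cum_losses = [0.0] * n_experts
    learner_loss = 0.0
    for losses in loss_rounds:
        # Weights are proportional to exp(-eta * cumulative loss); subtracting
        # the minimum keeps the exponentials numerically stable.
        best = min(cum_losses)
        weights = [math.exp(-eta * (c - best)) for c in cum_losses]
        total = sum(weights)
        probs = [w / total for w in weights]
        # Expected loss of a learner that samples an expert from probs.
        learner_loss += sum(p * l for p, l in zip(probs, losses))
        cum_losses = [c + l for c, l in zip(cum_losses, losses)]
    return learner_loss, cum_losses

# Toy "easy" data (assumed): expert 0 never errs, the other two always do.
n_rounds, n_experts = 50, 3
rounds = [[0.0, 1.0, 1.0]] * n_rounds
eta = math.sqrt(8 * math.log(n_experts) / n_rounds)  # standard worst-case tuning
learner_loss, expert_losses = hedge(rounds, eta)
regret = learner_loss - min(expert_losses)
```

With this worst-case tuning the regret is at most `sqrt(n_rounds * ln(n_experts) / 2)`, whatever the data. Adaptive bounds of the kind the abstract describes aim to improve on this guarantee when the data are easy.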
Abstract:
This thesis studies quantile residuals and uses different methodologies to develop test statistics that are applicable in evaluating linear and nonlinear time series models based on continuous distributions. Models based on mixtures of distributions are of special interest because it turns out that for those models traditional residuals, often referred to as Pearson's residuals, are not appropriate. As such models have become more and more popular in practice, especially with financial time series data, there is a need for reliable diagnostic tools that can be used to evaluate them. The aim of the thesis is to show how such diagnostic tools can be obtained and used in model evaluation. The quantile residuals considered here are defined in such a way that, when the model is correctly specified and its parameters are consistently estimated, they are approximately independent with a standard normal distribution. All the tests derived in the thesis are pure significance type tests and are theoretically sound in that they properly take the uncertainty caused by parameter estimation into account. -- In Chapter 2 a general framework based on the likelihood function and smooth functions of univariate quantile residuals is derived that can be used to obtain misspecification tests for various purposes. Three easy-to-use tests aimed at detecting non-normality, autocorrelation, and conditional heteroscedasticity in quantile residuals are formulated. It also turns out that these tests can be interpreted as Lagrange Multiplier or score tests so that they are asymptotically optimal against local alternatives. Chapter 3 extends the concept of quantile residuals to multivariate models. The framework of Chapter 2 is generalized and tests aimed at detecting non-normality, serial correlation, and conditional heteroscedasticity in multivariate quantile residuals are derived based on it.
Score test interpretations are obtained for the serial correlation and conditional heteroscedasticity tests and, in a rather restricted special case, for the normality test. In Chapter 4 the tests are constructed using the empirical distribution function of quantile residuals. Khmaladze's martingale transformation is applied in order to eliminate the uncertainty caused by parameter estimation. Various test statistics are considered so that critical bounds for histogram type plots as well as Quantile-Quantile and Probability-Probability type plots of quantile residuals are obtained. Chapters 2, 3, and 4 contain simulations and empirical examples which illustrate the finite sample size and power properties of the derived tests and also how the tests and related graphical tools based on residuals are applied in practice.
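The definition of a quantile residual can be sketched as follows: evaluate the model's cumulative distribution function at the observation, then map the result through the inverse standard normal CDF. The sketch assumes a two-component Gaussian mixture with known, fixed parameters; in the time series models of the thesis the distribution would be conditional on the past and the parameters estimated. All function names and parameter values are hypothetical.

```python
from statistics import NormalDist

def mixture_cdf(y, weights, means, sds):
    """CDF of a Gaussian mixture: the weighted sum of the component CDFs."""
    return sum(w * NormalDist(m, s).cdf(y)
               for w, m, s in zip(weights, means, sds))

def quantile_residual(y, weights, means, sds, eps=1e-12):
    """Quantile residual r = Phi^{-1}(F(y)) for the assumed mixture model."""
    u = mixture_cdf(y, weights, means, sds)
    # inv_cdf is only defined on the open interval (0, 1), so clip extremes.
    u = min(max(u, eps), 1.0 - eps)
    return NormalDist().inv_cdf(u)

# Assumed mixture: equal weights, components centred at -1 and +1.
weights, means, sds = [0.5, 0.5], [-1.0, 1.0], [1.0, 1.0]
residuals = [quantile_residual(y, weights, means, sds)
             for y in (-2.0, 0.0, 2.0)]
```

If the model is correct, residuals computed this way are standard normal, which is why normality, autocorrelation and heteroscedasticity tests on them can serve as model diagnostics.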
Abstract:
Reliability analysis is a well-established branch of statistics that deals with the statistical study of different aspects of the lifetimes of a system of components. As we pointed out earlier, the major part of the theory and applications of reliability analysis has been discussed in terms of measures based on the distribution function. In the beginning chapters of the thesis, we have described some attractive features of quantile functions and the relevance of their use in reliability analysis. Motivated by the works of Parzen (1979), Freimer et al. (1988) and Gilchrist (2000), who indicated the scope of quantile functions in reliability analysis, and as a follow-up to the systematic study in this connection by Nair and Sankaran (2009), in the present work we tried to extend their ideas to develop the necessary theoretical framework for lifetime data analysis. In Chapter 1, we have given the relevance and scope of the study and a brief outline of the work we have carried out. Chapter 2 of this thesis is devoted to the presentation of various concepts and their brief reviews, which were useful for the discussions in the subsequent chapters. In the introduction of Chapter 4, we have pointed out the role of ageing concepts in reliability analysis and in identifying life distributions. In Chapter 6, we have studied the first two L-moments of residual life and their relevance in various applications of reliability analysis. We have shown that the first L-moment of the residual function is equivalent to the vitality function, which has been widely discussed in the literature. In Chapter 7, we have defined percentile residual life in reversed time (RPRL) and derived its relationship with the reversed hazard rate (RHR). We have discussed the characterization problem of RPRL and demonstrated with an example that the RPRL for a given percentile does not determine the distribution uniquely.
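As an illustration of a quantile-based reliability measure, the sketch below computes the hazard quantile function H(u) = 1 / ((1 - u) q(u)), where q(u) = Q'(u) is the quantile density. This definition is standard in the quantile-based reliability literature, but the abstract does not state which measures the thesis uses. The numerical differentiation, the exponential example and all names are illustrative choices.

```python
import math

def hazard_quantile(Q, u, h=1e-6):
    """Hazard quantile function H(u) = 1 / ((1 - u) * q(u)).

    Q is a quantile function on (0, 1); its derivative q (the quantile
    density) is approximated here by a central finite difference.
    """
    q = (Q(u + h) - Q(u - h)) / (2.0 * h)
    return 1.0 / ((1.0 - u) * q)

rate = 2.0  # assumed rate of an exponential lifetime distribution

def exp_quantile(u):
    """Quantile function of the exponential distribution with the given rate."""
    return -math.log(1.0 - u) / rate

# The exponential distribution has a constant hazard equal to its rate,
# so H(u) should be close to 2.0 at every u.
values = [hazard_quantile(exp_quantile, u) for u in (0.1, 0.5, 0.9)]
```

Working with Q directly is convenient for models that have a closed-form quantile function but no closed-form distribution function.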
Determinants of fruit and vegetable intake in England: a re-examination based on quantile regression
Abstract:
Objective: To examine the sociodemographic determinants of fruit and vegetable (F&V) consumption in England and determine the differential effects of socioeconomic variables at various parts of the intake distribution, with a special focus on severely inadequate intakes. Design: Quantile regression, expressing F&V intake as a function of sociodemographic variables, is employed. Here, quantile regression flexibly allows variables such as ethnicity to exert effects on F&V intake that vary depending on existing levels of intake. Setting: The 2003 Health Survey of England. Subjects: Data were from 11044 adult individuals. Results: The influence of particular sociodemographic variables is found to vary significantly across the intake distribution. We conclude that women consume more F&V than men, Asians and Blacks more than Whites, and co-habiting individuals more than single-living ones. Increased incomes and education also boost intake. However, the key general finding of the present study is that the influence of most variables is relatively weak in the area of greatest concern, i.e. among those with the most inadequate intakes in any reference group. Conclusions: Our findings emphasise the importance of allowing the effects of socio-economic drivers to vary across the intake distribution. The main finding, that variables which exert significant influence on F&V intake at other parts of the conditional distribution have a relatively weak influence at the lower tail, is cause for concern. It implies that in any defined group, those consuming the least F&V are hard to influence using campaigns or policy levers.
Abstract:
The Normal Quantile Transform (NQT) has been used in many hydrological and meteorological applications in order to make the Cumulative Distribution Function (CDF) of the observed, simulated and forecast river discharge, water level or precipitation data Gaussian. It is also the heart of the meta-Gaussian model for assessing the total predictive uncertainty of the Hydrological Uncertainty Processor (HUP) developed by Krzysztofowicz. In the field of geo-statistics this transformation is better known as the Normal-Score Transform. In this paper some possible problems caused by small sample sizes when applying the NQT in flood forecasting systems will be discussed, and a novel way to solve the problem will be outlined by combining extreme value analysis and non-parametric regression methods. The method will be illustrated by examples of hydrological stream-flow forecasts.
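A minimal sketch of the empirical NQT: rank the sample, convert ranks to plotting positions, and map them through the inverse standard normal CDF. The plotting position rank / (n + 1) is one common convention and is an assumption here, as are the function name and the toy discharge values. Ties are not treated specially.

```python
from statistics import NormalDist

def normal_quantile_transform(values):
    """Map each value to the standard normal quantile of its plotting position.

    Uses rank / (n + 1) as the plotting position (an assumed convention).
    Tied values receive distinct ranks in input order.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    std_normal = NormalDist()
    transformed = [0.0] * n
    for rank, idx in enumerate(order, start=1):
        transformed[idx] = std_normal.inv_cdf(rank / (n + 1))
    return transformed

# Hypothetical, strongly skewed discharge sample (units arbitrary).
discharge = [12.0, 850.0, 35.0, 20.0, 110.0]
z = normal_quantile_transform(discharge)
```

The transform is defined only at the observed values, so a forecast beyond the largest observation has no image without some form of tail extrapolation. With a small sample, that situation arises easily.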
Abstract:
The estimation of labor supply elasticities has been an important issue in the economic literature. Yet all works have estimated conditional mean labor supply functions only. The objective of this paper is to obtain more information on labor supply by estimating the conditional quantile labor supply function. We use a sample of prime-age urban male employees in Brazil. Two-stage estimators are used, as the net wage and virtual income are found to be endogenous to the model. Contrary to previous works using conditional mean estimators, it is found that labor supply elasticities vary significantly and asymmetrically across hours of work. While the income and wage elasticities at the standard work week are zero, for those working longer hours the elasticities are negative.
Abstract:
Direct quantile regression involves estimating a given quantile of a response variable as a function of input variables. We present a new framework for direct quantile regression where a Gaussian process model is learned, minimising the expected tilted loss function. The integration required in learning is not analytically tractable, so to speed up the learning we employ the Expectation Propagation algorithm. We describe how this work relates to other quantile regression methods and apply the method to both synthetic and real data sets. The method is shown to be competitive with state-of-the-art methods whilst allowing for the leverage of the full Gaussian process probabilistic framework.
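The tilted (pinball) loss that underlies quantile regression can be sketched as below. It shows only the loss and its link to sample quantiles, not the Gaussian process model or the Expectation Propagation step of the abstract. The function names and data are illustrative.

```python
def tilted_loss(y, prediction, tau):
    """Tilted (pinball) loss for quantile level tau in (0, 1)."""
    diff = y - prediction
    # Under-predictions are weighted by tau, over-predictions by (1 - tau).
    return tau * diff if diff >= 0 else (tau - 1.0) * diff

def best_constant(ys, tau):
    """Constant prediction minimising the total tilted loss over the sample.

    A minimiser is always attained at one of the data points, so searching
    over the observed values is sufficient.
    """
    return min(ys, key=lambda c: sum(tilted_loss(y, c, tau) for y in ys))

data = [1.0, 2.0, 3.0, 4.0, 100.0]
median_fit = best_constant(data, 0.5)  # 3.0, the sample median
upper_fit = best_constant(data, 0.9)   # 100.0, the largest observation
```

Minimising this loss over a function class instead of over constants yields an estimate of the conditional quantile, which is the sense in which the regression is "direct".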
Abstract:
Fleck and Johnson (Int. J. Mech. Sci. 29 (1987) 507) and Fleck et al. (Proc. Inst. Mech. Eng. 206 (1992) 119) have developed foil rolling models which allow for large deformations in the roll profile, including the possibility that the rolls flatten completely. However, these models require computationally expensive iterative solution techniques. A new approach to the approximate solution of the Fleck et al. (1992) Influence Function Model has been developed using both analytic and approximation techniques. The numerical difficulties arising from solving an integral equation in the flattened region have been reduced by applying an Inverse Hilbert Transform to get an analytic expression for the pressure. The method described in this paper is applicable to cases where there is or there is not a flat region.
Abstract:
A new method for estimating the time to colonization of Methicillin-resistant Staphylococcus Aureus (MRSA) patients is developed in this paper. The time to colonization of MRSA is modelled using a Bayesian smoothing approach for the hazard function. There are two prior models discussed in this paper: the first difference prior and the second difference prior. The second difference prior model gives smoother estimates of the hazard functions and, when applied to data from an intensive care unit (ICU), clearly shows an increasing hazard up to day 13, then a decreasing hazard. The results clearly demonstrate that the hazard is not constant and provide a useful quantification of the effect of length of stay on the risk of MRSA colonization.