69 resultados para LINEAR-REGRESSION MODELS
em Queensland University of Technology - ePrints Archive
Resumo:
We consider the problem of how to construct robust designs for Poisson regression models. An analytical expression is derived for robust designs for first-order Poisson regression models where uncertainty exists in the prior parameter estimates. Given certain constraints in the methodology, it may be necessary to extend the robust designs for implementation in practical experiments. With these extensions, our methodology constructs designs which perform similarly, in terms of estimation, to current techniques, and offers the solution in a more timely manner. We further apply this analytic result to cases where uncertainty exists in the linear predictor. The application of this methodology to practical design problems such as screening experiments is explored. Given the minimal prior knowledge that is usually available when conducting such experiments, it is recommended to derive designs robust across a variety of systems. However, incorporating such uncertainty into the design process can be a computationally intense exercise. Hence, our analytic approach is explored as an alternative.
Resumo:
We consider the problem of how to construct robust designs for Poisson regression models. An analytical expression is derived for robust designs for first-order Poisson regression models where uncertainty exists in the prior parameter estimates. Given certain constraints in the methodology, it may be necessary to extend the robust designs for implementation in practical experiments. With these extensions, our methodology constructs designs which perform similarly, in terms of estimation, to current techniques, and offers the solution in a more timely manner. We further apply this analytic result to cases where uncertainty exists in the linear predictor. The application of this methodology to practical design problems such as screening experiments is explored. Given the minimal prior knowledge that is usually available when conducting such experiments, it is recommended to derive designs robust across a variety of systems. However, incorporating such uncertainty into the design process can be a computationally intense exercise. Hence, our analytic approach is explored as an alternative.
Resumo:
Existing crowd counting algorithms rely on holistic, local or histogram based features to capture crowd properties. Regression is then employed to estimate the crowd size. Insufficient testing across multiple datasets has made it difficult to compare and contrast different methodologies. This paper presents an evaluation across multiple datasets to compare holistic, local and histogram based methods, and to compare various image features and regression models. A K-fold cross validation protocol is followed to evaluate the performance across five public datasets: UCSD, PETS 2009, Fudan, Mall and Grand Central datasets. Image features are categorised into five types: size, shape, edges, keypoints and textures. The regression models evaluated are: Gaussian process regression (GPR), linear regression, K nearest neighbours (KNN) and neural networks (NN). The results demonstrate that local features outperform equivalent holistic and histogram based features; optimal performance is observed using all image features except for textures; and that GPR outperforms linear, KNN and NN regression
Resumo:
There has been considerable research conducted over the last 20 years focused on predicting motor vehicle crashes on transportation facilities. The range of statistical models commonly applied includes binomial, Poisson, Poisson-gamma (or negative binomial), zero-inflated Poisson and negative binomial models (ZIP and ZINB), and multinomial probability models. Given the range of possible modeling approaches and the host of assumptions with each modeling approach, making an intelligent choice for modeling motor vehicle crash data is difficult. There is little discussion in the literature comparing different statistical modeling approaches, identifying which statistical models are most appropriate for modeling crash data, and providing a strong justification from basic crash principles. In the recent literature, it has been suggested that the motor vehicle crash process can successfully be modeled by assuming a dual-state data-generating process, which implies that entities (e.g., intersections, road segments, pedestrian crossings, etc.) exist in one of two states—perfectly safe and unsafe. As a result, the ZIP and ZINB are two models that have been applied to account for the preponderance of “excess” zeros frequently observed in crash count data. The objective of this study is to provide defensible guidance on how to appropriate model crash data. We first examine the motor vehicle crash process using theoretical principles and a basic understanding of the crash process. It is shown that the fundamental crash process follows a Bernoulli trial with unequal probability of independent events, also known as Poisson trials. We examine the evolution of statistical models as they apply to the motor vehicle crash process, and indicate how well they statistically approximate the crash process. We also present the theory behind dual-state process count models, and note why they have become popular for modeling crash data. A simulation experiment is then conducted to demonstrate how crash data give rise to “excess” zeros frequently observed in crash data. It is shown that the Poisson and other mixed probabilistic structures are approximations assumed for modeling the motor vehicle crash process. Furthermore, it is demonstrated that under certain (fairly common) circumstances excess zeros are observed—and that these circumstances arise from low exposure and/or inappropriate selection of time/space scales and not an underlying dual state process. In conclusion, carefully selecting the time/space scales for analysis, including an improved set of explanatory variables and/or unobserved heterogeneity effects in count regression models, or applying small-area statistical methods (observations with low exposure) represent the most defensible modeling approaches for datasets with a preponderance of zeros
Resumo:
Traditional crash prediction models, such as generalized linear regression models, are incapable of taking into account the multilevel data structure, which extensively exists in crash data. Disregarding the possible within-group correlations can lead to the production of models giving unreliable and biased estimates of unknowns. This study innovatively proposes a -level hierarchy, viz. (Geographic region level – Traffic site level – Traffic crash level – Driver-vehicle unit level – Vehicle-occupant level) Time level, to establish a general form of multilevel data structure in traffic safety analysis. To properly model the potential cross-group heterogeneity due to the multilevel data structure, a framework of Bayesian hierarchical models that explicitly specify multilevel structure and correctly yield parameter estimates is introduced and recommended. The proposed method is illustrated in an individual-severity analysis of intersection crashes using the Singapore crash records. This study proved the importance of accounting for the within-group correlations and demonstrated the flexibilities and effectiveness of the Bayesian hierarchical method in modeling multilevel structure of traffic crash data.
Resumo:
An important aspect of robotic path planning for is ensuring that the vehicle is in the best location to collect the data necessary for the problem at hand. Given that features of interest are dynamic and move with oceanic currents, vehicle speed is an important factor in any planning exercises to ensure vehicles are at the right place at the right time. Here, we examine different Gaussian process models to find a suitable predictive kinematic model that enable the speed of an underactuated, autonomous surface vehicle to be accurately predicted given a set of input environmental parameters.
Resumo:
Visual localization in outdoor environments is often hampered by the natural variation in appearance caused by such things as weather phenomena, diurnal fluctuations in lighting, and seasonal changes. Such changes are global across an environment and, in the case of global light changes and seasonal variation, the change in appearance occurs in a regular, cyclic manner. Visual localization could be greatly improved if it were possible to predict the appearance of a particular location at a particular time, based on the appearance of the location in the past and knowledge of the nature of appearance change over time. In this paper, we investigate whether global appearance changes in an environment can be learned sufficiently to improve visual localization performance. We use time of day as a test case, and generate transformations between morning and afternoon using sample images from a training set. We demonstrate the learned transformation can be generalized from training data and show the resulting visual localization on a test set is improved relative to raw image comparison. The improvement in localization remains when the area is revisited several weeks later.
Resumo:
This paper develops a semiparametric estimation approach for mixed count regression models based on series expansion for the unknown density of the unobserved heterogeneity. We use the generalized Laguerre series expansion around a gamma baseline density to model unobserved heterogeneity in a Poisson mixture model. We establish the consistency of the estimator and present a computational strategy to implement the proposed estimation techniques in the standard count model as well as in truncated, censored, and zero-inflated count regression models. Monte Carlo evidence shows that the finite sample behavior of the estimator is quite good. The paper applies the method to a model of individual shopping behavior. © 1999 Elsevier Science S.A. All rights reserved.
Resumo:
Studies have examined the associations between cancers and circulating 25-hydroxyvitamin D [25(OH)D], but little is known about the impact of different laboratory practices on 25(OH)D concentrations. We examined the potential impact of delayed blood centrifuging, choice of collection tube, and type of assay on 25(OH)D concentrations. Blood samples from 20 healthy volunteers underwent alternative laboratory procedures: four centrifuging times (2, 24, 72, and 96 h after blood draw); three types of collection tubes (red top serum tube, two different plasma anticoagulant tubes containing heparin or EDTA); and two types of assays (DiaSorin radioimmunoassay [RIA] and chemiluminescence immunoassay [CLIA/LIAISON®]). Log-transformed 25(OH)D concentrations were analyzed using the generalized estimating equations (GEE) linear regression models. We found no difference in 25(OH)D concentrations by centrifuging times or type of assay. There was some indication of a difference in 25(OH)D concentrations by tube type in CLIA/LIAISON®-assayed samples, with concentrations in heparinized plasma (geometric mean, 16.1 ng ml−1) higher than those in serum (geometric mean, 15.3 ng ml−1) (p = 0.01), but the difference was significant only after substantial centrifuging delays (96 h). Our study suggests no necessity for requiring immediate processing of blood samples after collection or for the choice of a tube type or assay.
Resumo:
Background: While there is emerging evidence that sedentary behavior is negatively associated with health risk, research on the correlates of sitting time in adults is scarce. Methods: Self-report data from 7,724 women born between 1973-1978 and 8,198 women born between 1946-1951 were collected as part of the Australian Longitudinal Study on Women’s Health. Linear regression models were computed to examine whether demographic, family and caring duties, time use, health and health behavior variables were associated with weekday sitting time. Results: Mean sitting time (SD) was 6.60 (3.32) hours/day for the 1973-1978 cohort and 5.70 (3.04) hours/day for the 1946-1951 cohort. Indicators of socio-economic advantage, such as full11 time work and skilled occupations in both cohorts and university education in the mid-age cohort, were associated with high sitting time. A cluster of ‘healthy behaviours’ was associated with lower sitting time in the mid-aged women (moderate/high physical activity levels, non-smoking, non-drinking). For both cohorts, sitting time was highest in women in full-time work, in skilled occupations and in those who spent the most time in passive leisure. Conclusions: The results suggest that, in young and mid-aged women, interventions for reducing sitting time should focus on both occupational and leisure-time sitting.
Resumo:
In addition to the well-known health risks associated with lack of physical activity (PA), evidence is emerging about the health risks of sedentary behaviour (sitting). Research about patterns and correlates of sitting and PA in older women is scarce. METHODS: Self-report data from 6,116 women aged 76-81 years were collected as part of the Australian Longitudinal Study on Woman’s Health. Linear regression models were computed to examine whether demographic, social and health factors were associated with sitting and PA. RESULTS: Women who did no PA sat more than women who did any PA (p<0.001). Seven correlates were associated with sitting and PA (p<0.05). Five of these were associated with more sitting and less PA: three health-related (BMI, chronic conditions, anxiety/depression) and two social correlates (caring duties, volunteering). One demographic (being from another English-speaking country) and one social correlate (more social interaction) were associated with more sitting and more PA. Four correlates, two demographic (living in a city; post-high school education), one social (being single), and one health-related correlate (dizziness/loss of balance) were associated with more sitting only. Two other health-related correlates (stiff/painful joints; feet problems) were associated with less PA only. CONCLUSION: Sedentary behaviour and PA are distinct behaviours in older Australian women. Information about the correlates of both behaviours can be used to identify population groups who might benefit from interventions to reduce sedentary behaviour and/or increase PA.