34 resultados para Random regression
em University of Queensland eSpace - Australia
Finite mixture regression model with random effects: application to neonatal hospital length of stay
Resumo:
A two-component mixture regression model that allows simultaneously for heterogeneity and dependency among observations is proposed. By specifying random effects explicitly in the linear predictor of the mixture probability and the mixture components, parameter estimation is achieved by maximising the corresponding best linear unbiased prediction type log-likelihood. Approximate residual maximum likelihood estimates are obtained via an EM algorithm in the manner of generalised linear mixed model (GLMM). The method can be extended to a g-component mixture regression model with the component density from the exponential family, leading to the development of the class of finite mixture GLMM. For illustration, the method is applied to analyse neonatal length of stay (LOS). It is shown that identification of pertinent factors that influence hospital LOS can provide important information for health care planning and resource allocation. (C) 2002 Elsevier Science B.V. All rights reserved.
Resumo:
A significant problem in the collection of responses to potentially sensitive questions, such as relating to illegal, immoral or embarrassing activities, is non-sampling error due to refusal to respond or false responses. Eichhorn & Hayre (1983) suggested the use of scrambled responses to reduce this form of bias. This paper considers a linear regression model in which the dependent variable is unobserved but for which the sum or product with a scrambling random variable of known distribution, is known. The performance of two likelihood-based estimators is investigated, namely of a Bayesian estimator achieved through a Markov chain Monte Carlo (MCMC) sampling scheme, and a classical maximum-likelihood estimator. These two estimators and an estimator suggested by Singh, Joarder & King (1996) are compared. Monte Carlo results show that the Bayesian estimator outperforms the classical estimators in almost all cases, and the relative performance of the Bayesian estimator improves as the responses become more scrambled.
Resumo:
A two-component survival mixture model is proposed to analyse a set of ischaemic stroke-specific mortality data. The survival experience of stroke patients after index stroke may be described by a subpopulation of patients in the acute condition and another subpopulation of patients in the chronic phase. To adjust for the inherent correlation of observations due to random hospital effects, a mixture model of two survival functions with random effects is formulated. Assuming a Weibull hazard in both components, an EM algorithm is developed for the estimation of fixed effect parameters and variance components. A simulation study is conducted to assess the performance of the two-component survival mixture model estimators. Simulation results confirm the applicability of the proposed model in a small sample setting. Copyright (C) 2004 John Wiley Sons, Ltd.
Resumo:
A mixture model incorporating long-term survivors has been adopted in the field of biostatistics where some individuals may never experience the failure event under study. The surviving fractions may be considered as cured. In most applications, the survival times are assumed to be independent. However, when the survival data are obtained from a multi-centre clinical trial, it is conceived that the environ mental conditions and facilities shared within clinic affects the proportion cured as well as the failure risk for the uncured individuals. It necessitates a long-term survivor mixture model with random effects. In this paper, the long-term survivor mixture model is extended for the analysis of multivariate failure time data using the generalized linear mixed model (GLMM) approach. The proposed model is applied to analyse a numerical data set from a multi-centre clinical trial of carcinoma as an illustration. Some simulation experiments are performed to assess the applicability of the model based on the average biases of the estimates formed. Copyright (C) 2001 John Wiley & Sons, Ltd.
Resumo:
In this paper, we consider testing for additivity in a class of nonparametric stochastic regression models. Two test statistics are constructed and their asymptotic distributions are established. We also conduct a small sample study for one of the test statistics through a simulated example. (C) 2002 Elsevier Science (USA).
Resumo:
The modelling of inpatient length of stay (LOS) has important implications in health care studies. Finite mixture distributions are usually used to model the heterogeneous LOS distribution, due to a certain proportion of patients sustaining-a longer stay. However, the morbidity data are collected from hospitals, observations clustered within the same hospital are often correlated. The generalized linear mixed model approach is adopted to accommodate the inherent correlation via unobservable random effects. An EM algorithm is developed to obtain residual maximum quasi-likelihood estimation. The proposed hierarchical mixture regression approach enables the identification and assessment of factors influencing the long-stay proportion and the LOS for the long-stay patient subgroup. A neonatal LOS data set is used for illustration, (C) 2003 Elsevier Science Ltd. All rights reserved.
Resumo:
We investigate whether relative contributions of genetic and shared environmental factors are associated with an increased risk in melanoma. Data from the Queensland Familial Melanoma Project comprising 15,907 subjects arising from 1912 families were analyzed to estimate the additive genetic, common and unique environmental contributions to variation in the age at onset of melanoma. Two complementary approaches for analyzing correlated time-to-onset family data were considered: the generalized estimating equations (GEE) method in which one can estimate relationship-specific dependence simultaneously with regression coefficients that describe the average population response to changing covariates; and a subject-specific Bayesian mixed model in which heterogeneity in regression parameters is explicitly modeled and the different components of variation may be estimated directly. The proportional hazards and Weibull models were utilized, as both produce natural frameworks for estimating relative risks while adjusting for simultaneous effects of other covariates. A simple Markov Chain Monte Carlo method for covariate imputation of missing data was used and the actual implementation of the Bayesian model was based on Gibbs sampling using the free ware package BUGS. In addition, we also used a Bayesian model to investigate the relative contribution of genetic and environmental effects on the expression of naevi and freckles, which are known risk factors for melanoma.
Resumo:
Count data with excess zeros relative to a Poisson distribution are common in many biomedical applications. A popular approach to the analysis of such data is to use a zero-inflated Poisson (ZIP) regression model. Often, because of the hierarchical Study design or the data collection procedure, zero-inflation and lack of independence may occur simultaneously, which tender the standard ZIP model inadequate. To account for the preponderance of zero counts and the inherent correlation of observations, a class of multi-level ZIP regression model with random effects is presented. Model fitting is facilitated using an expectation-maximization algorithm, whereas variance components are estimated via residual maximum likelihood estimating equations. A score test for zero-inflation is also presented. The multi-level ZIP model is then generalized to cope with a more complex correlation structure. Application to the analysis of correlated count data from a longitudinal infant feeding study illustrates the usefulness of the approach.
Resumo:
We investigate here a modification of the discrete random pore model [Bhatia SK, Vartak BJ, Carbon 1996;34:1383], by including an additional rate constant which takes into account the different reactivity of the initial pore surface having attached functional groups and hydrogens, relative to the subsequently exposed surface. It is observed that the relative initial reactivity has a significant effect on the conversion and structural evolution, underscoring the importance of initial surface chemistry. The model is tested against experimental data on chemically controlled char oxidation and steam gasification at various temperatures. It is seen that the variations of the reaction rate and surface area with conversion are better represented by the present approach than earlier random pore models. The results clearly indicate the improvement of model predictions in the low conversion region, where the effect of the initially attached functional groups and hydrogens is more significant, particularly for char oxidation. It is also seen that, for the data examined, the initial surface chemistry is less important for steam gasification as compared to the oxidation reaction. Further development of the approach must also incorporate the dynamics of surface complexation, which is not considered here.
Resumo:
This article describes a method to turn astronomical imaging into a random number generator by using the positions of incident cosmic rays and hot pixels to generate bit streams. We subject the resultant bit streams to a battery of standard benchmark statistical tests for randomness and show that these bit streams are statistically the same as a perfect random bit stream. Strategies for improving and building upon this method are outlined.
Resumo:
The linear relationship between work accomplished (W-lim) and time to exhaustion (t(lim)) can be described by the equation: W-lim = a + CP.t(lim). Critical power (CP) is the slope of this line and is thought to represent a maximum rate of ATP synthesis without exhaustion, presumably an inherent characteristic of the aerobic energy system. The present investigation determined whether the choice of predictive tests would elicit significant differences in the estimated CP. Ten female physical education students completed, in random order and on consecutive days, five art-out predictive tests at preselected constant-power outputs. Predictive tests were performed on an electrically-braked cycle ergometer and power loadings were individually chosen so as to induce fatigue within approximately 1-10 mins. CP was derived by fitting the linear W-lim-t(lim) regression and calculated three ways: 1) using the first, third and fifth W-lim-t(lim) coordinates (I-135), 2) using coordinates from the three highest power outputs (I-123; mean t(lim) = 68-193 s) and 3) using coordinates from the lowest power outputs (I-345; mean t(lim) = 193-485 s). Repeated measures ANOVA revealed that CPI123 (201.0 +/- 37.9W) > CPI135 (176.1 +/- 27.6W) > CPI345 (164.0 +/- 22.8W) (P < 0.05). When the three sets of data were used to fit the hyperbolic Power-t(lim) regression, statistically significant differences between each CP were also found (P < 0.05). The shorter the predictive trials, the greater the slope of the W-lim-t(lim) regression; possibly because of the greater influence of 'aerobic inertia' on these trials. This may explain why CP has failed to represent a maximal, sustainable work rate. The present findings suggest that if CP is to represent the highest power output that an individual can maintain for a very long time without fatigue then CP should be calculated over a range of predictive tests in which the influence of aerobic inertia is minimised.
Resumo:
Genetic markers that distinguish fungal genotypes are important tools for genetic analysis of heterokaryosis and parasexual recombination in fungi. Random amplified polymorphic DNA (RAPD) markers that distinguish two races of biotype B of Colletotrichum gloeosporioides infecting the legume Stylosanthes guianensis were sought. Eighty-five arbitrary oligonucleotide primers were used to generate 895 RAPD bands but only two bands were found to be specifically amplified from DNA of the race 3 isolate. These two RAPD bands were used as DNA probes and hybridised only to DNA of the race 3 isolate. Both RAPD bands hybridised to a dispensable 1.2 Mb chromosome of the race 3 isolate. No other genotype-specific chromosomes or DNA sequences were identified in either the race 2 or race 3 isolates. The RAPD markers hybridised to a 2 Mb chromosome in all races of the genetically distinct biotype A pathogen which infects other species of Stylosanthes as well as S. guianensis. The experiments indicate that RAPD analysis is a potentially useful tool for obtaining genotype-and chromosome-specific DNA probes in closely related isolates of one biotype of this fungal pathogen.
Resumo:
This study examined the relationship between isokinetic hip extensor/hip flexor strength, 1-RM squat strength, and sprint running performance for both a sprint-trained and non-sprint-trained group. Eleven male sprinters and 8 male controls volunteered for the study. On the same day subjects ran 20-m sprints from both a stationary start and with a 50-m acceleration distance, completed isokinetic hip extension/flexion exercises at 1.05, 4.74, and 8.42 rad.s(-1), and had their squat strength estimated. Stepwise multiple regression analysis showed that equations for predicting both 20-m maximum velocity nm time and 20-m acceleration time may be calculated with an error of less than 0.05 sec using only isokinetic and squat strength data. However, a single regression equation for predicting both 20-m acceleration and maximum velocity run times from isokinetic or squat tests was not found. The regression analysis indicated that hip flexor strength at all test velocities was a better predictor of sprint running performance than hip extensor strength.
Resumo:
The majority of past and current individual-tree growth modelling methodologies have failed to characterise and incorporate structured stochastic components. Rather, they have relied on deterministic predictions or have added an unstructured random component to predictions. In particular, spatial stochastic structure has been neglected, despite being present in most applications of individual-tree growth models. Spatial stochastic structure (also called spatial dependence or spatial autocorrelation) eventuates when spatial influences such as competition and micro-site effects are not fully captured in models. Temporal stochastic structure (also called temporal dependence or temporal autocorrelation) eventuates when a sequence of measurements is taken on an individual-tree over time, and variables explaining temporal variation in these measurements are not included in the model. Nested stochastic structure eventuates when measurements are combined across sampling units and differences among the sampling units are not fully captured in the model. This review examines spatial, temporal, and nested stochastic structure and instances where each has been characterised in the forest biometry and statistical literature. Methodologies for incorporating stochastic structure in growth model estimation and prediction are described. Benefits from incorporation of stochastic structure include valid statistical inference, improved estimation efficiency, and more realistic and theoretically sound predictions. It is proposed in this review that individual-tree modelling methodologies need to characterise and include structured stochasticity. Possibilities for future research are discussed. (C) 2001 Elsevier Science B.V. All rights reserved.