981 resultados para random regression
Resumo:
In clinical practice, traditional X-ray radiography is widely used, and knowledge of landmarks and contours in anteroposterior (AP) pelvis X-rays is invaluable for computer aided diagnosis, hip surgery planning and image-guided interventions. This paper presents a fully automatic approach for landmark detection and shape segmentation of both pelvis and femur in conventional AP X-ray images. Our approach is based on the framework of landmark detection via Random Forest (RF) regression and shape regularization via hierarchical sparse shape composition. We propose a visual feature FL-HoG (Flexible- Level Histogram of Oriented Gradients) and a feature selection algorithm based on trace radio optimization to improve the robustness and the efficacy of RF-based landmark detection. The landmark detection result is then used in a hierarchical sparse shape composition framework for shape regularization. Finally, the extracted shape contour is fine-tuned by a post-processing step based on low level image features. The experimental results demonstrate that our feature selection algorithm reduces the feature dimension in a factor of 40 and improves both training and test efficiency. Further experiments conducted on 436 clinical AP pelvis X-rays show that our approach achieves an average point-to-curve error around 1.2 mm for femur and 1.9 mm for pelvis.
Finite mixture regression model with random effects: application to neonatal hospital length of stay
Resumo:
A two-component mixture regression model that allows simultaneously for heterogeneity and dependency among observations is proposed. By specifying random effects explicitly in the linear predictor of the mixture probability and the mixture components, parameter estimation is achieved by maximising the corresponding best linear unbiased prediction type log-likelihood. Approximate residual maximum likelihood estimates are obtained via an EM algorithm in the manner of generalised linear mixed model (GLMM). The method can be extended to a g-component mixture regression model with the component density from the exponential family, leading to the development of the class of finite mixture GLMM. For illustration, the method is applied to analyse neonatal length of stay (LOS). It is shown that identification of pertinent factors that influence hospital LOS can provide important information for health care planning and resource allocation. (C) 2002 Elsevier Science B.V. All rights reserved.
Resumo:
2000 Mathematics Subject Classification: 62J12, 62K15, 91B42, 62H99.
Resumo:
Poisson distribution has often been used for count like accident data. Negative Binomial (NB) distribution has been adopted in the count data to take care of the over-dispersion problem. However, Poisson and NB distributions are incapable of taking into account some unobserved heterogeneities due to spatial and temporal effects of accident data. To overcome this problem, Random Effect models have been developed. Again another challenge with existing traffic accident prediction models is the distribution of excess zero accident observations in some accident data. Although Zero-Inflated Poisson (ZIP) model is capable of handling the dual-state system in accident data with excess zero observations, it does not accommodate the within-location correlation and between-location correlation heterogeneities which are the basic motivations for the need of the Random Effect models. This paper proposes an effective way of fitting ZIP model with location specific random effects and for model calibration and assessment the Bayesian analysis is recommended.
Resumo:
Purpose: Flat-detector, cone-beam computed tomography (CBCT) has enormous potential to improve the accuracy of treatment delivery in image-guided radiotherapy (IGRT). To assist radiotherapists in interpreting these images, we use a Bayesian statistical model to label each voxel according to its tissue type. Methods: The rich sources of prior information in IGRT are incorporated into a hidden Markov random field (MRF) model of the 3D image lattice. Tissue densities in the reference CT scan are estimated using inverse regression and then rescaled to approximate the corresponding CBCT intensity values. The treatment planning contours are combined with published studies of physiological variability to produce a spatial prior distribution for changes in the size, shape and position of the tumour volume and organs at risk (OAR). The voxel labels are estimated using the iterated conditional modes (ICM) algorithm. Results: The accuracy of the method has been evaluated using 27 CBCT scans of an electron density phantom (CIRS, Inc. model 062). The mean voxel-wise misclassification rate was 6.2%, with Dice similarity coefficient of 0.73 for liver, muscle, breast and adipose tissue. Conclusions: By incorporating prior information, we are able to successfully segment CBCT images. This could be a viable approach for automated, online image analysis in radiotherapy.
Resumo:
Background: Random Breath Testing (RBT) is the main drink driving law enforcement tool used throughout Australia. International comparative research considers Australia to have the most successful RBT program compared to other countries in terms of crash reductions (Erke, Goldenbeld, & Vaa, 2009). This success is attributed to the programs high intensity (Erke et al., 2009). Our review of the extant literature suggests that there is no research evidence that indicates an optimal level of alcohol breath testing. That is, we suggest that no research exists to guide policy regarding whether or not there is a point at which alcohol related crashes reach a point of diminishing returns as a result of either saturated or targeted RBT testing. Aims: In this paper we first provide an examination of RBTs and alcohol related crashes across Australian jurisdictions. We then address the question of whether or not an optimal level of random breath testing exists by examining the relationship between the number of RBTs conducted and the occurrence of alcohol-related crashes over time, across all Australian states. Method: To examine the association between RBT rates and alcohol related crashes and to assess whether an optimal ratio of RBT tests per licenced drivers can be determined we draw on three administrative data sources form each jurisdiction. Where possible data collected spans January 1st 2000 to September 30th 2012. The RBT administrative dataset includes the number of Random Breath Tests (RBTs) conducted per month. The traffic crash administrative dataset contains aggregated monthly count of the number of traffic crashes where an individual’s recorded BAC reaches or exceeds 0.05g/ml of alcohol in blood. The licenced driver data were the monthly number of registered licenced drivers spanning January 2000 to December 2011. Results: The data highlights that the Australian story does not reflective of all States and territories. The stable RBT to licenced driver ratio in Queensland (of 1:1) suggests a stable rate of alcohol related crash data of 5.5 per 100,000 licenced drivers. Yet, in South Australia were a relative stable rate of RBT to licenced driver ratio of 1:2 is maintained the rate of alcohol related traffic crashes is substantially less at 3.7 per 100,000. We use joinpoint regression techniques and varying regression models to fit the data and compare the different patterns between jurisdictions. Discussion: The results of this study provide an updated review and evaluation of RBTs conducted in Australia and examines the association between RBTs and alcohol related traffic crashes. We also present an evidence base to guide policy decisions for RBT operations.
Resumo:
Cone-beam computed tomography (CBCT) has enormous potential to improve the accuracy of treatment delivery in image-guided radiotherapy (IGRT). To assist radiotherapists in interpreting these images, we use a Bayesian statistical model to label each voxel according to its tissue type. The rich sources of prior information in IGRT are incorporated into a hidden Markov random field model of the 3D image lattice. Tissue densities in the reference CT scan are estimated using inverse regression and then rescaled to approximate the corresponding CBCT intensity values. The treatment planning contours are combined with published studies of physiological variability to produce a spatial prior distribution for changes in the size, shape and position of the tumour volume and organs at risk. The voxel labels are estimated using iterated conditional modes. The accuracy of the method has been evaluated using 27 CBCT scans of an electron density phantom. The mean voxel-wise misclassification rate was 6.2\%, with Dice similarity coefficient of 0.73 for liver, muscle, breast and adipose tissue. By incorporating prior information, we are able to successfully segment CBCT images. This could be a viable approach for automated, online image analysis in radiotherapy.
Resumo:
A computationally efficient sequential Monte Carlo algorithm is proposed for the sequential design of experiments for the collection of block data described by mixed effects models. The difficulty in applying a sequential Monte Carlo algorithm in such settings is the need to evaluate the observed data likelihood, which is typically intractable for all but linear Gaussian models. To overcome this difficulty, we propose to unbiasedly estimate the likelihood, and perform inference and make decisions based on an exact-approximate algorithm. Two estimates are proposed: using Quasi Monte Carlo methods and using the Laplace approximation with importance sampling. Both of these approaches can be computationally expensive, so we propose exploiting parallel computational architectures to ensure designs can be derived in a timely manner. We also extend our approach to allow for model uncertainty. This research is motivated by important pharmacological studies related to the treatment of critically ill patients.
Resumo:
Submarine groundwater discharge (SGD) is an integral part of the hydrological cycle and represents an important aspect of land-ocean interactions. We used a numerical model to simulate flow and salt transport in a nearshore groundwater aquifer under varying wave conditions based on yearlong random wave data sets, including storm surge events. The results showed significant flow asymmetry with rapid response of influxes and retarded response of effluxes across the seabed to the irregular wave conditions. While a storm surge immediately intensified seawater influx to the aquifer, the subsequent return of intruded seawater to the sea, as part of an increased SGD, was gradual. Using functional data analysis, we revealed and quantified retarded, cumulative effects of past wave conditions on SGD including the fresh groundwater and recirculating seawater discharge components. The retardation was characterized well by a gamma distribution function regardless of wave conditions. The relationships between discharge rates and wave parameters were quantifiable by a regression model in a functional form independent of the actual irregular wave conditions. This statistical model provides a useful method for analyzing and predicting SGD from nearshore unconfined aquifers affected by random waves
Resumo:
Rank-based inference is widely used because of its robustness. This article provides optimal rank-based estimating functions in analysis of clustered data with random cluster effects. The extensive simulation studies carried out to evaluate the performance of the proposed method demonstrate that it is robust to outliers and is highly efficient given the existence of strong cluster correlations. The performance of the proposed method is satisfactory even when the correlation structure is misspecified, or when heteroscedasticity in variance is present. Finally, a real dataset is analyzed for illustration.
Resumo:
This article is motivated by a lung cancer study where a regression model is involved and the response variable is too expensive to measure but the predictor variable can be measured easily with relatively negligible cost. This situation occurs quite often in medical studies, quantitative genetics, and ecological and environmental studies. In this article, by using the idea of ranked-set sampling (RSS), we develop sampling strategies that can reduce cost and increase efficiency of the regression analysis for the above-mentioned situation. The developed method is applied retrospectively to a lung cancer study. In the lung cancer study, the interest is to investigate the association between smoking status and three biomarkers: polyphenol DNA adducts, micronuclei, and sister chromatic exchanges. Optimal sampling schemes with different optimality criteria such as A-, D-, and integrated mean square error (IMSE)-optimality are considered in the application. With set size 10 in RSS, the improvement of the optimal schemes over simple random sampling (SRS) is great. For instance, by using the optimal scheme with IMSE-optimality, the IMSEs of the estimated regression functions for the three biomarkers are reduced to about half of those incurred by using SRS.
Resumo:
This paper presents a method of partial automation of specification based regression testing, which we call ESSE (Explicit State Space Enumeration). The first step in ESSE method is the extraction of a finite state model of the system making use of an already tested version of the system under test (SUT). Thereafter, the finite state model thus obtained is used to compute good test sequences that can be used to regression test subsequent versions of the system. We present two new algorithms for test sequence computation - both based on our finite state model generated by the above method. We also provide the details and results of the experimental evaluation of ESSE method. Comparison with a practically used random-testing algorithm has shown substantial improvements.
Resumo:
Plant community ecologists use the null model approach to infer assembly processes from observed patterns of species co-occurrence. In about a third of published studies, the null hypothesis of random assembly cannot be rejected. When this occurs, plant ecologists interpret that the observed random pattern is not environmentally constrained - but probably generated by stochastic processes. The null model approach (using the C-score and the discrepancy index) was used to test for random assembly under two simulation algorithms. Logistic regression, distance-based redundancy analysis, and constrained ordination were used to test for environmental determinism (species segregation along environmental gradients or turnover and species aggregation). This article introduces an environmentally determined community of alpine hydrophytes that presents itself as randomly assembled. The pathway through which the random pattern arises in this community is suggested to be as follows: Two simultaneous environmental processes, one leading to species aggregation and the other leading to species segregation, concurrently generate the observed pattern, which results to be neither aggregated nor segregated - but random. A simulation study supports this suggestion. Although apparently simple, the null model approach seems to assume that a single ecological factor prevails or that if several factors decisively influence the community, then they all exert their influence in the same direction, generating either aggregation or segregation. As these assumptions are unlikely to hold in most cases and assembly processes cannot be inferred from random patterns, we would like to propose plant ecologists to investigate specifically the ecological processes responsible for observed random patterns, instead of trying to infer processes from patterns
Resumo:
Semisupervised dimensionality reduction has been attracting much attention as it not only utilizes both labeled and unlabeled data simultaneously, but also works well in the situation of out-of-sample. This paper proposes an effective approach of semisupervised dimensionality reduction through label propagation and label regression. Different from previous efforts, the new approach propagates the label information from labeled to unlabeled data with a well-designed mechanism of random walks, in which outliers are effectively detected and the obtained virtual labels of unlabeled data can be well encoded in a weighted regression model. These virtual labels are thereafter regressed with a linear model to calculate the projection matrix for dimensionality reduction. By this means, when the manifold or the clustering assumption of data is satisfied, the labels of labeled data can be correctly propagated to the unlabeled data; and thus, the proposed approach utilizes the labeled and the unlabeled data more effectively than previous work. Experimental results are carried out upon several databases, and the advantage of the new approach is well demonstrated.
Resumo:
In regression analysis of counts, a lack of simple and efficient algorithms for posterior computation has made Bayesian approaches appear unattractive and thus underdeveloped. We propose a lognormal and gamma mixed negative binomial (NB) regression model for counts, and present efficient closed-form Bayesian inference; unlike conventional Poisson models, the proposed approach has two free parameters to include two different kinds of random effects, and allows the incorporation of prior information, such as sparsity in the regression coefficients. By placing a gamma distribution prior on the NB dispersion parameter r, and connecting a log-normal distribution prior with the logit of the NB probability parameter p, efficient Gibbs sampling and variational Bayes inference are both developed. The closed-form updates are obtained by exploiting conditional conjugacy via both a compound Poisson representation and a Polya-Gamma distribution based data augmentation approach. The proposed Bayesian inference can be implemented routinely, while being easily generalizable to more complex settings involving multivariate dependence structures. The algorithms are illustrated using real examples. Copyright 2012 by the author(s)/owner(s).