Digital Library

926 results for Bayesian Mixture Model, Cavalieri Method, Trapezoidal Rule

Checking Assumptions in Latent Class Regression Models via a Markov Chain Monte Carlo Estimation Approach: An Application to Depression and Socio-Economic Status

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Latent class regression models are useful tools for assessing associations between covariates and latent variables. However, evaluation of key model assumptions cannot be performed using methods from standard regression models due to the unobserved nature of latent outcome variables. This paper presents graphical diagnostic tools to evaluate whether or not latent class regression models adhere to standard assumptions of the model: conditional independence and non-differential measurement. An integral part of these methods is the use of a Markov Chain Monte Carlo estimation procedure. Unlike standard maximum likelihood implementations for latent class regression model estimation, the MCMC approach allows us to calculate posterior distributions and point estimates of any functions of parameters. It is this convenience that allows us to provide the diagnostic methods that we introduce. As a motivating example we present an analysis focusing on the association between depression and socioeconomic status, using data from the Epidemiologic Catchment Area study. We consider a latent class regression analysis investigating the association between depression and socioeconomic status measures, where the latent variable depression is regressed on education and income indicators, in addition to age, gender, and marital status variables. While the fitted latent class regression model yields interesting results, the model parameters are found to be invalid due to the violation of model assumptions. The violation of these assumptions is clearly identified by the presented diagnostic plots. These methods can be applied to standard latent class and latent class regression models, and the general principle can be extended to evaluate model assumptions in other types of models.
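The point that MCMC output yields posterior distributions for any function of the parameters can be illustrated with a deliberately simple stand-in model. The sketch below is not the latent class regression model from the abstract: it assumes a binomial proportion with a flat prior, a hand-written random-walk Metropolis sampler, and hypothetical data (30 successes in 100 trials). Once draws of the parameter exist, a derived quantity such as the odds is summarized by transforming each draw.

```python
import math
import random
import statistics

def log_posterior(p, successes, trials):
    """Log posterior of a binomial proportion under a flat prior (illustrative model)."""
    if not 0.0 < p < 1.0:
        return -math.inf
    return successes * math.log(p) + (trials - successes) * math.log(1.0 - p)

def metropolis(successes, trials, n_draws=20000, step=0.1, seed=0):
    """Random-walk Metropolis sampler returning post-burn-in draws of p."""
    rng = random.Random(seed)
    p = 0.5
    current = log_posterior(p, successes, trials)
    draws = []
    for _ in range(n_draws):
        proposal = p + rng.gauss(0.0, step)
        candidate = log_posterior(proposal, successes, trials)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, candidate - current)):
            p, current = proposal, candidate
        draws.append(p)
    return draws[n_draws // 4:]  # drop the first quarter as burn-in

def summarize(values):
    """Posterior mean and a central 95% interval computed from the draws."""
    ordered = sorted(values)
    lower = ordered[int(0.025 * len(ordered))]
    upper = ordered[int(0.975 * len(ordered)) - 1]
    return statistics.fmean(ordered), (lower, upper)

# Hypothetical data: 30 successes in 100 trials.
draws = metropolis(successes=30, trials=100)

# Any function of the parameter is handled by transforming each draw.
odds = [p / (1.0 - p) for p in draws]

p_mean, p_interval = summarize(draws)
odds_mean, odds_interval = summarize(odds)
print("p:", p_mean, p_interval)
print("odds:", odds_mean, odds_interval)
```

The same pattern would apply to functions of latent class regression parameters, where a maximum likelihood fit would need a separate approximation for each derived quantity.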

RANDOM EFFECTS MODELS IN A META-ANALYSIS OF THE ACCURACY OF DIAGNOSTIC TESTS WITHOUT A GOLD STANDARD IN THE PRESENCE OF MISSING DATA

Relevance:

100.00% 100.00%

Publisher:

Abstract:

In evaluating the accuracy of diagnostic tests, it is common to apply two imperfect tests jointly or sequentially to a study population. In a recent meta-analysis of the accuracy of microsatellite instability testing (MSI) and traditional mutation analysis (MUT) in predicting germline mutations of the mismatch repair (MMR) genes, a Bayesian approach (Chen, Watson, and Parmigiani 2005) was proposed to handle missing data resulting from partial testing and the lack of a gold standard. In this paper, we demonstrate an improved estimation of the sensitivities and specificities of MSI and MUT by using a nonlinear mixed model and a Bayesian hierarchical model, both of which account for the heterogeneity across studies through study-specific random effects. The methods can be used to estimate the accuracy of two imperfect diagnostic tests in other meta-analyses when the prevalence of disease, the sensitivities and/or the specificities of diagnostic tests are heterogeneous among studies. Furthermore, simulation studies have demonstrated the importance of carefully selecting appropriate random effects for the estimation of diagnostic accuracy measurements in this scenario.

Adaptive sensing for target tracking applications

Relevance:

100.00% 100.00%

Publisher:

Abstract:

In a statistical inference scenario, the estimation of a target signal or its parameters is done by processing data from informative measurements. The estimation performance can be enhanced if we choose the measurements based on criteria that help to direct our sensing resources such that the measurements are more informative about the parameter we intend to estimate. When taking multiple measurements, the measurements can be chosen online so that more information can be extracted from the data in each measurement process. This approach fits well with the Bayesian inference model often used to produce successive posterior distributions of the associated parameter. We explore the sensor array processing scenario for adaptive sensing of a target parameter. The measurement choice is described by a measurement matrix that multiplies the data vector normally associated with array signal processing. The adaptive sensing of both static and dynamic system models is done by the online selection of a proper measurement matrix over time. For the dynamic system model, the target is assumed to move with some distribution and the prior distribution at each time step is changed. The information gained through adaptive sensing of the moving target is lost due to the relative shift of the target. The adaptive sensing paradigm has many similarities with compressive sensing. We have attempted to reconcile the two approaches by modifying the observation model of adaptive sensing to match the compressive sensing model for the estimation of a sparse vector.

Correlating the Ancient Maya and Modern European Calendars with High-Precision AMS 14C Dating

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The reasons for the development and collapse of Maya civilization remain controversial, and historical events carved on stone monuments throughout this region provide a remarkable source of data about the rise and fall of these complex polities. Use of these records depends on correlating the Maya and European calendars so that they can be compared with climate and environmental datasets. Correlation constants can vary by up to 1000 years and remain controversial. We report a series of high-resolution AMS C-14 dates on a wooden lintel collected from the Classic Period city of Tikal bearing Maya calendar dates. The radiocarbon dates were calibrated using a Bayesian statistical model and indicate that the dates were carved on the lintel between AD 658 and 696. This strongly supports the Goodman-Martinez-Thompson (GMT) correlation and the hypothesis that climate change played an important role in the development and demise of this complex civilization.

SURVIVAL PREDICTION FOR BRAIN TUMOR PATIENTS USING GENE EXPRESSION DATA

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Brain tumor is one of the most aggressive types of cancer in humans, with an estimated median survival time of 12 months and only 4% of the patients surviving more than 5 years after disease diagnosis. Until recently, brain tumor prognosis has been based only on clinical information such as tumor grade and patient age, but there are reports indicating that molecular profiling of gliomas can reveal subgroups of patients with distinct survival rates. We hypothesize that coupling molecular profiling of brain tumors with clinical information might improve predictions of patient survival time and, consequently, better guide future treatment decisions. In order to evaluate this hypothesis, the general goal of this research is to build models for survival prediction of glioma patients using DNA molecular profiles (U133 Affymetrix gene expression microarrays) along with clinical information. First, a predictive Random Forest model is built for binary outcomes (i.e. short vs. long-term survival) and a small subset of genes whose expression values can be used to predict survival time is selected. Next, a new statistical methodology is developed for predicting time-to-death outcomes using Bayesian ensemble trees. Due to a large heterogeneity observed within prognostic classes obtained by the Random Forest model, prediction can be improved by relating time-to-death with gene expression profile directly. We propose a Bayesian ensemble model for survival prediction which is appropriate for high-dimensional data such as gene expression data. Our approach is based on the ensemble "sum-of-trees" model, which is flexible enough to incorporate additive and interaction effects between genes. We specify a fully Bayesian hierarchical approach and illustrate our methodology for the CPH, Weibull, and AFT survival models. We overcome the lack of conjugacy using a latent variable formulation to model the covariate effects, which decreases computation time for model fitting.
Also, our proposed models provide a model-free way to select important predictive prognostic markers based on controlling false discovery rates. We compare the performance of our methods with baseline reference survival methods and apply our methodology to an unpublished data set of brain tumor survival times and gene expression data, selecting genes potentially related to the development of the disease under study. A closing discussion compares results obtained by Random Forest and Bayesian ensemble methods under the biological/clinical perspectives and highlights the statistical advantages and disadvantages of the new methodology in the context of DNA microarray data analysis.

Statistical characterization and development of summary measures of non-normal distributions of nuclear morphometry data from prostate cancer cells for use as prognostic indicators

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Nuclear morphometry (NM) uses image analysis to measure features of the cell nucleus which are classified as: bulk properties, shape or form, and DNA distribution. Studies have used these measurements as diagnostic and prognostic indicators of disease with inconclusive results. The distributional properties of these variables have not been systematically investigated although much of the medical data exhibit nonnormal distributions. Measurements are done on several hundred cells per patient so summary measurements reflecting the underlying distribution are needed. Distributional characteristics of 34 NM variables from prostate cancer cells were investigated using graphical and analytical techniques. Cells per sample ranged from 52 to 458. A small sample of patients with benign prostatic hyperplasia (BPH), representing non-cancer cells, was used for general comparison with the cancer cells. Data transformations such as log, square root and 1/x did not yield normality as measured by the Shapiro-Wilk test for normality. A modulus transformation, used for distributions having abnormal kurtosis values, also did not produce normality. Kernel density histograms of the 34 variables exhibited non-normality and 18 variables also exhibited bimodality. A bimodality coefficient was calculated and 3 variables: DNA concentration, shape and elongation, showed the strongest evidence of bimodality and were studied further. Two analytical approaches were used to obtain a summary measure for each variable for each patient: cluster analysis to determine significant clusters and a mixture model analysis using a two component model having a Gaussian distribution with equal variances. The mixture component parameters were used to bootstrap the log likelihood ratio to determine the significant number of components, 1 or 2. These summary measures were used as predictors of disease severity in several proportional odds logistic regression models.
The disease severity scale had 5 levels and was constructed of 3 components: extracapsular penetration (ECP), lymph node involvement (LN+) and seminal vesicle involvement (SV+) which represent surrogate measures of prognosis. The summary measures were not strong predictors of disease severity. There was some indication from the mixture model results that there were changes in mean levels and proportions of the components in the lower severity levels.
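A two-component Gaussian mixture with a shared variance, as described in the abstract, can be fitted with a short EM loop. The sketch below is an illustration under stated assumptions, not the dissertation's procedure: the data are simulated (hypothetical means 0 and 4, unit variance, 400 cells), the starting values are a crude quartile-based guess, and the bootstrap of the likelihood ratio is not shown, only the observed statistic comparing one and two components.

```python
import math
import random

def normal_pdf(x, mean, sd):
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def fit_two_component(data, n_iter=200):
    """EM for w*N(m1, sd^2) + (1 - w)*N(m2, sd^2) with a shared variance."""
    ordered = sorted(data)
    n = len(ordered)
    # Crude start: means at the quartiles, overall sd, equal weights.
    m1, m2 = ordered[n // 4], ordered[(3 * n) // 4]
    grand = sum(data) / n
    sd = math.sqrt(sum((x - grand) ** 2 for x in data) / n)
    w = 0.5
    for _ in range(n_iter):
        # E-step: probability that each point belongs to component 1.
        resp = []
        for x in data:
            a = w * normal_pdf(x, m1, sd)
            b = (1.0 - w) * normal_pdf(x, m2, sd)
            resp.append(a / (a + b))
        # M-step: weighted means, pooled variance, mixing proportion.
        n1 = sum(resp)
        m1 = sum(r * x for r, x in zip(resp, data)) / n1
        m2 = sum((1.0 - r) * x for r, x in zip(resp, data)) / (n - n1)
        var = sum(r * (x - m1) ** 2 + (1.0 - r) * (x - m2) ** 2
                  for r, x in zip(resp, data)) / n
        sd = math.sqrt(var)
        w = n1 / n
    return w, m1, m2, sd

def loglik_two(data, w, m1, m2, sd):
    return sum(math.log(w * normal_pdf(x, m1, sd) + (1.0 - w) * normal_pdf(x, m2, sd))
               for x in data)

def loglik_one(data):
    n = len(data)
    mean = sum(data) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in data) / n)
    return sum(math.log(normal_pdf(x, mean, sd)) for x in data)

# Simulated measurements for one hypothetical patient.
rng = random.Random(1)
cells = [rng.gauss(0.0, 1.0) if rng.random() < 0.4 else rng.gauss(4.0, 1.0)
         for _ in range(400)]

w, m1, m2, sd = fit_two_component(cells)
lr_stat = 2.0 * (loglik_two(cells, w, m1, m2, sd) - loglik_one(cells))
print(w, m1, m2, sd, lr_stat)
```

The fitted proportion and component means are the kind of per-patient summary measures the abstract refers to. The usual chi-square reference distribution does not hold for this likelihood ratio, which is one reason a bootstrap is used to judge it.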

Plant diversity effects on the water balance of an experimental grassland

Relevance:

100.00% 100.00%

Publisher:

Abstract:

In the literature, contrasting effects of plant species richness on the soil water balance are reported. Our objective was to assess the effects of plant species and functional richness and functional identity on soil water contents and water fluxes in the experimental grassland of the Jena Experiment. The Jena Experiment comprises 86 plots on which plant species richness (0, 1, 2, 4, 8, 16, and 60) and functional group composition (zero to four functional groups: legumes, grasses, tall herbs, and small herbs) were manipulated in a factorial design. We recorded meteorological data and soil water contents of the 0·0–0·3 and 0·3–0·7 m soil layers and calculated actual evapotranspiration (ETa), downward flux (DF), and capillary rise with a soil water balance model for the period 2003–2007. Missing water contents were estimated with a Bayesian hierarchical model. Species richness decreased water contents in subsoil during wet soil conditions. Presence of tall herbs increased soil water contents in topsoil during dry conditions and decreased soil water contents in subsoil during wet conditions. Presence of grasses generally decreased water contents in topsoil, particularly during dry phases; increased ETa and decreased DF from topsoil; and decreased ETa from subsoil. Presence of legumes, in contrast, decreased ETa and increased DF from topsoil and increased ETa from subsoil. Species richness probably resulted in complementary water use. Specific functional groups likely affected the water balance via specific root traits (e.g. shallow dense roots of grasses and deep taproots of tall herbs) or specific shading intensity caused by functional group effects on vegetation cover. Copyright © 2013 John Wiley & Sons, Ltd.

Tramadol and o-desmethyl tramadol clearance maturation and disposition in humans: a pooled pharmacokinetic study.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

BACKGROUND AND OBJECTIVES We aimed to study the impact of size, maturation and cytochrome P450 2D6 (CYP2D6) genotype activity score as predictors of intravenous tramadol disposition. METHODS Tramadol and O-desmethyl tramadol (M1) observations in 295 human subjects (postmenstrual age 25 weeks to 84.8 years, weight 0.5-186 kg) were pooled. A population pharmacokinetic analysis was performed using a two-compartment model for tramadol and two additional M1 compartments. Covariate analysis included weight, age, sex, disease characteristics (healthy subject or patient) and CYP2D6 genotype activity. A sigmoid maturation model was used to describe age-related changes in tramadol clearance (CLPO), M1 formation clearance (CLPM) and M1 elimination clearance (CLMO). A phenotype-based mixture model was used to identify CLPM polymorphism. RESULTS Differences in clearances were largely accounted for by maturation and size. The time to reach 50 % of adult clearance (TM50) values was used to describe maturation. CLPM (TM50 39.8 weeks) and CLPO (TM50 39.1 weeks) displayed fast maturation, while CLMO matured slower, similar to glomerular filtration rate (TM50 47 weeks). The phenotype-based mixture model identified a slow and a faster metabolizer group. Slow metabolizers comprised 9.8 % of subjects with 19.4 % of faster metabolizer CLPM. Low CYP2D6 genotype activity was associated with lower (25 %) than faster metabolizer CLPM, but only 32 % of those with low genotype activity were in the slow metabolizer group. CONCLUSIONS Maturation and size are key predictors of variability. A two-group polymorphism was identified based on phenotypic M1 formation clearance. Maturation of tramadol elimination occurs early (50 % of adult value at term gestation).
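The sigmoid maturation model and the TM50 values can be made concrete with a small calculation. The sketch below assumes the common Hill-type form, in which the fraction of adult clearance is PMA^Hill / (TM50^Hill + PMA^Hill). The abstract does not report a Hill coefficient, so the value used here is hypothetical, and size scaling is left out.

```python
def maturation_fraction(pma_weeks, tm50_weeks, hill):
    """Fraction of adult clearance reached at a given postmenstrual age (PMA)."""
    ratio = (pma_weeks / tm50_weeks) ** hill
    return ratio / (1.0 + ratio)

TM50_CLPO = 39.1  # weeks, tramadol clearance TM50 reported in the abstract
HILL = 3.0        # hypothetical Hill coefficient, not taken from the abstract

for pma in (25.0, 39.1, 52.0, 104.0):
    fraction = maturation_fraction(pma, TM50_CLPO, HILL)
    print(f"PMA {pma:6.1f} weeks -> {fraction:.2f} of adult clearance")
```

By construction the fraction is 0.5 when PMA equals TM50, which is why a TM50 near 39 weeks corresponds to half of adult clearance at about term gestation.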

Performance Evaluation of the New Connecticut Leading Employment Index Using Lead Profiles and BVAR Models

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Dua and Miller (1996) created leading and coincident employment indexes for the state of Connecticut, following Moore's (1981) work at the national level. The performance of the Dua-Miller indexes following the recession of the early 1990s fell short of expectations. This paper performs two tasks. First, it describes the process of revising the Connecticut Coincident and Leading Employment Indexes. Second, it analyzes the statistical properties and performance of the new indexes by comparing the lead profiles of the new and old indexes as well as their out-of-sample forecasting performance, using the Bayesian Vector Autoregressive (BVAR) method. The new indexes show improved performance in dating employment cycle chronologies. The lead profile test demonstrates that superiority in a rigorous, non-parametric statistical fashion. The mixed evidence on the BVAR forecasting experiments illustrates the truth in the Granger and Newbold (1986) caution that leading indexes properly predict cycle turning points and do not necessarily provide accurate forecasts except at turning points, a view that our results support.

Effects of air pollution on asthma hospitalization rates in different age groups in metropolitan cities of Korea

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Many studies have shown relationships between air pollution and the rate of hospital admissions for asthma. A few studies have controlled for age-specific effects by adding separate smoothing functions for each age group. However, it has not yet been reported whether air pollution effects are significantly different for different age groups. This lack of information is the motivation for this study, which tests the hypothesis that air pollution effects on asthmatic hospital admissions are significantly different by age groups. Each air pollutant's effect on asthmatic hospital admissions by age groups was estimated separately. In this study, daily time-series data for hospital admission rates from seven cities in Korea from June 1999 through 2003 were analyzed. The outcome variable, daily hospital admission rates for asthma, was related to five air pollutants which were used as the independent variables, namely particulate matter <10 micrometers (μm) in aerodynamic diameter (PM10), carbon monoxide (CO), ozone (O3), nitrogen dioxide (NO2), and sulfur dioxide (SO2). Meteorological variables were considered as confounders. Admission data were divided into three age groups: children (<15 years of age), adults (ages 15-64), and elderly (≥ 65 years of age). The adult age group was considered to be the reference group for each city. In order to estimate age-specific air pollution effects, the analysis was separated into two stages. In the first stage, Generalized Additive Models (GAMs) with cubic spline for smoothing were applied to estimate the age-city-specific air pollution effects on asthmatic hospital admission rates by city and age group. In the second stage, the Bayesian Hierarchical Model with non-informative prior which has large variance was used to combine city-specific effects by age groups. The hypothesis test showed that the effects of PM10, CO and NO2 were significantly different by age groups. 
Assuming that the air pollution effect for adults is zero as a reference, age-specific air pollution effects were: -0.00154 (95% confidence interval (CI) = (-0.0030, -0.0001)) for children and 0.00126 (95% CI = (0.0006, 0.0019)) for the elderly for PM10; -0.0195 (95% CI = (-0.0386, -0.0004)) for children for CO; and 0.00494 (95% CI = (0.0028, 0.0071)) for the elderly for NO2. Relative rates (RRs) were 1.008 (95% CI = (1.000-1.017)) in adults and 1.021 (95% CI = (1.012-1.030)) in the elderly for every 10 μg/m3 increase of PM10, 1.019 (95% CI = (1.005-1.033)) in adults and 1.022 (95% CI = (1.012-1.033)) in the elderly for every 0.1 part per million (ppm) increase of CO; 1.006 (95% CI = (1.002-1.009)) and 1.019 (95% CI = (1.007-1.032)) in the elderly for every 1 part per billion (ppb) increase of NO2 and SO2, respectively. Asthma hospital admissions were significantly increased for PM10 and CO in adults, and for PM10, CO, NO2 and SO2 in the elderly.
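Relative rates per pollutant increment are usually obtained from a log-linear coefficient as RR = exp(beta * increment). The sketch below assumes a log link, which the abstract does not state, and only converts between a coefficient and an RR. The function names are illustrative, and the single numeric input is the adult PM10 relative rate quoted in the abstract.

```python
import math

def relative_rate(beta, increment):
    """Relative rate for a given pollutant increment under a log-linear model."""
    return math.exp(beta * increment)

def coefficient_from_rr(rr, increment):
    """Per-unit log-rate coefficient implied by a reported relative rate."""
    return math.log(rr) / increment

# Adult PM10 relative rate from the abstract: 1.008 per 10 ug/m3.
beta_pm10 = coefficient_from_rr(1.008, 10.0)
print(beta_pm10)
print(relative_rate(beta_pm10, 10.0))
```

The same conversion applied to the bounds of a coefficient's interval gives the interval for the relative rate, because the exponential is monotone.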

Logistic regression with Markov chains as covariates

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The tobacco-specific nitrosamine 4-(methylnitrosamino)-1-(3-pyridyl)-1-butanone (NNK) is an obvious carcinogen for lung cancer. Since the cytokinesis-blocked micronucleus (CBMN) assay has been found to be extremely sensitive to NNK-induced genetic damage, it is a potentially important factor for predicting lung cancer risk. However, the association between lung cancer and NNK-induced genetic damage measured by the CBMN assay has not been rigorously examined. This research develops a methodology to model the chromosomal changes under NNK-induced genetic damage in a logistic regression framework in order to predict the occurrence of lung cancer. Since these chromosomal changes were usually not observed for very long due to laboratory cost and time, a resampling technique was applied to generate the Markov chain of the normal and the damaged cell for each individual. A joint likelihood between the resampled Markov chains and the logistic regression model including transition probabilities of this chain as covariates was established. Maximum likelihood estimation was applied to carry out the statistical test for comparison. The ability of this approach to increase discriminating power to predict lung cancer was compared to a baseline "non-genetic" model. Our method offered an option to understand the association between the dynamic cell information and lung cancer. Our study indicated that the extent of DNA damage/non-damage measured using the CBMN assay provides critical information that impacts public health studies of lung cancer risk. This novel statistical method could simultaneously estimate the process of DNA damage/non-damage and its relationship with lung cancer for each individual.
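The idea of using Markov chain transition probabilities as logistic regression covariates can be sketched in two steps. The abstract describes a joint likelihood with resampled chains; the simplified version below estimates the transition probabilities first and then plugs them into a logistic function. The state sequence and the coefficients are hypothetical.

```python
import math

def transition_probabilities(chain):
    """Estimate a 2x2 transition matrix from a sequence of states 0 (normal) and 1 (damaged)."""
    counts = [[0, 0], [0, 0]]
    for prev, nxt in zip(chain, chain[1:]):
        counts[prev][nxt] += 1
    probs = []
    for row in counts:
        total = sum(row)
        probs.append([c / total if total else 0.0 for c in row])
    return probs

def logistic_risk(intercept, coefs, covariates):
    """Logistic regression prediction for one individual."""
    linear = intercept + sum(b * x for b, x in zip(coefs, covariates))
    return 1.0 / (1.0 + math.exp(-linear))

# Hypothetical cell-state sequence for one individual.
chain = [0, 0, 1, 1, 0, 1, 0, 0, 0, 1]
probs = transition_probabilities(chain)
p_damage = probs[0][1]   # normal -> damaged
p_repair = probs[1][0]   # damaged -> normal

# Hypothetical coefficients; in practice these would be estimated from data.
risk = logistic_risk(-1.0, [2.0, -1.5], [p_damage, p_repair])
print(probs, risk)
```

Estimating both parts jointly, as the abstract describes, lets the uncertainty in the transition probabilities carry through to the regression coefficients, which this two-step version ignores.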

An Efficient Multiple Object Detection and Tracking Framework for Automatic Counting and Video Surveillance Applications

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Automatic visual object counting and video surveillance have important applications for home and business environments, such as security and management of access points. However, to obtain satisfactory performance these technologies need professional and expensive hardware, complex installations and setups, and the supervision of qualified workers. In this paper, an efficient visual detection and tracking framework is proposed for the tasks of object counting and surveillance, which meets the requirements of consumer electronics: off-the-shelf equipment, easy installation and configuration, and unsupervised working conditions. This is accomplished by a novel Bayesian tracking model that can manage multimodal distributions without explicitly computing the association between tracked objects and detections. In addition, it is robust to erroneous, distorted and missing detections. The proposed algorithm is compared with a recent work, also focused on consumer electronics, proving its superior performance.

Virtual Membrane Systems

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Within the membrane computing research field, there are many papers about software simulations and a few about hardware implementations. In both cases, algorithms for implementing membrane systems in software and hardware that try to take advantage of massive parallelism are implemented. P-systems are parallel and non-deterministic systems which simulate membrane behavior when processing information. This paper presents software techniques based on the proper utilization of the virtual memory of a computer. There is a study of how much virtual memory is necessary to host a membrane model. This method improves performance in terms of time.

An Interval-based Multiobjective Approach to Feature Subset Selection Using Joint Modeling of Objectives and Variables

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This paper studies feature subset selection in classification using a multiobjective estimation of distribution algorithm. We consider six functions, namely area under ROC curve, sensitivity, specificity, precision, F1 measure and Brier score, for evaluation of feature subsets and as the objectives of the problem. One of the characteristics of these objective functions is the existence of noise in their values that should be appropriately handled during optimization. Our proposed algorithm consists of two major techniques which are specially designed for the feature subset selection problem. The first one is a solution ranking method based on interval values to handle the noise in the objectives of this problem. The second one is a model estimation method for learning a joint probabilistic model of objectives and variables which is used to generate new solutions and advance through the search space. To simplify model estimation, l1 regularized regression is used to select a subset of problem variables before model learning. The proposed algorithm is compared with a well-known ranking method for interval-valued objectives and a standard multiobjective genetic algorithm. Particularly, the effects of the two new techniques are experimentally investigated. The experimental results show that the proposed algorithm is able to obtain comparable or better performance on the tested datasets.

Integration of congestion pricing and intertemporal preference rate in social welfare function

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Assessing social benefits in transport policy implementation has been studied by many researchers using theoretical or empirical measures. However, few of them measure social benefit using different discount rates, including the inter-temporal preference rate of users, the private investment discount rate and the inter-temporal preference rate of the government. In general, the social discount rate used is the same for all social actors. Therefore, this paper aims to assess a new method by integrating different types of discount rate belonging to different social actors in order to measure the real benefits of each actor in the short, medium and long term. A dynamic simulation is provided by a strategic Land-Use and Transport Interaction (LUTI) model. The method is tested by optimizing a cordon toll scheme in Madrid considering socio-economic efficiency and environmental criteria. Based on the modified social welfare function (WF), the effects on the measure of social benefits are estimated and compared with the classical WF results as well. The results of this research could be key to understanding the relationship between transport system policies and the distribution of benefits among social actors in a metropolitan context. The results show that the use of more suitable discount rates for each social actor had an effect on the selection and definition of the optimal congestion pricing strategy. The usefulness of the congestion toll measure declines more quickly over time.
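The difference between a classical welfare function with one social discount rate and a modified one with actor-specific rates can be shown with a small present-value calculation. Every number in the sketch below is hypothetical (rates, yearly benefit flows, the single social rate), and it stands in for the general idea only, not for the LUTI model or the Madrid case study.

```python
def present_value(flows, rate):
    """Discount yearly flows to year 0; flows[0] occurs in year 0."""
    return sum(flow / (1.0 + rate) ** year for year, flow in enumerate(flows))

# Hypothetical actors with their own discount rates and yearly net benefits.
actors = {
    "users": {"rate": 0.08, "flows": [-20.0, 15.0, 15.0, 15.0, 15.0]},
    "private_operator": {"rate": 0.10, "flows": [-50.0, 20.0, 20.0, 20.0, 20.0]},
    "government": {"rate": 0.03, "flows": [-10.0, 8.0, 8.0, 8.0, 8.0]},
}

def welfare_actor_rates(actors):
    """Modified welfare: each actor's flows are discounted at that actor's own rate."""
    return sum(present_value(a["flows"], a["rate"]) for a in actors.values())

def welfare_single_rate(actors, social_rate):
    """Classical welfare: one social discount rate for every actor."""
    return sum(present_value(a["flows"], social_rate) for a in actors.values())

modified = welfare_actor_rates(actors)
classical = welfare_single_rate(actors, social_rate=0.04)
print(modified, classical)
```

With these made-up inputs the two totals differ because users and the private operator discount future benefits more heavily than the single social rate does. This is the mechanism by which actor-specific rates can change the ranking of toll strategies.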
