958 resultados para multivariate binary data
Resumo:
Investigators interested in whether a disease aggregates in families often collect case-control family data, which consist of disease status and covariate information for families selected via case or control probands. Here, we focus on the use of case-control family data to investigate the relative contributions to the disease of additive genetic effects (A), shared family environment (C), and unique environment (E). To this end, we describe a ACE model for binary family data and then introduce an approach to fitting the model to case-control family data. The structural equation model, which has been described previously, combines a general-family extension of the classic ACE twin model with a (possibly covariate-specific) liability-threshold model for binary outcomes. Our likelihood-based approach to fitting involves conditioning on the proband’s disease status, as well as setting prevalence equal to a pre-specified value that can be estimated from the data themselves if necessary. Simulation experiments suggest that our approach to fitting yields approximately unbiased estimates of the A, C, and E variance components, provided that certain commonly-made assumptions hold. These assumptions include: the usual assumptions for the classic ACE and liability-threshold models; assumptions about shared family environment for relative pairs; and assumptions about the case-control family sampling, including single ascertainment. When our approach is used to fit the ACE model to Austrian case-control family data on depression, the resulting estimate of heritability is very similar to those from previous analyses of twin data.
Resumo:
Triggered event-related functional magnetic resonance imaging requires sparse intervals of temporally resolved functional data acquisitions, whose initiation corresponds to the occurrence of an event, typically an epileptic spike in the electroencephalographic trace. However, conventional fMRI time series are greatly affected by non-steady-state magnetization effects, which obscure initial blood oxygen level-dependent (BOLD) signals. Here, conventional echo-planar imaging and a post-processing solution based on principal component analysis were employed to remove the dominant eigenimages of the time series, to filter out the global signal changes induced by magnetization decay and to recover BOLD signals starting with the first functional volume. This approach was compared with a physical solution using radiofrequency preparation, which nullifies magnetization effects. As an application of the method, the detectability of the initial transient BOLD response in the auditory cortex, which is elicited by the onset of acoustic scanner noise, was used to demonstrate that post-processing-based removal of magnetization effects allows to detect brain activity patterns identical with those obtained using the radiofrequency preparation. Using the auditory responses as an ideal experimental model of triggered brain activity, our results suggest that reducing the initial magnetization effects by removing a few principal components from fMRI data may be potentially useful in the analysis of triggered event-related echo-planar time series. The implications of this study are discussed with special caution to remaining technical limitations and the additional neurophysiological issues of the triggered acquisition.
Resumo:
Genomic alterations have been linked to the development and progression of cancer. The technique of Comparative Genomic Hybridization (CGH) yields data consisting of fluorescence intensity ratios of test and reference DNA samples. The intensity ratios provide information about the number of copies in DNA. Practical issues such as the contamination of tumor cells in tissue specimens and normalization errors necessitate the use of statistics for learning about the genomic alterations from array-CGH data. As increasing amounts of array CGH data become available, there is a growing need for automated algorithms for characterizing genomic profiles. Specifically, there is a need for algorithms that can identify gains and losses in the number of copies based on statistical considerations, rather than merely detect trends in the data. We adopt a Bayesian approach, relying on the hidden Markov model to account for the inherent dependence in the intensity ratios. Posterior inferences are made about gains and losses in copy number. Localized amplifications (associated with oncogene mutations) and deletions (associated with mutations of tumor suppressors) are identified using posterior probabilities. Global trends such as extended regions of altered copy number are detected. Since the posterior distribution is analytically intractable, we implement a Metropolis-within-Gibbs algorithm for efficient simulation-based inference. Publicly available data on pancreatic adenocarcinoma, glioblastoma multiforme and breast cancer are analyzed, and comparisons are made with some widely-used algorithms to illustrate the reliability and success of the technique.
Resumo:
There is an emerging interest in modeling spatially correlated survival data in biomedical and epidemiological studies. In this paper, we propose a new class of semiparametric normal transformation models for right censored spatially correlated survival data. This class of models assumes that survival outcomes marginally follow a Cox proportional hazard model with unspecified baseline hazard, and their joint distribution is obtained by transforming survival outcomes to normal random variables, whose joint distribution is assumed to be multivariate normal with a spatial correlation structure. A key feature of the class of semiparametric normal transformation models is that it provides a rich class of spatial survival models where regression coefficients have population average interpretation and the spatial dependence of survival times is conveniently modeled using the transformed variables by flexible normal random fields. We study the relationship of the spatial correlation structure of the transformed normal variables and the dependence measures of the original survival times. Direct nonparametric maximum likelihood estimation in such models is practically prohibited due to the high dimensional intractable integration of the likelihood function and the infinite dimensional nuisance baseline hazard parameter. We hence develop a class of spatial semiparametric estimating equations, which conveniently estimate the population-level regression coefficients and the dependence parameters simultaneously. We study the asymptotic properties of the proposed estimators, and show that they are consistent and asymptotically normal. The proposed method is illustrated with an analysis of data from the East Boston Ashma Study and its performance is evaluated using simulations.
Resumo:
Multiple outcomes data are commonly used to characterize treatment effects in medical research, for instance, multiple symptoms to characterize potential remission of a psychiatric disorder. Often either a global, i.e. symptom-invariant, treatment effect is evaluated. Such a treatment effect may over generalize the effect across the outcomes. On the other hand individual treatment effects, varying across all outcomes, are complicated to interpret, and their estimation may lose precision relative to a global summary. An effective compromise to summarize the treatment effect may be through patterns of the treatment effects, i.e. "differentiated effects." In this paper we propose a two-category model to differentiate treatment effects into two groups. A model fitting algorithm and simulation study are presented, and several methods are developed to analyze heterogeneity presenting in the treatment effects. The method is illustrated using an analysis of schizophrenia symptom data.
Resumo:
Objective. To examine effects of primary care physicians (PCPs) and patients on the association between charges for primary care and specialty care in a point-of-service (POS) health plan. Data Source. Claims from 1996 for 3,308 adult male POS plan members, each of whom was assigned to one of the 50 family practitioner-PCPs with the largest POS plan member-loads. Study Design. A hierarchical multivariate two-part model was fitted using a Gibbs sampler to estimate PCPs' effects on patients' annual charges for two types of services, primary care and specialty care, the associations among PCPs' effects, and within-patient associations between charges for the two services. Adjusted Clinical Groups (ACGs) were used to adjust for case-mix. Principal Findings. PCPs with higher case-mix adjusted rates of specialist use were less likely to see their patients at least once during the year (estimated correlation: –.40; 95% CI: –.71, –.008) and provided fewer services to patients that they saw (estimated correlation: –.53; 95% CI: –.77, –.21). Ten of 11 PCPs whose case-mix adjusted effects on primary care charges were significantly less than or greater than zero (p < .05) had estimated, case-mix adjusted effects on specialty care charges that were of opposite sign (but not significantly different than zero). After adjustment for ACG and PCP effects, the within-patient, estimated odds ratio for any use of primary care given any use of specialty care was .57 (95% CI: .45, .73). Conclusions. PCPs and patients contributed independently to a trade-off between utilization of primary care and specialty care. The trade-off appeared to partially offset significant differences in the amount of care provided by PCPs. These findings were possible because we employed a hierarchical multivariate model rather than separate univariate models.
Resumo:
OBJECTIVE: To describe the electronic medical databases used in antiretroviral therapy (ART) programmes in lower-income countries and assess the measures such programmes employ to maintain and improve data quality and reduce the loss of patients to follow-up. METHODS: In 15 countries of Africa, South America and Asia, a survey was conducted from December 2006 to February 2007 on the use of electronic medical record systems in ART programmes. Patients enrolled in the sites at the time of the survey but not seen during the previous 12 months were considered lost to follow-up. The quality of the data was assessed by computing the percentage of missing key variables (age, sex, clinical stage of HIV infection, CD4+ lymphocyte count and year of ART initiation). Associations between site characteristics (such as number of staff members dedicated to data management), measures to reduce loss to follow-up (such as the presence of staff dedicated to tracing patients) and data quality and loss to follow-up were analysed using multivariate logit models. FINDINGS: Twenty-one sites that together provided ART to 50 060 patients were included (median number of patients per site: 1000; interquartile range, IQR: 72-19 320). Eighteen sites (86%) used an electronic database for medical record-keeping; 15 (83%) such sites relied on software intended for personal or small business use. The median percentage of missing data for key variables per site was 10.9% (IQR: 2.0-18.9%) and declined with training in data management (odds ratio, OR: 0.58; 95% confidence interval, CI: 0.37-0.90) and weekly hours spent by a clerk on the database per 100 patients on ART (OR: 0.95; 95% CI: 0.90-0.99). About 10 weekly hours per 100 patients on ART were required to reduce missing data for key variables to below 10%. The median percentage of patients lost to follow-up 1 year after starting ART was 8.5% (IQR: 4.2-19.7%). Strategies to reduce loss to follow-up included outreach teams, community-based organizations and checking death registry data. Implementation of all three strategies substantially reduced losses to follow-up (OR: 0.17; 95% CI: 0.15-0.20). CONCLUSION: The quality of the data collected and the retention of patients in ART treatment programmes are unsatisfactory for many sites involved in the scale-up of ART in resource-limited settings, mainly because of insufficient staff trained to manage data and trace patients lost to follow-up.
Resumo:
BACKGROUND AND OBJECTIVES: Combination antiretroviral therapy (cART) is changing, and this may affect the type and occurrence of side effects. We examined the frequency of lipodystrophy (LD) and weight changes in relation to the use of specific drugs in the Swiss HIV Cohort Study (SHCS). METHODS: In the SHCS, patients are followed twice a year and scored by the treating physician as having 'fat accumulation', 'fat loss', or neither. Treatments, and reasons for change thereof, are recorded. Our study sample included all patients treated with cART between 2003 and 2006 and, in addition, all patients who started cART between 2000 and 2003. RESULTS: From 2003 to 2006, the percentage of patients taking stavudine, didanosine and nelfinavir decreased, the percentage taking lopinavir, nevirapine and efavirenz remained stable, and the percentage taking atazanavir and tenofovir increased by 18.7 and 22.2%, respectively. In life-table Kaplan-Meier analysis, patients starting cART in 2003-2006 were less likely to develop LD than those starting cART from 2000 to 2002 (P<0.02). LD was quoted as the reason for treatment change or discontinuation for 4% of patients on cART in 2003, and for 1% of patients treated in 2006 (P for trend <0.001). In univariate and multivariate regression analysis, patients with a weight gain of >or=5 kg were more likely to take lopinavir or atazanavir than patients without such a weight gain [odds ratio (OR) 2, 95% confidence interval (CI) 1.3-2.9, and OR 1.7, 95% CI 1.3-2.1, respectively]. CONCLUSIONS: LD has become less frequent in the SHCS from 2000 to 2006. A weight gain of more than 5 kg was associated with the use of atazanavir and lopinavir.
Resumo:
The development of coronary vasculopathy is the main determinant of long-term survival in cardiac transplantation. The identification of risk factors, therefore, seems necessary in order to identify possible treatment strategies. Ninety-five out of 397 patients, undergoing orthotopic cardiac transplantation from 10/1985 to 10/1992 were evaluated retrospectively on the basis of perioperative and postoperative variables including age, sex, diagnosis, previous operations, renal function, cholesterol levels, dosage of immunosuppressive drugs (cyclosporin A, azathioprine, steroids), incidence of rejection, treatment with calcium channel blockers at 3, 6, 12, and 18 months postoperatively. Coronary vasculopathy was assessed by annual angiography at 1 and 2 years postoperatively. After univariate analysis, data were evaluated by stepwise multiple logistic regression analysis. Coronary vasculopathy was assessed in 15 patients at 1 (16%), and in 23 patients (24%) at 2, years. On multivariate analysis, previous operations and the incidence of rejections were identified as significant risk factors (P < 0.05), whereas the underlying diagnosis had borderline significance (P = 0.058) for the development of graft coronary vasculopathy. In contrast, all other variables were not significant in our subset of patients investigated. We therefore conclude that the development of coronary vasculopathy in cardiac transplant patients mainly depends on the rejection process itself, aside from patient-dependent factors. Therapeutic measures, such as the administration of calcium channel blockers and regulation of lipid disorders, may therefore only reduce the progress of native atherosclerotic disease in the posttransplant setting.
Resumo:
OBJECTIVES: This paper is concerned with checking goodness-of-fit of binary logistic regression models. For the practitioners of data analysis, the broad classes of procedures for checking goodness-of-fit available in the literature are described. The challenges of model checking in the context of binary logistic regression are reviewed. As a viable solution, a simple graphical procedure for checking goodness-of-fit is proposed. METHODS: The graphical procedure proposed relies on pieces of information available from any logistic analysis; the focus is on combining and presenting these in an informative way. RESULTS: The information gained using this approach is presented with three examples. In the discussion, the proposed method is put into context and compared with other graphical procedures for checking goodness-of-fit of binary logistic models available in the literature. CONCLUSION: A simple graphical method can significantly improve the understanding of any logistic regression analysis and help to prevent faulty conclusions.
Resumo:
Questionnaire data may contain missing values because certain questions do not apply to all respondents. For instance, questions addressing particular attributes of a symptom, such as frequency, triggers or seasonality, are only applicable to those who have experienced the symptom, while for those who have not, responses to these items will be missing. This missing information does not fall into the category 'missing by design', rather the features of interest do not exist and cannot be measured regardless of survey design. Analysis of responses to such conditional items is therefore typically restricted to the subpopulation in which they apply. This article is concerned with joint multivariate modelling of responses to both unconditional and conditional items without restricting the analysis to this subpopulation. Such an approach is of interest when the distributions of both types of responses are thought to be determined by common parameters affecting the whole population. By integrating the conditional item structure into the model, inference can be based both on unconditional data from the entire population and on conditional data from subjects for whom they exist. This approach opens new possibilities for multivariate analysis of such data. We apply this approach to latent class modelling and provide an example using data on respiratory symptoms (wheeze and cough) in children. Conditional data structures such as that considered here are common in medical research settings and, although our focus is on latent class models, the approach can be applied to other multivariate models.