958 resultados para multivariate binary data
Resumo:
OBJECTIVES Patients with inflammatory bowel disease (IBD) have a high resource consumption, with considerable costs for the healthcare system. In a system with sparse resources, treatment is influenced not only by clinical judgement but also by resource consumption. We aimed to determine the resource consumption of IBD patients and to identify its significant predictors. MATERIALS AND METHODS Data from the prospective Swiss Inflammatory Bowel Disease Cohort Study were analysed for the resource consumption endpoints hospitalization and outpatient consultations at enrolment [1187 patients; 41.1% ulcerative colitis (UC), 58.9% Crohn's disease (CD)] and at 1-year follow-up (794 patients). Predictors of interest were chosen through an expert panel and a review of the relevant literature. Logistic regressions were used for binary endpoints, and negative binomial regressions and zero-inflated Poisson regressions were used for count data. RESULTS For CD, fistula, use of biologics and disease activity were significant predictors for hospitalization days (all P-values <0.001); age, sex, steroid therapy and biologics were significant predictors for the number of outpatient visits (P=0.0368, 0.023, 0.0002, 0.0003, respectively). For UC, biologics, C-reactive protein, smoke quitters, age and sex were significantly predictive for hospitalization days (P=0.0167, 0.0003, 0.0003, 0.0076 and 0.0175 respectively); disease activity and immunosuppressive therapy predicted the number of outpatient visits (P=0.0009 and 0.0017, respectively). The results of multivariate regressions are shown in detail. CONCLUSION Several highly significant clinical predictors for resource consumption in IBD were identified that might be considered in medical decision-making. In terms of resource consumption and its predictors, CD and UC show a different behaviour.
Resumo:
Objective: Processes occurring in the course of psychotherapy are characterized by the simple fact that they unfold in time and that the multiple factors engaged in change processes vary highly between individuals (idiographic phenomena). Previous research, however, has neglected the temporal perspective by its traditional focus on static phenomena, which were mainly assessed at the group level (nomothetic phenomena). To support a temporal approach, the authors introduce time-series panel analysis (TSPA), a statistical methodology explicitly focusing on the quantification of temporal, session-to-session aspects of change in psychotherapy. TSPA-models are initially built at the level of individuals and are subsequently aggregated at the group level, thus allowing the exploration of prototypical models. Method: TSPA is based on vector auto-regression (VAR), an extension of univariate auto-regression models to multivariate time-series data. The application of TSPA is demonstrated in a sample of 87 outpatient psychotherapy patients who were monitored by postsession questionnaires. Prototypical mechanisms of change were derived from the aggregation of individual multivariate models of psychotherapy process. In a 2nd step, the associations between mechanisms of change (TSPA) and pre- to postsymptom change were explored. Results: TSPA allowed a prototypical process pattern to be identified, where patient's alliance and self-efficacy were linked by a temporal feedback-loop. Furthermore, therapist's stability over time in both mastery and clarification interventions was positively associated with better outcomes. Conclusions: TSPA is a statistical tool that sheds new light on temporal mechanisms of change. Through this approach, clinicians may gain insight into prototypical patterns of change in psychotherapy.
Resumo:
Well-known data mining algorithms rely on inputs in the form of pairwise similarities between objects. For large datasets it is computationally impossible to perform all pairwise comparisons. We therefore propose a novel approach that uses approximate Principal Component Analysis to efficiently identify groups of similar objects. The effectiveness of the approach is demonstrated in the context of binary classification using the supervised normalized cut as a classifier. For large datasets from the UCI repository, the approach significantly improves run times with minimal loss in accuracy.
Resumo:
BackgroundThe aim of the present study was to evaluate the feasibility of using a telephone survey in gaining an understanding of the possible herd and management factors influencing the performance (i.e. safety and efficacy) of a vaccine against porcine circovirus type 2 (PCV2) in a large number of herds and to estimate customers¿ satisfaction.ResultsDatasets from 227 pig herds that currently applied or have applied a PCV2 vaccine were analysed. Since 1-, 2- and 3-site production systems were surveyed, the herds were allocated in one of two subsets, where only applicable variables out of 180 were analysed. Group 1 was comprised of herds with sows, suckling pigs and nursery pigs, whereas herds in Group 2 in all cases kept fattening pigs. Overall 14 variables evaluating the subjective satisfaction with one particular PCV2 vaccine were comingled to an abstract dependent variable for further models, which was characterized by a binary outcome from a cluster analysis: good/excellent satisfaction (green cluster) and moderate satisfaction (red cluster). The other 166 variables comprised information about diagnostics, vaccination, housing, management, were considered as independent variables. In Group 1, herds using the vaccine due to recognised PCV2 related health problems (wasting, mortality or porcine dermatitis and nephropathy syndrome) had a 2.4-fold increased chance (1/OR) of belonging to the green cluster. In the final model for Group 1, the diagnosis of diseases other than PCV2, the reason for vaccine administration being other than PCV2-associated diseases and using a single injection of iron had significant influence on allocating into the green cluster (P¿<¿0.05). In Group 2, only unchanged time or delay of time of vaccination influenced the satisfaction (P¿<¿0.05).ConclusionThe methodology and statistical approach used in this study were feasible to scientifically assess ¿satisfaction¿, and to determine factors influencing farmers¿ and vets¿ opinion about the safety and efficacy of a new vaccine.
Resumo:
BACKGROUND Prostate cancer (PCa) is the second most common disease among men worldwide. It is important to know survival outcomes and prognostic factors for this disease. Recruitment for the largest therapeutic randomised controlled trial in PCa-the Systemic Therapy in Advancing or Metastatic Prostate Cancer: Evaluation of Drug Efficacy: A Multi-Stage Multi-Arm Randomised Controlled Trial (STAMPEDE)-includes men with newly diagnosed metastatic PCa who are commencing long-term androgen deprivation therapy (ADT); the control arm provides valuable data for a prospective cohort. OBJECTIVE Describe survival outcomes, along with current treatment standards and factors associated with prognosis, to inform future trial design in this patient group. DESIGN, SETTING, AND PARTICIPANTS STAMPEDE trial control arm comprising men newly diagnosed with M1 disease who were recruited between October 2005 and January 2014. OUTCOME MEASUREMENTS AND STATISTICAL ANALYSIS Overall survival (OS) and failure-free survival (FFS) were reported by primary disease characteristics using Kaplan-Meier methods. Hazard ratios and 95% confidence intervals (CIs) were derived from multivariate Cox models. RESULTS AND LIMITATIONS A cohort of 917 men with newly diagnosed M1 disease was recruited to the control arm in the specified interval. Median follow-up was 20 mo. Median age at randomisation was 66 yr (interquartile range [IQR]: 61-71), and median prostate-specific antigen level was 112 ng/ml (IQR: 34-373). Most men (n=574; 62%) had bone-only metastases, whereas 237 (26%) had both bone and soft tissue metastases; soft tissue metastasis was found mainly in distant lymph nodes. There were 238 deaths, 202 (85%) from PCa. Median FFS was 11 mo; 2-yr FFS was 29% (95% CI, 25-33). Median OS was 42 mo; 2-yr OS was 72% (95% CI, 68-76). Survival time was influenced by performance status, age, Gleason score, and metastases distribution. Median survival after FFS event was 22 mo. Trial eligibility criteria meant men were younger and fitter than general PCa population. CONCLUSIONS Survival remains disappointing in men presenting with M1 disease who are started on only long-term ADT, despite active treatments being available at first failure of ADT. Importantly, men with M1 disease now spend the majority of their remaining life in a state of castration-resistant relapse. PATIENT SUMMARY Results from this control arm cohort found survival is relatively short and highly influenced by patient age, fitness, and where prostate cancer has spread in the body.
Resumo:
This paper presents the electron and photon energy calibration achieved with the ATLAS detector using about 25 fb−1 of LHC proton–proton collision data taken at centre-of-mass energies of √s = 7 and 8 TeV. The reconstruction of electron and photon energies is optimised using multivariate algorithms. The response of the calorimeter layers is equalised in data and simulation, and the longitudinal profile of the electromagnetic showers is exploited to estimate the passive material in front of the calorimeter and reoptimise the detector simulation. After all corrections, the Z resonance is used to set the absolute energy scale. For electrons from Z decays, the achieved calibration is typically accurate to 0.05% in most of the detector acceptance, rising to 0.2% in regions with large amounts of passive material. The remaining inaccuracy is less than 0.2–1% for electrons with a transverse energy of 10 GeV, and is on average 0.3% for photons. The detector resolution is determined with a relative inaccuracy of less than 10% for electrons and photons up to 60 GeV transverse energy, rising to 40% for transverse energies above 500 GeV.
Resumo:
Background Protein-energy-malnutrition (PEM) is common in people with end stage kidney disease (ESKD) undergoing maintenance haemodialysis (MHD) and correlates strongly with mortality. To this day, there is no gold standard for detecting PEM in patients on MHD. Aim of Study The aim of this study was to evaluate if Nutritional Risk Screening 2002 (NRS-2002), handgrip strength measurement, mid-upper arm muscle area (MUAMA), triceps skin fold measurement (TSF), serum albumin, normalised protein catabolic rate (nPCR), Kt/V and eKt/V, dry body weight, body mass index (BMI), age and time since start on MHD are relevant for assessing PEM in patients on MHD. Methods The predictive value of the selected parameters on mortality and mortality or weight loss of more than 5% was assessed. Quantitative data analysis of the 12 parameters in the same patients on MHD in autumn 2009 (n = 64) and spring 2011 (n = 40) with paired statistical analysis and multivariate logistic regression analysis was performed. Results Paired data analysis showed significant reduction of dry body weight, BMI and nPCR. Kt/Vtot did not change, eKt/v and hand grip strength measurements were significantly higher in spring 2011. No changes were detected in TSF, serum albumin, NRS-2002 and MUAMA. Serum albumin was shown to be the only predictor of death and of the combined endpoint “death or weight loss of more than 5%”. Conclusion We now screen patients biannually for serum albumin, nPCR, Kt/V, handgrip measurement of the shunt-free arm, dry body weight, age and time since initiation of MHD.
Resumo:
PURPOSE To identify the influence of fixed prosthesis type on biologic and technical complication rates in the context of screw versus cement retention. Furthermore, a multivariate analysis was conducted to determine which factors, when considered together, influence the complication and failure rates of fixed implant-supported prostheses. MATERIALS AND METHODS Electronic searches of MEDLINE (PubMed), EMBASE, and the Cochrane Library were conducted. Selected inclusion and exclusion criteria were used to limit the search. Data were analyzed statistically with simple and multivariate random-effects Poisson regressions. RESULTS Seventy-three articles qualified for inclusion in the study. Screw-retained prostheses showed a tendency toward and significantly more technical complications than cemented prostheses with single crowns and fixed partial prostheses, respectively. Resin chipping and ceramic veneer chipping had high mean event rates, at 10.04 and 8.95 per 100 years, respectively, for full-arch screwed prostheses. For "all fixed prostheses" (prosthesis type not reported or not known), significantly fewer biologic and technical complications were seen with screw retention. Multivariate analysis revealed a significantly greater incidence of technical complications with cemented prostheses. Full-arch prostheses, cantilevered prostheses, and "all fixed prostheses" had significantly higher complication rates than single crowns. A significantly greater incidence of technical and biologic complications was seen with cemented prostheses. CONCLUSION Screw-retained fixed partial prostheses demonstrated a significantly higher rate of technical complications and screw-retained full-arch prostheses demonstrated a notably high rate of veneer chipping. When "all fixed prostheses" were considered, significantly higher rates of technical and biologic complications were seen for cement-retained prostheses. Multivariate Poisson regression analysis failed to show a significant difference between screw- and cement-retained prostheses with respect to the incidence of failure but demonstrated a higher rate of technical and biologic complications for cement-retained prostheses. The incidence of technical complications was more dependent upon prosthesis and retention type than prosthesis or abutment material.
Resumo:
Syndromic surveillance (SyS) systems currently exploit various sources of health-related data, most of which are collected for purposes other than surveillance (e.g. economic). Several European SyS systems use data collected during meat inspection for syndromic surveillance of animal health, as some diseases may be more easily detected post-mortem than at their point of origin or during the ante-mortem inspection upon arrival at the slaughterhouse. In this paper we use simulation to evaluate the performance of a quasi-Poisson regression (also known as an improved Farrington) algorithm for the detection of disease outbreaks during post-mortem inspection of slaughtered animals. When parameterizing the algorithm based on the retrospective analyses of 6 years of historic data, the probability of detection was satisfactory for large (range 83-445 cases) outbreaks but poor for small (range 20-177 cases) outbreaks. Varying the amount of historical data used to fit the algorithm can help increasing the probability of detection for small outbreaks. However, while the use of a 0·975 quantile generated a low false-positive rate, in most cases, more than 50% of outbreak cases had already occurred at the time of detection. High variance observed in the whole carcass condemnations time-series, and lack of flexibility in terms of the temporal distribution of simulated outbreaks resulting from low reporting frequency (monthly), constitute major challenges for early detection of outbreaks in the livestock population based on meat inspection data. Reporting frequency should be increased in the future to improve timeliness of the SyS system while increased sensitivity may be achieved by integrating meat inspection data into a multivariate system simultaneously evaluating multiple sources of data on livestock health.
Resumo:
The purpose of this study is to investigate the effects of predictor variable correlations and patterns of missingness with dichotomous and/or continuous data in small samples when missing data is multiply imputed. Missing data of predictor variables is multiply imputed under three different multivariate models: the multivariate normal model for continuous data, the multinomial model for dichotomous data and the general location model for mixed dichotomous and continuous data. Subsequent to the multiple imputation process, Type I error rates of the regression coefficients obtained with logistic regression analysis are estimated under various conditions of correlation structure, sample size, type of data and patterns of missing data. The distributional properties of average mean, variance and correlations among the predictor variables are assessed after the multiple imputation process. ^ For continuous predictor data under the multivariate normal model, Type I error rates are generally within the nominal values with samples of size n = 100. Smaller samples of size n = 50 resulted in more conservative estimates (i.e., lower than the nominal value). Correlation and variance estimates of the original data are retained after multiple imputation with less than 50% missing continuous predictor data. For dichotomous predictor data under the multinomial model, Type I error rates are generally conservative, which in part is due to the sparseness of the data. The correlation structure for the predictor variables is not well retained on multiply-imputed data from small samples with more than 50% missing data with this model. For mixed continuous and dichotomous predictor data, the results are similar to those found under the multivariate normal model for continuous data and under the multinomial model for dichotomous data. With all data types, a fully-observed variable included with variables subject to missingness in the multiple imputation process and subsequent statistical analysis provided liberal (larger than nominal values) Type I error rates under a specific pattern of missing data. It is suggested that future studies focus on the effects of multiple imputation in multivariate settings with more realistic data characteristics and a variety of multivariate analyses, assessing both Type I error and power. ^
Resumo:
Current statistical methods for estimation of parametric effect sizes from a series of experiments are generally restricted to univariate comparisons of standardized mean differences between two treatments. Multivariate methods are presented for the case in which effect size is a vector of standardized multivariate mean differences and the number of treatment groups is two or more. The proposed methods employ a vector of independent sample means for each response variable that leads to a covariance structure which depends only on correlations among the $p$ responses on each subject. Using weighted least squares theory and the assumption that the observations are from normally distributed populations, multivariate hypotheses analogous to common hypotheses used for testing effect sizes were formulated and tested for treatment effects which are correlated through a common control group, through multiple response variables observed on each subject, or both conditions.^ The asymptotic multivariate distribution for correlated effect sizes is obtained by extending univariate methods for estimating effect sizes which are correlated through common control groups. The joint distribution of vectors of effect sizes (from $p$ responses on each subject) from one treatment and one control group and from several treatment groups sharing a common control group are derived. Methods are given for estimation of linear combinations of effect sizes when certain homogeneity conditions are met, and for estimation of vectors of effect sizes and confidence intervals from $p$ responses on each subject. Computational illustrations are provided using data from studies of effects of electric field exposure on small laboratory animals. ^
Resumo:
Logistic regression is one of the most important tools in the analysis of epidemiological and clinical data. Such data often contain missing values for one or more variables. Common practice is to eliminate all individuals for whom any information is missing. This deletion approach does not make efficient use of available information and often introduces bias.^ Two methods were developed to estimate logistic regression coefficients for mixed dichotomous and continuous covariates including partially observed binary covariates. The data were assumed missing at random (MAR). One method (PD) used predictive distribution as weight to calculate the average of the logistic regressions performing on all possible values of missing observations, and the second method (RS) used a variant of resampling technique. Additional seven methods were compared with these two approaches in a simulation study. They are: (1) Analysis based on only the complete cases, (2) Substituting the mean of the observed values for the missing value, (3) An imputation technique based on the proportions of observed data, (4) Regressing the partially observed covariates on the remaining continuous covariates, (5) Regressing the partially observed covariates on the remaining continuous covariates conditional on response variable, (6) Regressing the partially observed covariates on the remaining continuous covariates and response variable, and (7) EM algorithm. Both proposed methods showed smaller standard errors (s.e.) for the coefficient involving the partially observed covariate and for the other coefficients as well. However, both methods, especially PD, are computationally demanding; thus for analysis of large data sets with partially observed covariates, further refinement of these approaches is needed. ^
Resumo:
The role of clinical chemistry has traditionally been to evaluate acutely ill or hospitalized patients. Traditional statistical methods have serious drawbacks in that they use univariate techniques. To demonstrate alternative methodology, a multivariate analysis of covariance model was developed and applied to the data from the Cooperative Study of Sickle Cell Disease.^ The purpose of developing the model for the laboratory data from the CSSCD was to evaluate the comparability of the results from the different clinics. Several variables were incorporated into the model in order to control for possible differences among the clinics that might confound any real laboratory differences.^ Differences for LDH, alkaline phosphatase and SGOT were identified which will necessitate adjustments by clinic whenever these data are used. In addition, aberrant clinic values for LDH, creatinine and BUN were also identified.^ The use of any statistical technique including multivariate analysis without thoughtful consideration may lead to spurious conclusions that may not be corrected for some time, if ever. However, the advantages of multivariate analysis far outweigh its potential problems. If its use increases as it should, the applicability to the analysis of laboratory data in prospective patient monitoring, quality control programs, and interpretation of data from cooperative studies could well have a major impact on the health and well being of a large number of individuals. ^
Resumo:
Individuals with disabilities face numerous barriers to participation due to biological and physical characteristics of the disability as well as social and environmental factors. Participation can be impacted on all levels from societal, to activities of daily living, exercise, education, and interpersonal relationships. This study evaluated the impact of pain, mood, depression, quality of life and fatigue on participation for individuals with mobility impairments. This cross sectional study derives from self-report data collected from a wheelchair using sample. Bivariate correlational and multivariate analysis were employed to examine the relationship between pain, quality of life, positive and negative mood, fatigue, and depression with participation while controlling for relevant socio-demographic variables (sex, age, time with disability, race, and education). Results from the 122 respondents with mobility impairments demonstrated that after controlling for socio-demographic characteristics in the full model, 20% of the variance in participation scores were accounted for by pain, quality of life, positive and negative mood, and depression. Notably, quality of life emerged as being the single variable that was significantly related to participation in the full model. Contrary to other studies, pain did not appear to significantly impact participation outcomes for wheelchair users in this sample. Participation is an emerging area of interest among rehabilitation and disability researchers, and results of this study provide compelling evidence that several psychosocial factors are related to participation. This area of inquiry warrants further study, as many of the psychosocial variables identified in this study (mood, depression, quality of life) may be amenable to intervention, which may also positively influence participation.^
Resumo:
Maximizing data quality may be especially difficult in trauma-related clinical research. Strategies are needed to improve data quality and assess the impact of data quality on clinical predictive models. This study had two objectives. The first was to compare missing data between two multi-center trauma transfusion studies: a retrospective study (RS) using medical chart data with minimal data quality review and the PRospective Observational Multi-center Major Trauma Transfusion (PROMMTT) study with standardized quality assurance. The second objective was to assess the impact of missing data on clinical prediction algorithms by evaluating blood transfusion prediction models using PROMMTT data. RS (2005-06) and PROMMTT (2009-10) investigated trauma patients receiving ≥ 1 unit of red blood cells (RBC) from ten Level I trauma centers. Missing data were compared for 33 variables collected in both studies using mixed effects logistic regression (including random intercepts for study site). Massive transfusion (MT) patients received ≥ 10 RBC units within 24h of admission. Correct classification percentages for three MT prediction models were evaluated using complete case analysis and multiple imputation based on the multivariate normal distribution. A sensitivity analysis for missing data was conducted to estimate the upper and lower bounds of correct classification using assumptions about missing data under best and worst case scenarios. Most variables (17/33=52%) had <1% missing data in RS and PROMMTT. Of the remaining variables, 50% demonstrated less missingness in PROMMTT, 25% had less missingness in RS, and 25% were similar between studies. Missing percentages for MT prediction variables in PROMMTT ranged from 2.2% (heart rate) to 45% (respiratory rate). For variables missing >1%, study site was associated with missingness (all p≤0.021). Survival time predicted missingness for 50% of RS and 60% of PROMMTT variables. MT models complete case proportions ranged from 41% to 88%. Complete case analysis and multiple imputation demonstrated similar correct classification results. Sensitivity analysis upper-lower bound ranges for the three MT models were 59-63%, 36-46%, and 46-58%. Prospective collection of ten-fold more variables with data quality assurance reduced overall missing data. Study site and patient survival were associated with missingness, suggesting that data were not missing completely at random, and complete case analysis may lead to biased results. Evaluating clinical prediction model accuracy may be misleading in the presence of missing data, especially with many predictor variables. The proposed sensitivity analysis estimating correct classification under upper (best case scenario)/lower (worst case scenario) bounds may be more informative than multiple imputation, which provided results similar to complete case analysis.^