9 resultados para Statistic Multivariate

em DigitalCommons@The Texas Medical Center


Relevância:

20.00% 20.00%

Publicador:

Resumo:

This project develops K(bin), a relatively simple, binomial based statistic for assessing interrater agreement in which expected agreement is calculated a priori from the number of raters involved in the study and number of categories on the rating tool. The statistic is logical in interpretation, easily calculated, stable for small sample sizes, and has application over a wide range of possible combinations from the simplest case of two raters using a binomial scale to multiple raters using a multiple level scale.^ Tables of expected agreement values and tables of critical values for K(bin) which include power to detect three levels of the population parameter K for n from 2 to 30 and observed agreement $\ge$.70 calculated at alpha =.05,.025, and.01 are included.^ An example is also included which describes the use of the tables for planning and evaluating an interrater reliability study using the statistic, K(bin). ^

Relevância:

20.00% 20.00%

Publicador:

Resumo:

A multivariate frailty hazard model is developed for joint-modeling of three correlated time-to-event outcomes: (1) local recurrence, (2) distant recurrence, and (3) overall survival. The term frailty is introduced to model population heterogeneity. The dependence is modeled by conditioning on a shared frailty that is included in the three hazard functions. Independent variables can be included in the model as covariates. The Markov chain Monte Carlo methods are used to estimate the posterior distributions of model parameters. The algorithm used in present application is the hybrid Metropolis-Hastings algorithm, which simultaneously updates all parameters with evaluations of gradient of log posterior density. The performance of this approach is examined based on simulation studies using Exponential and Weibull distributions. We apply the proposed methods to a study of patients with soft tissue sarcoma, which motivated this research. Our results indicate that patients with chemotherapy had better overall survival with hazard ratio of 0.242 (95% CI: 0.094 - 0.564) and lower risk of distant recurrence with hazard ratio of 0.636 (95% CI: 0.487 - 0.860), but not significantly better in local recurrence with hazard ratio of 0.799 (95% CI: 0.575 - 1.054). The advantages and limitations of the proposed models, and future research directions are discussed. ^

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Common endpoints can be divided into two categories. One is dichotomous endpoints which take only fixed values (most of the time two values). The other is continuous endpoints which can be any real number between two specified values. Choices of primary endpoints are critical in clinical trials. If we only use dichotomous endpoints, the power could be underestimated. If only continuous endpoints are chosen, we may not obtain expected sample size due to occurrence of some significant clinical events. Combined endpoints are used in clinical trials to give additional power. However, current combined endpoints or composite endpoints in cardiovascular disease clinical trials or most clinical trials are endpoints that combine either dichotomous endpoints (total mortality + total hospitalization), or continuous endpoints (risk score). Our present work applied U-statistic to combine one dichotomous endpoint and one continuous endpoint, which has three different assessments and to calculate the sample size and test the hypothesis to see if there is any treatment effect. It is especially useful when some patients cannot provide the most precise measurement due to medical contraindication or some personal reasons. Results show that this method has greater power then the analysis using continuous endpoints alone. ^

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Strategies are compared for the development of a linear regression model with stochastic (multivariate normal) regressor variables and the subsequent assessment of its predictive ability. Bias and mean squared error of four estimators of predictive performance are evaluated in simulated samples of 32 population correlation matrices. Models including all of the available predictors are compared with those obtained using selected subsets. The subset selection procedures investigated include two stopping rules, C$\sb{\rm p}$ and S$\sb{\rm p}$, each combined with an 'all possible subsets' or 'forward selection' of variables. The estimators of performance utilized include parametric (MSEP$\sb{\rm m}$) and non-parametric (PRESS) assessments in the entire sample, and two data splitting estimates restricted to a random or balanced (Snee's DUPLEX) 'validation' half sample. The simulations were performed as a designed experiment, with population correlation matrices representing a broad range of data structures.^ The techniques examined for subset selection do not generally result in improved predictions relative to the full model. Approaches using 'forward selection' result in slightly smaller prediction errors and less biased estimators of predictive accuracy than 'all possible subsets' approaches but no differences are detected between the performances of C$\sb{\rm p}$ and S$\sb{\rm p}$. In every case, prediction errors of models obtained by subset selection in either of the half splits exceed those obtained using all predictors and the entire sample.^ Only the random split estimator is conditionally (on $\\beta$) unbiased, however MSEP$\sb{\rm m}$ is unbiased on average and PRESS is nearly so in unselected (fixed form) models. When subset selection techniques are used, MSEP$\sb{\rm m}$ and PRESS always underestimate prediction errors, by as much as 27 percent (on average) in small samples. Despite their bias, the mean squared errors (MSE) of these estimators are at least 30 percent less than that of the unbiased random split estimator. The DUPLEX split estimator suffers from large MSE as well as bias, and seems of little value within the context of stochastic regressor variables.^ To maximize predictive accuracy while retaining a reliable estimate of that accuracy, it is recommended that the entire sample be used for model development, and a leave-one-out statistic (e.g. PRESS) be used for assessment. ^

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Current statistical methods for estimation of parametric effect sizes from a series of experiments are generally restricted to univariate comparisons of standardized mean differences between two treatments. Multivariate methods are presented for the case in which effect size is a vector of standardized multivariate mean differences and the number of treatment groups is two or more. The proposed methods employ a vector of independent sample means for each response variable that leads to a covariance structure which depends only on correlations among the $p$ responses on each subject. Using weighted least squares theory and the assumption that the observations are from normally distributed populations, multivariate hypotheses analogous to common hypotheses used for testing effect sizes were formulated and tested for treatment effects which are correlated through a common control group, through multiple response variables observed on each subject, or both conditions.^ The asymptotic multivariate distribution for correlated effect sizes is obtained by extending univariate methods for estimating effect sizes which are correlated through common control groups. The joint distribution of vectors of effect sizes (from $p$ responses on each subject) from one treatment and one control group and from several treatment groups sharing a common control group are derived. Methods are given for estimation of linear combinations of effect sizes when certain homogeneity conditions are met, and for estimation of vectors of effect sizes and confidence intervals from $p$ responses on each subject. Computational illustrations are provided using data from studies of effects of electric field exposure on small laboratory animals. ^

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The role of clinical chemistry has traditionally been to evaluate acutely ill or hospitalized patients. Traditional statistical methods have serious drawbacks in that they use univariate techniques. To demonstrate alternative methodology, a multivariate analysis of covariance model was developed and applied to the data from the Cooperative Study of Sickle Cell Disease.^ The purpose of developing the model for the laboratory data from the CSSCD was to evaluate the comparability of the results from the different clinics. Several variables were incorporated into the model in order to control for possible differences among the clinics that might confound any real laboratory differences.^ Differences for LDH, alkaline phosphatase and SGOT were identified which will necessitate adjustments by clinic whenever these data are used. In addition, aberrant clinic values for LDH, creatinine and BUN were also identified.^ The use of any statistical technique including multivariate analysis without thoughtful consideration may lead to spurious conclusions that may not be corrected for some time, if ever. However, the advantages of multivariate analysis far outweigh its potential problems. If its use increases as it should, the applicability to the analysis of laboratory data in prospective patient monitoring, quality control programs, and interpretation of data from cooperative studies could well have a major impact on the health and well being of a large number of individuals. ^

Relevância:

20.00% 20.00%

Publicador:

Resumo:

An interim analysis is usually applied in later phase II or phase III trials to find convincing evidence of a significant treatment difference that may lead to trial termination at an earlier point than planned at the beginning. This can result in the saving of patient resources and shortening of drug development and approval time. In addition, ethics and economics are also the reasons to stop a trial earlier. In clinical trials of eyes, ears, knees, arms, kidneys, lungs, and other clustered treatments, data may include distribution-free random variables with matched and unmatched subjects in one study. It is important to properly include both subjects in the interim and the final analyses so that the maximum efficiency of statistical and clinical inferences can be obtained at different stages of the trials. So far, no publication has applied a statistical method for distribution-free data with matched and unmatched subjects in the interim analysis of clinical trials. In this simulation study, the hybrid statistic was used to estimate the empirical powers and the empirical type I errors among the simulated datasets with different sample sizes, different effect sizes, different correlation coefficients for matched pairs, and different data distributions, respectively, in the interim and final analysis with 4 different group sequential methods. Empirical powers and empirical type I errors were also compared to those estimated by using the meta-analysis t-test among the same simulated datasets. Results from this simulation study show that, compared to the meta-analysis t-test commonly used for data with normally distributed observations, the hybrid statistic has a greater power for data observed from normally, log-normally, and multinomially distributed random variables with matched and unmatched subjects and with outliers. Powers rose with the increase in sample size, effect size, and correlation coefficient for the matched pairs. In addition, lower type I errors were observed estimated by using the hybrid statistic, which indicates that this test is also conservative for data with outliers in the interim analysis of clinical trials.^

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Campus behavior management is important for ensuring classroom order and promoting positive academic outcomes. Previous studies have shown the importance of individual student and campus personnel characteristics and campus context for explaining campus discipline rates (e.g., rates of suspension and expulsion). Assessing campus discipline rates, while controlling for these individual and campus characteristics, is important for the monitoring, evaluation, and intervention role of policymakers as well as state and federal level education agencies. Systems or metrics exist that measure other student outcomes (i.e., academic performance) with controls for individual and campus characteristics, but none exist that monitor these differences for discipline rates across campuses. In this paper, we use a multivariate model to analyze a longitudinal, statewide dataset for all secondary students in Texas from 2000 to 2008 in order to examine how campus discipline rates differ across schools with statistically similar students, teachers, and campus characteristics. The findings are important for understanding that some schools with similar characteristics have significantly different exclusionary discipline rates, and they are important for informing policy and agency level decision-making. The methodology described can easily be used by monitoring agencies as well as local school districts.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The performance of the Hosmer-Lemeshow global goodness-of-fit statistic for logistic regression models was explored in a wide variety of conditions not previously fully investigated. Computer simulations, each consisting of 500 regression models, were run to assess the statistic in 23 different situations. The items which varied among the situations included the number of observations used in each regression, the number of covariates, the degree of dependence among the covariates, the combinations of continuous and discrete variables, and the generation of the values of the dependent variable for model fit or lack of fit.^ The study found that the $\rm\ C$g* statistic was adequate in tests of significance for most situations. However, when testing data which deviate from a logistic model, the statistic has low power to detect such deviation. Although grouping of the estimated probabilities into quantiles from 8 to 30 was studied, the deciles of risk approach was generally sufficient. Subdividing the estimated probabilities into more than 10 quantiles when there are many covariates in the model is not necessary, despite theoretical reasons which suggest otherwise. Because it does not follow a X$\sp2$ distribution, the statistic is not recommended for use in models containing only categorical variables with a limited number of covariate patterns.^ The statistic performed adequately when there were at least 10 observations per quantile. Large numbers of observations per quantile did not lead to incorrect conclusions that the model did not fit the data when it actually did. However, the statistic failed to detect lack of fit when it existed and should be supplemented with further tests for the influence of individual observations. Careful examination of the parameter estimates is also essential since the statistic did not perform as desired when there was moderate to severe collinearity among covariates.^ Two methods studied for handling tied values of the estimated probabilities made only a slight difference in conclusions about model fit. Neither method split observations with identical probabilities into different quantiles. Approaches which create equal size groups by separating ties should be avoided. ^