11 results for Linear coregionalization model
in DigitalCommons@The Texas Medical Center
Abstract:
The hierarchical linear growth model (HLGM), a flexible and powerful analytic method, has played an increasingly important role in psychology, public health, and the medical sciences in recent decades. Researchers who conduct HLGM analyses are usually interested in the treatment effect on individual trajectories, which is indicated by the cross-level interaction effects. However, the statistical hypothesis test for a cross-level interaction in HLGM only shows whether there is a significant group difference in the average rate of change, rate of acceleration, or higher polynomial effect; it fails to convey the magnitude of the difference between the group trajectories at a specific time point. Reporting and interpreting effect sizes has therefore received increasing emphasis in HLGM in recent years, owing to the limitations of, and growing criticism of, statistical hypothesis testing. Nevertheless, most researchers fail to report these model-implied effect sizes for group trajectory comparisons and their corresponding confidence intervals in HLGM analyses, because there are no appropriate, standard functions for estimating effect sizes associated with the model-implied difference between group trajectories in HLGM, and no computing packages in popular statistical software to calculate them automatically.

The present project is the first to establish appropriate computing functions to assess the standardized difference between group trajectories in HLGM. We proposed two functions to estimate effect sizes for the model-based difference between group trajectories at a specific time, and we also suggested robust effect sizes to reduce the bias of the estimated effect sizes. We then applied the proposed functions to estimate the population effect sizes (d) and robust effect sizes (du) for the cross-level interaction in HLGM using three simulated datasets, compared three methods of constructing confidence intervals around d and du, and recommended the best one for application. Finally, we constructed 95% confidence intervals with the most suitable method for the effect sizes obtained from the three simulated datasets.

The effect sizes between group trajectories for the three simulated longitudinal datasets indicated that even when the statistical hypothesis test shows no significant difference between group trajectories, the effect sizes between those trajectories can still be large at some time points. Therefore, effect sizes between group trajectories in an HLGM analysis provide additional, meaningful information for assessing the group effect on individual trajectories. We also compared three methods of constructing 95% confidence intervals around the corresponding effect sizes, which address the uncertainty of the effect sizes as estimates of the population parameter. We suggest the noncentral t-distribution-based method when its assumptions hold, and the bootstrap bias-corrected and accelerated method when the assumptions are not met.
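A minimal sketch of how a model-implied effect size of this kind might be computed, assuming a two-group hierarchical linear growth model fitted with statsmodels; the column names, the 0/1 group coding, and the baseline-SD standardizer are assumptions for illustration, not the functions proposed in the dissertation:

```python
# Illustrative only: standardized model-implied group difference at time t
# from a linear mixed (hierarchical linear growth) model.
import statsmodels.formula.api as smf

def trajectory_effect_size(df, t):
    """Assumes columns y (outcome), time, group (coded 0/1), subject (id)."""
    fit = smf.mixedlm("y ~ time * group", df,
                      groups=df["subject"], re_formula="~time").fit()
    fe = fit.fe_params
    # Model-implied trajectory difference at time t: beta_group + beta_(time:group) * t
    diff_t = fe["group"] + fe["time:group"] * t
    # Standardizer: baseline outcome SD (the dissertation's choice may differ).
    sd = df.loc[df["time"] == df["time"].min(), "y"].std(ddof=1)
    return diff_t / sd
```

Resampling whole subjects and recomputing this statistic would be one route to the bootstrap bias-corrected and accelerated intervals discussed above.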
Abstract:
A patient classification system was developed integrating a patient acuity instrument with a computerized nursing distribution method based on a linear programming model. The system was designed for real-time measurement of patient acuity (workload) and allocation of nursing personnel to optimize the utilization of resources.

The acuity instrument was a prototype tool with eight categories of patients defined by patient severity and nursing intensity parameters. From this tool, the demand for nursing care was defined in patient points, with one point equal to one hour of RN time. Validity and reliability of the instrument were determined as follows: (1) content validity by a panel of expert nurses; (2) predictive validity through a paired t-test analysis of preshift and postshift categorization of patients; (3) initial reliability by a one-month pilot of the instrument in a practice setting; and (4) interrater reliability by the Kappa statistic.

The nursing distribution system was a linear programming model using a branch-and-bound technique to obtain integer solutions. The objective function was to minimize the total number of nursing personnel used by optimally assigning staff to meet the acuity needs of the units. A penalty weight was used as a coefficient of the objective function variables to define priorities for the allocation of staff. The demand constraints were requirements to meet the total acuity points needed for each unit and to have a minimum number of RNs on each unit. The supply constraints were: (1) the total availability of each type of staff and the value of that staff member, where value was determined relative to that staff type's ability to perform the job function of an RN (e.g., value for eight hours: RN = 8 points, LVN = 6 points); and (2) the number of personnel available for floating between units.

The capability of the model to assign staff quantitatively and qualitatively equal to the manual method was established by a thirty-day comparison. Sensitivity testing demonstrated appropriate adjustment of the optimal solution to changes in the penalty coefficients of the objective function and to the acuity totals in the demand constraints. Further investigation of the model documented correct adjustment of assignments in response to changes in staff values, and cost minimization by the addition of a dollar coefficient to the objective function.
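For readers unfamiliar with this class of model, a toy integer program in the same spirit can be written with the open-source PuLP library, whose default CBC solver uses branch and bound; the staff types, point values, penalty weights, and demands below are invented for illustration and do not reproduce the dissertation's formulation:

```python
# Hypothetical sketch: penalty-weighted assignment of staff to units so that
# each unit's acuity points and minimum RN count are met without exceeding
# the available staff of each type.
import pulp

units = ["unit_A", "unit_B"]
staff = ["RN", "LVN"]
value = {"RN": 8, "LVN": 6}            # acuity points covered per 8-hour shift
penalty = {"RN": 1.0, "LVN": 0.9}      # priority weights in the objective
available = {"RN": 10, "LVN": 6}       # staff available for assignment
demand = {"unit_A": 60, "unit_B": 45}  # acuity points required per unit
min_rn = {"unit_A": 3, "unit_B": 2}    # minimum RNs per unit

prob = pulp.LpProblem("nurse_assignment", pulp.LpMinimize)

# Integer decision variables: number of each staff type assigned to each unit.
x = {(s, u): pulp.LpVariable(f"x_{s}_{u}", lowBound=0, cat="Integer")
     for s in staff for u in units}

# Objective: minimize penalty-weighted total staff assigned.
prob += pulp.lpSum(penalty[s] * x[s, u] for s in staff for u in units)

# Demand constraints: meet each unit's acuity points and minimum RN count.
for u in units:
    prob += pulp.lpSum(value[s] * x[s, u] for s in staff) >= demand[u]
    prob += x["RN", u] >= min_rn[u]

# Supply constraints: do not exceed available staff of each type.
for s in staff:
    prob += pulp.lpSum(x[s, u] for u in units) <= available[s]

prob.solve()
for (s, u), var in x.items():
    print(s, u, int(var.value()))
```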
Abstract:
A historical prospective study was designed to assess the mean weight status of subjects who participated in a behavioral weight reduction program in 1983 and to determine whether there was an association between the dependent variable, weight change, and any of 31 independent variables after a 2-year follow-up period. Data were obtained by abstracting the subjects' records and from a follow-up questionnaire administered 2 years following program participation. Five hundred nine subjects (386 females and 123 males) of the 1460 subjects who participated in the program completed and returned the questionnaire. Results showed that mean weight was significantly different (p < 0.001) between the measurement at baseline and after the 2-year follow-up period. The mean weight loss of the group was 5.8 pounds: 10.7 pounds for males and 4.2 pounds for females. A total of 63.9% of the group, 69.9% of males and 61.9% of females, were still below their initial weight after the 2-year follow-up period. Sixteen of the 31 variables assessed using bivariate analyses were found to be significantly (p ≤ 0.05) associated with weight change after the 2-year follow-up period. These variables were then entered into a multivariate linear regression model. A total of 37.9% of the variance of the dependent variable, weight change, was accounted for by all 16 variables. Eight of these variables were found to be significantly (p ≤ 0.05) predictive of weight change in the stepwise multivariate process, accounting for 37.1% of the variance. These variables included two baseline variables (percent over ideal body weight at enrollment and occupation) and six follow-up variables (feeling in control of eating habits, percent of body weight lost during treatment, frequency of weight measurement, physical activity, eating in response to emotions, and number of pounds of weight gain needed to resume a diet). It was concluded that greater emphasis should be placed on the six follow-up variables by clinicians involved in the treatment of obesity, and by the subjects themselves, to enhance their chances of success at long-term weight loss.
Abstract:
Background. Over the past 30 years, the prevalence of overweight among children and adolescents has increased across the United States (Barlow et al., 2007; Ogden, Flegal, Carroll, & Johnson, 2002). Childhood obesity is linked with adverse physiological and psychological issues in youth and affects ethnic/minority populations at disproportionate rates (Barlow et al., 2007; Butte et al., 2006; Butte, Cai, Cole, Wilson, Fisher, Zakeri, Ellis, & Comuzzie, 2007). More importantly, overweight in children and youth tends to track into adulthood (McNaughton, Ball, Mishra, & Crawford, 2008; Ogden et al., 2002). Childhood obesity affects body functions such as the cardiovascular, respiratory, gastrointestinal, and endocrine systems, as well as emotional health (Barlow et al., 2007; Ogden et al., 2002). Several dietary factors have been associated with the development of obesity in children; however, these factors have not been fully elucidated, especially in ethnic/minority children. In particular, few studies have examined the effects of different meal patterns on the development of obesity in children. Purpose. The purpose of the study is to examine the relationships between the daily proportions of energy consumed and of energy derived from fat across breakfast, lunch, dinner, and snacks, and obesity among Hispanic children and adolescents. Methods. A cross-sectional design was used to evaluate the relationship between dietary patterns and overweight status in Hispanic children and adolescents 4-19 years of age who participated in the Viva La Familia Study. The goal of the Viva La Familia Study was to evaluate genetic and environmental factors affecting childhood obesity and its co-morbidities in the Hispanic population (Butte et al., 2006, 2007). The study enrolled 1030 Hispanic children and adolescents from 319 families and examined factors related to increased body weight through a multilevel analysis of extensive sociodemographic, genetic, metabolic, and behavioral data. Baseline dietary intakes of the children were collected using 24-hour recalls, and body mass index was calculated from measured height and weight and classified using the CDC standards. Dietary data were analyzed using a GEE population-averaged panel-data model with a family identifier as the cluster variable to account for possible correlations within related data sets. A linear regression model was used to analyze associations of dietary patterns with possible covariates and to examine the percentage of daily energy coming from breakfast, lunch, dinner, and snacks while adjusting for age, sex, and BMI z-score. Random-effects logistic regression models were used to determine the relationship of the dietary variables with obesity status and to assess whether the percent energy intake (%EI) derived from fat at each meal (breakfast, lunch, dinner, and snacks) affected obesity. Results. Older children consumed a higher percentage of energy at lunch and dinner and a lower percentage of energy from snacks than younger children. Age was significantly associated with the percentage of total energy intake (%TEI) from lunch, as well as from dinner, while no association was found for gender. The percentage of energy consumed at dinner differed significantly by obesity status, with obese children consuming more energy at dinner (p = 0.03), but no associations were found between percent energy from fat and obesity for any meal. Conclusions.
Information from this study can be used to develop interventions that target dietary intake patterns in obesity prevention programs for Hispanic children and adolescents. In particular, intervention programs for children should target dietary patterns in which energy intake is spread throughout the day and concentrated earlier in the day. These results indicate that a longitudinal study should be used to further explore the relationship between dietary patterns and BMI in this and other populations (Dubois et al., 2008; Rodriquez & Moreno, 2006; Thompson et al., 2005; Wilson et al., in review, 2008).
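A minimal sketch of the GEE specification described in the Methods, using statsmodels with an exchangeable correlation structure within families; the variable names (pct_tei_dinner, age, sex, bmi_z, family_id) are placeholders rather than the study's actual dataset:

```python
# Illustrative only: population-averaged (GEE) model of percent energy from
# dinner, with children clustered by family.
import statsmodels.api as sm
import statsmodels.formula.api as smf

def fit_meal_gee(df):
    """df: one row per child; family_id is shared by siblings in a family."""
    model = smf.gee("pct_tei_dinner ~ age + sex + bmi_z",
                    groups="family_id", data=df,
                    cov_struct=sm.cov_struct.Exchangeable(),
                    family=sm.families.Gaussian())
    return model.fit()
```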
Abstract:
Patients who started HAART (highly active antiretroviral treatment) under the earlier, more aggressive DHHS guidelines (1997) underwent life-long continuous HAART that was associated with many short-term as well as long-term complications. Many interventions have attempted to reduce those complications, including intermittent treatment, also called pulse therapy. Many studies have examined the determinants of the rate of fall in CD4 count after interruption, as these data would help guide treatment interruptions. The data set used here was part of a cohort study conducted at the Johns Hopkins AIDS Service since January 1984, in which data were collected both prospectively and retrospectively. The patients in this data set consisted of 47 patients receiving pulse therapy with the aim of reducing long-term complications.

The aim of this project was to study the impact of virologic and immunologic factors on the rate of CD4 loss after treatment interruption. The exposure variables under investigation included CD4 cell count and viral load at treatment initiation. The rate of change of CD4 cell count after treatment interruption was estimated from observed data using advanced longitudinal data analysis methods (i.e., a linear mixed model). Random effects were used to account for repeated measures of CD4 per person after treatment interruption. The regression coefficient estimates from the model were then used to produce subject-specific rates of CD4 change that account for group trends in change. The exposure variables of interest were age, race, gender, CD4 cell count, and HIV RNA level at HAART initiation.

The rate of fall of CD4 count did not depend on CD4 cell count or viral load at initiation of treatment; thus, these factors may not be useful for determining who has a chance of a successful treatment interruption. CD4 and viral load were also examined with t-tests and ANOVA after grouping based on medians and quartiles to detect any difference in the mean rate of CD4 fall after interruption. There was no significant difference between the groups, suggesting no association between the rate of fall of CD4 after treatment interruption and the above-mentioned exposure variables.
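A minimal sketch of how subject-specific rates of CD4 change can be recovered from such a linear mixed model (the fixed slope plus each patient's estimated random slope); the column names and the use of statsmodels are assumptions, not the study's actual code:

```python
# Illustrative only: random-intercept, random-slope model of CD4 decline
# after treatment interruption, then per-patient rates of change.
import statsmodels.formula.api as smf

def subject_specific_slopes(df):
    """df columns assumed: cd4, months_since_interruption, patient_id."""
    fit = smf.mixedlm(
        "cd4 ~ months_since_interruption",        # population-average decline
        df,
        groups=df["patient_id"],
        re_formula="~months_since_interruption",  # random intercept and slope
    ).fit()
    fixed_slope = fit.fe_params["months_since_interruption"]
    # Subject-specific rate = fixed slope + that patient's random slope (BLUP).
    return {pid: fixed_slope + re["months_since_interruption"]
            for pid, re in fit.random_effects.items()}
```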
Abstract:
Strategies are compared for the development of a linear regression model with stochastic (multivariate normal) regressor variables and the subsequent assessment of its predictive ability. Bias and mean squared error of four estimators of predictive performance are evaluated in simulated samples from 32 population correlation matrices. Models including all of the available predictors are compared with those obtained using selected subsets. The subset selection procedures investigated include two stopping rules, Cp and Sp, each combined with either an 'all possible subsets' or a 'forward selection' search of variables. The estimators of performance include parametric (MSEPm) and non-parametric (PRESS) assessments in the entire sample, and two data-splitting estimates restricted to a random or balanced (Snee's DUPLEX) 'validation' half sample. The simulations were performed as a designed experiment, with the population correlation matrices representing a broad range of data structures.

The techniques examined for subset selection do not generally result in improved predictions relative to the full model. Approaches using 'forward selection' result in slightly smaller prediction errors and less biased estimators of predictive accuracy than 'all possible subsets' approaches, but no differences are detected between the performances of Cp and Sp. In every case, the prediction errors of models obtained by subset selection in either of the half splits exceed those obtained using all predictors and the entire sample.

Only the random-split estimator is conditionally (on β) unbiased; however, MSEPm is unbiased on average and PRESS is nearly so in unselected (fixed-form) models. When subset selection techniques are used, MSEPm and PRESS always underestimate prediction errors, by as much as 27 percent (on average) in small samples. Despite their bias, the mean squared errors (MSE) of these estimators are at least 30 percent less than that of the unbiased random-split estimator. The DUPLEX split estimator suffers from large MSE as well as bias, and seems of little value within the context of stochastic regressor variables.

To maximize predictive accuracy while retaining a reliable estimate of that accuracy, it is recommended that the entire sample be used for model development and that a leave-one-out statistic (e.g., PRESS) be used for assessment.
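As a point of reference for the leave-one-out recommendation, here is a generic sketch of the PRESS statistic computed with the hat-matrix identity, so the model is fit only once; this is not the dissertation's simulation code:

```python
# Generic PRESS (leave-one-out prediction error sum of squares) for an OLS fit.
import numpy as np

def press_statistic(X, y):
    """X: design matrix including an intercept column; y: response vector."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    # Leverages h_ii: diagonal of the hat matrix X (X'X)^{-1} X'.
    h = np.einsum("ij,ji->i", X, np.linalg.solve(X.T @ X, X.T))
    loo_resid = resid / (1.0 - h)   # deleted (leave-one-out) residuals
    return np.sum(loo_resid ** 2)
```

Each deleted residual e_i/(1 - h_ii) equals the prediction error that would be obtained by refitting the model without observation i.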
Abstract:
The objective of this dissertation was to design and implement strategies for the assessment of exposures to organic chemicals used in the production of a styrene-butadiene polymer at the Texas Plastics Company (TPC). Linear statistical retrospective exposure models, univariate and multivariate, were developed based on the validation of historical industrial hygiene monitoring data collected by industrial hygienists at TPC, together with additional current industrial hygiene monitoring data collected for the purposes of this study. The current monitoring data served several purposes. First, they provided current exposure information in the form of unbiased estimates of mean exposure to organic chemicals for each job title included. Second, they provided information on the homogeneity of exposure within each job title, through a carefully designed sampling scheme that addressed variability of exposure both between and within job titles. Third, they permitted investigation of how well current exposure data can serve as an evaluation tool for retrospective exposure estimation. Finally, this dissertation investigated the simultaneous evaluation of exposure to several chemicals, as well as the use of values below detection limits, in a multivariate linear statistical model of exposures.
Abstract:
Purpose: School districts in the U.S. regularly offer foods that compete with the USDA reimbursable meal, known as 'a la carte' foods. These foods must adhere to state nutritional regulations; however, the implementation of these regulations often differs across districts. The purpose of this study is to compare the effects of two methods of offering a la carte foods on students' lunch intake: 1) an extensive a la carte program, in which schools have a separate area for a la carte food sales that includes non-reimbursable entrees; and 2) a moderate a la carte program, which offers a la carte foods on the same serving line as reimbursable meals.

Methods: Direct observation was used to assess children's lunch consumption in six schools across two districts in Central Texas (n=373 observations). Schools were matched on socioeconomic status. Data collectors were randomly assigned to students and recorded foods obtained, foods consumed, source of food, gender, grade, and ethnicity. Observations were entered into a nutrient database program, FIAS Millennium Edition, to obtain nutritional information. Differences in energy and nutrient intake across lunch sources and districts were assessed using ANOVA and independent t-tests. A linear regression model was applied to control for potential confounders.

Results: Students at schools with extensive a la carte programs consumed significantly more calories, carbohydrates, total fat, saturated fat, calcium, and sodium than students in schools with moderate a la carte offerings (p<.05). Students in the extensive a la carte program consumed approximately 94 more calories than students in the moderate a la carte program. There was no significant difference in energy consumption between students who consumed any amount of a la carte food and students who consumed none. In both districts, students who consumed a la carte offerings were more likely to consume sugar-sweetened beverages, sweets, chips, and pizza than students who consumed no a la carte foods.

Conclusion: The amount, type, and method of offering a la carte foods can significantly affect students' dietary intake. This pilot study indicates that when a la carte foods are more available, students consume more calories. The findings underscore the need for further investigation of how the availability of a la carte foods affects children's diets. Guidelines for school a la carte offerings should be strengthened to encourage the consumption of healthful foods and appropriate energy intake.
New methods for quantification and analysis of quantitative real-time polymerase chain reaction data
Abstract:
Quantitative real-time polymerase chain reaction (qPCR) is a sensitive gene quantitation method that has been widely used in the biological and biomedical fields. The currently used methods for PCR data analysis, including the threshold cycle (CT) method and linear and non-linear model fitting methods, all require subtracting background fluorescence. However, the removal of background fluorescence is usually inaccurate and can therefore distort results. Here, we propose a new method, the taking-difference linear regression method, to overcome this limitation. Briefly, for each pair of consecutive PCR cycles, we subtracted the fluorescence in the earlier cycle from that in the later cycle, transforming the n-cycle raw data into n-1 cycles of data. Linear regression was then applied to the natural logarithm of the transformed data. Finally, amplification efficiencies and initial DNA molecule numbers were calculated for each PCR run. To evaluate this new method, we compared it, in terms of accuracy and precision, with the original linear regression method under three background corrections: the mean of cycles 1-3, the mean of cycles 3-7, and the minimum. Three criteria, including threshold identification, max R2, and max slope, were employed to search for target data points. Considering that PCR data are time series data, we also applied linear mixed models. Collectively, when the threshold identification criterion was applied and when the linear mixed model was adopted, the taking-difference linear regression method was superior, as it gave an accurate estimation of the initial DNA amount and a reasonable estimation of PCR amplification efficiencies. When the criteria of max R2 and max slope were used, the original linear regression method gave an accurate estimation of the initial DNA amount. Overall, the taking-difference linear regression method avoids the error in subtracting an unknown background and is thus theoretically more accurate and reliable. The method is easy to perform, and the taking-difference strategy can be extended to all current methods for qPCR data analysis.
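A compact sketch of the taking-difference idea described above, under the simplifying assumption that the exponential-phase window of points has already been chosen (the threshold-identification, max R2, and max-slope selection rules are not reproduced here):

```python
# Illustrative only: consecutive-cycle differences cancel a constant background,
# then a straight line is fit to ln(differences) versus cycle number.
import numpy as np

def taking_difference_fit(fluorescence, window):
    """fluorescence: raw readings for cycles 1..n; window: slice selecting
    the exponential-phase differences (its selection rule is omitted here)."""
    f = np.asarray(fluorescence, dtype=float)
    diffs = np.diff(f)                      # n-1 consecutive-cycle differences
    cycles = np.arange(1, len(f))           # cycle index for each difference
    x, y = cycles[window], np.log(diffs[window])
    slope, intercept = np.polyfit(x, y, 1)  # line fit to ln(diffs)
    efficiency = np.exp(slope)              # per-cycle amplification factor E
    # If F_i ~ B + F0 * E**i, then diff_i = F0 * (E - 1) * E**i, so the
    # intercept recovers F0 without ever estimating the background B.
    initial_amount = np.exp(intercept) / (efficiency - 1.0)
    return efficiency, initial_amount
```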
Abstract:
Interaction effects are an important scientific interest in many areas of research. A common approach for investigating the interaction effect of two continuous covariates on a response variable is a cross-product term in multiple linear regression. In epidemiological studies, the two-way analysis of variance (ANOVA) type of method has also been used to examine the interaction effect by replacing the continuous covariates with their discretized levels. However, the implications of the model assumptions of either approach have not been examined, and statistical validation has focused only on the general method, not specifically on the interaction effect.

In this dissertation, we investigated the validity of both approaches based on the mathematical assumptions for non-skewed data. We showed that linear regression may not be an appropriate model when the interaction effect exists, because it implies a highly skewed distribution for the response variable. We also showed that the normality and constant variance assumptions required by ANOVA are not satisfied in the model where the continuous covariates are replaced with their discretized levels. Therefore, naïve application of the ANOVA method may lead to an incorrect conclusion.

Given the problems identified above, we proposed a novel method, modified from the traditional ANOVA approach, to rigorously evaluate the interaction effect. The analytical expression of the interaction effect was derived based on the conditional distribution of the response variable given the discretized continuous covariates. A testing procedure that combines the p-values from each level of the discretized covariates was developed to test the overall significance of the interaction effect. According to the simulation study, the proposed method is more powerful than least squares regression and the ANOVA method in detecting the interaction effect when data come from a trivariate normal distribution. The proposed method was applied to a dataset from the National Institute of Neurological Disorders and Stroke (NINDS) tissue plasminogen activator (t-PA) stroke trial, and a baseline age-by-weight interaction effect was found to be significant in predicting the change from baseline in NIHSS at Month 3 among patients who received t-PA therapy.
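For concreteness, a sketch of the conventional cross-product approach mentioned at the start of this abstract (not the dissertation's proposed ANOVA-based procedure); the variable names are illustrative:

```python
# Illustrative only: test a two-covariate interaction via a cross-product term.
import statsmodels.formula.api as smf

def crossproduct_interaction_test(df):
    """Fit y ~ x1 + x2 + x1:x2 and report the interaction coefficient and p-value."""
    fit = smf.ols("y ~ x1 * x2", data=df).fit()
    return fit.params["x1:x2"], fit.pvalues["x1:x2"]
```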
Abstract:
In most clinical trials, missing data present a statistical problem in evaluating a treatment's efficacy. There are many methods commonly used to handle missing data; however, these methods leave room for bias to enter the study. This thesis was a secondary analysis of data taken from TIME, a phase 2 randomized clinical trial conducted to evaluate the safety and the effect of the administration timing of bone marrow mononuclear cells (BMMNC) for subjects with acute myocardial infarction (AMI).

We evaluated the effect of missing data by comparing the variance inflation factor (VIF) of the effect of therapy between all subjects and only the subjects with complete data. Using the general linear model, an unbiased solution was derived for the VIF of the treatment's efficacy, with the weighted least squares method used to incorporate missing data. Two groups were identified from the TIME data: 1) all subjects, and 2) subjects with complete data (baseline and follow-up measurements). After the general solution for the VIF was found, it was migrated to Excel 2010 to evaluate the data from TIME. The resulting numerical values from the two groups were compared to assess the effect of missing data.

The VIF values from the TIME study were considerably smaller in the group that included subjects with missing data. By design, we varied the correlation factor in order to evaluate the VIFs of both groups. As the correlation factor increased, the VIF values increased at a faster rate in the group with only complete data. Furthermore, while varying the correlation factor, we also varied the number of subjects with missing data to see how missing data affect the VIF. When the number of subjects with only baseline data was increased, we saw a significantly faster increase in the VIF values in the group with only complete data, while the group that included missing data showed a steady and consistent increase in the VIF. The same was seen when we varied the group with follow-up-only data. This essentially showed that the VIFs increase steadily when missing data are not ignored; when missing data are ignored, as in our comparison group, the VIF values increase sharply as the correlation increases.