11 resultados para analysis of variance

em DigitalCommons@The Texas Medical Center


Relevância:

100.00% 100.00%

Publicador:

Resumo:

The electroencephalogram (EEG) is a physiological time series that measures electrical activity at different locations in the brain, and plays an important role in epilepsy research. Exploring the variance and/or volatility may yield insights for seizure prediction, seizure detection and seizure propagation/dynamics.^ Maximal Overlap Discrete Wavelet Transforms (MODWTs) and ARMA-GARCH models were used to determine variance and volatility characteristics of 66 channels for different states of an epileptic EEG – sleep, awake, sleep-to-awake and seizure. The wavelet variances, changes in wavelet variances and volatility half-lives for the four states were compared for possible differences between seizure and non-seizure channels.^ The half-lives of two of the three seizure channels were found to be shorter than all of the non-seizure channels, based on 95% CIs for the pre-seizure and awake signals. No discernible patterns were found the wavelet variances of the change points for the different signals. ^

Relevância:

100.00% 100.00%

Publicador:

Resumo:

West Nile Virus (WNV) is an arboviral disease that has affected hundreds of residents in Harris County, Texas since its introduction in 2002. Persistent infection, lingering sequelae and other long-term symptoms of patients reaffirm the need for prevention of this important vector-borne disease. This study aimed to determine if living within 400m of a water body increases one’s odds of infection with WNV. Additionally, we wanted to determine if one’s proximity to a particular water type or water body source increased one’s odds of infection with WNV.^ 145 cases’ addresses were abstracted from the initial interview and consent records from a cohort of patients (Epidemiology of Arboviral Encephalitis in Houston study, HSC-SPH-03-039). After applying inclusion criteria, 140 cases were identified for analysis. 140 controls were selected for analysis using a population proportionate to size model and US Census Bureau data. MapMarker USA v14 was used to geocode the cases’ addresses. Both cases’ and controls’ coordinates were uploaded onto a Harris County water shapefile in MapInfo Professional v9.5.1. Distance in meters to the closest water source, closest water source type, and closest water source name were recorded.^ Analysis of Variance (p=0.329, R2 = 0.0034) indicated no association between water body distance and risk of WNV disease. Living near a creek (x2 = 11.79, p < 0.001), or the combined group of creek and gully (x 2 = 14.02, p < 0.001) were found to be strongly associated with infection of WNV. Living near Cypress Creek and its feeders (x2 = 15.2, p < 0.001) was found to be strongly associated with WNV infection. We found that creek and gully habitats, particularly Cypress Creek, were preferential for the local disease transmitting Culex quinquefasciatus and reservoir avian population.^

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Any functionally important mutation is embedded in an evolutionary matrix of other mutations. Cladistic analysis, based on this, is a method of investigating gene effects using a haplotype phylogeny to define a set of tests which localize causal mutations to branches of the phylogeny. Previous implementations of cladistic analysis have not addressed the issue of analyzing data from related individuals, though in human studies, family data are usually needed to obtain unambiguous haplotypes. In this study, a method of cladistic analysis is described in which haplotype effects are parameterized in a linear model which accounts for familial correlations. The method was used to study the effect of apolipoprotein (Apo) B gene variation on total-, LDL-, and HDL-cholesterol, triglyceride, and Apo B levels in 121 French families. Five polymorphisms defined Apo B haplotypes: the signal peptide Insertion/deletion, Bsp 1286I, XbaI, MspI, and EcoRI. Eleven haplotypes were found, and a haplotype phylogeny was constructed and used to define a set of tests of haplotype effects on lipid and apo B levels.^ This new method of cladistic analysis, the parametric method, found significant effects for single haplotypes for all variables. For HDL-cholesterol, 3 clusters of evolutionarily-related haplotypes affecting levels were found. Haplotype effects accounted for about 10% of the genetic variance of triglyceride and HDL-cholesterol levels. The results of the parametric method were compared to those of a method of cladistic analysis based on permutational testing. The permutational method detected fewer haplotype effects, even when modified to account for correlations within families. Simulation studies exploring these differences found evidence of systematic errors in the permutational method due to the process by which haplotype groups were selected for testing.^ The applicability of cladistic analysis to human data was shown. The parametric method is suggested as an improvement over the permutational method. This study has identified candidate haplotypes for sequence comparisons in order to locate the functional mutations in the Apo B gene which may influence plasma lipid levels. ^

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Linkage disequilibrium methods can be used to find genes influencing quantitative trait variation in humans. Linkage disequilibrium methods can require smaller sample sizes than linkage equilibrium methods, such as the variance component approach to find loci with a specific effect size. The increase in power is at the expense of requiring more markers to be typed to scan the entire genome. This thesis compares different linkage disequilibrium methods to determine which factors influence the power to detect disequilibrium. The costs of disequilibrium and equilibrium tests were compared to determine whether the savings in phenotyping costs when using disequilibrium methods outweigh the additional genotyping costs.^ Nine linkage disequilibrium tests were examined by simulation. Five tests involve selecting isolated unrelated individuals while four involved the selection of parent child trios (TDT). All nine tests were found to be able to identify disequilibrium with the correct significance level in Hardy-Weinberg populations. Increasing linked genetic variance and trait allele frequency were found to increase the power to detect disequilibrium, while increasing the number of generations and distance between marker and trait loci decreased the power to detect disequilibrium. Discordant sampling was used for several of the tests. It was found that the more stringent the sampling, the greater the power to detect disequilibrium in a sample of given size. The power to detect disequilibrium was not affected by the presence of polygenic effects.^ When the trait locus had more than two trait alleles, the power of the tests maximized to less than one. For the simulation methods used here, when there were more than two-trait alleles there was a probability equal to 1-heterozygosity of the marker locus that both trait alleles were in disequilibrium with the same marker allele, resulting in the marker being uninformative for disequilibrium.^ The five tests using isolated unrelated individuals were found to have excess error rates when there was disequilibrium due to population admixture. Increased error rates also resulted from increased unlinked major gene effects, discordant trait allele frequency, and increased disequilibrium. Polygenic effects did not affect the error rates. The TDT, Transmission Disequilibrium Test, based tests were not liable to any increase in error rates.^ For all sample ascertainment costs, for recent mutations ($<$100 generations) linkage disequilibrium tests were less expensive than the variance component test to carry out. Candidate gene scans saved even more money. The use of recently admixed populations also decreased the cost of performing a linkage disequilibrium test. ^

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Do siblings of centenarians tend to have longer life spans? To answer this question, life spans of 184 siblings for 42 centenarians have been evaluated. Two important questions have been addressed in analyzing the sibling data. First, a standard needs to be established, to which the life spans of 184 siblings are compared. In this report, an external reference population is constructed from the U.S. life tables. Its estimated mortality rates are treated as baseline hazards from which the relative mortality of the siblings are estimated. Second, the standard survival models which assume independent observations are invalid when correlation within family exists, underestimating the true variance. Methods that allow correlations are illustrated by three different methods. First, the cumulative relative excess mortality between siblings and their comparison group is calculated and used as an effective graphic tool, along with the Product Limit estimator of the survival function. The variance estimator of the cumulative relative excess mortality is adjusted for the potential within family correlation using Taylor linearization approach. Second, approaches that adjust for the inflated variance are examined. They are adjusted one-sample log-rank test using design effect originally proposed by Rao and Scott in the correlated binomial or Poisson distribution setting and the robust variance estimator derived from the log-likelihood function of a multiplicative model. Nether of these two approaches provide correlation estimate within families, but the comparison with the comparison with the standard remains valid under dependence. Last, using the frailty model concept, the multiplicative model, where the baseline hazards are known, is extended by adding a random frailty term that is based on the positive stable or the gamma distribution. Comparisons between the two frailty distributions are performed by simulation. Based on the results from various approaches, it is concluded that the siblings of centenarians had significant lower mortality rates as compared to their cohorts. The frailty models also indicate significant correlations between the life spans of the siblings. ^

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Improvements in the analysis of microarray images are critical for accurately quantifying gene expression levels. The acquisition of accurate spot intensities directly influences the results and interpretation of statistical analyses. This dissertation discusses the implementation of a novel approach to the analysis of cDNA microarray images. We use a stellar photometric model, the Moffat function, to quantify microarray spots from nylon microarray images. The inherent flexibility of the Moffat shape model makes it ideal for quantifying microarray spots. We apply our novel approach to a Wilms' tumor microarray study and compare our results with a fixed-circle segmentation approach for spot quantification. Our results suggest that different spot feature extraction methods can have an impact on the ability of statistical methods to identify differentially expressed genes. We also used the Moffat function to simulate a series of microarray images under various experimental conditions. These simulations were used to validate the performance of various statistical methods for identifying differentially expressed genes. Our simulation results indicate that tests taking into account the dependency between mean spot intensity and variance estimation, such as the smoothened t-test, can better identify differentially expressed genes, especially when the number of replicates and mean fold change are low. The analysis of the simulations also showed that overall, a rank sum test (Mann-Whitney) performed well at identifying differentially expressed genes. Previous work has suggested the strengths of nonparametric approaches for identifying differentially expressed genes. We also show that multivariate approaches, such as hierarchical and k-means cluster analysis along with principal components analysis, are only effective at classifying samples when replicate numbers and mean fold change are high. Finally, we show how our stellar shape model approach can be extended to the analysis of 2D-gel images by adapting the Moffat function to take into account the elliptical nature of spots in such images. Our results indicate that stellar shape models offer a previously unexplored approach for the quantification of 2D-gel spots. ^

Relevância:

100.00% 100.00%

Publicador:

Resumo:

With the recognition of the importance of evidence-based medicine, there is an emerging need for methods to systematically synthesize available data. Specifically, methods to provide accurate estimates of test characteristics for diagnostic tests are needed to help physicians make better clinical decisions. To provide more flexible approaches for meta-analysis of diagnostic tests, we developed three Bayesian generalized linear models. Two of these models, a bivariate normal and a binomial model, analyzed pairs of sensitivity and specificity values while incorporating the correlation between these two outcome variables. Noninformative independent uniform priors were used for the variance of sensitivity, specificity and correlation. We also applied an inverse Wishart prior to check the sensitivity of the results. The third model was a multinomial model where the test results were modeled as multinomial random variables. All three models can include specific imaging techniques as covariates in order to compare performance. Vague normal priors were assigned to the coefficients of the covariates. The computations were carried out using the 'Bayesian inference using Gibbs sampling' implementation of Markov chain Monte Carlo techniques. We investigated the properties of the three proposed models through extensive simulation studies. We also applied these models to a previously published meta-analysis dataset on cervical cancer as well as to an unpublished melanoma dataset. In general, our findings show that the point estimates of sensitivity and specificity were consistent among Bayesian and frequentist bivariate normal and binomial models. However, in the simulation studies, the estimates of the correlation coefficient from Bayesian bivariate models are not as good as those obtained from frequentist estimation regardless of which prior distribution was used for the covariance matrix. The Bayesian multinomial model consistently underestimated the sensitivity and specificity regardless of the sample size and correlation coefficient. In conclusion, the Bayesian bivariate binomial model provides the most flexible framework for future applications because of its following strengths: (1) it facilitates direct comparison between different tests; (2) it captures the variability in both sensitivity and specificity simultaneously as well as the intercorrelation between the two; and (3) it can be directly applied to sparse data without ad hoc correction. ^

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Background. Childhood immunization programs have dramatically reduced the morbidity and mortality associated with vaccine-preventable diseases. Proper documentation of immunizations that have been administered is essential to prevent duplicate immunization of children. To help improve documentation, immunization information systems (IISs) have been developed. IISs are comprehensive repositories of immunization information for children residing within a geographic region. The two models for participation in an IIS are voluntary inclusion, or "opt-in," and voluntary exclusion, or "opt-out." In an opt-in system, consent must be obtained for each participant, conversely, in an opt-out IIS, all children are included unless procedures to exclude the child are completed. Consent requirements for participation vary by state; the Texas IIS, ImmTrac, is an opt-in system.^ Objectives. The specific objectives are to: (1) Evaluate the variance among the time and costs associated with collecting ImmTrac consent at public and private birthing hospitals in the Greater Houston area; (2) Estimate the total costs associated with collecting ImmTrac consent at selected public and private birthing hospitals in the Greater Houston area; (3) Describe the alternative opt-out process for collecting ImmTrac consent at birth and discuss the associated cost savings relative to an opt-in system.^ Methods. Existing time-motion studies (n=281) conducted between October, 2006 and August, 2007 at 8 birthing hospitals in the Greater Houston area were used to assess the time and costs associated with obtaining ImmTrac consent at birth. All data analyzed are deidentified and contain no personal information. Variations in time and costs at each location were assessed and total costs per child and costs per year were estimated. The cost of an alternative opt-out system was also calculated.^ Results. The median time required by birth registrars to complete consent procedures varied from 72-285 seconds per child. The annual costs associated with obtaining consent for 388,285 newborns in ImmTrac's opt-in consent process were estimated at $702,000. The corresponding costs of the proposed opt-out system were estimated to total $194,000 per year. ^ Conclusions. Substantial variation in the time and costs associated with completion of ImmTrac consent procedures were observed. Changing to an opt-out system for participation could represent significant cost savings. ^

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The relative influence of race, income, education, and Food Stamp Program participation/nonparticipation on the food and nutrient intake of 102 fecund women ages 18-45 years in a Florida urban clinic population was assessed using the technique of multiple regression analysis. Study subgroups were defined by race and Food Stamp Program participation status. Education was found to have the greatest influence on food and nutrient intake. Race was the next most influential factor followed in order by Food Stamp Program participation and income. The combined effect of the four independent variables explained no more than 19 percent of the variance for any of the food and nutrient intake variables. This would indicate that a more complex model of influences is needed if variations in food and nutrient intake are to be fully explained.^ A socioeconomic questionnaire was administered to investigate other factors of influence. The influence of the mother, frequency and type of restaurant dining, and perceptions of food intake and weight were found to be factors deserving further study.^ Dietary data were collected using the 24-hour recall and food frequency checklist. Descriptive dietary findings indicated that iron and calcium were nutrients where adequacy was of concern for all study subgroups. White Food Stamp Program participants had the greatest number of mean nutrient intake values falling below the 1980 Recommended Dietary Allowances (RDAs). When Food Stamp Program participants were contrasted to nonparticipants, mean intakes of six nutrients (kilocalories, calcium, iron, vitamin A, thiamin, and riboflavin) were below the 1980 RDA compared to five mean nutrient intakes (kilocalories, calcium, iron, thiamin and riboflavin) for the nonparticipants. Use of the Index of Nutritional Quality (INQ), however, revealed that the quality of the diet of Food Stamp Program participants per 1000 kilocalories was adequate with exception of calcium and iron. Intakes of these nutrients were also not adequate on a 1000 kilocalorie basis for the nonparticipant group. When mean nutrient intakes of the groups were compared using Student's t-test oleicacid intake was the only significant difference found. Being a nonparticipant in the Food Stamp Program was found to be associated with more frequent consumption of cookies, sweet rolls, doughnuts, and honey. The findings of this study contradict the negative image of the Food Stamp Program participant and emphasize the importance of education. ^

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This dissertation develops and tests through path analysis a theoretical model to explain how socioeconomic, socioenvironmental, and biologic risk factors simultaneously influence each other to further produce short-term, depressed growth in preschoolers. Three areas of risk factors were identified: child's proximal environment, maturational stage, and biological vulnerability. The theoretical model represented both the conceptual framework and the nature and direction of the hypotheses. Original research completed in 1978-80 and in 1982 provided the background data. It was analyzed first by nested-analysis of variance, followed by path analysis. The study provided evidence of mild iron deficiency and gastrointestinal symptomatology in the etiology of depressed, short-term weight gain. Also, there was evidence suggesting that family resources for material and social survival significantly contribute to the variability of short-term, age-adjusted growth velocity. These results challenge current views of unifocal intervention, whether for prevention or control. For policy formulations, though, the mechanisms underlying any set of interlaced relationships must be decoded. Theoretical formulations here proposed should be reassessed under a more extensive research design. It is suggested that studies should be undertaken where social changes are actually in progress; otherwise, nutritional epidemiology in developing countries operates somewhere between social reality and research concepts, with little grasp of its real potential. The study stresses that there is a connection between substantive theory, empirical observation, and policy issues. ^

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Many statistical studies feature data with both exact-time and interval-censored events. While a number of methods currently exist to handle interval-censored events and multivariate exact-time events separately, few techniques exist to deal with their combination. This thesis develops a theoretical framework for analyzing a multivariate endpoint comprised of a single interval-censored event plus an arbitrary number of exact-time events. The approach fuses the exact-time events, modeled using the marginal method of Wei, Lin, and Weissfeld, with a piecewise-exponential interval-censored component. The resulting model incorporates more of the information in the data and also removes some of the biases associated with the exclusion of interval-censored events. A simulation study demonstrates that our approach produces reliable estimates for the model parameters and their variance-covariance matrix. As a real-world data example, we apply this technique to the Systolic Hypertension in the Elderly Program (SHEP) clinical trial, which features three correlated events: clinical non-fatal myocardial infarction, fatal myocardial infarction (two exact-time events), and silent myocardial infarction (one interval-censored event). ^