18 results for variance analysis

in DigitalCommons@The Texas Medical Center


Relevance: 60.00%

Publisher:

Abstract:

I have undertaken measurements of the genetic (or inherited) and nongenetic (or noninherited) components of the variability of metastasis formation and tumor diameter doubling time in more than 100 metastatic lines from each of three murine tumors (sarcoma SANH, sarcoma SA4020, and hepatocarcinoma HCA-I) syngeneic to C3Hf/Kam mice. These lines were isolated twice from lung metastases and analyzed immediately thereafter to obtain the variance of spontaneous lung metastasis and tumor diameter doubling time. Additional studies utilized cells obtained within 4 passages of isolation. Under the assumption that no genetic differences in metastasis formation or diameter doubling time existed among the cells of a given line, the variance within a line would estimate nongenetic variation, while the variability derived from differences between lines would represent variation of genetic origin. The estimates of the genetic contribution to the variation of metastasis and tumor diameter doubling time were significantly greater than zero, but only in the metastatic lines of tumor SANH was genetic variation the major source of metastatic variability (contributing 53% of the variability). In the tumor cell lines of SA4020 and HCA-I, however, the contribution of nongenetic factors predominated over genetic factors in the variability of the number of metastases and tumor diameter doubling time. A number of other parameters examined, such as DNA content, karyotype, and selection and variance analysis with passage in vivo, indicated that genetic differences existed within the cell lines and that these differences were probably created by genetic instability. The mean metastatic propensity of the lines may have increased somewhat during their isolation and isotransplantation, but the variance was only slightly affected, if at all. Analysis of the DNA profiles of the metastatic lines of SA4020 and HCA-I revealed differences between these lines and their primary parent tumors, but not among the SANH lines and their parent tumor. Furthermore, there was a direct correlation between the extent of genetic influence on metastasis formation and the ability of the tumor cells to develop resistance to cisplatinum. Thus, although nongenetic factors might predominate in contributing to metastasis formation, it is probably genetic variation and genetic instability that cause the progression of tumor cells to a more metastatic phenotype and lead to the emergence of drug resistance.
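
The genetic/nongenetic decomposition described in this abstract is essentially a one-way random-effects variance component analysis: the between-line mean square estimates the genetic component and the within-line mean square the nongenetic one. A minimal sketch of that calculation, on simulated data (line counts, replicate counts, and distributions are illustrative, not the study's):

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical data: metastasis counts for 10 lines x 8 replicate assays per line.
lines = rng.poisson(lam=rng.gamma(5.0, 4.0, size=10)[:, None], size=(10, 8)).astype(float)

k = lines.shape[1]                                 # replicates per line
ms_between = k * lines.mean(axis=1).var(ddof=1)    # mean square between lines
ms_within = lines.var(axis=1, ddof=1).mean()       # mean square within lines

var_genetic = max((ms_between - ms_within) / k, 0.0)  # between-line (genetic) component
var_nongenetic = ms_within                            # within-line (nongenetic) component
pct_genetic = 100 * var_genetic / (var_genetic + var_nongenetic)
print(f"genetic share of variability: {pct_genetic:.0f}%")
```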

Relevance: 40.00%

Publisher:

Abstract:

The electroencephalogram (EEG) is a physiological time series that measures electrical activity at different locations in the brain, and plays an important role in epilepsy research. Exploring the variance and/or volatility may yield insights for seizure prediction, seizure detection, and seizure propagation/dynamics.

Maximal overlap discrete wavelet transforms (MODWTs) and ARMA-GARCH models were used to determine variance and volatility characteristics of 66 channels for different states of an epileptic EEG: sleep, awake, sleep-to-awake, and seizure. The wavelet variances, changes in wavelet variances, and volatility half-lives for the four states were compared for possible differences between seizure and non-seizure channels.

The half-lives of two of the three seizure channels were found to be shorter than those of all of the non-seizure channels, based on 95% CIs for the pre-seizure and awake signals. No discernible patterns were found in the wavelet variances at the change points for the different signals.
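
The volatility half-life referred to here is conventionally computed from the GARCH(1,1) persistence α + β, as the number of observations required for a variance shock to decay by half: ln(0.5)/ln(α + β). A minimal sketch (parameter values are illustrative, not the study's estimates):

```python
import numpy as np

def garch_half_life(alpha: float, beta: float) -> float:
    """Half-life (in observations) of a variance shock in a GARCH(1,1) model."""
    persistence = alpha + beta
    if not 0 < persistence < 1:
        raise ValueError("half-life is defined only for 0 < alpha + beta < 1")
    return np.log(0.5) / np.log(persistence)

# Illustrative parameters for a fast- vs slow-decaying channel (not the study's values).
print(garch_half_life(alpha=0.10, beta=0.80))  # ~6.6 observations
print(garch_half_life(alpha=0.05, beta=0.93))  # ~34 observations
```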

Relevance: 30.00%

Publisher:

Abstract:

Any functionally important mutation is embedded in an evolutionary matrix of other mutations. Cladistic analysis, based on this observation, is a method of investigating gene effects that uses a haplotype phylogeny to define a set of tests which localize causal mutations to branches of the phylogeny. Previous implementations of cladistic analysis have not addressed the issue of analyzing data from related individuals, though in human studies, family data are usually needed to obtain unambiguous haplotypes. In this study, a method of cladistic analysis is described in which haplotype effects are parameterized in a linear model that accounts for familial correlations. The method was used to study the effect of apolipoprotein (Apo) B gene variation on total-, LDL-, and HDL-cholesterol, triglyceride, and Apo B levels in 121 French families. Five polymorphisms defined the Apo B haplotypes: the signal peptide insertion/deletion, Bsp1286I, XbaI, MspI, and EcoRI. Eleven haplotypes were found, and a haplotype phylogeny was constructed and used to define a set of tests of haplotype effects on lipid and Apo B levels.

This new method of cladistic analysis, the parametric method, found significant effects of single haplotypes for all variables. For HDL-cholesterol, three clusters of evolutionarily related haplotypes affecting levels were found. Haplotype effects accounted for about 10% of the genetic variance of triglyceride and HDL-cholesterol levels. The results of the parametric method were compared to those of a method of cladistic analysis based on permutational testing. The permutational method detected fewer haplotype effects, even when modified to account for correlations within families. Simulation studies exploring these differences found evidence of systematic errors in the permutational method due to the process by which haplotype groups were selected for testing.

The applicability of cladistic analysis to human data was shown. The parametric method is suggested as an improvement over the permutational method. This study has identified candidate haplotypes for sequence comparisons in order to locate the functional mutations in the Apo B gene which may influence plasma lipid levels.
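
As a rough modern analogue of the parametric method, haplotype effects with familial correlation can be fit as a linear mixed model with a random family intercept. This is a sketch under that assumption, with hypothetical column and file names (hdl, haplotype, family, apob_families.csv), not the thesis's exact parameterization:

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical long-format data: one row per individual (column names illustrative).
df = pd.read_csv("apob_families.csv")  # columns: hdl, haplotype, family

# Fixed haplotype effects; a random intercept per family absorbs familial correlation.
model = smf.mixedlm("hdl ~ C(haplotype)", data=df, groups=df["family"])
result = model.fit()
print(result.summary())
```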

Relevance: 30.00%

Publisher:

Abstract:

Linkage disequilibrium methods can be used to find genes influencing quantitative trait variation in humans. They can require smaller sample sizes than linkage equilibrium methods, such as the variance component approach, to find loci with a specific effect size. The increase in power comes at the expense of requiring more markers to be typed to scan the entire genome. This thesis compares different linkage disequilibrium methods to determine which factors influence the power to detect disequilibrium. The costs of disequilibrium and equilibrium tests were compared to determine whether the savings in phenotyping costs when using disequilibrium methods outweigh the additional genotyping costs.

Nine linkage disequilibrium tests were examined by simulation. Five tests involved selecting isolated unrelated individuals, while four involved the selection of parent-child trios, as in the transmission disequilibrium test (TDT). All nine tests were found to identify disequilibrium with the correct significance level in Hardy-Weinberg populations. Increasing linked genetic variance and trait allele frequency increased the power to detect disequilibrium, while increasing the number of generations and the distance between marker and trait loci decreased it. Discordant sampling was used for several of the tests; the more stringent the sampling, the greater the power to detect disequilibrium in a sample of a given size. The power to detect disequilibrium was not affected by the presence of polygenic effects.

When the trait locus had more than two trait alleles, the power of the tests reached a maximum of less than one. For the simulation methods used here, when there were more than two trait alleles there was a probability, equal to one minus the heterozygosity of the marker locus, that both trait alleles were in disequilibrium with the same marker allele, rendering the marker uninformative for disequilibrium.

The five tests using isolated unrelated individuals were found to have excess error rates when there was disequilibrium due to population admixture. Increased error rates also resulted from increased unlinked major gene effects, discordant trait allele frequency, and increased disequilibrium. Polygenic effects did not affect the error rates. The TDT-based tests were not liable to any increase in error rates.

For all sample ascertainment costs, for recent mutations (<100 generations) linkage disequilibrium tests were less expensive to carry out than the variance component test. Candidate gene scans saved even more money. The use of recently admixed populations also decreased the cost of performing a linkage disequilibrium test.
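
The TDT itself reduces to McNemar's test on transmissions from heterozygous parents: χ² = (b − c)²/(b + c), where b and c count parents transmitting and not transmitting the tested allele. A minimal sketch with hypothetical counts:

```python
from scipy import stats

# Hypothetical counts from heterozygous parents (not from the thesis).
b = 60  # parents transmitting the candidate allele
c = 35  # parents transmitting the other allele

tdt_chi2 = (b - c) ** 2 / (b + c)       # McNemar / TDT statistic, 1 df
p_value = stats.chi2.sf(tdt_chi2, df=1)
print(f"TDT chi2 = {tdt_chi2:.2f}, p = {p_value:.4f}")  # chi2 = 6.58, p ~ 0.01
```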

Relevance: 30.00%

Publisher:

Abstract:

In numerous intervention studies and education field trials, random assignment to treatment occurs in clusters rather than at the level of observation. This departure from randomization at the level of the individual unit may be due to logistics, political feasibility, or ecological validity. Data within the same cluster or grouping are often correlated. Application of traditional regression techniques, which assume independence between observations, to clustered data produces consistent parameter estimates; however, such estimators are often inefficient compared to methods which incorporate the clustered nature of the data into the estimation procedure (Neuhaus 1993). Multilevel models, also known as random effects or random components models, can be used to account for the clustering of data by estimating higher-level (group) as well as lower-level (individual) variation. Designing a study in which the unit of observation is nested within higher-level groupings requires the determination of sample sizes at each level. This study investigates the effect of various sampling strategies for a 3-level repeated measures design on the parameter estimates when the outcome variable of interest follows a Poisson distribution.

Results of this study suggest that second-order penalized quasi-likelihood (PQL) estimation produces the least biased estimates in the 3-level multilevel Poisson model, followed by first-order PQL and then second- and first-order marginal quasi-likelihood (MQL). The MQL estimates of both fixed and random parameters are generally satisfactory when the level-2 and level-3 variation is less than 0.10. However, as the higher-level error variance increases, the MQL estimates become increasingly biased. If convergence of the estimation algorithm is not obtained by the PQL procedure and the higher-level error variance is large, the estimates may be significantly biased; in this case, bias correction techniques such as bootstrapping should be considered as an alternative procedure. For larger sample sizes, structures with 20 or more units sampled at the levels with normally distributed random errors produced more stable estimates with less sampling variance than structures with an increased number of level-1 units. For small sample sizes, sampling fewer units at the level with Poisson variation produces less sampling variation; this criterion is no longer important when sample sizes are large.

Neuhaus J (1993). "Estimation Efficiency and Tests of Covariate Effects with Clustered Binary Data." Biometrics 49, 989-996.
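
The data-generating model under study, a three-level Poisson model with normally distributed random effects at levels 2 and 3, can be sketched as a simulation; all sample sizes and variances below are illustrative, not the dissertation's design points:

```python
import numpy as np

rng = np.random.default_rng(42)
n3, n2, n1 = 20, 20, 5   # level-3 groups, level-2 units per group, repeated measures
beta0 = 0.5              # fixed intercept on the log scale
sd3, sd2 = 0.3, 0.3      # level-3 and level-2 random-effect SDs (illustrative)

u3 = rng.normal(0, sd3, size=n3)            # level-3 random intercepts
u2 = rng.normal(0, sd2, size=(n3, n2))      # level-2 random intercepts
log_mu = beta0 + u3[:, None, None] + u2[:, :, None]  # broadcasts to (n3, n2, 1)

# Poisson outcomes for each repeated measure within each level-2 unit.
y = rng.poisson(np.exp(log_mu), size=(n3, n2, n1))
print(y.shape, y.mean())
```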

Relevance: 30.00%

Publisher:

Abstract:

This study applies the multilevel analysis technique to longitudinal data from a large clinical trial. The technique accounts for the correlation at different levels when modeling repeated blood pressure measurements taken throughout the trial. This modeling allows for closer inspection of the remaining correlation and non-homogeneity of variance in the data. Three methods of modeling the correlation were compared.

Relevance: 30.00%

Publisher:

Abstract:

Do siblings of centenarians tend to have longer life spans? To answer this question, the life spans of 184 siblings of 42 centenarians were evaluated. Two important questions were addressed in analyzing the sibling data. First, a standard must be established against which the life spans of the 184 siblings can be compared. In this report, an external reference population is constructed from the U.S. life tables; its estimated mortality rates are treated as baseline hazards from which the relative mortality of the siblings is estimated. Second, standard survival models, which assume independent observations, are invalid when correlation within families exists, underestimating the true variance. Methods that allow for correlation are illustrated in three ways. First, the cumulative relative excess mortality between the siblings and their comparison group is calculated and used as an effective graphical tool, along with the product-limit estimator of the survival function. The variance estimator of the cumulative relative excess mortality is adjusted for potential within-family correlation using a Taylor linearization approach. Second, approaches that adjust for the inflated variance are examined: an adjusted one-sample log-rank test using the design effect originally proposed by Rao and Scott in the correlated binomial or Poisson distribution setting, and a robust variance estimator derived from the log-likelihood function of a multiplicative model. Neither of these approaches provides an estimate of the correlation within families, but the comparison with the standard remains valid under dependence. Last, using the frailty model concept, the multiplicative model, in which the baseline hazards are known, is extended by adding a random frailty term based on the positive stable or the gamma distribution. Comparisons between the two frailty distributions are performed by simulation. Based on the results from the various approaches, it is concluded that the siblings of centenarians had significantly lower mortality rates compared to their cohorts. The frailty models also indicate significant correlations between the life spans of the siblings.
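
The first step, relative mortality against life-table baseline hazards, is an observed-over-expected comparison in which the expected number of deaths is the sum of each sibling's cumulative baseline hazard over follow-up. A sketch under that interpretation, with hypothetical hazards and follow-up data standing in for the U.S. life-table values:

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical annual mortality hazards by age (stand-in for U.S. life-table values).
qx = 0.0001 * np.exp(0.085 * np.arange(111))

# Hypothetical siblings: entry at age 20, exit at death or censoring (illustrative).
age_exit = rng.integers(60, 105, size=184)
died = rng.random(184) < 0.9               # True = died, False = censored

expected = sum(qx[20:a].sum() for a in age_exit)  # expected deaths under baseline
observed = died.sum()
smr = observed / expected                          # standardized mortality ratio
print(f"SMR = {smr:.2f}")                          # < 1 would indicate lower mortality
```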

Relevance: 30.00%

Publisher:

Abstract:

Improvements in the analysis of microarray images are critical for accurately quantifying gene expression levels. The acquisition of accurate spot intensities directly influences the results and interpretation of statistical analyses. This dissertation discusses the implementation of a novel approach to the analysis of cDNA microarray images. We use a stellar photometric model, the Moffat function, to quantify microarray spots from nylon microarray images. The inherent flexibility of the Moffat shape model makes it ideal for quantifying microarray spots. We apply our novel approach to a Wilms' tumor microarray study and compare our results with a fixed-circle segmentation approach for spot quantification. Our results suggest that different spot feature extraction methods can have an impact on the ability of statistical methods to identify differentially expressed genes. We also used the Moffat function to simulate a series of microarray images under various experimental conditions. These simulations were used to validate the performance of various statistical methods for identifying differentially expressed genes. Our simulation results indicate that tests taking into account the dependency between mean spot intensity and variance estimation, such as the smoothed t-test, can better identify differentially expressed genes, especially when the number of replicates and the mean fold change are low. The analysis of the simulations also showed that, overall, a rank sum test (Mann-Whitney) performed well at identifying differentially expressed genes. Previous work has suggested the strengths of nonparametric approaches for identifying differentially expressed genes. We also show that multivariate approaches, such as hierarchical and k-means cluster analysis along with principal components analysis, are only effective at classifying samples when replicate numbers and mean fold change are high. Finally, we show how our stellar shape model approach can be extended to the analysis of 2D-gel images by adapting the Moffat function to take into account the elliptical nature of spots in such images. Our results indicate that stellar shape models offer a previously unexplored approach for the quantification of 2D-gel spots.
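
The circular Moffat profile is f(r) = A(1 + r²/α²)^(−β) plus background, and its analytic integral Aπα²/(β − 1) gives the spot intensity once the profile is fitted. A minimal fitting sketch on a synthetic spot (all parameter values illustrative, not from the dissertation):

```python
import numpy as np
from scipy.optimize import curve_fit

def moffat2d(coords, amp, x0, y0, alpha, beta, bg):
    """Circular 2-D Moffat profile with constant background, returned raveled."""
    x, y = coords
    r2 = (x - x0) ** 2 + (y - y0) ** 2
    return (amp * (1 + r2 / alpha**2) ** (-beta) + bg).ravel()

# Synthetic 15x15 spot image with noise (illustrative, not real microarray data).
y, x = np.mgrid[0:15, 0:15]
truth = (200.0, 7.0, 7.0, 2.5, 2.0, 10.0)
img = moffat2d((x, y), *truth).reshape(15, 15)
img += np.random.default_rng(0).normal(0, 2, img.shape)

popt, _ = curve_fit(moffat2d, (x, y), img.ravel(), p0=(150, 7, 7, 2, 2, 5))
amp, x0, y0, alpha, beta, bg = popt
flux = amp * np.pi * alpha**2 / (beta - 1)  # analytic integral (requires beta > 1)
print(f"fitted spot intensity ~ {flux:.0f}")
```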

Relevance: 30.00%

Publisher:

Abstract:

With the recognition of the importance of evidence-based medicine, there is an emerging need for methods to systematically synthesize available data. Specifically, methods to provide accurate estimates of test characteristics for diagnostic tests are needed to help physicians make better clinical decisions. To provide more flexible approaches for meta-analysis of diagnostic tests, we developed three Bayesian generalized linear models. Two of these models, a bivariate normal and a binomial model, analyzed pairs of sensitivity and specificity values while incorporating the correlation between these two outcome variables. Noninformative independent uniform priors were used for the variances of sensitivity and specificity and for the correlation. We also applied an inverse Wishart prior to check the sensitivity of the results. The third model was a multinomial model in which the test results were modeled as multinomial random variables. All three models can include specific imaging techniques as covariates in order to compare performance. Vague normal priors were assigned to the coefficients of the covariates. The computations were carried out using the Bayesian inference Using Gibbs Sampling (BUGS) implementation of Markov chain Monte Carlo techniques. We investigated the properties of the three proposed models through extensive simulation studies. We also applied these models to a previously published meta-analysis dataset on cervical cancer as well as to an unpublished melanoma dataset. In general, our findings show that the point estimates of sensitivity and specificity were consistent among the Bayesian and frequentist bivariate normal and binomial models. However, in the simulation studies, the estimates of the correlation coefficient from the Bayesian bivariate models were not as good as those obtained from frequentist estimation, regardless of which prior distribution was used for the covariance matrix. The Bayesian multinomial model consistently underestimated sensitivity and specificity regardless of the sample size and correlation coefficient. In conclusion, the Bayesian bivariate binomial model provides the most flexible framework for future applications because of the following strengths: (1) it facilitates direct comparison between different tests; (2) it captures the variability in both sensitivity and specificity simultaneously, as well as the intercorrelation between the two; and (3) it can be directly applied to sparse data without ad hoc correction.
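
A sketch of the bivariate binomial model in a modern MCMC framework (PyMC standing in here for BUGS; the uniform priors follow the abstract's choices, while the data and study counts are hypothetical):

```python
import numpy as np
import pymc as pm

# Hypothetical 2x2 counts from three studies (not the datasets analyzed above).
tp = np.array([40, 33, 52]); n_dis = np.array([50, 40, 60])   # diseased
tn = np.array([80, 70, 95]); n_non = np.array([90, 80, 100])  # non-diseased

with pm.Model() as bivariate_binomial:
    mu = pm.Normal("mu", 0.0, 2.0, shape=2)          # mean logit(Se), logit(Sp)
    sigma = pm.Uniform("sigma", 0.0, 5.0, shape=2)   # between-study SDs (uniform priors)
    rho = pm.Uniform("rho", -1.0, 1.0)               # Se/Sp correlation (uniform prior)
    cov = pm.math.stack(
        [sigma[0] ** 2, rho * sigma[0] * sigma[1],
         rho * sigma[0] * sigma[1], sigma[1] ** 2]
    ).reshape((2, 2))
    theta = pm.MvNormal("theta", mu=mu, cov=cov, shape=(len(tp), 2))
    pm.Binomial("tp_obs", n=n_dis, p=pm.math.invlogit(theta[:, 0]), observed=tp)
    pm.Binomial("tn_obs", n=n_non, p=pm.math.invlogit(theta[:, 1]), observed=tn)
    idata = pm.sample(1000, tune=1000)
```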

Relevance: 30.00%

Publisher:

Abstract:

Background. Childhood immunization programs have dramatically reduced the morbidity and mortality associated with vaccine-preventable diseases. Proper documentation of the immunizations that have been administered is essential to prevent duplicate immunization of children. To help improve documentation, immunization information systems (IISs) have been developed. IISs are comprehensive repositories of immunization information for children residing within a geographic region. The two models for participation in an IIS are voluntary inclusion ("opt-in") and voluntary exclusion ("opt-out"). In an opt-in system, consent must be obtained for each participant; conversely, in an opt-out IIS, all children are included unless procedures to exclude the child are completed. Consent requirements for participation vary by state; the Texas IIS, ImmTrac, is an opt-in system.

Objectives. The specific objectives are to: (1) evaluate the variance in the time and costs associated with collecting ImmTrac consent at public and private birthing hospitals in the Greater Houston area; (2) estimate the total costs associated with collecting ImmTrac consent at selected public and private birthing hospitals in the Greater Houston area; and (3) describe the alternative opt-out process for collecting ImmTrac consent at birth and discuss the associated cost savings relative to an opt-in system.

Methods. Existing time-motion studies (n=281) conducted between October 2006 and August 2007 at 8 birthing hospitals in the Greater Houston area were used to assess the time and costs associated with obtaining ImmTrac consent at birth. All data analyzed were deidentified and contain no personal information. Variations in time and costs at each location were assessed, and total costs per child and costs per year were estimated. The cost of an alternative opt-out system was also calculated.

Results. The median time required by birth registrars to complete consent procedures varied from 72 to 285 seconds per child. The annual costs associated with obtaining consent for 388,285 newborns in ImmTrac's opt-in consent process were estimated at $702,000. The corresponding costs of the proposed opt-out system were estimated to total $194,000 per year.

Conclusions. Substantial variation in the time and costs associated with completion of ImmTrac consent procedures was observed. Changing to an opt-out system for participation could represent significant cost savings.
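
As a quick arithmetic check, the per-child costs implied by the reported totals can be computed directly (the totals are from the abstract; the per-child figures are derived, not reported):

```python
newborns = 388_285
opt_in_total = 702_000   # annual cost of opt-in consent (from the abstract)
opt_out_total = 194_000  # annual cost of the proposed opt-out system

print(f"opt-in:  ${opt_in_total / newborns:.2f} per child")   # ~$1.81
print(f"opt-out: ${opt_out_total / newborns:.2f} per child")  # ~$0.50
print(f"annual savings: ${opt_in_total - opt_out_total:,}")   # $508,000
```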

Relevance: 30.00%

Publisher:

Abstract:

Although many family-based genetic studies have collected dietary data, very few have used the dietary information in published findings. No single solution has been presented or discussed in the literature to deal with the problem of applying factor analysis to dietary data from several related individuals in a given household. The standard statistical approach of factor analysis cannot be applied to the VIVA LA FAMILIA Study diet data to ascertain dietary patterns, since this population consists of three children from each family; the dietary patterns of the related children may thus be correlated and non-independent. Addressing this problem in this project will enable us to describe the dietary patterns in Hispanic families and to explore the relationships between dietary patterns and childhood obesity.

In the VIVA LA FAMILIA Study, an overweight child was first identified, and then his/her siblings and parents were brought in for data collection, which included 24-hour recalls and a food frequency questionnaire (FFQ). Dietary intake data were collected using the FFQ and 24-hour recalls for 1,030 Hispanic children from 319 families.

The design of the VIVA LA FAMILIA Study has important and unique statistical considerations, since its participants are related to each other; the majority form distinct nuclear families. Thus, the standard approach of factor analysis cannot be applied to these diet data to ascertain dietary patterns. In this project we propose to investigate whether the determinants of the correlation matrix of each family unit will allow us to adjust the original correlation matrix of the dietary intake data prior to ascertaining dietary intake patterns. If these methods are appropriate, then in the future the dietary patterns among related individuals could be assessed by standard orthogonal principal component factor analysis.
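
The end point, standard orthogonal principal component factor analysis of a correlation matrix, is an eigendecomposition in which factor loadings are eigenvectors scaled by the square roots of their eigenvalues. A minimal sketch on a hypothetical food-group correlation matrix (values illustrative):

```python
import numpy as np

# Hypothetical correlation matrix for four food groups (illustrative values).
R = np.array([
    [1.0, 0.6, 0.2, 0.1],
    [0.6, 1.0, 0.3, 0.1],
    [0.2, 0.3, 1.0, 0.5],
    [0.1, 0.1, 0.5, 1.0],
])

eigvals, eigvecs = np.linalg.eigh(R)           # eigh returns ascending order
order = np.argsort(eigvals)[::-1]              # re-sort descending
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

n_factors = 2
loadings = eigvecs[:, :n_factors] * np.sqrt(eigvals[:n_factors])
print("variance explained:", eigvals[:n_factors] / eigvals.sum())
print(loadings.round(2))
```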

Relevance: 30.00%

Publisher:

Abstract:

West Nile Virus (WNV) is an arboviral disease that has affected hundreds of residents in Harris County, Texas since its introduction in 2002. Persistent infection, lingering sequelae, and other long-term symptoms of patients reaffirm the need for prevention of this important vector-borne disease. This study aimed to determine whether living within 400 m of a water body increases one's odds of infection with WNV. Additionally, we wanted to determine whether proximity to a particular water type or water body source increased one's odds of infection with WNV.

The addresses of 145 cases were abstracted from the initial interview and consent records of a cohort of patients (Epidemiology of Arboviral Encephalitis in Houston study, HSC-SPH-03-039). After applying inclusion criteria, 140 cases were identified for analysis. 140 controls were selected for analysis using a population-proportionate-to-size model and US Census Bureau data. MapMarker USA v14 was used to geocode the cases' addresses. Both cases' and controls' coordinates were uploaded onto a Harris County water shapefile in MapInfo Professional v9.5.1. Distance in meters to the closest water source, the closest water source type, and the closest water source name were recorded.

Analysis of variance (p = 0.329, R² = 0.0034) indicated no association between water body distance and risk of WNV disease. Living near a creek (χ² = 11.79, p < 0.001), or near the combined group of creek and gully (χ² = 14.02, p < 0.001), was found to be strongly associated with infection with WNV. Living near Cypress Creek and its feeders (χ² = 15.2, p < 0.001) was found to be strongly associated with WNV infection. We found that creek and gully habitats, particularly Cypress Creek, were preferential for the local disease-transmitting Culex quinquefasciatus and the reservoir avian population.
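
The reported water-type associations are chi-square tests on 2×2 tables of case/control status by proximity. A minimal sketch with scipy (the counts are hypothetical, chosen only to show the computation, not the study's data):

```python
import numpy as np
from scipy.stats import chi2_contingency

# Rows: cases, controls; columns: within 400 m of a creek, not within (hypothetical).
table = np.array([
    [48, 92],
    [22, 118],
])
chi2, p, dof, expected = chi2_contingency(table, correction=False)
print(f"chi2 = {chi2:.2f}, p = {p:.4f}")
```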

Relevance: 30.00%

Publisher:

Abstract:

A discussion of nonlinear dynamics, demonstrated by the familiar automobile, is followed by the development of a systematic method of analysis of a possibly nonlinear time series using difference equations in the general state-space format. This format allows recursive state-dependent parameter estimation after each observation, thereby revealing the dynamics inherent in the system in combination with random external perturbations.

The one-step-ahead prediction errors at each time period, transformed to have constant variance, and the estimated parametric sequences provide the information to (1) formally test whether time series observations y_t are some linear function of random errors ε_s, for some t and s, or whether the series would more appropriately be described by a nonlinear model such as a bilinear, exponential, or threshold model; (2) formally test whether a statistically significant change has occurred in structure/level, either historically or as it occurs; (3) forecast a nonlinear system with a new and innovative (but very old numerical) technique utilizing rational functions to extrapolate individual parameters as smooth functions of time, which are then combined to obtain the forecast of y; and (4) suggest a measure of resilience, i.e., how much perturbation a structure/level can tolerate, whether internal or external to the system, and remain statistically unchanged. Although similar to one-step control, this provides a less rigid way to think about changes affecting social systems.

Applications consisting of the analysis of some familiar and some simulated series demonstrate the procedure. Empirical results suggest that this state-space or modified augmented Kalman filter may provide interesting ways to identify particular kinds of nonlinearities as they occur in structural change via the state trajectory.

A computational flow-chart detailing computations and software input and output is provided in the body of the text. IBM Advanced BASIC program listings to accomplish most of the analysis are provided in the appendix.
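
The recursive estimation scheme described belongs to the Kalman filter family: after each observation, the state estimate is corrected in proportion to the one-step-ahead prediction error. A minimal scalar sketch (a drifting parameter observed with noise; all variances illustrative):

```python
import numpy as np

rng = np.random.default_rng(7)
T, q, r = 200, 0.01, 1.0                          # steps, state noise var, obs noise var
theta = np.cumsum(rng.normal(0, np.sqrt(q), T))   # slowly drifting true parameter
y = theta + rng.normal(0, np.sqrt(r), T)          # noisy observations

est, P = 0.0, 1.0     # state estimate and its variance
history = []
for t in range(T):
    P += q            # predict: variance grows by the state noise
    e = y[t] - est    # one-step-ahead prediction error
    K = P / (P + r)   # Kalman gain
    est += K * e      # update the estimate from the prediction error
    P *= (1 - K)      # update the variance
    history.append(est)
print(f"final estimate {history[-1]:.2f} vs true {theta[-1]:.2f}")
```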

Relevance: 30.00%

Publisher:

Abstract:

Path analysis has been applied to components of the iron metabolic system with the intent of suggesting an integrated procedure for better evaluating iron nutritional status at the community level. The primary variables of interest in this study were (1) iron stores, (2) total iron-binding capacity, (3) serum ferritin, (4) serum iron, (5) transferrin saturation, and (6) hemoglobin concentration. Correlation coefficients for relationships among these variables were obtained from the published literature and postulated in a series of models using measures of those variables that are feasible to include in a community nutritional survey. Models were built upon known information about the metabolism of iron and were limited by what had been reported in the literature in terms of correlation coefficients or quantitative relationships. Data were pooled from various studies, and correlations for the same bivariate relationships were averaged after z-transformations. Correlation matrices were then constructed by transforming the average values back into correlation coefficients. The results of path analysis in this study indicate that hemoglobin is not a good indicator of early iron deficiency: it does not account for variance in iron stores. On the other hand, 91% of the variance in iron stores is explained by serum ferritin and total iron-binding capacity. In addition, the magnitude of the path coefficient (0.78) of the serum ferritin-iron stores relationship signifies that serum ferritin is the most important predictor of iron stores in the proposed model. Finally, drawing upon known relations among variables and the amount of variance explained in path models, it is suggested that the following blood measures should be made in assessing community iron deficiency: (1) serum ferritin, (2) total iron-binding capacity, (3) serum iron, (4) transferrin saturation, and (5) hemoglobin concentration. These measures (with acceptable ranges and cut-off points) could make possible the complete evaluation of all three stages of iron deficiency in persons surveyed at the community level.
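
Path coefficients in a model of this kind are standardized regression weights obtained from the correlation matrix as β = R_xx⁻¹ r_xy, with explained variance R² = r_xyᵀβ. A minimal sketch with illustrative correlations (not the values used in the study):

```python
import numpy as np

# Illustrative correlations among predictors (serum ferritin, TIBC) and iron stores.
r_ferritin_tibc = -0.40                 # ferritin vs total iron-binding capacity (hypothetical)
Rxx = np.array([[1.0, r_ferritin_tibc],
                [r_ferritin_tibc, 1.0]])
rxy = np.array([0.85, -0.55])           # correlations with iron stores (hypothetical)

beta = np.linalg.solve(Rxx, rxy)        # standardized path coefficients
r_squared = rxy @ beta                  # variance in iron stores explained
print("path coefficients:", beta.round(2))  # [0.75, -0.25]
print(f"R^2 = {r_squared:.2f}")             # 0.70
```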

Relevance: 30.00%

Publisher:

Abstract:

The relative influence of race, income, education, and Food Stamp Program participation/nonparticipation on the food and nutrient intake of 102 fecund women aged 18-45 years in a Florida urban clinic population was assessed using the technique of multiple regression analysis. Study subgroups were defined by race and Food Stamp Program participation status. Education was found to have the greatest influence on food and nutrient intake. Race was the next most influential factor, followed in order by Food Stamp Program participation and income. The combined effect of the four independent variables explained no more than 19 percent of the variance for any of the food and nutrient intake variables. This would indicate that a more complex model of influences is needed if variations in food and nutrient intake are to be fully explained.

A socioeconomic questionnaire was administered to investigate other factors of influence. The influence of the mother, the frequency and type of restaurant dining, and perceptions of food intake and weight were found to be factors deserving further study.

Dietary data were collected using the 24-hour recall and a food frequency checklist. Descriptive dietary findings indicated that iron and calcium were nutrients whose adequacy was of concern for all study subgroups. White Food Stamp Program participants had the greatest number of mean nutrient intake values falling below the 1980 Recommended Dietary Allowances (RDAs). When Food Stamp Program participants were contrasted with nonparticipants, mean intakes of six nutrients (kilocalories, calcium, iron, vitamin A, thiamin, and riboflavin) were below the 1980 RDA, compared to five mean nutrient intakes (kilocalories, calcium, iron, thiamin, and riboflavin) for the nonparticipants. Use of the Index of Nutritional Quality (INQ), however, revealed that the quality of the diet of Food Stamp Program participants per 1000 kilocalories was adequate with the exception of calcium and iron. Intakes of these nutrients were also not adequate on a 1000-kilocalorie basis for the nonparticipant group. When mean nutrient intakes of the groups were compared using Student's t-test, oleic acid intake was the only significant difference found. Being a nonparticipant in the Food Stamp Program was found to be associated with more frequent consumption of cookies, sweet rolls, doughnuts, and honey. The findings of this study contradict the negative image of the Food Stamp Program participant and emphasize the importance of education.
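
The core analysis is a multiple regression of each intake variable on the four factors, with the combined R² reported. A minimal sketch, assuming a data frame with hypothetical column names (iron_mg, race, income, education, fsp):

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical data: one row per woman (file and column names illustrative).
df = pd.read_csv("intake_survey.csv")  # columns: iron_mg, race, income, education, fsp

model = smf.ols("iron_mg ~ C(race) + income + education + C(fsp)", data=df).fit()
print(model.rsquared)   # share of intake variance explained by the four factors
print(model.summary())
```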