178 results for biostatistics


Relevance:

10.00%

Publisher:

Abstract:

Although many family-based genetic studies have collected dietary data, very few have used the dietary information in published findings. No single solution has been presented in the literature for applying factor analysis to dietary data from several related individuals in the same household. The standard factor-analysis approach cannot be applied to the VIVA LA FAMILIA Study diet data to ascertain dietary patterns, since this population consists of three children from each family, so the dietary patterns of the related children may be correlated rather than independent. Addressing this problem will enable us to describe the dietary patterns in Hispanic families and to explore the relationships between dietary patterns and childhood obesity. In the VIVA LA FAMILIA Study, an overweight child was first identified, and then his or her siblings and parents were brought in for data collection, which included 24-hour recalls and a food frequency questionnaire (FFQ). Dietary intake data were collected using the FFQ and 24-hour recalls on 1,030 Hispanic children from 319 families. The design of the study has important and unique statistical considerations since its participants are related to each other, the majority forming distinct nuclear families. In this project we propose to investigate whether the determinants of the correlation matrix of each family unit will allow us to adjust the original correlation matrix of the dietary intake data prior to ascertaining dietary intake patterns. If these methods prove appropriate, dietary patterns among related individuals could in the future be assessed by standard orthogonal principal-component factor analysis.
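For reference, the standard orthogonal principal-component factor analysis mentioned above begins by extracting the leading eigenvector of the correlation matrix. A minimal sketch of that first step (illustrative only, not the study's code; the power-iteration routine and the toy 2x2 correlation matrix are assumptions):

```python
def first_principal_component(corr, n_iter=200):
    """Leading eigenvector of a correlation matrix via power iteration;
    this is the first factor in a principal-component factor analysis."""
    n = len(corr)
    v = [1.0] * n
    for _ in range(n_iter):
        # Multiply the matrix by the current vector and renormalize.
        w = [sum(corr[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return v

# Hypothetical correlation between two dietary intake variables.
corr = [[1.0, 0.8], [0.8, 1.0]]
pc1 = first_principal_component(corr)
```

For this symmetric toy matrix the leading eigenvector loads equally on both variables. The dissertation's point is precisely that this standard step is invalid when the rows come from correlated (related) individuals.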

Abstract:

The 3-hydroxy-3-methylglutaryl coenzyme A (HMG-CoA) reductase inhibitors, or statins, can achieve significant reductions in plasma low-density lipoprotein (LDL) cholesterol levels. Experimental and clinical evidence now shows that some statins interfere with the formation of atherosclerotic lesions independent of their hypolipidemic properties. Vulnerable plaque rupture can result in thrombus formation and artery occlusion; this plaque deterioration is responsible for most acute coronary syndromes, including myocardial infarction (MI), unstable angina, and coronary death, as well as coronary heart disease-equivalent non-hemorrhagic stroke. Inhibition of HMG-CoA reductase has potential pleiotropic effects beyond lipid lowering, as statins block production of mevalonic acid, a precursor to cholesterol and numerous other metabolites. Statins' beneficial effects on clinical events may thus also involve nonlipid mechanisms that modify endothelial function, inflammatory responses, plaque stability, and thrombus formation. Aspirin, routinely prescribed to post-MI patients as adjunct therapy, may potentiate statins' beneficial effects, as aspirin does not compete metabolically with statins but acts similarly on atherosclerotic lesions. Common functions of both medications include inhibition of platelet activity and aggregation, reduction in atherosclerotic plaque macrophage cell count, and prevention of atherosclerotic vessel endothelial dysfunction. The Cholesterol and Recurrent Events (CARE) trial provides an ideal population in which to examine the combined effects of pravastatin and aspirin. Lipid levels, as intermediate outcomes, were examined by pravastatin and aspirin status, and differences between the two pravastatin groups were found.
A modified Cox proportional-hazards model with aspirin as a time-dependent covariate was used to determine the effect of aspirin and pravastatin on the clinical cardiovascular composite endpoint of coronary heart disease death, recurrent MI, or stroke. Among those assigned to pravastatin, use of aspirin reduced the composite primary endpoint by 35%; this result was similar by gender, race, and diabetic status. Older patients demonstrated a nonsignificant 21% reduction in the primary outcome, whereas younger patients had a significant 43% reduction. Secondary outcomes examined included coronary artery bypass graft (38% reduction), nonsurgical bypass, peripheral vascular disease, and unstable angina. Pravastatin and aspirin in a post-MI population were found to be a beneficial combination that appears to work through lipid and nonlipid, anti-inflammatory mechanisms.

Abstract:

The difficulty of detecting differential gene expression in microarray data has existed for many years. Several correction procedures attempt to control the family-wise error rate in multiple comparisons, including the Bonferroni and Sidak single-step p-value adjustments, Holm's step-down correction, and Benjamini and Hochberg's false discovery rate (FDR) controlling procedure. Each multiple comparison technique has its advantages and weaknesses. We studied each method through numerical simulation studies and applied the methods to real exploratory DNA microarray data on the detection of molecular signatures in papillary thyroid cancer (PTC) patients. According to our simulation results, the Benjamini and Hochberg step-up FDR controlling procedure performed best among these multiple comparison methods, and after applying it to the PTC microarray data we discovered 1,277 potential biomarkers among 54,675 probe sets.
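The Benjamini-Hochberg step-up procedure the abstract singles out is simple enough to sketch directly: sort the m p-values, find the largest rank k with p_(k) <= (k/m)q, and reject every hypothesis of rank up to k. An illustrative implementation (the function name and toy p-values are my own, not from the dissertation):

```python
def benjamini_hochberg(pvals, q=0.05):
    """Benjamini-Hochberg step-up FDR procedure: returns a rejection
    decision for each p-value, controlling the FDR at level q."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    # Largest rank whose sorted p-value falls under the step-up line.
    k_max = 0
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= rank / m * q:
            k_max = rank
    reject = [False] * m
    for rank, i in enumerate(order, start=1):
        reject[i] = rank <= k_max
    return reject

# Toy p-values (hypothetical, not from the PTC data).
decisions = benjamini_hochberg([0.01, 0.02, 0.03, 0.50], q=0.05)
```

On these toy values BH rejects the first three hypotheses, while the single-step Bonferroni threshold (0.05/4 = 0.0125) would reject only the first, which illustrates the power gain the simulations compare.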

Abstract:

Although the area under the receiver operating characteristic curve (AUC) is the most popular measure of the performance of prediction models, it has limitations, especially when used to evaluate the added discrimination of a new biomarker in a model. Pencina et al. (2008) proposed two indices, the net reclassification improvement (NRI) and the integrated discrimination improvement (IDI), to supplement the improvement in the AUC (IAUC). Their NRI and IDI are based on binary outcomes in case-control settings, which do not involve time-to-event outcomes. However, many disease outcomes are time-dependent, and the onset time can be censored. Measuring the discrimination potential of a prognostic marker without considering time to event can lead to biased estimates. In this dissertation, we have extended the NRI and IDI to survival analysis settings and derived the corresponding sample estimators and asymptotic tests. Simulation studies were conducted to compare the performance of the time-dependent NRI and IDI with Pencina's NRI and IDI. For illustration, we have applied the proposed method to a breast cancer study. Key words: prognostic model, discrimination, time-dependent NRI and IDI
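For context, Pencina et al.'s binary-outcome IDI is the gain in mean predicted risk among events minus the gain among non-events when moving from the old model to the new one. A minimal sketch of that baseline quantity (the function name and toy risks are assumptions; this is the binary version, not the time-dependent extension the dissertation develops):

```python
def integrated_discrimination_improvement(p_old, p_new, events):
    """Binary-outcome IDI: improvement in mean predicted risk for
    events minus the change in mean predicted risk for non-events."""
    mean = lambda xs: sum(xs) / len(xs)
    ev_new = [p for p, e in zip(p_new, events) if e]
    ev_old = [p for p, e in zip(p_old, events) if e]
    ne_new = [p for p, e in zip(p_new, events) if not e]
    ne_old = [p for p, e in zip(p_old, events) if not e]
    # A better model raises risk for events and lowers it for non-events.
    return (mean(ev_new) - mean(ev_old)) - (mean(ne_new) - mean(ne_old))

# Hypothetical predicted risks for 2 events and 2 non-events.
idi = integrated_discrimination_improvement(
    p_old=[0.6, 0.5, 0.4, 0.3],
    p_new=[0.8, 0.7, 0.2, 0.1],
    events=[1, 1, 0, 0],
)
```

Here the new model shifts events up by 0.2 and non-events down by 0.2 on average, giving IDI = 0.4.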

Abstract:

Background: Research into methods for recovery from exercise-induced fatigue is a popular topic in sports medicine, kinesiology, and physical therapy. However, both the quantity and quality of studies are lacking, and no clear solution for recovery has emerged. An analysis of the statistical methods in the existing literature on performance recovery can enhance the quality of research and provide guidance for future studies. Methods: A literature review was performed using the SCOPUS, SPORTDiscus, MEDLINE, CINAHL, Cochrane Library, and Science Citation Index Expanded databases to extract studies of human performance recovery from exercise. Original studies and their statistical analyses for recovery methods including active recovery, cryotherapy/contrast therapy, massage therapy, diet/ergogenics, and rehydration were examined. Results: The review produced a Research Design and Statistical Method Analysis Summary. Conclusion: Research design and statistical methods can be improved by following this summary, which lists potential issues and suggested solutions, such as sample size calculation, consideration of sport-specific and research design issues, selection of populations and outcome markers, statistical methods for different analytical requirements, checks of equality of variance and normality of data, post hoc analyses, and effect size calculation.

Abstract:

SNP genotyping arrays have been developed to characterize single-nucleotide polymorphisms (SNPs) and DNA copy number variations (CNVs). The quality of inferences about copy number can be affected by many factors, including batch effects, DNA sample preparation, signal processing, and analytical approach. Nonparametric and model-based statistical algorithms have been developed to detect CNVs from SNP genotyping data. However, these algorithms lack specificity for small CNVs because of the high false-positive rate when calling CNVs from intensity values. Association tests based on detected CNVs therefore lack power even when the CNVs affecting disease risk are common. In this research, by combining an existing Hidden Markov Model (HMM) with logistic regression, a new genome-wide logistic regression algorithm was developed to detect CNV associations with diseases. We show that the new algorithm is more sensitive and can be more powerful in detecting CNV associations than an existing popular algorithm, especially when the CNV association signal is weak and a limited number of SNPs are located in the CNV.
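The logistic-regression half of such an algorithm can be illustrated in miniature: regress disease status on a copy-number predictor. This sketch uses plain gradient ascent on a single predictor with hypothetical data (my own simplification; the dissertation's genome-wide algorithm, with its HMM component, is far richer):

```python
import math

def fit_logistic(x, y, lr=0.1, n_iter=5000):
    """One-predictor logistic regression (intercept b0, slope b1)
    fit by gradient ascent on the log-likelihood."""
    b0 = b1 = 0.0
    n = len(x)
    for _ in range(n_iter):
        g0 = g1 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            g0 += yi - p          # gradient w.r.t. intercept
            g1 += (yi - p) * xi   # gradient w.r.t. slope
        b0 += lr * g0 / n
        b1 += lr * g1 / n
    return b0, b1

# Hypothetical data: x = CNV carrier status, y = disease status.
b0, b1 = fit_logistic([0, 0, 0, 0, 1, 1, 1, 1], [0, 0, 0, 1, 0, 1, 1, 1])
```

A positive fitted slope (here the sample log odds ratio is log 9) indicates that carriers of the CNV have higher disease odds; in the dissertation's setting the predictor would come from HMM-inferred copy number rather than known carrier status.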

Abstract:

Metabolic syndrome (MetS) is a clustering of cardiovascular (CV) risk factors that includes obesity, dyslipidemia, hyperglycemia, and elevated blood pressure. Applying the criteria for MetS can serve as a clinically feasible tool for identifying patients at high risk of CV morbidity and mortality, particularly those who do not fall into traditional risk categories. The objective of this study was to examine the association between MetS and CV mortality among 10,940 American hypertensive adults, ages 30-69 years, participating in a large randomized controlled trial of hypertension treatment (HDFP, 1973-1983). MetS was defined as the presence of hypertension and at least two of the following risk factors: obesity, dyslipidemia, or hyperglycemia. Of the 10,763 individuals with sufficient data available for analysis, 33.2% met criteria for MetS at baseline. The baseline prevalence of MetS was significantly higher among women (46%) than men (22%) and among non-blacks (37%) versus blacks (30%). All-cause and CV mortality were assessed for these 10,763 individuals. Over a median follow-up of 7.8 years, 1,425 deaths were observed; approximately 53% were attributed to CV causes. Compared to individuals without MetS at baseline, those with MetS had higher rates of all-cause mortality (14.5% versus 12.6%) and CV mortality (8.2% versus 6.4%). The unadjusted risk of CV mortality among those with MetS was 1.31 (95% confidence interval [CI], 1.12-1.52) times that of those without MetS at baseline. After adjustment for the traditional risk factors of age, race, gender, history of cardiovascular disease (CVD), and smoking status, individuals with MetS were 1.42 (95% CI, 1.20-1.67) times more likely to die of CV causes than those without. Of the individual components of MetS, hyperglycemia/diabetes conferred the strongest risk of CV mortality (OR 1.73; 95% CI, 1.39-2.15).
Results of the present study suggest that MetS, defined as the presence of hypertension plus two additional cardiometabolic risk factors (obesity, dyslipidemia, or hyperglycemia/diabetes), can be used with some success to predict CV mortality in middle-aged hypertensive adults. Ongoing and future prospective studies are vital to examine the association between MetS and cardiovascular morbidity and mortality in select high-risk subpopulations, and to continue evaluating the public health impact of aggressive, targeted screening, prevention, and treatment efforts to prevent future cardiovascular disability and death.

Abstract:

Next-generation DNA sequencing platforms can effectively detect the entire spectrum of genomic variation and are emerging as a major tool for systematic exploration of the universe of variants and interactions in the entire genome. However, data produced by next-generation sequencing technologies suffer from three basic problems: sequence errors, assembly errors, and missing data. Current statistical methods for genetic analysis are well suited to detecting associations of common variants but are less suitable for rare variants. This raises a great challenge for sequence-based genetic studies of complex diseases. This dissertation takes the genome continuum model as a general principle, and stochastic calculus and functional data analysis as tools, for developing novel and powerful statistical methods for the next generation of association studies of both qualitative and quantitative traits in the context of sequencing data, ultimately shifting the paradigm of association analysis from the current locus-by-locus analysis to collective analysis of genome regions. In this project, functional principal component (FPC) methods coupled with high-dimensional data reduction techniques are used to develop novel and powerful methods for testing the associations of the entire spectrum of genetic variation within a segment of genome or a gene, regardless of whether the variants are common or rare. Classical quantitative genetic methods suffer from high type I error rates and low power for rare variants. To overcome these limitations for resequencing data, this project used functional linear models with a scalar response to develop statistics for identifying quantitative trait loci (QTLs) for both common and rare variants. To illustrate their application, the functional linear models were applied to five quantitative traits in the Framingham Heart Study.
This project also proposed a novel concept of gene-gene co-association, in which a gene or a genomic region is taken as the unit of association analysis, and used stochastic calculus to develop a unified framework for testing the association of multiple genes or genomic regions for both common and rare alleles. The proposed methods were applied to gene-gene co-association analysis of psoriasis in two independent GWAS datasets, leading to the discovery of networks significantly associated with psoriasis.

Abstract:

Common endpoints can be divided into two categories. One is dichotomous endpoints, which take only a fixed set of values (most often two). The other is continuous endpoints, which can take any real number between two specified values. The choice of primary endpoint is critical in clinical trials. If we use only dichotomous endpoints, power may be underestimated. If only continuous endpoints are chosen, we may not obtain the expected sample size because of the occurrence of significant clinical events. Combined endpoints are used in clinical trials to give additional power. However, current combined or composite endpoints in cardiovascular disease trials, and in most clinical trials, combine either dichotomous endpoints (total mortality + total hospitalization) or continuous endpoints (risk scores). The present work applied a U-statistic to combine one dichotomous endpoint and one continuous endpoint with three different assessments, to calculate the sample size, and to test the hypothesis of a treatment effect. This is especially useful when some patients cannot provide the most precise measurement because of a medical contraindication or personal reasons. Results show that this method has greater power than an analysis using the continuous endpoint alone.
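One standard way to combine a dichotomous and a continuous endpoint in a U-statistic is to score every treatment-control pair hierarchically: compare first on the clinical event, and break ties with the continuous measure. A sketch under that assumption (the scoring rule, function names, and toy data are illustrative; the dissertation's exact construction may differ):

```python
def pair_score(t, c):
    """Score one treatment-control pair: compare on the dichotomous
    endpoint (death) first; if tied, compare the continuous score."""
    t_dead, t_val = t
    c_dead, c_val = c
    if t_dead != c_dead:
        return 1 if not t_dead else -1     # surviving counts as a win
    if t_val != c_val:
        return 1 if t_val > c_val else -1  # higher score counts as a win
    return 0

def u_statistic(treat, control):
    """Average pairwise score over all treatment-control pairs."""
    total = sum(pair_score(t, c) for t in treat for c in control)
    return total / (len(treat) * len(control))

# Hypothetical patients, each a (died, continuous score) pair.
u = u_statistic(treat=[(0, 5.0), (0, 3.0)], control=[(1, 4.0), (0, 2.0)])
```

A value near +1 favors treatment and near -1 favors control; pairs where the continuous measurement is unavailable can still contribute through the dichotomous comparison, which is the practical advantage the abstract notes.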

Abstract:

Coalescent theory represents the most significant progress in theoretical population genetics in the past three decades. It states that all genes or alleles in a given population are ultimately inherited from a single ancestor shared by all members of the population, known as the most recent common ancestor. Coalescent theory is now widely recognized as a cornerstone for rigorous statistical analyses of molecular data from populations [1]. Scientists have developed a large number of coalescent models and methods [2,3,4,5,6], which are applied not only in coalescent analysis but also in present-day population genetics, genome studies, and even public health. This thesis aims to complete a computer-based statistical framework for coalescent analysis. The framework provides a large number of coalescent models and statistical methods to assist students and researchers, with results presented in various formats such as text, graphics, and printed pages. In particular, it also supports the creation of new coalescent models and statistical methods.
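As a flavor of the models such a framework provides, the basic Kingman coalescent can be simulated in a few lines: while k lineages remain, the waiting time to the next coalescence is exponential with rate k(k-1)/2, in units of 2N generations. An illustrative sketch, not code from the thesis:

```python
import random

def coalescent_times(n, rng=random):
    """Inter-coalescence waiting times for a sample of n lineages
    under the Kingman coalescent: while k lineages remain, the wait
    is Exponential(k*(k-1)/2), in units of 2N generations."""
    times = []
    k = n
    while k > 1:
        times.append(rng.expovariate(k * (k - 1) / 2.0))
        k -= 1  # one coalescence merges two lineages into one
    return times

random.seed(42)  # seeded for reproducibility
waits = coalescent_times(5)
```

Summing the waits gives the time to the most recent common ancestor, whose expectation is 2(1 - 1/n) in these units.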

Abstract:

Most studies of differential gene expression have been conducted between two given conditions. The two-condition experiment (TCE) approach is simple in that all genes detected display a common differential expression pattern responsive to a common two-condition difference. As a consequence, genes that are differentially expressed under conditions other than the given two are undetectable with the TCE approach. To address this problem, we propose a new approach, the multiple-condition experiment (MCE) without replication, and develop corresponding statistical methods, including inference of pairs of conditions for genes, new t-statistics, and a generalized multiple-testing method for any multiple-testing procedure via a control parameter C. We applied these methods to our real MCE data from breast cancer cell lines and found that 85 percent of gene-expression variation was caused by genotypic effects and genotype-ANAX1 overexpression interactions, which agrees well with our expectations. We also applied our methods to the adenoma dataset of Notterman et al. and identified 93 differentially expressed genes that could not be found with a TCE. The MCE approach is a conceptual breakthrough in several respects: (a) many conditions of interest can be studied simultaneously; (b) the association between differential expression of genes and conditions becomes easy to study; (c) it can provide more precise information for molecular classification and diagnosis of tumors; and (d) it can save investigators a great deal of experimental resources and time.

Abstract:

Systemic sclerosis (SSc), or scleroderma, is a complex disease whose etiopathogenesis remains unelucidated. Fibrosis in multiple organs is a key feature of SSc, and studies have shown that the transforming growth factor-β (TGF-β) pathway plays a crucial role in fibrotic responses. For a complex disease such as SSc, expression quantitative trait locus (eQTL) analysis is a powerful tool for identifying genetic variations that affect the expression of genes involved in the disease. In this study, a multilevel model is described for performing a multivariate eQTL analysis to identify genetic variations (SNPs) specifically associated with the expression of three members of the TGF-β pathway: CTGF, SPARC, and COL3A1. The uniqueness of this model is that all three genes were included in one model, rather than one gene being examined at a time. A protein might contribute to multiple pathways, and this approach allows the identification of important genetic variations linked to multiple genes belonging to the same pathway. In this study, 29 SNPs were identified, 16 of which are located in known genes. Exploring the roles of these genes in TGF-β regulation will help elucidate the etiology of SSc, which will in turn help in better managing this complex disease.

Abstract:

Health departments, research institutions, policy-makers, and healthcare providers are often interested in knowing the health status of their clients and constituents. Without the resources, financial or administrative, to go out into the community and conduct health assessments directly, these entities frequently rely on data from population-based surveys to supply the information they need. Unfortunately, such surveys are ill-equipped for the job because of sample size and privacy concerns. Small area estimation (SAE) techniques have excellent potential in such circumstances but have been underutilized in public health owing to lack of awareness of and confidence in applying the methods. The goal of this research is to make model-based SAE accessible to a broad readership using clear, example-based learning. Specifically, we applied the principles of multilevel, unit-level SAE to describe the geographic distribution of HPV vaccine coverage among females aged 11-26 in Texas. Multilevel (three-level: individual, county, public health region) random-intercept logit models of HPV vaccination (receipt of ≥1 dose of Gardasil®) were fit to data from the 2008 Behavioral Risk Factor Surveillance System (outcome and level-1 covariates) and a number of secondary sources (group-level covariates). Sampling weights were scaled (level 1) or constructed (levels 2 and 3) and incorporated at every level. Using the regression coefficients (and standard errors) from the final models, I simulated 10,000 datasets for each regression coefficient from the normal distribution and applied them to the logit model to estimate HPV vaccine coverage in each county and respective demographic subgroup. For simplicity, I provide coverage estimates (and 95% confidence intervals) for counties only. County-level coverage among females aged 11-17 varied from 6.8% to 29.0%. For females aged 18-26, coverage varied from 1.9% to 23.8%.
Aggregated to the state level, these values translate to indirect state estimates of 15.5% and 11.4%, respectively, both of which fall within the confidence intervals of the direct estimates of HPV vaccine coverage in Texas (females 11-17: 17.7%, 95% CI: 13.6-21.9; females 18-26: 12.0%, 95% CI: 6.2-17.7). Small area estimation has great potential for informing policy, program development and evaluation, and the provision of health services. By harnessing the flexibility of multilevel, unit-level SAE to estimate HPV vaccine coverage among females aged 11-26 in Texas counties, I have provided (1) practical guidance on how to conceptualize and conduct model-based SAE, (2) a robust framework that can be applied to other health outcomes or geographic levels of aggregation, and (3) HPV vaccine coverage data that may inform the development of health education programs, the provision of health services, the planning of additional research studies, and the creation of local health policies.
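The simulation step this abstract describes (drawing each coefficient from a normal distribution and pushing the draws through the inverse logit) can be sketched as follows. This is an illustrative reconstruction with hypothetical function and argument names, not the author's code:

```python
import math
import random

def simulate_coverage(coef_est, coef_se, x, n_sims=10000, rng=random):
    """Monte Carlo summary of logit-model coverage: draw each
    coefficient from Normal(estimate, SE), form the linear predictor
    for covariate vector x, and apply the inverse logit; returns the
    median and a 95% simulation interval."""
    draws = []
    for _ in range(n_sims):
        eta = sum(rng.gauss(b, s) * xi
                  for b, s, xi in zip(coef_est, coef_se, x))
        draws.append(1.0 / (1.0 + math.exp(-eta)))  # inverse logit
    draws.sort()
    return draws[n_sims // 2], (draws[int(0.025 * n_sims)],
                                draws[int(0.975 * n_sims)])

# Degenerate check: zero SEs make every draw identical (hypothetical inputs).
med, (lo, hi) = simulate_coverage(coef_est=[0.0], coef_se=[0.0], x=[1.0])
```

With nonzero standard errors the spread of the 10,000 draws supplies the county-level confidence intervals the study reports.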

Abstract:

The primary interest was in predicting the distribution of runs in a sequence of Bernoulli trials. Difference equation techniques were used to express the number of runs of a given length k in n trials under three assumptions: (1) no runs of length greater than k; (2) no runs of length less than k; (3) no other assumptions about run length. Generating functions were utilized to obtain the distributions of the future number of runs, the future number of minimum-length runs, and the future number of maximum-length runs, unconditional on the number of successes and failures in the Bernoulli sequence. When the model was applied to Texas hydrology data, it provided an adequate fit in eight of the ten regions. Suggested health applications of this approach to run theory are provided.
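As a concrete companion to the run-counting machinery, the observed runs of successes in a Bernoulli sequence can be tallied directly (an illustrative helper, not the dissertation's difference-equation method):

```python
def success_run_lengths(seq):
    """Lengths of the maximal runs of successes (1s) in a Bernoulli
    sequence, e.g. 1,1,0,1,0,0,1,1,1 has runs of length 2, 1, 3."""
    runs, current = [], 0
    for outcome in seq:
        if outcome:
            current += 1       # extend the current run of successes
        elif current:
            runs.append(current)  # a failure closes the run
            current = 0
    if current:
        runs.append(current)   # close a run ending at the sequence end
    return runs

runs = success_run_lengths([1, 1, 0, 1, 0, 0, 1, 1, 1])
```

From such tallies the number of runs of a given length, and the minimum and maximum run lengths, follow immediately, which are the quantities whose future distributions the dissertation derives with generating functions.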

Abstract:

A cohort study was conducted in the Texas and Louisiana Gulf Coast area on workers who had been exposed to asbestos for 15 years or more. Most of these workers were employed in petrochemical industries. Of the 15,742 subjects initially selected for the cohort study, 3,258 had positive chest X-ray findings believed to be related to prolonged asbestos exposure. These subjects were further investigated; their workup included detailed medical and occupational histories, laboratory tests, and spirometry. The 1,803 cases with positive chest X-ray findings whose data files were complete at the end of May 1986 were analyzed, and their findings are included in this report. The prevalence of lung cancer and of cancers of the following sites (skin, stomach, oropharynx, pancreas, and kidney) was significantly increased compared with data from the Connecticut Tumor Registry. The prevalence of other chronic conditions such as hypertension, emphysema, heart disease, and peptic ulcer was also significantly high compared with data for the U.S. general population furnished by the National Center for Health Statistics (NCHS). In most instances, the occurrence of cancer and the chronic ailments previously mentioned appeared to follow 15-25 years of asbestos exposure.