893 results for STATISTICAL-METHOD


Relevance:

60.00%

Publisher:

Abstract:

A first result of the search for ν_μ → ν_e oscillations in the OPERA experiment, located at the Gran Sasso Underground Laboratory, is presented. The experiment looked for the appearance of ν_e in the CNGS neutrino beam using the data collected in 2008 and 2009. Data are compatible with the non-oscillation hypothesis in the three-flavour mixing model. A further analysis of the same data constrains the non-standard oscillation parameters θ_new and Δm²_new suggested by the LSND and MiniBooNE experiments. For large Δm²_new values (>0.1 eV²), the OPERA 90% C.L. upper limit on sin²(2θ_new) based on a Bayesian statistical method reaches the value 7.2 × 10⁻³.
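
A minimal sketch of how a Bayesian upper limit of this kind can be computed for a Poisson counting experiment with a flat prior on the signal; the observed count, expected background, and conversion to sin²(2θ_new) are hypothetical placeholders, not the OPERA analysis inputs.

```python
# Hypothetical sketch: Bayesian 90% C.L. upper limit on a signal from a Poisson
# counting experiment with a flat prior -- illustrative numbers only, not the
# OPERA analysis itself.
import numpy as np
from scipy.stats import poisson

n_obs = 19        # observed nu_e candidates (hypothetical)
b_exp = 19.8      # expected background events (hypothetical)

s_grid = np.linspace(0.0, 40.0, 4001)            # grid of signal-event counts
likelihood = poisson.pmf(n_obs, b_exp + s_grid)  # L(n_obs | b + s)
posterior = likelihood / likelihood.sum()        # flat prior on s >= 0

cdf = np.cumsum(posterior)
s_up = s_grid[np.searchsorted(cdf, 0.90)]        # 90% credible upper limit
print(f"90% C.L. upper limit on signal events: {s_up:.2f}")
# Dividing s_up by exposure, flux and detection efficiency would convert this
# event-level limit into a limit on the oscillation parameter sin^2(2 theta_new).
```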

Relevance:

60.00%

Publisher:

Abstract:

In 2011, there will be an estimated 1,596,670 new cancer cases and 571,950 cancer-related deaths in the US. With the ever-increasing applications of cancer genetics in epidemiology, there is great potential to identify genetic risk factors that would help identify individuals with increased genetic susceptibility to cancer, which could be used to develop interventions or targeted therapies that could reduce cancer risk and mortality. In this dissertation, I propose to develop a new statistical method to evaluate the role of haplotypes in cancer susceptibility and development. This model will be flexible enough to handle not only haplotypes of any size, but also a variety of covariates. I will then apply this method to three cancer-related data sets (Hodgkin Disease, Glioma, and Lung Cancer). I hypothesize that there is substantial improvement in the estimation of the association between haplotypes and disease with the use of a Bayesian statistical method that infers haplotypes using prior information from known genetic sources. Analyses based on haplotypes using information from publicly available genetic sources generally show increased odds ratios and smaller p-values in the Hodgkin, Glioma, and Lung data sets. For instance, the Bayesian Joint Logistic Model (BJLM) inferred haplotype TC had a substantially higher estimated effect size (OR=12.16, 95% CI = 2.47-90.1 vs. OR=9.24, 95% CI = 1.81-47.2) and a more significant p-value (0.00044 vs. 0.008) for Hodgkin Disease compared to a traditional logistic regression approach. The effect sizes of haplotypes modeled with recessive genetic effects were also higher (and had more significant p-values) when analyzed with the BJLM. Full genetic models with haplotype information developed with the BJLM resulted in significantly higher discriminatory power and a significantly higher Net Reclassification Index for lung cancer compared to those developed with haplo.stats. Future work could incorporate the 1000 Genomes Project, which offers a larger selection of SNPs that can be incorporated into the information from known genetic sources. Other future analyses include testing non-binary outcomes, such as the levels of biomarkers present in lung cancer (NNK), and extending this analysis to full GWAS studies.
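
A minimal sketch of the traditional logistic-regression comparison arm mentioned above: case/control status is regressed on the number of copies of a candidate haplotype, and the odds ratio, 95% CI, and p-value are read from the fit. The data are simulated and the haplotype label "TC" is only an illustration; the BJLM itself is not reproduced here.

```python
# Hypothetical sketch of the traditional logistic-regression comparison:
# regress case/control status on the count of a candidate haplotype (e.g. "TC").
# Data are simulated, not the dissertation's.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 500
tc_copies = rng.binomial(2, 0.15, size=n)              # 0/1/2 copies of haplotype TC
logit_p = -2.0 + 1.2 * tc_copies                       # assumed true log-odds model
disease = rng.binomial(1, 1 / (1 + np.exp(-logit_p)))  # case/control status

X = sm.add_constant(tc_copies.astype(float))
fit = sm.Logit(disease, X).fit(disp=0)

odds_ratio = np.exp(fit.params[1])
ci_lo, ci_hi = np.exp(fit.conf_int()[1])               # 95% CI on the odds ratio
print(f"OR = {odds_ratio:.2f}, 95% CI = {ci_lo:.2f}-{ci_hi:.2f}, p = {fit.pvalues[1]:.4g}")
```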

Relevance:

60.00%

Publisher:

Abstract:

Pax genes are important developmental control genes. They are involved in nervous system development, organogenesis and oncogenesis. Pax genes are defined by a well-conserved DNA-binding domain called the paired domain. Furthermore, Pax genes are also conserved in terms of their functions. For example, the Pax-6 gene has been shown to be one of the master control genes for eye development in both Drosophila and vertebrates. All of these properties of Pax genes make them an excellent model for studying the evolution of gene function.

Molecular evolutionary studies of the paired domain were carried out in this study. Five Pax genes from cnidarians, the most primitive organisms possessing a nervous system, were isolated and characterized for their DNA binding properties. By combining data obtained from GenBank and this study, the phylogenetic relationship between Pax genes was studied. It was found that Pax genes could be divided into five groups: Pax-1/9, Pax-3/7, Pax-A, Pax-2/5/8/B, and Pax-4/6. Furthermore, Pax-2/5/8/B, Pax-A and Pax-4/6 could be clustered into a supergroup I, while Pax-1/9 and Pax-3/7 could be clustered into supergroup II. The phylogeny was also supported by studies on the DNA binding properties of paired domains from different groups. A statistical method was applied to infer the critical amino acid residue substitutions between the two supergroups and within supergroup I. It was found that two amino acid residues were mainly responsible for the difference in DNA binding between the two supergroups, while only one amino acid was critical for the evolution of the novel DNA binding properties of the Pax-4/6 group from its ancestor. Evolutionary implications of these data are also discussed.

Relevance:

60.00%

Publisher:

Abstract:

A first result of the search for ν_μ → ν_e oscillations in the OPERA experiment, located at the Gran Sasso Underground Laboratory, is presented. The experiment looked for the appearance of ν_e in the CNGS neutrino beam using the data collected in 2008 and 2009. Data are compatible with the non-oscillation hypothesis in the three-flavour mixing model. A further analysis of the same data constrains the non-standard oscillation parameters θ_new and Δm²_new suggested by the LSND and MiniBooNE experiments. For large Δm²_new values (>0.1 eV²), the OPERA 90% C.L. upper limit on sin²(2θ_new) based on a Bayesian statistical method reaches the value 7.2 × 10⁻³.

Relevance:

60.00%

Publisher:

Abstract:

With the aim of understanding the mechanism of molecular evolution, mathematical problems concerning the evolutionary change of DNA sequences are studied. The problems studied and the results obtained are as follows: (1) Estimation of evolutionary distance between nucleotide sequences. Studying the pattern of nucleotide substitution for the case of unequal substitution rates, a new mathematical formula for estimating the average number of nucleotide substitutions per site between two homologous DNA sequences is developed. It is shown that this formula has a wider applicability than currently available formulae. A statistical method for estimating the number of nucleotide changes due to deletion and insertion is also developed. (2) Biases of the estimates of nucleotide substitutions obtained by the restriction enzyme method. The deviation of the estimate of nucleotide substitutions obtained by the restriction enzyme method from the true value is investigated theoretically. It is shown that the amount of the deviation depends on the nucleotides in the recognition sequence of the restriction enzyme used, unequal rates of substitution among different nucleotides, and nucleotide frequencies, but the primary factor is the unequal rates of nucleotide substitution. When many different kinds of enzymes are used, however, the amount of average deviation is generally small. (3) Distribution of restriction fragment lengths. To see the effect of undetectable restriction fragments and fragment differences on the estimate of nucleotide differences, the theoretical distribution of fragment lengths is studied. This distribution depends on the type of restriction enzymes used as well as on the relative frequencies of the four nucleotides. It is shown that undetectability of small fragments or fragment differences gives a serious underestimate of nucleotide substitutions when the length-difference method of estimation is used, but the extent of underestimation is small when the site-difference method is used. (4) Evolutionary relationships of DNA sequences in finite populations. A mathematical theory on the expected evolutionary relationships among DNA sequences (nucleons) randomly chosen from the same or different populations is developed under the assumption that the evolutionary change of nucleons is determined solely by mutation and random genetic drift. … (Author's abstract exceeds stipulated maximum length. Discontinued here with permission of author.)
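
As context for item (1), a minimal sketch of the classic equal-rate (Jukes-Cantor) estimate of nucleotide substitutions per site between two aligned sequences, the baseline that an unequal-rate formula generalizes; the toy sequences are placeholders.

```python
# Minimal sketch: Jukes-Cantor estimate of substitutions per site,
# d = -(3/4) * ln(1 - 4p/3), where p is the observed proportion of differences.
# This is the equal-rate baseline, not the dissertation's unequal-rate formula.
import math

def jukes_cantor_distance(seq1: str, seq2: str) -> float:
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to equal length")
    p = sum(a != b for a, b in zip(seq1, seq2)) / len(seq1)
    if p >= 0.75:
        raise ValueError("too many differences for the Jukes-Cantor correction")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)

print(jukes_cantor_distance("ACGTACGTAC", "ACGTTCGTAA"))  # toy aligned sequences
```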

Relevance:

60.00%

Publisher:

Abstract:

With hundreds of single nucleotide polymorphisms (SNPs) in a candidate gene and millions of SNPs across the genome, selecting an informative subset of SNPs to maximize the ability to detect genotype-phenotype association is of great interest and importance. In addition, with a large number of SNPs, analytic methods are needed that allow investigators to control the false positive rate resulting from large numbers of SNP genotype-phenotype analyses. This dissertation uses simulated data to explore methods for selecting SNPs for genotype-phenotype association studies. I examined the pattern of linkage disequilibrium (LD) across a candidate gene region and used this pattern to aid in localizing a disease-influencing mutation. The results indicate that the r² measure of linkage disequilibrium is preferred over the common D′ measure for use in genotype-phenotype association studies. Using step-wise linear regression, the best predictor of the quantitative trait was usually not the single functional mutation; rather, it was a SNP in high linkage disequilibrium with the functional mutation. Next, I compared three strategies for selecting SNPs for phenotype association studies: selection based on measures of linkage disequilibrium, selection based on a measure of haplotype diversity, and random selection. The results demonstrate that SNPs selected for maximum haplotype diversity are more informative and yield higher power than randomly selected SNPs or SNPs selected on the basis of low pair-wise LD. The data also indicate that for genes with a small contribution to the phenotype, it is more prudent for investigators to increase their sample size than to continually increase the number of SNPs in order to improve statistical power. When typing large numbers of SNPs, researchers face the challenge of using a statistical method that controls the type I error rate while maintaining adequate power. We show that an empirical genotype-based multi-locus global test that uses permutation testing to investigate the null distribution of the maximum test statistic maintains the desired overall type I error rate without overly sacrificing statistical power. The results also show that when the penetrance model is simple, the multi-locus global test does as well as or better than the haplotype analysis; for more complex models, however, haplotype analyses offer advantages. The results of this dissertation will be of utility to human geneticists designing large-scale multi-locus genotype-phenotype association studies.
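
A minimal sketch of the pairwise r² measure of linkage disequilibrium referred to above, computed from a 2 × 2 table of phased haplotype counts; the counts are hypothetical.

```python
# Minimal sketch of the r^2 linkage-disequilibrium measure between two biallelic
# SNPs, computed from phased haplotype counts (hypothetical counts below).
import numpy as np

def r_squared(hap_counts: np.ndarray) -> float:
    """hap_counts is a 2x2 table of haplotype counts [[AB, Ab], [aB, ab]]."""
    freqs = hap_counts / hap_counts.sum()
    p_a = freqs[0].sum()           # frequency of allele A at SNP 1
    p_b = freqs[:, 0].sum()        # frequency of allele B at SNP 2
    d = freqs[0, 0] - p_a * p_b    # disequilibrium coefficient D
    return d**2 / (p_a * (1 - p_a) * p_b * (1 - p_b))

counts = np.array([[180.0, 20.0], [30.0, 170.0]])  # hypothetical phased haplotypes
print(f"r^2 = {r_squared(counts):.3f}")
```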

Relevance:

60.00%

Publisher:

Abstract:

Currently, the barriers to appropriate infant feeding practices are largely unknown in the Central River Division of the Gambia. A questionnaire was developed and implemented by a local Non-Governmental Organization (NGO), the Gambia Food and Nutrition Agency, in order to gain more information and ultimately to improve the country's child mortality rate. There were two participant groups: 88 Doers, women who had adopted the appropriate complementary feeding practices as defined by the World Health Organization, and 87 Non-Doers, women who had in some way strayed from those guidelines. The questionnaire incorporated aspects of the Health Belief Model, which can be used in the development of a future intervention. The Yes/No questions were analyzed using the chi-square statistical method, and the open-ended questions were evaluated using a descriptive analysis method. The constructs for perceived susceptibility, perceived action efficacy, perceived self-efficacy, cues for action, and perception of divine will showed significant differences between the Doers and the Non-Doers (p<0.05). The descriptive analysis revealed that both participant groups had a limited understanding of the preventive value of adopting appropriate complementary feeding practices. The women in both groups also showed a strong perception of divine will. Women in the Central River Division perceive their husbands and in-laws to be the most influential in the decision-making process regarding infant feeding practices. Recommendations for future interventions must acknowledge the importance and influence of the community surrounding the women in their adoption of appropriate infant feeding practices. It would also be important to educate women about the specific guidelines for appropriate complementary feeding practices, specifically delaying the early initiation of complementary feeding. The results of this barrier analysis provide useful information for planning and implementing an effective intervention to improve the child mortality rate in the Gambia.
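
A minimal sketch of the chi-square comparison applied to each Yes/No item, using a hypothetical 2 × 2 table of Doers versus Non-Doers.

```python
# Minimal sketch of the chi-square comparison for one Yes/No item:
# a 2x2 table of Doers vs Non-Doers with hypothetical response splits.
from scipy.stats import chi2_contingency

#                Yes  No
table = [[60, 28],     # Doers (n = 88, hypothetical split)
         [38, 49]]     # Non-Doers (n = 87, hypothetical split)

chi2, p, dof, expected = chi2_contingency(table)
print(f"chi2 = {chi2:.2f}, df = {dof}, p = {p:.4f}")
```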

Relevance:

60.00%

Publisher:

Abstract:

The purpose of this dissertation was to estimate HIV incidence among the individuals who had HIV tests performed at the Houston Department of Health and Human Services (HDHHS) public health laboratory, and to examine the prevalence of concurrent HIV and AIDS diagnoses among HIV cases reported between 2000 and 2007 in Houston/Harris County.

The first study in this dissertation estimated the cumulative HIV incidence among individuals testing at the Houston public health laboratory using Serologic Testing Algorithms for Recent HIV Seroconversion (STARHS) during the two-year study period (June 1, 2005 to May 31, 2007). The HIV incidence was estimated using two independently developed statistical imputation methods, one developed by the Centers for Disease Control and Prevention (CDC) and the other by HDHHS. Among the 54,394 persons who tested for HIV during the study period, 942 tested HIV positive (positivity rate = 1.7%). Of these HIV positives, 448 (48%) were newly reported to the Houston HIV/AIDS Reporting System (HARS), and 417 of these 448 blood specimens (93%) were available for STARHS testing. The STARHS results showed that 139 (33%) of the 417 specimens represented new HIV infections. Using both the CDC and HDHHS methods, the estimated cumulative HIV incidences over the two-year study period were similar: 862 per 100,000 persons (95% CI: 655-1,070) by the CDC method and 925 per 100,000 persons (95% CI: 908-943) by the HDHHS method. Consistent with national findings, this study found that African Americans and men who have sex with men (MSM) accounted for most of the new HIV infections among the individuals testing at the Houston public health laboratory. Using the CDC statistical method, this study also found the highest cumulative HIV incidence (2,176 per 100,000 persons [95% CI: 1,536-2,798]) among those who tested at HIV counseling and testing sites, compared to sexually transmitted disease clinics (1,242 per 100,000 persons [95% CI: 871-1,608]) and city health clinics (215 per 100,000 persons [95% CI: 80-353]). This finding suggests that the HIV counseling and testing sites in Houston were successful in reaching high-risk populations and testing them early for HIV. In addition, older age groups had higher cumulative HIV incidence but accounted for smaller proportions of new HIV infections. The incidence in the 30-39 age group (994 per 100,000 persons [95% CI: 625-1,363]) was 1.5 times the incidence in the 13-29 age group (645 per 100,000 persons [95% CI: 447-840]); the incidences in the 40-49 age group (1,371 per 100,000 persons [95% CI: 765-1,977]) and the 50-and-above age group (1,369 per 100,000 persons [95% CI: 318-2,415]) were 2.1 times the incidence in the youngest 13-29 age group. The increased HIV incidence in older age groups suggests that persons aged 40 or above are still at risk of contracting HIV infection. HIV prevention programs should encourage more people aged 40 and above to test for HIV.

The second study investigated concurrent diagnoses of HIV and AIDS in Houston. Concurrent HIV/AIDS diagnosis is defined as an AIDS diagnosis within three months of the HIV diagnosis. This study found that about one-third of HIV cases in Houston/Harris County were diagnosed with HIV and AIDS concurrently (within three months). Using multivariable logistic regression analysis, this study found that being male, Hispanic, older, and diagnosed in the private sector of care were positively associated with concurrent HIV and AIDS diagnoses. By contrast, men who had sex with men and also used injection drugs (MSM/IDU) were less likely to have concurrent HIV and AIDS diagnoses (OR = 0.64, 95% CI: 0.44-0.93). A sensitivity analysis comparing different durations of elapsed time in the definition of concurrent HIV and AIDS diagnosis (1-month, 3-month, and 12-month cut-offs) affected the magnitude of the odds ratios but not their direction.

The results of these two studies, one describing the characteristics of individuals newly infected with HIV and the other describing persons diagnosed with HIV and AIDS concurrently, can be used as a reference for HIV prevention program planning in Houston/Harris County.
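
A minimal sketch of turning a count of new infections into a cumulative incidence per 100,000 persons with an exact Poisson confidence interval; this is a generic calculation with placeholder inputs, not the CDC or HDHHS imputation method.

```python
# Minimal sketch: cumulative incidence per 100,000 persons with an exact
# (Garwood) Poisson confidence interval. Generic calculation, not the CDC or
# HDHHS statistical imputation method described above.
from scipy.stats import chi2

def incidence_per_100k(cases: int, population: int, alpha: float = 0.05):
    rate = cases / population * 1e5
    lower = (chi2.ppf(alpha / 2, 2 * cases) / 2 if cases > 0 else 0.0) / population * 1e5
    upper = chi2.ppf(1 - alpha / 2, 2 * (cases + 1)) / 2 / population * 1e5
    return rate, lower, upper

# Hypothetical inputs: 139 recent infections among 54,394 persons tested.
print(incidence_per_100k(139, 54_394))
```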

Relevance:

60.00%

Publisher:

Abstract:

Traditional comparison of standardized mortality ratios (SMRs) can be misleading if the age-specific mortality ratios are not homogeneous. For this reason, a regression model has been developed which incorporates the mortality ratio as a function of age. This model is then applied to mortality data from an occupational cohort study. The nature of the occupational data necessitates the investigation of mortality ratios which increase with age. These occupational data are used primarily to illustrate and develop the statistical methodology.

The age-specific mortality ratio (MR) for the covariates of interest can be written as MR_ij...m = μ_ij...m / θ_ij...m = r · exp(Z′_ij...m β), where μ_ij...m and θ_ij...m denote the force of mortality in the study and chosen standard populations in the ij...m-th stratum, respectively, r is the intercept, Z_ij...m is the vector of covariables associated with the i-th age interval, and β is a vector of regression coefficients associated with these covariables. A Newton-Raphson iterative procedure has been used for determining the maximum likelihood estimates of the regression coefficients.

This model provides a statistical method for a logical and easily interpretable explanation of an occupational cohort mortality experience. Since it gives a reasonable fit to the mortality data, it can also be concluded that the model is fairly realistic. The traditional statistical method for the analysis of occupational cohort mortality data is to present a summary index such as the SMR under the assumption of constant (homogeneous) age-specific mortality ratios. Since the mortality ratios for occupational groups usually increase with age, the homogeneity assumption of the age-specific mortality ratios is often untenable. The traditional method of comparing SMRs under the homogeneity assumption is a special case of this model, without age as a covariate.

This model also provides a statistical technique to evaluate the relative risk between two SMRs or a dose-response relationship among several SMRs. The model presented has application in the medical, demographic and epidemiologic areas. The methods developed in this thesis are suitable for future analyses of mortality or morbidity data when the age-specific mortality/morbidity experience is a function of age or when an interaction effect between confounding variables needs to be evaluated.
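
One standard way to fit a model of this form, shown as a minimal sketch: because MR = r · exp(Z′β) with known expected deaths, it can be estimated as a Poisson regression of observed deaths with log(expected deaths) as an offset, and the iteratively reweighted least squares fit plays the role of the Newton-Raphson procedure. The age strata and counts below are hypothetical.

```python
# Minimal sketch: fit MR = r * exp(Z'beta) as a Poisson regression of observed
# deaths with log(expected deaths) as an offset. Hypothetical age strata, not
# the occupational cohort analysed in the thesis.
import numpy as np
import statsmodels.api as sm

age_mid  = np.array([25.0, 35.0, 45.0, 55.0, 65.0])   # stratum midpoints
observed = np.array([4, 9, 21, 40, 61])                # observed deaths (hypothetical)
expected = np.array([5.0, 8.5, 15.0, 24.0, 30.0])      # expected deaths from standard rates

X = sm.add_constant(age_mid - age_mid.mean())          # centred age covariate
fit = sm.GLM(observed, X, family=sm.families.Poisson(),
             offset=np.log(expected)).fit()

print("log r (intercept):", fit.params[0])
print("age trend in the mortality ratio:", fit.params[1])
# With the age coefficient fixed at zero, exp(intercept) reduces to the usual
# SMR, i.e. the traditional constant-ratio analysis is the special case beta = 0.
```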

Relevance:

60.00%

Publisher:

Abstract:

The tobacco-specific nitrosamine 4-(methylnitrosamino)-1-(3-pyridyl)-1-butanone (NNK) is an established carcinogen for lung cancer. Since the CBMN (cytokinesis-blocked micronucleus) assay has been found to be extremely sensitive to NNK-induced genetic damage, it is a potentially important factor for predicting lung cancer risk. However, the association between lung cancer and NNK-induced genetic damage measured by the CBMN assay has not been rigorously examined.

This research develops a methodology to model the chromosomal changes under NNK-induced genetic damage in a logistic regression framework in order to predict the occurrence of lung cancer. Since these chromosomal changes were usually not observed for very long due to laboratory cost and time, a resampling technique was applied to generate the Markov chain of the normal and the damaged cell states for each individual. A joint likelihood was established between the resampled Markov chains and a logistic regression model that includes the transition probabilities of this chain as covariates. Maximum likelihood estimation was applied to carry out the statistical test for comparison. The ability of this approach to increase the discriminating power to predict lung cancer was compared to a baseline "non-genetic" model.

Our method offered an option for understanding the association between dynamic cell information and lung cancer. Our study indicated that the extent of DNA damage/non-damage measured by the CBMN assay provides critical information that impacts public health studies of lung cancer risk. This novel statistical method could simultaneously estimate the process of DNA damage/non-damage and its relationship with lung cancer for each individual.
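
A minimal sketch of the idea of using an estimated normal-to-damaged transition probability as a covariate in a logistic model for lung cancer; the cell-state sequences, sample size, and effect size below are simulated stand-ins rather than the resampling scheme described above.

```python
# Minimal sketch: estimate each subject's normal->damaged transition probability
# from a short 0/1 cell-state sequence (0 = normal, 1 = damaged) and use it as a
# covariate in a logistic model for lung cancer. All data are simulated stand-ins.
import numpy as np
import statsmodels.api as sm

def transition_prob(chain):
    """Empirical P(damaged at t+1 | normal at t) for one observed chain."""
    chain = np.asarray(chain)
    from_normal = chain[:-1] == 0
    if from_normal.sum() == 0:
        return 0.0
    return float((chain[1:][from_normal] == 1).mean())

rng = np.random.default_rng(1)
n = 200
damage_rate = rng.beta(2, 8, size=n)                        # subject-specific damage rates
chains = [rng.binomial(1, p, size=20) for p in damage_rate]  # crude stand-in for CBMN chains
x = np.array([transition_prob(c) for c in chains])
cancer = rng.binomial(1, 1 / (1 + np.exp(-(-2.5 + 6.0 * x))))  # assumed association

fit = sm.Logit(cancer, sm.add_constant(x)).fit(disp=0)
print("slope on transition probability:", fit.params[1], "p-value:", fit.pvalues[1])
```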

Relevance:

60.00%

Publisher:

Abstract:

Genome-wide association studies (GWAS) have successfully identified several genetic loci associated with inherited predisposition to primary biliary cirrhosis (PBC), the most common autoimmune disease of the liver. Pathway-based tests constitute a novel paradigm for GWAS analysis. By evaluating genetic variation across a biological pathway (gene set), these tests have the potential to determine the collective impact of variants with subtle effects that are individually too weak to be detected in traditional single-variant GWAS analysis. To identify biological pathways associated with the risk of developing PBC, GWAS of PBC from Italy (449 cases and 940 controls) and Canada (530 cases and 398 controls) were independently analyzed. The linear combination test (LCT), a recently developed pathway-level statistical method, was used for this analysis. For additional validation, pathways that were replicated at the P<0.05 level of significance in both GWAS on LCT analysis were also tested for association with PBC in each dataset using two complementary GWAS pathway approaches: a modification of the gene set enrichment analysis algorithm (i-GSEA4GWAS) and Fisher's exact test for pathway enrichment ratios. Twenty-five pathways were associated with PBC risk on LCT analysis in the Italian dataset at P<0.05, of which eight had an FDR<0.25. The top pathway in the Italian dataset was the TNF/stress-related signaling pathway (p=7.38×10⁻⁴, FDR=0.18). Twenty-six pathways were associated with PBC at the P<0.05 level using the LCT in the Canadian dataset, with the regulation and function of ChREBP in liver pathway (p=5.68×10⁻⁴, FDR=0.285) emerging as the most significant pathway. Two pathways, the phosphatidylinositol signaling system (Italian: p=0.016, FDR=0.436; Canadian: p=0.034, FDR=0.693) and hedgehog signaling (Italian: p=0.044, FDR=0.636; Canadian: p=0.041, FDR=0.693), were replicated at LCT P<0.05 in both datasets. Statistically significant association of both pathways with PBC genetic susceptibility was confirmed in the Italian dataset by i-GSEA4GWAS. Results for the phosphatidylinositol signaling system were also significant in both datasets when applying Fisher's exact test for pathway enrichment ratios. This study identified a combination of known and novel pathway-level associations with PBC risk. If functionally validated, the findings may yield fresh insights into the etiology of this complex autoimmune disease, with possible preventive and therapeutic applications.
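
A minimal sketch of the Fisher's exact test for pathway enrichment ratios mentioned above, comparing the number of significant genes inside versus outside a pathway; the counts are hypothetical.

```python
# Minimal sketch of Fisher's exact test for pathway enrichment: compare the number
# of "significant" genes inside vs outside a pathway. Counts are hypothetical.
from scipy.stats import fisher_exact

sig_in_pathway, nonsig_in_pathway = 12, 38        # genes inside the pathway
sig_outside, nonsig_outside = 400, 19550          # genes outside the pathway

odds_ratio, p_value = fisher_exact(
    [[sig_in_pathway, nonsig_in_pathway],
     [sig_outside, nonsig_outside]],
    alternative="greater",                        # test for enrichment
)
print(f"enrichment OR = {odds_ratio:.2f}, p = {p_value:.3g}")
```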

Relevance:

60.00%

Publisher:

Abstract:

An interim analysis is usually performed in later phase II or phase III trials to look for convincing evidence of a significant treatment difference that may allow the trial to be terminated earlier than originally planned. This can save patient resources and shorten drug development and approval time; ethical and economic considerations are additional reasons to stop a trial early. In clinical trials of eyes, ears, knees, arms, kidneys, lungs, and other clustered treatments, the data may include distribution-free random variables with matched and unmatched subjects in one study. It is important to properly include both kinds of subjects in the interim and the final analyses so that the maximum efficiency of statistical and clinical inference can be obtained at different stages of the trial. So far, no publication has applied a statistical method for distribution-free data with matched and unmatched subjects in the interim analysis of clinical trials. In this simulation study, the hybrid statistic was used to estimate the empirical power and the empirical type I error across simulated datasets with different sample sizes, effect sizes, correlation coefficients for matched pairs, and data distributions, in both the interim and final analyses, with four different group sequential methods. Empirical power and empirical type I error were also compared to those estimated using the meta-analysis t-test on the same simulated datasets. Results from this simulation study show that, compared to the meta-analysis t-test commonly used for normally distributed observations, the hybrid statistic has greater power for data from normally, log-normally, and multinomially distributed random variables with matched and unmatched subjects and with outliers. Power rose with increases in sample size, effect size, and the correlation coefficient for the matched pairs. In addition, lower type I errors were observed with the hybrid statistic, indicating that this test is also conservative for data with outliers in the interim analysis of clinical trials.
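
A minimal sketch of how empirical power and type I error are estimated by simulation: repeatedly generate data under a null or alternative effect size and record the rejection rate. A plain two-sample t-test stands in for the hybrid statistic, which is not reproduced here, and all parameters are placeholders.

```python
# Minimal sketch of estimating empirical power and type I error by simulation.
# A two-sample t-test stands in for the hybrid statistic; parameters are placeholders.
import numpy as np
from scipy.stats import ttest_ind

def empirical_rejection_rate(effect_size, n_per_arm=50, n_sim=2000, alpha=0.05, seed=0):
    rng = np.random.default_rng(seed)
    rejections = 0
    for _ in range(n_sim):
        control = rng.normal(0.0, 1.0, n_per_arm)
        treated = rng.normal(effect_size, 1.0, n_per_arm)
        _, p = ttest_ind(treated, control)
        rejections += p < alpha
    return rejections / n_sim

print("empirical type I error:", empirical_rejection_rate(effect_size=0.0))
print("empirical power:       ", empirical_rejection_rate(effect_size=0.5))
```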

Relevance:

60.00%

Publisher:

Abstract:

This study proposed a novel statistical method that jointly models multiple outcomes and the missing data process using item response theory. This method follows the "intent-to-treat" principle in clinical trials and accounts for the correlation between the outcomes and the missing data process, and it may provide a good solution for studies of chronic mental disorders.

The simulation study demonstrated that if the true model is the proposed model with moderate or strong correlation, ignoring the within-subject correlation may lead to overestimation of the treatment effect and a type I error rate above the specified level. Even if the within-subject correlation is small, the performance of the proposed model is as good as that of the naïve response model. Thus, the proposed model is robust across different correlation settings when the data are generated by the proposed model.
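
A minimal sketch of the simulation logic behind that finding, under assumed parameters: when dropout is related to a latent response that also drives the outcome, a complete-case ("naive") analysis can overstate the treatment effect.

```python
# Minimal illustrative simulation: dropout depends on a latent response that also
# drives the outcome, so a complete-case analysis overstates the treatment effect.
# All parameters are assumed for illustration only.
import numpy as np

rng = np.random.default_rng(42)
n, true_effect, n_sim = 400, 0.30, 1000
naive_estimates = []

for _ in range(n_sim):
    treat = rng.binomial(1, 0.5, n)
    latent = rng.normal(0.0, 1.0, n)                    # shared latent response
    outcome = true_effect * treat + latent + rng.normal(0.0, 0.5, n)
    # Poor latent responders drop out of the treated arm; control dropout is random.
    p_drop = np.where(treat == 1, 1 / (1 + np.exp(2.0 + 2.0 * latent)), 0.15)
    observed = rng.binomial(1, 1 - p_drop).astype(bool)
    est = outcome[observed & (treat == 1)].mean() - outcome[observed & (treat == 0)].mean()
    naive_estimates.append(est)

print("true treatment effect:", true_effect)
print("mean complete-case estimate:", round(float(np.mean(naive_estimates)), 3))  # biased upward
```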

Relevance:

60.00%

Publisher:

Abstract:

The genomic era brought about by recent advances in next-generation sequencing technology makes genome-wide scans for natural selection a reality. Currently, almost all statistical tests and analytical methods for identifying genes under selection are performed on an individual-gene basis. Although these methods have the power to identify genes subject to strong selection, they have limited power to discover genes targeted by moderate or weak selective forces, which are crucial for understanding the molecular mechanisms of complex phenotypes and diseases. The recent availability and rapid growth of gene network and protein-protein interaction databases accompanying the genomic era open the avenue of exploring whether gene network information can enhance the power to discover genes under natural selection. The aim of this thesis is to explore and develop normal-mixture-model-based methods for leveraging gene network information to enhance the power of natural selection target gene discovery. The results show that the developed statistical method, which combines the posterior log odds of the standard normal mixture model and the Guilt-By-Association score of the gene network in a naïve Bayes framework, has the power to discover genes under moderate or weak selection that bridge the genes under strong selection, and it improves our understanding of the biology underlying complex diseases and related natural selection phenotypes.
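
A minimal sketch of the naive Bayes combination described above: the posterior log odds from a two-component normal mixture over a selection statistic are added to a log-odds version of a Guilt-By-Association network score, and genes are ranked by the sum. The mixture parameters, statistic values, and network scores below are simulated placeholders.

```python
# Minimal sketch of a naive-Bayes combination of (i) posterior log odds of selection
# from a two-component normal mixture over a selection statistic and (ii) a
# Guilt-By-Association network score. All inputs are simulated placeholders.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
stat = np.concatenate([rng.normal(0.0, 1.0, 900), rng.normal(2.5, 1.0, 100)])  # selection statistic

# Two-component normal mixture; parameters assumed known (fitted elsewhere).
pi1, mu0, mu1, sd = 0.1, 0.0, 2.5, 1.0
log_odds_stat = (np.log(pi1) + norm.logpdf(stat, mu1, sd)) \
              - (np.log(1 - pi1) + norm.logpdf(stat, mu0, sd))

# Guilt-By-Association score: here a placeholder fraction of flagged neighbours,
# converted to a log-odds term.
gba = rng.beta(2, 5, size=stat.size)
log_odds_gba = np.log(gba / (1 - gba))

combined = log_odds_stat + log_odds_gba       # naive Bayes: add independent evidence
top_genes = np.argsort(combined)[::-1][:10]
print("top candidate gene indices:", top_genes)
```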

Relevance:

60.00%

Publisher:

Abstract:

The purpose of this research is to develop a new statistical method to determine the minimum set of rows (R) in an R × C contingency table of discrete data that explains the dependence of the observations. The statistical power of the method will be determined empirically by computer simulation to judge its efficiency relative to existing methods. The method will be applied to data on DNA fragment length variation at six VNTR loci in over 72 populations from five major racial groups of humans (total sample size over 15,000 individuals, with each sample having at least 50 individuals). DNA fragment lengths grouped in bins will form the basis for studying inter-population DNA variation within the racial groups. This analysis, if inter-population differences within racial groups prove significant, will provide a rigorous re-binning procedure for forensic computation of DNA profile frequencies that takes into account intra-racial DNA variation among populations.
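
A minimal sketch of the underlying idea, as an illustration rather than the dissertation's actual procedure: greedily remove the row that contributes most to the chi-square statistic until the remaining R × C table no longer shows significant dependence, and treat the removed rows as a candidate explaining set.

```python
# Illustrative sketch (not the dissertation's procedure): greedily remove the row
# contributing most to the chi-square statistic until the remaining R x C table
# shows no significant dependence; the removed rows form a candidate explaining set.
import numpy as np
from scipy.stats import chi2_contingency

def greedy_explaining_rows(table, alpha=0.05):
    table = np.asarray(table, dtype=float)
    keep = list(range(table.shape[0]))
    removed = []
    while len(keep) > 2:
        chi2, p, _, expected = chi2_contingency(table[keep])
        if p >= alpha:                                   # remaining rows look homogeneous
            break
        contrib = ((table[keep] - expected) ** 2 / expected).sum(axis=1)
        worst = keep[int(np.argmax(contrib))]
        removed.append(worst)
        keep.remove(worst)
    return removed

bins = [[30, 10, 5], [28, 12, 6], [5, 25, 40], [31, 11, 4]]  # hypothetical bin counts
print("rows driving the dependence:", greedy_explaining_rows(bins))
```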