3 results for Fingerprints.
in DigitalCommons@The Texas Medical Center
Abstract:
Variable number of tandem repeat (VNTR) loci are genetic loci at which short sequence motifs are repeated a different number of times on different chromosomes. To explore the potential utility of VNTR loci in evolutionary studies, I have conducted a series of studies to address the following questions: (1) What are the population genetic properties of these loci? (2) What are the mutational mechanisms of repeat number change at these loci? (3) Can DNA profiles be used to measure the relatedness between a pair of individuals? (4) Can DNA fingerprints be used to measure the relatedness between populations in evolutionary studies? (5) Can microsatellite and short tandem repeat (STR) loci, which mutate in a stepwise manner, be used in evolutionary analyses?

A large number of VNTR loci typed in many populations were studied using recently developed statistical methods. The results indicate that there is no significant departure from Hardy-Weinberg expectation (HWE) at VNTR loci in most of the human populations examined, and that the departures from HWE at some VNTR loci are not caused solely by population substructure.

A statistical procedure was developed to investigate the mutational mechanisms of VNTR loci by studying their allele frequency distributions. Comparisons of frequency distribution data on several hundred VNTR loci with the predictions of two mutation models demonstrated that there are differences among VNTR loci grouped by repeat unit size.

By extending the ITO method, I derived the distribution of the number of shared bands between individuals with any kinship relationship. A maximum likelihood estimation procedure is proposed to estimate the relatedness between individuals from the observed number of bands they share.

It was believed that classical measures of genetic distance are not applicable to the analysis of DNA fingerprints, which reveal many minisatellite loci in the genome simultaneously, because information about the underlying alleles and loci is not available. I proposed a new measure of genetic distance, based on band sharing between individuals, that is applicable to DNA fingerprint data.

To address the concern that microsatellite and STR loci may not be useful for evolutionary studies because of the convergent nature of their mutation mechanisms, I carried out a theoretical study and computer simulations. I conclude that the possible bias caused by convergent mutations can be corrected, and I suggest a novel measure of genetic distance that makes the correction. In summary, hypervariable VNTR loci are useful in evolutionary studies of closely related populations or species, especially in the study of human evolution and the history of geographic dispersal of Homo sapiens. (Abstract shortened by UMI.)
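The abstract above does not give the formula for its band-sharing distance, so the following is only a minimal sketch of the widely used band-sharing index, S = 2 n_AB / (n_A + n_B), with 1 - S taken as a simple distance. The function names and the band labels are hypothetical and chosen for illustration; this is not the measure proposed in the thesis.

```python
def band_sharing(bands_a, bands_b):
    """Band-sharing index S = 2 * n_ab / (n_a + n_b) for two fingerprint profiles.

    Each profile is a collection of band labels (for example, binned
    fragment sizes); two bands count as shared when their labels are equal.
    """
    a, b = set(bands_a), set(bands_b)
    if not a and not b:
        raise ValueError("at least one profile must contain bands")
    shared = len(a & b)
    return 2 * shared / (len(a) + len(b))


def band_distance(bands_a, bands_b):
    """A simple distance, 1 - S. An assumption, not the thesis's measure."""
    return 1.0 - band_sharing(bands_a, bands_b)


# Hypothetical profiles: integer labels stand for binned band positions.
individual_1 = [21, 34, 40, 56, 72]
individual_2 = [21, 34, 48, 56, 90]

similarity = band_sharing(individual_1, individual_2)
distance = band_distance(individual_1, individual_2)
print(similarity, distance)
```

Here three of the five bands in each profile are shared, so S is 2 * 3 / 10. Averaging S over pairs of individuals within and between populations is one common way such an index is extended to populations, although the abstract does not say whether the thesis does this.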
Abstract:
Background. Pulsed-field gel electrophoresis (PFGE) is a laboratory technique in which Salmonella DNA banding patterns are used as molecular fingerprints to identify "PFGE clusters" for epidemiologic study. State health departments and the CDC use PFGE to detect clusters of related cases and to discover common sources of bacteria in outbreaks.

Objectives. Using Houston Department of Health and Human Services (HDHHS) data, the study sought: (1) to describe the epidemiology of Salmonella in Houston, with PFGE subtype as a variable; and (2) to determine whether PFGE patterns and clusters detected in Houston were local appearances of PFGE patterns or clusters that occurred statewide.

Methods. From 2002 to 2005, the HDHHS collected and analyzed data from routine surveillance of Salmonella. Between May 1, 2007 and December 31, 2007, we implemented a protocol in which PFGE patterns from local cases were sent by e-mail to the Texas Department of State Health Services (Texas DSHS) to verify whether the local PFGE patterns were also part of statewide clusters. PFGE was performed on isolates from the 106 patients who provided a sample from which Salmonella was isolated in that time period. Local PFGE clusters were investigated, with the enhanced picture obtained by linking local PFGE patterns to PFGE patterns at the state and national level.

Results. From 2002 to 2005, there were 66 PFGE clusters, ranging in size from 2 to 22 patients per cluster. The sizes of PFGE clusters differed markedly between serotypes. A common source or risk factor was found in fewer than 5 of the 66 PFGE clusters. With the revised protocol, we found that 19 of 66 local PFGE patterns were indistinguishable from PFGE patterns at Texas DSHS. During the eight months of the revised protocol, we identified ten local PFGE clusters with a total of 42 patients. The PFGE patterns for eight of the ten clusters matched PFGE patterns for cases reported to Texas DSHS from other geographic areas. Five of the ten PFGE patterns matched PFGE patterns for clusters under investigation by PulseNet at the national level. HDHHS epidemiologists identified a mode of transmission in two of the ten local clusters and a common risk factor in a third local cluster.

Conclusion. In the extended-study protocol, Houston PFGE patterns were linked to patterns seen at the state and national level. The investigation of PFGE clusters was more effective at detecting a common mode of transmission when local data were linked to state and national data.
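The cluster-and-match workflow described in the Methods can be pictured with a small sketch. It assumes that a "PFGE cluster" is simply a pattern shared by at least two patients and that matching against the state database is an exact comparison of pattern identifiers. The patient IDs, pattern labels, and function names are all hypothetical, and the actual HDHHS protocol may differ.

```python
from collections import defaultdict


def find_clusters(isolates, min_size=2):
    """Group patients by PFGE pattern and keep patterns seen in >= min_size patients.

    isolates: iterable of (patient_id, pattern_id) pairs.
    Returns a dict mapping pattern_id to the set of patient IDs in that cluster.
    """
    by_pattern = defaultdict(set)
    for patient_id, pattern_id in isolates:
        by_pattern[pattern_id].add(patient_id)
    return {p: pts for p, pts in by_pattern.items() if len(pts) >= min_size}


def match_to_reference(local_clusters, reference_patterns):
    """Flag each local cluster whose pattern also appears in a reference set."""
    return {pattern: pattern in reference_patterns for pattern in local_clusters}


# Hypothetical local surveillance data and hypothetical state-level patterns.
local_isolates = [
    ("pt01", "PAT-A"), ("pt02", "PAT-A"), ("pt03", "PAT-B"),
    ("pt04", "PAT-C"), ("pt05", "PAT-C"), ("pt06", "PAT-C"),
]
state_patterns = {"PAT-A", "PAT-D"}

clusters = find_clusters(local_isolates)
matches = match_to_reference(clusters, state_patterns)
print(clusters)
print(matches)
```

In this toy data, PAT-A and PAT-C form local clusters, and only PAT-A is also present in the state-level set, so only that cluster would be followed up as a possible local appearance of a statewide cluster.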
Abstract:
It is well accepted that tumorigenesis is a multistep process involving the aberrant functioning of genes that regulate cell proliferation, differentiation, apoptosis, genome stability, angiogenesis and motility. To obtain a full understanding of tumorigenesis, it is necessary to collect information on all aspects of cell activity. Recent advances in high-throughput technologies allow biologists to generate massive amounts of data, more than might have been imagined decades ago. These advances have made it possible to launch comprehensive projects such as The Cancer Genome Atlas (TCGA) and the International Cancer Genome Consortium (ICGC), which systematically characterize the molecular fingerprints of cancer cells using gene expression, methylation, copy number, microRNA and SNP microarrays, as well as next-generation sequencing assays that interrogate somatic mutations, insertions, deletions, translocations and structural rearrangements. Given the massive amount of data, a major challenge is to integrate information from multiple sources and formulate testable hypotheses.

This thesis focuses on developing methodologies for integrative analyses of genomic assays profiled on the same set of samples. We have developed several novel methods for integrative biomarker identification and cancer classification. We introduce a regression-based approach that identifies biomarkers predictive of therapy response or survival by integrating multiple assays, including gene expression, methylation and copy number data, through penalized regression. To identify key cancer-specific genes while accounting for multiple mechanisms of regulation, we have developed the integIRTy software, which uses Item Response Theory to provide robust and reliable inferences about gene alteration by automatically adjusting for sample heterogeneity as well as technical artifacts.

To meet the increasing need for accurate cancer diagnosis and individualized therapy, we have developed a robust and powerful algorithm called SIBER to systematically identify bimodally expressed genes using next-generation RNAseq data. We have shown that prediction models built from these bimodal genes have the same accuracy as models built from all genes. Further, prediction models that use gene expression measurements dichotomized according to their bimodal shapes still perform well. The effectiveness of outcome prediction using discretized signals paves the way for more accurate and interpretable cancer classification by integrating signals from multiple sources.
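To make the idea of dichotomizing a bimodally expressed gene concrete, here is a minimal sketch. It is not SIBER, whose details the abstract does not give. As a stand-in, it picks the cut point that minimizes the within-group sum of squares (a one-dimensional two-group split) and then recodes each sample as 0 or 1. The function names and the expression values are hypothetical.

```python
from statistics import mean


def best_split(values):
    """Return the threshold that best separates values into a low and a high group.

    Every cut between adjacent sorted values is tried, and the one with the
    smallest within-group sum of squared deviations is kept. This is a
    stand-in for a proper mixture-model fit, not the SIBER algorithm.
    """
    xs = sorted(values)
    if len(xs) < 2:
        raise ValueError("need at least two values to split")
    best_sse, best_threshold = None, None
    for i in range(1, len(xs)):
        low, high = xs[:i], xs[i:]
        m_low, m_high = mean(low), mean(high)
        sse = sum((x - m_low) ** 2 for x in low) + sum((x - m_high) ** 2 for x in high)
        if best_sse is None or sse < best_sse:
            best_sse = sse
            best_threshold = (xs[i - 1] + xs[i]) / 2  # midpoint of the gap
    return best_threshold


def dichotomize(values, threshold):
    """Recode expression values as 1 (above threshold) or 0 (at or below)."""
    return [1 if v > threshold else 0 for v in values]


# Hypothetical log-expression values for one gene across eight samples.
expression = [1.0, 1.2, 0.9, 1.1, 5.0, 5.2, 4.9, 5.1]

threshold = best_split(expression)
states = dichotomize(expression, threshold)
print(threshold, states)
```

The resulting 0/1 vector is the kind of discretized signal that could be fed to a classifier in place of the continuous measurements. A real analysis would also need a criterion for deciding whether a gene is bimodal at all, which this sketch does not attempt.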