13 results for High Frequency Data
in DigitalCommons@The Texas Medical Center
Abstract:
Recent studies indicate that polymorphic genetic markers are potentially helpful in resolving genealogical relationships among individuals in a natural population. Genetic data provide opportunities for paternity exclusion when genotypic incompatibilities are observed among individuals, and the present investigation examines the resolving power of genetic markers in unambiguous positive determination of paternity. Under the assumption that the mother for each offspring in a population is unambiguously known, an analytical expression for the fraction of males excluded from paternity is derived for the case where males and females may be derived from two different gene pools. This theoretical formulation can also be used to predict the fraction of births for each of which all but one male can be excluded from paternity. We show that even when the average probability of exclusion approaches unity, a substantial fraction of births yield equivocal mother-father-offspring determinations. The number of loci needed to increase the frequency of unambiguous determinations to a high level is beyond the scope of current electrophoretic studies in most species. Applications of this theory to electrophoretic data on Chamaelirium luteum (L.) show that in 2255 offspring derived from 273 males and 70 females, only 57 triplets could be unequivocally determined with eight polymorphic protein loci, even though the average combined exclusionary power of these loci was 73%. The distribution of potentially compatible male parents, based on multilocus genotypes, was reasonably well predicted from the allele frequency data available for these loci. We demonstrate that genetic paternity analysis in natural populations cannot be reliably based on exclusionary principles alone. In order to measure the reproductive contributions of individuals in natural populations, more elaborate likelihood principles must be deployed.
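To illustrate why a high average exclusion probability need not yield unambiguous paternity, the Python sketch below makes two simplifying assumptions that are not taken from the study: loci act independently, and every non-father male is excluded independently with the same probability. It is not the analytical expression derived in the abstract above, and the function names and example numbers are hypothetical.

```python
from math import prod


def combined_exclusion(per_locus):
    """Combined exclusion probability across loci, assuming independent loci.

    A non-father escapes exclusion only if he escapes at every locus.
    """
    return 1.0 - prod(1.0 - p for p in per_locus)


def prob_single_compatible_male(p_exclude, n_nonfathers):
    """Probability that every non-father is excluded, leaving only the true father.

    Assumes (for illustration only) that each non-father is excluded
    independently with the same probability p_exclude.
    """
    return p_exclude ** n_nonfathers


if __name__ == "__main__":
    # Hypothetical values: two loci, each excluding half of all non-fathers.
    print(combined_exclusion([0.5, 0.5]))
    # Hypothetical values: 99% exclusion power, 99 candidate non-fathers.
    print(prob_single_compatible_male(0.99, 99))
```

Under these assumptions, an exclusion probability of 0.99 with 99 candidate non-fathers leaves a unique compatible male in only a little over a third of births, which is the qualitative point made above. Real data depart from this uniform model because exclusion power varies with the mother-offspring genotype.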
Abstract:
Brain tumor is one of the most aggressive types of cancer in humans, with an estimated median survival time of 12 months and only 4% of the patients surviving more than 5 years after disease diagnosis. Until recently, brain tumor prognosis has been based only on clinical information such as tumor grade and patient age, but there are reports indicating that molecular profiling of gliomas can reveal subgroups of patients with distinct survival rates. We hypothesize that coupling molecular profiling of brain tumors with clinical information might improve predictions of patient survival time and, consequently, better guide future treatment decisions. In order to evaluate this hypothesis, the general goal of this research is to build models for survival prediction of glioma patients using DNA molecular profiles (U133 Affymetrix gene expression microarrays) along with clinical information. First, a predictive Random Forest model is built for binary outcomes (i.e., short vs. long-term survival) and a small subset of genes whose expression values can be used to predict survival time is selected. Next, a new statistical methodology is developed for predicting time-to-death outcomes using Bayesian ensemble trees. Because of the large heterogeneity observed within the prognostic classes obtained by the Random Forest model, prediction can be improved by relating time-to-death to the gene expression profile directly. We propose a Bayesian ensemble model for survival prediction that is appropriate for high-dimensional data such as gene expression data. Our approach is based on the ensemble "sum-of-trees" model, which is flexible enough to incorporate additive and interaction effects between genes. We specify a fully Bayesian hierarchical approach and illustrate our methodology for the CPH, Weibull, and AFT survival models. We overcome the lack of conjugacy by using a latent variable formulation to model the covariate effects, which decreases computation time for model fitting.
Also, our proposed models provide a model-free way to select important predictive prognostic markers based on controlling false discovery rates. We compare the performance of our methods with baseline reference survival methods and apply our methodology to an unpublished data set of brain tumor survival times and gene expression data, selecting genes potentially related to the development of the disease under study. A closing discussion compares results obtained by Random Forest and Bayesian ensemble methods under the biological/clinical perspectives and highlights the statistical advantages and disadvantages of the new methodology in the context of DNA microarray data analysis.
Abstract:
High-throughput assays, such as the yeast two-hybrid system, have generated a huge amount of protein-protein interaction (PPI) data in the past decade. This tremendously increases the need for developing reliable methods to systematically and automatically suggest protein functions and relationships between them. With the available PPI data, it is now possible to study the functions and relationships in the context of a large-scale network. To date, several network-based schemes have been provided to effectively annotate protein functions on a large scale. However, because of the noise inherent in high-throughput data generation, new methods and algorithms should be developed to increase the reliability of functional annotations. Previous work in a yeast PPI network (Samanta and Liang, 2003) has shown that the local connection topology, particularly for two proteins sharing an unusually large number of neighbors, can predict functional associations between proteins, and hence suggest their functions. One advantage of that work is that its algorithm is not sensitive to noise (false positives) in high-throughput PPI data. In this study, we improved their prediction scheme by developing a new algorithm and new methods, which we applied to a human PPI network to make a genome-wide functional inference. We used the new algorithm to measure and reduce the influence of hub proteins on detecting functionally associated proteins. We used the annotations of the Gene Ontology (GO) and the Kyoto Encyclopedia of Genes and Genomes (KEGG) as independent and unbiased benchmarks to evaluate our algorithms and methods within the human PPI network. We showed that, compared with the previous work from Samanta and Liang, our algorithm and methods developed in this study improved the overall quality of functional inferences for human proteins. By applying the algorithms to the human PPI network, we obtained 4,233 significant functional associations among 1,754 proteins.
Further comparisons of their KEGG and GO annotations allowed us to assign 466 KEGG pathway annotations to 274 proteins and 123 GO annotations to 114 proteins with estimated false discovery rates of <21% for KEGG and <30% for GO. We clustered 1,729 proteins by their functional associations and performed pathway analysis to identify several subclusters that are highly enriched in certain signaling pathways. In particular, we performed a detailed analysis of a subcluster enriched in the transforming growth factor β signaling pathway ($P < 10^{-50}$), which is important in cell proliferation and tumorigenesis. Analysis of another four subclusters also suggested potential new players in six signaling pathways worthy of further experimental investigation. Our study gives clear insight into the common neighbor-based prediction scheme and provides a reliable method for large-scale functional annotation in this post-genomic era.
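The idea of scoring two proteins by an unusually large number of shared neighbors can be sketched with a hypergeometric tail probability. This is an assumed, generic formulation chosen for illustration; the exact statistic used by Samanta and Liang, and the hub correction developed in the study above, may differ. The toy network and protein names are hypothetical.

```python
from math import comb


def shared_neighbor_pvalue(n_total, deg_a, deg_b, shared):
    """P(X >= shared) under a hypergeometric null.

    Null model (an assumption): protein B picks its deg_b neighbors at random
    from n_total proteins, deg_a of which happen to be neighbors of protein A.
    """
    upper = min(deg_a, deg_b)
    denom = comb(n_total, deg_b)
    tail = sum(
        comb(deg_a, k) * comb(n_total - deg_a, deg_b - k)
        for k in range(shared, upper + 1)
    )
    return tail / denom


if __name__ == "__main__":
    # Hypothetical toy network stored as an adjacency dict of sets.
    network = {
        "P1": {"P3", "P4", "P5"},
        "P2": {"P3", "P4", "P5"},
    }
    shared = len(network["P1"] & network["P2"])
    # Assume 10 candidate proteins in total.
    print(shared_neighbor_pvalue(10, len(network["P1"]), len(network["P2"]), shared))
```

Because the null model conditions on each protein's degree, a highly connected hub needs many more shared neighbors than a sparsely connected protein to reach the same p-value, which is one simple way to limit hub influence.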
Abstract:
The objective of this longitudinal study, conducted in a neonatal intensive care unit, was to characterize the response to pain of high-risk very low birth weight infants (<1,500 g) from 23 to 38 weeks post-menstrual age (PMA) by measuring heart rate variability (HRV). Heart period data were recorded before, during, and after a blood draw by heel lance or wrist venipuncture for routine clinical evaluation. Pain response to the blood draw procedure and age-related changes of HRV in low-frequency and high-frequency bands were modeled with linear mixed-effects models. HRV in both bands decreased during pain, followed by a recovery to near-baseline levels. Venipuncture and mechanical ventilation were factors that attenuated the HRV response to pain. HRV at baseline increased with PMA, but the growth rate of high-frequency power was reduced in mechanically ventilated infants. There was some evidence that low-frequency HRV response to pain improved with advancing PMA.
Abstract:
Lyme disease Borrelia can infect humans and animals for months to years, despite the presence of an active host immune response. The vls antigenic variation system, which expresses the surface-exposed lipoprotein VlsE, plays a major role in B. burgdorferi immune evasion. Gene conversion between vls silent cassettes and the vlsE expression site occurs at high frequency during mammalian infection, resulting in sequence variation in the VlsE product. In this study, we examined vlsE sequence variation in B. burgdorferi B31 during mouse infection by analyzing 1,399 clones isolated from bladder, heart, joint, ear, and skin tissues of mice infected for 4 to 365 days. The median number of codon changes increased progressively in C3H/HeN mice from 4 to 28 days post infection, and no clones retained the parental vlsE sequence at 28 days. In contrast, the decrease in the number of clones with the parental vlsE sequence and the increase in the number of sequence changes occurred more gradually in severe combined immunodeficiency (SCID) mice. Clones containing a stop codon were isolated, indicating that continuous expression of full-length VlsE is not required for survival in vivo; also, these clones continued to undergo vlsE recombination. Analysis of clones with apparent single recombination events indicated that recombinations into vlsE are nonselective with regard to the silent cassette utilized, as well as the length and location of the recombination event. Sequence changes as small as one base pair were common. Fifteen percent of recovered vlsE variants contained "template-independent" sequence changes, which clustered in the variable regions of vlsE. We hypothesize that the increased frequency and complexity of vlsE sequence changes observed in clones recovered from immunocompetent mice (as compared with SCID mice) is due to rapid clearance of relatively invariant clones by variable region-specific anti-VlsE antibody responses.
Abstract:
Chondrocyte gene regulation is important for the generation and maintenance of cartilage tissues. Several regulatory factors have been identified that play a role in chondrogenesis, including the positive transacting factors of the SOX family such as SOX9, SOX5, and SOX6, as well as negative transacting factors such as C/EBP and delta EF1. However, a complete understanding of the intricate regulatory network that governs the tissue-specific expression of cartilage genes is not yet available. We have taken a computational approach to identify cis-regulatory, transcription factor (TF) binding motifs in a set of cartilage characteristic genes to better define the transcriptional regulatory networks that regulate chondrogenesis. Our computational methods have identified several TFs, whose binding profiles are available in the TRANSFAC database, as important to chondrogenesis. In addition, a cartilage-specific SOX-binding profile was constructed and used to identify both known, and novel, functional paired SOX-binding motifs in chondrocyte genes. Using DNA pattern-recognition algorithms, we have also identified cis-regulatory elements for unknown TFs. We have validated our computational predictions through mutational analyses in cell transfection experiments. One novel regulatory motif, N1, found at high frequency in the COL2A1 promoter, was found to bind to chondrocyte nuclear proteins. Mutational analyses suggest that this motif binds a repressive factor that regulates basal levels of the COL2A1 promoter.
Abstract:
Nonpapillary renal cell carcinoma (RCC) is an adult cancer of the kidney which occurs both in familial and sporadic forms. The familial form of RCC is associated with translocations involving chromosome 3 with a breakpoint at 3p14-p13. Studies focused on sporadic RCC have shown two commonly deleted regions at 3p14.3-p13 and 3p21.3. In addition, a more distal region mapping to 3p26-p25 has been linked to the Von Hippel Lindau (VHL) disease gene. A large proportion of VHL patients develop RCC. The short arm of human chromosome 3 can, therefore, be dissected into three distinct regions which could encode tumor suppressor genes for RCC. Loss or inactivation of one or more of these loci may be an important step in the genesis of RCC. I have used the technique of microcell-mediated chromosome transfer to introduce an intact, normal human chromosome 3 and defined fragments of 3p, dominantly marked with pSV2neo, into the highly malignant RCC cell line SN12C.19. The introduction of chromosome 3 and of a centric fragment of 3p, encompassing 3p14-q11, into SN12C.19 resulted in dramatic suppression of tumor growth in nude mice. Another defined deletion hybrid contained the region 3p12-q24 of the introduced human chromosome and failed to suppress tumorigenicity. These data define the region 3p14-p12, the most proximal region of high frequency allele loss in sporadic RCC as well as the region containing the translocation breakpoint in familial RCC, to contain a novel tumor suppressor locus involved in RCC. We have designated this locus nonpapillary renal cell carcinoma-1 (NRC-1). Furthermore, we have functional evidence that NRC-1 controls the growth of RCC cells by inducing rapid cell death in vivo.
Abstract:
Complete NotI, SfiI, XbaI and BlnI cleavage maps of Escherichia coli K-12 strain MG1655 were constructed. Techniques used included: CHEF pulsed field gel electrophoresis; transposon mutagenesis; fragment hybridization to the ordered $\lambda$ library of Kohara et al.; fragment and cosmid hybridization to Southern blots; correlation of fragments and cleavage sites with EcoMap, a sequence-modified version of the genomic restriction map of Kohara et al.; and correlation of cleavage sites with DNA sequence databases. In all, 105 restriction sites were mapped and correlated with the EcoMap coordinate system. NotI, SfiI, XbaI and BlnI restriction patterns of five commonly used E. coli K-12 strains were compared to those of MG1655. The variability between strains, some of which are separated by numerous steps of mutagenic treatment, is readily detectable by pulsed-field gel electrophoresis. A model is presented to account for the difference between the strains on the basis of simple insertions, deletions, and in one case an inversion. Insertions and deletions ranged in size from 1 kb to 86 kb. Several of the larger features have previously been characterized and some of the smaller rearrangements can potentially account for previously reported genetic features of these strains. Some aspects of the frequency and distribution of NotI, SfiI, XbaI and BlnI cleavage sites were analyzed using a method based on Markov chain theory. Overlaps of Dam and Dcm methylase sites with XbaI and SfiI cleavage sites were examined. The one XbaI-Dam overlap in the database is in accord with the expected frequency of this overlap. Certain types of SfiI-Dcm overlaps are overrepresented. Of the four subtypes of SfiI-Dcm overlap, only one has a partial inhibitory effect on the activity of SfiI. Recognition sites for all four enzymes are rarer than expected based on oligonucleotide frequency data, with this effect being much stronger for XbaI and BlnI than for NotI and SfiI.
The latter two enzyme sites are rare mainly due to apparent negative selection against GGCC (both) and CGGCCG (NotI). The former two enzyme sites are rare mainly due to effects of the VSP repair system on certain di-, tri-, and tetranucleotides, most notably CTAG. Models are proposed to explain several of the anomalies of oligonucleotide distribution in E. coli, and the biological significance of the systems that produce these anomalies is discussed.
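One common way to judge whether an oligonucleotide such as CTAG is rarer than expected is a maximal-order Markov estimate, which predicts the count of a word from the counts of its two overlapping sub-words. The Python sketch below assumes this estimator for illustration; the abstract says only that a method based on Markov chain theory was used, so the exact procedure may differ. The function names and the toy sequence are hypothetical.

```python
def count_overlapping(seq, word):
    """Count occurrences of word in seq, allowing overlapping matches."""
    k = len(word)
    return sum(1 for i in range(len(seq) - k + 1) if seq[i:i + k] == word)


def markov_expected(seq, word):
    """Expected count of word under a Markov chain of order len(word) - 2.

    E[w1..wk] = N(w1..w(k-1)) * N(w2..wk) / N(w2..w(k-1))
    """
    if len(word) < 3:
        raise ValueError("word must have at least 3 letters")
    prefix = count_overlapping(seq, word[:-1])
    suffix = count_overlapping(seq, word[1:])
    core = count_overlapping(seq, word[1:-1])
    return prefix * suffix / core if core else 0.0


if __name__ == "__main__":
    toy = "CTAGCTAGTTAACTAG"  # hypothetical toy sequence, not E. coli data
    observed = count_overlapping(toy, "CTAG")
    expected = markov_expected(toy, "CTAG")
    print(observed, expected)
```

A ratio of observed to expected well below 1 on a real genome would flag the word as underrepresented relative to its sub-words, which is the kind of signal the abstract describes for CTAG.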
Abstract:
This study describes the patterns of occurrence of amyotrophic lateral sclerosis (ALS) and parkinsonism-dementia complex (PDC) of Guam during 1950-1989. Both ALS and PDC occur with high frequency among the indigenous Chamorro population, first recognized in the early 1950's. Reports in the early 1980's indicated that both ALS and PDC were disappearing, due to a purported reduction in exposure to harmful environmental factors as a result of the dramatic changes in lifestyle that took place after World War II. However, this study provides compelling evidence that ALS and PDC have not disappeared on Guam and that rates for both are higher during 1980-1989 than previously reported. The patterns of occurrence for both ALS and PDC overlap in most respects: (1) incidence and mortality are decreasing; (2) median age at onset is increasing; (3) males are at increased risk for developing disease; (4) risk is higher for those residing in the south compared to the non-south; and (5) age-specific incidence is decreasing over time except in the oldest age groups. Age-specific incidence of ALS and PDC, separately and together, is generally higher for cohorts born before 1920 than for those born after 1920. A significant birth cohort effect on the incidence of PDC was found for the 1906-1915 birth cohort, but not for ALS or for ALS and PDC together. Whether or not a cohort effect, period effect, or both are associated with incidence of ALS and PDC cannot be determined from the data currently available and will require additional follow-up of individuals born after 1920. The epidemiological data amassed over this 40-year period provide evidence that supports an environmental exposure model for disease occurrence as opposed to a simple genetic or infectious disease model.
Whether neurodegenerative disease in this population occurs as a consequence of a single exposure or is explained by a multifactorial model such as a genetic predisposition with some environmental interaction is yet to be determined. However, descriptive studies such as this can provide clues concerning timing and location of potential adverse exposures but cannot determine etiology, underscoring the urgent need for analytic studies of ALS and PDC to further investigate existing etiologic hypotheses and to test new hypotheses.
Abstract:
Prostate cancer is the second most commonly diagnosed cancer among men in the United States. In this study, evidence is presented to support the hypothesis that specific chromosomal aberrations (involving one or more chromosomal regions) are associated with prostate cancer progression from organ-confined to locally advanced tumors and that some aberrations seen in high frequency in metastatic tumors may also be present in a subset of primary tumors. To determine the appropriate approach to address this hypothesis, I have established a modified CGH protocol by microdissection and DOP-PCR for use in detecting chromosomal changes in clinical prostate tumor specimens that is more sensitive and accurate than conventional CGH methods. I have successfully performed the improved CGH protocol to screen for genetic changes of 24 organ confined (pT2) and 21 locally advanced (pT3b) clinical prostate cancer specimens without metastases (N0M0). Comparisons of tumors by stage or Gleason scores following contingency table analysis showed that seven regions of the genome differed significantly between pT2 and pT3b tumors or between low and high Gleason tumors suggesting that these regions may be important in local prostate cancer progression. These included losses on 6p21–25, 6q24–27, 8p, 10q25–26, 15q22–26, and 18cen–q12 as well as gain of 3p13–q13. Multivariate analyses showed that loss of 8p (step1) and loss of 6q25–26 (or 6p21–25 or 10q25–26) (step 2) were predictive of pathologic stage or Gleason groups with 80% accuracy. Additional 5–7 steps in the multivariate model increased the predictive value to 91–95%. Comparison of the CGH data from the primary prostate tumors of this study with those obtained from published literature on metastases and recurrent tumors showed that the clinically more aggressive stage pT3b tumors shared more abnormalities in high frequency with metastases and recurrent tumors than less aggressive stage pT2 tumors. 
Furthermore, loss of 11cen–q22 was shared only between the primary tumors and metastases while gain of Xcen–q13 and loss of 18cen–q12 were in common between primary and recurrent tumors. These analyses suggest that the multistage model of prostate cancer progression is not linear and that some early primary tumors may be predisposed to metastasize or evolve into recurrent tumors due to the presence of specific genetic alterations.
Abstract:
Latinos have the highest teen birth rate nationally. Cameron County, Texas is primarily Latino (Mexican-American). This mixed-method study (n=43) examines the beliefs, attitudes and practices of Mexican-American parents of adolescents regarding communication with their adolescent children about sex. The Social Cognitive Theory (SCT) constructs of self-efficacy, behavioral determinism, environment, outcome expectations and reciprocal determinism can influence the frequency and quality of parent-adolescent sex communication. This study describes the recollections of Mexican-American parents of adolescents about their own experiences of learning about sexuality. It also examines the attitudes and practices regarding communication about sex and the self-efficacy and behavioral capability of participants to teach their adolescent children about sex and sexually transmitted infections. Negative childhood experiences (shame, lies and trauma) of the parents in this study played a key role in their desire to communicate more comprehensively about sexuality with their own children than their parents had with them. While participants reported low self-efficacy and behavioral capability to communicate with their adolescent children about sex, they reported relatively high frequency and quality of communication, with 75% of participants receiving a high quality score and over 44% reporting frequent communication with their adolescent children about sex. A chi-square analysis and Fisher's exact test revealed no association between acculturation status, gender or having a child who has mothered/fathered a baby and the frequency or quality of communication about sex with adolescent children. Study participants also gave specific recommendations for method, content and setting of sex education for their children and themselves.
Promotora delivery of information and education in a comfortable, culturally appropriate neighborhood setting, as well as parent-child learning sessions, were identified as possible approaches to improve the self-efficacy and behavioral capability of parents communicating with their adolescent children about sex. The results of this analysis provide public health practitioners and interested community entities data to identify and develop interventions that use a theoretical, evidence-based framework for culturally appropriate interventions to encourage and equip Mexican-American parents to effectively communicate with their adolescent children about sexuality, and ultimately to address the high rates of teen pregnancy in this U.S.-Mexico border community.
Abstract:
Next-generation DNA sequencing platforms can effectively detect the entire spectrum of genomic variation and are emerging as a major tool for systematic exploration of the universe of variants and interactions in the entire genome. However, the data produced by next-generation sequencing technologies suffer from three basic problems: sequence errors, assembly errors, and missing data. Current statistical methods for genetic analysis are well suited for detecting the association of common variants, but are less suitable for rare variants. This raises a great challenge for sequence-based genetic studies of complex diseases. This dissertation used the genome continuum model as a general principle, and stochastic calculus and functional data analysis as tools, to develop novel and powerful statistical methods for the next generation of association studies of both qualitative and quantitative traits in the context of sequencing data, ultimately shifting the paradigm of association analysis from the current locus-by-locus analysis to the collective analysis of genome regions. In this project, functional principal component (FPC) methods coupled with high-dimensional data reduction techniques were used to develop novel and powerful methods for testing the associations of the entire spectrum of genetic variation within a segment of genome or a gene, regardless of whether the variants are common or rare. Classical quantitative genetics methods suffer from high type I error rates and low power for rare variants. To overcome these limitations for resequencing data, this project used functional linear models with scalar response to develop statistics for identifying quantitative trait loci (QTLs) for both common and rare variants. To illustrate their applications, the functional linear models were applied to five quantitative traits in the Framingham Heart Study.
This project proposed a novel concept of gene-gene co-association in which a gene or a genomic region is taken as a unit of association analysis and used stochastic calculus to develop a unified framework for testing the association of multiple genes or genomic regions for both common and rare alleles. The proposed methods were applied to gene-gene co-association analysis of psoriasis in two independent GWAS datasets which led to discovery of networks significantly associated with psoriasis.
Abstract:
Carcinoma of the skin is the most common type of human cancer in the United States. Ultraviolet radiation (UVR) present in sunlight is thought to be the major carcinogen responsible for induction of skin cancer. In UV-associated skin carcinogenesis, mutations in p53 are not only present with very high frequency, but occur early in the course of tumor development. In addition, UV-induced skin tumors in mice exhibit unique immunological characteristics. They are highly antigenic and express both individually-specific tumor transplantation antigens recognized by effector T cells and the UV-associated common antigen recognized by UV-induced suppressor T cells. To examine the hypothesis that p53 plays a critical role in preventing skin cancer induction by UVR, mice constitutively lacking one or two functional p53 alleles were compared to wild-type mice for their susceptibility to UV carcinogenesis. Both p53 +/– and –/– mice showed greater susceptibility to skin cancer induction than wild-type mice, and –/– mice were the most susceptible. Accelerated tumor development in the p53 +/– mice was not associated with loss of the remaining wild-type allele of p53, but in many cases was associated with UV-induced mutations in p53. Our studies clearly demonstrate the essential role of p53 in protection against UV carcinogenesis, particularly in the eye and epidermis. The role of p53 in the antigenicity of UV-induced murine skin tumors was also addressed. Primary UV-induced tumors from p53 –/–, +/– and +/+ mice were transplanted into both normal and immunosuppressed mice, and rates of tumor rejection were compared. Tumors from mice with only one or no functional p53 alleles were less antigenic than those from mice with two functional p53 alleles. Moreover, tumors with no functional p53 also failed to grow well in chronically UV-irradiated mice.
These results indicate that p53 contributes to the strong antigenicity of UV-induced murine skin tumors, and suggest that it may play a critical role in expression of the UV-associated common antigen recognized by suppressor T cells. In this study we also monitored the effect of UVR on the development of lymphoid malignancies in p53 deficient mice. The incidence of lymphoid malignancies in UV-irradiated p53 +/– mice was drastically enhanced compared to that in unirradiated counterparts. The immune responses of the mice were identical and were suppressed to the same extent by UV irradiation regardless of the p53 genotype. These data provide the first experimental evidence that exposure to UVR can contribute to the development of lymphoid neoplasms in genetically susceptible hosts.