975 results for SEQUENCE VARIATION
Abstract:
Which genetic differences make us different from our closest relatives, the chimpanzees, and yet so similar to them? What we investigate and want to understand is the complex relationship between the multiple genetic and epigenetic differences whose interaction with diverse environmental and cultural factors results in the observed phenotypic differences. To clarify whether chromosomal rearrangements contributed to the divergence between human and chimpanzee, and which selective forces shaped their evolution, I examined the coding sequences of 2 Mb regions flanking the pericentric inversion breakpoints on chromosomes 1, 4, 5, 9, 12, 17 and 18. Collinear regions of 4 Mb on the rearranged chromosomes, located at least 10 Mb away from the breakpoint regions, served as controls. I found no higher rate of protein evolution in the breakpoint-flanking regions than in the control regions. My results do not support the chromosomal speciation hypothesis for human and chimpanzee, because the proportion of positively selected genes (5.1% in the breakpoint-flanking regions and 7% in the control regions) was similar in both regions. By comparing the numbers of positively and negatively selected genes per chromosome, I found that chromosome 9 contains the most and chromosome 5 the fewest positively selected genes in the breakpoint-flanking and control regions. The number of negatively selected genes (68) was much higher than the number of positively selected genes (17). A bioinformatic analysis of published microarray expression data (Affymetrix chips U95 and U133v2) yielded 31 genes that are differentially expressed between human and chimpanzee. 
By examining the dN/dS ratio of these 31 genes, I identified 7 genes as negatively selected and only 1 gene as positively selected. This finding is consistent with the concept that gene expression levels evolve under stabilizing selection. Moreover, most of the positively selected genes play a role in reproduction. Many of these species differences result from changes in gene regulation rather than from structural changes in the gene products. Most differences in gene regulation are assumed to manifest at the transcriptional level. In this work, differences in DNA methylation between human and chimpanzee were investigated. To this end, the methylation patterns of the promoter CpG islands of 12 genes in the cortex of humans and chimpanzees were analyzed by classical bisulfite sequencing and bisulfite pyrosequencing. The candidate genes were selected because of their differential expression patterns between human and chimpanzee and because of their association with human diseases or with genomic imprinting. With the exception of a few individual positions, the majority of the analyzed genes showed no high intra- or interspecific variation in DNA methylation between the two species. Only one gene, CCRK, showed clear intraspecific and interspecific differences in the degree of DNA methylation. The differentially methylated CpG positions lay within a repetitive Alu-Sg1 element. The study of the CCRK gene provides a comprehensive analysis of the intra- and interspecific variability of DNA methylation of an Alu insertion in a regulatory region. The observed species differences suggest that the methylation patterns of the CCRK gene probably evolve under positive selection, in adaptation to specific requirements for fine-tuning CCRK regulation. 
The promoter of the CCRK gene is susceptible to epigenetic modification by DNA methylation, which can lead to complex transcription patterns. Owing to their genomic mobility, their high CpG content and their influence on gene expression, Alu insertions are excellent candidates for promoting changes in the developmental regulation of primate genes. Comparing the intra- and interspecific methylation of specific Alu insertions in other genes and tissues is a promising strategy.
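The dN/dS ratio used in this abstract is conventionally read as follows: a ratio below 1 suggests negative (purifying) selection and a ratio above 1 suggests positive selection. The abstract does not say how dN and dS were estimated or which significance test was applied, so the following is only a minimal sketch. The function name `classify_selection` and the tolerance parameter `tol` are hypothetical.

```python
def classify_selection(dn: float, ds: float, tol: float = 0.0) -> str:
    """Label a gene by its dN/dS ratio (omega).

    dn and ds are assumed to be precomputed nonsynonymous and synonymous
    substitution rates. tol is a hypothetical dead band around omega = 1;
    a real analysis would use a statistical test instead.
    """
    if ds <= 0:
        # omega is undefined when no synonymous change was observed
        return "undefined"
    omega = dn / ds
    if omega > 1 + tol:
        return "positive"
    if omega < 1 - tol:
        return "negative"
    return "neutral"
```

A call such as `classify_selection(0.001, 0.01)` returns `"negative"`, the outcome reported for 7 of the 31 differentially expressed genes.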
Abstract:
Genome predictions based on selected genes would be a very welcome approach for taxonomic studies, including DNA-DNA similarity, G+C content and representative phylogeny of bacteria. At present, DNA-DNA hybridizations are still considered the gold standard in species descriptions. However, this method is time-consuming and troublesome, and datasets can vary significantly between experiments as well as between laboratories. For the same reasons, full matrix hybridizations are rarely performed, weakening the significance of the results obtained. The authors established a universal sequencing approach for the three genes recN, rpoA and thdF for the Pasteurellaceae, and determined if the sequences could be used for predicting DNA-DNA relatedness within the family. The sequence-based similarity values calculated using a previously published formula proved most useful for species and genus separation, indicating that this method provides better resolution and no experimental variation compared to hybridization. By this method, cross-comparisons within the family over species and genus borders easily become possible. The three genes also serve as an indicator of the genome G+C content of a species. A mean divergence of around 1 % was observed from the classical method, which in itself has poor reproducibility. Finally, the three genes can be used alone or in combination with already-established 16S rRNA, rpoB and infB gene-sequencing strategies in a multisequence-based phylogeny for the family Pasteurellaceae. It is proposed to use the three sequences as a taxonomic tool, replacing DNA-DNA hybridization.
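As a small illustration of the G+C content mentioned above, the sketch below computes the G+C percentage of a single gene sequence. It is not the authors' prediction method: the abstract does not describe how the recN, rpoA and thdF values are mapped to a genome-wide estimate, and the choice to ignore ambiguous bases is an assumption made here.

```python
def gc_content(seq: str) -> float:
    """Return the G+C percentage of a nucleotide sequence.

    Only unambiguous bases (A, C, G, T) are counted; skipping other
    symbols such as N is an assumption of this sketch.
    """
    counted = [base for base in seq.upper() if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no unambiguous bases")
    gc = sum(base in "GC" for base in counted)
    return 100.0 * gc / len(counted)
```

For example, `gc_content("GGCCAATT")` gives 50.0.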
Abstract:
PURPOSE: To perform baseline T2 mapping of the hips of healthy volunteers, focusing on topographic variation, because no detailed study has involved hips. T2 mapping is a quantitative magnetic resonance imaging (MRI) technique that evaluates cartilage matrix components. MATERIALS AND METHODS: Hips of 12 healthy adults (six men and six women; mean age = 29.5 ± 4.9 years) were studied with a 3.0-Tesla MRI system. T2 measurement in the oblique-coronal plane used a multi-spin-echo (MSE) sequence. Femoral cartilage was divided into 12 radial sections; acetabular cartilage was divided into six radial sections, and each section was divided into two layers representing the superficial and deep halves of the cartilage. T2 of these sections and layers was measured. RESULTS: Femoral cartilage T2 was shortest at -20° to 20° and -10° to 10° (superficial and deep layers, respectively), with an increase near the magic angle (54.7°). Acetabular cartilage T2 in both layers was shorter in the periphery than in the other parts, especially at 20° to 30°. There were no significant differences in T2 between right and left hips or between men and women. CONCLUSION: Topographic variation exists in hip cartilage T2 in young, healthy adults. These findings should be taken into account when T2 mapping is applied to patients with degenerative cartilage. J. Magn. Reson. Imaging 2007;26:165-171. © 2007 Wiley-Liss, Inc.
Abstract:
The growing knowledge of the physiology, cell biology and biochemistry of the reproductive organs has provided many insights into the molecular mechanisms that are required for successful reproduction. Research into the reproductive physiology of domestic animals was hampered in the past by a lack of species-specific genomic information. The genome sequences of dog, cattle and horse became publicly available in 2005, 2006 and 2007, respectively. Although the gene content of mammalian genomes is generally very similar, genes involved in reproduction tend to be less conserved than the average mammalian gene. The availability of genome sequences provides a valuable resource for checking whether any protein known from human or mouse research is also present in cattle and/or horse. Currently, more than 200 genes are known to be involved in the production of fertile sperm cells. Great progress has been made in understanding the genetic aberrations that lead to male infertility. Additionally, the first genetic mechanisms that contribute to the quantitative variation of fertility traits in fertile male animals are being discovered. Here, I will review some selected aspects of genetic research in male fertility and offer some perspectives for the use of genomic sequence information.
Abstract:
Heritable variation in plant phenotypes, and thus potential for evolutionary change, can in principle not only be caused by variation in DNA sequence, but also by underlying epigenetic variation. However, the potential scope of such phenotypic effects and their evolutionary significance are largely unexplored. Here, we conducted a glasshouse experiment in which we tested the response of a large number of epigenetic recombinant inbred lines (epiRILs) of Arabidopsis thaliana – lines that are nearly isogenic but highly variable at the level of DNA methylation – to drought and increased nutrient conditions. We found significant heritable variation among epiRILs both in the means of several ecologically important plant traits and in their plasticities to drought and nutrients. Significant selection gradients, that is, fitness correlations, of several mean traits and plasticities suggest that selection could act on this epigenetically based phenotypic variation. Our study provides evidence that variation in DNA methylation can cause substantial heritable variation of ecologically important plant traits, including root allocation, drought tolerance and nutrient plasticity, and that rapid evolution based on epigenetic variation alone should thus be possible.
Abstract:
A multilocus sequence typing (MLST) scheme was established and evaluated for Mycoplasma hyopneumoniae, the etiologic agent of enzootic pneumonia in swine with the aim of defining strains. Putative target genes were selected by genome sequence comparisons. Out of 12 housekeeping genes chosen and experimentally validated, the 7 genes efp, metG, pgiB, recA, adk, rpoB, and tpiA were finally used to establish the MLST scheme. Their usefulness was assessed individually and in combination using a set of well-defined field samples and strains of M. hyopneumoniae. A reduction to the three targets showing highest variation (adk, rpoB, and tpiA) was possible resulting in the same number of sequence types as using the seven targets. The established MLST approach was compared with the recently described typing method using the serine-rich repeat motif-encoding region of the p146 gene. There was coherence between the two methods, but MLST resulted in a slightly higher resolution. Farms recognized to be affected by enzootic pneumonia were always associated with a single M. hyopneumoniae clone, which in most cases differed from farm to farm. However, farms in close geographic or operational contact showed identical clones as defined by MLST typing. Population analysis showed that recombination in M. hyopneumoniae occurs and that strains are very diverse with only limited clonality observed. Elaborate classical MLST schemes using multiple targets for M. hyopneumoniae might therefore be of limited value. In contrast, MLST typing of M. hyopneumoniae using the three genes adk, rpoB, and tpiA seems to be sufficient for epidemiological investigations by direct amplification of target genes from lysate of clinical material without prior cultivation.
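The core bookkeeping of an MLST scheme like the one described above can be sketched as follows: each distinct sequence at a locus receives an allele number, and each distinct combination of allele numbers (the allelic profile) receives a sequence type. The numbering here simply follows first appearance, which is an assumption; a curated scheme would use fixed identifiers from a reference database. The function name and the toy isolate names are hypothetical.

```python
from typing import Dict, List, Tuple


def assign_sequence_types(
    isolates: Dict[str, Dict[str, str]], loci: List[str]
) -> Dict[str, int]:
    """Map each isolate to a sequence type (ST) based on its allelic profile.

    isolates maps an isolate name to {locus: sequence}. Allele numbers and
    STs are assigned in order of first appearance (an assumption of this
    sketch, not a property of any published scheme).
    """
    allele_ids: Dict[str, Dict[str, int]] = {locus: {} for locus in loci}
    profile_ids: Dict[Tuple[int, ...], int] = {}
    result: Dict[str, int] = {}
    for name, seqs in isolates.items():
        profile = []
        for locus in loci:
            table = allele_ids[locus]
            # A new sequence gets the next free allele number for this locus.
            profile.append(table.setdefault(seqs[locus], len(table) + 1))
        key = tuple(profile)
        # A new allelic profile gets the next free sequence type.
        result[name] = profile_ids.setdefault(key, len(profile_ids) + 1)
    return result


# Toy data: two isolates from one farm share a clone, a third differs in adk.
toy = {
    "farmA_1": {"adk": "AAA", "rpoB": "CCC", "tpiA": "GGG"},
    "farmA_2": {"adk": "AAA", "rpoB": "CCC", "tpiA": "GGG"},
    "farmB_1": {"adk": "AAT", "rpoB": "CCC", "tpiA": "GGG"},
}
toy_types = assign_sequence_types(toy, ["adk", "rpoB", "tpiA"])
```

Restricting `loci` to a subset of targets and counting the distinct values in the result is one way to check whether a reduced scheme yields the same number of sequence types as the full one.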
Abstract:
BACKGROUND: Enterococcus faecalis has emerged as a major hospital pathogen. To explore its diversity, we sequenced E. faecalis strain OG1RF, which is commonly used for molecular manipulation and virulence studies. RESULTS: The 2,739,625 base pair chromosome of OG1RF was found to contain approximately 232 kilobases unique to this strain compared to V583, the only publicly available sequenced strain. Almost no mobile genetic elements were found in OG1RF. The 64 areas of divergence were classified into three categories. First, OG1RF carries 39 unique regions, including 2 CRISPR loci and a new WxL locus. Second, we found nine replacements where a sequence specific to V583 was substituted by a sequence specific to OG1RF. For example, the iol operon of OG1RF replaces a possible prophage and the vanB transposon in V583. Finally, we found 16 regions that were present in V583 but missing from OG1RF, including the proposed pathogenicity island, several probable prophages, and the cpsCDEFGHIJK capsular polysaccharide operon. OG1RF was more rapidly but less frequently lethal than V583 in the mouse peritonitis model and considerably outcompeted V583 in a murine model of urinary tract infections. CONCLUSION: E. faecalis OG1RF carries a number of unique loci compared to V583, but the almost complete lack of mobile genetic elements demonstrates that this is not a defining feature of the species. Additionally, OG1RF's effects in experimental models suggest that mediators of virulence may be diverse between different E. faecalis strains and that virulence is not dependent on the presence of mobile genetic elements.
Abstract:
Any functionally important mutation is embedded in an evolutionary matrix of other mutations. Cladistic analysis, based on this, is a method of investigating gene effects using a haplotype phylogeny to define a set of tests which localize causal mutations to branches of the phylogeny. Previous implementations of cladistic analysis have not addressed the issue of analyzing data from related individuals, though in human studies, family data are usually needed to obtain unambiguous haplotypes. In this study, a method of cladistic analysis is described in which haplotype effects are parameterized in a linear model which accounts for familial correlations. The method was used to study the effect of apolipoprotein (Apo) B gene variation on total-, LDL-, and HDL-cholesterol, triglyceride, and Apo B levels in 121 French families. Five polymorphisms defined Apo B haplotypes: the signal peptide insertion/deletion, Bsp 1286I, XbaI, MspI, and EcoRI. Eleven haplotypes were found, and a haplotype phylogeny was constructed and used to define a set of tests of haplotype effects on lipid and Apo B levels. This new method of cladistic analysis, the parametric method, found significant effects for single haplotypes for all variables. For HDL-cholesterol, 3 clusters of evolutionarily related haplotypes affecting levels were found. Haplotype effects accounted for about 10% of the genetic variance of triglyceride and HDL-cholesterol levels. The results of the parametric method were compared to those of a method of cladistic analysis based on permutational testing. The permutational method detected fewer haplotype effects, even when modified to account for correlations within families. Simulation studies exploring these differences found evidence of systematic errors in the permutational method due to the process by which haplotype groups were selected for testing. The applicability of cladistic analysis to human data was shown. 
The parametric method is suggested as an improvement over the permutational method. This study has identified candidate haplotypes for sequence comparisons in order to locate the functional mutations in the Apo B gene which may influence plasma lipid levels.
Abstract:
Models of DNA sequence evolution and methods for estimating evolutionary distances are needed for studying the rate and pattern of molecular evolution and for inferring the evolutionary relationships of organisms or genes. In this dissertation, several new models and methods are developed. The rate variation among nucleotide sites: To obtain unbiased estimates of evolutionary distances, the rate heterogeneity among nucleotide sites of a gene should be considered. Commonly, it is assumed that the substitution rate varies among sites according to a gamma distribution (gamma model) or, more generally, an invariant+gamma model which includes some invariable sites. A maximum likelihood (ML) approach was developed for estimating the shape parameter of the gamma distribution $(\alpha)$ and/or the proportion of invariable sites $(\theta).$ Computer simulation showed that (1) under the gamma model, $\alpha$ can be well estimated from 3 or 4 sequences if the sequences are long; and (2) the distance estimate is unbiased and robust against violations of the assumptions of the invariant+gamma model. However, this ML method requires a huge amount of computational time and is useful only for fewer than 6 sequences. Therefore, I developed a fast method for estimating $\alpha,$ which is easy to implement and requires no knowledge of the tree. A computer program was developed for estimating $\alpha$ and evolutionary distances, which can handle as many as 30 sequences. Evolutionary distances under the stationary, time-reversible (SR) model: The SR model is a general model of nucleotide substitution, which assumes (i) stationary nucleotide frequencies and (ii) time-reversibility. It can be extended to the SRV model, which allows rate variation among sites. I developed a method for estimating the distance under the SR or SRV model, as well as the variance-covariance matrix of distances. 
Computer simulation showed that the SR method is better than a simpler method when the sequence length $L>1,000$ bp and is robust against deviations from time-reversibility. As expected, when the rate varies among sites, the SRV method is much better than the SR method. The evolutionary distances under nonstationary nucleotide frequencies: The statistical properties of the paralinear and LogDet distances under nonstationary nucleotide frequencies were studied. First, I developed formulas for correcting the estimation biases of the paralinear and LogDet distances. The performances of these formulas and the formulas for sampling variances were examined by computer simulation. Second, I developed a method for estimating the variance-covariance matrix of the paralinear distance, so that statistical tests of phylogenies can be conducted when the nucleotide frequencies are nonstationary. Third, a new method for testing the molecular clock hypothesis was developed in the nonstationary case.
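To illustrate why rate variation among sites matters for distance estimation, the sketch below compares the standard Jukes-Cantor distance with its gamma-corrected form. This is a deliberately simpler substitution model than the SR and SRV models developed in the dissertation, and it is swapped in here only because its closed-form expressions are short. The shape parameter `alpha` is assumed to be known, and the function names are hypothetical.

```python
import math


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap or ambiguous symbol are skipped
    (an assumption of this sketch).
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper())
             if x in "ACGT" and y in "ACGT"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)


def jc_distance(p: float) -> float:
    """Jukes-Cantor distance, assuming equal rates at all sites."""
    if not 0 <= p < 0.75:
        raise ValueError("p must lie in [0, 0.75)")
    return -0.75 * math.log(1 - 4 * p / 3)


def jc_gamma_distance(p: float, alpha: float) -> float:
    """Jukes-Cantor distance with gamma-distributed rates among sites.

    alpha is the gamma shape parameter; smaller values mean stronger
    rate heterogeneity and a larger correction.
    """
    if not 0 <= p < 0.75:
        raise ValueError("p must lie in [0, 0.75)")
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    return 0.75 * alpha * ((1 - 4 * p / 3) ** (-1 / alpha) - 1)
```

With the same observed proportion of differences, the gamma-corrected distance exceeds the uncorrected one, and the two converge as `alpha` grows large. Ignoring rate heterogeneity therefore tends to underestimate the distance.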
Abstract:
Human pigmentation is a complex trait with the observed variation caused by the varied production of eumelanin (brown/black melanins) and phaeomelanin (red/yellow melanins) by the melanocytes. The melanocortin 1 receptor (MC1R), a G protein-coupled receptor expressed in the melanocytes, is a regulator of eu- and phaeomelanin synthesis, and MC1R mutations causing skin and coat color changes are known in many mammals. To understand the role of MC1R in human pigmentation variation, I have sequenced the MC1R gene in 121 individuals sampled from world populations. In addition, I have sequenced the MC1R gene in common and pygmy chimpanzees, gorilla, orangutan, and baboon to study the evolution of MC1R and to infer the ancestral human MC1R sequence. The ancestral MC1R sequence is observed in all 25 African individuals studied, but at lower frequencies in the other populations examined, especially in East and Southeast Asians. The Arg163Gln variant is absent in the Africans studied, almost absent in Europeans, and at a low frequency in Indians, but is at an exceptionally high frequency (70%) in East and Southeast Asians. To further evaluate the role of MC1R variants in human pigmentation variation, I have combined these molecular evolution and population studies with functional assays on MC1R variants and primate MC1Rs.
Abstract:
BACKGROUND: The copy number variation (CNV) in beta-defensin genes (DEFB) on human chromosome 8p23 has been proposed to contribute to the phenotypic differences in inflammatory diseases. However, determination of the exact DEFB copy number (CN) is a major challenge in association studies. Quantitative real-time PCR (qPCR), paralog ratio tests (PRT) and multiplex ligation-dependent probe amplification (MLPA) have been extensively used to determine DEFB CN in different laboratories, but inter-method inconsistencies were observed frequently. In this study we asked which of the three methods is superior for DEFB CN determination. RESULTS: We developed a clustering approach for MLPA and PRT to statistically correlate data from a single experiment. Then we compared qPCR, a newly designed PRT and MLPA for DEFB CN determination in 285 DNA samples. We found that MLPA had the best convergence and clustering results of the raw data and the highest call rate. In addition, the concordance rates between MLPA or PRT and qPCR (32.12% and 37.99%, respectively) were unacceptably low, with CN underestimated by qPCR. The concordance rate between MLPA and PRT (90.52%) was high, but PRT systematically underestimated CN by one in a subset of samples. In these samples, a sequence variant that caused complete PCR dropout of the respective DEFB cluster copies was found in one primer binding site of one of the targeted paralogous pseudogenes. CONCLUSION: MLPA is superior to PRT, and even more so to qPCR, for DEFB CN determination. Although the applied PRT provides reliable results in most cases, such a test is particularly sensitive to low-frequency sequence variations, which preferentially accumulate in loci such as pseudogenes that are most likely not under selective pressure. In light of the superior performance of multiplex assays, the drawbacks of such single PRTs could be overcome by combining more test markers.
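A concordance rate of the kind quoted above can be sketched as the percentage of samples for which two methods report the same integer copy number. The abstract does not say how samples without a call were treated. Excluding any sample that lacks a call from either method is an assumption of this sketch, and the function name is hypothetical.

```python
from typing import Optional, Sequence


def concordance_rate(
    calls_a: Sequence[Optional[int]], calls_b: Sequence[Optional[int]]
) -> float:
    """Percentage of samples with identical copy-number calls.

    The two sequences must list the same samples in the same order.
    None marks a failed call; such samples are excluded (an assumption).
    """
    if len(calls_a) != len(calls_b):
        raise ValueError("call lists must cover the same samples")
    pairs = [(a, b) for a, b in zip(calls_a, calls_b)
             if a is not None and b is not None]
    if not pairs:
        raise ValueError("no samples were called by both methods")
    return 100.0 * sum(a == b for a, b in pairs) / len(pairs)
```

A systematic offset such as the reported underestimation by one copy would show up here as a block of discordant pairs where one call equals the other minus one.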
Abstract:
Objectives. The chief goal of this study was to analyze copy number variation (CNV) in breast cancer tumors from 25 African American women with early stage breast cancer (BC) using molecular inversion probes (MIP) in order to: (1) compare the degree of CNV in tumors with that in normal lymph nodes, (2) determine whether gains and/or losses of genes in specific chromosomes differ between pathologic subtypes of breast cancer defined by known prognostic markers, (3) determine whether gains/losses in copy number (CN) are associated with known oncogenes or tumor suppressor genes, and (4) determine whether increased gains/losses in CN for specific chromosomes were associated with differences in breast cancer recurrence. Methods. Twenty to 37 nanograms of DNA extracted from 25 formalin-fixed paraffin embedded (FFPE) tumor samples and matched normal lymph nodes were added to individual tubes. Oligonucleotide probes with recognition sequences at each terminus were hybridized with a genomic target sequence to form a circular structure. Probes were released from genomic DNA obtained from FFPE samples, and those that had been correctly "circularized" in the proper allele/nucleotide reaction combination were amplified using polymerase chain reaction (PCR) primers. Amplicons were fluorescently labeled, and the tag sequences were released from the genome homology regions by treatment with uracil-N-glycosylase, which cleaves the probe at the site where uracils are present, and detected using a complementary tag array developed by Affymetrix. Results. Analysis of CN gains and losses from tumors and normal tissues showed marked differences in tumors, with numerous chromosomes affected. Similar changes were not observed in normal lymph nodes. When tumors were stratified into four groups based on expression or lack of expression of the estrogen receptor and HER2/neu, distinct patterns of CNV for different chromosomes were observed. 
Gains or losses in CN for specific chromosomes correlated with amplifications/deletions of particular oncogenes or tumor suppressor genes (e.g., those found on chromosome 17) known to be associated with aggressive tumor phenotype and poor prognosis. There was a trend for increases in CN observed for chromosome 17 to correlate inversely with time to recurrence of BC (p=0.14 for trend). CNV was also observed for chromosomes 5, 8, 10, 11, and 16, which are known sites for several breast cancer susceptibility alleles. Conclusions. This study is the first to validate the MIP technique, to correlate differences in gene expression with known prognostic tumor markers, and to correlate significant increases/decreases in CN with known tumor markers associated with prognosis. The results of this study may have far-reaching public health implications for identifying new high-risk groups based on genomic differences in CNP, both with respect to prognosis and response to therapy, and for eventually identifying new therapeutic targets for prevention and treatment of this disease.
Abstract:
Seasonal variation in menarche, menstrual cycle length and menopause was investigated using Tremin Trust data. In addition, self-reported hot flash data for women with natural and surgically-induced menopause were analyzed for rhythms. Menarche data from approximately 600 U.S. women born between 1940 and 1970 revealed a 6-month rhythm (first acrophase in January, double amplitude of 58%M). A notable shift from a December-January peak in menarche for those born in the 1940s and 1950s to an August-September peak for those born in the 1960s was observed. Groups of girls 8-14 and 15-17 yr old at menarche exhibited a seasonal difference in the pattern of menarche occurrence of about 6 months in relation to each other. Girls experiencing menarche during August-October were statistically significantly younger than those experiencing it at other times. Season of birth was not associated with season of menarche. The lengths of approximately 150,000 menstrual intervals of U.S. women were analyzed for seasonality. Menstrual intervals possibly disturbed by natural (e.g., childbirth) or other events (e.g., surgery, medication) were excluded. No 6- or 12-month rhythmicities were found for specific interval lengths (14-24, 25-31 and 32-56 days) or ages in relation to menstrual interval (9-11, 12-13, 15-19, 20-24, 25-39, 40-44 and 44 yr old and older). Hot flash data of 14 women experiencing natural menopause (NM) and 11 experiencing surgically-induced menopause (SIM) did not differ in frequency of hot flashes. Hot flashes in NM women exhibited 12- and 8-hr, but not 24-hr, rhythmicities. Hot flashes in SIM women exhibited 24- and 12-hr, but not 8-hr, rhythmicities. Regardless of type of menopause, women with a peak frequency in hot flashes during the morning (0400 through 0950) were distinguishable from those with a peak in the evening (1600 through 2159). Data from approximately 200 U.S. women revealed a 6-month rhythm in menopause with first peak in May. 
No significant 12-month variation in menopause was detected by Cosinor analysis. Season of birth and age at menopause were not associated with season of menopause. Age at menopause declined significantly over the years for women born between 1907 and 1926, inclusive.
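The Cosinor analysis named above fits a cosine curve of known period to a time series by least squares. A minimal single-component sketch follows, using the model y(t) = M + A cos(2π(t − t_peak)/period), where M is the mesor, A the amplitude and t_peak the acrophase expressed as a time. The abstract does not describe the exact variant or the significance test used in the study, so this shows only the generic fitting step, and all function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def _solve3(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a 3x3 linear system by Gaussian elimination with pivoting.

    Fails with ZeroDivisionError if the system is singular, for example
    when the data cover fewer than three distinct phases.
    """
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, 3):
            factor = m[r][col] / m[col][col]
            for c in range(col, 4):
                m[r][c] -= factor * m[col][c]
    x = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):
        x[r] = (m[r][3] - sum(m[r][c] * x[c] for c in range(r + 1, 3))) / m[r][r]
    return x


def cosinor_fit(
    times: Sequence[float], values: Sequence[float], period: float
) -> Tuple[float, float, float]:
    """Return (mesor, amplitude, peak_time) of the best-fitting cosine.

    The model is linearized as M + beta*cos(w t) + gamma*sin(w t) and
    solved through the normal equations.
    """
    w = 2 * math.pi / period
    rows = [(1.0, math.cos(w * t), math.sin(w * t)) for t in times]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, values)) for i in range(3)]
    mesor, beta, gamma = _solve3(xtx, xty)
    amplitude = math.hypot(beta, gamma)
    peak_time = (math.atan2(gamma, beta) / w) % period
    return mesor, amplitude, peak_time


# Synthetic monthly series with a 6-month rhythm peaking at month 2.
months = list(range(24))
series = [10 + 3 * math.cos(2 * math.pi * (t - 2) / 6) for t in months]
fit = cosinor_fit(months, series, period=6)
```

The double amplitude expressed as a percentage of the mesor, the form in which the menarche rhythm is reported, would be `100 * 2 * amplitude / mesor` for a fit like this one.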