990 results for GC content


Relevance: 100.00%

Abstract:

728 human genes were divided into four groups according to the GC content of their coding sequences (from GC<0.43 to GC>0.58). Examination of synonymous-codon bias in the four groups shows that NTG (where N represents any of the bases T, A, C, or G) is most favored and NCG
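As a rough illustration of the grouping described in this abstract, the sketch below computes the GC fraction of a coding sequence and counts codons of the form NTG and NCG. Only the outer cut-offs (0.43 and 0.58) come from the abstract. The function names are hypothetical, and the boundary between the two middle groups is not stated, so they are merged into a single "intermediate" class here.

```python
from collections import Counter


def gc_content(seq: str) -> float:
    """Return the fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return (seq.count("G") + seq.count("C")) / len(seq)


def gc_group(gc: float) -> str:
    """Assign a coding sequence to a GC class.

    Assumption: the abstract gives only the outer cut-offs (0.43, 0.58);
    the two middle groups are merged because their boundary is not stated.
    """
    if gc < 0.43:
        return "low"
    if gc > 0.58:
        return "high"
    return "intermediate"


def codon_counts(cds: str) -> Counter:
    """Count in-frame codons, ignoring any trailing partial codon."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    return Counter(cds[i:i + 3] for i in range(0, usable, 3))


def count_pattern(counts: Counter, suffix: str) -> int:
    """Total codons whose last two bases equal `suffix` ("TG" for NTG)."""
    return sum(n for codon, n in counts.items() if codon.endswith(suffix))
```

Calling `count_pattern(codon_counts(cds), "TG")` and `count_pattern(codon_counts(cds), "CG")` for each gene in a GC class gives the raw NTG and NCG tallies that a codon-bias comparison could start from.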

Relevance: 70.00%

Abstract:

Electrospray ionization mass spectrometry (ESI-MS) was used to investigate the binding of 13 alkaloids to two GC-rich DNA duplexes, which are critical sequences in the human survivin promoter. Negative ion ESI-MS was first applied to screen the binding of the alkaloids to the duplexes. Six alkaloids (berberine, jatrorrhizine, palmatine, reserpine, berbamine, and tetrandrine) showed complexation with the target DNA sequences. Relative binding affinities were estimated from the negative ion ESI data, and the alkaloids showed a binding preference for the duplex with higher GC content. Positive ion ESI mass spectra of the complexes were also recorded and compared with those obtained in negative ion mode.

Relevance: 70.00%

Abstract:

BACKGROUND: While effective population size (Ne) and life history traits such as generation time are known to impact substitution rates, their potential effects on base composition evolution are less well understood. GC content increases with decreasing body mass in mammals, consistent with recombination-associated GC biased gene conversion (gBGC) more strongly impacting these lineages. However, shifts in chromosomal architecture and recombination landscapes between species may complicate the interpretation of these results. In birds, interchromosomal rearrangements are rare and the recombination landscape is conserved, suggesting that this group is well suited to assess the impact of life history on base composition. RESULTS: Employing data from 45 newly and 3 previously sequenced avian genomes covering a broad range of taxa, we found that lineages with large populations and short generations exhibit higher GC content. The effect extends to both coding and non-coding sites, indicating that it is not due to selection on codon usage. Consistent with recombination driving base composition, GC content and heterogeneity were positively correlated with the rate of recombination. Moreover, we observed ongoing increases in GC in the majority of lineages. CONCLUSIONS: Our results provide evidence that gBGC may drive patterns of nucleotide composition in avian genomes and are consistent with more effective gBGC in large populations and a greater number of meioses per unit time; that is, a shorter generation time. Thus, in accord with theoretical predictions, base composition evolution is substantially modulated by species life history.

Relevance: 70.00%

Abstract:

The emergence of next-generation sequencing (NGS) platforms has increased the volume of data produced, making it possible to obtain complete genomes. Despite the advantages of these platforms, regions of high or low coverage relative to the mean are observed, and they are directly associated with GC content. This GC bias can affect genomic analyses and hinder de novo genome assembly, as well as affect reference-based analyses. Moreover, methods for assessing GC bias must be suitable for data with different profiles of the relationship between GC content and coverage, such as linear and quadratic. This work therefore proposes using the Pearson correlation coefficient (r) to analyze the correlation between GC content and coverage, making it possible to identify the strength of the linear correlation and to detect non-linear associations, as well as to identify the relationship between GC bias and the sequencing platforms. The positive and negative signs of r also allow directly proportional and inversely proportional relationships, respectively, to be inferred. Data from Corynebacterium pseudotuberculosis, a species known to have clonal genomes, obtained with different sequencing technologies, were used to determine whether GC bias is related to the platforms used.
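A minimal sketch of the kind of calculation this abstract describes: Pearson's r between per-window GC content and per-window coverage. The fixed, non-overlapping windows and the function names are assumptions for illustration; the abstract does not say how windows were defined or how a non-linear (for example quadratic) profile was recognised.

```python
import math
from typing import List, Sequence


def window_gc(genome: str, window: int) -> List[float]:
    """GC fraction of consecutive non-overlapping windows.

    A trailing partial window is dropped (an assumption of this sketch).
    """
    genome = genome.upper()
    starts = range(0, len(genome) - window + 1, window)
    chunks = (genome[i:i + window] for i in starts)
    return [(c.count("G") + c.count("C")) / window for c in chunks]


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined for a constant sample")
    return sxy / math.sqrt(sxx * syy)
```

Coverage that rises linearly with GC gives r close to +1 and coverage that falls linearly gives r close to -1. A quadratic profile that is symmetric around the mean GC gives r close to 0 even though coverage clearly depends on GC, so a low |r| on its own does not rule out GC bias.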

Relevance: 60.00%

Abstract:

Understanding the complexities that are involved in the genetics of multifactorial diseases is still a monumental task. In addition to environmental factors that can influence the risk of disease, there are also a number of other complicating factors. Genetic variants associated with age of disease onset may be different from those variants associated with overall risk of disease, and variants may be located in positions that are not consistent with the traditional protein coding genetic paradigm. Latent Variable Models are well suited for the analysis of genetic data. A latent variable is one that we do not directly observe, but which is believed to exist or is included for computational or analytic convenience in a model. This thesis presents a mixture of methodological developments utilising latent variables, and results from case studies in genetic epidemiology and comparative genomics. Epidemiological studies have identified a number of environmental risk factors for appendicitis, but the disease aetiology of this oft thought useless vestige remains largely a mystery. The effects of smoking on other gastrointestinal disorders are well documented, and in light of this, the thesis investigates the association between smoking and appendicitis through the use of latent variables. By utilising data from a large Australian twin study questionnaire as both cohort and case-control, evidence is found for the association between tobacco smoking and appendicitis. Twin and family studies have also found evidence for the role of heredity in the risk of appendicitis. Results from previous studies are extended here to estimate the heritability of age-at-onset and account for the effect of smoking. This thesis presents a novel approach for performing a genome-wide variance components linkage analysis on transformed residuals from a Cox regression.
This method finds evidence for a different subset of genes responsible for variation in age at onset than those associated with overall risk of appendicitis. Motivated by increasing evidence of functional activity in regions of the genome once thought of as evolutionary graveyards, this thesis develops a generalisation to the Bayesian multiple changepoint model on aligned DNA sequences for more than two species. This sensitive technique is applied to evaluating the distributions of evolutionary rates, with the finding that they are much more complex than previously apparent. We show strong evidence for at least 9 well-resolved evolutionary rate classes in an alignment of four Drosophila species and at least 7 classes in an alignment of four mammals, including human. A pattern of enrichment and depletion of genic regions in the profiled segments suggests they are functionally significant, and most likely consist of various functional classes. Furthermore, a method of incorporating alignment characteristics representative of function such as GC content and type of mutation into the segmentation model is developed within this thesis. Evidence of fine-structured segmental variation is presented.

Relevance: 60.00%

Abstract:

The 3′ UTRs of eukaryotic genes participate in a variety of post-transcriptional (and some transcriptional) regulatory interactions. Some of these interactions are well characterised, but an undetermined number remain to be discovered. While some regulatory sequences in 3′ UTRs may be conserved over long evolutionary time scales, others may have only ephemeral functional significance as regulatory profiles respond to changing selective pressures. Here we propose a sensitive segmentation methodology for investigating patterns of composition and conservation in 3′ UTRs based on comparison of closely related species. We describe encodings of pairwise and three-way alignments integrating information about conservation, GC content and transition/transversion ratios and apply the method to three closely related Drosophila species: D. melanogaster, D. simulans and D. yakuba. Incorporating multiple data types greatly increased the number of segment classes identified compared to similar methods based on conservation or GC content alone. We propose that the number of segments and number of types of segment identified by the method can be used as proxies for functional complexity. Our main finding is that the number of segments and segment classes identified in 3′ UTRs is greater than in the same length of protein-coding sequence, suggesting greater functional complexity in 3′ UTRs. There is thus a need for sustained and extensive efforts by bioinformaticians to delineate functional elements in this important genomic fraction. C code, data and results are available upon request.

Relevance: 60.00%

Abstract:

Computational epigenetics is a new area of research focused on exploring how DNA methylation patterns affect transcription factor binding and, in turn, gene expression patterns. The aim of this study was to produce a new protocol for the detection of DNA methylation patterns using computational analysis, which can be further confirmed by bisulfite PCR with serial pyrosequencing. The upstream regulatory element and pre-initiation complex relative to CpG islets within the methylenetetrahydrofolate reductase gene were determined via computational analysis and online databases. A 1,104 bp long CpG island located near or at the alternative promoter site of the methylenetetrahydrofolate reductase gene was identified. The CpG plot indicated that CpG islets A and B, within the island, contained 62% and 75% GC content and CpG ratios of 0.70 and 0.80–0.95, respectively. Further exploration of CpG islets A and B indicated that the transcription start sites were GGC, which were absent from the TATA boxes. In addition, although six PROSITE motifs were identified in CpG B, no motifs were detected in CpG A. A number of cis-regulatory elements were found in different regions within CpGs A and B. Transcription factors were predicted to bind to CpGs A and B with varying affinities depending on the DNA methylation status. In addition, transcription factor binding may influence the expression patterns of the methylenetetrahydrofolate reductase gene by recruiting chromatin condensation inducing factors. These results have significant implications for the understanding of the architecture of transcription factor binding at CpG islets as well as DNA methylation patterns that affect chromatin structure.
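The two quantities quoted for the CpG islets, percent GC and the CpG ratio, can be computed as in the sketch below. It assumes that "CpG ratio" means the usual observed/expected CpG ratio, (number of CpG × length) / (number of C × number of G). The island thresholds in `looks_like_cpg_island` (length of at least 200 bp, GC above 50%, ratio above 0.6) are the widely used Gardiner-Garden and Frommer style criteria, not values taken from the abstract, and the function names are hypothetical.

```python
from typing import Tuple


def cpg_metrics(seq: str) -> Tuple[float, float]:
    """Return (percent GC, observed/expected CpG ratio) for a sequence."""
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        raise ValueError("empty sequence")
    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")  # CpG dinucleotides on the given strand
    gc_percent = 100.0 * (c + g) / n
    # Expected CpG count under independence is c * g / n.
    obs_exp = (cpg * n) / (c * g) if c and g else 0.0
    return gc_percent, obs_exp


def looks_like_cpg_island(seq: str,
                          min_len: int = 200,
                          min_gc: float = 50.0,
                          min_ratio: float = 0.6) -> bool:
    """Apply commonly used island criteria (assumed, not from the abstract)."""
    if len(seq) < min_len:
        return False
    gc_percent, obs_exp = cpg_metrics(seq)
    return gc_percent > min_gc and obs_exp > min_ratio
```

Under these assumed thresholds, the reported values for islets A and B (62% and 75% GC, ratios of 0.70 and 0.80–0.95) would both pass the GC and ratio tests.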

Relevance: 60.00%

Abstract:

In Escherichia coli, the canonical intrinsic terminator of transcription includes a palindrome followed by a U-trail on the transcript. The apparent underrepresentation of such terminators in eubacterial genomes led us to develop a rapid and accurate algorithm, GeSTer, to predict putative intrinsic terminators. Now, we have analyzed 378 genome sequences with an improved version of GeSTer. Our results indicate that the canonical E. coli type terminators are not overwhelmingly abundant in eubacteria. The atypical structures, which have stem-loops but lack the ‘U’ trail, occur downstream of genes in all the analyzed genomes, but different phyla show a conserved preference for different types of terminators. This propensity correlates with genomic GC content and the presence of the factor Rho. 60–70% of identified terminators in all the genomes show “optimized” stem-length and ΔG. These results provide evidence that eubacteria extensively rely on the mechanism of intrinsic termination, with a considerable divergence in their structure, positioning and prevalence. The software and detailed results for individual genomes are freely available on request.

Relevance: 60.00%

Abstract:

The rapid increase in genome sequence information has necessitated the annotation of functional elements, particularly those occurring in the non-coding regions, in the genomic context. The promoter region is the key regulatory region, which enables the gene to be transcribed or repressed, but it is difficult to determine experimentally. Hence an in silico identification of promoters is crucial in order to guide experimental work and to pinpoint the key region that controls the transcription initiation of a gene. In this analysis, we demonstrate that while the promoter regions are in general less stable than the flanking regions, their average free energy varies depending on the GC composition of the flanking genomic sequence. We have therefore obtained a set of free energy threshold values for genomic DNA with varying GC content and used them as generic criteria for predicting promoter regions in several microbial genomes, using an in-house developed tool, 'PromPredict'. On applying it to predict promoter regions corresponding to the 1144 and 612 experimentally validated TSSs in E. coli (50.8% GC) and B. subtilis (43.5% GC), sensitivities of 99% and 95% and precision values of 58% and 60%, respectively, were achieved. For the limited data set of 81 TSSs available for M. tuberculosis (65.6% GC), a sensitivity of 100% and a precision of 49% were obtained.
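For readers less familiar with the two evaluation figures quoted here, the sketch below shows the standard definitions of sensitivity and precision. It assumes the abstract uses these textbook definitions. The helper names and the argument values in the usage note are hypothetical and are not taken from the study.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of real sites (e.g. validated TSSs) that were predicted."""
    total = true_pos + false_neg
    if total == 0:
        raise ValueError("no real sites to evaluate")
    return true_pos / total


def precision(true_pos: int, false_pos: int) -> float:
    """Fraction of predictions that correspond to a real site."""
    total = true_pos + false_pos
    if total == 0:
        raise ValueError("no predictions to evaluate")
    return true_pos / total
```

With made-up counts, `sensitivity(95, 5)` is 0.95 while `precision(95, 63)` is about 0.60. This mirrors the pattern in the abstract, where nearly every validated TSS receives a predicted promoter but a sizeable share of predictions have no validated TSS.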

Relevance: 60.00%

Abstract:

A number of studies have shown that the structure and composition of the bacterial nucleoid influence many processes related to DNA metabolism. The nucleoid-associated proteins not only modulate the DNA conformation but also regulate DNA metabolic processes such as replication, recombination, repair and transcription. Understanding how these processes occur in the context of the Mycobacterium tuberculosis nucleoid is of considerable medical importance because the nucleoid structure may be constantly remodeled in response to environmental signals and/or growth conditions. Many studies have concluded that Escherichia coli H-NS binds to DNA in a sequence-independent manner, with a preference for A-/T-rich tracts in curved DNA; however, recent studies have identified the existence of medium- and low-affinity binding sites in the vicinity of the curved DNA. Here, we show that the M. tuberculosis H-NS protein binds in a more structure-specific manner to DNA replication and repair intermediates, but displays lower affinity for double-stranded DNA with relatively higher GC content. Notably, M. tuberculosis H-NS was able to bind the Holliday junction (HJ), the central recombination intermediate, with substantially higher affinity and inhibited the three-strand exchange promoted by its cognate RecA. Likewise, E. coli H-NS was able to bind the HJ and suppress DNA strand exchange promoted by E. coli RecA, although much less efficiently compared to M. tuberculosis H-NS. Our results provide new insights into a previously unrecognized function of H-NS protein, with implications for blocking the genome integration of horizontally transferred genes by homologous and/or homeologous recombination.

Relevance: 60.00%

Abstract:

Motivation: The number of bacterial genomes being sequenced is increasing very rapidly and hence, it is crucial to have procedures for rapid and reliable annotation of their functional elements such as promoter regions, which control the expression of each gene or each transcription unit of the genome. The present work addresses this requirement and presents a generic method applicable across organisms. Results: Relative stability of the DNA double helical sequences has been used to discriminate promoter regions from non-promoter regions. Based on the difference in stability between neighboring regions, an algorithm has been implemented to predict promoter regions on a large scale over 913 microbial genome sequences. The average free energy values for the promoter regions as well as their downstream regions are found to differ, depending on their GC content. Threshold values to identify promoter regions have been derived using sequences flanking a subset of translation start sites from all microbial genomes and then used to predict promoters over the complete genome sequences. An average recall value of 72% (which indicates the percentage of protein and RNA coding genes with predicted promoter regions assigned to them) and precision of 56% is achieved over the 913 microbial genome dataset.

Relevance: 60.00%

Abstract:

The construction and characterization of two genome-specific recombinant DNA clones from B. nigra are described. Southern analysis showed that the two clones belong to a dispersed repeat family. They differ from each other in their length, distribution and sequence, though the average GC content is nearly the same (45%). These B genome-specific repeats have been used to analyse the phylogenetic relationships between cultivated and wild species of the family Brassicaceae.

Relevance: 60.00%

Abstract:

One of the fundamental questions concerning homologous recombination is how RecA or its homologues recognize several DNA sequences with high affinity and catalyze all the diverse biological activities. In this study, we show that the extent of single-stranded DNA binding and strand exchange (SE) promoted by mycobacterial RecA proteins with DNA substrates having various degrees of GC content was comparable with that observed for Escherichia coli RecA. However, the rate and extent of SE promoted by these recombinases showed a strong negative correlation with increasing amounts of sequence divergence embedded at random across the length of the donor strand. Conversely, a positive correlation was seen between SE efficiency and the degree of sequence divergence in the recipient duplex DNA. The extent of heteroduplex formation was not significantly affected when both the pairing partners contained various degrees of sequence divergence, although there was a moderate decrease in the case of mycobacterial RecA proteins with substrates containing larger amounts of sequence divergence. Whereas a high GC content had no discernible effect on E. coli RecA coprotease activity, a negative correlation was apparent between mycobacterial RecA proteins and GC content. We further show clear differences in the extent of SE promoted by E. coli and mycobacterial RecA proteins in the presence of a wide range of ATP:ADP ratios. Taken together, our findings disclose the existence of functional diversity among E. coli and mycobacterial RecA nucleoprotein filaments, and the milieu of sequence divergence (i.e., in the donor or recipient) exerts differential effects on heteroduplex formation, which has implications for the emergence of new genetic variants.

Relevance: 60.00%

Abstract:

Part I

Chapter 1.....A physicochemical study of the DNA molecules from the three bacteriophages, N1, N5, and N6, which infect the bacterium, M. lysodeikticus, has been made. The molecular weights, as measured by both electron microscopy and sedimentation velocity, are 23 × 10⁶ for N5 DNA and 31 × 10⁶ for N1 and N6 DNA's. All three DNA's are capable of thermally reversible cyclization. N1 and N6 DNA's have identical or very similar base sequences as judged by membrane filter hybridization and by electron microscope heteroduplex studies. They have identical or similar cohesive ends. These results are in accord with the close biological relation between N1 and N6 phages. N5 DNA is not closely related to N1 or N6 DNA. The denaturation Tm of all three DNA's is the same and corresponds to a (GC) content of 70%. However, the buoyant densities in CsCl of N1 and N6 DNA's are lower than expected, corresponding to predicted GC contents of 64 and 67%. The buoyant densities in Cs2SO4 are also somewhat anomalous. The buoyant density anomalies are probably due to the presence of odd bases. However, direct base composition analysis of N1 DNA by anion exchange chromatography confirms a GC content of 70%, and, in the elution system used, no peaks due to odd bases are present.
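The conversions between melting temperature, CsCl buoyant density and GC content mentioned in Chapter 1 can be sketched with two classical empirical relations: Tm = 69.3 + 0.41 × (%GC) for DNA in standard saline citrate, and ρ = 1.660 + 0.098 × (GC fraction) g/cm³ in CsCl. Whether the thesis used exactly these relations and solvent conditions is an assumption, and the function names are hypothetical.

```python
def gc_percent_from_tm(tm_celsius: float) -> float:
    """Estimate %GC from Tm, assuming Tm = 69.3 + 0.41 * %GC.

    The relation holds for standard saline citrate; other ionic
    strengths would need a different intercept.
    """
    return (tm_celsius - 69.3) / 0.41


def gc_percent_from_cscl_density(rho: float) -> float:
    """Estimate %GC from CsCl buoyant density (g/cm^3).

    Assumes rho = 1.660 + 0.098 * (GC fraction) and no unusual bases.
    """
    return 100.0 * (rho - 1.660) / 0.098


def cscl_density_from_gc_percent(gc_percent: float) -> float:
    """Inverse of the density relation above."""
    return 1.660 + 0.098 * gc_percent / 100.0
```

Under these relations a DNA with 70% GC would be expected to band at about 1.729 g/cm³, whereas densities near 1.723 and 1.726 g/cm³ map to roughly 64% and 67% GC. That is the kind of discrepancy between Tm-derived and density-derived GC content that the chapter describes as anomalous.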

Chapter 2.....A covalently closed circular DNA form has been observed as an intracellular form during both productive and abortive infection processes in M. lysodeikticus. This species has been isolated by the method of CsCl-ethidium bromide centrifugation and examined with an electron microscope.

Chapter 3.....A minute circular DNA has been discovered as a homogeneous population in M. lysodeikticus. Its length and molecular weight as determined by electron microscopy are 0.445 μ and 0.88 × 10⁶ daltons respectively. There is about one minicircle per bacterium.

Chapter 4.....Several strains of E. coli 15 harbor a prophage. Viral growth can be induced by exposing the host to mitomycin C or to UV irradiation. The coliphage 15 particles from E. coli 15 and E. coli 15 T- appear as normal phage with head and tail structure; the particles from E. coli 15 TAU are tailless. The complete particles exert a colicinogenic activity on E. coli 15 and 15 T-; the tailless particles do not. No host for a productive viral infection has been found and the phage may be defective. The properties of the DNA of the virus have been studied, mainly by electron microscopy. After induction but before lysis, a closed circular DNA with a contour length of about 11.9 μ is found in the bacterium; the mature phage DNA is a linear duplex and 7.5% longer than the intracellular circular form. This suggests the hypothesis that the mature phage DNA is terminally repetitious and circularly permuted. The hypothesis was confirmed by observing that denaturation and renaturation of the mature phage DNA produce circular duplexes with two single-stranded branches corresponding to the terminal repetition. The contour length of the mature phage DNA was measured relative to φX RFII DNA and λ DNA; the calculated molecular weight is 27 × 10⁶. The length of the single-stranded terminal repetition was compared to the length of φX 174 DNA under conditions where single-stranded DNA is seen in an extended form in electron micrographs. The length of the terminal repetition is found to be 7.4% of the length of the nonrepetitious part of the coliphage 15 DNA. The number of base pairs in the terminal repetition is variable in different molecules, with a fractional standard deviation of 0.18 of the average number in the terminal repetition. A new phenomenon termed "branch migration" has been discovered in renatured circular molecules; it results in forked branches, with two emerging single strands, at the position of the terminal repetition.
The distribution of branch separations between the two terminal repetitions in the population of renatured circular molecules was studied. The observed distribution suggests that there is an excluded volume effect in the renaturation of a population of circularly permuted molecules such that strands with close beginning points preferentially renature with each other. This selective renaturation and the phenomenon of branch migration both affect the distribution of branch separations; the observed distribution does not contradict the hypothesis of a random distribution of beginning points around the chromosome.

Chapter 5.....Some physicochemical studies on the minicircular DNA species in E. coli 15 (0.670 μ, 1.47 × 10⁶ daltons) have been made. Electron microscopic observations showed multimeric forms of the minicircle, which amount to 5% of total DNA species, and also showed presumably replicating forms of the minicircle. A renaturation kinetic study showed that the minicircle is a unique DNA species in its size and base sequence. A study of minicircle replication has been made under conditions in which host DNA synthesis is synchronized. Despite the experimental uncertainties involved, it seems that minicircle replication is random and that the number of minicircles increases continuously throughout a generation of the host, regardless of host DNA synchronization.

Part II

The flow dichroism of dilute DNA solutions (A260≈0.1) has been studied in a Couette-type apparatus with the outer cylinder rotating and with the light path parallel to the cylinder axis. Shear gradients in the range of 5-160 sec⁻¹ were studied. The DNA samples were whole, "half," and "quarter" molecules of T4 bacteriophage DNA, and linear and circular λb2b5c DNA. For the linear molecules, the fractional flow dichroism is a linear function of molecular weight. The dichroism for linear λ DNA is about 1.8 times that of the circular molecule. For a given DNA, the dichroism is an approximately linear function of shear gradient, but with a slight upward curvature at low values of G, and some trend toward saturation at larger values of G. The fractional dichroism increases as the supporting electrolyte concentration decreases.