913 results for Sequence Stratigrafy
Abstract:
By detailed NMR analysis of a human telomere repeating unit, d(CCCTAA), we have found that three distinct tetramers, each of which consists of four symmetric single-strands, slowly exchange in a slightly acidic solution. Our new finding is a novel i-motif topology (T-form) where T4 is intercalated between C1 and C2 of the other duplex. The other two tetramers have a topology where C1 is intercalated between C2 and C3 of the other parallel duplex, resulting in the non-stacking T4 residues (R-form), and a topology where C1 is stacked between C3 and T4 of the other duplex (S-form). From the NMR denaturation profile, the R-form is the most stable of the three structures in the temperature range of 15–50°C, the S-form the second and the T-form the least stable. The thermodynamic parameters indicate that the T-form is the most enthalpically driven and entropically opposed, and its population is increased with decreasing temperature. The T-form structure determined by restrained molecular dynamics calculation suggests that inter-strand van der Waals contacts in the narrow grooves should contribute to the enthalpic stabilization of the T-form.
Abstract:
Here we study the effect of point mutations in proteins on the redistributions of the conformational substates. We show that regardless of the location of a mutation in the protein structure and of its type, the observed movements of the backbone recur largely at the same positions in the structures. Despite the different interactions that are disrupted and formed by the residue substitution, not only are the conformations very similar, but the regions that move are also the same, regardless of their sequential or spatial distance from the mutation. This observation leads us to conclude that, apart from some extreme cases, the details of the interactions are not critically important in determining the protein conformation or in specifying which parts of the protein would be more prone to take on different local conformations in response to changes in the sequence. This finding further illustrates why proteins manifest a robustness toward many mutational events. This nonuniform distribution of the conformer population is consistently observed in a variety of protein structural types. Topology is critically important in determining folding pathways, kinetics, building block cutting, and anatomy trees. Here we show that topology is also very important in determining which regions of the protein structure will respond to sequence changes, regardless of the sequential or spatial location of the mutation.
Abstract:
We present here the complete genome sequence of a common avian clone of Pasteurella multocida, Pm70. The genome of Pm70 is a single circular chromosome 2,257,487 base pairs in length and contains 2,014 predicted coding regions, 6 ribosomal RNA operons, and 57 tRNAs. Genome-scale evolutionary analyses based on pairwise comparisons of 1,197 orthologous sequences between P. multocida, Haemophilus influenzae, and Escherichia coli suggest that P. multocida and H. influenzae diverged ≈270 million years ago and the γ subdivision of the proteobacteria radiated about 680 million years ago. Two previously undescribed open reading frames, accounting for ≈1% of the genome, encode large proteins with homology to the virulence-associated filamentous hemagglutinin of Bordetella pertussis. Consistent with the critical role of iron in the survival of many microbial pathogens, in silico and whole-genome microarray analyses identified more than 50 Pm70 genes with a potential role in iron acquisition and metabolism. Overall, the complete genomic sequence and preliminary functional analyses provide a foundation for future research into the mechanisms of pathogenesis and host specificity of this important multispecies pathogen.
Abstract:
Insulin-regulated aminopeptidase (IRAP), a transmembrane aminopeptidase, is dynamically retained within the endosomal compartment of fibroblasts. The characteristics of this dynamic retention are rapid internalization from the plasma membrane and slow recycling back to the cell surface. These specialized trafficking kinetics result in <15% of IRAP on the cell surface at steady state, compared with 35% of the transferrin receptor, another transmembrane protein that traffics between endosomes and the cell surface. Here we demonstrate that a 29-amino acid region of IRAP's cytoplasmic domain (residues 56–84) is necessary and sufficient to promote trafficking characteristic of IRAP. A di-leucine sequence and a cluster of acidic amino acids within this region are essential elements of the motif that slows IRAP recycling. Rapid internalization requires any two of three distinct motifs: M15,16, DED64–66, and LL76,77. The DED and LL sequences are part of the motif that regulates recycling, demonstrating that this motif is bifunctional. In this study we used horseradish peroxidase quenching of fluorescence to demonstrate that IRAP is dynamically retained within the transferrin receptor-containing general endosomal recycling compartment. Therefore, our data demonstrate that motifs similar to those that determine targeting among distinct membrane compartments can also regulate the rate of transport of proteins from endosomal compartments. We propose a model for dynamic retention in which IRAP is transported from the general endosomal recycling compartment in specialized, slowly budding recycling vesicles that are distinct from those that mediate rapid recycling back to the surface (e.g., transferrin receptor-containing transport vesicles). It is likely that the dynamic retention of IRAP is an example of a general mechanism for regulating the distribution of proteins between the surface and interior of cells.
Abstract:
SF3b155 is an essential spliceosomal protein, highly conserved during evolution. It has been identified as a subunit of splicing factor SF3b, which, together with a second multimeric complex termed SF3a, interacts specifically with the 12S U2 snRNP and converts it into the active 17S form. The protein displays a characteristic intranuclear localization. It is diffusely distributed in the nucleoplasm but highly concentrated in defined intranuclear structures termed “speckles,” a subnuclear compartment enriched in small ribonucleoprotein particles and various splicing factors. The primary sequence of SF3b155 suggests a multidomain structure, different from those of other nuclear speckles components. To identify which part of SF3b155 determines its specific intranuclear localization, we have constructed expression vectors encoding a series of epitope-tagged SF3b155 deletion mutants as well as chimeric combinations of SF3b155 sequences with the soluble cytoplasmic protein pyruvate kinase. Following transfection of cultured mammalian cells, we have identified (i) a functional nuclear localization signal of the monopartite type (KRKRR, amino acids 196–200) and (ii) a molecular segment with multiple threonine-proline repeats (amino acids 208–513), which is essential and sufficient to confer a specific accumulation in nuclear speckles. This latter sequence element, in particular amino acids 208–440, is required for correct subcellular localization of SF3b155 and is also sufficient to target a reporter protein to nuclear speckles. Moreover, this “speckle-targeting sequence” transfers the capacity for interaction with other U2 snRNP components.
Abstract:
The complete genome sequence of Caulobacter crescentus was determined to be 4,016,942 base pairs in a single circular chromosome encoding 3,767 genes. This organism, which grows in a dilute aquatic environment, coordinates the cell division cycle and multiple cell differentiation events. With the annotated genome sequence, a full description of the genetic network that controls bacterial differentiation, cell growth, and cell cycle progression is within reach. Two-component signal transduction proteins are known to play a significant role in cell cycle progression. Genome analysis revealed that the C. crescentus genome encodes a significantly higher number of these signaling proteins (105) than any bacterial genome sequenced thus far. Another regulatory mechanism involved in cell cycle progression is DNA methylation. The occurrence of the recognition sequence for an essential DNA methylating enzyme that is required for cell cycle regulation is severely limited and shows a bias to intergenic regions. The genome contains multiple clusters of genes encoding proteins essential for survival in a nutrient poor habitat. Included are those involved in chemotaxis, outer membrane channel function, degradation of aromatic ring compounds, and the breakdown of plant-derived carbon sources, in addition to many extracytoplasmic function sigma factors, providing the organism with the ability to respond to a wide range of environmental fluctuations. C. crescentus is, to our knowledge, the first free-living α-class proteobacterium to be sequenced and will serve as a foundation for exploring the biology of this group of bacteria, which includes the obligate endosymbiont and human pathogen Rickettsia prowazekii, the plant pathogen Agrobacterium tumefaciens, and the bovine and human pathogen Brucella abortus.
Abstract:
There is a need for faster and more sensitive algorithms for sequence similarity searching in view of the rapidly increasing amounts of genomic sequence data available. Parallel processing capabilities in the form of the single instruction, multiple data (SIMD) technology are now available in common microprocessors and enable a single microprocessor to perform many operations in parallel. The ParAlign algorithm has been specifically designed to take advantage of this technology. The new algorithm initially exploits parallelism to perform a very rapid computation of the exact optimal ungapped alignment score for all diagonals in the alignment matrix. Then, a novel heuristic is employed to compute an approximate score of a gapped alignment by combining the scores of several diagonals. This approximate score is used to select the most interesting database sequences for a subsequent Smith–Waterman alignment, which is also parallelised. The resulting method represents a substantial improvement compared to existing heuristics. The sensitivity and specificity of ParAlign were found to be as good as Smith–Waterman implementations when the same method for computing the statistical significance of the matches was used. In terms of speed, only the significantly less sensitive NCBI BLAST 2 program was found to outperform the new approach. Online searches are available at http://dna.uio.no/search/
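The first ParAlign step described above, an exact optimal ungapped score for every diagonal of the alignment matrix, can be illustrated with a minimal scalar sketch. This is not the ParAlign implementation: it uses no SIMD instructions, and the function name `ungapped_diagonal_scores` and the simple match/mismatch scoring (+1/-1) are assumptions made for illustration only.

```python
def ungapped_diagonal_scores(query, target, match=1, mismatch=-1):
    """Best local ungapped alignment score on every diagonal.

    A diagonal is identified by d = j - i, where i indexes the query
    and j indexes the target. The scoring scheme is an assumption.
    """
    scores = {}
    for d in range(-(len(query) - 1), len(target)):
        best = running = 0
        i = max(0, -d)      # first query index on this diagonal
        j = i + d           # matching target index
        while i < len(query) and j < len(target):
            s = match if query[i] == target[j] else mismatch
            # Restart the segment whenever the running score drops below 0,
            # so 'best' is the maximum-scoring contiguous ungapped segment.
            running = max(0, running + s)
            best = max(best, running)
            i += 1
            j += 1
        scores[d] = best
    return scores
```

For example, `ungapped_diagonal_scores("ACGT", "TTACGTAA")` assigns the highest score to diagonal 2, where the query matches the target exactly. Each diagonal is computed independently of the others, which is the property that makes this step a natural fit for parallel execution.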
Abstract:
The release of vast quantities of DNA sequence data by large-scale genome and expressed sequence tag (EST) projects underlines the necessity for the development of efficient and inexpensive ways to link sequence databases with temporal and spatial expression profiles. Here we demonstrate the power of linking cDNA sequence data (including EST sequences) with transcript profiles revealed by cDNA-AFLP, a highly reproducible differential display method based on restriction enzyme digests and selective amplification under high stringency conditions. We have developed a computer program (GenEST) that predicts the sizes of virtual transcript-derived fragments (TDFs) of in silico-digested cDNA sequences retrieved from databases. The vast majority of the resulting virtual TDFs could be traced back among the thousands of TDFs displayed on cDNA-AFLP gels. Sequencing of the corresponding bands excised from cDNA-AFLP gels revealed no inconsistencies. As a consequence, cDNA sequence databases can be screened very efficiently to identify genes with relevant expression profiles. Conversely, it is possible to go from cDNA-AFLP gels to sequences in the databases. Using the restriction enzyme recognition sites, the primer extensions and the estimated TDF size as identifiers, the DNA sequence(s) corresponding to a TDF with an interesting expression pattern can be identified. In this paper we show examples in both directions by analyzing the plant parasitic nematode Globodera rostochiensis. Various novel pathogenicity factors were identified by combining ESTs from the infective stage juveniles with expression profiles of ∼4000 genes in five developmental stages produced by cDNA-AFLP.
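The idea of predicting fragment sizes from an in silico digest can be sketched as follows. This is not GenEST itself: the function name `virtual_fragments` is hypothetical, the abstract does not say which enzymes were used, and the sketch ignores adapter lengths and selective primer extensions, all of which would affect the sizes seen on a real gel.

```python
def virtual_fragments(seq, enzymes):
    """List (left_enzyme, right_enzyme, length) for fragments flanked by
    adjacent cut sites of two different enzymes.

    'enzymes' maps a name to (recognition_site, cut_offset), where
    cut_offset is the top-strand cut position within the site.
    Adapter and primer-extension lengths are deliberately ignored.
    """
    seq = seq.upper()
    cuts = []
    for name, (site, offset) in enzymes.items():
        pos = seq.find(site)
        while pos != -1:
            cuts.append((pos + offset, name))
            pos = seq.find(site, pos + 1)  # allow overlapping sites
    cuts.sort()
    fragments = []
    for (p1, e1), (p2, e2) in zip(cuts, cuts[1:]):
        if e1 != e2:  # keep only fragments with two different ends
            fragments.append((e1, e2, p2 - p1))
    return fragments
```

As a usage example under an assumed enzyme pair, calling it with `{"EcoRI": ("GAATTC", 1), "MseI": ("TTAA", 1)}` on a sequence containing one site of each reports a single fragment whose length is the distance between the two cut positions.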
Abstract:
The SfiI endonuclease cleaves DNA at the sequence GGCCNNNN↓NGGCC, where N is any base and ↓ is the point of cleavage. Proteins that recognise discontinuous sequences in DNA can be affected by the unspecified sequence between the specified base pairs of the target site. To examine whether this applies to SfiI, a series of DNA duplexes were made with identical sequences apart from discrete variations in the 5 bp spacer. The rates at which SfiI cleaved each duplex were measured under steady-state conditions: the steady-state rates were determined by the DNA cleavage step in the reaction pathway. SfiI cleaved some of these substrates at faster rates than other substrates. For example, the change in spacer sequence from AACAA to AAACA caused a 70-fold increase in reaction rate. In general, the extrapolated values for kcat and Km were both higher on substrates with inflexible spacers than those with flexible structures. The dinucleotide at the site of cleavage was largely immaterial. SfiI activity is thus highly dependent on conformational variations in the spacer DNA.
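The recognition pattern GGCCNNNN↓NGGCC can be located in a sequence with a short regular-expression sketch that also extracts the 5 bp spacer studied above. The function name `find_sfii_sites` is hypothetical, and only the top-strand cut position is reported.

```python
import re

# SfiI site: GGCC, five unspecified bases, GGCC.
# The lookahead lets overlapping sites be reported as well.
SFII_SITE = re.compile(r"(?=(GGCC([ACGT]{5})GGCC))")

def find_sfii_sites(seq):
    """Return (start, spacer, cut_index) for each SfiI site in seq.

    start is the 0-based index of the first G of the site, spacer is the
    5 bp between the two GGCC blocks, and cut_index is the top-strand
    position after the fourth spacer base (GGCCNNNN|NGGCC).
    """
    seq = seq.upper()
    sites = []
    for m in SFII_SITE.finditer(seq):
        start = m.start(1)
        sites.append((start, m.group(2), start + 8))
    return sites
```

For instance, `find_sfii_sites("TTGGCCAACAAGGCCTT")` returns `[(2, "AACAA", 10)]`, so the spacer of each site can be read off directly when comparing substrates.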
Abstract:
The epsilon enhancer element is a pyrimidine-rich sequence that increases expression of T7 gene 10 and a number of Escherichia coli mRNAs during initiation of translation and inhibits expression of the recF mRNA during elongation. Based on its complementarity to the 460 region of 16S rRNA, it has been proposed that epsilon exerts its enhancer activity by base pairing to this complementary rRNA sequence. We have tested this model of enhancer action by constructing mutations in the 460 region of 16S rRNA and examining expression of epsilon-containing CAT reporter genes and recF–lacZ fusions in strains expressing the mutant rRNAs. Replacement of the 460 E. coli stem–loop with that of Salmonella enterica serovar Typhimurium or a stem–loop containing a reversal of all 8 bp in the helical region produced fully functional rRNAs with no apparent effect on cell growth or expression of any epsilon-containing mRNA. Our experiments confirm the reported effects of the epsilon elements on gene expression but show that these effects are independent of the sequence of the 460 region of 16S rRNA, indicating that epsilon–rRNA base pairing does not occur.
Abstract:
We describe a technique, sequence-tagged microsatellite profiling (STMP), to rapidly generate large numbers of simple sequence repeat (SSR) markers from genomic DNA or cDNA. This technique eliminates the need for library screening to identify SSR-containing clones and provides an ∼25-fold increase in sequencing throughput compared to traditional methods. STMP generates short but characteristic nucleotide sequence tags for fragments that are present within a pool of SSR amplicons. These tags are then ligated together to form concatemers for cloning and sequencing. The analysis of thousands of tags gives rise to a representational profile of the abundance and frequency of SSRs within the DNA pool, from which low copy sequences can be identified. As each tag contains sufficient nucleotide sequence for primer design, their conversion into PCR primers allows the amplification of corresponding full-length fragments from the pool of SSR amplicons. These fragments permit the full characterisation of an SSR locus and provide flanking sequence for the development of a microsatellite marker. Alternatively, sequence tag primers can be used to directly amplify corresponding SSR loci from genomic DNA, thereby reducing the cost of developing a microsatellite marker to the synthesis of just one sequence-specific primer. We demonstrate the utility of STMP by the development of SSR markers in bread wheat.
Abstract:
The 1,852,442-bp sequence of an M1 strain of Streptococcus pyogenes, a Gram-positive pathogen, has been determined and contains 1,752 predicted protein-encoding genes. Approximately one-third of these genes have no identifiable function, with the remainder falling into previously characterized categories of known microbial function. Consistent with the observation that S. pyogenes is responsible for a wider variety of human disease than any other bacterial species, more than 40 putative virulence-associated genes have been identified. Additional genes have been identified that encode proteins likely associated with microbial “molecular mimicry” of host characteristics and involved in rheumatic fever or acute glomerulonephritis. The complete or partial sequence of four different bacteriophage genomes is also present, with each containing genes for one or more previously undiscovered superantigen-like proteins. These prophage-associated genes encode at least six potential virulence factors, emphasizing the importance of bacteriophages in horizontal gene transfer and a possible mechanism for generating new strains with increased pathogenic potential.