42 results for Simple Sequence Repeats


Relevance: 30.00%

Abstract:

The present study explores a “hydrophobic” energy function for folding simulations of the protein lattice model. The contribution of each monomer to conformational energy is the product of its “hydrophobicity” and the number of contacts it makes, i.e., $E(\vec{h}, \vec{c}) = -\sum_{i=1}^{N} c_i h_i = -(\vec{h} \cdot \vec{c})$ is the negative scalar product between two vectors in N-dimensional Cartesian space: $\vec{h} = (h_1, \ldots, h_N)$, which represents monomer hydrophobicities and is sequence-dependent; and $\vec{c} = (c_1, \ldots, c_N)$, which represents the number of contacts made by each monomer and is conformation-dependent. A simple theoretical analysis shows that restrictions are imposed concomitantly on both sequences and native structures if the stability criterion for protein-like behavior is to be satisfied. Given a conformation with vector $\vec{c}$, the best sequence is a vector $\vec{h}$ on the direction upon which the projection of $\vec{c} - \vec{\bar{c}}$ is maximal, where $\vec{\bar{c}}$ is the diagonal vector with components equal to $\bar{c}$, the average number of contacts per monomer in the unfolded state. Best native conformations are suggested to be not maximally compact, as assumed in many studies, but the ones with largest variance of contacts among their monomers, i.e., with monomers tending to occupy completely buried or completely exposed positions. This inside/outside segregation is reflected in an apolar/polar distribution on the corresponding sequence. Monte Carlo simulations in two dimensions corroborate this general scheme. Sequences targeted to conformations with large contact variances folded cooperatively with thermodynamics of a two-state transition. Sequences targeted to maximally compact conformations, which have lower contact variance, were either found to have degenerate ground state or to fold with much lower cooperativity.
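The energy function in this abstract can be illustrated with a minimal Python sketch. It assumes a two-dimensional square lattice in which a contact is a pair of monomers on adjacent sites that are not consecutive along the chain; the function names, the toy conformation, and the value used for the unfolded-state average are illustrative assumptions, not taken from the study.

```python
from typing import Dict, List, Tuple

Coord = Tuple[int, int]


def contact_counts(conf: List[Coord]) -> List[int]:
    """Count, for each monomer, its non-bonded nearest neighbours on a square lattice.

    Assumption: a contact is any pair of monomers on adjacent lattice sites
    that are not consecutive along the chain.
    """
    index: Dict[Coord, int] = {pos: i for i, pos in enumerate(conf)}
    counts = [0] * len(conf)
    for i, (x, y) in enumerate(conf):
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            j = index.get((x + dx, y + dy))
            if j is not None and abs(i - j) > 1:
                counts[i] += 1
    return counts


def energy(h: List[float], c: List[int]) -> float:
    """E(h, c) = -sum_i c_i * h_i, the negative scalar product of h and c."""
    return -sum(hi * ci for hi, ci in zip(h, c))


def sequence_direction(c: List[int], c_bar: float) -> List[float]:
    """Direction c - c_bar along which the abstract says the best sequence lies.

    c_bar (average contacts per monomer in the unfolded state) is an input here;
    how it is estimated is not specified in the abstract.
    """
    return [ci - c_bar for ci in c]


# Toy 4-monomer chain folded into a 2x2 square (hypothetical example).
conformation = [(0, 0), (1, 0), (1, 1), (0, 1)]
c = contact_counts(conformation)
h = [1.0, -1.0, -1.0, 1.0]  # illustrative hydrophobicities
print(c, energy(h, c), sequence_direction(c, 0.5))
```

In this toy chain only the two end monomers touch, so a sequence that places the hydrophobic residues at the ends lowers the energy, which mirrors the buried/exposed segregation described in the abstract.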

Relevance: 30.00%

Abstract:

Fanconi anemia (FA) is a genetically heterogeneous autosomal recessive syndrome associated with chromosomal instability, hypersensitivity to DNA crosslinking agents, and predisposition to malignancy. The gene for FA complementation group A (FAA) recently has been cloned. The cDNA is predicted to encode a polypeptide of 1,455 amino acids, with no homologies to any known protein that might suggest a function for FAA. We have used single-strand conformational polymorphism analysis to screen genomic DNA from a panel of 97 racially and ethnically diverse FA patients from the International Fanconi Anemia Registry for mutations in the FAA gene. A total of 85 variant bands were detected. Forty-five of the variants are probably benign polymorphisms, of which nine are common and can be used for various applications, including mapping studies for other genes in this region of chromosome 16q. Amplification refractory mutation system assays were developed to simplify their detection. Forty variants are likely to be pathogenic mutations. Seventeen of these are microdeletions/microinsertions associated with short direct repeats or homonucleotide tracts, a type of mutation thought to be generated by a mechanism of slipped-strand mispairing during DNA replication. A screening of 350 FA probands from the International Fanconi Anemia Registry for two of these deletions (1115–1118del and 3788–3790del) revealed that they are carried on about 2% and 5% of the FA alleles, respectively. 3788–3790del appears in a variety of ethnic groups and is found on at least two different haplotypes. We suggest that FAA is hypermutable, and that slipped-strand mispairing, a mutational mechanism recognized as important for the generation of germ-line and somatic mutations in a variety of cancer-related genes, including p53, APC, RB1, WT1, and BRCA1, may be a major mechanism for FAA mutagenesis.

Relevance: 30.00%

Abstract:

The chromosomal DNA of the bacterium Streptomyces ambofaciens DSM40697 is an 8-Mb linear molecule that ends in terminal inverted repeats (TIRs) of 210 kb. The sequences of the TIRs are highly variable between the different linear replicons of Streptomyces (plasmids or chromosomes). Two spontaneous mutant strains harboring TIRs of 480 and 850 kb were isolated. The TIR polymorphism seen is a result of the deletion of one chromosomal end and its replacement by 480 or 850 kb of sequence identical to the end of the undeleted chromosomal arm. Analysis of the wild-type sequences involved in these rearrangements revealed that a recombination event took place between the two copies of a duplicated DNA sequence. Each copy was mapped to one chromosomal arm, outside of the TIR, and encoded a putative alternative sigma factor. The two ORFs, designated hasR and hasL, were found to be 99% similar at the nucleotide level. The sequence of the chimeric regions generated by the recombination showed that the chromosomal structure of the mutant strains resulted from homologous recombination events between the two copies. We suggest that this mechanism of chromosomal arm replacement contributes to the rapid evolutionary diversification of the sequences of the TIR in Streptomyces.

Relevance: 30.00%

Abstract:

Multiple copies of the hexamer TGCATG have been shown to regulate fibronectin pre-mRNA alternative splicing. GCATG repeats also are clustered near the regulated calcitonin-specific 3′ splice site in the rat calcitonin/CGRP gene. Specific mutagenesis of these repeats in calcitonin/CGRP pre-mRNA resulted in the loss of calcitonin-specific splicing, suggesting that the native repeats act to enhance alternative exon inclusion. Mutation of subsets of these elements implies that alternative splicing requires a minimum of two repeats, and that the combination of one intronic and one exonic repeat is necessary for optimal cell-specific splicing. However, multimerized intronic repeats inhibited calcitonin-specific splicing in both the wild-type context and in a transcript lacking endogenous repeats. These results suggest that both the number and distribution of repeats may be important features for the regulation of tissue-specific alternative splicing. Further, RNA containing a single repeat bound cell-specific protein complexes, but tissue-specific differences in protein binding were not detected by using multimerized repeats. Together, these data support a novel model for alternative splicing regulation that requires the cell-specific recognition of multiple, distributed sequence elements.

Relevance: 30.00%

Abstract:

Many proteins contain reiterated glutamine residues, but polyglutamine of excessive length may result in human disease by conferring new properties on the protein containing it. One established property of a glutamine residue, depending on the nature of the flanking residues, is its ability to act as an amine acceptor in a transglutaminase-catalyzed reaction and to make a glutamyl–lysine cross-link with a neighboring polypeptide. To learn whether glutamine repeats can act as amine acceptors, we have made peptides with variable lengths of polyglutamine flanked by the adjacent amino acid residues in the proteins associated with spinocerebellar ataxia type 1 (SCA1), Machado–Joseph disease (SCA3), or dentato-rubral pallido-luysian atrophy (DRPLA) or those residues adjacent to the preferred cross-linking site of involucrin, or solely by arginine residues. The polyglutamine was found to confer excellent substrate properties on any soluble peptide; under optimal conditions, virtually all the glutamine residues acted as amine acceptors in the reaction with glycine ethyl-ester, and lengthening the sequence of polyglutamine increased the reactivity of each glutamine residue. In the presence of transglutaminase, peptides containing polyglutamine formed insoluble aggregates with the proteins of brain extracts and these aggregates contained glutamyl–lysine cross-links. Repeated glutamine residues exposed on the surface of a neuronal protein should form cross-linked aggregates in the presence of any transglutaminase activated by the presence of Ca²⁺.

Relevance: 30.00%

Abstract:

The evolutionarily-conserved DNA-binding protein RBP-J directly interacts with the RAM domain and the ankyrin (ANK) repeats of the Notch intracellular region (RAMIC), and activates transcription of downstream target genes that regulate cell differentiation. In vitro binding assays demonstrate that the truncated N- and C-terminal regions of RBP-J bind to the ANK repeats but not to the RAM domain. Using an OT11 mouse cell line, in which the RBP-J locus is disrupted, we showed that RBP-J constructs mutated in the N- and C-terminal regions were defective in their transcriptional activation induced by either RAMIC or IC (the Notch intracellular region without the RAM domain) although they had normal levels of binding activity to DNA and the RAM domain. The studies using chimeric molecules between RBP-J and its homolog RBP-L showed that the N- and C-terminal regions of RBP-J conferred the IC- as well as RAMIC-induced transactivation potential on RBP-L, which binds to the same DNA sequence as RBP-J but fails to interact with RAMIC. Taken together, these results indicate that the interactions between the N- and C-terminal regions of RBP-J and the ANK repeats of RAMIC are important for transactivation of RBP-J by RAMIC.

Relevance: 30.00%

Abstract:

BAliBASE is specifically designed to serve as an evaluation resource to address all the problems encountered when aligning complete sequences. The database contains high quality, manually constructed multiple sequence alignments together with detailed annotations. The alignments are all based on three-dimensional structural superpositions, with the exception of the transmembrane sequences. The first release provided sets of reference alignments dealing with the problems of high variability, unequal repartition and large N/C-terminal extensions and internal insertions. Here we describe version 2.0 of the database, which incorporates three new reference sets of alignments containing structural repeats, transmembrane sequences and circular permutations to evaluate the accuracy of detection/prediction and alignment of these complex sequences. BAliBASE can be viewed at the web site http://www-igbmc.u-strasbg.fr/BioInfo/BAliBASE2/index.html or can be downloaded from ftp://ftp-igbmc.u-strasbg.fr/pub/BAliBASE2/.

Relevance: 30.00%

Abstract:

The key requirements for high-throughput single-nucleotide polymorphism (SNP) typing of DNA samples in large-scale disease case-control studies are automatability, simplicity, and robustness, coupled with minimal cost. In this paper we describe a fluorescence technique for the detection of SNPs that have been amplified by using the amplification refractory mutation system (ARMS)-PCR procedure. Its performance was evaluated using 32 sequence-specific primer mixes to assign the HLA-DRB alleles to 80 lymphoblastoid cell line DNAs chosen from our database for their diversity. All had been typed previously by alternative methods, either direct sequencing or gel electrophoresis. We believe the detection system that we call AMDI (alkaline-mediated differential interaction) satisfies the above criteria and is suitable for general high-throughput SNP typing.

Relevance: 30.00%

Abstract:

SF3b155 is an essential spliceosomal protein, highly conserved during evolution. It has been identified as a subunit of splicing factor SF3b, which, together with a second multimeric complex termed SF3a, interacts specifically with the 12S U2 snRNP and converts it into the active 17S form. The protein displays a characteristic intranuclear localization. It is diffusely distributed in the nucleoplasm but highly concentrated in defined intranuclear structures termed “speckles,” a subnuclear compartment enriched in small ribonucleoprotein particles and various splicing factors. The primary sequence of SF3b155 suggests a multidomain structure, different from those of other nuclear speckles components. To identify which part of SF3b155 determines its specific intranuclear localization, we have constructed expression vectors encoding a series of epitope-tagged SF3b155 deletion mutants as well as chimeric combinations of SF3b155 sequences with the soluble cytoplasmic protein pyruvate kinase. Following transfection of cultured mammalian cells, we have identified (i) a functional nuclear localization signal of the monopartite type (KRKRR, amino acids 196–200) and (ii) a molecular segment with multiple threonine-proline repeats (amino acids 208–513), which is essential and sufficient to confer a specific accumulation in nuclear speckles. This latter sequence element, in particular amino acids 208–440, is required for correct subcellular localization of SF3b155 and is also sufficient to target a reporter protein to nuclear speckles. Moreover, this “speckle-targeting sequence” transfers the capacity for interaction with other U2 snRNP components.

Relevance: 30.00%

Abstract:

Previously conducted sequence analysis of Arabidopsis thaliana (ecotype Columbia-0) reported an insertion of 270-kb mtDNA into the pericentric region on the short arm of chromosome 2. DNA fiber-based fluorescence in situ hybridization analyses reveal that the mtDNA insert is 618 ± 42 kb, ≈2.3 times greater than that determined by contig assembly and sequencing analysis. Portions of the mitochondrial genome previously believed to be absent were identified within the insert. Sections of the mtDNA are repeated throughout the insert. The cytological data illustrate that DNA contig assembly by using bacterial artificial chromosomes tends to produce a minimal clone path by skipping over duplicated regions, thereby resulting in sequencing errors. We demonstrate that fiber-fluorescence in situ hybridization is a powerful technique to analyze large repetitive regions in the higher eukaryotic genomes and is a valuable complement to ongoing large genome sequencing projects.

Relevance: 30.00%

Abstract:

We present an approach for assessing the significance of sequence and structure comparisons by using nearly identical statistical formalisms for both sequence and structure. Doing so involves an all-vs.-all comparison of protein domains [taken here from the Structural Classification of Proteins (scop) database] and then fitting a simple distribution function to the observed scores. By using this distribution, we can attach a statistical significance to each comparison score in the form of a P value, the probability that a better score would occur by chance. As expected, we find that the scores for sequence matching follow an extreme-value distribution. The agreement, moreover, between the P values that we derive from this distribution and those reported by standard programs (e.g., blast and fasta) validates our approach. Structure comparison scores also follow an extreme-value distribution when the statistics are expressed in terms of a structural alignment score (essentially the sum of reciprocated distances between aligned atoms minus gap penalties). We find that the traditional metric of structural similarity, the rms deviation in atom positions after fitting aligned atoms, follows a different distribution of scores and does not perform as well as the structural alignment score. Comparison of the sequence and structure statistics for pairs of proteins known to be related distantly shows that structural comparison is able to detect approximately twice as many distant relationships as sequence comparison at the same error rate. The comparison also indicates that there are very few pairs with significant similarity in terms of sequence but not structure whereas many pairs have significant similarity in terms of structure but not sequence.
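As a rough illustration of attaching a P value to a score under an extreme-value distribution, the Python sketch below uses the standard Gumbel form, P(S ≥ x) = 1 − exp(−exp(−(x − μ)/β)). The method-of-moments fit and the function names are assumptions made for this example; the abstract does not say how the authors fitted their distribution, and their scores may be length-normalized in ways not modeled here.

```python
import math
from statistics import mean, stdev
from typing import Sequence, Tuple

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def fit_gumbel_moments(scores: Sequence[float]) -> Tuple[float, float]:
    """Estimate Gumbel location (mu) and scale (beta) by the method of moments.

    Assumption: this simple estimator stands in for whatever fitting
    procedure the study actually used.
    """
    beta = stdev(scores) * math.sqrt(6) / math.pi
    mu = mean(scores) - EULER_GAMMA * beta
    return mu, beta


def p_value(score: float, mu: float, beta: float) -> float:
    """Probability that a score at least this high occurs by chance.

    Computes 1 - exp(-exp(-z)) with z = (score - mu) / beta, using expm1
    to keep precision when the result is very small.
    """
    z = (score - mu) / beta
    return -math.expm1(-math.exp(-z))


# Hypothetical background scores from unrelated pairs.
background = [1.0, 2.0, 3.0, 4.0, 5.0]
mu, beta = fit_gumbel_moments(background)
print(mu, beta, p_value(8.0, mu, beta))
```

Higher scores give smaller P values, so a cutoff on the P value translates directly into an error rate for calling two domains related.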

Relevance: 30.00%

Abstract:

This computer simulation is based on a model of the origin of life proposed by H. Kuhn and J. Waser, where the evolution of short molecular strands is assumed to take place in a distinct spatiotemporal structured environment. In their model, the prebiotic situation is strongly simplified to grasp essential features of the evolution of the genetic apparatus without attempts to trace the historic path. By confining the computer implementation to principal aspects and focusing on critical features of the model, a deeper understanding of the model's premises is achieved. Each generation consists of three steps: (i) construction of devices (entities exposed to selection) presently available; (ii) selection; and (iii) multiplication of the isolated strands (R oligomers) by complementary copying with occasional variation by copying mismatch. In the beginning, the devices are single strands with random sequences; later, increasingly complex aggregates of strands form devices such as a hairpin-assembler device, which develops in favorable cases. A monomers interlink by binding to the hairpin-assembler device, and a translation machinery, called the hairpin-assembler-enzyme device, emerges, which translates the sequence of R1 and R2 monomers in the assembler strand to the sequence of A1 and A2 monomers in the A oligomer, working as an enzyme.

Relevance: 30.00%

Abstract:

The whole genome sequence (1.83 Mbp) of Haemophilus influenzae strain Rd was searched to identify tandem oligonucleotide repeat sequences. Loss or gain of one or more nucleotide repeats through a recombination-independent slippage mechanism is known to mediate phase variation of surface molecules of pathogenic bacteria, including H. influenzae. This facilitates evasion of host defenses and adaptation to the varying microenvironments of the host. We reasoned that iterative nucleotides could identify novel genes relevant to microbe-host interactions. Our search of the Rd genome sequence identified 9 novel loci with multiple (range 6-36, mean 22) tandem tetranucleotide repeats. All were found to be located within putative open reading frames and included homologues of hemoglobin-binding proteins of Neisseria, a glycosyltransferase (lgtC gene product) of Neisseria, and an adhesin of Yersinia. These tetranucleotide repeat sequences were also shown to be present in two other epidemiologically different H. influenzae type b strains, although the number and distribution of repeats were different. Further characterization of the lgtC gene showed that it was involved in phenotypic switching of a lipopolysaccharide epitope and that this variable expression was associated with changes in the number of tetranucleotide repeats. Mutation of lgtC resulted in attenuated virulence of H. influenzae in an infant rat model of invasive infection. These data indicate the rapidity, economy, and completeness with which whole genome sequences can be used to investigate the biology of pathogenic bacteria.
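A genome scan for tandem tetranucleotide repeats of the kind described in this abstract might look like the following Python sketch. It is an assumption-laden illustration, not the authors' search procedure: the function name, the regular-expression approach, and the default threshold of six copies (chosen to match the lower end of the reported range) are all hypothetical.

```python
import re
from typing import List, Tuple


def find_tandem_repeats(
    seq: str, unit_len: int = 4, min_copies: int = 6
) -> List[Tuple[int, str, int]]:
    """Return (start, unit, copies) for each perfect tandem repeat in seq.

    A unit of unit_len bases must be followed by at least min_copies - 1
    identical copies. Homopolymer runs (e.g. AAAA...) are skipped.
    Note: units that are themselves shorter repeats (e.g. ATAT) are still
    reported; filtering those is left out of this sketch.
    """
    pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_copies - 1))
    hits: List[Tuple[int, str, int]] = []
    for match in pattern.finditer(seq.upper()):
        unit = match.group(1)
        if len(set(unit)) == 1:
            continue  # homonucleotide tract, not a tetranucleotide repeat
        copies = len(match.group(0)) // unit_len
        hits.append((match.start(), unit, copies))
    return hits


# Toy sequence (hypothetical): eight copies of CAAT with short flanks.
toy = "GG" + "CAAT" * 8 + "TTG"
print(find_tandem_repeats(toy))
```

Comparing the copy number reported at the same locus across strains is one simple way to see the repeat-number variation that the abstract links to phase variation.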

Relevance: 30.00%

Abstract:

Aminoacyl-tRNA synthetases (tRNA synthetases) of higher eukaryotes form a multiprotein complex. Sequence elements that are responsible for the protein assembly were searched by using a yeast two-hybrid system. Human cytoplasmic isoleucyl-tRNA synthetase is a component of the multi-tRNA synthetase complex and it contains a unique C-terminal appendix. This part of the protein was used as bait to identify an interacting protein from a HeLa cDNA library. The selected sequence represented the internal 317 amino acids of human bifunctional (glutamyl- and prolyl-) tRNA synthetase, which is also known to be a component of the complex. Both the C-terminal appendix of the isoleucyl-tRNA synthetase and the internal region of bifunctional tRNA synthetase comprise repeating sequence units, two repeats of about 90 amino acids, and three repeats of 57 amino acids, respectively. Each repeated motif of the two proteins was responsible for the interaction, but the stronger interaction was shown by the native structures containing multiple motifs. Interestingly, the N-terminal extension of human glycyl-tRNA synthetase containing a single motif homologous to those in the bifunctional tRNA synthetase also interacted with the C-terminal motif of the isoleucyl-tRNA synthetase although the enzyme is not a component of the complex. The data indicate that the multiplicity of the binding motif in the tRNA synthetases is necessary for enhancing the interaction strength and may be one of the determining factors for the tRNA synthetases to be involved in the formation of the multi-tRNA synthetase complex.

Relevance: 30.00%

Abstract:

Current evidence on the long-term evolutionary effect of insertion of sequence elements into gene regions is reviewed, restricted to cases where a sequence derived from a past insertion participates in the regulation of expression of a useful gene. Ten such examples in eukaryotes demonstrate that segments of repetitive DNA or mobile elements have been inserted in the past in gene regions, have been preserved, sometimes modified by selection, and now affect control of transcription of the adjacent gene. Included are only examples in which transcription control was modified by the insert. Several cases in which merely transcription initiation occurred in the insert were set aside. Two of the examples involved the long terminal repeats of mammalian endogenous retroviruses. Another two examples were control of transcription by repeated sequence inserts in sea urchin genomes. There are now six published examples in which Alu sequences were inserted long ago into human gene regions, were modified, and now are central in control/enhancement of transcription. The number of published examples of Alu sequences affecting gene control has grown threefold in the last year and is likely to continue growing. Taken together, all of these examples show that the insertion of sequence elements in the genome has been a significant source of regulatory variation in evolution.