995 results for Short homologous sequences
Abstract:
Previous research into formulaic language has focussed on specialised groups of people (e.g. L1 acquisition by infants and adult L2 acquisition) with ordinary adult native speakers of English receiving less attention. Additionally, whilst some features of formulaic language have been used as evidence of authorship (e.g. the Unabomber’s use of you can’t eat your cake and have it too) there has been no systematic investigation into this as a potential marker of authorship. This thesis reports the first full-scale study into the use of formulaic sequences by individual authors. The theory of formulaic language hypothesises that formulaic sequences contained in the mental lexicon are shaped by experience combined with what each individual has found to be communicatively effective. Each author’s repertoire of formulaic sequences should therefore differ. To test this assertion, three automated approaches to the identification of formulaic sequences are tested on a specially constructed corpus containing 100 short narratives. The first approach explores a limited subset of formulaic sequences using recurrence across a series of texts as the criterion for identification. The second approach focuses on a word which frequently occurs as part of formulaic sequences and also investigates alternative non-formulaic realisations of the same semantic content. Finally, a reference list approach is used. Whilst claiming authority for any reference list can be difficult, the proposed method utilises internet examples derived from lists prepared by others, a procedure which, it is argued, is akin to asking large groups of judges to reach consensus about what is formulaic. The empirical evidence supports the notion that formulaic sequences have potential as a marker of authorship since in some cases a Questioned Document was correctly attributed. 
Although this marker of authorship is not universally applicable, it does promise to become a viable new tool in the forensic linguist’s tool-kit.
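The first approach described above uses recurrence across a series of texts as the identification criterion. A minimal sketch of that idea follows, assuming that word n-grams stand in for candidate formulaic sequences. The function names, the tokenizer, and the thresholds `n` and `min_texts` are illustrative assumptions, not the procedure used in the thesis.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def recurrent_ngrams(texts, n=4, min_texts=2):
    """Return word n-grams that occur in at least `min_texts` different texts.

    Hypothetical helper: n-grams are only a rough proxy for formulaic
    sequences, and both thresholds are arbitrary choices for illustration.
    """
    seen_in = defaultdict(set)  # n-gram -> indices of the texts containing it
    for idx, text in enumerate(texts):
        tokens = tokenize(text)
        for i in range(len(tokens) - n + 1):
            seen_in[tuple(tokens[i:i + n])].add(idx)
    return {
        " ".join(gram): len(ids)
        for gram, ids in seen_in.items()
        if len(ids) >= min_texts
    }


if __name__ == "__main__":
    sample = [
        "At the end of the day we left.",
        "It was, at the end of the day, fine.",
    ]
    for gram, count in sorted(recurrent_ngrams(sample).items()):
        print(gram, count)
```

Comparing the recurrent n-grams found in one author's texts with those found in a questioned document would be one way to turn this count into an authorship feature, though the abstract does not specify how the comparison was made.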
Abstract:
A detailed knowledge of the mapping between sequence and structure spaces in populations of RNA molecules is essential to better understand their present-day functional properties, to envisage a plausible early evolution of RNA in a prebiotic chemical environment and to improve the design of in vitro evolution experiments, among others. Analysis of natural RNAs, as well as in vitro and computational studies, shows that certain RNA structural motifs are much more abundant than others, pointing to a complex relation between sequence and structure. Within this framework, we have investigated computationally the structural properties of a large pool (10 molecules) of single-stranded, 35 nt-long, random RNA sequences. The secondary structures obtained are ranked and classified into structure families. The number of structures in main families is analytically calculated and compared with the numerical results. This permits a quantification of the fraction of structure space covered by a large pool of sequences. We further show that the number of structural motifs and their frequency is highly unbalanced with respect to the nucleotide composition: simple structures such as stem-loops and hairpins arise from sequences depleted in G, while more complex structures require an enrichment of G. In general, we observe a strong correlation between subfamilies (characterized by a fixed number of paired nucleotides) and nucleotide composition. Our results are compared to the structural repertoire obtained in a second pool where isolated base pairs are prohibited. © 2008 Elsevier Ltd. All rights reserved.
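The bookkeeping behind grouping sequences by number of paired nucleotides and comparing nucleotide composition can be sketched as follows. This is a toy illustration only: it swaps in Nussinov-style base-pair maximization for whatever folding method the study used, so it is not expected to reproduce the reported trends. The pair set, `MIN_LOOP`, the pool size and all function names are assumptions.

```python
import random
from collections import defaultdict

# Watson-Crick pairs plus the G-U wobble pair (an assumption of this sketch).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}
MIN_LOOP = 3  # assumed minimum number of unpaired bases enclosed by a pair


def max_pairs(seq):
    """Nussinov-style dynamic programme: maximum number of nested base pairs."""
    n = len(seq)
    if n == 0:
        return 0
    dp = [[0] * n for _ in range(n)]
    for span in range(MIN_LOOP + 1, n):
        for i in range(n - span):
            j = i + span
            best = dp[i][j - 1]  # case: position j stays unpaired
            for k in range(i, j - MIN_LOOP):  # case: position j pairs with k
                if (seq[k], seq[j]) in PAIRS:
                    left = dp[i][k - 1] if k > i else 0
                    best = max(best, left + 1 + dp[k + 1][j - 1])
            dp[i][j] = best
    return dp[0][n - 1]


def random_pool(size, length=35, seed=0):
    """Generate `size` random RNA sequences of the given length."""
    rng = random.Random(seed)
    return ["".join(rng.choice("ACGU") for _ in range(length)) for _ in range(size)]


def g_fraction_by_paired_count(pool):
    """Mean G fraction of the sequences in each paired-nucleotide subfamily."""
    groups = defaultdict(list)
    for seq in pool:
        groups[2 * max_pairs(seq)].append(seq.count("G") / len(seq))
    return {paired: sum(v) / len(v) for paired, v in sorted(groups.items())}


if __name__ == "__main__":
    for paired, mean_g in g_fraction_by_paired_count(random_pool(200)).items():
        print(paired, round(mean_g, 3))
```

Pair maximization ignores stacking energies and loop penalties, so a thermodynamic folding tool would be needed before drawing any conclusion about G content from such a table.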
Abstract:
Genetic recombination can produce heterogeneous phylogenetic histories within a set of homologous genes. Delineating recombination events is important in the study of molecular evolution, as inference of such events provides a clearer picture of the phylogenetic relationships among different gene sequences or genomes. Nevertheless, detecting recombination events can be a daunting task, as the performance of different recombination-detecting approaches can vary, depending on evolutionary events that take place after recombination. We recently evaluated the effects of post-recombination events on the prediction accuracy of recombination-detecting approaches using simulated nucleotide sequence data. The main conclusion, supported by other studies, is that one should not depend on a single method when searching for recombination events. In this paper, we introduce a two-phase strategy, applying three statistical measures to detect the occurrence of recombination events, and a Bayesian phylogenetic approach in delineating breakpoints of such events in nucleotide sequences. We evaluate the performance of these approaches using simulated data, and demonstrate the applicability of this strategy to empirical data. The two-phase strategy proves to be time-efficient when applied to large datasets, and yields high-confidence results.
Abstract:
Genetic recombination can produce heterogeneous phylogenetic histories within a set of homologous genes. Delineating recombination events is important in the study of molecular evolution, as inference of such events provides a clearer picture of the phylogenetic relationships among different gene sequences or genomes. Nevertheless, detecting recombination events can be a daunting task, as the performance of different recombination-detecting approaches can vary, depending on evolutionary events that take place after recombination. We previously evaluated the effects of post-recombination events on the prediction accuracy of recombination-detecting approaches using simulated nucleotide sequence data. The main conclusion, supported by other studies, is that one should not depend on a single method when searching for recombination events. In this paper, we introduce a two-phase strategy, applying three statistical measures to detect the occurrence of recombination events, and a Bayesian phylogenetic approach to delineate breakpoints of such events in nucleotide sequences. We evaluate the performance of these approaches using simulated data, and demonstrate the applicability of this strategy to empirical data. The two-phase strategy proves to be time-efficient when applied to large datasets, and yields high-confidence results.
Abstract:
In the kallikrein-kinin and renin-angiotensin systems the main receptors, B-1 and B-2 (kinin receptors) and AT(1) and AT(2) (angiotensin receptors) respectively, are seven-transmembrane domain G-protein-coupled receptors. Considering that the B-1 agonists Des-Arg(9)-BK (Arg-Pro-Pro-Gly-Phe-Ser-Pro-Phe), Lys-desArg(9)-BK or Des-Arg(10)-KD (Lys-Arg-Pro-Pro-Gly-Phe-Ser-Pro-Phe) and the AT(1) agonist (Asp-Arg-Val-Tyr-Ile-His-Pro-Phe) have the same two residues at the C-terminal region (i.e. Pro-Phe), we hypothesized that TM V and TM VI of the B-1 receptor could play an essential role in agonist binding and activity, these regions being receptor sites for binding the C-terminal sequences of Des-Arg-kinins, similarly to that observed for the AT(1) receptor. To investigate this hypothesis, we replaced Arg(212) with Ala at the top of TM V and the sequence 274-282 (CPYHFFAFL) in TM VI of the rat kinin B-1 receptor with the B-2 receptor homologous sequence, 289-297 (FPFQISTFL), and subsequently analyzed the consequences of these mutations by competition binding and functional assays. Despite correct expression, observed at the mRNA and protein level by RT-PCR and confocal microscopy, respectively, no agonist binding or function was detected for the mutated receptors. Therefore, our results suggest an important role for Arg(212) in TM V and a region of TM VI of the rat B-1 receptor in the interaction with the C-terminal residues of Des-Arg-kinins, similar to that observed with AngII. (c) 2007 Elsevier B.V. All rights reserved.
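The observation that motivates the hypothesis, a shared Pro-Phe C-terminus among the agonists, can be checked with a short sketch. The peptide sequences are the ones quoted in the abstract; the function name `common_c_terminal` is a hypothetical helper introduced here.

```python
def common_c_terminal(peptides):
    """Return the residues shared by all peptides at the C-terminal end.

    Each peptide is a hyphen-separated string of three-letter residue codes,
    written from N-terminus to C-terminus.
    """
    split = [p.split("-") for p in peptides]
    shared = []
    # Walk backwards from the C-terminus while every peptide agrees.
    for residues in zip(*(reversed(s) for s in split)):
        if len(set(residues)) != 1:
            break
        shared.append(residues[0])
    return shared[::-1]


if __name__ == "__main__":
    agonists = {
        "Des-Arg(9)-BK": "Arg-Pro-Pro-Gly-Phe-Ser-Pro-Phe",
        "Lys-desArg(9)-BK": "Lys-Arg-Pro-Pro-Gly-Phe-Ser-Pro-Phe",
        "AT(1) agonist": "Asp-Arg-Val-Tyr-Ile-His-Pro-Phe",
    }
    print("-".join(common_c_terminal(list(agonists.values()))))
```

The two kinin agonists alone share a much longer C-terminal stretch; it is the comparison with the angiotensin agonist that narrows the common motif to the final two residues.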
Abstract:
Distinct Echinococcus granulosus life cycle patterns have been described in North America: domestic and sylvatic. Gene sequences of the sylvatic E. granulosus indicate that it represents a separate variant. Case-based data have suggested that the course of sylvatic disease is less severe than that of domestic disease, which led to the recommendation to treat cystic echinococcosis patients in the Arctic by careful medical management rather than by aggressive surgery. We recently reported the first two documented E. granulosus human cases in Alaska with accompanying severe sequelae. Here we describe the results of molecular genetic analysis of the cyst material of one of the subjects that supported identification of the parasite as the sylvatic (cervid) strain and not the domestic (common sheep) strain, which was initially thought to be implicated in these unusually severe Alaskan cases.
Abstract:
A new RTE-like, non-long terminal repeat retrotransposon, termed SjR2, from the human blood fluke, Schistosoma japonicum, is described. SjR2 is approximately 3.9 kb in length and consists of a single open reading frame encoding a polyprotein with apurinic/apyrimidinic endonuclease and reverse transcriptase domains. The open reading frame is bounded by 5'- and 3'-terminal untranslated regions and, at its 3'-terminus, SjR2 bears a short (TGAC)(3) repeat. Phylogenetic analyses based on conserved domains of reverse transcriptase or endonuclease revealed that SjR2 belonged to the RTE clade of non-long terminal repeat retrotransposons. Further, SjR2 was homologous, but probably not orthologous, to SR2 from the African blood fluke, Schistosoma mansoni; this RTE-like family of non-long terminal repeat retrotransposons appears to have arisen before the divergence of the extant schistosome species. Hybridisation analyses indicated that approximately 10,000 copies of SjR2 were dispersed throughout the S. japonicum chromosomes, accounting for up to 14% of the nuclear genome. Messenger RNAs encoding the reverse transcriptase and endonuclease domains of SjR2 were detected in several developmental stages of the schistosome, indicating that the retrotransposon was actively replicating within the genome of the parasite. Exploration of the coding and non-coding regions of SjR2 revealed two notable characteristics. First, the recombinant reverse transcriptase domain of SjR2 expressed in insect cells primed reverse transcription of SjR2 mRNA in vitro. By contrast, recombinant SjR2-endonuclease did not appear to cleave schistosome or plasmid DNA. Second, the 5'-untranslated region of SjR2 was >80% identical to the 3'-untranslated region of a schistosome heat shock protein-70 gene (hsp-70) in the antisense orientation, indicating that SjR2-like elements were probably inserted into the non-coding regions of ancestral S. japonicum HSP-70, probably after the species diverged from S. mansoni. (C) 2002 Australian Society for Parasitology Inc. Published by Elsevier Science Ltd. All rights reserved.
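The copy-number and genome-fraction figures quoted above can be related by simple arithmetic. The sketch below is a back-of-envelope check under two assumptions that the abstract does not state: that every copy is full length (about 3.9 kb) and that the "up to 14%" figure is taken at its upper bound. The implied genome size is therefore only a rough consistency estimate, not a value reported in the abstract.

```python
COPIES = 10_000          # approximate copy number from the abstract
ELEMENT_KB = 3.9         # approximate element length from the abstract
GENOME_FRACTION = 0.14   # "up to 14%", taken here at its upper bound


def total_element_dna_mb(copies, element_kb):
    """Total DNA contributed by the element, in megabases,
    assuming every copy is full length."""
    return copies * element_kb / 1000.0


def implied_genome_mb(total_mb, fraction):
    """Genome size implied if `total_mb` makes up `fraction` of the genome."""
    return total_mb / fraction


if __name__ == "__main__":
    total = total_element_dna_mb(COPIES, ELEMENT_KB)
    print(f"SjR2 DNA: about {total:.0f} Mb")
    print(f"Implied genome size: about {implied_genome_mb(total, GENOME_FRACTION):.0f} Mb")
```

Truncated copies, which are common for non-long terminal repeat retrotransposons, would lower the total and change the implied genome size accordingly.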
Abstract:
The small GTPases R-Ras and H-Ras are highly homologous proteins with contrasting biological properties; for example, they differentially modulate integrin affinity: H-Ras suppresses integrin activation in fibroblasts whereas R-Ras can reverse this effect of H-Ras. To gain insight into the sequences directing this divergent phenotype, we investigated a panel of H-Ras/R-Ras chimeras and found that sequences in the R-Ras hypervariable C-terminal region including amino acids 175-203 are required for the ability of R-Ras to increase integrin activation in CHO cells; however, the proline-rich site in this region, previously reported to bind the adaptor protein Nck, was not essential for this effect. In addition, we found that the GTPase TC21 behaved similarly to R-Ras. Because the C-termini of Ras proteins can control their subcellular localization, we compared the localization of H-Ras and R-Ras. In contrast to H-Ras, which migrates out of lipid rafts upon activation, we found that activated R-Ras remained localized to lipid rafts. However, functionally distinct H-Ras/R-Ras chimeras containing different C-terminal R-Ras segments localized to lipid rafts irrespective of their integrin phenotype. (C) 2003 Elsevier Inc. All rights reserved.
Abstract:
Extensive chromosome size polymorphism arises in Plasmodium berghei during in vivo mitotic multiplication. Size differences between homologous chromosomes mainly involve rearrangements in the subtelomeric regions while internal chromosomal regions are more conserved. Size differences are almost exclusively due to differences in the copy number of a 2.3 kb subtelomeric repeat unit. Not only does deletion of 2.3 kb repeats occur, but addition of new copies of this repeat sometimes results in the formation of enlarged chromosomes. Even chromosomes which originally lack 2.3 kb repeats can acquire these during mitotic multiplication. In one karyotype mutant, 2.3 kb repeats were inserted within one of the original telomeres of chromosome 4, creating an internal stretch of telomeric repeats. Chromosome translocation can contribute to chromosome size polymorphism as well. We found a karyotype mutant in which chromosome 7 with a size of about 1.4 Mb is translocated to chromosome 13/14 with a size of about 3 Mb, resulting in a rearranged chromosome, which was shown to contain a junction between internal DNA sequences of chromosome 13/14 and subtelomeric 2.3 kb repeats of chromosome 7. In this mutant a new chromosome of 1.4 Mb is present which consists of part of chromosome 13/14.
Abstract:
We demonstrate that RecA protein can mediate annealing of complementary DNA strands in vitro by at least two different mechanisms. The first annealing mechanism predominates under conditions where RecA protein causes coaggregation of single-stranded DNA (ssDNA) molecules and where RecA-free ssDNA stretches are present on both reaction partners. Under these conditions annealing can take place between locally concentrated protein-free complementary sequences. Other DNA aggregating agents like histone H1 or ethanol stimulate annealing by the same mechanism. The second mechanism of RecA-mediated annealing of complementary DNA strands is best manifested when preformed saturated RecA-ssDNA complexes interact with protein-free ssDNA. In this case, annealing can occur between the ssDNA strand resident in the complex and the ssDNA strand that interacts with the preformed RecA-ssDNA complex. Here, the action of RecA protein reflects its specific recombination promoting mechanism. This mechanism enables DNA molecules resident in the presynaptic RecA-DNA complexes to be exposed for hydrogen bond formation with DNA molecules contacting the presynaptic RecA-DNA filament.
Abstract:
The four dominant outer membrane proteins (46, 38, 33 and 28 kDa) were detected by sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE) in a semi-purified preparation of vesicle membranes of a Neisseria meningitidis (N44/89, B:4:P1.15:P5.5,7) strain isolated in Brazil. The N-terminal amino acid sequences for the 46 kDa and 28 kDa proteins matched those reported by others for class 1 and 5 proteins respectively, whereas the sequence (25 amino acids) for the 38 kDa (class 3) protein was similar to class 1 meningococcal proteins. The sequence for the 33 kDa (class 4) protein was unique and not homologous to any known protein.
Abstract:
During genetic recombination a heteroduplex joint is formed between two homologous DNA molecules. The heteroduplex joint plays an important role in recombination since it accommodates sequence heterogeneities (mismatches, insertions or deletions) that lead to genetic variation. Two Escherichia coli proteins, RuvA and RuvB, promote the formation of heteroduplex DNA by catalysing the branch migration of crossovers, or Holliday junctions, which link recombining chromosomes. We show that RuvA and RuvB can promote branch migration through 1800 bp of heterologous DNA, in a reaction facilitated by the presence of E. coli single-stranded DNA binding (SSB) protein. Reaction intermediates, containing unpaired heteroduplex regions bound by SSB, were directly visualized by electron microscopy. In the absence of SSB, or when SSB was replaced by a single-strand binding protein from bacteriophage T4 (gene 32 protein), only limited heterologous branch migration was observed. These results show that the RuvAB proteins, which are induced as part of the SOS response to DNA damage, allow genetic recombination and the recombinational repair of DNA to occur in the presence of extensive lengths of heterology.
Abstract:
Electron microscopic analysis of heteroduplexes between the most distantly related Xenopus vitellogenin genes (A genes X B genes) has revealed the distribution of homologous regions that have been preferentially conserved after the duplication events that gave rise to the multigene family in Xenopus laevis. DNA sequence analysis was limited to the region downstream of the transcription initiation site of the Xenopus genes A1, B1 and B2 and a comparison with the Xenopus A2 and the major chicken vitellogenin gene is presented. Within the coding regions of the first three exons, nucleotide substitutions resulting in amino acid changes accumulate at a rate similar to that observed in globin genes. This suggests that the duplication event which led to the formation of the A and B ancestral genes in Xenopus laevis occurred about 150 million years ago. Homologous exons of the A1-A2 and B1-B2 gene pairs, which formed about 30 million years ago, show a quite similar sequence divergence. In contrast, A1-A2 homologous introns seem to have evolved much faster than their B1-B2 counterparts.
Abstract:
One of the crucial steps in the authentication of aDNA sequences is phylogenetic consistency. Amplified sequences should fit into the phylogenetic framework of their supposed origin. An inherent property of aDNA sequences, however, is their short sequence length. Additionally, genes for aDNA studies are often chosen for their preservation potential rather than for their phylogenetically informative content. This poses potential challenges for their analysis, and might result in an inaccurate reflection of the supposed phylogenetic history of the sequence or organism under study. In this paper some fundamental problems of phylogenetic analysis and interpretation of aDNA datasets are discussed. Suggestions for character sampling and treatment of missing data are made. The publication is the result of a talk from the 1st PAMINSA Meeting in Rio de Janeiro, July 2005.