963 results for Multiple sequence alignment


Relevance:

80.00% 80.00%

Publisher:

Abstract:

The anaerobic transcriptional regulator ANR induces the arginine deiminase and denitrification pathways in Pseudomonas aeruginosa during oxygen limitation. The homologous activator FNR of Escherichia coli, when introduced into an anr mutant of P. aeruginosa, could functionally replace ANR for anaerobic growth on nitrate but not for anaerobic induction of arginine deiminase. In an FNR-positive E. coli strain, the ANR-dependent promoter of the arcDABC operon, which encodes the enzymes of the arginine deiminase pathway, was not expressed. To analyse these distinct induction patterns systematically, a lacZ promoter-probe, broad-host-range plasmid containing various -40 regions (the ANR/FNR recognition sequences) and -10 promoter sequences was constructed. These constructs were tested in P. aeruginosa and in E. coli expressing either ANR or FNR. In conjunction with the consensus -10 hexamer of E. coli sigma 70 RNA polymerase (TATAAT), the consensus FNR site (TTGAT ..... ATCAA) was recognized efficiently by ANR and FNR in both hosts. By contrast, when promoters contained the Arc box (TTGAC .... ATCAG), which is found in the arcDABC promoter, or a symmetrical mutant FNR site (CTGAT .... ATCAG), ANR was a more effective activator than was FNR. Conversely, an extended 22 bp, fully symmetrical FNR site allowed better activation with FNR than with ANR. Combination of the arc promoter -10 sequence (CCTAAT) with the Arc box or the consensus FNR site resulted in good ANR-dependent expression in P. aeruginosa but gave practically no expression in E. coli, suggesting that RNA polymerase of P. aeruginosa differs from the E. coli enzyme in -10 recognition specificity. In conclusion, ANR and FNR are able to activate the RNA polymerases of P. aeruginosa and E. coli when the -40 and -10 promoter elements are identical or close to the E. coli consensus sequences.
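
The half-site notation used in this abstract (for example TTGAC .... ATCAG for the Arc box) can be turned into a simple motif scan. The following is only an illustrative sketch: the function name `find_sites`, the promoter fragment, and the default spacer length of four nucleotides (read off the four-dot notation) are assumptions, not details taken from the study.

```python
import re

def find_sites(seq, left, right, spacer=4):
    """Return 0-based start positions of left + N{spacer} + right in seq.

    The spacer length is an assumption; adjust it to the site being modelled.
    A lookahead is used so that overlapping sites are also reported.
    """
    pattern = re.compile(f"(?=({left}[ACGT]{{{spacer}}}{right}))")
    return [m.start() for m in pattern.finditer(seq.upper())]

# Hypothetical promoter fragment, invented for illustration only.
promoter = "GGCATTGACGCGCATCAGTTTCCTAATGC"

arc_hits = find_sites(promoter, "TTGAC", "ATCAG")  # Arc box half-sites
fnr_hits = find_sites(promoter, "TTGAT", "ATCAA")  # consensus FNR half-sites
print(arc_hits, fnr_hits)
```

A scan like this only reports exact matches to the two half-sites; it says nothing about activation strength, which the abstract shows depends on the regulator and the host RNA polymerase.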

Relevance:

80.00% 80.00%

Publisher:

Abstract:

Nonstructural protein 4B (NS4B) is a key organizer of hepatitis C virus (HCV) replication complex formation. In concert with other nonstructural proteins, it induces a specific membrane rearrangement, designated as membranous web, which serves as a scaffold for the HCV replicase. The N-terminal part of NS4B comprises a predicted and a structurally resolved amphipathic α-helix, designated as AH1 and AH2, respectively. Here, we report a detailed structure-function analysis of NS4B AH1. Circular dichroism and nuclear magnetic resonance structural analyses revealed that AH1 folds into an amphipathic α-helix extending from NS4B amino acid 4 to 32, with positively charged residues flanking the helix. These residues are conserved among hepaciviruses. Mutagenesis and selection of pseudorevertants revealed an important role of these residues in RNA replication by affecting the biogenesis of double-membrane vesicles making up the membranous web. Moreover, alanine substitution of conserved acidic residues on the hydrophilic side of the helix reduced infectivity without significantly affecting RNA replication, indicating that AH1 is also involved in virus production. Selective membrane permeabilization and immunofluorescence microscopy analyses of a functional replicon harboring an epitope tag between NS4B AH1 and AH2 revealed a dual membrane topology of the N-terminal part of NS4B during HCV RNA replication. Luminal translocation was unaffected by the mutations introduced into AH1, but was abrogated by mutations introduced into AH2. In conclusion, our study reports the three-dimensional structure of AH1 from HCV NS4B, and highlights the importance of positively charged amino acid residues flanking this amphipathic α-helix in membranous web formation and RNA replication. In addition, we demonstrate that AH1 possesses a dual role in RNA replication and virus production, potentially governed by different topologies of the N-terminal part of NS4B.

Relevance:

80.00% 80.00%

Publisher:

Abstract:

The bacterial siderophore pyochelin is composed of salicylate and two cysteine-derived heterocycles, the second of which is modified by reduction and N-methylation during biosynthesis. In Pseudomonas aeruginosa, the first cysteine residue is converted to its D-isoform during thiazoline ring formation, whereas the second cysteine remains in its L-configuration. Stereochemistry is opposite in the Pseudomonas fluorescens siderophore enantio-pyochelin, in which the first ring originates from L-cysteine and the second ring from D-cysteine. Both siderophores promote growth of the producer organism during iron limitation and induce the expression of their biosynthesis genes by activating the transcriptional AraC-type regulator PchR. However, neither siderophore is functional as an iron carrier or as a transcriptional inducer in the other species, demonstrating that both processes are highly stereospecific. Stereospecificity of pyochelin/enantio-pyochelin-mediated iron uptake is ensured at two levels: (i) by the outer membrane siderophore receptors and (ii) by the cytosolic PchR regulators.

Relevance:

80.00% 80.00%

Publisher:

Abstract:

We describe the odorant binding proteins (OBPs) of the red imported fire ant, Solenopsis invicta, obtained from analyses of an EST library and separate 454 sequencing runs of two normalized cDNA libraries. We identified a total of 18 putative functional OBPs in this ant. A third of the fire ant OBPs are orthologs of honey bee OBPs. Another third of the OBPs belong to a lineage-specific expansion, which is a common feature of insect OBP evolution. Like other OBPs, the different fire ant OBPs share little sequence similarity (~20%), rendering evolutionary analyses difficult. We discuss the resulting problems with sequence alignment, phylogenetic analysis, and tests of selection. As previously suggested, our results underscore the importance of carefully exploring how sensitive the results are to the choice of alignment method for data comprising widely divergent sequences.
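
A similarity figure such as the roughly 20% quoted above is typically computed from aligned sequence pairs. The sketch below shows one common convention, percent identity over columns where neither sequence has a gap. The function name and the two toy aligned sequences are hypothetical, and other denominators (alignment length, shorter sequence length) are also in use, so values from different tools may not be directly comparable.

```python
def percent_identity(a, b, gap="-"):
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns in which both sequences carry a residue.
    pairs = [(x, y) for x, y in zip(a, b) if x != gap and y != gap]
    if not pairs:
        return 0.0
    matches = sum(x == y for x, y in pairs)
    return 100.0 * matches / len(pairs)

# Toy aligned fragments, invented for illustration only.
seq1 = "MK-LVAAT"
seq2 = "MKQLI-GT"
print(round(percent_identity(seq1, seq2), 1))
```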

Relevance:

80.00% 80.00%

Publisher:

Abstract:

Embryonic development in nonmammalian vertebrates depends entirely on nutritional reserves that are predominantly derived from vitellogenin proteins and stored in egg yolk. Mammals have evolved new resources, such as lactation and placentation, to nourish their developing and early offspring. However, the evolutionary timing and molecular events associated with this major phenotypic transition are not known. By means of sensitive comparative genomics analyses and evolutionary simulations, we here show that the three ancestral vitellogenin-encoding genes were progressively lost during mammalian evolution (until around 30-70 million years ago, Mya) in all but the egg-laying monotremes, which have retained a functional vitellogenin gene. Our analyses also provide evidence that the major milk resource genes, caseins, which have functional properties similar to those of vitellogenins, appeared in the common mammalian ancestor approximately 200-310 Mya. Together, our data are compatible with the hypothesis that the emergence of lactation in the common mammalian ancestor and the development of placentation in eutherian and marsupial mammals allowed for the gradual loss of yolk-dependent nourishment during mammalian evolution.

Relevance:

80.00% 80.00%

Publisher:

Abstract:

A variety of cellular proteins have the ability to recognize DNA lesions induced by the anti-cancer drug cisplatin, with diverse consequences for their repair and for the therapeutic effectiveness of this drug. We report a novel gene involved in the cell response to cisplatin in vertebrates. The RDM1 gene (for RAD52 Motif 1) was identified while searching databases for sequences showing similarities to RAD52, a protein involved in homologous recombination and DNA double-strand break repair. Ablation of RDM1 in the chicken B cell line DT40 led to a more than 3-fold increase in sensitivity to cisplatin. However, RDM1-/- cells were not hypersensitive to DNA damage caused by ionizing radiation, UV irradiation, or the alkylating agent methyl methanesulfonate. The RDM1 protein displays a nucleic acid binding domain of the RNA recognition motif (RRM) type. By using gel-shift assays and electron microscopy, we show that purified, recombinant chicken RDM1 protein interacts with single-stranded DNA as well as double-stranded DNA, on which it assembles filament-like structures. Notably, RDM1 recognizes DNA distortions induced by cisplatin-DNA adducts in vitro. Finally, human RDM1 transcripts are abundant in the testis, suggesting a possible role during spermatogenesis.

Relevance:

80.00% 80.00%

Publisher:

Abstract:

Phylogenetic trees representing the evolutionary relationships of homologous genes are the entry point for many evolutionary analyses. For instance, the use of a phylogenetic tree can aid in the inference of orthology and paralogy relationships, and in the detection of relevant evolutionary events such as gene family expansions and contractions, horizontal gene transfer, recombination or incomplete lineage sorting. Similarly, given the plurality of evolutionary histories among genes encoded in a given genome, there is a need for the combined analysis of genome-wide collections of phylogenetic trees (phylomes). Here, we introduce a new release of PhylomeDB (http://phylomedb.org), a public repository of phylomes. Currently, PhylomeDB hosts 120 public phylomes, comprising >1.5 million maximum likelihood trees and multiple sequence alignments. In the current release, phylogenetic trees are annotated with taxonomic, protein-domain arrangement, functional and evolutionary information. PhylomeDB is also a major source for phylogeny-based predictions of orthology and paralogy, covering >10 million proteins across 1059 sequenced species. Here we describe newly implemented PhylomeDB features, and discuss a benchmark of the orthology predictions provided by the database, the impact of proteome updates and the use of the phylome approach in the analysis of newly sequenced genomes and transcriptomes.

Relevance:

80.00% 80.00%

Publisher:

Abstract:

With the increasing availability of various 'omics data, high-quality orthology assignment is crucial for evolutionary and functional genomics studies. We here present the fourth version of the eggNOG database (available at http://eggnog.embl.de) that derives nonsupervised orthologous groups (NOGs) from complete genomes, and then applies a comprehensive characterization and analysis pipeline to the resulting gene families. Compared with the previous version, we have more than tripled the underlying species set to cover 3686 organisms, keeping track with genome project completions while prioritizing the inclusion of high-quality genomes to minimize error propagation from incomplete proteome sets. Major technological advances include (i) a robust and scalable procedure for the identification and inclusion of high-quality genomes, (ii) provision of orthologous groups for 107 different taxonomic levels compared with 41 in eggNOGv3, (iii) identification and annotation of particularly closely related orthologous groups, facilitating analysis of related gene families, (iv) improvements of the clustering and functional annotation approach, (v) adoption of a revised tree building procedure based on the multiple alignments generated during the process and (vi) implementation of quality control procedures throughout the entire pipeline. As in previous versions, eggNOGv4 provides multiple sequence alignments and maximum-likelihood trees, as well as broad functional annotation. Users can access the complete database of orthologous groups via a web interface, as well as through bulk download.

Relevance:

80.00% 80.00%

Publisher:

Abstract:

The death-inducing receptor Fas is activated when cross-linked by the type II membrane protein Fas ligand (FasL). When human soluble FasL (sFasL, containing the extracellular portion) was expressed in human embryo kidney 293 cells, the three N-linked glycans of each FasL monomer were found to be essential for efficient secretion. Based on the structure of the closely related lymphotoxin alpha-tumor necrosis factor receptor I complex, a molecular model of the FasL homotrimer bound to three Fas molecules was generated using knowledge-based protein modeling methods. Point mutations of amino acid residues predicted to affect the receptor-ligand interaction were introduced at three sites. The F275L mutant, mimicking the loss of function murine gld mutation, exhibited a high propensity for aggregation and was unable to bind to Fas. Mutants P206R, P206D, and P206F displayed reduced cytotoxicity toward Fas-positive cells with a concomitant decrease in the binding affinity for the recombinant Fas-immunoglobulin Fc fusion proteins. Although the cytotoxic activity of mutant Y218D was unaltered, mutant Y218R was inactive, correlating with the prediction that Tyr-218 of FasL interacts with a cluster of three basic amino acid side chains of Fas. Interestingly, mutant Y218F could induce apoptosis in murine, but not human cells.

Relevance:

80.00% 80.00%

Publisher:

Abstract:

Mitochondrial DNA (mtDNA), a maternally inherited 16.6-kb molecule crucial for energy production, is implicated in numerous human traits and disorders. It has been hypothesized that mutations in the mtDNA may contribute to the complex genetic basis of schizophrenia, given the evidence of maternal inheritance and the presence of schizophrenia symptoms in patients affected by a mitochondrial disorder related to an mtDNA mutation. The present project aims to study the association between mtDNA variants and an increased risk of schizophrenia in a cohort of patients and controls from the same population. The entire mtDNA of 55 schizophrenia patients with an apparent maternal transmission of the disease and 38 controls was sequenced by Next Generation Sequencing (Ion Torrent PGM, Life Technologies) and compared to the reference sequence. The current method for establishing mtDNA haplotypes is Sanger sequencing, which is laborious, time-consuming, and expensive. With the emergence of Next Generation Sequencing technologies, this sequencing process can be performed much more quickly and cost-efficiently. We have identified 14 variants that have not been previously reported. Two of them were missense variants, MTATP6 p.V113M and MTND5 p.F334L; there were also three variants in rRNA genes and one in a tRNA gene. No significant differences were found in the number of variants between the two groups. We found that the sequence alignment algorithm employed to align NGS reads played a significant role in the analysis of the data and the resulting mtDNA haplotypes. Further development of the bioinformatics analysis and annotation step would be desirable to facilitate the application of NGS in mtDNA analysis.

Relevance:

80.00% 80.00%

Publisher:

Abstract:

It has long been known that amino acids are the building blocks for proteins and govern their folding into specific three-dimensional structures. However, the details of this process are still unknown and represent one of the main problems in structural bioinformatics, which is a highly active research area with the focus on the prediction of three-dimensional structure and its relationship to protein function. The protein structure prediction procedure encompasses several different steps from searches and analyses of sequences and structures, through sequence alignment to the creation of the structural model. Careful evaluation and analysis ultimately results in a hypothetical structure, which can be used to study biological phenomena in, for example, research at the molecular level, biotechnology and especially in drug discovery and development. In this thesis, the structures of five proteins were modeled with template-based methods, which use proteins with known structures (templates) to model related or structurally similar proteins. The resulting models were an important asset for the interpretation and explanation of biological phenomena, such as amino acids and interaction networks that are essential for the function and/or ligand specificity of the studied proteins. The five proteins represent different case studies with their own challenges, such as varying template availability, which resulted in a different structure prediction process for each. This thesis presents the techniques and considerations that should be taken into account in the modeling procedure to overcome limitations and produce a hypothetical and reliable three-dimensional structure.
As each project shows, the reliability is highly dependent on the extensive incorporation of experimental data or known literature and, although experimental verification of in silico results is always desirable to increase the reliability, the presented projects show that also the experimental studies can greatly benefit from structural models. With the help of in silico studies, the experiments can be targeted and precisely designed, thereby saving both money and time. As the programs used in structural bioinformatics are constantly improved and the range of templates increases through structural genomics efforts, the mutual benefits between in silico and experimental studies become even more prominent. Hence, reliable models for protein three-dimensional structures achieved through careful planning and thoughtful executions are, and will continue to be, valuable and indispensable sources for structural information to be combined with functional data.

Relevance:

80.00% 80.00%

Publisher:

Abstract:

Scientists have been debating the origin of life on earth for decades. A number of hypotheses have been proposed as to which emerged first, RNA or DNA, with most scientists in favour of the "RNA World" hypothesis. Assuming RNA emerged first, it follows that RNA polymerases would have appeared before DNA polymerases. Using recombinant DNA technology and bioinformatics, we undertook this study to explore the relationship between RNA polymerases, reverse transcriptases and DNA polymerases. The working hypothesis is that DNA polymerases evolved from reverse transcriptases and the latter evolved from RNA polymerases. If this hypothesis is correct, then one would expect to find various ancient DNA polymerases with varying levels of reverse transcriptase activity. In the first phase of this research project, multiple sequence alignments were made of the protein sequences of 32 prokaryotic DNA-directed DNA polymerases originating from 11 prokaryotic families against 3 viral reverse transcriptases. The data from these alignments were not very conclusive. DNA polymerases with higher levels of reverse transcriptase activity were not confined to ancient organisms, as one would have expected. The second phase of this project focused on conditions that may alter DNA polymerase activity. Various reaction conditions, such as temperature and different ions (Ni2+, Mn2+, Mg2+), were tested. Interestingly, it was found that the DNA polymerase from Thermus aquaticus can be made to copy RNA into DNA (i.e. reverse transcriptase activity). Thus it was shown that under appropriate conditions (ions and reaction temperatures) reverse transcriptase activity can be induced in DNA polymerase. In the third phase of this study, recombinant DNA technology was used to generate a chimeric DNA polymerase in an attempt to identify the region(s) of the polymerase responsible for RNA-directed DNA polymerase activity.
The two DNA polymerases employed were those of Thermus aquaticus and Thermus thermophilus. As in the second phase, various reaction conditions were investigated. The data indicated that the newly engineered chimeric DNA polymerase can be induced to copy RNA into DNA. Thus the intrinsic reverse transcriptase activity found in ancient DNA polymerases was localized to a domain and can be induced via appropriate reaction conditions.

Relevance:

80.00% 80.00%

Publisher:

Abstract:

First, we modeled the structure of an RNA family with a graph grammar in order to identify the sequences that belong to it. Several other modeling methods have been developed, such as stochastic context-free grammars, covariance models, secondary structure profiles and constraint networks. These modeling methods are based on classical secondary structure, whereas our graph grammars are based on nucleotide cyclic motifs. To exemplify our model, we used the ribosomal loop E, which contains the Sarcin-Ricin motif, a motif that has been widely studied since its discovery by X-ray crystallography in the early 1990s. We built a graph grammar for the structure of the Sarcin-Ricin motif and derived all the sequences that can fold into it. The biological relevance of these sequences was confirmed by comparison with the sequences of an alignment of more than 800 bacterial ribosomal sequences. This comparison raised alternative alignments for some of the sequences, which we supported with secondary and tertiary structure predictions. Nucleotide cyclic motifs were observed by members of our laboratory in RNA whose tertiary structure has been solved experimentally. A study of the sequences and tertiary structures of each cycle making up the Sarcin-Ricin structure revealed that the sequence space depends heavily on the interactions among all nucleotides that are close in three-dimensional space, that is, not only between two adjacent base pairs. The number of sequences generated by the graph grammar is smaller than the numbers generated by methods based on classical secondary structure.
This suggests the importance of context for the relationship between sequence and structure, hence the use of a contextual graph grammar that is more expressive than context-free grammars. The graph grammars we developed take only the tertiary structure into account and neglect the interactions of specific chemical groups with extra-molecular elements, such as other macromolecules or ligands. Second, to take these interactions into account, we developed a model that considers the position of chemical groups on the surface of tertiary structures. The hypothesis is that chemical groups at positions that are conserved in predetermined active sequences, and displaced in sequences that are inactive for a specific function, are more likely to be involved in interactions with factors. Continuing with the loop E example, we searched for the groups of this loop that could be involved in interactions with elongation factors. Once the groups are identified, three-dimensional modeling can be used to predict the sequences that correctly position these groups in their tertiary structures. A few models exist to address this problem, such as molecular descriptors, nucleotide adjacency matrices and thermodynamics-based models. However, all these models use an overly simplified representation of RNA structure, which limits their applicability. We applied our model to the tertiary structures of a set of variants of a sequence of one instance of the Sarcin-Ricin motif from a bacterial ribosome. Wool's team at the University of Chicago has already studied this instance experimentally by testing the viability of 12 variants. They identified 4 viable and 8 lethal variants.
We used this set of 12 sequences to train our model and determined a set of properties essential to their biological function. For each variant in the training set, we built tertiary structure models. We then measured the partial charges of the atoms exposed on the surface and encoded this information in vectors. We used principal component analysis to transform the vectors into a set of uncorrelated variables, called the principal components. Using weighted Euclidean distance and the nearest-neighbor algorithm, we applied leave-one-out cross-validation to choose the best parameters for predicting the activity of a new sequence by matching it to these principal components. Finally, we confirmed the predictive power of the model using a new set of 8 variants whose viability was verified experimentally in our laboratory. In conclusion, graph grammars make it possible to model the relationship between sequence and structure for an RNA structural element, such as the loop E containing the Sarcin-Ricin motif of the ribosome. Applications range from correcting and assisting sequence alignment to designing sequences with a predetermined structure. We also developed a model to account for specific interactions tied to a given biological function, namely those with surrounding factors. Our model is based on the conservation of the exposure of the chemical groups involved in these interactions. This model allowed us to predict the biological activity of a set of variants of the ribosomal loop E, which binds elongation factors.
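
The evaluation scheme named in this abstract (nearest-neighbor prediction with a weighted Euclidean distance, scored by leave-one-out cross-validation) can be sketched in a few lines. This is not the authors' implementation: the function names, the candidate weight vectors and the toy two-component vectors are all invented for illustration, and the principal component projection is assumed to have been done beforehand.

```python
import math

def weighted_distance(u, v, weights):
    """Weighted Euclidean distance between two component vectors."""
    return math.sqrt(sum(w * (a - b) ** 2 for a, b, w in zip(u, v, weights)))

def nearest_label(query, points, labels, weights):
    """Label of the training point closest to the query."""
    best = min(range(len(points)),
               key=lambda i: weighted_distance(query, points[i], weights))
    return labels[best]

def loo_accuracy(points, labels, weights):
    """Leave-one-out accuracy: predict each point from all the others."""
    correct = 0
    for i in range(len(points)):
        rest_points = points[:i] + points[i + 1:]
        rest_labels = labels[:i] + labels[i + 1:]
        if nearest_label(points[i], rest_points, rest_labels, weights) == labels[i]:
            correct += 1
    return correct / len(points)

# Toy vectors standing in for already-projected principal components.
points = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.3],
          [2.0, 2.1], [2.2, 1.9], [1.9, 2.3]]
labels = ["viable"] * 3 + ["lethal"] * 3

# Hypothetical parameter search: keep the weights with the best LOO accuracy.
candidates = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
best_weights = max(candidates, key=lambda w: loo_accuracy(points, labels, w))
print(best_weights, loo_accuracy(points, labels, best_weights))
```

With only a dozen training sequences, as in the abstract, leave-one-out is a natural choice because it uses every sample for validation exactly once while training on all the others.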

Relevance:

80.00% 80.00%

Publisher:

Abstract:

The rate at which a given site in a gene sequence alignment evolves over time may vary. This phenomenon, known as heterotachy, can bias or distort phylogenetic trees inferred from models of sequence evolution that assume rates of evolution are constant. Here, we describe a phylogenetic mixture model designed to accommodate heterotachy. The method sums the likelihood of the data at each site over more than one set of branch lengths on the same tree topology. A branch-length set that is best for one site may differ from the branch-length set that is best for some other site, thereby allowing different sites to have different rates of change throughout the tree. Because rate variation may not be present in all branches, we use a reversible-jump Markov chain Monte Carlo algorithm to identify those branches in which reliable amounts of heterotachy occur. We implement the method in combination with our 'pattern-heterogeneity' mixture model, applying it to simulated data and five published datasets. We find that complex evolutionary signals of heterotachy are routinely present over and above variation in the rate or pattern of evolution across sites, that the reversible-jump method requires far fewer parameters than conventional mixture models to describe them, and that it serves to identify the regions of the tree in which heterotachy is most pronounced. The reversible-jump procedure also removes the need for a posteriori tests of 'significance' such as the Akaike or Bayesian information criterion tests, or Bayes factors. Heterotachy has important consequences for the correct reconstruction of phylogenies as well as for tests of hypotheses that rely on accurate branch-length information. These include molecular clocks, analyses of tempo and mode of evolution, comparative studies and ancestral state reconstruction. The model is available from the authors' website, and can be used for the analysis of both nucleotide and morphological data.
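
The core idea of summing each site's likelihood over several branch-length sets can be illustrated with a small numerical sketch. It assumes the per-site likelihoods under each branch-length set have already been computed elsewhere, and it assumes mixture weights that sum to one; the weights, the function name and the numbers are illustrative and are not taken from the authors' software.

```python
import math

def mixture_log_likelihood(site_likelihoods, weights):
    """Log-likelihood of an alignment under a branch-length mixture.

    site_likelihoods[j][i] is the likelihood of site i under branch-length
    set j on a fixed tree topology (assumed precomputed). weights[j] is the
    assumed mixture weight of set j.
    """
    n_sites = len(site_likelihoods[0])
    total = 0.0
    for i in range(n_sites):
        # Sum over branch-length sets for this site, then take the log.
        site_l = sum(w * comp[i] for w, comp in zip(weights, site_likelihoods))
        total += math.log(site_l)
    return total

# Toy values: site 0 fits set 0 better, site 1 fits set 1 better.
site_likelihoods = [[0.02, 0.001],
                    [0.004, 0.03]]
weights = [0.5, 0.5]

mixed = mixture_log_likelihood(site_likelihoods, weights)
single = mixture_log_likelihood(site_likelihoods[:1], [1.0])
print(mixed, single)
```

In this toy case the mixture scores higher than the single branch-length set because each site can draw most of its likelihood from the set that suits it, which is the behaviour the abstract describes.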