905 results for upstream activator sequence
Abstract:
Genomic plasticity of the human chromosome 8p23.1 region is highly influenced by two groups of complex segmental duplications (SDs), termed REPD and REPP, that mediate different kinds of rearrangements. Part of the difficulty in explaining the wide range of phenotypes associated with 8p23.1 rearrangements is that REPP and REPD are not yet well characterized, probably due to their polymorphic status. Here, we describe a novel primate-specific gene family, named FAM90A (family with sequence similarity 90), found within these SDs. According to the current human reference sequence assembly, the FAM90A family includes 24 members along the 8p23.1 region plus a single member on chromosome 12p13.31, showing copy number variation (CNV) between individuals. These genes can be classified into subfamilies I and II, which differ in their upstream and 5′-untranslated region sequences, but both share the same open reading frame and are ubiquitously expressed. Sequence analysis and comparative fluorescence in situ hybridization studies showed that FAM90A subfamily II underwent a large expansion in the hominoid lineage, whereas subfamily I members were likely generated sometime around the divergence of orangutan and African great apes by a fusion process. In addition, the analysis of the Ka/Ks ratios provides evidence of functional constraint of some FAM90A genes in all species. The characterization of the FAM90A gene family contributes to a better understanding of the structural polymorphism of the human 8p23.1 region and constitutes a good example of how SDs, CNVs and rearrangements within themselves can promote the formation of new gene sequences with potential functional consequences.
Abstract:
The construction of metagenomic libraries has permitted the study of microorganisms resistant to isolation, and the analysis of 16S rDNA sequences has been used for over two decades to examine bacterial biodiversity. Here, we show that the analysis of random sequence reads (RSRs) instead of 16S is a suitable shortcut to estimate the biodiversity of a bacterial community from metagenomic libraries. We generated 10,010 RSRs from a metagenomic library of microorganisms found in human faecal samples. We then searched them with the program BLASTN against a prokaryotic sequence database to assign a taxon to each RSR. The results were compared with those obtained by screening and analysing the clones containing 16S rDNA sequences in the whole library. We found that the biodiversity observed by RSR analysis is consistent with that obtained by 16S rDNA. We also show that RSRs are suitable to compare the biodiversity between different metagenomic libraries. RSRs can thus provide a good estimate of the biodiversity of a metagenomic library and, as an alternative to 16S, this approach is both faster and cheaper.
Abstract:
A number of experimental methods have been reported for estimating the number of genes in a genome, or the closely related coding density of a genome, defined as the fraction of base pairs in codons. Recently, DNA sequence data representative of the genome as a whole have become available for several organisms, making the problem of estimating coding density amenable to sequence analytic methods. Estimates of coding density for a single genome vary widely, so that methods with characterized error bounds have become increasingly desirable. We present a method to estimate the protein coding density in a corpus of DNA sequence data, in which a ‘coding statistic’ is calculated for a large number of windows of the sequence under study, and the distribution of the statistic is decomposed into two normal distributions, assumed to be the distributions of the coding statistic in the coding and noncoding fractions of the sequence windows. The accuracy of the method is evaluated using known data, and the method is applied to the yeast chromosome III sequence and to C. elegans cosmid sequences. It can also be applied to fragmentary data, for example a collection of short sequences determined in the course of STS mapping.
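The decomposition step described above can be sketched as an expectation-maximization (EM) fit of a two-component normal mixture. This is a minimal illustration, not the authors' implementation: the abstract does not say how the decomposition was carried out, so the use of EM, the function names, the simulated window scores, and the convention that coding windows have the higher mean statistic are all assumptions.

```python
import math
import random


def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def fit_two_normals(values, iterations=100):
    """Fit a two-component normal mixture by EM (an assumed technique).

    Returns (weights, means, sds), each a two-element list with the
    lower-mean component first.
    """
    n = len(values)
    overall_mean = sum(values) / n
    overall_sd = math.sqrt(sum((v - overall_mean) ** 2 for v in values) / n)
    # Crude starting point: one component at each extreme of the data.
    weights = [0.5, 0.5]
    means = [min(values), max(values)]
    sds = [overall_sd, overall_sd]
    for _ in range(iterations):
        # E-step: probability that each window belongs to component 1.
        resp = []
        for v in values:
            p0 = weights[0] * normal_pdf(v, means[0], sds[0])
            p1 = weights[1] * normal_pdf(v, means[1], sds[1])
            resp.append(p1 / (p0 + p1))
        # M-step: re-estimate weight, mean and sd of each component.
        for k in (0, 1):
            r = resp if k == 1 else [1.0 - x for x in resp]
            total = sum(r)
            weights[k] = total / n
            means[k] = sum(ri * v for ri, v in zip(r, values)) / total
            var = sum(ri * (v - means[k]) ** 2 for ri, v in zip(r, values)) / total
            sds[k] = max(math.sqrt(var), 1e-6)  # guard against collapse
    order = sorted((0, 1), key=lambda k: means[k])
    return ([weights[k] for k in order],
            [means[k] for k in order],
            [sds[k] for k in order])


def simulate_window_scores(n_noncoding=1400, n_coding=600, seed=0):
    """Hypothetical coding-statistic values: noncoding ~ N(0, 1), coding ~ N(5, 1)."""
    rng = random.Random(seed)
    scores = [rng.gauss(0.0, 1.0) for _ in range(n_noncoding)]
    scores += [rng.gauss(5.0, 1.0) for _ in range(n_coding)]
    return scores


weights, means, sds = fit_two_normals(simulate_window_scores())
coding_density = weights[1]  # weight of the higher-mean component
print(f"estimated coding density: {coding_density:.2f}")
```

In this simulation 30% of the windows are drawn from the "coding" distribution, so the fitted weight of the higher-mean component should land close to 0.3. With real data the two components overlap more, which is why characterized error bounds matter.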
Abstract:
Background: Single nucleotide polymorphisms (SNPs) are the most frequent type of sequence variation between individuals, and represent a promising tool for finding genetic determinants of complex diseases and understanding the differences in drug response. In this regard, it is of particular interest to study the effect of non-synonymous SNPs in the context of biological networks such as cell signalling pathways. UniProt provides curated information about the functional and phenotypic effects of sequence variation, including SNPs, as well as on mutations of protein sequences. However, no strategy has been developed to integrate this information with biological networks, with the ultimate goal of studying the impact of the functional effect of SNPs on the structure and dynamics of biological networks. Results: First, we identified the different challenges posed by the integration of the phenotypic effect of sequence variants and mutations with biological networks. Second, we developed a strategy for the combination of data extracted from public resources, such as UniProt, NCBI dbSNP, Reactome and BioModels. We generated attribute files containing phenotypic and genotypic annotations for the nodes of biological networks, which can be imported into network visualization tools such as Cytoscape. These resources allow the mapping and visualization of mutations and natural variations of human proteins and their phenotypic effect on biological networks (e.g. signalling pathways, protein-protein interaction networks, dynamic models). Finally, an example of the use of the sequence variation data in the dynamics of a network model is presented. Conclusion: In this paper we present a general strategy for the integration of pathway and sequence variation data for visualization, analysis and modelling purposes, including the study of the functional impact of protein sequence variations on the dynamics of signalling pathways.
This is of particular interest when the SNP or mutation is known to be associated with disease. We expect that this approach will help in the study of the functional impact of disease-associated SNPs on the behaviour of cell signalling pathways, which ultimately will lead to a better understanding of the mechanisms underlying complex diseases.
Abstract:
Background: A number of studies have used protein interaction data alone for protein function prediction. Here, we introduce a computational approach for annotation of enzymes, based on the observation that similar protein sequences are more likely to perform the same function if they share similar interacting partners. Results: The method has been tested against the PSI-BLAST program using a set of 3,890 protein sequences for which interaction data were available. For protein sequences that align with at least 40% sequence identity to a known enzyme, the specificity of our method in predicting the first three EC digits increased from 80% to 90% at 80% coverage when compared to PSI-BLAST. Conclusion: Our method can also be used for proteins for which homologous sequences with known interacting partners can be detected. Thus, our method could increase by 10% the specificity of genome-wide enzyme predictions based on sequence matching by PSI-BLAST alone.
Abstract:
Background: Single Nucleotide Polymorphisms, among other types of sequence variants, constitute key elements in genetic epidemiology and pharmacogenomics. While sequence data about genetic variation is found in databases such as dbSNP, clues about the functional and phenotypic consequences of the variations are generally found in biomedical literature. The identification of the relevant documents and the extraction of the information from them are hampered by the large size of literature databases and the lack of widely accepted standard notation for biomedical entities. Thus, automatic systems for the identification of citations of allelic variants of genes in biomedical texts are required. Results: Our group has previously reported the development of OSIRIS, a system aimed at the retrieval of literature about allelic variants of genes (http://ibi.imim.es/osirisform.html). Here we describe the development of a new version of OSIRIS (OSIRISv1.2, http://ibi.imim.es/OSIRISv1.2.html), which incorporates a new entity recognition module and is built on top of a local mirror of the MEDLINE collection and HgenetInfoDB, a database that collects data on human gene sequence variations. The new entity recognition module is based on a pattern-based search algorithm for the identification of variation terms in the texts and their mapping to dbSNP identifiers. The performance of OSIRISv1.2 was evaluated on a manually annotated corpus, resulting in 99% precision, 82% recall, and an F-score of 0.89. As an example, the application of the system for collecting literature citations for the allelic variants of genes related to the diseases intracranial aneurysm and breast cancer is presented. Conclusion: OSIRISv1.2 can be used to link literature references to dbSNP database entries with high accuracy, and therefore is suitable for collecting current knowledge on gene sequence variations and supporting the functional annotation of variation databases.
The application of OSIRISv1.2 in combination with controlled vocabularies like MeSH provides a way to identify associations of biomedical interest, such as those that relate SNPs with diseases.
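For reference, the F-score is conventionally the harmonic mean of precision and recall. The abstract does not state which variant was used, so the balanced (F1) form below is an assumption, and the function name is illustrative. Applied to the rounded precision and recall quoted above, it gives about 0.897, which is consistent with the reported 0.89 if the published precision and recall are themselves rounded.

```python
def f_score(precision, recall, beta=1.0):
    """Weighted harmonic mean of precision and recall (F1 when beta == 1)."""
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Rounded figures quoted in the abstract: 99% precision, 82% recall.
f1 = f_score(0.99, 0.82)
print(round(f1, 3))
```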
Abstract:
ABSTRACT Upregulation of the Major Facilitator transporter gene MDR1 (Multidrug Resistance 1) is one of the mechanisms observed in Candida albicans clinical isolates developing resistance to azole antifungal agents. To better understand this phenomenon, the cis-acting regulatory elements present in a modulatable reporter system under the control of the MDR1 promoter were characterized. In an azole-susceptible strain, transcription of this reporter is transiently upregulated in response to either benomyl or H2O2, whereas its expression is constitutively high in an azole-resistant strain (FR2). Two cis-acting regulatory elements that are necessary and sufficient to convey the same transcriptional responses to a heterologous promoter (CDR2) were identified within the MDR1 promoter. The first element, called BRE (for Benomyl Response Element, -296 to -260 with respect to the ATG start codon), is required for benomyl-dependent MDR1 upregulation and for constitutive high expression of MDR1 in FR2. The second element, termed HRE (for H2O2 Response Element, -561 to -520), is required for H2O2-dependent MDR1 upregulation, but is dispensable for constitutive high expression. Two potential binding sites (TTAG/CTAA) for the bZip transcription factor Cap1p lie within the HRE. Moreover, inactivation of CAP1 abolished the transient response to H2O2 and significantly diminished the transient response to benomyl. Cap1p, which has been previously implicated in cellular responses to oxidative stress, may thus play a trans-acting and positive regulatory role in benomyl- and H2O2-dependent transcription of MDR1. However, it is not the only transcription factor involved in the response of MDR1 to benomyl. A minimal BRE element (-290 to -273) that is sufficient to detect in vitro sequence-specific binding of protein complexes in crude extracts prepared from C. albicans was also delimited.
Genome-wide transcript profiling analyses undertaken with a matched pair of clinical isolates, one of which is azole-resistant and upregulates MDR1, and with an azole-susceptible strain exposed to benomyl, revealed that genes specifically upregulated by benomyl harbour Cap1p binding site(s) in their promoters. This strengthened the idea that Cap1p plays a role in benomyl-dependent upregulation of MDR1. BRE-like sequences were also identified in several genes co-regulated with MDR1 in both conditions, which was consistent with the involvement of the BRE in both processes. A set of 147 mutants lacking a single transcription factor gene was next screened for loss of MDR1 response to benomyl. Unfortunately, none of the tested mutants showed a loss of benomyl-dependent MDR1 upregulation. Nevertheless, a significant diminution of the response was observed in the mutants in which the MADS-box transcription factor Mcm1p and the C2H2 zinc finger transcription factor orf19.13374p were inactivated, suggesting that Mcm1p and orf19.13374p are involved in MDR1 response to benomyl. Interestingly, the BRE contains a perfect match to the binding consensus of Mcm1p, raising the possibility that MDR1 may be a direct target of this transcriptional activator. In conclusion, while the identity of the trans-acting factors that bind to the BRE and HRE remains to be confirmed, the tools we have developed during characterization of the cis-acting elements of the MDR1 promoter should now serve to elucidate the nature of the components that modulate its activity. SUMMARY Overexpression of the MDR1 gene (Multidrug Resistance 1), which encodes a transporter of the Major Facilitator family, is one of the mechanisms observed in clinical isolates of the yeast Candida albicans that develop resistance to the antifungal agents known as azoles.
To better understand this phenomenon, the cis-acting regulatory elements in a modulatable reporter system under the control of the MDR1 promoter were characterized. In an azole-susceptible strain, transcription of this reporter is transiently elevated in response to either benomyl or the oxidizing agent H2O2, whereas its expression is constitutively high in an azole-resistant strain (strain FR2). Two cis-acting regulatory elements, necessary and sufficient to convey the same transcriptional responses to a heterologous promoter (CDR2), were identified in the MDR1 promoter. The first element, called BRE (for Benomyl Response Element, -296 to -260 relative to the ATG start codon), is required for benomyl-dependent overexpression of MDR1 and for constitutive expression of MDR1 in FR2. The second element, called HRE (for H2O2 Response Element, -561 to -520), is required for H2O2-dependent overexpression of MDR1, but is not involved in constitutive expression of the MDR1 gene. Two potential binding sites (TTAG/CTAA) for the transcription factor Cap1p were identified in the HRE. In addition, inactivation of CAP1 abolished the transient response to H2O2 and significantly diminished the transient response to benomyl. Cap1p, which is implicated in cellular responses to oxidative stress, must therefore play a positive trans-acting role in benomyl- and H2O2-dependent overexpression of MDR1. However, it is not the only transcription factor involved in the response to benomyl. A minimal BRE (-290 to -273) was also defined and is sufficient to detect a specific in vitro interaction with proteins from crude extracts of C. albicans.
Transcript profiling of a pair of clinical isolates that includes an azole-resistant strain overexpressing MDR1, and of an azole-susceptible strain exposed to benomyl, revealed that genes specifically overexpressed in response to benomyl contain one or more Cap1p binding sites in their promoters. This strengthens the idea that Cap1p plays a role in benomyl-dependent overexpression of MDR1. One or two BRE-like sequences were also identified in most of the genes co-regulated with MDR1 under these two conditions, which was expected given the role of this element in both processes. A collection of 147 mutants, each with a single transcription factor inactivated, was tested for loss of the MDR1 response to benomyl. Unfortunately, benomyl-dependent overexpression of MDR1 was not lost in any of the mutants tested. Nevertheless, a significant decrease in the response was observed in mutants in which the MADS-box transcription factor Mcm1p and the C2H2 zinc finger transcription factor orf19.13374p were inactivated, suggesting that Mcm1p and orf19.13374p are involved in the MDR1 response to benomyl. Interestingly, the BRE contains a sequence that aligns perfectly with the consensus binding site of Mcm1p, which raises the possibility that MDR1 is a direct target of this transcriptional activator. In conclusion, while the identity of the trans-acting factors that bind the BRE and the HRE remains to be confirmed, the tools we developed while characterizing the cis-acting elements of the MDR1 promoter can now be used to elucidate the nature of the components that modulate its activity.
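Element coordinates in this abstract are given relative to the ATG start codon, with the base immediately upstream at position -1. The sketch below shows how candidate sites could be located in an upstream sequence and reported in that coordinate system. It rests on two loudly stated assumptions: the notation "TTAG/CTAA" is read here as the 7-bp site TTA(G/C)TAA, and the example sequence is made up rather than taken from the MDR1 promoter. The function name is illustrative.

```python
import re

# Assumption: "TTAG/CTAA" is interpreted as TTA(G/C)TAA. TTAGTAA and
# TTACTAA are reverse complements, so one pattern covers both strands.
CAP1_SITE = re.compile(r"TTA[GC]TAA")


def find_sites(upstream, pattern=CAP1_SITE):
    """Return (start, end) of each match, relative to the ATG.

    `upstream` is the sequence immediately 5' of the start codon, so its
    last base is position -1 and its first base is -len(upstream).
    """
    seq = upstream.upper()
    offset = len(seq)
    return [(m.start() - offset, m.end() - 1 - offset)
            for m in pattern.finditer(seq)]


# Made-up 20-bp upstream fragment, for illustration only.
example = "GGTTAGTAACCATTACTAAG"
print(find_sites(example))
```

A sequence match of this kind only flags candidates; as the abstract notes, binding of the trans-acting factors still has to be confirmed experimentally.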
Abstract:
Shrews of the genus Sorex are characterized by a Holarctic distribution, and relationships among extant taxa have never been fully resolved. Phylogenies have been proposed based on morphological, karyological, and biochemical comparisons, but these analyses often produced controversial and contradictory results. Phylogenetic analyses of partial mitochondrial cytochrome b gene sequences (1011 bp) were used to examine the relationships among 27 Sorex species. The molecular data suggest that Sorex comprises two major monophyletic lineages, one restricted mostly to the New World and one with a primarily Palearctic distribution. Furthermore, several sister-species relationships are revealed by the analysis. Based on the split between the Soricinae and Crocidurinae subfamilies, we used a 95% confidence interval for both the calibration of a molecular clock and the subsequent calculation of major diversification events within the genus Sorex. Our analysis does not support an unambiguous acceleration of the molecular clock in shrews, the estimated rate being similar to other estimates of mammalian mitochondrial clocks. In addition, the data presented here indicate that estimates from the fossil record greatly underestimate divergence dates among Sorex taxa.
Abstract:
The bacterial insertion sequence IS21 shares with many insertion sequences a two-step, reactive junction transposition pathway, for which a model is presented in this review: a reactive junction with abutted inverted repeats is first formed and subsequently integrated into the target DNA. The reactive junction occurs in IS21-IS21 tandems and IS21 minicircles. In addition, IS21 shows a unique specialization of transposition functions. By alternative translation initiation, the transposase gene codes for two products: the transposase, capable of promoting both steps of the reactive junction pathway, and the cointegrase, which only promotes the integration of reactive junctions but with higher efficiency. This review also includes a survey of the IS21 family and speculates on the possibility that other members present a similar transpositional specialization.
Abstract:
Resistance of human immunodeficiency virus type 1 (HIV-1) to antiretroviral agents results from target gene mutation within the pol gene, which encodes the viral protease, reverse transcriptase (RT), and integrase. We speculated that mutations in genes other than the drug target could lead to drug resistance. For this purpose, the p1-p6(gag)-p6(pol) region of HIV-1, placed immediately upstream of pol, was analyzed. This region has the potential to alter Pol through frameshift regulation (p1), through improved packaging of viral enzymes (p6(Gag)), or by changes in activation of the viral protease (p6(Pol)). Duplication of the proline-rich p6(Gag) PTAP motif, necessary for late viral cycle activities, was identified in plasma virus from 47 of 222 (21.2%) patients treated with nucleoside analog RT inhibitor (NRTI) antiretroviral therapy but was identified very rarely from drug-naïve individuals. Molecular clones carrying a 3-amino-acid duplication, APPAPP (transframe duplication SPTSPT in p6(Pol)), displayed a delay in protein maturation; however, they packaged a 34% excess of RT and exhibited a marked competitive growth advantage in the presence of NRTIs. This phenotype is reminiscent of the inoculum effect described in bacteriology, where a larger input, or a greater infectivity of an organism with a wild-type antimicrobial target, leads to escape from drug pressure and a higher MIC in vitro. Though the mechanism by which the PTAP region participates in viral maturation is not known, duplication of this proline-rich motif could improve assembly and packaging at membrane locations, resulting in the observed phenotype of increased infectivity and drug resistance.
Abstract:
During pregnancy the plasma concentration of two different inhibitors of plasminogen activators (PAIs) increases. The only one found in the plasma of nonpregnant women (PAI 1) is immunologically related to a PAI of endothelial cells; its plasma activity, as deduced from the inhibition of single-chain tissue-type plasminogen activator (t-PA), increased from 3.4 ± 2.3 U/mL (mean ± 95% confidence limits) in the plasma of nonpregnant women to 29 ± 7 U/mL at term, and its antigen level, measured by a radioimmunoassay, increased from 54 ± 17 ng/mL to 144 ± 25 ng/mL. In pregnancy plasma a second PAI (PAI 2) related to a PAI found in placenta extracts was observed. Its level, quantified with a radioimmunoassay, increased from below the detection limit (approximately 10 ng/mL) in normal plasma to 260 ng/mL at term. One hour after delivery, PAI 1 activities and antigen decreased sharply, but the PAI 2 antigen levels remained constant. Three days later, the PAI 1 antigen levels had fallen to normal levels, but the PAI 2 antigen levels were still at least eightfold above the nonpregnant values. During pregnancy, the t-PA and prourokinase (u-PA) antigen concentrations increased 50% and 200%, respectively, whereas the plasminogen and alpha 2-antiplasmin levels remained constant. Despite the large variations in the levels of PAs and PAIs, the overall fibrinolytic activity as measured in diluted plasma by a radioiodinated fibrin plate assay did not change significantly. Just after delivery, a great increase in the t-PA antigen levels was observed. Three to five days after delivery most parameters of the fibrinolytic system were normal again. Our results demonstrate that during pregnancy and in the puerperium profound alterations of the fibrinolytic system occur that are characterized by increases in PAs and their inhibitors, but these alterations do not affect the overall fibrinolytic activity.
Abstract:
Epidemiological processes leave a fingerprint in the pattern of genetic structure of virus populations. Here, we provide a new method to infer epidemiological parameters directly from viral sequence data. The method is based on phylogenetic analysis using a birth-death model (BDM) rather than the commonly used coalescent as the model for the epidemiological transmission of the pathogen. Using the BDM has the advantage that transmission and death rates are estimated independently, which enables, for the first time, the estimation of the basic reproductive number of the pathogen using only sequence data, without further assumptions such as the average duration of infection. We apply the method to genetic data of the HIV-1 epidemic in Switzerland.
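In a birth-death model, the basic reproductive number is the ratio of the transmission (birth) rate to the rate at which infected individuals stop being infectious (death), which is why estimating the two rates separately is enough to obtain it. A minimal sketch follows; the function name and the rate values are hypothetical and are not estimates from the Swiss HIV-1 data.

```python
def basic_reproductive_number(transmission_rate, death_rate):
    """R0 under a birth-death model: transmission rate divided by death rate."""
    if death_rate <= 0:
        raise ValueError("death_rate must be positive")
    return transmission_rate / death_rate


# Hypothetical rates (per year), chosen only to illustrate the ratio.
r0 = basic_reproductive_number(transmission_rate=1.0, death_rate=0.25)
print(r0)
```

A value above 1 means each infection gives rise to more than one new infection on average, so the epidemic grows.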
Abstract:
In contrast with mammals and birds, most poikilothermic vertebrates feature structurally undifferentiated sex chromosomes, which may result either from frequent turnovers, or from occasional events of XY recombination. The latter mechanism was recently suggested to be responsible for sex-chromosome homomorphy in European tree frogs (Hyla arborea). However, no single case of male recombination has been identified in large-scale laboratory crosses, and populations from NW Europe consistently display sex-specific allelic frequencies with male-diagnostic alleles, suggesting the absence of recombination in their recent history. To address this apparent paradox, we extended the phylogeographic scope of investigations, by analyzing the sequences of three sex-linked markers throughout the whole species distribution. Refugial populations (southern Balkans and Adriatic coast) show a mix of X and Y alleles in haplotypic networks, and no more within-individual pairwise nucleotide differences in males than in females, testifying to recurrent XY recombination. In contrast, populations of NW Europe, which originated from a recent postglacial expansion, show a clear pattern of XY differentiation; the X and Y gametologs of the sex-linked gene Med15 present different alleles, likely fixed by drift on the front wave of expansions, and kept differentiated since. Our results support the view that sex-chromosome homomorphy in H. arborea is maintained by occasional or historical events of recombination; whether the frequency of these events indeed differs between populations remains to be clarified.
Abstract:
Microtubule plus-end-tracking proteins (+TIPs) specifically localize to the growing plus-ends of microtubules to regulate microtubule dynamics and functions. A large group of +TIPs contain a short linear motif, SXIP, which is essential for them to bind to end-binding proteins (EBs) and target microtubule ends. The SXIP sequence site thus acts as a widespread microtubule tip localization signal (MtLS). Here we have analyzed the sequence-function relationship of a canonical MtLS. Using synthetic peptide arrays on membrane supports, we identified the residue preferences at each amino acid position of the SXIP motif and its surrounding sequence with respect to EB binding. We further developed an assay based on fluorescence polarization to assess the mechanism of the EB-SXIP interaction and to correlate EB binding and microtubule tip tracking of MtLS sequences from different +TIPs. Finally, we investigated the role of phosphorylation in regulating the EB-SXIP interaction. Together, our results define the sequence determinants of a canonical MtLS and provide the experimental data for bioinformatics approaches to carry out genome-wide predictions of novel +TIPs in multiple organisms.
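As a rough illustration of the first step of such a genome-wide prediction, the sketch below scans a protein sequence for the bare SxIP tetrapeptide (serine, any residue, isoleucine, proline). It is deliberately simplistic: the abstract stresses that residues surrounding the motif also affect EB binding, and those preferences are not encoded here. The function name and the example sequence are made up.

```python
import re

# S-x-I-P in one-letter amino acid codes; x is any residue.
SXIP = re.compile(r"S[A-Z]IP")


def find_sxip(protein):
    """Return (1-based start position, matched tetrapeptide) for each hit."""
    return [(m.start() + 1, m.group(0))
            for m in SXIP.finditer(protein.upper())]


# Made-up sequence, for illustration only.
example = "MAKRPSKIPTPQRSSSGLPK"
print(find_sxip(example))
```

A hit from this scan is only a candidate microtubule tip localization signal; scoring the flanking residues against experimentally determined preferences would be needed before calling a protein a likely +TIP.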