956 results for Genomic sequence database
Abstract:
The Eukaryotic Promoter Database (EPD) is an annotated non-redundant collection of eukaryotic POL II promoters, experimentally defined by a transcription start site (TSS). There may be multiple promoter entries for a single gene. The underlying experimental evidence comes from journal articles and, starting from release 73, from 5' ESTs of full-length cDNA clones used for so-called in silico primer extension. Access to promoter sequences is provided by pointers to TSS positions in nucleotide sequence entries. The annotation part of an EPD entry includes a description of the type and source of the initiation site mapping data, links to other biological databases and bibliographic references. EPD is structured in a way that facilitates dynamic extraction of biologically meaningful promoter subsets for comparative sequence analysis. Web-based interfaces have been developed that enable the user to view EPD entries in different formats, to select and extract promoter sequences according to a variety of criteria and to navigate to related databases exploiting different cross-references. Tools for analysing sequence motifs around TSSs defined in EPD are provided by the signal search analysis server. EPD can be accessed at http://www.epd.isb-sib.ch.
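The idea of a promoter being defined by a pointer to a TSS position in a nucleotide sequence entry can be illustrated with a minimal sketch. The function names, the 1-based coordinate convention and the window sizes below are assumptions made for illustration only; they are not EPD's own API or file format.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    table = str.maketrans("ACGTNacgtn", "TGCANtgcan")
    return seq.translate(table)[::-1]


def promoter_window(sequence, tss, upstream, downstream, strand="+"):
    """Extract a window around a TSS (hypothetical helper).

    sequence   : plus-strand nucleotide sequence of the entry
    tss        : 1-based TSS position on the plus strand (assumed convention)
    upstream   : number of bases to keep 5' of the TSS
    downstream : number of bases to keep 3' of the TSS
    strand     : "+" or "-", the strand the promoter lies on

    The result is returned 5'->3' on the promoter's own strand and
    includes the TSS base itself.
    """
    index = tss - 1  # convert to a 0-based index
    if strand == "+":
        start, end = index - upstream, index + downstream + 1
    elif strand == "-":
        # On the minus strand, "upstream" means higher plus-strand coordinates.
        start, end = index - downstream, index + upstream + 1
    else:
        raise ValueError("strand must be '+' or '-'")
    if start < 0 or end > len(sequence):
        raise ValueError("window extends beyond the sequence entry")
    window = sequence[start:end]
    return window if strand == "+" else reverse_complement(window)
```

A call such as `promoter_window(entry_sequence, tss=5, upstream=2, downstream=1)` returns four bases, with the TSS as the third one.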
Abstract:
Genomic islands, large potentially mobile regions of bacterial chromosomes, are a major contributor to bacterial evolution. Here, we investigated the fitness cost and phenotypic differences between the bacterium Pseudomonas aeruginosa PAO1 and a derivative carrying one integrated copy of the clc element, a 103-kb genomic island [and integrative and conjugative element (ICE)] originating in Pseudomonas sp. strain B13 and a close relative of genomic islands found in clinical and environmental isolates of P. aeruginosa. By using a combination of whole genome transcriptome profiling, phenotypic arrays, competition experiments, and biofilm formation studies, only a few differences became apparent, such as reduced biofilm growth and fourfold stationary phase repression of genes involved in acetoin metabolism in PAO1 containing the clc element. In contrast, PAO1 carrying the clc element acquired the capacity to grow on 3-chlorobenzoate and 2-aminophenol as sole carbon and energy substrates. No fitness loss >1% was detectable in competition experiments between PAO1 and PAO1 carrying the clc element. The genes from the clc element were not silent in PAO1, and excision was observed, although transfer of clc from PAO1 to other recipient bacteria was reduced by two orders of magnitude. Our results indicate that newly acquired mobile DNA does not necessarily impose an important fitness cost on its host. Absence of immediate detriment to the host may have contributed to the wide distribution of genomic islands like clc in bacterial genomes.
Abstract:
The number of sequences generated by genome projects has increased exponentially, but gene characterization has not followed at the same rate. Sequencing and analysis of full-length cDNAs is an important step in gene characterization that is now used by several research groups. In this work, we have selected Schistosoma mansoni clones for full-length sequencing, using an algorithm that investigates the presence of the initial methionine in the parasite sequence based on the positions of alignment start between two sequences. BLAST searches to produce such alignments have been performed using parasite expressed sequence tags produced by Minas Gerais Genome Network against sequences from the database Eukaryotic Cluster of Orthologous Groups (KOG). This procedure has allowed the selection of clones representing 398 proteins which have not been deposited as S. mansoni complete CDS in any public database. Dedicated sequencing of 96 of these clones with reads from both 5' and 3' ends has been performed. These reads have been assembled using PHRAP, resulting in the production of 33 full-length sequences that represent novel S. mansoni proteins. These results should contribute to constructing a more complete view of the biology of this important parasite.
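The abstract does not give the exact selection rule, so the following is only a rough sketch of how alignment start positions could be used to guess whether an EST covers the initial methionine. The function name, the coordinate conventions and the `max_subject_start` threshold are all hypothetical and are not taken from the paper.

```python
def likely_has_initial_met(query_start_nt, subject_start_aa, max_subject_start=10):
    """Guess whether an EST reaches the start codon (illustrative heuristic).

    query_start_nt    : 1-based nucleotide position in the EST where the
                        translated alignment begins
    subject_start_aa  : 1-based residue position in the KOG protein where
                        the alignment begins
    max_subject_start : hypothetical cutoff; alignments that begin deeper
                        into the protein are treated as uninformative

    The EST is accepted when the alignment begins near the protein's
    N-terminus and the EST has enough unaligned 5' sequence to encode the
    residues that precede the alignment (three nucleotides per residue).
    """
    if query_start_nt < 1 or subject_start_aa < 1:
        raise ValueError("alignment coordinates are 1-based")
    missing_residues = subject_start_aa - 1
    upstream_nucleotides = query_start_nt - 1
    near_n_terminus = subject_start_aa <= max_subject_start
    return near_n_terminus and upstream_nucleotides >= 3 * missing_residues
```

Under these assumptions, an alignment that starts at EST position 31 and protein residue 5 passes the check, because 30 upstream nucleotides can encode the 4 unaligned residues.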
Abstract:
Using genome-wide data from 253,288 individuals, we identified 697 variants at genome-wide significance that together explained one-fifth of the heritability for adult height. By testing different numbers of variants in independent studies, we show that the most strongly associated ∼2,000, ∼3,700 and ∼9,500 SNPs explained ∼21%, ∼24% and ∼29% of phenotypic variance. Furthermore, all common variants together captured 60% of heritability. The 697 variants clustered in 423 loci were enriched for genes, pathways and tissue types known to be involved in growth and together implicated genes and pathways not highlighted in earlier efforts, such as signaling by fibroblast growth factors, WNT/β-catenin and chondroitin sulfate-related genes. We identified several genes and pathways not previously connected with human skeletal growth, including mTOR, osteoglycin and binding of hyaluronic acid. Our results indicate a genetic architecture for human height that is characterized by a very large but finite number (thousands) of causal variants.
Abstract:
Two allelic genomic fragments containing ribosomal protein S4 encoding genes (rpS4) from Trypanosoma cruzi (CL-Brener strain) were isolated and characterized. One allele comprises two complete tandem repeats of a sequence encoding an rpS4 gene. In the other, only one rpS4 gene is found. Sequence comparison to the data in the genome project database reveals that our two-copy allele corresponds to a variant haplotype. However, the deduced amino acid sequence of all the gene copies is identical. The rpS4 transcript processing sites were determined by comparison of genomic sequences with published cDNA data. The sequence data obtained demonstrate that rpS4 genes are expressed in epimastigotes, amastigotes, and trypomastigotes. A recombinant version of rpS4 was found to be antigenic: it was recognized by 62.5% of the individuals with positive serology for T. cruzi and by 93.3% of patients with proven chronic chagasic disease.
Abstract:
The characterization of expressed sequence tags (ESTs) generated from a cDNA library of Leishmania (Leishmania) amazonensis amastigotes is described. The sequencing of 93 clones generated new L. (L.) amazonensis ESTs, of which 32% are not related to any other sequences in the database and 68% showed significant similarities to known genes. The chromosome localization of some L. (L.) amazonensis ESTs was also determined in L. (L.) amazonensis and L. (L.) major. The characterization of these ESTs is suitable for the genome physical mapping, as well as for the identification of genes encoding cysteine proteinases implicated in protective immune responses in leishmaniasis.
Abstract:
Islet-brain 1 (IB1), a regulator of the pancreatic beta-cell function in the rat, is homologous to JIP-1, a murine inhibitor of c-Jun amino-terminal kinase (JNK). Whether IB1 and JIP-1 are present in humans was not known. We report the sequence of the 2133-bp human IB1 cDNA, the expression, structure, and fine-mapping of the human IB1 gene, and the characterization of an IB1 pseudogene. Human IB1 is 94% identical to rat IB1. The tissue-specific expression of IB1 in human is similar to that observed in rodent. The IB1 gene contains 12 exons and maps to chromosome 11 (11p11.2-p12), a region that is deleted in DEFECT-11 syndrome. Apart from an IB1 pseudogene on chromosome 17 (17q21), no additional IB1-related gene was found in the human genome. Our data indicate that the sequence and expression pattern of IB1 are highly conserved between rodent and human and provide the necessary tools to investigate whether IB1 is involved in human diseases.
Abstract:
In this paper we review the impact that the availability of the Schistosoma mansoni genome sequence and annotation has had on schistosomiasis research. Easy access to the genomic information is important and several types of data are currently being integrated, such as proteomics, microarray and polymorphic loci. Access to the genome annotation and powerful means of extracting information are major resources to the research community.
Abstract:
CA88 is the first long nuclear repetitive DNA sequence identified in the blood fluke, Schistosoma mansoni. The assembled S. mansoni sequence, which contains the CA88 repeat, has 8,887 nucleotides and at least three repeat units of approximately 360 bp. In addition, CA88 also possesses an internal CA microsatellite, identified as SmBr18. Both PCR and BLAST analysis have been used to analyse and confirm the CA88 sequence in other S. mansoni sequences in the public database. PCR-acquired nuclear repetitive DNA sequence profiles from nine Schistosoma species were used to classify this organism into four genotypes. Included among the nine species analysed were five sequences of both African and Asian lineages that are known to infect humans. Within these genotypes, three of them refer to recognised species groups. A panel of four microsatellite loci, including SmBr18 and three previously published loci, has been used to characterise the nine Schistosoma species. Each species has been identified and classified based on its CA88 DNA fingerprint profile. Furthermore, microsatellite sequences and intra-specific variation have also been observed within the nine Schistosoma species sequences. Taken together, these results support the use of these markers in studying the population dynamics of Schistosoma isolates from endemic areas and also provide new methods for investigating the relationships between different populations of parasites. In addition, these data also indicate that Schistosoma margrebowiei is not a sister taxon to Schistosoma mattheei, prompting a new designation to a basal clade.
Abstract:
An online scheme to assign Stenotrophomonas isolates to genomic groups was developed using the multilocus sequence analysis (MLSA), which is based on the DNA sequencing of selected fragments of the housekeeping genes ATP synthase alpha subunit (atpA), the recombination repair protein (recA), the RNA polymerase alpha subunit (rpoA) and the excision repair beta subunit (uvrB). This MLSA-based scheme was validated using eight of the 10 Stenotrophomonas species that have been previously described. The environmental and nosocomial Stenotrophomonas strains were characterised using MLSA, 16S rRNA sequencing and DNA-DNA hybridisation (DDH) analyses. Strains of the same species were found to have greater than 95% concatenated sequence similarity and specific strains formed cohesive readily recognisable phylogenetic groups. Therefore, MLSA appeared to be an effective alternative methodology to amplified fragment length polymorphism fingerprint and DDH techniques. Strains of Stenotrophomonas can be readily assigned through the open database resource that was developed in the current study (www.steno.lncc.br/).
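The "greater than 95% concatenated sequence similarity" criterion can be sketched as follows. This is a minimal illustration that assumes the four gene fragments have already been aligned to equal length; the function names, the gap handling and the simple identity measure are assumptions, and the published scheme may compute similarity differently.

```python
# Locus order used for concatenation (the four genes named in the abstract).
LOCI = ("atpA", "recA", "rpoA", "uvrB")


def concatenate(fragments):
    """Join the aligned fragments of one strain in a fixed locus order."""
    return "".join(fragments[locus] for locus in LOCI)


def percent_identity(seq_a, seq_b):
    """Percent identity over aligned columns, skipping columns with a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for base_a, base_b in zip(seq_a, seq_b):
        if base_a == "-" or base_b == "-":
            continue
        compared += 1
        if base_a == base_b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


def same_genomic_group(fragments_a, fragments_b, threshold=95.0):
    """Apply the >95% concatenated-similarity rule to two strains."""
    identity = percent_identity(concatenate(fragments_a), concatenate(fragments_b))
    return identity > threshold
```

Each strain is passed as a dictionary mapping the four locus names to aligned fragment strings; the threshold is a parameter so that other cutoffs can be explored.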
Abstract:
Access to online repositories for genomic and associated "-omics" datasets is now an essential part of everyday research activity. It is important therefore that the Tuberculosis community is aware of the databases and tools available to them online, as well as for the database hosts to know what the needs of the research community are. One of the goals of the Tuberculosis Annotation Jamboree, held in Washington DC on March 7th-8th 2012, was therefore to provide an overview of the current status of three key Tuberculosis resources, TubercuList (tuberculist.epfl.ch), TB Database (www.tbdb.org), and Pathosystems Resource Integration Center (PATRIC, www.patricbrc.org). Here we summarize some key updates and upcoming features in TubercuList, and provide an overview of the PATRIC site and its online tools for pathogen RNA-Seq analysis.
Abstract:
Genomic islands are DNA elements acquired by horizontal gene transfer that are common to a large number of bacterial genomes, which can contribute specific adaptive functions, e.g. virulence, metabolic capacities or antibiotic resistances. Some genomic islands are still self-transferable and display an intricate life-style, reminiscent of both bacteriophages and conjugative plasmids. Here we studied the dynamical process of genomic island excision and intracellular reintegration using the integrative and conjugative element ICEclc from Pseudomonas knackmussii B13 as a model. By using self-transfer of ICEclc from strain B13 to Pseudomonas putida and Cupriavidus necator as recipients, we show that ICEclc can target a number of different tRNA(Gly) genes in a bacterial genome, but only those which carry the GCC anticodon. Two conditional traps were designed for ICEclc based on the attR sequence, and we could show that ICEclc will insert with different frequencies in such traps producing brightly fluorescent cells. Starting from clonal primary transconjugants we demonstrate that ICEclc is excising and reintegrating at detectable frequencies, even in the absence of a recipient. Recombination site analysis provided evidence to explain the characteristics of a larger number of genomic island insertions observed in a variety of strains, including Bordetella petrii, Pseudomonas aeruginosa and Burkholderia.
Abstract:
We have analysed the whole mitochondrial (mt) genome sequences (each ~6 kilo nucleotide base pairs in length) of four field isolates of the malaria parasite Plasmodium falciparum collected from different locations in India. Comparative genomic analyses of mt genome sequences revealed three novel India-specific single nucleotide polymorphisms. In general, high mt genome diversity was found in Indian P. falciparum, at a level comparable to African isolates. A population phylogenetic tree placed the presently sequenced Indian P. falciparum with the global isolates, while a previously sequenced Indian isolate was an outlier. Although this preliminary study is limited to a small number of isolates, the data have provided fundamental evidence of the mt genome diversity of Indian P. falciparum and of its evolutionary relationships with global isolates.
Abstract:
The inv(16) and related t(16;16) are found in 10% of all cases with de novo acute myeloid leukemia. In these rearrangements the core binding factor beta (CBFB) gene on 16q22 is fused to the smooth muscle myosin heavy chain gene (MYH11) on 16p13. To gain insight into the mechanisms causing the inv(16) we have analysed 24 genomic CBFB-MYH11 breakpoints. All breakpoints in CBFB are located in a 15-Kb intron. More than 50% of the sequenced 6.2 Kb of this intron consists of human repetitive elements. Twenty-one of the 24 breakpoints in MYH11 are located in a 370-bp intron. The remaining three breakpoints in MYH11 are located more upstream. The localization of three breakpoints adjacent to a V(D)J recombinase signal sequence in MYH11 suggests a V(D)J recombinase-mediated rearrangement in these cases. V(D)J recombinase-associated characteristics (small nucleotide deletions and insertions of random nucleotides) were detected in six other cases. CBFB and MYH11 duplications were detected in four of six cases tested.
Abstract:
We report the complete genome sequence and analysis of an invasive Corynebacterium diphtheriae strain that caused endocarditis in Rio de Janeiro, Brazil. It was selected for sequencing on the basis of the current relevance of nontoxigenic strains for public health. The genomic information was explored in the context of diversity, plasticity and genetic relatedness with other contemporary strains.