891 results for sequence database


Relevance:

30.00% 30.00%

Publisher:

Abstract:

The Complete Arabidopsis Transcriptome Micro Array (CATMA) database contains gene sequence tag (GST) and gene model sequences for over 70% of the predicted genes in the Arabidopsis thaliana genome as well as primer sequences for GST amplification and a wide range of supplementary information. All CATMA GST sequences are specific to the gene for which they were designed, and all gene models were predicted from a complete reannotation of the genome using uniform parameters. The database is searchable by sequence name, sequence homology or direct SQL query, and is available through the CATMA website at http://www.catma.org/.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Selectome (http://selectome.unil.ch/) is a database of positive selection, based on a branch-site likelihood test. This model estimates the number of nonsynonymous substitutions (dN) and synonymous substitutions (dS) to evaluate the variation in selective pressure (dN/dS ratio) over branches and over sites. Since the original release of Selectome, we have benchmarked and implemented a thorough quality control procedure on multiple sequence alignments, aiming to provide minimum false-positive results. We have also improved the computational efficiency of the branch-site test implementation, allowing larger data sets and more frequent updates. Release 6 of Selectome includes all gene trees from Ensembl for Primates and Glires, as well as a large set of vertebrate gene trees. A total of 6810 gene trees have some evidence of positive selection. Finally, the web interface has been improved to be more responsive and to facilitate searches and browsing.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Although the molecular typing of Pseudomonas aeruginosa is important to understand the local epidemiology of this opportunistic pathogen, it remains challenging. Our aim was to develop a simple typing method based on the sequencing of two highly variable loci. Single-strand sequencing of three highly variable loci (ms172, ms217, and oprD) was performed on a collection of 282 isolates recovered between 1994 and 2007 (from patients and the environment). As expected, the resolution of each locus alone [number of types (NT) = 35-64; index of discrimination (ID) = 0.816-0.964] was lower than that of the combination of two loci (NT = 78-97; ID = 0.966-0.971). As each pairwise combination of loci gave similar results, we selected the most robust combination, ms172 [reverse; R] and ms217 [R], to constitute the double-locus sequence typing (DLST) scheme for P. aeruginosa. This combination gave: (i) a complete genotype for 276/282 isolates (typability of 98%), (ii) 86 different types, and (iii) an ID of 0.968. Analysis of multiple isolates from the same patients or taps showed that DLST genotypes are generally stable over a period of several months. The high typability, discriminatory power, and ease of use of the proposed DLST scheme make it a method of choice for local epidemiological analyses of P. aeruginosa. Moreover, the ability to define types unambiguously allowed the development of an Internet database (http://www.dlst.org) accessible to all.
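The index of discrimination quoted above can be illustrated with a short sketch. It assumes the ID is the Simpson-type index in the Hunter-Gaston form, D = 1 - Σ n_j(n_j - 1) / (N(N - 1)), which the abstract does not state explicitly. The function name and the toy type labels are hypothetical and are not data from the study.

```python
from collections import Counter


def discrimination_index(types):
    """Probability that two isolates drawn without replacement differ in type.

    Assumed formula (Hunter-Gaston form of Simpson's index):
    D = 1 - sum(n_j * (n_j - 1)) / (N * (N - 1))
    """
    n = len(types)
    if n < 2:
        raise ValueError("need at least two typed isolates")
    counts = Counter(types)
    # Number of ordered pairs of isolates that share a type.
    same_type_pairs = sum(c * (c - 1) for c in counts.values())
    return 1.0 - same_type_pairs / (n * (n - 1))


# Hypothetical toy collection: six isolates falling into three types.
toy_types = ["A", "A", "B", "C", "C", "C"]
print(round(discrimination_index(toy_types), 3))
```

A value near 1 means that almost any two isolates are told apart by the scheme; a value of 0 means that all isolates share one type.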

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Information about the genomic coordinates and the sequence of experimentally identified transcription factor binding sites is scattered across a variety of formats. The availability of standard collections of such high-quality data is important to design, evaluate and improve novel computational approaches to identify binding motifs on promoter sequences from related genes. ABS (http://genome.imim.es/datasets/abs2005/index.html) is a public database of known binding sites identified in promoters of orthologous vertebrate genes that have been manually curated from the literature. We have annotated 650 experimental binding sites from 68 transcription factors and 100 orthologous target genes in human, mouse, rat or chicken genome sequences. Computational predictions and promoter alignment information are also provided for each entry. A simple and easy-to-use web interface facilitates data retrieval, allowing different views of the information. In addition, release 1.0 of ABS includes a customizable generator of artificial datasets based on the known sites contained in the collection and an evaluation tool to aid in the training and assessment of motif-finding programs.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

The objective of this work was to identify expressed simple sequence repeat (SSR) markers associated with leaf miner resistance in coffee progenies. Identification of SSR markers was accomplished by directed searches of the Brazilian Coffee Expressed Sequence Tags (EST) database. Sequence analysis of 32 selected SSR loci showed that 65% of the repeats are tetranucleotides, 21% trinucleotides and 14% dinucleotides. Also, expressed SSRs are frequently located in the 5'-UTR of the gene transcript. Moreover, most of the genes containing SSRs are associated with defense mechanisms. Polymorphisms were analyzed in progenies segregating for resistance to the leaf miner and corresponding to advanced generations of a Coffea arabica x Coffea racemosa hybrid. The frequency of SSR alleles was 2.1 per locus. However, no polymorphism associated with leaf miner resistance was identified. These results suggest that marker-assisted selection in coffee breeding should be performed on the initial cross, in which genetic variability is still significant.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

The flexibility of different regions of HIV-1 protease was examined by using a database consisting of 73 X-ray structures that differ in terms of sequence, ligands or both. The root-mean-square differences of the backbone for the set of structures were shown to have the same variation with residue number as those obtained from molecular dynamics simulations, normal mode analyses and X-ray B-factors. This supports the idea that observed structural changes provide a measure of the inherent flexibility of the protein, although specific interactions between the protease and the ligand play a secondary role. The results suggest that the potential energy surface of the HIV-1 protease is characterized by many local minima with small energetic differences, some of which are sampled by the different X-ray structures of the HIV-1 protease complexes. Interdomain correlated motions were calculated from the structural fluctuations and the results were also in agreement with molecular dynamics simulations and normal mode analyses. Implications of the results for the drug-resistance engendered by mutations are discussed briefly.
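A per-residue spread across a set of structures can be sketched as follows. The sketch assumes the structures have already been superimposed and reduced to one backbone point per residue (for example the C-alpha atom), and it measures the root-mean-square deviation of each residue from its mean position across the set. The abstract does not say exactly how its RMS differences were computed, so the function name, the input layout and the choice of deviation from the mean are all assumptions.

```python
import math


def per_residue_rms(structures):
    """RMS deviation of each residue from its mean position across structures.

    `structures` is a list of structures; each structure is a list of
    (x, y, z) tuples, one per residue, in the same residue order and
    already superimposed on a common frame (an assumption of this sketch).
    """
    n_struct = len(structures)
    n_res = len(structures[0])
    if any(len(s) != n_res for s in structures):
        raise ValueError("all structures must have the same number of residues")

    rms = []
    for i in range(n_res):
        points = [s[i] for s in structures]
        # Mean position of residue i over all structures.
        mean = [sum(p[k] for p in points) / n_struct for k in range(3)]
        # Mean squared distance of residue i from that mean position.
        msd = sum(
            sum((p[k] - mean[k]) ** 2 for k in range(3)) for p in points
        ) / n_struct
        rms.append(math.sqrt(msd))
    return rms


# Hypothetical two-structure, two-residue example.
toy = [
    [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)],
    [(2.0, 0.0, 0.0), (1.0, 1.0, 1.0)],
]
print(per_residue_rms(toy))
```

Plotting the returned values against residue number gives the kind of profile that the abstract compares with B-factors and simulation results.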

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Despite the development of novel typing methods based on whole genome sequencing, most laboratories still rely on classical molecular methods for outbreak investigation or surveillance. Reference methods for Clostridium difficile include ribotyping and pulsed-field gel electrophoresis, which are band-comparing methods often difficult to establish and which require reference strain collections. Here, we present the double locus sequence typing (DLST) scheme as a tool to analyse C. difficile isolates. Using a collection of clinical C. difficile isolates recovered during a 1-year period, we evaluated the performance of DLST and compared the results to multilocus sequence typing (MLST), a sequence-based method that has been used to study the structure of bacterial populations and highlight major clones. DLST had a higher discriminatory power than MLST (Simpson's index of diversity of 0.979 versus 0.965) and successfully identified all isolates of the study (100% typeability). Previous studies showed that the discriminatory power of ribotyping was comparable to that of MLST; thus, DLST might be more discriminatory than ribotyping. DLST is easy to establish and provides several advantages, including absence of DNA extraction [polymerase chain reaction (PCR) is performed on colonies], no specific instrumentation, low cost and unambiguous definition of types. Moreover, the implementation of a DLST typing scheme on an Internet database, such as that previously done for Staphylococcus aureus and Pseudomonas aeruginosa (http://www.dlst.org), will allow users to easily obtain the DLST type by directly submitting sequencing files and will avoid problems associated with multiple databases.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Affiliation: Département de biochimie, Faculté de médecine, Université de Montréal

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Background: This study describes a bioinformatics approach designed to identify Plasmodium vivax proteins potentially involved in reticulocyte invasion. Specifically, different protein training sets were built and tuned based on different biological parameters, such as experimental evidence of secretion and/or involvement in invasion-related processes. A profile-based sequence method supported by hidden Markov models (HMMs) was then used to build classifiers to search for biologically-related proteins. The transcriptional profile of the P. vivax intra-erythrocyte developmental cycle was then screened using these classifiers. Results: A bioinformatics methodology for identifying potentially secreted P. vivax proteins was designed using sequence redundancy reduction and probabilistic profiles. This methodology led to identifying a set of 45 proteins that are potentially secreted during the P. vivax intra-erythrocyte development cycle and could be involved in cell invasion. Thirteen of the 45 proteins have already been described as vaccine candidates; there is experimental evidence of protein expression for 7 of the 32 remaining ones, while no previous studies of expression, function or immunology have been carried out for the additional 25. Conclusions: The results support the idea that probabilistic techniques like profile HMMs improve similarity searches. Also, different adjustments such as sequence redundancy reduction using Pisces or Cd-Hit allowed data clustering based on rational reproducible measurements. This kind of approach for selecting proteins with specific functions is highly important for supporting large-scale analyses that could aid in the identification of genes encoding potential new target antigens for vaccine development and drug design. The present study has led to targeting 32 proteins for further testing regarding their ability to induce protective immune responses against P. vivax malaria.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

The current version of this database on CD-ROM contains information on 14 127 cocoa (Theobroma cacao) clones and their 14 112 synonyms, the origin and history of the clones and the clone names, and accession lists for 48 of the major cocoa gene banks including quarantine stations. Also included are morphological data for leaves, fruits and seeds, disease reactions, quality and agronomic characters, and reference information on common abbreviations and acronyms, cocoa gene bank addresses and a full bibliography (with hyperlinked reference to data). New additions are 748 photographs and drawings of 428 individual clones in 11 different locations. Also included are 376 profiles for 15 simple sequence repeat primer pairs on 331 clones held in the University of Reading Intermediate Cocoa Quarantine Facility. Minimum system requirements are Windows 95 or later, a Pentium 166 with 32 MB RAM, CD-ROM drive and a minimum 20 MB hard disk space. A user guide is included in the package.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Physiological and yield traits such as stomatal conductance (mmol m⁻² s⁻¹), leaf relative water content (RWC, %) and grain yield per plant were studied in a separate experiment. The results revealed that five of the sixteen cultivars, viz. Anmol, Moomal, Sarsabz, Bhitai and Pavan, appeared to be relatively more drought tolerant. Based on these morphophysiological results, the cultivars were then examined for drought tolerance at the molecular level. Initially, four well-recognized primers for dehydrin genes (DHNs) associated with the drought response in T. durum L., T. aestivum L. and O. sativa L. were used to profile the gene sequence of the sixteen wheat cultivars. The primers amplified the DHN genes variably: primer WDHN13 (T. aestivum L.) amplified the DHN gene in only seven cultivars, whereas primer TdDHN15 (T. durum L.) amplified it in all sixteen cultivars, with differing DNA banding patterns, some showing a second, weaker DNA band. The third primer, TdDHN16 (T. durum L.), showed an entirely different PCR amplification pattern, notably two strong DNA bands, while the fourth primer, RAB16C (O. sativa L.), failed to amplify the DHN gene in any of the cultivars. Examination of the DNA sequences revealed several interesting features. First, it identified the two exon/one intron structure of this gene (complete sequences were not shown), a feature not previously described in the two database cDNA sequences available from T. aestivum L. (gi|21850). Second, the analysis identified several single nucleotide polymorphism (SNP) positions in the gene sequence. Although the complete gene sequence was not obtained for all the cultivars, there were a total of 38 variable positions in the exonic (coding region) sequence, from a total gene length of 453 nucleotides. The SNP matrix shows these 37 positions, with the individual sequence at each position given for each of the 14 cultivars included in this analysis (the sequence of two cultivars was not obtained). It demonstrated considerable diversity for this gene, with only three cultivars, i.e. TJ-83, Marvi and TD-1, being similar to the consensus sequence. All other cultivars showed a unique combination of SNPs. To prove a functional link between these polymorphisms and drought tolerance in wheat, it would be necessary to conduct a more detailed study involving directed mutation of this gene and DHN gene expression.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Hepatitis C virus (HCV) infection frequently persists despite substantial virus-specific immune responses and the combination of pegylated interferon (INF)-alpha and ribavirin therapy. Major histocompatibility complex class I restricted CD8+ T cells are responsible for the control of viraemia in HCV infection, and several studies suggest that specific HLAs are associated with protection against viral infection. The reason for low rates of sustained viral response (SVR) in HCV patients remains unknown. Escape mutations in response to cytotoxic T lymphocytes are widely described; however, their influence on treatment outcome is poorly understood. Here, we investigate the differences in CD8 epitope frequencies from the Los Alamos database between groups of patients who showed distinct responses to pegylated alpha-INF with ribavirin therapy, and test for evidence of natural selection on the virus in those who failed treatment, using five maximum likelihood evolutionary models from the PAML package. The group of sustained virological responders showed three epitopes with higher frequencies than the non-responder group, all with statistical support, and we observed evidence of selection pressure in the latter group. No escape mutation was observed. Interestingly, the epitope VLSDFKTWL was 100% conserved in the SVR group. These results suggest that the response to treatment can be explained by the increase in immune pressure induced by interferon therapy, and that the presence of those epitopes may represent an important factor in determining the outcome of therapy.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

The main goal of our research was to search for SSRs in the Eucalyptus EST FORESTs database using software for mining SSR motifs. With this objective, we created a database for cataloging Eucalyptus EST-derived SSRs and developed a bioinformatics tool, named Satellyptus, for finding and analyzing microsatellites in the Eucalyptus EST database. The search for microsatellites in the FORESTs database, which contains 71,115 Eucalyptus EST sequences (52.09 Mb), revealed 20,530 SSRs in 15,621 ESTs. The SSR abundance detected in the Eucalyptus EST database (29%, or one microsatellite every four sequences) is considered very high for plants. Among the categories of SSR motifs, the dimeric (37%) and trimeric (33%) ones predominated. The AG/CT motif was the most frequent (35.15%), followed by the trimeric CCG/CGG (12.81%). From a random sample of 1,217 sequences, 343 microsatellites in 265 SSR-containing sequences were identified. Approximately 48% of these ESTs containing microsatellites were homologous to proteins with known biological function. Most of the microsatellites detected in Eucalyptus ESTs were positioned at either the 5′ or 3′ end. Our next priority involves the design of flanking primers for codominant SSR loci, which could lead to the development of a set of microsatellite-based markers suitable for marker-assisted Eucalyptus breeding programs.
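Mining SSR motifs from EST sequences can be illustrated with a small regular-expression sketch. It is not the Satellyptus implementation, whose algorithm and thresholds the abstract does not describe. The function names and the minimum repeat counts (6 for dimers, 5 for trimers, 4 for tetramers) are hypothetical choices made only for this example.

```python
import re


def _is_primitive(unit):
    """True if `unit` is not itself a repetition of a shorter motif (e.g. AGAG)."""
    n = len(unit)
    return not any(
        n % k == 0 and unit == unit[:k] * (n // k) for k in range(1, n)
    )


def find_ssrs(seq, min_repeats=None):
    """Return (start, motif, repeat_count) for perfect tandem repeats in `seq`.

    `min_repeats` maps motif length to the minimum number of tandem copies;
    the defaults below are illustrative assumptions, not published settings.
    """
    min_repeats = min_repeats or {2: 6, 3: 5, 4: 4}
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in sorted(min_repeats.items()):
        # A motif of unit_len bases followed by at least (min_rep - 1) copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            unit = match.group(1)
            if not _is_primitive(unit):
                continue  # skip homopolymers and motifs such as AGAG
            hits.append((match.start(), unit, len(match.group(0)) // unit_len))
    return hits


# Hypothetical EST fragment containing an (AG)8 and a (CCG)5 repeat.
est = "TTT" + "AG" * 8 + "TTT" + "CCG" * 5 + "TT"
for start, motif, count in find_ssrs(est):
    print(start, motif, count)
```

Filtering out non-primitive motifs keeps a dinucleotide run such as (AG)8 from being reported a second time as a tetranucleotide (AGAG)4 repeat.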

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Genome sequencing efforts are providing us with complete genetic blueprints for hundreds of organisms. We are now faced with assigning, understanding, and modifying the functions of proteins encoded by these genomes. DBMODELING is a relational database of annotated comparative protein structure models and their metabolic pathway characterization, when identified. This procedure was applied to complete genomes such as Mycobacterium tuberculosis and Xylella fastidiosa. The main interest in the study of metabolic pathways is that some of these pathways are not present in humans, which makes them selective targets for drug design, decreasing the impact of drugs in humans. The database currently contains 1116 proteins from two genomes. It can be accessed by any researcher at http://www.biocristalografia.df.ibilce.unesp.br/tools/. This project confirms that homology modeling is a useful tool in structural bioinformatics and that it can be very valuable in annotating genome sequence information, contributing to structural and functional genomics, and analyzing protein-ligand docking.