894 results for SEQUENCE DATABASES


Relevance: 20.00%

Abstract:

Massively parallel signature sequencing (MPSS) generates millions of short sequence tags corresponding to transcripts from a single RNA preparation. Most MPSS tags can be unambiguously assigned to genes, thereby generating a comprehensive expression profile of the tissue of origin. From the comparison of MPSS data from 32 normal human tissues, we identified 1,056 genes that are predominantly expressed in the testis. Further evaluation by using MPSS tags from cancer cell lines and EST data from a wide variety of tumors identified 202 of these genes as candidates for encoding cancer/testis (CT) antigens. Of these genes, the expression in normal tissues was assessed by RT-PCR in a subset of 166 intron-containing genes, and those with confirmed testis-predominant expression were further evaluated for their expression in 21 cancer cell lines. Thus, 20 CT or CT-like genes were identified, with several exhibiting expression in five or more of the cancer cell lines examined. One of these genes is a member of a CT gene family that we designated as CT45. The CT45 family comprises six highly similar (>98% cDNA identity) genes that are clustered in tandem within a 125-kb region on Xq26.3. CT45 was found to be frequently expressed in both cancer cell lines and lung cancer specimens. Thus, MPSS analysis has resulted in a significant extension of our knowledge of CT antigens, leading to the discovery of a distinctive X-linked CT-antigen gene family.

Relevance: 20.00%

Abstract:

The objective of this work was to standardize a semiautomated method for genotyping soybean, based on universal tail sequence primers (UTSP), and to compare it with the conventional genotyping method that uses electrophoresis in polyacrylamide gels. Thirty soybean cultivars were genotypically characterized by both methods, using 13 microsatellite loci. For the UTSP method, the number of alleles (NA) was 50 (2-7 per marker) and the polymorphic information content (PIC) ranged from 0.40 to 0.74. For the conventional method, the NA was 38 (2-5 per marker) and the PIC varied from 0.39 to 0.67. The genetic dissimilarity matrices obtained by the two methods were highly correlated with each other (0.8026), and the formed groups were coherent with the phenotypic data used for varietal registration. The 13 markers allowed the distinction of all analyzed cultivars. The low cost of the UTSP method, associated with its high accuracy, makes it ideal for the characterization of soybean cultivars and for the determination of genetic purity.
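The polymorphic information content reported above can be computed from allele frequencies at a single marker. The sketch below assumes the classical definition of Botstein et al. (1980); the abstract does not say which PIC formula was used, and the function name `pic` and the example frequencies are purely illustrative.

```python
from itertools import combinations


def pic(allele_freqs):
    """Polymorphic information content for one marker.

    Assumes the Botstein et al. (1980) definition:
    PIC = 1 - sum(p_i^2) - sum_{i<j} 2 * p_i^2 * p_j^2
    """
    if abs(sum(allele_freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    # Expected homozygosity under random mating.
    homozygosity = sum(p * p for p in allele_freqs)
    # Correction for uninformative matings between identical heterozygotes.
    cross = sum(2 * (p * p) * (q * q) for p, q in combinations(allele_freqs, 2))
    return 1.0 - homozygosity - cross


if __name__ == "__main__":
    # Hypothetical marker with three alleles.
    print(round(pic([0.5, 0.3, 0.2]), 4))
```

A marker with two equally frequent alleles gives a PIC of 0.375, and a monomorphic marker gives 0, so the value grows with both the number of alleles and the evenness of their frequencies.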

Relevance: 20.00%

Abstract:

Bacillus subtilis is the best-characterized member of the Gram-positive bacteria. Its genome of 4,214,810 base pairs comprises 4,100 protein-coding genes. Of these protein-coding genes, 53% are represented once, while a quarter of the genome corresponds to several gene families that have been greatly expanded by gene duplication, the largest family containing 77 putative ATP-binding transport proteins. In addition, a large proportion of the genetic capacity is devoted to the utilization of a variety of carbon sources, including many plant-derived molecules. The identification of five signal peptidase genes, as well as several genes for components of the secretion apparatus, is important given the capacity of Bacillus strains to secrete large amounts of industrially important enzymes. Many of the genes are involved in the synthesis of secondary metabolites, including antibiotics, that are more typically associated with Streptomyces species. The genome contains at least ten prophages or remnants of prophages, indicating that bacteriophage infection has played an important evolutionary role in horizontal gene transfer, in particular in the propagation of bacterial pathogenesis.

Relevance: 20.00%

Abstract:

One of the most important issues in molecular biology is to understand the regulatory mechanisms that control gene expression. Gene expression is often regulated by proteins called transcription factors, which bind to short (5 to 20 base pairs), degenerate segments of DNA. Experimental efforts towards understanding the sequence specificity of transcription factors are laborious and expensive, but can be substantially accelerated with the use of computational predictions. This thesis describes the use of algorithms and resources for transcription factor binding site analysis in addressing quantitative modelling, where probabilistic models are built to represent the binding properties of a transcription factor and can be used to find new functional binding sites in genomes. Initially, an open-access database (HTPSELEX) was created, holding high-quality binding sequences for two eukaryotic families of transcription factors, namely CTF/NF1 and LEF1/TCF. The binding sequences were elucidated using a recently described experimental procedure called HTP-SELEX, which allows the generation of a large number (> 1000) of binding sites using mass sequencing technology. For each HTP-SELEX experiment we also provide accurate primary experimental information about the protein material used, details of the wet lab protocol, an archive of sequencing trace files, and assembled clone sequences of binding sequences. The database also offers reasonably large SELEX libraries obtained with conventional low-throughput protocols. The database is available at http://wwwisrec.isb-sib.ch/htpselex/ and ftp://ftp.isrec.isb-sib.ch/pub/databases/htpselex. The Expectation-Maximisation (EM) algorithm is one of the frequently used methods to estimate probabilistic models to represent the sequence specificity of transcription factors.
We present computer simulations in order to estimate the precision of EM-estimated models as a function of data set parameters (such as the length of the initial sequences, the number of initial sequences, and the percentage of nonbinding sequences). We observed a remarkable robustness of the EM algorithm with regard to the length of the training sequences and the degree of contamination. The HTPSELEX database and the benchmarked results of the EM algorithm formed part of the foundation for the subsequent project, in which a statistical framework based on hidden Markov models was developed to represent the sequence specificity of the transcription factors CTF/NF1 and LEF1/TCF using the HTP-SELEX experimental data. The hidden Markov model framework is capable of both predicting and classifying CTF/NF1 and LEF1/TCF binding sites. A covariance analysis of the binding sites revealed non-independent base preferences at different nucleotide positions, providing insight into the binding mechanism. We next tested the LEF1/TCF model by computing binding scores for a set of LEF1/TCF binding sequences for which relative affinities were determined experimentally using non-linear regression. The predicted and experimentally determined binding affinities correlated well.
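To make the idea of a probabilistic binding-site model concrete, here is a minimal sketch of the simplest such model, a log-odds position weight matrix built from aligned sites. It is not the hidden Markov model described in the thesis and it skips the EM step entirely; the example sites, the uniform background of 0.25, and the pseudocount of 1 are all assumptions made for illustration.

```python
import math

BASES = "ACGT"


def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds position weight matrix from equal-length aligned sites."""
    width = len(sites[0])
    if any(len(site) != width for site in sites):
        raise ValueError("all sites must have the same length")
    pwm = []
    for pos in range(width):
        # Start every base at the pseudocount so unseen bases keep a finite score.
        counts = {base: pseudocount for base in BASES}
        for site in sites:
            counts[site[pos]] += 1
        total = sum(counts.values())
        pwm.append({b: math.log2((counts[b] / total) / background) for b in BASES})
    return pwm


def score(pwm, window):
    """Sum of per-position log-odds scores for one window."""
    return sum(column[base] for column, base in zip(pwm, window))


def best_hit(pwm, sequence):
    """Return (score, offset) of the highest-scoring window in the sequence."""
    width = len(pwm)
    return max((score(pwm, sequence[i:i + width]), i)
               for i in range(len(sequence) - width + 1))


if __name__ == "__main__":
    sites = ["TTGGC", "TTGGC", "TTGGA"]  # hypothetical aligned binding sites
    pwm = build_pwm(sites)
    print(best_hit(pwm, "ACGTTGGCAT"))
```

A matrix like this treats positions as independent, which is exactly the assumption that the covariance analysis mentioned above calls into question; models such as hidden Markov models can relax it.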

Relevance: 20.00%

Abstract:

A variety of cellular proteins has the ability to recognize DNA lesions induced by the anti-cancer drug cisplatin, with diverse consequences for their repair and for the therapeutic effectiveness of this drug. We report a novel gene involved in the cell response to cisplatin in vertebrates. The RDM1 gene (for RAD52 Motif 1) was identified while searching databases for sequences showing similarities to RAD52, a protein involved in homologous recombination and DNA double-strand break repair. Ablation of RDM1 in the chicken B cell line DT40 led to a more than 3-fold increase in sensitivity to cisplatin. However, RDM1-/- cells were not hypersensitive to DNA damage caused by ionizing radiation, UV irradiation, or the alkylating agent methyl methanesulfonate. The RDM1 protein displays a nucleic acid binding domain of the RNA recognition motif (RRM) type. By using gel-shift assays and electron microscopy, we show that purified, recombinant chicken RDM1 protein interacts with single-stranded DNA as well as double-stranded DNA, on which it assembles filament-like structures. Notably, RDM1 recognizes DNA distortions induced by cisplatin-DNA adducts in vitro. Finally, human RDM1 transcripts are abundant in the testis, suggesting a possible role during spermatogenesis.

Relevance: 20.00%

Abstract:

The numerous yeast genome sequences presently available provide a rich source of information for functional as well as evolutionary genomics but unequally cover the large phylogenetic diversity of extant yeasts. We present here the complete sequence of the nuclear genome of the haploid-type strain of Kuraishia capsulata (CBS1993(T)), a nitrate-assimilating Saccharomycetales of uncertain taxonomy, isolated from tunnels of insect larvae underneath coniferous barks and characterized by its copious production of extracellular polysaccharides. The sequence is composed of seven scaffolds, one per chromosome, totaling 11.4 Mb and containing 6,029 protein-coding genes, ~13.5% of which are interrupted by introns. This GC-rich yeast genome (45.7%) appears phylogenetically related to the few other nitrate-assimilating yeasts sequenced so far, Ogataea polymorpha, O. parapolymorpha, and Dekkera bruxellensis, with which it shares a very reduced number of tRNA genes, a novel tRNA sparing strategy, and a common nitrate assimilation cluster, three features specific to this group of yeasts. Centromeres were recognized in GC-poor troughs of each scaffold. The strain bears MAT alpha genes at a single MAT locus and presents a significant degree of conservation with Saccharomyces cerevisiae genes, suggesting that it can perform sexual cycles in nature, although genes involved in meiosis were not all recognized. The complete absence of conservation of synteny between K. capsulata and any other yeast genome described so far, including the three other nitrate-assimilating species, validates the interest of this species for long-range evolutionary genomic studies among Saccharomycotina yeasts.

Relevance: 20.00%

Abstract:

Ancient asexuals have been considered to be a contradiction of the basic tenets of evolutionary theory. Barred from rearranging genetic variation by recombination, their reduced number of gene arrangements is thought to hamper their response to changing environments. For the same reason, it should be difficult for them to avoid the build-up of deleterious mutations. Several groups of taxonomically diverse organisms are thought to be ancient asexuals, although clear evidence for or against the existence of recombination events is scarce. Several methods have recently been developed for predicting recombination events by analyzing aligned sequences of a given region of DNA that all originate from one species. The methods are based on phylogenetic, substitution, and compatibility analyses. Here we present the results of analyses of sequence data from different loci studied in several groups of evolutionarily distant species that are considered to be ancient asexuals, using seven different types of analysis. The groups of organisms were the arbuscular mycorrhizal fungi (Glomales), Darwinula stevensoni (Darwinuloidea crustacean ostracods) and the bdelloid rotifers (Bdelloidea), which are thought to have been asexual for the last 400, 25-100, and 35-40 Myr, respectively. The seven different analytical methods evaluated the evolutionary relationships among haplotypes, and these methods had previously been shown to be reliable for predicting the occurrence of recombination events. Despite the different degree of genetic variation among the different groups of organisms, at least some evidence for recombination was found in all species groups. In particular, predictions of recombination events in the arbuscular mycorrhizal fungi were frequent. Predictions of recombination were also found for sequence data that have previously been used to infer the absence of recombination in bdelloid rotifers. 
Although our results have to be taken with some caution because they could signal very ancient recombination events or possibly other genetic variation of nonrecombinant origin, they suggest that some cryptic recombination events may exist in these organisms.
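As an illustration of what a compatibility-based analysis looks for, the sketch below applies the four-gamete test to pairs of biallelic sites in aligned haplotypes. This is one well-known compatibility criterion, chosen here as an assumption; the abstract does not name the seven methods actually used, and the function name and example haplotypes are hypothetical.

```python
from itertools import combinations


def incompatible_site_pairs(haplotypes):
    """Return pairs of biallelic sites at which all four gametes are observed.

    Under an infinite-sites model, seeing all four allele combinations at two
    sites implies either recombination or a recurrent mutation between them.
    """
    n_sites = len(haplotypes[0])
    if any(len(h) != n_sites for h in haplotypes):
        raise ValueError("haplotypes must be aligned to the same length")
    # Only sites with exactly two alleles are informative for this test.
    biallelic = [i for i in range(n_sites)
                 if len({h[i] for h in haplotypes}) == 2]
    pairs = []
    for i, j in combinations(biallelic, 2):
        gametes = {(h[i], h[j]) for h in haplotypes}
        if len(gametes) == 4:
            pairs.append((i, j))
    return pairs


if __name__ == "__main__":
    # Hypothetical two-site haplotypes showing all four combinations.
    print(incompatible_site_pairs(["AA", "AT", "TA", "TT"]))
```

Because recurrent mutation can also produce all four gametes, a positive result is suggestive rather than conclusive, which matches the caution expressed in the abstract.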

Relevance: 20.00%

Abstract:

The introduction of next-generation sequencing technologies is set to revolutionize modern medicine. These new tools have already contributed to the discovery of new genes and cellular pathways involved in the pathology of rare and common genetic diseases. However, the enormous amount of data generated by these systems, together with the complexity of the bioinformatic analyses required, creates a bottleneck in solving the most difficult cases. The objective of this thesis was to identify the genetic causes of two hereditary diseases using these new sequencing techniques coupled with gene enrichment technologies. To this end, we developed our own workflow (pipeline) for aligning sequence fragments (reads). Following the identification of genes, we carried out a functional analysis to elucidate their role in the disease. First, we studied and identified mutations involved in a recessive form of retinitis pigmentosa, which is to date the most frequent hereditary retinal degeneration. In particular, we found that missense mutations in the FAM161A gene were the cause of the retinitis pigmentosa previously associated with the RP28 locus. In addition, we showed that this gene has functions in the photoreceptor cilium, extending the broad spectrum of hereditary retinal ciliopathies. Second, we explored the possibility that a recurrent fever syndrome that is relatively frequent in pediatrics, called PFAPA (an acronym for periodic fever with aphthous stomatitis, pharyngitis and cervical adenitis), might have a genetic origin. As the etiology of this disease is unclear, we attempted to identify the genetic spectrum of PFAPA patients.
Since we could not uncover a single new mutated gene responsible for the disease in all the individuals screened, it appears that a more complex genetic model, suggesting the involvement of several genes in the pathology, was identified in affected patients. These genes appear to be involved notably in inflammation-related processes, which would extend the impact of these studies to other autoinflammatory diseases.

Relevance: 20.00%

Abstract:

Environmental and depositional changes across the Late Cenomanian oceanic anoxic event (OAE2) in the Sinai, Egypt, are examined based on biostratigraphy, mineralogy, δ¹³C values and phosphorus analyses. Comparison with the Pueblo, Colorado, stratotype section reveals the Whadi El Ghaib section as stratigraphically complete across the late Cenomanian-early Turonian. Foraminifera are dominated by high-stress planktic and benthic assemblages characterized by low diversity, low-oxygen and low-salinity tolerant species, which mark shallow-water oceanic dysoxic conditions during OAE2. Oyster biostromes suggest deposition occurred in less than 50 m depths in low-oxygen, brackish, and nutrient-rich waters. Their demise prior to the peak δ¹³C excursion is likely due to a rising sea level. Characteristic OAE2 anoxic conditions reached this coastal region only at the end of the δ¹³C plateau in deeper waters near the end of the Cenomanian. Increased phosphorus accumulations before and after the δ¹³C excursion suggest higher oxic conditions and increased detrital input. Bulk-rock and clay mineralogy indicate humid climate conditions, increased continental runoff and a rising sea up to the first δ¹³C peak. Above this interval, a drier and seasonally well-contrasted climate with intermittently dry conditions prevailed. These results reveal the globally synchronous δ¹³C shift, but delayed effects of OAE2 dependent on water depth.

Relevance: 20.00%

Abstract:

The complete amino acid sequence of mature C8 beta has been derived from the DNA sequence of a cDNA clone identified by expression screening of a human liver cDNA library. Comparison with the amino acid sequence of C9 shows an overall homology with few deletions and insertions. In particular, the cysteine-rich domains and membrane-inserting regions of C9 are well conserved. These findings are discussed in relation to a possible mechanism of membrane attack complex formation.

Relevance: 20.00%

Abstract:

UniPathway (http://www.unipathway.org) is a fully manually curated resource for the representation and annotation of metabolic pathways. UniPathway provides explicit representations of enzyme-catalyzed and spontaneous chemical reactions, as well as a hierarchical representation of metabolic pathways. This hierarchy uses linear subpathways as the basic building block for the assembly of larger and more complex pathways, including species-specific pathway variants. All of the pathway data in UniPathway has been extensively cross-linked to existing pathway resources such as KEGG and MetaCyc, as well as sequence resources such as the UniProt KnowledgeBase (UniProtKB), for which UniPathway provides a controlled vocabulary for pathway annotation. We introduce here the basic concepts underlying the UniPathway resource, with the aim of allowing users to fully exploit the information provided by UniPathway.

Relevance: 20.00%

Abstract:

This article introduces a new interface for T-Coffee, a consistency-based multiple sequence alignment program. This interface provides easy and intuitive access to the most popular functionality of the package. This includes the default T-Coffee mode for protein and nucleic acid sequences, the M-Coffee mode that allows combining the output of any other aligners, and template-based modes of T-Coffee that deliver high-accuracy alignments while using structural or homology-derived templates. The three available template modes are Expresso for the alignment of proteins with a known 3D structure, R-Coffee to align RNA sequences with conserved secondary structures, and PSI-Coffee to accurately align distantly related sequences using homology extension. The new server benefits from recent improvements of the T-Coffee algorithm and can align up to 150 sequences as long as 10 000 residues. It is available from both http://www.tcoffee.org and its main mirror http://tcoffee.crg.cat.

Relevance: 20.00%

Abstract:

DnaSP, DNA Sequence Polymorphism, is a software package for the analysis of nucleotide polymorphism from aligned DNA sequence data. DnaSP can estimate several measures of DNA sequence variation within and between populations (in noncoding, synonymous or nonsynonymous sites, or in various sorts of codon positions), as well as linkage disequilibrium, recombination, gene flow and gene conversion parameters. DnaSP can also carry out several tests of neutrality: Hudson, Kreitman and Aguadé (1987), Tajima (1989), McDonald and Kreitman (1991), Fu and Li (1993), and Fu (1997) tests. Additionally, DnaSP can estimate the confidence intervals of some test statistics by the coalescent. The results of the analyses are displayed in tabular and graphic form.
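For readers unfamiliar with the statistics listed above, the following is a minimal sketch of one of them, Tajima's D (Tajima 1989), computed from aligned sequences. It is not DnaSP's implementation: it assumes gap-free sequences of equal length, counts every polymorphic column as one segregating site, and uses illustrative function names and example data.

```python
import math
from itertools import combinations


def segregating_sites(seqs):
    """Number of alignment columns with more than one nucleotide."""
    return sum(1 for column in zip(*seqs) if len(set(column)) > 1)


def mean_pairwise_differences(seqs):
    """Average number of differences over all pairs of sequences."""
    pairs = list(combinations(seqs, 2))
    diffs = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs)
    return diffs / len(pairs)


def tajimas_d(seqs):
    """Tajima's D for aligned, gap-free sequences of equal length."""
    n = len(seqs)
    s = segregating_sites(seqs)
    if n < 4 or s == 0:
        # The variance term is zero for n < 4, and D is undefined without variation.
        raise ValueError("need at least 4 sequences and 1 segregating site")
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / (i * i) for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / (a1 * a1)
    e1 = c1 / a1
    e2 = c2 / (a1 * a1 + a2)
    # Difference between the two estimators of theta, scaled by its standard deviation.
    numerator = mean_pairwise_differences(seqs) - s / a1
    return numerator / math.sqrt(e1 * s + e2 * s * (s - 1))


if __name__ == "__main__":
    sample = ["ACGT", "ACGA", "ACGT", "TCGT"]  # hypothetical alignment
    print(tajimas_d(sample))
```

In the example alignment both polymorphisms are singletons, so the mean pairwise difference falls below the value expected from the number of segregating sites and D is negative.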