Digital Library

16 results for Dna-sequences

in Consorci de Serveis Universitaris de Catalunya (CSUC), Spain

An assessment of gene prediction accuracy in large DNA sequences

Relevance: 100.00%

Abstract:

One of the first useful products from the human genome will be a set of predicted genes. Besides its intrinsic scientific interest, the accuracy and completeness of this data set is of considerable importance for human health and medicine. Though progress has been made on computational gene identification in terms of both methods and accuracy evaluation measures, most of the sequence sets in which the programs are tested are short genomic sequences, and there is concern that these accuracy measures may not extrapolate well to larger, more challenging data sets. Given the absence of experimentally verified large genomic data sets, we constructed a semiartificial test set comprising a number of short single-gene genomic sequences with randomly generated intergenic regions. This test set, which should still present an easier problem than real human genomic sequence, mimics the approximately 200kb long BACs being sequenced. In our experiments with these longer genomic sequences, the accuracy of GENSCAN, one of the most accurate ab initio gene prediction programs, dropped significantly, although its sensitivity remained high. Conversely, the accuracy of similarity-based programs, such as GENEWISE, PROCRUSTES, and BLASTX was not affected significantly by the presence of random intergenic sequence, but depended on the strength of the similarity to the protein homolog. As expected, the accuracy dropped if the models were built using more distant homologs, and we were able to quantitatively estimate this decline. However, the specificities of these techniques are still rather good even when the similarity is weak, which is a desirable characteristic for driving expensive follow-up experiments. 
Our experiments suggest that though gene prediction will improve with every new protein that is discovered and through improvements in the current set of tools, we still have a long way to go before we can decipher the precise exonic structure of every gene in the human genome using purely computational methodology.
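
The pattern described in this abstract, where sensitivity stays high while overall accuracy drops on longer sequences, can be illustrated with a small sketch. It assumes the nucleotide-level convention often used in gene-finding evaluations, in which sensitivity is TP / (TP + FN) and "specificity" is TP / (TP + FP); the abstract does not state which definitions the authors used. The function name and the toy coordinates are hypothetical.

```python
def nucleotide_accuracy(true_coding, predicted_coding):
    """Return (sensitivity, specificity) at the nucleotide level.

    Both arguments are equal-length sequences of booleans, one per base,
    marking whether the base is coding. Specificity here follows the
    assumed gene-finding convention TP / (TP + FP).
    """
    if len(true_coding) != len(predicted_coding):
        raise ValueError("annotations must cover the same number of bases")
    pairs = list(zip(true_coding, predicted_coding))
    tp = sum(1 for t, p in pairs if t and p)       # coding, predicted coding
    fn = sum(1 for t, p in pairs if t and not p)   # coding, missed
    fp = sum(1 for t, p in pairs if p and not t)   # non-coding, predicted coding
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tp / (tp + fp) if (tp + fp) else 0.0
    return sensitivity, specificity


# Toy 40-base sequence: one real exon at positions 10..19, and a prediction
# that finds it but also calls a spurious exon in the intergenic region.
true_coding = [10 <= i < 20 for i in range(40)]
predicted = [10 <= i < 20 or 30 <= i < 35 for i in range(40)]
sn, sp = nucleotide_accuracy(true_coding, predicted)
print(f"Sn = {sn:.2f}, Sp = {sp:.2f}")  # Sn = 1.00, Sp = 0.67
```

In this toy case every true coding base is found, so sensitivity is unaffected, while the false exon in the added intergenic region lowers specificity.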

Molecular dating of caprines using ancient DNA sequences of Myotragus balearicus, an extinct endemic Balearic mammal

Relevance: 100.00%

Abstract:

Background: Myotragus balearicus was an endemic bovid from the Balearic Islands (Western Mediterranean) that became extinct around 6,000-4,000 years ago. The Myotragus evolutionary lineage became isolated in the islands most probably at the end of the Messinian crisis, when the desiccation of the Mediterranean ended, in a geological date established at 5.35 Mya. Thus, the sequences of Myotragus could be very valuable for calibrating the mammalian mitochondrial DNA clock and, in particular, the tree of the Caprinae subfamily, to which Myotragus belongs. Results: We have retrieved the complete mitochondrial cytochrome b gene (1,143 base pairs), plus fragments of the mitochondrial 12S gene and the nuclear 28S rDNA multi-copy gene from a well preserved Myotragus subfossil bone. The best resolved phylogenetic trees, obtained with the cytochrome b gene, placed Myotragus in a position basal to the Ovis group. Using the calibration provided by the isolation of Balearic Islands, we calculated that the initial radiation of caprines can be dated at 6.2 ± 0.4 Mya. In addition, alpine and southern chamois, considered until recently the same species, split around 1.6 ± 0.3 Mya, indicating that the two chamois species have been separated much longer than previously thought. Conclusion: Since there are almost no extant endemic mammals in Mediterranean islands, the sequence of the extinct Balearic endemic Myotragus has been crucial for allowing us to use the Messinian crisis calibration point for dating the caprines phylogenetic tree.

Geographic patterns of genetic variation in a broadly distributed marine vertebrate: new insights into loggerhead turtle stock structure from expanded mitochondrial DNA sequences

Relevance: 100.00%

Abstract:

Previous genetic studies have demonstrated that natal homing shapes the stock structure of marine turtle nesting populations. However, widespread sharing of common haplotypes based on short segments of the mitochondrial control region often limits resolution of the demographic connectivity of populations. Recent studies employing longer control region sequences to resolve haplotype sharing have focused on regional assessments of genetic structure and phylogeography. Here we synthesize available control region sequences for loggerhead turtles from the Mediterranean Sea, Atlantic, and western Indian Ocean basins. These data represent six of the nine globally significant regional management units (RMUs) for the species and include novel sequence data from Brazil, Cape Verde, South Africa and Oman. Genetic tests of differentiation among 42 rookeries represented by short sequences (380 bp haplotypes from 3,486 samples) and 40 rookeries represented by long sequences (~800 bp haplotypes from 3,434 samples) supported the distinction of the six RMUs analyzed as well as recognition of at least 18 demographically independent management units (MUs) with respect to female natal homing. A total of 59 haplotypes were resolved. These haplotypes belonged to two highly divergent global lineages, with haplogroup I represented primarily by CC-A1, CC-A4, and CC-A11 variants and haplogroup II represented by CC-A2 and derived variants. Geographic distribution patterns of haplogroup II haplotypes and the nested position of CC-A11.6 from Oman among the Atlantic haplotypes invoke recent colonization of the Indian Ocean from the Atlantic for both global lineages. The haplotypes we confirmed for western Indian Ocean RMUs allow reinterpretation of previous mixed stock analysis and further suggest that contemporary migratory connectivity between the Indian and Atlantic Oceans occurs on a broader scale than previously hypothesized. 
This study represents a valuable model for conducting comprehensive international cooperative data management and research in marine ecology.

Photocontrolled DNA Binding of a Receptor-Targeted Organometallic Ruthenium(II) Complex

Relevance: 70.00%

Abstract:

A photoactivated ruthenium(II) arene complex has been conjugated to two receptor-binding peptides, a dicarba analogue of octreotide and the Arg-Gly-Asp (RGD) tripeptide. These peptides can act as "tumor-targeting devices" since their receptors are overexpressed on the membranes of tumor cells. Both ruthenium-peptide conjugates are stable in aqueous solution in the dark, but upon irradiation with visible light, the pyridyl-derivatized peptides were selectively photodissociated from the ruthenium complex, as inferred by UV-vis and NMR spectroscopy. Importantly, the reactive aqua species generated from the conjugates, [(η6-p-cym)Ru(bpm)(H2O)]2+, reacted with the model DNA nucleobase 9-ethylguanine as well as with guanines of two DNA sequences, 5′dCATGGCT and 5′dAGCCATG. Interestingly, when irradiation was performed in the presence of the oligonucleotides, a new ruthenium adduct involving both guanines was formed as a consequence of the photodriven loss of p-cymene from the two monofunctional adducts. The release of the arene ligand and the formation of a ruthenated product with a multidentate binding mode might have important implications for the biological activity of such photoactivated ruthenium(II) arene complexes. Finally, photoreactions with the peptide-oligonucleotide hybrid, Phac-His-Gly-Met-linker-p5′dCATGGCT, also led to arene release and to guanine adducts, including a GG chelate. The lack of interaction with the peptide fragment confirms the preference of such organometallic ruthenium(II) complexes for guanine over other potential biological ligands, such as histidine or methionine amino acids.

Fragmentation of contaminant and endogenous DNA in ancient samples determined by shotgun sequencing; prospects for human palaeogenomics

Relevance: 70.00%

Abstract:

Despite the successful retrieval of genomes from past remains, the prospects for human palaeogenomics remain unclear because of the difficulty of distinguishing contaminant from endogenous DNA sequences. Previous sequence data generated on high-throughput sequencing platforms indicate that fragmentation of ancient DNA sequences is a characteristic trait primarily arising due to depurination processes that create abasic sites leading to DNA breaks.

Bioinformatics: a web-based graphical interface for genome comparison

Relevance: 60.00%

Abstract:

The project consists of a graphical environment for visualizing, studying, and interpreting the conservation of genetic code among different genomes. The interface allows up to eight genomes to be loaded and compared in detail, either pairwise or all at once. The plot shown in the interface represents the Maximal Unique Matchings between each pair of genomes, that is, the longest possible non-repeated matches in the DNA sequences of the species compared. The aim is to study the evolutionary changes that have arisen between different organisms, or the genes that some species share with others.
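
The Maximal Unique Matchings mentioned in this abstract can be illustrated with a minimal brute-force sketch. It is not the project's implementation: it assumes the usual definition (a shared substring that occurs exactly once in each sequence and cannot be extended in either direction), and the function names and the `min_len` parameter are hypothetical. Tools that work on whole genomes rely on suffix-based indexes rather than this quadratic scan.

```python
def count_occurrences(text, pattern):
    """Count occurrences of pattern in text, allowing overlaps."""
    count, start = 0, 0
    while True:
        idx = text.find(pattern, start)
        if idx == -1:
            return count
        count += 1
        start = idx + 1


def maximal_unique_matches(a, b, min_len=3):
    """Return (pos_in_a, pos_in_b, length) for each maximal unique match.

    Brute force, suitable only for short strings.
    """
    matches = []
    for i in range(len(a)):
        for j in range(len(b)):
            if a[i] != b[j]:
                continue
            # Skip positions that could be extended to the left.
            if i > 0 and j > 0 and a[i - 1] == b[j - 1]:
                continue
            # Extend to the right as far as the sequences agree.
            k = 0
            while i + k < len(a) and j + k < len(b) and a[i + k] == b[j + k]:
                k += 1
            if k < min_len:
                continue
            segment = a[i:i + k]
            # Keep the match only if it is unique in both sequences.
            if count_occurrences(a, segment) == 1 and count_occurrences(b, segment) == 1:
                matches.append((i, j, k))
    return matches


print(maximal_unique_matches("GATTACAGG", "CCATTACATT"))  # [(1, 2, 6)]
```

In this toy pair, "ATTACA" is reported once, while the shorter shared segment "ATT" is rejected because it occurs twice in the second sequence.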

A minimal i-motif stabilized by minor groove G:T:G:T tetrads

Relevance: 60.00%

Abstract:

The repetitive DNA sequences found at telomeres and centromeres play a crucial role in the structure and function of eukaryotic chromosomes. This role may be related to the tendency observed in many repetitive DNAs to adopt non-canonical structures. Although there is an increasing recognition of the importance of DNA quadruplexes in chromosome biology, the co-existence of different quadruplex-forming elements in the same DNA structure is still a matter of debate. Here we report the structural study of the oligonucleotide d(TCGTTTCGT) and its cyclic analog d. Both sequences form dimeric quadruplex structures consisting of a minimal i-motif capped, at both ends, by a slipped minor groove-aligned G:T:G:T tetrad. These mini i-motifs, which do not exhibit the characteristic CD spectra of other i-motif structures, can be observed at neutral pH, although they are more stable under acidic conditions. This finding is particularly relevant since these oligonucleotide sequences do not contain contiguous cytosines. Importantly, these structures resemble the loop moiety adopted by an 11-nucleotide fragment of the conserved centromeric protein B (CENP-B) box motif, which is the binding site for the CENP-B.

Population connectivity buffers genetic diversity loss in a seabird

Relevance: 60.00%

Abstract:

Background: Ancient DNA has revolutionized conservation genetic studies as it allows monitoring of the genetic variability of species through time and predicting the impact of ecosystems' threats on future population dynamics and viability. Meanwhile, the consequences of anthropogenic activities and climate change to island faunas, particularly seabirds, remain largely unknown. In this study, we examined temporal changes in the genetic diversity of a threatened seabird, the Cory's shearwater (Calonectris borealis). Findings: We analysed the mitochondrial DNA control region of ancient bone samples from the late-Holocene retrieved from the Canary archipelago (NE Atlantic) together with modern DNA sequences representative of the entire breeding range of the species. Our results show high levels of ancient genetic diversity in the Canaries comparable to that of the extant population. The temporal haplotype network further revealed rare but recurrent long-distance dispersal between ocean basins. The Bayesian demographic analyses reveal both regional and local population size expansion events, and this is in spite of the demographic decline experienced by the species over the last millennia. Conclusions: Our findings suggest that population connectivity of the species has acted as a buffer of genetic losses and illustrate the use of ancient DNA to uncover such cryptic genetic events.

Solid-phase synthesis and NMR studies of modified oligonucleotides forming triplex helix and of oligonucleopeptides mimicking the topoisomerase I-DNA covalent complex

Relevance: 30.00%

Abstract:

Report for the scientific sojourn carried out at the Institut de Biologia Molecular de Barcelona of the CSIC (state agency) from April until September 2007. Topoisomerase I is an essential nuclear enzyme that modulates the topological status of DNA, facilitating DNA helix unwinding during replication and transcription. We have prepared the oligonucleotide-peptide conjugate Ac-NLeu-Asn-Tyr(p-3’TTCAGAAGC5’)-LeuC-CONH-(CH2)6-OH as a model compound for NMR studies of the Topoisomerase I-DNA complex. Special attention was paid to the synthetic aspects of preparing this challenging compound, especially solid supports and protecting groups. The desired peptide was obtained, although we did not achieve the amount of the conjugate needed for NMR studies. Most probably the low yield is due to the intrinsic sensitivity to hydrolysis of the phosphate bond between the oligonucleotide and tyrosine. We have started the synthesis and structural characterization of oligonucleotides carrying intercalating compounds. At present we have obtained model duplex and quadruplex sequences modified with acridine, and NMR studies are underway. In addition to this project, we have successfully resolved the structure of a fusion peptide derived from the hepatitis C virus envelope, synthesized by the group of Dr. Haro, and we have synthesized and started the characterization of a modified G-quadruplex.

Charge transfer in DNA: hole charge is confined to a single base pair due to solvation effects

Relevance: 30.00%

Abstract:

We include solvation effects in tight-binding Hamiltonians for hole states in DNA. The corresponding linear-response parameters are derived from accurate estimates of solvation energy calculated for several hole charge distributions in DNA stacks. Two models are considered: (A) the correction to a diagonal Hamiltonian matrix element depends only on the charge localized on the corresponding site and (B) in addition to this term, the reaction field due to adjacent base pairs is accounted for. We show that both schemes give very similar results. The effects of the polar medium on the hole distribution in DNA are studied. We conclude that the effects of polar surroundings essentially suppress charge delocalization in DNA, and hole states in (GC)n sequences are localized on individual guanines.

Estimation of protein coding density in a corpus of DNA sequence data

Relevance: 30.00%

Abstract:

A number of experimental methods have been reported for estimating the number of genes in a genome, or the closely related coding density of a genome, defined as the fraction of base pairs in codons. Recently, DNA sequence data representative of the genome as a whole have become available for several organisms, making the problem of estimating coding density amenable to sequence analytic methods. Estimates of coding density for a single genome vary widely, so that methods with characterized error bounds have become increasingly desirable. We present a method to estimate the protein coding density in a corpus of DNA sequence data, in which a ‘coding statistic’ is calculated for a large number of windows of the sequence under study, and the distribution of the statistic is decomposed into two normal distributions, assumed to be the distributions of the coding statistic in the coding and noncoding fractions of the sequence windows. The accuracy of the method is evaluated using known data and application is made to the yeast chromosome III sequence and to C. elegans cosmid sequences. It can also be applied to fragmentary data, for example a collection of short sequences determined in the course of STS mapping.
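
The decomposition step described in this abstract can be sketched as a two-component normal mixture fitted by expectation-maximization (EM). This is only an illustration under stated assumptions: the abstract does not say which fitting procedure or which coding statistic the authors used, so EM is a swapped-in technique, the synthetic per-window values below stand in for a real statistic, and the function names are hypothetical. The sketch also assumes that coding windows score higher than noncoding ones.

```python
import math
import random


def normal_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and std dev sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def estimate_coding_fraction(values, iterations=200):
    """Fit two normals by EM and return the weight of the higher-mean one."""
    lo, hi = min(values), max(values)
    mu = [lo + 0.25 * (hi - lo), lo + 0.75 * (hi - lo)]
    spread = (hi - lo) / 4.0 or 1.0
    sigma = [spread, spread]
    weight = [0.5, 0.5]
    n = len(values)
    for _ in range(iterations):
        # E-step: probability that each window belongs to component 1.
        resp = []
        for x in values:
            p0 = weight[0] * normal_pdf(x, mu[0], sigma[0])
            p1 = weight[1] * normal_pdf(x, mu[1], sigma[1])
            total = p0 + p1
            resp.append(p1 / total if total > 0.0 else 0.5)
        # M-step: re-estimate weights, means, and standard deviations.
        n1 = max(sum(resp), 1e-12)
        n0 = max(n - n1, 1e-12)
        mu[0] = sum((1.0 - r) * x for r, x in zip(resp, values)) / n0
        mu[1] = sum(r * x for r, x in zip(resp, values)) / n1
        var0 = sum((1.0 - r) * (x - mu[0]) ** 2 for r, x in zip(resp, values)) / n0
        var1 = sum(r * (x - mu[1]) ** 2 for r, x in zip(resp, values)) / n1
        sigma = [max(math.sqrt(var0), 1e-6), max(math.sqrt(var1), 1e-6)]
        weight = [n0 / n, n1 / n]
    # Assumption: the component with the higher mean is the coding one.
    return weight[1] if mu[1] >= mu[0] else weight[0]


# Synthetic stand-in for per-window coding statistics: 600 "coding" windows
# and 1,400 "noncoding" windows, so the true coding fraction is 0.30.
random.seed(0)
windows = [random.gauss(2.0, 0.5) for _ in range(600)]
windows += [random.gauss(0.0, 0.5) for _ in range(1400)]
print(f"estimated coding fraction: {estimate_coding_fraction(windows):.3f}")
```

The returned mixing weight is the estimated fraction of windows drawn from the coding distribution. How well it tracks the true coding density depends on how cleanly the chosen statistic separates the two classes.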

Identifying protein-coding genes in genomic sequences

Relevance: 30.00%

Abstract:

The vast majority of the biology of a newly sequenced genome is inferred from the set of encoded proteins. Predicting this set is therefore invariably the first step after the completion of the genome DNA sequence. Here we review the main computational pipelines used to generate the human reference protein-coding gene sets.

Complete DNA sequence of Kuraishia capsulata illustrates novel genomic features among budding yeasts (Saccharomycotina)

Relevance: 30.00%

Abstract:

The numerous yeast genome sequences presently available provide a rich source of information for functional as well as evolutionary genomics but unequally cover the large phylogenetic diversity of extant yeasts. We present here the complete sequence of the nuclear genome of the haploid-type strain of Kuraishia capsulata (CBS1993(T)), a nitrate-assimilating Saccharomycetales of uncertain taxonomy, isolated from tunnels of insect larvae underneath coniferous barks and characterized by its copious production of extracellular polysaccharides. The sequence is composed of seven scaffolds, one per chromosome, totaling 11.4 Mb and containing 6,029 protein-coding genes, ~13.5% of which are interrupted by introns. This GC-rich yeast genome (45.7%) appears phylogenetically related to the few other nitrate-assimilating yeasts sequenced so far, Ogataea polymorpha, O. parapolymorpha, and Dekkera bruxellensis, with which it shares a very reduced number of tRNA genes, a novel tRNA sparing strategy, and a common nitrate assimilation cluster, three features specific to this group of yeasts. Centromeres were recognized in GC-poor troughs of each scaffold. The strain bears MAT alpha genes at a single MAT locus and presents a significant degree of conservation with Saccharomyces cerevisiae genes, suggesting that it can perform sexual cycles in nature, although genes involved in meiosis were not all recognized. The complete absence of conservation of synteny between K. capsulata and any other yeast genome described so far, including the three other nitrate-assimilating species, validates the interest of this species for long-range evolutionary genomic studies among Saccharomycotina yeasts.
