72 resultados para Sequence variability


Relevância:

20.00% 20.00%

Publicador:

Resumo:

The construction of metagenomic libraries has permitted the study of microorganisms resistant to isolation and the analysis of 16S rDNA sequences has been used for over two decades to examine bacterial biodiversity. Here, we show that the analysis of random sequence reads (RSRs) instead of 16S is a suitable shortcut to estimate the biodiversity of a bacterial community from metagenomic libraries. We generated 10,010 RSRs from a metagenomic library of microorganisms found in human faecal samples. Then searched them using the program BLASTN against a prokaryotic sequence database to assign a taxon to each RSR. The results were compared with those obtained by screening and analysing the clones containing 16S rDNA sequences in the whole library. We found that the biodiversity observed by RSR analysis is consistent with that obtained by 16S rDNA. We also show that RSRs are suitable to compare the biodiversity between different metagenomic libraries. RSRs can thus provide a good estimate of the biodiversity of a metagenomic library and, as an alternative to 16S, this approach is both faster and cheaper.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

A number of experimental methods have been reported for estimating the number of genes in a genome, or the closely related coding density of a genome, defined as the fraction of base pairs in codons. Recently, DNA sequence data representative of the genome as a whole have become available for several organisms, making the problem of estimating coding density amenable to sequence analytic methods. Estimates of coding density for a single genome vary widely, so that methods with characterized error bounds have become increasingly desirable. We present a method to estimate the protein coding density in a corpus of DNA sequence data, in which a ‘coding statistic’ is calculated for a large number of windows of the sequence under study, and the distribution of the statistic is decomposed into two normal distributions, assumed to be the distributions of the coding statistic in the coding and noncoding fractions of the sequence windows. The accuracy of the method is evaluated using known data and application is made to the yeast chromosome III sequence and to C.elegans cosmid sequences. It can also be applied to fragmentary data, for example a collection of short sequences determined in the course of STS mapping.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Background: Single nucleotide polymorphisms (SNPs) are the most frequent type of sequence variation between individuals, and represent a promising tool for finding genetic determinants of complex diseases and understanding the differences in drug response. In this regard, it is of particular interest to study the effect of non-synonymous SNPs in the context of biological networks such as cell signalling pathways. UniProt provides curated information about the functional and phenotypic effects of sequence variation, including SNPs, as well as on mutations of protein sequences. However, no strategy has been developed to integrate this information with biological networks, with the ultimate goal of studying the impact of the functional effect of SNPs in the structure and dynamics of biological networks. Results: First, we identified the different challenges posed by the integration of the phenotypic effect of sequence variants and mutations with biological networks. Second, we developed a strategy for the combination of data extracted from public resources, such as UniProt, NCBI dbSNP, Reactome and BioModels. We generated attribute files containing phenotypic and genotypic annotations to the nodes of biological networks, which can be imported into network visualization tools such as Cytoscape. These resources allow the mapping and visualization of mutations and natural variations of human proteins and their phenotypic effect on biological networks (e.g. signalling pathways, protein-protein interaction networks, dynamic models). Finally, an example on the use of the sequence variation data in the dynamics of a network model is presented. Conclusion: In this paper we present a general strategy for the integration of pathway and sequence variation data for visualization, analysis and modelling purposes, including the study of the functional impact of protein sequence variations on the dynamics of signalling pathways. This is of particular interest when the SNP or mutation is known to be associated to disease. We expect that this approach will help in the study of the functional impact of disease-associated SNPs on the behaviour of cell signalling pathways, which ultimately will lead to a better understanding of the mechanisms underlying complex diseases.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Background: A number of studies have used protein interaction data alone for protein function prediction. Here, we introduce a computational approach for annotation of enzymes, based on the observation that similar protein sequences are more likely to perform the same function if they share similar interacting partners. Results: The method has been tested against the PSI-BLAST program using a set of 3,890 protein sequences from which interaction data was available. For protein sequences that align with at least 40% sequence identity to a known enzyme, the specificity of our method in predicting the first three EC digits increased from 80% to 90% at 80% coverage when compared to PSI-BLAST. Conclusion: Our method can also be used in proteins for which homologous sequences with known interacting partners can be detected. Thus, our method could increase 10% the specificity of genome-wide enzyme predictions based on sequence matching by PSI-BLAST alone.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Background: Single Nucleotide Polymorphisms, among other type of sequence variants, constitute key elements in genetic epidemiology and pharmacogenomics. While sequence data about genetic variation is found at databases such as dbSNP, clues about the functional and phenotypic consequences of the variations are generally found in biomedical literature. The identification of the relevant documents and the extraction of the information from them are hampered by the large size of literature databases and the lack of widely accepted standard notation for biomedical entities. Thus, automatic systems for the identification of citations of allelic variants of genes in biomedical texts are required. Results: Our group has previously reported the development of OSIRIS, a system aimed at the retrieval of literature about allelic variants of genes http://ibi.imim.es/osirisform.html. Here we describe the development of a new version of OSIRIS (OSIRISv1.2, http://ibi.imim.es/OSIRISv1.2.html webcite) which incorporates a new entity recognition module and is built on top of a local mirror of the MEDLINE collection and HgenetInfoDB: a database that collects data on human gene sequence variations. The new entity recognition module is based on a pattern-based search algorithm for the identification of variation terms in the texts and their mapping to dbSNP identifiers. The performance of OSIRISv1.2 was evaluated on a manually annotated corpus, resulting in 99% precision, 82% recall, and an F-score of 0.89. As an example, the application of the system for collecting literature citations for the allelic variants of genes related to the diseases intracranial aneurysm and breast cancer is presented. Conclusion: OSIRISv1.2 can be used to link literature references to dbSNP database entries with high accuracy, and therefore is suitable for collecting current knowledge on gene sequence variations and supporting the functional annotation of variation databases. The application of OSIRISv1.2 in combination with controlled vocabularies like MeSH provides a way to identify associations of biomedical interest, such as those that relate SNPs with diseases.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In today’s competitive markets, the importance of goodscheduling strategies in manufacturing companies lead to theneed of developing efficient methods to solve complexscheduling problems.In this paper, we studied two production scheduling problemswith sequence-dependent setups times. The setup times areone of the most common complications in scheduling problems,and are usually associated with cleaning operations andchanging tools and shapes in machines.The first problem considered is a single-machine schedulingwith release dates, sequence-dependent setup times anddelivery times. The performance measure is the maximumlateness.The second problem is a job-shop scheduling problem withsequence-dependent setup times where the objective is tominimize the makespan.We present several priority dispatching rules for bothproblems, followed by a study of their performance. Finally,conclusions and directions of future research are presented.

Relevância:

20.00% 20.00%

Publicador:

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This paper presents a general equilibrium model of money demand wherethe velocity of money changes in response to endogenous fluctuations in the interest rate. The parameter space can be divided into two subsets: one where velocity is constant and equal to one as in cash-in-advance models, and another one where velocity fluctuates as in Baumol (1952). Despite its simplicity, in terms of paramaters to calibrate, the model performs surprisingly well. In particular, it approximates the variability of money velocity observed in the U.S. for the post-war period. The model is then used to analyze the welfare costs of inflation under uncertainty. This application calculates the errors derived from computing the costs of inflation with deterministic models. It turns out that the size of this difference is small, at least for the levels of uncertainty estimated for the U.S. economy.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Descripció de la seqüència estratigràfica i dels registres paleoambientals dels sediments holocens de Sant Julià de Boada

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Cork is the bark of the cork oak tree (Quercus suber L), a renewable and biodegradable raw bioresource concentrated mainly in the Mediterranean region. Development of its potential uses as a biosorbent will require the investigation of its chemical composition; such information can be of help to understand its interactions with organic pollutants. The present study investigates the summative chemical composition of three bark layers (back, cork, and belly) of five Spanish cork samples and one cork sample from Portugal. Suberin was the main component in all the samples (21.1 to 53.1%), followed by lignin (14.8 to 31%), holocellulose (2.3 to 33.6%), extractives (7.3 to 20.4%), and ash (0.4 to 3.3%). The Kruskal-Wallis test was used to determine whether the variations in chemical composition with respect to the production area and bark layers were significant. The results indicate that, with respect to the bark layer, significant differences were found only for suberin and holocellulose contents: they were higher in the belly and cork than in the back. Based on the results presented, cork is a material with a lot of potential because of its heterogeneity in chemical composition

Relevância:

20.00% 20.00%

Publicador:

Resumo:

We present Stroemgren uvby and Hbeta_ photometry for a set of 575 northern main sequence A type stars, most of them belonging to the Hipparcos Input Catalogue, with V from 5mag to 10mag and with known radial velocities. These observations enlarge the catalogue we began to compile some years ago to more than 1500 stars. Our catalogue includes kinematic and astrophysical data for each star. Our future goal is to perform an accurate analysis of the kinematical behaviour of these stars in the solar neighbourhood.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Context.LS 5039 has been observed with several X-ray instruments so far showing quite steady emission in the long term and no signatures of accretion disk. The source also presents X-ray variability at orbital timescales in flux and photon index. The system harbors an O-type main sequence star with moderate mass-loss. At present, the link between the X-rays and the stellar wind is unclear. Aims.We study the X-ray fluxes, spectra, and absorption properties of LS 5039 at apastron and periastron passages during an epoch of enhanced stellar mass-loss, and the long term evolution of the latter in connection with the X-ray fluxes. Methods.New XMM-Newton observations were performed around periastron and apastron passages in September 2005, when the stellar wind activity was apparently higher. April 2005 Chandra observations on LS 5039 were revisited. Moreover, a compilation of H EW data obtained since 1992, from which the stellar mass-loss evolution can be approximately inferred, was carried out. Results.XMM-Newton observations show higher and harder emission around apastron than around periastron. No signatures of thermal emission or a reflection iron line indicating the presence of an accretion disk are found in the spectrum, and the hydrogen column density () is compatible with being the same in both observations and consistent with the interstellar value. 2005 Chandra observations show a hard X-ray spectrum, and possibly high fluxes, although pileup effects preclude conclusive results from being obtained. The H EW shows yearly variations of 10%, and does not seem to be correlated with X-ray fluxes obtained at similar phases, unlike what is expected in the wind accretion scenario. Conclusions.2005 XMM-Newton and Chandra observations are consistent with 2003 RXTE/PCA results, namely moderate flux and spectral variability at different orbital phases. The constancy of the seems to imply that either the X-ray emitter is located at 1012 cm from the compact object, or the density in the system is 3 to 27 times smaller than that predicted by a spherical symmetric wind model. We suggest that the multiwavelength non-thermal emission of LS 5039 is related to the observed extended radio jets and is unlikely to be produced inside the binary system.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Tomato (Solanum lycopersicum) is a major crop plant and a model system for fruit development. Solanum is one of the largest angiosperm genera1 and includes annual and perennial plants from diverse habitats. Here we present a high-quality genome sequence of domesticated tomato, a draft sequence of its closest wild relative, Solanum pimpinellifolium2, and compare them to each other and to the potato genome (Solanum tuberosum). The two tomato genomes show only 0.6% nucleotide divergence and signs of recent admixture, but show more than 8% divergence from potato, with nine large and several smaller inversions. In contrast to Arabidopsis, but similar to soybean, tomato and potato small RNAs map predominantly to gene-rich chromosomal regions, including gene promoters. The Solanum lineage has experienced two consecutive genome triplications: one that is ancient and shared with rosids, and a more recent one. These triplications set the stage for the neofunctionalization of genes controlling fruit characteristics, such as colour and fleshiness.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Aphids are important agricultural pests and also biological models for studies of insect-plant interactions, symbiosis, virus vectoring, and the developmental causes of extreme phenotypic plasticity. Here we present the 464 Mb draft genome assembly of the pea aphid Acyrthosiphon pisum. This first published whole genome sequence of a basal hemimetabolous insect provides an outgroup to the multiple published genomes of holometabolous insects. Pea aphids are host-plant specialists, they can reproduce both sexually and asexually, and they have coevolved with an obligate bacterial symbiont. Here we highlight findings from whole genome analysis that may be related to these unusual biological features. These findings include discovery of extensive gene duplication in more than 2000 gene families as well as loss of evolutionarily conserved genes. Gene family expansions relative to other published genomes include genes involved in chromatin modification, miRNA synthesis, and sugar transport. Gene losses include genes central to the IMD immune pathway, selenoprotein utilization, purine salvage, and the entire urea cycle. The pea aphid genome reveals that only a limited number of genes have been acquired from bacteria; thus the reduced gene count of Buchnera does not reflect gene transfer to the host genome. The inventory of metabolic genes in the pea aphid genome suggests that there is extensive metabolite exchange between the aphid and Buchnera, including sharing of amino acid biosynthesis between the aphid and Buchnera. The pea aphid genome provides a foundation for post-genomic studies of fundamental biological questions and applied agricultural problems.