90 results for Short homologous sequences
in Consorci de Serveis Universitaris de Catalunya (CSUC), Spain
Abstract:
One of the first useful products from the human genome will be a set of predicted genes. Besides its intrinsic scientific interest, the accuracy and completeness of this data set is of considerable importance for human health and medicine. Though progress has been made on computational gene identification in terms of both methods and accuracy evaluation measures, most of the sequence sets in which the programs are tested are short genomic sequences, and there is concern that these accuracy measures may not extrapolate well to larger, more challenging data sets. Given the absence of experimentally verified large genomic data sets, we constructed a semiartificial test set comprising a number of short single-gene genomic sequences with randomly generated intergenic regions. This test set, which should still present an easier problem than real human genomic sequence, mimics the approximately 200kb long BACs being sequenced. In our experiments with these longer genomic sequences, the accuracy of GENSCAN, one of the most accurate ab initio gene prediction programs, dropped significantly, although its sensitivity remained high. Conversely, the accuracy of similarity-based programs, such as GENEWISE, PROCRUSTES, and BLASTX was not affected significantly by the presence of random intergenic sequence, but depended on the strength of the similarity to the protein homolog. As expected, the accuracy dropped if the models were built using more distant homologs, and we were able to quantitatively estimate this decline. However, the specificities of these techniques are still rather good even when the similarity is weak, which is a desirable characteristic for driving expensive follow-up experiments. Our experiments suggest that though gene prediction will improve with every new protein that is discovered and through improvements in the current set of tools, we still have a long way to go before we can decipher the precise exonic structure of every gene in the human genome using purely computational methodology.
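A minimal sketch of how such a semiartificial contig could be assembled (single-gene genomic sequences concatenated with randomly generated intergenic spacers up to a BAC-like length) is shown below; the function names, spacer lengths, and uniform base composition are assumptions for illustration, not the construction used in the study.

```python
import random

def random_intergenic(length, gc=0.41):
    """Generate a random intergenic spacer; the GC content used here
    (roughly human-genome-like) is an assumption for illustration."""
    weights = [(1 - gc) / 2, gc / 2, gc / 2, (1 - gc) / 2]  # A, C, G, T
    return "".join(random.choices("ACGT", weights=weights, k=length))

def build_semiartificial_contig(gene_seqs, target_length=200_000, seed=0):
    """Concatenate single-gene genomic sequences separated by random
    intergenic spacers until the contig reaches ~target_length bases,
    mimicking a 200-kb BAC. Returns the contig and the gene coordinates."""
    random.seed(seed)
    contig, coords = [], []
    pos = 0
    for gene in gene_seqs:
        spacer = random_intergenic(random.randint(5_000, 40_000))
        contig.append(spacer)
        pos += len(spacer)
        coords.append((pos, pos + len(gene)))  # gene start/end in the contig
        contig.append(gene)
        pos += len(gene)
        if pos >= target_length:
            break
    return "".join(contig), coords

# Toy sequences standing in for real single-gene entries:
genes = ["ATGGCC" * 500, "ATGAAA" * 400]
contig, coords = build_semiartificial_contig(genes, target_length=50_000)
print(len(contig), coords)
```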
Abstract:
Background: A number of studies have used protein interaction data alone for protein function prediction. Here, we introduce a computational approach for the annotation of enzymes, based on the observation that similar protein sequences are more likely to perform the same function if they share similar interacting partners. Results: The method has been tested against the PSI-BLAST program using a set of 3,890 protein sequences for which interaction data were available. For protein sequences that align with at least 40% sequence identity to a known enzyme, the specificity of our method in predicting the first three EC digits increased from 80% to 90% at 80% coverage when compared to PSI-BLAST. Conclusion: Our method can also be applied to proteins for which homologous sequences with known interacting partners can be detected. Thus, our method could increase by 10% the specificity of genome-wide enzyme predictions based on sequence matching by PSI-BLAST alone.
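The intuition (a sequence hit is trusted more when the query and the hit also share interacting partners) might be sketched as follows; the scoring scheme, thresholds, and data layout are illustrative assumptions, not the published method.

```python
def partner_overlap(partners_query, partners_hit):
    """Jaccard overlap between the interaction partners of two proteins."""
    a, b = set(partners_query), set(partners_hit)
    return len(a & b) / len(a | b) if (a or b) else 0.0

def predict_ec(query, hits, interactions, min_identity=40.0, min_overlap=0.2):
    """Rank candidate enzymes by sequence identity, then keep only those
    whose interaction partners overlap with the query's partners.
    `hits` is a list of (hit_id, percent_identity, ec_first3) tuples,
    e.g. parsed from PSI-BLAST output; `interactions` maps a protein id
    to its set of interacting partners."""
    candidates = [h for h in hits if h[1] >= min_identity]
    candidates.sort(key=lambda h: h[1], reverse=True)
    for hit_id, identity, ec in candidates:
        overlap = partner_overlap(interactions.get(query, ()),
                                  interactions.get(hit_id, ()))
        if overlap >= min_overlap:
            return ec, identity, overlap
    return None  # no confident prediction

# Toy example: the best-scoring hit shares no partners, so the second wins.
interactions = {"P1": {"A", "B", "C"}, "H1": {"A", "B"}, "H2": {"X"}}
hits = [("H2", 55.0, "2.7.1"), ("H1", 48.0, "1.1.1")]
print(predict_ec("P1", hits, interactions))  # -> ('1.1.1', 48.0, 0.666...)
```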
Abstract:
The goals of the human genome project did not include sequencing of the heterochromatic regions. We describe here an initial sequence of 1.1 Mb of the short arm of human chromosome 21 (HSA21p), estimated to be 10% of 21p. This region contains extensive euchromatic-like sequence and includes on average one transcript every 100 kb. These transcripts show multiple inter- and intrachromosomal copies, and extensive copy number and sequence variability. The sequencing of the "heterochromatic" regions of the human genome is likely to reveal many additional functional elements and provide important evolutionary information.
Abstract:
Conventional methods of gene prediction rely on the recognition of DNA-sequence signals, the coding potential or the comparison of a genomic sequence with a cDNA, EST, or protein database. Reasons for limited accuracy in many circumstances are species-specific training and the incompleteness of reference databases. Lately, comparative genome analysis has attracted increasing attention. Several analysis tools that are based on human/mouse comparisons are already available. Here, we present a program for the prediction of protein-coding genes, termed SGP-1 (Syntenic Gene Prediction), which is based on the similarity of homologous genomic sequences. In contrast to most existing tools, the accuracy of SGP-1 depends little on species-specific properties such as codon usage or the nucleotide distribution. SGP-1 may therefore be applied to nonstandard model organisms in vertebrates as well as in plants, without the need for extensive parameter training. In addition to predicting genes in large-scale genomic sequences, the program may be useful to validate gene structure annotations from databases. To this end, SGP-1 output also contains comparisons between predicted and annotated gene structures in HTML format. The program can be accessed via a Web server at http://soft.ice.mpg.de/sgp-1. The source code, written in ANSI C, is available on request from the authors.
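The predicted-versus-annotated comparison mentioned at the end of the abstract can be illustrated with a simple exon-level accuracy computation; the coordinate representation and function name below are assumptions for illustration and do not reflect SGP-1's actual output format.

```python
def exon_level_accuracy(predicted, annotated):
    """Exon-level sensitivity and specificity: an exon counts as correct
    only if both boundaries match an annotated exon exactly.
    Exons are (start, end) tuples on the same sequence and strand."""
    pred, annot = set(predicted), set(annotated)
    correct = len(pred & annot)
    sensitivity = correct / len(annot) if annot else 0.0
    specificity = correct / len(pred) if pred else 0.0
    return sensitivity, specificity

# Toy example: two of three annotated exons are predicted exactly.
annotated = [(100, 250), (400, 520), (700, 810)]
predicted = [(100, 250), (400, 520), (705, 810), (900, 950)]
print(exon_level_accuracy(predicted, annotated))  # -> (0.666..., 0.5)
```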
Abstract:
Previous genetic studies have demonstrated that natal homing shapes the stock structure of marine turtle nesting populations. However, widespread sharing of common haplotypes based on short segments of the mitochondrial control region often limits resolution of the demographic connectivity of populations. Recent studies employing longer control region sequences to resolve haplotype sharing have focused on regional assessments of genetic structure and phylogeography. Here we synthesize available control region sequences for loggerhead turtles from the Mediterranean Sea, Atlantic, and western Indian Ocean basins. These data represent six of the nine globally significant regional management units (RMUs) for the species and include novel sequence data from Brazil, Cape Verde, South Africa and Oman. Genetic tests of differentiation among 42 rookeries represented by short sequences (380 bp haplotypes from 3,486 samples) and 40 rookeries represented by long sequences (~800 bp haplotypes from 3,434 samples) supported the distinction of the six RMUs analyzed as well as recognition of at least 18 demographically independent management units (MUs) with respect to female natal homing. A total of 59 haplotypes were resolved. These haplotypes belonged to two highly divergent global lineages, with haplogroup I represented primarily by CC-A1, CC-A4, and CC-A11 variants and haplogroup II represented by CC-A2 and derived variants. Geographic distribution patterns of haplogroup II haplotypes and the nested position of CC-A11.6 from Oman among the Atlantic haplotypes invoke recent colonization of the Indian Ocean from the Atlantic for both global lineages. The haplotypes we confirmed for western Indian Ocean RMUs allow reinterpretation of previous mixed stock analysis and further suggest that contemporary migratory connectivity between the Indian and Atlantic Oceans occurs on a broader scale than previously hypothesized. This study represents a valuable model for conducting comprehensive international cooperative data management and research in marine ecology.
Abstract:
Much of the research on industry dynamics focuses on the interdependence between the sectorial rates of entry and exit. This paper argues that the size of firms and the reaction-adjustment period are important conditions missed in this literature. I illustrate the effects of this omission using data from the Spanish manufacturing industries between 1994 and 2001. Estimates from systems of equations models provide evidence of a conical revolving door phenomenon and of partial adjustments in the replacement-displacement of large firms. KEYWORDS: aggregation, industry dynamics, panel data, symmetry, simultaneity. JEL CLASSIFICATION: C33, C52, L60, L11
Abstract:
This paper explores real exchange rate behavior in Mexico from 1960 to 2005. Since the empirical analysis reveals that the real exchange rate is not mean reverting, we propose that economic fundamentals affect its evolution in the long run. Therefore, based on equilibrium exchange rate paradigms, we propose a simple model of real exchange rate determination that includes relative labor productivity, real interest rates, and net foreign assets over a long period of time. Our analysis also considers the dynamic adjustment in response to shocks through impulse response functions derived from the multivariate VAR model.
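A minimal sketch of this empirical strategy (a multivariate VAR on the real exchange rate and its fundamentals, followed by impulse response functions) could look like the following, using statsmodels; the variable names and synthetic data are placeholders, not the paper's dataset.

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.api import VAR

# Synthetic annual data standing in for 1960-2005 observations of the
# real exchange rate (rer), relative labor productivity (prod),
# the real interest rate differential (rir) and net foreign assets (nfa).
rng = np.random.default_rng(0)
n = 46
data = pd.DataFrame({
    "rer": np.cumsum(rng.normal(size=n)),
    "prod": np.cumsum(rng.normal(size=n)),
    "rir": rng.normal(size=n),
    "nfa": np.cumsum(rng.normal(size=n)),
})

model = VAR(data)
results = model.fit(maxlags=2, ic="aic")   # lag order chosen by AIC
irf = results.irf(10)                      # impulse responses, 10 periods ahead
print(results.summary())
irf.plot(orth=True)                        # orthogonalized IRFs (requires matplotlib)
```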
Abstract:
One of the main implications of the efficient market hypothesis (EMH) is that expected future returns on financial assets are not predictable if investors are risk neutral. In this paper we argue that financial time series offer more information than this hypothesis seems to suggest. In particular, we postulate that runs of very large returns can be predictable over small time periods. To show this, we propose a TAR(3,1)-GARCH(1,1) model that is able to describe two different types of extreme events: a first type generated by large-uncertainty regimes, where runs of extremes are not predictable, and a second type where extremes come from isolated dread/joy events. This model is new in the literature on nonlinear processes. Its novelty resides in two features that set it apart from previous TAR methodologies: the regimes are motivated by the occurrence of extreme values, and the threshold variable is defined by the shock affecting the process in the preceding period. In this way the model is able to uncover dependence and clustering of extremes in high- as well as low-volatility periods. The model is tested with data on General Motors stock prices corresponding to two crises that had a substantial impact on financial markets worldwide: the Black Monday of October 1987 and September 11th, 2001. By analyzing the periods around these crises we find evidence of statistical significance of our model, and thereby of predictability of extremes, for September 11th but not for Black Monday. These findings support the hypotheses of a big negative event producing runs of negative returns in the first case, and of the bursting of a worldwide stock market bubble in the second. JEL classification: C12; C15; C22; C51. Keywords and phrases: asymmetries, crises, extreme values, hypothesis testing, leverage effect, nonlinearities, threshold models
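To illustrate the kind of process described, the sketch below simulates a three-regime threshold autoregressive model of order one whose regime is selected by the previous period's shock, with GARCH(1,1) conditional variance; all parameter values and thresholds are invented for the illustration and are not the estimates reported in the paper.

```python
import numpy as np

def simulate_tar3_garch11(n=1000, seed=0):
    """Simulate a TAR(3,1) process with GARCH(1,1) errors.
    The regime at time t is chosen by the previous shock eps[t-1]:
      eps[t-1] <  r1        -> regime 1
      r1 <= eps[t-1] <= r2  -> regime 2
      eps[t-1] >  r2        -> regime 3
    Parameter values below are illustrative only."""
    rng = np.random.default_rng(seed)
    r1, r2 = -1.5, 1.5                      # thresholds on the lagged shock
    phi0 = np.array([-0.3, 0.0, 0.3])       # regime intercepts
    phi1 = np.array([0.6, 0.1, 0.6])        # regime AR(1) coefficients
    omega, alpha, beta = 0.05, 0.08, 0.90   # GARCH(1,1) parameters

    y = np.zeros(n)
    eps = np.zeros(n)
    h = np.full(n, omega / (1 - alpha - beta))  # start at unconditional variance
    for t in range(1, n):
        regime = 0 if eps[t - 1] < r1 else (2 if eps[t - 1] > r2 else 1)
        h[t] = omega + alpha * eps[t - 1] ** 2 + beta * h[t - 1]
        eps[t] = np.sqrt(h[t]) * rng.standard_normal()
        y[t] = phi0[regime] + phi1[regime] * y[t - 1] + eps[t]
    return y, eps, h

y, eps, h = simulate_tar3_garch11()
print(y[:5], h[:5])
```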
Abstract:
Based on Lucas functions, improved versions of the Diffie-Hellman key distribution scheme and of the ElGamal public key cryptosystem are proposed, together with an implementation and an analysis of their computational cost. The security relies on the difficulty of factoring an RSA integer and on the difficulty of computing the discrete logarithm.
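A minimal sketch of a Diffie-Hellman-style exchange built on Lucas functions (V_n(P, 1) computed modulo an RSA-type integer N, using the identity V_a(V_b(P)) = V_ab(P) mod N) is given below; the toy parameters, and the omission of parameter validation and key derivation, make this an illustration of the idea rather than the scheme proposed in the paper.

```python
def lucas_v(n, P, N):
    """Compute V_n(P, 1) mod N with the Lucas ladder (Q = 1):
       V_{2k}   = V_k^2 - 2        (mod N)
       V_{2k+1} = V_k*V_{k+1} - P  (mod N)"""
    vk, vk1 = 2 % N, P % N          # V_0 = 2, V_1 = P
    for bit in bin(n)[2:]:          # scan exponent bits, most significant first
        if bit == "1":
            vk, vk1 = (vk * vk1 - P) % N, (vk1 * vk1 - 2) % N
        else:
            vk, vk1 = (vk * vk - 2) % N, (vk * vk1 - P) % N
    return vk

# Toy key exchange (values far too small to be secure):
N = 23 * 47          # RSA-type modulus; factoring N should be hard in practice
P = 5                # public seed
a, b = 123, 456      # private exponents of the two parties
A = lucas_v(a, P, N) # sent by the first party
B = lucas_v(b, P, N) # sent by the second party
shared_1 = lucas_v(a, B, N)
shared_2 = lucas_v(b, A, N)
assert shared_1 == shared_2 == lucas_v(a * b, P, N)
print(shared_1)
```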
Abstract:
"Vegeu el resum a l'inici del document del fitxer adjunt."
Abstract:
Based on third-order linear sequences, improved versions of the Diffie-Hellman key distribution scheme and of the ElGamal public key cryptosystem are proposed, together with an implementation and an analysis of their computational cost. The security relies on the difficulty of factoring an RSA integer and on the difficulty of computing the discrete logarithm.
Abstract:
We consider a market where firms hire workers to run their projects and such projects differ in profitability. In any period, each firm needs two workers to successfully run its project: a junior agent, with no specific skills, and a senior worker, whose effort is not verifiable. Senior workers differ in ability, and their competence is revealed after they have worked as juniors in the market. We study the length of the contractual relationships between firms and workers in an environment where the matching between firms and workers is the result of market interaction. We show that, although in a one-firm-one-worker set-up long-term contracts are the optimal choice for firms, market forces often induce firms to use short-term contracts. Unless the market consists only of firms with very profitable projects, firms operating highly profitable projects offer short-term contracts to secure the services of high-ability workers, while those with less lucrative projects also use short-term contracts to save on junior workers' wages. Intermediate firms may (or may not) hire workers through long-term contracts.
Abstract:
Research project carried out during a stay at Stanford University, USA, between 2007 and 2009. This project is based on 1) the synthesis of RNA strands aimed at inhibiting gene expression through an RNA interference mechanism (siRNAs, or short interfering RNAs) and 2) the evaluation of the in vitro activity of these oligonucleotides in cell cultures. Specifically, my research focused mainly on the study of siRNA strands modified with 5-methyl and 5-propynyl pyrimidine nucleobases. The aim was to evaluate the effect that steric factors in the major groove of the siRNAs exert on their biological activity. To this end, I carried out the synthesis of phosphoramidites of pyrimidine nucleosides modified at the C-5 position of the nucleobase. I then incorporated these nucleoside units into RNA strands using a DNA/RNA synthesizer and studied the stability of the corresponding RNA duplexes through thermal denaturation experiments. Finally, I carried out gene expression inhibition experiments in HeLa cells to evaluate the biological activity of these modified siRNAs. The results of these studies showed that the presence of bulky groups such as propynyl at the 5' end of the siRNA duplex (defined by the guide or antisense strand) has a strongly negative influence on its biological activity. In contrast, less bulky groups such as methyl have a positive influence, so that some of the synthesized strands turned out to be more active than the corresponding natural siRNAs (wild-type siRNAs). Moreover, this type of modification contributes positively to the stability of siRNA strands in human serum. This work has been published (Terrazas, M.; Kool, E.T. "Major Groove Modifications Improve siRNA Stability and Biological Activity" Nucleic Acids Res. 2009, in press).
Abstract:
Smarthistory.org is a proven, sustainable model for open educational resources in the Humanities. We discuss lessons learned during its agile development. Smarthistory.org is a free, Creative Commons-licensed, multimedia web-book designed as a dynamic enhancement of, or substitute for, the traditional art history textbook. It uses conversation instead of the impersonal voice of the typical textbook in order to reveal disagreement, emotion, and the experience of looking. The listener remains engaged with both the content and the interaction of the speakers. These conversations model close looking and a willingness to encounter and engage the unfamiliar. Smarthistory takes the inherently dialogic and multimedia nature of the web and uses it as a pedagogical method. This extendable Humanities framework uses an open-source content management system, making Smarthistory inexpensive to create and easy to manage and update. Its chronological timeline/chapter-based format integrates new contributions into a single historical framework, a structure applicable across the Humanities.