69 results for resolution prediction
at Consorci de Serveis Universitaris de Catalunya (CSUC), Spain
Abstract:
Background: Recent advances in high-throughput technologies have produced a vast amount of protein sequences, while the number of high-resolution structures has seen only a limited increase. This has driven the development of many strategies to build protein structures from their sequences, generating a considerable number of alternative models. Selecting the model closest to the native conformation has thus become crucial for structure prediction. Several methods have been developed to score protein models by energies, by knowledge-based potentials, or by a combination of both. Results: Here, we present and demonstrate a theory for splitting knowledge-based potentials into biologically meaningful scoring terms and for combining them into new scores to predict near-native structures. Our strategy circumvents the problem of defining the reference state. In this approach we give the proof for a simple, linear application that can be further improved by optimizing the combination of Z-scores. Using the simplest composite score () we obtained predictions similar to those of state-of-the-art methods. Moreover, our approach has the advantage of identifying the terms most relevant to the stability of the protein structure. Finally, we also use the composite Z-scores to assess the conformation of models and to detect local errors. Conclusion: We have introduced a method for splitting knowledge-based potentials and solving the problem of defining a reference state. The new scores detect near-native structures as accurately as state-of-the-art methods and have been successful in identifying wrongly modeled regions of many near-native conformations.
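As a minimal illustration of the kind of composite score described above, a linear combination of per-term Z-scores might look like the sketch below. The term names, background statistics, and equal default weighting are illustrative assumptions, not the authors' actual decomposition:

```python
def zscore(value, mean, std):
    """Standardize a raw energy term against its background distribution."""
    return (value - mean) / std

def composite_zscore(terms, background, weights=None):
    """Linearly combine per-term Z-scores into a single model score.

    terms:      {term_name: raw energy of this model for that term}
    background: {term_name: (mean, std) over a set of reference models}
    weights:    optional {term_name: weight}; equal weights by default
    """
    total = 0.0
    for name, value in terms.items():
        mean, std = background[name]
        w = 1.0 if weights is None else weights.get(name, 0.0)
        total += w * zscore(value, mean, std)
    return total

# Illustrative (hypothetical) energy terms for one candidate model
terms = {"pairwise": -120.0, "solvation": -35.0, "torsion": -10.0}
background = {"pairwise": (-100.0, 10.0),
              "solvation": (-30.0, 5.0),
              "torsion": (-8.0, 2.0)}
score = composite_zscore(terms, background)  # more negative = more native-like here
```

Optimizing the per-term weights, as the abstract suggests, would then amount to tuning the `weights` argument against a benchmark of decoy sets.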
Abstract:
Contribution to the seminar "Les Euroregions: Experiències i aprenatges per a l'Euroregió Pirineus-Mediterrània", 15-16 December 2005.
Abstract:
Research report based on a stay at the Aerospace Computational Design Laboratory at the Massachusetts Institute of Technology (MIT), United States, between November 2006 and August 2007. Aerodynamics is a branch of fluid dynamics concerned with the study of the motion of liquids and gases, whose main goal is to predict the aerodynamic forces on an aircraft or any other type of vehicle, including automobiles. The Navier-Stokes equations represent a dynamic equilibrium of the forces acting on any given region of the fluid. They are one of the most useful systems of equations because they describe the physics of a wide range of phenomena, such as ocean currents, flows around an airfoil, etc. In the context of a doctoral thesis, a viscous, incompressible flow is being studied by solving the incompressible Navier-Stokes equations efficiently. During the stay at MIT, a discontinuous Galerkin method was used to solve the incompressible Navier-Stokes equations, using either a penalty parameter to ensure the continuity of fluxes between elements or a compact discontinuous Galerkin method. Both methods have given good results, and several numerical examples have been simulated to validate the good behaviour of the methods developed. Particular elements, the Raviart-Thomas elements, have also been studied; these could be used in a mixed formulation to obtain an efficient algorithm for solving complex numerical problems.
Abstract:
Research project based on a stay at the National Oceanography Centre, Southampton (NOCS), United Kingdom, between May and July 2006. The possibility of obtaining a precise estimate of sea surface salinity (SSS) is important for investigating and predicting the extent of climate change. The Soil Moisture and Ocean Salinity (SMOS) mission was selected by the European Space Agency (ESA) to obtain sea surface salinity maps on a global scale with a short revisit time. Before the launch of SMOS, the plan is to analyse the horizontal variability of SSS and the potential of data retrieved from SMOS measurements to reproduce known oceanographic behaviour. The overall objective is to fill the gap between reliable input/auxiliary data sources and the tools developed to simulate and process data acquired with the SMOS configuration. The SMOS End-to-end Performance Simulator (SEPS) is an ad hoc simulator developed by the Universitat Politècnica de Catalunya (UPC) to generate data according to the SMOS configuration. SEPS input data were taken from the Ocean Circulation and Climate Advanced Modeling (OCCAM) project, used at NOCS, at different spatial resolutions. By modifying SEPS to accept the OCCAM data as input, simulated brightness temperature data were obtained for one month of different ascending passes covering the selected area. The tasks carried out during the stay at NOCS aimed to provide a reliable technique for external calibration, and hence bias cancellation; a methodology for temporally averaging the different acquisitions during ascending passes; and the best configuration of the cost function, before exploiting and investigating the possibilities of the SEPS/OCCAM data for deriving retrieved SSS with high-resolution patterns.
Abstract:
Forest fires are a serious threat to humans and nature from an ecological, social and economic point of view. Predicting their behaviour by simulation still delivers unreliable results and remains a challenging task. Recent approaches try to calibrate the input variables, often tainted with imprecision, using optimisation techniques such as Genetic Algorithms. To converge faster towards fitter solutions, the GA is guided with knowledge obtained from historical or synthetic fires. We developed a robust and efficient knowledge storage and retrieval method. Nearest-neighbour search is applied to find the fire configuration in the knowledge base most similar to the current configuration. To this end, a distance measure was elaborated and implemented in several ways. Experiments show the performance of the different implementations regarding occupied storage and retrieval time, with highly satisfactory results.
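The nearest-neighbour retrieval step can be sketched as follows, assuming fire configurations are stored as dictionaries of input variables and compared with a weighted Euclidean distance. The variable names and weights are hypothetical, not the implemented distance measure:

```python
import math

def distance(a, b, weights):
    """Weighted Euclidean distance between two fire configurations.

    Configurations are dicts of input variables (wind speed, humidity, ...);
    the weights reflect each variable's assumed influence on fire behaviour.
    """
    return math.sqrt(sum(w * (a[k] - b[k]) ** 2 for k, w in weights.items()))

def nearest(knowledge_base, query, weights):
    """Return the stored configuration most similar to the query."""
    return min(knowledge_base, key=lambda cfg: distance(cfg, query, weights))

# Illustrative knowledge base of past (historical or synthetic) fires
kb = [
    {"wind_speed": 10.0, "wind_dir": 90.0, "humidity": 0.30},
    {"wind_speed": 25.0, "wind_dir": 180.0, "humidity": 0.10},
]
query = {"wind_speed": 12.0, "wind_dir": 100.0, "humidity": 0.28}
weights = {"wind_speed": 1.0, "wind_dir": 0.1, "humidity": 5.0}
best = nearest(kb, query, weights)
```

The paper's different implementations would correspond to different index structures over `kb`; the linear scan above is only the simplest baseline.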
Abstract:
Pensions, together with savings and investments during active life, are key elements of retirement planning. Motivations for personal choices about the standard of living, bequests and the replacement ratio of pension with respect to the last salary income must be considered. This research contributes to financial planning by helping to quantify long-term care economic needs. We estimate life expectancy from retirement age onwards. The economic cost of care per unit of service is linked to the expected time of needed care and the intensity of required services. The expected individual cost of long-term care from the onset of dependence is estimated separately for men and women. Assumptions on the mortality of dependent people compared to the general population are introduced. Parameters defining eligibility for various forms of coverage by the universal public social care of the welfare system are addressed. The impact of the intensity of social services on individual predictions is assessed, and partial coverage by standard private insurance products is also explored. Data were collected by the Spanish Institute of Statistics in two surveys conducted on the general Spanish population in 1999 and 2008. Official mortality records and life table trends were used to create realistic scenarios for longevity. We find empirical evidence that the public long-term care system in Spain effectively mitigates the risk of incurring huge lifetime costs. We also find that the most vulnerable categories are citizens with moderate disabilities who do not qualify for public social care support. In the Spanish case, the trends between 1999 and 2008 need to be explored further.
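The core quantity, the expected cost of care from the onset of dependence, can be sketched as a survival-weighted, discounted sum. The survival probabilities, annual cost and discount rate below are illustrative toy values, not the study's estimates:

```python
def expected_ltc_cost(survival_probs, annual_cost, discount_rate=0.0):
    """Expected discounted lifetime cost of care from the onset of dependence.

    survival_probs: p_t = probability the dependent person is alive (and in
                    need of care) t years after onset; illustrative inputs.
    annual_cost:    cost of the required care services per year.
    """
    cost = 0.0
    for t, p in enumerate(survival_probs):
        cost += p * annual_cost / (1.0 + discount_rate) ** t
    return cost

# Illustrative: five years of declining survival, 18,000 per year, 2% discount
probs = [1.0, 0.85, 0.70, 0.50, 0.30]
total = expected_ltc_cost(probs, 18000.0, 0.02)
```

Separate estimates for men and women, as in the abstract, would simply use sex-specific `survival_probs`, and the intensity of services would scale `annual_cost`.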
Abstract:
Alfacs and Fangar Bays in the Ebro Delta (NW Mediterranean) are the major sites in Catalonia for shellfish cultivation. These bays are subject to occasional closures of shellfish harvesting due to the presence of phycotoxins. Fish kills have also been associated with harmful algal blooms. Comparing phytoplankton dynamics in the two bays offers the opportunity to reveal differences in the bloom patterns of species known to be harmful to the ecosystem and to aquaculture activities. Field research is underway under the GEOHAB framework within the Core Research Project on HABs in Fjords and Coastal Embayments. The overall objective of this study is to improve our understanding of HAB biogeographical patterns and of the key elements driving bloom dynamics in time and space within these semi-constrained embayments. Through this comparative approach we aim to improve prediction for monitoring purposes, with a focus on Karlodinium spp., associated with massive kills of aquaculture species. This objective is addressed by combining long-term time series of phytoplankton identification and enumeration with the first results of recent fieldwork in both bays. The latter includes the application of optical sensors to yield a complementary view with enhanced spatial and temporal resolution of bloom phenomena.
Abstract:
To perform a climatic analysis of the annual UV index (UVI) variations in Catalonia, Spain (northeast of the Iberian Peninsula), a new simple parameterization scheme is presented based on a multilayer radiative transfer model. The parameterization performs fast UVI calculations for a wide range of cloudless and snow-free situations and can be applied anywhere. The following parameters are considered: solar zenith angle, total ozone column, altitude, aerosol optical depth, and single-scattering albedo. A sensitivity analysis is presented to justify this choice, with special attention to aerosol information. Comparisons with the base model show good agreement, especially for the most common cases, giving an absolute error within 0.2 in the UVI for a wide range of cases considered. Two tests are done to show the performance of the parameterization against UVI measurements. One uses data from a high-quality spectroradiometer at Lauder, New Zealand [45.04°S, 169.684°E, 370 m above mean sea level (MSL)], where there is a low presence of aerosols. The other uses data from a Robertson–Berger-type meter at Girona, Spain (41.97°N, 2.82°E, 100 m MSL), where there is a higher aerosol load and where it has been possible to study the effect of aerosol information on the model versus measurement comparison. The parameterization is applied to a climatic analysis of the annual UVI variation in Catalonia, showing the contributions of solar zenith angle, ozone, and aerosols. High-resolution seasonal maps of typical UV index values in Catalonia are presented.
Abstract:
Research project based on a stay at the Max Planck Institute for Human Cognitive and Brain Sciences, Germany, between 2010 and 2012. The main objective of this project was to study the subcortical structures in detail, specifically the role of the basal ganglia in cognitive control during linguistic and non-linguistic processing. To achieve a fine-grained differentiation of the various basal ganglia nuclei, ultra-high-field, high-resolution magnetic resonance imaging (7T MRI) was used. The lateral prefrontal cortex and the basal ganglia work together to mediate working memory and the top-down regulation of cognition. This circuit regulates the balance between automatic and higher-order cognitive responses. Three main experimental conditions were created: unambiguous, ungrammatical, and ambiguous sentences/sequences. Unambiguous sentences/sequences should elicit an automatic response, whereas ambiguous and ungrammatical sentences/sequences produce a conflict with the automatic response and therefore require a higher-order cognitive response. Within the domain of the control response, ambiguity and ungrammaticality represent two different dimensions of conflict resolution: whereas a temporarily ambiguous sentence/sequence has a correct interpretation, this is not the case for ungrammatical sentences/sequences. In addition, the experimental design included a linguistic and a non-linguistic manipulation, which tested the hypothesis that the effects are domain-general, as well as a semantic and a syntactic manipulation that assessed the differences between the processing of "intrinsic" vs. "structural" ambiguity/error. The results of the first experiment (syntactic-linguistic) showed a rostroventral-caudodorsal gradient of cognitive control within the caudate nucleus, that is, with the most rostral regions supporting the highest levels of cognitive processing.
Abstract:
The main objective of the project was to develop conceptual and methodological improvements allowing better prediction of the changes in species distributions (at a landscape scale) arising from environmental change in disturbance-dominated contexts. In a first study, we compared the performance of different dynamic models for predicting the distribution of the ortolan bunting (Emberiza hortulana). Our results indicate that a hybrid model combining changes in habitat quality, derived from landscape changes, with a spatially explicit population model is a suitable approach for addressing changes in species distributions in contexts of high environmental dynamism and limited dispersal capacity of the target species. In a second study, we addressed the calibration, using monitoring data, of dynamic distribution models for 12 open-habitat species. Among the conclusions drawn, we highlight: (1) the need for monitoring data to cover the areas where quality changes occur; (2) the bias introduced into the estimation of the occupancy-model parameters when the landscape-change hypothesis or the habitat-quality model is incorrect. In the final study, we examined the possible impact on 67 bird species of different fire regimes, defined by combinations of climate-change levels (leading to an expected increase in the size and frequency of forest fires) and fire-suppression efficiency. According to our model results, the combination of anthropogenic factors of the fire regime, such as rural abandonment and fire suppression, may be more decisive for distribution changes than the effects derived from climate change. The products generated include three scientific publications, a web page with project results, and a package for the R statistical environment.
Abstract:
Conventional methods of gene prediction rely on the recognition of DNA-sequence signals, the coding potential or the comparison of a genomic sequence with a cDNA, EST, or protein database. Reasons for limited accuracy in many circumstances are species-specific training and the incompleteness of reference databases. Lately, comparative genome analysis has attracted increasing attention. Several analysis tools that are based on human/mouse comparisons are already available. Here, we present a program for the prediction of protein-coding genes, termed SGP-1 (Syntenic Gene Prediction), which is based on the similarity of homologous genomic sequences. In contrast to most existing tools, the accuracy of SGP-1 depends little on species-specific properties such as codon usage or the nucleotide distribution. SGP-1 may therefore be applied to nonstandard model organisms in vertebrates as well as in plants, without the need for extensive parameter training. In addition to predicting genes in large-scale genomic sequences, the program may be useful to validate gene structure annotations from databases. To this end, SGP-1 output also contains comparisons between predicted and annotated gene structures in HTML format. The program can be accessed via a Web server at http://soft.ice.mpg.de/sgp-1. The source code, written in ANSI C, is available on request from the authors.
Abstract:
One of the first useful products from the human genome will be a set of predicted genes. Besides its intrinsic scientific interest, the accuracy and completeness of this data set is of considerable importance for human health and medicine. Though progress has been made on computational gene identification in terms of both methods and accuracy evaluation measures, most of the sequence sets on which the programs are tested are short genomic sequences, and there is concern that these accuracy measures may not extrapolate well to larger, more challenging data sets. Given the absence of experimentally verified large genomic data sets, we constructed a semiartificial test set comprising a number of short single-gene genomic sequences with randomly generated intergenic regions. This test set, which should still present an easier problem than real human genomic sequence, mimics the approximately 200-kb-long BACs being sequenced. In our experiments with these longer genomic sequences, the accuracy of GENSCAN, one of the most accurate ab initio gene prediction programs, dropped significantly, although its sensitivity remained high. In contrast, the accuracy of similarity-based programs, such as GENEWISE, PROCRUSTES, and BLASTX, was not affected significantly by the presence of random intergenic sequence, but depended on the strength of the similarity to the protein homolog. As expected, the accuracy dropped if the models were built using more distant homologs, and we were able to quantitatively estimate this decline. However, the specificities of these techniques are still rather good even when the similarity is weak, which is a desirable characteristic for driving expensive follow-up experiments.
Our experiments suggest that though gene prediction will improve with every new protein that is discovered and through improvements in the current set of tools, we still have a long way to go before we can decipher the precise exonic structure of every gene in the human genome using purely computational methodology.
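The construction of such a semiartificial test set can be sketched as follows, assuming verified single-gene sequences are concatenated with randomly generated intergenic spacers. The spacer length and gene sequences here are toy values, not the 200-kb BAC-scale set actually built:

```python
import random

def make_semiartificial_sequence(gene_seqs, intergenic_len=20000, seed=0):
    """Concatenate single-gene sequences with random intergenic spacers.

    Returns the assembled sequence and the (start, end) coordinates
    (0-based, half-open) of each embedded gene; an illustrative sketch.
    """
    rng = random.Random(seed)

    def spacer(n):
        return "".join(rng.choice("ACGT") for _ in range(n))

    pieces = [spacer(intergenic_len)]
    coords = []
    for gene in gene_seqs:
        start = sum(len(p) for p in pieces)
        coords.append((start, start + len(gene)))
        pieces.append(gene)
        pieces.append(spacer(intergenic_len))
    return "".join(pieces), coords

# Toy single-gene sequences embedded in 100-bp random spacers
genes = ["ATGAAATAG", "ATGCCCTGA"]
seq, coords = make_semiartificial_sequence(genes, intergenic_len=100)
```

The recorded coordinates serve as the ground-truth annotation against which predicted gene structures can be scored.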
Abstract:
The completion of the sequencing of the mouse genome promises to help predict human genes with greater accuracy. While current ab initio gene prediction programs are remarkably sensitive (i.e., they predict at least a fragment of most genes), their specificity is often low, predicting a large number of false-positive genes in the human genome. Sequence conservation at the protein level with the mouse genome can help eliminate some of those false positives. Here we describe SGP2, a gene prediction program that combines ab initio gene prediction with TBLASTX searches between two genome sequences to provide both sensitive and specific gene predictions. The accuracy of SGP2 when used to predict genes by comparing the human and mouse genomes is assessed on a number of data sets, including single-gene data sets, the highly curated human chromosome 22 predictions, and entire genome predictions from ENSEMBL. Results indicate that SGP2 outperforms purely ab initio gene prediction methods. Results also indicate that SGP2 works about as well with 3x shotgun data as it does with fully assembled genomes. SGP2 provides a high enough specificity that its predictions can be experimentally verified at a reasonable cost. SGP2 was used to generate a complete set of gene predictions on both the human and mouse genomes by comparing the two species. Our results suggest that another few thousand human and mouse genes currently not in ENSEMBL are worth verifying experimentally.
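The idea of boosting ab initio exon scores with cross-species conservation can be sketched with a simple additive rule. The scoring function and weight below are illustrative only; SGP2's actual scoring scheme is more involved than this sketch:

```python
def combined_exon_score(ab_initio_score, conservation_score, weight=0.5):
    """Combine an ab initio exon score with cross-species conservation.

    Illustrative additive rule: TBLASTX-style similarity in the informant
    genome boosts candidate exons, so conserved exons outrank spurious ones.
    """
    return ab_initio_score + weight * conservation_score

def rank_exons(candidates, weight=0.5):
    """Sort (ab_initio_score, conservation_score) pairs, best first."""
    return sorted(candidates,
                  key=lambda c: combined_exon_score(c[0], c[1], weight),
                  reverse=True)

# A weakly conserved but high-scoring candidate vs. a conserved one
ranked = rank_exons([(5.0, 0.0), (4.0, 4.0)])
```

With the assumed weight, the conserved candidate (score 4.0 + 0.5 × 4.0 = 6.0) outranks the unconserved one (5.0), which is the qualitative effect conservation is meant to have on specificity.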
Abstract:
Background: A number of studies have used protein interaction data alone for protein function prediction. Here, we introduce a computational approach for the annotation of enzymes, based on the observation that similar protein sequences are more likely to perform the same function if they share similar interacting partners. Results: The method was tested against the PSI-BLAST program using a set of 3,890 protein sequences for which interaction data were available. For protein sequences that align with at least 40% sequence identity to a known enzyme, the specificity of our method in predicting the first three EC digits increased from 80% to 90% at 80% coverage when compared to PSI-BLAST. Conclusion: Our method can also be applied to proteins for which homologous sequences with known interacting partners can be detected. Thus, our method could increase by 10% the specificity of genome-wide enzyme predictions based on sequence matching by PSI-BLAST alone.
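The underlying idea, favouring homologs whose interacting partners overlap with those of the query, can be sketched as follows. The identity cutoff and the tie-breaking rule are hypothetical illustrations, not the published method:

```python
def partner_overlap(partners_a, partners_b):
    """Jaccard similarity between two proteins' sets of interacting partners."""
    a, b = set(partners_a), set(partners_b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def annotate(query_partners, candidates, identity_cutoff=0.4):
    """Pick the EC annotation of the best candidate homolog.

    candidates: list of (ec_number, sequence_identity, partner_set).
    Homologs below the identity cutoff are ignored; ties in identity are
    broken by the overlap of interacting partners (illustrative rule).
    """
    best = None
    for ec, identity, partners in candidates:
        if identity < identity_cutoff:
            continue
        key = (identity, partner_overlap(query_partners, partners))
        if best is None or key > best[0]:
            best = (key, ec)
    return None if best is None else best[1]
```

For two equally identical homologs, the one sharing partners with the query wins, which is how interaction data can raise specificity over sequence matching alone.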
Abstract:
We present simple procedures for the prediction of a real-valued sequence. The algorithms are based on a combination of several simple predictors. We show that if the sequence is a realization of a bounded, stationary and ergodic random process, then the average of squared errors converges, almost surely, to that of the optimum given by the Bayes predictor. We provide an analogous result for the prediction of stationary Gaussian processes.
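One standard way to combine several simple predictors online is an exponentially weighted average whose weights decay with each expert's cumulative squared loss. The sketch below illustrates this general approach; it is not necessarily the paper's exact algorithm, and the experts and learning rate are illustrative:

```python
import math

def combine_predictions(expert_preds, losses, eta=1.0):
    """Exponentially weighted average of several simple predictors.

    expert_preds: current prediction of each expert
    losses:       cumulative squared loss of each expert so far
    """
    weights = [math.exp(-eta * loss) for loss in losses]
    total = sum(weights)
    return sum(w * p for w, p in zip(weights, expert_preds)) / total

def run(sequence, experts, eta=1.0):
    """Predict each element of a real-valued sequence online.

    experts: functions mapping the observed prefix to a prediction.
    Returns the aggregated prediction made at each time step.
    """
    losses = [0.0] * len(experts)
    preds = []
    for t, y in enumerate(sequence):
        prefix = sequence[:t]
        expert_preds = [f(prefix) for f in experts]
        preds.append(combine_predictions(expert_preds, losses, eta))
        for i, p in enumerate(expert_preds):
            losses[i] += (p - y) ** 2
    return preds

# Two toy experts: always predict zero, and repeat the last observed value
experts = [lambda xs: 0.0, lambda xs: xs[-1] if xs else 0.0]
preds = run([1.0, 1.0, 1.0], experts)
```

On the constant sequence above, the aggregate drifts toward the "repeat last value" expert as its cumulative loss stays lower, which is the qualitative behaviour the convergence result formalizes.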