79 results for Computational prediction
Abstract:
We evaluate the performance of different optimization techniques developed in the context of optical flow computation with different variational models. In particular, building on truncated Newton (TN) methods, which have been an effective approach for large-scale unconstrained optimization, we develop efficient multilevel schemes for computing the optical flow. More precisely, we compare the performance of a standard unidirectional multilevel algorithm, called multiresolution optimization (MR/OPT), with a bidirectional multilevel algorithm, called full multigrid optimization (FMG/OPT). The FMG/OPT algorithm treats the coarse grid correction as an optimization search direction and eventually scales it using a line search. Experimental results on different image sequences using four models of optical flow computation show that the FMG/OPT algorithm outperforms both the TN and MR/OPT algorithms in terms of computational work and the quality of the optical flow estimation.
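To illustrate the idea behind FMG/OPT, the sketch below treats a coarse-grid correction as a search direction and scales it with a backtracking line search. This is a minimal illustration on a toy quadratic objective, not the authors' implementation: the transfer operators (restrict, prolongate), the coarse-level solver, and the Armijo parameters are all placeholder assumptions.

```python
import numpy as np

def restrict(u):
    """Fine-to-coarse transfer by simple injection (placeholder operator)."""
    return u[::2]

def prolongate(v):
    """Coarse-to-fine transfer by linear interpolation (placeholder operator)."""
    n = 2 * len(v) - 1
    return np.interp(np.linspace(0, len(v) - 1, n), np.arange(len(v)), v)

def fmg_opt_step(f, grad, u, coarse_steps=5, lr=0.1):
    """One FMG/OPT-style step: take a few cheap optimization steps on the coarse
    grid, lift the resulting correction to the fine grid, and use it as a search
    direction scaled by a backtracking (Armijo) line search."""
    v0 = restrict(u)
    v = v0.copy()
    for _ in range(coarse_steps):
        v = v - lr * restrict(grad(prolongate(v)))   # crude coarse-level gradient steps
    d = prolongate(v - v0)                           # coarse correction as search direction
    alpha, f0, slope = 1.0, f(u), grad(u) @ d
    if slope >= 0:
        return u                                     # not a descent direction: skip
    while f(u + alpha * d) > f0 + 1e-4 * alpha * slope and alpha > 1e-8:
        alpha *= 0.5                                 # backtrack until sufficient decrease
    return u + alpha * d

# Toy quadratic "energy" standing in for a variational optical-flow functional.
A = np.diag(np.linspace(1.0, 10.0, 9))
f = lambda u: 0.5 * u @ A @ u
grad = lambda u: A @ u
u = np.ones(9)
print(f(u), "->", f(fmg_opt_step(f, grad, u)))
```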
Abstract:
The system described herein represents the first example of a recommender system in digital ecosystems where agents negotiate services on behalf of small companies. The small companies compete not only on price or quality, but also on wider service-by-service compositions built by subcontracting with other companies. The final result of these offerings depends on negotiations at the scale of millions of small companies. This scale requires new platforms for supporting digital business ecosystems, as well as related services such as open-id, trust management, monitors and recommenders. This is done in the Open Negotiation Environment (ONE), an open-source platform that allows agents, on behalf of small companies, to negotiate and use the ecosystem services, and enables the development of new agent technologies. The methods and tools of cyber engineering are necessary to build Open Negotiation Environments that are stable, a basic condition for predictable and reliable business environments. Aiming to build stable digital business ecosystems by means of improved collective intelligence, we introduce a model of negotiation style dynamics from the point of view of computational ecology. This model inspires an ecosystem monitor as well as a novel negotiation style recommender. The ecosystem monitor provides hints to the negotiation style recommender to achieve greater stability of an open negotiation environment in a digital business ecosystem. The greater stability provides the small companies with higher predictability, and therefore better business results. The negotiation style recommender is implemented with a simulated annealing algorithm at a constant temperature, and its impact is shown by applying it to a real case of an open negotiation environment populated by Italian companies.
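The recommender's core mechanism, simulated annealing at a constant temperature, amounts to Metropolis sampling over style assignments. The sketch below illustrates this under invented assumptions: the style set, the instability energy, and the temperature are placeholders, not the model used in ONE.

```python
import math, random

STYLES = ["conceder", "linear", "boulware"]   # hypothetical negotiation styles

def instability(assignment):
    """Toy energy: grows when a single style dominates the ecosystem (a stand-in
    for the ecosystem monitor's stability signal, not the model used in ONE)."""
    counts = [assignment.count(s) for s in STYLES]
    return max(counts) / len(assignment)

def recommend_styles(n_agents=100, T=0.05, steps=5000, seed=0):
    """Constant-temperature simulated annealing (Metropolis sampling) over
    per-agent negotiation style assignments."""
    rng = random.Random(seed)
    assignment = [rng.choice(STYLES) for _ in range(n_agents)]
    e = instability(assignment)
    for _ in range(steps):
        i = rng.randrange(n_agents)
        old = assignment[i]
        assignment[i] = rng.choice(STYLES)           # propose a style change
        e_new = instability(assignment)
        # accept improvements always, worsenings with Boltzmann probability
        if e_new <= e or rng.random() < math.exp((e - e_new) / T):
            e = e_new
        else:
            assignment[i] = old                      # reject and restore
    return assignment, e

_, energy = recommend_styles()
print("final instability:", round(energy, 3))
```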
Abstract:
Report for the scientific sojourn carried out at Massachusetts General Hospital Cancer Center-Harvard Medical School, United States, from 2010 to 2011. The project aims to study the aggregation behavior of amphiphilic molecules in the continuous phase of highly concentrated emulsions, which can be used as templates for the synthesis of meso/macroporous materials. At this stage of the project, we have investigated the self-assembly of diblock and triblock surfactants under the effect of a confined geometry, surrounded by the droplets of the dispersed phase. These droplets limit the growth of the aggregates, deeply modify their orientation and hence alter their spatial arrangement as compared to the self-assembly taking place far enough from any boundary surface, that is, in the bulk. By performing Monte Carlo simulations, we have shown that the interface between the dispersed and continuous phases, as well as its shape, has a significant impact on the structural order of the resulting aggregates and hence on the potential applications of highly concentrated emulsions as reaction media, drug delivery systems, or templates for meso/macroporous materials. Due to the combined effect of symmetry breaking and morphological frustration, very intriguing structures, such as square columnar liquid crystals, twisted X-shaped aggregates, and helical phases of cylindrical aggregates, never observed in the bulk for the same model surfactant, have been found. The presence of other more conventional structures, such as micelles and cubic and hexagonal liquid crystals, formed at low and high amphiphilic concentrations, respectively, further enhances the interest in this already rich aggregation behavior.
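For readers unfamiliar with the method, the skeleton below shows the kind of Metropolis Monte Carlo move-and-accept loop such simulations are built on, here with a hard confining slit standing in for the space between droplet interfaces. The potential, move size, and geometry are placeholder assumptions; the actual study uses a coarse-grained surfactant model not detailed in the abstract.

```python
import math, random

def mc_confined(n=50, box=10.0, slit=3.0, beta=1.0, steps=5000, seed=1):
    """Metropolis Monte Carlo of attracting point particles confined to a slit
    |z| < slit/2, a crude stand-in for the continuous phase between droplets."""
    rng = random.Random(seed)
    pos = [[rng.uniform(0, box), rng.uniform(0, box), rng.uniform(-slit / 2, slit / 2)]
           for _ in range(n)]

    def energy_of(i, p):
        # Short-range square-well attraction; no minimum-image convention, for brevity.
        e = 0.0
        for j, q in enumerate(pos):
            if j != i and sum((a - b) ** 2 for a, b in zip(p, q)) < 2.25:
                e -= 1.0
        return e

    accepted = 0
    for _ in range(steps):
        i = rng.randrange(n)
        trial = [c + rng.uniform(-0.3, 0.3) for c in pos[i]]
        trial[0] %= box
        trial[1] %= box                      # periodic in x and y only
        if abs(trial[2]) > slit / 2:
            continue                         # hard confinement: reject moves into droplets
        dE = energy_of(i, trial) - energy_of(i, pos[i])
        if dE <= 0 or rng.random() < math.exp(-beta * dE):
            pos[i] = trial                   # Metropolis acceptance criterion
            accepted += 1
    return pos, accepted / steps

positions, acc_rate = mc_confined()
print("acceptance rate:", round(acc_rate, 2))
```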
Abstract:
The project's main objective was to develop conceptual and methodological improvements allowing better prediction of changes in species distributions (at the landscape scale) arising from environmental change in disturbance-dominated contexts. In a first study, we compared the effectiveness of different dynamic models for predicting the distribution of the ortolan bunting (Emberiza hortulana). Our results indicate that a hybrid model combining changes in habitat quality, derived from landscape changes, with a spatially explicit population model is a suitable approach for addressing changes in species distributions in contexts of high environmental dynamism and limited dispersal capacity of the target species. In a second study, we addressed the calibration, using monitoring data, of dynamic distribution models for 12 open-habitat species. Among the conclusions drawn, we highlight: (1) the need for monitoring data to cover the areas where changes in habitat quality occur; (2) the bias introduced in the estimation of the occupancy model parameters when the landscape-change hypothesis or the habitat-quality model is incorrect. In the final study, we assessed the potential impact on 67 bird species of different fire regimes, defined by combinations of climate-change levels (leading to an expected increase in the size and frequency of forest fires) and fire-suppression efficiency by firefighters. According to our models, anthropogenic components of the fire regime, such as rural abandonment and fire suppression, may be more decisive for distribution changes than the effects derived from climate change itself. The products generated include three scientific publications, a web page with project results, and a package for the R statistical environment.
Abstract:
Conventional methods of gene prediction rely on the recognition of DNA-sequence signals, the coding potential, or the comparison of a genomic sequence with a cDNA, EST, or protein database. In many circumstances, their accuracy is limited by species-specific training and the incompleteness of reference databases. Lately, comparative genome analysis has attracted increasing attention, and several analysis tools based on human/mouse comparisons are already available. Here, we present a program for the prediction of protein-coding genes, termed SGP-1 (Syntenic Gene Prediction), which is based on the similarity of homologous genomic sequences. In contrast to most existing tools, the accuracy of SGP-1 depends little on species-specific properties such as codon usage or the nucleotide distribution. SGP-1 may therefore be applied to nonstandard model organisms in vertebrates as well as in plants, without the need for extensive parameter training. In addition to predicting genes in large-scale genomic sequences, the program may be useful for validating gene structure annotations from databases. To this end, SGP-1 output also contains comparisons between predicted and annotated gene structures in HTML format. The program can be accessed via a Web server at http://soft.ice.mpg.de/sgp-1. The source code, written in ANSI C, is available on request from the authors.
Abstract:
Annotation of protein-coding genes is a key goal of genome sequencing projects. In spite of tremendous recent advances in computational gene finding, comprehensive annotation remains a challenge. Peptide mass spectrometry is a powerful tool for researching the dynamic proteome and suggests an attractive approach to discover and validate protein-coding genes. We present algorithms to construct and efficiently search spectra against a genomic database, with no prior knowledge of encoded proteins. By searching a corpus of 18.5 million tandem mass spectra (MS/MS) from human proteomic samples, we validate 39,000 exons and 11,000 introns at the level of translation. We present translation-level evidence for novel or extended exons in 16 genes, confirm translation of 224 hypothetical proteins, and discover or confirm over 40 alternative splicing events. Polymorphisms are efficiently encoded in our database, allowing us to observe variant alleles for 308 coding SNPs. Finally, we demonstrate the use of mass spectrometry to improve automated gene prediction, adding 800 correct exons to our predictions using a simple rescoring strategy. Our results demonstrate that proteomic profiling should play a role in any genome sequencing project.
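A central ingredient of such a search is enumerating candidate peptides directly from the genome, with no gene model. A minimal sketch of that step is given below: six-frame translation, in-silico tryptic digestion, and indexing by precursor mass. Biopython's Seq is used here only for translation, and the mass-bin tolerance and peptide-length cutoff are arbitrary assumptions; the paper's actual fragment-level scoring and decoy machinery are not shown.

```python
from collections import defaultdict
from Bio.Seq import Seq  # Biopython, used here only for six-frame translation

# Monoisotopic residue masses (Da); the full peptide mass adds one water.
MASS = {'G': 57.02146, 'A': 71.03711, 'S': 87.03203, 'P': 97.05276,
        'V': 99.06841, 'T': 101.04768, 'C': 103.00919, 'L': 113.08406,
        'I': 113.08406, 'N': 114.04293, 'D': 115.02694, 'Q': 128.05858,
        'K': 128.09496, 'E': 129.04259, 'M': 131.04049, 'H': 137.05891,
        'F': 147.06841, 'R': 156.10111, 'Y': 163.06333, 'W': 186.07931}
WATER = 18.01056

def tryptic_peptides(protein, min_len=6):
    """Fully tryptic cleavage: cut after K or R, except when followed by P."""
    peptides, current = [], ""
    for i, aa in enumerate(protein):
        current += aa
        nxt = protein[i + 1] if i + 1 < len(protein) else ""
        if aa in "KR" and nxt != "P":
            peptides.append(current)
            current = ""
    peptides.append(current)
    return [p for p in peptides if len(p) >= min_len]

def six_frame_peptides(dna):
    """Enumerate tryptic peptides from all six reading frames, with no gene model."""
    seq = Seq(dna)
    for strand in (seq, seq.reverse_complement()):
        for frame in range(3):
            sub = strand[frame:]
            sub = sub[: len(sub) - len(sub) % 3]      # trim to whole codons
            protein = str(sub.translate())
            for stretch in protein.split("*"):        # stop-codon-free stretches
                yield from tryptic_peptides(stretch)

def build_mass_index(dna, bin_width=0.1):
    """Index peptides by precursor mass so each spectrum needs only a lookup."""
    index = defaultdict(list)
    for pep in six_frame_peptides(dna):
        if all(aa in MASS for aa in pep):             # skip ambiguous translations
            mass = sum(MASS[aa] for aa in pep) + WATER
            index[round(mass / bin_width)].append(pep)  # coarse mass bins (assumed tolerance)
    return index
```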
Abstract:
In a number of programs for gene structure prediction in higher eukaryotic genomic sequences, exon prediction is decoupled from gene assembly: a large pool of candidate exons is predicted and scored from features located in the query DNA sequence, and candidate genes are assembled from such a pool as sequences of nonoverlapping frame-compatible exons. Genes are scored as a function of the scores of the assembled exons, and the highest scoring candidate gene is assumed to be the most likely gene encoded by the query DNA sequence. Considering additive gene scoring functions, currently available algorithms to determine such a highest scoring candidate gene run in time proportional to the square of the number of predicted exons. Here, we present an algorithm whose running time grows only linearly with the size of the set of predicted exons. Existing polynomial algorithms rely on the fact that, while scanning the set of predicted exons, the highest scoring gene ending in a given exon can be obtained by appending the exon to the highest scoring among the genes ending at each compatible preceding exon. The algorithm presented here relies on the simple fact that this highest scoring gene can be stored and updated as the scan proceeds. This requires scanning the set of predicted exons simultaneously by increasing acceptor and donor position. On the other hand, the algorithm described here does not assume an underlying gene structure model. Instead, the definition of valid gene structures is given externally in the so-called Gene Model, which simply specifies which gene features are allowed immediately upstream of which other gene features in valid gene structures. This allows for great flexibility in formulating the gene identification problem. In particular, it allows for multiple-gene two-strand predictions and for considering gene features other than coding exons (such as promoter elements) in valid gene structures.
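The linear-time idea can be captured in a few lines: scan exons by acceptor position while a second pointer advances by donor position, maintaining the best score over all exons already "closed". The sketch below ignores frame compatibility and the external Gene Model (a full version would keep one running maximum per compatibility class); it is an illustration of the principle, not the published implementation.

```python
def assemble(exons):
    """Highest-scoring chain of non-overlapping exons in a single linear scan.

    exons: list of (acceptor, donor, score) tuples, pre-sorted by acceptor.
    Frame compatibility and the external Gene Model are omitted; with F
    compatibility classes one keeps one running maximum per class instead.
    """
    by_donor = sorted(range(len(exons)), key=lambda i: exons[i][1])
    best_ending = [0.0] * len(exons)  # best gene score ending at exon i
    best_closed = 0.0                 # best score among exons whose donor precedes
    j = 0                             # the current acceptor (already "closed")
    for i, (acceptor, donor, score) in enumerate(exons):
        # advance the donor-ordered pointer past every exon compatible with exon i
        while j < len(exons) and exons[by_donor[j]][1] < acceptor:
            best_closed = max(best_closed, best_ending[by_donor[j]])
            j += 1
        best_ending[i] = best_closed + score
    return max(best_ending, default=0.0)

# Four candidate exons; the optimal assembly skips the second one (score 14.0).
print(assemble([(1, 100, 5.0), (50, 160, 4.0), (120, 200, 3.0), (210, 300, 6.0)]))
```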
Abstract:
The completion of the sequencing of the mouse genome promises to help predict human genes with greater accuracy. While current ab initio gene prediction programs are remarkably sensitive (i.e., they predict at least a fragment of most genes), their specificity is often low, predicting a large number of false-positive genes in the human genome. Sequence conservation at the protein level with the mouse genome can help eliminate some of those false positives. Here we describe SGP2, a gene prediction program that combines ab initio gene prediction with TBLASTX searches between two genome sequences to provide both sensitive and specific gene predictions. The accuracy of SGP2 when used to predict genes by comparing the human and mouse genomes is assessed on a number of data sets, including single-gene data sets, the highly curated human chromosome 22 predictions, and entire genome predictions from ENSEMBL. Results indicate that SGP2 outperforms purely ab initio gene prediction methods, and that it works about as well with 3x shotgun data as it does with fully assembled genomes. SGP2 provides a high enough specificity that its predictions can be experimentally verified at a reasonable cost. SGP2 was used to generate a complete set of gene predictions on both the human and mouse genomes by comparing them to each other. Our results suggest that another few thousand human and mouse genes currently not in ENSEMBL are worth verifying experimentally.
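The combination step can be pictured as rescoring ab initio exons with the TBLASTX conservation that overlaps them, as in the hypothetical sketch below. The proportional-overlap credit and the weight w are placeholder choices, not the actual SGP2 scoring scheme.

```python
def rescore_exons(exons, hsps, w=0.2):
    """Add a homology bonus to ab initio exon scores from overlapping TBLASTX HSPs.

    exons: dicts with 'start', 'end', 'score' (ab initio score, geneid-style)
    hsps:  (start, end, bits) high-scoring pairs from searching the query genome
           against the informant genome
    w:     weight of the conservation term (an arbitrary placeholder value)
    """
    for exon in exons:
        bonus = 0.0
        for s, e, bits in hsps:
            overlap = min(exon["end"], e) - max(exon["start"], s) + 1
            if overlap > 0:
                # credit HSP bits in proportion to the overlapping fraction
                bonus += bits * overlap / (e - s + 1)
        exon["score"] += w * bonus
    return exons

exons = [{"start": 100, "end": 250, "score": 3.1},
         {"start": 400, "end": 520, "score": -0.4}]
hsps = [(120, 260, 55.0), (430, 500, 40.0)]
print(rescore_exons(exons, hsps))
```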
Abstract:
Background: We present the results of EGASP, a community experiment to assess the state-of-the-art in genome annotation within the ENCODE regions, which span 1% of the human genome sequence. The experiment had two major goals: the assessment of the accuracy of computational methods to predict protein coding genes; and the overall assessment of the completeness of the current human genome annotations as represented in the ENCODE regions. For the computational prediction assessment, eighteen groups contributed gene predictions. We evaluated these submissions against each other based on a 'reference set' of annotations generated as part of the GENCODE project. These annotations were not available to the prediction groups prior to the submission deadline, so that their predictions were blind and an external advisory committee could perform a fair assessment. Results: The best methods had at least one gene transcript correctly predicted for close to 70% of the annotated genes. Nevertheless, the multiple transcript accuracy, taking into account alternative splicing, reached only approximately 40% to 50% accuracy. At the coding nucleotide level, the best programs reached an accuracy of 90% in both sensitivity and specificity. Programs relying on mRNA and protein sequences were the most accurate in reproducing the manually curated annotations. Experimental validation shows that only a very small percentage (3.2%) of the selected 221 computationally predicted exons outside of the existing annotation could be verified. Conclusions: This is the first such experiment in human DNA, and we have followed the standards established in a similar experiment, GASP1, in Drosophila melanogaster. We believe the results presented here contribute to the value of ongoing large-scale annotation projects and should guide further experimental methods when being scaled up to the entire human genome sequence.
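For reference, the nucleotide-level sensitivity and specificity figures quoted above are computed as in the sketch below (in the gene-prediction literature, "specificity" denotes TP/(TP+FP), i.e., precision). The interval representation is an assumption made for illustration.

```python
def coding_positions(intervals):
    """Expand (start, end) coding intervals (1-based, inclusive) into a position set."""
    return {p for s, e in intervals for p in range(s, e + 1)}

def nucleotide_accuracy(predicted, annotated):
    """Nucleotide-level sensitivity and specificity as used in gene-prediction
    assessments: Sn = TP / (TP + FN); Sp = TP / (TP + FP), i.e. precision."""
    P, A = coding_positions(predicted), coding_positions(annotated)
    tp = len(P & A)
    sn = tp / len(A) if A else 0.0   # fraction of annotated coding bases recovered
    sp = tp / len(P) if P else 0.0   # fraction of predicted coding bases that are correct
    return sn, sp

print(nucleotide_accuracy(predicted=[(10, 50), (80, 120)], annotated=[(12, 55), (90, 130)]))
```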
Abstract:
Background: Despite the continuous production of genome sequence for a number of organisms, reliable, comprehensive, and cost effective gene prediction remains problematic. This is particularly true for genomes for which there is not a large collection of known gene sequences, such as the recently published chicken genome. We used the chicken sequence to test comparative and homology-based gene-finding methods followed by experimental validation as an effective genome annotation method. Results: We performed experimental evaluation by RT-PCR of three different computational gene finders, Ensembl, SGP2 and TWINSCAN, applied to the chicken genome. A Venn diagram was computed and each component of it was evaluated. The results showed that de novo comparative methods can identify up to about 700 chicken genes with no previous evidence of expression, and can correctly extend about 40% of homology-based predictions at the 5' end. Conclusions: De novo comparative gene prediction followed by experimental verification is effective at enhancing the annotation of the newly sequenced genomes provided by standard homology-based methods.
Abstract:
Background: Recent advances in high-throughput technologies have produced a vast amount of protein sequences, while the number of high-resolution structures has seen only a limited increase. This has impelled the production of many strategies to build protein structures from their sequences, generating a considerable number of alternative models. The selection of the model closest to the native conformation has thus become crucial for structure prediction. Several methods have been developed to score protein models by energies, by knowledge-based potentials, or by combinations of both. Results: Here, we present and demonstrate a theory to split knowledge-based potentials into biologically meaningful scoring terms and to combine them into new scores to predict near-native structures. Our strategy allows circumventing the problem of defining the reference state. In this approach, we give the proof for a simple and linear application that can be further improved by optimizing the combination of Z-scores. Using the simplest composite score, we obtained predictions similar to those of state-of-the-art methods. Besides, our approach has the advantage of identifying the most relevant terms involved in the stability of the protein structure. Finally, we also use the composite Z-scores to assess the conformation of models and to detect local errors. Conclusion: We have introduced a method to split knowledge-based potentials and to solve the problem of defining a reference state. The new scores detect near-native structures as accurately as state-of-the-art methods and have been successful in identifying wrongly modeled regions of many near-native conformations.
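The Z-score construction can be sketched compactly: each split energy term is standardized against a background of alternative models, and the standardized terms are combined linearly. The toy data, equal weights, and the number of terms below are placeholders, not the decomposition used by the authors.

```python
import numpy as np

def zscores(energies):
    """Standardize each split energy term across a set of alternative models.
    energies: array of shape (n_models, n_terms)."""
    mu = energies.mean(axis=0)
    sigma = energies.std(axis=0, ddof=1)
    return (energies - mu) / sigma

def composite_score(z, weights=None):
    """Linear combination of per-term Z-scores; equal weights give the simplest
    composite score, and the weights can later be optimized."""
    if weights is None:
        weights = np.ones(z.shape[1]) / z.shape[1]
    return z @ weights

# Toy example: five candidate models scored with three split potential terms.
rng = np.random.default_rng(0)
energies = rng.normal(size=(5, 3))
scores = composite_score(zscores(energies))
print("predicted near-native model:", int(np.argmin(scores)))  # lowest composite Z wins
```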
Abstract:
Background: The cooperative interaction between transcription factors has a decisive role in the control of the fate of the eukaryotic cell. Computational approaches for characterizing cooperative transcription factors in yeast, however, are based on different rationales and provide a low overlap between their results. Because the wealth of information contained in protein interaction networks and regulatory networks has proven highly effective in elucidating functional relationships between proteins, we compared different sets of cooperative transcription factor pairs (predicted by four different computational methods) within the frame of those networks. Results: Our results show that the overlap between the sets of cooperative transcription factors predicted by the different methods is low yet significant. Cooperative transcription factors predicted by all methods are closer and more clustered in the protein interaction network than expected by chance. On the other hand, members of a cooperative transcription factor pair neither seemed to regulate each other nor shared similar regulatory inputs, although they do regulate similar groups of target genes. Conclusion: Despite the different definitions of transcriptional cooperativity and the different computational approaches used to characterize cooperativity between transcription factors, the analysis of their roles in the framework of the protein interaction network and the regulatory network indicates a common denominator for the predictions under study. The knowledge of the shared topological properties of cooperative transcription factor pairs in both networks can be useful not only for designing better prediction methods but also for better understanding the complexities of transcriptional control in eukaryotes.
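A proximity test of the kind described, asking whether predicted cooperative pairs sit closer in the protein interaction network than expected by chance, can be sketched as below with networkx. The toy random graph and the pair list are placeholders; the study's actual networks and statistics are not reproduced here.

```python
import random
import networkx as nx

def mean_pair_distance(graph, pairs):
    """Average shortest-path distance over pairs that are connected in the network."""
    dists = [nx.shortest_path_length(graph, a, b)
             for a, b in pairs if nx.has_path(graph, a, b)]
    return sum(dists) / len(dists) if dists else float("inf")

def proximity_pvalue(graph, pairs, n_random=1000, seed=0):
    """Empirical p-value: are the predicted cooperative pairs closer in the
    network than equally many randomly drawn node pairs?"""
    rng = random.Random(seed)
    observed = mean_pair_distance(graph, pairs)
    nodes = list(graph)
    hits = sum(
        mean_pair_distance(graph, [tuple(rng.sample(nodes, 2)) for _ in pairs]) <= observed
        for _ in range(n_random)
    )
    return observed, hits / n_random

# Toy random graph standing in for the yeast protein interaction network.
G = nx.erdos_renyi_graph(100, 0.05, seed=1)
tf_pairs = [(0, 1), (2, 3), (4, 5)]
print(proximity_pvalue(G, tf_pairs, n_random=200))
```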
Abstract:
Intuitively, music has both predictable and unpredictable components. In this work we assess this qualitative statement in a quantitative way using common time series models fitted to state-of-the-art music descriptors. These descriptors cover different musical facets and are extracted from a large collection of real audio recordings comprising a variety of musical genres. Our findings show that music descriptor time series exhibit a certain predictability not only for short time intervals, but also for mid-term and relatively long intervals. This fact is observed independently of the descriptor, musical facet and time series model we consider. Moreover, we show that our findings are not only of theoretical relevance but can also have practical impact. To this end we demonstrate that music predictability at relatively long time intervals can be exploited in a real-world application, namely the automatic identification of cover songs (i.e. different renditions or versions of the same musical piece). Importantly, this prediction strategy yields a parameter-free approach for cover song identification that is substantially faster, allows for reduced computational storage and still maintains highly competitive accuracies when compared to state-of-the-art systems.
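The prediction experiment can be illustrated with a generic autoregressive model: fit it to a descriptor series from one recording and measure one-step-ahead error on another; a rendition of the same piece should be predicted better than an unrelated recording. The AR order, the synthetic series, and the least-squares fit below are stand-ins, since the paper's descriptors and time series models are not specified here.

```python
import numpy as np

def fit_ar(x, order=8):
    """Least-squares fit of an AR model: x[t] ~ sum_k a[k] * x[t-1-k]."""
    X = np.column_stack([x[order - k - 1: len(x) - k - 1] for k in range(order)])
    coeffs, *_ = np.linalg.lstsq(X, x[order:], rcond=None)
    return coeffs

def prediction_error(coeffs, x):
    """Mean squared one-step-ahead prediction error of the fitted AR model on x."""
    order = len(coeffs)
    X = np.column_stack([x[order - k - 1: len(x) - k - 1] for k in range(order)])
    return float(np.mean((x[order:] - X @ coeffs) ** 2))

# A model fitted on one rendition predicts a phase-shifted "cover" far better
# than an unrelated white-noise series.
rng = np.random.default_rng(0)
t = np.arange(2000)
song = np.sin(0.07 * t) + 0.1 * rng.standard_normal(t.size)   # stand-in descriptor series
cover = np.sin(0.07 * t + 0.5) + 0.1 * rng.standard_normal(t.size)
other = rng.standard_normal(t.size)
a = fit_ar(song)
print("cover error:", prediction_error(a, cover), "unrelated error:", prediction_error(a, other))
```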