3 results for Pipelines

in BORIS: Bern Open Repository and Information System - Bern - Switzerland


Relevance:

10.00%

Publisher:

Abstract:

Code clone detection helps connect developers across projects if we do it on a large scale. The cornerstones that allow clone detection to work on a large scale are: (1) bad hashing, (2) lightweight parsing using regular expressions, and (3) MapReduce pipelines. Bad hashing means determining whether or not two artifacts are similar by checking whether their hashes are identical. We show a bad hashing scheme that works well on source code. Lightweight parsing using regular expressions is our technique for obtaining entire parse trees from regular expressions, robustly and efficiently. We detail the algorithm and implementation of one such regular expression engine. MapReduce pipelines are a way of expressing a computation such that it can be parallelized automatically and simply. We detail the design and implementation of one such MapReduce pipeline that is efficient and debuggable. We show a clone detector that combines these cornerstones to detect code clones across all projects and across all versions of each project.
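The three cornerstones can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not give the hashing scheme, the regular expression engine, or the pipeline design. The sketch assumes a hypothetical token pattern (`TOKEN`), a made-up keyword subset (`KEYWORDS`), and a normalization step that replaces identifiers and numbers with placeholders before hashing. It only tokenizes with a regular expression and does not build parse trees. The map and reduce phases run sequentially here; a real MapReduce framework would distribute them.

```python
import hashlib
import re
from collections import defaultdict

# Hypothetical token pattern: identifiers, integer literals, or any other single symbol.
TOKEN = re.compile(r"[A-Za-z_]\w*|\d+|\S")
# Illustrative keyword subset only; a real tool would use the full language grammar.
KEYWORDS = {"if", "else", "for", "while", "return", "int"}


def normalize(snippet):
    """Replace identifiers with ID and numbers with NUM so renamed copies look alike."""
    out = []
    for tok in TOKEN.findall(snippet):
        if tok in KEYWORDS:
            out.append(tok)
        elif tok[0].isalpha() or tok[0] == "_":
            out.append("ID")
        elif tok.isdigit():
            out.append("NUM")
        else:
            out.append(tok)
    return " ".join(out)


def bad_hash(snippet):
    """Hash the normalized snippet; identical hashes are treated as 'similar'."""
    return hashlib.sha1(normalize(snippet).encode("utf-8")).hexdigest()


def map_phase(records):
    """Map: emit a (hash, location) pair for every (location, snippet) record."""
    for location, snippet in records:
        yield bad_hash(snippet), location


def reduce_phase(pairs):
    """Reduce: group locations by hash and keep groups with more than one member."""
    groups = defaultdict(list)
    for key, location in pairs:
        groups[key].append(location)
    return [sorted(locs) for locs in groups.values() if len(locs) > 1]


# Made-up input records: (location, code snippet).
records = [
    ("projA/a.c:10", "int total = count + 1;"),
    ("projB/b.c:42", "int sum = n + 7;"),
    ("projC/c.c:3", "return total;"),
]
clones = reduce_phase(map_phase(records))
print(clones)
```

The first two snippets differ only in identifier names and a literal, so they normalize to the same token string and land in the same hash bucket; the third does not.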

Relevance:

10.00%

Publisher:

Abstract:

Complete transcriptomic data at high resolution are available only for a few model organisms with medical importance. The gene structures of non-model organisms are mostly computationally predicted based on comparative genomics with other species. As a result, more than half of the horse gene models are known only by projection. Experimental data supporting these gene models are scarce. Moreover, most of the annotated equine genes are single-transcript genes. With RNA sequencing (RNA-seq), the experimental validation of predicted transcriptomes has become accessible at reasonable cost. To improve the horse genome annotation, we performed RNA-seq on 561 samples of peripheral blood mononuclear cells (PBMCs) derived from 85 Warmblood horses. The mapped sequencing reads were used to build a new transcriptome assembly. The new assembly revealed many alternative isoforms associated with known genes or with those predicted by the Ensembl and/or Gnomon pipelines. We also identified 7,531 transcripts not associated with any horse gene annotated in public databases. Of these, 3,280 transcripts did not have a homologous match to any sequence deposited in the NCBI EST database, suggesting horse specificity. The unknown transcripts were categorized as coding and noncoding based on predicted coding potential scores. Among them, 230 transcripts had a high coding potential score, at least 2 exons, and an open reading frame of at least 300 nt. We experimentally validated 9 new equine coding transcripts using RT-PCR and Sanger sequencing. Our results provide valuable detailed information on many transcripts yet to be annotated in the horse genome.
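The three selection criteria for candidate coding transcripts (high coding potential score, at least 2 exons, open reading frame of at least 300 nt) can be written as a simple filter. This is a minimal sketch, not the authors' pipeline. The `Transcript` record and its field names are hypothetical, the example data are made up, and `MIN_CODING_SCORE` is an assumed cutoff, because the abstract does not say what counts as a "high" score.

```python
from dataclasses import dataclass


@dataclass
class Transcript:
    """Hypothetical record; field names are illustrative, not from the paper."""
    name: str
    coding_score: float
    exon_count: int
    orf_length_nt: int


MIN_CODING_SCORE = 0.9  # ASSUMPTION: the abstract gives no numeric cutoff
MIN_EXONS = 2           # from the abstract: at least 2 exons
MIN_ORF_NT = 300        # from the abstract: open reading frame of at least 300 nt


def is_candidate_coding(t):
    """Return True when a transcript meets all three criteria."""
    return (
        t.coding_score >= MIN_CODING_SCORE
        and t.exon_count >= MIN_EXONS
        and t.orf_length_nt >= MIN_ORF_NT
    )


# Made-up example transcripts, for illustration only.
transcripts = [
    Transcript("tx1", coding_score=0.95, exon_count=3, orf_length_nt=450),
    Transcript("tx2", coding_score=0.95, exon_count=1, orf_length_nt=900),
    Transcript("tx3", coding_score=0.20, exon_count=4, orf_length_nt=600),
    Transcript("tx4", coding_score=0.99, exon_count=2, orf_length_nt=299),
]
candidates = [t.name for t in transcripts if is_candidate_coding(t)]
print(candidates)
```

Only `tx1` passes: `tx2` has a single exon, `tx3` scores below the assumed cutoff, and `tx4` has an open reading frame one nucleotide too short.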

Relevance:

10.00%

Publisher:

Abstract:

With the ongoing shift in the computer graphics industry toward Monte Carlo rendering, there is a need for effective, practical noise-reduction techniques that are applicable to a wide range of rendering effects and easily integrated into existing production pipelines. This course surveys recent advances in image-space adaptive sampling and reconstruction algorithms for noise reduction, which have proven very effective at reducing the computational cost of Monte Carlo techniques in practice. These approaches leverage advanced image-filtering techniques with statistical methods for error estimation. They are attractive because they can be integrated easily into conventional Monte Carlo rendering frameworks, they are applicable to most rendering effects, and their computational overhead is modest.
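The general idea behind adaptive sampling driven by statistical error estimation can be sketched in a few lines. This is not one of the algorithms surveyed in the course. It assumes each pixel already has a handful of Monte Carlo radiance samples, uses the standard error of the sample mean as the error estimate, and splits a hypothetical extra sample budget in proportion to that error. The function names and example values are illustrative.

```python
import statistics


def standard_error(samples):
    """Estimated standard error of a pixel's mean (needs at least 2 samples)."""
    return statistics.stdev(samples) / len(samples) ** 0.5


def allocate_samples(pixel_samples, budget):
    """Split an extra sample budget across pixels in proportion to estimated error.

    Rounding means the allocations may not sum exactly to the budget.
    """
    errors = [standard_error(s) for s in pixel_samples]
    total = sum(errors)
    if total == 0:
        # No measurable noise anywhere: fall back to a uniform split.
        return [budget // len(errors)] * len(errors)
    return [round(budget * e / total) for e in errors]


# Made-up samples: one converged pixel and one noisy pixel.
pixels = [
    [0.5, 0.5, 0.5, 0.5],
    [0.1, 0.9, 0.2, 0.8],
]
allocation = allocate_samples(pixels, budget=100)
print(allocation)
```

The converged pixel receives no further samples and the noisy pixel receives the whole budget. Practical methods pair this kind of allocation with an image-space reconstruction filter, which the sketch leaves out.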