914 results for Probabilistic logic
Abstract:
With the advent of cheaper and faster DNA sequencing technologies, assembly methods have greatly changed. Instead of outputting reads that are thousands of base pairs long, new sequencers parallelize the task by producing read lengths between 35 and 400 base pairs. Reconstructing an organism’s genome from these millions of reads is a computationally expensive task. Our algorithm solves this problem by organizing and indexing the reads using n-grams, which are short DNA subsequences of length n. These n-grams are used to efficiently locate putative read joins, thereby eliminating the need to perform an exhaustive search over all possible read pairs. Our goal was to develop a novel n-gram method for the assembly of genomes from next-generation sequencers. Specifically, a probabilistic, iterative approach was used to determine the most likely reads to join, through the development of a new metric that models the probability of any two arbitrary reads being joined together. Tests were run using simulated short-read data based on randomly created genomes ranging in length from 10,000 to 100,000 nucleotides with 16 to 20x coverage. We were able to successfully re-assemble entire genomes up to 100,000 nucleotides in length.
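As a rough illustration of the indexing idea described in this abstract (a minimal sketch, not the authors' assembler; `build_ngram_index`, `candidate_joins`, the choice of n, and the example reads are all hypothetical):

```python
# Minimal sketch: index reads by their n-grams so that candidate joins can be
# located without an all-pairs comparison of reads.
from collections import defaultdict

def build_ngram_index(reads, n):
    """Map each length-n substring to the ids of the reads that contain it."""
    index = defaultdict(set)
    for read_id, seq in enumerate(reads):
        for i in range(len(seq) - n + 1):
            index[seq[i:i + n]].add(read_id)
    return index

def candidate_joins(reads, n):
    """Yield pairs of reads sharing at least one n-gram (putative joins)."""
    index = build_ngram_index(reads, n)
    seen = set()
    for ids in index.values():
        ids = sorted(ids)
        for i, a in enumerate(ids):
            for b in ids[i + 1:]:
                if (a, b) not in seen:
                    seen.add((a, b))
                    yield reads[a], reads[b]

# Example: the first two reads share an 8-gram and are flagged as a putative join.
reads = ["ACGTACGTGGCCTTAA", "GGCCTTAATTCGATCG", "TTTTCCCCAAAAGGGG"]
print(list(candidate_joins(reads, n=8)))
```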
Abstract:
Content Addressable Memory (CAM) is a special type of Complementary Metal-Oxide-Semiconductor (CMOS) storage element that allows for a parallel search operation on a memory stack in addition to the read and write operations provided by a conventional SRAM storage array. In practice, it is often desirable to store a “don’t care” state for faster search operation. However, commercially available CAM chips are forced to implement this functionality by including two binary memory storage elements per CAM cell, which wastes precious area and power resources. This research presents a novel CAM circuit that achieves the “don’t care” functionality with a single ternary memory storage element. Using the recent development of multiple-voltage-threshold (MVT) CMOS transistors, the functionality of the proposed circuit is validated, and characteristics for performance, power consumption, noise immunity, and silicon area are presented. This work presents the following contributions to the field of CAM and ternary-valued logic:
• We present a novel Simple Ternary Inverter (STI) transistor geometry scheme for achieving ternary-valued functionality in existing SOI-CMOS 0.18µm processes.
• We present a novel Ternary Content Addressable Memory based on Three-Valued Logic (3CAM) as a single-storage-element CAM cell with “don’t care” functionality.
• We explore the application of macro partitioning schemes to our proposed 3CAM array to observe the benefits and tradeoffs of architecture design in the context of power, delay, and area.
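The matching behavior that the “don’t care” state provides can be sketched in software as follows; this models only the ternary match semantics, not the proposed MVT-CMOS circuit, and the function names and example entries are illustrative:

```python
# Software model of ternary CAM matching semantics only (not the circuit):
# each stored cell is '0', '1', or 'X' ("don't care").
def tcam_match(stored_word, search_key):
    """A stored word matches when every non-'X' cell equals the key bit."""
    return all(s == 'X' or s == k for s, k in zip(stored_word, search_key))

def tcam_search(entries, search_key):
    """Return indices of all matching entries (done in parallel in hardware)."""
    return [i for i, word in enumerate(entries) if tcam_match(word, search_key)]

entries = ["10XX", "1101", "0XX1"]
print(tcam_search(entries, "1011"))   # entry 0 matches via its don't-care bits
```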
Abstract:
Justification Logic studies epistemic and provability phenomena by introducing justifications/proofs into the language in the form of justification terms. Pure justification logics serve as counterparts of traditional modal epistemic logics, and hybrid logics combine epistemic modalities with justification terms. The computational complexity of pure justification logics is typically lower than that of the corresponding modal logics. Moreover, the so-called reflected fragments, which still contain complete information about the respective justification logics, are known to be in NP for a wide range of justification logics, pure and hybrid alike. This paper shows that, under reasonable additional restrictions, these reflected fragments are NP-complete, thereby proving a matching lower bound. The proof method is then extended to provide a uniform proof that the corresponding full pure justification logics are $\Pi^p_2$-hard, reproving and generalizing an earlier result by Milnikel.
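For readers unfamiliar with the notion, the reflected fragment of a justification logic JL is conventionally taken to consist of its provable justification assertions; the notation below is the standard one from the literature, not taken from this paper:

```latex
% Reflected fragment of a justification logic JL (standard definition; notation assumed):
% only formulas of the form t:F ("term t justifies F") are retained.
\[
  \mathrm{r}JL \;=\; \{\, t\!:\!F \;\mid\; JL \vdash t\!:\!F \,\}
\]
% The paper's results concern deciding membership in rJL (NP-complete under the
% stated restrictions) and validity in the full logic JL ($\Pi^p_2$-hard).
```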
Abstract:
When reengineering legacy systems, it is crucial to assess whether the legacy behavior has been preserved or how it changed due to the reengineering effort. Ideally, if a legacy system is covered by tests, running the tests on the new version can identify potential differences or discrepancies. However, writing tests for an unknown and large system is difficult due to the lack of internal knowledge. It is especially difficult to bring the system to an appropriate state. Our solution is based on the acknowledgment that one of the few trustworthy pieces of information available when approaching a legacy system is the running system itself. Our approach reifies the execution traces and uses logic programming to express tests on them. It thereby eliminates the need to programmatically bring the system into a particular state, and hands the test writer a high-level abstraction mechanism to query the trace. The resulting system, called TESTLOG, was used on several real-world case studies to validate our claims.
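A rough approximation of the trace-querying idea, written in Python rather than the logic-programming setting TESTLOG actually uses; the event structure, field names, and example trace are hypothetical:

```python
# Sketch: the execution trace is reified as data, and a "test" is a declarative
# query over that data instead of code that drives the system into a state.
from dataclasses import dataclass

@dataclass
class TraceEvent:
    receiver: str   # object that received the message
    message: str    # method/message name
    args: tuple     # argument values
    result: object  # returned value

trace = [
    TraceEvent("Account#1", "deposit", (100,), None),
    TraceEvent("Account#1", "withdraw", (40,), None),
    TraceEvent("Account#1", "balance", (), 60),
]

def query(trace, **constraints):
    """Return all events whose fields equal the given constraints."""
    return [e for e in trace
            if all(getattr(e, field) == value for field, value in constraints.items())]

# "Test": after the recorded deposit and withdrawal, some balance call returned 60.
assert query(trace, message="balance", result=60)
```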
Abstract:
Statistical approaches to evaluate higher order SNP-SNP and SNP-environment interactions are critical in genetic association studies, as susceptibility to complex disease is likely to be related to the interaction of multiple SNPs and environmental factors. Logic regression (Kooperberg et al., 2001; Ruczinski et al., 2003) is one such approach, where interactions between SNPs and environmental variables are assessed in a regression framework, and interactions become part of the model search space. In this manuscript we extend the logic regression methodology, originally developed for cohort and case-control studies, for studies of trios with affected probands. Trio logic regression accounts for the linkage disequilibrium (LD) structure in the genotype data, and accommodates missing genotypes via haplotype-based imputation. We also derive an efficient algorithm to simulate case-parent trios where genetic risk is determined via epistatic interactions.
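A sketch of the kind of model that logic regression searches over, using simulated binary SNP codings; this is illustrative only, not the Kooperberg/Ruczinski implementation or the trio extension, and the particular Boolean term and effect sizes are invented:

```python
# Illustrative sketch: the predictor entering the regression is a Boolean
# combination of binary SNP indicators, e.g. L = (SNP1 AND NOT SNP3) OR SNP5,
# and the logic regression search explores alternative such terms.
import numpy as np

def logic_term(X):
    """One candidate Boolean term over binary SNP indicators (columns of X)."""
    snp1, snp3, snp5 = X[:, 0].astype(bool), X[:, 2].astype(bool), X[:, 4].astype(bool)
    return ((snp1 & ~snp3) | snp5).astype(float)

rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(200, 6))          # binary SNP codings
logit = -1.0 + 2.0 * logic_term(X)             # risk driven by the Boolean term
y = rng.binomial(1, 1 / (1 + np.exp(-logit)))  # simulated case/control outcome

# In logic regression, terms like logic_term are scored inside a regression fit;
# here a crude comparison of outcome rates stands in for that fit.
print(y[logic_term(X) == 1].mean(), y[logic_term(X) == 0].mean())
```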
Abstract:
A protein of a biological sample is usually quantified by immunological techniques based on antibodies. Mass spectrometry offers alternative approaches that are not dependent on antibody affinity and avidity, protein isoforms, quaternary structures, or steric hindrance of antibody-antigen recognition in the case of multiprotein complexes. One approach is the use of stable isotope-labeled internal standards; another is the direct exploitation of mass spectrometric signals recorded by LC-MS/MS analysis of protein digests. Here we assessed the peptide match score summation index, based on probabilistic peptide scores calculated by the PHENYX protein identification engine, for absolute protein quantification in accordance with the protein abundance index as proposed by Mann and co-workers (Rappsilber, J., Ryder, U., Lamond, A. I., and Mann, M. (2002) Large-scale proteomic analysis of the human spliceosome. Genome Res. 12, 1231-1245). Using synthetic protein mixtures, we demonstrated that this approach works well, although proteins can have different response factors. Applied to high density lipoproteins (HDLs), this new approach compared favorably to alternative protein quantitation methods such as UV detection of protein peaks separated by capillary electrophoresis or quantitation of protein spots on SDS-PAGE. We compared the protein composition of a well defined HDL density class isolated from plasma of seven hypercholesterolemia subjects having low or high HDL cholesterol with HDL from nine normolipidemia subjects. The quantitative protein patterns distinguished individuals according to the corresponding concentration and distribution of cholesterol from serum lipid measurements of the same samples and revealed that hypercholesterolemia in unrelated individuals is the result of different deficiencies. The presented approach is complementary to HDL lipid analysis; does not rely on complicated sample treatment, e.g. chemical reactions, or antibodies; and can be used for prospective clinical studies of larger patient groups.
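As a back-of-the-envelope illustration of the two quantities being compared, the protein abundance index and a peptide match score summation can be sketched as follows; the function names, peptide sequences, and scores are illustrative and are not taken from PHENYX or the study data:

```python
# Minimal sketch of the two indices (illustrative values only).
def protein_abundance_index(observed_peptides, observable_peptides):
    """PAI: distinct observed peptides divided by theoretically observable peptides."""
    return len(set(observed_peptides)) / observable_peptides

def match_score_summation(peptide_scores):
    """Sum the probabilistic peptide match scores assigned to one protein."""
    return sum(peptide_scores)

# Example: a protein with 4 distinct identified peptides out of 20 observable,
# together with hypothetical probabilistic match scores for those peptides.
observed = ["LVNEVTEFAK", "QTALVELVK", "AEFAEVSK", "LDELRDEGK"]
scores = [42.1, 37.5, 55.0, 28.3]
print(protein_abundance_index(observed, observable_peptides=20))  # 0.2
print(match_score_summation(scores))                              # 162.9
```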