975 results for Massively parallel sequencing
Abstract:
DNA extraction was carried out as described on the MICROBIS project pages (http://icomm.mbl.edu/microbis) using a commercially available extraction kit. We amplified the hypervariable regions V4-V6 of archaeal and bacterial 16S rRNA genes using PCR and several sets of forward and reverse primers (http://vamps.mbl.edu/resources/primers.php). Massively parallel tag sequencing of the PCR products was carried out on a 454 Life Sciences GS FLX sequencer at Marine Biological Laboratory, Woods Hole, MA, following the same experimental conditions for all samples. Sequence reads were submitted to a rigorous quality control procedure based on mothur v30 (doi:10.1128/AEM.01541-09) including denoising of the flowgrams using an algorithm based on PyroNoise (doi:10.1038/nmeth.1361), removal of PCR errors and a chimera check using uchime (doi:10.1093/bioinformatics/btr381). The reads were taxonomically assigned according to the SILVA taxonomy (SSURef v119, 07-2014; doi:10.1093/nar/gks1219) implemented in mothur and clustered at 98% ribosomal RNA gene V4-V6 sequence identity. V4-V6 amplicon sequence abundance tables were standardized to account for unequal sampling effort by randomly drawing 1000 (Archaea) and 2300 (Bacteria) sequences without replacement in mothur, and were then used to calculate inverse Simpson diversity indices and Chao1 richness (doi:10.2307/4615964). Bray-Curtis dissimilarities (doi:10.2307/1942268) between all samples were calculated and used for 2-dimensional non-metric multidimensional scaling (NMDS) ordinations with 20 random starts (doi:10.1007/BF02289694). Stress values below 0.2 indicated that the multidimensional dataset was well represented by the 2D ordination. NMDS ordinations were compared and tested using Procrustes correlation analysis (doi:10.1007/BF02291478).
All analyses were carried out with the R statistical environment and the packages vegan (available at: http://cran.r-project.org/package=vegan), labdsv (available at: http://cran.r-project.org/package=labdsv), as well as with custom R scripts. Operational taxonomic units at 98% sequence identity (OTU0.03) that occurred only once in the whole dataset were termed absolute single sequence OTUs (SSOabs; doi:10.1038/ismej.2011.132). OTU0.03 sequences that occurred only once in at least one sample, but may occur more often in other samples, were termed relative single sequence OTUs (SSOrel). SSOrel are particularly interesting for community ecology, since they comprise rare organisms that might become abundant when conditions change. 16S rRNA amplicons and metagenomic reads have been stored in the Sequence Read Archive under SRA project accession number SRP042162.
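The diversity measures named in this abstract can be illustrated with a minimal Python sketch. This is not the authors' pipeline (they used mothur and the R packages vegan and labdsv). The function names, the bias-corrected form of Chao1, and the literal reading of the SSOabs/SSOrel definitions are assumptions made here for illustration.

```python
def inverse_simpson(counts):
    """Inverse Simpson index: 1 / sum(p_i^2) over OTU proportions p_i."""
    total = sum(counts)
    return 1.0 / sum((c / total) ** 2 for c in counts if c > 0)


def chao1(counts):
    """Bias-corrected Chao1 richness (assumed variant): S_obs + F1(F1-1) / (2(F2+1))."""
    s_obs = sum(1 for c in counts if c > 0)
    f1 = sum(1 for c in counts if c == 1)  # OTUs seen exactly once
    f2 = sum(1 for c in counts if c == 2)  # OTUs seen exactly twice
    return s_obs + f1 * (f1 - 1) / (2 * (f2 + 1))


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two samples with the same OTU order."""
    return sum(abs(x - y) for x, y in zip(a, b)) / (sum(a) + sum(b))


def single_sequence_otus(table):
    """Classify OTUs from a hypothetical {otu: [count per sample]} table.

    SSOabs: one sequence in the whole dataset.
    SSOrel: exactly one sequence in at least one sample (literal reading,
    so every SSOabs is also counted as SSOrel here).
    """
    sso_abs = {otu for otu, row in table.items() if sum(row) == 1}
    sso_rel = {otu for otu, row in table.items() if 1 in row}
    return sso_abs, sso_rel
```

For example, `single_sequence_otus({"a": [1, 0], "b": [1, 5], "c": [3, 2]})` classifies `a` as SSOabs and both `a` and `b` as SSOrel under these assumptions.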
Abstract:
Thesis (Ph.D.)--University of Washington, 2016-06
Abstract:
Membrane systems are computationally equivalent to Turing machines. However, their distributed and massively parallel nature yields polynomial-time solutions to problems for which traditional solutions are non-polynomial. To date, research on implementing membrane systems has not reached the massively parallel character of this computational model. The best published approaches achieve a distributed architecture termed “partially parallel evolution with partially parallel communication”, in which several membranes are allocated to each processor, proxies are used to communicate with membranes allocated to different processors, and an access control policy for communications is mandatory. These approaches obtain parallelism across processors in the application of evolution rules and in the internal communication among membranes allocated to the same processor. However, external communications, which are needed between membranes placed on different processors, share a common communication line and are therefore sequential. In this work, we present a new hierarchical architecture that achieves parallel external communication among processors and substantially increases parallelization in the application of evolution rules and in internal communications. Consequently, the time needed for each evolution step is reduced. As a result, this new distributed hierarchical architecture comes close to the massively parallel character required by the model.
Abstract:
Large read-only or read-write transactions with a large read set and a small write set constitute an important class of transactions used in such applications as data mining, data warehousing, statistical applications, and report generators. Such transactions are best supported with optimistic concurrency, because locking large amounts of data for extended periods of time is not an acceptable solution. The abort rate in regular optimistic concurrency algorithms increases exponentially with the size of the transaction. The algorithm proposed in this dissertation solves this problem by using a new transaction scheduling technique that allows a large transaction to commit safely with a probability that can be several orders of magnitude greater than under regular optimistic concurrency algorithms. A performance simulation study and a formal proof of serializability and external consistency of the proposed algorithm are also presented. This dissertation also proposes a new query optimization technique (lazy queries). Lazy queries is an adaptive query execution scheme that optimizes itself as the query runs. Lazy queries can be used to find an intersection of sub-queries very efficiently, requiring neither full execution of large sub-queries nor any statistical knowledge about the data. An efficient optimistic concurrency control algorithm used in a massively parallel B-tree with variable-length keys is introduced. B-trees with variable-length keys can be effectively used in a variety of database types. In particular, we show how such a B-tree was used in our implementation of a semantic object-oriented DBMS. The concurrency control algorithm uses semantically safe optimistic virtual "locks" that achieve very fine granularity in conflict detection. This algorithm ensures serializability and external consistency by using logical clocks and backward validation of transactional queries. A formal proof of correctness of the proposed algorithm is also presented.
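As a rough illustration of optimistic concurrency with a logical clock and backward validation, the following Python sketch checks a committing transaction's read set against the write sets of transactions that committed after it started. It is a textbook-style toy under stated assumptions, not the dissertation's scheduling technique or its B-tree algorithm; the class names `Store` and `Transaction` are hypothetical.

```python
class Transaction:
    def __init__(self, store, start_ts):
        self.store = store
        self.start_ts = start_ts  # logical clock value when the transaction began
        self.read_set = set()     # keys read from the shared store
        self.writes = {}          # buffered writes, applied only on commit

    def read(self, key):
        if key in self.writes:    # read-your-own-writes needs no validation
            return self.writes[key]
        self.read_set.add(key)
        return self.store.data.get(key)

    def write(self, key, value):
        self.writes[key] = value


class Store:
    def __init__(self):
        self.data = {}
        self.clock = 0  # logical clock, advanced once per successful commit
        self.log = []   # (commit_ts, keys written) for committed transactions

    def begin(self):
        return Transaction(self, self.clock)

    def commit(self, txn):
        # Backward validation: abort if anything this transaction read was
        # overwritten by a transaction that committed after it started.
        for commit_ts, written in self.log:
            if commit_ts > txn.start_ts and written & txn.read_set:
                return False
        self.clock += 1
        self.data.update(txn.writes)
        self.log.append((self.clock, frozenset(txn.writes)))
        return True
```

In this toy version a long-running reader aborts as soon as any key it read is overwritten, which is the behaviour the abstract describes as problematic for large transactions in regular optimistic schemes.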
Abstract:
BACKGROUND: KRAS mutation testing is required to select patients with metastatic colorectal cancer (CRC) to receive anti-epidermal growth factor receptor antibodies, but the optimal KRAS mutation test method is uncertain. METHODS: We conducted a two-site comparison of two commercial KRAS mutation kits - the cobas KRAS Mutation Test and the Qiagen therascreen KRAS Kit - and Sanger sequencing. A panel of 120 CRC specimens was tested with all three methods. The agreement between the cobas test and each of the other methods was assessed. Specimens with discordant results were subjected to quantitative massively parallel pyrosequencing (MPP). DNA blends were tested to determine detection rates at 5% mutant alleles. RESULTS: Reproducibility of the cobas test between sites was 98%. Six mutations were detected by cobas that were not detected by Sanger, and five were confirmed by MPP. The cobas test detected eight mutations which were not detected by the therascreen test, and seven were confirmed by MPP. Detection rates with 5% mutant DNA blends were 100% for the cobas and therascreen tests and 19% for Sanger. CONCLUSION: The cobas test was reproducible between sites, and detected several mutations that were not detected by the therascreen test or Sanger. Sanger sequencing had poor sensitivity for low levels of mutation.
Abstract:
Virtual Screening (VS) methods can considerably aid clinical research by predicting how ligands interact with drug targets. Most VS methods assume a unique binding site for the target, but it has been demonstrated that diverse ligands interact with unrelated parts of the target, and many VS methods do not take this into account. This problem is circumvented by a novel VS methodology named BINDSURF, which scans the whole protein surface to find new hotspots where ligands might interact, and which is implemented on massively parallel Graphics Processing Units, allowing fast processing of large ligand databases. BINDSURF can thus be used in drug discovery, drug design and drug repurposing, and therefore helps considerably in clinical research. However, the accuracy of most VS methods is constrained by limitations in the scoring function that describes biomolecular interactions, and even today these uncertainties are not completely understood. To address this problem, we propose a novel approach in which neural networks are trained with databases of known active (drugs) and inactive compounds, and later used to improve VS predictions.
Abstract:
Short-rib polydactyly syndromes (SRPS I-V) are a group of lethal congenital disorders characterized by shortening of the ribs and long bones, polydactyly, and a range of extraskeletal phenotypes. A number of other disorders in this grouping, including Jeune and Ellis-van Creveld syndromes, have an overlapping but generally milder phenotype. Collectively, these short-rib dysplasias (with or without polydactyly) share a common underlying defect in primary cilium function and form a subset of the ciliopathy disease spectrum. By using whole-exome capture and massive parallel sequencing of DNA from an affected Australian individual with SRPS type III, we detected two novel heterozygous mutations in WDR60, a relatively uncharacterized gene. These mutations segregated appropriately in the unaffected parents and another affected family member, confirming compound heterozygosity, and both were predicted to have a damaging effect on the protein. Analysis of an additional 54 skeletal ciliopathy exomes identified compound heterozygous mutations in WDR60 in a Spanish individual with Jeune syndrome of relatively mild presentation. Of note, these two families share one novel WDR60 missense mutation, although haplotype analysis suggested no shared ancestry. We further show that WDR60 localizes at the base of the primary cilium in wild-type human chondrocytes, and analysis of fibroblasts from affected individuals revealed a defect in ciliogenesis and aberrant accumulation of the GLI2 transcription factor at the centrosome or basal body in the absence of an obvious axoneme. These findings show that WDR60 mutations can cause skeletal ciliopathies and suggest a role for WDR60 in ciliogenesis.
Abstract:
Gnathodiaphyseal dysplasia (GDD) is a rare autosomal dominant condition characterized by bone fragility, irregular bone mineral density (BMD) and fibro-osseous lesions in the skull and jaw. Mutations in Anoctamin-5 (ANO5) have been identified in some cases. We aimed to identify the causative mutation in a family with features of GDD but no mutation in ANO5, using whole exome capture and massive parallel sequencing (WES). WES of two affected individuals (a mother and son) and the mother's unaffected parents identified a mutation in the C-propeptide cleavage site of COL1A1. Similar mutations have been reported in individuals with osteogenesis imperfecta (OI) and paradoxically increased BMD. C-propeptide cleavage site mutations in COL1A1 may not only cause 'high bone mass OI', but also the clinical features of GDD, specifically irregular sclerotic BMD and fibro-osseous lesions in the skull and jaw. GDD patients negative for ANO5 mutations should be assessed for mutations in type I collagen C-propeptide cleavage sites.
Abstract:
Stochastic volatility models are of fundamental importance to the pricing of derivatives. One of the most commonly used models of stochastic volatility is the Heston model, in which the price and volatility of an asset evolve as a pair of coupled stochastic differential equations. The computation of asset prices and volatilities involves the simulation of many sample trajectories with conditioning. The problem is treated using the method of particle filtering. While the simulation of a shower of particles is computationally expensive, each particle behaves independently, making such simulations ideal for massively parallel heterogeneous computing platforms. In this paper, we present our portable OpenCL implementation of the Heston model and discuss its performance and efficiency characteristics on a range of architectures including Intel CPUs, Nvidia GPUs, and Intel Many-Integrated-Core (MIC) accelerators.
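The independence of particles can be seen in a small serial Python sketch of the propagation step only. It is not the paper's OpenCL code and omits the conditioning and resampling of a particle filter. The parameter names (`mu`, `kappa`, `theta`, `xi`, `rho`) follow common Heston notation and, like the log-Euler scheme with full truncation, are assumptions made here.

```python
import math
import random


def heston_step(s, v, dt, mu, kappa, theta, xi, rho, rng):
    """Advance one particle (price s, variance v) by one time step dt."""
    z1 = rng.gauss(0.0, 1.0)
    # Second normal draw, correlated with the first through rho.
    z2 = rho * z1 + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
    v_pos = max(v, 0.0)  # full truncation: use max(v, 0) inside drift and diffusion
    s_new = s * math.exp((mu - 0.5 * v_pos) * dt + math.sqrt(v_pos * dt) * z1)
    v_new = v + kappa * (theta - v_pos) * dt + xi * math.sqrt(v_pos * dt) * z2
    return s_new, v_new


def simulate_particles(n_particles, n_steps, s0, v0, dt,
                       mu, kappa, theta, xi, rho, seed=0):
    """Propagate independent particles; each update touches only its own state."""
    rng = random.Random(seed)
    particles = [(s0, v0)] * n_particles
    for _ in range(n_steps):
        particles = [heston_step(s, v, dt, mu, kappa, theta, xi, rho, rng)
                     for s, v in particles]
    return particles
```

Because `heston_step` reads and writes only one particle's state, the list comprehension in `simulate_particles` is the part that would map onto one work-item per particle on a parallel device, given a per-particle random number stream.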
Abstract:
Segmentation defects of the vertebrae (SDV) are caused by aberrant somite formation during embryogenesis and result in irregular formation of the vertebrae and ribs. The Notch signal transduction pathway plays a critical role in somite formation and patterning in model vertebrates. In humans, mutations in several genes involved in the Notch pathway are associated with SDV, with both autosomal recessive (MESP2, DLL3, LFNG, HES7) and autosomal dominant (TBX6) inheritance. However, many individuals with SDV do not carry mutations in these genes. Using whole-exome capture and massive parallel sequencing, we identified compound heterozygous mutations in RIPPLY2 in two brothers with multiple regional SDV, with appropriate familial segregation. One novel mutation (c.A238T:p.Arg80*) introduces a premature stop codon. In transiently transfected C2C12 mouse myoblasts, the RIPPLY2 mutant protein demonstrated impaired transcriptional repression activity compared with wild-type RIPPLY2 despite similar levels of expression. The other mutation (c.240-4T>G), with minor allele frequency <0.002, lies in the highly conserved splice site consensus sequence 5' to the terminal exon. Ripply2 has a well-established role in somitogenesis and vertebral column formation, interacting at both gene and protein levels with SDV-associated Mesp2 and Tbx6. We conclude that compound heterozygous mutations in RIPPLY2 are associated with SDV, a new gene for this condition. © The Author 2014.
Abstract:
Introduction: Osteoporosis is the commonest metabolic bone disease worldwide. The clinical hallmark of osteoporosis is low trauma fracture, with the most devastating being hip fracture, resulting in significant effects on both morbidity and mortality. Sources of data: Data for this review have been gathered from the published literature and from a range of web resources. Areas of agreement: Genome-wide association studies in the field of osteoporosis have led to the identification of a number of loci associated with both bone mineral density and fracture risk and further increased our understanding of disease. Areas of controversy: The early strategies for mapping osteoporosis disease genes reported only isolated associations, with replication in independent cohorts proving difficult. Neither candidate gene nor linkage studies showed association at genome-wide level of significance. Growing points: The advent of massive parallel sequencing technologies has proved extremely successful in mapping monogenic diseases and is thus leading to the use of this new technology in complex disease genetics. Areas timely for developing research: The identification of novel genes and pathways will potentially lead to the identification of novel therapeutic options for patients with osteoporosis. © 2014 The Author.
Abstract:
Template matching is concerned with measuring the similarity between patterns of two objects. This paper proposes a memory-based reasoning approach for pattern recognition of binary images with a large template set. Memory-based reasoning intrinsically requires a large database. Moreover, some binary image recognition problems inherently need large template sets, such as the recognition of Chinese characters, which needs thousands of templates. The proposed algorithm is based on the Connection Machine, which is the most massively parallel machine to date, and uses a multiresolution method to search for the matching template. The approach uses the pyramid data structure for the multiresolution representation of templates and the input image pattern. For a given binary image, it scans the template pyramid searching for the match. With our algorithm, a binary image of N × N pixels can be matched in O(log N) time, independent of the number of templates. Implementation of the proposed scheme is described in detail.
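A serial Python sketch can make the pyramid idea concrete. It assumes square images whose side is a power of two, a logical-OR reduction over 2 × 2 blocks, and exact matching. None of these details are given in the abstract, and the sketch does not reproduce the Connection Machine implementation or its O(log N) parallel running time.

```python
def reduce_once(img):
    """Halve the resolution: each output pixel is the OR of a 2x2 block."""
    n = len(img) // 2
    return [[img[2 * r][2 * c] | img[2 * r][2 * c + 1]
             | img[2 * r + 1][2 * c] | img[2 * r + 1][2 * c + 1]
             for c in range(n)]
            for r in range(n)]


def build_pyramid(img):
    """Return the levels from coarsest (1x1) to finest (the image itself)."""
    levels = [img]
    while len(levels[-1]) > 1:
        levels.append(reduce_once(levels[-1]))
    return levels[::-1]


def coarse_to_fine_match(img, templates):
    """Names of templates identical to img, pruning candidates level by level."""
    target = build_pyramid(img)
    candidates = {name: build_pyramid(t) for name, t in templates.items()}
    for level, expected in enumerate(target):
        # Identical images have identical pyramids, so a mismatch at a
        # coarse level safely rules a template out.
        candidates = {name: pyr for name, pyr in candidates.items()
                      if pyr[level] == expected}
        if not candidates:
            break
    return sorted(candidates)
```

An N × N image yields log2(N) + 1 pyramid levels, which is where the logarithmic number of comparison stages comes from; on a serial machine the work at each stage still grows with the number of surviving templates.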
Abstract:
3-Dimensional Diffuse Optical Tomographic (3-D DOT) image reconstruction is computationally complex and requires extensive matrix computations, which hampers reconstruction in real time. In this paper, we present near real-time 3-D DOT image reconstruction based on the Broyden approach for updating the Jacobian matrix. The Broyden method simplifies the algorithm by avoiding re-computation of the Jacobian matrix in each iteration. We have developed CPU and heterogeneous CPU/GPU code for 3-D DOT image reconstruction in C and on the MATLAB programming platform. We have used the Compute Unified Device Architecture (CUDA) programming framework and the CUDA linear algebra library (CULA) to utilize the massively parallel computational power of GPUs (NVIDIA Tesla K20c). The computation time achieved by the C implementation on a CPU/GPU system, for measurements in 3 planes and an FEM mesh of 19172 tetrahedral elements, is 806 milliseconds per iteration.
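The saving described here comes from replacing a full Jacobian recomputation with a rank-one correction. The Python sketch below shows the standard ("good") Broyden update on plain lists; the names `dx` (change in the estimated parameters) and `dy` (change in the modelled measurements) are assumptions, and the paper's C/CUDA code and DOT forward model are not reproduced.

```python
def broyden_update(J, dx, dy):
    """Return J + ((dy - J dx) dx^T) / (dx^T dx) as a new list-of-lists matrix.

    J  : m x n Jacobian estimate
    dx : length-n change in the parameters between two iterations
    dy : length-m change in the model output between the same iterations
    """
    norm_sq = sum(x * x for x in dx)
    if norm_sq == 0.0:
        raise ValueError("dx must be non-zero")
    # Residual of the secant condition J dx = dy under the current estimate.
    j_dx = [sum(row[k] * dx[k] for k in range(len(dx))) for row in J]
    residual = [dy[i] - j_dx[i] for i in range(len(J))]
    # Rank-one correction: O(m * n) work instead of a full recomputation.
    return [[J[i][k] + residual[i] * dx[k] / norm_sq for k in range(len(dx))]
            for i in range(len(J))]
```

After the update the new matrix satisfies the secant condition, that is, multiplying it by `dx` reproduces `dy` up to floating-point rounding.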
Abstract:
Microarraying involves laying down genetic elements onto a solid substrate for DNA analysis on a massively parallel scale. Microarrays are prepared using a pin-based robotic platform to transfer liquid samples from microtitre plates to an array pattern of dots of different liquids on the surface of glass slides, where they dry to form spots of diameter < 200 μm. This paper presents the design, materials selection, micromachining technology and performance of reservoir pins for microarraying. A conical pin is produced by (i) conventional machining of stainless steel or wet etching of tungsten wire, followed by (ii) micromachining with a focused laser to produce a microreservoir and a capillary channel structure leading from the tip. The pin has a flat end of diameter < 100 μm, from which a 500 μm long capillary channel < 15 μm wide leads up the pin to a reservoir. Scanning electron micrographs of the metal surface show roughness on the scale of 10 μm, but the pins nevertheless give consistent and reproducible spotting performance. The pin capacity is 80 nanolitres of fluid containing DNA, and at least 50 spots can be printed before replenishing the reservoir. A typical robot can hold up to 64 pins. This paper discusses the fabrication technology, the performance and spotting uniformity of reservoir pins, the possible limits to miniaturization of pins using this approach, and the future prospects for contact and non-contact arraying technology.
Abstract:
This short communication presents our recent studies on implementing numerical simulations of multi-phase flows on top-ranked supercomputer systems with distributed memory architecture. The numerical model is designed to make full use of the capacity of the hardware. Satisfactory scalability in terms of both the parallel speed-up rate and the size of the problem has been obtained on two highly ranked systems with massively parallel processors, the Earth Simulator (Earth simulator research center, Yokohama Kanagawa, Japan) and the TSUBAME (Tokyo Institute of Technology, Tokyo, Japan) supercomputers.