981 results for MAIN-SEQUENCE STARS
Abstract:
BACKGROUND: With the maturation of next-generation DNA sequencing (NGS) technologies, the throughput of DNA sequencing reads has soared to over 600 gigabases from a single instrument run. General-purpose computing on graphics processing units (GPGPU) extracts the computing power from hundreds of parallel stream processors within graphics processing cores and provides a cost-effective and energy-efficient alternative to traditional high-performance computing (HPC) clusters. In this article, we describe the implementation of BarraCUDA, a GPGPU sequence alignment software that is based on BWA, to accelerate the alignment of sequencing reads generated by these instruments to a reference DNA sequence. FINDINGS: Using the NVIDIA Compute Unified Device Architecture (CUDA) software development environment, we ported the most computationally intensive alignment component of BWA to the GPU to take advantage of its massive parallelism. As a result, BarraCUDA offers an order-of-magnitude performance boost in alignment throughput when compared to a CPU core while delivering the same level of alignment fidelity. The software is also capable of supporting multiple CUDA devices in parallel to further accelerate the alignment throughput. CONCLUSIONS: BarraCUDA is designed to take advantage of the parallelism of the GPU to accelerate the alignment of millions of sequencing reads generated by NGS instruments. By doing this, we could, at least in part, streamline the current bioinformatics pipeline such that the wider scientific community could benefit from the sequencing technology. BarraCUDA is currently available from http://seqbarracuda.sf.net.
Abstract:
The mechanism by which the three-dimensional structures of proteins form, i.e. the mechanism of protein folding, is a basic problem in molecular biology that remains unsolved. A core question is whether there is three-dimensional genetic information that determines the three-dimensional structures of proteins. However, research in this area has not yet been reported. Recently, we made a comparative study of the folded structures of more than 70 mature messenger RNAs (mRNAs) and the three-dimensional structures of the proteins they encode, and found marked correspondences between their structural features in the following aspects: 1. The number of structural units. An RNA molecule can form a secondary structure (stem-and-loop structure) by folding and base pairing with itself. The elementary structural unit of an RNA secondary structure is the hairpin (or compound hairpin). The regular structural unit in the secondary structure of a protein is the α-helix or β-sheet. We found that the number of hairpins in the secondary structure of each mature mRNA is equal or approximately equal to the number of regular secondary structural units of the encoded protein. 2. Turning regions. The turn is a main structural element in the secondary structure of a protein and determines, to some extent, the backbone orientation of the protein molecule. Our analysis shows that the nucleotide sequence segments in an mRNA that encode the turns of the corresponding protein are generally situated in the turning regions of the mRNA secondary structure, such as hairpin, bulge or multibranch loops. 3. The arrangement of structural elements in space. In order to understand the backbone orientation of an RNA molecule and the arrangement of its structural elements in space, we modeled the three-dimensional structure of the mRNA molecule on an SGI workstation based on its secondary structure. The result shows that the spatial arrangement of most of the nucleotide sequence segments encoding the structural elements of a protein is consistent with that of these structural elements in the protein. For instance, the nucleotide sequences corresponding to each strand of a β-sheet structure are close to each other in the mRNA secondary structure and in the three-dimensional structure, although some of the nucleotide segments are far apart from each other in the one-dimensional sequence. As another instance, the two triplet codons of cysteines that form a disulphide bridge are generally very close to each other in the mRNA folded structure. In addition, we analyzed the locations of proline codons and the distribution of α-helix-coding nucleotide sequences in the folded structures of mRNAs, and found some distribution patterns. All of these results suggest that the transfer of genetic information from mRNA to protein is not only one-dimensional but also three-dimensional. That is, there exists genetic information that determines the three-dimensional structures of proteins. To a certain extent, we could say that mRNA folding determines protein folding. Based on these results, it would be possible to predict the three-dimensional structures of proteins from the primary, secondary and tertiary structures of the mRNAs with higher accuracy. More importantly, a new clue has been provided for uncovering the "spatial coding" of genetic information.
Abstract:
We propose a novel model for the spatio-temporal clustering of trajectories based on motion, which applies to challenging street-view video sequences of pedestrians captured by a mobile camera. A key contribution of our work is the introduction of novel probabilistic region trajectories, motivated by the non-repeatability of segmentation of frames in a video sequence. Hierarchical image segments are obtained by using a state-of-the-art hierarchical segmentation algorithm, and connected from adjacent frames in a directed acyclic graph. The region trajectories and measures of confidence are extracted from this graph using a dynamic programming-based optimisation. Our second main contribution is a Bayesian framework with a twofold goal: to learn the optimal, in a maximum likelihood sense, Random Forests classifier of motion patterns based on video features, and construct a unique graph from region trajectories of different frames, lengths and hierarchical levels. Finally, we demonstrate the use of Isomap for effective spatio-temporal clustering of the region trajectories of pedestrians. We support our claims with experimental results on new and existing challenging video sequences. © 2011 IEEE.
Abstract:
Many experimental studies have shown that many introns of eukaryotic genes function as regulators of transcription. However, comprehensive studies of this problem have not yet been conducted. After checking the transcription frequencies of some Saccharomyces cerevisiae (yeast) genes and their introns, we discovered a remarkable phenomenon: in general, the introns of genes with higher transcription frequencies are longer, and the introns of genes with lower transcription frequencies are shorter. This suggests that the longer introns of genes with higher transcription frequencies may contain some characteristic sequence structures that could enhance the transcription of genes. Therefore, two sets of introns of yeast genes were chosen for further study. The transcription frequencies of the first set of genes are higher (>30), and those of the second set of genes are lower (≤10). Statistical comparative analyses of the occurrence frequencies of oligonucleotides (mainly tetranucleotides and pentanucleotides) detected some oligonucleotides whose occurrence frequencies in the first set of introns are significantly higher than those in the second set of introns, and are also significantly higher than those in the exons flanking the introns of the first set. Some of these extracted oligonucleotides are the same as the regulatory elements of transcription revealed by experimental analyses. In addition, the distributions of these extracted oligonucleotides in the two sets of introns and the exons show that the sequence structures of the first set of introns are favorable for the transcription of genes.
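The oligonucleotide comparison described in this abstract can be illustrated with a minimal sketch. The abstract does not say which statistical test was used, so the simple frequency-ratio threshold below is an assumption standing in for it, and the function names (`kmer_frequencies`, `enriched_kmers`), the `min_ratio` and `pseudo` parameters, and the toy sequences are all hypothetical.

```python
from collections import Counter


def kmer_frequencies(sequences, k):
    """Return the relative frequency of each overlapping k-mer across sequences."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        # Slide a window of width k over the sequence, one base at a time.
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {kmer: n / total for kmer, n in counts.items()}


def enriched_kmers(set_a, set_b, k, min_ratio=2.0, pseudo=1e-6):
    """List k-mers whose frequency in set_a is at least min_ratio times that in set_b.

    The ratio threshold is an illustrative stand-in for a real statistical
    test; the pseudo-count only avoids division by zero.
    """
    freq_a = kmer_frequencies(set_a, k)
    freq_b = kmer_frequencies(set_b, k)
    return sorted(
        kmer
        for kmer, f in freq_a.items()
        if f / (freq_b.get(kmer, 0.0) + pseudo) >= min_ratio
    )


# Toy sequences invented for illustration only (not data from the study).
high_expr_introns = ["TTTTGATTTTGA", "ATTTTGCC"]
low_expr_introns = ["ACGTACGTACGT", "CCGGAACC"]
hits = enriched_kmers(high_expr_introns, low_expr_introns, k=4)
```

Running the same comparison against the flanking exons, as the abstract describes, would only require a second call with the exon sequences as `set_b`.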
Abstract:
A chymotrypsin inhibitor, designated NA-CI, was isolated from the venom of the Chinese cobra Naja atra by three-step chromatography. It inhibited bovine α-chymotrypsin with a Ki of 25 nM. The molecular mass of NA-CI was determined to be 6403.8 Da by matrix-assisted laser-desorption ionization time-of-flight (MALDI-TOF) analysis. The complete amino acid sequence was determined after digestion of the S-carboxymethylated inhibitor with Staphylococcus aureus V8 protease and porcine trypsin. NA-CI was a single polypeptide chain composed of 57 amino acid residues. The main contact site with the protease (P1) has a Phe, showing the specificity of the inhibitor. NA-CI shared great similarity with the chymotrypsin inhibitor from Naja naja venom (identities = 89.5%) and other snake venom protease inhibitors. © 2003 Elsevier Inc. All rights reserved.
Abstract:
The molecular phylogeny of three genera containing nine species and subspecies of the specialized schizothoracine fishes is investigated based on the complete nucleotide sequence of the mitochondrial cytochrome b gene. The relationships between the main cladogenetic events of the specialized schizothoracine fishes and the stepwise uplift of the Qinghai-Tibetan Plateau are also examined using a molecular clock calibrated by geological isolation events between the upper reaches of the Yellow River and Qinghai Lake. The results indicate that the specialized schizothoracine fishes are not monophyletic. Five species and subspecies of Ptychobarbus form a monophyletic group, but three species of Gymnodiptychus do not. Gd. integrigymnatus is a sister taxon of the highly specialized schizothoracine fishes, while Gd. pachycheilus is closely related to Gd. dybowskii, and together they form a sister group of Diptychus maculatus. The specialized schizothoracine fishes might have originated during the Miocene (about 10 MaBP), and the divergence of the three genera then happened during the late Miocene (about 8 MaBP). Their main specialization occurred during the late Pliocene and Pleistocene (3.54-0.42 MaBP). The main cladogenetic events of the specialized schizothoracine fishes are mostly correlated with the tectonic events and intensive climate shifts that happened at 8, 3.6, 2.5 and 1.7 MaBP in the late Cenozoic. The molecular clock data do not support the hypothesis that the Qinghai-Tibetan Plateau uplifted to near present or even higher elevations during the Oligocene or Miocene, nor do they agree with the view that the plateau had risen only to an altitude of 2000 m by the late Pliocene (about 2.6 MaBP).
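As a rough illustration of how a calibrated molecular clock yields divergence dates, the sketch below assumes a strict clock with a single constant substitution rate, which the abstract does not confirm. The function names and all numeric values are hypothetical placeholders, not figures from the study.

```python
def calibrate_rate(calib_distance, calib_time_ma):
    """Per-lineage substitution rate (substitutions/site/Ma) from a dated split.

    The pairwise distance accumulates along both lineages, hence the factor 2.
    """
    if calib_time_ma <= 0:
        raise ValueError("calibration time must be positive")
    return calib_distance / (2.0 * calib_time_ma)


def divergence_time(distance, rate):
    """Estimated time since divergence (Ma) under a strict molecular clock."""
    if rate <= 0:
        raise ValueError("rate must be positive")
    return distance / (2.0 * rate)


# Placeholder inputs: a calibration pair separated by a geological event
# dated at 1.0 Ma with pairwise distance 0.02, then a query pair at 0.10.
rate = calibrate_rate(calib_distance=0.02, calib_time_ma=1.0)
estimated_ma = divergence_time(distance=0.10, rate=rate)
```

A relaxed-clock or model-corrected distance would change the numbers but not the basic relation between distance, rate and time shown here.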
Abstract:
We analyzed n-mers (n=3-8) in the local environment of 8,249,446 human SNPs and compared their distribution with that in the genome reference sequences. The results revealed that the short sequences, which contained at least one CpG dinucleotide, occurred
Abstract:
A genome-wide view of sequence mutability in mice is still limited, although biologists usually assume the same scenario for mice as for humans. In this study, we examined the sequence context in the local environment of 482,528 mouse single nucleotide po
Abstract:
The DYN3D reactor dynamics nodal diffusion code was originally developed for the analysis of Light Water Reactors. In this paper, we demonstrate the feasibility of using DYN3D for modeling fast spectrum reactors. A homogenized cross-section data library was generated using the continuous-energy Monte Carlo code Serpent, which provides significant modeling flexibility compared with traditional deterministic lattice transport codes at a tolerable execution time. A representative sodium-cooled fast reactor core was modeled with the Serpent-DYN3D code sequence, and the results were compared with those produced by the ERANOS code and with a 3D full-core Monte Carlo solution. Very good agreement between the codes was observed for the core integral parameters and power distribution, suggesting that the DYN3D code with a cross-section library generated using Serpent can be reliably used for the analysis of fast reactors. © 2012 Elsevier Ltd. All rights reserved.
Abstract:
Toll-like receptor 3 (TLR3) participates in the innate immune response by recognizing viral pathogens. To investigate the response of the grass carp immune system to GCRV (grass carp reovirus) infection, the full-length cDNA sequence and genomic organization of grass carp TLR3 (CiTLR3) were identified and characterized. The full-length genomic sequence of CiTLR3 is composed of 5668 nucleotides, including five exons and four introns. The full-length CiTLR3 cDNA is 3681 bp long and encodes a polypeptide of 904 amino acids with an estimated molecular mass of 102,765 Da and a predicted isoelectric point of 8.35. Analysis of the deduced amino acid sequence indicated that CiTLR3 has four main structural domains, including a signal peptide sequence, 14 LRR (leucine-rich repeat) motifs, a transmembrane region and a TIR (Toll/interleukin-1 receptor) domain. It is most similar to the crucian carp (Carassius auratus) TLR3 amino acid sequence, with an identity of 99%. Quantitative RT-PCR analysis showed that CiTLR3 transcripts were significantly up-regulated starting at day 1 and continuing through day 7 following GCRV infection (P < 0.05). These data imply that CiTLR3 is involved in antiviral defense, provide molecular and functional information for grass carp TLR3, and implicate its role in mediating immune protection against grass carp viral diseases. © 2009 Elsevier Ltd. All rights reserved.
Abstract:
Background: Cytochrome P450 monooxygenases play key roles in the metabolism of a wide variety of substrates, and they are closely associated with endocellular physiological processes or detoxification metabolism under environmental exposure. To date, however, none has been systematically characterized in the phylum Ciliophora. T. thermophila possesses many advantages as a eukaryotic model organism and exhibits rapid and sensitive responses to xenobiotics, making it an ideal model system to study the evolutionary and functional diversity of the P450 monooxygenase gene family. Results: A total of 44 putative functional cytochrome P450 genes were identified and could be classified into 13 families and 21 sub-families according to standard nomenclature. The characteristics of both the conserved intron-exon organization and the scaffold localization of tandem repeats within each P450 family clade suggested that the enlargement of T. thermophila P450 families probably resulted from recent separate small duplication events. Gene expression patterns of all T. thermophila P450s during three important cell physiological stages (vegetative growth, starvation and conjugation) were analyzed based on EST and microarray data, and three main categories of expression patterns were postulated. Evolutionary analyses, including codon usage preference, site-specific selection and gene-expression evolution patterns, were carried out, and the results indicated remarkable divergences among the T. thermophila P450 genes. Conclusion: The characterization, expression and evolutionary analysis of T. thermophila P450 monooxygenase genes in the current study provides useful information for understanding the characteristics and diversity of the P450 genes in the Ciliophora, and provides a baseline for functional analyses of individual P450 isoforms in this model ciliate species.
Abstract:
The complete mitochondrial genome sequence of Achalinus meiguensis is reported for the first time in the present study. The complete mitochondrial genome of A. meiguensis is 17239 bp in length and contains 13 protein-coding genes, 22 tRNAs, 2 rRNAs, and 2 non-coding regions (control regions). On the basis of comparison with the other complete mitochondrial sequences reported, we explored the characteristics of its structure and evolution. For example, duplicated control regions occurred independently in the evolutionary history of reptiles; the pseudo-tRNA of snakes occurred in the Caenophidia; and snake tRNAs are shorter than those of other vertebrates because of truncations of the TψC arm (less than 5 bp) and the "DHU" arm. Phylogenetic analysis by MP and BI showed that A. meiguensis was placed in the Caenophidia as a sister group to other advanced snakes, with the exception of Acrochordus granulatus, which was rooted in the Caenophidia. Therefore, we suggest that the subfamily Xenodermatinae, which contains A. meiguensis, should be raised to family rank or higher. At the same time, based on the phylogenetic statistical test, the Bayesian tree was used for estimating divergence times. The results showed that the divergence time between Henophidia and Caenophidia was 109.50 Mya; the divergence between Acrochordus granulatus and the other snakes of the Caenophidia was 106.18 Mya; the divergence time of A. meiguensis was 103 Mya; and the divergence of Viperidae from the lineage of Elapidae and Colubridae was 96.06 Mya.
Abstract:
The objective of this study was to illustrate the phylogenetic relationships of the species in the genus Craspedacusta in China. Medusae samples were collected at 28 localities in China, representing seven described species, and their entire ITS region rDNA sequences (the contiguous sequences of ITS-1, 5.8S and ITS-2 rDNA) were cloned. Among the 28 samples, the range of sequence variation in the complete ITS and 5.8S region was between 0 and 36.2%. Three main clades were revealed by both maximum likelihood and neighbour-joining trees, with sequence differences of 0-0.9, 0-3.7 and 0.1-1.5% in the three clades. The nesting of C. xinyangensis representatives within C. sowerbii, C. brevinema within C. sinensis and C. sichuanensis within C. kiatingi is strongly supported, with interspecific sequence divergence of 0-0.9, 0.1-1.4 and 0.0-0.4%, respectively. Thus, it is suggested that C. xinyangensis should be a synonym of C. sowerbii, C. sichuanensis a synonym of C. kiatingi and C. brevinema a synonym of C. sinensis. However, the taxonomic status of C. ziguiensis is still uncertain. According to the tree topology, C. kiatingi was closer to C. sowerbii than to C. sinensis. Craspedacusta sinensis was the most genetically distinct according to distance matrix values and was located at the base of the phylogenetic trees, so it can be speculated that C. sinensis may be the ancestral form in the genus Craspedacusta.
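The percentage sequence differences quoted in this abstract can be illustrated with a small sketch. The abstract does not state how differences were computed, so the code below assumes an uncorrected pairwise difference (p-distance) over pre-aligned sequences, skipping gap columns. The function name and the example sequences are hypothetical.

```python
def percent_difference(seq_a, seq_b, gap="-"):
    """Uncorrected percentage difference between two aligned sequences.

    Columns where either sequence has a gap are ignored (an assumption;
    other gap treatments are equally common).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable (ungapped) columns")
    return 100.0 * differences / compared


# Invented aligned fragments, for illustration only.
one_mismatch = percent_difference("ACGTACGTAC", "ACGTACGTAT")
with_gap = percent_difference("AC-T", "ACGA")
```

Applying this function to every pair of samples within a clade and taking the minimum and maximum would give a range in the same form as those reported above.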
Abstract:
It is well known that several morphospecies of Microcystis, such as Microcystis aeruginosa (Kutzing) Lemmermann and Microcystis viridis (A. Brown) Lemmermann, can produce hepatotoxic microcystins. However, previous studies gave contradictory conclusions about microcystin production by Microcystis wesenbergii (Komarek) Komarek. In the present study, ten Microcystis morphospecies were identified in water blooms of seven Chinese waterbodies, and Microcystis wesenbergii was shown to be the dominant species in these waters. More than 250 single colonies of M. wesenbergii were selected after morphological identification to examine whether M. wesenbergii produces hepatotoxic microcystin, using multiplex PCR for molecular detection of a region (mcyA) of the microcystin synthesis genes. Chemical analyses of microcystin content by ELISA and HPLC were also performed for 21 strains of M. wesenbergii isolated from these waters. Both molecular and chemical methods demonstrated that M. wesenbergii from Chinese waters did not produce microcystin. © 2007 Elsevier Ltd. All rights reserved.