227 results for NGS sequencing
Abstract:
Global aquaculture has expanded rapidly to address the increasing demand for aquatic protein and an uncertain future for wild fisheries. To date, however, most farmed aquatic stocks are essentially wild and little is known about their genomes or the genes that affect important economic traits in culture. Biologists have recognized that recent technological advances including next generation sequencing (NGS) have opened up the possibility of generating genome wide sequence data sets rapidly from non-model organisms at a reasonable cost. In an era when virtually any study organism can 'go genomic', understanding gene function and genetic effects on expressed quantitative trait locus phenotypes will be fundamental to future knowledge development. Many factors can influence the individual growth rate in target species but of particular importance in agriculture and aquaculture will be the identification and characterization of the specific gene loci that contribute important phenotypic variation to growth because the information can be applied to speed up genetic improvement programmes and to increase productivity via marker-assisted selection (MAS). While currently there is only limited genomic information available for any crustacean species, a number of putative candidate genes have been identified or implicated in growth and muscle development in some species. In an effort to stimulate increased research on the identification of growth-related genes in crustacean species, here we review the available information on: (i) associations between genes and growth reported in crustaceans; (ii) growth-related genes involved with moulting; (iii) muscle development and degradation genes involved in moulting; and (iv) correlations between DNA sequences that have confirmed growth trait effects in farmed animal species used in terrestrial agriculture and related sequences in crustacean species.
The information in concert can provide a foundation for increasing the rate at which knowledge about key genes affecting growth traits in crustacean species is gained.
Abstract:
The high risk of metabolic disease traits in Polynesians may be partly explained by elevated prevalence of genetic variants involved in energy metabolism. The genetics of Polynesian populations has been shaped by island-hopping migration events which have possibly favoured thrifty genes. The aim of this study was to sequence the mitochondrial genome in a group of Maori in an effort to characterise genome variation in this Polynesian population for use in future disease association studies. We sequenced the complete mitochondrial genomes of 20 non-admixed Maori subjects using Affymetrix technology. DNA diversity analyses showed the Maori group exhibited reduced mitochondrial genome diversity compared to other worldwide populations, which is consistent with historical bottleneck and founder effects. Global phylogenetic analysis positioned these Maori subjects specifically within mitochondrial haplogroup B4a1a1. Interestingly, we identified several novel variants that collectively form new and unique Maori motifs: B4a1a1c, B4a1a1a3 and B4a1a1a5. Compared to ancestral populations we observed an increased frequency of non-synonymous coding variants of several mitochondrial genes in the Maori group, which may be a result of positive selection and/or genetic drift effects. In conclusion, this study reports the first complete mitochondrial genome sequence data for a Maori population. Overall, these new data reveal novel mitochondrial genome signatures in this Polynesian population and enhance the phylogenetic picture of maternal ancestry in Oceania. The increased frequency of several mitochondrial coding variants makes them good candidates for future studies aimed at assessment of metabolic disease risk in Polynesian populations.
Abstract:
We characterized the mutational landscape of melanoma, the form of skin cancer with the highest mortality rate, by sequencing the exomes of 147 melanomas. Sun-exposed melanomas had markedly more ultraviolet (UV)-like C>T somatic mutations compared to sun-shielded acral, mucosal and uveal melanomas. Among the newly identified cancer genes was PPP6C, encoding a serine/threonine phosphatase, which harbored mutations that clustered in the active site in 12% of sun-exposed melanomas, exclusively in tumors with mutations in BRAF or NRAS. Notably, we identified a recurrent UV-signature activating mutation in RAC1 in 9.2% of sun-exposed melanomas. This activating mutation, the third most frequent in our cohort of sun-exposed melanoma after those of BRAF and NRAS, changes Pro29 to serine (RAC1 P29S) in the highly conserved switch I domain. Crystal structures, and biochemical and functional studies of RAC1 P29S showed that the alteration releases the conformational restraint conferred by the conserved proline, causes an increased binding of the protein to downstream effectors, and promotes melanocyte proliferation and migration. These findings raise the possibility that pharmacological inhibition of downstream effectors of RAC1 signaling could be of therapeutic benefit.
Abstract:
Melanoma has historically been refractory to traditional therapeutic approaches. As such, the development of novel drug strategies has been needed to improve rates of overall survival in patients with melanoma, particularly those with late stage or disseminated disease. Recent success with molecularly based targeted drugs, such as Vemurafenib in BRAF-mutant melanomas, has now made “personalized medicine” a reality within some oncology clinics. In this sense, tailored drugs can be administered to patients according to their tumor “mutation profiles.” The success of these drug strategies, in part, can be attributed to the identification of the genetic mechanisms responsible for the development and progression of metastatic melanoma. Recently, the advances in sequencing technology have allowed for comprehensive mutation analysis of tumors and have led to the identification of a number of genes involved in the etiology of metastatic melanoma. As the methodology and costs associated with next-generation sequencing continue to improve, this technology will be rapidly adopted into routine clinical oncology practices and will significantly impact personalized therapy. This review summarizes current and emerging molecular targets in metastatic melanoma, discusses the potential application of next-generation sequencing within the paradigm of personalized medicine, and describes the current limitations for the adoption of this technology within the clinic.
Abstract:
The candidate gene approach has been a pioneer in the field of genetic epidemiology, identifying risk alleles and their association with clinical traits. With the advent of rapidly changing technology, there has been an explosion of in silico tools available to researchers, giving them fast, efficient resources and reliable strategies for finding causal gene variants in candidate gene or genome wide association studies (GWAS). In this review, following a description of candidate gene prioritisation, we summarise the approaches to single nucleotide polymorphism (SNP) prioritisation and discuss the tools available to assess functional relevance of the risk variant with consideration to its genomic location. The strategy and the tools discussed are applicable to any study investigating genetic risk factors associated with a particular disease. Some of the tools are also applicable for the functional validation of variants relevant to the era of GWAS and next generation sequencing (NGS).
Abstract:
Forward genetic screens have identified numerous genes involved in development and metabolism, and remain a cornerstone of biological research. However, to locate a causal mutation, the practice of crossing to a polymorphic background to generate a mapping population can be problematic if the mutant phenotype is difficult to recognize in the hybrid F2 progeny, or dependent on parental specific traits. Here, in a screen for leaf hyponasty mutants, we have performed a single backcross of an ethyl methanesulfonate (EMS)-generated hyponastic mutant to its parent. Whole genome deep sequencing of a bulked homozygous F2 population and analysis via the Next Generation EMS mutation mapping pipeline (NGM) unambiguously determined the causal mutation to be a single nucleotide polymorphism (SNP) residing in HASTY, a previously characterized gene involved in microRNA biogenesis. We have evaluated the feasibility of this backcross approach using three additional SNP mapping pipelines: SHOREmap, the GATK pipeline, and the samtools pipeline. Although there was variance in the identification of EMS SNPs, all returned the same outcome in clearly identifying the causal mutation in HASTY. The simplicity of performing a single parental backcross and genome sequencing a small pool of segregating mutants has great promise for identifying mutations that may be difficult to map using conventional approaches.
Abstract:
Potato leafroll virus (PLRV) is a positive-strand RNA virus that generates subgenomic RNAs (sgRNA) for expression of 3' proximal genes. Small RNA (sRNA) sequencing and mapping of the PLRV-derived sRNAs revealed coverage of the entire viral genome with the exception of four distinctive gaps. Remarkably, these gaps mapped to areas of the PLRV genome with extensive secondary structures, such as the internal ribosome entry site and the 5' transcriptional start sites of sgRNA1 and sgRNA2. The last gap mapped to ~500 nt from the 3' terminus of the PLRV genome and suggested the possible presence of an additional sgRNA for PLRV. Quantitative real-time PCR and northern blot analysis confirmed the expression of sgRNA3, and subsequent analyses placed its 5' transcriptional start site at position 5347 of the PLRV genome. A regulatory role is proposed for the PLRV sgRNA3 as it encodes an RNA-binding protein with specificity to the 5' end of PLRV genomic RNA.
Abstract:
For the first of the baby boomers turning 65 years of age, after a decade littered with financial shocks (dot-com bubble, sub-prime, global financial crisis, sovereign debt), sequencing risk can represent a significant threat to their retirement nest eggs. This paper takes an outcome-oriented approach to the problem, to provide practical insights into how sequencing risk works and the critical dependency of retirement outcomes on sequencing risk. Our analysis challenges the conventional wisdom that it is the accumulated average of investment returns that matters. We show, instead, that it is the realised sequence of returns which largely determines the sustainability of retirement incomes.
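The distinction between average returns and the realised sequence of returns can be illustrated with a toy calculation. The sketch below is not taken from the paper: the function name `final_balance`, the three yearly returns, the starting balance and the withdrawal amount are all hypothetical values chosen only to show the effect.

```python
def final_balance(start, returns, withdrawal):
    """Apply each year's return, then take a fixed withdrawal (hypothetical model)."""
    balance = start
    for r in returns:
        balance = balance * (1 + r) - withdrawal
    return balance


# Illustrative returns only; the same three values in opposite order.
good_first = [0.20, 0.05, -0.15]
bad_first = list(reversed(good_first))

# With no withdrawals the order is irrelevant: the product of growth factors is the same.
no_draw_a = final_balance(100.0, good_first, 0.0)
no_draw_b = final_balance(100.0, bad_first, 0.0)

# With withdrawals, early losses deplete the balance before the gains arrive.
draw_a = final_balance(100.0, good_first, 10.0)
draw_b = final_balance(100.0, bad_first, 10.0)
```

Under these assumed numbers both orderings grow 100 to about 107.1 when nothing is withdrawn, but with a withdrawal of 10 per year the balance ends near 79.7 when the gains come first and at 72.5 when the loss comes first, even though the average return is identical.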
Abstract:
Sorghum is a food and feed cereal crop adapted to heat and drought and a staple for 500 million of the world’s poorest people. Its small diploid genome and phenotypic diversity make it an ideal C4 grass model as a complement to C3 rice. Here we present high coverage (16–45×) resequenced genomes of 44 sorghum lines representing the primary gene pool and spanning dimensions of geographic origin, end-use and taxonomic group. We also report the first resequenced genome of S. propinquum, identifying 8 M high-quality SNPs, 1.9 M indels and specific gene loss and gain events in S. bicolor. We observe strong racial structure and a complex domestication history involving at least two distinct domestication events. These assembled genomes enable the leveraging of existing cereal functional genomics data against the novel diversity available in sorghum, providing an unmatched resource for the genetic improvement of sorghum and other grass species.
Abstract:
In the current market, extensive software development is taking place and the software industry is thriving. Major software giants have cited source code theft as a major threat to revenues. By inserting an identity-establishing watermark in the source code, a company can prove its ownership of the source code. In this paper, we propose a watermarking scheme for C/C++ source code that exploits the language restrictions. If a function calls another function, the latter needs to be defined in the code before the former, unless one uses function pre-declarations. We embed the watermark in the code by imposing an ordering on the mutually independent functions through the introduction of bogus dependencies. Removal of these dependencies by an attacker to erase the watermark requires extensive manual intervention, thereby making the attack infeasible. The scheme is also secure against subtractive and additive attacks. Using our watermarking scheme, an n-bit watermark can be embedded in a program having n independent functions. The scheme is implemented on several sample codes and performance changes are analyzed.
Abstract:
Planning techniques for large scale earthworks have been considered in this article. To improve these activities a “block theoretic” approach was developed that provides an integrated solution consisting of an allocation of cuts to fills and a sequence of cuts and fills over time. It considers the constantly changing terrain by computing haulage routes dynamically. Consequently more realistic haulage costs are used in the decision making process. A digraph is utilised to describe the terrain surface which has been partitioned into uniform grids. It reflects the true state of the terrain, and is altered after each cut and fill. A shortest path algorithm is successively applied to calculate the cost of each haul, and these costs are summed over the entire sequence, to provide a total cost of haulage. To solve this integrated optimisation problem a variety of solution techniques were applied, including constructive algorithms, meta-heuristics and parallel programming. The extensive numerical investigations have successfully shown the applicability of our approach to real sized earthwork problems.
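The idea of recomputing haulage routes on a terrain digraph after each cut or fill can be sketched as follows. This is a minimal illustration and not the authors' model: the function `haul_cost`, the grid of heights, and the cost rule (one unit per move plus the height climbed, which makes arcs directional) are assumptions introduced here, and Dijkstra's algorithm stands in for the unspecified shortest path algorithm.

```python
import heapq


def haul_cost(heights, start, goal):
    """Cheapest route between two grid cells using Dijkstra's algorithm.

    Assumed arc cost: 1 per move plus any height gained, so uphill and
    downhill directions differ and the grid behaves as a digraph.
    """
    rows, cols = len(heights), len(heights[0])
    best = {start: 0.0}
    queue = [(0.0, start)]
    while queue:
        cost, (r, c) = heapq.heappop(queue)
        if (r, c) == goal:
            return cost
        if cost > best.get((r, c), float("inf")):
            continue  # stale queue entry
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                climb = max(0.0, heights[nr][nc] - heights[r][c])
                new_cost = cost + 1.0 + climb
                if new_cost < best.get((nr, nc), float("inf")):
                    best[(nr, nc)] = new_cost
                    heapq.heappush(queue, (new_cost, (nr, nc)))
    return float("inf")


# Hypothetical terrain with a mound between the start and goal cells.
terrain = [[0.0, 5.0, 0.0],
           [0.0, 0.0, 0.0]]
before_cut = haul_cost(terrain, (0, 0), (0, 2))  # route detours around the mound

terrain[0][1] = 0.0  # the mound is cut away, so the graph changes
after_cut = haul_cost(terrain, (0, 0), (0, 2))   # direct route is now cheapest
```

In this toy terrain the haul costs 4.0 before the cut and 2.0 after it, which shows why costs evaluated on the current state of the terrain differ from costs fixed in advance.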
Abstract:
Background: The sequencing, de novo assembly and annotation of transcriptome datasets generated with next generation sequencing (NGS) has enabled biologists to answer genomic questions in non-model species with unprecedented ease. Reliable and accurate de novo assembly and annotation of transcriptomes, however, is a critically important step for transcriptome assemblies generated from short read sequences. Typical benchmarks for assembly and annotation reliability have been performed with model species. To address the reliability and accuracy of de novo transcriptome assembly in non-model species, we generated an RNAseq dataset for an intertidal gastropod mollusc species, Nerita melanotragus, and compared the assemblies produced by four different de novo transcriptome assemblers (Velvet, Oases, Geneious and Trinity) for a number of quality metrics and redundancy. Results: Transcriptome sequencing on the Ion Torrent PGM™ produced 1,883,624 raw reads with a mean length of 133 base pairs (bp). The Trinity and Oases de novo assemblers produced the best assemblies based on all quality metrics, including fewer contigs, increased N50 and average contig length, and contigs of greater length. Overall, the BLAST and annotation success of our assemblies was not high, with only 15-19% of contigs assigned a putative function. Conclusions: We believe that any improvement in annotation success of gastropod species will require more gastropod genome sequences, but in particular an increase in mollusc protein sequences in public databases. Overall, this paper demonstrates that reliable and accurate de novo transcriptome assemblies can be generated from short read sequencers with the right assembly algorithms. Keywords: Nerita melanotragus; De novo assembly; Transcriptome; Heat shock protein; Ion Torrent
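N50, one of the quality metrics mentioned above, is the length of the contig at which the running total of contig lengths, taken from longest to shortest, first reaches half of the total assembly length. The abstract does not say how the authors computed it; the function below is a minimal sketch of the standard definition, and the contig lengths in the usage line are made-up values.

```python
def n50(contig_lengths):
    """Return the N50 of an assembly given its contig lengths."""
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length
    raise ValueError("contig_lengths must not be empty")


# Made-up contig lengths: total is 54, and 10 + 9 + 8 = 27 reaches half of it.
example_n50 = n50([2, 3, 4, 5, 6, 7, 8, 9, 10])
```

For the made-up lengths above the result is 8. A higher N50 for the same total length indicates that more of the assembly sits in long contigs.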
Abstract:
Background: Small RNA sequencing is commonly used to identify novel miRNAs and to determine their expression levels in plants. There are several miRNA identification tools for animals such as miRDeep, miRDeep2 and miRDeep*. miRDeep-P was developed to identify plant miRNA using miRDeep’s probabilistic model of miRNA biogenesis, but it depends on several third party tools and lacks a user-friendly interface. The objective of our miRPlant program is to predict novel plant miRNA, while providing a user-friendly interface with improved accuracy of prediction. Results: We have developed a user-friendly plant miRNA prediction tool called miRPlant. We show using 16 plant miRNA datasets from four different plant species that miRPlant has at least a 10% improvement in accuracy compared to miRDeep-P, which is the most popular plant miRNA prediction tool. Furthermore, miRPlant uses a Graphical User Interface for data input and output, and identified miRNA are shown with all RNAseq reads in a hairpin diagram. Conclusions: We have developed miRPlant which extends miRDeep* to various plant species by adopting suitable strategies to identify hairpin excision regions and hairpin structure filtering for plants. miRPlant does not require any third party tools such as mapping or RNA secondary structure prediction tools. miRPlant is also the first plant miRNA prediction tool that dynamically plots miRNA hairpin structure with small reads for identified novel miRNAs. This feature will enable biologists to visualize novel pre-miRNA structure and the location of small RNA reads relative to the hairpin. Moreover, miRPlant can be easily used by biologists with limited bioinformatics skills.
Abstract:
Determination of sequence similarity is a central issue in computational biology, a problem addressed primarily through BLAST, an alignment based heuristic which has underpinned much of the analysis and annotation of the genomic era. Despite their success, alignment-based approaches scale poorly with increasing data set size, and are not robust under structural sequence rearrangements. Successive waves of innovation in sequencing technologies – so-called Next Generation Sequencing (NGS) approaches – have led to an explosion in data availability, challenging existing methods and motivating novel approaches to sequence representation and similarity scoring, including adaptation of existing methods from other domains such as information retrieval. In this work, we investigate locality-sensitive hashing of sequences through binary document signatures, applying the method to a bacterial protein classification task. Here, the goal is to predict the gene family to which a given query protein belongs. Experiments carried out on a pair of small but biologically realistic datasets (the full protein repertoires of families of Chlamydia and Staphylococcus aureus genomes respectively) show that a measure of similarity obtained by locality sensitive hashing gives highly accurate results while offering a number of avenues which will lead to substantial performance improvements over BLAST.
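One way to picture a binary signature for a sequence is a simhash-style sketch over its k-mers, compared by Hamming distance. The abstract does not specify the signature construction, so everything below is an assumption made for illustration: the k-mer length, the 64-bit width, the use of BLAKE2b as the per-k-mer hash, and the function names `kmers`, `signature` and `similarity`.

```python
import hashlib


def kmers(seq, k=3):
    """All overlapping substrings of length k."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def signature(seq, k=3, bits=64):
    """Simhash-style binary signature: each k-mer votes +1 or -1 on every bit."""
    votes = [0] * bits
    for kmer in kmers(seq, k):
        digest = hashlib.blake2b(kmer.encode(), digest_size=bits // 8).digest()
        value = int.from_bytes(digest, "big")
        for b in range(bits):
            votes[b] += 1 if (value >> b) & 1 else -1
    # A bit is set when the k-mers voted for it on balance.
    return sum(1 << b for b in range(bits) if votes[b] > 0)


def similarity(sig_a, sig_b, bits=64):
    """Fraction of signature bits that agree (1.0 means identical signatures)."""
    return 1 - bin(sig_a ^ sig_b).count("1") / bits


# Hypothetical protein fragment used only to exercise the functions.
query = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"
self_similarity = similarity(signature(query), signature(query))
```

Sequences that share most of their k-mers tend to agree on most signature bits, while unrelated sequences agree on roughly half, so a fixed-width comparison can stand in for an alignment when ranking candidates. Signatures can be precomputed once per database sequence, which is where the scaling advantage over alignment would come from under this scheme.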
Abstract:
A critical stage in open-pit mining is determining the optimal extraction sequence of blocks, which has a significant impact on mining profitability. In this paper, a more comprehensive block sequencing optimisation model is developed for open-pit mines. In the model, material characteristics of blocks, grade control, and excavator and block sequencing are investigated and integrated to maximise the short-term benefit of mining. Several case studies are modelled and solved by the CPLEX MIP and CP engines. Numerical investigations are presented to illustrate and validate the proposed methodology.