227 results for NGS sequencing
in Queensland University of Technology - ePrints Archive
Abstract:
The pH and salinity balance mechanisms of crayfish are controlled by a set of transport-related genes. We identified a set of these genes from the gill transcriptome of the freshwater crayfish Cherax quadricarinatus using Illumina NGS technology. We identified and characterized carbonic anhydrase (CA) genes and some other key genes involved in systemic acid-base balance and osmotic/ionic regulation. We also examined expression patterns of some of these genes across different sublethal pH levels [1]. A total of 72,382,710 paired-end Illumina reads were assembled into 36,128 contigs with an average length of 800 bp. About 37% of the contigs received significant BLAST hits and 22% were assigned gene ontology terms. These data will assist in further physiological-genomic studies in crayfish.
Abstract:
This item provides supplementary materials for the paper mentioned in the title, specifically a range of organisms used in the study. The full abstract for the main paper is as follows: Next Generation Sequencing (NGS) technologies have revolutionised molecular biology, allowing clinical sequencing to become a matter of routine. NGS data sets consist of short sequence reads obtained from the machine, given context and meaning through downstream assembly and annotation. For these techniques to operate successfully, the collected reads must be consistent with the assumed species or species group, and not corrupted in some way. The common bacterium Staphylococcus aureus may cause severe and life-threatening infections in humans, with some strains exhibiting antibiotic resistance. In this paper, we apply an SVM classifier to the important problem of distinguishing S. aureus sequencing projects from alternative pathogens, including closely related Staphylococci. Using a sequence k-mer representation, we achieve precision and recall above 95%, implicating features with important functional associations.
Abstract:
Next Generation Sequencing (NGS) has revolutionised molecular biology, allowing routine clinical sequencing. NGS data consist of short sequence reads, given context through downstream assembly and annotation, a process requiring reads consistent with the assumed species or species group. The common bacterium Staphylococcus aureus may cause severe and life-threatening infections in humans, with some strains exhibiting antibiotic resistance. Here we apply an SVM classifier to the important problem of distinguishing S. aureus sequencing projects from other pathogens, including closely related Staphylococci. Using a sequence k-mer representation, we achieve precision and recall above 95%, implicating features with important functional associations.
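The sequence k-mer representation mentioned in the abstract above can be illustrated with a minimal sketch. The abstract does not state the value of k, the normalisation, or how ambiguous bases are handled, so `k=3`, the frequency normalisation, the skipping of non-ACGT windows, and the function names `kmer_counts` and `kmer_vector` are all assumptions made for illustration. The resulting fixed-length vector is the kind of input an SVM classifier could consume.

```python
from collections import Counter
from itertools import product


def kmer_counts(read, k=3):
    """Count overlapping k-mers in a read.

    Windows containing symbols other than A, C, G, T (for example N)
    are skipped; this handling is an assumption, not taken from the paper.
    """
    read = read.upper()
    counts = Counter()
    for i in range(len(read) - k + 1):
        kmer = read[i:i + k]
        if set(kmer) <= set("ACGT"):
            counts[kmer] += 1
    return counts


def kmer_vector(read, k=3):
    """Return k-mer frequencies as a fixed-length list of 4**k values.

    K-mers are ordered lexicographically over the alphabet ACGT, so every
    read maps to a vector with the same layout.
    """
    counts = kmer_counts(read, k)
    total = sum(counts.values())
    vocabulary = ["".join(p) for p in product("ACGT", repeat=k)]
    return [counts[kmer] / total if total else 0.0 for kmer in vocabulary]


# Hypothetical toy read, not real sequencing data.
vector = kmer_vector("ACGTACGTNACG", k=3)
print(len(vector))
```

Normalising counts to frequencies keeps reads of different lengths comparable, which matters when the vectors are fed to a classifier.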
Abstract:
Next Generation Sequencing (NGS) has revolutionised molecular biology, resulting in an explosion of data sets and an increasing role in clinical practice. Such applications necessarily require rapid identification of the organism as a prelude to annotation and further analysis. NGS data consist of a substantial number of short sequence reads, given context through downstream assembly and annotation, a process requiring reads consistent with the assumed species or species group. Highly accurate results have been obtained for restricted sets using SVM classifiers, but such methods are difficult to parallelise and success depends on careful attention to feature selection. This work examines the problem at very large scale, using a mix of synthetic and real data with a view to determining the overall structure of the problem and the effectiveness of parallel ensembles of simpler classifiers (principally random forests) in addressing the challenges of large scale genomics.
Abstract:
Next-generation sequencing techniques have been revolutionized over the last decade, providing researchers with low-cost, high-throughput alternatives to traditional Sanger sequencing methods. These sequencing techniques have rapidly evolved from first-generation to fourth-generation, with very broad applications such as unravelling the complexity of the genome in terms of genetic variations, and have had a high impact on the biological field. In this review, we discuss the transition of sequencing from the second generation to the third and fourth generations, and describe some of their novel biological applications. With the advancement in technology, the earlier challenges of minimal size of the instrument, flexibility of throughput, ease of data analysis and short run times are being addressed. However, prospective analyses that test whether knowledge of newly identified variants affects clinical outcome may still need improvement.
Abstract:
Episodic Ataxia type 2 (EA2) is a rare autosomal dominantly inherited neurological disorder characterized by recurrent disabling imbalance, vertigo and episodes of ataxia lasting minutes to hours. EA2 is caused most often by loss-of-function mutations of the calcium channel gene CACNA1A. In addition to EA2, mutations in CACNA1A are responsible for two other allelic disorders: familial hemiplegic migraine type 1 (FHM1) and spinocerebellar ataxia type 6 (SCA6). Herein, we have utilised Next Generation Sequencing (NGS) to screen the coding sequence, exon-intron boundaries and UTRs of five genes where mutation is known to produce symptoms related to EA2, including CACNA1A. We performed this screening in a group of 31 unrelated patients with EA2 symptoms. Both novel and known mutations were detected through NGS technology, and confirmed through Sanger sequencing. Genetic testing showed in total 15 mutation-bearing patients (48%), of which 9 carried novel mutations (6 missense and 3 small frameshift deletion mutations) and 6 carried known mutations (4 missense and 2 nonsense). These results demonstrate the efficiency of our NGS panel for detecting known and novel mutations for EA2 in the CACNA1A gene, also identifying a novel missense mutation in ATP1A2, which is not a normal target for EA2 screening.
Abstract:
In the field of music technology there is a distinct focus on networking between spatially disparate locales to improve teaching and learning through real-time communication. This article proposes a new delivery model for learner support based on a review of technical and learning services, pilot research using remote desktops to teach music-sequencing software, and recent education research regarding professional development. A 24/7 delivery model using remote desktops, mobile devices and shared calendars offers a flexible real-time addition to the learner support services already on offer. Treating every user of the service as a potential expert, the model aims to deliver universal support situated in a personalized context, which will serve the technical and education requirements of teachers and learners.
Abstract:
Train scheduling is a complex and time-consuming task of vital importance. To schedule trains more accurately and efficiently than permitted by current techniques, a novel hybrid job shop approach has been proposed and implemented. Unique characteristics of train scheduling are first incorporated into a disjunctive graph model of train operations. A constructive algorithm that utilises this model is then developed. The constructive algorithm is a general procedure that constructs a schedule using insertion, backtracking and dynamic route selection mechanisms. It provides a significant search capability and is valid for any objective criteria. Simulated Annealing and Local Search meta-heuristic improvement algorithms are also adapted and extended. An important feature of these approaches is a new compound perturbation operator that consists of many unitary moves and allows trains to be shifted feasibly and more easily within the solution. A numerical investigation and case study are provided and demonstrate that high-quality solutions are obtainable on real-sized applications.
Massively parallel sequencing and analysis of expressed sequence tags in a successful invasive plant
Abstract:
Background: Invasive species pose a significant threat to global economies, agriculture and biodiversity. Despite progress towards understanding the ecological factors associated with plant invasions, limited genomic resources have made it difficult to elucidate the evolutionary and genetic factors responsible for invasiveness. This study presents the first expressed sequence tag (EST) collection for Senecio madagascariensis, a globally invasive plant species. Methods: We used pyrosequencing of one normalized and two subtractive libraries, derived from one native and one invasive population, to generate an EST collection. ESTs were assembled into contigs, annotated by BLAST comparison with the NCBI non-redundant protein database and assigned gene ontology (GO) terms from the Plant GO Slim ontologies. Key Results: Assembly of the 221 746 sequence reads resulted in 12 442 contigs. Over 50 % (6183) of 12 442 contigs showed significant homology to proteins in the NCBI database, representing approx. 4800 independent transcripts. The molecular transducer GO term was significantly over-represented in the native (South African) subtractive library compared with the invasive (Australian) library. Based on NCBI BLAST hits and literature searches, 40 % of the molecular transducer genes identified in the South African subtractive library are likely to be involved in response to biotic stimuli, such as fungal, bacterial and viral pathogens. Conclusions: This EST collection is the first representation of the S. madagascariensis transcriptome and provides an important resource for the discovery of candidate genes associated with plant invasiveness. The over-representation of molecular transducer genes associated with defence responses in the native subtractive library provides preliminary support for aspects of the enemy release and evolution of increased competitive ability hypotheses in this successful invasive. This study highlights the contribution of next-generation sequencing to better understanding the molecular mechanisms underlying ecological hypotheses that are important in successful plant invasions.
Abstract:
Background: Human papillomavirus (HPV) is the aetiological agent for cervical cancer and genital warts. Concurrent HPV and HIV infection in the South African population is high. HIV positive (+) women are often infected with multiple, rare and undetermined HPV types. Data on HPV incidence and genotype distribution are based on commercial HPV detection kits, but these kits may not detect all HPV types in HIV + women. The objectives of this study were to (i) identify the HPV types not detected by commercial genotyping kits present in a cervical specimen from an HIV positive South African woman using next generation sequencing, and (ii) determine if these types were prevalent in a cohort of HIV-infected South African women. Methods: Total DNA was isolated from 109 cervical specimens from South African HIV + women. A specimen within this cohort representing a complex multiple HPV infection, with 12 HPV genotypes detected by the Roche Linear Array HPV genotyping (LA) kit, was selected for next generation sequencing analysis. All HPV types present in this cervical specimen were identified by Illumina sequencing of the extracted DNA following rolling circle amplification. The prevalence of the HPV types identified by sequencing, but not included in the Roche LA, was then determined in the 109 HIV positive South African women by type-specific PCR. Results: Illumina sequencing identified a total of 16 HPV genotypes in the selected specimen, with four genotypes (HPV-30, 74, 86 and 90) not included in the commercial kit. The prevalences of HPV-30, 74, 86 and 90 in 109 HIV positive South African women were found to be 14.6 %, 12.8 %, 4.6 % and 8.3 % respectively. Conclusions: Our results indicate that there are HPV types, with substantial prevalence, in HIV positive women not being detected in molecular epidemiology studies using commercial kits. The significance of these types in relation to cervical disease remains to be investigated.
Abstract:
miRDeep and its varieties are widely used to quantify known and novel micro RNA (miRNA) from small RNA sequencing (RNAseq). This article describes miRDeep*, our integrated miRNA identification tool, which is modeled off miRDeep, but the precision of detecting novel miRNAs is improved by introducing new strategies to identify precursor miRNAs. miRDeep* has a user-friendly graphic interface and accepts raw data in FastQ and Sequence Alignment Map (SAM) or the binary equivalent (BAM) format. Known and novel miRNA expression levels, as measured by the number of reads, are displayed in an interface, which shows each RNAseq read relative to the pre-miRNA hairpin. The secondary pre-miRNA structure and read locations for each predicted miRNA are shown and kept in a separate figure file. Moreover, the target genes of known and novel miRNAs are predicted using the TargetScan algorithm, and the targets are ranked according to the confidence score. miRDeep* is an integrated standalone application where sequence alignment, pre-miRNA secondary structure calculation and graphical display are purely Java coded. This application tool can be executed using a normal personal computer with 1.5 GB of memory. Further, we show that miRDeep* outperformed existing miRNA prediction tools using our LNCaP and other small RNAseq datasets. miRDeep* is freely available online at http://www.australianprostatecentre.org/research/software/mirdeep-star
Abstract:
We isolated and characterized 21 microsatellite loci in the vulnerable and iconic Australian lungfish, Neoceratodus forsteri. Loci were screened across eight individuals from the Burnett River and 40 individuals from the Pine River. Genetic diversity was low with between one and six alleles per locus within populations and a maximum expected heterozygosity of 0.774. These loci will now be available to assess effective population sizes and genetic structure in N. forsteri across its natural range in South East Queensland, Australia.
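For readers unfamiliar with the statistic, expected heterozygosity at a locus is commonly computed as one minus the sum of squared allele frequencies. The sketch below assumes that standard definition without a small-sample correction. The abstract does not say which estimator was used, and the function name `expected_heterozygosity` and the allele counts are hypothetical.

```python
def expected_heterozygosity(allele_counts):
    """Return 1 - sum(p_i ** 2) for one locus.

    allele_counts holds the number of observed copies of each allele.
    No small-sample correction is applied (an assumption of this sketch).
    """
    total = sum(allele_counts)
    if total <= 0:
        raise ValueError("at least one allele copy is required")
    return 1.0 - sum((count / total) ** 2 for count in allele_counts)


# Hypothetical locus with three alleles observed 10, 6 and 4 times.
print(round(expected_heterozygosity([10, 6, 4]), 3))  # 0.62

# A locus with a single allele has no expected heterozygosity.
print(expected_heterozygosity([8]))  # 0.0
```

Under this definition, a locus with one allele scores zero, and the value rises as alleles become more numerous and more evenly distributed.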