242 results for Genomic sequence database


Relevance:

30.00%

Publisher:

Abstract:

Global aquaculture has expanded rapidly to address the increasing demand for aquatic protein and an uncertain future for wild fisheries. To date, however, most farmed aquatic stocks are essentially wild and little is known about their genomes or the genes that affect important economic traits in culture. Biologists have recognized that recent technological advances including next generation sequencing (NGS) have opened up the possibility of generating genome wide sequence data sets rapidly from non-model organisms at a reasonable cost. In an era when virtually any study organism can 'go genomic', understanding gene function and genetic effects on expressed quantitative trait locus phenotypes will be fundamental to future knowledge development. Many factors can influence the individual growth rate in target species but of particular importance in agriculture and aquaculture will be the identification and characterization of the specific gene loci that contribute important phenotypic variation to growth because the information can be applied to speed up genetic improvement programmes and to increase productivity via marker-assisted selection (MAS). While currently there is only limited genomic information available for any crustacean species, a number of putative candidate genes have been identified or implicated in growth and muscle development in some species. In an effort to stimulate increased research on the identification of growth-related genes in crustacean species, here we review the available information on: (i) associations between genes and growth reported in crustaceans, (ii) growth-related genes involved with moulting, (iii) muscle development and degradation genes involved in moulting, and (iv) correlations between DNA sequences that have confirmed growth trait effects in farmed animal species used in terrestrial agriculture and related sequences in crustacean species.
The information in concert can provide a foundation for increasing the rate at which knowledge about key genes affecting growth traits in crustacean species is gained.

Relevance:

30.00%

Publisher:

Abstract:

In recent years, with the development of techniques in modern molecular biology, it has become possible to study the genetic basis of carcinogenesis down to the level of DNA sequence. Major advances have been made in our understanding of the genes involved in cell cycle control and descriptions of mutations in those genes. These developments have led to the definition of the role of specific oncogenes and tumour suppressor genes in several cancers, including, for example, colon cancers and some forms of breast cancer. Work reported from our laboratory has led to the identification of a number of candidate genes involved in the development of non-melanotic skin cancers. In this chapter, we attempt to further explain the observed (phenomic) alterations in metabolic pathways associated with oxygen consumption with the changes at the genetic level.

Relevance:

30.00%

Publisher:

Abstract:

Potato leafroll virus (PLRV) is a positive-strand RNA virus that generates subgenomic RNAs (sgRNA) for expression of 3' proximal genes. Small RNA (sRNA) sequencing and mapping of the PLRV-derived sRNAs revealed coverage of the entire viral genome with the exception of four distinctive gaps. Remarkably, these gaps mapped to areas of the PLRV genome with extensive secondary structures, such as the internal ribosome entry site and the 5' transcriptional start sites of sgRNA1 and sgRNA2. The last gap mapped to ~500 nt from the 3' terminus of the PLRV genome and suggested the possible presence of an additional sgRNA for PLRV. Quantitative real-time PCR and northern blot analysis confirmed the expression of sgRNA3, and subsequent analyses placed its 5' transcriptional start site at position 5347 of the PLRV genome. A regulatory role is proposed for the PLRV sgRNA3, as it encodes an RNA-binding protein with specificity for the 5' end of PLRV genomic RNA.

Relevance:

30.00%

Publisher:

Abstract:

The complete nucleotide sequence of Subterranean clover mottle virus (SCMoV) genomic RNA has been determined. The SCMoV genome is 4,258 nucleotides in length. It shares most nucleotide and amino acid sequence identity with the genome of Lucerne transient streak virus (LTSV). SCMoV RNA encodes four overlapping open reading frames and has a genome organisation similar to that of Cocksfoot mottle virus (CfMV). ORF1 and ORF4 are predicted to encode single proteins. ORF2 is predicted to encode two proteins that are derived from a -1 translational frameshift between two overlapping reading frames (ORF2a and ORF2b). A search of amino acid databases did not find a significant match for ORF1 and the function of this protein remains unclear. ORF2a contains a motif typical of chymotrypsin-like serine proteases and ORF2b has motifs characteristically present in positive-stranded RNA-dependent RNA polymerases. ORF4 is likely to be expressed from a subgenomic RNA and encodes the viral coat protein. The ORF2a/ORF2b overlapping gene expression strategy used by SCMoV and CfMV is similar to that of the poleroviruses and differs from that of other published sobemoviruses. These results suggest that the sobemoviruses could now be divided into two distinct subgroups based on those that express the RNA-dependent RNA polymerase from a single, in-frame polyprotein, and those that express it via a -1 translational frameshifting mechanism.
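
The -1 translational frameshift described above can be illustrated with a minimal Python sketch. It is not the authors' analysis: the toy sequence, the slip position, and the function names are hypothetical, and the abstract does not give the actual slippery site in SCMoV. The sketch only shows how re-reading one nucleotide lets translation continue past an in-frame stop codon into an overlapping reading frame.

```python
from itertools import product

# Standard genetic code, codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(seq, start=0):
    """Translate seq from `start` until a stop codon or the end of the sequence."""
    protein = []
    for i in range(start, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def translate_minus1_frameshift(seq, slip_index):
    """Translate in frame 0 up to slip_index, then slip back one nucleotide.

    slip_index is assumed to lie on a frame-0 codon boundary; the nucleotide
    at slip_index - 1 is read twice, once in each frame.
    """
    upstream = translate(seq[:slip_index])
    downstream = translate(seq, start=slip_index - 1)
    return upstream + downstream


# Hypothetical toy sequence (cDNA alphabet), not taken from SCMoV.
toy = "ATGAAATTTAAATGACCGGGTAA"
in_frame = translate(toy)                        # stops at the in-frame TGA
fusion = translate_minus1_frameshift(toy, 12)    # continues in the -1 frame
```

With this toy input, in-frame translation ends at the first stop codon, while the frameshifted reading yields a longer fusion product, which is the general pattern by which a polymerase domain can be expressed from an overlapping frame.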

Relevance:

30.00%

Publisher:

Abstract:

Subterranean clover stunt disease is an economically important aphid-borne virus disease affecting certain pasture and grain legumes in Australia. The virus associated with the disease, subterranean clover stunt virus (SCSV), was previously found to be representative of a new type of single-stranded DNA virus. Analysis of the virion DNA and restriction mapping of double-stranded cDNA synthesized from virion DNA suggested that SCSV has a segmented genome composed of 3 or 4 different species of circular ssDNA each of about 850-880 nucleotides. To further investigate the complexity of the SCSV genome, we have isolated the replicative form DNA from infected pea and from it prepared putative full-length clones representing the SCSV genome segments. Analysis of these clones by restriction mapping indicated that clones representing at least 4 distinct genomic segments were obtained. This method is thus suitable for generating an extensive genomic library of novel ssDNA viruses containing multiple genome segments such as SCSV and banana bunchy top virus. The N-terminal amino acid sequence and amino acid composition of the coat protein of SCSV were determined. Comparison of the amino acid sequence with partial DNA sequence data, and the distinctly different restriction maps obtained for the full-length clones suggested that only one of these clones contained the coat protein gene. The results confirmed that SCSV has a functionally divided genome composed of several distinct ssDNA circles each of about 1 kb.

Relevance:

30.00%

Publisher:

Abstract:

Genomic sequences are fundamentally text documents, admitting various representations according to need and tokenization. Gene expression depends crucially on binding of enzymes to the DNA sequence at small, poorly conserved binding sites, limiting the utility of standard pattern search. However, one may exploit the regular syntactic structure of the enzyme's component proteins and the corresponding binding sites, framing the problem as one of detecting grammatically correct genomic phrases. In this paper we propose new kernels based on weighted tree structures, traversing the paths within them to capture the features which underpin the task. Experimentally, we find that these kernels provide performance comparable with state of the art approaches for this problem, while offering significant computational advantages over earlier methods. The methods proposed may be applied to a broad range of sequence or tree-structured data in molecular biology and other domains.

Relevance:

30.00%

Publisher:

Abstract:

This article documents the public availability of (i) transcriptome sequence data, assembled and annotated contigs and unigenes, and BLAST hits from the Queensland fruit fly, Bactrocera tryoni; (ii) 75 single-nucleotide variants (SNVs) from 454 sequencing of reduced representation libraries for Phalangiidae harvestmen, Megabunus armatus, Megabunus vignai, Megabunus lesserti, and Rilaena triangularis; and (iii) expressed sequence tags from 454 sequencing of the lepidopterans Lymantria dispar and Lymantria monacha.

Relevance:

30.00%

Publisher:

Abstract:

The striped catfish (Pangasianodon hypophthalmus) culture industry in the Mekong Delta in Vietnam has developed rapidly over the past decade. The culture industry now, however, faces some significant challenges, especially related to climate change impacts, notably from predicted extensive saltwater intrusion into many low topographical coastal provinces across the Mekong Delta. This problem highlights a need for development of culture stocks that can tolerate more saline culture environments as a response to expansion of saline water-intruded land. While a traditional artificial selection program can potentially address this need, understanding the genomic basis of salinity tolerance can assist development of more productive culture lines. The current study applied a transcriptomic approach using Ion PGM technology to generate expressed sequence tag (EST) resources from the intestine and swim bladder of striped catfish reared at a salinity level of 9 ppt, which showed best growth performance. Total sequence data generated was 467.8 Mbp, consisting of 4,116,424 reads with an average length of 112 bp. De novo assembly was employed that generated 51,188 contigs, and allowed identification of 16,116 putative genes based on the GenBank non-redundant database. GO annotation, KEGG pathway mapping, and functional annotation of the EST sequences recovered a wide diversity of biological functions and processes. In addition, more than 11,600 simple sequence repeats were also detected. This is the first comprehensive analysis of a striped catfish transcriptome, and provides a valuable genomic resource for future selective breeding programs and functional or evolutionary studies of genes that influence salinity tolerance in this important culture species.

Relevance:

30.00%

Publisher:

Abstract:

Over the past decade the mitochondrial (mt) genome has become the most widely used genomic resource available for systematic entomology. While the availability of other types of ‘–omics’ data – in particular transcriptomes – is increasing rapidly, mt genomes are still vastly cheaper to sequence and are far less demanding of high quality templates. Furthermore, almost all other ‘–omics’ approaches also sequence the mt genome, and so it can form a bridge between legacy and contemporary datasets. Mitochondrial genomes have now been sequenced for all insect orders, and in many instances representatives of each major lineage within orders (suborders, series or superfamilies depending on the group). They have also been applied to systematic questions at all taxonomic scales from resolving interordinal relationships (e.g. Cameron et al., 2009; Wan et al., 2012; Wang et al., 2012), through many intraordinal (e.g. Dowton et al., 2009; Timmermans et al., 2010; Zhao et al., 2013a) and family-level studies (e.g. Nelson et al., 2012; Zhao et al., 2013b) to population/biogeographic studies (e.g. Ma et al., 2012). Methodological issues around the use of mt genomes in insect phylogenetic analyses and the empirical results found to date have recently been reviewed by Cameron (2014), yet the technical aspects of sequencing and annotating mt genomes were not covered. Most papers which generate new mt genomes report their methods in a simplified form which can be difficult to replicate without specific knowledge of the field. Published studies utilize a sufficiently wide range of approaches, usually without justification for the one chosen, that confusion about commonly used jargon such as ‘long PCR’ and ‘primer walking’ could be a serious barrier to entry. Furthermore, sequenced mt genomes have been annotated (gene locations defined) to wildly varying standards, and improving data quality through consistent annotation procedures will benefit all downstream users of these datasets.
The aims of this review are therefore to: 1. Describe in detail the various sequencing methods used on insect mt genomes; 2. Explore the strengths/weaknesses of different approaches; 3. Outline the procedures and software used for insect mt genome annotation; and 4. Highlight quality control steps used for new annotations, and to improve the re-annotation of previously sequenced mt genomes used in systematic or comparative research.

Relevance:

30.00%

Publisher:

Abstract:

Escherichia coli ST131 is now recognised as a leading contributor to urinary tract and bloodstream infections in both community and clinical settings. Here we present the complete, annotated genome of E. coli EC958, which was isolated from the urine of a patient presenting with a urinary tract infection in the Northwest region of England and represents the most well characterised ST131 strain. Sequencing was carried out using the Pacific Biosciences platform, which provided sufficient depth and read-length to produce a complete genome without the need for other technologies. The discovery of spurious contigs within the assembly that correspond to site-specific inversions in the tail fibre regions of prophages demonstrates the potential for this technology to reveal dynamic evolutionary mechanisms. E. coli EC958 belongs to the major subgroup of ST131 strains that produce the CTX-M-15 extended spectrum β-lactamase, are fluoroquinolone resistant and encode the fimH30 type 1 fimbrial adhesin. This subgroup includes the Indian strain NA114 and the North American strain JJ1886. A comparison of the genomes of EC958, JJ1886 and NA114 revealed that differences in the arrangement of genomic islands, prophages and other repetitive elements in the NA114 genome are not biologically relevant and are due to misassembly. The availability of a high quality uropathogenic E. coli ST131 genome provides a reference for understanding this multidrug resistant pathogen and will facilitate novel functional, comparative and clinical studies of the E. coli ST131 clonal lineage.

Relevance:

30.00%

Publisher:

Abstract:

Background: The koala, Phascolarctos cinereus, is a biologically unique and evolutionarily distinct Australian arboreal marsupial. The goal of this study was to sequence the transcriptome from several tissues of two geographically separate koalas, and to create the first comprehensive catalog of annotated transcripts for this species, enabling detailed analysis of the unique attributes of this threatened native marsupial, including infection by the koala retrovirus.
Results: RNA-Seq data was generated from a range of tissues from one male and one female koala and assembled de novo into transcripts using Velvet-Oases. Transcript abundance in each tissue was estimated. Transcripts were searched for likely protein-coding regions and a non-redundant set of 117,563 putative protein sequences was produced. In similarity searches there were 84,907 (72%) sequences that aligned to at least one sequence in the NCBI nr protein database. The best alignments were to sequences from other marsupials. After applying a reciprocal best hit requirement of koala sequences to those from tammar wallaby, Tasmanian devil and the gray short-tailed opossum, we estimate that our transcriptome dataset represents approximately 15,000 koala genes. The marsupial alignment information was used to look for potential gene duplications and we report evidence for copy number expansion of the alpha amylase gene, and of an aldehyde reductase gene. Koala retrovirus (KoRV) transcripts were detected in the transcriptomes. These were analysed in detail and the structure of the spliced envelope gene transcript was determined. There was appreciable sequence diversity within KoRV, with 233 sites in the KoRV genome showing small insertions/deletions or single nucleotide polymorphisms. Both koalas had sequences from the KoRV-A subtype, but the male koala transcriptome has, in addition, sequences more closely related to the KoRV-B subtype. This is the first report of a KoRV-B-like sequence in a wild population.
Conclusions: This transcriptomic dataset is a useful resource for molecular genetic studies of the koala, for evolutionary genetic studies of marsupials, for validation and annotation of the koala genome sequence, and for investigation of koala retrovirus. Annotated transcripts can be browsed and queried at http://koalagenome.org
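
The reciprocal best hit requirement mentioned in the results can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' pipeline: the hit tuples, identifiers, and scores below are hypothetical, and real input would typically come from tabular similarity-search output for each search direction.

```python
from typing import Dict, List, Tuple

Hit = Tuple[str, str, float]  # (query_id, subject_id, score)


def best_hits(hits: List[Hit]) -> Dict[str, str]:
    """Map each query to its single highest-scoring subject."""
    best: Dict[str, Tuple[str, float]] = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b: List[Hit], b_vs_a: List[Hit]) -> List[Tuple[str, str]]:
    """Keep pairs (a, b) where b is a's best hit and a is b's best hit."""
    forward = best_hits(a_vs_b)
    reverse = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)


# Hypothetical identifiers and scores, for illustration only.
koala_vs_devil = [
    ("koala_t1", "devil_g1", 250.0),
    ("koala_t1", "devil_g2", 90.0),
    ("koala_t2", "devil_g2", 180.0),
    ("koala_t3", "devil_g2", 200.0),
]
devil_vs_koala = [
    ("devil_g1", "koala_t1", 245.0),
    ("devil_g2", "koala_t3", 198.0),
    ("devil_g2", "koala_t2", 170.0),
]
pairs = reciprocal_best_hits(koala_vs_devil, devil_vs_koala)
```

Here `koala_t2` is dropped because its best hit, `devil_g2`, prefers `koala_t3` in the reverse direction. Filtering of this kind collapses redundant transcripts onto single putative orthologues, which is why a gene-count estimate can be much smaller than the number of assembled protein sequences.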

Relevance:

30.00%

Publisher:

Abstract:

Single nucleotide polymorphisms (SNPs) are widely acknowledged as the marker of choice for many genetic and genomic applications because they show co-dominant inheritance, are highly abundant across genomes and are suitable for high-throughput genotyping. Here we evaluated the applicability of SNP markers developed from Crassostrea gigas and C. virginica expressed sequence tags (ESTs) in closely related Crassostrea and Ostrea species. A total of 213 putative interspecific level SNPs were identified from re-sequencing data in six amplicons, yielding on average one interspecific level SNP per seven bp. The high polymorphism levels observed and the high success rate of transferability show that genic EST-derived SNP markers provide an efficient method for rapid marker development and SNP discovery in closely related oyster species. The six EST-SNP markers identified here will provide useful molecular tools for addressing questions in molecular ecology and evolution studies, including stock analysis (pedigree monitoring) in related oyster taxa.

Relevance:

30.00%

Publisher:

Abstract:

Chlamydia pneumoniae is an obligate intracellular bacterium implicated in a wide range of human diseases including atherosclerosis and Alzheimer's disease. Efforts to understand the relationships between C. pneumoniae detected in these diseases have been hindered by the limited availability of sequence data for non-respiratory strains. In this study, we sequenced the whole genomes for C. pneumoniae isolates from atherosclerosis and Alzheimer's disease, and compared these to previously published C. pneumoniae genomes. Phylogenetic analyses of these new C. pneumoniae strains indicate two sub-groups within human C. pneumoniae, and suggest that both recombination and mutation events have driven the evolution of human C. pneumoniae. Further fine-detailed analyses of these new C. pneumoniae sequences show several genetically variable loci. This suggests that similar strains of C. pneumoniae are found in the brain, lungs and cardiovascular system and that only minor genetic differences may contribute to the adaptation of particular strains in human disease.

Relevance:

30.00%

Publisher:

Abstract:

Using genome-wide data from 253,288 individuals, we identified 697 variants at genome-wide significance that together explained one-fifth of the heritability for adult height. By testing different numbers of variants in independent studies, we show that the most strongly associated approximately 2,000, approximately 3,700 and approximately 9,500 SNPs explained approximately 21%, approximately 24% and approximately 29% of phenotypic variance. Furthermore, all common variants together captured 60% of heritability. The 697 variants clustered in 423 loci were enriched for genes, pathways and tissue types known to be involved in growth and together implicated genes and pathways not highlighted in earlier efforts, such as signaling by fibroblast growth factors, WNT/beta-catenin and chondroitin sulfate-related genes. We identified several genes and pathways not previously connected with human skeletal growth, including mTOR, osteoglycin and binding of hyaluronic acid. Our results indicate a genetic architecture for human height that is characterized by a very large but finite number (thousands) of causal variants.

Relevance:

30.00%

Publisher:

Abstract:

The number of genetic factors associated with common human traits and disease is increasing rapidly, and the general public is utilizing affordable, direct-to-consumer genetic tests. The results of these tests are often in the public domain. A combination of factors has increased the potential for the indirect estimation of an individual's risk for a particular trait. Here we explain the basic principles underlying risk estimation, which allowed us to test the ability to make an indirect risk estimation from genetic data by imputing Dr. James Watson's redacted apolipoprotein E gene (APOE) information. The principles underlying risk prediction from genetic data have been well known and applied for many decades; however, the recent increase in genomic knowledge, and advances in mathematical and statistical techniques and computational power, make it relatively easy to make an accurate but indirect estimation of risk. There is a current hazard for indirect risk estimation that is relevant not only to the subject but also to individuals related to the subject; this risk will likely increase as more detailed genomic data and better computational tools become available.