939 results for bacteria genome nucleotide usage
Abstract:
Infectious cDNA clones of RNA viruses are important research tools, but flavivirus cDNA clones have proven difficult to assemble and propagate in bacteria. This has been attributed to genetic instability and/or host cell toxicity; however, the mechanism leading to these difficulties has not been fully elucidated. Here we identify and characterize an efficient cryptic bacterial promoter in the cDNA encoding the dengue virus (DENV) 5′ UTR. Following cryptic transcription in E. coli, protein expression initiated at a conserved in-frame AUG that is downstream from the authentic DENV initiation codon, yielding a DENV polyprotein fragment that was truncated at the N-terminus. A more complete understanding of constitutive viral protein expression in E. coli might help explain the cloning and propagation difficulties generally observed with flavivirus cDNA.
Abstract:
The monogeneric family Fergusoninidae consists of gall-forming flies that, together with Fergusobia (Tylenchida: Neotylenchidae) nematodes, form the only known mutualistic association between insects and nematodes. In this study, the entire 16,000 bp mitochondrial genome of Fergusonina taylori Nelson and Yeates was sequenced. The circular genome contains one encoding region including 27 genes and one non-coding A+T-rich region. The arrangement of the protein-coding, ribosomal RNA (rRNA) and transfer RNA (tRNA) genes was the same as that found in the ancestral insect. Nucleotide composition is highly A+T biased. All of the protein initiation codons are ATN, except for nad1 which begins with TTT. All 22 tRNA anticodons of F. taylori match those observed in Drosophila yakuba, and all form the typical cloverleaf structure except for tRNA-Ser (AGN) which lacks a dihydrouridine (DHU) arm. Secondary structural features of the rRNA genes of Fergusonina are similar to those proposed for other insects, with minor modifications. The mitochondrial genome of Fergusonina presented here may prove valuable for resolving the sister group to the Fergusoninidae, and expands the available mtDNA data sources for acalyptrates overall.
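As a rough illustration of what "A+T biased" means in the abstract above, the sketch below computes the A+T fraction of a sequence. The function name `at_content` and the toy sequence are hypothetical and are not taken from the study; ambiguous bases such as N are simply ignored, which is an assumption of this sketch.

```python
from collections import Counter


def at_content(seq: str) -> float:
    """Return the fraction of A and T among the unambiguous bases A, C, G, T."""
    counts = Counter(seq.upper())
    total = sum(counts[base] for base in "ACGT")
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return (counts["A"] + counts["T"]) / total


if __name__ == "__main__":
    toy = "ATTATAAATGCATTTAAATA"  # hypothetical toy sequence, not real mtDNA
    print(f"A+T fraction: {at_content(toy):.2f}")
```

Applied to a full mitochondrial genome read from a FASTA file, the same function would give the overall composition; per-gene or per-codon-position values would need the annotation as well.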
Abstract:
Despite their ecological significance as decomposers and their evolutionary significance as the most speciose eusocial insect group outside the Hymenoptera, termite (Blattodea: Termitoidae or Isoptera) evolutionary relationships have yet to be well resolved. Previous morphological and molecular analyses strongly conflict at the family level and are marked by poor support for backbone nodes. A mitochondrial (mt) genome phylogeny of termites was produced to test relationships between the recognised termite families, improve nodal support and test the phylogenetic utility of rare genomic changes found in the termite mt genome. Complete mt genomes were sequenced for 7 of the 9 extant termite families with additional representatives of each of the two most speciose families Rhinotermitidae (3 of 7 subfamilies) and Termitidae (3 of 8 subfamilies). The mt genome of the well supported sister group of termites, the subsocial cockroach Cryptocercus, was also sequenced. A highly supported tree of termite relationships was produced by all analytical methods and data treatment approaches, however the relationship of the termites + Cryptocercus clade to other cockroach lineages was highly affected by the strong nucleotide compositional bias found in termites relative to other dictyopterans. The phylogeny supports previously proposed suprafamilial termite lineages, the Euisoptera and Neoisoptera, a later derived Kalotermitidae as sister group of the Neoisoptera and a monophyletic clade of dampwood (Stolotermitidae, Archotermopsidae) and harvester termites (Hodotermitidae). In contrast to previous termite phylogenetic studies, nodal supports were very high for family-level relationships within termites. Two rare genomic changes in the mt genome control region were found to be molecular synapomorphies for major clades. 
An elongated stem-loop structure defined the clade Polyphagidae + (Cryptocercus + termites), and a further series of compensatory base changes in this stem loop is synapomorphic for the Neoisoptera. The complicated repeat structures first identified in Reticulitermes, composed of short (A-type) and long (B-type) repeats, define the clade Heterotermitinae + Termitidae, while the secondary loss of A-type repeats is synapomorphic for the non-macrotermitine Termitidae.
Abstract:
Background. Recent reports have indicated that single-stranded DNA (ssDNA) viruses in the taxonomic families Geminiviridae, Parvoviridae and Anellovirus may be evolving at rates of ∼10⁻⁴ substitutions per site per year (subs/site/year). These evolution rates are similar to those of RNA viruses and are surprisingly high given that ssDNA virus replication involves host DNA polymerases with fidelities approximately 10 000 times greater than those of error-prone viral RNA polymerases. Although high ssDNA virus evolution rates were first suggested in evolution experiments involving the geminivirus maize streak virus (MSV), the evolution rate of this virus has never been accurately measured. Also, questions regarding both the mechanistic basis and adaptive value of high geminivirus mutation rates remain unanswered. Results. We determined the short-term evolution rate of MSV using full genome analysis of virus populations initiated from cloned genomes. Three wild type viruses and three defective artificial chimaeric viruses were maintained in planta for up to five years and displayed evolution rates of between 7.4 × 10⁻⁴ and 7.9 × 10⁻⁴ subs/site/year. Conclusion. These MSV evolution rates are within the ranges observed for other ssDNA viruses and RNA viruses. Although no obvious evidence of positive selection was detected, the uneven distribution of mutations within the defective virus genomes suggests that some of the changes may have been adaptive. We also observed inter-strand nucleotide substitution imbalances that are consistent with a recent proposal that high mutation rates in geminiviruses (and possibly ssDNA viruses in general) may be due to mutagenic processes acting specifically on ssDNA molecules. © 2008 Walt et al; licensee BioMed Central Ltd.
Abstract:
Exponential growth of genomic data in the last two decades has made manual analyses impractical for all but trial studies. As genomic analyses have become more sophisticated, and move toward comparisons across large datasets, computational approaches have become essential. One of the most important biological questions is to understand the mechanisms underlying gene regulation. Genetic regulation is commonly investigated and modelled through the use of transcriptional regulatory network (TRN) structures. These model the regulatory interactions between two key components: transcription factors (TFs) and the target genes (TGs) they regulate. Transcriptional regulatory networks have proven to be invaluable scientific tools in Bioinformatics. When used in conjunction with comparative genomics, they have provided substantial insights into the evolution of regulatory interactions. Current approaches to regulatory network inference, however, omit two additional key entities: promoters and transcription factor binding sites (TFBSs). In this study, we attempted to explore the relationships among these regulatory components in bacteria. Our primary goal was to identify relationships that can assist in reducing the high false positive rates associated with transcription factor binding site predictions and thereupon enhance the reliability of the inferred transcription regulatory networks. In our preliminary exploration of relationships between the key regulatory components in Escherichia coli transcription, we discovered a number of potentially useful features. The combination of location score and sequence dissimilarity scores increased de novo binding site prediction accuracy by 13.6%. Another important observation made was with regards to the relationship between transcription factors grouped by their regulatory role and corresponding promoter strength. 
Our study of E. coli σ70 promoters found support at the 0.1 significance level for our hypothesis that weak promoters are preferentially associated with activator binding sites to enhance gene expression, whilst strong promoters have more repressor binding sites to repress or inhibit gene transcription. Although the observations were specific to σ70, they nevertheless strongly encourage additional investigations when more experimentally confirmed data are available. In our preliminary exploration of relationships between the key regulatory components in E. coli transcription, we discovered a number of potentially useful features, some of which proved successful in reducing the number of false positives when applied to re-evaluate binding site predictions. Of chief interest was the relationship observed between promoter strength and TFs with respect to their regulatory role. Based on the common assumption that promoter homology positively correlates with transcription rate, we hypothesised that weak promoters would have more transcription factors that enhance gene expression, whilst strong promoters would have more repressor binding sites. The t-tests assessed for E. coli σ70 promoters returned a p-value of 0.072, which at the 0.1 significance level suggested support for our (alternative) hypothesis, albeit this trend may only be present for promoters where corresponding TFBSs are either all repressors or all activators. Nevertheless, such suggestive results strongly encourage additional investigations when more experimentally confirmed data become available. Much of the remainder of the thesis concerns a machine learning study of binding site prediction, using the SVM and kernel methods, principally the spectrum kernel. 
Spectrum kernels have been successfully applied in previous studies of protein classification [91, 92], as well as the related problem of promoter predictions [59], and we have here successfully applied the technique to refining TFBS predictions. The advantages provided by the SVM classifier were best seen in 'moderately'-conserved transcription factor binding sites as represented by our E. coli CRP case study. Inclusion of additional position feature attributes further increased accuracy by 9.1% but more notable was the considerable decrease in false positive rate from 0.8 to 0.5 while retaining 0.9 sensitivity. Improved prediction of transcription factor binding sites is in turn extremely valuable in improving inference of regulatory relationships, a problem notoriously prone to false positive predictions. Here, the number of false regulatory interactions inferred using the conventional two-component model was substantially reduced when we integrated de novo transcription factor binding site predictions as an additional criterion for acceptance in a case study of inference in the Fur regulon. This initial work was extended to a comparative study of the iron regulatory system across 20 Yersinia strains. This work revealed interesting, strain-specific differences, especially between pathogenic and non-pathogenic strains. Such differences were made clear through interactive visualisations using the TRNDiff software developed as part of this work, and would have remained undetected using conventional methods. This approach led to the nomination of the Yfe iron-uptake system as a candidate for further wet-lab experimentation due to its potential active functionality in non-pathogens and its known participation in full virulence of the bubonic plague strain. Building on this work, we introduced novel structures we have labelled as 'regulatory trees', inspired by the phylogenetic tree concept. 
Instead of using gene or protein sequence similarity, the regulatory trees were constructed based on the number of similar regulatory interactions. While the common phylogenetic trees convey information regarding changes in gene repertoire, which we might regard as analogous to 'hardware', the regulatory tree informs us of the changes in regulatory circuitry, in some respects analogous to 'software'. In this context, we explored the 'pan-regulatory network' for the Fur system, the entire set of regulatory interactions found for the Fur transcription factor across a group of genomes. In the pan-regulatory network, emphasis is placed on how the regulatory network for each target genome is inferred from multiple sources instead of a single source, as is the common approach. The benefit of using multiple reference networks is a more comprehensive survey of the relationships, and increased confidence in the regulatory interactions predicted. In the present study, we distinguish between relationships found across the full set of genomes as the 'core-regulatory-set', and interactions found only in a subset of genomes explored as the 'sub-regulatory-set'. We found nine Fur target gene clusters present across the four genomes studied, this core set potentially identifying basic regulatory processes essential for survival. Species-level differences are seen at the sub-regulatory-set level; for example the known virulence factors, YbtA and PchR, were found in Y. pestis and P. aeruginosa respectively, but were not present in either E. coli or B. subtilis. Such factors, and the iron-uptake systems they regulate, are ideal candidates for wet-lab investigation to determine whether or not they are pathogen-specific. In this study, we employed a broad range of approaches to address our goals and assessed these methods using the Fur regulon as our initial case study. 
We identified a set of promising feature attributes; demonstrated their success in increasing transcription factor binding site prediction specificity while retaining sensitivity, and showed the importance of binding site predictions in enhancing the reliability of regulatory interaction inferences. Most importantly, these outcomes led to the introduction of a range of visualisations and techniques, which are applicable across the entire bacterial spectrum and can be utilised in studies beyond the understanding of transcriptional regulatory networks.
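The spectrum kernel named in the abstract above can be sketched in a few lines: two sequences are compared by the inner product of their k-mer count vectors. This is a minimal, generic version and not the thesis's implementation; the function names and the choice k = 3 are assumptions made for illustration, and the abstract does not state which k or which additional position features were used.

```python
from collections import Counter


def kmer_spectrum(seq: str, k: int) -> Counter:
    """Count every overlapping k-mer in seq (case-insensitive)."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def spectrum_kernel(x: str, y: str, k: int = 3) -> int:
    """Inner product of the k-mer count vectors of x and y."""
    sx, sy = kmer_spectrum(x, k), kmer_spectrum(y, k)
    # Only k-mers present in both sequences contribute to the sum.
    return sum(count * sy[kmer] for kmer, count in sx.items())


if __name__ == "__main__":
    # Hypothetical toy sites, not real binding-site data.
    print(spectrum_kernel("TGTGATCTAGATCACA", "TGTGACGTAGATCACA", k=3))
```

Evaluating the kernel for every pair of candidate sites gives a Gram matrix, which an SVM implementation that accepts precomputed kernels could consume directly.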
Abstract:
Objective: To perform a 1-stage meta-analysis of genome-wide association studies (GWAS) of multiple sclerosis (MS) susceptibility and to explore functional consequences of new susceptibility loci. Methods: We synthesized 7 MS GWAS. Each data set was imputed using HapMap phase II, and a per single nucleotide polymorphism (SNP) meta-analysis was performed across the 7 data sets. We explored RNA expression data using a quantitative trait analysis in peripheral blood mononuclear cells (PBMCs) of 228 subjects with demyelinating disease. Results: We meta-analyzed 2,529,394 unique SNPs in 5,545 cases and 12,153 controls. We identified 3 novel susceptibility alleles: rs170934T at 3p24.1 (odds ratio [OR], 1.17; p = 1.6 × 10⁻⁸) near EOMES, rs2150702G in the second intron of MLANA on chromosome 9p24.1 (OR, 1.16; p = 3.3 × 10⁻⁸), and rs6718520A in an intergenic region on chromosome 2p21, with THADA as the nearest flanking gene (OR, 1.17; p = 3.4 × 10⁻⁸). The 3 new loci do not have a strong cis effect on RNA expression in PBMCs. Ten other susceptibility loci had a suggestive p < 1 × 10⁻⁶; some of these loci have evidence of association in other inflammatory diseases (ie, IL12B, TAGAP, PLEK, and ZMIZ1). Interpretation: We have performed a meta-analysis of GWAS in MS that more than doubles the size of previous gene discovery efforts and highlights 3 novel MS susceptibility loci. These and additional loci with suggestive evidence of association are excellent candidates for further investigations to refine and validate their role in the genetic architecture of MS.
Abstract:
Background: Genome-wide association studies (GWAS) have identified more than 100 genetic loci for various cancers. However, only one is for endometrial cancer. Methods: We conducted a three-stage GWAS including 8,492 endometrial cancer cases and 16,596 controls. After analyzing 585,963 single-nucleotide polymorphisms (SNP) in 832 cases and 2,682 controls (stage I) from the Shanghai Endometrial Cancer Genetics Study, we selected the top 106 SNPs for in silico replication among 1,265 cases and 5,190 controls from the Australian/British Endometrial Cancer GWAS (stage II). Nine SNPs showed results consistent in direction with stage I with P < 0.1. These nine SNPs were investigated among 459 cases and 558 controls (stage IIIa) and six SNPs showed a direction of association consistent with stages I and II. These six SNPs, plus two additional SNPs selected on the basis of linkage disequilibrium and P values in stage II, were investigated among 5,936 cases and 8,166 controls from an additional 11 studies (stage IIIb). Results: SNP rs1202524, near the CAPN9 gene on chromosome 1q42.2, showed a consistent association with endometrial cancer risk across all three stages, with ORs of 1.09 [95% confidence interval (CI), 1.03–1.16] for the A/G genotype and 1.17 (95% CI, 1.05–1.30) for the G/G genotype (P = 1.6 × 10⁻⁴ in combined analyses of all samples). The association was stronger when limited to the endometrioid subtype, with ORs (95% CI) of 1.11 (1.04–1.18) and 1.21 (1.08–1.35), respectively (P = 2.4 × 10⁻⁵). Conclusions: Chromosome 1q42.2 may host an endometrial cancer susceptibility locus. Impact: This study identified a potential genetic locus for endometrial cancer risk.
Abstract:
Endometrial cancer is one of the most common female diseases in developed nations and is the most commonly diagnosed gynaecological cancer in Australia. The disease is commonly classified by histology: endometrioid or non-endometrioid endometrial cancer. While non-endometrioid endometrial cancers are accepted to be high-grade, aggressive cancers, endometrioid cancers (comprising 80% of all endometrial cancers diagnosed) generally carry a favourable patient prognosis. However, endometrioid endometrial cancer patients endure significant morbidity due to surgery and radiotherapy used for disease treatment, and patients with recurrent disease have a 5-year survival rate of less than 50%. Genetic analysis of women with endometrial cancer could uncover novel markers associated with disease risk and/or prognosis, which could then be used to identify women at high risk and to guide the use of specialised treatments. Proteases are widely accepted to play an important role in the development and progression of cancer. This PhD project hypothesised that SNPs from two protease gene families, the matrix metalloproteases (MMPs, including their tissue inhibitors, TIMPs) and the tissue kallikrein-related peptidases (KLKs), would be associated with endometrial cancer susceptibility and/or prognosis. In the first part of this study, optimisation of the genotyping techniques was performed. We then attempted to validate results from previously published endometrial cancer genetic association studies in a large, multicentre replication set (maximum cases n = 2,888, controls n = 4,483, 3 studies). The rs11224561 progesterone receptor SNP (PGR, A/G) was observed to be associated with increased endometrial cancer risk (per A allele OR 1.31, 95% CI 1.12-1.53; p-trend = 0.001), a result which was initially reported among a Chinese sample set. 
Previously reported associations for the remaining 8 SNPs investigated for this section of the PhD study were not confirmed, thereby reinforcing the importance of validation of genetic association studies. To examine the effect of SNPs from the MMP and KLK families on endometrial cancer risk, we selected the most significantly associated MMP and KLK SNPs from genome-wide association study (GWAS) analysis to be genotyped in the GWAS replication set (cases n = 4,725, controls n = 9,803, 13 studies). The significance of the MMP24 rs932562 SNP was unchanged after incorporation of the stage 2 samples (Stage 1 per allele OR 1.18, p = 0.002; Combined Stage 1 and 2 OR 1.09, p = 0.002). The rs10426 SNP, located 3' to KLK10, was predicted by bioinformatic analysis to affect miRNA binding. This SNP was observed in the GWAS stage 1 result to exhibit a recessive effect on endometrial cancer risk, a result which was not validated in the stage 2 sample set (Stage 1 OR 1.44, p = 0.007; Combined Stage 1 and 2 OR 1.14, p = 0.08). Investigation of the regions imputed surrounding the MMP, TIMP and KLK genes did not reveal any significant targets for further analysis. Analysis of the case data from the endometrial cancer GWAS to identify genetic variation associated with cancer grade did not reveal SNPs from the MMP, TIMP or KLK genes to be statistically significant. However, the representation of SNPs from the MMP, TIMP and KLK families by the GWAS genotyping platform used in this PhD project was examined and observed to be very low, with the genetic variation of four genes (MMP23A, MMP23B, MMP28 and TIMP1) not captured at all by this technique. This suggests that comprehensive candidate gene association studies will be required to assess the role of SNPs from these genes in endometrial cancer risk and prognosis. 
Meta-analysis of gene expression microarray datasets curated as part of this PhD study identified a number of MMP, TIMP and KLK genes to display differential expression by endometrial cancer status (MMP2, MMP10, MMP11, MMP13, MMP19, MMP25 and KLK1) and histology (MMP2, MMP11, MMP12, MMP26, MMP28, TIMP2, TIMP3, KLK6, KLK7, KLK11 and KLK12). In light of these findings these genes should be prioritised for future targeted genetic association studies. Two SNPs located 43.5 Mb apart on chromosome 15 were observed from the GWAS analysis to be associated with increased endometrial cancer grade, results that were validated in silico in two independent datasets. One of these SNPs, rs8035725, is located in the 5' untranslated region of a MYC promoter binding protein DENND4A (Stage 1 OR 1.15, p = 9.85 × 10⁻⁵; combined Stage 1 and in silico validation OR 1.13, p = 5.24 × 10⁻⁶). This SNP has previously been reported to alter the expression of PTPLAD1, a gene involved in the synthesis of very long fatty acid chains and in the Rac1 signaling pathway. Meta-analysis of gene expression microarray data found PTPLAD1 to display increased expression in the aggressive non-endometrioid histology compared with endometrioid endometrial cancer, suggesting that the causal SNP underlying the observed genetic association may influence expression of this gene. Neither rs8035725 nor significant SNPs identified by imputation were predicted bioinformatically to affect transcription factor binding sites, indicating that further studies are required to assess their potential effect on other regulatory elements. The other grade-associated SNP, rs6606792, is located upstream of an inferred pseudogene, ELMO2P1 (Stage 1 OR 1.12, p = 5 × 10⁻⁵; combined Stage 1 and in silico validation OR 1.09, p = 3.56 × 10⁻⁵). 
Imputation of the ±1 Mb region surrounding this SNP revealed a cluster of significantly associated variants which are predicted to abolish various transcription factor binding sites, and would be expected to decrease gene expression. ELMO2P1 was not included on the microarray platforms collected for this PhD, and so its expression could not be investigated. However, the high sequence homology of ELMO2P1 with ELMO2, a gene important to cell motility, indicates that ELMO2 could be the parent gene for ELMO2P1 and as such, ELMO2P1 could function to regulate the expression of ELMO2. Increased expression of ELMO2 was seen to be associated with increasing endometrial cancer grade, as well as with aggressive endometrial cancer histological subtypes by microarray meta-analysis. Thus, it is hypothesised that SNPs in linkage disequilibrium with rs6606792 decrease the transcription of ELMO2P1, reducing the regulatory effect of ELMO2P1 on ELMO2 expression. Consequently, ELMO2 expression is increased and cell motility is enhanced, leading to an aggressive endometrial cancer phenotype. In summary, these findings have identified several areas of research for further study. The results presented in this thesis provide evidence that a SNP in PGR is associated with risk of developing endometrial cancer. This PhD study also reports two independent loci on chromosome 15 to be associated with increased endometrial cancer grade, and furthermore, genes associated with these SNPs to be differentially expressed in aggressive subtypes and/or by grade. The studies reported in this thesis support the need for comprehensive SNP association studies on prioritised MMP, TIMP and KLK genes in large sample sets. Until these studies are performed, the role of MMP, TIMP and KLK genetic variation remains unclear. Overall, this PhD study has contributed to the understanding of genetic variation involvement in endometrial cancer susceptibility and prognosis. 
Importantly, the genetic regions highlighted in this study could lead to the identification of novel gene targets to better understand the biology of endometrial cancer and also aid in the development of therapeutics directed at treating this disease.
Abstract:
Susceptibility to complex traits, by definition, involves aetiological polymorphisms at multiple genetic loci combined with variable contributions by environmental factors. However, the approaches taken to identifying genetic loci implicated in susceptibility to complex traits frequently overlook the compounding contribution of multiple loci in favour of highlighting a single gene solely responsible for predisposition. It is only in a small minority of cases that this has resulted in clear disease heritability associated with polymorphisms in a single gene. More often, this approach has led to an accumulation of single-gene associations with minor contributions to disease susceptibility. As the genomic era advances and genome-wide screens become higher in resolution and throughput, the need for simultaneous consideration of multiple loci is becoming more important. With special reference to non-Hodgkin’s lymphoma (NHL), this chapter will overview the current progress made in elucidating genetic polymorphisms associated with disease susceptibility. We also present novel data from a high-resolution single nucleotide polymorphism (SNP) microarray screen for susceptibility loci that are involved in NHL. Using an ‘informed approach’, the findings are highlighted within the context of cellular pathways, and provide insight and new ideas for methods of analysis for genome-wide screens for susceptibility.
Abstract:
As of June 2009, 361 genome-wide association studies (GWAS) had been referenced by the HuGE database. GWAS require DNA from many thousands of individuals, relying on suitable DNA collections. We recently performed a multiple sclerosis (MS) GWAS where a substantial component of the cases (24%) had DNA derived from saliva. Genotyping was done on the Illumina genotyping platform using the Infinium Hap370CNV DUO microarray. Additionally, we genotyped 10 individuals in duplicate using both saliva- and blood-derived DNA. The performance of blood- versus saliva-derived DNA was compared using genotyping call rate, which reflects both the quantity and quality of genotyping per sample and the “GCScore,” an Illumina genotyping quality score, which is a measure of DNA quality. We also compared genotype calls and GCScores for the 10 sample pairs. Call rates were assessed for each sample individually. For the GWAS samples, we compared data according to source of DNA and center of origin. We observed high concordance in genotyping quality and quantity between the paired samples and minimal loss of quality and quantity of DNA in the saliva samples in the large GWAS sample, with the blood samples showing greater variation between centers of origin. This large data set highlights the usefulness of saliva DNA for genotyping, especially in high-density single-nucleotide polymorphism microarray studies such as GWAS.
Abstract:
The transient leaf assay in Nicotiana benthamiana is widely used in plant sciences, with one application being the rapid assembly of complex multigene pathways that produce new fatty acid profiles. This rapid and facile assay would be further improved if it were possible to simultaneously overexpress transgenes while accurately silencing endogenes. Here, we report a draft genome resource for N. benthamiana spanning over 75% of the 3.1 Gb haploid genome. This resource revealed a two-member NbFAD2 family, NbFAD2.1 and NbFAD2.2, and quantitative RT-PCR (qRT-PCR) confirmed their expression in leaves. FAD2 activities were silenced using hairpin RNAi as monitored by qRT-PCR and biochemical assays. Silencing of endogenous FAD2 activities was combined with overexpression of transgenes via the use of the alternative viral silencing-suppressor protein, V2, from Tomato yellow leaf curl virus. We show that V2 permits maximal overexpression of transgenes but, crucially, also allows hairpin RNAi to operate unimpeded. To illustrate the efficacy of the V2-based leaf assay system, endogenous lipids were shunted from the desaturation of 18:1 to elongation reactions beginning with 18:1 as substrate. These V2-based leaf assays produced ~50% more elongated fatty acid products than p19-based assays. Analyses of small RNA populations generated from hairpin RNAi against NbFAD2 confirm that the siRNA population is dominated by 21 and 22 nt species derived from the hairpin. Collectively, these new tools expand the range of uses and possibilities for metabolic engineering in transient leaf assays. © 2012 Naim et al.
Abstract:
Forward genetic screens have identified numerous genes involved in development and metabolism, and remain a cornerstone of biological research. However, to locate a causal mutation, the practice of crossing to a polymorphic background to generate a mapping population can be problematic if the mutant phenotype is difficult to recognize in the hybrid F2 progeny, or dependent on parental specific traits. Here, in a screen for leaf hyponasty mutants, we have performed a single backcross of an ethyl methanesulfonate (EMS)-generated hyponastic mutant to its parent. Whole genome deep sequencing of a bulked homozygous F2 population and analysis via the Next Generation EMS mutation mapping pipeline (NGM) unambiguously determined the causal mutation to be a single nucleotide polymorphism (SNP) residing in HASTY, a previously characterized gene involved in microRNA biogenesis. We have evaluated the feasibility of this backcross approach using three additional SNP mapping pipelines: SHOREmap, the GATK pipeline, and the samtools pipeline. Although there was variance in the identification of EMS SNPs, all returned the same outcome in clearly identifying the causal mutation in HASTY. The simplicity of performing a single parental backcross and genome sequencing a small pool of segregating mutants has great promise for identifying mutations that may be difficult to map using conventional approaches.
Abstract:
The nucleotide sequences of several animal, plant and bacterial genomes are now known, but the functions of many of the proteins that they are predicted to encode remain unclear. RNA interference is a gene-silencing technology that is being used successfully to investigate gene function in several organisms - for example, Caenorhabditis elegans. We discuss here that RNA-induced gene silencing approaches are also likely to be effective for investigating plant gene function in a high-throughput, genome-wide manner.
Abstract:
The complete nucleotide sequence of Subterranean clover mottle virus (SCMoV) genomic RNA has been determined. The SCMoV genome is 4,258 nucleotides in length. It shares most nucleotide and amino acid sequence identity with the genome of Lucerne transient streak virus (LTSV). SCMoV RNA encodes four overlapping open reading frames and has a genome organisation similar to that of Cocksfoot mottle virus (CfMV). ORF1 and ORF4 are predicted to encode single proteins. ORF2 is predicted to encode two proteins that are derived from a -1 translational frameshift between two overlapping reading frames (ORF2a and ORF2b). A search of amino acid databases did not find a significant match for ORF1 and the function of this protein remains unclear. ORF2a contains a motif typical of chymotrypsin-like serine proteases and ORF2b has motifs characteristically present in positive-stranded RNA-dependent RNA polymerases. ORF4 is likely to be expressed from a subgenomic RNA and encodes the viral coat protein. The ORF2a/ORF2b overlapping gene expression strategy used by SCMoV and CfMV is similar to that of the poleroviruses and differs from that of other published sobemoviruses. These results suggest that the sobemoviruses could now be divided into two distinct subgroups based on those that express the RNA-dependent RNA polymerase from a single, in-frame polyprotein, and those that express it via a -1 translational frameshifting mechanism.
Abstract:
The complete nucleotide sequence of genome segment S4 of rice ragged stunt oryzavirus (RRSV, Thai-isolate) was determined. The 3823 bp sequence contains two large open reading frames (ORFs). ORF1, spanning nucleotides 12 to 3776, is capable of encoding a protein of M(r) 141,380 (P4a). The P4a amino acid sequence predicted from the nucleotide sequence contains sequence motifs conserved in RNA-dependent RNA polymerases (RDRPs). When compared for evolutionary relationships with RDRPs of other reoviruses using the amino acid sequences around the conserved GDD motif, P4a was shown to be more related to Nilaparvata lugens reovirus and reovirus serotype 3 than to rice dwarf phytoreovirus, bovine rotavirus or bluetongue virus. ORF2, spanning nucleotides 491 to 1468, is out of frame with ORF1 and is capable of encoding a protein of M(r) 36,920 (P4b). Coupled in vitro transcription-translation from cloned ORF2 in wheat germ extract confirmed the existence of ORF2, but in vivo production and possible function of P4b are yet to be determined.
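The ORF coordinates quoted in the abstract above can be checked with simple arithmetic. The sketch below is illustrative only: the helper names are hypothetical, coordinates are assumed to be 1-based and inclusive, and whether the quoted ranges include the stop codon is not stated in the abstract.

```python
def orf_length_nt(start: int, end: int) -> int:
    """Length in nucleotides of an ORF given 1-based inclusive coordinates."""
    return end - start + 1


def orf_codons(start: int, end: int) -> int:
    """Number of whole codons spanned; raises if the span is not a multiple of 3."""
    length = orf_length_nt(start, end)
    if length % 3 != 0:
        raise ValueError(f"span of {length} nt is not a whole number of codons")
    return length // 3


def same_frame(start_a: int, start_b: int) -> bool:
    """Two ORFs on the same strand share a frame if their starts differ by a multiple of 3."""
    return (start_a - start_b) % 3 == 0


if __name__ == "__main__":
    # Coordinates as quoted in the abstract for RRSV segment S4.
    print("ORF1:", orf_length_nt(12, 3776), "nt,", orf_codons(12, 3776), "codons")
    print("ORF2:", orf_length_nt(491, 1468), "nt,", orf_codons(491, 1468), "codons")
    print("ORF1 and ORF2 in the same frame:", same_frame(12, 491))
```

Both spans come out as whole numbers of codons, and the start positions differ by a value that is not a multiple of three, which is consistent with the statement that ORF2 is out of frame with ORF1.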