112 results for Prokaryotic Genomes
Abstract:
Psittacine beak and feather disease (PBFD) has a broad host range and is widespread in wild and captive psittacine populations in Asia, Africa, the Americas, Europe and Australasia. Beak and feather disease circovirus (BFDV) is the causative agent. BFDV has an ~2 kb single-stranded circular DNA genome encoding just two proteins (Rep and CP). In this study we provide support for demarcation of BFDV strains by phylogenetic analysis of 65 complete genomes from databases and 22 new BFDV sequences isolated from infected psittacines in South Africa. We propose 94% genome-wide sequence identity as a strain demarcation threshold, with isolates sharing >94% identity belonging to the same strain, and strain subtypes sharing >98% identity. Currently, BFDV diversity falls within 14 strains, with five highly divergent isolates from budgerigars probably representing a new species of circovirus with three strains (budgerigar circovirus; BCV-A, -B and -C). The geographical distribution of BFDV and BCV strains is strongly linked to the international trade in exotic birds; strains with more than one host are generally located in the same geographical area. Lastly, we examined BFDV and BCV sequences for evidence of recombination, and determined that recombination had occurred in most BFDV and BCV strains. We established that there were two globally significant recombination hotspots in the viral genome: the first is along the entire intergenic region and the second is in the C-terminal portion of the CP ORF. The implications of our results for the taxonomy and classification of circoviruses are discussed. © 2011 SGM.
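The demarcation rule proposed in this abstract (>94% genome-wide identity for the same strain, >98% for the same subtype) can be illustrated with a minimal Python sketch. It assumes the two genomes have already been aligned to equal length with `-` as the gap character; the function names and the gap handling are illustrative choices, not the authors' published procedure.

```python
def pairwise_identity(a: str, b: str) -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Columns where both sequences carry a gap are skipped; a gap opposite
    a base counts as a mismatch (an assumption made for this sketch).
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" and y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


def classify_pair(identity: float,
                  strain_threshold: float = 94.0,
                  subtype_threshold: float = 98.0) -> str:
    """Apply the thresholds quoted in the abstract to one identity value."""
    if identity > subtype_threshold:
        return "same subtype"
    if identity > strain_threshold:
        return "same strain"
    return "different strains"


if __name__ == "__main__":
    ident = pairwise_identity("ACGTACGTAC", "ACGTACGTAA")
    print(ident, classify_pair(ident))
```

Both comparisons are strict, following the abstract's "> 94%" and "> 98%" wording, so a pair at exactly 94% identity is treated as belonging to different strains.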
Abstract:
Background: Maize streak virus strain A (MSV-A; genus Mastrevirus, family Geminiviridae), the maize-adapted strain of MSV that causes maize streak disease throughout sub-Saharan Africa, probably arose between 100 and 200 years ago via homologous recombination between two MSV strains adapted to wild grasses. MSV recombination experiments and analyses of natural MSV recombination patterns have revealed that this recombination event entailed the exchange of the movement protein - coat protein gene cassette, bounded by the two genomic regions most prone to recombination in mastrevirus genomes: the first surrounding the virion-strand origin of replication, and the second around the interface between the coat protein gene and the short intergenic region. Therefore, aside from the likely adaptive advantages presented by a modular exchange of this cassette, these specific breakpoints may have been largely predetermined by the underlying mechanisms of mastrevirus recombination. To investigate this hypothesis, we constructed artificial, low-fitness, reciprocal chimaeric MSV genomes using alternating genomic segments from two MSV strains: a grass-adapted MSV-B and a maize-adapted MSV-A. Between them, each pair of reciprocal chimaeric genomes represented all of the genetic material required to reconstruct - via recombination - the highly maize-adapted MSV-A genotype, MSV-MatA. We then co-infected a selection of differentially MSV-resistant maize genotypes with pairs of reciprocal chimaeras to determine the efficiency with which recombination would give rise to high-fitness progeny genomes resembling MSV-MatA. Results: Recombinants resembling MSV-MatA invariably arose in all of our experiments.
However, the accuracy and efficiency with which the MSV-MatA genotype was recovered across all replicates of each experiment depended on the MSV susceptibility of the maize genotypes used and the precise positions - in relation to known recombination hotspots - of the breakpoints required to re-create MSV-MatA. Although the MSV-sensitive maize genotype gave rise to the greatest variety of recombinants, the measured fitness of each of these recombinants correlated with their similarity to MSV-MatA. Conclusions: The mechanistic predispositions of different MSV genomic regions to recombination can strongly influence the accessibility of high-fitness MSV recombinants. The frequency with which the fittest recombinant MSV genomes arise also correlates directly with the escalating selection pressures imposed by increasingly MSV-resistant maize hosts.
Abstract:
Exponential growth of genomic data in the last two decades has made manual analyses impractical for all but trial studies. As genomic analyses have become more sophisticated, and move toward comparisons across large datasets, computational approaches have become essential. One of the most important biological questions is to understand the mechanisms underlying gene regulation. Genetic regulation is commonly investigated and modelled through the use of transcriptional regulatory network (TRN) structures. These model the regulatory interactions between two key components: transcription factors (TFs) and the target genes (TGs) they regulate. Transcriptional regulatory networks have proven to be invaluable scientific tools in Bioinformatics. When used in conjunction with comparative genomics, they have provided substantial insights into the evolution of regulatory interactions. Current approaches to regulatory network inference, however, omit two additional key entities: promoters and transcription factor binding sites (TFBSs). In this study, we attempted to explore the relationships among these regulatory components in bacteria. Our primary goal was to identify relationships that can assist in reducing the high false positive rates associated with transcription factor binding site predictions and thereupon enhance the reliability of the inferred transcription regulatory networks. In our preliminary exploration of relationships between the key regulatory components in Escherichia coli transcription, we discovered a number of potentially useful features. The combination of location score and sequence dissimilarity scores increased de novo binding site prediction accuracy by 13.6%. Another important observation made was with regards to the relationship between transcription factors grouped by their regulatory role and corresponding promoter strength. 
Our study of E. coli σ70 promoters found support at the 0.1 significance level for our hypothesis that weak promoters are preferentially associated with activator binding sites to enhance gene expression, whilst strong promoters have more repressor binding sites to repress or inhibit gene transcription. Although the observations were specific to σ70, they nevertheless strongly encourage additional investigations when more experimentally confirmed data are available. In our preliminary exploration of relationships between the key regulatory components in E. coli transcription, we discovered a number of potentially useful features, some of which proved successful in reducing the number of false positives when applied to re-evaluate binding site predictions. Of chief interest was the relationship observed between promoter strength and TFs with respect to their regulatory role. Based on the common assumption that promoter homology positively correlates with transcription rate, we hypothesised that weak promoters would have more transcription factors that enhance gene expression, whilst strong promoters would have more repressor binding sites. The t-tests for E. coli σ70 promoters returned a p-value of 0.072, which at the 0.1 significance level suggested support for our (alternative) hypothesis, although this trend may only be present for promoters where the corresponding TFBSs are either all repressors or all activators. Nevertheless, such suggestive results strongly encourage additional investigations when more experimentally confirmed data become available. Much of the remainder of the thesis concerns a machine learning study of binding site prediction, using the SVM and kernel methods, principally the spectrum kernel.
Spectrum kernels have been successfully applied in previous studies of protein classification [91, 92], as well as the related problem of promoter predictions [59], and we have here successfully applied the technique to refining TFBS predictions. The advantages provided by the SVM classifier were best seen in 'moderately' conserved transcription factor binding sites as represented by our E. coli CRP case study. Inclusion of additional position feature attributes further increased accuracy by 9.1%, but more notable was the considerable decrease in false positive rate from 0.8 to 0.5 while retaining 0.9 sensitivity. Improved prediction of transcription factor binding sites is in turn extremely valuable in improving inference of regulatory relationships, a problem notoriously prone to false positive predictions. Here, the number of false regulatory interactions inferred using the conventional two-component model was substantially reduced when we integrated de novo transcription factor binding site predictions as an additional criterion for acceptance in a case study of inference in the Fur regulon. This initial work was extended to a comparative study of the iron regulatory system across 20 Yersinia strains. This work revealed interesting, strain-specific differences, especially between pathogenic and non-pathogenic strains. Such differences were made clear through interactive visualisations using the TRNDiff software developed as part of this work, and would have remained undetected using conventional methods. This approach led to the nomination of the Yfe iron-uptake system as a candidate for further wet-lab experimentation due to its potential active functionality in non-pathogens and its known participation in full virulence of the bubonic plague strain. Building on this work, we introduced novel structures we have labelled as 'regulatory trees', inspired by the phylogenetic tree concept.
Instead of using gene or protein sequence similarity, the regulatory trees were constructed based on the number of similar regulatory interactions. While the common phylogenetic trees convey information regarding changes in gene repertoire, which we might regard as analogous to 'hardware', the regulatory tree informs us of the changes in regulatory circuitry, in some respects analogous to 'software'. In this context, we explored the 'pan-regulatory network' for the Fur system, the entire set of regulatory interactions found for the Fur transcription factor across a group of genomes. In the pan-regulatory network, emphasis is placed on how the regulatory network for each target genome is inferred from multiple sources instead of a single source, as is the common approach. The benefit of using multiple reference networks is a more comprehensive survey of the relationships and increased confidence in the regulatory interactions predicted. In the present study, we distinguish between relationships found across the full set of genomes, the 'core-regulatory-set', and interactions found only in a subset of the genomes explored, the 'sub-regulatory-set'. We found nine Fur target gene clusters present across the four genomes studied; this core set potentially identifies basic regulatory processes essential for survival. Species-level differences are seen at the sub-regulatory-set level; for example, the known virulence factors YbtA and PchR were found in Y. pestis and P. aeruginosa respectively, but were absent from both E. coli and B. subtilis. Such factors, and the iron-uptake systems they regulate, are ideal candidates for wet-lab investigation to determine whether or not they are pathogen-specific. In this study, we employed a broad range of approaches to address our goals and assessed these methods using the Fur regulon as our initial case study.
We identified a set of promising feature attributes; demonstrated their success in increasing transcription factor binding site prediction specificity while retaining sensitivity, and showed the importance of binding site predictions in enhancing the reliability of regulatory interaction inferences. Most importantly, these outcomes led to the introduction of a range of visualisations and techniques, which are applicable across the entire bacterial spectrum and can be utilised in studies beyond the understanding of transcriptional regulatory networks.
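The spectrum kernel named in this abstract can be sketched briefly. In its standard form, the k-spectrum kernel of two sequences is the inner product of their k-mer count vectors. The Python below is a generic illustration of that definition, not the thesis implementation; the function names, the choice of k, and the example sequences are all assumptions made for the sketch.

```python
from collections import Counter
from math import sqrt


def kmer_counts(seq: str, k: int) -> Counter:
    """Count every overlapping k-mer in seq (empty if seq is shorter than k)."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def spectrum_kernel(s: str, t: str, k: int = 3) -> int:
    """k-spectrum kernel: dot product of the two k-mer count vectors."""
    cs, ct = kmer_counts(s, k), kmer_counts(t, k)
    # Only k-mers present in both sequences contribute to the sum.
    return sum(n * ct[kmer] for kmer, n in cs.items())


def normalised_spectrum_kernel(s: str, t: str, k: int = 3) -> float:
    """Cosine-normalised variant, so that k(s, s) == 1 for any non-trivial s."""
    denom = sqrt(spectrum_kernel(s, s, k) * spectrum_kernel(t, t, k))
    return spectrum_kernel(s, t, k) / denom if denom else 0.0


if __name__ == "__main__":
    # Hypothetical candidate binding-site sequences, for illustration only.
    site_a = "TGTGATCTAGATCACA"
    site_b = "TGTGACGTAGATCACT"
    print(spectrum_kernel(site_a, site_b, k=3))
    print(normalised_spectrum_kernel(site_a, site_b, k=3))
```

A matrix of such kernel values over a set of candidate sites could then be handed to any SVM implementation that accepts a precomputed kernel; how the thesis combined the kernel with the position feature attributes is not stated in the abstract.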
Abstract:
One of the next great challenges of cell biology is the determination of the enormous number of protein structures encoded in genomes. In recent years, advances in electron cryo-microscopy and high-resolution single particle analysis have developed to the point where they now provide a methodology for high-resolution structure determination. Using this approach, images of randomly oriented single particles are aligned computationally to reconstruct 3-D structures of proteins and even whole viruses. One of the limiting factors in obtaining high-resolution reconstructions is obtaining a large enough representative dataset (>100,000 particles). Traditionally, particles have been picked manually, which is an extremely labour-intensive process. The problem is made especially difficult by the low signal-to-noise ratio of the images. This paper describes the development of automatic particle picking software, which has been tested with both negatively stained and cryo-electron micrographs. This algorithm has been shown to be capable of selecting most of the particles, with few false positives. Further work will involve extending the software to detect differently shaped and oriented particles.
Abstract:
Several major human pathogens, including the filoviruses, paramyxoviruses, and rhabdoviruses, package their single-stranded RNA genomes within helical nucleocapsids, which bud through the plasma membrane of the infected cell to release enveloped virions. The virions are often heterogeneous in shape, which makes it difficult to study their structure and assembly mechanisms. We have applied cryo-electron tomography and sub-tomogram averaging methods to derive structures of Marburg virus, a highly pathogenic filovirus, both after release and during assembly within infected cells. The data demonstrate the potential of cryo-electron tomography methods to derive detailed structural information for intermediate steps in biological pathways within intact cells. We describe the location and arrangement of the viral proteins within the virion. We show that the N-terminal domain of the nucleoprotein contains the minimal assembly determinants for a helical nucleocapsid with variable number of proteins per turn. Lobes protruding from alternate interfaces between each nucleoprotein are formed by the C-terminal domain of the nucleoprotein, together with viral proteins VP24 and VP35. Each nucleoprotein packages six RNA bases. The nucleocapsid interacts in an unusual, flexible "Velcro-like" manner with the viral matrix protein VP40. Determination of the structures of assembly intermediates showed that the nucleocapsid has a defined orientation during transport and budding. Together the data show striking architectural homology between the nucleocapsid helix of rhabdoviruses and filoviruses, but unexpected, fundamental differences in the mechanisms by which the nucleocapsids are then assembled together with matrix proteins and initiate membrane envelopment to release infectious virions, suggesting that the viruses have evolved different solutions to these conserved assembly steps.
Abstract:
The marsupial genus Macropus includes three subgenera: the familiar large grazing kangaroos and wallaroos of M. (Macropus) and M. (Osphranter), as well as the smaller mixed grazing/browsing wallabies of M. (Notamacropus). A recent study of five concatenated nuclear genes recommended subsuming the predominantly browsing Wallabia bicolor (swamp wallaby) into Macropus. To further examine this proposal we sequenced partial mitochondrial genomes for kangaroos and wallabies. These sequences strongly favour the morphological placement of W. bicolor as sister to Macropus, although they place M. irma (black-gloved wallaby) within M. (Osphranter) rather than, as expected, with M. (Notamacropus). Species tree estimation from separately analysed mitochondrial and nuclear genes favours retaining Macropus and Wallabia as separate genera. A simulation study finds that incomplete lineage sorting among nuclear genes is a plausible explanation for incongruence with the mitochondrial placement of W. bicolor, while mitochondrial introgression from a wallaroo into M. irma is the deepest such event identified in marsupials. Similar coalescent simulations for interpreting gene tree conflicts will increase in both relevance and statistical power as species-level phylogenetics enters the genomic age. Ecological considerations, in turn, hint at a role for selection in accelerating the fixation of introgressed or incompletely sorted loci. More generally, the inclusion of the mitochondrial sequences substantially enhanced phylogenetic resolution. However, we caution that the evolutionary dynamics that enhance mitochondria as speciation indicators in the presence of incomplete lineage sorting may also render them especially susceptible to introgression.
Abstract:
Phylogenetic inference from sequences can be misled by both sampling (stochastic) error and systematic error (nonhistorical signals where reality differs from our simplified models). A recent study of eight yeast species using 106 concatenated genes from complete genomes showed that even small internal edges of a tree received 100% bootstrap support. This effective negation of stochastic error from large data sets is important, but longer sequences exacerbate the potential for biases (systematic error) to be positively misleading. Indeed, when we analyzed the same data set using minimum evolution optimality criteria, an alternative tree received 100% bootstrap support. We identified a compositional bias as responsible for this inconsistency and showed that it is reduced effectively by coding the nucleotides as purines and pyrimidines (RY-coding), reinforcing the original tree. Thus, a comprehensive exploration of potential systematic biases is still required, even though genome-scale data sets greatly reduce sampling error.
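RY-coding, as described in this abstract, recodes each nucleotide by chemical class: purines (A, G) become R and pyrimidines (C, T) become Y. A minimal Python sketch follows; treating U as a pyrimidine and leaving gaps and ambiguity codes unchanged are assumptions of this sketch, not details taken from the study.

```python
from collections import Counter

# Purines (A, G) -> R; pyrimidines (C, T, and U for RNA) -> Y.
_RY_TABLE = str.maketrans({"A": "R", "G": "R", "C": "Y", "T": "Y", "U": "Y"})


def ry_code(seq: str) -> str:
    """Recode a nucleotide sequence into the two-state R/Y alphabet.

    Characters other than A, C, G, T, U (gaps, N, other ambiguity codes)
    are passed through unchanged.
    """
    return seq.upper().translate(_RY_TABLE)


def composition(seq: str) -> dict:
    """Relative frequency of each symbol, useful for comparing base
    composition across taxa before and after recoding."""
    counts = Counter(seq)
    total = sum(counts.values())
    return {symbol: n / total for symbol, n in counts.items()}


if __name__ == "__main__":
    gc_rich = "GGCCGGCCGC"
    at_rich = "AATTAATTAT"
    # The two sequences differ strongly in nucleotide composition,
    # but are identical once recoded.
    print(ry_code(gc_rich), ry_code(at_rich))
    print(composition(gc_rich), composition(ry_code(gc_rich)))
```

The two example sequences share no nucleotides yet recode to the same R/Y string, which illustrates how recoding can remove compositional differences between taxa while keeping the transversion signal.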
Abstract:
It is exciting to be living at a time when the big questions in biology can be investigated using modern genetics and computing [1]. Bauzà-Ribot et al. [2] take on one of the fundamental drivers of biodiversity, the effect of continental drift in the formation of the world’s biota [3,4], employing next-generation sequencing of whole mitochondrial genomes and modern Bayesian relaxed molecular clock analysis. Bauzà-Ribot et al. [2] conclude that vicariance via plate tectonics best explains the genetic divergence between subterranean metacrangonyctid amphipods currently found on islands separated by the Atlantic Ocean. This finding is a big deal in biogeography, and science generally [3], as many other presumed biotic tectonic divergences have been explained as probably due to more recent transoceanic dispersal events [4]. However, molecular clocks can be problematic [5,6] and we have identified three issues with the analyses of Bauzà-Ribot et al. [2] that cast serious doubt on their results and conclusions. When we reanalyzed their mitochondrial data and attempted to account for problems with calibration [5,6], modeling rates across branches [5,7] and substitution saturation [5], we inferred a much younger date for their key node. This implies either a later trans-Atlantic dispersal of these crustaceans, or more likely a series of later invasions of freshwaters from a common marine ancestor, but either way probably not ancient tectonic plate movements.
Abstract:
Global aquaculture has expanded rapidly to address the increasing demand for aquatic protein and an uncertain future for wild fisheries. To date, however, most farmed aquatic stocks are essentially wild and little is known about their genomes or the genes that affect important economic traits in culture. Biologists have recognized that recent technological advances, including next generation sequencing (NGS), have opened up the possibility of generating genome-wide sequence data sets rapidly from non-model organisms at a reasonable cost. In an era when virtually any study organism can 'go genomic', understanding gene function and genetic effects on expressed quantitative trait locus phenotypes will be fundamental to future knowledge development. Many factors can influence the individual growth rate in target species, but of particular importance in agriculture and aquaculture will be the identification and characterization of the specific gene loci that contribute important phenotypic variation to growth, because the information can be applied to speed up genetic improvement programmes and to increase productivity via marker-assisted selection (MAS). While currently there is only limited genomic information available for any crustacean species, a number of putative candidate genes have been identified or implicated in growth and muscle development in some species. In an effort to stimulate increased research on the identification of growth-related genes in crustacean species, here we review the available information on: (i) associations between genes and growth reported in crustaceans; (ii) growth-related genes involved with moulting; (iii) muscle development and degradation genes involved in moulting; and (iv) correlations between DNA sequences that have confirmed growth trait effects in farmed animal species used in terrestrial agriculture and related sequences in crustacean species.
The information in concert can provide a foundation for increasing the rate at which knowledge about key genes affecting growth traits in crustacean species is gained.
Abstract:
The high risk of metabolic disease traits in Polynesians may be partly explained by elevated prevalence of genetic variants involved in energy metabolism. The genetics of Polynesian populations has been shaped by island-hopping migration events which have possibly favoured thrifty genes. The aim of this study was to sequence the mitochondrial genome in a group of Maori in an effort to characterise genome variation in this Polynesian population for use in future disease association studies. We sequenced the complete mitochondrial genomes of 20 non-admixed Maori subjects using Affymetrix technology. DNA diversity analyses showed the Maori group exhibited reduced mitochondrial genome diversity compared to other worldwide populations, which is consistent with historical bottleneck and founder effects. Global phylogenetic analysis positioned these Maori subjects specifically within mitochondrial haplogroup B4a1a1. Interestingly, we identified several novel variants that collectively form new and unique Maori motifs: B4a1a1c, B4a1a1a3 and B4a1a1a5. Compared to ancestral populations we observed an increased frequency of non-synonymous coding variants of several mitochondrial genes in the Maori group, which may be a result of positive selection and/or genetic drift effects. In conclusion, this study reports the first complete mitochondrial genome sequence data for a Maori population. Overall, these new data reveal novel mitochondrial genome signatures in this Polynesian population and enhance the phylogenetic picture of maternal ancestry in Oceania. The increased frequency of several mitochondrial coding variants makes them good candidates for future studies aimed at assessment of metabolic disease risk in Polynesian populations.
Abstract:
The population of Norfolk Island, located off the eastern coast of Australia, possesses an unusual and fascinating history. Most present-day islanders are related to a small number of the 'Bounty' mutineer founders. These founders consisted of Caucasian males and Polynesian females and led to an admixed present-day population. By examining a single large pedigree of 5742 individuals, spanning >200 years, we analyzed the influence of admixture and founder effect on various cardiovascular disease (CVD)-related traits. On account of the relative isolation of the population, on average one-third of the genomes of present-day islanders (single large pedigree individuals) is derived from 17 initial founders. The proportion of Polynesian ancestry in the present-day individuals was found to significantly influence total triglycerides, body mass index, systolic blood pressure and diastolic blood pressure. For various cholesterol traits, the influence of ancestry was less marked but overall the direction of effect for all CVD-related traits was consistent with Polynesian ancestry conferring greater CVD risk. Marker-derived homozygosity was computed and agreed with measures of inbreeding derived from pedigree information. Founder effect (inbreeding and marker-derived homozygosity) significantly influenced height. In conclusion, both founder effect and extreme admixture have substantially influenced the genetic architecture of a variety of CVD-related traits in this population.
Abstract:
QUT Library continues to rethink research support with eResearch as a primary driver. Support for the development of the Lens, an open global cyberinfrastructure, has been especially important in light of technology transfer promotion, and partly in response to researchers’ needs to follow innovation landscapes in the patent literature as well as the scientific literature. The Lens http://www.lens.org/lens/ project makes innovation more efficient, fair, transparent and inclusive. It is a joint effort between Cambia http://www.cambia.org.au and Queensland University of Technology (QUT). The Lens serves more than 84 million patent documents in the world as open, annotatable digital public goods that are integrated with scholarly and technical literature along with regulatory and business data. Users can link from search results to visualization and document clusters, and from a patent document description to its full text; from there, if applicable, the sequence data can also be found. Figure 1 shows a BLAST Alignment (DNA) using the Lens. A unique feature of the Lens is the ability to embed search and BLAST results into blogs and websites, and provide real-time updates to them. PatSeq Explorer http://www.lens.org/lens/bio/patseqexplorer allows users to navigate patent sequences that map onto the human genome and, in the future, many other genomes. PatSeq Explorer offers three levels of view for the sequence information and links each group of sequences at the chromosomal level to their corresponding patent documents in the Lens. By integrating sequence and patent search and document clustering capabilities, users can now examine, in broad and fine detail, the true extent and scope of genetic sequence patents. QUT Library supported Cambia in developing, testing and promoting the Lens.
This poster demonstrates QUT Library’s provision of best practice and holistic research support to a research group and how QUT Librarians have acquired new capabilities to meet the needs of the researchers beyond traditional research support practices.
Abstract:
Viroids and most viral satellites have small, noncoding, and highly structured RNA genomes. How they cause disease symptoms without encoding proteins and why they have characteristic secondary structures are two longstanding questions. Recent studies have shown that both viroids and satellites are capable of inducing RNA silencing, suggesting a possible role of this mechanism in the pathology and evolution of these subviral RNAs. Here we show that preventing RNA silencing in tobacco, using a silencing suppressor, greatly reduces the symptoms caused by the Y satellite of cucumber mosaic virus. Furthermore, tomato plants expressing hairpin RNA, derived from potato spindle tuber viroid, developed symptoms similar to those of potato spindle tuber viroid infection. These results provide evidence suggesting that viroids and satellites cause disease symptoms by directing RNA silencing against physiologically important host genes. We also show that viroid and satellite RNAs are significantly resistant to RNA silencing-mediated degradation, suggesting that RNA silencing is an important selection pressure shaping the evolution of the secondary structures of these pathogens.
Abstract:
Carrot mottle umbravirus (CMoV) has always been found co-infecting plants with carrot red leaf luteovirus (CRLV) and in carrot (Daucus carota) these co-infections are associated with carrot motley dwarf disease (CMD). CMD occurs wherever carrots are grown. Hence, CMoV was believed to have a corresponding global distribution. However, little or no hybridisation was detected between cDNA generated from the sequenced Australian isolate of CMoV (CMoV-A) and RNA from the much studied Scottish isolate of CMoV (CMoV-S). A weak hybridisation signal was obtained using cDNA to a conserved part of the RNA-dependent RNA polymerase gene of CMoV-A, but when cDNAs to other parts of the CMoV-A genome were used as probes there was no detectable hybridisation with CMoV-S RNA. This lack of hybridisation suggests that the two virus isolates have relatively divergent genomes and that they should be regarded as distinct virus species. Both viruses are transmitted by Cavariella aegopodii, but only with the help of CRLV, and they yield almost identical double-stranded RNA profiles. For these reasons, we propose that the CMoV isolate from Australia be renamed carrot mottle mimic umbravirus (CMoMV). cDNA to CMoMV RNA hybridised with RNA from an isolate from New Zealand, whereas cDNA to CMoV-S RNA hybridised with RNA from isolates from England and Morocco but not to RNA from the isolate from New Zealand. Although preliminary, these data suggest that CMoV and CMoMV may have different global distributions.
Abstract:
Most multicellular organisms regulate developmental transitions by microRNAs, which are generated by an enzyme, Dicer. Insects and fungi have two Dicer-like genes, and many animals have only one, yet the plant, Arabidopsis, has four. Examining the poplar and rice genomes revealed that they contain five and six Dicer-like genes, respectively. Analysis of these genes suggests that plants require a basic set of four Dicer types which were present before the divergence of mono- and dicotyledonous plants (∼200 million years ago), but after the divergence of plants from green algae. A fifth type of Dicer seems to have evolved in monocots. © 2006 Federation of European Biochemical Societies.