356 results for Phylogenetic species concept
in Queensland University of Technology - ePrints Archive
Abstract:
Exponential growth of genomic data in the last two decades has made manual analyses impractical for all but trial studies. As genomic analyses have become more sophisticated, and move toward comparisons across large datasets, computational approaches have become essential. One of the most important biological questions is to understand the mechanisms underlying gene regulation. Genetic regulation is commonly investigated and modelled through the use of transcriptional regulatory network (TRN) structures. These model the regulatory interactions between two key components: transcription factors (TFs) and the target genes (TGs) they regulate. Transcriptional regulatory networks have proven to be invaluable scientific tools in Bioinformatics. When used in conjunction with comparative genomics, they have provided substantial insights into the evolution of regulatory interactions. Current approaches to regulatory network inference, however, omit two additional key entities: promoters and transcription factor binding sites (TFBSs). In this study, we attempted to explore the relationships among these regulatory components in bacteria. Our primary goal was to identify relationships that can assist in reducing the high false positive rates associated with transcription factor binding site predictions and thereupon enhance the reliability of the inferred transcription regulatory networks. In our preliminary exploration of relationships between the key regulatory components in Escherichia coli transcription, we discovered a number of potentially useful features. The combination of location score and sequence dissimilarity scores increased de novo binding site prediction accuracy by 13.6%. Another important observation made was with regards to the relationship between transcription factors grouped by their regulatory role and corresponding promoter strength. 
Our study of E. coli σ70 promoters found support at the 0.1 significance level for our hypothesis that weak promoters are preferentially associated with activator binding sites to enhance gene expression, whilst strong promoters have more repressor binding sites to repress or inhibit gene transcription. Although the observations were specific to σ70, they nevertheless strongly encourage additional investigations when more experimentally confirmed data are available. In our preliminary exploration of relationships between the key regulatory components in E. coli transcription, we discovered a number of potentially useful features, some of which proved successful in reducing the number of false positives when applied to re-evaluate binding site predictions. Of chief interest was the relationship observed between promoter strength and TFs with respect to their regulatory role. Based on the common assumption that promoter homology positively correlates with transcription rate, we hypothesised that weak promoters would have more transcription factors that enhance gene expression, whilst strong promoters would have more repressor binding sites. The t-tests for E. coli σ70 promoters returned a p-value of 0.072, which at the 0.1 significance level suggested support for our (alternative) hypothesis, although this trend may only be present for promoters whose corresponding TFBSs are either all repressors or all activators. Nevertheless, such suggestive results strongly encourage additional investigations when more experimentally confirmed data become available. Much of the remainder of the thesis concerns a machine learning study of binding site prediction, using the SVM and kernel methods, principally the spectrum kernel.
Spectrum kernels have been successfully applied in previous studies of protein classification [91, 92], as well as the related problem of promoter prediction [59], and we have here successfully applied the technique to refining TFBS predictions. The advantages provided by the SVM classifier were best seen in 'moderately'-conserved transcription factor binding sites, as represented by our E. coli CRP case study. Inclusion of additional position feature attributes further increased accuracy by 9.1%, but more notable was the considerable decrease in false positive rate from 0.8 to 0.5 while retaining 0.9 sensitivity. Improved prediction of transcription factor binding sites is in turn extremely valuable in improving inference of regulatory relationships, a problem notoriously prone to false positive predictions. Here, the number of false regulatory interactions inferred using the conventional two-component model was substantially reduced when we integrated de novo transcription factor binding site predictions as an additional criterion for acceptance in a case study of inference in the Fur regulon. This initial work was extended to a comparative study of the iron regulatory system across 20 Yersinia strains. This work revealed interesting, strain-specific differences, especially between pathogenic and non-pathogenic strains. Such differences were made clear through interactive visualisations using the TRNDiff software developed as part of this work, and would have remained undetected using conventional methods. This approach led to the nomination of the Yfe iron-uptake system as a candidate for further wet-lab experimentation due to its potential active functionality in non-pathogens and its known participation in full virulence of the bubonic plague strain. Building on this work, we introduced novel structures we have labelled as 'regulatory trees', inspired by the phylogenetic tree concept.
Instead of using gene or protein sequence similarity, the regulatory trees were constructed based on the number of similar regulatory interactions. While the common phylogenetic trees convey information regarding changes in gene repertoire, which we might regard as analogous to 'hardware', the regulatory tree informs us of the changes in regulatory circuitry, in some respects analogous to 'software'. In this context, we explored the 'pan-regulatory network' for the Fur system, the entire set of regulatory interactions found for the Fur transcription factor across a group of genomes. In the pan-regulatory network, emphasis is placed on how the regulatory network for each target genome is inferred from multiple sources instead of a single source, as is the common approach. The benefit of using multiple reference networks is a more comprehensive survey of the relationships, and increased confidence in the regulatory interactions predicted. In the present study, we distinguish between relationships found across the full set of genomes as the 'core-regulatory-set', and interactions found only in a subset of genomes explored as the 'sub-regulatory-set'. We found nine Fur target gene clusters present across the four genomes studied, this core set potentially identifying basic regulatory processes essential for survival. Species-level differences are seen at the sub-regulatory-set level; for example, the known virulence factors YbtA and PchR were found in Y. pestis and P. aeruginosa respectively, but were absent from both E. coli and B. subtilis. Such factors, and the iron-uptake systems they regulate, are ideal candidates for wet-lab investigation to determine whether or not they are pathogen-specific. In this study, we employed a broad range of approaches to address our goals and assessed these methods using the Fur regulon as our initial case study.
We identified a set of promising feature attributes; demonstrated their success in increasing transcription factor binding site prediction specificity while retaining sensitivity, and showed the importance of binding site predictions in enhancing the reliability of regulatory interaction inferences. Most importantly, these outcomes led to the introduction of a range of visualisations and techniques, which are applicable across the entire bacterial spectrum and can be utilised in studies beyond the understanding of transcriptional regulatory networks.
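The spectrum kernel named in this abstract scores two sequences by the k-mers they share. The following is a minimal Python sketch of that general idea, not the thesis's implementation: the function names, the default k = 3 and the toy sequences are all assumptions made for illustration.

```python
from collections import Counter


def spectrum(seq: str, k: int) -> Counter:
    """Count every contiguous k-mer in seq (empty if seq is shorter than k)."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def spectrum_kernel(a: str, b: str, k: int = 3) -> int:
    """Inner product of the k-mer count vectors of a and b."""
    sa, sb = spectrum(a, k), spectrum(b, k)
    # Counter returns 0 for k-mers that are absent from b.
    return sum(count * sb[kmer] for kmer, count in sa.items())


if __name__ == "__main__":
    # Toy sequences, chosen only to exercise the function.
    print(spectrum_kernel("ACGTACGT", "TACGTA"))
```

A kernel value computed this way can be passed to an SVM as a precomputed similarity matrix; whether the thesis did so in this form is not stated in the abstract.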
Time dependency of molecular rate estimates and systematic overestimation of recent divergence times
Abstract:
Studies of molecular evolutionary rates have yielded a wide range of rate estimates for various genes and taxa. Recent studies based on population-level and pedigree data have produced remarkably high estimates of mutation rate, which strongly contrast with substitution rates inferred in phylogenetic (species-level) studies. Using Bayesian analysis with a relaxed-clock model, we estimated rates for three groups of mitochondrial data: avian protein-coding genes, primate protein-coding genes, and primate d-loop sequences. In all three cases, we found a measurable transition between the high, short-term (<1–2 Myr) mutation rate and the low, long-term substitution rate. The relationship between the age of the calibration and the rate of change can be described by a vertically translated exponential decay curve, which may be used for correcting molecular date estimates. The phylogenetic substitution rates in mitochondria are approximately 0.5% per million years for avian protein-coding sequences and 1.5% per million years for primate protein-coding and d-loop sequences. Further analyses showed that purifying selection offers the most convincing explanation for the observed relationship between the estimated rate and the depth of the calibration. We rule out the possibility that it is a spurious result arising from sequence errors, and find it unlikely that the apparent decline in rates over time is caused by mutational saturation. Using a rate curve estimated from the d-loop data, several dates for last common ancestors were calculated: modern humans and Neandertals (354 ka; 222–705 ka), Neandertals (108 ka; 70–156 ka), and modern humans (76 ka; 47–110 ka). If the rate curve for a particular taxonomic group can be accurately estimated, it can be a useful tool for correcting divergence date estimates by taking the rate decay into account. 
Our results show that it is invalid to extrapolate molecular rates of change across different evolutionary timescales, which has important consequences for studies of populations, domestication, conservation genetics, and human evolution.
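The "vertically translated exponential decay curve" described in this abstract can be written as rate(t) = k + a·exp(−λt), where k is the long-term substitution rate that the curve approaches for old calibrations. The Python sketch below only illustrates that functional form; the parameter names and the numeric values in the usage lines are placeholders, not estimates from the paper.

```python
import math


def rate_at_age(age_myr: float, long_term_rate: float,
                excess_rate: float, decay: float) -> float:
    """Vertically translated exponential decay: k + a * exp(-lambda * t).

    long_term_rate (k): asymptote reached for old calibrations
    excess_rate (a):    extra rate observed at age zero
    decay (lambda):     how quickly the excess disappears, per Myr
    """
    return long_term_rate + excess_rate * math.exp(-decay * age_myr)


if __name__ == "__main__":
    # Placeholder parameters, for illustration only.
    for age in (0.0, 0.5, 1.0, 2.0, 10.0):
        print(age, round(rate_at_age(age, 0.015, 0.4, 3.0), 4))
```

With any positive `excess_rate` and `decay`, the curve starts high for recent calibrations and flattens towards `long_term_rate`, which is the qualitative pattern the abstract reports.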
Abstract:
Multivariate predictive models are widely used tools for assessment of aquatic ecosystem health and models have been successfully developed for the prediction and assessment of aquatic macroinvertebrates, diatoms, local stream habitat features and fish. We evaluated the ability of a modelling method based on the River InVertebrate Prediction and Classification System (RIVPACS) to accurately predict freshwater fish assemblage composition and assess aquatic ecosystem health in rivers and streams of south-eastern Queensland, Australia. The predictive model was developed, validated and tested in a region of comparatively high environmental variability due to the unpredictable nature of rainfall and river discharge. The model was concluded to provide sufficiently accurate and precise predictions of species composition and was sensitive enough to distinguish test sites impacted by several common types of human disturbance (particularly impacts associated with catchment land use and associated local riparian, in-stream habitat and water quality degradation). The total number of fish species available for prediction was low in comparison to similar applications of multivariate predictive models based on other indicator groups, yet the accuracy and precision of our model was comparable to outcomes from such studies. In addition, our model developed for sites sampled on one occasion and in one season only (winter), was able to accurately predict fish assemblage composition at sites sampled during other seasons and years, provided that they were not subject to unusually extreme environmental conditions (e.g. extended periods of low flow that restricted fish movement or resulted in habitat desiccation and local fish extinctions).
Abstract:
Resolving species relationships and confirming diagnostic morphological characters for insect clades that are highly plastic, and/or include morphologically cryptic species, is crucial for both academic and applied reasons. Within the true fly (Diptera) family Chironomidae, a most ubiquitous freshwater insect group, the genera Cricotopus Wulp, 1874 and Paratrichocladius Santos-Abreu, 1918 have long been taxonomically confusing. Indeed, until recently the Australian fauna had been examined in just two unpublished theses: most species were known by informal manuscript names only, with no concept of relationships. Understanding species limits, and the associated ecology and evolution, is essential to address taxonomic sufficiency in biomonitoring surveys. Immature stages are collected routinely, but tolerance is generalized at the genus level, despite marked variation among species. Here, we explored this issue using a multilocus molecular phylogenetic approach, including the standard mitochondrial barcode region, and tested explicitly for phylogenetic signal in ecological tolerance of species. Additionally, we addressed biogeographical patterns by conducting Bayesian divergence time estimation. We sampled all but one of the now recognized Australian Cricotopus species and tested monophyly using representatives from other austral and Asian locations. Cricotopus is revealed as paraphyletic by the inclusion of a nested monophyletic Paratrichocladius, with in-group diversification beginning in the Eocene. Previous morphological species concepts are largely corroborated, but some additional cryptic diversity is revealed. No significant relationship was observed between the phylogenetic position of a species and its ecology, implying either that tolerance to deleterious environmental impacts is a convergent trait among many Cricotopus species or that sensitive and restricted taxa have diversified into more narrow niches from a widely tolerant ancestor.
Abstract:
Hitherto, the Malaconothridae contained Malaconothrus Berlese, 1904 and Trimalaconothrus Berlese, 1916, defined by the possession of one pre-tarsal claw (monodactyly) or by three claws (tridactyly) respectively. However, monodactyly is a convergent apomorphy within the Oribatida and an unreliable character for a classification. Therefore we undertook a phylogenetic analysis of 102 species as the basis for a taxonomic review of the Malaconothridae. We identified two major clades, equivalent to the genera Tyrphonothrus Knülle, 1957 and Malaconothrus. These genera are redefined. Trimalaconothrus becomes the junior subjective synonym of Malaconothrus. Some 42 species of Trimalaconothrus are recombined to Malaconothrus and 15 species to Tyrphonothrus. Homonyms created by the recombinations are rectified. The replacement name M. hammerae nom. nov. is proposed for M. angulatus Hammer, 1958, the junior homonym of M. angulatus (Willmann, 1931) and the replacement name M. luxtoni nom. nov. is proposed for M. scutatus Luxton, 1987, the junior homonym of M. scutatus Mihelčič, 1959. Trimalaconothrus iteratus Subías, 2004 is an unnecessary replacement name and is a junior objective synonym of Malaconothrus longirostrum (Hammer 1966). Malaconothrus praeoccupatus Subías, 2004 is a junior objective synonym of M. machadoi Balogh & Mahunka, 1969. Malaconothrus obsessus (Subías, 2004), an unnecessary replacement name for Trimalaconothrus albulus Hammer 1966 sensu Tseng 1982, becomes an available name for what is in fact a previously-undescribed species of Malaconothrus. We describe four new species of Tyrphonothrus: T. gnammaensis sp. nov. from Western Australia, T. gringai sp. nov. and T. maritimus sp. nov. from New South Wales, and T. taylori sp. nov. from Queensland. We describe six new species of Malaconothrus: M. beecroftensis sp. nov., M. darwini sp. nov., M. gundungurra sp. nov. and M. knuellei sp. nov. from New South Wales, M. jowettae sp. nov. from Norfolk Island, and M. talaitae sp. nov. from Victoria.
Abstract:
Bactrocera dorsalis sensu stricto, B. papayae, B. philippinensis and B. carambolae are serious pest fruit fly species of the B. dorsalis complex that predominantly occur in south-east Asia and the Pacific. Identifying molecular diagnostics has proven problematic for these four taxa, a situation that confounds biosecurity and quarantine efforts and which may be the result of at least some of these taxa representing the same biological species. We therefore conducted a phylogenetic study of these four species (and closely related outgroup taxa) based on individuals collected from a wide geographic range, sequencing six loci (cox1, nad4-3′, CAD, period, ITS1, ITS2) for approximately 20 individuals from each of 16 sample sites. Data were analysed within maximum likelihood and Bayesian phylogenetic frameworks for individual loci and concatenated data sets, to which we applied multiple monophyly and species delimitation tests. Species monophyly was measured by clade support (posterior probability or bootstrap resampling for Bayesian and likelihood analyses respectively), Rosenberg's reciprocal monophyly measure, P(AB), Rodrigo's P(RD) and the genealogical sorting index, gsi. We specifically tested whether there was phylogenetic support for the four 'ingroup' pest species using a data set of multiple individuals sampled from a number of populations. Based on our combined data set, Bactrocera carambolae emerges as a distinct monophyletic clade, whereas B. dorsalis s.s., B. papayae and B. philippinensis are unresolved. These data add to the growing body of evidence that B. dorsalis s.s., B. papayae and B. philippinensis are the same biological species, which poses consequences for quarantine, trade and pest management.
Abstract:
Perez-Losada et al. [1] analyzed 72 complete genomes corresponding to nine mammalian (67 strains) and two avian (5 strains) polyomavirus species using maximum likelihood and Bayesian methods of phylogenetic inference. Because some data for two of the genomes in their work are no longer available in GenBank, in this work we analyze the phylogenetic relationship of the remaining 70 complete genomes, corresponding to nine mammalian (65 strains) and two avian (5 strains) polyomavirus species, using a dynamical language model approach developed by our group (Yu et al., [26]). This distance method does not require sequence alignment for deriving species phylogeny based on overall similarities of the complete genomes. Our best tree separates the bird polyomaviruses (avian polyomaviruses and goose hemorrhagic polyomaviruses) from the mammalian polyomaviruses, which supports the idea of splitting the genus into two subgenera. Such a split is consistent with the different viral life strategies of each group. In the mammalian polyomavirus subgenus, mouse polyomaviruses (MPV), simian viruses 40 (SV40), BK viruses (BKV) and JC viruses (JCV) are grouped as different branches, as expected. The topology of our best tree is quite similar to that of the tree constructed by Perez-Losada et al.
Abstract:
Particulate pollution has been widely recognised as an important risk factor to human health. In addition to increases in respiratory and cardiovascular morbidity associated with exposure to particulate matter (PM), WHO estimates that urban PM causes 0.8 million premature deaths globally and that 1.5 million people die prematurely from exposure to indoor smoke generated from the combustion of solid fuels. Despite the availability of a huge body of research, the underlying toxicological mechanisms by which particles induce adverse health effects are not yet entirely understood. Oxidative stress caused by generation of free radicals and related reactive oxygen species (ROS) at the sites of deposition has been proposed as a mechanism for many of the adverse health outcomes associated with exposure to PM. In addition to particle-induced generation of ROS in lung tissue cells, several recent studies have shown that particles may also contain ROS. As such, they present a direct cause of oxidative stress and related adverse health effects. Cellular responses to oxidative stress have been widely investigated using various cell exposure assays. However, for a rapid screening of the oxidative potential of PM, less time-consuming and less expensive, cell-free assays are needed. The main aim of this research project was to investigate the application of a novel profluorescent nitroxide probe, synthesised at QUT, as a rapid screening assay in assessing the oxidative potential of PM. Considering that this was the first time that a profluorescent nitroxide probe was applied in investigating the oxidative stress potential of PM, the proof of concept regarding the detection of PM–derived ROS by using such probes needed to be demonstrated and a sampling methodology needed to be developed. 
Sampling through an impinger containing profluorescent nitroxide solution was chosen as a means of particle collection as it allowed particles to react with the profluorescent nitroxide probe during sampling, avoiding in that way any possible chemical changes resulting from delays between the sampling and the analysis of the PM. Among several profluorescent nitroxide probes available at QUT, bis(phenylethynyl)anthracene-nitroxide (BPEAnit) was found to be the most suitable probe, mainly due to relatively long excitation and emission wavelengths (λex= 430 nm; λem= 485 and 513 nm). These wavelengths are long enough to avoid overlap with the background fluorescence coming from light absorbing compounds which may be present in PM (e.g. polycyclic aromatic hydrocarbons and their derivatives). Given that combustion, in general, is one of the major sources of ambient PM, this project aimed at getting an insight into the oxidative stress potential of combustion-generated PM, namely cigarette smoke, diesel exhaust and wood smoke PM. During the course of this research project, it was demonstrated that the BPEAnit probe based assay is sufficiently sensitive and robust enough to be applied as a rapid screening test for PM-derived ROS detection. Considering that for all three aerosol sources (i.e. cigarette smoke, diesel exhaust and wood smoke) the same assay was applied, the results presented in this thesis allow direct comparison of the oxidative potential measured for all three sources of PM. In summary, it was found that there was a substantial difference between the amounts of ROS per unit of PM mass (ROS concentration) for particles emitted by different combustion sources. For example, particles from cigarette smoke were found to have up to 80 times less ROS per unit of mass than particles produced during logwood combustion. 
For both diesel and wood combustion it has been demonstrated that the type of fuel significantly affects the oxidative potential of the particles emitted. Similarly, the operating conditions of the combustion source were also found to affect the oxidative potential of particulate emissions. Moreover, this project has demonstrated a strong link between semivolatile (i.e. organic) species and ROS and therefore, clearly highlights the importance of semivolatile species in particle-induced toxicity.
Abstract:
Background: Phylogeographic reconstruction of some bacterial populations is hindered by low diversity coupled with high levels of lateral gene transfer. A comparison of recombination levels and diversity at seven housekeeping genes for eleven bacterial species, most of which are commonly cited as having high levels of lateral gene transfer, shows that the relative contribution of homologous recombination versus mutation for Burkholderia pseudomallei is over two times higher than for Streptococcus pneumoniae and is thus the highest value yet reported in bacteria. Despite the potential for homologous recombination to increase diversity, B. pseudomallei exhibits a relative lack of diversity at these loci. In these situations, whole genome genotyping of orthologous shared single nucleotide polymorphism loci, discovered using next generation sequencing technologies, can provide very large data sets capable of estimating core phylogenetic relationships. We compared and searched 43 whole genome sequences of B. pseudomallei and its closest relatives for single nucleotide polymorphisms in orthologous shared regions to use in phylogenetic reconstruction. Results: Bayesian phylogenetic analyses of >14,000 single nucleotide polymorphisms yielded completely resolved trees for these 43 strains with high levels of statistical support. These results enable a better understanding of a separate analysis of population differentiation among >1,700 B. pseudomallei isolates as defined by sequence data from seven housekeeping genes. We analyzed this larger data set for population structure and allele sharing that can be attributed to lateral gene transfer. Our results suggest that, despite an almost panmictic population, we can detect two distinct populations of B. pseudomallei that conform to a biogeographic pattern found in many plant and animal species: separation along Wallace's Line, a biogeographic boundary between Southeast Asia and Australia.
Conclusion: We describe an Australian origin for B. pseudomallei, characterized by a single introduction event into Southeast Asia during a recent glacial period, and variable levels of lateral gene transfer within populations. These patterns provide insights into mechanisms of genetic diversification in B. pseudomallei and its closest relatives, and provide a framework for integrating the traditionally separate fields of population genetics and phylogenetics for other bacterial species with high levels of lateral gene transfer.
Abstract:
We took a comparative approach utilizing clines to investigate the extent to which natural selection may have shaped population divergence in cuticular hydrocarbons (CHCs) that are also under sexual selection in Drosophila. We detected the presence of CHC clines along a latitudinal gradient on the east coast of Australia in two fly species with independent phylogenetic and population histories, suggesting adaptation to shared abiotic factors. For both species, significant associations were detected between clinal variation in CHCs and temperature variation along the gradient, suggesting temperature maxima as a candidate abiotic factor shaping CHC variation among populations. However, rainfall and humidity correlated with CHC variation to differing extents in the two species, suggesting that response to these abiotic factors may vary in a species-specific manner. Our results suggest that natural selection, in addition to sexual selection, plays a significant role in structuring among-population variation in sexually selected traits in Drosophila.
Abstract:
Psittacine beak and feather disease (PBFD) has a broad host range and is widespread in wild and captive psittacine populations in Asia, Africa, the Americas, Europe and Australasia. Beak and feather disease circovirus (BFDV) is the causative agent. BFDV has an ~2 kb single-stranded circular DNA genome encoding just two proteins (Rep and CP). In this study we provide support for demarcation of BFDV strains by phylogenetic analysis of 65 complete genomes from databases and 22 new BFDV sequences isolated from infected psittacines in South Africa. We propose 94% genome-wide sequence identity as a strain demarcation threshold, with isolates sharing >94% identity belonging to the same strain, and strain subtypes sharing >98% identity. Currently, BFDV diversity falls within 14 strains, with five highly divergent isolates from budgerigars probably representing a new species of circovirus with three strains (budgerigar circovirus; BCV-A, -B and -C). The geographical distribution of BFDV and BCV strains is strongly linked to the international trade in exotic birds; strains with more than one host are generally located in the same geographical area. Lastly, we examined BFDV and BCV sequences for evidence of recombination, and determined that recombination had occurred in most BFDV and BCV strains. We established that there were two globally significant recombination hotspots in the viral genome: the first is along the entire intergenic region and the second is in the C-terminal portion of the CP ORF. The implications of our results for the taxonomy and classification of circoviruses are discussed. © 2011 SGM.
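The demarcation rule proposed in this abstract (>94% identity for the same strain, >98% for the same subtype) can be expressed as a simple threshold check. The Python sketch below is illustrative only: the function names are hypothetical, and the identity calculation assumes two pre-aligned, equal-length sequences, which may differ from how the authors computed genome-wide identity.

```python
def percent_identity(a: str, b: str) -> float:
    """Percentage of matching positions in two pre-aligned sequences.

    Assumption: the sequences are already aligned and of equal length.
    """
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def classify_pair(identity_pct: float) -> str:
    """Apply the thresholds quoted in the abstract (strictly greater than)."""
    if identity_pct > 98.0:
        return "same subtype"
    if identity_pct > 94.0:
        return "same strain"
    return "different strains"


if __name__ == "__main__":
    # Toy values, not data from the study.
    for pct in (99.1, 96.0, 91.5):
        print(pct, classify_pair(pct))
```

The comparisons are strict because the abstract states "sharing >94%" and ">98%"; a pair at exactly 94% therefore falls on the "different strains" side in this sketch.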
Abstract:
Carrion-breeding Sarcophagidae (Diptera) can be used to estimate the post-mortem interval (PMI) in forensic cases. Difficulties with accurate morphological identification at any life stage and a lack of documented thermobiological profiles have limited the current usefulness of these flies. The molecular-based approach of DNA barcoding, which utilises a 648-bp fragment of the mitochondrial cytochrome oxidase subunit I gene, was previously evaluated in a pilot study for the discrimination between 16 Australian sarcophagids. The current study comprehensively evaluated DNA barcoding on a larger taxon set of 588 adult Australian sarcophagids. A total of 39 of the 84 known Australian species were represented by 580 specimens, which include 92% of potentially forensically important species. A further eight specimens could not be reliably identified, but were included as six unidentifiable taxa. A neighbour-joining phylogenetic tree was generated and nucleotide sequence divergences were calculated using the Kimura two-parameter distance model. All species except Sarcophaga (Fergusonimyia) bancroftorum, known for high morphological variability, were resolved as reciprocally monophyletic (99.2% of cases), with most having bootstrap support of 100. Excluding S. bancroftorum, the mean intraspecific and interspecific variation ranged from 0.00-1.12% and 2.81-11.23%, respectively, allowing for species discrimination. DNA barcoding was therefore validated as a suitable method for the molecular identification of the Australian Sarcophagidae, which will aid in the implementation of this fauna in forensic entomology.
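The Kimura two-parameter distance used for the divergence estimates in this abstract is d = −½ ln[(1 − 2P − Q)·√(1 − 2Q)], where P and Q are the proportions of sites that differ by a transition and by a transversion. The Python sketch below implements that standard formula for two pre-aligned sequences; the function name and the choice to skip gaps and ambiguous bases are assumptions made for illustration, not details taken from the study.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(a: str, b: str) -> float:
    """Kimura two-parameter distance between two pre-aligned DNA sequences.

    Sites where either sequence has a gap or ambiguous base are skipped
    (an assumption of this sketch). Raises ValueError for saturated pairs,
    where the logarithm is undefined.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    transitions = transversions = compared = 0
    for x, y in zip(a.upper(), b.upper()):
        if x not in "ACGT" or y not in "ACGT":
            continue
        compared += 1
        if x == y:
            continue
        if {x, y} <= PURINES or {x, y} <= PYRIMIDINES:
            transitions += 1      # A<->G or C<->T
        else:
            transversions += 1    # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


if __name__ == "__main__":
    # Toy 10-bp sequences differing by one transition.
    print(k2p_distance("AAAAAAAAAA", "GAAAAAAAAA"))
```

Multiplying the returned value by 100 gives a percentage comparable in form to the intraspecific and interspecific ranges quoted above.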
Abstract:
Understanding the evolutionary history and phylogenetic relationships between rare and common species is necessary for the effective management of rare species. The genus Cherax, a group of freshwater crayfish species, is of interest in this regard as a number of species are rare or have restricted distributions while other species are common and widespread. Here we describe the characterisation of three novel nuclear genes of the haemocyanin superfamily for phylogenetic reconstruction of the genus. All novel markers developed in this study amplified consistently in species from three divergent clades of the genus Cherax. The level of polymorphism found in these markers was consistently higher than that found in other nuclear genes previously used in invertebrate systematics, such as NaK ATP-ase. In combination, these markers will be useful to delineate phylogenetic relationships between rare and common Cherax species.