971 results for Phylogenetic Relatedness


Relevance:

10.00% 10.00%

Publisher:

Abstract:

Exponential growth of genomic data in the last two decades has made manual analyses impractical for all but trial studies. As genomic analyses have become more sophisticated, and moved toward comparisons across large datasets, computational approaches have become essential. One of the most important biological questions is to understand the mechanisms underlying gene regulation. Genetic regulation is commonly investigated and modelled through the use of transcriptional regulatory network (TRN) structures. These model the regulatory interactions between two key components: transcription factors (TFs) and the target genes (TGs) they regulate. Transcriptional regulatory networks have proven to be invaluable scientific tools in Bioinformatics. When used in conjunction with comparative genomics, they have provided substantial insights into the evolution of regulatory interactions. Current approaches to regulatory network inference, however, omit two additional key entities: promoters and transcription factor binding sites (TFBSs). In this study, we attempted to explore the relationships among these regulatory components in bacteria. Our primary goal was to identify relationships that can assist in reducing the high false positive rates associated with transcription factor binding site predictions and thereupon enhance the reliability of the inferred transcription regulatory networks. In our preliminary exploration of relationships between the key regulatory components in Escherichia coli transcription, we discovered a number of potentially useful features. The combination of location score and sequence dissimilarity scores increased de novo binding site prediction accuracy by 13.6%. Another important observation concerned the relationship between transcription factors grouped by their regulatory role and corresponding promoter strength. 
Our study of E. coli σ70 promoters found support at the 0.1 significance level for our hypothesis that weak promoters are preferentially associated with activator binding sites to enhance gene expression, whilst strong promoters have more repressor binding sites to repress or inhibit gene transcription. Although the observations were specific to σ70, they nevertheless strongly encourage additional investigations when more experimentally confirmed data are available. In our preliminary exploration of relationships between the key regulatory components in E. coli transcription, we discovered a number of potentially useful features, some of which proved successful in reducing the number of false positives when applied to re-evaluate binding site predictions. Of chief interest was the relationship observed between promoter strength and TFs with respect to their regulatory role. Based on the common assumption that promoter homology positively correlates with transcription rate, we hypothesised that weak promoters would have more transcription factors that enhance gene expression, whilst strong promoters would have more repressor binding sites. The t-tests performed on E. coli σ70 promoters returned a p-value of 0.072, which at the 0.1 significance level suggested support for our (alternative) hypothesis, although this trend may only be present for promoters where the corresponding TFBSs are either all repressors or all activators. Nevertheless, such suggestive results strongly encourage additional investigations when more experimentally confirmed data become available. Much of the remainder of the thesis concerns a machine learning study of binding site prediction, using the SVM and kernel methods, principally the spectrum kernel. 
Spectrum kernels have been successfully applied in previous studies of protein classification [91, 92], as well as the related problem of promoter predictions [59], and we have here successfully applied the technique to refining TFBS predictions. The advantages provided by the SVM classifier were best seen in 'moderately' conserved transcription factor binding sites as represented by our E. coli CRP case study. Inclusion of additional position feature attributes further increased accuracy by 9.1% but more notable was the considerable decrease in false positive rate from 0.8 to 0.5 while retaining 0.9 sensitivity. Improved prediction of transcription factor binding sites is in turn extremely valuable in improving inference of regulatory relationships, a problem notoriously prone to false positive predictions. Here, the number of false regulatory interactions inferred using the conventional two-component model was substantially reduced when we integrated de novo transcription factor binding site predictions as an additional criterion for acceptance in a case study of inference in the Fur regulon. This initial work was extended to a comparative study of the iron regulatory system across 20 Yersinia strains. This work revealed interesting, strain-specific differences, especially between pathogenic and non-pathogenic strains. Such differences were made clear through interactive visualisations using the TRNDiff software developed as part of this work, and would have remained undetected using conventional methods. This approach led to the nomination of the Yfe iron-uptake system as a candidate for further wet-lab experimentation due to its potential active functionality in non-pathogens and its known participation in full virulence of the bubonic plague strain. Building on this work, we introduced novel structures we have labelled as 'regulatory trees', inspired by the phylogenetic tree concept. 
Instead of using gene or protein sequence similarity, the regulatory trees were constructed based on the number of similar regulatory interactions. While the common phylogenetic trees convey information regarding changes in gene repertoire, which we might regard as being analogous to 'hardware', the regulatory tree informs us of the changes in regulatory circuitry, in some respects analogous to 'software'. In this context, we explored the 'pan-regulatory network' for the Fur system, the entire set of regulatory interactions found for the Fur transcription factor across a group of genomes. In the pan-regulatory network, emphasis is placed on how the regulatory network for each target genome is inferred from multiple sources instead of a single source, as is the common approach. The benefit of using multiple reference networks is a more comprehensive survey of the relationships, and increased confidence in the regulatory interactions predicted. In the present study, we distinguish between relationships found across the full set of genomes as the 'core-regulatory-set', and interactions found only in a subset of genomes explored as the 'sub-regulatory-set'. We found nine Fur target gene clusters present across the four genomes studied; this core set potentially identifies basic regulatory processes essential for survival. Species-level differences are seen at the sub-regulatory-set level; for example, the known virulence factors YbtA and PchR were found in Y. pestis and P. aeruginosa respectively, but were absent from both E. coli and B. subtilis. Such factors, and the iron-uptake systems they regulate, are ideal candidates for wet-lab investigation to determine whether or not they are pathogen-specific. In this study, we employed a broad range of approaches to address our goals and assessed these methods using the Fur regulon as our initial case study. 
We identified a set of promising feature attributes; demonstrated their success in increasing transcription factor binding site prediction specificity while retaining sensitivity, and showed the importance of binding site predictions in enhancing the reliability of regulatory interaction inferences. Most importantly, these outcomes led to the introduction of a range of visualisations and techniques, which are applicable across the entire bacterial spectrum and can be utilised in studies beyond the understanding of transcriptional regulatory networks.
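
The spectrum kernel named in the abstract above can be illustrated with a minimal sketch. It scores two sequences by the inner product of their k-mer count vectors. The function names and the default k = 3 are assumptions made for illustration; the abstract does not state the thesis's actual parameters or implementation.

```python
from collections import Counter


def spectrum(seq: str, k: int) -> Counter:
    """Count every contiguous k-mer in seq (empty if seq is shorter than k)."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def spectrum_kernel(a: str, b: str, k: int = 3) -> int:
    """Inner product of the k-mer count vectors of a and b.

    k = 3 is an illustrative default, not a value taken from the source.
    """
    sa, sb = spectrum(a, k), spectrum(b, k)
    # A Counter returns 0 for k-mers it has not seen, so unshared k-mers add nothing.
    return sum(count * sb[kmer] for kmer, count in sa.items())
```

A kernel of this form could be supplied to an SVM as a precomputed Gram matrix over candidate binding-site sequences; whether the thesis did so in this way is not stated in the abstract.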

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Sequencing of mba gene fragments of reference strains of Ureaplasma urealyticum serovars 1, 3, 6, 14, in addition to 33 clinical U. urealyticum isolates is reported. A phylogenetic tree deduced from an alignment of these sequences clearly demonstrates two major clusters (confidence limit 100%), which equate to the parvo and T960 biovars, and five types which we have designated mba 1, 3, 6, 8 and X. These relationships are supported by bootstrap analysis. Polymorphisms within the mba fragment of types mba 1, 3, and 6 were used to define nine subtypes (mba 1a, 1b, 3a, 3b, 3c, 3d, 3e, 6a, and 6b) thus facilitating high resolution typing of U. urealyticum. Inclusion of the reference strains for serovars 1, 3, 6, and 8 in the mba typing scheme showed that the results of this analysis are broadly consistent with currently accepted serotyping. In addition a ure gene fragment from nine of the clinical isolates was amplified and sequenced. Comparisons of the sequences clearly distinguished the two biovars of U. urealyticum; however this fragment was invariant within the parvo biovar. This study has shown that the sequence of the mba can reveal the fine details of the relationships between U. urealyticum isolates and also supports the significant evolutionary gap between the two biovars.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

A repeated measures design, with randomly assigned intervention and control groups and multiple sources of information on each participant, was used to examine whether changing the method of delivery of a school's homework program to better meet the students' needs for autonomy, relatedness and competence would lead to more positive student attitudes to homework, and whether there would also be a positive change in overall motivation. The participants were 104 male students aged 10 to 12 years who attended a single-sex high school. There was no overall intervention effect on motivation; however, the intervention appeared to have a protective effect on the quality of motivation.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Whole-body computer control interfaces present new opportunities to engage children with games for learning. Stomp is a suite of educational games that use such a technology, allowing young children to use their whole body to interact with a digital environment projected on the floor. To maximise the effectiveness of this technology, tenets of self-determination theory (SDT) are applied to the design of Stomp experiences. By meeting user needs for competence, autonomy, and relatedness, our aim is to increase children's engagement with the Stomp learning platform. Analysis of Stomp's design suggests that these tenets are met. Observations from a case study of Stomp being used by young children show that they were highly engaged and motivated by Stomp. This analysis demonstrates that continued application of SDT to Stomp will further enhance user engagement. It is also suggested that SDT, when applied more widely to other whole-body multi-user interfaces, could instil similar positive effects.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Carrion-breeding Sarcophagidae (Diptera) can be used to estimate the post-mortem interval (PMI) in forensic cases. Difficulties with accurate morphological identifications at any life stage and a lack of documented thermobiological profiles have limited the current usefulness of these flies. The molecular-based approach of DNA barcoding, which utilises a 648-bp fragment of the mitochondrial cytochrome oxidase subunit I gene, was previously evaluated in a pilot study for the discrimination between 16 Australian sarcophagids. The current study comprehensively evaluated DNA barcoding on a larger taxon set of 588 adult Australian sarcophagids. A total of 39 of the 84 known Australian species were represented by 580 specimens, which includes 92% of potentially forensically important species. A further eight specimens could not be reliably identified, but were included as six unidentifiable taxa. A neighbour-joining phylogenetic tree was generated and nucleotide sequence divergences were calculated using the Kimura two-parameter distance model. All species except Sarcophaga (Fergusonimyia) bancroftorum, known for high morphological variability, were resolved as reciprocally monophyletic (99.2% of cases), with most having bootstrap support of 100. Excluding S. bancroftorum, the mean intraspecific and interspecific variation ranged from 0.00-1.12% and 2.81-11.23%, respectively, allowing for species discrimination. DNA barcoding was therefore validated as a suitable method for the molecular identification of the Australian Sarcophagidae, which will aid in the implementation of this fauna in forensic entomology.
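
For readers unfamiliar with the Kimura two-parameter model mentioned above, the following is a minimal sketch of the standard distance formula, d = -1/2 ln(1 - 2P - Q) - 1/4 ln(1 - 2Q), where P and Q are the proportions of transitional and transversional differences between two aligned sequences. The function name and the choice to skip gaps and ambiguous bases are assumptions for illustration; the abstract does not say which software or gap handling the study used.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1: str, seq2: str) -> float:
    """Kimura two-parameter distance between two aligned DNA sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for x, y in zip(seq1.upper(), seq2.upper()):
        if x not in "ACGT" or y not in "ACGT":
            continue  # assumption: gaps and ambiguous bases are ignored
        compared += 1
        if x == y:
            continue
        if {x, y} <= PURINES or {x, y} <= PYRIMIDINES:
            transitions += 1      # A<->G or C<->T
        else:
            transversions += 1    # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    # math.log raises ValueError if the sequences are too divergent (saturation).
    return -0.5 * math.log(1 - 2 * p - q) - 0.25 * math.log(1 - 2 * q)
```

Applied to every pair of barcode sequences, a function like this yields the distance matrix that a neighbour-joining method takes as input.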

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Approximately 2500 fly species comprise the Sarcophagidae family worldwide. The complete mitochondrial genome of the carrion-breeding, forensically important Sarcophaga impatiens Walker (Diptera: Sarcophagidae) from Australia was sequenced. The 15,169 bp circular genome contains the 37 genes found in a typical Metazoan genome: 13 protein-coding genes, 2 ribosomal RNA genes and 22 transfer RNA genes. It also contains one non-coding A+T-rich region. The arrangement of the genes was the same as that found in the ancestral insect. All the protein initiation codons are ATN, except for cox1 that begins with TCG (encoding S). The 22 tRNA anticodons of S. impatiens are consistent with those observed in Drosophila yakuba, and all form the typical cloverleaf structure, except for tRNA-Ser(AGN) that lacks the DHU arm. The mitochondrial genome of Sarcophaga presented will be valuable for resolving phylogenetic relationships within the family Sarcophagidae and the order Diptera, and could be used to identify favourable genetic markers for species identifications for forensic purposes.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

The three genera of smut fungi, Ustilago, Sporisorium and Macalpinomyces, form a complex that has eluded resolution by morphology (Langdon & Fullerton 1975, Vánky 1991, Piepenbring et al. 1998) and molecular phylogenetic analysis (Stoll et al. 2003, 2005). Two suggestions to reconcile the taxonomy of the complex have been proposed. The first was to break up the current taxa into several smaller genera and subgenera, and the second to unify the three genera into a single genus, Ustilago (Vánky 2002, Piepenbring 2004). The former solution is dependent on finding morphological synapomorphies that can delimit the genera, and the latter solution dismisses the wide morphological diversity within the group (McTaggart et al. 2012b). Synapomorphic morphological characters and host plant classification delimited clades in the Ustilago-Sporisorium-Macalpinomyces complex (McTaggart et al. 2012a). The current study defines these synapomorphic characters and proposes a new classification for many species currently placed in Ustilago, Sporisorium and Macalpinomyces. This approach preserves the well-known genera Ustilago, Sporisorium and Macalpinomyces, and enables the classification to reflect morphological diversity in the complex.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Measures of semantic similarity between medical concepts are central to a number of techniques in medical informatics, including query expansion in medical information retrieval. Previous work has mainly considered thesaurus-based path measures of semantic similarity and has not compared different corpus-driven approaches in depth. We evaluate the effectiveness of eight common corpus-driven measures in capturing semantic relatedness and compare these against human judged concept pairs assessed by medical professionals. Our results show that certain corpus-driven measures correlate strongly (approx 0.8) with human judgements. An important finding is that performance was significantly affected by the choice of corpus used in priming the measure, i.e., used as evidence from which corpus-driven similarities are drawn. This paper provides guidelines for the implementation of semantic similarity measures for medical informatics and concludes with implications for medical information retrieval.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

The marsupial genus Macropus includes three subgenera, the familiar large grazing kangaroos and wallaroos of M. (Macropus) and M. (Osphranter), as well as the smaller mixed grazing/browsing wallabies of M. (Notamacropus). A recent study of five concatenated nuclear genes recommended subsuming the predominantly browsing Wallabia bicolor (swamp wallaby) into Macropus. To further examine this proposal we sequenced partial mitochondrial genomes for kangaroos and wallabies. These sequences strongly favour the morphological placement of W. bicolor as sister to Macropus, although they place M. irma (black-gloved wallaby) within M. (Osphranter) rather than, as expected, with M. (Notamacropus). Species tree estimation from separately analysed mitochondrial and nuclear genes favours retaining Macropus and Wallabia as separate genera. A simulation study finds that incomplete lineage sorting among nuclear genes is a plausible explanation for incongruence with the mitochondrial placement of W. bicolor, while mitochondrial introgression from a wallaroo into M. irma is the deepest such event identified in marsupials. Similar coalescent simulations for interpreting gene tree conflicts will increase in both relevance and statistical power as species-level phylogenetics enters the genomic age. Ecological considerations, in turn, hint at a role for selection in accelerating the fixation of introgressed or incompletely sorted loci. More generally, the inclusion of the mitochondrial sequences substantially enhanced phylogenetic resolution. However, we caution that the evolutionary dynamics that enhance mitochondria as speciation indicators in the presence of incomplete lineage sorting may also render them especially susceptible to introgression.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Phylogenetic inference from sequences can be misled by both sampling (stochastic) error and systematic error (nonhistorical signals where reality differs from our simplified models). A recent study of eight yeast species using 106 concatenated genes from complete genomes showed that even small internal edges of a tree received 100% bootstrap support. This effective negation of stochastic error from large data sets is important, but longer sequences exacerbate the potential for biases (systematic error) to be positively misleading. Indeed, when we analyzed the same data set using minimum evolution optimality criteria, an alternative tree received 100% bootstrap support. We identified a compositional bias as responsible for this inconsistency and showed that it is reduced effectively by coding the nucleotides as purines and pyrimidines (RY-coding), reinforcing the original tree. Thus, a comprehensive exploration of potential systematic biases is still required, even though genome-scale data sets greatly reduce sampling error.
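
The RY-coding step described above amounts to a simple recoding of each nucleotide as a purine (R) or pyrimidine (Y). A minimal sketch follows; the function name is an assumption, and leaving gaps and ambiguity codes unchanged is an illustrative choice rather than something the abstract specifies.

```python
# Map purines (A, G) to R and pyrimidines (C, T, and U for RNA) to Y.
RY_TABLE = str.maketrans({"A": "R", "G": "R", "C": "Y", "T": "Y", "U": "Y"})


def ry_code(seq: str) -> str:
    """Recode a nucleotide sequence as purines (R) and pyrimidines (Y).

    Characters outside A/C/G/T/U (gaps, N, other codes) pass through unchanged.
    """
    return seq.upper().translate(RY_TABLE)
```

After recoding, only transversions remain visible as differences between sequences, which is why differences in base composition within each class no longer influence the analysis.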

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Good phylogenetic trees are required to test hypotheses about evolutionary processes. We report four new avian mitochondrial genomes, which together with an improved method of phylogenetic analysis for vertebrate mt genomes give results for three questions in avian evolution. The new mt genomes are from the magpie goose (Anseranas semipalmata); an owl (morepork, Ninox novaeseelandiae); a basal passerine (rifleman, or New Zealand wren, Acanthisitta chloris); and a parrot (kakapo or owl-parrot, Strigops habroptilus). The magpie goose provides an important new calibration point for avian evolution because the well-studied Presbyornis fossils are on the lineage to ducks and geese, after the separation of the magpie goose. We find, as with other animal mitochondrial genomes, that RY-coding is helpful in adjusting for biases between pyrimidines and between purines. When RY-coding is used at third positions of the codon, the root occurs between paleognath and neognath birds (as expected from morphological and nuclear data). In addition, passerines form a relatively old group in Neoaves, and many modern avian lineages diverged during the Cretaceous. Although many aspects of the avian tree are stable, additional taxon sampling is required.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

The generation of a correlation matrix from a large set of long gene sequences is a common requirement in many bioinformatics problems such as phylogenetic analysis. The generation is not only computationally intensive but also requires significant memory resources as, typically, few gene sequences can be simultaneously stored in primary memory. The standard practice in such computation is to use frequent input/output (I/O) operations. Therefore, minimizing the number of these operations will yield much faster run-times. This paper develops an approach for the faster and scalable computing of large-size correlation matrices through the full use of available memory and a reduced number of I/O operations. The approach is scalable in the sense that the same algorithms can be executed on different computing platforms with different amounts of memory and can be applied to different problems with different correlation matrix sizes. The significant performance improvement of the approach over the existing approaches is demonstrated through benchmark examples.
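
The general idea of trading memory for fewer I/O operations can be sketched as a blocked computation: load a block of sequences once, then reuse it for every pairwise correlation it takes part in. This is a generic illustration and not the paper's algorithm; `blocked_correlation`, `load_block`, and the use of Pearson correlation on numeric vectors are all assumptions.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of two equal-length vectors with non-zero variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def blocked_correlation(
    n_items: int,
    block_size: int,
    load_block: Callable[[int, int], List[Sequence[float]]],
) -> Dict[Tuple[int, int], float]:
    """Upper-triangular correlation matrix, computed one pair of blocks at a time.

    load_block(start, stop) is a hypothetical reader that returns the vectors for
    items start..stop-1 (for example, from disk). Each call stands for one I/O step.
    """
    corr: Dict[Tuple[int, int], float] = {}
    starts = range(0, n_items, block_size)
    for bi in starts:
        rows = load_block(bi, min(bi + block_size, n_items))
        for bj in starts:
            if bj < bi:
                continue  # the matrix is symmetric, so the lower triangle is skipped
            # The diagonal block reuses the rows already in memory.
            cols = rows if bj == bi else load_block(bj, min(bj + block_size, n_items))
            for i, x in enumerate(rows, start=bi):
                for j, y in enumerate(cols, start=bj):
                    if j >= i:
                        corr[(i, j)] = pearson(x, y)
    return corr
```

A larger `block_size` means fewer calls to `load_block` at the cost of more memory held at once, which is the trade-off the abstract describes.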

Relevance:

10.00% 10.00%

Publisher:

Abstract:

BACKGROUND: Infection by dengue virus (DENV) is a major public health concern in hundreds of tropical and subtropical countries. French Polynesia (FP) regularly experiences epidemics that initiate, or are consecutive to, DENV circulation in other South Pacific Island Countries (SPICs). In January 2009, after a decade of serotype 1 (DENV-1) circulation, the first cases of DENV-4 infection were reported in FP. Two months later a new epidemic emerged, occurring about 20 years after the previous circulation of DENV-4 in FP. In this study, we investigated the epidemiological and molecular characteristics of the introduction, spread and genetic microevolution of DENV-4 in FP. METHODOLOGY/PRINCIPAL FINDINGS: Epidemiological data suggested that recent transmission of DENV-4 in FP started in the Leeward Islands and this serotype quickly displaced DENV-1 throughout FP. Phylogenetic analyses of the nucleotide sequences of the envelope (E) gene of 64 DENV-4 strains collected in FP in the 1980s and in 2009-2010, and some additional strains from other SPICs, showed that DENV-4 strains from the SPICs were distributed into genotypes IIa and IIb. Recent FP strains were distributed into two clusters, each comprising viruses from other but distinct SPICs, suggesting that emergence of DENV-4 in FP in 2009 resulted from multiple introductions. In addition, we observed that almost all strains collected in the SPICs in the 1980s exhibit an amino acid (aa) substitution V287I within domain I of the E protein, and all recent South Pacific strains exhibit a T365I substitution within domain III. CONCLUSIONS/SIGNIFICANCE: This study confirmed the cyclic re-emergence and displacement of DENV serotypes in FP. Furthermore, our results showed that specific aa substitutions on the E protein were present on all DENV-4 strains circulating in SPICs. 
These substitutions, probably acquired and subsequently conserved, could reflect a founder effect associated with epidemiological, geographical, eco-biological and social specificities in SPICs.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Intra-host sequence data from RNA viruses have revealed the ubiquity of defective viruses in natural viral populations, sometimes at surprisingly high frequency. Although defective viruses have long been known to laboratory virologists, their relevance in clinical and epidemiological settings has not been established. The discovery of long-term transmission of a defective lineage of dengue virus type 1 (DENV-1) in Myanmar, first seen in 2001, raised important questions about the emergence of transmissible defective viruses and their role in viral epidemiology. By combining phylogenetic analyses and dynamical modelling, we investigate how evolutionary and ecological processes at the intra-host and inter-host scales shaped the emergence and spread of the defective DENV-1 lineage. We show that this lineage of defective viruses emerged between June 1998 and February 2001, and that the defective virus was transmitted primarily through co-transmission with the functional virus to uninfected individuals. We provide evidence that, surprisingly, this co-transmission route has a higher transmission potential than transmission of functional dengue viruses alone. Consequently, we predict that the defective lineage should increase overall incidence of dengue infection, which could account for the historically high dengue incidence reported in Myanmar in 2001-2002. Our results show the unappreciated potential for defective viruses to impact the epidemiology of human pathogens, possibly by modifying the virulence-transmissibility trade-off, or to emerge as circulating infections in their own right. They also demonstrate that interactions between viral variants, such as complementation, can open new pathways to viral emergence.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Between 50 and 100 million people are infected with dengue viruses each year and more than 100,000 of these die. Dr Choudhury has demonstrated that populations of dengue viruses in individual patients are genetically and functionally very diverse and that this diversity changes significantly at the time of major outbreaks of disease. The results of his studies may inform strategies which will make dengue vaccines far more effective.