57 results for CHECKING SEQUENCES
Abstract:
Alignment-free methods, in which shared properties of sub-sequences (e.g. identity or match length) are extracted and used to compute a distance matrix, have recently been explored for phylogenetic inference. However, the scalability and robustness of these methods to key evolutionary processes remain to be investigated. Here, using simulated sequence sets of various sizes in both nucleotides and amino acids, we systematically assess the accuracy of phylogenetic inference using an alignment-free approach, based on D2 statistics, under different evolutionary scenarios. We find that compared to a multiple sequence alignment approach, D2 methods are more robust against among-site rate heterogeneity, compositional biases, genetic rearrangements and insertions/deletions, but are more sensitive to recent sequence divergence and sequence truncation. Across diverse empirical datasets, the alignment-free methods perform well for sequences sharing low divergence, at greater computation speed. Our findings provide strong evidence for the scalability and the potential use of alignment-free methods in large-scale phylogenomics.
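As an illustration of the kind of statistic the abstract above refers to, the basic D2 score counts the k-mers (words of length k) that two sequences share: it is the sum, over all words, of the product of that word's counts in each sequence. The following is a minimal sketch only. The function names and the example sequences are hypothetical, and the study's actual D2 variants, normalisation, and conversion to a distance matrix are not reproduced here.

```python
from collections import Counter


def kmer_counts(seq: str, k: int) -> Counter:
    """Count every overlapping k-mer (word of length k) in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def d2(seq_a: str, seq_b: str, k: int) -> int:
    """Basic D2 statistic: sum over all k-mers of count_a * count_b.

    Words absent from either sequence contribute zero, so only shared
    words matter. No alignment of the two sequences is required.
    """
    counts_a = kmer_counts(seq_a, k)
    counts_b = kmer_counts(seq_b, k)
    return sum(n * counts_b[word] for word, n in counts_a.items())


if __name__ == "__main__":
    # Hypothetical toy sequences, for illustration only.
    print(d2("ACGTACGT", "ACGTTTGT", k=3))
```

A higher score indicates more shared words; turning such scores into pairwise distances for tree building requires a further transformation that this sketch leaves out.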
Abstract:
Over the last few years, investigations of human epigenetic profiles have identified histone modifications, stable and heritable DNA methylation, and chromatin remodeling as key elements of change. These factors determine gene expression levels and characterise conditions leading to disease. Data mining and pattern recognition tools are widely used to extract information embedded in long DNA sequences, but efforts to date have been limited with respect to analyzing epigenetic changes and their role as catalysts in disease onset. Useful insight, however, can be gained by investigating the associated dinucleotide distributions. The focus of this paper is to explore specific dinucleotide frequencies across defined regions within the human genome, and to identify new patterns between epigenetic mechanisms and DNA content. Signal processing methods, including Fourier and wavelet transformations, are employed and principal results are reported.
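To make the notion of a dinucleotide distribution concrete, the sketch below counts overlapping dinucleotides in a sequence and then tracks one dinucleotide's frequency across consecutive windows, producing the kind of numeric signal that a Fourier or wavelet transform could later be applied to. The function names, the window scheme, and the decision to skip non-ACGT characters are assumptions made for illustration; the paper's actual regions and processing pipeline are not described in the abstract.

```python
from collections import Counter
from typing import Dict, List


def dinucleotide_frequencies(seq: str) -> Dict[str, float]:
    """Relative frequency of each overlapping dinucleotide in seq.

    Pairs containing anything other than A, C, G or T (e.g. N) are
    skipped; this is an assumption of the sketch.
    """
    seq = seq.upper()
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    pairs = [p for p in pairs if set(p) <= set("ACGT")]
    if not pairs:
        return {}
    counts = Counter(pairs)
    return {pair: n / len(pairs) for pair, n in counts.items()}


def windowed_signal(seq: str, dinucleotide: str, window: int) -> List[float]:
    """Frequency of one dinucleotide in consecutive non-overlapping windows."""
    signal = []
    for start in range(0, len(seq) - window + 1, window):
        freqs = dinucleotide_frequencies(seq[start:start + window])
        signal.append(freqs.get(dinucleotide, 0.0))
    return signal


if __name__ == "__main__":
    # Hypothetical toy sequence, for illustration only.
    print(dinucleotide_frequencies("ACGCG"))
    print(windowed_signal("CGCGAAAA", "CG", window=4))
```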
Abstract:
We report here the genome sequences of two alphabaculoviruses of Helicoverpa spp. from Australia: AC53, used in the biopesticides ViVUS and ViVUS Max, and H25EA1, used in in vitro production studies.
Abstract:
Debates on gene patents have necessitated the analysis of patents that disclose and reference human sequences. In this study, we built an automated classifier that assigns sequences to one of nine predefined categories according to their functional roles in patent claims by applying natural language processing and supervised learning techniques. To improve its correctness, we experimented with various feature mappings, reaching a maximum accuracy of 79%.
Abstract:
MicroRNAs (miRNAs) are small non-coding RNAs of 20 nt in length that are capable of modulating gene expression post-transcriptionally. Although miRNAs have been implicated in cancer, including breast cancer, the regulation of miRNA transcription and the role of defects in this process in cancer are not well understood. In this study we have mapped the promoters of 93 breast cancer-associated miRNAs, and then looked for associations between DNA methylation of 15 of these promoters and miRNA expression in breast cancer cells. The miRNA promoters with the clearest association between DNA methylation and expression included a previously described and a novel promoter of the Hsa-mir-200b cluster. The novel promoter of the Hsa-mir-200b cluster, denoted P2, is located 2 kb upstream of the 5′ stemloop and maps within a CpG island. P2 has comparable promoter activity to the previously reported promoter (P1), and is able to drive the expression of miR-200b in its endogenous genomic context. DNA methylation of both P1 and P2 was inversely associated with miR-200b expression in eight out of nine breast cancer cell lines, and in vitro methylation of both promoters repressed their activity in reporter assays. In clinical samples, P1 and P2 were differentially methylated, with methylation inversely associated with miR-200b expression. P1 was hypermethylated in metastatic lymph nodes compared with matched primary breast tumours, whereas P2 hypermethylation was associated with loss of either oestrogen receptor or progesterone receptor. Hypomethylation of P2 was associated with gain of HER2 and androgen receptor expression. These data suggest an association between miR-200b regulation and breast cancer subtype and a potential use of DNA methylation of miRNA promoters as a component of a suite of breast cancer biomarkers.
Abstract:
Chlamydia pneumoniae is a ubiquitous intracellular pathogen, first associated with human respiratory disease and subsequently detected in a range of mammals, amphibians, and reptiles. Here we report the draft genome sequence for strain B21 of C. pneumoniae, isolated from the endangered Australian marsupial the western barred bandicoot.
Abstract:
Biological sequences are an important part of global patenting, with unique challenges for their effective and equitable use in practice and in policy. Because their function can only be determined with computer-aided technology, the form in which sequences are disclosed matters greatly. Similarly, the scope of patent rights sought and granted requires computer-readable data and tools for comparison. Critically, the primary data provided to the national patent offices, and thence to the public, must be comprehensive, standardized, timely and meaningful. This is not yet the case. The proposed global Patent Sequence (PatSeq) Data platform can enable national and regional jurisdictions to meet the desired standards.
Abstract:
This article presents a method for checking the conformance between an event log capturing the actual execution of a business process, and a model capturing its expected or normative execution. Given a business process model and an event log, the method returns a set of statements in natural language describing the behavior allowed by the process model but not observed in the log and vice versa. The method relies on a unified representation of process models and event logs based on a well-known model of concurrency, namely event structures. Specifically, the problem of conformance checking is approached by folding the input event log into an event structure, unfolding the process model into another event structure, and comparing the two event structures via an error-correcting synchronized product. Each behavioral difference detected in the synchronized product is then verbalized as a natural language statement. An empirical evaluation shows that the proposed method scales up to real-life datasets while producing more concise and higher-level difference descriptions than state-of-the-art conformance checking methods.
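The general idea of reporting model-versus-log differences as natural language statements can be illustrated with a much simpler stand-in than the event-structure technique the abstract describes. The sketch below compares only directly-follows pairs (activity b immediately after activity a) and verbalizes each mismatch. It assumes the model is given as a finite list of allowed traces, which real process models with loops or concurrency would not satisfy; all names and the example activities are hypothetical.

```python
from typing import Iterable, List, Sequence, Set, Tuple

Pair = Tuple[str, str]


def directly_follows(traces: Iterable[Sequence[str]]) -> Set[Pair]:
    """Collect every (a, b) where b immediately follows a in some trace."""
    pairs: Set[Pair] = set()
    for trace in traces:
        pairs.update(zip(trace, trace[1:]))
    return pairs


def describe_differences(
    model_traces: Iterable[Sequence[str]],
    log_traces: Iterable[Sequence[str]],
) -> List[str]:
    """Verbalize behavior present on one side only.

    Simplification: compares directly-follows pairs, not event structures.
    """
    model_pairs = directly_follows(model_traces)
    log_pairs = directly_follows(log_traces)
    statements = []
    for a, b in sorted(model_pairs - log_pairs):
        statements.append(
            f"In the model, '{b}' can directly follow '{a}', "
            "but this is never observed in the log."
        )
    for a, b in sorted(log_pairs - model_pairs):
        statements.append(
            f"In the log, '{b}' directly follows '{a}', "
            "but the model does not allow it."
        )
    return statements


if __name__ == "__main__":
    # Hypothetical model traces and event log, for illustration only.
    model = [["A", "B", "C"], ["A", "C"]]
    log = [["A", "B", "C"], ["A", "B", "B", "C"]]
    for line in describe_differences(model, log):
        print(line)
```

A pairwise comparison like this cannot detect differences in concurrency or in longer-range ordering, which is part of what the unified event-structure representation is meant to capture.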
Abstract:
In the century since the description of the orthoclad genus Paratrichocladius Santos-Abreu (Diptera: Chironomidae), separation in any life stage from the cosmopolitan, diverse Cricotopus Wulp has been problematic. Molecular analysis reveals the presence of two species in Australia that conform in morphology to Paratrichocladius and which form a well-supported clade including Paratrichocladius micans (Kieffer) from Africa and a distinct southern African larva. This clade clusters with taxa allied with Cricotopus albitibia (Walker), in turn nested within all other sampled Australian Cricotopus. Relevant nodes strongly support Cricotopus as nonmonophyletic without inclusion of Paratrichocladius. We synonymize Paratrichocladius with Cricotopus syn.n., treating Paratrichocladius as a subgenus. Cricotopus (Paratrichocladius) australiensis Cranston sp.n. is described for Trichocladius pluriserialis Freeman from Australia, which is not the same species under that name in New Zealand. Cricotopus (Paratrichocladius) bifenestrus Cranston sp.n. from Australia is described, also in all life stages. The many new combinations, listed in an Appendix, include three replacement names for new secondary homonyms, namely: Cricotopus (Paratrichocladius) sinobicinctus Cranston & Krosch nom.n. for Paratrichocladius bicinctus Fu, Sæther & Wang, Cricotopus draysoni Cranston & Krosch nom.n. for Cricotopus brevicornis Drayson, Krosch & Cranston, and Cricotopus (Paratrichocladius) sikhotealinus Makarchenko & Makarchenko nom.n. for Cricotopus orientalis Kieffer. We conclude with comments on wider issues in the taxonomy of Paratrichocladius, especially concerning New Zealand species.
Abstract:
In this research we modelled computer network devices to ensure their communication behaviours meet various network standards. By modelling devices as finite-state machines and examining their properties in a range of configurations, we discovered a flaw in a common network protocol and produced a technique to improve organisations' network security against data theft.
Abstract:
This PhD research has proposed new machine learning techniques to improve human action recognition based on local features. Several novel video representation and classification techniques have been proposed to increase performance with lower computational complexity. The major contributions are new feature representation techniques based on advanced machine learning methods such as multiple instance dictionary learning, Latent Dirichlet Allocation (LDA) and sparse coding. A binary-tree based classification technique was also proposed to deal with large numbers of action categories. These techniques not only improve classification accuracy under constrained computational resources but are also robust to challenging environmental conditions. They can be easily extended to a wide range of video applications to provide near real-time performance.
Abstract:
Wild-type baculovirus isolates typically consist of multiple strains. We report the full genome sequences of seven alphabaculovirus strains derived by passage through tissue culture from Helicoverpa armigera SNPV-AC53 (KJ909666).