957 results for Sequence type
Abstract:
The first complete genome sequence of capsicum chlorosis virus (CaCV) from Australia was determined using a combination of Illumina HiSeq RNA and Sanger sequencing technologies. Australian CaCV had a tripartite genome structure like other CaCV isolates. The large (L) RNA was 8913 nucleotides (nt) in length and contained a single open reading frame (ORF) of 8634 nt encoding a predicted RNA-dependent RNA polymerase (RdRp) in the viral-complementary (vc) sense. The medium (M) and small (S) RNA segments were 4846 and 3944 nt in length, respectively, each containing two non-overlapping ORFs in ambisense orientation, separated by intergenic regions (IGR). The M segment contained ORFs encoding the predicted non-structural movement protein (NSm; 927 nt) and precursor of glycoproteins (GP; 3366 nt) in the viral sense (v) and vc strand, respectively, separated by a 449-nt IGR. The S segment coded for the predicted nucleocapsid (N) protein (828 nt) and non-structural suppressor of silencing protein (NSs; 1320 nt) in the vc and v strand, respectively. The S RNA contained an IGR of 1663 nt, being the largest IGR of all CaCV isolates sequenced so far. Comparison of the Australian CaCV genome with complete CaCV genome sequences from other geographic regions showed highest sequence identity with a Taiwanese isolate. Genome sequence comparisons and phylogeny of all available CaCV isolates provided evidence for at least two highly diverged groups of CaCV isolates that may warrant re-classification of AIT-Thailand and CP-China isolates as unique tospoviruses, separate from CaCV.
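The segment and ORF lengths quoted above can be cross-checked with simple arithmetic. The sketch below is illustrative only: the dictionary layout and the helper name `unaccounted_nt` are assumptions, not part of the study, and the remainder it reports is simply the number of nucleotides not covered by the stated ORFs and IGR (presumably terminal non-coding sequence).

```python
# Lengths in nucleotides, copied from the abstract above.
# The data layout and helper name are hypothetical, for illustration only.
SEGMENTS = {
    "L": {"length": 8913, "orfs": {"RdRp": 8634}, "igr": 0},
    "M": {"length": 4846, "orfs": {"NSm": 927, "GP": 3366}, "igr": 449},
    "S": {"length": 3944, "orfs": {"N": 828, "NSs": 1320}, "igr": 1663},
}


def unaccounted_nt(segment: dict) -> int:
    """Nucleotides not covered by the listed ORFs or the intergenic region."""
    return segment["length"] - sum(segment["orfs"].values()) - segment["igr"]


genome_length = sum(seg["length"] for seg in SEGMENTS.values())
remainders = {name: unaccounted_nt(seg) for name, seg in SEGMENTS.items()}

print(genome_length)  # 17703
print(remainders)     # {'L': 279, 'M': 104, 'S': 133}
```

Under these figures the three segments total 17,703 nt, and the S-segment IGR alone accounts for roughly 42% of the S RNA.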
Abstract:
Turnip mosaic virus (TuMV) is a potyvirus that is transmitted by aphids and infects a wide range of plant species. We investigated the evolution of this pathogen by collecting 32 isolates of TuMV, mostly from Brassicaceae plants, in Australia and New Zealand. We performed a variety of sequence-based phylogenetic and population genetic analyses of the complete genomic sequences and of three non-recombinogenic regions of those sequences. The substitution rates, divergence times and phylogeographical patterns of the virus populations were estimated. Six inter- and seven intralineage recombination-type patterns were found in the genomes of the Australian and New Zealand isolates, and all were novel. Only one recombination-type pattern has been found in both countries. The Australian and New Zealand populations were genetically different, and were different from the European and Asian populations. Our Bayesian coalescent analyses, based on a combination of novel and published sequence data from three nonrecombinogenic protein-encoding regions, showed that TuMV probably started to migrate from Europe to Australia and New Zealand more than 80 years ago, and that distinct populations arose as a result of evolutionary drivers such as recombination. The basal-B2 subpopulation in Australia and New Zealand seems to be older than those of the world-B2 and -B3 populations. To our knowledge, our study presents the first population genetic analysis of TuMV in Australia and New Zealand. We have shown that the time of migration of TuMV correlates well with the establishment of agriculture and migration of Europeans to these countries.
Abstract:
The topic of this dissertation lies in the intersection of harmonic analysis and fractal geometry. We particularly consider singular integrals in Euclidean spaces with respect to general measures, and we study how the geometric structure of the measures affects certain analytic properties of the operators. The thesis consists of three research articles and an overview. In the first article we construct singular integral operators on lower dimensional Sierpinski gaskets associated with homogeneous Calderón-Zygmund kernels. While these operators are bounded, their principal values fail to exist almost everywhere. Conformal iterated function systems generate a broad range of fractal sets. In the second article we prove that many of these limit sets are porous in a very strong sense, by showing that they contain holes spread in every direction. We then connect these results with singular integrals: we exploit the fractal structure of these limit sets to establish that singular integrals associated with very general kernels converge weakly. Boundedness questions constitute a central topic of investigation in the theory of singular integrals. In the third article we study singular integrals of different measures. We prove a very general boundedness result in the case where the two underlying measures are separated by a Lipschitz graph. As a consequence we show that a certain weak convergence holds for a large class of singular integrals.
Abstract:
We have determined the full-length 14,491-nucleotide genome sequence of a new plant rhabdovirus, alfalfa dwarf virus (ADV). Seven open reading frames (ORFs) were identified in the antigenomic orientation of the negative-sense, single-stranded viral RNA, in the order 3′-N-P-P3-M-G-P6-L-5′. The ORFs are separated by conserved intergenic regions and the genome coding region is flanked by complementary 3′ leader and 5′ trailer sequences. Phylogenetic analysis of the nucleoprotein amino acid sequence indicated that this alfalfa-infecting rhabdovirus is related to viruses in the genus Cytorhabdovirus. When transiently expressed as GFP fusions in Nicotiana benthamiana leaves, most ADV proteins accumulated in the cell periphery, but unexpectedly P protein was localized exclusively in the nucleus. ADV P protein was shown by bimolecular fluorescence complementation to have homotypic nuclear interactions, and heterotypic nuclear interactions with N, P3 and M proteins. ADV appears unique in that it combines properties of both cytoplasmic and nuclear plant rhabdoviruses.
Abstract:
A limited number of plant rhabdovirus genomes have been fully sequenced, making taxonomic classification, evolutionary analysis and molecular characterization of this virus group difficult. We have for the first time determined the complete genome sequence of 13,188 nucleotides of Datura yellow vein nucleorhabdovirus (DYVV). DYVV genome organization resembles that of its closest relative, Sonchus yellow net virus (SYNV), with six ORFs in antigenomic orientation, separated by highly conserved intergenic regions and flanked by complementary 3′ leader and 5′ trailer sequences. As is typical for nucleorhabdoviruses, all viral proteins, except the glycoprotein, which is targeted to the endoplasmic reticulum, are localized to the nucleus. Nucleocapsid (N) protein, matrix (M) protein and polymerase, as components of nuclear viroplasms during replication, have predicted strong canonical nuclear localization signals, and N and M proteins exclusively localize to the nucleus when transiently expressed as GFP fusions. As in all nucleorhabdoviruses studied so far, N and phosphoprotein P interact when co-expressed, significantly increasing P nuclear localization in the presence of N protein. This research adds to the list of complete genomes of plant-infecting rhabdoviruses, provides molecular tools for further characterization and supports classification of DYVV as a nucleorhabdovirus closely related to but with some distinct differences from SYNV.
Abstract:
Background: Mango fruits contain a broad spectrum of phenolic compounds which impart potential health benefits; their biosynthesis is catalysed by enzymes in the phenylpropanoid-flavonoid (PF) pathway. The aim of this study was to reveal the variability in genes involved in the PF pathway in three different varieties of mango (Mangifera indica L., a member of the family Anacardiaceae), namely Kensington Pride (KP), Irwin (IW) and Nam Doc Mai (NDM), and to determine associations with gene expression and mango flavonoid profiles. Results: A close evolutionary relationship between mango genes and those from the woody species poplar of the Salicaceae family (Populus trichocarpa) and grape of the Vitaceae family (Vitis vinifera) was revealed through phylogenetic analysis of PF pathway genes. We discovered 145 SNPs in total within coding sequences, with an average frequency of one SNP every 316 bp. Variety IW had the highest SNP frequency (one SNP every 258 bp) while KP and NDM had similar frequencies (one SNP every 369 bp and 360 bp, respectively). The position in the PF pathway appeared to influence the extent of genetic diversity of the encoded enzymes. The entry point enzymes phenylalanine ammonia-lyase (PAL), cinnamate 4-mono-oxygenase (C4H) and chalcone synthase (CHS) had low levels of SNP diversity in their coding sequences, whereas anthocyanidin reductase (ANR) showed the highest SNP frequency, followed by flavonoid 3'-hydroxylase (F3'H). Quantitative PCR revealed characteristic patterns of gene expression that differed between mango peel and flesh, and between varieties. Conclusions: The combination of mango expressed sequence tags and the availability of well-established reference PF biosynthetic genes from other plant species allowed the identification of coding sequences of genes that may lead to the formation of important flavonoid compounds in mango fruits, and facilitated characterisation of single nucleotide polymorphisms between varieties.
We discovered an association between the extent of sequence variation and position in the pathway for up-stream genes. The high expression of PAL, C4H and CHS genes in mango peel compared to flesh is associated with high amounts of total phenolic contents in peels, which suggests that these genes have an influence on total flavonoid levels in mango fruit peel and flesh. In addition, the particularly high expression levels of ANR in KP and NDM peels compared to IW peel, and the significant accumulation of its product epicatechin gallate (ECG) in those extracts, reflect the rate-limiting role of ANR in ECG biosynthesis in mango. © 2015 Hoang et al.
Abstract:
Mycobacterium leprae, which has undergone reductive evolution leaving behind a minimal set of essential genes, has retained intervening sequences in four of its genes, implying a vital role for them in the survival of the leprosy bacillus. A single in-frame intervening sequence has been found embedded within its recA gene. Comparison of the M. leprae recA intervening sequence with the known intervening sequences indicated that it has the consensus amino acid sequence necessary for being a LAGLIDADG-type homing endonuclease. In light of massive gene decay and function loss in the leprosy bacillus, we sought to investigate whether its recA intervening sequence encodes a catalytically active homing endonuclease. Here we show that the purified M. leprae RecA intein (PI-MleI) binds to cognate DNA and displays endonuclease activity in the presence of alternative divalent cations, Mg2+ or Mn2+. A combination of approaches, including four complementary footprinting assays (DNase I, Cu/phenanthroline, methylation protection and KMnO4), enhancement of 2-aminopurine fluorescence and mapping of the cleavage site, revealed that PI-MleI binds to cognate DNA flanking its insertion site, induces helical distortion at the cleavage site and generates two staggered double-strand breaks. Taken together, these results imply that PI-MleI possesses a modular structure with separate domains for DNA target recognition and cleavage, each with distinct sequence preferences. From a biological standpoint, it is tempting to speculate that our findings have implications for understanding the evolution of the LAGLIDADG family of homing endonucleases.
Abstract:
The analysis of sequential data is required in many diverse areas such as telecommunications, stock market analysis, and bioinformatics. A basic problem related to the analysis of sequential data is the sequence segmentation problem. A sequence segmentation is a partition of the sequence into a number of non-overlapping segments that cover all data points, such that each segment is as homogeneous as possible. This problem can be solved optimally using a standard dynamic programming algorithm. In the first part of the thesis, we present a new approximation algorithm for the sequence segmentation problem. This algorithm has a smaller running time than the optimal dynamic programming algorithm, while having a bounded approximation ratio. The basic idea is to divide the input sequence into subsequences, solve the problem optimally in each subsequence, and then appropriately combine the solutions to the subproblems into one final solution. In the second part of the thesis, we study alternative segmentation models that are devised to better fit the data. More specifically, we focus on clustered segmentations and segmentations with rearrangements. While in the standard segmentation of a multidimensional sequence all dimensions share the same segment boundaries, in a clustered segmentation the multidimensional sequence is segmented in such a way that dimensions are allowed to form clusters. Each cluster of dimensions is then segmented separately. We formally define the problem of clustered segmentations and we experimentally show that segmenting sequences using this segmentation model leads to solutions with smaller error for the same model cost. Segmentation with rearrangements is a novel variation of the segmentation problem: in addition to partitioning the sequence we also seek to apply a limited amount of reordering, so that the overall representation error is minimized.
We formulate the problem of segmentation with rearrangements and we show that it is an NP-hard problem to solve or even to approximate. We devise effective algorithms for the proposed problem, combining ideas from dynamic programming and outlier detection algorithms in sequences. In the final part of the thesis, we discuss the problem of aggregating results of segmentation algorithms on the same set of data points. In this case, we are interested in producing a partitioning of the data that agrees as much as possible with the input partitions. We show that this problem can be solved optimally in polynomial time using dynamic programming. Furthermore, we show that not all data points are candidates for segment boundaries in the optimal solution.
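The standard dynamic programming approach to optimal segmentation mentioned in the abstract above can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis's own code: the function name `optimal_segmentation` is hypothetical, the sequence is one-dimensional, and "homogeneity" is assumed to mean the sum of squared deviations from each segment's mean. The running time is O(n²k) for n points and k segments.

```python
from typing import List, Sequence, Tuple


def optimal_segmentation(seq: Sequence[float], k: int) -> Tuple[float, List[int]]:
    """Split seq into k contiguous segments minimising total squared error.

    Returns (total error, list of segment start indices).
    Assumed error model: squared deviation from each segment's mean.
    """
    n = len(seq)
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= len(seq)")

    # Prefix sums let us compute the error of any segment in O(1).
    total = [0.0] * (n + 1)
    total_sq = [0.0] * (n + 1)
    for i, x in enumerate(seq):
        total[i + 1] = total[i] + x
        total_sq[i + 1] = total_sq[i] + x * x

    def cost(i: int, j: int) -> float:
        """Squared error of representing seq[i:j] by its mean (j > i)."""
        s = total[j] - total[i]
        return (total_sq[j] - total_sq[i]) - s * s / (j - i)

    inf = float("inf")
    # best[m][j]: minimal error of splitting seq[:j] into m segments.
    best = [[inf] * (n + 1) for _ in range(k + 1)]
    back = [[0] * (n + 1) for _ in range(k + 1)]
    best[0][0] = 0.0
    for m in range(1, k + 1):
        for j in range(m, n + 1):
            for i in range(m - 1, j):  # last segment is seq[i:j]
                candidate = best[m - 1][i] + cost(i, j)
                if candidate < best[m][j]:
                    best[m][j] = candidate
                    back[m][j] = i

    # Walk the back-pointers to recover where each segment starts.
    starts: List[int] = []
    j = n
    for m in range(k, 0, -1):
        j = back[m][j]
        starts.append(j)
    starts.reverse()
    return best[k][n], starts


error, starts = optimal_segmentation([1, 1, 1, 5, 5, 5, 9, 9], 3)
print(error, starts)
```

For the piecewise-constant example the three segments start at indices 0, 3 and 6 and the error is zero. The quadratic dependence on n is what an approximation scheme that solves subsequences separately aims to reduce.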
Abstract:
In this thesis we present and evaluate two pattern matching based methods for answer extraction in textual question answering systems. A textual question answering system is a system that seeks answers to natural language questions from unstructured text. Textual question answering is an important research problem because, as the amount of natural language text in digital format grows all the time, the need for novel methods for pinpointing important knowledge from the vast textual databases becomes more and more urgent. We concentrate on developing methods for the automatic creation of answer extraction patterns. A new type of extraction pattern is also developed. The pattern matching based approach chosen is interesting because of its language and application independence. The answer extraction methods are developed in the framework of our own question answering system. Publicly available datasets in English are used as training and evaluation data for the methods. The techniques developed are based on the well known methods of sequence alignment and hierarchical clustering. The similarity metric used is based on edit distance. The main conclusion of the research is that answer extraction patterns consisting of the most important words of the question and of the following information extracted from the answer context (plain words, part-of-speech tags, punctuation marks and capitalization patterns) can be used in the answer extraction module of a question answering system. This type of pattern and the two new methods for generating answer extraction patterns provide average results when compared to those produced by other systems using the same dataset. However, most answer extraction methods in the question answering systems tested with the same dataset are both hand crafted and based on a system-specific and fine-grained question classification. The new methods developed in this thesis require no manual creation of answer extraction patterns.
As a source of knowledge, they require a dataset of sample questions and answers, as well as a set of text documents that contain answers to most of the questions. The question classification used in the training data is a standard one and provided already in the publicly available data.
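The abstract above states that the similarity metric is based on edit distance. As an illustration of that general idea, here is a minimal sketch of the classic Levenshtein distance applied to token sequences. The unit costs for insertion, deletion and substitution are an assumption, since the abstract does not give the thesis's exact cost scheme, and the function name `edit_distance` is hypothetical.

```python
from typing import Hashable, Sequence


def edit_distance(a: Sequence[Hashable], b: Sequence[Hashable]) -> int:
    """Levenshtein distance between two sequences (unit costs assumed)."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        cur = [i]  # distance from a[:i] to the empty prefix of b
        for j, y in enumerate(b, start=1):
            cur.append(min(
                prev[j] + 1,             # delete x
                cur[j - 1] + 1,          # insert y
                prev[j - 1] + (x != y),  # substitute, or match at no cost
            ))
        prev = cur
    return prev[-1]


# Works on characters or on word tokens alike.
print(edit_distance("kitten", "sitting"))  # 3
print(edit_distance(["who", "wrote", "hamlet"],
                    ["who", "wrote", "macbeth"]))  # 1
```

Operating on tokens rather than characters would let the same routine compare sequences of words, part-of-speech tags or capitalization classes, which is the kind of information the abstract lists as pattern content.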
Abstract:
A simple method for evaluating dielectric relaxation parameters is given which can be used for analysing the relaxation times of a liquid into two absorptions.