2 resultados para Non-autonomous graphs

em AMS Tesi di Dottorato - Alm@DL - Università di Bologna


Relevância:

40.00% 40.00%

Publicador:

Resumo:

Eukaryotic ribosomal DNA constitutes a multi gene family organized in a cluster called nucleolar organizer region (NOR); this region is composed usually by hundreds to thousands of tandemly repeated units. Ribosomal genes, being repeated sequences, evolve following the typical pattern of concerted evolution. The autonomous retroelement R2 inserts in the ribosomal gene 28S, leading to defective 28S rDNA genes. R2 element, being a retrotransposon, performs its activity in the genome multiplying its copy number through a “copy and paste” mechanism called target primed reverse transcription. It consists in the retrotranscription of the element’s mRNA into DNA, then the DNA is integrated in the target site. Since the retrotranscription can be interrupted, but the integration will be carried out anyway, truncated copies of the element will also be present in the genome. The study of these truncated variants is a tool to examine the activity of the element. R2 phylogeny appears, in general, not consistent with that of its hosts, except some cases (e.g. Drosophila spp. and Reticulitermes spp.); moreover R2 is absent in some species (Fugu rubripes, human, mouse, etc.), while other species have more R2 lineages in their genome (the turtle Mauremys reevesii, the Japanese beetle Popilia japonica, etc). R2 elements here presented are isolated in 4 species of notostracan branchiopods and in two species of stick insects, whose reproductive strategies range from strict gonochorism to unisexuality. From sequencing data emerges that in Triops cancriformis (Spanish gonochoric population), in Lepidurus arcticus (two putatively unisexual populations from Iceland) and in Bacillus rossius (gonochoric population from Capalbio) the R2 elements are complete and encode functional proteins, reflecting the general features of this family of transposable elements. On the other hand, R2 from Italian and Austrian populations of T. cancriformis (respectively unisexual and hermaphroditic), Lepidurus lubbocki (two elements within the same Italian population, gonochoric but with unfunctional males) and Bacillus grandii grandii (gonochoric population from Ponte Manghisi) have sequences that encode incomplete or non-functional proteins in which it is possible to recognize only part of the characteristic domains. In Lepidurus couesii (Italian gonochoric populations) different elements were found as in L. lubbocki, and the sequencing is still in progress. Two hypothesis are given to explain the inconsistency of R2/host phylogeny: vertical inheritance of the element followed by extinction/diversification or horizontal transmission. My data support previous study that state the vertical transmission as the most likely explanation; nevertheless horizontal transfer events can’t be excluded. I also studied the element’s activity in Spanish populations of T. cancriformis, in L. lubbocki, in L. arcticus and in gonochoric and parthenogenetic populations of B. rossius. In gonochoric populations of T. cancriformis and B. rossius I found that each individual has its own private set of truncated variants. The situation is the opposite for the remaining hermaphroditic/parthenogenetic species and populations, all individuals sharing – in the so far analyzed samples - the majority of variants. This situation is very interesting, because it isn’t concordant with the Muller’s ratchet theory that hypothesizes the parthenogenetic populations being either devoided of transposable elements or TEs overloaded. My data suggest a possible epigenetic mechanism that can block the retrotransposon activity, and in this way deleterious mutations don’t accumulate.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

In many application domains data can be naturally represented as graphs. When the application of analytical solutions for a given problem is unfeasible, machine learning techniques could be a viable way to solve the problem. Classical machine learning techniques are defined for data represented in a vectorial form. Recently some of them have been extended to deal directly with structured data. Among those techniques, kernel methods have shown promising results both from the computational complexity and the predictive performance point of view. Kernel methods allow to avoid an explicit mapping in a vectorial form relying on kernel functions, which informally are functions calculating a similarity measure between two entities. However, the definition of good kernels for graphs is a challenging problem because of the difficulty to find a good tradeoff between computational complexity and expressiveness. Another problem we face is learning on data streams, where a potentially unbounded sequence of data is generated by some sources. There are three main contributions in this thesis. The first contribution is the definition of a new family of kernels for graphs based on Directed Acyclic Graphs (DAGs). We analyzed two kernels from this family, achieving state-of-the-art results from both the computational and the classification point of view on real-world datasets. The second contribution consists in making the application of learning algorithms for streams of graphs feasible. Moreover,we defined a principled way for the memory management. The third contribution is the application of machine learning techniques for structured data to non-coding RNA function prediction. In this setting, the secondary structure is thought to carry relevant information. However, existing methods considering the secondary structure have prohibitively high computational complexity. We propose to apply kernel methods on this domain, obtaining state-of-the-art results.