26 results for Evolutionary clustering

in Helda - Digital Repository of University of Helsinki


Relevance:

30.00%

Abstract:

The Minimum Description Length (MDL) principle is a general, well-founded theoretical formalization of statistical modeling. The most important notion of MDL is the stochastic complexity, which can be interpreted as the shortest description length of a given sample of data relative to a model class. The exact definition of the stochastic complexity has gone through several evolutionary steps. The latest instantiation is based on the so-called Normalized Maximum Likelihood (NML) distribution, which has been shown to possess several important theoretical properties. However, applications of this modern version of MDL have been quite rare because of computational complexity problems: for discrete data, the definition of NML involves an exponential sum, and for continuous data, a multi-dimensional integral that is usually infeasible to evaluate or even approximate accurately. In this doctoral dissertation, we present mathematical techniques for computing NML efficiently for some model families involving discrete data. We also show how these techniques can be used to apply MDL in two practical applications: histogram density estimation and clustering of multi-dimensional data.
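
As a rough illustration of the quantities named in this abstract, the sketch below computes the NML normalizing sum and the stochastic complexity for the simplest discrete case, a Bernoulli model. This is a textbook special case chosen for illustration, not the dissertation's own technique; the function names are hypothetical. Grouping the binary sequences by their number of ones turns the sum over all 2^n sequences into a sum over n + 1 counts.

```python
from math import comb, log


def bernoulli_nml_normalizer(n: int) -> float:
    """Sum of maximized likelihoods over all binary sequences of length n >= 1.

    Sequences are grouped by their number of ones k, so the sum has
    n + 1 terms instead of 2**n.
    """
    total = 0.0
    for k in range(n + 1):
        p = k / n  # maximum-likelihood estimate for a sequence with k ones
        # Python evaluates 0.0 ** 0 as 1.0, matching the convention 0^0 = 1.
        total += comb(n, k) * (p ** k) * ((1 - p) ** (n - k))
    return total


def stochastic_complexity(ones: int, n: int) -> float:
    """NML code length (in nats) of a binary sequence with `ones` ones out of n."""
    p = ones / n
    max_likelihood = (p ** ones) * ((1 - p) ** (n - ones))
    return -log(max_likelihood) + log(bernoulli_nml_normalizer(n))
```

For example, `stochastic_complexity(3, 10)` gives the code length of any length-10 binary sequence containing three ones under this model class.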

Relevance:

20.00%

Abstract:

The aim of the study is to explain how paradise beliefs are born from the viewpoint of the mental functions of the human mind. The focus is on the observation that paradise beliefs across the world are mutually more similar than dissimilar. By using recent theories and results from the cognitive and evolutionary study of religion as well as from studies of environmental preferences, I suggest that this is because pan-human unconscious motivations, the architecture of mind, and the way the human mind processes information constrain the possible repertoire of paradise beliefs. The study is divided into two parts, theoretical and empirical. The arguments in the theoretical part are tested in the empirical part with two data sets. The first data set was collected using an Internet survey. The second data set was derived from literary sources. The first data set tests the assumption that intuitive conceptions of an environment of dreams generally follow the outlines set by evolved environmental preferences, but that they can be tweaked by modifying the presence of desirable elements. The second data set tests the assumption that familiarity is a dominant factor determining the content of paradise beliefs. The results of the study show that, in addition to the widely studied belief in supernatural agents, belief in supernatural environments stems from the natural functioning of the human mind, attesting to the view that religious thinking and ideas are natural for the human species and are produced by the same mental mechanisms as other cultural information. The results also help us to understand that the mental structures behind the belief in the supernatural have a wider scope than has been previously acknowledged.

Relevance:

20.00%

Abstract:

This thesis discusses the ways in which concepts and methodology developed in evolutionary biology can be applied to the explanation and study of language change. The parallel nature of the mechanisms of biological evolution and language change is explored, along with the history of the exchange of ideas between these two disciplines. Against this background, computational methods developed in evolutionary biology are considered in terms of their applicability to the study of historical relationships between languages. Different phylogenetic methods are explained in common terminology, avoiding the technical language of statistics. The thesis is on the one hand a synthesis of earlier scientific discussion, and on the other an attempt to map out the problems of earlier approaches and to find new guidelines for the study of language change on their basis. The source material consists primarily of literature about the connections between evolutionary biology and language change, along with research articles describing applications of phylogenetic methods to language change. The thesis starts out by describing the initial development of the disciplines of evolutionary biology and historical linguistics, a process which right from the beginning can be seen to have involved an exchange of ideas concerning the mechanisms of language change and biological evolution. The historical discussion lays the foundation for the treatment of the generalised account of selection developed over the past few decades. This account aims to create a theoretical framework capable of explaining both biological evolution and cultural change as selection processes acting on self-replicating entities. This thesis focusses on the capacity of the generalised account of selection to describe language change as a process of this kind. In biology, the mechanisms of evolution are seen to form populations of genetically related organisms through time.
One of the central questions explored in this thesis is whether selection theory makes it possible to picture languages as forming populations of a similar kind, and what such a perspective can offer to the understanding of language in general. In historical linguistics, the comparative method and other, complementary methods have traditionally been used to study the development of languages from a common ancestral language. Computational, quantitative methods have not become widely used as part of the central methodology of historical linguistics. After the limited popularity enjoyed by the lexicostatistical method since the 1950s faded, only in recent years have the computational methods of phylogenetic inference used in evolutionary biology also been applied to the study of early language history. In this thesis the possibilities offered by the traditional methodology of historical linguistics and the new phylogenetic methods are compared. The methods are approached through the ways in which they have been applied to the Indo-European languages, the language family most thoroughly investigated using both the traditional and the phylogenetic methods. The problems of these applications, along with the optimal form of the linguistic data used in these methods, are explored in the thesis. The mechanisms of biological evolution are seen in the thesis as parallel in a limited sense to the mechanisms of language change, but sufficiently so that the development of a generalised account of selection is deemed possibly fruitful for understanding language change. These similarities are also seen to support the validity of using phylogenetic methods in the study of language history, although the use of linguistic data and the models of language change employed by these methods are seen to await further development.

Relevance:

20.00%

Abstract:

Puumala virus (PUUV) is the causative agent of nephropathia epidemica (NE), a mild form of hemorrhagic fever with renal syndrome. Finland has the highest documented incidence of NE, with around 1000 cases diagnosed annually. PUUV is also found in other Scandinavian countries, Central Europe and the European part of Russia. PUUV belongs to the genus Hantavirus in the family Bunyaviridae. Hantaviruses are rodent-borne viruses, each carried by a specific host that is persistently and asymptomatically infected by the virus. PUUV is carried by bank voles (Myodes glareolus, previously known as Clethrionomys glareolus). Hantaviruses have co-evolved with their carrier rodents for millions of years and these host animals are the evolutionary scene of hantaviruses. In this study, PUUV sequences were recovered from bank voles captured in Denmark and Russian Karelia to study the evolution of PUUV in Scandinavia. Phylogenetic analysis of these strains showed a geographical clustering of genetic variants following the presumable migration pattern of bank voles during the recolonization of Scandinavia after the last ice age approximately 10 000 years ago. The currently known PUUV genome sequences were subjected to in-depth phylogenetic analyses and the results showed that genetic drift seems to be the major mechanism of PUUV evolution. In general, PUUV seems to evolve quite slowly, following a molecular clock. We also found evidence for recombination in the evolution of some genetic lineages of PUUV. Viral microevolution was studied in controlled virus transmission in colonized bank voles, and changes in quasispecies dynamics were recorded as the virus was transmitted from one animal to another. We witnessed PUUV evolution in vivo, as one synonymous mutation became repeatedly fixed in the viral genome during the experiment. The detailed knowledge of PUUV diversity was used to establish new sensitive and specific detection methods for this virus. 
Direct viral invasion of the hypophysis was demonstrated for the first time in a lethal case of NE. PUUV detection was done by immunohistochemistry, in situ hybridization and RT-nested-PCR of the autopsy tissue samples.

Relevance:

20.00%

Abstract:

The studies presented in this thesis contribute to the understanding of the evolutionary ecology of three major viruses threatening cultivated sweetpotato (Ipomoea batatas Lam) in East Africa: Sweet potato feathery mottle virus (SPFMV; genus Potyvirus; Potyviridae), Sweet potato chlorotic stunt virus (SPCSV; genus Crinivirus; Closteroviridae) and Sweet potato mild mottle virus (SPMMV; genus Ipomovirus; Potyviridae). The viruses were serologically detected and the positive results confirmed by RT-PCR and sequencing. SPFMV was detected in 24 wild plant species of the family Convolvulaceae (genera Ipomoea, Lepistemon and Hewittia), of which 19 species were new natural hosts for SPFMV. SPMMV and SPCSV were detected in wild plants belonging to 21 and 12 species (genera Ipomoea, Lepistemon and Hewittia), respectively, all of which were previously unknown to be natural hosts of these viruses. SPFMV was the most abundant virus, being detected in 17% of the plants, while SPMMV and SPCSV were detected in 9.8% and 5.4% of the assessed plants, respectively. Wild plants in Uganda were infected with the East African (EA), common (C), and the ordinary (O) strains, or co-infected with the EA and the C strain of SPFMV. The viruses and virus-like diseases were more frequent in the eastern agro-ecological zone than in the western and central zones, which contrasted with known incidences of these viruses in sweetpotato crops, except for the northern zone, where incidences were lowest in wild plants as in sweetpotato. The NIb/CP junction in SPMMV was determined experimentally, which facilitated CP-based phylogenetic and evolutionary analyses of SPMMV. Isolates of all three viruses from wild plants were genetically similar to those found in cultivated sweetpotatoes in East Africa. There was no evidence of host-driven population genetic structures, suggesting frequent transmission of these viruses between their wild and cultivated hosts. 
The p22 RNA silencing suppressor-encoding sequence was absent in a few SPCSV isolates, but regardless of this, SPCSV isolates incited sweet potato virus disease (SPVD) in sweetpotato plants co-infected with SPFMV, indicating that p22 is redundant for synergism between SPCSV and SPFMV. Molecular evolutionary analysis revealed that isolates of strain EA of SPFMV, which is largely restricted geographically to East Africa, experience frequent recombination in comparison to isolates of strain C, which is globally distributed. Moreover, non-homologous recombination events between strains EA and C were rare, despite frequent co-infections of these strains in wild plants, suggesting purifying selection against non-homologous recombinants between these strains or that such recombinants are mostly not infectious. Recombination was also detected in the 5'- and 3'-proximal regions of the SPMMV genome, providing the first evidence of recombination in genus Ipomovirus, but no recombination events were detected in the characterized genomic regions of SPCSV. Strong purifying selection was implicated in the evolution of the majority of amino acids of the proteins encoded by the analyzed genomic regions of SPFMV, SPMMV and SPCSV. However, positive selection was predicted on 17 amino acids distributed over the whole coat protein (CP) in the globally distributed strain C, as compared to only 4 amino acids in the multifunctional CP N-terminus (CP-NT) of strain EA, which is largely restricted geographically to East Africa. A few amino acid sites in the N-terminus of SPMMV P1, the p7 protein and RNA silencing suppressor proteins p22 and RNase3 of SPCSV were also subjected to positive selection. Positively selected amino acids may constitute ligand-binding domains that determine interactions with plant host and/or insect vector factors. 
The P1 proteinase of SPMMV (genus Ipomovirus) seems to respond to needs of adaptation, which was not observed with the helper component proteinase (HC-Pro) of SPMMV, although the HC-Pro is responsible for many important molecular interactions in genus Potyvirus. Because the centre of origin of cultivated sweetpotato is in the Americas, from where the crop was dispersed to other continents in recent history (except for the Australasia and South Pacific region), it would be expected that identical viruses and their strains occur worldwide, presuming virus dispersal with the host. Apparently, this is not the case with SPMMV, the strain EA of SPFMV and the strain EA of SPCSV, which are largely geographically confined to East Africa, where they are predominant and occur in both natural and agro-ecosystems. The geographical distribution of plant viruses is constrained more by virus-vector relations than by virus-host interactions, which, together with the wide range of natural host species and the geographical confinement to East Africa, suggests that these viruses existed in East African wild plants before the introduction of sweetpotato. Consequently, these studies provide compelling evidence that East Africa constitutes a cradle of SPFMV strain EA, SPCSV strain EA, and SPMMV. Therefore, sweet potato virus disease (SPVD) in East Africa may be one of the examples of damaging virus diseases resulting from exchange of viruses between introduced crops and indigenous wild plant species. Keywords: Convolvulaceae, East Africa, epidemiology, evolution, genetic variability, Ipomoea, recombination, SPCSV, SPFMV, SPMMV, selection pressure, sweetpotato, wild plant species Author's Address: Arthur K. Tugume, Department of Agricultural Sciences, Faculty of Agriculture and Forestry, University of Helsinki, Latokartanonkaari 7, P.O Box 27, FIN-00014, Helsinki, Finland. Email: tugume.arthur@helsinki.fi Author's Present Address: Arthur K. 
Tugume, Department of Botany, Faculty of Science, Makerere University, P.O. Box 7062, Kampala, Uganda. Email: aktugume@botany.mak.ac.ug, tugumeka@yahoo.com

Relevance:

20.00%

Abstract:

Bacteria play an important role in many ecological systems. The molecular characterization of bacteria using either cultivation-dependent or cultivation-independent methods reveals the large scale of bacterial diversity in natural communities, and the vastness of subpopulations within a species or genus. Understanding how bacterial diversity varies across different environments and also within populations should provide insights into many important questions of bacterial evolution and population dynamics. This thesis presents novel statistical methods for analyzing bacterial diversity using widely employed molecular fingerprinting techniques. The first objective of this thesis was to develop Bayesian clustering models to identify bacterial population structures. Bacterial isolates were identified using multilocus sequence typing (MLST), and Bayesian clustering models were used to explore the evolutionary relationships among isolates. Our method involves the inference of genetic population structures via an unsupervised clustering framework where the dependence between loci is represented using graphical models. The population dynamics that generate such a population stratification were investigated using a stochastic model, in which homologous recombination between subpopulations can be quantified within a gene flow network. The second part of the thesis focuses on cluster analysis of community compositional data produced by two different cultivation-independent analyses: terminal restriction fragment length polymorphism (T-RFLP) analysis, and fatty acid methyl ester (FAME) analysis. The cluster analysis aims to group bacterial communities that are similar in composition, which is an important step for understanding the overall influences of environmental and ecological perturbations on bacterial diversity. 
A common feature of T-RFLP and FAME data is zero-inflation, which indicates that the observation of a zero value is much more frequent than would be expected, for example, from a Poisson distribution in the discrete case, or a Gaussian distribution in the continuous case. We provided two strategies for modeling zero-inflation in the clustering framework, which were validated on both synthetic and empirical complex data sets. We show in the thesis that our model, which takes into account dependencies between loci in MLST data, can produce better clustering results than methods which assume independent loci. Furthermore, computer algorithms that are efficient in analyzing large-scale data were adopted to meet the increasing computational need. Our method that detects homologous recombination in subpopulations may provide a theoretical criterion for defining bacterial species. The clustering of bacterial community data, including T-RFLP and FAME data, provides an initial effort towards discovering the evolutionary dynamics that structure and maintain bacterial diversity in the natural environment.
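
To make the notion of zero-inflation in this abstract concrete, here is a minimal sketch of a zero-inflated Poisson distribution. It is a standard textbook construction used purely for illustration; the abstract does not say which two strategies the thesis adopted, and the names `zip_pmf` and `pi_zero` are assumptions made here.

```python
from math import exp, factorial


def poisson_pmf(k: int, lam: float) -> float:
    """Probability of observing count k under Poisson(lam)."""
    return exp(-lam) * lam ** k / factorial(k)


def zip_pmf(k: int, lam: float, pi_zero: float) -> float:
    """Zero-inflated Poisson probability of count k.

    With probability pi_zero the observation is a structural zero;
    otherwise it is drawn from Poisson(lam).
    """
    base = (1.0 - pi_zero) * poisson_pmf(k, lam)
    # Only the zero count receives the extra structural-zero mass.
    return pi_zero + base if k == 0 else base
```

With `pi_zero > 0`, `zip_pmf(0, lam, pi_zero)` exceeds `poisson_pmf(0, lam)`, which is the excess of zeros the abstract describes.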

Relevance:

20.00%

Abstract:

Online content services can greatly benefit from personalisation features that enable delivery of content that is suited to each user's specific interests. This thesis presents a system that applies text analysis and user modeling techniques in an online news service for the purpose of personalisation and user interest analysis. The system creates a detailed thematic profile for each content item and observes each user's actions towards content items to learn that user's preferences. A handcrafted taxonomy of concepts, or ontology, is used in profile formation to extract relevant concepts from the text. User preference learning is automatic and there is no need for explicit preference settings or ratings from the user. Learned user profiles are segmented into interest groups using clustering techniques with the objective of providing a source of information for the service provider. Some theoretical background for the chosen techniques is presented, while the main focus is on finding practical solutions to some of the current information needs, which are not optimally served by traditional techniques.

Relevance:

20.00%

Abstract:

Many species inhabit fragmented landscapes, resulting either from anthropogenic or from natural processes. The ecological and evolutionary dynamics of spatially structured populations are affected by a complex interplay between endogenous and exogenous factors. The metapopulation approach, simplifying the landscape to a discrete set of patches of breeding habitat surrounded by unsuitable matrix, has become a widely applied paradigm for the study of species inhabiting highly fragmented landscapes. In this thesis, I focus on the construction of biologically realistic models and their parameterization with empirical data, with the general objective of understanding how the interactions between individuals and their spatially structured environment affect ecological and evolutionary processes in fragmented landscapes. I study two hierarchically structured model systems: the Glanville fritillary butterfly in the Åland Islands, and a system of two interacting aphid species in the Tvärminne archipelago, both located in South-Western Finland. The interesting and challenging feature of both study systems is that the population dynamics occur over multiple spatial scales that are linked by various processes. My main emphasis is on the development of mathematical and statistical methodologies. For the Glanville fritillary case study, I first build a Bayesian framework for the estimation of death rates and capture probabilities from mark-recapture data, with the novelty of accounting for variation among individuals in capture probabilities and survival. I then characterize the dispersal phase of the butterflies by deriving a mathematical approximation of a diffusion-based movement model applied to a network of patches. I use the movement model as a building block to construct an individual-based evolutionary model for the Glanville fritillary butterfly metapopulation. 
I parameterize the evolutionary model using a pattern-oriented approach, and use it to study how the landscape structure affects the evolution of dispersal. For the aphid case study, I develop a Bayesian model of hierarchical multi-scale metapopulation dynamics, where the observed extinction and colonization rates are decomposed into intrinsic rates operating specifically at each spatial scale. In summary, I show how analytical approaches, hierarchical Bayesian methods and individual-based simulations can be used individually or in combination to tackle complex problems from many different viewpoints. In particular, hierarchical Bayesian methods provide a useful tool for decomposing ecological complexity into more tractable components.

Relevance:

20.00%

Abstract:

Oxysterol binding protein (OSBP) homologues have been found in eukaryotic organisms ranging from yeast to humans. These evolutionarily conserved proteins have in common the presence of an OSBP-related domain (ORD), which contains the fully conserved EQVSHHPP sequence motif. The ORD forms a barrel structure that binds sterols in its interior. Other domains and sequence elements found in OSBP homologues include pleckstrin homology domains, ankyrin repeats and two phenylalanines in an acidic tract (FFAT) motifs, which target the proteins to distinct subcellular compartments. OSBP homologues have been implicated in a wide range of intracellular processes, including vesicle trafficking, lipid metabolism and cell signaling, but little is known about the functional mechanisms of these proteins. The human family of OSBP homologues consists of twelve OSBP-related proteins (ORP). This thesis work is focused on one of the family members, ORP1, of which two variants were found to be expressed tissue-specifically in humans. The shorter variant, ORP1S, contains an ORD only. The N-terminally extended variant, ORP1L, comprises a pleckstrin homology domain and three ankyrin repeats in addition to the ORD. The two ORP1 variants differ in intracellular localization. ORP1S is cytosolic, while the ankyrin repeat region of ORP1L targets the protein to late endosomes/lysosomes. This part of ORP1L also has profound effects on late endosomal morphology, inducing perinuclear clustering of late endosomes. A central aim of this study was to identify molecular interactions of ORP1L on late endosomes. The morphological changes of late endosomes induced by overexpressed ORP1L imply the involvement of small Rab GTPases, regulators of organelle motility, tethering, docking and/or fusion, in the generation of the phenotype. A direct interaction was demonstrated between ORP1L and active Rab7. ORP1L prolongs the active state of Rab7 by stabilizing its GTP-bound form. 
The clustering of late endosomes/lysosomes was also shown to be linked to the minus end-directed microtubule-based dynein-dynactin motor complex through the ankyrin repeat region of ORP1L. ORP1L, Rab7 and the Rab7-interacting lysosomal protein (RILP) were found to be part of the same effector complex recruiting the dynein-dynactin complex to late endosomes, thereby promoting minus end-directed movement. The proteins were found to be physically close to each other on late endosomes and RILP was found to stabilize the ORP1L-Rab7 interaction. It is possible that ORP1L and RILP bind to each other through their C-terminal and N-terminal regions, respectively, when they are bridged by Rab7. With the results of this study we have been able to place a member of the uncharacterized OSBP-family, ORP1L, in the endocytic pathway, where it regulates motility and possibly fusion of late endosomes through interaction with the small GTPase Rab7.

Relevance:

20.00%

Abstract:

The evolutionary history of biological entities is recorded within their nucleic acid sequences and can (sometimes) be deciphered by thorough genomic analysis. In this study we sought to gain insights into the diversity and evolution of bacterial and archaeal viruses. Our primary interest was directed towards those virus groups/families for which comprehensive genomic analysis was not previously possible due to a lack of sufficient genomic data. During the course of this work, twenty-five putative proviruses integrated into various prokaryotic genomes were identified, enabling us to undertake a comparative genomics approach. This analysis allowed us to test previously formulated evolutionary hypotheses and also provided valuable information on the molecular mechanisms behind the genome evolution of the studied virus groups.

Relevance:

20.00%

Abstract:

Environmental variation is a fact of life for all the species on earth: for any population of any particular species, the local environmental conditions are liable to vary in both time and space. In today's world, anthropogenic activity is causing habitat loss and fragmentation for many species, which may profoundly alter the characteristics of environmental variation in remaining habitat. Previous research indicates that, as habitat is lost, the spatial configuration of remaining habitat will increasingly affect the dynamics by which populations are governed. Through the use of mathematical models, this thesis asks how environmental variation interacts with species properties to influence population dynamics, local adaptation, and dispersal evolution. More specifically, we couple continuous-time continuous-space stochastic population dynamic models to landscape models. We manipulate environmental variation via parameters such as mean patch size, patch density, and patch longevity. Among other findings, we show that a mixture of high and low quality habitat is commonly better for a population than uniformly mediocre habitat. This conclusion is justified by purely ecological arguments, yet the positive effects of landscape heterogeneity may be enhanced further by local adaptation, and by the evolution of short-ranged dispersal. The predicted evolutionary responses to environmental variation are complex, however, since they involve numerous conflicting factors. We discuss why the species that have high levels of local adaptation within their ranges may not be the same species that benefit from local adaptation during range expansion. We show how habitat loss can lead to either increased or decreased selection for dispersal depending on the type of habitat and the manner in which it is lost. To study the models, we develop a recent analytical method, Perturbation expansion, to enable the incorporation of environmental variation. 
Within this context, we use two methods to address evolutionary dynamics: Adaptive dynamics, which assumes mutations occur infrequently so that the ecological and evolutionary timescales can be separated, and Genotype distributions, which assume mutations are more frequent. The two approaches generally lead to similar predictions yet, exceptionally, we show how the evolutionary response of dispersal behaviour to habitat turnover may qualitatively depend on the mutation rate.

Relevance:

20.00%

Abstract:

Productivity is predicted to drive the ecological and evolutionary dynamics of predator-prey interaction through changes in resource allocation between different traits. However, resources are seldom constantly available, and thus temporal variation in productivity could have a considerable effect on a species' potential to evolve. To study this, three long-term microbial laboratory experiments were established in which Serratia marcescens prey bacteria were exposed to predation by the protist Tetrahymena thermophila in different prey resource environments. The consequences of prey resource availability for the ecological properties of the predator-prey system, such as trophic dynamics, stability, and virulence, were determined. The evolutionary changes in species traits and prey genetic diversity were measured. Prey defence evolved to be stronger in the high productivity environment. Increased allocation to defence incurred a cost in terms of reduced prey resource use ability, which probably constrained prey evolution by increasing the effect of resource competition. However, the magnitude of this trade-off diminished when measured in high resource concentrations. Predation selected for white, non-pigmented, highly defensive prey clones that produced predation-resistant biofilm. The biofilm defence was also potentially accompanied by cytotoxicity for predators and could have been traded off with high motility. Evidence for the evolution of predators was also found in one experiment, suggesting that co-evolutionary dynamics could affect the evolution and ecology of predator-prey interaction. Temporal variation in resource availability increased variation in predator densities, leading to temporally fluctuating selection for prey defences and resource use ability. Temporal variation in resource availability was also able to constrain prey evolution when the allocation to defence incurred a high cost. 
However, when the magnitude of the prey trade-off was small and the resource turnover was periodically high, temporal variation facilitated the formation of predator-resistant biofilm. The evolution of prey defence constrained the transfer of energy from basal to higher trophic levels, decreasing the strength of top-down regulation on the prey community. Predation and temporal variation in productivity decreased the stability of populations and prey traits in general. However, predation-induced destabilization was less pronounced in the high productivity environment, where the evolution of prey defence was stronger. In addition, evolution of prey defence weakened the destabilization of predator population dynamics induced by environmental variation. Moreover, protozoan predation decreased S. marcescens virulence in the insect host, a moth (Parasemia plantaginis), suggesting that species interactions outside the context of the host-pathogen relationship could be important indirect drivers for the evolution of pathogenesis. This thesis demonstrates that rapid evolution can affect various ecological properties of predator-prey interaction. The effect of evolution on the ecological dynamics depended on the productivity of the environment, being most evident in the constant environments with high productivity.