107 results for Evolutionary clustering
Abstract:
The diversification of life involved enormous increases in size and complexity. The evolutionary transitions from prokaryotes to unicellular eukaryotes to metazoans were accompanied by major innovations in metabolic design. Here we show that the scalings of metabolic rate, population growth rate, and production efficiency with body size have changed across the evolutionary transitions. Metabolic rate scales with body mass superlinearly in prokaryotes, linearly in protists, and sublinearly in metazoans, so Kleiber’s 3/4 power scaling law does not apply universally across organisms. The scaling of maximum population growth rate shifts from positive in prokaryotes to negative in protists and metazoans, and the efficiency of production declines across these groups. Major changes in metabolic processes during the early evolution of life overcame existing constraints, exploited new opportunities, and imposed new constraints. The 3.5 billion year history of life on earth was characterized by
Abstract:
Background: Seed storage proteins are a major source of dietary protein, and the content of such proteins determines both the quantity and quality of crop yield. Significantly, examination of the protein content in the seeds of crop plants shows a distinct difference between monocots and dicots. Thus, it is expected that there are different evolutionary patterns in the genes underlying protein synthesis in the seeds of these two groups of plants. Results: Gene duplication, evolutionary rate and positive selection of a major gene family of seed storage proteins (the 11S globulin genes) were compared in dicots and monocots. The results, obtained from five species in each group, show more gene duplications, a higher evolutionary rate and positive selection of this gene family in dicots, which are rich in 11S globulins, but not in the monocots. Conclusion: Our findings provide evidence to support the suggestion that gene duplication and an accelerated evolutionary rate may be associated with higher protein synthesis in dicots as compared to monocots.
Abstract:
The K-Means algorithm for cluster analysis is one of the most influential and popular data mining methods. Its straightforward parallel formulation is well suited for distributed memory systems with reliable interconnection networks. However, in large-scale geographically distributed systems the straightforward parallel algorithm can be rendered useless by a single communication failure or high latency in communication paths. This work proposes a fully decentralised algorithm (Epidemic K-Means) which does not require global communication and is intrinsically fault tolerant. The proposed distributed K-Means algorithm provides a clustering solution which can approximate the solution of an ideal centralised algorithm over the aggregated data as closely as desired. A comparative performance analysis is carried out against the state of the art distributed K-Means algorithms based on sampling methods. The experimental analysis confirms that the proposed algorithm is a practical and accurate distributed K-Means implementation for networked systems of very large and extreme scale.
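The idea of avoiding global communication can be illustrated with a gossip-style averaging sketch. This is not the paper's Epidemic K-Means implementation; it only assumes that each node holds some local points, that nodes repeatedly average their per-cluster sums and counts with a randomly chosen peer, and that the ratio of the averaged sum to the averaged count then approximates the centralised centroid. The function names (`local_stats`, `gossip_average`, `decentralised_step`), the 1-D data, and the synchronous loop are all hypothetical simplifications.

```python
import random


def local_stats(points, centroids):
    """Per-cluster sums followed by per-cluster counts for one node's 1-D points."""
    k = len(centroids)
    sums, counts = [0.0] * k, [0.0] * k
    for p in points:
        nearest = min(range(k), key=lambda c: abs(p - centroids[c]))
        sums[nearest] += p
        counts[nearest] += 1.0
    return sums + counts  # one flat vector of length 2k


def gossip_average(vectors, rounds, rng):
    """Pairwise averaging: in each round every node averages with one random peer.

    Each exchange preserves the network-wide total, so all nodes drift
    towards the global mean vector without any global reduction step.
    """
    vecs = [list(v) for v in vectors]
    n = len(vecs)
    for _ in range(rounds):
        for i in range(n):
            j = rng.randrange(n - 1)
            if j >= i:
                j += 1  # choose a peer different from i
            mean = [(a + b) / 2.0 for a, b in zip(vecs[i], vecs[j])]
            vecs[i], vecs[j] = mean, list(mean)
    return vecs


def decentralised_step(node_points, centroids, rounds=50, seed=0):
    """One K-Means update in which every node forms its own centroid estimate."""
    rng = random.Random(seed)
    k = len(centroids)
    stats = [local_stats(pts, centroids) for pts in node_points]
    averaged = gossip_average(stats, rounds, rng)
    estimates = []
    for vec in averaged:
        sums, counts = vec[:k], vec[k:]
        # Keep the old centroid if no point in the network was assigned to it.
        estimates.append(
            [s / c if c > 0 else old for s, c, old in zip(sums, counts, centroids)]
        )
    return estimates
```

For example, with four nodes holding `[[0.0, 1.0], [2.0, 9.0], [10.0, 11.0], [1.5, 10.5]]` and initial centroids `[0.0, 10.0]`, a centralised update gives centroids 1.125 and 10.125, and each node's estimate from `decentralised_step` should approach those values as the number of gossip rounds grows. A real deployment would also have to handle message loss and asynchrony, which this sketch ignores.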
Abstract:
Starch is the most widespread and abundant storage carbohydrate in crops and its production is critical to both crop yield and quality. As regards the starch content in the seeds of crop plants, there is a distinct difference between grasses (Poaceae) and dicots. However, few studies have described the evolutionary pattern of genes in the starch biosynthetic pathway in these two groups of plants. In this study, therefore, an attempt was made to compare the evolutionary rate, gene duplication and selective pattern of the key genes involved in this pathway between the two groups, using five grasses and five dicots as materials. The results showed (i) distinct differences in patterns of gene duplication and loss between grasses and dicots; duplication in grasses mainly occurred prior to the divergence of grasses, whereas duplication mostly occurred in individual species within the dicots; there is less gene loss in grasses than in dicots; (ii) a considerably higher evolutionary rate in grasses than in dicots in most gene families analyzed; (iii) evidence of a different selective pattern between grasses and dicots; positive selection may have occurred asymmetrically in grasses in some gene families, e.g. AGPase small subunit. Therefore, we deduced that gene duplication contributes to, and a higher evolutionary rate is associated with, the higher starch content in grasses. In addition, two novel aspects of the evolution of the starch biosynthetic pathway were observed.
Abstract:
Evolutionary developmental genetics brings together systematists, morphologists and developmental geneticists; it will therefore impact on each of these component disciplines. The goals and methods of phylogenetic analysis are reviewed here, and the contribution of evolutionary developmental genetics to morphological systematics, in terms of character conceptualisation and primary homology assessment, is discussed. Evolutionary developmental genetics, like its component disciplines phylogenetic systematics and comparative morphology, is concerned with homology concepts. Phylogenetic concepts of homology and their limitations are considered here, and the need for independent homology statements at different levels of biological organisation is evaluated. The role of systematics in evolutionary developmental genetics is outlined. Phylogenetic systematics and comparative morphology will suggest effective sampling strategies to developmental geneticists. Phylogenetic systematics provides hypotheses of character evolution (including parallel evolution and convergence), stimulating investigations into the evolutionary gains and losses of morphologies. Comparative morphology identifies those structures that are not easily amenable to typological categorisation, and that may be of particular interest in terms of developmental genetics. The concepts of latent homology and genetic recall may also prove useful in the evolutionary interpretation of developmental genetic data.
Abstract:
This dissertation deals with aspects of sequential data assimilation (in particular ensemble Kalman filtering) and numerical weather forecasting. In the first part, the recently formulated Ensemble Kalman-Bucy filter (EnKBF) is revisited. It is shown that the previously used numerical integration scheme fails when the magnitude of the background error covariance grows beyond that of the observational error covariance in the forecast window. Therefore, we present a suitable integration scheme that handles the stiffening of the differential equations involved and does not incur further computational expense. Moreover, a transform-based alternative to the EnKBF is developed: under this scheme, the operations are performed in the ensemble space instead of in the state space. Advantages of this formulation are explained. For the first time, the EnKBF is implemented in an atmospheric model. The second part of this work deals with ensemble clustering, a phenomenon that arises when performing data assimilation using deterministic ensemble square root filters (EnSRFs) in highly nonlinear forecast models. Namely, an M-member ensemble detaches into an outlier and a cluster of M-1 members. Previous works may suggest that this issue represents a failure of EnSRFs; this work dispels that notion. It is shown that ensemble clustering can also be reversed by nonlinear processes, in particular the alternation between nonlinear expansion and compression of the ensemble for different regions of the attractor. Some EnSRFs that use random rotations have been developed to overcome this issue; these formulations are analyzed and their advantages and disadvantages with respect to common EnSRFs are discussed. The third and last part contains the implementation of the Robert-Asselin-Williams (RAW) filter in an atmospheric model.
The RAW filter is an improvement to the widely popular Robert-Asselin filter that successfully suppresses spurious computational waves while avoiding any distortion in the mean value of the function. Using statistical significance tests both at the local and field level, it is shown that the climatology of the SPEEDY model is not modified by the changed time stepping scheme; hence, no retuning of the parameterizations is required. It is found that the accuracy of the medium-term forecasts is increased by using the RAW filter.
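The difference between the two time filters can be sketched on a toy problem. The code below is a minimal illustration, assuming the commonly cited form of the scheme: after each leapfrog step a displacement d = (ν/2)(x[n-1] - 2x[n] + x[n+1]) is computed, a fraction α of it is added to the middle time level, and the remainder (1 - α) is subtracted from the newest level, with α = 1 recovering the classical Robert-Asselin filter. The oscillation test equation, the forward Euler start-up step, and the parameter values ν = 0.2 and α = 0.53 are assumptions chosen for illustration; none of them are taken from the dissertation or from the SPEEDY model.

```python
def leapfrog_raw(f, x0, dt, steps, nu=0.2, alpha=0.53):
    """Integrate dx/dt = f(x) with leapfrog plus a RAW-style time filter.

    alpha = 1.0 gives the Robert-Asselin filter; values near 0.5 spread
    the correction over two time levels (assumed formulation, see text).
    Returns the state after `steps` time steps.
    """
    prev = x0                    # filtered value at time level n-1
    curr = x0 + dt * f(x0)       # forward Euler start-up for level n
    for _ in range(steps - 1):
        nxt = prev + 2.0 * dt * f(curr)            # leapfrog step
        d = 0.5 * nu * (prev - 2.0 * curr + nxt)   # filter displacement
        prev = curr + alpha * d                    # filter the middle level
        curr = nxt - (1.0 - alpha) * d             # compensate on the new level
    return curr


def oscillation(x, omega=1.0):
    """Right-hand side of the oscillation equation dx/dt = i*omega*x."""
    return 1j * omega * x
```

The exact solution of the oscillation equation keeps |x| = 1, so comparing `abs(leapfrog_raw(oscillation, 1 + 0j, 0.1, 1000, alpha=1.0))` with the same call using `alpha=0.53` gives a rough sense of how much less the RAW-style variant damps the physical mode in this toy setting.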
Abstract:
Why does music pervade our lives and those of all known human beings living today and in the recent past? Why do we feel compelled to engage in musical activity, or at least simply enjoy listening to music even if we choose not to actively participate? I argue that this is because musicality—communication using variations in pitch, rhythm, dynamics and timbre, by a combination of the voice, body (as in dance), and material culture—was essential to the lives of our pre-linguistic hominin ancestors. As a consequence we have inherited a desire to engage with music, even if this has no adaptive benefit for us today as a species whose communication system is dominated by spoken language. In this article I provide a summary of the arguments to support this view.
Abstract:
Ensemble clustering (EC) can arise in data assimilation with ensemble square root filters (EnSRFs) using non-linear models: an M-member ensemble splits into a single outlier and a cluster of M−1 members. The stochastic Ensemble Kalman Filter does not present this problem. Modifications to the EnSRFs by a periodic resampling of the ensemble through random rotations have been proposed to address it. We introduce a metric to quantify the presence of EC and present evidence to dispel the notion that EC leads to filter failure. Starting from a univariate model, we show that EC is not a permanent but a transient phenomenon; it occurs intermittently in non-linear models. We perform a series of data assimilation experiments using a standard EnSRF and an EnSRF modified by resampling through random rotations. The modified EnSRF alleviates issues associated with EC at the cost of traceability of individual ensemble trajectories, and it cannot use some of the algorithms that enhance the performance of the standard EnSRF. In the non-linear regimes of low-dimensional models, the analysis root mean square error of the standard EnSRF slowly grows with ensemble size if the size is larger than the dimension of the model state. However, we do not observe this problem in a more complex model that uses an ensemble size much smaller than the dimension of the model state, along with inflation and localisation. Overall, we find that transient EC does not handicap the performance of the standard EnSRF.
Abstract:
Whole-genome sequencing offers new insights into the evolution of bacterial pathogens and the etiology of bacterial disease. Staphylococcus aureus is a major cause of bacteria-associated mortality and invasive disease and is carried asymptomatically by 27% of adults. Eighty percent of bacteremias match the carried strain. However, the role of evolutionary change in the pathogen during the progression from carriage to disease is incompletely understood. Here we use high-throughput genome sequencing to discover the genetic changes that accompany the transition from nasal carriage to fatal bloodstream infection in an individual colonized with methicillin-sensitive S. aureus. We found a single, cohesive population exhibiting a repertoire of 30 single-nucleotide polymorphisms and four insertion/deletion variants. Mutations accumulated at a steady rate over a 13-mo period, except for a cluster of mutations preceding the transition to disease. Although bloodstream bacteria differed by just eight mutations from the original nasally carried bacteria, half of those mutations caused truncation of proteins, including a premature stop codon in an AraC-family transcriptional regulator that has been implicated in pathogenicity. Comparison with evolution in two asymptomatic carriers supported the conclusion that clusters of protein-truncating mutations are highly unusual. Our results demonstrate that bacterial diversity in vivo is limited but nonetheless detectable by whole-genome sequencing, enabling the study of evolutionary dynamics within the host. Regulatory or structural changes that occur during carriage may be functionally important for pathogenesis; therefore identifying those changes is a crucial step in understanding the biological causes of invasive bacterial disease.
Abstract:
The K-Means algorithm for cluster analysis is one of the most influential and popular data mining methods. Its straightforward parallel formulation is well suited for distributed memory systems with reliable interconnection networks, such as massively parallel processors and clusters of workstations. However, in large-scale geographically distributed systems the straightforward parallel algorithm can be rendered useless by a single communication failure or high latency in communication paths. The lack of scalable and fault-tolerant global communication and synchronisation methods in large-scale systems has hindered the adoption of the K-Means algorithm for applications in large networked systems such as wireless sensor networks, peer-to-peer systems and mobile ad hoc networks. This work proposes a fully distributed K-Means algorithm (Epidemic K-Means) which does not require global communication and is intrinsically fault tolerant. The proposed distributed K-Means algorithm provides a clustering solution which can approximate the solution of an ideal centralised algorithm over the aggregated data as closely as desired. A comparative performance analysis is carried out against the state-of-the-art sampling methods and shows that the proposed method overcomes the limitations of the sampling-based approaches for skewed cluster distributions. The experimental analysis confirms that the proposed algorithm is very accurate and fault tolerant under unreliable network conditions (message loss and node failures) and is suitable for asynchronous networks of very large and extreme scale.
Abstract:
1. It has been postulated that climate warming may pose the greatest threat to species in the tropics, where ectotherms have evolved more thermal specialist physiologies. Although species could rapidly respond to environmental change through adaptation, little is known about the potential for thermal adaptation, especially in tropical species. 2. In the light of the limited empirical evidence available and predictions from mutation-selection theory, we might expect tropical ectotherms to have limited genetic variance to enable adaptation. However, as a consequence of thermodynamic constraints, we might expect this disadvantage to be at least partially offset by a fitness advantage, that is, the ‘hotter-is-better’ hypothesis. 3. Using an established quantitative genetics model and metabolic scaling relationships, we integrate the consequences of the opposing forces of thermal specialization and thermodynamic constraints on adaptive potential by evaluating extinction risk under climate warming. We conclude that the potential advantage of a higher maximal development rate can in theory more than offset the potential disadvantage of lower genetic variance associated with a thermal specialist strategy. 4. Quantitative estimates of extinction risk are fundamentally very sensitive to estimates of generation time and genetic variance. However, our qualitative conclusion that the relative risk of extinction is likely to be lower for tropical species than for temperate species is robust to assumptions regarding the effects of effective population size, mutation rate and birth rate per capita. 5. With a view to improving ecological forecasts, we use this modelling framework to review the sensitivity of our predictions to the model’s underpinning theoretical assumptions and the empirical basis of macroecological patterns that suggest thermal specialization and fitness increase towards the tropics. We conclude by suggesting priority areas for further empirical research.
Abstract:
Evolutionary meta-algorithms for pulse shaping of broadband femtosecond duration laser pulses are proposed. The genetic algorithm searching the evolutionary landscape for desired pulse shapes consists of a population of waveforms (genes), each made from two concatenated vectors, specifying phases and magnitudes, respectively, over a range of frequencies. Frequency domain operators such as mutation, two-point crossover, average crossover, polynomial phase mutation, creep and three-point smoothing, as well as a time-domain crossover, are combined to produce fitter offspring at each iteration step. The algorithm applies roulette wheel selection, elitism and linear fitness scaling to the gene population. A differential evolution (DE) operator that provides a source of directed mutation and new wavelet operators are proposed. Using properly tuned parameters for DE, the meta-algorithm is used to solve a waveform matching problem. Tuning allows either a greedy directed search near the best known solution or a robust search across the entire parameter space.
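The selection stage named in this abstract (roulette wheel selection, elitism and linear fitness scaling) can be sketched generically. The code below is not the authors' implementation: it assumes the textbook form of linear scaling, f' = a·f + b, which preserves the mean fitness and maps the best individual to `c_mult` times the mean, falling back to mapping the worst individual to zero when that would produce negative values. The function names, the `c_mult = 2.0` default, and the assumption of non-negative fitness values with a positive total are all illustrative choices.

```python
import bisect
import itertools
import random


def linear_scale(fitness, c_mult=2.0):
    """Linearly rescale fitness so the mean is kept and max -> c_mult * mean."""
    f_avg = sum(fitness) / len(fitness)
    f_max, f_min = max(fitness), min(fitness)
    if f_max == f_min:
        return list(fitness)  # nothing to scale when all values are equal
    if f_min > (c_mult * f_avg - f_max) / (c_mult - 1.0):
        a = (c_mult - 1.0) * f_avg / (f_max - f_avg)
        b = f_avg * (f_max - c_mult * f_avg) / (f_max - f_avg)
    else:
        # Full scaling would go negative: map the minimum to zero instead.
        a = f_avg / (f_avg - f_min)
        b = -f_min * f_avg / (f_avg - f_min)
    return [a * f + b for f in fitness]


def roulette_select(weights, rng):
    """Return an index with probability proportional to its weight."""
    cumulative = list(itertools.accumulate(weights))
    r = rng.random() * cumulative[-1]
    return min(bisect.bisect_right(cumulative, r), len(weights) - 1)


def select_parents(fitness, n_select, n_elite, rng):
    """Elitism first (best n_elite indices), then roulette on scaled fitness."""
    ranked = sorted(range(len(fitness)), key=lambda i: fitness[i], reverse=True)
    chosen = ranked[:n_elite]
    scaled = linear_scale(fitness)
    while len(chosen) < n_select:
        chosen.append(roulette_select(scaled, rng))
    return chosen
```

For the fitness values `[1.0, 2.0, 3.0, 4.0, 10.0]` the mean is 4.0, so scaling with `c_mult=2.0` maps the best individual to 8.0 while leaving the mean at 4.0. This limits how strongly a single outstanding waveform dominates the roulette wheel, while elitism still guarantees that it survives into the next generation.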