22 results for Entropy of Tsallis
at Université de Lausanne, Switzerland
Abstract:
Recognition by the T-cell receptor (TCR) of immunogenic peptides (p) presented by Class I major histocompatibility complexes (MHC) is the key event in the immune response against virus-infected cells or tumor cells. A study of the 2C TCR/SIYR/H-2K(b) system using a computational alanine scanning and a much faster binding free energy decomposition based on the Molecular Mechanics-Generalized Born Surface Area (MM-GBSA) method is presented. The results show that the TCR-p-MHC binding free energy decomposition using this approach and including entropic terms provides a detailed and reliable description of the interactions between the molecules at an atomistic level. Comparison of the decomposition results with experimentally determined activity differences for alanine mutants yields a correlation of 0.67 when the entropy is neglected and 0.72 when the entropy is taken into account. Similarly, comparison of experimental activities with variations in binding free energies determined by computational alanine scanning yields correlations of 0.72 and 0.74 when the entropy is neglected or taken into account, respectively. Some key interactions for the TCR-p-MHC binding are analyzed and some possible side chains replacements are proposed in the context of TCR protein engineering. In addition, a comparison of the two theoretical approaches for estimating the role of each side chain in the complexation is given, and a new ad hoc approach to decompose the vibrational entropy term into atomic contributions, the linear decomposition of the vibrational entropy (LDVE), is introduced. The latter allows the rapid calculation of the entropic contribution of interesting side chains to the binding. This new method is based on the idea that the most important contributions to the vibrational entropy of a molecule originate from residues that contribute most to the vibrational amplitude of the normal modes. 
The LDVE approach is shown to provide results very similar to those of the exact but highly computationally demanding method.
Advanced mapping of environmental data: Geostatistics, Machine Learning and Bayesian Maximum Entropy
Abstract:
This book combines geostatistics and global mapping systems to present an up-to-the-minute study of environmental data. Featuring numerous case studies, the reference covers model dependent (geostatistics) and data driven (machine learning algorithms) analysis techniques such as risk mapping, conditional stochastic simulations, descriptions of spatial uncertainty and variability, artificial neural networks (ANN) for spatial data, Bayesian maximum entropy (BME), and more.
Abstract:
An active learning method is proposed for the semi-automatic selection of training sets in remote sensing image classification. The method iteratively adds to the current training set the unlabeled pixels for which the prediction of an ensemble of classifiers, based on bagged training sets, shows maximum entropy. In this way, the algorithm selects the pixels that are the most uncertain and that will improve the model if added to the training set. The user is asked to label such pixels at each iteration. Experiments using support vector machines (SVM) on an 8-class QuickBird image show the excellent performance of the method, which equals the accuracies of both a model trained with ten times more pixels and a model whose training set has been built using a state-of-the-art SVM-specific active learning method.
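The selection criterion described above — picking the unlabeled pixels whose bagged-ensemble predictions have maximum entropy — can be sketched as follows. This is an illustrative reconstruction, not the authors' code; the array shapes and function names are assumptions.

```python
import numpy as np

def vote_entropy(votes, n_classes):
    """Shannon entropy of the class-vote distribution for each pixel.

    votes: (n_models, n_pixels) array of predicted class labels,
    one row per classifier trained on a bagged training set.
    """
    probs = np.zeros((votes.shape[1], n_classes))
    for c in range(n_classes):
        probs[:, c] = (votes == c).mean(axis=0)  # vote fraction per class
    # Treat 0 * log(0) as 0
    with np.errstate(divide="ignore", invalid="ignore"):
        terms = np.where(probs > 0, probs * np.log(probs), 0.0)
    return -terms.sum(axis=1)

def select_most_uncertain(votes, n_classes, n_select):
    """Indices of the pixels with maximum vote entropy (to be labeled)."""
    h = vote_entropy(votes, n_classes)
    return np.argsort(h)[::-1][:n_select]
```

Pixels on which the bagged classifiers disagree most get the highest entropy and are handed to the user for labeling at each iteration.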
Abstract:
The objective of this essay is to reflect on a possible relation between entropy and emergence. A qualitative, relational approach is followed. We begin by highlighting that entropy includes the concept of dispersal, relevant to our enquiry. Emergence in complex systems arises from the coordinated behavior of their parts. Coordination in turn necessitates recognition between parts, i.e., information exchange. What will be argued here is that the scope of recognition processes between parts is increased when preceded by their dispersal, which multiplies the number of encounters and creates a richer potential for recognition. A process intrinsic to emergence is dissolvence (aka submergence or top-down constraints), which participates in the information-entropy interplay underlying the creation, evolution and breakdown of higher-level entities.
Abstract:
1. Species distribution modelling is used increasingly in both applied and theoretical research to predict how species are distributed and to understand attributes of species' environmental requirements. In species distribution modelling, various statistical methods are used that combine species occurrence data with environmental spatial data layers to predict the suitability of any site for that species. While the number of data sharing initiatives involving species' occurrences in the scientific community has increased dramatically over the past few years, various data quality and methodological concerns related to using these data for species distribution modelling have not been addressed adequately. 2. We evaluated how uncertainty in georeferences and associated locational error in occurrences influence species distribution modelling using two treatments: (1) a control treatment where models were calibrated with original, accurate data and (2) an error treatment where data were first degraded spatially to simulate locational error. To incorporate error into the coordinates, we moved each coordinate with a random number drawn from the normal distribution with a mean of zero and a standard deviation of 5 km. We evaluated the influence of error on the performance of 10 commonly used distributional modelling techniques applied to 40 species in four distinct geographical regions. 3. Locational error in occurrences reduced model performance in three of these regions; relatively accurate predictions of species distributions were possible for most species, even with degraded occurrences. Two species distribution modelling techniques, boosted regression trees and maximum entropy, were the best performing models in the face of locational errors. The results obtained with boosted regression trees were only slightly degraded by errors in location, and the results obtained with the maximum entropy approach were not affected by such errors. 4. Synthesis and applications. 
To use the vast array of occurrence data that exists currently for research and management relating to the geographical ranges of species, modellers need to know the influence of locational error on model quality and whether some modelling techniques are particularly robust to error. We show that certain modelling techniques are particularly robust to a moderate level of locational error and that useful predictions of species distributions can be made even when occurrence data include some error.
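The error treatment described above — displacing each occurrence coordinate by a random draw from a normal distribution with mean zero and standard deviation 5 km — can be sketched as follows. The function name is illustrative, and coordinates are assumed to be in a projected, metre-based system.

```python
import numpy as np

def degrade_occurrences(coords, sd_m=5000.0, seed=None):
    """Add isotropic Gaussian locational error to occurrence points.

    coords: (n, 2) array of projected x/y coordinates in metres.
    Each coordinate is shifted by an independent draw from N(0, sd_m),
    simulating georeferencing error of ~5 km standard deviation.
    """
    rng = np.random.default_rng(seed)
    return coords + rng.normal(loc=0.0, scale=sd_m, size=coords.shape)
```

Models calibrated on the degraded coordinates can then be compared against the control models built from the original, accurate data.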
Abstract:
In many fields, the spatial clustering of sampled data points has important consequences. Several indices have therefore been proposed to assess the level of clustering affecting datasets (e.g. the Morisita index, Ripley's K-function and Rényi's generalized entropy). The classical Morisita index measures how many times more likely it is to select two measurement points from the same quadrat (the data set being covered by a regular grid of changing size) than it would be for a random distribution generated from a Poisson process. The multipoint version (k-Morisita) takes into account k points with k >= 2. The present research deals with a new development of the k-Morisita index for (1) monitoring network characterization and (2) detection of patterns in monitored phenomena. From a theoretical perspective, a connection between the k-Morisita index and multifractality has also been found and highlighted on a mathematical multifractal set.
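A minimal sketch of the classical Morisita index for points covered by a regular grid of Q quadrats (the k-Morisita version replaces the pairwise products with k-fold falling factorials). The grid size and unit-square assumption are illustrative choices, not the paper's settings.

```python
import numpy as np

def morisita_index(points, n_cells):
    """Classical Morisita index on a regular n_cells x n_cells grid.

    points: (n, 2) array of x/y positions in the unit square.
    Returns Q * sum n_i(n_i - 1) / (N(N - 1)), which is ~1 for a
    random (Poisson) pattern and > 1 for a clustered one.
    """
    # Assign each point to a quadrat of the grid
    idx = np.clip((points * n_cells).astype(int), 0, n_cells - 1)
    counts = np.zeros((n_cells, n_cells))
    np.add.at(counts, (idx[:, 0], idx[:, 1]), 1)
    n = counts.ravel()
    N = n.sum()
    Q = n_cells * n_cells
    return Q * (n * (n - 1)).sum() / (N * (N - 1))
```

Recomputing the index over grids of changing cell size gives the scale-dependent clustering profile that the paper connects to multifractality.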
Abstract:
We studied the distribution of Palearctic green toads (Bufo viridis subgroup), an anuran species group with three ploidy levels, inhabiting the Central Asian Amudarya River drainage. Various approaches (one-way, multivariate, components variance analyses and maximum entropy modelling) were used to estimate the effect of altitude, precipitation, temperature and land vegetation covers on the distribution of toads. It is usually assumed that polyploid species occur in regions with harsher climatic conditions (higher latitudes, elevations, etc.), but for the green toads complex, we revealed a more intricate situation. The diploid species (Bufo shaartusiensis and Bufo turanensis) inhabit the arid lowlands (from 44 to 789 m a.s.l.), while tetraploid Bufo pewzowi were recorded in mountainous regions (340-3492 m a.s.l.) with usually lower temperatures and higher precipitation rates than in the region inhabited by diploid species. The triploid species Bufo baturae was found in the Pamirs (Tajikistan) at the highest altitudes (2503-3859 m a.s.l.) under the harshest climatic conditions.
Abstract:
The vast territories that have been radioactively contaminated during the 1986 Chernobyl accident provide a substantial data set of radioactive monitoring data, which can be used for the verification and testing of the different spatial estimation (prediction) methods involved in risk assessment studies. Using the Chernobyl data set for such a purpose is motivated by its heterogeneous spatial structure (the data are characterized by large-scale correlations, short-scale variability, spotty features, etc.). The present work is concerned with the application of the Bayesian Maximum Entropy (BME) method to estimate the extent and the magnitude of the radioactive soil contamination by 137Cs due to the Chernobyl fallout. The powerful BME method allows rigorous incorporation of a wide variety of knowledge bases into the spatial estimation procedure leading to informative contamination maps. Exact measurements ("hard" data) are combined with secondary information on local uncertainties (treated as "soft" data) to generate science-based uncertainty assessment of soil contamination estimates at unsampled locations. BME describes uncertainty in terms of the posterior probability distributions generated across space, whereas no assumption about the underlying distribution is made and non-linear estimators are automatically incorporated. Traditional estimation variances based on the assumption of an underlying Gaussian distribution (analogous, e.g., to the kriging variance) can be derived as a special case of the BME uncertainty analysis. The BME estimates obtained using hard and soft data are compared with the BME estimates obtained using only hard data. The comparison involves both the accuracy of the estimation maps using the exact data and the assessment of the associated uncertainty using repeated measurements. Furthermore, a comparison of the spatial estimation accuracy obtained by the two methods was carried out using a validation data set of hard data.
Finally, a separate uncertainty analysis was conducted that evaluated the ability of the posterior probabilities to reproduce the distribution of the raw repeated measurements available in certain populated sites. The analysis provides an illustration of the improvement in mapping accuracy obtained by adding soft data to the existing hard data and, in general, demonstrates that the BME method performs well both in terms of estimation accuracy and in terms of estimation error assessment, which are both useful features for the Chernobyl fallout study.
Abstract:
Background and Aims: The NS5A protein of the HCV is known to be involved in viral replication and assembly and probably in the resistance to Interferon-based therapy. Previous studies identified insertions or deletions from 1 to 12 nucleotides in several genomic regions. In a multicenter study (17 French and 1 Swiss laboratories of virology), we identified for the first time a 31 amino acid insertion leading to a duplication of the V3 domain in the NS5A region with a high prevalence. Quasispecies of each strain with duplication were characterized and the inserted V3 domain was identified. Methods: Between 2006 and 2008, 1067 patients chronically infected with a 1b HCV were consecutively included in the study. We first amplified the V3 region by RT-PCR to detect duplication (919 samples successfully amplified). The entire NS5A region was then amplified, cloned and sequenced in strains bearing the duplication. V3 sequences (called R1 and R2) from each clone were analyzed with BioEdit and compared to a V3 consensus sequence (C) built from the Los Alamos Hepatitis C Database. Entropy was determined at each position. Results: V3 duplications were identified in 25 patients, representing a prevalence of 2.72%. We sequenced 2043 clones, of which 776 had a complete coding NS5A sequence (corresponding to a mean of 30 clones per patient). At the intra-individual level, 6 to 17 variants were identified per V3 region, with a maximum of 3 different amino acids. At the inter-individual level, a difference of 7 and 2 amino acids was observed between C and the R1 and R2 sequences, respectively. Moreover, few positions presented entropy higher than 1 (4 for R1, 2 for R2 and 2 for C). Among all the sequenced clones, more than 60% were defective virus (partial fragment of NS5A or stop codon). Conclusions: We identified a duplication of the V3 domain in genotype 1b HCV with a high prevalence.
The R2 domain, which was the most similar to the C region, is probably the "original" domain, whereas R1 is likely the inserted domain. Phylogenetic analyses are under way to confirm this hypothesis.
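The per-position entropy used to compare the R1, R2 and consensus sequences can be sketched as follows for a set of aligned sequences. The toy sequences and the choice of log base 2 are assumptions for illustration; positions with entropy above 1 would flag variable sites.

```python
import math
from collections import Counter

def positional_entropy(seqs):
    """Shannon entropy (bits) at each column of aligned sequences.

    seqs: list of equal-length strings (e.g. amino acid sequences).
    """
    n = len(seqs)
    entropies = []
    for i in range(len(seqs[0])):
        counts = Counter(s[i] for s in seqs)  # residue frequencies at column i
        h = -sum((c / n) * math.log2(c / n) for c in counts.values())
        entropies.append(h)
    return entropies
```

A fully conserved column yields entropy 0; a column split evenly between two residues yields 1 bit.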
Abstract:
Visual analysis of electroencephalography (EEG) background and reactivity during therapeutic hypothermia provides important outcome information, but is time-consuming and not always consistent between reviewers. Automated EEG analysis may help quantify the brain damage. Forty-six comatose patients in therapeutic hypothermia after cardiac arrest were included in the study. EEG background was quantified with the burst-suppression ratio (BSR) and approximate entropy, both used to monitor anesthesia. Reactivity was detected through the change in the power spectrum of the signal before and after stimulation. Automatic results reached almost perfect agreement (discontinuity) to substantial agreement (background reactivity) with the visual scores of EEG-certified neurologists. The burst-suppression ratio was better suited than approximate entropy to distinguish continuous EEG background from burst-suppression in this specific population. Automatic EEG background and reactivity measures were significantly related to good and poor outcome. We conclude that quantitative EEG measurements can provide promising information regarding the current state of the patient and the clinical outcome, but further work is needed before routine application in a clinical setting.
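The burst-suppression ratio mentioned above is conventionally the fraction of an epoch spent in suppression, i.e. in segments where the EEG amplitude stays below a threshold for at least a minimum duration. A minimal sketch, with threshold and minimum duration as illustrative values rather than the study's parameters:

```python
import numpy as np

def burst_suppression_ratio(signal, fs, thresh_uv=5.0, min_supp_s=0.5):
    """Fraction of the epoch spent in suppression.

    A sample counts as 'suppressed' when |signal| < thresh_uv; runs of
    suppressed samples shorter than min_supp_s are ignored.
    signal: 1-D EEG trace in microvolts, sampled at fs Hz.
    """
    suppressed = np.abs(signal) < thresh_uv
    min_len = int(min_supp_s * fs)
    total = 0
    run = 0
    for s in suppressed:
        if s:
            run += 1
        else:
            if run >= min_len:  # close a qualifying suppression run
                total += run
            run = 0
    if run >= min_len:  # run reaching the end of the epoch
        total += run
    return total / len(signal)
```

A BSR near 0 indicates continuous background activity, while values approaching 1 indicate a largely suppressed EEG.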
Abstract:
DNA condensation observed in vitro with the addition of polyvalent counterions is due to intermolecular attractive forces. We introduce a quantitative model of these forces in a Brownian dynamics simulation in addition to a standard mean-field Poisson-Boltzmann repulsion. The comparison of a theoretical value of the effective diameter calculated from the second virial coefficient in cylindrical geometry with some experimental results allows a quantitative evaluation of the one-parameter attractive potential. We show afterward that with a sufficient concentration of divalent salt (typically approximately 20 mM MgCl(2)), supercoiled DNA adopts a collapsed form where opposing segments of interwound regions present zones of lateral contact. However, under the same conditions the same plasmid without torsional stress does not collapse. The condensed molecules present coexisting open and collapsed plectonemic regions. Furthermore, simulations show that circular DNA in 50% methanol solutions with 20 mM MgCl(2) aggregates without the requirement of torsional energy. This confirms known experimental results. Finally, a simulated DNA molecule confined in a box of variable size also presents some local collapsed zones in 20 mM MgCl(2) above a critical concentration of the DNA. Conformational entropy reduction obtained either by supercoiling or by confinement seems thus to play a crucial role in all forms of condensation of DNA.
Abstract:
In Neo-Darwinism, variation and natural selection are the two evolutionary mechanisms which propel biological evolution. Our previous article presented a histogram model [1] consisting of populations of individuals whose numbers changed under the influence of variation and/or fitness, the total population remaining constant. Individuals are classified into bins, and the content of each bin is calculated generation after generation by an Excel spreadsheet. Here, we apply the histogram model to a stable population with fitness F(1)=1.00 in which one or two fitter mutants emerge. In a first scenario, a single mutant emerged in the population whose fitness was greater than 1.00. The simulations ended when the original population was reduced to a single individual. The histogram model was validated by excellent agreement between its predictions and those of a classical continuous function (Eqn. 1) which predicts the number of generations needed for a favorable mutation to spread throughout a population. But in contrast to Eqn. 1, our histogram model is adaptable to more complex scenarios, as demonstrated here. In the second and third scenarios, the original population was present at time zero together with two mutants which differed from the original population by two higher and distinct fitness values. In the fourth scenario, the large original population was present at time zero together with one fitter mutant. After a number of generations, when the mutant offspring had multiplied, a second mutant was introduced whose fitness was even greater. The histogram model also allows Shannon entropy (SE) to be monitored continuously as the information content of the total population decreases or increases. The results of these simulations illustrate, in a graphically didactic manner, the influence of natural selection, operating through relative fitness, in the emergence and dominance of a fitter mutant.
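The Shannon entropy monitored in the histogram model is computed from the fraction of the total population in each bin; a minimal sketch (the bin counts in the test are illustrative, and log base 2 is an assumed convention):

```python
import math

def shannon_entropy(bin_counts):
    """Shannon entropy (bits) of a population distributed over bins.

    bin_counts: per-bin population sizes; empty bins contribute nothing.
    """
    total = sum(bin_counts)
    return -sum((c / total) * math.log2(c / total)
                for c in bin_counts if c > 0)
```

As a fitter mutant sweeps the population into a single bin, this entropy falls toward zero, which is the decrease in information content the abstract describes.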
Abstract:
In Neo-Darwinism, variation and natural selection are the two evolutionary mechanisms which propel biological evolution. Our previous reports presented a histogram model to simulate the evolution of populations of individuals classified into bins according to an unspecified, quantifiable phenotypic character, and whose number in each bin changed generation after generation under the influence of fitness, while the total population was maintained constant. The histogram model also allowed Shannon entropy (SE) to be monitored continuously as the information content of the total population decreased or increased. Here, a simple Perl (Practical Extraction and Reporting Language) application was developed to carry out these computations, with the critical feature of an added random factor in the percentage of individuals whose offspring moved to a vicinal bin. The results of the simulations demonstrate that the random factor mimicking variation considerably increased the range of values covered by Shannon entropy, especially when the percentage of changed offspring was high. This increase in information content is interpreted as facilitated adaptability of the population.
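The generation update with a random factor — a stochastic fraction of each bin's offspring moving to a vicinal bin — can be sketched roughly as follows. The original application was written in Perl; this is an illustrative Python analogue with made-up parameters, renormalizing so the total population stays constant.

```python
import random

def next_generation(bins, fitness, max_move_frac=0.1, rng=random):
    """One generation of the histogram model with random variation.

    bins: population count per phenotype bin; fitness: per-bin fitness.
    A random fraction (up to max_move_frac) of each bin's offspring
    moves to the next bin, mimicking variation; the totals are then
    rescaled to keep the overall population constant.
    """
    total = sum(bins)
    # Reproduction weighted by relative fitness
    offspring = [b * f for b, f in zip(bins, fitness)]
    # Random variation: move part of each bin to the vicinal bin
    moved = [0.0] * len(bins)
    for i in range(len(bins) - 1):
        frac = rng.uniform(0.0, max_move_frac)
        amount = offspring[i] * frac
        moved[i] -= amount
        moved[i + 1] += amount
    new = [o + m for o, m in zip(offspring, moved)]
    # Keep the total population constant
    scale = total / sum(new)
    return [n * scale for n in new]
```

Tracking the Shannon entropy of the bin distribution over successive calls reproduces the widened entropy range that the random factor produces.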