6 results for variable length Markov chains

in AMS Tesi di Dottorato - Alm@DL - Università di Bologna


Relevance: 100.00%

Abstract:

Different types of proteins exist, with diverse functions that are essential for living organisms. An important class is represented by transmembrane proteins, which are specifically designed to be inserted into biological membranes and perform very important functions in the cell, such as cell communication and active transport across the membrane. Transmembrane β-barrels (TMBBs) are a sub-class of membrane proteins that is largely under-represented in structure databases because of the extreme difficulty of experimental structure determination. For this reason, computational tools that are able to predict the structure of TMBBs are needed. In this thesis, two computational problems related to TMBBs were addressed: the detection of TMBBs in large datasets of proteins and the prediction of the topology of TMBB proteins. Firstly, a method for TMBB detection was presented, based on a novel neural network framework for variable-length sequence classification. The proposed approach was validated on a non-redundant dataset of proteins. Furthermore, we carried out genome-wide detection using the entire Escherichia coli proteome. In both experiments, the method significantly outperformed other existing state-of-the-art approaches, reaching very high PPV (92%) and MCC (0.82). Secondly, a method was introduced for TMBB topology prediction. The proposed approach is based on grammatical modelling and probabilistic discriminative models for sequence data labeling. The method was evaluated using a newly generated dataset of 38 TMBB proteins obtained from high-resolution data in the PDB. The results show that the model correctly predicts the topology of 25 out of 38 protein chains in the dataset. When tested on previously released datasets, the performance of the proposed approach was comparable or superior to the current state of the art in TMBB topology prediction.
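For readers unfamiliar with the two detection metrics quoted in this abstract, the following is a minimal sketch of how PPV and MCC are computed from a binary confusion matrix. The function names and the counts are illustrative assumptions; they are not taken from the thesis and do not reproduce its results.

```python
import math


def ppv(tp, fp):
    """Positive predictive value (precision): TP / (TP + FP)."""
    return tp / (tp + fp)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient of a binary confusion matrix."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Common convention when a row or column of the matrix is empty.
        return 0.0
    return (tp * tn - fp * fn) / denom


if __name__ == "__main__":
    # Hypothetical counts, chosen only to illustrate the calls.
    tp, tn, fp, fn = 46, 940, 4, 10
    print("PPV:", ppv(tp, fp))
    print("MCC:", mcc(tp, tn, fp, fn))
```

MCC is often reported alongside PPV for detection tasks like this one because it stays informative when positives (here, TMBBs) are rare relative to negatives.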

Relevance: 100.00%

Abstract:

Non-Equilibrium Statistical Mechanics is a broad subject. Roughly speaking, it deals with systems which have not yet relaxed to an equilibrium state, with systems which are in a steady non-equilibrium state, or with more general situations. Such systems are characterized by external forcing and internal fluxes, resulting in a net production of entropy which quantifies dissipation and the extent to which, by the Second Law of Thermodynamics, time-reversal invariance is broken. In this thesis we discuss some of the mathematical structures involved with generic discrete-state-space non-equilibrium systems, which we depict with networks entirely analogous to electrical networks. We define suitable observables and derive their linear regime relationships; we discuss a duality between external and internal observables that reverses the roles of the system and of the environment; and we show that network observables serve as constraints for a derivation of the minimum entropy production principle. We dwell on deep combinatorial aspects regarding linear response determinants, which are related to spanning tree polynomials in graph theory, and we give a geometrical interpretation of observables in terms of Wilson loops of a connection and gauge degrees of freedom. We specialize the formalism to continuous-time Markov chains, give a physical interpretation for observables in terms of locally detailed balanced rates, prove many variants of the fluctuation theorem, and show that a well-known expression for the entropy production due to Schnakenberg descends from considerations of gauge invariance, where the gauge symmetry is related to the freedom in the choice of a prior probability distribution.
As an additional topic of geometrical flavor related to continuous-time Markov chains, we discuss the Fisher-Rao geometry of nonequilibrium decay modes, showing that the Fisher matrix contains information about many aspects of non-equilibrium behavior, including non-equilibrium phase transitions and superposition of modes. We establish a sort of statistical equivalence principle and discuss the behavior of the Fisher matrix under time-reversal. To conclude, we propose that geometry and combinatorics might greatly increase our understanding of nonequilibrium phenomena.
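As a concrete illustration of the Schnakenberg entropy production mentioned in this abstract, the sketch below evaluates the standard textbook expression, the sum over pairs of states of (p_i w_ij - p_j w_ji) ln(p_i w_ij / (p_j w_ji)), for a continuous-time Markov chain. The function name, the convention that w[i][j] is the rate from state i to state j, and the three-state ring are assumptions made for illustration; the sketch does not reproduce the gauge-theoretic derivation of the thesis.

```python
import math


def entropy_production(p, w):
    """Schnakenberg entropy production rate for a continuous-time Markov chain.

    p[i]    : probability of state i
    w[i][j] : transition rate from state i to state j (assumed convention)
    Each pair of states is counted once; pairs with no transitions are skipped.
    Assumes that whenever w[i][j] > 0, the reverse rate w[j][i] is also > 0.
    """
    n = len(p)
    sigma = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            forward = p[i] * w[i][j]
            backward = p[j] * w[j][i]
            if forward == 0.0 and backward == 0.0:
                continue
            sigma += (forward - backward) * math.log(forward / backward)
    return sigma


def ring_rates(a, b):
    """Three-state ring 0 -> 1 -> 2 -> 0 with rate a, reverse direction with rate b."""
    return [
        [0.0, a, b],
        [b, 0.0, a],
        [a, b, 0.0],
    ]


if __name__ == "__main__":
    # By symmetry the steady state of the ring is uniform.
    p = [1.0 / 3.0] * 3
    print(entropy_production(p, ring_rates(2.0, 1.0)))  # driven ring
    print(entropy_production(p, ring_rates(1.0, 1.0)))  # detailed balance
```

For the symmetric ring the sum reduces to (a - b) ln(a / b), which vanishes exactly when detailed balance holds (a = b) and is positive otherwise.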

Relevance: 100.00%

Abstract:

In Chapter 2, we investigate linear matrix-valued SDEs and the Itô-stochastic Magnus expansion. The Itô-stochastic Magnus expansion provides an efficient numerical scheme to solve matrix-valued SDEs. We show convergence of the expansion up to a stopping time τ and provide an asymptotic estimate of the cumulative distribution function of τ. Moreover, we show how to apply it to solve SPDEs in one and two spatial dimensions with high accuracy by combining it with the method of lines. We will see that the Magnus expansion allows us to use GPU techniques, leading to major performance improvements compared to a standard Euler-Maruyama scheme. In Chapter 3, we study a short-rate model in a Cox-Ingersoll-Ross (CIR) framework for negative interest rates. We define the short rate as the difference of two independent CIR processes and add a deterministic shift to guarantee a perfect fit to the market term structure. We show how to use the Gram-Charlier expansion to efficiently calibrate the model to the market swaption surface and price Bermudan swaptions with good accuracy. We then take two different perspectives on rating transition modelling. In Section 4.4, we study inhomogeneous continuous-time Markov chains (ICTMC) as a candidate for a rating model with deterministic rating transitions. In Section 4.5, we extend this model by taking a Lie group perspective to allow for stochastic rating transitions. In both cases, we compare the most popular choices for a change of measure technique and show how to efficiently calibrate both models to the available historical rating data and market default probabilities. Finally, we apply the techniques shown in this thesis to minimize the collateral-inclusive Credit/Debit Valuation Adjustments under the constraint of small collateral postings by using a collateral account dependent on rating triggers.
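To make the short-rate construction in this abstract concrete, here is a minimal sketch that simulates the difference of two independent CIR processes plus a deterministic shift, using a plain Euler-Maruyama scheme with full truncation. The scheme, the function name, and all parameter values are assumptions chosen for illustration; this is neither the Magnus-based scheme nor the calibration procedure of the thesis.

```python
import math
import random


def simulate_short_rate(x_params, y_params, shift, horizon, n_steps, seed=0):
    """Simulate r_t = x_t - y_t + shift(t) on a uniform time grid.

    x_params, y_params : (initial value, kappa, theta, sigma) of each CIR process
    shift              : deterministic function of time (hypothetical here)
    Euler-Maruyama with full truncation: negative values are floored at zero
    inside the drift and diffusion terms only.
    """
    rng = random.Random(seed)
    dt = horizon / n_steps
    sqrt_dt = math.sqrt(dt)
    x, kx, thx, sx = x_params
    y, ky, thy, sy = y_params
    path = [x - y + shift(0.0)]
    for k in range(1, n_steps + 1):
        xp, yp = max(x, 0.0), max(y, 0.0)
        dwx = rng.gauss(0.0, sqrt_dt)  # independent Brownian increments
        dwy = rng.gauss(0.0, sqrt_dt)
        x = x + kx * (thx - xp) * dt + sx * math.sqrt(xp) * dwx
        y = y + ky * (thy - yp) * dt + sy * math.sqrt(yp) * dwy
        path.append(x - y + shift(k * dt))
    return path


if __name__ == "__main__":
    # Hypothetical parameters, for illustration only.
    path = simulate_short_rate(
        x_params=(0.01, 0.5, 0.02, 0.05),
        y_params=(0.02, 0.3, 0.03, 0.04),
        shift=lambda t: -0.001 * t,
        horizon=5.0,
        n_steps=500,
    )
    print(len(path), min(path), max(path))
```

Because each CIR process is non-negative, their difference can take either sign, which is what allows a model of this form to produce negative rates.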

Relevance: 100.00%

Abstract:

In the field of educational and psychological measurement, the shift from paper-based to computerized tests has become a prominent trend in recent years. Computerized tests allow for more complex and personalized test administration procedures, such as Computerized Adaptive Testing (CAT). CAT, following Item Response Theory (IRT) models, dynamically generates tests based on test-taker responses, driven by complex statistical algorithms. Although CAT structures are complex, they are flexible and convenient, but concerns about test security should be addressed. Frequent item administration can lead to item exposure and cheating, necessitating preventive and diagnostic measures. In this thesis a method called "CHeater identification using Interim Person fit Statistic" (CHIPS) is developed, designed to identify and limit cheaters in real time during test administration. CHIPS uses response times (RTs) to calculate an Interim Person fit Statistic (IPS), allowing for on-the-fly intervention using a more secret item bank. In addition, a slight modification, called Modified-CHIPS (M-CHIPS), is proposed to overcome situations with constant speed. A simulation study assesses CHIPS, highlighting its effectiveness in identifying and controlling cheaters. However, it reveals limitations when cheaters possess all correct answers; M-CHIPS overcomes this limitation. Furthermore, the method has been shown not to be influenced by the cheaters’ ability distribution or by the level of correlation between the ability and speed of test-takers. Finally, the method has demonstrated flexibility in the choice of significance level and in the transition from fixed-length tests to variable-length ones. The thesis discusses potential applications, including the suitability of the method for multiple-choice tests, assumptions about the RT distribution, and the level of item pre-knowledge.
Limitations are also discussed in order to explore future developments, such as different RT distributions, unusual honest respondent behaviors, and field testing in real-world scenarios. In summary, CHIPS and M-CHIPS offer real-time cheating detection in CAT, enhancing test security and ability estimation while not penalizing test respondents.

Relevance: 30.00%

Abstract:

The aim of the thesis is to formulate a suitable Item Response Theory (IRT) based model to measure HRQoL (as a latent variable) using a mixed-response questionnaire while relaxing the hypothesis of a normally distributed latent variable. The new model is a combination of two models already presented in the literature, namely a latent trait model for mixed responses and an IRT model for a Skew Normal latent variable. It is developed in a Bayesian framework; a Markov chain Monte Carlo procedure is used to generate samples from the posterior distribution of the parameters of interest. The proposed model is tested on a questionnaire composed of five discrete items and one continuous item to measure HRQoL in children, the EQ-5D-Y questionnaire. A large sample of children collected in schools was used. In comparison with a model for only discrete responses and a model for mixed responses with a normal latent variable, the new model performs better in terms of deviance information criterion (DIC), chain convergence times, and precision of the estimates.
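The deviance information criterion used for the model comparison in this abstract can be computed directly from MCMC output. The sketch below follows the usual definition, DIC = D̄ + p_D with p_D = D̄ - D(θ̄), where D̄ is the posterior mean deviance and D(θ̄) is the deviance at the posterior mean of the parameters. The function name and the numbers are illustrative assumptions, not values from the thesis.

```python
def dic(deviance_samples, deviance_at_posterior_mean):
    """Return (DIC, p_D) from posterior deviance draws.

    deviance_samples           : deviance evaluated at each MCMC draw
    deviance_at_posterior_mean : deviance evaluated at the posterior mean
    """
    d_bar = sum(deviance_samples) / len(deviance_samples)
    p_d = d_bar - deviance_at_posterior_mean  # effective number of parameters
    return d_bar + p_d, p_d


if __name__ == "__main__":
    # Hypothetical deviance draws, for illustration only.
    value, p_d = dic([10.0, 12.0, 14.0], 11.0)
    print(value, p_d)
```

Lower DIC indicates a better trade-off between fit and complexity, so models fitted to the same data are compared by their DIC differences.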