127 results for gene transcriptional regulatory network, stochastic differential equation, membership function

in Queensland University of Technology - ePrints Archive


Relevance:

100.00% 100.00%

Abstract:

Genomic and proteomic analyses have attracted a great deal of interest in biological research in recent years. Many methods have been applied to discover useful information contained in the enormous databases of genomic sequences and amino acid sequences. The results of these investigations in turn inspire further research in biological fields. These biological sequences, which may be considered as multiscale sequences, have specific features that require more refined methods to characterise. This project aims to study some of these biological challenges with multiscale analysis methods and a stochastic modelling approach. The first part of the thesis aims to cluster some unknown proteins, and classify their families as well as their structural classes. A development in proteomic analysis is concerned with the determination of protein functions. The first step in this development is to classify proteins and predict their families. This motivates us to study some unknown proteins from specific families, and to cluster them into families and structural classes. We select a large number of proteins from the same families or superfamilies, and link them to simulate some unknown large proteins from these families. We use multifractal analysis and the wavelet method to capture the characteristics of these linked proteins. The simulation results show that the method is valid for the classification of large proteins. The second part of the thesis aims to explore the relationship of proteins based on a layered comparison with their components. Many methods are based on homology of proteins because resemblance at the protein sequence level normally indicates similarity of functions and structures. However, some proteins may have similar functions despite low sequence identity. We consider protein sequences at the detail level to investigate the problem of comparing proteins.
The comparison is based on the empirical mode decomposition (EMD), and protein sequences are detected with the intrinsic mode functions. A measure of similarity is introduced with a new cross-correlation formula. The similarity results show that the EMD is useful for detection of functional relationships of proteins. The third part of the thesis aims to investigate the transcriptional regulatory network of the yeast cell cycle via stochastic differential equations. As the investigation of genome-wide gene expression has become a focus in genomic analysis, researchers have tried to understand the mechanisms of the yeast genome for many years. How cells control gene expression still needs further investigation. We use a stochastic differential equation to model the expression profile of a target gene. We modify the model with a Gaussian membership function. For each target gene, a transcriptional rate is obtained, and the estimated transcriptional rate is also calculated with the information from five possible transcriptional regulators. Some regulators of these target genes are verified with the related references. With these results, we construct a transcriptional regulatory network for the genes from the yeast Saccharomyces cerevisiae. The construction of a transcriptional regulatory network is useful for detecting further mechanisms of the yeast cell cycle.

Relevance:

100.00% 100.00%

Abstract:

Exponential growth of genomic data in the last two decades has made manual analyses impractical for all but trial studies. As genomic analyses have become more sophisticated, and moved toward comparisons across large datasets, computational approaches have become essential. One of the most important biological challenges is to understand the mechanisms underlying gene regulation. Genetic regulation is commonly investigated and modelled through the use of transcriptional regulatory network (TRN) structures. These model the regulatory interactions between two key components: transcription factors (TFs) and the target genes (TGs) they regulate. Transcriptional regulatory networks have proven to be invaluable scientific tools in bioinformatics. When used in conjunction with comparative genomics, they have provided substantial insights into the evolution of regulatory interactions. Current approaches to regulatory network inference, however, omit two additional key entities: promoters and transcription factor binding sites (TFBSs). In this study, we attempted to explore the relationships among these regulatory components in bacteria. Our primary goal was to identify relationships that can assist in reducing the high false positive rates associated with transcription factor binding site predictions and thereby enhance the reliability of the inferred transcription regulatory networks. In our preliminary exploration of relationships between the key regulatory components in Escherichia coli transcription, we discovered a number of potentially useful features. The combination of location score and sequence dissimilarity scores increased de novo binding site prediction accuracy by 13.6%. Another important observation concerned the relationship between transcription factors grouped by their regulatory role and corresponding promoter strength.
Our study of E. coli σ70 promoters found support at the 0.1 significance level for our hypothesis that weak promoters are preferentially associated with activator binding sites to enhance gene expression, whilst strong promoters have more repressor binding sites to repress or inhibit gene transcription. Although the observations were specific to σ70, they nevertheless strongly encourage additional investigations when more experimentally confirmed data are available. In our preliminary exploration of relationships between the key regulatory components in E. coli transcription, we discovered a number of potentially useful features, some of which proved successful in reducing the number of false positives when applied to re-evaluate binding site predictions. Of chief interest was the relationship observed between promoter strength and TFs with respect to their regulatory role. Based on the common assumption that promoter homology positively correlates with transcription rate, we hypothesised that weak promoters would have more transcription factors that enhance gene expression, whilst strong promoters would have more repressor binding sites. The t-tests for E. coli σ70 promoters returned a p-value of 0.072, which at the 0.1 significance level suggested support for our (alternative) hypothesis, although this trend may only be present for promoters where the corresponding TFBSs are either all repressors or all activators. Nevertheless, such suggestive results strongly encourage additional investigations when more experimentally confirmed data become available. Much of the remainder of the thesis concerns a machine learning study of binding site prediction, using the SVM and kernel methods, principally the spectrum kernel.
Spectrum kernels have been successfully applied in previous studies of protein classification [91, 92], as well as the related problem of promoter prediction [59], and here we have successfully applied the technique to refining TFBS predictions. The advantages provided by the SVM classifier were best seen in 'moderately' conserved transcription factor binding sites, as represented by our E. coli CRP case study. Inclusion of additional position feature attributes further increased accuracy by 9.1%, but more notable was the considerable decrease in false positive rate from 0.8 to 0.5 while retaining 0.9 sensitivity. Improved prediction of transcription factor binding sites is in turn extremely valuable in improving inference of regulatory relationships, a problem notoriously prone to false positive predictions. Here, the number of false regulatory interactions inferred using the conventional two-component model was substantially reduced when we integrated de novo transcription factor binding site predictions as an additional criterion for acceptance in a case study of inference in the Fur regulon. This initial work was extended to a comparative study of the iron regulatory system across 20 Yersinia strains. This work revealed interesting, strain-specific differences, especially between pathogenic and non-pathogenic strains. Such differences were made clear through interactive visualisations using the TRNDiff software developed as part of this work, and would have remained undetected using conventional methods. This approach led to the nomination of the Yfe iron-uptake system as a candidate for further wet-lab experimentation due to its potential active functionality in non-pathogens and its known participation in full virulence of the bubonic plague strain. Building on this work, we introduced novel structures we have labelled 'regulatory trees', inspired by the phylogenetic tree concept.
Instead of using gene or protein sequence similarity, the regulatory trees were constructed based on the number of similar regulatory interactions. While the common phylogenetic trees convey information regarding changes in gene repertoire, which we might regard as analogous to 'hardware', the regulatory tree informs us of the changes in regulatory circuitry, in some respects analogous to 'software'. In this context, we explored the 'pan-regulatory network' for the Fur system, the entire set of regulatory interactions found for the Fur transcription factor across a group of genomes. In the pan-regulatory network, emphasis is placed on how the regulatory network for each target genome is inferred from multiple sources instead of a single source, as is the common approach. The benefit of using multiple reference networks is a more comprehensive survey of the relationships, and increased confidence in the regulatory interactions predicted. In the present study, we distinguish between relationships found across the full set of genomes, the 'core-regulatory-set', and interactions found only in a subset of the genomes explored, the 'sub-regulatory-set'. We found nine Fur target gene clusters present across the four genomes studied; this core set potentially identifies basic regulatory processes essential for survival. Species-level differences are seen at the sub-regulatory-set level; for example, the known virulence factors YbtA and PchR were found in Y. pestis and P. aeruginosa respectively, but were not present in either E. coli or B. subtilis. Such factors, and the iron-uptake systems they regulate, are ideal candidates for wet-lab investigation to determine whether or not they are pathogen-specific. In this study, we employed a broad range of approaches to address our goals and assessed these methods using the Fur regulon as our initial case study.
We identified a set of promising feature attributes; demonstrated their success in increasing transcription factor binding site prediction specificity while retaining sensitivity, and showed the importance of binding site predictions in enhancing the reliability of regulatory interaction inferences. Most importantly, these outcomes led to the introduction of a range of visualisations and techniques, which are applicable across the entire bacterial spectrum and can be utilised in studies beyond the understanding of transcriptional regulatory networks.
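
The spectrum kernel named in the abstract above compares two sequences by the inner product of their k-mer count vectors. The following is a minimal sketch of that definition only, not the thesis's SVM pipeline; the function names and the cosine normalisation are illustrative assumptions, as are the example sequences.

```python
from collections import Counter


def kmer_counts(seq, k):
    """Count every overlapping substring of length k in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def spectrum_kernel(x, y, k):
    """Inner product of the k-mer count vectors of two sequences."""
    cx, cy = kmer_counts(x, k), kmer_counts(y, k)
    return sum(n * cy[kmer] for kmer, n in cx.items())


def normalised_spectrum_kernel(x, y, k):
    """Cosine-normalised kernel (an assumed convenience, not from the source)."""
    kxx = spectrum_kernel(x, x, k)
    kyy = spectrum_kernel(y, y, k)
    if kxx == 0 or kyy == 0:
        return 0.0  # a sequence shorter than k has an empty spectrum
    return spectrum_kernel(x, y, k) / (kxx * kyy) ** 0.5


# Hypothetical binding-site-like strings, used only to exercise the functions.
print(spectrum_kernel("AAAA", "AAAT", 2))
print(normalised_spectrum_kernel("ACGT", "ACGA", 2))
```

A kernel of this form can be passed to any SVM implementation that accepts a precomputed Gram matrix; which implementation the thesis used is not stated in the abstract.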

Relevance:

100.00% 100.00%

Abstract:

Bistability arises within a wide range of biological systems from the λ phage switch in bacteria to cellular signal transduction pathways in mammalian cells. Changes in regulatory mechanisms may result in genetic switching in a bistable system. Recently, more and more experimental evidence in the form of bimodal population distributions indicates that noise plays a very important role in the switching of bistable systems. Although deterministic models have been used for studying the existence of bistability properties under various system conditions, these models cannot realize cell-to-cell fluctuations in genetic switching. However, there is a lag in the development of stochastic models for studying the impact of noise in bistable systems because of the lack of detailed knowledge of biochemical reactions, kinetic rates, and molecular numbers. In this work, we develop a previously undescribed general technique for developing quantitative stochastic models for large-scale genetic regulatory networks by introducing Poisson random variables into deterministic models described by ordinary differential equations. Two stochastic models have been proposed for the genetic toggle switch interfaced with either the SOS signaling pathway or a quorum-sensing signaling pathway, and we have successfully realized experimental results showing bimodal population distributions. Because the introduced stochastic models are based on widely used ordinary differential equation models, the success of this work suggests that this approach is a very promising one for studying noise in large-scale genetic regulatory networks.
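
One plausible reading of "introducing Poisson random variables into deterministic models" is to replace each production and degradation term of the ODE, over a short step tau, with a Poisson-distributed number of events. The sketch below applies that idea to a generic two-gene toggle switch with mutual Hill repression. It is an illustration under stated assumptions, not the authors' model: the parameter values, the clamping at zero, the omission of the SOS and quorum-sensing interfaces, and all function names are hypothetical.

```python
import math
import random


def poisson(lam, rng):
    """Draw a Poisson(lam) sample by Knuth's multiplication method (small lam only)."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate_toggle(steps, tau, alpha, K, n, rng):
    """Poisson-leap simulation of a toggle switch in molecule numbers (u, v)."""
    u, v = 0, 0
    for _ in range(steps):
        # Production rates are the repression terms of the deterministic ODEs.
        make_u = alpha / (1.0 + (v / K) ** n)
        make_v = alpha / (1.0 + (u / K) ** n)
        # Each ODE term becomes a Poisson-distributed number of reaction events.
        du = poisson(make_u * tau, rng) - poisson(u * tau, rng)
        dv = poisson(make_v * tau, rng) - poisson(v * tau, rng)
        # Clamping at zero is a simplification that keeps counts non-negative.
        u, v = max(u + du, 0), max(v + dv, 0)
    return u, v


# A small population of independent cells; all parameter values are assumed.
rng = random.Random(1)
cells = [simulate_toggle(500, 0.02, 50.0, 10.0, 2, rng) for _ in range(50)]
high_u = sum(1 for u, v in cells if u > v)
print(f"{high_u} of {len(cells)} cells ended with u > v")
```

A histogram of the final `u` values across many cells is the kind of population distribution that the abstract compares with experiment.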

Relevance:

100.00% 100.00%

Abstract:

In recent years considerable attention has been paid to the numerical solution of stochastic ordinary differential equations (SODEs), as SODEs are often more appropriate than their deterministic counterparts in many modelling situations. However, in contrast to the deterministic case, numerical methods for SODEs are considerably less sophisticated due to the difficulty in representing the (possibly large number of) random variable approximations to the stochastic integrals. Although Burrage and Burrage [High strong order explicit Runge-Kutta methods for stochastic ordinary differential equations, Applied Numerical Mathematics 22 (1996) 81-101] were able to construct strong local order 1.5 stochastic Runge-Kutta methods for certain cases, it is known that all extant stochastic Runge-Kutta methods suffer an order reduction down to strong order 0.5 if there is non-commutativity between the functions associated with the multiple Wiener processes. This order reduction down to that of the Euler-Maruyama method imposes severe difficulties in obtaining meaningful solutions in a reasonable time frame, and this paper attempts to circumvent these difficulties by some new techniques. An additional difficulty in solving SODEs arises even in the linear case, since it is not possible to write the solution analytically in terms of matrix exponentials unless there is a commutativity property between the functions associated with the multiple Wiener processes. Thus, in the present paper, the work of Magnus [On the exponential solution of differential equations for a linear operator, Communications on Pure and Applied Mathematics 7 (1954) 649-673] (applied to deterministic non-commutative linear problems) will first be applied to non-commutative linear SODEs, and methods of strong order 1.5 for arbitrary, linear, non-commutative SODE systems will be constructed, hence giving an accurate approximation to the general linear problem.
Secondly, for general nonlinear non-commutative systems with an arbitrary number (d) of Wiener processes, it is shown that strong local order 1 Runge-Kutta methods with d + 1 stages can be constructed by evaluating a set of Lie brackets as well as the standard function evaluations. A method is then constructed which can be efficiently implemented in a parallel environment for this arbitrary number of Wiener processes. Finally, some numerical results are presented which illustrate the efficacy of these approaches. (C) 1999 Elsevier Science B.V. All rights reserved.
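
The commutativity condition in the abstract above is easy to check in the linear case. For linear noise terms g_j(x) = B_j x, the Lie bracket of g_1 and g_2 reduces, up to sign convention, to the matrix commutator of B_1 and B_2 applied to x. The sketch below tests that condition for a pair of matrices; the matrices and function names are hypothetical and do not come from the paper.

```python
def matmul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def commutator(b1, b2):
    """Matrix commutator B1*B2 - B2*B1."""
    p, q = matmul(b1, b2), matmul(b2, b1)
    return [[p[i][j] - q[i][j] for j in range(len(p))] for i in range(len(p))]


def is_commutative(b1, b2, tol=1e-12):
    """True when the two noise matrices commute to within tol."""
    return all(abs(entry) <= tol for row in commutator(b1, b2) for entry in row)


# Hypothetical noise matrices: two diagonal ones commute, the last pair does not.
diag_a = [[1.0, 0.0], [0.0, 2.0]]
diag_b = [[3.0, 0.0], [0.0, 4.0]]
shear = [[0.0, 1.0], [0.0, 0.0]]
print(is_commutative(diag_a, diag_b))
print(is_commutative(diag_a, shear))
print(commutator(diag_a, shear))
```

When a check of this kind fails, the abstract's point is that commutator (Lie bracket) terms have to enter the numerical method itself.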

Relevance:

100.00% 100.00%

Abstract:

In many modeling situations in which parameter values can only be estimated or are subject to noise, the appropriate mathematical representation is a stochastic ordinary differential equation (SODE). However, unlike the deterministic case in which there are suites of sophisticated numerical methods, numerical methods for SODEs are much less sophisticated. Until a recent paper by K. Burrage and P.M. Burrage (1996), the highest strong order of a stochastic Runge-Kutta method was one. But K. Burrage and P.M. Burrage (1996) showed that by including additional random variable terms representing approximations to the higher order Stratonovich (or Ito) integrals, higher order methods could be constructed. However, this analysis applied only to the one Wiener process case. In this paper, it will be shown that in the multiple Wiener process case all known stochastic Runge-Kutta methods can suffer a severe order reduction if there is non-commutativity between the functions associated with the Wiener processes. Importantly, however, it is also suggested how this order can be repaired if certain commutator operators are included in the Runge-Kutta formulation. (C) 1998 Elsevier Science B.V. and IMACS. All rights reserved.

Relevance:

100.00% 100.00%

Abstract:

Aït-Sahalia (2002) introduced a method to estimate transitional probability densities of diffusion processes by means of Hermite expansions with coefficients determined by means of Taylor series. This note describes a numerical procedure to find these coefficients based on the calculation of moments. One advantage of this procedure is that it can be used effectively when the mathematical operations required to find closed-form expressions for these coefficients are otherwise infeasible.

Relevance:

100.00% 100.00%

Abstract:

We seek numerical methods for second‐order stochastic differential equations that reproduce the stationary density accurately for all values of damping. A complete analysis is possible for scalar linear second‐order equations (damped harmonic oscillators with additive noise), where the statistics are Gaussian and can be calculated exactly in the continuous‐time and discrete‐time cases. A matrix equation is given for the stationary variances and correlation for methods using one Gaussian random variable per timestep. The only Runge–Kutta method with a nonsingular tableau matrix that gives the exact steady state density for all values of damping is the implicit midpoint rule. Numerical experiments, comparing the implicit midpoint rule with Heun and leapfrog methods on nonlinear equations with additive or multiplicative noise, produce behavior similar to the linear case.
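
The claim about the implicit midpoint rule can be checked numerically in the linear case. For the oscillator dx = v dt, dv = (-omega^2 x - gamma v) dt + sigma dW, the exact stationary variances are sigma^2 / (2 gamma omega^2) for x and sigma^2 / (2 gamma) for v, with zero correlation. Any one-step method with one Gaussian variable per step has the form X_{n+1} = M X_n + c dW_n, and its stationary covariance P solves P = M P M^T + h c c^T. The sketch below solves that matrix equation by fixed-point iteration for the implicit midpoint rule and, for contrast, the Euler-Maruyama method. The parameter values and function names are assumptions, and the iteration is a convenience that is not taken from the paper.

```python
def mat_mul(a, b):
    """Product of two 2x2 matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def inverse(a):
    """Inverse of a 2x2 matrix."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det], [-a[1][0] / det, a[0][0] / det]]


def stationary_cov(m, c, h, iters=5000):
    """Fixed point of P = M P M^T + h c c^T (converges when the scheme is stable)."""
    mt = [[m[j][i] for j in range(2)] for i in range(2)]
    p = [[0.0, 0.0], [0.0, 0.0]]
    for _ in range(iters):
        mp = mat_mul(mat_mul(m, p), mt)
        p = [[mp[i][j] + h * c[i] * c[j] for j in range(2)] for i in range(2)]
    return p


def scheme_covariances(gamma, omega, sigma, h):
    """Exact, implicit-midpoint and Euler-Maruyama stationary covariances of (x, v)."""
    a = [[0.0, 1.0], [-omega ** 2, -gamma]]  # drift matrix of the oscillator
    b = [0.0, sigma]                         # additive noise enters v only
    eye = [[1.0, 0.0], [0.0, 1.0]]
    left = [[eye[i][j] - 0.5 * h * a[i][j] for j in range(2)] for i in range(2)]
    right = [[eye[i][j] + 0.5 * h * a[i][j] for j in range(2)] for i in range(2)]
    left_inv = inverse(left)
    # Implicit midpoint: (I - hA/2) X_{n+1} = (I + hA/2) X_n + b dW_n.
    m_mid = mat_mul(left_inv, right)
    c_mid = [sum(left_inv[i][k] * b[k] for k in range(2)) for i in range(2)]
    # Euler-Maruyama: X_{n+1} = (I + hA) X_n + b dW_n.
    m_em = [[eye[i][j] + h * a[i][j] for j in range(2)] for i in range(2)]
    exact = [[sigma ** 2 / (2 * gamma * omega ** 2), 0.0],
             [0.0, sigma ** 2 / (2 * gamma)]]
    return exact, stationary_cov(m_mid, c_mid, h), stationary_cov(m_em, b, h)


exact, midpoint, euler = scheme_covariances(gamma=0.5, omega=1.0, sigma=1.0, h=0.1)
print("exact   ", exact)
print("midpoint", midpoint)
print("Euler   ", euler)
```

With these assumed parameters the midpoint covariance matches the exact one to rounding error, while the Euler-Maruyama variances are visibly inflated at the same step size.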

Relevance:

100.00% 100.00%

Abstract:

The pioneering work of Runge and Kutta a hundred years ago has ultimately led to suites of sophisticated numerical methods suitable for solving complex systems of deterministic ordinary differential equations. However, in many modelling situations, the appropriate representation is a stochastic differential equation and here numerical methods are much less sophisticated. In this paper a very general class of stochastic Runge-Kutta methods is presented and much more efficient classes of explicit methods than previous extant methods are constructed. In particular, a method of strong order 2 with a deterministic component based on the classical Runge-Kutta method is constructed and some numerical results are presented to demonstrate the efficacy of this approach.

Relevance:

100.00% 100.00%

Abstract:

In this paper, a singularly perturbed ordinary differential equation with non-smooth data is considered. The numerical method is generated by means of a Petrov-Galerkin finite element method with a piecewise-exponential test function and a piecewise-linear trial function. A special technique is used at the point of discontinuity of the coefficient. The method is shown to be first-order accurate and uniformly convergent with respect to the singular perturbation parameter. Finally, numerical results are presented, which are in agreement with the theoretical results.

Relevance:

100.00% 100.00%

Abstract:

Financial processes may possess long memory and their probability densities may display heavy tails. Many models have been developed to deal with this tail behaviour, which reflects the jumps in the sample paths. On the other hand, the presence of long memory, which contradicts the efficient market hypothesis, is still an issue for further debate. These difficulties present challenges with the problems of memory detection and modelling the co-presence of long memory and heavy tails. This PhD project aims to respond to these challenges. The first part aims to detect memory in a large number of financial time series on stock prices and exchange rates using their scaling properties. Since financial time series often exhibit stochastic trends, a common form of nonstationarity, strong trends in the data can lead to false detection of memory. We will take advantage of a technique known as multifractal detrended fluctuation analysis (MF-DFA) that can systematically eliminate trends of different orders. This method is based on the identification of scaling of the q-th-order moments and is a generalisation of the standard detrended fluctuation analysis (DFA), which uses only the second moment; that is, q = 2. We also consider the rescaled range R/S analysis and the periodogram method to detect memory in financial time series and compare their results with the MF-DFA. An interesting finding is that short memory is detected for stock prices of the American Stock Exchange (AMEX) and long memory is found in the time series of two exchange rates, namely the French franc and the Deutsche mark. Electricity price series of the five states of Australia are also found to possess long memory. For these electricity price series, heavy tails are also pronounced in their probability densities. The second part of the thesis develops models to represent short-memory and long-memory financial processes as detected in Part I.
These models take the form of continuous-time AR(∞)-type equations whose kernel is the Laplace transform of a finite Borel measure. By imposing appropriate conditions on this measure, short memory or long memory in the dynamics of the solution will result. A specific form of the models, which has a good MA(∞)-type representation, is presented for the short-memory case. Parameter estimation for this type of model is performed via least squares, and the models are applied to the stock prices in the AMEX, which have been established in Part I to possess short memory. By selecting the kernel in the continuous-time AR(∞)-type equations to have the form of the Riemann-Liouville fractional derivative, we obtain a fractional stochastic differential equation driven by Brownian motion. This type of equation is used to represent financial processes with long memory, whose dynamics is described by the fractional derivative in the equation. These models are estimated via quasi-likelihood, namely via a continuous-time version of the Gauss-Whittle method. The models are applied to the exchange rates and the electricity prices of Part I with the aim of confirming their possible long-range dependence established by MF-DFA. The third part of the thesis provides an application of the results established in Parts I and II to characterise and classify financial markets. We will pay attention to the New York Stock Exchange (NYSE), the American Stock Exchange (AMEX), the NASDAQ Stock Exchange (NASDAQ) and the Toronto Stock Exchange (TSX). The parameters from MF-DFA and those of the short-memory AR(∞)-type models will be employed in this classification. We propose the Fisher discriminant algorithm to find a classifier in the two- and three-dimensional spaces of data sets and then provide cross-validation to verify discriminant accuracies. This classification is useful for understanding and predicting the behaviour of different processes within the same market.
The fourth part of the thesis investigates the heavy-tailed behaviour of financial processes which may also possess long memory. We consider fractional stochastic differential equations driven by stable noise to model financial processes such as electricity prices. The long memory of electricity prices is represented by a fractional derivative, while the stable noise input models their non-Gaussianity via the tails of their probability density. A method using the empirical densities and MF-DFA will be provided to estimate all the parameters of the model and simulate sample paths of the equation. The method is then applied to analyse daily spot prices for five states of Australia. A comparison with the results obtained from the R/S analysis, periodogram method and MF-DFA is provided. The results from fractional SDEs agree with those from MF-DFA, which are based on multifractal scaling, while those from the periodograms, which are based on the second order, seem to underestimate the long memory dynamics of the process. This highlights the need for, and usefulness of, fractal methods in modelling non-Gaussian financial processes with long memory.
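
The detrended fluctuation idea behind MF-DFA, as described in the abstract above, can be sketched in a few lines. The series is integrated into a profile and cut into windows of length s, a trend is removed from each window, and the q-th-order average of the residual variances gives F_q(s), whose log-log slope against s is the scaling exponent. The version below is a simplified illustration, not the thesis's procedure: it assumes linear detrending only, forward non-overlapping windows, and q different from zero, and all function names are hypothetical.

```python
import math
import random


def linear_residual_variance(segment):
    """Mean squared residual of a least-squares straight-line fit to the segment."""
    n = len(segment)
    t_mean = (n - 1) / 2.0
    y_mean = sum(segment) / n
    stt = sum((t - t_mean) ** 2 for t in range(n))
    sty = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(segment))
    slope = sty / stt
    return sum((y - y_mean - slope * (t - t_mean)) ** 2
               for t, y in enumerate(segment)) / n


def fluctuation(series, scale, q=2.0):
    """q-th order fluctuation function F_q(scale); q = 2 is standard DFA."""
    mean = sum(series) / len(series)
    profile, total = [], 0.0
    for x in series:
        total += x - mean          # cumulative sum of the centred series
        profile.append(total)
    variances = [linear_residual_variance(profile[i:i + scale])
                 for i in range(0, len(profile) - scale + 1, scale)]
    return (sum(v ** (q / 2.0) for v in variances) / len(variances)) ** (1.0 / q)


def scaling_exponent(series, scales, q=2.0):
    """Least-squares slope of log F_q(s) against log s."""
    xs = [math.log(s) for s in scales]
    ys = [math.log(fluctuation(series, s, q)) for s in scales]
    x_mean, y_mean = sum(xs) / len(xs), sum(ys) / len(ys)
    return (sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
            / sum((x - x_mean) ** 2 for x in xs))


# Synthetic white noise stands in for a return series; it has no memory,
# so the exponent should be close to 0.5.
rng = random.Random(0)
white_noise = [rng.gauss(0.0, 1.0) for _ in range(4096)]
h2 = scaling_exponent(white_noise, [16, 32, 64, 128, 256])
print(f"h(2) for white noise: {h2:.2f}")
```

Repeating the slope estimate over a range of q values gives the generalised exponents h(q); an h(2) well above 0.5 is the usual indication of long memory.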

Relevance:

100.00% 100.00%

Abstract:

Recently, the numerical modelling and simulation of fractional partial differential equations (FPDE), which have found wide application in modern engineering and science, have attracted increasing attention. The current dominant numerical method for modelling FPDE is the explicit Finite Difference Method (FDM), which is based on a pre-defined grid, leading to inherent issues or shortcomings. This paper aims to develop an implicit meshless approach based on radial basis functions (RBF) for the numerical simulation of time fractional diffusion equations. The discrete system of equations is obtained by using the RBF meshless shape functions and the strong forms. The stability and convergence of this meshless approach are then discussed and theoretically proven. Several numerical examples with different problem domains are used to validate and investigate the accuracy and efficiency of the newly developed meshless formulation. The results obtained by the meshless formulation are also compared with those obtained by FDM in terms of their accuracy and efficiency. It is concluded that the present meshless formulation is very effective for the modelling and simulation of FPDE.