961 resultados para discrete-choice models
Resumo:
Biologists are increasingly conscious of the critical role that noise plays in cellular functions such as genetic regulation, often in connection with fluctuations in small numbers of key regulatory molecules. This has inspired the development of models that capture this fundamentally discrete and stochastic nature of cellular biology - most notably the Gillespie stochastic simulation algorithm (SSA). The SSA simulates a temporally homogeneous, discrete-state, continuous-time Markov process, and of course the corresponding probabilities and numbers of each molecular species must all remain positive. While accurately serving this purpose, the SSA can be computationally inefficient due to very small time stepping so faster approximations such as the Poisson and Binomial τ-leap methods have been suggested. This work places these leap methods in the context of numerical methods for the solution of stochastic differential equations (SDEs) driven by Poisson noise. This allows analogues of Euler-Maruyuma, Milstein and even higher order methods to be developed through the Itô-Taylor expansions as well as similar derivative-free Runge-Kutta approaches. Numerical results demonstrate that these novel methods compare favourably with existing techniques for simulating biochemical reactions by more accurately capturing crucial properties such as the mean and variance than existing methods.
Resumo:
An interactive hierarchical Generative Topographic Mapping (HGTM) ¸iteHGTM has been developed to visualise complex data sets. In this paper, we build a more general visualisation system by extending the HGTM visualisation system in 3 directions: bf (1) We generalize HGTM to noise models from the exponential family of distributions. The basic building block is the Latent Trait Model (LTM) developed in ¸iteKabanpami. bf (2) We give the user a choice of initializing the child plots of the current plot in either em interactive, or em automatic mode. In the interactive mode the user interactively selects ``regions of interest'' as in ¸iteHGTM, whereas in the automatic mode an unsupervised minimum message length (MML)-driven construction of a mixture of LTMs is employed. bf (3) We derive general formulas for magnification factors in latent trait models. Magnification factors are a useful tool to improve our understanding of the visualisation plots, since they can highlight the boundaries between data clusters. The unsupervised construction is particularly useful when high-level plots are covered with dense clusters of highly overlapping data projections, making it difficult to use the interactive mode. Such a situation often arises when visualizing large data sets. We illustrate our approach on a toy example and apply our system to three more complex real data sets.
Resumo:
In this contribution, certain aspects of the nonlinear dynamics of magnetic field lines are reviewed. First, the basic facts (known from literature) concerning the Hamiltonian structure are briefly summarized. The paper then concentrates on the following subjects: (i) Transition from the continuous description to discrete maps; (ii) Characteristics of incomplete chaos; (iii) Control of chaos. The presentation is concluded by some remarks on the motion of particles in stochastic magnetic fields.
Resumo:
The paper presents a comparison between the different drag models for granular flows developed in the literature and the effect of each one of them on the fast pyrolysis of wood. The process takes place on an 100 g/h lab scale bubbling fluidized bed reactor located at Aston University. FLUENT 6.3 is used as the modeling framework of the fluidized bed hydrodynamics, while the fast pyrolysis of the discrete wood particles is incorporated as an external user defined function (UDF) hooked to FLUENT’s main code structure. Three different drag models for granular flows are compared, namely the Gidaspow, Syamlal O’Brien, and Wen-Yu, already incorporated in FLUENT’s main code, and their impact on particle trajectory, heat transfer, degradation rate, product yields, and char residence time is quantified. The Eulerian approach is used to model the bubbling behavior of the sand, which is treated as a continuum. Biomass reaction kinetics is modeled according to the literature using a two-stage, semiglobal model that takes into account secondary reactions.
Resumo:
This thesis is concerned with the inventory control of items that can be considered independent of one another. The decisions when to order and in what quantity, are the controllable or independent variables in cost expressions which are minimised. The four systems considered are referred to as (Q, R), (nQ,R,T), (M,T) and (M,R,T). Wiith ((Q,R) a fixed quantity Q is ordered each time the order cover (i.e. stock in hand plus on order ) equals or falls below R, the re-order level. With the other three systems reviews are made only at intervals of T. With (nQ,R,T) an order for nQ is placed if on review the inventory cover is less than or equal to R, where n, which is an integer, is chosen at the time so that the new order cover just exceeds R. In (M, T) each order increases the order cover to M. Fnally in (M, R, T) when on review, order cover does not exceed R, enough is ordered to increase it to M. The (Q, R) system is examined at several levels of complexity, so that the theoretical savings in inventory costs obtained with more exact models could be compared with the increases in computational costs. Since the exact model was preferable for the (Q,R) system only exact models were derived for theoretical systems for the other three. Several methods of optimization were tried, but most were found inappropriate for the exact models because of non-convergence. However one method did work for each of the exact models. Demand is considered continuous, and with one exception, the distribution assumed is the normal distribution truncated so that demand is never less than zero. Shortages are assumed to result in backorders, not lost sales. However, the shortage cost is a function of three items, one of which, the backorder cost, may be either a linear, quadratic or an exponential function of the length of time of a backorder, with or without period of grace. Lead times are assumed constant or gamma distributed. Lastly, the actual supply quantity is allowed to be distributed. All the sets of equations were programmed for a KDF 9 computer and the computed performances of the four inventory control procedures are compared under each assurnption.
Resumo:
This thesis applies a hierarchical latent trait model system to a large quantity of data. The motivation for it was lack of viable approaches to analyse High Throughput Screening datasets which maybe include thousands of data points with high dimensions. High Throughput Screening (HTS) is an important tool in the pharmaceutical industry for discovering leads which can be optimised and further developed into candidate drugs. Since the development of new robotic technologies, the ability to test the activities of compounds has considerably increased in recent years. Traditional methods, looking at tables and graphical plots for analysing relationships between measured activities and the structure of compounds, have not been feasible when facing a large HTS dataset. Instead, data visualisation provides a method for analysing such large datasets, especially with high dimensions. So far, a few visualisation techniques for drug design have been developed, but most of them just cope with several properties of compounds at one time. We believe that a latent variable model (LTM) with a non-linear mapping from the latent space to the data space is a preferred choice for visualising a complex high-dimensional data set. As a type of latent variable model, the latent trait model can deal with either continuous data or discrete data, which makes it particularly useful in this domain. In addition, with the aid of differential geometry, we can imagine the distribution of data from magnification factor and curvature plots. Rather than obtaining the useful information just from a single plot, a hierarchical LTM arranges a set of LTMs and their corresponding plots in a tree structure. We model the whole data set with a LTM at the top level, which is broken down into clusters at deeper levels of t.he hierarchy. In this manner, the refined visualisation plots can be displayed in deeper levels and sub-clusters may be found. Hierarchy of LTMs is trained using expectation-maximisation (EM) algorithm to maximise its likelihood with respect to the data sample. Training proceeds interactively in a recursive fashion (top-down). The user subjectively identifies interesting regions on the visualisation plot that they would like to model in a greater detail. At each stage of hierarchical LTM construction, the EM algorithm alternates between the E- and M-step. Another problem that can occur when visualising a large data set is that there may be significant overlaps of data clusters. It is very difficult for the user to judge where centres of regions of interest should be put. We address this problem by employing the minimum message length technique, which can help the user to decide the optimal structure of the model. In this thesis we also demonstrate the applicability of the hierarchy of latent trait models in the field of document data mining.
Resumo:
The traditional method of classifying neurodegenerative diseases is based on the original clinico-pathological concept supported by 'consensus' criteria and data from molecular pathological studies. This review discusses first, current problems in classification resulting from the coexistence of different classificatory schemes, the presence of disease heterogeneity and multiple pathologies, the use of 'signature' brain lesions in diagnosis, and the existence of pathological processes common to different diseases. Second, three models of neurodegenerative disease are proposed: (1) that distinct diseases exist ('discrete' model), (2) that relatively distinct diseases exist but exhibit overlapping features ('overlap' model), and (3) that distinct diseases do not exist and neurodegenerative disease is a 'continuum' in which there is continuous variation in clinical/pathological features from one case to another ('continuum' model). Third, to distinguish between models, the distribution of the most important molecular 'signature' lesions across the different diseases is reviewed. Such lesions often have poor 'fidelity', i.e., they are not unique to individual disorders but are distributed across many diseases consistent with the overlap or continuum models. Fourth, the question of whether the current classificatory system should be rejected is considered and three alternatives are proposed, viz., objective classification, classification for convenience (a 'dissection'), or analysis as a continuum.
Resumo:
The spatial distribution of self-employment in India: evidence from semiparametric geoadditive models, Regional Studies. The entrepreneurship literature has rarely considered spatial location as a micro-determinant of occupational choice. It has also ignored self-employment in developing countries. Using Bayesian semiparametric geoadditive techniques, this paper models spatial location as a micro-determinant of self-employment choice in India. The empirical results suggest the presence of spatial occupational neighbourhoods and a clear north–south divide in self-employment when the entire sample is considered; however, spatial variation in the non-agriculture sector disappears to a large extent when individual factors that influence self-employment choice are explicitly controlled. The results further suggest non-linear effects of age, education and wealth on self-employment.
Resumo:
Potential applications of high-damping and high-stiffness composites have motivated extensive research on the effects of negative-stiffness inclusions on the overall properties of composites. Recent theoretical advances have been based on the Hashin-Shtrikman composite models, one-dimensional discrete viscoelastic systems and a two-dimensional nested triangular viscoelastic network. In this paper, we further analyze the two-dimensional triangular structure containing pre-selected negative-stiffness components to study its underlying deformation mechanisms and stability. Major new findings are structure-deformation evolution with respect to the magnitude of negative stiffness under shear loading and the phenomena related to dissipation-induced destabilization and inertia-induced stabilization, according to Lyapunov stability analysis. The evolution shows strong correlations between stiffness anomalies and deformation modes. Our stability results reveal that stable damping peaks, i.e. stably extreme effective damping properties, are achievable under hydrostatic loading when the inertia is greater than a critical value. Moreover, destabilization induced by elemental damping is observed with the critical inertia. Regardless of elemental damping, when the inertia is less than the critical value, a weaker system instability is identified.
Resumo:
The recent development of using negative stiffness inclusions to achieve extreme overall stiffness and mechanical damping of composite materials reveals a new avenue for constructing high performance materials. One of the negative stiffness sources can be obtained from phase transforming materials in the vicinity of their phase transition, as suggested by the Landau theory. To understand the underlying mechanism from a microscopic viewpoint, we theoretically analyze a 2D, nested triangular lattice cell with pre-chosen elements containing negative stiffness to demonstrate anomalies in overall stiffness and damping. Combining with current knowledge from continuum models, based on the composite theory, such as the Voigt, Reuss, and Hashin-Shtrikman model, we further explore the stability of the system with Lyapunov's indirect stability theorem. The evolution of the microstructure in terms of the discrete system is discussed. A potential application of the results presented here is to develop special thin films with unusual in-plane mechanical properties. © 2006 Elsevier B.V. All rights reserved.
Resumo:
Determination of the so-called optical constants (complex refractive index N, which is usually a function of the wavelength, and physical thickness D) of thin films from experimental data is a typical inverse non-linear problem. It is still a challenge to the scientific community because of the complexity of the problem and its basic and technological significance in optics. Usually, solutions are looked for models with 3-10 parameters. Best estimates of these parameters are obtained by minimization procedures. Herein, we discuss the choice of orthogonal polynomials for the dispersion law of the thin film refractive index. We show the advantage of their use, compared to the Selmeier, Lorentz or Cauchy models.
Resumo:
Analysing the molecular polymorphism and interactions of DNA, RNA and proteins is of fundamental importance in biology. Predicting functions of polymorphic molecules is important in order to design more effective medicines. Analysing major histocompatibility complex (MHC) polymorphism is important for mate choice, epitope-based vaccine design and transplantation rejection etc. Most of the existing exploratory approaches cannot analyse these datasets because of the large number of molecules with a high number of descriptors per molecule. This thesis develops novel methods for data projection in order to explore high dimensional biological dataset by visualising them in a low-dimensional space. With increasing dimensionality, some existing data visualisation methods such as generative topographic mapping (GTM) become computationally intractable. We propose variants of these methods, where we use log-transformations at certain steps of expectation maximisation (EM) based parameter learning process, to make them tractable for high-dimensional datasets. We demonstrate these proposed variants both for synthetic and electrostatic potential dataset of MHC class-I. We also propose to extend a latent trait model (LTM), suitable for visualising high dimensional discrete data, to simultaneously estimate feature saliency as an integrated part of the parameter learning process of a visualisation model. This LTM variant not only gives better visualisation by modifying the project map based on feature relevance, but also helps users to assess the significance of each feature. Another problem which is not addressed much in the literature is the visualisation of mixed-type data. We propose to combine GTM and LTM in a principled way where appropriate noise models are used for each type of data in order to visualise mixed-type data in a single plot. We call this model a generalised GTM (GGTM). We also propose to extend GGTM model to estimate feature saliencies while training a visualisation model and this is called GGTM with feature saliency (GGTM-FS). We demonstrate effectiveness of these proposed models both for synthetic and real datasets. We evaluate visualisation quality using quality metrics such as distance distortion measure and rank based measures: trustworthiness, continuity, mean relative rank errors with respect to data space and latent space. In cases where the labels are known we also use quality metrics of KL divergence and nearest neighbour classifications error in order to determine the separation between classes. We demonstrate the efficacy of these proposed models both for synthetic and real biological datasets with a main focus on the MHC class-I dataset.
Resumo:
As more of the economy moves from traditional manufacturing to the service sector, the nature of work is becoming less tangible and thus, the representation of human behaviour in models is becoming more important. Representing human behaviour and decision making in models is challenging, both in terms of capturing the essence of the processes, and also the way that those behaviours and decisions are or can be represented in the models themselves. In order to advance understanding in this area, a useful first step is to evaluate and start to classify the various types of behaviour and decision making that are required to be modelled. This talk will attempt to set out and provide an initial classification of the different types of behaviour and decision making that a modeller might want to represent in a model. Then, it will be useful to start to assess the main methods of simulation in terms of their capability in representing these various aspects. The three main simulation methods, System Dynamics, Agent Based Modelling and Discrete Event Simulation all achieve this to varying degrees. There is some evidence that all three methods can, within limits, represent the key aspects of the system being modelled. The three simulation approaches are then assessed for their suitability in modelling these various aspects. Illustration of behavioural modelling will be provided from cases in supply chain management, evacuation modelling and rail disruption.
Resumo:
In non-linear random effects some attention has been very recently devoted to the analysis ofsuitable transformation of the response variables separately (Taylor 1996) or not (Oberg and Davidian 2000) from the transformations of the covariates and, as far as we know, no investigation has been carried out on the choice of link function in such models. In our study we consider the use of a random effect model when a parameterized family of links (Aranda-Ordaz 1981, Prentice 1996, Pregibon 1980, Stukel 1988 and Czado 1997) is introduced. We point out the advantages and the drawbacks associated with the choice of this data-driven kind of modeling. Difficulties in the interpretation of regression parameters, and therefore in understanding the influence of covariates, as well as problems related to loss of efficiency of estimates and overfitting, are discussed. A case study on radiotherapy usage in breast cancer treatment is discussed.
Resumo:
2000 Mathematics Subject Classification: 94A29, 94B70