875 resultados para Exponential random graph models
Resumo:
2000 Mathematics Subject Classification: 60K15, 60K20, 60G20,60J75, 60J80, 60J85, 60-08, 90B15.
Resumo:
In this dissertation, we apply mathematical programming techniques (i.e., integer programming and polyhedral combinatorics) to develop exact approaches for influence maximization on social networks. We study four combinatorial optimization problems that deal with maximizing influence at minimum cost over a social network. To our knowl- edge, all previous work to date involving influence maximization problems has focused on heuristics and approximation. We start with the following viral marketing problem that has attracted a significant amount of interest from the computer science literature. Given a social network, find a target set of customers to seed with a product. Then, a cascade will be caused by these initial adopters and other people start to adopt this product due to the influence they re- ceive from earlier adopters. The idea is to find the minimum cost that results in the entire network adopting the product. We first study a problem called the Weighted Target Set Selection (WTSS) Prob- lem. In the WTSS problem, the diffusion can take place over as many time periods as needed and a free product is given out to the individuals in the target set. Restricting the number of time periods that the diffusion takes place over to be one, we obtain a problem called the Positive Influence Dominating Set (PIDS) problem. Next, incorporating partial incentives, we consider a problem called the Least Cost Influence Problem (LCIP). The fourth problem studied is the One Time Period Least Cost Influence Problem (1TPLCIP) which is identical to the LCIP except that we restrict the number of time periods that the diffusion takes place over to be one. We apply a common research paradigm to each of these four problems. First, we work on special graphs: trees and cycles. Based on the insights we obtain from special graphs, we develop efficient methods for general graphs. On trees, first, we propose a polynomial time algorithm. More importantly, we present a tight and compact extended formulation. We also project the extended formulation onto the space of the natural vari- ables that gives the polytope on trees. Next, building upon the result for trees---we derive the polytope on cycles for the WTSS problem; as well as a polynomial time algorithm on cycles. This leads to our contribution on general graphs. For the WTSS problem and the LCIP, using the observation that the influence propagation network must be a directed acyclic graph (DAG), the strong formulation for trees can be embedded into a formulation on general graphs. We use this to design and implement a branch-and-cut approach for the WTSS problem and the LCIP. In our computational study, we are able to obtain high quality solutions for random graph instances with up to 10,000 nodes and 20,000 edges (40,000 arcs) within a reasonable amount of time.
Resumo:
Longitudinal data, where data are repeatedly observed or measured on a temporal basis of time or age provides the foundation of the analysis of processes which evolve over time, and these can be referred to as growth or trajectory models. One of the traditional ways of looking at growth models is to employ either linear or polynomial functional forms to model trajectory shape, and account for variation around an overall mean trend with the inclusion of random eects or individual variation on the functional shape parameters. The identification of distinct subgroups or sub-classes (latent classes) within these trajectory models which are not based on some pre-existing individual classification provides an important methodology with substantive implications. The identification of subgroups or classes has a wide application in the medical arena where responder/non-responder identification based on distinctly diering trajectories delivers further information for clinical processes. This thesis develops Bayesian statistical models and techniques for the identification of subgroups in the analysis of longitudinal data where the number of time intervals is limited. These models are then applied to a single case study which investigates the neuropsychological cognition for early stage breast cancer patients undergoing adjuvant chemotherapy treatment from the Cognition in Breast Cancer Study undertaken by the Wesley Research Institute of Brisbane, Queensland. Alternative formulations to the linear or polynomial approach are taken which use piecewise linear models with a single turning point, change-point or knot at a known time point and latent basis models for the non-linear trajectories found for the verbal memory domain of cognitive function before and after chemotherapy treatment. Hierarchical Bayesian random eects models are used as a starting point for the latent class modelling process and are extended with the incorporation of covariates in the trajectory profiles and as predictors of class membership. The Bayesian latent basis models enable the degree of recovery post-chemotherapy to be estimated for short and long-term followup occasions, and the distinct class trajectories assist in the identification of breast cancer patients who maybe at risk of long-term verbal memory impairment.
Resumo:
Velocity jump processes are discrete random walk models that have many applications including the study of biological and ecological collective motion. In particular, velocity jump models are often used to represent a type of persistent motion, known as a “run and tumble”, which is exhibited by some isolated bacteria cells. All previous velocity jump processes are non-interacting, which means that crowding effects and agent-to-agent interactions are neglected. By neglecting these agent-to-agent interactions, traditional velocity jump models are only applicable to very dilute systems. Our work is motivated by the fact that many applications in cell biology, such as wound healing, cancer invasion and development, often involve tissues that are densely packed with cells where cell-to-cell contact and crowding effects can be important. To describe these kinds of high cell density problems using a velocity jump process we introduce three different classes of crowding interactions into a one-dimensional model. Simulation data and averaging arguments lead to a suite of continuum descriptions of the interacting velocity jump processes. We show that the resulting systems of hyperbolic partial differential equations predict the mean behavior of the stochastic simulations very well.
Resumo:
Poisson distribution has often been used for count like accident data. Negative Binomial (NB) distribution has been adopted in the count data to take care of the over-dispersion problem. However, Poisson and NB distributions are incapable of taking into account some unobserved heterogeneities due to spatial and temporal effects of accident data. To overcome this problem, Random Effect models have been developed. Again another challenge with existing traffic accident prediction models is the distribution of excess zero accident observations in some accident data. Although Zero-Inflated Poisson (ZIP) model is capable of handling the dual-state system in accident data with excess zero observations, it does not accommodate the within-location correlation and between-location correlation heterogeneities which are the basic motivations for the need of the Random Effect models. This paper proposes an effective way of fitting ZIP model with location specific random effects and for model calibration and assessment the Bayesian analysis is recommended.
Resumo:
Statistical methods are often used to analyse commercial catch and effort data to provide standardised fishing effort and/or a relative index of fish abundance for input into stock assessment models. Achieving reliable results has proved difficult in Australia's Northern Prawn Fishery (NPF), due to a combination of such factors as the biological characteristics of the animals, some aspects of the fleet dynamics, and the changes in fishing technology. For this set of data, we compared four modelling approaches (linear models, mixed models, generalised estimating equations, and generalised linear models) with respect to the outcomes of the standardised fishing effort or the relative index of abundance. We also varied the number and form of vessel covariates in the models. Within a subset of data from this fishery, modelling correlation structures did not alter the conclusions from simpler statistical models. The random-effects models also yielded similar results. This is because the estimators are all consistent even if the correlation structure is mis-specified, and the data set is very large. However, the standard errors from different models differed, suggesting that different methods have different statistical efficiency. We suggest that there is value in modelling the variance function and the correlation structure, to make valid and efficient statistical inferences and gain insight into the data. We found that fishing power was separable from the indices of prawn abundance only when we offset the impact of vessel characteristics at assumed values from external sources. This may be due to the large degree of confounding within the data, and the extreme temporal changes in certain aspects of individual vessels, the fleet and the fleet dynamics.
Resumo:
Random walk models are often used to interpret experimental observations of the motion of biological cells and molecules. A key aim in applying a random walk model to mimic an in vitro experiment is to estimate the Fickian diffusivity (or Fickian diffusion coefficient),D. However, many in vivo experiments are complicated by the fact that the motion of cells and molecules is hindered by the presence of obstacles. Crowded transport processes have been modeled using repeated stochastic simulations in which a motile agent undergoes a random walk on a lattice that is populated by immobile obstacles. Early studies considered the most straightforward case in which the motile agent and the obstacles are the same size. More recent studies considered stochastic random walk simulations describing the motion of an agent through an environment populated by obstacles of different shapes and sizes. Here, we build on previous simulation studies by analyzing a general class of lattice-based random walk models with agents and obstacles of various shapes and sizes. Our analysis provides exact calculations of the Fickian diffusivity, allowing us to draw conclusions about the role of the size, shape and density of the obstacles, as well as examining the role of the size and shape of the motile agent. Since our analysis is exact, we calculateDdirectly without the need for random walk simulations. In summary, we find that the shape, size and density of obstacles has a major influence on the exact Fickian diffusivity. Furthermore, our results indicate that the difference in diffusivity for symmetric and asymmetric obstacles is significant.
Resumo:
Gene mapping is a systematic search for genes that affect observable characteristics of an organism. In this thesis we offer computational tools to improve the efficiency of (disease) gene-mapping efforts. In the first part of the thesis we propose an efficient simulation procedure for generating realistic genetical data from isolated populations. Simulated data is useful for evaluating hypothesised gene-mapping study designs and computational analysis tools. As an example of such evaluation, we demonstrate how a population-based study design can be a powerful alternative to traditional family-based designs in association-based gene-mapping projects. In the second part of the thesis we consider a prioritisation of a (typically large) set of putative disease-associated genes acquired from an initial gene-mapping analysis. Prioritisation is necessary to be able to focus on the most promising candidates. We show how to harness the current biomedical knowledge for the prioritisation task by integrating various publicly available biological databases into a weighted biological graph. We then demonstrate how to find and evaluate connections between entities, such as genes and diseases, from this unified schema by graph mining techniques. Finally, in the last part of the thesis, we define the concept of reliable subgraph and the corresponding subgraph extraction problem. Reliable subgraphs concisely describe strong and independent connections between two given vertices in a random graph, and hence they are especially useful for visualising such connections. We propose novel algorithms for extracting reliable subgraphs from large random graphs. The efficiency and scalability of the proposed graph mining methods are backed by extensive experiments on real data. While our application focus is in genetics, the concepts and algorithms can be applied to other domains as well. We demonstrate this generality by considering coauthor graphs in addition to biological graphs in the experiments.
Resumo:
In this paper, we study a problem of designing a multi-hop wireless network for interconnecting sensors (hereafter called source nodes) to a Base Station (BS), by deploying a minimum number of relay nodes at a subset of given potential locations, while meeting a quality of service (QoS) objective specified as a hop count bound for paths from the sources to the BS. The hop count bound suffices to ensure a certain probability of the data being delivered to the BS within a given maximum delay under a light traffic model. We observe that the problem is NP-Hard. For this problem, we propose a polynomial time approximation algorithm based on iteratively constructing shortest path trees and heuristically pruning away the relay nodes used until the hop count bound is violated. Results show that the algorithm performs efficiently in various randomly generated network scenarios; in over 90% of the tested scenarios, it gave solutions that were either optimal or were worse than optimal by just one relay. We then use random graph techniques to obtain, under a certain stochastic setting, an upper bound on the average case approximation ratio of a class of algorithms (including the proposed algorithm) for this problem as a function of the number of source nodes, and the hop count bound. To the best of our knowledge, the average case analysis is the first of its kind in the relay placement literature. Since the design is based on a light traffic model, we also provide simulation results (using models for the IEEE 802.15.4 physical layer and medium access control) to assess the traffic levels up to which the QoS objectives continue to be met. (C) 2014 Elsevier B.V. All rights reserved.
Resumo:
This paper proposes a novel experimental test procedure to estimate the reliability of structural dynamical systems under excitations specified via random process models. The samples of random excitations to be used in the test are modified by the addition of an artificial control force. An unbiased estimator for the reliability is derived based on measured ensemble of responses under these modified inputs based on the tenets of Girsanov transformation. The control force is selected so as to reduce the sampling variance of the estimator. The study observes that an acceptable choice for the control force can be made solely based on experimental techniques and the estimator for the reliability can be deduced without taking recourse to mathematical model for the structure under study. This permits the proposed procedure to be applied in the experimental study of time-variant reliability of complex structural systems that are difficult to model mathematically. Illustrative example consists of a multi-axes shake table study on bending-torsion coupled, geometrically non-linear, five-storey frame under uni/bi-axial, non-stationary, random base excitation. Copyright (c) 2014 John Wiley & Sons, Ltd.
Resumo:
Simulation of pedestrian evacuations of smart buildings in emergency is a powerful tool for building analysis, dynamic evacuation planning and real-time response to the evolving state of evacuations. Macroscopic pedestrian models are low-complexity models that are and well suited to algorithmic analysis and planning, but are quite abstract. Microscopic simulation models allow for a high level of simulation detail but can be computationally intensive. By combining micro- and macro- models we can use each to overcome the shortcomings of the other and enable new capability and applications for pedestrian evacuation simulation that would not be possible with either alone. We develop the EvacSim multi-agent pedestrian simulator and procedurally generate macroscopic flow graph models of building space, integrating micro- and macroscopic approaches to simulation of the same emergency space. By “coupling” flow graph parameters to microscopic simulation results, the graph model captures some of the higher detail and fidelity of the complex microscopic simulation model. The coupled flow graph is used for analysis and prediction of the movement of pedestrians in the microscopic simulation, and investigate the performance of dynamic evacuation planning in simulated emergencies using a variety of strategies for allocation of macroscopic evacuation routes to microscopic pedestrian agents. The predictive capability of the coupled flow graph is exploited for the decomposition of microscopic simulation space into multiple future states in a scalable manner. By simulating multiple future states of the emergency in short time frames, this enables sensing strategy based on simulation scenario pattern matching which we show to achieve fast scenario matching, enabling rich, real-time feedback in emergencies in buildings with meagre sensing capabilities.
Resumo:
Models of complex systems with n components typically have order n<sup>2</sup> parameters because each component can potentially interact with every other. When it is impractical to measure these parameters, one may choose random parameter values and study the emergent statistical properties at the system level. Many influential results in theoretical ecology have been derived from two key assumptions: that species interact with random partners at random intensities and that intraspecific competition is comparable between species. Under these assumptions, community dynamics can be described by a community matrix that is often amenable to mathematical analysis. We combine empirical data with mathematical theory to show that both of these assumptions lead to results that must be interpreted with caution. We examine 21 empirically derived community matrices constructed using three established, independent methods. The empirically derived systems are more stable by orders of magnitude than results from random matrices. This consistent disparity is not explained by existing results on predator-prey interactions. We investigate the key properties of empirical community matrices that distinguish them from random matrices. We show that network topology is less important than the relationship between a species’ trophic position within the food web and its interaction strengths. We identify key features of empirical networks that must be preserved if random matrix models are to capture the features of real ecosystems.
Predicting random level and seasonality of hotel prices. A structural equation growth curve approach
Resumo:
This article examines the effect on price of different characteristics of holiday hotels in the sun-and-beach segment, under the hedonic function perspective. Monthly prices of the majority of hotels in the Spanish continental Mediterranean coast are gathered from May to October 1999 from the tour operator catalogues. Hedonic functions are specified as random-effect models and parametrized as structural equation models with two latent variables, a random peak season price and a random width of seasonal fluctuations. Characteristics of the hotel and the region where they are located are used as predictors of both latent variables. Besides hotel category, region, distance to the beach, availability of parking place and room equipment have an effect on peak price and also on seasonality. 3- star hotels have the highest seasonality and hotels located in the southern regions the lowest, which could be explained by a warmer climate in autumn
Resumo:
The performance of a model-based diagnosis system could be affected by several uncertainty sources, such as,model errors,uncertainty in measurements, and disturbances. This uncertainty can be handled by mean of interval models.The aim of this thesis is to propose a methodology for fault detection, isolation and identification based on interval models. The methodology includes some algorithms to obtain in an automatic way the symbolic expression of the residual generators enhancing the structural isolability of the faults, in order to design the fault detection tests. These algorithms are based on the structural model of the system. The stages of fault detection, isolation, and identification are stated as constraint satisfaction problems in continuous domains and solved by means of interval based consistency techniques. The qualitative fault isolation is enhanced by a reasoning in which the signs of the symptoms are derived from analytical redundancy relations or bond graph models of the system. An initial and empirical analysis regarding the differences between interval-based and statistical-based techniques is presented in this thesis. The performance and efficiency of the contributions are illustrated through several application examples, covering different levels of complexity.
Resumo:
The human electroencephalogram (EEG) is globally characterized by a 1/f power spectrum superimposed with certain peaks, whereby the "alpha peak" in a frequency range of 8-14 Hz is the most prominent one for relaxed states of wakefulness. We present simulations of a minimal dynamical network model of leaky integrator neurons attached to the nodes of an evolving directed and weighted random graph (an Erdos-Renyi graph). We derive a model of the dendritic field potential (DFP) for the neurons leading to a simulated EEG that describes the global activity of the network. Depending on the network size, we find an oscillatory transition of the simulated EEG when the network reaches a critical connectivity. This transition, indicated by a suitably defined order parameter, is reflected by a sudden change of the network's topology when super-cycles are formed from merging isolated loops. After the oscillatory transition, the power spectra of simulated EEG time series exhibit a 1/f continuum superimposed with certain peaks. (c) 2007 Elsevier B.V. All rights reserved.