918 results for symbolic computation
Abstract:
A class of multi-process models is developed for collections of time-indexed count data. Autocorrelation in counts is achieved with dynamic models for the natural parameter of the binomial distribution. In addition to modeling binomial time series, the framework includes dynamic models for multinomial and Poisson time series. Markov chain Monte Carlo (MCMC) and Pólya-Gamma data augmentation (Polson et al., 2013) are critical for fitting multi-process models of counts. To facilitate computation when the counts are high, a Gaussian approximation to the Pólya-Gamma random variable is developed.
Three applied analyses are presented to explore the utility and versatility of the framework. The first analysis develops a model for complex dynamic behavior of themes in collections of text documents. Documents are modeled as a “bag of words”, and the multinomial distribution is used to characterize uncertainty in the vocabulary terms appearing in each document. State-space models for the natural parameters of the multinomial distribution induce autocorrelation in themes and their proportional representation in the corpus over time.
The second analysis develops a dynamic mixed membership model for Poisson counts. The model is applied to a collection of time series that record neuron-level firing patterns in rhesus monkeys. The monkey is exposed to two sounds simultaneously, and Gaussian processes are used to smoothly model the time-varying rate at which the neuron’s firing pattern fluctuates between features associated with each sound in isolation.
The third analysis presents a switching dynamic generalized linear model for the time-varying home run totals of professional baseball players. The model endows each player with an age-specific latent natural ability class and a performance-enhancing drug (PED) use indicator. As players age, they randomly transition through a sequence of ability classes in a manner consistent with traditional aging patterns. When a player's performance deviates significantly from the expected aging pattern, he is identified as a player whose performance is consistent with PED use.
All three models provide a mechanism for sharing information across related series locally in time. The models are fit with variations on the Pólya-Gamma Gibbs sampler, MCMC convergence diagnostics are developed, and reproducible inference is emphasized throughout the dissertation.
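One plausible reading of the Gaussian approximation mentioned in this abstract is moment matching: replace a PG(b, c) draw with a normal draw that has the same mean and variance. The sketch below follows that reading only; the abstract does not say how the dissertation constructs its approximation, and the function names `pg_moments` and `sample_pg_gaussian` are hypothetical. The closed-form moments are the standard ones for the Pólya-Gamma distribution.

```python
import math
import random


def pg_moments(b, c):
    """Mean and variance of a Polya-Gamma PG(b, c) random variable."""
    if abs(c) < 1e-3:
        # Limiting values as c -> 0; avoids cancellation in sinh(c) - c.
        return b / 4.0, b / 24.0
    mean = b / (2.0 * c) * math.tanh(c / 2.0)
    var = b / (4.0 * c ** 3) * (math.sinh(c) - c) / math.cosh(c / 2.0) ** 2
    return mean, var


def sample_pg_gaussian(b, c, rng=random):
    """Moment-matched normal draw standing in for PG(b, c).

    Assumption: this is only reasonable when b (the count) is large,
    so that the normal draw is positive with overwhelming probability.
    """
    mean, var = pg_moments(b, c)
    return rng.gauss(mean, math.sqrt(var))


if __name__ == "__main__":
    rng = random.Random(0)
    print(pg_moments(200, 1.0))
    print(sample_pg_gaussian(200, 1.0, rng))
```

Because a PG(b, c) variable with integer b is a sum of b independent PG(1, c) variables, the central limit theorem motivates a normal stand-in when b is large, which matches the "counts are high" setting described above.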
Abstract:
Uncertainty quantification (UQ) is both an old and a new concept. The current novelty lies in the interaction and synthesis of mathematical models, computer experiments, statistics, field/real experiments, and probability theory, with a particular emphasis on large-scale simulations by computer models. The challenges come not only from the complexity of the scientific questions but also from the size of the information. The focus of this thesis is to provide statistical models that are scalable to the massive data produced in computer experiments and real experiments, through fast and robust statistical inference.
Chapter 2 provides a practical approach for simultaneously emulating/approximating a massive number of functions, with an application to hazard quantification for the Soufrière Hills volcano on the island of Montserrat. Chapter 3 discusses another problem with massive data, in which the number of observations of a function is large. An exact algorithm that is linear in time is developed for the problem of interpolating methylation levels. Chapter 4 and Chapter 5 both concern robust inference for the models. Chapter 4 provides new criteria for robust parameter estimation, and several methods of inference are shown to satisfy these criteria. Chapter 5 develops a new prior that satisfies some additional criteria and is therefore proposed for use in practice.
Abstract:
In this paper, we describe a decentralized privacy-preserving protocol for securely casting trust ratings in distributed reputation systems. Our protocol allows n participants to cast their votes in a way that preserves the privacy of individual values against both internal and external attacks. The protocol is coupled with an extensive theoretical analysis in which we formally prove that it is resistant to collusion by as many as n-1 corrupted nodes in the semi-honest model. The behavior of our protocol is tested in a real P2P network by measuring its communication delay and processing overhead. The experimental results show the advantages of our protocol over previous work in the area: without sacrificing security, our decentralized protocol is shown to be almost one order of magnitude faster than the previous best protocol for providing anonymous feedback.
Abstract:
Design embeds ideas in communication and artefacts in subtle and psychologically powerful ways. Sociologist Pierre Bourdieu coined the term ‘symbolic violence’ to describe how powerful ideologies, priorities, values and even sensibilities are constructed and reproduced through cultural institutions, processes and practices. Through symbolic violence, individuals learn to consider unjust conditions as natural and even come to value customs and ideas that are oppressive. Symbolic violence normalises structural violence and enables real violence to take place, often preceding it and later justifying it. Feminist, class, race and indigenous scholars and activists describe how oppressions (how patriarchy, racism, colonialism, etc.) exist within institutions and structures, and also within cultural practices that embed ideologies into everyday life. The theory of symbolic violence sheds light on how design can function to naturalise oppressions and then obfuscate power relations around this process. Through symbolic violence, design can function as an enabler for the exploitation of certain groups of people and the environment they (and ultimately ‘we’) depend on to live. Design functions as symbolic violence when it is involved with the creation and reproduction of ideas, practices, tools and processes that result in structural and other types of violence (including ecocide). Breaking symbolic violence involves discovering how it works and building capacities to challenge and transform dysfunctional ideologies, structures and institutions. This conversation will give participants an opportunity to discuss, critique and/or develop the theory of design as symbolic violence as a basis for the development of design strategies for social justice.
Abstract:
Collecting data via a questionnaire and analyzing them while preserving respondents’ privacy may increase the number of respondents and the truthfulness of their responses. It may also reduce the systematic differences between respondents and non-respondents. In this paper, we propose a privacy-preserving method for collecting and analyzing survey responses using secure multi-party computation (SMC). The method is secure under the semi-honest adversarial model. The proposed method computes a wide variety of statistics. Total and stratified statistical counts are computed using the secure protocols developed in this paper. Then, additional statistics, such as a contingency table, a chi-square test, an odds ratio, and logistic regression, are computed within the R statistical environment using the statistical counts as building blocks. The method was evaluated on a questionnaire dataset of 3,158 respondents sampled for a medical study and simulated questionnaire datasets of up to 50,000 respondents. The computation time for the statistical analyses linearly scales as the number of respondents increases. The results show that the method is efficient and scalable for practical use. It can also be used for other applications in which categorical data are collected.
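To illustrate how a statistical count can be computed without any single party seeing an individual response, the sketch below uses additive secret sharing, a standard building block in secure multi-party computation under the semi-honest model. It is not the protocol from the paper, which the abstract does not specify; the names `share` and `secure_count`, the modulus, and the choice of three computing parties are all assumptions made for illustration.

```python
import secrets

# Assumed modulus; any value much larger than the number of respondents works.
MODULUS = 2 ** 61 - 1


def share(value, n_parties):
    """Split an integer into n_parties additive shares modulo MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    last = (value - sum(shares)) % MODULUS
    return shares + [last]


def secure_count(indicators, n_parties=3):
    """Sum 0/1 survey indicators so that no party sees an individual answer.

    Each respondent sends one share to each party. A party only ever holds
    uniformly random-looking values; the count appears when the per-party
    totals are combined.
    """
    party_totals = [0] * n_parties
    for value in indicators:
        for party, s in enumerate(share(value, n_parties)):
            party_totals[party] = (party_totals[party] + s) % MODULUS
    return sum(party_totals) % MODULUS


if __name__ == "__main__":
    answers = [1, 0, 1, 1, 0]  # hypothetical yes/no responses
    print(secure_count(answers))
```

Counts obtained this way, for example the four cells of a 2x2 contingency table, could then be passed to ordinary statistical software, which mirrors the two-stage design the abstract describes.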
Abstract:
The main goal of this thesis is to discuss the determination of homological invariants of polynomial ideals. To this end, we consider different coordinate systems and analyze their significance for the computation of certain invariants. In particular, we provide an algorithm that transforms any ideal into strongly stable position if char k = 0. With a slight modification, this algorithm can also be used to achieve a stable or quasi-stable position. If our field has positive characteristic, the Borel-fixed position is the most we can obtain with our method. Further, we present some applications of Pommaret bases, where we focus on how to read off invariants directly from this basis. In the second half of this dissertation we take a closer look at another homological invariant, namely the (absolute) reduction number. It is a known fact that one immediately obtains the reduction number from the basis of the generic initial ideal. However, we show that it is not possible to formulate an algorithm, based on analyzing only the leading ideal, that transforms an ideal into a position that allows us to obtain this invariant directly from the leading ideal. So in general we cannot read off the reduction number from a Pommaret basis. This result motivates a deeper investigation of which properties a coordinate system must possess so that we can determine the reduction number easily, i.e. by analyzing the leading ideal. This approach leads to the introduction of some generalized versions of the mentioned stable positions, such as the weakly D-stable or weakly D-minimal stable position. The latter represents a coordinate system that allows the reduction number to be determined without any further computations. Finally, we introduce the notion of β-maximal position, which has many interesting algebraic properties.
In particular, in combination with the weakly D-stable position, this position is sufficient for the weakly D-minimal stable position and so possesses a connection to the reduction number.
Abstract:
This work reports an alternative method for computing the trajectory of a single non-relativistic charged particle in 2D electrostatic or magnetostatic fields. The trajectory is computed analytically, piece by piece, treating the field as constant within each finite element. This method has some advantages over numerical integration: numerical miscomputation of trajectories and stability problems can be avoided. Among the examples presented in this paper, an interesting alternative approach to positive ion extraction from cyclotrons, using stripping foils, is shown. Other particle optics devices, such as beam bending devices and spectrometers, can benefit from a method such as the one proposed in this paper. The method can be extended to particle trajectory computation in 3D domains.
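The piecewise-analytical idea can be sketched with the textbook closed-form solutions for motion in a constant field, applied for the time the particle spends inside one element. This is a minimal illustration under stated assumptions, not the paper's implementation: the function names are hypothetical, the magnetic field is taken perpendicular to the 2D plane, and the step that finds the time at which the particle leaves an element is omitted.

```python
import math


def step_electric(state, q_over_m, ex, ey, t):
    """Exact motion for time t in a constant electric field (ex, ey).

    state is (x, y, vx, vy); the result is the new state.
    """
    x, y, vx, vy = state
    ax, ay = q_over_m * ex, q_over_m * ey
    return (x + vx * t + 0.5 * ax * t * t,
            y + vy * t + 0.5 * ay * t * t,
            vx + ax * t,
            vy + ay * t)


def step_magnetic(state, q_over_m, bz, t):
    """Exact motion for time t in a constant magnetic field bz along z.

    Solves dvx/dt = w * vy, dvy/dt = -w * vx with w = q_over_m * bz,
    which is a rotation of the velocity vector at angular rate w.
    """
    x, y, vx, vy = state
    w = q_over_m * bz
    if w == 0.0:
        return (x + vx * t, y + vy * t, vx, vy)  # field-free straight line
    s, c = math.sin(w * t), math.cos(w * t)
    return (x + (vx * s + vy * (1.0 - c)) / w,
            y + (vx * (c - 1.0) + vy * s) / w,
            vx * c + vy * s,
            -vx * s + vy * c)


if __name__ == "__main__":
    start = (0.0, 0.0, 1.0, 0.0)
    print(step_magnetic(start, 1.0, 2.0, math.pi))  # one full gyration
```

Because each step is exact for the field value used, the particle's speed in the magnetic case is conserved up to rounding regardless of the step length, which is the stability advantage over step-by-step numerical integration noted in the abstract.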
Abstract:
This chapter presents an exploratory study involving a group of athletic shoe enthusiasts and their feelings towards customized footwear. These "sneakerheads" demonstrate their infatuation with sneakers via activities ranging from creating catalogs of custom shoes to buying and selling rare athletic footwear online. The key characteristic these individuals share is that, for them, athletic shoes are a fundamental fashion accessory steeped in symbolism and meaning. A series of in-depth interviews utilizing the Zaltman Metaphor Elicitation Technique (ZMET) provides a better understanding of how issues such as art, self-expression, exclusivity, peer recognition, and counterfeit goods interact with the mass customization of symbolic products by category experts.
Abstract:
We present Dithen, a novel computation-as-a-service (CaaS) cloud platform specifically tailored to the parallel execution of large-scale multimedia tasks. Dithen handles the upload/download of both multimedia data and executable items, the assignment of compute units to multimedia workloads, and the reactive control of the available compute units to minimize the cloud infrastructure cost under deadline-abiding execution. Dithen combines three key properties: (i) the reactive assignment of individual multimedia tasks to available computing units according to availability and predetermined time-to-completion constraints; (ii) optimal resource estimation based on Kalman-filter estimates; (iii) the use of additive increase multiplicative decrease (AIMD) algorithms (well known as the resource management mechanism in the Transmission Control Protocol) for the control of the number of units servicing workloads. The deployment of Dithen over Amazon EC2 spot instances is shown to be capable of processing more than 80,000 video transcoding, face detection and image processing tasks (equivalent to the processing of more than 116 GB of compressed data) for less than $1 in billing cost from EC2. Moreover, the proposed AIMD-based control mechanism, in conjunction with the Kalman estimates, is shown to provide a reduction of more than 27% in EC2 spot instance cost against methods based on reactive resource estimation. Finally, Dithen is shown to offer a 38% to 500% reduction of the billing cost against the current state-of-the-art in CaaS platforms on Amazon EC2 (Amazon Lambda and Amazon Autoscale). A baseline version of Dithen is currently available at dithen.com.
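The AIMD rule named in property (iii) can be sketched generically: grow the number of compute units by a fixed amount while no back-off signal is present, and cut it by a fixed factor when one is. The code below is a hedged illustration only; how Dithen derives its control signal from the Kalman estimates is not stated in the abstract, and the names `aimd_step` and `run_aimd`, the parameters, and their default values are hypothetical.

```python
def aimd_step(units, backoff, add=1, mult=0.5, lo=1, hi=64):
    """One additive-increase / multiplicative-decrease update.

    units   -- current number of compute units
    backoff -- True when the controller should release capacity
               (assumed to come from some external estimator)
    """
    proposed = int(units * mult) if backoff else units + add
    return max(lo, min(hi, proposed))


def run_aimd(units, signals, **params):
    """Apply aimd_step over a sequence of back-off signals; return the trace."""
    trace = []
    for backoff in signals:
        units = aimd_step(units, backoff, **params)
        trace.append(units)
    return trace


if __name__ == "__main__":
    print(run_aimd(4, [False, False, True, False]))  # [5, 6, 3, 4]
```

The asymmetry is the point of the design: capacity is added cautiously and released quickly, which yields the characteristic sawtooth pattern also seen in TCP congestion windows.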
Abstract:
We propose three research problems to explore the relations between trust and security in the setting of distributed computation. In the first problem, we study trust-based adversary detection in distributed consensus computation. The adversaries we consider behave arbitrarily, disobeying the consensus protocol. We propose a trust-based consensus algorithm with local and global trust evaluations. The algorithm can be abstracted using a two-layer structure with the top layer running a trust-based consensus algorithm and the bottom layer as a subroutine executing a global trust update scheme. We utilize a set of pre-trusted nodes, headers, to propagate local trust opinions throughout the network. This two-layer framework is flexible in that it can easily be extended to contain more complicated decision rules and global trust schemes. The first problem assumes that normal nodes are homogeneous, i.e., it is guaranteed that a normal node always behaves as it is programmed. In the second and third problems, however, we assume that nodes are heterogeneous, i.e., given a task, the probability that a node generates a correct answer varies from node to node. The adversaries considered in these two problems are workers from the open crowd who either invest little effort in the tasks assigned to them or intentionally give wrong answers to questions. In the second part of the thesis, we consider a typical crowdsourcing task that aggregates input from multiple workers as a problem in information fusion. To cope with the issue of noisy and sometimes malicious input from workers, trust is used to model workers' expertise. In a multi-domain knowledge learning task, however, using scalar-valued trust to model a worker's performance is not sufficient to reflect the worker's trustworthiness in each of the domains.
To address this issue, we propose a probabilistic model to jointly infer multi-dimensional trust of workers, multi-domain properties of questions, and true labels of questions. Our model is very flexible and extensible to incorporate metadata associated with questions. To show that, we further propose two extended models, one of which handles input tasks with real-valued features and the other handles tasks with text features by incorporating topic models. Our models can effectively recover trust vectors of workers, which can be very useful in task assignment adaptive to workers' trust in the future. These results can be applied for fusion of information from multiple data sources like sensors, human input, machine learning results, or a hybrid of them. In the second subproblem, we address crowdsourcing with adversaries under logical constraints. We observe that questions are often not independent in real life applications. Instead, there are logical relations between them. Similarly, workers that provide answers are not independent of each other either. Answers given by workers with similar attributes tend to be correlated. Therefore, we propose a novel unified graphical model consisting of two layers. The top layer encodes domain knowledge which allows users to express logical relations using first-order logic rules and the bottom layer encodes a traditional crowdsourcing graphical model. Our model can be seen as a generalized probabilistic soft logic framework that encodes both logical relations and probabilistic dependencies. To solve the collective inference problem efficiently, we have devised a scalable joint inference algorithm based on the alternating direction method of multipliers. The third part of the thesis considers the problem of optimal assignment under budget constraints when workers are unreliable and sometimes malicious. In a real crowdsourcing market, each answer obtained from a worker incurs cost. 
The cost is associated with both the level of trustworthiness of workers and the difficulty of tasks. Typically, access to expert-level (more trustworthy) workers is more expensive than access to the average crowd, and completion of a challenging task is more costly than a click-away question. Here, we address the optimal assignment of heterogeneous tasks to workers of varying trust levels under budget constraints. Specifically, we design a trust-aware task allocation algorithm that takes as inputs the estimated trust of workers and a pre-set budget, and outputs the optimal assignment of tasks to workers. We derive a bound on the total error probability that relates naturally to the budget, the trustworthiness of the crowds, and the costs of obtaining labels from them. A higher budget, more trustworthy crowds, and less costly jobs result in a lower theoretical bound. Our allocation scheme does not depend on the specific design of the trust evaluation component. Therefore, it can be combined with generic trust evaluation algorithms.
Abstract:
This paper is part of the project “Adaptive thinking and flexible computation: Critical issues”. We discuss different perspectives on flexibility and adaptive thinking in the literature. We also discuss the idea of proceptual thinking and why it is important to our perspective on adaptive thinking. The paper analyses a situation, called “the day number”, developed in a first-grade classroom with its teacher. It is a daily activity at the beginning of the school day: pupils find the number of the date and think about different ways of writing it using the four arithmetic operations. The analyzed activity took place on March 19, so the challenge was to write 19 in several ways. The data show the pupils’ enthusiasm and their efforts to find different ways of writing the number. Some used large numbers and division, which they were just starting to learn. The pupils presented symbolic expressions of 19, decomposing and recomposing it in a flexible manner.
Abstract:
This paper is concerned with the discontinuous Galerkin approximation of the Maxwell eigenproblem. After reviewing the theory developed in [5], we present a set of numerical experiments which both validate the theory, and provide further insight regarding the practical performance of discontinuous Galerkin methods, particularly in the case when non-conforming meshes, characterized by the presence of hanging nodes, are employed.
Abstract:
Computers employing some degree of data flow organisation are now well established as providing a possible vehicle for concurrent computation. Although data-driven computation frees the architecture from the constraints of the single program counter, processor and global memory, inherent in the classic von Neumann computer, there can still be problems with the unconstrained generation of fresh result tokens if a pure data flow approach is adopted. The advantages of allowing serial processing for those parts of a program which are inherently serial, and of permitting a demand-driven, as well as data-driven, mode of operation are identified and described. The MUSE machine described here is a structured architecture supporting both serial and parallel processing which allows the abstract structure of a program to be mapped onto the machine in a logical way.
Abstract:
Symbolic execution is a powerful program analysis technique, but it is very challenging to apply to programs built using event-driven frameworks, such as Android. The main reason is that the framework code itself is too complex to symbolically execute. The standard solution is to manually create a framework model that is simpler and more amenable to symbolic execution. However, developing and maintaining such a model by hand is difficult and error-prone. We claim that we can leverage program synthesis to introduce a high degree of automation to the process of framework modeling. To support this thesis, we present three pieces of work. First, we introduced SymDroid, a symbolic executor for Android. While Android apps are written in Java, they are compiled to Dalvik bytecode format. Instead of analyzing an app’s Java source, which may not be available, or decompiling from Dalvik back to Java, which requires significant engineering effort and introduces yet another source of potential bugs in an analysis, SymDroid works directly on Dalvik bytecode. Second, we introduced Pasket, a new system that takes a first step toward automatically generating Java framework models to support symbolic execution. Pasket takes as input the framework API and tutorial programs that exercise the framework. From these artifacts and Pasket's internal knowledge of design patterns, Pasket synthesizes an executable framework model by instantiating design patterns, such that the behavior of a synthesized model on the tutorial programs matches that of the original framework. Lastly, in order to scale program synthesis to framework models, we devised adaptive concretization, a novel program synthesis algorithm that combines the best of the two major synthesis strategies: symbolic search, i.e., using SAT or SMT solvers, and explicit search, e.g., stochastic enumeration of possible solutions.
Adaptive concretization parallelizes multiple sub-synthesis problems by partially concretizing highly influential unknowns in the original synthesis problem. Thanks to adaptive concretization, Pasket can generate a large-scale model, e.g., thousands of lines of code. In addition, we have used an Android model synthesized by Pasket and found that the model is sufficient to allow SymDroid to execute a range of apps.