3 results for crowds, crowd sourcing

in DRUM (Digital Repository at the University of Maryland)


Relevance:

20.00%

Publisher:

Abstract:

We propose three research problems to explore the relations between trust and security in the setting of distributed computation. In the first problem, we study trust-based adversary detection in distributed consensus computation. The adversaries we consider behave arbitrarily, disobeying the consensus protocol. We propose a trust-based consensus algorithm with local and global trust evaluations. The algorithm can be abstracted using a two-layer structure with the top layer running a trust-based consensus algorithm and the bottom layer as a subroutine executing a global trust update scheme. We utilize a set of pre-trusted nodes, headers, to propagate local trust opinions throughout the network. This two-layer framework is flexible in that it can easily be extended to incorporate more complicated decision rules and global trust schemes. The first problem assumes that normal nodes are homogeneous, i.e., it is guaranteed that a normal node always behaves as it is programmed. In the second and third problems, however, we assume that nodes are heterogeneous, i.e., given a task, the probability that a node generates a correct answer varies from node to node. The adversaries considered in these two problems are workers from the open crowd who either invest little effort in the tasks assigned to them or intentionally give wrong answers to questions. In the second part of the thesis, we consider a typical crowdsourcing task that aggregates input from multiple workers as a problem in information fusion. To cope with the issue of noisy and sometimes malicious input from workers, trust is used to model workers' expertise. In a multi-domain knowledge learning task, however, using scalar-valued trust to model a worker's performance is not sufficient to reflect the worker's trustworthiness in each of the domains.
To address this issue, we propose a probabilistic model to jointly infer multi-dimensional trust of workers, multi-domain properties of questions, and true labels of questions. Our model is flexible and can be extended to incorporate metadata associated with questions. To show that, we further propose two extended models, one of which handles input tasks with real-valued features and the other handles tasks with text features by incorporating topic models. Our models can effectively recover trust vectors of workers, which can be useful for future task assignment that adapts to workers' trust. These results can be applied to the fusion of information from multiple data sources like sensors, human input, machine learning results, or a hybrid of them. In the second subproblem, we address crowdsourcing with adversaries under logical constraints. We observe that questions are often not independent in real-life applications. Instead, there are logical relations between them. Similarly, workers that provide answers are not independent of each other either. Answers given by workers with similar attributes tend to be correlated. Therefore, we propose a novel unified graphical model consisting of two layers. The top layer encodes domain knowledge which allows users to express logical relations using first-order logic rules and the bottom layer encodes a traditional crowdsourcing graphical model. Our model can be seen as a generalized probabilistic soft logic framework that encodes both logical relations and probabilistic dependencies. To solve the collective inference problem efficiently, we have devised a scalable joint inference algorithm based on the alternating direction method of multipliers. The third part of the thesis considers the problem of optimal assignment under budget constraints when workers are unreliable and sometimes malicious. In a real crowdsourcing market, each answer obtained from a worker incurs cost.
The cost is associated with both the level of trustworthiness of workers and the difficulty of tasks. Typically, access to expert-level (more trustworthy) workers is more expensive than access to the average crowd, and completion of a challenging task is more costly than a click-away question. We address the optimal assignment of heterogeneous tasks to workers of varying trust levels under budget constraints. Specifically, we design a trust-aware task allocation algorithm that takes as inputs the estimated trust of workers and a pre-set budget, and outputs the optimal assignment of tasks to workers. We derive a bound on the total error probability that relates naturally to the budget, the trustworthiness of crowds, and the cost of obtaining labels from them. Higher budget, more trustworthy crowds, and less costly jobs result in a lower theoretical bound. Our allocation scheme does not depend on the specific design of the trust evaluation component. Therefore, it can be combined with generic trust evaluation algorithms.
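
As a rough illustration of the trade-off between trust, cost, and budget described in this abstract, the sketch below picks workers greedily by evidence per unit cost and then aggregates their binary answers with log-odds weights. It is a toy under stated assumptions, not the thesis's allocation algorithm: the function names, the (trust, cost) representation, and the greedy rule are all hypothetical, and trust values are assumed to lie strictly between 0 and 1.

```python
import math

def log_odds(trust):
    """Weight of a worker whose answers are correct with probability `trust`.

    Assumes 0 < trust < 1; values at the endpoints would raise a math error.
    """
    return math.log(trust / (1.0 - trust))

def select_workers(workers, budget):
    """Greedily pick workers by log-odds weight per unit cost within `budget`.

    `workers` maps a worker id to a hypothetical (trust, cost) pair.
    Workers no better than chance (trust <= 0.5) are skipped.
    """
    ranked = sorted(
        workers,
        key=lambda w: log_odds(workers[w][0]) / workers[w][1],
        reverse=True,
    )
    chosen, spent = [], 0.0
    for w in ranked:
        trust, cost = workers[w]
        if trust > 0.5 and spent + cost <= budget:
            chosen.append(w)
            spent += cost
    return chosen

def weighted_vote(answers, workers):
    """Aggregate binary answers (0 or 1) by summing signed log-odds weights."""
    score = sum(
        log_odds(workers[w][0]) * (1 if a == 1 else -1)
        for w, a in answers.items()
    )
    return 1 if score >= 0 else 0

# Hypothetical crowd: one costly expert and two cheap, less reliable workers.
workers = {"expert": (0.9, 3.0), "a": (0.6, 1.0), "b": (0.6, 1.0)}
chosen = select_workers(workers, budget=4.0)
label = weighted_vote({"expert": 1, "a": 0, "b": 0}, workers)
```

In this example the expert's single answer outweighs the two less reliable workers who disagree, which a plain majority vote would not capture.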

Relevance:

10.00%

Publisher:

Abstract:

In a microscopic setting, humans behave in rich and unexpected ways. In a macroscopic setting, however, distinctive patterns of group behavior emerge, leading statistical physicists to search for an underlying mechanism. The aim of this dissertation is to analyze the macroscopic patterns of competing ideas in order to discern the mechanics of how group opinions form at the microscopic level. First, we explore the competition of answers in online Q&A (question and answer) boards. We find that a simple individual-level model can capture important features of user behavior, especially as the number of answers to a question grows. Our model further suggests that the wisdom of crowds may be constrained by information overload, in which users are unable to thoroughly evaluate each answer and therefore tend to use heuristics to pick what they believe is the best answer. Next, we explore models of opinion spread among voters to explain observed universal statistical patterns such as rescaled vote distributions and logarithmic vote correlations. We introduce a simple model that can explain both properties, as well as why it takes so long for large groups to reach consensus. An important feature of the model that facilitates agreement with data is that individuals become more stubborn (unwilling to change their opinion) over time. Finally, we explore potential underlying mechanisms for opinion formation in juries, by comparing data to various types of models. We find that different null hypotheses in which jurors do not interact when reaching a decision are in strong disagreement with data compared to a simple interaction model. These findings provide conceptual and mechanistic support for previous work that has found mutual influence can play a large role in group decisions. In addition, by matching our models to data, we are able to infer the time scales over which individuals change their opinions for different jury contexts. 
We find that these values increase as a function of the trial time, suggesting that jurors and judicial panels exhibit a kind of stubbornness similar to what we include in our model of voting behavior.
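
The stubbornness mechanism mentioned in this abstract can be illustrated with a toy voter-style simulation. This is only a sketch under assumptions, not the dissertation's model: the fully mixed population, the 1/held_for switching probability, and all names are hypothetical choices made for illustration.

```python
import random

def simulate(n_agents, n_steps, seed=0):
    """Toy voter model in which agents grow more stubborn over time.

    At each step a random agent i looks at a random agent j. If they
    disagree, i adopts j's opinion with probability 1 / held_for[i],
    where held_for[i] counts how many times i has been selected since
    it last changed its opinion (an assumed form of stubbornness).
    """
    rng = random.Random(seed)
    opinions = [rng.randint(0, 1) for _ in range(n_agents)]
    held_for = [0] * n_agents
    for _ in range(n_steps):
        i = rng.randrange(n_agents)
        j = rng.randrange(n_agents)
        held_for[i] += 1  # always >= 1 before it is used as a divisor
        if opinions[i] != opinions[j] and rng.random() < 1.0 / held_for[i]:
            opinions[i] = opinions[j]
            held_for[i] = 0  # a fresh opinion is easy to change again
    return opinions

final = simulate(n_agents=50, n_steps=2000, seed=1)
share_of_ones = sum(final) / len(final)
```

Because the switching probability shrinks the longer an opinion is held, disagreement tends to persist longer than in a model with a fixed switching probability, which is the qualitative effect the abstract describes.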

Relevance:

10.00%

Publisher:

Abstract:

Prior research shows that electronic word of mouth (eWOM) wields considerable influence over consumer behavior. However, as the volume and variety of eWOM grow, firms are faced with challenges in analyzing and responding to this information. In this dissertation, I argue that to meet the new challenges and opportunities posed by the expansion of eWOM and to more accurately measure its impacts on firms and consumers, we need to revisit our methodologies for extracting insights from eWOM. This dissertation consists of three essays that further our understanding of the value of social media analytics, especially with respect to eWOM. In the first essay, I use machine learning techniques to extract semantic structure from online reviews. These semantic dimensions describe the experiences of consumers in the service industry more accurately than traditional numerical variables. To demonstrate the value of these dimensions, I show that they can be used to substantially improve the accuracy of econometric models of firm survival. In the second essay, I explore the effects on eWOM of online deals, such as those offered by Groupon, the value of which to both consumers and merchants is controversial. Through a combination of Bayesian econometric models and controlled lab experiments, I examine the conditions under which online deals affect online reviews and provide strategies to mitigate the potential negative eWOM effects resulting from online deals. In the third essay, I focus on how eWOM can be incorporated into efforts to reduce foodborne illness, a major public health concern. I demonstrate how machine learning techniques can be used to monitor hygiene in restaurants through crowd-sourced online reviews. I am able to identify instances of moral hazard within the hygiene inspection scheme used in New York City by leveraging a dictionary specifically crafted for this purpose.
To the extent that online reviews provide some visibility into the hygiene practices of restaurants, I show how losses from information asymmetry may be partially mitigated in this context. Taken together, this dissertation contributes by revisiting and refining the use of eWOM in the service sector through a combination of machine learning and econometric methodologies.
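
A dictionary-based pass over review text, of the general kind mentioned in this abstract, might look like the sketch below. The term list and function name are hypothetical and chosen only for illustration; the dissertation's actual dictionary and its machine learning pipeline are not given in the abstract.

```python
import re

# Hypothetical hygiene dictionary; the real one is not reproduced here.
HYGIENE_TERMS = {"dirty", "roach", "mold", "food poisoning", "sick"}

def hygiene_hits(review):
    """Return the dictionary terms that appear as whole words in a review."""
    text = review.lower()
    return sorted(
        term
        for term in HYGIENE_TERMS
        if re.search(r"\b" + re.escape(term) + r"\b", text)
    )

hits = hygiene_hits("Got food poisoning; the kitchen looked dirty.")
```

Whole-word matching keeps the sketch simple but misses inflected forms such as plurals, so a practical version would likely need stemming or a richer term list.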