2 resultados para Security constraints
em DRUM (Digital Repository at the University of Maryland)
Resumo:
We propose three research problems to explore the relations between trust and security in the setting of distributed computation. In the first problem, we study trust-based adversary detection in distributed consensus computation. The adversaries we consider behave arbitrarily disobeying the consensus protocol. We propose a trust-based consensus algorithm with local and global trust evaluations. The algorithm can be abstracted using a two-layer structure with the top layer running a trust-based consensus algorithm and the bottom layer as a subroutine executing a global trust update scheme. We utilize a set of pre-trusted nodes, headers, to propagate local trust opinions throughout the network. This two-layer framework is flexible in that it can be easily extensible to contain more complicated decision rules, and global trust schemes. The first problem assumes that normal nodes are homogeneous, i.e. it is guaranteed that a normal node always behaves as it is programmed. In the second and third problems however, we assume that nodes are heterogeneous, i.e, given a task, the probability that a node generates a correct answer varies from node to node. The adversaries considered in these two problems are workers from the open crowd who are either investing little efforts in the tasks assigned to them or intentionally give wrong answers to questions. In the second part of the thesis, we consider a typical crowdsourcing task that aggregates input from multiple workers as a problem in information fusion. To cope with the issue of noisy and sometimes malicious input from workers, trust is used to model workers' expertise. In a multi-domain knowledge learning task, however, using scalar-valued trust to model a worker's performance is not sufficient to reflect the worker's trustworthiness in each of the domains. To address this issue, we propose a probabilistic model to jointly infer multi-dimensional trust of workers, multi-domain properties of questions, and true labels of questions. Our model is very flexible and extensible to incorporate metadata associated with questions. To show that, we further propose two extended models, one of which handles input tasks with real-valued features and the other handles tasks with text features by incorporating topic models. Our models can effectively recover trust vectors of workers, which can be very useful in task assignment adaptive to workers' trust in the future. These results can be applied for fusion of information from multiple data sources like sensors, human input, machine learning results, or a hybrid of them. In the second subproblem, we address crowdsourcing with adversaries under logical constraints. We observe that questions are often not independent in real life applications. Instead, there are logical relations between them. Similarly, workers that provide answers are not independent of each other either. Answers given by workers with similar attributes tend to be correlated. Therefore, we propose a novel unified graphical model consisting of two layers. The top layer encodes domain knowledge which allows users to express logical relations using first-order logic rules and the bottom layer encodes a traditional crowdsourcing graphical model. Our model can be seen as a generalized probabilistic soft logic framework that encodes both logical relations and probabilistic dependencies. To solve the collective inference problem efficiently, we have devised a scalable joint inference algorithm based on the alternating direction method of multipliers. The third part of the thesis considers the problem of optimal assignment under budget constraints when workers are unreliable and sometimes malicious. In a real crowdsourcing market, each answer obtained from a worker incurs cost. The cost is associated with both the level of trustworthiness of workers and the difficulty of tasks. Typically, access to expert-level (more trustworthy) workers is more expensive than to average crowd and completion of a challenging task is more costly than a click-away question. In this problem, we address the problem of optimal assignment of heterogeneous tasks to workers of varying trust levels with budget constraints. Specifically, we design a trust-aware task allocation algorithm that takes as inputs the estimated trust of workers and pre-set budget, and outputs the optimal assignment of tasks to workers. We derive the bound of total error probability that relates to budget, trustworthiness of crowds, and costs of obtaining labels from crowds naturally. Higher budget, more trustworthy crowds, and less costly jobs result in a lower theoretical bound. Our allocation scheme does not depend on the specific design of the trust evaluation component. Therefore, it can be combined with generic trust evaluation algorithms.
Resumo:
In this work we introduce a new mathematical tool for optimization of routes, topology design, and energy efficiency in wireless sensor networks. We introduce a vector field formulation that models communication in the network, and routing is performed in the direction of this vector field at every location of the network. The magnitude of the vector field at every location represents the density of amount of data that is being transited through that location. We define the total communication cost in the network as the integral of a quadratic form of the vector field over the network area. With the above formulation, we introduce a mathematical machinery based on partial differential equations very similar to the Maxwell's equations in electrostatic theory. We show that in order to minimize the cost, the routes should be found based on the solution of these partial differential equations. In our formulation, the sensors are sources of information, and they are similar to the positive charges in electrostatics, the destinations are sinks of information and they are similar to negative charges, and the network is similar to a non-homogeneous dielectric media with variable dielectric constant (or permittivity coefficient). In one of the applications of our mathematical model based on the vector fields, we offer a scheme for energy efficient routing. Our routing scheme is based on changing the permittivity coefficient to a higher value in the places of the network where nodes have high residual energy, and setting it to a low value in the places of the network where the nodes do not have much energy left. Our simulations show that our method gives a significant increase in the network life compared to the shortest path and weighted shortest path schemes. Our initial focus is on the case where there is only one destination in the network, and later we extend our approach to the case where there are multiple destinations in the network. In the case of having multiple destinations, we need to partition the network into several areas known as regions of attraction of the destinations. Each destination is responsible for collecting all messages being generated in its region of attraction. The complexity of the optimization problem in this case is how to define regions of attraction for the destinations and how much communication load to assign to each destination to optimize the performance of the network. We use our vector field model to solve the optimization problem for this case. We define a vector field, which is conservative, and hence it can be written as the gradient of a scalar field (also known as a potential field). Then we show that in the optimal assignment of the communication load of the network to the destinations, the value of that potential field should be equal at the locations of all the destinations. Another application of our vector field model is to find the optimal locations of the destinations in the network. We show that the vector field gives the gradient of the cost function with respect to the locations of the destinations. Based on this fact, we suggest an algorithm to be applied during the design phase of a network to relocate the destinations for reducing the communication cost function. The performance of our proposed schemes is confirmed by several examples and simulation experiments. In another part of this work we focus on the notions of responsiveness and conformance of TCP traffic in communication networks. We introduce the notion of responsiveness for TCP aggregates and define it as the degree to which a TCP aggregate reduces its sending rate to the network as a response to packet drops. We define metrics that describe the responsiveness of TCP aggregates, and suggest two methods for determining the values of these quantities. The first method is based on a test in which we drop a few packets from the aggregate intentionally and measure the resulting rate decrease of that aggregate. This kind of test is not robust to multiple simultaneous tests performed at different routers. We make the test robust to multiple simultaneous tests by using ideas from the CDMA approach to multiple access channels in communication theory. Based on this approach, we introduce tests of responsiveness for aggregates, and call it CDMA based Aggregate Perturbation Method (CAPM). We use CAPM to perform congestion control. A distinguishing feature of our congestion control scheme is that it maintains a degree of fairness among different aggregates. In the next step we modify CAPM to offer methods for estimating the proportion of an aggregate of TCP traffic that does not conform to protocol specifications, and hence may belong to a DDoS attack. Our methods work by intentionally perturbing the aggregate by dropping a very small number of packets from it and observing the response of the aggregate. We offer two methods for conformance testing. In the first method, we apply the perturbation tests to SYN packets being sent at the start of the TCP 3-way handshake, and we use the fact that the rate of ACK packets being exchanged in the handshake should follow the rate of perturbations. In the second method, we apply the perturbation tests to the TCP data packets and use the fact that the rate of retransmitted data packets should follow the rate of perturbations. In both methods, we use signature based perturbations, which means packet drops are performed with a rate given by a function of time. We use analogy of our problem with multiple access communication to find signatures. Specifically, we assign orthogonal CDMA based signatures to different routers in a distributed implementation of our methods. As a result of orthogonality, the performance does not degrade because of cross interference made by simultaneously testing routers. We have shown efficacy of our methods through mathematical analysis and extensive simulation experiments.