3 resultados para Combinatorial optimization

em DRUM (Digital Repository at the University of Maryland)


Relevância:

60.00% 60.00%

Publicador:

Resumo:

Datacenters have emerged as the dominant form of computing infrastructure over the last two decades. The tremendous increase in the requirements of data analysis has led to a proportional increase in power consumption and datacenters are now one of the fastest growing electricity consumers in the United States. Another rising concern is the loss of throughput due to network congestion. Scheduling models that do not explicitly account for data placement may lead to a transfer of large amounts of data over the network causing unacceptable delays. In this dissertation, we study different scheduling models that are inspired by the dual objectives of minimizing energy costs and network congestion in a datacenter. As datacenters are equipped to handle peak workloads, the average server utilization in most datacenters is very low. As a result, one can achieve huge energy savings by selectively shutting down machines when demand is low. In this dissertation, we introduce the network-aware machine activation problem to find a schedule that simultaneously minimizes the number of machines necessary and the congestion incurred in the network. Our model significantly generalizes well-studied combinatorial optimization problems such as hard-capacitated hypergraph covering and is thus strongly NP-hard. As a result, we focus on finding good approximation algorithms. Data-parallel computation frameworks such as MapReduce have popularized the design of applications that require a large amount of communication between different machines. Efficient scheduling of these communication demands is essential to guarantee efficient execution of the different applications. In the second part of the thesis, we study the approximability of the co-flow scheduling problem that has been recently introduced to capture these application-level demands. Finally, we also study the question, "In what order should one process jobs?'' Often, precedence constraints specify a partial order over the set of jobs and the objective is to find suitable schedules that satisfy the partial order. However, in the presence of hard deadline constraints, it may be impossible to find a schedule that satisfies all precedence constraints. In this thesis we formalize different variants of job scheduling with soft precedence constraints and conduct the first systematic study of these problems.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

In this dissertation, we apply mathematical programming techniques (i.e., integer programming and polyhedral combinatorics) to develop exact approaches for influence maximization on social networks. We study four combinatorial optimization problems that deal with maximizing influence at minimum cost over a social network. To our knowl- edge, all previous work to date involving influence maximization problems has focused on heuristics and approximation. We start with the following viral marketing problem that has attracted a significant amount of interest from the computer science literature. Given a social network, find a target set of customers to seed with a product. Then, a cascade will be caused by these initial adopters and other people start to adopt this product due to the influence they re- ceive from earlier adopters. The idea is to find the minimum cost that results in the entire network adopting the product. We first study a problem called the Weighted Target Set Selection (WTSS) Prob- lem. In the WTSS problem, the diffusion can take place over as many time periods as needed and a free product is given out to the individuals in the target set. Restricting the number of time periods that the diffusion takes place over to be one, we obtain a problem called the Positive Influence Dominating Set (PIDS) problem. Next, incorporating partial incentives, we consider a problem called the Least Cost Influence Problem (LCIP). The fourth problem studied is the One Time Period Least Cost Influence Problem (1TPLCIP) which is identical to the LCIP except that we restrict the number of time periods that the diffusion takes place over to be one. We apply a common research paradigm to each of these four problems. First, we work on special graphs: trees and cycles. Based on the insights we obtain from special graphs, we develop efficient methods for general graphs. On trees, first, we propose a polynomial time algorithm. More importantly, we present a tight and compact extended formulation. We also project the extended formulation onto the space of the natural vari- ables that gives the polytope on trees. Next, building upon the result for trees---we derive the polytope on cycles for the WTSS problem; as well as a polynomial time algorithm on cycles. This leads to our contribution on general graphs. For the WTSS problem and the LCIP, using the observation that the influence propagation network must be a directed acyclic graph (DAG), the strong formulation for trees can be embedded into a formulation on general graphs. We use this to design and implement a branch-and-cut approach for the WTSS problem and the LCIP. In our computational study, we are able to obtain high quality solutions for random graph instances with up to 10,000 nodes and 20,000 edges (40,000 arcs) within a reasonable amount of time.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In this work we introduce a new mathematical tool for optimization of routes, topology design, and energy efficiency in wireless sensor networks. We introduce a vector field formulation that models communication in the network, and routing is performed in the direction of this vector field at every location of the network. The magnitude of the vector field at every location represents the density of amount of data that is being transited through that location. We define the total communication cost in the network as the integral of a quadratic form of the vector field over the network area. With the above formulation, we introduce a mathematical machinery based on partial differential equations very similar to the Maxwell's equations in electrostatic theory. We show that in order to minimize the cost, the routes should be found based on the solution of these partial differential equations. In our formulation, the sensors are sources of information, and they are similar to the positive charges in electrostatics, the destinations are sinks of information and they are similar to negative charges, and the network is similar to a non-homogeneous dielectric media with variable dielectric constant (or permittivity coefficient). In one of the applications of our mathematical model based on the vector fields, we offer a scheme for energy efficient routing. Our routing scheme is based on changing the permittivity coefficient to a higher value in the places of the network where nodes have high residual energy, and setting it to a low value in the places of the network where the nodes do not have much energy left. Our simulations show that our method gives a significant increase in the network life compared to the shortest path and weighted shortest path schemes. Our initial focus is on the case where there is only one destination in the network, and later we extend our approach to the case where there are multiple destinations in the network. In the case of having multiple destinations, we need to partition the network into several areas known as regions of attraction of the destinations. Each destination is responsible for collecting all messages being generated in its region of attraction. The complexity of the optimization problem in this case is how to define regions of attraction for the destinations and how much communication load to assign to each destination to optimize the performance of the network. We use our vector field model to solve the optimization problem for this case. We define a vector field, which is conservative, and hence it can be written as the gradient of a scalar field (also known as a potential field). Then we show that in the optimal assignment of the communication load of the network to the destinations, the value of that potential field should be equal at the locations of all the destinations. Another application of our vector field model is to find the optimal locations of the destinations in the network. We show that the vector field gives the gradient of the cost function with respect to the locations of the destinations. Based on this fact, we suggest an algorithm to be applied during the design phase of a network to relocate the destinations for reducing the communication cost function. The performance of our proposed schemes is confirmed by several examples and simulation experiments. In another part of this work we focus on the notions of responsiveness and conformance of TCP traffic in communication networks. We introduce the notion of responsiveness for TCP aggregates and define it as the degree to which a TCP aggregate reduces its sending rate to the network as a response to packet drops. We define metrics that describe the responsiveness of TCP aggregates, and suggest two methods for determining the values of these quantities. The first method is based on a test in which we drop a few packets from the aggregate intentionally and measure the resulting rate decrease of that aggregate. This kind of test is not robust to multiple simultaneous tests performed at different routers. We make the test robust to multiple simultaneous tests by using ideas from the CDMA approach to multiple access channels in communication theory. Based on this approach, we introduce tests of responsiveness for aggregates, and call it CDMA based Aggregate Perturbation Method (CAPM). We use CAPM to perform congestion control. A distinguishing feature of our congestion control scheme is that it maintains a degree of fairness among different aggregates. In the next step we modify CAPM to offer methods for estimating the proportion of an aggregate of TCP traffic that does not conform to protocol specifications, and hence may belong to a DDoS attack. Our methods work by intentionally perturbing the aggregate by dropping a very small number of packets from it and observing the response of the aggregate. We offer two methods for conformance testing. In the first method, we apply the perturbation tests to SYN packets being sent at the start of the TCP 3-way handshake, and we use the fact that the rate of ACK packets being exchanged in the handshake should follow the rate of perturbations. In the second method, we apply the perturbation tests to the TCP data packets and use the fact that the rate of retransmitted data packets should follow the rate of perturbations. In both methods, we use signature based perturbations, which means packet drops are performed with a rate given by a function of time. We use analogy of our problem with multiple access communication to find signatures. Specifically, we assign orthogonal CDMA based signatures to different routers in a distributed implementation of our methods. As a result of orthogonality, the performance does not degrade because of cross interference made by simultaneously testing routers. We have shown efficacy of our methods through mathematical analysis and extensive simulation experiments.