877 results for Load Balancing
Abstract:
Task-parallel languages are increasingly popular. Many of them provide expressive mechanisms for intertask synchronization. For example, OpenMP 4.0 will integrate data-driven execution semantics derived from the StarSs research language. Compared to the more restrictive data-parallel and fork-join concurrency models, the advanced features being introduced into task-parallel models in turn enable improved scalability through load balancing, memory latency hiding, mitigation of the pressure on memory bandwidth, and, as a side effect, reduced power consumption. In this article, we develop a systematic approach to compile loop nests into concurrent, dynamically constructed graphs of dependent tasks. We propose a simple and effective heuristic that selects the most profitable parallelization idiom for every dependence type and communication pattern. This heuristic enables the extraction of interband parallelism (cross-barrier parallelism) in a number of numerical computations that range from linear algebra to structured grids and image processing. The proposed static analysis and code generation alleviate the burden of a full-blown dependence resolver to track the readiness of tasks at runtime. We evaluate our approach and algorithms in the PPCG compiler, targeting OpenStream, a representative dataflow task-parallel language with explicit intertask dependences and a lightweight runtime. Experimental results demonstrate the effectiveness of the approach.
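To make the data-driven task semantics mentioned above concrete, here is a minimal C sketch using OpenMP 4.0 depend clauses. It only illustrates the programming model; it is not the code generated by PPCG or OpenStream, and the array a and the task bodies are invented for the example.

#include <stdio.h>

#define N 4

/* Minimal illustration of OpenMP 4.0 data-driven tasks: the depend
 * clauses let the runtime build a dynamic graph of dependent tasks,
 * so each consumer starts as soon as its producer has finished,
 * without a global barrier between producers and consumers. */
int main(void)
{
    double a[N];

    #pragma omp parallel
    #pragma omp single
    {
        for (int i = 0; i < N; i++) {
            /* producer task: writes a[i] */
            #pragma omp task depend(out: a[i])
            a[i] = 2.0 * i;

            /* consumer task: may run as soon as a[i] is ready */
            #pragma omp task depend(in: a[i])
            printf("a[%d] = %f\n", i, a[i]);
        }
    }
    return 0;
}

Because the dependences are declared per array element, independent producer/consumer pairs may overlap freely, which is the kind of cross-barrier parallelism the abstract refers to.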
Abstract:
To improve the utilization of the nodes in a real-time cluster system and prevent CPU-utilization hot spots, this work studies a feedback-control-based real-time scheduling framework for real-time clusters and proposes a new load-balancing algorithm for arbitrary graph topologies. The algorithm guides task migration based on differential migration coefficients and the diffusion load-balancing principle to balance the system load, and incorporates feedback control to avoid oscillation of node utilization. Experimental results show that the algorithm not only balances load among nodes and effectively avoids local hot spots, but also integrates smoothly with the feedback-control algorithm, keeping the whole system running stably.
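A minimal C sketch of the diffusion load-balancing idea described in the abstract above, assuming a ring of nodes and an illustrative migration coefficient alpha that damps transfers so utilization does not oscillate; the paper's actual topology, migration coefficients, and feedback law may differ.

#include <stdio.h>

#define NODES 6
#define STEPS 20

/* Diffusion-style load balancing on a ring: each node exchanges a
 * fraction (alpha) of the load difference with each neighbour.  The
 * coefficient damps the transfer so node utilization converges to
 * the average instead of oscillating, playing the role of the
 * feedback gain discussed above. */
int main(void)
{
    double load[NODES] = {90, 10, 30, 70, 20, 40};
    double next[NODES];
    const double alpha = 0.25;   /* illustrative migration coefficient */

    for (int step = 0; step < STEPS; step++) {
        for (int i = 0; i < NODES; i++) {
            int left  = (i + NODES - 1) % NODES;
            int right = (i + 1) % NODES;
            next[i] = load[i]
                    + alpha * (load[left]  - load[i])
                    + alpha * (load[right] - load[i]);
        }
        for (int i = 0; i < NODES; i++)
            load[i] = next[i];
    }

    for (int i = 0; i < NODES; i++)
        printf("node %d: %.2f\n", i, load[i]);
    return 0;
}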
Abstract:
Flexible manufacturing systems offer many alternative processing routes, so the scheduling system must address the machine-selection problem. Dispatching-rule scheduling is one of the most basic and influential dynamic scheduling methods, yet dispatching rules rarely consider the choice of machine sequence. Taking both job selection and machine selection into account, this paper uses an interactive bidding process to construct negotiation rules for scheduling based on the contract net protocol. For the dynamic job-shop scheduling problem, five contract-net rule-based scheduling methods are proposed and constructed. Experimental analysis shows that rule-based scheduling built on the contract-net interactive bidding mode can greatly improve scheduling performance, raising equipment utilization and equipment load-balance indicators.
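A minimal C sketch of contract-net style bidding of the kind described above: a job is announced, each machine returns a bid, and the job is awarded to the lowest bidder. The bid formula (current workload plus processing time) and the data are assumptions for illustration, not the paper's five negotiation rules.

#include <stdio.h>

#define MACHINES 3

/* Contract-net sketch: a job is announced, each eligible machine
 * returns a bid (here: queued workload plus the processing time it
 * would need), and the job is awarded to the lowest bidder.  The bid
 * formula and the data are illustrative only. */
struct machine {
    double workload;          /* time already queued on the machine */
    double proc_time[2];      /* processing time per job type       */
};

static int award(struct machine m[], int job_type)
{
    int best = -1;
    double best_bid = 0.0;
    for (int i = 0; i < MACHINES; i++) {
        double bid = m[i].workload + m[i].proc_time[job_type];
        if (best < 0 || bid < best_bid) {
            best = i;
            best_bid = bid;
        }
    }
    m[best].workload += m[best].proc_time[job_type];  /* contract awarded */
    return best;
}

int main(void)
{
    struct machine m[MACHINES] = {
        {5.0, {2.0, 4.0}}, {1.0, {3.0, 3.0}}, {0.0, {6.0, 2.0}},
    };
    int jobs[] = {0, 1, 0, 1, 1};
    for (int j = 0; j < 5; j++)
        printf("job %d -> machine %d\n", j, award(m, jobs[j]));
    return 0;
}

Because every bid reflects the bidder's current workload, awarding to the lowest bid spreads jobs across machines, which is how the bidding mode improves the load-balance indicator.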
Abstract:
Based on the MES project at the cold-rolled sheet plant of Angang New Rolling Co., Ltd., this paper presents the task-allocation method used when generating unit production schedules. Taking the best match between units and production tasks and the load balance across units as performance indicators, the method adopts a real-time minimum-load allocation rule. It solves the online generation of unit production schedules for the cold-rolling line in a CIMS environment and achieves optimized control of production routing on the line, thereby improving product quality and equipment utilization.
Abstract:
Landslides are a serious geological hazard and can cause great damage. In recent years they have become more frequent as the scale of engineering construction grows, and the resulting losses have increased, so landslide prevention has become an important research subject in engineering. Building on the theory and technology of existing anti-slide piles and pre-stressed cable anti-slide piles, this paper improves the method for computing landslide thrust and addresses the unreasonable pile designs that result from unreasonable thrust values. Modern pre-stressing technology is introduced and the load balancing method is used to improve the stress behavior of anti-slide piles; anchor cables, anti-slide piles, and modern pre-stressing technology are combined to stabilize complicated landslides. Selecting a value for the landslide thrust is a key basis for the design. By comparing existing methods for selecting the landslide thrust in anti-slide pile design, an improved method for calculating the design thrust is presented on the basis of the residual-thrust method. The method analyzes the residual landslide thrust behind the piles and the residual sliding resistance in front of them, distributes the residual landslide thrust behind the piles equitably, and makes the selection of the design thrust more reasonable. Pre-stressed cable anti-slide piles, developed from common anti-slide piles, are a common means of landslide prevention; their principle is to adjust the internal force of the pile and reduce the section size by changing the pile's constraint conditions. For landslides with a deep slip surface and large slopes, this approach reaches its limits: such landslides require long piles and long anchor cables, which are uneconomical, develop large deformations, and leave residual risk after treatment. To solve this problem, a new kind of anti-slide pile, the inner pre-stressing force anti-slide pile, is proposed. Its principle is that an additional force, generated by arranging pre-stressed reinforcement or tendons in a suitable layout inside the pile and tensioning them, wholly or partly balances the internal force induced by the landslide thrust (the load balancing method). This converts the bending moment, which piles resist poorly, into compressive stress, which they resist well; it greatly improves the stress performance of anti-slide piles, reduces the section size, keeps the piles uncracked in normal service or delays cracking, and improves their durability. Pre-stressed cable anti-slide piles and inner pre-stressing force anti-slide piles are referred to collectively as pre-stressed structure anti-slide piles, and their design and calculation methods are analyzed. A new calculation method for the design of anti-slide piles is provided; for pre-stressed structure anti-slide piles, a new computation mode is first presented on the basis of the cantilever-pile model.
In this mode, the constraint form of the load-bearing section of the pile is determined from the reservoir conditions in order to compute the required pre-stress of the anchor cables, and the internal force of the load-bearing section is analyzed to determine the anchorage section of the pile. The pre-stressed cables of pre-stressed cable anti-slide piles can be arranged as required; the load-bearing sections of single-row and double-row pre-stressed cable anti-slide piles are analyzed, and a calculation method for their design is provided. Inner pre-stressing force anti-slide piles are a new structural form. Their load-bearing section is divided into four computation modes according to whether pre-stressed cables are also applied on the exterior of the pile and whether single-row or double-row exterior cables are used. The load balancing method is applied to these computation modes to provide a rational design method for inner pre-stressing force anti-slide piles. Using the improved method for selecting the design thrust, pre-stressed cable anti-slide piles and inner pre-stressing force anti-slide piles are applied to the Mahe landslide at the Yalong Lenggu hydropower station, with good results.
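As a hedged illustration of the load balancing method invoked above (the classical balancing condition of prestressed design, not the paper's own derivation): if the prestressing force P and its eccentricity e(x) in the pile section are chosen so that the prestressing moment offsets the moment M_t(x) induced by the landslide thrust, only axial compression remains, which the pile resists far better than bending.

% Illustrative balancing condition only: M_t(x) is the moment induced
% by the landslide thrust, P the prestressing force, e(x) its
% eccentricity in the section, and A the cross-sectional area.
\[
  M_t(x) - P\,e(x) \approx 0
  \quad\Longrightarrow\quad
  \sigma(x) \approx \frac{P}{A},
\]
% i.e. with the tendon layout e(x) chosen so that P e(x) balances
% M_t(x), the remaining section stress is (nearly) uniform compression.

With partial balancing, the residual bending moment is M_t(x) - P e(x) rather than zero, which is the "whole or partly" case mentioned in the abstract.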
Abstract:
Numerical modeling of groundwater is very important for understanding groundwater flow and solving hydrogeological problems. Today, groundwater studies require massive numbers of model cells and high calculation accuracy, which are beyond a single-CPU computer's capabilities. With the development of high-performance parallel computing technologies, applying parallel computing to the numerical modeling of groundwater flow becomes necessary and important, and it improves the ability to resolve various hydrogeological and environmental problems. In this study, parallel computing methods for the two main types of modern parallel computer architecture, shared-memory systems and distributed shared-memory systems, are discussed. OpenMP and MPI (PETSc) are both used to parallelize the most widely used groundwater simulator, MODFLOW, and two parallel solvers, P-PCG and P-MODFLOW, were developed. The parallelized MODFLOW was used to simulate regional groundwater flow in Beishan, Gansu Province, a potential high-level radioactive waste geological disposal area in China. 1. The OpenMP programming paradigm was used to parallelize the PCG (preconditioned conjugate-gradient) solver, one of the main solvers of MODFLOW. The parallel PCG solver, P-PCG, was verified on an 8-processor computer. Both the impact of compilers and different model domain sizes were considered in the numerical experiments; the largest test model has 1000 columns, 1000 rows, and 1000 layers. Based on the timing results, execution times using the P-PCG solver are typically about 1.40 to 5.31 times shorter than those using the serial one. In addition, the simulation results are exactly the same as those of the original PCG solver, because the majority of the serial code was not changed. This parallelization approach also reduces software maintenance cost, because only a single PCG solver source needs to be maintained in the MODFLOW source tree. 2. P-MODFLOW, a domain-decomposition-based model implemented in a parallel computing environment, was developed to allow efficient simulation of regional-scale groundwater flow. The basic approach partitions a large model domain into any number of sub-domains, and parallel processors solve the model equations within each sub-domain. Using domain decomposition to bring MODFLOW to distributed shared-memory parallel systems extends its applicability to the most popular cluster systems, so that a large-scale simulation can take full advantage of hundreds or even thousands of parallel processors. P-MODFLOW shows good parallel performance, with a maximum speedup of 18.32 on 14 processors; super-linear speedups were achieved in the parallel tests, indicating the efficiency and scalability of the code. Parallel program design, load balancing, and full use of PETSc were considered in order to achieve a highly efficient parallel program. 3. Characterizing the regional groundwater flow system is very important for high-level radioactive waste geological disposal. The Beishan area, located in northwestern Gansu Province, China, is a potential site for a disposal repository. The area covers about 80,000 km2 and has complicated hydrogeological conditions, which greatly increase the computational effort of regional groundwater flow models.
To reduce computing time, the parallel computing scheme was applied to regional groundwater flow modeling. Models with over 10 million cells were used to simulate how faults and different recharge conditions affect the regional groundwater flow pattern. The results of this study provide regional groundwater flow information for site characterization of the potential high-level radioactive waste disposal repository.
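A minimal C sketch of the kind of OpenMP loop-level parallelism applied to the PCG solver described above; the loops are a generic vector update and dot product from one conjugate-gradient iteration, not MODFLOW's actual source.

#include <stdio.h>
#include <omp.h>

#define N 1000000

/* Sketch of OpenMP loop-level parallelism: the vector update and the
 * dot product of one CG iteration are split across threads, and the
 * reduction clause combines the per-thread partial sums.  Generic
 * code for illustration, not MODFLOW's. */
int main(void)
{
    static double x[N], p[N], r[N];
    double alpha = 0.5, rho = 0.0;

    for (int i = 0; i < N; i++) {        /* set up dummy data */
        p[i] = 1.0;
        r[i] = 2.0;
    }

    #pragma omp parallel for
    for (int i = 0; i < N; i++)
        x[i] += alpha * p[i];            /* x = x + alpha * p */

    #pragma omp parallel for reduction(+:rho)
    for (int i = 0; i < N; i++)
        rho += r[i] * r[i];              /* rho = r . r */

    printf("rho = %f (threads: %d)\n", rho, omp_get_max_threads());
    return 0;
}

Because only the loop bodies are annotated, the surrounding serial code stays unchanged, which is why a single source tree can serve both the serial and the parallel solver.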
Abstract:
A well-known paradigm for load balancing in distributed systems is the "power of two choices," whereby an item is stored at the less loaded of two (or more) random alternative servers. We investigate the power of two choices in natural settings for distributed computing where items and servers reside in a geometric space and each item is associated with the server that is its nearest neighbor. This is in fact the backdrop for distributed hash tables such as Chord, where the geometric space is determined by clockwise distance on a one-dimensional ring. Theoretically, we consider the following load balancing problem. Suppose that servers are initially hashed uniformly at random to points in the space. Sequentially, each item then considers d candidate insertion points, also chosen uniformly at random from the space, and selects the insertion point whose associated server has the least load. For the one-dimensional ring, and for Euclidean distance on the two-dimensional torus, we demonstrate that when n data items are hashed to n servers, the maximum load at any server is log log n / log d + O(1) with high probability. While our results match the well-known bounds in the standard setting in which each server is selected equiprobably, our applications do not have this feature, since the sizes of the nearest-neighbor regions around servers are non-uniform. Therefore, the novelty in our methods lies in developing appropriate tail bounds on the distribution of nearest-neighbor region sizes and in adapting previous arguments to this more general setting. In addition, we provide simulation results demonstrating the load balance that results as the system size scales into the millions.
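A minimal C simulation sketch of the power-of-d-choices rule in the standard uniform setting (not the geometric nearest-neighbor setting analyzed in the abstract); the values of n and d are small and illustrative.

#include <stdio.h>
#include <stdlib.h>

#define N 10000   /* items and servers */
#define D 2       /* number of random choices per item */

/* Each item probes D servers chosen uniformly at random and joins the
 * least loaded of them; in this uniform setting the maximum load is
 * known to grow like log log N / log D + O(1). */
int main(void)
{
    static int load[N];

    for (int item = 0; item < N; item++) {
        int best = rand() % N;
        for (int k = 1; k < D; k++) {
            int s = rand() % N;
            if (load[s] < load[best])
                best = s;
        }
        load[best]++;
    }

    int max = 0;
    for (int s = 0; s < N; s++)
        if (load[s] > max)
            max = load[s];
    printf("maximum load with d=%d: %d\n", D, max);
    return 0;
}

In the paper's setting the probe points are locations in a geometric space and the probed server is the nearest neighbor of each point, so server selection is non-uniform; this sketch only shows the baseline rule being generalized.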
Abstract:
In this paper we examine a number of admission control and scheduling protocols for high-performance web servers based on a 2-phase policy for serving HTTP requests. The first "registration" phase involves establishing the TCP connection for the HTTP request and parsing/interpreting its arguments, whereas the second "service" phase involves the service/transmission of data in response to the HTTP request. By introducing a delay between these two phases, we show that the performance of a web server could potentially be improved through the adoption of a number of scheduling policies that optimize the utilization of various system components (e.g., memory cache and I/O). In addition to its promise for improving the performance of a single web server, the delineation between the registration and service phases of an HTTP request may be useful for load balancing purposes on clusters of web servers. We are investigating the use of such a mechanism as part of the Commonwealth testbed being developed at Boston University.
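A minimal C sketch of the two-phase idea described above, under the assumption that a batch of already-parsed requests is held between the registration and service phases and reordered so that requests for the same file are served back to back, improving cache reuse; the batch contents and the reordering policy are illustrative, not the paper's actual admission-control protocols.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define REQUESTS 6

/* Two-phase sketch: requests are parsed during a registration phase,
 * held briefly, and reordered before the service phase so that
 * requests for the same file are served consecutively, improving
 * memory-cache reuse.  The batch and the policy are illustrative. */
struct request {
    int  id;
    char url[32];
};

static int by_url(const void *a, const void *b)
{
    return strcmp(((const struct request *)a)->url,
                  ((const struct request *)b)->url);
}

int main(void)
{
    struct request batch[REQUESTS] = {
        {1, "/index.html"}, {2, "/logo.png"}, {3, "/index.html"},
        {4, "/news.html"},  {5, "/logo.png"}, {6, "/index.html"},
    };

    /* registration phase done; reorder the batch before service */
    qsort(batch, REQUESTS, sizeof batch[0], by_url);

    /* service phase: identical URLs are now adjacent */
    for (int i = 0; i < REQUESTS; i++)
        printf("serving request %d: %s\n", batch[i].id, batch[i].url);
    return 0;
}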
Abstract:
MPLS (Multi-Protocol Label Switching) has recently emerged to facilitate the engineering of network traffic by directing packet flows over paths that satisfy multiple requirements. MPLS has been regarded as an enhancement to traditional IP routing, which has the following problems: (1) all packets with the same IP destination address have to follow the same path through the network; and (2) paths have often been computed based on static, single link metrics. These problems may cause traffic concentration and thus degradation in quality of service. In this paper, we investigate by simulation a range of routing solutions and examine the tradeoff between scalability and performance. At one extreme, IP packet routing using dynamic link metrics provides a stateless solution but may lead to routing oscillations. At the other extreme, we consider the recently proposed Profile-based Routing (PBR), which uses knowledge of potential ingress-egress pairs as well as the traffic profile among them. Minimum Interference Routing (MIRA) is another recently proposed MPLS-based scheme, which exploits knowledge of potential ingress-egress pairs but not their traffic profile. MIRA and the more conventional widest-shortest path (WSP) routing represent alternative MPLS-based approaches on the spectrum of routing solutions. We compare these solutions in terms of utility, bandwidth acceptance ratio, scalability (routing state and computational overhead), and load balancing capability. Although WSP is the simplest of the per-flow algorithms we consider, its performance is close to that of dynamic per-packet routing, without the potential instabilities of dynamic routing.
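A minimal C sketch of the widest-shortest path (WSP) rule mentioned above: among candidate paths with the fewest hops, choose the one with the largest bottleneck residual bandwidth. The candidate paths and their per-link residual bandwidths are given explicitly for illustration instead of being computed from a network topology.

#include <stdio.h>

#define PATHS 3
#define MAX_HOPS 4

/* WSP sketch: prefer the path with the fewest hops; among paths of
 * equal hop count, prefer the one whose bottleneck (minimum residual
 * link bandwidth) is largest.  Paths are hard-coded for illustration. */
struct path {
    int hops;
    double residual[MAX_HOPS];   /* residual bandwidth of each link */
};

static double bottleneck(const struct path *p)
{
    double b = p->residual[0];
    for (int i = 1; i < p->hops; i++)
        if (p->residual[i] < b)
            b = p->residual[i];
    return b;
}

int main(void)
{
    struct path paths[PATHS] = {
        {3, {40, 10, 55}},
        {3, {30, 35, 25}},
        {4, {90, 80, 70, 60}},   /* wider, but one hop longer */
    };

    int best = 0;
    for (int i = 1; i < PATHS; i++) {
        if (paths[i].hops < paths[best].hops ||
            (paths[i].hops == paths[best].hops &&
             bottleneck(&paths[i]) > bottleneck(&paths[best])))
            best = i;
    }
    printf("WSP selects path %d (bottleneck %.0f)\n",
           best, bottleneck(&paths[best]));
    return 0;
}

Choosing the widest among the shortest paths spreads new flows away from nearly saturated links, which is the load-balancing behaviour compared against MIRA and PBR in the abstract.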
Abstract:
We consider the problem of performing topological optimizations of distributed hash tables. Such hash tables, which include Chord and Tapestry, are a popular building block for distributed applications. Optimizing topologies over one-dimensional hash spaces is particularly difficult, as the higher dimensionality of the underlying network makes close fits unlikely. Instead, current schemes are limited to heuristic local optimizations that find the best of a small random set of peers. We propose a new class of topology optimizations based on the existence of clusters of close overlay members within the underlying network. By constructing additional overlays for each cluster, a significant portion of the search procedure can be performed within the local cluster, with a corresponding reduction in search time. Finally, we discuss the effects of these additional overlays on spatial locality and other load balancing schemes.
Abstract:
This paper presents the design and implementation of an infrastructure that enables any Web application, regardless of its current state, to be stopped and uninstalled from a particular server, transferred to a new server, then installed, loaded, and resumed, with all these events occurring "on the fly" and totally transparently to clients. Such functionality allows entire applications to move fluidly from server to server, reducing the overhead required to administer the system and increasing its performance in a number of ways: (1) dynamically replicating new instances of applications to several servers to raise throughput for scalability purposes, (2) moving applications to servers to achieve load balancing or other resource management goals, and (3) caching entire applications on servers located closer to clients.
Abstract:
We consider the load-balancing problems which arise from parallel scientific codes containing multiple computational phases, or loops over subsets of the data, which are separated by global synchronisation points. We motivate, derive and describe the implementation of an approach which we refer to as the multiphase mesh partitioning strategy to address such issues. The technique is tested on several examples of meshes, both real and artificial, containing multiple computational phases and it is demonstrated that our method can achieve high quality partitions where a standard mesh partitioning approach fails.