855 resultados para parallel computation


Relevância:

20.00% 20.00%

Publicador:

Resumo:

International Conference with Peer Review 2012 IEEE International Conference in Geoscience and Remote Sensing Symposium (IGARSS), 22-27 July 2012, Munich, Germany

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Catastrophic events, such as wars and terrorist attacks, tornadoes and hurricanes, earthquakes, tsunamis, floods and landslides, are always accompanied by a large number of casualties. The size distribution of these casualties has separately been shown to follow approximate power law (PL) distributions. In this paper, we analyze the statistical distributions of the number of victims of catastrophic phenomena, in particular, terrorism, and find double PL behavior. This means that the data sets are better approximated by two PLs instead of a single one. We plot the PL parameters, corresponding to several events, and observe an interesting pattern in the charts, where the lines that connect each pair of points defining the double PLs are almost parallel to each other. A complementary data analysis is performed by means of the computation of the entropy. The results reveal relationships hidden in the data that may trigger a future comprehensive explanation of this type of phenomena.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In the past few years Tabling has emerged as a powerful logic programming model. The integration of concurrent features into the implementation of Tabling systems is demanded by need to use recently developed tabling applications within distributed systems, where a process has to respond concurrently to several requests. The support for sharing of tables among the concurrent threads of a Tabling process is a desirable feature, to allow one of Tabling’s virtues, the re-use of computations by other threads and to allow efficient usage of available memory. However, the incremental completion of tables which are evaluated concurrently is not a trivial problem. In this dissertation we describe the integration of concurrency mechanisms, by the way of multi-threading, in a state of the art Tabling and Prolog system, XSB. We begin by reviewing the main concepts for a formal description of tabled computations, called SLG resolution and for the implementation of Tabling under the SLG-WAM, the abstract machine supported by XSB. We describe the different scheduling strategies provided by XSB and introduce some new properties of local scheduling, a scheduling strategy for SLG resolution. We proceed to describe our implementation work by describing the process of integrating multi-threading in a Prolog system supporting Tabling, without addressing the problem of shared tables. We describe the trade-offs and implementation decisions involved. We then describe an optimistic algorithm for the concurrent sharing of completed tables, Shared Completed Tables, which allows the sharing of tables without incurring in deadlocks, under local scheduling. This method relies on the execution properties of local scheduling and includes full support for negation. We provide a theoretical framework and discuss the implementation’s correctness and complexity. After that, we describe amethod for the sharing of tables among threads that allows parallelism in the computation of inter-dependent subgoals, which we name Concurrent Completion. We informally argue for the correctness of Concurrent Completion. We give detailed performance measurements of the multi-threaded XSB systems over a variety of machines and operating systems, for both the Shared Completed Tables and the Concurrent Completion implementations. We focus our measurements inthe overhead over the sequential engine and the scalability of the system. We finish with a comparison of XSB with other multi-threaded Prolog systems and we compare our approach to concurrent tabling with parallel and distributed methods for the evaluation of tabling. Finally, we identify future research directions.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This paper proposes a global multiprocessor scheduling algorithm for the Linux kernel that combines the global EDF scheduler with a priority-aware work-stealing load balancing scheme, enabling parallel real-time tasks to be executed on more than one processor at a given time instant. We state that some priority inversion may actually be acceptable, provided it helps reduce contention, communication, synchronisation and coordination between parallel threads, while still guaranteeing the expected system’s predictability. Experimental results demonstrate the low scheduling overhead of the proposed approach comparatively to an existing real-time deadline-oriented scheduling class for the Linux kernel.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Dynamic parallel scheduling using work-stealing has gained popularity in academia and industry for its good performance, ease of implementation and theoretical bounds on space and time. Cores treat their own double-ended queues (deques) as a stack, pushing and popping threads from the bottom, but treat the deque of another randomly selected busy core as a queue, stealing threads only from the top, whenever they are idle. However, this standard approach cannot be directly applied to real-time systems, where the importance of parallelising tasks is increasing due to the limitations of multiprocessor scheduling theory regarding parallelism. Using one deque per core is obviously a source of priority inversion since high priority tasks may eventually be enqueued after lower priority tasks, possibly leading to deadline misses as in this case the lower priority tasks are the candidates when a stealing operation occurs. Our proposal is to replace the single non-priority deque of work-stealing with ordered per-processor priority deques of ready threads. The scheduling algorithm starts with a single deque per-core, but unlike traditional work-stealing, the total number of deques in the system may now exceed the number of processors. Instead of stealing randomly, cores steal from the highest priority deque.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Real-time embedded applications require to process large amounts of data within small time windows. Parallelize and distribute workloads adaptively is suitable solution for computational demanding applications. The purpose of the Parallel Real-Time Framework for distributed adaptive embedded systems is to guarantee local and distributed processing of real-time applications. This work identifies some promising research directions for parallel/distributed real-time embedded applications.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Graphics processors were originally developed for rendering graphics but have recently evolved towards being an architecture for general-purpose computations. They are also expected to become important parts of embedded systems hardware -- not just for graphics. However, this necessitates the development of appropriate timing analysis techniques which would be required because techniques developed for CPU scheduling are not applicable. The reason is that we are not interested in how long it takes for any given GPU thread to complete, but rather how long it takes for all of them to complete. We therefore develop a simple method for finding an upper bound on the makespan of a group of GPU threads executing the same program and competing for the resources of a single streaming multiprocessor (whose architecture is based on NVIDIA Fermi, with some simplifying assunptions). We then build upon this method to formulate the derivation of the exact worst-case makespan (and corresponding schedule) as an optimization problem. Addressing the issue of tractability, we also present a technique for efficiently computing a safe estimate of the worstcase makespan with minimal pessimism, which may be used when finding an exact value would take too long.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

High-level parallel languages offer a simple way for application programmers to specify parallelism in a form that easily scales with problem size, leaving the scheduling of the tasks onto processors to be performed at runtime. Therefore, if the underlying system cannot efficiently execute those applications on the available cores, the benefits will be lost. In this paper, we consider how to schedule highly heterogenous parallel applications that require real-time performance guarantees on multicore processors. The paper proposes a novel scheduling approach that combines the global Earliest Deadline First (EDF) scheduler with a priority-aware work-stealing load balancing scheme, which enables parallel realtime tasks to be executed on more than one processor at a given time instant. Experimental results demonstrate the better scalability and lower scheduling overhead of the proposed approach comparatively to an existing real-time deadline-oriented scheduling class for the Linux kernel.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Multicore platforms have transformed parallelism into a main concern. Parallel programming models are being put forward to provide a better approach for application programmers to expose the opportunities for parallelism by pointing out potentially parallel regions within tasks, leaving the actual and dynamic scheduling of these regions onto processors to be performed at runtime, exploiting the maximum amount of parallelism. It is in this context that this paper proposes a scheduling approach that combines the constant-bandwidth server abstraction with a priority-aware work-stealing load balancing scheme which, while ensuring isolation among tasks, enables parallel tasks to be executed on more than one processor at a given time instant.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The recent trends of chip architectures with higher number of heterogeneous cores, and non-uniform memory/non-coherent caches, brings renewed attention to the use of Software Transactional Memory (STM) as a fundamental building block for developing parallel applications. Nevertheless, although STM promises to ease concurrent and parallel software development, it relies on the possibility of aborting conflicting transactions to maintain data consistency, which impacts on the responsiveness and timing guarantees required by embedded real-time systems. In these systems, contention delays must be (efficiently) limited so that the response times of tasks executing transactions are upper-bounded and task sets can be feasibly scheduled. In this paper we assess the use of STM in the development of embedded real-time software, defending that the amount of contention can be reduced if read-only transactions access recent consistent data snapshots, progressing in a wait-free manner. We show how the required number of versions of a shared object can be calculated for a set of tasks. We also outline an algorithm to manage conflicts between update transactions that prevents starvation.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Over the last three decades, computer architects have been able to achieve an increase in performance for single processors by, e.g., increasing clock speed, introducing cache memories and using instruction level parallelism. However, because of power consumption and heat dissipation constraints, this trend is going to cease. In recent times, hardware engineers have instead moved to new chip architectures with multiple processor cores on a single chip. With multi-core processors, applications can complete more total work than with one core alone. To take advantage of multi-core processors, parallel programming models are proposed as promising solutions for more effectively using multi-core processors. This paper discusses some of the existent models and frameworks for parallel programming, leading to outline a draft parallel programming model for Ada.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Consider a wireless sensor network (WSN) where a broadcast from a sensor node does not reach all sensor nodes in the network; such networks are often called multihop networks. Sensor nodes take individual sensor readings, however, in many cases, it is relevant to compute aggregated quantities of these readings. In fact, the minimum and maximum of all sensor readings at an instant are often interesting because they indicate abnormal behavior, for example if the maximum temperature is very high then it may be that a fire has broken out. In this context, we propose an algorithm for computing the min or max of sensor readings in a multihop network. This algorithm has the particularly interesting property of having a time complexity that does not depend on the number of sensor nodes; only the network diameter and the range of the value domain of sensor readings matter.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This paper proposes a dynamic scheduler that supports the coexistence of guaranteed and non-guaranteed bandwidth servers to efficiently handle soft-tasks’ overloads by making additional capacity available from two sources: (i) residual capacity allocated but unused when jobs complete in less than their budgeted execution time; (ii) stealing capacity from inactive non-isolated servers used to schedule best-effort jobs. The effectiveness of the proposed approach in reducing the mean tardiness of periodic jobs is demonstrated through extensive simulations. The achieved results become even more significant when tasks’ computation times have a large variance.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This paper proposes a new strategy to integrate shared resources and precedence constraints among real-time tasks, assuming no precise information on critical sections and computation times is available. The concept of bandwidth inheritance is combined with a greedy capacity sharing and stealing policy to efficiently exchange bandwidth among tasks, minimising the degree of deviation from the ideal system's behaviour caused by inter-application blocking. The proposed capacity exchange protocol (CXP) focus on exchanging extra capacities as early, and not necessarily as fairly, as possible. This loss of optimality is worth the reduced complexity as the protocol's behaviour nevertheless tends to be fair in the long run and outperforms other solutions in highly dynamic scenarios, as demonstrated by extensive simulations.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Workflows have been successfully applied to express the decomposition of complex scientific applications. This has motivated many initiatives that have been developing scientific workflow tools. However the existing tools still lack adequate support to important aspects namely, decoupling the enactment engine from workflow tasks specification, decentralizing the control of workflow activities, and allowing their tasks to run autonomous in distributed infrastructures, for instance on Clouds. Furthermore many workflow tools only support the execution of Direct Acyclic Graphs (DAG) without the concept of iterations, where activities are executed millions of iterations during long periods of time and supporting dynamic workflow reconfigurations after certain iteration. We present the AWARD (Autonomic Workflow Activities Reconfigurable and Dynamic) model of computation, based on the Process Networks model, where the workflow activities (AWA) are autonomic processes with independent control that can run in parallel on distributed infrastructures, e. g. on Clouds. Each AWA executes a Task developed as a Java class that implements a generic interface allowing end-users to code their applications without concerns for low-level details. The data-driven coordination of AWA interactions is based on a shared tuple space that also enables support to dynamic workflow reconfiguration and monitoring of the execution of workflows. We describe how AWARD supports dynamic reconfiguration and discuss typical workflow reconfiguration scenarios. For evaluation we describe experimental results of AWARD workflow executions in several application scenarios, mapped to a small dedicated cluster and the Amazon (Elastic Computing EC2) Cloud.