904 results for Parallel and Distributed Processing


Relevance:

100.00% 100.00%

Publisher:

Abstract:

In this paper, an architecture based on a scalable and flexible set of Evolvable Processing arrays is presented. FPGA-native Dynamic Partial Reconfiguration (DPR) is used for evolution, which is done intrinsically, letting the system adapt autonomously to variable run-time conditions, including the presence of transient and permanent faults. The architecture supports four modes of operation: independent, parallel, cascaded and bypass. These modes can be used during evolution or during normal operation. The evolvability of the architecture is combined with fault-tolerance techniques to give the platform self-healing features, making it suitable for applications that require both high adaptability and reliability. Experimental results show that such a system may benefit from accelerated evolution times, increased performance and improved dependability, mainly by increasing fault tolerance for transient and permanent faults, as well as providing some fault identification possibilities. The evolvable hardware array shown is tailored for window-based image processing applications.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

ACKNOWLEDGEMENTS This research is based upon work supported in part by the U.S. ARL and U.K. Ministry of Defense under Agreement Number W911NF-06-3-0001, and by the NSF under award CNS-1213140. Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views or represent the official policies of the NSF, the U.S. ARL, the U.S. Government, the U.K. Ministry of Defense or the U.K. Government. The U.S. and U.K. Governments are authorized to reproduce and distribute reprints for Government purposes notwithstanding any copyright notation hereon.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Unstructured mesh-based codes for the modelling of continuum physics phenomena have evolved to provide the facility to model complex interacting systems. Such codes have the potential to provide high performance on parallel platforms for a small investment in programming. The critical parameters for success are to minimise changes to the code, so that it remains maintainable, while providing high parallel efficiency, scalability to large numbers of processors and portability to a wide range of platforms. The paradigm of domain decomposition with message passing has for some time been demonstrated to provide a high level of efficiency, scalability and portability across shared and distributed memory systems without the need to re-author the code in a new language. This paper addresses these issues in the parallelisation of a complex three-dimensional unstructured mesh Finite Volume multiphysics code and discusses the implications of automating the parallelisation process.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

In this paper we survey the most relevant results for the priority-based schedulability analysis of real-time tasks, for both fixed and dynamic priority assignment schemes. We give emphasis to worst-case response time analysis in non-preemptive contexts, which is fundamental for communication schedulability analysis. We define an architecture to support priority-based scheduling of messages at the application process level of a specific fieldbus communication network, PROFIBUS. The proposed architecture improves the worst-case message response time, overcoming the limitation of the first-come-first-served (FCFS) PROFIBUS queue implementations.
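
As an illustration of the kind of non-preemptive, fixed-priority response time analysis the abstract refers to, the following is a minimal sketch and not the formulation used in the paper. It assumes integer time units, periodic tasks with deadlines equal to periods, no release jitter, and a conservative blocking term (the longest execution time among the task itself and all lower-priority tasks). The function name and the example task set are hypothetical.

```python
def np_response_time(tasks, i):
    """Conservative worst-case response time bound for task i.

    tasks: list of (C, T) pairs, highest priority first, in integer time units
           (C = worst-case execution/transmission time, T = period = deadline).
    Returns the bound, or None if it exceeds the period (deadline assumed = T).
    """
    c_i, t_i = tasks[i]
    # Assumed conservative blocking: longest C among task i and lower priorities.
    blocking = max([c for c, _ in tasks[i + 1:]] + [c_i])
    w = blocking  # queuing delay before task i starts executing
    while True:
        # A higher-priority task j can be released floor(w / T_j) + 1 times in
        # the closed interval [0, w] and runs before task i is allowed to start.
        interference = sum((w // t_j + 1) * c_j for c_j, t_j in tasks[:i])
        w_next = blocking + interference
        if w_next + c_i > t_i:
            return None  # bound exceeds the deadline; stop iterating
        if w_next == w:
            return w + c_i  # once started, task i runs without preemption
        w = w_next


# Hypothetical task set, not taken from the paper: (C, T), highest priority first.
example = [(1, 4), (2, 8), (3, 12)]
bounds = [np_response_time(example, i) for i in range(len(example))]
print(bounds)
```

The fixed-point iteration stops either when the queuing delay stabilises or as soon as the bound passes the deadline, so it always terminates.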

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Consider the problem of scheduling a set of sporadically arriving tasks on a uniform multiprocessor with the goal of meeting deadlines. Each processor p has a speed S_p. Tasks can be preempted but they cannot migrate between processors. We propose an algorithm which can schedule all task sets that any other possible algorithm can schedule, provided that our algorithm is given processors that are three times faster.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Workflows have been successfully applied to express the decomposition of complex scientific applications. This has motivated many initiatives to develop scientific workflow tools. However, the existing tools still lack adequate support for important aspects, namely decoupling the enactment engine from the workflow task specification, decentralizing the control of workflow activities, and allowing their tasks to run autonomously in distributed infrastructures, for instance on Clouds. Furthermore, many workflow tools only support the execution of Directed Acyclic Graphs (DAGs), without the concept of iterations, in which activities are executed for millions of iterations over long periods of time, and without support for dynamic workflow reconfiguration after a given iteration. We present the AWARD (Autonomic Workflow Activities Reconfigurable and Dynamic) model of computation, based on the Process Networks model, where the workflow activities (AWA) are autonomic processes with independent control that can run in parallel on distributed infrastructures, e.g. on Clouds. Each AWA executes a Task developed as a Java class that implements a generic interface, allowing end-users to code their applications without concern for low-level details. The data-driven coordination of AWA interactions is based on a shared tuple space that also enables support for dynamic workflow reconfiguration and monitoring of the execution of workflows. We describe how AWARD supports dynamic reconfiguration and discuss typical workflow reconfiguration scenarios. For evaluation, we describe experimental results of AWARD workflow executions in several application scenarios, mapped to a small dedicated cluster and to the Amazon Elastic Compute Cloud (EC2).

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The IEEE 802.15.4 protocol proposes a flexible communication solution for Low-Rate Wireless Personal Area Networks, including sensor networks. It has the advantage of fitting the different requirements of potential applications through adequate setting of its parameters. When its beacon mode is enabled, the protocol makes real-time guarantees possible through its Guaranteed Time Slot (GTS) mechanism. This paper analyzes the performance of the GTS allocation mechanism in IEEE 802.15.4. The analysis gives a full understanding of the behavior of the GTS mechanism with regard to delay and throughput metrics. First, we propose two accurate models of service curves for a GTS allocation as a function of the IEEE 802.15.4 parameters. We then evaluate the delay bounds guaranteed by an allocation of a GTS using the Network Calculus formalism. Finally, based on the analytic results, we analyze the impact of the IEEE 802.15.4 parameters on the throughput and delay bound guaranteed by a GTS allocation. The results of this work pave the way for an efficient dimensioning of an IEEE 802.15.4 cluster.
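
To make the Network Calculus step concrete, here is a minimal sketch of the standard delay bound for a generic case. It is not one of the two service curve models proposed in the paper. It assumes a token-bucket arrival curve alpha(t) = b + r*t and a rate-latency service curve beta(t) = R * max(0, t - T), for which the delay bound is T + b/R whenever r <= R. The function name and the numbers are hypothetical and do not correspond to any IEEE 802.15.4 parameter setting.

```python
def delay_bound(burst, rate, service_rate, latency):
    """Network Calculus delay bound for a token-bucket flow.

    Arrival curve:  alpha(t) = burst + rate * t
    Service curve:  beta(t)  = service_rate * max(0, t - latency)
    The bound is the maximum horizontal distance between the two curves.
    """
    if rate > service_rate:
        # The backlog grows without limit, so no finite delay bound exists.
        raise ValueError("arrival rate exceeds guaranteed service rate")
    return latency + burst / service_rate


# Hypothetical values: 2000 bits of burst, 1000 bit/s sustained arrival rate,
# 4000 bit/s guaranteed service rate, 0.1 s service latency.
bound = delay_bound(burst=2000.0, rate=1000.0, service_rate=4000.0, latency=0.1)
print(f"delay bound: {bound:.3f} s")
```

In the setting of the abstract, the guaranteed rate and latency would be derived from the GTS allocation and the superframe parameters; here they are simply given as inputs.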

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Decimal multiplication is an integral part of financial, commercial, and internet-based computations. A novel design for single digit decimal multiplication that reduces the critical path delay and area for an iterative multiplier is proposed in this research. The partial products are generated using single digit multipliers, and are accumulated based on a novel RPS algorithm. This design uses n single digit multipliers for an n × n multiplication. The latency for the multiplication of two n-digit Binary Coded Decimal (BCD) operands is (n + 1) cycles and a new multiplication can begin every n cycles. The accumulation of the final partial products and the first iteration of partial product generation for the next set of inputs are done simultaneously. This iterative decimal multiplier offers low latency and high throughput, and can be extended for decimal floating-point multiplication.
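
The iterative scheme can be illustrated with a small software model. This is a sketch under stated assumptions rather than the proposed hardware design: it models n single digit multipliers that each produce a (tens, units) pair, processes one multiplier digit per iteration, and accumulates with ordinary integer addition in place of the paper's RPS algorithm, which the abstract does not describe. It does not model cycle timing or the overlap between consecutive multiplications. All function names are hypothetical.

```python
def digit_products(a_digits, b_digit):
    """Model n single digit multipliers; each yields a (tens, units) pair."""
    return [divmod(a * b_digit, 10) for a in a_digits]


def iterative_decimal_multiply(a, b, n):
    """Multiply two non-negative n-digit decimals, one multiplier digit per step."""
    if not (0 <= a < 10**n and 0 <= b < 10**n):
        raise ValueError("operands must fit in n decimal digits")
    # Split operands into decimal digits, least significant digit first.
    a_digits = [(a // 10**k) % 10 for k in range(n)]
    b_digits = [(b // 10**k) % 10 for k in range(n)]
    acc = 0
    for i, b_digit in enumerate(b_digits):  # one iteration per multiplier digit
        partial = 0
        for k, (tens, units) in enumerate(digit_products(a_digits, b_digit)):
            # Each digit product is weighted by its multiplicand digit position.
            partial += (tens * 10 + units) * 10**k
        acc += partial * 10**i  # shift by the multiplier digit position
    return acc


print(iterative_decimal_multiply(1234, 5678, 4))
```

Each iteration uses exactly n digit products, which mirrors the abstract's use of n single digit multipliers for an n × n multiplication.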

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The paper presents a design for a hardware genetic algorithm which uses a pipeline of systolic arrays. These arrays have been designed using systolic synthesis techniques which involve expressing the algorithm as a set of uniform recurrence relations. The final design divorces the fitness function evaluation from the hardware and can process chromosomes of different lengths, giving the design a generic quality. The paper demonstrates the design methodology by progressively re-writing a simple genetic algorithm, expressed in C code, into a form from which systolic structures can be deduced. This paper extends previous work by introducing a simplification to a previous systolic design for the genetic algorithm. The simplification results in the removal of 2N^2 + 4N cells and reduces the time complexity by 3N + 1 cycles.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This paper presents a configurable architecture which was designed to aid in the simulation of ULSI circuits at the transistor level. Elsewhere [1] this architecture was shown to be able to run such simulations several times as fast as standard circuit simulators such as SPICES. In this paper, after describing the overall idea and the architecture of the system as a whole, I concentrate on the description of the architecture of the processing elements of the computing array.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The 4CaaSt project aims at developing a PaaS framework that enables flexible definition, marketing, deployment and management of Cloud-based services and applications. The major innovations proposed by 4CaaSt are the blueprint and its management and lifecycle, a one-stop shop for Cloud services, and the management of resources at the PaaS level (including elasticity). 4CaaSt also provides a portfolio of ready-to-use Cloud-native services and Cloud-aware immigrant technologies.