Biblioteca Digital

904 resultados para Parallel and Distributed Processing

A novel FPGA-based evolvable hardware system based on multiple processing arrays

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In this paper, an architecture based on a scalable and flexible set of Evolvable Processing arrays is presented. FPGA-native Dynamic Partial Reconfiguration (DPR) is used for evolution, which is done intrinsically, letting the system to adapt autonomously to variable run-time conditions, including the presence of transient and permanent faults. The architecture supports different modes of operation, namely: independent, parallel, cascaded or bypass mode. These modes of operation can be used during evolution time or during normal operation. The evolvability of the architecture is combined with fault-tolerance techniques, to enhance the platform with self-healing features, making it suitable for applications which require both high adaptability and reliability. Experimental results show that such a system may benefit from accelerated evolution times, increased performance and improved dependability, mainly by increasing fault tolerance for transient and permanent faults, as well as providing some fault identification possibilities. The evolvable HW array shown is tailored for window-based image processing applications.

Parallel and Streaming Truth Discovery in Large-Scale Quantitative Crowdsourcing

Relevância:

100.00% 100.00%

Publicador:

Resumo:

ACKNOWLEDGEMENTS This research is based upon work supported in part by the U.S. ARL and U.K. Ministry of Defense under Agreement Number W911NF-06-3-0001, and by the NSF under award CNS-1213140. Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views or represent the official policies of the NSF, the U.S. ARL, the U.S. Government, the U.K. Ministry of Defense or the U.K. Government. The U.S. and U.K. Governments are authorized to reproduce and distribute reprints for Government purposes notwithstanding any copyright notation hereon.

Parallel and Streaming Truth Discovery in Large-Scale Quantitative Crowdsourcing

Relevância:

100.00% 100.00%

Publicador:

Resumo:

ACKNOWLEDGEMENTS This research is based upon work supported in part by the U.S. ARL and U.K. Ministry of Defense under Agreement Number W911NF-06-3-0001, and by the NSF under award CNS-1213140. Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views or represent the official policies of the NSF, the U.S. ARL, the U.S. Government, the U.K. Ministry of Defense or the U.K. Government. The U.S. and U.K. Governments are authorized to reproduce and distribute reprints for Government purposes notwithstanding any copyright notation hereon.

Issues and strategies in the parallelisation of unstructured multiphysics code

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Unstructured mesh based codes for the modelling of continuum physics phenomena have evolved to provide the facility to model complex interacting systems. Such codes have the potential to provide a high performance on parallel platforms for a small investment in programming. The critical parameters for success are to minimise changes to the code to allow for maintenance while providing high parallel efficiency, scalability to large numbers of processors and portability to a wide range of platforms. The paradigm of domain decomposition with message passing has for some time been demonstrated to provide a high level of efficiency, scalability and portability across shared and distributed memory systems without the need to re-author the code into a new language. This paper addresses these issues in the parallelisation of a complex three dimensional unstructured mesh Finite Volume multiphysics code and discusses the implications of automating the parallelisation process.

From task scheduling in single processor environments to message scheduling in a PROFIBUS fieldbus network

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In this paper we survey the most relevant results for the prioritybased schedulability analysis of real-time tasks, both for the fixed and dynamic priority assignment schemes. We give emphasis to the worst-case response time analysis in non-preemptive contexts, which is fundamental for the communication schedulability analysis. We define an architecture to support priority-based scheduling of messages at the application process level of a specific fieldbus communication network, the PROFIBUS. The proposed architecture improves the worst-case messages’ response time, overcoming the limitation of the first-come-first-served (FCFS) PROFIBUS queue implementations.

Competitive analysis of partitioned scheduling on uniform multiprocessors

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Consider the problem of scheduling a set of sporadically arriving tasks on a uniform multiprocessor with the goal of meeting deadlines. A processor p has the speed Sp. Tasks can be preempted but they cannot migrate between processors. We propose an algorithm which can schedule all task sets that any other possible algorithm can schedule assuming that our algorithm is given processors that are three times faster.

Autonomic workflow activities: the award framework

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Workflows have been successfully applied to express the decomposition of complex scientific applications. This has motivated many initiatives that have been developing scientific workflow tools. However the existing tools still lack adequate support to important aspects namely, decoupling the enactment engine from workflow tasks specification, decentralizing the control of workflow activities, and allowing their tasks to run autonomous in distributed infrastructures, for instance on Clouds. Furthermore many workflow tools only support the execution of Direct Acyclic Graphs (DAG) without the concept of iterations, where activities are executed millions of iterations during long periods of time and supporting dynamic workflow reconfigurations after certain iteration. We present the AWARD (Autonomic Workflow Activities Reconfigurable and Dynamic) model of computation, based on the Process Networks model, where the workflow activities (AWA) are autonomic processes with independent control that can run in parallel on distributed infrastructures, e. g. on Clouds. Each AWA executes a Task developed as a Java class that implements a generic interface allowing end-users to code their applications without concerns for low-level details. The data-driven coordination of AWA interactions is based on a shared tuple space that also enables support to dynamic workflow reconfiguration and monitoring of the execution of workflows. We describe how AWARD supports dynamic reconfiguration and discuss typical workflow reconfiguration scenarios. For evaluation we describe experimental results of AWARD workflow executions in several application scenarios, mapped to a small dedicated cluster and the Amazon (Elastic Computing EC2) Cloud.

GTS allocation analysis in IEEE 802.15.4 for real-time wireless sensor networks

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The IEEE 802.15.4 protocol proposes a flexible communication solution for Low-Rate Wireless Personal Area Networks including sensor networks. It presents the advantage to fit different requirements of potential applications by adequately setting its parameters. When enabling its beacon mode, the protocol makes possible real-time guarantees by using its Guaranteed Time Slot (GTS) mechanism. This paper analyzes the performance of the GTS allocation mechanism in IEEE 802.15.4. The analysis gives a full understanding of the behavior of the GTS mechanism with regards to delay and throughput metrics. First, we propose two accurate models of service curves for a GTS allocation as a function of the IEEE 802.15.4 parameters. We then evaluate the delay bounds guaranteed by an allocation of a GTS using Network Calculus formalism. Finally, based on the analytic results, we analyze the impact of the IEEE 802.15.4 parameters on the throughput and delay bound guaranteed by a GTS allocation. The results of this work pave the way for an efficient dimensioning of an IEEE 802.15.4 cluster.

Fixed Point Decimal Multiplication Using RPS Algorithm

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Decimal multiplication is an integral part of financial, commercial, and internet-based computations. A novel design for single digit decimal multiplication that reduces the critical path delay and area for an iterative multiplier is proposed in this research. The partial products are generated using single digit multipliers, and are accumulated based on a novel RPS algorithm. This design uses n single digit multipliers for an n × n multiplication. The latency for the multiplication of two n-digit Binary Coded Decimal (BCD) operands is (n + 1) cycles and a new multiplication can begin every n cycle. The accumulation of final partial products and the first iteration of partial product generation for next set of inputs are done simultaneously. This iterative decimal multiplier offers low latency and high throughput, and can be extended for decimal floating-point multiplication.

Synthesis of a systolic array genetic algorithm

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The paper presents a design for a hardware genetic algorithm which uses a pipeline of systolic arrays. These arrays have been designed using systolic synthesis techniques which involve expressing the algorithm as a set of uniform recurrence relations. The final design divorces the fitness function evaluation from the hardware and can process chromosomes of different lengths, giving the design a generic quality. The paper demonstrates the design methodology by progressively re-writing a simple genetic algorithm, expressed in C code, into a form from which systolic structures can be deduced. This paper extends previous work by introducing a simplification to a previous systolic design for the genetic algorithm. The simplification results in the removal of 2N 2 + 4N cells and reduces the time complexity by 3N + 1 cycles.

Parallelisation of the decision based network optimisation algorithms

Relevância:

100.00% 100.00%

Publicador:

Experiences using reconfigurable FPGAs in implementing Monte-Carlo methods

Relevância:

100.00% 100.00%

Publicador:

Accelerating JPEG compression with a dynamically reconfigurable FPGA systolic array

Relevância:

100.00% 100.00%

Publicador:

A configurable approach to circuit simulation

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper presents a configurable architecture which was designed to aid in the simulation of ULSI circuits at the transistor level. Elsewhere [1] this architecture was shown to be able to run such simulations several times as fast as standard circuit simulators such as SPICES. In this paper, after describing the overall idea and the the architecture of the system as a whole, I concentrate on the description of the architecture of the processing elements of the computing array.

4CaaSt: Comprehensive management of Cloud services through a PaaS

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The 4CaaSt project aims at developing a PaaS framework that enables flexible definition, marketing, deployment and management of Cloud-based services and applications. The major innovations proposed by 4CaaSt are the blueprint and its management and lifecycle, a one stop shop for Cloud services and the management of resources in the PaaS level (including elasticity). 4CaaSt also provides a portfolio of ready to use Cloud native services and Cloud- aware immigrant technologies.

«
1
2
3
4
5
6
7
8
...
60
61
»