905 results for heterogeneous platforms
Abstract:
The emergence of programmable logic devices as processing platforms for digital signal processing applications poses challenges concerning rapid implementation and high-level optimization of algorithms on these platforms. This paper describes Abhainn, a rapid implementation methodology and toolsuite for translating an algorithmic expression of the system to a working implementation on a heterogeneous multiprocessor/field programmable gate array platform, or a standalone system-on-programmable-chip solution. Two particular focuses for Abhainn are the automated but configurable realisation of inter-processor communication fabrics, and the establishment of novel dedicated hardware component design methodologies allowing algorithm-level transformation for system optimization. This paper outlines the approaches employed in both these areas.
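The abstract does not include Abhainn's internals; purely as an illustration of the general idea, here is a minimal Python sketch (all names hypothetical) of a dataflow graph whose actors are bound to CPU or FPGA resources, with the inter-processor links derived automatically from the mapping:

```python
# Hypothetical sketch: a DSP system expressed as a dataflow graph, with each
# actor bound to a processing element. A tool in this style derives the
# communication fabric from the mapping rather than having it hand-written.
actors = ["src", "fir", "fft", "sink"]
edges = [("src", "fir"), ("fir", "fft"), ("fft", "sink")]
mapping = {"src": "cpu0", "fir": "fpga", "fft": "fpga", "sink": "cpu0"}

# Edges crossing a CPU/FPGA boundary need an inter-processor channel;
# same-device edges can use local buffers.
for a, b in edges:
    kind = "local queue" if mapping[a] == mapping[b] else "CPU<->FPGA fabric"
    print(f"{a} -> {b}: {kind}")
```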
Abstract:
Heterogeneous computing technologies, such as multi-core CPUs, GPUs and FPGAs, can provide significant performance improvements. However, developing applications for these technologies often results in coupling applications to specific devices, typically through the use of proprietary tools. This paper presents SHEPARD, a compile-time and run-time framework that decouples application development from the target platform and enables run-time allocation of tasks to heterogeneous computing devices. Through the use of special annotated functions, called managed tasks, SHEPARD approximates a task's performance on available devices and, coupled with an approximation of current device demand, decides which device can satisfy the task with the lowest overall execution time. Experiments using a task-parallel application, based on an in-memory database, demonstrate the opportunity for automatic run-time task allocation to achieve speed-up over a static allocation to a single specific device. © 2014 IEEE.
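As a rough illustration of the allocation decision described above (not SHEPARD's actual API; all names and costs are invented), a managed task can be placed on the device minimising its estimated execution time plus the device's current queued demand:

```python
# Illustrative sketch of run-time task placement in the SHEPARD style:
# each device tracks an estimate of already-queued work, and a managed
# task goes to the device with the lowest predicted completion time.
from dataclasses import dataclass, field

@dataclass
class Device:
    name: str
    pending: float = 0.0          # estimated seconds of queued work

@dataclass
class ManagedTask:
    name: str
    cost: dict = field(default_factory=dict)  # per-device time estimates

def allocate(task: ManagedTask, devices: list[Device]) -> Device:
    """Pick the device minimising estimated cost + current demand."""
    best = min(devices, key=lambda d: task.cost[d.name] + d.pending)
    best.pending += task.cost[best.name]      # account for the new work
    return best

devices = [Device("cpu"), Device("gpu"), Device("fpga")]
task = ManagedTask("scan", cost={"cpu": 0.9, "gpu": 0.2, "fpga": 0.4})
print(allocate(task, devices).name)           # "gpu" while its queue is short
```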
Abstract:
Heterogeneous multicore platforms are becoming an interesting alternative for embedded computing systems with a limited power supply, as they can execute specific tasks in an efficient manner. Nonetheless, one of the main challenges of such platforms consists of optimising the energy consumption in the presence of temporal constraints. This paper addresses the problem of task-to-core allocation onto heterogeneous multicore platforms such that the overall energy consumption of the system is minimised. To this end, we propose a two-phase approach that considers both dynamic and leakage energy consumption: (i) the first phase allocates tasks to the cores such that the dynamic energy consumption is reduced; (ii) the second phase refines the allocation performed in the first phase in order to achieve better sleep states, by trading off the dynamic energy consumption against the reduction in leakage energy consumption. This hybrid approach considers core frequency set-points, task energy consumption and sleep states of the cores to reduce the energy consumption of the system. Major value has been placed on a realistic power model, which increases the practical relevance of the proposed approach. Finally, extensive simulations have been carried out to demonstrate the effectiveness of the proposed algorithm. In the best case, energy savings of up to 18% are achieved over the first-fit algorithm, which has been shown in previous work to perform better than other bin-packing heuristics for the target heterogeneous multicore platform.
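A hedged sketch of the two-phase idea follows, with invented energy numbers and a deliberately simplified leakage model rather than the paper's realistic one:

```python
# dynamic_energy[task][core]: energy (mJ, illustrative) to run `task` on `core`
dynamic_energy = {
    "t1": {"big": 5.0, "little": 9.0},
    "t2": {"big": 6.0, "little": 4.0},
}
LEAKAGE_SAVED_PER_SLEEPING_CORE = 6.0   # assumed, not from the paper

# Phase 1: place each task on its dynamic-energy-optimal core.
alloc = {t: min(costs, key=costs.get) for t, costs in dynamic_energy.items()}

# Phase 2: greedily try to empty a core so it can sleep, accepting a
# dynamic-energy penalty when the leakage saving outweighs it.
asleep = set()
for core in ["little", "big"]:
    resident = [t for t, c in alloc.items() if c == core]
    targets = [c for c in ["little", "big"] if c != core and c not in asleep]
    if not resident or not targets:
        continue
    moves = {t: min(targets, key=lambda c: dynamic_energy[t][c])
             for t in resident}
    penalty = sum(dynamic_energy[t][moves[t]] - dynamic_energy[t][core]
                  for t in resident)
    if penalty < LEAKAGE_SAVED_PER_SLEEPING_CORE:
        alloc.update(moves)
        asleep.add(core)

print(alloc, "asleep:", asleep)  # t2 migrates to "big" so "little" can sleep
```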
Abstract:
In this paper, we propose a design paradigm for energy-efficient and variation-aware operation of next-generation multicore heterogeneous platforms. The main idea behind the proposed approach lies in the observation that not all operations are equally important in shaping the output quality of various applications and of the overall system. Based on this observation, we suggest that all levels of the software design stack, including the programming model, compiler, operating system (OS) and run-time system, should identify the critical tasks and ensure correct operation of such tasks by assigning them to dynamically adjusted reliable cores/units. Specifically, based on error rates and operating conditions identified by a sense-and-adapt (SeA) unit, the OS selects and sets the right mode of operation of the overall system. The run-time system identifies the critical/less-critical tasks based on special directives and schedules them to the appropriate units, which are dynamically adjusted for highly-accurate/approximate operation by tuning their voltage/frequency. Units that execute less significant operations can, if required, operate at voltages below what is needed for fully correct operation and thus consume less power, since such tasks, unlike the critical ones, do not always need to be exact. Such a scheme can lead to energy-efficient and reliable operation, while reducing the design cost and overheads of conventional circuit/micro-architecture level techniques.
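For illustration only, a minimal sketch (assumed voltage points and task directives, not the paper's actual run-time system) of criticality-driven dispatch to reliable versus approximate cores:

```python
# Sketch of significance-driven scheduling: tasks carry a criticality
# directive and are dispatched either to cores held at a reliable
# voltage/frequency point or to cores scaled below the safe margin,
# where occasional errors are tolerated. Values are illustrative.
from dataclasses import dataclass

@dataclass
class Core:
    cid: int
    vdd: float        # supply voltage (V), assumed tunable per core

RELIABLE_VDD = 1.0    # assumed reliable operating point
APPROX_VDD = 0.7      # below-nominal: cheaper, occasionally inexact

def schedule(tasks, cores):
    reliable = [c for c in cores if c.vdd >= RELIABLE_VDD]
    approx = [c for c in cores if c.vdd < RELIABLE_VDD]
    plan = {}
    for i, task in enumerate(tasks):
        pool = reliable if task["critical"] else (approx or reliable)
        plan[task["name"]] = pool[i % len(pool)].cid  # round-robin in pool
    return plan

cores = [Core(0, RELIABLE_VDD), Core(1, RELIABLE_VDD),
         Core(2, APPROX_VDD), Core(3, APPROX_VDD)]
tasks = [{"name": "control-step", "critical": True},
         {"name": "pixel-pass", "critical": False}]
print(schedule(tasks, cores))  # critical work stays on the reliable cores
```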
Abstract:
As multimedia-enabled mobile devices such as smart phones and tablets are becoming the day-to-day computing device of choice for users of all ages, everyone expects that all mobile multimedia applications and services should be as smooth and as high-quality as the desktop experience. The grand challenge in delivering multimedia to mobile devices using the Internet is to ensure the quality of experience that meets the users' expectations, within reasonable costs, while supporting heterogeneous platforms and wireless network conditions. This book aims to provide a holistic overview of the current and future technologies used for delivering high-quality mobile multimedia applications, while focusing on user experience as the key requirement. The book opens with a section dealing with the challenges in mobile video delivery as one of the most bandwidth-intensive media that requires smooth streaming and a user-centric strategy to ensure quality of experience. The second section addresses this challenge by introducing some important concepts for future mobile multimedia coding and the network technologies to deliver quality services. The last section combines the user and technology perspectives by demonstrating how user experience can be measured using case studies on urban community interfaces and Internet telephones.
Abstract:
In the Biodiversity World (BDW) project we have created a flexible and extensible Web Services-based Grid environment for biodiversity researchers to solve problems in biodiversity and analyse biodiversity patterns. In this environment, heterogeneous and globally distributed biodiversity-related resources such as data sets and analytical tools are made available to be accessed and assembled by users into workflows to perform complex scientific experiments. One such experiment is bioclimatic modelling of the geographical distribution of individual species using climate variables in order to predict past and future climate-related changes in species distribution. Data sources and analytical tools required for such analysis of species distribution are widely dispersed, available on heterogeneous platforms, present data in different formats and lack interoperability. The BDW system brings all these disparate units together so that the user can combine tools with little thought as to their availability, data formats and interoperability. The current Web Services-based Grid environment enables execution of the BDW workflow tasks in remote nodes but with a limited scope. The next step in the evolution of the BDW architecture is to enable workflow tasks to utilise computational resources available within and outside the BDW domain. We describe the present BDW architecture and its transition to a new framework which provides a distributed computational environment for mapping and executing workflows in addition to bringing together heterogeneous resources and analytical tools.
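The BDW middleware itself is not detailed in the abstract; the following Python sketch (hypothetical endpoints and a stand-in invoker) only illustrates the workflow idea of chaining distributed tools behind a uniform interface:

```python
# A workflow as an ordered list of steps, each naming a remote resource
# (data set or analytical tool); an adapter hides its format and location.
# Endpoints below are hypothetical, not real BDW services.
workflow = [
    ("fetch-occurrences", "http://example.org/bdw/species-data"),
    ("fetch-climate",     "http://example.org/bdw/climate-layers"),
    ("bioclim-model",     "http://example.org/bdw/modelling-tool"),
]

def run(workflow, invoke):
    """Run steps in order, passing each result to the next tool."""
    data = None
    for step, endpoint in workflow:
        data = invoke(endpoint, data)  # e.g. a SOAP/REST call in practice
    return data

# A stand-in invoker so the sketch runs without any network access:
result = run(workflow,
             lambda url, d: (d + " -> " if d else "") + url.rsplit("/", 1)[-1])
print(result)   # species-data -> climate-layers -> modelling-tool
```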
Abstract:
The proposed tool, known as WSPControl, enables remote monitoring of computers across the Internet using distributed applications. A Web Services architecture makes communication between these distributed applications possible across heterogeneous platforms, and eliminates the need for additional network configuration, such as opening ports or setting up a proxy. The tool is divided into three modules, namely: • Client Interface: developed in C#, it captures performance data from the monitored computer and connects to the Web Services to report this data. • Web Services Interface: developed in PHP using the PHP SOAP library, it mediates the communication between the client and Internet applications. • Internet Interface: developed in PHP, it reads and interprets the captured information and makes it available on the Internet.
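WSPControl's client is written in C# and its services in PHP; the Python sketch below (hypothetical endpoint, plain HTTP standing in for the SOAP call) mirrors only the client module's role of sampling performance counters and reporting them outbound, which is what avoids opening ports or reconfiguring proxies:

```python
# Minimal stand-in for the client module: sample performance counters and
# push them to the monitoring service. The endpoint URL is hypothetical,
# and an HTTP POST stands in for the real client's SOAP call; outbound
# HTTP is what lets the tool cross proxies without inbound port openings.
import json
import urllib.request
import psutil   # third-party: pip install psutil

ENDPOINT = "http://example.org/wspcontrol/report"   # hypothetical

def sample():
    return {
        "cpu_percent": psutil.cpu_percent(interval=1.0),
        "mem_percent": psutil.virtual_memory().percent,
    }

def report(metrics):
    req = urllib.request.Request(
        ENDPOINT, data=json.dumps(metrics).encode(),
        headers={"Content-Type": "application/json"})
    urllib.request.urlopen(req)

report(sample())
```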
Abstract:
In this paper we advocate the Loop-of-stencil-reduce pattern as a way to simplify the parallel programming of heterogeneous platforms (multicore+GPUs). Loop-of-stencil-reduce is general enough to subsume map, reduce, map-reduce, stencil, stencil-reduce and, crucially, their usage in a loop. It transparently targets (by using OpenCL) combinations of CPU cores and GPUs, and it makes it possible to simplify the deployment of a single stencil computation kernel on different GPUs. The paper discusses the implementation of Loop-of-stencil-reduce within the FastFlow parallel framework, using a simple iterative data-parallel application (Game of Life) as a running example and a highly effective parallel filter for visual data restoration to assess performance. Thanks to the high-level design of the Loop-of-stencil-reduce pattern, it was possible to run the filter seamlessly on a multicore machine, on multiple GPUs, and on both.
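The pattern can be pictured without FastFlow or OpenCL; here is a plain NumPy sketch of Loop-of-stencil-reduce on the paper's own running example, Game of Life, where a stencil step and a reduction repeat inside a loop until a convergence condition holds:

```python
# Loop-of-stencil-reduce in miniature: the stencil updates every cell from
# its neighbourhood, the reduce folds the grid to a termination value, and
# both repeat in a loop (FastFlow/OpenCL deployment details omitted).
import numpy as np

def stencil(grid):
    # 8-neighbour count via shifted copies of the toroidal grid
    n = sum(np.roll(np.roll(grid, dy, 0), dx, 1)
            for dy in (-1, 0, 1) for dx in (-1, 0, 1)
            if (dy, dx) != (0, 0))
    # Life rule: birth on 3 neighbours, survival on 2 or 3
    return ((n == 3) | ((grid == 1) & (n == 2))).astype(np.uint8)

rng = np.random.default_rng(0)
grid = (rng.random((64, 64)) < 0.3).astype(np.uint8)

for it in range(100):                    # the "loop" of the pattern
    new = stencil(grid)                  # data-parallel stencil step
    changed = int(np.sum(new != grid))   # reduction step
    grid = new
    if changed == 0:                     # loop condition from the reduce
        break
print(f"stopped after {it + 1} iterations, {int(grid.sum())} live cells")
```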
Abstract:
In this paper, we develop a fast implementation of a hyperspectral coded aperture (HYCA) algorithm on different platforms using OpenCL, an open standard for parallel programming on heterogeneous systems, which encompasses a wide variety of devices, from dense multicore systems from major manufacturers such as Intel or ARM to accelerators such as graphics processing units (GPUs), field programmable gate arrays (FPGAs), the Intel Xeon Phi and other custom devices. Our proposed implementation of HYCA significantly reduces its computational cost. Our experiments, conducted using simulated data, reveal considerable acceleration factors. Implementations of this kind, written in the same descriptive language for different architectures, are very important in order to realistically assess the possibility of using heterogeneous platforms for efficient hyperspectral image processing in real remote sensing missions.
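HYCA itself is not reproduced here; the sketch below (a trivial scaling kernel via pyopencl) only demonstrates the portability mechanism the abstract relies on, namely one OpenCL kernel source submitted unchanged to every device the host exposes:

```python
# Run the same OpenCL kernel source on every available device (CPU, GPU,
# accelerator) and verify the result; the kernel is a stand-in, not HYCA.
import numpy as np
import pyopencl as cl   # third-party: pip install pyopencl

SRC = """
__kernel void scale(__global const float *x, __global float *y, float a) {
    int i = get_global_id(0);
    y[i] = a * x[i];
}
"""

x = np.arange(1024, dtype=np.float32)
for platform in cl.get_platforms():
    for dev in platform.get_devices():
        ctx = cl.Context([dev])
        queue = cl.CommandQueue(ctx)
        prog = cl.Program(ctx, SRC).build()
        mf = cl.mem_flags
        xb = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=x)
        yb = cl.Buffer(ctx, mf.WRITE_ONLY, x.nbytes)
        prog.scale(queue, x.shape, None, xb, yb, np.float32(2.0))
        y = np.empty_like(x)
        cl.enqueue_copy(queue, y, yb)
        print(dev.name, np.allclose(y, 2 * x))
```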
Abstract:
Stochastic volatility models are of fundamental importance to the pricing of derivatives. One of the most commonly used models of stochastic volatility is the Heston model, in which the price and volatility of an asset evolve as a pair of coupled stochastic differential equations. The computation of asset prices and volatilities involves the simulation of many sample trajectories with conditioning. The problem is treated using the method of particle filtering. While the simulation of a shower of particles is computationally expensive, each particle behaves independently, making such simulations ideal for massively parallel heterogeneous computing platforms. In this paper, we present our portable OpenCL implementation of the Heston model and discuss its performance and efficiency characteristics on a range of architectures including Intel CPUs, NVIDIA GPUs, and Intel Many Integrated Core (MIC) accelerators.
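The authors' OpenCL code is not shown in the abstract; as a sketch of the underlying computation, the Heston dynamics dS = μS dt + √v S dW₁ and dv = κ(θ − v) dt + ξ√v dW₂, with corr(dW₁, dW₂) = ρ, can be simulated by Euler-Maruyama over many independent paths (parameters below are illustrative). The independence of the paths is exactly what suits massively parallel devices:

```python
# Euler-Maruyama simulation of many independent Heston trajectories
# (full-truncation scheme keeps the variance process usable when it
# dips below zero). All parameter values are illustrative.
import numpy as np

mu, kappa, theta, xi, rho = 0.05, 2.0, 0.04, 0.3, -0.7
S0, v0, T, steps, paths = 100.0, 0.04, 1.0, 252, 100_000
dt = T / steps

rng = np.random.default_rng(42)
S = np.full(paths, S0)
v = np.full(paths, v0)
for _ in range(steps):
    z1 = rng.standard_normal(paths)
    z2 = rho * z1 + np.sqrt(1 - rho**2) * rng.standard_normal(paths)
    vp = np.maximum(v, 0.0)                  # full truncation
    S += mu * S * dt + np.sqrt(vp * dt) * S * z1
    v += kappa * (theta - vp) * dt + xi * np.sqrt(vp * dt) * z2

print("mean terminal price:", S.mean())      # ~ S0 * exp(mu * T)
```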
Abstract:
CMPs enable the simultaneous execution of multiple applications on the same platform, sharing cache resources. Diversity in the cache access patterns of these simultaneously executing applications can potentially trigger inter-application interference, leading to cache pollution. Whereas a large cache can ameliorate this problem, the larger power consumption that comes with increasing cache size, amplified at sub-100nm technologies, makes this solution prohibitive. In this paper, in order to address issues relating to the power-aware performance of caches, we propose a caching structure that provides: (1) definition of application-specific cache partitions as aggregations of caching units (molecules), where the parameters of each molecule, namely size, associativity and line size, are chosen so that its power consumption and access time are optimal for the given technology; (2) application-specific resizing of cache partitions, with variable and adaptive associativity per cache line, way size and line size; (3) a replacement policy that is transparent to the partition in terms of size and heterogeneity in associativity and line size. Through simulation studies we establish the superiority of molecular caches (caches built as aggregations of molecules), which offer a 29% power advantage over an equivalently performing traditional cache.
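Molecular caches are hardware structures; the Python sketch below captures only the organisational idea (all parameters invented): fixed, technology-optimised molecules aggregated into per-application partitions that resize by adding or removing molecules:

```python
# Sketch of the molecular-cache organisation: a molecule's parameters are
# fixed so that its power and access time are optimal for the technology,
# and a partition is just an aggregation of molecules per application.
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Molecule:
    size_kb: int
    associativity: int
    line_bytes: int

@dataclass
class Partition:
    app: str
    molecules: list = field(default_factory=list)

    def grow(self, m: Molecule):
        self.molecules.append(m)      # application-specific resizing

    @property
    def size_kb(self):
        return sum(m.size_kb for m in self.molecules)

m = Molecule(size_kb=8, associativity=2, line_bytes=32)  # assumed values
p = Partition("app0")
for _ in range(4):
    p.grow(m)
print(p.size_kb, "KB across", len(p.molecules), "molecules")
```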
Abstract:
The Alliance for Coastal Technologies (ACT) Workshop on Towed Vehicles: Undulating Platforms As Tools for Mapping Coastal Processes and Water Quality Assessment was convened February 5-7, 2007 at The Embassy Suites Hotel, Seaside, California, and sponsored by the ACT-Pacific Coast partnership at the Moss Landing Marine Laboratories (MLML). The towed undulating vehicle (TUV) workshop was co-chaired by Richard Burt (Chelsea Technology Group) and Stewart Lamerdin (MLML Marine Operations). Invited participants were selected to provide a uniform representation of academic researchers, private sector product developers, and existing and potential data product users from the resource management community, to enable the development of broad consensus opinions on the application of TUV platforms in coastal resource assessment and management.

The workshop was organized to address recognized limitations of point-based monitoring programs, which, while providing valuable data, are incapable of describing the spatial heterogeneity and the extent of features distributed in the bulk solution. This is particularly true as surveys approach the coastal zone, where tidal and estuarine influences result in spatially and temporally heterogeneous water masses and entrained biological components. Aerial or satellite-based remote sensing can provide an assessment of the areal extent of plumes and blooms, yet provides no information regarding the third dimension of these features. Towed vehicles offer a cost-effective solution to this problem by providing platforms which can sample in the horizontal, vertical, and time-based domains, and TUVs represent useful platforms for event-response characterization. The workshop reviewed the current status of towed vehicle technology, focusing on limitations of depth, data telemetry, instrument power demands, and ship requirements, in an attempt to identify means to incorporate such technology more routinely in monitoring and event-response programs. Specifically, the participants were charged to: (1) summarize the state of the art in TUV technologies; (2) identify how TUV platforms are used and how they can assist coastal managers in fulfilling their regulatory and management responsibilities; (3) identify barriers and challenges to the application of TUV technologies in management and research activities; and (4) recommend a series of community actions to overcome the identified barriers and challenges.

A series of plenary presentations was provided to inform the subsequent breakout discussions. Dave Nelson (University of Rhode Island) provided extensive summaries and a real-world assessment of the operational features of a variety of TUV platforms available in the UNOLS scientific fleet. Dr. Burke Hales (Oregon State University) described the modification of a TUV to provide a novel sampling platform for high-resolution mapping of chemical distributions in near real time. Dr. Sonia Batten (Sir Alister Hardy Foundation for Ocean Sciences) provided an overview of the deployment of specialized towed vehicles equipped with rugged continuous plankton recorders on ships of opportunity to obtain long-term, basin-wide surveys of zooplankton community structure, enhancing our understanding of trends in secondary production in the upper ocean.
The s-mote: a versatile heterogeneous multi-radio platform for wireless sensor networks applications
Abstract:
This paper presents a novel architecture, and its implementation, for a versatile, miniaturised mote which can communicate concurrently using a variety of combinations of ISM bands, has increased processing capability, and interoperates with mainstream GSM technology. All these features are integrated in a small form factor platform. The platform supports many configurations, which can satisfy a variety of application constraints. To the best of our knowledge, it is the first integrated platform of this type reported in the literature. The proposed platform opens the way for enhanced levels of Quality of Service (QoS) with respect to reliability, availability and latency, in addition to facilitating interoperability and power reduction compared to existing platforms. The small form factor also opens the potential for integration with other mobile platforms, including smartphones.