963 resultados para Hardware


Relevância:

20.00% 20.00%

Publicador:

Resumo:

International audience

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Im Rahmen der wissenschaftlichen Ausbildung sind Praktika vielerorts ein wichtiger Bestandteil der Lehre. Sie zeichnen sich im Regelfall dadurch aus, dass die Studierenden die gestellten Versuche an speziell ausgestatteten Laborplätzen durchführen, was neben extrem hohen Kosten zu einer Begrenzung der maximalen Teilnehmerzahl führt. In diesem Zusammenhang scheint es auf den ersten Blick nicht mÃglich, Konzepte einer Virtuellen Universität umzusetzen, da die Studierenden âžvor Ort✠sein müssen. In diesem Dokument stellen wir das so genannte Mobile Hardware-Praktikum vor, das den Studierenden die Teilnahme zu jeder Zeit und von jedem beliebigen Ort aus erlaubt und dennoch ein Gefühl der Präsenz im Labor vermittelt. Gleichzeitig kann weit mehr als 100 Studierenden die Teilnahme ermÃglicht werden. Erreicht wird dies durch ein speziell für diesen Zweck entwickeltes webbasiertes Learning Management System in Kombination mit Hardware-Komponenten, die einem voll ausgestatteten Labor-Arbeitsplatz entsprechen und den Teilnehmern für die Zeit des Praktikums auf Leihbasis zur Verfügung gestellt werden. Die Experimente werden von den teilnehmenden Gruppen in Eigenregie gelÃst und elektronisch abgegeben. Die Bewertung erfolgt ebenfalls elektronisch.(DIPF/Orig.)

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The philosophy of minimalism in robotics promotes gaining an understanding of sensing and computational requirements for solving a task. This minimalist approach lies in contrast to the common practice of first taking an existing sensory motor system, and only afterwards determining how to apply the robotic system to the task. While it may seem convenient to simply apply existing hardware systems to the task at hand, this design philosophy often proves to be wasteful in terms of energy consumption and cost, along with unnecessary complexity and decreased reliability. While impressive in terms of their versatility, complex robots such as the PR2 (which cost hundreds of thousands of dollars) are impractical for many common applications. Instead, if a specific task is required, sensing and computational requirements can be determined specific to that task, and a clever hardware implementation can be built to accomplish the task. Since this minimalist hardware would be designed around accomplishing the specified task, significant reductions in hardware complexity can be obtained. This can lead to huge advantages in battery life, cost, and reliability. Even if cost is of no concern, battery life is often a limiting factor in many applications. Thus, a minimalist hardware system is critical in achieving the system requirements. In this thesis, we will discuss an implementation of a counting, tracking, and actuation system as it relates to ergodic bodies to illustrate a minimalist design methodology.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

New generation embedded systems demand high performance, efficiency and flexibility. Reconfigurable hardware can provide all these features. However the costly reconfiguration process and the lack of management support have prevented a broader use of these resources. To solve these issues we have developed a scheduler that deals with task-graphs at run-time, steering its execution in the reconfigurable resources while carrying out both prefetch and replacement techniques that cooperate to hide most of the reconfiguration delays. In our scheduling environment task-graphs are analyzed at design-time to extract useful information. This information is used at run-time to obtain near-optimal schedules, escaping from local-optimum decisions, while only carrying out simple computations. Moreover, we have developed a hardware implementation of the scheduler that applies all the optimization techniques while introducing a delay of only a few clock cycles. In the experiments our scheduler clearly outperforms conventional run-time schedulers based on As-Soon-As-Possible techniques. In addition, our replacement policy, specially designed for reconfigurable systems, achieves almost optimal results both regarding reuse and performance.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Reconfigurable platforms are a promising technology that offers an interesting trade-off between flexibility and performance, which many recent embedded system applications demand, especially in fields such as multimedia processing. These applications typically involve multiple ad-hoc tasks for hardware acceleration, which are usually represented using formalisms such as Data Flow Diagrams (DFDs), Data Flow Graphs (DFGs), Control and Data Flow Graphs (CDFGs) or Petri Nets. However, none of these models is able to capture at the same time the pipeline behavior between tasks (that therefore can coexist in order to minimize the application execution time), their communication patterns, and their data dependencies. This paper proves that the knowledge of all this information can be effectively exploited to reduce the resource requirements and the timing performance of modern reconfigurable systems, where a set of hardware accelerators is used to support the computation. For this purpose, this paper proposes a novel task representation model, named Temporal Constrained Data Flow Diagram (TCDFD), which includes all this information. This paper also presents a mapping-scheduling algorithm that is able to take advantage of the new TCDFD model. It aims at minimizing the dynamic reconfiguration overhead while meeting the communication requirements among the tasks. Experimental results show that the presented approach achieves up to 75% of resources saving and up to 89% of reconfiguration overhead reduction with respect to other state-of-the-art techniques for reconfigurable platforms.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Reconfigurable hardware can be used to build a multitasking system where tasks are assigned to HW resources at run-time according to the requirements of the running applications. These tasks are frequently represented as direct acyclic graphs and their execution is typically controlled by an embedded processor that schedules the graph execution. In order to improve the efficiency of the system, the scheduler can apply prefetch and reuse techniques that can greatly reduce the reconfiguration latencies. For an embedded processor all these computations represent a heavy computational load that can significantly reduce the system performance. To overcome this problem we have implemented a HW scheduler using reconfigurable resources. In addition we have implemented both prefetch and replacement techniques that obtain as good results as previous complex SW approaches, while demanding just a few clock cycles to carry out the computations. We consider that the HW cost of the system (in our experiments 3% of a Virtex-II PRO xc2vp30 FPGA) is affordable taking into account the great efficiency of the techniques applied to hide the reconfiguration latency and the negligible run-time penalty introduced by the scheduler computations.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In the multi-core CPU world, transactional memory (TM)has emerged as an alternative to lock-based programming for thread synchronization. Recent research proposes the use of TM in GPU architectures, where a high number of computing threads, organized in SIMT fashion, requires an effective synchronization method. In contrast to CPUs, GPUs offer two memory spaces: global memory and local memory. The local memory space serves as a shared scratch-pad for a subset of the computing threads, and it is used by programmers to speed-up their applications thanks to its low latency. Prior work from the authors proposed a lightweight hardware TM (HTM) support based in the local memory, modifying the SIMT execution model and adding a conflict detection mechanism. An efficient implementation of these features is key in order to provide an effective synchronization mechanism at the local memory level. After a quick description of the main features of our HTM design for GPU local memory, in this work we gather together a number of proposals designed with the aim of improving those mechanisms with high impact on performance. Firstly, the SIMT execution model is modified to increase the parallelism of the application when transactions must be serialized in order to make forward progress. Secondly, the conflict detection mechanism is optimized depending on application characteristics, such us the read/write sets, the probability of conflict between transactions and the existence of read-only transactions. As these features can be present in hardware simultaneously, it is a task of the compiler and runtime to determine which ones are more important for a given application. This work includes a discussion on the analysis to be done in order to choose the best configuration solution.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

En esta tesis doctoral se presentan distintas soluciones para la adquisición de datos provenientes de matrices de sensores resistivos, y en concreto de sensores táctiles piezorresistivos. Los circuitos propuestos reducen el hardware de acondicionamiento y adquisición clásico, implementado una conexión directa entre el sensor y el dispositivo digital (FPGA) que recibe los datos. El objetivo es la adquisición en paralelo y con bajo coste y consumo de área de grandes cantidades de datos provenientes de los sensores matriciales, aprovechando las capacidades de las FPGAs para llevar a cabo medidas simultáneas de varios sensores. Dependiendo del tipo de direccionamiento que pueda ser empleado, dos soluciones son propuestas. En el caso donde el número de unidades sensoriales de la matriz no sea excesivamente alto y el direccionamiento pueda ser realizado sin compartir conexionado, el valor de resistencia de los distintos elementos de la matriz se obtiene a partir del tiempo de descarga de una red RC o integrador pasivo que incluye al sensor. Por otro lado, para matrices con un gran número de elementos o donde el direccionamiento de los mismos haga uso de conexiones compartidas, el uso de un circuito integrador activo reduce la diafonía entre los elementos medidos simultáneamente. El análisis y caracterización de los circuitos propuestos para un rango de resistencias de un sensor táctil piezorresistivo da lugar a una resolución efectiva en la conversión analógico-digital de 10 bits y 8 bits para los circuitos de conexión directa basados en el integrador pasivo y activo, respectivamente. En cuanto a la exactitud en la medida del valor de resistencia, se alcanzan errores relativos del 0,066% (integrador pasivo) y del 0,77% (integrador activo), empleando una novedosa técnica de calibración que hace uso de un único elemento de referencia. Por último, se propone una arquitectura para un sistema táctil basada en los circuitos anteriormente citados. Dos implementaciones se han desarrollado: un prototipo para caracterización y pruebas de laboratorio, y otro para un demostrador en una mano robótica comercial (mano de Barrett). Con estas realizaciones se comprueba que el sistema táctil es capaz de realizar el refresco del conjunto de sensores con una tasa lo suficientemente alta para aplicaciones que requieran una rápida respuesta dinámica (por ejemplo, detección de deslizamiento de objetos en tareas de manipulación con manos robóticas). Además, el paralelismo de las FPGAs no sólo se explota en la adquisición de datos, sino que el pre-procesado que puede realizarse en el sensor inteligente resultante tiene un gran potencial. Como ejemplo, en este trabajo se extraen los momentos geométricos y la elipse asociados a las imágenes táctiles adquiridas por cada uno de los sensores que conforman el sistema.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Current industry proposals for Hardware Transactional Memory (HTM) focus on best-effort solutions (BE-HTM) where hardware limits are imposed on transactions. These designs may show a significant performance degradation due to high contention scenarios and different hardware and operating system limitations that abort transactions, e.g. cache overflows, hardware and software exceptions, etc. To deal with these events and to ensure forward progress, BE-HTM systems usually provide a software fallback path to execute a lock-based version of the code. In this paper, we propose a hardware implementation of an irrevocability mechanism as an alternative to the software fallback path to gain insight into the hardware improvements that could enhance the execution of such a fallback. Our mechanism anticipates the abort that causes the transaction serialization, and stalls other transactions in the system so that transactional work loss is mini- mized. In addition, we evaluate the main software fallback path approaches and propose the use of ticket locks that hold precise information of the number of transactions waiting to enter the fallback. Thus, the separation of transactional and fallback execution can be achieved in a precise manner. The evaluation is carried out using the Simics/GEMS simulator and the complete range of STAMP transactional suite benchmarks. We obtain significant performance benefits of around twice the speedup and an abort reduction of 50% over the software fallback path for a number of benchmarks.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Los sistemas comerciales que ofrecen memoria transaccional (TM) implementan un sistema hardware best-effort (BE-HTM) con limitaciones. Es necesario programar un fallback software basado en cerrojos para asegurar el progreso de la aplicación. En este artículo se propone un nuevo tipo de irrevocabilidad hardware (un modo transaccional que marca las transacciones como no abortables) para hacer frente a las limitaciones de los sistemas BE-HTM de una manera mas eficiente, y para liberar a al usuario de tener que programar un fallback. Se basa en el concepto de suscripción relajada utilizada o en el contexto de la programación de fallbacks basada o en cerrojos, donde la transacción se suscribe al cerrojo al final de la misma en lugar de al principio. El mecanismo de irrevocabilidad relajada hardware no involucra cambios en el protocolo de coherencia y se compara con su homólogo software, que proponemos como un fallback con suscripción relajada de espera escapada. También proponemos la irrevocabilidad relajada con anticipación, un mecanismo que no se puede implementar en software, y que mejora el rendimiento de las aplicaciones con múltiples reemplazos de bloques transaccionales de caché. La evaluación de las propuestas se lleva a cabo con el simulador Simics/GEMS junto con la suite de benchmarks STAMP, y se obtiene una mejora de rendimiento sobre el fallback del 14% al 28% para algunos benchmarks.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Conventional vehicles are creating pollution problems, global warming and the extinction of high density fuels. To address these problems, automotive companies and universities are researching on hybrid electric vehicles where two different power devices are used to propel a vehicle. This research studies the development and testing of a dynamic model for Prius 2010 Hybrid Synergy Drive (HSD), a power-split device. The device was modeled and integrated with a hybrid vehicle model. To add an electric only mode for vehicle propulsion, the hybrid synergy drive was modified by adding a clutch to carrier 1. The performance of the integrated vehicle model was tested with UDDS drive cycle using rule-based control strategy. The dSPACE Hardware-In-the-Loop (HIL) simulator was used for HIL simulation test. The HIL simulation result shows that the integration of developed HSD dynamic model with a hybrid vehicle model was successful. The HSD model was able to split power and isolate engine speed from vehicle speed in hybrid mode.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Today, modern System-on-a-Chip (SoC) systems have grown rapidly due to the increased processing power, while maintaining the size of the hardware circuit. The number of transistors on a chip continues to increase, but current SoC designs may not be able to exploit the potential performance, especially with energy consumption and chip area becoming two major concerns. Traditional SoC designs usually separate software and hardware. Thus, the process of improving the system performance is a complicated task for both software and hardware designers. The aim of this research is to develop hardware acceleration workflow for software applications. Thus, system performance can be improved with constraints of energy consumption and on-chip resource costs. The characteristics of software applications can be identified by using profiling tools. Hardware acceleration can have significant performance improvement for highly mathematical calculations or repeated functions. The performance of SoC systems can then be improved, if the hardware acceleration method is used to accelerate the element that incurs performance overheads. The concepts mentioned in this study can be easily applied to a variety of sophisticated software applications. The contributions of SoC-based hardware acceleration in the hardware-software co-design platform include the following: (1) Software profiling methods are applied to H.264 Coder-Decoder (CODEC) core. The hotspot function of aimed application is identified by using critical attributes such as cycles per loop, loop rounds, etc. (2) Hardware acceleration method based on Field-Programmable Gate Array (FPGA) is used to resolve system bottlenecks and improve system performance. The identified hotspot function is then converted to a hardware accelerator and mapped onto the hardware platform. Two types of hardware acceleration methods â central bus design and co-processor design, are implemented for comparison in the proposed architecture. (3) System specifications, such as performance, energy consumption, and resource costs, are measured and analyzed. The trade-off of these three factors is compared and balanced. Different hardware accelerators are implemented and evaluated based on system requirements. 4) The system verification platform is designed based on Integrated Circuit (IC) workflow. Hardware optimization techniques are used for higher performance and less resource costs. Experimental results show that the proposed hardware acceleration workflow for software applications is an efficient technique. The system can reach 2.8X performance improvements and save 31.84% energy consumption by applying the Bus-IP design. The Co-processor design can have 7.9X performance and save 75.85% energy consumption.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Bilinear pairings can be used to construct cryptographic systems with very desirable properties. A pairing performs a mapping on members of groups on elliptic and genus 2 hyperelliptic curves to an extension of the finite field on which the curves are defined. The finite fields must, however, be large to ensure adequate security. The complicated group structure of the curves and the expensive field operations result in time consuming computations that are an impediment to the practicality of pairing-based systems. The Tate pairing can be computed efficiently using the ɳT method. Hardware architectures can be used to accelerate the required operations by exploiting the parallelism inherent to the algorithmic and finite field calculations. The Tate pairing can be performed on elliptic curves of characteristic 2 and 3 and on genus 2 hyperelliptic curves of characteristic 2. Curve selection is dependent on several factors including desired computational speed, the area constraints of the target device and the required security level. In this thesis, custom hardware processors for the acceleration of the Tate pairing are presented and implemented on an FPGA. The underlying hardware architectures are designed with care to exploit available parallelism while ensuring resource efficiency. The characteristic 2 elliptic curve processor contains novel units that return a pairing result in a very low number of clock cycles. Despite the more complicated computational algorithm, the speed of the genus 2 processor is comparable. Pairing computation on each of these curves can be appealing in applications with various attributes. A flexible processor that can perform pairing computation on elliptic curves of characteristic 2 and 3 has also been designed. An integrated hardware/software design and verification environment has been developed. This system automates the procedures required for robust processor creation and enables the rapid provision of solutions for a wide range of cryptographic applications.