994 results for Hardware Solver


Relevance:

20.00%

Publisher:

Abstract:

The last two decades have seen many exciting examples of tiny robots, from a few cm³ down to less than one cm³. Although individually limited, a large group of these robots has the potential to work cooperatively and accomplish complex tasks; ant and bee colonies are two examples from nature that exhibit this type of cooperation. Such robot groups have the potential to assist in applications like search and rescue, military scouting, infrastructure and equipment monitoring, nano-manufacture, and possibly medicine. Most of these applications require the high level of autonomy that has been demonstrated by large robotic platforms, such as the iRobot and Honda ASIMO. However, when robot size shrinks, current approaches to achieving the necessary functions are no longer valid. This work focused on challenges associated with the electronics and fabrication. We addressed three major technical hurdles inherent to current approaches: 1) difficulty of compact integration; 2) need for real-time and power-efficient computations; 3) unavailability of commercial tiny actuators and motion mechanisms. The aim of this work was to provide enabling hardware technologies to achieve autonomy in tiny robots. We proposed a decentralized application-specific integrated circuit (ASIC) where each component is responsible for its own operation and autonomy to the greatest extent possible. The ASIC consists of electronics modules for the fundamental functions required to fulfill the desired autonomy: actuation, control, power supply, and sensing. The actuators and mechanisms could potentially be post-fabricated on the ASIC directly. This design makes for a modular architecture.
The following components were shown to work in physical implementations or simulations: 1) a tunable motion controller for ultralow-frequency actuation; 2) a nonvolatile memory and programming circuit to achieve automatic and one-time programming; 3) a high-voltage circuit with the highest reported breakdown voltage in standard 0.5 μm CMOS; 4) thermal actuators fabricated using a CMOS-compatible process; 5) a low-power mixed-signal computational architecture for a robotic dynamics simulator; 6) a frequency-boost technique to achieve low jitter in ring oscillators. These contributions will be generally enabling for other systems with strict size and power constraints, such as wireless sensor nodes.

Relevance:

20.00%

Publisher:

Abstract:

In scientific education, practical lab courses are in many places an important part of teaching. As a rule, students carry out the assigned experiments at specially equipped laboratory workstations, which leads to extremely high costs and limits the maximum number of participants. In this context, it seems at first glance impossible to implement the concepts of a virtual university, since students must be "on site". In this document we present the so-called Mobile Hardware Lab Course (Mobiles Hardware-Praktikum), which lets students participate at any time and from any location while still conveying a sense of presence in the laboratory. At the same time, well over 100 students can take part. This is achieved through a web-based learning management system developed specifically for this purpose, combined with hardware components that correspond to a fully equipped laboratory workstation and are lent to participants for the duration of the course. The participating groups solve the experiments on their own and submit them electronically. Grading is likewise done electronically. (DIPF/Orig.)

Relevance:

20.00%

Publisher:

Abstract:

The philosophy of minimalism in robotics promotes gaining an understanding of the sensing and computational requirements for solving a task. This minimalist approach stands in contrast to the common practice of first taking an existing sensorimotor system and only afterwards determining how to apply the robotic system to the task. While it may seem convenient to simply apply existing hardware systems to the task at hand, this design philosophy often proves wasteful in terms of energy consumption and cost, and brings unnecessary complexity and decreased reliability. While impressive in their versatility, complex robots such as the PR2 (which costs hundreds of thousands of dollars) are impractical for many common applications. Instead, if a specific task is required, sensing and computational requirements can be determined specifically for that task, and a clever hardware implementation can be built to accomplish it. Since this minimalist hardware would be designed around accomplishing the specified task, significant reductions in hardware complexity can be obtained. This can lead to huge advantages in battery life, cost, and reliability. Even if cost is of no concern, battery life is often a limiting factor in many applications. Thus, a minimalist hardware system is critical in achieving the system requirements. In this thesis, we discuss an implementation of a counting, tracking, and actuation system as it relates to ergodic bodies to illustrate a minimalist design methodology.

Relevance:

20.00%

Publisher:

Abstract:

Solving linear systems is an important problem in scientific computing. Exploiting parallelism is essential for solving complex systems, and this traditionally involves writing parallel algorithms on top of a library such as MPI. The SPIKE family of algorithms is one well-known example of a parallel solver for linear systems. The Hierarchically Tiled Array (HTA) data type extends traditional data-parallel array operations with explicit tiling and allows programmers to directly manipulate tiles. The tiles of the HTA data type map naturally to the block nature of many numeric computations, including the SPIKE family of algorithms. The higher level of abstraction of the HTA enables the same program to be portable across different platforms. Current implementations target both shared-memory and distributed-memory models. In this thesis we present a proof of concept for portable linear solvers. We implement two algorithms from the SPIKE family using the HTA library. We show that our implementations of SPIKE exploit the abstractions provided by the HTA to produce compact, clean code that can run on both shared-memory and distributed-memory models without modification. We discuss how we map the algorithms to HTA programs and examine their performance. We compare the performance of our HTA codes to comparable codes written in MPI as well as to current state-of-the-art linear algebra routines.

Relevance:

20.00%

Publisher:

Abstract:

New-generation embedded systems demand high performance, efficiency, and flexibility. Reconfigurable hardware can provide all these features. However, the costly reconfiguration process and the lack of management support have prevented a broader use of these resources. To solve these issues we have developed a scheduler that deals with task graphs at run-time, steering their execution in the reconfigurable resources while carrying out both prefetch and replacement techniques that cooperate to hide most of the reconfiguration delays. In our scheduling environment, task graphs are analyzed at design-time to extract useful information. This information is used at run-time to obtain near-optimal schedules, escaping from local-optimum decisions, while only carrying out simple computations. Moreover, we have developed a hardware implementation of the scheduler that applies all the optimization techniques while introducing a delay of only a few clock cycles. In the experiments our scheduler clearly outperforms conventional run-time schedulers based on As-Soon-As-Possible techniques. In addition, our replacement policy, specially designed for reconfigurable systems, achieves almost optimal results regarding both reuse and performance.
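
As an illustration only, the following Python sketch shows how design-time knowledge of the upcoming task sequence could drive a reuse-aware replacement decision at run-time. The abstract does not specify the actual policy: the farthest-next-use eviction rule, the function name `count_reconfigurations`, and the flattening of the task graph into a linear sequence are all assumptions made for this example, and prefetching is not modeled.

```python
def count_reconfigurations(sequence, num_units):
    """Count configuration loads needed to run `sequence` on `num_units` units.

    Hypothetical model: each reconfigurable unit holds one task configuration.
    A task that is already resident is reused at no cost; otherwise it is
    loaded, evicting the resident task whose next use lies farthest ahead.
    """
    resident = []   # configurations currently loaded
    loads = 0       # number of reconfigurations performed

    for i, task in enumerate(sequence):
        if task in resident:
            continue                      # reuse: no reconfiguration needed
        loads += 1
        if len(resident) < num_units:
            resident.append(task)         # a free unit is available
            continue

        def next_use(candidate):
            # Position of the next request for `candidate`, or infinity.
            for j in range(i + 1, len(sequence)):
                if sequence[j] == candidate:
                    return j
            return float("inf")

        victim = max(resident, key=next_use)
        resident[resident.index(victim)] = task

    return loads


if __name__ == "__main__":
    order = ["A", "B", "C", "A", "B", "D", "A"]
    print(count_reconfigurations(order, num_units=2))
    print(count_reconfigurations(order, num_units=3))
```

The lookahead here scans the remaining sequence on every miss; a design-time analysis could instead precompute the next-use positions so that the run-time decision reduces to a table lookup.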

Relevance:

20.00%

Publisher:

Abstract:

Reconfigurable platforms are a promising technology that offers an interesting trade-off between flexibility and performance, which many recent embedded system applications demand, especially in fields such as multimedia processing. These applications typically involve multiple ad-hoc tasks for hardware acceleration, which are usually represented using formalisms such as Data Flow Diagrams (DFDs), Data Flow Graphs (DFGs), Control and Data Flow Graphs (CDFGs) or Petri Nets. However, none of these models is able to capture at the same time the pipeline behavior between tasks (which can therefore coexist in order to minimize the application execution time), their communication patterns, and their data dependencies. This paper proves that knowledge of all this information can be effectively exploited to reduce the resource requirements and improve the timing performance of modern reconfigurable systems, where a set of hardware accelerators is used to support the computation. For this purpose, this paper proposes a novel task representation model, named Temporal Constrained Data Flow Diagram (TCDFD), which includes all this information. This paper also presents a mapping-scheduling algorithm that is able to take advantage of the new TCDFD model. It aims at minimizing the dynamic reconfiguration overhead while meeting the communication requirements among the tasks. Experimental results show that the presented approach achieves resource savings of up to 75% and a reconfiguration overhead reduction of up to 89% with respect to other state-of-the-art techniques for reconfigurable platforms.

Relevance:

20.00%

Publisher:

Abstract:

Reconfigurable hardware can be used to build a multitasking system where tasks are assigned to HW resources at run-time according to the requirements of the running applications. These tasks are frequently represented as directed acyclic graphs, and their execution is typically controlled by an embedded processor that schedules the graph execution. In order to improve the efficiency of the system, the scheduler can apply prefetch and reuse techniques that can greatly reduce the reconfiguration latencies. For an embedded processor, all these computations represent a heavy computational load that can significantly reduce the system performance. To overcome this problem we have implemented a HW scheduler using reconfigurable resources. In addition, we have implemented both prefetch and replacement techniques that obtain results as good as those of previous complex SW approaches, while demanding just a few clock cycles to carry out the computations. We consider that the HW cost of the system (in our experiments, 3% of a Virtex-II PRO xc2vp30 FPGA) is affordable, taking into account the great efficiency of the techniques applied to hide the reconfiguration latency and the negligible run-time penalty introduced by the scheduler computations.

Relevance:

20.00%

Publisher:

Abstract:

In the multi-core CPU world, transactional memory (TM) has emerged as an alternative to lock-based programming for thread synchronization. Recent research proposes the use of TM in GPU architectures, where a high number of computing threads, organized in SIMT fashion, requires an effective synchronization method. In contrast to CPUs, GPUs offer two memory spaces: global memory and local memory. The local memory space serves as a shared scratch-pad for a subset of the computing threads, and programmers use it to speed up their applications thanks to its low latency. Prior work from the authors proposed a lightweight hardware TM (HTM) support based on the local memory, modifying the SIMT execution model and adding a conflict detection mechanism. An efficient implementation of these features is key to providing an effective synchronization mechanism at the local memory level. After a quick description of the main features of our HTM design for GPU local memory, in this work we gather together a number of proposals designed with the aim of improving those mechanisms with high impact on performance. Firstly, the SIMT execution model is modified to increase the parallelism of the application when transactions must be serialized in order to make forward progress. Secondly, the conflict detection mechanism is optimized depending on application characteristics, such as the read/write sets, the probability of conflict between transactions, and the existence of read-only transactions. As these features can be present in hardware simultaneously, it is a task of the compiler and runtime to determine which ones are more important for a given application. This work includes a discussion of the analysis to be done in order to choose the best configuration solution.

Relevance:

20.00%

Publisher:

Abstract:

This doctoral thesis presents several solutions for acquiring data from arrays of resistive sensors, specifically piezoresistive tactile sensors. The proposed circuits reduce the classical conditioning and acquisition hardware by implementing a direct connection between the sensor and the digital device (FPGA) that receives the data. The goal is the parallel, low-cost, low-area acquisition of large amounts of data from array sensors, exploiting the ability of FPGAs to measure several sensors simultaneously. Two solutions are proposed, depending on the type of addressing that can be used. Where the number of sensing units in the array is not excessively high and addressing can be done without sharing connections, the resistance of each array element is obtained from the discharge time of an RC network, or passive integrator, that includes the sensor. For arrays with a large number of elements, or where addressing uses shared connections, an active integrator circuit reduces crosstalk between elements measured simultaneously. Analysis and characterization of the proposed circuits over the resistance range of a piezoresistive tactile sensor yield an effective analog-to-digital conversion resolution of 10 bits and 8 bits for the direct-connection circuits based on the passive and active integrator, respectively. As for the accuracy of the resistance measurement, relative errors of 0.066% (passive integrator) and 0.77% (active integrator) are achieved using a novel calibration technique that relies on a single reference element. Finally, an architecture for a tactile system based on the aforementioned circuits is proposed.
Two implementations have been developed: a prototype for characterization and laboratory testing, and another for a demonstrator on a commercial robotic hand (the Barrett hand). These implementations show that the tactile system can refresh the whole set of sensors at a rate high enough for applications requiring a fast dynamic response (for example, detecting object slippage in manipulation tasks with robotic hands). Moreover, the parallelism of FPGAs is exploited not only in data acquisition: the pre-processing that can be performed in the resulting smart sensor also has great potential. As an example, this work extracts the geometric moments and the associated ellipse from the tactile images acquired by each of the sensors that make up the system.
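
The two ideas in this abstract, resistance from an RC discharge time and an ellipse from image moments, can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not the thesis's circuit or calibration procedure: it assumes an ideal RC discharge V(t) = V0·exp(−t/RC), a reference resistor measured with the same capacitor and thresholds (so those terms cancel in the ratio), and the standard second-order-moment ellipse for a uniformly filled shape. All function names and component values are hypothetical.

```python
import math


def discharge_time(r_ohm, c_farad, v0, v_th):
    """Time for an ideal RC network to discharge from v0 down to v_th."""
    return r_ohm * c_farad * math.log(v0 / v_th)


def to_counts(t_seconds, f_clk_hz):
    """Clock cycles a digital counter would record for a given duration."""
    return int(t_seconds * f_clk_hz)


def resistance_from_counts(count_x, count_ref, r_ref_ohm):
    """Ratio estimate: capacitance and voltage thresholds cancel out."""
    return r_ref_ohm * count_x / count_ref


def tactile_ellipse(image):
    """Centroid, orientation and semi-axes from a 2D list of pressure values."""
    m00 = sum(sum(row) for row in image)
    if m00 == 0:
        raise ValueError("empty tactile image")
    cx = sum(x * p for row in image for x, p in enumerate(row)) / m00
    cy = sum(y * p for y, row in enumerate(image) for p in row) / m00
    # Second-order central moments, normalized by the total pressure.
    mu20 = sum((x - cx) ** 2 * p for row in image for x, p in enumerate(row)) / m00
    mu02 = sum((y - cy) ** 2 * p for y, row in enumerate(image) for p in row) / m00
    mu11 = sum((x - cx) * (y - cy) * p
               for y, row in enumerate(image) for x, p in enumerate(row)) / m00
    theta = 0.5 * math.atan2(2 * mu11, mu20 - mu02)   # major-axis angle
    spread = math.sqrt(4 * mu11 ** 2 + (mu20 - mu02) ** 2)
    major = 2 * math.sqrt((mu20 + mu02 + spread) / 2)
    minor = 2 * math.sqrt(max((mu20 + mu02 - spread) / 2, 0.0))
    return (cx, cy), theta, (major, minor)


if __name__ == "__main__":
    # Hypothetical values: 100 nF capacitor, 3.3 V to 1.2 V, 50 MHz clock.
    c, v0, v_th, f_clk = 100e-9, 3.3, 1.2, 50e6
    n_ref = to_counts(discharge_time(10_000, c, v0, v_th), f_clk)
    n_x = to_counts(discharge_time(4_700, c, v0, v_th), f_clk)
    print(resistance_from_counts(n_x, n_ref, 10_000))
    print(tactile_ellipse([[0, 0, 0, 0, 0], [1, 1, 1, 1, 1], [0, 0, 0, 0, 0]]))
```

Because the estimate is a ratio of two counts, its error in this idealized model comes only from counter quantization; a real circuit would add threshold and leakage effects that the sketch ignores.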

Relevance:

20.00%

Publisher:

Abstract:

Current industry proposals for Hardware Transactional Memory (HTM) focus on best-effort solutions (BE-HTM) where hardware limits are imposed on transactions. These designs may show significant performance degradation due to high-contention scenarios and to various hardware and operating system limitations that abort transactions, e.g. cache overflows and hardware and software exceptions. To deal with these events and to ensure forward progress, BE-HTM systems usually provide a software fallback path to execute a lock-based version of the code. In this paper, we propose a hardware implementation of an irrevocability mechanism as an alternative to the software fallback path, to gain insight into the hardware improvements that could enhance the execution of such a fallback. Our mechanism anticipates the abort that causes the transaction serialization, and stalls other transactions in the system so that transactional work loss is minimized. In addition, we evaluate the main software fallback path approaches and propose the use of ticket locks that hold precise information about the number of transactions waiting to enter the fallback. Thus, the separation of transactional and fallback execution can be achieved in a precise manner. The evaluation is carried out using the Simics/GEMS simulator and the complete range of STAMP transactional suite benchmarks. We obtain significant performance benefits of around twice the speedup and an abort reduction of 50% over the software fallback path for a number of benchmarks.
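
To make the ticket-lock idea concrete, here is a minimal Python sketch of a ticket lock that also exposes how many threads hold or are waiting for it. It is a software-only illustration, not the paper's fallback path: a real implementation would use an atomic fetch-and-add and spinning rather than a `threading.Condition`, and the class name `TicketLock` and method `pending` are hypothetical.

```python
import threading


class TicketLock:
    """FIFO lock: threads are served strictly in ticket order."""

    def __init__(self):
        self._cond = threading.Condition()
        self._next_ticket = 0    # next ticket number to hand out
        self._now_serving = 0    # ticket currently allowed to proceed

    def acquire(self):
        """Take a ticket and block until it is being served."""
        with self._cond:
            my_ticket = self._next_ticket
            self._next_ticket += 1
            while self._now_serving != my_ticket:
                self._cond.wait()
        return my_ticket

    def release(self):
        """Advance to the next ticket and wake the waiting threads."""
        with self._cond:
            self._now_serving += 1
            self._cond.notify_all()

    def pending(self):
        """Tickets issued but not yet released (current holder plus waiters)."""
        with self._cond:
            return self._next_ticket - self._now_serving


if __name__ == "__main__":
    lock = TicketLock()
    total = {"n": 0}

    def worker():
        for _ in range(100):
            lock.acquire()
            total["n"] += 1      # critical section, e.g. the fallback path
            lock.release()

    threads = [threading.Thread(target=worker) for _ in range(4)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    print(total["n"], lock.pending())
```

The difference between the two counters is what gives a precise count of threads in or waiting for the fallback, which a plain test-and-set lock cannot provide.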