Digital Library

868 results for FPGA parallel SAT solver

A method for Kinematic Calibration of a Parallel Robot by using one camera in hand and a spherical object

Relevance:

20.00%

Publisher:

Abstract:

The main purpose of robot calibration is to correct possible errors in the robot parameters. This paper presents a method for the kinematic calibration of a parallel robot equipped with one camera in hand. To preserve the mechanical configuration of the robot, the camera is used to acquire incremental positions of the end effector from a spherical object that is fixed in the world reference frame. The positions of the end effector are related to the incremental positions of the resolvers of the robot's motors, and a kinematic model of the robot is used to find a new set of parameters that minimizes the errors in the kinematic equations. Additionally, properties of the spherical object and the intrinsic camera parameters are used to model the projection of the object in the image and to improve spatial measurements. Finally, the robotic system is designed to carry out tracking tasks, and the calibration of the robot is validated by integrating the errors of the visual controller.

Bio-inspired FPGA Architecture for Self-Calibration of an Image Compression Core based on Wavelet Transforms in Embedded Systems

Relevance:

20.00%

Publisher:

Abstract:

A generic bio-inspired adaptive architecture for image compression, suitable for implementation in embedded systems, is presented. The architecture allows the system to be tuned during its calibration phase. An evolutionary algorithm is responsible for making the system evolve towards the required performance. A prototype has been implemented in a Xilinx Virtex-5 FPGA featuring an adaptive wavelet transform core directed at improving image compression for specific types of images. An Evolution Strategy has been chosen as the search algorithm, and its typical genetic operators have been adapted to allow for a hardware-friendly implementation. HW/SW partitioning issues are also considered after profiling a high-level description of the algorithm, which validates the proposed resource allocation in the device fabric. To check the robustness of the system and its adaptation capabilities, different types of images have been selected as validation patterns. A direct application of such a system is its deployment in an environment that is unknown at design time, letting the calibration phase adjust the system parameters so that it performs efficient image compression. This prototype implementation may also serve as an accelerator for the automatic design of evolved transform coefficients, which are later synthesized and implemented in a non-adaptive system in the final implementation device, whether it is a HW- or SW-based computing device. The architecture has been built in a modular way so that it can be easily extended to adapt other types of image processing cores. Details on this pluggable-component point of view are also given in the paper.
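The search loop that this abstract describes can be illustrated with a minimal (1+λ) Evolution Strategy. This is only a software sketch under stated assumptions, not the paper's hardware implementation: the function name `evolve`, the Gaussian mutation, the parameter defaults, and the idea of treating the transform coefficients as a plain list of floats are all hypothetical choices made for illustration.

```python
import random


def evolve(fitness, parent, sigma=0.3, offspring=4, generations=300, seed=0):
    """Minimise `fitness` with a (1+lambda) Evolution Strategy.

    `parent` is a list of floats (hypothetically, filter coefficients);
    `fitness` maps such a list to an error value (lower is better).
    """
    rng = random.Random(seed)
    parent = list(parent)
    parent_fit = fitness(parent)
    for _ in range(generations):
        # Mutation: every child is the parent plus Gaussian noise.
        children = [
            [x + rng.gauss(0.0, sigma) for x in parent]
            for _ in range(offspring)
        ]
        best_child = min(children, key=fitness)
        child_fit = fitness(best_child)
        # Plus-selection: the parent is replaced only by a child
        # that is at least as good, so fitness never gets worse.
        if child_fit <= parent_fit:
            parent, parent_fit = best_child, child_fit
    return parent, parent_fit
```

In a calibration setting, `fitness` would be whatever compression-error measure the system evaluates on its training images; here it can be any callable, for example a squared distance to a known target vector.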

Current Mode with RMS Voltage and Offset Control Loops for a Single-Phase Aircraft Inverter Suitable for Parallel and 3-Phase Operation Modes

Relevance:

20.00%

Publisher:

Abstract:

RMS voltage regulation may be an attractive option for controlling power inverters. Combined with a Hall effect sensor for current control, it keeps its parallel operation capability while increasing its noise immunity, which may lead to a reduction of the Total Harmonic Distortion (THD). Moreover, as voltage regulation is designed in DC, a simple PI regulator can provide accurate voltage tracking. Nevertheless, this approach is not free of drawbacks. Its narrow voltage bandwidth makes transients last longer, and it increases the voltage THD when feeding non-linear loads, such as rectifying stages. In addition, the implementation can suffer from offset voltage error. Furthermore, the phase of the output voltage is hidden from the control, which makes the synchronization of a 3-phase setup non-trivial. This paper explains the concept, design and implementation of the whole control scheme in an on-board inverter able to run in parallel and within a 3-phase setup. Special attention is paid to solving the problems foreseen at implementation level: a third analog loop that accounts for the offset level is added, and a digital algorithm guarantees 3-phase voltage synchronization.

Serial or Parallel Linear-Assisted Switching Converter as Envelope Amplifier: Optimization and Comparison

Relevance:

20.00%

Publisher:

Abstract:

This paper presents a theoretical analysis and an optimization method for envelope amplifiers. Highly efficient envelope amplifiers based on a switching converter in parallel or in series with a linear regulator have been analyzed and optimized. The results of the optimization process are shown, and the two architectures are compared in terms of complexity and efficiency. The proposed optimization method is based on prior knowledge of the transmitted signal type (OFDM, WCDMA...) and can be applied to any signal type as long as the envelope probability distribution is known. Finally, it is shown that the analyzed architectures have an inherent efficiency limit.

FPGA Acceleration of Monte Carlo-based Financial Simulation: Design Challenges and Lessons Learnt

Relevance:

20.00%

Publisher:

Abstract:

The simulation of interest rate derivatives is a powerful tool for dealing with current market fluctuations. However, the complexity of the financial models and the way they are processed require exorbitant computation times, which is in clear conflict with the need for a processing time as short as possible to operate in the financial market. To shorten the computation time of financial derivatives, the use of hardware accelerators becomes a must.

Independence in CLP Languages

Relevance:

20.00%

Publisher:

Abstract:

Studying independence of goals has proven very useful in the context of logic programming. In particular, it has provided a formal basis for powerful automatic parallelization tools, since independence ensures that two goals may be evaluated in parallel while preserving correctness and efficiency. We extend the concept of independence to constraint logic programs (CLP) and prove that it also ensures the correctness and efficiency of the parallel evaluation of independent goals. Independence for CLP languages is more complex than for logic programming, as search space preservation is necessary but no longer sufficient for ensuring correctness and efficiency. Two additional issues arise. The first is that the cost of constraint solving may depend upon the order in which constraints are encountered. The second is the need to handle dynamic scheduling. We clarify these issues by proposing various types of search independence and constraint solver independence, and show how they can be combined to allow different optimizations, from parallelism to intelligent backtracking. Sufficient conditions for independence which can be evaluated "a priori" at run-time are also proposed. Our study also yields new insights into independence in logic programming languages. In particular, we show that search space preservation is not only a sufficient but also a necessary condition for ensuring correctness and efficiency of parallel execution.

Implementación de Algoritmos de Procesado de Señal sobre FPGA: Especificación, Reutilización y Exploración del Espacio de Diseño

Relevance:

20.00%

Publisher:

Abstract:

This PhD thesis addresses the design and implementation of signal processing applications using reconfigurable FPGA platforms. This kind of platform offers high logic capacity, incorporates dedicated signal processing elements and has a relatively low cost, which makes it ideal for developing signal processing applications where intensive data processing and high performance are required. However, the cost associated with hardware development on these platforms is high. While the increase in logic capacity of FPGA devices allows the development of complete systems, high-performance requirements often force operators to be optimized at a very low level. In addition to the timing constraints imposed by these applications, area constraints related to the particular device also apply, which makes it necessary to evaluate and verify different implementation alternatives. The design and implementation cycle for these applications can be so long that new FPGA models with greater capacity and higher speed commonly appear before the system is completed, rendering the constraints that guided the design of the system useless. Different methods can be used to improve productivity when developing these applications and consequently shorten their design cycle. This thesis focuses on the reuse of previously designed and verified hardware components. Although conventional HDLs allow the reuse of components already defined, their specification can be improved in order to simplify the process of incorporating components into new designs. Thus, the first part of the thesis focuses on the specification of designs based on predefined components.

This specification not only improves and simplifies the process of adding components to a description, but also seeks to improve the quality of the specified design by offering better configuration options and even the ability to report on features of the description itself. Reusing an existing component largely depends on the information provided for its integration into a system. In this respect, conventional HDLs only provide, together with the component description, the input/output interface and a set of configuration parameters, while the rest of the required information is usually supplied through external documentation. The second part of the thesis proposes a set of encapsulations whose purpose is to bundle, together with the component description, information that can be useful for its integration into other designs. This includes information about the implementation, support for configuring the component, and even information on how to configure and connect the component to carry out a function. Finally, the fast Fourier transform (FFT), a classic signal processing application, is chosen as a case study to illustrate the possibilities of the proposed specification and encapsulation formalisms. The objective of the FFT design is not only to show practical examples of the proposed specification, but also to obtain an implementation of a quality comparable to results in the literature. The design targets FPGA platforms, using general-purpose logic elements as the basis of the implementation while also taking advantage of the low-level specific elements available on these devices. Lastly, the resulting FFT specification is used to show how to incorporate into its interface information that assists in its selection and configuration from the early phases of the design cycle.

Designing a high performance parallel logic programming system

Relevance:

20.00%

Publisher:

Abstract:

Compilation techniques such as those portrayed by the Warren Abstract Machine (WAM) have greatly improved the speed of execution of logic programs. The research presented herein is geared towards providing additional performance to logic programs through the use of parallelism, while preserving the conventional semantics of logic languages. Two areas to which special attention is given are the preservation of sequential performance and storage efficiency, and the use of low-overhead mechanisms for controlling parallel execution. Accordingly, the techniques used for supporting parallelism are efficient extensions of those which have brought high inferencing speeds to sequential implementations. At a lower level, special attention is also given to design and simulation detail and to the architectural implications of the execution model behavior. This paper offers an overview of the basic concepts and techniques used in the parallel design, the simulation tools used, and some of the results obtained to date.

Improving the efficiency of nondeterministic independent and-parallel systems

Relevance:

20.00%

Publisher:

Abstract:

We present the design and implementation of the and-parallel component of ACE. ACE is a computational model for the full Prolog language that simultaneously exploits both or-parallelism and independent and-parallelism. A high-performance implementation of the ACE model has been realized, and its performance is reported in this paper. We discuss how some of the standard problems which appear when implementing and-parallel systems are solved in ACE. We then propose a number of optimizations aimed at reducing the overheads and the increased memory consumption which occur in such systems when using previously proposed solutions. Finally, we present results from an implementation of ACE which includes the proposed optimizations. The results show that ACE exploits and-parallelism with high efficiency and high speedups. Furthermore, they also show that the proposed optimizations, which are applicable to many other and-parallel systems, significantly decrease memory consumption and increase speedups and absolute performance both in forward execution and during backtracking.

And-or parallel prolog: a recomputation based approach

Relevance:

20.00%

Publisher:

Abstract:

We argue that in order to exploit both Independent And- and Or-parallelism in Prolog programs there is an advantage in recomputing some of the independent goals, as opposed to reusing all their solutions. We present an abstract model, called the Composition-Tree, for representing and-or parallelism in Prolog programs. The Composition-Tree closely mirrors sequential Prolog execution by recomputing some independent goals rather than fully reusing them. We also outline two environment representation techniques for And-Or parallel execution of full Prolog based on the Composition-Tree model abstraction. We argue that these techniques have advantages over earlier proposals for exploiting and-or parallelism in Prolog.

StreamCloud: An Elastic Parallel-Distributed Stream Processing Engine

Relevance:

20.00%

Publisher:

Abstract:

In recent years, applications in domains such as telecommunications, network security or large-scale sensor networks have shown the limits of the traditional store-then-process paradigm. In this context, Stream Processing Engines emerged as a candidate solution for all these applications, which demand high processing capacity with low processing latency guarantees. With Stream Processing Engines, data streams are not persisted but rather processed on the fly, producing results continuously. Current Stream Processing Engines, either centralized or distributed, do not scale with the input load due to single-node bottlenecks. Moreover, they are based on static configurations that lead to either under- or over-provisioning. This Ph.D. thesis discusses StreamCloud, an elastic parallel-distributed stream processing engine that enables processing of large data stream volumes. StreamCloud minimizes the distribution and parallelization overhead by introducing novel techniques that split queries into parallel subqueries and allocate them to independent sets of nodes. Moreover, StreamCloud's elastic and dynamic load balancing protocols enable effective adjustment of resources depending on the incoming load. Together with the parallelization and elasticity techniques, StreamCloud defines a novel fault tolerance protocol that introduces minimal overhead while providing fast recovery. StreamCloud has been fully implemented and evaluated using several real-world applications, such as fraud detection and network analysis applications. The evaluation, conducted using a cluster with more than 300 cores, demonstrates the high scalability of StreamCloud and the effectiveness of its elasticity and fault tolerance.
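The general idea of splitting a query into parallel subqueries can be illustrated with key-based routing: every tuple with the same key is sent to the same subquery instance, so per-key state never has to be shared between nodes. The sketch below is an assumption-laden illustration of that generic technique and does not reproduce StreamCloud's actual protocols; the names `route` and `parallel_count`, the CRC32 hash, and the per-key count query are all hypothetical.

```python
import zlib
from collections import Counter


def route(key, instances):
    """Pick the subquery instance for a key with a stable hash."""
    return zlib.crc32(key.encode("utf-8")) % instances


def parallel_count(stream, instances):
    """Count occurrences per key using `instances` independent partial states."""
    partials = [Counter() for _ in range(instances)]
    for key in stream:
        # Each tuple touches exactly one instance's state.
        partials[route(key, instances)][key] += 1
    # Merging is a plain union because no key lives in two instances.
    merged = Counter()
    for partial in partials:
        merged.update(partial)
    return partials, merged
```

Because routing depends only on the key, changing `instances` changes where keys live, which is why an elastic system needs a state-transfer step when it grows or shrinks; that step is not shown here.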

Flexible scheduling for non-deterministic, and-parallel execution of logic programs

Relevance:

20.00%

Publisher:

Abstract:

Abstract is not available.

IDRA (IDeal Resource Allocation): Computing ideal speedups in parallel logic programming

Relevance:

20.00%

Publisher:

Abstract:

We present a technique to estimate accurate speedups for parallel logic programs with relative independence from characteristics of a given implementation or underlying parallel hardware. The proposed technique is based on gathering accurate data describing one execution at run-time, which is fed to a simulator. Alternative schedulings are then simulated and estimates computed for the corresponding speedups. A tool implementing the aforementioned techniques is presented, and its predictions are compared to the performance of real systems, showing good correlation.
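The trace-then-simulate approach described above can be sketched as follows: a recorded execution is represented as a task graph with durations and dependencies, and a scheduler with no overheads is simulated for a given number of workers. This is a minimal sketch under assumptions, not the IDRA tool itself; the dictionary-based trace format, the greedy list scheduling policy, and the function names are hypothetical.

```python
import heapq


def simulate_makespan(durations, deps, workers):
    """Greedy list scheduling of a task DAG on identical workers, zero overhead.

    durations: {task: time}; deps: {task: [tasks it must wait for]}.
    """
    pending = {t: len(deps.get(t, ())) for t in durations}
    children = {t: [] for t in durations}
    for task, parents in deps.items():
        for parent in parents:
            children[parent].append(task)
    ready = sorted(t for t, n in pending.items() if n == 0)
    running = []  # min-heap of (finish_time, task)
    now, done = 0.0, 0
    while done < len(durations):
        # Start ready tasks while idle workers remain.
        while ready and len(running) < workers:
            task = ready.pop(0)
            heapq.heappush(running, (now + durations[task], task))
        # Jump to the next completion and release dependants.
        now, finished = heapq.heappop(running)
        done += 1
        for child in children[finished]:
            pending[child] -= 1
            if pending[child] == 0:
                ready.append(child)
    return now


def speedup(durations, deps, workers):
    """Sequential time divided by the simulated parallel time."""
    return sum(durations.values()) / simulate_makespan(durations, deps, workers)
```

For a diamond-shaped graph with durations `{"a": 1, "b": 2, "c": 2, "d": 1}` and dependencies `{"b": ["a"], "c": ["a"], "d": ["b", "c"]}`, two workers finish in 4 time units against 6 sequentially, an estimated speedup of 1.5. Other scheduling policies can be compared by changing how `ready` is ordered.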

Using attributed variables in the implementation of concurrent and parallel logic programming systems

Relevance:

20.00%

Publisher:

Abstract:

Incorporating the possibility of attaching attributes to variables in a logic programming system has been shown to allow the addition of general constraint solving capabilities to it. This approach is very attractive in that, by adding a few primitives, any logic programming system can be turned into a generic constraint logic programming system in which constraint solving can be user defined, and at source level: an extreme example of the "glass box" approach. In this paper we propose a different and novel use for the concept of attributed variables: developing a generic parallel/concurrent (constraint) logic programming system, using the same "glass box" flavor. We argue that a system which implements attributed variables and a few additional primitives can be easily customized at source level to implement many of the languages and execution models of parallelism and concurrency currently proposed, in both shared memory and distributed systems. We illustrate this through examples and report on an implementation of our ideas.

&ACE: A high-performance parallel prolog system

Relevance:

20.00%

Publisher:

Abstract:

In recent years a lot of research has been invested in parallel processing of numerical applications. However, parallel processing of symbolic and AI applications has received less attention. This paper presents a system for parallel symbolic computing, named ACE, based on the logic programming paradigm. ACE is a computational model for the full Prolog language, capable of exploiting Or-parallelism and Independent And-parallelism. In this paper we focus on the implementation of the and-parallel part of the ACE system (called &ACE) on a shared memory multiprocessor, describing its organization and some optimizations, and presenting some performance figures that demonstrate the ability of &ACE to exploit parallelism efficiently.
