972 resultados para hardware implementation
Resumo:
The main objective of this paper is to detail the development of a feasible hardware design based on Evolutionary Algorithms (EAs) to determine flight path planning for Unmanned Aerial Vehicles (UAVs) navigating terrain with obstacle boundaries. The design architecture includes the hardware implementation of Light Detection And Ranging (LiDAR) terrain and EA population memories within the hardware, as well as the EA search and evaluation algorithms used in the optimizing stage of path planning. A synthesisable Very-high-speed integrated circuit Hardware Description Language (VHDL) implementation of the design was developed, for realisation on a Field Programmable Gate Array (FPGA) platform. Simulation results show significant speedup compared with an equivalent software implementation written in C++, suggesting that the present approach is well suited for UAV real-time path planning applications.
Resumo:
Fast calculation of quantities such as in-cylinder volume and indicated power is important in internal combustion engine research. Multiple channels of data including crank angle and pressure were collected for this purpose using a fully instrumented diesel engine research facility. Currently, existing methods use software to post-process the data, first calculating volume from crank angle, then calculating the indicated work and indicated power from the area enclosed by the pressure-volume indicator diagram. Instead, this work investigates the feasibility of achieving real-time calculation of volume and power via hardware implementation on Field Programmable Gate Arrays (FPGAs). Alternative hardware implementations were investigated using lookup tables, Taylor series methods or the CORDIC (CoOrdinate Rotation DIgital Computer) algorithm to compute the trigonometric operations in the crank angle to volume calculation, and the CORDIC algorithm was found to use the least amount of resources. Simulation of the hardware based implementation showed that the error in the volume and indicated power is less than 0.1%.
Resumo:
Traditional area-based matching techniques make use of similarity metrics such as the Sum of Absolute Differences(SAD), Sum of Squared Differences (SSD) and Normalised Cross Correlation (NCC). Non-parametric matching algorithms such as the rank and census rely on the relative ordering of pixel values rather than the pixels themselves as a similarity measure. Both traditional area-based and non-parametric stereo matching techniques have an algorithmic structure which is amenable to fast hardware realisation. This investigation undertakes a performance assessment of these two families of algorithms for robustness to radiometric distortion and random noise. A generic implementation framework is presented for the stereo matching problem and the relative hardware requirements for the various metrics investigated.
Resumo:
The feasibility of using an in-hardware implementation of a genetic algorithm (GA) to solve the computationally expensive travelling salesman problem (TSP) is explored, especially in regard to hardware resource requirements for problem and population sizes. We investigate via numerical experiments whether a small population size might prove sufficient to obtain reasonable quality solutions for the TSP, thereby permitting relatively resource efficient hardware implementation on field programmable gate arrays (FPGAs). Software experiments on two TSP benchmarks involving 48 and 532 cities were used to explore the extent to which population size can be reduced without compromising solution quality, and results show that a GA allowed to run for a large number of generations with a smaller population size can yield solutions of comparable quality to those obtained using a larger population. This finding is then used to investigate feasible problem sizes on a targeted Virtex-7 vx485T-2 FPGA platform via exploration of hardware resource requirements for memory and data flow operations.
Resumo:
The feasibility of real-time calculation of parameters for an internal combustion engine via reconfigurable hardware implementation is investigated as an alternative to software computation. A detailed in-hardware field programmable gate array (FPGA)-based design is developed and evaluated using input crank angle and in-cylinder pressure data from fully instrumented diesel engines in the QUT Biofuel Engine Research Facility (BERF). Results indicate the feasibility of employing a hardware-based implementation for real-time processing for speeds comparable to the data sampling rate currently used in the facility, with acceptably low level of discrepancies between hardware and software-based calculation of key engine parameters.
Computation of ECG signal features using MCMC modelling in software and FPGA reconfigurable hardware
Resumo:
Computational optimisation of clinically important electrocardiogram signal features, within a single heart beat, using a Markov-chain Monte Carlo (MCMC) method is undertaken. A detailed, efficient data-driven software implementation of an MCMC algorithm has been shown. Initially software parallelisation is explored and has been shown that despite the large amount of model parameter inter-dependency that parallelisation is possible. Also, an initial reconfigurable hardware approach is explored for future applicability to real-time computation on a portable ECG device, under continuous extended use.
Resumo:
A Field Programmable Gate Array (FPGA) based hardware accelerator for multi-conductor parasitic capacitance extraction, using Method of Moments (MoM), is presented in this paper. Due to the prohibitive cost of solving a dense algebraic system formed by MoM, linear complexity fast solver algorithms have been developed in the past to expedite the matrix-vector product computation in a Krylov sub-space based iterative solver framework. However, as the number of conductors in a system increases leading to a corresponding increase in the number of right-hand-side (RHS) vectors, the computational cost for multiple matrix-vector products present a time bottleneck, especially for ill-conditioned system matrices. In this work, an FPGA based hardware implementation is proposed to parallelize the iterative matrix solution for multiple RHS vectors in a low-rank compression based fast solver scheme. The method is applied to accelerate electrostatic parasitic capacitance extraction of multiple conductors in a Ball Grid Array (BGA) package. Speed-ups up to 13x over equivalent software implementation on an Intel Core i5 processor for dense matrix-vector products and 12x for QR compressed matrix-vector products is achieved using a Virtex-6 XC6VLX240T FPGA on Xilinx's ML605 board.
Resumo:
With the rapid growth of the Internet and digital communications, the volume of sensitive electronic transactions being transferred and stored over and on insecure media has increased dramatically in recent years. The growing demand for cryptographic systems to secure this data, across a multitude of platforms, ranging from large servers to small mobile devices and smart cards, has necessitated research into low cost, flexible and secure solutions. As constraints on architectures such as area, speed and power become key factors in choosing a cryptosystem, methods for speeding up the development and evaluation process are necessary. This thesis investigates flexible hardware architectures for the main components of a cryptographic system. Dedicated hardware accelerators can provide significant performance improvements when compared to implementations on general purpose processors. Each of the designs proposed are analysed in terms of speed, area, power, energy and efficiency. Field Programmable Gate Arrays (FPGAs) are chosen as the development platform due to their fast development time and reconfigurable nature. Firstly, a reconfigurable architecture for performing elliptic curve point scalar multiplication on an FPGA is presented. Elliptic curve cryptography is one such method to secure data, offering similar security levels to traditional systems, such as RSA, but with smaller key sizes, translating into lower memory and bandwidth requirements. The architecture is implemented using different underlying algorithms and coordinates for dedicated Double-and-Add algorithms, twisted Edwards algorithms and SPA secure algorithms, and its power consumption and energy on an FPGA measured. Hardware implementation results for these new algorithms are compared against their software counterparts and the best choices for minimum area-time and area-energy circuits are then identified and examined for larger key and field sizes. Secondly, implementation methods for another component of a cryptographic system, namely hash functions, developed in the recently concluded SHA-3 hash competition are presented. Various designs from the three rounds of the NIST run competition are implemented on FPGA along with an interface to allow fair comparison of the different hash functions when operating in a standardised and constrained environment. Different methods of implementation for the designs and their subsequent performance is examined in terms of throughput, area and energy costs using various constraint metrics. Comparing many different implementation methods and algorithms is nontrivial. Another aim of this thesis is the development of generic interfaces used both to reduce implementation and test time and also to enable fair baseline comparisons of different algorithms when operating in a standardised and constrained environment. Finally, a hardware-software co-design cryptographic architecture is presented. This architecture is capable of supporting multiple types of cryptographic algorithms and is described through an application for performing public key cryptography, namely the Elliptic Curve Digital Signature Algorithm (ECDSA). This architecture makes use of the elliptic curve architecture and the hash functions described previously. These components, along with a random number generator, provide hardware acceleration for a Microblaze based cryptographic system. The trade-off in terms of performance for flexibility is discussed using dedicated software, and hardware-software co-design implementations of the elliptic curve point scalar multiplication block. Results are then presented in terms of the overall cryptographic system.
Resumo:
This paper presents a multi-language framework to FPGA hardware development which aims to satisfy the dual requirement of high-level hardware design and efficient hardware implementation. The central idea of this framework is the integration of different hardware languages in a way that harnesses the best features of each language. This is illustrated in this paper by the integration of two hardware languages in the form of HIDE: a structured hardware language which provides more abstract and elegant hardware descriptions and compositions than are possible in traditional hardware description languages such as VHDL or Verilog, and Handel-C: an ANSI C-like hardware language which allows software and hardware engineers alike to target FPGAs from high-level algorithmic descriptions. On the one hand, HIDE has proven to be very successful in the description and generation of highly optimised parameterisable FPGA circuits from geometric descriptions. On the other hand, Handel-C has also proven to be very successful in the rapid design and prototyping of FPGA circuits from algorithmic application descriptions. The proposed integrated framework hence harnesses HIDE for the generation of highly optimised circuits for regular parts of algorithms, while Handel-C is used as a top-level design language from which HIDE functionality is dynamically invoked. The overall message of this paper posits that there need not be an exclusive choice between different hardware design flows. Rather, an integrated framework where different design flows can seamlessly interoperate should be adopted. Although the idea might seem simple prima facie, it could have serious implications on the design of future generations of hardware languages.
Resumo:
A full hardware implementation of a Weighted Fair Queuing (WFQ) packet scheduler is proposed. The circuit architecture presented has been implemented using Altera Stratix II FPGA technology, utilizing RLDII and QDRII memory components. The circuit can provide fine granularity Quality of Service (QoS) support at a line throughput rate of 12.8Gb/s in its current implementation. The authors suggest that, due to the flexible and scalable modular circuit design approach used, the current circuit architecture can be targeted for a full ASIC implementation to deliver 50 Gb/s throughput. The circuit itself comprises three main components; a WFQ algorithm computation circuit, a tag/time-stamp sort and retrieval circuit, and a high throughput shared buffer. The circuit targets the support of emerging wireline and wireless network nodes that focus on Service Level Agreements (SLA's) and Quality of Experience.
Resumo:
In a sigma-delta analog to digital (A/D) As most of the sigma-delta ADC applications require converter, the most computationally intensive block is decimation filters with linear phase characteristics, the decimation filter and its hardware implementation symmetric Finite Impulse Response (FIR) filters are may require millions of transistors. Since these widely used for implementation. But the number of FIR converters are now targeted for a portable application, filter coefficients will be quite large for implementing a a hardware efficient design is an implicit requirement. narrow band decimation filter. Implementing decimation In this effect, this paper presents a computationally filter in several stages reduces the total number of filter efficient polyphase implementation of non-recursive coefficients, and hence reduces the hardware complexity cascaded integrator comb (CIC) decimators for and power consumption [2]. Sigma-Delta Converters (SDCs). The SDCs are The first stage of decimation filter can be operating at high oversampling frequencies and hence implemented very efficiently using a cascade of integrators require large sampling rate conversions. The filtering and comb filters which do not require multiplication or and rate reduction are performed in several stages to coefficient storage. The remaining filtering is performed reduce hardware complexity and power dissipation. either in single stage or in two stages with more complex The CIC filters are widely adopted as the first stage of FIR or infinite impulse response (IIR) filters according to decimation due to its multiplier free structure. In this the requirements. The amount of passband aliasing or research, the performance of polyphase structure is imaging error can be brought within prescribed bounds by compared with the CICs using recursive and increasing the number of stages in the CIC filter. The non-recursive algorithms in terms of power, speed and width of the passband and the frequency characteristics area. This polyphase implementation offers high speed outside the passband are severely limited. So, CIC filters operation and low power consumption. The polyphase are used to make the transition between high and low implementation of 4th order CIC filter with a sampling rates. Conventional filters operating at low decimation factor of '64' and input word length of sampling rate are used to attain the required transition '4-bits' offers about 70% and 37% of power saving bandwidth and stopband attenuation. compared to the corresponding recursive and Several papers are available in literature that deals non-recursive implementations respectively. The same with different implementations of decimation filter polyphase CIC filter can operate about 7 times faster architecture for sigma-delta ADCs. Hogenauer has than the recursive and about 3.7 times faster than the described the design procedures for decimation and non-recursive CIC filters.
Resumo:
This work presents two schemes of measuring the linear and angular kinematics of a rigid body using a kinematically redundant array of triple-axis accelerometers with potential applications in biomechanics. A novel angular velocity estimation algorithm is proposed and evaluated that can compensate for angular velocity errors using measurements of the direction of gravity. Analysis and discussion of optimal sensor array characteristics are provided. A damped 2 axis pendulum was used to excite all 6 DoF of the a suspended accelerometer array through determined complex motion and is the basis of both simulation and experimental studies. The relationship between accuracy and sensor redundancy is investigated for arrays of up to 100 triple axis (300 accelerometer axes) accelerometers in simulation and 10 equivalent sensors (30 accelerometer axes) in the laboratory test rig. The paper also reports on the sensor calibration techniques and hardware implementation.
Resumo:
This paper presents a control method for a class of continuous-time switched systems, using state feedback variable structure controllers. The method is applied to the control of a two-cell dc-dc buck converter and a control circuit design using the software PSpice is proposed. The design is based on Lyapunov-Metzler-SPR systems and the performance of the resulting control system is superior to that afforded by a recently-proposed alternative sliding-mode control technique. The dc-dc power converters are very used in industrial applications, for instance, in power systems of hybrid electric vehicles and aircrafts. Good results were obtained and the proposed design is also inexpensive because it uses electric components that can be easily found for the hardware implementation. Future researches on the subject include the hardware validation of the dc-dc converter controller and the robust control design of switched systems, with structural failures. © 2011 IEEE.
Resumo:
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)
Resumo:
Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq)