782 results for FPGA, VHDL, Picoblaze, SERDES
Abstract:
Safety concerns in the operation of autonomous aerial systems require that safe-landing protocols be followed in situations where a mission should be aborted due to mechanical or other failure. On-board cameras provide information that can be used to determine potential landing sites, which are continually updated and ranked to prevent injury and minimize damage. Pulse Coupled Neural Networks (PCNNs) have been used to detect image features that assist in the classification of vegetation and can be used to minimize damage to the aerial vehicle. However, a significant drawback of PCNNs is that they are computationally expensive and have been better suited to off-line applications on conventional computing architectures. As heterogeneous computing architectures are becoming more common, an OpenCL implementation of a PCNN feature generator is presented and its performance is compared across OpenCL kernels designed for CPU, GPU and FPGA platforms. This comparison examines the compute times required for network convergence under a variety of images obtained during unmanned aerial vehicle trials to determine the plausibility of real-time feature detection.
Abstract:
Safety concerns in the operation of autonomous aerial systems require that safe-landing protocols be followed in situations where the mission should be aborted due to mechanical or other failure. This article presents a pulse-coupled neural network (PCNN) to assist in vegetation classification in a vision-based landing site detection system for an unmanned aircraft. We propose a heterogeneous computing architecture and an OpenCL implementation of a PCNN feature generator. Its performance is compared across OpenCL kernels designed for CPU, GPU, and FPGA platforms. This comparison examines the compute times required for network convergence under a variety of images to determine the plausibility of real-time feature detection.
Abstract:
A nine-level modular multilevel cascade converter (MMCC) based on four full-bridge cells is shown driving a piezoelectric ultrasonic transducer at 71 and 39 kHz, in simulation and experimentally. The modular cells are small stackable PCBs, each with two fully integrated surface-mount 22 V, 40 A MOSFET half-bridge converters, and include all control signal and power isolation. In this work, the bridges operate at 12 V and 384 kHz to deliver a 96 Vpp nine-level waveform with an effective switching frequency of 3 MHz. A 9 µH air-cored inductor forms a low-pass filter in conjunction with the 3000 pF capacitance of the transducer load. Eight equally phase-displaced, naturally sampled pulse width modulation (PWM) drive signals, along with the modulating sinusoid, are generated using phase accumulation techniques in a dedicated FPGA. Experimental time-domain and FFT plots of the multilevel and transducer output waveforms are presented and discussed.
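The figures quoted in this abstract can be cross-checked with the usual relations for a cascade of full-bridge cells driven by phase-shifted carriers. The sketch below is only an illustration under that assumption; the function name and parameters are hypothetical and do not come from the paper.

```python
def cascaded_bridge_summary(cells, dc_volts, carrier_hz):
    """Return (levels, peak-to-peak volts, effective switching frequency in Hz).

    Assumes identical full-bridge cells in series, each contributing
    +Vdc, 0 or -Vdc, with one phase-displaced carrier per half-bridge.
    """
    levels = 2 * cells + 1                   # -N*Vdc ... 0 ... +N*Vdc
    peak_to_peak = 2 * cells * dc_volts      # swing from -N*Vdc to +N*Vdc
    effective_hz = 2 * cells * carrier_hz    # 2N interleaved carriers
    return levels, peak_to_peak, effective_hz


# Four cells at 12 V and 384 kHz, as stated in the abstract.
print(cascaded_bridge_summary(cells=4, dc_volts=12, carrier_hz=384_000))
```

With four cells this gives nine levels, 96 V peak to peak and 3.072 MHz, which agrees with the 96 Vpp and roughly 3 MHz stated above.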
Abstract:
This paper introduces our dedicated authenticated encryption scheme ICEPOLE. ICEPOLE is a high-speed hardware-oriented scheme, suitable for high-throughput network nodes or generally any environment where specialized hardware (such as FPGAs or ASICs) can be used to provide high data processing rates. ICEPOLE-128 (the primary ICEPOLE variant) is very fast. On the modern FPGA device Virtex 6, a basic iterative architecture of ICEPOLE reaches 41 Gbits/s, which is over 10 times faster than the equivalent implementation of AES-128-GCM. The throughput-to-area ratio is also substantially better when compared to AES-128-GCM. We have carefully examined the security of the algorithm through a range of cryptanalytic techniques, and our findings indicate that ICEPOLE offers a high security level.
Abstract:
Purpose – The purpose of this paper is to describe an innovative compliance control architecture for hybrid multi-legged robots. The approach was verified on the hybrid legged-wheeled robot ASGUARD, which was inspired by quadruped animals. The adaptive compliance controller allows the system to cope with a variety of stairs and very rough terrain, and to move at high velocity on flat ground, without changing the control parameters. Design/methodology/approach – The paper shows how this adaptivity results in a versatile controller for hybrid legged-wheeled robots. For the locomotion control we use an adaptive model of motion pattern generators. The control approach takes into account the proprioceptive information of the torques that are applied to the legs. The controller itself is embedded on an FPGA-based, custom-designed motor control board. An additional proprioceptive inclination feedback is used to make the same controller more robust in terms of stair-climbing capabilities. Findings – The robot is well suited for disaster mitigation as well as for urban search and rescue missions, where it is often necessary to place sensors or cameras in dangerous or inaccessible areas to give the rescue personnel better situational awareness before they enter a possibly dangerous area. A rugged, waterproof and dust-proof corpus and the ability to swim are additional features of the robot. Originality/value – Contrary to existing approaches, a pre-defined walking pattern for stair-climbing was not used; instead, an adaptive approach based only on internal sensor information was adopted. In contrast to many other walking-pattern-based robots, the direct proprioceptive feedback was used to modify the internal control loop, thus adapting the compliance of each leg on-line.
Abstract:
Large Display Arrays (LDAs) use Light Emitting Diodes (LEDs) to inform a viewing audience. A matrix of individually driven LEDs allows the area represented to display text, images and video. LDAs have undergone rapid development over the past 10 years in both the modular and semi-flexible formats. This thesis critically analyses the communication architecture and processor functionality of current LDAs and presents an alternative method, that is, Scalable Flexible Large Display Arrays (SFLDAs). SFLDAs are more adaptable to a variety of applications because of enhancements in scalability and flexibility. Scalability is the ability to configure SFLDAs from 0.8 m² to 200 m². Flexibility is increased functionality within the processors to handle changes in configuration and the use of a communication architecture that standardises two-way communication throughout the SFLDA. While common video platforms such as Digital Video Interface (DVI), Serial Digital Interface (SDI), and High Definition Multimedia Interface (HDMI) are considered as solutions for the communication architecture of SFLDAs, so too are modulation, fibre optic, capacitive coupling and Ethernet. From an analysis of these architectures, Ethernet was identified as the best solution. The use of Ethernet as the communication architecture in SFLDAs means that both hardware and software modules are capable of interfacing to the SFLDAs. The Video to Ethernet Processor Unit (VEPU), Scoreboard, Image and Control Software (SICS) and Ethernet to LED Processor Unit (ELPU) have been developed to form the key components in designing and implementing the first SFLDA. Data throughput rate and spectrophotometer tests were used to measure the effectiveness of Ethernet within the SFLDA constructs. The results of testing and analysis of these architectures showed that Ethernet satisfactorily met the requirements of SFLDAs.
Abstract:
Tridiagonal diagonally dominant linear systems arise in many scientific and engineering applications. The standard Thomas algorithm for solving such systems is inherently serial, forming a bottleneck in computation. Algorithms such as cyclic reduction and SPIKE reduce a single large tridiagonal system into multiple small independent systems which can be solved in parallel. We have developed portable OpenCL implementations of the cyclic reduction and SPIKE algorithms with the intent to target a range of co-processors in a heterogeneous computing environment, including Field Programmable Gate Arrays (FPGAs), Graphics Processing Units (GPUs) and other multi-core processors. In this paper, we evaluate these designs in the context of solver performance, resource efficiency and numerical accuracy.
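To see why the Thomas algorithm is called inherently serial, a minimal Python sketch of the textbook method may help. It is not the authors' OpenCL code; the function name and the array convention (three equal-length lists, with `lower[0]` and `upper[-1]` unused) are assumptions made for illustration.

```python
def thomas_solve(lower, diag, upper, rhs):
    """Solve a tridiagonal system with the Thomas algorithm.

    lower[i], diag[i], upper[i] are the sub-, main- and super-diagonal
    entries of row i; lower[0] and upper[-1] are ignored.
    Assumes a diagonally dominant system, so no pivoting is done.
    """
    n = len(diag)
    c_prime = [0.0] * n
    d_prime = [0.0] * n
    c_prime[0] = upper[0] / diag[0]
    d_prime[0] = rhs[0] / diag[0]
    # Forward sweep: step i needs the result of step i - 1 (serial chain).
    for i in range(1, n):
        denom = diag[i] - lower[i] * c_prime[i - 1]
        c_prime[i] = upper[i] / denom if i < n - 1 else 0.0
        d_prime[i] = (rhs[i] - lower[i] * d_prime[i - 1]) / denom
    # Back substitution: step i needs x[i + 1] (a second serial chain).
    x = [0.0] * n
    x[-1] = d_prime[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d_prime[i] - c_prime[i] * x[i + 1]
    return x


# 3x3 example with 4 on the diagonal and 1 off the diagonal.
print(thomas_solve([0, 1, 1], [4, 4, 4], [1, 1, 0], [6, 12, 14]))
```

Both loops carry a dependency from one row to the next, which is the bottleneck that cyclic reduction and SPIKE avoid by splitting the system into independent pieces.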
Abstract:
An alternative approach to digital PWM generation uses an accumulator rather than a counter to generate the carrier. This offers several advantages. The resolution and gain of the pulse width modulator remain constant regardless of the module clock frequency and PWM output frequency. The PWM resolution also becomes fixed at the register width. Even at high PWM frequencies, the resolution remains high when averaged over a number of PWM cycles. An inherent dithering of the PWM waveform introduced over successive cycles blurs the switching spectra without distorting the modulating waveform. The technique also lends itself to easily generating several phase shifted PWM waveforms suitable for multilevel converter modulation. Several example waveforms generated using both simulation and FPGA hardware are presented.
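The accumulator-based carrier can be modelled in a few lines of software. The following is a behavioural sketch only, assuming a wrapping accumulator whose value is compared against a duty threshold on every clock; the function name, parameters and default width are hypothetical and are not taken from the paper.

```python
def accumulator_pwm(duty, increment, width=16, n_clocks=1 << 16, phase=0):
    """Return a list of 0/1 PWM samples, one per clock.

    The carrier is a `width`-bit accumulator that advances by `increment`
    each clock and wraps naturally, so the PWM frequency is
    f_clk * increment / 2**width while the duty threshold keeps the full
    register resolution. `phase` offsets the carrier, giving a
    phase-shifted copy of the same waveform.
    """
    mask = (1 << width) - 1
    acc = phase & mask
    samples = []
    for _ in range(n_clocks):
        samples.append(1 if acc < duty else 0)  # compare carrier to duty
        acc = (acc + increment) & mask          # wrap-around accumulate
    return samples


# 8-bit accumulator, increment 3: about 3 PWM cycles per 256 clocks.
wave = accumulator_pwm(duty=100, increment=3, width=8, n_clocks=256)
print(sum(wave) / len(wave))  # average duty over 256 clocks
```

Because the increment need not divide the register range, the edge positions shift slightly from one PWM cycle to the next. In this model that shift is the dithering described above: the duty averaged over many cycles keeps the full register resolution.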
Abstract:
Emerging embedded applications are based on evolving standards (e.g., MPEG2/4, H.264/265, IEEE802.11a/b/g/n). Since most of these applications run on handheld devices, there is an increasing need for a single-chip solution that can dynamically interoperate between different standards and their derivatives. In order to achieve high resource utilization and low power dissipation, we propose REDEFINE, a polymorphic ASIC in which specialized hardware units are replaced with basic hardware units that can create the same functionality by runtime re-composition. It is a "future-proof" custom hardware solution for multiple applications and their derivatives in a domain. In this article, we describe a compiler framework and supporting hardware comprising compute, storage, and communication resources. Applications described in a high-level language (e.g., C) are compiled into application substructures. For each application substructure, a set of compute elements (CEs) on the hardware are interconnected during runtime to form a pattern that closely matches the communication pattern of that particular application. The advantage is that the bounded CEs are neither processor cores nor logic elements as in FPGAs. Hence, REDEFINE offers the power and performance advantage of an ASIC and the hardware reconfigurability and programmability of an FPGA/instruction set processor. In addition, the hardware supports custom instruction pipelining. Existing instruction-set extensible processors determine a sequence of instructions that repeatedly occur within the application to create custom instructions at design time to speed up the execution of this sequence. We extend this scheme further, where a kernel is compiled into custom instructions that bear a strong producer-consumer relationship (and are not limited to frequently occurring sequences of instructions). Custom instructions, realized as hardware compositions effected at runtime, allow several instances of the same to be active in parallel. A key distinguishing factor in the majority of the emerging embedded applications is stream processing. To reduce the overheads of data transfer between custom instructions, direct communication paths are employed among custom instructions. In this article, we present an overview of the hardware-aware compiler framework, which determines the NoC-aware schedule of transports of the data exchanged between the custom instructions on the interconnect. The results for the FFT kernel indicate a 25% reduction in the number of loads/stores, and throughput improves by log(n) for n-point FFT when compared to a sequential implementation. Overall, REDEFINE offers flexibility and runtime reconfigurability at the expense of 1.16x in power and 8x in area when compared to an ASIC. The REDEFINE implementation consumes 0.1x the power of an FPGA implementation. In addition, the configuration overhead of the FPGA implementation is 1,000x more than that of REDEFINE.
Abstract:
Problems like windup or rollover arise in a PI controller working under saturation. Hence, anti-windup schemes are necessary to minimize performance degradation. A similar situation may occur in a Proportional Resonant (PR) controller in the presence of a sustained error input. Several methods can be employed, based on existing knowledge of PI controllers, to counter this problem. In this paper, a few such schemes are proposed and implemented in FPGA and MATLAB, and their possible use and limitations are studied from the obtained results.
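As a point of reference for the windup problem in the PI case, here is a minimal sketch of one common anti-windup scheme, conditional integration (clamping). It is not one of the schemes proposed in the paper, and the class name, gains and limits are hypothetical values chosen for illustration.

```python
class ClampedPI:
    """Discrete PI controller with conditional-integration anti-windup."""

    def __init__(self, kp, ki, dt, out_min, out_max):
        self.kp, self.ki, self.dt = kp, ki, dt
        self.out_min, self.out_max = out_min, out_max
        self.integral = 0.0

    def update(self, error):
        candidate = self.integral + self.ki * error * self.dt
        unsaturated = self.kp * error + candidate
        output = min(max(unsaturated, self.out_min), self.out_max)
        saturated_high = unsaturated > self.out_max
        saturated_low = unsaturated < self.out_min
        # Keep integrating only if the output is inside its limits, or
        # if the error is pulling the output back towards the limits.
        if (not saturated_high and not saturated_low) \
                or (saturated_high and error < 0) \
                or (saturated_low and error > 0):
            self.integral = candidate
        return output


pi = ClampedPI(kp=0.5, ki=10.0, dt=0.01, out_min=-1.0, out_max=1.0)
for _ in range(1000):          # sustained positive error saturates output
    pi.update(1.0)
print(pi.integral)             # stays bounded instead of growing to ~100
```

Without the conditional update, the integral term would keep growing for as long as the error persists, and the output would stay saturated long after the error changes sign.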
Abstract:
High-performance video standards use prediction techniques to achieve high picture quality at low bit rates. The type of prediction determines the bit rate and the image quality. Intra prediction achieves high video quality with a significant reduction in bit rate. This paper presents an area-optimized architecture for intra prediction for H.264 decoding at HDTV resolution, with a target of 60 fps. The architecture was validated on a Virtex-5 FPGA-based platform and achieves a frame rate of 64 fps. The architecture is based on a multi-level memory hierarchy to reduce latency and ensure optimum resource utilization. It removes redundancy by reusing the same functional blocks across different modes. The proposed architecture uses only 13% of the total LUTs available on the Xilinx FPGA XC5VLX50T.
Abstract:
The 4×4 discrete cosine transform is one of the most important building blocks for the emerging video coding standard, viz. H.264. The conventional implementation approximates the transform matrix elements to facilitate integer arithmetic, for which hardware is suitably prepared. Though the transform coding does not involve any multiplications, the quantization process requires sixteen 16-bit multiplications. The algorithm used here eliminates the approximation in transform coding and the multiplication in the quantization process by using algebraic integer coding. We propose an area-efficient implementation of the transform and quantization blocks based on algebraic integer coding. The designs were synthesized with 90 nm TSMC CMOS technology and were also implemented on a Xilinx FPGA. The gate count and throughput achievable in this case are 7000 and 125 Msamples/sec, respectively.
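For context, the conventional integer approximation mentioned here is the H.264 4×4 forward core transform, Y = Cf · X · Cfᵀ, whose matrix entries are only ±1 and ±2. The sketch below illustrates that conventional transform, not the algebraic-integer scheme proposed in the paper; the helper names are hypothetical.

```python
# H.264 4x4 forward core transform matrix (integer approximation of the DCT).
CF = [
    [1, 1, 1, 1],
    [2, 1, -1, -2],
    [1, -1, -1, 1],
    [1, -2, 2, -1],
]


def matmul4(a, b):
    """Multiply two 4x4 integer matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def transpose4(m):
    """Transpose a 4x4 matrix given as nested lists."""
    return [[m[j][i] for j in range(4)] for i in range(4)]


def forward_core_transform(block):
    """Compute Cf * X * Cf^T for a 4x4 residual block X.

    Written with generic multiplications for brevity; since every
    coefficient is +/-1 or +/-2, hardware needs only adds and shifts.
    """
    return matmul4(matmul4(CF, block), transpose4(CF))


flat = [[1] * 4 for _ in range(4)]        # constant (DC-only) block
print(forward_core_transform(flat)[0])    # energy lands in the DC term
```

The scaling that makes this equivalent to a true DCT is folded into the quantization step, which is where the sixteen multiplications referred to above arise.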
Abstract:
High-performance video standards use prediction techniques to achieve high picture quality at low bit rates. The type of prediction determines the bit rate and the image quality. Intra prediction achieves high video quality with a significant reduction in bit rate. This paper presents a novel area-optimized architecture for intra prediction in H.264 decoding at HDTV resolution. The architecture has been validated on a Xilinx Virtex-5 FPGA-based platform and achieved a frame rate of 64 fps. The architecture is based on a multi-level memory hierarchy to reduce latency and ensure optimum resource utilization. It removes redundancy by reusing the same functional blocks across different modes. The proposed architecture uses only 13% of the total LUTs available on the Xilinx FPGA XC5VLX50T.
Abstract:
Video decoders used in emerging applications need to be flexible to handle a large variety of video formats and deliver scalable performance to handle wide variations in workloads. In this paper we propose a unified software and hardware architecture for video decoding to achieve scalable performance with flexibility. The lightweight processor tiles and the reconfigurable hardware tiles in our architecture enable software and hardware implementations to co-exist, while a programmable interconnect enables dynamic interconnection of the tiles. Our process-network-oriented compilation flow achieves realization-agnostic application partitioning and enables seamless migration across uniprocessor, multi-processor, semi-hardware and full-hardware implementations of a video decoder. An application quality-of-service-aware scheduler monitors and controls the operation of the entire system. We prove the concept through a prototype of the architecture on an off-the-shelf FPGA. The FPGA prototype shows a scaling in performance from QCIF to 1080p resolutions in four discrete steps. We also demonstrate that the reconfiguration time is short enough to allow migration from one configuration to the other without any frame loss.