49 results for Advanced signal processing
Abstract:
This paper describes the design, application, and evaluation of a user-friendly, flexible, scalable and inexpensive Advanced Educational Parallel (AdEPar) digital signal processing (DSP) system based on TMS320C25 digital processors to implement DSP algorithms. This system will be used in the DSP laboratory by graduate students to work on advanced topics such as developing parallel DSP algorithms. Graduating senior students who have gained some experience in DSP can also use the system. The DSP laboratory has proved to be a useful tool in the hands of the instructor for teaching the mathematically oriented topics of DSP that are often difficult for students to grasp. The DSP laboratory with assigned projects has greatly improved the ability of the students to understand such complex topics as the fast Fourier transform algorithm, linear and circular convolution, and the theory and design of infinite impulse response (IIR) and finite impulse response (FIR) filters. The user-friendly PC software support of the AdEPar system makes it easy for students to develop DSP programs. This paper gives the architecture of the AdEPar DSP system. The communication between processors and the PC-DSP processor communication are explained. The parallel debugger kernels and the restrictions of the system are described. Programming in the AdEPar is explained, and two benchmarks (parallel FFT and DES) are presented to show the system performance.
Abstract:
With the advent of new video standards such as MPEG-4 part-10 and H.264/H.26L, demands for advanced video coding, particularly in the area of variable block size video motion estimation (VBSME), are increasing. In this paper, we propose a new one-dimensional (1-D) very large-scale integration architecture for full-search VBSME (FSVBSME). The VBS sum of absolute differences (SAD) computation is performed by re-using the results of smaller sub-block computations. These are distributed and combined by incorporating a shuffling mechanism within each processing element. Whereas a conventional 1-D architecture can process only one motion vector (MV), this new architecture can process up to 41 MV sub-blocks (within a macroblock) in the same number of clock cycles.
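The re-use of smaller sub-block results can be illustrated in software. The sketch below is not the paper's 1-D hardware architecture or its shuffling mechanism; it only shows, under the standard H.264 partitioning of a 16x16 macroblock, how the sixteen 4x4 SADs can be combined into all 41 sub-block SADs for one candidate position. The function names and the dictionary key layout are illustrative assumptions.

```python
# Partition shapes of a 16x16 macroblock as (width, height) in pixels.
SHAPES = [(4, 4), (8, 4), (4, 8), (8, 8), (16, 8), (8, 16), (16, 16)]


def sad_4x4_grid(cur, ref):
    """Return a 4x4 grid holding the SAD of each 4x4 sub-block."""
    grid = [[0] * 4 for _ in range(4)]
    for y in range(16):
        for x in range(16):
            grid[y // 4][x // 4] += abs(cur[y][x] - ref[y][x])
    return grid


def all_partition_sads(cur, ref):
    """Build every partition SAD by summing already computed 4x4 SADs.

    Keys are (width, height, x_offset, y_offset) in pixels (hypothetical layout).
    """
    grid = sad_4x4_grid(cur, ref)
    sads = {}
    for w, h in SHAPES:
        bw, bh = w // 4, h // 4  # partition size in units of 4x4 blocks
        for top in range(0, 4, bh):
            for left in range(0, 4, bw):
                sads[(w, h, left * 4, top * 4)] = sum(
                    grid[r][c]
                    for r in range(top, top + bh)
                    for c in range(left, left + bw)
                )
    return sads


# Synthetic 16x16 blocks used only to exercise the functions.
cur_block = [[(3 * x + 5 * y) % 256 for x in range(16)] for y in range(16)]
ref_block = [[(7 * x + y) % 256 for x in range(16)] for y in range(16)]
partition_sads = all_partition_sads(cur_block, ref_block)
```

The pixel differences are computed once, in `sad_4x4_grid`; every larger partition is obtained from additions only, which is the property the architecture exploits.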
Abstract:
The technical challenges in the design and programming of signal processors for multimedia communication are discussed. The development of terminal equipment to meet such demand presents a significant technical challenge, considering that it is highly desirable that the equipment be cost effective, power efficient, versatile, and extensible for future upgrades. The main challenges in the design and programming of signal processors for multimedia communication are general-purpose signal processor design, application-specific signal processor design, operating systems and programming support, and application programming. The size of the FFT is programmable so that it can be used for various OFDM-based communication systems, such as digital audio broadcasting (DAB), digital video broadcasting-terrestrial (DVB-T) and digital video broadcasting-handheld (DVB-H). The clustered architecture design and distributed ping-pong register files in the PAC DSP raise new challenges for code generation.
Abstract:
In this paper, we propose a novel linear transmit precoding strategy for multiple-input, multiple-output (MIMO) systems employing improper signal constellations. In particular, improved zero-forcing (ZF) and minimum mean square error (MMSE) precoders are derived based on modified cost functions, and are shown to achieve superior performance without loss of spectrum efficiency compared to the conventional linear and nonlinear precoders. The superiority of the proposed precoders over the conventional solutions is verified by both simulation and analytical results. The novel approach to precoding design is also applied to the case of an imperfect channel estimate with a known error covariance, as well as to the multi-user scenario where precoding based on the nullspace of the channel transmission matrix is employed to decouple multi-user channels. In both cases, the improved precoding schemes yield a significant performance gain compared to their conventional counterparts.
Abstract:
Advances in silicon technology have been a key development in the realisation of many telecommunication and signal processing systems. In many cases, the development of application-specific digital signal processing (DSP) chips is the most cost-effective solution and provides the highest performance. Advances made in computer-aided design (CAD) tools and design methodologies now allow designers to develop complex chips within months or even weeks. This paper gives an insight into the challenges and design methodologies of implementing advanced high-performance chips for DSP. In particular, the paper reviews some of the techniques used to develop circuit architectures from high-level descriptions and the tools which are then used to realise silicon layout.
Abstract:
With a significant increase in the number of digital cameras used for various purposes, there is a pressing need for advanced video analysis techniques that can systematically interpret and understand the semantics of video content recorded in security surveillance, intelligent transportation, health care, and video retrieval and summarization. Understanding and interpreting human behaviours from video analysis remains challenging because of non-rigid human motion, self and mutual occlusions, and changes in lighting conditions. To solve these problems, advanced image and signal processing technologies such as neural networks, fuzzy logic, probabilistic estimation theory and statistical learning have been extensively investigated.
Abstract:
In this paper, we introduce an efficient method for particle selection in tracking objects in complex scenes. First, we improve the proposal distribution function of the tracking algorithm by including the current observation, which reduces the cost of evaluating particles with a very low likelihood. In addition, we use a partitioned sampling approach to decompose the dynamic state into several stages. This makes it possible to deal with high-dimensional states without an excessive computational cost. To represent the color distribution, the appearance of the tracked object is modelled by sampled pixels. Based on this representation, the probability of any observation is estimated using non-parametric techniques in color space. As a result, we obtain a Probability color Density Image (PDI) where each pixel indicates its membership to the target color model. In this way, the evaluation of all particles is accelerated by computing the likelihood p(z|x) using the Integral Image of the PDI.
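A minimal sketch of why the Integral Image speeds up particle evaluation: once it is built, the sum of PDI values inside any axis-aligned rectangle costs four look-ups, regardless of the rectangle size. The abstract does not give the exact form of p(z|x), so the score used here (the mean PDI value inside the particle's rectangle) is an assumption, as are all function names.

```python
def integral_image(img):
    """Return a (h+1) x (w+1) table where ii[y][x] is the sum of img[0:y][0:x]."""
    h, w = len(img), len(img[0])
    ii = [[0.0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0.0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, top, left, height, width):
    """Sum of the original image over a rectangle, using four table look-ups."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


def particle_score(ii, top, left, height, width):
    """Assumed likelihood proxy: mean PDI value inside the particle's rectangle."""
    return rect_sum(ii, top, left, height, width) / (height * width)


# Toy PDI: a 2x2 patch of target-colored pixels in a 4x5 image.
pdi = [
    [0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 1.0, 0.0, 0.0],
    [0.0, 1.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0],
]
pdi_integral = integral_image(pdi)
```

The table is built once per frame; each particle is then scored in constant time, so the cost of evaluating all particles grows with the number of particles rather than with the area they cover.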
Abstract:
This paper presents single-chip FPGA implementations of the Advanced Encryption Standard (AES) algorithm, Rijndael. In particular, the designs utilise look-up tables to implement the entire Rijndael Round function. A comparison is provided between these designs and similar existing implementations. Hardware implementations of encryption algorithms prove much faster than equivalent software implementations, and since there is a need to perform encryption on data in real time, speed is very important. In particular, Field Programmable Gate Arrays (FPGAs) are well suited to encryption implementations due to their flexibility and an architecture which can be exploited to accommodate typical encryption transformations. In this paper, a Look-Up Table (LUT) methodology is introduced where complex and slow operations are replaced by simple LUTs. A LUT-based fully pipelined Rijndael implementation is described which has a pre-placement performance of 12 Gbits/sec, which is 1.2 times faster than an alternative design in which look-up tables are utilised to implement only one of the Round function transformations, and 6 times faster than other previous single-chip implementations. Iterative Rijndael implementations based on the Look-Up Table design approach are also discussed and prove faster than typical iterative implementations.
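The LUT idea can be shown on a single operation. The sketch below is a software illustration only, not the paper's FPGA design (which maps the entire Round function to tables): it precomputes multiplication by 2 in the AES field GF(2^8), so that the shift and conditional XOR are replaced by one table read. The names `xtime`, `XTIME_LUT` and `mul3_lut` are illustrative.

```python
AES_POLY = 0x11B  # AES field polynomial x^8 + x^4 + x^3 + x + 1


def xtime(b):
    """Multiply byte b by 2 in GF(2^8) using a shift and a conditional XOR."""
    b <<= 1
    if b & 0x100:       # overflow past 8 bits: reduce modulo the field polynomial
        b ^= AES_POLY
    return b


# Precompute once; afterwards the operation is a single table read.
XTIME_LUT = [xtime(b) for b in range(256)]


def mul3_lut(b):
    """Multiply by 3 in GF(2^8) as (2*b) XOR b, with the doubling read from the LUT."""
    return XTIME_LUT[b] ^ b
```

In hardware the same trade is made with block or distributed memory: logic depth is exchanged for storage, which is what allows a short, fully pipelined critical path.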
Abstract:
Details are presented of the IRIS synthesis system for high-performance digital signal processing. This tool allows non-specialists to automatically derive VLSI circuit architectures from high-level, algorithmic representations, and provides a quick route to silicon implementation. The applicability of the system is demonstrated using the design example of a one-dimensional Discrete Cosine Transform circuit.
Abstract:
Mixture of Gaussians (MoG) modelling [13] is a popular approach to background subtraction in video sequences. Although the algorithm shows good empirical performance, it lacks theoretical justification. In this paper, we give a justification for it from an online stochastic expectation maximization (EM) viewpoint and extend it to a general framework of regularized online classification EM for MoG with guaranteed convergence. By choosing a special regularization function, the l1 norm, we derive a new set of updating equations for l1 regularized online MoG. It is shown empirically that l1 regularized online MoG converges faster than the original online MoG.
Abstract:
The paper presents IPPro, a high-performance, scalable soft-core processor targeted at image processing applications. It is based on the Xilinx DSP48E1 architecture using the ZYNQ Field Programmable Gate Array and is a scalar 16-bit RISC processor that operates at 526 MHz, giving 526 MIPS of performance. Each IPPro core uses 1 DSP48, 1 Block RAM and 330 Kintex-7 slice-registers, thus making the processor as compact as possible whilst maintaining flexibility and programmability. A key aspect of the approach is in reducing the application design time and implementation effort by using multiple IPPro processors in a SIMD mode. For different applications, this allows us to exploit different levels of parallelism and mapping for the specified processing architecture with the supported instruction set. In this context, a Traffic Sign Recognition (TSR) algorithm has been prototyped on a Zedboard with the colour and morphology operations accelerated using multiple IPPros. Simulation and experimental results demonstrate that the processing platform is able to achieve speedups of 15 and 33 times for colour filtering and morphology operations respectively, with reduced design effort and time.
Abstract:
With security and surveillance, there is an increasing need to be able to process image data efficiently and effectively, either at source or in large data networks. Whilst Field Programmable Gate Arrays have been seen as a key technology for enabling this, they typically use high-level and/or hardware description language synthesis approaches. This is a major disadvantage in terms of the time needed to design or program them and to verify correct operation, and it considerably reduces the programmability of any technique based on this technology. The work here proposes a different approach of using optimised soft-core processors which can be programmed in software. In particular, the paper proposes a design tool chain for programming such processors that uses the CAL Actor Language as a starting point for describing an image processing algorithm and targets its implementation to these custom-designed, soft-core processors on FPGA. The main purpose is to exploit the task and data parallelism in order to achieve the same parallelism as a previous HDL implementation while avoiding the design time, verification and debugging steps associated with such approaches.
Abstract:
The Field Programmable Gate Array (FPGA) implementation of the commonly used Histogram of Oriented Gradients (HOG) algorithm is explored. The HOG algorithm is employed to extract features for object detection. A key focus has been to explore the use of a new FPGA-based processor which has been targeted at image processing. The paper gives details of the mapping and scheduling factors that influence the performance and the stages that were undertaken to allow the algorithm to be deployed on FPGA hardware, whilst taking into account the specific IPPro architecture features. We show that multi-core IPPro performance can exceed that of state-of-the-art FPGA designs by up to 3.2 times, with reduced design and implementation effort and increased flexibility, all on a low-cost Zynq programmable system.
Abstract:
This paper presents a new Flexible Macroblock Ordering (FMO) type for the H.264 Advanced Video Coding (AVC) standard, which can more efficiently flag the position and shape of regions of interest (ROIs) in each frame. In H.264/AVC, 7 types of FMO have been defined, all of which are designed for error resilience. Most previous work related to ROI processing has adopted Type-2 (foreground & background) or Type-6 (explicit) to flag the position and shape of the ROI. However, only rectangular shapes are allowed in Type-2, and for non-rectangular shapes the non-ROI macroblocks may be wrongly flagged as being within the ROI, which could seriously affect subsequent processing of the ROI. In Type-6, each macroblock in a frame uses fixed-length bits to indicate its slice group. In general, each ROI is assigned to one slice group identity. Although this FMO type can more accurately flag the position and shape of the ROI, it incurs a significant bitrate overhead. The proposed new FMO type uses the smallest rectangle that covers the ROI to indicate its position, and a spiral binary mask is employed within the rectangle to indicate the shape of the ROI. This technique can accurately flag the ROI and provide significant savings in the bitrate overhead. Compared with Type-6, an 80% to 90% reduction in the bitrate overhead can be obtained while achieving the same accuracy.
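The rectangle-plus-mask idea can be sketched as follows. The abstract does not specify the spiral scan or the bitstream syntax, so the clockwise, outside-in spiral starting at the top-left corner, the tuple layout and all function names are assumptions made for illustration. The input is a per-macroblock 0/1 map containing at least one ROI macroblock.

```python
def spiral_order(height, width):
    """Yield (row, col) cells of a grid clockwise from the top-left corner inward (assumed scan)."""
    top, bottom, left, right = 0, height - 1, 0, width - 1
    while top <= bottom and left <= right:
        for c in range(left, right + 1):
            yield top, c
        for r in range(top + 1, bottom + 1):
            yield r, right
        if top < bottom:
            for c in range(right - 1, left - 1, -1):
                yield bottom, c
        if left < right:
            for r in range(bottom - 1, top, -1):
                yield r, left
        top, bottom, left, right = top + 1, bottom - 1, left + 1, right - 1


def encode_roi(mb_mask):
    """Return the smallest covering rectangle and one mask bit per macroblock inside it."""
    rows = [r for r, row in enumerate(mb_mask) if any(row)]
    cols = [c for c in range(len(mb_mask[0])) if any(row[c] for row in mb_mask)]
    top, left = rows[0], cols[0]
    height, width = rows[-1] - top + 1, cols[-1] - left + 1
    bits = [mb_mask[top + r][left + c] for r, c in spiral_order(height, width)]
    return (top, left, height, width), bits


def decode_roi(frame_h, frame_w, rect, bits):
    """Rebuild the full-frame macroblock map from the rectangle and the spiral mask."""
    top, left, height, width = rect
    mask = [[0] * frame_w for _ in range(frame_h)]
    for (r, c), bit in zip(spiral_order(height, width), bits):
        mask[top + r][left + c] = bit
    return mask


# Toy 4x6 macroblock map with a cross-shaped ROI.
roi_map = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0, 0],
    [0, 1, 1, 1, 0, 0],
    [0, 0, 1, 0, 0, 0],
]
roi_rect, roi_bits = encode_roi(roi_map)
```

In this toy case the shape is carried by 9 mask bits plus the rectangle coordinates, instead of one fixed-length code for each of the 24 macroblocks in the frame, which is the source of the overhead saving the abstract describes.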