142 resultados para High performance processors


Relevância:

90.00% 90.00%

Publicador:

Resumo:

Sphere Decoding (SD) is a highly effective detection technique for Multiple-Input Multiple-Output (MIMO) wireless communications receivers, offering quasi-optimal accuracy with relatively low computational complexity as compared to the ideal ML detector. Despite this, the computational demands of even low-complexity SD variants, such as Fixed Complexity SD (FSD), remains such that implementation on modern software-defined network equipment is a highly challenging process, and indeed real-time solutions for MIMO systems such as 4 4 16-QAM 802.11n are unreported. This paper overcomes this barrier. By exploiting large-scale networks of fine-grained softwareprogrammable processors on Field Programmable Gate Array (FPGA), a series of unique SD implementations are presented, culminating in the only single-chip, real-time quasi-optimal SD for 44 16-QAM 802.11n MIMO. Furthermore, it demonstrates that the high performance software-defined architectures which enable these implementations exhibit cost comparable to dedicated circuit architectures.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

To enable reliable data transfer in next generation Multiple-Input Multiple-Output (MIMO) communication systems, terminals must be able to react to fluctuating channel conditions by having flexible modulation schemes and antenna configurations. This creates a challenging real-time implementation problem: to provide the high performance required of cutting edge MIMO standards, such as 802.11n, with the flexibility for this behavioural variability. FPGA softcore processors offer a solution to this problem, and in this paper we show how heterogeneous SISD/SIMD/MIMD architectures can enable programmable multicore architectures on FPGA with similar performance and cost as traditional dedicated circuit-based architectures. When applied to a 4×4 16-QAM Fixed-Complexity Sphere Decoder (FSD) detector we present the first soft-processor based solution for real-time 802.11n MIMO.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

The initial part of this paper reviews the early challenges (c 1980) in achieving real-time silicon implementations of DSP computations. In particular, it discusses research on application specific architectures, including bit level systolic circuits that led to important advances in achieving the DSP performance levels then required. These were many orders of magnitude greater than those achievable using programmable (including early DSP) processors, and were demonstrated through the design of commercial digital correlator and digital filter chips. As is discussed, an important challenge was the application of these concepts to recursive computations as occur, for example, in Infinite Impulse Response (IIR) filters. An important breakthrough was to show how fine grained pipelining can be used if arithmetic is performed most significant bit (msb) first. This can be achieved using redundant number systems, including carry-save arithmetic. This research and its practical benefits were again demonstrated through a number of novel IIR filter chip designs which at the time, exhibited performance much greater than previous solutions. The architectural insights gained coupled with the regular nature of many DSP and video processing computations also provided the foundation for new methods for the rapid design and synthesis of complex DSP System-on-Chip (SoC), Intellectual Property (IP) cores. This included the creation of a wide portfolio of commercial SoC video compression cores (MPEG2, MPEG4, H.264) for very high performance applications ranging from cell phones to High Definition TV (HDTV). The work provided the foundation for systematic methodologies, tools and design flows including high-level design optimizations based on "algorithmic engineering" and also led to the creation of the Abhainn tool environment for the design of complex heterogeneous DSP platforms comprising processors and multiple FPGAs. The paper concludes with a discussion of the problems faced by designers in developing complex DSP systems using current SoC technology. © 2007 Springer Science+Business Media, LLC.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

The fabrication and performance of the first bit-level systolic correlator array is described. The application of systolic array concepts at the bit level provides a simple and extremely powerful method for implementing high-performance digital processing functions. The resulting structure is highly regular, facilitating yield enhancement through fault-tolerant redundancy techniques and therefore ideally suited to implementation as a VLSI chip. The CMOS/SOS chip operates at 35 MHz, is fully cascadable and exhibits 64-stage correlation for 1-bit reference and 4-bit data. 7 refs.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Optimized circuits for implementing high-performance bit-parallel IIR filters are presented. Circuits constructed mainly from simple carry save adders and based on most-significant-bit (MSB) first arithmetic are described. Two methods resulting in systems which are 100% efficient in that they are capable of sampling data every cycle are presented. In the first approach the basic circuit is modified so that the level of pipelining used is compatible with the small, but fixed, latency associated with the computation in question. This is achieved through insertion of pipeline delays (half latches) on every second row of cells. This produces an area-efficient solution in which the throughput rate is determined by a critical path of 76 gate delays. A second approach combines the MSB first arithmetic methods with the scattered look-ahead methods. Important design issues are addressed, including wordlength truncation, overflow detection, and saturation.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

The paper presents IPPro which is a high performance, scalable soft-core processor targeted for image processing applications. It has been based on the Xilinx DSP48E1 architecture using the ZYNQ Field Programmable Gate Array and is a scalar 16-bit RISC processor that operates at 526MHz, giving 526MIPS of performance. Each IPPro core uses 1 DSP48, 1 Block RAM and 330 Kintex-7 slice-registers, thus making the processor as compact as possible whilst maintaining flexibility and programmability. A key aspect of the approach is in reducing the application design time and implementation effort by using multiple IPPro processors in a SIMD mode. For different applications, this allows us to exploit different levels of parallelism and mapping for the specified processing architecture with the supported instruction set. In this context, a Traffic Sign Recognition (TSR) algorithm has been prototyped on a Zedboard with the colour and morphology operations accelerated using multiple IPPros. Simulation and experimental results demonstrate that the processing platform is able to achieve a speedup of 15 to 33 times for colour filtering and morphology operations respectively, with a reduced design effort and time.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

The end of Dennard scaling has pushed power consumption into a first order concern for current systems, on par with performance. As a result, near-threshold voltage computing (NTVC) has been proposed as a potential means to tackle the limited cooling capacity of CMOS technology. Hardware operating in NTV consumes significantly less power, at the cost of lower frequency, and thus reduced performance, as well as increased error rates. In this paper, we investigate if a low-power systems-on-chip, consisting of ARM's asymmetric big.LITTLE technology, can be an alternative to conventional high performance multicore processors in terms of power/energy in an unreliable scenario. For our study, we use the Conjugate Gradient solver, an algorithm representative of the computations performed by a large range of scientific and engineering codes.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

An SVD processor system is presented in which each processing element is implemented using a simple CORDIC unit. The internal recursive loop within the CORDIC module is exploited, with pipelining being used to multiplex the two independent micro-rotations onto a single CORDIC processor. This leads to a high performance and efficient hardware architecture. In addition, a novel method for scale factor correction is presented which only need be applied once at the end of the computation. This also reduces the computation time. The net result is an SVD architecture based on a conventional CORDIC approach, which combines high performance with high silicon area efficiency.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

A solvent-vapour thermoplastic bonding process is reported which provides high strength bonding of PMMA over a large area for multi-channel and multi-layer microfluidic devices with shallow high resolution channel features. The bond process utilises a low temperature vacuum thermal fusion step with prior exposure of the substrate to chloroform (CHCl3) vapour to reduce bond temperature to below the PMMA glass transition temperature. Peak tensile and shear bond strengths greater than 3 MPa were achieved for a typical channel depth reduction of 25 µm. The device-equivalent bond performance was evaluated for multiple layers and high resolution channel features using double-side and single-side exposure of the bonding pieces. A single-sided exposure process was achieved which is suited to multi-layer bonding with channel alignment at the expense of greater depth loss and a reduction in peak bond strength. However, leak and burst tests demonstrate bond integrity up to at least 10 bar channel pressure over the full substrate area of 100 mm x 100 mm. The inclusion of metal tracks within the bond resulted in no loss of performance. The vertical wall integrity between channels was found to be compromised by solvent permeation for wall thicknesses of 100 µm which has implications for high resolution serpentine structures. Bond strength is reduced considerably for multi-layer patterned substrates where features on each layer are not aligned, despite the presence of an intermediate blank substrate. Overall a high performance bond process has been developed that has the potential to meet the stringent specifications for lab-on-chip deployment in harsh environmental conditions for applications such as deep ocean profiling.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

We describe a simple strategy, which is based on the idea of space confinement, for the synthesis of carbon coating on LiFePO4 nanoparticles/graphene nanosheets composites in a water-in-oil emulsion system. The prepared composite displayed high performance as a cathode material for lithium-ion battery, such as high reversible lithium storage capacity (158 mA h g-1 after 100 cycles), high coulombic efficiency (over 97%), excellent cycling stability and high rate capability (as high as 83 mA h g -1 at 60 C). Very significantly, the preparation method employed can be easily adapted and be extended as a general approach to sophisticated compositions and structures for the preparation of highly dispersed nanosized structure on graphene. 

Relevância:

90.00% 90.00%

Publicador:

Resumo:

A facile method to synthesize well-dispersed TiO2 quantum dots on graphene nanosheets (TiO2-QDs/GNs) in a water-in-oil (W/O) emulsion system is reported. The TiO2/graphene composites display high performance as an anode material for lithium-ion batteries (LIBs), such as having high reversible lithium storage capacity, high Coulombic efficiency, excellent cycling stability, and high rate capability. The excellent electrochemical performance and special structure of the composites thus offer a way to prepare novel graphene-based electrode materials for high-energy-density and high-power LIBs.