998 resultados para Processor architecture


Relevância:

40.00% 40.00%

Publicador:

Resumo:

The most promising way to maintain reliable data transfer across the rapidly fluctuating channels used by next generation multiple-input multiple-output communications schemes is to exploit run-time variable modulation and antenna configurations. This demands that the baseband signal processing architectures employed in the communications terminals must provide low cost and high performance with runtime reconfigurability. We present a softcore-processor based solution to this issue, and show for the first time, that such programmable architectures can enable real-time data operation for cutting-edge standards
such as 802.11n; furthermore, by exploiting deep processing pipelines and interleaved task execution, the cost and performance of these architectures is shown to be on a par with traditional dedicated circuit based solutions. We believe this to be the first such programmable architecture to achieve this, and the combination of implementation efficiency and programmability makes this implementation style the most promising approach for hosting such dynamic architectures.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This paper proposes a security architecture for the basic cross indexing systems emerging as foundational structures in current health information systems. In these systems unique identifiers are issued to healthcare providers and consumers. In most cases, such numbering schemes are national in scope and must therefore necessarily be used via an indexing system to identify records contained in pre-existing local, regional or national health information systems. Most large scale electronic health record systems envisage that such correlation between national healthcare identifiers and pre-existing identifiers will be performed by some centrally administered cross referencing, or index system. This paper is concerned with the security architecture for such indexing servers and the manner in which they interface with pre-existing health systems (including both workstations and servers). The paper proposes two required structures to achieve the goal of a national scale, and secure exchange of electronic health information, including: (a) the employment of high trust computer systems to perform an indexing function, and (b) the development and deployment of an appropriate high trust interface module, a Healthcare Interface Processor (HIP), to be integrated into the connected workstations or servers of healthcare service providers. This proposed architecture is specifically oriented toward requirements identified in the Connectivity Architecture for Australia’s e-health scheme as outlined by NEHTA and the national e-health strategy released by the Australian Health Ministers.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This paper presents the architecture of a fault-tolerant, special-purpose multi-microprocessor system for solving Partial Differential Equations (PDEs). The modular nature of the architecture allows the use of hundreds of Processing Elements (PEs) for high throughput. Its performance is evaluated by both analytical and simulation methods. The results indicate that the system can achieve high operation rates and is not sensitive to inter-processor communication delay.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

There is an increased interest in the use of Unmanned Aerial Vehicles for load transportation from environmental remote sensing to construction and parcel delivery. One of the main challenges is accurate control of the load position and trajectory. This paper presents an assessment of real flight trials for the control of an autonomous multi-rotor with a suspended slung load using only visual feedback to determine the load position. This method uses an onboard camera to take advantage of a common visual marker detection algorithm to robustly detect the load location. The load position is calculated using an onboard processor, and transmitted over a wireless network to a ground station integrating MATLAB/SIMULINK and Robotic Operating System (ROS) and a Model Predictive Controller (MPC) to control both the load and the UAV. To evaluate the system performance, the position of the load determined by the visual detection system in real flight is compared with data received by a motion tracking system. The multi-rotor position tracking performance is also analyzed by conducting flight trials using perfect load position data and data obtained only from the visual system. Results show very accurate estimation of the load position (~5% Offset) using only the visual system and demonstrate that the need for an external motion tracking system is not needed for this task.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The problem of scheduling divisible loads in distributed computing systems, in presence of processor release time is considered. The objective is to find the optimal sequence of load distribution and the optimal load fractions assigned to each processor in the system such that the processing time of the entire processing load is a minimum. This is a difficult combinatorial optimization problem and hence genetic algorithms approach is presented for its solution.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Processor architects have a challenging task of evaluating a large design space consisting of several interacting parameters and optimizations. In order to assist architects in making crucial design decisions, we build linear regression models that relate Processor performance to micro-architecture parameters, using simulation based experiments. We obtain good approximate models using an iterative process in which Akaike's information criteria is used to extract a good linear model from a small set of simulations, and limited further simulation is guided by the model using D-optimal experimental designs. The iterative process is repeated until desired error bounds are achieved. We used this procedure to establish the relationship of the CPI performance response to 26 key micro-architectural parameters using a detailed cycle-by-cycle superscalar processor simulator The resulting models provide a significance ordering on all micro-architectural parameters and their interactions, and explain the performance variations of micro-architectural techniques.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

In this paper we develop a multithreaded VLSI processor linear array architecture to render complex environments based on the radiosity approach. The processing elements are identical and multithreaded. They work in Single Program Multiple Data (SPMD) mode. A new algorithm to do the radiosity computations based on the progressive refinement approach[2] is proposed. Simulation results indicate that the architecture is latency tolerant and scalable. It is shown that a linear array of 128 uni-threaded processing elements sustains a throughput close to 0.4 million patches/sec.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This paper proposes a Petri net model for a commercial network processor (Intel iXP architecture) which is a multithreaded multiprocessor architecture. We consider and model three different applications viz., IPv4 forwarding, network address translation, and IP security running on IXP 2400/2850. A salient feature of the Petri net model is its ability to model the application, architecture and their interaction in great detail. The model is validated using the Intel proprietary tool (SDK 3.51 for IXP architecture) over a range of configurations. We conduct a detailed performance evaluation, identify the bottleneck resource, and propose a few architectural extensions and evaluate them in detail.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Video decoders used in emerging applications need to be flexible to handle a large variety of video formats and deliver scalable performance to handle wide variations in workloads. In this paper we propose a unified software and hardware architecture for video decoding to achieve scalable performance with flexibility. The light weight processor tiles and the reconfigurable hardware tiles in our architecture enable software and hardware implementations to co-exist, while a programmable interconnect enables dynamic interconnection of the tiles. Our process network oriented compilation flow achieves realization agnostic application partitioning and enables seamless migration across uniprocessor, multi-processor, semi hardware and full hardware implementations of a video decoder. An application quality of service aware scheduler monitors and controls the operation of the entire system. We prove the concept through a prototype of the architecture on an off-the-shelf FPGA. The FPGA prototype shows a scaling in performance from QCIF to 1080p resolutions in four discrete steps. We also demonstrate that the reconfiguration time is short enough to allow migration from one configuration to the other without any frame loss.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This paper presents a Radix-4(3) based FFT architecture suitable for OFDM based WLAN applications. The radix-4(3) parallel unrolled architecture presented here, uses a radix-4 butterfly unit which takes all four inputs in parallel and can selectively produce one out of the four outputs. A 64 point FFT processor based on the proposed architecture has been implemented in UMC 130nm 1P8M CMOS process with a maximum clock frequency of 100 MHz and area of 0.83mm(2). The proposed processor provides a throughput of four times the clock rate and can finish one 64 point FFT computation in 16 clock cycles. For IEEE 802.11a/g WLAN, the processor needs to be operated at a clock rate of 5 MHz with a power consumption of 2.27 mW which is 27% less than the previously reported low power implementations.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

In this paper we propose a fully parallel 64K point radix-4(4) FFT processor. The radix-4(4) parallel unrolled architecture uses a novel radix-4 butterfly unit which takes all four inputs in parallel and can selectively produce one out of the four outputs. The radix-4(4) block can take all 256 inputs in parallel and can use the select control signals to generate one out of the 256 outputs. The resultant 64K point FFT processor shows significant reduction in intermediate memory but with increased hardware complexity. Compared to the state-of-art implementation 5], our architecture shows reduced latency with comparable throughput and area. The 64K point FFT architecture was synthesized using a 130nm CMOS technology which resulted in a throughput of 1.4 GSPS and latency of 47.7 mu s with a maximum clock frequency of 350MHz. When compared to 5], the latency is reduced by 303 mu s with 50.8% reduction in area.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

We propose a low latency optical data center top of rack switch using recirculation buffering and a hybrid MZ/SOA switch architecture to reduce the network power dissipated on future optically connected server chips by 53%. © OSA 2014.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This paper presents a novel architecture of vision chip for fast traffic lane detection (FTLD). The architecture consists of a 32*32 SIMD processing element (PE) array processor and a dual-core RISC processor. The PE array processor performs low-level pixel-parallel image processing at high speed and outputs image features for high-level image processing without I/O bottleneck. The dual-core processor carries out high-level image processing. A parallel fast lane detection algorithm for this architecture is developed. The FPGA system with a CMOS image sensor is used to implement the architecture. Experiment results show that the system can perform the fast traffic lane detection at 50fps rate. It is much faster than previous works and has good robustness that can operate in various intensity of light. The novel architecture of vision chip is able to meet the demand of real-time lane departure warning system.