985 results for Software architectures


Relevance:

30.00% 30.00%

Publisher:

Abstract:

Sphere Decoding (SD) is a highly effective detection technique for Multiple-Input Multiple-Output (MIMO) wireless communications receivers, offering quasi-optimal accuracy with relatively low computational complexity compared to the ideal ML detector. Despite this, the computational demands of even low-complexity SD variants, such as Fixed Complexity SD (FSD), remain such that implementation on modern software-defined network equipment is highly challenging, and indeed real-time solutions for MIMO systems such as 4 × 4 16-QAM 802.11n are unreported. This paper overcomes this barrier. By exploiting large-scale networks of fine-grained software-programmable processors on Field Programmable Gate Array (FPGA), a series of unique SD implementations are presented, culminating in the only single-chip, real-time quasi-optimal SD for 4 × 4 16-QAM 802.11n MIMO. Furthermore, it demonstrates that the high-performance software-defined architectures which enable these implementations exhibit cost comparable to dedicated circuit architectures.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

With the rapid expansion of the internet and the increasing demand on Web servers, many techniques have been developed to overcome the servers' hardware performance limitations. Mirrored Web Servers is one such technique, in which a number of servers carrying the same "mirrored" set of services are deployed. Client access requests are then distributed over the set of mirrored servers to even out the load. In this paper we present a generic reference software architecture for load balancing over mirrored web servers. The architecture was designed adopting the latest NaSr architectural style [1] and described using the ADLARS [2] architecture description language. With minimal effort, different tailored product architectures can be generated from the reference architecture to serve different network protocols and server operating systems. An example product system is described and a sample Java implementation is presented.
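The idea of distributing client requests over mirrored servers can be illustrated with a minimal sketch. It is not the architecture or the Java implementation described in the abstract; the class name `MirroredPool`, its methods, and the least-open-requests policy are assumptions chosen for illustration only.

```python
class MirroredPool:
    """Hypothetical dispatcher over servers that all carry the same mirrored services."""

    def __init__(self, servers):
        if not servers:
            raise ValueError("at least one mirrored server is required")
        # Number of currently open requests per server (assumed load metric).
        self.active = {name: 0 for name in servers}

    def acquire(self):
        """Pick the mirror with the fewest open requests (ties go to the first listed)."""
        name = min(self.active, key=self.active.get)
        self.active[name] += 1
        return name

    def release(self, name):
        """Mark one request on `name` as finished."""
        if self.active[name] == 0:
            raise ValueError(f"no open request on {name}")
        self.active[name] -= 1
```

Because every mirror offers the same services, the dispatcher needs no knowledge of the request content; any server is a valid target, so only the load metric drives the choice.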

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Performance evaluation of parallel software and architectural exploration of innovative hardware support face a common challenge with emerging manycore platforms: they are limited by the slow running time and the low accuracy of software simulators. Manycore FPGA prototypes are difficult to build, but they offer great rewards. Software running on such prototypes runs orders of magnitude faster than on current simulators. Moreover, researchers gain significant architectural insight during the modeling process. We use the Formic FPGA prototyping board [1], which specifically targets scalable and cost-efficient multi-board prototyping, to build and test a 64-board model of a 512-core, MicroBlaze-based, non-coherent hardware prototype with a full network-on-chip in a 3D-mesh topology. We expand the hardware architecture to include the ARM Versatile Express platforms and build a 520-core heterogeneous prototype of 8 Cortex-A9 cores and 512 MicroBlaze cores. We then develop an MPI library for the prototype and evaluate it extensively using several bare-metal and MPI benchmarks. We find that our processor prototype is highly scalable, faithfully models single-chip multicore architectures, and is a very efficient platform for parallel programming research, being 50,000 times faster than software simulation.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

This paper introduces hybrid address spaces as a fundamental design methodology for implementing scalable runtime systems on many-core architectures without hardware support for cache coherence. We use hybrid address spaces for an implementation of MapReduce, a programming model for large-scale data processing, and for the implementation of a remote memory access (RMA) model. Both implementations are available on the Intel SCC and are portable to similar architectures. We present the design and implementation of HyMR, a MapReduce runtime system in which different stages and the synchronization operations between them alternate between a distributed memory address space and a shared memory address space, to improve performance and scalability. We compare HyMR to a reference implementation and find that HyMR improves performance by a factor of 1.71× over a set of representative MapReduce benchmarks. We also compare HyMR with Phoenix++, a state-of-the-art implementation for systems with hardware-managed cache coherence, in terms of scalability and sustained-to-peak data processing bandwidth, where HyMR demonstrates improvements of a factor of 3.1× and 3.2× respectively. We further evaluate our hybrid remote memory access (HyRMA) programming model and assess its performance to be superior to that of message passing.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Software Product-Line Engineering has emerged in recent years as an important strategy for maximising reuse within the context of a family of related products. In current approaches to software product-lines, there is general agreement that the definition of a reference architecture for the product-line is an important step in the software engineering process. In this paper we introduce ADLARS, a new form of Architecture Description Language that places emphasis on the capture of architectural relationships. ADLARS is designed for use within a product-line engineering process. The language supports the definition of both architectural structure and important architectural relationships. In particular, it supports capture of the relationships between product features, component and task architectures, interfaces and parameter requirements.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

As the complexity of computing systems grows, reliability and energy are two crucial challenges that call for holistic solutions. In this paper, we investigate the interplay among concurrency, power dissipation, energy consumption and voltage-frequency scaling for a key numerical kernel for the solution of sparse linear systems. Concretely, we leverage a task-parallel implementation of the Conjugate Gradient method, equipped with a state-of-the-art preconditioner embedded in the ILUPACK software, and target a low-power multi-core processor from ARM. In addition, we perform a theoretical analysis of the impact of a technique like Near Threshold Voltage Computing (NTVC) from the points of view of increased hardware concurrency and error rate.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Large integer multiplication is a major performance bottleneck in fully homomorphic encryption (FHE) schemes over the integers. In this paper two optimised multiplier architectures for large integer multiplication are proposed. The first of these is a low-latency hardware architecture of an integer-FFT multiplier. Secondly, the use of low Hamming weight (LHW) parameters is applied to create a novel hardware architecture for large integer multiplication in integer-based FHE schemes. The proposed architectures are implemented, verified and compared on the Xilinx Virtex-7 FPGA platform. Finally, the proposed implementations are employed to evaluate the large multiplication in the encryption step of FHE over the integers. The analysis shows a speed improvement factor of up to 26.2 for the low-latency design compared to the corresponding original integer-based FHE software implementation. When the proposed LHW architecture is combined with the low-latency integer-FFT accelerator to evaluate a single FHE encryption operation, the performance results show that a speed improvement by a factor of approximately 130 is possible.
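The arithmetic reason a low Hamming weight (LHW) operand helps can be shown with a short software sketch. This is only an illustration of the underlying identity, not the hardware architecture proposed in the abstract; the function names `lhw_multiply` and `bit_positions` are hypothetical. When one operand has only a few 1 bits, the product is the sum of that many shifted copies of the other operand.

```python
def bit_positions(n):
    """Return the positions of the 1 bits in the non-negative integer n."""
    return [i for i in range(n.bit_length()) if (n >> i) & 1]


def lhw_multiply(x, set_bits):
    """Multiply x by the integer whose binary form has 1s exactly at `set_bits`."""
    product = 0
    for position in set_bits:
        product += x << position  # one shifted copy of x per set bit
    return product


# Usage: a 2001-bit operand with Hamming weight 3 needs only three shift-add steps.
y = (1 << 2000) + (1 << 37) + 1
x = 3 ** 500
assert lhw_multiply(x, bit_positions(y)) == x * y
```

The number of additions depends on the Hamming weight of the operand rather than on its bit length, which is why a sparse operand makes a very large multiplication cheap.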

Relevance:

30.00% 30.00%

Publisher:

Abstract:

This doctoral dissertation falls within the field of nonlinear radio-frequency (RF), UHF and microwave electronics, and its main focus is the study of nonlinear distortion in radio receiver architectures, namely direct-conversion receivers such as power meters, RFID (Radio Frequency IDentification) or SDR (Software Defined Radio) front-ends. Starting from an exhaustive study of current radio-frequency receiver architectures and reviewing all the theoretical concepts related to the nonlinear performance of electronic systems and components, mathematical algorithms for modelling the nonlinear behaviour of these architectures were developed, simulated and tested in the laboratory, and new architectures were proposed to minimise or cancel the negative impact of large interferers at frequencies neighbouring that of the desired system.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Flexible radio transmitters based on the Software-Defined Radio (SDR) concept are gaining increased research importance due to the unparalleled proliferation of new wireless standards operating at different frequencies, using dissimilar coding and modulation schemes, and targeted at different ends. In this new wireless communications paradigm, the physical layer of the radio transmitter must be able to support the simultaneous transmission of multi-band, multi-rate, multi-standard signals, which in practice is very hard or very inefficient to implement using conventional approaches. Nevertheless, the latest developments in this field include novel all-digital transmitter architectures where the radio datapath is digital from the baseband up to the RF stage. Such a concept has inherently high flexibility and represents an important step towards the development of SDR-based transmitters. However, implementing such a radio for a real-world communications scenario is a challenging task, and a few key limitations are still preventing wider adoption of this concept. This thesis aims to address some of these limitations by proposing and implementing innovative all-digital transmitter architectures with inherently higher flexibility and integration, while also improving important figures of merit such as coding efficiency, signal-to-noise ratio, usable bandwidth, and in-band and out-of-band noise. In the first part of this thesis, the concept of transmitting RF data using an entirely digital approach based on pulsed modulation is introduced. A comparison between several implementation technologies is also presented, showing that FPGAs provide an interesting compromise between performance, power efficiency and flexibility, which makes them an attractive enabling technology for pulse-based all-digital transmitters.
Following this discussion, the fundamental concepts inherent to pulsed modulators, their key advantages, main limitations and typical enhancements suitable for all-digital transmitters are presented. The recent advances regarding the two most common classes of pulse-modulated transmitters, namely the RF-level and the baseband-level, are introduced, along with several examples of state-of-the-art architectures found in the literature. The core of this dissertation, containing the main developments achieved during this PhD work, is then presented and discussed. The first key contribution to the state of the art presented here is the development of a novel ΣΔ-based all-digital transmitter architecture capable of multi-band and multi-standard data transmission in a very flexible and integrated way, where the pulsed RF output operating in the microwave frequency range is generated inside a single FPGA device. A fundamental contribution regarding the simultaneous transmission of multiple RF signals is then introduced by presenting and describing novel all-digital transmitter architectures that take advantage of the multi-gigabit data serializers available on current high-end FPGAs in order to transmit multiple independent RF carriers in a time-interleaved approach. Further improvements in this design approach made it possible to provide a two-stage up-conversion transmitter architecture enabling the fine frequency tuning of concurrent multichannel multi-standard signals. Finally, two key limitations inherent to current all-digital transmitter approaches are addressed, namely the poor coding efficiency and the combined high quality factor and tunability requirements of the RF output filter.
The design approach followed, based on polyphase multipath circuits, made it possible to create a new FPGA-embedded agile transmitter architecture that significantly improves important figures of merit, such as coding efficiency and SNR, while maintaining the high flexibility required to support multichannel multimode data transmission.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

The main motivation for the work presented here came from previously conducted experiments with a programming concept that was at the time named "Macro". These experiments led to the conviction that it would be possible to build an engine control system from scratch, which could eliminate many of the current problems of engine management systems in a direct and intrinsic way. It was also hoped that it would minimize the full range of software and hardware needed to make a final and fully functional system. Initially, this paper proposes a comprehensive survey of the state of the art in the specific area of software and corresponding hardware of automotive tools and automotive ECUs. Problems arising from such software will be identified, and it will be clear that practically all of these problems stem directly or indirectly from the fact that we continue to make comprehensive use of extremely long and complex "tool chains". Similarly, in the hardware, it will be argued that the problems stem from the extreme complexity and inter-dependency inside processor architectures. The conclusions are presented through an extensive list of "pitfalls", which will be thoroughly enumerated, identified and characterized. Solutions will also be proposed for the various current issues, along with ways of implementing them. All this final work will be part of a "proof-of-concept" system called "ECU2010". The central element of this system is the aforementioned "Macro" concept, which is a graphical block representing one of the many operations required in an automotive system, with arithmetic, logic, filtering, integration and multiplexing functions, among others. The end result of the proposed work is a single, fully integrated tool, enabling the development and management of the entire system in one simple visual interface.
Part of the presented result relies on a hardware platform fully adapted to the software, enabling high flexibility and scalability and using exactly the same technology for ECU, data logger and peripherals alike. Current systems follow a mostly evolutionary path, allowing only online calibration of parameters, but never the online alteration of their own automotive functionality algorithms. By contrast, the system developed and described in this thesis had the advantage of following a "clean-slate" approach, whereby everything could be rethought globally. In the end, out of all the system characteristics, "LIVE-Prototyping" is the most relevant feature, allowing the adjustment of automotive algorithms (e.g. injection, ignition, lambda control) 100% online, keeping the engine constantly working, without ever having to stop or reboot to make such changes. This consequently eliminates any "turnaround delay" typically present in current automotive systems, thereby enhancing the efficiency and handling of such systems.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

In this paper, an open-source solution for the measurement of temperature and ultrasonic signals (RF-lines) is proposed. This software is an alternative to expensive commercial data acquisition software, enabling the user to tune applications to particular acquisition architectures. The collected ultrasonic and temperature signals were used for non-invasive temperature estimation using neural networks. The existence of precise temperature estimators is essential for the safe and effective application of thermal therapies in humans. If such estimators exist, then effective controllers could be developed for the therapeutic instrumentation. In previous works, the time-shifts between RF-line echoes were extracted and used to create neural network estimators. The obtained estimators successfully represent the temperature in the time-space domain, achieving a maximum absolute error below the threshold value defined for hyperthermia/diathermia applications.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

The Algorithm and Architectures for Real-Time Control Workshop aimed to investigate the state of the art and to present new research and application results in software and hardware for real-time control, as well as to bring together engineers and computer scientists who are researchers, developers and practitioners, from both the academic and the industrial world.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

The recent trends in chip architectures, with higher numbers of heterogeneous cores and non-uniform memory/non-coherent caches, bring renewed attention to the use of Software Transactional Memory (STM) as a fundamental building block for developing parallel applications. Nevertheless, although STM promises to ease concurrent and parallel software development, it relies on the possibility of aborting conflicting transactions to maintain data consistency, which affects the responsiveness and timing guarantees required by embedded real-time systems. In these systems, contention delays must be (efficiently) limited so that the response times of tasks executing transactions are upper-bounded and task sets can be feasibly scheduled. In this paper we assess the use of STM in the development of embedded real-time software, arguing that the amount of contention can be reduced if read-only transactions access recent consistent data snapshots, progressing in a wait-free manner. We show how the required number of versions of a shared object can be calculated for a set of tasks. We also outline an algorithm to manage conflicts between update transactions that prevents starvation.
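The idea of read-only transactions reading a recent consistent snapshot can be sketched with a multi-version object. This is a single-threaded illustration under stated assumptions, not the paper's algorithm: the class `VersionedObject`, its method names, and the integer timestamps are hypothetical, and a real implementation would need atomic operations to be wait-free.

```python
from collections import deque


class VersionedObject:
    """Hypothetical shared object that keeps its last `max_versions` committed values."""

    def __init__(self, initial, max_versions):
        if max_versions < 1:
            raise ValueError("max_versions must be at least 1")
        # Each entry is (commit timestamp, value); the oldest is evicted when full.
        self.versions = deque([(0, initial)], maxlen=max_versions)

    def commit(self, timestamp, value):
        """Publish a new version written by an update transaction."""
        if timestamp <= self.versions[-1][0]:
            raise ValueError("commit timestamps must increase")
        self.versions.append((timestamp, value))

    def read_snapshot(self, start_time):
        """Return the newest value committed at or before `start_time`, without waiting."""
        for timestamp, value in reversed(self.versions):
            if timestamp <= start_time:
                return value
        raise LookupError("snapshot already discarded; more versions are needed")
```

A reader never waits for a writer here; it can only fail if its snapshot has already been evicted, which is why the number of retained versions has to be sized for the task set.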