782 resultados para FPGA, VHDL, Picoblaze, SERDES
Resumo:
Very high speed and low area hardware architectures of the SHACAL-1 encryption algorithm are presented in this paper. The SHACAL algorithm was a submission to the New European Schemes for Signatures, Integrity and Encryption (NESSIE) project and it is based on the SHA-1 hash algorithm. To date, there have been no performance metrics published on hardware implementations of this algorithm. A fully pipelined SHACAL-1 encryption architecture is described in this paper and when implemented on a Virtex-II X2V4000 FPGA device, it runs at a throughput of 17 Gbps. A fully pipelined decryption architecture achieves a speed of 13 Gbps when implemented on the same device. In addition, iterative architectures of the algorithm are presented. The SHACAL-1 decryption algorithm is derived and also presented in this paper, since it was not provided in the submission to NESSIE. © Springer-Verlag Berlin Heidelberg 2003.
Resumo:
A Physical Unclonable Function (PUF) can be used to provide authentication of devices by producing die-unique responses. In PUFs based on ring oscillators (ROs), the responses are derived from the oscillation frequencies of the ROs. However, RO PUFs can be vulnerable to attack due to the frequency distribution characteristics of the RO arrays. In this paper, in order to improve the design of RO PUFs for FPGA devices, the frequencies of RO arrays implemented on a large number of FPGA chips are statistically analyzed. Three RO frequency distribution (ROFD) characteristics are observed and discussed. Based on these ROFD characteristics, two RO comparison strategies are proposed that can be used to improve the design of RO PUFs. It is found that the symmetrical RO comparison strategy has the highest entropy density.
Resumo:
We describe a pre-processing correlation attack on an FPGA implementation of AES, protected with a random clocking countermeasure that exhibits complex variations in both the location and amplitude of the power consumption patterns of the AES rounds. It is demonstrated that the merged round patterns can be pre-processed to identify and extract the individual round amplitudes, enabling a successful power analysis attack. We show that the requirement of the random clocking countermeasure to provide a varying execution time between processing rounds can be exploited to select a sub-set of data where sufficient current decay has occurred, further improving the attack. In comparison with the countermeasure's estimated security of 3 million traces from an integration attack, we show that through application of our proposed techniques that the countermeasure can now be broken with as few as 13k traces.
Resumo:
Large integer multiplication is a major performance bottleneck in fully homomorphic encryption (FHE) schemes over the integers. In this paper two optimised multiplier architectures for large integer multiplication are proposed. The first of these is a low-latency hardware architecture of an integer-FFT multiplier. Secondly, the use of low Hamming weight (LHW) parameters is applied to create a novel hardware architecture for large integer multiplication in integer-based FHE schemes. The proposed architectures are implemented, verified and compared on the Xilinx Virtex-7 FPGA platform. Finally, the proposed implementations are employed to evaluate the large multiplication in the encryption step of FHE over the integers. The analysis shows a speed improvement factor of up to 26.2 for the low-latency design compared to the corresponding original integer-based FHE software implementation. When the proposed LHW architecture is combined with the low-latency integer-FFT accelerator to evaluate a single FHE encryption operation, the performance results show that a speed improvement by a factor of approximately 130 is possible.
Resumo:
WHIRLBOB, also known as STRIBOBr2, is an AEAD (Authenticated Encryption with Associated Data) algorithm derived from STRIBOBr1 and the Whirlpool hash algorithm. WHIRLBOB/STRIBOBr2 is a second round candidate in the CAESAR competition. As with STRIBOBr1, the reduced-size Sponge design has a strong provable security link with a standardized hash algorithm. The new design utilizes only the LPS or ρ component of Whirlpool in flexibly domain-separated BLNK Sponge mode. The number of rounds is increased from 10 to 12 as a countermeasure against Rebound Distinguishing attacks. The 8 ×8 - bit S-Box used by Whirlpool and WHIRLBOB is constructed from 4 ×4 - bit “MiniBoxes”. We report on fast constant-time Intel SSSE3 and ARM NEON SIMD WHIRLBOB implementations that keep full miniboxes in registers and access them via SIMD shuffles. This is an efficient countermeasure against AES-style cache timing side-channel attacks. Another main advantage of WHIRLBOB over STRIBOBr1 (and most other AEADs) is its greatly reduced implementation footprint on lightweight platforms. On many lower-end microcontrollers the total software footprint of π+BLNK = WHIRLBOB AEAD is less than half a kilobyte. We also report an FPGA implementation that requires 4,946 logic units for a single round of WHIRLBOB, which compares favorably to 7,972 required for Keccak / Keyak on the same target platform. The relatively small S-Box gate count also enables efficient 64-bit bitsliced straight-line implementations. We finally present some discussion and analysis on the relationships between WHIRLBOB, Whirlpool, the Russian GOST Streebog hash, and the recent draft Russian Encryption Standard Kuznyechik.
Resumo:
The upcoming IEEE 802.11ac standard boosts the throughput of previous IEEE 802.11n by adding wider 80 MHz and 160 MHz channels with up to 8 antennas (versus 40 MHz channel and 4 antennas in 802.11n). This necessitates new 1-8 stream 256/512-point Fast Fourier Transform (FFT) / inverse FFT (IFFT) processing with 80/160 MSample/s throughput. Although there are abundant related work, they all fail to meet the requirements of IEEE 802.11ac FFT/IFFT on point size, throughput and multiple data streams at the same time. This paper proposes the first software defined FFT/IFFT architecture as a solution. By making use of a customised soft stream processor on FPGA, we show how a software defined FFT architecture can meet all the requirements of IEEE 802.11ac with low cost and high resource efficiency. When compared with dedicated Xilinx FFT core, our implementation exhibits only one third of the resources also up to three times of resource efficiency.
Resumo:
Software-programmable `soft' processors have shown tremendous potential for efficient realisation of high performance signal processing operations on Field Programmable Gate Array (FPGA), whilst lowering the design burden by avoiding the need to design fine-grained custom circuit archi-tectures. However, the complex data access patterns, high memory bandwidth and computational requirements of sliding window applications, such as Motion Estimation (ME) and Matrix Multiplication (MM), lead to low performance, inefficient soft processor realisations. This paper resolves this issue, showing how by adding support for block data addressing and accelerators for high performance loop execution, performance and resource efficiency over four times better than current best-in-class metrics can be achieved. In addition, it demonstrates the first recorded real-time soft ME estimation realisation for H.263 systems.
Resumo:
Pre-processing (PP) of received symbol vector and channel matrices is an essential pre-requisite operation for Sphere Decoder (SD)-based detection of Multiple-Input Multiple-Output (MIMO) wireless systems. PP is a highly complex operation, but relative to the total SD workload it represents a relatively small fraction of the overall computational cost of detecting an OFDM MIMO frame in standards such as 802.11n. Despite this, real-time PP architectures are highly inefficient, dominating the resource cost of real-time SD architectures. This paper resolves this issue. By reorganising the ordering and QR decomposition sub operations of PP, we describe a Field Programmable Gate Array (FPGA)-based PP architecture for the Fixed Complexity Sphere Decoder (FSD) applied to 4 × 4 802.11n MIMO which reduces resource cost by 50% as compared to state-of-the-art solutions whilst maintaining real-time performance.
Resumo:
Field programmable gate array (FPGA) technology is a powerful platform for implementing computationally complex, digital signal processing (DSP) systems. Applications that are multi-modal, however, are designed for worse case conditions. In this paper, genetic sequencing techniques are applied to give a more sophisticated decomposition of the algorithmic variations, thus allowing an unified hardware architecture which gives a 10-25% area saving and 15% power saving for a digital radar receiver.
Resumo:
Lattice-based cryptography has gained credence recently as a replacement for current public-key cryptosystems, due to its quantum-resilience, versatility, and relatively low key sizes. To date, encryption based on the learning with errors (LWE) problem has only been investigated from an ideal lattice standpoint, due to its computation and size efficiencies. However, a thorough investigation of standard lattices in practice has yet to be considered. Standard lattices may be preferred to ideal lattices due to their stronger security assumptions and less restrictive parameter selection process. In this paper, an area-optimised hardware architecture of a standard lattice-based cryptographic scheme is proposed. The design is implemented on a FPGA and it is found that both encryption and decryption fit comfortably on a Spartan-6 FPGA. This is the first hardware architecture for standard lattice-based cryptography reported in the literature to date, and thus is a benchmark for future implementations.
Additionally, a revised discrete Gaussian sampler is proposed which is the fastest of its type to date, and also is the first to investigate the cost savings of implementing with lamda_2-bits of precision. Performance results are promising in comparison to the hardware designs of the equivalent ring-LWE scheme, which in addition to providing a stronger security proof; generate 1272 encryptions per second and 4395 decryptions per second.
Resumo:
Power capping is a fundamental method for reducing the energy consumption of a wide range of modern computing environments, ranging from mobile embedded systems to datacentres. Unfortunately, maximising performance and system efficiency under static power caps remains challenging, while maximising performance under dynamic power caps has been largely unexplored. We present an adaptive power capping method that reduces the power consumption and maximizes the performance of heterogeneous SoCs for mobile and server platforms. Our technique combines power capping with coordinated DVFS, data partitioning and core allocations on a heterogeneous SoC with ARM processors and FPGA resources. We design our framework as a run-time system based on OpenMP and OpenCL to utilise the heterogeneous resources. We evaluate it through five data-parallel benchmarks on the Xilinx SoC which allows fully voltage and frequency control. Our experiments show a significant performance boost of 30% under dynamic power caps with concurrent execution on ARM and FPGA, compared to a naive separate approach.
Resumo:
Cryptographic algorithms have been designed to be computationally secure, however it has been shown that when they are implemented in hardware, that these devices leak side channel information that can be used to mount an attack that recovers the secret encryption key. In this paper an overlapping window power spectral density (PSD) side channel attack, targeting an FPGA device running the Advanced Encryption Standard is proposed. This improves upon previous research into PSD attacks by reducing the amount of pre-processing (effort) required. It is shown that the proposed overlapping window method requires less processing effort than that of using a sliding window approach, whilst overcoming the issues of sampling boundaries. The method is shown to be effective for both aligned and misaligned data sets and is therefore recommended as an improved approach in comparison with existing time domain based correlation attacks.
Resumo:
Esta tese apresenta um estudo exploratório sobre sistemas de comunicação por luz visível e as suas aplicações em sistemas de transporte inteligentes como forma a melhorar a segurança nas estradas. Foram desenvolvidos neste trabalho, modelos conceptuais e analíticos adequados à caracterização deste tipo de sistemas. Foi desenvolvido um protótipo de baixo custo, capaz de suportar a disseminação de informação utilizando semáforos. A sua realização carece de um estudo detalhado, nomeadamente: i) foi necessário obter modelos capazes de descrever os padrões de radiação numa área de serviço pré-definida; ii) foi necessário caracterizar o meio de comunicações; iii) foi necessário estudar o comportamento de vários esquemas de modulação de forma a optar pelo mais robusto; finalmente, iv) obter a implementação do sistema baseado em FPGA e componentes discretos. O protótipo implementado foi testado em condições reais. Os resultados alcançados mostram os méritos desta solução, chegando mesmo a encorajar a utilização desta tecnologia em outros cenários de aplicação.
Resumo:
Flexible radio transmitters based on the Software-Defined Radio (SDR) concept are gaining an increased research importance due to the unparalleled proliferation of new wireless standards operating at different frequencies, using dissimilar coding and modulation schemes, and targeted for different ends. In this new wireless communications paradigm, the physical layer of the radio transmitter must be able to support the simultaneous transmission of multi-band, multi-rate, multi-standard signals, which in practice is very hard or very inefficient to implement using conventional approaches. Nevertheless, the last developments in this field include novel all-digital transmitter architectures where the radio datapath is digital from the baseband up to the RF stage. Such concept has inherent high flexibility and poses an important step towards the development of SDR-based transmitters. However, the truth is that implementing such radio for a real world communications scenario is a challenging task, where a few key limitations are still preventing a wider adoption of this concept. This thesis aims exactly to address some of these limitations by proposing and implementing innovative all-digital transmitter architectures with inherent higher flexibility and integration, and where improving important figures of merit, such as coding efficiency, signal-to-noise ratio, usable bandwidth and in-band and out-of-band noise will also be addressed. In the first part of this thesis, the concept of transmitting RF data using an entirely digital approach based on pulsed modulation is introduced. A comparison between several implementation technologies is also presented, allowing to state that FPGAs provide an interesting compromise between performance, power efficiency and flexibility, thus making them an interesting choice as an enabling technology for pulse-based all-digital transmitters. Following this discussion, the fundamental concepts inherent to pulsed modulators, its key advantages, main limitations and typical enhancements suitable for all-digital transmitters are also presented. The recent advances regarding the two most common classes of pulse modulated transmitters, namely the RF and the baseband-level are introduced, along with several examples of state-of-the-art architectures found on the literature. The core of this dissertation containing the main developments achieved during this PhD work is then presented and discussed. The first key contribution to the state-of-the-art presented here consists in the development of a novel ΣΔ-based all-digital transmitter architecture capable of multiband and multi-standard data transmission in a very flexible and integrated way, where the pulsed RF output operating in the microwave frequency range is generated inside a single FPGA device. A fundamental contribution regarding the simultaneous transmission of multiple RF signals is then introduced by presenting and describing novel all-digital transmitter architectures that take advantage of multi-gigabit data serializers available on current high-end FPGAs in order to transmit in a time-interleaved approach multiple independent RF carriers. Further improvements in this design approach allowed to provide a two-stage up-conversion transmitter architecture enabling the fine frequency tuning of concurrent multichannel multi-standard signals. Finally, further improvements regarding two key limitations inherent to current all-digital transmitter approaches are then addressed, namely the poor coding efficiency and the combined high quality factor and tunability requirements of the RF output filter. The followed design approach based on poliphase multipath circuits allowed to create a new FPGA-embedded agile transmitter architecture that significantly improves important figures of merit, such as coding efficiency and SNR, while maintains the high flexibility that is required for supporting multichannel multimode data transmission.
Resumo:
In this paper digital part of a self-calibrating quadrature-receiver is described, containing a digital calibration-engine. The blind source-separation-based calibration-engine eliminates the RF-impairments in real-time hence improving the receiver's performance without the need for test/pilot tones, trimming or use of power-hungry discrete components. Furthermore, an efficient time-multiplexed calibration-engine architecture is proposed and implemented on an FPGA utilising a reduced-range multiplier structure. The use of reduced-range multipliers results in substantial reduction of area as well as power consumption without a compromise in performance when compared with an efficiently designed general purpose multiplier. The performance of the calibration-engine does not depend on the modulation format or the constellation size of the received signal; hence it can be easily integrated into the digital signal processing paths of any receiver.