Biblioteca Digital

947 resultados para FPGA, Elettronica digitale, Sintesi logica

Wavelet packet transforms for system-on-chip applications

Relevância:

10.00% 10.00%

Publicador:

Resumo:

A methodology for the production of silicon cores for wavelet packet decomposition has been developed. The scheme utilizes efficient scalable architectures for both orthonormal and biorthogonal wavelet transforms. The cores produced from these architectures can be readily scaled for any wavelet function and are easily configurable for any subband structure. The cores are fully parameterized in terms of wavelet choice and appropriate wordlengths. Designs produced are portable across a range of silicon foundries as well as FPGA and PLD technologies. A number of exemplar implementations have been produced.

Multi-Modal CTL:Completeness, Complexity, and an Application

Relevância:

10.00% 10.00%

Publicador:

Resumo:

We define a multi-modal version of Computation Tree Logic (ctl) by extending the language with path quantifiers E and A where d denotes one of finitely many dimensions, interpreted over Kripke structures with one total relation for each dimension. As expected, the logic is axiomatised by taking a copy of a ctl axiomatisation for each dimension. Completeness is proved by employing the completeness result for ctl to obtain a model along each dimension in turn. We also show that the logic is decidable and that its satisfiability problem is no harder than the corresponding problem for ctl. We then demonstrate how Normative Systems can be conceived as a natural interpretation of such a multi-dimensional ctl logic. © 2009 Springer Science+Business Media B.V.

Hardware Comparison of the ISO/IEC 29192-2 Block Ciphers

Relevância:

10.00% 10.00%

Publicador:

Resumo:

As ubiquitous computing becomes a reality, sensitive information is increasingly processed and transmitted by smart cards, mobile devices and various types of embedded systems. This has led to the requirement of a new class of lightweight cryptographic algorithm to ensure security in these resource constrained environments. The International Organization for Standardization (ISO) has recently standardised two low-cost block ciphers for this purpose, Clefia and Present. In this paper we provide the first comprehensive hardware architecture comparison between these ciphers, as well as a comparison with the current National Institute of Standards and Technology (NIST) standard, the Advanced Encryption Standard.

The impact of global routing on the performance of NoCs in FPGAs

Relevância:

10.00% 10.00%

Publicador:

Resumo:

With the over-provisioned routing resource on FPGA, the topology choice for NoC implementation on FPGA is more flexible than on ASIC. However, it is well understood that the global wire routing impacts the performance of NoC on FPGA because the topology is routed by using fixed routing fabric. An important question that arises is: will the benefit of diameter reduction by using a highly connective topology outweigh the impact of global routing? To answer this question, we investigate FPGA based packet switched NoC implementations with different sizes and topologies, and quantitatively measure the impact of global routing to each of these networks. The result shows that with sufficient routing resources on modern FPGA, the global routing is not on the critical path of the system, and thus is not a dominating factor for the performance of practical multi-hop NoC system. © 2011 IEEE.

Shortening design time through multiplatform simulations with a portable OpenCL golden-model: the LDPC decoder case

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Hardware designers and engineers typically need to explore a multi-parametric design space in order to find the best configuration for their designs using simulations that can take weeks to months to complete. For example, designers of special purpose chips need to explore parameters such as the optimal bitwidth and data representation. This is the case for the development of complex algorithms such as Low-Density Parity-Check (LDPC) decoders used in modern communication systems. Currently, high-performance computing offers a wide set of acceleration options, that range from multicore CPUs to graphics processing units (GPUs) and FPGAs. Depending on the simulation requirements, the ideal architecture to use can vary. In this paper we propose a new design flow based on OpenCL, a unified multiplatform programming model, which accelerates LDPC decoding simulations, thereby significantly reducing architectural exploration and design time. OpenCL-based parallel kernels are used without modifications or code tuning on multicore CPUs, GPUs and FPGAs. We use SOpenCL (Silicon to OpenCL), a tool that automatically converts OpenCL kernels to RTL for mapping the simulations into FPGAs. To the best of our knowledge, this is the first time that a single, unmodified OpenCL code is used to target those three different platforms. We show that, depending on the design parameters to be explored in the simulation, on the dimension and phase of the design, the GPU or the FPGA may suit different purposes more conveniently, providing different acceleration factors. For example, although simulations can typically execute more than 3x faster on FPGAs than on GPUs, the overhead of circuit synthesis often outweighs the benefits of FPGA-accelerated execution.

High-Speed Fully Homomorphic Encryption Over the Integers

Relevância:

10.00% 10.00%

Publicador:

Resumo:

A fully homomorphic encryption (FHE) scheme is envisioned as a key cryptographic tool in building a secure and reliable cloud computing environment, as it allows arbitrary evaluation of a ciphertext without revealing the plaintext. However, existing FHE implementations remain impractical due to very high time and resource costs. To the authors’ knowledge, this paper presents the first hardware implementation of a full encryption primitive for FHE over the integers using FPGA technology. A large-integer multiplier architecture utilising Integer-FFT multiplication is proposed, and a large-integer Barrett modular reduction module is designed incorporating the proposed multiplier. The encryption primitive used in the integer-based FHE scheme is designed employing the proposed multiplier and modular reduction modules. The designs are verified using the Xilinx Virtex-7 FPGA platform. Experimental results show that a speed improvement factor of up to 44 is achievable for the hardware implementation of the FHE encryption scheme when compared to its corresponding software implementation. Moreover, performance analysis shows further speed improvements of the integer-based FHE encryption primitives may still be possible, for example through further optimisations or by targeting an ASIC platform.

High-radix systolic modular multiplication on reconfigurable hardware

Relevância:

10.00% 10.00%

Publicador:

Resumo:

The overall aim of the work presented in this paper has been to develop Montgomery modular multiplication architectures suitable for implementation on modern reconfigurable hardware. Accordingly, novel high-radix systolic array Montgomery multiplier designs are presented, as we believe that the inherent regular structure and absence of global interconnect associated with these, make them well-suited for implementation on modern FPGAs. Unlike previous approaches, each processing element (PE) comprises both an adder and a multiplier. The inclusion of a multiplier in the PE means that the need to pre-compute or store any multiples of the operands is avoided. This also allows very high-radix implementations to be realised, further reducing the amount of clock cycles per modular multiplication, while still maintaining a competitive critical delay. For demonstrative purposes, 512-bit and 1024-bit FPGA implementations using radices of 2(8) and 2(16) are presented. The subsequent throughput rates are the fastest reported to date.

Rapid design of discrete orthonormal wavelet transforms using silicon IP components

Relevância:

10.00% 10.00%

Publicador:

Resumo:

A rapid design methodology for orthonormal wavelet transform cores has been developed. This methodology is based on a generic, scaleable architecture utilising time-interleaved coefficients for the wavelet transform filters. The architecture has been captured in VHDL and parameterised in terms of wavelet family, wavelet type, data word length and coefficient word length. The control circuit is embedded within the cores and allows them to be cascaded without any interface glue logic for any desired level of decomposition. Case studies for stand alone and cascaded silicon cores for single and multi-stage wavelet analysis respectively are reported. The design time to produce silicon layout of a wavelet based system has been reduced to typically less than a day. The cores are comparable in area and performance to handcrafted designs. The designs are portable across a range of foundries and are also applicable to FPGA and PLD implementations.

A Parameterisable Motion Estimation Core

Relevância:

10.00% 10.00%

Publicador:

A Hardware Acceleration Scheme for Memory-Efficient Flow Processing

Relevância:

10.00% 10.00%

Publicador:

Resumo:

This paper presents a hardware solution for network flow processing at full line rate. Advanced memory architecture using DDR3 SDRAMs is proposed to cope with the flow match limitations in packet throughput, number of supported flows and number of packet header fields (or tuples) supported for flow identifications. The described architecture has been prototyped for accommodating 8 million flows, and tested on an FPGA platform achieving a minimum of 70 million lookups per second. This is sufficient to process internet traffic flows at 40 Gigabit Ethernet.

Accelerating integer-based fully homomorphic encryption using Comba multiplication

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Fully Homomorphic Encryption (FHE) is a recently developed cryptographic technique which allows computations on encrypted data. There are many interesting applications for this encryption method, especially within cloud computing. However, the computational complexity is such that it is not yet practical for real-time applications. This work proposes optimised hardware architectures of the encryption step of an integer-based FHE scheme with the aim of improving its practicality. A low-area design and a high-speed parallel design are proposed and implemented on a Xilinx Virtex-7 FPGA, targeting the available DSP slices, which offer high-speed multiplication and accumulation. Both use the Comba multiplication scheduling method to manage the large multiplications required with uneven sized multiplicands and to minimise the number of read and write operations to RAM. Results show that speed up factors of 3.6 and 10.4 can be achieved for the encryption step with medium-sized security parameters for the low-area and parallel designs respectively, compared to the benchmark software implementation on an Intel Core2 Duo E8400 platform running at 3 GHz.

Enhancing design space exploration by extending CPU/GPU specifications onto FPGAs

Relevância:

10.00% 10.00%

Publicador:

Resumo:

The design cycle for complex special-purpose computing systems is extremely costly and time-consuming. It involves a multiparametric design space exploration for optimization, followed by design verification. Designers of special purpose VLSI implementations often need to explore parameters, such as optimal bitwidth and data representation, through time-consuming Monte Carlo simulations. A prominent example of this simulation-based exploration process is the design of decoders for error correcting systems, such as the Low-Density Parity-Check (LDPC) codes adopted by modern communication standards, which involves thousands of Monte Carlo runs for each design point. Currently, high-performance computing offers a wide set of acceleration options that range from multicore CPUs to Graphics Processing Units (GPUs) and Field Programmable Gate Arrays (FPGAs). The exploitation of diverse target architectures is typically associated with developing multiple code versions, often using distinct programming paradigms. In this context, we evaluate the concept of retargeting a single OpenCL program to multiple platforms, thereby significantly reducing design time. A single OpenCL-based parallel kernel is used without modifications or code tuning on multicore CPUs, GPUs, and FPGAs. We use SOpenCL (Silicon to OpenCL), a tool that automatically converts OpenCL kernels to RTL in order to introduce FPGAs as a potential platform to efficiently execute simulations coded in OpenCL. We use LDPC decoding simulations as a case study. Experimental results were obtained by testing a variety of regular and irregular LDPC codes that range from short/medium (e.g., 8,000 bit) to long length (e.g., 64,800 bit) DVB-S2 codes. We observe that, depending on the design parameters to be simulated, on the dimension and phase of the design, the GPU or FPGA may suit different purposes more conveniently, thus providing different acceleration factors over conventional multicore CPUs.

Improving RO PUF design using frequency distribution characteristics

Relevância:

10.00% 10.00%

Publicador:

Resumo:

A Physical Unclonable Function (PUF) can be used to provide authentication of devices by producing die-unique responses. In PUFs based on ring oscillators (ROs) the responses are derived from the oscillation frequencies of the ROs. However, RO PUFs can be vulnerable to attack due to the frequency distribution characteristics of the RO arrays. In this letter, in order to improve the design of RO PUFs for FPGA devices, the frequencies of RO arrays implemented on a large number of FPGA chips are statistically analyzed. Three RO frequency distribution (ROFD) characteristics, which can be used to improve the design of RO PUFs are observed and discussed.

Very high speed 17 Gbps SHACAL encryption architecture

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Very high speed and low area hardware architectures of the SHACAL-1 encryption algorithm are presented in this paper. The SHACAL algorithm was a submission to the New European Schemes for Signatures, Integrity and Encryption (NESSIE) project and it is based on the SHA-1 hash algorithm. To date, there have been no performance metrics published on hardware implementations of this algorithm. A fully pipelined SHACAL-1 encryption architecture is described in this paper and when implemented on a Virtex-II X2V4000 FPGA device, it runs at a throughput of 17 Gbps. A fully pipelined decryption architecture achieves a speed of 13 Gbps when implemented on the same device. In addition, iterative architectures of the algorithm are presented. The SHACAL-1 decryption algorithm is derived and also presented in this paper, since it was not provided in the submission to NESSIE. © Springer-Verlag Berlin Heidelberg 2003.

RO PUF Design in FPGAs with new comparison strategies

Relevância:

10.00% 10.00%

Publicador:

Resumo:

A Physical Unclonable Function (PUF) can be used to provide authentication of devices by producing die-unique responses. In PUFs based on ring oscillators (ROs), the responses are derived from the oscillation frequencies of the ROs. However, RO PUFs can be vulnerable to attack due to the frequency distribution characteristics of the RO arrays. In this paper, in order to improve the design of RO PUFs for FPGA devices, the frequencies of RO arrays implemented on a large number of FPGA chips are statistically analyzed. Three RO frequency distribution (ROFD) characteristics are observed and discussed. Based on these ROFD characteristics, two RO comparison strategies are proposed that can be used to improve the design of RO PUFs. It is found that the symmetrical RO comparison strategy has the highest entropy density.

«
1
2
...
55
56
57
58
59
60
61
...
63
64
»