5 resultados para Optimisations
Resumo:
Parallelizing compilers have difficulty analysing and optimising complex code. To address this, some analysis may be delayed until run-time, and techniques such as speculative execution used. Furthermore, to enhance performance, a feedback loop may be setup between the compile time and run-time analysis systems, as in iterative compilation. To extend this, it is proposed that the run-time analysis collects information about the values of variables not already determined, and estimates a probability measure for the sampled values. These measures may be used to guide optimisations in further analyses of the program. To address the problem of variables with measures as values, this paper also presents an outline of a novel combination of previous probabilistic denotational semantics models, applied to a simple imperative language.
Resumo:
Randomising set index functions can reduce the number of conflict misses in data caches by spreading the cache blocks uniformly over all sets. Typically, the randomisation functions compute the exclusive ors of several address bits. Not all randomising set index functions perform equally well, which calls for the evaluation of many set index functions. This paper discusses and improves a technique that tackles this problem by predicting the miss rate incurred by a randomisation function, based on profiling information. A new way of looking at randomisation functions is used, namely the null space of the randomisation function. The members of the null space describe pairs of cache blocks that are mapped to the same set. This paper presents an analytical model of the error made by the technique and uses this to propose several optimisations to the technique. The technique is then applied to generate a conflict-free randomisation function for the SPEC benchmarks. (C) 2003 Elsevier Science B.V. All rights reserved.
Resumo:
A fully homomorphic encryption (FHE) scheme is envisioned as a key cryptographic tool in building a secure and reliable cloud computing environment, as it allows arbitrary evaluation of a ciphertext without revealing the plaintext. However, existing FHE implementations remain impractical due to very high time and resource costs. To the authors’ knowledge, this paper presents the first hardware implementation of a full encryption primitive for FHE over the integers using FPGA technology. A large-integer multiplier architecture utilising Integer-FFT multiplication is proposed, and a large-integer Barrett modular reduction module is designed incorporating the proposed multiplier. The encryption primitive used in the integer-based FHE scheme is designed employing the proposed multiplier and modular reduction modules. The designs are verified using the Xilinx Virtex-7 FPGA platform. Experimental results show that a speed improvement factor of up to 44 is achievable for the hardware implementation of the FHE encryption scheme when compared to its corresponding software implementation. Moreover, performance analysis shows further speed improvements of the integer-based FHE encryption primitives may still be possible, for example through further optimisations or by targeting an ASIC platform.
Resumo:
This study proposes an approach to optimally allocate multiple types of flexible AC transmission system (FACTS) devices in market-based power systems with wind generation. The main objective is to maximise profit by minimising device investment cost, and the system's operating cost considering both normal conditions and possible contingencies. The proposed method accurately evaluates the long-term costs and benefits gained by FACTS devices (FDs) installation to solve a large-scale optimisation problem. The objective implies maximising social welfare as well as minimising compensations paid for generation re-scheduling and load shedding. Many technical operation constraints and uncertainties are included in problem formulation. The overall problem is solved using both particle swarm optimisations for attaining optimal FDs allocation as main problem and optimal power flow as sub-optimisation problem. The effectiveness of the proposed approach is demonstrated on modified IEEE 14-bus test system and IEEE 118-bus test system.
Resumo:
Current data-intensive image processing applications push traditional embedded architectures to their limits. FPGA based hardware acceleration is a potential solution but the programmability gap and time consuming HDL design flow is significant. The proposed research approach to develop “FPGA based programmable hardware acceleration platform” that uses, large number of Streaming Image processing Processors (SIPPro) potentially addresses these issues. SIPPro is pipelined in-order soft-core processor architecture with specific optimisations for image processing applications. Each SIPPro core uses 1 DSP48, 2 Block RAMs and 370 slice-registers, making the processor as compact as possible whilst maintaining flexibility and programmability. It is area efficient, scalable and high performance softcore architecture capable of delivering 530 MIPS per core using Xilinx Zynq SoC (ZC7Z020-3). To evaluate the feasibility of the proposed architecture, a Traffic Sign Recognition (TSR) algorithm has been prototyped on a Zedboard with the color and morphology operations accelerated using multiple SIPPros. Simulation and experimental results demonstrate that the processing platform is able to achieve a speedup of 15 and 33 times for color filtering and morphology operations respectively, with a significant reduced design effort and time.