29 resultados para Hardware Transactional Memory
Resumo:
Contention on the memory bus in COTS based multicore systems is becoming a major determining factor of the execution time of a task. Analyzing this extra execution time is non-trivial because (i) bus arbitration protocols in such systems are often undocumented and (ii) the times when the memory bus is requested to be used are not explicitly controlled by the operating system scheduler; they are instead a result of cache misses. We present a method for finding an upper bound on the extra execution time of a task due to contention on the memory bus in COTS based multicore systems. This method makes no assumptions on the bus arbitration protocol (other than assuming that it is work-conserving).
Resumo:
Prepared for presentation at the Portuguese Finance Network International Conference 2014, Vilamoura, Portugal, June 18-20
Resumo:
Reconfigurable computing experienced a considerable expansion in the last few years, due in part to the fast run-time partial reconfiguration features offered by recent SRAM-based Field Programmable Gate Arrays (FPGAs), which allowed the implementation in real-time of dynamic resource allocation strategies, with multiple independent functions from different applications sharing the same logic resources in the space and temporal domains. However, when the sequence of reconfigurations to be performed is not predictable, the efficient management of the logic space available becomes the greatest challenge posed to these systems. Resource allocation decisions have to be made concurrently with system operation, taking into account function priorities and optimizing the space currently available. As a consequence of the unpredictability of this allocation procedure, the logic space becomes fragmented, with many small areas of free resources failing to satisfy most requests and so remaining unused. A rearrangement of the currently running functions is therefore necessary, so as to obtain enough contiguous space to implement incoming functions, avoiding the spreading of their components and the resulting degradation of system performance. A novel active relocation procedure for Configurable Logic Blocks (CLBs) is herein presented, able to carry out online rearrangements, defragmenting the available FPGA resources without disturbing functions currently running.
Resumo:
Mestrado em Engenharia Informática, Área de Especialização em Tecnologias do Conhecimento e da Decisão
Resumo:
The goal of this study was to propose a new functional magnetic resonance imaging (fMRI) paradigm using a language-free adaptation of a 2-back working memory task to avoid cultural and educational bias. We additionally provide an index of the validity of the proposed paradigm and test whether the experimental task discriminates the behavioural performances of healthy participants from those of individuals with working memory deficits. Ten healthy participants and nine patients presenting working memory (WM) deficits due to acquired brain injury (ABI) performed the developed task. To inspect whether the paradigm activates brain areas typically involved in visual working memory (VWM), brain activation of the healthy participants was assessed with fMRIs. To examine the task's capacity to discriminate behavioural data, performances of the healthy participants in the task were compared with those of ABI patients. Data were analysed with GLM-based random effects procedures and t-tests. We found an increase of the BOLD signal in the specialized areas of VWM. Concerning behavioural performances, healthy participants showed the predicted pattern of more hits, less omissions and a tendency for fewer false alarms, more self-corrected responses, and faster reaction times, when compared with subjects presenting WM impairments. The results suggest that this task activates brain areas involved in VWM and discriminates behavioural performances of clinical and non-clinical groups. It can thus be used as a research methodology for behavioural and neuroimaging studies of VWM in block-design paradigms.
Resumo:
Coarse Grained Reconfigurable Architectures (CGRAs) are emerging as enabling platforms to meet the high performance demanded by modern applications (e.g. 4G, CDMA, etc.). Recently proposed CGRAs offer time-multiplexing and dynamic applications parallelism to enhance device utilization and reduce energy consumption at the cost of additional memory (up to 50% area of the overall platform). To reduce the memory overheads, novel CGRAs employ either statistical compression, intermediate compact representation, or multicasting. Each compaction technique has different properties (i.e. compression ratio, decompression time and decompression energy) and is best suited for a particular class of applications. However, existing research only deals with these methods separately. Moreover, they only analyze the compaction ratio and do not evaluate the associated energy overheads. To tackle these issues, we propose a polymorphic compression architecture that interleaves these techniques in a unique platform. The proposed architecture allows each application to take advantage of a separate compression/decompression hierarchy (consisting of various types and implementations of hardware/software decoders) tailored to its needs. Simulation results, using different applications (FFT, Matrix multiplication, and WLAN), reveal that the choice of compression hierarchy has a significant impact on compression ratio (up to 52%), decompression energy (up to 4 orders of magnitude), and configuration time (from 33 n to 1.5 s) for the tested applications. Synthesis results reveal that introducing adaptivity incurs negligible additional overheads (1%) compared to the overall platform area.
Resumo:
The last decade has witnessed a major shift towards the deployment of embedded applications on multi-core platforms. However, real-time applications have not been able to fully benefit from this transition, as the computational gains offered by multi-cores are often offset by performance degradation due to shared resources, such as main memory. To efficiently use multi-core platforms for real-time systems, it is hence essential to tightly bound the interference when accessing shared resources. Although there has been much recent work in this area, a remaining key problem is to address the diversity of memory arbiters in the analysis to make it applicable to a wide range of systems. This work handles diverse arbiters by proposing a general framework to compute the maximum interference caused by the shared memory bus and its impact on the execution time of the tasks running on the cores, considering different bus arbiters. Our novel approach clearly demarcates the arbiter-dependent and independent stages in the analysis of these upper bounds. The arbiter-dependent phase takes the arbiter and the task memory-traffic pattern as inputs and produces a model of the availability of the bus to a given task. Then, based on the availability of the bus, the arbiter-independent phase determines the worst-case request-release scenario that maximizes the interference experienced by the tasks due to the contention for the bus. We show that the framework addresses the diversity problem by applying it to a memory bus shared by a fixed-priority arbiter, a time-division multiplexing (TDM) arbiter, and an unspecified work-conserving arbiter using applications from the MediaBench test suite. We also experimentally evaluate the quality of the analysis by comparison with a state-of-the-art TDM analysis approach and consistently showing a considerable reduction in maximum interference.
Resumo:
Accepted in 13th IEEE Symposium on Embedded Systems for Real-Time Multimedia (ESTIMedia 2015), Amsterdam, Netherlands.
Resumo:
Work in Progress Session, 21st IEEE Real-Time and Embedded Techonology and Applications Symposium (RTAS 2015). 13 to 16, Apr, 2015, pp 27-28. Seattle, U.S.A..
Resumo:
This paper analyzes several natural and man-made complex phenomena in the perspective of dynamical systems. Such phenomena are often characterized by the absence of a characteristic length-scale, long range correlations and persistent memory, which are features also associated to fractional order systems. For each system, the output, interpreted as a manifestation of the system dynamics, is analyzed by means of the Fourier transform. The amplitude spectrum is approximated by a power law function and the parameters are interpreted as an underlying signature of the system dynamics. The complex systems under analysis are then compared in a global perspective in order to unveil and visualize hidden relationships among them.
Resumo:
4º Festival Nacional de Robótica - Actas do Encontro Científico - Proceedings of the Scientific Meeting, Biblioteca Almeida Garrett, Palácio de Cristal – Porto, 23-24 Abril 2004
Resumo:
23rd International Conference on Real-Time Networks and Systems (RTNS 2015). 4 to 6, Nov, 2015, Main Track. Lille, France. Best Paper Award Nominee
Resumo:
A crescente evolução dos dispositivos contendo circuitos integrados, em especial os FPGAs (Field Programmable Logic Arrays) e atualmente os System on a chip (SoCs) baseados em FPGAs, juntamente com a evolução das ferramentas, tem deixado um espaço entre o lançamento e a produção de materiais didáticos que auxiliem os engenheiros no Co- Projecto de hardware/software a partir dessas tecnologias. Com o intuito de auxiliar na redução desse intervalo temporal, o presente trabalho apresenta o desenvolvimento de documentos (tutoriais) direcionados a duas tecnologias recentes: a ferramenta de desenvolvimento de hardware/software VIVADO; e o SoC Zynq-7000, Z-7010, ambos desenvolvidos pela Xilinx. Os documentos produzidos são baseados num projeto básico totalmente implementado em lógica programável e do mesmo projeto implementado através do processador programável embarcado, para que seja possível avaliar o fluxo de projeto da ferramenta para um projeto totalmente implementado em hardware e o fluxo de projeto para o mesmo projeto implementado numa estrutura de harware/software.
Resumo:
This Thesis has the main target to make a research about FPAA/dpASPs devices and technologies applied to control systems. These devices provide easy way to emulate analog circuits that can be reconfigurable by programming tools from manufactures and in case of dpASPs are able to be dynamically reconfigurable on the fly. It is described different kinds of technologies commercially available and also academic projects from researcher groups. These technologies are very recent and are in ramp up development to achieve a level of flexibility and integration to penetrate more easily the market. As occurs with CPLD/FPGAs, the FPAA/dpASPs technologies have the target to increase the productivity, reducing the development time and make easier future hardware reconfigurations reducing the costs. FPAA/dpAsps still have some limitations comparing with the classic analog circuits due to lower working frequencies and emulation of complex circuits that require more components inside the integrated circuit. However, they have great advantages in sensor signal condition, filter circuits and control systems. This thesis focuses practical implementations of these technologies to control system PID controllers. The result of the experiments confirms the efficacy of FPAA/dpASPs on signal condition and control systems.