974 resultados para Software-reconfigurable array processing architectures
Resumo:
Reliable flow simulation software is inevitable to determine an optimal injection strategy in Liquid Composite Molding processes. Several methodologies can be implemented into standard software in order to reduce CPU time. Post-processing techniques might be one of them. Post-processing a finite element solution is a well-known procedure, which consists in a recalculation of the originally obtained quantities such that the rate of convergence increases without the need for expensive remeshing techniques. Post-processing is especially effective in problems where better accuracy is required for derivatives of nodal variables in regions where Dirichlet essential boundary condition is imposed strongly. In previous works influence of smoothness of non-homogeneous Dirichlet condition, imposed on smooth front was examined. However, usually quite a non-smooth boundary is obtained at each time step of the infiltration process due to discretization. Then direct application of post-processing techniques does not improve final results as expected. The new contribution of this paper lies in improvement of the standard methodology. Improved results clearly show that the recalculated flow front is closer to the ”exact” one, is smoother that the previous one and it improves local disturbances of the “exact” solution.
Resumo:
Sparse matrix-vector multiplication (SMVM) is a fundamental operation in many scientific and engineering applications. In many cases sparse matrices have thousands of rows and columns where most of the entries are zero, while non-zero data is spread over the matrix. This sparsity of data locality reduces the effectiveness of data cache in general-purpose processors quite reducing their performance efficiency when compared to what is achieved with dense matrix multiplication. In this paper, we propose a parallel processing solution for SMVM in a many-core architecture. The architecture is tested with known benchmarks using a ZYNQ-7020 FPGA. The architecture is scalable in the number of core elements and limited only by the available memory bandwidth. It achieves performance efficiencies up to almost 70% and better performances than previous FPGA designs.
Resumo:
Dissertação apresentada na Faculdade de Ciências e Tecnologias da Universidade Nova de Lisboa para a obtenção do Grau de Mestre em Engenharia Informática
Resumo:
Dissertation presented to obtain the degree of Doctor of Philosophy in Electrical Engineering, speciality on Perceptional Systems, by the Universidade Nova de Lisboa, Faculty of Sciences and Technology
Resumo:
Coarse Grained Reconfigurable Architectures (CGRAs) are emerging as enabling platforms to meet the high performance demanded by modern applications (e.g. 4G, CDMA, etc.). Recently proposed CGRAs offer time-multiplexing and dynamic applications parallelism to enhance device utilization and reduce energy consumption at the cost of additional memory (up to 50% area of the overall platform). To reduce the memory overheads, novel CGRAs employ either statistical compression, intermediate compact representation, or multicasting. Each compaction technique has different properties (i.e. compression ratio, decompression time and decompression energy) and is best suited for a particular class of applications. However, existing research only deals with these methods separately. Moreover, they only analyze the compaction ratio and do not evaluate the associated energy overheads. To tackle these issues, we propose a polymorphic compression architecture that interleaves these techniques in a unique platform. The proposed architecture allows each application to take advantage of a separate compression/decompression hierarchy (consisting of various types and implementations of hardware/software decoders) tailored to its needs. Simulation results, using different applications (FFT, Matrix multiplication, and WLAN), reveal that the choice of compression hierarchy has a significant impact on compression ratio (up to 52%), decompression energy (up to 4 orders of magnitude), and configuration time (from 33 n to 1.5 s) for the tested applications. Synthesis results reveal that introducing adaptivity incurs negligible additional overheads (1%) compared to the overall platform area.
Resumo:
As of today, AUTOSAR is the de facto standard in the automotive industry, providing a common software architec- ture and development process for automotive applications. While this standard is originally written for singlecore operated Elec- tronic Control Units (ECU), new guidelines and recommendations have been added recently to provide support for multicore archi- tectures. This update came as a response to the steady increase of the number and complexity of the software functions embedded in modern vehicles, which call for the computing power of multicore execution environments. In this paper, we enumerate and analyze the design options and the challenges of porting AUTOSAR-based automotive applications onto multicore platforms. In particular, we investigate those options when considering the emerging many- core architectures that provide a more scalable environment than the traditional multicore systems. Such platforms are suitable to enable massive parallel execution, and their design is more suitable for partitioning and isolating the software components.
Resumo:
Distributed real-time systems such as automotive applications are becoming larger and more complex, thus, requiring the use of more powerful hardware and software architectures. Furthermore, those distributed applications commonly have stringent real-time constraints. This implies that such applications would gain in flexibility if they were parallelized and distributed over the system. In this paper, we consider the problem of allocating fixed-priority fork-join Parallel/Distributed real-time tasks onto distributed multi-core nodes connected through a Flexible Time Triggered Switched Ethernet network. We analyze the system requirements and present a set of formulations based on a constraint programming approach. Constraint programming allows us to express the relations between variables in the form of constraints. Our approach is guaranteed to find a feasible solution, if one exists, in contrast to other approaches based on heuristics. Furthermore, approaches based on constraint programming have shown to obtain solutions for these type of formulations in reasonable time.
Resumo:
3rd Workshop on High-performance and Real-time Embedded Systems (HIRES 2015). 21, Jan, 2015. Amsterdam, Netherlands.
Resumo:
Article in Press, Corrected Proof
Resumo:
Recent embedded processor architectures containing multiple heterogeneous cores and non-coherent caches renewed attention to the use of Software Transactional Memory (STM) as a building block for developing parallel applications. STM promises to ease concurrent and parallel software development, but relies on the possibility of abort conflicting transactions to maintain data consistency, which in turns affects the execution time of tasks carrying transactions. Because of this fact the timing behaviour of the task set may not be predictable, thus it is crucial to limit the execution time overheads resulting from aborts. In this paper we formalise a FIFO-based algorithm to order the sequence of commits of concurrent transactions. Then, we propose and evaluate two non-preemptive and one SRP-based fully-preemptive scheduling strategies, in order to avoid transaction starvation.
Resumo:
The recent technological advancements and market trends are causing an interesting phenomenon towards the convergence of High-Performance Computing (HPC) and Embedded Computing (EC) domains. On one side, new kinds of HPC applications are being required by markets needing huge amounts of information to be processed within a bounded amount of time. On the other side, EC systems are increasingly concerned with providing higher performance in real-time, challenging the performance capabilities of current architectures. The advent of next-generation many-core embedded platforms has the chance of intercepting this converging need for predictable high-performance, allowing HPC and EC applications to be executed on efficient and powerful heterogeneous architectures integrating general-purpose processors with many-core computing fabrics. To this end, it is of paramount importance to develop new techniques for exploiting the massively parallel computation capabilities of such platforms in a predictable way. P-SOCRATES will tackle this important challenge by merging leading research groups from the HPC and EC communities. The time-criticality and parallelisation challenges common to both areas will be addressed by proposing an integrated framework for executing workload-intensive applications with real-time requirements on top of next-generation commercial-off-the-shelf (COTS) platforms based on many-core accelerated architectures. The project will investigate new HPC techniques that fulfil real-time requirements. The main sources of indeterminism will be identified, proposing efficient mapping and scheduling algorithms, along with the associated timing and schedulability analysis, to guarantee the real-time and performance requirements of the applications.
Resumo:
The study of chemical diffusion in biological tissues is a research field of high importance and with application in many clinical, research and industrial areas. The evaluation of diffusion and viscosity properties of chemicals in tissues is necessary to characterize treatments or inclusion of preservatives in tissues or organs for low temperature conservation. Recently, we have demonstrated experimentally that the diffusion properties and dynamic viscosity of sugars and alcohols can be evaluated from optical measurements. Our studies were performed in skeletal muscle, but our results have revealed that the same methodology can be used with other tissues and different chemicals. Considering the significant number of studies that can be made with this method, it becomes necessary to turn data processing and calculation easier. With this objective, we have developed a software application that integrates all processing and calculations, turning the researcher work easier and faster. Using the same experimental data that previously was used to estimate the diffusion and viscosity of glucose in skeletal muscle, we have repeated the calculations with the new application. Comparing between the results obtained with the new application and with previous independent routines we have demonstrated great similarity and consequently validated the application. This new tool is now available to be used in similar research to obtain the diffusion properties of other chemicals in different tissues or organs.
Resumo:
Dissertation presented to obtain the PhD degree in Electrical and Computer Engineering - Electronics
Resumo:
Dissertação para obtenção do Grau de Mestre em Engenharia Biomédica
Resumo:
Dissertation submitted in partial fulfillment of the requirements for the Degree of Master of Science in Geospatial Technologies.