17 results for floating
in Repositório Científico do Instituto Politécnico de Lisboa - Portugal
Abstract:
This paper presents a single-precision floating-point arithmetic unit with support for multiplication, addition, fused multiply-add, reciprocal, square root and inverse square root, with high performance and low resource usage. The design uses a piecewise second-order polynomial approximation to implement reciprocal, square root and inverse square root. The unit can be configured with any number of operations and is capable of calculating any function with a throughput of one operation per cycle. The floating-point multiplier of the unit is also used to implement the polynomial approximation and the fused multiply-add operation. We have compared our implementation with other state-of-the-art proposals, including the Xilinx Core-Gen operators, and conclude that the approach has a high relative performance/area efficiency. © 2014 Technical University of Munich (TUM).
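The piecewise second-order approximation mentioned in this abstract can be illustrated with a minimal software sketch. It is not the paper's hardware design: the segment count (`SEGMENTS = 64`), the three-point interpolation used to derive the coefficients, and all function names are assumptions made for illustration, and only positive inputs are handled. The mantissa range [1, 2) is split into equal segments, each with its own quadratic, and the quadratic is evaluated in Horner form so that it needs only two multiply-add steps.

```python
import math

SEGMENTS = 64  # hypothetical table size; the paper's value is not given in the abstract


def build_table(f, lo=1.0, hi=2.0, segments=SEGMENTS):
    """Per-segment coefficients (c0, c1, c2) with f(x) ~ c0 + c1*d + c2*d*d, d = x - x0."""
    h = (hi - lo) / segments
    table = []
    for i in range(segments):
        x0 = lo + i * h
        f0, fm, f1 = f(x0), f(x0 + h / 2), f(x0 + h)
        # Quadratic that passes through the segment's start, midpoint and end.
        c0 = f0
        c2 = 2.0 * (f1 - 2.0 * fm + f0) / (h * h)
        c1 = (f1 - f0) / h - c2 * h
        table.append((c0, c1, c2))
    return table


RECIP_TABLE = build_table(lambda x: 1.0 / x)


def reciprocal(x, table=RECIP_TABLE):
    """Approximate 1/x for x > 0 using the per-segment quadratics."""
    m, e = math.frexp(x)        # x = m * 2**e with 0.5 <= m < 1
    m, e = 2.0 * m, e - 1       # renormalise so that 1 <= m < 2
    h = 1.0 / len(table)
    i = min(int((m - 1.0) / h), len(table) - 1)
    d = m - (1.0 + i * h)
    c0, c1, c2 = table[i]
    # Horner form: two multiply-adds, then undo the exponent.
    return math.ldexp(c0 + d * (c1 + d * c2), -e)
```

For example, `reciprocal(3.0)` agrees with `1 / 3.0` to roughly single-precision accuracy under these assumptions; a square root or inverse square root variant would need its own table and exponent handling.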
Abstract:
A package of B-spline finite strip models is developed for the linear analysis of piezolaminated plates and shells. This package is coupled with a global optimization technique in order to enhance the performance of these types of structures, subjected to various types of objective functions and/or constraints, with discrete and continuous design variables. The models considered are based on a higher-order displacement field and can be applied to the static, free vibration and buckling analyses of laminated adaptive structures with arbitrary lay-ups, loading and boundary conditions. Genetic algorithms, with either binary or floating-point encoding of design variables, were considered to find optimal locations of piezoelectric actuators as well as to determine the best voltages applied to them in order to obtain a desired structure shape. These models provide an overall economy of computing effort for static and vibration problems.
Abstract:
The supply of services based on wireless communications has grown exponentially over the last decade. Ever higher transmission rates and better QoS are demanded, without compromising transmission power or available bandwidth. MIMO technology can increase the capacity of these systems without requiring an increase in bandwidth or transmitted power. The work developed in this dissertation consisted of the study of MIMO systems, which are characterized by the use of multiple antennas to transmit and receive information. With such a system, a spatial diversity gain can be obtained using space-time codes, which exploit the spatial domain and the time domain simultaneously. This dissertation places special emphasis on Alamouti space-time block coding, which is implemented on an FPGA, specifically the receiver part. The implementation is carried out for a 2x1 antenna configuration, using floating point, and for three types of modulation: BPSK, QPSK and 16-QAM. Finally, the relationship between the precision achieved in the numerical representation of the results and the resources consumed by the FPGA is analysed. With the adopted architecture, transfer rates on the order of 29.141 Msymb/s (without pipelining) to 262.674 Msymb/s (with pipelining) are obtained for BPSK modulation.
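As an illustration of the receiver-side combining for the 2x1 Alamouti scheme, here is a minimal sketch of the standard textbook formulation. It is not the dissertation's FPGA architecture: the function names are hypothetical, the channel gains `h1` and `h2` are assumed to be perfectly known and constant over the two symbol periods, and noise is omitted.

```python
def alamouti_encode(s1, s2):
    """Return the (antenna 1, antenna 2) symbols for the two time slots."""
    return [(s1, s2), (-s2.conjugate(), s1.conjugate())]


def channel(slots, h1, h2):
    """Noise-free 2x1 channel: one receive sample per time slot."""
    return [h1 * a + h2 * b for a, b in slots]


def alamouti_decode(r1, r2, h1, h2):
    """Linear combining that recovers both symbols, normalised by the channel gain."""
    gain = abs(h1) ** 2 + abs(h2) ** 2
    s1_hat = (h1.conjugate() * r1 + h2 * r2.conjugate()) / gain
    s2_hat = (h2.conjugate() * r1 - h1 * r2.conjugate()) / gain
    return s1_hat, s2_hat


# Usage: two QPSK symbols through an arbitrary (assumed) channel.
h1, h2 = 0.8 - 0.3j, -0.2 + 0.9j
r1, r2 = channel(alamouti_encode(1 + 1j, -1 + 1j), h1, h2)
s1_hat, s2_hat = alamouti_decode(r1, r2, h1, h2)
```

Without noise, the combiner returns the transmitted symbols up to rounding error. A real receiver would follow this step with a decision stage for the chosen modulation.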
Abstract:
A two-terminal optically addressed image processing device based on two stacked sensing/switching p-i-n a-SiC:H diodes is presented. The charge packets are injected optically into the p-i-n sensing photodiode and confined at the illuminated regions, locally changing the electric field profile across the p-i-n switching diode. A red scanner is used for charge readout. The various design parameters and addressing architecture trade-offs are discussed. The influence on the transfer functions of an a-SiC:H sensing absorber optimized for red transmittance and blue collection, or of a floating anode in between, is analysed. Results show that the thin a-SiC:H sensing absorber confines the readout to the switching diode and filters the light, allowing full colour detection at two appropriate voltages. When the floating anode is used, the spectral response broadens, allowing B&W image recognition with improved light-to-dark sensitivity. A physical model supports the image and colour recognition process.
Abstract:
This paper describes an implementation of a long-distance echo canceller operating in full-duplex, hands-free mode and in real time on a single Digital Signal Processor (DSP). The proposed solution is based on short-length adaptive filters centered on the positions of the most significant echoes, which are tracked by time delay estimators, for which we use a new approach. To deal with double-talk situations, a speech detector is employed. The floating-point DSP TMS320C6713 from Texas Instruments is used, with software written in C++ and compiler optimizations for fast execution. The resulting algorithm enables long-distance echo cancellation with low computational requirements, suited for embedded systems. It reaches greater echo return loss enhancement and shows faster convergence speed when compared to the conventional approach. The experimental results approach the CCITT G.165 recommendation levels.
Abstract:
Chapter in Book Proceedings with Peer Review. First Iberian Conference, IbPRIA 2003, Puerto de Andratx, Mallorca, Spain, June 4-6, 2003. Proceedings
Abstract:
Final Master's Project submitted for the degree of Master in Civil Engineering, in the specialization area of Structures
Abstract:
Report of the Final Master's Project submitted for the degree of Master in Electronics and Telecommunications Engineering
Abstract:
Project Work submitted for the degree of Master in Electronics and Telecommunications Engineering
Abstract:
Floating-point computing with more than one TFLOP of peak performance is already a reality in recent Field-Programmable Gate Arrays (FPGA). General-Purpose Graphics Processing Units (GPGPU) and recent many-core CPUs have also taken advantage of the recent technological innovations in integrated circuit (IC) design and have also dramatically improved their peak performance. In this paper, we compare the trends of these computing architectures for high-performance computing and survey these platforms in the execution of algorithms belonging to different scientific application domains. Trends in peak performance, power consumption and sustained performance for particular applications show that the gap between FPGAs and both GPUs and many-core CPUs is widening, moving FPGAs away from high-performance computing with intensive floating-point calculations. FPGAs become competitive for custom floating-point or fixed-point representations, for smaller input sizes of certain algorithms, for combinational logic problems and for parallel map-reduce problems. © 2014 Technical University of Munich (TUM).
Abstract:
Feature selection is a central problem in machine learning and pattern recognition. On large datasets (in terms of dimension and/or number of instances), using search-based or wrapper techniques can be computationally prohibitive. Moreover, many filter methods based on relevance/redundancy assessment also take a prohibitively long time on high-dimensional datasets. In this paper, we propose efficient unsupervised and supervised feature selection/ranking filters for high-dimensional datasets. These methods use low-complexity relevance and redundancy criteria, applicable to supervised, semi-supervised, and unsupervised learning, and can act as pre-processors that let computationally intensive methods focus their attention on smaller subsets of promising features. The experimental results, with up to 10^5 features, show the time efficiency of our methods, with lower generalization error than state-of-the-art techniques, while being dramatically simpler and faster.
Abstract:
This paper presents a model for the simulation of an offshore wind system having a rectifier input voltage malfunction at one phase. The offshore wind system model comprises a variable-speed wind turbine supported on a floating platform, equipped with a permanent magnet synchronous generator using a full-power four-level neutral point clamped converter. The link from the offshore floating platform to the onshore electrical grid is made through a light high voltage direct current submarine cable. The drive train is modeled by a three-mass model. Considerations about the smart grid context are offered for the use of the model in such a context. The domino effect of the rectifier voltage malfunction is presented as a case study to show the capabilities of the model. © 2015 Elsevier Ltd. All rights reserved.
Abstract:
Recent integrated circuit technologies have opened the possibility of designing parallel architectures with hundreds of cores on a single chip. The design space of these parallel architectures is huge, with many architectural options. Exploring the design space gets even more difficult if, beyond performance and area, we also consider extra metrics like performance and area efficiency, where the designer tries to design the architecture with the best performance per chip area and the best sustainable performance. In this paper we present an algorithm-oriented approach to designing a many-core architecture. Instead of doing the design space exploration of the many-core architecture based on the experimental execution results of a particular benchmark of algorithms, our approach is to make a formal analysis of the algorithms considering the main architectural aspects and to determine how each particular architectural aspect is related to the performance of the architecture when running an algorithm or set of algorithms. The architectural aspects considered include the number of cores, the local memory available in each core, the communication bandwidth between the many-core architecture and the external memory, and the memory hierarchy. To exemplify the approach we did a theoretical analysis of a dense matrix multiplication algorithm and determined an equation that relates the number of execution cycles to the architectural parameters. Based on this equation a many-core architecture has been designed. The results obtained indicate that a 100 mm^2 integrated circuit design of the proposed architecture, using a 65 nm technology, is able to achieve 464 GFLOPs (double-precision floating-point) for a memory bandwidth of 16 GB/s. This corresponds to a performance efficiency of 71%.
Considering a 45 nm technology, a 100 mm^2 chip attains 833 GFLOPs, which corresponds to 84% of peak performance. These figures are better than those obtained by previous many-core architectures, except for the area efficiency, which is limited by the lower memory bandwidth considered. The results achieved are also better than those of previous state-of-the-art many-core architectures designed specifically to achieve high performance for matrix multiplication.
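The figures quoted in this abstract can be related by simple arithmetic. This assumes that "performance efficiency" means sustained performance divided by peak performance, as the wording of the 45 nm result suggests; the symbols below and the derived values are not stated in the abstract.

```latex
\begin{align*}
P_{\text{peak}}^{65\,\text{nm}} &\approx \frac{464\ \text{GFLOPs}}{0.71} \approx 654\ \text{GFLOPs}, &
P_{\text{peak}}^{45\,\text{nm}} &\approx \frac{833\ \text{GFLOPs}}{0.84} \approx 992\ \text{GFLOPs}, \\
I^{65\,\text{nm}} &= \frac{464\ \text{GFLOPs}}{16\ \text{GB/s}} = 29\ \text{FLOP/byte}.
\end{align*}
```

Here $I$ is the number of floating-point operations sustained per byte of external memory traffic, which indicates how much on-chip data reuse the 65 nm design point relies on.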
Abstract:
Single-processor architectures are unable to provide the performance required by high-performance embedded systems. Parallel processing based on general-purpose processors can achieve this performance with a considerable increase in required resources. However, in many cases, simplified optimized parallel cores can be used instead of general-purpose processors, achieving better performance at lower resource utilization. In this paper, we propose a configurable many-core architecture to serve as a co-processor for high-performance embedded computing on Field-Programmable Gate Arrays. The architecture consists of an array of configurable simple cores with support for floating-point operations, interconnected with a configurable interconnection network. For each core it is possible to configure the size of the internal memory, the supported operations and the number of interfacing ports. The architecture was tested on a ZYNQ-7020 FPGA in the execution of several parallel algorithms. The results show that the proposed many-core architecture achieves better performance than that achieved with a parallel general-purpose processor and that up to 32 floating-point cores can be implemented in a ZYNQ-7020 SoC FPGA.
Abstract:
The integrated numerical tool SWAMS (Simulation of Wave Action on Moored Ships) is used to simulate the behavior of a moored container carrier inside Sines' Harbour. The interaction between waves, wind, currents, the floating ship and the moorings is discussed. Several scenarios are compared, differing in the layout of the harbour and in the wind and wave conditions. The harbour layouts correspond to proposed alternatives for the future expansion of Sines' Terminal XXI that include the extension of the East breakwater and of the quay. Additionally, the influence of wind on the behavior of the moored ship and the effect of pretensioning the mooring lines were analyzed. Hydrodynamic forces acting on the ship are determined using a modified version of the WAMIT model. This modified model uses the Haskind relations and the non-linear wave field inside the harbour, obtained with the finite element numerical model BOUSS-WMH (Boussinesq Wave Model for Harbors), to get the wave forces on the ship. The time series of the moored ship motions and of the forces on the moorings are obtained using the BAS solver. © 2015 Taylor & Francis Group, London.