991 results for GPU - graphics processing unit


Relevance:

100.00%

Publisher:

Abstract:

Various mechanisms have been proposed to explain extreme waves or rogue waves in an oceanic environment, including directional focusing, dispersive focusing, wave-current interaction, and nonlinear modulational instability. The Benjamin-Feir instability (nonlinear modulational instability), however, is considered to be one of the primary mechanisms for rogue-wave occurrence. The nonlinear Schrödinger equation is a well-established approximate model based on the same assumptions as those required for the derivation of the Benjamin-Feir theory. Solutions of the nonlinear Schrödinger equation, including new rogue-wave-type solutions, are presented in the author's dissertation work. The solutions are obtained using a predictive eigenvalue-map-based predictor-corrector procedure developed by the author. Features of the predictive map are explored and the influence of certain parameter variations is investigated. The solutions are rescaled to match the length scales of waves generated in a wave tank. Based on the information provided by the map and the details of physical scaling, a framework is developed that can serve as a basis for experimental investigations into a variety of extreme waves as well as localizations in wave fields. To derive further fundamental insight into the complexity of extreme wave conditions, Smoothed Particle Hydrodynamics (SPH) simulations are carried out on an advanced Graphics Processing Unit (GPU)-based parallel computational platform. Free-surface gravity wave simulations have successfully characterized water-wave dispersion in the SPH model while demonstrating extreme energy focusing and wave growth in both linear and nonlinear regimes. A virtual wave tank is simulated in which wave motions can be excited from either side. Focusing of several wave trains and isolated waves has been simulated. With properly chosen parameters, dispersion effects are observed that cause a chirped wave train to focus and exhibit growth. Using the insights derived from the study of the nonlinear Schrödinger equation, modulational instability, or self-focusing, has been induced in a numerical wave tank and studied through several numerical simulations. Due to the inherently dissipative nature of SPH models, simulating persistent progressive waves can be problematic; this issue has been addressed and an observation-based solution is provided. The efficacy of SPH in modeling wave focusing can be critical to furthering our understanding and prediction of extreme wave phenomena through simulations. A deeper understanding of the mechanisms underlying extreme energy localization can help facilitate energy harnessing and serve as a basis for predicting and mitigating the impact of energy focusing.
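For reference, a commonly used dimensionless form of the focusing nonlinear Schrödinger equation referred to above is sketched below, together with its simplest rational rogue-wave-type solution (the Peregrine breather); the normalization is an assumption, and the new solutions obtained in the dissertation are not reproduced here.

```latex
% Focusing nonlinear Schrodinger equation for the complex envelope \psi(x,t)
% in a standard dimensionless normalization (assumed here):
\[
  i\,\frac{\partial \psi}{\partial t}
  + \frac{1}{2}\,\frac{\partial^{2} \psi}{\partial x^{2}}
  + |\psi|^{2}\,\psi = 0 .
\]
% The Peregrine breather, the simplest rational rogue-wave-type solution:
% localized in both x and t, it peaks at three times the background amplitude.
\[
  \psi(x,t) = \left[\, 1 - \frac{4\,(1 + 2 i t)}{1 + 4 x^{2} + 4 t^{2}} \,\right] e^{\,i t} .
\]
```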

Relevance:

100.00%

Publisher:

Abstract:

Nowadays, the latest generation of computers provides enough performance to build computationally expensive computer vision applications for mobile robotics. Building a map of the environment is a common task for a robot and is essential to allow robots to move through those environments. Traditionally, mobile robots have used a combination of several sensors based on different technologies. Lasers, sonars and contact sensors have typically been used in mobile robotic architectures; however, color cameras are an important sensor because we want robots to use the same information that humans use to sense and move through different environments. Color cameras are cheap and flexible, but a lot of work needs to be done to give robots enough visual understanding of the scenes. Computer vision algorithms are computationally complex problems, but nowadays robots have access to different and powerful architectures that can be used for mobile robotics purposes. The advent of low-cost RGB-D sensors like the Microsoft Kinect, which provide 3D colored point clouds at high frame rates, has made computer vision even more relevant in the mobile robotics field. The combination of visual and 3D data allows systems to use both computer vision and 3D processing and therefore to be aware of more details of the surrounding environment. The research described in this thesis was motivated by the need for scene mapping. Being aware of the surrounding environment is a key feature in many mobile robotics applications, from simple robotic navigation to complex surveillance applications. In addition, the acquisition of a 3D model of a scene is useful in many areas, such as video-game scene modeling, where well-known places are reconstructed and added to game systems, or advertising, where, once the 3D model of a room is available, the system can add furniture using augmented reality techniques. In this thesis we perform an experimental study of state-of-the-art registration methods to find which one best fits our scene-mapping purposes. Different methods are tested and analyzed on scenes with different distributions of visual and geometric appearance. In addition, this thesis proposes two methods for 3D data compression and representation of 3D maps. Our 3D representation proposal is based on the Growing Neural Gas (GNG) method. This self-organizing map (SOM) has been successfully used for clustering, pattern recognition and topology representation of various kinds of data. Until now, self-organizing maps have primarily been computed offline, and their application to 3D data has mainly focused on noise-free models without considering time constraints. Self-organizing neural models have the ability to provide a good representation of the input space. In particular, the Growing Neural Gas (GNG) is a suitable model because of its flexibility, rapid adaptation and excellent quality of representation. However, this type of learning is time-consuming, especially for high-dimensional input data. Since real applications often work under time constraints, it is necessary to adapt the learning process so that it completes within a predefined time. This thesis proposes a hardware implementation that leverages the computing power of modern GPUs, taking advantage of a paradigm known as General-Purpose Computing on Graphics Processing Units (GPGPU). Our proposed geometric 3D compression method seeks to reduce the 3D information by using plane detection as the basic structure for compressing the data. This is because our target environments are man-made and therefore contain many points that belong to planar surfaces. Our method achieves good compression results in such man-made scenarios. The detected and compressed planes can also be used in other applications, such as surface reconstruction or plane-based registration algorithms. Finally, we have also demonstrated the benefits of GPU technologies by obtaining a high-performance implementation of a common CAD/CAM technique called Virtual Digitizing.
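As an illustration of the 3D representation idea, a minimal sketch of the per-sample adaptation step of Growing Neural Gas is given below; edge aging, error accumulation, node insertion and the GPU (GPGPU) implementation described in the thesis are all omitted, and the function and parameter names are hypothetical.

```python
import numpy as np

def gng_adapt_step(units, neighbors, sample, eps_winner=0.05, eps_neighbor=0.005):
    """One per-sample adaptation step of Growing Neural Gas (simplified sketch).

    units     : (N, 3) array of reference vectors representing the 3D map
    neighbors : dict mapping unit index -> set of topologically linked unit indices
    sample    : (3,) input point taken from the point cloud
    """
    # Find the two units closest to the input sample.
    dists = np.linalg.norm(units - sample, axis=1)
    winner, second = np.argsort(dists)[:2]

    # Move the winner and its topological neighbors toward the sample.
    units[winner] += eps_winner * (sample - units[winner])
    for j in neighbors.get(winner, set()):
        units[j] += eps_neighbor * (sample - units[j])

    # Link winner and runner-up (a full implementation would also age edges).
    neighbors.setdefault(winner, set()).add(second)
    neighbors.setdefault(second, set()).add(winner)
    return winner, second
```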

Relevance:

100.00%

Publisher:

Abstract:

Master's dissertation – Universidade de Brasília, Universidade UnB de Planaltina, Programa de Pós-Graduação em Ciência de Materiais, 2015.

Relevance:

100.00%

Publisher:

Abstract:

Solving a complex Constraint Satisfaction Problem (CSP) is a computationally hard task that may require a considerable amount of time. Parallelism has been applied successfully to this task, and there are already many applications capable of harnessing the parallel power of modern CPUs to speed up the solving process. Current Graphics Processing Units (GPUs), containing from a few hundred to a few thousand cores, possess a level of parallelism that surpasses that of CPUs, yet far fewer applications are capable of solving CSPs on GPUs, leaving room for further improvement. This paper describes work in progress on solving CSPs in parallel on GPUs, CPUs and other devices, such as Intel Many Integrated Core (MIC) coprocessors. It presents the gains obtained when applying more devices to solve some problems and the main challenges that must be faced when using devices with architectures as different as CPUs and GPUs, with a particular focus on how to effectively achieve good load balancing between such heterogeneous devices.
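To illustrate the load-balancing challenge mentioned above, the sketch below splits a pool of independent search-space blocks among devices in proportion to their measured solving rates; the function and device names are hypothetical and do not reproduce the paper's actual scheduler.

```python
def rebalance(search_blocks, device_rates):
    """Assign independent CSP search-space blocks to devices in proportion to
    the solving rate (blocks/second) each device achieved in the previous round.

    search_blocks : list of independent subproblems (e.g. fixed variable prefixes)
    device_rates  : dict device_name -> measured rate
    Returns dict device_name -> list of blocks for the next round.
    """
    total_rate = sum(device_rates.values())
    shares, assigned = {}, 0
    devices = list(device_rates)
    for i, dev in enumerate(devices):
        if i == len(devices) - 1:          # last device takes the remainder
            count = len(search_blocks) - assigned
        else:
            count = round(len(search_blocks) * device_rates[dev] / total_rate)
        shares[dev] = search_blocks[assigned:assigned + count]
        assigned += count
    return shares

# Example: a GPU that solved 3x faster than the CPU last round gets ~3x the blocks.
print({d: len(b) for d, b in
       rebalance(list(range(100)), {"cpu": 1.0e3, "gpu": 3.0e3}).items()})
```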

Relevance:

100.00%

Publisher:

Abstract:

Multi-core CPUs are commonly used to reduce the amount of time needed to solve the most complex Constraint Satisfaction Problems (CSPs), and there are already many applications capable of harnessing the parallel power of these devices to speed up the solving process. Nowadays, Graphics Processing Units (GPUs), which contain from a few hundred to a few thousand cores, possess a level of parallelism that surpasses that of CPUs, yet far fewer applications are capable of solving CSPs on GPUs, leaving room for improvement. This article describes work in progress on solving CSPs on GPUs and CPUs and compares the results with some state-of-the-art solvers, already presenting good results on GPUs.
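A common way to expose this kind of parallelism is to decompose the CSP search space into independent subproblems by enumerating assignments to a prefix of the variables, as in the hedged sketch below; it is illustrative only and does not reflect the article's actual decomposition.

```python
from itertools import product

def split_search_space(domains, prefix_len):
    """Enumerate all assignments to the first `prefix_len` variables of a CSP.

    Each partial assignment defines an independent subproblem that a GPU block
    (or a CPU worker) can explore without synchronization.

    domains : list of iterables, the domains of the CSP variables in order
    Returns a list of partial assignments (tuples).
    """
    return list(product(*domains[:prefix_len]))

# Example: 3 variables with domains {0,1,2}; fixing the first 2 gives 9 subproblems.
subproblems = split_search_space([range(3), range(3), range(3)], prefix_len=2)
print(len(subproblems))  # 9
```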

Relevance:

100.00%

Publisher:

Abstract:

Studying a transistor's characteristic curves reveals a set of parameters that are essential to its use, both in signal amplification and in switching circuits. From this study it is possible to obtain data under conditions that are often not covered in the documentation provided by manufacturers. The work presented here consists of developing a system that makes it possible to obtain, in a simple, efficient and economical way, the characteristic curves of a transistor (bipolar junction, junction field-effect and metal-oxide-semiconductor field-effect), and which can also be used as a teaching instrument in introductory courses on semiconductor devices or in the design of transistor amplifiers. The system consists of a signal-conditioning unit, a data-processing unit (hardware) and a computer program that graphically processes the acquired data, that is, plots the transistor's characteristic curves. Its operating principle is based on using a digital-to-analog converter (DAC) as a variable voltage source feeding the base (BJT) or the gate (JFET and MOSFET) of the device under test. A second converter provides the variation of VCE or VDS required to obtain each curve. The process is controlled by a local processing unit, based on a microcontroller of the 8051 family, which reads the current and voltage values through analog-to-digital converters (ADCs). Once processed, the data are sent over a USB link to a computer, where a program plots the output characteristic curves and determines other characteristic parameters of the semiconductor device under test. The use of conventional components and the constructive simplicity of the design make this system economical, easy to use and flexible, since with small modifications it allows
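The sweep logic behind such a curve tracer can be sketched as follows; the DAC/ADC helper functions are hypothetical placeholders for the hardware described above, not part of the actual firmware.

```python
def trace_output_curves(set_base_dac, set_vce_dac, read_ic_adc,
                        ib_steps, vce_points):
    """Sweep logic of a BJT output-characteristic tracer (illustrative sketch).

    set_base_dac(ib) : hypothetical helper programming the DAC that sets the base current
    set_vce_dac(vce) : hypothetical helper programming the DAC that sets V_CE
    read_ic_adc()    : hypothetical helper returning the collector current from the ADC
    ib_steps         : base-current values, one per curve (e.g. 10 uA .. 50 uA)
    vce_points       : V_CE values swept along each curve (e.g. 0 .. 10 V)

    Returns {ib: [(vce, ic), ...]} ready to be plotted as I_C versus V_CE.
    """
    curves = {}
    for ib in ib_steps:
        set_base_dac(ib)                 # fix the base current for this curve
        samples = []
        for vce in vce_points:
            set_vce_dac(vce)             # step the collector-emitter voltage
            samples.append((vce, read_ic_adc()))
        curves[ib] = samples
    return curves
```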

Relevance:

100.00%

Publisher:

Abstract:

This thesis addresses the development of autonomous behaviors for search and exploration with a mini-UAV (Unmanned Aerial Vehicle), also called MAV (Mini Aerial Vehicle), prototype, in order to gather information in rescue scenarios. The platform used in this work is a four-rotor helicopter, known as a quad-rotor, from the German company Ascending Technologies GmbH, which is then fitted with an on-board processing unit (i.e., a tiny, lightweight computer) and an on-board sensor suite (i.e., a 2D LIDAR and an ultrasonic sonar). This work can be divided into two phases. In the first phase, an indoor position tracking system was set up to obtain the Cartesian coordinates (i.e., X, Y, Z) and orientation (i.e., heading), which provide the relative position and orientation of the platform. The second phase was the design and implementation of medium/high-level controllers for each command input in order to autonomously control the aircraft position, which is the first step towards autonomous hovering flight and any other autonomous behavior (e.g., landing, object avoidance, wall following). The main work was carried out in the laboratory "Intelligent Systems for Emergencies and Civil Defense", in collaboration with the "Dipartimento di Informatica e Sistemistica" of Sapienza University of Rome and the "Istituto Superiore Antincendi" of the Italian Firemen Department.
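As an illustration of the kind of medium/high-level position controller described above, a minimal per-axis PID sketch is given below; the gains and structure are assumptions, not the thesis's actual controllers.

```python
class PID:
    """Minimal PID controller; the gains used below are illustrative values."""
    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_error = None

    def update(self, error, dt):
        self.integral += error * dt
        derivative = 0.0 if self.prev_error is None else (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative

# One position controller per axis; the outputs are interpreted as pitch, roll
# and thrust commands handed to the quad-rotor's low-level attitude controller.
x_ctrl, y_ctrl, z_ctrl = PID(0.8, 0.0, 0.4), PID(0.8, 0.0, 0.4), PID(1.2, 0.1, 0.5)

def position_step(setpoint, measured, dt=0.02):
    """Compute commands from the pose reported by the indoor tracking system."""
    pitch_cmd = x_ctrl.update(setpoint[0] - measured[0], dt)
    roll_cmd = y_ctrl.update(setpoint[1] - measured[1], dt)
    thrust_cmd = z_ctrl.update(setpoint[2] - measured[2], dt)
    return pitch_cmd, roll_cmd, thrust_cmd
```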

Relevance:

100.00%

Publisher:

Abstract:

In today's increasingly competitive and demanding business environment, a company's ability to reach and improve the satisfaction levels demanded by its customers is a key success factor. To identify the improvements to implement, companies must be able to monitor and control all of their activities and processes. Monitoring activities delegated to external companies, such as the transport of goods, becomes difficult when the providers of these services lack support tools that make the necessary information available. The need to overcome this difficulty in gathering information during the delivery of an order at Caetano Parts, a reseller of automotive spare parts, led to the development of a tool that allows an order to be followed through all of its stages, letting the operations manager track the status of the order from the moment it is placed, through its processing inside the facilities, until its delivery to the customer. The developed system consists of two components, the front-end and the back-end. The front-end comprises a web application and an Android application for mobile devices. The web application provides database management, order-status tracking and analysis of operations. The Android application is made available to the companies responsible for transporting the orders and allows the information about the delivery process to be updated online. The back-end comprises the information storage and processing unit and is hosted on an Internet-connected server, exposing an interface to the mobile application in the form of a web service. The design, development and description of the tool's functionality are covered throughout this work. The tests carried out during development validated the correct operation of the tool, which is ready for a pilot test.
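A minimal sketch of the kind of web-service interface the back-end could expose to the Android application is shown below, using Flask; the endpoint names and payload fields are hypothetical and are not taken from the actual system.

```python
from flask import Flask, request, jsonify

app = Flask(__name__)
ORDERS = {}  # in-memory stand-in for the real database

@app.route("/orders/<order_id>/status", methods=["POST"])
def update_status(order_id):
    """Carrier's Android app posts a new delivery status for an order."""
    payload = request.get_json(force=True)
    ORDERS.setdefault(order_id, []).append({
        "status": payload["status"],        # e.g. "picked_up", "in_transit", "delivered"
        "timestamp": payload["timestamp"],
    })
    return jsonify({"ok": True})

@app.route("/orders/<order_id>/status", methods=["GET"])
def get_status(order_id):
    """Web application reads the full status history for operations tracking."""
    return jsonify(ORDERS.get(order_id, []))
```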

Relevance:

100.00%

Publisher:

Abstract:

The difficulty of controlling an induction motor, together with the need to store DC energy and later use it as AC energy, motivated the development of variable-frequency drives and inverters. The development of a variable-frequency drive thus emerged as this master's thesis project in Automation and Systems. To build the drive, a study was carried out on the modulation techniques used in inverters. The technique chosen and used is Sinusoidal Pulse Width Modulation (SPWM). This technique is based on pulse-width modulation (PWM), in which the pulses are formed by comparing a reference signal with a high-frequency carrier signal. In turn, the topology chosen for the inverter is a Voltage Source Inverter (VSI) with a full three-phase bridge and three output terminals. The development of the SPWM technique led to a simulation model in SIMULINK, which allowed conclusions to be drawn about the results obtained. During the implementation phase, boards were developed for the operation of the variable-frequency drive. In an initial phase, the control board was developed, which contains the processing unit and is responsible for driving the Insulated Gate Bipolar Transistors (IGBTs). In addition, a board was developed to protect the IGBTs (preventing simultaneous conduction in the same leg), together with a board of isolated power supplies to power the circuits and drive the IGBTs. The SPWM technique was also implemented in software on the control unit, and finally a graphical interface was developed for user interaction. The project was validated by varying the speed of a three-phase induction motor, which was operated at several frequencies and different amplitudes. Its operation was also validated using a balanced three-phase load of three lamps, so that the frequency and amplitude variation could be observed.
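The SPWM principle described above (comparing a sinusoidal reference with a high-frequency triangular carrier) can be sketched as follows; the frequencies and modulation index are illustrative values, not those of the implemented drive.

```python
import numpy as np

def spwm_gate_signal(t, f_ref=50.0, f_carrier=5000.0, m_a=0.8):
    """Generate the SPWM gating signal for one inverter leg (illustrative sketch).

    A sinusoidal reference (fundamental frequency f_ref, modulation index m_a)
    is compared with a triangular carrier at f_carrier; the upper switch is on
    whenever the reference exceeds the carrier.
    """
    reference = m_a * np.sin(2 * np.pi * f_ref * t)
    # Triangular carrier between -1 and 1 at the carrier frequency.
    carrier = 2.0 * np.abs(2.0 * ((t * f_carrier) % 1.0) - 1.0) - 1.0
    return (reference > carrier).astype(int)   # 1 = upper IGBT on, 0 = lower IGBT on

# One fundamental period sampled at 1 MHz.
t = np.arange(0, 1 / 50.0, 1e-6)
gates = spwm_gate_signal(t)
print(gates.mean())  # the duty cycle averages ~0.5 over a full period
```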

Relevance:

100.00%

Publisher:

Abstract:

Master's degree in Informatics Engineering, specialization area in Knowledge and Decision Technologies.

Relevance:

100.00%

Publisher:

Abstract:

In this manuscript we tackle the problem of semi-distributed user selection with distributed linear precoding for sum-rate maximization in multiuser multicell systems. A set of adjacent base stations (BSs) forms a cluster in order to perform coordinated transmission to cell-edge users, and coordination is carried out through a central processing unit (CU). However, the message exchange between the BSs and the CU is limited to scheduling control signaling, and no user data or channel state information (CSI) exchange is allowed. In the considered multicell coordinated approach, each BS has its own set of cell-edge users and transmits only to one intended user, while interference to non-intended users at other BSs is suppressed by signal steering (precoding). We use two distributed linear precoding schemes, Distributed Zero Forcing (DZF) and Distributed Virtual Signal-to-Interference-plus-Noise Ratio (DVSINR). Considering multiple users per cell and the backhaul limitations, the BSs rely on local CSI to solve the user selection problem. First, we investigate how the signal-to-noise ratio (SNR) regime and the number of antennas at the BSs affect the effective channel gain (the magnitude of the channels after precoding) and its relationship with multiuser diversity. Considering that user selection must be based on the type of precoding implemented, we develop compatibility metrics (estimates of the effective channel gains) that can be computed from local CSI at each BS and reported to the CU for scheduling decisions. Based on such metrics, we design user selection algorithms that can find a set of users that potentially maximizes the sum rate. Numerical results show the effectiveness of the proposed metrics and algorithms for different configurations of users and antennas at the base stations.
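A minimal sketch of a zero-forcing-style compatibility metric computable from local CSI is given below; it illustrates the idea of estimating the effective channel gain after interference nulling, but it is not the exact metric or algorithm proposed in the manuscript, and all names are hypothetical.

```python
import numpy as np

def zf_effective_gain(h_user, h_others):
    """Proxy for the effective channel gain under zero-forcing-style precoding.

    h_user   : (Nt,) complex channel of the candidate user at this BS
    h_others : (K, Nt) channels toward users that must receive no interference
    The gain is the norm of the component of h_user^H that lies outside the
    subspace spanned by the other users' channels -- computable from local CSI only.
    """
    if len(h_others) == 0:
        return np.linalg.norm(h_user)
    # Orthonormal basis of the interference subspace (columns are h_k^H);
    # the ZF precoder for this user must be orthogonal to it.
    q, _ = np.linalg.qr(np.asarray(h_others).conj().T)
    h = h_user.conj()                      # candidate channel as the column vector h^H
    residual = h - q @ (q.conj().T @ h)    # component usable after interference nulling
    return np.linalg.norm(residual)

# Each BS reports one scalar per candidate user to the central unit, which then
# schedules the user with the largest reported value in each cell.
rng = np.random.default_rng(0)
h_candidates = rng.standard_normal((5, 4)) + 1j * rng.standard_normal((5, 4))
h_interfered = rng.standard_normal((2, 4)) + 1j * rng.standard_normal((2, 4))
metrics = [zf_effective_gain(h, h_interfered) for h in h_candidates]
print(int(np.argmax(metrics)))  # index of the user this BS would report
```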

Relevance:

100.00%

Publisher:

Abstract:

Dissertation submitted to obtain the Master's degree in Informatics Engineering.

Relevance:

100.00%

Publisher:

Abstract:

Consumer-electronics systems are becoming increasingly complex as the number of integrated applications grows. Some of these applications have real-time requirements, while other, non-real-time applications only require good average performance. For cost-efficient design, contemporary platforms feature an increasing number of cores that share resources, such as memories and interconnects. However, resource sharing causes contention that must be resolved by a resource arbiter, for example one based on Time-Division Multiplexing (TDM). A key challenge is to configure this arbiter to satisfy the bandwidth and latency requirements of the real-time applications, while maximizing the slack capacity to improve the performance of their non-real-time counterparts. As this configuration problem is NP-hard, a sophisticated automated configuration method is required to avoid negatively impacting design time. The main contributions of this article are: 1) An optimal approach that takes an existing integer linear programming (ILP) model addressing the problem and wraps it in a branch-and-price framework to improve scalability. 2) A faster heuristic algorithm that typically provides near-optimal solutions. 3) An experimental evaluation that quantitatively compares the branch-and-price approach to the previously formulated ILP model and the proposed heuristic. 4) A case study of an HD video and graphics processing system that demonstrates the practical applicability of the approach.
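To illustrate the configuration problem, a minimal greedy TDM slot-allocation sketch is given below; it is not the article's branch-and-price approach or its heuristic, the requirement values are invented, and a real method would additionally verify the resulting slot gaps against each latency requirement after placement.

```python
import math

def greedy_tdm_allocation(frame_size, requirements):
    """Greedy construction of a TDM table (illustrative sketch only).

    requirements : dict app -> (bandwidth_fraction, max_gap_in_slots)
    Returns a list of length frame_size where each entry is an app name or
    None; the None entries are slack left for non-real-time applications.
    """
    table = [None] * frame_size
    for app, (bw, max_gap) in sorted(requirements.items(),
                                     key=lambda kv: kv[1][1]):  # tightest latency first
        # Enough slots to cover both the bandwidth share and the latency bound.
        slots_needed = max(math.ceil(bw * frame_size),
                           math.ceil(frame_size / max_gap))
        stride = frame_size / slots_needed
        for k in range(slots_needed):
            # Place each slot at (or just after) its ideal evenly spaced position.
            pos = int(round(k * stride)) % frame_size
            while table[pos] is not None:
                pos = (pos + 1) % frame_size
            table[pos] = app
    return table

# Example: two real-time apps in a 16-slot frame; remaining slots stay as slack.
print(greedy_tdm_allocation(16, {"video": (0.25, 4), "audio": (0.125, 8)}))
```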

Relevance:

100.00%

Publisher:

Abstract:

Dissertation submitted to obtain the Master's degree in Electrical Engineering, Systems and Computers.

Relevance:

100.00%

Publisher:

Abstract:

Current computer systems have evolved from featuring only a single processing unit and limited RAM, on the order of kilobytes or a few megabytes, to including several multicore processors, offering on the order of several tens of concurrent execution contexts, and main memory on the order of several tens to hundreds of gigabytes. This makes it possible to keep all the data of many applications in main memory, leading to the development of in-memory databases. Compared to disk-backed databases, in-memory databases (IMDBs) are expected to provide better performance by incurring less I/O overhead. In this dissertation, we present a scalability study of two general-purpose IMDBs on multicore systems. The results show that current general-purpose IMDBs do not scale on multicores, due to contention among threads running concurrent transactions. In this work, we explore different directions to overcome the scalability issues of IMDBs on multicores, while enforcing strong isolation semantics. First, we present a solution that requires no modification to either the database systems or the applications, called MacroDB. MacroDB replicates the database among several engines, using a master-slave replication scheme, where update transactions execute on the master, while read-only transactions execute on the slaves. This reduces contention, allowing MacroDB to offer scalable performance under read-only workloads, while update-intensive workloads suffer a performance loss when compared to the standalone engine. Second, we delve into the database engine and identify the concurrency control mechanism used by the storage sub-component as a scalability bottleneck. We then propose a new locking scheme that allows the removal of such mechanisms from the storage sub-component. This modification offers a performance improvement under all workloads when compared to the standalone engine, while scalability remains limited to read-only workloads. Next, we address the scalability limitations for update-intensive workloads and propose reducing the locking granularity from the table level to the attribute level. This further improves performance for intensive and moderate update workloads, at a slight cost for read-only workloads; scalability is limited to read-intensive and read-only workloads. Finally, we investigate the impact applications have on the performance of database systems, by studying how the order of operations inside transactions influences database performance. We then propose a Read-before-Write (RbW) interaction pattern, under which transactions perform all read operations before executing write operations. The RbW pattern allowed TPC-C to achieve scalable performance on our modified engine for all workloads. Additionally, the RbW pattern allowed our modified engine to achieve scalable performance on multicores, almost up to the total number of cores, while enforcing strong isolation.
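A minimal sketch of the MacroDB routing idea (updates to a master engine, read-only transactions spread over replicas) is shown below; the engine objects and their execute() method are hypothetical placeholders, and the propagation of updates from the master to the replicas is omitted.

```python
import itertools

class MacroDBRouter:
    """Sketch of the MacroDB master-slave routing scheme (illustrative only)."""

    def __init__(self, master, replicas):
        self.master = master
        self.replicas = itertools.cycle(replicas)   # round-robin over the slaves

    def execute(self, transaction, read_only):
        if read_only:
            # Read-only transactions are spread across replicas, reducing contention.
            return next(self.replicas).execute(transaction)
        # Update transactions go to the master; propagating their effects to the
        # replicas is required in a real system but omitted in this sketch.
        return self.master.execute(transaction)
```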