983 results for Graphics processing unit programming
Abstract:
Current computer systems have evolved from featuring only a single processing unit and limited RAM, on the order of kilobytes or a few megabytes, to including several multicore processors, offering on the order of several tens of concurrent execution contexts, and main memory on the order of several tens to hundreds of gigabytes. This allows all data of many applications to be kept in main memory, leading to the development of in-memory databases. Compared to disk-backed databases, in-memory databases (IMDBs) are expected to provide better performance by incurring less I/O overhead. In this dissertation, we present a scalability study of two general purpose IMDBs on multicore systems. The results show that current general purpose IMDBs do not scale on multicores, due to contention among threads running concurrent transactions. In this work, we explore different directions to overcome the scalability issues of IMDBs on multicores, while enforcing strong isolation semantics. First, we present a solution that requires no modification to either the database system or the applications, called MacroDB. MacroDB replicates the database among several engines, using a master-slave replication scheme, where update transactions execute on the master, while read-only transactions execute on the slaves. This reduces contention, allowing MacroDB to offer scalable performance under read-only workloads, while update-intensive workloads suffer from performance loss when compared to the standalone engine. Second, we delve into the database engine and identify the concurrency control mechanism used by the storage sub-component as a scalability bottleneck. We then propose a new locking scheme that allows the removal of such mechanisms from the storage sub-component. This modification offers performance improvements under all workloads, when compared to the standalone engine, while scalability remains limited to read-only workloads. Next, we address the scalability limitations for update-intensive workloads and propose reducing the locking granularity from the table level to the attribute level. This further improves performance for intensive and moderate update workloads, at a slight cost for read-only workloads, with scalability limited to read-intensive and read-only workloads. Finally, we investigate the impact applications have on the performance of database systems by studying how the order of operations inside transactions influences database performance. We then propose a Read before Write (RbW) interaction pattern, under which transactions perform all read operations before executing write operations. The RbW pattern allowed TPC-C to achieve scalable performance on our modified engine for all workloads, scaling almost up to the total number of cores while enforcing strong isolation.
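To make the Read-before-Write idea concrete, here is a minimal, self-contained C++ sketch of a transfer transaction written in RbW order against a toy in-memory key-value store; the Store type and its read/write methods are hypothetical stand-ins for illustration only, not MacroDB or any engine mentioned above.

```cpp
// Minimal sketch of the Read-before-Write (RbW) transaction pattern:
// all reads are issued first, all writes are deferred to the end.
#include <iostream>
#include <map>
#include <string>

// Hypothetical in-memory "database" used only for illustration.
struct Store {
    std::map<std::string, double> rows;
    double read(const std::string& key) { return rows[key]; }
    void write(const std::string& key, double value) { rows[key] = value; }
};

// A transfer written in RbW order: read both balances up front,
// compute in private memory, then apply both writes together.
void transfer_rbw(Store& db, const std::string& from,
                  const std::string& to, double amount) {
    // Read phase: no writes have been issued yet.
    double src = db.read(from);
    double dst = db.read(to);
    // Compute phase (no database access).
    src -= amount;
    dst += amount;
    // Write phase: all updates grouped at the end of the transaction.
    db.write(from, src);
    db.write(to, dst);
}

int main() {
    Store db;
    db.rows = {{"alice", 100.0}, {"bob", 50.0}};
    transfer_rbw(db, "alice", "bob", 25.0);
    std::cout << db.read("alice") << " " << db.read("bob") << "\n";  // 75 75
    return 0;
}
```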
Abstract:
Integrated master's dissertation in Industrial Engineering and Management
Abstract:
Integrated master's dissertation in Civil Engineering
Abstract:
This project presents the design, construction and programming of an autonomous robot as the basis for an educational proposal. To achieve this goal, the robot was equipped with a processing unit, a locomotion system and a set of sensors that provide the unit with information about its environment. To manage all these functionalities, a real-time operating system was used, capable of effectively managing the tasks that the robot may execute. Finally, a detailed description of the costs for a medium-volume production of a purely educational nature is presented.
Abstract:
A graphical processing unit (GPU) is a hardware device normally used to manipulate computer memory for the display of images. GPU computing is the practice of using a GPU device for scientific or general purpose computations that are not necessarily related to the display of images. Many problems in econometrics have a structure that allows for successful use of GPU computing. We explore two examples. The first is simple: repeated evaluation of a likelihood function at different parameter values. The second is a more complicated estimator that involves simulation and nonparametric fitting. We find speedups from 1.5 to 55.4 times compared to computations done on a single CPU core. These speedups can be obtained with very little expense, energy consumption, and time dedicated to system maintenance, compared to CPU-based solutions of equivalent performance. Code for the examples is provided.
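As a hedged illustration of the first example, the CUDA sketch below evaluates a Gaussian log-likelihood for the same data at many candidate parameter values in parallel, one thread per (mu, sigma) pair; the data, the parameter grid, and the kernel name are invented for this sketch and are not taken from the paper's code.

```cuda
// Sketch: evaluating a Gaussian log-likelihood at many candidate parameter
// values in parallel on the GPU, one thread per (mu, sigma) pair.
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

__global__ void loglik_grid(const double* data, int n,
                            const double* mu, const double* sigma,
                            double* out, int n_params) {
    int p = blockIdx.x * blockDim.x + threadIdx.x;
    if (p >= n_params) return;
    const double LOG_2PI = 1.8378770664093453;   // log(2*pi)
    double m = mu[p], s = sigma[p];
    double ll = -0.5 * n * (LOG_2PI + 2.0 * log(s));
    for (int i = 0; i < n; ++i) {                // shared data, private parameters
        double z = (data[i] - m) / s;
        ll -= 0.5 * z * z;
    }
    out[p] = ll;
}

int main() {
    // Toy data and a small parameter grid, for illustration only.
    std::vector<double> h_data  = {0.9, 1.1, 1.0, 0.8, 1.2};
    std::vector<double> h_mu    = {0.5, 1.0, 1.5};
    std::vector<double> h_sigma = {0.2, 0.2, 0.2};
    int n = (int)h_data.size(), k = (int)h_mu.size();

    double *d_data, *d_mu, *d_sigma, *d_out;
    cudaMalloc(&d_data, n * sizeof(double));
    cudaMalloc(&d_mu, k * sizeof(double));
    cudaMalloc(&d_sigma, k * sizeof(double));
    cudaMalloc(&d_out, k * sizeof(double));
    cudaMemcpy(d_data, h_data.data(), n * sizeof(double), cudaMemcpyHostToDevice);
    cudaMemcpy(d_mu, h_mu.data(), k * sizeof(double), cudaMemcpyHostToDevice);
    cudaMemcpy(d_sigma, h_sigma.data(), k * sizeof(double), cudaMemcpyHostToDevice);

    loglik_grid<<<1, 128>>>(d_data, n, d_mu, d_sigma, d_out, k);

    std::vector<double> h_out(k);
    cudaMemcpy(h_out.data(), d_out, k * sizeof(double), cudaMemcpyDeviceToHost);
    for (int p = 0; p < k; ++p)
        printf("mu=%.2f sigma=%.2f logL=%.3f\n", h_mu[p], h_sigma[p], h_out[p]);

    cudaFree(d_data); cudaFree(d_mu); cudaFree(d_sigma); cudaFree(d_out);
    return 0;
}
```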
Abstract:
The Computational Biophysics Group at the Universitat Pompeu Fabra (GRIB-UPF) hosts two unique computational resources dedicated to the execution of large-scale molecular dynamics (MD) simulations: (a) the ACEMD molecular-dynamics software, used on standard personal computers with graphical processing units (GPUs); and (b) the GPUGRID.net computing network, supported by users distributed worldwide who volunteer GPUs for biomedical research. We leveraged these resources and developed studies, protocols and open-source software to elucidate the energetics and pathways of a number of biomolecular systems, with a special focus on flexible proteins with many degrees of freedom. First, we characterized ion permeation through the bactericidal model protein Gramicidin A, conducting one of the largest studies to date with the steered MD biasing methodology. Next, we addressed an open problem in structural biology, the determination of drug-protein association kinetics; we reconstructed the binding free energy and the association and dissociation rates of a drug-like model system through a spatial decomposition and a Markov-chain analysis. The work was published in the Proceedings of the National Academy of Sciences and became one of the few landmark papers elucidating a ligand-binding pathway. Furthermore, we investigated the unstructured Kinase Inducible Domain (KID), a 28-residue peptide central to signalling and transcriptional response; the kinetics of this challenging system were modelled with a Markovian approach in collaboration with Frank Noe's group at the Freie Universität Berlin. The impact of the funding includes three peer-reviewed publications in high-impact journals; three more papers under review; four MD analysis components released as open-source software; MD protocols; didactic material; and code for the hosting group.
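As a brief aside on the Markov-chain analysis mentioned above, kinetic rates in such models are commonly read off the eigenvalues of the lag-time transition matrix; the relation below is the standard implied-timescale formula, shown as a generic reminder rather than the specific estimator used in the cited work.

```latex
% Implied timescales of a Markov state model with transition matrix T(\tau):
% each non-unit eigenvalue \lambda_i of T(\tau) corresponds to a relaxation
% timescale, from which association/dissociation rates are estimated.
\[
  t_i \;=\; -\frac{\tau}{\ln \lambda_i(\tau)},
  \qquad
  k_i \;=\; \frac{1}{t_i}.
\]
```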
Abstract:
Purpose: The objective of this study is to investigate the feasibility of detecting and quantifying 3D cerebrovascular wall motion from a single 3D rotational x-ray angiography (3DRA) acquisition within a clinically acceptable time, and of deriving, from the estimated motion field, measures for further biomechanical modeling of the cerebrovascular wall. Methods: The whole motion cycle of the cerebral vasculature is modeled using a 4D B-spline transformation, which is estimated within a 4D to 2D + t image registration framework. The registration is performed by optimizing a single similarity metric between the entire 2D + t measured projection sequence and the corresponding forward projections of the deformed volume at their exact time instants. The joint use of two acceleration strategies, together with their implementation on graphics processing units, is also proposed so as to reach computation times close to clinical requirements. For further characterizing vessel wall properties, an approximation of the wall thickness changes is obtained through a strain calculation. Results: Evaluation on in silico and in vitro pulsating phantom aneurysms demonstrated an accurate estimation of wall motion curves. In general, the error was below 10% of the maximum pulsation, even when a substantially inhomogeneous intensity pattern was present. Experiments on in vivo data provided realistic aneurysm and vessel wall motion estimates, whereas in regions where motion was neither visible nor anatomically possible, no motion was detected. The use of the acceleration strategies enabled completing the estimation process for one entire cycle in 5-10 min without degrading the overall performance. The strain map extracted from our motion estimation provided a realistic deformation measure of the vessel wall. Conclusions: The authors' technique has demonstrated that it can provide accurate and robust 4D estimates of cerebrovascular wall motion within a clinically acceptable time, although it has to be applied to a larger patient population prior to possible wide application to routine endovascular procedures. In particular, for the first time, this feasibility study has shown that in vivo cerebrovascular motion can be obtained intraprocedurally from a 3DRA acquisition. The results have also shown the potential of performing strain analysis using this imaging modality, thus making possible the future modeling of the biomechanical properties of the vascular wall.
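For reference, one common way to derive a strain measure from an estimated motion field is via the deformation gradient of the transformation; the Green-Lagrange form below is a generic example and not necessarily the exact strain definition used by the authors.

```latex
% Deformation gradient of the estimated transformation \varphi(\mathbf{x}, t)
% and the Green-Lagrange strain tensor derived from it.
\[
  \mathbf{F}(\mathbf{x}, t) = \frac{\partial \varphi(\mathbf{x}, t)}{\partial \mathbf{x}},
  \qquad
  \mathbf{E}(\mathbf{x}, t) = \tfrac{1}{2}\left(\mathbf{F}^{\mathsf{T}}\mathbf{F} - \mathbf{I}\right).
\]
```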
Abstract:
The AQUAREL project studied the availability of, and optional utilization methods for, fish processing side streams and other aquatic biomaterial in the Republic of Karelia. Additionally, processing aquatic biomaterial together with manure and sewage sludge was studied. Based on the results, the most feasible option today is to process fish side streams into fish oil and a dewatered oil-free residue, and to use them for fish or animal feed production. However, it is necessary to highlight that changes in, for example, the economic environment, energy prices and demand may require re-evaluating the results and conclusions made in the project. Producing fish oil from fish processing side streams is a relatively simple production process that generates a valuable end product. The functionality of the process was confirmed in a pilot conducted in the project. The oil and solids are separated from the heated fish waste by gravity, and the fish oil separating on top of the separator unit is removed. Fish oil can be utilized as such for heating purposes or for fish meal or animal feed production, but it can also be further processed into biodiesel. However, due to the currently moderate energy prices in Russia, biodiesel production is not economically profitable. Even though the fish oil production process is not complicated, the operative management of a small-scale fish oil production unit requires dedicated resources and separate facilities, especially to meet hygiene requirements. Managing the side streams is not a core business for fish farmers. Efficient and economically profitable fish oil production requires a centralized production unit with bigger processing capacity: one fish processing unit needs to be designed to manage side streams collected from several fish farms, with its optimum location in the middle of the fish farms. Based on the transportation cost analysis in the Republic of Karelia, it is not economically efficient to transport bio-waste for more than 100 km, since the transportation costs start increasing substantially. Another issue to be considered is that collection of side streams, including dead fish, from the fish farms should be organized on a daily basis in order to eliminate the need for storing the side streams at the farms. Based on the AQUAREL project studies, there are various public funding sources available for supporting and enabling profitable and environmentally sustainable utilization, research or development of fish processing side streams and other aquatic biomaterial. These funding programmes can be utilized by companies, research organizations, authorities and non-governmental organizations.
Abstract:
The purpose of this work was to evaluate the physical, physicochemical, chemical and microbiological characteristics of the in natura açaí (Euterpe precatoria Mart.) beverage processed and commercialized in Rio Branco, Acre, submitting it to acidification and pasteurization treatments and evaluating their effects. Açaí fruits were processed to obtain the beverage as generally consumed. A 25 L sample was collected from a processing unit at a market in Rio Branco and transported to the Laboratory of Food Technology at the Federal University of Acre, for sampling of the experiments in a completely randomized design. Analyses of total solids, pH, total titratable acidity, proteins, lipids, moulds and yeasts, and total and heat-tolerant (45 ºC) coliforms were performed on the in natura beverage and after the treatments. The ANOVA results showed differences (p < 0.01) in all parameters except lipids. The in natura açaí beverage presented elevated contamination by total and heat-tolerant coliforms, moulds and yeasts, indicating hygienic-sanitary conditions that were both unsatisfactory and unsafe for consumption. Pasteurization was efficient in reducing the beverage microbiota; it reduced contamination to a level acceptable under the legislation, ensuring food quality and safety. The acidification treatment only partially reduced the microbiota. The beverage was classified as fine, or type C.
Abstract:
Positron emission tomography (PET) is a molecular imaging modality that uses radiotracers labeled with positron-emitting isotopes to quantify and probe biological and physiological processes. This modality is currently used mainly in oncology, but it is increasingly used in cardiology, neurology and pharmacology as well. It is intrinsically capable of providing, with better sensitivity, functional information on cellular metabolism. Its main limitations are the low spatial resolution and the lack of quantification accuracy. To overcome these limitations, which hinder a broader range of clinical applications of PET, new acquisition systems are equipped with a large number of small detectors with better detection performance. Image reconstruction is performed using iterative stochastic algorithms, which are better suited to low-statistics acquisitions. As a result, reconstruction times have become too long for clinical use. To reduce this time, the acquisition data are compressed and accelerated versions of iterative stochastic algorithms, which are generally less accurate, are used. The performance gains obtained by increasing the number of detectors are therefore limited by computation-time constraints. To break out of this loop and allow the use of robust reconstruction algorithms, considerable work has been devoted to accelerating these algorithms on high-performance GPU (Graphics Processing Units) computing devices. In this work, we joined this effort of the scientific community to develop and introduce into clinical use powerful reconstruction algorithms that improve spatial resolution and quantification accuracy in PET. We first worked on developing strategies to accelerate, on GPU devices, the reconstruction of PET images from list-mode acquisition data. The list mode offers many advantages over reconstruction from sinograms; among others, it allows motion correction and time-of-flight (TOF) information to be incorporated easily and accurately to improve quantification accuracy, and it allows spatio-temporal basis functions to be used to perform 4D reconstruction in order to estimate kinetic metabolic parameters accurately. However, the use of this mode is very limited in the clinic, and it is mostly used to estimate the standardized uptake value (SUV), a semi-quantitative measure that limits the functional character of PET. Our contributions are the following: - The development of a new strategy to accelerate, on GPU devices, the 3D LM-OSEM (List-Mode Ordered-Subset Expectation-Maximization) algorithm, including the computation of the sensitivity matrix incorporating the patient attenuation factors and the detector normalization coefficients. The computation time obtained is not only compatible with clinical use of 3D LM-OSEM algorithms, but it also makes it possible to envisage fast reconstructions for advanced PET applications such as real-time dynamic studies and parametric image reconstructions performed directly from the acquisition data. - The development and GPU implementation of the multigrid/multiframe approach to accelerate the LMEM (List-Mode Expectation-Maximization) algorithm. The objective was to develop a new strategy to accelerate the reference LMEM algorithm, which is a convergent and powerful algorithm but has the drawback of converging very slowly. The results obtained point toward near-real-time reconstructions, both for exams using a large amount of acquisition data and for gated dynamic acquisitions. Moreover, in the clinic, quantification is often performed from acquisition data stored as sinograms, which are generally compressed. Previous work has shown that this approach to accelerating reconstruction reduces quantification accuracy and degrades spatial resolution. For this reason, we parallelized and implemented on GPU the AW-LOR-OSEM (Attenuation-Weighted Line-of-Response OSEM) algorithm, a version of the 3D OSEM algorithm that reconstructs from sinograms without data compression and incorporates the attenuation and normalization corrections into the sensitivity matrices. We compared two implementation approaches: in the first, the system matrix (SM) is computed on the fly during reconstruction, while the second implementation uses a precomputed SM with better accuracy. The results show that the first implementation offers a computational efficiency about twice as good as that of the second implementation. The reported reconstruction times are compatible with clinical use of both strategies.
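For readers unfamiliar with the EM family referenced above (LM-OSEM, LMEM, AW-LOR-OSEM), the classical MLEM image update is reproduced below as a generic reminder; the notation follows the usual convention and is not taken from this thesis.

```latex
% Classical MLEM update for PET reconstruction: \lambda_j is the activity in
% voxel j, y_i the measured counts in line of response i, and a_{ij} the
% system-matrix element (probability that an emission in j is detected in i).
\[
  \lambda_j^{(n+1)}
  \;=\;
  \frac{\lambda_j^{(n)}}{\sum_i a_{ij}}
  \sum_i a_{ij}\,
  \frac{y_i}{\sum_k a_{ik}\,\lambda_k^{(n)}}.
\]
% OSEM applies the same update restricted to ordered subsets of the data,
% and the list-mode variants iterate over recorded events instead of bins.
```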
Abstract:
Simulating spiking neural networks is of great interest to scientists wanting to model the functioning of the brain. However, large-scale models are expensive to simulate due to the number and interconnectedness of neurons in the brain. Furthermore, where such simulations are used in an embodied setting, the simulation must be real-time in order to be useful. In this paper we present NeMo, a platform for such simulations which achieves high performance through the use of highly parallel commodity hardware in the form of graphics processing units (GPUs). NeMo makes use of the Izhikevich neuron model which provides a range of realistic spiking dynamics while being computationally efficient. Our GPU kernel can deliver up to 400 million spikes per second. This corresponds to a real-time simulation of around 40 000 neurons under biologically plausible conditions with 1000 synapses per neuron and a mean firing rate of 10 Hz.
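To illustrate why the Izhikevich model maps well to GPUs, here is a hedged CUDA sketch of one simulation step updating many independent neurons in parallel; it uses the standard published Izhikevich equations but is not NeMo's actual kernel, and it omits synaptic current accumulation and spike delivery.

```cuda
// Sketch: one Euler step of the Izhikevich neuron model, one thread per neuron.
// v' = 0.04 v^2 + 5 v + 140 - u + I,  u' = a (b v - u); on spike (v >= 30 mV):
// v <- c, u <- u + d. Synaptic input I is assumed to be precomputed elsewhere.
#include <cuda_runtime.h>

__global__ void izhikevich_step(float* v, float* u, const float* I,
                                const float* a, const float* b,
                                const float* c, const float* d,
                                unsigned* fired, int n, float dt) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;
    float vi = v[i], ui = u[i];
    // Membrane potential and recovery variable update (forward Euler).
    vi += dt * (0.04f * vi * vi + 5.0f * vi + 140.0f - ui + I[i]);
    ui += dt * (a[i] * (b[i] * vi - ui));
    // Spike detection and reset.
    if (vi >= 30.0f) {
        fired[i] = 1;
        vi = c[i];
        ui += d[i];
    } else {
        fired[i] = 0;
    }
    v[i] = vi;
    u[i] = ui;
}

int main() {
    const int n = 4;            // toy network size for illustration
    float *v, *u, *I, *a, *b, *c, *d; unsigned *fired;
    cudaMallocManaged(&v, n * sizeof(float));
    cudaMallocManaged(&u, n * sizeof(float));
    cudaMallocManaged(&I, n * sizeof(float));
    cudaMallocManaged(&a, n * sizeof(float));
    cudaMallocManaged(&b, n * sizeof(float));
    cudaMallocManaged(&c, n * sizeof(float));
    cudaMallocManaged(&d, n * sizeof(float));
    cudaMallocManaged(&fired, n * sizeof(unsigned));
    for (int i = 0; i < n; ++i) {          // regular-spiking parameter set
        v[i] = -65.0f; u[i] = -13.0f; I[i] = 10.0f;
        a[i] = 0.02f; b[i] = 0.2f; c[i] = -65.0f; d[i] = 8.0f;
    }
    for (int t = 0; t < 1000; ++t)         // 1000 steps of 1 ms
        izhikevich_step<<<1, 32>>>(v, u, I, a, b, c, d, fired, n, 1.0f);
    cudaDeviceSynchronize();
    return 0;
}
```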
Abstract:
The movement of graphics and audio programming towards three dimensions is meant to better simulate the way we experience our world. In this project I explored methods for coming closer to such simulation via realistic graphics and sound combined with a natural interface. I did most of my work on a Dell OptiPlex with an 800 MHz Pentium III processor and an NVIDIA GeForce 256 AGP Plus graphics accelerator, high-end products in the consumer market as of April 2000. For graphics, I used OpenGL [1], an open, multi-platform set of graphics libraries that is relatively easy to use, coded in C. The basic engine I first put together was a system to place objects in a scene and to navigate around the scene in real time. Once I accomplished this, I was able to investigate specific techniques for making parts of a scene more appealing.
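As a rough, period-appropriate sketch of such an engine, the following plain C++/OpenGL (GLUT) program places two objects in a scene and lets the user orbit the camera with the keyboard; it is an illustrative reconstruction under stated assumptions, not the project's actual code.

```cpp
// Minimal sketch of a "place objects and navigate" loop in classic OpenGL/GLUT,
// in the spirit of the immediate-mode pipeline available around 2000.
#include <cmath>
#include <GL/glut.h>

static float camAngle = 0.0f;   // camera orbit angle around the scene

void display() {
    glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
    glMatrixMode(GL_MODELVIEW);
    glLoadIdentity();
    // Orbiting camera: eye position derived from camAngle, looking at the origin.
    gluLookAt(5.0 * cos(camAngle), 2.0, 5.0 * sin(camAngle),
              0.0, 0.0, 0.0,
              0.0, 1.0, 0.0);
    // Place a couple of objects in the scene.
    glutWireTeapot(1.0);
    glTranslatef(2.0f, 0.0f, 0.0f);
    glutWireSphere(0.5, 16, 16);
    glutSwapBuffers();
}

void keyboard(unsigned char key, int, int) {
    if (key == 'a') camAngle -= 0.1f;    // simple keyboard navigation
    if (key == 'd') camAngle += 0.1f;
    glutPostRedisplay();
}

int main(int argc, char** argv) {
    glutInit(&argc, argv);
    glutInitDisplayMode(GLUT_DOUBLE | GLUT_RGB | GLUT_DEPTH);
    glutInitWindowSize(640, 480);
    glutCreateWindow("Scene navigation sketch");
    glEnable(GL_DEPTH_TEST);
    glMatrixMode(GL_PROJECTION);
    gluPerspective(60.0, 640.0 / 480.0, 0.1, 100.0);
    glutDisplayFunc(display);
    glutKeyboardFunc(keyboard);
    glutMainLoop();
    return 0;
}
```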
Abstract:
This paper aims to design and develop a control and monitoring system for vending machines, based on a Central Processing Unit with peripheral Internet communication. Coupled to the condom vending machines, a data acquisition module will be connected to the original circuits in order to collect and send, via the Internet, the information to the health government agencies in the form of charts and reports. In view of this, such agencies may analyze these data and compare them with the rates of reduction, in the medium or long term, of STD/AIDS in their respective regions after the implementation of these vending machines, together with the conventional prevention programs. Regarding the methodology, this paper is an explanatory and bibliographic study, with a qualitative-quantitative approach, a deductive method and an indirect documentation research technique. Regarding the results of the tests and simulations, we concluded that the implementation of this system would be equally successful in any other type of dispenser machine.
Abstract:
This paper traces the development of a software tool, based on a combination of artificial neural networks (ANNs) and a few process equations, aiming to serve as a backup operation instrument in the reference generation for real-time controllers of a steel tandem cold mill. By emulating the mathematical model responsible for generating presets under normal operational conditions, the system works as an option to maintain plant operation in the event of a failure in the processing unit that executes the mathematical model. The system, built from production data collected over six years of plant operation, led to the replacement of the former backup operation mode (based on a lookup table), which degraded both product quality and plant productivity. The study showed that ANNs are appropriate tools for the intended purpose and that, with this instrument, it is possible to generate nearly all of the presets needed by this kind of process. The text characterizes the problem, relates the investigated options to solve it, justifies the choice of the ANN approach, describes the methodology and system implementation and, finally, shows and discusses the attained results. (C) 2009 Elsevier Ltd. All rights reserved.
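To make the idea of emulating a preset-generating model with an ANN concrete, below is a minimal C++ sketch of a feed-forward network's forward pass mapping a few normalized mill features to a single preset value; the layer sizes, feature names and weights are invented for illustration and bear no relation to the plant's actual model.

```cpp
// Minimal sketch: forward pass of a small feed-forward network (one hidden
// layer, tanh activation) used to emulate a preset-generating model.
#include <cmath>
#include <cstdio>
#include <vector>

// Dense layer: out = activation(W * in + b). Weights here are placeholders.
std::vector<double> dense(const std::vector<std::vector<double>>& W,
                          const std::vector<double>& b,
                          const std::vector<double>& in, bool useTanh) {
    std::vector<double> out(b);
    for (size_t i = 0; i < W.size(); ++i) {
        for (size_t j = 0; j < in.size(); ++j) out[i] += W[i][j] * in[j];
        if (useTanh) out[i] = std::tanh(out[i]);
    }
    return out;
}

int main() {
    // Hypothetical normalized inputs: entry thickness, exit thickness, hardness.
    std::vector<double> x = {0.4, 0.1, 0.7};
    // Placeholder weights; a real system would load weights fitted to plant data.
    std::vector<std::vector<double>> W1 = {{0.2, -0.5, 0.1},
                                           {0.7,  0.3, -0.2}};
    std::vector<double> b1 = {0.0, 0.1};
    std::vector<std::vector<double>> W2 = {{0.6, -0.4}};
    std::vector<double> b2 = {0.05};

    std::vector<double> h = dense(W1, b1, x, true);     // hidden layer
    std::vector<double> y = dense(W2, b2, h, false);    // linear output: preset
    std::printf("emulated preset (normalized): %.4f\n", y[0]);
    return 0;
}
```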
Abstract:
SAFT techniques are based on the sequential activation, in emission and reception, of the array elements and on the post-processing of all the received signals to compose the image. Thus, image generation can be divided into two stages: (1) the excitation and acquisition stage, where the signals received by each element or group of elements are stored; and (2) the beamforming stage, where the signals are combined to obtain the image pixels. The use of Graphics Processing Units (GPUs), which are programmable devices with a high level of parallelism, can accelerate the computation of the beamforming process, which usually includes functions such as dynamic focusing, band-pass filtering, spatial filtering or envelope detection. This work shows that using GPU technology can accelerate, by more than one order of magnitude with respect to CPU implementations, the beamforming and post-processing algorithms in SAFT imaging. © 2009 IEEE.
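To show the kind of per-pixel parallelism the beamforming stage exposes, here is a hedged CUDA sketch of a monostatic delay-and-sum SAFT beamformer with one thread per image pixel; the array geometry, sampling parameters and naming are assumptions made for this sketch, not the implementation described in the paper.

```cuda
// Sketch: monostatic delay-and-sum SAFT beamforming, one thread per pixel.
// rf holds the recorded A-scans: rf[e * n_samples + k] is sample k of the
// signal emitted and received by array element e.
#include <cuda_runtime.h>
#include <math.h>

__global__ void saft_das(const float* rf, int n_elem, int n_samples,
                         const float* elem_x,          // element x positions [m]
                         float* image, int nx, int nz,
                         float dx, float dz,           // pixel pitch [m]
                         float fs, float c) {          // sampling rate, sound speed
    int ix = blockIdx.x * blockDim.x + threadIdx.x;
    int iz = blockIdx.y * blockDim.y + threadIdx.y;
    if (ix >= nx || iz >= nz) return;

    float x = ix * dx;          // pixel lateral position
    float z = iz * dz;          // pixel depth
    float acc = 0.0f;

    // Coherently sum, for each element, the sample corresponding to the
    // round-trip time of flight from the element to the pixel and back.
    for (int e = 0; e < n_elem; ++e) {
        float lateral = x - elem_x[e];
        float dist = sqrtf(lateral * lateral + z * z);
        int k = (int)(2.0f * dist / c * fs + 0.5f);    // round-trip sample index
        if (k >= 0 && k < n_samples)
            acc += rf[e * n_samples + k];
    }
    image[iz * nx + ix] = acc;
}
```

A 2D launch with, for example, dim3 block(16, 16) and a grid covering the nx-by-nz pixel lattice, plus the usual cudaMalloc/cudaMemcpy host code (as in the likelihood sketch earlier in this listing), completes the picture; band-pass filtering and envelope detection would be additional kernels applied before and after this step.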