995 results for graphics processing units
Abstract:
Dissertation submitted for the degree of Master in Informatics Engineering
Abstract:
The Graphics Processing Unit (GPU) is present in almost every modern personal computer. Despite their special-purpose design, GPUs have been increasingly used for general computations with very good results. Hence, there is a growing effort from the community to seamlessly integrate this kind of device in everyday computing. However, to fully exploit the potential of a system comprising GPUs and CPUs, these devices should be presented to the programmer as a single platform. The efficient combination of the power of CPU and GPU devices is highly dependent on each device’s characteristics, resulting in platform-specific applications that cannot be ported to different systems. Also, the most efficient work balance among devices is highly dependent on the computations to be performed and the respective data sizes. In this work, we propose a solution for heterogeneous environments based on the abstraction level provided by algorithmic skeletons. Our goal is to take full advantage of the power of all CPU and GPU devices present in a system, without the need for different kernel implementations or explicit work distribution. To that end, we extended Marrow, an algorithmic skeleton framework for multi-GPU systems, to support CPU computations and efficiently balance the workload between devices. Our approach is based on an offline training execution that identifies the ideal work balance and platform configurations for a given application and input data size. The evaluation of this work shows that the combination of CPU and GPU devices can significantly boost the performance of our benchmarks in the tested environments, when compared to GPU-only executions.
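The offline training idea described in this abstract can be illustrated with a minimal Python sketch. It is not Marrow's implementation: the cost model, the device throughputs, and the names `modeled_time` and `train_split` are all hypothetical, chosen only to show how a training pass might pick the CPU/GPU work split with the lowest measured time.

```python
from typing import Callable, Iterable, Tuple


def modeled_time(gpu_fraction: float, work_items: int,
                 cpu_rate: float, gpu_rate: float) -> float:
    """Hypothetical cost model: both devices run concurrently, so the
    elapsed time is that of whichever device finishes last."""
    gpu_items = work_items * gpu_fraction
    cpu_items = work_items - gpu_items
    return max(cpu_items / cpu_rate, gpu_items / gpu_rate)


def train_split(measure: Callable[[float], float],
                candidates: Iterable[float]) -> Tuple[float, float]:
    """Offline training pass: try each candidate GPU fraction once and
    keep the one with the lowest measured time."""
    best_fraction, best_time = None, float("inf")
    for fraction in candidates:
        elapsed = measure(fraction)
        if elapsed < best_time:
            best_fraction, best_time = fraction, elapsed
    return best_fraction, best_time


# Assumed throughputs (items per second); real values would come from
# timing the actual application on the actual devices.
CPU_RATE, GPU_RATE = 2.0e6, 6.0e6
WORK_ITEMS = 12_000_000
candidates = [i / 20 for i in range(21)]  # 0.00, 0.05, ..., 1.00

best_fraction, best_time = train_split(
    lambda f: modeled_time(f, WORK_ITEMS, CPU_RATE, GPU_RATE), candidates)
gpu_only_time = modeled_time(1.0, WORK_ITEMS, CPU_RATE, GPU_RATE)
```

Under these assumed rates the best split sends three quarters of the work to the GPU, and the combined run is faster than the GPU-only one. In a real system `measure` would execute and time the application, and the result would be stored per application and input size.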
Abstract:
Pig meat production was valued at €290 (£198) million at farm gate in the Republic of Ireland (ROI) in 2007. In Northern Ireland (NI) in 2006, pig meat was estimated to account for almost seven percent of gross turnover in the food and drinks processing sector at £190 (€280) million. While researching this report, it emerged that comparable figures for the value of the pig meat industry in ROI and NI are not available. This report showed that pig production on the IOI has changed from a small-scale enterprise carried out by a large number of mixed farmers to a modern industry made up of a small number of specialist producers operating large-scale units. Most products for retailers are prepared and packed in specialised cutting and processing units, which may or may not be integrated in the slaughter plant. For some pork products, additives such as salt, herbs and flavour enhancers are added. Pork products are then stored and transported, frozen or chilled, to wholesale, retail and catering facilities for ultimate sale to consumers.
Abstract:
The Computational Biophysics Group at the Universitat Pompeu Fabra (GRIB-UPF) hosts two unique computational resources dedicated to the execution of large-scale molecular dynamics (MD) simulations: (a) the ACMD molecular-dynamics software, used on standard personal computers with graphical processing units (GPUs); and (b) the GPUGRID.net computing network, supported by users distributed worldwide who volunteer GPUs for biomedical research. We leveraged these resources and developed studies, protocols and open-source software to elucidate the energetics and pathways of a number of biomolecular systems, with a special focus on flexible proteins with many degrees of freedom. First, we characterized ion permeation through the bactericidal model protein Gramicidin A, conducting one of the largest studies to date with the steered MD biasing methodology. Next, we addressed an open problem in structural biology, the determination of drug-protein association kinetics; we reconstructed the binding free energy and the association and dissociation rates of a drug-like model system through a spatial decomposition and a Markov-chain analysis. The work was published in the Proceedings of the National Academy of Sciences and became one of the few landmark papers elucidating a ligand-binding pathway. Furthermore, we investigated the unstructured Kinase Inducible Domain (KID), a 28-peptide central to signalling and transcriptional response; the kinetics of this challenging system were modelled with a Markovian approach in collaboration with Frank Noé’s group at the Freie Universität Berlin. The impact of the funding includes three peer-reviewed publications in high-impact journals; three more papers under review; four MD analysis components, released as open-source software; MD protocols; didactic material; and code for the hosting group.
Abstract:
This thesis examines the signal processor families of three different manufacturers. The aim is to study the technical suitability of the processors for a frequency converter product family that is under design. The first part of the thesis goes through the structure of a frequency converter and describes the most common control methods for induction motors. The thesis also explains the operation of a signal processor and its integrated peripherals. The emphasis of the work is on comparing the technical characteristics of the processors. The comparison covers, among other things, the internal architecture of the processors, the properties of their instruction sets, the time taken to service interrupts, and the properties of the peripherals. Correct behaviour of the peripherals, particularly the analog-to-digital converter, is important for the motor control software. The processor families included in the thesis are scored on the properties examined. As a result of the comparison, the processor family and processor type that are technically best suited to the intended purpose are presented. However, the thesis cannot give a general ranking of the processors studied.
Abstract:
The key information processing units within gene regulatory networks are enhancers. Enhancer activity is associated with the production of tissue-specific noncoding RNAs, yet the existence of such transcripts during cardiac development has not been established. Using an integrated genomic approach, we demonstrate that fetal cardiac enhancers generate long noncoding RNAs (lncRNAs) during cardiac differentiation and morphogenesis. Enhancer expression correlates with the emergence of active enhancer chromatin states, the initiation of RNA polymerase II at enhancer loci and expression of target genes. Orthologous human sequences are also transcribed in fetal human hearts and cardiac progenitor cells. Through a systematic bioinformatic analysis, we identified and characterized, for the first time, a catalog of lncRNAs that are expressed during embryonic stem cell differentiation into cardiomyocytes and associated with active cardiac enhancer sequences. RNA-sequencing demonstrates that many of these transcripts are polyadenylated, multi-exonic long noncoding RNAs. Moreover, knockdown of two enhancer-associated lncRNAs resulted in the specific downregulation of their predicted target genes. Interestingly, the reactivation of the fetal gene program, a hallmark of the stress response in the adult heart, is accompanied by increased expression of fetal cardiac enhancer transcripts. Altogether, these findings demonstrate that the activity of cardiac enhancers and expression of their target genes are associated with the production of enhancer-derived lncRNAs.
Abstract:
Biofilms constitute a physical barrier, protecting the encased bacteria from detergents and sanitizers. The objective of this work was to analyze the effectiveness of sodium hypochlorite (NaOCl) against strains of Staphylococcus aureus isolated from raw milk of cows with subclinical mastitis and Staphylococcus aureus isolated from the milking environment (blowers and milk conducting tubes). The results revealed that, in the presence of NaOCl (150 ppm), the number of adhered cells of the twelve S. aureus strains was significantly reduced. When the same strains were evaluated in biofilm condition, different results were obtained. After a contact period of five minutes with NaOCl (150 ppm), four strains (two from milk, one from the blowers and one from a conductive rubber) were still able to grow. However, with increasing contact time between the bacteria and the NaOCl (150 ppm), no growth was detected for any of the strains. Concerning the efficiency of NaOCl on total biofilm biomass formation by each S. aureus strain, a decrease was observed when these strains were in contact with 150 ppm NaOCl for a total period of 10 minutes. This study highlights the importance of a correct sanitation protocol for all milk processing units, which can significantly reduce the presence of microorganisms, leading to a decrease in cow mastitis and milk contamination.
Abstract:
With the shift towards many-core computer architectures, dataflow programming has been proposed as one potential solution for producing software that scales to a varying number of processor cores. Programming for parallel architectures is considered difficult, as the current popular programming languages are inherently sequential and introducing parallelism is typically up to the programmer. Dataflow, however, is inherently parallel, describing an application as a directed graph, where nodes represent calculations and edges represent a data dependency in the form of a queue. These queues are the only allowed communication between the nodes, making the dependencies between the nodes explicit and thereby also the parallelism. Once a node has sufficient inputs available, it can, independently of any other node, perform calculations, consume inputs, and produce outputs. Dataflow models have existed for several decades and have become popular for describing signal processing applications, as the graph representation is a very natural representation within this field. Digital filters are typically described with boxes and arrows, even in textbooks. Dataflow is also becoming more interesting in other domains, and in principle, any application working on an information stream fits the dataflow paradigm. Such applications are, among others, network protocols, cryptography, and multimedia applications. As an example, the MPEG group standardized a dataflow language called RVC-CAL to be used within reconfigurable video coding. Describing a video coder as a dataflow network, instead of with conventional programming languages, makes the coder more readable, as it describes how the video data flows through the different coding tools. While dataflow provides an intuitive representation for many applications, it also introduces some new problems that need to be solved in order for dataflow to be more widely used.
The explicit parallelism of a dataflow program is descriptive and enables an improved utilization of available processing units; however, the independence of the nodes also implies that some kind of scheduling is required. The need for efficient scheduling becomes even more evident when the number of nodes is larger than the number of processing units and several nodes are running concurrently on one processor core. There exist several dataflow models of computation, with different trade-offs between expressiveness and analyzability. These vary from rather restricted but statically schedulable, with minimal scheduling overhead, to dynamic, where each firing requires a firing rule to be evaluated. The model used in this work, namely RVC-CAL, is a very expressive language, and in the general case it requires dynamic scheduling; however, the strong encapsulation of dataflow nodes enables analysis, and the scheduling overhead can be reduced by using quasi-static, or piecewise static, scheduling techniques. The scheduling problem is concerned with finding the few scheduling decisions that must be made at run-time, while most decisions are pre-calculated. The result is then a set of static schedules, as small as possible, that are dynamically scheduled. To identify these dynamic decisions and to find the concrete schedules, this thesis shows how quasi-static scheduling can be represented as a model checking problem. This involves identifying the relevant information to generate a minimal but complete model to be used for model checking. The model must describe everything that may affect scheduling of the application while omitting everything else in order to avoid state space explosion. This kind of simplification is necessary to make the state space analysis feasible. For the model checker to find the actual schedules, a set of scheduling strategies is defined, which are able to produce quasi-static schedulers for a wide range of applications.
The results of this work show that actor composition with quasi-static scheduling can be used to transform dataflow programs to fit many different computer architectures with different types and numbers of cores. This, in turn, enables dataflow to provide a more platform-independent representation, as one application can be fitted to a specific processor architecture without changing the actual program representation. Instead, the program representation is, in the context of design space exploration, optimized by the development tools to fit the target platform. This work focuses on representing the dataflow scheduling problem as a model checking problem and is implemented as part of a compiler infrastructure. The thesis also presents experimental results as evidence of the usefulness of the approach.
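The dataflow model described in this abstract (nodes that communicate only through queues and fire once their inputs are available) can be sketched in a few lines of Python. This is an illustration only, not RVC-CAL and not the thesis's scheduler: the `Actor` class, the `run` function, and the two-actor pipeline are hypothetical, and the scheduler is a naive dynamic one that evaluates every firing rule on every pass.

```python
from collections import deque


class Actor:
    """A dataflow node: fires only when every input queue holds a token."""

    def __init__(self, name, func, inputs, outputs):
        self.name = name
        self.func = func        # maps one token per input to one output token
        self.inputs = inputs    # list of deques read by this actor
        self.outputs = outputs  # list of deques written by this actor

    def can_fire(self):
        # Firing rule: one token available on each input queue.
        return all(len(q) > 0 for q in self.inputs)

    def fire(self):
        args = [q.popleft() for q in self.inputs]  # consume inputs
        result = self.func(*args)
        for q in self.outputs:                     # produce outputs
            q.append(result)


def run(actors):
    """Naive dynamic scheduler: keep firing enabled actors until none is."""
    trace = []
    progress = True
    while progress:
        progress = False
        for actor in actors:
            if actor.can_fire():
                actor.fire()
                trace.append(actor.name)
                progress = True
    return trace


# Pipeline: samples -> scale -> scaled -> offset -> result
samples, scaled, result = deque([1, 2, 3]), deque(), deque()
scale = Actor("scale", lambda x: 2 * x, [samples], [scaled])
offset = Actor("offset", lambda x: x + 1, [scaled], [result])

# Listing "offset" first shows that firing order follows the data, not the list.
trace = run([offset, scale])
```

The recorded trace repeats the pattern "scale, offset". A quasi-static scheduler, as the abstract describes it, would aim to precompute such repeating sequences and leave only the unavoidable decisions to run-time.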
Abstract:
Salted lamb meat blanket, produced by boning, salting, and drying a whole lamb carcass, was studied to obtain information supporting the search for guarantees of origin for this typical regional product from the city of Petrolina, Pernambuco, Brazil. Data were obtained from three processing units, where the use of a traditional local technology was observed. It relies on salting, an ancient preservation method, but with a peculiar boning technique, resulting in a meat product with great potential for exploitation in the form of meat blanket. Based on the values of pH (6.22 ± 0.22), water activity (0.97 ± 0.02), and moisture (69.86 ± 2.26), lamb meat blanket is considered a perishable product and consequently requires other preservation methods combined with salt. This, along with the results of the microbiological analyses (absence of Salmonella sp., <10 MPN/g of halophilic bacteria, total coliforms between 6.7 × 10³ and 5.2 × 10⁶ CFU/g, and Staphylococcus from 8.1 × 10³ CFU/g to uncountable), reinforces the need for hygienic practices to ensure product safety. These results, together with the product's notoriety and the organization of the sector, are important factors in achieving a Geographical Indication for the Salted Lamb Meat Blanket of Petrolina.
Abstract:
The source code of the library that was developed accompanies this deposit, in the state it was in at that time. A more up-to-date version can be found on GitHub (http://github.com/abergeron).
Abstract:
The forms of natural rubber studied were sheet [RSS 4 and RSS 5], ISNR 20 and EBC. For the latter two forms, samples from the estate and non-estate sectors were included. The samples were collected from different locations at specified intervals, for a particular period. The effect of the extent of mastication on raw rubber properties, as well as on the properties of the compounds and vulcanizates, was also studied. The consistency in raw rubber properties and the breakdown behavior of skim rubber were studied by collecting samples periodically from selected processing units. The effect of incorporating skim rubber into ISNR 20 has also been investigated.
Abstract:
The authors compare the performance of two types of controllers: one based on the multi-layered network and the other based on the single-layered CMAC network (cerebellar model articulation controller). The neurons (information processing units) in the multi-layered network use Gaussian activation functions. The control scheme considered is a predictive control algorithm, along the lines used by Willis et al. (1991) and Kambhampati and Warwick (1991). The process selected as a test bed is a continuous stirred tank reactor. The reaction taking place is an irreversible exothermic reaction in a constant-volume reactor cooled by a single coolant stream. This reactor is a simplified version of the first tank in the two-tank system given by Henson and Seborg (1989).
Abstract:
The functional networks of cultured neurons exhibit complex network properties similar to those found in vivo. Starting from random seeding, cultures undergo significant reorganization during the initial period in vitro, yet despite providing an ideal platform for observing developmental changes in neuronal connectivity, little is known about how a complex functional network evolves from isolated neurons. In the present study, evolution of functional connectivity was estimated from correlations of spontaneous activity. Network properties were quantified using complex measures from graph theory and used to compare cultures at different stages of development during the first 5 weeks in vitro. Networks obtained from young cultures (14 days in vitro) exhibited a random topology, which evolved to a small-world topology during maturation. The topology change was accompanied by an increased presence of highly connected areas (hubs) and network efficiency increased with age. The small-world topology balances integration of network areas with segregation of specialized processing units. The emergence of such network structure in cultured neurons, despite a lack of external input, points to complex intrinsic biological mechanisms. Moreover, the functional network of cultures at mature ages is efficient and highly suited to complex processing tasks.
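The balance between integration and segregation that characterizes a small-world topology is commonly quantified with two graph measures: the mean clustering coefficient and the mean shortest-path length. The Python sketch below computes both on a toy graph. The abstract does not say which measures or tools the study used, so the choice of measures, the function names, and the 20-node example are assumptions made for illustration.

```python
from collections import deque
from itertools import combinations


def ring_lattice(n, k):
    """Each node links to its k nearest neighbours on each side."""
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for step in range(1, k + 1):
            j = (i + step) % n
            adj[i].add(j)
            adj[j].add(i)
    return adj


def clustering(adj):
    """Mean fraction of a node's neighbour pairs that are themselves linked."""
    total = 0.0
    for nbrs in adj.values():
        pairs = list(combinations(nbrs, 2))
        if pairs:
            total += sum(1 for a, b in pairs if b in adj[a]) / len(pairs)
    return total / len(adj)


def mean_path_length(adj):
    """Average shortest-path length over all reachable node pairs (BFS)."""
    total, count = 0, 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        count += len(dist) - 1
    return total / count


# Regular lattice: high clustering, long paths.
lattice = ring_lattice(20, 2)

# Adding two long-range shortcuts (chosen by hand here) shortens paths
# while leaving most of the local clustering intact.
small_world = {node: set(nbrs) for node, nbrs in lattice.items()}
for a, b in [(0, 10), (5, 15)]:
    small_world[a].add(b)
    small_world[b].add(a)

lattice_c, lattice_l = clustering(lattice), mean_path_length(lattice)
sw_c, sw_l = clustering(small_world), mean_path_length(small_world)
```

For a functional network estimated from recordings, the adjacency structure would instead come from thresholded correlations of spontaneous activity, and the measures would usually be compared against those of random graphs of the same size and density.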
Abstract:
Empirical mode decomposition (EMD) is a data-driven method used to decompose data into oscillatory components. This paper examines to what extent the defined algorithm for EMD might be susceptible to data format. Two key issues with EMD are its stability and computational speed. This paper shows that, for a given signal, there is no significant difference between results obtained with single (binary32) and double (binary64) floating-point precision. This implies that there is no benefit in increasing floating-point precision when performing EMD on devices optimised for the single-precision floating-point format, such as graphical processing units (GPUs).
Abstract:
Large-scale simulations of parts of the brain using detailed neuronal models to improve our understanding of brain functions are becoming a reality with the use of supercomputers and large clusters. However, the high acquisition and maintenance cost of these computers, including the physical space, air conditioning, and electrical power, limits the number of simulations of this kind that scientists can perform. Modern commodity graphics cards, based on the CUDA platform, contain graphical processing units (GPUs) composed of hundreds of processors that can simultaneously execute thousands of threads and thus constitute a low-cost solution for many high-performance computing applications. In this work, we present a CUDA algorithm that enables the execution, on multiple GPUs, of simulations of large-scale networks composed of biologically realistic Hodgkin-Huxley neurons. The algorithm represents each neuron as a CUDA thread, which solves the set of coupled differential equations that model each neuron. Communication among neurons located in different GPUs is coordinated by the CPU. We obtained speedups of 40 for the simulation of 200k neurons that received random external input and speedups of 9 for a network with 200k neurons and 20M neuronal connections, in a single computer with two graphics boards with two GPUs each, when compared with a modern quad-core CPU. Copyright (C) 2010 John Wiley & Sons, Ltd.
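The per-neuron work that each thread performs in such a simulation can be sketched in plain Python. This is not the paper's CUDA code: it assumes the standard textbook Hodgkin-Huxley parameters and a forward Euler integrator (the abstract specifies neither), omits synaptic connections, and uses an ordinary loop over neurons as a stand-in for one thread per neuron. All function names are hypothetical.

```python
import math

# Standard Hodgkin-Huxley squid-axon parameters (assumed; the abstract gives none).
C_M = 1.0                              # membrane capacitance, uF/cm^2
G_NA, G_K, G_L = 120.0, 36.0, 0.3      # maximal conductances, mS/cm^2
E_NA, E_K, E_L = 50.0, -77.0, -54.387  # reversal potentials, mV


def _ratio(x, y):
    """x / (1 - exp(-x / y)), with the removable singularity at x = 0 handled."""
    if abs(x / y) < 1e-6:
        return y * (1.0 + x / (2.0 * y))
    return x / (1.0 - math.exp(-x / y))


def rates(v):
    """Voltage-dependent opening/closing rates of the m, h and n gates (1/ms)."""
    a_m = 0.1 * _ratio(v + 40.0, 10.0)
    b_m = 4.0 * math.exp(-(v + 65.0) / 18.0)
    a_h = 0.07 * math.exp(-(v + 65.0) / 20.0)
    b_h = 1.0 / (1.0 + math.exp(-(v + 35.0) / 10.0))
    a_n = 0.01 * _ratio(v + 55.0, 10.0)
    b_n = 0.125 * math.exp(-(v + 65.0) / 80.0)
    return a_m, b_m, a_h, b_h, a_n, b_n


def resting_state(v=-65.0):
    """Membrane potential plus gating variables at their steady-state values."""
    a_m, b_m, a_h, b_h, a_n, b_n = rates(v)
    return (v, a_m / (a_m + b_m), a_h / (a_h + b_h), a_n / (a_n + b_n))


def step(state, i_ext, dt):
    """One forward Euler step of the four coupled equations for one neuron."""
    v, m, h, n = state
    a_m, b_m, a_h, b_h, a_n, b_n = rates(v)
    i_ion = (G_NA * m**3 * h * (v - E_NA)
             + G_K * n**4 * (v - E_K)
             + G_L * (v - E_L))
    return (v + dt * (i_ext - i_ion) / C_M,
            m + dt * (a_m * (1.0 - m) - b_m * m),
            h + dt * (a_h * (1.0 - h) - b_h * h),
            n + dt * (a_n * (1.0 - n) - b_n * n))


def simulate(currents, t_stop=50.0, dt=0.01):
    """Advance every neuron; the inner loop plays the role of the GPU threads.
    Returns the (min, max) membrane potential reached by each neuron."""
    states = [resting_state() for _ in currents]
    extremes = [(s[0], s[0]) for s in states]
    for _ in range(int(round(t_stop / dt))):
        for i, i_ext in enumerate(currents):
            states[i] = step(states[i], i_ext, dt)
            lo, hi = extremes[i]
            extremes[i] = (min(lo, states[i][0]), max(hi, states[i][0]))
    return extremes


# One neuron driven with 10 uA/cm^2, one left at rest.
driven, quiet = simulate([10.0, 0.0])
```

Because each neuron's update depends only on its own state and input current, the inner loop is trivially parallel, which is what makes the one-thread-per-neuron mapping natural. Coupling through synapses, which the paper does handle, would add a communication step between time steps.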