54 resultados para Compute Unified Device Architecture(CUDA)
em Consorci de Serveis Universitaris de Catalunya (CSUC), Spain
Resumo:
All-optical label swapping (AOLS) forms a key technology towards the implementation of all-optical packet switching nodes (AOPS) for the future optical Internet. The capital expenditures of the deployment of AOLS increases with the size of the label spaces (i.e. the number of used labels), since a special optical device is needed for each recognized label on every node. Label space sizes are affected by the way in which demands are routed. For instance, while shortest-path routing leads to the usage of fewer labels but high link utilization, minimum interference routing leads to the opposite. This paper studies all-optical label stacking (AOLStack), which is an extension of the AOLS architecture. AOLStack aims at reducing label spaces while easing the compromise with link utilization. In this paper, an integer lineal program is proposed with the objective of analyzing the softening of the aforementioned trade-off due to AOLStack. Furthermore, a heuristic aiming at finding good solutions in polynomial-time is proposed as well. Simulation results show that AOLStack either a) reduces the label spaces with a low increase in the link utilization or, similarly, b) uses better the residual bandwidth to decrease the number of labels even more
Resumo:
Análisis de desarrollo paralelo CUDA en lenguajes Java y Python, utilizando JCuda, RootBeer, PyCuda y Anaconda Accelerate.
Resumo:
This paper analyzes the employment relationship on the basis of the notion of access. We argue that the degree of access provided by a job is an incentive to activate the employee’s self-actualization needs. We investigate the effect of access on the workers’ performance through an agency model and provide a number of propositions with practical implications for personnel policies. Our results are consistent with the intuition emerged from the real business practice as well as with many of the arguments on the substitutive role between monetary and non-monetary incentives frequently reported in the literature.
Resumo:
The purpose of this article is to explain what factors determine the use of seniority-based pay for production workers in Spanish manufacturing industry. The data used in order to achieve these objectives was taken from 774 Spanish industrial plants. The estimation of several ordered probit models enabled us to see that in firms where it was more costly for management to engage in opportunistic behaviour, deferred payment shows a negative relation to the use of devices, like monitoring or incentive payment, designed to align workers’ objectives with those of the company.
Resumo:
Report for the scientific sojourn at the Department of Information Technology (INTEC) at the Ghent University, Belgium, from january to june 2007. All-Optical Label Swapping (AOLS) forms a key technology towards the implementation of All-Optical Packet Switching nodes (AOPS) for the future optical Internet. The capital expenditures of the deployment of AOLS increases with the size of the label spaces (i.e. the number of used labels), since a special optical device is needed for each recognized label on every node. Label space sizes are affected by the wayin which demands are routed. For instance, while shortest-path routing leads to the usage of fewer labels but high link utilization, minimum interference routing leads to the opposite. This project studies and proposes All-Optical Label Stacking (AOLStack), which is an extension of the AOLS architecture. AOLStack aims at reducing label spaces while easing the compromise with link utilization. In this project, an Integer Lineal Program is proposed with the objective of analyzing the softening of the aforementioned trade-off due to AOLStack. Furthermore, a heuristic aiming at finding good solutions in polynomial-time is proposed as well. Simulation results show that AOLStack either a) reduces the label spaces with a low increase in the link utilization or, similarly, b) uses better the residual bandwidth to decrease the number of labels even more.
Resumo:
The demand for computational power has been leading the improvement of the High Performance Computing (HPC) area, generally represented by the use of distributed systems like clusters of computers running parallel applications. In this area, fault tolerance plays an important role in order to provide high availability isolating the application from the faults effects. Performance and availability form an undissociable binomial for some kind of applications. Therefore, the fault tolerant solutions must take into consideration these two constraints when it has been designed. In this dissertation, we present a few side-effects that some fault tolerant solutions may presents when recovering a failed process. These effects may causes degradation of the system, affecting mainly the overall performance and availability. We introduce RADIC-II, a fault tolerant architecture for message passing based on RADIC (Redundant Array of Distributed Independent Fault Tolerance Controllers) architecture. RADIC-II keeps as maximum as possible the RADIC features of transparency, decentralization, flexibility and scalability, incorporating a flexible dynamic redundancy feature, allowing to mitigate or to avoid some recovery side-effects.
Resumo:
Architectural design and deployment of Peer-to-Peer Video-on-Demand (P2PVoD) systems which support VCR functionalities is attracting the interest of an increasing number of research groups within the scientific community; especially due to the intrinsic characteristics of such systems and the benefits that peers could provide at reducing the server load. This work focuses on the performance analysis of a P2P-VoD system considering user behaviors obtained from real traces together with other synthetic user patterns. The experiments performed show that it is feasible to achieve a performance close to the best possible. Future work will consider monitoring the physical characteristics of the network in order to improve the design of different aspects of a VoD system.
Resumo:
In this paper, we develop numerical algorithms that use small requirements of storage and operations for the computation of invariant tori in Hamiltonian systems (exact symplectic maps and Hamiltonian vector fields). The algorithms are based on the parameterization method and follow closely the proof of the KAM theorem given in [LGJV05] and [FLS07]. They essentially consist in solving a functional equation satisfied by the invariant tori by using a Newton method. Using some geometric identities, it is possible to perform a Newton step using little storage and few operations. In this paper we focus on the numerical issues of the algorithms (speed, storage and stability) and we refer to the mentioned papers for the rigorous results. We show how to compute efficiently both maximal invariant tori and whiskered tori, together with the associated invariant stable and unstable manifolds of whiskered tori. Moreover, we present fast algorithms for the iteration of the quasi-periodic cocycles and the computation of the invariant bundles, which is a preliminary step for the computation of invariant whiskered tori. Since quasi-periodic cocycles appear in other contexts, this section may be of independent interest. The numerical methods presented here allow to compute in a unified way primary and secondary invariant KAM tori. Secondary tori are invariant tori which can be contracted to a periodic orbit. We present some preliminary results that ensure that the methods are indeed implementable and fast. We postpone to a future paper optimized implementations and results on the breakdown of invariant tori.
Resumo:
Estudi de l'arquitectura i prestacions del microcontrolador LPC2119 tot implementant la proposta d’un cas pràctic. En la besant teòrica, es fa una anàlisi acurada del dispositiu LPC2119, enumerant les principals característiques i exposant les seves parts, aprofundint sobretot en l’arquitectura i core ARM que incorpora. En l'àmbit pràctic, s'introdueix el problema del pèndul invertit com a proposta per a ser integrada sobre un robot que exploti les funcionalitats del dispositiu integrat presentades a l'estudi teòric.
Resumo:
Given the urgence of a new paradigm in wireless digital trasmission which should allow for higher bit rate, lower latency and tigher delay constaints, it has been proposed to investigate the fundamental building blocks that at the circuital/device level, will boost the change towards a more efficient network architecture, with high capacity, higher bandwidth and a more satisfactory end user experience. At the core of each transciever, there are inherently analog devices capable of providing the carrier signal, the oscillators. It is strongly believed that many limitations in today's communication protocols, could be relieved by permitting high carrier frequency radio transmission, and having some degree of reconfigurability. This led us to studying distributed oscillator architectures which work in the microwave range and possess wideband tuning capability. As microvave oscillators are essentially nonlinear devices, a full nonlinear analyis, synthesis, and optimization had to be considered for their implementation. Consequently, all the most used nonlinear numerical techniques in commercial EDA software had been reviewed. An application of all the aforementioned techniques has been shown, considering a systems of three coupled oscillator ("triple push" oscillator) in which the stability of the various oscillating modes has been studied. Provided that a certain phase distribution is maintained among the oscillating elements, this topology permits a rise in the output power of the third harmonic; nevertheless due to circuit simmetry, "unwanted" oscillating modes coexist with the intenteded one. Starting with the necessary background on distributed amplification and distributed oscillator theory, the design of a four stage reverse mode distributed voltage controlled oscillator (DVCO) using lumped elments has been presented. All the design steps have been reported and for the first time a method for an optimized design with reduced variations in the output power has been presented. Ongoing work is devoted to model a wideband DVCO and to implement a frequency divider.
Resumo:
Construcció d'una aplicació web a partir de les especificacions d'un client imaginari. Estudi i utilització del mètode Rational Unified Process, el més habitual actualment en la construcció de software. Disseny d'una base de dades i implementació del model lògic mitjançant un SGBD punter al mercat com Oracle.
Resumo:
Aquest document presenta l'arquitectura de la màquina virtual Java (Java Virtual Machine, o JVM). Java és un llenguatge de programació cada vegada més extès i, per tant, més utilitzat. El seu àmbit d¿execució s¿estèn actualment per qualsevol plataforma, des de un dispositiu mòbil (telèfon, PDA...) com un mainframe (com per exemple, el model S/390 sota z/OS de IBM).
Resumo:
It is well known that image processing requires a huge amount of computation, mainly at low level processing where the algorithms are dealing with a great number of data-pixel. One of the solutions to estimate motions involves detection of the correspondences between two images. For normalised correlation criteria, previous experiments shown that the result is not altered in presence of nonuniform illumination. Usually, hardware for motion estimation has been limited to simple correlation criteria. The main goal of this paper is to propose a VLSI architecture for motion estimation using a matching criteria more complex than Sum of Absolute Differences (SAD) criteria. Today hardware devices provide many facilities for the integration of more and more complex designs as well as the possibility to easily communicate with general purpose processors
Resumo:
This paper proposes a parallel architecture for estimation of the motion of an underwater robot. It is well known that image processing requires a huge amount of computation, mainly at low-level processing where the algorithms are dealing with a great number of data. In a motion estimation algorithm, correspondences between two images have to be solved at the low level. In the underwater imaging, normalised correlation can be a solution in the presence of non-uniform illumination. Due to its regular processing scheme, parallel implementation of the correspondence problem can be an adequate approach to reduce the computation time. Taking into consideration the complexity of the normalised correlation criteria, a new approach using parallel organisation of every processor from the architecture is proposed