892 resultados para Parallel computing, Virtual machine, Composition, Determinism, Abstraction
Resumo:
Artificial neural networks are usually applied to solve complex problems. In problems with more complexity, by increasing the number of layers and neurons, it is possible to achieve greater functional efficiency. Nevertheless, this leads to a greater computational effort. The response time is an important factor in the decision to use neural networks in some systems. Many argue that the computational cost is higher in the training period. However, this phase is held only once. Once the network trained, it is necessary to use the existing computational resources efficiently. In the multicore era, the problem boils down to efficient use of all available processing cores. However, it is necessary to consider the overhead of parallel computing. In this sense, this paper proposes a modular structure that proved to be more suitable for parallel implementations. It is proposed to parallelize the feedforward process of an RNA-type MLP, implemented with OpenMP on a shared memory computer architecture. The research consistes on testing and analizing execution times. Speedup, efficiency and parallel scalability are analyzed. In the proposed approach, by reducing the number of connections between remote neurons, the response time of the network decreases and, consequently, so does the total execution time. The time required for communication and synchronization is directly linked to the number of remote neurons in the network, and so it is necessary to investigate which one is the best distribution of remote connections
Resumo:
The increasing demand for high performance wireless communication systems has shown the inefficiency of the current model of fixed allocation of the radio spectrum. In this context, cognitive radio appears as a more efficient alternative, by providing opportunistic spectrum access, with the maximum bandwidth possible. To ensure these requirements, it is necessary that the transmitter identify opportunities for transmission and the receiver recognizes the parameters defined for the communication signal. The techniques that use cyclostationary analysis can be applied to problems in either spectrum sensing and modulation classification, even in low signal-to-noise ratio (SNR) environments. However, despite the robustness, one of the main disadvantages of cyclostationarity is the high computational cost for calculating its functions. This work proposes efficient architectures for obtaining cyclostationary features to be employed in either spectrum sensing and automatic modulation classification (AMC). In the context of spectrum sensing, a parallelized algorithm for extracting cyclostationary features of communication signals is presented. The performance of this features extractor parallelization is evaluated by speedup and parallel eficiency metrics. The architecture for spectrum sensing is analyzed for several configuration of false alarm probability, SNR levels and observation time for BPSK and QPSK modulations. In the context of AMC, the reduced alpha-profile is proposed as as a cyclostationary signature calculated for a reduced cyclic frequencies set. This signature is validated by a modulation classification architecture based on pattern matching. The architecture for AMC is investigated for correct classification rates of AM, BPSK, QPSK, MSK and FSK modulations, considering several scenarios of observation length and SNR levels. The numerical results of performance obtained in this work show the eficiency of the proposed architectures
Resumo:
With the advance of the Cloud Computing paradigm, a single service offered by a cloud platform may not be enough to meet all the application requirements. To fulfill such requirements, it may be necessary, instead of a single service, a composition of services that aggregates services provided by different cloud platforms. In order to generate aggregated value for the user, this composition of services provided by several Cloud Computing platforms requires a solution in terms of platforms integration, which encompasses the manipulation of a wide number of noninteroperable APIs and protocols from different platform vendors. In this scenario, this work presents Cloud Integrator, a middleware platform for composing services provided by different Cloud Computing platforms. Besides providing an environment that facilitates the development and execution of applications that use such services, Cloud Integrator works as a mediator by providing mechanisms for building applications through composition and selection of semantic Web services that take into account metadata about the services, such as QoS (Quality of Service), prices, etc. Moreover, the proposed middleware platform provides an adaptation mechanism that can be triggered in case of failure or quality degradation of one or more services used by the running application in order to ensure its quality and availability. In this work, through a case study that consists of an application that use services provided by different cloud platforms, Cloud Integrator is evaluated in terms of the efficiency of the performed service composition, selection and adaptation processes, as well as the potential of using this middleware in heterogeneous computational clouds scenarios
Resumo:
This paper presents vectorized methods of construction and descent of quadtrees that can be easily adapted to message passing parallel computing. A time complexity analysis for the present approach is also discussed. The proposed method of tree construction requires a hash table to index nodes of a linear quadtree in the breadth-first order. The hash is performed in two steps: an internal hash to index child nodes and an external hash to index nodes in the same level (depth). The quadtree descent is performed by considering each level as a vector segment of a linear quadtree, so that nodes of the same level can be processed concurrently. © 2012 Springer-Verlag.
Resumo:
Transactional memory (TM) is a new synchronization mechanism devised to simplify parallel programming, thereby helping programmers to unleash the power of current multicore processors. Although software implementations of TM (STM) have been extensively analyzed in terms of runtime performance, little attention has been paid to an equally important constraint faced by nearly all computer systems: energy consumption. In this work we conduct a comprehensive study of energy and runtime tradeoff sin software transactional memory systems. We characterize the behavior of three state-of-the-art lock-based STM algorithms, along with three different conflict resolution schemes. As a result of this characterization, we propose a DVFS-based technique that can be integrated into the resolution policies so as to improve the energy-delay product (EDP). Experimental results show that our DVFS-enhanced policies are indeed beneficial for applications with high contention levels. Improvements of up to 59% in EDP can be observed in this scenario, with an average EDP reduction of 16% across the STAMP workloads. © 2012 IEEE.
Resumo:
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)
Resumo:
Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)
Resumo:
Pós-graduação em Ciência da Computação - IBILCE
Resumo:
Pós-graduação em Ciência da Computação - IBILCE
Resumo:
Huge image collections are becoming available lately. In this scenario, the use of Content-Based Image Retrieval (CBIR) systems has emerged as a promising approach to support image searches. The objective of CBIR systems is to retrieve the most similar images in a collection, given a query image, by taking into account image visual properties such as texture, color, and shape. In these systems, the effectiveness of the retrieval process depends heavily on the accuracy of ranking approaches. Recently, re-ranking approaches have been proposed to improve the effectiveness of CBIR systems by taking into account the relationships among images. The re-ranking approaches consider the relationships among all images in a given dataset. These approaches typically demands a huge amount of computational power, which hampers its use in practical situations. On the other hand, these methods can be massively parallelized. In this paper, we propose to speedup the computation of the RL-Sim algorithm, a recently proposed image re-ranking approach, by using the computational power of Graphics Processing Units (GPU). GPUs are emerging as relatively inexpensive parallel processors that are becoming available on a wide range of computer systems. We address the image re-ranking performance challenges by proposing a parallel solution designed to fit the computational model of GPUs. We conducted an experimental evaluation considering different implementations and devices. Experimental results demonstrate that significant performance gains can be obtained. Our approach achieves speedups of 7x from serial implementation considering the overall algorithm and up to 36x on its core steps.
Resumo:
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)
Resumo:
Pós-graduação em Ciência da Computação - IBILCE
Resumo:
Pós-graduação em Biofísica Molecular - IBILCE
Resumo:
Esperimenti sulla virtualizzazione del laboratorio informatico della facoltà, che attraverso migrazioni di virtual machine, consentirebbe il risparmio energetico grazie allo spegnimento di alcune macchine fisiche.
Resumo:
Generic programming is likely to become a new challenge for a critical mass of developers. Therefore, it is crucial to refine the support for generic programming in mainstream Object-Oriented languages — both at the design and at the implementation level — as well as to suggest novel ways to exploit the additional degree of expressiveness made available by genericity. This study is meant to provide a contribution towards bringing Java genericity to a more mature stage with respect to mainstream programming practice, by increasing the effectiveness of its implementation, and by revealing its full expressive power in real world scenario. With respect to the current research setting, the main contribution of the thesis is twofold. First, we propose a revised implementation for Java generics that greatly increases the expressiveness of the Java platform by adding reification support for generic types. Secondly, we show how Java genericity can be leveraged in a real world case-study in the context of the multi-paradigm language integration. Several approaches have been proposed in order to overcome the lack of reification of generic types in the Java programming language. Existing approaches tackle the problem of reification of generic types by defining new translation techniques which would allow for a runtime representation of generics and wildcards. Unfortunately most approaches suffer from several problems: heterogeneous translations are known to be problematic when considering reification of generic methods and wildcards. On the other hand, more sophisticated techniques requiring changes in the Java runtime, supports reified generics through a true language extension (where clauses) so that backward compatibility is compromised. In this thesis we develop a sophisticated type-passing technique for addressing the problem of reification of generic types in the Java programming language; this approach — first pioneered by the so called EGO translator — is here turned into a full-blown solution which reifies generic types inside the Java Virtual Machine (JVM) itself, thus overcoming both performance penalties and compatibility issues of the original EGO translator. Java-Prolog integration Integrating Object-Oriented and declarative programming has been the subject of several researches and corresponding technologies. Such proposals come in two flavours, either attempting at joining the two paradigms, or simply providing an interface library for accessing Prolog declarative features from a mainstream Object-Oriented languages such as Java. Both solutions have however drawbacks: in the case of hybrid languages featuring both Object-Oriented and logic traits, such resulting language is typically too complex, thus making mainstream application development an harder task; in the case of library-based integration approaches there is no true language integration, and some “boilerplate code” has to be implemented to fix the paradigm mismatch. In this thesis we develop a framework called PatJ which promotes seamless exploitation of Prolog programming in Java. A sophisticated usage of generics/wildcards allows to define a precise mapping between Object-Oriented and declarative features. PatJ defines a hierarchy of classes where the bidirectional semantics of Prolog terms is modelled directly at the level of the Java generic type-system.