924 results for Algorithmic skeleton frameworks
Abstract:
Dissertation submitted for the degree of Master in Informatics Engineering
Abstract:
The Graphics Processing Unit (GPU) is present in almost every modern personal computer. Despite their special-purpose design, GPUs have been increasingly used for general computations with very good results. Hence, there is a growing effort from the community to seamlessly integrate this kind of device into everyday computing. However, to fully exploit the potential of a system comprising GPUs and CPUs, these devices should be presented to the programmer as a single platform. The efficient combination of the power of CPU and GPU devices is highly dependent on each device's characteristics, resulting in platform-specific applications that cannot be ported to different systems. Also, the most efficient work balance among devices is highly dependent on the computations to be performed and the respective data sizes. In this work, we propose a solution for heterogeneous environments based on the abstraction level provided by algorithmic skeletons. Our goal is to take full advantage of the power of all CPU and GPU devices present in a system, without the need for different kernel implementations or explicit work distribution. To that end, we extended Marrow, an algorithmic skeleton framework for multi-GPU systems, to support CPU computations and efficiently balance the workload between devices. Our approach is based on an offline training execution that identifies the ideal work balance and platform configurations for a given application and input data size. The evaluation of this work shows that the combination of CPU and GPU devices can significantly boost the performance of our benchmarks in the tested environments, when compared to GPU-only executions.
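As a rough illustration of the offline-training idea described in this abstract, the sketch below searches a set of candidate CPU/GPU splits and keeps the one with the lowest estimated run time. It is not Marrow's implementation: the function names, the throughput parameters, and the simple cost model (both devices run concurrently, so the slower one determines the total time) are all assumptions made for this example.

```python
def estimated_time(n_items, gpu_fraction, cpu_rate, gpu_rate):
    """Hypothetical cost model: CPU and GPU process their shares concurrently,
    so the total time is the time of whichever device finishes last."""
    cpu_time = n_items * (1.0 - gpu_fraction) / cpu_rate
    gpu_time = n_items * gpu_fraction / gpu_rate
    return max(cpu_time, gpu_time)


def train_split(n_items, cpu_rate, gpu_rate, steps=20):
    """Offline 'training' pass: try gpu_fraction = 0, 1/steps, ..., 1
    and return the fraction with the smallest estimated time."""
    candidates = [i / steps for i in range(steps + 1)]
    return min(
        candidates,
        key=lambda f: estimated_time(n_items, f, cpu_rate, gpu_rate),
    )


if __name__ == "__main__":
    # Assumed throughputs: the GPU is three times faster than the CPU.
    best = train_split(n_items=1000, cpu_rate=1.0, gpu_rate=3.0)
    print(f"GPU share of the work: {best:.2f}")
```

In a real setting the cost model would be replaced by timed runs of the application on representative inputs, and the chosen split would be stored per application and input size, as the abstract suggests.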
Abstract:
The Intel® Xeon Phi™ is the first processor based on Intel's MIC (Many Integrated Cores) architecture. It is a co-processor specially tailored for data-parallel computations, whose basic architectural design is similar to that of GPUs (Graphics Processing Units), relying on many integrated cores of low computational power to perform parallel computations. The main novelty of the MIC architecture, relative to GPUs, is its compatibility with the Intel x86 architecture. This enables the use of many of the tools commonly available for the parallel programming of x86-based architectures, which may lead to a gentler learning curve. However, programming the Xeon Phi still entails aspects intrinsic to accelerator-based computing in general, and to the MIC architecture in particular. In this thesis we advocate the use of algorithmic skeletons for programming the Xeon Phi. Algorithmic skeletons abstract the complexity inherent to parallel programming, hiding details such as resource management, parallel decomposition, and communication between execution flows, thus removing these concerns from the programmer's mind. In this context, the goal of the thesis is to lay the foundations for the development of a simple but powerful and efficient skeleton framework for programming the Xeon Phi processor. For this purpose we build upon Marrow, an existing framework for the orchestration of OpenCL™ computations in multi-GPU and CPU environments. We extend Marrow to execute both OpenCL and C++ parallel computations on the Xeon Phi. To evaluate the newly developed framework, several well-known benchmarks, such as Saxpy and N-Body, are used not only to compare its performance with that of the existing framework when executing on the co-processor, but also to assess the performance of the Xeon Phi against a multi-GPU environment.
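To make the skeleton idea concrete, the sketch below expresses Saxpy (y = a·x + y, one of the benchmarks named in this abstract) through a generic map skeleton that hides chunking and worker management from the caller. It is only an illustration in Python, not Marrow's C++/OpenCL API; `map_skeleton`, `saxpy`, and the `workers` parameter are names assumed for this example, and Python threads are used to show the structure rather than to obtain real speed-ups.

```python
from concurrent.futures import ThreadPoolExecutor


def map_skeleton(kernel, xs, ys, workers=4):
    """Apply kernel(x, y) element-wise, splitting the index range into
    contiguous chunks that are handed to a pool of worker threads."""
    if len(xs) != len(ys):
        raise ValueError("input sequences must have the same length")
    n = len(xs)
    chunk = max(1, -(-n // workers))  # ceiling division, at least 1
    bounds = [(lo, min(lo + chunk, n)) for lo in range(0, n, chunk)]

    def run(bound):
        lo, hi = bound
        return [kernel(xs[i], ys[i]) for i in range(lo, hi)]

    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = list(pool.map(run, bounds))  # map preserves chunk order
    return [value for part in parts for value in part]


def saxpy(a, xs, ys, workers=4):
    """Saxpy written against the skeleton: only the kernel is user code."""
    return map_skeleton(lambda x, y: a * x + y, xs, ys, workers)


if __name__ == "__main__":
    print(saxpy(2.0, [1.0, 2.0, 3.0], [10.0, 20.0, 30.0]))
```

The caller supplies only the per-element kernel; decomposition and scheduling stay inside the skeleton, which is the separation of concerns the abstract argues for.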
Abstract:
Digital Breast Tomosynthesis (DBT) is a technique that produces high-quality 3D breast images, which can only be obtained through reconstruction methods. The fastest reconstruction methods are the iterative ones; they are nevertheless computationally demanding and require extensive optimization. Some optimizations rely on parallel computing through GPU implementations using CUDA. As is well known, developing efficient GPU programs is still a time-consuming task, since the available programming models are low level and porting the code to other architectures is not straightforward. It is therefore valuable to be able to create parallel programs quickly, in a form that can be used on different architectures, without requiring much knowledge of the underlying architecture or of low-level programming models. To address this problem, we propose using existing solutions that reduce the parallelization effort and allow portability, while guaranteeing acceptable performance. To that end, we use a framework (FastFlow) with support for algorithmic skeletons, which take advantage of structured parallel programming by capturing recurring schemes/patterns that are common in parallel programming. The work focused on parallelizing one of the phases of 3D image reconstruction, the generation of the system matrix, which is one of the most time-consuming phases of the reconstruction process; this work included a sorting method modified with respect to the existing one. Different CPU and GPU implementations were produced (using OpenMP, CUDA and FastFlow), which made it possible to compare these programming environments in terms of ease of development and efficiency of the solution.
The comparison shows that the performance of the FastFlow-based solutions is not very different from that of the traditional ones, which suggests that tools of this kind can simplify and speed up the implementation of algorithms in the area of 3D image reconstruction while maintaining good performance.
Abstract:
Structured parallel programming, and in particular programming models using the algorithmic skeleton or parallel design pattern concepts, are increasingly considered to be the only viable means of supporting effective development of scalable and efficient parallel programs. Structured parallel programming models have been assessed in a number of works in the context of performance. In this paper we consider how the use of structured parallel programming models allows knowledge of the parallel patterns present to be harnessed to address both performance and energy consumption. We consider different features of structured parallel programming that may be leveraged to impact the performance/energy trade-off and we discuss a preliminary set of experiments validating our claims.
Abstract:
The present work describes a new species of Baurusuchidae from Upper Cretaceous sediments of the Bauru Basin, and provides the first complete postcranial description for the family. Many postcranial features observed in the new species are also present in other notosuchian taxa, and are thus considered plesiomorphic for the genus. These are: long cervical neural spines; robust deltopectoral crest of the humerus; large proximal portion in the radiale that contacts the ulna; ulnare anterior distal projection; supra-acetabular crest well developed laterally; post-acetabular process posterodorsally deflected; presence of an anteromedial crest in the femur; fourth trochanter of femur posteriorly positioned; tibia with a laterally curved shaft; calcaneum tuber posteroventrally oriented; osteoderms ornamented with grooves and imbricated in the tail. On the other hand, we found the following sacral and carpal features to be unique among all mesoeucrocodylians analyzed: transverse processes of sacral vertebrae dorsolaterally deflected; presence of a longitudinal crest in the lateral surface of sacral vertebrae; pisiform carpal with a condyle-like surface. The majority of these features corroborate a cursorial locomotion for the new species described in the present study, suggesting that members of the family Baurusuchidae were also cursorial species.
Abstract:
An (n, d)-expander is a graph G = (V, E) such that for every X ⊆ V with |X| ≤ 2n − 2 we have |Γ_G(X)| ≥ (d + 1)|X|. A tree T is small if it has at most n vertices and has maximum degree at most d. Friedman and Pippenger (1987) proved that any (n, d)-expander contains every small tree. However, their elegant proof does not seem to yield an efficient algorithm for obtaining the tree. In this paper, we give an alternative result that does admit a polynomial time algorithm for finding the immersion of any small tree in subgraphs G of (N, D, λ)-graphs Λ, as long as G contains a positive fraction of the edges of Λ and λ/D is small enough. In several applications of the Friedman-Pippenger theorem, including the ones in the original paper of those authors, the (n, d)-expander G is a subgraph of an (N, D, λ)-graph as above. Therefore, our result suffices to provide efficient algorithms for such previously non-constructive applications. As an example, we discuss a recent result of Alon, Krivelevich, and Sudakov (2007) concerning embedding nearly spanning bounded degree trees, the proof of which makes use of the Friedman-Pippenger theorem. We shall also show a construction inspired by Wigderson-Zuckerman expander graphs for which any sufficiently dense subgraph contains all trees of sizes and maximum degrees achieving essentially optimal parameters. Our algorithmic approach is based on a reduction of the tree embedding problem to a certain on-line matching problem for bipartite graphs, solved by Aggarwal et al. (1996).
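For readability, the expander definition that opens this abstract can be typeset as follows. The reading of Γ_G(X) as the neighbourhood of X in G is the standard convention and is an assumption here, since the abstract does not define the symbol.

```latex
\[
  G = (V, E) \text{ is an } (n, d)\text{-expander}
  \iff
  \forall X \subseteq V :\;
  |X| \le 2n - 2 \implies |\Gamma_G(X)| \ge (d + 1)\,|X| .
\]
```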
Abstract:
Knowledge is a product of human social systems and, therefore, the foundations of the knowledge-based economy are social and cultural. Communication is central to knowledge creation and diffusion, and Public Policy in Knowledge-Based Economies highlights specific social and cultural conditions that can enhance the communication, use and creation of knowledge in a society. The purpose of this book is to illustrate how these social and cultural conditions are identified and analysed through new conceptual frameworks. Such frameworks are necessary to penetrate the surface features of knowledge-based economies (science and technology) and disclose what drives such economies. This book will provide policymakers, analysts and academics with the fundamental tools needed for the development of policy in this little-understood and emerging area.
Abstract:
A lignan with a new skeleton named chimarrhinin (1) was isolated from an extract of the leaves of Chimarrhis turbinata, a Rubiaceae plant species. ¹³C NMR spectrometric techniques including 1D and 2D experiments and HRESIMS provided unequivocal structural confirmation of this new C6.C3 skeleton type. The relative configuration of 1 was established by 2D ¹H-H analysis and J couplings, while its conformation was evaluated through molecular modeling using the RM1 semiempirical method, with the aid of coupling constants obtained by NMR analysis. The antioxidant activity of the new derivative 1 and two known and previously isolated phenolic derivatives (2 and 3) was investigated. An IC50 value of 7.50 ± 0.5 μmol L⁻¹ was obtained for the new derivative 1, while 2 and 3 showed IC50 values of 18.60 ± 0.4 and 18.50 ± 0.6 μmol, respectively.
Abstract:
Objectives. The aim of this study was to evaluate the effect of thermal and mechanical cycling, alone or in combination, on the flexural strength of ceramic and metallic frameworks cast in gold alloy or titanium. Methods. Metallic frameworks (25 mm × 3 mm × 0.5 mm) (N = 96) cast in gold alloy or commercially pure titanium (Ti cp) were obtained using acrylic templates. They were airborne particle-abraded with 150 μm aluminum oxide at the central area of the frameworks (8 mm × 3 mm). Bonding agent and opaque were applied on the particle-abraded surfaces and the corresponding ceramic for each metal was fired onto them. The thickness of the ceramic layer was standardized by positioning the frameworks in a metallic template (height: 1 mm). The specimens from each ceramic-metal combination (N = 96, n = 12 per group) were randomly assigned to four experimental fatigue conditions, namely water storage at 37 °C for 24 h (control group), thermal cycling (3000 cycles, between 4 and 55 °C, dwell time: 10 s), mechanical cycling (20,000 cycles under 10 N load, immersion in distilled water at 37 °C), and combined thermal and mechanical cycling. A flexural strength test was performed in a universal testing machine (crosshead speed: 1.5 mm/min). Data were statistically analyzed using two-way ANOVA and Tukey's test (α = 0.05). Results. The mean flexural strength values for the ceramic-gold alloy combination (55 ± 7.2 MPa) were significantly higher than those of the ceramic-Ti cp combination (32 ± 6.7 MPa) regardless of the fatigue conditions performed (p < 0.05). Mechanical and thermo-mechanical fatigue decreased the flexural strength results significantly for both ceramic-gold alloy (52 ± 6.6 and 53 ± 5.6 MPa, respectively) and ceramic-Ti cp combinations (29 ± 6.8 and 29 ± 6.8 MPa, respectively) compared to the control group (58 ± 7.8 and 39 SA MPa, for gold and Ti cp, respectively) (p < 0.05) (Tukey's test).
While ceramic-Ti cp combinations failed adhesively at the metal-opaque interface, gold alloy frameworks exhibited a residue of ceramic material on the surface in all experimental groups. Significance. Mechanical and thermo-mechanical fatigue conditions decreased the flexural strength values for both ceramic-gold alloy and ceramic-Ti cp combinations, with the results being significantly lower for the latter in all experimental conditions. © 2007 Academy of Dental Materials. Published by Elsevier Ltd. All rights reserved.
Abstract:
The basic morphology of the skeleton is determined genetically, but its final mass and architecture are modulated by adaptive mechanisms sensitive to mechanical factors. When subjected to loading, the ability of bones to resist fracture depends on their mass, material properties, geometry and tissue quality. The contribution of altered bone geometry to fracture risk is not captured by clinical assessment using absorptiometry, which fails to distinguish geometry from density. For example, for the same bone area and density, small increases in the diaphyseal radius have a disproportionate influence on the torsional strength of bone. Mechanical factors are clinically relevant because of their ability to influence growth, modeling and remodeling activities that can maximize, or maintain, the determinants of fracture resistance. Mechanical loads greater than those habitually encountered by the skeleton effect adaptations in cortical and cancellous bone, reduce the rate of bone turnover, and activate new bone formation on cortical and trabecular surfaces. In doing so, they increase bone strength by beneficial adaptations in the geometric dimensions and material properties of the tissue. There is no direct evidence to demonstrate anti-fracture efficacy for mechanical loading, but the geometric alterations engendered undoubtedly increase the structural properties of bone as an organ, increasing the resistance to fracture. As with all interventions, issues of safety also arise. Physical activities involving high strain rates, heavy lifting or impact loading may be detrimental to the joints, leading to osteoarthritis; may stimulate fatigue damage, leading in some cases to stress fractures; or may interact with pharmaceutical interventions to increase the rate of microdamage within cortical or trabecular bone.
Abstract:
Several anomalies occur in the developing neural and visceral head skeleton of young specimens of Neoceratodus forsteri that have been reared under laboratory conditions. These include anomalies of the basicranium and its derivatives, aberrations of the anterior mandible and hyoid apparatus, and abnormalities in the articulation of the jaws and the elements that produce them. Apart from the occasional absence of the basihyal, and failure of the quadrate processes to form, the anomalies are not deficiencies. Most involve malformations of parts of the neurocranium and visceral skeleton, inappropriate articulations or fusions between elements, disunity in structures that are normally fused and the appearance of supernumerary elements. The incidence of chondral anomalies, generally higher than aberrations that occur in the dermal skeleton in juvenile lungfish, ranges from 1-10% in laboratory reared individuals that have not been subjected to experimental interference. The anomalies differ from those found in many amphibian populations, in the field and in the laboratory, because they involve the cranium, and not the limbs, and the lungfish have not been exposed to the factors that cause anomalies in the amphibians. It is unlikely that the existence of those anomalies, if it is reflected in the wild population, places a selective pressure on the lungfish, because, in a normal season, less than 1% of the total number of eggs produced survive to be recruited into the adult population.