882 resultados para parallel processing systems
Resumo:
The paper has been presented at the 12th International Conference on Applications of Computer Algebra, Varna, Bulgaria, June, 2006
Resumo:
Bayesian algorithms pose a limit to the performance learning algorithms can achieve. Natural selection should guide the evolution of information processing systems towards those limits. What can we learn from this evolution and what properties do the intermediate stages have? While this question is too general to permit any answer, progress can be made by restricting the class of information processing systems under study. We present analytical and numerical results for the evolution of on-line algorithms for learning from examples for neural network classifiers, which might include or not a hidden layer. The analytical results are obtained by solving a variational problem to determine the learning algorithm that leads to maximum generalization ability. Simulations using evolutionary programming, for programs that implement learning algorithms, confirm and expand the results. The principal result is not just that the evolution is towards a Bayesian limit. Indeed it is essentially reached. In addition we find that evolution is driven by the discovery of useful structures or combinations of variables and operators. In different runs the temporal order of the discovery of such combinations is unique. The main result is that combinations that signal the surprise brought by an example arise always before combinations that serve to gauge the performance of the learning algorithm. This latter structures can be used to implement annealing schedules. The temporal ordering can be understood analytically as well by doing the functional optimization in restricted functional spaces. We also show that there is data suggesting that the appearance of these traits also follows the same temporal ordering in biological systems. © 2006 American Institute of Physics.
Resumo:
Cloud computing offers massive scalability and elasticity required by many scien-tific and commercial applications. Combining the computational and data handling capabilities of clouds with parallel processing also has the potential to tackle Big Data problems efficiently. Science gateway frameworks and workflow systems enable application developers to implement complex applications and make these available for end-users via simple graphical user interfaces. The integration of such frameworks with Big Data processing tools on the cloud opens new oppor-tunities for application developers. This paper investigates how workflow sys-tems and science gateways can be extended with Big Data processing capabilities. A generic approach based on infrastructure aware workflows is suggested and a proof of concept is implemented based on the WS-PGRADE/gUSE science gateway framework and its integration with the Hadoop parallel data processing solution based on the MapReduce paradigm in the cloud. The provided analysis demonstrates that the methods described to integrate Big Data processing with workflows and science gateways work well in different cloud infrastructures and application scenarios, and can be used to create massively parallel applications for scientific analysis of Big Data.
Resumo:
Recently honeycomb meshes have been considered as alternative candidates for interconnection networks in parallel and distributed computer systems. This paper presents a solution to one of the open problems about honeycomb meshes—the so-called three disjoint path problem. The problem requires minimizing the length of the longest of any three disjoint paths between 3-degree nodes. This solution provides information on the re-routing of traffic along the network in the presence of faults.
Resumo:
This study provides evidence for a Stroop-like interference effect in word recognition. Based on phonologic and semantic properties of simple words, participants who performed a same/different word-recognition task exhibited a significant response latency increase when word pairs (e.g., POLL, ROD) featured a comparison word (POLL) that was a homonym of a synonym (pole) of the target word (ROD). These results support a parallel-processing framework of lexical decision making, in which activation of the pathways to word recognition may occur at different levels automatically and in parallel. A subset of simple words that are also brand names was examined and exhibited this same interference. Implications for word recognition theory and practical implications for strategic marketing are discussed.
Resumo:
Hyperspectral instruments have been incorporated in satellite missions, providing data of high spectral resolution of the Earth. This data can be used in remote sensing applications, such as, target detection, hazard prevention, and monitoring oil spills, among others. In most of these applications, one of the requirements of paramount importance is the ability to give real-time or near real-time response. Recently, onboard processing systems have emerged, in order to overcome the huge amount of data to transfer from the satellite to the ground station, and thus, avoiding delays between hyperspectral image acquisition and its interpretation. For this purpose, compact reconfigurable hardware modules, such as field programmable gate arrays (FPGAs) are widely used. This paper proposes a parallel FPGA-based architecture for endmember’s signature extraction. This method based on the Vertex Component Analysis (VCA) has several advantages, namely it is unsupervised, fully automatic, and it works without dimensionality reduction (DR) pre-processing step. The architecture has been designed for a low cost Xilinx Zynq board with a Zynq-7020 SoC FPGA based on the Artix-7 FPGA programmable logic and tested using real hyperspectral data sets collected by the NASA’s Airborne Visible Infra-Red Imaging Spectrometer (AVIRIS) over the Cuprite mining district in Nevada. Experimental results indicate that the proposed implementation can achieve real-time processing, while maintaining the methods accuracy, which indicate the potential of the proposed platform to implement high-performance, low cost embedded systems, opening new perspectives for onboard hyperspectral image processing.
Resumo:
A parallel method for dynamic partitioning of unstructured meshes is described. The method employs a new iterative optimisation technique which both balances the workload and attempts to minimise the interprocessor communications overhead. Experiments on a series of adaptively refined meshes indicate that the algorithm provides partitions of an equivalent or higher quality to static partitioners (which do not reuse the existing partition) and much more quickly. Perhaps more importantly, the algorithm results in only a small fraction of the amount of data migration compared to the static partitioners.
Resumo:
Remotely sensed imagery has been widely used for land use/cover classification thanks to the periodic data acquisition and the widespread use of digital image processing systems offering a wide range of classification algorithms. The aim of this work was to evaluate some of the most commonly used supervised and unsupervised classification algorithms under different landscape patterns found in Rondônia, including (1) areas of mid-size farms, (2) fish-bone settlements and (3) a gradient of forest and Cerrado (Brazilian savannah). Comparison with a reference map based on the kappa statistics resulted in good to superior indicators (best results - K-means: k=0.68; k=0.77; k=0.64 and MaxVer: k=0.71; k=0.89; k=0.70 respectively for three areas mentioned). Results show that choosing a specific algorithm requires to take into account both its capacity to discriminate among various spectral signatures under different landscape patterns as well as a cost/benefit analysis considering the different steps performed by the operator performing a land cover/use map. it is suggested that a more systematic assessment of several options of implementation of a specific project is needed prior to beginning a land use/cover mapping job.
Resumo:
OctVCE is a cartesian cell CFD code produced especially for numerical simulations of shock and blast wave interactions with complex geometries. Virtual Cell Embedding (VCE) was chosen as its cartesian cell kernel as it is simple to code and sufficient for practical engineering design problems. This also makes the code much more ‘user-friendly’ than structured grid approaches as the gridding process is done automatically. The CFD methodology relies on a finite-volume formulation of the unsteady Euler equations and is solved using a standard explicit Godonov (MUSCL) scheme. Both octree-based adaptive mesh refinement and shared-memory parallel processing capability have also been incorporated. For further details on the theory behind the code, see the companion report 2007/12.
Resumo:
A central problem in visual perception concerns how humans perceive stable and uniform object colors despite variable lighting conditions (i.e. color constancy). One solution is to 'discount' variations in lighting across object surfaces by encoding color contrasts, and utilize this information to 'fill in' properties of the entire object surface. Implicit in this solution is the caveat that the color contrasts defining object boundaries must be distinguished from the spurious color fringes that occur naturally along luminance-defined edges in the retinal image (i.e. optical chromatic aberration). In the present paper, we propose that the neural machinery underlying color constancy is complemented by an 'error-correction' procedure which compensates for chromatic aberration, and suggest that error-correction may be linked functionally to the experimentally induced illusory colored aftereffects known as McCollough effects (MEs). To test these proposals, we develop a neural network model which incorporates many of the receptive-field (RF) profiles of neurons in primate color vision. The model is composed of two parallel processing streams which encode complementary sets of stimulus features: one stream encodes color contrasts to facilitate filling-in and color constancy; the other stream selectively encodes (spurious) color fringes at luminance boundaries, and learns to inhibit the filling-in of these colors within the first stream. Computer simulations of the model illustrate how complementary color-spatial interactions between error-correction and filling-in operations (a) facilitate color constancy, (b) reveal functional links between color constancy and the ME, and (c) reconcile previously reported anomalies in the local (edge) and global (spreading) properties of the ME. We discuss the broader implications of these findings by considering the complementary functional roles performed by RFs mediating color-spatial interactions in the primate visual system. (C) 2002 Elsevier Science Ltd. All rights reserved.
Resumo:
COORDINSPECTOR is a Software Tool aiming at extracting the coordination layer of a software system. Such a reverse engineering process provides a clear view of the actually invoked services as well as the logic behind such invocations. The analysis process is based on program slicing techniques and the generation of, System Dependence Graphs and Coordination Dependence Graphs. The tool analyzes Common Intermediate Language (CIL), the native language of the Microsoft .Net Framework, thus making suitable for processing systems developed in any .Net Framework compilable language. COORDINSPECTOR generates graphical representations of the coordination layer together with business process orchestrations specified in WSBPEL 2.0
Resumo:
Consider the problem of scheduling sporadic tasks on a multiprocessor platform under mutual exclusion constraints. We present an approach which appears promising for allowing large amounts of parallel task executions and still ensures low amounts of blocking.
Resumo:
This paper proposes an efficient scalable Residue Number System (RNS) architecture supporting moduli sets with an arbitrary number of channels, allowing to achieve larger dynamic range and a higher level of parallelism. The proposed architecture allows the forward and reverse RNS conversion, by reusing the arithmetic channel units. The arithmetic operations supported at the channel level include addition, subtraction, and multiplication with accumulation capability. For the reverse conversion two algorithms are considered, one based on the Chinese Remainder Theorem and the other one on Mixed-Radix-Conversion, leading to implementations optimized for delay and required circuit area. With the proposed architecture a complete and compact RNS platform is achieved. Experimental results suggest gains of 17 % in the delay in the arithmetic operations, with an area reduction of 23 % regarding the RNS state of the art. When compared with a binary system the proposed architecture allows to perform the same computation 20 times faster alongside with only 10 % of the circuit area resources.
Resumo:
Nos últimos anos começaram a ser vulgares os computadores dotados de multiprocessadores e multi-cores. De modo a aproveitar eficientemente as novas características desse hardware começaram a surgir ferramentas para facilitar o desenvolvimento de software paralelo, através de linguagens e frameworks, adaptadas a diferentes linguagens. Com a grande difusão de redes de alta velocidade, tal como Gigabit Ethernet e a última geração de redes Wi-Fi, abre-se a oportunidade de, além de paralelizar o processamento entre processadores e cores, poder em simultâneo paralelizá-lo entre máquinas diferentes. Ao modelo que permite paralelizar processamento localmente e em simultâneo distribuí-lo para máquinas que também têm capacidade de o paralelizar, chamou-se “modelo paralelo distribuído”. Nesta dissertação foram analisadas técnicas e ferramentas utilizadas para fazer programação paralela e o trabalho que está feito dentro da área de programação paralela e distribuída. Tendo estes dois factores em consideração foi proposta uma framework que tenta aplicar a simplicidade da programação paralela ao conceito paralelo distribuído. A proposta baseia-se na disponibilização de uma framework em Java com uma interface de programação simples, de fácil aprendizagem e legibilidade que, de forma transparente, é capaz de paralelizar e distribuir o processamento. Apesar de simples, existiu um esforço para a tornar configurável de forma a adaptar-se ao máximo de situações possível. Nesta dissertação serão exploradas especialmente as questões relativas à execução e distribuição de trabalho, e a forma como o código é enviado de forma automática pela rede, para outros nós cooperantes, evitando assim a instalação manual das aplicações em todos os nós da rede. Para confirmar a validade deste conceito e das ideias defendidas nesta dissertação foi implementada esta framework à qual se chamou DPF4j (Distributed Parallel Framework for JAVA) e foram feitos testes e retiradas métricas para verificar a existência de ganhos de performance em relação às soluções já existentes.