631 resultados para algorithmic skeletons
Resumo:
Details are presented of the IRIS synthesis system for high-performance digital signal processing. This tool allows non-specialists to automatically derive VLSI circuit architectures from high-level, algorithmic representations, and provides a quick route to silicon implementation. The applicability of the system is demonstrated using the design example of a one-dimensional Discrete Cosine Transform circuit.
Resumo:
The range of potential applications for indoor and campus based personnel localisation has led researchers to create a wide spectrum of different algorithmic approaches and systems. However, the majority of the proposed systems overlook the unique radio environment presented by the human body leading to systematic errors and inaccuracies when deployed in this context. In this paper RSSI-based Monte Carlo Localisation was implemented using commercial 868 MHz off the shelf hardware and empirical data was gathered across a relatively large number of scenarios within a single indoor office environment. This data showed that the body shadowing effect caused by the human body introduced path skew into location estimates. It was also shown that, by using two body-worn nodes in concert, the effect of body shadowing can be mitigated by averaging the estimated position of the two nodes worn on either side of the body. © Springer Science+Business Media, LLC 2012.
Resumo:
Recent trends towards increasingly parallel computers mean that there needs to be a seismic shift in programming practice. The time is rapidly approaching when most programming will be for parallel systems. However, most programming techniques in use today are geared towards sequential, or occasionally small-scale parallel, programming. While refactoring has so far mainly been applied to sequential programs, it is our contention that refactoring can play a key role in significantly improving the programmability of parallel systems, by allowing the programmer to apply a set of well-defined transformations in order to parallelise their programs. In this paper, we describe a new language-independent refactoring approach that helps introduce and tune parallelism through high-level design patterns targeting a set of well-specified parallel skeletons. We believe this new refactoring process is the key to allowing programmers to truly start thinking in parallel. © 2012 ACM.
Resumo:
We describe a lightweight prototype framework (LIBERO) designed for experimentation with behavioural skeletons-components implementing a well-known parallelism exploitation pattern and a rule-based autonomic manager taking care of some non-functional feature related to pattern computation. LIBERO supports multiple autonomic managers within the same behavioural skeleton, each taking care of a different non-functional concern. We introduce LIBERO-built on plain Java and JBoss-and discuss how multiple managers may be coordinated to achieve a common goal using a two-phase coordination protocol developed in earlier work. We present experimental results that demonstrate how the prototype may be used to investigate autonomic management of multiple, independent concerns. © 2011 Springer-Verlag Berlin Heidelberg.
Resumo:
Bulk paleosol samples collected from a Middle to Early Miocene moraine in the New Mountain area of the Dry Valleys, Antarctica, yielded Coleoptera exoskeletons and occasional endoskeletons showing considerable diagenetic effects along with several species of bacteria, all lodged in a dry-frozen but salt-rich horizon at shallow depth to the land surface. The till is at the older end of a chronologic sequence of glacial deposits, thought to have been deposited before the transition from wet-based to cold-based ice (similar to 15 Ma), and hence, entirely weathered in contact with the subaerial atmosphere. It is possible, though not absolutely verifiable, that the skeletons date from this early stage of emplacement having undergone modifications whenever light snowmelt occurred or salt concentrations lowered the freezing temperature to maintain water as liquid. Correlation of the Coleoptera species with cultured bacteria in the sample and the likelihood of co-habitation with Beauveria bassiani found in two adjacent, although younger paleosols, leads to new questions about the antiquity of the Coleoptera and the source of N and glucose from chitinase derived from the insects. The skeletons in the 831 section may date close to the oldest preserved chitin (Oligocene) yet found on Earth. While harsh Martian conditions make it seemingly intolerable for complex, multicellular organisms such as insects to exist in the near-surface and subaerially, life within similar cold, dry paleosol microenvironments (Cryosols) of Antarctica point to life potential for the Red Planet, especially when considering the relatively diverse microbe (bacteria and fungi) population. (C) 2011 Elsevier Ltd. All rights reserved.
Resumo:
In intelligent video surveillance systems, scalability (of the number of simultaneous video streams) is important. Two key factors which hinder scalability are the time spent in decompressing the input video streams, and the limited computational power of the processor. This paper demonstrates how a combination of algorithmic and hardware techniques can overcome these limitations, and significantly increase the number of simultaneous streams. The techniques used are processing in the compressed domain, and exploitation of the multicore and vector processing capability of modern processors. The paper presents a system which performs background modeling, using a Mixture of Gaussians approach. This is an important first step in the segmentation of moving targets. The paper explores the effects of reducing the number of coefficients in the compressed domain, in terms of throughput speed and quality of the background modeling. The speedups achieved by exploiting compressed domain processing, multicore and vector processing are explored individually. Experiments show that a combination of all these techniques can give a speedup of 170 times on a single CPU compared to a purely serial, spatial domain implementation, with a slight gain in quality.
Resumo:
Sponge classification has long been based mainly on morphocladistic analyses but is now being greatly challenged by more than 12 years of accumulated analyses of molecular data analyses. The current study used phylogenetic hypotheses based on sequence data from 18S rRNA, 28S rRNA, and the CO1 barcoding fragment, combined with morphology to justify the resurrection of the order Axinellida Lévi, 1953. Axinellida occupies a key position in different morphologically derived topologies. The abandonment of Axinellida and the establishment of Halichondrida Vosmaer, 1887 sensu lato to contain Halichondriidae Gray, 1867, Axinellidae Carter, 1875, Bubaridae Topsent, 1894, Heteroxyidae Dendy, 1905, and a new family Dictyonellidae van Soest et al., 1990 was based on the conclusion that an axially condensed skeleton evolved independently in separate lineages in preference to the less parsimonious assumption that asters (star-shaped spicules), acanthostyles (club-shaped spicules with spines), and sigmata (C-shaped spicules) each evolved more than once. Our new molecular trees are congruent and contrast with the earlier, morphologically based, trees. The results show that axially condensed skeletons, asters, acanthostyles, and sigmata are all homoplasious characters. The unrecognized homoplasious nature of these characters explains much of the incongruence between molecular-based and morphology-based phylogenies. We use the molecular trees presented here as a basis for re-interpreting the morphological characters within Heteroscleromorpha. The implications for the classification of Heteroscleromorpha are discussed and a new order Biemnida ord. nov. is erected.
Resumo:
This paper describes the ParaPhrase project, a new 3-year targeted research project funded under EU Framework 7 Objective 3.4 (Computer Systems), starting in October 2011. ParaPhrase aims to follow a new approach to introducing parallelism using advanced refactoring techniques coupled with high-level parallel design patterns. The refactoring approach will use these design patterns to restructure programs defined as networks of software components into other forms that are more suited to parallel execution. The programmer will be aided by high-level cost information that will be integrated into the refactoring tools. The implementation of these patterns will then use a well-understood algorithmic skeleton approach to achieve good parallelism. A key ParaPhrase design goal is that parallel components are intended to match heterogeneous architectures, defined in terms of CPU/GPU combinations, for example. In order to achieve this, the ParaPhrase approach will map components at link time to the available hardware, and will then re-map them during program execution, taking account of multiple applications, changes in hardware resource availability, the desire to reduce communication costs etc. In this way, we aim to develop a new approach to programming that will be able to produce software that can adapt to dynamic changes in the system environment. Moreover, by using a strong component basis for parallelism, we can achieve potentially significant gains in terms of reducing sharing at a high level of abstraction, and so in reducing or even eliminating the costs that are usually associated with cache management, locking, and synchronisation. © 2013 Springer-Verlag Berlin Heidelberg.
Resumo:
Many modern networks are \emph{reconfigurable}, in the sense that the topology of the network can be changed by the nodes in the network. For example, peer-to-peer, wireless and ad-hoc networks are reconfigurable. More generally, many social networks, such as a company's organizational chart; infrastructure networks, such as an airline's transportation network; and biological networks, such as the human brain, are also reconfigurable. Modern reconfigurable networks have a complexity unprecedented in the history of engineering, resembling more a dynamic and evolving living animal rather than a structure of steel designed from a blueprint. Unfortunately, our mathematical and algorithmic tools have not yet developed enough to handle this complexity and fully exploit the flexibility of these networks. We believe that it is no longer possible to build networks that are scalable and never have node failures. Instead, these networks should be able to admit small, and maybe, periodic failures and still recover like skin heals from a cut. This process, where the network can recover itself by maintaining key invariants in response to attack by a powerful adversary is what we call \emph{self-healing}. Here, we present several fast and provably good distributed algorithms for self-healing in reconfigurable dynamic networks. Each of these algorithms have different properties, a different set of gaurantees and limitations. We also discuss future directions and theoretical questions we would like to answer. %in the final dissertation that this document is proposed to lead to.
Resumo:
Dynamic Voltage and Frequency Scaling (DVFS) exhibits fundamental limitations as a method to reduce energy consumption in computing systems. In the HPC domain, where performance is of highest priority and codes are heavily optimized to minimize idle time, DVFS has limited opportunity to achieve substantial energy savings. This paper explores if operating processors Near the transistor Threshold Volt- age (NTV) is a better alternative to DVFS for break- ing the power wall in HPC. NTV presents challenges, since it compromises both performance and reliability to reduce power consumption. We present a first of its kind study of a significance-driven execution paradigm that selectively uses NTV and algorithmic error tolerance to reduce energy consumption in performance- constrained HPC environments. Using an iterative algorithm as a use case, we present an adaptive execution scheme that switches between near-threshold execution on many cores and above-threshold execution on one core, as the computational significance of iterations in the algorithm evolves over time. Using this scheme on state-of-the-art hardware, we demonstrate energy savings ranging between 35% to 67%, while compromising neither correctness nor performance.
Resumo:
A framework supporting fast prototyping as well as tuning of distributed applications is presented. The approach is based on the adoption of a formal model that is used to describe the orchestration of distributed applications. The formal model (Orc by Misra and Cook) can be used to support semi-formal reasoning about the applications at hand. The paper describes how the framework can be used to derive and evaluate alternative orchestrations of a well know parallel/distributed computation pattern; and shows how the same formal model can be used to support generation of prototypes of distributed applications skeletons directly from the application description.
Resumo:
In this paper, we have developed a low-complexity algorithm for epileptic seizure detection with a high degree of accuracy. The algorithm has been designed to be feasibly implementable as battery-powered low-power implantable epileptic seizure detection system or epilepsy prosthesis. This is achieved by utilizing design optimization techniques at different levels of abstraction. Particularly, user-specific critical parameters are identified at the algorithmic level and are explicitly used along with multiplier-less implementations at the architecture level. The system has been tested on neural data obtained from in-vivo animal recordings and has been implemented in 90nm bulk-Si technology. The results show up to 90 % savings in power as compared to prevalent wavelet based seizure detection technique while achieving 97% average detection rate. Copyright 2010 ACM.
Resumo:
In this paper we present a design methodology for algorithm/architecture co-design of a voltage-scalable, process variation aware motion estimator based on significance driven computation. The fundamental premise of our approach lies in the fact that all computations are not equally significant in shaping the output response of video systems. We use a statistical technique to intelligently identify these significant/not-so-significant computations at the algorithmic level and subsequently change the underlying architecture such that the significant computations are computed in an error free manner under voltage over-scaling. Furthermore, our design includes an adaptive quality compensation (AQC) block which "tunes" the algorithm and architecture depending on the magnitude of voltage over-scaling and severity of process variations. Simulation results show average power savings of similar to 33% for the proposed architecture when compared to conventional implementation in the 90 nm CMOS technology. The maximum output quality loss in terms of Peak Signal to Noise Ratio (PSNR) was similar to 1 dB without incurring any throughput penalty.
Resumo:
In this paper, a low complexity system for spectral analysis of heart rate variability (HRV) is presented. The main idea of the proposed approach is the implementation of the Fast-Lomb periodogram that is a ubiquitous tool in spectral analysis, using a wavelet based Fast Fourier transform. Interestingly we show that the proposed approach enables the classification of processed data into more and less significant based on their contribution to output quality. Based on such a classification a percentage of less-significant data is being pruned leading to a significant reduction of algorithmic complexity with minimal quality degradation. Indeed, our results indicate that the proposed system can achieve up-to 45% reduction in number of computations with only 4.9% average error in the output quality compared to a conventional FFT based HRV system.
Resumo:
Structured parallel programming is recognised as a viable and effective means of tackling parallel programming problems. Recently, a set of simple and powerful parallel building blocks RISC pb2l) has been proposed to support modelling and implementation of parallel frameworks. In this work we demonstrate how that same parallel building block set may be used to model both general purpose parallel programming abstractions, not usually listed in classical skeleton sets, and more specialized domain specific parallel patterns. We show how an implementation of RISC pb2 l can be realised via the FastFlow framework and present experimental evidence of the feasibility and efficiency of the approach.