966 resultados para scalable parallel programming
Resumo:
Current research on Internet-based distributed systems emphasizes the scalability of overlay topologies for efficient search and retrieval of data items, as well as routing amongst peers. However, most existing approaches fail to address the transport of data across these logical networks in accordance with quality of service (QoS) constraints. Consequently, this paper investigates the use of scalable overlay topologies for routing real-time media streams between publishers and potentially many thousands of subscribers. Specifically, we analyze the costs of using k-ary n-cubes for QoS-constrained routing. Given a number of nodes in a distributed system, we calculate the optimal k-ary n-cube structure for minimizing the average distance between any pair of nodes. Using this structure, we describe a greedy algorithm that selects paths between nodes in accordance with the real-time delays along physical links. We show this method improves the routing latencies by as much as 67%, compared to approaches that do not consider physical link costs. We are in the process of developing a method for adaptive node placement in the overlay topology, based upon the locations of publishers, subscribers, physical link costs and per-subscriber QoS constraints. One such method for repositioning nodes in logical space is discussed, to improve the likelihood of meeting service requirements on data routed between publishers and subscribers. Future work will evaluate the benefits of such techniques more thoroughly.
Resumo:
Inferring types for polymorphic recursive function definitions (abbreviated to polymorphic recursion) is a recurring topic on the mailing lists of popular typed programming languages. This is despite the fact that type inference for polymorphic recursion using for all-types has been proved undecidable. This report presents several programming examples involving polymorphic recursion and determines their typability under various type systems, including the Hindley-Milner system, an intersection-type system, and extensions of these two. The goal of this report is to show that many of these examples are typable using a system of intersection types as an alternative form of polymorphism. By accomplishing this, we hope to lay the foundation for future research into a decidable intersection-type inference algorithm. We do not provide a comprehensive survey of type systems appropriate for polymorphic recursion, with or without type annotations inserted in the source language. Rather, we focus on examples for which types may be inferred without type annotations.
Resumo:
The Science of Network Service Composition has clearly emerged as one of the grand themes driving many of our research questions in the networking field today [NeXtworking 2003]. This driving force stems from the rise of sophisticated applications and new networking paradigms. By "service composition" we mean that the performance and correctness properties local to the various constituent components of a service can be readily composed into global (end-to-end) properties without re-analyzing any of the constituent components in isolation, or as part of the whole composite service. The set of laws that would govern such composition is what will constitute that new science of composition. The combined heterogeneity and dynamic open nature of network systems makes composition quite challenging, and thus programming network services has been largely inaccessible to the average user. We identify (and outline) a research agenda in which we aim to develop a specification language that is expressive enough to describe different components of a network service, and that will include type hierarchies inspired by type systems in general programming languages that enable the safe composition of software components. We envision this new science of composition to be built upon several theories (e.g., control theory, game theory, network calculus, percolation theory, economics, queuing theory). In essence, different theories may provide different languages by which certain properties of system components can be expressed and composed into larger systems. We then seek to lift these lower-level specifications to a higher level by abstracting away details that are irrelevant for safe composition at the higher level, thus making theories scalable and useful to the average user. In this paper we focus on services built upon an overlay management architecture, and we use control theory and QoS theory as example theories from which we lift up compositional specifications.
Resumo:
We leverage the buffering capabilities of end-systems to achieve scalable, asynchronous delivery of streams in a peer-to-peer environment. Unlike existing cache-and-relay schemes, we propose a distributed prefetching protocol where peers prefetch and store portions of the streaming media ahead of their playout time, thus not only turning themselves to possible sources for other peers but their prefetched data can allow them to overcome the departure of their source-peer. This stands in sharp contrast to existing cache-and-relay schemes where the departure of the source-peer forces its peer children to go the original server, thus disrupting their service and increasing server and network load. Through mathematical analysis and simulations, we show the effectiveness of maintaining such asynchronous multicasts from several source-peers to other children peers, and the efficacy of prefetching in the face of peer departures. We confirm the scalability of our dPAM protocol as it is shown to significantly reduce server load.
Resumo:
Communication and synchronization stand as the dual bottlenecks in the performance of parallel systems, and especially those that attempt to alleviate the programming burden by incurring overhead in these two domains. We formulate the notions of communicable memory and lazy barriers to help achieve efficient communication and synchronization. These concepts are developed in the context of BSPk, a toolkit library for programming networks of workstations|and other distributed memory architectures in general|based on the Bulk Synchronous Parallel (BSP) model. BSPk emphasizes efficiency in communication by minimizing local memory-to-memory copying, and in barrier synchronization by not forcing a process to wait unless it needs remote data. Both the message passing (MP) and distributed shared memory (DSM) programming styles are supported in BSPk. MP helps processes efficiently exchange short-lived unnamed data values, when the identity of either the sender or receiver is known to the other party. By contrast, DSM supports communication between processes that may be mutually anonymous, so long as they can agree on variable names in which to store shared temporary or long-lived data.
Resumo:
We present a distributed indexing scheme for peer to peer networks. Past work on distributed indexing traded off fast search times with non-constant degree topologies or network-unfriendly behavior such as flooding. In contrast, the scheme we present optimizes all three of these performance measures. That is, we provide logarithmic round searches while maintaining connections to a fixed number of peers and avoiding network flooding. In comparison to the well known scheme Chord, we provide competitive constant factors. Finally, we observe that arbitrary linear speedups are possible and discuss both a general brute force approach and specific economical optimizations.
Resumo:
To construct high performance Web servers, system builders are increasingly turning to distributed designs. An important challenge that arises in distributed Web servers is the need to direct incoming connections to individual hosts. Previous methods for connection routing have employed a centralized node which handles all incoming requests. In contrast, we propose a distributed approach, called Distributed Packet Rewriting (DPR), in which all hosts of the distributed system participate in connection routing. We argue that this approach promises better scalability and fault-tolerance than the centralized approach. We describe our implementation of four variants of DPR and compare their performance. We show that DPR provides performance comparable to centralized alternatives, measured in terms of throughput and delay under the SPECweb96 benchmark. Finally, we argue that DPR is particularly attractive both for small scale systems and for systems following the emerging trend toward increasingly intelligent I/O subsystems.
Resumo:
The purpose of this project is the creation of a graphical "programming" interface for a sensor network tasking language called STEP. The graphical interface allows the user to specify a program execution graphically from an extensible pallet of functionalities and save the results as a properly formatted STEP file. Moreover, the software is able to load a file in STEP format and convert it into the corresponding graphical representation. During both phases a type-checker is running on the background to ensure that both the graphical representation and the STEP file are syntactically correct. This project has been motivated by the Sensorium project at Boston University. In this technical report we present the basic features of the software, the process that has been followed during the design and implementation. Finally, we describe the approach used to test and validate our software.
Resumo:
Overlay networks have become popular in recent times for content distribution and end-system multicasting of media streams. In the latter case, the motivation is based on the lack of widespread deployment of IP multicast and the ability to perform end-host processing. However, constructing routes between various end-hosts, so that data can be streamed from content publishers to many thousands of subscribers, each having their own QoS constraints, is still a challenging problem. First, any routes between end-hosts using trees built on top of overlay networks can increase stress on the underlying physical network, due to multiple instances of the same data traversing a given physical link. Second, because overlay routes between end-hosts may traverse physical network links more than once, they increase the end-to-end latency compared to IP-level routing. Third, algorithms for constructing efficient, large-scale trees that reduce link stress and latency are typically more complex. This paper therefore compares various methods to construct multicast trees between end-systems, that vary in terms of implementation costs and their ability to support per-subscriber QoS constraints. We describe several algorithms that make trade-offs between algorithmic complexity, physical link stress and latency. While no algorithm is best in all three cases we show how it is possible to efficiently build trees for several thousand subscribers with latencies within a factor of two of the optimal, and link stresses comparable to, or better than, existing technologies.
Resumo:
Calligraphic writing presents a rich set of challenges to the human movement control system. These challenges include: initial learning, and recall from memory, of prescribed stroke sequences; critical timing of stroke onsets and durations; fine control of grip and contact forces; and letter-form invariance under voluntary size scaling, which entails fine control of stroke direction and amplitude during recruitment and derecruitment of musculoskeletal degrees of freedom. Experimental and computational studies in behavioral neuroscience have made rapid progress toward explaining the learning, planning and contTOl exercised in tasks that share features with calligraphic writing and drawing. This article summarizes computational neuroscience models and related neurobiological data that reveal critical operations spanning from parallel sequence representations to fine force control. Part one addresses stroke sequencing. It treats competitive queuing (CQ) models of sequence representation, performance, learning, and recall. Part two addresses letter size scaling and motor equivalence. It treats cursive handwriting models together with models in which sensory-motor tmnsformations are performed by circuits that learn inverse differential kinematic mappings. Part three addresses fine-grained control of timing and transient forces, by treating circuit models that learn to solve inverse dynamics problems.
Resumo:
A neural model of peripheral auditory processing is described and used to separate features of coarticulated vowels and consonants. After preprocessing of speech via a filterbank, the model splits into two parallel channels, a sustained channel and a transient channel. The sustained channel is sensitive to relatively stable parts of the speech waveform, notably synchronous properties of the vocalic portion of the stimulus it extends the dynamic range of eighth nerve filters using coincidence deteectors that combine operations of raising to a power, rectification, delay, multiplication, time averaging, and preemphasis. The transient channel is sensitive to critical features at the onsets and offsets of speech segments. It is built up from fast excitatory neurons that are modulated by slow inhibitory interneurons. These units are combined over high frequency and low frequency ranges using operations of rectification, normalization, multiplicative gating, and opponent processing. Detectors sensitive to frication and to onset or offset of stop consonants and vowels are described. Model properties are characterized by mathematical analysis and computer simulations. Neural analogs of model cells in the cochlear nucleus and inferior colliculus are noted, as are psychophysical data about perception of CV syllables that may be explained by the sustained transient channel hypothesis. The proposed sustained and transient processing seems to be an auditory analog of the sustained and transient processing that is known to occur in vision.
Resumo:
This paper demonstrates an optimal control solution to change of machine set-up scheduling based on dynamic programming average cost per stage value iteration as set forth by Cararnanis et. al. [2] for the 2D case. The difficulty with the optimal approach lies in the explosive computational growth of the resulting solution. A method of reducing the computational complexity is developed using ideas from biology and neural networks. A real time controller is described that uses a linear-log representation of state space with neural networks employed to fit cost surfaces.
Exploring processes of indeterminate determinism in music composition, programming and improvisation
Resumo:
This portfolio consists of 15 original musical works. Taking the form of electronic and acousmatic music, multimedia, and scores, these chamber works serve as a result of experimentation and improvisation with individually built computer interfaces. The accompanying commentary provides discourse on the conceptual practice of these interfaces becoming a compositional entity that present a multi-interpretative opportunity to explore, engage, and personalise. Following this, the commentary examines the path of creative decisions and musical choices that formed both these interfaces and the resulting musical and visual works. This portfolio is accompanied by interfaces used, transcoded interfacing behavioural information, and documented improvisational findings.
Resumo:
This work considers the static calculation of a program’s average-case time. The number of systems that currently tackle this research problem is quite small due to the difficulties inherent in average-case analysis. While each of these systems make a pertinent contribution, and are individually discussed in this work, only one of them forms the basis of this research. That particular system is known as MOQA. The MOQA system consists of the MOQA language and the MOQA static analysis tool. Its technique for statically determining average-case behaviour centres on maintaining strict control over both the data structure type and the labeling distribution. This research develops and evaluates the MOQA language implementation, and adds to the functions already available in this language. Furthermore, the theory that backs MOQA is generalised and the range of data structures for which the MOQA static analysis tool can determine average-case behaviour is increased. Also, some of the MOQA applications and extensions suggested in other works are logically examined here. For example, the accuracy of classifying the MOQA language as reversible is investigated, along with the feasibility of incorporating duplicate labels into the MOQA theory. Finally, the analyses that take place during the course of this research reveal some of the MOQA strengths and weaknesses. This thesis aims to be pragmatic when evaluating the current MOQA theory, the advancements set forth in the following work and the benefits of MOQA when compared to similar systems. Succinctly, this work’s significant expansion of the MOQA theory is accompanied by a realistic assessment of MOQA’s accomplishments and a serious deliberation of the opportunities available to MOQA in the future.
Resumo:
Recent years have witnessed a rapid growth in the demand for streaming video over the Internet, exposing challenges in coping with heterogeneous device capabilities and varying network throughput. When we couple this rise in streaming with the growing number of portable devices (smart phones, tablets, laptops) we see an ever-increasing demand for high-definition videos online while on the move. Wireless networks are inherently characterised by restricted shared bandwidth and relatively high error loss rates, thus presenting a challenge for the efficient delivery of high quality video. Additionally, mobile devices can support/demand a range of video resolutions and qualities. This demand for mobile streaming highlights the need for adaptive video streaming schemes that can adjust to available bandwidth and heterogeneity, and can provide us with graceful changes in video quality, all while respecting our viewing satisfaction. In this context the use of well-known scalable media streaming techniques, commonly known as scalable coding, is an attractive solution and the focus of this thesis. In this thesis we investigate the transmission of existing scalable video models over a lossy network and determine how the variation in viewable quality is affected by packet loss. This work focuses on leveraging the benefits of scalable media, while reducing the effects of data loss on achievable video quality. The overall approach is focused on the strategic packetisation of the underlying scalable video and how to best utilise error resiliency to maximise viewable quality. In particular, we examine the manner in which scalable video is packetised for transmission over lossy networks and propose new techniques that reduce the impact of packet loss on scalable video by selectively choosing how to packetise the data and which data to transmit. We also exploit redundancy techniques, such as error resiliency, to enhance the stream quality by ensuring a smooth play-out with fewer changes in achievable video quality. The contributions of this thesis are in the creation of new segmentation and encapsulation techniques which increase the viewable quality of existing scalable models by fragmenting and re-allocating the video sub-streams based on user requirements, available bandwidth and variations in loss rates. We offer new packetisation techniques which reduce the effects of packet loss on viewable quality by leveraging the increase in the number of frames per group of pictures (GOP) and by providing equality of data in every packet transmitted per GOP. These provide novel mechanisms for packetizing and error resiliency, as well as providing new applications for existing techniques such as Interleaving and Priority Encoded Transmission. We also introduce three new scalable coding models, which offer a balance between transmission cost and the consistency of viewable quality.