816 results for RECONFIGURABLE COMPUTING
Abstract:
Eckerdal, A., McCartney, R., Moström, J. E., Sanders, K., Thomas, L., and Zander, C. 2007. From Limen to Lumen: computing students in liminal spaces. In Proceedings of the Third International Workshop on Computing Education Research (Atlanta, Georgia, USA, September 15 - 16, 2007). ICER '07. ACM, New York, NY, 123-132.
Abstract:
The proliferation of inexpensive workstations and networks has prompted several researchers to use such distributed systems for parallel computing. Attempts have been made to offer a shared-memory programming model on such distributed memory computers. Most systems provide a shared memory that is coherent in that all processes that use it agree on the order of all memory events. This dissertation explores the possibility of a significant improvement in the performance of some applications when they use non-coherent memory. First, a new formal model to describe existing non-coherent memories is developed. I use this model to prove that certain problems can be solved using asynchronous iterative algorithms on shared memory in which the coherence constraints are substantially relaxed. In the course of the development of the model I discovered a new type of non-coherent behavior called Local Consistency. Second, a programming model, Mermera, is proposed. It provides programmers with a choice of hierarchically related non-coherent behaviors along with one coherent behavior. Thus, one can trade off the ease of programming with coherent memory for improved performance with non-coherent memory. As an example, I present a program to solve a linear system of equations using an asynchronous iterative algorithm. This program uses all the behaviors offered by Mermera. Third, I describe the implementation of Mermera on a BBN Butterfly TC2000 and on a network of workstations. The performance of a version of the equation solving program that uses all the behaviors of Mermera is compared with that of a version that uses coherent behavior only. For a system of 1000 equations the former exhibits at least a 5-fold improvement in convergence time over the latter.
The version using coherent behavior only does not benefit from employing more than one workstation to solve the problem while the program using non-coherent behavior continues to achieve improved performance as the number of workstations is increased from 1 to 6. This measurement corroborates our belief that non-coherent shared memory can be a performance boon for some applications.
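As a rough illustration of why an asynchronous iterative solver can tolerate non-coherent reads, the following Python sketch simulates a Jacobi-style iteration in which every read of a shared value may return a version up to `max_delay` sweeps old. It is a sequential simulation, not Mermera or the dissertation's program. The function name `async_jacobi`, its parameters, and the sample system are assumptions chosen for illustration, and the matrix is strictly diagonally dominant so that the iteration still converges despite the stale reads.

```python
import random


def async_jacobi(a, b, sweeps=200, max_delay=3, seed=0):
    """Jacobi-style iteration where each read may return a stale value.

    Hypothetical helper: it models relaxed coherence by letting every read
    of x[j] come from any of the last `max_delay` + 1 versions of x.
    """
    rng = random.Random(seed)
    n = len(b)
    # history[k] holds the vector x after sweep k; history[0] is the initial guess.
    history = [[0.0] * n]
    for _ in range(sweeps):
        latest = len(history) - 1
        new_x = []
        for i in range(n):
            total = 0.0
            for j in range(n):
                if j == i:
                    continue
                # Simulated non-coherent read: pick a version 0..max_delay sweeps old.
                age = rng.randint(0, max_delay)
                stale_value = history[max(0, latest - age)][j]
                total += a[i][j] * stale_value
            new_x.append((b[i] - total) / a[i][i])
        history.append(new_x)
    return history[-1]


# Illustrative system whose exact solution is x = [1, 2, 3].
a = [[4.0, 1.0, 0.0],
     [1.0, 5.0, 2.0],
     [0.0, 2.0, 6.0]]
b = [6.0, 17.0, 22.0]
solution = async_jacobi(a, b)
```

The stale reads slow convergence but do not prevent it for this kind of matrix, which is the property that lets such programs give up coherence in exchange for cheaper memory operations.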
Abstract:
Programmers of parallel processes that communicate through shared globally distributed data structures (DDS) face a difficult choice. Either they must explicitly program DDS management, by partitioning or replicating it over multiple distributed memory modules, or be content with a high latency coherent (sequentially consistent) memory abstraction that hides the DDS' distribution. We present Mermera, a new formalism and system that enable a smooth spectrum of noncoherent shared memory behaviors to coexist between the above two extremes. Our approach allows us to define known noncoherent memories in a new simple way, to identify new memory behaviors, and to characterize generic mixed-behavior computations. The latter are useful for programming using multiple behaviors that complement each other's advantages. On the practical side, we show that the large class of programs that use asynchronous iterative methods (AIM) can run correctly on slow memory, one of the weakest, and hence most efficient and fault-tolerant, noncoherence conditions. An example AIM program to solve linear equations is developed to illustrate: (1) the need for concurrently mixing memory behaviors, and (2) the performance gains attainable via noncoherence. Other program classes tolerate weak memory consistency by synchronizing in such a way as to yield executions indistinguishable from coherent ones. AIM computations on noncoherent memory yield noncoherent, yet correct, computations. We report performance data that exemplify the potential benefits of noncoherence, in terms of raw memory performance, as well as application speed.
Abstract:
The exploding demand for services like the World Wide Web reflects the potential that is presented by globally distributed information systems. The number of WWW servers world-wide has doubled every 3 to 5 months since 1993, outstripping even the growth of the Internet. At each of these self-managed sites, the Common Gateway Interface (CGI) and Hypertext Transfer Protocol (HTTP) already constitute a rudimentary basis for contributing local resources to remote collaborations. However, the Web has serious deficiencies that make it unsuited for use as a true medium for metacomputing, the process of bringing hardware, software, and expertise from many geographically dispersed sources to bear on large scale problems. These deficiencies are, paradoxically, the direct result of the very simple design principles that enabled its exponential growth. There are many symptoms of the problems exhibited by the Web: disk and network resources are consumed extravagantly; information search and discovery are difficult; protocols are aimed at data movement rather than task migration, and ignore the potential for distributing computation. However, all of these can be seen as aspects of a single problem: as a distributed system for metacomputing, the Web offers unpredictable performance and unreliable results. The goal of our project is to use the Web as a medium (within either the global Internet or an enterprise intranet) for metacomputing in a reliable way with performance guarantees. We attack this problem on four levels: (1) Resource Management Services: Globally distributed computing allows novel approaches to the old problems of performance guarantees and reliability.
Our first set of ideas involves setting up a family of real-time resource management models organized by the Web Computing Framework with a standard Resource Management Interface (RMI), a Resource Registry, a Task Registry, and resource management protocols to allow resource needs and availability information to be collected and disseminated so that a family of algorithms with varying computational precision and accuracy of representations can be chosen to meet realtime and reliability constraints. (2) Middleware Services: Complementary to techniques for allocating and scheduling available resources to serve application needs under realtime and reliability constraints, the second set of ideas aims at reducing communication latency, traffic congestion, server work load, etc. We develop customizable middleware services to exploit application characteristics in traffic analysis to drive new server/browser design strategies (e.g., exploit self-similarity of Web traffic), derive document access patterns via multiserver cooperation, and use them in speculative prefetching, document caching, and aggressive replication to reduce server load and bandwidth requirements. (3) Communication Infrastructure: Finally, to achieve any guarantee of quality of service or performance, one must get at the network layer that can provide the basic guarantees of bandwidth, latency, and reliability. Therefore, the third area is a set of new techniques in network service and protocol designs. (4) Object-Oriented Web Computing Framework: A useful resource management system must deal with job priority, fault-tolerance, quality of service, complex resources such as ATM channels, probabilistic models, etc., and models must be tailored to represent the best tradeoff for a particular setting. This requires a family of models, organized within an object-oriented framework, because no one-size-fits-all approach is appropriate.
This presents a software engineering challenge requiring integration of solutions at all levels: algorithms, models, protocols, and profiling and monitoring tools. The framework captures the abstract class interfaces of the collection of cooperating components, but allows the concretization of each component to be driven by the requirements of a specific approach and environment.
Abstract:
We discuss the design principles of TCP within the context of heterogeneous wired/wireless networks and mobile networking. We identify three shortcomings in TCP's behavior: (i) the protocol's error detection mechanism, which does not distinguish different types of errors and thus does not suffice for heterogeneous wired/wireless environments, (ii) the error recovery, which is not responsive to the distinctive characteristics of wireless networks such as transient or burst errors due to handoffs and fading channels, and (iii) the protocol strategy, which does not control the tradeoff between performance measures such as goodput and energy consumption, and often entails a wasteful effort of retransmission and energy expenditure. We discuss a solution framework based on selected research proposals and the associated evaluation criteria for the suggested modifications. We highlight an important angle that has not attracted the required attention so far: the need for new performance metrics, appropriate for evaluating the impact of protocol strategies on battery-powered devices.
Abstract:
A difficulty in lung image registration is accounting for changes in the size of the lungs due to inspiration. We propose two methods for computing a uniform scale parameter for use in lung image registration that account for size change. A scaled rigid-body transformation allows analysis of corresponding lung CT scans taken at different times and can serve as a good low-order transformation to initialize non-rigid registration approaches. Two different features are used to compute the scale parameter. The first method uses lung surfaces. The second uses lung volumes. Both approaches are computationally inexpensive and improve the alignment of lung images over rigid registration. The two methods produce different scale parameters and may highlight different functional information about the lungs.
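The abstract does not state how the volume-based scale parameter is computed. One natural choice, assumed here purely for illustration, follows from the fact that scaling an object uniformly by a factor s multiplies its volume by s cubed, so the scale can be taken as the cube root of the volume ratio. The function names and the voxel-counting volume estimate in this Python sketch are hypothetical and are not the authors' implementation.

```python
def mask_volume(mask, voxel_volume_mm3):
    """Estimate volume as (number of lung voxels) * (volume of one voxel).

    `mask` is a nested list (slices, rows, columns) of 0/1 values.
    """
    voxel_count = sum(value for plane in mask for row in plane for value in row)
    return voxel_count * voxel_volume_mm3


def uniform_scale_from_volumes(source_volume, target_volume):
    """Uniform scale that would map the source volume onto the target volume."""
    if source_volume <= 0 or target_volume <= 0:
        raise ValueError("volumes must be positive")
    # Uniform scaling by s multiplies volume by s**3, hence the cube root.
    return (target_volume / source_volume) ** (1.0 / 3.0)


# Toy example: the target lung has eight times the volume of the source lung.
scale = uniform_scale_from_volumes(1000.0, 8000.0)
```

A scale obtained this way could be applied together with a rigid-body transform to initialize a non-rigid registration, as the abstract suggests.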
Abstract:
Many people suffer from conditions that lead to deterioration of motor control and make access to the computer using traditional input devices difficult. In particular, they may lose control of hand movement to the extent that the standard mouse cannot be used as a pointing device. Most current alternatives use markers or specialized hardware to track and translate a user's movement to pointer movement. These approaches, for example wearable devices, may be perceived as intrusive. Camera-based assistive systems that use visual tracking of features on the user's body often require cumbersome manual adjustment. This paper introduces an enhanced computer vision based strategy where features, for example on a user's face, viewed through an inexpensive USB camera, are tracked and translated to pointer movement. The main contributions of this paper are (1) enhancing a video based interface with a mechanism for mapping feature movement to pointer movement, which allows users to navigate to all areas of the screen even with very limited physical movement, and (2) providing a customizable, hierarchical navigation framework for human computer interaction (HCI). This framework provides effective use of the vision-based interface system for accessing multiple applications in an autonomous setting. Experiments with several users show the effectiveness of the mapping strategy and its usage within the application framework as a practical tool for desktop users with disabilities.
Abstract:
Advanced sensory systems address a number of major obstacles to the provision of cost-effective and proactive rehabilitation. Many of these systems employ technologies such as high-speed video or motion capture to generate quantitative measurements. However, these solutions are accompanied by some major limitations, including extensive set-up and calibration, restriction to indoor use, high cost, and time-consuming data analysis. Additionally, many do not quantify improvement in a rigorous manner, for example, gait analysis for 5 minutes as opposed to 24-hour ambulatory monitoring. This work addresses these limitations using low-cost, wearable wireless inertial measurement as a mobile and minimal infrastructure alternative. In cooperation with healthcare professionals, the goal is to design and implement a reconfigurable and intelligent movement capture system. A key component of this work is an extensive benchmark comparison with the 'gold standard' VICON motion capture system.
Abstract:
Constraint programming has emerged as a successful paradigm for modelling combinatorial problems arising from practical situations. In many of those situations, we are not provided with an immutable set of constraints. Instead, a user will modify his requirements, in an interactive fashion, until he is satisfied with a solution. Examples of such applications include, amongst others, model-based diagnosis, expert systems, and product configurators. The system he interacts with must be able to assist him by showing the consequences of his requirements. Explanations are the ideal tool for providing this assistance. However, existing notions of explanations fail to provide sufficient information. We define new forms of explanations that aim to be more informative. Even if explanation generation is a very hard task, in the applications we consider, we must manage to provide a satisfactory level of interactivity and, therefore, we cannot afford long computational times. We introduce the concept of representative sets of relaxations, a compact set of relaxations that shows the user at least one way to satisfy each of his requirements and at least one way to relax them, and present an algorithm that efficiently computes such sets. We introduce the concept of most soluble relaxations, maximising the number of products they allow. We present algorithms to compute such relaxations in times compatible with interactivity, achieving this by making use of different types of compiled representations interchangeably. We propose to generalise the concept of prime implicates to constraint problems with the concept of domain consequences, and suggest generating them as a compilation strategy. This sets a new approach in compilation, and allows explanation-related queries to be addressed efficiently. We define ordered automata to compactly represent large sets of domain consequences, in a way that is orthogonal to existing compilation techniques that represent large sets of solutions.
Abstract:
The authors explore nanoscale sensor processor (nSP) architectures. Their design includes a simple accumulator-based instruction-set architecture, sensors, limited memory, and instruction-fused sensing. Using nSP technology based on optical resonance energy transfer logic helps them decrease the design's size; their smallest design is about the size of the largest-known virus. © 2006 IEEE.
Abstract:
This work was supported by Toyota Motor Engineering and Manufacturing North America and partially supported by the Air Force Office of Scientific Research (Grant No. FA9550-09-1-0562).
Abstract:
Review of: Rosalind W. Picard, Affective Computing
Abstract:
We report on practical experience using the Oxford BSP Library to parallelize a large electromagnetic code, the British Aerospace finite-difference time-domain code EMMA T:FD3D. The Oxford BSP Library is one of the first realizations of the Bulk Synchronous Parallel computational model to be targeted at numerically intensive scientific (typically Fortran) computing. The BAe EMMA code is one of the first large-scale applications to be parallelized using this library, and it is an important demonstration of the cost effectiveness of the BSP approach. We illustrate how BSP cost-modelling techniques can be used to predict and optimize performance for single-source programs across different parallel platforms. We provide predicted and observed performance figures for an industrial-strength, single-source parallel code for a variety of real parallel architectures: shared memory multiprocessors, workstation clusters and massively parallel platforms.
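To make the idea of BSP cost modelling concrete, the sketch below uses the standard BSP cost of a superstep, w + h*g + l, where w is the maximum local computation on any processor, h is the maximum number of words any processor sends or receives, g is the machine's communication cost per word, and l is its barrier synchronization cost. The function names, the superstep profile, and the machine parameters are made-up values for illustration; they are not figures from the paper or from the Oxford BSP Library.

```python
def superstep_cost(w, h, g, l):
    """Standard BSP cost of one superstep: computation + communication + barrier."""
    return w + h * g + l


def program_cost(supersteps, g, l):
    """Total predicted cost of a program given as a list of (w, h) supersteps."""
    return sum(superstep_cost(w, h, g, l) for w, h in supersteps)


# Hypothetical program profile: (w, h) per superstep, in units of basic operations.
profile = [(100, 10), (50, 5)]

# Hypothetical machines, each described only by its g and l parameters.
machines = {
    "shared_memory": {"g": 2, "l": 20},
    "workstation_cluster": {"g": 30, "l": 500},
}

predicted = {name: program_cost(profile, p["g"], p["l"]) for name, p in machines.items()}
```

Because the program profile is kept separate from the machine parameters, the same single-source program can be costed on each platform by swapping in that platform's g and l.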
Abstract:
Social network analysts have tried to capture the idea of social role explicitly by proposing a framework that precisely gives conditions under which group actors are playing equivalent roles. They term these methods positional analysis techniques. The most general definition is regular equivalence, which captures the idea that equivalent actors are related in a similar way to equivalent alters. Regular equivalence gives rise to a whole class of partitions on a network. Given a network we have two different computational problems. The first is how to find a particular regular equivalence. An algorithm exists to find the largest regular partition, but there are no efficient algorithms to test whether there is a regular k-partition, that is, a partition into k groups that is regular. In addition, when dealing with real data, it is unlikely that any regular partitions exist. To overcome this problem, relaxations of regular equivalence have been proposed along with optimisation techniques to find nearly regular partitions. In this paper we review the algorithms that have been developed to find particular regular equivalences and look at some of the recent theoretical results which give an insight into the complexity of finding regular partitions.
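The definition of regular equivalence can be checked directly for a given partition. The following Python sketch assumes a directed network given as a list of edges and the usual reading of the definition: two actors in the same block must send ties to the same set of blocks and receive ties from the same set of blocks. The function name `is_regular_partition` and the toy network are illustrative assumptions; this verifies a candidate partition and is not one of the search algorithms the paper reviews.

```python
def is_regular_partition(edges, block_of):
    """Return True if `block_of` (node -> block label) is a regular partition.

    `edges` is an iterable of directed (source, target) pairs whose endpoints
    all appear as keys of `block_of`.
    """
    out_blocks = {node: set() for node in block_of}
    in_blocks = {node: set() for node in block_of}
    for source, target in edges:
        out_blocks[source].add(block_of[target])
        in_blocks[target].add(block_of[source])

    # All nodes in one block must share the same (out, in) block signature.
    signature_of_block = {}
    for node, block in block_of.items():
        signature = (frozenset(out_blocks[node]), frozenset(in_blocks[node]))
        if signature_of_block.setdefault(block, signature) != signature:
            return False
    return True


# Toy hierarchy: one manager with ties to two workers.
edges = [("manager", "worker_a"), ("manager", "worker_b")]
roles = {"manager": 0, "worker_a": 1, "worker_b": 1}
mixed = {"manager": 0, "worker_a": 0, "worker_b": 1}

roles_ok = is_regular_partition(edges, roles)
mixed_ok = is_regular_partition(edges, mixed)
```

Here the role-based partition passes the check, while grouping the manager with a worker fails it, because the two actors do not relate to the same blocks.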