999 resultados para Architecture, Romanesque
Resumo:
The furious pace of Moore's Law is driving computer architecture into a realm where the the speed of light is the dominant factor in system latencies. The number of clock cycles to span a chip are increasing, while the number of bits that can be accessed within a clock cycle is decreasing. Hence, it is becoming more difficult to hide latency. One alternative solution is to reduce latency by migrating threads and data, but the overhead of existing implementations has previously made migration an unserviceable solution so far. I present an architecture, implementation, and mechanisms that reduces the overhead of migration to the point where migration is a viable supplement to other latency hiding mechanisms, such as multithreading. The architecture is abstract, and presents programmers with a simple, uniform fine-grained multithreaded parallel programming model with implicit memory management. In other words, the spatial nature and implementation details (such as the number of processors) of a parallel machine are entirely hidden from the programmer. Compiler writers are encouraged to devise programming languages for the machine that guide a programmer to express their ideas in terms of objects, since objects exhibit an inherent physical locality of data and code. The machine implementation can then leverage this locality to automatically distribute data and threads across the physical machine by using a set of high performance migration mechanisms. An implementation of this architecture could migrate a null thread in 66 cycles -- over a factor of 1000 improvement over previous work. Performance also scales well; the time required to move a typical thread is only 4 to 5 times that of a null thread. Data migration performance is similar, and scales linearly with data block size. Since the performance of the migration mechanism is on par with that of an L2 cache, the implementation simulated in my work has no data caches and relies instead on multithreading and the migration mechanism to hide and reduce access latencies.
Resumo:
The Space Systems, Policy and Architecture Research Consortium (SSPARC) was formed to make substantial progress on problems of national importance. The goals of SSPARC were to: • Provide technologies and methods that will allow the creation of flexible, upgradable space systems, • Create a “clean sheet” approach to space systems architecture determination and design, including the incorporation of risk, uncertainty, and flexibility issues, and • Consider the impact of national space policy on the above. This report covers the last two goals, and demonstrates that the effort was largely successful.
Resumo:
We consider the often-studied problem of sorting, for a parallel computer. Given an input array distributed evenly over p processors, the task is to compute the sorted output array, also distributed over the p processors. Many existing algorithms take the approach of approximately load-balancing the output, leaving each processor with Θ(n/p) elements. However, in many cases, approximate load-balancing leads to inefficiencies in both the sorting itself and in further uses of the data after sorting. We provide a deterministic parallel sorting algorithm that uses parallel selection to produce any output distribution exactly, particularly one that is perfectly load-balanced. Furthermore, when using a comparison sort, this algorithm is 1-optimal in both computation and communication. We provide an empirical study that illustrates the efficiency of exact data splitting, and shows an improvement over two sample sort algorithms.
Resumo:
It is well known that image processing requires a huge amount of computation, mainly at low level processing where the algorithms are dealing with a great number of data-pixel. One of the solutions to estimate motions involves detection of the correspondences between two images. For normalised correlation criteria, previous experiments shown that the result is not altered in presence of nonuniform illumination. Usually, hardware for motion estimation has been limited to simple correlation criteria. The main goal of this paper is to propose a VLSI architecture for motion estimation using a matching criteria more complex than Sum of Absolute Differences (SAD) criteria. Today hardware devices provide many facilities for the integration of more and more complex designs as well as the possibility to easily communicate with general purpose processors
Resumo:
This paper proposes a parallel architecture for estimation of the motion of an underwater robot. It is well known that image processing requires a huge amount of computation, mainly at low-level processing where the algorithms are dealing with a great number of data. In a motion estimation algorithm, correspondences between two images have to be solved at the low level. In the underwater imaging, normalised correlation can be a solution in the presence of non-uniform illumination. Due to its regular processing scheme, parallel implementation of the correspondence problem can be an adequate approach to reduce the computation time. Taking into consideration the complexity of the normalised correlation criteria, a new approach using parallel organisation of every processor from the architecture is proposed
Resumo:
This paper surveys control architectures proposed in the literature and describes a control architecture that is being developed for a semi-autonomous underwater vehicle for intervention missions (SAUVIM) at the University of Hawaii. Conceived as hybrid, this architecture has been organized in three layers: planning, control and execution. The mission is planned with a sequence of subgoals. Each subgoal has a related task supervisor responsible for arranging a set of pre-programmed task modules in order to achieve the subgoal. Task modules are the key concept of the architecture. They are the main building blocks and can be dynamically re-arranged by the task supervisor. In our architecture, deliberation takes place at the planning layer while reaction is dealt through the parallel execution of the task modules. Hence, the system presents both a hierarchical and an heterarchical decomposition, being able to show a predictable response while keeping rapid reactivity to the dynamic environment
Resumo:
All-optical label swapping (AOLS) forms a key technology towards the implementation of all-optical packet switching nodes (AOPS) for the future optical Internet. The capital expenditures of the deployment of AOLS increases with the size of the label spaces (i.e. the number of used labels), since a special optical device is needed for each recognized label on every node. Label space sizes are affected by the way in which demands are routed. For instance, while shortest-path routing leads to the usage of fewer labels but high link utilization, minimum interference routing leads to the opposite. This paper studies all-optical label stacking (AOLStack), which is an extension of the AOLS architecture. AOLStack aims at reducing label spaces while easing the compromise with link utilization. In this paper, an integer lineal program is proposed with the objective of analyzing the softening of the aforementioned trade-off due to AOLStack. Furthermore, a heuristic aiming at finding good solutions in polynomial-time is proposed as well. Simulation results show that AOLStack either a) reduces the label spaces with a low increase in the link utilization or, similarly, b) uses better the residual bandwidth to decrease the number of labels even more
Resumo:
The Introductory Lecture is a discussion about "What is the Web". It involves lots of calling out TLAs and writing them on the blackboard, dividing things into servers, clients, protocols, formats, and the punchline is that the one unique and novel thing about the web is the hypertext link. This follows naturally into the Web architecture - the answer to the question "what is the web".
Resumo:
Plain Text - ASCII, Unicode, UTF-8 Content Formats - XML-based formats (RSS, MathML, SVG, Office) + PDF Text based data formats: CSV, JSON
Resumo:
A collection of notes and white papers regarding Enterprise Architecture
Resumo:
Wednesday 23rd April 2014 Speaker(s): Willi Hasselbring Organiser: Leslie Carr Time: 23/04/2014 14:00-15:00 Location: B32/3077 File size: 802Mb Abstract The internal behavior of large-scale software systems cannot be determined on the basis of static (e.g., source code) analysis alone. Kieker provides complementary dynamic analysis capabilities, i.e., monitoring/profiling and analyzing a software system's runtime behavior. Application Performance Monitoring is concerned with continuously observing a software system's performance-specific runtime behavior, including analyses like assessing service level compliance or detecting and diagnosing performance problems. Architecture Discovery is concerned with extracting architectural information from an existing software system, including both structural and behavioral aspects like identifying architectural entities (e.g., components and classes) and their interactions (e.g., local or remote procedure calls). In addition to the Architecture Discovery of Java systems, Kieker supports Architecture Discovery for other platforms, including legacy systems, for instance, inplemented in C#, C++, Visual Basic 6, COBOL or Perl. Thanks to Kieker's extensible architecture it is easy to implement and use custom extensions and plugins. Kieker was designed for continuous monitoring in production systems inducing only a very low overhead, which has been evaluated in extensive benchmark experiments. Please, refer to http://kieker-monitoring.net/ for more information.