952 resultados para Run-Time Code Generation, Programming Languages, Object-Oriented Programming


Relevância:

40.00% 40.00%

Publicador:

Resumo:

The process of developing software that takes advantage of multiple processors is commonly referred to as parallel programming. For various reasons, this process is much harder than the sequential case. For decades, parallel programming has been a problem for a small niche only: engineers working on parallelizing mostly numerical applications in High Performance Computing. This has changed with the advent of multi-core processors in mainstream computer architectures. Parallel programming in our days becomes a problem for a much larger group of developers. The main objective of this thesis was to find ways to make parallel programming easier for them. Different aims were identified in order to reach the objective: research the state of the art of parallel programming today, improve the education of software developers about the topic, and provide programmers with powerful abstractions to make their work easier. To reach these aims, several key steps were taken. To start with, a survey was conducted among parallel programmers to find out about the state of the art. More than 250 people participated, yielding results about the parallel programming systems and languages in use, as well as about common problems with these systems. Furthermore, a study was conducted in university classes on parallel programming. It resulted in a list of frequently made mistakes that were analyzed and used to create a programmers' checklist to avoid them in the future. For programmers' education, an online resource was setup to collect experiences and knowledge in the field of parallel programming - called the Parawiki. Another key step in this direction was the creation of the Thinking Parallel weblog, where more than 50.000 readers to date have read essays on the topic. For the third aim (powerful abstractions), it was decided to concentrate on one parallel programming system: OpenMP. Its ease of use and high level of abstraction were the most important reasons for this decision. Two different research directions were pursued. The first one resulted in a parallel library called AthenaMP. It contains so-called generic components, derived from design patterns for parallel programming. These include functionality to enhance the locks provided by OpenMP, to perform operations on large amounts of data (data-parallel programming), and to enable the implementation of irregular algorithms using task pools. AthenaMP itself serves a triple role: the components are well-documented and can be used directly in programs, it enables developers to study the source code and learn from it, and it is possible for compiler writers to use it as a testing ground for their OpenMP compilers. The second research direction was targeted at changing the OpenMP specification to make the system more powerful. The main contributions here were a proposal to enable thread-cancellation and a proposal to avoid busy waiting. Both were implemented in a research compiler, shown to be useful in example applications, and proposed to the OpenMP Language Committee.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

The furious pace of Moore's Law is driving computer architecture into a realm where the the speed of light is the dominant factor in system latencies. The number of clock cycles to span a chip are increasing, while the number of bits that can be accessed within a clock cycle is decreasing. Hence, it is becoming more difficult to hide latency. One alternative solution is to reduce latency by migrating threads and data, but the overhead of existing implementations has previously made migration an unserviceable solution so far. I present an architecture, implementation, and mechanisms that reduces the overhead of migration to the point where migration is a viable supplement to other latency hiding mechanisms, such as multithreading. The architecture is abstract, and presents programmers with a simple, uniform fine-grained multithreaded parallel programming model with implicit memory management. In other words, the spatial nature and implementation details (such as the number of processors) of a parallel machine are entirely hidden from the programmer. Compiler writers are encouraged to devise programming languages for the machine that guide a programmer to express their ideas in terms of objects, since objects exhibit an inherent physical locality of data and code. The machine implementation can then leverage this locality to automatically distribute data and threads across the physical machine by using a set of high performance migration mechanisms. An implementation of this architecture could migrate a null thread in 66 cycles -- over a factor of 1000 improvement over previous work. Performance also scales well; the time required to move a typical thread is only 4 to 5 times that of a null thread. Data migration performance is similar, and scales linearly with data block size. Since the performance of the migration mechanism is on par with that of an L2 cache, the implementation simulated in my work has no data caches and relies instead on multithreading and the migration mechanism to hide and reduce access latencies.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

How we get from transistors through to logic gates to ALUs and memory to the stored program and the fetch execute cycle through to machine code and high level languages. Inspired by Tanenbaum's approach in "Structured Computer Organozation"

Relevância:

40.00% 40.00%

Publicador:

Resumo:

The classical computer vision methods can only weakly emulate some of the multi-level parallelisms in signal processing and information sharing that takes place in different parts of the primates’ visual system thus enabling it to accomplish many diverse functions of visual perception. One of the main functions of the primates’ vision is to detect and recognise objects in natural scenes despite all the linear and non-linear variations of the objects and their environment. The superior performance of the primates’ visual system compared to what machine vision systems have been able to achieve to date, motivates scientists and researchers to further explore this area in pursuit of more efficient vision systems inspired by natural models. In this paper building blocks for a hierarchical efficient object recognition model are proposed. Incorporating the attention-based processing would lead to a system that will process the visual data in a non-linear way focusing only on the regions of interest and hence reducing the time to achieve real-time performance. Further, it is suggested to modify the visual cortex model for recognizing objects by adding non-linearities in the ventral path consistent with earlier discoveries as reported by researchers in the neuro-physiology of vision.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Several non-orthogonal space-time block coding (NO-STBC) schemes have recently been proposed to achieve full rate transmission. Some of these schemes, however, suffer from weak robustness: their channel matrices will become ill conditioned in the case of highly correlated channels (HCC). To address this issue, this paper derives a family of robust NO-STBC schemes for four Tx antennas based on the worst case of HCC. These codes turned out to be a superset of Jafarkhani's quasi-orthogonal STBC codes. A computationally affordable linear decoder is also proposed. Although these codes achieve a similar performance to the non-robust schemes under normal channel conditions, they offer a strong robustness against HCC (although possibly yielding a poorer performance). Finally, computer simulations are presented to verify the algorithm design.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

A new generation of advanced surveillance systems is being conceived as a collection of multi-sensor components such as video, audio and mobile robots interacting in a cooperating manner to enhance situation awareness capabilities to assist surveillance personnel. The prominent issues that these systems face are: the improvement of existing intelligent video surveillance systems, the inclusion of wireless networks, the use of low power sensors, the design architecture, the communication between different components, the fusion of data emerging from different type of sensors, the location of personnel (providers and consumers) and the scalability of the system. This paper focuses on the aspects pertaining to real-time distributed architecture and scalability. For example, to meet real-time requirements, these systems need to process data streams in concurrent environments, designed by taking into account scheduling and synchronisation. The paper proposes a framework for the design of visual surveillance systems based on components derived from the principles of Real Time Networks/Data Oriented Requirements Implementation Scheme (RTN/DORIS). It also proposes the implementation of these components using the well-known middleware technology Common Object Request Broker Architecture (CORBA). Results using this architecture for video surveillance are presented through an implemented prototype.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

The existence of hand-centred visual processing has long been established in the macaque premotor cortex. These hand-centred mechanisms have been thought to play some general role in the sensory guidance of movements towards objects, or, more recently, in the sensory guidance of object avoidance movements. We suggest that these hand-centred mechanisms play a specific and prominent role in the rapid selection and control of manual actions following sudden changes in the properties of the objects relevant for hand-object interactions. We discuss recent anatomical and physiological evidence from human and non-human primates, which indicates the existence of rapid processing of visual information for hand-object interactions. This new evidence demonstrates how several stages of the hierarchical visual processing system may be bypassed, feeding the motor system with hand-related visual inputs within just 70 ms following a sudden event. This time window is early enough, and this processing rapid enough, to allow the generation and control of rapid hand-centred avoidance and acquisitive actions, for aversive and desired objects, respectively