983 results for Task-level parallelism
Abstract:
Single-core processors have reached the practical limit of clock-speed scaling, and multicore architectures provide an alternative way forward. Designing decoding applications for these multicore platforms, and optimizing them to exploit all of the system's computational power, is crucial for obtaining the best results. Since development at the integration level of printed circuit boards is increasingly difficult to optimize, because of physical constraints and the inherent increase in power consumption, the development of multiprocessor architectures is becoming the new Holy Grail. It is therefore crucial to develop applications that can run on the new multicore architectures and to find task distributions that maximize use of the system. Today most commercial electronic devices on the market are built from embedded systems, and these devices have recently begun to incorporate multicore processors. Managing tasks across multiple cores or processors is not trivial, and good task/actor scheduling can yield significant gains in efficiency and in processor power consumption. Scheduling the data flows between the actors that implement an application aims to open multicore architectures to more types of applications by expressing parallelism explicitly in the application. In addition, the recent development of the MPEG Reconfigurable Video Coding (RVC) standard allows video decoders to be reconfigured. RVC is a flexible standard compatible with MPEG-developed codecs, making it an ideal tool to integrate into new multimedia terminals for decoding video sequences. With the new versions of the Open RVC-CAL Compiler (Orcc), a static mapping of the actors that implement the application's functionality can be done once the application executable has been generated. This static mapping must be done for each of the cores available on the working platform.
An embedded system with a dual-core ARMv7 processor was chosen. This platform allows us to run the desired tests, measure the improvement over execution on a single core, and contrast both with a PC-based multiprocessor system.
The gains offered by raising the clock frequency of single-processor systems are being exhausted. New multiprocessor architectures provide an alternative path of development. Designing and optimizing video decoding applications to run on these new architectures allows them to be better exploited and yields higher performance. Today many of the commercial devices reaching the market are built from embedded systems, which have recently come to be based on multicore architectures. Managing execution tasks on this kind of architecture is not trivial, and good scheduling of the actors that implement the functionality can bring significant improvements in the efficient use of processor capacity and, consequently, in energy consumption. In addition, the recent development of the Reconfigurable Video Coding (RVC) standard allows video decoders to be reconfigured. RVC is a flexible standard compatible with earlier codecs developed by MPEG. This makes RVC the ideal standard for incorporation into the new multimedia terminals now being marketed. With the development of new versions of the compiler for the RVC-CAL language (Orcc), on which MPEG RVC is based, static mapping of the actors that make up a decoder is possible in multiprocessor environments. An embedded system with a processor with two ARMv7 cores was chosen. This platform will allow us to carry out the verification and comparison tests of the concepts studied in this work, regarding the development of video decoders based on MPEG RVC and the study of their scheduling and static mapping.
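As a rough illustration of what a static actor-to-core mapping involves, the sketch below assigns actors to cores with a greedy "heaviest actor first" heuristic. It is not the procedure used by Orcc or by the work summarized above: the actor names, the cost figures, and the function `map_actors` are all hypothetical.

```python
def map_actors(costs, n_cores):
    """Greedy static mapping: place each actor on the least-loaded core.

    costs   -- dict of actor name -> estimated cost (hypothetical units)
    n_cores -- number of cores on the target platform
    Returns (mapping, loads): actor -> core index, and the load per core.
    """
    loads = [0] * n_cores
    mapping = {}
    # Visit actors from most to least expensive so that large actors
    # are spread out before the small ones fill the gaps.
    for actor, cost in sorted(costs.items(), key=lambda kv: -kv[1]):
        core = loads.index(min(loads))  # first core with the smallest load
        mapping[actor] = core
        loads[core] += cost
    return mapping, loads


# Invented actor names and costs for a two-core platform.
example_costs = {"parser": 4, "idct": 7, "motion_comp": 5, "display": 2}
example_mapping, example_loads = map_actors(example_costs, 2)
```

The mapping is computed once, before execution, which is what makes it static; a real flow would derive the costs from profiling rather than from fixed numbers.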
Abstract:
We present two new algorithms which perform automatic parallelization via source-to-source transformations. The objective is to exploit goal-level, unrestricted independent and-parallelism. The proposed algorithms use as targets new parallel execution primitives which are simpler and more flexible than the well-known &/2 parallel operator. This makes it possible to generate better parallel expressions by exposing more potential parallelism among the literals of a clause than is possible with &/2. The difference between the two algorithms stems from whether the order of the solutions obtained is preserved or not. We also report on a preliminary evaluation of an implementation of our approach. We compare the performance obtained to that of previous annotation algorithms and show that relevant improvements can be obtained.
Abstract:
While logic programming languages offer a great deal of scope for parallelism, there is usually some overhead associated with the execution of goals in parallel because of the work involved in task creation and scheduling. In practice, therefore, the "granularity" of a goal, i.e. an estimate of the work available under it, should be taken into account when deciding whether or not to execute a goal concurrently as a separate task. This paper describes a method for estimating the granularity of a goal at compile time. The runtime overhead associated with our approach is usually quite small, and the performance improvements resulting from the incorporation of grain-size control can be quite good. This is shown by means of experimental results.
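The idea of grain-size control can be sketched as follows: a goal is spawned as a separate task only when its estimated work exceeds a threshold, and is otherwise run inline. This is a minimal sketch rather than the paper's method. The cost function `estimated_work`, the constant `SPAWN_THRESHOLD`, and the helper `run_goals` are assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor

SPAWN_THRESHOLD = 1000  # hypothetical cutoff, in abstract work units


def estimated_work(n):
    """Hypothetical cost function: work grows linearly with input size."""
    return n


def run_goals(sizes, goal):
    """Run goal(n) for every n in sizes.

    Goals whose estimated work reaches the threshold are submitted to a
    thread pool; smaller goals run inline to avoid task-creation overhead.
    Returns (results, spawned), where spawned counts the submitted tasks.
    """
    results = [None] * len(sizes)
    spawned = 0
    with ThreadPoolExecutor() as pool:
        futures = {}
        for i, n in enumerate(sizes):
            if estimated_work(n) >= SPAWN_THRESHOLD:
                futures[i] = pool.submit(goal, n)
                spawned += 1
            else:
                results[i] = goal(n)
        for i, fut in futures.items():
            results[i] = fut.result()
    return results, spawned


def sum_to(n):
    """Example goal: sum of the integers below n."""
    return sum(range(n))


example_results, example_spawned = run_goals([10, 5000, 20, 2000], sum_to)
```

Only the two large goals become tasks here; the small ones cost less to run than to schedule, which is the trade-off the abstract describes.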
Abstract:
This paper presents and proves some fundamental results for independent and-parallelism (IAP). First, the paper treats the issues of correctness and efficiency: after defining strict and non-strict goal independence, it is proved that if strictly independent goals are executed in parallel the solutions obtained are the same as those produced by standard sequential execution. It is also shown that, in the absence of failure, the parallel proof procedure does not generate any additional work (with respect to standard SLD-resolution) while the actual execution time is reduced. The same results hold even if non-strictly independent goals are executed in parallel, provided a trivial rewriting of such goals is performed. In addition, and most importantly, the paper treats the issue of compile-time generation of IAP by proposing conditions, to be written at compile time, to efficiently check strict and non-strict goal independence at run time, and by proving the sufficiency of such conditions. It is also shown how simpler conditions can be constructed if some information regarding the binding context of the goals to be executed in parallel is available to the compiler through either local or program-level analysis. These results therefore provide a formal basis for the automatic compile-time generation of IAP. As a corollary of such results, the paper also proves that negative goals are always non-strictly independent, and that goals which share a first occurrence of an existential variable are never independent.
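A run-time check for strict independence can be sketched as a test that two goals share no unbound variables. The sketch below assumes a toy term representation (tuples for compound terms, a hypothetical `Var` class for logic variables); the names `ground` and `indep` are illustrative and are not taken from the paper or from any particular system.

```python
class Var:
    """A logic variable; binding is None while the variable is unbound."""

    def __init__(self, name):
        self.name = name
        self.binding = None


def deref(term):
    """Follow variable bindings until an unbound variable or a non-variable."""
    while isinstance(term, Var) and term.binding is not None:
        term = term.binding
    return term


def free_vars(term):
    """Return the set of unbound variables occurring in term."""
    term = deref(term)
    if isinstance(term, Var):
        return {term}
    if isinstance(term, tuple):
        found = set()
        for arg in term:
            found |= free_vars(arg)
        return found
    return set()  # atoms and numbers contain no variables


def ground(term):
    """True if term contains no unbound variables."""
    return not free_vars(term)


def indep(goal_a, goal_b):
    """True if the two goals share no unbound variables."""
    return free_vars(goal_a).isdisjoint(free_vars(goal_b))
```

For example, `p(X)` and `q(Y)` pass the check, while `p(X)` and `q(X)` fail it until `X` is bound to a ground term. Compile-time analysis can often remove such checks when it can already prove groundness or the absence of sharing.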
Abstract:
This report addresses speculative parallelism (the assignment of spare processing resources to tasks which are not known to be strictly required for the successful completion of a computation) at the user and application level. At this level, the execution of a program is seen as a (dynamic) tree or, in general, a graph. A solution for a problem is a traversal of this graph from the initial state to a node known to be the answer. Speculative parallelism then represents the assignment of resources to multiple branches of this graph even if they are not positively known to be on the path to a solution. In highly non-deterministic programs the branching factor can be very high and a naive assignment will very soon use up all the resources. This report presents work assignment strategies other than the usual depth-first and breadth-first. Instead, best-first strategies are used. Since their definition is application-dependent, the application language contains primitives that allow the user (or application programmer) to a) indicate when intelligent OR-parallelism should be used; b) provide the functions that define "best"; and c) indicate when to use them. An abstract architecture enables those primitives to perform the search in a "speculative" way, using several processors, synchronizing them, killing the siblings of the path leading to the answer, etc. The user is freed from worrying about these interactions. Several search strategies are proposed and their implementation issues are addressed. "Armageddon," a global pruning method, is introduced, together with both a software and a hardware implementation for it. The concepts presented are applicable to areas of Artificial Intelligence such as extensive expert systems, planning, game playing, and in general to large search problems. The proposed strategies, although showing promise, have not been evaluated by simulation or experimentation.
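A best-first strategy driven by a user-supplied definition of "best" can be sketched sequentially with a priority queue. This is only an illustration of the search order. It does not model the report's primitives, its multi-processor architecture, or the "Armageddon" pruning method. The function `best_first` and its parameters are hypothetical names.

```python
import heapq
from itertools import count


def best_first(start, children, is_answer, score):
    """Expand the node with the lowest score first.

    children(node)  -- returns the successors of node
    is_answer(node) -- True when node is a solution
    score(node)     -- user-supplied definition of "best" (lower is better)
    Returns the first answer found, or None if the frontier is exhausted.
    """
    tie = count()  # tie-breaker so the heap never compares nodes directly
    frontier = [(score(start), next(tie), start)]
    while frontier:
        _, _, node = heapq.heappop(frontier)
        if is_answer(node):
            return node
        for child in children(node):
            heapq.heappush(frontier, (score(child), next(tie), child))
    return None


# Toy search space: from n, move to 2n or 2n + 1, bounded at 100.
found = best_first(
    1,
    children=lambda n: [c for c in (2 * n, 2 * n + 1) if c <= 100],
    is_answer=lambda n: n == 13,
    score=lambda n: abs(13 - n),
)
```

A speculative variant would pop several of the best frontier nodes at once and expand them on different processors, discarding the remaining work once an answer is found.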
Abstract:
A backtracking algorithm for AND-Parallelism and its implementation at the Abstract Machine level are presented: first, a class of AND-Parallelism models based on goal independence is defined, and a generalized version of Restricted AND-Parallelism (RAP) introduced as characteristic of this class. A simple and efficient backtracking algorithm for RAP is then discussed. An implementation scheme is presented for this algorithm which offers minimum overhead, while retaining the performance and storage economy of sequential implementations and taking advantage of goal independence to avoid unnecessary backtracking ("restricted intelligent backtracking"). Finally, the implementation of backtracking in sequential and AND-Parallel systems is explained through a number of examples.
Abstract:
This paper presents some brief considerations on the role of Computational Logic in the construction of Artificial Intelligence systems and in programming in general. It does not address how the many problems in AI can be solved but, rather more modestly, tries to point out some advantages of Computational Logic as a tool for the AI scientist in his quest. It addresses the interaction between declarative and procedural views of programs (deduction and action), the impact of the intrinsic limitations of logic, the relationship with other apparently competing computational paradigms, and finally discusses implementation-related issues, such as the efficiency of current implementations and their capability for efficiently exploiting existing and future sequential and parallel hardware. The purpose of the discussion is in no way to present Computational Logic as the unique overall vehicle for the development of intelligent systems (in the firm belief that such a panacea is yet to be found) but rather to stress its strengths in providing reasonable solutions to several aspects of the task.
Abstract:
Goal-level Independent and-parallelism (IAP) is exploited by scheduling for simultaneous execution two or more goals which will not interfere with each other at run time. This can be done safely even if such goals can produce multiple answers. The most successful IAP implementations to date have used recomputation of answers and sequentially ordered backtracking. While in principle simplifying the implementation, recomputation can be very inefficient if the granularity of the parallel goals is large enough and they produce several answers, while sequentially ordered backtracking limits parallelism. And, despite the expected simplification, the implementation of the classic schemes has proved to involve complex engineering, with the consequent difficulty for system maintenance and expansion, and these schemes still frequently run into the well-known trapped goal and garbage slot problems. This work presents ideas about an alternative parallel backtracking model for IAP and a simulation study. The model features parallel out-of-order backtracking and relies on answer memoization to reuse and combine answers. Whenever a parallel goal backtracks, its siblings also perform backtracking, but after storing the bindings generated by previous answers. The bindings are then reinstalled when combining answers. In order not to unnecessarily penalize forward execution, non-speculative and-parallel goals which have not been executed yet take precedence over sibling goals which could be backtracked over. Using a simulator, we show that this approach can bring significant performance advantages over classical approaches.
Abstract:
We present new algorithms which perform automatic parallelization via source-to-source transformations. The objective is to exploit goal-level, unrestricted independent and-parallelism. The proposed algorithms use as targets new parallel execution primitives which are simpler and more flexible than the well-known &/2 parallel operator, which makes it possible to generate better parallel expressions by exposing more potential parallelism among the literals of a clause than is possible with &/2. The main differences between the algorithms stem from whether the order of the solutions obtained is preserved or not, and from the use of determinacy information. We briefly describe the environment where the algorithms have been implemented and the runtime platform in which the parallelized programs are executed. We also report on an evaluation of an implementation of our approach. We compare the performance obtained to that of previous annotation algorithms and show that relevant improvements can be obtained.
Abstract:
Defining an agent architecture at the knowledge level emphasizes the knowledge role played by the data interchanged between the agent components and makes this data interchange explicit. This makes it easier to reuse these knowledge structures independently of the implementation. This article defines a generic task model of an agent architecture and refines some of these tasks using the interference diagrams. Finally, an operationalisation of this conceptual model using the rule-oriented language Jess is shown.
Abstract:
This thesis presents a task-oriented approach to telemanipulation for maintenance in large scientific facilities, with specific focus on the particle accelerator facilities at the European Organization for Nuclear Research (CERN) in Geneva, Switzerland and the GSI Helmholtz Centre for Heavy Ion Research (GSI) in Darmstadt, Germany. It examines how telemanipulation can be used in these facilities and reviews how this differs from the representation of telemanipulation tasks within the literature. It provides methods to assess and compare telemanipulation procedures as well as a test suite to compare telemanipulators themselves from a dexterity perspective. It presents a formalisation of telemanipulation procedures into a hierarchical model which can then be used as a basis to aid maintenance engineers in assessing tasks for telemanipulation, and as the basis for future research. The model introduces a new concept of Elemental Actions as the building block of telemanipulation movements and incorporates the dependent factors for procedures at a higher level of abstraction. In order to gain insight into realistic tasks performed by telemanipulation systems within both industrial and research environments, a survey of teleoperation experts is presented. Analysis of the responses leads to the conclusion that there is a need within the robotics community for physical benchmarking tests geared towards evaluating and comparing the dexterity of telemanipulators. A three-stage test suite is presented which is designed to allow maintenance engineers to assess different telemanipulators for their dexterity. This incorporates general characteristics of the system, a method to compare kinematic reachability of multiple telemanipulators, and physical test setups to assess dexterity both qualitatively and measurably by using performance metrics.
Finally, experimental results are provided for the application of the proposed test suite to two telemanipulation systems, one from a research setting and the other within CERN. The thesis describes the procedure performed and discusses comparisons between the two systems, as well as providing input from the expert operator of the CERN system.
Abstract:
Adjusting N fertilizer application to crop requirements is a key issue to improve fertilizer efficiency, reducing unnecessary input costs to farmers and N environmental impact. Among the multiple soil and crop tests developed, optical sensors that detect crop N nutritional status may have a large potential to adjust N fertilizer recommendation (Samborski et al. 2009). Optical readings are rapid to take and non-destructive, and they can be efficiently processed and combined to obtain indexes or indicators of crop status. However, other physiological stress conditions may interfere with the readings, and detecting the best crop nutritional status indicators is not always an easy task. Comparison of different equipment and technologies might help to identify strengths and weaknesses of the application of optical sensors for N fertilizer recommendation. The aim of this study was to evaluate the potential of various ground-level optical sensors and narrow-band indices obtained from airborne hyperspectral images as tools for maize N fertilizer recommendations. Specific objectives were i) to determine which indices could detect differences in maize plants treated with different N fertilizer rates, and ii) to evaluate their ability to distinguish N-responsive from non-responsive sites.
Abstract:
The effects of practice on the functional anatomy observed in two different tasks, a verbal and a motor task, are reviewed in this paper. In the first, people practiced a verbal production task, generating an appropriate verb in response to a visually presented noun. Both practiced and unpracticed conditions utilized common regions such as visual and motor cortex. However, there was a set of regions that was affected by practice. Practice produced a shift in activity from left frontal, anterior cingulate, and right cerebellar hemisphere to activity in Sylvian-insular cortex. Similar changes were also observed in the second task, a task in a very different domain, namely the tracing of a maze. Some areas were significantly more activated during initial unskilled performance (right premotor and parietal cortex and left cerebellar hemisphere); a different region (medial frontal cortex, “supplementary motor area”) showed greater activity during skilled performance conditions. Activations were also found in regions that most likely control movement execution irrespective of skill level (e.g., primary motor cortex was related to velocity of movement). One way of interpreting these results is in a “scaffolding-storage” framework. For unskilled, effortful performance, a scaffolding set of regions is used to cope with novel task demands. Following practice, a different set of regions is used, possibly representing storage of particular associations or capabilities that allow for skilled performance. The specific regions used for scaffolding and storage appear to be task dependent.
Abstract:
The specificity of the improvement in perceptual learning is often used to localize the neuronal changes underlying this type of adult plasticity. We investigated a visual texture discrimination task previously reported to be accomplished preattentively and for which learning-related changes were inferred to occur at a very early level of the visual processing stream. The stimulus was a matrix of lines from which a target popped out, due to an orientation difference between the three target lines and the background lines. The task was to report the global orientation of the target and was performed monocularly. The subjects' performance improved dramatically with training over the course of 2-3 weeks, after which we tested the specificity of the improvement for the eye trained. In all subjects tested, there was complete interocular transfer of the learning effect. The neuronal correlates of this learning are therefore most likely localized in a visual area where input from the two eyes has come together.
Abstract:
Three studies investigated the relation between symbolic gestures and words, aiming to discover the neural basis and behavioural features of the lexical-semantic processing and integration of the two communicative signals. The first study aimed at determining whether elaboration of communicative signals (symbolic gestures and words) is always accompanied by integration with each other and whether, if present, this integration can be considered to support the existence of a shared control mechanism. Experiment 1 aimed at determining whether and how gesture is integrated with word. Participants were given a semantic priming paradigm with a lexical decision task and pronounced a target word, which was preceded by a meaningful or meaningless prime gesture. When meaningful, the gesture could be either congruent or incongruent with word meaning. Duration of prime presentation (100, 250, 400 ms) varied randomly. Voice spectra, lip kinematics, and time to response were recorded and analyzed. Formant 1 of voice spectra and mean velocity in lip kinematics increased when the prime was meaningful and congruent with the word, as compared to meaningless gesture. In other words, parameters of voice and movement were magnified by congruence, but this occurred only when prime duration was 250 ms. Time to response to meaningful gesture was shorter in the condition of congruence compared to incongruence. Experiment 2 aimed at determining whether the mechanism of integration of a prime word with a target word is similar to that of a prime gesture with a target word. Formant 1 of the target word increased when the word prime was meaningful and congruent, as compared to meaningless congruent prime. The increase was, however, present for every prime word duration. In the second study, experiment 3 aimed at determining whether symbolic prime gesture comprehension makes use of motor simulation.
Transcranial Magnetic Stimulation was delivered to left primary motor cortex 100, 250, 500 ms after prime gesture presentation. Motor Evoked Potential of First Dorsal Interosseus increased when stimulation occurred 100 ms post-stimulus. Thus, gesture was understood within 100 ms and integrated with the target word within 250 ms. Experiment 4 excluded any hand motor simulation in the comprehension of the prime word. The effect of the prior presentation of a symbolic gesture on congruent target word processing was investigated in study 3. In experiment 5, symbolic gestures were presented as primes, followed by semantically congruent target words or pseudowords. In this case, lexical-semantic decision was accompanied by a motor simulation at 100 ms after the onset of the verbal stimuli. Summing up, the same type of integration with a word was present for both prime gesture and word. It was probably subsequent to understanding of the signal, which used motor simulation for gesture and direct access to semantics for words. However, gesture and words could be understood at the same motor level through simulation if words were preceded by an adequate gestural context. Results are discussed from the perspective of a continuum between transitive actions and emblems, in parallel with language; the grounded/symbolic content of the different signals is evidence of a relation between sensorimotor and linguistic systems, which could interact at different levels.