Biblioteca Digital

882 resultados para parallel processing systems

An autonomic approach in swarm-array computing

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Can autonomic computing concepts be applied to traditional multi-core systems found in high performance computing environments? In this paper, we propose a novel synergy between parallel computing and swarm robotics to offer a new computing paradigm, `Swarm-Array Computing' that can harness and apply autonomic computing for parallel computing systems. One approach among three proposed approaches in swarm-array computing based on landscapes of intelligent cores, in which the cores of a parallel computing system are abstracted to swarm agents, is investigated. A task gets executed and transferred seamlessly between cores in the proposed approach thereby achieving self-ware properties that characterize autonomic computing. FPGAs are considered as an experimental platform taking into account its application in space robotics. The feasibility of the proposed approach is validated on the SeSAm multi-agent simulator.

Automatic artefact removal from event-related potentials via clustering

Relevância:

80.00% 80.00%

Publicador:

Resumo:

This paper outlines a method for automatic artefact removal from multichannel recordings of event-related potentials (ERPs). The proposed method is based on, firstly, separation of the ERP recordings into independent components using the method of temporal decorrelation source separation (TDSEP). Secondly, the novel lagged auto-mutual information clustering (LAMIC) algorithm is used to cluster the estimated components, together with ocular reference signals, into clusters corresponding to cerebral and non-cerebral activity. Thirdly, the components in the cluster which contains the ocular reference signals are discarded. The remaining components are then recombined to reconstruct the clean ERPs.

Towards self-ware via swarm-array computing

Relevância:

80.00% 80.00%

Publicador:

Resumo:

The work reported in this paper proposes Swarm-Array computing, a novel technique inspired by swarm robotics, and built on the foundations of autonomic and parallel computing. The approach aims to apply autonomic computing constructs to parallel computing systems and in effect achieve the self-ware objectives that describe self-managing systems. The constitution of swarm-array computing comprising four constituents, namely the computing system, the problem/task, the swarm and the landscape is considered. Approaches that bind these constituents together are proposed. Space applications employing FPGAs are identified as a potential area for applying swarm-array computing for building reliable systems. The feasibility of a proposed approach is validated on the SeSAm multi-agent simulator and landscapes are generated using the MATLAB toolkit.

Achieving intelligent agents and its feasibility in swarm-array computing

Relevância:

80.00% 80.00%

Publicador:

Resumo:

The work reported in this paper proposes ‘Intelligent Agents’, a Swarm-Array computing approach focused to apply autonomic computing concepts to parallel computing systems and build reliable systems for space applications. Swarm-array computing is a robotics a swarm robotics inspired novel computing approach considered as a path to achieve autonomy in parallel computing systems. In the intelligent agent approach, a task to be executed on parallel computing cores is considered as a swarm of autonomous agents. A task is carried to a computing core by carrier agents and can be seamlessly transferred between cores in the event of a predicted failure, thereby achieving self-* objectives of autonomic computing. The approach is validated on a multi-agent simulator.

Multicore-enabling the MPJ express messaging library

Relevância:

80.00% 80.00%

Publicador:

Resumo:

With the transition to multicore processors almost complete, the parallel processing community is seeking efficient ways to port legacy message passing applications on shared memory and multicore processors. MPJ Express is our reference implementation of Message Passing Interface (MPI)-like bindings for the Java language. Starting with the current release, the MPJ Express software can be configured in two modes: the multicore and the cluster mode. In the multicore mode, parallel Java applications execute on shared memory or multicore processors. In the cluster mode, Java applications parallelized using MPJ Express can be executed on distributed memory platforms like compute clusters and clouds. The multicore device has been implemented using Java threads in order to satisfy two main design goals of portability and performance. We also discuss the challenges of integrating the multicore device in the MPJ Express software. This turned out to be a challenging task because the parallel application executes in a single JVM in the multicore mode. On the contrary in the cluster mode, the parallel user application executes in multiple JVMs. Due to these inherent architectural differences between the two modes, the MPJ Express runtime is modified to ensure correct semantics of the parallel program. Towards the end, we compare performance of MPJ Express (multicore mode) with other C and Java message passing libraries---including mpiJava, MPJ/Ibis, MPICH2, MPJ Express (cluster mode)---on shared memory and multicore processors. We found out that MPJ Express performs signicantly better in the multicore mode than in the cluster mode. Not only this but the MPJ Express software also performs better in comparison to other Java messaging libraries including mpiJava and MPJ/Ibis when used in the multicore mode on shared memory or multicore processors. We also demonstrate effectiveness of the MPJ Express multicore device in Gadget-2, which is a massively parallel astrophysics N-body siimulation code.

An MPI-based implementation of intelligent agents on clusters

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Recent research in multi-agent systems incorporate fault tolerance concepts, but does not explore the extension and implementation of such ideas for large scale parallel computing systems. The work reported in this paper investigates a swarm array computing approach, namely 'Intelligent Agents'. A task to be executed on a parallel computing system is decomposed to sub-tasks and mapped onto agents that traverse an abstracted hardware layer. The agents intercommunicate across processors to share information during the event of a predicted core/processor failure and for successfully completing the task. The feasibility of the approach is validated by implementation of a parallel reduction algorithm on a computer cluster using the Message Passing Interface.

Intelligent agents for fault tolerance: from multi-agent simulation to cluster-based implementation

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Recent research in multi-agent systems incorporate fault tolerance concepts, but does not explore the extension and implementation of such ideas for large scale parallel computing systems. The work reported in this paper investigates a swarm array computing approach, namely 'Intelligent Agents'. A task to be executed on a parallel computing system is decomposed to sub-tasks and mapped onto agents that traverse an abstracted hardware layer. The agents intercommunicate across processors to share information during the event of a predicted core/processor failure and for successfully completing the task. The feasibility of the approach is validated by simulations on an FPGA using a multi-agent simulator, and implementation of a parallel reduction algorithm on a computer cluster using the Message Passing Interface.

System identification using partitioned least squares

Relevância:

80.00% 80.00%

Publicador:

Resumo:

A novel partitioned least squares (PLS) algorithm is presented, in which estimates from several simple system models are combined by means of a Bayesian methodology of pooling partial knowledge. The method has the added advantage that, when the simple models are of a similar structure, it lends itself directly to parallel processing procedures, thereby speeding up the entire parameter estimation process by several factors.

Dealing with complexity: an overview of the neural network approach

Relevância:

80.00% 80.00%

Publicador:

Resumo:

The problem of complexity is particularly relevant to the field of control engineering, since many engineering problems are inherently complex. The inherent complexity is such that straightforward computational problem solutions often produce very poor results. Although parallel processing can alleviate the problem to some extent, it is artificial neural networks (in various forms) which have recently proved particularly effective, even in dealing with the causes of the problem itself. This paper presents an overview of the current neural network research being undertaken. Such research aims to solve the complex problems found in many areas of science and engineering today.

Hyperresolution global land surface modeling: meeting a grand challenge for monitoring Earth's terrestrial water

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Monitoring Earth's terrestrial water conditions is critically important to many hydrological applications such as global food production; assessing water resources sustainability; and flood, drought, and climate change prediction. These needs have motivated the development of pilot monitoring and prediction systems for terrestrial hydrologic and vegetative states, but to date only at the rather coarse spatial resolutions (∼10–100 km) over continental to global domains. Adequately addressing critical water cycle science questions and applications requires systems that are implemented globally at much higher resolutions, on the order of 1 km, resolutions referred to as hyperresolution in the context of global land surface models. This opinion paper sets forth the needs and benefits for a system that would monitor and predict the Earth's terrestrial water, energy, and biogeochemical cycles. We discuss six major challenges in developing a system: improved representation of surface‐subsurface interactions due to fine‐scale topography and vegetation; improved representation of land‐atmospheric interactions and resulting spatial information on soil moisture and evapotranspiration; inclusion of water quality as part of the biogeochemical cycle; representation of human impacts from water management; utilizing massively parallel computer systems and recent computational advances in solving hyperresolution models that will have up to 109 unknowns; and developing the required in situ and remote sensing global data sets. We deem the development of a global hyperresolution model for monitoring the terrestrial water, energy, and biogeochemical cycles a “grand challenge” to the community, and we call upon the international hydrologic community and the hydrological science support infrastructure to endorse the effort.

Bollène-2002 experiment: radar quantitative precipitation estimation in the Cévennes–Vivarais region, France

Relevância:

80.00% 80.00%

Publicador:

Resumo:

The Bollène-2002 Experiment was aimed at developing the use of a radar volume-scanning strategy for conducting radar rainfall estimations in the mountainous regions of France. A developmental radar processing system, called Traitements Régionalisés et Adaptatifs de Données Radar pour l’Hydrologie (Regionalized and Adaptive Radar Data Processing for Hydrological Applications), has been built and several algorithms were specifically produced as part of this project. These algorithms include 1) a clutter identification technique based on the pulse-to-pulse variability of reflectivity Z for noncoherent radar, 2) a coupled procedure for determining a rain partition between convective and widespread rainfall R and the associated normalized vertical profiles of reflectivity, and 3) a method for calculating reflectivity at ground level from reflectivities measured aloft. Several radar processing strategies, including nonadaptive, time-adaptive, and space–time-adaptive variants, have been implemented to assess the performance of these new algorithms. Reference rainfall data were derived from a careful analysis of rain gauge datasets furnished by the Cévennes–Vivarais Mediterranean Hydrometeorological Observatory. The assessment criteria for five intense and long-lasting Mediterranean rain events have proven that good quantitative precipitation estimates can be obtained from radar data alone within 100-km range by using well-sited, well-maintained radar systems and sophisticated, physically based data-processing systems. The basic requirements entail performing accurate electronic calibration and stability verification, determining the radar detection domain, achieving efficient clutter elimination, and capturing the vertical structure(s) of reflectivity for the target event. Radar performance was shown to depend on type of rainfall, with better results obtained with deep convective rain systems (Nash coefficients of roughly 0.90 for point radar–rain gauge comparisons at the event time step), as opposed to shallow convective and frontal rain systems (Nash coefficients in the 0.6–0.8 range). In comparison with time-adaptive strategies, the space–time-adaptive strategy yields a very significant reduction in the radar–rain gauge bias while the level of scatter remains basically unchanged. Because the Z–R relationships have not been optimized in this study, results are attributed to an improved processing of spatial variations in the vertical profile of reflectivity. The two main recommendations for future work consist of adapting the rain separation method for radar network operations and documenting Z–R relationships conditional on rainfall type.

EMD performance comparison: single vs double floating points

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Empirical mode decomposition (EMD) is a data-driven method used to decompose data into oscillatory components. This paper examines to what extent the defined algorithm for EMD might be susceptible to data format. Two key issues with EMD are its stability and computational speed. This paper shows that for a given signal there is no significant difference between results obtained with single (binary32) and double (binary64) floating points precision. This implies that there is no benefit in increasing floating point precision when performing EMD on devices optimised for single floating point format, such as graphical processing units (GPUs).

A Floating-point Extended Kalman Filter Implementation for Autonomous Mobile Robots

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Localization and Mapping are two of the most important capabilities for autonomous mobile robots and have been receiving considerable attention from the scientific computing community over the last 10 years. One of the most efficient methods to address these problems is based on the use of the Extended Kalman Filter (EKF). The EKF simultaneously estimates a model of the environment (map) and the position of the robot based on odometric and exteroceptive sensor information. As this algorithm demands a considerable amount of computation, it is usually executed on high end PCs coupled to the robot. In this work we present an FPGA-based architecture for the EKF algorithm that is capable of processing two-dimensional maps containing up to 1.8 k features at real time (14 Hz), a three-fold improvement over a Pentium M 1.6 GHz, and a 13-fold improvement over an ARM920T 200 MHz. The proposed architecture also consumes only 1.3% of the Pentium and 12.3% of the ARM energy per feature.

Automated Negotiation Between Publishers And Consumers Of Grid Notifications

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Notification Services mediate between information publishers and consumers that wish to subscribe to periodic updates. In many cases, however, there is a mismatch between the dissemination of these updates and the delivery preferences of the consumer, often in terms of frequency of delivery, quality, etc. In this paper, we present an automated negotiation engine that identifies mutually acceptable terms; we study its performance, and discuss its application to a Grid Notification Service. We also demonstrate how the negotiation engine enables users to control the Quality of Service levels they require.

RST: Reuse through Speculation on Traces

Relevância:

80.00% 80.00%

Publicador:

Resumo:

In this thesis, we present a novel approach to combine both reuse and prediction of dynamic sequences of instructions called Reuse through Speculation on Traces (RST). Our technique allows the dynamic identification of instruction traces that are redundant or predictable, and the reuse (speculative or not) of these traces. RST addresses the issue, present on Dynamic Trace Memoization (DTM), of traces not being reused because some of their inputs are not ready for the reuse test. These traces were measured to be 69% of all reusable traces in previous studies. One of the main advantages of RST over just combining a value prediction technique with an unrelated reuse technique is that RST does not require extra tables to store the values to be predicted. Applying reuse and value prediction in unrelated mechanisms but at the same time may require a prohibitive amount of storage in tables. In RST, the values are already stored in the Trace Memoization Table, and there is no extra cost in reading them if compared with a non-speculative trace reuse technique. . The input context of each trace (the input values of all instructions in the trace) already stores the values for the reuse test, which may also be used for prediction. Our main contributions include: (i) a speculative trace reuse framework that can be adapted to different processor architectures; (ii) specification of the modifications in a superscalar, superpipelined processor in order to implement our mechanism; (iii) study of implementation issues related to this architecture; (iv) study of the performance limits of our technique; (v) a performance study of a realistic, constrained implementation of RST; and (vi) simulation tools that can be used in other studies which represent a superscalar, superpipelined processor in detail. In a constrained architecture with realistic confidence, our RST technique is able to achieve average speedups (harmonic means) of 1.29 over the baseline architecture without reuse and 1.09 over a non-speculative trace reuse technique (DTM).

«
1
2
...
12
13
14
15
16
17
18
...
58
59
»