Digital Library

38 results for Process Mining, Distributed Computing, Grid Computing, Process Discovery, Conformance Checking, Business Process Management

Real-Time Operations: Techno-Management Aspects

Relevance: 100.00%

Compiler-assisted energy optimization for clustered VLIW processors

Relevance: 100.00%

Abstract:

Clustered architecture processors are preferred for embedded systems because centralized register file architectures scale poorly in terms of clock rate, chip area, and power consumption. Although clustering helps by improving the clock speed, reducing the energy consumption of the logic, and simplifying the design, it introduces extra overhead in the form of inter-cluster communication. This communication happens over long global wires with high load capacitance, which leads to execution delays and significantly higher energy consumption. Inter-cluster communication also introduces many short idle cycles, thereby significantly increasing the overall leakage energy consumption in the functional units. The trend towards miniaturization of devices (and the associated reduction in threshold voltage) makes energy consumption in interconnects and functional units even worse, and limits the usability of clustered architectures in smaller technologies. However, technological advancements now permit the design of interconnects and functional units with varying performance and power modes. In this paper, we propose scheduling algorithms that aggregate the scheduling slack of instructions and the communication slack of data values to exploit the low-power modes of functional units and interconnects. Finally, we present a synergistic combination of these algorithms that simultaneously saves energy in functional units and interconnects to improve the usability of clustered architectures by achieving better overall energy-performance trade-offs. Even with conservative estimates of the contribution of the functional units and interconnects to the overall processor energy consumption, the proposed combined scheme obtains on average 8% and 10% improvement in overall energy-delay product with 3.5% and 2% performance degradation for a 2-clustered and a 4-clustered machine, respectively. We present a detailed experimental evaluation of the proposed schemes. Our test bed uses the Trimaran compiler infrastructure. (C) 2012 Elsevier Inc. All rights reserved.
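
The energy-delay product figures can be unpacked with a little arithmetic. The sketch below assumes that "performance degradation" means an equal relative increase in execution time, which is an interpretation and not something the abstract states. The function name `implied_energy_saving` is hypothetical. Under that assumption, EDP = energy × delay gives the energy reduction implied by each reported pair of numbers.

```python
def implied_energy_saving(edp_improvement: float, slowdown: float) -> float:
    """Fractional energy reduction implied by an EDP improvement.

    Assumes EDP = energy * delay and that the slowdown is a relative
    increase in delay, so that
    (1 - edp_improvement) = (1 - energy_saving) * (1 + slowdown).
    """
    return 1.0 - (1.0 - edp_improvement) / (1.0 + slowdown)


# Figures reported in the abstract: (clusters, EDP improvement, slowdown).
for clusters, edp, slow in [(2, 0.08, 0.035), (4, 0.10, 0.02)]:
    saving = implied_energy_saving(edp, slow)
    print(f"{clusters}-cluster: implied energy saving ~ {saving:.1%}")
```

Read this way, the reported trade-off corresponds to an energy saving of roughly 11 to 12 percent on both machine configurations.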

Smoothed functional and Quasi-Newton algorithms for routing in multi-stage queueing network with constraints

Relevance: 100.00%

Abstract:

We consider the problem of optimal routing in a multi-stage network of queues with constraints on queue lengths. We develop three algorithms for probabilistic routing for this problem using only the total end-to-end delays. These algorithms use the smoothed functional (SF) approach to optimize the routing probabilities. In our model, all the queues are assumed to have constraints on the average queue length. We also propose a novel quasi-Newton based SF algorithm. Policies like Join Shortest Queue or Least Work Left work only for unconstrained routing and, in addition, assume knowledge of the queue length at every queue. If the only information available is the expected end-to-end delay, as in our case, such policies cannot be used. We also give simulation results showing the performance of the SF algorithms for this problem.
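
As a rough illustration of the smoothed functional idea, the sketch below estimates a gradient from function evaluations alone, using the standard two-sided Gaussian SF estimator. It is not the paper's algorithm: the constraints on queue lengths and the quasi-Newton variant are left out, and `toy_delay` is a hypothetical stand-in for a measured end-to-end delay. The names `sf_gradient`, `beta`, and `samples` are also assumptions made for this example.

```python
import random


def sf_gradient(f, theta, beta=0.1, samples=20000, rng=None):
    """Two-sided Gaussian smoothed functional estimate of grad f(theta).

    Averages eta * (f(theta + beta*eta) - f(theta - beta*eta)) / (2*beta)
    over standard normal perturbation vectors eta.
    """
    rng = rng or random.Random(0)
    dim = len(theta)
    grad = [0.0] * dim
    for _ in range(samples):
        eta = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        plus = f([t + beta * e for t, e in zip(theta, eta)])
        minus = f([t - beta * e for t, e in zip(theta, eta)])
        scale = (plus - minus) / (2.0 * beta)
        for i in range(dim):
            grad[i] += eta[i] * scale / samples
    return grad


def toy_delay(theta):
    # Hypothetical objective standing in for an observed end-to-end delay.
    return (theta[0] - 1.0) ** 2 + (theta[1] + 2.0) ** 2


# The exact gradient of toy_delay at (2, -4) is (2, -4).
estimate = sf_gradient(toy_delay, [2.0, -4.0])
print(estimate)
```

The estimator only ever queries the objective value, which matches the setting where the expected end-to-end delay is the sole available feedback.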

A weighted average based external clock synchronization protocol for wireless sensor networks

Relevance: 100.00%

Abstract:

Clock synchronization is an extremely important requirement of wireless sensor networks (WSNs). There are many application scenarios, such as weather monitoring and forecasting, where external clock synchronization may be required because the WSN itself may consist of components that are not connected to each other. A usual approach for external clock synchronization in WSNs is to synchronize the clock of a reference node with an external source such as UTC, while the remaining nodes synchronize with the reference node using an internal clock synchronization protocol. In order to provide highly accurate time, both the offset and the drift rate of each clock with respect to the reference node are estimated from time to time, and these are used to obtain the correct time from the local clock reading. A problem with this approach is that it is difficult to estimate the offset of a clock with respect to the reference node when the drift rate of the clocks varies over time. In this paper, we first propose a novel internal clock synchronization protocol based on a weighted averaging technique, which synchronizes all the clocks of a WSN to a reference node periodically. We call this protocol the weighted average based internal clock synchronization (WICS) protocol. Based on this protocol, we then propose our weighted average based external clock synchronization (WECS) protocol. We have analyzed the proposed protocols for maximum synchronization error and shown that it is always upper bounded. Extensive simulation studies of the proposed protocols have been carried out using the Castalia simulator. Simulation results validate our theoretical claim that the maximum synchronization error is always upper bounded and also show that the proposed protocols perform better than other protocols in terms of synchronization accuracy. A prototype implementation of the proposed internal clock synchronization protocol using a few TelosB motes also validates our claim.
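
The "usual approach" described in the abstract, estimating offset and drift rate and then correcting local readings, can be sketched with a least-squares line fit over timestamp pairs. This is a generic illustration, not the WICS or WECS protocol, whose weighting scheme the abstract does not specify. The function names and the sample values (a clock running 50 ppm fast and 2.5 s ahead) are hypothetical.

```python
def fit_clock(pairs):
    """Least-squares fit of reference = rate * local + offset.

    pairs is a list of (local_time, reference_time) samples.
    """
    n = len(pairs)
    mean_loc = sum(loc for loc, _ in pairs) / n
    mean_ref = sum(ref for _, ref in pairs) / n
    var_loc = sum((loc - mean_loc) ** 2 for loc, _ in pairs)
    cov = sum((loc - mean_loc) * (ref - mean_ref) for loc, ref in pairs)
    rate = cov / var_loc
    offset = mean_ref - rate * mean_loc
    return rate, offset


def to_reference_time(local, rate, offset):
    """Convert a local clock reading to estimated reference time."""
    return rate * local + offset


def local_clock(ref):
    # Hypothetical node clock: 50 ppm fast and 2.5 s ahead of the reference.
    return (1.0 + 50e-6) * ref + 2.5


# Timestamp pairs collected every 10 s of reference time.
samples = [(local_clock(ref), float(ref)) for ref in range(0, 100, 10)]
rate, offset = fit_clock(samples)
corrected = to_reference_time(local_clock(1000.0), rate, offset)
print(corrected)
```

A fixed linear fit like this works only while the drift rate stays constant, which is exactly the limitation the abstract raises for clocks whose drift varies over time.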

Efficient asynchronous executions of AMR computations and visualization on a GPU system

Relevance: 100.00%

Abstract:

Adaptive Mesh Refinement (AMR) is a method that dynamically varies the spatio-temporal resolution of localized mesh regions in numerical simulations, based on the strength of the solution features. In-situ visualization plays an important role in analyzing the time-evolving characteristics of the domain structures. Continuous visualization of the output data for various timesteps results in a better study of the underlying domain and the model used for simulating the domain. In this paper, we develop strategies for continuous online visualization of time-evolving data for AMR applications executed on GPUs. We reorder the meshes for computations on the GPU based on the user's input about the subdomain to be visualized. This makes the data available for visualization at a faster rate. We then perform asynchronous executions of the visualization steps and fix-up operations on the CPUs while the GPU advances the solution. By performing experiments on Tesla S1070 and Fermi C2070 clusters, we found that our strategies result in a 60% improvement in response time and a 16% improvement in the rate of visualization of frames over the existing strategy of performing fix-ups and visualization at the end of the timesteps.

Performance metrics in a hybrid MPI-OpenMP based molecular dynamics simulation with short-range interactions

Relevance: 100.00%

Abstract:

We discuss the computational bottlenecks in molecular dynamics (MD) and describe the challenges in parallelizing the computation-intensive tasks. We present a hybrid algorithm using MPI (Message Passing Interface) with OpenMP threads for parallelizing a generalized MD computation scheme for systems with short-range interatomic interactions. The algorithm is discussed in the context of nano-indentation of chromium films with carbon indenters, using the Embedded Atom Method potential for Cr-Cr interactions and the Morse potential for Cr-C interactions. We study the performance of our algorithm for a range of MPI-thread combinations and find that the performance depends strongly on the computational task and on load sharing in the multi-core processor. The algorithm scaled poorly with MPI, and our hybrid schemes were observed to outperform the pure message passing scheme despite utilizing the same number of processors or cores in the cluster. The speed-up achieved by our algorithm compared favorably with that achieved by standard MD packages. (C) 2013 Elsevier Inc. All rights reserved.

Editorial: Scalable Systems for Big Data Management and Analytics

Relevance: 100.00%

Physically Based, Hydrologic Model Results Based on Three Precipitation Products

Relevance: 100.00%

Abstract:

The main objective of the study is to examine the accuracy of and differences among simulated streamflows driven by rainfall estimates from a network of 22 rain gauges spread over a 2,170 km² watershed, NEXRAD Stage III radar data, and Tropical Rainfall Measuring Mission (TRMM) 3B42 satellite data. The Gridded Surface Subsurface Hydrologic Analysis (GSSHA), a physically based, distributed-parameter, grid-structured hydrologic model, was used to simulate the June 2002 flooding event in the Upper Guadalupe River watershed in south central Texas. There were significant differences between the rainfall fields estimated by the three types of measurement technologies. These differences resulted in even larger differences in the simulated hydrologic response of the watershed. In general, simulations driven by radar rainfall yielded better results than those driven by satellite or rain-gauge estimates. This study also presents an overview of the effects of land cover changes on runoff and stream discharge. The results demonstrate that, for major rainfall events similar to the 2002 event, urbanization of the watershed over the past two decades would not have had any significant effect on the hydrologic response. The effect of urbanization on the hydrologic response increases as the size of the rainfall event decreases.
