893 results for forced execution of obligations
Abstract:
Parallel execution of computational mechanics codes requires efficient mesh-partitioning techniques. These techniques divide the mesh into a specified number of submeshes of approximately equal size while minimising the number of interface nodes between submeshes. This paper describes a new mesh-partitioning technique employing Genetic Algorithms (GAs). The proposed algorithm operates on the deduced graph (dual or nodal graph) of the given finite element mesh rather than directly on the mesh itself. The algorithm first constructs a coarse approximation of the graph using an automatic graph-coarsening method. The coarse graph is partitioned, and the result is interpolated back onto the original graph to initialise an optimisation of the graph-partitioning problem. In practice, a hierarchy of (usually more than two) graphs is used to obtain the final partition. The proposed algorithm is applied to graphs derived from unstructured finite element meshes describing practical engineering problems, as well as to several example graphs related to finite element meshes from the literature. The test results indicate that the proposed GA-based graph-partitioning algorithm generates high-quality partitions that are superior to those produced by spectral and multilevel graph-partitioning algorithms.
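As a rough illustration of the optimisation stage, the sketch below evolves bit-string bisections of a small dual graph with a genetic algorithm, scoring each individual by edge cut plus a balance penalty. The encoding, operators, and parameters are illustrative assumptions, and the paper's coarsening/interpolation hierarchy is omitted.

```python
import random

def edge_cut(assign, edges):
    """Number of edges whose endpoints lie in different submeshes."""
    return sum(1 for u, v in edges if assign[u] != assign[v])

def fitness(assign, edges, n):
    """Edge cut plus a penalty for imbalance between the two parts."""
    imbalance = abs(sum(assign) - n / 2)
    return edge_cut(assign, edges) + 2.0 * imbalance

def ga_bisect(n, edges, pop_size=40, generations=200, seed=0):
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda a: fitness(a, edges, n))
        survivors = pop[: pop_size // 2]          # elitist selection
        children = []
        while len(survivors) + len(children) < pop_size:
            p1, p2 = rng.sample(survivors, 2)
            cut = rng.randrange(1, n)             # one-point crossover
            child = p1[:cut] + p2[cut:]
            child[rng.randrange(n)] ^= 1          # single-bit mutation
            children.append(child)
        pop = survivors + children
    return min(pop, key=lambda a: fitness(a, edges, n))

# Toy dual graph: two 4-cliques joined by one edge; the GA should cut that edge.
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3),
         (4, 5), (4, 6), (4, 7), (5, 6), (5, 7), (6, 7), (3, 4)]
best = ga_bisect(8, edges)
print(best, edge_cut(best, edges))
```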
Abstract:
Long-running multi-physics coupled parallel applications have gained prominence in recent years. Their high computational requirements and long simulation durations necessitate the use of multiple systems of a Grid for execution. In this paper, we have built an adaptive middleware framework for executing long-running multi-physics coupled applications across multiple batch systems of a Grid. Apart from coordinating the executions of the component jobs of an application on different batch systems, our framework also automatically resubmits the jobs multiple times to the batch queues to sustain long-running executions. As the set of active batch systems available for execution changes, the framework migrates and reschedules components using a robust rescheduling decision algorithm. We have used the framework to improve the application throughput of a prominent long-running multi-component climate-modeling application, the Community Climate System Model (CCSM). Our real multi-site experiments with CCSM indicate that Grid executions can lead to improved application throughput for climate models.
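The toy driver loop below sketches the coordination pattern described above: component jobs are resubmitted when their queue allocation expires and migrated when their batch system drops out of the active set. All names, probabilities, and the least-loaded placement rule are illustrative assumptions; a real framework would call the batch schedulers' submission interfaces instead of printing.

```python
import random

def pick_system(active, load):
    """Rescheduling decision: place a job on the least-loaded active system
    (a simple stand-in for the paper's robust rescheduling algorithm)."""
    return min(active, key=lambda s: load[s])

def sustain(jobs, systems, steps=20, fail_prob=0.1, seed=1):
    """Coordinate component jobs across batch systems, resubmitting on queue
    expiry and migrating off unavailable systems."""
    rng = random.Random(seed)
    load = {s: 0 for s in systems}
    placement = {}
    for j in jobs:                              # initial submission
        placement[j] = pick_system(systems, load)
        load[placement[j]] += 1
    for step in range(steps):
        active = [s for s in systems if rng.random() > fail_prob]
        if not active:                          # no system up: wait and retry
            continue
        for j in jobs:
            s = placement[j]
            if s not in active:                 # migrate off a lost system
                load[s] -= 1
                placement[j] = pick_system(active, load)
                load[placement[j]] += 1
                print(f"step {step}: migrated {j}: {s} -> {placement[j]}")
            elif rng.random() < 0.2:            # wall-clock limit: resubmit
                print(f"step {step}: resubmitted {j} on {s}")

sustain(["atmosphere", "ocean", "coupler"], ["siteA", "siteB", "siteC"])
```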
Abstract:
Many common activities, like reading, scanning scenes, or searching for an inconspicuous item in a cluttered environment, entail serial movements of the eyes that shift the gaze from one object to another. Previous studies have shown that the primate brain is capable of programming sequential saccadic eye movements in parallel. Given that the onset of a saccade directed to a target is unpredictable in individual trials, it remains unclear what prevents a saccade during parallel programming from being executed toward the second target before the saccade toward the first target. Using a computational model, here we demonstrate that sequential saccades inhibit each other and share the brain's limited processing resources (capacity) so that the planning of a saccade toward the first target always finishes first. In this framework, the latency of a saccade increases linearly with the fraction of capacity allocated to the other saccade in the sequence, and exponentially with the duration of capacity sharing. Our study establishes a link between the dual-task paradigm and the ramp-to-threshold model of response time to identify a physiologically viable mechanism that preserves the serial order of saccades without compromising the speed of performance.
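A minimal deterministic sketch of capacity sharing between two ramp-to-threshold accumulators is given below. It illustrates the sharing mechanism and the preserved serial order rather than reproducing the paper's exact linear and exponential latency relations; threshold, rate, and the share parameter f are arbitrary illustrative units.

```python
def saccade_latencies(f=0.3, threshold=100.0, rate=1.0, dt=0.01):
    """Two deterministic ramp-to-threshold accumulators share a fixed
    processing capacity while both saccade plans are active: plan 1 ramps
    at (1 - f) * rate and plan 2 at f * rate; once saccade 1 fires, plan 2
    reclaims the full rate. Because 1 - f > f for f < 0.5, saccade 1
    always reaches threshold first (no noise or inhibition dynamics)."""
    a1 = a2 = t = 0.0
    t1 = None
    while True:
        t += dt
        if t1 is None:                    # capacity shared between plans
            a1 += (1 - f) * rate * dt
            a2 += f * rate * dt
            if a1 >= threshold:
                t1 = t
        else:                             # plan 2 gets the full capacity
            a2 += rate * dt
        if a2 >= threshold:
            return t1, t

for f in (0.1, 0.3, 0.5):
    t1, t2 = saccade_latencies(f)
    print(f"capacity share f={f}: latency1={t1:.0f}, latency2={t2:.0f}")
```

Note that in this simplified model the first saccade's latency grows as f increases while the second saccade's total latency stays constant, since the shared capacity is conserved: order is preserved without slowing the sequence as a whole.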
Abstract:
Accurate and timely prediction of weather phenomena, such as hurricanes and flash floods, requires high-fidelity, compute-intensive simulations of multiple finer regions of interest within a coarse simulation domain. Current weather applications execute these nested simulations sequentially using all the available processors, which is sub-optimal due to their sub-linear scalability. In this work, we present a strategy for parallel execution of multiple nested domain simulations based on partitioning the 2-D processor grid into disjoint rectangular regions associated with each domain. We propose a novel combination of performance prediction, processor allocation methods, and topology-aware mapping of the regions on torus interconnects. Experiments on IBM Blue Gene systems using WRF show that the proposed strategies result in a performance improvement of up to 33% with topology-oblivious mapping and up to an additional 7% with topology-aware mapping over the default sequential strategy.
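The sketch below shows the partitioning idea in its simplest form: contiguous column bands of the 2-D processor grid are assigned to nested domains in proportion to their predicted workloads. It is a stand-in for the paper's performance prediction and topology-aware torus mapping, covering strip allocation only; the grid size and workload values are invented.

```python
def allocate_columns(grid_rows, grid_cols, workloads):
    """Split a 2-D processor grid into disjoint rectangular regions, one per
    nested domain, sized proportionally to each domain's predicted workload.
    Returns ((top_row, left_col), (bottom_row, right_col)) per domain."""
    total = sum(workloads)
    cols = [max(1, round(grid_cols * w / total)) for w in workloads]
    # Fix rounding so the bands exactly tile the grid.
    while sum(cols) > grid_cols:
        cols[cols.index(max(cols))] -= 1
    while sum(cols) < grid_cols:
        cols[cols.index(min(cols))] += 1
    regions, start = [], 0
    for c in cols:
        regions.append(((0, start), (grid_rows - 1, start + c - 1)))
        start += c
    return regions

# Three nested domains with predicted relative workloads on an 8x16 grid.
print(allocate_columns(8, 16, [4.0, 1.0, 3.0]))
```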
Abstract:
In this paper we present a framework for realizing arbitrary instruction set extensions (IEs) that are identified post-silicon. The proposed framework has two components, viz., an IE synthesis methodology and the architecture of a reconfigurable data-path for the realization of such IEs. The IE synthesis methodology ensures maximal utilization of resources on the reconfigurable data-path. In this context we present the techniques used to realize IEs for applications that demand high throughput or that must process data streams. The reconfigurable hardware, called HyperCell, comprises a reconfigurable execution fabric: a collection of interconnected compute units. In a typical use case, HyperCell acts as a co-processor that accelerates, on behalf of a host, the execution of IEs defined post-silicon. We demonstrate the effectiveness of our approach by evaluating the performance of some well-known integer kernels realized as IEs on HyperCell. Our methodology for realizing IEs on HyperCells permits overlapping of potentially all memory transactions with computations. By fully pipelining the data-path, we show significant performance improvements for streaming applications over general-purpose processor based solutions.
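The overlap of memory transactions with computation can be pictured with a simple double-buffering pattern: while one block is being processed, the next is already being fetched. The sketch below is a pure-software analogy using a prefetch thread; HyperCell does this in hardware, and the function names are invented for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
import time

def fetch(block):
    """Stub memory transaction: stream in the next input block."""
    time.sleep(0.01)
    return [float(x) for x in block]

def compute(data):
    """Stub IE computation on one block."""
    return sum(x * x for x in data)

def pipelined(blocks):
    """Overlap each block's fetch with the previous block's compute."""
    results = []
    with ThreadPoolExecutor(max_workers=1) as io:
        pending = io.submit(fetch, blocks[0])
        for nxt in blocks[1:]:
            data = pending.result()
            pending = io.submit(fetch, nxt)   # prefetch the next block...
            results.append(compute(data))     # ...while computing this one
        results.append(compute(pending.result()))
    return results

print(pipelined([[1, 2], [3, 4], [5, 6]]))
```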
Abstract:
The correctness of a hard real-time system depends on its ability to meet all its deadlines. Existing real-time systems use either a pure real-time scheduler or a real-time scheduler embedded as a real-time scheduling class in the scheduler of an operating system (OS). Existing scheduler implementations in multicore systems that support both real-time and non-real-time tasks permit the execution of non-real-time tasks in all cores with priorities lower than those of real-time tasks, but interrupts and softirqs associated with these non-real-time tasks can execute in any core with priorities higher than those of real-time tasks. As a result, the execution overhead of real-time tasks is quite large in these systems, which, in turn, affects their runtime. So that hard real-time tasks can execute in such systems with minimal interference from other Linux tasks, we propose in this paper an integrated scheduler architecture, called SchedISA, which aims to considerably reduce the execution overhead of real-time tasks. To test the efficacy of the proposed scheduler, we implemented the partitioned earliest deadline first (P-EDF) scheduling algorithm in SchedISA on Linux kernel version 3.8 and conducted experiments on an Intel Core i7 processor with eight logical cores. We compared the execution overhead of real-time tasks in this implementation of SchedISA with that in SCHED_DEADLINE's P-EDF implementation, which concurrently executes real-time and non-real-time tasks in Linux OS on all cores. The experimental results show that the execution overhead of real-time tasks in SchedISA is considerably less than that in SCHED_DEADLINE. We believe that, with further refinement of SchedISA, the execution overhead of real-time tasks can be reduced to a predictable maximum, making SchedISA suitable for scheduling hard real-time tasks without affecting the CPU share of Linux tasks.
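For readers unfamiliar with P-EDF, the sketch below shows its two standard ingredients: tasks are statically partitioned onto cores (here by first-fit decreasing utilization, one common heuristic), and each core then independently runs the ready job with the earliest absolute deadline. This illustrates the algorithm generically, not SchedISA's kernel implementation.

```python
def partition_tasks(tasks, num_cores):
    """First-fit decreasing partitioning by utilization (wcet / period):
    each task is pinned to the first core whose total utilization stays
    <= 1. tasks: list of (name, wcet, period)."""
    cores = [[] for _ in range(num_cores)]
    util = [0.0] * num_cores
    for name, wcet, period in sorted(tasks, key=lambda t: t[1] / t[2],
                                     reverse=True):
        u = wcet / period
        for i in range(num_cores):
            if util[i] + u <= 1.0:
                cores[i].append((name, wcet, period))
                util[i] += u
                break
        else:
            raise ValueError(f"task {name} does not fit on any core")
    return cores

def edf_pick(ready):
    """Per-core EDF: run the ready job with the earliest absolute deadline.
    ready: list of (name, absolute_deadline)."""
    return min(ready, key=lambda job: job[1])

cores = partition_tasks([("t1", 2, 5), ("t2", 3, 10), ("t3", 4, 8)],
                        num_cores=2)
print(cores)
print(edf_pick([("t1", 5), ("t3", 8)]))
```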
Abstract:
Block-structured adaptive mesh refinement (AMR) techniques have been used to obtain numerical solutions for many scientific applications. Some block-structured AMR approaches focus on forming patches of non-uniform sizes, where the size of a patch can be tuned to the geometry of a region of interest. In this paper, we develop strategies for adaptive execution of block-structured AMR applications on GPUs, for hyperbolic directionally split solvers. While effective hybrid execution strategies exist for applications with uniform patches, our work considers efficient execution of non-uniform patches with different workloads. Our techniques include bin-packing work units to load balance GPU computations, adaptive asynchronism between CPU and GPU executions using a knapsack formulation, and scheduling communications for multi-GPU executions. Our experiments with synthetic and real data, for single-GPU and multi-GPU executions, on Tesla S1070 and Fermi C2070 clusters, show that our strategies yield a speedup of up to 3.23 over existing strategies.
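The load-balancing step can be illustrated with a greedy longest-processing-time packing of non-uniform patch workloads into a fixed number of GPU work bins, as sketched below. The paper's actual bin-packing and knapsack formulations may differ; the costs and bin count here are invented.

```python
def pack_work_units(patch_costs, num_bins):
    """Greedy longest-processing-time packing: sort patches by decreasing
    cost and always place the next patch in the least-loaded bin, so each
    GPU work unit gets a roughly equal load."""
    bins = [{"load": 0.0, "patches": []} for _ in range(num_bins)]
    for idx, cost in sorted(enumerate(patch_costs), key=lambda p: -p[1]):
        b = min(bins, key=lambda b: b["load"])   # least-loaded bin first
        b["patches"].append(idx)
        b["load"] += cost
    return bins

# Non-uniform patches: a few large regions of interest, many small ones.
print(pack_work_units([9.0, 7.5, 3.0, 2.5, 2.0, 1.0, 1.0, 0.5], num_bins=3))
```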
Abstract:
In this paper we present HyperCell as a reconfigurable datapath for Instruction Extensions (IEs). HyperCell comprises an array of compute units laid over a switch network. We present an IE synthesis methodology that enables post-silicon realization of IE datapaths on HyperCell and optimally exploits HyperCell's hardware resources to enable software-pipelined execution of IEs. Exploiting temporal reuse of data in HyperCell significantly reduces its input/output bandwidth requirements.
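Temporal reuse of the kind described here can be pictured with a streaming FIR filter: each input sample is fetched once and then reused across several multiply-accumulates via a small local window, cutting I/O from one read per tap to one read per sample. The sketch below is a scalar software analogy for the reuse HyperCell exploits in hardware; the filter and taps are invented.

```python
from collections import deque

def fir_streaming(samples, taps):
    """Software-pipelined FIR over a data stream with temporal reuse:
    one memory transaction per sample, len(taps) reuses per sample."""
    window = deque([0.0] * len(taps), maxlen=len(taps))
    out = []
    for s in samples:
        window.appendleft(s)              # single fetch of this sample
        out.append(sum(c * x for c, x in zip(taps, window)))
    return out

print(fir_streaming([1, 2, 3, 4, 5], taps=[0.5, 0.25, 0.25]))
```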
Abstract:
Identification of dominant modes is an important step in studying linearly vibrating systems, including flow-induced vibrations. In the presence of uncertainty, when some of the system parameters and the external excitation are modeled as random quantities, this step becomes more difficult. This work aims to give a systematic treatment to this end. The ability to capture the time-averaged kinetic energy is chosen as the primary criterion for selection of modes. Accordingly, a methodology is proposed based on the overlap of the probability density functions (pdfs) of the natural and excitation frequencies, the proximity of the natural frequencies of the mean or baseline system, the modal participation factor, and the stochastic variation of mode shapes in terms of the modes of the baseline system, termed here statistical modal overlapping. The probabilistic descriptors of the natural frequencies and mode shapes are found by solving a random eigenvalue problem. Three distinct vibration scenarios are considered: (i) undamped and damped free vibrations of a bladed disk assembly, (ii) forced vibration of a building, and (iii) flutter of a bridge model. Through numerical studies, it is observed that the proposed methodology gives an accurate selection of modes.
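A Monte Carlo sketch of the pdf-overlap criterion is given below on an invented two-degree-of-freedom system: the random eigenvalue problem is solved per sample, and each mode is scored by the histogram overlap between its natural-frequency pdf and the excitation-frequency pdf. The system, distributions, and overlap measure are assumptions, not the paper's model.

```python
import numpy as np

def modal_overlap(num_samples=2000, seed=0):
    rng = np.random.default_rng(seed)
    freqs = [[], []]
    for _ in range(num_samples):
        # Random stiffnesses of a 2-DOF chain with unit masses (M = I).
        k1, k2 = rng.normal(100.0, 10.0), rng.normal(400.0, 40.0)
        K = np.array([[k1 + k2, -k2], [-k2, k2]])
        lam = np.linalg.eigvalsh(K)          # ascending eigenvalues
        for m in range(2):
            freqs[m].append(np.sqrt(lam[m]) / (2 * np.pi))  # Hz
    excitation = rng.normal(1.1, 0.15, num_samples)  # excitation freq (Hz)
    bins = np.linspace(0.0, 6.0, 60)
    width = bins[1] - bins[0]
    pe, _ = np.histogram(excitation, bins, density=True)
    for m in range(2):
        pm, _ = np.histogram(freqs[m], bins, density=True)
        overlap = np.sum(np.minimum(pm, pe)) * width
        print(f"mode {m + 1}: pdf overlap with excitation = {overlap:.2f}")

modal_overlap()
```

In this toy case the first mode's frequency pdf overlaps the excitation pdf substantially while the second mode's does not, so the criterion would retain mode 1 as dominant.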
Abstract:
Clock synchronization in a wireless sensor network (WSN) is essential, as it provides a consistent and coherent time frame for all the nodes across the network. Typically, clock synchronization is achieved by message passing using a contention-based scheme for medium access, such as carrier sense multiple access (CSMA): the nodes try to synchronize with each other by sending synchronization request messages. If many nodes try to send messages simultaneously, contention-based schemes cannot efficiently avoid collisions, and the resulting message losses affect the convergence of the synchronization algorithms. The number of collisions can, however, be reduced with a frame-based approach such as time division multiple access (TDMA). In this paper, we propose a design that utilizes a TDMA-based medium access control (MAC) protocol to improve the performance of clock synchronization protocols. The basic idea is to switch to TDMA-based transmissions once the degree of synchronization among the sensor nodes improves during the execution of the clock synchronization algorithm. The design significantly reduces collisions among synchronization protocol messages. We have simulated the proposed protocol in the Castalia network simulator. The simulation results show that the proposed protocol significantly reduces the time required for synchronization and also improves the accuracy of the synchronization algorithm.
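The hybrid idea can be sketched in a few lines: nodes send sync requests under contention-based CSMA, where requests may collide, and switch to collision-free TDMA slots once the degree of synchronization crosses a threshold. The success probability and round structure below are illustrative assumptions, not Castalia outputs.

```python
import random

def synchronize(num_nodes=50, switch_degree=0.5, max_rounds=200, seed=7):
    """Return the number of rounds until all nodes are synchronized.
    switch_degree > 1 never switches, giving a pure-CSMA baseline."""
    rng = random.Random(seed)
    synced = set()
    for r in range(1, max_rounds + 1):
        tdma = len(synced) / num_nodes >= switch_degree
        for n in range(num_nodes):
            if n in synced:
                continue
            # A TDMA slot always delivers; a CSMA request may collide.
            if tdma or rng.random() < 0.2:
                synced.add(n)
        if len(synced) == num_nodes:
            return r
    return max_rounds

print("hybrid CSMA->TDMA rounds:", synchronize(switch_degree=0.5))
print("pure CSMA rounds:        ", synchronize(switch_degree=1.1))
```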
Abstract:
The constitutive relations and kinematic assumptions for a composite beam with a shape memory alloy (SMA) layer arbitrarily embedded are discussed, and the results obtained under the different kinematic assumptions are compared. When a mechanics-of-materials approach is used to study a composite beam with an embedded SMA layer, the choice of kinematic assumption is vital. In this paper, we systematically study the influence of the kinematic assumptions on the deflection and vibration characteristics of the composite beam. The equations of equilibrium/motion differ according to the kinematic assumption; three widely used kinematic assumptions are presented and the corresponding equations are derived. As the kinematic assumptions progress from the simplest to the most complex, the governing equations evolve from linear to nonlinear. For the nonlinear equations of equilibrium, a numerical solution is obtained using the Galerkin discretization method together with Newton-Raphson iteration. The numerical difficulty of applying the Galerkin method to post-buckling analysis is examined; for post-buckling, the finite element method is applied instead, to avoid the singularity that arises in the Galerkin method. The natural frequencies of the composite beam with the nonlinear governing equation, obtained by directly linearizing the equations and by locally linearizing them around each equilibrium, are compared. The influences of the SMA layer thickness and of its shift from the neutral axis on the deflection, buckling, and post-buckling are also investigated. This paper presents a very general way to treat the thermo-mechanical properties of a composite beam with an SMA layer arbitrarily embedded. The governing equations for each kinematic assumption consist of a third-order and a fourth-order differential equation with a total of seven boundary conditions. Some previous studies on SMA layers either ignore the thermal constraint effect or implicitly assume that the SMA is symmetrically embedded; here, the composite beam with the SMA layer asymmetrically embedded is studied, of which symmetric embedding is a special case. The results under the different kinematic assumptions differ depending on the deflection magnitude because of the nonlinear hardening effect induced by (large) deflection, and this difference is systematically compared for both the deflection and the natural frequencies. For the simplest kinematic assumption, the governing equations are linear and an analytical solution is available; but as the deflection grows large, the simple kinematic assumption no longer reflects the structural deflection and the complex one must be used. Through the systematic comparison of computational results under the different kinematic assumptions, the application range of the simple kinematic assumption is also evaluated. Besides the equilibrium of the composite laminate with an embedded SMA layer, the buckling, post-buckling, and free and forced vibrations of the composite beam in different configurations are also studied and compared.
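The Galerkin-plus-Newton-Raphson solution step can be illustrated on a one-mode reduction with cubic (large-deflection) hardening, k*a + c*a^3 = f, where a is the modal amplitude. The coefficients below are illustrative, not derived from the paper's SMA model; the local linearization at the end mirrors the frequency computation around an equilibrium described above.

```python
def newton(residual, jacobian, x0, tol=1e-10, max_iter=50):
    """Generic Newton-Raphson iteration for a scalar equilibrium equation."""
    x = x0
    for _ in range(max_iter):
        r = residual(x)
        if abs(r) < tol:
            return x
        x -= r / jacobian(x)
    raise RuntimeError("Newton-Raphson did not converge")

# One-mode Galerkin reduction with cubic hardening: k*a + c*a**3 = f.
k, c, f = 1.0, 0.5, 2.0
a = newton(lambda a: k * a + c * a**3 - f,
           lambda a: k + 3 * c * a**2,
           x0=1.0)
print(f"modal amplitude a = {a:.6f}")
# Linearized stiffness about this equilibrium (unit modal mass), i.e. the
# squared natural frequency from local linearization:
print(f"omega^2 = {k + 3 * c * a**2:.6f}")
```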
Abstract:
Smoldering constitutes a significant fire risk both in normal gravity and in microgravity. This space experiment was conducted aboard the Chinese recoverable satellite SJ-8 to investigate the smoldering characteristics of flexible polyurethane foam with central ignition in a forced flow of oxidizer. This configuration resulted in a combination of opposed- and forward-flow smolder. The microgravity experiment is rather unique in that it was performed at constant pressure and with a relatively high ambient oxygen concentration (35% by volume). The smoldering characteristics are inferred from measurements of temperature histories at several locations along the foam sample. Particularly important is the discovery of a transition from smoldering to flaming near the sample end in the opposed smoldering. This transition appears to be caused by strong acceleration of the smoldering reaction, and it initiates a vigorous forward-propagating oxidation reaction in the char left behind by the smoldering reaction. The secondary char oxidation reaction propagates through the sample and consumes most of the remaining char. In forward-flow smoldering, oxidizer depletion by the upstream opposed smolder prevents an exothermic oxidation reaction from being established in the foam until the preceding reaction is completed. Once fresh oxidizer flows into the sample, the existing conditions are sufficient for a self-sustained forward smoldering reaction to take place.
Abstract:
A half floating zone is mounted on a vibrating deck, which supplies a periodically applied acceleration to simulate the effect of g-jitter. This paper deals with the effects of g-jitter on the fluid fields and on the critical Marangoni number, which marks the transition from a forced oscillation of thermocapillary convection to unstable oscillatory convection in a liquid bridge of the half floating zone with the top rod heated. The responses of the temperature profiles and flow pattern in the liquid bridge to the g-jitter field were obtained experimentally. The results indicate that the critical Marangoni number decreases as the g-jitter effect increases, and is slightly smaller at higher g-jitter frequencies for a fixed strength of applied gravity.
Abstract:
The utilization of waste waters in aquaculture is briefly reviewed. At the National Institute for Freshwater Fisheries Research (NIFFR), stocking-density experiments (20 to 160 fish/m³) using Sarotherodon galilaeus (without supplementary feeding) in floating cages were carried out in a sewage pond (0.4 ha surface area). Cage culture of S. galilaeus was observed to have potential in wastewater aquaculture. Recommendations are made on the execution of an integrated wastewater management and utilization programme.
Abstract:
The importance of fish meal production as a means of reducing the fish waste currently experienced in the fisheries subsector is discussed. A cost estimate for establishing a fish meal manufacturing plant in Nigeria and suggestions on the rational execution of the project are presented. If properly located and well managed, the project will serve to convert fish waste to cash in the industrial fishery.