Biblioteca Digital

936 results for Thread safe parallel run-time

Global-EDF scheduling of multimode real-time systems considering mode independent tasks

Relevance:

30.00%

Abstract:

Embedded real-time systems often have to support the embedding system in very different and changing application scenarios. An aircraft taxiing, taking off and in cruise flight is one example. The different application scenarios are reflected in the software structure with a changing task set and thus different operational modes. At the same time there is a strong push for integrating previously isolated functionalities in single-chip multicore processors. On such multicores the behavior of the system during a mode change, when the system transitions from one mode to another, is complex but crucial to get right. In the past we have investigated mode change in multiprocessor systems where a mode change requires a complete change of task set. Now, we present the first analysis which considers mode changes in multicore systems, which use global EDF to schedule a set of mode-independent (MI) and mode-specific (MS) tasks. In such systems, only the set of MS tasks has to be replaced during mode changes, without jeopardizing the schedulability of the MI tasks. Of prime concern is that the mode change is safe and efficient, i.e., the mode change needs to be performed in a predefined time window and no deadlines may be missed as a function of the mode change.

Calculating an upper bound on the finishing time of a group of threads executing on a GPU: a preliminary case study

Relevance:

30.00%

Abstract:

Graphics processor units (GPUs) today can be used for computations that go beyond graphics and such use can attain a performance that is orders of magnitude greater than a normal processor. The software executing on a graphics processor is composed of a set of (often thousands of) threads which operate on different parts of the data and thereby jointly compute a result which is delivered to another thread executing on the main processor. Hence the response time of a thread executing on the main processor is dependent on the finishing time of the execution of threads executing on the GPU. Therefore, we present a simple method for calculating an upper bound on the finishing time of threads executing on a GPU, in particular NVIDIA Fermi. Developing such a method is nontrivial because threads executing on a GPU share hardware resources at very fine granularity.

Capacity sharing and stealing in dynamic server-based real-time systems

Relevance:

30.00%

Abstract:

This paper proposes a dynamic scheduler that supports the coexistence of guaranteed and non-guaranteed bandwidth servers to efficiently handle soft-tasks’ overloads by making additional capacity available from two sources: (i) residual capacity allocated but unused when jobs complete in less than their budgeted execution time; (ii) stealing capacity from inactive non-isolated servers used to schedule best-effort jobs. The effectiveness of the proposed approach in reducing the mean tardiness of periodic jobs is demonstrated through extensive simulations. The achieved results become even more significant when tasks’ computation times have a large variance.
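The first capacity source named in the abstract, residual capacity left by jobs that finish under budget, can be illustrated with simple bookkeeping. The following is a minimal sketch only: the `Server` record, the function names and the greedy hand-out order are assumptions made for illustration, and the sketch ignores deadlines, server priorities and the stealing rule, so it is not the scheduler proposed in the paper.

```python
from dataclasses import dataclass


@dataclass
class Server:
    name: str
    budget: float  # capacity reserved for the current job (hypothetical field)
    demand: float  # execution time the current job actually needs


def reclaim_residual(servers):
    """Total capacity left unused by jobs that finish under their budget."""
    return sum(max(0.0, s.budget - s.demand) for s in servers)


def cover_overloads(servers):
    """Greedily hand pooled residual capacity to overloaded servers.

    Returns the overrun that remains uncovered for each server.
    """
    pool = reclaim_residual(servers)
    uncovered = {}
    for s in servers:
        overrun = max(0.0, s.demand - s.budget)
        granted = min(overrun, pool)  # never grant more than the pool holds
        pool -= granted
        uncovered[s.name] = overrun - granted
    return uncovered
```

For example, with servers A (budget 4, demand 2), B (budget 3, demand 4.5) and C (budget 2, demand 3), the two units left by A cover all of B's overrun and half of C's.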

Handling QoS dependencies in distributed cooperative real-time systems

Relevance:

30.00%

Abstract:

Due to the growing complexity and adaptability requirements of real-time embedded systems, which often exhibit unrestricted inter-dependencies among supported services and user-imposed quality constraints, it is increasingly difficult to optimise the level of service of a dynamic task set within a useful and bounded time. This is even more difficult when intending to benefit from the full potential of an open distributed cooperating environment, where service characteristics are not known beforehand. This paper proposes an iterative refinement approach for a service’s QoS configuration taking into account services’ inter-dependencies and quality constraints, and trading off the achieved solution’s quality for the cost of computation. Extensive simulations demonstrate that the proposed anytime algorithm is able to quickly find a good initial solution and effectively optimises the rate at which the quality of the current solution improves as the algorithm is given more time to run. The added benefits of the proposed approach clearly surpass its reduced overhead.

Shared resources and precedence constraints with capacity sharing and stealing

Relevance:

30.00%

Abstract:

This paper proposes a new strategy to integrate shared resources and precedence constraints among real-time tasks, assuming no precise information on critical sections and computation times is available. The concept of bandwidth inheritance is combined with a greedy capacity sharing and stealing policy to efficiently exchange bandwidth among tasks, minimising the degree of deviation from the ideal system's behaviour caused by inter-application blocking. The proposed capacity exchange protocol (CXP) focuses on exchanging extra capacities as early, and not necessarily as fairly, as possible. This loss of optimality is worth the reduced complexity as the protocol's behaviour nevertheless tends to be fair in the long run and outperforms other solutions in highly dynamic scenarios, as demonstrated by extensive simulations.

Autonomic workflow activities: the award framework

Relevance:

30.00%

Abstract:

Workflows have been successfully applied to express the decomposition of complex scientific applications. This has motivated many initiatives that have been developing scientific workflow tools. However, the existing tools still lack adequate support for important aspects, namely decoupling the enactment engine from the specification of workflow tasks, decentralizing the control of workflow activities, and allowing their tasks to run autonomously in distributed infrastructures, for instance on Clouds. Furthermore, many workflow tools only support the execution of Directed Acyclic Graphs (DAG) without the concept of iterations, where activities are executed for millions of iterations over long periods of time, and without support for dynamic workflow reconfiguration after a given iteration. We present the AWARD (Autonomic Workflow Activities Reconfigurable and Dynamic) model of computation, based on the Process Networks model, where the workflow activities (AWA) are autonomic processes with independent control that can run in parallel on distributed infrastructures, e.g. on Clouds. Each AWA executes a Task developed as a Java class that implements a generic interface allowing end-users to code their applications without concerns for low-level details. The data-driven coordination of AWA interactions is based on a shared tuple space that also enables support for dynamic workflow reconfiguration and monitoring of the execution of workflows. We describe how AWARD supports dynamic reconfiguration and discuss typical workflow reconfiguration scenarios. For evaluation we describe experimental results of AWARD workflow executions in several application scenarios, mapped to a small dedicated cluster and the Amazon (Elastic Computing EC2) Cloud.
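The coordination style mentioned in the abstract, activities exchanging data through a shared tuple space, can be illustrated with a small in-process sketch. Everything below is an assumption made for illustration: the class name `TupleSpace`, the `put`/`take` methods and the use of `None` as a wildcard are hypothetical, and the sketch is written in Python rather than the Java used by AWARD, whose actual API is not described in the abstract.

```python
import threading


class TupleSpace:
    """Minimal thread-safe tuple space (illustrative, not AWARD's API)."""

    def __init__(self):
        self._tuples = []
        self._cond = threading.Condition()

    def put(self, tup):
        """Publish a tuple and wake any activity waiting for a match."""
        with self._cond:
            self._tuples.append(tuple(tup))
            self._cond.notify_all()

    def _find(self, pattern):
        # A pattern field of None matches any value in that position.
        for t in self._tuples:
            if len(t) == len(pattern) and all(
                p is None or p == v for p, v in zip(pattern, t)
            ):
                return t
        return None

    def take(self, pattern, timeout=None):
        """Remove and return the first matching tuple, blocking until one
        exists; returns None if the timeout expires first."""
        with self._cond:
            found = self._cond.wait_for(
                lambda: self._find(pattern) is not None, timeout
            )
            if not found:
                return None
            match = self._find(pattern)
            self._tuples.remove(match)
            return match
```

A downstream activity would call `take(("stage1", None, None))` and block until an upstream activity publishes a matching tuple with `put(("stage1", "out", 42))`, which is what makes the coordination data-driven.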

A framework for the response time analysis of fixed-priority tasks with stochastic inter-arrival times

Relevance:

30.00%

Abstract:

Real-time scheduling usually considers worst-case values for the parameters of task (or message stream) sets, in order to provide safe schedulability tests for hard real-time systems. However, worst-case conditions introduce a level of pessimism that is often inadequate for a certain class of (soft) real-time systems. In this paper we provide an approach for computing the stochastic response time of tasks where tasks have inter-arrival times described by discrete probabilistic distribution functions, instead of minimum inter-arrival time (MIT) values.

GTS allocation analysis in IEEE 802.15.4 for real-time wireless sensor networks

Relevance:

30.00%

Abstract:

The IEEE 802.15.4 protocol proposes a flexible communication solution for Low-Rate Wireless Personal Area Networks including sensor networks. It has the advantage of fitting the different requirements of potential applications by adequately setting its parameters. When its beacon mode is enabled, the protocol makes real-time guarantees possible through its Guaranteed Time Slot (GTS) mechanism. This paper analyzes the performance of the GTS allocation mechanism in IEEE 802.15.4. The analysis gives a full understanding of the behavior of the GTS mechanism with regard to delay and throughput metrics. First, we propose two accurate models of service curves for a GTS allocation as a function of the IEEE 802.15.4 parameters. We then evaluate the delay bounds guaranteed by an allocation of a GTS using Network Calculus formalism. Finally, based on the analytic results, we analyze the impact of the IEEE 802.15.4 parameters on the throughput and delay bound guaranteed by a GTS allocation. The results of this work pave the way for an efficient dimensioning of an IEEE 802.15.4 cluster.
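To make the Network Calculus step concrete, here is a minimal sketch of the textbook delay bound for a flow with a token-bucket arrival curve (burst b, sustained rate r) served by a rate-latency service curve (rate R, latency T): when r ≤ R, the worst-case delay is T + b/R. The curve shapes and the function name `delay_bound` are assumptions chosen for illustration; the paper's own two service-curve models for a GTS are not reproduced here.

```python
import math


def delay_bound(burst, rate, service_rate, latency):
    """Worst-case delay for a token-bucket flow (burst, rate) crossing a
    rate-latency server (service_rate, latency).

    Units are up to the caller, e.g. bits, bits per second and seconds.
    """
    if rate > service_rate:
        # The backlog grows without limit, so no finite bound exists.
        return math.inf
    return latency + burst / service_rate
```

With a burst of 2000 bits, an arrival rate of 100 bit/s, a guaranteed service rate of 500 bit/s and a latency of 0.5 s, the bound is 0.5 + 2000/500 = 4.5 s.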

Time-bounded distributed QoS-aware service configuration in heterogeneous cooperative environments

Relevance:

30.00%

Abstract:

The scarcity and diversity of resources among the devices of heterogeneous computing environments may affect their ability to perform services with specific Quality of Service constraints, particularly in dynamic distributed environments where the characteristics of the computational load cannot always be predicted in advance. Our work addresses this problem by allowing resource-constrained devices to cooperate with more powerful neighbour nodes, opportunistically taking advantage of global distributed resources and processing power. Rather than assuming that the dynamic configuration of this cooperative service executes until it computes its optimal output, the paper proposes an anytime approach that has the ability to trade off deliberation time for the quality of the solution. Extensive simulations demonstrate that the proposed anytime algorithms are able to quickly find a good initial solution and effectively optimise the rate at which the quality of the current solution improves at each iteration, with an overhead that can be considered negligible.
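The anytime idea in the abstract, a quick initial solution that is refined for as long as time allows, can be sketched with a generator that yields every improved configuration, so the caller can stop whenever its deliberation budget runs out. The data layout (`demands`, `capacity`) and the round-robin upgrade rule are hypothetical choices for illustration and are not the algorithms evaluated in the paper.

```python
def anytime_configure(demands, capacity):
    """Yield successively better QoS configurations.

    demands[i] lists the resource cost of service i at each quality
    level, ordered from lowest to highest quality. A configuration is a
    list holding the chosen level index for every service.
    """
    levels = [0] * len(demands)  # quick initial solution: lowest quality
    used = sum(d[0] for d in demands)
    if used > capacity:
        return  # not even the minimum configuration fits
    yield list(levels)

    improved = True
    while improved:
        improved = False
        for i, d in enumerate(demands):
            if levels[i] + 1 < len(d):
                extra = d[levels[i] + 1] - d[levels[i]]
                if used + extra <= capacity:
                    levels[i] += 1  # upgrade service i by one level
                    used += extra
                    improved = True
                    yield list(levels)
```

A caller with a time budget can iterate over the generator and keep the last configuration seen before the budget expires; every yielded configuration is feasible, which is the property that makes the approach interruptible.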

Parallel hyperspectral unmixing on GPUs

Relevance:

30.00%

Abstract:

This letter presents a new parallel method for hyperspectral unmixing composed by the efficient combination of two popular methods: vertex component analysis (VCA) and sparse unmixing by variable splitting and augmented Lagrangian (SUNSAL). First, VCA extracts the endmember signatures, and then, SUNSAL is used to estimate the abundance fractions. Both techniques are highly parallelizable, which significantly reduces the computing time. A design for the commodity graphics processing units of the two methods is presented and evaluated. Experimental results obtained for simulated and real hyperspectral data sets reveal speedups up to 100 times, which grants real-time response required by many remotely sensed hyperspectral applications.
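For readers unfamiliar with the terminology, the two stages can be related through the linear mixing model that is commonly assumed in hyperspectral unmixing. The notation below is an assumption introduced for illustration (the abstract defines no symbols): y is an observed pixel spectrum, the columns of M are the endmember signatures extracted in the first stage, α holds the abundance fractions estimated in the second stage, n is noise and λ ≥ 0 is a sparsity weight.

```latex
\mathbf{y} = \mathbf{M}\,\boldsymbol{\alpha} + \mathbf{n},
\qquad \boldsymbol{\alpha} \ge \mathbf{0},
\qquad \mathbf{1}^{\mathsf{T}}\boldsymbol{\alpha} = 1

\hat{\boldsymbol{\alpha}}
  = \arg\min_{\boldsymbol{\alpha} \ge \mathbf{0}}
    \tfrac{1}{2}\,\lVert \mathbf{M}\boldsymbol{\alpha} - \mathbf{y} \rVert_2^2
    + \lambda\,\lVert \boldsymbol{\alpha} \rVert_1
```

The second line is a sketch of the kind of constrained sparse regression that abundance estimation typically solves; the exact constraints and parameters used in the letter are not stated in the abstract.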

A shared-disk parallel cluster file system

Relevance:

30.00%

Abstract:

Dissertation submitted for the degree of Doctor in Informatics at Universidade Nova de Lisboa, Faculdade de Ciências e Tecnologia

Optimization of multi-stage amplifiers in deep-submicron CMOS using a distributed/parallel genetic algorithm

Relevance:

30.00%

Abstract:

IEEE International Symposium on Circuits and Systems, pp. 724–727, Seattle, USA

Long-run consumption risk with durable goods : UK evidence for equity and bond markets

Relevance:

30.00%

Abstract:

"It is a widely accepted fact that the consumption-based capital asset pricing model (CCAPM) fails to provide a good explanation of many important features of the behaviour of financial market returns in a large range of countries over a long period of time. However, within a representative consumer/investor model, it is hard to see how the basic structure of the consumption based model can be safely abandoned." [introduction]

Parallel hyperspectral compressive sensing method on GPU

Relevance:

30.00%

Abstract:

Remote hyperspectral sensors collect large amounts of data per flight, usually with low spatial resolution. It is known that the bandwidth of the connection between the satellite/airborne platform and the ground station is limited, thus an onboard compression method is desirable to reduce the amount of data to be transmitted. This paper presents a parallel implementation of a compressive sensing method, called parallel hyperspectral coded aperture (P-HYCA), for graphics processing units (GPU) using the compute unified device architecture (CUDA). This method takes into account two main properties of hyperspectral datasets, namely the high correlation existing among the spectral bands and the generally low number of endmembers needed to explain the data, which largely reduces the number of measurements necessary to correctly reconstruct the original data. Experimental results conducted using synthetic and real hyperspectral datasets on two different GPU architectures by NVIDIA, GeForce GTX 590 and GeForce GTX TITAN, reveal that the use of GPUs can provide real-time compressive sensing performance. The achieved speedup is up to 20 times when compared with the processing time of HYCA running on one core of the Intel i7-2600 CPU (3.4GHz), with 16 Gbyte memory.

Parallel hyperspectral coded aperture for compressive sensing on GPUs

Relevance:

30.00%

Abstract:

The application of compressive sensing (CS) to hyperspectral images has been an active area of research over the past few years, both in terms of the hardware and the signal processing algorithms. However, CS algorithms can be computationally very expensive due to the extremely large volumes of data collected by imaging spectrometers, a fact that compromises their use in applications under real-time constraints. This paper proposes four efficient implementations of hyperspectral coded aperture (HYCA) for CS, two of them termed P-HYCA and P-HYCA-FAST and two additional implementations for its constrained version (CHYCA), termed P-CHYCA and P-CHYCA-FAST, on commodity graphics processing units (GPUs). The HYCA algorithm exploits the high correlation existing among the spectral bands of the hyperspectral data sets and the generally low number of endmembers needed to explain the data, which largely reduces the number of measurements necessary to correctly reconstruct the original data. The proposed P-HYCA and P-CHYCA implementations have been developed using the compute unified device architecture (CUDA) and the cuFFT library. Moreover, this library has been replaced by a fast iterative method in the P-HYCA-FAST and P-CHYCA-FAST implementations that leads to very significant speedup factors in order to achieve real-time requirements. The proposed algorithms are evaluated not only in terms of reconstruction error for different compression ratios but also in terms of computational performance using two different GPU architectures by NVIDIA: 1) GeForce GTX 590; and 2) GeForce GTX TITAN. Experiments are conducted using both simulated and real data, revealing considerable acceleration factors and obtaining good results in the task of compressing remotely sensed hyperspectral data sets.
