13 results for Slot-based task-splitting algorithms
in Greenwich Academic Literature Archive - UK
Abstract:
In this paper, we shall critically examine a special class of graph matching algorithms that follow the approach of node-similarity measurement. A high-level algorithmic framework, the node-similarity graph matching framework (NSGM framework), is proposed, from which many existing graph matching algorithms can be subsumed, including the eigen-decomposition method of Umeyama, the polynomial-transformation method of Almohamad, the hubs and authorities method of Kleinberg, and the Kronecker product successive projection methods of Wyk. In addition, improved algorithms can be developed from the NSGM framework with respect to the corresponding results in graph theory. As an observation, it is pointed out that, in general, any algorithm that can be subsumed under the NSGM framework fails to work well for graphs with non-trivial auto-isomorphism structure.
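As an illustrative aside (not the NSGM framework itself), the node-similarity idea can be sketched as a coupled fixed-point iteration in the style of Kleinberg/Blondel similarity scores; `A` and `B` are the adjacency matrices of the two graphs, and the converged matrix `S` holds node-to-node similarities:

```python
import numpy as np

def node_similarity(A, B, iters=50, tol=1e-9):
    """Iteratively compute a node-to-node similarity matrix S between two
    graphs with adjacency matrices A (n x n) and B (m x m).

    Illustrative sketch only: S is updated as S <- A S B^T + A^T S B and
    renormalised, a Kleinberg/Blondel-style coupled similarity recursion.
    """
    n, m = A.shape[0], B.shape[0]
    S = np.ones((n, m))
    for _ in range(iters):
        S_new = A @ S @ B.T + A.T @ S @ B
        S_new /= np.linalg.norm(S_new)          # keep the iteration bounded
        if np.linalg.norm(S_new - S) < tol:
            return S_new
        S = S_new
    return S

# A matching can then be extracted from S, e.g. with the Hungarian algorithm
# (scipy.optimize.linear_sum_assignment) applied to -S.
```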
Abstract:
This paper discusses preconditioned Krylov subspace methods for solving large-scale linear systems that originate from oil reservoir numerical simulations. Two types of preconditioners, one based on an incomplete LU decomposition and the other based on iterative algorithms, are used together in a combination strategy in order to achieve an adaptive and efficient preconditioner. Numerical tests show that different Krylov subspace methods, combined with appropriate preconditioners, are able to achieve optimal performance.
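A minimal sketch of the general pattern, pairing an incomplete LU factorisation with a Krylov solver via SciPy (the toy matrix and settings are placeholders, not the paper's reservoir systems or combination strategy):

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

# Toy sparse system standing in for a reservoir-simulation matrix.
n = 1000
A = sp.diags([-1.0, 4.0, -1.0], [-1, 0, 1], shape=(n, n), format="csc")
b = np.ones(n)

# Incomplete LU factorisation used as a preconditioner M ~ A^{-1}.
ilu = spla.spilu(A, drop_tol=1e-4, fill_factor=10)
M = spla.LinearOperator((n, n), matvec=ilu.solve)

# Preconditioned GMRES; other Krylov methods (BiCGStab, ...) plug in the same way.
x, info = spla.gmres(A, b, M=M, maxiter=200)
print("converged" if info == 0 else f"gmres info = {info}")
```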
Abstract:
A zone based systems design framework is described and utilised in the implementation of a message authentication code (MAC) algorithm based on symmetric key block ciphers. The resulting block cipher based MAC algorithm may be used to provide assurance of the authenticity and, hence, the integrity of binary data. Using software simulation to benchmark against the de facto cipher block chaining MAC (CBC-MAC) variant used in the TinySec security protocol for wireless sensor networks and against the NIST cipher block chaining MAC standard, CMAC, we show that our zone based systems design framework can lead to block cipher based MAC constructs that point to improvements in message processing efficiency, processing throughput and processing latency.
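For reference, the baseline CBC-MAC that such constructs are benchmarked against can be sketched as follows (a minimal sketch using AES via the `cryptography` package; this is the textbook baseline, not the zone based construct described above):

```python
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

BLOCK = 16  # AES block size in bytes

def cbc_mac(key: bytes, msg: bytes) -> bytes:
    """Basic CBC-MAC: CBC-encrypt the (zero-padded) message with a zero IV
    and return the final cipher block as the tag.

    Note: plain CBC-MAC is only secure for fixed-length messages; CMAC adds
    subkey tweaks on the last block to fix this.
    """
    if len(msg) % BLOCK:
        msg += b"\x00" * (BLOCK - len(msg) % BLOCK)   # simple zero padding
    enc = Cipher(algorithms.AES(key), modes.CBC(b"\x00" * BLOCK)).encryptor()
    ct = enc.update(msg) + enc.finalize()
    return ct[-BLOCK:]                                # last block is the MAC

tag = cbc_mac(b"\x01" * 16, b"authenticate this payload")
```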
Abstract:
Orthogonal frequency division multiplexing (OFDM) is becoming a fundamental technology in future generation wireless communications. Call admission control is an effective mechanism to guarantee resilient, efficient, and quality-of-service (QoS) services in wireless mobile networks. In this paper, we present several call admission control algorithms for OFDM-based wireless multiservice networks. Call connection requests are differentiated into narrow-band calls and wide-band calls. For either class of calls, the traffic process is characterized as a batch arrival process, since each call may request multiple subcarriers to satisfy its QoS requirement. The batch size is a random variable following a probability mass function (PMF) with a realistic maximum value. In addition, the service times for wide-band and narrow-band calls are different. Following this, we perform a tele-traffic queueing analysis for OFDM-based wireless multiservice networks, and formulae for the key performance metrics, call blocking probability and bandwidth utilization, are developed. Numerical investigations are presented to demonstrate the interaction between key parameters and performance metrics. The performance tradeoff among different call admission control algorithms is discussed. Moreover, the analytical model has been validated by simulation. The methodology, as well as the results, provides an efficient tool for planning next-generation OFDM-based broadband wireless access systems.
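A simple Monte Carlo sketch of the setting being analysed (batch subcarrier requests with a complete-sharing admission rule, blocking probability as the metric); the PMF, rates and admission rule here are placeholders rather than the paper's models or policies:

```python
import random

def simulate_blocking(total_subcarriers=64, arrival_rate=1.0,
                      mean_holding=10.0, max_batch=8, n_calls=100_000,
                      seed=0):
    """Estimate call blocking probability when each call requests a random
    batch of subcarriers (uniform PMF on 1..max_batch as a stand-in) and is
    blocked if the batch cannot be accommodated."""
    rng = random.Random(seed)
    t, busy = 0.0, []          # busy: list of (release_time, subcarriers)
    blocked = 0
    for _ in range(n_calls):
        t += rng.expovariate(arrival_rate)
        busy = [(r, s) for (r, s) in busy if r > t]     # release finished calls
        need = rng.randint(1, max_batch)                # batch size ~ PMF
        in_use = sum(s for _, s in busy)
        if in_use + need > total_subcarriers:
            blocked += 1
        else:
            busy.append((t + rng.expovariate(1.0 / mean_holding), need))
    return blocked / n_calls

print(simulate_blocking())
```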
Abstract:
The availability of a very accurate dependence graph for a scalar code is the basis for the automatic generation of an efficient parallel implementation. The strategy for this task, which is encapsulated in a comprehensive data partitioning code generation algorithm, is described. This algorithm involves data partitioning, calculation of assignment ranges for partitioned arrays, addition of a comprehensive set of execution control masks, alteration of loop limits, and addition and optimisation of communications for all data. In this context, the development and implementation of strategies to merge communications wherever possible has proved an important feature in producing efficient parallel implementations for numerical mesh based codes. The code generation strategies described here are embedded within the Computer Aided Parallelisation Tools (CAPTools) software as a key part of a toolkit for automating as much as possible of the parallelisation process for mesh based numerical codes. The algorithms used enable parallelisation of real computational mechanics codes with only minor user interaction and without any prior manual customisation of the serial code to suit the parallelisation tool.
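The "calculation of assignment ranges for partitioned arrays" step can be illustrated with a small helper that block-partitions an index range over the available processors (an illustrative stand-in, not the CAPTools algorithm itself):

```python
def assignment_range(n, nproc, rank, base=1):
    """Return the (low, high) loop/assignment limits owned by processor
    `rank` when indices base..base+n-1 are block-partitioned over `nproc`
    processors, distributing any remainder one index at a time."""
    size, rem = divmod(n, nproc)
    low = base + rank * size + min(rank, rem)
    high = low + size - 1 + (1 if rank < rem else 0)
    return low, high

# e.g. a 1..100 mesh dimension over 3 processors:
print([assignment_range(100, 3, r) for r in range(3)])
# [(1, 34), (35, 67), (68, 100)]
```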
Abstract:
The performance of loadsharing algorithms for heterogeneous distributed systems is investigated by simulation. The systems considered are networks of workstations (nodes) which differ in processing power. Two parameters are proposed for characterising system heterogeneity, namely the variance and skew of the distribution of processing power among the network nodes. A variety of networks are investigated, with the same number of nodes and total processing power, but with the processing power distributed differently among the nodes. Two loadsharing algorithms are evaluated, at overall system loadings of 50% and 90%, using job response time as the performance metric. Comparison is made with the ideal situation of ‘perfect sharing’, where it is assumed that the communication delays are zero and that complete knowledge is available about job lengths and the loading at the different nodes, so that an arriving job can be sent to the node where it will be completed in the shortest time. The algorithms studied are based on those already in use for homogeneous networks, but were adapted to take account of system heterogeneity. Both algorithms take into account the differences in the processing powers of the nodes in their location policies, but differ in the extent to which they ‘discriminate’ against the slower nodes. It is seen that the relative performance of the two is strongly influenced by the system utilisation and the distribution of processing power among the nodes.
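The kind of heterogeneity-aware location policy evaluated in such studies can be sketched as scoring each candidate node by its load normalised by processing power; the weighting and threshold below are hypothetical, purely to illustrate how a policy may "discriminate" against slower nodes:

```python
def choose_node(nodes, threshold=None):
    """Pick the node with the smallest expected backlog, where `nodes` maps
    node id -> (queue_length, relative_processing_power).

    Dividing by processing power makes the policy heterogeneity-aware; an
    optional power threshold excludes nodes considered too slow.
    """
    candidates = {
        nid: (q + 1) / power              # +1 accounts for the arriving job
        for nid, (q, power) in nodes.items()
        if threshold is None or power >= threshold
    }
    return min(candidates, key=candidates.get)

nodes = {"fast": (4, 4.0), "medium": (1, 2.0), "slow": (0, 0.5)}
print(choose_node(nodes))                   # -> "medium"
print(choose_node(nodes, threshold=1.0))    # slow node excluded entirely
```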
Abstract:
This paper presents work on document retrieval based on a first-time participation in the CLEF 2001 monolingual retrieval task using French. The experiment findings indicated that Okapi, the text retrieval system in use, can successfully be used for non-English text retrieval. A lot of internal pre-processing is required in the basic search system for conversion into Okapi access formats. Various shell scripts were written to achieve the conversion in a UNIX environment, failure of which would have significantly impeded the overall performance. Based on the experiment findings using Okapi - originally designed for English - it was clear that, although most European languages share conventional word boundaries and variant word morphemes formed by the addition of suffixes, there is a significant difference between French and English retrieval depending on the adaptation of the indexing and search strategies in use. No sophisticated method for higher recall and precision, such as stemming techniques, phrase translation or de-compounding, was employed for the experiment, and our results were correspondingly poor. Future participation would include more refined query translation tools.
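Okapi's ranking is built on the BM25 weighting scheme; a minimal, self-contained sketch of BM25 scoring (independent of Okapi's own indexing and access formats) is:

```python
import math

def bm25_score(query_terms, doc_terms, docs, k1=1.2, b=0.75):
    """Score one document (a list of terms) against a query with BM25.

    `docs` is the full collection (list of term lists), used for document
    frequencies and the average document length."""
    N = len(docs)
    avgdl = sum(len(d) for d in docs) / N
    score = 0.0
    for term in query_terms:
        df = sum(1 for d in docs if term in d)
        if df == 0:
            continue
        idf = math.log((N - df + 0.5) / (df + 0.5) + 1.0)
        tf = doc_terms.count(term)
        score += idf * tf * (k1 + 1) / (tf + k1 * (1 - b + b * len(doc_terms) / avgdl))
    return score

docs = [["okapi", "retrieval"], ["french", "retrieval", "clef"], ["unix", "shell"]]
print(bm25_score(["french", "retrieval"], docs[1], docs))
```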
Abstract:
Parallel processing techniques have been used in the past to provide high performance computing resources for activities such as fire-field modelling. This has traditionally been achieved using specialized hardware and software, the expense of which would be difficult to justify for many fire engineering practices. In this article we demonstrate how typical office-based PCs attached to a Local Area Network have the potential to offer the benefits of parallel processing with minimal costs associated with the purchase of additional hardware or software. It was found that good speedups could be achieved on homogeneous networks of PCs: for example, a problem composed of ~100,000 cells would run 9.3 times faster on a network of 12 800MHz PCs than on a single 800MHz PC. It was also found that a network of eight 3.2GHz Pentium 4 PCs would run 7.04 times faster than a single 3.2GHz Pentium computer. A dynamic load balancing scheme was also devised to allow the effective use of the software on heterogeneous PC networks. This scheme also ensured that the interference between the parallel processing task and other computer users on the network was minimized.
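The dynamic load balancing idea - give each PC a share of cells proportional to its recently measured processing rate, so that slower or busier machines receive less work - can be sketched as follows (illustrative only, not the scheme implemented in the article):

```python
def rebalance(total_cells, measured_rates):
    """Allocate `total_cells` among workers in proportion to their measured
    processing rates (cells per second over the last time window).

    Largest-remainder rounding keeps the total exact."""
    total_rate = sum(measured_rates.values())
    shares = {w: total_cells * r / total_rate for w, r in measured_rates.items()}
    alloc = {w: int(s) for w, s in shares.items()}
    leftover = total_cells - sum(alloc.values())
    # hand out remaining cells to the workers with the largest fractional parts
    for w in sorted(shares, key=lambda w: shares[w] - alloc[w], reverse=True)[:leftover]:
        alloc[w] += 1
    return alloc

print(rebalance(100_000, {"pc1": 950.0, "pc2": 980.0, "pc3": 410.0}))
```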
Abstract:
This paper presents an investigation into dynamic self-adjustment of task deployment and other aspects of self-management, through the embedding of multiple policies. Non-dedicated, loosely-coupled computing environments, such as clusters and grids, are increasingly popular platforms for parallel processing. These abundant systems are highly dynamic environments in which many sources of variability affect the run-time efficiency of tasks. The dynamism is exacerbated by the incorporation of mobile devices and wireless communication. This paper proposes an adaptive strategy for the flexible run-time deployment of tasks, to continuously maintain efficiency despite the environmental variability. The strategy centres on policy-based scheduling which is informed by contextual and environmental inputs such as the variance in the round-trip communication time between a client and its workers and the effective processing performance of each worker. A self-management framework has been implemented for evaluation purposes. The framework integrates several policy-controlled, adaptive services with the application code, enabling the run-time behaviour to be adapted to contextual and environmental conditions. Using this framework, an exemplar self-managing parallel application is implemented and used to investigate the extent of the benefits of the strategy.
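A toy illustration of such a policy - the performance figures, jitter threshold and chunk sizing below are entirely hypothetical - showing how round-trip-time variance and per-worker throughput might feed a deployment decision:

```python
import statistics

def deployment_policy(worker_stats, base_chunk=100):
    """Toy policy: size each worker's task chunk by its effective processing
    rate, and shrink chunks when its round-trip time is highly variable
    (so stragglers on flaky links are detected sooner).

    `worker_stats` maps worker id -> (rate, [recent round-trip times]).
    All thresholds here are hypothetical tuning parameters.
    """
    plan = {}
    for wid, (rate, rtts) in worker_stats.items():
        jitter = statistics.pstdev(rtts) / max(statistics.mean(rtts), 1e-9)
        chunk = int(base_chunk * rate)
        if jitter > 0.5:          # highly variable link: smaller work units
            chunk = max(1, chunk // 2)
        plan[wid] = chunk
    return plan

stats = {"wired": (1.0, [10, 11, 10]), "wireless": (0.8, [30, 90, 20])}
print(deployment_policy(stats))
```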
Abstract:
This paper discusses a reliability based optimisation modelling approach demonstrated for the design of a SiP structure integrated by stacking dies one upon the other. In this investigation the focus is on the strategy for handling the uncertainties in the package design inputs and their implementation into the design optimisation modelling framework. The analysis of the thermo-mechanical behaviour of the package is utilised to predict the fatigue life-time of the lead-free board level solder interconnects and the warpage of the package under thermal cycling. The SiP characterisation is obtained through the exploitation of Reduced Order Models (ROM) constructed using high fidelity analysis and Design of Experiments (DoE) methods. The design task is to identify the optimal SiP design specification by varying several package input parameters so that a specified target reliability of the solder joints is achieved and, at the same time, the design requirements and package performance criteria are met.
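A heavily simplified sketch of the ROM/DoE pattern described: fit a quadratic response surface to a handful of DoE samples of predicted fatigue life, then optimise the design inputs subject to a target-life constraint. The sample values, cost proxy and target below are invented for illustration and are not the paper's data or models:

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical DoE samples: normalised design inputs x and fatigue life (cycles)
# predicted by high fidelity analysis at each sample point.
X = np.array([[0.2, 0.2], [0.2, 0.8], [0.8, 0.2], [0.8, 0.8],
              [0.5, 0.5], [0.2, 0.5], [0.5, 0.2]])
life = np.array([900., 1200., 1100., 1500., 1250., 1050., 1000.])

# Reduced order model: quadratic response surface fitted by least squares.
def features(x):
    x1, x2 = x
    return np.array([1.0, x1, x2, x1 * x2, x1**2, x2**2])

coeff, *_ = np.linalg.lstsq(np.array([features(x) for x in X]), life, rcond=None)
rom_life = lambda x: features(x) @ coeff

# Design task: minimise a (hypothetical) cost proxy subject to a target life.
res = minimize(lambda x: x[0] + x[1],
               x0=[0.5, 0.5], bounds=[(0.2, 0.8)] * 2,
               constraints=[{"type": "ineq", "fun": lambda x: rom_life(x) - 1300.0}])
print(res.x, rom_life(res.x))
```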
Abstract:
We consider a variety of preemptive scheduling problems with controllable processing times on a single machine and on identical/uniform parallel machines, where the objective is to minimize the total compression cost. In this paper, we propose fast divide-and-conquer algorithms for these scheduling problems. Our approach is based on the observation that each scheduling problem we discuss can be formulated as a polymatroid optimization problem. We develop a novel divide-and-conquer technique for the polymatroid optimization problem and then apply it to each scheduling problem. We show that each scheduling problem can be solved in $O(T_{\mathrm{feas}}(n) \times \log n)$ time by using our divide-and-conquer technique, where $n$ is the number of jobs and $T_{\mathrm{feas}}(n)$ denotes the time complexity of the corresponding feasible scheduling problem with $n$ jobs. This approach yields faster algorithms for most of the scheduling problems discussed in this paper.
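One way to read a bound of this shape is as the standard recurrence for a balanced divide-and-conquer in which the work at each level is dominated by one feasibility computation on the whole instance (an interpretation for illustration, not the paper's analysis):

```latex
T(n) = 2\,T(n/2) + O\big(T_{\mathrm{feas}}(n)\big)
\;\Longrightarrow\;
T(n) = O\big(T_{\mathrm{feas}}(n)\cdot\log n\big),
\quad \text{assuming } 2\,T_{\mathrm{feas}}(n/2) \le T_{\mathrm{feas}}(n).
```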
Abstract:
The fourth-order partial differential equation (PDE) proposed by You and Kaveh (the You-Kaveh fourth-order PDE), which replaces the gradient operator in classical second-order nonlinear diffusion methods with a Laplacian operator, is able to avoid the blocky effects often caused by second-order nonlinear PDEs. However, the equation brought forward by You and Kaveh tends to leave the processed images with isolated black and white speckles. Although You and Kaveh use median filters to remove these speckles, median filters can blur the processed images to some extent, which weakens the results of the You-Kaveh fourth-order PDE. In this paper, the reason why the You-Kaveh fourth-order PDE leaves the processed images with isolated black and white speckles is analyzed, and a new fourth-order PDE based on the changes of the Laplacian (LC fourth-order PDE) is proposed and tested. The new fourth-order PDE preserves the advantage of the You-Kaveh fourth-order PDE while avoiding isolated black and white speckles. Moreover, the new fourth-order PDE keeps boundaries from being blurred and preserves the nuances in the processed images, so the processed images look very natural.
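For context, the You-Kaveh equation being improved upon is u_t = -Laplacian( c(|Laplacian(u)|) * Laplacian(u) ); a minimal explicit time-stepping sketch is given below (the step size, diffusivity parameter and test image are illustrative choices, and this is the original scheme, not the LC variant proposed here):

```python
import numpy as np

def laplacian(u):
    """Five-point Laplacian with replicated (zero-flux) borders."""
    p = np.pad(u, 1, mode="edge")
    return p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:] - 4.0 * u

def you_kaveh(u, k=10.0, dt=0.02, steps=100):
    """Explicit time-stepping of the You-Kaveh fourth-order PDE
    u_t = -Laplacian( c(|Laplacian(u)|) * Laplacian(u) ),
    with diffusivity c(s) = 1 / (1 + (s/k)^2)."""
    u = u.astype(float).copy()
    for _ in range(steps):
        lap = laplacian(u)
        c = 1.0 / (1.0 + (np.abs(lap) / k) ** 2)
        u -= dt * laplacian(c * lap)
        # (the speckle artefacts discussed above arise from this scheme;
        #  the LC variant instead drives diffusion by changes of the Laplacian)
    return u

noisy = np.clip(128 + 20 * np.random.randn(64, 64), 0, 255)
smoothed = you_kaveh(noisy)
```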
Abstract:
Image inpainting refers to restoring a damaged image with missing information. The total variation (TV) inpainting model is one such method that simultaneously fills in the regions with available information from their surroundings and eliminates noise. The method works well with small, narrow inpainting domains. However, there remains an urgent need to develop fast iterative solvers, as the underlying problem sizes are large. In addition, one needs to tackle the imbalance of results between inpainting and denoising. When the inpainting regions are thick and large, the inpainting procedure works quite slowly, usually requires a significant number of iterations, and inevitably leads to oversmoothing outside the inpainting domain. To overcome these difficulties, we propose a solution for the TV inpainting model based on a nonlinear multigrid algorithm.
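For orientation, a plain gradient-descent solver for the TV inpainting model (the kind of slow baseline that a multigrid solver aims to accelerate) can be sketched as follows; the regularisation, step size and fidelity weight are illustrative choices, not the paper's scheme:

```python
import numpy as np

def tv_inpaint(image, mask, lam=10.0, dt=0.02, eps=0.1, steps=500):
    """Explicit gradient descent for TV inpainting.

    `mask` is True where pixel values are known; unknown pixels evolve under
    the (regularised) TV flow alone, known pixels are additionally pulled
    towards the observed data."""
    u = image.astype(float).copy()
    for _ in range(steps):
        ux = np.gradient(u, axis=1)
        uy = np.gradient(u, axis=0)
        mag = np.sqrt(ux**2 + uy**2 + eps**2)        # regularised gradient norm
        div = np.gradient(ux / mag, axis=1) + np.gradient(uy / mag, axis=0)
        u += dt * (div + lam * mask * (image - u))   # TV flow + data fidelity
    return u

img = np.tile(np.linspace(0, 255, 64), (64, 1))
mask = np.ones_like(img, dtype=bool)
mask[20:30, 20:40] = False                           # damaged region
restored = tv_inpaint(np.where(mask, img, 0.0), mask)
```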