293 results for Scramjet Applications


Relevance:

20.00%

Publisher:

Abstract:

In this work, we evaluate the performance of a real-world image processing application that uses a cross-correlation algorithm to compare a given image with a reference one. The algorithm processes individual images represented as 2-dimensional matrices of single-precision floating-point values using O(n^4) operations involving dot-products and additions. We implement this algorithm on an nVidia GTX 285 GPU using CUDA, and also parallelize it for the Intel Xeon (Nehalem) and IBM Power7 processors, using both manual and automatic techniques. Pthreads and OpenMP with SSE and VSX vector intrinsics are used for the manually parallelized version, while a state-of-the-art optimization framework based on the polyhedral model is used for automatic compiler parallelization and optimization. The performance of this algorithm on the nVidia GPU suffers from: (1) a smaller shared memory, (2) unaligned device memory access patterns, (3) expensive atomic operations, and (4) weaker single-thread performance. On commodity multi-core processors, the application dataset is small enough to fit in caches, and when parallelized using a combination of task and short-vector data parallelism (via SSE/VSX) or through fully automatic optimization from the compiler, the application matches or beats the performance of the GPU version. The primary reasons for better multi-core performance include larger and faster caches, higher clock frequency, higher on-chip memory bandwidth, and better compiler optimization and support for parallelization. The best performing versions on the Power7, Nehalem, and GTX 285 run in 1.02s, 1.82s, and 1.75s, respectively. These results conclusively demonstrate that, under certain conditions, it is possible for a FLOP-intensive structured application running on a multi-core processor to match or even beat the performance of an equivalent GPU version.
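
To make the O(n^4) cost in the abstract above concrete, here is a minimal serial sketch of brute-force cross-correlation in Python. It is not the authors' implementation: the function names `cross_correlate` and `best_shift` are hypothetical, and the use of circular (wrap-around) shifts is an assumption chosen to keep the example short. For n x n inputs it evaluates n*n shifts, each costing an n*n dot product.

```python
def cross_correlate(image, reference):
    """Brute-force 2D cross-correlation over every circular shift.

    Both inputs are n x n lists of floats. Circular shifting is an
    assumption of this sketch, not something stated in the abstract.
    """
    n = len(image)
    scores = [[0.0] * n for _ in range(n)]
    for dy in range(n):              # n vertical shifts
        for dx in range(n):          # n horizontal shifts
            acc = 0.0
            for y in range(n):       # n*n dot product for this shift
                for x in range(n):
                    acc += image[y][x] * reference[(y + dy) % n][(x + dx) % n]
            scores[dy][dx] = acc
    return scores


def best_shift(image, reference):
    """Return the (dy, dx) shift with the highest correlation score."""
    scores = cross_correlate(image, reference)
    n = len(scores)
    shifts = [(dy, dx) for dy in range(n) for dx in range(n)]
    return max(shifts, key=lambda s: scores[s[0]][s[1]])


if __name__ == "__main__":
    image = [[1.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
    reference = [[0.0, 0.0, 0.0], [0.0, 0.0, 1.0], [0.0, 0.0, 0.0]]
    print(best_shift(image, reference))
```

The two outer loops are independent across shifts, which is the kind of task parallelism the abstract describes, while the inner dot product is the natural target for short-vector (SSE/VSX) instructions.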

Relevance:

20.00%

Publisher:

Abstract:

Palladium and platinum dichloride complexes of a series of symmetrically and unsymmetrically substituted 25,26;27,28-dibridged p-tert-butyl-calix[4]arene bisphosphites in which two proximal phenolic oxygen atoms of p-tert-butyl- or p-H-calix[4]arene are connected to a P(OR) (R = substituted phenyl) moiety have been synthesized. The palladium dichloride complexes of calix[4]arene bisphosphites bearing sterically bulky aryl substituents undergo cyclometalation by C-C or C-H bond scission. An example of a cycloplatinated complex is also reported. The complexes have been characterized by NMR spectroscopic and single crystal X-ray diffraction studies. During crystallization of the palladium dichloride complex of a symmetrically substituted calix[4]arene bisphosphite in dichloromethane, insertion of oxygen occurs into the Pd-P bond to give a P,O-coordinated palladium dichloride complex. The calix[4]arene framework in these bisphosphites and their metal complexes adopts a distorted cone conformation; the cone conformation is more flattened in the metal complexes than in the free calix[4]arene bisphosphites. Some of these cyclometalated complexes proved to be active catalysts for Heck and Suzuki C-C cross-coupling reactions but, on average, the yields are only modest. (C) 2011 Elsevier B.V. All rights reserved.

Relevance:

20.00%

Publisher:

Abstract:

As computational Grids are increasingly used for executing long-running multi-phase parallel applications, it is important to develop efficient rescheduling frameworks that adapt application execution in response to resource and application dynamics. In this paper, three strategies or algorithms have been developed for deciding when and where to reschedule parallel applications that execute on multi-cluster Grids. The algorithms derive rescheduling plans that consist of potential points in application execution for rescheduling and schedules of resources for application execution between two consecutive rescheduling points. Using a large number of simulations, it is shown that the rescheduling plans developed by the algorithms can lead to a large decrease in application execution times when compared to executions without rescheduling on dynamic Grid resources. The rescheduling plans generated by the algorithms are also shown to be competitive when compared to the near-optimal plans generated by brute-force methods. Of the algorithms, the genetic algorithm yielded the most efficient rescheduling plans, with 9-12% smaller average execution times than the other algorithms.

Relevance:

20.00%

Publisher:

Abstract:

The idea of ubiquity and seamless connectivity in networks has gained importance recently with the emergence of mobile devices that offer added capabilities such as multiple interfaces and greater processing power. The success of ubiquitous applications depends on how effectively the user is provided with seamless connectivity. In a ubiquitous application, seamless connectivity encompasses the smooth migration of a user between networks and the automatic provision of context-based information to him/her at all times. In this work, we propose a seamless connectivity scheme in the true sense of ubiquitous networks: it provides smooth migration to a user and automatically supplies information based on his/her contexts, without re-registration with the foreign network. The scheme uses Ubi-SubSystems (USS) and Soft-Switches (SS) for maintaining the ubiquitous application resources and the users. The scheme has been tested on a ubiquitous touring system with several sets of tourist spots and users.

Relevance:

20.00%

Publisher: