998 results for Parallel projection


Relevance:

20.00%

Abstract:

A parallel formulation of an algorithm for computing the histogram of n data items is developed, using an on-the-fly data decomposition and a novel quantum-like representation (QR). The QR transformation separates multiple data read operations from multiple bin update operations, thereby making it easier to bind data items to their corresponding histogram bins. Under this model, the histogram is computed in n/s + t steps, where s is a speedup factor and t is associated with pipeline latency. Here, we show that the overall speedup factor, s, allows up to an eightfold acceleration. Our evaluation also shows that each one of these cells has lower area/time complexity than similar proposals found in the literature.
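
As a rough illustration of the step-count model above, the sketch below evaluates n/s + t for a few speedup factors and pairs it with a plain sequential histogram as a reference. The function names and the sample values of n, s and t are assumptions chosen for illustration. They are not taken from the paper, and the sketch does not model the QR transformation itself.

```python
from collections import Counter


def histogram_steps(n, s, t):
    """Step count under the model n/s + t (s: speedup factor, t: pipeline latency)."""
    return n / s + t


def reference_histogram(data, num_bins):
    """Sequential reference: effectively one bin update per data item."""
    counts = Counter(data)
    return [counts.get(b, 0) for b in range(num_bins)]


if __name__ == "__main__":
    n, t = 1024, 4  # hypothetical problem size and latency
    for s in (1, 2, 4, 8):
        print(s, histogram_steps(n, s, t))
    print(reference_histogram([0, 1, 1, 3, 3, 3], 4))
```

For large n the latency term t becomes negligible, so the step count approaches n/s.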

Relevance:

20.00%

Abstract:

The paper considers second kind equations of the form (abbreviated x = y + K_z x) in which and the factor z is bounded but otherwise arbitrary so that equations of Wiener-Hopf type are included as a special case. Conditions on a set W are obtained such that a generalized Fredholm alternative is valid: if W satisfies these conditions and I − K_z is injective for each z ∈ W, then I − K_z is invertible for each z ∈ W and the operators (I − K_z)^{−1} are uniformly bounded. As a special case some classical results relating to Wiener-Hopf operators are reproduced. A finite section version of the above equation (with the range of integration reduced to [−a, a]) is considered, as are projection and iterated projection methods for its solution. The operators (where denotes the finite section version of K_z) are shown uniformly bounded (in z and a) for all a sufficiently large. Uniform stability and convergence results, for the projection and iterated projection methods, are obtained. The argument generalizes an idea in collectively compact operator theory. Some new results in this theory are obtained and applied to the analysis of projection methods for the above equation when z is compactly supported and k(s − t) replaced by the general kernel k(s,t). A boundary integral equation of the above type, which models outdoor sound propagation over inhomogeneous level terrain, illustrates the application of the theoretical results developed.

Relevance:

20.00%

Abstract:

A novel method is presented for obtaining rigorous upper bounds on the finite-amplitude growth of instabilities to parallel shear flows on the beta-plane. The method relies on the existence of finite-amplitude Liapunov (normed) stability theorems, due to Arnol'd, which are nonlinear generalizations of the classical stability theorems of Rayleigh and Fjørtoft. Briefly, the idea is to use the finite-amplitude stability theorems to constrain the evolution of unstable flows in terms of their proximity to a stable flow. Two classes of general bounds are derived, and various examples are considered. It is also shown that, for a certain kind of forced-dissipative problem with dissipation proportional to vorticity, the finite-amplitude stability theorems (which were originally derived for inviscid, unforced flow) remain valid (though they are no longer strictly Liapunov); the saturation bounds therefore continue to hold under these conditions.

Relevance:

20.00%

Abstract:

Disturbances of arbitrary amplitude are superposed on a basic flow which is assumed to be steady and either (a) two-dimensional, homogeneous, and incompressible (rotating or non-rotating) or (b) stably stratified and quasi-geostrophic. Flow over shallow topography is allowed in either case. The basic flow, as well as the disturbance, is assumed to be subject neither to external forcing nor to dissipative processes like viscosity. An exact, local ‘wave-activity conservation theorem’ is derived in which the density A and flux F are second-order ‘wave properties’ or ‘disturbance properties’, meaning that they are O(a^2) in magnitude as disturbance amplitude a → 0, and that they are evaluable correct to O(a^2) from linear theory, to O(a^3) from second-order theory, and so on to higher orders in a. For a disturbance in the form of a single, slowly varying, non-stationary Rossby wavetrain, $\overline{F}/\overline{A}$ reduces approximately to the Rossby-wave group velocity, where $\overline{(\,\cdot\,)}$ is an appropriate averaging operator. F and A have the formal appearance of Eulerian quantities, but generally involve a multivalued function the correct branch of which requires a certain amount of Lagrangian information for its determination. It is shown that, in a certain sense, the construction of conservable, quasi-Eulerian wave properties like A is unique and that the multivaluedness is inescapable in general. The connection with the concepts of pseudoenergy (quasi-energy), pseudomomentum (quasi-momentum), and ‘Eliassen-Palm wave activity’ is noted. The relationship of this and similar conservation theorems to dynamical fundamentals and to Arnol'd's nonlinear stability theorems is discussed in the light of recent advances in Hamiltonian dynamics. These show where such conservation theorems come from and how to construct them in other cases.
An elementary proof of the Hamiltonian structure of two-dimensional Eulerian vortex dynamics is put on record, with explicit attention to the boundary conditions. The connection between Arnol'd's second stability theorem and the suppression of shear and self-tuning resonant instabilities by boundary constraints is discussed, and a finite-amplitude counterpart to Rayleigh's inflection-point theorem is noted.

Relevance:

20.00%

Abstract:

Wind-generated waves at the sea surface are of outstanding importance both for their practical relevance in many areas, such as coastal erosion, coastal protection, or safety of navigation, and for their scientific relevance in modifying fluxes at the air-sea interface. So far, long-term changes in ocean wave climate have been studied mostly from a regional perspective, with global dynamical studies emerging only recently. Here a global wave climate study is presented, in which a global wave model (WAM) is driven by atmospheric forcing from a global climate model (ECHAM5) for present-day and potential future climate conditions represented by the IPCC (Intergovernmental Panel on Climate Change) A1B emission scenario. It is found that changes in mean and extreme wave climate towards the end of the twenty-first century are small to moderate, with the largest signals being a poleward shift in the annual mean and extreme significant wave heights in the mid-latitudes of both hemispheres, more pronounced in the Southern Hemisphere, and most likely associated with a corresponding shift in mid-latitude storm tracks. These changes are broadly consistent with results from the few studies available so far. The projected changes in the mean wave periods, associated with the changes in the wave climate in the mid to high latitudes, are also shown, revealing a moderate increase in the equatorial eastern side of the ocean basins. This study presents a step towards a larger ensemble of global wave climate projections required to better assess robustness and uncertainty of potential future wave climate change.

Relevance:

20.00%

Abstract:

In this paper we study convergence of the L^2-projection onto the space of polynomials up to degree p on a simplex in R^d, d ≥ 2. Optimal error estimates are established in the case of Sobolev regularity and illustrated by several numerical examples. The proof is based on the collapsed coordinate transform and the expansion into various polynomial bases involving Jacobi polynomials and their antiderivatives. The results of the present paper generalize corresponding estimates for cubes in R^d from [P. Houston, C. Schwab, E. Süli, Discontinuous hp-finite element methods for advection-diffusion-reaction problems. SIAM J. Numer. Anal. 39 (2002), no. 6, 2133-2163].

Relevance:

20.00%

Abstract:

A parallel pipelined array of cells suitable for real-time computation of histograms is proposed. The cell architecture builds on previous work obtained via C-slow retiming techniques and can be clocked at a 65 percent higher frequency than previous arrays. The new arrays can be exploited for higher throughput, particularly when dual data rate sampling techniques are used to operate on single streams of data from image sensors. In this way, the new cell operates on a p-bit data bus, which is more convenient for interfacing to camera sensors or to microprocessors in consumer digital cameras.

Relevance:

20.00%

Abstract:

We have optimised the atmospheric radiation algorithm of the FAMOUS climate model on several hardware platforms. The optimisation involved translating the Fortran code to C and restructuring the algorithm around the computation of a single air column. Instead of the existing MPI-based domain decomposition, we used a task queue and a thread pool to schedule the computation of individual columns on the available processors. Finally, four air columns are packed together in a single data structure and computed simultaneously using Single Instruction Multiple Data operations. The modified algorithm runs more than 50 times faster on the CELL’s Synergistic Processing Elements than on its main PowerPC processing element. On Intel-compatible processors, the new radiation code runs 4 times faster. On the tested graphics processor, using OpenCL, we find a speed-up of more than 2.5 times as compared to the original code on the main CPU. Because the radiation code takes more than 60% of the total CPU time, FAMOUS executes more than twice as fast. Our version of the algorithm returns bit-wise identical results, which demonstrates the robustness of our approach. We estimate that this project required around two and a half man-years of work.
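
The scheduling idea described above, independent air columns handed to a pool of worker threads through a task queue, can be sketched as follows. This is a minimal sketch and not the authors' C implementation: `compute_column` is a hypothetical stand-in for the per-column radiation calculation, and the column data are made up. In standard CPython builds the global interpreter lock prevents pure-Python threads from running in parallel, so the sketch shows the scheduling pattern only, not a speed-up.

```python
from concurrent.futures import ThreadPoolExecutor


def compute_column(column):
    """Hypothetical stand-in for the per-column radiation calculation."""
    return sum(level * 0.5 for level in column)


def compute_all_columns(columns, workers=4):
    """Schedule independent columns on a pool of worker threads.

    The executor keeps an internal queue of pending tasks; map() returns
    results in the same order as the input columns.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(compute_column, columns))


if __name__ == "__main__":
    print(compute_all_columns([[1, 2, 3], [4, 5, 6]]))
```

Because each column is computed independently, no communication between tasks is needed until the results are collected.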

Relevance:

20.00%

Abstract:

Garfield produces a critique of neo-minimalist art practice by demonstrating how the artist Melanie Jackson’s Some things you are not allowed to send around the world (2003 and 2006) and the experimental film-maker Vivienne Dick’s Liberty’s booty (1980) – neither of which can be said to be about feeling ‘at home’ in the world, be it as a resident or as a nomad – examine global humanity through multi-positionality, excess and contingency, and thereby begin to articulate a new cosmopolitan relationship with the local – or, rather, with many different localities – in one and the same maximalist sweep of the work. ‘Maximalism’ in Garfield’s coinage signifies an excessive overloading (through editing, collage, and the sheer density of the range of the material) that enables the viewer to insert themselves into the narrative of the work. In the art of both Jackson and Dick, Garfield detects a refusal to know or to judge the world; instead, there is an attempt to incorporate the complexities of its full range into the singular vision of the work, challenging the viewer to identify what is at stake.

Relevance:

20.00%

Abstract:

Exascale systems are the next frontier in high-performance computing and are expected to deliver a performance of the order of 10^18 operations per second using massive multicore processors. Very large- and extreme-scale parallel systems pose critical algorithmic challenges, especially related to concurrency, locality and the need to avoid global communication patterns. This work investigates a novel protocol for dynamic group communication that can be used to remove the global communication requirement and to reduce the communication cost in parallel formulations of iterative data mining algorithms. The protocol is used to provide a communication-efficient parallel formulation of the k-means algorithm for cluster analysis. The approach is based on a collective communication operation for dynamic groups of processes and exploits non-uniform data distributions. Non-uniform data distributions can be either found in real-world distributed applications or induced by means of multidimensional binary search trees. The analysis of the proposed dynamic group communication protocol has shown that it does not introduce significant communication overhead. The parallel clustering algorithm has also been extended to accommodate an approximation error, which allows a further reduction of the communication costs. The effectiveness of the exact and approximate methods has been tested in a parallel computing system with 64 processors and in simulations with 1024 processing elements.

Relevance:

20.00%

Abstract:

We investigate electron acceleration due to shear Alfvén waves in a collisionless plasma for plasma parameters typical of 4–5 R_E radial distance from the Earth along auroral field lines. Recent observational work has motivated this study, which explores the plasma regime where the thermal velocity of the electrons is similar to the Alfvén speed of the plasma, encouraging Landau resonance for electrons in the wave fields. We use a self-consistent kinetic simulation model to follow the evolution of the electrons as they interact with a short-duration wave pulse, which allows us to determine the parallel electric field of the shear Alfvén wave due to both electron inertia and electron pressure effects. The simulation demonstrates that electrons can be accelerated to keV energies in a modest-amplitude wave with a sub-second period. We compare the parallel electric field obtained from the simulation with those provided by fluid approximations.

Relevance:

20.00%

Abstract:

The global communication requirements and load imbalance of some parallel data mining algorithms are the major obstacles to exploiting the computational power of large-scale systems. This work investigates how non-uniform data distributions can be exploited to remove the global communication requirement and to reduce the communication cost in iterative parallel data mining algorithms. In particular, the analysis focuses on one of the most influential and popular data mining methods, the k-means algorithm for cluster analysis. The straightforward parallel formulation of the k-means algorithm requires a global reduction operation at each iteration step, which hinders its scalability. This work studies a different parallel formulation of the algorithm where the requirement of global communication can be relaxed while still providing the exact solution of the centralised k-means algorithm. The proposed approach exploits a non-uniform data distribution which can either be found in real-world distributed applications or be induced by means of multi-dimensional binary search trees. The approach can also be extended to accommodate an approximation error, which allows a further reduction of the communication costs.
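
To make the scalability issue concrete, the sketch below shows the straightforward parallel formulation mentioned above: each process computes partial sums and counts over its own data partition, and a global reduction then combines them into new centroids at every iteration. It is a single-process simulation under stated assumptions; the function names and the list-of-partitions layout are hypothetical, and the sketch does not implement the relaxed-communication formulation the abstract proposes.

```python
def assign(point, centroids):
    """Index of the nearest centroid (squared Euclidean distance)."""
    return min(
        range(len(centroids)),
        key=lambda j: sum((p - c) ** 2 for p, c in zip(point, centroids[j])),
    )


def local_step(points, centroids):
    """Per-process work: partial coordinate sums and counts for each cluster."""
    dim = len(centroids[0])
    sums = [[0.0] * dim for _ in centroids]
    counts = [0] * len(centroids)
    for point in points:
        j = assign(point, centroids)
        counts[j] += 1
        for d in range(dim):
            sums[j][d] += point[d]
    return sums, counts


def kmeans_iteration(partitions, centroids):
    """One iteration: local steps, then a global reduction over all partitions."""
    partials = [local_step(part, centroids) for part in partitions]
    new_centroids = []
    for j, old in enumerate(centroids):
        count = sum(c[j] for _, c in partials)  # global reduction of counts
        if count == 0:
            new_centroids.append(list(old))  # keep an empty cluster's centroid
            continue
        new_centroids.append(
            [sum(s[j][d] for s, _ in partials) / count for d in range(len(old))]
        )
    return new_centroids
```

The loop over `partials` is the step that requires every process to contribute at every iteration, which is the global communication the abstract seeks to avoid.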

Relevance:

20.00%

Abstract:

The time to process each of the W/B processing blocks of a median calculation method on a set of N W-bit integers is improved here by a factor of three compared to the literature. Parallelism uncovered in blocks containing B-bit slices is exploited by independent accumulative parallel counters, so that the median is calculated faster than by any previously known method for any values of N and W. The improvements to the method are discussed in the context of calculating the median for a moving set of N integers, for which a pipelined architecture is developed. An extra benefit of smaller area for the architecture is also reported.
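
The general idea of finding a median by scanning W-bit integers in W/B blocks of B-bit slices can be sketched in software as a radix selection. This is an illustrative stand-in, not the hardware method or the accumulative parallel counters of the paper: the function name, the choice of the lower median, and the per-slice counter array are all assumptions made for the example.

```python
def median_by_slices(values, width, slice_bits):
    """Lower median of `width`-bit unsigned integers via B-bit slices.

    Processes width/slice_bits blocks from the most significant slice down.
    In each block, one counter per slice value tallies the remaining
    candidates; the counters are independent of one another.
    """
    if not values:
        raise ValueError("values must not be empty")
    if width % slice_bits != 0:
        raise ValueError("width must be a multiple of slice_bits")
    candidates = list(values)
    rank = (len(candidates) - 1) // 2  # 0-based rank of the lower median
    mask = (1 << slice_bits) - 1
    result = 0
    for shift in range(width - slice_bits, -1, -slice_bits):
        counters = [0] * (1 << slice_bits)
        for v in candidates:
            counters[(v >> shift) & mask] += 1
        # Find the slice value whose bucket contains the target rank.
        for slice_value, count in enumerate(counters):
            if rank < count:
                break
            rank -= count
        result = (result << slice_bits) | slice_value
        candidates = [v for v in candidates if (v >> shift) & mask == slice_value]
    return result
```

Each pass over a block narrows the candidate set to the integers that share the median's leading slices, so the loop runs exactly W/B times regardless of N. Inputs are assumed to fit in `width` bits.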

Relevance:

20.00%

Abstract:

A great number of studies on wind conditions in passages between slab-type buildings have been conducted in the past. However, wind conditions under different building structures and configurations are still unclear, and existing studies still cannot provide guidance for urban planning and design, owing to the complexity of buildings and aerodynamics. The aim of this paper is to provide more insight into the mechanism of wind conditions in passages. In this paper, a simplified passage model with non-parallel buildings is developed on the basis of the wind tunnel experiments conducted by Blocken et al. (2008). Numerical simulation based on CFD is employed for a detailed investigation of the wind environment in passages between two long, narrow buildings with different orientations, and model validation is performed by comparing numerical results with the corresponding wind tunnel measurements.