12 results for automatic translation
in Greenwich Academic Literature Archive - UK
Abstract:
The availability of a very accurate dependence graph for a scalar code is the basis for the automatic generation of an efficient parallel implementation. The strategy for this task, which is encapsulated in a comprehensive data partitioning code generation algorithm, is described. This algorithm involves partitioning the data, calculating assignment ranges for partitioned arrays, adding a comprehensive set of execution control masks, altering loop limits, and adding and optimising communications for all data. In this context, the development and implementation of strategies to merge communications wherever possible has proved an important feature in producing efficient parallel implementations for numerical mesh-based codes. The code generation strategies described here are embedded within the Computer Aided Parallelisation Tools (CAPTools) software as a key part of a toolkit for automating as much as possible of the parallelisation process for mesh-based numerical codes. The algorithms used enable parallelisation of real computational mechanics codes with only minor user interaction and without any prior manual customisation of the serial code to suit the parallelisation tool.
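The partitioning steps named in this abstract (assignment ranges, execution control masks, altered loop limits) can be illustrated with a minimal sketch. It assumes a simple one-dimensional block partition with 1-based inclusive indices; the function names are hypothetical and are not taken from CAPTools.

```python
def block_range(n, nprocs, rank):
    """Assignment range: inclusive 1-based (low, high) indices owned by `rank`.

    Assumes a plain block partition; leftover elements go to the lowest ranks.
    """
    base, extra = divmod(n, nprocs)
    low = rank * base + min(rank, extra) + 1
    high = low + base - 1 + (1 if rank < extra else 0)
    return low, high


def clipped_loop_limits(loop_low, loop_high, owned):
    """Altered loop limits: restrict a serial loop to the indices this rank owns."""
    low, high = owned
    return max(loop_low, low), min(loop_high, high)


def owns(index, owned):
    """Execution control mask: True when this rank should assign `index`."""
    low, high = owned
    return low <= index <= high
```

For an array of 10 elements on 3 processors, rank 1 would own indices 5 to 7, and a serial loop running from 2 to 9 would be clipped to 5 to 7 on that rank. Communication generation and merging, which the abstract treats as the key optimisation, are not modelled here.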
Abstract:
User-supplied knowledge and interaction are vital components of a toolkit for producing high quality parallel implementations of scalar FORTRAN numerical code. In this paper we consider the components that such a parallelisation toolkit should possess to provide an effective environment to identify, extract and embed relevant user knowledge. We also examine to what extent these facilities are available in leading parallelisation tools; in particular we discuss how these issues have been addressed in the development of the user interface of the Computer Aided Parallelisation Tools (CAPTools). The CAPTools environment has been designed to enable user exploration, interaction and insertion of user knowledge to facilitate the automatic generation of very efficient parallel code. A key issue in the user's interaction is control of the volume of information, so that the user is focused only on what is needed. User control over the level and extent of information revealed at any phase is supplied using a wide variety of filters. Another issue is the way in which information is communicated. Dependence analysis and its resulting graphs involve many sophisticated, rather abstract concepts that are unlikely to be familiar to most users of parallelising tools. As such, considerable effort has been made to communicate with the user in terms that they will understand. These features, amongst others, and their use in the parallelisation process are described and their effectiveness discussed.
Abstract:
This paper addresses the exploitation of overlapping communication with calculation within parallel FORTRAN 77 codes for computational fluid dynamics (CFD) and computational structured dynamics (CSD). The objective is to overlap interprocessor communication with calculation on each processor in a distributed memory parallel system and so improve the efficiency of the parallel implementation. A general strategy for converting synchronous to overlapped communication is presented together with tools to enable its automatic implementation in FORTRAN 77 codes. This strategy is then implemented within the parallelisation toolkit, CAPTools, to facilitate the automatic generation of parallel code with overlapped communications. The success of these tools is demonstrated on two codes from the NAS-PAR and PERFECT benchmark suites. In each case, the tools produce parallel code with overlapped communications which is as good as that which could be generated manually. The parallel performance of the codes also improves in line with expectation.
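The general idea of converting synchronous communication to overlapped communication can be sketched as follows. This is only an illustration under stated assumptions: a background thread stands in for a non-blocking message-passing receive, the calculation is a three-point average on a one-dimensional chunk, and all function names are hypothetical rather than part of CAPTools.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_halo(left_edge, right_edge):
    """Stand-in for receiving the two halo values from neighbouring processors."""
    return left_edge, right_edge


def smooth_synchronous(local, left_edge, right_edge):
    """Reference version: complete the communication, then do all the calculation."""
    left, right = fetch_halo(left_edge, right_edge)
    padded = [left] + list(local) + [right]
    return [(padded[i] + padded[i + 1] + padded[i + 2]) / 3.0
            for i in range(len(local))]


def smooth_overlapped(local, left_edge, right_edge):
    """Overlapped version: interior points are updated while the halo is in flight."""
    n = len(local)
    if n < 2:
        raise ValueError("this sketch needs at least two local points")
    new = [0.0] * n
    with ThreadPoolExecutor(max_workers=1) as pool:
        # 1. Start the communication without waiting for it.
        pending = pool.submit(fetch_halo, left_edge, right_edge)
        # 2. Update interior points, which depend only on local data.
        for i in range(1, n - 1):
            new[i] = (local[i - 1] + local[i] + local[i + 1]) / 3.0
        # 3. Wait for the halo, then update the two points that need it.
        left, right = pending.result()
    new[0] = (left + local[0] + local[1]) / 3.0
    new[n - 1] = (local[n - 2] + local[n - 1] + right) / 3.0
    return new
```

Both versions return the same values; the overlapped one only reorders the work so that calculation not depending on incoming data proceeds while the exchange is pending.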
Abstract:
The most common parallelisation strategy for many Computational Mechanics (CM) codes which use structured meshes (typified by Computational Fluid Dynamics (CFD) applications) involves a 1D partition based upon slabs of cells. However, many CFD codes employ pipeline operations in their solution procedure. For parallelised versions of such codes to scale well they must employ two (or more) dimensional partitions. This paper describes an algorithmic approach to multi-dimensional mesh partitioning in code parallelisation, its implementation in a toolkit for almost automatically transforming scalar codes to parallel form, and its testing on a range of ‘real-world’ FORTRAN codes. The concept of multi-dimensional partitioning is straightforward, but it is non-trivial to represent as an algorithm generic enough to be embedded in a code transformation tool. The results of the tests on fine real-world codes demonstrate clear improvements in parallel performance and scalability (over a 1D partition). This is matched by a huge reduction in the time required to develop the parallel versions compared with hand coding, from weeks/months down to hours/days.
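The index bookkeeping behind a two-dimensional partition can be sketched as below. The sketch assumes a rectangular structured mesh, 0-based half-open index ranges and an even block split; the names are hypothetical. It also counts the boundary cells a block shares with its neighbours, one commonly cited difference between slab and 2D partitions, although the abstract's own motivation is pipeline operations, which are not modelled here.

```python
def split(n, parts, p):
    """Half-open (start, stop) range of part `p` when `n` cells are split into `parts`."""
    base, extra = divmod(n, parts)
    start = p * base + min(p, extra)
    return start, start + base + (1 if p < extra else 0)


def partition_2d(ni, nj, pi, pj):
    """Map each processor coordinate (a, b) to its (i-range, j-range) on an ni x nj mesh."""
    return {(a, b): (split(ni, pi, a), split(nj, pj, b))
            for a in range(pi) for b in range(pj)}


def shared_boundary_cells(i_range, j_range, ni, nj):
    """Count cells on the faces of a block that touch a neighbouring block."""
    (i0, i1), (j0, j1) = i_range, j_range
    count = 0
    if i0 > 0:
        count += j1 - j0   # face shared with the block below in i
    if i1 < ni:
        count += j1 - j0   # face shared with the block above in i
    if j0 > 0:
        count += i1 - i0   # face shared with the block below in j
    if j1 < nj:
        count += i1 - i0   # face shared with the block above in j
    return count
```

On a 64 x 64 mesh with 16 processors, an interior slab of a 16 x 1 partition shares 128 boundary cells with its neighbours, while an interior block of a 4 x 4 partition shares 64.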
Abstract:
The shared-memory programming model can be an effective way to achieve parallelism on shared memory parallel computers. Historically, however, the lack of a programming standard using directives and the limited scalability have affected its take-up. Recent advances in hardware and software technologies have resulted in improvements to both the performance of parallel programs with compiler directives and the issue of portability with the introduction of OpenMP. In this study, the Computer Aided Parallelisation Toolkit has been extended to automatically generate OpenMP-based parallel programs with nominal user assistance. We categorize the different loop types and show how efficient directives can be placed using the toolkit's in-depth interprocedural analysis. Examples are taken from the NAS parallel benchmarks and a number of real-world application codes. This demonstrates the great potential of using the toolkit to quickly parallelise serial programs as well as the good performance achievable on up to 300 processors for hybrid message passing-directive parallelisations.
Abstract:
In this work we show how automatic relative debugging can be used to find differences in computation between a correct serial program and an OpenMP parallel version of that program that does not yield correct results. Backtracking and re-execution are used to determine the first OpenMP parallel region that produces a difference in computation that may lead to an incorrect value the user has indicated. Our approach also lends itself to finding differences between parallel computations, where executing with M threads produces expected results but an N thread execution does not (M, N > 1, M ≠ N). OpenMP programs created using a parallelization tool are addressed by utilizing static analysis and directive information from the tool. Hand-parallelized programs, where OpenMP directives are inserted by the user, are addressed by performing data dependence and directive analysis.
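The comparison at the heart of relative debugging can be sketched in a simplified form. The abstract describes backtracking and re-execution driven by a value the user has flagged; the sketch below instead assumes that the values produced by each parallel region have already been recorded for both runs and scans them forward. The trace layout and function name are hypothetical.

```python
import math


def first_divergent_region(reference, suspect, rel_tol=1e-9):
    """Return the name of the first region whose recorded values differ, else None.

    `reference` and `suspect` are lists of (region_name, values) pairs recorded
    in execution order, for example from a serial run and an OpenMP run.
    """
    if len(reference) != len(suspect):
        raise ValueError("the two runs recorded a different number of regions")
    for (ref_name, ref_vals), (sus_name, sus_vals) in zip(reference, suspect):
        if ref_name != sus_name:
            raise ValueError("the two runs executed different regions")
        if len(ref_vals) != len(sus_vals):
            return ref_name
        for r, s in zip(ref_vals, sus_vals):
            # A tolerance allows for harmless reordering of floating-point sums.
            if not math.isclose(r, s, rel_tol=rel_tol, abs_tol=1e-12):
                return ref_name
    return None
```

The same function can compare an M-thread run against an N-thread run, since it only needs two traces with matching region names.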
Abstract:
In Sofia Coppola's 2003 film Lost in Translation, Bill Murray and Scarlett Johansson's characters find themselves culturally stranded and oddly mismatched as an improvised tourist couple in contemporary Tokyo. This is an urban landscape that they cannot comprehend but only temporarily experience, in a fragmented and surreptitious way that allows no possible understanding and categorizations, but offers physical inclusion, emotional participation and momentary embeddedness.
Abstract:
The effectiveness of corporate governance mechanisms has been a subject of academic research for many decades. Although the large majority of corporate governance studies prior to the mid-1990s were based on data from developed market economies such as the U.S., U.K. and Japan, in recent years researchers have begun examining corporate governance in transition economies. A comparison of China and India offers a unique environment for analyzing the effectiveness of corporate governance. First, both countries' state-owned enterprise (SOE) reform strategies hinge on the Modern Enterprise System, characterized by the separation of ownership and control. Ownership of an SOE’s assets is distributed among the government, institutional investors, managers, employees, and private investors. Effective control rights are assigned to management, which generally has a very small, or even nonexistent, ownership stake. This distinctive shareholding structure creates conflicts of interest not only between management (insiders) and outside investors but also between large shareholders and minority investors. Moreover, because both governments desire to retain some control, in part through partial retained ownership of commercialized SOEs, further conflicts arise between politicians and firms. Second, directors in publicly listed firms in both countries are predominantly drawn from institutions with significant non-market objectives: the government and other state enterprises, particularly in China, and extended families, particularly in India. As a result, the effectiveness of internal governance mechanisms, such as the number of independent directors on the board and the number of independent supervisors on the supervisory committee, is likely to be quite limited, although this has yet to be fully evaluated.
Third, because of the political nature of the privatization process itself, typical external governance mechanisms, such as debt (in conjunction with appropriate bankruptcy procedures), takeover threats, legal protection of investors, product market competition, etc., have not been effective. Bank loans have traditionally been viewed as grants from the state designed to bail out failing firms. State-owned banks retain monopoly or quasi-monopoly positions in the banking sector and profit is not their overriding objective. If political favor is deemed appropriate, subsidized loans, rescheduling of overdue debt or even outright transfer of funds can be arranged with SOEs (soft budget constraints). In addition, the market for private, non-bank debt is limited in India and has yet to be established in China. There is no active merger or takeover activity in Chinese stock markets to discipline management. Information available in the capital markets is insufficient to keep corporate decisions at arm’s length. In light of the above peculiarities, China and India share many of the typical institutional characteristics of a transition economy, including poor legal protection of creditors and investors, the absence of an effective takeover market, an underdeveloped capital market, a relatively inefficient banking system and significant interference of politicians in firm management. Su (2005) finds that the extent of political interference, managerial entrenchment and institutional control can help explain corporate dividend policies and post-IPO financing choices in this situation. Allen et al. (2005) demonstrate that standard corporate governance mechanisms are weak and ineffective for publicly listed firms while alternative governance mechanisms based on reputation and relationship have been remarkably effective in the private sector.
Because the peculiarities are significant in this context, the differences in the political economies of the two countries are likely to be evident in such relational terms. In this paper we explore the peculiarities of corporate governance in this transitional environment through a systematic examination of certain aspects of these reputational and relationship dimensions. Utilising the methods of social network analysis, we identify the inter-organisational relationships at board level formed by equity holdings and by shared directors. Using data drawn from the Orbis database, we map these relations among the 3700 largest firms in India and China respectively and identify the roles played in these relational networks by the particularly characteristic institutions in each case. We find greatly different social network structures in each case, with some support in these relational dimensions for their distinctive features of governance. Further, the social network metrics allow us to considerably refine proxies for political interference, managerial entrenchment and institutional control used in earlier econometric analysis.
Abstract:
The rapid prototyping (RP) process is being used widely with great potential for rapid manufacturing of functional parts. The RP process involves translation of the CAD file to STL format followed by slicing of the model into multiple horizontal layers, each of which is reproduced physically in making the prototype. The thickness of the resulting slices has a profound effect on the surface finish and build time of the prototype. The purpose of this paper is to show the effects of slice thickness on the surface finish, layering error, and build time of a prototype, as well as to show how an efficient STL file can be developed. Three objects were modeled and STL files were generated. One STL file for each object was sliced using different slice thicknesses, and the build times were obtained. Screenshots were used to show the slicing effect on layering error and surface finish and to demonstrate the means to a more efficient STL file. From the results, it is clear that the surface finish and build time are important factors that are affected by slice thickness.
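The trade-off described in this abstract can be illustrated with a short geometric sketch. It is not the paper's method: it assumes uniform slices, takes the layer count as a rough proxy for build time, and uses the standard stair-step geometry in which the cusp height on a flat facet equals the slice thickness times the cosine of the angle between the facet normal and the build direction. The function names are hypothetical.

```python
import math


def layer_count(height_mm, slice_thickness_mm):
    """Number of uniform horizontal slices needed to cover a part of the given height."""
    if height_mm <= 0 or slice_thickness_mm <= 0:
        raise ValueError("height and slice thickness must be positive")
    return math.ceil(height_mm / slice_thickness_mm)


def cusp_height(slice_thickness_mm, normal_angle_deg):
    """Stair-step (layering) error on a flat facet.

    `normal_angle_deg` is the angle between the facet normal and the build
    direction: 0 for a horizontal facet, 90 for a vertical wall.
    """
    return slice_thickness_mm * abs(math.cos(math.radians(normal_angle_deg)))
```

Halving the slice thickness from 0.5 mm to 0.25 mm on a 30 mm part doubles the layer count from 60 to 120 while halving the cusp height on every sloped facet, which is the tension between surface finish and build time that the abstract reports.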