169 resultados para Parallelizing Compilers


Relevância:

10.00% 10.00%

Publicador:

Resumo:

This paper presents improved unification algorithms, an implementation, and an analysis of the effectiveness of an abstract interpreter based on the sharing + freeness domain presented in a previous paper, which was designed to accurately and concisely represent combined freeness and sharing information for program variables. We first briefly review this domain and the unification algorithms previously proposed. We then improve these algorithms and correct them to deal with some cases which were not well analyzed previously, illustrating the improvement with an example. We then present the implementation of the improved algorithm and evaluate its performance by comparing the effectiveness of the information inferred to that of other interpreters available to us for an application (program parallelization) that is common to all these interpreters. All these systems have been embedded in a real parallelizing compiler. Effectiveness of the analysis is measured in terms of actual final performance of the system: i.e. in terms of the actual speedups obtained. The results show good performance for the combined domain in that it improves the accuracy of both types of information and also in that the analyzer using the combined domain is more effective in the application than any of the other analyzers it is compared to.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

The characteristics of CC and CLP systems are in principle very dierent However a recent trend towards convergence in the implementation techniques for these systems can be observed While CLP and Prolog systems have been incorporating capabilities to deal with userdened suspension and coroutining CC compilers have been trying to coalesce negrained tasks into coarsergrained sequential threads This convergence of techniques opens up the possibility of having a general purpose kernel language and abstract machine to serve as a compilation target for a variety of userlevel languages We propose a transformation technique directed towards such an objective In particular we report on techniques to support the Andorra computational model essentially emulating the AndorraI system via program transformation into a sequential language with delay primitives The system is automatic comprising an optional program analyzer and a basic transformer to the kernel language It turns out that a simple parallel CLP or Prolog system with dynamic scheduling is sucient as a kernel language for this purpose The preliminary results are quite encouraging performance of the resulting system is comparable to the current AndorraI implementation.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Este proyecto fin de grado presenta dos herramientas, Papify y Papify-Viewer, para medir y visualizar, respectivamente, las prestaciones a bajo nivel de especificaciones RVC-CAL basándose en eventos hardware. RVC-CAL es un lenguaje de flujo de datos estandarizado por MPEG y utilizado para definir herramientas relacionadas con la codificación de vídeo. La estructura de los programas descritos en RVC-CAL se basa en unidades funcionales llamadas actores, que a su vez se subdividen en funciones o procedimientos llamados acciones. ORCC (Open RVC-CAL Compiler) es un compilador de código abierto que utiliza como entrada descripciones RVC-CAL y genera a partir de ellas código fuente en un lenguaje dado, como por ejemplo C. Internamente, el compilador ORCC se divide en tres etapas distinguibles: front-end, middle-end y back-end. La implementación de Papify consiste en modificar la etapa del back-end del compilador, encargada de la generación de código, de modo tal que los actores, al ser traducidos a lenguaje C, queden instrumentados con PAPI (Performance Application Programing Interface), una herramienta utilizada como interfaz a los registros contadores de rendimiento (PMC) de los procesadores. Además, también se modifica el front-end para permitir identificar cierto tipo de anotaciones en las descripciones RVC-CAL, utilizadas para que el diseñador pueda indicar qué actores o acciones en particular se desean analizar. Los actores instrumentados, además de conservar su funcionalidad original, generan una serie de ficheros que contienen datos sobre los distintos eventos hardware que suceden a lo largo de su ejecución. Los eventos incluidos en estos ficheros son configurables dentro de las anotaciones previamente mencionadas. La segunda herramienta, Papify-Viewer, utiliza los datos generados por Papify y los procesa, obteniendo una representación visual de la información a dos niveles: por un lado, representa cronológicamente la ejecución de la aplicación, distinguiendo cada uno de los actores a lo largo de la misma. Por otro lado, genera estadísticas sobre la cantidad de eventos disparados por acción, actor o núcleo de ejecución y las representa mediante gráficos de barra. Ambas herramientas pueden ser utilizadas en conjunto para verificar el funcionamiento del programa, balancear la carga de los actores o la distribución por núcleos de los mismos, mejorar el rendimiento y diagnosticar problemas. ABSTRACT. This diploma project presents two tools, Papify and Papify-Viewer, used to measure and visualize the low level performance of RVC-CAL specifications based on hardware events. RVC-CAL is a dataflow language standardized by MPEG which is used to define video codec tools. The structure of the applications described in RVC-CAL is based on functional units called actors, which are in turn divided into smaller procedures called actions. ORCC (Open RVC-CAL Compiler) is an open-source compiler capable of transforming RVC-CAL descriptions into source code in a given language, such as C. Internally, the compiler is divided into three distinguishable stages: front-end, middle-end and back-end. Papify’s implementation consists of modifying the compiler’s back-end stage, which is responsible for generating the final source code, so that translated actors in C code are now instrumented with PAPI (Performance Application Programming Interface), a tool that provides an interface to the microprocessor’s performance monitoring counters (PMC). In addition, the front-end is also modified in such a way that allows identification of a certain type of annotations in the RVC-CAL descriptions, allowing the designer to set the actors or actions to be included in the measurement. Besides preserving their initial behavior, the instrumented actors will also generate a set of files containing data about the different events triggered throughout the program’s execution. The events included in these files can be configured inside the previously mentioned annotations. The second tool, Papify-Viewer, makes use of the files generated by Papify to process them and provide a visual representation of the information in two different ways: on one hand, a chronological representation of the application’s execution where each actor has its own timeline. On the other hand, statistical information is generated about the amount of triggered events per action, actor or core. Both tools can be used together to assert the normal functioning of the program, balance the load between actors or cores, improve performance and identify problems.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

La evolución de los teléfonos móviles inteligentes, dotados de cámaras digitales, está provocando una creciente demanda de aplicaciones cada vez más complejas que necesitan algoritmos de visión artificial en tiempo real; puesto que el tamaño de las señales de vídeo no hace sino aumentar y en cambio el rendimiento de los procesadores de un solo núcleo se ha estancado, los nuevos algoritmos que se diseñen para visión artificial han de ser paralelos para poder ejecutarse en múltiples procesadores y ser computacionalmente escalables. Una de las clases de procesadores más interesantes en la actualidad se encuentra en las tarjetas gráficas (GPU), que son dispositivos que ofrecen un alto grado de paralelismo, un excelente rendimiento numérico y una creciente versatilidad, lo que los hace interesantes para llevar a cabo computación científica. En esta tesis se exploran dos aplicaciones de visión artificial que revisten una gran complejidad computacional y no pueden ser ejecutadas en tiempo real empleando procesadores tradicionales. En cambio, como se demuestra en esta tesis, la paralelización de las distintas subtareas y su implementación sobre una GPU arrojan los resultados deseados de ejecución con tasas de refresco interactivas. Asimismo, se propone una técnica para la evaluación rápida de funciones de complejidad arbitraria especialmente indicada para su uso en una GPU. En primer lugar se estudia la aplicación de técnicas de síntesis de imágenes virtuales a partir de únicamente dos cámaras lejanas y no paralelas—en contraste con la configuración habitual en TV 3D de cámaras cercanas y paralelas—con información de color y profundidad. Empleando filtros de mediana modificados para la elaboración de un mapa de profundidad virtual y proyecciones inversas, se comprueba que estas técnicas son adecuadas para una libre elección del punto de vista. Además, se demuestra que la codificación de la información de profundidad con respecto a un sistema de referencia global es sumamente perjudicial y debería ser evitada. Por otro lado se propone un sistema de detección de objetos móviles basado en técnicas de estimación de densidad con funciones locales. Este tipo de técnicas es muy adecuada para el modelado de escenas complejas con fondos multimodales, pero ha recibido poco uso debido a su gran complejidad computacional. El sistema propuesto, implementado en tiempo real sobre una GPU, incluye propuestas para la estimación dinámica de los anchos de banda de las funciones locales, actualización selectiva del modelo de fondo, actualización de la posición de las muestras de referencia del modelo de primer plano empleando un filtro de partículas multirregión y selección automática de regiones de interés para reducir el coste computacional. Los resultados, evaluados sobre diversas bases de datos y comparados con otros algoritmos del estado del arte, demuestran la gran versatilidad y calidad de la propuesta. Finalmente se propone un método para la aproximación de funciones arbitrarias empleando funciones continuas lineales a tramos, especialmente indicada para su implementación en una GPU mediante el uso de las unidades de filtraje de texturas, normalmente no utilizadas para cómputo numérico. La propuesta incluye un riguroso análisis matemático del error cometido en la aproximación en función del número de muestras empleadas, así como un método para la obtención de una partición cuasióptima del dominio de la función para minimizar el error. ABSTRACT The evolution of smartphones, all equipped with digital cameras, is driving a growing demand for ever more complex applications that need to rely on real-time computer vision algorithms. However, video signals are only increasing in size, whereas the performance of single-core processors has somewhat stagnated in the past few years. Consequently, new computer vision algorithms will need to be parallel to run on multiple processors and be computationally scalable. One of the most promising classes of processors nowadays can be found in graphics processing units (GPU). These are devices offering a high parallelism degree, excellent numerical performance and increasing versatility, which makes them interesting to run scientific computations. In this thesis, we explore two computer vision applications with a high computational complexity that precludes them from running in real time on traditional uniprocessors. However, we show that by parallelizing subtasks and implementing them on a GPU, both applications attain their goals of running at interactive frame rates. In addition, we propose a technique for fast evaluation of arbitrarily complex functions, specially designed for GPU implementation. First, we explore the application of depth-image–based rendering techniques to the unusual configuration of two convergent, wide baseline cameras, in contrast to the usual configuration used in 3D TV, which are narrow baseline, parallel cameras. By using a backward mapping approach with a depth inpainting scheme based on median filters, we show that these techniques are adequate for free viewpoint video applications. In addition, we show that referring depth information to a global reference system is ill-advised and should be avoided. Then, we propose a background subtraction system based on kernel density estimation techniques. These techniques are very adequate for modelling complex scenes featuring multimodal backgrounds, but have not been so popular due to their huge computational and memory complexity. The proposed system, implemented in real time on a GPU, features novel proposals for dynamic kernel bandwidth estimation for the background model, selective update of the background model, update of the position of reference samples of the foreground model using a multi-region particle filter, and automatic selection of regions of interest to reduce computational cost. The results, evaluated on several databases and compared to other state-of-the-art algorithms, demonstrate the high quality and versatility of our proposal. Finally, we propose a general method for the approximation of arbitrarily complex functions using continuous piecewise linear functions, specially formulated for GPU implementation by leveraging their texture filtering units, normally unused for numerical computation. Our proposal features a rigorous mathematical analysis of the approximation error in function of the number of samples, as well as a method to obtain a suboptimal partition of the domain of the function to minimize approximation error.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

This research aims to show the main points of convergence between hegemonic schools of economic and sociological theory from the Scottish Enlightenment until today. To this end, on the one hand, we set three basic families of economic thought (the mainstream, the Austrian school and Marxism); and, on the other, we divide the history of sociology in five major generations (pioneers, founders, institutionalizers, compilers and constructivist). Subsequently, we set five historical periods as reference to our respective chapters and compare, within each of them, the theoretical contributions from these two areas. Thus, in the first chapter, called "the liberal parenthesis", we consider the relationship between the classical school of economics and the pioneers and founders of sociology. In the second, entitled "the social question" we analyze, on the one hand, the theoretical consistency of both the neoclassical school, as Austrian, with the principles defended by the institutionalizers of sociology; and, on the other, the influence of Karl Marx, as founder of sociology and classical economist, in the work of Soviet revolutionary theorists. In chapter three, called the "new industrial state", we demonstrate the theoretical proximity between both Keynesianism and the Austrian school of economics, with the doctrine defended by the generation of compilers in sociology. The fourth chapter, entitled "second industrial divide", refers to the similarities between the theoretical contributions of the monetarist Chicago school and the Austrian school with sociological constructivism. Finally, chapter five, the "global market", shows that the two hegemonic schools in economics, "integrated model", and sociology, "analytical sociology", are composed of the same three schools of thought: the rational choice, the neo-institutionalism and network approach. Thus, we can conclude that, if we look at their respective areas of influence, during this historical period occurs an manifest agreement between the theoretical contributions from the economic and sociological fields.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

A rápida evolução do hardware demanda uma evolução contínua dos compiladores. Um processo de ajuste deve ser realizado pelos projetistas de compiladores para garantir que o código gerado pelo compilador mantenha uma determinada qualidade, seja em termos de tempo de processamento ou outra característica pré-definida. Este trabalho visou automatizar o processo de ajuste de compiladores por meio de técnicas de aprendizado de máquina. Como resultado os planos de compilação obtidos usando aprendizado de máquina com as características propostas produziram código para programas cujos valores para os tempos de execução se aproximaram daqueles seguindo o plano padrão utilizado pela LLVM.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Current model-driven Web Engineering approaches (such as OO-H, UWE or WebML) provide a set of methods and supporting tools for a systematic design and development of Web applications. Each method addresses different concerns using separate models (content, navigation, presentation, business logic, etc.), and provide model compilers that produce most of the logic and Web pages of the application from these models. However, these proposals also have some limitations, especially for exchanging models or representing further modeling concerns, such as architectural styles, technology independence, or distribution. A possible solution to these issues is provided by making model-driven Web Engineering proposals interoperate, being able to complement each other, and to exchange models between the different tools. MDWEnet is a recent initiative started by a small group of researchers working on model-driven Web Engineering (MDWE). Its goal is to improve current practices and tools for the model-driven development of Web applications for better interoperability. The proposal is based on the strengths of current model-driven Web Engineering methods, and the existing experience and knowledge in the field. This paper presents the background, motivation, scope, and objectives of MDWEnet. Furthermore, it reports on the MDWEnet results and achievements so far, and its future plan of actions.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Modern compilers present a great and ever increasing number of options which can modify the features and behavior of a compiled program. Many of these options are often wasted due to the required comprehensive knowledge about both the underlying architecture and the internal processes of the compiler. In this context, it is usual, not having a single design goal but a more complex set of objectives. In addition, the dependencies between different goals are difficult to be a priori inferred. This paper proposes a strategy for tuning the compilation of any given application. This is accomplished by using an automatic variation of the compilation options by means of multi-objective optimization and evolutionary computation commanded by the NSGA-II algorithm. This allows finding compilation options that simultaneously optimize different objectives. The advantages of our proposal are illustrated by means of a case study based on the well-known Apache web server. Our strategy has demonstrated an ability to find improvements up to 7.5% and up to 27% in context switches and L2 cache misses, respectively, and also discovers the most important bottlenecks involved in the application performance.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

"The original design and preliminary coding of the PLONE compiler was done in a seminar course on compilers led by Professor Robert S. Northcote at the University of Illinois during the fall semester of 1969"--P. iii.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Compilers: John K. Chandler, John C. Redman, Virginia H. Wood, Caroline Larner.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

In Sämtliche Werke der Kirchenväter, v.7, pp.333-386; 8-12; 13, pp.1-222.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Compilers: 1841/42, S.J. Willis.-1842/43-1866, D.T. Valentine.- 1868-69, J. Shannon.- 1970, J. Hardy.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

1904 edition incudes Hawaii; 19 -14 include Canada, Hawaii and Cuba; 1915- include Alaska and Hawaii.