80 resultados para run-time profiling
Resumo:
CIAO is an advanced programming environment supporting Logic and Constraint programming. It offers a simple concurrent kernel on top of which declarative and non-declarative extensions are added via librarles. Librarles are available for supporting the ISOProlog standard, several constraint domains, functional and higher order programming, concurrent and distributed programming, internet programming, and others. The source language allows declaring properties of predicates via assertions, including types and modes. Such properties are checked at compile-time or at run-time. The compiler and system architecture are designed to natively support modular global analysis, with the two objectives of proving properties in assertions and performing program optimizations, including transparently exploiting parallelism in programs. The purpose of this paper is to report on recent progress made in the context of the CIAO system, with special emphasis on the capabilities of the compiler, the techniques used for supporting such capabilities, and the results in the áreas of program analysis and transformation already obtained with the system.
Resumo:
In an increasing number of applications (e.g., in embedded, real-time, or mobile systems) it is important or even essential to ensure conformance with respect to a specification expressing resource usages, such as execution time, memory, energy, or user-defined resources. In previous work we have presented a novel framework for data size-aware, static resource usage verification. Specifications can include both lower and upper bound resource usage functions. In order to statically check such specifications, both upper- and lower-bound resource usage functions (on input data sizes) approximating the actual resource usage of the program which are automatically inferred and compared against the specification. The outcome of the static checking of assertions can express intervals for the input data sizes such that a given specification can be proved for some intervals but disproved for others. After an overview of the approach in this paper we provide a number of novel contributions: we present a full formalization, and we report on and provide results from an implementation within the Ciao/CiaoPP framework (which provides a general, unified platform for static and run-time verification, as well as unit testing). We also generalize the checking of assertions to allow preconditions expressing intervals within which the input data size of a program is supposed to lie (i.e., intervals for which each assertion is applicable), and we extend the class of resource usage functions that can be checked.
Resumo:
Ciao Prolog incorporates a module system which allows sepárate compilation and sensible creation of standalone executables. We describe some of the main aspects of the Ciao modular compiler, ciaoc, which takes advantage of the characteristics of the Ciao Prolog module system to automatically perform sepárate and incremental compilation and efficiently build small, standalone executables with competitive run-time performance, ciaoc can also detect statically a larger number of programming errors. We also present a generic code processing library for handling modular programs, which provides an important part of the functionality of ciaoc. This library allows the development of program analysis and transformation tools in a way that is to some extent orthogonal to the details of module system design, and has been used in the implementation of ciaoc and other Ciao system tools. We also describe the different types of executables which can be generated by the Ciao compiler, which offer different tradeoffs between executable size, startup time, and portability, depending, among other factors, on the linking regime used (static, dynamic, lazy, etc.). Finally, we provide experimental data which illustrate these tradeoffs.
Resumo:
Ciao Prolog incorporates a module system which allows sepárate compilation and sensible creation of standalone executables. We describe some of the main aspects of the Ciao modular compiler, ciaoc, which takes advantage of the characteristics of the Ciao Prolog module system to automatically perform sepárate and incremental compilation and efficiently build small, standalone executables with competitive run-time performance, ciaoc can also detect statically a larger number of programming errors. We also present a generic code processing library for handling modular programs, which provides an important part of the functionality of ciaoc. This library allows the development of program analysis and transformation tools in a way that is to some extent orthogonal to the details of module system design, and has been used in the implementation of ciaoc and other Ciao system tools. We also describe the different types of executables which can be generated by the Ciao compiler, which offer different tradeoffs between executable size, startup time, and portability, depending, among other factors, on the linking regime used (static, dynamic, lazy, etc.). Finally, we provide experimental data which illustrate these tradeoffs.
Resumo:
CIAO is an advanced programming environment supporting Logic and Constraint programming. It offers a simple concurrent kernel on top of which declarative and non-declarative extensions are added via librarles. Librarles are available for supporting the ISOProlog standard, several constraint domains, functional and higher order programming, concurrent and distributed programming, internet programming, and others. The source language allows declaring properties of predicates via assertions, including types and modes. Such properties are checked at compile-time or at run-time. The compiler and system architecture are designed to natively support modular global analysis, with the two objectives of proving properties in assertions and performing program optimizations, including transparently exploiting parallelism in programs. The purpose of this paper is to report on recent progress made in the context of the CIAO system, with special emphasis on the capabilities of the compiler, the techniques used for supporting such capabilities, and the results in the áreas of program analysis and transformation already obtained with the system.
Resumo:
Knowing the size of the terms to which program variables are bound at run-time in logic programs is required in a class of applications related to program optimization such as, for example, recursion elimination and granularity analysis. Such size is difficult to even approximate at compile time and is thus generally computed at run-time by using (possibly predefined) predicates which traverse the terms involved. We propose a technique based on program transformation which has the potential of performing this computation much more efficiently. The technique is based on finding program procedures which are called before those in which knowledge regarding term sizes is needed and which traverse the terms whose size is to be determined, and transforming such procedures so that they compute term sizes "on the fly". We present a systematic way of determining whether a given program can be transformed in order to compute a given term size at a given program point without additional term traversal. Also, if several such transformations are possible our approach allows finding minimal transformations under certain criteria. We also discuss the advantages and present some applications of our technique.
Resumo:
Esta tesis doctoral se enmarca dentro de la computación con membranas. Se trata de un tipo de computación bio-inspirado, concretamente basado en las células de los organismos vivos, en las que se producen múltiples reacciones de forma simultánea. A partir de la estructura y funcionamiento de las células se han definido diferentes modelos formales, denominados P sistemas. Estos modelos no tratan de modelar el comportamiento biológico de una célula, sino que abstraen sus principios básicos con objeto de encontrar nuevos paradigmas computacionales. Los P sistemas son modelos de computación no deterministas y masivamente paralelos. De ahí el interés que en los últimos años estos modelos han suscitado para la resolución de problemas complejos. En muchos casos, consiguen resolver de forma teórica problemas NP-completos en tiempo polinómico o lineal. Por otra parte, cabe destacar también la aplicación que la computación con membranas ha tenido en la investigación de otros muchos campos, sobre todo relacionados con la biología. Actualmente, una gran cantidad de estos modelos de computación han sido estudiados desde el punto de vista teórico. Sin embargo, el modo en que pueden ser implementados es un reto de investigación todavía abierto. Existen varias líneas en este sentido, basadas en arquitecturas distribuidas o en hardware dedicado, que pretenden acercarse en lo posible a su carácter no determinista y masivamente paralelo, dentro de un contexto de viabilidad y eficiencia. En esta tesis doctoral se propone la realización de un análisis estático del P sistema, como vía para optimizar la ejecución del mismo en estas plataformas. Se pretende que la información recogida en tiempo de análisis sirva para configurar adecuadamente la plataforma donde se vaya a ejecutar posteriormente el P sistema, obteniendo como consecuencia una mejora en el rendimiento. Concretamente, en esta tesis se han tomado como referencia los P sistemas de transiciones para llevar a cabo el estudio de dicho análisis estático. De manera un poco más específica, el análisis estático propuesto en esta tesis persigue que cada membrana sea capaz de determinar sus reglas activas de forma eficiente en cada paso de evolución, es decir, aquellas reglas que reúnen las condiciones adecuadas para poder ser aplicadas. En esta línea, se afronta el problema de los estados de utilidad de una membrana dada, que en tiempo de ejecución permitirán a la misma conocer en todo momento las membranas con las que puede comunicarse, cuestión que determina las reglas que pueden aplicarse en cada momento. Además, el análisis estático propuesto en esta tesis se basa en otra serie de características del P sistema como la estructura de membranas, antecedentes de las reglas, consecuentes de las reglas o prioridades. Una vez obtenida toda esta información en tiempo de análisis, se estructura en forma de árbol de decisión, con objeto de que en tiempo de ejecución la membrana obtenga las reglas activas de la forma más eficiente posible. Por otra parte, en esta tesis se lleva a cabo un recorrido por un número importante de arquitecturas hardware y software que diferentes autores han propuesto para implementar P sistemas. Fundamentalmente, arquitecturas distribuidas, hardware dedicado basado en tarjetas FPGA y plataformas basadas en microcontroladores PIC. El objetivo es proponer soluciones que permitan implantar en dichas arquitecturas los resultados obtenidos del análisis estático (estados de utilidad y árboles de decisión para reglas activas). En líneas generales, se obtienen conclusiones positivas, en el sentido de que dichas optimizaciones se integran adecuadamente en las arquitecturas sin penalizaciones significativas. Summary Membrane computing is the focus of this doctoral thesis. It can be considered a bio-inspired computing type. Specifically, it is based on living cells, in which many reactions take place simultaneously. From cell structure and operation, many different formal models have been defined, named P systems. These models do not try to model the biological behavior of the cell, but they abstract the basic principles of the cell in order to find out new computational paradigms. P systems are non-deterministic and massively parallel computational models. This is why, they have aroused interest when dealing with complex problems nowadays. In many cases, they manage to solve in theory NP problems in polynomial or lineal time. On the other hand, it is important to note that membrane computing has been successfully applied in many researching areas, specially related to biology. Nowadays, lots of these computing models have been sufficiently characterized from a theoretical point of view. However, the way in which they can be implemented is a research challenge, that it is still open nowadays. There are some lines in this way, based on distributed architectures or dedicated hardware. All of them are trying to approach to its non-deterministic and parallel character as much as possible, taking into account viability and efficiency. In this doctoral thesis it is proposed carrying out a static analysis of the P system in order to optimize its performance in a computing platform. The general idea is that after data are collected in analysis time, they are used for getting a suitable configuration of the computing platform in which P system is going to be performed. As a consequence, the system throughput will improve. Specifically, this thesis has made use of Transition P systems for carrying out the study in static analysis. In particular, the static analysis proposed in this doctoral thesis tries to achieve that every membrane can efficiently determine its active rules in every evolution step. These rules are the ones that can be applied depending on the system configuration at each computational step. In this line, we are going to tackle the problem of the usefulness states for a membrane. This state will allow this membrane to know the set of membranes with which communication is possible at any time. This is a very important issue in determining the set of rules that can be applied. Moreover, static analysis in this thesis is carried out taking into account other properties such as membrane structure, rule antecedents, rule consequents and priorities among rules. After collecting all data in analysis time, they are arranged in a decision tree structure, enabling membranes to obtain the set of active rules as efficiently as possible in run-time system. On the other hand, in this doctoral thesis is going to carry out an overview of hardware and software architectures, proposed by different authors in order to implement P systems, such as distributed architectures, dedicated hardware based on PFGA, and computing platforms based on PIC microcontrollers. The aim of this overview is to propose solutions for implementing the results of the static analysis, that is, usefulness states and decision trees for active rules. In general, conclusions are satisfactory, because these optimizations can be properly integrated in most of the architectures without significant penalties.
Resumo:
Goal-level Independent and-parallelism (IAP) is exploited by scheduling for simultaneous execution two or more goals which will not interfere with each other at run time. This can be done safely even if such goals can produce multiple answers. The most successful IAP implementations to date have used recomputation of answers and sequentially ordered backtracking. While in principle simplifying the implementation, recomputation can be very inefficient if the granularity of the parallel goals is large enough and they produce several answers, while sequentially ordered backtracking limits parallelism. And, despite the expected simplification, the implementation of the classic schemes has proved to involve complex engineering, with the consequent difficulty for system maintenance and expansion, and still frequently run into the well-known trapped goal and garbage slot problems. This work presents ideas about an alternative parallel backtracking model for IAP and a simulation studio. The model features parallel out-of-order backtracking and relies on answer memoization to reuse and combine answers. Whenever a parallel goal backtracks, its siblings also perform backtracking, but after storing the bindings generated by previous answers. The bindings are then reinstalled when combining answers. In order not to unnecessarily penalize forward execution, non-speculative and-parallel goals which have not been executed yet take precedence over sibling goals which could be backtracked over. Using a simulator, we show that this approach can bring significant performance advantages over classical approaches.
Resumo:
La temperatura es una preocupación que juega un papel protagonista en el diseño de circuitos integrados modernos. El importante aumento de las densidades de potencia que conllevan las últimas generaciones tecnológicas ha producido la aparición de gradientes térmicos y puntos calientes durante el funcionamiento normal de los chips. La temperatura tiene un impacto negativo en varios parámetros del circuito integrado como el retardo de las puertas, los gastos de disipación de calor, la fiabilidad, el consumo de energía, etc. Con el fin de luchar contra estos efectos nocivos, la técnicas de gestión dinámica de la temperatura (DTM) adaptan el comportamiento del chip en función en la información que proporciona un sistema de monitorización que mide en tiempo de ejecución la información térmica de la superficie del dado. El campo de la monitorización de la temperatura en el chip ha llamado la atención de la comunidad científica en los últimos años y es el objeto de estudio de esta tesis. Esta tesis aborda la temática de control de la temperatura en el chip desde diferentes perspectivas y niveles, ofreciendo soluciones a algunos de los temas más importantes. Los niveles físico y circuital se cubren con el diseño y la caracterización de dos nuevos sensores de temperatura especialmente diseñados para los propósitos de las técnicas DTM. El primer sensor está basado en un mecanismo que obtiene un pulso de anchura variable dependiente de la relación de las corrientes de fuga con la temperatura. De manera resumida, se carga un nodo del circuito y posteriormente se deja flotando de tal manera que se descarga a través de las corrientes de fugas de un transistor; el tiempo de descarga del nodo es la anchura del pulso. Dado que la anchura del pulso muestra una dependencia exponencial con la temperatura, la conversión a una palabra digital se realiza por medio de un contador logarítmico que realiza tanto la conversión tiempo a digital como la linealización de la salida. La estructura resultante de esta combinación de elementos se implementa en una tecnología de 0,35 _m. El sensor ocupa un área muy reducida, 10.250 nm2, y consume muy poca energía, 1.05-65.5nW a 5 muestras/s, estas cifras superaron todos los trabajos previos en el momento en que se publicó por primera vez y en el momento de la publicación de esta tesis, superan a todas las implementaciones anteriores fabricadas en el mismo nodo tecnológico. En cuanto a la precisión, el sensor ofrece una buena linealidad, incluso sin calibrar; se obtiene un error 3_ de 1,97oC, adecuado para tratar con las aplicaciones de DTM. Como se ha explicado, el sensor es completamente compatible con los procesos de fabricación CMOS, este hecho, junto con sus valores reducidos de área y consumo, lo hacen especialmente adecuado para la integración en un sistema de monitorización de DTM con un conjunto de monitores empotrados distribuidos a través del chip. Las crecientes incertidumbres de proceso asociadas a los últimos nodos tecnológicos comprometen las características de linealidad de nuestra primera propuesta de sensor. Con el objetivo de superar estos problemas, proponemos una nueva técnica para obtener la temperatura. La nueva técnica también está basada en las dependencias térmicas de las corrientes de fuga que se utilizan para descargar un nodo flotante. La novedad es que ahora la medida viene dada por el cociente de dos medidas diferentes, en una de las cuales se altera una característica del transistor de descarga |la tensión de puerta. Este cociente resulta ser muy robusto frente a variaciones de proceso y, además, la linealidad obtenida cumple ampliamente los requisitos impuestos por las políticas DTM |error 3_ de 1,17oC considerando variaciones del proceso y calibrando en dos puntos. La implementación de la parte sensora de esta nueva técnica implica varias consideraciones de diseño, tales como la generación de una referencia de tensión independiente de variaciones de proceso, que se analizan en profundidad en la tesis. Para la conversión tiempo-a-digital, se emplea la misma estructura de digitalización que en el primer sensor. Para la implementación física de la parte de digitalización, se ha construido una biblioteca de células estándar completamente nueva orientada a la reducción de área y consumo. El sensor resultante de la unión de todos los bloques se caracteriza por una energía por muestra ultra baja (48-640 pJ) y un área diminuta de 0,0016 mm2, esta cifra mejora todos los trabajos previos. Para probar esta afirmación, se realiza una comparación exhaustiva con más de 40 propuestas de sensores en la literatura científica. Subiendo el nivel de abstracción al sistema, la tercera contribución se centra en el modelado de un sistema de monitorización que consiste de un conjunto de sensores distribuidos por la superficie del chip. Todos los trabajos anteriores de la literatura tienen como objetivo maximizar la precisión del sistema con el mínimo número de monitores. Como novedad, en nuestra propuesta se introducen nuevos parámetros de calidad aparte del número de sensores, también se considera el consumo de energía, la frecuencia de muestreo, los costes de interconexión y la posibilidad de elegir diferentes tipos de monitores. El modelo se introduce en un algoritmo de recocido simulado que recibe la información térmica de un sistema, sus propiedades físicas, limitaciones de área, potencia e interconexión y una colección de tipos de monitor; el algoritmo proporciona el tipo seleccionado de monitor, el número de monitores, su posición y la velocidad de muestreo _optima. Para probar la validez del algoritmo, se presentan varios casos de estudio para el procesador Alpha 21364 considerando distintas restricciones. En comparación con otros trabajos previos en la literatura, el modelo que aquí se presenta es el más completo. Finalmente, la última contribución se dirige al nivel de red, partiendo de un conjunto de monitores de temperatura de posiciones conocidas, nos concentramos en resolver el problema de la conexión de los sensores de una forma eficiente en área y consumo. Nuestra primera propuesta en este campo es la introducción de un nuevo nivel en la jerarquía de interconexión, el nivel de trillado (o threshing en inglés), entre los monitores y los buses tradicionales de periféricos. En este nuevo nivel se aplica selectividad de datos para reducir la cantidad de información que se envía al controlador central. La idea detrás de este nuevo nivel es que en este tipo de redes la mayoría de los datos es inútil, porque desde el punto de vista del controlador sólo una pequeña cantidad de datos |normalmente sólo los valores extremos| es de interés. Para cubrir el nuevo nivel, proponemos una red de monitorización mono-conexión que se basa en un esquema de señalización en el dominio de tiempo. Este esquema reduce significativamente tanto la actividad de conmutación sobre la conexión como el consumo de energía de la red. Otra ventaja de este esquema es que los datos de los monitores llegan directamente ordenados al controlador. Si este tipo de señalización se aplica a sensores que realizan conversión tiempo-a-digital, se puede obtener compartición de recursos de digitalización tanto en tiempo como en espacio, lo que supone un importante ahorro de área y consumo. Finalmente, se presentan dos prototipos de sistemas de monitorización completos que de manera significativa superan la características de trabajos anteriores en términos de área y, especialmente, consumo de energía. Abstract Temperature is a first class design concern in modern integrated circuits. The important increase in power densities associated to recent technology evolutions has lead to the apparition of thermal gradients and hot spots during run time operation. Temperature impacts several circuit parameters such as speed, cooling budgets, reliability, power consumption, etc. In order to fight against these negative effects, dynamic thermal management (DTM) techniques adapt the behavior of the chip relying on the information of a monitoring system that provides run-time thermal information of the die surface. The field of on-chip temperature monitoring has drawn the attention of the scientific community in the recent years and is the object of study of this thesis. This thesis approaches the matter of on-chip temperature monitoring from different perspectives and levels, providing solutions to some of the most important issues. The physical and circuital levels are covered with the design and characterization of two novel temperature sensors specially tailored for DTM purposes. The first sensor is based upon a mechanism that obtains a pulse with a varying width based on the variations of the leakage currents on the temperature. In a nutshell, a circuit node is charged and subsequently left floating so that it discharges away through the subthreshold currents of a transistor; the time the node takes to discharge is the width of the pulse. Since the width of the pulse displays an exponential dependence on the temperature, the conversion into a digital word is realized by means of a logarithmic counter that performs both the timeto- digital conversion and the linearization of the output. The structure resulting from this combination of elements is implemented in a 0.35_m technology and is characterized by very reduced area, 10250 nm2, and power consumption, 1.05-65.5 nW at 5 samples/s, these figures outperformed all previous works by the time it was first published and still, by the time of the publication of this thesis, they outnumber all previous implementations in the same technology node. Concerning the accuracy, the sensor exhibits good linearity, even without calibration it displays a 3_ error of 1.97oC, appropriate to deal with DTM applications. As explained, the sensor is completely compatible with standard CMOS processes, this fact, along with its tiny area and power overhead, makes it specially suitable for the integration in a DTM monitoring system with a collection of on-chip monitors distributed across the chip. The exacerbated process fluctuations carried along with recent technology nodes jeop-ardize the linearity characteristics of the first sensor. In order to overcome these problems, a new temperature inferring technique is proposed. In this case, we also rely on the thermal dependencies of leakage currents that are used to discharge a floating node, but now, the result comes from the ratio of two different measures, in one of which we alter a characteristic of the discharging transistor |the gate voltage. This ratio proves to be very robust against process variations and displays a more than suficient linearity on the temperature |1.17oC 3_ error considering process variations and performing two-point calibration. The implementation of the sensing part based on this new technique implies several issues, such as the generation of process variations independent voltage reference, that are analyzed in depth in the thesis. In order to perform the time-to-digital conversion, we employ the same digitization structure the former sensor used. A completely new standard cell library targeting low area and power overhead is built from scratch to implement the digitization part. Putting all the pieces together, we achieve a complete sensor system that is characterized by ultra low energy per conversion of 48-640pJ and area of 0.0016mm2, this figure outperforms all previous works. To prove this statement, we perform a thorough comparison with over 40 works from the scientific literature. Moving up to the system level, the third contribution is centered on the modeling of a monitoring system consisting of set of thermal sensors distributed across the chip. All previous works from the literature target maximizing the accuracy of the system with the minimum number of monitors. In contrast, we introduce new metrics of quality apart form just the number of sensors; we consider the power consumption, the sampling frequency, the possibility to consider different types of monitors and the interconnection costs. The model is introduced in a simulated annealing algorithm that receives the thermal information of a system, its physical properties, area, power and interconnection constraints and a collection of monitor types; the algorithm yields the selected type of monitor, the number of monitors, their position and the optimum sampling rate. We test the algorithm with the Alpha 21364 processor under several constraint configurations to prove its validity. When compared to other previous works in the literature, the modeling presented here is the most complete. Finally, the last contribution targets the networking level, given an allocated set of temperature monitors, we focused on solving the problem of connecting them in an efficient way from the area and power perspectives. Our first proposal in this area is the introduction of a new interconnection hierarchy level, the threshing level, in between the monitors and the traditional peripheral buses that applies data selectivity to reduce the amount of information that is sent to the central controller. The idea behind this new level is that in this kind of networks most data are useless because from the controller viewpoint just a small amount of data |normally extreme values| is of interest. To cover the new interconnection level, we propose a single-wire monitoring network based on a time-domain signaling scheme that significantly reduces both the switching activity over the wire and the power consumption of the network. This scheme codes the information in the time domain and allows a straightforward obtention of an ordered list of values from the maximum to the minimum. If the scheme is applied to monitors that employ TDC, digitization resource sharing is achieved, producing an important saving in area and power consumption. Two prototypes of complete monitoring systems are presented, they significantly overcome previous works in terms of area and, specially, power consumption.
Resumo:
We present in a tutorial fashion CiaoPP, the preprocessor of the Ciao multi-paradigm programming system, which implements a novel program development framework which uses abstract interpretation as a fundamental tool. The framework uses modular, incremental abstract interpretation to obtain information about the program. This information is used to validate programs, to detect bugs with respect to partial specifications written using assertions (in the program itself and/or in system libraries), to generate and simplify run-time tests, and to perform high-level program transformations such as multiple abstract specialization, parallelization, and resource usage control, all in a provably correct way. In the case of validation and debugging, the assertions can refer to a variety of program points such as procedure entry, procedure exit, points within procedures, or global computations. The system can reason with much richer information than, for example, traditional types. This includes data structure shape (including pointer sharing), bounds on data structure sizes, and other operational variable instantiation properties, as well as procedure-level properties such as determinacy, termination, non-failure, and bounds on resource consumption (time or space cost).
Resumo:
We present a framework for the application of abstract interpretation as an aid during program development, rather than in the more traditional application of program optimization. Program validation and detection of errors is first performed statically by comparing (partial) specifications written in terms of assertions against information obtained from static analysis of the program. The results of this process are expressed in the user assertion language. Assertions (or parts of assertions) which cannot be verified statically are translated into run-time tests. The framework allows the use of assertions to be optional. It also allows using very general properties in assertions, beyond the predefined set understandable by the static analyzer and including properties defined by means of user programs. We also report briefly on an implementation of the framework. The resulting tool generates and checks assertions for Prolog, CLP(R), and CHIP/CLP(fd) programs, and integrates compile-time and run-time checking in a uniform way. The tool allows using properties such as types, modes, non-failure, determinacy, and computational cost, and can treat modules separately, performing incremental analysis. In practice, this modularity allows detecting statically bugs in user programs even if they do not contain any assertions.
Resumo:
We present and evaluate a compiler from Prolog (and extensions) to JavaScript which makes it possible to use (constraint) logic programming to develop the client side of web applications while being compliant with current industry standards. Targeting JavaScript makes (C)LP programs executable in virtually every modern computing device with no additional software requirements from the point of view of the user. In turn, the use of a very high-level language facilitates the development of high-quality, complex software. The compiler is a back end of the Ciao system and supports most of its features, including its module system and its rich language extension mechanism based on packages. We present an overview of the compilation process and a detailed description of the run-time system, including the support for modular compilation into separate JavaScript code. We demonstrate the maturity of the compiler by testing it with complex code such as a CLP(FD) library written in Prolog with attributed variables. Finally, we validate our proposal by measuring the performance of some LP and CLP(FD) benchmarks running on top of major JavaScript engines.
Resumo:
We consider the problem of supporting goal-level, independent andparallelism (IAP) in the presence of non-determinism. IAP is exploited when two or more goals which will not interfere at run time are scheduled for simultaneous execution. Backtracking over non-deterministic parallel goals runs into the wellknown trapped goal and garbage slot problems. The proposed solutions for these problems generally require complex low-level machinery which makes systems difficult to maintain and extend, and in some cases can even affect sequential execution performance. In this paper we propose a novel solution to the problem of trapped nondeterministic goals and garbage slots which is based on a single stack reordering operation and offers several advantages over previous proposals. While the implementation of this operation itself is not simple, in return it does not impose constraints on the scheduler. As a result, the scheduler and the rest of the run-time machinery can safely ignore the trapped goal and garbage slot problems and their implementation is greatly simplified. Also, standard sequential execution remains unaffected. In addition to describing the solution we report on an implementation and provide performance results. We also suggest other possible applications of the proposed approach beyond parallel execution.
Resumo:
Visualisation of program executions has been used in applications which include education and debugging. However, traditional visualisation techniques often fall short of expectations or are altogether inadequate for new programming paradigms, such as Constraint Logic Programming (CLP), whose declarative and operational semantics differ in some crucial ways from those of other paradigms. In particular, traditional ideas regarding the behaviour of data often cannot be lifted in a straightforward way to (C)LP from other families of programming languages. In this chapter we discuss techniques for visualising data evolution in CLP. We briefly review some previously proposed visualisation paradigms, and also propose a number of (to our knowledge) novel ones. The graphical representations have been chosen based on the perceived needs of a programmer trying to analyse the behaviour and characteristics of an execution. In particular, we concentrate on the representation of the run-time values of the variables, and the constraints among them. Given our interest in visualising large executions, we also pay attention to abstraction techniques, i.e., techniques which are intended to help in reducing the complexity of the visual information.
Resumo:
In this report we discuss some of the issues involved in the specialization and optimization of constraint logic programs with dynamic scheduling. Dynamic scheduling, as any other form of concurrency, increases the expressive power of constraint logic programs, but also introduces run-time overhead. The objective of the specialization and optimization is to reduce as much as possible such overhead automatically, while preserving the semantics of the original programs. This is done by program transformation based on global analysis. We present implementation techniques for this purpose and report on experimental results obtained from an implementation of the techniques in the context of the CIAO compiler.