80 resultados para run-time profiling
Resumo:
We discuss experiences gained by porting a Software Validation Facility (SVF) and a satellite Central Software (CSW) to a platform with support for Time and Space Partitioning (TSP). The SVF and CSW are part of the EagleEye Reference mission of the European Space Agency (ESA). As a reference mission, EagleEye is a perfect candidate to evaluate practical aspects of developing satellite CSW for and on TSP platforms. The specific TSP platform we used consists of a simulate D LEON3 CPU controlled by the XtratuM separation micro-kernel. On top of this, we run five separate partitions. Each partition ru n s its own real-time operating system or Ada run-time kernel, which in turn are running the application software of the CSW. We describe issues related to partitioning; inter-partition communication; scheduling; I/O; and fault-detection, isolation, and recovery (FDIR)
Resumo:
This thesis contributes to the analysis and design of printed reflectarray antennas. The main part of the work is focused on the analysis of dual offset antennas comprising two reflectarray surfaces, one of them acts as sub-reflector and the second one acts as mainreflector. These configurations introduce additional complexity in several aspects respect to conventional dual offset reflectors, however they present a lot of degrees of freedom that can be used to improve the electrical performance of the antenna. The thesis is organized in four parts: the development of an analysis technique for dualreflectarray antennas, a preliminary validation of such methodology using equivalent reflector systems as reference antennas, a more rigorous validation of the software tool by manufacturing and testing a dual-reflectarray antenna demonstrator and the practical design of dual-reflectarray systems for some applications that show the potential of these kind of configurations to scan the beam and to generate contoured beams. In the first part, a general tool has been implemented to analyze high gain antennas which are constructed of two flat reflectarray structures. The classic reflectarray analysis based on MoM under local periodicity assumption is used for both sub and main reflectarrays, taking into account the incident angle on each reflectarray element. The incident field on the main reflectarray is computed taking into account the field radiated by all the elements on the sub-reflectarray.. Two approaches have been developed, one which employs a simple approximation to reduce the computer run time, and the other which does not, but offers in many cases, improved accuracy. The approximation is based on computing the reflected field on each element on the main reflectarray only once for all the fields radiated by the sub-reflectarray elements, assuming that the response will be the same because the only difference is a small variation on the angle of incidence. This approximation is very accurate when the reflectarray elements on the main reflectarray show a relatively small sensitivity to the angle of incidence. An extension of the analysis technique has been implemented to study dual-reflectarray antennas comprising a main reflectarray printed on a parabolic surface, or in general in a curved surface. In many applications of dual-reflectarray configurations, the reflectarray elements are in the near field of the feed-horn. To consider the near field radiated by the horn, the incident field on each reflectarray element is computed using a spherical mode expansion. In this region, the angles of incidence are moderately wide, and they are considered in the analysis of the reflectarray to better calculate the actual incident field on the sub-reflectarray elements. This technique increases the accuracy for the prediction of co- and cross-polar patterns and antenna gain respect to the case of using ideal feed models. In the second part, as a preliminary validation, the proposed analysis method has been used to design a dual-reflectarray antenna that emulates previous dual-reflector antennas in Ku and W-bands including a reflectarray as subreflector. The results for the dualreflectarray antenna compare very well with those of the parabolic reflector and reflectarray subreflector; radiation patterns, antenna gain and efficiency are practically the same when the main parabolic reflector is substituted by a flat reflectarray. The results show that the gain is only reduced by a few tenths of a dB as a result of the ohmic losses in the reflectarray. The phase adjustment on two surfaces provided by the dual-reflectarray configuration can be used to improve the antenna performance in some applications requiring multiple beams, beam scanning or shaped beams. Third, a very challenging dual-reflectarray antenna demonstrator has been designed, manufactured and tested for a more rigorous validation of the analysis technique presented. The proposed antenna configuration has the feed, the sub-reflectarray and the main-reflectarray in the near field one to each other, so that the conventional far field approximations are not suitable for the analysis of such antenna. This geometry is used as benchmarking for the proposed analysis tool in very stringent conditions. Some aspects of the proposed analysis technique that allow improving the accuracy of the analysis are also discussed. These improvements include a novel method to reduce the inherent cross polarization which is introduced mainly from grounded patch arrays. It has been checked that cross polarization in offset reflectarrays can be significantly reduced by properly adjusting the patch dimensions in the reflectarray in order to produce an overall cancellation of the cross-polarization. The dimensions of the patches are adjusted in order not only to provide the required phase-distribution to shape the beam, but also to exploit the crosses by zero of the cross-polarization components. The last part of the thesis deals with direct applications of the technique described. The technique presented is directly applicable to the design of contoured beam antennas for DBS applications, where the requirements of cross-polarisation are very stringent. The beam shaping is achieved by synthesithing the phase distribution on the main reflectarray while the sub-reflectarray emulates an equivalent hyperbolic subreflector. Dual-reflectarray antennas present also the ability to scan the beam over small angles about boresight. Two possible architectures for a Ku-band antenna are also described based on a dual planar reflectarray configuration that provides electronic beam scanning in a limited angular range. In the first architecture, the beam scanning is achieved by introducing a phase-control in the elements of the sub-reflectarray and the mainreflectarray is passive. A second alternative is also studied, in which the beam scanning is produced using 1-bit control on the main reflectarray, while a passive subreflectarray is designed to provide a large focal distance within a compact configuration. The system aims to develop a solution for bi-directional satellite links for emergency communications. In both proposed architectures, the objective is to provide a compact optics and simplicity to be folded and deployed.
Resumo:
The technique of Abstract Interpretation has allowed the development of very sophisticated global program analyses which are at the same time provably correct and practical. We present in a tutorial fashion a novel program development framework which uses abstract interpretation as a fundamental tool. The framework uses modular, incremental abstract interpretation to obtain information about the program. This information is used to validate programs, to detect bugs with respect to partial specifications written using assertions (in the program itself and/or in system libraries), to generate and simplify run-time tests, and to perform high-level program transformations such as multiple abstract specialization, parallelization, and resource usage control, all in a provably correct way. In the case of validation and debugging, the assertions can refer to a variety of program points such as procedure entry, procedure exit, points within procedures, or global computations. The system can reason with much richer information than, for example, traditional types. This includes data structure shape (including pointer sharing), bounds on data structure sizes, and other operational variable instantiation properties, as well as procedure-level properties such as determinacy, termination, nonfailure, and bounds on resource consumption (time or space cost). CiaoPP, the preprocessor of the Ciao multi-paradigm programming system, which implements the described functionality, will be used to illustrate the fundamental ideas.
Resumo:
Since the early days of logic programming, researchers in the field realized the potential for exploitation of parallelism present in the execution of logic programs. Their high-level nature, the presence of nondeterminism, and their referential transparency, among other characteristics, make logic programs interesting candidates for obtaining speedups through parallel execution. At the same time, the fact that the typical applications of logic programming frequently involve irregular computations, make heavy use of dynamic data structures with logical variables, and involve search and speculation, makes the techniques used in the corresponding parallelizing compilers and run-time systems potentially interesting even outside the field. The objective of this article is to provide a comprehensive survey of the issues arising in parallel execution of logic programming languages along with the most relevant approaches explored to date in the field. Focus is mostly given to the challenges emerging from the parallel execution of Prolog programs. The article describes the major techniques used for shared memory implementation of Or-parallelism, And-parallelism, and combinations of the two. We also explore some related issues, such as memory management, compile-time analysis, and execution visualization.
Resumo:
Modern FPGAs with run-time reconfiguration allow the implementation of complex systems offering both the flexibility of software-based solutions combined with the performance of hardware. This combination of characteristics, together with the development of new specific methodologies, make feasible to reach new points of the system design space, and make embedded systems built on these platforms acquire more and more importance. However, the practical exploitation of this technique in fields that traditionally have relied on resource restricted embedded systems, is mainly limited by strict power consumption requirements, the cost and the high dependence of DPR techniques with the specific features of the device technology underneath. In this work, we tackle the previously reported problems, designing a reconfigurable platform based on the low-cost and low-power consuming Spartan-6 FPGA family. The full process to develop the platform will be detailed in the paper from scratch. In addition, the implementation of the reconfiguration mechanism, including two profiles, is reported. The first profile is a low-area and low-speed reconfiguration engine based mainly on software functions running on the embedded processor, while the other one is a hardware version of the same engine, implemented in the FPGA logic. This reconfiguration hardware block has been originally designed to the Virtex-5 family, and its porting process will be also described in this work, facing the interoperability problem among different families.
Resumo:
Several types of parallelism can be exploited in logic programs while preserving correctness and efficiency, i.e. ensuring that the parallel execution obtains the same results as the sequential one and the amount of work performed is not greater. However, such results do not take into account a number of overheads which appear in practice, such as process creation and scheduling, which can induce a slow-down, or, at least, limit speedup, if they are not controlled in some way. This paper describes a methodology whereby the granularity of parallel tasks, i.e. the work available under them, is efficiently estimated and used to limit parallelism so that the effect of such overheads is controlled. The run-time overhead associated with the approach is usually quite small, since as much work is done at compile time as possible. Also,a number of run-time optimizations are proposed. Moreover, a static analysis of the overhead associated with the granularity control process is performed in order to decide its convenience. The performance improvements resulting from the incorporation of grain size control are shown to be quite good, specially for systems with medium to large parallel execution overheads.
Resumo:
Studying independence of goals has proven very useful in the context of logic programming. In particular, it has provided a formal basis for powerful automatic parallelization tools, since independence ensures that two goals may be evaluated in parallel while preserving correctness and eciency. We extend the concept of independence to constraint logic programs (CLP) and prove that it also ensures the correctness and eciency of the parallel evaluation of independent goals. Independence for CLP languages is more complex than for logic programming as search space preservation is necessary but no longer sucient for ensuring correctness and eciency. Two additional issues arise. The rst is that the cost of constraint solving may depend upon the order constraints are encountered. The second is the need to handle dynamic scheduling. We clarify these issues by proposing various types of search independence and constraint solver independence, and show how they can be combined to allow dierent optimizations, from parallelism to intelligent backtracking. Sucient conditions for independence which can be evaluated \a priori" at run-time are also proposed. Our study also yields new insights into independence in logic programming languages. In particular, we show that search space preservation is not only a sucient but also a necessary condition for ensuring correctness and eciency of parallel execution.
Resumo:
The properties of data and activities in business processes can be used to greatly facilítate several relevant tasks performed at design- and run-time, such as fragmentation, compliance checking, or top-down design. Business processes are often described using workflows. We present an approach for mechanically inferring business domain-specific attributes of workflow components (including data Ítems, activities, and elements of sub-workflows), taking as starting point known attributes of workflow inputs and the structure of the workflow. We achieve this by modeling these components as concepts and applying sharing analysis to a Horn clause-based representation of the workflow. The analysis is applicable to workflows featuring complex control and data dependencies, embedded control constructs, such as loops and branches, and embedded component services.
Resumo:
Program specialization optimizes programs for known valúes of the input. It is often the case that the set of possible input valúes is unknown, or this set is infinite. However, a form of specialization can still be performed in such cases by means of abstract interpretation, specialization then being with respect to abstract valúes (substitutions), rather than concrete ones. We study the múltiple specialization of logic programs based on abstract interpretation. This involves in principie, and based on information from global analysis, generating several versions of a program predicate for different uses of such predicate, optimizing these versions, and, finally, producing a new, "multiply specialized" program. While múltiple specialization has received theoretical attention, little previous evidence exists on its practicality. In this paper we report on the incorporation of múltiple specialization in a parallelizing compiler and quantify its effects. A novel approach to the design and implementation of the specialization system is proposed. The resulting implementation techniques result in identical specializations to those of the best previously proposed techniques but require little or no modification of some existing abstract interpreters. Our results show that, using the proposed techniques, the resulting "abstract múltiple specialization" is indeed a relevant technique in practice. In particular, in the parallelizing compiler application, a good number of run-time tests are eliminated and invariants extracted automatically from loops, resulting generally in lower overheads and in several cases in increased speedups.
Resumo:
This paper addresses the issue of the practicality of global flow analysis in logic program compilation, in terms of speed of the analysis, precisión, and usefulness of the information obtained. To this end, design and implementation aspects are discussed for two practical abstract interpretation-based flow analysis systems: MA , the MCC And-parallel Analyzer and Annotator; and Ms, an experimental mode inference system developed for SB-Prolog. The paper also provides performance data obtained (rom these implementations and, as an example of an application, a study of the usefulness of the mode information obtained in reducing run-time checks in independent and-parallelism.Based on the results obtained, it is concluded that the overhead of global flow analysis is not prohibitive, while the results of analysis can be quite precise and useful.
Resumo:
We propose a general framework for assertion-based debugging of constraint logic programs. Assertions are linguistic constructions for expressing properties of programs. We define several assertion schemas for writing (partial) specifications for constraint logic programs using quite general properties, including user-defined programs. The framework is aimed at detecting deviations of the program behavior (symptoms) with respect to the given assertions, either at compile-time (i.e., statically) or run-time (i.e., dynamically). We provide techniques for using information from global analysis both to detect at compile-time assertions which do not hold in at least one of the possible executions (i.e., static symptoms) and assertions which hold for all possible executions (i.e., statically proved assertions). We also provide program transformations which introduce tests in the program for checking at run-time those assertions whose status cannot be determined at compile-time. Both the static and the dynamic checking are provably safe in the sense that all errors flagged are definite violations of the pecifications. Finally, we report briefly on the currently implemented instances of the generic framework.
Resumo:
We propose a general framework for assertion-based debugging of constraint logic programs. Assertions are linguistic constructions which allow expressing properties of programs. We define assertion schemas which allow writing (partial) specifications for constraint logic programs using quite general properties, including user-defined programs. The framework is aimed at detecting deviations of the program behavior (symptoms) with respect to the given assertions, either at compile-time or run-time. We provide techniques for using information from global analysis both to detect at compile-time assertions which do not hold in at least one of the possible executions (i.e., static symptoms) and assertions which hold for all possible executions (i.e., statically proved assertions). We also provide program transformations which introduce tests in the program for checking at run-time those assertions whose status cannot be determined at compile-time. Both the static and the dynamic checking are provably safe in the sense that all errors flagged are definite violations of the specifications. Finally, we report on an implemented instance of the assertion language and framework.
Resumo:
We present a technique to estimate accurate speedups for parallel logic programs with relative independence from characteristics of a given implementation or underlying parallel hardware. The proposed technique is based on gathering accurate data describing one execution at run-time, which is fed to a simulator. Alternative schedulings are then simulated and estimates computed for the corresponding speedups. A tool implementing the aforementioned techniques is presented, and its predictions are compared to the performance of real systems, showing good correlation.
Resumo:
Knowing the size of the terms to which program variables are bound at run-time in logic programs is required in a class of optimizations which includes granularity control and recursion elimination. Such size is difficult to even approximate at compile time and is thus generally computed at run-time by using (possibly predeñned) predicates which traverse the terms involved. We propose a technique which has the potential of performing this computation much more efficiently. The technique is based on ñnding program procedures which are called before those in which knowledge regarding term sizes is needed and which traverse the terms whose size is to be determined, and transforming such procedures so that they compute term sizes "on the fly". We present a systematic way of determining whether a given program can be transformed in order to compute a given term size at a given program point without additional term traversal. Also, if several such transformations are possible our approach allows ñnding minimal transformations under certain criteria. We also discuss the advantages and applications of our technique (specifically in the task of granularity control) and present some performance results.
Resumo:
Several types of parallelism can be exploited in logic programs while preserving correctness and efficiency, i.e. ensuring that the parallel execution obtains the same results as the sequential one and the amount of work performed is not greater. However, such results do not take into account a number of overheads which appear in practice, such as process creation and scheduling, which can induce a slow-down, or, at least, limit speedup, if they are not controlled in some way. This paper describes a methodology whereby the granularity of parallel tasks, i.e. the work available under them, is efficiently estimated and used to limit parallelism so that the effect of such overheads is controlled. The run-time overhead associated with the approach is usually quite small, since as much work is done at compile time as possible. Also, a number of run-time optimizations are proposed. Moreover, a static analysis of the overhead associated with the granularity control process is performed in order to decide its convenience. The performance improvements resulting from the incorporation of grain size control are shown to be quite good, specially for systems with médium to large parallel execution overheads.