5 results for runtime assertions
at Massachusetts Institute of Technology
Abstract:
Scheduling tasks to efficiently use the available processor resources is crucial to minimizing the runtime of applications on shared-memory parallel processors. One factor that contributes to poor processor utilization is the idle time caused by long-latency operations, such as remote memory references or processor synchronization operations. One way of tolerating this latency is to use a processor with multiple hardware contexts that can rapidly switch to executing another thread of computation whenever a long-latency operation occurs, thus increasing processor utilization by overlapping computation with communication. Although multiple contexts are effective for tolerating latency, this effectiveness can be limited by memory and network bandwidth, by cache interference effects among the multiple contexts, and by critical tasks sharing processor resources with less critical tasks. This thesis presents techniques that increase the effectiveness of multiple contexts by intelligently scheduling threads to make more efficient use of processor pipeline, bandwidth, and cache resources. This thesis proposes thread prioritization as a fundamental mechanism for directing the thread schedule on a multiple-context processor. A priority is assigned to each thread either statically or dynamically and is used by the thread scheduler to decide which threads to load into the contexts and which context to switch to on a context switch. We develop a multiple-context model that integrates both cache and network effects and shows how thread prioritization can both maintain high processor utilization and limit increases in critical path runtime caused by multithreading. The model also shows that, to be effective in bandwidth-limited applications, thread prioritization must be extended to prioritize memory requests.
We show how simple hardware can prioritize the running of threads in the multiple contexts and the issuing of requests to both the local memory and the network. Simulation experiments show how thread prioritization is used in a variety of applications. Thread prioritization can improve the performance of synchronization primitives by minimizing the number of processor cycles wasted in spinning and devoting more cycles to critical threads. Thread prioritization can be used in combination with other techniques to improve cache performance and minimize cache interference between different working sets in the cache. For applications that are critical-path limited, thread prioritization can improve performance by allowing processor resources to be devoted preferentially to critical threads. These experimental results show that thread prioritization is a mechanism that can be used to implement a wide range of scheduling policies.
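The two scheduling decisions named in this abstract (which threads to load into the contexts, and which context to switch to) can be illustrated with a minimal Python sketch. This is only a software illustration under assumed names: `load_contexts`, `next_context`, the dictionary fields, and the example threads are hypothetical and are not taken from the thesis, which describes a hardware mechanism.

```python
def load_contexts(threads, num_contexts):
    """Choose the num_contexts highest-priority threads to hold in hardware contexts."""
    return sorted(threads, key=lambda t: t["priority"], reverse=True)[:num_contexts]


def next_context(loaded, current):
    """On a context switch, pick the highest-priority ready thread other than
    the one that just stalled. Returns None if no other loaded thread is ready."""
    candidates = [t for t in loaded if t["ready"] and t is not current]
    return max(candidates, key=lambda t: t["priority"], default=None)


# Hypothetical workload: four threads competing for three contexts.
threads = [
    {"name": "critical", "priority": 3, "ready": True},
    {"name": "spinner", "priority": 1, "ready": True},
    {"name": "worker", "priority": 2, "ready": True},
    {"name": "background", "priority": 0, "ready": True},
]

loaded = load_contexts(threads, num_contexts=3)
loaded_names = [t["name"] for t in loaded]

# The running thread issues a long-latency operation and stalls.
current = loaded[0]
current["ready"] = False
switched_to = next_context(loaded, current)["name"]
```

In this sketch the lowest-priority thread stays unloaded, and the stall on the critical thread hands the pipeline to the next most important loaded thread rather than to the spinning one.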
Abstract:
The goal of the work reported here is to capture the commonsense knowledge of non-expert human contributors. Achieving this goal will enable more intelligent human-computer interfaces and pave the way for computers to reason about our world. In the domain of natural language processing, it will provide the world knowledge much needed for semantic processing of natural language. To acquire knowledge from contributors not trained in knowledge engineering, I take the following four steps: (i) develop a knowledge representation (KR) model for simple assertions in natural language, (ii) introduce cumulative analogy, a class of nearest-neighbor-based analogical reasoning algorithms over this representation, (iii) argue that cumulative analogy is well suited for knowledge acquisition (KA) based on a theoretical analysis of the effectiveness of KA with this approach, and (iv) test the KR model and the effectiveness of the cumulative analogy algorithms empirically. To investigate the effectiveness of cumulative analogy for KA empirically, Learner, an open-source system for KA by cumulative analogy, has been implemented, deployed, and evaluated. (The site, "1001 Questions," is available at http://teach-computers.org/learner.html.) Learner acquires assertion-level knowledge by constructing shallow semantic analogies between a KA topic and its nearest neighbors and posing these analogies as natural language questions to human contributors. Suppose, for example, that based on the knowledge about "newspapers" already present in the knowledge base, Learner judges "newspaper" to be similar to "book" and "magazine." Further suppose that the assertions "books contain information" and "magazines contain information" are also already in the knowledge base. Then Learner will use cumulative analogy from the similar topics to ask humans whether "newspapers contain information."
Because similarity between topics is computed based on what is already known about them, Learner exhibits bootstrapping behavior: the quality of its questions improves as it gathers more knowledge. By summing evidence for and against posing any given question, Learner also exhibits noise tolerance, limiting the effect of incorrect similarities. The KA power of shallow semantic analogy from nearest neighbors is one of the main findings of this thesis. I perform an analysis of commonsense knowledge collected by another research effort that did not rely on analogical reasoning and demonstrate that there is indeed a sufficient amount of correlation in the knowledge base to motivate using cumulative analogy from nearest neighbors as a KA method. Empirically, the percentages of questions answered affirmatively, answered negatively, and judged to be nonsensical in the cumulative analogy case compare favorably with those in the baseline no-similarity case, which relies on random objects rather than nearest neighbors. Of the questions generated by cumulative analogy, contributors answered 45% affirmatively, 28% negatively and marked 13% as nonsensical; in the control, no-similarity case 8% of questions were answered affirmatively, 60% negatively and 26% were marked as nonsensical.
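The newspaper example from this abstract can be sketched in a few lines of Python. This is a simplified illustration, not Learner's actual algorithm: similarity is assumed to be the number of shared assertions, only evidence in favor of a question is summed (the abstract also mentions evidence against), and the function names and the properties other than "contains information" are hypothetical.

```python
def similarity(kb, a, b):
    """Assumed similarity measure: number of assertions two topics share."""
    return len(kb[a] & kb[b])


def propose_questions(kb, topic, k=2):
    """Rank candidate assertions about `topic` by evidence summed over its
    k nearest neighbors, weighting each neighbor by its similarity."""
    neighbors = sorted(
        (t for t in kb if t != topic),
        key=lambda t: similarity(kb, topic, t),
        reverse=True,
    )[:k]
    scores = {}
    for neighbor in neighbors:
        weight = similarity(kb, topic, neighbor)
        if weight == 0:
            continue  # a dissimilar topic contributes no evidence
        for prop in kb[neighbor] - kb[topic]:
            scores[prop] = scores.get(prop, 0) + weight
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Toy knowledge base; the entries are illustrative, not data from the thesis.
kb = {
    "newspaper": {"is made of paper", "can be read"},
    "book": {"is made of paper", "can be read", "contains information"},
    "magazine": {"is made of paper", "can be read", "contains information"},
    "rock": {"is hard"},
}

questions = propose_questions(kb, "newspaper")
for prop, score in questions:
    print(f"Is it true that a newspaper {prop}? (evidence: {score})")
```

Because both near neighbors support the same candidate assertion, their weights add up, which is the "cumulative" part of the idea; an unrelated topic such as "rock" never enters the neighbor set.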
Abstract:
Dynamic optimization has several key advantages. These include the ability to work on binary code in the absence of sources and to perform optimization across module boundaries. However, it has a significant disadvantage vis-à-vis traditional static optimization: it incurs a significant runtime overhead. There can be a performance gain only if the overhead can be amortized. In this paper, we quantitatively analyze the runtime overhead introduced by a dynamic optimizer, DynamoRIO. We found that the major overhead does not come from the optimizer's operation. Instead, it comes from the extra code in the code cache added by DynamoRIO. After a detailed analysis, we propose a method of trace construction that ameliorates the overhead introduced by the dynamic optimizer, thereby reducing the runtime overhead of DynamoRIO. We believe that the results of the study, as well as the proposed solution, are applicable to other scenarios, such as dynamic code translation and managed execution, that utilize a framework similar to that of dynamic optimization.
Abstract:
It is widely known that a significant fraction of bits are useless or even unused during program execution. Bit-width analysis aims to find the minimum number of bits needed for each variable in a program while ensuring correct execution and saving resources. In this paper, we propose a static analysis method for bit-widths in general applications, which approximates conservatively at compile time and is independent of runtime conditions. While most related work focuses on integer applications, our method is also tailored to floating-point variables and could be extended, together with precision analysis, to transform floating-point numbers into fixed-point numbers. We use more precise representations for data value ranges of both scalar and array variables. Element-level analysis is carried out for arrays. We also suggest an alternative to the standard fixed-point iterations in bi-directional range analysis. These techniques are implemented on the Trimaran compiler structure and tested on a set of benchmarks to show the results.
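The link between value ranges and bit-widths can be shown with a small Python sketch. It assumes plain interval arithmetic over integers and a two's-complement encoding for ranges that include negative values; the function names and the example ranges are hypothetical, and the sketch covers only forward propagation, not the bi-directional analysis or the floating-point handling described in the abstract.

```python
def add_range(a, b):
    """Conservative range of x + y for x in a and y in b."""
    return (a[0] + b[0], a[1] + b[1])


def sub_range(a, b):
    """Conservative range of x - y for x in a and y in b."""
    return (a[0] - b[1], a[1] - b[0])


def mul_range(a, b):
    """Conservative range of x * y: the extremes occur at the endpoints."""
    products = [x * y for x in a for y in b]
    return (min(products), max(products))


def bits_needed(lo, hi):
    """Minimum bits that hold every integer in [lo, hi].
    Unsigned if lo >= 0, otherwise two's complement (assumed encoding)."""
    if lo > hi:
        raise ValueError("empty range")
    if lo >= 0:
        return max(1, hi.bit_length())
    # Need n with -2**(n-1) <= lo and hi <= 2**(n-1) - 1.
    return max((-lo - 1).bit_length(), max(hi, 0).bit_length()) + 1


# Hypothetical input ranges: x is an 8-bit value, y is a 4-bit value.
x = (0, 255)
y = (0, 15)

z = add_range(mul_range(x, y), x)   # z = x * y + x
d = sub_range(x, y)                 # d = x - y

z_bits = bits_needed(*z)
d_bits = bits_needed(*d)
```

Here `z` is bounded by [0, 4080] and `d` by [-15, 255], so a compiler using these ranges could allocate 12 and 9 bits respectively instead of a full machine word.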
Abstract:
Timing constraints for real-time systems are usually verified through the satisfiability of propositional formulae. In this paper, we propose an alternative in which timing constraints are verified by counting the number of satisfying truth assignments rather than by checking Boolean satisfiability. This number can also tell us how “far away” a given specification is from satisfying its safety assertion. Furthermore, specifications and safety assertions are often modified incrementally, with problematic bugs fixed one at a time. To support this style of development, we propose an incremental algorithm for counting satisfiability. Our proposed incremental algorithm is optimal in that no unnecessary nodes are created during each count. It works for the class of path RTL. To illustrate this application, we show how incremental satisfiability counting can be applied to a well-known railroad crossing example, particularly when its specification is still being refined.
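The idea of counting truth assignments instead of asking a yes/no satisfiability question can be sketched in Python. This sketch uses brute-force enumeration, not the incremental algorithm of the paper, and it works on a toy propositional encoding rather than on path RTL. Reading "distance" as the number of assignments that satisfy the specification but violate the safety assertion is an assumption; the variable names and the crossing rules are hypothetical.

```python
from itertools import product


def count_models(formula, variables):
    """Count the truth assignments over `variables` that satisfy `formula`.
    Brute force: enumerates all 2**len(variables) assignments."""
    return sum(
        1
        for values in product([False, True], repeat=len(variables))
        if formula(dict(zip(variables, values)))
    )


def implies(p, q):
    return (not p) or q


variables = ["approach", "in_crossing", "gate_down"]


def safety(v):
    """Safety assertion: the gate is down whenever a train is in the crossing."""
    return implies(v["in_crossing"], v["gate_down"])


def spec_v1(v):
    """First draft of the specification: the gate closes when a train approaches."""
    return implies(v["approach"], v["gate_down"])


def spec_v2(v):
    """Refined draft: additionally, a train in the crossing must have approached."""
    return spec_v1(v) and implies(v["in_crossing"], v["approach"])


# Assignments allowed by the specification that break the safety assertion.
violations_v1 = count_models(lambda v: spec_v1(v) and not safety(v), variables)
violations_v2 = count_models(lambda v: spec_v2(v) and not safety(v), variables)
```

A count of zero means the specification entails the safety assertion; a nonzero count, unlike a bare "satisfiable" answer, indicates how many cases remain to be fixed as the specification is refined.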