995 resultados para Runtime performance
Resumo:
Transactional memory (TM) is a new synchronization mechanism devised to simplify parallel programming, thereby helping programmers to unleash the power of current multicore processors. Although software implementations of TM (STM) have been extensively analyzed in terms of runtime performance, little attention has been paid to an equally important constraint faced by nearly all computer systems: energy consumption. In this work we conduct a comprehensive study of energy and runtime tradeoff sin software transactional memory systems. We characterize the behavior of three state-of-the-art lock-based STM algorithms, along with three different conflict resolution schemes. As a result of this characterization, we propose a DVFS-based technique that can be integrated into the resolution policies so as to improve the energy-delay product (EDP). Experimental results show that our DVFS-enhanced policies are indeed beneficial for applications with high contention levels. Improvements of up to 59% in EDP can be observed in this scenario, with an average EDP reduction of 16% across the STAMP workloads. © 2012 IEEE.
Resumo:
In recent years, progress in the area of mobile telecommunications has changed our way of life, in the private as well as the business domain. Mobile and wireless networks have ever increasing bit rates, mobile network operators provide more and more services, and at the same time costs for the usage of mobile services and bit rates are decreasing. However, mobile services today still lack functions that seamlessly integrate into users’ everyday life. That is, service attributes such as context-awareness and personalisation are often either proprietary, limited or not available at all. In order to overcome this deficiency, telecommunications companies are heavily engaged in the research and development of service platforms for networks beyond 3G for the provisioning of innovative mobile services. These service platforms are to support such service attributes. Service platforms are to provide basic service-independent functions such as billing, identity management, context management, user profile management, etc. Instead of developing own solutions, developers of end-user services such as innovative messaging services or location-based services can utilise the platform-side functions for their own purposes. In doing so, the platform-side support for such functions takes away complexity, development time and development costs from service developers. Context-awareness and personalisation are two of the most important aspects of service platforms in telecommunications environments. The combination of context-awareness and personalisation features can also be described as situation-dependent personalisation of services. The support for this feature requires several processing steps. The focus of this doctoral thesis is on the processing step, in which the user’s current context is matched against situation-dependent user preferences to find the matching user preferences for the current user’s situation. However, to achieve this, a user profile management system and corresponding functionality is required. These parts are also covered by this thesis. Altogether, this thesis provides the following contributions: The first part of the contribution is mainly architecture-oriented. First and foremost, we provide a user profile management system that addresses the specific requirements of service platforms in telecommunications environments. In particular, the user profile management system has to deal with situation-specific user preferences and with user information for various services. In order to structure the user information, we also propose a user profile structure and the corresponding user profile ontology as part of an ontology infrastructure in a service platform. The second part of the contribution is the selection mechanism for finding matching situation-dependent user preferences for the personalisation of services. This functionality is provided as a sub-module of the user profile management system. Contrary to existing solutions, our selection mechanism is based on ontology reasoning. This mechanism is evaluated in terms of runtime performance and in terms of supported functionality compared to other approaches. The results of the evaluation show the benefits and the drawbacks of ontology modelling and ontology reasoning in practical applications.
Resumo:
In order to facilitate the development of agent-based software, several agent programming languages and architectures, have been created. Plans in these architectures are often self-contained procedures with an associated triggering event and a context condition, while any further information about the consequences of executing a plan is absent. However, agents designed using such an approach have limited flexibility at runtime, and rely on the designer’s ability to foresee all relevant situations an agent might have to handle. In order to overcome this limitation, we have created AgentSpeak(PL), an interpreter capable of performing state-space planning to generate new high-level plans. As the planning module creates new plans, the plan library is expanded, improving performance over time. However, for new plans to be useful in the long run, it is critical that the context condition associated with new plans is carefully generated. In this paper we describe a plan reuse technique aimed at improving an agent’s runtime performance by deriving optimal context conditions for new plans, allowing an agent to reuse generated plans as much as possible.
Resumo:
Observability measures the support of computer systems to accurately capture, analyze, and present (collectively observe) the internal information about the systems. Observability frameworks play important roles for program understanding, troubleshooting, performance diagnosis, and optimizations. However, traditional solutions are either expensive or coarse-grained, consequently compromising their utility in accommodating today’s increasingly complex software systems. New solutions are emerging for VM-based languages due to the full control language VMs have over program executions. Existing such solutions, nonetheless, still lack flexibility, have high overhead, or provide limited context information for developing powerful dynamic analyses. In this thesis, we present a VM-based infrastructure, called marker tracing framework (MTF), to address the deficiencies in the existing solutions for providing better observability for VM-based languages. MTF serves as a solid foundation for implementing fine-grained low-overhead program instrumentation. Specifically, MTF allows analysis clients to: 1) define custom events with rich semantics ; 2) specify precisely the program locations where the events should trigger; and 3) adaptively enable/disable the instrumentation at runtime. In addition, MTF-based analysis clients are more powerful by having access to all information available to the VM. To demonstrate the utility and effectiveness of MTF, we present two analysis clients: 1) dynamic typestate analysis with adaptive online program analysis (AOPA); and 2) selective probabilistic calling context analysis (SPCC). In addition, we evaluate the runtime performance of MTF and the typestate client with the DaCapo benchmarks. The results show that: 1) MTF has acceptable runtime overhead when tracing moderate numbers of marker events; and 2) AOPA is highly effective in reducing the event frequency for the dynamic typestate analysis; and 3) language VMs can be exploited to offer greater observability.
Resumo:
In this paper, we propose a new technique that can identify transaction-local memory (i.e. captured memory), in managed environments, while having a low runtime overhead. We implemented our proposal in a well known STM framework (Deuce) and we tested it in STMBench7 with two different STMs: TL2 and LSA. In both STMs the performance improved significantly (4 times and 2.6 times, respectively). Moreover, running the STAMP benchmarks with our approach shows improvements of 7 times in the best case for the Vacation application.
Resumo:
An adaptive antenna array combines the signal of each element, using some constraints to produce the radiation pattern of the antenna, while maximizing the performance of the system. Direction of arrival (DOA) algorithms are applied to determine the directions of impinging signals, whereas beamforming techniques are employed to determine the appropriate weights for the array elements, to create the desired pattern. In this paper, a detailed analysis of both categories of algorithms is made, when a planar antenna array is used. Several simulation results show that it is possible to point an antenna array in a desired direction based on the DOA estimation and on the beamforming algorithms. A comparison of the performance in terms of runtime and accuracy of the used algorithms is made. These characteristics are dependent on the SNR of the incoming signal.
Resumo:
The goal of this work is to try to create a statistical model, based only on easily computable parameters from the CSP problem to predict runtime behaviour of the solving algorithms, and let us choose the best algorithm to solve the problem. Although it seems that the obvious choice should be MAC, experimental results obtained so far show, that with big numbers of variables, other algorithms perfom much better, specially for hard problems in the transition phase.
Resumo:
Die Bedeutung des Dienstgüte-Managements (SLM) im Bereich von Unternehmensanwendungen steigt mit der zunehmenden Kritikalität von IT-gestützten Prozessen für den Erfolg einzelner Unternehmen. Traditionell werden zur Implementierung eines wirksamen SLMs Monitoringprozesse in hierarchischen Managementumgebungen etabliert, die einen Administrator bei der notwendigen Rekonfiguration von Systemen unterstützen. Auf aktuelle, hochdynamische Softwarearchitekturen sind diese hierarchischen Ansätze jedoch nur sehr eingeschränkt anwendbar. Ein Beispiel dafür sind dienstorientierte Architekturen (SOA), bei denen die Geschäftsfunktionalität durch das Zusammenspiel einzelner, voneinander unabhängiger Dienste auf Basis deskriptiver Workflow-Beschreibungen modelliert wird. Dadurch ergibt sich eine hohe Laufzeitdynamik der gesamten Architektur. Für das SLM ist insbesondere die dezentrale Struktur einer SOA mit unterschiedlichen administrativen Zuständigkeiten für einzelne Teilsysteme problematisch, da regelnde Eingriffe zum einen durch die Kapselung der Implementierung einzelner Dienste und zum anderen durch das Fehlen einer zentralen Kontrollinstanz nur sehr eingeschränkt möglich sind. Die vorliegende Arbeit definiert die Architektur eines SLM-Systems für SOA-Umgebungen, in dem autonome Management-Komponenten kooperieren, um übergeordnete Dienstgüteziele zu erfüllen: Mithilfe von Selbst-Management-Technologien wird zunächst eine Automatisierung des Dienstgüte-Managements auf Ebene einzelner Dienste erreicht. Die autonomen Management-Komponenten dieser Dienste können dann mithilfe von Selbstorganisationsmechanismen übergreifende Ziele zur Optimierung von Dienstgüteverhalten und Ressourcennutzung verfolgen. Für das SLM auf Ebene von SOA Workflows müssen temporär dienstübergreifende Kooperationen zur Erfüllung von Dienstgüteanforderungen etabliert werden, die sich damit auch über mehrere administrative Domänen erstrecken können. Eine solche zeitlich begrenzte Kooperation autonomer Teilsysteme kann sinnvoll nur dezentral erfolgen, da die jeweiligen Kooperationspartner im Vorfeld nicht bekannt sind und – je nach Lebensdauer einzelner Workflows – zur Laufzeit beteiligte Komponenten ausgetauscht werden können. In der Arbeit wird ein Verfahren zur Koordination autonomer Management-Komponenten mit dem Ziel der Optimierung von Antwortzeiten auf Workflow-Ebene entwickelt: Management-Komponenten können durch Übertragung von Antwortzeitanteilen untereinander ihre individuellen Ziele straffen oder lockern, ohne dass das Gesamtantwortzeitziel dadurch verändert wird. Die Übertragung von Antwortzeitanteilen wird mithilfe eines Auktionsverfahrens realisiert. Technische Grundlage der Kooperation bildet ein Gruppenkommunikationsmechanismus. Weiterhin werden in Bezug auf die Nutzung geteilter, virtualisierter Ressourcen konkurrierende Dienste entsprechend geschäftlicher Ziele priorisiert. Im Rahmen der praktischen Umsetzung wird die Realisierung zentraler Architekturelemente und der entwickelten Verfahren zur Selbstorganisation beispielhaft für das SLM konkreter Komponenten vorgestellt. Zur Untersuchung der Management-Kooperation in größeren Szenarien wird ein hybrider Simulationsansatz verwendet. Im Rahmen der Evaluation werden Untersuchungen zur Skalierbarkeit des Ansatzes durchgeführt. Schwerpunkt ist hierbei die Betrachtung eines Systems aus kooperierenden Management-Komponenten, insbesondere im Hinblick auf den Kommunikationsaufwand. Die Evaluation zeigt, dass ein dienstübergreifendes, autonomes Performance-Management in SOA-Umgebungen möglich ist. Die Ergebnisse legen nahe, dass der entwickelte Ansatz auch in großen Umgebungen erfolgreich angewendet werden kann.
Resumo:
Wednesday 23rd April 2014 Speaker(s): Willi Hasselbring Organiser: Leslie Carr Time: 23/04/2014 14:00-15:00 Location: B32/3077 File size: 802Mb Abstract The internal behavior of large-scale software systems cannot be determined on the basis of static (e.g., source code) analysis alone. Kieker provides complementary dynamic analysis capabilities, i.e., monitoring/profiling and analyzing a software system's runtime behavior. Application Performance Monitoring is concerned with continuously observing a software system's performance-specific runtime behavior, including analyses like assessing service level compliance or detecting and diagnosing performance problems. Architecture Discovery is concerned with extracting architectural information from an existing software system, including both structural and behavioral aspects like identifying architectural entities (e.g., components and classes) and their interactions (e.g., local or remote procedure calls). In addition to the Architecture Discovery of Java systems, Kieker supports Architecture Discovery for other platforms, including legacy systems, for instance, inplemented in C#, C++, Visual Basic 6, COBOL or Perl. Thanks to Kieker's extensible architecture it is easy to implement and use custom extensions and plugins. Kieker was designed for continuous monitoring in production systems inducing only a very low overhead, which has been evaluated in extensive benchmark experiments. Please, refer to http://kieker-monitoring.net/ for more information.
Resumo:
In the 1990s the Message Passing Interface Forum defined MPI bindings for Fortran, C, and C++. With the success of MPI these relatively conservative languages have continued to dominate in the parallel computing community. There are compelling arguments in favour of more modern languages like Java. These include portability, better runtime error checking, modularity, and multi-threading. But these arguments have not converted many HPC programmers, perhaps due to the scarcity of full-scale scientific Java codes, and the lack of evidence for performance competitive with C or Fortran. This paper tries to redress this situation by porting two scientific applications to Java. Both of these applications are parallelized using our thread-safe Java messaging system—MPJ Express. The first application is the Gadget-2 code, which is a massively parallel structure formation code for cosmological simulations. The second application uses the finite-domain time-difference method for simulations in the area of computational electromagnetics. We evaluate and compare the performance of the Java and C versions of these two scientific applications, and demonstrate that the Java codes can achieve performance comparable with legacy applications written in conventional HPC languages. Copyright © 2009 John Wiley & Sons, Ltd.
Resumo:
Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq)
Resumo:
An important problem in computational biology is finding the longest common subsequence (LCS) of two nucleotide sequences. This paper examines the correctness and performance of a recently proposed parallel LCS algorithm that uses successor tables and pruning rules to construct a list of sets from which an LCS can be easily reconstructed. Counterexamples are given for two pruning rules that were given with the original algorithm. Because of these errors, performance measurements originally reported cannot be validated. The work presented here shows that speedup can be reliably achieved by an implementation in Unified Parallel C that runs on an Infiniband cluster. This performance is partly facilitated by exploiting the software cache of the MuPC runtime system. In addition, this implementation achieved speedup without bulk memory copy operations and the associated programming complexity of message passing.
Resumo:
This paper describes a study and analysis of surface normal-base descriptors for 3D object recognition. Specifically, we evaluate the behaviour of descriptors in the recognition process using virtual models of objects created from CAD software. Later, we test them in real scenes using synthetic objects created with a 3D printer from the virtual models. In both cases, the same virtual models are used on the matching process to find similarity. The difference between both experiments is in the type of views used in the tests. Our analysis evaluates three subjects: the effectiveness of 3D descriptors depending on the viewpoint of camera, the geometry complexity of the model and the runtime used to do the recognition process and the success rate to recognize a view of object among the models saved in the database.
Resumo:
Modern software applications are becoming more dependent on database management systems (DBMSs). DBMSs are usually used as black boxes by software developers. For example, Object-Relational Mapping (ORM) is one of the most popular database abstraction approaches that developers use nowadays. Using ORM, objects in Object-Oriented languages are mapped to records in the database, and object manipulations are automatically translated to SQL queries. As a result of such conceptual abstraction, developers do not need deep knowledge of databases; however, all too often this abstraction leads to inefficient and incorrect database access code. Thus, this thesis proposes a series of approaches to improve the performance of database-centric software applications that are implemented using ORM. Our approaches focus on troubleshooting and detecting inefficient (i.e., performance problems) database accesses in the source code, and we rank the detected problems based on their severity. We first conduct an empirical study on the maintenance of ORM code in both open source and industrial applications. We find that ORM performance-related configurations are rarely tuned in practice, and there is a need for tools that can help improve/tune the performance of ORM-based applications. Thus, we propose approaches along two dimensions to help developers improve the performance of ORM-based applications: 1) helping developers write more performant ORM code; and 2) helping developers configure ORM configurations. To provide tooling support to developers, we first propose static analysis approaches to detect performance anti-patterns in the source code. We automatically rank the detected anti-pattern instances according to their performance impacts. Our study finds that by resolving the detected anti-patterns, the application performance can be improved by 34% on average. We then discuss our experience and lessons learned when integrating our anti-pattern detection tool into industrial practice. We hope our experience can help improve the industrial adoption of future research tools. However, as static analysis approaches are prone to false positives and lack runtime information, we also propose dynamic analysis approaches to further help developers improve the performance of their database access code. We propose automated approaches to detect redundant data access anti-patterns in the database access code, and our study finds that resolving such redundant data access anti-patterns can improve application performance by an average of 17%. Finally, we propose an automated approach to tune performance-related ORM configurations using both static and dynamic analysis. Our study shows that our approach can help improve application throughput by 27--138%. Through our case studies on real-world applications, we show that all of our proposed approaches can provide valuable support to developers and help improve application performance significantly.
Resumo:
Thesis (Ph.D.)--University of Washington, 2016-08