78 results for Source code (Computer science)

in BORIS: Bern Open Repository and Information System - Bern - Switzerland


Relevance:

100.00%

Publisher:

Abstract:

As more and more open-source software components become available on the internet, we need automatic ways to label and compare them. For example, a developer who searches for reusable software must be able to quickly gain an understanding of retrieved components. This understanding cannot be gained at the level of source code due to the semantic gap between source code and the domain model. In this paper we present a lexical approach that uses the log-likelihood ratios of word frequencies to automatically provide labels for software components. We present a prototype implementation of our labeling/comparison algorithm and provide examples of its application. In particular, we apply the approach to detect trends in the evolution of a software system.
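
As a rough illustration of the kind of lexical scoring the abstract describes (not the authors' implementation; all names below are ours), here is a minimal Python sketch that ranks a component's vocabulary with Dunning's log-likelihood ratio against a background corpus:

```python
import math
from collections import Counter

def llr(a, b, n1, n2):
    # Dunning's log-likelihood ratio for a term seen a times among
    # the component's n1 words and b times among the corpus's n2 words.
    e1 = n1 * (a + b) / (n1 + n2)  # expected count in the component
    e2 = n2 * (a + b) / (n1 + n2)  # expected count in the background
    g2 = 0.0
    if a > 0:
        g2 += a * math.log(a / e1)
    if b > 0:
        g2 += b * math.log(b / e2)
    return 2 * g2

def label_component(component_words, corpus_words, top=10):
    # Rank the component's words by how surprisingly often they occur
    # relative to the background corpus; the top words become labels.
    comp, corp = Counter(component_words), Counter(corpus_words)
    n1, n2 = sum(comp.values()), sum(corp.values())
    scores = {w: llr(c, corp.get(w, 0), n1, n2) for w, c in comp.items()}
    return sorted(scores, key=scores.get, reverse=True)[:top]
```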

Relevance:

100.00%

Publisher:

Abstract:

Detecting bugs as early as possible plays an important role in ensuring software quality before shipping. We argue that mining previous bug fixes can yield useful knowledge about why bugs happen and how they are fixed. In this paper, we mine the change history of 717 open source projects to extract bug-fix patterns. We also manually inspect many of the bugs we found to get insights into the contexts and reasons behind those bugs. For instance, we found that missing null checks and missing initializations are highly recurrent, and we believe that they can be automatically detected and fixed.
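
To make the mining idea concrete, here is a hedged Python sketch of a common two-step heuristic (identify bug-fix commits by message keywords, then scan the fix's added lines for a recurrent pattern such as a missing null check). The regexes are illustrative, not the paper's extraction rules:

```python
import re

FIX_RE = re.compile(r"\b(fix(es|ed)?|bug|defect|patch)\b", re.I)
NULL_CHECK_RE = re.compile(r"if\s*\(\s*\w+\s*[!=]=\s*null\s*\)")

def is_bug_fix(commit_message):
    # Heuristic: treat a commit as a bug fix if its message
    # mentions typical fixing keywords.
    return bool(FIX_RE.search(commit_message))

def added_null_checks(diff_text):
    # Count '+' lines in the fix that introduce a null check --
    # one of the recurrent patterns the study reports.
    added = (l[1:] for l in diff_text.splitlines() if l.startswith("+"))
    return sum(1 for line in added if NULL_CHECK_RE.search(line))
```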

Relevance:

100.00%

Publisher:

Abstract:

Code clone detection helps connect developers across projects, if we do it on a large scale. The cornerstones that allow clone detection to work on a large scale are: (1) bad hashing, (2) lightweight parsing using regular expressions, and (3) MapReduce pipelines. Bad hashing determines whether two artifacts are similar by checking whether their hashes are identical. We show a bad hashing scheme that works well on source code. Lightweight parsing using regular expressions is our technique for obtaining entire parse trees from regular expressions, robustly and efficiently. We detail the algorithm and implementation of one such regular expression engine. MapReduce pipelines are a way of expressing a computation such that it can automatically and simply be parallelized. We detail the design and implementation of one such MapReduce pipeline that is efficient and debuggable. We show a clone detector that combines these cornerstones to detect code clones across all projects and across all versions of each project.
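
A minimal sketch of the bad-hashing cornerstone, assuming a crude whitespace/comment normalization (the paper's actual scheme is more refined); bucketing fragments by hash turns clone detection into hash-equality lookups:

```python
import hashlib
import re

def normalize(code):
    # Crude normalization: strip comments and collapse whitespace so
    # trivially different copies of a fragment hash identically.
    code = re.sub(r"/\*.*?\*/|//[^\n]*", "", code, flags=re.S)
    return re.sub(r"\s+", " ", code).strip()

def bad_hash(code):
    # "Bad hashing": two fragments are considered similar if and only
    # if the hashes of their normalized text are identical.
    return hashlib.md5(normalize(code).encode()).hexdigest()

def clone_buckets(fragments):
    # Group fragments by hash; any bucket with more than one member
    # is a candidate clone class.
    buckets = {}
    for frag in fragments:
        buckets.setdefault(bad_hash(frag), []).append(frag)
    return [fs for fs in buckets.values() if len(fs) > 1]
```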

Relevance:

100.00%

Publisher:

Abstract:

A feature represents a functional requirement fulfilled by a system. Since many maintenance tasks are expressed in terms of features, it is important to establish the correspondence between a feature and its implementation in source code. Traditional approaches to establish this correspondence exercise features to generate a trace of runtime events, which is then processed by post-mortem analysis. These approaches typically generate large amounts of data to analyze. Due to their static nature, these approaches do not support incremental and interactive analysis of features. We propose a radically different approach called live feature analysis, which provides a runtime model of features. Our approach analyzes features on a running system, makes it possible to grow feature representations by exercising different scenarios of the same feature, and identifies execution elements down to the sub-method level. We describe how live feature analysis is implemented effectively by annotating structural representations of code based on abstract syntax trees. We illustrate our live analysis with a case study where we achieve a more complete feature representation by exercising and merging variants of feature behavior, and demonstrate the efficiency of our technique with benchmarks.
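
A toy stand-in for the live part of the approach, sketched in Python: trace one running scenario, record which functions execute, and grow the feature representation by merging traces of scenario variants. The real approach annotates AST-based structural representations and reaches sub-method granularity; this sketch stops at function calls:

```python
import sys

def trace_feature(scenario):
    # Run one scenario of a feature and record which functions
    # execute while it runs.
    executed = set()
    def tracer(frame, event, arg):
        if event == "call":
            executed.add(frame.f_code.co_name)
        return tracer
    sys.settrace(tracer)
    try:
        scenario()
    finally:
        sys.settrace(None)
    return executed

# Growing a feature representation by merging scenario variants:
#   feature_model = trace_feature(variant_a) | trace_feature(variant_b)
```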

Relevance:

100.00%

Publisher:

Abstract:

Much of the knowledge about software systems is implicit, and therefore difficult to recover by purely automated techniques. Architectural layers and the externally visible features of software systems are two examples of information that can be difficult to detect from source code alone, and that would benefit from additional human knowledge. Typical approaches to reasoning about data involve encoding an explicit meta-model and expressing analyses at that level. Due to its informal nature, however, human knowledge can be difficult to characterize up-front and integrate into such a meta-model. We propose a generic, annotation-based approach to capture such knowledge during the reverse engineering process. Annotation types can be iteratively defined, refined and transformed, without requiring a fixed meta-model to be defined in advance. We show how our approach supports reverse engineering by implementing it in a tool called Metanool and by applying it to (i) analyzing architectural layering, (ii) tracking reengineering tasks, (iii) detecting design flaws, and (iv) analyzing features.
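
A minimal sketch, in Python, of what such a free-form annotation store might look like: annotation types are plain strings that can be introduced, refined and transformed at any time, with no meta-model fixed in advance. The class and method names are ours, not Metanool's API:

```python
from collections import defaultdict

class AnnotationStore:
    # Free-form annotations attached to arbitrary model elements;
    # no meta-model is fixed in advance.
    def __init__(self):
        self._notes = defaultdict(dict)  # element -> {type: value}

    def annotate(self, element, ann_type, value):
        self._notes[element][ann_type] = value

    def refine(self, old_type, new_type, transform=lambda v: v):
        # Iteratively evolve an annotation type, e.g. turn free-text
        # 'layer' notes into a structured enumeration later on.
        for notes in self._notes.values():
            if old_type in notes:
                notes[new_type] = transform(notes.pop(old_type))

    def elements_with(self, ann_type):
        return [e for e, ns in self._notes.items() if ann_type in ns]
```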

Relevance:

100.00%

Publisher:

Abstract:

Mainstream IDEs such as Eclipse support developers in managing software projects mainly by offering static views of the source code. Such a static perspective neglects any information about runtime behavior. However, object-oriented programs heavily rely on polymorphism and late-binding, which makes them difficult to understand based on their static structure alone. Developers thus resort to debuggers or profilers to study the system's dynamics. However, the information provided by these tools is volatile and hence cannot be exploited to ease the navigation of the source space. In this paper we present an approach to augment the static source perspective with dynamic metrics such as precise runtime type information, or memory and object allocation statistics. Dynamic metrics can improve the understanding of the behavior and structure of a system. We rely on dynamic data gathering based on aspects to analyze running Java systems. By solving concrete use cases we illustrate how dynamic metrics directly available in the IDE are useful. We also comprehensively report on the efficiency of our approach to gather dynamic metrics.
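
As an illustration of aspect-style metric gathering (the paper instruments Java with aspects; this Python decorator is only an analogy), the wrapper below records call counts and the precise runtime types flowing into each method:

```python
import functools
from collections import Counter, defaultdict

call_counts = Counter()
runtime_types = defaultdict(set)

def dynamic_metrics(fn):
    # Record call counts and the precise runtime types flowing into
    # each decorated method, available for display next to the source.
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        call_counts[fn.__qualname__] += 1
        runtime_types[fn.__qualname__].update(type(a).__name__ for a in args)
        return fn(*args, **kwargs)
    return wrapper
```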

Relevance:

100.00%

Publisher:

Abstract:

Maintaining object-oriented systems that use inheritance and polymorphism is difficult, since runtime information, such as which methods are actually invoked at a call site, is not visible in the static source code. We have implemented Senseo, an Eclipse plugin enhancing Eclipse's static source views with various dynamic metrics, such as runtime types, the number of objects created, or the amount of memory allocated in particular methods.

Relevance:

100.00%

Publisher:

Abstract:

In order to analyze software systems, it is necessary to model them. Static software models are commonly imported by parsing source code and related data. Unfortunately, building custom parsers for most programming languages is a non-trivial endeavour. This poses a major bottleneck for analyzing software systems programmed in languages for which importers do not already exist. Luckily, initial software models do not require detailed parsers, so it is possible to start analysis with a coarse-grained importer, which is then gradually refined. In this paper we propose an approach to "agile modeling" that exploits island grammars to extract initial coarse-grained models, parser combinators to enable gradual refinement of model importers, and various heuristics to recognize language structure, keywords and other language artifacts.
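
A deliberately coarse first-pass importer in this spirit might look like the Python sketch below: two regex "islands" recover classes and methods, everything else is water, and the model can later be refined with real parser combinators. The regexes are illustrative and C/Java-flavored:

```python
import re

CLASS_RE = re.compile(r"\bclass\s+(\w+)")
METHOD_RE = re.compile(r"\b(\w+)\s*\([^)]*\)\s*\{")

def coarse_model(source):
    # First-pass importer: pick out class and method "islands" and
    # treat everything else as water.
    return {
        "classes": CLASS_RE.findall(source),
        "methods": [m for m in METHOD_RE.findall(source)
                    if m not in ("if", "for", "while", "switch")],
    }
```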

Relevance:

100.00%

Publisher:

Abstract:

Time-based indoor localization has been investigated for several years, but the accuracy of existing solutions is limited by several factors, e.g., imperfect synchronization, signal bandwidth and the indoor environment. In this paper, we compare two time-based localization algorithms for narrow-band signals, i.e., multilateration and fingerprinting. First, we develop a new Linear Least Square (LLS) algorithm for Differential Time Difference Of Arrival (DTDOA). Second, fingerprinting is among the most successful approaches used for indoor localization and typically relies on the collection of signal strength measurements over the area of interest. We propose an alternative by constructing fingerprints of fine-grained time information of the radio signal. We offer comprehensive analytical discussions on the feasibility of the approaches, which are backed up by evaluations in a software defined radio based IEEE 802.15.4 testbed. Our work contributes to research on localization with narrow-band signals. The results show that our proposed DTDOA-based LLS algorithm clearly improves localization accuracy compared to the traditional TDOA-based LLS algorithm, but the accuracy is still limited because of the complex indoor environment. Furthermore, we show that time-based fingerprinting is a promising alternative to power-based fingerprinting.
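
For context, here is a sketch of the traditional TDOA linear least-squares baseline the paper compares against (not the proposed DTDOA variant). Squaring the range equations and subtracting the reference anchor's equation yields a system that is linear in the position and the reference range r_0:

```python
import numpy as np

def tdoa_lls(anchors, d):
    # anchors: (m, 2) anchor positions; d: (m-1,) range differences
    # r_i - r_0 = c * (t_i - t_0) for i = 1..m-1.
    # Squaring ||x - a_i||^2 = r_i^2 and subtracting the i = 0
    # equation gives, per row i:
    #   2*(a_i - a_0) . x + 2*d_i*r_0 = |a_i|^2 - |a_0|^2 - d_i^2,
    # which is linear in the unknowns (x, y, r_0).
    anchors = np.asarray(anchors, dtype=float)
    d = np.asarray(d, dtype=float)
    a0, ai = anchors[0], anchors[1:]
    A = np.hstack([2 * (ai - a0), 2 * d[:, None]])
    b = (ai**2).sum(axis=1) - (a0**2).sum() - d**2
    sol, *_ = np.linalg.lstsq(A, b, rcond=None)
    return sol[:2]  # drop the auxiliary unknown r_0
```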

Relevance:

100.00%

Publisher:

Abstract:

In this work, we provide a passive location monitoring system for IEEE 802.15.4 signal emitters. The system adopts software defined radio techniques to passively overhear IEEE 802.15.4 packets and to extract power information from baseband signals. In our system, we provide a new model based on nonlinear regression for ranging. After obtaining distance information, a Weighted Centroid (WC) algorithm is adopted to locate users. In WC, each weight is inversely proportional to the n-th power of the propagation distance, and the degree n is obtained from initial measurements. We evaluate our system in a 16 m × 18 m area with complex indoor propagation conditions. We are able to achieve a median error of 2.1 m with only 4 anchor nodes.
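
The Weighted Centroid step is simple enough to show directly; a minimal sketch with illustrative inputs (the ranging model that produces the distances is the paper's nonlinear regression, omitted here):

```python
import numpy as np

def weighted_centroid(anchors, distances, n=2.0):
    # Each anchor's weight is inversely proportional to the n-th
    # power of its estimated propagation distance; n is calibrated
    # from initial measurements.
    anchors = np.asarray(anchors, dtype=float)
    w = 1.0 / np.asarray(distances, dtype=float) ** n
    return (w[:, None] * anchors).sum(axis=0) / w.sum()

# Illustrative call with four corner anchors and ranged distances:
#   weighted_centroid([(0, 0), (16, 0), (0, 18), (16, 18)],
#                     [5.0, 12.0, 9.0, 15.0])
```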

Relevance:

100.00%

Publisher:

Abstract:

Software dependencies play a vital role in program comprehension, change impact analysis and other software maintenance activities. Traditionally, these activities are supported by source code analysis; however, the source code is sometimes inaccessible or difficult to analyse, as in hybrid systems composed of source code in multiple languages using various paradigms (e.g. object-oriented programming and relational databases). Moreover, not all stakeholders have adequate knowledge to perform such analyses. For example, non-technical domain experts and consultants raise most maintenance requests; however, they cannot predict the cost and impact of the requested changes without the support of the developers. We propose a novel approach to predicting software dependencies by exploiting the coupling present in domain-level information. Our approach is independent of the software implementation; hence, it can be used to approximate architectural dependencies without access to the source code or the database. As such, it can be applied to hybrid systems with heterogeneous source code or legacy systems with missing source code. In addition, this approach is based solely on information visible and understandable to domain users; therefore, it can be efficiently used by domain experts without the support of software developers. We evaluate our approach with a case study on a large-scale enterprise system, in which we demonstrate how up to 65% of the source code dependencies and 77% of the database dependencies are predicted solely based on domain information.
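
A toy version of the underlying intuition, sketched in Python under our own assumptions (the data layout and threshold are illustrative, not the paper's method): if two domain entities repeatedly co-occur in domain-level artifacts such as use cases, screens or reports, predict a dependency between their implementations, without reading any source code:

```python
from collections import Counter
from itertools import combinations

def predict_dependencies(domain_artifacts, threshold=2):
    # domain_artifacts: iterable of entity-name sets, one per
    # domain-level artifact (use case, screen, report, ...).
    # Entities that co-occur often are predicted to depend on
    # each other in the implementation.
    coupling = Counter()
    for entities in domain_artifacts:
        for a, b in combinations(sorted(set(entities)), 2):
            coupling[(a, b)] += 1
    return {pair for pair, n in coupling.items() if n >= threshold}
```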

Relevance:

100.00%

Publisher:

Abstract:

Software corpora facilitate reproducibility of analyses; however, static analysis of an entire corpus still requires considerable effort, often duplicated unnecessarily by multiple users. Moreover, most corpora are designed for a single language, increasing the effort required for cross-language analysis. To address these aspects we propose Pangea, an infrastructure allowing fast development of static analyses on multi-language corpora. Pangea uses language-independent meta-models stored as object model snapshots that can be directly loaded into memory and queried without any parsing overhead. To reduce the effort of performing static analyses, Pangea provides out-of-the-box support for: creating and refining analyses in a dedicated environment, deploying an analysis on an entire corpus using a runner that supports parallel execution, and exporting results in various formats. In this tool demonstration we introduce Pangea and provide several usage scenarios that illustrate how it reduces the cost of analysis.
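
A hedged sketch of the snapshot-plus-parallel-runner idea, using pickle and multiprocessing as stand-ins (Pangea's actual snapshot format and runner API are its own):

```python
import pickle
from multiprocessing import Pool

def analyze(snapshot_path):
    # Load a pre-built model snapshot straight into memory -- no
    # parsing at analysis time -- and compute a stand-in metric.
    with open(snapshot_path, "rb") as f:
        model = pickle.load(f)
    return snapshot_path, len(model.get("classes", []))

def run_on_corpus(snapshot_paths):
    # Parallel runner over the whole corpus, one task per snapshot.
    with Pool() as pool:
        return dict(pool.map(analyze, snapshot_paths))
```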

Relevance:

100.00%

Publisher:

Abstract:

Imprecise manipulation of source code (semi-parsing) is useful for tasks such as robust parsing, error recovery, lexical analysis, and rapid development of parsers for data extraction. An island grammar precisely defines only a subset of a language's syntax (islands), while the rest of the syntax (water) is defined imprecisely. Usually, water is defined as the negation of islands. Albeit simple, such a definition of water is naive and impedes composition of islands. When developing an island grammar, sooner or later a programmer has to create water tailored to each individual island. Such an approach is fragile, however, because the water can change with any change to the grammar. It is also time-consuming, because the water is defined manually by the programmer rather than derived automatically. Finally, an island surrounded by water cannot be reused, because the water has to be defined anew for every grammar. In this paper we propose a new island parsing technique: bounded seas. Bounded seas are composable, robust, reusable and easy to use, because island-specific water is created automatically. We integrated bounded seas into a parser combinator framework as a demonstration of their composability and reusability.
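
A simplified sketch of the bounded-sea idea as parser combinators in Python: the "sea" skips water automatically by advancing until its island matches, so no island-specific water is written by hand. Real bounded seas also bound the water by the enclosing context; this toy version omits that:

```python
import re

def island(pattern):
    # A precise island parser built from a regular expression.
    rx = re.compile(pattern)
    def parse(text, pos):
        m = rx.match(text, pos)
        return (m.group(), m.end()) if m else None
    return parse

def sea(island_parser):
    # A bounded sea: skip water automatically by advancing until the
    # island matches, so no island-specific water is hand-written.
    def parse(text, pos):
        for i in range(pos, len(text) + 1):
            result = island_parser(text, i)
            if result:
                return result
        return None
    return parse

# Usage: pull the next method-like island out of arbitrary water.
#   next_def = sea(island(r"def\s+\w+"))("water... def foo(): pass", 0)
```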