986 resultados para source code parsing
Resumo:
Nous proposons une approche basée sur la formulation interactive des requêtes. Notre approche sert à faciliter des tâches d’analyse et de compréhension du code source. Dans cette approche, l’analyste utilise un ensemble de filtres de base (linguistique, structurel, quantitatif, et filtre d’interactivité) pour définir des requêtes complexes. Ces requêtes sont construites à l’aide d’un processus interactif et itératif, où des filtres de base sont choisis et exécutés, et leurs résultats sont visualisés, changés et combinés en utilisant des opérateurs prédéfinis. Nous avons évalués notre approche par l’implantation des récentes contributions en détection de défauts de conception ainsi que la localisation de fonctionnalités dans le code. Nos résultats montrent que, en plus d’être générique, notre approche aide à la mise en œuvre des solutions existantes implémentées par des outils automatiques.
Resumo:
In order to analyze software systems, it is necessary to model them. Static software models are commonly imported by parsing source code and related data. Unfortunately, building custom parsers for most programming languages is a non-trivial endeavour. This poses a major bottleneck for analyzing software systems programmed in languages for which importers do not already exist. Luckily, initial software models do not require detailed parsers, so it is possible to start analysis with a coarse-grained importer, which is then gradually refined. In this paper we propose an approach to "agile modeling" that exploits island grammars to extract initial coarse-grained models, parser combinators to enable gradual refinement of model importers, and various heuristics to recognize language structure, keywords and other language artifacts.
Resumo:
Imprecise manipulation of source code (semi-parsing) is useful for tasks such as robust parsing, error recovery, lexical analysis, and rapid development of parsers for data extraction. An island grammar precisely defines only a subset of a language syntax (islands), while the rest of the syntax (water) is defined imprecisely. Usually, water is defined as the negation of islands. Albeit simple, such a definition of water is naive and impedes composition of islands. When developing an island grammar, sooner or later a programmer has to create water tailored to each individual island. Such an approach is fragile, however, because water can change with any change of a grammar. It is time-consuming, because water is defined manually by a programmer and not automatically. Finally, an island surrounded by water cannot be reused because water has to be defined for every grammar individually. In this paper we propose a new technique of island parsing - bounded seas. Bounded seas are composable, robust, reusable and easy to use because island-specific water is created automatically. We integrated bounded seas into a parser combinator framework as a demonstration of their composability and reusability.
Resumo:
Code clone detection helps connect developers across projects, if we do it on a large scale. The cornerstones that allow clone detection to work on a large scale are: (1) bad hashing (2) lightweight parsing using regular expressions and (3) MapReduce pipelines. Bad hashing means to determine whether or not two artifacts are similar by checking whether their hashes are identical. We show a bad hashing scheme that works well on source code. Lightweight parsing using regular expressions is our technique of obtaining entire parse trees from regular expressions, robustly and efficiently. We detail the algorithm and implementation of one such regular expression engine. MapReduce pipelines are a way of expressing a computation such that it can automatically and simply be parallelized. We detail the design and implementation of one such MapReduce pipeline that is efficient and debuggable. We show a clone detector that combines these cornerstones to detect code clones across all projects, across all versions of each project.
Resumo:
The Lattes platform is the major scientific information system maintained by the National Council for Scientific and Technological Development (CNPq). This platform allows to manage the curricular information of researchers and institutions working in Brazil based on the so called Lattes Curriculum. However, the public information is individually available for each researcher, not providing the automatic creation of reports of several scientific productions for research groups. It is thus difficult to extract and to summarize useful knowledge for medium to large size groups of researchers. This paper describes the design, implementation and experiences with scriptLattes: an open-source system to create academic reports of groups based on curricula of the Lattes Database. The scriptLattes system is composed by the following modules: (a) data selection, (b) data preprocessing, (c) redundancy treatment, (d) collaboration graph generation among group members, (e) research map generation based on geographical information, and (f) automatic report creation of bibliographical, technical and artistic production, and academic supervisions. The system has been extensively tested for a large variety of research groups of Brazilian institutions, and the generated reports have shown an alternative to easily extract knowledge from data in the context of Lattes platform. The source code, usage instructions and examples are available at http://scriptlattes.sourceforge.net/.
Resumo:
Graphical user interfaces (GUIs) are critical components of today's software. Developers are dedicating a larger portion of code to implementing them. Given their increased importance, correctness of GUIs code is becoming essential. This paper describes the latest results in the development of GUISurfer, a tool to reverse engineer the GUI layer of interactive computing systems. The ultimate goal of the tool is to enable analysis of interactive system from source code.
Resumo:
More and more current software systems rely on non trivial coordination logic for combining autonomous services typically running on different platforms and often owned by different organizations. Often, however, coordination data is deeply entangled in the code and, therefore, difficult to isolate and analyse separately. COORDINSPECTOR is a software tool which combines slicing and program analysis techniques to isolate all coordination elements from the source code of an existing application. Such a reverse engineering process provides a clear view of the actually invoked services as well as of the orchestration patterns which bind them together. The tool analyses Common Intermediate Language (CIL) code, the native language of Microsoft .Net Framework. Therefore, the scope of application of COORDINSPECTOR is quite large: potentially any piece of code developed in any of the programming languages which compiles to the .Net Framework. The tool generates graphical representations of the coordination layer together and identifies the underlying business process orchestrations, rendering them as Orc specifications
Resumo:
Graphical user interfaces (GUIs) are critical components of today's open source software. Given their increased relevance, the correctness and usability of GUIs are becoming essential. This paper describes the latest results in the development of our tool to reverse engineer the GUI layer of interactive computing open source systems. We use static analysis techniques to generate models of the user interface behavior from source code. Models help in graphical user interface inspection by allowing designers to concentrate on its more important aspects. One particular type of model that the tool is able to generate is state machines. The paper shows how graph theory can be useful when applied to these models. A number of metrics and algorithms are used in the analysis of aspects of the user interface's quality. The ultimate goal of the tool is to enable analysis of interactive system through GUIs source code inspection.
Resumo:
Eradication of code smells is often pointed out as a way to improve readability, extensibility and design in existing software. However, code smell detection remains time consuming and error-prone, partly due to the inherent subjectivity of the detection processes presently available. In view of mitigating the subjectivity problem, this dissertation presents a tool that automates a technique for the detection and assessment of code smells in Java source code, developed as an Eclipse plugin. The technique is based upon a Binary Logistic Regression model that uses complexity metrics as independent variables and is calibrated by expert‟s knowledge. An overview of the technique is provided, the tool is described and validated by an example case study.
Resumo:
The use of open source software continues to grow on a daily basis. Today, enterprise applications contain 40% to 70% open source code and this fact has legal, development, IT security, risk management and compliance organizations focusing their attention on its use, as never before. They increasingly understand that the open source content within an application must be detected. Once uncovered, decisions regarding compliance with intellectual property licensing obligations must be made and known security vulnerabilities must be remediated. It is no longer sufficient from a risk perspective to not address both open source issues.
Resumo:
The Free Open Source Software (FOSS) seem far from the military field but in some cases, some technologies normally used for civilian purposes may have military applications. These products and technologies are called dual-use. Can we manage to combine FOSS and dual-use products? On one hand, we have to admit that this kind of association exists - dual-use software can be FOSS and many examples demonstrate this duality - but on the other hand, dual-use software available under free licenses lead us to ask many questions. For example, the dual-use export control laws aimed at stemming the proliferation of weapons of mass destruction. Dual-use export in United States (ITAR) and Europe (regulation 428/2009) implies as a consequence the prohibition or regulation of software exportation, involving the closing of source code. Therefore, the issues of exported softwares released under free licenses arises. If software are dual-use goods and serve for military purposes, they may represent a danger. By the rights granted to licenses to run, study, redistribute and distribute modified versions of the software, anyone can access the free dual-use software. So, the licenses themselves are not at the origin of the risk, it is actually linked to the facilitated access to source codes. Seen from this point of view, it goes against the dual-use regulation which allows states to control these technologies exportation. For this analysis, we will discuss about various legal questions and draft answers from either licenses or public policies in this respect.
Resumo:
Poster at Open Repositories 2014, Helsinki, Finland, June 9-13, 2014
Resumo:
Code clones are portions of source code which are similar to the original program code. The presence of code clones is considered as a bad feature of software as the maintenance of software becomes difficult due to the presence of code clones. Methods for code clone detection have gained immense significance in the last few years as they play a significant role in engineering applications such as analysis of program code, program understanding, plagiarism detection, error detection, code compaction and many more similar tasks. Despite of all these facts, several features of code clones if properly utilized can make software development process easier. In this work, we have pointed out such a feature of code clones which highlight the relevance of code clones in test sequence identification. Here program slicing is used in code clone detection. In addition, a classification of code clones is presented and the benefit of using program slicing in code clone detection is also mentioned in this work.
Clean Code vs Dirty Code : Ett fältexperiment för att förklara hur Clean Code påverkar kodförståelse
Resumo:
Stora och komplexa kodbaser med bristfällig kodförståelse är ett problem som blir allt vanligare bland företag idag. Bristfällig kodförståelse resulterar i längre tidsåtgång vid underhåll och modifiering av koden, vilket för ett företag leder till ökade kostnader. Clean Code anses enligt somliga vara lösningen på detta problem. Clean Code är en samling riktlinjer och principer för hur man skriver kod som är enkel att förstå och underhålla. Ett kunskapsglapp identifierades vad gäller empirisk data som undersöker Clean Codes påverkan på kodförståelse. Studiens frågeställning var: Hur påverkas förståelsen vid modifiering av kod som är refaktoriserad enligt Clean Code principerna för namngivning och att skriva funktioner? För att undersöka hur Clean Code påverkar kodförståelsen utfördes ett fältexperiment tillsammans med företaget CGM Lab Scandinavia i Borlänge, där data om tidsåtgång och upplevd förståelse hos testdeltagare samlades in och analyserades. Studiens resultat visar ingen tydlig förbättring eller försämring av kodförståelsen då endast den upplevda kodförståelsen verkar påverkas. Alla testdeltagare föredrar Clean Code framför Dirty Code även om tidsåtgången inte påverkas. Detta leder fram till slutsatsen att Clean Codes effekter kanske inte är omedelbara då utvecklare inte hunnit anpassa sig till Clean Code, och därför inte kan utnyttja det till fullo. Studien ger en fingervisning om Clean Codes potential att förbättra kodförståelsen.
Resumo:
The applications of the Finite Element Method (FEM) for three-dimensional domains are already well documented in the framework of Computational Electromagnetics. However, despite the power and reliability of this technique for solving partial differential equations, there are only a few examples of open source codes available and dedicated to the solid modeling and automatic constrained tetrahedralization, which are the most time consuming steps in a typical three-dimensional FEM simulation. Besides, these open source codes are usually developed separately by distinct software teams, and even under conflicting specifications. In this paper, we describe an experiment of open source code integration for solid modeling and automatic mesh generation. The integration strategy and techniques are discussed, and examples and performance results are given, specially for complicated and irregular volumes which are not simply connected. © 2011 IEEE.