950 resultados para Software repository mining. Process mining. Software developer contribution
Resumo:
This article describes some approaches to problem of testing and documenting automation in information systems with graphical user interface. Combination of data mining methods and theory of finite state machines is used for testing automation. Automated creation of software documentation is based on using metadata in documented system. Metadata is built on graph model. Described approaches improve performance and quality of testing and documenting processes.
Resumo:
Dimensionality reduction is a very important step in the data mining process. In this paper, we consider feature extraction for classification tasks as a technique to overcome problems occurring because of “the curse of dimensionality”. Three different eigenvector-based feature extraction approaches are discussed and three different kinds of applications with respect to classification tasks are considered. The summary of obtained results concerning the accuracy of classification schemes is presented with the conclusion about the search for the most appropriate feature extraction method. The problem how to discover knowledge needed to integrate the feature extraction and classification processes is stated. A decision support system to aid in the integration of the feature extraction and classification processes is proposed. The goals and requirements set for the decision support system and its basic structure are defined. The means of knowledge acquisition needed to build up the proposed system are considered.
Resumo:
This dissertation established a software-hardware integrated design for a multisite data repository in pediatric epilepsy. A total of 16 institutions formed a consortium for this web-based application. This innovative fully operational web application allows users to upload and retrieve information through a unique human-computer graphical interface that is remotely accessible to all users of the consortium. A solution based on a Linux platform with My-SQL and Personal Home Page scripts (PHP) has been selected. Research was conducted to evaluate mechanisms to electronically transfer diverse datasets from different hospitals and collect the clinical data in concert with their related functional magnetic resonance imaging (fMRI). What was unique in the approach considered is that all pertinent clinical information about patients is synthesized with input from clinical experts into 4 different forms, which were: Clinical, fMRI scoring, Image information, and Neuropsychological data entry forms. A first contribution of this dissertation was in proposing an integrated processing platform that was site and scanner independent in order to uniformly process the varied fMRI datasets and to generate comparative brain activation patterns. The data collection from the consortium complied with the IRB requirements and provides all the safeguards for security and confidentiality requirements. An 1-MR1-based software library was used to perform data processing and statistical analysis to obtain the brain activation maps. Lateralization Index (LI) of healthy control (HC) subjects in contrast to localization-related epilepsy (LRE) subjects were evaluated. Over 110 activation maps were generated, and their respective LIs were computed yielding the following groups: (a) strong right lateralization: (HC=0%, LRE=18%), (b) right lateralization: (HC=2%, LRE=10%), (c) bilateral: (HC=20%, LRE=15%), (d) left lateralization: (HC=42%, LRE=26%), e) strong left lateralization: (HC=36%, LRE=31%). Moreover, nonlinear-multidimensional decision functions were used to seek an optimal separation between typical and atypical brain activations on the basis of the demographics as well as the extent and intensity of these brain activations. The intent was not to seek the highest output measures given the inherent overlap of the data, but rather to assess which of the many dimensions were critical in the overall assessment of typical and atypical language activations with the freedom to select any number of dimensions and impose any degree of complexity in the nonlinearity of the decision space.
Resumo:
Software product line engineering promotes large software reuse by developing a system family that shares a set of developed core features, and enables the selection and customization of a set of variabilities that distinguish each software product family from the others. In order to address the time-to-market, the software industry has been using the clone-and-own technique to create and manage new software products or product lines. Despite its advantages, the clone-and-own approach brings several difficulties for the evolution and reconciliation of the software product lines, especially because of the code conflicts generated by the simultaneous evolution of the original software product line, called Source, and its cloned products, called Target. This thesis proposes an approach to evolve and reconcile cloned products based on mining software repositories and code conflict analysis techniques. The approach provides support to the identification of different kinds of code conflicts – lexical, structural and semantics – that can occur during development task integration – bug correction, enhancements and new use cases – from the original evolved software product line to the cloned product line. We have also conducted an empirical study of characterization of the code conflicts produced during the evolution and merging of two large-scale web information system product lines. The results of our study demonstrate the approach potential to automatically or semi-automatically solve several existing code conflicts thus contributing to reduce the complexity and costs of the reconciliation of cloned software product lines.
Resumo:
Soft skills and teamwork practices were identi ed as the main de ciencies of recent graduates in computer courses. This issue led to a realization of a qualitative research aimed at investigating the challenges faced by professors of those courses in conducting, monitoring and assessing collaborative software development projects. Di erent challenges were reported by teachers, including di culties in the assessment of students both in the collective and individual levels. In this context, a quantitative research was conducted with the aim to map soft skill of students to a set of indicators that can be extracted from software repositories using data mining techniques. These indicators are aimed at measuring soft skills, such as teamwork, leadership, problem solving and the pace of communication. Then, a peer assessment approach was applied in a collaborative software development course of the software engineering major at the Federal University of Rio Grande do Norte (UFRN). This research presents a correlation study between the students' soft skills scores and indicators based on mining software repositories. This study contributes: (i) in the presentation of professors' perception of the di culties and opportunities for improving management and monitoring practices in collaborative software development projects; (ii) in investigating relationships between soft skills and activities performed by students using software repositories; (iii) in encouraging the development of soft skills and the use of software repositories among software engineering students; (iv) in contributing to the state of the art of three important areas of software engineering, namely software engineering education, educational data mining and human aspects of software engineering.
Resumo:
Soft skills and teamwork practices were identi ed as the main de ciencies of recent graduates in computer courses. This issue led to a realization of a qualitative research aimed at investigating the challenges faced by professors of those courses in conducting, monitoring and assessing collaborative software development projects. Di erent challenges were reported by teachers, including di culties in the assessment of students both in the collective and individual levels. In this context, a quantitative research was conducted with the aim to map soft skill of students to a set of indicators that can be extracted from software repositories using data mining techniques. These indicators are aimed at measuring soft skills, such as teamwork, leadership, problem solving and the pace of communication. Then, a peer assessment approach was applied in a collaborative software development course of the software engineering major at the Federal University of Rio Grande do Norte (UFRN). This research presents a correlation study between the students' soft skills scores and indicators based on mining software repositories. This study contributes: (i) in the presentation of professors' perception of the di culties and opportunities for improving management and monitoring practices in collaborative software development projects; (ii) in investigating relationships between soft skills and activities performed by students using software repositories; (iii) in encouraging the development of soft skills and the use of software repositories among software engineering students; (iv) in contributing to the state of the art of three important areas of software engineering, namely software engineering education, educational data mining and human aspects of software engineering.
Resumo:
The objective of D6.1 is to make the Ecosystem software platform with underlying Software Repository, Digital Library and Media Archive available to the degree, that the RAGE project can start collecting content in the form of software assets, and documents of various media types. This paper describes the current state of the Ecosystem as of month 12 of the project, and documents the structure of the Ecosystem, individual components, integration strategies, and overall approach. The deliverable itself is the deployment of the described components, which is now available to collect and curate content. Whilst this version is not yet feature complete, full realization is expected within the next few months. Following this development, WP6 will continue to add features driven by the business models to be defined by WP7 later on in the project.
Resumo:
With the increasing complexity of today's software, the software development process is becoming highly time and resource consuming. The increasing number of software configurations, input parameters, usage scenarios, supporting platforms, external dependencies, and versions plays an important role in expanding the costs of maintaining and repairing unforeseeable software faults. To repair software faults, developers spend considerable time in identifying the scenarios leading to those faults and root-causing the problems. While software debugging remains largely manual, it is not the case with software testing and verification. The goal of this research is to improve the software development process in general, and software debugging process in particular, by devising techniques and methods for automated software debugging, which leverage the advances in automatic test case generation and replay. In this research, novel algorithms are devised to discover faulty execution paths in programs by utilizing already existing software test cases, which can be either automatically or manually generated. The execution traces, or alternatively, the sequence covers of the failing test cases are extracted. Afterwards, commonalities between these test case sequence covers are extracted, processed, analyzed, and then presented to the developers in the form of subsequences that may be causing the fault. The hypothesis is that code sequences that are shared between a number of faulty test cases for the same reason resemble the faulty execution path, and hence, the search space for the faulty execution path can be narrowed down by using a large number of test cases. To achieve this goal, an efficient algorithm is implemented for finding common subsequences among a set of code sequence covers. Optimization techniques are devised to generate shorter and more logical sequence covers, and to select subsequences with high likelihood of containing the root cause among the set of all possible common subsequences. A hybrid static/dynamic analysis approach is designed to trace back the common subsequences from the end to the root cause. A debugging tool is created to enable developers to use the approach, and integrate it with an existing Integrated Development Environment. The tool is also integrated with the environment's program editors so that developers can benefit from both the tool suggestions, and their source code counterparts. Finally, a comparison between the developed approach and the state-of-the-art techniques shows that developers need only to inspect a small number of lines in order to find the root cause of the fault. Furthermore, experimental evaluation shows that the algorithm optimizations lead to better results in terms of both the algorithm running time and the output subsequence length.
Resumo:
Hydrographic and geosciences surveys, using acoustic devices, need to use accurate water sound velocity profiles. Because the acoustic path depends on the sound velocity profile (SVP), the use of the most accurate SVP is one of the keys to conducting effective surveys (with multibeams, for instance). To date, the existing software available does not answer to both the needs of efficiency and simplicity (sometimes not so easy to operate, sometimes not so accurate). DORIS provides a handy freeware to post-process SVP for the hydrographic communities.