950 resultados para Software repository mining. Process mining. Software developer contribution
Resumo:
We describe the use of log file analysis to investigate whether the use of CSCL applications corresponds to its didactical purposes. Exemplarily we examine the use of the web-based system CommSy as software support for project-oriented university courses. We present two findings: (1) We suggest measures to shape the context of CSCL applications and support their initial and continuous use. (2) We show how log files can be used to analyze how, when and by whom a CSCL system is used and thus help to validate further empirical findings. However, log file analyses can only be interpreted reasonably when additional data concerning the context of use is available.
Resumo:
We present the results of an investigation into the nature of the information needs of software developers who work in projects that are part of larger ecosystems. In an open- question survey we asked framework and library developers about their information needs with respect to both their upstream and downstream projects. We investigated what kind of information is required, why is it necessary, and how the developers obtain this information. The results show that the downstream needs are grouped into three categories roughly corresponding to the different stages in their relation with an upstream: selection, adop- tion, and co-evolution. The less numerous upstream needs are grouped into two categories: project statistics and code usage. The current practices part of the study shows that to sat- isfy many of these needs developers use non-specific tools and ad hoc methods. We believe that this is a largely unexplored area of research.
Resumo:
IT has turned out to be a key factor for the purposes of gaining maturity in Business Process Management (BPM). This book presents a worldwide investigation that was conducted among companies from the ‘Forbes Global 2000’ list to explore the current usage of software throughout the BPM life cycle and to identify the companies’ requirements concerning process modelling. The responses from 130 companies indicate that, at the present time, it is mainly software for process description and analysis that is required, while process execution is supported by general software such as databases, ERP systems and office tools. The resulting complex system landscapes give rise to distinct requirements for BPM software, while the process modelling requirements can be equally satisfied by the most common languages (BPMN, UML, EPC).
Resumo:
Large amounts of animal health care data are present in veterinary electronic medical records (EMR) and they present an opportunity for companion animal disease surveillance. Veterinary patient records are largely in free-text without clinical coding or fixed vocabulary. Text-mining, a computer and information technology application, is needed to identify cases of interest and to add structure to the otherwise unstructured data. In this study EMR's were extracted from veterinary management programs of 12 participating veterinary practices and stored in a data warehouse. Using commercially available text-mining software (WordStat™), we developed a categorization dictionary that could be used to automatically classify and extract enteric syndrome cases from the warehoused electronic medical records. The diagnostic accuracy of the text-miner for retrieving cases of enteric syndrome was measured against human reviewers who independently categorized a random sample of 2500 cases as enteric syndrome positive or negative. Compared to the reviewers, the text-miner retrieved cases with enteric signs with a sensitivity of 87.6% (95%CI, 80.4-92.9%) and a specificity of 99.3% (95%CI, 98.9-99.6%). Automatic and accurate detection of enteric syndrome cases provides an opportunity for community surveillance of enteric pathogens in companion animals.
Resumo:
We present the results of an investigation into the nature of information needs of software developers who work in projects that are part of larger ecosystems. This work is based on a quantitative survey of 75 professional software developers. We corroborate the results identified in the sur- vey with needs and motivations proposed in a previous sur- vey and discover that tool support for developers working in an ecosystem context is even more meager than we thought: mailing lists and internet search are the most popular tools developers use to satisfy their ecosystem-related information needs.
Resumo:
Detecting bugs as early as possible plays an important role in ensuring software quality before shipping. We argue that mining previous bug fixes can produce good knowledge about why bugs happen and how they are fixed. In this paper, we mine the change history of 717 open source projects to extract bug-fix patterns. We also manually inspect many of the bugs we found to get insights into the contexts and reasons behind those bugs. For instance, we found out that missing null checks and missing initializations are very recurrent and we believe that they can be automatically detected and fixed.
Resumo:
Dynamically typed languages lack information about the types of variables in the source code. Developers care about this information as it supports program comprehension. Ba- sic type inference techniques are helpful, but may yield many false positives or negatives. We propose to mine information from the software ecosys- tem on how frequently given types are inferred unambigu- ously to improve the quality of type inference for a single system. This paper presents an approach to augment existing type inference techniques by supplementing the informa- tion available in the source code of a project with data from other projects written in the same language. For all available projects, we track how often messages are sent to instance variables throughout the source code. Predictions for the type of a variable are made based on the messages sent to it. The evaluation of a proof-of-concept prototype shows that this approach works well for types that are sufficiently popular, like those from the standard librarie, and tends to create false positives for unpopular or domain specific types. The false positives are, in most cases, fairly easily identifiable. Also, the evaluation data shows a substantial increase in the number of correctly inferred types when compared to the non-augmented type inference.
Resumo:
Usability is the capability of the software product to be understood, learned, used and attractive to the user, when used under specified conditions. Many studies demonstrate the benefits of usability, yet to this day software products continue to exhibit consistently low levels of this quality attribute. Furthermore, poor usability in software systems contributes largely to software failing in actual use. One of the main disciplines involved in usability is that of Human-Computer Interaction (HCI). Over the past two decades the HCI community has proposed specific features that should be present in applications to improve their usability, yet incorporating them into software continues to be far from trivial for software developers. These difficulties are due to multiple factors, including the high level of abstraction at which these HCI recommendations are made and how far removed they are from actual software implementation. In order to bridge this gap, the Software Engineering community has long proposed software design solutions to help developers include usability features into software, however, the problem remains an open research question. This doctoral thesis addresses the problem of helping software developers include specific usability features into their applications by providing them with a structured and tangible guidance in the form of a process, which we have termed the Usability-Oriented Software Development Process. This process is supported by a set of Software Usability Guidelines that help developers to incorporate a set of eleven usability features with high impact on software design. After developing the Usability-oriented Software Development Process and the Software Usability Guidelines, they have been validated across multiple academic projects and proven to help software developers to include such usability features into their software applications. In doing so, their use significantly reduced development time and improved the quality of the resulting designs of these projects. Furthermore, in this work we propose a software tool to automate the application of the proposed process. In sum, this work contributes to the integration of the Software Engineering and HCI disciplines providing a framework that helps software developers to create usable applications in an efficient way.
Resumo:
Software evolution, and particularly its growth, has been mainly studied at the file (also sometimes referred as module) level. In this paper we propose to move from the physical towards a level that includes semantic information by using functions or methods for measuring the evolution of a software system. We point out that use of functions-based metrics has many advantages over the use of files or lines of code. We demonstrate our approach with an empirical study of two Free/Open Source projects: a community-driven project, Apache, and a company-led project, Novell Evolution. We discovered that most functions never change; when they do their number of modifications is correlated with their size, and that very few authors who modify each; finally we show that the departure of a developer from a software project slows the evolution of the functions that she authored.
A repository for integration of software artifacts with dependency resolution and federation support
Resumo:
While developing new IT products, reusability of existing components is a key aspect that can considerably improve the success rate. This fact has become even more important with the rise of the open source paradigm. However, integrating different products and technologies is not always an easy task. Different communities employ different standards and tools, and most times is not clear which dependencies a particular piece of software has. This is exacerbated by the transitive nature of these dependencies, making component integration a complicated affair. To help reducing this complexity we propose a model-based repository, capable of automatically resolve the required dependencies. This repository needs to be expandable, so new constraints can be analyzed, and also have federation support, for the integration with other sources of artifacts. The solution we propose achieves these working with OSGi components and using OSGi itself.
Resumo:
Ubiquitous computing software needs to be autonomous so that essential decisions such as how to configure its particular execution are self-determined. Moreover, data mining serves an important role for ubiquitous computing by providing intelligence to several types of ubiquitous computing applications. Thus, automating ubiquitous data mining is also crucial. We focus on the problem of automatically configuring the execution of a ubiquitous data mining algorithm. In our solution, we generate configuration decisions in a resource aware and context aware manner since the algorithm executes in an environment in which the context often changes and computing resources are often severely limited. We propose to analyze the execution behavior of the data mining algorithm by mining its past executions. By doing so, we discover the effects of resource and context states as well as parameter settings on the data mining quality. We argue that a classification model is appropriate for predicting the behavior of an algorithm?s execution and we concentrate on decision tree classifier. We also define taxonomy on data mining quality so that tradeoff between prediction accuracy and classification specificity of each behavior model that classifies by a different abstraction of quality, is scored for model selection. Behavior model constituents and class label transformations are formally defined and experimental validation of the proposed approach is also performed.
Resumo:
The focus of this paper is to outline the main structure of an alternative solution to implement a Software Process Improvement program in Small-Settings using the outsourcing infrastructure. This solution takes the advantages of the traditional outsourcing models and applies its structure to propose an alternative solution to make available a Software Process Improvement program for Small-Settings. With this outsourcing solution it is possible share the resources between several Small-Settings.
Resumo:
The focus of this paper is to outline the main structure of an alternative software process improvement method for small- and medium-size enterprises. This method is based on the action package concept, which helps to institutionalize the effective practices with affordable implementation costs. This paper also presents the results and lessons learned when this method was applied to three enterprises in the requirements engineering domain.
Resumo:
This research is concerned with the experimental software engineering area, specifically experiment replication. Replication has traditionally been viewed as a complex task in software engineering. This is possibly due to the present immaturity of the experimental paradigm applied to software development. Researchers usually use replication packages to replicate an experiment. However, replication packages are not the solution to all the information management problems that crop up when successive replications of an experiment accumulate. This research borrows ideas from the software configuration management and software product line paradigms to support the replication process. We believe that configuration management can help to manage and administer information from one replication to another: hypotheses, designs, data analysis, etc. The software product line paradigm can help to organize and manage any changes introduced into the experiment by each replication. We expect the union of the two paradigms in replication to improve the planning, design and execution of further replications and their alignment with existing replications. Additionally, this research work will contribute a web support environment for archiving information related to different experiment replications. Additionally, it will provide flexible enough information management support for running replications with different numbers and types of changes. Finally, it will afford massive storage of data from different replications. Experimenters working collaboratively on the same experiment must all have access to the different experiments.