991 results for Software defect prediction


Relevance:

100.00% 100.00%

Publisher:

Abstract:

Most empirical disciplines promote the reuse and sharing of datasets, as it makes replication more feasible. While this is increasingly the case in Empirical Software Engineering (ESE), some of the most popular bug-fix datasets are now known to be biased. This raises two significant concerns: first, that sample bias may lead to underperforming prediction models, and second, that the external validity of studies based on biased datasets may be suspect. This issue has raised considerable consternation in the ESE literature in recent years. However, there is a confounding factor in these datasets that has not been examined carefully: size. Biased datasets sample only some of the data that could be sampled, and do so in a biased fashion; but biased samples can be smaller or larger. Smaller datasets in general provide less reliable bases for estimating models, and thus could lead to inferior model performance. In this setting, we ask which affects performance more: bias or size? We conduct a detailed, large-scale meta-analysis, using simulated datasets sampled with bias from a high-quality dataset that is relatively free of bias. Our results suggest that size always matters just as much as bias direction, and in fact much more than bias direction when considering information-retrieval measures such as AUC and F-score. This indicates that, at least for prediction models, even when dealing with sampling bias, simply finding larger samples can sometimes be sufficient. Our analysis also exposes the complexity of the bias issue, and raises further issues to be explored in the future.
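The two information-retrieval measures named in this abstract can be computed as in the following minimal sketch. The function names `f_score` and `auc` and the toy inputs are illustrative assumptions, not taken from the paper, and the F-score shown is the common F1 variant.

```python
from typing import Sequence


def f_score(tp: int, fp: int, fn: int) -> float:
    """F1 score: harmonic mean of precision and recall."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the fraction of (defective, clean) pairs in which the
    defective item receives the higher score; ties count as half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: label 1 = defective, 0 = clean.
example_auc = auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0])
example_f1 = f_score(tp=8, fp=2, fn=2)
```

The pairwise form of AUC is quadratic in the sample size; it is used here only because it makes the definition explicit.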

Relevance:

100.00% 100.00%

Publisher:

Abstract:

As users continually request additional functionality, software systems will continue to grow in complexity, as well as in their susceptibility to failures. Particularly for sensitive systems requiring higher levels of reliability, faulty system modules may increase development and maintenance costs. Hence, identifying them early would support the development of reliable systems through improved scheduling and quality control. As a consequence, substantial research effort has gone into predicting which software modules are likely to contain faults. Although a wide range of fault prediction models have been proposed, we remain far from having reliable tools that can be widely applied to real industrial systems. For projects with known fault histories, numerous research studies show that statistical models using software metrics can provide reasonable predictions of faulty modules. However, as context-specific metrics differ from project to project, predicting across projects is difficult. Prediction models obtained from one project's experience are ineffective at predicting fault-prone modules when applied to other projects. Hence, the ability to take full benefit of existing work in the software development community has been substantially limited. As a step towards solving this problem, in this dissertation we propose a fault prediction approach that exploits existing prediction models, adapting them to improve their ability to predict faulty system modules across different software projects.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The availability of a huge amount of source code from code archives and open-source projects opens up the possibility of merging the machine learning, programming languages, and software engineering research fields. This area is often referred to as Big Code, where programming languages are treated in place of natural languages and different features and patterns of code can be exploited to perform many useful tasks and build supportive tools. Among all the possible applications that can be developed within the area of Big Code, the work presented in this research thesis mainly focuses on two particular tasks: Programming Language Identification (PLI) and Software Defect Prediction (SDP) for source code. Programming language identification is commonly needed in program comprehension and is usually performed directly by developers. However, at large scale, such as in widely used archives (GitHub, Software Heritage), automation of this task is desirable. To accomplish this aim, the problem is analyzed from different points of view (text and image-based learning approaches) and different models are created, paying particular attention to their scalability. Software defect prediction is a fundamental step in software development for improving quality and assuring the reliability of software products. In the past, defects were searched for by manual inspection or using automatic static and dynamic analyzers. Now, the automation of this task can be tackled using learning approaches that can speed up and improve related procedures. Here, two models have been built and analyzed to detect some of the commonest bugs and errors at different code granularity levels (file and method levels). The data used and the models’ architectures are analyzed and described in detail. Quantitative and qualitative results are reported for both the PLI and SDP tasks, and differences from and similarities to related works are discussed.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Software faults are expensive and cause serious damage, particularly if discovered late or not at all. Some software faults tend to be hidden. One goal of the thesis is to establish the status quo in the field of software fault elimination, since there are no recent surveys of the whole area. A basis for a structural framework is proposed for this unstructured field, paying attention to compatibility and how to find studies. Means of bug elimination are surveyed, including bug know-how, defect prevention and prediction, analysis, testing, and fault tolerance. The most common research issues for each area are identified and discussed, along with issues that do not get enough attention. Recommendations are presented for software developers, researchers, and teachers. Only the main lines of research are covered. The main emphasis is on technical aspects. The survey was done by performing searches in the IEEE, ACM, Elsevier, and Inspec databases. In addition, a systematic search was done for a few well-known related journals from recent time intervals. Some other journals, some conference proceedings, and a few books, reports, and Internet articles have been investigated, too. The following problems were found and solutions for them discussed. That quality assurance is testing only is a common misunderstanding, and many checks are done and some methods applied only in the late testing phase. Many types of static review are almost forgotten even though they reveal faults that are hard to detect by other means. Other forgotten areas are knowledge of bugs, knowledge of continuously repeated bugs, and lightweight means to increase reliability. Compatibility between studies is not always good, which also makes documents harder to understand. Some means, methods, and problems are considered method- or domain-specific when they are not. The field lacks cross-field research.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Mining Software Repositories (MSR) is a research area that analyses software repositories in order to derive relevant information for the research and practice of software engineering. The main goal of repository mining is to turn static information from repositories (e.g. a code repository or change request system) into valuable information that supports decision making in software projects. Another research area, Process Mining (PM), aims to find the characteristics of the underlying process of business organizations, supporting process improvement and documentation. Recent works have performed several analyses using MSR and PM techniques: (i) to investigate the evolution of software projects; (ii) to understand the real underlying process of a project; and (iii) to create defect prediction models. However, few research works have focused on analyzing the contributions of software developers by means of MSR and PM techniques. In this context, this dissertation proposes two empirical studies that assess the contribution of software developers to an open-source project and a commercial project using those techniques. The contributions of developers are assessed from three different perspectives: (i) buggy commits; (ii) the size of commits; and (iii) the most important bugs. For the open-source project, 12,827 commits and 8,410 bugs were analyzed, while 4,663 commits and 1,898 bugs were analyzed for the commercial project. Our results indicate that, for the open-source project, the developers classified as core developers contributed more buggy commits (although they also contributed the majority of commits), more code to the project (commit size), and more solutions to important bugs, whereas for the commercial project the results showed no statistically significant differences between developer groups.

Relevance:

80.00% 80.00%

Publisher:

Abstract:

This project carries out an acoustic study of the Auditorio Rafael Frühbeck de Burgos, following the requirements of the standard UNE-EN ISO 3382-1:2010, “Measurement of room acoustic parameters, Part 1: Performance spaces”. Two acoustic studies are performed on the same hall. In the first, the hall is configured for events such as conferences or congresses, where speech intelligibility is a determining factor. In the second, the hall is configured for musical performances such as symphony orchestra or chamber music concerts. In this configuration, speech intelligibility matters less than the correct interpretation and enjoyment of the music by the audience. For both configurations, the data were processed statistically to obtain a single value for each acoustic parameter studied. The results for the two configurations are then compared, and the value of each acoustic parameter is evaluated to determine whether it meets the acoustic needs of the type of event held. In addition, a computer geometric model of the hall was built for both acoustic configurations using the professional acoustic prediction and simulation software EASE. An acoustic study of the geometric model is carried out by simulation, following the same procedure used for the in situ measurements. The simulation results are compared with the in situ measurements in order to validate the geometric model. The acoustic parameter initially chosen to validate the model is the reverberation time. If the geometric model validates well, it can be used to make acoustic predictions by simulation when a sound reinforcement system is used in the hall. The sound reinforcement system installed in the hall under study was not used in this project.

Relevance:

40.00% 40.00%

Publisher:

Abstract:

Consider the statement "this project should cost X and has risk of Y". Such statements are used daily in industry as the basis for making decisions. The work reported here is part of a study aimed at providing a rational and pragmatic basis for such statements. Of particular interest are predictions made in the requirements and early phases of projects. A preliminary model has been constructed using Bayesian Belief Networks and in support of this, a programme to collect and study data during the execution of various software development projects commenced in May 2002. The data collection programme is undertaken under the constraints of a commercial industrial regime of multiple concurrent small to medium scale software development projects. Guided by pragmatism, the work is predicated on the use of data that can be collected readily by project managers; including expert judgements, effort, elapsed times and metrics collected within each project.

Relevance:

40.00% 40.00%

Publisher:

Abstract:

Managing software maintenance is rarely a precise task, owing to uncertainties in resource and service descriptions. Even when a well-established maintenance process is followed, the risk of delaying tasks remains if the new services are not precisely described or when resources change during process execution. Also, the delay of a task at an early process stage may translate into a different delay at the end of the process, depending on complexity or service reliability requirements. This paper presents a knowledge-based representation (Bayesian Networks) for maintenance project delays based on specialists' experience, and a corresponding tool to help in managing software maintenance projects. (c) 2006 Elsevier Ltd. All rights reserved.

Relevance:

40.00% 40.00%

Publisher:

Abstract:

Objectives: To compare simulated periodontal bone defect depth measured in digital radiographs with dedicated and non-dedicated software systems and to compare the depth measurements from each program with the measurements in dry mandibles. Methods: Forty periodontal bone defects were created at the proximal area of the first premolar in dry pig mandibles. Measurements of the defects were performed with a periodontal probe in the dry mandible. Periapical digital radiographs of the defects were recorded using the Schick sensor in a standardized exposure setting. All images were read using a Schick dedicated software system (CDR DICOM for Windows v.3.5), and three commonly available non-dedicated software systems (Vix Win 2000 v.1.2; Adobe Photoshop 7.0 and Image Tool 3.0). The defects were measured three times in each image and a consensus was reached among three examiners using the four software systems. The difference between the radiographic measurements was analysed using analysis of variance (ANOVA) and by comparing the measurements from each software system with the dry mandible measurements using Student's t-test. Results: The mean values of the bone defects measured in the radiographs were 5.07 mm, 5.06 mm, 5.01 mm and 5.11 mm for CDR Digital Image and Communication in Medicine (DICOM) for Windows, Vix Win, Adobe Photoshop, and Image Tool, respectively, and 6.67 mm for the dry mandible. The means of the measurements performed in the four software systems were not significantly different, ANOVA (P = 0.958). A significant underestimation of defect depth was obtained when we compared the mean depths from each software system with the dry mandible measurements (t-test; P ≅ 0.000). Conclusions: The periodontal bone defect measurements in dedicated and in three non-dedicated software systems were not significantly different, but they all underestimated the measurements when compared with the measurements obtained in the dry mandibles.
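The size of the underestimation follows directly from the means reported in this abstract. A minimal sketch of the arithmetic, in which the variable and function names are illustrative and only the numeric values come from the abstract:

```python
# Mean defect depths in mm, as reported in the abstract.
radiographic_means = {
    "CDR DICOM for Windows": 5.07,
    "Vix Win": 5.06,
    "Adobe Photoshop": 5.01,
    "Image Tool": 5.11,
}
dry_mandible_mean = 6.67  # probe measurement on the dry mandible, in mm


def underestimation(means: dict, reference: float) -> dict:
    """Difference between the reference depth and each radiographic mean,
    rounded to two decimals to match the precision of the reported values."""
    return {name: round(reference - value, 2) for name, value in means.items()}


gaps = underestimation(radiographic_means, dry_mandible_mean)
for name, gap in gaps.items():
    print(f"{name}: {gap:.2f} mm below the dry mandible mean")
```

Each system falls roughly 1.6 mm short of the direct measurement, while the four systems differ from one another by at most 0.10 mm, which is consistent with the reported ANOVA and t-test conclusions.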

Relevance:

40.00% 40.00%

Publisher:

Abstract:

Traction prediction modelling, a key factor in farm tractor design, has been driven by the need to find the answer to this question without having to build physical prototypes. A wide range of theories and their respective algorithms can be used in such predictions. The “Tractors and Tillage” research team at the Polytechnic University of Madrid, which engages, among others, in traction prediction for farm tractors, has developed a series of programs based on the cone index as the parameter representative of the terrain. With the software introduced in the present paper, written in Visual Basic, slip can be predicted in two- and four-wheel drive tractors using any one of four models. It includes databases for tractors, front tyres, rear tyres and working conditions (soil cone index and drawbar pull exerted). The results can be exported in spreadsheet format.

Relevance:

40.00% 40.00%

Publisher:

Abstract:

Transportation Department, Office of University Research, Washington, D.C.