926 results for software bug
Abstract:
Software faults are expensive and cause serious damage, particularly if discovered late or not at all. Some software faults tend to stay hidden. One goal of the thesis is to establish the status quo in the field of software fault elimination, since there are no recent surveys of the whole area. A basis for a structural framework is proposed for this unstructured field, paying attention to compatibility and to how studies can be found. Means of bug elimination are surveyed, including bug know-how, defect prevention and prediction, analysis, testing, and fault tolerance. The most common research issues for each area are identified and discussed, along with issues that do not get enough attention. Recommendations are presented for software developers, researchers, and teachers. Only the main lines of research are covered, and the main emphasis is on technical aspects. The survey was done by performing searches in the IEEE, ACM, Elsevier, and Inspec databases. In addition, a systematic search was done for a few well-known related journals from recent time intervals. Some other journals, some conference proceedings and a few books, reports, and Internet articles were also investigated. The following problems were found and solutions for them discussed. A common misunderstanding is that quality assurance means testing only, and many checks are done and some methods applied only in the late testing phase. Many types of static review are almost forgotten even though they reveal faults that are hard to detect by other means. Other forgotten areas are knowledge of bugs, awareness of continuously repeated bugs, and lightweight means to increase reliability. Compatibility between studies is not always good, which also makes documents harder to understand. Some means, methods, and problems are considered method- or domain-specific when they are not. The field lacks cross-field research.
Abstract:
Security defects are common in large software systems because of their size and complexity. Although efficient development processes, testing, and maintenance policies are applied to software systems, a large number of vulnerabilities can still remain. Some vulnerabilities stay in a system from one release to the next because they cannot be easily reproduced through testing. These vulnerabilities endanger the security of the systems. We propose vulnerability classification and prediction frameworks based on vulnerability reproducibility. The frameworks are effective at identifying the types and locations of vulnerabilities at an earlier stage, and at improving the security of the software in its next versions (referred to as releases). We extend an existing concept of software bug classification to vulnerability classification (easily reproducible and hard to reproduce) to develop a classification framework that differentiates between these vulnerabilities based on code fixes and textual reports. We then investigate the potential correlations between the vulnerability categories and the classical software metrics, together with some other runtime environmental factors of reproducibility, to develop a vulnerability prediction framework. The classification and prediction frameworks help developers adopt corresponding mitigation or elimination actions and develop appropriate test cases. The vulnerability prediction framework also helps security experts focus their effort on the top-ranked vulnerability-prone files. As a result, the frameworks decrease the number of attacks that exploit security vulnerabilities in the next versions of the software. To build the classification and prediction frameworks, different machine learning techniques (C4.5 Decision Tree, Random Forest, Logistic Regression, and Naive Bayes) are employed. 
The effectiveness of the proposed frameworks is assessed based on collected software security defects of Mozilla Firefox.
Abstract:
Software bug analysis is one of the most important activities in software quality. The rapid and correct implementation of the necessary repair affects both developers, who must deliver fully functioning software, and users, who need to perform their daily tasks. In this context, an incorrect classification of bugs can lead to unwanted situations. One of the main attributes assigned to a bug when it is first reported is severity, which reflects the urgency of correcting the problem. In this scenario, we identified, in datasets extracted from five open source systems (Apache, Eclipse, Kernel, Mozilla and Open Office), an irregular distribution of bugs across the existing severities, which is an early sign of misclassification. In the dataset analyzed, about 85% of bugs are ranked with normal severity. This classification rate can have a negative influence on software development, since a misclassified bug can be allocated to a developer with little experience in solving it, so that its correction may take longer or even result in an incorrect implementation. Several studies in the literature have disregarded normal bugs, working only with the portion of bugs initially considered severe or non-severe. This work aimed to investigate this portion of the data, in order to identify whether normal severity reflects the real impact and urgency, to investigate whether there are bugs (initially classified as normal) that could be classified with another severity, and to assess whether there are impacts for developers in this context. To this end, an automatic classifier was developed, based on three algorithms (Naïve Bayes, Max Ent and Winnow), to assess whether normal severity is correct for the bugs initially categorized with this severity. 
The algorithms presented accuracy of about 80%, and showed that between 21% and 36% of the bugs should have been classified differently (depending on the algorithm), which represents somewhere between 70,000 and 130,000 bugs of the dataset.
Abstract:
We propose a new approach and related indicators for globally distributed software support and development based on a 3-year process improvement project in a globally distributed engineering company. The company develops, delivers and supports a complex software system with tailored hardware components and unique end-customer installations. By applying the domain knowledge from operations management on lead time reduction and its multiple benefits to process performance, the workflows of globally distributed software development and multitier support processes were measured and monitored throughout the company. The results show that the global end-to-end process visibility and centrally managed reporting at all levels of the organization catalyzed a change process toward significantly better performance. Due to the new performance indicators based on lead times and their variation with fixed control procedures, the case company was able to report faster bug-fixing cycle times, improved response times and generally better customer satisfaction in its global operations. In all, lead times to implement new features and to respond to customer issues and requests were reduced by 50%.
Abstract:
This thesis studies the evaluation of software development practices through error analysis. The work presents the software development process, software testing, software errors, error classification and software process improvement methods. The practical part of the work presents results from the error analysis of one software process and gives improvement ideas for the project. The classification of the error data in the project was found to be inadequate, which made it impossible to use the error data effectively. With the error analysis we were able to show that there were deficiencies in the analysis and design phases, the implementation phase and the testing phase. The work gives ideas for improving error classification and software development practices.
Abstract:
Software is increasingly complex, and its development is often carried out by dispersed and changing teams. Moreover, most software today is reused rather than developed from scratch. The comprehension task, inherent to maintenance tasks, consists of analyzing several dimensions of the software in parallel. The time dimension comes into play at two levels: software changes during its evolution and during its execution. These changes take on a particular meaning when they are analyzed together with other dimensions of the software. The analysis of multidimensional data is a hard problem. However, some methods make it possible to work around this difficulty. Semi-automatic approaches, such as software visualization, allow the user to intervene during the analysis to explore and guide the search for information. In a first step of the thesis, we apply visualization techniques to better understand the dynamics of software during evolution and execution. Changes over time are represented by heat maps. We thus use the same graphical representation to visualize changes during evolution and changes during execution. Another category of approaches that make it possible to understand certain dynamic aspects of software involves the use of heuristics. In a second step of the thesis, we address the identification of phases during evolution or during execution using the same approach. In this context, the premise is that there is an inherent coherence in the events, which makes it possible to isolate subsets as phases. This coherence hypothesis is then defined specifically for code change events (evolution) or state change events (execution). 
The objective of the thesis is to study the unification of these two dimensions of time, evolution and execution. This is part of our aim to bring together two research fields that address the same category of problems from two different perspectives.
Abstract:
High throughput sequencing (HTS) provides new research opportunities for work on non-model organisms, such as differential expression studies between populations exposed to different environmental conditions. However, such transcriptomic studies first require the production of a reference assembly. The choice of sampling procedure, sequencing strategy and assembly workflow is crucial. To develop a reliable reference transcriptome for Triatoma brasiliensis, the major Chagas disease vector in Northeastern Brazil, different de novo assembly protocols were generated using various datasets and software. Both 454 and Illumina sequencing technologies were applied to RNA extracted from antennae and mouthparts from single or pooled individuals. The 454 library yielded 278 Mb. Fifteen Illumina libraries were constructed and yielded nearly 360 million RNA-seq single reads and 46 million RNA-seq paired-end reads, totaling nearly 45 Gb. For the 454 reads, we used three assemblers (Newbler, CAP3 and/or MIRA), and for the Illumina reads, the Trinity assembler. Ten assembly workflows were compared using these programs separately or in combination. To compare the assemblies obtained, quantitative and qualitative criteria were used, including contig length, N50, contig number and the percentage of chimeric contigs. Completeness of the assemblies was estimated using the CEGMA pipeline. The best assembly (57,657 contigs, completeness of 80%, <1% chimeric contigs) was a hybrid assembly, which leads us to recommend (1) using a single individual with a large representation of biological tissues, (2) merging both long reads and short paired-end Illumina reads, and (3) using several assemblers in order to combine the specific advantages of each.
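One of the comparison criteria named above, N50, can be illustrated with a short sketch. It assumes the common definition of N50: the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the total assembled length. The function name `n50` and the sample lengths are illustrative and do not come from the study.

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths (assumed definition:
    the contig length at which the running total of contigs, taken from
    longest to shortest, first reaches half of the total assembly length)."""
    if not contig_lengths:
        raise ValueError("need at least one contig length")
    ordered = sorted(contig_lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Illustrative lengths only: total is 32, and the two longest
# contigs (8 + 8) already cover half of it.
print(n50([2, 2, 2, 3, 3, 4, 8, 8]))  # 8
```

A higher N50 generally indicates a less fragmented assembly, which is why it is usually read alongside contig number and completeness rather than on its own.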
Abstract:
High throughput sequencing capabilities have made the process of assembling a transcriptome easier, whether or not there is a reference genome. But the quality of a transcriptome assembly must be good enough to capture the most comprehensive catalog of transcripts and their variations, and to carry out further experiments on transcriptomics. There is currently no consensus on which of the many sequencing technologies and assembly tools are the most effective. Many non-model organisms lack a reference genome to guide the transcriptome assembly. One question, therefore, is whether or not a reference-based genome assembly gives better results than de novo assembly. The blood-sucking insect Rhodnius prolixus, a vector for Chagas disease, has a reference genome. It is therefore a good model on which to compare reference-based and de novo transcriptome assemblies. In this study, we compared de novo and reference-based genome assembly strategies using three datasets (454, Illumina, 454 combined with Illumina) and various assembly programs. We developed criteria to compare the resulting assemblies: the size distribution and number of transcripts, the proportion of potentially chimeric transcripts, and how complete the assembly was (completeness evaluated both through the CEGMA software and through the fraction of the R. prolixus proteome retrieved). Moreover, we looked for the presence of two chemosensory gene families (Odorant-Binding Proteins and Chemosensory Proteins) to validate the assembly quality. The reference-based assemblies after genome annotation were clearly better than those generated using de novo strategies alone. Reference-based strategies revealed new transcripts, including new isoforms not predicted by automatic genome annotation. However, a combination of both de novo and reference-based strategies gave the best result, and allowed us to assemble fragmented transcripts.
Abstract:
A study of the open source tools used for cooperative software development, of their possible interactions, and of how these facilitate cooperative development.
Abstract:
This thesis examines the design and development of a software system whose main function is to manage reports of problems encountered by users. These reports can concern various areas, such as post-sales reports about software projects (bugs, missing functionality, functional errors...) or reports about order mix-ups (wrong order, order not received...). Each such problem is recorded as a ticket. Once opened, and after an analysis of the problem, the ticket is assigned to an operator who is responsible for resolving it. During this phase, operator and user can exchange additional information through a conversation thread associated with the ticket. The system aims to unify the communication channel between company and customer and to give the company that requested it an efficient system for managing these reports, bringing benefits to both parties, employees and customers, who can provide feedback on the service received. The system was developed for Android devices. The application uses a client-server architecture. The data needed for the application to work is stored in an online database.
Abstract:
Code duplication is common in current programming practice: programmers search for snippets of code, incorporate them into their projects and then modify them to their needs. In today's practice, no automated scheme is in place to inform both parties of any distant changes to the code. As code snippets continue to evolve both on the side of the user and on the side of the author, both may wish to benefit from remote bug fixes or refinements. Authors may also be interested in the actual usage of their code snippets, and researchers could gather information on clone usage. We propose maintaining a link between software clones across repositories and outline how the links can be created and maintained.
Abstract:
For popular software systems, the number of daily submitted bug reports is high. Triaging these incoming reports is a time-consuming task. Part of the bug triage is the assignment of a report to a developer with the appropriate expertise. In this paper, we present an approach to automatically suggest developers who have the appropriate expertise for handling a bug report. We model developer expertise using the vocabulary found in their source code contributions and compare this vocabulary to the vocabulary of bug reports. We evaluate our approach by comparing the suggested experts to the persons who eventually worked on the bug. Using eight years of Eclipse development as a case study, we achieve 33.6% top-1 precision and 71.0% top-10 recall.
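The idea of comparing a developer's source code vocabulary with the vocabulary of a bug report can be sketched as follows. This is a minimal illustration, not the paper's model: it assumes plain term-frequency vectors compared by cosine similarity, and the names `tokenize`, `cosine`, `suggest_developers` and the sample data are all hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Split text into lowercase alphabetic terms (an assumed, simplistic tokenizer)."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def suggest_developers(report_text, contributions, top_k=3):
    """Rank developers by similarity between the report's vocabulary and the
    vocabulary of their contributions (dict: developer name -> contributed text)."""
    report_vec = Counter(tokenize(report_text))
    scored = [
        (cosine(report_vec, Counter(tokenize(code))), dev)
        for dev, code in contributions.items()
    ]
    # Highest similarity first; ties are broken alphabetically for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [dev for _, dev in scored[:top_k]]


# Hypothetical contribution vocabularies and bug report.
contributions = {
    "alice": "parser token syntax tree parser",
    "bob": "widget layout render paint button",
}
ranking = suggest_developers("Crash in parser on malformed syntax", contributions)
```

With this sample data, `ranking` places "alice" first, because only her contributions share terms with the report. Top-k precision and recall can then be measured by checking whether the developers who actually fixed each bug appear among the first k suggestions.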
Abstract:
Detecting bugs as early as possible plays an important role in ensuring software quality before shipping. We argue that mining previous bug fixes can produce good knowledge about why bugs happen and how they are fixed. In this paper, we mine the change history of 717 open source projects to extract bug-fix patterns. We also manually inspect many of the bugs we found to gain insight into the contexts and reasons behind those bugs. For instance, we found that missing null checks and missing initializations recur very frequently, and we believe that they can be automatically detected and fixed.
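To make the "missing null check" pattern concrete, the sketch below flags a fix whose unified diff adds more null-checking `if` conditions than it removes. It is only a rough heuristic written for illustration: the regular expression, the function name `adds_null_check` and the sample diff are assumptions, and the paper's actual extraction method is not described in the abstract.

```python
import re

# Assumed heuristic: an `if (...)` condition that compares something to null.
NULL_CHECK = re.compile(r"\bif\s*\(.*(?:!=|==)\s*null\b")


def adds_null_check(diff_text):
    """Return True if a unified diff adds more null-check conditions than it removes."""
    added = 0
    removed = 0
    for line in diff_text.splitlines():
        if line.startswith("+++") or line.startswith("---"):
            continue  # skip the file header lines of the diff
        if line.startswith("+") and NULL_CHECK.search(line):
            added += 1
        elif line.startswith("-") and NULL_CHECK.search(line):
            removed += 1
    return added > removed


# Hypothetical bug fix that wraps a call in a null check.
sample_fix = """--- a/Foo.java
+++ b/Foo.java
@@ -1,3 +1,5 @@
-    name.trim();
+    if (name != null) {
+        name.trim();
+    }
"""
flagged = adds_null_check(sample_fix)
```

A text-based check like this misses null checks written in other forms (for example helper methods or ternary expressions), so a real mining tool would more likely compare syntax trees before and after the change.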
Abstract:
Computer software plays an important role in business, government, society and sciences. To solve real-world problems, it is very important to measure the quality and reliability in the software development life cycle (SDLC). Software Engineering (SE) is the computing field concerned with designing, developing, implementing, maintaining and modifying software. The present paper gives an overview of the Data Mining (DM) techniques that can be applied to various types of SE data in order to solve the challenges posed by SE tasks such as programming, bug detection, debugging and maintenance. A specific DM software is discussed, namely one of the analytical tools for analyzing data and summarizing the relationships that have been identified. The paper concludes that the proposed techniques of DM within the domain of SE could be well applied in fields such as Customer Relationship Management (CRM), eCommerce and eGovernment. ACM Computing Classification System (1998): H.2.8.
Abstract:
Software product line engineering promotes large-scale software reuse by developing a system family that shares a set of developed core features, and enables the selection and customization of a set of variabilities that distinguish each software product family from the others. To address time-to-market pressure, the software industry has been using the clone-and-own technique to create and manage new software products or product lines. Despite its advantages, the clone-and-own approach brings several difficulties for the evolution and reconciliation of software product lines, especially because of the code conflicts generated by the simultaneous evolution of the original software product line, called Source, and its cloned products, called Target. This thesis proposes an approach to evolve and reconcile cloned products based on mining software repositories and code conflict analysis techniques. The approach supports the identification of different kinds of code conflicts (lexical, structural and semantic) that can occur during the integration of development tasks (bug corrections, enhancements and new use cases) from the original evolved software product line into the cloned product line. We have also conducted an empirical study characterizing the code conflicts produced during the evolution and merging of two large-scale web information system product lines. The results of our study demonstrate the approach's potential to automatically or semi-automatically resolve several existing code conflicts, thus helping to reduce the complexity and costs of reconciling cloned software product lines.