Automated Debugging in Use
Disciplines
Computer Sciences (100%)
Keywords
Program Slicing,
Software Fault Localization,
Spectrum-Based Fault Localization,
Model-Based Software Debugging,
Software Debugging
The process of correcting faults in software is called debugging. Debugging is time-intensive because even small programs can consist of several thousand lines of code, which have to be investigated manually to find the lines responsible for an observed error (e.g., wrong computations or program crashes). It is estimated that software developers spend 30% to 90% of their time on debugging. Improving the debugging process could therefore save an enormous amount of money and time. Many researchers have developed approaches that support software developers when debugging. Unfortunately, these approaches are rarely used in practice. This project therefore aims to close the gap between academic research and debugging in practice. It consists of three phases.

First, we will examine why existing academic approaches are rarely used in practice. We will observe how software developers debug programs in order to assess the status quo of debugging in practice. Such observational sessions are very time-consuming, so they can only be conducted with a small group of study participants. To ensure that our findings are generally valid, we will additionally perform a large-scale online survey.

Second, we will use the insights gained from the observational sessions and the online survey to improve existing debugging approaches. We will focus in particular on the scalability, accuracy, and practicability of the approaches:
- Scalability: The debugging approaches should be able to handle large programs with more than one million lines of code.
- Accuracy: The debugging approaches should draw the software developer's attention to the lines of code that are responsible for an observed error, and not to irrelevant lines.
- Practicability: The debugging approaches should be easy to use.
Software developers might struggle to use academic debugging approaches because they do not believe that such an approach can help them. In addition, they do not know which approach is best suited to their debugging problem. We will therefore answer two particularly interesting research questions in this project phase: (1) Can a combination of debugging approaches improve the overall debugging experience? (2) Is it possible to automatically select the best-suited debugging method for a given program?

Third, we will integrate the debugging approaches into development environments and development processes. A debugging approach that is integrated into the software engineer's development environment and process is more likely to be used than a stand-alone approach. To evaluate the usefulness of our developments, we will conduct extensive experiments.
Correcting errors in software is called debugging. Debugging is a time-consuming process because even small programs can consist of several thousand lines of code that must be examined manually to find the lines responsible for an observed error (e.g., incorrect calculations or program crashes). It is estimated that software developers spend 30% to 90% of their time debugging. Improvements to the debugging process therefore have the potential to save money and time.

As a first step, we reached out to software developers with an online survey to find out what the biggest problems in the debugging process are. The survey showed that most errors are semantic errors that users notice, for example, because the program operates incorrectly. The debugging process often follows the same pattern: recreating the error (executing a sequence of steps that leads to the error), making observations (What values do certain variables have? Which changes to the sequence of steps cause the error to no longer occur? ...), and drawing conclusions (e.g., focusing on specific lines of code). While the majority of programmers reported that reproducing errors is easy, many find it difficult to identify the location of the faulty code. Furthermore, the study shows that errors are often complex and that changing individual lines of code is not enough to fix them. It follows that researchers should develop debugging tools that are capable of identifying bugs that span multiple lines of code.

In a second step, we looked at how we can support software developers with the most difficult part of the debugging process, i.e., fault localization. Here, we focused on improving two existing approaches. The first approach (Information Retrieval Fault Localization) works with textual descriptions of errors, so-called bug reports, and uses artificial intelligence to find code that matches the error description. One of our extensions additionally predicts the error type based on probabilities. This approach is suitable for programs of any size (keyword: scalability). The second approach (Slicing) targets smaller programs: software developers reproduce an error, and the slicing tool developed in the project marks all lines of code that have an impact on the erroneous result.

All data sets, all evaluations carried out, and all code written in the project are publicly available (see https://amadeus.ist.tugraz.at/).
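To illustrate the slicing idea described above, the following is a minimal sketch of a backward dynamic slice. It is not the project's slicing tool. It assumes that an execution trace is already available as a list of entries, each recording the executed line number, the variables it defined, and the variables it read. The names `TraceEntry`, `backward_dynamic_slice`, and the example trace are hypothetical, and the sketch follows data dependences only (control dependences are ignored).

```python
from typing import List, Set, Tuple

# One entry per executed statement: (line number, variables defined, variables used).
# This trace format is an assumption made for the sketch.
TraceEntry = Tuple[int, Set[str], Set[str]]


def backward_dynamic_slice(trace: List[TraceEntry], criterion: Set[str]) -> Set[int]:
    """Return the lines whose definitions influenced the criterion variables.

    Only data dependences are followed; control dependences are ignored.
    """
    relevant = set(criterion)      # variables whose values still need explaining
    slice_lines: Set[int] = set()
    # Walk the trace from the last executed statement back to the first.
    for line, defs, uses in reversed(trace):
        if defs & relevant:
            slice_lines.add(line)
            relevant -= defs       # this statement explains these variables
            relevant |= uses       # ... so the values it read become relevant
    return slice_lines


# Hypothetical trace of the straight-line program:
#   1: a = 2
#   2: b = 3
#   3: c = a * 10
#   4: d = b + 1
#   5: result = c + 1
trace = [
    (1, {"a"}, set()),
    (2, {"b"}, set()),
    (3, {"c"}, {"a"}),
    (4, {"d"}, {"b"}),
    (5, {"result"}, {"c"}),
]

# Lines that had an impact on the (assumed wrong) value of `result`.
print(sorted(backward_dynamic_slice(trace, {"result"})))  # prints [1, 3, 5]
```

In this example, lines 2 and 4 are not marked because they do not contribute to the value of `result`, so a developer investigating that value would only need to inspect lines 1, 3, and 5.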
- Technische Universität Graz - 100%
- Rui Abreu, University of Lisbon - Portugal
Research Output
- 41 Citations
- 11 Publications
- 8 Datasets & models
Publications
- 2024; Title: Reducing the Length of Dynamic and Relevant Slices by Pruning Boolean Expressions; DOI: 10.3390/electronics13061146; Type: Journal Article; Author: Hirsch T; Journal: Electronics; Pages: 1146
- 2022; Title: Pruning Boolean Expressions to Shorten Dynamic Slices; DOI: 10.1109/scam55253.2022.00006; Type: Conference Proceeding Abstract; Author: Hirsch T; Pages: 1-11
- 2022; Title: Detecting non-natural language artifacts for de-noising bug reports; DOI: 10.1007/s10515-022-00350-0; Type: Journal Article; Author: Hirsch T; Journal: Automated Software Engineering; Pages: 52
- 2020; Title: Root cause prediction based on bug reports; DOI: 10.1109/issrew51248.2020.00067; Type: Conference Proceeding Abstract; Author: Hirsch T; Pages: 171-176
- 2020; Title: A Fault Localization and Debugging Support Framework driven by Bug Tracking Data; DOI: 10.1109/issrew51248.2020.00053; Type: Conference Proceeding Abstract; Author: Hirsch T; Pages: 139-142
- 2021; Title: Identifying non-natural language artifacts in bug reports; DOI: 10.1109/asew52652.2021.00046; Type: Conference Proceeding Abstract; Author: Hirsch T; Pages: 191-197
- 2024; Title: Predictive Reranking using Code Smells for Information Retrieval Fault Localization; DOI: 10.1109/sami60510.2024.10432857; Type: Conference Proceeding Abstract; Author: Hirsch T; Pages: 000277-000282
- 2023; Title: The MAP Metric in Information Retrieval Fault Localization; DOI: 10.1109/ase56229.2023.00041; Type: Conference Proceeding Abstract; Author: Hirsch T; Pages: 1480-1491
- 2022; Title: Using textual bug reports to predict the fault category of software bugs; DOI: 10.1016/j.array.2022.100189; Type: Journal Article; Author: Hirsch T; Journal: Array; Pages: 100189
- 2022; Title: A systematic literature review on benchmarks for evaluating debugging approaches; DOI: 10.1016/j.jss.2022.111423; Type: Journal Article; Author: Hirsch T; Journal: Journal of Systems and Software; Pages: 111423
- 2021; Title: What we can learn from how programmers debug their code; DOI: 10.1109/ser-ip52554.2021.00014; Type: Conference Proceeding Abstract; Author: Hirsch T; Pages: 37-40
Datasets & models
- 2024; Title: prunedSlicing: Reducing the Length of Dynamic and Relevant Slices by Pruning Boolean Expressions; DOI: 10.5281/zenodo.6908074; Type: Database/Collection of data; Public Access
- 2024; Title: Best practices for evaluating IRFL approaches - Supplemental material; DOI: 10.5281/zenodo.11509228; Type: Database/Collection of data; Public Access
- 2023; Title: Supplemental Material for Predictive Reranking using Code Smells for Information Retrieval Fault Localization; DOI: 10.5281/zenodo.8186774; Type: Database/Collection of data; Public Access
- 2023; Title: Supplementary material for 'The MAP metric in Information Retrieval Fault Localization'; DOI: 10.5281/zenodo.7817015; Type: Database/Collection of data; Public Access
- 2022; Title: artifact_detection - A tool for NLP tasks on textual bug reports.; DOI: 10.5281/zenodo.5519502; Type: Database/Collection of data; Public Access
- 2022; Title: Supplemental Material for a Systematic Literature Review on Benchmarks for Evaluating Debugging Approaches; DOI: 10.5281/zenodo.6670198; Type: Database/Collection of data; Public Access
- 2021; Title: Debugging Questionnaire Dataset; DOI: 10.5281/zenodo.4449044; Type: Database/Collection of data; Public Access
- 2020; Title: AmadeusGitHubBugDataset; DOI: 10.5281/zenodo.3973048; Type: Database/Collection of data; Public Access