Project detail

Grant DOI 10.55776/P32653
Funding program Principal Investigator Projects
Status ended
Start January 1, 2020
End April 30, 2024
Funding amount € 404,495
Disciplines

Computer Sciences (100%)

Keywords

Program Slicing, Software Fault Localization, Spectrum-Based Fault Localization, Model-Based Software Debugging, Software Debugging

Abstract

The process of correcting faults in software is called debugging. Debugging is time-intensive because even small programs can consist of several thousand lines of code, which have to be investigated manually to find the lines responsible for an observed error (e.g., wrong computations or program crashes). It is estimated that software developers spend 30% to 90% of their time on debugging, so improving the debugging process could save an enormous amount of money and time. Many researchers have developed approaches that support software developers when debugging. Unfortunately, these approaches are rarely used in practice. This project therefore aims to close the gap between academic research and debugging in practice. It consists of three phases.

First, we will examine the reasons why existing academic approaches are rarely used in practice. We will observe how software developers debug programs in order to assess the status quo of debugging in practice. Such observational sessions are very time-consuming and can therefore only be done for a small group of study participants. To ensure that our findings are generally valid, we will additionally perform a large-scale online survey.

Second, we will use the insights gained from the observational sessions and the online survey to improve existing debugging approaches. We will focus in particular on the scalability, accuracy, and practicability of the approaches:

Scalability: The debugging approaches should be able to handle large programs with more than one million lines of code.
Accuracy: The debugging approaches should draw the software developers' attention to those lines of code that are responsible for an observed error, but not to other, irrelevant lines.
Practicability: The debugging approaches should be easy to use.

Software developers might struggle to use academic debugging approaches because they do not believe that such an approach can help them. In addition, they do not know which approach is best suited for their debugging problem. Therefore, we are going to answer two research questions in this project phase: (1) Can the combination of debugging approaches help to improve the overall debugging experience? (2) Is it possible to automatically select the best-suited debugging method for a given program?

Third, we will integrate the debugging approaches into development environments and development processes. A debugging approach that is integrated into the software engineer's development environment and process is more likely to be used than a stand-alone approach. To evaluate the usefulness of our developments, we will conduct extensive experiments.

Final report

Correcting errors in software is called debugging. Debugging is a time-consuming process because even small programs can consist of several thousand lines of code that must be examined manually to find the lines responsible for an observed error (e.g., incorrect calculations or program crashes). It is estimated that software developers spend 30% to 90% of their time debugging. Improvements to the debugging process therefore have the potential to save money and time.

As a first step, we reached out to software developers with an online survey to find out what the biggest problems in the debugging process are. This survey showed that most errors are semantic errors that users notice, for example, because the program behaves incorrectly. The debugging process often follows the same pattern: recreating the error (executing a sequence of steps that leads to the error), making observations (What values do certain variables have? What changes to the sequence of steps cause the error to no longer occur? ...), and drawing conclusions (e.g., focusing on specific lines of code). While the majority of programmers reported that it is easy to reproduce errors, many find it difficult to identify the location of the faulty code. Furthermore, the study shows that errors are often complex and that changing individual lines of code is not enough to fix them. It follows that researchers should develop debugging tools that are capable of identifying bugs that span multiple lines of code.

In a second step, we looked at how we can support software developers with the most difficult part of the debugging process, i.e., fault localization. Here, we focused on improving two existing approaches. The first approach (Information Retrieval Fault Localization) works with textual descriptions of errors, so-called bug reports, and uses artificial intelligence to find code that matches the error description. One of our extensions additionally predicts the error type based on probabilities.
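
To illustrate the general idea behind Information Retrieval Fault Localization, the following minimal sketch ranks source files by their textual similarity to a bug report. It is a toy under stated assumptions: it uses plain TF-IDF weighting and cosine similarity rather than the models developed in the project, and the function names, file names, file contents, and bug report are all hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Split text into lowercase word tokens, breaking up camelCase and snake_case."""
    # Insert a space at lower-to-upper boundaries so "parseDate" becomes "parse Date".
    text = re.sub(r"([a-z])([A-Z])", r"\1 \2", text)
    # Underscores, digits, and punctuation act as separators.
    return [token.lower() for token in re.findall(r"[A-Za-z]+", text)]


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (a dict) per tokenized document."""
    n = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        term_freq = Counter(doc)
        vectors.append({
            # Smoothed inverse document frequency, so no term gets weight zero.
            term: count * (math.log((1 + n) / (1 + doc_freq[term])) + 1)
            for term, count in term_freq.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity of two sparse vectors."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_files(bug_report, files):
    """Return file names ordered from most to least similar to the bug report."""
    names = list(files)
    docs = [tokenize(files[name]) for name in names] + [tokenize(bug_report)]
    vectors = tfidf_vectors(docs)
    query = vectors[-1]
    scored = [(cosine(query, vec), name) for name, vec in zip(names, vectors[:-1])]
    # Highest score first; ties are broken alphabetically for determinism.
    return [name for _, name in sorted(scored, key=lambda pair: (-pair[0], pair[1]))]


# Hypothetical project files and bug report, for illustration only.
FILES = {
    "date_parser.py": "def parse_date(text): parts = text.split('-') "
                      "return int(parts[0]), int(parts[1]), int(parts[2])",
    "invoice.py": "def invoice_total(items): "
                  "return sum(item.price * item.quantity for item in items)",
    "mailer.py": "def send_mail(recipient, subject, body): "
                 "connect smtp server and deliver message",
}
REPORT = "Crash when parsing a date without day: parse_date fails on text '2024-05'"

ranking = rank_files(REPORT, FILES)
print(ranking)
```

In this toy example, `date_parser.py` is ranked first because it is the only file that shares terms with the report. Real bug reports are much noisier, which is one reason the preprocessing and evaluation of such rankings matter.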
The Information Retrieval Fault Localization approach is suitable for programs of any size (keyword: scalability). The second approach (slicing) targets smaller programs: software developers reproduce an error, and the slicing tool developed in the project marks all lines of code that influence the faulty result.

All data sets, all evaluations carried out, and all code written in the project are publicly available (see https://amadeus.ist.tugraz.at/).
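
The slicing idea mentioned above can be sketched as follows. This is a minimal backward dynamic slice over a hand-written execution trace, assuming scalar variables only; it is not the project's slicing tool. The `Event` record, the `dynamic_slice` function, and the traced seven-line program are hypothetical names and data introduced for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Event:
    """One executed statement in a recorded trace (hypothetical format)."""
    line: int                       # source line that was executed
    defined: Optional[str]          # variable written, or None for a branch condition
    used: frozenset                 # variables read
    control: Optional[int] = None   # trace index of the branch this execution depends on


def dynamic_slice(trace, criterion_vars):
    """Return the sorted lines that influenced criterion_vars at the end of the trace."""
    relevant = set(criterion_vars)   # variables whose defining statement is still sought
    needed_controls = set()          # trace indices of branches that must be included
    lines = set()
    for index in range(len(trace) - 1, -1, -1):
        event = trace[index]
        defines_relevant = event.defined in relevant
        if defines_relevant or index in needed_controls:
            lines.add(event.line)
            if defines_relevant:
                # This is the most recent definition, so earlier ones no longer matter.
                relevant.discard(event.defined)
            relevant.update(event.used)
            if event.control is not None:
                needed_controls.add(event.control)
    return sorted(lines)


# Trace of one run of this hypothetical program:
#   1: a = 2
#   2: b = 5
#   3: c = a * 3
#   4: if a > 1:
#   5:     d = c + 1
#   6: e = b - 1
#   7: result = d * 2      (a wrong value is observed here)
TRACE = [
    Event(1, "a", frozenset()),
    Event(2, "b", frozenset()),
    Event(3, "c", frozenset({"a"})),
    Event(4, None, frozenset({"a"})),
    Event(5, "d", frozenset({"c"}), control=3),
    Event(6, "e", frozenset({"b"})),
    Event(7, "result", frozenset({"d"})),
]

print(dynamic_slice(TRACE, {"result"}))  # → [1, 3, 4, 5, 7]
```

Lines 2 and 6 are left out of the slice because neither `b` nor `e` contributes to `result` in this run, so a developer would not need to inspect them.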

Research institution(s)

Technische Universität Graz - 100%

International project participants

Rui Abreu, University of Lisbon - Portugal

Research Output

61 Citations
17 Publications
8 Datasets & models

Publications

Title	Reducing the Length of Dynamic and Relevant Slices by Pruning Boolean Expressions
DOI	10.3390/electronics13061146
Type	Journal Article
Author	Hirsch T
Journal	Electronics
Pages	1146

Title	Automated Fault Classification and Fault Localization based on Textual Bug Reports
Type	PhD Thesis
Author	Thomas Hirsch

Title	Best practices for evaluating IRFL approaches
DOI	10.1016/j.jss.2025.112342
Type	Journal Article
Author	Hirsch T
Journal	Journal of Systems and Software
Pages	112342

Title	Predictive Reranking using Code Smells for Information Retrieval Fault Localization
DOI	10.1109/sami60510.2024.10432857
Type	Conference Proceeding Abstract
Author	Hirsch T
Pages	000277-000282

Title	The MAP Metric in Information Retrieval Fault Localization
DOI	10.1109/ase56229.2023.00041
Type	Conference Proceeding Abstract
Author	Hirsch T
Pages	1480-1491

Title	What we can learn from how programmers debug their code
DOI	10.1109/ser-ip52554.2021.00014
Type	Conference Proceeding Abstract
Author	Hirsch T
Pages	37-40

Title	Using textual bug reports to predict the fault category of software bugs
DOI	10.1016/j.array.2022.100189
Type	Journal Article
Author	Hirsch T
Journal	Array
Pages	100189

Title	Identifying non-natural language artifacts in bug reports
DOI	10.48550/arxiv.2110.01336
Type	Preprint
Author	Hirsch T

Title	A Fault Localization and Debugging Support Framework driven by Bug Tracking Data
DOI	10.48550/arxiv.2103.02386
Type	Preprint
Author	Hirsch T

Title	Root cause prediction based on bug reports
DOI	10.48550/arxiv.2103.02372
Type	Preprint
Author	Hirsch T

Title	A systematic literature review on benchmarks for evaluating debugging approaches
DOI	10.1016/j.jss.2022.111423
Type	Journal Article
Author	Hirsch T
Journal	Journal of Systems and Software
Pages	111423

Title	Detecting non-natural language artifacts for de-noising bug reports
DOI	10.1007/s10515-022-00350-0
Type	Journal Article
Author	Hirsch T
Journal	Automated Software Engineering
Pages	52

Title	Pruning Boolean Expressions to Shorten Dynamic Slices
DOI	10.1109/scam55253.2022.00006
Type	Conference Proceeding Abstract
Author	Hirsch T
Pages	1-11

Title	What we can learn from how programmers debug their code
DOI	10.48550/arxiv.2103.12447
Type	Preprint
Author	Hirsch T

Title	Identifying non-natural language artifacts in bug reports
DOI	10.1109/asew52652.2021.00046
Type	Conference Proceeding Abstract
Author	Hirsch T
Pages	191-197

Title	Root cause prediction based on bug reports
DOI	10.1109/issrew51248.2020.00067
Type	Conference Proceeding Abstract
Author	Hirsch T
Pages	171-176

Title	A Fault Localization and Debugging Support Framework driven by Bug Tracking Data
DOI	10.1109/issrew51248.2020.00053
Type	Conference Proceeding Abstract
Author	Hirsch T
Pages	139-142

Datasets & models

Public Access
Title	prunedSlicing: Reducing the Length of Dynamic and Relevant Slices by Pruning Boolean Expressions
DOI	10.5281/zenodo.6908074
Type	Database/Collection of data

Public Access
Title	Best practices for evaluating IRFL approaches - Supplemental material
DOI	10.5281/zenodo.11509228
Type	Database/Collection of data

Public Access
Title	Supplementary material for 'The MAP metric in Information Retrieval Fault Localization'
DOI	10.5281/zenodo.7817015
Type	Database/Collection of data

Public Access
Title	Supplemental Material for Predictive Reranking using Code Smells for Information Retrieval Fault Localization
DOI	10.5281/zenodo.8186774
Type	Database/Collection of data

Public Access
Title	artifact_detection - A tool for NLP tasks on textual bug reports.
DOI	10.5281/zenodo.5519502
Type	Database/Collection of data

Public Access
Title	Supplemental Material for a Systematic Literature Review on Benchmarks for Evaluating Debugging Approaches
DOI	10.5281/zenodo.6670198
Type	Database/Collection of data

Public Access
Title	Debugging Questionnaire Dataset
DOI	10.5281/zenodo.4449044
Type	Database/Collection of data

Public Access
Title	AmadeusGitHubBugDataset
DOI	10.5281/zenodo.3973048
Type	Database/Collection of data

Automated Debugging in Use
