AutoSD, a project led by Dr. Sungmin Kang, has been accepted to the Journal of Empirical Software Engineering. The preprint is available from here.

Scientific Debugging is a guideline for systematic debugging, originally proposed by Andreas Zeller for human developers. It suggests that developers should follow the process of scientific discovery:

  1. Hypothesize why the failure occurs
  2. Predict the program behaviour based on the hypothesis
  3. Experiment to confirm the prediction
  4. Observe the result
  5. Conclude what the root cause is if the observation confirms the hypothesis; otherwise go back and make another hypothesis
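The five steps above can be sketched as a small loop. This is only an illustrative sketch, not AutoSD's implementation: the `Hypothesis` class, the `scientific_debugging` function, and the `buggy_mean` example are hypothetical names introduced here, and the hypotheses are written by hand, whereas an agent would generate them and run the experiments through a debugger.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


def buggy_mean(values):
    # Hypothetical program under test: divides by a fixed 2, not len(values).
    return sum(values) / 2


@dataclass
class Hypothesis:
    claim: str                         # step 1: why the failure occurs
    prediction: object                 # step 2: what we expect to see if the claim holds
    experiment: Callable[[], object]   # step 3: a check that can actually be run


def scientific_debugging(hypotheses: List[Hypothesis]) -> Optional[str]:
    """Return the first claim whose prediction matches the observation."""
    for h in hypotheses:
        observation = h.experiment()        # steps 3 and 4: run and observe
        if observation == h.prediction:     # step 5: the hypothesis is confirmed
            return h.claim
        # Otherwise the hypothesis is rejected; move on to the next one.
    return None


# Hand-written hypotheses (assumed for illustration only).
hypotheses = [
    Hypothesis(
        claim="sum() drops an element",
        prediction=5,
        experiment=lambda: sum([1, 2, 3]),
    ),
    Hypothesis(
        claim="the divisor is hard-coded to 2 instead of len(values)",
        prediction=4.0,
        experiment=lambda: buggy_mean([2, 2, 2, 2]),
    ),
]

conclusion = scientific_debugging(hypotheses)
print(conclusion)
```

The first hypothesis is rejected because its prediction does not match the observation, so the loop goes back and tries the next one, mirroring the final step of the list.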

AutoSD is an autonomous LLM agent that performs scientific debugging, using a zero-shot prompt that explains the process of scientific debugging as well as how to use debuggers to run experiments and make observations. We believe this work is significant in two respects:

  • Here, an LLM agent is executing a guideline that was originally written for humans. In turn, we expect that the outcome (i.e., the generated patch as well as the debugging process itself) is inherently explainable and better aligned with human understanding, especially compared to existing techniques that directly produce patches.
  • AutoSD shows how the autocompletion driven by LLMs can be hybridised with more symbolic analysis (here, dynamic analysis using the debugger). We believe that grounding LLM output in executable experiments of this kind will play a major role in improving the robustness of LLM-generated solutions.

This work was born out of Dr. Kang’s internship at Microsoft Research Asia. Congratulations, everyone!