Sorting and Transforming Program Repair Ingredients via Deep Learning Code Similarities

White, Martin; Tufano, Michele; Martinez, Matias; Monperrus, Martin; Poshyvanyk, Denys

doi:10.1109/SANER.2019.8668043

arXiv:1707.04742 (cs)

[Submitted on 15 Jul 2017 (v1), last revised 31 Dec 2018 (this version, v2)]

Abstract: In the field of automated program repair, the redundancy assumption claims large programs contain the seeds of their own repair. However, most redundancy-based program repair techniques do not reason about the repair ingredients, i.e., the code that is reused to craft a patch. We aim to reason about the repair ingredients by using code similarities to prioritize and transform statements in a codebase for patch generation. Our approach, DeepRepair, relies on deep learning to reason about code similarities. Code fragments at well-defined levels of granularity in a codebase can be sorted according to their similarity to suspicious elements (i.e., code elements that contain suspicious statements) and statements can be transformed by mapping out-of-scope identifiers to similar identifiers in scope. We examined these new search strategies for patch generation with respect to effectiveness from the viewpoint of a software maintainer. Our comparative experiments were executed on six open-source Java projects including 374 buggy program revisions and consisted of 19,949 trials spanning 2,616 days of computation time. DeepRepair's search strategy using code similarities generally found compilable ingredients faster than the baseline, jGenProg, but this improvement neither yielded test-adequate patches in fewer attempts (on average) nor found significantly more patches than the baseline. Although the patch counts were not statistically different, there were notable differences between the nature of DeepRepair patches and baseline patches. The results demonstrate that our learning-based approach finds patches that cannot be found by existing redundancy-based repair techniques.
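
The two search strategies named in the abstract (sorting candidate code fragments by similarity to a suspicious element, and mapping out-of-scope identifiers to similar identifiers in scope) can be illustrated with a minimal sketch. This is not DeepRepair's implementation: the abstract does not specify the similarity measure or the representation, so the sketch assumes each fragment and identifier already has a fixed-length vector and uses cosine similarity as a stand-in. The function names (`cosine`, `sort_ingredients`, `transform_statement`) and all vectors are hypothetical, chosen only for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def sort_ingredients(suspicious_vec, candidates):
    """Order candidate fragment names, most similar to the suspicious element first.

    `candidates` maps a fragment name to its (assumed, precomputed) vector.
    """
    return sorted(
        candidates,
        key=lambda name: cosine(suspicious_vec, candidates[name]),
        reverse=True,
    )


def transform_statement(tokens, in_scope, vectors):
    """Replace each out-of-scope identifier with the most similar in-scope one.

    A token counts as an identifier only if it has an entry in `vectors`;
    every name in `in_scope` is assumed to have a vector as well.
    """
    result = []
    for tok in tokens:
        if tok in vectors and tok not in in_scope:
            # Out-of-scope identifier: pick the closest in-scope identifier.
            best = max(sorted(in_scope), key=lambda n: cosine(vectors[tok], vectors[n]))
            result.append(best)
        else:
            result.append(tok)
    return result


# Toy data (hypothetical vectors, not learned from any codebase).
suspicious = [1.0, 0.0]
fragments = {"methodA": [0.9, 0.1], "methodB": [0.0, 1.0], "methodC": [0.5, 0.5]}
ranked = sort_ingredients(suspicious, fragments)

identifier_vectors = {"total": [1.0, 0.1], "sum": [0.9, 0.2], "index": [0.1, 1.0]}
patched = transform_statement(["total", "+=", "x", ";"], {"sum", "index"}, identifier_vectors)
```

In this toy run, `ranked` lists the fragments from most to least similar to the suspicious element, and `patched` is the statement with the out-of-scope identifier `total` replaced by its nearest in-scope neighbour. In a real repair loop, the transformed statement would still have to compile and pass the test suite before it counted as a patch.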

Comments:	camera-ready paper for SANER 2019
Subjects:	Software Engineering (cs.SE)
Cite as:	arXiv:1707.04742 [cs.SE]
	(or arXiv:1707.04742v2 [cs.SE] for this version)
	https://doi.org/10.48550/arXiv.1707.04742
Journal reference:	Proceedings of the IEEE International Conference on Software Analysis, Evolution and Reengineering, 2019
Related DOI:	https://doi.org/10.1109/SANER.2019.8668043

Submission history

From: Martin White
[v1] Sat, 15 Jul 2017 14:41:40 UTC (239 KB)
[v2] Mon, 31 Dec 2018 03:01:40 UTC (268 KB)
