A Large-Scale Empirical Study of Commit Message Generation: Models, Datasets and Evaluation
Published:
Note
This paper is an extension of our conference paper published in ICSME 2021. The new contents are as follows:
- We conduct a literature review on commit message generation models.
- More recent work, especially CoRec, is discussed and experimentally evaluated.
- We add experiments to evaluate the effectiveness of CodeBERT, a recent pre-trained source code model, in commit message generation with multiple programming languages.
- We explore the research questions in more depth and provide more analysis on the experimental results.
Abstract
Commit messages are natural language descriptions of code changes, which are important for program understanding and maintenance. However, writing commit messages manually is time-consuming and laborious, especially when the code is updated frequently. Various approaches utilizing generation or retrieval techniques have been proposed to automatically generate commit messages.
To achieve a better understanding of how the existing approaches perform in solving this problem, this paper conducts a systematic and in-depth analysis of the state-of-the-art models and datasets. We find that: (1) Different variants of the BLEU metric used in previous works affect the evaluation. (2) Most datasets are crawled only from Java repositories while repositories in other programming languages are not sufficiently explored. (3) Dataset splitting strategies can influence the performance of existing models by a large margin. (4) For pre-trained models, fune-tuning with different multi-programming-language combinations can influence their performance.
Based on these findings, we collect a large-scale, information-rich, Multi-language Commit Message Dataset MCMD. Using MCMD, we conduct extensive experiments under different experiment settings including splitting strategies and multi-programming-language combinations. Furthermore, we provide suggestions for comprehensively evaluating commit message generation models and discuss possible future research directions. We believe our work can help practitioners and researchers better evaluate and select models for automatic commit message generation.
Our source code and data are available at https://anonymous.4open.science/r/CommitMessageEmpirical
Links
Related Works
On the Evaluation of Commit Message Generation Models: An Experimental Study
Recommended citation
Plain Text
Tao, W., Wang, Y., Shi, E. et al. A large-scale empirical study of commit message generation: models, datasets and evaluation. Empir Software Eng 27, 198 (2022). https://doi.org/10.1007/s10664-022-10219-1
BibTex
@inproceedings{journals/emse/TaoWSDH0ZZ22,
author = {Wei Tao and
Yanlin Wang and
Ensheng Shi and
Lun Du and
Shi Han and
Hongyu Zhang and
Dongmei Zhang and
Wenqiang Zhang},
title = {A Large-Scale Empirical Study of Commit Message Generation: Models,
Datasets and Evaluation},
journal = {Empir. Softw. Eng.},
year = {2022},
doi = {10.1007/s10664-022-10219-1}
}