When interacting with version control systems, developers often commit unrelated or loosely related code changes in a single transaction. When analyzing the version history, such tangled changes will make all changes to all modules appear related, possibly compromising the resulting analyses through noise and bias. In an investigation of five open-source Java projects, we found up to 15% of all bug fixes to consist of multiple tangled changes. Using a multi-predictor approach to untangle changes, we show that on average at least 16.6% of all source files are incorrectly associated with bug reports. We recommend better change organization to limit the impact of tangled changes.

Further Details

For more information, please visit the website at http://softevo.org/bugclassify/. Please also note that the publicly available data sets have moved to a new location.

K. Herzig and A. Zeller, “The Impact of Tangled Code Changes,” in Proceedings of the 10th working conference on mining software repositories, Piscataway, NJ, USA, 2013, pp. 121-130.
[Bibtex]

@inproceedings{herzig-msr-2013,
title = {{The Impact of Tangled Code Changes}},
author = {Kim Herzig and Andreas Zeller},
booktitle = {Proceedings of the 10th Working Conference on Mining Software Repositories},
series = {MSR '13},
year = {2013},
isbn = {978-1-4673-2936-1},
location = {San Francisco, CA, USA},
pages = {121--130},
numpages = {10},
url = {http://dl.acm.org/citation.cfm?id=2487085.2487113},
acmid = {2487113},
publisher = {IEEE Press},
address = {Piscataway, NJ, USA},
link = {http://www.kim-herzig.de/2013/03/22/untangling_changes/},
pdf = {http://www.kim-herzig.de/wp-content/uploads/2013/03/msr2013-untangling.pdf}
}

Download author PDF.
The author PDF is posted here by permission of IEEE Press for your personal use. Not for redistribution. The definitive version was published in Proceedings of the 10th Working Conference on Mining Software Repositories.
K. Herzig, S. Just, and A. Zeller, “The impact of tangled code changes on defect prediction models,” Empirical software engineering, pp. 1-34, 2015.
[Bibtex]

@article{herzig-emse-2015,
year={2015},
issn={1382-3256},
journal={Empirical Software Engineering},
doi={10.1007/s10664-015-9376-6},
title={The impact of tangled code changes on defect prediction models},
pdf={http://dx.doi.org/10.1007/s10664-015-9376-6},
publisher={Springer US},
keywords={Defect prediction; Untangling; Data noise},
author={Herzig, Kim and Just, Sascha and Zeller, Andreas},
pages={1--34},
language={English}
}

Download author PDF.
The author PDF is posted here by permission of Springer US for your personal use. Not for redistribution. The definitive version was published in Empirical Software Engineering and can be downloaded from the publisher's site.

Public data sets

Please check our GitHub repository for the public data sets. Below you can find the repository snapshots we used for the experiments.
