Hadley Wickham’s ggplot2 R package is powerful and excellent. But even with its thorough documentation, things can get complicated.

Background

In my PhD thesis, I compare various software metrics and the performance of defect prediction models based on these metrics. For this purpose, I trained multiple (let’s say two) series of prediction models, each series using a different set of metric values as feature vectors. Each series spans several projects, and for each project I repeatedly sampled the project-specific data set 100 times to create 100 independent training and testing sets. That means 200 independent prediction models (2 metric sets x 100 samplings) per project. Additionally, I wanted to use multiple classification models to see which machine learning algorithm performs best and whether the metrics are stable across different machine learners. Rahul Premraj and I used a similar plot to present results in one of our papers: R. Premraj and K. Herzig, “Network versus code metrics to predict defects: a replication study,” in Proceedings of the 2011 International Symposium on Empirical Software Engineering and Measurement, Washington, DC, USA, 2011, pp. 215-224.
[Bibtex]

@inproceedings{premraj-esem-2011,
author = {Premraj, Rahul and Herzig, Kim},
title = {Network Versus Code Metrics to Predict Defects: A Replication Study},
booktitle = {Proceedings of the 2011 International Symposium on Empirical Software Engineering and Measurement},
series = {ESEM '11},
year = {2011},
isbn = {978-0-7695-4604-9},
pages = {215--224},
numpages = {10},
doi = {10.1109/ESEM.2011.30},
acmid = {2083428},
publisher = {IEEE Computer Society},
address = {Washington, DC, USA},
link={http://www.kim-herzig.de/2011/05/21/network-versus-code-metrics-to-predict-defects-a-replication-study-esem-2011/},
pdf={http://www.kim-herzig.de/wp-content/uploads/2011/09/premraj_esem_2011.pdf}
}

Download author PDF.
The author PDF is posted here by permission of the IEEE Computer Society for your personal use. Not for redistribution. The definitive version was published in the Proceedings of the 2011 International Symposium on Empirical Software Engineering and Measurement and can be downloaded from the publisher’s site.

The actual task

So far, so easy. But how do we visualize the results? What we need is a four-dimensional plot that:

  1. Displays the prediction accuracy measures (e.g. precision, recall, accuracy) per project (two dimensions),
  2. Allows a visual comparison of the prediction accuracies based on different metric sets (two + one = three dimensions),
  3. And preferably also shows the prediction accuracy variance within each model series (two + one + one = four dimensions).
  4. Last but not least, shows the name of the best machine learner.

The result looks like this (click to enlarge):

How do we have to read the plot?

Panels on the x-axis represent the subject projects. The prediction accuracy measures are distributed over the y-axis. Each model ran on 100 stratified random samples for each of the two metric sets. The variance of accuracy is plotted as a boxplot. The black line in the middle of each boxplot indicates the median value of the distribution. The red horizontal lines do not have any statistical meaning—they have been added to ease visual comparison. The best performing machine learner is stated under each corresponding boxplot.

How to produce this plot?

data <- read.csv("http://www.kim-herzig.de/wp-content/uploads/2012/11/comparing_boxplot_grid.csv")

The colnames of the data frame data are

colnames(data)
[1] "X"         "project"   "model"     "measure"   "value"     "metrics"   "bestmodel"
  • The column “X” can be ignored;
  • column “project” contains the name of the subject project we tested our series of prediction models on;
  • column “model” contains the name of the machine learning algorithm used to build the prediction model;
  • column “value” contains the actual value of the prediction accuracy measure named in column “measure”;
  • column “metrics” contains the name of the metric set the corresponding prediction model is based on;
  • and column “bestmodel” specifies which machine learning algorithm performed best.
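In case the CSV above is unavailable, a small synthetic data frame with the same column layout can be used to try the plotting code. The project and model names below are made up for illustration; only the column names match the real data set:

```r
# Hypothetical stand-in for the CSV: same columns, made-up values.
set.seed(42)
data <- expand.grid(
  project = c("ProjectA", "ProjectB"),          # invented project names
  measure = c("precision", "recall", "fmeasure"),
  metrics = c("code", "network"),
  sample  = 1:100                               # 100 samplings per combination
)
data$model     <- "svm"                # invented learner name for each run
data$value     <- runif(nrow(data))    # fake accuracy values in [0, 1]
data$bestmodel <- "svm"                # best-performing learner per group
data$X         <- seq_len(nrow(data))  # row index column, can be ignored
```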

We first take care of separating the results of prediction models based on different metric sets

p <- ggplot(data, aes(factor(metrics), value))

and use boxplots to show the variance within the results.

p <- p + geom_boxplot()

Then we add the two-dimensional grid to display the different projects on the x-axis and the different accuracy measures (precision, recall, and f-measure) on the y-axis.

p <- p + facet_grid(measure ~ project)

That looks good already. Now, let’s add the name of the best performing model under each boxplot.

p <- p + geom_text(data=data, aes(factor(metrics), label=bestmodel, y=0.01), size=2)

The rest of the script makes the plot look nice. We adjust colors and fonts so that the plot can be used in papers printed in black and white and remains readable.

p + stat_summary(aes(group = 1), fun.y = median, colour = "red", geom = "line") +
  theme_bw() +
  opts(axis.text.x = theme_text(size = 14),
       axis.text.y = theme_text(size = 14),
       strip.background = theme_rect(fill = "black"),
       strip.text.x = theme_text(colour = "white", size = 16),
       strip.text.y = theme_text(colour = "white", angle = 270, size = 16),
       axis.title.x = theme_blank(),
       axis.title.y = theme_blank(),
       legend.position = "none")

Done. The final script looks like this:

library(ggplot2)
data <- read.csv("http://www.kim-herzig.de/wp-content/uploads/2012/11/comparing_boxplot_grid.csv")
p <- ggplot(data, aes(factor(metrics), value))
p + geom_boxplot() +
  geom_text(data = data, aes(factor(metrics), label = bestmodel, y = 0.01), size = 2) +
  facet_grid(measure ~ project) +
  stat_summary(aes(group = 1), fun.y = median, colour = "red", geom = "line") +
  theme_bw() +
  opts(axis.text.x = theme_text(size = 14),
       axis.text.y = theme_text(size = 14),
       strip.background = theme_rect(fill = "black"),
       strip.text.x = theme_text(colour = "white", size = 16),
       strip.text.y = theme_text(colour = "white", angle = 270, size = 16),
       axis.title.x = theme_blank(),
       axis.title.y = theme_blank(),
       legend.position = "none")
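Note that opts(), theme_text(), theme_rect(), and theme_blank() were removed from ggplot2 in version 0.9.2; in current ggplot2 releases the same styling is expressed with theme() and the element_*() functions, and stat_summary()’s fun.y argument has been renamed to fun. A sketch of the equivalent modern call, assuming the same data frame as above:

```r
# Modern ggplot2 (>= 3.3) equivalent of the styling above.
p + geom_boxplot() +
  geom_text(data = data, aes(factor(metrics), label = bestmodel, y = 0.01), size = 2) +
  facet_grid(measure ~ project) +
  stat_summary(aes(group = 1), fun = median, colour = "red", geom = "line") +
  theme_bw() +
  theme(axis.text.x = element_text(size = 14),
        axis.text.y = element_text(size = 14),
        strip.background = element_rect(fill = "black"),
        strip.text.x = element_text(colour = "white", size = 16),
        strip.text.y = element_text(colour = "white", angle = 270, size = 16),
        axis.title.x = element_blank(),
        axis.title.y = element_blank(),
        legend.position = "none")
```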
