Joachim Breitner's Homepage
Faster Winter: Statistics (the making-of)
(This is an appendix to the “faster winter” series; please see the series for background information.)
Did you like the graph and the stats that I produced? Just for completeness, I am including the various scripts I used. Nothing super exciting to see here, but maybe someone finds this useful.
This little shell one-liner collects the run-time statistics for each commit in the interesting range (line-wrapped for your convenience):
for h in $(git log 1cea7652f48fad348af914cb6a56b39f8dd99c6a^..5406efd9e057aebdcf94d14b1bc6b5469454faf3 --format=%H)
do
echo -n "$h"
git checkout -q "$h"
cabal new-build -v0
echo -n ":"
rm -f stats/$h.txt
for i in $(seq 1 5)
do
cabal -v0 new-run exe:wasm-invoke -- -w loop.wasm -f canister_init +RTS -t >/dev/null 2>> stats/$h.txt
echo -n .
done
echo
done
A small Perl script (invoked as ./tally.pl further down) takes the minimum of each measurement across the five runs and prints them as one semicolon-separated line:
#!/usr/bin/perl
use List::Util qw(min);
my @alloc;
my @in_use;
my @time;
while (<>) {
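# Each line is the one-line summary that "+RTS -t" prints; $! would not describe a failed match, so report the line itself.
m!<<ghc: (\d+) bytes, \d+ GCs, \d+/\d+ avg/max bytes residency \(\d+ samples\), (\d+)M in use, [\d.]+ INIT \(([\d.]+) elapsed\), [\d.]+ MUT \(([\d.]+) elapsed\), [\d.]+ GC \(([\d.]+) elapsed\) :ghc>>! or die "unexpected line: $_";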
push @alloc, 0+$1;
push @in_use, $2;
push @time, $3+$4+$5;
}
printf "%d;%d;%f\n", min(@alloc), min(@in_use), min(@time);
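To illustrate the “minimum across runs” step, here is a minimal shell sketch. It is not part of the scripts I used: the function names `sample_stats`, `min_alloc` and `min_in_use` are hypothetical, and the two sample lines contain invented numbers in the shape the Perl regex above expects. It only covers the two integer columns; the Perl script additionally sums the three elapsed times per run and takes the minimum of those sums.

```shell
# Two made-up "+RTS -t" summary lines, standing in for one stats/<hash>.txt file.
sample_stats() {
  cat <<'EOF'
<<ghc: 2000 bytes, 3 GCs, 36/44 avg/max bytes residency (1 samples), 5M in use, 0.001 INIT (0.001 elapsed), 0.700 MUT (0.800 elapsed), 0.100 GC (0.100 elapsed) :ghc>>
<<ghc: 1500 bytes, 3 GCs, 36/44 avg/max bytes residency (1 samples), 4M in use, 0.001 INIT (0.001 elapsed), 0.600 MUT (0.700 elapsed), 0.100 GC (0.100 elapsed) :ghc>>
EOF
}

# Print the smallest allocation figure (the number before "bytes,") read from stdin.
min_alloc() {
  sed -n 's/^<<ghc: \([0-9][0-9]*\) bytes,.*/\1/p' | sort -n | head -n 1
}

# Print the smallest "in use" figure (the number before "M in use") read from stdin.
min_in_use() {
  sed -n 's/.* \([0-9][0-9]*\)M in use,.*/\1/p' | sort -n | head -n 1
}

sample_stats | min_alloc    # smaller of 2000 and 1500
sample_stats | min_in_use   # smaller of 5 and 4
```

Taking the minimum rather than the mean is a simple way to reduce the influence of runs that were disturbed by other activity on the machine.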
To create the full CSV file, with one row for every commit in the range that has a stats file, I used this bash one-liner (again line-wrapped for your convenience):
echo 'commit;allocations;memory;time' > stats.csv
for h in $(git log 1cea7652f48fad348af914cb6a56b39f8dd99c6a^..5406efd9e057aebdcf94d14b1bc6b5469454faf3 --format=%H|tac)
do
git log -n 1 --oneline $h
test -f stats/$h.txt && echo "$(echo $h|cut -c-7);$(./tally.pl < stats/$h.txt)" | tee -a stats.csv
done
The stats can be turned into the graph using pgfplots by compiling this LaTeX file:
\documentclass[class=minimal]{standalone}
\usepackage{mathpazo}
\usepackage{pgfplots}
\definecolor{skyblue1}{rgb}{0.447,0.624,0.812}
\definecolor{scarletred1}{rgb}{0.937,0.161,0.161}
\pgfplotsset{width=12cm,compat=newest}
% From https://tex.stackexchange.com/a/63340/15107
\makeatletter
\pgfplotsset{
/pgfplots/flexible xticklabels from table/.code n args={3}{%
\pgfplotstableread[#3]{#1}\coordinate@table
\pgfplotstablegetcolumn{#2}\of{\coordinate@table}\to\pgfplots@xticklabels
\let\pgfplots@xticklabel=\pgfplots@user@ticklabel@list@x
}
}
\makeatother
\begin{document}
\begin{tikzpicture}
\pgfplotsset{every axis/.style={ymin=0}}
\begin{semilogyaxis}[
skyblue1,
scale only axis,
axis y line*=left,
ylabel=Allocation (bytes),
flexible xticklabels from table={stats.csv}{[index]0}{col sep=semicolon},
xticklabel style={rotate=90, anchor=east, text height=1.5ex, font=\ttfamily, color=black},
xtick=data,
]
\addplot[const plot mark mid, color=skyblue1]
table [x expr=\coordindex+1, y index=1, col sep=semicolon] {stats.csv};
\end{semilogyaxis}
\begin{semilogyaxis}[
green,
scale only axis,
axis y line*=right,
ylabel=Memory (MB),
x tick style={draw=none},
xtick=\empty,
]
\addplot[const plot mark mid, color=green]
table [x expr=\coordindex+1, y index=2, col sep=semicolon] {stats.csv};
\end{semilogyaxis}
\begin{semilogyaxis}[
red,
scale only axis,
axis y line*=right,
ylabel=Time (seconds),
x tick style={draw=none},
xtick=\empty,
]
\pgfplotsset{every outer y axis line/.style={xshift=2cm}, every tick/.style={xshift=2cm}, every y tick label/.style={xshift=2cm} }
\addplot[const plot mark mid, color=red]
table [x expr=\coordindex+1, y index=3, col sep=semicolon] {stats.csv};
\end{semilogyaxis}
\end{tikzpicture}
\end{document}
And finally this Perl script allows me to paste any two lines from the CSV file and produces appropriate Markdown for the “improvement” lines in my posts:
#!/usr/bin/perl
my $first = 1;
my $commit;
my $alloc;
my $in_use;
my $time;
while (<>) {
/(.*);(.*);(.*);(.*)/ or die;
unless ($first) {
printf "**Improvement**: Allocations: %+.2f%% Memory: %+.2f%% Time: %+.2f%% (Commit [%s...%s](http://github.com/dfinity/winter/compare/%s...%s))\n",
(100 * ($2/$alloc - 1)),
(100 * ($3/$in_use - 1)),
(100 * ($4/$time - 1)),
$commit,
$1,
$commit,
$1;
}
$first = 0;
$commit = $1;
$alloc = $2;
$in_use = $3;
$time = $4;
}
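The percentages in those lines are relative changes, 100 * (new / old - 1), so a negative number means the later commit needs less. As a small worked example, here is a shell sketch of the same formula. The function name `improvement` is hypothetical and the input values are invented for illustration; it is not one of the scripts I used.

```shell
# Relative change from an old measurement ($1) to a new one ($2), in percent.
# Same formula as the Perl script: 100 * (new / old - 1).
# LC_ALL=C keeps the decimal point independent of the user's locale.
improvement() {
  LC_ALL=C awk -v old="$1" -v new="$2" \
    'BEGIN { printf "%+.2f%%\n", 100 * (new / old - 1) }'
}

improvement 200 150   # allocations dropped by a quarter: -25.00%
improvement 4 5       # memory grew by a quarter: +25.00%
```

The `%+.2f` format always prints the sign, which makes regressions (positive values) easy to spot next to improvements.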
Have something to say? You can post a comment by sending an e-Mail to me at <mail@joachim-breitner.de>, and I will include it here.