Joachim Breitner's Homepage
GHC 7.4.1 speeds up arbtt by a factor of 22
More than two years ago I wrote arbtt, a tool that silently records which programs you are using and lets you compute statistics on that data later, based on rules that you define afterwards; hence the name: automatic rule-based time tracker. I haven’t done much with it recently (the last release was half a year ago), but it has nevertheless been running on my machine and has by now tracked a total time span of 248 days in 350000 records.
Yesterday, I had a use for it again: measuring the time spent creating a certain document with LaTeX. So I added a rule to my categorize.cfg and ran arbtt-stats. I knew that it was not very fast, and that my data set had grown considerably since I last used it. And indeed, it took more than 6 minutes to process the data and spit out the result.
Since I’m currently working on the GHC 7.4.1 transition in Debian anyway, I decided to check what happens if I compile the code with that version of the Haskell compiler instead of the previous version, 7.0.4. And behold: the whole process took merely 17.3 seconds to complete! At first I did not believe it, but the result was identical, and both binaries were built with the same options, i.e. no profiling enabled or anything like that. Wouldn’t you also like to have such speed-ups for free, just by waiting for someone else to improve their work?
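As a rough sanity check on the factor in the title: the exact runtime of the old binary is not stated above, only that it was "more than 6 minutes", so the figures below are a sketch. The symbols $t_{7.0.4}$ and $t_{7.4.1}$ are introduced here just for this calculation.

```latex
% Lower bound on the speedup, taking exactly 6 minutes (360 s) for the old binary:
\[
  \frac{t_{7.0.4}}{t_{7.4.1}} > \frac{360\,\mathrm{s}}{17.3\,\mathrm{s}} \approx 20.8
\]
% Conversely, a factor of 22 would correspond to an old runtime of roughly
\[
  22 \times 17.3\,\mathrm{s} \approx 381\,\mathrm{s} \approx 6\,\mathrm{min}\ 21\,\mathrm{s},
\]
% which is consistent with "more than 6 minutes".
```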
Comments
First I have to compile ghc6 6.8 with ghc6 6.6 (started that about two days ago), then ghc6 6.12 with that, then I hopefully can use that to build ghc 7.4… and hscolour, which has an indirect B-D on itself (luckily, the version in m68k is barely the minimum needed to satisfy it).
hugs98 built quicker, but doesn’t seem to be able to be used for building ghc…
(Not that I even speak Haskell, but it features prominently in Debian recently, so I figured I better try to have it keep up.)
Also when bootstrapping happy and alex (I think), you’ll find that the upstream tarball contains the generated files required to bootstrap them, but be careful: debian/rules clean removes them. Send d-haskell a mail if you need help.
Simon Marlow looked into the issue and found that 99.5% of the time was spent in the garbage collector, which he subsequently changed. This resulted in a tremendous speedup (~35), and I would not be surprised if you are profiting from the same change.
See:
http://hackage.haskell.org/trac/ghc/ticket/5505