Joachim Breitner's Homepage
A Solution to the Configuration Problem in Haskell
On the drive back home from BelHac I thought about the configuration problem in Haskell: The issue is finding a convenient way to work with values that are initialized once and used in many places all over the code.
Assume you have a large module of pure code that, using many custom functions and combinators, parses some data structure. Later you notice that somewhere far down in the parser you need to react differently depending on some user preference, say, the user's preferred language. The usual solution is to add a new parameter to that function and, in consequence, to each and every function that calls it, or might call it, directly or indirectly. This is often very inconvenient.
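To make the inconvenience concrete, here is a minimal sketch of the "add a parameter everywhere" approach. All names (Config, greeting, parseField, parseRecord) are hypothetical and only illustrate the shape of the problem:

```haskell
-- Hypothetical configuration record; only the shape matters here.
data Config = Config { language :: String }

-- The function far down in the parser that actually needs the preference.
greeting :: Config -> String
greeting cfg
  | language cfg == "de" = "Hallo"
  | otherwise            = "Hello"

-- Every function on the call path has to accept and forward the
-- parameter, even though it does nothing with it itself.
parseField :: Config -> String -> String
parseField cfg name = greeting cfg ++ ", " ++ name

parseRecord :: Config -> [String] -> [String]
parseRecord cfg = map (parseField cfg)
```

Only greeting looks at the configuration, yet parseField and parseRecord both had to change their signatures and bodies to pass it along.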
Other solutions include:
- Using mutable references and some hacking with unsafePerformIO, which always gives the programmer a bad conscience.
- Using a Reader monad, requiring a rewrite of the whole program in monadic style.
- Using implicit parameters, which is OK if you did not write type signatures; if you did, you still have to modify a lot of them.
- Some advanced type hackery.
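As an illustration of the implicit-parameter option, the following sketch uses GHC's ImplicitParams extension. The function names and the ?language parameter are made up for this example; note how the bodies stay free of plumbing while every type signature still has to mention the parameter:

```haskell
{-# LANGUAGE ImplicitParams #-}

-- The function that actually consults the (hypothetical) preference.
greeting :: (?language :: String) => String
greeting
  | ?language == "de" = "Hallo"
  | otherwise         = "Hello"

-- Callers do not pass the value explicitly, but their signatures
-- still need the constraint once signatures are written down.
parseField :: (?language :: String) => String -> String
parseField name = greeting ++ ", " ++ name

parseRecord :: (?language :: String) => [String] -> [String]
parseRecord = map parseField

-- The implicit parameter is bound once, at the entry point.
runParser :: String -> [String] -> [String]
runParser lang fields = let ?language = lang in parseRecord fields
```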
The solution I thought of and implemented uses Template Haskell, GHC's facility for generating and transforming code at compile time, to turn the style you prefer to write in (pure code that uses configuration values as if they were global constants) into the style that is semantically correct (pure code with configuration values as an additional parameter). I uploaded the resulting code as seal-module to Hackage and added plenty of comments and examples to the SealModule module (⅔ are comments according to ohcount). I refrain from copying that into this blog post, so if you are curious, please continue reading there.
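As a rough picture of the two styles, the sketch below shows the "global constant" style in a comment and a hand-written version of the parameter-passing style underneath. This is not the literal output of seal-module; the names are hypothetical, and the assumption that every sealed function receives the parameter as a leading argument is an illustration only. The SealModule documentation is the authority on the details.

```haskell
-- Style you write in (shown as a comment, since it needs the
-- seal-module package and Template Haskell to compile):
--
--   sealModule [d|
--       language :: String
--       language = sealedParam
--
--       greeting :: String
--       greeting = if language == "de" then "Hallo" else "Hello"
--
--       parseField :: String -> String
--       parseField name = greeting ++ ", " ++ name
--       |]

-- Hand-written equivalent of the intended result: the configuration
-- value becomes an ordinary leading argument of each function.
greeting :: String -> String
greeting language = if language == "de" then "Hallo" else "Hello"

parseField :: String -> String -> String
parseField language name = greeting language ++ ", " ++ name
```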
Comments
Though currently the params can only be values, right? Sometimes one would like to abstract over types too.
And I think we should be able to use type family declarations to specify which types we want to abstract over.
e.g.
sealModule [d|
    type family IMap :: * -> *
    lookup :: Int -> IMap a -> Maybe a
    insert :: Int -> a -> IMap a -> IMap a
    lookup = sealedParam
    insert = sealedParam
    newtype MyMonad a = MM (State (IMap Foo) a)
    foo :: IMap Foo -> MyMonad Bar
    foo = ...
    |]
the above would produce code like:
newtype MyMonad imap a = MM (State (imap Foo) a)
foo :: (Int -> imap a -> Maybe a) -> (Int -> a -> imap a -> imap a) -> imap Foo -> MyMonad imap Bar
foo = ...
Used like this, it might appear to overlap a bit in scope with type classes, but I still think there are many cases where this style would be nicer.
{-# LANGUAGE TemplateHaskell, RecordWildCards #-}
module Test where
import Language.Haskell.SealModule
sealModule [d|
    lookup :: Int -> imap a -> Maybe a
    lookup = sealedParam
    insert :: Int -> a -> imap a -> imap a
    insert = sealedParam
    -- foo :: imap a -> imap a
    foo map = case lookup (1::Int) map of
        Just a  -> insert (2::Int) a map
        Nothing -> map
    |]
and the resulting function foo has a type of "foo :: (Int -> t -> Maybe t1) -> (Int -> t1 -> t -> t) -> t -> t".
But of course this is not an ideal solution.
For example, one idea I have been struggling with recently is to use an I18n framework in a Haskell web application, say with a framework like Snap.
Ideally, you would want to load the localization maps from gettext/yaml/json files into memory at the very start and keep them there throughout the run. As it stands, this doesn't seem to be possible in Snap as the Snap monad only has the Request, Response and Logger available during the processing cycle.
An obvious alternative would be to initialize the web server built on a reader monad that accepts an arbitrary GADT, which can be used to embed the I18n maps. But I am not sure if that is the performant thing to do...
Best,
OA
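The reader-based alternative mentioned in the comment above could look roughly like the following sketch. To stay self-contained it defines a tiny Reader type by hand instead of importing the one from mtl or transformers, and it models the translations as a plain association list; Translations, translate and welcomePage are hypothetical names and have nothing to do with Snap's actual API.

```haskell
import Data.Maybe (fromMaybe)

-- Hypothetical translation table: message key -> localized text.
type Translations = [(String, String)]

-- Minimal stand-in for the Reader monad from mtl/transformers.
newtype Reader r a = Reader { runReader :: r -> a }

instance Functor (Reader r) where
  fmap f (Reader g) = Reader (f . g)

instance Applicative (Reader r) where
  pure x = Reader (const x)
  Reader f <*> Reader g = Reader (\r -> f r (g r))

instance Monad (Reader r) where
  Reader g >>= k = Reader (\r -> runReader (k (g r)) r)

-- Fetch the whole environment.
ask :: Reader r r
ask = Reader id

-- Look up a message, falling back to the key itself if it is missing.
translate :: String -> Reader Translations String
translate key = do
  table <- ask
  pure (fromMaybe key (lookup key table))

-- A handler-like function that never mentions the table explicitly.
welcomePage :: String -> Reader Translations String
welcomePage user = do
  hello <- translate "hello"
  pure (hello ++ ", " ++ user)
```

The table is loaded once and supplied at the outermost call, for example runReader (welcomePage "Ann") table; everything below that point only reads it.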
So if you make sure any expensive calculations about the parameter are shared among the calls _into_ the sealed Module (which is the case if you bind it once in main), you should be good.
Nevertheless, it's what I've been doing in Java for years. I decided that it was better to be explicit about what's required by a function and that I would not use global variables, shared state or anything else not allowed in plain classical Haskell. I have seen too many horrible solutions for configuration: singletons, dynamic variables (fluid-let, thread locals, dynamic parameters). They eventually make the code impossible to unit test or to understand. So I took the explicit road even if it's not mainstream. I don't regret it. The code has fewer dependencies, is clearer and more robust.
Essentially, a configuration is a big read-only structure having its elements passed to different components of the system. I fail to see why it's not possible to accomplish the same thing in Haskell using just regular function parameters. If the problem is that too many modifications must be made to the function signatures in order to introduce a new configuration element, well, I'd say what we really need here is a refactoring tool. I've been using Eclipse for my Java development and it makes this kind of task very easy to perform.
A more difficult problem is when you need to refresh some configuration parameters during program execution. In order to support this, you have to convert everything in the call chain to monadic code (IO or Reader, I guess). This is much more dramatic than simply adding some parameters to existing functions.
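One possible shape for such refreshable configuration, sketched with an IORef from base, is shown below. Config, greet and reload are hypothetical names. The point is that the lookup itself stays pure, while every caller that needs the current value has to live in IO:

```haskell
import Data.IORef (IORef, newIORef, readIORef, writeIORef)

-- Hypothetical configuration record.
data Config = Config { language :: String }

-- The pure part still takes the configuration as a plain argument.
greeting :: Config -> String
greeting cfg
  | language cfg == "de" = "Hallo"
  | otherwise            = "Hello"

-- Anything that must see the current configuration becomes an IO
-- action, because it has to read the mutable reference.
greet :: IORef Config -> String -> IO String
greet ref name = do
  cfg <- readIORef ref
  pure (greeting cfg ++ ", " ++ name)

-- Swap in a new configuration while the program is running.
reload :: IORef Config -> Config -> IO ()
reload = writeIORef

-- Create the reference once at startup.
initConfig :: Config -> IO (IORef Config)
initConfig = newIORef
```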
One last thing concerning configuration. Sadly, it is not possible (or even conceivable, as far as I understand the Haskell type system) to perform reflection in Haskell. This would greatly ease the configuration process, as we could automatically create complex dependent data structures out of external declarations. The Java example of that is the Spring Framework bean factory, which allows the creation of a graph of Java objects that are used in the application. So, the Java developer never really needs to do anything to allow the system to be configurable. Correct me if I'm wrong, but in Haskell you'll always need to somehow read your configuration file and create your data by hand, the tough part being managing dependencies between those data.