Continuous Builds in Haskell, Part 2
In Part 1, we laid the groundwork for a simple directory file watcher that we could extend to construct our own continuous build system using Haskell. In this post, we’re going to add the next layer by integrating with our cabal project we want monitored.
This post applies to Linux users only, due to build dependencies. Hackage does not yet have a corresponding package for Windows.
This post also assumes a basic but passable familiarity with developing in Haskell. If you don’t know how to use cabal yet, this is probably a bit advanced for you.
What’s the Game Plan?
I’m going to walk you through a few basic steps. For the purposes of this exercise, we’re going to assume you are starting from the code developed in Part 1:
- We’re going to define a data type to hold our configuration details.
- When our event handler notifies us of a source file change, we’re going to act on that configuration.
- Lastly, we’ll need a synchronization object. Because our work may be long-running and we may get multiple file changed events while we’re still running, we don’t want to cause any conflicts. For our purposes, we’ll just ignore additional file change events while we’re currently working on a previous change.
Required Packages
If you don’t have it already, you’ll want to grab the stm package:
cabal update && cabal install stm
Humbly described as a “modular composable concurrency abstraction”, Software Transactional Memory is just about the greatest thing since Chuck Norris invented sliced bread. The stm package provides the backbone of Haskell’s version of this, with lock-free, high performance concurrency.
We’re barely going to scratch the surface of its power, though. We’ll just use it to track whether our continuous build system is currently doing any work.
Setting up our Imports
import Control.Concurrent.STM import Control.Concurrent.STM.TVar
Defining our Configuration
Diving into the real code, the first thing we’ll want is a way to store our configuration details. We’ll keep this pretty simple: Just a path to our project .cabal file and a flag to indicate that our build system is working (or not).
You probably already know all about various data types. Here’s our definition:
data Config = Config { confCabalFile :: FilePath , confWorking :: Bool } deriving (Show,Read)
We won’t use the automatically derived Show and Read types here, but I find it useful to include them in most of my data types.
Filtering Extensions
I’m going to jump a bit ahead to our filter function for file extensions. Because we’ll now be looking for both .hs and .cabal files, let’s perform a bit of refactoring on our original filterHS method. Let’s pull out the bit about scraping a file extension so that we can reuse it:
fileExt = reverse . takeWhile (/= '.') . reverse
Nothing special here, just a bit of currying. Then, modifying our original filterHS function, we’ll use this to find either .hs or .cabal files:
filterHS fp = fileExt fp == "hs" filterCabal fp = fileExt fp == "cabal"
… with a little sanity check modification to our main function to double-check that the .cabal file we’ll use actually exists before we start depending on it:
main = do args <- getArgs let dir = head args putStrLn $ "Watching directory: " ++ dir -- modifications start here contents <- getDirectoryContents dir let contents' = filter filterCabal contents case contents' of (x:_) -> runThread x dir [] -> do putStrLn "No cabal file found!" putStrLn "Exiting" runThread cabal dir = do -- /modifications end here n <- initINotify putStrLn "Press <Enter> to exit" print n wd <- addWatch n [ Modify, CloseWrite, Create, Delete, MoveIn, MoveOut ] dir eventHandler print wd getLine removeWatch wd killINotify n
As you can see, we’ve now split our main function to create a new runThread function to handle the work after sanity checking our input.
Introducing Concurrency
Now, let’s slip in a bit of concurrency. To see where we’re heading with this, I’ll show you the updated function definitions for our existing eventHandler and doWork functions, which we’re about to edit:
eventHandler :: TVar Config -> Event -> IO () doWork :: TVar Config -> FilePath -> IO ()
As you can see, we’re going to use our Config data type and wrap it inside a TVar. A TVar is an stm concurrency mechanism that allows for multiple threads to access the same data. The Hackage docs describe this well: “Shared memory locations that support atomic memory transactions.” I don’t know about you, but atomic transactions always make me feel warm and fuzzy inside. In memory, for more speedz!
To make use of our more speedz device, in our runThread method, let’s create a TVar from our Config data type and pass this little guy down to our eventHandler:
runThread cabal dir = do config <- newTVarIO $ Config cabal False -- create our TVar n <- initINotify putStrLn "Press <Enter> to exit" print n wd <- addWatch n [ Modify, CloseWrite, Create, Delete, MoveIn, MoveOut ] dir (eventHandler config) -- and pass it to the eventHandler -- ...
This simply creates a Config variable and uses newTVarIO to initialize a TVar object. Because this lives in the STM monad, we’ll need to capture the result using the <- binding operator. We use newTVarIO, rather than the more general newTVar, because we want to live in the IO monad without doing additional boilerplate lifting.
If you don’t know all about monads yet, feel free to glaze over that explanation and just remember: You need to use <- to access TVars.
Our eventHandler doesn’t do anything special, it just passes the TVar on through:
eventHandler :: TVar Config -> Event -> IO () eventHandler conf x@(Modified _ (Just fp)) = handleFilteredFile conf x fp eventHandler conf x@(MovedIn _ fp _) = handleFilteredFile conf x fp eventHandler conf x@(MovedOut _ fp _) = handleFilteredFile conf x fp eventHandler conf x@(Created _ fp) = handleFilteredFile conf x fp eventHandler conf x@(Deleted _ fp) = handleFilteredFile conf x fp eventHandler _ _ = return ()
Same thing for our handleFilteredFile function:
handleFilteredFile conf evt fp = do if filterHS fp then print evt >> doWork conf fp else return ()
On to the Good Stuff!
So, that was just some boilerplate to get things hooked up for consumption. It wasn’t as busy as it looked, because we were mostly just tweaking stuff we wrote in Part 1. Now, let’s make our doWork function do something useful for once! (Put that mai tai down!)
First, we’ll lift the configuration out of our TVar. If we’re already working, we’ll just inform the user and exit. If we’re not working, we’ll inform the user that new work is beginning, set the working flag, and do our work.
This is what our new doWork implementation looks like:
doWork :: TVar Config -> FilePath -> IO () doWork conf fp = do config <- readTVarIO conf if confWorking config then do print "Already working!" return () else do print "New work available!" atomically $ writeTVar conf (config { confWorking = True }) return ()
We use readTVarIO to grab the synchronized configuration. If new work can be run, we update the working flag by constructing a new Config, initialized from our old definition. We use writeTVar to synchronize and store this back into the TVar. We use atomically because writeTVar lives in the STM monad, whereas doWork is acting in the IO monad (as shown in our function type definition).
Oh, and we’re still not actually doing any real work, so you can pick up that mai tai again. We’ll save that bit for the next post in the series.
(Technically, the only part of this operation that we’re making transactional is the writing of the confWorking flag. The actual work we’re doing – or not doing – is not part of the transaction. But for the purposes of this example, it’s Good Enough.)
Wrapping up
In the next part, we’ll actually put our cabal reference to work.
As always, the full cabal project is available on https://github.com/stormont/continuous-hs. To get the version used in this post, use the tag part-2.
For a better side-by-side comparison of what we’ve changed, check out this diff.
Here’s the full code we’ve developed:
import Control.Concurrent.STM import Control.Concurrent.STM.TVar import System.Directory import System.Environment (getArgs) import System.INotify import System.IO data Config = Config { confCabalFile :: FilePath , confWorking :: Bool } deriving (Show,Read) main = do args <- getArgs let dir = head args putStrLn $ "Watching directory: " ++ dir contents <- getDirectoryContents dir let contents' = filter filterCabal contents case contents' of (x:_) -> runThread x dir [] -> do putStrLn "No cabal file found!" putStrLn "Exiting" runThread cabal dir = do config <- newTVarIO $ Config cabal False n <- initINotify putStrLn "Press <Enter> to exit" print n wd <- addWatch n [ Modify, CloseWrite, Create, Delete, MoveIn, MoveOut ] dir (eventHandler config) print wd getLine removeWatch wd killINotify n eventHandler :: TVar Config -> Event -> IO () eventHandler conf x@(Modified _ (Just fp)) = handleFilteredFile conf x fp eventHandler conf x@(MovedIn _ fp _) = handleFilteredFile conf x fp eventHandler conf x@(MovedOut _ fp _) = handleFilteredFile conf x fp eventHandler conf x@(Created _ fp) = handleFilteredFile conf x fp eventHandler conf x@(Deleted _ fp) = handleFilteredFile conf x fp eventHandler _ _ = return () handleFilteredFile conf evt fp = do if filterHS fp then print evt >> doWork conf fp else return () filterHS fp = fileExt fp == "hs" filterCabal fp = fileExt fp == "cabal" fileExt = reverse . takeWhile (/= '.') . reverse doWork :: TVar Config -> FilePath -> IO () doWork conf fp = do config <- readTVarIO conf if confWorking config then do print "Already working!" return () else do print "New work available!" atomically $ writeTVar conf (config { confWorking = True }) return ()
2 comments