
Connected Components, Concurrently

Andrey — Thu, 26 Sep 2019 14:59:10 +0000

Computing connected components in an undirected graph is one of the most basic graph problems. Given a graph with n vertices and m edges, you can find its components in linear time O(n + m) using depth-first or breadth-first search. But what if you need to go faster? In this blog post, I will describe a cool new concurrent algorithm for this problem, which I learned this week at the Heidelberg Laureate Forum from Robert Tarjan himself. The algorithm distributes the work among n + m tiny processors that work concurrently most of the time and requires O(log n) global synchronisation rounds. The algorithm is remarkably simple but it’s far from obvious that it works correctly and efficiently. Happily, Tarjan and his co-author S. Cliff Liu have done all the hard proofs in their recent paper, so we can simply take the algorithm and use it.

First, let’s recap the classic solution based on the depth-first search. I’ll use my favourite graph library Alga, so my examples will be in Haskell. Below I create an example undirected graph and compute the number of connected components by counting trees in the depth-first search forest. The graph and the forest are shown in the figure; the edges that belong to the forest are directed to illustrate the order of graph traversal.

λ> import Algebra.Graph.Undirected
λ> example = edges [ (1,6), (2,6), (3,7), ... ]
λ> length $ dfsForest $ fromUndirected example
3
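For comparison, here is a minimal sketch of the same counting idea using Data.Graph from the containers package instead of Alga. The helper name countComponents and the tiny graph are made up for illustration; this is not the example graph from the figure.

```haskell
import Data.Graph (buildG, components)

-- Count the connected components of an undirected graph whose vertices are
-- numbered lo..hi, given its edge list. Data.Graph.components ignores edge
-- direction and returns one depth-first tree per component.
countComponents :: (Int, Int) -> [(Int, Int)] -> Int
countComponents bounds es = length (components (buildG bounds es))

-- A small hypothetical graph with components {1,2,3}, {4,5} and the
-- isolated vertex 6.
small :: [(Int, Int)]
small = [(1, 2), (2, 3), (4, 5)]
```

With these definitions, countComponents (1, 6) small evaluates to 3.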

This approach is very simple and you should definitely use it, provided that the linear complexity O(n + m) is fast enough for your application. But what if you need to go faster? Some time ago my colleagues and I wrote a paper where we showed how to take advantage of concurrency by implementing the breadth-first search on an FPGA, resulting in better time complexity O(d) where d is the diameter of the graph. This was an excellent result for our application where graphs had a small diameter, but in general, the diameter can be as large as n. So can we do better?

Yes! Tarjan’s paper presents several faster algorithms. It also describes the algorithms in a very nice compositional manner:

To play with and understand these algorithms, let’s translate the above definitions to Haskell, trying to preserve their clarity and conciseness while also being explicit about details. Jumping ahead a little, here is what Algorithm P will look like in the end:

algorithmP = repeat (parentConnect >>> update >>> shortcut)

Pretty close! Now, let’s define the primitives parentConnect etc. in terms of the underlying computational model where a lot of tiny processing threads communicate via short messages, in rounds. In a round, threads concurrently receive some messages, then update their local states, and possibly send out new messages for the next round. We can capture the local view of a computation round by the following type:

type Local s t i o = s -> t -> [i] -> (s, [(t, o)])

Here s is a type of states, t is a type of threads, and i and o are types of incoming and outgoing messages in a round of computation. In words, given the current state of a thread, the thread’s own identifier and a list of incoming messages, a Local function returns an updated state and a list of outgoing messages, each tagged with a target thread identifier; these messages will be delivered in the next computation round.

We can upload Local functions to our tiny processors, say on an FPGA, inject input messages into the communication network, and extract the outputs after a computation round, i.e. when all threads complete the computation. This requires specialised hardware and non-trivial setup, so let’s find a way to simulate such computations on a big sequential machine that has enough memory to have the global view of a round:

type Global s t i o = (Map t s, [(t, i)]) -> (Map t s, [(t, o)])

The Global function takes a map from threads to their current states and a list of all input messages in the network and returns the resulting global state: a map of new thread states and a list of newly generated output messages. Note that such global computations can be composed using function composition and form a category; the operator >>> in the above code snippet for Algorithm P is just the left-to-right composition defined in the standard module Control.Category.

Converting from a Local to a Global view is relatively straightforward:

round :: Ord t => Local s t i o -> Global s t i o
round local (states, messages) = collect (Map.mapWithKey update states)
  where
    deliveries = Map.fromAscList (groupSort messages)
    update t s = local s t (Map.findWithDefault [] t deliveries)
    collect    = runWriter . traverse writer

We first find all the deliveries, i.e. lists of incoming messages that should be delivered to each thread, then execute the local update functions of each thread, sequentially (note that this is OK since threads interact only between rounds), and finally, collect outgoing messages of all threads.
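To see the three steps in isolation, here is a self-contained sketch that depends only on containers. It is named runRound to avoid clashing with the Prelude function round, it groups messages with Map.fromListWith rather than groupSort, and it collects outputs without Writer. The toy protocol accumulate is hypothetical and serves only to exercise the simulator.

```haskell
import qualified Data.Map.Strict as Map
import Data.Map.Strict (Map)

type Local  s t i o = s -> t -> [i] -> (s, [(t, o)])
type Global s t i o = (Map t s, [(t, i)]) -> (Map t s, [(t, o)])

-- Simulate one round: deliver messages, run every thread, collect outputs.
runRound :: Ord t => Local s t i o -> Global s t i o
runRound local (states, messages) =
    (Map.map fst results, concatMap snd (Map.elems results))
  where
    -- Group incoming messages by target thread, preserving their order.
    deliveries = Map.fromListWith (flip (++)) [ (t, [i]) | (t, i) <- messages ]
    -- Threads only interact between rounds, so running them one by one is fine.
    results    = Map.mapWithKey step states
    step t s   = local s t (Map.findWithDefault [] t deliveries)

-- Hypothetical toy protocol on threads 0, 1 and 2: every thread adds the
-- numbers it receives to its state and forwards the new total to the next
-- thread (modulo 3).
accumulate :: Local Int Int Int Int
accumulate s t inbox = (s', [((t + 1) `mod` 3, s')])
  where
    s' = s + sum inbox
```

Starting from three threads with state 0 and the messages [(0,5),(0,7),(2,1)], one round leaves the states at 12, 0 and 1 and produces the outgoing messages [(1,12),(2,0),(0,1)].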

To express Tarjan’s algorithms, we will need the generic repeat function that executes a given Global computation repeatedly until thread states stop changing.

repeat :: (Eq s, Eq t) => Global s t Void Void -> Global s t Void Void
repeat g (states, messages)
    | states == newStates = (states, messages)
    | otherwise           = repeat g (newStates, newMessages)
  where
    (newStates, newMessages) = g (states, messages)

Note that we require that the network is quiescent before and after a given computation, as indicated by the type of incoming and outgoing messages i = o = Void. Of course, the computation may consist of multiple rounds, which can exchange messages with each other!
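Stripped of the message plumbing, repeat is an ordinary "iterate until nothing changes" loop. The sketch below shows that pattern on plain values; untilStable and spreadMin are illustrative names, not part of the code in this post.

```haskell
-- Apply a step function until its result stops changing.
-- Like repeat above, this loops forever if the step never stabilises.
untilStable :: Eq a => (a -> a) -> a -> a
untilStable step x
    | x' == x   = x
    | otherwise = untilStable step x'
  where
    x' = step x

-- Hypothetical step: every element takes the minimum of itself and its
-- left neighbour, so small values spread to the right one position per step.
spreadMin :: [Int] -> [Int]
spreadMin xs = zipWith min xs (take 1 xs ++ xs)
```

For instance, untilStable spreadMin [3,5,1,4,2] goes through [3,3,1,1,2] and then settles on [3,3,1,1,1].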

Now let’s use the above definitions to express Algorithms P and R from Tarjan’s paper. We’ll have n + m threads corresponding to vertices and edges, whose states will be very minimalistic: edges will have no state at all, and every vertex will store its current parent, i.e. the minimum vertex of the connected component to which the vertex is currently assigned.

type Vertex = Int

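data Thread = VertexThread Vertex | EdgeThread Vertex Vertex
    deriving (Eq, Ord)  -- threads are used as Map keys
data State  = VertexState  Vertex | EdgeState
    deriving Eq         -- repeat compares states between iterations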

A vertex that is its own parent is called a root; all vertices are initially roots. If this reminds you of the disjoint-set data structure, you are on the right track! All algorithms follow the same general idea: we start by assigning each vertex to a separate component and then use edges to inform their neighbouring vertices about other reachable vertices in the component, maintaining the invariant that the root of a component is its smallest vertex. As rounds progress, we grow a parent forest, not dissimilar to the forest produced by the depth-first search for our earlier example graph, but now taking advantage of concurrency.

Let’s also create a convenient type synonym for denoting global views of computation rounds involving s = State and t = Thread:

type (~>) i o = Global State Thread i o

We can now express computation primitives, such as connect, where edges inform their neighbours about each other:

connect :: Void ~> Vertex
connect = round $ \s t _ -> case t of
    VertexThread _ -> (s, [])
    EdgeThread x y -> (s, [(VertexThread (max x y), min x y)])

The type Void ~> Vertex says that the round starts with no messages in the network and ends with messages carrying vertices. Vertex threads are dormant in the round, whereas edge threads generate one message each, sending a smaller vertex to the thread corresponding to the larger vertex so that the latter could update its parent. Note that we express the behaviour locally, and then use the function round defined above to obtain its Global semantics.

This connect round can be followed by the update round, where vertex threads process incoming messages, updating their parents accordingly:

update :: Vertex ~> Void
update = round $ \s _ i -> case s of
    VertexState p -> (VertexState $ minimum (p : i), [])
    EdgeState     -> (s, [])

The primitives connect and update are not sufficient on their own; we need another crucial ingredient — the shortcut primitive that halves the depths of trees in the parent forest, similarly to the path compression technique used in the disjoint-set data structure.

shortcut :: Void ~> Void
shortcut = request >>> respondParent >>> update
  where
    request :: Void ~> Thread
    request = round $ \s t _ -> case s of
        VertexState p -> (s, [(VertexThread p, t)])
        EdgeState     -> (s, [])

respondParent :: Thread ~> Vertex
respondParent = round $ \s _ i -> case s of
    VertexState p -> (s, map (,p) i)
    EdgeState     -> (s, [])

In shortcut, every vertex requests the parent of its parent sending itself as a “respond-to” address. In the subsequent round, each vertex thread responds by sending its parent. The process completes by running the update primitive defined above. Note how we can compose simple primitives together in a type-safe way, obtaining a non-trivial shortcut.
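The effect of shortcut on the parent forest is easier to see without the messaging. The sketch below applies the same "adopt your grandparent" step directly to a parent map; shortcutStep and chain are illustrative names, and treating vertices missing from the map as roots is an assumption of this sketch.

```haskell
import qualified Data.Map.Strict as Map
import Data.Map.Strict (Map)

type Vertex = Int

-- One shortcut step on a parent map: every vertex adopts its grandparent.
-- The min mirrors the update primitive, which keeps the smaller parent.
shortcutStep :: Map Vertex Vertex -> Map Vertex Vertex
shortcutStep parents = Map.map grandparent parents
  where
    grandparent p = min p (Map.findWithDefault p p parents)

-- A path 5 -> 4 -> 3 -> 2 -> 1 where vertex 1 is the root.
chain :: Map Vertex Vertex
chain = Map.fromList [(1, 1), (2, 1), (3, 2), (4, 3), (5, 4)]
```

One step turns the chain into [(1,1),(2,1),(3,1),(4,2),(5,3)], reducing the depth from 4 to 2, and a second step attaches every vertex directly to the root.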

The last primitive that I’ll cover is parentConnect, a variation of connect that informs the parents of two edge vertices about each other:

parentConnect :: Void ~> Vertex
parentConnect = request >>> respondParent >>> receive
  where
    request :: Void ~> Thread
    request = round $ \s t _ -> case t of
        VertexThread _ -> (s, [])
        EdgeThread x y -> (s, [(VertexThread x, t), (VertexThread y, t)])

    receive :: Vertex ~> Vertex
    receive = round $ \s _ i -> case s of
        VertexState _ -> (s, [])
        EdgeState     -> case i of
            [x, y] -> (s, [(VertexThread (max x y), min x y)])
            _      -> error "Unexpected number of responses"

We start by requesting parents of edge vertices, continue by sending the responses using the respondParent primitive defined above, and finally receive and handle the responses in a way similar to connect.

All the ingredients required for expressing Algorithm P are now in place:

algorithmP :: Void ~> Void
algorithmP = repeat (parentConnect >>> update >>> shortcut)

Great! My favourite algorithm from the paper is Algorithm R, which is a slight variation of Algorithm P:

algorithmR :: Void ~> Void
algorithmR = repeat (parentConnect >>> rootUpdate >>> shortcut)

Here we use rootUpdate instead of update, whose only difference is that updates to non-root vertices are ignored. This makes the growth of the parent forest monotonic: we never exchange subtrees between trees and only graft whole trees to other trees. This greatly simplifies the analysis of the performance of Algorithm R compared to Algorithm P. In fact, according to Tarjan, analysis of Algorithm P is still an open problem! I encourage you to read the paper, which presents five (!) algorithms for computing connected components concurrently, and also to implement the two remaining primitives, extendedConnect and alter, required for expressing Algorithms E, A and RA using our little modelling framework.
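The difference between the two update rules can be sketched on plain maps. Here inbox stands for the messages delivered to each vertex; plainUpdate and rootUpdateStep are illustrative names for this sketch and are not the definitions from the complete source code.

```haskell
import qualified Data.Map.Strict as Map
import Data.Map.Strict (Map)

type Vertex = Int

plainUpdate, rootUpdateStep
    :: Map Vertex [Vertex] -> Map Vertex Vertex -> Map Vertex Vertex

-- Every vertex takes the minimum of its parent and the offered vertices.
plainUpdate inbox =
    Map.mapWithKey (\v p -> minimum (p : Map.findWithDefault [] v inbox))

-- Only roots (vertices that are their own parent) accept offers.
rootUpdateStep inbox = Map.mapWithKey step
  where
    step v p
        | p == v    = minimum (p : Map.findWithDefault [] v inbox)
        | otherwise = p
```

With parents [(1,1),(2,2),(3,2),(4,4)] and offers 1 for vertex 3 and 2, 3 for vertex 4, plainUpdate moves vertex 3 under 1 (pulling it out of the tree rooted at 2), whereas rootUpdateStep leaves vertex 3 alone and only grafts the root 4 under 2.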

Let’s check if our implementation of Algorithm P works correctly on the example graph:

initialise :: Graph Int -> Map Thread State
initialise g = Map.fromList (vs ++ es)
  where
    vs = [ (VertexThread x, VertexState x) | x      <- vertexList g ]
    es = [ (EdgeThread x y, EdgeState    ) | (x, y) <- edgeList   g ]

run :: Global s t Void Void -> Map t s -> Map t s
run g m = fst $ g (m, [])

components :: Map Thread State -> [(Int, [Int])]
components m = groupSort
    [ (p, x) | (VertexThread x, VertexState p) <- Map.toList m ]

λ> mapM_ print $ components $ run algorithmP $ initialise example
(1,[1,2,3,4,5,6,7,8,9,10,11,13,15,18,22,23,24,25,27,29,31,32,35,38])
(12,[12,14,16,17])
(19,[19,20,21,26,28,30,33,34,36,37])

As expected, there are three components with roots 1, 12 and 19, which matches the figure at the top of the blog post.
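As a cross-check that does not depend on the round simulator, the three phases of Algorithm P can also be mimicked sequentially on a parent map. This is a sketch under simplifying assumptions: messages are replaced by direct lookups, and the names connectStep, shortcutStep and algorithmPSeq are made up for illustration.

```haskell
import qualified Data.Map.Strict as Map
import Data.Map.Strict (Map)

type Vertex  = Int
type Parents = Map Vertex Vertex

parentOf :: Parents -> Vertex -> Vertex
parentOf ps v = Map.findWithDefault v v ps

-- parentConnect >>> update: every edge offers the smaller of its endpoint
-- parents to the larger one; each vertex keeps the minimum it has seen.
connectStep :: [(Vertex, Vertex)] -> Parents -> Parents
connectStep es ps = Map.mapWithKey adopt ps
  where
    offers = Map.fromListWith min
        [ (max px py, min px py)
        | (x, y) <- es, let px = parentOf ps x, let py = parentOf ps y ]
    adopt v p = maybe p (min p) (Map.lookup v offers)

-- shortcut: every vertex adopts its grandparent.
shortcutStep :: Parents -> Parents
shortcutStep ps = Map.map (\p -> min p (parentOf ps p)) ps

-- Start with every vertex as a root and iterate until nothing changes.
algorithmPSeq :: [Vertex] -> [(Vertex, Vertex)] -> Parents
algorithmPSeq vs es = go (Map.fromList [ (v, v) | v <- vs ])
  where
    go ps
        | ps' == ps = ps
        | otherwise = go ps'
      where
        ps' = shortcutStep (connectStep es ps)
```

On the small graph with vertices 1 to 7 and edges (1,2), (2,3), (3,4), (5,6), this stabilises with vertices 1 to 4 pointing to 1, vertices 5 and 6 pointing to 5, and vertex 7 remaining its own root.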

And here is an animation of how Algorithm R builds up the parent forest in a monotonic manner. Note that the edges in the parent forest are now shown as undirected (unlike in the earlier depth-first search figure) since they may change their logical direction during the execution of the algorithm (e.g. see the edge 4-9).

That’s all for now! I hope to find time soon to try these algorithms on Tinsel, a multi-FPGA hardware platform with thousands of processors connected by a fast network. Tinsel is being developed at the University of Cambridge as part of the POETS research project, which I worked on while at Newcastle University. Tinsel is very cool! Have a look at it if you are into FPGAs and unconventional computing architectures.

P.S.: If you haven’t heard about the Heidelberg Laureate Forum before, you can read about it here. If you are a young researcher in computer science or mathematics, you should consider applying — it’s an amazing event where you get a chance to talk to the leaders of the two fields, as well as to young researchers from all over the world. This year I’m here as part of the HLF blogging team, writing about people I met and things I learned here, and it’s been a lot of fun too — if you’d like to help cover and promote this event, get in touch with the HLF media team.

P.P.S.: And here are a couple of links: (i) a video recording of Tarjan’s HLF lecture and (ii) the complete source code from this blog post.

Moving to Jane Street

Andrey — Sun, 01 Sep 2019 22:00:08 +0000

Exciting times! I’m moving from Newcastle University to Jane Street London to work on the OCaml build system Dune. I’ve received a lot of questions about this move already, so I decided to write down some answers in a blog post.

I’ve been studying and then working at Newcastle University for 14 years and enjoyed this time tremendously. People there, both in the streets and in the university, are warm and friendly so I felt welcome from Day 1. Academic life isn’t perfect for everyone but it was perfect for me. (By the way, I’m happy to recommend Newcastle University to anyone who is thinking of an academic career: it’s a great place and I’m happy to introduce you to my former colleagues.)

What I like about academia is the freedom of exploring new topics without asking anyone’s permission. In particular, in 2014 I unexpectedly started working on build systems, first practically by developing a new build system for the Glasgow Haskell Compiler and then theoretically by exploring the space of existing build systems in the paper Build Systems à la Carte. This led me to the build system Dune, which didn’t quite fit the model described in the paper, so I started collaborating with Jane Street to understand Dune better and we wrote another theoretical paper on “selective functors” together.

I find build systems fascinating and would like to dedicate some time to work on them. When I heard that the Dune team was looking for new developers I thought that this would be a great opportunity to both deepen my understanding of build systems, and also do something I wanted to do for a while — get some experience of working in industry and see if it would be a good environment for me too.

Like other members of the Tools and Compilers team at Jane Street, I plan to continue being involved in programming languages research. Fortunately, Dune and a few other Jane Street projects are open source, which makes it possible to openly collaborate and publish. I’ll also stay in touch with my former academic colleagues; Newcastle University has kindly granted me the Visiting Fellow status, so if you’ve been in touch with me via my university email, don’t worry, it will still work. Looks like this blog is still up too, although I’d like to set up a standalone blog at some point.

To complete the picture of my new place of work I’d like to highlight a few Jane Street tech talks: algebraic effects, incremental computation, compiling OCaml to FPGAs, as well as a more general talk about the use of OCaml at Jane Street. Not sure I’ll have time to contribute to any of these projects as I’m going to focus on Dune first, but it’s exciting to work in such an environment. My new Day 1 is tomorrow!

P.S.: If you are in London on 1-2 October, come along to the Build Meetup organised by Cloudflare, Bloomberg and Google where I’m giving a talk.

Stroll: an experimental build system

Andrey — Fri, 26 Jul 2019 06:23:36 +0000

I’d like to share an experiment in developing a language-agnostic build system that does not require the user to specify dependencies between individual build tasks. By “language-agnostic” I mean the user does not need to learn a new language or a special file format for describing build tasks — existing scripts or executable files can be used as build tasks directly without modification.

I call this build system Stroll because its build algorithm reminds me of strolling through a park where you’ve never been before, trying to figure out an optimal path to your target destination, and likely going in circles occasionally until you’ve built up a complete mental map of the park.

Main idea

Most build systems require the user to specify both build tasks as well as dependencies between the tasks: knowing the dependencies allows the build system to determine which tasks need to be executed when you modify some of the source files. From personal experience, describing dependencies accurately is difficult and is often a source of frustration.

Build tasks and their dependencies are typically described using domain-specific task description languages: Make uses makefiles, Bazel uses a Python-inspired Starlark, Shake uses Haskell, etc. While learning a new task description language is not a big deal, translating an existing build system to a new language may take years (or maybe I’m just slow).

Stroll is different. You use your favourite language(s) to describe build tasks, put them in a directory, and ask Stroll to execute them. Stroll does not look inside the tasks; it treats the tasks as black boxes and finds the dependencies between them by tracking their file accesses. This process is not optimal in the sense that a task may fail because its dependency is not ready and, therefore, will need to be built again later, but in the end, Stroll will learn the complete and accurate dependency graph and will store it to speed up future builds.

There is a build system called Fabricate that also tracks file accesses to automatically compute accurate task dependencies, but it requires the user to describe build tasks in a lightweight Python DSL and preschedule them by calling the tasks in the right order, essentially asking the user to do the strolling part themselves. Fabricate is cool and inspired Stroll but I’d like to explore this corner of the build systems space a bit further.

Demo

Stroll is just an experiment and I’m not sure if the main idea behind it is feasible, but it can already be used to automate some simple collections of build tasks. Let me give you a demo. I’m using Windows below but the demo works on Linux too.

Consider a simple “Hello, World!” program stored in the file src/main.c:

#include <stdio.h>

int main()
{
    printf("Hello, World!\n");
    return 0;
}

We can build an executable bin/main.exe from it by using the following simple build script build/main.bat:

mkdir bin
gcc src/main.c -o bin/main.exe

Let’s ask Stroll to build our little project:

$ stroll build
Executing build/main.bat...
Done

$ bin/main.exe
Hello, World!

When executing build tasks, Stroll tracks their file accesses and stores the discovered information next to the build scripts, by appending the extension .stroll to their names:

$ cat build/main.bat.stroll
exit-code: ExitSuccess
operations:
  bin/main.exe:
    write: efc851e573be26cf8fe726caf70facf924ccdbae5c4fce241fdbe728b3abde76
  src/main.c:
    read: bc31bb10c238be7ee34fd86edec0dc99d145f56067b13b82c351608efd38763c
  build/main.bat:
    read: da9a4390693741b8d52388f18b1c5ccc418531bc3b0a64900890c381a31e7839

As you can see, Stroll recorded the exit code of the script, two file reads and one file write, along with the corresponding hashes of file contents.

We can ask Stroll to visualise the discovered dependency graph:

$ stroll -g build | dot -Tpng -Gdpi=600 -o graph.png

The flag -g tells Stroll to print out the discovered dependency graph in the DOT format that we subsequently convert to the following PNG file:

The green box indicates that the only build task main is up-to-date. This means Stroll will not execute it in the next run unless any of its inputs or outputs change. If we modify src/main.c and regenerate the dependency graph, we can see that the task main is now out-of-date:

The specific dependency that is out-of-date is shown by a dashed edge. Note that Stroll is a self-tracking build system, i.e. it tracks changes not only in sources and build artefacts but in build tasks too.
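The up-to-date check described above can be modelled in a few lines. This is a simplified sketch and not Stroll's actual code: the types Hash, Operation and Trace are hypothetical, the exit code is not modelled, and the hashes in any example are placeholders.

```haskell
import qualified Data.Map.Strict as Map
import Data.Map.Strict (Map)

type Hash = String

-- What a task did to a file the last time it ran, with the hash it saw.
data Operation = Read Hash | Write Hash
    deriving (Eq, Show)

-- Recorded operations of one task, keyed by file path, mirroring the
-- "operations" section of a .stroll file.
type Trace = Map FilePath Operation

recordedHash :: Operation -> Hash
recordedHash (Read  h) = h
recordedHash (Write h) = h

-- In this model a task is up-to-date when every file it read or wrote
-- still exists and still has the recorded content hash.
upToDate :: Map FilePath Hash -> Trace -> Bool
upToDate current trace = and
    [ Map.lookup file current == Just (recordedHash op)
    | (file, op) <- Map.toList trace ]
```

In this model, editing src/main.c changes its current hash, so the check fails for the task main; editing the build script or deleting the output has the same effect, which matches the self-tracking behaviour described above.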

To make the example a bit more interesting, let’s add a library providing a greeting function greet in files src/lib/lib.h and src/lib/lib.c:

$ cat src/lib/lib.h
void greet(char *name);

$ cat src/lib/lib.c
#include <stdio.h>
#include <lib.h>

void greet(char *name)
{
    printf("Hello, %s!\n", name);
}

To compile the library we will add a new build task build/lib.bat:

mkdir bin/lib
gcc -Isrc/lib -c src/lib/lib.c -o bin/lib/lib.o

If we look at the dependency graph before running Stroll we’ll see:

The new task lib appeared in the graph but without any dependencies; it is marked as out-of-date because Stroll has never executed it. If we run Stroll now, it will execute both tasks reaching the following state:

The tasks are currently independent and can be built in parallel but let’s make use of the library by modifying the main source file as follows:

#include <lib.h>

int main()
{
    greet("World");
    return 0;
}

If we run Stroll now, the build will fail:

$ stroll build
Executing build/main.bat...
Script build/main.bat has failed.
Done

$ cat build/main.bat.stderr
A subdirectory or file bin already exists.
src\main.c:2:17: fatal error: lib.h: No such file or directory
compilation terminated.

By examining the file build/main.bat.stderr helpfully created by Stroll, we can see that the build failed because we forgot to modify the build script and point gcc to our library. An error is visualised by a task with a double border:

Let’s fix the script build/main.bat:

mkdir bin
gcc -Isrc/lib src/main.c bin/lib/lib.o -o bin/main.exe

With this fix, Stroll completes successfully and produces the following dependency graph:

We can now demonstrate the early cut-off feature by adding a comment to the file src/lib/lib.c. Before running Stroll let’s check that this change makes both lib and main out-of-date:

As you can see, both tasks are now marked as out-of-date: lib’s direct input has changed, which transitively also affects main. Let’s Stroll:

$ stroll build
Executing build/lib.bat...
Done

Stroll has rebuilt the library but the file bin/lib/lib.o didn’t change, thus restoring the up-to-date status of the task main.

Finally, to clean up after our experiments, let’s create a new task clean and place it into a different directory: clean/clean.bat.

$ cat clean/clean.bat
rm bin/lib/lib.o
rm bin/main.exe

$ stroll clean
Executing clean/clean.bat...
Done

The corresponding dependency graph shows two outputs — the deleted binary files.

As you might have noticed, Stroll uses directories as collections of build tasks related to a common build target. To build a target we simply run Stroll in the corresponding directory. If you’d like to build just a single task from a directory, you can specify the full path, for example:

$ stroll build/lib.bat
Executing build/lib.bat...

This executes the lib task (regardless of its current status). Note that the main task is currently out-of-date because clean deleted its output:

To complete the build, run stroll build, which will execute only the main task, as desired.

Challenge: Try to orchestrate a situation where Stroll executes one of the tasks twice, hence demonstrating that Stroll is not a minimal build system.

Under the hood

Stroll is implemented in ~400 lines of Haskell (including comments) and uses the following libraries in addition to the standard ones:

Shake’s cmd function is used to execute build tasks tracking their file accesses. This relies on Shake’s support of the fsatrace utility.
The algebraic graphs library Alga is used for graph construction, traversal and visualisation.
Libraries cryptonite and yaml are used for computing file content hashes and serialising/deserialising YAML files.

Many thanks to everyone who contributed to these projects!

You are welcome to browse the source code of Stroll and/or play with it, but be warned: it has a few serious limitations (discussed below), and I’m not sure they will ever be fixed.

Limitations

Stroll is fun and I consider it a successful experiment but the current implementation has a few serious limitations:

The fsatrace utility does not track reads from non-existing files, which means Stroll cannot determine that a task is out-of-date because a file that was previously missing has now appeared. Directory scans are not tracked either, that’s why I didn’t use a command like rm -rf bin/* in the clean script. A model of Stroll with ideal tracking that is free from these limitations is available here.
Although fsatrace can track file movement, e.g. mv src dst, Stroll does not yet support it and terminates with an error when it detects a move.
The current implementation is completely sequential, i.e. build tasks are executed one by one. It’s possible to make Stroll parallel, but it’d require quite a bit of engineering. I might do it someday.
Stroll is not a cloud build system although it’s possible to make it one by adding shared storage of artefacts keyed by their hashes. For more details see the Build Systems à la Carte paper.

Final remarks

One interesting aspect that I haven’t demonstrated above is how one can mix and match different languages when writing individual build tasks. For example, we could use Shake for compiling C source files into object files. Note that Shake itself is a build system that maintains its own state, but it still works out just fine: if Shake decides to rebuild only one object file (since others are up-to-date), Stroll can safely remember this decision. As long as all input files are the same and Shake’s database is unchanged, we can assume that the results produced by Shake previously are still valid. This means that even though Stroll’s task granularity may be quite coarse (e.g. one task for 100 object files), these coarse-grain tasks can benefit from fine-grain incrementality and parallelism supported by other build systems, such as Shake. And if you happen to have two existing build systems written in different languages you don’t need to rewrite anything: you can compose your legacy build systems simply by placing them into the same directory and using Stroll.

Unusually, Stroll can cope with cyclic task descriptions, where a few build tasks form a dependency cycle, as well as with build tasks that generate new build tasks! Stroll simply keeps building until reaching a fixed point where all tasks are up-to-date.

Stroll does not fit the modelling approach from the Build Systems à la Carte paper, where build tasks have statically known outputs: Stroll supports both dynamic inputs and dynamic outputs. I conjecture that such build systems cannot be minimal, i.e. they fundamentally require the trial-and-error approach used by Stroll, where unnecessary work may be performed while discovering the complete dependency graph.

Acknowledgements

I’ve been thinking about this idea for a while and had many illuminating discussions with Ulf Adams, Arseniy Alekseyev, Jeremie Dimino, Georgy Lukyanov, Neil Mitchell, Iman Narasamdya, Simon Peyton Jones, Newton Sanches, Danil Sokolov, and probably others. A few people in this list have been sceptical about the idea, so I do not imply that they endorsed Stroll — I’m merely thankful to everyone for their insights.

You should try Hadrian

Andrey — Fri, 25 Jan 2019 02:57:58 +0000

This is an announcement for GHC developers:

You should try to use Hadrian as the GHC build system, because it will (hopefully!) become the default around GHC 8.8.

What is Hadrian and how can I try it?

Hadrian is a new build system for the Glasgow Haskell Compiler, which is written in Haskell. It lives in the directory “hadrian” in the GHC tree, and we have been actively developing it over the past year to reach feature and correctness parity with the existing Make-based build system. While we haven’t quite reached this goal (more on this below), Hadrian is already working well, and we have been running Hadrian jobs alongside the Make ones in our CI pipelines since the recent move to GitLab.

At this point, we would like to encourage everyone to try using Hadrian for their usual GHC development tasks. Hadrian’s documentation resides in GHC’s source tree, and below are the documents you will be most interested in:

The README is the root of Hadrian’s documentation. It explains the basics and points to more specific documents where appropriate.
A cheatsheet-style document for GHC developers used to the Make build system, showing equivalent Make/Hadrian commands for many tasks.
A description of the “user settings” mechanism in Hadrian, which is where you can customise the build flavour, choose the packages to build, add file/package/platform-specific command line flags, etc.
A description of the “test” rule and all the options it supports.

The documentation can surely be improved, so please do not hesitate to send us feedback and suggestions here, or even better on GHC Trac; make sure you select the component “Build System (Hadrian)” when creating a new ticket.

You need Hadrian

Hadrian is new, requires time to learn, and still has rough edges, but it has been developed to make your lives better. Here are a few advantages of Hadrian over the Make-based build system:

1) Hadrian is more reliable

Hadrian can capture build dependencies more accurately, which means you rarely (if ever) need to do a clean rebuild.

2) Hadrian is faster

Hadrian is faster for two reasons: (i) more accurate build dependencies, (ii) tracking of file contents instead of file modification times. Both allow you to avoid a lot of unnecessary rebuilds. Building Hadrian itself may take a while but needs to be done only once.

3) Hadrian is easier to understand and modify

You no longer need to deal with Make’s global namespace of mutable string variables. Hadrian is written in the language you love; it has modules, types and pure functions.

If you come across a situation where Hadrian is worse than the Make build system in any of the above aspects, this is a bug and you should report it.

Helping Hadrian

The best way to help is to try Hadrian, and let us know how it goes, what doesn’t work, what’s missing for you, what you think should be easier, and so on. Below is a list of known issues that we are in the process of fixing or that we will be tackling soon:

Stage 2 GHC should be dynamically linked most of the time, but it never is, currently. See ticket #15837.
There are about a dozen failing tests in the GHC testsuite, some related to #15837.
Binary distributions haven’t been thoroughly tested on many platforms (only some Linux flavours). There will definitely be some issues here. For example, the binary distribution rule currently fails on Windows.
There is no “validate” rule yet, only “test”, but we have all the pieces to make this happen and it has a very high priority.
There are issues with building cross compilers.

We are likely missing some features compared to the Make build system, but none of them should take a lot of time to implement at this point. If you spot one, let us know! We’ll do our best to implement it (or help you do it) as soon as we can. It is useful to look at the existing Hadrian tickets before submitting new ones, to make sure that the issue or idea that you would like to talk about hasn’t been brought up yet.

Of course, we welcome your code contributions too! Several GHC developers have a good understanding of the Hadrian codebase and will be able to help you. To find their names, have a look at the list of recent Hadrian commits. As you can see, Hadrian is actively developed by many people, and we hope you will join too.

United Monoids

Andrey — Wed, 12 Dec 2018 04:46:20 +0000

In this blog post we will explore the consequences of postulating 0 = 1 in an algebraic structure with two binary operations (S, +, 0) and (S, ⋅, 1). Such united monoids have a few interesting properties, which are not immediately obvious. For example, we will see that the axiom 0 = 1 is equivalent to a seemingly less extravagant axiom ab = ab + a, which will send us tumbling down the rabbit hole of algebraic graphs and topology.

We start with a brief introduction to monoids, rings and lattices. Feel free to jump straight to the section “What if 0 = 1?”, where the fun starts.

Monoids

A monoid (S, ∘, e) is a way to express a basic form of composition in mathematics: any two elements a and b of the set S can be composed into a new element a ∘ b of the same set S, and furthermore there is a special element e ∈ S, which is the identity element of the composition, as expressed by the following identity axioms:

a ∘ e = a
e ∘ a = a

In words, composing the identity element with another element does not change the latter. The identity element is sometimes also called unit.

As two familiar everyday examples, consider addition and multiplication over integer numbers: (ℤ, +, 0) and (ℤ, ⋅, 1). Given any two integers, these operations produce another integer, e.g. 2 + 3 = 5 and 2 ⋅ 3 = 6, never leaving the underlying set of integers ℤ; they also respect the identity axioms, i.e. both a + 0 = 0 + a = a and a ⋅ 1 = 1 ⋅ a = a hold for all integers a. Note: from now on we will often omit the multiplication operator and write simply ab instead of a ⋅ b, which is a usual convention.

Another important monoid axiom is associativity:

a ∘ (b ∘ c) = (a ∘ b) ∘ c

It tells us that the order in which we group composition operations does not matter. This makes monoids convenient to work with and allows us to omit unnecessary parentheses. Addition and multiplication are associative: a + (b + c) = (a + b) + c and a(bc) = (ab)c.

Monoids are interesting to study, because they appear everywhere in mathematics, programming and engineering. Another example comes from Boolean algebra: the logical disjunction (OR) monoid ({0,1}, ∨, 0) and the logical conjunction (AND) monoid ({0,1}, ∧, 1). Compared to numbers, in Boolean algebra the meanings of composition and identity elements are very different (e.g. number zero vs logical false), yet we can abstract from these differences, which allows us to reuse general results about monoids across various specific instances.

In this blog post we will also come across commutative and idempotent monoids. In commutative monoids, the order of composition does not matter:

a ∘ b = b ∘ a

All four examples above (+, ⋅, ∨, ∧) are commutative monoids. String concatenation (S, ++, “”) is an example of a non-commutative monoid: indeed, “a” ++ “b” = “ab” and “b” ++ “a” = “ba” are different strings.

Finally, in an idempotent monoid, composing an element with itself does not change it:

a ∘ a = a

The disjunction ∨ and conjunction ∧ monoids are idempotent, whereas the addition +, multiplication ⋅ and concatenation ++ monoids are not.

A monoid which is both commutative and idempotent is a bounded semilattice; both disjunction and conjunction are bounded semilattices.
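To make the laws above concrete, here is a minimal sketch that checks them by brute force on small samples. The record MonoidOn, the law-checking functions and the sample lists are names I made up for illustration; a check on a finite sample is of course not a proof.

```haskell
-- A monoid packaged as a record: a composition operation and its identity.
data MonoidOn a = MonoidOn { op :: a -> a -> a, unit :: a }

identityLaw :: Eq a => MonoidOn a -> [a] -> Bool
identityLaw m xs =
    and [ op m x (unit m) == x && op m (unit m) x == x | x <- xs ]

associative :: Eq a => MonoidOn a -> [a] -> Bool
associative m xs =
    and [ op m x (op m y z) == op m (op m x y) z | x <- xs, y <- xs, z <- xs ]

commutative :: Eq a => MonoidOn a -> [a] -> Bool
commutative m xs = and [ op m x y == op m y x | x <- xs, y <- xs ]

idempotent :: Eq a => MonoidOn a -> [a] -> Bool
idempotent m xs = and [ op m x x == x | x <- xs ]

-- The five example monoids from the text.
addition, multiplication :: MonoidOn Integer
addition       = MonoidOn (+) 0
multiplication = MonoidOn (*) 1

disjunction, conjunction :: MonoidOn Bool
disjunction = MonoidOn (||) False
conjunction = MonoidOn (&&) True

concatenation :: MonoidOn String
concatenation = MonoidOn (++) ""

-- Small samples of each underlying set.
ints :: [Integer]
ints = [-3 .. 3]

bools :: [Bool]
bools = [False, True]

strings :: [String]
strings = ["", "a", "b", "ab"]
```

In GHCi, commutative concatenation strings evaluates to False (because of "ab" versus "ba"), idempotent addition ints evaluates to False, and idempotent disjunction bools evaluates to True.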

Rings and lattices

As you might have noticed, monoids often come in pairs: addition and multiplication (+, ⋅), disjunction and conjunction (∨, ∧), set union and intersection (⋃, ⋂), parallel and sequential composition (||, ;) etc. I’m sure you can list a few more examples of such pairs. Two common ways in which such monoid pairs can be formed are called rings and lattices.

A ring, or more generally a semiring, (S, +, 0, ⋅, 1) comprises an additive monoid (S, +, 0) and a multiplicative monoid (S, ⋅, 1), such that: they both operate on the same set S, the additive monoid is commutative, and the multiplicative monoid distributes over the additive one:

a(b + c) = ab + ac
(a + b)c = ac + bc

Distributivity is very convenient and allows us to open parentheses, and (if applied in reverse) to factor out a common term of two expressions. Furthermore, ring-like algebraic structures require that 0 annihilates all elements under multiplication:

a ⋅ 0 = 0
0 ⋅ a = 0

The most basic and widely known ring is that of integer numbers with addition and multiplication: we use this pair of monoids every day, with no fuss about the underlying theory. Various lesser known tropical and star semirings are a great tool in optimisation on graphs — read this cool functional pearl by Stephen Dolan if you want to learn more.

A bounded lattice (S, ∨, 0, ∧, 1) also comprises two monoids, which are called join (S, ∨, 0) and meet (S, ∧, 1). They operate on the same set S, are required to be commutative and idempotent, and satisfy the following absorption axioms:

a ∧ (a ∨ b) = a
a ∨ (a ∧ b) = a

Like rings, lattices show up very frequently in different application areas. Most basic examples include Boolean algebra ({0,1}, ∨, 0, ∧, 1), the power set (2^S, ⋃, Ø, ⋂, S), as well as integer numbers with negative and positive infinities and the operations max and min: (ℤ^±∞, max, -∞, min, +∞). All of these lattices are distributive, i.e. ∧ distributes over ∨ and vice versa.
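As a quick sanity check of the last example, the sketch below models integers with infinities and tests the identity, absorption and distributivity laws for max and min on a small sample. The data type Ext and the names sample, identities, absorption and distributivity are my own, introduced only for this illustration.

```haskell
-- Integers extended with negative and positive infinities. The derived Ord
-- instance orders constructors as listed, so NegInf < Fin n < PosInf.
data Ext = NegInf | Fin Integer | PosInf
    deriving (Eq, Ord, Show)

sample :: [Ext]
sample = [NegInf] ++ map Fin [-2 .. 2] ++ [PosInf]

-- Join is max with identity NegInf; meet is min with identity PosInf.
identities :: Bool
identities = and [ max a NegInf == a && min a PosInf == a | a <- sample ]

-- Both absorption axioms of a lattice.
absorption :: Bool
absorption = and [ min a (max a b) == a && max a (min a b) == a
                 | a <- sample, b <- sample ]

-- Meet distributes over join and vice versa.
distributivity :: Bool
distributivity = and [ min a (max b c) == max (min a b) (min a c)
                       && max a (min b c) == min (max a b) (max a c)
                     | a <- sample, b <- sample, c <- sample ]
```

All three checks evaluate to True on this sample.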

What if 0 = 1?

Now that the scene has been set and all characters introduced, let’s see what happens when the identity elements of the two monoids in a pair (S, +, 0) and (S, ⋅, 1) coincide, i.e. when 0 = 1.

In a ring (S, +, 0, ⋅, 1), this leads to devastating consequences. Not only does 1 become equal to 0, but all other elements of the ring become equal to 0 too, as demonstrated below:

a	=	a ⋅ 1	(identity of ⋅)
	=	a ⋅ 0	(we postulate 0 = 1)
	=	0	(annihilating 0)

The ring is annihilated into a single point 0.

In a bounded lattice (S, ∨, 0, ∧, 1), postulating 0 = 1 leads to the same catastrophe, albeit in a different manner:

a	=	1 ∧ a	(identity of ∧)
	=	1 ∧ (0 ∨ a)	(identity of ∨)
	=	0 ∧ (0 ∨ a)	(we postulate 0 = 1)
	=	0	(absorption axiom)

The lattice is absorbed into a single point 0.

Postulating the axiom 0 = 1 has so far led to nothing but disappointment. Let’s find another way of pairing monoids, which does not involve the axioms of annihilation and absorption.

United monoids

Consider two monoids (S, +, 0) and (S, ⋅, 1), which operate on the same set S, such that + is commutative and ⋅ distributes over +. We call these monoids united if 0 = 1. To avoid confusion with rings and lattices, we will use e to denote the identity element of both monoids:

a + e = ae = ea = a

We will call this the united identity axiom. We’ll also refer to e as empty, the operation + as overlay, and the operation ⋅ as connect.

What can we tell about united monoids? First of all, it is easy to prove that the monoid (S, +, e) is idempotent:

a + a	=	ae + ae	(united identity)
	=	a(e + e)	(distributivity)
	=	ae	(united identity)
	=	a	(united identity)

Recall that this means that (S, +, e) is a bounded semilattice.

The next consequence of the united identity axiom is a bit more unusual:

ab = ab + a
ab = ab + b
ab = ab + a + b

We will refer to the above properties as containment laws: intuitively, when you connect a and b, the constituent parts are contained in the result ab. Let us prove containment:

ab + a	=	ab + ae	(united identity)
	=	a(b + e)	(distributivity)
	=	ab	(united identity)

The two other laws are proved analogously (in fact, they are equivalent to each other).

Surprisingly, the containment law ab = ab + a is equivalent to the united identity law 0 = 1, i.e. the latter can be proved from the former:

0	=	1 ⋅ 0	(1 is identity of ⋅)
	=	1 ⋅ 0 + 1	(containment)
	=	0 + 1	(1 is identity of ⋅)
	=	1	(0 is identity of +)

This means that united monoids can equivalently be defined as follows:

(S, +) is a commutative semigroup.
(S, ⋅, e) is a monoid that distributes over +.
Containment axiom: ab = ab + a.

Then the fact that (S, +, e) is also a monoid can be proved as above.

Finally, let’s prove one more property of united monoids: non-empty elements of S can have no inverses. More precisely:

if a + b = e or ab = e then a = b = e.

The lack of overlay inverses follows from overlay idempotence:

a	=	a + e	(united identity)
	=	a + a + b	(assumption a + b = e)
	=	a + b	(idempotence)
	=	e	(assumption a + b = e)

The lack of connect inverses follows from the containment law:

a	=	e + a	(united identity)
	=	ab + a	(assumption ab = e)
	=	ab	(containment)
	=	e	(assumption ab = e)

It is time to look at some examples of united monoids.

Example 1: max and plus, united

One example appears in this paper on Haskell’s ApplicativeDo language extension. It uses a simple cost model for defining the execution time of programs composed in parallel or in sequence. The two monoids are:

(ℤ^≥0, max, 0): the execution time of programs a and b composed in parallel is defined to be the maximum of their execution times:

time(a || b) = max(time(a), time(b))
(ℤ^≥0, +, 0): the execution time of programs a and b composed in sequence is defined to be the sum of their execution times:

time(a ; b) = time(a) + time(b)

Execution times are non-negative, hence both max and + have identity 0, which is the execution time of the empty program: max(a, 0) = a + 0 = a. It is easy to check distributivity (+ distributes over max) and containment:

a + max(b, c) = max(a + b, a + c)
max(a + b, a) = a + b

Note that the resulting algebraic structure is different from the tropical max-plus semiring (ℝ^−∞, max, −∞, +, 0) commonly used in scheduling, where the identity of max is −∞.
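The laws of this example are easy to test on a small sample of execution times. This is only a sketch: the names par and sq (for parallel and sequential composition) and the sample list times are my own, and a finite check is not a proof.

```haskell
-- Execution times are non-negative integers; we only sample small values.
times :: [Integer]
times = [0 .. 5]

-- Parallel composition takes the maximum, sequential composition the sum.
par, sq :: Integer -> Integer -> Integer
par = max
sq  = (+)

unitedIdentity :: Bool
unitedIdentity =
    and [ par a 0 == a && sq a 0 == a && sq 0 a == a | a <- times ]

-- Sequential composition distributes over parallel composition.
distributivity :: Bool
distributivity = and [ sq a (par b c) == par (sq a b) (sq a c)
                     | a <- times, b <- times, c <- times ]

containment :: Bool
containment = and [ par (sq a b) a == sq a b | a <- times, b <- times ]

-- No inverses: if max a b == 0 or a + b == 0 then a == b == 0.
noInverses :: Bool
noInverses = and [ a == 0 && b == 0
                 | a <- times, b <- times, par a b == 0 || sq a b == 0 ]

-- With negative "times", 0 is no longer the identity of max.
identityFailsForNegatives :: Bool
identityFailsForNegatives = par (-1) 0 /= (-1)
```

All five values evaluate to True. The last one shows why the restriction to non-negative numbers matters: max (-1) 0 is 0, not -1.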

In general, various flavours of parallel and sequential composition often form united monoids. In this paper about Concurrent Kleene Algebra the authors use the term bimonoid to refer to such structures, but this term is also used to describe an unrelated concept in category theory, so let me stick to “united monoids” here, which has zero Google hits.

Example 2: algebraic graphs

My favourite algebraic structure is the algebra of graphs described in this paper. The algebra comprises two monoids that have the same identity, which motivated me to study similar algebraic structures, and led to writing this blog post about the generalised notion of united monoids.

As a brief introduction, consider the following operations on graphs. The overlay operation + takes two graphs (V₁, E₁) and (V₂, E₂), and produces the graph containing the union of their vertices and edges:

(V₁, E₁) + (V₂, E₂) = (V₁ ∪ V₂, E₁ ∪ E₂)

The connect operation ⋅ is similar to overlay, but it also adds an edge from each vertex of the first graph to each vertex of the second graph:

(V₁, E₁) ⋅ (V₂, E₂) = (V₁ ∪ V₂, E₁ ∪ E₂ ∪ V₁ × V₂)

The operations have the same identity e — the empty graph (∅, ∅) — and form a pair of united monoids, where ⋅ distributes over +.

In addition to the laws of united monoids described above, the algebra of graphs has the axiom of decomposition:

abc = ab + ac + bc

The intuition behind this axiom is that any expression in the algebra of graphs can be broken down into vertices and pairs of vertices (edges). Note that the containment laws follow from decomposition, e.g.:

ab	=	aeb	(e identity of ⋅)
	=	ae + ab + eb	(decomposition)
	=	ae + ab + b	(e identity of ⋅)
	=	a(e + b) + b	(distributivity)
	=	ab + b	(e identity of +)

By postulating the commutativity of the connect operation (ab = ba), we can readily obtain undirected graphs.
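The definitions of overlay and connect above translate almost literally into code. Below is a minimal sketch of directed graphs as a pair of sets; this Graph type is a toy model written for this post's formulas, not the data type from the algebraic-graphs library, and the sample graphs a, b and c are arbitrary.

```haskell
import Data.Set (Set)
import qualified Data.Set as Set

-- A graph is a set of vertices together with a set of directed edges.
data Graph a = Graph (Set a) (Set (a, a))
    deriving (Eq, Show)

empty :: Graph a
empty = Graph Set.empty Set.empty

vertex :: a -> Graph a
vertex x = Graph (Set.singleton x) Set.empty

-- Overlay: union of vertices and union of edges.
overlay :: Ord a => Graph a -> Graph a -> Graph a
overlay (Graph v1 e1) (Graph v2 e2) =
    Graph (Set.union v1 v2) (Set.union e1 e2)

-- Connect: overlay plus an edge from every vertex of the first graph to
-- every vertex of the second graph.
connect :: Ord a => Graph a -> Graph a -> Graph a
connect (Graph v1 e1) (Graph v2 e2) =
    Graph (Set.union v1 v2) (Set.unions [e1, e2, cross])
  where
    cross = Set.fromList [ (x, y) | x <- Set.toList v1, y <- Set.toList v2 ]

-- Three arbitrary sample graphs.
a, b, c :: Graph Int
a = vertex 1
b = vertex 2
c = overlay (vertex 3) (connect (vertex 4) (vertex 1))

unitedIdentity :: Bool
unitedIdentity = overlay c empty == c && connect c empty == c
                                      && connect empty c == c

-- abc = ab + ac + bc
decomposition :: Bool
decomposition =
    connect a (connect b c)
        == overlay (connect a b) (overlay (connect a c) (connect b c))

-- ab = ab + a
containment :: Bool
containment = connect a b == overlay (connect a b) a
```

On these samples unitedIdentity, decomposition and containment all evaluate to True.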

The algebra of graphs can be considered a “2D” special case of united monoids, where one can only connect elements pairwise; any 3-way connection abc falls apart into pieces. A 3-dimensional variant of the algebra can be obtained by replacing the decomposition axiom with:

abcd = abc + abd + acd + bcd

This allows us to connect vertices into pairs (edges) and triples (faces), but forces 4-way products abcd to fall apart into faces, as shown below:

Note that 3-decomposition follows from 2-decomposition: if all 3-way products fall apart then so do all 4-way products, but not vice versa. Borrowing an example from David Spivak’s paper on modelling higher-dimensional networks, such 3D graphs allow us to distinguish these two different situations:

Three people having a conversation together, e.g. over a restaurant table. This can be modelled with a filled-in triangle abc.
Three people having three separate tête-à-tête conversations. This can be modelled with a hollow triangle ab + ac + bc.

Similar examples show up in concurrency theory, where one might need to distinguish three truly concurrent events from three events that are concurrent pairwise, but whose overall concurrency is limited by shared resources, e.g. three people eating ice-cream with two spoons, or going through a two-person-wide door. There is a short paper on this topic by my PhD advisor Alex Yakovlev, written in 1989 (on a typewriter!).

Example 3: simplicial complexes

United monoids of growing dimension lead us to topology, specifically to simplicial complexes, which are composed of simple n-dimensional shapes called simplices, such as point (0-simplex), segment (1-simplex), triangle (2-simplex), tetrahedron (3-simplex), etc. — here is a cool video. We show an example of a simplicial complex below, along with a united monoid expression C that describes it, and two containment properties. We’ll further assume commutativity of connection: ab = ba.

Simplicial complexes are closed in terms of containment. For example, a filled-in triangle contains its edges and vertices, and cannot appear in a simplicial complex without any of them. This property can be expressed algebraically as follows:

abc = abc + ab + ac + bc + a + b + c

Interestingly, this 3D containment law follows from the 2D version that we defined for united monoids:

abc	=	(ab + a + b)c	(containment)
	=	(ab)c + ac + bc	(distributivity)
	=	(abc + ab + c) + (ac + a) + (bc + b)	(containment)
	=	abc + ab + ac + bc + a + b + c	(commutativity)

We can similarly prove n-dimensional versions of the containment law; they all trivially follow from the basic containment axiom ab = ab + a, or, alternatively, from the united identity axiom 0 = 1.

United monoids in Haskell

Now let’s put together a small library for united monoids in Haskell and express some of the above examples in it.

Monoids are already represented in the standard Haskell library base by the type class Monoid. We need to extend it to the type class Semilattice, which does not define any new methods, but comes with two new laws. We also provide a few convenient aliases, following the API of the algebraic-graphs library:

-- Laws:
-- * Commutativity: a <> b = b <> a
-- * Idempotence:   a <> a = a
class Monoid m => Semilattice m

empty :: Semilattice m => m
empty = mempty

overlay :: Semilattice m => m -> m -> m
overlay = mappend

overlays :: Semilattice m => [m] -> m
overlays = foldr overlay empty

infixr 6 <+>
(<+>) :: Semilattice m => m -> m -> m
(<+>) = overlay

-- The natural partial order on the semilattice
isContainedIn :: (Eq m, Semilattice m) => m -> m -> Bool
isContainedIn x y = x <+> y == y

We are now ready to define the type class for united monoids that defines a new method connect and associated laws:

-- Laws:
-- * United identity:     a <.> empty == empty <.> a == a
-- * Associativity:   a <.> (b <.> c) == (a <.> b) <.> c
-- * Distributivity:  a <.> (b <+> c) == a <.> b <+> a <.> c 
--                    (a <+> b) <.> c == a <.> c <+> b <.> c
class Semilattice m => United m where
    connect :: m -> m -> m

infixr 7 <.>
(<.>) :: United m => m -> m -> m
(<.>) = connect

connects :: United m => [m] -> m
connects = foldr connect empty

Algebraic graphs are a trivial instance:

import Algebra.Graph (Graph)
import qualified Algebra.Graph as Graph

-- TODO: move orphan instances to algebraic-graphs library 
instance Semigroup (Graph a) where
    (<>) = Graph.overlay

instance Monoid (Graph a) where
    mempty = Graph.empty

instance Semilattice (Graph a)

instance United (Graph a) where
    connect = Graph.connect
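As another illustration, Example 1 (max and plus) also fits these type classes. The sketch below repeats the two class declarations so that it can be loaded on its own; the newtype Time and the value example are hypothetical names, and nothing stops you from constructing a negative Time, in which case the identity law breaks.

```haskell
-- The two classes are repeated from the text to keep the sketch standalone.
class Monoid m => Semilattice m

class Semilattice m => United m where
    connect :: m -> m -> m

-- Hypothetical wrapper for the execution time of a program.
newtype Time = Time Integer
    deriving (Eq, Ord, Show)

-- Overlay is parallel composition: the maximum of execution times.
instance Semigroup Time where
    Time a <> Time b = Time (max a b)

-- The empty program takes no time. The law a <> mempty == a relies on times
-- being non-negative, which this wrapper does not enforce.
instance Monoid Time where
    mempty = Time 0

instance Semilattice Time

-- Connect is sequential composition: the sum of execution times.
instance United Time where
    connect (Time a) (Time b) = Time (a + b)

-- Cost of running two programs in parallel, followed by a third one.
example :: Time
example = (Time 3 <> Time 5) `connect` Time 2
```

Here example evaluates to Time 7: the parallel part costs max 3 5 = 5 and the sequential step adds 2.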

We can now express the above simplicial complex example in Haskell and test whether it contains the filled-in and the hollow triangles:

-- We are using OverloadedStrings for creating vertices
example :: (United m, IsString m) => m
example = overlays [ "p" <.> "q" <.> "r" <.> "s"
                   , ("r" <+> "s") <.> "t"
                   , "u"
                   , "v" <.> "x"
                   , "w" <.> ("x" <+> "y" <+> "z")
                   , "x" <.> "y" <.> "z" ]

-- Filled-in triangle
rstFace :: (United m, IsString m) => m
rstFace = "r" <.> "s" <.> "t"

-- Hollow triangle
rstSkeleton :: (United m, IsString m) => m
rstSkeleton = "r" <.> "s" <+> "r" <.> "t" <+> "s" <.> "t"

To perform the test, we need to instantiate the polymorphic united monoid expression to a concrete data type such as Graph Point:

newtype Point = Point { getPoint :: String }
    deriving (Eq, Ord, IsString)

λ> rstFace `isContainedIn` (example :: Graph Point)
True

λ> rstSkeleton `isContainedIn` (example :: Graph Point)
True

As you can see, if we interpret the example simplicial complex using the algebraic graphs instance, we cannot distinguish the filled-in and hollow triangles, because the filled-in triangle falls apart into edges due to the 2-decomposition law abc = ab + ac + bc.

Let’s define a data type for representing simplicial complexes. We start with simplices, which can be modelled by sets.

-- A simplex is formed on a set of points
newtype Simplex a = Simplex { getSimplex :: Set a }
    deriving (Eq, Semigroup)

-- Size-lexicographic order: https://en.wikipedia.org/wiki/Shortlex_order
instance Ord a => Ord (Simplex a) where
    compare (Simplex x) (Simplex y) =
        compare (Set.size x) (Set.size y) <>
        compare x y

instance Show a => Show (Simplex a) where
    show = intercalate "." . map show . Set.toList . getSimplex

instance IsString a => IsString (Simplex a) where
    fromString = Simplex . Set.singleton . fromString

isFaceOf :: Ord a => Simplex a -> Simplex a -> Bool
isFaceOf (Simplex x) (Simplex y) = Set.isSubsetOf x y

Note that the Ord instance is defined using the size-lexicographic order so that a simplex x can be a face of a simplex y only when x <= y.

Now we can define simplicial complexes, which are sets of simplices that are closed with respect to the subset relation.

-- A simplicial complex is a set of simplices
-- We only store maximal simplices for efficiency
newtype Complex a = Complex { getComplex :: Set (Simplex a) }
    deriving (Eq, Ord)

instance Show a => Show (Complex a) where
    show = intercalate " + " . map show . Set.toList . getComplex

instance IsString a => IsString (Complex a) where
    fromString = Complex . Set.singleton . fromString

-- Do not add a simplex if it is contained in existing ones
addSimplex :: Ord a => Simplex a -> Complex a -> Complex a
addSimplex x (Complex y)
    | any (isFaceOf x) y = Complex y
    | otherwise          = Complex (Set.insert x y)

-- Drop all non-maximal simplices
normalise :: Ord a => Complex a -> Complex a
normalise = foldr addSimplex empty . sort . Set.toList . getComplex

instance Ord a => Semigroup (Complex a) where
    Complex x <> Complex y = normalise (Complex $ x <> y)

instance Ord a => Monoid (Complex a) where
    mempty = Complex Set.empty

instance Ord a => Semilattice (Complex a)

instance Ord a => United (Complex a) where
    connect (Complex x) (Complex y) = normalise . Complex $ Set.fromList
        [ a <> b | a <- Set.toList x, b <- Set.toList y ]

Now let’s check that simplicial complexes allow us to distinguish the filled-in triangle from the hollow one:

λ> example :: Complex Point
u + r.t + s.t + v.x + w.x + w.y + w.z + x.y.z + p.q.r.s

λ> rstFace :: Complex Point
r.s.t

λ> rstSkeleton :: Complex Point
r.s + r.t + s.t

λ> rstFace `isContainedIn` (example :: Complex Point)
False

λ> rstSkeleton `isContainedIn` (example :: Complex Point)
True

Success! As you can check in the diagram above, the example simplicial complex contains a hollow triangle rs + rt + st, but does not contain the filled-in triangle rst.

If you would like to experiment with the code above, check out this repository: https://github.com/snowleopard/united.

Final remarks

I’ve got a few more thoughts, but it’s time to wrap up this blog post. I’m impressed that you’ve made it this far =)

Let me simply list a few things I’d like to explore in future:

- What about monoids that lurk in applicative functors and monads? As we know (e.g. from the final lecture of a great lecture series by Bartosz Milewski), pure :: a → m a and join :: m (m a) → m a are the monoid identity and composition in disguise. Couldn’t we unite this monoid with the monoid that corresponds to the commutative monad? Note that they have the same identity pure :: a → m a! This thought popped into my head when watching Edward Kmett’s live-coding session on commutative applicative functors and monads.
- There are cases where a single overlay monoid is united with many, possibly infinitely many connect monoids. An example comes from algebraic graphs with edge labels that have a connect operation for each possible edge label — see my Haskell eXchange 2018 talk.
- We can also define united semigroups that satisfy the containment axiom ab = ab + a but have no identity element, for example, non-empty algebraic graphs. The term “united semigroups” sounds somewhat nonsensical since semigroups have no “units”, however note that if they secretly had identity elements, they would have been the same.
- Speaking of “units”, in ring theory, a unit is any element a that has a multiplicative inverse b such that ab = 1. The lack of inverses, which we proved above, means there is only one unit in united monoids — the shared identity e.
- I have a feeling that united monoids are somehow inherently linked to Brent Yorgey’s monoidal sparks.
- We can also consider united partial monoids where the two operations may be defined only partially. For example, this paper by Jeremy Gibbons defines operations above and beside for composing directed acyclic graphs so that they both have the same identity (the empty graph), but the operation beside is defined only for graphs of matching types.

Finally, I’d like to ask a question: have you come across united monoids, perhaps under a different name? As we’ve seen, having 0 = 1 does make sense in some cases, but I couldn’t find much literature on this topic.

Update: boundary operator and derivatives

Consider the following definition of the boundary operator:

∂x = overlay { a | a < x }

where the overlay is over all elements of the set, and a < b denotes strict containment, i.e.

a < b ⇔ a + b = b ∧ a ≠ b

First, let’s apply this definition to a few basic simplices:

∂a = empty
∂(ab) = a + b + empty = a + b
∂(abc) = ab + ac + bc + a + b + c + empty = ab + ac + bc

This looks very similar to the boundary operator from topology, e.g. the boundary of the filled-in triangle abc is the hollow triangle ab + ac + bc, and if we apply the boundary operator twice, the result is unchanged, i.e. ∂(ab + ac + bc) = ab + ac + bc.

Surprisingly, the boundary operator seems to satisfy the product rule for derivatives for non-empty a and b:

∂(ab) = ∂(a)b + a∂(b)

I’m not sure where this is going, but it’s cool. Perhaps, there is a link with derivatives of types?

Thanks to Dave Clarke who suggested looking at the boundary operator.

Further update: One problem with the above definition is that the sum rule for derivatives doesn’t hold, i.e. ∂(a + b) ≠ ∂(a) + ∂(b). Sjoerd Visscher suggested defining ∂ using the desired (usual) laws for derivatives:

∂(ab) = ∂(a)b + a∂(b)
∂(a + b) = ∂(a) + ∂(b)

Coupled with ∂(empty) = ∂(a) = empty (where a is a vertex), this leads to a different boundary operator, where the boundary of the filled-in triangle abc is the hollow triangle ab + ac + bc, and the boundary of the hollow triangle is simply the three underlying vertices a + b + c:

This definition of the boundary operator reduces the “dimension” of a united monoid expression by 1 (unless it is already empty).
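Here is a rough sketch of how I read this second definition for simplicial complexes: the boundary of a simplex with at least two vertices is the overlay of its faces obtained by dropping one vertex, vertices have an empty boundary, and the sum rule extends this to complexes. The representation (sets of Char vertices) and all names below are assumptions made for this example; it is not the Complex type from the code above.

```haskell
import Data.Set (Set)
import qualified Data.Set as Set

-- A simplex is a set of vertices; a complex is a set of simplices.
type Simplex = Set Char
type Complex = Set Simplex

-- Keep only maximal simplices, so equal complexes have equal representations.
normalise :: Complex -> Complex
normalise c = Set.filter isMaximal c
  where
    isMaximal x = not (any (x `Set.isProperSubsetOf`) (Set.toList c))

-- Build a complex from simplices written as strings, e.g. "abc".
complex :: [String] -> Complex
complex = normalise . Set.fromList . map Set.fromList

-- Boundary of a single simplex: all faces with one vertex removed.
-- Vertices and the empty simplex have an empty boundary.
simplexBoundary :: Simplex -> [Simplex]
simplexBoundary s
    | Set.size s < 2 = []
    | otherwise      = [ Set.delete x s | x <- Set.toList s ]

-- The sum rule extends the boundary to whole complexes.
boundary :: Complex -> Complex
boundary = normalise . Set.fromList . concatMap simplexBoundary . Set.toList
```

With these definitions, boundary (complex ["abc"]) equals complex ["ab", "ac", "bc"], and applying boundary once more gives complex ["a", "b", "c"], matching the description above.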

Build Systems à la Carte

Andrey — Sat, 07 Jul 2018 10:52:31 +0000

In a recent blog post, I shared a preliminary version of the paper on build systems that Neil Mitchell, Simon Peyton Jones and I submitted to the ICFP 2018 conference. The paper was accepted and yesterday, after months of revisions and polishing, we’ve finally completed this work. The paper and associated executable models are openly available; here is a direct link to the PDF.

Build systems, such as classic Make, are big, complicated, and used by every software developer on the planet. But they are a sadly unloved part of the software ecosystem, very much a means to an end, and seldom the focus of attention. Rarely do people ask questions like “What does it mean for my build system to be correct?” or “What are the trade-offs between different approaches?”. For years Make dominated, but more recently the challenges of scale have driven large software firms like Microsoft, Facebook and Google to develop their own build systems, exploring new points in the design space. In this paper we offer a general framework in which to understand and compare build systems, in a way that is both abstract (omitting incidental detail) and yet precise (implemented as Haskell code).

As one of our main contributions we identify two key design choices that are typically deeply wired into any build system: (i) the order in which tasks are built (the scheduling algorithm), and (ii) whether or not a task is (re-)built (the rebuilding strategy). These choices turn out to be orthogonal, which leads us to a new classification of the design space, as shown in the table below.

	Scheduling algorithm
Rebuilding strategy	Topological	Restarting	Suspending
Dirty bit	Make	Excel
Verifying traces	Ninja		Shake
Constructive traces	CloudBuild	Bazel	X
Deep constructive traces	Buck		Nix

We can readily remix the ingredients to design new build systems with desired properties: the spot marked by X is particularly interesting since it combines the advantages of Shake and Bazel build systems. Neil is now working on implementing this new build system — Cloud Shake.

Read the paper, it’s fun. We even explain what Frankenbuilds are.

Selective applicative functors

Andrey — Wed, 27 Jun 2018 05:22:53 +0000

Update: I have written a paper about selective applicative functors, and it completely supersedes this blog post (it also uses a slightly different notation). You should read the paper instead.

I often need a Haskell abstraction that supports conditions (like Monad) yet can still be statically analysed (like Applicative). In such cases people typically point to the Arrow class, more specifically ArrowChoice, but when I look it up, I find several type classes and a dozen methods. Impressive, categorical but also quite heavy. Is there a more lightweight approach? In this blog post I’ll explore what I call selective applicative functors, which extend the Applicative type class with a single method that makes it possible to be selective about effects.

Please meet Selective:

class Applicative f => Selective f where
    handle :: f (Either a b) -> f (a -> b) -> f b

Think of handle as a selective function application: you apply a handler function of type a → b when given a value of the form Left a, but can skip the handler (along with its effects) in the case of Right b. Intuitively, handle allows you to efficiently handle errors, i.e. perform the error-handling effects only when needed.

Note that you can write a function with this type signature using Applicative functors, but it will always execute the effect associated with the handler so it’s potentially less efficient:

handleA :: Applicative f => f (Either a b) -> f (a -> b) -> f b
handleA x f = (\e f -> either f id e) <$> x <*> f

Selective is more powerful^(*) than Applicative: you can recover the application operator <*> as follows (I’ll use the suffix S for Selective).

apS :: Selective f => f (a -> b) -> f a -> f b
apS f x = handle (Left <$> f) (flip ($) <$> x)

Here we tag a given function a → b as an error and turn a value of type a into an error-handling function ($a), which simply applies itself to the error a → b yielding b as desired. We will later define laws for the Selective type class which will ensure that apS is a legal application operator <*>, i.e. that it satisfies the laws of the Applicative type class.

The select function is a natural generalisation of handle: instead of skipping one unnecessary effect, it selects which of the two given effectful functions to apply to a given value. It is possible to implement select in terms of handle, which is a good puzzle (try it!):

select :: Selective f =>
    f (Either a b) -> f (a -> c) -> f (b -> c) -> f c
select = ... -- Try to figure out the implementation!

Finally, any Monad is Selective:

handleM :: Monad f => f (Either a b) -> f (a -> b) -> f b
handleM mx mf = do
    x <- mx
    case x of
        Left  a -> fmap ($a) mf
        Right b -> pure b
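To see the difference between handleA and handleM in terms of effects, here is a small standalone sketch that uses pairs ([String], a) as computations that log messages. The type synonym Logged and the values success, failure and handler are made up for this example; the Applicative and Monad instances for pairs come from base.

```haskell
-- Applicative version: always runs the effects of the handler.
handleA :: Applicative f => f (Either a b) -> f (a -> b) -> f b
handleA x f = (\e g -> either g id e) <$> x <*> f

-- Monadic version: runs the handler only when it sees a Left.
handleM :: Monad f => f (Either a b) -> f (a -> b) -> f b
handleM mx mf = do
    x <- mx
    case x of
        Left  a -> fmap ($ a) mf
        Right b -> pure b

-- The pair ([String], a) is a computation that logs messages: base provides
-- Applicative and Monad instances for pairs whose first component is a Monoid.
type Logged a = ([String], a)

success, failure :: Logged (Either String Int)
success = (["parsed"], Right 42)
failure = (["parsed"], Left "oops")

-- Hypothetical handler: log the recovery and fall back to the message length.
handler :: Logged (String -> Int)
handler = (["recovered"], length)
```

Here handleA success handler evaluates to (["parsed","recovered"],42), whereas handleM success handler evaluates to (["parsed"],42): the monadic version skips the handler's log message. On failure both versions produce (["parsed","recovered"],4).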

Selective functors are sufficient for implementing many conditional constructs, which traditionally require the (more powerful) Monad type class. For example:

ifS :: Selective f => f Bool -> f a -> f a -> f a
ifS i t e = select
    (bool (Right ()) (Left ()) <$> i) (const <$> t) (const <$> e)

Here we turn a Boolean value into Left () or Right () and then select an appropriate branch. Let’s try this function in a GHCi session:

λ> ifS (odd . read <$> getLine) (putStrLn "Odd") (putStrLn "Even")
0
Even

λ> ifS (odd . read <$> getLine) (putStrLn "Odd") (putStrLn "Even")
1
Odd

As desired, only one of the two effectful functions is executed. Note that here f = IO with the default selective instance: handle = handleM.

Using ifS as a building block, we can implement other useful functions:

-- | Conditionally apply an effect.
whenS :: Selective f => f Bool -> f () -> f ()
whenS x act = ifS x act (pure ())

-- | A lifted version of lazy Boolean OR.
(<||>) :: Selective f => f Bool -> f Bool -> f Bool
(<||>) a b = ifS a (pure True) b

See more examples in the repository. (Note: I recently renamed handle to select, and select to branch in the repository. Apologies for the confusion.)

Static analysis

Like applicative functors, selective functors can be analysed statically. As an example, consider the following useful data type Validation:

data Validation e a = Failure e | Success a
    deriving (Functor, Show)

instance Semigroup e => Applicative (Validation e) where
    pure = Success
    Failure e1 <*> Failure e2 = Failure (e1 <> e2)
    Failure e1 <*> Success _  = Failure e1
    Success _  <*> Failure e2 = Failure e2
    Success f  <*> Success a  = Success (f a)

instance Semigroup e => Selective (Validation e) where
    handle (Success (Right b)) _ = Success b
    handle (Success (Left  a)) f = Success ($a) <*> f
    handle (Failure e        ) _ = Failure e

This data type is used for validating complex data: if reading one or more fields has failed, all errors are accumulated (using the operator <> from the semigroup e) to be reported together. By defining the Selective instance, we can now validate data with conditions. Below we define a function to construct a Shape (a Circle or a Rectangle) given a choice of the shape s :: f Bool and the shape’s parameters (Radius, Width and Height) in an arbitrary selective context f.

type Radius = Int
type Width  = Int
type Height = Int

data Shape = Circle Radius | Rectangle Width Height deriving Show

shape :: Selective f =>
    f Bool -> f Radius -> f Width -> f Height -> f Shape
shape s r w h = ifS s (Circle <$> r) (Rectangle <$> w <*> h)

We choose f = Validation [String] to report the errors that occurred when reading values. Let’s see how it works.

λ> shape (Success True) (Success 10)
    (Failure ["no width"]) (Failure ["no height"])
Success (Circle 10)

λ> shape (Success False) (Failure ["no radius"])
    (Success 20) (Success 30)
Success (Rectangle 20 30)

λ> shape (Success False) (Failure ["no radius"])
    (Success 20) (Failure ["no height"])
Failure ["no height"]

λ> shape (Success False) (Failure ["no radius"])
    (Failure ["no width"]) (Failure ["no height"])
Failure ["no width","no height"]

λ> shape (Failure ["no choice"]) (Failure ["no radius"])
    (Success 20) (Failure ["no height"])
Failure ["no choice"]

In the last example, since we failed to parse which shape has been chosen, we do not report any subsequent errors. But it doesn’t mean we are short-circuiting the validation. We will continue accumulating errors as soon as we get out of the opaque conditional:

twoShapes :: Selective f => f Shape -> f Shape -> f (Shape, Shape)
twoShapes s1 s2 = (,) <$> s1 <*> s2

λ> s1 = shape (Failure ["no choice 1"]) (Failure ["no radius 1"])
    (Success 20) (Failure ["no height 1"])
λ> s2 = shape (Success False) (Failure ["no radius 2"])
    (Success 20) (Failure ["no height 2"])
λ> twoShapes s1 s2
Failure ["no choice 1","no height 2"]

Another example of static analysis of selective functors is the Task abstraction from the previous blog post.

instance Monoid m => Selective (Const m) where
    handle = handleA

type Task c k v = forall f. c f => (k -> f v) -> k -> Maybe (f v)

dependencies :: Task Selective k v -> k -> [k]
dependencies task key = case task (\k -> Const [k]) key of
                            Nothing         -> []
                            Just (Const ks) -> ks

The definition of the Selective instance for the Const functor simply falls back to the applicative handleA, which allows us to extract the static structure of any selective computation very similarly to how this is done with applicative computations. In particular, the function dependencies returns an approximation of dependencies of a given key: instead of ignoring opaque conditional statements as in Validation, we choose to inspect both branches collecting dependencies from both of them.
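
To make the fallback concrete, here is a small sketch of what an applicative handleA could look like and how the Const instance then records the handler's effect even though it might be skipped at run time. The definition of handleA and the names label and possibleEffects are assumptions made for this example. The order in which labels are collected follows from this particular definition and may differ from the outputs shown elsewhere in this post.

```haskell
import Data.Functor.Const (Const (..))

-- The type class as used in this post (assumed shape).
class Applicative f => Selective f where
    handle :: f (Either a b) -> f (a -> b) -> f b

-- Applicative 'handle': always performs both effects, then decides purely.
handleA :: Applicative f => f (Either a b) -> f (a -> b) -> f b
handleA x f = (\e g -> either g id e) <$> x <*> f

instance Monoid m => Selective (Const m) where
    handle = handleA

-- Hypothetical helper: an effect that only records its label.
label :: String -> Const [String] a
label name = Const [name]

-- Labels of all effects that may occur, including the skippable handler.
possibleEffects :: [String]
possibleEffects = getConst (handle (label "condition") (label "handler"))
```

Here possibleEffects evaluates to ["condition","handler"]: no value is ever inspected, so both effects are reported.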

Here is an example from the Task blog post, where we used the Monad abstraction to express a spreadsheet with two formulas: B1 = IF(C1=1,B2,A2) and B2 = IF(C1=1,A1,B1).

task :: Task Monad String Integer
task fetch "B1" = Just $ do c1 <- fetch "C1"
                            if c1 == 1 then fetch "B2"
                                       else fetch "A2"
task fetch "B2" = Just $ do c1 <- fetch "C1"
                            if c1 == 1 then fetch "A1"
                                       else fetch "B1"
task _     _    = Nothing

Since this task description is monadic we could not analyse it statically. But now we can! All we need to do is rewrite it using Selective:

task :: Task Selective String Integer
task fetch "B1" = Just $
    ifS ((1==) <$> fetch "C1") (fetch "B2") (fetch "A2")
task fetch "B2" = Just $
    ifS ((1==) <$> fetch "C1") (fetch "A1") (fetch "B1")
task _     _    = Nothing

We can now apply the function dependencies defined above and draw the dependency graph using your favourite graph library:

λ> dependencies task "B1"
["A2","B2","C1"]
λ> dependencies task "B2"
["A1","B1","C1"]
λ> dependencies task "A1"
[]

λ> writeFile "task.dot" $ exportAsIs $ graph (dependencies task) "B1"
λ> :! dot -Tsvg task.dot -o task.svg

This produces the graph below, which matches the one I had to draw manually last time, since I had no Selective to help me.

Laws

Instances of the Selective type class must satisfy a few laws to make it possible to refactor selective computations. These laws also allow us to establish a formal relation with the Applicative and Monad type classes. The laws are complex, but I couldn’t figure out how to simplify them. Please let me know if you find an improvement.

(F1) Apply a pure function to the result:

f <$> handle x y = handle (second f <$> x) ((f .) <$> y)

(F2) Apply a pure function to the left (error) branch:

handle (first f <$> x) y = handle x ((. f) <$> y)

(F3) Apply a pure function to the handler:

handle x (f <$> y) =
    handle (first (flip f) <$> x) (flip ($) <$> y)

(P1) Apply a pure handler:

handle x (pure y) = either y id <$> x

(P2) Handle a pure error:

handle (pure (Left x)) y = ($x) <$> y

(A1) Associativity (in disguise):

handle x (handle y z) =
    handle (handle (f <$> x) (g <$> y)) (h <$> z)
  where
    f x = Right <$> x
    g y = \a -> bimap (,a) ($a) y
    h z = uncurry z

-- or in operator form with (<*?) = handle

x <*? (y <*? z) = (f <$> x) <*? (g <$> y) <*? (h <$> z)

Note that there is no law for handling a pure value, i.e. we do not require that the following holds:

handle (pure (Right x)) y = pure x

In particular, the following is allowed too:

handle (pure (Right x)) y = const x <$> y

We therefore allow handle to be selective about effects in this case. If we insisted on adding the first version of the above law, that would rule out the useful Const instance. If we insisted on the second version of the law, we would essentially be back to Applicative.

A consequence of the above laws is that apS satisfies Applicative laws (I do not have a formal proof, but you can find some proof sketches here). Note that we choose not to require that apS = (<*>), since this would forbid some interesting instances, such as Validation defined above.

If f is also a Monad, we require that handle = handleM.

Using the laws, it is possible to rewrite any selective computation into a normal form (the operator + denotes the sum type constructor):

   f (a + b + ... + z)    -- An initial value of a sum type
-> f (a -> (b + ... + z)) -- How to handle a's
-> f (b -> (c + ... + z)) -- How to handle b's
...
-> f (y -> z)             -- How to handle y's
-> f z                    -- The result

In words, we start with a sum type and handle each alternative in turn, possibly skipping unnecessary handlers, until we end up with a resulting value.

Alternative formulations

There are other ways of expressing selective functors in Haskell and most of them are compositions of applicative functors and the Either monad. Below I list a few examples. All of them are required to perform effects from left to right.

-- Composition of Applicative and Either monad
class Applicative f => SelectiveA f where
    (|*|) :: f (Either e (a -> b)) -> f (Either e a)
          -> f (Either e b)

-- Composition of Starry and Either monad
-- See: https://duplode.github.io/posts/applicative-archery.html
class Applicative f => SelectiveS f where
    (|.|) :: f (Either e (b -> c)) -> f (Either e (a -> b))
          -> f (Either e (a -> c))

-- Composition of Monoidal and Either monad
-- See: http://blog.ezyang.com/2012/08/applicative-functors/
class Applicative f => SelectiveM f where
    (|**|) :: f (Either e a) -> f (Either e b)
           -> f (Either e (a, b))

I believe these formulations are equivalent to Selective, but I have not proved the equivalence formally. I like the minimalistic definition of the type class based on handle, but the above alternatives are worth consideration too. In particular, SelectiveS has a much nicer associativity law, which is just (x |.| y) |.| z = x |.| (y |.| z).

Concluding remarks

Selective functors are powerful: like monads they allow us to inspect values in an effectful context. Many monadic computations can therefore be rewritten using the Selective type class. Many, but not all! Crucially, selective functors cannot implement the function join:

join :: Selective f => f (f a) -> f a
join = ... -- This puzzle has no solution, better solve 'select'!

I’ve been playing with selective functors for a few weeks, and I have to admit that they are very difficult to work with. Pretty much all selective combinators involve mind-bending manipulations of Lefts and Rights, with careful consideration of which effects are necessary. I hope all this complexity can be hidden in a library.

I haven’t yet looked into performance issues, but it is quite likely that it will be necessary to add more methods to the type class, so that their default implementations can be replaced with more efficient ones on an instance-by-instance basis (similar optimisations are done with Monad and Applicative).

Have you come across selective functors before? The definition of the type class is very simple, so somebody must have looked at it earlier.

Also, do you have any other interesting use-cases for selective functors?

Big thanks to Arseniy Alekseyev, Ulan Degenbaev and Georgy Lukyanov for useful discussions, which led to this blog post.

Footnotes and updates

^(*) As rightly pointed out by Darwin226 in the reddit discussion, handle = handleA gives a valid Selective instance for any Applicative, therefore calling it less powerful may be questionable. However, I would like to claim that Selective does provide additional power: it gives us vocabulary to talk about unnecessary effects. We might want to be able to express three different ideas:

Express the requirement that all effects must be performed. This corresponds to the Applicative type class and handleA. There is no way to distinguish necessary effects from unnecessary ones in the Applicative setting.
Express the requirement that unnecessary effects must be skipped. This is a stricter version of Selective, which corresponds to handleM.
Express the requirement that unnecessary effects may be skipped. This is the version of Selective presented in this blog post: handle is allowed to be anywhere in the range from handleA to handleM.

I think all three ideas are useful, and it is very interesting to study the stricter version of Selective too. I’d be interested in hearing suggestions for the corresponding set of laws. The following two laws seem sensible:

handle (Left  <$> x) f = flip ($) <$> x <*> f
handle (Right <$> x) f = x

The Task abstraction

Andrey — Mon, 26 Mar 2018 14:27:19 +0000

Neil Mitchell, Simon Peyton Jones and I have just finished a paper describing a systematic and executable framework for developing and comparing build systems. The paper and associated code are available here: https://github.com/snowleopard/build. The code is not yet well documented and polished, but I’ll get it into good shape in April. You can learn more about the motivation behind the project here.

(Update: the paper got accepted to ICFP! Read the PDF, watch the talk.)

In this blog post I would like to share one interesting abstraction that we came up with to describe build tasks:

type Task c k v = forall f. c f => (k -> f v) -> k -> Maybe (f v)

A Task is completely isolated from the world of compilers, file systems, dependency graphs, caches, and all other complexities of real build systems. It just computes the value of a key k, in a side-effect-free way, using a callback of type k → f v to find the values of its dependencies. One simple example of a callback is Haskell’s readFile function: as one can see from its type FilePath → IO String, given a key (a file path k = FilePath) it can find its value (the file contents of type v = String) by performing arbitrary IO effects (hence, f = IO). We require task descriptions to be polymorphic in f, so that we can reuse them in different computational contexts f without rewriting from scratch.

This highly-abstracted type is best introduced by an example. Consider the following Excel spreadsheet (yes, Excel is a build system in disguise):

A1: 10     B1: A1 + A2
A2: 20     B2: B1 * 2

Here cell A1 contains the value 10, cell B1 contains the formula A1 + A2, etc. We can represent the formulae (i.e. build tasks) of this spreadsheet with the following task description:

sprsh1 :: Task Applicative String Integer
sprsh1 fetch "B1" = Just ((+)  <$> fetch "A1" <*> fetch "A2")
sprsh1 fetch "B2" = Just ((*2) <$> fetch "B1")
sprsh1 _     _    = Nothing

We instantiate the type of keys k with String (cell names), and the type of values v with Integer (real spreadsheets contain a wider range of values, of course). The task description sprsh1 embodies all the formulae of the spreadsheet, but not the input values. Like every Task, sprsh1 is given a callback fetch and a key. It pattern-matches on the key to see if it has a task description (a formula) for it. If not, it returns Nothing, indicating that the key is an input. If there is a formula in the cell, it computes the value of the formula, using fetch to find the value of any keys on which it depends.

The definition of Task and the above example look a bit mysterious. Why do we require Task to be polymorphic in the type constructor f? Why do we choose the c = Applicative constraint? The answer is that given one task description, we would like to explore many different build systems that can build it and it turns out that each of them will use a different f. Furthermore, we found that constraints c classify build tasks in a very interesting way:

Task Applicative: In sprsh1 we needed only Applicative operations, expressing the fact that the dependencies between cells can be determined statically; that is, by looking at the formulae, without “computing” them — we’ll demonstrate this later.
Task Monad: some tasks cannot be expressed using only Applicative operations, since they inspect actual values and can take different computation paths with different dependencies. Dependencies of such tasks are dynamic, i.e. they cannot be determined statically.
Task Functor is somewhat degenerate: the task description cannot even use the application operator <*>, which limits dependencies to a single linear chain. Functorial tasks are easy to build, and are somewhat reminiscent of tail recursion.
Task Alternative, Task MonadPlus and their variants can be used for describing tasks with non-determinism.

Now let’s look at some examples of what we can do with tasks.

Compute

Given a task, we can compute the value corresponding to a given key by providing a pure store function that associates keys to values:

compute :: Task Monad k v -> (k -> v) -> k -> Maybe v
compute task store = fmap runIdentity . task (Identity . store)

Here we do not need any effects in the fetch callback to task, so we can use the standard Haskell Identity monad (I first learned about this trivial monad from this blog post). The use of Identity just fixes the ‘impedance mismatch’ between the function store, which returns a pure value v, and the fetch argument of the task, which must return an f v for some f. To fix the mismatch, we wrap the result of store in the Identity monad: the function Identity . store has the type k → Identity v, and can now be passed to a task. The result comes as Maybe (Identity v), hence we now need to get rid of the Identity wrapper by applying runIdentity to the contents of Maybe.

In the GHCi session below we define a pure key/value store with A1 set to 10 and all other keys set to 20 and compute the values corresponding to keys A1 and B1 in the sprsh1 example:

λ> store key = if key == "A1" then 10 else 20
λ> compute sprsh1 store "A1"
Nothing

λ> compute sprsh1 store "B1"
Just 30

As expected, we get Nothing for an input key A1 and Just 30 for B1.

Notice that, even though compute takes a Task Monad as its argument, its application to a Task Applicative typechecks just fine. It feels a bit like sub-typing, but is actually just ordinary higher-rank polymorphism.

Now let’s look at a function that can only be applied to applicative tasks.

Static dependencies

The formula A1 + A2 in the sprsh1 example statically depends on two keys: A1 and A2. Usually we would extract such static dependencies by looking at the syntax tree of the formula. But our Task abstraction has no such syntax tree. Yet, remarkably, we can use the polymorphism of a Task Applicative to find its dependencies. Here is the code:

dependencies :: Task Applicative k v -> k -> [k]
dependencies task key = case task (\k -> Const [k]) key of
                            Nothing         -> []
                            Just (Const ks) -> ks

Here Const is the standard Haskell Const functor. We instantiate f to Const [k]. So a value of type f v, or in this case Const [k] v, contains no value v, but does contain a list of keys of type [k] which we use to record dependencies. The fetch callback that we pass to task records a single dependency, and the standard Applicative instance for Const combines the dependencies from different parts of the task. Running the task with f = Const [k] will thus accumulate a list of the task’s dependencies – and that is just what dependencies does:

λ> dependencies sprsh1 "A1"
[]

λ> dependencies sprsh1 "B1"
["A1","A2"]

Notice that these calls to dependencies do no actual computation. They cannot: we are not supplying any input values. So, through the wonders of polymorphism, we are able to extract the dependencies of the spreadsheet formula, and to do so efficiently, simply by running its code in a different Applicative! This is not new, for example see this paper, but it is cool.
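
Since dependencies only returns the direct dependencies of a key, one natural follow-up is to iterate it. The sketch below repeats the definitions from this post so that it compiles on its own, and adds a hypothetical helper reachable (not part of the paper or its code) that collects all keys transitively reachable from a given key, in discovery order.

```haskell
{-# LANGUAGE ConstraintKinds, RankNTypes #-}
import Data.Functor.Const (Const (..))

type Task c k v = forall f. c f => (k -> f v) -> k -> Maybe (f v)

sprsh1 :: Task Applicative String Integer
sprsh1 fetch "B1" = Just ((+)  <$> fetch "A1" <*> fetch "A2")
sprsh1 fetch "B2" = Just ((*2) <$> fetch "B1")
sprsh1 _     _    = Nothing

dependencies :: Task Applicative k v -> k -> [k]
dependencies task key = case task (\k -> Const [k]) key of
    Nothing         -> []
    Just (Const ks) -> ks

-- Hypothetical helper: all keys reachable from a key, in discovery order.
-- The 'seen' list guards against revisiting keys (and against cycles).
reachable :: Eq k => Task Applicative k v -> k -> [k]
reachable task key = go [] (dependencies task key)
  where
    go seen []       = reverse seen
    go seen (k : ks)
        | k `elem` seen = go seen ks
        | otherwise     = go (k : seen) (dependencies task k ++ ks)
```

For example, reachable sprsh1 "B2" evaluates to ["B1","A1","A2"], again without computing any cell values.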

Dynamic dependencies

Some build tasks have dynamic dependencies, which are determined by values of intermediate computations. Such tasks correspond to the type Task Monad k v. Consider this spreadsheet example:

A1: 10      B1: IF(C1=1,B2,A2)      C1: 1
A2: 20      B2: IF(C1=1,A1,B1)

Note that B1 and B2 statically form a dependency cycle, but Excel (which uses dynamic dependencies) is perfectly happy. The diagram below illustrates how cyclic dependencies are resolved when projecting them on conditions C1=1 and C1=2 (rectangles and rounded rectangles denote inputs and outputs, respectively). Incidentally, my PhD thesis was about a mathematical model for such conditional dependency graphs, which was later developed into an algebra of graphs.

We can express this spreadsheet using our task abstraction as:

sprsh2 :: Task Monad String Integer
sprsh2 fetch "B1" = Just $ do c1 <- fetch "C1"
                              if c1 == 1 then fetch "B2"
                                         else fetch "A2"
sprsh2 fetch "B2" = Just $ do c1 <- fetch "C1"
                              if c1 == 1 then fetch "A1"
                                         else fetch "B1"
sprsh2 _     _    = Nothing

The big difference compared to sprsh1 is that the computation now takes place in a Monad, which allows us to extract the value of C1 and fetch different keys depending on whether or not C1 = 1.

We cannot find dependencies of monadic tasks statically; notice that the application of the function dependencies to sprsh2 will not typecheck. We need to run a monadic task with concrete values that will determine the discovered dependencies. Thus, we introduce the function track: a combination of compute and dependencies that computes both the resulting value and the list of its dependencies in an arbitrary monadic context m:

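-- The local signature for trackingFetch mentions k, m and v from the outer
-- signature, which requires ScopedTypeVariables and an explicit forall.
track :: forall m k v. Monad m =>
         Task Monad k v -> (k -> m v) -> k -> Maybe (m (v, [k]))
track task fetch = fmap runWriterT . task trackingFetch
  where
    trackingFetch :: k -> WriterT [k] m v
    trackingFetch k = tell [k] >> lift (fetch k)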

We use the standard^(*) Haskell WriterT monad transformer to record additional information — a list of keys [k] — when computing a task in an arbitrary monad m. We substitute the given fetch with a trackingFetch that, in addition to fetching a value, tracks the corresponding key. The task returns the value of type Maybe (WriterT [k] m v), which we unwrap by applying runWriterT to the contents of Maybe. Below we give an example of tracking monadic tasks when m = IO:

λ> fetchIO k = do putStr (k ++ ": "); read <$> getLine
λ> fromJust $ track sprsh2 fetchIO "B1"
C1: 1
B2: 10
(10,["C1","B2"])

λ> fromJust $ track sprsh2 fetchIO "B1"
C1: 2
A2: 20
(20,["C1","A2"])

As expected, the dependencies of cell B1 from sprsh2 are determined by the value of C1, which in this case is obtained by reading from the standard input via the fetchIO callback.
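
When no effects are needed in the callback, the same idea can be tried out with nothing but base. The sketch below is a simplified, pure variant (the name trackPure and the example store are assumptions made here): it uses the pair ([k], v) as a writer monad over the list of fetched keys, instead of WriterT over an arbitrary monad m.

```haskell
{-# LANGUAGE ConstraintKinds, RankNTypes #-}
import Data.Tuple (swap)

type Task c k v = forall f. c f => (k -> f v) -> k -> Maybe (f v)

sprsh2 :: Task Monad String Integer
sprsh2 fetch "B1" = Just $ do
    c1 <- fetch "C1"
    if c1 == 1 then fetch "B2" else fetch "A2"
sprsh2 fetch "B2" = Just $ do
    c1 <- fetch "C1"
    if c1 == 1 then fetch "A1" else fetch "B1"
sprsh2 _     _    = Nothing

-- Pure tracking: the pair ([k], v) is a writer monad that logs fetched keys.
trackPure :: Task Monad k v -> (k -> v) -> k -> Maybe (v, [k])
trackPure task store key = swap <$> task (\k -> ([k], store k)) key

-- Hypothetical store, parameterised by the value of C1.
-- B2 is given a fixed value, as if it had been built already.
store :: Integer -> String -> Integer
store c1 "C1" = c1
store _  "A1" = 10
store _  "A2" = 20
store _  "B2" = 10
store _  _    = 0
```

Here trackPure sprsh2 (store 1) "B1" evaluates to Just (10,["C1","B2"]) and trackPure sprsh2 (store 2) "B1" to Just (20,["C1","A2"]), mirroring the IO session above.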

A simple build system

Given a task description, a target key, and a store, a build system returns a new store in which the values of the target key and all its dependencies are up to date. What does “up to date” mean? The paper answers that in a formal way.

The three functions described above (compute, dependencies and track) are sufficient for defining the correctness of build systems as well as for implementing a few existing build systems at a conceptual level. Below is an example of a very simple (but inefficient) build system:

-- The local signature for fetch needs ScopedTypeVariables and the explicit forall.
busy :: forall k v. Eq k => Task Monad k v -> k -> Store k v -> Store k v
busy task key store = execState (fetch key) store
  where
    fetch :: k -> State (Store k v) v
    fetch k = case task fetch k of
        Nothing  -> gets (getValue k)
        Just act -> do v <- act; modify (putValue k v); return v

Here Store k v is an abstract store datatype equipped with getValue and putValue functions. The busy build system defines the callback fetch so that, when given a target key, it brings the key up to date in the store, and returns its value. The function fetch runs in the standard Haskell State monad, initialised with the incoming store by execState. To bring a key k up to date, fetch asks the task description task how to compute k. If task returns Nothing the key is an input, so fetch simply reads the result from the store. Otherwise fetch runs the action act returned by the task to produce a resulting value v, records the new key/value mapping in the store, and returns v. Notice that fetch passes itself to task as an argument, so that the latter can use fetch to recursively find the values of k’s dependencies.

Given an acyclic task description, the busy build system terminates with a correct result, but it is not a minimal build system: it doesn’t keep track of keys it has already built, and will therefore busily recompute the same keys again and again if they have multiple dependants. See the paper for implementations of much more efficient build systems.
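
To experiment with busy without pulling in any packages, one can pick a concrete store and a hand-written State monad. The sketch below does exactly that; the function-based Store, its getValue and putValue, the tiny State type and the example inputs are all assumptions made for this illustration, not the definitions used in the paper.

```haskell
{-# LANGUAGE ConstraintKinds, RankNTypes, ScopedTypeVariables #-}

type Task c k v = forall f. c f => (k -> f v) -> k -> Maybe (f v)

-- A tiny State monad, written out so the sketch only depends on base.
newtype State s a = State { runState :: s -> (a, s) }

instance Functor (State s) where
    fmap f (State g) = State $ \s -> let (a, s') = g s in (f a, s')

instance Applicative (State s) where
    pure a = State $ \s -> (a, s)
    State f <*> State g = State $ \s ->
        let (h, s')  = f s
            (a, s'') = g s'
        in  (h a, s'')

instance Monad (State s) where
    State g >>= f = State $ \s -> let (a, s') = g s in runState (f a) s'

-- Hypothetical concrete store: a total function from keys to values.
newtype Store k v = Store (k -> v)

getValue :: k -> Store k v -> v
getValue k (Store f) = f k

putValue :: Eq k => k -> v -> Store k v -> Store k v
putValue k v (Store f) = Store $ \x -> if x == k then v else f x

busy :: forall k v. Eq k => Task Monad k v -> k -> Store k v -> Store k v
busy task key store = snd (runState (fetch key) store)
  where
    fetch :: k -> State (Store k v) v
    fetch k = case task fetch k of
        Nothing  -> State $ \s -> (getValue k s, s)  -- input: read the store
        Just act -> do
            v <- act                                 -- recompute the key
            State $ \s -> (v, putValue k v s)        -- and record the result

sprsh1 :: Task Applicative String Integer
sprsh1 fetch "B1" = Just ((+)  <$> fetch "A1" <*> fetch "A2")
sprsh1 fetch "B2" = Just ((*2) <$> fetch "B1")
sprsh1 _     _    = Nothing

-- Example inputs: A1 = 10, A2 = 20, everything else 0.
inputs :: Store String Integer
inputs = Store $ \k -> case k of
    "A1" -> 10
    "A2" -> 20
    _    -> 0
```

After busy sprsh1 "B2" inputs, looking up "B1" and "B2" in the resulting store gives 30 and 60, respectively.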

For a few monads more

We have already used a few cool Haskell types (Identity, Const, WriterT and State) to manipulate our Task abstraction. Let’s meet a few other members of the cool-types family: Proxy, ReaderT, MaybeT and ExceptT.

The Proxy data type allows us to check whether a key is an input without providing a fetch callback:

isInput :: Task Monad k v -> k -> Bool
isInput task = isNothing . task (const Proxy)

This works similarly to the dependencies function, but in this case we do not even need to record any additional information, thus we can replace Const with Proxy.

One might wonder: if we do not need the fetch callback in case of input, can we rewrite our Task abstraction as follows?

type Task2 c k v = forall f. c f => k -> Maybe ((k -> f v) -> f v)

Yes, we can! This definition is isomorphic to Task. This isn’t immediately obvious, so below is a proof. I confess: it took me a while to find it.

toTask :: Task2 Monad k v -> Task Monad k v
toTask task2 fetch key = ($fetch) <$> task2 key

fromTask :: Task Monad k v -> Task2 Monad k v
fromTask task key = runReaderT <$> task (\k -> ReaderT ($k)) key

The toTask conversion is relatively straightforward, but fromTask is not: it uses a ReaderT monad transformer to supply the fetch callback as the computation environment, extracting the final value with runReaderT.

Our task abstraction operates on pure values and has no mechanism for exception handling. It turns out that it is easy to turn any Task into a task that can handle arbitrary exceptions occurring in the fetch callback:

exceptional :: Task Monad k v -> Task Monad k (Either e v)
exceptional task fetch = fmap runExceptT . task (ExceptT . fetch)

The exceptional task transformer simply hides exceptions of the given fetch of type k → f (Either e v) by using the standard ExceptT monad transformer, passes the resulting fetch callback of type k → ExceptT e f v to the original task, and propagates the exceptions by runExceptT. Using MaybeT, one can also implement a similar task transformer that turns a Task Monad k v into its partial version Task Monad k (Maybe v).
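
The MaybeT variant mentioned above might look as follows. This is a sketch: the names partial and fetchKnown are made up for the example, MaybeT comes from the transformers package, and Identity is used only to keep the demonstration pure.

```haskell
{-# LANGUAGE ConstraintKinds, RankNTypes #-}
import Control.Monad.Trans.Maybe (MaybeT (..))
import Data.Functor.Identity (Identity (..))

type Task c k v = forall f. c f => (k -> f v) -> k -> Maybe (f v)

-- Hypothetical name: run a task against a fetch callback that may fail.
partial :: Task Monad k v -> Task Monad k (Maybe v)
partial task fetch = fmap runMaybeT . task (MaybeT . fetch)

sprsh1 :: Task Applicative String Integer
sprsh1 fetch "B1" = Just ((+)  <$> fetch "A1" <*> fetch "A2")
sprsh1 fetch "B2" = Just ((*2) <$> fetch "B1")
sprsh1 _     _    = Nothing

-- Example callback: only the inputs A1 and A2 are known.
fetchKnown :: String -> Identity (Maybe Integer)
fetchKnown "A1" = Identity (Just 10)
fetchKnown "A2" = Identity (Just 20)
fetchKnown _    = Identity Nothing
```

With this callback, partial sprsh1 fetchKnown "B1" evaluates to Just (Identity (Just 30)), whereas for "B2" the missing value of B1 propagates and the result is Just (Identity Nothing).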

Our final exercise is to extract all possible computation results of a non-deterministic task, e.g. B1 = A1 + RANDBETWEEN(1,2) that can be described as a Task Alternative:

sprsh3 :: Task Alternative String Integer
sprsh3 fetch "B1" = Just $ (+) <$> fetch "A1" <*> (pure 1 <|> pure 2)
sprsh3 _     _    = Nothing

We therefore introduce the function computeND that returns the list of all possible results of the task instead of just one value (‘ND’ stands for ‘non-deterministic’):

computeND :: Task MonadPlus k v -> (k -> v) -> k -> Maybe [v]
computeND task store = task (return . store)

The implementation is almost straightforward: we choose f = [] reusing the standard MonadPlus instance for lists. Let’s give it a try:

λ> store key = if key == "A1" then 10 else 20
λ> computeND sprsh3 store "A1"
Nothing

λ> computeND sprsh3 store "B1"
Just [11,12]

λ> computeND sprsh1 store "B1"
Just [30]

Notice that we can apply computeND to both non-deterministic (sprsh3) as well as deterministic (sprsh1) task descriptions.

Non-deterministic tasks are interesting because they allow one to try different algorithms to compute a value in parallel and grab the first available result — a good example is portfolio-based parallel SAT solvers. This shouldn’t be confused with a deterministic composition of tasks, which is also a useful operation, but does not involve any non-determinism:

compose :: Task Monad k v -> Task Monad k v -> Task Monad k v
compose t1 t2 fetch key = t1 fetch key <|> t2 fetch key

Here we simply compose two task descriptions, picking the first one that knows how to compute a given key. Together with the trivial task that returns Nothing for all keys, this gives rise to the Task monoid.
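
A small sketch of this monoid in action is given below. It repeats compute and compose from this post and splits the sprsh1 formulas into two separate task descriptions; the names noTask, taskB1, taskB2, both and store are assumptions made for the example.

```haskell
{-# LANGUAGE ConstraintKinds, RankNTypes #-}
import Control.Applicative ((<|>))
import Data.Functor.Identity (Identity (..))

type Task c k v = forall f. c f => (k -> f v) -> k -> Maybe (f v)

compute :: Task Monad k v -> (k -> v) -> k -> Maybe v
compute task store' = fmap runIdentity . task (Identity . store')

compose :: Task Monad k v -> Task Monad k v -> Task Monad k v
compose t1 t2 fetch key = t1 fetch key <|> t2 fetch key

-- The trivial task (hypothetical name): the identity of 'compose'.
noTask :: Task Monad k v
noTask _ _ = Nothing

-- Two task descriptions that each know about a single key.
taskB1, taskB2 :: Task Monad String Integer
taskB1 fetch "B1" = Just ((+) <$> fetch "A1" <*> fetch "A2")
taskB1 _     _    = Nothing
taskB2 fetch "B2" = Just ((*2) <$> fetch "B1")
taskB2 _     _    = Nothing

-- Their composition knows about both keys.
both :: Task Monad String Integer
both = compose taskB1 taskB2

-- The same pure store as in the GHCi sessions of this post.
store :: String -> Integer
store key = if key == "A1" then 10 else 20
```

Here compute both store "B1" evaluates to Just 30 and compute both store "B2" to Just 40 (compute reads B1 directly from the store, where it is 20). Composing with noTask on either side leaves these results unchanged.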

Final remarks

We introduced the task abstraction to study build systems, but it seems to be linked to a few other topics, such as memoization, self-adjusting computation, lenses and profunctor optics, propagators and what not.

Have we just reinvented the wheel? It might seem so, especially if you look at these type signatures from the lens library:

type Lens s t a b
    = forall f. Functor f => (a -> f b) -> s -> f t

type Traversal s t a b
    = forall f. Applicative f => (a -> f b) -> s -> f t

Our implementations of functions like dependencies are heavily inspired by — or to be more accurate — stolen from the lens library. Alas, we have been unable to remove the Maybe used to encode whether a key is an input, without complicating other aspects of our definition.

The task abstraction can be used to express pure functions in a way that is convenient for their memoization. Here is an example of encoding a favourite function of functional programmers:

fibonacci :: Task Applicative Integer Integer
fibonacci fetch n
    | n >= 2 = Just $ (+) <$> fetch (n-1) <*> fetch (n-2)
    | otherwise = Nothing

Here the keys n < 2 are input parameters, and one can obtain the usual Fibonacci sequence by picking 0 and 1 for n = 0 and n = 1, respectively. Any minimal build system will compute the sequence with memoization, i.e. without recomputing the same value twice.
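
As a rough illustration of the memoization claim, the sketch below evaluates a task while caching computed keys in a Map. The function memo is a hypothetical helper written for this example (it is not one of the build systems from the paper), and it relies on the transformers and containers packages that ship with GHC.

```haskell
{-# LANGUAGE ConstraintKinds, RankNTypes, ScopedTypeVariables #-}
import Control.Monad.Trans.State.Strict (State, evalState, gets, modify)
import qualified Data.Map.Strict as Map

type Task c k v = forall f. c f => (k -> f v) -> k -> Maybe (f v)

fibonacci :: Task Applicative Integer Integer
fibonacci fetch n
    | n >= 2    = Just $ (+) <$> fetch (n-1) <*> fetch (n-2)
    | otherwise = Nothing

-- Hypothetical helper: evaluate a key, computing every non-input key at
-- most once. Inputs are read from the given pure function.
memo :: forall k v. Ord k => Task Monad k v -> (k -> v) -> k -> v
memo task inputs key = evalState (fetch key) Map.empty
  where
    fetch :: k -> State (Map.Map k v) v
    fetch k = case task fetch k of
        Nothing  -> pure (inputs k)
        Just act -> do
            cached <- gets (Map.lookup k)
            case cached of
                Just v  -> pure v          -- already computed: reuse it
                Nothing -> do
                    v <- act               -- compute via recursive fetches
                    modify (Map.insert k v)
                    pure v
```

Using id for the inputs assigns 0 and 1 to the keys 0 and 1, so memo fibonacci id 10 evaluates to 55, and larger keys such as 50 remain tractable because each key is computed only once.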

Interestingly, the Ackermann function — a famous example of a function that is not primitive recursive — can’t be expressed as a Task Applicative, since it needs to perform an intermediate recursive call to determine one of its dependencies:

ackermann :: Task Monad (Integer, Integer) Integer
ackermann fetch (m, n)
    | m < 0 || n < 0 = Nothing
    | m == 0    = Just $ pure (n + 1)
    | n == 0    = Just $ fetch (m - 1, 1)
    | otherwise = Just $ do index <- fetch (m, n - 1)
                            fetch (m - 1, index)

Now that we’ve seen examples of applicative and monadic tasks, let us finish with an example of a functorial task — the Collatz sequence:

collatz :: Task Functor Integer Integer
collatz fetch n | n <= 0    = Nothing
                | otherwise = Just $ f <$> fetch (n - 1)
  where
    f k | even k    = k `div` 2
        | otherwise = 3 * k + 1

So here is a claim: given a Task, we can memoize, self-adjust, propagate and probably do any other conceivable computation on it simply by picking the right build system!

Update: how to handle failures

Sjoerd Visscher’s comment (below) pointed out that the fetch callback is defined to be total: it has type k → f v and returns a value for every key. It may be useful to allow it to fail for some keys. I know of three ways of modelling failure using the Task abstraction:

(1) Include failures into the type of values v, for example:

data Value = FileNotFound | FileContents ByteString

This is convenient if tasks are aware of failures. For example, a task may be able to cope with missing files, e.g. if fetch "username.txt" returns FileNotFound, the task could use the literal string “User” as a default value. In this case it will depend on the fact that the file username.txt is missing, and will need to be rebuilt if the user later creates this file.

In many cases this approach is isomorphic to choosing v = Either e v’.

(2) Include failures into the computation context f, for example:

cells :: Map String Integer
cells = Map.fromList [("A1", 10), ("A2", 20)]

fetch :: String -> Maybe Integer
fetch k = Map.lookup k cells

We are choosing f = Maybe and thanks to the polymorphism of Task, any task can be executed in this context without any changes. For example, sprsh1 fetch "B1" now returns Just (Just 30), but sprsh1 fetch "B2" fails with Just Nothing.

This is convenient if tasks are not aware of failures, e.g. we can model Excel formulas as pure arithmetic functions, and introduce failures “for free” if/when needed by instantiating Task with an appropriate f. Also see the function exceptional defined above, which allows us to add arbitrary exceptions to a failure-free context f.

(3) Finally, the task itself might not want to encode failures into the type of values v, but instead demand that f has a built-in notion of failures. This can be done by choosing a suitable constraint c, such as Alternative, MonadPlus or even better something specific to failures e.g. MonadZero or MonadFail. Then both the callback and the task can reuse the same failure mechanism as shown below:

class Monad m => MonadFail m where
    fail :: String -> m a

sprsh4 :: Task MonadFail String Integer
sprsh4 fetch "B1" = Just $ do
    a1 <- fetch "A1"
    a2 <- fetch "A2"
    if a2 == 0 then fail "division by 0" else return (a1 `div` a2)
sprsh4 _ _ = Nothing

Are there any other types of failure that are not covered above?

Footnotes

^(*) Beware: as of writing, standard WriterT transformers have a space leak which may be an issue if a task has many dependencies. You might want to consider using a more efficient CPS-based WriterT transformer.

Formal verification of spacecraft control programs

Andrey — Wed, 07 Feb 2018 10:46:20 +0000

Last February I spent two weeks in Vienna, visiting Jakob Lechner and RUAG Space Austria, a company developing components for space missions. Jakob and his colleagues designed a specialised processing core called REDFIN (REDuced instruction set for Fixed-point & INteger arithmetic) for executing simple spacecraft control tasks, such as satellite antenna pointing. During the visit I implemented a prototype of a formal verification framework to support the development of REDFIN programs. Afterwards I was quite busy with my other projects, but my PhD student Georgy Lukyanov helped to further improve the prototype.

Jakob, Georgy and I have just submitted a conference paper describing the REDFIN core and the verification framework. Please have a look and let us know what you think. This will be a timely read after yesterday’s exciting SpaceX launch.

As we all know, writing programs is easy. Writing correct programs on the other hand is very hard. And if you are writing a program for a space mission you had better be sure it is correct. So, what can you realistically do? Of course you should write and/or generate a lot of tests, but tests do not provide a full correctness guarantee. You can also use a strongly typed programming language and prove some properties at compile time, but your space-grade processor is highly unlikely to be a well-supported compilation target (and you might not be able to take your favourite garbage collector to space: it’s way too heavy!). You could also use formal methods and develop programs with the help of a theorem prover, eventually refining an abstract specification down to the level of bare metal. That might take you years though, especially if you have no previous experience in formal methods.

When I was working on my PhD I did some work on formal specification of processors and instruction sets, so this is a long-time interest of mine. Hence, when I heard about the REDFIN project I immediately invited myself to RUAG Space Austria and tried to figure out a way to engineer a simple solution that can be integrated into the existing development and verification workflow without a big disruption and a steep learning curve. Eventually I used Haskell to implement a small DSL for capturing the semantics of REDFIN programs and connected it to an SMT solver using SBV, a wonderful symbolic verification library developed by Levent Erkok (huge thanks!). This idea is not new and we reference a few early papers that developed and applied it to Arm and Intel processors. I also got a lot of inspiration from this blog post by Stephen Diehl and this talk by Tikhon Jelvis. Thank you Stephen and Tikhon!

P.S.: The code is not yet available, but I hope we’ll release it soon.

Hadrian is on the way

Andrey — Wed, 15 Nov 2017 23:54:53 +0000

Hadrian, a new build system for GHC that we have been working on for the past three years, has finally been merged into the GHC tree (update: we have temporarily switched to a submodule). However, it’s not yet time to celebrate: there are still many issues that need to be addressed before the Make-based build system may retire.

Want to try? Check out the GHC repository and run hadrian/build.sh -j (or hadrian/build.bat -j on Windows), and it should build you a GHC binary. In case of problems, have a look at the README and/or raise an issue.

Here is a quick update on the ongoing development:

Hadrian can build GHC and can already be used as part of the CI infrastructure. However, the resulting binary does not pass the validation. Zhen Zhang is looking into this, but more help is needed.
A major refactoring by Moritz Angermann is on the way. Moritz is primarily interested in cross compilation, but to make it work he had to get rid of the ghc-cabal utility, reorganise the build tree, and make numerous other improvements to Hadrian.
There is currently no support for binary distribution. Ben Gamari is looking into this issue.
Dynamic linking on Windows is not implemented. Tamar Christina has kindly offered help with this.
Hadrian source code is still not fully documented and tested, and generally requires some polishing. I am currently taking care of this when not distracted by urgent bug fixes and will appreciate your help in making Hadrian easier to understand and use.

I can’t believe that we seem to be approaching the finish line! It’s been a long, tedious but also interesting project. Thank you all for helping us get this far, and I hope we’ll celebrate the switch from Make to Hadrian soon.

Old graphs from new types

Andrey — Tue, 31 Jan 2017 23:36:52 +0000

After I got back from the holiday that I planned in the previous blog post, I spent the whole January playing with the algebra of graphs and trying to find interesting and useful ways of constructing graphs, focusing on writing polymorphic code that can manipulate graph expressions without turning them into concrete data structures. I’ve put together a small toolbox containing a few quirky types, which I’d like to share with you in this blog post. If you are not familiar with the algebra of graphs, please read the introductory blog post first.

Update: This series of blog posts was published as a functional pearl at the Haskell Symposium 2017.

Graph transpose

One of the simplest transformations one can apply to a graph is to flip the direction of all of its edges. It’s usually straightforward to implement but whatever data structure you use to represent graphs, you will spend at least O(1) time to modify it (say, by flipping the treatAsTransposed flag); much more often you will have to traverse the data structure and flip every edge, resulting in O(|V|+|E|) time complexity. What if I told you that by using Haskell’s type system, we can transpose polymorphic graphs in zero time? Sounds suspicious? Let’s see how this works.

Consider the following Graph instance:

newtype Transpose g = T { transpose :: g }

instance Graph g => Graph (Transpose g) where
    type Vertex (Transpose g) = Vertex g
    empty       = T empty
    vertex      = T . vertex
    overlay x y = T $ transpose x `overlay` transpose y
    connect x y = T $ transpose y `connect` transpose x -- flip

We wrap a graph in a newtype flipping the order of arguments to connect. Let’s check if this works:

λ> edgeList $ 1 * (2 + 3) * 4
[(1,2),(1,3),(1,4),(2,4),(3,4)]
λ> edgeList $ transpose $ 1 * (2 + 3) * 4
[(2,1),(3,1),(4,1),(4,2),(4,3)]

Cool! And this has zero runtime cost, because all we do is wrapping and unwrapping the newtype, which is guaranteed to be free. As an exercise, verify that transpose is an antihomomorphism on graphs, that is:

T(ε) = ε
T(v) = v
T(x + y) = T(x) + T(y)
T(x → y) = T(y) → T(x)

Furthermore, transpose is its own inverse: transpose . transpose = id.
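If you would like to check these claims on a concrete instance, here is a self-contained sketch. The Graph class below is a local stand-in for the one from the introductory post, and Relation (a set of vertices plus a set of edges), edgeList and example are hypothetical names introduced just for this illustration.

```haskell
{-# LANGUAGE TypeFamilies #-}
import           Data.Set (Set)
import qualified Data.Set as Set

-- Local stand-in for the Graph type class used throughout the post.
class Graph g where
    type Vertex g
    empty   :: g
    vertex  :: Vertex g -> g
    overlay :: g -> g -> g
    connect :: g -> g -> g

-- Hypothetical concrete instance: vertices and edges stored as sets.
data Relation a = Relation { vertexSet :: Set a, edgeSet :: Set (a, a) }
    deriving (Eq, Show)

instance Ord a => Graph (Relation a) where
    type Vertex (Relation a) = a
    empty       = Relation Set.empty Set.empty
    vertex x    = Relation (Set.singleton x) Set.empty
    overlay x y = Relation (vertexSet x `Set.union` vertexSet y)
                           (edgeSet x `Set.union` edgeSet y)
    connect x y = Relation (vertexSet x `Set.union` vertexSet y)
        (Set.unions [ edgeSet x, edgeSet y, Set.fromList
            [ (a, b) | a <- Set.toList (vertexSet x), b <- Set.toList (vertexSet y) ] ])

newtype Transpose g = T { transpose :: g }

instance Graph g => Graph (Transpose g) where
    type Vertex (Transpose g) = Vertex g
    empty       = T empty
    vertex      = T . vertex
    overlay x y = T $ transpose x `overlay` transpose y
    connect x y = T $ transpose y `connect` transpose x -- flip

-- The polymorphic expression 1 * (2 + 3) * 4 without the Num sugar.
example :: (Graph g, Vertex g ~ Int) => g
example = vertex 1 `connect` (vertex 2 `overlay` vertex 3) `connect` vertex 4

edgeList :: Relation a -> [(a, a)]
edgeList = Set.toAscList . edgeSet
```

Here transpose example :: Relation Int has every edge of example flipped, and transpose (transpose example) is equal to example again, because the arguments of connect get swapped twice.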

To make sure transpose is only applied to polymorphic graphs, we do not export the constructor T, therefore the only way to call transpose is to give it a polymorphic argument and let the type inference interpret it as a value of type Transpose. The type signature is a little unsatisfying though:

λ> :t transpose
transpose :: Transpose g -> g

It’s not clear at all from the type that the function operates on graphs. Do you have any ideas how to improve it?

Merging graph vertices with a functor

Here is a puzzle for you: can you implement a function gmap that given a function a -> b and a polymorphic graph whose vertices are of type a will produce a polymorphic graph with vertices of type b by applying the function to each vertex? Yes, this is almost a Functor but it doesn’t have the usual type signature, because Graph is not a higher-kinded type.

My solution is as follows, but I feel there may be simpler ones:

newtype GraphFunctor a =
    GF { gfor :: forall g. Graph g => (a -> Vertex g) -> g }

instance Graph (GraphFunctor a) where
    type Vertex (GraphFunctor a) = a
    empty       = GF $ \_ -> empty
    vertex  x   = GF $ \f -> vertex (f x)
    overlay x y = GF $ \f -> gmap f x `overlay` gmap f y
    connect x y = GF $ \f -> gmap f x `connect` gmap f y

gmap :: Graph g => (a -> Vertex g) -> GraphFunctor a -> g
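-- Same as flip gfor, written with explicit arguments: the point-free form
-- may be rejected by GHC 9.0 and later because of simplified subsumption.
gmap f x = gfor x f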

Essentially, we are defining another newtype wrapper, which pushes the given function all the way towards the vertices. This has no runtime cost, just as before, although the actual evaluation of the given function at each vertex will not be free, of course. Let’s test this!

λ> adjacencyList $ 1 * 2 * 3 + 4 * 5
[(1,[2,3]),(2,[3]),(3,[]),(4,[5]),(5,[])]
λ> :t gmap (+1) $ 1 * 2 * 3 + 4 * 5
gmap (+1) $ 1 * 2 * 3 + 4 * 5 :: (Graph g, Num (Vertex g)) => g
λ> adjacencyList $ gmap (+1) $ 1 * 2 * 3 + 4 * 5
[(2,[3,4]),(3,[4]),(4,[]),(5,[6]),(6,[])]

As you can see, we can increment the value of each vertex by mapping the function (+1) over the graph. The resulting expression is a polymorphic graph, as desired. Again, we’ve done some useful work without turning the graph into a concrete data structure. As an exercise, show that gmap satisfies the functor laws: gmap id = id and gmap f . gmap g = gmap (f . g). A useful first step is to prove that mapping a function is a homomorphism:

M_f(ε) = ε
M_f(v) = f(v)
M_f(x + y) = M_f(x) + M_f(y)
M_f(x → y) = M_f(x) → M_f(y)

An alert reader might wonder: what happens if the function maps two original vertices into the same one? They will be merged! Merging graph vertices is a useful function, so let’s define it in terms of gmap:

mergeVertices :: Graph g => (Vertex g -> Bool) -> Vertex g
    -> GraphFunctor (Vertex g) -> g
mergeVertices p v = gmap $ \u -> if p u then v else u

λ> adjacencyList $ mergeVertices odd 3 $ 1 * 2 * 3 + 4 * 5
[(2,[3]),(3,[2,3]),(4,[3])]

The function takes a predicate on graph vertices and a target vertex and maps all vertices satisfying the predicate into the target vertex, thereby merging them. In our example all odd vertices {1, 3, 5} are merged into 3, in particular creating the self-loop 3 → 3. Note: it takes linear time O(|g|) for mergeVertices to apply the predicate to each vertex (|g| is the size of the expression g), which may be much more efficient than merging vertices in a concrete data structure; for example, if the graph is represented by an adjacency matrix, it will likely be necessary to rebuild the resulting matrix from scratch, which takes O(|V|^2) time. Since for many graphs we have |g| = O(|V|), the matrix-based mergeVertices will run in O(|g|^2).
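Here is a self-contained sketch that puts gmap and mergeVertices together and lets you test the functor laws on an example. As before, the Graph class is a local stand-in, and Relation, example and merged are hypothetical names; gmap is written with explicit arguments so that it also compiles with recent GHC versions.

```haskell
{-# LANGUAGE RankNTypes, TypeFamilies #-}
import           Data.Set (Set)
import qualified Data.Set as Set

-- Local stand-in for the Graph type class used throughout the post.
class Graph g where
    type Vertex g
    empty   :: g
    vertex  :: Vertex g -> g
    overlay :: g -> g -> g
    connect :: g -> g -> g

-- Hypothetical concrete instance: vertices and edges stored as sets.
data Relation a = Relation { vertexSet :: Set a, edgeSet :: Set (a, a) }
    deriving (Eq, Show)

instance Ord a => Graph (Relation a) where
    type Vertex (Relation a) = a
    empty       = Relation Set.empty Set.empty
    vertex x    = Relation (Set.singleton x) Set.empty
    overlay x y = Relation (vertexSet x `Set.union` vertexSet y)
                           (edgeSet x `Set.union` edgeSet y)
    connect x y = Relation (vertexSet x `Set.union` vertexSet y)
        (Set.unions [ edgeSet x, edgeSet y, Set.fromList
            [ (a, b) | a <- Set.toList (vertexSet x), b <- Set.toList (vertexSet y) ] ])

newtype GraphFunctor a =
    GF { gfor :: forall g. Graph g => (a -> Vertex g) -> g }

instance Graph (GraphFunctor a) where
    type Vertex (GraphFunctor a) = a
    empty       = GF (\_ -> empty)
    vertex  x   = GF (\f -> vertex (f x))
    overlay x y = GF (\f -> gmap f x `overlay` gmap f y)
    connect x y = GF (\f -> gmap f x `connect` gmap f y)

gmap :: Graph g => (a -> Vertex g) -> GraphFunctor a -> g
gmap f x = gfor x f

mergeVertices :: Graph g => (Vertex g -> Bool) -> Vertex g
    -> GraphFunctor (Vertex g) -> g
mergeVertices p v = gmap (\u -> if p u then v else u)

-- The polymorphic expression 1 * 2 * 3 + 4 * 5 without the Num sugar.
example :: (Graph g, Vertex g ~ Int) => g
example = (vertex 1 `connect` vertex 2 `connect` vertex 3)
    `overlay` (vertex 4 `connect` vertex 5)

-- All odd vertices are merged into vertex 3.
merged :: Relation Int
merged = mergeVertices odd 3 example
```

The edge set of merged contains 2 → 3, 3 → 2, 4 → 3 and the self-loop 3 → 3, which agrees with the adjacency list shown above.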

Expanding vertices into subgraphs (hey, monads!)

What do the operations of removing a vertex and splitting a vertex have in common? They both can be implemented by replacing each vertex of a graph with a (possibly empty) subgraph and flattening the result. Sounds familiar? You may recognise this as the monadic bind function, or Haskell’s operator >>=, which is so useful that it even made it into the Haskell logo. We are going to implement bind on graphs by wrapping it into yet another newtype:

newtype GraphMonad a =
    GM { bind :: forall g. Graph g => (a -> g) -> g }

instance Graph (GraphMonad a) where
    type Vertex (GraphMonad a) = a
    empty       = GM $ \_ -> empty
    vertex  x   = GM $ \f -> f x -- here is the trick!
    overlay x y = GM $ \f -> bind x f `overlay` bind y f
    connect x y = GM $ \f -> bind x f `connect` bind y f

As you can see, the implementation is almost identical to gmap: instead of wrapping the value f x into a vertex, we should just leave it as is. The resulting transformation is also a homomorphism. Let’s see how we can make use of this new type in our toolbox.

We are first going to implement a filter-like function induce that, given a vertex predicate and a graph, will compute the induced subgraph on the set of vertices that satisfy the predicate by turning all other vertices into empty subgraphs and flattening the result.

induce :: Graph g => (Vertex g -> Bool)
    -> GraphMonad (Vertex g) -> g
induce p g = bind g $ \v -> if p v then vertex v else empty

λ> edgeList $ clique [0..4]
[(0,1),(0,2),(0,3),(0,4),(1,2),(1,3),(1,4),(2,3),(2,4),(3,4)]
λ> edgeList $ induce (<3) $ clique [0..4]
[(0,1),(0,2),(1,2)]
λ> induce (<3) (clique [0..4]) == (clique [0..2] :: Basic Int)
True

As you can see, by inducing a clique on a subset of the vertices that we like (<3), we get a smaller clique, as expected.

We can now implement removeVertex via induce:

removeVertex :: (Eq (Vertex g), Graph g) => Vertex g
    -> GraphMonad (Vertex g) -> g
removeVertex v = induce (/= v)

λ> adjacencyList $ removeVertex 2 $ 1 * (2 + 3)
[(1,[3]),(3,[])]

Removing an edge is not as simple. I suspect that this has something to do with the fact that the corresponding transformation doesn’t seem to be a homomorphism. Indeed, you will find it tricky to satisfy the last homomorphism requirement on R_x→y:

R_x→y(x → y) = R_x→y(x) → R_x→y(y)

We can, however, implement a function disconnect that removes all edges between two different vertices as follows:

disconnect :: (Eq (Vertex g), Graph g) => Vertex g -> Vertex g
    -> GraphMonad (Vertex g) -> g
disconnect u v g = removeVertex u g `overlay` removeVertex v g

λ> adjacencyList $ disconnect 1 2 $ 1 * (2 + 3)
[(1,[3]),(2,[]),(3,[])]

That is, we create two graphs: one without u, the other without v, and overlay them, which removes both u → v and v → u edges. I still don’t have a solution for removing just a single edge u → v, or even just a self-loop v → v (note: disconnect v v = removeVertex v). Maybe you can find a solution? (Update: Arseniy Alekseyev found a solution for removing self-loops that can be generalised for removing edges, see a note at the end of the blog post.)

Curiously, we can have a slightly shorter implementation of disconnect, because a function returning a graph can also be given a Graph instance:

instance Graph g => Graph (a -> g) where
    type Vertex (a -> g) = Vertex g
    empty       = pure empty
    vertex      = pure . vertex
    overlay x y = overlay <$> x <*> y
    connect x y = connect <$> x <*> y

disconnect :: (Eq (Vertex g), Graph g) => Vertex g -> Vertex g
    -> GraphMonad (Vertex g) -> g
disconnect u v = removeVertex u `overlay` removeVertex v

Finally, as promised, here is how we can split a vertex into a list of given vertices using the bind function:

splitVertex :: (Eq (Vertex g), Graph g) => Vertex g
    -> [Vertex g] -> GraphMonad (Vertex g) -> g
splitVertex v vs g = bind g $
    \u -> if u == v then vertices vs else vertex u

λ> adjacencyList $ splitVertex 1 [0, 1] $ 1 * (2 + 3)
[(0,[2,3]),(1,[2,3]),(2,[]),(3,[])]

Here vertex 1 is split into a pair of vertices {0, 1} that have the same connectivity.
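The bind-based functions from this section can be collected into one self-contained sketch and run against a concrete instance. The Graph class is again a local stand-in, Relation and example are hypothetical names, and disconnect is the first (pointful) version from above.

```haskell
{-# LANGUAGE FlexibleContexts, RankNTypes, TypeFamilies #-}
import           Data.Set (Set)
import qualified Data.Set as Set

-- Local stand-in for the Graph type class used throughout the post.
class Graph g where
    type Vertex g
    empty   :: g
    vertex  :: Vertex g -> g
    overlay :: g -> g -> g
    connect :: g -> g -> g

-- Hypothetical concrete instance: vertices and edges stored as sets.
data Relation a = Relation { vertexSet :: Set a, edgeSet :: Set (a, a) }
    deriving (Eq, Show)

instance Ord a => Graph (Relation a) where
    type Vertex (Relation a) = a
    empty       = Relation Set.empty Set.empty
    vertex x    = Relation (Set.singleton x) Set.empty
    overlay x y = Relation (vertexSet x `Set.union` vertexSet y)
                           (edgeSet x `Set.union` edgeSet y)
    connect x y = Relation (vertexSet x `Set.union` vertexSet y)
        (Set.unions [ edgeSet x, edgeSet y, Set.fromList
            [ (a, b) | a <- Set.toList (vertexSet x), b <- Set.toList (vertexSet y) ] ])

newtype GraphMonad a =
    GM { bind :: forall g. Graph g => (a -> g) -> g }

instance Graph (GraphMonad a) where
    type Vertex (GraphMonad a) = a
    empty       = GM (\_ -> empty)
    vertex  x   = GM (\f -> f x)
    overlay x y = GM (\f -> bind x f `overlay` bind y f)
    connect x y = GM (\f -> bind x f `connect` bind y f)

-- Overlay of isolated vertices.
vertices :: Graph g => [Vertex g] -> g
vertices = foldr (overlay . vertex) empty

induce :: Graph g => (Vertex g -> Bool) -> GraphMonad (Vertex g) -> g
induce p g = bind g (\v -> if p v then vertex v else empty)

removeVertex :: (Eq (Vertex g), Graph g) => Vertex g
    -> GraphMonad (Vertex g) -> g
removeVertex v = induce (/= v)

disconnect :: (Eq (Vertex g), Graph g) => Vertex g -> Vertex g
    -> GraphMonad (Vertex g) -> g
disconnect u v g = removeVertex u g `overlay` removeVertex v g

splitVertex :: (Eq (Vertex g), Graph g) => Vertex g
    -> [Vertex g] -> GraphMonad (Vertex g) -> g
splitVertex v vs g = bind g (\u -> if u == v then vertices vs else vertex u)

-- The polymorphic expression 1 * (2 + 3) without the Num sugar.
example :: (Graph g, Vertex g ~ Int) => g
example = vertex 1 `connect` (vertex 2 `overlay` vertex 3)
```

Evaluating removeVertex 2 example, disconnect 1 2 example and splitVertex 1 [0, 1] example at type Relation Int gives the same vertices and edges as the adjacency lists shown in the GHCi sessions of this section.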

Constructing De Bruijn graphs

To demonstrate that we can construct reasonably sophisticated graphs using the presented toolkit, let’s try it on De Bruijn graphs, an interesting combinatorial object that frequently shows up in computer engineering and bioinformatics. My implementation is fairly short, but requires some explanation:

deBruijn :: (Graph g, Vertex g ~ [a]) => Int -> [a] -> g
deBruijn len alphabet = bind skeleton expand
  where
    overlaps = mapM (const alphabet) [2..len]
    skeleton = fromEdgeList [       (Left s, Right s)  | s <- overlaps ]
    expand v = vertices     [ either ([a]++) (++[a]) v | a <- alphabet ]

The function builds a De Bruijn graph of dimension len from symbols of the given alphabet. The vertices of the graph are all possible words of length len containing symbols of the alphabet, and two words are connected x → y whenever x and y match after we remove the first symbol of x and the last symbol of y (equivalently, when x = az and y = zb for some symbols a and b). An example of a 3-dimensional De Bruijn graph on the alphabet {0, 1} is shown in the diagram below (right).

Here are all the ingredients of the solution:

overlaps contains all possible words of length len-1 that correspond to overlaps of connected vertices.
skeleton is a graph with one edge per overlap, with Left and Right vertices acting as temporary placeholders (see the diagram).
We replace a vertex Left s with a subgraph of two vertices {0s, 1s}, i.e. the vertices whose suffix is s. Symmetrically, Right s is replaced by a subgraph of two vertices {s0, s1}. This is captured by the function expand.
The result is obtained by computing bind skeleton expand, as illustrated above.

…and this works as expected:

λ> edgeList $ deBruijn 3 "01"
[("000","000"),("000","001"),("001","010"),("001","011")
,("010","100"),("010","101"),("011","110"),("011","111")
,("100","000"),("100","001"),("101","010"),("101","011")
,("110","100"),("110","101"),("111","110"),("111","111")]
λ> all (\(x,y) -> drop 1 x == dropEnd 1 y) $ edgeList $ deBruijn 9 "abc"
True
λ> Set.size $ vertexSet $ deBruijn 9 "abc"
19683 -- i.e. 3^9
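For comparison, here is what the construction boils down to if we work with plain lists of edges instead of polymorphic graphs. This is only a sketch for cross-checking, not the post's implementation; wordsOfLength and deBruijnEdges are hypothetical names. For every overlap s, each word a:s (the expansion of Left s) is connected to each word s ++ [b] (the expansion of Right s).

```haskell
-- All words of the given length over the alphabet, in lexicographic order
-- of alphabet positions.
wordsOfLength :: Int -> [a] -> [[a]]
wordsOfLength n alphabet = mapM (const alphabet) [1..n]

-- Edge list of the De Bruijn graph of dimension len: the word a:s is
-- connected to the word s ++ [b] for every overlap s of length len - 1.
deBruijnEdges :: Int -> [a] -> [([a], [a])]
deBruijnEdges len alphabet =
    [ (a : s, s ++ [b])
    | s <- wordsOfLength (len - 1) alphabet
    , a <- alphabet
    , b <- alphabet ]
```

Each of the k^(len-1) overlaps contributes k × k edges, where k is the size of the alphabet, so deBruijnEdges 3 "01" has 4 × 4 = 16 edges, the same ones as listed above (possibly in a different order).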

That’s all for now! I hope I’ve convinced you that you don’t necessarily need to operate on concrete data structures when constructing graphs. You can write both efficient and reusable code by using Haskell’s types for interpreting polymorphic graph expressions as maps, binds and other familiar transforms. Give me any old graph, and I’ll write you a new type to construct it!

P.S.: The algebra of graphs is available in the alga library.

Update: Arseniy Alekseyev found a nice solution for removing self-loops. Let R_v denote the operation of removing a vertex v, and R_v→v denote the operation of removing a self-loop v → v. Then the latter can be defined as follows:

R_v→v(ε) = ε
R_v→v(x) = x
R_v→v(x + y) = R_v→v(x) + R_v→v(y)
R_v→v(x → y) = R_v(x) → R_v→v(y) + R_v→v(x) → R_v(y)

It’s not a homomorphism, but it seems to work. Cool! Furthermore, we can generalise the above and implement the operation R_u→v that removes an edge u → v:

R_u→v(ε) = ε
R_u→v(x) = x
R_u→v(x + y) = R_u→v(x) + R_u→v(y)
R_u→v(x → y) = R_u(x) → R_u→v(y) + R_u→v(x) → R_v(y)

Note that the size of the expression can substantially increase as a result of applying such operations. Given an expression g of size |g|, what is the worst possible size of the result |R_u→v(g)|?
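To experiment with these equations, one can transcribe them onto a small expression data type. The sketch below is my own encoding and is not taken from the alga library: the type G, the helpers vertexSet, edgeSet and size, and the value example are all hypothetical. Recall that → binds tighter than +, so the last equation is an overlay of two connects.

```haskell
import           Data.Set (Set)
import qualified Data.Set as Set

-- A deep embedding of graph expressions (hypothetical, for illustration).
data G a = Empty | Vertex a | Overlay (G a) (G a) | Connect (G a) (G a)

-- R_v: replace every occurrence of the vertex v with the empty graph.
removeVertex :: Eq a => a -> G a -> G a
removeVertex v = go
  where
    go Empty         = Empty
    go (Vertex x)    = if x == v then Empty else Vertex x
    go (Overlay x y) = Overlay (go x) (go y)
    go (Connect x y) = Connect (go x) (go y)

-- R_u→v: a direct transcription of the four equations above.
removeEdge :: Eq a => a -> a -> G a -> G a
removeEdge u v = go
  where
    go (Overlay x y) = Overlay (go x) (go y)
    go (Connect x y) = Overlay (Connect (removeVertex u x) (go y))
                               (Connect (go x) (removeVertex v y))
    go g             = g -- empty graph or a single vertex

vertexSet :: Ord a => G a -> Set a
vertexSet Empty         = Set.empty
vertexSet (Vertex x)    = Set.singleton x
vertexSet (Overlay x y) = vertexSet x `Set.union` vertexSet y
vertexSet (Connect x y) = vertexSet x `Set.union` vertexSet y

edgeSet :: Ord a => G a -> Set (a, a)
edgeSet (Overlay x y) = edgeSet x `Set.union` edgeSet y
edgeSet (Connect x y) = Set.unions [ edgeSet x, edgeSet y, Set.fromList
    [ (a, b) | a <- Set.toList (vertexSet x), b <- Set.toList (vertexSet y) ] ]
edgeSet _             = Set.empty

-- Number of leaves of an expression, counting Empty as a leaf.
size :: G a -> Int
size (Overlay x y) = size x + size y
size (Connect x y) = size x + size y
size _             = 1

-- The expression (1 → 2) → 3.
example :: G Int
example = Connect (Connect (Vertex 1) (Vertex 2)) (Vertex 3)
```

For instance, removeEdge 1 2 example keeps all three vertices and the edges 1 → 3 and 2 → 3, while the expression grows from 3 to 8 leaves, which gives a feeling for the question above.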

Graphs in disguise: from todo lists to build systems

Andrey — Thu, 22 Dec 2016 13:33:52 +0000

In this blog post we will look at an example of using the algebra of graphs for manipulating sequences of items, which at first sight might not look like graphs, but are actually dependency graphs in disguise. We will develop a tiny DSL for composing todo lists on top of the alga library and will show how it can be used for planning a holiday and, on a more serious note, for writing software build systems. This blog post will partially answer the question about possible applications of the algebra of graphs that was asked in this reddit discussion.

Update: This series of blog posts was published as a functional pearl at the Haskell Symposium 2017.

Todo lists and the algebra of graphs

Todo lists are sequences of items that, as one would expect, need to be done. The order of items in the sequence matters, because some items may depend on others. The simplest todo list is the empty one. Then we have todo lists containing a single item, from which we can build up longer lists using the same operators we introduced to construct graphs.

An item will correspond to a graph vertex. We’ll use the OverloadedStrings GHC extension, so we can create todo items without explicitly wrapping them into a vertex. This will also allow everyone to choose their favourite representation for strings; plain old String is fine for our examples:

{-# LANGUAGE OverloadedStrings #-}
import Data.String

import Algebra.Graph
import Algebra.Graph.Util

instance (IsString a, Ord a) => IsString (Todo a) where
    fromString = vertex . fromString

shopping :: Todo String
shopping = "presents"

Here Todo is a Graph instance whose implementation will be revealed later. One can combine several items into a single todo list using the overlay operator + of the algebra:

shopping :: Todo String
shopping = "presents" + "coat" + "scarf"

The semantics of a todo list is just a list of items in the order they can be completed, or Nothing if there is no possible completion order that satisfies all dependency constraints between different items. We can extract the semantics using the todo function with the following signature:

todo :: Ord a => Todo a -> Maybe [a]

The overlay operator is commutative, therefore reordering items in the shopping list does not change the semantics:

λ> todo shopping
Just ["coat","presents","scarf"]
λ> todo $ "coat" + "scarf" + "presents"
Just ["coat","presents","scarf"]
λ> shopping == "coat" + "scarf" + "presents"
True

As you can see, the items are simply ordered alphabetically as there are no dependencies between them. Let’s add some! To do that we’ll use the connect operator → from the algebra. When two todo lists are combined with →, the meaning is that all items in the first list must be completed before we can start completing items from the second todo list. I’m currently planning a holiday trip to visit friends and therefore will need to pack all stuff that I buy before travelling:

holiday :: Todo String
holiday = shopping * "pack" * "travel"

λ> todo holiday
Just ["coat","presents","scarf","pack","travel"]
λ> shopping == holiday
False
λ> shopping `isSubgraphOf` holiday
True

Items "pack" and "travel" have been appended to the end of the list even though "pack" comes before "presents" alphabetically, and rightly so: we can’t pack presents before we buy them!

Now let’s add a new dependency constraint to an existing todo list. For example, I might want to buy a new scarf before a coat, because I would like to make sure the coat looks good with the new scarf:

λ> todo $ holiday + "scarf" * "coat"
Just ["presents","scarf","coat","pack","travel"]

Look how the resulting list changed: "coat" has been moved after "scarf" to meet the new constraint! Of course, it’s not too difficult to add contradictory constraints, making the todo list impossible to schedule:

λ> todo $ holiday + "travel" * "presents"
Nothing

There is nothing we can do to complete all items if there is a circular dependency in our todo list: "presents" → "pack" → "travel" → "presents".

It may sometimes be useful to have some notion of item priorities to schedule some items as soon or as late as possible. Let me illustrate this with an example, by modifying our todo lists as follows:

shopping :: Todo String
shopping = "presents" + "coat" + "phone wife" * "scarf"

holiday :: Todo String
holiday = shopping * "pack" * "travel" + "scarf" * "coat"

As you see, I now would like to phone my wife before buying the scarf to make sure it also matches the colour of one of her scarves (she has a dozen of them and I can’t possibly remember all the colours). Let’s see how this changes the resulting order:

λ> todo holiday
Just ["phone wife","presents","scarf","coat","pack","travel"]

This works but is a little unsatisfactory: ideally I’d like to phone my wife right before buying the scarf. To achieve that I can amend the shopping list by changing the priority of the item "phone wife":

-- Lower the priority of items in a given todo list
low :: Todo a -> Todo a

shopping :: Todo String
shopping = "presents" + "coat" + low "phone wife" * "scarf"

λ> todo holiday
Just ["presents","phone wife","scarf","coat","pack","travel"]

Aha, this is better: "phone wife" got scheduled as late as possible, and is now right next to "scarf", as desired. But wait — if my wife finds out that I gave a low priority to my phone calls to her, I’ll get into trouble! I need to find a better way to achieve the same effect. In essence, we would like to have a variant of the connect operator that pulls the arguments together as close as possible during scheduling (and, alternatively, we may also want to repel arguments as far from each other as possible).

-- Pull the arguments together as close as possible
(~*~) :: Ord a => Todo a -> Todo a -> Todo a

-- Repel the arguments as far as possible
(>*<) :: Ord a => Todo a -> Todo a -> Todo a

shopping :: Todo String
shopping = "presents" + "coat" + "phone wife" ~*~ "scarf"

This looks better and leads to the same result as the code above.

The final holiday expression can be visualised as follows:

Here the overlay operator + is shown simply by placing its arguments next to each other, the connect operators are shown by arrows, and the arrow with a small triangle stands for the tight connect operator ~*~. By following the laws of the algebra, we can flatten the graph expression into a dependency graph shown below:

The graph is then linearised into a list of items by the todo function.

So, here you go: you can plan your holiday (or anything else) in Haskell using the alga library!

Constructing command lines in build systems

The above reminds me of build systems that construct command lines for executing various external programs, such as compilers, linkers, etc. A command line is just a list of strings, that typically include the path to the program that is being executed, paths to source files, and various configuration flags. Some of these strings may have order constraints between them, quite similar to todo lists. Let’s see if we can use our tiny DSL for todo lists for describing command lines.

Here is a simple command line to compile "src.c" with the GCC compiler:

cmdLine1 :: Todo String
cmdLine1 = "gcc" * ("-c" ~*~ "src.c" + "-o" ~*~ "src.o")

λ> todo cmdLine1
Just ["gcc","-c","src.c","-o","src.o"]

Build systems are regularly refactored, and it is useful to track changes in a build system to automatically rebuild affected files if need be (for example, in the new GHC build system Hadrian we track changes in command lines and this helps a lot in its development). Some changes do not change the semantics of a build system and can therefore be safely ignored. As an example, one can rewrite cmdLine1 defined above by swapping the source and object file parts of the command line:

cmdLine2 :: Todo String
cmdLine2 = "gcc" * ("-o" ~*~ "src.o" + "-c" ~*~ "src.c")

λ> cmdLine1 == cmdLine2
True
λ> todo cmdLine2
Just ["gcc","-c","src.c","-o","src.o"]

As you can see, the above change has no effect, as we would expect from the commutativity of +. Replacing ~*~ with the usual connect operator on the other hand sometimes leads to changes in the semantics:

cmdLine3 :: Todo String
cmdLine3 = "gcc" * ("-c" * "src.c" + "-o" * "src.o")

λ> cmdLine1 == cmdLine3
False
λ> todo cmdLine3
Just ["gcc","-o","-c","src.c","src.o"]

The resulting sequence is correct from the point of view of a dependency graph, but is not a valid command line: the flag pairs got pushed apart. The change in semantics is recognised by the algebra and a rerun of the build system should reveal the error.

As a final exercise, let’s write a function that transforms command lines:

optimise :: Int -> Todo String -> Todo String
optimise level = (* flag)
  where
    flag = vertex $ "-O" ++ show level

λ> todo $ optimise 2 cmdLine1
Just ["gcc","-c","src.c","-o","src.o","-O2"]

As you can see, optimise 2 appends the optimisation flag "-O2" at the end of the command line, i.e. optimise 2 == (* "-O2").

Command lines in real build systems contain many conditional flags that are included only when compiling certain files on certain platforms, etc. You can read about how we deal with conditional flags in Hadrian here.

Under the hood

Scheduling a list of items subject to dependency constraints is a well-known problem, which is solved by a topological sort of the underlying dependency graph. GHC’s containers library has an implementation of topological sort in the Data.Graph module. It operates on adjacency lists and to reuse it we can define the following Graph instance:

newtype AdjacencyMap a = AM { adjacencyMap :: Map a (Set a) }
    deriving (Eq, Show)

instance Ord a => Graph (AdjacencyMap a) where
    type Vertex (AdjacencyMap a) = a
    empty       = AM $ Map.empty
    vertex  x   = AM $ Map.singleton x Set.empty
    overlay x y = AM $ Map.unionWith Set.union
        (adjacencyMap x) (adjacencyMap y)
    connect x y = AM $ Map.unionsWith Set.union
        [ adjacencyMap x, adjacencyMap y
        , fromSet (const . keysSet $ adjacencyMap y)
                  (keysSet $ adjacencyMap x) ]

adjacencyList :: AdjacencyMap a -> [(a, [a])]
adjacencyList = map (fmap Set.toAscList) . Map.toAscList . adjacencyMap

λ> adjacencyList $ clique [1..4]
[(1,[2,3,4]),(2,[3,4]),(3,[4]),(4,[])]

Todo is built on top of the TopSort graph instance, which is just a newtype wrapper around AdjacencyMap based representation of graphs:

newtype TopSort a = TS { fromTopSort :: AdjacencyMap a }
    deriving (Show, Num)

instance Ord a => Eq (TopSort a) where
    x == y = topSort x == topSort y

The custom Eq instance makes sure that graphs are considered equal if their topological sorts coincide. In particular all cyclic graphs fall into the same equivalence class corresponding to topSort g == Nothing:

λ> path [1..4] == (clique [1..4] :: TopSort Int)
True
λ> topSort $ clique [1..4]
Just [1,2,3,4]
λ> topSort $ path [1..4]
Just [1,2,3,4]
λ> topSort $ transpose $ clique [1..4]
Just [4,3,2,1]
λ> topSort $ circuit [1..4]
Nothing

Function topSort simply calls Data.Graph.topSort performing the necessary plumbing, which is not particularly interesting.
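For the curious, the plumbing could look roughly like the sketch below. It is not the code behind Todo: it works on a plain Map of adjacency sets, and it uses stronglyConnComp instead of Data.Graph.topSort, because the strongly connected components make cyclic dependencies easy to detect. The helper fromEdges is a hypothetical name, and the relative order of independent items may differ from the outputs shown in this post.

```haskell
import           Data.Graph (SCC (..), flattenSCCs, stronglyConnComp)
import           Data.Map   (Map)
import qualified Data.Map   as Map
import           Data.Set   (Set)
import qualified Data.Set   as Set

-- Build an adjacency map from a list of edges (hypothetical helper).
-- Every vertex that occurs in an edge becomes a key of the map.
fromEdges :: Ord a => [(a, a)] -> Map a (Set a)
fromEdges es = Map.fromListWith Set.union $
    [ (x, Set.singleton y) | (x, y) <- es ] ++ [ (y, Set.empty) | (_, y) <- es ]

-- Topological sort of an adjacency map: Nothing if the graph has a cycle.
topSort :: Ord a => Map a (Set a) -> Maybe [a]
topSort m
    | all isAcyclic components = Just (reverse (flattenSCCs components))
    | otherwise                = Nothing
  where
    -- stronglyConnComp returns components in reverse topological order,
    -- hence the reverse above. Self-loops are reported as cyclic components.
    components = stronglyConnComp
        [ (x, x, Set.toList ys) | (x, ys) <- Map.toList m ]
    isAcyclic (AcyclicSCC _) = True
    isAcyclic _              = False
```

For example, the path 1 → 2 → 3 → 4 has exactly one valid ordering, so topSort (fromEdges [(1,2),(2,3),(3,4)]) is Just [1,2,3,4], and adding the edge (4,1) turns the result into Nothing.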

The current implementation has two issues. First, the topological sort is not always lexicographically first, as evidenced by cmdLine3 above, where "-o" precedes "-c" in the final ordering. Second, topSort does not satisfy the closure axiom defined in the previous blog post. One possible approach to fix this is to compute the transitive reduction of the underlying dependency graph before the topological sort.

Have a great holiday everyone!

Graphs à la carte

Andrey — Tue, 13 Dec 2016 01:29:10 +0000

I received an overwhelming response to the introductory blog post about the algebra of graphs; thank you all for your remarks, questions and suggestions! In the second part of the series I will show that the algebra is not restricted only to directed graphs, but can be extended to axiomatically represent undirected graphs, reachability and dependency graphs (i.e. preorders and partial orders), their various combinations, and even hypergraphs.

Update: This series of blog posts was published as a functional pearl at the Haskell Symposium 2017.

Why algebra?

Before we continue, I’d like to note that any data structure for representing graphs (e.g. an edgelist, matrix-based representations, inductive graphs from the fgl library, GHC’s standard Data.Graph, etc.) can satisfy the axioms of the algebra with appropriate definitions of empty, vertex, overlay and connect, and I do not intend to compare these implementations against each other. I’m more interested in implementation-independent (polymorphic) functions that we can write and reuse, and in proving properties of these functions using the laws of the algebra. That’s why I think the algebra is worth studying.

As a warm-up exercise, let’s look at a few more examples of such polymorphic graph functions. One of the threads in the reddit discussion was about the path graph P₄: i.e. the graph with 4 vertices connected in a chain. Here is a function that can construct such path graphs on a given list of vertices:

path :: Graph g => [Vertex g] -> g
path []  = empty
path [x] = vertex x
path xs  = fromEdgeList $ zip xs (tail xs)

p4 :: (Graph g, Vertex g ~ Char) => g
p4 = path ['a', 'b', 'c', 'd']

Note that graph p4 is also polymorphic: we haven’t committed to any particular data representation, but we know that the vertices of p4 have type Char.

If we connect the last vertex of a path to the first one, we get a circuit graph, or a cycle. Let’s express this in terms of the path function:

circuit :: Graph g => [Vertex g] -> g
circuit []     = empty
circuit (x:xs) = path $ [x] ++ xs ++ [x]

pentagon :: (Graph g, Vertex g ~ Int) => g
pentagon = circuit [1..5]

From the definition we expect that a path is a subgraph of the corresponding circuit. Can we express this property in the algebra? Yes! It’s fairly standard to define a ≤ b as a + b = b for idempotent + and it turns out that this definition corresponds to the subgraph relation on graphs:

isSubgraphOf :: (Graph g, Eq g) => g -> g -> Bool
isSubgraphOf x y = overlay x y == y

We can use QuickCheck to test that our implementation satisfies the property:

λ> quickCheck $ \xs -> path xs `isSubgraphOf` (circuit xs :: Basic Int)
+++ OK, passed 100 tests.

QuickCheck can only test the property with respect to a particular instance (in this case we chose Basic Int), but using the algebra we can prove that it holds for all law-abiding instances of Graph (I leave this as an exercise for the reader).
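If you don't have the Basic data type at hand, the property can also be tried on a small home-made instance. In the sketch below the Graph class is a local stand-in for the one from the introductory post, Relation is a hypothetical instance that stores vertices and edges as sets, and edge and fromEdgeList are defined in the way I assume the post intends.

```haskell
{-# LANGUAGE TypeFamilies #-}
import           Data.Set (Set)
import qualified Data.Set as Set

-- Local stand-in for the Graph type class used throughout the post.
class Graph g where
    type Vertex g
    empty   :: g
    vertex  :: Vertex g -> g
    overlay :: g -> g -> g
    connect :: g -> g -> g

-- Hypothetical concrete instance: vertices and edges stored as sets.
data Relation a = Relation { vertexSet :: Set a, edgeSet :: Set (a, a) }
    deriving (Eq, Show)

instance Ord a => Graph (Relation a) where
    type Vertex (Relation a) = a
    empty       = Relation Set.empty Set.empty
    vertex x    = Relation (Set.singleton x) Set.empty
    overlay x y = Relation (vertexSet x `Set.union` vertexSet y)
                           (edgeSet x `Set.union` edgeSet y)
    connect x y = Relation (vertexSet x `Set.union` vertexSet y)
        (Set.unions [ edgeSet x, edgeSet y, Set.fromList
            [ (a, b) | a <- Set.toList (vertexSet x), b <- Set.toList (vertexSet y) ] ])

edge :: Graph g => Vertex g -> Vertex g -> g
edge x y = connect (vertex x) (vertex y)

fromEdgeList :: Graph g => [(Vertex g, Vertex g)] -> g
fromEdgeList = foldr (overlay . uncurry edge) empty

path :: Graph g => [Vertex g] -> g
path []  = empty
path [x] = vertex x
path xs  = fromEdgeList (zip xs (drop 1 xs))

circuit :: Graph g => [Vertex g] -> g
circuit []     = empty
circuit (x:xs) = path ([x] ++ xs ++ [x])

-- x is a subgraph of y if overlaying x on top of y adds nothing new.
isSubgraphOf :: (Graph g, Eq g) => g -> g -> Bool
isSubgraphOf x y = overlay x y == y
```

With these definitions, path [1..5] `isSubgraphOf` (circuit [1..5] :: Relation Int) holds, whereas the converse does not, because the path lacks the closing edge 5 → 1.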

As a final example, we will implement the Cartesian graph product, usually denoted as G □ H, where the vertex set is V_G × V_H and vertex (x, y) is connected to vertex (x’, y’) if either x = x’ and y is connected to y’ in H, or y = y’ and x is connected to x’ in G:

box :: (Functor f, Foldable f, Graph (f (a,b))) => f a -> f b -> f (a,b)
box x y = foldr overlay empty $ xs ++ ys
  where
    xs = map (\b -> fmap (,b) x) $ toList y
    ys = map (\a -> fmap (a,) y) $ toList x

The Cartesian product G □ H is assembled by creating |V_H| copies of graph G and overlaying them with |V_G| copies of graph H. We get access to the list of graph vertices using toList from the Foldable instance and turn vertices of original graphs into pairs of vertices by fmap from the Functor instance.
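To make the definition concrete, here is a minimal sketch that computes the product directly on vertex and edge sets instead of going through Functor and Foldable. The names SetGraph, boxSets, pentagonSets and p4Sets are hypothetical and exist only for this illustration; they are not part of the library.

```haskell
import qualified Data.Set as Set
import Data.Set (Set)

-- Hypothetical set-based graph: (vertices, edges).
type SetGraph a = (Set a, Set (a, a))

-- Cartesian product written straight from the definition.
boxSets :: (Ord a, Ord b) => SetGraph a -> SetGraph b -> SetGraph (a, b)
boxSets (vg, eg) (vh, eh) = (vs, copiesOfH `Set.union` copiesOfG)
  where
    vs = Set.fromList [ (x, y) | x <- Set.toList vg, y <- Set.toList vh ]
    -- x = x' and y is connected to y' in H: one copy of H per vertex of G
    copiesOfH = Set.fromList
        [ ((x, y), (x, y')) | x <- Set.toList vg, (y, y') <- Set.toList eh ]
    -- y = y' and x is connected to x' in G: one copy of G per vertex of H
    copiesOfG = Set.fromList
        [ ((x, y), (x', y)) | (x, x') <- Set.toList eg, y <- Set.toList vh ]

-- The pentagon and P4 from the examples above, spelled out as explicit sets.
pentagonSets :: SetGraph Int
pentagonSets = (Set.fromList [1..5], Set.fromList [(1,2),(2,3),(3,4),(4,5),(5,1)])

p4Sets :: SetGraph Char
p4Sets = (Set.fromList "abcd", Set.fromList [('a','b'),('b','c'),('c','d')])
```

For the pentagon and P4 this gives 5 × 4 = 20 vertices and 4 × 5 + 5 × 3 = 35 edges: four copies of the pentagon plus five copies of the path.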

As you can see, the code is still implementation-independent, although it requires that the graph data type is a Functor and a Foldable. Just like lists, trees and other containers, most graph data structures have Functor, Foldable, Applicative and Monad instances (e.g. our Basic data type has them all). Here is how pentagon `box` p4 looks:

(A side note: the type signature of box reminds me of this blog post by Edward Yang and makes me wonder if Functor, Foldable plus idempotent and commutative Monoid together imply Monoidal, as it seems that I only had to use empty and overlay from the Graph type class. This seems odd.)

Undirected graphs

As I hinted in the previous blog post, to switch from directed to undirected graphs it is sufficient to add the axiom of commutativity for the connect operator. For undirected graphs it makes sense to denote connect by ↔ or —, hence:

x ↔ y = y ↔ x.

Curiously, with the introduction of this axiom the associativity of ↔ follows from the (left-associated version of) decomposition axiom and commutativity of +:

(x ↔ y) ↔ z	=	x ↔ y + x ↔ z + y ↔ z	(left decomposition)
	=	y ↔ z + y ↔ x + z ↔ x	(commutativity of + and ↔)
	=	(y ↔ z) ↔ x	(left decomposition)
	=	x ↔ (y ↔ z)	(commutativity of ↔)

Commutativity of the connect operator forces graph expressions that differ only in the direction of edges into the same equivalence class. One can implement this by the symmetric closure of the underlying connectivity relation:

newtype Undirected a = Undirected { fromUndirected :: Basic a }
    deriving (Arbitrary, Functor, Foldable, Num, Show)

instance Ord a => Eq (Undirected a) where
    x == y = toSymmetric x == toSymmetric y

toSymmetric :: Ord a => Undirected a -> Relation a
toSymmetric = symmetricClosure . toRelation . fromUndirected

As you can see, we simply wrap our Basic implementation in a newtype with a custom Eq instance that takes care of the commutativity of connect. We know that the resulting Undirected datatype satisfies all Graph laws, because we only made some previously different expressions equal but not vice versa.
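The effect of comparing symmetric closures can be seen on bare edge sets. The sketch below is a simplification: symClosure and sameUndirected are hypothetical helpers that work on Set (a, a) rather than on the Relation type, and they ignore isolated vertices.

```haskell
import qualified Data.Set as Set
import Data.Set (Set)

-- Add the reverse of every edge (hypothetical helper on plain edge sets).
symClosure :: Ord a => Set (a, a) -> Set (a, a)
symClosure es = es `Set.union` Set.map (\(x, y) -> (y, x)) es

-- Two edge sets describe the same undirected graph
-- if their symmetric closures coincide.
sameUndirected :: Ord a => Set (a, a) -> Set (a, a) -> Bool
sameUndirected p q = symClosure p == symClosure q
```

With these definitions the edge sets {(1,2)} and {(2,1)} compare equal, whereas {(1,2)} and {(1,3)} do not.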

Partial orders

In many applications graphs satisfy the transitivity property: if vertex x is connected to y, and y is connected to z, then the edge between x and z can be added or removed without changing the semantics of the graph. A common example is dependency graphs. The semantics of such graphs is typically a partial order on the set of vertices. To describe this class of graphs algebraically we can add the following closure axiom:

y ≠ ε ⇒ x → y + y → z + x → z = x → y + y → z

By using the axiom one can always rewrite a graph expression into its transitive closure or, alternatively, into its transitive reduction, hence all graphs that differ only in the existence of some transitive edges are forced into the same equivalence class. Note: the precondition (y ≠ ε) is necessary as otherwise + and → can no longer be distinguished:

x → z = x → ε → z = x → ε + ε → z + x → z = x → ε + ε → z = x + z

It is interesting that + and → have a simple meaning for partial orders: they correspond to parallel and sequential composition of partial orders, respectively. This allows one to algebraically describe concurrent systems — I will dedicate a separate blog post to this topic.

We can implement a PartialOrder instance by wrapping Basic in a newtype and providing a custom equality test via the transitive closure of the underlying relation, just like we did for undirected graphs:

newtype PartialOrder a = PartialOrder { fromPartialOrder :: Basic a }
    deriving (Arbitrary, Functor, Foldable, Num, Show)

instance Ord a => Eq (PartialOrder a) where
    x == y = toTransitive x == toTransitive y

toTransitive :: Ord a => PartialOrder a -> Relation a
toTransitive = transitiveClosure . toRelation . fromPartialOrder

Let’s test that our implementation correctly recognises the fact that path graphs are equivalent to cliques when interpreted as partial orders:

λ> quickCheck $ \xs -> path xs == (clique xs :: PartialOrder Int)
+++ OK, passed 100 tests.

Indeed, if we have a series of n tasks, where each task (apart from task 1) depends on the previous task (as expressed by path [1..n]), then task 1 is transitively a prerequisite for all subsequent tasks, task 2 is a prerequisite for tasks [3..n] etc., which can be expressed by clique [1..n].
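The same check can be done by hand on edge sets. This is only a sketch under simplifying assumptions: transClosure is a naive fixed-point iteration, and transClosure, pathEdges and cliqueEdges are hypothetical helpers that ignore isolated vertices and are not the library's implementation.

```haskell
import qualified Data.Set as Set
import Data.Set (Set)
import Data.List (tails)

-- Naive transitive closure: keep adding (x, z) for every pair of edges
-- (x, y) and (y, z) until nothing changes.
transClosure :: Ord a => Set (a, a) -> Set (a, a)
transClosure es
    | es' == es = es
    | otherwise = transClosure es'
  where
    es' = es `Set.union` Set.fromList
        [ (x, z) | (x, y) <- Set.toList es, (y', z) <- Set.toList es, y == y' ]

-- Edges of path xs and clique xs, written out directly.
pathEdges, cliqueEdges :: Ord a => [a] -> Set (a, a)
pathEdges   xs = Set.fromList (zip xs (drop 1 xs))
cliqueEdges xs = Set.fromList [ (x, y) | (x:ys) <- tails xs, y <- ys ]
```

For example, transClosure (pathEdges [1..5]) and cliqueEdges [1..5] are the same set of ten edges (i, j) with i < j.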

Reflexive graphs

A partial order is reflexive (also called weak) if every element is related to itself. An example of a reflexive partial order is isSubgraphOf as introduced above: indeed, x `isSubgraphOf` x == True for all graphs x. To represent reflexive graphs algebraically we can introduce the following axiom:

vertex x = vertex x → vertex x

This enforces that each vertex has a self-loop. The implementation of the Reflexive data type is very similar to that of Undirected and PartialOrder, so I do not show it here (it is based on the reflexive closure of the underlying relation).
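For completeness, the reflexive closure itself can be sketched in one line on plain sets. The helper reflClosure is a hypothetical name and takes the vertex set and the edge set separately; a real Reflexive wrapper would apply the same idea to the underlying Relation.

```haskell
import qualified Data.Set as Set
import Data.Set (Set)

-- Add a self-loop for every vertex. The first argument is the vertex set,
-- the second the edge set.
reflClosure :: Ord a => Set a -> Set (a, a) -> Set (a, a)
reflClosure vs es = es `Set.union` Set.map (\v -> (v, v)) vs
```

Under this closure a single vertex 1 with no edges and the same vertex with a self-loop both yield the edge set {(1,1)}, which mirrors the axiom above.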

Note: cyclic reflexive partial orders correspond to preorders, for example:

(1 + 2 + 3) → (2 + 3 + 4)

is a preorder with vertices 2 and 3 forming an equivalence class. We can find the strongly-connected components and derive the following condensation:

{1} → {2, 3} → {4}

One way to interpret this preorder as a dependency graph is that tasks 2 and 3 are executed as a step, simultaneously, and that they both depend on task 1.

Mixing graph flavours

We can mix the three new axioms above in various combinations. For example, the algebra of undirected, reflexive and transitively closed graphs describes the laws of equivalence relations. Notably, it is not necessary to keep information about all edges in such graphs and there is an efficient implementation based on the disjoint set data structure. If you are curious about potential applications of such graphs, have a look at this paper where I use them to model switching networks. More precisely, I model families of switching networks; this requires another extension to the algebra: a unary condition operator, which I will cover in a future blog post.
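To illustrate why individual edges need not be stored, here is a minimal sketch of a disjoint set structure kept in a Data.Map. It is deliberately simple, with no path compression or union by rank, and the names Parents, findRoot, addEdge and related are hypothetical.

```haskell
import qualified Data.Map as Map
import Data.Map (Map)

-- Each vertex maps to a parent; a vertex that is absent from the map
-- or maps to itself is the root (representative) of its class.
type Parents a = Map a a

findRoot :: Ord a => Parents a -> a -> a
findRoot ps x = case Map.lookup x ps of
    Just p | p /= x -> findRoot ps p
    _               -> x

-- Record an edge by merging the classes of its endpoints.
addEdge :: Ord a => (a, a) -> Parents a -> Parents a
addEdge (x, y) ps = Map.insert (findRoot ps x) (findRoot ps y) ps

-- Two vertices are related iff they have the same representative.
related :: Ord a => Parents a -> a -> a -> Bool
related ps x y = findRoot ps x == findRoot ps y
```

After adding the edges (1,2), (2,3) and (4,5), vertices 1 and 3 are related even though the edge (1,3) was never stored, while 1 and 4 are not.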

Hypergraphs

This thread in the Hacker News discussion leads me to another twist of the algebra. Let’s replace the decomposition axiom with 3-decomposition:

w → x → y → z = w → x → y + w → x → z + w → y → z + x → y → z

In words, instead of collapsing all expressions to vertices and edges (pairs of vertices), as we did with the 2-decomposition, we now collapse all expressions to vertices, edges and triples (or hyperedges of rank 3). I haven’t yet figured out whether the resulting algebra is particularly useful, but perhaps the reader can provide an insight?

To see the difference between 2- and 3-decomposition more clearly, let’s substitute ε for w in 3-decomposition and simplify:

x → y → z = x → y + x → z + y → z + x → y → z

Looks familiar? It’s almost the 2-decomposition axiom! Yet there is no way to get rid of the term x → y → z on the right side: indeed, a triple is unbreakable in this algebra, and one can only extract the pairs (edges) that are embedded in it. In fact, we can take this further and rewrite the above expression to also expose the embedded vertices:

x → y → z = x + y + z + x → y + x → z + y → z + x → y → z

With 2-decomposition we can achieve something similar:

x → y = x + y + x → y

which I call the absorption theorem. It says that an edge x → y has vertices x and y (its endpoints) embedded in it. This seems intriguing, but I have no idea where it leads; I guess we’ll figure it out together!
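For reference, the absorption theorem follows in a few steps from the 2-decomposition axiom and the fact that ε is the identity of →:

```latex
\begin{align*}
x \to y &= x \to y \to \varepsilon && \text{(identity of $\to$)} \\
        &= x \to y + x \to \varepsilon + y \to \varepsilon && \text{(decomposition)} \\
        &= x \to y + x + y && \text{(identity of $\to$)} \\
        &= x + y + x \to y && \text{(commutativity of $+$)}
\end{align*}
```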

P.S.: All code snippets above are available in the alga repository. Look how nicely we can test the library thanks to the algebraic API!

An algebra of graphs

Andrey — Mon, 05 Dec 2016 16:26:10 +0000

Graph theory is my favourite topic in mathematics and computing science and in this blog post I’ll introduce an algebra of graphs that I’ve been working on for a while. The algebra has become my go-to tool for manipulating graphs and I hope you will find it useful too.

The roots of this work can be traced back to my CONCUR’09 conference submission that was rightly rejected. I subsequently published a few application-specific papers gradually improving my understanding of the algebra. The most comprehensive description can be found in ACM TECS (a preprint is available here). Here I’ll give a general introduction to the simplest version of the algebra of graphs and show how it can be implemented in Haskell.

Update: This series of blog posts was published as a functional pearl at the Haskell Symposium 2017.

Constructing graphs

Let G be a set of graphs whose vertices come from a fixed universe. As an example, we can think of graphs whose vertices are positive integers. A graph g ∈ G can be represented by a pair (V, E) where V is the set of its vertices and E ⊆ V × V is the set of its edges.

The simplest possible graph is the empty graph. I will be denoting it by ε in formulas and by empty in Haskell code. Hence, ε = (∅, ∅) and ε ∈ G.

A graph with a single vertex v will be denoted simply by v. For example, 1 ∈ G is a graph with a single vertex 1, that is ({1}, ∅). In Haskell I’ll use vertex to lift a given vertex to the type of graphs.

To construct bigger graphs from the above primitives I’ll use two binary operators overlay and connect, denoted by + and →, respectively. The overlay + of two graphs is defined as:

(V₁, E₁) + (V₂, E₂) = (V₁ ∪ V₂, E₁ ∪ E₂)

In words, the overlay of two graphs is simply the union of their vertices and edges. The definition of connect → is similar:

(V₁, E₁) → (V₂, E₂) = (V₁ ∪ V₂, E₁ ∪ E₂ ∪ V₁ × V₂)

The difference is that when we connect two graphs, we add an edge from each vertex in the left argument to each vertex in the right argument. Here are a few examples:

1 + 2 is the graph with two isolated vertices 1 and 2.
1 → 2 is the graph with two vertices 1 and 2 and a directed edge from vertex 1 to vertex 2.
1 → (2 + 3) is the graph with three vertices {1, 2, 3} and two directed edges (1, 2) and (1, 3). In Haskell we can write connect 1 (overlay 2 3).
1 → 1 is the graph with vertex 1 and a self-loop (an edge going from a vertex to itself).

The following type class expresses the above in Haskell:

class Graph g where
    type Vertex g
    empty   :: g
    vertex  :: Vertex g -> g
    overlay :: g -> g -> g
    connect :: g -> g -> g

Let’s construct some graphs! A graph that contains a given list of unconnected vertices can be constructed as follows:

vertices :: Graph g => [Vertex g] -> g
vertices = foldr overlay empty . map vertex

And here is a clique (a fully connected graph) on a given list of vertices:

clique :: Graph g => [Vertex g] -> g
clique = foldr connect empty . map vertex

For example, clique [1..] is the infinite clique on all positive integers; we will call such cliques covering the whole universe complete graphs. We can also construct any graph given its edgelist:

fromEdgeList :: Graph g => [(Vertex g, Vertex g)] -> g
fromEdgeList = foldr overlay empty . map edge
  where
    edge (x, y) = vertex x `connect` vertex y

As we will see in the next section, graphs satisfy a few laws and form an algebraic structure that is very similar to a semiring.

Algebraic structure

The structure (G, +, →, ε) introduced above satisfies many usual laws:

(G, +, ε) is an idempotent commutative monoid
(G, →, ε) is a monoid
→ distributes over +, e.g. 1 → (2 + 3) = 1 → 2 + 1 → 3

The following decomposition axiom is the only law that makes the algebra of graphs different from a semiring:

x → y → z = x → y + x → z + y → z

Indeed, in a semiring the two operators have different identity elements, let’s denote them ε₊ and ε_→, respectively. By using the decomposition axiom we can prove that they coincide:

ε₊	=	ε₊ → ε_→ → ε_→	(identity of →)
	=	ε₊ → ε_→ + ε₊ → ε_→ + ε_→ → ε_→	(decomposition)
	=	ε₊ + ε₊ + ε_→	(identity of →)
	=	ε_→	(identity of +)

The idempotence of + also follows from the decomposition axiom.

The following is a minimal set of axioms that describes the graph algebra:

+ is commutative and associative
(G, →, ε) is a monoid, i.e. → is associative and ε is the identity element
→ distributes over +
→ can be decomposed: x → y → z = x → y + x → z + y → z

An exercise for the reader: prove that ε is the identity of + from the minimal set of axioms above. This is not entirely trivial! Also prove that + is idempotent.

Note, to switch from directed to undirected graphs it is sufficient to add the axiom of commutativity of →. We will explore this in a future blog post.

Examples

Let’s look at two basic instances of the Graph type class that satisfy the laws from the previous section. The first one, called Relation, adopts our set-based definitions for the overlay and connect operators and is therefore a free instance (i.e. it doesn’t satisfy any other laws):

data Relation a = Relation { domain :: Set a, relation :: Set (a, a) }
    deriving (Eq, Show)

instance Ord a => Graph (Relation a) where
    type Vertex (Relation a) = a
    empty       = Relation Set.empty Set.empty
    vertex  x   = Relation (Set.singleton x) Set.empty
    overlay x y = Relation (domain   x `Set.union` domain   y)
                           (relation x `Set.union` relation y)
    connect x y = Relation (domain   x `Set.union` domain   y)
                           (relation x `Set.union` relation y
                            `Set.union` Set.fromDistinctAscList
                            [ (a, b) | a <- Set.elems (domain x)
                                     , b <- Set.elems (domain y) ])

Let’s also make Relation an instance of the Num type class so we can use the + and * operators for convenience.

instance (Ord a, Num a) => Num (Relation a) where
    fromInteger = vertex . fromInteger
    (+)         = overlay
    (*)         = connect
    signum      = const empty
    abs         = id
    negate      = id

Note: the Num law abs x * signum x == x is satisfied since x → ε = x. In fact, any Graph instance can be made a Num instance if need be. We can now play with graphs using interactive GHC:

λ> 1 * (2 + 3) :: Relation Int
Relation {domain = fromList [1,2,3], relation = fromList [(1,2),(1,3)]}
λ> 1 * (2 + 3) + 2 * 3 == (clique [1..3] :: Relation Int)
True

Another simple instance can be obtained by embedding all graph constructors into a basic algebraic datatype:

data Basic a = Empty
             | Vertex a
             | Overlay (Basic a) (Basic a)
             | Connect (Basic a) (Basic a)
             deriving Show

instance Graph (Basic a) where
    type Vertex (Basic a) = a
    empty   = Empty
    vertex  = Vertex
    overlay = Overlay
    connect = Connect

We cannot use the derived Eq instance here, because it would clearly violate the laws of the algebra, e.g. Overlay Empty Empty is structurally different from Empty. However, we can implement a custom Eq instance as follows:

instance Ord a => Eq (Basic a) where
    x == y = toRelation x == toRelation y
      where
        toRelation :: Ord a => Basic a -> Relation a
        toRelation = foldBasic

foldBasic :: (Vertex g ~ a, Graph g) => Basic a -> g
foldBasic Empty         = empty
foldBasic (Vertex  x  ) = vertex x
foldBasic (Overlay x y) = overlay (foldBasic x) (foldBasic y)
foldBasic (Connect x y) = connect (foldBasic x) (foldBasic y)

The Basic instance is useful because it allows us to represent densely connected graphs more compactly. For example, clique [1..n] :: Basic Int has linear-size representation in memory, while clique [1..n] :: Relation Int stores each edge separately and therefore takes O(n²) memory. As I will demonstrate in future blog posts, we can exploit compact graph representations for deriving algorithms that are asymptotically faster on dense graphs compared to existing graph algorithms operating on edgelists.
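The difference in size is easy to measure. The following sketch repeats the Basic data type so that it is self-contained; cliqueBasic, size and edgeSet are hypothetical helpers introduced only for this comparison.

```haskell
import qualified Data.Set as Set
import Data.Set (Set)

data Basic a = Empty
             | Vertex a
             | Overlay (Basic a) (Basic a)
             | Connect (Basic a) (Basic a)

-- The clique from above, specialised to Basic.
cliqueBasic :: [a] -> Basic a
cliqueBasic = foldr Connect Empty . map Vertex

-- Number of constructors in an expression.
size :: Basic a -> Int
size Empty         = 1
size (Vertex _)    = 1
size (Overlay x y) = 1 + size x + size y
size (Connect x y) = 1 + size x + size y

-- The explicit edge set, following the set-based definitions
-- of overlay and connect.
edgeSet :: Ord a => Basic a -> Set (a, a)
edgeSet = snd . go
  where
    go Empty         = (Set.empty, Set.empty)
    go (Vertex x)    = (Set.singleton x, Set.empty)
    go (Overlay p q) = (v1 `Set.union` v2, e1 `Set.union` e2)
      where
        (v1, e1) = go p
        (v2, e2) = go q
    go (Connect p q) = (v1 `Set.union` v2, Set.unions [e1, e2, new])
      where
        (v1, e1) = go p
        (v2, e2) = go q
        new      = Set.fromList [ (a, b) | a <- Set.toList v1, b <- Set.toList v2 ]
```

For n = 100 the expression cliqueBasic [1..100] has 2n + 1 = 201 constructors, whereas its edge set has n(n − 1)/2 = 4950 elements.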

Summary

I’ve been using the algebra of graphs presented above for several years in a number of different projects and found it very useful. There are a few flavours of the algebra that I will introduce in follow-up blog posts that allow us to work with undirected graphs, transitively closed graphs (also known as partial orders or dependency graphs), graph families, and their various combinations. All these flavours of the algebra can be obtained by extending the set of axioms.

I am working on a Haskell library alga implementing the algebra of graphs and intend to release it soon. Let me know if you have any suggestions on how to improve the above code snippets.

Towards Cloud Build Systems with Dynamic Dependency Graphs

Andrey — Mon, 24 Oct 2016 00:36:04 +0000

I’ve recently submitted an application to the Royal Society Industry Fellowship scheme, aiming to continue my journey into the world of build systems. Below is the technical section, which I think is worth sharing regardless of the outcome of my application. I’d like to thank Neil Mitchell, Simon Peyton Jones, Simon Marlow and Nick Benton for many enlightening discussions on build systems that helped me understand both the problem and possible approaches to solving it.

Build systems

A build system is a critical component of most software projects, responsible for compiling the source code written in various programming languages and producing executable programs — end software products. Build systems sound simple, but they are not; in large software projects the build system grows from simple beginnings into a huge, complex engineering artefact. Why? Because it evolves every day as new features are continuously being added by software developers; because it builds programs for a huge variety of target hardware configurations (from mobile to cloud); and because it operates under strict correctness and performance requirements, standing on the critical path between the development of a new feature and its deployment into production.

It is known that build systems can take up to 27% of software development effort, and that improvements to build systems rapidly pay off [1]. Despite its importance, this subject is severely under-researched, which prompts major companies, such as Facebook and Google, to invest significant internal resources to make their own bespoke build system frameworks.

Challenges in large-scale build systems

The following two requirements are important for complex large-scale build systems; however, to the best of my knowledge, no existing build system can support both:

Distributed cloud builds: the codebase of a large software project comprises millions of lines of code spread across thousands of files, each built independently by thousands of developers on thousands of machines. A distributed cloud build system speeds up builds dramatically (and saves energy) by transparently sharing build products among developers.
Dynamic dependency graphs: many existing build systems need to know all the dependencies between software components, i.e. the dependency graph, at the start of the build process. This makes it possible to analyse the graph ahead of time, but is fundamentally limited: parts of the dependency graph can often be discovered only during the build process, i.e. the dependency graph is dynamic, not static.

There exist build systems that support cloud builds, e.g. Facebook’s Buck and Google’s Bazel, as well as build systems that support dynamic dependencies, e.g. Neil Mitchell’s Shake (originally developed at Standard Chartered) and Jane Street’s Jenga, but not both.

Proposed approach and research objectives

I will combine two known techniques when developing the new build system: 1) storing build results in a cloud build cache, where a file is identified by a unique hash that combines hashes of the file, its build dependencies and the environment, so that users of the build system can fetch an already built file from the cache, or build it themselves and share it with others by uploading it into the build cache; and 2) storing the last known version of the dependency graph in a persistent graph storage, and updating it whenever a file needs to be rebuilt and the newly discovered set of dependencies differs from the previous record, as implemented in the Shake build system [2].
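The first technique can be sketched in a few lines of Haskell. This is only an illustration under stated assumptions, not the design of any existing build system: hashes are kept abstract as strings, the cache is an in-memory Data.Map rather than a cloud service, and the names Key, Cache and fetchOrBuild are hypothetical.

```haskell
import qualified Data.Map as Map
import Data.Map (Map)

-- Content hashes are kept abstract here; a real system would use
-- a collision-resistant hash.
type Hash = String

-- A cache key combines the hash of the file, the hashes of its build
-- dependencies and the hash of the environment.
data Key = Key { fileHash :: Hash, depHashes :: [Hash], envHash :: Hash }
    deriving (Eq, Ord, Show)

type Cache artefact = Map Key artefact

-- Fetch an already built artefact if the key is known; otherwise use the
-- locally built one and share it by adding it to the cache. Returns the
-- artefact, the updated cache, and whether this was a cache hit.
fetchOrBuild :: Key -> artefact -> Cache artefact -> (artefact, Cache artefact, Bool)
fetchOrBuild key build cache = case Map.lookup key cache of
    Just hit -> (hit, cache, True)
    Nothing  -> (build, Map.insert key build cache, False)
```

Because the key includes the dependency hashes, changing any dependency produces a different key and therefore a cache miss, which is what forces a rebuild.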

My first research objective is to formalise the semantics of cloud build systems with dynamic dependency graphs in the presence of build non-determinism (real compilers are not always deterministic) and partial builds (not all intermediate build results are necessary to store locally). I will then integrate cloud build caches with persistent graph storage and develop a domain-specific language for the new build system using scalable functional programming abstractions, such as polymorphic and monadic dependencies, high-level concurrency primitives, and compositional build configurations [3]. In 2012-2014 I developed an algebra for dependency graphs [4] that I will use for proving the correctness of new build algorithms. The final research objective is to evaluate the developed build system and explore opportunities for its integration into a real large-scale build infrastructure.

References

[1] S. McIntosh et al. “An empirical study of build maintenance effort”, In Proceedings of the International Conference on Software Engineering (ICSE), ACM, 2011.

[2] N. Mitchell. “Shake before building – replacing Make with Haskell”, In Proceedings of the International Conference on Functional Programming (ICFP), ACM, 2012.

[3] A. Mokhov, N. Mitchell, S. Peyton Jones, S. Marlow. “Non-recursive Make Considered Harmful: Build Systems at Scale”, In Proceedings of the International Haskell Symposium, ACM, 2016.

[4] A. Mokhov, V. Khomenko. “Algebra of Parameterised Graphs”, ACM Transactions on Embedded Computing Systems, 13(4s), p.143, 2014.