First, let’s recap the classic solution based on the depth-first search. I’ll use my favourite graph library Alga, so my examples will be in Haskell. Below I create an example undirected graph and compute the number of connected components by counting trees in the depth-first search forest. The graph and the forest are shown in the figure; the edges that belong to the forest are directed to illustrate the order of graph traversal.
λ> import Algebra.Graph.Undirected
λ> example = edges [ (1,6), (2,6), (3,7), ... ]
λ> length $ dfsForest $ fromUndirected example
3
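If you don't have Alga at hand, the same idea can be sketched with Data.Graph from the containers package. This is only an illustration: the small edge list and the helper name componentCount are made up for the example, and since Data.Graph works with directed graphs, every edge is added in both directions.

```haskell
import Data.Graph (buildG, dff)

-- Hypothetical example graph (not the one from the figure)
undirectedEdges :: [(Int, Int)]
undirectedEdges = [(1, 2), (2, 3), (4, 5)]

-- Count connected components as the number of trees in the DFS forest.
-- The first argument gives the range of vertices, so isolated vertices count too.
componentCount :: (Int, Int) -> [(Int, Int)] -> Int
componentCount bounds es = length (dff graph)
  where
    -- Data.Graph is directed, so we add each edge in both directions
    graph = buildG bounds (es ++ [ (y, x) | (x, y) <- es ])

main :: IO ()
main = print (componentCount (1, 6) undirectedEdges)
```

Here vertices 1, 2 and 3 form one component, 4 and 5 another, and 6 is isolated, so the program prints 3.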
This approach is very simple and you should definitely use it, provided that the linear complexity O(n + m) is fast enough for your application. But what if you need to go faster? Some time ago my colleagues and I wrote a paper where we showed how to take advantage of concurrency by implementing the breadth-first search on an FPGA, resulting in better time complexity O(d) where d is the diameter of the graph. This was an excellent result for our application where graphs had a small diameter, but in general, the diameter can be as large as n. So can we do better?
Yes! Tarjan’s paper presents several faster algorithms. It also describes the algorithms in a very nice compositional manner:
To play with and understand these algorithms, let’s translate the above definitions to Haskell, trying to preserve their clarity and conciseness while also being explicit about details. Jumping ahead a little, here is what Algorithm P will look like in the end:
algorithmP = repeat (parentConnect >>> update >>> shortcut)
Pretty close! Now, let’s define the primitives parentConnect etc. in terms of the underlying computational model where a lot of tiny processing threads communicate via short messages, in rounds. In a round, threads concurrently receive some messages, then update their local states, and possibly send out new messages for the next round. We can capture the local view of a computation round by the following type:
type Local s t i o = s -> t -> [i] -> (s, [(t, o)])
Here s is the type of thread states, t is the type of thread identifiers, and i and o are the types of incoming and outgoing messages in a round of computation. In words, given the current state of a thread, its identifier and a list of incoming messages, a Local function returns an updated state and a list of outgoing messages, each tagged with a target thread identifier; these messages will be delivered in the next computation round.
We can upload Local functions to our tiny processors, say on an FPGA, inject input messages into the communication network, and extract the outputs after a computation round, i.e. when all threads complete the computation. This requires specialised hardware and non-trivial setup, so let’s find a way to simulate such computations on a big sequential machine that has enough memory to have the global view of a round:
type Global s t i o = (Map t s, [(t, i)]) -> (Map t s, [(t, o)])
The Global function takes a map from threads to their current states and a list of all input messages in the network and returns the resulting global state: a map of new thread states and a list of newly generated output messages. Note that such global computations can be composed using function composition and form a category; the operator >>> in the above code snippet for Algorithm P is just the left-to-right composition defined in the standard module Control.Category.
Converting from a Local to a Global view is relatively straightforward:
round :: Ord t => Local s t i o -> Global s t i o
round local (states, messages) = collect (Map.mapWithKey update states)
  where
    deliveries = Map.fromAscList (groupSort messages)
    update t s = local s t (Map.findWithDefault [] t deliveries)
    collect    = runWriter . traverse writer
We first find all the deliveries, i.e. lists of incoming messages that should be delivered to each thread, then execute the local update functions of each thread, sequentially (note that this is OK since threads interact only between rounds), and finally, collect outgoing messages of all threads.
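To see a round in action, here is a small self-contained sketch. It is not the code from the post: I assume a variant of round that uses only Data.Map (instead of groupSort and Writer), and the toy thread accumulate is made up for the example.

```haskell
import Prelude hiding (round)
import qualified Data.Map as Map
import Data.Map (Map)

type Local  s t i o = s -> t -> [i] -> (s, [(t, o)])
type Global s t i o = (Map t s, [(t, i)]) -> (Map t s, [(t, o)])

-- A variant of round written with Data.Map only (an assumption of this sketch)
round :: Ord t => Local s t i o -> Global s t i o
round local (states, messages) =
    (Map.map fst results, concatMap snd (Map.elems results))
  where
    -- Group messages by target thread, preserving their order
    deliveries = Map.fromListWith (flip (++)) [ (t, [m]) | (t, m) <- messages ]
    results    = Map.mapWithKey step states
    step t s   = local s t (Map.findWithDefault [] t deliveries)

-- Hypothetical toy thread: adds incoming numbers to its state and, if it
-- received anything, forwards the new total to the next thread
accumulate :: Local Int Int Int Int
accumulate s t inbox = (s', [ (t + 1, s') | not (null inbox) ])
  where
    s' = s + sum inbox

main :: IO ()
main = print $ round accumulate
    (Map.fromList [(0, 10), (1, 20)], [(0, 1), (0, 2), (1, 5)])
```

Thread 0 receives 1 and 2, thread 1 receives 5, so the result is the new states 13 and 25 together with the messages (1,13) and (2,25) waiting for the next round. Note that the message addressed to the non-existent thread 2 would simply be dropped by the next round.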
To express Tarjan’s algorithms, we will need the generic repeat function that executes a given Global computation repeatedly until thread states stop changing.
repeat :: (Eq s, Eq t) => Global s t Void Void -> Global s t Void Void
repeat g (states, messages)
    | states == newStates = (states, messages)
    | otherwise           = repeat g (newStates, newMessages)
  where
    (newStates, newMessages) = g (states, messages)
Note that we require that the network is quiescent before and after a given computation, as indicated by the type of incoming and outgoing messages i = o = Void. Of course, the computation may comprise multiple rounds, which can exchange messages with each other!
Now let’s use the above definitions to express Algorithms P and R from Tarjan’s paper. We’ll have n + m threads corresponding to vertices and edges, whose states will be very minimalistic: edges will have no state at all, and every vertex will store its current parent i.e. the minimum vertex of the connected component to which the vertex is currently assigned.
type Vertex = Int
data Thread = VertexThread Vertex | EdgeThread Vertex Vertex
data State  = VertexState Vertex | EdgeState
A vertex that is its own parent is called a root; all vertices are initially roots. If this reminds you of the disjoint-set data structure, you are on the right track! All algorithms follow the same general idea: we start by assigning each vertex to a separate component and then use edges to inform their neighbouring vertices about other reachable vertices in the component, maintaining the invariant that the root of a component is its smallest vertex. As rounds progress, we grow a parent forest, not dissimilar to the forest produced by the depth-first search for our earlier example graph, but now taking advantage of concurrency.
Let’s also create a convenient type synonym for denoting global views of computation rounds involving s = State and t = Thread:
type (~>) i o = Global State Thread i o
We can now express computation primitives, such as connect, where edges inform their neighbours about each other:
connect :: Void ~> Vertex
connect = round $ \s t _ -> case t of
    VertexThread _ -> (s, [])
    EdgeThread x y -> (s, [(VertexThread (max x y), min x y)])
The type Void ~> Vertex says that the round starts with no messages in the network and ends with messages carrying vertices. Vertex threads are dormant in the round, whereas edge threads generate one message each, sending a smaller vertex to the thread corresponding to the larger vertex so that the latter could update its parent. Note that we express the behaviour locally, and then use the function round defined above to obtain its Global semantics.
This connect round can be followed by the update round, where vertex threads process incoming messages, updating their parents accordingly:
update :: Vertex ~> Void
update = round $ \s _ i -> case s of
    VertexState p -> (VertexState $ minimum (p : i), [])
    EdgeState     -> (s, [])
The primitives connect and update are not sufficient on their own; we need another crucial ingredient — the shortcut primitive that halves the depths of trees in the parent forest, similarly to the path compression technique used in the disjoint-set data structure.
shortcut :: Void ~> Void
shortcut = request >>> respondParent >>> update
  where
    request :: Void ~> Thread
    request = round $ \s t _ -> case s of
        VertexState p -> (s, [(VertexThread p, t)])
        EdgeState     -> (s, [])

respondParent :: Thread ~> Vertex
respondParent = round $ \s _ i -> case s of
    VertexState p -> (s, map (,p) i)
    EdgeState     -> (s, [])
In shortcut, every vertex requests the parent of its parent sending itself as a “respond-to” address. In the subsequent round, each vertex thread responds by sending its parent. The process completes by running the update primitive defined above. Note how we can compose simple primitives together in a type-safe way, obtaining a non-trivial shortcut.
The last primitive that I’ll cover is parentConnect, a variation of connect that informs the parents of two edge vertices about each other:
parentConnect :: Void ~> Vertex
parentConnect = request >>> respondParent >>> receive
  where
    request :: Void ~> Thread
    request = round $ \s t _ -> case t of
        VertexThread _ -> (s, [])
        EdgeThread x y -> (s, [(VertexThread x, t), (VertexThread y, t)])
    receive :: Vertex ~> Vertex
    receive = round $ \s _ i -> case s of
        VertexState _ -> (s, [])
        EdgeState     -> case i of
            [x, y] -> (s, [(VertexThread (max x y), min x y)])
            _      -> error "Unexpected number of responses"
We start by requesting parents of edge vertices, continue by sending the responses using the respondParent primitive defined above, and finally receive and handle the responses in a way similar to connect.
All the ingredients required for expressing Algorithm P are now in place:
algorithmP :: Void ~> Void
algorithmP = repeat (parentConnect >>> update >>> shortcut)
Great! My favourite algorithm from the paper is Algorithm R, which is a slight variation of Algorithm P:
algorithmR :: Void ~> Void
algorithmR = repeat (parentConnect >>> rootUpdate >>> shortcut)
Here we use rootUpdate instead of update, whose only difference is that updates to non-root vertices are ignored. This makes the growth of the parent forest monotonic: we never exchange subtrees between trees and only graft whole trees to other trees. This greatly simplifies the analysis of the performance of Algorithm R compared to Algorithm P. In fact, according to Tarjan, analysis of Algorithm P is still an open problem! I encourage you to read the paper, which presents five (!) algorithms for computing connected components concurrently, and also to implement the two remaining primitives, extendedConnect and alter, required for expressing Algorithms E, A and RA using our little modelling framework.
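The definition of rootUpdate is not shown above, so here is my guess at what its local behaviour could look like, based only on the description "updates to non-root vertices are ignored". The name rootUpdateLocal is made up; wrapping it with round would give the Global version.

```haskell
import Data.Void (Void)

type Vertex = Int
data Thread = VertexThread Vertex | EdgeThread Vertex Vertex deriving (Eq, Ord, Show)
data State  = VertexState Vertex | EdgeState deriving (Eq, Show)

-- Local view of a rootUpdate round (a sketch, not the original definition):
-- a vertex accepts new parent candidates only while it is its own parent
rootUpdateLocal :: State -> Thread -> [Vertex] -> (State, [(Thread, Void)])
rootUpdateLocal s t i = case (s, t) of
    (VertexState p, VertexThread x)
        | p == x -> (VertexState $ minimum (p : i), [])  -- x is a root
    _            -> (s, [])  -- non-roots and edges ignore incoming messages

main :: IO ()
main = do
    print $ fst $ rootUpdateLocal (VertexState 5) (VertexThread 5) [3, 4]
    print $ fst $ rootUpdateLocal (VertexState 2) (VertexThread 5) [1]
```

The root 5 adopts the smallest candidate and prints VertexState 3, whereas the second vertex already has parent 2, is therefore not a root, and keeps VertexState 2 even though a smaller candidate arrived.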
Let’s check if our implementation of Algorithm P works correctly on the example graph:
initialise :: Graph Int -> Map Thread State
initialise g = Map.fromList (vs ++ es)
  where
    vs = [ (VertexThread x, VertexState x) | x <- vertexList g ]
    es = [ (EdgeThread x y, EdgeState    ) | (x, y) <- edgeList g ]

run :: Global s t Void Void -> Map t s -> Map t s
run g m = fst $ g (m, [])

components :: Map Thread State -> [(Int, [Int])]
components m = groupSort [ (p, x) | (VertexThread x, VertexState p) <- Map.toList m ]

λ> mapM_ print $ components $ run algorithmP $ initialise example
(1,[1,2,3,4,5,6,7,8,9,10,11,13,15,18,22,23,24,25,27,29,31,32,35,38])
(12,[12,14,16,17])
(19,[19,20,21,26,28,30,33,34,36,37])
As expected, there are three components with roots 1, 12 and 19, which matches the figure at the top of the blog post.
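If you'd like to run Algorithm P without installing anything beyond GHC, the pieces above can be assembled into one file roughly as follows. A few things here are my assumptions rather than the original code: round and components use Data.Map functions instead of groupSort and Writer, initialise takes plain vertex and edge lists instead of an Alga graph, and the tiny test graph is made up.

```haskell
{-# LANGUAGE TupleSections, TypeOperators #-}
import Prelude hiding (round, repeat)
import Control.Category ((>>>))
import qualified Data.Map as Map
import Data.Map (Map)
import Data.Void (Void)

type Local  s t i o = s -> t -> [i] -> (s, [(t, o)])
type Global s t i o = (Map t s, [(t, i)]) -> (Map t s, [(t, o)])

-- Variant of round using only Data.Map (an assumption of this sketch)
round :: Ord t => Local s t i o -> Global s t i o
round local (states, messages) =
    (Map.map fst results, concatMap snd (Map.elems results))
  where
    deliveries = Map.fromListWith (flip (++)) [ (t, [m]) | (t, m) <- messages ]
    results    = Map.mapWithKey step states
    step t s   = local s t (Map.findWithDefault [] t deliveries)

repeat :: (Eq s, Eq t) => Global s t Void Void -> Global s t Void Void
repeat g (states, messages)
    | states == newStates = (states, messages)
    | otherwise           = repeat g (newStates, newMessages)
  where
    (newStates, newMessages) = g (states, messages)

type Vertex = Int
data Thread = VertexThread Vertex | EdgeThread Vertex Vertex deriving (Eq, Ord, Show)
data State  = VertexState Vertex | EdgeState deriving (Eq, Show)
type (~>) i o = Global State Thread i o

update :: Vertex ~> Void
update = round $ \s _ i -> case s of
    VertexState p -> (VertexState $ minimum (p : i), [])
    EdgeState     -> (s, [])

respondParent :: Thread ~> Vertex
respondParent = round $ \s _ i -> case s of
    VertexState p -> (s, map (,p) i)
    EdgeState     -> (s, [])

shortcut :: Void ~> Void
shortcut = request >>> respondParent >>> update
  where
    request :: Void ~> Thread
    request = round $ \s t _ -> case s of
        VertexState p -> (s, [(VertexThread p, t)])
        EdgeState     -> (s, [])

parentConnect :: Void ~> Vertex
parentConnect = request >>> respondParent >>> receive
  where
    request :: Void ~> Thread
    request = round $ \s t _ -> case t of
        VertexThread _ -> (s, [])
        EdgeThread x y -> (s, [(VertexThread x, t), (VertexThread y, t)])
    receive :: Vertex ~> Vertex
    receive = round $ \s _ i -> case s of
        VertexState _ -> (s, [])
        EdgeState     -> case i of
            [x, y] -> (s, [(VertexThread (max x y), min x y)])
            _      -> error "Unexpected number of responses"

algorithmP :: Void ~> Void
algorithmP = repeat (parentConnect >>> update >>> shortcut)

-- Hypothetical helper: builds the initial state from plain lists
initialise :: [Vertex] -> [(Vertex, Vertex)] -> Map Thread State
initialise vs es = Map.fromList $
    [ (VertexThread x, VertexState x) | x <- vs ] ++
    [ (EdgeThread x y, EdgeState)     | (x, y) <- es ]

-- Groups vertices by their final parent, i.e. by component root
components :: Map Thread State -> [(Vertex, [Vertex])]
components m = Map.toList $ Map.fromListWith (flip (++))
    [ (p, [x]) | (VertexThread x, VertexState p) <- Map.toList m ]

main :: IO ()
main = mapM_ print $ components $ fst $
    algorithmP (initialise [1..6] [(1, 2), (2, 3), (4, 5)], [])
```

On this little graph the algorithm stabilises after two iterations and prints the three components (1,[1,2,3]), (4,[4,5]) and (6,[6]).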
And here is an animation of how Algorithm R builds up the parent forest in a monotonic manner. Note that the edges in the parent forest are now shown as undirected (unlike in the earlier depth-first search figure) since they may change their logical direction during the execution of the algorithm (e.g. see the edge 4-9).
That’s all for now! I hope to find time soon to try these algorithms on Tinsel, a multi-FPGA hardware platform with thousands of processors connected by a fast network. Tinsel is being developed at the University of Cambridge as part of the POETS research project, which I worked on while at Newcastle University. Tinsel is very cool! Have a look at it if you are into FPGAs and unconventional computing architectures.
P.S.: If you haven’t heard about the Heidelberg Laureate Forum before, you can read about it here. If you are a young researcher in computer science or mathematics, you should consider applying — it’s an amazing event where you get a chance to talk to the leaders of the two fields, as well as to young researchers from all over the world. This year I’m here as part of the HLF blogging team, writing about people I met and things I learned here, and it’s been a lot of fun too — if you’d like to help cover and promote this event, get in touch with the HLF media team.
P.P.S.: And here are a couple of links: (i) a video recording of Tarjan’s HLF lecture and (ii) the complete source code from this blog post.
I’ve been studying and then working at Newcastle University for 14 years and enjoyed this time tremendously. People there, both in the streets and in the university, are warm and friendly so I felt welcome from Day 1. Academic life isn’t perfect for everyone but it was perfect for me. (By the way, I’m happy to recommend Newcastle University to anyone who is thinking of an academic career: it’s a great place and I’m happy to introduce you to my former colleagues.)
What I like about academia is the freedom of exploring new topics without asking anyone’s permission. In particular, in 2014 I unexpectedly started working on build systems, first practically by developing a new build system for the Glasgow Haskell Compiler and then theoretically by exploring the space of existing build systems in the paper Build Systems à la Carte. This led me to the build system Dune, which didn’t quite fit the model described in the paper, so I started collaborating with Jane Street to understand Dune better and we wrote another theoretical paper on “selective functors” together.
I find build systems fascinating and would like to dedicate some time to work on them. When I heard that the Dune team was looking for new developers I thought that this would be a great opportunity to both deepen my understanding of build systems, and also do something I wanted to do for a while — get some experience of working in industry and see if it would be a good environment for me too.
Like other members of the Tools and Compilers team at Jane Street, I plan to continue being involved in programming languages research. Fortunately, Dune and a few other Jane Street projects are open source, which makes it possible to openly collaborate and publish. I’ll also stay in touch with my former academic colleagues; Newcastle University has kindly granted me the Visiting Fellow status, so if you’ve been in touch with me via my university email, don’t worry, it will still work. Looks like this blog is still up too, although I’d like to set up a standalone blog at some point.
To complete the picture of my new place of work I’d like to highlight a few Jane Street tech talks: algebraic effects, incremental computation, compiling OCaml to FPGAs, as well as a more general talk about the use of OCaml at Jane Street. Not sure I’ll have time to contribute to any of these projects as I’m going to focus on Dune first, but it’s exciting to work in such an environment. My new Day 1 is tomorrow!
P.S.: If you are in London on 1-2 October, come along to the Build Meetup organised by Cloudflare, Bloomberg and Google where I’m giving a talk.
I call this build system Stroll because its build algorithm reminds me of strolling through a park where you’ve never been before, trying to figure out an optimal path to your target destination, and likely going in circles occasionally until you’ve built up a complete mental map of the park.
Most build systems require the user to specify both build tasks as well as dependencies between the tasks: knowing the dependencies allows the build system to determine which tasks need to be executed when you modify some of the source files. From personal experience, describing dependencies accurately is difficult and is often a source of frustration.
Build tasks and their dependencies are typically described using domain-specific task description languages: Make uses makefiles, Bazel uses a Python-inspired Starlark, Shake uses Haskell, etc. While learning a new task description language is not a big deal, translating an existing build system to a new language may take years (or maybe I’m just slow).
Stroll is different. You use your favourite language(s) to describe build tasks, put them in a directory, and ask Stroll to execute them. Stroll does not look inside the tasks; it treats the tasks as black boxes and finds the dependencies between them by tracking their file accesses. This process is not optimal in the sense that a task may fail because its dependency is not ready and, therefore, will need to be built again later, but in the end, Stroll will learn the complete and accurate dependency graph and will store it to speed up future builds.
There is a build system called Fabricate that also tracks file accesses to automatically compute accurate task dependencies, but it requires the user to describe build tasks in a lightweight Python DSL and preschedule them by calling the tasks in the right order, essentially asking the user to do the strolling part themselves. Fabricate is cool and inspired Stroll but I’d like to explore this corner of the build systems space a bit further.
Stroll is just an experiment and I’m not sure if the main idea behind it is feasible, but it can already be used to automate some simple collections of build tasks. Let me give you a demo. I’m using Windows below but the demo works on Linux too.
Consider a simple “Hello, World!” program stored in the file src/main.c:
#include <stdio.h>
int main()
{
    printf("Hello, World!\n");
    return 0;
}
We can build an executable bin/main.exe from it by using the following simple build script build/main.bat:
mkdir bin
gcc src/main.c -o bin/main.exe
Let’s ask Stroll to build our little project:
$ stroll build
Executing build/main.bat...
Done
$ bin/main.exe
Hello, World!
When executing build tasks, Stroll tracks their file accesses and stores the discovered information next to the build scripts, by appending the extension .stroll to their names:
$ cat build/main.bat.stroll
exit-code: ExitSuccess
operations:
  bin/main.exe:
    write: efc851e573be26cf8fe726caf70facf924ccdbae5c4fce241fdbe728b3abde76
  src/main.c:
    read: bc31bb10c238be7ee34fd86edec0dc99d145f56067b13b82c351608efd38763c
  build/main.bat:
    read: da9a4390693741b8d52388f18b1c5ccc418531bc3b0a64900890c381a31e7839
As you can see, Stroll recorded the exit code of the script, two file reads and one file write, along with the corresponding hashes of file contents.
We can ask Stroll to visualise the discovered dependency graph:
$ stroll -g build | dot -Tpng -Gdpi=600 -o graph.png
The flag -g tells Stroll to print out the discovered dependency graph in the DOT format that we subsequently convert to the following PNG file:
The green box indicates that the only build task main is up-to-date. This means Stroll will not execute it in the next run unless any of its inputs or outputs change. If we modify src/main.c and regenerate the dependency graph, we can see that the task main is now out-of-date:
The specific dependency that is out-of-date is shown by a dashed edge. Note that Stroll is a self-tracking build system, i.e. it tracks changes not only in sources and build artefacts but in build tasks too.
To make the example a bit more interesting, let’s add a library providing a greeting function greet in files src/lib/lib.h and src/lib/lib.c:
$ cat src/lib/lib.h
void greet(char *name);
$ cat src/lib/lib.c
#include <stdio.h>
#include <lib.h>
void greet(char *name)
{
    printf("Hello, %s!\n", name);
}
To compile the library we will add a new build task build/lib.bat:
mkdir bin/lib
gcc -Isrc/lib -c src/lib/lib.c -o bin/lib/lib.o
If we look at the dependency graph before running Stroll we’ll see:
The new task lib appeared in the graph but without any dependencies; it is marked as out-of-date because Stroll has never executed it. If we run Stroll now, it will execute both tasks reaching the following state:
The tasks are currently independent and can be built in parallel but let’s make use of the library by modifying the main source file as follows:
#include <lib.h>
int main()
{
    greet("World");
    return 0;
}
If we run Stroll now, the build will fail:
$ stroll build
Executing build/main.bat...
Script build/main.bat has failed.
Done
$ cat build/main.bat.stderr
A subdirectory or file bin already exists.
src\main.c:2:17: fatal error: lib.h: No such file or directory
compilation terminated.
By examining the file build/main.bat.stderr helpfully created by Stroll, we can see that the build failed because we forgot to modify the build script and point gcc to our library. An error is visualised by a task with a double border:
Let’s fix the script build/main.bat:
mkdir bin
gcc -Isrc/lib src/main.c bin/lib/lib.o -o bin/main.exe
With this fix, Stroll completes successfully and produces the following dependency graph:
We can now demonstrate the early cut-off feature by adding a comment to the file src/lib/lib.c. Before running Stroll let’s check that this change makes both lib and main out-of-date:
As you can see, both tasks are now marked as out-of-date: lib’s direct input has changed, which transitively also affects main. Let’s Stroll:
$ stroll build
Executing build/lib.bat...
Done
Stroll has rebuilt the library but the file bin/lib/lib.o didn’t change, thus restoring the up-to-date status of the task main.
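To make the early cut-off a bit more tangible, here is a tiny pure model of the status check in Haskell. This is not how Stroll is actually implemented: the Task record, the short strings standing in for content hashes, and the function names are all invented for this sketch, and it assumes there are no dependency cycles.

```haskell
import qualified Data.Map as Map
import Data.Map (Map)

type Hash = String  -- stand-in for a real content hash

-- Hypothetical model of a recorded task: hashes of files it read and wrote
data Task = Task { inputs :: Map FilePath Hash, outputs :: Map FilePath Hash }

-- Direct check: every recorded hash still matches the current file contents
unchanged :: Map FilePath Hash -> Task -> Bool
unchanged files t = Map.union (inputs t) (outputs t) `Map.isSubmapOf` files

-- A task is out-of-date if its own files changed, or if a task producing
-- one of its inputs is out-of-date (assumes an acyclic dependency graph)
outOfDate :: Map FilePath Hash -> Map String Task -> String -> Bool
outOfDate files tasks name =
    not (unchanged files task) || any (outOfDate files tasks) producers
  where
    task      = tasks Map.! name
    producers = [ n | (n, t) <- Map.toList tasks, n /= name
                    , not $ Map.null $ Map.intersection (outputs t) (inputs task) ]

-- lib as recorded before the edit, and as recorded after rebuilding it:
-- the source hash differs, but the object file hash stays the same
lib, libRebuilt, mainTask :: Task
lib        = Task (Map.fromList [("src/lib/lib.c", "c1")])
                  (Map.fromList [("bin/lib/lib.o", "o1")])
libRebuilt = Task (Map.fromList [("src/lib/lib.c", "c2")])
                  (Map.fromList [("bin/lib/lib.o", "o1")])
mainTask   = Task (Map.fromList [("src/main.c", "m1"), ("bin/lib/lib.o", "o1")])
                  (Map.fromList [("bin/main.exe", "e1")])

-- Files after adding a comment to lib.c: only the hash of lib.c changes
edited :: Map FilePath Hash
edited = Map.fromList
    [ ("src/lib/lib.c", "c2"), ("bin/lib/lib.o", "o1")
    , ("src/main.c", "m1"), ("bin/main.exe", "e1") ]

-- Out-of-date status of lib and main, given the record for lib
statuses :: Task -> [Bool]
statuses l = map (outOfDate edited tasks) ["lib", "main"]
  where
    tasks = Map.fromList [("lib", l), ("main", mainTask)]

main :: IO ()
main = do
    print (statuses lib)         -- before rebuilding lib
    print (statuses libRebuilt)  -- after rebuilding lib
```

Before the rebuild both tasks are out-of-date, so the first line is [True,True]. Once lib has been rebuilt and its object file turns out to be unchanged, the second line is [False,False]: main becomes up-to-date again without being executed.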
Finally, to clean up after our experiments, let’s create a new task clean and place it into a different directory: clean/clean.bat.
$ cat clean/clean.bat
rm bin/lib/lib.o
rm bin/main.exe
$ stroll clean
Executing clean/clean.bat...
Done
The corresponding dependency graph shows two outputs — the deleted binary files.
As you might have noticed, Stroll uses directories as collections of build tasks related to a common build target. To build a target we simply run Stroll in the corresponding directory. If you’d like to build just a single task from a directory, you can specify the full path, for example:
$ stroll build/lib.bat
Executing build/lib.bat...
This executes the lib task (regardless of its current status). Note that the main task is currently out-of-date because clean deleted its output:
To complete the build, run stroll build, which will execute only the main task, as desired.
Challenge: Try to orchestrate a situation where Stroll executes one of the tasks twice, hence demonstrating that Stroll is not a minimal build system.
Stroll is implemented in ~400 lines of Haskell (including comments) and uses the following libraries in addition to the standard ones:
Many thanks to everyone who contributed to these projects!
You are welcome to browse the source code of Stroll and/or play with it, but be warned: it has a few serious limitations (discussed below), and I’m not sure they will ever be fixed.
Stroll is fun and I consider it a successful experiment but the current implementation has a few serious limitations:
One interesting aspect that I haven’t demonstrated above is how one can mix and match different languages when writing individual build tasks. For example, we could use Shake for compiling C source files into object files. Note that Shake itself is a build system that maintains its own state, but it still works out just fine: if Shake decides to rebuild only one object file (since others are up-to-date), Stroll can safely remember this decision. As long as all input files are the same and Shake’s database is unchanged, we can assume that the results produced by Shake previously are still valid. This means that even though Stroll’s task granularity may be quite coarse (e.g. one task for 100 object files), these coarse-grained tasks can benefit from fine-grained incrementality and parallelism supported by other build systems, such as Shake. And if you happen to have two existing build systems written in different languages you don’t need to rewrite anything: you can compose your legacy build systems simply by placing them into the same directory and using Stroll.
Unusually, Stroll can cope with cyclic task descriptions, where a few build tasks form a dependency cycle, as well as with build tasks that generate new build tasks! Stroll simply keeps building until reaching a fixed point where all tasks are up-to-date.
Stroll does not fit the modelling approach from the Build Systems à la Carte paper, where build tasks have statically known outputs: Stroll supports both dynamic inputs and dynamic outputs. I conjecture that such build systems cannot be minimal, i.e. they fundamentally require a trial and error approach used by Stroll, where unnecessary work may be performed while discovering the complete dependency graph.
I’ve been thinking about this idea for a while and had many illuminating discussions with Ulf Adams, Arseniy Alekseyev, Jeremie Dimino, Georgy Lukyanov, Neil Mitchell, Iman Narasamdya, Simon Peyton Jones, Newton Sanches, Danil Sokolov, and probably others. A few people in this list have been sceptical about the idea, so I do not imply that they endorsed Stroll — I’m merely thankful to everyone for their insights.
You should try to use Hadrian as the GHC build system, because it will (hopefully!) become the default around GHC 8.8.
Hadrian is a new build system for the Glasgow Haskell Compiler; it is itself written in Haskell. It lives in the directory “hadrian” in the GHC tree, and we have been actively developing it over the past year to reach feature and correctness parity with the existing Make-based build system. While we haven’t quite reached this goal (more on this below), Hadrian is already working well and we have been running Hadrian jobs alongside the Make ones in our CI pipelines since the recent move to GitLab.
At this point, we would like to encourage everyone to try using Hadrian for their usual GHC development tasks. Hadrian’s documentation resides in GHC’s source tree, and below are the documents you will be most interested in:
The documentation can surely be improved, so please do not hesitate to send us feedback and suggestions here, or even better on GHC Trac; make sure you select the component “Build System (Hadrian)” when creating a new ticket.
Hadrian is new, requires time to learn, and still has rough edges, but it has been developed to make your lives better. Here are a few advantages of Hadrian over the Make-based build system:
Hadrian can capture build dependencies more accurately, which means you rarely (if ever) need to do a clean rebuild.
Hadrian is faster for two reasons: (i) more accurate build dependencies, (ii) tracking of file contents instead of file modification times. Both allow you to avoid a lot of unnecessary rebuilds. Building Hadrian itself may take a while but needs to be done only once.
You no longer need to deal with Make’s global namespace of mutable string variables. Hadrian is written in the language you love; it has modules, types and pure functions.
If you come across a situation where Hadrian is worse than the Make build system in any of the above aspects, this is a bug and you should report it.
The best way to help is to try Hadrian, and let us know how it goes, what doesn’t work, what’s missing for you, what you think should be easier, and so on. Below is a list of known issues that we are in the process of fixing or that we will be tackling soon:
We are likely missing some features compared to the Make build system, but none of them should take a lot of time to implement at this point. If you spot one, let us know! We’ll do our best to implement it (or help you do it) as soon as we can. It is useful to look at the existing Hadrian tickets before submitting new ones, to make sure that the issue or idea that you would like to talk about hasn’t been brought up yet.
Of course, we welcome your code contributions too! Several GHC developers have a good understanding of the Hadrian codebase and will be able to help you. To find their names, have a look at the list of recent Hadrian commits. As you can see, Hadrian is actively developed by many people, and we hope you will join too.
We start with a brief introduction to monoids, rings and lattices. Feel free to jump straight to the section “What if 0 = 1?”, where the fun starts.
A monoid (S, ∘, e) is a way to express a basic form of composition in mathematics: any two elements a and b of the set S can be composed into a new element a ∘ b of the same set S, and furthermore there is a special element e ∈ S, which is the identity element of the composition, as expressed by the following identity axioms:
a ∘ e = a
e ∘ a = a
In words, composing the identity element with another element does not change the latter. The identity element is sometimes also called unit.
As two familiar everyday examples, consider addition and multiplication over integer numbers: (ℤ, +, 0) and (ℤ, ⋅, 1). Given any two integers, these operations produce another integer, e.g. 2 + 3 = 5 and 2 ⋅ 3 = 6, never leaving the underlying set of integers ℤ; they also respect the identity axioms, i.e. both a + 0 = 0 + a = a and a ⋅ 1 = 1 ⋅ a = a hold for all integers a. Note: from now on we will often omit the multiplication operator and write simply ab instead of a ⋅ b, which is a usual convention.
Another important monoid axiom is associativity:
a ∘ (b ∘ c) = (a ∘ b) ∘ c
It tells us that the order in which we group composition operations does not matter. This makes monoids convenient to work with and allows us to omit unnecessary parentheses. Addition and multiplication are associative: a + (b + c) = (a + b) + c and a(bc) = (ab)c.
Monoids are interesting to study, because they appear everywhere in mathematics, programming and engineering. Another example comes from Boolean algebra: the logical disjunction (OR) monoid ({0,1}, ∨, 0) and the logical conjunction (AND) monoid ({0,1}, ∧, 1). Compared to numbers, in Boolean algebra the meanings of composition and identity elements are very different (e.g. number zero vs logical false), yet we can abstract from these differences, which allows us to reuse general results about monoids across various specific instances.
In this blog post we will also come across commutative and idempotent monoids. In commutative monoids, the order of composition does not matter:
a ∘ b = b ∘ a
All four examples above (+, ⋅, ∨, ∧) are commutative monoids. String concatenation (S, ++, “”) is an example of a non-commutative monoid: indeed, “a” ++ “b” = “ab” and “b” ++ “a” = “ba” are different strings.
Finally, in an idempotent monoid, composing an element with itself does not change it:
a ∘ a = a
The disjunction ∨ and conjunction ∧ monoids are idempotent, whereas the addition +, multiplication ⋅ and concatenation ++ monoids are not.
A monoid which is both commutative and idempotent is a bounded semilattice; both disjunction and conjunction are bounded semilattices.
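If you'd like to experiment, the examples above can be checked in Haskell using the standard Monoid instances from Data.Monoid. The helper functions below are my own and test each law only on a handful of samples, so a True is evidence rather than a proof, while a False is a genuine counterexample.

```haskell
import Data.Monoid (All (..), Any (..), Product (..), Sum (..))

-- Each check tests a law on a finite list of samples
identityLaw, associative, commutative, idempotent :: (Eq a, Monoid a) => [a] -> Bool
identityLaw xs = and [ x <> mempty == x && mempty <> x == x | x <- xs ]
associative xs = and [ x <> (y <> z) == (x <> y) <> z | x <- xs, y <- xs, z <- xs ]
commutative xs = and [ x <> y == y <> x | x <- xs, y <- xs ]
idempotent  xs = and [ x <> x == x | x <- xs ]

-- Results in the order: identity, associativity, commutativity, idempotence
laws :: (Eq a, Monoid a) => [a] -> [Bool]
laws xs = [identityLaw xs, associative xs, commutative xs, idempotent xs]

main :: IO ()
main = do
    print $ laws (map Sum     [-2 .. 3 :: Integer])  -- addition
    print $ laws (map Product [-2 .. 3 :: Integer])  -- multiplication
    print $ laws (map Any [False, True])             -- disjunction
    print $ laws (map All [False, True])             -- conjunction
    print $ laws ["", "a", "b", "ab"]                -- string concatenation
```

Addition and multiplication pass everything except idempotence, disjunction and conjunction pass all four checks (so they are bounded semilattices), and concatenation fails both commutativity and idempotence.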
As you might have noticed, monoids often come in pairs: addition and multiplication (+, ⋅), disjunction and conjunction (∨, ∧), set union and intersection (⋃, ⋂), parallel and sequential composition (||, ;) etc. I’m sure you can list a few more examples of such pairs. Two common ways in which such monoid pairs can be formed are called rings and lattices.
A ring, or more generally a semiring, (S, +, 0, ⋅, 1) comprises an additive monoid (S, +, 0) and a multiplicative monoid (S, ⋅, 1), such that: they both operate on the same set S, the additive monoid is commutative, and the multiplicative monoid distributes over the additive one:
a(b + c) = ab + ac
(a + b)c = ac + bc
Distributivity is very convenient and allows us to open parentheses, and (if applied in reverse) to factor out a common term of two expressions. Furthermore, ring-like algebraic structures require that 0 annihilates all elements under multiplication:
a ⋅ 0 = 0
0 ⋅ a = 0
The most basic and widely known ring is that of integer numbers with addition and multiplication: we use this pair of monoids every day, with no fuss about the underlying theory. Various lesser known tropical and star semirings are a great tool in optimisation on graphs — read this cool functional pearl by Stephen Dolan if you want to learn more.
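To make the semiring axioms concrete, here is a small sketch with two instances: ordinary integers and the tropical (min, +) semiring, where "addition" is min with identity ∞ and "multiplication" is + with identity 0. The Semiring class and the operators |+| and |*| are made up for this example and do not come from any particular library; as before, the laws are only checked on a few samples.

```haskell
infixl 6 |+|
infixl 7 |*|

-- Hypothetical minimal class, written for this sketch only
class Semiring a where
    zero, one    :: a
    (|+|), (|*|) :: a -> a -> a

instance Semiring Integer where
    zero  = 0
    one   = 1
    (|+|) = (+)
    (|*|) = (*)

-- Tropical semiring: integers extended with positive infinity
data Tropical = Finite Integer | Infinity deriving (Eq, Show)

instance Semiring Tropical where
    zero = Infinity
    one  = Finite 0
    Infinity |+| y        = y
    x        |+| Infinity = x
    Finite x |+| Finite y = Finite (min x y)
    Finite x |*| Finite y = Finite (x + y)
    _        |*| _        = Infinity

-- Left and right distributivity, checked on all triples of samples
distributes :: (Eq a, Semiring a) => [a] -> Bool
distributes xs = and
    [ a |*| (b |+| c) == a |*| b |+| a |*| c &&
      (a |+| b) |*| c == a |*| c |+| b |*| c
    | a <- xs, b <- xs, c <- xs ]

-- Zero annihilates every sample under multiplication
annihilates :: (Eq a, Semiring a) => [a] -> Bool
annihilates xs = and [ a |*| zero == zero && zero |*| a == zero | a <- xs ]

main :: IO ()
main = do
    print (distributes [-2 .. 2 :: Integer], annihilates [-2 .. 2 :: Integer])
    print (distributes tropical, annihilates tropical)
  where
    tropical = Infinity : map Finite [-2 .. 2]
```

Both lines print (True,True). In the tropical case distributivity is the familiar fact that a + min(b, c) = min(a + b, a + c), and annihilation says that adding anything to ∞ gives ∞.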
A bounded lattice (S, ∨, 0, ∧, 1) also comprises two monoids, which are called join (S, ∨, 0) and meet (S, ∧, 1). They operate on the same set S, are required to be commutative and idempotent, and satisfy the following absorption axioms:
a ∧ (a ∨ b) = a
a ∨ (a ∧ b) = a
Like rings, lattices show up very frequently in different application areas. The most basic examples include Boolean algebra ({0,1}, ∨, 0, ∧, 1), the power set (2^{S}, ⋃, Ø, ⋂, S), as well as integer numbers with negative and positive infinities and the operations max and min: (ℤ^{±∞}, max, -∞, min, +∞). All of these lattices are distributive, i.e. ∧ distributes over ∨ and vice versa.
Now that the scene has been set and all characters introduced, let’s see what happens when the identity elements of the two monoids in a pair (S, +, 0) and (S, ⋅, 1) coincide, i.e. when 0 = 1.
In a ring (S, +, 0, ⋅, 1), this leads to devastating consequences. Not only does 1 become equal to 0, but all other elements of the ring become equal to 0 too, as demonstrated below:
a = a ⋅ 1    (identity of ⋅)
  = a ⋅ 0    (we postulate 0 = 1)
  = 0        (annihilating 0)
The ring is annihilated into a single point 0.
In a bounded lattice (S, ∨, 0, ∧, 1), postulating 0 = 1 leads to the same catastrophe, albeit in a different manner:
a = 1 ∧ a          (identity of ∧)
  = 1 ∧ (0 ∨ a)    (identity of ∨)
  = 0 ∧ (0 ∨ a)    (we postulate 0 = 1)
  = 0              (absorption axiom)
The lattice is absorbed into a single point 0.
Postulating the axiom 0 = 1 has so far led to nothing but disappointment. Let’s find another way of pairing monoids, which does not involve the axioms of annihilation and absorption.
Consider two monoids (S, +, 0) and (S, ⋅, 1), which operate on the same set S, such that + is commutative and ⋅ distributes over +. We call these monoids united if 0 = 1. To avoid confusion with rings and lattices, we will use e to denote the identity element of both monoids:
a + e = ae = ea = a
We will call this the united identity axiom. We’ll also refer to e as empty, the operation + as overlay, and the operation ⋅ as connect.
What can we tell about united monoids? First of all, it is easy to prove that the monoid (S, +, e) is idempotent:
a + a = ae + ae     (united identity)
      = a(e + e)    (distributivity)
      = ae          (united identity)
      = a           (united identity)
Recall that this means that (S, +, e) is a bounded semilattice.
The next consequence of the united identity axiom is a bit more unusual:
ab = ab + a
ab = ab + b
ab = ab + a + b
We will refer to the above properties as containment laws: intuitively, when you connect a and b, the constituent parts are contained in the result ab. Let us prove containment:
ab + a = ab + ae     (united identity)
       = a(b + e)    (distributivity)
       = ab          (united identity)
The two other laws are proved analogously (in fact, they are equivalent to each other).
Surprisingly, the containment law ab = ab + a is equivalent to the united identity law 0 = 1, i.e. the latter can be proved from the former:
0 = 1 ⋅ 0        (1 is identity of ⋅)
  = 1 ⋅ 0 + 1    (containment)
  = 0 + 1        (1 is identity of ⋅)
  = 1            (0 is identity of +)
This means that united monoids can equivalently be defined as follows:
Then the fact that (S, +, e) is also a monoid can be proved as above.
Finally, let’s prove one more property of united monoids: non-empty elements of S can have no inverses. More precisely:
if a + b = e or ab = e then a = b = e.
The lack of overlay inverses follows from overlay idempotence:
a = a + e        (united identity)
  = a + a + b    (assumption a + b = e)
  = a + b        (idempotence)
  = e            (assumption a + b = e)
The lack of connect inverses follows from the containment law:
a = e + a     (united identity)
  = ab + a    (assumption ab = e)
  = ab        (containment)
  = e         (assumption ab = e)
It is time to look at some examples of united monoids.
One example appears in this paper on Haskell’s ApplicativeDo language extension. It uses a simple cost model for defining the execution time of programs composed in parallel or in sequence. The two monoids are:
time(a || b) = max(time(a), time(b))
time(a ; b) = time(a) + time(b)
Execution times are non-negative, hence both max and + have identity 0, which is the execution time of the empty program: max(a, 0) = a + 0 = a. It is easy to check distributivity (+ distributes over max) and containment:
a + max(b, c) = max(a + b, a + c)
max(a + b, a) = a + b
Note that the resulting algebraic structure is different from the tropical max-plus semiring (ℝ^{−∞}, max, −∞, +, 0) commonly used in scheduling, where the identity of max is −∞.
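To make these laws concrete, here is a minimal brute-force check of the execution-time model. It is only a sketch: it assumes execution times are small non-negative Ints, and the names overlayT, connectT, emptyT and the three predicates are made up for illustration rather than taken from any library.

```haskell
-- Execution-time cost model: overlay is max, connect is (+), both with identity 0.
-- All names below are illustrative, not taken from a library.
overlayT :: Int -> Int -> Int
overlayT = max

connectT :: Int -> Int -> Int
connectT = (+)

emptyT :: Int
emptyT = 0

-- A small sample of non-negative execution times
times :: [Int]
times = [0 .. 5]

-- a + e = ae = ea = a
unitedIdentity :: Bool
unitedIdentity = and
  [ overlayT a emptyT == a && connectT a emptyT == a && connectT emptyT a == a
  | a <- times ]

-- a(b + c) = ab + ac
distributivity :: Bool
distributivity = and
  [ connectT a (overlayT b c) == overlayT (connectT a b) (connectT a c)
  | a <- times, b <- times, c <- times ]

-- ab + a = ab
containment :: Bool
containment = and
  [ overlayT (connectT a b) a == connectT a b
  | a <- times, b <- times ]
```

After loading this into GHCi, all three predicates evaluate to True. Adding a negative number to times breaks unitedIdentity, because max a 0 is no longer a, which is exactly why the model relies on execution times being non-negative.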
In general, various flavours of parallel and sequential composition often form united monoids. In this paper about Concurrent Kleene Algebra the authors use the term bimonoid to refer to such structures, but this term is also used to describe an unrelated concept in category theory, so let me stick to “united monoids” here, which has zero Google hits.
My favourite algebraic structure is the algebra of graphs described in this paper. The algebra comprises two monoids that have the same identity, which motivated me to study similar algebraic structures, and led to writing this blog post about the generalised notion of united monoids.
As a brief introduction, consider the following operations on graphs. The overlay operation + takes two graphs (V_{1}, E_{1}) and (V_{2}, E_{2}), and produces the graph containing the union of their vertices and edges:
(V_{1}, E_{1}) + (V_{2}, E_{2}) = (V_{1} ∪ V_{2}, E_{1} ∪ E_{2})
The connect operation ⋅ is similar to overlay, but it also adds an edge from each vertex of the first graph to each vertex of the second graph:
(V_{1}, E_{1}) ⋅ (V_{2}, E_{2}) = (V_{1} ∪ V_{2}, E_{1} ∪ E_{2} ∪ V_{1} × V_{2})
The operations have the same identity e — the empty graph (∅, ∅) — and form a pair of united monoids, where ⋅ distributes over +.
In addition to the laws of united monoids described above, the algebra of graphs has the axiom of decomposition:
abc = ab + ac + bc
The intuition behind this axiom is that any expression in the algebra of graphs can be broken down into vertices and pairs of vertices (edges). Note that the containment laws follow from decomposition, e.g.:
ab = aeb              (e identity of ⋅)
   = ae + ab + eb     (decomposition)
   = ae + ab + b      (e identity of ⋅)
   = a(e + b) + b     (distributivity)
   = ab + b           (e identity of +)
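The set-based definitions of overlay and connect translate almost literally into code. Below is a minimal sketch that models a graph as a pair of sets and checks united identity, containment and decomposition on a few sample graphs. The type G and the function names are hypothetical and chosen for this illustration; this is not how the algebraic-graphs library represents graphs.

```haskell
import Data.Set (Set)
import qualified Data.Set as Set

-- A graph is a set of vertices together with a set of directed edges.
-- (Illustrative model only, not the algebraic-graphs representation.)
data G a = G (Set a) (Set (a, a)) deriving (Eq, Show)

emptyG :: G a
emptyG = G Set.empty Set.empty

vertexG :: a -> G a
vertexG x = G (Set.singleton x) Set.empty

-- (V1, E1) + (V2, E2) = (V1 ∪ V2, E1 ∪ E2)
overlayG :: Ord a => G a -> G a -> G a
overlayG (G v1 e1) (G v2 e2) = G (Set.union v1 v2) (Set.union e1 e2)

-- (V1, E1) ⋅ (V2, E2) = (V1 ∪ V2, E1 ∪ E2 ∪ V1 × V2)
connectG :: Ord a => G a -> G a -> G a
connectG (G v1 e1) (G v2 e2) = G (Set.union v1 v2) (Set.unions [e1, e2, cross])
  where
    cross = Set.fromList [ (x, y) | x <- Set.toList v1, y <- Set.toList v2 ]

-- Three sample graphs: a vertex, an edge, and another vertex
sampleA, sampleB, sampleC :: G Int
sampleA = vertexG 1
sampleB = connectG (vertexG 2) (vertexG 3)
sampleC = vertexG 4

-- ae = ea = a + e = a
unitedIdentityG :: Bool
unitedIdentityG = and
  [ connectG g emptyG == g && connectG emptyG g == g && overlayG g emptyG == g
  | g <- [sampleA, sampleB, sampleC] ]

-- ab = ab + a + b
containmentG :: Bool
containmentG =
  overlayG ab (overlayG sampleA sampleB) == ab
  where ab = connectG sampleA sampleB

-- abc = ab + ac + bc
decompositionG :: Bool
decompositionG =
  connectG sampleA (connectG sampleB sampleC)
    == overlayG (connectG sampleA sampleB)
                (overlayG (connectG sampleA sampleC) (connectG sampleB sampleC))
```

All three checks evaluate to True in GHCi. They only test a handful of samples, so they illustrate the laws rather than prove them.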
By postulating the commutativity of the connect operation (ab = ba), we can readily obtain undirected graphs.
The algebra of graphs can be considered a “2D” special case of united monoids, where one can only connect elements pairwise; any 3-way connection abc falls apart into pieces. A 3-dimensional variant of the algebra can be obtained by replacing the decomposition axiom with:
abcd = abc + abd + acd + bcd
This allows us to connect vertices into pairs (edges) and triples (faces), but forces 4-way products abcd to fall apart into faces, as shown below:
Note that 3-decomposition follows from 2-decomposition: if all 3-way products fall apart then so do all 4-way products, but not vice versa. Borrowing an example from David Spivak’s paper on modelling higher-dimensional networks, such 3D graphs allow us to distinguish these two different situations:
Similar examples show up in concurrency theory, where one might need to distinguish three truly concurrent events from three events that are concurrent pairwise, but whose overall concurrency is limited by shared resources, e.g. three people eating ice-cream with two spoons, or going through a two-person-wide door. There is a short paper on this topic by my PhD advisor Alex Yakovlev, written in 1989 (on a typewriter!).
United monoids of growing dimension lead us to topology, specifically to simplicial complexes, which are composed of simple n-dimensional shapes called simplices, such as point (0-simplex), segment (1-simplex), triangle (2-simplex), tetrahedron (3-simplex), etc. — here is a cool video. We show an example of a simplicial complex below, along with a united monoid expression C that describes it, and two containment properties. We’ll further assume commutativity of connection: ab = ba.
Simplicial complexes are closed in terms of containment. For example, a filled-in triangle contains its edges and vertices, and cannot appear in a simplicial complex without any of them. This property can be expressed algebraically as follows:
abc = abc + ab + ac + bc + a + b + c
Interestingly, this 3D containment law follows from the 2D version that we defined for united monoids:
abc = (ab + a + b)c                           (containment)
    = (ab)c + ac + bc                         (distributivity)
    = (abc + ab + c) + (ac + a) + (bc + b)    (containment)
    = abc + ab + ac + bc + a + b + c          (commutativity)
We can similarly prove n-dimensional versions of the containment law; they all trivially follow from the basic containment axiom ab = ab + a, or, alternatively, from the united identity axiom 0 = 1.
Now let’s put together a small library for united monoids in Haskell and express some of the above examples in it.
Monoids are already represented in the standard Haskell library base by the type class Monoid. We need to extend it to the type class Semilattice, which does not define any new methods, but comes with two new laws. We also provide a few convenient aliases, following the API of the algebraic-graphs library:
    -- Laws:
    -- * Commutativity: a <> b = b <> a
    -- * Idempotence:   a <> a = a
    class Monoid m => Semilattice m

    empty :: Semilattice m => m
    empty = mempty

    overlay :: Semilattice m => m -> m -> m
    overlay = mappend

    overlays :: Semilattice m => [m] -> m
    overlays = foldr overlay empty

    infixr 6 <+>
    (<+>) :: Semilattice m => m -> m -> m
    (<+>) = overlay

    -- The natural partial order on the semilattice
    isContainedIn :: (Eq m, Semilattice m) => m -> m -> Bool
    isContainedIn x y = x <+> y == y
We are now ready to define the type class for united monoids that defines a new method connect and associated laws:
    -- Laws:
    -- * United identity: a <.> empty == empty <.> a == a
    -- * Associativity:   a <.> (b <.> c) == (a <.> b) <.> c
    -- * Distributivity:  a <.> (b <+> c) == a <.> b <+> a <.> c
    --                    (a <+> b) <.> c == a <.> c <+> b <.> c
    class Semilattice m => United m where
        connect :: m -> m -> m

    infixr 7 <.>
    (<.>) :: United m => m -> m -> m
    (<.>) = connect

    connects :: United m => [m] -> m
    connects = foldr connect empty
Algebraic graphs are a trivial instance:
    import Algebra.Graph (Graph)
    import qualified Algebra.Graph as Graph

    -- TODO: move orphan instances to algebraic-graphs library
    instance Semigroup (Graph a) where
        (<>) = Graph.overlay

    instance Monoid (Graph a) where
        mempty = Graph.empty

    instance Semilattice (Graph a)

    instance United (Graph a) where
        connect = Graph.connect
We can now express the above simplicial complex example in Haskell and test whether it contains the filled-in and the hollow triangles:
    -- We are using OverloadedStrings for creating vertices
    example :: (United m, IsString m) => m
    example = overlays [ "p" <.> "q" <.> "r" <.> "s"
                       , ("r" <+> "s") <.> "t"
                       , "u"
                       , "v" <.> "x"
                       , "w" <.> ("x" <+> "y" <+> "z")
                       , "x" <.> "y" <.> "z" ]

    -- Filled-in triangle
    rstFace :: (United m, IsString m) => m
    rstFace = "r" <.> "s" <.> "t"

    -- Hollow triangle
    rstSkeleton :: (United m, IsString m) => m
    rstSkeleton = "r" <.> "s" <+> "r" <.> "t" <+> "s" <.> "t"
To perform the test, we need to instantiate the polymorphic united monoid expression to the concrete data type like Graph Point:
    newtype Point = Point { getPoint :: String }
        deriving (Eq, Ord, IsString)

    λ> rstFace `isContainedIn` (example :: Graph Point)
    True
    λ> rstSkeleton `isContainedIn` (example :: Graph Point)
    True
As you can see, if we interpret the example simplicial complex using the algebraic graphs instance, we cannot distinguish the filled-in and hollow triangles, because the filled-in triangle falls apart into edges due to the 2-decomposition law abc = ab + ac + bc.
Let’s define a data type for representing simplicial complexes. We start with simplices, which can be modelled by sets.
    -- A simplex is formed on a set of points
    newtype Simplex a = Simplex { getSimplex :: Set a }
        deriving (Eq, Semigroup)

    -- Size-lexicographic order: https://en.wikipedia.org/wiki/Shortlex_order
    instance Ord a => Ord (Simplex a) where
        compare (Simplex x) (Simplex y) =
            compare (Set.size x) (Set.size y) <> compare x y

    instance Show a => Show (Simplex a) where
        show = intercalate "." . map show . Set.toList . getSimplex

    instance IsString a => IsString (Simplex a) where
        fromString = Simplex . Set.singleton . fromString

    isFaceOf :: Ord a => Simplex a -> Simplex a -> Bool
    isFaceOf (Simplex x) (Simplex y) = Set.isSubsetOf x y
Note that the Ord instance is defined using the size-lexicographic order so that a simplex x can be a face of a simplex y only when x <= y.
Now we can define simplicial complexes, which are sets of simplices that are closed with respect to the subset relation.
    -- A simplicial complex is a set of simplices
    -- We only store maximal simplices for efficiency
    newtype Complex a = Complex { getComplex :: Set (Simplex a) }
        deriving (Eq, Ord)

    instance Show a => Show (Complex a) where
        show = intercalate " + " . map show . Set.toList . getComplex

    instance IsString a => IsString (Complex a) where
        fromString = Complex . Set.singleton . fromString

    -- Do not add a simplex if it is contained in existing ones
    addSimplex :: Ord a => Simplex a -> Complex a -> Complex a
    addSimplex x (Complex y)
        | any (isFaceOf x) y = Complex y
        | otherwise          = Complex (Set.insert x y)

    -- Drop all non-maximal simplices
    normalise :: Ord a => Complex a -> Complex a
    normalise = foldr addSimplex empty . sort . Set.toList . getComplex

    instance Ord a => Semigroup (Complex a) where
        Complex x <> Complex y = normalise (Complex $ x <> y)

    instance Ord a => Monoid (Complex a) where
        mempty = Complex Set.empty

    instance Ord a => Semilattice (Complex a)

    -- Keeping x and y in the union makes empty the identity of connect;
    -- for non-empty arguments they are removed again by normalise
    instance Ord a => United (Complex a) where
        connect (Complex x) (Complex y) = normalise . Complex $ x <> y <> Set.fromList
            [ a <> b | a <- Set.toList x, b <- Set.toList y ]
Now let’s check that simplicial complexes allow us to distinguish the filled-in triangle from the hollow one:
    λ> example :: Complex Point
    u + r.t + s.t + v.x + w.x + w.y + w.z + x.y.z + p.q.r.s
    λ> rstFace :: Complex Point
    r.s.t
    λ> rstSkeleton :: Complex Point
    r.s + r.t + s.t
    λ> rstFace `isContainedIn` (example :: Complex Point)
    False
    λ> rstSkeleton `isContainedIn` (example :: Complex Point)
    True
Success! As you can check in the diagram above, the example simplicial complex contains a hollow triangle rs + rt + st, but does not contain the filled-in triangle rst.
If you would like to experiment with the code above, check out this repository: https://github.com/snowleopard/united.
I’ve got a few more thoughts, but it’s time to wrap up this blog post. I’m impressed that you’ve made it this far =)
Let me simply list a few things I’d like to explore in future:
Finally, I’d like to ask a question: have you come across united monoids, perhaps under a different name? As we’ve seen, having 0 = 1 does make sense in some cases, but I couldn’t find much literature on this topic.
Consider the following definition of the boundary operator:
∂x = overlay { a | a < x }
where the overlay is over all elements of the set, and a < b denotes strict containment, i.e.
a < b ⇔ a + b = b ∧ a ≠ b
First, let’s apply this definition to a few basic simplices:
∂a = empty
∂(ab) = a + b + empty = a + b
∂(abc) = ab + ac + bc + a + b + c + empty = ab + ac + bc
This looks very similar to the boundary operator from topology, e.g. the boundary of the filled-in triangle abc is the hollow triangle ab + ac + bc, and if we apply the boundary operator twice, the result is unchanged, i.e. ∂(ab + ac + bc) = ab + ac + bc.
Surprisingly, the boundary operator seems to satisfy the product rule for derivatives for non-empty a and b:
∂(ab) = ∂(a)b + a∂(b)
I’m not sure where this is going, but it’s cool. Perhaps, there is a link with derivatives of types?
Thanks to Dave Clarke who suggested looking at the boundary operator.
Further update: One problem with the above definition is that the sum rule for derivatives doesn’t hold, i.e. ∂(a + b) ≠ ∂(a) + ∂(b). Sjoerd Visscher suggested defining ∂ using the desired (usual) laws for derivatives:
∂(ab) = ∂(a)b + a∂(b)
∂(a + b) = ∂(a) + ∂(b)
Coupled with ∂(empty) = ∂(a) = empty (where a is a vertex), this leads to a different boundary operator, where the boundary of the filled-in triangle abc is the hollow triangle ab + ac + bc, and the boundary of the hollow triangle is simply the three underlying vertices a + b + c:
This definition of the boundary operator reduces the “dimension” of a united monoid expression by 1 (unless it is already empty).
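Here is a minimal sketch of this second boundary operator on simplicial complexes. It rests on my reading of the two laws above: applied to a single simplex with at least two vertices, they yield the overlay of all faces obtained by deleting one vertex, and a single vertex yields empty. The representation (a complex as a set of maximal simplices) and all names are assumptions made for this illustration and are independent of the library code shown earlier.

```haskell
import Data.Set (Set)
import qualified Data.Set as Set

-- A simplex is a set of vertices; a complex is a set of simplices.
type Simplex a = Set a
type Complex a = Set (Simplex a)

-- Keep only maximal simplices, i.e. drop proper faces of other simplices.
normalise :: Ord a => Complex a -> Complex a
normalise c = Set.filter maximal c
  where
    maximal s = not (any (s `Set.isProperSubsetOf`) (Set.toList c))

-- Faces obtained by deleting exactly one vertex.
-- A single vertex contributes nothing, matching ∂(a) = empty.
faces :: Ord a => Simplex a -> [Simplex a]
faces s
  | Set.size s < 2 = []
  | otherwise      = [ Set.delete v s | v <- Set.toList s ]

-- The sum rule lets us take the boundary simplex by simplex.
boundary :: Ord a => Complex a -> Complex a
boundary = normalise . Set.fromList . concatMap faces . Set.toList

-- Helper for writing examples: each inner list is one simplex.
complex :: Ord a => [[a]] -> Complex a
complex = normalise . Set.fromList . map Set.fromList
```

With vertices represented by characters, boundary (complex ["abc"]) equals complex ["ab", "ac", "bc"], and applying boundary once more gives complex ["a", "b", "c"], in line with the description above.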
Build systems, such as classic Make, are big, complicated, and used by every software developer on the planet. But they are a sadly unloved part of the software ecosystem, very much a means to an end, and seldom the focus of attention. Rarely do people ask questions like “What does it mean for my build system to be correct?” or “What are the trade-offs between different approaches?”. For years Make dominated, but more recently the challenges of scale have driven large software firms like Microsoft, Facebook and Google to develop their own build systems, exploring new points in the design space. In this paper we offer a general framework in which to understand and compare build systems, in a way that is both abstract (omitting incidental detail) and yet precise (implemented as Haskell code).
As one of our main contributions we identify two key design choices that are typically deeply wired into any build system: (i) the order in which tasks are built (the scheduling algorithm), and (ii) whether or not a task is (re-)built (the rebuilding strategy). These choices turn out to be orthogonal, which leads us to a new classification of the design space, as shown in the table below.
Rebuilding strategy      | Scheduling algorithm
                         | Topological | Restarting | Suspending
Dirty bit                | Make        | Excel      |
Verifying traces         | Ninja       |            | Shake
Constructive traces      | CloudBuild  | Bazel      | X
Deep constructive traces | Buck        |            | Nix
We can readily remix the ingredients to design new build systems with desired properties: the spot marked by X is particularly interesting since it combines the advantages of Shake and Bazel build systems. Neil is now working on implementing this new build system — Cloud Shake.
Read the paper, it’s fun. We even explain what Frankenbuilds are.
I often need a Haskell abstraction that supports conditions (like Monad) yet can still be statically analysed (like Applicative). In such cases people typically point to the Arrow class, more specifically ArrowChoice, but when I look it up, I find several type classes and a dozen methods. Impressive, categorical but also quite heavy. Is there a more lightweight approach? In this blog post I’ll explore what I call selective applicative functors, which extend the Applicative type class with a single method that makes it possible to be selective about effects.
Please meet Selective:
    class Applicative f => Selective f where
        handle :: f (Either a b) -> f (a -> b) -> f b
Think of handle as a selective function application: you apply a handler function of type a → b when given a value of type Left a, but can skip the handler (along with its effects) in the case of Right b. Intuitively, handle allows you to efficiently handle errors, i.e. perform the error-handling effects only when needed.
Note that you can write a function with this type signature using Applicative functors, but it will always execute the effect associated with the handler so it’s potentially less efficient:
    handleA :: Applicative f => f (Either a b) -> f (a -> b) -> f b
    handleA x f = (\e f -> either f id e) <$> x <*> f
Selective is more powerful^{(*)} than Applicative: you can recover the application operator <*> as follows (I’ll use the suffix S for Selective).
    apS :: Selective f => f (a -> b) -> f a -> f b
    apS f x = handle (Left <$> f) (flip ($) <$> x)
Here we tag a given function a → b as an error and turn a value of type a into an error-handling function ($a), which simply applies itself to the error a → b yielding b as desired. We will later define laws for the Selective type class which will ensure that apS is a legal application operator <*>, i.e. that it satisfies the laws of the Applicative type class.
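As a quick sanity check, here is a minimal self-contained sketch that compares apS with <*> on Maybe. The Selective Maybe instance below is written by hand for this example and is an assumption of the sketch; it simply skips the handler when given a Right.

```haskell
class Applicative f => Selective f where
  handle :: f (Either a b) -> f (a -> b) -> f b

-- A hand-written instance for this sketch: the handler is consulted
-- only when the first argument holds a Left.
instance Selective Maybe where
  handle (Just (Left  a)) f = ($ a) <$> f
  handle (Just (Right b)) _ = Just b
  handle Nothing          _ = Nothing

apS :: Selective f => f (a -> b) -> f a -> f b
apS f x = handle (Left <$> f) (flip ($) <$> x)

-- Compare apS with <*> on every combination of Nothing and Just
agreesWithAp :: Bool
agreesWithAp = and
  [ apS f x == (f <*> x)
  | f <- [Nothing, Just (+ 1)]
  , x <- [Nothing, Just (2 :: Int)] ]
```

In GHCi, apS (Just (+ 1)) (Just 2) gives Just 3 and agreesWithAp is True. This checks four cases for one instance, so it is an illustration rather than evidence that the Applicative laws hold in general.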
The select function is a natural generalisation of handle: instead of skipping one unnecessary effect, it selects which of the two given effectful functions to apply to a given value. It is possible to implement select in terms of handle, which is a good puzzle (try it!):
    select :: Selective f => f (Either a b) -> f (a -> c) -> f (b -> c) -> f c
    select = ... -- Try to figure out the implementation!
Finally, any Monad is Selective:
    handleM :: Monad f => f (Either a b) -> f (a -> b) -> f b
    handleM mx mf = do
        x <- mx
        case x of
            Left  a -> fmap ($a) mf
            Right b -> pure b
Selective functors are sufficient for implementing many conditional constructs, which traditionally require the (more powerful) Monad type class. For example:
    ifS :: Selective f => f Bool -> f a -> f a -> f a
    ifS i t e = select (bool (Right ()) (Left ()) <$> i) (const <$> t) (const <$> e)
Here we turn a Boolean value into Left () or Right () and then select an appropriate branch. Let’s try this function in a GHCi session:
    λ> ifS (odd . read <$> getLine) (putStrLn "Odd") (putStrLn "Even")
    0
    Even
    λ> ifS (odd . read <$> getLine) (putStrLn "Odd") (putStrLn "Even")
    1
    Odd
As desired, only one of the two effectful functions is executed. Note that here f = IO with the default selective instance: handle = handleM.
Using ifS as a building block, we can implement other useful functions:
    -- | Conditionally apply an effect.
    whenS :: Selective f => f Bool -> f () -> f ()
    whenS x act = ifS x act (pure ())

    -- | A lifted version of lazy Boolean OR.
    (<||>) :: Selective f => f Bool -> f Bool -> f Bool
    (<||>) a b = ifS a (pure True) b
See more examples in the repository. (Note: I recently renamed handle to select, and select to branch in the repository. Apologies for the confusion.)
Like applicative functors, selective functors can be analysed statically. As an example, consider the following useful data type Validation:
    data Validation e a = Failure e | Success a deriving (Functor, Show)

    instance Semigroup e => Applicative (Validation e) where
        pure = Success
        Failure e1 <*> Failure e2 = Failure (e1 <> e2)
        Failure e1 <*> Success _  = Failure e1
        Success _  <*> Failure e2 = Failure e2
        Success f  <*> Success a  = Success (f a)

    instance Semigroup e => Selective (Validation e) where
        handle (Success (Right b)) _ = Success b
        handle (Success (Left  a)) f = Success ($a) <*> f
        handle (Failure e        ) _ = Failure e
This data type is used for validating complex data: if reading one or more fields has failed, all errors are accumulated (using the operator <> from the semigroup e) to be reported together. By defining the Selective instance, we can now validate data with conditions. Below we define a function to construct a Shape (a Circle or a Rectangle) given a choice of the shape s :: f Bool and the shape’s parameters (Radius, Width and Height) in an arbitrary selective context f.
    type Radius = Int
    type Width  = Int
    type Height = Int

    data Shape = Circle Radius | Rectangle Width Height deriving Show

    shape :: Selective f => f Bool -> f Radius -> f Width -> f Height -> f Shape
    shape s r w h = ifS s (Circle <$> r) (Rectangle <$> w <*> h)
We choose f = Validation [String] to report the errors that occurred when reading values. Let’s see how it works.
    λ> shape (Success True) (Success 10) (Failure ["no width"]) (Failure ["no height"])
    Success (Circle 10)
    λ> shape (Success False) (Failure ["no radius"]) (Success 20) (Success 30)
    Success (Rectangle 20 30)
    λ> shape (Success False) (Failure ["no radius"]) (Success 20) (Failure ["no height"])
    Failure ["no height"]
    λ> shape (Success False) (Failure ["no radius"]) (Failure ["no width"]) (Failure ["no height"])
    Failure ["no width","no height"]
    λ> shape (Failure ["no choice"]) (Failure ["no radius"]) (Success 20) (Failure ["no height"])
    Failure ["no choice"]
In the last example, since we failed to parse which shape has been chosen, we do not report any subsequent errors. But it doesn’t mean we are short-circuiting the validation. We will continue accumulating errors as soon as we get out of the opaque conditional:
    twoShapes :: Selective f => f Shape -> f Shape -> f (Shape, Shape)
    twoShapes s1 s2 = (,) <$> s1 <*> s2

    λ> s1 = shape (Failure ["no choice 1"]) (Failure ["no radius 1"]) (Success 20) (Failure ["no height 1"])
    λ> s2 = shape (Success False) (Failure ["no radius 2"]) (Success 20) (Failure ["no height 2"])
    λ> twoShapes s1 s2
    Failure ["no choice 1","no height 2"]
Another example of static analysis of selective functors is the Task abstraction from the previous blog post.
    instance Monoid m => Selective (Const m) where
        handle = handleA

    type Task c k v = forall f. c f => (k -> f v) -> k -> Maybe (f v)

    dependencies :: Task Selective k v -> k -> [k]
    dependencies task key = case task (\k -> Const [k]) key of
        Nothing         -> []
        Just (Const ks) -> ks
The definition of the Selective instance for the Const functor simply falls back to the applicative handleA, which allows us to extract the static structure of any selective computation very similarly to how this is done with applicative computations. In particular, the function dependencies returns an approximation of dependencies of a given key: instead of ignoring opaque conditional statements as in Validation, we choose to inspect both branches collecting dependencies from both of them.
Here is an example from the Task blog post, where we used the Monad abstraction to express a spreadsheet with two formulas: B1 = IF(C1=1,B2,A2) and B2 = IF(C1=1,A1,B1).
    task :: Task Monad String Integer
    task fetch "B1" = Just $ do
        c1 <- fetch "C1"
        if c1 == 1 then fetch "B2" else fetch "A2"
    task fetch "B2" = Just $ do
        c1 <- fetch "C1"
        if c1 == 1 then fetch "A1" else fetch "B1"
    task _ _ = Nothing
Since this task description is monadic we could not analyse it statically. But now we can! All we need to do is rewrite it using Selective:
    task :: Task Selective String Integer
    task fetch "B1" = Just $ ifS ((1==) <$> fetch "C1") (fetch "B2") (fetch "A2")
    task fetch "B2" = Just $ ifS ((1==) <$> fetch "C1") (fetch "A1") (fetch "B1")
    task _     _    = Nothing
We can now apply the function dependencies defined above and draw the dependency graph using your favourite graph library:
    λ> dependencies task "B1"
    ["A2","B2","C1"]
    λ> dependencies task "B2"
    ["A1","B1","C1"]
    λ> dependencies task "A1"
    []
    λ> writeFile "task.dot" $ exportAsIs $ graph (dependencies task) "B1"
    λ> :! dot -Tsvg task.dot -o task.svg
This produces the graph below, which matches the one I had to draw manually last time, since I had no Selective to help me.
Instances of the Selective type class must satisfy a few laws to make it possible to refactor selective computations. These laws also allow us to establish a formal relation with the Applicative and Monad type classes. The laws are complex, but I couldn’t figure out how to simplify them. Please let me know if you find an improvement.
f <$> handle x y = handle (second f <$> x) ((f .) <$> y)
handle (first f <$> x) y = handle x ((. f) <$> y)
handle x (f <$> y) = handle (first (flip f) <$> x) (flip ($) <$> y)
handle x (pure y) = either y id <$> x
handle (pure (Left x)) y = ($x) <$> y
    handle x (handle y z) = handle (handle (f <$> x) (g <$> y)) (h <$> z)
      where
        f x = Right <$> x
        g y = \a -> bimap (,a) ($a) y
        h z = uncurry z

    -- or in operator form with (<*?) = handle
    x <*? (y <*? z) = (f <$> x) <*? (g <$> y) <*? (h <$> z)
Note that there is no law for handling a pure value, i.e. we do not require that the following holds:
handle (pure (Right x)) y = pure x
In particular, the following is allowed too:
handle (pure (Right x)) y = const x <$> y
We therefore allow handle to be selective about effects in this case. If we insisted on adding the first version of the above law, that would rule out the useful Const instance. If we insisted on the second version of the law, we would essentially be back to Applicative.
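The difference between the two permitted behaviours is easy to observe. Below is a minimal self-contained sketch with two instances: Const m, which falls back to the applicative handleA and therefore always records the handler’s effects, and Either e, which can inspect the value and skips the handler on Right. The Either e instance and the names skipped and notSkipped are written for this example only.

```haskell
import Data.Functor.Const (Const (..))

class Applicative f => Selective f where
  handle :: f (Either a b) -> f (a -> b) -> f b

-- Applicative-style handle: always executes the handler's effects.
handleA :: Applicative f => f (Either a b) -> f (a -> b) -> f b
handleA x f = (\e g -> either g id e) <$> x <*> f

-- Const holds no value to inspect, so it cannot skip anything.
instance Monoid m => Selective (Const m) where
  handle = handleA

-- Either e can look at the value and skips the handler on Right.
-- (Hand-written instance, assumed for this sketch.)
instance Selective (Either e) where
  handle (Left e)          _ = Left e
  handle (Right (Right b)) _ = Right b
  handle (Right (Left a))  f = ($ a) <$> f

-- The failing handler is never consulted.
skipped :: Either String Int
skipped = handle (pure (Right 1)) (Left "handler effect")

-- The handler's effect is recorded even though it is not needed.
notSkipped :: Const [String] Int
notSkipped = handle (pure (Right 1)) (Const ["handler effect"])
```

Here skipped evaluates to Right 1, whereas getConst notSkipped evaluates to ["handler effect"]. Both outcomes are allowed precisely because there is no law that fixes the result of handling a pure Right.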
A consequence of the above laws is that apS satisfies Applicative laws (I do not have a formal proof, but you can find some proof sketches here). Note that we choose not to require that apS = <*>, since this forbids some interesting instances, such as Validation defined above.
If f is also a Monad, we require that handle = handleM.
Using the laws, it is possible to rewrite any selective computation into a normal form (the operator + denotes the sum type constructor):
       f (a + b + ... + z)    -- An initial value of a sum type
    -> f (a -> (b + ... + z)) -- How to handle a's
    -> f (b -> (c + ... + z)) -- How to handle b's
    ...
    -> f (y -> z)             -- How to handle y's
    -> f z                    -- The result
In words, we start with a sum type and handle each alternative in turn, possibly skipping unnecessary handlers, until we end up with a resulting value.
There are other ways of expressing selective functors in Haskell and most of them are compositions of applicative functors and the Either monad. Below I list a few examples. All of them are required to perform effects from left to right.
    -- Composition of Applicative and Either monad
    class Applicative f => SelectiveA f where
        (|*|) :: f (Either e (a -> b)) -> f (Either e a) -> f (Either e b)

    -- Composition of Starry and Either monad
    -- See: https://duplode.github.io/posts/applicative-archery.html
    class Applicative f => SelectiveS f where
        (|.|) :: f (Either e (b -> c)) -> f (Either e (a -> b)) -> f (Either e (a -> c))

    -- Composition of Monoidal and Either monad
    -- See: http://blog.ezyang.com/2012/08/applicative-functors/
    class Applicative f => SelectiveM f where
        (|**|) :: f (Either e a) -> f (Either e b) -> f (Either e (a, b))
I believe these formulations are equivalent to Selective, but I have not proved the equivalence formally. I like the minimalistic definition of the type class based on handle, but the above alternatives are worth consideration too. In particular, SelectiveS has a much nicer associativity law, which is just (x |.| y) |.| z = x |.| (y |.| z).
Selective functors are powerful: like monads they allow us to inspect values in an effectful context. Many monadic computations can therefore be rewritten using the Selective type class. Many, but not all! Crucially, selective functors cannot implement the function join:
    join :: Selective f => f (f a) -> f a
    join = ... -- This puzzle has no solution, better solve 'select'!
I’ve been playing with selective functors for a few weeks, and I have to admit that they are very difficult to work with. Pretty much all selective combinators involve mind-bending manipulations of Lefts and Rights, with careful consideration of which effects are necessary. I hope all this complexity can be hidden in a library.
I haven’t yet looked into performance issues, but it is quite likely that it will be necessary to add more methods to the type class, so that their default implementations can be replaced with more efficient ones on an instance-by-instance basis (similar optimisations are done with Monad and Applicative).
Have you come across selective functors before? The definition of the type class is very simple, so somebody must have looked at it earlier.
Also, do you have any other interesting use-cases for selective functors?
Big thanks to Arseniy Alekseyev, Ulan Degenbaev and Georgy Lukyanov for useful discussions, which led to this blog post.
^{(*)} As rightly pointed out by Darwin226 in the reddit discussion, handle = handleA gives a valid Selective instance for any Applicative, therefore calling it less powerful may be questionable. However, I would like to claim that Selective does provide additional power: it gives us vocabulary to talk about unnecessary effects. We might want to be able to express three different ideas:
I think all three ideas are useful, and it is very interesting to study the stricter version of Selective too. I’d be interested in hearing suggestions for the corresponding set of laws. The following two laws seem sensible:
    handle (Left  <$> x) f = flip ($) <$> x <*> f
    handle (Right <$> x) f = x
(Update: the paper got accepted to ICFP! Read the PDF, watch the talk.)
In this blog post I would like to share one interesting abstraction that we came up with to describe build tasks:
    type Task c k v = forall f. c f => (k -> f v) -> k -> Maybe (f v)
A Task is completely isolated from the world of compilers, file systems, dependency graphs, caches, and all other complexities of real build systems. It just computes the value of a key k, in a side-effect-free way, using a callback of type k → f v to find the values of its dependencies. One simple example of a callback is Haskell’s readFile function: as one can see from its type FilePath → IO String, given a key (a file path k = FilePath) it can find its value (the file contents of type v = String) by performing arbitrary IO effects (hence, f = IO). We require task descriptions to be polymorphic in f, so that we can reuse them in different computational contexts f without rewriting from scratch.
This highly-abstracted type is best introduced by an example. Consider the following Excel spreadsheet (yes, Excel is a build system in disguise):
A1: 10    B1: A1 + A2
A2: 20    B2: B1 * 2
Here cell A1 contains the value 10, cell B1 contains the formula A1 + A2, etc. We can represent the formulae (i.e. build tasks) of this spreadsheet with the following task description:
sprsh1 :: Task Applicative String Integer
sprsh1 fetch "B1" = Just ((+) <$> fetch "A1" <*> fetch "A2")
sprsh1 fetch "B2" = Just ((*2) <$> fetch "B1")
sprsh1 _     _    = Nothing
We instantiate the type of keys k with String (cell names), and the type of values v with Integer (real spreadsheets contain a wider range of values, of course). The task description sprsh1 embodies all the formulae of the spreadsheet, but not the input values. Like every Task, sprsh1 is given a callback fetch and a key. It pattern-matches on the key to see if it has a task description (a formula) for it. If not, it returns Nothing, indicating that the key is an input. If there is a formula in the cell, it computes the value of the formula, using fetch to find the value of any keys on which it depends.
The definition of Task and the above example look a bit mysterious. Why do we require Task to be polymorphic in the type constructor f? Why do we choose the c = Applicative constraint? The answer is that given one task description, we would like to explore many different build systems that can build it and it turns out that each of them will use a different f. Furthermore, we found that constraints c classify build tasks in a very interesting way:
Now let’s look at some examples of what we can do with tasks.
Given a task, we can compute the value corresponding to a given key by providing a pure store function that associates keys to values:
compute :: Task Monad k v -> (k -> v) -> k -> Maybe v
compute task store = fmap runIdentity . task (Identity . store)
Here we do not need any effects in the fetch callback to task, so we can use the standard Haskell Identity monad (I first learned about this trivial monad from this blog post). The use of Identity just fixes the ‘impedance mismatch’ between the function store, which returns a pure value v, and the fetch argument of the task, which must return an f v for some f. To fix the mismatch, we wrap the result of store in the Identity monad: the function Identity . store has the type k → Identity v, and can now be passed to a task. The result comes as Maybe (Identity v), hence we now need to get rid of the Identity wrapper by applying runIdentity to the contents of Maybe.
In the GHCi session below we define a pure key/value store with A1 set to 10 and all other keys set to 20 and compute the values corresponding to keys A1 and B1 in the sprsh1 example:
λ> store key = if key == "A1" then 10 else 20
λ> compute sprsh1 store "A1"
Nothing
λ> compute sprsh1 store "B1"
Just 30
As expected, we get Nothing for an input key A1 and Just 30 for B1.
Notice that, even though compute takes a Task Monad as its argument, its application to a Task Applicative typechecks just fine. It feels a bit like sub-typing, but is actually just ordinary higher-rank polymorphism.
Now let’s look at a function that can only be applied to applicative tasks.
The formula A1 + A2 in the sprsh1 example statically depends on two keys: A1 and A2. Usually we would extract such static dependencies by looking at the syntax tree of the formula. But our Task abstraction has no such syntax tree. Yet, remarkably, we can use the polymorphism of a Task Applicative to find its dependencies. Here is the code:
dependencies :: Task Applicative k v -> k -> [k]
dependencies task key = case task (\k -> Const [k]) key of
    Nothing         -> []
    Just (Const ks) -> ks
Here Const is the standard Haskell Const functor. We instantiate f to Const [k]. So a value of type f v, or in this case Const [k] v, contains no value v, but does contain a list of keys of type [k] which we use to record dependencies. The fetch callback that we pass to task records a single dependency, and the standard Applicative instance for Const combines the dependencies from different parts of the task. Running the task with f = Const [k] will thus accumulate a list of the task’s dependencies – and that is just what dependencies does:
λ> dependencies sprsh1 "A1"
[]
λ> dependencies sprsh1 "B1"
["A1","A2"]
Notice that these calls to dependencies do no actual computation. They cannot: we are not supplying any input values. So, through the wonders of polymorphism, we are able to extract the dependencies of the spreadsheet formula, and to do so efficiently, simply by running its code in a different Applicative! This is not new, for example see this paper, but it is cool.
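To experiment with the two interpretations side by side, here is a minimal self-contained sketch that only needs base. It repeats the definitions of Task, sprsh1, compute and dependencies from above; the name exampleStore is an assumption introduced for illustration.

```haskell
{-# LANGUAGE ConstraintKinds, RankNTypes #-}
import Data.Functor.Const (Const (..))
import Data.Functor.Identity (Identity (..))

-- The task abstraction, as defined in the post.
type Task c k v = forall f. c f => (k -> f v) -> k -> Maybe (f v)

sprsh1 :: Task Applicative String Integer
sprsh1 fetch "B1" = Just ((+) <$> fetch "A1" <*> fetch "A2")
sprsh1 fetch "B2" = Just ((* 2) <$> fetch "B1")
sprsh1 _     _    = Nothing

-- f = Identity: run the task against a pure store.
compute :: Task Monad k v -> (k -> v) -> k -> Maybe v
compute task store = fmap runIdentity . task (Identity . store)

-- f = Const [k]: record the fetched keys, compute no values at all.
dependencies :: Task Applicative k v -> k -> [k]
dependencies task key = case task (\k -> Const [k]) key of
    Nothing         -> []
    Just (Const ks) -> ks

-- Hypothetical store for illustration: A1 is 10, every other key is 20.
exampleStore :: String -> Integer
exampleStore key = if key == "A1" then 10 else 20
```

Loading this into GHCi, compute sprsh1 exampleStore "B1" evaluates to Just 30 and dependencies sprsh1 "B2" evaluates to ["B1"]; the same sprsh1 is reused unchanged in both contexts.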
Some build tasks have dynamic dependencies, which are determined by values of intermediate computations. Such tasks correspond to the type Task Monad k v. Consider this spreadsheet example:
A1: 10    B1: IF(C1=1,B2,A2)    C1: 1
A2: 20    B2: IF(C1=1,A1,B1)
Note that B1 and B2 statically form a dependency cycle, but Excel (which uses dynamic dependencies) is perfectly happy. The diagram below illustrates how cyclic dependencies are resolved when projecting them on conditions C1=1 and C1=2 (rectangles and rounded rectangles denote inputs and outputs, respectively). Incidentally, my PhD thesis was about a mathematical model for such conditional dependency graphs, which was later developed into an algebra of graphs.
We can express this spreadsheet using our task abstraction as:
sprsh2 :: Task Monad String Integer
sprsh2 fetch "B1" = Just $ do
    c1 <- fetch "C1"
    if c1 == 1 then fetch "B2" else fetch "A2"
sprsh2 fetch "B2" = Just $ do
    c1 <- fetch "C1"
    if c1 == 1 then fetch "A1" else fetch "B1"
sprsh2 _ _ = Nothing
The big difference compared to sprsh1 is that the computation now takes place in a Monad, which allows us to extract the value of C1 and fetch different keys depending on whether or not C1 = 1.
We cannot find dependencies of monadic tasks statically; notice that the application of the function dependencies to sprsh2 will not typecheck. We need to run a monadic task with concrete values that will determine the discovered dependencies. Thus, we introduce the function track: a combination of compute and dependencies that computes both the resulting value and the list of its dependencies in an arbitrary monadic context m:
track :: Monad m => Task Monad k v -> (k -> m v) -> k -> Maybe (m (v, [k]))
track task fetch = fmap runWriterT . task trackingFetch
  where
    trackingFetch :: k -> WriterT [k] m v
    trackingFetch k = tell [k] >> lift (fetch k)
We use the standard^{(*)} Haskell WriterT monad transformer to record additional information — a list of keys [k] — when computing a task in an arbitrary monad m. We substitute the given fetch with a trackingFetch that, in addition to fetching a value, tracks the corresponding key. The task returns the value of type Maybe (WriterT [k] m v), which we unwrap by applying runWriterT to the contents of Maybe. Below we give an example of tracking monadic tasks when m = IO:
λ> fetchIO k = do putStr (k ++ ": "); read <$> getLine
λ> fromJust $ track sprsh2 fetchIO "B1"
C1: 1
B2: 10
(10,["C1","B2"])
λ> fromJust $ track sprsh2 fetchIO "B1"
C1: 2
A2: 20
(20,["C1","A2"])
As expected, the dependencies of cell B1 from sprsh2 are determined by the value of C1, which in this case is obtained by reading from the standard input via the fetchIO callback.
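The same idea can be tried without any interactive input. The sketch below is an assumption-laden variant of track: instead of WriterT it uses a small hand-rolled wrapper, here called Tracked (a hypothetical name), so that the example depends only on base, and it runs the task in the Identity monad against a pure store (storeWith is also hypothetical).

```haskell
{-# LANGUAGE ConstraintKinds, RankNTypes #-}
import Data.Functor.Identity (Identity (..))

type Task c k v = forall f. c f => (k -> f v) -> k -> Maybe (f v)

-- Hypothetical stand-in for WriterT [k] m: a computation in m that also
-- returns the list of keys fetched so far.
newtype Tracked k m a = Tracked { runTracked :: m (a, [k]) }

instance Functor m => Functor (Tracked k m) where
    fmap f (Tracked m) = Tracked (fmap (\(a, ks) -> (f a, ks)) m)

instance Monad m => Applicative (Tracked k m) where
    pure a = Tracked (pure (a, []))
    Tracked mf <*> Tracked ma = Tracked $ do
        (f, ks1) <- mf
        (a, ks2) <- ma
        pure (f a, ks1 ++ ks2)

instance Monad m => Monad (Tracked k m) where
    Tracked m >>= f = Tracked $ do
        (a, ks1) <- m
        (b, ks2) <- runTracked (f a)
        pure (b, ks1 ++ ks2)

-- Same shape as the track function above, with Tracked in place of WriterT.
track :: Monad m => Task Monad k v -> (k -> m v) -> k -> Maybe (m (v, [k]))
track task fetch = fmap runTracked . task trackingFetch
  where
    trackingFetch k = Tracked (fmap (\v -> (v, [k])) (fetch k))

sprsh2 :: Task Monad String Integer
sprsh2 fetch "B1" = Just $ do
    c1 <- fetch "C1"
    if c1 == 1 then fetch "B2" else fetch "A2"
sprsh2 fetch "B2" = Just $ do
    c1 <- fetch "C1"
    if c1 == 1 then fetch "A1" else fetch "B1"
sprsh2 _ _ = Nothing

-- Hypothetical pure store, parameterised by the value of C1.
storeWith :: Integer -> String -> Integer
storeWith c1 key = case key of
    "C1" -> c1
    "A1" -> 10
    "A2" -> 20
    "B2" -> 10
    _    -> 0

-- Track B1 in the Identity monad for a given value of C1.
trackB1 :: Integer -> Maybe (Integer, [String])
trackB1 c1 = runIdentity <$> track sprsh2 (Identity . storeWith c1) "B1"
```

With these definitions trackB1 1 gives Just (10,["C1","B2"]) and trackB1 2 gives Just (20,["C1","A2"]), mirroring the interactive session.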
Given a task description, a target key, and a store, a build system returns a new store in which the values of the target key and all its dependencies are up to date. What does “up to date” mean? The paper answers that in a formal way.
The three functions described above (compute, dependencies and track) are sufficient for defining the correctness of build systems as well as for implementing a few existing build systems at a conceptual level. Below is an example of a very simple (but inefficient) build system:
busy :: Eq k => Task Monad k v -> k -> Store k v -> Store k v
busy task key store = execState (fetch key) store
  where
    fetch :: k -> State (Store k v) v
    fetch k = case task fetch k of
        Nothing  -> gets (getValue k)
        Just act -> do v <- act; modify (putValue k v); return v
Here Store k v is an abstract store datatype equipped with getValue and putValue functions. The busy build system defines the callback fetch so that, when given a target key, it brings the key up to date in the store, and returns its value. The function fetch runs in the standard Haskell State monad, initialised with the incoming store by execState. To bring a key k up to date, fetch asks the task description task how to compute k. If task returns Nothing the key is an input, so fetch simply reads the result from the store. Otherwise fetch runs the action act returned by the task to produce a resulting value v, records the new key/value mapping in the store, and returns v. Notice that fetch passes itself to task as an argument, so that the latter can use fetch to recursively find the values of k’s dependencies.
Given an acyclic task description, the busy build system terminates with a correct result, but it is not a minimal build system: it doesn’t keep track of keys it has already built, and will therefore busily recompute the same keys again and again if they have multiple dependants. See the paper for implementations of much more efficient build systems.
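For readers who want to run busy, here is a rough self-contained sketch. It makes two assumptions that are not in the post: the store is modelled as a plain function k -> v, and the State monad is hand-rolled so that only base is required. A real implementation would more likely use Control.Monad.State and a map-based store.

```haskell
{-# LANGUAGE ConstraintKinds, RankNTypes #-}

type Task c k v = forall f. c f => (k -> f v) -> k -> Maybe (f v)

-- A minimal State monad (assumption: stands in for the standard one).
newtype State s a = State { runState :: s -> (a, s) }

instance Functor (State s) where
    fmap f (State g) = State $ \s -> let (a, s') = g s in (f a, s')

instance Applicative (State s) where
    pure a = State $ \s -> (a, s)
    State f <*> State g = State $ \s ->
        let (h, s')  = f s
            (a, s'') = g s'
        in  (h a, s'')

instance Monad (State s) where
    State g >>= f = State $ \s -> let (a, s') = g s in runState (f a) s'

-- Assumption: the abstract store is just a function from keys to values.
type Store k v = k -> v

getValue :: k -> Store k v -> v
getValue k store = store k

putValue :: Eq k => k -> v -> Store k v -> Store k v
putValue k v store = \k' -> if k' == k then v else store k'

-- The busy build system: recompute every non-input key on every fetch.
busy :: Eq k => Task Monad k v -> k -> Store k v -> Store k v
busy task key store = snd (runState (fetch key) store)
  where
    fetch k = case task fetch k of
        Nothing  -> State (\s -> (getValue k s, s))
        Just act -> do
            v <- act
            State (\s -> ((), putValue k v s))
            return v

sprsh1 :: Task Applicative String Integer
sprsh1 fetch "B1" = Just ((+) <$> fetch "A1" <*> fetch "A2")
sprsh1 fetch "B2" = Just ((* 2) <$> fetch "B1")
sprsh1 _     _    = Nothing

-- Initial store: A1 is 10, everything else (including stale B1, B2) is 20.
initialStore :: Store String Integer
initialStore key = if key == "A1" then 10 else 20

result :: Store String Integer
result = busy sprsh1 "B2" initialStore
```

After the build, result "B1" is 30 and result "B2" is 60, while the inputs are left untouched.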
We have already used a few cool Haskell types — Identity, Const, WriterT and State — to manipulate our Task abstraction. Let’s meet a few other members of the cool-types family: Proxy, ReaderT, MaybeT and ExceptT.
The Proxy data type allows us to check whether a key is an input without providing a fetch callback:
isInput :: Task Monad k v -> k -> Bool
isInput task = isNothing . task (const Proxy)
This works similarly to the dependencies function, but in this case we do not even need to record any additional information, thus we can replace Const with Proxy.
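A small self-contained check of isInput, using only base; the definitions of Task and sprsh1 are repeated from above so that the snippet can be loaded on its own.

```haskell
{-# LANGUAGE ConstraintKinds, RankNTypes #-}
import Data.Maybe (isNothing)
import Data.Proxy (Proxy (..))

type Task c k v = forall f. c f => (k -> f v) -> k -> Maybe (f v)

sprsh1 :: Task Applicative String Integer
sprsh1 fetch "B1" = Just ((+) <$> fetch "A1" <*> fetch "A2")
sprsh1 fetch "B2" = Just ((* 2) <$> fetch "B1")
sprsh1 _     _    = Nothing

-- f = Proxy: the callback carries no information at all, we only look at
-- whether the task returns Nothing (an input) or Just (a formula).
isInput :: Task Monad k v -> k -> Bool
isInput task = isNothing . task (const Proxy)
```

Here isInput sprsh1 "A1" is True and isInput sprsh1 "B1" is False.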
One might wonder: if we do not need the fetch callback in case of input, can we rewrite our Task abstraction as follows?
type Task2 c k v = forall f. c f => k -> Maybe ((k -> f v) -> f v)
Yes, we can! This definition is isomorphic to Task. This isn’t immediately obvious, so below is a proof. I confess: it took me a while to find it.
toTask :: Task2 Monad k v -> Task Monad k v
toTask task2 fetch key = ($ fetch) <$> task2 key

fromTask :: Task Monad k v -> Task2 Monad k v
fromTask task key = runReaderT <$> task (\k -> ReaderT ($ k)) key
The toTask conversion is relatively straightforward, but fromTask is not: it uses a ReaderT monad transformer to supply the fetch callback as the computation environment, extracting the final value with runReaderT.
Our task abstraction operates on pure values and has no mechanism for exception handling. It turns out that it is easy to turn any Task into a task that can handle arbitrary exceptions occurring in the fetch callback:
exceptional :: Task Monad k v -> Task Monad k (Either e v)
exceptional task fetch = fmap runExceptT . task (ExceptT . fetch)
The exceptional task transformer simply hides exceptions of the given fetch of type k → f (Either e v) by using the standard ExceptT monad transformer, passes the resulting fetch callback of type k → ExceptT e f v to the original task, and propagates the exceptions by runExceptT. Using MaybeT, one can also implement a similar task transformer that turns a Task Monad k v into its partial version Task Monad k (Maybe v).
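One possible shape of the MaybeT-based transformer is sketched below. The name partial, the association list cells and the callback fetchPartial are assumptions made for illustration; MaybeT comes from Control.Monad.Trans.Maybe in the transformers package.

```haskell
{-# LANGUAGE ConstraintKinds, RankNTypes #-}
import Control.Monad.Trans.Maybe (MaybeT (..))
import Data.Functor.Identity (Identity (..))

type Task c k v = forall f. c f => (k -> f v) -> k -> Maybe (f v)

-- Hypothetical name: turn a task into one whose fetch callback may fail.
-- A single failed fetch makes the whole computation return Nothing.
partial :: Task Monad k v -> Task Monad k (Maybe v)
partial task fetch = fmap runMaybeT . task (MaybeT . fetch)

sprsh1 :: Task Applicative String Integer
sprsh1 fetch "B1" = Just ((+) <$> fetch "A1" <*> fetch "A2")
sprsh1 fetch "B2" = Just ((* 2) <$> fetch "B1")
sprsh1 _     _    = Nothing

-- Hypothetical inputs: only A1 and A2 have values.
cells :: [(String, Integer)]
cells = [("A1", 10), ("A2", 20)]

-- A callback in the Identity monad that reports missing keys as Nothing.
fetchPartial :: String -> Identity (Maybe Integer)
fetchPartial k = Identity (lookup k cells)

-- Run the partial version of sprsh1 for a given key.
runPartial :: String -> Maybe (Maybe Integer)
runPartial key = runIdentity <$> partial sprsh1 fetchPartial key
```

In this setup runPartial "B1" is Just (Just 30), whereas runPartial "B2" is Just Nothing because B1 is not among the available cells.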
Our final exercise is to extract all possible computation results of a non-deterministic task, e.g. B1 = A1 + RANDBETWEEN(1,2) that can be described as a Task Alternative:
sprsh3 :: Task Alternative String Integer
sprsh3 fetch "B1" = Just $ (+) <$> fetch "A1" <*> (pure 1 <|> pure 2)
sprsh3 _     _    = Nothing
We therefore introduce the function computeND that returns the list of all possible results of the task instead of just one value (‘ND’ stands for ‘non-deterministic’):
computeND :: Task MonadPlus k v -> (k -> v) -> k -> Maybe [v]
computeND task store = task (return . store)
The implementation is almost straightforward: we choose f = [] reusing the standard MonadPlus instance for lists. Let’s give it a try:
λ> store key = if key == "A1" then 10 else 20
λ> computeND sprsh3 store "A1"
Nothing
λ> computeND sprsh3 store "B1"
Just [11,12]
λ> computeND sprsh1 store "B1"
Just [30]
Notice that we can apply computeND to both non-deterministic (sprsh3) as well as deterministic (sprsh1) task descriptions.
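As a further illustration of reusing one task description in several contexts, the sketch below runs sprsh3 both with f = [] and with f = Maybe. The function computeFirst is a hypothetical addition, not part of the post: with Maybe, the choice operator keeps the first successful alternative.

```haskell
{-# LANGUAGE ConstraintKinds, RankNTypes #-}
import Control.Applicative (Alternative (..))
import Control.Monad (MonadPlus)

type Task c k v = forall f. c f => (k -> f v) -> k -> Maybe (f v)

sprsh3 :: Task Alternative String Integer
sprsh3 fetch "B1" = Just $ (+) <$> fetch "A1" <*> (pure 1 <|> pure 2)
sprsh3 _     _    = Nothing

-- f = []: collect all possible results.
computeND :: Task MonadPlus k v -> (k -> v) -> k -> Maybe [v]
computeND task store = task (return . store)

-- f = Maybe (hypothetical variant): keep only the first alternative.
computeFirst :: Task MonadPlus k v -> (k -> v) -> k -> Maybe (Maybe v)
computeFirst task store = task (Just . store)

exampleStore :: String -> Integer
exampleStore key = if key == "A1" then 10 else 20
```

Here computeND sprsh3 exampleStore "B1" is Just [11,12], while computeFirst sprsh3 exampleStore "B1" is Just (Just 11).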
Non-deterministic tasks are interesting because they allow one to try different algorithms to compute a value in parallel and grab the first available result — a good example is portfolio-based parallel SAT solvers. This shouldn’t be confused with a deterministic composition of tasks, which is also a useful operation, but does not involve any non-determinism:
compose :: Task Monad k v -> Task Monad k v -> Task Monad k v
compose t1 t2 fetch key = t1 fetch key <|> t2 fetch key
Here we simply compose two task descriptions, picking the first one that knows how to compute a given key. Together with the trivial task that returns Nothing for all keys, this gives rise to the Task monoid.
We introduced the task abstraction to study build systems, but it seems to be linked to a few other topics, such as memoization, self-adjusting computation, lenses and profunctor optics, propagators and what not.
Have we just reinvented the wheel? It might seem so, especially if you look at these type signatures from the lens library:
type Lens s t a b = forall f. Functor f => (a -> f b) -> s -> f t
type Traversal s t a b = forall f. Applicative f => (a -> f b) -> s -> f t
Our implementations of functions like dependencies are heavily inspired by — or to be more accurate — stolen from the lens library. Alas, we have been unable to remove the Maybe used to encode whether a key is an input, without complicating other aspects of our definition.
The task abstraction can be used to express pure functions in a way that is convenient for their memoization. Here is an example of encoding one of the most favourite functions of functional programmers:
fibonacci :: Task Applicative Integer Integer
fibonacci fetch n
    | n >= 2    = Just $ (+) <$> fetch (n-1) <*> fetch (n-2)
    | otherwise = Nothing
Here the keys n < 2 are input parameters, and one can obtain the usual Fibonacci sequence by picking 0 and 1 for n = 0 and n = 1, respectively. Any minimal build system will compute the sequence with memoization, i.e. without recomputing the same value twice.
Interestingly, the Ackermann function — a famous example of a function that is not primitive recursive — can’t be expressed as a Task Applicative, since it needs to perform an intermediate recursive call to determine one of its dependencies:
ackermann :: Task Monad (Integer, Integer) Integer
ackermann fetch (m, n)
    | m < 0 || n < 0 = Nothing
    | m == 0    = Just $ pure (n + 1)
    | n == 0    = Just $ fetch (m - 1, 1)
    | otherwise = Just $ do
        index <- fetch (m, n - 1)
        fetch (m - 1, index)
Now that we’ve seen examples of applicative and monadic tasks, let us finish with an example of a functorial task — the Collatz sequence:
collatz :: Task Functor Integer Integer
collatz fetch n
    | n <= 0    = Nothing
    | otherwise = Just $ f <$> fetch (n - 1)
  where
    f k | even k    = k `div` 2
        | otherwise = 3 * k + 1
So here is a claim: given a Task, we can memoize, self-adjust, propagate and probably do any other conceivable computation on it simply by picking the right build system!
Sjoerd Visscher’s comment (below) pointed out that the fetch callback is defined to be total: it has type k → f v and returns a value for every key. It may be useful to allow it to fail for some keys. I know of three ways of modelling failure using the Task abstraction:
(1) Include failures into the type of values v, for example:
data Value = FileNotFound | FileContents ByteString
This is convenient if tasks are aware of failures. For example, a task may be able to cope with missing files, e.g. if fetch “username.txt” returns FileNotFound, the task could use the literal string “User” as a default value. In this case it will depend on the fact that the file username.txt is missing, and will need to be rebuilt if the user later creates this file.
In many cases this approach is isomorphic to choosing v = Either e v’.
(2) Include failures into the computation context f, for example:
cells :: Map String Integer
cells = Map.fromList [("A1", 10), ("A2", 20)]

fetch :: String -> Maybe Integer
fetch k = Map.lookup k cells
We are choosing f = Maybe and thanks to the polymorphism of Task, any task can be executed in this context without any changes. For example, sprsh1 fetch “B1” now returns Just (Just 30), but sprsh1 fetch “B2” fails with Just Nothing.
This is convenient if tasks are not aware of failures, e.g. we can model Excel formulas as pure arithmetic functions, and introduce failures “for free” if/when needed by instantiating Task with an appropriate f. Also see the function exceptional defined above, which allows us to add arbitrary exceptions to a failure-free context f.
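Approach (2) can be reproduced with base alone. In the sketch below an association list stands in for the Map used above (an assumption made to avoid extra dependencies), and the callback is renamed fetchCell.

```haskell
{-# LANGUAGE ConstraintKinds, RankNTypes #-}

type Task c k v = forall f. c f => (k -> f v) -> k -> Maybe (f v)

sprsh1 :: Task Applicative String Integer
sprsh1 fetch "B1" = Just ((+) <$> fetch "A1" <*> fetch "A2")
sprsh1 fetch "B2" = Just ((* 2) <$> fetch "B1")
sprsh1 _     _    = Nothing

-- Input cells only; an association list replaces the Map from the post.
cells :: [(String, Integer)]
cells = [("A1", 10), ("A2", 20)]

-- A partial callback: f = Maybe, so a missing key yields Nothing.
fetchCell :: String -> Maybe Integer
fetchCell k = lookup k cells

b1, b2 :: Maybe (Maybe Integer)
b1 = sprsh1 fetchCell "B1"  -- both dependencies are present
b2 = sprsh1 fetchCell "B2"  -- depends on B1, which is not in cells
```

The outer Maybe says whether the key has a formula, and the inner Maybe says whether all fetches succeeded: b1 is Just (Just 30) and b2 is Just Nothing.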
(3) Finally, the task itself might not want to encode failures into the type of values v, but instead demand that f has a built-in notion of failures. This can be done by choosing a suitable constraint c, such as Alternative, MonadPlus or even better something specific to failures e.g. MonadZero or MonadFail. Then both the callback and the task can reuse the same failure mechanism as shown below:
class Monad m => MonadFail m where
    fail :: String -> m a

sprsh4 :: Task MonadFail String Integer
sprsh4 fetch "B1" = Just $ do
    a1 <- fetch "A1"
    a2 <- fetch "A2"
    if a2 == 0 then fail "division by 0" else return (a1 `div` a2)
sprsh4 _ _ = Nothing
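To try approach (3) with a recent GHC, one can rely on the MonadFail class that the Prelude exports (assuming base 4.13 or later) instead of declaring a new class. The sketch below runs sprsh4 with f = Maybe, where fail simply produces Nothing; fetchFrom is a hypothetical helper.

```haskell
{-# LANGUAGE ConstraintKinds, RankNTypes #-}

type Task c k v = forall f. c f => (k -> f v) -> k -> Maybe (f v)

-- Uses the Prelude's MonadFail rather than a locally declared class.
sprsh4 :: Task MonadFail String Integer
sprsh4 fetch "B1" = Just $ do
    a1 <- fetch "A1"
    a2 <- fetch "A2"
    if a2 == 0 then fail "division by 0" else return (a1 `div` a2)
sprsh4 _ _ = Nothing

-- Hypothetical helper: a callback backed by an association list, f = Maybe.
fetchFrom :: [(String, Integer)] -> String -> Maybe Integer
fetchFrom cells k = lookup k cells

okResult, divByZero, missingInput :: Maybe (Maybe Integer)
okResult     = sprsh4 (fetchFrom [("A1", 10), ("A2", 2)]) "B1"
divByZero    = sprsh4 (fetchFrom [("A1", 10), ("A2", 0)]) "B1"
missingInput = sprsh4 (fetchFrom [("A1", 10)]) "B1"
```

Both the failing callback (missingInput) and the failing task (divByZero) end up as Just Nothing, while okResult is Just (Just 5); the two kinds of failure share one mechanism.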
Are there any other types of failure that are not covered above?
^{(*)} Beware: as of writing, standard WriterT transformers have a space leak which may be an issue if a task has many dependencies. You might want to consider using a more efficient CPS-based WriterT transformer.
Jakob, Georgy and I have just submitted a conference paper describing the REDFIN core and the verification framework. Please have a look and let us know what you think. This will be a timely read after yesterday’s exciting SpaceX launch.
As we all know, writing programs is easy. Writing correct programs on the other hand is very hard. And if you are writing a program for a space mission you better be sure it is correct. So, what can you realistically do? Of course you should write and/or generate a lot of tests, but tests do not provide the full correctness guarantee. You can also use a strongly typed programming language and prove some properties at compile time, but your space-grade processor is highly unlikely to be a well-supported compilation target (and you might not be able to take your favourite garbage collector to space — it’s way too heavy!). You could also use formal methods and develop programs with the help of a theorem prover, eventually refining an abstract specification down to the level of bare metal. That might take you years though, especially if you have no previous experience in formal methods.
When I was working on my PhD I did some work on formal specification of processors and instruction sets, so this is a long-time interest of mine. Hence, when I heard about the REDFIN project I immediately invited myself to RUAG Space Austria and tried to figure out a way to engineer a simple solution that can be integrated into the existing development and verification workflow without a big disruption and a steep learning curve. Eventually I used Haskell to implement a small DSL for capturing the semantics of REDFIN programs and connected it to an SMT solver using SBV, a wonderful symbolic verification library developed by Levent Erkok (huge thanks!). This idea is not new and we reference a few early papers that developed and applied it to Arm and Intel processors. I also got a lot of inspiration from this blog post by Stephen Diehl and this talk by Tikhon Jelvis. Thank you Stephen and Tikhon!
P.S.: The code is not yet available, but I hope we’ll release it soon.
Want to try? Check out the GHC repository and run hadrian/build.sh -j or hadrian/build.bat -j on Windows and it should build you a GHC binary. In case of problems, have a look at the README and/or raise an issue.
Here is a quick update on the on-going development:
ghc-cabal utility, reorganise the build tree, and make numerous other improvements to Hadrian.
I can’t believe that we seem to be approaching the finish line! It’s been a long, tedious but also interesting project. Thank you all for helping us get this far, and I hope we’ll celebrate the switch from Make to Hadrian soon.
Update: This series of blog posts was published as a functional pearl at the Haskell Symposium 2017.
One of the simplest transformations one can apply to a graph is to flip the direction of all of its edges. It’s usually straightforward to implement but whatever data structure you use to represent graphs, you will spend at least O(1) time to modify it (say, by flipping the treatAsTransposed flag); much more often you will have to traverse the data structure and flip every edge, resulting in O(|V|+|E|) time complexity. What if I told you that by using Haskell’s type system, we can transpose polymorphic graphs in zero time? Sounds suspicious? Let’s see how this works.
Consider the following Graph instance:
newtype Transpose g = T { transpose :: g }

instance Graph g => Graph (Transpose g) where
    type Vertex (Transpose g) = Vertex g
    empty       = T empty
    vertex      = T . vertex
    overlay x y = T $ transpose x `overlay` transpose y
    connect x y = T $ transpose y `connect` transpose x -- flip
We wrap a graph in a newtype flipping the order of arguments to connect. Let’s check if this works:
λ> edgeList $ 1 * (2 + 3) * 4
[(1,2),(1,3),(1,4),(2,4),(3,4)]
λ> edgeList $ transpose $ 1 * (2 + 3) * 4
[(2,1),(3,1),(4,1),(4,2),(4,3)]
Cool! And this has zero runtime cost, because all we do is wrapping and unwrapping the newtype, which is guaranteed to be free. As an exercise, verify that transpose is an antihomomorphism on graphs, that is:
Furthermore, transpose is its own inverse: transpose . transpose = id.
To make sure transpose is only applied to polymorphic graphs, we do not export the constructor T, therefore the only way to call transpose is to give it a polymorphic argument and let the type inference interpret it as a value of type Transpose. The type signature is a little unsatisfying though:
λ> :t transpose
transpose :: Transpose g -> g
It’s not clear at all from the type that the function operates on graphs. Do you have any ideas how to improve it?
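To play with transpose outside the library, here is a self-contained sketch. The Graph class follows the shape used in this series, but the concrete Relation instance (sorted, duplicate-free vertex and edge lists) is an assumption introduced for this example and is much simpler than the instances shipped with alga. There is no Num instance here, so the example graph is written with vertex, overlay and connect directly.

```haskell
{-# LANGUAGE TypeFamilies #-}
import Data.List (nub, sort)

class Graph g where
    type Vertex g
    empty   :: g
    vertex  :: Vertex g -> g
    overlay :: g -> g -> g
    connect :: g -> g -> g

-- Hypothetical concrete representation used only for this example.
data Relation a = Relation { vertexList :: [a], edgeList :: [(a, a)] }
    deriving (Eq, Show)

normalise :: Ord a => [a] -> [a]
normalise = nub . sort

instance Ord a => Graph (Relation a) where
    type Vertex (Relation a) = a
    empty    = Relation [] []
    vertex x = Relation [x] []
    overlay (Relation v1 e1) (Relation v2 e2) =
        Relation (normalise (v1 ++ v2)) (normalise (e1 ++ e2))
    connect (Relation v1 e1) (Relation v2 e2) =
        Relation (normalise (v1 ++ v2))
                 (normalise (e1 ++ e2 ++ [ (x, y) | x <- v1, y <- v2 ]))

newtype Transpose g = T { transpose :: g }

instance Graph g => Graph (Transpose g) where
    type Vertex (Transpose g) = Vertex g
    empty       = T empty
    vertex      = T . vertex
    overlay x y = T (transpose x `overlay` transpose y)
    connect x y = T (transpose y `connect` transpose x) -- flipped

-- The polymorphic graph 1 * (2 + 3) * 4.
example :: (Graph g, Vertex g ~ Int) => g
example = vertex 1 `connect` (vertex 2 `overlay` vertex 3) `connect` vertex 4

original, flipped :: Relation Int
original = example
flipped  = transpose example
```

The same polymorphic expression is interpreted twice: edgeList original is [(1,2),(1,3),(1,4),(2,4),(3,4)] and edgeList flipped is [(2,1),(3,1),(4,1),(4,2),(4,3)].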
Here is a puzzle for you: can you implement a function gmap that given a function a -> b and a polymorphic graph whose vertices are of type a will produce a polymorphic graph with vertices of type b by applying the function to each vertex? Yes, this is almost a Functor but it doesn’t have the usual type signature, because Graph is not a higher-kinded type.
My solution is as follows, but I feel there may be simpler ones:
newtype GraphFunctor a =
    GF { gfor :: forall g. Graph g => (a -> Vertex g) -> g }

instance Graph (GraphFunctor a) where
    type Vertex (GraphFunctor a) = a
    empty       = GF $ \_ -> empty
    vertex x    = GF $ \f -> vertex (f x)
    overlay x y = GF $ \f -> gmap f x `overlay` gmap f y
    connect x y = GF $ \f -> gmap f x `connect` gmap f y

gmap :: Graph g => (a -> Vertex g) -> GraphFunctor a -> g
gmap = flip gfor
Essentially, we are defining another newtype wrapper, which pushes the given function all the way towards the vertices. This has no runtime cost, just as before, although the actual evaluation of the given function at each vertex will not be free, of course. Let’s test this!
λ> adjacencyList $ 1 * 2 * 3 + 4 * 5
[(1,[2,3]),(2,[3]),(3,[]),(4,[5]),(5,[])]
λ> :t gmap (+1) $ 1 * 2 * 3 + 4 * 5
gmap (+1) $ 1 * 2 * 3 + 4 * 5 :: (Graph g, Num (Vertex g)) => g
λ> adjacencyList $ gmap (+1) $ 1 * 2 * 3 + 4 * 5
[(2,[3,4]),(3,[4]),(4,[]),(5,[6]),(6,[])]
As you can see, we can increment the value of each vertex by mapping function (+1) over the graph. The resulting expression is a polymorphic graph, as desired. Again, we’ve done some useful work without turning the graph into a concrete data structure. As an exercise, show that gmap satisfies the functor laws: gmap id = id and gmap f . gmap g = gmap (f . g). A useful first step is to prove that mapping a function is a homomorphism:
An alert reader might wonder: what happens if the function maps two original vertices into the same one? They will be merged! Merging graph vertices is a useful function, so let’s define it in terms of gmap:
mergeVertices :: Graph g => (Vertex g -> Bool) -> Vertex g -> GraphFunctor (Vertex g) -> g
mergeVertices p v = gmap $ \u -> if p u then v else u

λ> adjacencyList $ mergeVertices odd 3 $ 1 * 2 * 3 + 4 * 5
[(2,[3]),(3,[2,3]),(4,[3])]
The function takes a predicate on graph vertices and a target vertex and maps all vertices satisfying the predicate into the target vertex, thereby merging them. In our example all odd vertices {1, 3, 5} are merged into 3, in particular creating the self-loop 3 → 3. Note: it takes linear time O(|g|) for mergeVertices to apply the predicate to each vertex (|g| is the size of the expression g), which may be much more efficient than merging vertices in a concrete data structure; for example, if the graph is represented by an adjacency matrix, it will likely be necessary to rebuild the resulting matrix from scratch, which takes O(|V|^2) time. Since for many graphs we have |g| = O(|V|), the matrix-based mergeVertices will run in O(|g|^2).
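Here is a self-contained sketch of gmap and mergeVertices that can be compiled on its own. As an assumption for illustration, it interprets graphs with a simple Relation type (sorted vertex and edge lists) rather than one of the alga instances, and it writes gmap with explicit arguments, which also type checks under the simplified subsumption rules of newer GHC versions.

```haskell
{-# LANGUAGE RankNTypes, TypeFamilies #-}
import Data.List (nub, sort)

class Graph g where
    type Vertex g
    empty   :: g
    vertex  :: Vertex g -> g
    overlay :: g -> g -> g
    connect :: g -> g -> g

-- Hypothetical concrete representation used only for this example.
data Relation a = Relation { vertexList :: [a], edgeList :: [(a, a)] }
    deriving (Eq, Show)

normalise :: Ord a => [a] -> [a]
normalise = nub . sort

instance Ord a => Graph (Relation a) where
    type Vertex (Relation a) = a
    empty    = Relation [] []
    vertex x = Relation [x] []
    overlay (Relation v1 e1) (Relation v2 e2) =
        Relation (normalise (v1 ++ v2)) (normalise (e1 ++ e2))
    connect (Relation v1 e1) (Relation v2 e2) =
        Relation (normalise (v1 ++ v2))
                 (normalise (e1 ++ e2 ++ [ (x, y) | x <- v1, y <- v2 ]))

newtype GraphFunctor a =
    GF { gfor :: forall g. Graph g => (a -> Vertex g) -> g }

instance Graph (GraphFunctor a) where
    type Vertex (GraphFunctor a) = a
    empty       = GF (\_ -> empty)
    vertex x    = GF (\f -> vertex (f x))
    overlay x y = GF (\f -> gmap f x `overlay` gmap f y)
    connect x y = GF (\f -> gmap f x `connect` gmap f y)

-- Eta-expanded version of 'flip gfor'.
gmap :: Graph g => (a -> Vertex g) -> GraphFunctor a -> g
gmap f x = gfor x f

mergeVertices :: Graph g
              => (Vertex g -> Bool) -> Vertex g -> GraphFunctor (Vertex g) -> g
mergeVertices p v = gmap (\u -> if p u then v else u)

-- The polymorphic graph 1 * 2 * 3 + 4 * 5.
example :: (Graph g, Vertex g ~ Int) => g
example = (vertex 1 `connect` vertex 2 `connect` vertex 3)
    `overlay` (vertex 4 `connect` vertex 5)

merged :: Relation Int
merged = mergeVertices odd 3 example
```

After merging the odd vertices into 3, vertexList merged is [2,3,4] and edgeList merged is [(2,3),(3,2),(3,3),(4,3)], which agrees with the adjacency list shown above.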
What do the operations of removing a vertex and splitting a vertex have in common? They both can be implemented by replacing each vertex of a graph with a (possibly empty) subgraph and flattening the result. Sounds familiar? You may recognise this as monad’s bind function, or Haskell’s operator >>=, which is so useful that it even made it into the Haskell logo. We are going to implement bind on graphs by wrapping it into yet another newtype:
newtype GraphMonad a = GM { bind :: forall g. Graph g => (a -> g) -> g }

instance Graph (GraphMonad a) where
    type Vertex (GraphMonad a) = a
    empty       = GM $ \_ -> empty
    vertex x    = GM $ \f -> f x -- here is the trick!
    overlay x y = GM $ \f -> bind x f `overlay` bind y f
    connect x y = GM $ \f -> bind x f `connect` bind y f
As you can see, the implementation is almost identical to gmap: instead of wrapping the value f x into a vertex, we should just leave it as is. The resulting transformation is also a homomorphism. Let’s see how we can make use of this new type in our toolbox.
We are first going to implement a filter-like function induce that, given a vertex predicate and a graph, will compute the induced subgraph on the set of vertices that satisfy the predicate by turning all other vertices into empty subgraphs and flattening the result.
induce :: Graph g => (Vertex g -> Bool) -> GraphMonad (Vertex g) -> g
induce p g = bind g $ \v -> if p v then vertex v else empty

λ> edgeList $ clique [0..4]
[(0,1),(0,2),(0,3),(0,4),(1,2),(1,3),(1,4),(2,3),(2,4),(3,4)]
λ> edgeList $ induce (<3) $ clique [0..4]
[(0,1),(0,2),(1,2)]
λ> induce (<3) (clique [0..4]) == (clique [0..2] :: Basic Int)
True
As you can see, by inducing a clique on a subset of the vertices that we like (<3), we get a smaller clique, as expected.
We can now implement removeVertex via induce:
removeVertex :: (Eq (Vertex g), Graph g) => Vertex g -> GraphMonad (Vertex g) -> g
removeVertex v = induce (/= v)

λ> adjacencyList $ removeVertex 2 $ 1 * (2 + 3)
[(1,[3]),(3,[])]
Removing an edge is not as simple. I suspect that this has something to do with the fact that the corresponding transformation doesn’t seem to be a homomorphism. Indeed, you will find it tricky to satisfy the last homomorphism requirement on R_{x→y}:
We can, however, implement a function disconnect that removes all edges between two different vertices as follows:
disconnect :: (Eq (Vertex g), Graph g) => Vertex g -> Vertex g -> GraphMonad (Vertex g) -> g
disconnect u v g = removeVertex u g `overlay` removeVertex v g

λ> adjacencyList $ disconnect 1 2 $ 1 * (2 + 3)
[(1,[3]),(2,[]),(3,[])]
That is, we create two graphs: one without u, the other without v, and overlay them, which removes both u → v and v → u edges. I still don’t have a solution for removing just a single edge u → v, or even just a self-loop v → v (note: disconnect v v = removeVertex v). Maybe you can find a solution? (Update: Arseniy Alekseyev found a solution for removing self-loops that can be generalised for removing edges, see a note at the end of the blog post.)
Curiously, we can have a slightly shorter implementation of disconnect, because a function returning a graph can also be given a Graph instance:
instance Graph g => Graph (a -> g) where
    type Vertex (a -> g) = Vertex g
    empty       = pure empty
    vertex      = pure . vertex
    overlay x y = overlay <$> x <*> y
    connect x y = connect <$> x <*> y

disconnect :: (Eq (Vertex g), Graph g) => Vertex g -> Vertex g -> GraphMonad (Vertex g) -> g
disconnect u v = removeVertex u `overlay` removeVertex v
Finally, as promised, here is how we can split a vertex into a list of given vertices using the bind function:
splitVertex :: (Eq (Vertex g), Graph g) => Vertex g -> [Vertex g] -> GraphMonad (Vertex g) -> g
splitVertex v vs g = bind g $ \u -> if u == v then vertices vs else vertex u

λ> adjacencyList $ splitVertex 1 [0, 1] $ 1 * (2 + 3)
[(0,[2,3]),(1,[2,3]),(2,[]),(3,[])]
Here vertex 1 is split into a pair of vertices {0, 1} that have the same connectivity.
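The bind-based functions can be exercised with the following self-contained sketch. As before, the Relation interpretation and the local definition of vertices are assumptions made so that the code compiles without alga; the GraphMonad wrapper, induce and splitVertex follow the definitions above.

```haskell
{-# LANGUAGE FlexibleContexts, RankNTypes, TypeFamilies #-}
import Data.List (nub, sort)

class Graph g where
    type Vertex g
    empty   :: g
    vertex  :: Vertex g -> g
    overlay :: g -> g -> g
    connect :: g -> g -> g

-- Hypothetical concrete representation used only for this example.
data Relation a = Relation { vertexList :: [a], edgeList :: [(a, a)] }
    deriving (Eq, Show)

normalise :: Ord a => [a] -> [a]
normalise = nub . sort

instance Ord a => Graph (Relation a) where
    type Vertex (Relation a) = a
    empty    = Relation [] []
    vertex x = Relation [x] []
    overlay (Relation v1 e1) (Relation v2 e2) =
        Relation (normalise (v1 ++ v2)) (normalise (e1 ++ e2))
    connect (Relation v1 e1) (Relation v2 e2) =
        Relation (normalise (v1 ++ v2))
                 (normalise (e1 ++ e2 ++ [ (x, y) | x <- v1, y <- v2 ]))

-- Overlay of isolated vertices (local stand-in for the library function).
vertices :: Graph g => [Vertex g] -> g
vertices = foldr (overlay . vertex) empty

newtype GraphMonad a = GM { bind :: forall g. Graph g => (a -> g) -> g }

instance Graph (GraphMonad a) where
    type Vertex (GraphMonad a) = a
    empty       = GM (\_ -> empty)
    vertex x    = GM (\f -> f x)
    overlay x y = GM (\f -> bind x f `overlay` bind y f)
    connect x y = GM (\f -> bind x f `connect` bind y f)

induce :: Graph g => (Vertex g -> Bool) -> GraphMonad (Vertex g) -> g
induce p g = bind g (\v -> if p v then vertex v else empty)

splitVertex :: (Eq (Vertex g), Graph g)
            => Vertex g -> [Vertex g] -> GraphMonad (Vertex g) -> g
splitVertex v vs g = bind g (\u -> if u == v then vertices vs else vertex u)

-- The polymorphic graph 1 * (2 + 3).
example :: (Graph g, Vertex g ~ Int) => g
example = vertex 1 `connect` (vertex 2 `overlay` vertex 3)

withoutTwo, split :: Relation Int
withoutTwo = induce (/= 2) example
split      = splitVertex 1 [0, 1] example
```

Here edgeList withoutTwo is [(1,3)], and edgeList split is [(0,2),(0,3),(1,2),(1,3)]: both new vertices inherit the connectivity of the original vertex 1.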
To demonstrate that we can construct reasonably sophisticated graphs using the presented toolkit, let’s try it on De Bruijn graphs, an interesting combinatorial object that frequently shows up in computer engineering and bioinformatics. My implementation is fairly short, but requires some explanation:
deBruijn :: (Graph g, Vertex g ~ [a]) => Int -> [a] -> g
deBruijn len alphabet = bind skeleton expand
  where
    overlaps = mapM (const alphabet) [2..len]
    skeleton = fromEdgeList [ (Left s, Right s) | s <- overlaps ]
    expand v = vertices [ either ([a]++) (++[a]) v | a <- alphabet ]
The function builds a De Bruijn graph of dimension len from symbols of the given alphabet. The vertices of the graph are all possible words of length len containing symbols of the alphabet, and two words are connected x → y whenever x and y match after we remove the first symbol of x and the last symbol of y (equivalently, when x = az and y = zb for some symbols a and b). An example of a 3-dimensional De Bruijn graph on the alphabet {0, 1} is shown in the diagram below (right).
Here are all the ingredients of the solution:
overlaps contains all possible words of length len-1 that correspond to overlaps of connected vertices.
skeleton is a graph with one edge per overlap, with Left and Right vertices acting as temporary placeholders (see the diagram).
We replace a vertex Left s with a subgraph of two vertices {0s, 1s}, i.e. the vertices whose suffix is s. Symmetrically, Right s is replaced by a subgraph of two vertices {s0, s1}. This is captured by the function expand.
The result is obtained by computing bind skeleton expand, as illustrated above.
…and this works as expected:
λ> edgeList $ deBruijn 3 "01"
[("000","000"),("000","001"),("001","010"),("001","011")
,("010","100"),("010","101"),("011","110"),("011","111")
,("100","000"),("100","001"),("101","010"),("101","011")
,("110","100"),("110","101"),("111","110"),("111","111")]
λ> all (\(x,y) -> drop 1 x == dropEnd 1 y) $ edgeList $ deBruijn 9 "abc"
True
λ> Set.size $ vertexSet $ deBruijn 9 "abc"
19683 -- i.e. 3^9
That’s all for now! I hope I’ve convinced you that you don’t necessarily need to operate on concrete data structures when constructing graphs. You can write both efficient and reusable code by using Haskell’s types for interpreting polymorphic graph expressions as maps, binds and other familiar transforms. Give me any old graph, and I’ll write you a new type to construct it!
P.S.: The algebra of graphs is available in the alga library.
Update: Arseniy Alekseyev found a nice solution for removing self-loops. Let R_{v} denote the operation of removing a vertex v, and R_{v→v} denote the operation of removing a self-loop v → v. Then the latter can be defined as follows:
It’s not a homomorphism, but it seems to work. Cool! Furthermore, we can generalise the above and implement the operation R_{u→v} that removes an edge u → v:
Note that the size of the expression can substantially increase as a result of applying such operations. Given an expression g of size |g|, what is the worst possible size of the result |R_{u→v}(g)|?
]]>Update: This series of blog posts was published as a functional pearl at the Haskell Symposium 2017.
Todo lists are sequences of items that, as one would expect, need to be done. The order of items in the sequence matters, because some items may depend on others. The simplest todo list is the empty one. Then we have todo lists containing a single item, from which we can build up longer lists using the same operators we introduced to construct graphs.
An item will correspond to a graph vertex. We’ll use the OverloadedStrings GHC extension, so we can create todo items without explicitly wrapping them into a vertex. This will also allow everyone to choose their favourite representation for strings; plain old String is fine for our examples:
{-# LANGUAGE OverloadedStrings #-}
import Data.String
import Algebra.Graph
import Algebra.Graph.Util

instance (IsString a, Ord a) => IsString (Todo a) where
    fromString = vertex . fromString

shopping :: Todo String
shopping = "presents"
Here Todo is a Graph instance whose implementation will be revealed later. One can combine several items into a single todo list using the overlay operator + of the algebra:
shopping :: Todo String
shopping = "presents" + "coat" + "scarf"
The semantics of a todo list is just a list of items in the order they can be completed, or Nothing if there is no possible completion order that satisfies all dependency constraints between different items. We can extract the semantics using the todo function with the following signature:
todo :: Ord a => Todo a -> Maybe [a]
The overlay operator is commutative, therefore reordering items in the shopping list does not change the semantics:
λ> todo shopping
Just ["coat","presents","scarf"]
λ> todo $ "coat" + "scarf" + "presents"
Just ["coat","presents","scarf"]
λ> shopping == "coat" + "scarf" + "presents"
True
As you can see, the items are simply ordered alphabetically as there are no dependencies between them. Let’s add some! To do that we’ll use the connect operator → from the algebra. When two todo lists are combined with →, the meaning is that all items in the first list must be completed before we can start completing items from the second todo list. I’m currently planning a holiday trip to visit friends and therefore will need to pack all stuff that I buy before travelling:
holiday :: Todo String
holiday = shopping * "pack" * "travel"
λ> todo holiday
Just ["coat","presents","scarf","pack","travel"]
λ> shopping == holiday
False
λ> shopping `isSubgraphOf` holiday
True
Items "pack" and "travel" have been appended to the end of the list even though "pack" comes before "presents" alphabetically, and rightly so: we can’t pack presents before we buy them!
Now let’s add a new dependency constraint to an existing todo list. For example, I might want to buy a new scarf before a coat, because I would like to make sure the coat looks good with the new scarf:
λ> todo $ holiday + "scarf" * "coat"
Just ["presents","scarf","coat","pack","travel"]
Look how the resulting list changed: "coat" has been moved after "scarf" to meet the new constraint! Of course, it’s not too difficult to add contradictory constraints, making the todo list impossible to schedule:
λ> todo $ holiday + "travel" * "presents"
Nothing
There is nothing we can do to complete all items if there is a circular dependency in our todo list: "presents" → "pack" → "travel" → "presents".
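To make the scheduling behaviour above concrete, here is a minimal self-contained sketch. It is not how Todo is implemented (that is revealed later and relies on Data.Graph); instead it uses Kahn's algorithm and always picks the alphabetically smallest ready item. The names Deps, mkDeps, schedule and holidayDeps are hypothetical.

```haskell
import qualified Data.Map.Strict as Map
import qualified Data.Set as Set

-- A dependency graph: every item maps to the set of items that must come
-- after it. Invariant: every item mentioned anywhere is a key of the map.
type Deps a = Map.Map a (Set.Set a)

-- Build a dependency graph from standalone items and "x before y" pairs.
mkDeps :: Ord a => [a] -> [(a, a)] -> Deps a
mkDeps items pairs = Map.fromListWith Set.union $
    [ (v, Set.empty) | v <- items ++ concat [ [x, y] | (x, y) <- pairs ] ] ++
    [ (x, Set.singleton y) | (x, y) <- pairs ]

-- Kahn's algorithm, always choosing the smallest item that is ready.
-- Returns Nothing if some items can never become ready (a cycle).
schedule :: Ord a => Deps a -> Maybe [a]
schedule deps = go inDegrees ready0 []
  where
    inDegrees = Map.unionWith (+) (Map.map (const 0) deps) $
        Map.fromListWith (+)
            [ (y, 1 :: Int) | ys <- Map.elems deps, y <- Set.toList ys ]
    ready0 = Map.keysSet (Map.filter (== 0) inDegrees)
    go degs ready done = case Set.minView ready of
        Nothing
            | length done == Map.size deps -> Just (reverse done)
            | otherwise                    -> Nothing
        Just (v, rest) ->
            let (degs', ready') =
                    foldl release (degs, rest) (Set.toList (deps Map.! v))
            in go degs' ready' (v : done)
    -- One prerequisite of y is done; y becomes ready when none are left.
    release (degs, ready) y =
        let d = degs Map.! y - 1
        in (Map.insert y d degs, if d == 0 then Set.insert y ready else ready)

-- The holiday list from the text: shopping * "pack" * "travel".
holidayDeps :: Deps String
holidayDeps = mkDeps shopping $
    [ (s, t) | s <- shopping, t <- ["pack", "travel"] ] ++ [("pack", "travel")]
  where
    shopping = ["presents", "coat", "scarf"]
```

With these definitions, schedule holidayDeps yields Just ["coat","presents","scarf","pack","travel"], and adding the pair ("travel","presents") makes the result Nothing.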
It may sometimes be useful to have some notion of item priorities to schedule some items as soon or as late as possible. Let me illustrate this with an example, by modifying our todo lists as follows:
shopping :: Todo String
shopping = "presents" + "coat" + "phone wife" * "scarf"

holiday :: Todo String
holiday = shopping * "pack" * "travel" + "scarf" * "coat"
As you see, I now would like to phone my wife before buying the scarf to make sure it also matches the colour of one of her scarves (she has a dozen of them and I can’t possibly remember all the colours). Let’s see how this changes the resulting order:
λ> todo holiday
Just ["phone wife","presents","scarf","coat","pack","travel"]
This works but is a little unsatisfactory: ideally I’d like to phone my wife right before buying the scarf. To achieve that I can amend the shopping list by changing the priority of the item "phone wife":
-- Lower the priority of items in a given todo list
low :: Todo a -> Todo a

shopping :: Todo String
shopping = "presents" + "coat" + low "phone wife" * "scarf"
λ> todo holiday
Just ["presents","phone wife","scarf","coat","pack","travel"]
Aha, this is better: "phone wife" got scheduled as late as possible, and is now right next to "scarf", as desired. But wait: if my wife finds out that I gave a low priority to my phone calls to her, I’ll get into trouble! I need to find a better way to achieve the same effect. In essence, we would like to have a variant of the connect operator that pulls the arguments together as close as possible during scheduling (and, alternatively, we may also want to repel arguments as far from each other as possible).
-- Pull the arguments together as close as possible
(~*~) :: Ord a => Todo a -> Todo a -> Todo a

-- Repel the arguments as far as possible
(>*<) :: Ord a => Todo a -> Todo a -> Todo a

shopping :: Todo String
shopping = "presents" + "coat" + "phone wife" ~*~ "scarf"
This looks better and leads to the same result as the code above.
The final holiday expression can be visualised as follows:
Here the overlay operator + is shown simply by placing its arguments next to each other, the connect operators are shown by arrows, and the arrow with a small triangle stands for the tightly connect operator ~*~. By following the laws of the algebra, we can flatten the graph expression into a dependency graph shown below:
The graph is then linearised into a list of items by the todo function.
So, here you go: you can plan your holiday (or anything else) in Haskell using the alga library!
The above reminds me of build systems that construct command lines for executing various external programs, such as compilers, linkers, etc. A command line is just a list of strings, which typically includes the path to the program that is being executed, paths to source files, and various configuration flags. Some of these strings may have order constraints between them, quite similar to todo lists. Let’s see if we can use our tiny DSL for todo lists for describing command lines.
Here is a simple command line to compile "src.c" with the GCC compiler:
cmdLine1 :: Todo String
cmdLine1 = "gcc" * ("-c" ~*~ "src.c" + "-o" ~*~ "src.o")
λ> todo cmdLine1
Just ["gcc","-c","src.c","-o","src.o"]
Build systems are regularly refactored, and it is useful to track changes in a build system to automatically rebuild affected files if need be (for example, in the new GHC build system Hadrian we track changes in command lines and this helps a lot in its development). Some changes do not change the semantics of a build system and can therefore be safely ignored. As an example, one can rewrite cmdLine1 defined above by swapping the source and object file parts of the command line:
cmdLine2 :: Todo String
cmdLine2 = "gcc" * ("-o" ~*~ "src.o" + "-c" ~*~ "src.c")
λ> cmdLine1 == cmdLine2
True
λ> todo cmdLine2
Just ["gcc","-c","src.c","-o","src.o"]
As you can see, the above change has no effect, as we would expect from the commutativity of +. Replacing ~*~ with the usual connect operator, on the other hand, sometimes leads to changes in the semantics:
cmdLine3 :: Todo String
cmdLine3 = "gcc" * ("-c" * "src.c" + "-o" * "src.o")
λ> cmdLine1 == cmdLine3
False
λ> todo cmdLine3
Just ["gcc","-o","-c","src.c","src.o"]
The resulting sequence is correct from the point of view of a dependency graph, but is not a valid command line: the flag pairs got pushed apart. The change in semantics is recognised by the algebra and a rerun of the build system should reveal the error.
As a final exercise, let’s write a function that transforms command lines:
optimise :: Int -> Todo String -> Todo String
optimise level = (* flag)
  where
    flag = vertex $ "-O" ++ show level
λ> todo $ optimise 2 cmdLine1
Just ["gcc","-c","src.c","-o","src.o","-O2"]
As you can see, optimise 2 appends the optimisation flag "-O2" at the end of the command line, i.e. optimise 2 == (* "-O2").
Command lines in real build systems contain many conditional flags that are included only when compiling certain files on certain platforms, etc. You can read about how we deal with conditional flags in Hadrian here.
Scheduling a list of items subject to dependency constraints is a well-known problem, which is solved by topological sort of the underlying dependency graph. GHC’s containers library has an implementation of topological sort in the Data.Graph module. It operates on adjacency lists, and to reuse it we can define the following Graph instance:
newtype AdjacencyMap a = AM { adjacencyMap :: Map a (Set a) }
    deriving (Eq, Show)

instance Ord a => Graph (AdjacencyMap a) where
    type Vertex (AdjacencyMap a) = a
    empty       = AM $ Map.empty
    vertex  x   = AM $ Map.singleton x Set.empty
    overlay x y = AM $ Map.unionWith Set.union (adjacencyMap x) (adjacencyMap y)
    -- Every vertex of x gets an edge to every vertex of y
    connect x y = AM $ Map.unionsWith Set.union
        [ adjacencyMap x, adjacencyMap y
        , Map.fromSet (const . Map.keysSet $ adjacencyMap y)
                      (Map.keysSet $ adjacencyMap x) ]

adjacencyList :: AdjacencyMap a -> [(a, [a])]
adjacencyList = map (fmap Set.toAscList) . Map.toAscList . adjacencyMap
λ> adjacencyList $ clique [1..4]
[(1,[2,3,4]),(2,[3,4]),(3,[4]),(4,[])]
Todo is built on top of the TopSort graph instance, which is just a newtype wrapper around the AdjacencyMap-based representation of graphs:
newtype TopSort a = TS { fromTopSort :: AdjacencyMap a }
    deriving (Show, Num)

instance Ord a => Eq (TopSort a) where
    x == y = topSort x == topSort y
The custom Eq instance makes sure that graphs are considered equal if their topological sorts coincide. In particular all cyclic graphs fall into the same equivalence class corresponding to topSort g == Nothing:
λ> path [1..4] == (clique [1..4] :: TopSort Int)
True
λ> topSort $ clique [1..4]
Just [1,2,3,4]
λ> topSort $ path [1..4]
Just [1,2,3,4]
λ> topSort $ transpose $ clique [1..4]
Just [4,3,2,1]
λ> topSort $ circuit [1..4]
Nothing
Function topSort simply calls Data.Graph.topSort performing the necessary plumbing, which is not particularly interesting.
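For the curious, the plumbing might look roughly like the sketch below. It is an assumption rather than the actual implementation: instead of Data.Graph.topSort it uses stronglyConnComp from the same module, because that function also reports cycles, and it works on a plain adjacency map. The name topSortAdj is hypothetical.

```haskell
import Data.Graph (SCC (..), stronglyConnComp)
import qualified Data.Map as Map
import qualified Data.Set as Set

-- Topologically sort an adjacency map, or return Nothing if it has a cycle.
-- stronglyConnComp yields components in reverse topological order, so the
-- result is reversed; any cyclic component (including a self-loop) aborts.
topSortAdj :: Ord a => Map.Map a (Set.Set a) -> Maybe [a]
topSortAdj m = fmap reverse (mapM acyclic components)
  where
    components = stronglyConnComp
        [ (v, v, Set.toList vs) | (v, vs) <- Map.toList m ]
    acyclic (AcyclicSCC v) = Just v
    acyclic (CyclicSCC _)  = Nothing
```

When several topological orders exist, the one returned depends on the traversal order inside Data.Graph, so it need not be the lexicographically first one.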
The current implementation has two issues. First, the topological sort is not always lexicographically first, as evidenced by cmdLine3 above, where "-o" precedes "-c" in the final ordering. Second, topSort does not satisfy the closure axiom defined in the previous blog post. One possible approach to fix this is to compute the transitive reduction of the underlying dependency graph before the topological sort.
Have a great holiday everyone!
]]>Update: This series of blog posts was published as a functional pearl at the Haskell Symposium 2017.
Before we continue, I’d like to note that any data structure for representing graphs (e.g. an edgelist, matrix-based representations, inductive graphs from the fgl library, GHC’s standard Data.Graph, etc.) can satisfy the axioms of the algebra with appropriate definitions of empty, vertex, overlay and connect, and I do not intend to compare these implementations against each other. I’m more interested in implementation-independent (polymorphic) functions that we can write and reuse, and in proving properties of these functions using the laws of the algebra. That’s why I think the algebra is worth studying.
As a warm-up exercise, let’s look at a few more examples of such polymorphic graph functions. One of the threads in the reddit discussion was about the path graph P_{4}: i.e. the graph with 4 vertices connected in a chain. Here is a function that can construct such path graphs on a given list of vertices:
path :: Graph g => [Vertex g] -> g
path []  = empty
path [x] = vertex x
path xs  = fromEdgeList $ zip xs (tail xs)

p4 :: (Graph g, Vertex g ~ Char) => g
p4 = path ['a', 'b', 'c', 'd']
Note that graph p4 is also polymorphic: we haven’t committed to any particular data representation, but we know that the vertices of p4 have type Char.
If we connect the last vertex of a path to the first one, we get a circuit graph, or a cycle. Let’s express this in terms of the path function:
circuit :: Graph g => [Vertex g] -> g
circuit []     = empty
circuit (x:xs) = path $ [x] ++ xs ++ [x]

pentagon :: (Graph g, Vertex g ~ Int) => g
pentagon = circuit [1..5]
From the definition we expect that a path is a subgraph of the corresponding circuit. Can we express this property in the algebra? Yes! It’s fairly standard to define a ≤ b as a + b = b for idempotent + and it turns out that this definition corresponds to the subgraph relation on graphs:
isSubgraphOf :: (Graph g, Eq g) => g -> g -> Bool
isSubgraphOf x y = overlay x y == y
We can use QuickCheck to test that our implementation satisfies the property:
λ> quickCheck $ \xs -> path xs `isSubgraphOf` (circuit xs :: Basic Int)
+++ OK, passed 100 tests.
QuickCheck can only test the property w.r.t. a particular instance (in this case we chose Basic Int), but using the algebra we can prove that it holds for all law-abiding instances of Graph (I leave this as an exercise for the reader).
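If you don’t have QuickCheck at hand, the same property can be checked on a small concrete model. The sketch below is only an illustration: Rel, overlayR, pathR, circuitR and isSubgraphOfR are hypothetical stand-ins for the polymorphic functions above, specialised to a set-based representation.

```haskell
import qualified Data.Set as Set

-- A concrete set-based graph: a vertex set and a set of directed edges.
data Rel a = Rel (Set.Set a) (Set.Set (a, a)) deriving (Eq, Show)

-- Overlay is the union of vertices and edges.
overlayR :: Ord a => Rel a -> Rel a -> Rel a
overlayR (Rel v1 e1) (Rel v2 e2) = Rel (Set.union v1 v2) (Set.union e1 e2)

-- A path on a list of vertices, built directly from consecutive pairs.
pathR :: Ord a => [a] -> Rel a
pathR xs = Rel (Set.fromList xs) (Set.fromList (zip xs (drop 1 xs)))

-- A circuit is a path that returns to its first vertex.
circuitR :: Ord a => [a] -> Rel a
circuitR []       = Rel Set.empty Set.empty
circuitR (x : xs) = pathR (x : xs ++ [x])

-- x is a subgraph of y exactly when overlaying x onto y changes nothing.
isSubgraphOfR :: Ord a => Rel a -> Rel a -> Bool
isSubgraphOfR x y = overlayR x y == y
```

For example, pathR [1,2,3] is a subgraph of circuitR [1,2,3], but not the other way round, because the circuit has the extra edge (3,1).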
As a final example, we will implement the Cartesian graph product, usually denoted as G □ H, where the vertex set is V_{G} × V_{H} and vertex (x, y) is connected to vertex (x’, y’) if either x = x’ and y is connected to y’ in H, or y = y’ and x is connected to x’ in G:
box :: (Functor f, Foldable f, Graph (f (a,b))) => f a -> f b -> f (a,b)
box x y = foldr overlay empty $ xs ++ ys
  where
    -- The sections (,b) and (a,) need the TupleSections extension
    xs = map (\b -> fmap (,b) x) $ toList y
    ys = map (\a -> fmap (a,) y) $ toList x
The Cartesian product G □ H is assembled by creating |V_{H}| copies of graph G and overlaying them with |V_{G}| copies of graph H. We get access to the list of graph vertices using toList from the Foldable instance and turn vertices of original graphs into pairs of vertices by fmap from the Functor instance.
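To see the definition of the product at work without any type class machinery, here is a minimal sketch on a concrete set-based graph. The names Rel, pathR, circuitR and boxR are hypothetical; boxR follows the textual definition of the Cartesian product rather than the Functor/Foldable construction above, but it should produce the same vertices and edges.

```haskell
import qualified Data.Set as Set

-- A concrete set-based graph: a vertex set and a set of directed edges.
data Rel a = Rel (Set.Set a) (Set.Set (a, a)) deriving (Eq, Show)

pathR :: Ord a => [a] -> Rel a
pathR xs = Rel (Set.fromList xs) (Set.fromList (zip xs (drop 1 xs)))

circuitR :: Ord a => [a] -> Rel a
circuitR []       = Rel Set.empty Set.empty
circuitR (x : xs) = pathR (x : xs ++ [x])

-- Cartesian product: one copy of G per vertex of H, and vice versa.
boxR :: (Ord a, Ord b) => Rel a -> Rel b -> Rel (a, b)
boxR (Rel vg eg) (Rel vh eh) = Rel vs (Set.union gCopies hCopies)
  where
    vs      = Set.fromList [ (x, y) | x <- Set.toList vg, y <- Set.toList vh ]
    -- Edges of G, replicated for every vertex y of H
    gCopies = Set.fromList
        [ ((x, y), (x', y)) | (x, x') <- Set.toList eg, y <- Set.toList vh ]
    -- Edges of H, replicated for every vertex x of G
    hCopies = Set.fromList
        [ ((x, y), (x, y')) | x <- Set.toList vg, (y, y') <- Set.toList eh ]
```

For the pentagon and P_{4} this gives 5 × 4 = 20 vertices and 5 × 4 + 5 × 3 = 35 edges.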
As you can see, the code is still implementation-independent, although it requires that the graph data type is a Functor and a Foldable. Just like lists, trees and other containers, most graph data structures have Functor, Foldable, Applicative and Monad instances (e.g. our Basic data type has them all). Here is how pentagon `box` p4 looks:
(A side note: the type signature of box reminds me of this blog post by Edward Yang and makes me wonder if Functor, Foldable plus idempotent and commutative Monoid together imply Monoidal, as it seems that I only had to use empty and overlay from the Graph type class. This seems odd.)
As I hinted in the previous blog post, to switch from directed to undirected graphs it is sufficient to add the axiom of commutativity for the connect operator. For undirected graphs it makes sense to denote connect by ↔ or —, hence:
x ↔ y = y ↔ x
Curiously, with the introduction of this axiom the associativity of ↔ follows from the (left-associated version of the) decomposition axiom and commutativity of +:
(x ↔ y) ↔ z = x ↔ y + x ↔ z + y ↔ z   (left decomposition)
            = y ↔ z + y ↔ x + z ↔ x   (commutativity of + and ↔)
            = (y ↔ z) ↔ x             (left decomposition)
            = x ↔ (y ↔ z)             (commutativity of ↔)
Commutativity of the connect operator forces graph expressions that differ only in the direction of edges into the same equivalence class. One can implement this by the symmetric closure of the underlying connectivity relation:
newtype Undirected a = Undirected { fromUndirected :: Basic a }
    deriving (Arbitrary, Functor, Foldable, Num, Show)

instance Ord a => Eq (Undirected a) where
    x == y = toSymmetric x == toSymmetric y

toSymmetric :: Ord a => Undirected a -> Relation a
toSymmetric = symmetricClosure . toRelation . fromUndirected
As you can see, we simply wrap our Basic implementation in a newtype with a custom Eq instance that takes care of the commutativity of ↔. We know that the resulting Undirected datatype satisfies all Graph laws, because we only made some previously different expressions equal but not vice versa.
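The symmetric closure itself is not shown above, so here is a minimal sketch of what it could look like on plain edge sets. This is an assumption for illustration, not the library code: Edges, symmetricClosureE and sameUndirected are hypothetical names, and vertex sets are ignored for brevity.

```haskell
import qualified Data.Set as Set
import Data.Tuple (swap)

type Edges a = Set.Set (a, a)

-- Add the reverse of every edge.
symmetricClosureE :: Ord a => Edges a -> Edges a
symmetricClosureE es = Set.union es (Set.map swap es)

-- Two directed edge sets describe the same undirected graph
-- when their symmetric closures coincide.
sameUndirected :: Ord a => Edges a -> Edges a -> Bool
sameUndirected x y = symmetricClosureE x == symmetricClosureE y
```

For instance, the edge sets {(1,2), (2,3)} and {(2,1), (3,2)} are identified, while {(1,2)} and {(1,3)} are not.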
In many applications graphs satisfy the transitivity property: if vertex x is connected to y, and y is connected to z, then the edge between x and z can be added or removed without changing the semantics of the graph. A common example is dependency graphs. The semantics of such graphs is typically a partial order on the set of vertices. To describe this class of graphs algebraically we can add the following closure axiom:
y ≠ ε ⇒ x → y + y → z + x → z = x → y + y → z
By using the axiom one can always rewrite a graph expression into its transitive closure or, alternatively, into its transitive reduction, hence all graphs that differ only in the existence of some transitive edges are forced into the same equivalence class. Note: the precondition (y ≠ ε) is necessary as otherwise + and → can no longer be distinguished:
x → z = x → ε → z = x → ε + ε → z + x → z = x → ε + ε → z = x + z
It is interesting that + and → have a simple meaning for partial orders: they correspond to parallel and sequential composition of partial orders, respectively. This allows one to algebraically describe concurrent systems — I will dedicate a separate blog post to this topic.
We can implement a PartialOrder instance by wrapping Basic in a newtype and providing a custom equality test via the transitive closure of the underlying relation, just like we did for undirected graphs:
newtype PartialOrder a = PartialOrder { fromPartialOrder :: Basic a }
    deriving (Arbitrary, Functor, Foldable, Num, Show)

instance Ord a => Eq (PartialOrder a) where
    x == y = toTransitive x == toTransitive y

toTransitive :: Ord a => PartialOrder a -> Relation a
toTransitive = transitiveClosure . toRelation . fromPartialOrder
Let’s test that our implementation correctly recognises the fact that path graphs are equivalent to cliques when interpreted as partial orders:
λ> quickCheck $ \xs -> path xs == (clique xs :: PartialOrder Int)
+++ OK, passed 100 tests.
Indeed, if we have a series of n tasks, where each task (apart from task 1) depends on the previous task (as expressed by path [1..n]), then task 1 is transitively a prerequisite for all subsequent tasks, task 2 is a prerequisite for tasks [3..n] etc., which can be expressed by clique [1..n].
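The path-versus-clique argument can be replayed on plain edge sets. The sketch below assumes a naive fixed-point computation of the transitive closure; it is not the library’s implementation, and the names Edges, transitiveClosureE and samePartialOrder are hypothetical.

```haskell
import qualified Data.Set as Set

type Edges a = Set.Set (a, a)

-- Naive transitive closure: keep adding x -> z whenever x -> y and y -> z
-- are present, until nothing changes. Terminates on finite edge sets.
transitiveClosureE :: Ord a => Edges a -> Edges a
transitiveClosureE es
    | es' == es = es
    | otherwise = transitiveClosureE es'
  where
    es' = Set.union es $ Set.fromList
        [ (x, z) | (x, y) <- Set.toList es, (y', z) <- Set.toList es, y == y' ]

-- Two edge sets describe the same partial order
-- when their transitive closures coincide.
samePartialOrder :: Ord a => Edges a -> Edges a -> Bool
samePartialOrder x y = transitiveClosureE x == transitiveClosureE y
```

The closure of the path 1 → 2 → 3 → 4 → 5 contains every pair (i, j) with i < j, which is exactly the edge set of clique [1..5].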
A partial order is reflexive (also called weak) if every element is related to itself. An example of a reflexive partial order is isSubgraphOf as introduced above: indeed, x `isSubgraphOf` x == True for all graphs x. To represent reflexive graphs algebraically we can introduce the following axiom:
This enforces that each vertex has a self-loop. The implementation of Reflexive data type is very similar to that of Undirected and PartialOrder so I do not show it here (it is based on the reflexive closure of the underlying relation).
Note: cyclic reflexive partial orders correspond to preorders, for example:
(1 + 2 + 3) → (2 + 3 + 4)
is a preorder with vertices 2 and 3 forming an equivalence class. We can find the strongly-connected components and derive the following condensation:
{1} → {2, 3} → {4}
One way to interpret this preorder as a dependency graph is that tasks 2 and 3 are executed as a step, simultaneously, and that they both depend on task 1.
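The condensation above can be computed mechanically. Here is a minimal sketch that assumes the graph is given as a vertex list and an edge list, and uses stronglyConnComp from Data.Graph in the containers library; condensation and example are hypothetical names.

```haskell
import Data.Graph (flattenSCC, stronglyConnComp)
import qualified Data.Set as Set

-- Equivalence classes of a preorder, listed in dependency order.
-- stronglyConnComp returns components in reverse topological order,
-- hence the final reverse.
condensation :: Ord a => [a] -> [(a, a)] -> [Set.Set a]
condensation vs es = reverse
    [ Set.fromList (flattenSCC scc)
    | scc <- stronglyConnComp [ (v, v, [ y | (x, y) <- es, x == v ]) | v <- vs ]
    ]

-- The example from the text: (1 + 2 + 3) → (2 + 3 + 4).
example :: [Set.Set Int]
example = condensation [1 .. 4] [ (x, y) | x <- [1, 2, 3], y <- [2, 3, 4] ]
```

Here example evaluates to the three classes {1}, {2, 3} and {4}, in that order.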
We can mix the three new axioms above in various combinations. For example, the algebra of undirected, reflexive and transitively closed graphs describes the laws of equivalence relations. Notably, it is not necessary to keep information about all edges in such graphs and there is an efficient implementation based on the disjoint set data structure. If you are curious about potential applications of such graphs, have a look at this paper where I use them to model switching networks. More precisely, I model families of switching networks; this requires another extension to the algebra: a unary condition operator, which I will cover in a future blog post.
This thread in the Hacker News discussion leads me to another twist of the algebra. Let’s replace the decomposition axiom with 3-decomposition:
w → x → y → z = w → x → y + w → x → z + w → y → z + x → y → z
In words, instead of collapsing all expressions to vertices and edges (pairs of vertices), as we did with the 2-decomposition, we now collapse all expressions to vertices, edges and triples (or hyperedges of rank 3). I haven’t yet figured out whether the resulting algebra is particularly useful, but perhaps the reader can provide an insight?
To see the difference between 2- and 3-decomposition more clearly, let’s substitute ε for w in 3-decomposition and simplify:
x → y → z = x → y + x → z + y → z + x → y → z
Looks familiar? It’s almost the 2-decomposition axiom! Yet there is no way to get rid of the term x → y → z on the right side: indeed, a triple is unbreakable in this algebra, and one can only extract the pairs (edges) that are embedded in it. In fact, we can take this further and rewrite the above expression to also expose the embedded vertices:
x → y → z = x + y + z + x → y + x → z + y → z + x → y → z
With 2-decomposition we can achieve something similar:
x → y = x + y + x → y
which I call the absorption theorem. It says that an edge x → y has vertices x and y (its endpoints) embedded in it. This seems intriguing but I have no idea where it leads; I guess we’ll figure it out together!
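For completeness, here is one way to derive the absorption theorem from 2-decomposition. This is a sketch that uses only the identity of → (with ε), the decomposition axiom and the commutativity of +:

```latex
\begin{align*}
x \to y &= x \to y \to \varepsilon
    && \text{(identity of $\to$)} \\
        &= x \to y + x \to \varepsilon + y \to \varepsilon
    && \text{(decomposition)} \\
        &= x \to y + x + y
    && \text{(identity of $\to$)} \\
        &= x + y + x \to y
    && \text{(commutativity of $+$)}
\end{align*}
```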
P.S.: All code snippets above are available in the alga repository. Look how nicely we can test the library thanks to the algebraic API!
]]>The roots of this work can be traced back to my CONCUR’09 conference submission that was rightly rejected. I subsequently published a few application-specific papers gradually improving my understanding of the algebra. The most comprehensive description can be found in ACM TECS (a preprint is available here). Here I’ll give a general introduction to the simplest version of the algebra of graphs and show how it can be implemented in Haskell.
Update: This series of blog posts was published as a functional pearl at the Haskell Symposium 2017.
Let G be a set of graphs whose vertices come from a fixed universe. As an example, we can think of graphs whose vertices are positive integers. A graph g ∈ G can be represented by a pair (V, E) where V is the set of its vertices and E ⊆ V × V is the set of its edges.
The simplest possible graph is the empty graph. I will be denoting it by ε in formulas and by empty in Haskell code. Hence, ε = (∅, ∅) and ε ∈ G.
A graph with a single vertex v will be denoted simply by v. For example, 1 ∈ G is a graph with a single vertex 1, that is ({1}, ∅). In Haskell I’ll use vertex to lift a given vertex to the type of graphs.
To construct bigger graphs from the above primitives I’ll use two binary operators overlay and connect, denoted by + and →, respectively. The overlay + of two graphs is defined as:
(V_{1}, E_{1}) + (V_{2}, E_{2}) = (V_{1} ∪ V_{2}, E_{1} ∪ E_{2})
In words, the overlay of two graphs is simply the union of their vertices and edges. The definition of connect → is similar:
(V_{1}, E_{1}) → (V_{2}, E_{2}) = (V_{1} ∪ V_{2}, E_{1} ∪ E_{2} ∪ V_{1} × V_{2})
The difference is that when we connect two graphs, we add an edge from each vertex in the left argument to each vertex in the right argument. Here are a few examples:
connect 1 (overlay 2 3).
The following type class expresses the above in Haskell:
class Graph g where
    type Vertex g
    empty   :: g
    vertex  :: Vertex g -> g
    overlay :: g -> g -> g
    connect :: g -> g -> g
Let’s construct some graphs! A graph that contains a given list of unconnected vertices can be constructed as follows:
vertices :: Graph g => [Vertex g] -> g
vertices = foldr overlay empty . map vertex
And here is a clique (a fully connected graph) on a given list of vertices:
clique :: Graph g => [Vertex g] -> g
clique = foldr connect empty . map vertex
For example, clique [1..] is the infinite clique on all positive integers; we will call such cliques covering the whole universe complete graphs. We can also construct any graph given its edgelist:
fromEdgeList :: Graph g => [(Vertex g, Vertex g)] -> g
fromEdgeList = foldr overlay empty . map edge
  where
    edge (x, y) = vertex x `connect` vertex y
As we will see in the next section, graphs satisfy a few laws and form an algebraic structure that is very similar to a semiring.
The structure (G, +, →, ε) introduced above satisfies many usual laws:
The following decomposition axiom is the only law that makes the algebra of graphs different from a semiring:
x → y → z = x → y + x → z + y → z
Indeed, in a semiring the two operators have different identity elements, let’s denote them ε_{+} and ε_{→}, respectively. By using the decomposition axiom we can prove that they coincide:
ε_{+} = ε_{+} → ε_{→} → ε_{→}                              (identity of →)
      = ε_{+} → ε_{→} + ε_{+} → ε_{→} + ε_{→} → ε_{→}      (decomposition)
      = ε_{+} + ε_{+} + ε_{→}                              (identity of →)
      = ε_{→}                                              (identity of +)
The idempotence of + also follows from the decomposition axiom.
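Here is a sketch of that derivation. It assumes the laws already established above, namely that ε is the identity of both → and +:

```latex
\begin{align*}
x &= x \to \varepsilon \to \varepsilon
    && \text{(identity of $\to$)} \\
  &= x \to \varepsilon + x \to \varepsilon + \varepsilon \to \varepsilon
    && \text{(decomposition)} \\
  &= x + x + \varepsilon
    && \text{(identity of $\to$)} \\
  &= x + x
    && \text{(identity of $+$)}
\end{align*}
```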
The following is a minimal set of axioms that describes the graph algebra:
An exercise for the reader: prove that ε is the identity of + from the minimal set of axioms above. This is not entirely trivial! Also prove that + is idempotent.
Note, to switch from directed to undirected graphs it is sufficient to add the axiom of commutativity of →. We will explore this in a future blog post.
Let’s look at two basic instances of the Graph type class that satisfy the laws from the previous section. The first one, called Relation, adopts our set-based definitions for the overlay and connect operators and is therefore a free instance (i.e. it doesn’t satisfy any other laws):
data Relation a = Relation { domain :: Set a, relation :: Set (a, a) }
    deriving (Eq, Show)

instance Ord a => Graph (Relation a) where
    type Vertex (Relation a) = a
    empty       = Relation Set.empty Set.empty
    vertex  x   = Relation (Set.singleton x) Set.empty
    overlay x y = Relation (domain   x `Set.union` domain   y)
                           (relation x `Set.union` relation y)
    connect x y = Relation (domain   x `Set.union` domain   y)
                           (relation x `Set.union` relation y `Set.union`
                            Set.fromDistinctAscList [ (a, b) | a <- Set.elems (domain x)
                                                             , b <- Set.elems (domain y) ])
Let’s also make Relation an instance of Num type class so we can use + and * operators for convenience.
instance (Ord a, Num a) => Num (Relation a) where
    fromInteger = vertex . fromInteger
    (+)         = overlay
    (*)         = connect
    signum      = const empty
    abs         = id
    negate      = id
Note: the Num law abs x * signum x == x is satisfied since x → ε = x. In fact, any Graph instance can be made a Num instance if need be. We can now play with graphs using interactive GHC:
λ> 1 * (2 + 3) :: Relation Int
Relation {domain = fromList [1,2,3], relation = fromList [(1,2),(1,3)]}
λ> 1 * (2 + 3) + 2 * 3 == (clique [1..3] :: Relation Int)
True
Another simple instance can be obtained by embedding all graph constructors into a basic algebraic datatype:
data Basic a = Empty
             | Vertex a
             | Overlay (Basic a) (Basic a)
             | Connect (Basic a) (Basic a)
             deriving Show

instance Graph (Basic a) where
    type Vertex (Basic a) = a
    empty   = Empty
    vertex  = Vertex
    overlay = Overlay
    connect = Connect
We cannot use the derived Eq instance here, because it would clearly violate the laws of the algebra, e.g. Overlay Empty Empty is structurally different from Empty. However, we can implement a custom Eq instance as follows:
instance Ord a => Eq (Basic a) where
    x == y = toRelation x == toRelation y
      where
        toRelation :: Ord a => Basic a -> Relation a
        toRelation = foldBasic

foldBasic :: (Vertex g ~ a, Graph g) => Basic a -> g
foldBasic Empty         = empty
foldBasic (Vertex  x  ) = vertex x
foldBasic (Overlay x y) = overlay (foldBasic x) (foldBasic y)
foldBasic (Connect x y) = connect (foldBasic x) (foldBasic y)
The Basic instance is useful because it allows us to represent densely connected graphs more compactly. For example, clique [1..n] :: Basic Int has linear-size representation in memory, while clique [1..n] :: Relation Int stores each edge separately and therefore takes O(n^{2}) memory. As I will demonstrate in future blog posts, we can exploit compact graph representations for deriving algorithms that are asymptotically faster on dense graphs compared to existing graph algorithms operating on edgelists.
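The size difference is easy to measure. The following self-contained sketch repeats the Basic data type and adds a few hypothetical helpers (cliqueB, size, vertexSetB, edgeSetB) that are not part of the code above; it counts constructors in the expression and compares that with the number of edges the expression denotes.

```haskell
import qualified Data.Set as Set

data Basic a = Empty
             | Vertex a
             | Overlay (Basic a) (Basic a)
             | Connect (Basic a) (Basic a)

-- A clique as an expression: v1 → (v2 → (... → (vn → ε))).
cliqueB :: [a] -> Basic a
cliqueB = foldr Connect Empty . map Vertex

-- Number of constructors in the expression.
size :: Basic a -> Int
size Empty         = 1
size (Vertex _)    = 1
size (Overlay x y) = 1 + size x + size y
size (Connect x y) = 1 + size x + size y

vertexSetB :: Ord a => Basic a -> Set.Set a
vertexSetB Empty         = Set.empty
vertexSetB (Vertex x)    = Set.singleton x
vertexSetB (Overlay x y) = Set.union (vertexSetB x) (vertexSetB y)
vertexSetB (Connect x y) = Set.union (vertexSetB x) (vertexSetB y)

-- The edges denoted by the expression, following the definition of →.
edgeSetB :: Ord a => Basic a -> Set.Set (a, a)
edgeSetB Empty         = Set.empty
edgeSetB (Vertex _)    = Set.empty
edgeSetB (Overlay x y) = Set.union (edgeSetB x) (edgeSetB y)
edgeSetB (Connect x y) = Set.unions
    [ edgeSetB x
    , edgeSetB y
    , Set.fromList [ (a, b) | a <- Set.toList (vertexSetB x)
                            , b <- Set.toList (vertexSetB y) ] ]
```

For n = 100 the expression has 2n + 1 = 201 constructors, while the edge set it denotes has n(n − 1)/2 = 4950 elements.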
I’ve been using the algebra of graphs presented above for several years in a number of different projects and found it very useful. There are a few flavours of the algebra that I will introduce in follow-up blog posts that allow us to work with undirected graphs, transitively closed graphs (also known as partial orders or dependency graphs), graph families, and their various combinations. All these flavours of the algebra can be obtained by extending the set of axioms.
I am working on a Haskell library alga implementing the algebra of graphs and intend to release it soon. Let me know if you have any suggestions on how to improve the above code snippets.
]]>
Build systems
A build system is a critical component of most software projects, responsible for compiling the source code written in various programming languages and producing executable programs — end software products. Build systems sound simple, but they are not; in large software projects the build system grows from simple beginnings into a huge, complex engineering artefact. Why? Because it evolves every day as new features are continuously being added by software developers; because it builds programs for a huge variety of target hardware configurations (from mobile to cloud); and because it operates under strict correctness and performance requirements, standing on the critical path between the development of a new feature and its deployment into production.
It is known that build systems can take up to 27% of software development effort, and that improvements to build systems rapidly pay off [1]. Despite its importance, this subject is severely under-researched, which prompts major companies, such as Facebook and Google, to invest significant internal resources to make their own bespoke build system frameworks.
Challenges in large-scale build systems
The following two requirements are important for complex large-scale build systems; however, to the best of my knowledge, no existing build system can support both:
There exist build systems that support cloud builds, e.g. Facebook’s Buck and Google’s Bazel, as well as build systems that support dynamic dependencies, e.g. Neil Mitchell’s Shake (originally developed at Standard Chartered) and Jane Street’s Jenga, but not both.
Proposed approach and research objectives
I will combine two known techniques when developing the new build system: 1) storing build results in a cloud build cache, where a file is identified by a unique hash that combines hashes of the file, its build dependencies and the environment, so that users of the build system can fetch an already built file from the cache, or build it themselves and share it with others by uploading it into the build cache; and 2) storing the last known version of the dependency graph in a persistent graph storage, and updating it whenever a file needs to be rebuilt and the newly discovered set of dependencies differs from the previous record, as implemented in the Shake build system [2].
My first research objective is to formalise the semantics of cloud build systems with dynamic dependency graphs in the presence of build non-determinism (real compilers are not always deterministic) and partial builds (not all intermediate build results are necessary to store locally). I will then integrate cloud build caches with persistent graph storage and develop a domain-specific language for the new build system using scalable functional programming abstractions, such as polymorphic and monadic dependencies, high-level concurrency primitives, and compositional build configurations [3]. In 2012-2014 I developed an algebra for dependency graphs [4] that I will use for proving the correctness of new build algorithms. The final research objective is to evaluate the developed build system and explore opportunities for its integration into a real large-scale build infrastructure.
References
[1] S. McIntosh et al. “An empirical study of build maintenance effort”, In Proceedings of the International Conference on Software Engineering (ICSE), ACM, 2011.
[2] N. Mitchell. “Shake before building – replacing Make with Haskell”, In Proceedings of the International Conference on Functional Programming (ICFP), ACM, 2012.
[3] A. Mokhov, N. Mitchell, S. Peyton Jones, S. Marlow. “Non-recursive Make Considered Harmful: Build Systems at Scale”, In Proceedings of the International Haskell Symposium, ACM, 2016.
[4] A. Mokhov, V. Khomenko. “Algebra of Parameterised Graphs”, ACM Transactions on Embedded Computing Systems, 13(4s), p.143, 2014.
]]>