Computing connected components in an undirected graph is one of the most basic graph problems. Given a graph with n vertices and m edges, you can find its components in linear time O(n + m) using depth-first or breadth-first search. But what if you need to go faster? In this blog post, I will describe a cool new concurrent algorithm for this problem, which I learned this week at the Heidelberg Laureate Forum from Robert Tarjan himself. The algorithm distributes the work among n + m tiny processors that work concurrently most of the time and requires O(log n) global synchronisation rounds. The algorithm is remarkably simple but it’s far from obvious that it works correctly and efficiently. Happily, Tarjan and his co-author S. Cliff Liu have done all the hard proofs in their recent paper, so we can simply take the algorithm and use it.
Moving to Jane Street
Exciting times! I’m moving from Newcastle University to Jane Street London to work on the OCaml build system Dune. I’ve received a lot of questions about this move already, so I decided to write down some answers in a blog post.
Stroll: an experimental build system
I’d like to share an experiment in developing a language-agnostic build system that does not require the user to specify dependencies between individual build tasks. By “language-agnostic” I mean the user does not need to learn a new language or a special file format for describing build tasks — existing scripts or executable files can be used as build tasks directly without modification.
I call this build system Stroll because its build algorithm reminds me of strolling through a park where you’ve never been before, trying to figure out an optimal path to your target destination, and likely going in circles occasionally until you’ve built up a complete mental map of the park.
You should try Hadrian
This is an announcement for GHC developers:
You should try to use Hadrian as the GHC build system, because it will (hopefully!) become the default around GHC 8.8.
In this blog post we will explore the consequences of postulating 0 = 1 in an algebraic structure with two binary operations (S, +, 0) and (S, ⋅, 1). Such united monoids have a few interesting properties, which are not immediately obvious. For example, we will see that the axiom 0 = 1 is equivalent to a seemingly less extravagant axiom ab = ab + a, which will send us tumbling down the rabbit hole of algebraic graphs and topology.
Build Systems à la Carte
In a recent blog post, I shared a preliminary version of the paper on build systems that Neil Mitchell, Simon Peyton Jones and I submitted to the ICFP 2018 conference. The paper was accepted and yesterday, after months of revisions and polishing, we’ve finally completed this work. The paper and associated executable models are openly available; here is a direct link to the PDF.
Build systems, such as classic Make, are big, complicated, and used by every software developer on the planet. But they are a sadly unloved part of the software ecosystem, very much a means to an end, and seldom the focus of attention. Rarely do people ask questions like “What does it mean for my build system to be correct?” or “What are the trade-offs between different approaches?”. For years Make dominated, but more recently the challenges of scale have driven large software firms like Microsoft, Facebook and Google to develop their own build systems, exploring new points in the design space. In this paper we offer a general framework in which to understand and compare build systems, in a way that is both abstract (omitting incidental detail) and yet precise (implemented as Haskell code).
Selective applicative functors
Update: I have written a paper about selective applicative functors, and it completely supersedes this blog post (it also uses a slightly different notation). You should read the paper instead.
I often need a Haskell abstraction that supports conditions (like Monad) yet can still be statically analysed (like Applicative). In such cases people typically point to the Arrow class, more specifically ArrowChoice, but when I look it up, I find several type classes and a dozen of methods. Impressive, categorical but also quite heavy. Is there a more lightweight approach? In this blog post I’ll explore what I call selective applicative functors, which extend the Applicative type class with a single method that makes it possible to be selective about effects.
Please meet Selective:
class Applicative f => Selective f where handle :: f (Either a b) -> f (a -> b) -> f b
Think of handle as a selective function application: you apply a handler function of type a → b when given a value of type Left a, but can skip the handler (along with its effects) in the case of Right b. Intuitively, handle allows you to efficiently handle errors, i.e. perform the error-handling effects only when needed.
The Task abstraction
Neil Mitchell, Simon Peyton Jones and I have just finished a paper describing a systematic and executable framework for developing and comparing build systems. The paper and associated code are available here: https://github.com/snowleopard/build. The code is not yet well documented and polished, but I’ll bring it in a good shape in April. You can learn more about the motivation behind the project here.
(Update: the paper got accepted to ICFP! Read the PDF, watch the talk.)
In this blog post I would like to share one interesting abstraction that we came up with to describe build tasks:
type Task c k v = forall f. c f => (k -> f v) -> k -> Maybe (f v)
A Task is completely isolated from the world of compilers, file systems, dependency graphs, caches, and all other complexities of real build systems. It just computes the value of a key k, in a side-effect-free way, using a callback of type k → f v to find the values of its dependencies. One simple example of a callback is Haskell’s readFile function: as one can see from its type FilePath → IO String, given a key (a file path k = FilePath) it can find its value (the file contents of type v = String) by performing arbitrary IO effects (hence, f = IO). We require task descriptions to be polymorhic in f, so that we can reuse them in different computational contexts f without rewriting from scratch.
Formal verification of spacecraft control programs
Last February I spent two weeks in Vienna, visiting Jakob Lechner and RUAG Space Austria, a company developing components for space missions. Jakob and his colleagues designed a specialised processing core called REDFIN (REDuced instruction set for Fixed-point & INteger arithmetic) for executing simple spacecraft control tasks, such as satellite antenna pointing. During the visit I implemented a prototype of a formal verification framework to support the development of REDFIN programs. Afterwards I was quite busy with my other projects, but my PhD student Georgy Lukyanov helped to further improve the prototype.
Jakob, Georgy and I have just submitted a conference paper describing the REDFIN core and the verification framework. Please have a look and let us know what you think. This will be a timely read after yesterday’s exciting SpaceX launch.
Continue reading Formal verification of spacecraft control programs
Hadrian is on the way
Hadrian, a new build system for GHC that we have been working on for the past three years, has finally been merged into the GHC tree (update: we have temporally switched to a submodule). However it’s not yet time to celebrate — there are still many issues that need to be addressed before the Make-based build system may retire.
Want to try? Checkout the GHC repository and run
hadrian/build.sh -j or
hadrian/build.bat -j on Windows and it should build you a GHC binary. In case of problems, have a look at the
README and/or raise an issue.
Here is a quick update on the on-going development:
- Hadrian can build GHC and can already be used as part of the CI infrastructure. However, the resulting binary does not pass the validation. Zhen Zhang is looking into this, but more help is needed.
- A major refactoring by Moritz Angermann is on the way. Moritz is primarily interested in cross compilation, but to make it work he had to get rid of the
ghc-cabalutility, reorganise the build tree, and make numerous other improvements to Hadrian.
- There is currently no support for binary distribution. Ben Gamari is looking into this issue.
- Dynamic linking on Windows is not implemented. Tamar Christina has kindly offered help with this.
- Hadrian source code is still not fully documented and tested, and generally requires some polishing. I am currently taking care of this when not distracted by urgent bug fixes and will appreciate your help in making Hadrian easier to understand and use.
I can’t believe that we seem to approach the finish line! It’s been a long, tedious but also interesting project. Thank you all for helping us get this far, and I hope we’ll celebrate the switch from Make to Hadrian soon.