Archive for the ‘Coding’ Category

GHC and Windows DLLs

Friday, July 3rd, 2009

Following on from Duncan’s work on Building plugins as Haskell shared libs, I’ve been working on supporting the same functionality on Windows. The end goal is to have a rts.dll, libHsBase.dll and myPlugin.dll and be able to write things like Excel plugins in Haskell without needing to statically link the whole runtime system and set of libraries into each one.

Windows uses the Portable Executable (PE) Format, so the hoops that must be jumped through are different than those for Linux and Mac OS X. Linux uses ELF for its object format, and Mac OS X uses Mach-O. Tool chain programs such as linkers and object file views are also different.

One of immediate issues is to deal with mutually recursive imports between Haskell libraries and the GHC Run Time System (RTS). Clearly, the code for a Haskell library will call the RTS to perform tasks such as allocating memory, throwing exceptions, forking threads and so on. However, the runtime system also calls back on the base library. For example, here is a function from the RTS which helps to create parallel threads:

void createSparkThread (Capability *cap) {
    StgTSO *tso;
    tso = createIOThread (cap, RtsFlags.GcFlags.initialStkSize,
                                  &base_GHCziConc_runSparks_closure);
    postEvent(cap, EVENT_CREATE_SPARK_THREAD, 0, tso->id);
    appendToRunQueue(cap,tso);
}

The variable base_GHCziConc_runSparks_closure is the name of a function closure in the GHC.Conc library which we won’t have code for when we’re linking the RTS.

One of the quirks of Windows is the need to generate so called “import libraries”. These contain stub code that is used to call a function in a DLL. For example, if code in module main.o wants to call a function fun in a library base.dll, the picture looks something like this:

## in main.o ##################### (linked into main.exe)
main:
    call fun
    ....
    call dword ptr [__imp_fun]
    …. 

## in base.lib ###################### (linked into main.exe)
fun:
    jmp dword ptr [__imp_fun]

__imp_fun:
.data
    .dword fun

## in base.dll ######################
fun:
    .. actual code for fun   

In Windows, all calls to a function in a DLL go via the Imported function Address Table (IAT). This is a table of pointers, and in the example above there is one entry named __imp_fun. There are two ways to use this table. The first way is illustrated by the first call to fun in main.o. This call targets stub code that looks up the pointer from the table and then jumps to it. The second way is to lookup the pointer and jump to it directly, but to do this we need to know that the function is in an external DLL at code generation time. A call fun instruction uses a PC relative offset, and is physically shorter than a call dword ptr [] instruction, so it’s not practical to change one to the other at link time.

The file base.lib is the “import library”, which contains the call stub and the IAT. Import libraries need to be generated independently from the main compiling and linking process, using Windows specific tools. The import library for a particular dll is then linked into every executable (or other dll) that uses it.

Anyway, I’ve spent the last few days wading through MSDN and the GHC build system, and I think I’ve cataloged at least all the major hoops. I’ll let you know how the jumping goes next post.

GHC, primops and exorcising GMP

Tuesday, June 9th, 2009

GHC uses GMP to implement the Haskell arbitrary-precision Integer type. It’s been this way for ages.

For various reasons using GMP is a slight problem for some users. Some users don’t really make use of Integer and don’t like to have to link to GMP. Since GMP uses the LGPL, if you want to ship closed source programs then you have to link to it dynamically. On Windows static linking is the default so you have to jump through hoops to link it dynamically. Then there are also users who make heavy use of GMP and find that the Integer library is far too limited an interface to GMP. However binding extra GMP functions is complicated by the the way that the GHC RTS uses it already (especially the memory management).

So what these people want is a way to build GHC such that the RTS does not directly link to GMP. Then the implementation of Integer should be in a library that is replaceable so that one can use a simple slow implementation, a super-duper binding to GMP or some other “big num” library.

Daniel Peebles, Ian Lynagh and I have been working on this problem recently. Ian and my contributions to this are supported by the IHG.

Getting GMP out of the RTS

Before we can think about replacements however we need to disentangle GMP from the RTS and at least move the existing GMP-based Integer implementation into a library. This Integer implementation would remain the default so it still has to be fast. Daniel has managed to rip GMP out of the RTS and we’re now focusing on how to move the GMP binding into its own library.

The difficulty of moving it out of the RTS is that currently almost all the GMP operations are bound as GHC “primops”, as opposed to using the FFI. This is partly historical accident (FFI arrived on the scene relatively late) and partly that due to certain FFI restrictions, the primop route is simpler and faster. The issue is that the wrapper code (around the actual GMP calls) needs to return several results to Haskell land, in particular things like (# Int, ByteArray# #). Using the FFI it is possible to return several results but one has to do it in the time-honoured tradition of C and emulate “out” parameters by passing pointers. The problem with doing that is we would need to do a lot of marshaling: temporarily allocate some memory, pass pointers and read back the results. All this just to return a few integers and pointers. It’s actually more tricky because at the level in the library stack where we have to implement Integer we do not actually have access to the FFI libraries (in fact currently we do not even have access to the IO type).

GHC primops

Primops bypass the single-result restrictions inherited from the C calling convention. We can write primops that directly return unboxed tuples, like (# Int, ByteArray# #). Primops (at least out-of-line primops) are implemented in Cmm, which is GHC’s low level intermediate language based on the C-​- language. These Cmm functions have to know exactly the internal calling convention that GHC uses, but there is no excess marshaling.

Unfortunately knowledge of the primops has to be baked into the compiler and the Cmm code has to be compiled into the RTS. So that’s no good for implementing Integer a separate library from the RTS.

What if we could use the FFI to import Cmm functions…

foreign import prim

That would make it possible to have out-of-line primops in a library. The library would contain the compiled .cmm files and the .hs code in the same library would “foreign import” the cmm function. In particular we could then just move the .cmm code we use for wrapping the GMP library calls from the RTS into the integer-gmp package. Then instead of getting primops like plusInteger# from the GHC.Prim module, we would just foreign import them, eg:

foreign import prim "plusInteger" plusInteger#
  :: Int# -> ByteArray#
  -> Int# -> ByteArray#
  -> (# Int#, ByteArray# #)

So that’s what I started implementing today, “foreign import prim“. It needs a slight extension in the lexer, parser, type checker, desugarer, core->stg, and stg->cmm phases. That sounds like a lot but the changes in each bit are pretty small. As a feature it is very similar to foreign C calls and also to primops, so fortunately it can share most code with those existing features. So far it’s going ok, I’ve got it producing convincing looking core, stg and cmm code. Tomorrow I’ll test it and review the design and changes with Simon Marlow.

If this works out ok then it should mean we’re still using the same well-tested gmp binding code and without any extra marshaling overhead. Correctness testing is mostly covered by the existing GHC testsuite. We still want to check the performance of course. To that end, Daniel has been working on an Integer performance benchmark. He’s tried it already using the simple pure-Haskell implementation of Integer. Apparently it does respectably but takes ages to calculate 10000 factorial.

Buildings plugins as Haskell shared libs

Thursday, May 21st, 2009

This post is a sneak preview about building Haskell shared libraries on Linux. We’ll look at how to use ghc to make a standalone Haskell shared library that exports C functions. We could use this shared library as part of a bigger project (without having to use ghc for the final linking) or we could load it dynamically, e.g. as a plugin in some other program.

This work is being supported by the IHG and it builds on the hard work of several other people over the last few years (see the first post in this series for the history and credits)

Building GHC with shared libs support

For starters you need the latest development version of GHC. See these instructions on getting the sources and doing the configure, build and install steps.

The only non-standard thing you need to do is to use ./configure --enable-shared. Note that this has only been tested on Linux x86-64 and x86, though in the past, the shared lib support has also worked on Linux PPC and OSX PPC.

Currently what you get is a ghc that itself is statically linked but it can build programs and shared libraries that dynamically link against the runtime system and base libraries.

Building programs that use shared libs

For example, for “hello world”:

$ ghc --make -dynamic Hello.hs

It is interesting to look at the output of the ldd program:

$ ldd ./Hello

I’ll not paste the whole output, but here’s a bit of it:

libHSbase-4.0.0.0-ghc6.11.so =>
  /opt/ghc/lib/ghc-6.11/base-4.0.0.0/libHSbase-4.0.0.0-ghc6.11.so
  (0x00007f8959aff000)

(I’ve simplified the ghc version slightly)

If you were to look at the full output what you would notice is that it links against each Haskell package as a separate .so file. What is more, it is able to find the shared libs even though they are not in a standard location like /usr/local/lib. This is because by default it is using the -rpath mechanism. It is also possible to build binaries in a mode that does not embed an rpath which might be more suitable for deployment.

Building shared libs

Suppose we have a module Foo.hs that uses the FFI to export a C function called foo():

module Foo where
import Foreign.C
foreign export ccall foo :: CInt -> CInt
foo :: CInt -> CInt
foo = ...

we can build it into a shared library:

$ ghc --make -dynamic -shared -fPIC Foo.hs -o libfoo.so

We need to use -dynamic, -shared and -fPIC. The -dynamic flag tells ghc at the compile step to produce code so that it can link dynamically to dependent packages. At the link step it tells ghc to actually link dynamically to dependent packages. The -shared flag tells ghc to link a shared library rather than a program. The -fPIC flag tells ghc to make code that is suitable to include into a shared library. If we were to break it down into separate compile and link steps then we would use:

$ ghc -dynamic -fPIC -c Foo.hs
$ ghc -dynamic -shared Foo.o Foo_stub.o -o libfoo.so

In principle you can use -shared without -dynamic in the link step. That would mean to statically link the rts all the base libraries into your new shared library. This would make a very big, but standalone shared library. However that would require all the static libraries to have been built with -fPIC so that the code is suitable to include into a shared library and we don’t do that at the moment.

If we use ldd again to look at the libfoo.so that we’ve made we will notice that it is missing a dependency on the rts library. This is problem that we’ve yet to sort out, so for the moment we can just add the dependency ourselves:

$ ghc --make -dynamic -shared -fPIC Foo.hs -o libfoo.so \
  -lHSrts-ghc6.11 -optl-Wl,-rpath,/opt/ghc/lib/ghc-6.11/

The reason it’s not linked in yet is because we need to be able to switch which version of the rts we’re using without having to relink every library. For example we want to be able to switch between the debug, threaded and normal rts versions. It’s quite possible to do this and it just needs a bit more rearranging in the build system to sort it out. Once it’s done you’ll even be able to switch rts at runtime, eg:

$ LD_PRELOAD=/opt/ghc/lib/ghc-6.11/libHSrts_debug-ghc6.11.so
$ ./Hello

Going back to our libfoo.so, now that it is linked against the rts it is completely standalone, we can link it into a C program using just gcc, or we can use dlopen() to load libfoo.so at runtime.

Assuming we’ve got libfoo.so in the current directory, we can link it into a C program:

$ gcc main.c -o main -lfoo -L.

If you use ldd now it’ll tell you that libfoo.so is not found. Remember that the runtime linker doesn’t look in the same places as the static linker. We told the static linker to look in the current directory with the flag -L.. For the dynamic linker we can either move our libfoo.so to /usr/local/lib or we can embed a path into the binary that tells the runtime linker where to look. One particularly neat way to do this is to tell it to look for the library not at an absolute path, but relative to the program itself:

$ gcc main.c -o main -lfoo -L. -Wl,-rpath,'$ORIGIN'

The Linux runtime linker understands the special variable $ORIGIN and interprets it as the location of the executable. This also works on Solaris. Windows and OS X have something similar. This makes it possible to distribute binaries along with shared libraries and have the whole lot fully relocatable.

If we want to load the library and call functions at runtime we would use C code like:

void *dl = dlopen("./libfoo.so", RTLD_LAZY);
int (*foo)(int a) = dlsym(dl, "foo");
printf("%d\n", foo(2500));

In this case we do not need to link our C program against libfoo.so (we just need -ldl for the dynamic linking functions like dlopen).

$ gcc main.c -o main -ldl

Now one thing to watch out for is that before you call any exported Haskell function, you have to start up the runtime system. If you just call foo() directly then it’ll emit a helpful error message to remind you. We have to use the C API of the Haskell FFI to initialise the runtime system. This is a little tiresome. In our case it’ll look like:

hs_init(&argc, &argv);
hs_add_root(__stginit_Foo);

The first line is specified by the Haskell FFI. The second is a GHC’ism. It initialises the module containing the function we’re going to call.

If you’re exporting a plugin API then hopefully the API will support some kind of plugin initialisation. In that case you can include the above C code to initialise the rts before any of the Haskell functions get called. We can do that by adding the above initialisation code into a C function and export that from our shared lib:


void init (void);
void init (void) { ... }

Then we would add init into our shared lib:

$ ghc -fPIC -c init.c
$ ghc -dynamic -shared Foo.o Foo_stub.o init.o -o libfoo.so \
  -lHSrts-ghc6.11 -optl-Wl,-rpath,/opt/ghc/lib/ghc-6.11/

Of course the calling program has to call init() first.

If you have to support a C API where there is no initialiser then we can use this trick:


static void init (void) __attribute__ ((constructor));
void init (void) { ... }

The constructor attribute means the function will be called on program startup or as soon as the library is loaded via dlopen.

“Hello world” now only 11k using GHC with shared libs

Tuesday, April 28th, 2009

$ ./Hello.dyn
Hello World!

$ ls -ogh Hello Hello.dyn
411K 2009-04-28 21:59 Hello
 11K 2009-04-28 21:55 Hello.dyn

On Linux x86-64 with GHC using shared libraries a “Hello World” program is now only 11k compared to 411k previously. By comparison, JHC manages 6.4k and an equivalent C program is 6.3k. (All sizes after running strip on the binary.)

As I mentioned earlier, the IHG has asked us to work on improving GHC’s support for shared libraries. I’ve been updating the new GHC build system to support --enable-shared and I’ve just now managed to get the build to go through. I’ll clean up my patches and send them in tomorrow. There are still a number of things to do. I’ve got to run the testsuite with everything built for shared libs. Clemens had this working before so I’m not expecting too many test failures. We also need to set up a GHC buildbot to use --enable-shared so that we do not get regressions.

The next task will be to test that it works to make a Haskell library that exports a C API and to use it as a plugin for some other program. Anyone got any good suggestions for a simple demo plugin? What programs have nice simple plugin APIs?

zlib and bzlib package updates

Sunday, November 2nd, 2008

I’m pleased to announce updates to the zlib and bzlib packages.

The releases are on Hackage:

What’s new

What’s new in these releases is that the extended API is slightly nicer. The simple API that most packages use is unchanged.

In particular, these functions have different types:

compressWith   :: CompressParams
               -> ByteString -> ByteString
decompressWith :: DecompressParams
               -> ByteString -> ByteString

The CompressParams and DecompressParams types are records of compression/decompression parameters. The functions are used like so:

compressWith   defaultCompressParams { ... }
decompressWith defaultDecompressParams { ... }

There is also a new parameter to control the size of the first output buffer. This lets applications save memory when they happen to have a good estimate of the output size (some apps like darcs know this exactly). By getting a good estimate and (de)compressing into a single-chunk lazy bytestring this lets apps convert to a strict bytestring with no extra copying cost.

Future directions

The simple API is very unlikely to change.

The current error handling for decompression is not ideal. It just throws exceptions for failures like bad format or unexpected end of stream. This is a tricky area because error streaming behaviour does not mix easily with error handling.

On option which I use in the iconv library is to have a data type describe the real error conditions, something like:

data DataStream
   = Chunk Strict.ByteString Checksum DataStream
   | Error Error -- for some suitable error type
   | End Checksum

With suitable fold functions and functions to convert to a lazy ByteString. Then people who care about error handling and streaming behaviour can use that type directly. For example it should be trivial to convert to an iterator style.

People have also asked for a continuation style api to give more control over dynamic behaviour like flushing the compression state (eg in a http server). Unfortunately this does not look easy. The zlib state is mutable and while this can be hidden in a lazy list, it cannot be hidden if we provide access to intermediate continuations. That is because those continuations can be re-run whereas a lazy list evaluates each element at most once (and with suitable internal locking this is even true for SMP).

Background

The zlib and bzlib packages provide functions for compression and decompression in the gzip, zlib and bzip2 formats. Both provide pure functions on streams of data represented by lazy ByteStrings:

compress, decompress :: ByteString -> ByteString

This makes it easy to use either in memory or with disk or network IO.
For example a simple gzip compression program is just:

import qualified Data.ByteString.Lazy as BS
import qualified Codec.Compression.GZip as GZip

main = BS.interact GZip.compress

Or you could lazily read in and decompress .gz file using:

content <- GZip.decompress <$> BS.readFile file

General

Both packages are bindings to the corresponding C libs, so they depend on those external C libraries (except on Windows where we build a bundled copy of the C lib source code). The compression speed is as you would expect since it’s the C lib that is doing all the work.

The zlib package is used in cabal-install to work with .tar.gz files. So it has actually been tested on Windows. It works with all versions of ghc since 6.4 (though it requires Cabal-1.2).

The darcs repos for the development versions live on code.haskell.org:

I’m very happy to get feedback on the API, the documentation or of course any bug reports.

What’s wrong with make?

Thursday, August 21st, 2008

Actually, make is really pretty good. Well, GNU make is pretty good. When used correctly it can rebuild large projects in the minimum number of steps, in parallel.

So what is wrong with it? I’m not complaining about the simple annoyances like insane use of tabs or being based on two inflexible programming languages (shell script and make macros). No, it’s got two fundamental problems: it cannot enforce correct dependency tracking and its dependency language is insufficiently expressive for common tasks.

It is possible to write incorrect make rules

It’s far too easy to write incorrect rules — rules where the named dependencies and targets do not match up with the action:


foo : bar
	cat bar baz > foo

So obviously that’s a totally contrived example and is easy to spot that it has an untracked dependency on baz, however it gets much harder with more complex rules and where the programs you call read or write files that you did not know about.

So the problem is that make cannot check that you’ve written a bogus rule and yet if you do your makefile is broken. What makes it worse is that it is not immediately obvious that the makefile is broken. It will work initially but you’ll end up not trusting parallel make or running into quirks that are solved by make clean. If you have to use make clean your makefile is broken.

The reason it cannot check for correctness of rules is of course because the actions are opaque blobs of shell script. The decision to use shell script made it quicker to implement make and users were already familiar with the language but the cost is that makefiles are more or less impossible to check automatically.

Dynamic dependencies cannot be expressed sanely in make

Make takes the view that there is a single massive static dependency graph. The single and massive aspects are not a problem but the static one is. The problem is that in many cases you do actually need to run some build actions before you know all the dependencies of each file in a project. The canonical example is running a tool to discover the dependencies of .hs or .c files.

GNU make has a clever extension which makes it possible to express this kind of dependency in simple cases. It lets one include deps.mk. That reads in the deps.mk file and extends the dependency graph with all the rules found in deps.mk. The clever thing is that deps.mk itself can depend on other things and can have a rule to generate it. For the example of dependencies of .hs or .c files, this lets us write a rule to call the compiler with a flag to dump make-format dependencies into a file. This works quite nicely, in this simple case.

The problem with doing include deps.mk is that the file can specify any rules at all. It can add edges between any existing nodes in make’s dependency graph. That means that make cannot be confident that it knows everything unless it has got deps.mk up to date and included it. So before anything else can be done, make has to update deps.mk, even if it turns out in the end that the deps given in the include file are irrelevant to the task at hand.

In practise we do not need to be able to add edges willy nilly all over the place and there is a great benefit to being more disciplined. Suppose for the sake of argument that for each .hs file I have a .deps file listing the files that the .hs file depends on. Now if I am trying to update foo.hs then I only need to look at the suffix rule for .hs files and the rules in foo.deps. I know that I do not need to look in bar.deps (unless I discover from foo.deps that foo.hs depends on bar.hi). This is a massive saving. It means I do not have to bring all the .deps files up to date. That also means I do not have to bring the corresponding .hs files up to date. That also means I do not have to run any pre-processors or code generators for those .hs files. That also means I do not have to compile the code generators.

You might think this is a contrived situation but it is not. The Gtk2Hs build system use GNU make (non-recursive of course). It has lots of .chs.pp files which get pre-processed with cpp and then c2hs to generate a .hs file. For each .chs file we look for all the imports and generate a .deps file which gets included into the top level makefile. So just to calculate all the dependencies we actually have to two about two thirds of the build. That’s because we first have to build c2hs and two other code generators and then run cpp and c2hs on a hundred or so files, just so we can generate all the .hs files and find their dependencies. This means that from a clean state I cannot build just one particular .hs file because the very makefile itself depends on all the other .hs files. Amusingly, just to make clean, make has to build half the project. Obviously that is insane so we have to hack it using includes that are conditional on the make target. That’s not a technique that scales however.

In any non-trivial build system it is necessary to interleave build actions with discovering dependencies. Classic make ignores this fact completely. GNU make notes it but forces us to do all the actions to discover the full dependency graph up front before we can get round to useful work. A better build system would be built around the notion of interleaving graph exploration with graph reduction. Saizan’s make-like framework is built around this principle.

Could we do better?

Looking around at previous work, like the classic recursive make considered harmful or the Vesta or Maak build systems, one lesson comes up again and again: track dependencies absolutely precisely or you lose.

I am convinced that the solution to the first problem is not to specify the dependencies and targets separately from the action. They should be specified together so that there is simply no way to express an untracked dependency. This is also the approach taken by Vesta and Maak.

If we have a bunch of primitive actions that correctly declare their dependencies and ways of combining actions that remembers dependencies then we should never end up with untracked dependencies. For example consider the above example again. Lets suppose we have two primitives:


readFile  :: FilePath -> Rule String
writeFile :: FilePath -> Rule (String -> ())

Clearly one has a single dependency and the other a single target. Lets write that contrived rule again:


rule = write <*> read
  where
    write = writeFile “foo”
    read  = (++) <$> readFile “bar” <*> readFile “baz”

So the intuition here is that applicative composition gives us static rule-graph composition. As for rules with dynamic dependencies, this corresponds to monadic composition:


(<**>) :: Rule a -> Rule (a -> b) -> Rule b
(>>=)  :: Rule a -> (a -> Rule b) -> Rule b

This lets us express the cases where very structure of the dependency graph depends on a value calculated from another part of the graph.

With static composition we can look at the resulting rule and ’see’ down both branches of the composition to discover the graph structure without running any actions. With dynamic composition we can look at the left branch statically but the right branch requires the value produced by the left before we can even discover its structure.

So we should use static composition where possible because it tells us more about the graph which gives us more opportunities for parallelism and for caching intermediate results to avoid needless recompilation. Where necessary we can use dynamic dependencies but we should not over-use them or our dependency ‘graph’ would just become one long chain.

So, that’s the idea. I should make clear: there is no corresponding implementation! Unlike my vaporware, Saizan has a working implementation of a slightly lower level API as part of his GSoC project. We have talked about putting a higher level API on top though it’s not completely clear how to do that yet.

There are also undoubtedly connections to FRP that could do with further exploration. I mean it is quite clearly an FRP style problem, but an ugly grungy one where many parts work by side effects in the file system.

Solving the diamond dependency problem

Tuesday, August 19th, 2008

In a previous post I explained the the dreaded diamond dependency problem. In this post I’ll look at a solution.

Ever seen an error like when building a package?

Couldn't match expected type
  `bytestring-0.9.0.4:Data.ByteString.ByteString'
  against inferred type `ByteString'

Or when configuring a package?

Warning: This package indirectly depends on multiple versions
of the same package. This is highly likely to cause a compile failure.
package pureMD5-0.1.2 requires bytestring-0.9.0.4
package binary-0.4.1 requires bytestring-0.9.1.0
package cabal2arch-0.3 requires bytestring-0.9.1.0

These failures were pretty common.

So the good news is that the latest release does solve the problem. It contains a completely new package dependency resolution algorithm based on constraint solving.

The problem space is this: we’ve got a bunch of packages, say A, B, C, D.
Diamond dependency graph

We’ve got a bunch of versions of of those packages, let’s say:

  • A-1.0, A-1.1
  • B-1.0
  • C-1.0, C-2.0.
  • D-1.0, D-1.1

Each package version has dependencies on other packages along with version constraints, for example A-1.0 might need C >= 1 and A-1.1 might need C >= 2. Then given a goal like A >= 1, we have to pick exactly one version of package A and one version of C and so on recursively until we have picked a package version to satisfy each dependency. So even if we can get to a package by following two different dependency paths, we must end up picking the same version. That’s what it means to avoids inconsistent dependencies. Of course the version we pick for a package has to satisfy the version constraints of all the packages that depend on it.

The problem space can actually get extremely large. In theory each choice about a package version is independent. In the above example supposing we have to pick a version of A, B, C and D then we have two choices for A, one choice for B and two choices for C and D which gives us 2*1*2*2 combinations overall. It is clear that we cannot explore the entire space of combinations because it’s exponential in size. That may not seem important for this example with just 4 packages but there are over 700 packages on hackage and we’d like to be able to install most of them at once. We need to apply some smarts.

Note that this is actually an NP-complete problem. It’s possible to encode 3-SAT into the package dependency problem. Fortunately most real instances are fairly easy so we need not give up hope.

Ultimately we have to make choices between versions of a package and we have to end up with a consistent set of choices (or fail). It makes sense therefore to keep track of which versions of which packages are still valid choices. We can make the problem easier if we can eliminate choices that we know would lead to impossible or inconsistent solutions. We can propagate information both up and down the dependency graph to help us eliminate choices.

For example supposing we looked at D-1.0 and D-1.1 and we discovered that D-1.0 could not possibly be built on our operating system but D-1.1 could. If later on we discover we need some version of D then we only have one possibility so we can pick it right away. More interestingly, it might also allow us to eliminate a choice higher up. Supposing for the sake of argument that C-1.0 needed D-1.0 and would not work with D-1.1 then by eliminating D-1.0 we can also eliminate C-1.0.

In the downwards direction, consider the effect of picking A-1.1 rather than A-1.0. We said earlier that A-1.0 needs C >= 1 and A-1.1 might need C >= 2. So making this choice for A eliminates C-1.0.

One class of strategies is to make a series of firm choices and try to build up to a consistent solution. If we do this with no backtracking then we will always run in polynomial time, but there will always be cases where we cannot find solutions even though a solution exists. The chance of us finding a solution in this kind of strategy depends on the order in which we make the choices.

The algorithm

The strategy I’ve adopted is to make package choices in a top down order. It is a kind of mixture between breadth first and depth first search. We have a set of packages that we know we will need and a set of specific package versions we’ve already picked. We also maintain a set of constraints on all the packages. This lets us know what versions are available for each package that we’ve yet to decide about.

In each round we pick a package and choose one version of it and update our set of constraints and if the package we picked depends on a new package then we add that to the pending package set. When there is a package we know we need and there is only one version left then we pick it. That’s the easy case.

The hard case is when all remaining packages have at least two versions we could pick. Here we just have to pick something and hope for the best. First of all, we have to decide which package we’re going to make a choice for. A good heuristic here is to decide on packages higher up the dependency graph earlier since their dependencies impose constraints on later choices. We can do this by caching the depth first numbering for each package. Once we’ve decided on which package to go for then we have to decide which version to pick. The most sensible thing is to pick the highest version, however there is a complication with installed packages that I’ll come back to later.

Results

So this algorithm produces pretty good results in most cases. It always runs in polynomial time and because of the way it gathers constraints it is guaranteed that if it finds a solution that it will be a valid consistent solution. The old dep solver which would very often come up with inconsistent solutions. Remember: consistent solutions avoid the diamond dependency problem.

Another good thing to gathering constraints is that when we fail we have some information about why we failed because we know which additional constraint made the constraint set inconsistent. We can translate that into a human readable error message, though not necessarily a particularly brilliant one.

By manually adding additional constraints we can also guide the solver to find solutions it would not otherwise find, or to avoid failure. In a sense this is just a hack to work around the fact that the solver isn’t even better, but in the mean time it’s pretty handy.

You can play around and get a sense of what the algorithm is doing if you run cabal install --dry-run -v pkg1 pkg2 etc and look at the intermediate output and the final solution.

The biggest test I’ve tried so far is trying to install all of hackage simultaniously. It takes several seconds to find a solution, which is not too bad. I had to discard about 30 packages to arrive at a consistent set. Adding any one of the discarded ones back leads to inconsitent dependencies, for example conflicts over which version of HaXml is required. There are also many packages that simply cannot ever be installed because they have missing dependencies.

Installed packages

So what about installed packages? I mentioned that there’s a complication there. We often have to decide between picking an already installed version of a package and the available one from hackage (even if they are the same version). If we pick the installed version then we immediately pick up a lot of precise constraints, because the installed package depends on exact versions. That is often enough to prevent us from finding a solution. Indeed in the diamond dependency examples I gave previously it was clear that to find a consistent solution we would have to rebuild some existing installed package. So we cannot default to preferring installed packages. A simple observation is that the available package always has looser constraints than the installed version, so it’s always safe to pick the available version over the installed version. We can always ‘improve’ our solution at the end if we find the constraints we end up with are consistent with the constraints of the installed version. And that is what we do. There is a further issue however. If ultimately we would like the option to use the installed version then we had better pick the same version of the available package and not just the highest available version. What we do is allow a per-package preference for the highest available version or the highest available version that has a corresponding installed version. For the ‘install’ command we prefer the latest version for the packages explicitly given and the installed version for all others. For the ‘upgrade’ command we prefer the latest version everywhere.

That seems to work pretty well, though I’ve found people do get somewhat confused when cabal-install decides to re-install a package they already have. What is not clear to them is that it is rebuilding a package against different versions of its dependencies. Probably the thing to do is to make that clearer in the output of cabal install --dry-run.

Improvements

There are still plenty of improvements that could be made. The algorithm currently does not deal well with installing packages for older compilers because it finds out too late that a package it choose could only work with a newer version of the base library. The solution there is some bottom up constraint propagation. We should check at the beginning that each package version could possibly be installed with the current compiler and if not eliminate it. There are still real cases where we end up with an avoidable conflict. For example we might have two packages that need HaXml, lets say Foo needs HaXml =1.13.* while there are two versions of Bar, one that works with HaXml =1.13.* and a later version that needs HaXml >= 1.16. If we end up picking Bar before Foo then we would default to the later version of HaXml but then fail when we find Foo needs the earlier version. This is not solved by the ordering heuristic because Foo and Bar do not depend on each other. This is an example where we need either more look ahead, backtracking or a much more sophisticated algorithm in general.

The right long term approach I think is a proper complete constraint solving algorithm.

New Cabal and cabal-install releases

Tuesday, June 17th, 2008

Cabal-1.4 and cabal-install-0.5 are now released!

If you were already using a pre-release of cabal-install then you can just:
$ cabal update
$ cabal install cabal-install

The Cabal library

There are no huge new features in the Cabal library itself. The main new feature is that is supports a significantly improved version of cabal-install. Cabal-1.4 does have a number of incremental improvements and lots of bug fixes. You can see the full list in the changelog.

So the most exciting thing is the release of cabal-install…

New features in cabal-install

Command line improvements

The most immediately noticeable thing is that the command line interface now has all the commands that runhaskell Setup.hs has. Of course it still has the features to download and install packages from hackage. It also gained an upload command. So it now provides a command line interface to the whole Cabal/Hackage system. There’s no need to use runhaskell Setup.hs ever again. There is also bash command line completion support included which I find is a great time saver.

Installing and upgrading

The next big thing is that it includes a new package dependency resolution system that finds correct and sensible solutions more of the time and has better default behaviour. The new behaviour should be similar to other package managers that people are used to.

For example, suppose you’ve got xmonad-0.5 installed and version 0.7 is the latest on hackage, then

$ cabal install xmonad

will upgrade to xmonad-0.7.

The behaviour of install is to upgrade as little as possible to satisfy your request, but sometimes you want to upgrade all the dependencies too. Supposing now that we have xmonad-0.7 installed, but we’re still using X11-1.4.1 and the latest version on hackage is X11-1.4.2, then

$ cabal upgrade xmonad

will install X11-1.4.2 and re-install xmonad-0.7, this time built against the newer X11-1.4.2.

So in general, the install command will install the latest version of things but will try and use any existing installed versions of dependencies while the upgrade command will also try to use the latest versions of all dependencies. As a special case, cabal upgrade on its own will try to upgrade all the packages that you have installed.

For both command there is a --dry-run flag so you can see what would be installed without actually doing it.

Hugs

Yes, it even works with hugs. That is, cabal-install built by ghc can manage the installation of packages for hugs. In principle cabal-install should be able to be run by hugs but currently the zlib binding is using a function that hugs does not support.

Note that for hugs, Cabal does not know what packages are already installed because there is no equivalent of the package database that ghc has. So that means cabal-install cannot do very sensible installation planning. It should work ok so long as all the dependencies are already installed.

Windows

Yes, it even works on Windows. The one caveat is that cabal-install cannot currently upgrade itself because Windows makes it hard for a process to overwrite its own executable file. It needs more complex trickery with the Win32 API. In the meantime the workaround is to rename the cabal.exe file first eg to cabal-foo.exe, then say cabal-foo install cabal-install.

Build reporting

One feature that made it into this release is build reporting. cabal-install keeps logs of all packages that you install (at least packages from hackage, not local ones). It records a bit of information about each package, in particular whether the outcome was successful or not. You can see these build reports in ~/.cabal/packages/$server/build-reports.log.

The plan in the longer term is to let people upload these build reports to hackage so we can get a wider range of testing data about the packages on hackage.

Information plumbing

Thursday, May 8th, 2008

If you look at real programs you’ll find there is often a lot of code spent just on information plumbing. Plumbing is code that doesn’t do any “real work” but is just concerned with getting the right information from one bit of a program to another, often with some rearranging or impedance matching. It’s not as cool as code with real algorithmic content. So how can we stop plumbing code taking over and obscuring the code that is doing real work?

A nice example is some code in cabal-install for installing a package. Well, I say installing a package but really the actual installing is done by the Cabal library. This code just gathers information from various sources and uses that to decide how to configure a package. There is a little bit of real work mixed in, like unpacking tarballs, but it is mostly plumbing. For example there is the global configure options that apply to each package, per-package flag assignments and dependencies. Then there are various options for controlling how we compile and run the Setup.hs scripts for each package, including what compiler to use and what version of the Cabal library to run them with. There’s also an option to use some kind of sudo style wrapper for the install phase.

The code followed a common pattern of a series of layers, each one calling the layer below. In the cabal install example we have executeInstallPlan that calls installConfiguredPackage, which calls installAvailablePackage which calls installUnpackedPackage which calls setupWrapper (which as the name suggests is a layer on top of something else). It’s a lot of layers. Lets pick out just two examples:

installConfiguredPackage verbosity
    scriptOptions miscOptions configFlags
    (ConfiguredPackage pkg flags deps)
  = installAvailablePackage verbosity
    scriptOptions miscOptions configFlags' pkg
  where
    configFlags' = configFlags {
      configConfigurationsFlags = flags,
      configConstraints = deps
    }

installAvailablePackage verbosity
    scriptOptions miscOptions configFlags
    (AvailablePackage _ pkg LocalPackage)
  = installUnpackedPackage verbosity
      scriptOptions miscOptions configFlags
      pkg Nothing

As you can see, the problem with this style is that each layer gets cluttered with passing all the information that the lower layers need. Bundling related bits of information into tuples or records helps to some extent. In this example you can see it’s been done already; scriptOptions, miscOptions and configFlags are bundles of other parameters. (The name miscOptions is a sure sign that we’re bundling arbitrary things together just to try and reduce the number of parameters we’re passing.)

What is worst though is that we end up with a tightly coupled system. To add a new feature at the bottom layer requires changing all the layers to pipe that information all the way down.

So one nice trick is instead of having each layer directly call the next one down, we parametrise each layer by all the layers below it. For example:

installConfiguredPackage configFlags
    (ConfiguredPackage pkg flags deps)
    installPkg
  = installPkg configFlags' pkg
  where
    configFlags' = configFlags {
      configConfigurationsFlags = flags,
      configConstraints = deps
    }

installAvailablePackage
    (AvailablePackage _ pkg LocalPackage)
    installPkg
  = installPkg pkg Nothing

So we take the next layer as an installPkg parameter, do whatever transform the layer was supposed to be do and call installPkg as the next layer. So all the extra parameters that this layer did not care about are gone. They can be passed directly to installPkg by the caller.

So in the end we stick all the layers together:

executeInstallPlan installPlan $ \cpkg ->
  installConfiguredPackage configFlags cpkg $
  \configFlags' apkg ->
    installAvailablePackage verbosity apkg $
      installUnpackedPackage verbosity
          scriptOptions miscOptions configFlags'

Unlike before, we can now see all the layers at once rather than just the top layer. So we can pass those extra parameters directly down to the bottom layer. As you can see from the installConfiguredPackage layer, we can modify the values on the way down but we only have to pay in plumbing for the ones we’re using at each layer rather than in every layer.

After the refactoring the code can become more general which has advantages for testing and reuse. For example installConfiguredPackage turns out to be pure and executeInstallPlan can work in any monad where as previously it was tied down to being in the IO monad because the bottom layer was in the IO monad. Not being in IO makes it much easier to use a testing system like QuickCheck and being able to pass a dummy makes it possible to test each layer in isolation.

So obviously the message must be that higher order plumbing is cool!

Well perhaps not, but it can make it a little less interconnected and easier to manage.

The dreaded diamond dependency problem

Thursday, April 24th, 2008

One issue that came up several times at the hackathon—especially when talking to the Yi hackers—is the dreaded diamond dependency problem. This happens when you have 4 packages in a diamond shaped dependency graph:
Diamond dependency graph
The problem arises when we have package B and C already installed but built against different versions of D and then we try to use both packages B and C together in package A:
Diamond dependency problem
This can work ok but only if packages B and C do not expose types defined in D in their interfaces. If they do then package A will not be able to use functions from B and C together because they will not be working with the same type. That is you’ll get a type error.

To pick a concrete example, suppose package D is bytestring and we have both bytestring-0.9.0.1 and 0.9.0.4 installed. Lets say B is utf8-string and C is regex-base. Lets say that package A is the Yi editor program. So the point is, at some place in the code in Yi we want to pass a bytestring produced as the result of UTF-8 decoding as input to one of the regex functions. But this does not work because the functions in the utf8-string package are using the ByteString type from bytestring-0.9.0.1 while the regex functions in the regex package are using the ByteString type from bytestring-0.9.0.4. So we get a type error when we try to compile Yi:

Couldn't match expected type `bytestring-0.9.0.4:Data.ByteString.ByteString'
against inferred type `bytestring-0.9.0.1:Data.ByteString.ByteString'

As far as GHC knows, these two types are totally unrelated!

This is obviously extremely annoying. There is also no easy solution. In this example we’re assuming that packages B and C have already been built, so there is actually no way to sensibly use the two packages together without rebuilding on of them against a different version of package D. In this case the obvious solution is to rebuild B to use the D-1.1 rather than D-1.0. The problem with rebuilding a package of course is it breaks all other packages that have already been built against it. It isn’t clear that you want a package manager to go automatically rebuilding lots of apparently unrelated packages.

In the longer term the best solution would seem to be to do what Nix does. In the above example, instead of replacing package B built against D-1.0 with B built against D-1.1, Nix would add
another instance of B built against D-1.1. So the original instance of B would remain unchanged and nothing would break. It’s the functional approach: we never mutate values (installed packages) we just create new ones and garbage collect old one when they are no longer needed.

In practice it means we have to identify installed packages using some hash of the package and the hashes of all dependent packages. jhc already does this and there are moves afoot to do something similar for GHC, though aimed more at tracking API/ABI changes. For sane source based package management, I think it’s the right direction to take.

I should note that this is not a new problem. You’ve been able to construct this problem ever since ghc started to allow multiple versions of the same package to be installed at once. We are just noticing it a lot more frequently now because we split up the base package and allow those split-out packages to be upgraded.

The current state of play is that Cabal warns of this problem but doesn’t really help you solve it. For the above example we’d get:

$ cabal configure
Configuring A-1.0...
Warning: This package indirectly depends on multiple versions of
the same package. This is highly likely to cause a compile failure.
package B-1.0 requires D-1.0
package C-1.0 requires D-1.1