Archive for April, 2009

“Hello world” now only 11k using GHC with shared libs

Tuesday, April 28th, 2009

$ ./Hello.dyn
Hello World!

$ ls -ogh Hello Hello.dyn
411K 2009-04-28 21:59 Hello
 11K 2009-04-28 21:55 Hello.dyn

On Linux x86-64, with GHC using shared libraries, a “Hello World” program is now only 11k, compared to 411k previously. By comparison, JHC manages 6.4k and an equivalent C program is 6.3k. (All sizes are after running strip on the binary.)
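
For anyone wanting to try something similar, here is a minimal sketch of the kind of program being measured. The file name and the build commands in the comments are assumptions rather than the exact steps used above: they presume a GHC built with shared library support, where the `-dynamic` flag asks for dynamic linking against the Haskell libraries.

```haskell
-- Hello.hs (hypothetical file name)
--
-- Assumed build steps, given a GHC with shared libraries available:
--   ghc Hello.hs -o Hello                 (static linking, the default)
--   ghc -dynamic Hello.hs -o Hello.dyn    (link against shared libs)
--   strip Hello Hello.dyn                 (sizes are quoted after strip)

-- The text to print, kept as a separate binding so it is easy to inspect.
greeting :: String
greeting = "Hello World!"

main :: IO ()
main = putStrLn greeting
```

The sizes you see will depend on the platform and GHC version, so the figures above should not be expected to reproduce exactly.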

As I mentioned earlier, the IHG has asked us to work on improving GHC’s support for shared libraries. I’ve been updating the new GHC build system to support --enable-shared and I’ve just now managed to get the build to go through. I’ll clean up my patches and send them in tomorrow. There are still a number of things to do. I’ve got to run the testsuite with everything built for shared libs. Clemens had this working before so I’m not expecting too many test failures. We also need to set up a GHC buildbot to use --enable-shared so that we do not get regressions.

The next task will be to test that it works to make a Haskell library that exports a C API and to use it as a plugin for some other program. Anyone got any good suggestions for a simple demo plugin? What programs have nice simple plugin APIs?
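
To make “a Haskell library that exports a C API” a little more concrete, here is a minimal sketch of the Haskell side using the standard FFI `foreign export` declaration. The module name `Plugin`, the function `pluginTriple` and the C symbol `plugin_triple` are all invented for illustration; a real plugin would export whatever entry points the host program’s plugin API expects.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
-- Hypothetical plugin module; every name here is illustrative only.
module Plugin (pluginTriple) where

import Foreign.C.Types (CInt)

-- The plugin's single entry point, written as an ordinary pure function
-- over C-compatible types.
pluginTriple :: CInt -> CInt
pluginTriple n = 3 * n

-- Export the function with the C calling convention, so that a host
-- program can call it under the C name `plugin_triple`.
foreign export ccall "plugin_triple" pluginTriple :: CInt -> CInt
```

A C host has to start the GHC runtime with `hs_init()` before calling the exported function and shut it down with `hs_exit()` afterwards. Packaging the result as a small shared library that a host can load is the part this work is meant to make practical.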

First round of Industrial Haskell Group development work

Tuesday, April 28th, 2009

The Industrial Haskell Group (IHG) have asked us to get cracking on a number of tasks:

  • Make dynamic/shared libraries work better
  • Make it possible to build GHC without using GMP
  • FFI checker/lint tool
  • Improving hsc2hs/c2hs + Cabal to make it easier to write C wrappers to C functions

We’ll talk in more detail about each one as we tackle them.

Shared libraries

We’ve started on the shared libraries task. This is quite a big area. Lots of people have put a lot of hard work into it already but there’s a fair bit left to do before we have GHC releases using them by default.

A little history

Wolfgang Thaller did a lot of the original work on generating position independent code (PIC) in the native codegen. Clemens Fruhwirth pushed things further along as part of a SoC project. He got shared libs working on Linux and started to address some of the packaging and management issues. GHC version 6.10 actually released with the shared libs code as an experimental feature.

Why do we care about shared libs?

There are several reasons we care. The greatest advantage is that it enables us to make plugins for other programs. There are loads of examples of this: think of plugins for things like vim, gimp, postgres and apache. On Windows, if you want to make a COM or .NET component then it usually has to be a shared library (a .dll file).

Over the years, most of the demand for this feature has come from Windows users, and for some time it has been possible to generate .dlls using GHC (though it was broken in version 6.10.1). It’s not been an easy feature to use however, and what’s more the current results are not exactly great. While you can currently take a bunch of Haskell modules that export a C API and make a .dll, the .dll file you get is huge. It statically links in the runtime system and all the other Haskell packages. So if you want to use more than one .dll plugin then each one has its own copy of the GHC runtime system and all the libraries! Obviously this is not ideal. Having all these copies of the runtime system and base libs takes more memory, more disk space and slows things down. What everyone really wants is to be able to build the runtime system and each Haskell package as a separate .dll file. Then each plugin would be small and would share the runtime system and any other dependencies that they have in common.

A somewhat superficial reason is that it makes your “Hello World” program much smaller because it doesn’t have to include a complete copy of the runtime system and half of the base library. It’s true that in most circumstances disk space is cheap, but if you’ve got some corporate shared storage that’s replicated and meticulously backed-up and if each of your 100 “small” Haskell plugins is actually 10MB big, then the disk space does not look quite so cheap.

Using shared libraries also makes things a bit easier for Haskell applications that want to do dynamic code loading. For example, GHCi itself currently has to load two copies of the base package: the one that it is statically linked with and another copy that it loads dynamically. With shared libraries it would just end up with another reference to the same copy of the single shared base library.

Shared libs also completely eliminate the need for the “split objs” hack that GHC uses to reduce the size of statically linked programs. This should make our link times a bit quicker.

What we’ll be doing

We’re planning to get things to the stage where a GHC user can make a working plugin on Linux x86, Linux x86-64 and Windows.

As recently as a few days ago people have managed to get GHC HEAD working with shared libraries on Linux x86-64. Since then however we’ve had the new GHC build system land in the HEAD branch. So the first thing I’ve been working on is porting the shared library support to the new build system. So far so good. I’ll report when I’ve got the build to go all the way through.

Platform progress and the Hackathon

Friday, April 24th, 2009

The Haskell Hackathon last weekend was a great success with more than 50 people attending over the three days. Thanks to the sponsors and local organisers!

If you’ve been to a few of these events you learn that it’s best not to come with too many preconceived ideas for what to work on. Since the point of the hackathon is really collaboration, you end up spending half the time talking and the other half working on cool ideas that other people bring.

I arrived with the general plan to work on the Haskell Platform release, and along with Don Stewart and Lennart Kolmodin we did actually get a bit done. I’m slightly embarrassed to admit that I spent three days at the Haskell Hackathon and wrote no Haskell code, only POSIX shell script and M4 autoconf macros!

Don and I updated the list of packages that will be in the first platform release. There were a few that needed to be bumped after the ghc-6.10.2 release. Our thanks to Ross who had already uploaded all the core and “extra libs” packages to Hackage.

The three of us also worked on making a generic Unix tarball of the platform. The point is for users of distros which do not yet have native packages for the platform to be able to download this tarball and ./configure; make; make install. We even managed to get something working just enough for people to be able to test it (haskell-platform-2009.0.0.tar.gz).

Chris Eidhof and Eelco Lempsink of Tupil designed a cool “Get Haskell” download page.

(The silly caption was Chris’s joke in response to Ganesh’s comment about an earlier design)

The idea is that we would put this at http://haskell.org/download/ to provide an easy start for new users. For OSX and Windows, the icons would link directly to a download and a page with install and post-install instructions. The Linux icon would link to another page with instructions for each supported distro, or the generic tarball for unsupported distros.

Outside of the Hackathon people have also been working hard on the platform release. If you’re on the mailing list you’ll know that Mikhail Glushenkov has been making great progress on preparing a Windows installer. He’s got a beta version available (HaskellPlatform-2009.0.0-setup.exe). Report feedback in the platform trac ticket #6.

Gregory Collins has also been working hard on a cabal2macpkg tool to generate OSX packages from Cabal packages. He’ll use this for each package in the platform and then bundle them all together (along with ghc) into one installer. He’s been having difficulty with the fact that the package format for OSX Leopard is woefully under-documented.

If you’re someone who prepares distro packages then now is an excellent time to get started making sure you’ve got the correct versions of all the platform packages and making a haskell-platform meta-package. See the platform trac for more details.