Haskell Records Considered Grungy

April 24, 2007

Ugly field selection syntax
OK, the most trivial complaint first. If we have defined a record like this:

> data Bird = Bird { name :: String, wings :: Integer }

How do we go about accessing the ‘name’ and ‘wings’ fields of a record instance? If you are used to a language like C, you might say it would look something like this:

> (Bird { name = "Fred", wings = 2 }).name

Unfortunately, this isn’t the case. Actually, declaring a record creates a named function for each field, which uses pattern matching to destructure the record passed to it and return that field’s value. So access actually looks like this:

> name (Bird { name = "Fred", wings = 2 })
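Because the selectors are ordinary functions, they can be passed around like any other. The following is a minimal sketch of that point; `flock`, `flockNames` and `totalWings` are hypothetical names introduced only for illustration.

```haskell
data Bird = Bird { name :: String, wings :: Integer }

-- A hypothetical flock, used only for illustration.
flock :: [Bird]
flock =
  [ Bird { name = "Fred", wings = 2 }
  , Bird { name = "Clucker", wings = 2 }
  ]

-- The selector 'name' has type Bird -> String, so it can be given to map.
flockNames :: [String]
flockNames = map name flock

-- Likewise, 'wings' has type Bird -> Integer.
totalWings :: Integer
totalWings = sum (map wings flock)
```

This is the one real upside of the prefix form: no special syntax is needed to use a field accessor as a first-class value.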

Prefix notation may please the Lisp fans, but for me at least it can get a bit confusing, especially when you are dealing with nested fields. In this case, the code you write looks like:

> (innerRecordField . outerRecordField) record

Which (when read left to right, as is natural) is entirely the wrong order of accessors. However, it is possible to argue this is just a bug in my brain from having spent too long staring at C code. Anyway, let’s move on to more substantive complaints!
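To make the ordering complaint concrete, here is a small sketch with a hypothetical nested record (`Farmer` containing an `Address`; neither type appears elsewhere in this post). It also shows one possible way to get a left-to-right reading, using the reverse application operator `(&)` from `Data.Function`, which assumes a reasonably recent version of base.

```haskell
import Data.Function ((&))

-- Hypothetical nested records, for illustration only.
data Address = Address { city :: String }
data Farmer = Farmer { farmerName :: String, address :: Address }

fred :: Farmer
fred = Farmer { farmerName = "Fred", address = Address { city = "Cambridge" } }

-- Composition reads right to left: 'address' is applied first, then 'city'.
cityComposed :: String
cityComposed = (city . address) fred

-- Reverse application reads left to right, closer to fred.address.city in C.
cityPiped :: String
cityPiped = fred & address & city
```

Both definitions compute the same value; the second merely reorders how the accessors appear on the page.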

Namespace pollution
Imagine you’re writing a Haskell program to model poultry farmers who work as programmers in their spare time, so naturally you want to add a Person record alongside the Bird record above:

> data Person = Person { name :: String, knowsHaskell :: Bool }

But I think you’ll find the compiler has something to say about that….

Multiple declarations of `Main.name'
Declared at: Records.hs:3:19

Ouch! This is because of the automatic creation of the ‘name’ function I alluded to earlier. Let’s see what the Haskell compiler’s desugaring would look like:

> data Bird = Bird String Integer

> name :: Bird -> String
> name (Bird value _) = value

> wings :: Bird -> Integer
> wings (Bird _ value) = value

> data Person = Person String Bool

> name :: Person -> String
> name (Person value _) = value

> knowsHaskell :: Person -> Bool
> knowsHaskell (Person _ value) = value

As you can see, we have two name functions in the same scope: that’s no good! In particular, this means you can’t have records which share field names. However, using the magic of type classes we can hack up something approaching a solution. Let’s desugar the records as before, but instead of those name functions add this lot:

> class NameField a where
>   name :: a -> String

> instance NameField Bird where
>   name (Bird value _) = value

> instance NameField Person where
>   name (Person value _) = value

All we have done here is use the happy (and not entirely accidental) fact that the ‘name’ field is of type String in both records to create a type class with instances to let us extract it from both record types. A use of this would look something like:

> showName :: (NameField a) => a -> IO ()
> showName hasNameField = print ("Name: " ++ name hasNameField)

> showName (Person "Simon Peyton-Jones" True)
> showName (Bird "Clucker" 2)
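Put together as a single module, the type class hack might look like the sketch below. The records are the desugared (positional) versions, so values are built with plain constructor application. `nameLine` is a hypothetical pure helper added so the formatted text can be checked without IO, and `putStrLn` is used in place of `print` to avoid the extra quotes that `print` puts around a String.

```haskell
-- Desugared records: plain constructors with positional fields.
data Bird = Bird String Integer
data Person = Person String Bool

-- One class per shared field name.
class NameField a where
  name :: a -> String

instance NameField Bird where
  name (Bird value _) = value

instance NameField Person where
  name (Person value _) = value

-- Hypothetical pure helper: builds the line of text for any type with a name.
nameLine :: NameField a => a -> String
nameLine hasNameField = "Name: " ++ name hasNameField

-- Works for both Bird and Person values.
showName :: NameField a => a -> IO ()
showName = putStrLn . nameLine
```

For example, `showName (Bird "Clucker" 2)` and `showName (Person "Simon Peyton-Jones" True)` both type check against the same `showName`.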

Great stuff! Actually, we could use this hack to establish something like a subtype relationship on records, since any record with at least the fields of another could implement all of its field type classes (like the NameField type class, in this example). Another way this could be extended is to make use of the multiparameter type classes and functional dependency extensions to GHC to let the field types differ.
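As a rough sketch of that last idea, the class can take the field type as a second parameter, with a functional dependency saying that the record type determines it. `Robot` is a hypothetical record invented here whose "name" is a serial number rather than a String; the three GHC extensions named in the pragma are the only ones assumed.

```haskell
{-# LANGUAGE MultiParamTypeClasses, FunctionalDependencies, FlexibleInstances #-}

data Bird = Bird String Integer

-- Hypothetical record whose "name" field is a serial number.
data Robot = Robot Integer Bool

-- The dependency r -> t means each record type fixes the type of its name field.
class NameField r t | r -> t where
  name :: r -> t

instance NameField Bird String where
  name (Bird value _) = value

instance NameField Robot Integer where
  name (Robot value _) = value

-- The result types are inferred from the record type via the dependency.
birdName :: String
birdName = name (Bird "Clucker" 2)

robotName :: Integer
robotName = name (Robot 42 False)
```

Without the functional dependency, a call such as `name (Robot 42 False)` would leave the result type ambiguous unless it was annotated at every use.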

Of course, this is all just one hack on top of another. Actually, considerable brainpower has been expended on improving the Haskell record system, such as in a 2003 paper by the aforementioned Simon Peyton Jones. This proposal would have let you write something like this:

> showName :: (r <: { name :: String }) -> IO ()
> showName { name = myName, .. } = print ("Name: " ++ myName)

The “r <: { name :: String }” indicates that any record containing at least a field called name of type String can be consumed. The two dots “..” in the pattern match likewise indicate that fields other than name may be present. Note also the use of an anonymous record type: no data declaration was required in the code above. This is obviously a lot more concise than having to create the type classes ourselves, as we did, but actually we can make it even more concise by using another of the proposed extensions:

> showName { name, .. } = print ("Name: " ++ name)

Here, we omit the “name = myName” pattern match and make use of so-called “punning” to give us access to the name field: very nice! Unfortunately, all of this record-y goodness is speculative at least until Haskell’ gets off the ground.
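For ordinary declared records (not the anonymous records of the proposal), GHC offers punning through its NamedFieldPuns extension. The sketch below assumes a GHC that supports that extension; `nameLine` is a hypothetical helper name, and `putStrLn` is used rather than `print` so the output has no surrounding quotes.

```haskell
{-# LANGUAGE NamedFieldPuns #-}

data Person = Person { name :: String, knowsHaskell :: Bool }

-- The pattern Person { name } binds a local variable 'name' to the field's value,
-- shadowing the selector function of the same name inside this definition.
nameLine :: Person -> String
nameLine (Person { name }) = "Name: " ++ name

showName :: Person -> IO ()
showName = putStrLn . nameLine
```

Note that this only covers the punning part: the function still works on `Person` alone, not on any record that happens to have a name field.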

Record update is not first class

Haskell gives us a convenient syntax for record update. Let’s say that one of our chickens strayed too close to the local nuclear reactor and sprouted an extra limb:

> exampleBird = Bird { name = "Son Of Clucker", wings = 2 }
> exampleBird { wings = 3 }

The last line above will return a Bird identical in all respects except that the wings will have been changed to 3. The naïve amongst us at this point might then think we could write something like:

> changeWings :: Integer -> Bird -> Bird
> changeWings x = { wings = x }

The intention here is to return a function that just sets a Bird record’s “wings” field to x. Unfortunately, this is not even remotely legal, which does make some sense: if it were, record update should, to follow the normal function application convention, look more like this:

> { wings = 3 } exampleBird
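The usual workaround is to name the record argument explicitly, which turns the update into an ordinary function at the cost of one wrapper per field. A minimal sketch follows; `mutatedFlock` is a hypothetical name used only to show partial application.

```haskell
data Bird = Bird { name :: String, wings :: Integer } deriving (Eq, Show)

-- Naming the record argument makes the update a plain function.
changeWings :: Integer -> Bird -> Bird
changeWings x bird = bird { wings = x }

exampleBird :: Bird
exampleBird = Bird { name = "Son Of Clucker", wings = 2 }

-- Partial application now works as it does for any other function.
mutatedFlock :: [Bird]
mutatedFlock = map (changeWings 3) [exampleBird, exampleBird { name = "Clucker" }]
```

This works, but the wrapper has to be written by hand for every field, which is exactly the sense in which the update syntax itself is not first class.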

Right, I think that’s got everything that’s wrong about Haskell records off my chest: do you know of any points I’ve missed?

Edit: Corrected my pattern match syntax (whoops :-). Thanks, Saizan!
Edit 2: Clarified some points in response to jaybee’s comments on the Reddit comments page.


10 Responses to “Haskell Records Considered Grungy”

  1. Josef Svenningsson Says:

    I agree with you that Haskell’s record system is not the strongest part of the language. It’s a hack that was added because it was easy to do so and gave a way to refer to components of a constructor by name instead of by position. So the power to weight ratio is pretty high but it still is something of a wart.

    The best proposal for records that I have ever seen is Daan Leijen’s “Extensible records with scoped labels” (http://www.cs.uu.nl/~daan/download/papers/scopedlabels.pdf). I wish some Haskell implementation would implement that so that one could play with it. It would also be very nice to have in Haskell’ but that may be too much to hope for.

  2. Daan Leijen, “Extensible records with scoped labels” – http://www.cs.ioc.ee/tfp-icfp-gpce05/tfp-proc/21num.pdf

    That paper may be of interest to you; it’s a very different, and very neat, way of defining records.

  3. Logan Capaldo Says:

    An alternative hack (which I’m sure you are aware of) would be something like:

    module Bird where
    data Bird = Bird { name :: String, wings :: Integer }

    module Person where
    data Person = Person { name :: String, age :: Integer }

    module Main where
    import qualified Bird
    import qualified Person -- heh

    > Person.name examplePerson

    > Bird.name exampleBird

    To make it a little less wordy:
    import qualified Bird as B

    > B.name exampleBird

  4. Thanks a lot for all the great comments! I’ll be sure to check out those two papers. And yeah, the module system is one (considerably cleaner, it must be said!) way to get around the namespace problem, but it’s still not perfect :). Here’s hoping Haskell’ solves all this for us!

  5. Saizan Says:

    your desugaring examples are quite wrong:
    > name :: Bird -> String
    > name (value, _) = value
    should be written:
    > name (Bird value _) = value

    Also you can indeed write a changeWings function:
    > changeWings :: Integer -> Bird -> Bird
    > changeWings x b = b{wings = x }

    the problem is that you can’t write something like this:
    > changeField field v b = b{field = v}

  6. Saizan, thanks for your comment! You are indeed right that my pattern matches were totally off, I’ve fixed that (maybe it’ll teach me to make posts without a compiler available!).

    However, my point about changeWings stands: what I’m trying to say is that { wings = x } is not actually a function, it’s something a bit special that has meaning only when a record value is put before it. It’s ugly because it breaks the nice functional orthogonality of other Haskell constructions. I probably should have made this clearer!

  7. josh Says:

    I think the { wings = x } issue would be less of a problem if you could write “update field container value = container { field = value }”. But of course you can’t.

  8. Idetrorce Says:

    very interesting, but I don’t agree with you

  9. Jimmy H Says:

    I find it ironic that you complain about record extraction being prefix rather than postfix, and then later complain that record update isn’t available in point-free form, and indicate that if Haskell were saner, it would also be prefix…

    The “functional orthogonality” of Haskell that you mention is its best feature in my book. updateWings = {wings = 3} is just changing the semantics of record update to something in line with the rest of the language.

    For the people who want “update field container value = …”, what should its type be?

  10. Max Nanasy Says:

    You still have some errors in the way you wrote your pattern matches:

    > name (Bird value, _) = value
    should be
    > name (Bird value _) = value

    Commas are only for record syntax and some syntactically unique constructors, such as lists and tuples.
