36 – John Hughes

Recorded 2023-09-21. Published 2023-10-31.

In this episode, Matti and Wouter are joined by John Hughes. John is one of the authors of the original Haskell Report and talks about why functional programming matters, the origins of QuickCheck testing, and how higher-order functions and lazy evaluation are the keys that make functional programming so productive, and so much fun!

Transcript

This transcript may contain mistakes. Did you find any? Feel free to fix them!

Wouter Swierstra (0:00:14): Welcome to the next episode of the Haskell Interlude. I’m Wouter Swierstra, and I’m joined by Matthias Pall Gissurarson.

Matthias Pall Gissurarson (0:00:20): Hi.

WS (0:00:21): Our guest today is John Hughes, who will tell us how to specify it, why he loves buggy code, the importance of having fun, and why functional programming matters.

MPG (0:00:34): Alright. Welcome, John.

John Hughes (0:00:36): Thank you. It’s nice to be here.

MPG (0:00:38): So, John, first question: why does functional programming matter?

JH (0:00:43): So, let me answer that by saying why I fell for functional programming in the first place. I was fortunate to spend a year as a junior programmer between school and university in the Programming Research Group in Oxford. And I had an opportunity to learn all kinds of things during that year. And one of the things I did was I picked up a book about Lisp, and I thought it was just so cool. It was not a super great book, I don’t think, but I thought it was wonderful. And the only thing that disappointed me was that I didn’t have a Lisp implementation at hand to play with. So, I wrote one on my own, which was appalling when I read the code a couple of years later. But it worked, and I was able to start writing Lisp programs, and I thought it was so much fun. And the reason I thought it was fun was that I found I was so productive. I could do so much with so little. As a single developer, I could achieve much more than I was used to. And I think that’s the key—productivity. 

So, I wrote my old paper, Why Functional Programming Matters, about higher-order functions and lazy evaluation. I think those are two features that support productivity because they support modularity, and they’re two of the key features that make functional programming so productive and so much fun. But then other things are important as well. Like, all of us have debugged an awful side effect somewhere. And you can spend hours scratching your head because something has a side effect that you didn’t know about. And when you’re writing functional code, you know that that’s not happening, most of the time. So, you can eliminate a whole lot of possible causes very quickly and debug things more quickly as a result.

So, I think referential transparency—perhaps we often talk about this as supporting reasoning, and it sounds as though reasoning is something that only people doing formal reasoning in ivory towers should care about. But actually, when we are debugging code, that’s what we’re doing. We are reasoning. And it really is easier if you don’t have to worry about side effects.

So, there are all those reasons that just make functional programming a very productive kind of programming. And as you go through your career and you become a professor, you have less time for actual hacking. And so, it’s really important to be able to do a lot within the time that you have. But if I look at the places where functional programming has been taken up and got a lot of users in industry: parallelism, yes, that has been important, especially early on when multicores came out; that made people look at functional programming. But why did they stay with it? I think it has a lot to do with productivity.
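[Editor’s note: the glue argument from Why Functional Programming Matters is easy to see in miniature. The sketch below is ours, in the spirit of the paper’s Newton-Raphson example rather than quoted from it: laziness lets one function produce an infinite stream of approximations while a separate function decides when to stop, and higher-order functions wire the two together.]

```haskell
-- One Newton-Raphson step for computing sqrt n.
next :: Double -> Double -> Double
next n x = (x + n / x) / 2

-- An infinite list of ever-better approximations; laziness makes this fine.
approximations :: Double -> [Double]
approximations n = iterate (next n) 1

-- A separate consumer that decides when the stream has converged.
within :: Double -> [Double] -> Double
within eps (a : b : rest)
  | abs (a - b) <= eps = b
  | otherwise          = within eps (b : rest)
within _ xs = head xs   -- unreachable for infinite inputs

squareRoot :: Double -> Double
squareRoot = within 1e-9 . approximations

main :: IO ()
main = print (squareRoot 2)   -- ~1.4142135623730951
```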

WS (0:03:55): So, it’s interesting that you mention these two points in your paper, right? You talk about laziness and higher-order functions as being the glue which helps you assemble modular programs.

JH (0:04:07): Yeah.

WS (0:04:09): And I think I agree, but I was wondering if your opinion has changed over the years or if there are other reasons that have come up.

JH (0:04:20): Well, as I said, in that paper, I was a little dismissive of things like referential transparency, in that I argued that it’s important not to talk about the things you can’t do, but the things that you can do. So, I think perhaps getting so interested in testing has given me a slightly different perspective. And that’s one that values referential transparency more highly than perhaps I did as a foolish young man. When you are testing a function, knowing that the only things affecting its behavior are the inputs that you can see, and that the only thing it’s doing is producing the output that you can also see: that is a huge win.

WS (0:05:13): Yeah. So, I think it’s interesting that people sometimes say this, that laziness, or trying to design a lazy language, meant that we didn’t have a good idea about how to do side effects. And then, as a result of that, people were encouraged to write pure code, and then all of a sudden people realized, “Oh, we don’t need all these side effects.” And then the purity is actually as important as the laziness, perhaps.

JH (0:05:42): So, I think there are people who would make that argument. And indeed, you can see that purity has made a lot of inroads in other programming languages as well. Like immutable data, it’s something that people writing parallel programs in any programming language will try and make a lot of use of nowadays, I believe.

I am not one of the people who has given up on lazy evaluation. I use it heavily when I write Haskell code. But I have also written a lot of Erlang code, and when I don’t have laziness, I find I have to simulate it. So, there’s the usual trick of using a zero-argument function as a thunk to represent laziness. And there are cases, search algorithms, for example, where building a lazy data structure and then traversing it is the natural way, for me at least, to solve a problem. And I’ve had to simulate lazy evaluation in Erlang in order to express that kind of code in a nice way. And when I do that, then I discover that, of course, you are in a situation where lazy values and strict values have different types. And that means that you have to think in advance about where I am going to want laziness. And because there is a cost, albeit a small one, in putting laziness in, one tends to put it in too few places. One has to go back later on and refactor the code and make it lazier. And life is too short.
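[Editor’s note: a minimal sketch of the thunk trick John describes, written in Haskell notation since that is the podcast’s language. In real Haskell none of this machinery is needed, because everything is already lazy; in Erlang the thunk would be written fun() -> Expr end. The sketch also shows the cost he mentions: lazy and strict values get different types, so you must decide in advance where laziness goes.]

```haskell
-- A suspended computation: a zero-argument function.
newtype Thunk a = Thunk (() -> a)

delay :: a -> Thunk a          -- in Erlang: fun() -> Expr end
delay x = Thunk (\_ -> x)

force :: Thunk a -> a          -- in Erlang: Thunk()
force (Thunk f) = f ()

-- A lazy list in this style: the tail is only built when forced.
-- Note the type distinction John mentions: Thunk (LazyList a), not LazyList a.
data LazyList a = Nil | Cons a (Thunk (LazyList a))

naturalsFrom :: Integer -> LazyList Integer
naturalsFrom n = Cons n (delay (naturalsFrom (n + 1)))

takeL :: Int -> LazyList a -> [a]
takeL 0 _          = []
takeL _ Nil        = []
takeL k (Cons x t) = x : takeL (k - 1) (force t)
```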

WS (0:07:20): Yeah. Fair enough. So, one point the paper doesn’t mention is static typing, which I think is very important. But I was curious to hear, because you’ve worked quite a lot in both Haskell and Erlang, what your thoughts are there.

JH (0:07:39): Yes. So, I usually say that I think I have a somewhat unusual perspective on this. And it’s probably driven by the fact that a lot of the work that I’ve done in Erlang has been to do with QuickCheck. So, when I don’t have static types, what does that mean? It means I have to find my type errors by testing. But what do you know? I have a super testing tool, and all of my code is extremely thoroughly tested all the time. And provided that I’m working that way, then, actually, type errors are easy to find. You can say, well, static typing will give you the type error compile time, but test time is only very, very shortly after compile time. And so, for the most part, I find type errors easily enough anyway. 

Now, there are things that are more difficult, like refactoring code to change the type of something, changing the representation of a data structure, for example, or an abstract data type. That is something I’m more nervous about in Erlang than in Haskell. So, that’s certainly one place where I think static types are very helpful.

The downside of static typing, I think you see in teaching. I used to teach first-year programming, the first programming course, to undergraduates, and explaining what Haskell type errors meant was actually beyond me. I told students, if it says Num, it means it’s something to do with arithmetic. The type error message means something is wrong. And it’s very difficult to explain to beginners what those things mean. And I think the fundamental problem here is that to understand type error messages, you have to understand static semantics as well as dynamic semantics. Whereas to understand test failures, you only have to understand dynamic semantics.

So, for people at the beginning of their programming career, just learning one semantics is hard enough. So, I’m not so convinced that static typing has been helpful in that particular context. I mean, the Haskell type system has developed a lot over the years, and I love things like GADTs that let me express all kinds of things very nicely in the types. So, yes, I enjoy static typing, but I have a more ambivalent regard for it than perhaps many other people do.
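[Editor’s note: a classic instance of the kind of beginner-facing error John means, our illustration rather than his.]

```haskell
-- This looks innocent:
--
--   average xs = sum xs / length xs
--
-- but GHC rejects it with something like "No instance for (Fractional Int)
-- arising from a use of '/'", because (/) is overloaded and only works at
-- Fractional types, while length returns an Int. Explaining that to someone
-- in week two of programming means explaining type classes. A working version:
average :: [Double] -> Double
average xs = sum xs / fromIntegral (length xs)
```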

WS (0:10:17): Yeah. I certainly agree with your point that for beginners, I’m encountering sometimes very – I don’t want to say dark corners, but very specific Haskell things about type class constraints when they really just – they want to reverse a list or add up some things, and they make a mistake, which seems innocent enough. But then they suddenly see an error message which is exposing all kinds of details about how numbers work in Haskell that they’re completely unfamiliar with.

On the other hand, I do see attempts like Helium, or other languages like Elm, which take a much less ambitious approach to having very fancy types, but then they say, “Oh, but we want to make these type errors really, really good.” And I’m a little bit on the fence, because I feel that you want to confront even beginners with this discipline; you want them to get into it. It’s very easy to get misled and think, “I’ll just write a little script and not care about my types. I’ll fix it later,” almost the Python mindset. “Oh, I don’t need to test this code, really, because of course it’ll work.” But then suddenly, you realize that there’s a branch in an if that hasn’t been explored or tested very thoroughly, and then this can really bite you later on; I know from experience. So, I’m a little bit on the fence here.

JH (0:11:54): Yeah. I think for beginners, there’s also the fact that types offer a kind of model before you write the code. So, just thinking up the types is very helpful for thinking up the code. That’s a key part of the – what’s the name? The design, your –

WS (0:12:16): How to Design Programs.

JH (0:12:19): How to Design Programs. That’s right. That’s certainly valuable. On the other hand, any static type system is conservative. And that means there are programs that maybe even beginners can write that would work, but the type checker is going to reject them. And that’s something which is, I think, not so easy to explain to beginners. I mean, there are swings and roundabouts here, but certainly, the overloading in Haskell is a source of severe problems for beginners, I think.

WS (0:12:52): For sure.

JH (0:12:52): And I think many of us who were involved in Haskell in the early days and have done a lot of teaching using Haskell are rather dubious of the FTP (Foldable/Traversable in Prelude) reform in particular, because that led to many more things being overloaded and many more programs being declared ambiguous.

WS (0:13:13): Well, there’s a trade-off here because, especially in this example, you can see that there’s a lot of programs which get accepted and return a result, but maybe not the thing you were trying to do. And because you were expecting a type error, you would’ve liked to see a type error rather than kind of soldier on and make a best guess, right?

JH (0:13:36): Yes. That’s also a problem. Now, I’m remembering a generator I wrote fairly recently where I wanted to fold over a list. So, I used foldr in my generator. Then I realized that the type checker didn’t know that I meant foldr over lists. Because, I mean, you know how it is in the Haskell type system: if the thing you feed in is just an intermediate data structure that you then consume with foldr, there’s nothing there that tells the type checker it’s a list. So, what do you have to do? Well, you can add a type signature, except these are polymorphic types, so first of all, you have to turn on ScopedTypeVariables. And then perhaps it’ll go through, or perhaps you’ll get a rigid type variable error. Or you can add an identity function, maybe of type list to list, and you know that GHC will unfold that, so there’s no dynamic cost, and you call that function in the right place to tell the type checker this is really a list. I mean, it took a few attempts to get the code to compile for me, and I am reasonably expert where Haskell is concerned. When beginners are faced with this kind of problem – I fear that Haskell may have evolved in a way that is suited to experts, but becoming less and less suited to beginners. And if you want a paradigm to spread, you need beginners. Everybody’s a beginner at first.
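[Editor’s note: a toy reconstruction of the ambiguity John describes, not his actual generator. Since the FTP change, foldr consumes any Foldable, so if the intermediate structure is built by something equally overloaded, nothing pins it down as a list.]

```haskell
-- Rejected with an ambiguous type variable error: 'pure' could build any
-- Applicative container, and foldr will consume any Foldable one.
--
--   bad n = foldr (+) 0 (pure n)
--
-- One fix is a type annotation (which for polymorphic code may need
-- ScopedTypeVariables). The other is the identity-function trick John
-- mentions:
asList :: [a] -> [a]   -- tells the checker "this is really a list";
asList = id            -- GHC inlines it, so there is no runtime cost

good :: Int -> Int
good n = foldr (+) 0 (asList (pure n))
```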

WS (0:15:12): Yeah. Some of these things, though, with the FTP in particular, I wonder how people learning Haskell now experience them. When I make these mistakes, it’s because I always think, “Oh, null will only work on lists,” or “Foldr will only work on lists.” So, if I call it on something which is not a list, I’ll get a type error. But perhaps the beginners who learn Haskell now will realize, “Oh, I actually have to be very careful about this thing, because I’ve been bitten by this mistake once, when I called null on a Maybe and I didn’t get the result that I was expecting.” I don’t know. Maybe they’ll get used to this, and maybe some of it is also the historical baggage of being an experienced Haskell user, expecting certain behavior, and then being confronted with something else.
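[Editor’s note: the trap Wouter mentions, as a runnable example of ours. Since the FTP change, null accepts any Foldable, so applying it to a Maybe typechecks and soldiers on instead of raising a type error.]

```haskell
main :: IO ()
main = do
  print (null (Just ([] :: [Int])))  -- False! the Maybe is non-empty,
                                     -- even though the list inside is empty
  print (null ([] :: [Int]))         -- True, the answer you probably wanted
```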

JH (0:16:02): I think learning to use foldr, that’s something that I spend quite a bit of time on with my students when I teach Haskell. And just with foldr on lists, there’s an aha moment when you realize what you can do with it. I think it’s not a trivial concept. And if there are extra hurdles to jump over on the way to using it and understanding it, there’s a cost there.

WS (0:16:31): Yes. For sure. So, I had another episode I recorded recently with Iavor Diatchki, and it was interesting to hear him talk because he’s maintained quite a lot of Haskell code in very different styles. He said, “Okay, we have some very old code that just did some cryptographic processing, and I’ve pushed quite hard to not have it try to do fancy things, to keep it simple. And then we have another library which has gone full-on dependently typed Haskell, and they have an experience report about this at ICFP. And it’s a very different style of maintenance, where sometimes, by keeping things simple, it’s much easier for beginners to fix something in the code base, because they don’t have to understand what all these fancy types mean.” It’s very tempting to think, “Oh, I can encode this extra invariant in my type signature if I add this multi-parameter type class here, and then add a GADT there.” But somehow, there is a cost, right? There is a cost to these abstractions: they may make sense to you, but getting other people to maintain your code suddenly becomes much, much harder.

JH (0:17:58): Yes. That’s interesting. 

MPG (0:17:59): Right. So, usually, we ask people, how did you get into Haskell? But how did it come about for you?

JH (0:18:09): So, I was on the original Haskell Design committee, and that was – I think it was Paul Hudak’s initiative. And Haskell got started at FPCA ’87, if I recall correctly. So, that was Functional Programming and Computer Architecture, which was one of the two big functional programming conferences in those days. And Simon PJ was visiting Paul Hudak on the way there. And they hatched the idea of designing Haskell, the motivation being that there were many groups around the world working with lazy functional programming, and the strict functional programmers had ML. So, they had a language to gather around, but we, lazy functional programmers, did not really, or rather, everybody had their own language. And they were all very much the same, we thought. So, why not just agree on the common parts and have a common language that we could all work with? That was the idea. 

Actually, I went down with something at that conference. So, I spent most of the conference sitting in my hotel room within a short distance of the bathroom, which wasn’t very pleasant at all. And as a result, I missed the very first meeting where people got together and decided to design Haskell. But then I was involved in all of the subsequent meetings, including one in New Haven. We decided, and I can’t imagine why we did this, really, to meet in New Haven in January, and I flew on Icelandair. I got stuck in Reykjavík, delayed by a day on the way there. And there was a snowstorm, of course. What do you expect in Connecticut in January? So, it was very exciting and dramatic, but we had lots of interesting discussions at that time.

WS (0:20:14): Did you use many other functional languages between Lisp and the birth of Haskell?

JH (0:20:20): Oh, yes, of course. I mean, for a start, as I say, everybody designed their own. So, first of all, when I was an undergraduate, we implemented a Gedanken compiler. And Gedanken is not purely functional, but it has a strong functional part. So, I used that. I used PAL as well. PAL, you might say, was a functional language where the notion of an lvalue was a first-class citizen. So, you could write 1 becomes 2 in PAL. It had side effects. And what it meant was: create a new lvalue, put 1 in it, and then overwrite that with 2.

WS (0:21:03): What’s an lvalue? 

JH (0:21:05): An lvalue, oh, it’s a semantic concept. It’s basically the equivalent of an address. So, a variable would denote an lvalue: something that can be updated. So, I used those early on. And then I had designed a simple functional language of my own that I implemented with combinators after reading David Turner’s wonderful paper about SKI combinators. So, I used that for experiments. And I used David Turner’s languages. I never used SASL, but I had KRC (Kent Recursive Calculator), which was an untyped equational language, sort of like Haskell without types, where every equation had to fit on one line. I used that a lot during and immediately after my PhD. So, the work that is described in Why Functional Programming Matters was originally done using KRC. After that, David Turner designed Miranda, which was like KRC but with static types à la ML and with a richer syntax. And that was hugely influential on Haskell. So, I was teaching Miranda to my students before we designed Haskell. I did a lot of work with that.

MPG (0:22:34): So, in the design of Haskell, was there anything in particular that you had to fight for that the others didn’t want? How did it work out in those days?

JH (0:22:44): Yes. So, there was a lot of discussion, for example, of whether we should have a kind of definition style or an expression style of programming. So, what I mean by that is, for example, when you define a function with several equations with guards, versus one equation and an if-then-else. And I had learned to love guards through David Turner’s languages. And I was a strong advocate for using guards in Haskell. In the end, we ended up using both. What can I say? It’s a classic committee thing. But I think it was a good thing that we did that. But I think we might very well have ended up with a syntax more like ML and without guards, had that not been something that had become very important to me in my teaching. So, in my functional programming course, I made very heavy use of guards when teaching students how to write code. So, that was one thing.

Otherwise, I was always concerned about usability. So, a rather trivial thing is, in Haskell, when you write just one module, you don’t have to provide a module header. So, there’s a special case there. That special case was my doing. That was a small thing. But I looked at the Hello World program in Miranda and in Haskell as it was first designed. And I mean, that involves IO as well, which was in the first version of Haskell and was nowhere near as nice as it is today. But a part of it was giving an explicit module header, where you have to explicitly export the main function. I feel it’s important that a language is usable for toy problems. And again, this has to do with bringing in beginners, making it easy to pick up and do the first few things. Then later on, of course, you want to write proper module headers on your modules. At the beginning, I think it’s important that you don’t need to.

Another thing – where else did I have an impact? The semantics of equations and pattern matching. There were a number of proposals for that, and we could very well have ended up with a semantics that chose the evaluation order based on all of the patterns in a function definition. So, if you have a function with several arguments, you can imagine analyzing the definition and seeing: is there an argument that we match on in every equation? If so, the function must be strict in that argument. So, that’s evaluated first. Now, that isn’t the way that Haskell pattern matching works today. It’s top to bottom, left to right. And that is the same as the semantics in Miranda. But we spent quite a long time exploring alternative semantics for pattern matching. And there had been a lot of work on optimizing pattern matching, but it had been done in strict languages. In strict languages, matching order is just a question of optimization. In a lazy language, it’s a question of semantics.

And so, I was the one pushing for a compositional semantics of function definition and pattern matching. So, I got the job of trying to find a compositional semantics for this optimized version. Could I do that? No. To this day, I believe it would’ve required parallelism in some cases to implement. I mean, think of parallel-or or parallel-and: you can write a definition of either as a set of equations. And if you want to be absolutely faithful to each equation as it’s written, you need to evaluate the arguments in parallel, because one of them might terminate early and give you true or false, whichever the case may be, and that enables the function definition to complete. But I think building in parallelism as necessary to implement pattern matching would’ve been a big mistake. Parallelism is great in Haskell today, but it comes at a price, a price in terms of performance, which means that you need to use a relatively large granularity if you want to get a performance improvement. If you use parallelism at a very fine grain, it is way too expensive. And we would’ve been building a requirement for that into the language if we’d chosen that semantics, I believe.
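[Editor’s note: the parallel-or function John alludes to, written out in our rendering. Read as independent equations, the first two lines promise True whenever either argument is True, even if the other diverges; implementing that faithfully needs both arguments evaluated in parallel. Haskell’s actual top-to-bottom, left-to-right rule instead makes the function strict in its first argument.]

```haskell
por :: Bool -> Bool -> Bool
por True  _     = True
por _     True  = True
por False False = False

-- Under Haskell's semantics:  por True undefined == True,
-- but por undefined True diverges, because the first equation
-- forces the first argument.
```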

WS (0:27:27): So, what you mean by compositionality is in pattern matching, that you can give a semantics for each line in your definition independently and compose these as a definition for the whole?

JH (0:27:42): Yes.

WS (0:27:43): Right. And the problem here would be that if you did this analysis, which checks whether every equation matches on a particular argument, that would be non-compositional, because then suddenly adding a line could mean that the function becomes strict in that argument. Is that right?

JH (0:28:05): Yes, that’s right. So, adding a line later on will change the semantics of earlier lines.

WS (0:28:10): Exactly.

JH (0:28:11): And I’m not saying it’s impossible to come up with a compositional semantics, but the only way I could see to do it was to essentially put the syntax of the patterns into the semantics. And then, when all the equations defining a function have been gathered together, apply the analysis, basically. And that is not compositional in any useful sense.

WS (0:28:36): No, I can imagine.

JH (0:28:38): Yeah. So, that was one thing. Another thing: the monomorphism restriction. That arose after I wrote a very simple program that had exponential runtime, and I was very surprised by it. I was concerned about usability, again, for beginners. If code that you think is going to have a linear running time turns out to have exponential running time, and you are supposed to be an expert in the language, then God help beginners. I know the restriction that we ended up with hasn’t proven to be one of the most popular parts of the design, but it was in there to solve –

WS (0:29:15): Understandably, right? 

JH (0:29:16): The problem. 

MPG (0:29:18): But I mean, you did a lot of work on Haskell, but then you moved on to more testing stuff, particularly QuickCheck, right? So, how did that come about? It’s like 10 years later or so, right?

JH (0:29:36): Yes, that’s right. So, I had done my PhD in the Programming Research Group, which was led by Tony Hoare at the time, and formal methods were a very big part of that. I’d done some work with Richard Bird, who was very interested in equational reasoning. So, I was very interested in equations, algebraic laws, all this kind of thing. But QuickCheck came about when a research grant that I had came to an end. There was a big deadline, and I had to write a report. I’d been doing a lot of work on that, and it was finally done. And I had a week with nothing in particular I needed to do. I thought, “Great, I’m just going to have fun for a week.” And the first thing I did was I made something. Haddock didn’t exist in those days, but Javadoc did. So, I made something that would invoke Hugs and then ask it about the types of functions and generate some documentation containing those types, which I thought was kind of cool. Then I thought, “Well, how else could I use Hugs as a plugin?” And I thought maybe I could put some equations into comments and then use Hugs to test them. Well, I’d need some data, maybe some random data. So, I’d started thinking about that, and I’d started writing an implementation.

Then Koen stuck his head around the door and said, “What are you doing?” And so, I showed him what I was doing, and Koen realized, of course, that I was being completely wrongheaded, putting the equations in comments, and that it was much more sensible to define a DSL. So, he went away, and the next day he showed me his first version of QuickCheck, where what I had envisaged was defined using a DSL instead. And I thought it was great. But then there were some things he hadn’t thought about, specifically on generating recursive types, random trees, and the like. You easily run into problems there, which I had foreseen. And so, I took Koen’s code and I added the size parameter to it, for example, and some other stuff. And then we went to and fro with those things over the course of the week. And by the end of the week, we had a pretty complete first version of QuickCheck. We had figured out how to do preconditions, for example.
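[Editor’s note: a standard sketch of the size parameter John added, not the original 1999 code. A naive recursive generator for trees can diverge or produce enormous values, because each node generates more nodes; sized threads a bound that shrinks at each level, guaranteeing termination.]

```haskell
import Test.QuickCheck

data Tree a = Leaf | Node (Tree a) a (Tree a)
  deriving Show

genTree :: Arbitrary a => Gen (Tree a)
genTree = sized go
  where
    go 0 = pure Leaf                 -- the size bound forces termination
    go n = frequency
      [ (1, pure Leaf)
      , (3, Node <$> go (n `div` 2)  -- halve the budget for each subtree
                 <*> arbitrary
                 <*> go (n `div` 2)) ]

-- Preconditions, also worked out that week, are written with (==>),
-- e.g.  prop_insert x xs = sorted xs ==> sorted (insert x xs).
```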

So, it came about as a bit of fun. And we wrote a paper about it and didn’t know anything about the testing literature. We sent it to ESOP, and they turned it down. And it was probably a good thing they did because we had had very little time before the deadline. So, when we got it back, we were able to do a much better job of trying to learn about testing so that we could write a more sensible related work section. And then we sent the revised paper to ICFP. And 10 years later, we got the award for it. So, I guess that that should be an encouraging story for people who get a paper rejected—that quitting is not a good idea. So, that was fun. 

One of the reviews I got from ESOP, though, just said, “This has been done before in Emacs Lisp.” I would love to know what that referred to, but in all these years, I have never been able to find out.

WS (0:33:18): Surely. I mean, the random generation and the thinking of properties as kind of first-class values that you write down as a programmer: I think there were a lot of ideas in the QuickCheck paper that I hadn’t seen anywhere else, even using types in the way you did, right? Type classes to direct the generation. I mean, these are not easy things to do, especially not in Lisp. I wouldn’t know where to begin.

JH (0:33:51): Right. That’s a good point. An implementation in Emacs Lisp could not have used types in that way.

WS (0:34:00): No.

JH (0:34:01): But on the other hand, Erlang QuickCheck doesn’t either.

WS (0:34:03): No, no, no. I’m sure.

JH (0:34:04): You could make a property-based testing tool that doesn’t depend on types. But no, we were just having fun. And I think, actually, that is important because, even though it wasn’t what we had research funding to do, it was a whole lot of fun. It was fun for us at the time, and it’s been fun for other people since. And making testing fun is perhaps one of the most important contributions. I mean, I love sitting and testing code. Nowadays, I love buggy code, as long as it’s other people’s buggy code. Why? Because testing has become something that is so fun, and I love being able to take ideas that might have been ideas about how to do proofs about programs and turn them into effective tests.

So, one of the things I like doing is, you might say, using a reference implementation of a type as a model. And if you think about it, that goes back to Tony Hoare’s early paper on recursive data structures and proving the correctness of abstract data types. And you just take what Tony wrote in that paper (he was thinking of proofs, of course), you code it up as properties, and bingo, you have great tests. I love that. I love that interaction between two apparently separate fields. And I think perhaps in the past, people saw testing and formal proof more as opposing fields. And by putting them together, there are lots of ideas that contribute to both. I mean, nowadays, I think that’s a pretty widely accepted view, but it wasn’t back in the day.
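[Editor’s note: a small instance of the model idea, our example rather than John’s: use the obvious list implementation as a model for an efficient set, related by an abstraction function, much as in Hoare’s data-representation proofs, but run as a test.]

```haskell
import qualified Data.List as List
import qualified Data.Set  as Set
import Test.QuickCheck

-- Abstraction function: relate the efficient implementation to the model.
abstraction :: Set.Set Int -> [Int]
abstraction = Set.toAscList

-- The implementation's insert must agree with insert at the model level
-- (a sorted list of distinct elements).
prop_insertAgreesWithModel :: Int -> [Int] -> Bool
prop_insertAgreesWithModel x xs =
  abstraction (Set.insert x (Set.fromList xs))
    == List.sort (List.nub (x : xs))

-- ghci> quickCheck prop_insertAgreesWithModel
```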

WS (0:35:58): Yeah. Well, testing forces you to think about the specification, right? Especially as opposed to unit testing, where you just say, if I plug in the number 3, I should get the number 7 out. Okay, well, it seems to work. I think one major contribution here is that you force people to think abstractly: wait a minute, what is the abstract specification that my function or my library should satisfy? And whether you go about proving that or testing it, or checking it on random values or enumerated values that you feed in, there’s a lot of discussion you can have about what the best way is or what the right way is. But thinking abstractly about what it is that my program should do, I think that’s the important thing, right?
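[Editor’s note: Wouter’s contrast in miniature, with an example of ours: a unit test pins down one input/output pair, while a property states the specification once and lets QuickCheck try hundreds of random inputs.]

```haskell
import Test.QuickCheck

-- A unit test: one concrete example.
unitTest :: Bool
unitTest = reverse [1, 2, 3 :: Int] == [3, 2, 1]

-- Properties: the abstract specification, checked on random inputs.
prop_reverseInvolution :: [Int] -> Bool
prop_reverseInvolution xs = reverse (reverse xs) == xs

prop_reverseAppend :: [Int] -> [Int] -> Bool
prop_reverseAppend xs ys = reverse (xs ++ ys) == reverse ys ++ reverse xs
```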

JH (0:36:41): Absolutely. You have to change the way you think about testing. And that is, I think, a great strength. It’s also a weakness, because it makes it harder for people to pick a property-based testing tool up if they are used to writing unit tests. I’d say thinking of examples is something that people have learned to do from the moment they began programming, or perhaps before. General properties –

WS (0:37:08): Is harder.

JH (0:37:11): Yeah, it is a bit harder. And for many people, that’s an obstacle to get over. I mean, I’ve worked a lot with trying to popularize property-based testing. And when we first did the work, I thought, “Great, I can write and test properties of my code. What could be better than that?” And then I eventually learned that coming up with properties of one’s code isn’t something that everybody finds easy to do. And indeed, when you take real applications, it isn’t something that is necessarily easy for anybody to do.

WS (0:37:45): So, what’s the hurdle that you’ve encountered? Or why do people struggle to embrace the idea? Because for me, maybe because of decades of functional programming, you think, “Oh, property-based testing, it’s a unit test generator. It’s like, I no longer need to write the unit test. I can write this function, which generates unit tests for me. It’s so much better.” 

JH (0:38:11): So, one problem is just the idea of coming up with an abstract spec. And sometimes people can’t really think of anything other than the implementation. Then they can end up writing a property that basically re-implements the system in the property. And then they feel, “I did the same work twice. What was the point of that?” And in fact, there isn’t very much of a point, because if they made a mistake in the original system, they quite likely made it in the model, the specification, as well. And so, it’s expensive and of low value. That’s not a good combination.

WS (0:38:49): How’s that different from writing a reference implementation?

JH (0:38:52): So, a model is a reference implementation written without concern for performance, and that is one way of writing good properties. I mean, when I talk about using a model, what I mean really is a reference implementation where one need not worry about performance. So, that is one good way of coming up with properties. But coming up with properties is, I think, the key difficulty that people have run into. And that’s why I wrote a paper a couple of years ago called How to Specify It. So, let me just take the opportunity to plug that paper. It considers purely functional code (testing purely functional code, I mean), which is the simplest situation. But it goes through five different ideas for how to think about the properties that you might write of your code. And those ideas are also applicable to code that has effects, although they’re a little bit harder to apply. So, I’m hoping that that paper will work as a tutorial and enable more people to think of properties that their code ought to satisfy.
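[Editor’s note: one of the property styles the paper covers is the metamorphic property: relate two calls of the function under test without saying exactly what either returns. A toy instance of ours, using Data.List.insert on sorted lists:]

```haskell
import qualified Data.List as List
import Test.QuickCheck

-- Inserting two elements into a sorted list gives the same result in
-- either order; we never have to say what the result actually is.
prop_insertCommutes :: Int -> Int -> [Int] -> Bool
prop_insertCommutes x y xs =
  List.insert x (List.insert y sorted)
    == List.insert y (List.insert x sorted)
  where
    sorted = List.sort xs
```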

WS (0:40:00): Yeah. So, you’ve not only used QuickCheck in a kind of in vitro setting and in your teaching, but you’ve also started your own company, Quviq, which markets QuickCheck mostly for Erlang. So, how did that come about?

JH (0:40:18): Well, that was interesting. I was the leader of quite a large research project for the Swedish Strategic Research Foundation. And halfway through the project, they called us in to present our results to their panel. And on the panel, they had representatives of the Swedish great and the good, a lot of industry representatives. The project was about trying to combine verification methods; it was called Combining Verification Methods, or the Cover project. And the methods we were combining: we were trying to combine proofs using Agda with testing using QuickCheck. And we were trying to use the same properties for both, which was proving rather difficult, to be honest. I mean, I think a lot of great things came out of that project, among other things, Agda 2. But actually combining the verification methods, I’m not sure we got as far as we’d hoped with that. Still, we demoed what we were doing, and as part of that, we demoed QuickCheck. And Mike Williams from Ericsson saw that, and he said, “We want that.” Mike was one of the small group who designed Erlang back in the early days. And at this point, he was leading the development of the software for Ericsson’s 3G radio base stations (that’s how long ago this was), which was actually not being developed in Erlang for the most part, but they were using Erlang for testing it. Anyway, Mike saw that and wanted to apply it at Ericsson.

And then another person in the room was Jane Walerud, who is a very well-known entrepreneur here in Sweden. She was the Swedish Businesswoman of the Year, or Business Person of the Year, one year. So, she’s an extremely good entrepreneur, and she inspired the Erlang team to set up Bluetail. When Erlang was open-sourced, the key developers left and made a startup to apply the technology, and Jane was a big part of the inspiration behind that. So, she said, “You should start a company. Start a company and sell QuickCheck to Ericsson.” So, that was the initial inspiration. And in those days, SSF, the research foundation: I mean, they still encourage innovation, they encourage startups, but in those days, they were allowed to be a bit more flexible with the way they used money. So, they simply gave us one or two hundred thousand crowns, a small amount of capital, but it meant we could buy computers, we could travel a bit. So, that was a big help in getting things started. And then Mike gave us a contract to start working with Ericsson, and we went from there.

So, I was pushed into it by our funding agency. But as I say, I’m a great believer in functional programming (surprise, surprise), and I’d seen at that time how people were beginning to apply it in industry, and I wanted to be part of that. So, you could say that I was ready. I was ready to be pushed. So, when the opportunity came, I took it. Koen didn’t. I initially asked Koen to start the company with me. But I was a professor already, so I had kind of reached the end of the academic career ladder, you might say, and Koen hadn’t at that time. So, he wanted to, what is the word in English? Satsa på. He wanted to invest in his academic career. And I didn’t want to start the company by myself. But I knew Thomas Arts at the time. He was at the IT University, and he had very complementary skills to mine, and in particular knew a lot about Erlang. He’s a very good person to have in a startup. And so, I asked him if he would like to join, and he said yes. And so, we got started. There we are.

MPG (0:44:45): Right. 

WS (0:44:46): And how’d you like the two worlds? The academia and industry? How do they compare for you? 

JH (0:44:56): I have enjoyed having a foot in each camp. So, I usually say: when you have a good idea in academia, you get excited about it. You write the paper. You send the paper in to a conference. After a few months, you get the referees’ reports, and they are never as enthusiastic as you were. But with a bit of luck, they say, “Well, he could’ve done this, he could’ve done that. But I suppose we can take it.” Or, well, maybe you have to be rejected once and then send it somewhere else. At any rate, they always give you things to do. You have to do a bit of work to get the paper into the shape they want, even if they ask for things that sometimes you don’t think are improvements; you still have to do them. Then you go to the conference and you give the talk. And if you are lucky, some people come up to you afterwards and say, “That was really cool.” And somebody in the audience will get excited about the work, same as you did. But that’s going to be a year after you had the idea. When you do something cool for a customer, they’re happy straight away. And I have really enjoyed that.

So, I think the interaction with customers, when you can do great stuff for them and you get an immediate reward in terms of their feedback, that is really nice. So, that’s part of it. It is also, I think, that by putting your foot in the water, you learn things, and some of them come as a surprise. So, for example, the first time a company that we were trying to woo said to me, “We let our customers do the testing,” I was rather shocked. I think in academia, a lot of academic work aims to improve the quality of software. But many software developers think their quality is perfectly good enough, thank you very much. They have other problems. So, basically, when you’re selling stuff, people ask, “Will it save me money? If not, will it make me money?” Those are the only two questions that matter. And so, you have to think about your research in those terms. And so, in particular, quality is not something that necessarily has any value on its own, right? And that was a bit of a shock to discover.

WS (0:47:12): Did you find that it’s hard to get developers to adopt new tools? Oftentimes, it’s like going up to a developer and telling them, “Oh, I have this new editor, and it’s much better than whatever you’re using.” There’s always this reluctance; they’ll say, “Oh, I’m happy. I found a way that works for me, and I can ship software which is good enough. And I still get paid every month, so why do I have to learn something new?” What’s the pitch for QuickCheck here?

JH (0:47:42): Right. So, I think one must always be careful about lumping a lot of different people together. Developers are different. And we usually find that there will be some developers, probably a minority, who get really enthusiastic about QuickCheck. And so, as part of the process, we have to find those people, and then we have to work with them and help them make property-based testing a success in their domain. And then there’ll be other people who kind of learn it but find it kind of difficult. And there’ll be people who worry that, “Here’s a new technology. If all my developers have to learn it, that’s going to cost so much. Well, what’s the ROI going to be?” And it’s quite difficult to pin down what ROI you can expect.

WS (0:48:42): Yes, I can imagine.

JH (0:48:43): It also varies hugely depending on the kind of software that the company is developing. I think that’s one thing I’ve realized: there are many companies developing software where, actually, they don’t think quality is all that important, and that’s because it isn’t. You know, software where you can release a patch and the users will install it themselves and nobody complains. Fine, what’s the problem? But then there are other companies where quality really does matter a lot. And the secret of selling something like property-based testing is finding the domains where quality really matters and focusing all of your energies and efforts on helping people in those domains. They’re the ones who really want it. They’re the ones who have the most to gain from it.

WS (0:49:27): So, what domains have you found where you can really kind of get them to adopt this technology?

JH (0:49:34): So, the first domain was telecoms, and telecoms has always had very high quality demands on the service that is delivered. So, that was a very rewarding first domain to work with. But the one that we’re doing most work with at the moment is actually blockchain and smart contracts. So, when a program is going to be spending your money for you, you really would rather it did so correctly. And in particular, there are things that it should not do, like get stuck on the blockchain holding a lot of your capital in a way that makes it impossible to recover. That’s nice; that’s a general property that all smart contracts should have. So, having such general properties makes it easier to write tests, because you know that’s something you should be testing.

WS (0:50:31): Yes. Yeah, for sure.

MPG (0:50:34): Right. But do different languages have an effect here? Do you feel like if they’re already using Haskell, it’s easy to sell QuickCheck? Or is it okay to sell in Erlang or JavaScript or what have you?

JH (0:50:47): Yes. People like to use a property-based testing tool in the language that they are familiar with. So, we certainly have found that companies who are already using Erlang are more receptive to picking up the Quviq version of QuickCheck, which is in Erlang. One of the blockchain customers that we have at the moment uses Haskell, so we’re using the Haskell version of QuickCheck. So, yeah, I guess people like there to be no impedance mismatch between their testing tool and their code. In telecoms, that’s a little different, because telecom systems, by and large, are built in a mix of programming languages anyway. So, if your test tool is in yet another language, that isn’t such a big deal.

WS (0:51:35): So, this would be typically some low-level stuff in C, but then it’s directed by an Erlang program which moves things around, and you test the various bits and pieces. It must be complicated to set up or –

JH (0:51:49): No, it’s not just that. So, the 3G base stations, for example, were built using Rose RT, which generated C++ code from a UML model. So, the developers were actually working with UML. The UML model has embedded C++ code, but what you see in front of you is diagrams. I’ve watched people searching, “Where is my code? I know it’s in here somewhere. It’s in one of these boxes.” But it’s difficult to get an overview of the whole. So, a lot of the code was generated C++, and there was also an operator interface in Java. At the time, Java had very good GUI libraries, probably better than any other language at the time, so that’s what they used Java for. Now, we thought that the base station software was built with C++, since that was what the developers we were working with had told us. And then we found a bug that crashed the base station. When we looked in the log, we found Java exceptions. You could have knocked me down with a feather. But it turned out that there was a bug that broke one of the system invariants in the C++ part, and the first code to fall over was the operator interface that was trying to visualize the system state.

WS (0:53:14): Right. Yeah. That must be very hard to – I mean, I can imagine that QuickCheck’s very good at finding bugs or getting things to crash, or, you know, tripping up unintended behavior, but then how do you go from a Java exception trace to figuring out where the problem is?

JH (0:53:39): So, in that particular case, shrinking worked really well. It was very slow, because the bug crashed the base station, and so, to run the next test case, we had to reboot it. Rebooting the base station in those days took two minutes. So, every shrinking step took two minutes. We went for a long lunch. And when we came back, we had just a sequence of two commands that would bring the base station down every time. And it was actually to do with setting up a radio channel. So, 3G base stations have a radio channel to each phone that is in a call or is communicating, but there’s also a kind of global radio channel for sending signals to all the handsets in the cell. You have to send a message to set that up. You need to specify frequency, power, which antennas, all this kind of stuff. But, of course, you should only send one such message. If you send a second one, well then, the base station is supposed to reject it. And so it did, almost always. But if you gave just the wrong parameters, we could make the base station set up two of these things, where there should only ever be one. That’s what brought it down. But once we had the test case in front of us, then debugging it was not that hard.

WS (0:55:07): Yes, I can imagine. Very cool.

MPG (0:55:11): Right. But that’s also the – I mean, I guess a second question, right? So, you have the pure Haskell thing, but then when you move it over to other languages, you have to deal with all these side effects. And that’s a huge challenge in itself.

JH (0:55:27): Yes. So, when we first took QuickCheck and tried to apply it to Ericsson code, we immediately ran into that problem. And it was quite difficult to solve, actually. We generated test cases that were programs that would perform invocations of the API under test. And they were quite complicated to generate and quite complicated to assess the results of. So, the test code that we wrote for that very first application, it was really specialized stuff. You would not be training normal developers to do that, certainly. Luckily, we were able to abstract ideas from that, and we built our first state machine modeling library. And state machine models are something that we use heavily in QuickCheck for testing imperative code, but they’re also something that has been studied in general in the testing community, the testing literature. So, we have a particular version of those models that plays well with property-based testing. And having come up with that, we’ve been able to use variations on that idea.

So, we have a library that can be used out of the box with most imperative software. And then there can be particularly challenging examples. Like, for example, we did some testing of Dropbox where things are happening in the background. The Dropbox daemon is doing stuff, and modeling that is a little bit more complicated. But we were once again able to use variations on the same state machine model idea.
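[Editor’s note: a hand-rolled miniature of the state-machine idea, ours and far simpler than the Quviq library: generate a random command sequence, run it against the real, stateful system and a pure model in lockstep, and check every observed result against the model.]

```haskell
import Data.IORef
import Test.QuickCheck
import Test.QuickCheck.Monadic

data Cmd = Incr | Reset | Get
  deriving Show

instance Arbitrary Cmd where
  arbitrary = elements [Incr, Reset, Get]

-- The model: each command yields an expected result and a new state.
step :: Int -> Cmd -> (Int, Int)
step n Incr  = (n + 1, n + 1)
step _ Reset = (0, 0)
step n Get   = (n, n)

-- The "real" system under test: a mutable counter.
exec :: IORef Int -> Cmd -> IO Int
exec ref Incr  = modifyIORef ref (+ 1) >> readIORef ref
exec ref Reset = writeIORef ref 0 >> return 0
exec ref Get   = readIORef ref

-- Run a random command sequence against system and model in lockstep;
-- a failing sequence shrinks to a minimal counterexample automatically.
prop_counter :: [Cmd] -> Property
prop_counter cmds = monadicIO $ do
  ref <- run (newIORef 0)
  let go _ []       = return ()
      go n (c : cs) = do
        actual <- run (exec ref c)
        let (expected, n') = step n c
        assert (actual == expected)
        go n' cs
  go 0 cmds
```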

MPG (0:57:07): Right. I guess the final question here is: you have feet in both worlds, and in academia, you want to explain everything you’re doing, but as a business, you don’t want to give out your secrets, right? So, how do you balance that?

JH (0:57:29): Well, I think you need to think about what is going to be public and what is not. And it does demand being a lot more careful about what you say. And so, sometimes you can be sitting and chatting to somebody, and then you have to say, “I can’t tell you that, I’m afraid.” That’s a novel experience as an academic. But I don’t think there’s any getting away from it. I think a company that just talked freely about all its IP would not be long for this world. So, yeah, that’s a fact of life.

MPG (0:58:05): Right. So, now we talked about how Haskell came to be and what you’re doing now, but where do you see the future of Haskell going? Are we going to have better type errors, or what’s going to happen next?

JH (0:58:21): Predictions are hard to make, especially about the future, and I think that’s very much the case with the likes of Haskell. Clearly, Haskell has built up a lot of momentum now. That’s great. What I want to see is, I don’t want to see it becoming too complicated. I don’t want to see it becoming too complicated for beginners to pick up and for the community to grow. I think we have been very successful in having a very friendly community that people have found it easy to join. And so, I hope that that will continue.

But when I got interested in functional programming, I remember being very inspired by people talking about the software crisis, saying there is a crisis in the cost of software: we need an order-of-magnitude reduction in the cost of software, and functional programming can deliver that. That was a very inspiring message. And I think, if I look over what’s happened during my career, it’s true. Functional programming has delivered an order of magnitude, certainly, compared to languages like Fortran and Pascal that we were using back in the days when the prediction was made. Where is the next order of magnitude coming from? I think that’s the important question here. And I’m not so clear what the answer to that may be. Maybe AI programming will turn out to work much better than it has done for me. But I think that’s what we need to be thinking about: how to get another order of magnitude improvement in software productivity.

WS (1:00:00): Yeah. Is it even possible? I mean, that’s another question, right? I’m amazed by how I’ve seen Haskell change in the last decade. And it does make you wonder, which is why it’s a hard question to answer, maybe. But an order of magnitude is huge, right? 

JH (1:00:20): Yes, it is.

WS (1:00:22): It’s not easy to achieve.

JH (1:00:25): But it’s okay if it’s achieved over 30 or 40 years.

WS (1:00:27): That’s true. Yes. 

JH (1:00:30): But the question is, how could we make Haskell programmers twice as productive? You just have to do that three or four times. But doing it once, that’s already thought-provoking and challenging.

WS (1:00:46): Who knows? Maybe AI.

JH (1:00:49): I mean, Verse seems very interesting, where you add logic programming elements to a language that in some ways is rather like Haskell. I mean, Tim Sweeney is very visionary about the potential of functional logic programming. Maybe there, there’s productivity to be gained. I don’t know. I think the jury is still out on that one, but it’s certainly something very interesting that a lot of people want to follow.

WS (1:01:17): Yes, for sure. Okay, thanks very much, John, for your time.

JH (1:01:22): Okay. 

MPG (1:01:22): Yeah. Thanks so much.

WS (1:01:23): And great to have you on.

JH (1:01:25): Cheers.

Narrator (1:01:31): The Haskell Interlude Podcast is a project of the Haskell Foundation, and it is made possible by the generous support of our sponsors, especially the Monad-level sponsors: GitHub, Input Output, Juspay, and Meta.
