53 – Garrett Morris

Recorded 2024-05-21. Published 2024-07-18.

In this episode, Garrett Morris talks with Wouter Swierstra and Niki Vazou about his work on Haskell’s type classes, how to fail successfully, and how to construct a set of ponies.

Transcript

This transcript may contain mistakes.

Wouter Swierstra (0:00:15): Welcome to the next episode of the Haskell Interlude. I’m Wouter Swierstra, and my co-host today is Niki.

Niki Vazou (0:00:22): Hello.

WS (0:00:23): We’re joined by Garrett Morris, who will tell us a little bit about his work on Haskell’s type classes, what it means to fail successfully, and how to construct a set of ponies. 

Okay. Welcome, Garrett. The first question usually for all of our guests is, how did you get into Haskell?

Garrett Morris (0:00:43): Yeah. So, it’s a bit of a story. So, my first programming course at college was taught in Scheme. Like a fair number of the other students my year, I actually knew a certain amount of programming already. I’d done the AP exam. And so we started in Scheme, I think, in part just to backfoot all of the students who thought that they knew what they were doing, which was very effective actually. I think it was a great teaching technique. I don’t know that it works anymore, but I actually really appreciated it. It was also a mind-blowing experience for me. I had never done functional programming before. 

By the end of the quarter, I was completely convinced. But in the big project assignment we’d done, which was this Scheme interpreter in Scheme, the metacircular evaluator, there were a bunch of bugs that I’d had to sort out that basically had to do with my not fully interpreting parts of the input and converting them to AST nodes. And it felt like the sort of thing that even what I knew about type systems from Java and Pascal would have been able to address. And so, I asked the instructor. I said, “Well, this functional programming is amazing, and it’s blown my mind and all that sort of thing, but is there something like Scheme but with a little bit of a type system?” And the guy who taught the course was a geometer. And so, he started telling me about Standard ML and OCaml a little bit. And there was an old guy walking by in the hall, and the old guy overheard this conversation, and he said, “Scott, why are you wasting this young man’s time with these children’s toys?” And he said, “Son, go learn Haskell.” And that turned out to be Doug McIlroy, who was adjunct faculty, had spent most of his career at Bell Labs, and was also a bit of a Haskell aficionado.

And so, at that point, my freshman fall, I went back to my dorm room, and I found the Gentle Introduction to Haskell, which I think fails on all three of those words, and I tried to learn Haskell. It didn’t click immediately, but by the time I got to my senior year, I looped back around to it. I had an opportunity to do an undergrad thesis with Doug, looking at ideas on partial evaluation, and to do a bunch of Haskell programming. And so, this would have been 2004, 2005. I remember reading about HList and reading some of these papers. I think there’s a paper about using functional dependencies to do a tagless interpreter. That was pretty mind-blowing to me. And that pointed me to functional dependencies. I read Mark Jones’ Functional Dependencies paper, and I was like, “Well, this is something actually that seems incredibly cool.” I found it a very well-written paper, a very pleasant paper. And so, it took a couple of years to pay that off, but that was my introduction to the Haskell research community, and it motivated me to go try to work with Mark at Portland State.

WS (0:03:17): And where did you do your undergraduate? 

GM (0:03:19): Dartmouth College in New Hampshire.

WS (0:03:20):  Oh, okay, right. And Doug McIlroy is famous for various things, but one of them is – I’m going to get this wrong. Is it Unix diff, or is it –

GM (0:03:32): I believe so. So, I think the story is that he managed the group that Unix came from or something at Bell Labs.

WS (0:03:38): Okay. Yeah. 

GM (0:03:40): So, yeah, he wrote a bunch of Unix utilities. He wrote a little program that generates pessimal input for quicksort algorithms. I remember that. 

WS (0:03:48): Okay.

NV (0:03:49): Wait, he wrote the Unix utilities in Haskell, no?

GM (0:03:52): No, no, no, no, no. This was the ’70s. I mean, Doug was retired essentially by the time that I knew him. But yeah, he was also very interested in Haskell, and he posts on the Haskell mailing list once in a blue moon or something. He still writes little utilities in Haskell, little demonstration programs.

WS (0:04:09): And he’s written one or two functional programs as well about – I’m blanking here. Power Series. Power Series, Power Serious. Yeah, that’s, I think, one of the papers that I enjoyed reading.

NV (0:04:20): And so, you found this Mark Jones paper, and you just approached him and told him, “Hi, I want to do a PhD with you”?

GM (0:04:28): No, I did it even worse than that. When people ask me how to get into graduate school or what to do, I say basically, “Well, whatever I did, don’t do that.” I didn’t apply to work with Mark when I first applied to grad school. So, at the time, he had been at the Oregon Graduate Institute, and that was ceasing to exist. And all of the OGI faculty except him had already moved on, most of them to Portland State, and he was still listed at OGI. And I think it turned out this was for immigration reasons. So, as a senior, I looked at this. I thought, ‘Okay, I don’t understand what’s going on here, and I’m not going to email someone to find out. I’m just going to ignore it and do something else.’ So, I applied to other places for graduate school, and I was resoundingly rejected. And I went off and worked in industry for a couple of years, which I didn’t get a lot out of, and my employers didn’t get a lot out of either. Then I decided I really had missed the boat on going to graduate school. And by that time, Mark had moved to Portland State.

So, again, I applied to graduate school at a bunch of places, and most of them politely declined, but PSU accepted me. So then I moved to Portland, and I turned up on campus. And at this point, I had basically had no interaction with Mark. I had done none of the things you’re supposed to do. I hadn’t written to him. I basically walked into his office one day and I said, “Well, you weren’t set up to be my initial advisor, but would you like a student anyway?” And at the time, they were just starting a big project on trying to do systems-level programming in Haskell, or in a functional language. And so, Mark had an opportunity for a student, and so he hired me. But, as I said, I didn’t do any of the things that you’re supposed to do, any of the things that I recommend people do. And I think it’s dumb luck in a way that I arrived there at the same time that Mark was accepting students, and there was a project that was vaguely suited to the kind of thing that I wanted to do anyway.

WS (0:06:16): So, is this the Haskell operating system stuff that Mark worked on, or –

GM (0:06:22): So, it’s supposed to be the follow-on to that. There was a project called House, I think, that Mark and Andrew Tolmach and people worked on. So, House was an operating system written in Haskell. There was a follow-on to that that Rebekah Leslie did.

So, House was, I think, basically a brand-new operating system designed from the ground up. Rebekah’s follow-on was based on the L4 kernel. So, it was a more typical microkernel design, but implemented in Haskell. And then based on that experience, there was this project that was supposed to design a language that would be like Haskell, but would be more suited to that kind of work. There were a lot of places where Rebekah found that maybe she was fighting against the language rather than using it to her advantage.

So, we wanted higher-order functions, and we wanted type constructors and monads and all of that sort of thing. But maybe we didn’t need laziness so much, for example. That felt like something that they had seen just add overhead to those really low-level programming tasks.

So, there was a project to build a front end, a back end, a semantic framework. There were some results that came out of the back-end effort. They did a verified garbage collector in 2010 or something. The stuff that came out of the front-end effort, I mean, I guess, was instance chains and such, but it didn’t turn out to be very systems-programming specific, really. Mark did end up writing a kernel, but maybe a lot of the language innovation that we did wasn’t particularly systems-related.

NV (0:07:58): So, this language, I mean, sounds like what Rust started with to me, no?

GM (0:08:05): Well, that’s a little bit ironic in hindsight, obviously. It’s like maybe we missed an opportunity. I think there are a number of languages in this area, though. The NICTA folks, who aren’t called NICTA folks anymore, I believe, Liam O’Connor in particular has a language, and I’m blanking on the name, which is based on the approach they were taking for their kernel, but a little higher level. That’s maybe also in the vein of what we were trying to do.

WS (0:08:29): Is it Cogent or –

GM (0:08:31): Cogent, yes, thank you. So, Liam’s language, Cogent, is actually, I think, quite close to what the original idea for Habit was, maybe from a different perspective. So, yeah, I think it’s an idea that’s still interesting. As I said, we didn’t end up delivering anything that was maybe super usable for other people.

WS (0:08:48): So, what did you work on for your PhD? Because you had some success. You’re sounding very modest.

GM (0:08:56): So, I would not have described it this way at the time. But the funny thing to me is I think my whole career, I’ve basically, in one form or another, been working on extensibility and modularity. When I got to my PhD, I had discovered data types a la carte. During my brief stint in industry, I’d done some Haskell programming. I worked for a startup that tried to do some stuff in Haskell, and we had problems of the sort that data types a la carte addresses. And so, I thought data types a la carte was very cool, and I thought it was really interesting, but there were a couple of places where it seemed like the ideas of data types a la carte were hamstrung by the language design.

NV (0:09:31): Do you want to say what is data types a la carte?

GM (0:09:34): Yeah, data types a la carte is, I guess you could say, a paper that describes a solution to encoding extensible variants in Haskell, or expressing extensible variants in Haskell. And so, it has an encoding of a coproduct operator over functors, type constructors of a certain shape, and describes a generic injection function. So, if you have a type that fits somewhere into a coproduct, you can inject it into the coproduct. And then it describes a couple of ways that you can use these coproducts, again in a generic fashion, by describing folds over them and using type classes to eliminate them.
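For readers who haven’t seen the paper, here is a condensed sketch of that encoding, close to the paper’s own code, in GHC Haskell. The names are the usual ones from the paper; the overlap pragmas are one way to make a modern GHC accept the original overlapping instances.

```haskell
{-# LANGUAGE TypeOperators, MultiParamTypeClasses, FlexibleInstances,
             FlexibleContexts, DeriveFunctor #-}

-- Coproduct of two signature functors
data (f :+: g) e = Inl (f e) | Inr (g e) deriving Functor
infixr 6 :+:

-- Tying the recursive knot
newtype Fix f = In (f (Fix f))

-- Generic fold: algebras (often assembled via type classes) eliminate coproducts
foldFix :: Functor f => (f a -> a) -> Fix f -> a
foldFix alg (In t) = alg (fmap (foldFix alg) t)

-- Membership: sub occurs somewhere in the coproduct sup
class (Functor sub, Functor sup) => sub :<: sup where
  inj :: sub a -> sup a

instance Functor f => f :<: f where
  inj = id
instance {-# OVERLAPPING #-} (Functor f, Functor g) => f :<: (f :+: g) where
  inj = Inl
instance {-# OVERLAPPABLE #-} (Functor f, Functor g, f :<: h) => f :<: (g :+: h) where
  inj = Inr . inj

-- Signature pieces and overloaded smart constructors
newtype Val e = Val Int  deriving Functor
data    Add e = Add e e  deriving Functor

-- val and add build terms of any coproduct type containing Val or Add
val :: (Val :<: f) => Int -> Fix f
val n = In (inj (Val n))

add :: (Add :<: f) => Fix f -> Fix f -> Fix f
add x y = In (inj (Add x y))
```

Note that inj only searches for a single functor like Val; it doesn’t let you inject one coproduct into a larger one, which is the first limitation discussed below.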

So, I think it’s very pretty. It’s something that I thought was extremely cool, but it was all implemented using, or mostly implemented relying on, this feature called overlapping instances. Overlapping instances are one of these things that seem very obvious until you start using them, and then it becomes apparent that they’re actually fragile and inflexible. And so, there were a bunch of places where I thought the idea of data types a la carte was really held back by the fact that the only tool they had to implement it was overlapping instances. And so, it seemed like there was an opportunity there to think about the way the type class system worked and say, “Well, can we come up with something that’s better than overlapping instances that makes it more possible to do this kind of type-level programming, the kind of thing that was in data types a la carte?”

And so, of the big things that were motivating for me, the first was that data types a la carte was limited in the way that it had this very cool overloaded injection function that you could use to build smart constructors. So, for example, if you’re describing an expression data type, you could say, “Here’s my constant constructor.” And rather than just having to say, “Oh, there’s one expression data type and the constant is a constructor of that,” you could say, “Well, maybe there are many expression data types with different collections of things in them. But so long as constant is one of the cases of that data type, then this constructor will inject into that data type.” But this was limited to only being the base case. You couldn’t necessarily say, “Well, here’s a smaller expression type that’s got constants and addition and multiplication, and here’s a much larger expression type that’s got constants and addition and multiplication and lambda calculus and so forth. And I want to inject the smaller one into the bigger one.” So, that seemed like an awkward limitation.

And the other thing was that you didn’t really get a case statement. What you got instead were these folds. And I know that we ought to think about functional programming in terms of folds, but I’m very simple. I like case statements. And again, the interesting challenge with the case statement seems to be, how do you account for what happens when you get rid of one case? So, the one question is, well, how do I inject a smaller chunk into a bigger chunk, and how do I generalize that? And this seemed almost like a parallel operation: I want to split a bigger type into smaller types.
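The closest the encoding comes to a case statement is to pair inj with a partial projection, as later developments of data types a la carte do. A sketch, extending the :<: class from the snippet above:

```haskell
-- The membership class again, now with a partial projection prj,
-- which tries to peel off one case of the coproduct.
class (Functor sub, Functor sup) => sub :<: sup where
  inj :: sub a -> sup a
  prj :: sup a -> Maybe (sub a)

instance Functor f => f :<: f where
  inj = id
  prj = Just
instance {-# OVERLAPPING #-} (Functor f, Functor g) => f :<: (f :+: g) where
  inj = Inl
  prj (Inl x) = Just x
  prj _       = Nothing
instance {-# OVERLAPPABLE #-} (Functor f, Functor g, f :<: h) => f :<: (g :+: h) where
  inj = Inr . inj
  prj (Inr y) = prj y
  prj _       = Nothing
```

prj handles one case at a time, but it gives you no type for “all the remaining cases,” which is exactly the splitting operation being asked for here.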

So, those operations are things that motivated my work on what we ended up calling instance chains, which is an attempt to rethink how to do programming with type classes.

NV (0:12:27): Instance change.

GM (0:12:29): Instance chains.

NV (0:12:31): Okay.

GM (0:12:31): Like chains of instances. So, with that motivating example, we tried to do a better job of type class programming. I ended up thinking it’s an interesting idea. I later went back and managed to translate the version that I had in my PhD into type families, which takes a little bit of effort and gets close to something that I would have been happy with, a slightly more usable, more general version of data types a la carte in Haskell.

NV (0:12:58): So, now you can define them without overlapping instances.

GM (0:13:02): Oh, well, of course, but at this point, everything went through the medium of type families in that version. So, in that sense, as somebody who, to some degree, still believes that I had a good idea back then, I would have argued that you were trading one inconvenience for another. And maybe wouldn’t it be great if we could just do it directly?

NV (0:13:21): Okay. So, these instance chains, are they an alternative way to do overlapping instances?

GM (0:13:29): So, I’ll give you the answer with a little bit of hindsight. That was where it started out. It’s that there were these applications of overlapping instances, like HList, like data types a la carte. There were just places where they seemed limiting. There wasn’t very much you could do. There were places where the implementations disagreed. So, like, the instances in the data types a la carte paper actually weren’t accepted by Hugs. You had to drop one of them because Hugs had a stricter idea of how to understand overlapping instances. We started out by saying, “Well, wouldn’t it be nice if we just had a better way to do almost exactly what overlapping instances do?”

One thing that became apparent early on was that we needed an idea of when instances don’t hold if we wanted this mechanism to be coherent. So, let me try to explain this. So, I actually had an example that I thought was very cool. I mean, it just didn’t work, where I said, “Okay, let’s think about sets.” And so basically, I wanted to define a type function, but we were thinking about it in terms of functional dependencies. And I said, “Okay. So, if you have a type and that type supports ordering, then a set of that type ought to be a binary tree. And if you have a type and that type supports equality, then a set of that type ought to be a list. Otherwise, it’s nothing.” And you could imagine having more things on top of this, like if it’s hashable, then you get some sort of hash map, and so forth.

And I went into Mark’s office with this idea at some point, and I said, “Wouldn’t this be a great application of this?” We already had the idea that you could write these things tied together in a single chain. That’s why we called it that. We’d have this chain of instances: Ord t implies Set t = Tree t, where… and here’s your implementation; else Eq t implies Set t = [t], where blah, blah. You have it. Okay. And Mark said, “Well, that’s very nice. But now think about what happens when you do this in a modular setting.”

So, in this module over here, maybe I have a set of ponies, and the only thing I have around is an equality instance for ponies. And so, now my set of ponies is a list of ponies. And now I import this into another module, but in this other module, I’ve actually figured out how to order ponies. Like maybe I look at how long their tails are. Now in this third module, a set of ponies ought to be a tree of ponies, except now I’m importing from a module in which a set of ponies was a list of ponies, and lists of ponies and trees of ponies aren’t the same. And I said, “Huh, that seems like a bad thing.” 

And so, what we ended up coming up with was to say, “Well, the thing is, really, in that first module, you can’t commit to the fact that a set of ponies is a list of ponies. Because while you can prove that you have an equality instance for ponies, and you can’t find an ordering instance for ponies, that doesn’t mean there isn’t one, more broadly, because it doesn’t mean there can’t be one later.” In essence, in Haskell 98, I think the point is, there are really only two statements you can make about a type class constraint. Either it holds, or it doesn’t hold yet. But you can never say it’s not going to hold. So, we said, “Well, okay. So then maybe what we need is a way to say that some instances don’t hold.” We need to finish the story. We need to be able to say that this instance fails, as we ended up calling it. And now you can commit to an instance not being there, and that lets you do this. You can say, “Okay, now look, ponies are not orderable. I’m sorry. Some ponies and other ponies are just incomparable.” And once you make that commitment that there’s never going to be an ordering instance for ponies, then you can say, “Well, great. Okay, a set of ponies is a list of ponies.” And for a long time, I thought that instance chains basically consisted of two things. One was this introduction of excluded instances, or failing instances, and the other was the ability to define overlapping-instance-style sequences of instances, where you pick the one that matched and whose hypotheses were not contradicted.
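Written out, the pony example reads roughly like this. This is hypothetical instance-chain notation, paraphrasing the style of the instance chains work; it is not valid GHC Haskell:

```
-- Hypothetical instance-chain notation (paraphrased); not GHC Haskell.
class Set t s | t -> s

instance Ord t => Set t (Tree t) where ...  -- ordered: use a search tree
else     Eq  t => Set t [t]      where ...  -- merely comparable: use a list

-- The commitment that makes it safe to pick the list arm for ponies:
instance Ord Pony fails
```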

What I later realized, I think, is that that ordering thing was actually – I mean, it’s a good piece of language design, but it’s not essential at all. And in fact, if you break the chains up, the really essential part is the idea that you have this negation running around, and that you consider the hypotheses of an instance. When you say, “Does this instance apply,” you say, “Well, can I prove its context, its assumptions?” And basically, the chains themselves were just a convenient way to write sequences of negations. In my example, there’s an instance that says, “Well, if you know that T is ordered, then a set of T is a tree.” And the next one says, “If you know that T is in Eq and you know that Ord T fails, then a set of T is a list of T.” And those instances you can now think of as independent. You don’t need to establish any ordering among them. You don’t need to. I mean, it might be more efficient to check them in a particular order.

NV (0:18:05): So, is this still active? Can I write that some instances do not exist?

GM (0:18:12): It’s active in the sense that you can dig up the GitHub repo that it lives in and try to build it. My usual joke is to say that it works great if you have my laptop. At the time, I had a very – I don’t know what the right word is, but an extreme view of this, where I said, “Well, you know what? Overlapping instances are bad. We shouldn’t have them. We should just slash all of that out and try again from ground zero.” And I did at least recognize that was going to be difficult to convince the broader Haskell community of, which is why I was happy to do it in the context of a new language. I still think it would be very hard to convince the broader Haskell community, or at least the GHC implementers, that the power-to-weight ratio of these ideas made sense, because it does require that you rethink the way your type class solver works. The Haskell type class solver can be fairly greedy. Basically, as soon as you find the most specific instance that matches a predicate, you can just commit to that. You can say, “Okay, this is the instance. If this instance doesn’t work, we’re dead in the water.” Instance chains were really about saying, “No, frankly, I want to do Prolog programming in my type class resolution.” And that means that your type class solver has to resemble a Prolog implementation to a non-trivial degree. And I expect that you would have trouble getting that commit to land in GHC. I don’t necessarily think that that’s unwarranted skepticism. I still think it’s a very pretty idea, and I think, starting from scratch, maybe there’s still a pretty good argument for it. But it would be a lot of work, and I don’t know how many people would really see the payoff. I want to say PureScript implemented a version of instance chains, although they didn’t have fails. They just used them to do a more constrained version of overlapping instances.

WS (0:20:00): So, do you know any other applications of this failure stuff?

GM (0:20:04): So, functional dependencies also introduce failure in a way, although it’s unexpected. So, I said in Haskell 98, if you have a certain predicate, you can either say this predicate holds now, or this predicate could hold eventually. If I have functional dependencies running around, then – suppose I have a class C with two parameters, T and U, and T determines U. So, now I have C Int Int floating around; maybe I know how to prove that. And you come along and say, “Well, how do you feel about C Int Bool?” Well, now I can say not only is there not an instance of this, but there can never be an instance of this, because it would violate the functional dependency.
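In GHC terms, the observation looks like this (a minimal sketch; the class C is just his example):

```haskell
{-# LANGUAGE MultiParamTypeClasses, FunctionalDependencies #-}

class C t u | t -> u   -- the first parameter determines the second
instance C Int Int

-- Given the instance above, "C Int Bool" is not merely unproven: no
-- future module can add it, because any such instance would violate
-- the functional dependency. The fundep turns "doesn't hold yet"
-- into "can never hold", which is exactly the negative information
-- that "fails" makes explicit.
```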

Mark and I went back and forth, and he said, “Oh, well, maybe we can just get rid of all of your ideas and replace them with an invisible, functionally dependent final parameter on all classes: is this a positive or a negative instance?” And I said to Mark, “Well, maybe we just need a smarter solver, and we can get rid of all the functional dependencies and just represent them as little pairs, like C Int Int holds else C Int u fails.” So, those ideas seemed, in a way, to be linked.

We imagined that there might be cases where you’d say, for example – I want to make an assertion like, you can never have an equality instance for functions. I mean, you could, but it’s probably computationally infeasible. And so, rather than leaving these as unsolved constraints, so that eventually the programmer sees a message down the road that says, “I wasn’t able to solve this constraint,” you would reject it right up front. And there was an idea from some folks at Utrecht, I want to say, where they did this just from a usability perspective. They were almost like compiler directives, where you just said to the compiler, “Well, if the programmer ever generates an instance or a predicate that looks like this, just tell them right away it’s not going to work out.” So, you could see it from that perspective. But the main reason that I was interested in it was really to enable more interesting instance selection, but in a modular fashion, so that you could do this Prolog-style backtracking instance solving without negation as failure, because that’s unmodular, but with an idea of negation that would allow you to get some interesting backtracking behavior.

WS (0:22:11): That makes sense. So, how do you feel about Haskell type classes more generally because you spent a lot of time working on them? But I think they’ve gone out of vogue a little bit nowadays, like write type families instead, because they’re much better. 

GM (0:22:26): In a way, I think that type classes and type families at some level have the same kind of idea at their core, which is to say, they’re open families of definitions. So, maybe you could think about a type class as being like a term family. Like, equality, (==), is an open family of terms. It’s a type-indexed family of terms. And as the programmer continues to build their program, they get to add new instances of that. And then an open type family is obviously an open, type-indexed family of types. And I would argue that, to me, this is, in some sense, one of the defining features of Haskell.

One of the things that Haskell still does that a lot of languages du jour don’t, or at least do very differently, is that Haskell has this particular idea of how you do modular programming. And that’s interesting. I think that’s still interesting, whether you’re talking about families of terms or families of types. That parallel actually is interesting in both directions. So, on the one hand, you could say, “Well, we have something like closed type families,” but there’s no corresponding idea. Where are the closed term families? Where are the closed type classes? So, this is something that Richard and I proposed in a paper in 2017. You can see it as a restriction of instance chains. Instance chains were supposed to give you backtracking in a modular fashion, but maybe you’re happy just having it in a closed type family style.
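The parallel in GHC terms, as a small sketch:

```haskell
{-# LANGUAGE TypeFamilies, DataKinds #-}

-- An open type family: new equations can be added in any module,
-- just as new instances can be added to a type class.
type family Elem c
type instance Elem [a] = a

-- A closed type family: all equations are given here, tried top to
-- bottom. The term-level analogue, a closed type class whose
-- instances are tried in order, is what the 2017 paper proposes,
-- and it can be seen as a restriction of instance chains.
type family IsUnit a :: Bool where
  IsUnit () = 'True
  IsUnit a  = 'False
```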

And I think the parallel the other way is to say, “Well, with type classes, or with term families in this sense, there’s a nice type-level sigil.” So, if I’m going to use the (==) term family, then I have to, in my type, ensure that the term family is defined where I’m going to use it. But the same thing doesn’t happen for type families. If I have a type family and I use that type family somewhere, there is no type-level sigil that says, “Well, this type family actually has to be meaningful at the types that I use it at.” And in this paper in 2017, Constrained Type Families, Richard Eisenberg and I argued that that was actually the source of a number of these irritating restrictions on the uses of type families.
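A tiny illustration of the missing sigil (my example, not taken from the paper):

```haskell
{-# LANGUAGE TypeFamilies #-}

type family F a
type instance F Int = Bool

-- Nothing stops us from mentioning F at an index where it is
-- undefined: this signature is accepted, and "F Char" just sits
-- there as a stuck type. Compare (==): using it forces an "Eq a"
-- constraint into the type, but no constraint records that F must
-- actually be defined at Char.
stuck :: F Char -> F Char
stuck x = x
```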

So, from that perspective, I would say, I think type classes are an interesting way to do generic programming. I think type families are an interesting way to do generic programming. And I actually think there’s still mileage left in asking how these features are similar and what we can learn from comparing them and what we can do to make them more interesting to the Haskell programmer.

There’s a lot of heat over functional dependencies versus type families, and there are some papers that I think were misunderstood, but I don’t know that that’s super relevant going forward so much as the underlying ideas, like, for example, what is the value of functional dependencies? In some sense, functional dependencies had that type-level sigil that says, “I need this type function to be defined,” because it was tied in together with the class. So, maybe the crux of that idea is still valuable, and it’s equally applicable whether you want to view these things as a property of a class or whether you want to view them as an independent type family.

WS (0:25:21): Okay, so here’s a question. So, Elm is a language which is a lot like Haskell, and one of its defining characteristics is that it doesn’t have type classes. One of the reasons I think the Elm designers took that approach is that they said, “Well, one, it’s hard to come up with good error messages, and two, once you add type classes, you end up adding a gazillion extensions, saying, ‘Okay, do you want multi-parameter type classes or just single-parameter? And can there be higher-kinded variables there? And do you get overlapping instances or incoherent instances?’” The design space just gets so big.

So, suppose the Elm language designers came to you and said, “Okay, here’s a language, which is just like Haskell, but it needs a class system and there’s no legacy code and there’s no historical baggage or anything like that,” what would you come up with in that case?

GM (0:26:13): Let me try to dodge that question. 

WS (0:26:17): Okay, sure. 

GM (0:26:18): Because I completely agree with your starting point about the design space of type classes. It’s broken. And I mean, I wrote a very similar paragraph in a proposal recently, and I think I used a phrase like “programmers encounter unexpected interactions among these features.” And I think that’s absolutely true. I think one of the reasons that happens, and this is very much a rational reconstruction, I wasn’t there, is that type classes came packaged with a brilliant implementation technique. You simultaneously got both type classes and the dictionary-passing translation. And these ideas have been inseparable since Wadler and Blott, or even since Stefan Kaes’s work before that. And that’s led to a certain amount of confusion. Maybe type classes are really about implicits, and all of the rest of this is noise. Maybe type classes are really about modules, and all the rest of this is just a way to generate good module instances, and so forth, or vice versa. I mean, you can do the same thing and say, “Maybe modules are really type classes over new types,” or whatever it is.
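The dictionary-passing translation he mentions, in one picture. This is the standard textbook sketch, not GHC’s actual Core:

```haskell
-- class Eq a where (==) :: a -> a -> Bool
-- becomes an ordinary record of methods...
data EqD a = EqD { eq :: a -> a -> Bool }

-- ...an instance becomes a dictionary value...
eqInt :: EqD Int
eqInt = EqD (\x y -> x == y)

-- ...and a constraint "Eq a =>" becomes an extra function argument.
member :: EqD a -> a -> [a] -> Bool
member d x = any (eq d x)
```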

I never felt like – maybe this is just because I didn’t understand it, but I never felt like people really gave a semantic account of type classes on their own. It was always by reduction to something else. Like, type classes are really just implicits. And so then, when you start layering complexity on top of it—and functional dependencies are the best example of this, but overlapping instances, I think, are also a reasonable example—these things only get explained in terms of how they get translated. So, functional dependencies, it’s like, well, the dictionaries don’t change. So, what is a functional dependency if it doesn’t show up in the dictionary? And I would argue this is actually where functional dependencies don’t mesh with GHC: there needed to be something in the dictionary, but there wasn’t anything there. And so, it ended up being this ephemera of type inference that didn’t do what you wanted it to do. Overlapping instances, you can say, “Well, I just end up with dictionary values.” And now it’s a question of which dictionary I pick. It’s no longer visible in the core. It’s not visible in System FC. Our core understanding of these things is by translation to System F, and this feature vanishes.

And so, that’s part of what makes these things poorly specified and difficult to disentangle: there’s not been a lot of work that tried to give type classes a semantics directly, before explaining how they were compiled. And so, I would say maybe you should do that. Maybe you should explain these things. And if you did that, then maybe you would end up with a core language that I could propose to the developers of Elm, or whoever, and say, “Well, here’s the core language in which you can think about type classes.” And then, just as with any core language, you can say, “Well, what are the things that are actually motivated by my use case? Do I need type families? Do I need term families? Do I need whatever other features? Do I need something like closed type families? Do I need closed type classes where I can do this backtracking?” But in each of those cases, rather than trying to pick out the implementation that some other language used, you’d be picking out the core semantics.

WS (0:29:15): Right. So, I guess what I’m asking with my question is, rather than trying to come up – like, I understand, if you’re writing a grant proposal, you have to come up with a grand vision saying we’re going to unify all of these things, and then build a grand framework for all type class extensions and functional dependencies and type families and all of this type-level magic, and come up with a unified system for that. Let’s take the opposite approach and say, okay, looking back at all the experiments that we’ve seen, whether it’s instance chains or multi-parameter type classes, can you identify or propose a subset of these features where you say, “These things play well together. This is a nice, coherent set of language extensions which gives you a well-behaved instance search, which lets you do 80 percent of the stuff that you want to do, and that’s good enough”?

GM (0:30:14): I mean, maybe the challenge for me is that I just find the many Olegs of type class wizardry were my introduction to Haskell. So, you’re saying like – so maybe the point is that 20 percent is where my personal interest lies.

WS (0:30:27): Sure. Yeah.

GM (0:30:28): When I’m teaching Haskell, I barely touch type classes at all. I acknowledge some of them on the way by. I think the monad class is important. And that’s honestly the only one. And that means I have to talk about functor and applicative, which I do. So, I like the functor class. I talk about applicative very grudgingly. The monad class is important in the way that it pays off higher-order abstraction and lets you say, “Okay, here’s an idea that’s hard to generalize. It’s hard to generalize in a language without type constructors and type classes, and it’s useful to have.” But you can do monads with Haskell 98 type classes. You can’t do them on a template library. So, at that point, you say, “Well, what features do you need?” I mean, I need constructor classes, to be very old school in my terminology. And that’s it.
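“Constructor classes” here just means classes whose parameter is a type constructor rather than a type, which is all the monad class needs, and which is already Haskell 98. A minimal sketch, with renamed methods to avoid clashing with the Prelude:

```haskell
-- m has kind * -> *: the class ranges over type constructors.
class MyMonad m where
  ret  :: a -> m a
  bind :: m a -> (a -> m b) -> m b

instance MyMonad Maybe where
  ret = Just
  bind Nothing  _ = Nothing
  bind (Just x) k = k x

instance MyMonad [] where
  ret x     = [x]
  bind xs k = concatMap k xs
```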

NV (0:31:18):  So, all of these, isn’t the answer like whatever is in Haskell without any language extension?

WS (0:31:25): Haskell 98.

NV (0:31:27): Would you remove something maybe, or would you add something?

GM (0:31:31): Sounded like Wouter was saying something.

WS (0:31:33): No, no, no. I’m thinking, I mean, it depends. I agree with Garrett’s position, I guess. Like, what is it that you want to do? What do you want to use these type classes for? And do you have a language which already has an OCaml-style module system, where you can abstract over types and hide all kinds of stuff behind a nice API? And suppose you have that, well, what good are type classes then? Well, maybe you just need them for some inference or to do some type-directed instance search, and then, okay. But then I could use better programming for that. I’m sure I could come up with another solution to this. So, what’s the problem that you’re trying to solve? Maybe that’s the real counter-question to my open-ended, ill-specified Elm question.

GM (0:32:20): I mean, language levels in Racket are a phenomenal idea from this perspective. It’s just recognizing, like, you know what, we have a big pile of stuff, but actually different chunks of that pile are useful at different points. For me, I mean, I do think that type classes represent an interesting point in the question about how you design modular programs and how you do meta-programming. And so, from that perspective, to me, the interesting point is where you do say, “Okay, this is not just a way for me to do implicits. This is not just a way for me to explain plus.” It’s like macros. I mean, what’s the simplest macro system that you can use to do some basic things? Well, fine, but that’s not where macros are interesting. Macros are interesting when you say, “No, this is a way for me to understand how I write programs.” That’s the way I personally feel about type classes in the modularity space. They’re interesting as the defining facet of how I write programs, but there’s no accounting for taste.

WS (0:33:19): Oh, fair enough. So, you’ve also done a bunch of work on session types. How did you get into that?

GM (0:33:27): So, I was introduced to linear typing as an undergraduate, introduced to linear logic and linear types. And when I was looking for a job after I got my PhD, there was a postdoc position open at Edinburgh, working for Phil Wadler. And that seemed like an opportunity that it would have been very, very foolish to pass up. And they were interested in linear type systems, which was something I hadn’t worked on recently, but I had known from long before. And that was interesting to me. And the motivation for that was something about session types and concurrency, which I didn’t know anything about, but I figured, “Well, I get a chance to work with Phil, and I get a chance to think about linearity. So, why not?”

To be frank, I mean, I think my original interests didn’t change very much over the course of that project. I was interested in linearity, honestly, mostly from an overloading perspective. When you start talking about linear type systems, all of a sudden, you get – well, I mean, the whole point in some sense is that you get multiple kinds of pairs. You get multiple kinds of function arrow. You get the linear and the unrestricted function arrow. And yet, there are a bunch of things which you can actually talk about. For example, both kinds of pairing operation have a swap function, where you can take an A times B to a B times A. You can take an A with B to a B with A. Both kinds of function arrows have a composition operator.
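Some of this vocabulary is now expressible with GHC’s LinearTypes extension (GHC 9.0 or later). A small sketch of the two arrows and the duplicated combinators; note GHC has no primitive additive (“with”) pair, so that half of the story is omitted:

```haskell
{-# LANGUAGE LinearTypes #-}

-- The linear arrow (%1 ->) alongside the unrestricted arrow (->).
-- The ordinary pair plays the multiplicative ("times") role here:
swapTensor :: (a, b) %1 -> (b, a)
swapTensor (x, y) = (y, x)

-- Composition for linear arrows...
compose :: (b %1 -> c) %1 -> (a %1 -> b) %1 -> a %1 -> c
compose f g x = f (g x)

-- ...and the same shape again for unrestricted arrows. Having to
-- write these twice is the overloading problem being described.
compose' :: (b -> c) -> (a -> b) -> a -> c
compose' f g x = f (g x)
```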

So, that was the point where I think I actually became particularly motivated, as it seemed like there was an interesting overloading problem here. And again, when you start thinking about the higher-kinded setting, there are various ways you can talk about a monad in a linear setting, having to do with what assumptions you make about the continuation. Is this a monad in which the continuation is going to be linear, I get to use it once? Or is this a setting in which the continuation is going to be unrestricted, I get to use it multiple times? And so, maybe one of these makes a lot of sense for a state monad, or the state monad, in some sense, can be polymorphic in how you use the continuation.

Non-determinism, the list monad, as a way to realize non-determinism, is much less polymorphic. It really wants to be able to run the continuation multiple times. And so, those problems struck me as interesting. I still don’t necessarily understand concurrency, so that hasn’t changed.

Here’s a little bit of a sales pitch: I worked with Sam Lindley a lot in Edinburgh, and one of his students, Wenhao Tang, led this paper that was presented at POPL this year. It was not talking about monads; it was talking about algebraic effects. But in some sense, the conclusion is actually the same one that I just described, but reframed. It’s that when you’re thinking about the interaction of handlers and linearity, the key question is basically, how do you use the continuation? That paper tries to think about how you could do a type system suited to handlers and linearity, but trying to play with this question of, well, really, where’s the continuation, and what commitments are we making for how the continuation gets used?

NV (0:36:18): So, can you experiment with all these using Haskell’s linear types, or do you need something stronger?

GM (0:36:24): So, to be fair, I don’t know. I don’t know that stronger is the right word. I mean, it might just be a matter of saying different.

NV (0:36:31): Different.

GM (0:36:31): But I haven’t played enough with linear Haskell to give you a firm answer one way or the other.

NV (0:36:36): Yeah, because as far as I know, Haskell’s arrow can be parametric with respect to linearity, but I don’t know if you can encode all these continuations.

GM (0:36:48): It’d be an interesting exercise. I haven’t done it. 

NV (0:36:50): Yeah, yeah, yeah. So, do you have any active projects right now that you want to talk about?

GM (0:36:55): So, one reason that I talked about wouldn’t it be great to have a semantics of type classes is that that is one of the things that I’m working on. It’s a project that’s just started. But the other thing is continuing the story of extensible types, but thinking about them as a feature of the type system, rather than something that you would encode. So, I started writing about row types a couple of years ago. This was an accidental collaboration, but this was while I was in Edinburgh. I presented my encoding of data types a la carte, and I argued that the key idea of it was this question, when you’re talking about branching, that says, “Well, I need to be able to say I’ve handled this many constructors, and here are the constructors that are left.” So, you needed an idea of partitioning a data type.

There’s another guy at Edinburgh, James McKinna, who was working on views and bidirectional programming. And so, he actually had a very similar story, but in the case of records, where he needed to separate things out. When you projected out a view of some data type, and then, for example, you mutated that and wanted to push that mutation back up to the original, you needed an idea of the rest of the data type, like what’s left. And so, we realized we were talking about the same thing, but from two very different places and two very different applications. And so, we collaborated on a paper that proposed a different way to think about row typing, one that was based on a qualified type system with row concatenation, and then branching based on that.

A current student of mine, Alex Hubers, and I did a follow-on. So, the row type system that James McKinna and I developed, along with most of the row type systems in the literature, is very good at saying, “Okay, I have some cases that are specific and that I want to talk about, and then I have a bunch of other cases that are left parametric,” with the usual parametricity results about them.

The thing is that sometimes it seems to me like you really need a way to say a little bit more about the part that’s parametric. So, imagine you’re writing a modular syntax tree transformation. So, it’s like, okay, I want to identify all of the infix application nodes and transform those into prefix application nodes. So, that’s the core of my algorithm. And here I know exactly what my constructors are. I have E infix on the way in, and I have E app on the way out. I’m good. And then you say, “And the rest of my syntax tree, I don’t care about. I want to leave it unspecified.” So, maybe I’ve desugared tuples by now, maybe I haven’t, but it doesn’t matter. So, I want to leave all of that parametric, and it’s parametric on the way in, and it’s the same parameter on the way out, and I’m very happy. And then somebody comes along and says, “Well, isn’t your syntax tree recursive?” And I say, “Well, yes.” And they say, “So, that bit that you left parametric, don’t you need to recursively remove the infix constructors in there?” And I say, “Well, yes.” And they say, “Well, how do you recursively call your function on this row variable that you don’t know anything about?” And I say, “Oh, I just encoded it in Haskell, so I could work my way around that.”

And that was the jumping-off point for the work that Alex and I were doing. That was about saying, how can we do generic programming over things for which we don’t know the constructors? And so we had, I think, a couple of really nice examples in there, actually, that say, how can you – suppose I have an arbitrary variant type. So, I don’t know any of the constructors that live in it. But I’ll tell you that I have a record of functor implementations for each type that appears in that. So, I’m never going to commit to what the labels are, but I’m just going to tell you, “Here’s a variant over some set of labels. Here’s a record over the same set of labels. And now, can you do a functor? Can you build a functor instance for that variant?”
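A fixed-shape approximation of that “variant plus a record of handlers” picture can be written in GHC Haskell today. This is my sketch; the point of the actual row type system is that the list of cases stays abstract, which this encoding cannot do:

```haskell
{-# LANGUAGE DataKinds, GADTs, TypeOperators, KindSignatures, EmptyCase #-}

import Data.Kind (Type)

-- A variant over a (here: fully known) list of types
data Variant :: [Type] -> Type where
  Here  :: a -> Variant (a ': as)
  There :: Variant as -> Variant (a ': as)

-- A "record" holding one handler per case of the variant
data Cases :: [Type] -> Type -> Type where
  None :: Cases '[] t
  (:&) :: (a -> t) -> Cases as t -> Cases (a ': as) t
infixr 5 :&

-- Eliminating a variant with a matching record of handlers
elim :: Cases as t -> Variant as -> t
elim (f :& _)  (Here x)  = f x
elim (_ :& fs) (There v) = elim fs v
elim None      v         = case v of {}   -- Variant '[] is empty

-- e.g. elim ((\n -> n + 1) :& (\b -> if b then 1 else 0 :: Int) :& None)
--        :: Variant '[Int, Bool] -> Int
```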

WS (0:40:24): So, it’s a bit like true sums of products, but then open-ended in a way, right? 

GM (0:40:30): Exactly, exactly. So, I don’t know, this is me tooting my own horn a little bit, but I was very happy with parts of that paper. I thought there were examples that were really, really pleasant in that paper. There’s an early example where I basically say, well, suppose you give me an eliminator for a variant type, so a function out of a variant type. And I can basically, completely generically, give you a product of the individual cases, and go back. So you can observe that a function out of a variant over a row z into a type t is isomorphic to a product of individual functions, one for each entry in your row z. Then the final piece of the story – so, that actually doesn’t talk about recursion; it just sets it up. So, you can say what the functor instance is, but it doesn’t actually tell you how to do the recursion. And we’re now finishing off trying to do the recursion, the recursive step.
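At fixed arity, the isomorphism just described is the familiar one between functions out of a sum and tuples of functions; the paper’s contribution is doing this when the set of cases is an abstract row. A two-case sketch:

```haskell
-- A function out of a sum is the same data as one function per case.
toCases :: (Either a b -> t) -> (a -> t, b -> t)
toCases f = (f . Left, f . Right)

fromCases :: (a -> t, b -> t) -> Either a b -> t
fromCases (f, g) = either f g
```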

We’re also trying to identify a core language that’s suitable to express these ideas. So, our initial mechanization of all of this was in Agda, and it was basically a shallow embedding. I mean, we did not rely on all of Agda in the semantics, but we could have. And we also have an implementation of this sort of thing in GHC that mostly works, although it’s not ready for other people to kick the tires yet. It is driven entirely by unsafeCoerce. Like, really, really terrible applications of unsafeCoerce. I spent a while poking at GHC to observe that, in fact, I could treat a tuple of length four as a tuple of length two by just coercing it, things like that.

So, I don’t recommend that approach at all, but it did leave this question of, like, well, is there a nice core language where you could say, “Well, add one feature to System FC, let’s say”? That seems like it might be the limit. Can we add one feature to System FC that would enable us to just do extensible programming—programming with extensible data types—in Haskell? And so, we’re seeing if we can figure out what that feature is.

WS (0:42:25): So, to wrap things up, do you care to speculate here, as we hand you the keys to GHC and ask you to add one feature?

GM (0:42:36): I mean, I would be terrified. I think there’s actually value in making it difficult to extend the core language. It’s not difficult to extend GHC itself. But I think the reluctance that says we probably shouldn’t constantly change FC is probably a very rational idea. We’re thinking about natural-number-indexed data types. The idea is that we can reduce all of the complexity of user-friendly programming with extensible data types to a Dependent ML style. Basically, you have natural-number-indexed type families.

WS (0:43:12): Okay. Can you sketch the encoding that you have where you somehow count the number of constructors, or what’s the intuition there?

GM (0:43:21): So, at the surface level, it is a qualified type system. There are echoes of the TRex system in Hugs, but you can think of it as a unary generalization of TRex. And maybe the key insight in TRex is to say, “Well, if I give you an arbitrary – let’s call it a variant, there are some specific constructors that you need to know about in that variant, or else you wouldn’t need it. Or else you’re using it completely polymorphically, and it doesn’t matter.”

So, really, the only thing I need to tell you is where to find the constructors that you want. And so, in TRex, there was this elegant shell game: there were these constraints that seemed like they were talking about how to extend a row, but what they were really doing was telling you, in the extended row, where did you put the case that you wanted?

WS (0:44:12): Right. There’s a single lacks constraint, which seemed like it was saying – it was used to postulate the absence of a label, but it was really used to tell you the label will be at this position so that you can quickly access it, right?

GM (0:44:30): Right. And so, in some sense, we’re doing a unary variation on that. So, when we tell you these two rows fit together to build this row, the evidence for that is basically to tell you where each field in those rows went. And so, that’s what reduces the whole thing to natural numbers, without having to do something silly like a priori committing to what the index for every label is, or something like that, where really all we need to tell you is where the labels that you think you want land in the data structure that I’ve given you. And so, yeah, those can all be natural numbers, and it’s all quite pleasant. And that is essentially what we did for our Agda semantics. We translated this all down to just products over Fin n. So then, maybe it feels like that’s the essential feature that we need to extract and describe for a core account of extensible data types.
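For reference, Fin n is the type of natural numbers strictly below n. A standard definition, sketched here in Haskell rather than the Agda of the mechanization:

```haskell
{-# LANGUAGE DataKinds, GADTs, KindSignatures #-}

import Data.Kind (Type)

data Nat = Z | S Nat

-- Fin n: natural numbers strictly less than n. The semantics
-- interprets a record over an n-entry row as a product indexed by
-- Fin n, and a variant as a tag in Fin n plus the matching payload.
data Fin :: Nat -> Type where
  FZ :: Fin ('S n)            -- 0  <  n + 1
  FS :: Fin n -> Fin ('S n)   -- i < n  implies  i + 1 < n + 1
```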

WS (0:45:19): And then all you need is LiquidHaskell integration to throw all these problems at an SMT solver or something, and then –

NV (0:45:27): To get rid of the unsafeCoerce.

GM (0:45:31): Yeah, exactly. I started out trying to use type classes to encode extensible variants. And more years later than I want to count, I’m still thinking about extensible variants and I’m still thinking about type classes. So, I guess consistency may be the mark of a small mind, but it’s worked out for me so far.

WS (0:45:53): Great. Thanks, Garrett.

NV (0:45:55): Thank you. 

GM (0:45:56): All right.

Narrator (0:45:57): The Haskell Interlude Podcast is a project of the Haskell Foundation and is made possible by the generous support of our sponsors, especially the Monad-level sponsors: GitHub, Input Output, Juspay, and Meta.
