Arnaud Spiwack

Transcript

This transcript may contain mistakes. Did you find any? Feel free to fix them!

Matthías Páll Gissurarson [0:00:19]: This is the Haskell Interlude. I’m Matthías Páll Gissurarson, and my co-host today is Joachim Breitner. Today’s guest is Arnaud Spiwack. We learned how linear types help everywhere in Haskell, including faster-than-disk network access, how linear types go beyond Rust’s ownership system, and why it’s not always best to type check everything in Core.

Welcome to the Haskell Interlude. We’re here with Arnaud. My co-host is Joachim.

Joachim Breitner [0:00:50]: Hi there.

MPG [0:00:50]: So Arnaud, tell us how did you get into Haskell?

Arnaud Spiwack [0:00:54]: Well, when I was in academia, before all this, I was mostly doing OCaml because I was working on a Coq proof assistant. But as someone who was very invested in programming language theory as an academic, I didn’t know a lot of Haskell because you get to read Haskell in papers. I like to say that I was fluent in Haskell before I wrote a single line of it. And so it turns out I leave academia after ten-ish years. And so opportunities come about and one of them involves a lot of Haskell, which was joining Tweag. And I did. So I had to learn a lot of things about the tooling ecosystem when I joined. Because I could write Haskell, but I don’t know how to actually compile any Haskell program. I had to learn. But that’s sort of the general story. Just it showed up. And then one thing led to another and we went into implementing linear types for GHC and then pulled me into the community some more.

MPG [0:02:00]: You implemented Linear Haskell, or you were like leading on that project. Can you tell us a bit about what are linear types and what is Linear Haskell?

AS [0:02:09]: What are linear types? Linear types Just introduce a new subset of functions or identify a subset of functions because It already existed, but there was no type for it. And these functions, they use their argument exactly once in a sense that can be made precise-ish. And I mean, it might first look a bit silly, just a weird restriction. But when you think about Rust and the popularity that it has, the restrictions are kind of similar, not quite. And it just lets us encode a ton of patterns inside pure functional languages that we couldn’t do before. The most typical example is the one from Rust, like encoding mutations without sharing. It’s honestly the most interesting one, but it’s the easiest to come up with. We’ve been working recently quite a bit on a destination passing, where instead of returning an argument, you take a hole as an argument and you fill the hole with the return value. A nice example that we came up with Jean-Philippe Bernardy, there’s an article about this, is you can use linear types to draw graph-like or circuit-like diagrams as functions in your language. And for these sort of applications, you probably mostly want to have a graphical language, but that is not always available inside GHC nowadays with this little library. You can just use linear functions as a way to represent your graph. And you don’t have to do things like, I add a node and then I add an edge, like you would do in Python or Java, which is fairly error-prone and at the very least not idiomatic in Haskell.

MPG [0:04:02]: Right. So linear types allow you to automatically just add functions and you get a graph out of the tool? Or how does it work?

AS [0:04:10]: Well, you write pieces of graphs that function in that story. So you have a bit of graph, it has a few edges that don’t lead anywhere that are the inputs, and a few edges that lead nowhere that are the outputs. And maybe you have three of them and then you say, okay, so this is a linear function from three ports to two ports, and maybe the ports have types. So maybe a port of type A, a port of type B, a port of type C, to a pair of a port of type D, and a port of type E. And that works for the technical thing that you’re building is a term in the language of symmetric monoidal categories. Yeah, I know it’s fancy.

JB [0:04:55]: It was quick five minutes into the podcast and we have monoidal categories?

AS [0:05:00]: Yeah, I mean they’re very related to linear types. The point is that this language that you can write with combinators and it’s awful is the same thing as a circuit that you can draw and it’s beautiful. And with linear functions, you can sort of bridge this gap and not have to use the awful language despite the fact that you can’t really draw in Haskell.

JB [0:05:20]: Does that include cyclic graphs or is that a different story?

AS [0:05:23]: If you want to add cycles, you need to add this ability sort of manually, say, okay, like a fixed point combinator or something. The technical term, if you want to look it up, is a trace like in traced categories. But yeah, it will give you the right signature that you want most of the time and the right properties that it should have.

MPG [0:05:44]: Right. So what kind of prompted Linear Haskell in particular? Like was there a specific project that you’re like, “Ah, here we would need linear types,” or was it just, “Ah, we can do it, and then we do it”?

AS [0:05:57]: It is partly a coincidence. We’re working on a project on which two of the members for Tweag were Jean-Philippe Bernardy and I. This project was about distributed storage at a massive scale, but we were both fresh out of academia and we had worked on linear typing not too long before for different reasons. But we sort of started seeing opportunities for linear typing everywhere. “Oh, we could do that much better if we had linear types.” “Oh, we could do this much better if we had linear types.” It’s kind of a running joke. And maybe the joke is not funny after you repeated it enough and it sort of became a serious thing, and we decided, “We have some room to do that. And so, let’s make it happen.” So some of them were related to protocols. If you consider the notion of monoid linearly and you have a lot of monoids that are weird like you could possibly do, and we’re thinking, “Well, okay, so we are making some kind of MapReduce sort of framework. Can we encode a more complex protocol inside MapReduce?” We’ve had some experience about that. It worked, but that was one of the idea.

Another one was due to RDMA. That’s a remote direct memory access. You have DMA in your computer, that’s what makes the different peripherals talk to the memory directly and not kindly ask the kernel to, “Please, kernel, switch so that I can write to the memory.” That would be very slow. So it’s the direct memory access. That’s the direct part of the memory access, I mean. And the remote one is the same, but across the network. It’s common in high-performance computing clusters. It’s the way they tend to transmit a lot of data that the interface looks like, “Okay, so here’s the buffer in my memory. You can write in it.” And maybe you split the buffer in bits and you pass, have the buffer to a computer, and have the other buffer to another computer, and they both fill their bits. And they just do that without any access to the CPU or the kernel. It is pretty fast with good infinite hardware these days. It’s faster than disk access, which is ridiculous. So yes, we wanted to render this interface in Haskell and it’s the kind of destination passing, which I didn’t necessarily realize at the time. But it sort of was obvious that, with linear types, you could make a lot more sense of what you were allowed to do and make this interface more pure.

These are just two examples and there were also issues about memory control because In distributed systems, the garbage collector is a rather big pain. Because the general idea of a garbage collector usually is that it does cost some time and sometimes it slows down your program, but it’s an exchange as well of having extremely fast allocation. And so in amortized time, in batch time, it’s practically invisible, it’s very fast. But if you distribute enough processes, there will almost always be one of the processes that is currently waiting on its own TC. And then everybody’s waiting in the amortization doesn’t work anymore. And that’s the problem that other distributed systems have encountered if you troll through the Spark documentation. Apache Spark, that’s a well-known distribution framework for data parallel computations, a successor to MapReduce if we want. You’ll see also considerations about managing memory out of GHC heap to avoid triggering the GHC yadda, yadda. And we also thought that linear types could help with this.

MPG [0:09:49]: I mean, because in Haskell, we try to avoid mutation, right? But then if you’re writing a big application, I mean, my experience is mostly with GHC itself. Like there is a lot of mutation going on because that’s kind of what you need for speed. So yeah, it’s ironic, right, that you kind of need a new type system to be like, “Okay, now we’re going to make sure we only use this once, and then the reference can be gone and then you can just garbage collect.”

AS [0:10:15]: And that’s the idea behind ownership or uniqueness. Basically, the observation from the 20th century is mutable, state mutable, shared state bad. And so you have basically two choices if you want to avoid that. Either you don’t mutate state or you don’t share mutable state. And functional programming is more like don’t mutate, but all the languages, like Clean like Rust, nowadays are more on the side of let’s just not share. And adding linear times to Haskell allow typing. There’s non-sharing. Though it’s not the same thing as an ownership-type system. You can encode ownership in it, but it’s going to be more clunky than Rust that has a type system that is dedicated to this.

JB [0:11:11]: Can you maybe explain the difference a little bit for those who haven’t played with both languages already?

AS [0:11:16]: Sure. There’s a lot of practical differences, but the theoretical difference is that, in an ownership-type system, I get the value with the guarantee that nobody else is using that value, is looking at that value even. I know that from the type of my argument as a function. So it’s a constraint on the context and I, the function, am all powerful and I’m free to share it afterwards. I can’t promise to anybody that it is not shared, but I can do that. In a linear type system, I, the function, promise that I will use my argument just once. So I am very restricted, and the context is not, and the context is actually freer. It can do more things with this than if I was allowed to share it, because the context knows I’m not going to share.

JB [0:12:12]: So how does that then help with this? Maybe there’s a naive idea of how Linear Haskell works, but my impression is whether if you have a function that takes an array where I can mutate some cells, and I have a linear function, then it can compile in a way that actually does the mutation in place. But it seems if I don’t get the promise that somebody else is still looking at the array, then this doesn’t quite hold. So somewhere, my understanding must be broken. Can you fix that?

AS [0:12:36]: Yes. So you were complaining about categories earlier, but now it really became technical. So, how you approach this is, as a linear function, what I guarantee as well is if the thing wasn’t shared when you gave it to me, it’s going to be unshared when it goes out, right? And therefore, one way to encode ownership is to say, okay, I’m going to create a value, an array, and I’m going to pass it to a linear function that cannot return the array. It needs to be used within the scope of this function and that will return a value. So it’s a bit of a continuation passing transformation, but only locally. So you take a scope function as an argument. And within the scope, the thing will only be used as a linear argument, so I control. The truth is, very often when you do that, you actually have two people watching the array, the scope function itself and the allocation function that probably keeps the array so that it can do some modification. If there are some exceptions, you might need to do some deallocation. So it keeps a handle on it, but it doesn’t really do anything. It promises it’s not going to do anything cheaty to it. So it acts as uniqueness typing of this continuation type and coding. Interestingly enough, as far as I’m aware, if you have an ownership-type system, you can’t encode linearity via continuation passing. So on a pure theoretical point of view, linear types seem to be the more general one. That’s not why we went that way, though.

JB [0:14:19]: But as somebody who’s then thinking about how the program got actually executed in the machine, it seems that this naive idea of, “Hey, with types, all my array mutation become in place,” is not a local property of the function, but rather needs more information about context in a more complex analysis of the compiler to actually pull this off.

AS [0:14:39]: Well, it’s a program-wide or a library-wide property, right? It’s an invariant of your library. Just like when you have the set library, it is some balanced tree data structure I don’t even know which. In the containers library, I mean. But it has an invariant and functions all assume that this is why it is true. Otherwise, they will produce a wrong tree. And you have the same sort of idea except you will break much deeper properties of Haskell if you get it wrong and if you implement a mutable array library like that. So we develop the linear-based library that you can check on Hackage or on GitHub, and it has a mutable array library and you will see that it has some pretty scary-looking unsafePerformIOs going on.

JB [0:15:30]: So if I can paraphrase, linear types don’t allow the compiler to introduce in-place updates safely, but allows you to write a library using unsafe primitives with a safe interface towards its users.

AS [0:15:42]: Yes. Yes is the short answer. And of course, you can possibly – it might be possible to add some compiler magic. It is a direction that I haven’t pursued and I find a bit daunting, to be honest. But it is not impossible in theory.

JB [0:16:02]: It’s also unclear how much we would gain because whether the trusted code base is just what happens to be in the subdirectory compiler of the GHC repo or also in the subdirectory base or linear base doesn’t really change too much to me at least. But our opinions vary.

AS [0:16:17]: Well, it depends on whether you want some automatic optimization versus some more careful controller about what’s going on. In the project, we’ve largely been more about control, and saying what we’re trying to support with linear types is more, what do I do when the compiler optimizer stops helping me? Let’s backtrack a second, right? A compiler does a ton of things for me. And that’s good because I don’t have to think about them, right? And GHC has software transactional memory, super underrated by the way. It has green thread, it has garbage collection, it has memory layouts.

JB [0:16:58]: Compact regions.

AS [0:16:59]: Yeah, it does a ton of things, I don’t have to think about them. And it has all these optimizations that are built in the libraries or the compiler itself. So I write the code as I want it and I know that the code that will be produced is decent most of the time. But it’s not always predictable. It’s a common experience of many people trying to write performance code that they just add one line to a function and it just behaves dramatically worse because something in the compiler decided, “Yeah, that’s too big for inlining anyway.” Or you move it to another module and then things change. You don’t know how, you don’t know why. It blows in your face.

And when you are trying to write highly performance code, or you have a bit of your goal as this needs to be highly performant, then the compiler optimizer gets in your way. You have to think about what it does all the time about how to work with it and how to circumvent it. And so how about we expose some things to the user that help them do without the optimizer, and things like that could be like fusion. Think array fusions in the vector library. It’s fantastic until it isn’t, right? But you can do the same work and you can expose the same work by giving a bit more than a structure to the users that look like arrays are really not allocating and you chain them with linear functions so that you are sure that you don’t duplicate computations by mistake. You pull together and you do the work that the fusion framework of the vector library does. You do it by hand, at least you know exactly what happens and you are not surprised. And that’s small, this sort of use case that we’ve been trying to think about, and therefore neglecting what the compiler could do to take advantage of linear types.

And to be honest, even right now, the compiler only sort of understands linear types. If you know about GHC’s internals—and I know both hosts do, but not everyone listening—there’s this typed language called Core where most of the optimizations happen. And the type checker for Core is called the Core Linter, and you can trigger it on your GHC builds by passing -dcore-lint to GHC. This option doesn’t check linearity. So it just drops any information about linearity when it works. And you have a different option, which should be -dlinear-core-lint, I think. I’m not entirely sure. I always forget which precise name we gave it. And this will also check linearity and trigger linearity checks. And it really works reliably only on the output of the desugarer. So before we do new optimization, we produce linearly typed call, and then after that, we sort of forget about linearity because it’s very subtle. And it also gets extremely technical here. So I’m probably going to stop speaking about it now, but I can get in-depth on why it’s technical and the interesting problems that adding linear types to a compiler pause compared to what the literature provides as guidelines.

JB [0:20:36]: That sounds dangerous to some extent. It means that GHC optimization may actually change the narrative properties and break the invariance that the, I don’t know, linear base library is relying on?

AS [0:20:49]: It may, but a conjecture that I can’t even say the conjecture formally, these are like questions I still have, is that any compiler pass that would destroy linearity guarantees is absolutely terrible for normal programs. And defining breaking linearity guarantees and absolutely terrible are not easy. So it’s a bit pretentious to call this a conjecture even. It’s a principle. And so the principle under which we operate is that it’s going to be fine. It’s going to be fine because it’s not the linear type problem. It’s a GHC problem. It’s just, if you are doing an optimization, then you are not destroying linearity guarantees. The difficulty is not that the program output is not linear; the difficulty is it’s linear in complex ways that are difficult to verify for a computer, and we don’t have all the answers yet about how we could verify them. And what we don’t want to do is to say, “Okay, nevermind. We’re not going to stop doing this optimization because we don’t know how to type the output. The optimization is good. We’re not going to hurt everybody.” So instead, we say, “Okay, well the – or type checker is bad, so we’re not going to rely on it too much.”

JB [0:22:11]: I mean, I wish I would not be able to go on a very high horse and say, “Oh, this is bad,” but I just made the same argument for something else on a paper I submitted. What else to do is make my paper say, “Hey, the linear Haskell guys are making the same argument that, yeah, transformations will not do this kind of stuff because that’s transformation anyway.” So yeah, I can’t complain.

AS [0:22:30]: It is a tight rope to walk, right? It’s not super comfortable, but it seems to be working for us right now. I mean, it’s an engineering trade-off and also a research trade-off because there’s some knowledge we just don’t have. And that’s an interesting exercise, right? If you were to start doing a compiler from scratch and you use a typed intermediate representation as where you do your optimizations, how much would you restrict your optimizations just to satisfy your type checker? You really tried that to be a tiny, tiny amount, but it still happens that linearity becomes quite hairy. And we don’t even have a good definition of linearity without a type system. And so it’s difficult to say, yeah, it’s linear because some argument that is undecidable is just not called by the type checker. We’re not even there yet. So there’s a lot of knowledge to be built that I think is triggered by this new problem.

JB [0:23:29]: If you’re saying you don’t have an untyped understanding of linearity or an operational one, does this mean that you can’t do a type soundness proof for your linearity check? Usually like progress and preservation proof, I would expect to prove that value. You run the program without looking at the types and certain things, when they happen, obviously that’s bad. And that’s the kind of things you want to rule out using our type systems, in this case, using a value twice.

AS [0:23:56]: Yeah. But the usual type soundness properties are I evaluated a value, the value has the right type, right? And that you can do with linear types, it’s just purely mechanical. But you have other sorts of definitions of having the right type. Like realizability tells you a weight to build a notion of something being of type into something that computes an integer that is vastly more complex than whatever type system it would capture because a type system is decidable most of the time, or at least you tend to want to. And at the very least, it is modular. That means, in particular, the type of something has built out the type of its component.

But really, like if you’re doing some lazy language and you write let X equals something awful in zero, then this is an integer computation, right? You’re not using the X, it doesn’t matter whether something awful is, it can be as untied as you wish. It is still an integer. And you can define what it means to compute an integer. It’s just the sort of easiest way to say it relates to your value, value reason, to an integer value and the value is integer. And then you can define what’s the function from end-to-end, which says, “Well, if you give it an integer, it computes an integer.” And that sort of gives you a definition of what it means semantically to be of these types. And this is a nonmodular and compositional non whatever you want definition and is beautiful after a fashion, but it is not something that is operationally usable in a compiler or a type checker or anything. And then you build your type system and you can say, “Well, it’s adequate to that definition. It underapproximates this.” Linear types also underapproximate something and we have an intuitive idea of what this something is, but we don’t have something that I would personally consider being a satisfactory intuitive definition that is formal enough to state other theorem. And that’s the missing bits that would really help speaking about what we can do in compilers and doing more advanced, more subtle type checking that we could enhance lint, the linter with, so that we can certainly be much more confident about linear types in the compiler.

JB [0:26:26]: So any grad students looking for a PhD thesis topic should now contact you?

AS [0:26:31]: Yeah, there’s actually someone doing some work as part of their masters on part of this. Not the semantic bit, but on understanding how much more we can do in Core. But yeah, this is very difficult work right now. But if you can tag me, I’ll do my best to advise.

MPG [0:26:48]: Right. So you touched on this a bit, but I’m kind of interested in, because it’s a big change to GHC, right? But it seems to have just gone very smoothly. Like you added it and if you don’t have the language extension, you never get a mention of linearity, but linearity is still there, right? Everywhere, no?

AS [0:27:11]: Yes and no. So it’s a small change in theory because we’ve engineered it that way. It’s a medium change in practice. And we have had bugs where linearity zipped out in normal programs. But yes, linearity is everywhere nowadays because all your data types are linear in the first approximation. So the reason why it’s a fairly chunky change as far as the code is concerned is that we modified the arrow type a little. And the arrow type is constructed and deconstructed everywhere in the compiler. So, it becomes a small change, but it’s very non-local. The patch was 7,000 lines, I think, with the following caveats. More than half of it were tests. So it’s not as big as it first sounds. And most of the changes are in Core. And we have since sort of walked back on some of these because, at the time, we were trying to type the output of every passes before realizing the futility of our endeavor. And futility, not like it’s a bad idea in theory, but because we are just not thoroughly unequipped to actually tackle this challenge, the word population in general, yet. We don’t have enough knowledge to do that. I think it’s a lot of academic work.

But yes, we wrote a paper that we published in the conference POPL in 2018, and this paper had very, very little new things to it. It was not a paper that says, “Oh, this is this new scientific idea that we’re pushing over.” We pulled from different streams of research and try to make a coherent hole that was primarily easy to integrate inside GHC, at least in theory, right? Simon Peyton Jones was part of this effort as at the very least, a sanity check. Like if I make a claim about that being easing GHC, I would make sure that he would say yes or no. Right? That helps a ton in making sure you do something sane. We believe this design is probably good for any language that wants to add linear types. Though if your language is not lazy, you might have more leeway because laziness interacts with linearity in ways that are not obviously very intuitive a priori. But it’s a difficult conversation. So it constrains your choices is the point. And so you might want to do something else because all the designs would work better, but I think this design still works. But it was also primarily sanity-checked against GHC.

And the first prototype, I built a prototype to publish this and not that with a paper, I think. And that was three weeks work. It was really short. It was also unsafe. But you had to work a little to find the end soundness in all that. But then working to patch all – some things were like challenges for engineering reasons, but all in all, it was not a gigantic sort of patch. And if we had taken a more loose relationship with Core from the get-go, we would’ve also saved us a ton of time. That was really the hardest bit, was to try and integrate with Core and then realizing that we could not fully anyway.

JB [0:31:04]: Do you think that’s a model we should follow for other type extensions where they sound useful in the source language, they’re hard to integrate into Core, to just say, “Okay, well, let just have them and don’t check them at Core anymore,” or do you think it was good and reasonable for Linear Haskell, but it’s not a good example? Like I’m thinking of extension types, in particular, but make it a bit more concrete.

AS [0:31:27]: I think it’s a case-by-case. I haven’t thought about whether existential types could exist without Core modifications. It might be difficult because the type eraser would – I’m not sure how you handle it. It might not be easy to do. But if you think about the refinement type, school is all about that, right? You have the sort of intrinsic typing that you have and this one is something that makes sense to preserve, and you have the refinement types on top of it, which is extrinsic, and you can erase sounding. And you could imagine that if you think of your system as a refinement type system, you have the basis that goes to Core and the rest that doesn’t, this is a perfectly sound design. And then you’re kind of liable to prove that the eraser is sound and could be really created. And if you’re doing something like Liquid Haskell and you want to make sure that your program doesn’t divide by zero, you are pretty sure that the compiler won’t introduce that, that’s fine. And that would be a non-ambiguous bug. It’s illegal for a computer to introduce a division by zero. And you could follow this approach because there’s not really a use for the compiler to say, “Okay, this integer is even, what do I do with it?” Nothing. Right? Unless you can use that for specialization. But it’s very dubious.

So if you don’t need it in the compiler, erasing it, like for me, I have a fairly relaxed vision of this. The type system in Core is both a curse and a blessing in practice. It makes some things really difficult, but it also allows for some optimizations. So from an engineering point of view, it’s a bit of a balance to strike, in my opinion. And I’m not a maximalist in either way.

JB [0:33:16]: Yeah. A good example maybe that’s accessible for this is, a very obvious thing you may expect the compiler to do is, if you have an either in Bool and you turn it into an either in String, so you only change it if it’s a right, but you don’t change it if it’s a left, it’s still changing the type of the whole thing. So in Core, you cannot reuse the same value. You’ll have to reallocate a new Left. And this is a very small example for some obvious optimization you would do if you didn’t have types in Core and GHC can’t do with optimization in Core. It has to do with a later phase where there’s less guardrails - like the STG phase. But this is a nice, small example for the kind of challenges of having a typed intermediate language.

AS [0:33:58]: Yeah, and also for the implementers. Working in Core and having to deal with the types and making sure they’re preserved properly, it is work and is work that if you had an untyped language, you might not have to do and it might be more productive to work on the optimizer.

JB [0:34:16]: Would you prefer to do this work with another language that doesn’t have a typed intermediate language? Does OCaml have a typed intermediate language or would you not have to worry about that there?

AS [0:34:26]: I think it doesn’t, but it will. I think that’s the status. I’m not sure. There’s a bunch of intermediate language. There is an intermediate language in OCaml. It’s called Lambda and it’s a Lisp sort of language and doesn’t have any types, but then it also doesn’t do much optimization at all. And now we have these languages Flambda and its successor, Flambda 2, very oddly named. Flambda 2 is not rolled in the compiler yet as far as I’m aware. But I think these languages have some types in them. And to the best of my knowledge, you don’t use them by default. But they do some quite a bit of optimizations, not quite the same as Haskell does because there’s much more things like exceptions and references to take into account that I hear. They’re not very good at references anyway.

I mean, some things are legal in either and that is not legal in the other, and you’ll have different idioms that you want to optimize for, but you still rather not use references in OCaml if you’re using Flambda to optimize if you’re counting on Flambda to optimize your code. That’s my understanding. But something it can do I think is, in the source language in OCaml, you have exceptions. You have just one kind of exception. You can erase exception, cache exceptions. But in the intermediate languages, you have two notions of exceptions, or the exception type itself is the same, but you have two ways to throw exceptions. It can be either a lexical exception or a normal exception. And the lexical exception, you know that within the functions frame, you have a catch, and so you don’t need to do stack rolling. And they’re super fast and sometimes can be compiled to tight loops. It’s tight loops and things like that. And the compiler tries to detect opportunities to transform exceptions that you wrote into such lexical exceptions that it also uses internally. We’re very off-topic, sorry.

JB [0:36:24]: Well, but it’s still interesting.

MPG [0:36:25]: Yeah. And I wonder like, so now you say you kind of skip checking Core so much, but you mentioned Liquid Haskell. I don’t think they look at Core that much either. So if you were doing it over, would you just skip the Core part and just have it on top, or is it necessary to look at the linear Core to make sure it’s safe? So, why are you still using Core?

AS [0:36:51]: Okay, story time. So my initial design, I proposed to not touch Core at all in linear types, at least in the first version, and to think about that later. And that’s what the prototype did. And that was one of the causes of unsoundness, but not for any reason you can come up with on the spot. The reason for unsoundness is that if you go – the Linear Haskell paper is on GitHub and I was tracking issues there for what we needed for the prototype. So you can go to the GitHub page, look at these issues. And there’s one that I named Debug TM. And that was my main problem in the prototype, where sometimes you could just simply convert a linear type to a non-linear type and completely forget about the guarantees that you were requiring. And the reason for that was that sometimes types are compared for equality with a function called Eq type in GHC. And then this function is used both in Core and in the type checker. And because it’s the Core function, I decided to just ignore linearity there. So whenever there was a many n1, Eq type would say, “Yes, these things are equal, everything fine.” But it’s also used in the type checker. It’s not like obvious how usually you use unification, but in some circumstances, instead of unification, you call Eq type directly. And so then it would say, “Yes, this type that is nonlinear is obviously linear so everything’s fine,” and would let you code and sound programs.

And so one of the primary reasons, like beyond hubris of thinking we can do this to go and start doing things in Core, was the fact that there was this one type type that was used both for the optimizer and the type checker and made it difficult to do something purely in the type checker. Nowadays with the Trees that Grows architecture, maybe that would be a bit easier, though we’re modifying the arrow type. So it is still something that is a bit unpleasant. Actually, I don’t know whether type has Trees that Grow extension points. Does it? I don’t think it does.

So anyway, a pure superficial circumstantial engineering issue, we have the choice. Either we split the type type in two or we start dealing with linear types in Core, at least to some degree. And we chose the latter. And it’s still not clear to these days that splitting the type type in two would not have been harder. So I know there was this discussion recently about constraint and type, which used to be distinct in the front end and equal in Core, which caused all sorts of problems, continuing problems. And so it was recently decided to split them in Core. And when we were discussing this proposal, I point down like a solution that is not discussed here is split type in two. I had a grudge. And so then you don’t have this problem because then it’s very clear that it’s equal in Core, it’s not as disequal in Haskell, and there’s no confusion possible there. No bugs for treating acting to equality types or whatever, things like that. And it was not considered feasible. Well, that’s funny. And we split type and constraint in Core.

JB [0:40:31]: Was it really not feasible or just not worth it?

AS [0:40:35]: Well, I mean it is definitely feasible if we have infinite time and money. It’s just a lot of work to split these things. I’m still hopeful that someday we don’t use the same type for both Core and the front end. But it’s not easy. It’s definitely not an easy task to tackle. And this actually split over – we discovered last week or two weeks ago that this split of constraint and type in Core prevent unboxing GADTs in many cases. It’s not a serious easy. The thing is, your GADTs probably don’t get unboxed. They don’t get unboxed for a variety of reasons, not just this one. So most of the time, it doesn’t make a difference. But the thing is, you can only unbox things like you put things in unboxed apples and unboxed tuples, only take arguments of kind type. So you can’t put constraints there because the kind is now different. And why did we discover that last week is because an idiosyncrasy in linear types was hiding the bug.

JB [0:41:37]: But how can an idiosyncrasy in linear types hide a bug that’s unrelated to linear types?

AS [0:41:41]: So it’s not unrelated to linear types because all the constraint fields are considered non-linear in GADT, whereas the non-constraint fields are linear by default. And we used to not – we still don’t. I need to merge that. And maybe by the time we publish this, we’ll be merged. So there’s a bit of time warp things going on here. But we use not to unbox unrestricted fields for the reasons I was discussing earlier, which is that we wanted to type the output of the unboxing and we don’t anymore verify linearity for it. Is this podcast usually that technical or is it just because I like to rant about technical stuff?

JB [0:42:29]: It varies, but I think it’s welcome.

AS [0:42:33]: Okay, very good.

MPG [0:42:35]: I’m enjoying it at least. So one final question about linear types is, we have linear base. So say I have my – I got 10 modules, 500 lines each. I just plug in linear base, I fix all the type errors. Would it go faster or probably not?

AS [0:42:59]: You mean just replacing your program, like the Prelude by my Prelude?

MPG [0:43:05]: Yeah. So, essentially, if I’m not modeling an RDMA, I’m just doing regular stuff, but I use Linear Haskell and then give it all linear types—if I have a lot of arrays, I would expect it to work, but does it help with anything else or is it just for the specifics?

AS [0:43:23]: Well, it depends. There’s no – you don’t just activate linear types and poof, it gets magical. You need to make use of libraries in a way that makes sense for your program. Remember that GHC has a very good optimizer. So if you’re not careful about how you use linear types, you’re not going to beat the optimizer. It is pretty good. But if you find a place in your programs where you feel that the optimizer is getting in the way, or maybe – I remember Edwin Brady published a blog post when Idris 2 was released and said, “Oh, it’s dramatically faster. And one of the reasons why it’s faster is that I use a hash table where I used to use a binary search free-based map.” If you don’t want to write your entire type checker MST and still use a hash table, then linear ways might be an option. You pass that the hash table could be used linearly. It would look like a map except you can’t share it, so you have to be a bit careful. You would return it very often. You return it from reads.

We do have plans to improve on this with a line of work on linear constraints that I am just slowly picking up again after hiatus, where a lot of the threading would be handled by constraints. And thus, you would have to worry less about it, less syntactical noise. But right now, you have a little. It reads like a map that even when you read, you get a new version of it. So it can be a bit clunky. Some people like it anyway, but at the very least, you don’t have to deal with ST. You’re out of a monad. So maybe it’s better for your program. You are not allowed to share it. So there are things that ST can do that you can’t do with linear types. Things like Dijkstra’s algorithm I think use a bunch of sharing somewhere. I’m just doing that off the top of my head. I haven’t checked. I might be completely wrong about that. But there’s a lot of data structures that might use sharing. You want to do a cyclic list sort of thing. Linear types will not help you there because there’s built-in sharing. You need to find a way around it. You might do some ST things, use some unsafePerformIO and expose a linear interface with that that might work, which we do a lot. But every time you do that, you’re just adding to the trusted base. You have to be a bit mindful of how loose you are with your unsafePerformIO. Plus, unsafePerformIO prevents some optimizations a bit careful. Sometimes you can use unsafeDupablePerformIO because linear types sort of makes that safe. But it’s really, really hard to convince yourself that you won’t have some dabble-free error in some nonsense situation that you haven’t like triggers one time over a thousand that you can’t check for, and someone will come to you and say, “Ah, I got a safehold in my program that use your library. And you will have no idea where that comes from, but you got this 10% extra performance.”

MPG [0:46:29]: That’s in ByteString, right? The unsafeUnutterablePerformIO, which just –

AS [0:46:33]: Yeah, it’s the same function as unsafeDupablePerformIO. I think it’s called accursedUnutterablePerformIO, which is the best name for every function ever written in Haskell. That’s for sure. Very Lovecraftian. But yeah, it’s very dangerous. That’s why they name it that way. Yeah, I never remember what the actual differences between the two are. But when you don’t know, don’t use that one.

MPG [0:47:01]: My favorite part is that the scarier the name, the simpler the implementation. So the accursedUnutterable, just ignores that it’s IO. It just throws away the real-world token.

AS [0:47:12]: Yeah, you just produce a real-world token out of nowhere. So it’s a run state hash function, which is a novel thing from base. You just wrap around that and you’re done. Whereas, unsafePerformIO has to do some extra careful checks.

MPG [0:47:28]: So linear types let you use unsafePerformIO more, but not unsafeDupable.

AS [0:47:34]: I think, usually, you can use unsafeDupablePerformIO because, well, you have to be careful on how you set things up, but you can set things up. There’s only one of each expression, so they can’t be shared. They can’t be duplicated because linear types, it doesn’t come up in the compiler. So they’re not really dupable IOs, but the point is that they won’t be duped or shared or anything because you’ve set it up that every expression is unique in your program and it’s very, very scary to program with it.

MPG [0:48:08]: But all this implementation was in the context of – so Tweag did most of the stuff, right? I heard about Tweag a lot, but what do they – can you tell us about Tweag?

AS [0:48:18]: Yeah. So we’re a consulting company in the innovation and cutting-edge technology space. And nowadays, Tweag is part of a bigger company called Modus Create as also into consulting, but has a wider array of technologies. So I’m going to focus on the traditional Tweag bits for now. So yes, we’ve been involved in a number of communities. Haskell is one, but also OCaml, Rust, Bazel, that’s a build system; Nix, a package manager for the big ones, and some more. What we try to do is to bring expertise to companies, large or small, the way they go in somewhere that nobody has really done before. So it has been a good fit for many startups that had this idea of a product and they needed a bit more oomph at the site, but also research departments in bigger companies, including biopharma, finance industry. We’ve been working with self-driving car companies, flying self-driving car companies.

MPG [0:49:24]: Are the companies flying or are the cars?

AS [0:49:28]: The cars. I don’t think any was commercialized yet. It’s a tricky business and if only for regulation purposes, right? You have to go – they tend to be American companies, so you have to go to American regulators and convince them, “Yeah, my car can fly, and it can fly over a city.” So, that’s a tough one to crack and that’s probably the biggest challenge, even more than technology there. And so we work both by providing expertise in these tools and also developing the communities and working in the open source space.

So we’ve been working, we’ve been talking a lot about GHC in which we do some work, but we also contribute to wider communities. We’ve built an array of libraries for Haskell. I think the most popular one is Ormolu. It’s a formatter that follows fairly different guidelines of the formatters we had before, closer to how Goformats go. And as Goformat is the thing that sparked interest into formatters, we’re kind of happy with the choices we made there. I think right now, it’s probably the most popular formatter for Haskell.

You have to understand that Haskell is extremely hard to format for a tool because it has fairly idiosyncratic rules about wide spaces. And when your tool is all about moving white spaces around, well, that puts a lot of drafts in front of you. And so, that’s a very interesting space. It’s been very, very difficult for formatters to get the right approach. And I think, I and we think that Ormolu strikes a good balance there. But we also developed build rules for building Haskell with Bazel. That’s a good example of things that is in between.

The Nix package manager is fairly well-known in the Haskell ecosystem. We’re not particularly involved between Haskell and Nix, but we have been promoting the community a lot, doing some basic maintenance to make this tool easier to use. The UX of Nix is still a bit difficult. So if you’re listening to that and you don’t know Nix, you should probably definitely check it out. But you’d know this is going to be a bit hard. The discoverability is still a problem, but I am loath if I don’t advertise Nix a little because it’s a very useful tool. It’s the tool that removed for me the need of scratching my head. And every time you start a new project, you have to, “Oh, I need to install this and that library in my system. I need to figure out how they’re named in my package manager. So maybe I use Ubuntu.” And you document it for Arch Linux or the opposite way. And you have to install all that and they stay on my system forever, but I needed them for five minutes. No more of this. I go to a library like –

And as a director for the open source work that we do, I get to sometimes build a lot of these projects. And so I just clone the project, type ‘Nix shell’ or ‘Nix develop’ depending on details, and then make OCaml build or whatever, and then voila. I don’t have to worry about this anymore. And when I leave the directory, all this is gone. It’s stored somewhere in the disk, but it’s isolated and it can be garbage collected. And that’s been very helpful. So that’s my little advertisement for the Nix community bit. But it has saved me a lot of headaches. Yes, that was the short of it, I guess. Do you have specific questions?

MPG [0:52:55]: Yeah. Because you had the Asterius compiler, right? And that got merged and released?

AS [0:53:02]: Well, yeah. The version that we merged is a bit different from the Asterius compiler. It’s largely a reimplementation based on what was learned during Asterius. And yes, as of version 9.6 of GHC, WebAssembly is a targeted backend. And right now, it’s not full-featured, but it can be used. We have deployed it. If you go to the Ormolu website, that’s certain back on earlier conversations, you will find a playground where you can write a bit of Haskell and you will format it. And this is actually powered by WebAssembly. You just format as you write and it’s surprisingly responsive. It’s much faster on JavaScript. That being said, as of 9.6, you also have JavaScript as a possible backend when this is more convenient for you. And that we were not involved in.

MPG [0:53:53]: Right. I was also very impressed by Asterius, it seemed like. Because it started in 2018, right? And then it was –

AS [0:53:59]: Yeah, it’s a long project, largely a solo project from Cheng Shao. That has been joined on and off by various people. The version that was merged is a collaboration between him and Norman Ramsey, who is fairly well-known in the Haskell community for having invented a few things like C – and little things. So we’re very happy that he was willing to join this effort.

MPG [0:54:27]: Right. So I think we’ve run a bit past time, but thank you so much for coming to the Haskell Interlude.

AS [0:54:34]: My pleasure.

MPG [0:54:34]: Yeah. I’m excited about Linear Haskell now, I think. In particular, I didn’t know that it was more general than Rust. I thought it was kind of the same, but we’re doing more. That’s cool. That’s very cool.

AS [0:54:48]: This whole destination story is actually something that I don’t know can be encoded in Rust, at least not with current knowledge. So that’s a good example of something that our lead destinations are something that Rust cares about because they allow reducing your memory footprint, which Rust is really all about. That’s sort of a gap in the design that they want to solve in an ad hoc fashion, but we can do in libraries. That’s my hope.

MPG [0:55:13]: So maybe they’ll get linear types at some point.

AS [0:55:17]: Maybe.

MPG [0:55:19]: Okay.

AS [0:55:20]: Thanks for me.

MPG [0:55:21]: Thank you.

Narrator [0:55:26]: The Haskell Interlude Podcast is a project of the Haskell Foundation, and it is made possible by the support of our sponsors, especially the Monad-level sponsors: Digital Asset, GitHub, Input Output, Juspay, and Meta.

31 – Arnaud Spiwack

Related links

Transcript