Recorded 2024-02-06. Published 2024-04-01.
Roman, better known online as effectfully, is interviewed by Wouter and Joachim. On his path to becoming a Plutus language developer at IOG, he learned English to read Software Foundations, encountered many space leaks, and used Haskell to prevent robots from killing people.
Wouter Swierstra (0:00:15): Today, Joachim and I are talking to Roman, better known online as effectfully. He will tell us a bit about learning English in order to read Software Foundations, diagnosing space leaks, and how to use Haskell to prevent robots from killing people.
So, welcome, Roman. Good to have you on the show.
Roman (0:00:34): Hi, folks. Great to be here.
WS (0:00:37): So, the usual question we ask our guests is, how did you get into Haskell?
R (0:00:42): It’s a funny story, and the funny part is that I really liked programming when I was in the university, but then we got Java and we got C#, and I started consulting, and it just completely killed all my passion for programming. And at that point, I thought that it would probably make sense for me to be someone else, not a programmer. But I saw somebody doing Haskell, and it looked like this beautiful, amazing language. But I still had these wounds from Java and C#. And so, I decided to instead study combinatory logic and lambda calculus—everything about Haskell that isn’t actually programming. And so, I read Types and Programming Languages. Then, when that wasn’t enough, I started reading Software Foundations, which was really hard because I didn’t really know any English. So I had to learn English and dependent types at the same time. But then I got into Agda from Software Foundations. There weren’t many books. I think there were zero books at that point.
WS (0:01:48): So, what led you to Agda from Coq?
R (0:01:52): I just wanted more. I enjoyed Software Foundations so much that I just wanted more of the same thing. And Agda was even better than Coq for me. It was even more beautiful and, well, I don’t know, well-structured, well-implemented, something. I had this amazing experience programming in Agda. That is how I became an Agda programmer for a few years. And on the side, I started introducing actual real programming back into my life after all the wounds from Java and C#. I had some consulting job where I was working almost full-time for a hundred bucks a month, just so that I didn’t feel like it was a real job. And slowly, like that, I converted from Agda to Haskell, only to realize that programming in Agda was still a better experience than programming in Haskell, even though I love Haskell too.
WS (0:02:53): And how would you – I mean, you have quite some experience, I think, in both Haskell and Agda. What are your thoughts on the two and how they relate? And do you see Haskell constantly building a richer type system to sometimes be a little bit more like Agda? Is that something you support?
R (0:03:15): I think I have a more controversial view. So, certainly, Haskell moves towards dependent types and the type features of Agda, but Agda is based on some pretty strict type theory, while Haskell is an experimental compiler that people just add a lot of different stuff to—all these different features that do not necessarily play well with each other. But in a sense, that makes it even more powerful in terms of type-level programming than Agda, because you have all those weird features. For example, you can match at the type level on a function application. So, you have an arbitrary function application, and you can match in a type family on that function application. And you can even recurse and handle it argument by argument. This isn’t something you can do in Agda. In Haskell, you can find type-level variables and instantiate them to something concrete, and you can do that within Haskell without any Template Haskell or anything like that. This also isn’t something that would be possible in Agda, again, aside from maybe reflection.
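A minimal sketch of the type-family matching Roman describes, assuming a reasonably recent GHC; the family names Arg and NumArgs are made up for illustration:

```haskell
{-# LANGUAGE DataKinds, PolyKinds, TypeFamilies, TypeOperators, UndecidableInstances #-}

import Data.Kind (Type)
import GHC.TypeLits (Nat, type (+))

-- Match on an arbitrary type-level application "f a", whatever the head f is:
type family Arg (t :: Type) :: Type where
  Arg (f a) = a          -- Arg (Maybe Int) ~ Int, Arg (Either Char Bool) ~ Bool

-- Recurse down the application spine, handling it argument by argument:
type family NumArgs (t :: k) :: Nat where
  NumArgs (f a) = 1 + NumArgs f
  NumArgs t     = 0      -- NumArgs (Either Char Bool) ~ 2
```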
WS (0:04:39): Yeah. That would’ve been my question, whether you could actually do something with Agda’s reflection. The alternative would be that you’d have to somehow come up with a deep embedding of the type application structure that you want to inspect and then still be able to reify that somehow.
R (0:04:58): Yeah. But it’s kind of backwards. Then you need to handle the structure that you defined. And it’s not an arbitrary type application; it is just some specific type application that you hard-coded in there.
WS (0:05:11): Exactly. Haskell’s type families are a little bit more flexible in that respect, right? You don’t have to design upfront what is going to be the little reified type language that I’m allowed to match on. Haskell’s type families can let you match on arbitrary stuff, I think.
R (0:05:26): Yeah. And when it comes to reflection, it’s one downside of dependent types that it’s just really hard to reflect dependent types in an intrinsically typed manner. So, we have in Haskell typed Template Haskell, but from what I know, there is no such thing for Agda because it is just really so hard to provide a typed representation of a dependently typed language.
WS (0:05:54): Yeah. So, the problem here usually is that, in Haskell at least, you have a notion of types and you have a notion of terms, and these two are distinct. And if you have to check when two types are equal, you can unify them, right? Where in Agda, you have types and terms, and this is all jumbled together. And on top of that, in order to check whether two types are the same, you actually have to do evaluation again. So, you have to write your types, terms, evaluator, and this becomes one big mutually defined mess, basically.
R (0:06:25): Yeah. And it’s not just mutually defined; it’s inductive-inductively defined.
WS (0:06:30): Yes. Yeah. Yeah.
R (0:06:32): This maybe works in this type theory with like quotient inductive-inductive types or something, but not in a normal type theory; in a regular one.
WS (0:06:41): But for a lot of applications that I’ve seen doing template programming in Agda or in dependently typed settings, or even in Haskell, oftentimes I find that you only need to reflect a fragment of the language, and usually you have a pretty good idea of what the fragment you’re inspecting is. So, if you look at how all these generic programming libraries from 20 years ago or something worked, they would all use Template Haskell, and they would all reflect the data type definition and turn it into some fixed point of a functor or something. And then, instead of having to handle all of Haskell, they knew exactly: I only need to reflect this fragment. And I think, I believe at least, or I’d hope, that something similar might hold for Agda, where I don’t know of many examples where you need the full power of reflecting the whole dependently typed language into a typed setting. But if you can reflect the language into something untyped and then carve out the sensible bit that you’re interested in, that doesn’t need to be the whole language again. And that usually can be represented in a typed fashion. So, it’s not entirely a lost cause, I would hope.
R (0:07:54): Sure. I agree that it’s not a lost cause. But I like types. And so, I want to deal with types whenever possible. And the raw representations, like the untyped Template Haskell, they are usually very hard to deal with, especially in Agda where the raw representation is just de Bruijn indices. And you see it there and you calculate the offsets manually, and it’s very painful.
WS (0:08:18): Yeah. And I agree that there’s a lot of engineering work or something that could maybe be done to make it easier to write these kinds of programs. And the same is true for Template Haskell, right? That even with typed Template Haskell, it’s the kind of metaprogram that you hope you only ever have to write once, right?
R (0:08:34): And then GHC updates to another version, and good luck.
WS (0:08:39): Yeah, exactly.
Joachim Breitner (0:08:41): I guess I’m a bit of an outsider when it comes to this kind of type-level programming that people do with Agda and Haskell. And I find it very interesting that you’re comparing Agda and Haskell as if they’re interchangeable. But my impression, and maybe I’m wrong here, is that Agda programs only get compiled and Haskell programs get run. So, does this mean that all the things you’re discussing right now, you’re doing in a sphere where you just care about writing the programs but never running them? Or is it not actually true that one doesn’t write real programs in Agda?
R (0:09:14): So, people do actually write real programs in Agda. I know a couple of real use cases. Like, I think Cedille was implemented in Agda, if I recall correctly, and there was some web server or something implemented in Agda. So, I like types because they allow me to compute arbitrary stuff at the type level, and then that stuff can generate code for me. So, for me, it’s doing real work. It’s not just to get anything type checked and to check that some invariant indeed holds. I really like being able to write deriving Generic, and then Generic gets derived, and you get a lot of operations for free from that. So, I think types are very useful for that. And it is just not some kind of lost work. It’s actual work that then gets actually compiled usefully for me. And this works the same way in both Agda and Haskell.
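For listeners unfamiliar with the pattern, a small sketch of deriving Generic and getting operations for free; the aeson classes are just one illustration, not something named in the episode:

```haskell
{-# LANGUAGE DeriveGeneric, DeriveAnyClass #-}

import GHC.Generics (Generic)
import Data.Aeson (FromJSON, ToJSON)  -- one example of a Generic-driven library

-- Deriving Generic once gives many operations "for free" via generic defaults:
data User = User { name :: String, age :: Int }
  deriving (Show, Generic, ToJSON, FromJSON)
```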
WS (0:10:09): So, let’s maybe circle back to your career, because you mentioned falling out of love with programming at first and then, via Software Foundations and Types and Programming Languages, learning Coq, English, Agda. But we never really touched on Haskell so far, really. So, how did you get into programming Haskell for a living, I guess?
R (0:10:33): I was writing Agda, not for a living, just in my room. And then I realized that I needed to do something for a living indeed. And I saw on Reddit there was a job ad to work for a Russian company—I’m from Russia—a Russian company doing robotics. It was like a side business of the company. And I applied, and I just somehow got the job, even though I didn’t really have any experience to show for it. But somehow, I got the job. And in my first three or maybe four days, I realized that this is definitely not a job that I want to have, because I was supposed to be responsible for a server that controls moving robots. And the robots were fairly big, like a hundred kilograms, and the server was really horribly written. And they already had cases where a robot would misbehave. And I was just afraid that it would kill someone, literally. And I said that nobody understands how the server is written because the person who wrote it quit, and it wasn’t really well-documented. It misbehaves, and I just don’t want to be responsible if something bad happens. And so, in like three or four days, I said I’m either quitting or I rewrite this whole thing, which I guess is a bold move when you don’t have any experience and you’re just a new team member and you say that everything needs to be rewritten.
But we negotiated for a bit, and they allowed me to first rewrite like 90% of the server, and then my manager asked me to rewrite the rest. And the interesting thing, I guess, was that I had no idea how to do graphical user interface programming in Haskell. And somebody suggested that I just use interoperation with Qt, which is a C++ library. And that worked, to my surprise. It was fairly pleasant to use. There is this great Haskell library called Qtah, Q-T-A-H. It’s really well-written. It worked fantastically for our use case. I added support for graphics scenes, something like that, so that you can draw inside of the window. Yeah, it really worked.
WS (0:13:10): Now did it have to be portable software, or did you have a fixed operating system that you could target?
R (0:13:15): We had just a fixed operating system. It was Ubuntu, and that’s it.
WS (0:13:18): Yeah. Okay. And that helps, I think, with a lot of GUI bindings. I know there was wxHaskell at some point, which tried to be the intersection of every existing operating system, which is a great idea for making portable software, but in practice very hard to maintain, I guess.
R (0:13:40): I couldn’t build it. I tried to build it, and I just couldn’t.
WS (0:13:44): Yeah. How did you find writing GUIs in Haskell? I mean, it’s not something very many people do, I guess.
R (0:13:50): Well, the biggest part of the GUI was written in C++. It was a very small file. So, in Haskell, I would just write bindings for it. But it worked great. My manager only ever complained about the GUI, firstly because I’m just horrible at writing GUIs, and it was a disgusting mess, but also because it was slow, and it wasn’t clear if it was Haskell or C++ or something. And the manager really only started complaining after I quit, so I couldn’t really help them with that. Otherwise, it was really pleasant. I was surprised that you could just call into C++ from inside of Haskell and it would work. That was surprising.
JB (0:14:33): So, just to clarify, the GUI was written in C++, and you had Haskell bindings into that GUI to then talk to the robots.
R (0:14:41): Yes. So, the GUI was in C++, but all the main logic of the server was written in Haskell. So, there was a huge Haskell application. Well, not like huge, but a big one. And only a small part was the GUI, and only a part of that was C++, even though it was the largest part.
JB (0:14:59): Okay. And I’m just curious, what kind of hundred-kilogram robots are running around controlled by Haskell and may be killing people, if you can talk about it?
R (0:15:11): Yeah. I think they no longer are controlled by Haskell because –
JB (0:15:16): Wait, they’re no longer controlled. That’s even more interesting.
R (0:15:19): I think they are controlled by C++, so you should be twice as scared.
JB (0:15:24): Oh. So the robots now have finalizers.
R (0:15:27): Yeah, I guess. And virtual destructors and whatnot.
WS (0:15:31): Yeah. But you quit this job, and then what did you go on to do after that?
R (0:15:37): After that, I worked at a startup. We were doing number crunching for clients. It’s hard to say what exactly we were doing. I didn’t really understand it myself, but I understood how to implement it. And we would rent the biggest Amazon machine at the time. It was something like four terabytes of RAM and 64 CPUs or something like this. And we would run our algorithms on that and produce something that didn’t really make any sense to me or some of my colleagues. But it worked. It was fine. The numbers were meaningless, but it was fun. And we ran into so many space leaks. I originally was afraid of programming Haskell because of space leaks. It’s the only thing that seemed like it could really spoil your experience with the language. And it really did, multiple times, for me. I’ve fixed, I think, more than 10 space leaks in my career.
WS (0:16:38): And so, this is space leaks arising from doing numeric computations, which were not strict enough somehow, or –
R (0:16:46): Yes. For example, you had a vector, but it wasn’t unboxed. And so, there were pointers in there. And then you get thunks, and the thunks don’t get forced, and then you get a space leak, something like this. Also, we used a library for parsing HTML text, I think. And the library was very optimistic about the structure of the HTML text. It thought that it was well-structured, and that all the tags had both an opening and a closing tag. But it turns out that people on the internet do weird stuff. And we would scrape the internet, and one site would just never use a closing tag for anything. It would only open things. And that was blowing up our huge machine. It was really not great to debug that. Like, your machine just dies, and you have who knows how many processes scraping the internet, and go figure out what happened.
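A sketch of that leak shape and the usual fix, with made-up function names:

```haskell
import qualified Data.Vector         as V   -- boxed: each element is a pointer, possibly to a thunk
import qualified Data.Vector.Unboxed as VU  -- unboxed: elements are stored evaluated, no thunks

-- Mapping lazily over a boxed vector leaves a thunk per element
-- until something finally demands it.
boxedSums :: V.Vector Double -> V.Vector Double
boxedSums = V.map (\x -> x * x + 1)

-- The typical fix: switch to an unboxed vector (or force elements as you go).
unboxedSums :: VU.Vector Double -> VU.Vector Double
unboxedSums = VU.map (\x -> x * x + 1)
```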
JB (0:17:52): So, how do you go about this – well, I guess you know when you discover a space leak, but how do you go about finding out what the causes and fixing it? I mean, if you’ve done it 10 times, then I guess that’s something you can tell our interested listeners in how to fix basics.
R (0:18:08): Yeah. I don’t want to confuse the listeners with my outdated knowledge. I think these days we have great tools from Well-Typed and other folks. And this was quite a while ago, and I wasn’t really using those tools. So, I would print the PNG file that shows how many allocations of what you have. And if you have, for example, way too many Ints, then you know it’s some simple thunk problem, where something accumulates within a single thunk. But if you have lots of lists and, inside of lists, there is some pinned memory or something of this sort – well, not inside of lists, but alongside the lists there is some pinned memory, it’s much more complicated and it’s harder to figure out what’s going on. And in some cases, Haskell doesn’t even show you some of this information.
I remember that I had one space leak with the async library, and it was leaking space, stack space of a thread, and Haskell just wouldn’t show me that on the graph. You need some specific argument for it to even get shown. And so, you just look at every single line of code and try to figure out what’s going on and what can possibly go wrong. It’s a lot of trial and error. And I’ve never really developed any actual methodology for fixing space leaks. You just go and try to figure out whether things as written are correct in terms of operational semantics. It’s a general problem with Haskell, I think, that people are really way too focused on the denotational semantics, and then you get to the operational one, and things just don’t behave as you would expect them to. Like, for example, the monad law that bind followed by a return is basically identity—but this is not necessarily the case operationally if you misdefine something, or even if you define it properly but something does not get inlined, et cetera, and you just retain the left argument until you get to the right one. I don’t remember the exact details. There was this space leak in transformers, in State I think, where something is retained way too long just because of how bind is defined.
WS (0:20:41): And is the hard thing actually diagnosing what’s causing the problem, or was the hard thing then fixing, coming up with a fix?
R (0:20:50): It’s always diagnosing the problem. Coming up with a fix was usually fairly straightforward. Sometimes—pretty often—you just insert a single bang somewhere, or you import Data.Vector.Unboxed instead of Data.Vector, or something like this. Or you import a strict State monad. And like, why is it lazy by default? Who wants lazy State by default?
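A small sketch of those two fixes (strict State plus modify', and a single bang), assuming mtl; the functions are illustrative only:

```haskell
{-# LANGUAGE BangPatterns #-}

import Control.Monad.State.Strict (execState, modify')
-- With Control.Monad.State.Lazy and plain 'modify', the counter below would
-- accumulate a chain of (+1) thunks; the strict monad plus modify' keeps it flat.

countTo :: Int -> Int
countTo n = execState (mapM_ (\_ -> modify' (+ 1)) [1 .. n]) 0

-- The other typical one-character fix: a single bang to force the accumulator early.
sumStrict :: [Int] -> Int
sumStrict = go 0
  where
    go !acc []       = acc
    go !acc (x : xs) = go (acc + x) xs
```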
WS (0:21:16): So, this was number crunching on the Amazon machines. And you then left this job again to do something else, yeah?
R (0:21:26): Yeah. And that was IOG. We have this Plutus compiler for running smart contracts on the Cardano blockchain. And it’s really a dream job by all standards because I’m working with compilers, I’m writing Haskell, and our compiler isn’t a messy one. We are pretty much forced into making it as small as possible and as well-behaved as possible because we have theorists who just prevent you from doing stupid stuff. That happens all the time. I propose something, and then one of our theorists says no. And they are right about that.
JB (0:22:09): Is that classical no or constructive no?
R (0:22:14): Either way. It actually – it sometimes comes with an example of why no, but sometimes it doesn’t. It’s just, “No, you cannot do this. Sorry.”
WS (0:22:25): So, can you tell us a little bit about the Plutus compiler and what it does and how it’s built?
R (0:22:30): Yeah. So, we have a tower of languages. We have the underlying language, which is called Untyped Plutus Core. It’s the one that runs on the actual Cardano blockchain. It’s a fairly simple lambda calculus language with data types and without any explicit recursion. We use recursion combinators, like fix—except not exactly fix. So, yeah, it’s a very simple language, and we have an evaluator for it. We use the CEK machine. We hope to get some bytecode evaluator sometime, but so far, our priority is mostly getting things right instead of getting them really fast, because contracts written in our language control millions, maybe even tens of millions of dollars. And you really don’t want to screw anything up. You want to be sure. So, we have a formalization of the language—of the untyped one and the typed one, which I will tell you about a bit later. And we have plenty of tests, et cetera. And there is always this looming danger of screwing up something just a little bit, but enough for somebody to lose a lot of money. So, we are really careful.
And I think the worst part of this setting isn’t even the responsibility, but the fact that you can’t just go back and fix something once you realize that something got broken, because it’s the blockchain. And things are supposed to evaluate the same way in the future as they evaluate right now. We have some language versioning, et cetera. But in general, simply fixing a bug can be a really tricky thing for this reason.
WS (0:24:14): So, you mentioned the untyped layer, and then presumably there’s something typed on top of that.
R (0:24:20): Yes. We have a few languages on top, and at the very top, there is Haskell itself. We use GHC Core. So, we have a plugin, a GHC plugin, that takes GHC Core and compiles it down to our intermediate representation language that has explicit recursion. No recursion combinators, nothing like that. So, just normal data types. And then we compile it to a System F-omega language that no longer has explicit recursion, and then we get the untyped language out of that.
JB (0:24:55): When you say it no longer has explicit recursion, you mean like bindings that can refer to each other?
R (0:25:02): Yes.
JB (0:25:03): Okay. Because to me, that sounds a little bit like implicit recursion—you can just recurse—and explicit would be the one with the fix. But you’re saying you go from bindings that are in scope in their own body to combinators there.
R (0:25:14): Yeah. That’s correct.
JB (0:25:15): Okay. Just making sure that.
WS (0:25:17): And how do you find writing a compiler?
R (0:25:20): It is fantastic. It’s a dream job. I think for some people it’s like writing games—creating games is why they program. For me, it’s compilers. So, I love compilers. One infinitely painful problem is name handling, and it always pops up. And I always ask compiler people how they do name handling in their compilers, and I always hear different answers. It’s like de Bruijn indices, de Bruijn levels, locally nameless, and bound, Kmett’s library. What we use is global uniqueness, the same thing that GHC uses. We had horrible bugs. It’s like a Haskell equivalent of a segfault, basically, when some variable suddenly becomes a different one after 10,000 tests, and you get some horrible long output and go figure out what was going on. And in general, it’s even more painful than space leaks. And you just really have to go through all of your related code and look at each line, and try to understand what can possibly go wrong there.
I know that there is this Foil paper that is supposed to fix this issue. I really want to try it out sometime. Because if it fixes this issue, then it alleviates the majority of the pain of developing the compiler, for me at least.
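A tiny illustration of the name-handling bug class, using a made-up mini-language rather than the Plutus AST:

```haskell
-- Naive substitution under a binder can capture a free variable.
data Term
  = Var String
  | Lam String Term
  | App Term Term
  deriving Show

-- subst x s t replaces free occurrences of x in t with s.
subst :: String -> Term -> Term -> Term
subst x s (Var y)
  | x == y    = s
  | otherwise = Var y
subst x s (App f a) = App (subst x s f) (subst x s a)
subst x s (Lam y b)
  | x == y    = Lam y b
  | otherwise = Lam y (subst x s b)
      -- BUG: if y occurs free in s, it gets captured here. Schemes like
      -- de Bruijn indices, locally nameless, the bound library, or global
      -- uniqueness all exist to rule this class of error out.
```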
WS (0:26:55): Yeah. We had Edwin Brady on a few weeks ago, and he mentioned that for Idris 2, he made the choice to use at least a well-scoped syntax because he felt that there’s just so many bugs that we’re creating by doing some optimization thing or transformation, and then screwing up our binding structure or creating unbound variables. And even just having that check, not even going so far as using well-typed syntax, but just well-scopedness and being able to enforce that very strict discipline, he found it very useful.
R (0:27:34): Yeah. I found it very useful when I was writing Agda because it’s the default thing that you do there. And most of the time, you don’t even need to think when you develop something. You just mechanically satisfy the type signature that you’re given. It’s amazing, but I think it can be really slow. Probably, for our use case, it wouldn’t be a problem. I think we could do it. And our smart contracts tend to be fairly small, and we can’t even support huge ones. So, I guess for our case, it could work, but for a serious enterprise production compiler, I think this may not be such a good option, simply because it’s really hard to optimize something that has those, all those intrinsic invariants that can be really hard to maintain.
WS (0:28:19): Is that something that might be erased though? Is it – that’s what you’d always hope, right? That you have some types flying around which enforce some static structure. But after a while, when you run it through the compiler, you end up with just say de Bruijn indices.
R (0:28:39): Sure. I think it’s just that it’s harder to do that when you have complex invariants. Like, for example, consider everyone’s favorite example of vectors—so, lists with their lengths reflected at the type level. It seems like such a fairly trivial example that would be easy to handle, but then you get to defining something like reverse, and then you write an entire paper about that. Great paper, by the way.
WS (0:29:10): Yeah, that’s true. But this is always the trade-off, right? That you choose some invariants, and then by fixing these invariants, you have some operations which are easy to support. Like, I know mapping over a vector, right—once you write down the type, the function writes itself, quite literally. But then when there are other operations that you want, which may still respect the invariant, but only for a very subtle reason, then all of a sudden, as a programmer, you have to do a little bit more work to reason about why this works out at all.
R (0:29:49): Yeah. And sometimes it’s really hard to produce something efficient in the end. And sometimes you need a non-existent feature, like when you need to define filter, say, for vectors, and then you need those fancy existentials that Richard Eisenberg had in his paper, which would, by the way, be really great to have. So, fancy types have this cost of potentially tripping you up down the road when you need the efficiency. And we do need the efficiency. We’re just moving really slowly in this direction because we don’t want to introduce any bugs.
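The standard length-indexed vector example referenced here, sketched in Haskell (not Plutus code): map's type practically writes the function, while filter already needs an existential wrapper:

```haskell
{-# LANGUAGE DataKinds, GADTs, KindSignatures #-}

data Nat = Z | S Nat

data Vec (n :: Nat) a where
  VNil  :: Vec 'Z a
  VCons :: a -> Vec n a -> Vec ('S n) a

-- map preserves the length, so the index takes care of itself:
vmap :: (a -> b) -> Vec n a -> Vec n b
vmap _ VNil         = VNil
vmap f (VCons x xs) = VCons (f x) (vmap f xs)

-- filter cannot say statically how many elements survive, so the result
-- length has to be hidden behind an existential wrapper:
data SomeVec a where
  SomeVec :: Vec n a -> SomeVec a

vfilter :: (a -> Bool) -> Vec n a -> SomeVec a
vfilter _ VNil = SomeVec VNil
vfilter p (VCons x xs) =
  case vfilter p xs of
    SomeVec ys
      | p x       -> SomeVec (VCons x ys)
      | otherwise -> SomeVec ys
```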
WS (0:30:27): So, you mentioned something about debugging the compiler and that you have a big test suite. Is that something where you also do property-based testing or fuzzing, or how does that work?
R (0:30:38): Yeah, we have property-based testing, and we have unit testing and golden testing, all of those. For property-based testing, we have term generators. We actually have like three different ways to generate terms for our languages. Some of those generate well-typed terms. Some of those generate terms that are well-typed but very simple, just to evaluate something. So, they don’t have any fancy polymorphism, any fancy recursion, nothing of the sort. Just evaluate something. We also have a generator of ill-typed terms, just so that we can check some transformations—for example, that serialization followed by deserialization is a round trip. But the most interesting generator was contributed by Quviq, the company that consulted for us. And it generates really fancy terms with polymorphism and mutual recursion and everything—data types, plenty of fancy stuff. And it’s really hard. They do unification there in their generator. Its complexity is comparable to the complexity of our language, basically. It’s that hard. But it works great. It actually does catch bugs. It has caught maybe two or three bugs already. So, it’s extremely helpful. And we are talking here about bugs on master, not when you develop things. Then it’s dozens of bugs, probably.
Also, it has an amazing shrinker. So, when you get a counterexample, it’s usually fairly small. We just recently got a counterexample that was maybe like half the screen, probably even less, and it really does help to identify a bug. So, without the generator, we would probably never realize we have a bug there. And without the shrinker, we would probably never realize what causes the bug, which actually is an ongoing investigation, but at least we have a fairly small counterexample that we can actually study.
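A toy sketch of such a round-trip property with a generator and shrinker; the Term type and serialiser here are stand-ins, not the real Plutus Core ones:

```haskell
import Test.QuickCheck

-- A made-up little term type standing in for the real AST.
data Term = Var Int | App Term Term
  deriving (Eq, Show, Read)

instance Arbitrary Term where
  arbitrary = sized go
    where
      go 0 = Var <$> arbitrary
      go n = oneof [Var <$> arbitrary, App <$> go (n `div` 2) <*> go (n `div` 2)]
  -- A shrinker keeps counterexamples small, as described above.
  shrink (App f a) = [f, a]
  shrink _         = []

-- Stand-in serialiser: Show/Read instead of the real encoder.
serialise :: Term -> String
serialise = show

deserialise :: String -> Maybe Term
deserialise s = case reads s of
  [(t, "")] -> Just t
  _         -> Nothing

prop_roundTrip :: Term -> Property
prop_roundTrip t = deserialise (serialise t) === Just t

main :: IO ()
main = quickCheck prop_roundTrip
```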
WS (0:32:46): But it sounds quite hard because you have non-trivial lambda calculus, and then you want to generate random or somewhat random well-typed terms. I think that that’s like a known difficult problem, right? That if you just generate, kind of throw together a bunch of random constructors, the odds are that that’s not going to be well-typed or even well-scoped. How do you fix that?
R (0:33:12): We just hope it’s okay. So, that’s our strategy, really. In the ideal world, we would have some kind of metric that tells us how interesting a term is—like how often variables get used, whether they get used multiple times, how the thing evaluates, whether stuff just drops and you ignore 90% of the term. But we are not in the ideal world. We have to ship feature after feature after feature, and we don’t have that much time for testing. And so, we do try to claw back as much time as we can, but we don’t have time to come up with some really strong testing strategy that itself is tested and maybe formalized or something else. So, I think there exist papers about the topic, and the Quviq people really did a great job on that thing. So, they helped us a lot, and they defined some pretty great generators. They catch things. That’s all we care about. But in the ideal world, we would do a much better job, I think.
WS (0:34:17): So, you’ve done quite a bit of programming in both Haskell and Agda, and it’s interesting to hear how you work in these languages, right? You have slightly different IDEs or different features. How would you compare the two?
R (0:34:34): Yeah, that’s a great question. Agda forced Emacs upon me. And it’s been painful, but after some years, you get used to it. The best feature of the Agda mode in Emacs, by far, is typed holes. Haskell also has typed holes, but they are nothing like the ones in Agda. In Agda, each typed hole is like a separate GHCi session, except you also have access to all the local variables. You can type check things inside, you can evaluate them, you can normalize. So, you can do whatever you want. And you can also fill the hole with an expression that has a bunch of other holes within it. And it’s such an amazing experience. You do not normally even re-typecheck the file as you write—you do not re-typecheck it on save, for example. You just fill the holes, and then you fill a hole with an expression that contains another, say, two holes. And then you go into those and fill them as well. And you also have Agsy, which is auto for Agda, that generates expressions for you. It can also generate some boilerplate pretty quickly. It’s such an amazing developer experience that I still find it really hard in Haskell not to have that. When I moved from Agda to Haskell, this was my biggest pain by far—that I cannot develop things this way, where I never need to re-typecheck anything. It just magically works.
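For comparison, what a typed hole looks like on the Haskell side; this deliberately contains a hole, so GHC reports it as an error (or a warning under -fdefer-typed-holes) rather than opening an interactive session:

```haskell
-- GHC reports the type wanted at _ together with the relevant local bindings,
-- but each refinement means re-typechecking the module.
revAcc :: [a] -> [a] -> [a]
revAcc acc []       = acc
revAcc acc (x : xs) = _    -- GHC: Found hole: _ :: [a]  (relevant: acc, x, xs)
```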
JB (0:36:06): It sounds a bit like tactic programming in Coq or Lean, in the sense that they have these holes that you’re filling, and you can use some automation, you can give a partial term, and you get subgoals. But I guess the big difference is that in Agda, what you get out is still the final proof term—or I guess you would just call it a program. Whereas in these tactic worlds, what you leave in there are the steps you took to get there. So, refine this hole with that, refine that hole with this.
R (0:36:33): Yeah, exactly.
JB (0:36:34): And I guess sometimes one of these is nice, sometimes the other one is nice.
R (0:36:37): There is this quote by—I forgot who—that programming in Coq is like doing brain surgery over the telephone. And I think it’s extremely accurate. It’s exactly how it feels. You do not write a program. You write steps to produce a program, and then you are left with those steps. I think that in general, in Coq, they may be not as interested in the program. They’re mostly interested in proving that something holds, and then the term doesn’t matter. It’s even more helpful to have the steps, because they can be smaller than the final proof term. But in the programming world, you want to have an actual program that you create and that you put somewhere, and people can review it, et cetera.
WS (0:37:22): I think the quote is by Peter Hancock. So, we should attribute it to him, I think.
R (0:37:28): Thank you, Peter.
WS (0:37:29): There’s a slight difference, though, in that there’s a certain maintainability question that I sometimes have with—I’ve done my Agda proof or program, and I’m so smart, I’ve filled in all the holes—and then I change a definition somewhere or something changes in the development. And then it’s actually quite hard to fix, because part of the history of the program has been lost, right? I’ve lost how I got to this answer, namely doing induction on this thing and then filling in this lemma partially and then figuring out—okay, you’d want to record that. Whereas in tactic-based languages, sometimes you can change part of your proof or change part of your theorem, and hooray, your tactics still find a proof, right? It is a little less brittle in a way.
R (0:38:20): Yeah, fair enough. But I do programming, and even in Agda, still, the program is not proven. So, for me, it’s typed holes all the way. And it’s not the only difference, actually. So, for example, in Agda, jump to definition works by default for any definition, out of the box. It just works. You can jump to the definition of a local variable, or you can jump into a dependency or into a normal top-level definition. And I know that HLS has jump into top-level definitions, and they probably are adding jumping into dependencies, but I guess they don’t have jumping to local variables. I’m not entirely sure. But in Agda, it just works. I never needed to configure anything, never need to worry about anything. It just works, and that’s it. And also, Agda has case split, which is extremely helpful. HLS used to have it, but I think it got bit-rotted and it no longer exists.
JB (0:39:20): Can you explain what case split is for those poor Haskell people who’ve never seen it?
R (0:39:23): Oh, yeah, sure. So, when you have a data type and it’s some kind of a sum type—it has multiple constructors—you often want to define a function over the data type by matching on every single constructor and handling all of them in some way. And oftentimes it’s boilerplate to write all those constructors with their arguments. And in Agda, you can just hit—I think it was Control-C Control-C—and Agda will generate all the matching for you automatically. And you can even associate a name with every single argument of a constructor, and Agda will insert those names, not some arbitrary ones. So, it really does help to define functions over complex data types, and even simple ones. It really does save time.
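Roughly the boilerplate a case split writes for you, transcribed by hand into Haskell over a made-up expression type:

```haskell
data Expr
  = Lit Int
  | Neg Expr
  | Add Expr Expr

-- In Agda, hitting C-c C-c on the argument generates the constructor
-- skeleton below; here it is written out manually.
eval :: Expr -> Int
eval (Lit n)   = n
eval (Neg e)   = negate (eval e)
eval (Add l r) = eval l + eval r
```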
JB (0:40:14): Sounds a bit like GitHub Copilot.
WS (0:40:16): Yeah. So, there’s an interesting story here. I was recently pair programming with someone who was learning Agda, and they were an avid Copilot user. Where I use Agda in a certain way, right—where I start by writing out my type signature, and then I write my function and put a question mark on the right and say, “Oh, let me load this up,” and then choose what to do induction over, and then gradually drill my way down, filling in these holes one by one—Copilot is completely different, right? You start typing, and then it says, “Did you mean this?” And then he spent an equal amount of time as me, but debugging the suggested solution: no, no, no, this won’t work, right? This isn’t type correct, or whatever.
So, it’s a very weird experience for me because I was used to using Agda in this very structured way where you think about like, “Oh, okay, how am I going to develop my program and do case now on this or induction on that?” And then here all of a sudden, it was like, here’s my best guess, and now you go fix it. And it’s – yeah. I don’t know. It’s interesting to see how this works.
R (0:41:30): Yeah. I wired Copilot into my Emacs just to see how it works, what the experience is. It really produces a lot of garbage, and sometimes it generates like 10 lines that don’t make any sense at all. And it distracts you—it takes you out of your flow, kind of, so that you need to tell Copilot not to do that, this is not the correct thing, try again. But what really put me off is that it would insert very subtle bugs into my code. One time I was optimizing a piece of code, and I was looking at GHC Core, and I realized that the Core was not correct. I was like, “What happened? It should be correct. Did Haskell compile it incorrectly? Or what happened?” And then it turned out that it wasn’t correct—Copilot had generated a piece of code that wasn’t correct. Even though it was fairly obvious, I reviewed it and I missed it. So, I guess I’m better as a programmer than as a reviewer. For this reason, I turned Copilot off, and I only use it now to generate completely trivial boilerplate that I want but don’t want to type myself.
WS (0:42:41): Yeah, that makes sense. I mean, these bugs can be quite subtle, right? And then maybe the definition looks okay, but I was surprised at how well it could do certain things. I think we had something fairly like a zipper on some binary tree-like data structure. And I thought this has been – of course, it’s probably been written like dozens of times, but still, it immediately gave something which was almost right. And I was quite surprised at how quickly it actually manages to find these things, even though my intuition is that the amount of training data can’t be that huge, right? This is a fairly specific example here.
R (0:43:24): Yeah. I had one case where it just realized what I was doing and inserted the completely right error message. And it would take a human some time to understand what was going on. It was fairly context-dependent, and somehow it figured it out completely, assembled everything together, and created a great error message. That was a great experience.
WS (0:43:46): Yeah. Great. Okay. So, you’ve done a lot of work on the Plutus compiler and it’s become a sizable project, right? I mean, it’s a non-trivial amount of code by now. Are there other performance bottlenecks that you run into there, or does it just work out of the box?
R (0:44:05): It never works out of the box. You always have to optimize something. One time we just removed array lookups from our evaluator, and that sped up the entire evaluator by 10%. And an array lookup is really supposed to be something that is fast. But there is this layer of bounds checking—which I’m glad that we have, since segfaults are not nice. But still, somehow, even the most basic operations can be really slow. But the biggest pain that I’ve had with Haskell is that GHC sometimes just refuses to do what you tell it to do. It would stubbornly insert join points when I want a let. When I really want a let, I write a let. Why do you insert a join point there? It’s supposed to be a thunk. It’s a shared value. So, this is one thing that GHC is really stubborn about. But it also really likes to change the semantics of your code, which is really surprising to me, because we are functional programmers, we are supposed to respect the semantics. But GHC by default has the pedantic-bottoms flag turned off, and it means that GHC can just turn one piece of code into another.
And the biggest surprise that I had was that a strict let is not the same thing as a strict case. So, you write a strict case without any matching—you just insert a bang there—and it’s not the same thing as a strict let. GHC views these two differently and optimizes them differently, for example.
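The two idioms being contrasted, as a sketch rather than the actual Plutus code:

```haskell
{-# LANGUAGE BangPatterns #-}

-- A strict let...
viaLet :: Int -> Int -> Int
viaLet a b = let !s = a + b in s * 2

-- ...versus a "strict case" with a single bang pattern and no real matching.
-- They mean the same thing here, but GHC's optimiser can treat them differently.
viaCase :: Int -> Int -> Int
viaCase a b = case a + b of !s -> s * 2
```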
WS (0:45:49): So, you mentioned this transformation of semantics and pedantic bottoms. So, what’s going on there?
R (0:45:57): So, for example, you have an expression like let x = y in, and then a lambda. And GHC can move the lambda outside of the let. So, denotationally, I guess it’s the same thing, because Haskell is lazy and the let binding is not going to be forced. But operationally, it’s a very different thing. I want this binding to exist outside of the lambda, so that whenever I enter the lambda, the binding is reused. And it’s crucial for performance in our case, because it contains the value of some pretty complex work that we spend a lot of time doing. So, we want to share this work. And it took me a lot of time to figure out, firstly, what was going on, and secondly, how do I even fix it? Because GHC just does it, and I wasn’t aware of the flag. I had to actually read through the list of all the flags to realize that it was there. And then it turned out that the flag wasn’t as reliable as just using a strict case instead of a strict let.
JB (0:47:02): I think those might be slightly different issues. So, as far as I know, pedantic bottoms—turning it off, as is the default—really allows the compiler to do optimizations that, in some cases, can turn a non-terminating program into a terminating one. So, it’s not preserving the semantics; it’s making the semantics better in a way, if you assume that you usually want things to be terminating. But I think the thing you described right now is more about—I think it’s called the full laziness flag—about how liberally it can float things in and out. But it shouldn’t affect the bottomness in this case. The problem you’re describing exists, but I think it’s a different flag that it’s behind.
R (0:47:37): Right. I’m pretty sure we also ran into the pedantic bottoms flag. I’m not entirely sure in what case. But yeah, you are right that this is full laziness, not pedantic bottoms. Thank you for the correction.
JB (0:47:50): Oh, no, maybe you’re right. The difference, now that I think about it, is the difference when you take the function and then you seq it—so you force evaluation of this functional term. I mean, it’s a let and a lambda, so there’s a function there. And I guess sometimes floating in something that’s a strict binding would make something that, if you force it, will cause the binding to evaluate, and then look at the lambda. And if the binding diverges, you get bottom. But if you move the binding inside the lambda, then seqing the whole thing just does nothing, because it’s a lambda—it’s a value. And this way, we’ve just turned a bottom into a non-bottom. So, maybe you’re right after all.
R (0:48:31): Right. So, full laziness is going outside of the lambda, and with pedantic bottoms—with pedantic bottoms turned off—you can go inside of the lambda. Yeah, something like this. So, basically, it’s complicated. As listeners can tell, it’s sometimes really painful to just force GHC to do what you want it to do.
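A sketch of the sharing pattern this exchange is about; buildTable here is a made-up stand-in for the expensive work, and flags like -ffull-laziness and -fpedantic-bottoms influence where GHC ends up placing such bindings:

```haskell
-- The intent: the binding is computed once, outside the lambda, and then
-- reused on every call of the returned function. If GHC floats it under the
-- lambda, the work is redone per call and the sharing is lost.
mkLookup :: [Int] -> (Int -> Bool)
mkLookup xs =
  let table = buildTable xs      -- intended: evaluated once, shared by all calls
  in  \i -> i `elem` table
  where
    buildTable = map (* 2)       -- stand-in for "pretty complex work"
```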
JB (0:48:51): So overall, for the last hour you’ve been mostly praising Agda and mentioning a few things where Haskell is letting you down. But at the same time, I get the impression that you’re actually quite happy to use Haskell. Is this just that Haskell is the least painful thing to use that you actually get paid for? Or do you, despite all the bad things you said about Haskell in the last hour, actually enjoy Haskell? And how would Haskell have to change in the future, or what changes do you maybe look forward to, to make it even less painful?
R (0:49:29): I absolutely do love Haskell. I don’t do it just because I’m paid for it. I actually do love it. It’s a great language. I love its semantics, and I love that it’s lazy. I love that it’s pure, and the way it does effects—all of these things, I love the language. My only real negative is just that I want Haskell to get proper typed holes, and then I will be completely pleased with the language. I would also really like Hackage to be strict. So, I love that the language is lazy, but I want strictness by default in many, many parts of Hackage. So, State should be strict by default, transformers should be strict by default generally, and containers should be strict by default, like Data.Map, Data.IntMap, et cetera. We should have some easier access to strict tuples and Maybe, et cetera. I just don’t understand the obsession with making everything lazy, because it bites me a lot when it comes to space leaks. And strictness just doesn’t seem to cause these problems when it’s in libraries. When it’s in the language, it’s pretty problematic in my experience, because our language is strict, and we’ve run into quite some problems because of it. We cannot move things around easily, and it’s hard to control the semantics because of it. So, yeah, I love that the language is lazy, but I would love to have more strictness in libraries by default.
JB (0:50:57): This StrictData language extension—are you essentially saying this should be the default? Data types are all strict unless you ask otherwise, but functions are lazy unless you change it, or –
R (0:51:08): That’s what I do in my projects, yes. I just enable StrictData, and I only make things lazy if I need that.
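A minimal example of what that looks like:

```haskell
{-# LANGUAGE StrictData #-}

-- Under StrictData, constructor fields are strict unless explicitly marked
-- lazy with ~, while functions keep their usual lazy semantics.
data Account = Account
  { balance :: Int       -- strict field by default here
  , note    :: ~String   -- opt back into laziness per field
  }
```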
JB (0:51:15): Okay, Roman, that sounds like Haskell’s good, but it could be strictly better?
R (0:51:21): A good one.
JB (0:51:24): Yeah. I guess with that, thanks for all the insights, and I’m sure at least the Agda users in our audience will be quite happy about the episode. It was fun having you on the show.
R (0:51:35): Thank you very much, folks.
Narrator (0:51:37): The Haskell Interlude Podcast is a project of the Haskell Foundation, and it is made possible by the generous support of our sponsors, especially the Monad-level sponsors: GitHub, Input Output, Juspay, and Meta.