Andrey Mokhov

Transcript

This transcript may contain mistakes. Did you find any? Feel free to fix them!

Matthías Páll (0:00:19): Welcome to this episode of the Haskell Interlude. Today’s guest is Andrey Mokhov, who’s best known for his work on the Hadrian build system. Today, we’ll talk about algebraic graphs, selective functors, and the differences between OCaml and Haskell.

Welcome to this episode of the Haskell Interlude. My name is Matthi. My co-host today is Andres Löh.

Andres Löh (0:00:41): Hi.

MP (0:00:42): And we’re joined today by our guest, Andrey Mokhov…

Andrey Mokhov (0:00:44): Hello.

MP (0:00:45): A software engineer at Jane Street and visiting fellow at Newcastle University. Andrey is known for his work on algebraic graphs and selective functors and is the main author (I hope I got that right) of the Hadrian build system that GHC uses, as of the latest version, for all its artifacts and binaries. Welcome, Andrey.

AM (0:01:05): Thank you very much. It’s great to be here. I’d like to like start by saying that maybe I was the original author of Hadrian, but since then, so many people contributed that I no longer can honestly claim to be the main author. The main author is the GHC team now and I’m very happy about it.

MP (0:01:23): Let’s start off with the backstory, right? So how did you get into functional programming in Haskell?

AM (0:01:28): That was an unexpected transition for me. I was an academic and I’ve been at Newcastle University for a few years when I, by accident, started looking at functional programming. I was in research on a very niche but very interesting area. It’s asynchronous circuits, so circuits that don’t have a global clock signal and they coordinate various parts of the system, coordinate with handshakes asynchronously. My PhD was about figuring out some mathematical models for describing the behavior of asynchronous circuits, and I came up with the idea of algebraic graphs. And at that time, I was a very simple programmer. I was using very simple tools like C++ because of performance and I didn’t really learn anything else. And I tried to express algebraic graphs in C++ and I ended up with some five files to describe very simple expression language, and it was very unsatisfactory.

AL (0:02:23): Really? Sorry for interrupting. This is already very interesting because I always kind of assumed that algebraic graphs arose out of having Haskell and trying to find a suitable description for graphs, but it was really the other way around. You had the idea–

AM (0:02:37): Yes, it was the other way around.

AL (0:02:40): Oh, okay, you go on. Sorry.

AM (0:02:43): Yeah. So I can tell briefly that basically, I was trying to find a simple language for composing behaviors in sequence and in parallel, very simple basic blocks for describing concurrent behaviors. And I came up with this little language, this little algebra. And then I was trying to encode it and like I was trying to learn tools around that into language. And that ended up being quite painful in C++. As you can imagine, you would probably reach out for some visitor patterns and some hierarchy of classes where you try to express every operation in the algebra as a subclass. And that was very verbose and very painful to program. Fortunately, I had friends who knew Haskell and they showed me the delight. My friend just showed me how to write those five lines into like 20 lines of Haskell. And that was amazing. And yeah, I fell in love. I think maybe it is trying to say that I fell in love with algebraic data types. It’s probably incorrect to say that I fell in love with functional programming because the main feature I was using at that time is just algebraic data types and it’s such a simple idea. It surprises me that it’s so still underused in many programming languages.

MP (0:03:53): And where did it go from there? I mean, you didn’t stop at algebraic graphs, right?

AM (0:03:57): Yeah. So at first, I just started to translate some of my ideas to Haskell and I wrote a couple of blog posts. And surprisingly, I got quite a lot of reaction from the Haskell community. People seemed to like this idea. Well, I started to engage more with the community. I attended Haskell eXchange, I attended ZuriHac and I just enjoyed being in this community. And at some point, I guess I started to learn more and more and got pulled into this world.

AL (0:04:21): Did the Haskell implementation of algebraic graphs actually also managed to solve your original problem that motivated the whole development, or did it sort of turn into its own thing?

AM (0:04:34): Yes, it somehow consumed me in the sense that it moved me away from my original research. So in some sense, it was a success for me personally. It developed me as a programmer, it developed me as a researcher, but the original research situation was kind of forgotten by that time. And I still have in my long to-do list somewhere and I still have things I want to finish from that old research area where I still want to implement some of the ideas I had at that time. But so far, behind whatever I’m doing right now, it’s probably not going to find time to do it anytime.

MP (0:05:09): Right. So we don’t have a Haskell implementation of asynchronous circuits.

AM (0:05:14): Yes. Well, I actually had a PhD student who was looking into that and he came quite close. But I know PhD study is a short-term project. So he implemented a good library, but he moved on to other things. So there’s still a lot of things to do. And I’m no longer in academia so I can’t supervise the project on this. So maybe it’s just going to die away, those ideas.

MP (0:05:35): So after that, did you do the selective functors right after that or did you start with the build system, or what happened?

AM (0:05:41): No, the rules of selective functors are also in my academic work, maybe also in my PhD study. So it’s also a bit interesting. So maybe it’s also a bit unexpected that the rules are actually there in the asynchronous circuits. But first, from algebraic graphs, I transitioned to Hadrian to work on the build system for GHC. And that happened just by chance. I was working in academia, so I had access to some visiting researcher schemes. So I got some funding to spend half a year at Microsoft Research in Cambridge. Simon Peyton Jones was working there, and I was very much into Haskell in those days. I just sent an email to Simon explaining that I have– he has an opportunity to have me for half a year and I can do whatever he likes basically. And it was his idea to reimplement the GHC’s build system in Haskell. His and Neil Mitchell’s, because Neil Mitchell was working on the Shake library at the time, seemed like a great fit for this idea.

MP (0:06:33): Right. Yeah, because I remember when I was getting into working GHC, the Make system, it was a work of art. There were so many Make files and they all interacted in all these ways. And then you had to treat it like a black box. You kind of, if I change this, what is the effect? And then you come back to it. But now it’s all kind of principle. It’s very nice. It’s very fast also. It’s incremental when it does all these– yeah, it’s a whole new life for GHC developers, I think, right?

AM (0:07:04): I’m glad you think that. I think there are probably still a lot of sharp corners in the new build system, and the terms are that build systems are just hard obviously, like moving from a language which doesn’t have types and where everything is a string, which is spliced into other strings. Make file to something like Haskell is already a big step towards being able to understand what you’re working with. But even after we’ve done that, it’s still big. So we replace 10,000 lines of Make files with 10,000 lines of Haskell code. It didn’t become smaller. So Make files are already quite terse and it’s just, you know, GHC is a complex system and requires a complex build system. So it’s still going to be like– I’m afraid it’s still going to be a black box for some people who don’t have time to really dive into it and understand the details. But hopefully, it’s going to be easier to dive and understand the details now that it’s in Haskell.

MP (0:07:54): Okay, but you kept working on build systems, right, after that?

AM (0:07:57): Yeah, and that kind of surprised me. And I think if I look back, I think when Simon proposed me this project, I thought, “Okay, this sounds odd. I’ve never really worked on build systems. I’ll try to finish it in one month and then I’ll switch onto something more interesting.” So the project ended up being complete in some sense, like only eight years after it started. Eight years since it started, we finally removed Make files from the GHC tree and only the Haskell version remains. This is a lesson for me that build systems are very hard to migrate. And I’m doing a similar project right now, so it’s also migrating from one of build systems to the next, and it takes similar timescales. It’s just surprising how much complexity there is when you do that.

AL (0:08:41): Perhaps if even you yourself have been falling into this trap of underestimating the vast amount of work that is involved, perhaps you can also recap again for our audience a few of the highlights of unexpected complexity that came up over the years.

AM (0:08:58): Yeah, sure. And first of all, one source of complexity I think is that build systems are like a collection of knowledge artifacts written by dozens of people. And those people move on, they leave the community, they forget things. So when you start with a task of rewriting the build system in a new language, you first need to understand what’s there. And that is the first challenge, which is very hard. It basically requires you to go ahead and talk to lots of people and also rediscover things on your own because–

AL (0:09:27): Basically, just reverse engineering all the stuff that is there because, in particular, the Make files, like Matthías said, they are sort of working by magic, but there is no idea documented like what they’re actually doing or why. By the way, is that better now would you say in the Haskell world? I mean, is it sort of all self-documenting and explaining what it’s doing?

AM (0:09:53): In some places, it’s better but sometimes you just come across some conditional which passes some flag under certain obscure condition. And yeah, it may be documented but it may also be not documented. And then like, why do you pass this flag under this condition? It’s very unclear. You need to actually go ahead and ask people and figure this out. And also, it might be documented but maybe that documentation is just stale. So it’s good to have documentation, but unfortunately, documentation is hard to keep up to date and it’s a real problem, especially with build systems that live for decades and need to change and adapt. So this knowledge discovery is one problem. And the other problem I think is just that build systems are integrated with so many other systems, it takes time to translate all these integrations. There is the continuous integration system which interacts with build systems. There are various editors, there are various documentation generation tools, kind of like systems after systems after systems that you have to migrate.

MP (0:10:49): Right. And so how do you translate that knowledge? Like you just talk to people and you write the documentation and then you implement it or do you just experiment? Or how do you test the build system basically?

AM (0:11:00): Well, you talk to people, but also you just try to convince them to get on board on this project because many parts of the new build system in GHC were just written by other people. They were not written by me. It wasn’t like just me talking to people and trying to extract knowledge from their heads. It must also be like, “Ben, could you try to translate this part of Make files in the Haskell place?” And then Ben would go and spend some time and translate. And this was happening many, many times as well.

MP (0:11:23): Right. I feel like a lot of research projects start out this way. They’re always like, “Ah, I’ll do it in one month. It can’t be that hard, right?” And then I think those are the most interesting research problems, like the things that feel like they should be easy but then they turn out to be very hard, right? But Hadrian is just for GHC, right? Or can I start using Hadrian in my own project or no, it doesn’t work like that?

AM (0:11:48): So there was an idea which is, it is still there. So Hadrian has a small Hadrian library inside which can build Haskell packages. And in principle, we could try to put that library on Hackage and people might want to use it, but it’ll probably require quite a lot of work to make it really usable and well documented. So right now, it’s just buried somewhere inside Hadrian, but we really wanted to separate the generic part from the GHC-specific part. And I think we’ve just failed because of lack of time, but it’s still possible to do it.

AL (0:12:20): Is Hadrian completely separate from Shake now or is there still some common–

AM (0:12:25) : Oh, no. Hadrian uses Shake as a library.

AL (0:12:27): Okay. But there is value you would say for other projects, for non-GHC projects in using Hadrian rather than just using Shake?

AM (0:12:38): Yes. So there is quite a lot of Haskell-specific knowledge in Hadrian. In Hadrian’s library, for example, one could imagine using it from Cabal. So Cabal also needs to build Haskell packages and it could benefit from some of this functionality.

MP (0:12:51): I remember from the talk that Simon gave at ICFP, and he was mostly talking about Excel. That Excel is a build system and how that all tied together. But it feels like there’s a lot of graphs in there. You’re always building some graphs. So did you manage to integrate the algebraic graphs into this or did you have to write something else?

AM (0:13:11): No, I’ve never managed to really start using algebraic graphs anywhere apart from writing papers, to be honest. So actually, this is one of my also goals and that long to the list to actually start using algebraic graphs for real when writing build systems because they’re convenient to work with. But somehow, we never came around to do that.

MP (0:13:32): Right. But back again to the selective functors, I remember that that was also a very nice talk, and I remember I got very excited about it because you presented this new way of writing a kind of do notation, where you branch on monadic stuff, but before you have to run it, you can look deeper into the monads. So could you tell us a bit about what is it, selective functors?

AM (0:13:56): So the idea is that we have applicative functors, which essentially give you a little language for combining values, the bigger values, and these little pieces all have to be known in advance, so you’re kind of, okay, here are 50 little effects and here’s how to put them together in one big effect. But sometimes you want some of those effects to be conditional, and for that, people usually use monads. But monads, they make your structure very opaque. Because monads use functions that take actual values produced by those effects and generate new effects, which means the only way to really get the final structure is to start evaluating some of the effects that are there as a seed sort of.

But going back to my PhD years, I was working on tools that generate circuits, and circuits have this interesting property that everything that there is, all components that you put out in a circuit, they have to be known in advance, right? And only in runtime you’re going to switch on and off some little bits of the circuit to get different functionality out of it. And I wanted to have something similar abstraction in Haskell that basically allows me to specify a circuit upfront and then during runtime to pick different subparts of it for execution. And this abstraction was missing. So there were applicative functors that allow you to specify big circuit out of smaller components, but then all of them will have to run at runtime, so to speak. And there are monads which allow you to do very dynamic specifications. But then look, you can’t really put a monadic description in the circuit because you first need to start running that circuit. So there seemed to be some missing intermediate abstraction. And I was like, “This is what’s happening in the background.”

I was thinking about this idea, but then I started working on build systems and I observed something similar in this new domain. It was that some build systems allow you to– so we wrote this paper, Build Systems à la Carte, that categorizes build systems into two classes, build systems with static dependencies, where you have to declare all the dependencies upfront. Basically, you have to have this dependency graph upfront, build systems like Make. And there are build systems like Shake where this is not necessary and you can start with a seed which is known, but then from that seed, you grow this dependency graph during execution, build systems like Shake. But there are also build systems which allow you to over-approximate your dependencies. So basically, before you start the build, you can tell, okay, here are all the possible dependencies that I might need during the build and maybe you go ahead and download them and you install them on disk, and then you start the build. And only a subset of those is going to end up being used during the build. And here you have again the same flavor. You have an over approximation of everything you might need and you end up using only a subset of it.

My student, Georgy Lukyanov, and I were thinking about this and we were trying to come up with some kind of language to describe this some kind of new small set of primitives that would allow us to describe this sort of system. And yeah, we came up with this idea. So essentially, we wanted some kind of an If Statement. This wasn’t a new idea, so people were exchanging applicative functors with three-valued If-combinator. But what I didn’t like about that approach is that it was tied to Booleans. It was essentially, okay, f Bool and f a, f a gives you an f a. And I wanted something which is not tied to Booleans, which is similar to bind where you have this arbitrary type parameter that you can instantiate to other more interesting types than just Booleans. And so eventually, we came up with this strange operator that takes an Either wrapped in a functor and an optional effect which you need for one of the branches of the Either. And this was sufficient to specify all we wanted, was sufficient to describe all the systems we wanted. And we were pretty happy. It was like, I’m surprised that an idea, which is so simple after you’ve described it, it seems almost trivial. But somehow, I couldn’t find exactly this operator like this anywhere. And so I ended up writing a blog post.

And again, I like the Haskell community because, in this case, it’s where I was trying to publish some ideas, I always was getting some useful response from many people and this was another such case. So I wrote a blog post and I got a lot of responses. It was mostly encouraging, some suggesting some tweaks to the initial abstractions of the proposed, some asking about what the laws should be. And there was very productive discussions around those sorts of questions. And this motivated us to actually turn this into paper. This was almost exactly the same as algebraic graphs. I published an idea and there was a lot of discussion, which forced me to really rewrite that idea in the proper form and submit it to a conference. And the same happened to selective functors. If it wasn’t for the Haskell community, it would never have ended up in a paper. It would just remain the blog post. But the community pushed me to actually formalize these ideas and to present them in a nice format. This led to this paper.

AL (0:18:40): That sounds like a good way, right? I mean, you’re having an idea which is clearly of some merit and then you’re actually not just pretending to know everything already, but helping or using the community to help you to flesh it out and then publish it.

AM (0:18:55): Yeah. I really enjoyed this process. And I think the paper on selective functors, the list of acknowledgments is like 60 people and some of them are just handles because I don’t even know their names. They were contributing valuable ideas.

AL (0:19:07): So are you aware of some application areas whereas selective functors have been turning out to be very useful that aren’t build systems?

AM (0:19:16): Oh yes, I saw people use selective functors for parsers, for example. There was a nice paper by, I think, Jamie Willis and Nicolas Wu about staged selective parsers and I think it’s a very nice application of this idea. And also, Matthías was actually thinking about pushing this idea further and maybe implementing some syntax sugar for selective functions. This haven’t yet materialized, but I think this is another interesting direction of how do we–

AL (0:19:40): Would you go as far as intuitively saying that most uses of monads could actually be reduced to uses of selective functors?

AM (0:19:49): Many users, for sure. I managed to rewrite many of my monadic functions into selective functors and they just work fine. But yeah, obviously, monads give you dynamism and selective functors don’t. I tell you, with selective functors, you always have a fixed universe of things you can possibly do. And this is not true for monads.

AL (0:20:09): What is the story on composition, just out of curiosity? Because I mean, applicative functors compose nicely and monads compose not so nicely. What about selective functors?

AM (0:20:19): I think selective functors also don’t compose nicely, pretty much for the same reasons as monads.

MP (0:20:23): But if I remember, it kind of slots somewhere into the hierarchy, right? But it’s not the same as applicative. But you can have two superclasses of monad with selective functors, or what was the story there?

AM (0:20:35): So when I was writing the paper, I think I proposed here you can place selective functors in between, so kind of applicative for the superclass of selective functors and selective for the superclass monad. And maybe from some pragmatic reason, it makes sense, but in recent years, I’ve grown to dislike hierarchies and I think they just sort of complicate things. These days, I think that it just doesn’t give you enough benefits. Fixing this hierarchy and trying to fix selective exactly there, it just introduces too much friction with all the updates that have to do with all the libraries out there. And the benefit, I don’t know. Maybe reusing the same name for different functions. Yeah, this is useful, but I think I started to doubt that it pays for all the costs that you gained from hierarchies.

MP (0:21:19): So we’re not going to see the applicative selective monad proposal. Everyone has to implement everything.

AM (0:21:24): I’m more skeptical that it’s a good thing to do.

AL (0:21:28): Yeah, it’s kind of my feeling as well. But I mean, I’ve never had the time so far to think of a clear proposal or a better solution. But my feeling is that sort of this linear hierarchy of superclasses is just too limiting and this is just leading to endless discussions in the community that ultimately serve no one. And what you really want is to have rules to say everything that is an instance of this class is also naturally an instance of that class and some way of exploiting those rules, but not in such a way necessarily that there is one blessed superclass relationship where you actually have something available as a subdictionary in the implementation. But I feel like any change to that will be vast and also lead to endless discussions that may eventually benefit no one. So it’s an interesting future research problem perhaps.

AM (0:22:28): So I think extensions like Deriving Via are a more promising approach, which, as Andres says, allow you to basically generate some instances from other instances. They give you the same, essentially, code reuse but at less cost.

AL (0:22:40): It’s certainly the context in which I have been starting to think about this kind of stuff. Yeah, because when we created Deriving Via, we were also talking about superclasses and about the relationship with all sorts of other extensions. And it feels like there is something there that could perhaps be a better replacement for superclasses.

MP (0:22:59): I think also we implemented QualifiedDo, which was what Andrey was mentioning before. We can have a QualifiedDo version of selective functors. Because I don’t think people really want the hierarchy. They just want the do notation syntax. So you don’t actually have to burn all the bridges; you just define the do notation. Because that’s the blessed imperative syntax.

AM (0:23:21): So essentially, what we want is some form of overloaded case statements, which is of course very useful when you are implementing various sorts of DSLs. In many DSLs, you don’t really have a monad, but you have case statements, you have if-statements and you want to be able to benefit from do notation to write those programs.

MP (0:23:38): Right. Because we have things like applicative do, where it takes do notation and then it nicely rewrites it using a fancy algorithm into applicatives and then everything can be parallel and you don’t have to think about it. If we had selective functors, it would be very cool if we could rewrite do notation and then not have to run the monad because we can do the– it would be very cool. But I think, like Andres says, there would be so much discussion and so much re-implementation, and like, when is it worth it to make everyone rewrite everything just because in a few cases, we might gain a slight performance from do notation. It’s very interesting. It feels like you hit upon something. Because we knew about monads and we knew about applicative and now we have this, “Ah, this is like a really– it’s a very nice discovery.” It doesn’t feel like it’s invented. It feels like you discovered these. I really like those bits of program language features where they just work.

AM (0:24:31): Also, one other way to look at selective functors I think in retrospect is that there’s a very simple idea in the sense that applicative functors talk about DSLs where you have some kind of notion of a pair or a way to combine multiple objects together. Monads talk about languages where we have let binding. And I think selective functors talk about languages where we have case statements essentially. There are lots of other primitives out there in languages that you can pick and create an abstraction out of those primitives. You just pick the case statement, pick the if-statement. This gives you essentially selective functors. But there are also other primitives that you can study in the same manner.

MP (0:25:07): Right. So a little bit more about the current stuff you’re doing. You’re writing Dune for OCaml, right? That’s the one that builds the OCaml compiler or what is– tell us about your current work. What are you doing now?

AM (0:25:20): When I was finishing already, like finishing the work and most of the work in Hadrian, at that time, I was already thinking that I’m writing so much code these days that I feel like more like a software developer than in academic. And I started to explore possible jobs in the industry. At some point, I was even contacting Andres, I think, about a possible job at Well-Typed and we had a nice conversation, and I was talking to many people. And one particular good fit was Jane Street because I wrote the selective functors paper in co-authorship with Jeremie Dimino who was working at Jane Street.

So I ended up working with Jeremie on Dune. And Dune is a build system in OCaml. I think Dune is– interestingly, Dune is not yet able to fully build the OCaml compiler, I think. Also, in a similar project in the OCaml world, they also rewrite the build system for their compiler from Make files to Dune, and they are partially succeeded already. They can build some branches of the compiler I built using Dune. But I still haven’t finished the whole migration yet. So yeah, somehow by accident, I ended up being a Dune system developer and I very much enjoyed this area. One, maybe a lesson I learned is that you can learn to join an area even though initially you think it’s something that you don’t really like and you don’t want to spend more months than it. But as you spend more time on something, you learn about hidden beauty. You can find something like hidden beauty in pretty much any area you work on, as long as you put enough effort to get to know it better.

AL (0:26:40): How much of the Hadrian knowledge just applies and how much is different?

AM (0:26:47): A lot of that knowledge applies. So I learned a lot when working with Neil Mitchell, who is an expert on build systems and who is also working on yet another build system back through right now at Meta. That knowledge is directly useful right now. So I keep learning new things every day, of course, because there’s still a lot to learn. But that was a very useful foundation, that half a year that I spent at Microsoft with Simon and Neil.

AL (0:27:09): So would you say that the theory is mostly the same and doesn’t require much reworking and that like most of the differences in the new work is in these other aspects that we previously discussed, like reverse engineering, what actually the requirements are and figuring out what people are using it for or want to use it for? Or are there actually new fundamental things that require a completely different approach?

AM (0:27:35): I think there are some new fundamental things. For example, one thing that’s different about my current work compared to Hadrian is the scale of the build that I’m working on. So at Jane Street we have right now, I think 36 million lines of OCaml code, which is quite a bit more than the size of the GHC. It grows very fast, a lot of commits every day and we need to build everything. And so there is definitely a new flavor of working on the build system. Right now, it’s trying to make it much faster than– I mean, there is a lot more emphasis on performance right now and incrementality not in terms of the external actions, which is very common in build systems where you try to avoid triggering actions, like compilation for example of files that you don’t need to right now. But there’s also a lot of work on the internal incrementality, for me, right now. So you run the build system in this file-watching mode, and as the user is editing the files, you want to re-execute only a small number of functions inside the build system. You don’t want to essentially restart the system from scratch every time a single file has changed. So there’s also incrementality within the build system itself. So I’m working a lot on the incremental computation libraries these days, which is similar to build systems. Like as we say, Excel is a build system as well, but I know there is obviously also some differences. So the build system works on this mass immutable state, which is file system. This is probably one of the key complexities about the build system. Whereas incremental computation libraries already work with memory, so you have a lot more control there.

MP (0:28:58): So will we see any of this come back to Haskell at some point? Because I would really like to be able to pause compilation at some point and then do speculative compilation in GHC and then go back and continue. But the technology is not there. But we see this come back home to the bosom of Haskell.

AM (0:29:17): Yes. Well, definitely, as I’m learning new tricks, I hope to sometimes migrate those tricks to Hadrian. Why not? And to Shake as well. And here we are very frequently talking, like Neil and I are talking about the build systems we are working on. We are stealing some tricks from each other. And so there is some dissemination of ideas happening this way. One other thing I would like to be able to do is some days to use Dune to build Haskell projects as well. Dune can build OCaml, but it can also build C. Internally, we also have build rules for Python and whatnot. Why not also try to build Haskell? I would be pretty interested in doing that.

AL (0:29:49): So now that you have some insight into both the Haskell and the OCaml worlds, we of course have to use the opportunity. Tell us something about the differences you’re perceiving, both between the languages perhaps and the community and the whole way of thinking.

AM (0:30:05): I think these days, I’m pretty comfortable writing in both languages. One thing that would probably push me towards using, like preferring OCaml for any large project these days is just the speed of development because the compilation times in OCaml are very, very fast. But my initial impression on my first day working on Dune, do Dune, I thought that I made a mistake because nothing happened. But actually, it was just so fast. The speed of development is right now, for me, the main differentiator between OCaml and Haskell. Of course, it’s possible to improve on that and maybe the OCaml compiler has a simpler job to do. Maybe that’s why it’s fast. This is something that I would like to happen in Haskell. There are some smaller things, like initially, I was very much missing the type classes in OCaml. You can just write class, you have to tell which class even. So this is initially very annoying. Not now. I am just used to this inconvenience. It forces me to be more explicit sometimes than I would like to. But now at the time, I think maybe being explicit is not always a bad thing. Sometimes it’s okay. You say like this class comes from Int. It looks a bit odd when you write this line of code, but after sometime, you stop noticing this.

Maybe the biggest feature I’m missing from Haskell today is higher kinded types, which makes some abstractions like Traversable very awkward, pretty much impossible in OCaml. So this is the biggest feature that I’m missing from Haskell.

MP (0:31:23): But OCaml has implicit side effects, right? Has that bitten you or you come from the Haskell world so you’re already careful? How do you– how does that impact your work in–

AM (0:31:33): This is just not important to me. Obviously, on paper, this is a big difference. Haskell is very principled. You have to use unsafePerformIO if you want to escape from the IO monad. Whereas in OCaml, you don’t have to. But it’s also like I’m mostly writing pure code in OCaml. So I’m not maybe getting some of the guarantees from the compiler. But on the other hand, you can pretty conveniently introduce some pieces of local state. And occasionally, it’s useful and it’s much less verbose. So we are hoping to have some purity annotations in OCaml soon so that you can basically say, well, this function is pure and the compiler is going to check it. And that’s I think is just going to be probably the best of both worlds. But yeah, I’m not sure. I have to try it to be sure. Also, laziness, it’s also a big difference. OCaml is strict and Haskell is lazy. Obviously, in some cases, you prefer laziness by default. In some cases, you prefer strictness by default. And I think I’ve just grown to just accept strictness by default and it’s fine for me.

One thing that makes me sad when I’m writing Haskell is that laziness by default is convenient, but very often you actually are forced to annotate your programs with strictness notations, to get acceptable performance in certain cases. And this kind of makes me sad because the program used to be so nice and beautiful and now you had to insert some knowledge about the evaluation. It’s a nice program, so it makes me sad. And this doesn’t happen in OCaml in the sense like, okay, now I know for sure how things are going to be related, but I don’t have to annotate anything. Occasionally, I have to annotate things which are lazy, of course. And that has a similar effect. I used to have some nice, beautiful program, but now I have to insert some annotations that tell us something about the evaluation order. So I don’t know if it’s possible to have a language where you just don’t have to think about this. So maybe it’s just not possible.

MP (0:33:21): Right. But I remember at ICFP, Sivaramakrishnan made a big deal about the new concurrent and parallel OCaml. Are you using OCaml 5.0 right away, or how is the adoption pipeline? Because I know, I mean in GHC, a lot of people are still on 8.10, right? Because there’s always a bit of a lag, right? So how much is the lag in OCaml and is it more or less than in Haskell you think?

AM (0:33:43): Internally, we are quite behind. We are not using OCaml 5 yet. And there’s probably going to be at least a year, I think, before we can actually try using Multicore OCaml. So we have actually our own version of the compiler because we have quite a big compilers team right now and also recently, Richard Eisenberg also joined. So we really have quite a lot of people who are working on various interesting features. So Richard, for example, joined to help make unboxed types. But also, we have this unique feature, which is safe stack allocations. So it’s possible to annotate values in OCaml to tell that they’re going to be allocated on the stack. And this is going to be obviously cheaper than to allocate on the heap. And the compiler is going to catch you if you try to sneak this value somewhere in the– where it can escape and be used after the stack has been bought. So this is an interesting feature and there’s a lot of– so basically, our internal version supports this feature in the external version of OCaml, external version of the compiler doesn’t. And at some point, they will merge, but we’re in this state where we can’t use Multicore OCaml yet. At the same time, the external world can’t use really local allocations yet.

MP (0:34:52): I remember there was, because we were chatting at ICFP, some OCaml compiler writers and some GHC compiler writers, right? And it turns out we had a lot of same bugs kind of cropped up. But we did it here and then they had a half a year later or they had something and then we saw that later. So I feel like there’s a lot to be learned from discussions between these because these are both very big functional compilers, right? So it’s glad to have not just the GHC writing functional compilers, right? It’s to have variation and– yeah.

AL (0:35:24): Do you think there’s anything in terms of sort of community processes or something that the Haskell community could learn from OCaml?

AM (0:35:31): So it’s difficult for me to compare the communities because I was at different stages when I joined these communities. When I joined the Haskell community, I was basically nobody. And I was very impressed that the Haskell community was so welcoming and friendly for me. And that was something that I could not possibly reproduce in the OCaml community because here already I came, first of all, knowing a lot more. So I didn’t need as much help as I needed in the Haskell community. And also, I was already working on a project that had lots of contributors and knew many of them. So it was a completely different process for me. So it’s very hard to compare. Yeah, probably just not attempt to draw any comparisons because I just have no real basis for doing that. I enjoy being in both communities and I’m very grateful for the Haskell community for actually making me fall in love with functional programming because if not for the community, I wouldn’t have fallen in love with functional programming really.

AL (0:36:22): I would still like to go back a little bit to this switch you made from academia to industry. And you said– well, you had the feeling that you’ve already been writing so much code, and I can kind of understand that. I mean, I made the same switch at some point. But it also feels like you had a good research process going on, right? I mean, we talked about it, like you had the successful mode of coming up with a great idea, putting it out, refining it, and then publishing it. And that worked at least twice, if not more often. And it feels like there is a lot more stuff in that area. So was it easy for you to leave academia behind or is that something you’re missing or are you feeling that you can still do all the research that you want anyway, or do you want to go back at some point?

AM (0:37:07): Yeah. I mean, of course, I miss academia. I had a great time in academia. I enjoyed this spirit of open-minded research and not being constrained by the topics. And okay, I worked on the asynchronous circuit and then I just switched to build systems without having to ask anybody. This is the academic freedom and I used it. Of course, now I have much less time to just explore some ideas in the abstract. I have to do some real work. Some people are depending on my work. I occasionally write papers or maybe once a year. Maybe someday I’m going to come back to academia to finish all this. I have lots of ideas which I like to write about and would like to even just think about, which I don’t have time for right now.

But also, I enjoy some aspects of working in the industry as well. So presumably, when I go back to academia, I’m going to miss the industry as well, this immediate interaction with the users. So you implement the feature and tomorrow somebody comes and tell, “Oh, I’m so happy you’ve done this.” And this is just not happening in academia. You write a paper and you wait for a year for it to publish. And then you have a moment of celebration. But then I guess so far removed from the time. Actually, I came up with this idea, whereas right now it’s my job. I can come up with an idea, implement it, and the next day I see the benefits. It’s a very, very nice loop, a very short iteration. But also, I mean, there were some aspects of academia which I didn’t like and I’m happy to have escaped from them. I have never liked, for example, writing grant proposals. And this is basically, essentially, if you want to grow your group, if you want to have PhD students, you want to have researchers working in your group, you have to write grants. And I just hated this.

AL (0:38:44): Yeah, I completely understand you. It’s probably the single best thing. And somehow, I don’t know, all my life in the industry, I didn’t feel the same. I mean, for example, at Well-Typed as a software consulting company, we also have to look for clients, of course, but it doesn’t feel the same. Somehow, grant proposals were always like– yeah, there is this opaque process that nobody understands.

AM (0:39:11): Also, somehow by chance, I switched from academia to industry at a very good time because I switched in 2019 and then the pandemic started. And being academia, it stopped being fun at all because you couldn’t go to conferences, you couldn’t really interact with your students. You had to sit at home and do video lectures, which I’m very happy I didn’t have to go through this.

AL (0:39:31): But I mean, that affected industry as well, right? I mean obviously.

AM (0:39:36): Yeah.

AL (0:39:37): How is Jane Street set up? Is it mostly working from the office or–

AM (0:39:40): We mostly work from the office, but we still don’t have full remote positions. But who knows? Maybe we are going to rethink and maybe we’re going to evolve in the future.

AL (0:39:48): You’re in Singapore now for a few months, you said. So you can switch between different offices all over the world or–

AM (0:39:55): Yeah. We encourage kind of mobility between offices so the people meet each other and know other offices. So we recently opened an office in Singapore. It’s very small. It has a startup view, get almost everybody by name. And I’ll also going to explore our Hong Kong office in a few weeks. But it’s very difficult to be in this time zone. Everybody is so far away.

AL (0:40:15): It forces you to be an evening person, I guess.

MP (0:40:18): But I wonder, you were in academia and then you moved to industry. Your academic background, how does it affect your work in industry? Do you feel like you’re more principled or the ones who were always in industry? How do you feel like you compare with just having gone straight to industry or doing this stint in academia first and then going to industry?

AM (0:40:38): I don’t know how to compare, but my personal transition was very painless. My biggest shock was the sheer size of the systems that I had to work with, all this complexity. I was in academia. Every project I worked on before was a good fit in my head. This was no longer true when I moved to industry. Everything was much bigger. And that was the only difficulty. Other than that, the people I was working with, they were great. The atmosphere in the office was very similar to the atmosphere I had at university. Maybe it has to do with the fact that I’m working on the Tools and Compilers team, which is probably the closest team to research that there is at Jane Street. So there are lots of former academics here, lots of problems we are solving. They’re kind of research related and we’re occasionally writing papers about them.

So the transition was very easy for me. And it’s hard for me to say like, what would have happened if I did it straight out of the university, for example. I think one reason I waited for so long before switching to industry was I had some job offers before, but they were always very generic and I wasn’t really sure what I’m going to do if I accepted the job. I was a software engineer and at a company like Google, they don’t really know what they’re going to do. Whereas here I knew exactly I’m going to work on build systems. This is exactly what I was working on in the university. I’m just going to change languages and continue doing pretty much the same thing. And that’s what happened. And so the transition was very simple for me.

MP (0:41:57): Right. Now we talked about your backstory, we talked about what you’re doing now, but what is in the future?

AM (0:42:04): Yeah, in the future, I have no plans to jump into yet another research area. I’m still happy with working on build systems, but on the research side, I keep working on the topic of algebraic graphs, which brought me to functional programming. And recently, I wrote a paper that generalized that idea to be able to describe graphs with edge labels. And that also came as like– initially, it was a blog post called United Monoids. Also, I got lots of reactions and eventually, I forced myself to turn it into paper. So I published it in the Programming journal and presented at the conference this year. This is a work on which I would like to continue working on. So as I said, I still don’t use algebraic graphs in my everyday programming and I would like to change this. I would like to eventually close the circle and finally benefit from this idea in my everyday work.

MP (0:42:53): All right, yeah. We’ll keep an eye out. Because having an idea, writing a blog post, getting feedback, and then publishing it, it feels very nice, right? Because otherwise, you have this case where you’re like you have an idea, you wrote something about it, but it turns out it’s not that good, but you want to publish it anyway. So, refining the idea is key.

AM (0:43:11): Yeah, I can highly recommend this style of research. Go ahead and don’t be afraid to share your idea early. And maybe somebody’s going to tell you that this has been done before, then you save yourself a lot of time. But maybe people will tell you, “Oh, this is new and interesting. Keep working on this.” And then you can go ahead and put it into a paper.

AL (0:43:27): It also gives you sort of the moral high ground in case of objections coming up later, right? You can all say like, “Oh, this was out in the open for such a long time. You had your chance.” Okay, thank you very much for…

AM (0:43:42): Well, thank you. It was fun.

AL (0:43:43): Talking to us.

MP (0:43:44): Yeah, thanks.

Narrator (0:43:50): The Haskell Interlude podcast is a project of the Haskell Foundation and is made possible by the support of our sponsors, especially the Monad-level sponsors: Digital Asset, GitHub, Input Output, Juspay, and Meta.

21 – Andrey Mokhov

Related links

Transcript