Théophile Choutri

Transcript

This transcript may contain mistakes. Did you find any? Feel free to fix them!

Andres Löh: Welcome to the Haskell Interlude. My name is Andres Löh, and my co-host is Niki Vazou.

Niki Vazou: Hello!

AL: Our guest today is Théophile Choutri, perhaps better known in the Haskell community as Hécate. In the next hour, we’ll talk about the importance of documentation that is alive, the Haskell School project to create an online open source book on all aspects of Haskell, and the future of Hackage.

AL: Welcome Théophile, glad to have you with us.

Théophile Choutri: Hi, it’s a pleasure!

AL: To get started, why don’t you tell us how you got into Haskell.

TC: So, I went into Haskell through a series of hoops most unlikely. I started– I’m entirely self-taught, and I started doing some sysadmin stuff around high school. And then a friend of mine told me about Erlang and Elixir that are used in his company. And I got to play a bit with them. And once I finished high school I had written a couple of servers– server-side software in Elixir, and I was faced with a choice of languages to learn to write client-side software.

So I tried Golang, I tried Rust at the time, OCaml, and Haskell. And for various reasons I stuck with Haskell. I thought it was the one with which I could learn the most, considering that I wasn’t planning on ever doing computer science studies. So yes, I went with Haskell with the idea to learn. And never stopped learning.

NV: And so can you tell us a little bit more about this self-taught thing? Because I think Haskell is not a very easy language to get started in by yourself. How did you do it?

TC: I think the thing that unlocked me was the book Learning Haskell from First Principles, from Julie Moronuki and Chris Allen. It was remarkable, honestly, I could just start with Chapter 1 and writing some lambda calculus by hand with paper and a pencil. A really fantastic book, and I’ve been quite hooked on the next things that Julie produced with Chris Martin. So yeah, I would say it is this book that unlocked things for me in Haskell.

AL: It’s a huge book, right? I mean, have you read the entire book?

TC: Uhh, yes? I think more or less. I may have skipped some parts because I read it not in one way around – so I left it a bit, I went back to the parts that were of interest to me at the time – but yeah, in some way or another, I definitely read the whole book.

AL: So, how difficult was it getting from this self-taught Haskell from the book to actually being able to use Haskell for either your own projects or even for work?

TC: It was quite painful, actually! No secret here. And I can be very daft at times, so it was remarkably painful even sometimes. Some of the– OK, so I was highly biased by Erlang and Elixir on how to write applications, how to write systems. And having none of those primitives and concepts in Haskell really messed up my mind. I was desperately looking for an actor framework and concurrency in Haskell, only to find out that Cloud Haskell was deprecated, or at least aging very badly.

So, it took some time, honestly, to find the proper patterns, the good ways to write applications. I would say that I never truly felt confident writing real applications until I got my first job in Haskell. And that was an entirely new level.

AL: Did you specifically look for a Haskell job at that time, or did it just happen?

TC: Um, I was looking for an FP [functional programming] job at first. I was coming out of my first programming job ever – it was in Elixir – and I was more or less looking in those waters. And then someone who had seen me give a talk at the Paris Haskell meetup offered me a job. And I took it. And yeah, quite a radical change of pace and contents when writing Haskell applications

NV: So can you compare a little bit more Elixir with Haskell…

TC: Sure.

NV: …because I think they have many things in common. I know very few things about Elixir but I know they have a very passionate community too!

TC: Yes! Of course. Well, some of the people who made Haskell also worked on some aspects of Erlang, especially the needs for a type checker and a type system that would be suited for its applications. Erlang is a language based on a virtual machine that was designed for telecoms. It was made at Ericsson, the telecom company, in the 80s if I recall, and its first prototype was based on Prolog, so its syntax heavily inherited from Prolog. Which is a wonderful language.

And then the idea which has resonated with me very much, was to be able to write applications with very high reliability. Applications that would never shut down. And when they would inevitably shut down or fail for some reason, they would handle that with grace. So the idea of failure, of predictable failure as part of the normal cycle of life, is inherent to Erlang and Elixir. Elixir is a language based on Erlang. It compiles down to the BEAM – the Erlang virtual machine, which is called BEAM – and brings with it modern syntax, inspired from Ruby, and concepts from Haskell and Clojure.

So it’s very FP, I love it. It’s immutable, very strong concurrency primitives, and a bunch of stuff for distribution. And it’s also dynamically typed, so you can’t have a very powerful type checker, just for the reason that you may receive terms in types from the outside world. Like literally from a TCP socket. So you must be able to handle not knowing specifically what you may get from the outside.

AL: So that’s the same between Erlang and Elixir.

TC: Oh yeah, yeah, it’s the same principle.

AL: Yeah, yeah, right. I’m always a little bit– I’m always a little bit amused that people consider Ruby syntax to be “modern” compared to Prolog syntax, but yeah. I think (laughs) that’s just how it is.

TC: Well, in the chronological sense: Erlang 80s, Ruby late 90s, and actually it was made in ’95 if I’m not mistaken, so I’m barely older than Prolog– sorry, than Ruby. I’ve turned 26 this year. So, yeah, for me it’s one of the syntaxes that I’ve known all my life.

NV: But, in Elixir, there was an idea to add dependent types.

TC: Dependent types in Elixir? Well, that would be very hard, because they don’t even have static type checking. I don’t think– no, I don’t think the idea ever came around.

NV: OK. Maybe it was one conference that I heard it…

TC: Oh, well, it’s not impossible to do it, but I’ve never heard of such a thing more broadly.

AL: So when you started having a Haskell job, you’re doing a lot of open source stuff in Haskell as well, right, and you’re very active in the community. Did that just happen at the same time, or you said that you even gave a Paris Haskell meetup talk before you got that job…

TC: Heh heh. Yeah, so, I was in business school at the time. Two years after high school, I went to business school, after which I promptly did a burnout. And I focused a bit more on programming.

And yeah, I was very involved in meetups, it’s a good way to socialize and meet people with your interests. I went to the Rust meetup in Paris, in Mozilla’s office. It was very nice. And at some point, with enough lurking around in the meetups, people were offering me to give talks. So at some point I got the opportunity to talk about a subject that I would solve some years later, which was the laziness of sum and product in the Prelude, in base. Or at least, the fact that GHCi does not optimize laziness to strictness when you use those functions.

So my first talk was how linked lists and lazy functions can blow out your heap and your stack. Yeah, that was the first talk I gave. I really went into the Haskell community as a more involved contributor, I remember that was the 20th of March, 2020.

AL: Haha, you remember the day, wow!

TC: Yes, I do, but for a very simple reason: I tweeted about it. Because I was looking for people to help me write better documentation for base. And that’s the first time I used the hashtag #haskelldocs on Twitter. And that was the beginning of a wonderful adventure – which is still going on actually, but it had several arcs like in a series.

So yeah, it all started on Twitter and on the #ghc IRC channel on Freenode at the time. And yeah, that was the moment where I realized that I had a will to improve the documentation, and also the time to do it.

NV: Can you explain a little bit more, like what have you done in the documentation and where is it going?

TC: So, I have served as of course an individual contributor of documentation, but also as a coordinator, so I handled the onboarding of volunteers to write documentation, clarify the steps to setup a local GHC repository and build the documentation, and made a list of the principle modules on which we would focus.

I communicated heavily on that, and really took the position of a public figure for that stuff. Now it’s going a bit slower, because I’m doing other things, especially with the Haskell Foundation, but documentation is still one of my priorities when I work in the community. And yeah, it’s still a guiding line.

AL: Let’s go perhaps a little bit more into the details here because I think it’s a really interesting topic. So first of all, I would be curious to hear – you already said this to some extent, but – apparently it wasn’t your life’s theme up to that point that you were always after better documentation but it was more probably – I’m putting words in your mouth here perhaps – probably that you felt a particular lack in the Haskell documentation that somehow triggered you to say “I have to do something about it!”…

TC: Haha. I did it out of pure need. I could not learn Haskell properly without good documentation. I could not also recommend Haskell to beginner Haskellers due to the documentation, but also myself, I was hindered by this. So yeah, I went to the functions that I found the most unhelpful in terms of documentation, just asked around what the documentation should be – what it was really doing, what were the shortcomings – and I wrote that. And that was my primary mode of operation.

I really had to learn Haskell more, and each time I could not understand the documentation, I would have at it. Yeah. So that’s why I focused mainly on the problems I had, and that’s why I say to the people who want to get involved the documentation of whatever, Haskell or another language or community, scratch your own itch first. Because it gives you the will to do things right. You have a direct benefit from this being improved. And it’s usually a great motivation.

AL: Do you have a theory why Haskell is comparetively bad at documentation? Is this just the way the community is organized, or is it something about the language, or a mentality?

TC: Haskellers– I don’t think we are so special that we are uniquely bad at a thing, and especially at documentation. The experts suffer from the lack of clarity, because they will talk to other experts when they document a thing or write a paper, and the people who are mainly using Haskell in anger, once it unlocks for them– well, they have to go and work and use that thing, so they don’t have the time to properly upstream their findings.

As a little example, it took me one year of using monads at work to properly read the paper “Monads for functional programming”. And it’s a fantastic paper, but it’s a very bad introduction. The problem is that a lot of people see it as a foundational paper – which is it! But it’s not an introduction, it’s a subject by an expert, to other experts, to be read.

NV: But the question is “how do you become an expert”.

TC: Yeah, that’s also the problem: you need to be part of the inner circle of PhD students who come to the right professor and stuff like that, and go to the right university. So for an outsider like me, yeah I think my work and my goal have been dedicated towards making Haskell more approachable to people who have not been in a standard let’s say, classical academic track.

NV: I guess I am of those that believe that Haskell is not well documented because the types say everything.

TC: Mm-hm. Yes.

NV: I understand what you say, that to learn to read the types you need to…

TC: To read the types.

NV: (laughs)

TC: Yes, yeah!

NV: So can you give an example of something– I guess your motivation, like a concrete function let’s say, where the types do not suffice to give the documentation and you need to have the text there.

TC: Uh, do I have a concrete function in mind…

NV: Or like, your first function or something you spent a lot of time on maybe.

TC: Yes, servant. Writing servant APIs. When I look at the Haddocks [Haddock-generated documentation] generated for servant, I do not understand the type signatures. Well sometimes because they’re so large that they go off my screen, but also– I mean, they are valid types. You can’t take that from them. But are they readable types? Unfortunately not. And this is the moment where you need to say “OK, let us not read the types, let us see what the prose says about the types”. And the prose says… nothing. Nada.

So, at this point you start bugging people on IRC and saying “Hey, how do I use that?” and “What are type families?” and “Why does it have to be injective?” and yeah. I was really eager to learn servant, to learn how to use servant. I was really not prepared for the amount of theoretical and concrete stuff that I would have to get into my head before being able to use it. I still can’t read some of the type signatures, let’s be honest, especially the generic stuff. I use it at work every day. As a disclaimer, I’ve become a maintainer for servant.

So that’s– I’ll always come back to the things I’ve had most trouble to learn, just to make the path easier for the people who come after me. But yeah. I aim to replace the reading of type signatures – of hard type signatures – by the reading of approachable prose.

AL: I think servant is an interesting example certainly both because it is rather extreme…

TC: Mm-hm, mm-hm, of course.

AL: …using a lot of type level magic and has a lot of really complicated types, where if you look at just a type signature without diving deeper you can hardly understand what the motivation for something actually is. But also it is actually a package where I think the authors have at least tried from the very beginning to somehow write documentation at least. I mean, we have other Haskell libraries around where there is really nothing…

TC: Yup. Oh, yes.

AL: …but I think for servant relatively early on there were at least tutorials and there was an active effort. And without specifically judging the good and the bad, I’m just wondering, what’s actually the recommendation you would give to library authors? What can be expected from them (laughs), reasonably, and how can we become better in general? What’s the minimum standard that we should adhere to, or what are perhaps good tricks where you would say “if you do this, then you’re already halfway there”?

TC: I have to give a shoutout to my friend Koz Ross who has hammered into me the best practices for this, because he has very piercing eyes. doctest: very, very important. You put the example right in the documentation. Also, servant has this setup where we have “cookbooks”, which are written in literate Haskell. Those cookbooks are part of the CI setup, so for each pull request, the examples – the documentation – is validated against this pull request. And it is deployed then online.

So you’ve got living documentation, and I think that’s one of the priorities for our community, because unfortunately the other extreme is this paper published in 1988 and the code doesn’t compile any more because we have changed standards twice now – and stuff like that. I think the best thing that we can do for documentation and for the future is having living documentation, having examples that are embedded right into the documentation, because then they can be shown to the user – the casual user, who will not read the code, but read the Haddocks.

AL: Right. So closer to the code.

TC: Yeah.

AL: So rather than write a separate tutorial that will get outdated…

TC: So. (laughs) They all have their usage. You’ve got to balance the necessary amount of handholding for the new user who arrives – from A to Z – but also specific guides on particular subjects; like in servant, how to implement JWT authentication. You also need to provide a reference for the experts, and people who will have to audit the library, or use it in anger. Also, a commentary is well-received when you have the liberty to explain in-depth the whys and the hows, and the origin and the destination.

Every type of documentation has its place, it’s just that you can’t put together the commentary and the tutorial. So that’s– you can’t put the PhD paper right with the one-on-one class. That’s not possible. I mean, it’s very easy to do it, but not when you’re the recipient of such a thing.

At this point in time, I think servant has a very good model of documentation: it has the basics, and then specialized cookbooks. I’m complaining about servant because the types are scary – really scary, especially when you also come from a community that puts the emphasis on the types over the documentation– “read the types”. It’s true, in some contexts. In the case of servant, the types are not the tutorial, they are the commentary. They are the reference. When you’re an expert on type families and on other type level magic.

NV: You said that the tests– the examples that you’ve put are automatically checked…

TC: Yeah, they are a Cabal project. They depend on the library, and they are in the same repository, so when you put a change well you trigger recompilation of those projects. And if you break something, we will see it.

NV: Is this sustainable?

TC: In which terms? In terms of effort?

NV: Yes.

TC: Well, I would argue that software is not sustainable as long as you need at least one human to take on their free time to do things. No, in some way it is because you may see the upfront cost of each time checking if the code compiles. I think it balances quite– the cost of having to fix a bunch of failures that have accumulated over time. So yeah, I think in average, it’s a good thing.

AL: My guess would be that it’s more sustainable the more you check, right. You have a little bit of an upfront cost, I guess to set up the whole infrastructure. And perhaps there is still stuff that we could figure out to make that easier for other people to copy. But doctests and so on are already there, I guess. But in general, it’s much easier or much more sustainable if you make some refactorings to your library, and you see via CI or something that your documentation is outdated, than having to pay attention to it manually or having to wait for bug reports and saying “oh, I didn’t notice it in the last three years but this example does no longer seem to work”, and then you have to reconstruct three years later (laughs) when it actually went wrong.

I think in general, I quite believe in the approach. At the danger of bringing up too much other stuff into the conversation, there is this tool that I still maintain, which I’m not very proud in terms of code quality: lhs2TeX. Which is being used for paper writing in Haskell from time to time. One of the general ideas there was also always that you can type check your documentation– that you can have a lot of code that is in your papers, the actual Haskell programs, even if they perhaps are only fragments of Haskell programs, or not even completely Haskell, or some variant thereof.

There’s a lot still to be explored in that area, in terms of better tooling. Ideally I want any reference to a function name that appears anywhere in the documentation, I want that to be at least scope checked and ideally type checked and everything. That would be very nice. Which I think probably leads us towards Haddock, right?…

TC: Yes, yes!

AL: …you have been working on Haddock as well, you have basically revived the project from a dormant state.

TC: I wouldn’t say that – I have been honored to assume maintainer duties for Haddock, but it was still maintained. Not actively being worked on. It’s a huge beast.

AL: Yeah, perhaps I was unfair, I shouldn’t have said revived. Reinvigorated, perhaps (laughs).

TC: Well, to some extent. But indeed, when I was writing documentation for base, I felt the need to unify – own? The full chain of production, from when you write documentation, when you generate it, when you publish it. And of course this means onboarding people writing the Haddocks, and then generating the documentation.

And yeah, indeed, it was not easy, especially because we have this workflow that we share with GHC. The project still needs people to work on it, it’s never done. And of course, myself and the other maintainers are very welcoming of anyone who would like to give a hand in any way, shape or form.

NV: How can somebody contribute? How do they reach you, or what are the steps?

TC: I would not advise to read the issues list on GitHub, but rather take contact directly with the team, with people on IRC or on Twitter. I’m reachable everywhere. Except maybe in the metaverse. But yeah, don’t step into the fray like that for Haddock. Just reach out to someone, say “Hey, I want to help” and we’ll find something approachable for you.

And also we need a bunch of different skills – web design and integration skills are also very welcome. It’s a skill we truly need and sorely lack in this community. So, yeah, this is something that we’d find very useful.

AL: Are there any big long-term goals for Haddock, in your opinion?

TC: I would love to have the parser moved, or at least cohabitating, with a Markdown syntax for the Haddocks. We have the tooling – up to a certain point. By that I mean that the pandoc creator has written wonderful libraries for Markdown, but then we’d need the integration into Haddock, and this is not an easy feat.

So… is there something else. Well, we have a plethora of tickets open of course. There’s also the coordination with GHC, but I’m trying to push that out to the GHC team itself, because they essentially maintain their own branch of the code. We have big projects that could use some help, and we have the everyday stuff, people who find weird edge cases that are not properly documented and such.

NV: All this works by volunteers?

TC: Uhh, yes. Until now I think we’ve always had volunteers, I’m a volunteer myself. I mean, if someone wants to sponsor some work on Haddock it would be most welcome. No problem with that. Maybe this would be work suited for corporate sponsoring, because it requires dedicating a significant chunk of time on understanding the system and working on it.

AL: Are all your various efforts – I mean, you are doing some work on Haddock, you are doing some work on servant, I think you are working with Cabal as well, and probably there are all sorts of other projects – are these all hobby projects for you? Or are you doing them as part of your current job as well?

TC: They all are, until a couple months ago, where I formally asked my team lead to dedicate my Friday afternoons to work on servant. Which had a nice coincidence with the fact that I am implementing servant endpoints at work to migrate from Happstack. And he found that very topical and useful for the whole team, so yeah, now I’m literally a paid contributor to servant. Which is very nice–

AL: That’s very nice– we should probably backtrack and say which company is this.

TC: Sure. Nowadays I work at Scrive AB, which is a Swedish company that provides a solution for electronic signature of documents, and electronic identity through country-wide identity providers, especially in the Nordic countries.

And yes, we are very huge Haskell users. We have very talented people in the team, too numerous to cite them all, and we are starting to invest more into giving back to the community. And this means allocating engineering time for both individual projects like servant, but also in the near future in-kind donations to the Haskell Foundation, of which I am a board member.

AL: Is Scrive using Haskell more or less exclusively, or is it a mixture of languages?

TC: It is a mixture of languages. We have some stuff in Java, and of course the whole frontend part is either in Elm or JavaScript.

AL: OK. And is it a distributed company? Are you working from…

TC: Entirely. It’s a Swedish company, most of my colleagues are in central Europe. I’ve received numerous invitations from the Haskell user group of Brno in the Czech Republic, and also Poland. And yeah, we must be six people in France right now. And almost nobody in the same city.

So yeah, it’s fairly distributed. We have hard requirements on European timezones in the EU + Norway and Switzerland, for matches of privacy laws and other stuff. But yes, we are an entirely distributed company.

AL: How many Haskellers would you say are there in this company? You said six, just in France?

TC: Yeah, yeah– wow, that’s a very good question. Ten, twenty… more than twenty, definitely. But it’s a bit shameful, I don’t know the exact count of my colleagues.

AL: That happens if the company has reached a certain size.

TC: It’s a good chunk of Haskellers. Scrive made some Reddit post that says “Hi, we are looking into hiring twenty Haskellers in Europe”. So we are more twenty, definitely more than twenty, But yeah, we are the company that seeks to hire twenty Haskellers in Europe. And, yeah.

AL: Do you mean now?…

TC: We’re still hiring. Uh, we’re hiring a bit less now.

AL: Mm-hm.

TC: We’re reaching our actual requirements in terms of hiring, which is quite nice. But yeah, more than twenty definitely.

AL: That’s very nice that they also let you spend part of your time on open source Haskell work.

TC: Well I think my teammate and my manager had immense clarity, because they saw that the success of our engineering efforts was dependent on our involvement in investing in the tools that we use. Truly it was a good business decision.

AL: Yeah, absolutely. How does all this relate to your activities in the Haskell Foundation? Would you say that’s mostly an independent thing that came up, that when the Haskell Foundation started you just felt compelled to be active there as well, or…?

TC: When the Haskell Foundation– that’s a very good question– when the Haskell Foundation started asking for applications for the board, I asked around if it would be a good idea for me to apply, since I have such a terrible imposter syndrome. I felt like with the resources of the Foundation I could invest more and do better things that relate with my interests in documentation, in teaching in such.

So I went from coordinating people on base documentation, to coordinating people in several task forces – that include documentation, but also I have started this project called Haskell School, which is very closely modelled after Elixir’s own Elixir School website. Haskell School is a website that aims to give you a practical overview of the language and how to use it. We aim to provide this resource both to people who are not in the academic track to learning Haskell, but also as a support material for teachers. We aim really to provide a variety of topics, that give you a practical look-and-feel of the language: how it is used, how it is typed, what are the constructs, etc etc. That’s basically the façade website that we want to promote when people decide to learn it, decide to learn the language.

AL: So how would you say it compares to for example, the Wikibook?

TC: Well, that’s the thing: the difference is also in terms of resources invested. I plan on using– well, the Foundation lends its resources, so it means we can call in volunteers. We can even pay people to write on subjects. For example, we are in discussions with actors of the community who have proven to teach remarkably well Haskell, and we are in contact with them to commission them, to write content for the Haskell School.

It’s a spiritual descendent of FPComplete’s School of Haskell, which is now in read-only mode. So we intend to provide a long life to this project, and a supply of people to keep it alive and keep it up to date.

AL: It could be that I’m wrong but, FPComplete’s School of Haskell, despite the name, was essentially just providing a website for people to have these blog posts where you could interactively run Haskell code as well. Was it ever a coordinated effort at creating content? I thought it was mostly an approach of “here is the technology, we hope that other people go and do things with it”.

TC: I have no idea because by the time I got interested in the School of Haskell, it was already in read-only mode. So it wasn’t accepting any submissions and it was already archived.

AL: I remember that I really liked it at the time, that you had this– at that time, still a really rare possibility of running the code directly within the browser and trying out all the examples, so that was extremely cool. And there were also lots of people who were writing very, very nice blog posts. But from what I gather, what you’re now doing is a much more concerted and planned effort as to creating coherent sets of topics. So what– you’re aiming for basically the whole spectrum, from very beginners to experts? Do you want to ideally cover the whole range?

TC: I am not targeting beginner programmers. It’s an entirely other subject and the efforts that we put in Haskell School are not viable for people who are just learning programming. We are targeting people who have experience in programming, but who are also beginners at Haskell. Which is why some of the concepts that we explain, like currying, we also provide snippets in JavaScript. That’s the target audience.

AL: OK, so it does include beginning Haskellers, it just doesn’t include beginning programmers.

TC: Absolutely. Mm-hm.

NV: And is this already up?

TC: It is not, it is not. As it is a volunteer work, we are mostly dependent on everyone’s availability, especially to review pull requests and stuff like that. It is however hosted on the Haskell Foundation GitHub, so you can absolutely take a look at the lesson plan for example. And every lesson that is uploaded goes through a review process. I have appointed editors for each language in which the School aims to be translated to. We aim to have translations for French, Russian and Spanish as priority, and less focused translations by other volunteers.

NV: And English, right?

TC: Yes of course, the original version is in English. There will be a translation for Russian. I have to thank the Russian Haskell community– they have produced extremely valuable work in providing canonical (more or less) translations of computer science terms in Russian, with a very nice glossary of Russian terms for Haskell. Thanks to them.

AL: Aren’t you worried– we’ve been talking before about documentation, and teaching materials are also documentation in a certain way, they’re always in flux and they become outdated and you have to basically check them continuously – so if you have this ambitious goal, not just covering a large amount of ground but even offering that in multiple languages, isn’t this going to be an incredible amount of work keeping it all up in the air and up to date and maintained over time?

TC: Yes. Yes.

AL: I don’t want to discourage you, I’m hoping that you say “yeah, we have all this figured out”! (laughs)

TC: No no no no, but– difficulty must be acknowledged. It’s not going to be magical. That being said, with the recent events in the core libraries proposals, I think we are seeing the arrival of a less rapid breaking cycle in GHC and base development. So we can also count on the fact that things will very likely break less often.

Now, yes of course it will require people who both speak the language and know Haskell. But you know what? We don’t have to get those people from other circles of Haskell where they are very busy. I think documentation is a very practical and useful way to introduce people to a language, to the ecosystem, because they are in the unique position of being dependent on the material that they have to take care of, or at least that they want to improve.

So yes, I’ve onboarded many newcomers to the community through documentation, the first one being myself. So I don’t fear that we will lack people. We just need to show that there is a way forward for beginners to be invested in the community, in documentation and especially in Haskell School. Because Haskell School plans on being one of the first stops of a beginner. If they feel grateful for Haskell School, they also will feel– not compelled, but they might have an interest in keeping it alive and well and up to date.

AL: So you’re basically trying to build it into the design of the project to attract more contributors.

TC: Yes. And actually we are in contact with several universities across Europe especially, with teachers. The idea for the Haskell Foundation is to build teacher hubs. This is also one of aims of another task force that I created much more recently, which is the compiler tooling task force. Which stems from another thing that is very personal: how the hell do I write a compiler in Haskell, considering that I’ve never taken a single compiler class, and I never had a Haskell advisor for my master’s thesis or PhD thesis.

AL: Why– sorry, perhaps I missed something there– it’s great to write compilers, obviously (laughs), but why do you have to write one, or where is this coming from…?

TC: Well, I don’t have to write one, but you will have to admit that in Haskell, the subject of compilers is very present. Haskell itself is touted as a very useful and very good language to write compilers. From the moment you arrive, you see and you’re being told that Haskell is a very good language for compilers.

And the more you do Haskell, you learn Haskell, you think Haskell, the more you see problems as compiler problems. This may be a bit cliché but, once you get a hammer every problem starts looking like a nail. There is also this part of the ecosystem where we have very good beginning of tutorials, of documentation, but they have their shortcomings.

So it’s partly a tooling project and partly a documentation project. I’ve had great feedback on that and a lot of people felt compelled to be involved. I’ve had people from big corporations like Google coming. Individuals interested in teaching compilers, so I’ve got teachers with me. I’ve got researchers like Csaba Hruska of the GRIN Project. I think I also have a knack for organizing communities and organizing volunteer groups.

AL: Certainly seems that way.

NV: With the compilers teaching, is it independent of Haskell School?

TC: It is independent, yes. It is independent, it is its own thing. Even though we can host some pedagogical content on Haskell School, it merely is just a platform after all. It’s not just about documentation, but it’s also not just about writing compilers or writing tooling. It really is about making Haskell one of the most viable options to write compilers nowadays.

AL: Is it a project that also has a name and can be found somewhere? Haskell School I know where to find it, but this compilers project…

TC: Yes, of course– the compiler tooling task force is present on the HF Slack workspace. We don’t have any repository at the moment, it is still very much bootstrapping, and we are contacting actors of the community to get some interest and also some cooperation. Especially, we have people very interested in finalizing and updating tutorials, and also we are still very much thinking about how we want to invest our efforts to gain some traction and momentum.

AL: One other topic I wanted to bring up before we completely reach the end and forget about it, is that I’ve recently seen you posting things about a new project that you’ve been working on, which I think is aiming to be a reimagination, or a replacement, or an evolution, or I don’t know how you would call it– of Hackage. Is that right?

TC: Yes, absolutely. I’ve posted about it, it’s called Flora. It has come from– well, the many discussions I’ve had with the Hackage team have been nothing but helpful. It allowed me to better understand the shortcomings of Hackage’s current architecture, despite the work that’s being done on it. I also have as a model the very excellent package managers of the Elixir community, and the website lib.rs from the Rust community.

My aim is to create a package manager that can be self-hosted without too much maintenance burden, but also provide an alternative index for the packages. A nicer– it’s silly but, nicer webpages. Just nicer overall design, and putting the most relevant information at the forefront of the design. It is very much a design matter. But also bypassing the shortcomings of Hackage’s architecture. For example, providing a simple and integrated listing of reverse dependencies for a package. Very useful, but Hackage cannot do that at present time.

And also having a proper user system: the Haddock solution– the ad-hoc, sorry– solution that Hackage uses today doesn’t allow you for example to log off from your user account if you’re on a computer. So there are many things. And unfortunately Hackage is a very old codebase. I think it was set in service– it was put in service around 2007 if I’m not mistaken. Or maybe 2011.

AL: I don’t quite know either anymore, sorry. (laughs)

TC: Well, it’s been here forever, let’s be honest.

AL: I mean, there was Hackage and then there was Hackage 2 and I think what we are having now is Hackage 2 and that was already– there was a long gap between Hackage 1 and Hackage 2.

TC: Yup. Indeed.

AL: But I don’t exactly remember the year.

NV: You’d like to change only the frontend, correct?

TC: I think for the main website I would like to expose to the rest of the world, it would be for a time an alternative index, read-only index of Hackage. And when we reach feature parity– the idea is also to give back in terms of innovation.

For example, I took the bet of having namespaces for packages. Now, how the hell can I have namespaces because Hackage does not? Well, packages hosted on Hackage implicitly have the namespace “hackage”. That’s something that we’d like to upstream into Cabal. And other points of innovation where I’m free of the legacy codebase, and I can iterate more rapidly on those things that have been asked by the community of users but could not be delivered.

NV: Why do we need namespaces for the packages?

TC: Ah, well it’s a very good question. First of all, let’s take a look at what we have in Hackage at the moment. We have a system of tags and categories. They more or less serve the same purpose, because they don’t have much difference between them. Categories are mostly free-form text, and so are tags.

But also, you can’t quite control who belongs to a category of package. If I upload a library operating on text, I would give it the “Text” category, which is very– it makes sense. But also, can I provide a group of packages that belong to the same family, and more or less curate the content of this set of packages? For example, when we upload servant libraries, they all start with the servant name, so it’s always servant-server, servant-client, servant-oauth, servant-swagger. We don’t have any means of providing a category that we own, for which we have upload rights, and for which we can guarantee that all the packages in there are made by us, and that you can give us the level of appropriate trust for those packages.

If I wanted to create a rogue package, I would not have any hard time faking belonging to a particular package community. So yes, there is a matter of signalling ownership, also signalling maybe quality, if things are official. For example, I could see containers, text and vector under a “haskell” namespace, because they belong very closely with base and the other libraries– the boot libraries, the core libraries.

I’ve heard of package manager– of repositories, sorry, who had to implement namespaces a bit late. And they could only do an optional version of those namespaces. npmjs comes to mind for example. And some other package repositories have those as a paying feature, which I do not want to have. I think this should be for everyone.

AL: Another advantage is that names do not run out as quickly if you have different namespaces.

TC: Oh absolutely.

AL: …and there’s less battles about good names already having been taken. Although then if you go for a two-layer structure like most people seem to be doing these days, then perhaps the top-level names are the ones that are being fought for (laughs) so who knows.

TC: Yep. Yes, but we can also coordinate with the organizations that would own those namespaces. For example I would not give the “servant” namespace to anyone. I would coordinate with the owners of the organization on GitHub for example. Which Is what Maven I think does. In the Java world, you can have some kind of reverse domain name namespaces. For example, com.github.whatever . And so you have to own the DNS for that, or you have to own the namespace in the the code forge in which you host your packages.

AL: So what’s your goal with this? You’re seeing this to hopefully eventually just replace Hackage as the main package repository, once it has enough features? That’s your current hope?

TC: I would love to! It’s not an easy task: Hackage is a very full-featured package manager, and we have to ensure maximum compatibility with Cabal and existing users. But also at the same time, right now I want to take some bold stances in terms of what I would accept as packages on the index. You have to stay compatible up to a certain point. But it’s also an excellent opportunity to innovate, and go not necessarily faster, but make things better from the beginning because you’ve learned from your elders. And that is something that I can only do.

AL: I’m not sure whether I’ve fully understood this. You said initially you’re aiming this to be a read-only index, not only of Hackage but of other things as well, or only of Hackage?

TC: Oh, only Hackage.

AL: Only Hackage. But then how is that compatible with saying “I’m not sure what I would accept on my side”?

TC: Well for example, starting with Cabal 3, we require license identifiers to be SPDX (Software Package Data Exchange) compatible. It’s something that is handled on the Cabal side by an Either type, which is either an old-style license or a new-style license. Which is the burden of time, it’s normal. But maybe I can enforce that this license must be SPDX compatible, so that we can have tooling to determine if your chain of dependencies is in compliance with the requirements of maybe your legal team. Because SPDX provides a unified terminology of licenses.

AL: Does that mean that you would not index all of Hackage?

TC: Maybe not– I would maybe not, sorry, accept uploads directly to the server with those invalid licenses. I think I would try to stay compatible with Hackage, but also try to orient the new uploaders once we open for submissions towards SPDX licenses for example.

AL: Right, OK. Mm-hm.

TC: It’s a very– it’s a balance to find, between the amount of compatibility you want to have, but also where you want to take the ecosystem. Because it’s about having a vision in the end, you can’t just be passive, be in the reaction of things that happen. If you lead a package repository, it means you have a certain impact on the ecosystem, naturally. But then you need to acknowledge that power that you have, and it’s a bit cliché but use it for good. Or at least if you use it for evil, people will be able to denounce you properly, instead of saying “well, not much activity, this seems to be very passive”. It’s a bit weird.

AL: It’s certainly good that there is movement again in this general area. I think both Cabal and Hackage have always been suffering from an overall lack of maintainers, and Hackage drastically moreso than Cabal I would say. It’s just good to see that something is happening. It’s probably a little bit early to see how it’s exactly going to evolve, but at least it’s creating movement and is giving you an opportunity to try out new things.

Even though I can imagine that it’s a little bit of conflicting goals, of originally just trying to be a read-only index but at the same time wanting to try out lots of new things that only really make sense once you have uploaders– or I guess even the user system that you mentioned does not play that big a role if it’s essentially just a read-only site, right.

TC: Yeah. If we can focus on being a read-only site for a while, it means we don’t have to rush anything sensitive regarding user systems or security. So we can take our time to push the features when they’re ready. Same for the candidates feature: if we don’t upload, we can allow ourselves to work fully on a release candidate feature, and not botch it.

So yes– it’s an interesting journey. I think the most important is to have a vision, a product vision, where you want to take your users. You can backpedal if needed, but I think it’s much easier to go to another direction if you’re wrong, if you’re misguided, rather than get out of your slumber– of your sleep, if the maintainers of the ecosystem have been very passive and somnolent for the past years.

But maybe it’s only my ADHD speaking right now…

AL: (laughs)

TC: …I’m everywhere, I’m always doing something.

NV: I think we all want a better version of Hackage.

TC: Yes we do.

AL: Yeah, I think we’d all be happy. I would be immensely happy to just have reverse dependencies within Hackage. I think I would be even happier if we had support for collections, similar to Stackage, directly within Hackage as well.

TC: Oh that would be wonderful.

AL: So that would be, that would be good. It’s amazing, you have so many projects going on, I think we could probably talk for another hour if we wanted to without any problems but, I think we should also probably come to an end.

TC: Yes.

AL: OK, then thank you so much Théophile for telling us all about your efforts and projects, and I wish you the best for, um, yeah (laughs)

TC: For the rest, yes!

AL: For all these different things, I hope they all are very successful, and perhaps we can talk again in a year or something and see where it all got to.

TC: Absolutely. I was honored to be invited, thank you very much.

NV: Thank you!

8 – Théophile Choutri

Related links

Transcript