Satnam Singh

Transcript


Matthias Pall Gissurarson (0:00:15): Hi, I’m Matti.

Samantha Frohlich (0:00:17): And I’m Sam.

MPG (0:00:19): Today we’re joined by Satnam Singh, taking us through his career, switching from academia to big tech. We’ll talk about convincing people to use Haskell, laying out circuits, why community matters, and not being afraid of losing your job.

All right. Welcome, Satnam.

Satnam Singh (0:00:42): Good afternoon, I guess, in Chalmers and Bristol, and good morning from here on a sunny day in Seattle.

MPG (0:00:49): So we usually start out by asking, how did you get into Haskell?

SS (0:00:56): Maybe I should start from the beginning of my journey into functional programming languages, which was in high school. I had this computer called the BBC Microcomputer, and it came with several languages. I enjoyed programming in all of them. One of them was Lisp, and Lisp really captured my imagination. And I tried to write Lisp interpreters in Lisp and all these usual kinds of things. And as an undergraduate project at the University of Glasgow, I don’t know why, but I thought it would be just a good idea to use a functional language to design some hardware. My project was to do these chips called PAL and PLA chips. They’re a bit like FPGAs. They’re programmable logic chips.

MPG (0:01:32): Right.

SS (0:01:32): And I saw some synergy between functions and trying to describe the behavior of a chip and how a chip works. So I made a wee language called Pineapple. I don’t know why I called it Pineapple. I wrote a wee compiler for it. And I really wanted to write it in Lisp. And I was asking my supervisor, Lewis McKenzie, what functional language was available at the University of Glasgow. And he passed me on to David Watt, who was one of our programming languages lecturers, and he said, “We’ve got this new professor, John Hughes, who’s arrived. And I think somewhere he’s installed some new language, functional language that you can use.” So I find John Hughes’s office, knock on the door, and say, “I’m trying to do this thing in Lisp, et cetera, et cetera.” And he takes me inside and says, “Don’t worry about this Lisp stuff. Here is the path to the Miranda compiler. Use that instead.”

So as an undergraduate, I got into Miranda. I really found it – it was like mind-blowing. It seemed inconceivable that you could write computer programs in this very crazy language that doesn’t have semicolons or assignments or things like that. It just seemed also beautiful in some kind of way. So I ended up – I stayed on at Glasgow to do a PhD with Mary Sheeran, in the same area of trying to look at the intersection of functional programming languages and designing hardware. And all the coding for my PhD was done in Miranda. Yeah, Haskell didn’t exist at the time. I started my PhD in 1987.

Towards the end of my PhD, when I was wrapping up, people started to formulate Haskell. There was a meeting in Glasgow that lots of people came to as part of the, I think, Haskell report process. So towards the end of my PhD, Haskell compilers started to emerge a little bit from Glasgow. Also, I used quite a bit of HBC from Chalmers as well, which for a long time was a pretty good implementation. It was a little bit ahead of where the Glasgow compiler was. And from then, just pretty much for the rest of my career, I’ve managed to find an excuse to use Haskell one way or another. So I guess it was just being in the right place at the right time and a helpful nudge by John Hughes in the right direction.

SF (0:03:32): So you said the synergy between FP and hardware really appealed to you, but to the uninitiated, they may seem really different. Like FP is really abstract, hardware is really down there. What is the correspondence?

SS (0:03:47): I thought it was just like a simple thing. Like if you look at a gate, like a combinational gate, well, how do you describe its behavior? Well, its behavior is a pure function. The output of that logic gate is a function of its inputs, and you can just write it as an expression. And then if you have state, well, what’s a delay element? Well, that’s still quite a pure thing. You can think of an input as being a stream, and then the delay element just delays that stream by one value. That’s quite a simple, beautiful thing. And then you can have a feedback loop, where you bend one of the output wires around and put it back into an input. If you’ve got streams and lazy evaluation, it just all works out. So it’s like magic. It’s not just functional programs, I think, that model circuits very well. It’s lazy functional programs, because when you write a lazy functional program in that style, you can just directly evaluate your description. You’re not creating some DSL or creating some abstract syntax tree in the background that you then walk and interpret. You literally give streams of inputs to those descriptions, and you get streams of outputs, which, for a synchronous circuit, for a singly clocked synchronous circuit, I think very faithfully models what that circuit does. I think it’s quite magical.
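
[Editor’s illustration, not part of the conversation.] The stream view of circuits described here can be sketched in a few lines of Haskell. This is only a minimal sketch under stated assumptions: the names `Signal`, `xor2`, `delay`, and `toggle` are hypothetical and are not taken from Lava or any other library mentioned in the episode. A signal is modelled as a lazy list with one value per clock tick, a gate is a pure function applied pointwise, a delay element prepends an initial value, and the feedback loop is simply a recursive definition that lazy evaluation makes well-defined.

```haskell
-- A signal is a lazy list of values, one per clock tick (an assumption
-- made for this sketch, not a library type).
type Signal a = [a]

-- A combinational gate is a pure function, applied tick by tick.
xor2 :: Signal Bool -> Signal Bool -> Signal Bool
xor2 = zipWith (/=)

-- A delay element (register): emits an initial value, then the input
-- stream shifted by one tick.
delay :: a -> Signal a -> Signal a
delay initial input = initial : input

-- A toggle circuit with feedback: the register output is wired back
-- into the XOR gate. The recursive definition of 'out' is the bent wire.
toggle :: Signal Bool -> Signal Bool
toggle enable = out
  where
    out  = delay False next
    next = xor2 enable out

main :: IO ()
main = print (take 6 (toggle (repeat True)))
-- prints [False,True,False,True,False,True]
```

Nothing here builds or interprets a syntax tree. Feeding a stream of inputs to `toggle` directly yields the stream of outputs, which is the property highlighted above for singly clocked synchronous circuits.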

MPG (0:04:55): Right. And this was the Lava project, right?

SS (0:04:58): That’s right. I mean, it went on to be Lava, right? I mean, I think I was doing things of that type long before Lava. Lava was created after I left academia and I was working in industry. I was working at this company called Xilinx, an FPGA company. After many years of researching and using FPGA chips as a lecturer in Glasgow, I quit my job in academia to join Xilinx. Of course, I was still looking for excuses to use Haskell. I’d joined this very traditional hardware company where people design most hardware in VHDL or Verilog.

One of the things I really got a lot of inspiration from was Henderson’s Escher fish programs that he had written, these geometric combinators to let you draw these fish. You’ve got a small number of combinators, a small number of base tiles. And with those few lines of code, of clever recursive code, you’ve got these wonderful Escher diagrams. And I thought, ‘This is really cool.’ There’s not a single XY coordinate in this. And you’ve got these nice pictures. And then my PhD supervisor, Mary Sheeran, she had made this language that knew a bit about layout, where you could describe that circuit A is beside circuit B, or you could have a row of A, or a column of B, or a triangle of C, or whatever. And then you could rotate and flip. Then you had acetates from back in the day where we had overhead projector slides and do all these neat transformations. I found those useful in my job at Xilinx, where we had these circuits, where we had all these triangles of delays to do pipeline delays, and we were building these filter circuits that are like a multiply-add component that you then stamp out multiple times. So in my mind, I had a picture of how these things should be laid out on the chip.

But the normal way you describe these circuits, you don’t get that layout. The input is this graph. The graph has got no spatial information. And you get this rat’s nest of wires. That doesn’t look like the beautiful picture in your mind or on your whiteboard. And most of the time, people put up with that. But sometimes, customers couldn’t get their design to fit in a chip of the size they wanted to use. And in theory, there are enough lookup tables and registers and wires for that design to go there. But this simulated annealing automatic placement and routing algorithm couldn’t pack it in, whereas I would look at it again and go, “Well, you’re doing this software-defined radio thing with these filters and these things. Should we lay it horizontally like this? And they’re like that.” So I just made a wee DSL in Haskell where I could do this kind of Escher or Mary Sheeran Ruby-style description: for every circuit, I encoded its width and its height. And then I made the combinators for row and column and so on. And I made this APLs point for you to plug these things together. And then you could get quite sophisticated circuit layouts without uttering a single XY coordinate. You got exactly the layout you wanted, right? When you ran this tool, it produced VHDL or Verilog that instantiated every gate and had a computed XY coordinate, right? For each one of these gates. And then when you put it through the tools and you looked at it, you had this like solid rectangle that was tightly packed that used every component of the chip. And it went really fast, and the customer was happy, and we could ship it. So I call that Lava.
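
[Editor’s illustration, not part of the conversation.] The layout idea described here, where each circuit carries a width and a height and combinators compute the coordinates, might be sketched roughly as follows. This is a hedged toy model, not Lava’s actual API: the names `Block`, `gate`, `beside`, `below`, `row`, `column`, and `mac` are all hypothetical, and every gate is assumed to occupy a single 1x1 cell.

```haskell
-- A block of placed gates: its bounding box plus each gate's name and
-- XY coordinate relative to the block's lower-left corner.
data Block = Block
  { width  :: Int
  , height :: Int
  , gates  :: [(String, (Int, Int))]
  } deriving (Show, Eq)

-- A single gate occupying one cell (an assumption of this sketch).
gate :: String -> Block
gate name = Block 1 1 [(name, (0, 0))]

-- The empty layout, used as the unit for row and column.
empty :: Block
empty = Block 0 0 []

-- Translate every gate in a block by (dx, dy).
shift :: Int -> Int -> Block -> [(String, (Int, Int))]
shift dx dy b = [ (n, (x + dx, y + dy)) | (n, (x, y)) <- gates b ]

-- Place b to the right of a.
beside :: Block -> Block -> Block
beside a b =
  Block (width a + width b) (max (height a) (height b))
        (gates a ++ shift (width a) 0 b)

-- Place b on top of a.
below :: Block -> Block -> Block
below a b =
  Block (max (width a) (width b)) (height a + height b)
        (gates a ++ shift 0 (height a) b)

row, column :: [Block] -> Block
row    = foldr beside empty
column = foldr below empty

-- A hypothetical multiply-add cell, stamped out three times in a column.
mac :: Int -> Block
mac i = row [gate ("mul" ++ show i), gate ("add" ++ show i)]

filterColumn :: Block
filterColumn = column (map mac [0 .. 2])

main :: IO ()
main = mapM_ print (gates filterColumn)
```

The description of `filterColumn` never mentions a coordinate, yet every gate ends up with one: `mul0` at (0,0), `add0` at (1,0), `mul1` at (0,1), and so on up the column, inside a 2 by 3 bounding box. A real tool would then emit those coordinates as placement constraints alongside the generated VHDL or Verilog.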

I’ve always got my best ideas when I’m on holiday or cycling or running. And we were in Hawaii walking over these kind of – what were lava fields. And the idea of something that was very molten and fluid at one point and then solidified at another point, it really matched my view of FPGAs. It’s a little bit of how we were programming the FPGAs we saw. So that’s where I came up with Lava. And then we added other features to this, especially Mary and Koen and so on, to do with things like formal verification and model checking, et cetera. And that framework could be used for many, many things, not just implementation, but all kinds of things. So circuits could draw themselves. For example, you could work out critical paths. You could work out testability information. And that was all done in Haskell. By then, Haskell had been around a while. And that was satisfying because you were able to solve a problem that other people couldn’t solve using normal languages.

And earlier in my career, I’d been a proselytizing advocate of functional programming. I’d tell people they should write their programs in Haskell. And I really tried to convince other people. What I did learn was that people will do something different only if they’re desperate, if they’ve tried every other way of doing it normally, using their normal tools and flows, and they’re lying on the floor bleeding, and they’ve exhausted all other avenues of doing things in C++ or whatever it is. And if you happen to have something that solves their problem, then they’ll reluctantly come to you and accept it. And this is one of these examples where it wasn’t tractable using the conventional techniques to do these really compact, high-speed, nicely laid out circuits. People didn’t have another way of doing it. So they put up with a strange person in this corner of Xilinx that knew how to do this. But it wasn’t like a general success, right? The company didn’t generally move to using this style of circuit design for most of the circuits that they did. Just occasionally when some customer needed like super high performance.

MPG (0:09:38): So you mentioned that you left academia to go to industry. So how did that come about? Because I feel like a lot of us are very much in one world, but you have the insight into both.

SS (0:09:49): A lot of people talk about careers that they’ll say they went from A to B to C to D as if there’s some kind of grand plan of how – that’s how it was always going to be. They’re just executing the plan, and they’ll get to their ultimate destination with a great deal of thought and consideration about how they got from A to B or B to C. And I have to say it’s not been like that for me at all. Everything’s been incredibly random and unexpected and unplanned. So I can’t give this five year Soviet master plan, whatever view, right?

So I was a lecturer at the University of Glasgow. I had a research group, PhD students, postdocs, grants, and my partner, Susan, was also doing a PhD at the University of Glasgow. And for the summer, she had got an internship at Sun Microsystems. She worked on this very cool persistent Java project, pre-Java. So she was going to go for three months to California. It’s a bit miserable. I don’t really want to be three months without Susan. So I thought, I’ll speak to my pals at Xilinx, who happen to have an office in Edinburgh, to see if there’s any way they could wangle some kind of visiting position for me at Xilinx in California for a few months so I could be with Susan, which they did. So I got to be an intern as well for a few months at Xilinx. And I had to do something while I was there. So I made this Photoshop accelerator. So they had this new PCI-based card with FPGA chips in it. And I worked out how to implement things like Gaussian blur and things like that to run – back then, it actually wasn’t accelerated. So I made the hardware for this accelerator in Haskell, of course. And then in C++, I wrote all the Photoshop plugin stuff, transferred the buffer from the PC’s memory to the card, did all the Gaussian blur and all the other filters, and got the result back. And I had this end-to-end demo at the end of the summer. I think I demoed it to the VP and things like that on my way out and said, “Thanks. It was nice working here.” Back off to the University of Glasgow. On my way out, they just handed me a job offer, which I wasn’t expecting because I had conceptualized my career as I really wanted to be a lecturer. I knew that even when I was an undergraduate. And I became a lecturer at the University of Glasgow, which I thought was fabulous. Well, this is what I’m going to do forever until I retire, right? Why would I do anything else? So that destabilized me a little bit.

By that time, I think I’d been a lecturer for like seven years or something like that. And I was fortunate to have a permanent position, but it was bloody hard work. I worked like an insane number of hours. It was just very, very demanding. I mean, I often wouldn’t see much of Susan or my mother or my friends. And it was not very well paid. I was like scraping by. And I remember that I had to, at one point, borrow money from the TSB Bank to give it back to the TSB Bank so I could make my mortgage payment that month. And that was a little bit soul destroying. So that planted a seed of an idea in my mind that maybe I should go and do something where I can earn just a bit more money and not scrape by from month to month. Eventually, I said, “Okay, I’m going to try something.” Also, I’ve always wanted to apply theory to practice, right? To solve interesting problems. And as an academic, I mean, I could write papers and I could make these prototypes. I could give all these demos, but it’s hard to do things which really solve some problem that somebody really cares about and make some product which then gets shipped, right? So that’s a difficult thing to achieve. At least it was for me as a lecturer at the University of Glasgow.

So having had this little wee project where I did something which applied actually a lot of theory, but at the end of the day made something tangible, where you fire up Photoshop and it’s got its extra menus to let you do your blur on these FPGA chips programmed in Haskell, I thought, ‘That’s quite exciting. Maybe I should do more of that.’ So then I thought, ‘Yeah, I’m going to quit my position and go to Xilinx.’ And I think it was a great move. I still carried on publishing papers. I still did researchy kind of things. I went to FPGA conferences, functional programming conferences. I didn’t have the teaching. I didn’t have a collegiate atmosphere. So there’s a lot of things I did miss very much from being a lecturer. But I had a more sane life. I had more time to go for a hike or bike rides, or I really got into cooking as well because I had a bit more time, right? I could have a bit of a life. I thought I never had a life as a lecturer. I just worked all the time.

SF (0:13:54): And Xilinx isn’t the only industry company you’ve worked for. Could you tell us a bit more maybe about your time at Google?

SS (0:14:03): Yes. So I was working at Microsoft before I worked at Google, at Microsoft Research Cambridge. And so this is the same story, but maybe another octave up or something like that. So at Microsoft Research Cambridge – and before that, I had worked at Microsoft Product Groups in Seattle, but then I moved to Cambridge to join the research organization. Again, I was trying to apply programming language theory and concepts to try to solve practical problems. At that point, I was trying to get people interested in hardware-based machine learning accelerators, which I designed either DSLs in F# or in Haskell or Bluespec or whatever. I was trying to make hardware design appear more like software design and better tooling and things like that.

I was publishing papers, publishing lots of papers, but I wasn’t getting any impact in Redmond, right? At the headquarters of Microsoft, to get my stuff into production and used in real chips for accelerating machine learning. I got a bit dispirited about that because I didn’t really care about publishing papers, to be honest. And I think really, when I look back, what you had to do was what somebody like Doug Burger did, which was quit his position at university, move himself and his family to Redmond, and then go into the office and bang the table and stare people in the eye physically in the same room and push it to make it happen. It was hard to do this from a research lab in the fens in Cambridge.

I got approached by Google to go to a product group to work on a machine learning accelerator for search. This is what I really want to do. If I really want to have this impact, I have to go into the belly of the beast to do it. So that caused us to move from Cambridge to California. And I got interviewed for this position, right? To this Hardware Acceleration Research Project. But the hiring process at Google takes a long time. So by the time I arrived at Google, that project had been cancelled. I had just moved myself and my family to California and quit my job. Otherwise, it’s a very nice job at Microsoft Research Cambridge. And I turn up, and they say, “Well, that project doesn’t exist. You’re going to do this distributed systems configuration management project instead.” I said, “What?” I thought, ‘I don’t know anything about distributed systems. You should have hired my wife. She’s a distributed systems expert researcher. You’ve got the wrong person here.’ But no, that’s what I had to do.

So I then started a career as a professional software engineer at age 45, which was like a big shock because I’d worked in product groups before a little bit, but mostly in advanced development or in research labs. I built prototypes. Now I was like a production software engineer at Google, and I just didn’t know how to code. I didn’t know how to code at a professional level. I mean, I coded as a teenager, right? And written literally, probably a million lines of code. But to code like a professional software engineer with all the processes, the flows, the code review, coding standards, guidelines—I didn’t know how to do that. And it was quite a grounding, humiliating process. My tech lead was half my age or something like that. He was very patient with me, and him and the rest of the team really taught me how to be a professional software engineer. And that was a very slow, difficult process. I’m glad I went for it. I learned an awful lot of what it means to be a professional software engineer, and it was tough.

And so I worked in configuration management, learned about distributed systems, learned how to configure these planet-scale deployments of these services at Google, and it accumulated me. I mean, I was, again, right place, right time. I joined this project, Kubernetes. I mean, I wasn’t part of the founding team, but I just joined shortly after that. And I worked quite intensely. And I mean, I think at one point, I was the second or third top committer on GitHub for it, and I’d written a lot of the code. And now bizarre things happen. Like my daughter, she’s doing an internship at Tesla, and she comes home and says, “Dad, I ran this command called kubectl.” I said, “Well, yeah, I know quite a bit of that command.” So it’s very strange to have your child run software that you’ve written. It’s a bit like mind-bending. I never thought that would ever happen. And Kubernetes went on to be a huge, huge, huge success. Pretty much everybody on LinkedIn who approaches me for a job, they want me to be head of Citibank, DevOps, something like that. Totally inappropriate job for me to get. But still, bizarrely, it’s the most successful thing I’ve worked on in my career. And it’s also the most unrepresentative thing I’ve worked on in my career. So that was my first epoch working at Google. So after the success of Kubernetes, I thought we should quit while we’re ahead. So I quit Google and went somewhere else.

MPG (0:18:13): Right. So where did you go after your first epoch?

SS (0:18:15): So when I quit Google, I went to Facebook and I worked on something completely different, back to languages, and I worked on optimizing DEX bytecode. So Android apps run something that looks like Java bytecode. It’s a variant of it. It’s called DEX bytecode. And there’s a horrible history here to do with Oracle. So I joined a project that was trying to compress the size of these DEX files to make them as small as possible. So I wrote C++ to optimize Java, if that makes any sense. And that’s the final step in producing these things called APKs. That’s the packaged thing that gets deployed that you install on your Android phone. And that’s quite exciting. And it’s super impactful. Like when I was on call, I built the APKs for Facebook and Messenger and things like that. And that would go out to an alpha channel to make sure that it didn’t insta-crash. And I remember on Fridays when I would take the Caltrain to San Francisco, I’m looking over the shoulders of people who’ve got Android phones. And as they’re scrolling, I’m kind of hoping the scrolling is smooth and they’re not getting white screens of death, which occasionally happened, because then I’m on call and I’m going back into the office to fix that.

And that was good. It was very metrics-driven. You could see, for every build, how much space you managed to save. But it’s also scary because the way you compress, the way you optimize your files, is by deleting methods and fields which you thought were probably not going to get used. Because in general, you can’t do this in Java, right? Because in Java, there can always be more code. At some point through the internet, more code can land. So how do you know what’s dead code, right? But in this case, you have this closed-world assumption. So you can play all kinds of quite crazy tricks. So sometimes I would delete methods or fields that I shouldn’t have. It’s a heuristic, right? Because you can’t be sure. And then someone will get an insta-crash when they do the appropriate navigation and – yeah, that was exciting. I enjoyed that very much. It was a very different gig.
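
[Editor’s illustration, not part of the conversation.] The closed-world reasoning described here can be sketched as a reachability pass over a call graph: anything not reachable from the known entry points is treated as dead and can be deleted. This is a simplified, hypothetical model and not ReDex’s actual algorithm; the names `CallGraph`, `reachable`, `deadMethods`, and the method names in `exampleGraph` are all invented for the example.

```haskell
import qualified Data.Map.Strict as Map
import qualified Data.Set as Set

type Method    = String
type CallGraph = Map.Map Method [Method]  -- method -> methods it calls

-- Every method reachable from the given entry points, following the
-- statically known call edges (a simple worklist traversal).
reachable :: CallGraph -> [Method] -> Set.Set Method
reachable graph = go Set.empty
  where
    go seen [] = seen
    go seen (m : rest)
      | m `Set.member` seen = go seen rest
      | otherwise =
          go (Set.insert m seen) (Map.findWithDefault [] m graph ++ rest)

-- Under the closed-world assumption, anything not reachable is dead.
deadMethods :: CallGraph -> [Method] -> [Method]
deadMethods graph roots =
  [ m | m <- Map.keys graph, m `Set.notMember` live ]
  where
    live = reachable graph roots

-- A hypothetical app: 'legacyExport' and 'debugDump' have no callers.
exampleGraph :: CallGraph
exampleGraph = Map.fromList
  [ ("main",         ["render", "log"])
  , ("render",       ["blur"])
  , ("blur",         [])
  , ("log",          [])
  , ("legacyExport", ["blur"])
  , ("debugDump",    [])
  ]

main :: IO ()
main = print (deadMethods exampleGraph ["main"])
-- prints ["debugDump","legacyExport"]
```

The risk mentioned above shows up directly in this model. If `legacyExport` were in fact invoked through reflection, the static call graph would have no edge to it, the pass would delete it, and the app would crash when a user navigated to that feature.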

So still using my skills from the world of programming languages and having done a PhD in the world of PL, et cetera, become a better software engineer from the skills that I learned at Google, but then all the details to do with inner classes and stuff like that.

MPG (0:20:16): Were you able to use some Haskell for that?

SS (0:20:19): Not for that. Other people did a fair amount of Haskell. I’m sure you know people like Simon Marlow and so on. But I wanted to work in headquarters. I didn’t want to move my family again because I think I’d get divorced. So there wasn’t a Haskell project opportunity in Menlo Park, which would have been nice. But also, I like the idea of doing something really – and I didn’t view that as being the project I would do all the time. I wanted to start off doing something impactful and useful. A little bit of credibility. And the other thing I learnt the hard way a little bit when I joined Google, I thought I had some credibility or a modicum of respect based on all my academic work and all the papers I had published and all the commits. Actually, no, that just counted for zero. You start with nothing, and you have to work your reputation up from day zero with nothing and all that other stuff. It may have got you to an interview stage, but it didn’t matter beyond that point.

I learned that if you want to do interesting things, maybe it’s good first to do some smaller projects, get some credibility, respect, demonstrate capability, whatever, gain some credits, and then spend the credits later doing crazier things that are higher risk and require a company to have a bit more faith in you. And I viewed this DEX project that way. It’s called ReDex. It actually is an open source project. The code is all on GitHub. You can actually see everything I did. I thought, ‘Yeah, I’ll do this ReDex project for a bit and get some tokens of credit because I arrived at zero. And then I’ll go and do something, maybe create my own.’ Because that wasn’t my project. I joined somebody else’s project. So I’ll get a bit of credibility on that. Then I’ll create my project to do something. Again, I was still wanting to do something in the direction of accelerating machine learning with special-purpose hardware, which I think at that time, Facebook really needed and wasn’t really doing. It was betting on commodity hardware, and it was committed to betting on just using Nvidia GPUs, which I think was a bad call by the company at that point. But I think one issue with Facebook, which is otherwise a great company, is that it is very tactical and it thinks in a very short term, and it’s very slow to make more strategic decisions and do longer-term projects. The TTL for projects is just far too low at Facebook/Meta, and I think it suffers as a consequence.

MPG (0:22:34): But then you – I mean, these companies, they did end up doing some AI acceleration or ML acceleration, right, with the tensor chips and all of that.

SS (0:22:43): That’s much – that was afterwards. That was later. Yeah. So, I mean, I can’t remember what year it was, but this was before. Facebook eventually had to give in to that pressure, which I think was inevitable, right? But it’s now late in the game, right? It’s a little bit – it’s behind Google and Amazon and Microsoft, which I think are much further ahead in using specialized hardware for machine learning, networking, other kinds of tasks, et cetera, because you’re doing these operations at such huge scale. Even the tiniest percentage improvements in performance, energy, whatever, give gigantic savings in dollars, megawatts.

SF (0:23:20): So where did you go next, armed with your credibility?

SS (0:23:24): So I never got a chance to spend my tokens because then the next unexpected NMI happened, and I got contacted by some people at X, which is this startup incubator at Google called Google X. And a bunch of people were making a machine learning chip. I thought, ‘Great. This is what I’ve been trying to do all these years.’ And they were designing it in Bluespec, which is this Haskell-based hardware description language. And I thought, ‘This sounds very Satnam-esque. I’ll go and do that.’ I didn’t stay at Facebook for very many years. So it was a bit sad to leave it. I didn’t leave it because I wasn’t happy. I thought it was a great place. But working on a machine learning chip in Bluespec, that seemed too crazy and too great a deal to turn down. So I went back to the Google empire, and I worked on this fantastic project with people like Larry Augustin. Larry sat to my right. I hired Andy Gill, and he moved to California. It was really just a fantastic team, and we designed this machine learning chip in Bluespec. We fabricated it. It came back, and it worked the first time. We made a compiler for it in Haskell. And so it was just a fantastic den of Haskell and Bluespec hackery. That was like a fabulous thing. It totally failed as a product, but it was a great project to work on. Well, why it failed as a product, that’s either a different podcast or a whisky conversation.

MPG (0:24:43): Right. Yeah.

SF (0:24:44): Worked first time though, you must have done a – yeah.

SS (0:24:48): I did. I think that’s quite cool. I mean, I’ve worked on another chip now at the company I currently work at, and it’s not done in Bluespec yet. It’s done in SystemVerilog. And my work on it was done in SystemVerilog, a standard hardware description language. So I’ve got my fingers crossed that it might work first time as well, but we’ll wait and see. I think we’ll find out in a few months.

MPG (0:25:05): But then now you’re at Groq, right? And they do a lot of Haskell there, no?

SS (0:25:09): Yeah. So they do a lot of Haskell. That’s where I did the chip. The chip work is sadly not in Haskell. But there’s a lot of – I should have counted how many Haskell people we have, but we have quite a lot. The company has got three main technical pillars to it. One is a bunch of hardware engineers that make these special chips to accelerate machine learning inference, linear algebra accelerators focused on matrix multiply, which is a key operation you care about for inference. Then there are the compiler people, and their job is to take these abstract linear algebra graphs from TensorFlow, PyTorch, et cetera, and do the appropriate transformations to them to map them onto the spatial compute engine. So somewhat similar to what FPGA tools do. And then we’ve got, recently, a cloud division, because we use our chips to actually host large language models, and people can, through the internet, through a web page or API, query them.

So the compiler is in two parts. The front part of the compiler used to be in Haskell, but the industry has really converged on this MLIR infrastructure for compiling machine learning models. It’s like the LLVM of machine learning hardware. And it was just too hard in Haskell to keep re-implementing all this functionality, which was a shared thing that many, many other people could have used. So we thought, ‘Okay, we’ll just use the MLIR front end,’ which is a bunch of C++. So the front end of the compiler is now C++. But as soon as you lower to the intermediate representations that are like our chip, the closer you get to that, the closer you get to the bit of the code that’s still in Haskell, especially the assembler, for example. So the lower levels of the compiler are in Haskell, and the higher levels of the compiler are in C++. And Haskell gets used for bits of the infrastructure as well. And I do Haskell hackery myself as well. I’ve got a wee domain-specific language I call Haste that I use to program our chips. So it will generate the IR for our chip. And then I run my FFTs and sorters and my usual Haskell combinator-based circuit descriptions. They look like circuit descriptions, but they are mapped to programs that run on our chips.

MPG (0:27:16): I guess it was one of those companies that sprung up and then suddenly was hiring all the Haskell people, and everyone was like, “Oh, what are they doing?”

SS (0:27:23): Yeah. So I mean, it was founded by Jonathan Ross, who was a big fan of Haskell and of Bluespec and of Agda. So he’s very much from my tribe, which is not an accident, because I came across Jonathan in 2012, the year I joined Google. And he was like, I think, some kind of a test engineer or something like that in New York doing something completely inappropriate, given his skills and background and talents. And I knew at that time that Google was trying to create a new machine-learning chip, so I introduced him to Jeff Dean. And through Jeff Dean, he got onto the ground floor, right, of the first TPU, this special chip that Google made for doing machine learning, and I think he did his work for that chip in Bluespec. So yeah, I think he wrote some notes about how does one sensibly design any hardware without using something terrible like Verilog. And so I told him, “Well, we should always just use Haskell. We should always use Bluespec.” And he believed me, and he did it, and he followed through on it. So he was a convert to our tribe and then went and founded, and became the CEO of, a company with the same values as our tribe. So that’s one of the reasons why a lot of the software is done in Haskell. And I think he would like it if the chip was done in Bluespec or something like that, but it’s hard to hire enough people with those skills. And then if you hire people who have got the normal skills, then you can’t convince them to do something different.

MPG (0:28:42): Right. Because you’ve done a lot of this community building. Remember, you keep – on Twitter, you keep talking about this FP Castle that you’re preparing, right?

SS (0:28:53): Yeah, I mean, this is like a – who knows what I would do? This is an aspirational kind of thing. There’s something appealing about having something multidisciplinary and unstructured where people just come and drop by. It’s just like a jazz band gigging kind of thing. You do your stuff, they do their stuff. And sometimes things intersect and combine, and you create something new. And sometimes maybe not, or maybe just you learn something new. And maybe it’s functional programming. Maybe it’s programming in general. Maybe it’s something else, something not even technical. And some people will come for one or two months. Maybe some people come for a year. And just to have a place where people could just be creative and riff off of each other. I think that’d be cool, right? It’d be a little bit inspired by things like Dagstuhl, but Dagstuhl is like super focused, and it’s for one week. I’d quite like that. I just don’t have enough money to make that happen. So I have to work out how to create some kind of highly valuable startup, have it get bought by some big megacorp. And then I’ll do my Scottish castle. But I also have a love for the Scottish countryside, especially the West Coast of Scotland. It’s got scenery, whisky, hiking. I find it inspirational. And often I’ve had some of my best ideas when I’m out and about in nature, having a hike, going for a swim, whatever. So I think it’s nice to get people to go to stunningly interesting, beautiful places for creativity and mental health.

SF (0:30:15): I’m in. Sign me up.

SS (0:30:20): I think your name’s on the list.

SF (0:30:21): Oh, yes.

SS (0:30:22): I’ll have to check because anyone – you send me a PR, because it’s like a GitHub page just saying that. I’ll just approve the PR. So I mean, will I really do it? I don’t know. But I think even if I don’t do it, I think it’s great to have it as an aspiration. I mean, one of my problems is that one of the people I would like to do it with is Conor McBride. He’s completely anti-car. And the problem of picking somewhere remote is you’re forced to get there by car, whereas he would prefer it to be somewhere you can get to by train. So I’m thinking of somewhere like on Skye, with some real beautiful castle and some ruins and all that, whereas Conor’s idea is like take the train to Ardrossan and go to this island called Arran.

SF (0:31:01): That’s that. That was what I was going to suggest. What a great island.

SS (0:31:06): And then there’s this brutalist old school or college, whatever, and that could be FP Castle, but I think that’s not quite what I had in mind. So Conor –

SF (0:31:15): They do have two whisky distilleries on Arran, so.

SS (0:31:18): They have two, right? Yeah. So I’ve been following the fate of the Arran one for a while, and I think it produces a really great whisky. I don’t know about the second one, actually. I have to go get educated about that.

SF (0:31:27): Well, maybe that’ll change your mind. Speaking of FP Castle, you’re probably going to do quite a lot of your recruitment from your social media presence. You’re quite active on Twitter.

SS (0:31:38): Yeah. I mean, I originally joined Twitter right at the beginning, a long time ago. So my original Twitter handle was my name, SatnamSingh, and I gave it up because there’s many, many famous Satnam Singhs, and one of them is this basketball player in Canada. So if you Google Satnam Singh, that’s the person you get. I got fed up with TV studios contacting me wanting to interview me because they don’t want to interview me. They want to interview the actual Satnam Singh in Canada. So I gave that Twitter handle up. I rejoined it during the pandemic, I think, or just before. I wasn’t really a user of Twitter before that. But I really got into it during the pandemic because otherwise, it’s socially alienating, right? So you have to find some substitute for the community that you had in person. So I put a bit of effort into it during the pandemic. I liked it. I really enjoyed it. And there’s a lot of controversy, right, around Twitter.

I’ve loved my Twitter prism because I don’t use it for politics, right? So I’m not having shouting matches with people over political issues. So I do it for technical things. I do it for food. I do it for whisky. I do it for travel. And these communities on Twitter are just wonderful. Just absolutely wonderful places to be. I really enjoyed them. So I think they’re quite a valuable thing. Most of these communities haven’t eroded too much of late. I would say that the technical community perhaps has because a lot of people have gone to Mastodon, which I haven’t really engaged in. So I’ve lost some people. There are people I would see often that I quite liked seeing on Twitter, either for their technical posts or whatever, jokes about cows or whatever. And they’re gone. And that’s a shame.

I just can’t get – Mastodon just doesn’t do it for me, right? It just doesn’t have the electricity and flow and the banter, the pub banter nature that Twitter does for me, for the topics that I engage in. I think you have to put things into something like that to get something out of it. So I also post and I comment, and then other people go. And then I get to know other people. And I’m very much into cooking. I’m quite amazed that some quite famous chefs follow me on Twitter. That’s something I never expected. I’m very much into music. Some of the musicians I’ve really loved that I grew up with follow me, like Ricky Ross from his band Deacon Blue, which I really love – my domain, raintown.org, is named after Deacon Blue’s first album, Raintown, which alludes to Glasgow, alludes to all the rainy places I’ve lived, including Seattle, right, where I am now. When I was a kid, I would never have imagined Ricky Ross would follow me on Twitter. I was having a chat with Nigella Lawson the other day. Mind-bending stuff, really.

SF (0:34:07): How do you think your social media presence has affected your role in the Haskell community?

SS (0:34:14): I’m not sure about that. I mean, I don’t view myself as a very central person in the Haskell community because I’m not a researcher anymore, right? When I quit Microsoft Research in 2011, I stopped publishing. I haven’t published since then. So what I try to do is maybe I’ll retweet if someone’s got a great paper or looking for someone to hire or has given a good talk. So I try and use that to amplify other people’s cool stuff. But I’m not sure it has got much to do with me. I’m just a conduit of getting information from point A to point B. So I guess that’s a good thing I’m trying to do, right, with the following that I have, to promote other people and other people’s work to a wider audience.

Otherwise, I think Haskell people don’t really come across me because I’m not an active contributor to the Haskell compiler or libraries or whatever because that’s not what my day job has involved. And then occasionally, I go to a Haskell – I’ll go to ICFP or whatever, but I’m not giving a talk or anything typically because I don’t write papers. So I’m more just milling around chatting to people. Occasionally, people will come up to me and say hi, which is great. I really like that, and I like asking people what they work on, what excites them, and their view of the future. So maybe I get a few more engagements like that because people recognize me from my Twitter picture and they’ll come and say hi. I mean, I use it to promote things I care about.

I really like the programming languages research community, which overall I think is a nice community. And I’ve worked in a few research communities. They’re not all nice, but the programming languages one in general is. And I think that just doesn’t happen by accident. It happens by people actively working to make it nice and keep it nice and to develop the community, to grow it, and to make it more inclusive.

One thing I really care a lot about are things like bad behavior, bullying, harassment. So I bang on about that fairly frequently. I mean, on my own website, I’ve got an article about bullying and harassment at the workplace and how to deal with it. So I tweet that and hope that I guess the people in the PL community, some of them find it useful. In fact, quite often when I’m at a conference, on several days I’ll go and have one-to-one meetings with people who read that article and identify that something terrible has happened to them and they want to discuss it and relate to it or get more advice. So I find myself doing quite a bit of that. And that’s a good thing too, right? Because they wouldn’t have found that article otherwise.

SF (0:36:38): Does that lead into your role in SIGPLAN? Because I know you do some mentoring there.

SS (0:36:43): Yeah. So for a few years, I’ve been a SIGPLAN-M mentor, which I enjoy doing very much. There’s lots of different types of SIGPLAN-M mentors. I typically get students that are wondering about a career, doing research, and thinking about maybe going to industry or an industrial research lab or maybe into a product group and wondering about that transition and what that’s like. So that’s the stereotype I get pigeonholed for, which I’m very happy to do. Yeah, I think I’m onto my third student. I enjoy that. And before that, I was elected to SIGPLAN. So I spent my – I did my, whatever it was, four years. And I ran the awards process, which was very, very rewarding for me. A lot of hard work, but also it just opened my eyes to – I mean, I knew that people around me are brilliant, but when you’re the awards chair at SIGPLAN, you, at a very high level of detail, look at someone’s career and their achievements and you write a report and other people write a report. And then you have many of these people, and they’re all absolutely brilliant, and you have to pick one to get an award. And it all seems like crazy and arbitrary because they should all get awards because they’re all brilliant. And I find that quite inspiring to see, to understand more clearly what someone’s career is like. Then it was my job to write this paragraph or two, a synopsis summary of someone’s career and why they got that award, which you can now read, right, on the SIGPLAN website.

I think we all benefit from the community in the way it is, and I think it’s good to give back to the community, which you can do in a variety of ways and help shape it. And running for and getting elected to SIGPLAN is one of the ways in which you could do it. It’s not the only way, but I think it’s one good way. If you happen to have the time and the energy and the opportunity and your employer will let you, that’s something I would encourage people to think about.

MPG (0:38:21): Right. So it’s not just for academics.

SS (0:38:26): If you look at SIGPLAN – I mean, I can’t remember who’s on the committee at the moment – almost everyone is an academic. I would love to see more industry people on SIGPLAN. So I think I was an outlier. I’m not sure how many other – when I was in SIGPLAN, how many other people on the committee were from industry. I mean, I might be the only one, or maybe there was one other person. I can’t remember. I think there is such a great scope to apply PL ideas and concepts in wider computer science. So I would like to see somehow that represented in things like the makeup of SIGPLAN or our program committees for conferences.

I don’t want people to feel that if they do PL research, they have to be a PL professor and work on just programming languages and type theory or whatever. I think it comes with this general education about how to solve problems, how to analyze, how to give presentations, persuade people, really powerful analytic mental skills, which have got wide applicability, probably even beyond computer science. And I think we don’t make enough noise about that, right? Sometimes I come across some PhD students who are towards the end of their tenure, and they’re a little bit dispirited, going, “Well, why am I doing this PL thing? It seems so esoteric, and I don’t really want to be a PL professor or blah, blah, blah.” And it’s important to emphasize that, actually, you might be working on some theorem prover for too many hours every day for n days in a row, but at the end of this, you’re going to get this general education, training, apprenticeship, mentorship, that’s going to make you a better scientist. And then you can go in and solve lots of problems.

MPG (0:39:53): Yeah, I think you’re one of the inspirations for the rest of us who are like, “Oh, we can still be 45 and go on to have a successful software engineering career if we give up on this one.” Right?

SS (0:40:05): I guess another thing that’s a bit unusual is I have changed jobs reasonably often compared to – some people do change jobs like once or twice or a few times. I changed jobs a lot. At first, I wanted to have a job and have it forever. That’s what I thought about my first job, because I got this permanent appointment, because I had this fear of unemployment. I grew up in a family where, when I was relatively young, my father lost his job, and we grew up on unemployment benefit, on welfare. To me, UB40 is not a song. It’s a form that was pivotal in giving a bit of money to a family so that we could subsist. So I had this ingrained fear of unemployment and not being able to provide for my family, which I had to really overcome, right, to quit my first job. And now I’ve adopted a different mindset, which is I don’t care if I get laid off. I don’t care if this job disappears or I don’t like it because there’ll be another one. I can move on to something else. And I shouldn’t let fear of unemployment make me carry on doing something which is not good for me or not good for the company or whatever. Always be ready to shift to do the next thing and have no fear. And it has many caveats, right? You can’t always do that. Sometimes the economy is terrible. Sometimes you can’t get another job. But things do move in cycles, and sometimes it’s easier to do that than others. So I would say have no fear of losing a job because you’ll get another.

SF (0:41:25): Before you go, Satnam, I must know, what is your favorite whisky? That’s what I got to know.

SS (0:41:32): I’ve been drinking all these fancy whiskies of late. My friend, Derek Dreyer, has got an amazing whisky collection, and we drank lots and lots of great whiskies there. But I would say it’s still one of the first ones I really fell in love with and liked, which is just Lagavulin 16, which is this fabulous, smoky, peaty whisky from Islay. It’s very easy to get. You can get it at almost any bar in the world. And I think it’s just like a fantastic wee drop. What’s not to love about it?

SF (0:41:58): Are you Ron Swanson?

MPG (0:42:01): Where do you think Ron Swanson got the idea from?

SS (0:42:06): Recently, I got a present from Alexandra Silva and Derek Dreyer. I got it in January at POPL in London. A bottle of Glenrothes 1997, a Thompson Brothers special edition, and it is especially nice. It was a very, very good drop.

MPG (0:42:23): All right. Yeah, I think – I never know how to end these. I feel like we could go on a long time. But Sam, do you have any ideas?

SF (0:42:33): Just for me to say, thanks for coming.

MPG (0:42:35): Okay.

SF (0:42:37): That’s what we know it is.

MPG (0:42:37): All right.

SS (0:42:37): Oh, thank you very much for being on. It’s an honor.

MPG (0:42:41): Thank you so much.

Narrator (0:42:45): The Haskell Interlude Podcast is a project of the Haskell Foundation, and it is made possible by the generous support of our sponsors, especially the Monad-level sponsors: GitHub, Input Output, Juspay, and Meta.

56 – Satnam Singh