Python Was Built for Humans. AI Just Changed Everything.

Podcast
Event Data Talks on the Rocks
Location Remote
Date February 10, 2026

This transcript and summary were AI-generated and may contain errors.

Summary

In this conversation with Mike Driscoll on the Data Talks on the Rocks podcast, I cover my journey from studying pure math at MIT to building pandas at AQR Capital Management during the 2008 financial crisis, and the projects that followed: Ibis, Apache Arrow, Ursa Labs, and Voltron Data.

I discuss the current state of next-generation columnar file formats. Several projects are exploring what comes after Parquet, including the Lance format for multimodal AI data, Vortex from SpiralDB incorporating state-of-the-art encoding techniques from CWI and TU Munich, the Nimble format from Meta, and F3, a research collaboration between Tsinghua, CMU, and University of Wisconsin that I have been advising. These formats target different use cases – some focused on analytical workloads, others on multimodal data with images and video for AI pipelines.

I explain a shift I have been experiencing from human ergonomics to agent ergonomics in programming languages. Python became dominant because it is pleasant for humans to write and read. But with coding agents writing most of the code, the bottleneck has shifted: test suite execution speed and compile times now matter more than how enjoyable a language is to write in. I have been building new projects in Go because the agentic loop – prompt, generate, test, iterate – runs faster in compiled languages. This does not mean Python is going away; we will write more of everything, but the proportional distribution across languages will change.
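The bottleneck argument can be made concrete with a toy cost model (all numbers here are invented for illustration, not measurements): the agent's generation time is roughly fixed per iteration, so total loop time is dominated by how fast the build and test suite run.

```python
def loop_time(iterations, generate_s, verify_s):
    """Total wall-clock time for an agentic loop: each iteration
    generates code, then runs the build and test suite."""
    return iterations * (generate_s + verify_s)

# Hypothetical numbers: generation takes ~30s either way,
# but the verification step differs between languages.
slow_suite = loop_time(10, 30, 120)  # slow interpreted test suite
fast_suite = loop_time(10, 30, 5)    # fast compiled build + tests
print(slow_suite, fast_suite)        # 1500 350
```

Under these made-up numbers, cutting verification from two minutes to five seconds shrinks a ten-iteration loop from 25 minutes to under 6, which is the intuition behind reaching for a compiled language in agent-driven work.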

I walk through my current workflow using Claude Code across multiple sessions, and demonstrate RoboRev, a tool I built to have a second agent (Codex) continuously review every commit that Claude Code produces. I have run over 3,000 automated reviews in a few weeks and consider adversarial agent review essential – the agent that wrote the code is less likely to spot its own bugs. I also demonstrate Spicy Takes, a side project that scrapes and summarizes tech blogs, grades quotes by “spiciness,” and currently covers 22 authors and 31,000 quotes.
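The shape of that adversarial loop can be sketched in a few lines. This is a toy illustration, not RoboRev's actual implementation; the `reviewer` callable stands in for a call out to an independent second agent such as Codex.

```python
def adversarial_review(commits, reviewer):
    """Have an independent reviewer examine every commit the
    writing agent produced; returns findings keyed by commit SHA.
    ('reviewer' is a stub here, standing in for a second agent.)"""
    findings = {}
    for commit in commits:
        findings[commit["sha"]] = reviewer(commit["diff"])
    return findings

# Stub reviewer: flag any diff that deletes a test file.
reviewer = lambda diff: ["deleted a test"] if "- test_" in diff else []
commits = [
    {"sha": "a1b2", "diff": "+ def add(x, y): return x + y"},
    {"sha": "c3d4", "diff": "- test_add.py"},
]
print(adversarial_review(commits, reviewer))
# {'a1b2': [], 'c3d4': ['deleted a test']}
```

The key property is independence: the reviewing process holds no state from the generation process, so it is not anchored to the same blind spots.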

I close with advice for the next generation of developers: learning to write code matters less now, but understanding software architecture, design patterns, refactoring, and code smells matters more than ever. If you cannot articulate what is wrong with code or explain what you want to an agent, the agents will not produce good results. Computer science education may start to look more like studying literature – reading and understanding what makes software good rather than writing it from scratch.

Key Quotes

“Python is so successful because it’s good for humans. It’s good for humans to write. It’s enjoyable… But in a world where agents are writing all of the code, all these benefits that Python has – its readability, its human ergonomics – the agents don’t care about that, but the thing that they do care about is the performance.” – Wes McKinney

“I started feeling like I can just do things. And it made me feel the same way that I felt when I started programming in Python almost 20 years ago.” – Wes McKinney

“If all of your code isn’t being automatically reviewed by adversarial agents, you’ve essentially got tons of bugs lurking that you can’t possibly find through your own human QA.” – Wes McKinney

“Learning to write code is not that important now, but you do need to invest in learning about the theory of software architecture and what effective and sustainable large-scale software projects look like.” – Wes McKinney

Transcript

Mike Driscoll: Welcome to Data Talks on the Rocks. I’m your host, Mike Driscoll, and this week I am thrilled to have as my guest, longtime friend and data impresario, Wes McKinney. Wes McKinney, welcome to the show.

Wes McKinney: Thanks. Yeah. Happy to be here. It’s been a long time coming.

Mike Driscoll: I know since I started and kicked off this pod, you’ve definitely been on the short list of folks I was excited to get involved here. And I think, you know, before we dive into some of the great things you’re working on today, I thought for those who maybe aren’t as familiar with your personal journey, we’d just talk a little bit about how we got here. I’ll preface by saying, I think the timing of your appearance on the show is awesome, because when we chatted last week, you mentioned some incredible projects that are underway that you’re looking to show off to the audience paying attention to this pod. So before we get there, I think it’s worth just recounting your own journey in data as a builder. Everyone knows you as the creator of the world’s most popular data framework out there, Pandas. Maybe in your own words, tell us a little bit about your journey through data, from your schooling to the creation of Pandas, to Datapad, to Voltron Data, with a stopover on the way at Cloudera, to now at Posit, working with your longtime friend Hadley Wickham, formerly of course of RStudio, and JJ Allaire over there. I’d just love to hear your larger narrative for folks who, again, may only know you as the Pandas guy and are now here to get to know you more.

Wes McKinney: It’s a lot to recap. I’ll try to do it as succinctly as possible. So I graduated from college in 2007. I went to MIT and studied pure math. I dabbled in a little bit of programming – I learned a little bit of Java – but mostly I did a little bit of theoretical computer science and some optimization, things that were at the intersection of electrical engineering and computer science. But mostly I was a pure math nerd, and it was 2006, still before the financial crisis, when I was interviewing for a job in my senior year of college. I thought I would go work for a quantitative hedge fund and get some experience before going to get a PhD in something. I was recruited by a company called AQR Capital Management that hired a lot of math and computer science undergrads to work there. I ended up there right at the start of the financial crisis in 2007 and found that my job was not mathematical modeling – not pen-and-paper work with partial differential equations and the things that I thought I would be doing. It turned out that it was mostly data wrangling, data cleaning, and data analysis work in languages like R and MATLAB, with Java and C++ code used to build production systems. And it was a pretty stressful time in 2008 – we were under a lot of pressure to turn around analysis more quickly so that we could react to what was going on during the financial crisis. So I was just frustrated at how difficult it was to work with data. We were of course doing a ton of SQL; SQL Server Management Studio was also part of the equation. And so essentially I taught myself Python.

Mike Driscoll: By the way, was that your first exposure to Python was after MIT? So you had not done any Python while you were at MIT?

Wes McKinney: No, I didn’t. I didn’t write a line of Python in college. A friend of mine at MIT had mentioned that Python was a really great language to program in. We were taking an algorithms class and had to implement dynamic programming, which is a type of computer science algorithm, as part of a course. I did my implementation in Java because that was the only language I knew. And I hated programming in Java. And she was like, wait till you see my Python solution – it’s only going to be like 30 lines of code. And I was like, there’s no way. I remember she sent me the Python script and I opened it up, like, how could this possibly do in 30 or 40 lines of code the algorithm that was my 300 or 400 lines of Java? So I had some glimpse into what was possible with Python, but I didn’t start seriously using it until I was at AQR. I think this was the end of 2007, early 2008. And it was almost this epiphany of feeling like I can just do things – wow, I can sit down and write code and get things done really quickly, and I don’t have to fight with the IDE and all of this boilerplate and stuff that I really just hated doing in Java. It gave me this feeling of empowerment, of being able to sit down, write code, solve problems, and get things done really quickly.

Mike Driscoll: This is by the way, a preview of what we’ll talk about later in the show, right? I mean, maybe just to get there and then I do want to hear, do you feel like we’re at another moment like today, in terms of that shift from Java to Python and Python to prompting?

Wes McKinney: We are. And honestly, the way that I feel today, like in this moment is exactly the way that I felt when I discovered Python and started building stuff in Python.

Yeah, anyway, to abbreviate the story, I started building what turned into the Pandas project. It was initially an internal project; I got permission to open source it later. I left AQR, went to grad school, continued working on Pandas, and eventually dropped out of grad school to work on Pandas full time. I started writing Python for Data Analysis at the end of 2011 and worked full time on the library for about a year, then tried to start a fintech company with a couple of my former AQR colleagues, which didn’t work out. And one of those guys, Chang, and I moved to San Francisco and started Datapad at the beginning of 2013. We both met Mike around that time when we moved to SF, and we raised some money from VCs to do a visual analytics company. I’m obviously burying the lede a little bit about how Pandas got so popular – my book came out at the end of 2012, and the project, along with the rest of the Python ecosystem, really started to take off. In 2013, I actually handed over the reins of the Pandas project to its community, to its core open source developers. So in spite of my not having been closely involved in development over the last 12 or 13 years, the project has clearly become enormously popular. But I was really actively working on it only during its first five years of existence or so. After that, I got too busy doing startups and other entrepreneurial activities.

After that, we worked on the company for about a year and a half, found it was difficult to raise money at that point in time, if you’re doing a visual analytics company, which is what Datapad was, saw an opportunity to join Cloudera. So we did that at the end of 2014. And at Cloudera, my job was to figure out how to make Python work with the big data stack at the time. And the problem that I saw there was that building a bridge between the Python world, Pandas, NumPy, the Python data stack, and the big data world of all these Java-based and C++-based processing systems was really difficult. And I was looking at the complexity of like, how do we build interfaces to all of these different systems? How do we move data around? How do we make Python functions callable from these systems? Ibis was the result of that. So I started the Ibis project, which was like a hybrid of Pandas and SQL. You can think of it as a DataFrame API that generates SQL queries to target different data processing backends.
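The core idea behind Ibis – a DataFrame-style expression that compiles down to SQL for some backend – can be sketched in miniature. This is a toy illustration of the compilation pattern, not Ibis's actual API (the real library builds a typed expression tree and supports many backends and dialects).

```python
class Table:
    """Toy expression builder that records operations and emits SQL,
    loosely in the spirit of Ibis (not its real API)."""

    def __init__(self, name):
        self.name = name
        self.filters = []
        self.columns = ["*"]

    def select(self, *cols):
        self.columns = list(cols)
        return self

    def filter(self, condition):
        self.filters.append(condition)
        return self

    def to_sql(self):
        sql = f"SELECT {', '.join(self.columns)} FROM {self.name}"
        if self.filters:
            sql += " WHERE " + " AND ".join(self.filters)
        return sql

expr = Table("trades").select("symbol", "price").filter("price > 100")
print(expr.to_sql())
# SELECT symbol, price FROM trades WHERE price > 100
```

The same expression object could just as well emit a different SQL dialect per backend, which is what makes the DataFrame-API-over-many-engines approach work.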

And then with a group of open source developers, we started the Apache Arrow project, which is a data interchange, data representation format for dealing with DataFrames and tabular data that’s cross-language, cross-processing engine, can be used as a way to proverbially tie the room together in the words of the Dude.

I spent a lot of the last 10 years working on Arrow in varying capacities. I went from Cloudera to Two Sigma because they were generously offering to sponsor Arrow development, so I spent a couple of years at Two Sigma working on Arrow and built a small team there. Eventually we spun the team out of Two Sigma to create a not-for-profit organization, Ursa Labs, in 2018, to continue developing Arrow with multiple corporate sponsors. That’s when RStudio (now Posit) got into the mix and began funding Arrow in a serious way. We also had money from NVIDIA and Intel, as well as some financial firms.

And eventually we saw that there was a need to start a company for what we were doing, not just to be a not-for-profit. And so that’s how Ursa Computing and eventually Voltron Data came into being as we tried to get all the right people in the room to build a unified computing company built around Apache Arrow to essentially democratize accelerated Arrow-native computing. Unfortunately, Voltron Data didn’t make it – a lot of startups don’t – but we invested tens of millions of dollars in the open source ecosystem, especially in the greater Arrow ecosystem. We helped start the Substrait project, which is a whole other topic for open source. We were sponsors of the DuckDB Foundation and DuckDB Labs, and so we had a very fruitful partnership with the DuckDB team, worked really hard on really strong Arrow integration, Arrow support in DuckDB, which has been leveraged to great effect across the open source data ecosystem. And yeah, we got just a lot of really smart people in the room and were able to channel that towards doing a lot of great work, especially in the open source world.

So yeah, that’s kind of the abbreviated story. There are a lot of details in there, but at the end of the day, I’m passionate about human productivity working with data. That includes not only the human interface problem – the code that you write, making it easy to express your ideas and work with the data – but also making it efficient: efficient to process, efficient for building systems both large and small. That often means improving the performance of interoperability, or just of getting data out of a database. And that’s led to things like ADBC, which is Arrow Database Connectivity. Just this week, Databricks released an ADBC driver, so you can have Arrow-native connectivity to Databricks now, which is amazing. So really, I’m still very enthusiastic about this more unified, open-standards-based world of composable systems, where things are modular, interchangeable, and efficient. The hope is that we’re all a great deal more productive and can focus on solving problems, as opposed to everybody having this Tower of Babel problem where everyone has to reinvent the wheel. We shouldn’t have everybody re-implementing CSV readers. You mentioned Hadley Wickham – both Hadley and I have spent a huge amount of time implementing CSV readers, which is kind of absurd to think about: two prolific open source data science developers building CSV readers. That’s a misallocation of resources, right?

Mike Driscoll: You wrote a piece a couple of years ago on your blog – I think it was titled “The Road to Composable Data Systems” – that was in some ways prescient, because a number of companies have emerged since then that are certainly part of that vision: Bauplan on the one hand, and you’ve got Columnar, which is now supporting the Arrow project. Maybe to expand on that a little bit – how do you see the big players? With things like Iceberg and Arrow, every time we have these open standards, there’s this phase where the incumbents initially resist the standard, because it obviously represents a threat to their proprietary, or at least more controlled, alternative – and then there’s almost this acceptance and embrace of it. How do you see this tension between what Databricks, Snowflake, Azure, Google, and Amazon want to have – this sort of walled garden – versus the incredible emergence of innovation around Arrow, ADBC, Iceberg, and open table formats? How do you see that tension resolving in the coming years? And maybe just reflect a little bit on the thesis that you put forward in that 2023 blog post.

Wes McKinney: Yeah. Well, when you think about the way that people build data processing systems – they’re very complex. They often take tens or hundreds of person-years of labor. A lot of that translates to money, usually millions or tens of millions of dollars. And there’s a lot of business pressure to deliver value in a certain timeframe. When you think about what it means for a commercial database system, or some other commercial enterprise data platform or data processing product, the timeline and the amount of work involved in collaborating productively with a broader open source community has a lot of trade-offs. It takes a lot longer to get things done. And in many cases, the result you get is worse, at least within your own sphere, than the specialized thing you could build fully in your control – built on your own timeline, with your own people, without all this annoying collaboration and negotiation of features and scope and requirements.

And so when we started Arrow, a lot of people’s reaction was, that’s a really nice idea. Why has no one done it before? Maybe because it’s really hard and people have tried and it was too hard. Or maybe they would say, that’s a nice idea, we’ll implement it whenever everyone else implements it. Like we don’t want to be the first movers because you could do a lot of work and end up with something that’s just – you build a bridge to nowhere essentially.

And so in the early days of Arrow, which was 10 years ago now, it’s having its 10th birthday in February, there was this bootstrapping or chicken and egg problem where everyone’s looking around the table. It’s like, okay, who’s going to implement this? And so I gave a talk in 2017 at JupyterCon and the thesis of the talk was basically being the chicken, so to speak – being willing to take the first step and to do the work and essentially incrementally build out, grow the layers of the onion until you create enough interface points and software that people can rely on where it becomes easier and easier and more and more of a good idea to get involved in this ecosystem of interoperable technology.

Mike Driscoll: The other hesitation folks have is it feels like an Esperanto language where everyone talks about wanting to have this beautiful kind of agnostic interoperable thing, but then in truth, it feels like a lot of times those projects – yours are sometimes maybe the exception – they don’t get there. What actually often becomes the standard is a single company’s implementation that becomes standard. It doesn’t often emerge that there’s a consortium or an individual that’s outside of these organizations that can push a viable standard into the ecosystem.

Wes McKinney: Yeah. And there’s also capitalism and market competition. For many years at the big data conferences, we would joke with each other that we wanted to make a graph of who’s claiming to be faster than whom. You would end up with this circular graph – everyone says they’re faster than someone else: X is faster than Y, Y is faster than Z. And from that big flowchart of who says they’re faster, you’d try to figure out who is claiming to be the fastest, and whether they actually are.

Kind of going back to this blog post, the idea I was trying to get across was how can we become modular, Arrow-native, interoperable components, essentially commoditize the storage and execution layers of the stack so that we could be freed up as developers to just focus on user interfaces and human productivity. And now, of course, humans may not be so important anymore. And so we might be thinking more about agent productivity and what that means. But this was before agents, right? Agents were only a twinkle in our eye at this point.

But to be able to shift away from who can build the fastest execution engine and instead invest in productivity and usability – that’s really for me what this has all been about. And I felt that Arrow was one of these disruptive technologies that could serve as a way to bring about that type of transformation where it isn’t like everyone’s selling their vertically integrated system that’s a little bit different and a little bit interoperable. It’s like a walled garden unto itself. And so now we’re in a world where if you’re not using Arrow, you’re behind the curve, but it took us a decade to get there, right? So it wasn’t overnight. It looks obvious in retrospect, but it took 10 years. Isn’t there that meme from the Shawshank Redemption, like it only took six years? It only took 10 years, but here we are.

So yeah, it’s definitely been an interesting time. But up until maybe a year or two ago, I was also recovering from having done a startup. The startup was really exhausting, and I put a lot of effort into making it everything that it could be. So I deserved a rest. I worked for the better part of a decade making Arrow happen, bringing people together, being a leader in that open source ecosystem. I definitely needed a rest.

But also, as I started getting back involved in the field of programming and thinking about what to work on next – I’ve done a lot of work on the Parquet file format, and so I’ve been involved in a number of research projects, both with companies in industry and with academic research groups, to better understand what makes a good columnar file format, especially one that is built for Arrow compatibility and intended for systems that are using Arrow exclusively. How do we design a really great successor to Parquet, for example, that’s a lot faster to encode and decode, but that can assume everyone is going to be using Arrow on the other side? Encoding is Arrow in; decoding is Arrow out. That assumption actually simplifies the design space quite a bit.

So we’ve been doing some work on that, but I didn’t go all the way down the rabbit hole of actually building those projects. There are startups and research groups building them; I’ve been more of an advisor. I just didn’t feel inspired to go back to working on file formats, encodings, and columnar execution, if that makes sense.

Mike Driscoll: Well, it seems like the value – and maybe increasingly so – moves up the stack: the bottleneck is not so much implementation at the low level anymore, it’s the architecture. And in some ways you’ve felt the pain as an entrepreneur, as a developer, and as a user of these tools. Certainly when you went to Posit, you saw two major ecosystems, the R ecosystem and the Python ecosystem, that would obviously benefit from interoperability.

I’d love to hear, speaking of the successor to Parquet – file formats matter. We joked about you and Hadley writing CSV parsers; as shocking as it is to those of us in the world of data, CSV is still a thing, and here you are talking about moving beyond Parquet, when some places are just beginning to embrace Parquet as a more strongly typed columnar format. But let’s talk about that format I’ve heard people mention – I mentioned it on my pod with Joe Reis, who was here a couple of episodes back – the F3 format. This stands for the Future File Format, a collaboration between a number of research folks and commercial folks. Maybe in your own words, tell us what’s exciting about that format. Would you consider it a successor? And I’ll just throw out that there are other contending formats: there’s Spiral, a company that’s out there; there’s Lance – LanceDB, the Lance format – and a number of others. So where does F3 fit into that space?

Wes McKinney: Yeah. I can provide an orientation of where all of these file formats came from. We started a research collaboration maybe a little over two, two and a half years ago. First, there’s the Lance file format – a columnar format for multimodal AI data. It’s now a community project, and there’s a company called LanceDB, founded by Chang and a co-founder who’s a former Clouderan. So it’s very much a small world – my former Datapad co-founder’s company is the creator of LanceDB.

There’s Vortex, which was created by SpiralDB, another next-gen storage company. And there’s the Nimble file format from Meta, which is part of something built for their internal data warehouse. And then there are some research projects, like F3, which is a collaboration between Tsinghua, CMU, and the University of Wisconsin. I’ve been part of that research group, advising the work, which is being done by grad students.

And I would describe F3 as a research project that is driving exploration beyond data encoding alone, into some of the subtler and more complex details of how to design a future-proof file format. One of the troubles we ran into with Parquet is that it was difficult to add things to it with any confidence that you could use those new features and other people would be able to read them. So F3 has explored ideas like including WASM code that enables applications to decode data they’ve never seen before, things like that. It’s a research project that has explored some very interesting ideas – there’s a paper you can read about it – but at the moment it’s not trying to be an industrial file format for production.
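The self-describing-decoder idea can be sketched as a simple fallback lookup. This is a toy in Python: F3 embeds WASM modules, whereas here a plain callable stands in for the embedded decoder, and all names and encodings below are invented for illustration.

```python
# Encodings this reader was built with.
BUILTIN_DECODERS = {
    "plain": lambda raw: list(raw),
    "rle": lambda raw: [v for v, n in raw for _ in range(n)],
}

def decode_column(column):
    """Use a built-in decoder when the encoding is known; otherwise
    fall back to the decoder shipped inside the file itself
    (F3 would run a WASM module here; we use a Python callable)."""
    decoder = BUILTIN_DECODERS.get(column["encoding"]) or column.get("embedded_decoder")
    if decoder is None:
        raise ValueError(f"no decoder available for {column['encoding']!r}")
    return decoder(column["raw"])

# A column using an encoding this reader has never seen,
# but which carries its own decoder:
col = {
    "encoding": "delta-v2",            # unknown to this reader
    "raw": [5, 1, 1, 2],               # first value, then deltas
    "embedded_decoder": lambda raw: [sum(raw[: i + 1]) for i in range(len(raw))],
}
print(decode_column(col))  # [5, 6, 7, 9]
```

The point is forward compatibility: an old reader can still materialize data written with an encoding invented after it shipped, because the file carries the decoding logic along with the bytes.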

I think Vortex – which has incorporated a lot of the latest and greatest ideas about data encoding from CWI, where DuckDB was created, and TU Munich, where research systems like Umbra and Hyper were built by Thomas Neumann’s group – is trying to pull together the state of the art in data encoding in a Parquet-like file format: something that is super fast to encode and decode, built around lightweight encodings and parallel encoding and decoding that work really well on GPUs, that sort of thing.

But these new file formats are designed for different use cases. I think one of the big things that’s changed in the 15 years since Parquet was designed is multimodal data. We weren’t really thinking back then about dealing with large amounts of unstructured data in a big table. Now we’ve got lots of images, lots of video, lots of text alongside our traditional analytical data – which is really the metadata – and all of it is going through an AI LLM processing pipeline. So essentially everyone right now is trying to figure out how to bring together the learnings from the analytic database community with the AI and LLM tools to create LLM-oriented data processing pipelines. If you need to do AI processing on lots of videos, that’s a much more complicated problem than a COUNT(*).

I imagine there will be a natural segmentation of the file format space based on the type of application where I think on one hand, yes, we need a better Parquet that’s a lot better to use on modern hardware that works better on GPUs and takes advantage of modern encoding and decoding techniques. But there’s a lot of innovation needed on multimodal data storage. And I think projects like Lance have been really focused on that. And as a result are seeing a lot of adoption across AI labs and companies that are doing production applications of LLM, image, and video processing. If you’ve heard of companies like Runway, they’re a big Lance user, for example.

So I think it’s very interesting, and I’ve been happy to be involved and to provide at least some historical perspective and the Arrow perspective on these formats. Part of why I’m not actively working on any of them is that I feel very confident in the teams building them. Peter Boncz at CWI is perhaps the world expert on data encoding and decoding, so I don’t feel motivated to work on columnar encoding – Peter is smarter and better at this than I am, and I’m pretty sure that whatever Peter does is the right thing to do. He’s been doing this for 20, 25 years or more. He and his grad student Azim at CWI created the FastLanes project, which has provided a lot of intellectual fodder for these new file formats.

Mike Driscoll: It’s in good hands. Speaking of your blog, which of course everyone can go check out at wesmckinney.com, you can get all of Wes’s great posts listed there. Many of the topics we’re talking about today will be featured there.

One of the things that you and I chatted about last week was related to a blog post that maybe you had just written or you were about to write, which is this shift from what you titled the shift from human ergonomics to agent ergonomics. And not only are you thinking about this shift, you’re kind of living it yourself and you shared in that post that this might come as blasphemy to some of your followers – that one of the world’s most prolific Python developers has lately been writing a lot of software in Go of all languages.

I would love maybe just a few thoughts on this shift and then I’d love to actually – one of the great things about doing a virtual interview is I think folks who’ve been reading your social media posts, I think maybe a lot of folks have some skepticism about the tsunami of agent-led code productivity that’s been unleashed certainly with a step function in December. So maybe you could set the stage a little bit about this shift that you’re living through yourself and then we can actually dive in and you could show off your own setup and what you’ve been doing to build some of these new projects.

Wes McKinney: Yeah, of course. We can do a little – we can vibe together. I’m all here for it. This is going to be like my first, maybe my first live vibe coding session.

So fast forward to the present: I’ve been back at Posit since the end of 2023, mostly working on Positron, which is a new data science IDE. It has AI features, but you can think of it as a next-generation RStudio that works for both R and Python. I designed the data explorer – so if you click on a Parquet file or a CSV file and it magically opens in Positron, I worked on that feature, and I’m very proud of it.

But I was an AI skeptic, I think, pretty much until the beginning of 2025. I hardly touched LLMs, even LLM autocomplete. I kind of had this feeling like I don’t need this, I’m good at writing code. This is not useful for me. I saw people using Cursor and it just didn’t appeal to me. I used Windsurf a little bit, one of the Cursor competitors, and I was also like, this is nice, but this is not a 10x for me. Maybe this saves me a little time, a little Googling now and then.

But I heard about Claude Code in April 2025, and that was the first thing that really put me on this path of: this is the thing. Early Claude Code was still rough around the edges, though, and made a lot of mistakes, so it was hard to have confidence that it was going to be able to generate good software on its own. I spent a lot of last year using Claude Code to work on Positron and found that it enabled me to work on parts of the code base that were previously inaccessible to me, especially things relating to front-end development. Positron uses React, and it’s within the VS Code code base, which is a very complicated code base.

Mike Driscoll: Is it a fork of VS Code?

Wes McKinney: It’s a fork of VS Code, yeah – a pretty big fork that’s had probably 30 or 40 person-years of development put into it at this point. It’s a pretty extensive project.

So I was seeing gains from using that. And then later in the year, I think the big turning point was the release of Codex 5.2 from OpenAI and Opus 4.5 from Anthropic. What I saw was that the performance of these coding agents improved – the foundation models got a lot better, but also the coding agents themselves had been developed a lot and were much more dialed in, acting in a way that was helpful. And so starting in October, I began going down the rabbit hole, essentially going through my mental backlog of all the projects that I had thought about building for the last 20 years but had never been able to justify spending the time on. Either I didn’t have the energy, I didn’t have the time, or I just couldn’t justify spending my time on something that was just for me and that I couldn’t sell or wasn’t part of my professional work.

Mike Driscoll: It’s the ROI, right? The return on the investment. The investment was so high as a developer that the return just couldn’t justify it.

Wes McKinney: Yeah. And then somewhere between October and December, it just clicked. Peter Steinberger, who’s the creator of Clawdbot, now OpenClaw, gave this talk, which I recommend to everyone. It’s called “You Can Just Do Things.” And I started feeling like I can just do things. And it made me feel the same way that I felt when I started programming in Python almost 20 years ago. Going from Java to Python was such a breath of fresh air. Like, all this stuff is out of my way and I can just think about the problem. I can write the code. I can get things done in one tenth of the time. This is like that. And so that’s the overwhelming feeling that I had.

But then I spent a lot of October to December really grinding – I built a pretty extensive personal finance terminal UI tool called MoneyFlow. And I did a couple of other little things. I renovated my home lab, and now I’ve got a whole personal dashboard for it and little web applications that I built essentially as my personal assistant.

But then I found that building large scale code bases with these coding agents, it just starts to break down when you reach a certain point. And so my next thought was, we really need to figure out what does software engineering look like in this new world? And how do we build software effectively in a scalable way and over a long period of time?

I think with coding agents, people have demonstrated you can bootstrap a brand new project and get something working in a couple of hours or a couple of days. A project that might have taken you four years before, you can now build in six months or a year. But eventually you reach a certain scale where building and maintaining a large software project is a different problem from building something greenfield. And so we need new tools and approaches to doing that work.

And one of the big things early on was I found that Python started to feel like a ball and chain basically. And the way I would explain that is that we know that Python is slow, but we don’t care because our time is valuable and the time that we spend writing the code, reading the code, reviewing the code, that time is more valuable. And so over the last 15 years, we’ve made this trade-off of performance for our time and our enjoyment of life. We choose Python because it is wonderful to program in. It has a great ecosystem of packages. Now it has the whole AI ecosystem – all the LLMs are trained and post-trained in Python. It’s wonderful.

But Python is so successful because it’s good for humans. It’s good for humans to write. It’s enjoyable. There were some rough edges like packaging and distribution, but Charlie Marsh and UV have largely fixed that. So everything’s great. But in a world where agents are writing all of the code, all these benefits that Python has – its readability, its human ergonomics – the agents don’t care about that, but the thing that they do care about is the performance. And how long does it take to generate the code, to edit the code, which is meaningful. Python’s conciseness and readability does impact that. But more meaningfully, running the unit tests becomes something that is a material part of how long it takes you to build something with coding agents now. And what I found was that with the early projects with coding agents that I built in Python, just running the test suite was a bottleneck. The software ran fast enough, but the trouble was that my agents are just sitting there waiting for the test suite to complete. And that was the thing that was preventing me from moving faster.

Mike Driscoll: The bottlenecks have shifted. It’s interesting, it gets back to when your classmate at MIT said, I’ve written this dynamic programming assignment in 30 lines of Python. How many lines was your Java equivalent?

Wes McKinney: Oh, I mean, 200 or 300 at least.

Mike Driscoll: Right. So back then the bottleneck was the developer writing that code, but are we swinging back to – of course, I’m not sure you’re going to start programming in Java again, but arguably maybe it now is more efficient for you to write 300 lines of Go. Does it have the concision of Python or?

Wes McKinney: Well, there are two modern languages that were developed essentially as a reaction to the Java and C++ ecosystems. Google originally had three programming languages that they used in production. They had C++ – the Google search engine and most Google code, the heavy duty systems code, was written in C++. But then you had things that were more like server applications and distributed systems, and a lot of those were written in Java. And then you had scripting and automation and glue, and that was all written in Python for the human benefits – because it was just gluing together systems and doing automations, and it was easier to write that code and run it in Python without having to worry about jar files and class paths and compiling C++ code and things like that.

But then Google started to struggle with the maintenance burden of all their Java and C++ code and began yearning for a systems language that was a lot faster to compile, because Java and C++ are both relatively slow to build, C++ especially. They wanted something very fast to build and very fast to run, but also with garbage collection like Java has, a really nice concurrency model where you can build multi-threaded programs in a safe way without blowing your toes off, and a solid standard library. And so they created the Go programming language essentially to be the panacea to their woes from using Java and C++ for a decade plus at that point – I don’t know exactly what year they started building Go. Some of the greatest programming language designers in the history of the world worked on it.

Mike Driscoll: Rob Pike and yeah, Ken maybe, yeah.

Wes McKinney: And Rust has a similar origin story where it was created by Graydon Hoare at Mozilla and was essentially designed as a research project as a panacea to the pain and suffering that is C++ and C++ build systems. And so both Rust and Go, modern programming languages, have just figured out the build system. You can manage your dependencies, you can create, you can build your executables really reliably, predictably on all the operating systems. You can create static binaries. You build your application and it just creates a binary that you can copy. You don’t have to copy the JVM, a bunch of systems libraries. You don’t have to worry about different Linux distributions. You can just copy the binary and run it. That’s amazingly liberating.

Mike Driscoll: Yeah. That’s – Rill is a Go binary that takes advantage of that exact setup.

Wes McKinney: Right. Yeah. So you’re using Go for those reasons at Rill.

So yeah, anyway, those were not decisions that were made to optimize human productivity. It’s that compiling the software and distributing the software have a material cost, as does maintainability over time. And the fact that those languages take more human effort to write was an acceptable trade-off. They’re both very fast too.

But basically now, I think Go and Rust, maybe there’s other languages like TypeScript, we’ll see a proportional migration of work away from Python because of the agentic productivity benefits of performance – how fast it takes to run, how fast is the agentic loop? How long does it take to build and run the test suite and validate whether a particular turn of work in your coding agent is valid? Because that’s the unit of work in coding agent land – the turn. You prompt, the agent does work, and then it returns control to you. That’s one turn.

Mike Driscoll: That’s true.

Wes McKinney: So it’s like, how can we speed up the turn? And you can also call that the agentic loop.
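That turn structure can be sketched as a toy loop. This is purely illustrative – `generate` and `runTests` are hypothetical stand-ins for a real coding agent and test runner, not any actual Claude Code or Codex API:

```go
package main

import "fmt"

// One run of a hypothetical agentic loop: generate a change from the
// prompt, run the tests, and feed any failure back into the next turn.
func agenticLoop(prompt string, generate func(string) string,
	runTests func(string) error, maxTurns int) (string, int, error) {
	var change string
	var err error
	for turn := 1; turn <= maxTurns; turn++ {
		change = generate(prompt)
		if err = runTests(change); err == nil {
			return change, turn, nil // tests pass: the turn succeeded
		}
		// The test failure becomes context for the next turn, which is
		// why test-suite wall time dominates the cost of each iteration.
		prompt = fmt.Sprintf("%s\nprevious attempt failed: %v", prompt, err)
	}
	return change, maxTurns, err
}

func main() {
	// Stub agent that succeeds on its second attempt.
	attempts := 0
	gen := func(p string) string { attempts++; return fmt.Sprintf("patch-%d", attempts) }
	test := func(c string) error {
		if c == "patch-1" {
			return fmt.Errorf("TestFoo failed")
		}
		return nil
	}
	change, turns, err := agenticLoop("fix TestFoo", gen, test, 5)
	fmt.Println(change, turns, err) // patch-2 2 <nil>
}
```

The sketch makes the bottleneck argument concrete: every iteration pays the full cost of `runTests`, so a slow test suite slows every turn, regardless of how fast the model generates code.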

Anyway, I’ve just stumbled on this because I felt like I was going slowly. And so I’ve tried rewriting things in Go just to see what happens. And for me, the results have been really good. And so if I don’t need AI tools or data science tools or Polars or things like that, I’m going to be building everything in Go that makes sense. If something doesn’t make sense, it could be written in Python, it could be written in Rust.

But the funny thing is, we’re not going to be writing less Python. Some people read my blog post and were like, Wes, are you saying Python’s going to die? Of course not. We’re going to be writing more of everything. At least 10 times, maybe a hundred times more of absolutely everything in aggregate. So are we going to be writing less Python? No, we’re going to be writing a ton more Python. And if Python’s market share right now is 50%, 55%, in five years, maybe it’s 30%, but of a much larger pie. And so I think that there will be this bimodal or trimodal distribution of applications, where there’ll be things that make sense to write in Go because of all the benefits of Go, and things that make sense to write in Rust, and things that make sense to write in TypeScript or Python. And it’ll be really interesting to see, as agents take over all of the coding from all of us, what the proportions of the pie end up looking like in the fullness of time.

Mike Driscoll: Right. It’s interesting. The emergence of these high level languages were a manifestation of the era. And now we have a different era and different systems, different bottlenecks. But I totally agree that it’s not that Python’s going away, it’s just maybe the distribution of software will shift.

Let’s get into it then. So this is the lead up. We’ve talked about how the world is changing in front of us. You’ve talked about going down the rabbit hole of agentic coding over the last couple of months. Maybe for our listeners and our watchers here in particular – if you’re willing to share, to set the stage, it would be great to hear how you work as a developer. You talk about that agentic loop. And yes, all of us are, I think, paying a lot of attention to the author of Clawdbot, Peter Steinberger.

Let’s go ahead and get in and see what Wes McKinney’s own setup looks like. So maybe to set the table, I know you and I talked about Spicy Takes as an example of one of these side projects that you recently spun up. Maybe tell us a bit about how you built it, and maybe even the motivation as a product leader. I love that several former Data Talks on the Rocks guests are on there as well.

Wes McKinney: Yeah. Well, I’ll kind of explain to you where Spicy Takes came from. One of the projects that I did with agents this fall in November and December was that I completely overhauled my website. And if you go to wesmckinney.com/presentations, I’ve given something like over a hundred conference talks – or if I include interviews and podcast appearances, over the last 16 years I’ve done 114 of them. And I thought to myself, how can I liberate this content and make it consumable? Because who’s going to watch a conference video from 2017? But I had ideas that were important to me that I wanted to share that were in my JupyterCon keynote from 2017. And so I said, can I collect all my videos, transcribe them with AI, summarize them, and then pull out the key quotes?

And so basically I did that. I’ve got a transcript of these talks generated with OpenAI Whisper. And then, this is my 2017 JupyterCon talk, key quotes: “no one ever got fired for buying MATLAB,” “programming languages are user interfaces for describing computation,” “we would like to own less of the infrastructure that enables projects like Pandas to exist.” And I really liked this.

And so I thought to myself, what can I do with this processing pipeline? And I found that there were a lot of blogs by friends of mine and people I admire – blogs that I’ve enjoyed over the years: coding blogs, tech leadership blogs, people in the data community. I love their writing, but so often I would share a blog post with people and they’d say, that’s great, but I don’t have the time to read a 2000 word blog post. Could you just tell me what it’s about?

And so I was thinking to myself, I just built a tool using AI that can help with that – that summarizes and pulls out key quotes from content. And so essentially, I worked with Claude Code and an increasingly complex pipeline of agentic engineering tools to process now 22 blogs and several thousand posts or videos that have been transcribed, summarized, and had key quotes pulled out. And I was especially interested in controversial quotes. So I’ve got a database of 31,000 quotes across 22 authors. And you can see that George Hotz – geohot – is the spiciest of the spicy of all of them. But I also have our friend Hannes Muehleisen, co-creator of DuckDB – I just added him – and he’s got lovely hot takes from, let’s say, 2024, like “we’re essentially building things that are inferior to the state of the art from 20 years ago and somehow get excited about it.” And I just love this.

Initially the first one that I built was for Ben Stancil, whose Substack I absolutely adore. He just released a new post that I need to add up here later today. I might as well go and do that. I’m just going to ask Claude to go ahead and do that. “Hey Ben published a new Substack. Can you add it to his Spicy Takes site?” So we’ll get that working on that.

Mike Driscoll: Wow. And just for – even with that, okay, so this is, you’re using Claude Max and you just essentially give it a prompt and it already has context?

Wes McKinney: Yeah. It’s because all the context lives in my – so it scraped his Substack post from today. So now it’s running the LLM analysis, which summarizes and pulls out the quotes. And then it will grade the spiciness of the quotes. And so now it’s grading the spiciness.

Mike Driscoll: And then it says it’s a shell script. So what’s inside that shell script?

Wes McKinney: Well, I mean, I’ll show you. So we’ll go over here to Spicy Takes and scripts, LLM analyze. So yeah, I mean, it’s LLM generated. It’s got a prompt asking, analyze this blog post in depth, extract summary, key points, money quotes, themes, tone, key insight, one sentence capturing the core takeaway. That gets outputted as JSON. And then the grade spiciness has another prompt. So we’ll look at grade spiciness here. And so this basically says a spicy quote is one that takes a contrarian or unpopular position, uses sharp wit, irony, or sarcasm, critiques industry conventions, challenges sacred cows, makes a bold, potentially controversial claim. It has a memorable quotable drop-the-mic quality.

Mike Driscoll: And these prompts, did you write those prompts or did you have assistance even in the design of these prompts?

Wes McKinney: I mean, I explained in my prompts to Claude what I believe spiciness was. And so I could look back at what was my prompt that generated the script, but ultimately this all came out of my brain.

Mike Driscoll: If we were to segue – in addition to Spicy Takes, which obviously you built with this new agentic approach, maybe tell us a little bit about some of the other software applications that you’ve built. And are there any where you could show off that process? Spicy Takes is the example, but when you talk about that agentic loop – going from ideation to planning to actually coding, evaluation, and agentic loops – what does that look like today versus what it would have looked like before?

Wes McKinney: Right. Well, I mean, typically the way I’m working is I’m working with multiple Claude Code sessions on different projects simultaneously. And I flip back and forth between different projects. This could be parts of my day job at Posit or side projects.

I found that one of the challenges that I’ve run into is that the code that the LLMs, the agents, generate is not the best. It often has a lot of bugs. And I found that I was having to do a lot of human-in-the-loop QA – manually testing and reporting the bugs back to the agent. And that felt a little bit dehumanizing. And so I started asking a different agent, Codex, to review all the code that Claude was writing. And I found that it was really effective at finding real bugs in the code from my Claude Code sessions. And I eventually said, I can automate this whole code review problem.

And basically, what I want is a to-do list for my project spicytakes.org where here’s a code review that was run against a commit. This was done by Codex and it provides a bug here, which I can then copy and paste into my Claude Code session. And it will then just fix the bug and commit it. And so I hit address and now my queue of incoming reviews on this project are all handled. So it fixed the bug. And then I ask it to commit.

Mike Driscoll: You’re using RoboRev here.

Wes McKinney: RoboRev, yeah.

Mike Driscoll: And just to – this morning I saw one of your posts where you were talking about –

Wes McKinney: Yeah, it’s coming through. Those are all Git commits in the repository because I’ve just been developing in the main branch. And so you see, for example, here, this was – I was working on dealing with Steve Yegge’s blog and there were a bunch of bugs in the scraper that was developed for Medium. And so basically the agent finished this Git commit, which we can look at here. “Add our Medium RSS scraper for Steve Yegge.” And the second that Claude finished doing that, it committed, that popped up here and then Codex found all these bugs. And so then I copy the review and I paste it back into my Claude Code session, which then fixes all of these bugs.

And what I found is that it’s better to have a different agent, ideally just not the same Claude Code session or ideally another agent like Gemini or Codex review the work that Claude Code is doing because it has a different perspective and will find problems that maybe Claude doesn’t see because at the end of the day, Claude made the code in the first place. And so it seems that it is less likely to see the bugs that it itself has created.

And so basically now I’ve run over 3,000 reviews in the last three or four weeks since creating this RoboRev system. And now it’s essentially the only way that I work. It’s like the sword in Zelda – “it’s dangerous to go alone, take this.” And so basically I do not code with agents without a continuous reviewer running in the background. And the more I use this, the more I believe that if all of your code isn’t being automatically reviewed by adversarial agents, you’ve got tons of bugs lurking that you can’t possibly find through your own human QA.

And so I’ve been developing this project pretty actively the last few weeks and have built some automations to enable automated fixing, automated refactoring. And so now I can do things like, let’s say I want to code review all my scrapers. And so I’m going to do RoboRev analyze refactor per file. So now that kicked off 20 refactoring analysis jobs with Codex, which I will go up here and show you. And so now each of these jobs – I’ll show you what the prompt looks like. Basically suggest refactoring opportunities in this file and it inlines the file content in the prompt.

And here’s this refactoring task. And if we look at one of them – files, code smells high, monolithic function has deep conditional chain and mixed responsibilities. Hard to reason about and extend. And so I can take this, I can say fix this. And it will – let’s say we didn’t want to do that, right? So let’s say we just want to tackle that refactoring task. And so here I can say RoboRev fix that ID. And so now it’s – here’s the refactoring result. And now it’s going to sit here and refactor that file while I sip tea and talk to you.

Mike Driscoll: Wow. And by the way, who’s behind RoboRev?

Wes McKinney: Me, I built it.

Mike Driscoll: You’re RoboRev. Okay. So this is your own system for doing this agentic coding?

Wes McKinney: Yep. That’s right.

Mike Driscoll: So for a moment, I was trying to identify who’s – I’m like, where did this come from?

Wes McKinney: No, no. I made this and it’s written in Go.

Mike Driscoll: Have others picked this up by the way?

Wes McKinney: It’s starting to catch on, but I think it requires a new way of working, which is that essentially many people are still building software with agents the way that they used to write code as a human. They would sit and work all afternoon and whenever it seems like they’re done, they do a git commit. And so they might spend two or three hours with Claude Code before they commit anything. And so the idea is that you could be working for hours and have no feedback on the code that’s being generated. And the only feedback that’s in the loop is your human QA of running the code and seeing if it seems to work. And the problem is that there’s tons of bugs that you’re not observing in your human-led QA that you could be finding if everything that the agent is doing is reviewed immediately. And so the idea is that this kind of changes the workflow where every time the agent does anything – so I’ll show you kind of the output of the review. It did it. All refactoring changes have been applied and committed, replaced this messy code with a dict mapping tag names to handler method names. That’s all very reasonable refactoring that I would do if I had the time in the old world. But now I just can have the agent do it.

But basically the idea is that you want to shorten the feedback loop and get as much code review and code quality feedback into the agentic loop as you can. And this isn’t part of GitHub pull requests – this runs on my machine. And you can see that all of these refactoring analyses on all of these scrapers are done. And I can even just say “fix everything” – RoboRev fix – and it will see, oh, there’s 21 jobs that haven’t been addressed, and it will sit here and grind through them all and commit the results.

So I’m going to stop it because this code doesn’t have unit tests. I will say that it does make test-driven development even more important. But yeah, this is the way that I work and I feel like the way that I was working last fall with Claude Code sessions, not really reading the code very closely, maybe giving a cursory glance to the code and only finding bugs through human QA – I think that is a barbaric way of working and using agents to elevate code quality and to find bugs is the way to get stuff done.

Mike Driscoll: So obviously you’re an early adopter of a lot of the coolest things out there. We talked a little bit about Clawdbot, now OpenClaw, that Peter has created. If you could give your top five list of some of the coolest tools you’ve come across in your travels in the last few months, what would be on that top five to 10 list?

Wes McKinney: I’m obviously biased by my own tools, but I am a user of Steve Yegge’s Beads, which is an embedded lightweight task tracker that helps improve the memory and context use of the agents. And so if you’re finding that you’re slinging around lots of markdown plan documents and the agent’s getting lost trying to remember which tasks in your plan it has done and which it hasn’t – Beads is a solution for that. It has been a little bit of a hot mess because Steve Yegge is vibe coding everything and also building Gastown, and some of the Gastown stuff has made its way into Beads, which has created some issues. There are some Beads clones. But it’s a good idea. And I’m confident that within six months there will be a stable, production-ready Beads-like system that’s good for development.

I think having a really good terminal emulator matters – I like Kitty. I think Ghostty is good when it doesn’t crash for me. But I think it’s very important to be able to have lots of sessions around. And so as you saw, we were looking at a terminal with six panes going, and being able to maximize a pane, minimize a pane, and jump between different sessions very fluidly can help you keep an eye on a lot of different processes and things that are going on, especially if you’re running parallel Claude Code sessions.

Mike Driscoll: What IDEs do you work in these days? Which IDE, if any?

Wes McKinney: I don’t really use IDEs anymore. I still use Emacs for editing by hand and looking at stuff. But if I’m doing TypeScript, I use VS Code because VS Code has great TypeScript support. When I work on Positron, I use VS Code to develop, run the project, set breakpoints and debug things. But aside from that, I’ve never really been a big fan of IDEs. I only started using LSP for tab complete in Emacs a couple of years ago. So I’m pretty old school, even though I’m relatively young. I’m 40 years old, but pretty old school in terms of my tools.

Mike Driscoll: The last question is advice to the new generation, the folks who aren’t 40. If there’s a Wes McKinney who’s 22 and graduating from MIT or some other university this coming year, what’s your advice to that next generation of developers who are kind of really encountering this brave new world of computer science and software engineering?

Wes McKinney: I think that we’re going to be spending a lot more time reading code and interacting with code at a very high level structural, design patterns type level. And so I think learning to write code is not that important now, but you do need to invest in learning about the theory of software architecture and what effective and sustainable large-scale software projects look like. What are design patterns? What is refactoring? What are code smells?

I think books like Uncle Bob’s – Bob Martin’s – Clean Code. He’s on the Spicy Takes website. It’s a classic book on code design and refactoring, and I think it’s still really valuable in terms of how to think about building quality software. Because if you can’t explain to the agents what you want them to do, they’re not going to do it. And so you need to learn how to articulate what is wrong with the code, and also to understand what the agents are telling you, because they may respond to you with a suggested architectural pattern, design pattern, some way to improve or refactor the code base. And if you can’t make sense of what they’re saying, how do you judge what’s the right approach?

So I think investing in – I think we’re going to go back and read a bunch of Martin Fowler and relearn design patterns and what does good software look like. And I think that computer science education is going to be more about – it’s going to become more like English literature. We’re going to be studying software programs and understanding what makes them good and why they are good. So that we can explain better to our agents what we want.

Mike Driscoll: So in the end, we all become sort of engineering managers. And some of those architectural principles, regardless of the languages that were used, seem to be as important as ever.

Well, Wes, it has been an absolute pleasure and a joy and eye-opening to see what you’ve been cooking up here, how you work in this new era of AI and agentic-led software development. Thanks for giving us a tour of your work and sharing with us your own journey to get here today. We look forward to hosting you again in the future and appreciate all the time today.

Wes McKinney: Of course. Thanks for having me on. Thank you.