James Blair Part 1: Portfolios, practice, and staying curious
This transcript and summary were AI-generated and may contain errors.
Summary
In this episode of The Test Set, Michael Chow, Hadley Wickham, and I talk with James Blair, a Senior Product Manager at Posit working on cloud integrations. James shares his unconventional path into data science, starting from childhood dreams of robotics, through journalism and animation programs, before discovering statistics and R programming during his undergraduate studies. His experience building a Shiny dashboard during a bioinformatics internship was a pivotal moment that demonstrated the practical power of R for building useful applications.
We discuss the value of data science master’s programs in today’s landscape, with varying perspectives on whether the investment makes sense given changes in the field and the rise of AI. I note that while formal education can help legitimize career transitions, the disruption from AI may reduce entry-level positions across the board. Hadley suggests that being discerning about program quality is essential, and that learning physical skills like welding might provide useful diversification.
James emphasizes the importance of hands-on practice and building portfolios over formal credentials, arguing that working on personally meaningful data problems helps develop genuine curiosity. He discusses how maintaining curiosity is essential in an AI-dominated world where it’s easy to be “carried away by the vibes” of AI-generated answers. The conversation touches on confirmation bias in data analysis and the critical role of data scientists in questioning results and defending conclusions with reproducible methods.
Key Quotes
“I think curiosity is such a fundamental requirement and you have to defend curiosity. In today’s AI world, I think you have to kind of hold on to curiosity. It’s something you have to intentionally just say, I’m going to remain curious because otherwise it’s really easy to open up ChatGPT or whatever tool you want and just be like carried away by the vibes.” — James Blair
“Part of our job as data literate individuals, whether you want to call us data scientists or statisticians, whoever, right? The end of the day, the job is what is true? What is this data actually saying and why are we convinced that that’s the case?” — James Blair
“I think there’s continual value in your own hands-on experience. You learn a lot through your own practice and implementation of things. Even if you’re not showing these things off, just putting yourself to work on interesting projects, exploring data, learning how to use tools that exist today is extremely useful.” — James Blair
“I’m a chronic accessorizer, which does not combine well with cycling at all, because cycling has an exorbitant amount of things you can buy.” — James Blair
“I find a ton of value in it, I think partially from the curriculum and the instruction, but also just the experience of being with peers at that particular point in time, trying to figure out what data science is to some extent.” — James Blair
“That was the one point that really is like an inflection point for me, where all of a sudden I fell into something that I just absolutely loved. I’d never programmed before. And all of a sudden there was this language that, to me, made a ton of sense.” — James Blair, on discovering R
“I think data science is one of those areas where it requires the most human judgment. It’s the most nuanced and is perhaps the least well-suited to automation and humans being replaced by AI agents.” — Wes McKinney
“The average master’s across disciplines is generally a pretty shitty quality. That’s just such a good way for universities to make more money. Good master’s programs do exist, but you really have to hunt for them.” — Hadley Wickham
“If you spent $50,000 learning how to weld—it’s probably much cheaper than that to learn how to weld or do something physical. Those skills, I’m like mostly saying that programmers and data scientists are going to be around, but it doesn’t hurt to diversify your options.” — Hadley Wickham
Transcript
[Podcast intro]
Welcome to The Test Set. Here we talk with some of the brightest thinkers and tinkerers in statistical analysis, scientific computing, and machine learning, digging into what makes them tick, plus the insights, experiments, and OMG moments that shape the field. This episode is part one of a conversation with James Blair, cycling boss, data junkie, and Senior Product Manager at Posit.
Michael Chow: All right, James, so glad to have you on The Test Set. Thanks for joining us. I’m here with Hadley Wickham and Wes McKinney, and we’re so excited to talk to you, I think, all about your journey into data science. For a little bit of background, you’re a Senior Product Manager at Posit on cloud integrations. Is that right?
James Blair: Yeah, that’s correct.
Michael Chow: Nice. And what gasses you up? It seems like a lot of biking, as maybe we can tell from your background.
James Blair: My bike enthusiasm? Yeah, I don’t know if it’s a healthy obsession or an unhealthy one, but I got into cycling probably 10 years ago. I was commuting to work by bike, and then I went remote and had just bought a new bike and was like, now what do I do with this thing that I just bought? And so I started riding for fun and found that the more that I rode, the more that I enjoyed it. And it became this, you know, Hadley talks a lot about the pit of success. This became a pit of like something—a lot of time and a lot of money and a lot of carbohydrates, but I love doing it. It’s something that kind of grounds me, I guess, right? There’s a good community with cycling. I’m part of a team that races, so during the summer I have races that I go to, and it’s fun to have something that keeps me active that I enjoy doing besides just, I don’t know, trying to be motivated to be active.
Michael Chow: Yeah, that’s awesome. I hope you drop into a lot of literal pits of success too, on your bike.
James Blair: It’s funny, I got—so I have a road bike and then I have a mountain bike that I have had for a couple of years, and I feel like a lot of people that I at least interact with kind of grew up mountain biking, which is its own whole discipline, right? Like I consider myself a decent road cyclist to some degree. Mountain biking, I’m totally a fish out of water, even still. And so I have fallen in many pits on the mountain bike and have a few of the scars to prove it. But I like, it’s fun to kind of mix and match. And I live in a neat part of the country, I’m in Utah, so I have like 50 miles of mountain bike trails within a quarter mile of my house that I can get on to. So it’s easy and convenient for me to get out and ride outside and test myself and try new things and hopefully not crash too terribly.
Michael Chow: Looks like you might also be really into ribbons.
Hadley Wickham: Yeah, what’s up with your ribbon, James?
James Blair: The ribbon, yeah, I was like, so I was trying to figure out how to frame this. I was like, I can put the bike on the wall, that’s interesting. That’s all printer filament. So I have also become oddly obsessed with 3D printing to some degree. And so that’s all printer filament that used to be in these cupboards, and then they overran it. And so I incidentally 3D printed these little holders that hold them on top of the cupboards in these nice little organized ways.
Michael Chow: Are you telling us that if you were to throw open those doors that it would be more filament?
James Blair: Two of those doors are total cycling stuff, so all my cycling gear is in there. And then two of them is printer stuff. So just other things related to printing and taking care of and building stuff from 3D. And a lot of stuff that I’ve purchased in this kind of hobby, and have yet to figure out how to use, which—I’m a chronic accessorizer, which does not combine well with cycling at all, because cycling has an exorbitant amount of things you can buy. So it’s not a good thing. But I tend to get really excited about something and then I’m like, I need all this stuff. And then I have all this stuff. And I’m like, I should someday learn how to use all this stuff. So I’m slowly working my way through it. I’m getting there.
Hadley Wickham: Have you 3D printed anything for your bike?
James Blair: I have. Yeah. So nothing of consequence, right? Like 3D printing, I think is fascinating, but I’m not going to trust any critical structural component to something I’ve printed. But there’s a little inset decal that goes on the axle. And whenever I take my wheel off to put my bike in the car, sometimes I overtighten the axle, and it pushes it out. And so I lost it a while ago. So I printed a new one. And because I thought it’d be fun, and because it’s cycling and everything’s extra, I printed it in carbon fiber filament. It doesn’t make a difference. It’s a tiny little piece that weighs, I don’t even know, a fraction of a fraction of anything. But I also name my bikes, and I like superheroes. So my road bike is Black Panther. And so it’s the Black Panther logo in this little decal that I have on the front axle of my bike, which I thought was kind of a fun touch. But that’s the extent. That’s the one thing I’ve printed for the bike so far.
Michael Chow: All right. I’m glad. I love when we can get into the intersection of your hobby Venn diagram, you know, explore that intersectional space. I think this is a good segue. I mean, this does remind me—I think I saw you mentioned early on, you were really thinking about going into robotics. And then you ended up aiming at journalism and then data science. I’m so curious, maybe you could recap for folks that path into data science. One thing that struck me was the move from robotics to journalism is so intriguing.
James Blair: None of it makes sense. Yeah.
Michael Chow: I’m really curious to hear your perspective on that—what was your journey like into data science?
James Blair: I’ll try to be brief to some extent, but from a young age, I really thought robotics was really interesting. I had this long standing dream of being this robotics engineer. I think as young as six or something there’s old videotapes of me being like, I want to go to MIT and study robotics. And that was kind of the dream that I had for a long time. And then there wasn’t any sort of thing that pushed me away from that necessarily. But in high school, I ended up somehow on the school paper. I don’t remember how this happened. But I had a fantastic journalism teacher and I was—I did high school, the first part of high school was in Alabama. This was in Alabama. I had this fantastic teacher who just really embodied journalism, the pursuit of truth and what it meant to report and to report on news and how to identify things that were newsworthy and ask good questions and all these things. And I just really loved it. And so that kind of pushed me into this direction of maybe this is something I want to pursue as a career. And I found that I really enjoyed meeting with people, interviewing people, thinking about how to tell a story that was compelling. And I still to this day remember headlines of some of the articles that I wrote in high school, right? It was this deeply meaningful process of finding something interesting and then just exploring it to the very bottom and then telling that story that I found really fascinating.
So did that for a while. That led me to think maybe journalism was a career choice for me. It’s something that I could do. I did an internship with a local newspaper after I moved to Utah. So halfway through high school moved to Utah, did an internship with a local paper. That was really great. I’m a strong advocate for internships, solely because I think they help you kind of figure out, do you really want to do this? And in my case, it helped me figure out that I don’t think I want to do this, as much fun as it is to really investigate things. There’s also a lot, particularly on the journalism side, where you kind of have to work your way up. I think it’s very few people that find themselves in a position where they’re really delivering significant investigative reporting on a regular basis. Whereas I was writing a bunch of reviews of local events that had happened, which was fine. And it helped me understand the industry and kind of helped me understand things a little bit better.
So pivoted away from that. I was really into—and this feeds into the 3D printing thing a little bit too—I was really into 3D graphics and design and that kind of world for a while too. And so when I went and started college, I did my undergraduate at BYU. They have a really fantastic animation program. And so I had decided going into it that I wanted to pursue animation—that felt like an interesting place. Pixar seemed like this really exciting place to potentially work. And this whole industry just seemed really intriguing to me because of BYU’s program and the reputation they have. It’s a very, very competitive program. So the first year that I was there was a bunch of prereqs before you formally apply to the program. And all of that was nothing to do with computers. It was all hand-drawn figure drawings, rudimentary two-dimensional animation, which was really fascinating. Not anything I’d really focused on before. And also taught me that I’m a pretty terrible artist. I just was not good at it. And I think like anything, you can develop those skills. But I was with people that were so gifted and I would look at their sketches and I would look at what they were doing. And I was like, I am so clearly not on the same—I’m not even in the same city that they’re in, let alone the same ballpark. I’m just so far away from what they’re doing.
And so that kind of made me have this little crisis of, okay, well now what, what am I going to do instead? And I had, I decided that I really loved teaching and the process of teaching and helping people through this process of discovery that is learning. And so I was trying to figure out if there was something there. I ended up deciding to pursue psychology. Sorry, this is turning into a whole thing. I’m going to wrap this up. I promise.
I pursued psychology and the goal was to do instructional design. I was like, it could be really cool to build curriculums for all kinds of things. And so that was where I was headed. One of the things that was encouraged amongst psychology students was to have a statistics minor to bolster your opportunities for grad school. So that’s what I did. I was like, cool. I pursued statistics, which I had never really put any thought or emphasis on before. And suddenly there was this whole new field that I just loved. And I realized that the only thing about psychology that I found really engaging was analysis. Now that I’ve done the study or now that I’ve reached some results, I want to get in and look at what I’ve learned. What are the relationships? What did we learn? What did we not learn? Those kinds of things.
And so I ended up flipping the two. I changed my major to be statistics, my minor to be psychology. And then that introduced me really early on in the statistics program to R as a programming language. And for me, I look back on my whole career, that was the one point that really is like an inflection point for me, where all of a sudden I fell into something that I just absolutely loved. I’d never programmed before. I think I’d taken maybe one C++ class on a whim and I hated it, absolutely hated it. And all of a sudden there was this language that, to me, made a ton of sense. And particularly from a statistical standpoint, it gave me the ability to explore statistics in a very hands-on way that was fundamentally different from the mathematical proofs and concepts that we would discuss in lectures, which are undeniably important. But for me, I just really struggled to engage with that kind of learning. And then all of a sudden with programming, I was like, oh my goodness, I can see what I’m learning about in real time. And I have this total environment where I can experiment with things and I can build my own functions and my own simulations and all this stuff.
So that was this huge turning point for me. And then I graduated and as I was approaching graduation, it was right at the height—this is 2016. So it was kind of right at the height of data science as a discipline, where people were talking a lot about it, programs were being developed around it. There was a lot of enthusiasm for data science, which in my mind was just like statistics by a different name. And so I applied to a data science graduate program and was accepted and did that. And that’s kind of put me on this path of data science, programming, R specifically as a starting point into all of that. And yeah, a very winding journey, but one that I wouldn’t trade because I’ve learned a lot through the whole process.
Michael Chow: Yeah, it’s so cool. I mean, it makes me think about—I did something similar with psychology and stats and that pipeline of like, I do remember our stats classes at the time were all in SPSS. So a very point and click. And what you said about R really resonates with me where you’re really hands on and you’re able to explore and kind of get in there. But also, I feel like there’s something magical too about going from that to now being able to put up a website is such an interesting path.
James Blair: Yeah, I think one of the other key pieces here was I did, while I was still an undergrad, I did an internship with a really small bioinformatics startup. And they, I found them through a career fair and I was like, I’m studying statistics. And the guy that I was talking to was like, we might be able to use that. Cool. Why don’t you come see what you can do with us? So I showed up and I really didn’t have—they didn’t have anything they wanted me to do. They just thought I could be useful, which for some people I think would be kind of the worst case scenario. Like, please tell me what to do. For me, I’ve always been fairly autonomous. And so I showed up and was like, okay, they’re like, do stats for us, you know, we need some stats. I was like, cool.
And what ended up happening was I took this opportunity to learn Shiny. So this was like, I spent every day of this internship for a couple of weeks, just diving into the Shiny documentation. I’d heard about it before, but I’d never done anything with it. But I was like, you know what? I think I might be able to use this Shiny package and build something that could be of use to this company. So I dove into the Shiny documentation. And within the first month or six weeks that I was there, I had put together this Shiny dashboard that pulled all of their Google Analytics from their platform and put it all in this central place where they could monitor traffic and engagement with the tool that they were building. And everybody was blown away. And I felt so empowered, right? I was like, oh my, look at what R can do. Look what I can do. This is so great. I can make things that are useful using this kind of quirky little statistical programming language. And again, it was just one of those things that emphasized further in my mind that I was doing the right thing, because it was engaging and it just clicked with the way that my brain works.
Michael Chow: Yeah. It’s so cool. One thing I’m curious about is I know when I talk to people about data science and sort of data science education, I feel like people often ask the value of a master’s or some of these advanced education. Do you have advice for people who are thinking about a master’s or what to do after, say, undergrad?
James Blair: So I really enjoyed the program that I went through. I was part of the first cohort through the program. So they were kind of figuring things out as they went. In fact, it was a master’s in analytics when I started, and then I graduated with a master’s in data science. At some point they were like, you know what, if we call this data science, we could make it cooler. So that’s what they did, which was great.
I think that’s such a hard question, especially in today’s world, right? The way that the world is evolving, we live in this world of AI and what that means both today and in the future. For me, I still think there’s a lot of value there. Part of that value came from the fact that I used this program as a chance to really try to stretch myself to the rest of the data science ecosystem that I really hadn’t had time to dive into previously. So a good concrete example of this is I had done everything in R, right? I knew that Python existed as a language. That was kind of it. And so when I started this program, they told us upfront, they said, hey, all of our content is structured so that you can kind of choose how you want to do things. You can use R, you can use Python. And I made a conscious decision to do all my schoolwork in Python, just as an effort to—I knew the concepts and I knew that if I really got stuck, I could prove something out in R and then try to work myself backwards from that to figure out the equivalent on the Python side.
And so for me, that exercise, doing that with peers, being in a classroom setting was really helpful and useful. It was also—my experience was a bit unique because the program I did was out of San Francisco. But instead of moving to the Bay Area, I just lived in Utah and plane commuted to grad school for two years, which honestly, financially made a ton of sense and was kind of a nice way to do things. I had a young family at the time. And so instead of trying to figure out how we could make it work in the Bay, we just lived where we had been living. And I took every other Saturday and flew to San Francisco for the day. So it was a busy couple of years and I racked up a ton of Alaska miles, but it was fun. I enjoyed it.
And for—again, to answer the question—for me, I find a ton of value in it, I think partially from the curriculum and the instruction, but also just the experience of being with peers at that particular point in time, trying to figure out what data science is to some extent. And I think maybe we have a better answer to that question today, although it still is somewhat ambiguous, I think, particularly as we consider the landscape of AI and everything else. But I benefited and continue to benefit from the experience. So for me, it was well worth the time and investment.
Michael Chow: Yeah. I wonder, do you all have any thoughts about advanced education or do people ask you, should I do a master’s?
Hadley Wickham: Yeah. And I tell them like, I don’t know, but I will say, I think the average master’s across disciplines is generally a pretty shitty quality. That’s just such a good way for universities to make more money. So I think good master’s programs do exist, but you really have to hunt for them and you know, you should be able to, especially for a data science master’s, you should be able to ask them for their data on what happens to their graduates. I think you need to think of it as an investment. You’re going to pay X dollars to get this thing. What are you going to get out of it? How does it affect your job chances?
Wes McKinney: I’ve heard similar things. I know at a lot of universities, their data science and statistics master’s programs have become big sources of income and tuition for the programs, because typically the master’s students pay for their master’s and the PhD students are paid for out of grants and research funds. And so essentially the PhD students cost money, the master’s students make money.
But my advice generally is it depends. I think if the program has a good track record of helping students get jobs and end up transitioning successfully in the professional world, that’s an important data point. It can be helpful for people who are making some type of a career or education segue. Maybe they started out in a field where they didn’t have a lot of statistical training or data science training, and they’re looking to segue into being more of a data scientist and maybe on their resume, on paper, with their education and their work experience, they don’t have the right kind of resume to make themselves appealing to employers looking for data science candidates. Then having that type of program on your resume can be a big benefit.
But of course in 2025, the big question is whether that’s all going to be going out the window with the emergence of AI. And certainly I think data science is one of those areas where it is one of the areas of technical work that requires the most human judgment. It’s the most nuanced and is perhaps the least well-suited to automation and humans being replaced by AI agents. That doesn’t mean that a lot of startups and companies aren’t going to try, but I think compared with a lot of the areas where people are writing code and doing data analysis, I think data science is one of the areas where humans are the most essential to being in the loop.
And so perhaps there will be, I speculate there will be a lot fewer entry level data science positions than there were in the past. Just like there will be fewer entry level software engineering roles and things across the board. But there’s still a need. So I’m sure that AI will have a disruptive effect on the education industry. And so it will be interesting to see what things look like when the dust settles. And I don’t know whether it’s three years from now or five years from now or 10 years from now. Andrej Karpathy thinks we’re 10 years away from artificial general intelligence. So I don’t know what happens when we reach that point, but it’s definitely going to be interesting.
Michael Chow: Yeah. Maybe to recap some of what people said, it sounds like James, you’re saying it was really useful to have two years or some stretch of time to really focus on learning data science and have classmates and a cohort to work with. And Hadley, it sounds like you’re saying there’s a need to be discerning about master’s programs since there’s a strategic component to them in universities and things.
Hadley Wickham: I have to say, combined with what Wes was saying, choosing to invest $50,000 or whatever a master’s costs these days into data science right now does feel like a high-risk kind of bet. And I’m kind of, I don’t know, 70% joking when I say this, but I feel like if you spent $50,000 learning how to weld (it’s probably much cheaper than that to learn how to weld or do something physical), I do think those skills would be worth having. I’m mostly saying that programmers and data scientists are going to be around, but it doesn’t hurt to diversify your options and make sure you can do something in the real world too. Like James will be fine because he can 3D print things.
James Blair: I was just going to say, I think part of the benefit for me was the timing of it all, right? The world today is so different than it was 10 years ago. And part of it for me was just that it kind of, especially young in my career, helped legitimize the aspirations that I was aiming for. And I look at kind of where my career has evolved since that point, since finishing up that master’s program. I finished, I graduated from that program, spent a little time in California trying around and then started at RStudio right away. And so RStudio was the first job that I had after I concluded that program. And I think part of that was what kind of set me up to come into the role that I came into at RStudio and then to evolve from there. But again, the landscape now versus then is foreign today. It really is. It’s a totally different world.
Hadley Wickham: So I think our advice, if you’re entering data science today, the best piece of advice we can give you is be born 10 years earlier.
Michael Chow: Time travel. Just do that.
James Blair: I think what we’ve seen, or at least what I’ve seen, is there’s formal education, which I do think plays a role. I think there are reasons to be cautious and very intentional about the pursuit of that. But I also think that there’s continual value in your own hands-on experience. People talk a lot about the value of a portfolio, which you could debate: how good is it to have an impressive GitHub to show somebody? And I don’t think that’s really the point, though. I think the point is, at least from my perspective, you learn a lot through your own practice and implementation of things. And so, if nothing else, even if you’re not showing these things off, I think just putting yourself to work on interesting projects, exploring data, learning how to use tools that exist today, and learning how to use AI, to Wes’s point, to supplement but not replace the work of a data scientist, is extremely useful and can easily be accomplished outside the constraints of a formal educational program.
And I think there’s a ton of value in that exercise of let’s just practice things, right? Find an interesting problem, find some data about it, look at that data, figure out what the tools in the market today let you do with that data. And then critically, as the data scientist or aspirational data scientist, ask the hard questions about that data and then figure out how to find answers that are reproducible and defensible. That, to me, is the critical piece here: we live in a world of so much noise that part of our job as data literate individuals, whether you want to call us data scientists or statisticians, whoever, right? Whatever label you put on it, at the end of the day, the job is what is true? What is this data actually saying and why are we convinced that that’s the case? And how can we use X, Y, or Z tool, R, Python, this package, that, whatever, how can we use that to defend our position, to defend whatever decision we’ve made based on what we’ve learned?
Hadley Wickham: And to me, the thing that’s even more true today is we just have this human tendency to find data that confirms our prior beliefs. And regardless of whether that’s looking at data or working with AI, that’s something you’ve just got to be so aware of and constantly fighting against. To me, that’s at the core of good data science—really pushing against that tendency to just stop at the answer that you want to see and really think, does this actually make sense? Is there some other reason I could be seeing this pattern?
James Blair: I think that’s such a great contrast point to the way that AI operates today, right? AI just wants to be agreeable and wants to pursue the right answer. And so this idea of critical thinking and being able and willing to take a step back and kind of flip things on their head and say, but what about, or what if, or wait a minute, what if we looked at it this way instead, or what if the question was this—that to me is what makes us so critical in this world.
And maybe “us” is not the right—that’s what makes curious individuals so critical in this world. When I talk to people who are interested in the field or are trying to figure out what data science is about, and you hear this, I’m not the only person that holds this belief, but I think curiosity is such a fundamental requirement and you have to defend curiosity. In today’s AI world, I think you have to kind of hold on to curiosity. It’s something you have to intentionally just say, I’m going to remain curious because otherwise it’s really easy to open up ChatGPT or whatever tool you want and just be like carried away by the vibes. But you can kind of lose yourself in that and that’s dangerous. And so I think there’s this, you have to have this willingness to say, I’m going to remain engaged. And part of that engagement is I’m going to remain fiercely curious about whatever I’m doing.
And a big part of that for me is working on problems that are interesting to you, right? Working with data that you find personally engaging for X, Y, or Z reason. I spend a lot of time looking at cycling data. Why? I like cycling. And so for me, that data is meaningful. Is it earth shattering? No. Am I solving critical problems? No. But for me, I know what questions to ask. I understand the meaning behind the numbers because I’m on a bike every day and so I know what X, Y, or Z looks like and feels like. And so there’s this connection to what I’m doing that fundamentally helps me become more curious and ask deeper questions.
Michael Chow: Yeah, that really resonates. I do feel like too, in a lot of situations where I’ve seen data scientists really work super well is where there are numbers coming in, but they also have tons of experience with the domain at that point, or a lot of the flow of data through the company and how it taps into reality. So they can kind of connect it back, like to you being on a bike and then looking at bike data. I feel like a lot of data scientists have this sort of—they’ve been in a room with the stakeholders and people who need to make decisions. And so they can interpret the data well, or know which data to focus on more or less.
[Podcast outro]
Stay tuned for part two of our conversation with James Blair on the next episode of The Test Set. The Test Set is a production of Posit PBC, an open source and enterprise tooling data science software company. This episode was produced in collaboration with Creative Studio Adji. For more episodes, visit thetestset.co or find us on your favorite podcast platform.