Software
Apache Arrow
Feb 2016 –
A cross-language development platform for in-memory analytics with a standardized columnar memory format. I helped design the core format and work primarily on the C++ and Python implementations.
Positron
Jan 2024 –
A next-generation data science IDE built by Posit, combining the power of VS Code with first-class support for Python and R data science workflows.
Apache Parquet
Jan 2014 –
A columnar storage file format optimized for big data processing. I am a PMC member and principal author of the C++ implementation and Python bindings (PyArrow’s Parquet support).
pandas
Apr 2008
The foundational data analysis and manipulation library for Python. pandas introduced the DataFrame abstraction to Python and became one of the most widely used tools in data science.
Ibis
Jan 2015
A Python library providing a DataFrame API that compiles to SQL and executes on many backends (DuckDB, PostgreSQL, BigQuery, Spark, and more).
msgvault
Jan 2026 –
Archive a lifetime of email and chat. Offline search, analytics, and AI query over your full message history. Powered by DuckDB.
spicytakes.org
Jan 2026 –
LLM-powered blog summaries and memorable quotes
VibePulse
Dec 2025 –
macOS menu bar applet to display your Claude Code and Codex token use via ccusage
moneyflow
Oct 2025 –
Power-user terminal interface for personal finances and budgeting, supporting Monarch Money, YNAB, Amazon purchases, and more