MyCase is a legal practice management platform under the 8am brand. I helped start the AI team, which spun out of a hackathon team I led. We designed and launched the Case Assistant, a first-of-its-kind general-purpose agent for case knowledge work. I specifically owned agent architecture, evals, prompting, observability, and the overall agentic experience. Users can ask it things like:
AI coaching and analytics for Strava athletes. Users ask natural-language questions about their training data and get insights via on-demand analysis. Under the hood, an agent writes and executes Python code in a sandboxed environment. 300 users. Went viral when Strava asked me to shut it down for violating their API terms. I got them to reverse the decision. The inspiration for this project was my Strava Analysis app from 2022, which I wanted to rebuild as an AI-native product. Live here (demo mode available even if you don't use Strava).
A coding agent built from scratch in ~1000 lines of Python. Can understand, write, and edit code. Inspired by Anthropic's Claude Code. Very bare-bones—won't compare to the real thing—but helped me build intuition on how these agents actually work. Repo here.
AI agent that searches the web to find jobs tailored to your specific criteria. Built with Idode Kerobo. Inspired by Google/OpenAI's Deep Research paradigm—agentic search applied to a specific use case. 500 users on launch day. Demo here.
Fine-tuned five variations of GPT-4o to generate writing in my dad's blog style. Compared against a prompt-based approach with Claude. Writeup on learnings around fine-tuning, synthetic data generation, and style here.
Cofounded a mobile app that turns articles into AI-generated audio summaries, launched before Google's NotebookLM audio feature. Learned Swift and web scraping on the fly. Built with Idode Kerobo. 20,000+ palates generated. 4.7 rating on App Store. Available on iOS.
A tool to vibe test prompts with an Elo-style leaderboard. Despite so much talk about how to evaluate LLM-based tasks, I haven't found any good tools to systematically experiment with prompts where 'good' is subjective. Inspired by the Chatbot Arena Leaderboard: set up some prompts and models for a task, see two outputs, pick a favorite, and win rates get logged to a Google Sheet. Demo here.
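For a sense of how pairwise picks can turn into a ranking, here is a minimal sketch of a standard Elo update. It is not the tool's actual code: the starting rating of 1000, the K-factor of 32, and the names `expected_score`, `record_vote`, and the `prompt_*` labels are all assumptions made for illustration.

```python
from collections import defaultdict

K = 32  # assumed step size; larger values make ratings move faster


def expected_score(r_a: float, r_b: float) -> float:
    """Probability that A is preferred over B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400))


def record_vote(ratings, winner: str, loser: str, k: float = K) -> None:
    """Shift rating points from the loser to the winner of one comparison."""
    delta = k * (1.0 - expected_score(ratings[winner], ratings[loser]))
    ratings[winner] += delta
    ratings[loser] -= delta


# Every variant starts at the same (assumed) rating of 1000.
ratings = defaultdict(lambda: 1000.0)

# Hypothetical votes as (preferred output, other output) pairs.
votes = [
    ("prompt_a", "prompt_b"),
    ("prompt_a", "prompt_c"),
    ("prompt_c", "prompt_b"),
]
for winner, loser in votes:
    record_vote(ratings, winner, loser)

# Highest-rated variant first.
leaderboard = sorted(ratings.items(), key=lambda kv: kv[1], reverse=True)
for name, rating in leaderboard:
    print(f"{name}: {rating:.1f}")
```

Because each vote moves the same number of points from loser to winner, the total rating pool stays constant, so the leaderboard only reflects relative preference.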
Using the G-Eval procedure to compare the performance of Claude Opus and GPT-4 on a summarization task.
An LLM fine-tuned on a curated set of wisdom, advice, and insights. Great for life advice. I wrote some notes about the fine-tuning process here.
A very basic Python implementation of an LLM-powered agent following the ReAct (reasoning + acting) framework. I shared some notes on the implementation here.
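The core ReAct loop is small enough to sketch in a few lines. The version below is illustrative only and is not the code from the project: the `Action: tool[input]` text format, the `calculator` tool, and the `scripted_llm` stand-in (used here in place of a real model call so the example runs offline) are all assumptions.

```python
import re


def calculator(expression: str) -> str:
    """Toy tool: evaluates plain arithmetic and nothing else."""
    if not re.fullmatch(r"[0-9+\-*/(). ]+", expression):
        return "error: unsupported expression"
    try:
        return str(eval(expression, {"__builtins__": {}}, {}))
    except Exception as exc:  # e.g. division by zero or bad syntax
        return f"error: {exc}"


TOOLS = {"calculator": calculator}
ACTION_RE = re.compile(r"Action:\s*(\w+)\[(.*)\]")
ANSWER_RE = re.compile(r"Final Answer:\s*(.*)")


def react(question: str, llm, max_steps: int = 5):
    """Alternate model output (thought + action) with tool observations."""
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        reply = llm(transcript)
        transcript += reply + "\n"
        answer = ANSWER_RE.search(reply)
        if answer:
            return answer.group(1).strip()
        action = ACTION_RE.search(reply)
        if action:
            name, arg = action.groups()
            tool = TOOLS.get(name)
            observation = tool(arg) if tool else f"error: unknown tool {name}"
            # Feed the tool result back so the next step can reason over it.
            transcript += f"Observation: {observation}\n"
    return None  # gave up after max_steps


def scripted_llm(transcript: str) -> str:
    """Hypothetical stand-in for a model: acts once, then answers."""
    if "Observation:" not in transcript:
        return "Thought: I need to compute this.\nAction: calculator[12 * 7]"
    observation = transcript.rsplit("Observation:", 1)[1].strip()
    return f"Thought: I have the result.\nFinal Answer: {observation}"


print(react("What is 12 * 7?", scripted_llm))
```

Swapping `scripted_llm` for a function that calls a real model is the only change needed to make the loop do real work; the parsing and tool dispatch stay the same.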
An exploration of how we can use language models to evaluate the quality of their own summarization performance.
An agent which can analyze and help you understand your spending. Built two versions: one using LangChain Agents and one using OpenAI's Assistants API.
Live link above or view a demo here. The backend is deployed on a free Heroku tier so if it doesn't work right away, just try again in 30 seconds :).
Handles queries like "How much did I spend in January?" or "What did I spend the most on this year?"
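Queries like these map naturally onto small functions an agent can call as tools. The sketch below shows that idea under stated assumptions: the `TRANSACTIONS` rows are made-up sample data, and `total_spent` and `top_category` are hypothetical helper names, not the functions used in either version of the app.

```python
from collections import defaultdict
from datetime import date

# Made-up sample data: (date, category, amount in dollars).
TRANSACTIONS = [
    (date(2024, 1, 5), "groceries", 82.50),
    (date(2024, 1, 19), "dining", 41.00),
    (date(2024, 2, 2), "groceries", 64.25),
    (date(2024, 2, 14), "travel", 310.00),
]


def total_spent(year: int, month: int) -> float:
    """Answers 'How much did I spend in January?' style questions."""
    return sum(
        amount
        for day, _category, amount in TRANSACTIONS
        if day.year == year and day.month == month
    )


def top_category(year: int):
    """Answers 'What did I spend the most on this year?' style questions."""
    totals = defaultdict(float)
    for day, category, amount in TRANSACTIONS:
        if day.year == year:
            totals[category] += amount
    # Returns (category, total), or None when the year has no transactions.
    return max(totals.items(), key=lambda kv: kv[1], default=None)


print(total_spent(2024, 1))
print(top_category(2024))
```

The agent's job is then to pick the right function and arguments from the user's question, while the arithmetic itself stays in ordinary code.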
Independent research exploring how to evaluate and improve the quality of Retrieval Augmented Generation. Related writing:
10 Ways to Improve the Performance of Retrieval Augmented Generation Systems
The Issue with Data Supported Chatbots
Exploring how to make a chatbot that can answer any question about the legal tech industry. Built using GPT-3.5 connected to a large collection of information on the industry.
Handles queries like "Tell me about some legal tech companies using AI" or "What does the company MyCase do?"
I use the app Strava to track my runs and bike rides. A while back I published a piece on how to access and analyze activity data for other Strava users. A lot of people reached out and asked if I could do this for them, so I built a web app that allows users to authenticate with Strava and receive a report containing stats and trends for their top activities. You can check out the live app by clicking above, or see a demo here. In 2025 I built an AI-native version of this app called Caiden.
No longer live
There was a dramatic rise in violent crime in Washington DC, where I live, over the past few years. I built this dashboard, which keeps track of the number of shootings and robberies reported by the DC Police Department via Twitter.
Through this Solidity / React app, you can send me an article you think I should read and it will be stored as a transaction on the Ethereum blockchain. When you send me an article, you may also be randomly chosen to receive a small prize in ETH.
No longer live
A simple game that gives couples three levels of questions to ask each other: chill, deep, and deeper. Couples, and even friends, can use it to get to know each other or just fill some time.
I noticed a trend of influencers and niche leaders hosting job boards. Most were designing and building their own from scratch. I built an MVP for a web app that would abstract this process and serve as a Shopify for job boards. Pallet has since come along and built a really cool business around a more sophisticated execution of this idea.
Tutible was a platform designed to match college students with peer tutors who had taken their class. I designed and built this with the no-code platform Bubble before learning to program web apps.
A directory of small businesses in my hometown. I built this with Bubble at the beginning of the pandemic with the intention of helping stores share info about their operations, hours, or deliveries.
A Bluetooth-enabled smart lock that I built with an Arduino.