Engineering ResumeRoast: Designing a Scalable Application

Software Engineer focused on building scalable web applications using Python, React and AWS.

Why Did I Build This?

I wanted to build something that went beyond a simple CRUD app, something that would force me to make real design decisions. ResumeRoast seemed like a fun idea with a deceptively simple use case:

  • Upload your resume

  • AI gives you brutally honest, humorous feedback on it

Looks simple, right? I thought the same. Little did I know, you can over-engineer anything. In my case I didn't. I just made simple decisions that turned it into a scalable web application that reflects how real systems work.

The moment I started thinking about the system behind it, the decisions compounded fast. How do I handle file uploads? What happens when the AI takes 10 seconds to respond? What if 50 users upload at the same time? How do I protect user data from being exposed through the LLM? How do I prevent abuse of the roast endpoint?

These aren't hypothetical scale problems. They're problems you need to design for from day one, even in a side project, because changing your architecture mid-build is painful.

This blog series documents the engineering decisions I made while building ResumeRoast. Not just what I chose, but why, and what I'd do differently.

I also want to thank Hussein Nasser and Arpit Bhayani for teaching people to think like engineers instead of just solving problems blindly.

What the System Needs to Do

Before picking any technology, I wrote down what the system actually needed to do. This sounds obvious, but it's the step most people skip:

  • A user registers, logs in, and uploads a resume (PDF)

  • The system extracts text from the PDF and sends it to an AI model

  • The AI response is processed and returned to the user as a formatted roast

  • Users on different tiers have different roast limits and feature access

  • The system should not block on slow AI responses

  • Files should persist across deployments and container restarts

The last three requirements immediately ruled out a naive implementation. You can't just call an AI service synchronously and hope the HTTP request doesn't time out. You can't store files in a local folder if you're using Docker. And you can't enforce rate limits without some shared state between your services.

These requirements drove every decision below.

Architecture

Here's what the final system looks like, before we dig into the individual decisions:

The web server never blocks on AI processing. The worker never handles HTTP requests. The database never stores files. Each piece does exactly one thing. Let's look at how I arrived here.

If I were to move the entire application to the cloud, this may be one possible architecture for ResumeRoast.

The Tradeoffs

Sessions vs JWT

For authentication, I went with sessions over JWT. JWT is the popular recommendation: stateless, portable, no server-side storage needed. But in practice, implementing JWT correctly means dealing with refresh token rotation, short expiry windows, and a blocklist for logout invalidation. That complexity wasn't justified for this project at this point.

Instead, I created a UserSessions table directly in PostgreSQL. Every login creates a session row, every request validates against it, and logout is a simple delete. The tradeoff is a database query on every authenticated request. But at this scale that's a non-issue, and I get something JWT can't easily give me: the ability to query active sessions, force-expire them, and maintain a full login audit trail. If traffic ever demands it, migrating session lookups to Redis is a well-defined step.
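A minimal sketch of the session flow described above, not the actual ResumeRoast code. It uses Python's built-in sqlite3 module only so the example runs on its own; the same SQL shape would apply to a PostgreSQL table. The table layout, the column names, the helper names, and the 7-day lifetime are all assumptions made for illustration.

```python
import secrets
import sqlite3
from datetime import datetime, timedelta, timezone

SESSION_TTL = timedelta(days=7)  # assumed lifetime, not taken from the article


def init_db(conn):
    # Hypothetical schema: one row per active login.
    conn.execute(
        """CREATE TABLE user_sessions (
               token      TEXT PRIMARY KEY,
               user_id    INTEGER NOT NULL,
               created_at TEXT NOT NULL,
               expires_at TEXT NOT NULL
           )"""
    )


def login(conn, user_id):
    """Create a session row and return its opaque token."""
    now = datetime.now(timezone.utc)
    token = secrets.token_urlsafe(32)
    conn.execute(
        "INSERT INTO user_sessions VALUES (?, ?, ?, ?)",
        (token, user_id, now.isoformat(), (now + SESSION_TTL).isoformat()),
    )
    return token


def validate(conn, token):
    """Return the user id for a live session, or None. Runs on every request."""
    row = conn.execute(
        "SELECT user_id, expires_at FROM user_sessions WHERE token = ?", (token,)
    ).fetchone()
    if row is None:
        return None
    user_id, expires_at = row
    if datetime.fromisoformat(expires_at) <= datetime.now(timezone.utc):
        return None
    return user_id


def logout(conn, token):
    """Logout is a single delete."""
    conn.execute("DELETE FROM user_sessions WHERE token = ?", (token,))


def force_expire_user(conn, user_id):
    """Revoke every session a user has; returns how many rows were removed."""
    cur = conn.execute("DELETE FROM user_sessions WHERE user_id = ?", (user_id,))
    return cur.rowcount
```

Revocation here is just a row delete, which is the property that would otherwise need a blocklist with JWT.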

Adding infrastructure before you need it isn't engineering, it's anxiety.

PostgreSQL vs SQLite

SQLite was genuinely tempting. Zero configuration, runs in-process, and local development is frictionless. For a solo project with modest traffic, it's hard to argue against it.

The argument against it came when I thought about concurrent writes. A Celery worker finishes processing a roast and writes the result to the database. At the same time, the web server is writing a new task record because another user just uploaded their resume. SQLite allows only one writer at a time: a write locks the whole database file, so concurrent writes queue up behind each other (WAL mode lets readers proceed, but still permits a single writer). Under sustained write concurrency, this becomes a bottleneck you can't tune your way out of.

PostgreSQL handles concurrent connections gracefully with row-level locking, and it plays well with the major ORMs and migration tools.
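The single-writer behaviour is easy to see with the standard library. The sketch below opens two connections to the same SQLite file, holds a write transaction on the first, and checks whether the second can start one. The table name and the function name are made up for the demo. The second connection uses `timeout=0` so it fails immediately; with the default timeout it would wait a few seconds for the lock instead.

```python
import os
import sqlite3
import tempfile


def second_writer_is_blocked() -> bool:
    """Return True if a second connection cannot write while the first holds a write lock."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.db")
        # isolation_level=None means autocommit, so BEGIN/COMMIT are issued explicitly.
        writer_a = sqlite3.connect(path, isolation_level=None)
        writer_b = sqlite3.connect(path, isolation_level=None, timeout=0)
        try:
            writer_a.execute("CREATE TABLE roasts (id INTEGER PRIMARY KEY, body TEXT)")

            # Think of writer_a as the worker saving a finished roast.
            writer_a.execute("BEGIN IMMEDIATE")  # takes the database write lock
            writer_a.execute("INSERT INTO roasts (body) VALUES ('worker result')")

            # Think of writer_b as the web server recording a new task.
            try:
                writer_b.execute("BEGIN IMMEDIATE")
                writer_b.execute("INSERT INTO roasts (body) VALUES ('new task')")
                writer_b.execute("COMMIT")
                blocked = False
            except sqlite3.OperationalError:  # "database is locked"
                blocked = True

            writer_a.execute("COMMIT")
            return blocked
        finally:
            writer_a.close()
            writer_b.close()


if __name__ == "__main__":
    print("second writer blocked:", second_writer_is_blocked())
```

In a real application the second writer usually waits rather than fails, so the cost shows up as latency under load instead of as errors.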

Local Storage vs S3

Storing uploaded files locally felt fine until I thought about Docker. When you containerize your application, the file system inside that container is ephemeral: anything written to it is lost when the container is removed and recreated. A resume uploaded to /uploads/resume.pdf on one container is invisible to every other container, and gone on the next deploy.

You can persist it using Docker volumes, but that just moves the problem. Volumes are still tied to a single host. The moment you scale to multiple instances, you're stuck.

S3 solves this by being completely external to your application. Every service (the web server, the Celery worker, any future service) reads and writes from the same place. The URL for a file doesn't change between deploys, and object storage is horizontally scalable by design in a way local disk never is.

This decision also opened up presigned URLs, a pattern where S3 generates a temporary signed URL that lets a client upload directly to S3 without the file ever touching your web server. I'll discuss this in another post.

Redis

At this point I had a relational database for structured data and object storage for files. Redis entered as a third, distinct category — not a replacement for either, but a complement to both.

Redis is an in-memory data store. It's fast in a way PostgreSQL can't match for certain operations, and it supports data structures like lists and sorted sets, plus pub/sub messaging, that make certain problems trivially easy. In ResumeRoast, Redis does two jobs:

  • Task queue backend — Celery uses Redis as its broker to store and dispatch tasks

  • Rate limiting counters — incrementing a counter per user per minute is exactly what Redis is built for

Redis isn't a database replacement. It's the layer between your application and your database that handles anything needing to be fast, temporary, or shared between services in real time.
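The per-user, per-minute counter mentioned above is usually a fixed-window limiter built on two Redis commands, INCR and EXPIRE. The sketch below shows that logic against a tiny in-memory stand-in (`FakeRedis`, a hypothetical class written for this example) so it runs without a Redis server. With a real client such as redis-py, the same `incr` and `expire` calls would go to Redis. The key format and the limits are assumptions, not the values ResumeRoast uses.

```python
import time


class FakeRedis:
    """In-memory stand-in for the two Redis commands used below (INCR, EXPIRE)."""

    def __init__(self, clock=time.time):
        self._clock = clock
        self._data = {}  # key -> (value, expires_at or None)

    def _live(self, key):
        # Drop the key if its expiry has passed, as Redis would.
        entry = self._data.get(key)
        if entry and entry[1] is not None and entry[1] <= self._clock():
            del self._data[key]
            return None
        return entry

    def incr(self, key):
        entry = self._live(key)
        value = (entry[0] if entry else 0) + 1
        self._data[key] = (value, entry[1] if entry else None)
        return value

    def expire(self, key, seconds):
        entry = self._live(key)
        if entry:
            self._data[key] = (entry[0], self._clock() + seconds)


def allow_request(store, user_id, limit, window_seconds=60, clock=time.time):
    """Fixed-window limiter: at most `limit` requests per user per window."""
    window = int(clock() // window_seconds)
    key = f"ratelimit:{user_id}:{window}"  # hypothetical key format
    count = store.incr(key)
    if count == 1:
        # First hit in this window: let the counter clean itself up.
        store.expire(key, window_seconds)
    return count <= limit


if __name__ == "__main__":
    store = FakeRedis()
    for attempt in range(1, 5):
        print(attempt, allow_request(store, "user-1", limit=3))
```

Because the counter lives in a store every service can reach, the web server and any other process see the same count, which is the shared state the rate limit depends on.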

Async vs Sync Processing

This was the most consequential decision, and the one that justified every other component in the architecture.

The naive implementation of "roast my resume" is user uploads file -> server calls AI API -> server returns response. Simple. But an AI API call can take anywhere from 5 to 30 seconds. Holding an HTTP connection open that long while the user stares at a spinner is a poor experience. More critically, if your server is handling 50 requests simultaneously, you now have 50 threads blocked waiting on I/O, and your server will run out of workers.

The async pattern solves this cleanly: the web server's only job is to accept the upload, enqueue a task, and immediately return a task_id to the client. The Celery worker picks up the task independently, does the heavy lifting, and writes the result to the database. The frontend polls for completion using the task_id.
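The shape of that flow can be sketched with the standard library alone. Here `queue.Queue` stands in for the Redis broker, a dictionary stands in for the results table, and a thread stands in for the Celery worker. The names `submit_roast`, `get_status`, and `fake_ai_roast` are hypothetical; this illustrates the pattern and is not the project's Celery code.

```python
import queue
import threading
import uuid

task_queue = queue.Queue()      # stand-in for the Redis broker
results = {}                    # stand-in for the task/results table
results_lock = threading.Lock()


def fake_ai_roast(resume_text):
    """Placeholder for the slow AI call."""
    return f"Roast: {len(resume_text.split())} words and none of them is 'impact'."


def submit_roast(resume_text):
    """What the web server does: record the task, enqueue it, return at once."""
    task_id = str(uuid.uuid4())
    with results_lock:
        results[task_id] = {"status": "PENDING", "roast": None}
    task_queue.put((task_id, resume_text))
    return task_id


def get_status(task_id):
    """What the polling endpoint does: read the current state of a task."""
    with results_lock:
        return dict(results[task_id])


def worker():
    """What the worker does: pull tasks, do the slow part, store the result."""
    while True:
        item = task_queue.get()
        if item is None:            # sentinel used to stop the demo worker
            task_queue.task_done()
            break
        task_id, resume_text = item
        roast = fake_ai_roast(resume_text)
        with results_lock:
            results[task_id] = {"status": "DONE", "roast": roast}
        task_queue.task_done()


if __name__ == "__main__":
    threading.Thread(target=worker, daemon=True).start()
    task_id = submit_roast("Python React AWS")
    print("returned immediately with", task_id)
    task_queue.join()               # a real client would poll instead
    print(get_status(task_id))
```

The request path only touches `submit_roast` and `get_status`, both of which return quickly regardless of how long the AI call takes.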

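The async pattern also makes failure resilience cheap: if a worker crashes mid-task, the task can go back to the queue and get retried, provided tasks are acknowledged only after they finish (Celery's acks_late option; by default Celery acknowledges a task before running it). That's something a synchronous approach can never offer. More on this in another post.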
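Whether a crashed task is retried comes down to when the queue considers it done. The sketch below models late acknowledgement in memory: a task is only removed once the worker acknowledges it, and an unacknowledged task is put back, up to a retry cap. `LateAckQueue` and `run_worker` are hypothetical names, and a worker crash is simulated with an exception. A real broker redelivers unacknowledged messages through its own mechanism.

```python
from collections import deque


class LateAckQueue:
    """Toy queue where a task leaves the system only after an explicit ack."""

    def __init__(self, max_attempts=3):
        self.pending = deque()   # (task_id, payload, attempts so far)
        self.in_flight = {}      # task_id -> (payload, attempts)
        self.dead = []           # task ids that exhausted their attempts
        self.max_attempts = max_attempts

    def put(self, task_id, payload):
        self.pending.append((task_id, payload, 0))

    def reserve(self):
        """Hand a task to a worker without forgetting it."""
        if not self.pending:
            return None
        task_id, payload, attempts = self.pending.popleft()
        self.in_flight[task_id] = (payload, attempts + 1)
        return task_id, payload

    def ack(self, task_id):
        """Worker finished successfully: now the task can be dropped."""
        del self.in_flight[task_id]

    def requeue(self, task_id):
        """Worker was lost before acking: retry, or give up after max_attempts."""
        payload, attempts = self.in_flight.pop(task_id)
        if attempts >= self.max_attempts:
            self.dead.append(task_id)
        else:
            self.pending.append((task_id, payload, attempts))


def run_worker(q, handler):
    """Process tasks until the queue is empty; return results keyed by task id."""
    results = {}
    while (job := q.reserve()) is not None:
        task_id, payload = job
        try:
            results[task_id] = handler(payload)
        except Exception:
            q.requeue(task_id)   # simulated crash: the task was never acked
        else:
            q.ack(task_id)
    return results
```

The retry cap matters: without it, a task that always fails would loop forever, which is why the sketch keeps a separate list for tasks that ran out of attempts.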

Summary

Decision | Chosen | Key Reason
Auth | Sessions + PostgreSQL | Simpler, instant revocation, full audit trail
Database | PostgreSQL | Concurrent writes, row-level locking
File Storage | AWS S3 | External, durable, container-agnostic
Cache / Queue | Redis | Fast ephemeral store for tasks and rate limiting
Processing | Async (Celery) | Never block HTTP threads on slow AI calls

None of these decisions are exotic. They're standard tools used deliberately, with a clear understanding of why the alternative would fail — not discovered by accident when things broke in production.

In the next post, I'll go deeper on the security layer: why rate limiting strategy depends on what you're protecting against, and how I handled prompt injection attacks on the roast endpoint.

Thank you for reading the article. If you found it informative or interesting, please give it a thumbs up. I would highly appreciate it if you could share it with your friends as well.