Serverless vs. Traditional VPS: Finding the Sweet Spot for AI Automation
The serverless versus VPS debate has been going on for a decade. Both sides argue passionately that the other side is doing it wrong. For an AI-native solo builder, neither answer is correct. The right answer is "use both, for different things."
I run a hybrid architecture across every PrintMoneyLab project. Frontend and lightweight routing on serverless. Long-running AI work and persistent background processes on a traditional VPS. The combination handles workloads that would break either approach used alone.
This post is about how to actually decide which workload goes where, with concrete examples from production systems.
What This Post Covers
The fundamental difference between serverless and VPS hosting, the hidden constraints that make each one fail at certain tasks, the hybrid pattern I use across all my projects, and a decision framework for picking the right host for a new workload. Plus: the cold start problem and why it matters more for AI applications than most articles admit.
What "Serverless" Actually Means
Serverless is a confusing name because there are still servers. The difference is that you don't manage them. Cloud providers spin up containers on demand, run your code for the duration of the request, then shut them down. You pay only for execution time, billed in fractions of a second.
The pitch is compelling for solo builders: no server administration, automatic scaling, near-zero idle cost. Cloudflare Pages, Cloudflare Workers, Vercel, AWS Lambda, and Netlify Functions all fit this model. For specific workloads, this is genuinely the best architecture available.
The constraints get less attention than the benefits:
Execution time limits. Most serverless platforms cap function runtime between 10 seconds and 15 minutes, depending on the provider and tier. Anything that takes longer gets killed mid-execution. For AI workflows that involve multi-step reasoning, long generation tasks, or chained API calls, this limit hits faster than you'd expect.
Cold starts. If a function hasn't been called recently, the first request has to spin up a new container. That spin-up adds latency — sometimes 100ms, sometimes several seconds. For user-facing experiences, this lag is noticeable. For background jobs, it doesn't matter.
State and memory. Each function invocation is independent. You can't keep a database connection open across requests. You can't cache data in memory between calls. Anything stateful has to live in an external system, which adds complexity and cost.
Vendor lock-in. Each platform has its own conventions, deployment patterns, and limitations. Migrating between them is harder than migrating between VPS providers because the platform shapes your code, not just your environment.
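The execution-time and state constraints above push serverless code toward a specific shape: run one step per invocation and checkpoint progress in external storage, since nothing in process memory survives between calls. Here's a minimal sketch of that pattern, with a plain dict standing in for the external store and placeholder step names; none of this is a real provider's API.

```python
# Stand-in for external storage. In production this would be a
# database, KV store, or object store -- serverless memory does
# not persist across invocations.
CHECKPOINTS = {}

# Hypothetical pipeline: each step is short enough to finish
# within a single invocation's time limit.
STEPS = ["fetch_source", "summarize", "classify", "publish"]

def run_step(name, payload):
    # Placeholder for real work (an API call, a model completion, ...).
    return payload + [name]

def handler(job_id):
    """One invocation advances the job by exactly one step."""
    state = CHECKPOINTS.get(job_id, {"step": 0, "payload": []})
    if state["step"] >= len(STEPS):
        return {"status": "done", "payload": state["payload"]}
    state["payload"] = run_step(STEPS[state["step"]], state["payload"])
    state["step"] += 1
    CHECKPOINTS[job_id] = state  # persist progress before returning
    return {"status": "in_progress", "step": state["step"]}
```

The workflow survives timeouts because no single invocation does more than one step's worth of work, but you pay for it in architectural complexity: every step boundary is now a read and a write to an external system.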
What VPS Hosting Actually Gives You
A traditional VPS is a virtual machine you control. You SSH in, install software, run processes, manage the operating system. The Oracle Cloud Always Free instance covered in Episode 3 is the canonical example for solo builders.
The benefits are the inverse of serverless constraints:
No execution limits. A process can run for milliseconds or for months. There's no timeout. Long-running AI agents, background data collectors, and persistent connections all work without architectural workarounds.
No cold starts. The server is always running. Every request hits a process that's already alive and ready. Latency is purely network plus application time.
Full state control. Memory caches, open database connections, in-process state — all of it works the way it does in any normal application. You're not architecting around constraints; you're just writing software.
Cost predictability. The VPS costs the same whether it serves zero requests or a thousand. For consistent workloads, this is cheaper than serverless. For sporadic ones, it's more expensive (you're paying for idle capacity).
The trade-off is operational overhead. You manage updates, security patches, process restarts, networking. Most of this can be automated, but the management surface exists in a way that serverless doesn't.
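The "persistent background collector" case looks roughly like this in practice. A minimal sketch of an always-on polling loop, the kind of process that is trivial on a VPS and impossible on a timeout-limited platform; the max_cycles parameter exists only so the sketch can terminate, and poll_source is a placeholder, not any real API from this post.

```python
import time

def poll_source():
    # Placeholder for the real fetch (a weather API, a data feed, ...).
    return {"fetched_at": time.time()}

def run_collector(interval_sec=60, max_cycles=None):
    """Persistent loop: in-memory state, no timeout, no cold start.
    A real collector runs until the process is stopped; max_cycles
    is here purely to make the sketch testable."""
    cache = []  # lives for the life of the process
    cycles = 0
    while max_cycles is None or cycles < max_cycles:
        cache.append(poll_source())
        cycles += 1
        if max_cycles is None or cycles < max_cycles:
            time.sleep(interval_sec)
    return cache
```

Note what isn't here: no external store, no checkpointing, no invocation boundaries. The cache is just a list in memory, which is exactly the simplicity the VPS side buys you.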
The Hybrid Pattern That Actually Works
The right architecture for most AI-native projects isn't either-or. It's a hybrid: serverless for what serverless does well, VPS for what serverless can't do.
Here's how the pattern plays out across my projects.
The split happens at the boundary between user-facing and system-facing work. Anything a user directly loads in their browser goes on Cloudflare Pages, because edge distribution makes the experience faster globally. Anything that needs to run continuously, hold state, or take longer than a few seconds goes on the VPS.
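That split can be written down as a simple routing table. The placements below reflect what this post describes; the "edge" and "vps" labels are my shorthand, not platform terminology.

```python
# Shorthand routing table for the user-facing vs. system-facing split.
# "edge" = serverless (Cloudflare Pages/Workers and similar),
# "vps" = the always-on box.
PLACEMENT = {
    "static site / SPA frontend":       "edge",
    "lightweight API routing":          "edge",
    "long-running AI workflows":        "vps",
    "persistent background collectors": "vps",
    "stateful / cached endpoints":      "vps",
}

def host_for(workload):
    return PLACEMENT.get(workload, "decide case by case")
```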
Why Cold Starts Matter More for AI
Cold start latency is often dismissed as a minor concern. For most web apps, it is — users wait an extra second on the first request, then everything's fast. For AI applications, the dynamic is different.
AI calls are already slow. A Claude completion takes 1-3 seconds for short outputs, longer for complex ones. Adding a serverless cold start of 1-2 seconds on top of that pushes total response time past the threshold where users perceive the app as broken instead of just slow.
The user doesn't see "your function is cold-starting." They see "this app is unresponsive." Whether the lag is in your code or the platform doesn't matter to them.
For latency-sensitive AI features, keeping the application warm matters. On a VPS, the application is always running — no cold starts ever. On serverless, you have to either accept occasional first-request lag, pay for "always-on" provisioned capacity, or design around the cold start with optimistic loading patterns.
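If you do keep a latency-sensitive endpoint on serverless, one common mitigation is a scheduled warm-up ping: a cheap request fired every few minutes so the platform keeps a container alive. A minimal sketch using only the Python standard library; the URL is a placeholder, and whether pinging actually keeps your function warm depends on the platform.

```python
import urllib.request

WARMUP_URL = "https://example.com/api/health"  # hypothetical endpoint

def ping(url=WARMUP_URL, timeout=5):
    """Fire a cheap request so the platform keeps a container warm.
    Run this from any scheduler (cron on the VPS, a Worker cron
    trigger, ...) during the hours when latency matters."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        return False  # a failed warm-up ping is not an incident
```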
This is one of the main reasons my AI workloads run on the VPS. The Weather Bot polls data every 60 seconds; the kr-sentiment endpoint serves cached responses in milliseconds when warm. Both would work on serverless, but the architectural complexity to handle cold starts gracefully wasn't worth it.
What Goes on Each Side, In Practice
For someone setting up their first hybrid stack, here's what fits naturally on each side based on what I run today.
Serverless edge fits well for: static sites, single-page applications, lightweight API endpoints that finish quickly, webhook handlers, image transformations, geographic-sensitive content delivery, and any workload where automatic global scaling matters more than execution time.
Traditional VPS fits well for: persistent backend services, AI agents that orchestrate multi-step workflows, scheduled jobs that run for hours, services that need in-memory caching, processes that hold long-lived connections (database, WebSocket, message queues), and any workload where consistent latency matters more than scaling on demand.
The middle ground — medium-duration workloads with moderate state requirements — can go either way depending on cost projections and how much operational complexity you can absorb.
The Cost Math at Different Scales
This is the part most articles get wrong. Serverless is "cheaper" only for certain traffic patterns. At consistent high traffic, a VPS is dramatically cheaper. At low or sporadic traffic, serverless wins.
For a solo project doing maybe 1,000-10,000 API calls per day, both options can fit in free tiers and cost is essentially zero either way. The decision is about architecture fit, not cost.
For a project doing millions of requests per month, the math diverges. Serverless costs scale linearly with traffic. VPS costs are flat until you hit capacity. There's a crossover point where one is cheaper than the other, and it depends on your specific usage pattern.
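A back-of-the-envelope way to find that crossover. The per-million price here is purely illustrative, not any provider's real rate, and real serverless bills also include per-GB-second compute charges that this sketch ignores.

```python
def monthly_serverless_cost(requests, price_per_million=0.60):
    """Linear in traffic. price_per_million is an illustrative
    figure, not a real provider's rate."""
    return requests / 1_000_000 * price_per_million

def crossover_requests(vps_monthly_cost, price_per_million=0.60):
    """Traffic level at which a flat VPS bill equals the
    serverless bill; above it, the VPS is cheaper."""
    return vps_monthly_cost / price_per_million * 1_000_000
```

At these made-up numbers, a $6/month VPS breaks even at 10 million requests per month. Plug in your own provider's pricing; the shape of the comparison (flat line versus straight line through the origin) is the part that generalizes.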
For a project with massive idle periods between bursts, serverless wins easily. The Cloudflare Pages site for Flight Compensation Checker might serve 50 requests on a quiet day and 5,000 during a viral moment. On a VPS, I'd be paying for idle capacity 95% of the time. On serverless, I pay only for the bursts.
For a project running constant workloads — like Weather Bot's data collector, which polls APIs every 60 seconds, 24/7 — the VPS approach is dramatically cheaper. The same workload on serverless would mean roughly 43,200 billable invocations a month (one per minute, around the clock), each one metered.
Where to Start
If you're building your first project and not sure where to host it, here's the honest path: start serverless if your project is purely user-facing (a website, a simple form, a static tool). Start VPS if your project needs anything that runs continuously or for more than 30 seconds at a time.
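That advice compresses into a first-pass heuristic. A sketch of my reading of it, not a substitute for running your own cost projections.

```python
def pick_host(runs_continuously=False, max_runtime_sec=0,
              needs_in_memory_state=False):
    """First-pass heuristic: anything continuous, long-running, or
    stateful starts on a VPS; purely user-facing work starts serverless."""
    if runs_continuously or max_runtime_sec > 30 or needs_in_memory_state:
        return "vps"
    return "serverless"
```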
Don't try to design the perfect hybrid architecture upfront. Start with one host that fits your primary workload. As the project grows and you discover needs the original host doesn't handle well, add the other side — serverless for the burst traffic, VPS for the long-running jobs.
The hybrid model emerges naturally if you let architecture follow real requirements instead of forcing a pattern from day one.
What's Next
Hosting decisions are upstream of deployment workflows. The next post in this series goes deep on the deployment pipeline itself — specifically, how GitHub plus Cloudflare Pages turns "push to main" into "live in production globally" with no manual steps. The setup is shorter than you'd expect, and the productivity difference is bigger than you'd expect.
More posts in this series will cover the actual stack — deployment automation, secrets management, monitoring, and the workflows that hold everything together. If you're working on shipping something with AI tools and have questions, drop them in the comments — the more we share, the faster we all move.
Disclaimer: This blog documents practical development workflows based on personal experience. Nothing here is financial, legal, or professional advice.