Boring on purpose: the stack that survives a year in production
Every framework you adopt is migration risk you accept on day one. The cheapest production system is the one made of components that have been in production for years. Boring is a feature.
The most expensive part of an AI system is rarely the AI. It is the framework you adopted in week one that turned into a migration in month nine.
We have walked into enough engagements with this exact shape to recognise the pattern on day one. A team builds an agent on the orchestration library that was hot when they started. The library makes the demo easier. The team ships. Then the library has a major version bump that changes its core abstractions. Then it gets acquired and the abstractions change again. Then a competing library gets popular and the team's hiring pool shifts. Then a contractor who is good but only knows the new library wants to be hired. Now the team is six months into a rewrite that produces the same product they already had.
This is the framework churn tax, and many AI teams in 2026 are paying it before they have a name for it. The way out is a decision the team has to make early. Pick boring on purpose.
Adopt now Wait or avoid
──────────────────────────────────────────────────────────
Proven runtime Postgres New platform layer
Clear exit Redis, SQLite Proprietary agent OS
One job pgvector, Inngest All-in-one AI suite
Owned by you Thin wrappers Deep framework lock-in
Boring is not a preference for old tools. It is a preference for components with a clear job, a credible exit, and a low handoff cost.
What boring actually means
Boring does not mean old. It does not mean stagnant. It means proven, stable, and unlikely to require a migration in the next two years.
A few markers we use for boring:
The component has been in production at well-known engineering shops for at least three years. Not as a side project. As a load-bearing part of the system. Postgres is boring. SQLite is boring. Redis is boring. Most things that have a real birthday party event when they hit a major release are not yet boring.
The component does one thing. Components that try to be a platform tend to fight you when your needs do not match the platform's shape. The thing that lets you assemble many small components yourself does not have an opinion on your shape, which means it does not get in the way later.
The component has a credible exit. If the team that maintains it stops, the project does not die. There is community ownership, multiple competing implementations, or a clear migration path. Postgres meets this. Many newer specialised data stores do not clear this bar yet.
The community has lived through at least one major API change. Libraries in their first or second year tend to break things across versions. By year three, the project either has matured into a stable API or has been displaced by something else. Trust the libraries that have already done their breaking changes.
If a component meets all four of those, it is probably boring enough. Most components teams adopt early on AI projects meet none of them, because the boring options are not what the conference talks are about.
Postgres is the boring answer most of the time
The architecture decision we most often end up revisiting on engagements we are brought into is the data layer. Teams reach for specialised infrastructure too early. Postgres, with extensions, covers more of what an AI system needs than people think.
Vector search? pgvector. It is not the fastest pure vector database on the market. It is fast enough for many production workloads under ten million vectors, depending on dimensionality, filters, concurrency, and latency targets. It runs in your existing Postgres cluster, with your existing backups, your existing replication, your existing IAM. The operational simplicity of "it is just another column on the table" is worth more than the recall benchmark difference at the scale most systems actually operate at.
Full-text search? Postgres full-text. Solid for most use cases. If you need real BM25 with proper tuning, fine, layer Elasticsearch or Typesense in. Most teams reach for the specialised tool before they have proven the boring one is insufficient.
JSON storage? Postgres jsonb. Indexed, queryable, ACID. We have shipped agent memory layers entirely on jsonb fields with appropriate indexes, without introducing a separate document store.
Queueing? Postgres-based queues are not the fastest but they are transactional with the rest of your data, which removes whole classes of cross-system consistency bugs that come with separate queue infrastructure. For most agent workloads, the simplicity is worth more than the throughput.
Caching? Use Redis if you need it. Postgres is not the right tool. But check whether you need Redis at all. Often a small in-memory cache plus Postgres is enough.
The rule we use: start everything in Postgres. Move a workload out of Postgres when you have measured that Postgres is the bottleneck. Most workloads never reach that point.
The orchestration question
This is the most contested decision on most projects. Do you use an agent framework or roll your own orchestration?
There is no universally right answer, but there are usefully right answers depending on what you are building.
Use a framework if: you are building a system with many possible execution paths that you cannot enumerate in advance. Complex multi-agent setups, dynamic tool selection, anything where the agent's behaviour is genuinely open-ended. The framework's abstractions earn their cost here because writing the orchestration yourself would be a quarter of work.
Roll your own if: your system has a finite number of paths, the paths are knowable in advance, and the value is in the quality of each path rather than the variety of paths. Most customer-facing agents are in this category. The "agent" is really a state machine with five to ten possible flows, and modeling it as a state machine with explicit transitions makes it debuggable, testable, and inspectable. Modeling it as an open-ended agent framework with autonomous reasoning makes it none of those things.
For most production agents we build, the answer is some hand-rolled orchestration around a typed function-calling loop. A few hundred lines of code. We can read it. We can step through it. We can predict its behaviour. When something goes wrong in production, we can look at the trace and explain exactly what happened. Frameworks can make the trace inscrutable in exchange for letting you write less code, which is the wrong trade for systems that need to be debugged at 2am.
If you do use a framework, treat the framework as a temporary substrate. Write your code so that swapping the framework would be a refactor, not a rewrite. The agent business logic should live in your code, not in framework-specific abstractions. We default to thin wrappers, not deep integration.
Vendor lock-in is the bill nobody pays attention to
Every AI tool you adopt has a lock-in cost. Some are explicit and easy to see (proprietary data formats, proprietary APIs). Some are implicit and easy to ignore (the cost of training your team on the tool's mental model, the cost of building your evals against the tool's specific behaviour, the cost of writing your prompts against the tool's quirks).
The boring rule helps here too. Components with a credible exit have low lock-in cost. Components that are the centerpiece of a vendor's pitch have high lock-in cost because the vendor is incentivised to make leaving expensive.
The specific pattern to watch for: vendors that want to be the "operating system" for your AI workloads. They will manage your prompts, your tools, your observability, your evals, your deployment, your routing, your fallbacks. Each integration sounds reasonable. Each one is a strand of rope. By the time you want to leave, leaving is six months of work, and the vendor knows it.
Use vendors that do one thing well. Helicone or Langfuse for observability. Vanta or Drata for compliance plumbing. Model providers for models. Avoid anything pitching itself as a single platform for everything. The platform you can leave in a week is a platform that has to keep earning its place. The platform you cannot leave is one you have already lost the negotiation with.
The stack we tend to default to
For an AI agent in production, the defaults we reach for, in May 2026:
Postgres for everything that is not specifically not a Postgres problem. With pgvector for retrieval.
A typed function-calling loop in TypeScript or Python, hand-rolled, around the major model providers (Anthropic, OpenAI, Google, Meta). Provider SDKs wrapped in a thin abstraction that lets us swap providers without changing call sites.
Helicone or Langfuse for model observability. OpenTelemetry plus your existing APM for everything else.
Inngest or Trigger.dev for background jobs and durable workflows. Or just Postgres queues if the workload is simple.
Vercel or Cloudflare or your existing cloud for hosting. Whichever already has the rest of the company's infrastructure on it. Do not introduce a new hosting layer for the AI workload.
Vanta or Drata for compliance plumbing, partner pricing passed through.
That is the boring stack. It is not new. It does not have a hot Twitter following. It does not have a foundational framework that promises to be the agentic operating system. It also does not require a migration in a year. It is the stack that the team can hand off to the client's engineers without a six-week onboarding. It is the stack we would still be defending if we revisited the engagement two years later.
Where this leaves you
Stack decisions feel low-stakes during the build phase. They become high-stakes the moment you try to evolve the system, hand it off to a new team, or migrate off a vendor.
If you are starting an AI build now, the questions to ask before you adopt anything:
Has this been in production at a well-known engineering shop for at least three years?
Does this do one thing, or does it want to be a platform?
If the company maintaining this goes away tomorrow, what is the migration path?
How easy is it to leave this in two years?
If the answers are uncomfortable, the boring alternative is probably the right call. The boring alternative is also usually cheaper, easier to hire for, and easier to defend in a security review.
We almost never regret choosing boring. We have, more times than we can count, watched teams regret choosing exciting.
The system that is in production a year from now is the one made of components that were already boring when the system was built. That is not a coincidence. That is the operating principle.
