Friday, September 26, 2025

On the Nature of Microservices

[Image: a series of interlocking sawtoothed gears, intended to represent a set of microservices]

Microservices entered the lexicon of systems engineering a number of years ago. The idea is to break up a system that might once have been delivered as a single large binary (though usually with at least a separate relational database) into a series of small services, each performing a specific function and communicating amongst themselves to provide the overall service.

In my experience at least, moving to a microservices architecture does solve real problems, but the problems it solves are as much organizational as technical.

In a very large engineering organization, with many teams all trying to work together to deliver a solution, one frequently loses velocity simply because of the number of teams jostling against each other:

  1. The release process grows over time, and tends to add process of the form "make sure that problem never happens again."
  2. Rollbacks roll back everything. When every team has P0 deliverables, no team has P0 deliverables.
  3. The internal architecture might have been carefully planned... or might not. Even without an "if I can call it, I can use it" mentality, there can still be significant underspecification.


Moving to Microservices

A team that moves to microservices, devoting considerable effort to do so, has every incentive to solve these issues:

  1. The release process for each microservice can be (mostly) decoupled, given disciplined versioning of the APIs between producers and consumers (see the sketch after this list).
  2. Rollbacks impact only a portion of the overall service.
  3. The interfaces between microservices gain an explicit API boundary, though Hyrum's Law still applies: behavior that was never promised can become a load-bearing dependency.
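
As a concrete illustration of that decoupling, a producer can keep serving an old API version while consumers migrate to a new one at their own pace. Below is a minimal sketch in Go; the /v1/ and /v2/ routes, handler names, and response shapes are hypothetical, not drawn from any particular service:

    package main

    import (
        "encoding/json"
        "net/http"
    )

    // userV1 returns the original, bare-string response. Consumers
    // still on v1 keep working unchanged while v2 ships.
    func userV1(w http.ResponseWriter, r *http.Request) {
        json.NewEncoder(w).Encode("alice")
    }

    // userV2 returns a richer, structured response. Consumers migrate
    // when ready; v1 is retired only once its callers are gone.
    func userV2(w http.ResponseWriter, r *http.Request) {
        json.NewEncoder(w).Encode(struct {
            Name string `json:"name"`
            ID   int    `json:"id"`
        }{"alice", 42})
    }

    func main() {
        // Both versions are served simultaneously, so the producer's
        // release schedule no longer gates its consumers'.
        http.HandleFunc("/v1/user", userV1)
        http.HandleFunc("/v2/user", userV2)
        http.ListenAndServe(":8080", nil)
    }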

The tradeoff is that one now has a distributed system, and distributed systems are hard.

  • Debugging is much more likely to cross multiple processes.
  • The 95th and 99th percentile tail latencies will suffer, as long code paths now incur extra cross-process hops: chain five services that are each slow 1% of the time, and roughly 5% of requests (1 - 0.99^5) hit at least one slow hop.
  • A robust design will degrade gracefully when subsystems are unavailable, trading the potential for total outage for more frequent partial failure (see the sketch after this list).
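
To make the last point concrete: rather than failing an entire request when one subsystem is down, each call site can impose a short timeout and fall back to a cheaper default. A minimal sketch in Go, where the recommendation service, its URL, and the fallback values are all hypothetical:

    package main

    import (
        "context"
        "fmt"
        "net/http"
        "time"
    )

    // fetchRecommendations asks a (hypothetical) downstream service for
    // personalized results, but gives up quickly rather than stalling
    // the whole request on an unavailable subsystem.
    func fetchRecommendations(ctx context.Context) ([]string, error) {
        ctx, cancel := context.WithTimeout(ctx, 100*time.Millisecond)
        defer cancel()

        req, err := http.NewRequestWithContext(ctx, http.MethodGet,
            "http://recommender.internal/v1/recs", nil)
        if err != nil {
            return nil, err
        }
        resp, err := http.DefaultClient.Do(req)
        if err != nil {
            return nil, err // timeout or connection failure
        }
        defer resp.Body.Close()
        // ... decode the response body (elided) ...
        return []string{"personalized items"}, nil
    }

    func main() {
        recs, err := fetchRecommendations(context.Background())
        if err != nil {
            // Degraded mode: serve a static default instead of failing
            // outright. A total outage becomes a partial failure.
            recs = []string{"popular items"}
        }
        fmt.Println(recs)
    }

The fallback itself must be cheap and local; if it calls yet another service, the dependency has merely moved.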

[Image: the Kubernetes logo, a ship's wheel]

If the problem being solved is "maximize the effectiveness of a large engineering team", then these are perhaps good tradeoffs: one has the resources to develop tooling for debugging, latency, and reliability. If one has further chosen to deploy those services via an orchestration system like Kubernetes, one has the personnel to cover the operational burden of that too.
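
One common piece of such debugging tooling, for instance, is propagating a request identifier across service hops, so the log lines from every service touched by a single user request can be correlated. A minimal stdlib-only sketch in Go; the X-Request-ID header is a widespread convention rather than a standard, and the handler here is illustrative:

    package main

    import (
        "crypto/rand"
        "encoding/hex"
        "log"
        "net/http"
    )

    // withRequestID assigns each inbound request an identifier, or keeps
    // the one an upstream service already set, so that every hop of a
    // single user request logs the same ID.
    func withRequestID(next http.Handler) http.Handler {
        return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
            id := r.Header.Get("X-Request-ID")
            if id == "" {
                buf := make([]byte, 8)
                rand.Read(buf)
                id = hex.EncodeToString(buf)
            }
            log.Printf("request-id=%s %s %s", id, r.Method, r.URL.Path)
            // Any outbound calls made while serving this request should
            // forward the same header to downstream services.
            r.Header.Set("X-Request-ID", id)
            next.ServeHTTP(w, r)
        })
    }

    func main() {
        mux := http.NewServeMux()
        mux.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
            w.Write([]byte("ok"))
        })
        http.ListenAndServe(":8080", withRequestID(mux))
    }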

If one doesn't have the personnel to cover the cost, microservices become a more questionable choice. Fundamentally: delivering via a monolithic binary is not inherently bad. It isn't automatically a poor choice. It isn't poor engineering.

How to deliver software is influenced by the size and composition of the team delivering it.