Learn/Glossary/Fault Tolerance
Architecture

Fault Tolerance

Fault tolerance is a system's ability to keep operating — or degrade gracefully — when individual components fail.

Diagram

  Payment Service DOWN
       │
       ▼
  Circuit Breaker OPEN ──▶ Return fallback: "Payment unavailable, try later"
  (App stays online, users aren't blocked)

In Depth

Fault tolerance means a system continues to function (fully or in degraded mode) when parts of it fail — a database timeout, a crashed microservice, or a network partition.

Related Terms