SettleMint

Hot-Warm (Active-Standby)

An active-standby deployment across two clusters, with warm validators and continuous database replication. It provides geographic redundancy with a recovery time objective (RTO) of 30–180 minutes.

Purpose: Describe the hot-warm active-standby deployment pattern.


Architecture

[Diagram: hot-warm (active-standby) architecture across two clusters]

When to use hot-warm

  • Acceptable RTO of 30–180 minutes
  • Regulatory requirements for geographic redundancy
  • Consortium networks where validator keys can be pre-staged
  • Cost optimization compared to a full hot-hot deployment

Recovery metrics

| Metric | Target | Notes |
| --- | --- | --- |
| RTO | 30–180 minutes | Depends on automation level |
| RPO | 5–60 minutes | Based on replication lag |
| RTT | 1–6 hours | Including validation and testing |
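
Because the RPO target is tied to replication lag, a daily check can compare the measured lag against the 5–60 minute range. The sketch below is illustrative only: the three status labels and the function name `classify_lag` are assumptions, not part of the platform. The SQL string uses the standard PostgreSQL function `pg_last_xact_replay_timestamp()`, which is run on the standby; note that this measure can overstate lag when the primary is idle.

```python
# Standard PostgreSQL query, executed on the standby, that returns the number
# of seconds since the last replayed transaction. How it is executed
# (psql, a driver, a monitoring agent) is left open here.
REPLAY_LAG_SQL = (
    "SELECT EXTRACT(EPOCH FROM now() - pg_last_xact_replay_timestamp()) "
    "AS replay_lag_seconds"
)

RPO_MIN_MINUTES = 5    # lower bound of the RPO target in the table above
RPO_MAX_MINUTES = 60   # upper bound of the RPO target in the table above


def classify_lag(lag_seconds: float) -> str:
    """Map a measured replication lag to a status (labels are assumptions)."""
    lag_minutes = lag_seconds / 60
    if lag_minutes <= RPO_MIN_MINUTES:
        return "ok"        # within the best-case RPO
    if lag_minutes <= RPO_MAX_MINUTES:
        return "warning"   # inside the target range, but drifting upward
    return "breach"        # data loss on failover would exceed the RPO target
```

A monitoring job could feed the query result into `classify_lag` and alert on anything other than `"ok"`; the thresholds should be tuned to the RPO agreed for the specific deployment.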

Important: Failover is manual and requires a trained operator to be available. The achievable RTO therefore depends on staff availability and time zones. Regular drills are required to keep procedures current.
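
One way to make drills useful is to time each step and compare the total against the RTO target. The following is a minimal sketch under stated assumptions: the step names and durations are hypothetical examples, not a prescribed procedure or real measurements, and `summarize_drill` is an illustrative helper.

```python
from dataclasses import dataclass

RTO_TARGET_MAX_MINUTES = 180  # upper bound of the RTO target stated above


@dataclass
class DrillStep:
    name: str        # hypothetical step label
    minutes: float   # measured wall-clock duration of the step


def summarize_drill(steps: list[DrillStep]) -> dict:
    """Total the step durations and flag whether the drill met the RTO."""
    if not steps:
        raise ValueError("a drill needs at least one timed step")
    total = sum(step.minutes for step in steps)
    slowest = max(steps, key=lambda step: step.minutes)
    return {
        "total_minutes": total,
        "within_rto": total <= RTO_TARGET_MAX_MINUTES,
        "slowest_step": slowest.name,
    }


# Illustrative timings only; not measurements from a real drill.
example_drill = [
    DrillStep("page on-call operator", 20),
    DrillStep("promote PostgreSQL replica", 15),
    DrillStep("activate warm validators", 30),
    DrillStep("validate and test services", 45),
]
summary = summarize_drill(example_drill)
```

Tracking the slowest step across drills shows where automation or better documentation would reduce the RTO most.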

Setup and maintenance

| Task | Time estimate | Client role |
| --- | --- | --- |
| Two cluster provisioning | 1 day | Client platform engineer |
| Network connectivity setup | 4–8 hours | Client platform engineer |
| CloudNativePG setup (two clusters) | 1 day | Client platform engineer |
| PostgreSQL primary and replica config | 1–2 days | Client platform engineer |
| Replication verification | 4–8 hours | Client platform engineer |
| Velero installation (two clusters) | 4–8 hours | Client platform engineer |
| Warm validator configuration | 1 day | Client platform engineer |
| Key management setup | 4–8 hours | Client security engineer |
| Failover scripts and automation | 1–2 days | Client platform engineer |
| Failover drill and validation | 1 day | Client platform team |
| Total initial setup | 2–3 weeks | 1–2 client engineers |

| Activity | Frequency | Time per cycle |
| --- | --- | --- |
| Replication lag monitoring | Daily | 15 minutes |
| Standby health verification | Daily | 15 minutes |
| Backup verification | Weekly | 1 hour |
| Helm chart updates (2 clusters) | Monthly | 2–4 hours |
| Failover drill (full) | Quarterly | 1 day |
| Security patching (2 clusters) | Monthly | 4–8 hours |
| Monthly effort | | 25–40 hours |
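
The monthly effort figure can be sanity-checked by summing the recurring activities. The sketch below is a rough worked calculation, not the source's method: it assumes 22 working days per month, an 8-hour working day, and 52/12 weeks per month, none of which are stated on this page.

```python
HOURS_PER_DAY = 8             # assumption: one "day" of effort is 8 hours
WORKING_DAYS_PER_MONTH = 22   # assumption: daily tasks run on working days
WEEKS_PER_MONTH = 52 / 12     # assumption: average weeks per month

# (activity, occurrences per month, min hours per cycle, max hours per cycle)
ACTIVITIES = [
    ("Replication lag monitoring", WORKING_DAYS_PER_MONTH, 0.25, 0.25),
    ("Standby health verification", WORKING_DAYS_PER_MONTH, 0.25, 0.25),
    ("Backup verification", WEEKS_PER_MONTH, 1, 1),
    ("Helm chart updates (2 clusters)", 1, 2, 4),
    ("Failover drill (full)", 1 / 3, HOURS_PER_DAY, HOURS_PER_DAY),
    ("Security patching (2 clusters)", 1, 4, 8),
]


def monthly_effort(activities):
    """Return (low, high) estimated hours per month for scheduled work."""
    low = sum(count * lo for _, count, lo, _ in activities)
    high = sum(count * hi for _, count, _, hi in activities)
    return low, high


low_hours, high_hours = monthly_effort(ACTIVITIES)
print(f"Scheduled effort: {low_hours:.1f} to {high_hours:.1f} hours per month")
```

Under these assumptions the scheduled activities come to roughly 24 to 30 hours per month, which sits at the lower end of the stated range. The remainder plausibly covers unplanned work such as incident follow-up, though the page does not say so explicitly.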

Team requirements

  • Minimum: 0.75–1 FTE dedicated platform engineer
  • Recommended: 1–1.5 FTE with on-call rotation
  • Critical: Documented failover procedure executable by on-call staff
