
Hot-Hot (Active-Active)

Multi-cluster active-active deployment for consortium and public blockchain networks. Provides the lowest RTO (1–10 minutes) at the highest operational cost.

There are two variants of active-active multi-cluster deployment, depending on whether you run consortium or public blockchain networks.

Consortium networks

Multi-cluster active-active deployment for consortium blockchain networks where you manage validators.

Architecture

(Architecture diagram not rendered in this export: four-cluster active-active topology with GSLB-managed failover.)

Recovery metrics (consortium)

| Metric | Target | Notes |
|--------|--------|-------|
| RTO | 1–10 minutes | Automatic failover via GSLB |
| RPO | Seconds–minutes | Async replication lag |
| RTT | 10–60 minutes | Including traffic rerouting |
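The RPO above is bounded by asynchronous replication lag, so it is worth alerting on lag directly. A minimal sketch of such a guard, assuming a CloudNativePG primary reachable via `kubectl` (the threshold, namespace, and pod name are illustrative, not part of the platform):

```shell
#!/usr/bin/env sh
# Hypothetical RPO guard: flags when async replication lag exceeds a target.
# In a live cluster the lag value would come from the primary, e.g. (illustrative):
#   LAG=$(kubectl exec -n database platform-db-1 -- psql -Atc \
#     "SELECT COALESCE(EXTRACT(EPOCH FROM replay_lag)::int, 0) FROM pg_stat_replication LIMIT 1")
RPO_TARGET_SECONDS=300   # assumed 5-minute ceiling; tune to your SLA

check_rpo() {
  lag_seconds="$1"
  if [ "$lag_seconds" -gt "$RPO_TARGET_SECONDS" ]; then
    echo "ALERT: lag ${lag_seconds}s exceeds RPO target ${RPO_TARGET_SECONDS}s"
    return 1
  fi
  echo "OK: lag ${lag_seconds}s within RPO target"
}
```

Wiring this into a daily cron or Prometheus alert covers the cross-cluster replication monitoring task listed below.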

Setup and maintenance (consortium)

| Task | Time estimate | Client role |
|------|---------------|-------------|
| Four-cluster provisioning | 1–2 days | Client platform engineer |
| Network connectivity (peering or VPN) | 1–2 days | Client network engineer |
| CloudNativePG setup (four clusters) | 1–2 days | Client platform engineer |
| PostgreSQL distributed topology | 2–3 days | Client DBA or platform engineer |
| Failover automation and testing | 2–3 days | Client platform engineer |
| End-to-end DR drill | 1–2 days | Client platform team |
| Total initial setup | 3–5 weeks | 2–3 client engineers |
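The "PostgreSQL distributed topology" step typically configures each standby cluster as a CloudNativePG replica cluster streaming from the active primary. A hedged sketch of such a manifest; the names, namespace, hostname, and Secret are placeholders, and the source cluster must be reachable over the peering/VPN link established earlier:

```yaml
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: platform-db            # placeholder name
  namespace: settlemint        # placeholder namespace
spec:
  instances: 3
  bootstrap:
    pg_basebackup:
      source: cluster-a        # initial copy from the active cluster
  replica:
    enabled: true              # this cluster replays WAL from cluster-a
    source: cluster-a
  externalClusters:
    - name: cluster-a
      connectionParameters:
        host: cluster-a-rw.internal.example   # assumed reachable via peering/VPN
        user: streaming_replica
        dbname: postgres
        sslmode: require
      password:
        name: cluster-a-replica-credentials   # placeholder Secret
        key: password
```

Promoting a standby during failover amounts to flipping `replica.enabled` on the designated cluster; the exact promotion flow should follow the CloudNativePG documentation for your version.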
| Activity | Frequency | Time per cycle |
|----------|-----------|----------------|
| Cross-cluster replication monitoring | Daily | 30 minutes |
| Backup verification (all clusters) | Weekly | 2 hours |
| Helm chart updates (4 clusters) | Monthly | 4–8 hours |
| DR drill / failover test | Quarterly | 1–2 days |
| Security patching (4 clusters) | Monthly | 1–2 days |
| Monthly effort | — | 40–60 hours |
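During the quarterly DR drill, the core behavior under test is the GSLB's choice of serving cluster. That decision can be mimicked in automation as "first healthy cluster in priority order" — a sketch under that assumption (a real GSLB uses its own HTTP health probes; the `name:status` input format here is purely illustrative):

```shell
#!/usr/bin/env sh
# Hypothetical failover selection, mimicking the GSLB decision during a DR drill:
# pick the first healthy cluster from a fixed priority list.
# Arguments are "name:status" pairs, where status is "up" or "down".
select_active() {
  for entry in "$@"; do
    name=${entry%%:*}
    status=${entry##*:}
    if [ "$status" = "up" ]; then
      echo "$name"
      return 0
    fi
  done
  echo "none"
  return 1
}
```

In a drill, marking the primary cluster down should shift selection to the next cluster, which is the behavior behind the 1–10 minute RTO in the table above.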

Team requirements: 1.5–2 FTE dedicated platform engineers + 0.5 FTE DBA support, 24/7 on-call rotation.


Public networks

Multi-cluster active-active deployment for public blockchain networks where on-chain data can be re-derived.

Key differences from consortium

  • No validators to manage — the chain is external
  • Data is re-derivable — indexed data can be rebuilt by re-indexing
  • Lower operational overhead than consortium
  • Focus on minimizing user-facing downtime

Architecture

(Architecture diagram not rendered in this export: two-cluster active-active topology serving an external public chain.)

Recovery metrics (public networks)

| Scenario | RTO | RPO | Notes |
|----------|-----|-----|-------|
| Single pod failure | <1 minute | 0 | Kubernetes reschedules automatically |
| Database failover | 1–5 minutes | Seconds | CloudNativePG automatic failover |
| Cluster failover (GSLB) | 1–10 minutes | 1–5 minutes | Traffic shifts to healthy cluster |
| Full re-index required | 5–60 minutes | N/A | Depends on chain size |
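The full re-index RTO scales with chain size and indexer throughput, so it is worth estimating for your own chain before committing to the 5–60 minute band. A back-of-envelope estimator; the throughput figure is an assumption you would replace with a measurement of your indexer:

```shell
#!/usr/bin/env sh
# Hypothetical re-index time estimator: blocks to replay divided by
# measured indexing throughput (blocks/second), rounded up to whole minutes.
estimate_reindex_minutes() {
  blocks="$1"            # number of blocks the indexer must replay
  blocks_per_second="$2" # measured throughput of your indexer (assumption)
  echo $(( (blocks / blocks_per_second + 59) / 60 ))
}
```

For example, 900,000 blocks at a measured 500 blocks/s comes out to roughly 30 minutes, inside the band above; a slower indexer or a larger chain can push past it, which argues for keeping warm indexed copies in both clusters.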

Setup and maintenance (public networks)

| Task | Time estimate | Client role |
|------|---------------|-------------|
| Two-cluster provisioning | 1 day | Client platform engineer |
| CloudNativePG setup (two clusters) | 1 day | Client platform engineer |
| DIDX setup | 1–2 days | Client platform engineer |
| Global traffic management | 4–8 hours | Client platform engineer |
| Total initial setup | 1.5–2 weeks | 1–2 client engineers |
| Activity | Frequency | Time per cycle |
|----------|-----------|----------------|
| Replication lag monitoring | Daily | 15 minutes |
| Indexer sync verification | Daily | 15 minutes |
| DR drill / failover test | Quarterly | 4–8 hours |
| Security patching (2 clusters) | Monthly | 4–8 hours |
| Monthly effort | — | 20–30 hours |
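The daily indexer sync verification reduces to comparing the indexer's head block against the chain head reported by an RPC node. A sketch with a tolerance window; the block heights would come from your indexer's status endpoint and an `eth_blockNumber`-style call, and the tolerance value is an assumption to tune per chain block time:

```shell
#!/usr/bin/env sh
# Hypothetical sync check: warn when the indexer trails the chain head by
# more than an allowed number of blocks.
MAX_BLOCKS_BEHIND=50   # assumed tolerance; tune per chain block time

check_indexer_sync() {
  chain_head="$1"    # e.g. latest block from a public RPC endpoint
  indexer_head="$2"  # e.g. head block from your indexer's status endpoint
  behind=$(( chain_head - indexer_head ))
  if [ "$behind" -gt "$MAX_BLOCKS_BEHIND" ]; then
    echo "WARN: indexer is ${behind} blocks behind"
    return 1
  fi
  echo "OK: indexer within ${MAX_BLOCKS_BEHIND} blocks of head"
}
```

A persistent WARN here is the early signal that a failover target would need the full re-index described above rather than a fast catch-up.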

Team requirements: 0.5–1 FTE dedicated platform engineer; effort is lower than the consortium variant because of the simpler two-cluster topology.
