Transactions that recover themselves
Every transaction carries explicit state across preparation, approval, broadcast, and confirmation, so a stuck step recovers instead of going dark, and a retry can never duplicate an in-flight send.
Every transaction now has a name for where it is, so a stall recovers from that exact point instead of vanishing silently or sending twice.
A transaction as a tracked, recoverable process
From receipt through confirmation, every step in a transaction's life is a named, persisted state. When something stalls, the engine resumes from exactly that state instead of retrying from the start or dropping the work entirely.
- Nonce: held for the lifetime of the send
- Reverts: typed
- Long writes: return the result
- Resume: from the last step
A blockchain send has always looked simple from the outside: sign it, broadcast it, wait for the receipt. What that hides is a sequence of steps, each of which can fail independently. The gas estimate can time out. The custody provider can hold the transaction pending approval. The broadcast can succeed while the receipt never arrives. If any step goes wrong and you have no record of which step it was, you face the same choice every time: retry and risk sending twice, or do nothing and leave a transfer stuck in an unknown state.
On a regulated platform that asymmetry has real consequences. A duplicate token issuance is a compliance event. An approval that silently expires means an investor transfer that never settled. A failed-but-unconfirmed send leaves a gap in the audit trail. These are not rare edge cases. Under real concurrency, with custody approval windows in the tens of minutes and variable chain conditions, they become near-certain.
DALP 3.0 addresses this by giving every transaction an explicit state machine. The Transaction Lifecycle Engine tracks each send through eleven named states, persists every transition, and holds the resources needed to resume safely when something interrupts.
Eleven states, no silent drops
The lifecycle runs from RECEIVED through QUEUED, PREPARING, SIGNING, BROADCASTING, and CONFIRMING to COMPLETED. Custody workflows that require external sign-off pass through PENDING_APPROVAL between broadcast and confirmation. Every path ends in a terminal state: COMPLETED for successful sends, FAILED when retries are exhausted, CANCELLED when an operator or the system aborts, and DEAD_LETTER when the retry budget is gone and an operator must intervene.
Three paths exist, depending on how signing is handled.
- Standard: QUEUED → PREPARING → SIGNING → BROADCASTING → CONFIRMING → COMPLETED. The platform requests a signature from the signer provider before broadcast.
- Native broadcast: QUEUED → PREPARING → BROADCASTING → CONFIRMING → COMPLETED. The signing provider handles signing internally; the SIGNING step is skipped.
- With approval: QUEUED → PREPARING → BROADCASTING → PENDING_APPROVAL → CONFIRMING → COMPLETED. A custody policy gates the transaction after broadcast; it waits in PENDING_APPROVAL until approved or expired.
At every transition, the engine persists the new state before proceeding. Nothing can move from BROADCASTING to CONFIRMING without the state record reflecting that. If the process crashes between those two steps, it restarts at BROADCASTING, not from the beginning. The state is the checkpoint. Recovery starts where work stopped, with no gap between the two.
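A minimal sketch of the persist-before-proceed rule, assuming the transitions implied by the three paths listed above. The `InMemoryStore` class, the `advance` function, and the rule that any non-terminal state may exit to FAILED, CANCELLED, or DEAD_LETTER are illustrative assumptions, not the engine's actual API.

```python
from enum import Enum


class TxState(Enum):
    RECEIVED = "RECEIVED"
    QUEUED = "QUEUED"
    PREPARING = "PREPARING"
    SIGNING = "SIGNING"
    BROADCASTING = "BROADCASTING"
    PENDING_APPROVAL = "PENDING_APPROVAL"
    CONFIRMING = "CONFIRMING"
    COMPLETED = "COMPLETED"
    FAILED = "FAILED"
    CANCELLED = "CANCELLED"
    DEAD_LETTER = "DEAD_LETTER"


S = TxState

# Forward transitions taken from the three documented paths.
FORWARD = {
    S.RECEIVED: {S.QUEUED},
    S.QUEUED: {S.PREPARING},
    S.PREPARING: {S.SIGNING, S.BROADCASTING},
    S.SIGNING: {S.BROADCASTING},
    S.BROADCASTING: {S.PENDING_APPROVAL, S.CONFIRMING},
    S.PENDING_APPROVAL: {S.CONFIRMING},
    S.CONFIRMING: {S.COMPLETED},
    S.DEAD_LETTER: {S.QUEUED},  # operator rescue
}
TERMINAL = {S.COMPLETED, S.FAILED, S.CANCELLED, S.DEAD_LETTER}
# Assumption: any non-terminal state may also end in one of these.
EXITS = {S.FAILED, S.CANCELLED, S.DEAD_LETTER}


class InMemoryStore:
    """Hypothetical stand-in for a durable state store."""

    def __init__(self):
        self.rows = {}

    def save(self, tx_id, state):
        self.rows[tx_id] = state

    def load(self, tx_id):
        return self.rows[tx_id]


def advance(store, tx_id, new_state):
    """Validate the transition, then persist it before any further work runs."""
    current = store.load(tx_id)
    allowed = set(FORWARD.get(current, set()))
    if current not in TERMINAL:
        allowed |= EXITS
    if new_state not in allowed:
        raise ValueError(f"illegal transition {current.name} -> {new_state.name}")
    store.save(tx_id, new_state)  # the saved state is the checkpoint
    return new_state
```

After a crash, a restarted worker would call `store.load(tx_id)` and continue from whatever state was last saved, which is the behaviour the paragraph above describes.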
DEAD_LETTER is the safety valve. When a transaction has exhausted its automatic retry budget and cannot progress without human judgment, the engine parks it in DEAD_LETTER and surfaces it through the Platform API and CLI. An operator investigates, resolves the underlying issue, and rescues the transaction back to QUEUED to try again. The escalation path is controlled, not silent. Autonomous recovery and operator-assisted recovery are two distinct states, never conflated.
The nonce problem, solved once
Every Ethereum account sends transactions in strict order. Each send carries a nonce, a counter that increments by one. Two sends that claim the same nonce: the network accepts the first and drops the second. A nonce that is too low: rejected outright. A nonce from a replaced transaction: the result depends on timing in ways that are difficult to predict.
This is where most fire-and-forget implementations break under real conditions. Before DALP 3.0, if a broadcast timed out and the caller retried, the retry could request a new nonce, either colliding with the in-flight send or creating a gap in the sequence that blocked every subsequent transaction from that account. Under light traffic on a test network, you might see nothing wrong. Under real concurrency, with approval windows and variable confirmation times, nonce collisions were near-certain.
The Transaction Lifecycle Engine holds the nonce for the lifetime of the send. The nonce is allocated during PREPARING and held through BROADCASTING and CONFIRMING. Release happens only when the send reaches a terminal state: confirmed on-chain, explicitly failed, or cancelled. A retry against a live send reuses the same nonce rather than requesting a new one. Two in-flight sends from the same signing account are queued so each nonce is allocated in order.
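The hold-and-reuse behaviour can be sketched as a small in-process ledger. The `NonceLedger` class and its method names are hypothetical, and the sketch does not model what happens to an unused nonce after a failure; it only shows that a retry of the same send gets the same nonce back, while a different send gets the next one in order.

```python
import threading


class NonceLedger:
    """Illustrative per-account nonce holder (not the engine's real interface)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._next = {}   # account -> next unallocated nonce
        self._held = {}   # (account, send_id) -> nonce held for that send

    def acquire(self, account, send_id):
        """Return the nonce held for this send, allocating one on first use."""
        with self._lock:
            key = (account, send_id)
            if key in self._held:
                return self._held[key]  # a retry reuses the held nonce
            nonce = self._next.get(account, 0)
            self._next[account] = nonce + 1
            self._held[key] = nonce
            return nonce

    def release(self, account, send_id):
        """Drop the hold once the send has reached a terminal state."""
        with self._lock:
            return self._held.pop((account, send_id), None)
```

The lock serialises allocation so that two concurrent sends from the same account cannot be handed the same nonce.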
The sub-status layer records exactly what went wrong when a nonce error does occur. NONCE_CONFLICT and NONCE_TOO_LOW are distinct sub-statuses on a FAILED state, not generic errors. An operator can tell from the record whether the failure was a sequencing error the system could have prevented or an external condition the system caught and reported correctly.
Reverts return typed reasons, not hex
When a transfer reverts on-chain, the failure has a reason. That reason is encoded in the revert data as a four-byte selector followed by ABI-encoded parameters. Before DALP 3.0, reading that reason meant taking the selector, looking it up against the contract ABI, extracting the parameters, and translating the result into something an operator or integration could act on. That decoding step fell to the caller.
The Transaction Lifecycle Engine does this automatically. Every on-chain fault type the platform tracks is declared in the contract ABI: frozen addresses, expired identity claims, allowlist misses, supply cap violations, policy blocks. When a send reverts, the engine matches the returned selector against that registry, extracts the named parameters, and returns structured metadata in the response. The integration receives a fault name and the specific value that caused the rejection. The operator sees the exact rule that fired, not an opaque byte string.
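To illustrate the selector lookup, here is a simplified decoder. The fault names, parameter names, and selectors below are placeholders chosen for the example (the selectors are not real keccak-derived values), and only static `address` and `uint256` parameters are handled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fault:
    name: str
    params: tuple  # (parameter name, ABI type) pairs, static types only


# Placeholder registry: selectors and fault names are hypothetical.
REGISTRY = {
    bytes.fromhex("aaaaaaaa"): Fault("AddressFrozen", (("account", "address"),)),
    bytes.fromhex("bbbbbbbb"): Fault(
        "SupplyCapExceeded", (("cap", "uint256"), ("requested", "uint256"))
    ),
}


def decode_word(abi_type, word):
    """Decode one 32-byte ABI word for the two static types supported here."""
    if abi_type == "address":
        return "0x" + word[12:].hex()  # address is the low 20 bytes
    if abi_type == "uint256":
        return int.from_bytes(word, "big")
    raise ValueError(f"unsupported type {abi_type}")


def decode_revert(data):
    """Turn raw revert data into a fault name plus named arguments."""
    selector, body = data[:4], data[4:]
    fault = REGISTRY.get(selector)
    if fault is None:
        return {"fault": "UNKNOWN", "selector": "0x" + selector.hex()}
    args = {}
    for i, (name, abi_type) in enumerate(fault.params):
        word = body[32 * i : 32 * (i + 1)]
        if len(word) != 32:
            raise ValueError("revert data is truncated")
        args[name] = decode_word(abi_type, word)
    return {"fault": fault.name, "args": args}
```

An unknown selector is returned as-is rather than raising, so a caller in this sketch still receives something it can log when a revert falls outside the registry.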
For a regulated institution this matters in two directions. An operator can act on a specific reason immediately without opening a support ticket to decode the failure. Every typed rejection also becomes a structured audit record: when the transfer was attempted, exactly why it failed, which rule triggered it. Audit trails built from typed rejections carry inherent traceability. The reason lives in the data, readable directly without interpreting surrounding log lines.
Long writes return a settled result, not a timeout
Some writes take longer than an HTTP connection will stay open. Deploying a token contract, settling a multi-step transfer, or running a compliance onboarding workflow can each wait for on-chain confirmation across multiple blocks. Before DALP 3.0, that wait often exceeded the 100-second proxy ceiling and returned a gateway timeout to the caller, who then had no way to know whether the operation had succeeded.
In DALP 3.0, v2 write endpoints return immediately with a correlation handle. The connection does not stay open waiting for the chain. The caller listens on the status endpoint, which emits the settled outcome the moment confirmation lands: final state, on-chain address for deployments, transaction hash, and the block the send was included in. If the listener disconnects and reconnects, it gets the current state for that handle immediately. Nothing needs replaying or reconciling on the caller's side.
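The handle-then-status pattern can be modelled in a few lines. This is an in-memory sketch only: `WriteTracker`, its method names, and the field names are assumptions for illustration and do not describe the actual v2 endpoints or their payloads.

```python
import uuid


class WriteTracker:
    """Illustrative model of 'return a handle now, read the outcome later'."""

    def __init__(self):
        self._status = {}

    def submit(self, operation):
        """Accept the write and return a correlation handle immediately."""
        handle = str(uuid.uuid4())
        self._status[handle] = {"state": "QUEUED", "operation": operation}
        return handle

    def settle(self, handle, tx_hash, block_number):
        """Record the settled outcome once confirmation lands."""
        self._status[handle].update(
            state="COMPLETED", tx_hash=tx_hash, block_number=block_number
        )

    def status(self, handle):
        """Return a copy of the current record; safe to call after a reconnect."""
        return dict(self._status[handle])
```

Because the record is keyed by the handle and kept until read, a caller that disconnects and asks again sees the same settled result without replaying anything.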
The practical consequence is that confirmation logic lives in one place. The engine holds the result until the caller reads it. An integration that previously needed a polling loop, a timeout handler, and a reconciliation pass to decide whether a timed-out token deployment had actually succeeded now reads a settled result or a typed failure. The guesswork is gone.
Workflows resume where they stopped
Multi-step operations that include on-chain sends are idempotent across restarts. A pod eviction, a rolling upgrade, a deliberate pause, or a crash mid-flight all produce the same outcome: the operation resumes from the last completed step. Steps that finished before the interruption are not re-executed. Steps that did not complete are retried from scratch, with the same inputs, until they succeed or exhaust their retry budget.
The mechanism is a persistent journal. Each step is a journal entry. When a run resumes after an interruption, the engine replays the journal to reconstruct state up to the last committed step, then continues forward. An onboarding that deploys a token, registers an investor identity, and executes an initial transfer can be interrupted at any point. The worst case is one incomplete step retried; no completed step is ever duplicated.
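Journal replay can be sketched as follows, assuming each step has a unique name and the journal is a mapping that survives restarts (a plain dict stands in for durable storage here). The `run_workflow` function is a hypothetical illustration, not the engine's interface.

```python
def run_workflow(steps, journal):
    """Run named steps in order, skipping any step already in the journal.

    steps   -- list of (name, fn) pairs; fn receives the results so far
    journal -- mapping of step name to its committed result
    """
    results = {}
    for name, fn in steps:
        if name in journal:
            # Replay: reuse the committed result instead of re-executing.
            results[name] = journal[name]
            continue
        value = fn(results)    # may raise; nothing is committed in that case
        journal[name] = value  # commit the step before moving on
        results[name] = value
    return results
```

If a step raises partway through, calling `run_workflow` again with the same journal re-executes only that step and the ones after it. Steps that were committed earlier are read back from the journal.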
This matters most during operational events. A rolling upgrade during a multi-step onboarding does not leave an investor in a partially-registered state that requires manual remediation. The platform enforces a disruption budget during upgrades so in-progress operations drain before the running instance is replaced. In-flight work is not dropped to make room for a new version.
Stalls surface before anyone has to go looking
The engine watches every active invocation against two thresholds. The first is how long a run has been in a pending state. The second is how long since any state mutation was last observed. Both must cross their threshold before the system flags a stall: a long-running operation that is still making progress does not trigger, even if it has been running for hours.
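The two-threshold rule reduces to a single conjunction. In this sketch the `Invocation` fields and the default threshold values are hypothetical; the source does not state the actual limits.

```python
from dataclasses import dataclass


@dataclass
class Invocation:
    pending_since: float     # when the run entered a pending state (seconds)
    last_mutation_at: float  # when a state mutation was last observed (seconds)


def is_stalled(inv, now, max_pending_s=900.0, max_quiet_s=300.0):
    """Flag a stall only when BOTH thresholds are exceeded."""
    pending_for = now - inv.pending_since
    quiet_for = now - inv.last_mutation_at
    return pending_for > max_pending_s and quiet_for > max_quiet_s
```

A run that has been pending for two hours but recorded a state change ten seconds ago returns `False`, because only the first threshold is crossed.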
When both thresholds fire, the alert surfaces in the console rather than sitting silently in a log file, so the system reports the stall before anyone has to discover it. The team knows a run has stalled, can see why, and can act without trawling through infrastructure logs.
Runs that need a genuine decision land at the same surface. A send that reverted because a compliance rule changed mid-flight, or an approval that was explicitly rejected by a second signer, lands in the operator queue alongside stalled runs. An operator resolves the underlying issue and resumes through the Platform API or CLI. The distinction between automatic recovery and operator-assisted recovery is explicit: the engine handles what it can handle autonomously, surfaces what it cannot, and never conflates the two by silently marking a parked run complete.
When a run needs a hand
Automatic restart covers the common case. When a run has exhausted its retry budget and parked itself, it needs an operator decision, not a database query. The Platform API and CLI expose every paused invocation: list what is stuck, preview a resume as a dry run, resume one invocation by ID or bulk-resume a set. No infrastructure access required.
The same surface covers runs that ended in DEAD_LETTER. An operator who resolves the underlying issue rescues the transaction back to QUEUED through the same interface. The audit record of the rescue is part of the lifecycle history for that transaction: when it was escalated, when it was resolved, who acted on it.
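The escalation and rescue path, with its audit trail, might be modelled like this. `TrackedTx`, `record_failure`, `rescue`, the budget of three attempts, and the choice to reset the attempt counter on rescue are all assumptions made for the example.

```python
from dataclasses import dataclass, field

MAX_ATTEMPTS = 3  # hypothetical retry budget


@dataclass
class TrackedTx:
    tx_id: str
    state: str = "QUEUED"
    attempts: int = 0
    history: list = field(default_factory=list)


def record_failure(tx, reason, at):
    """Count a failed attempt; park the transaction once the budget is spent."""
    tx.attempts += 1
    tx.state = "DEAD_LETTER" if tx.attempts >= MAX_ATTEMPTS else "QUEUED"
    tx.history.append(
        {"event": "attempt_failed", "state": tx.state, "reason": reason, "at": at}
    )


def rescue(tx, operator, at):
    """Operator-assisted recovery: only valid from DEAD_LETTER."""
    if tx.state != "DEAD_LETTER":
        raise ValueError(f"{tx.tx_id} is {tx.state}, not DEAD_LETTER")
    tx.state = "QUEUED"
    tx.attempts = 0  # assumption: a rescue grants a fresh retry budget
    tx.history.append(
        {"event": "rescued", "state": tx.state, "actor": operator, "at": at}
    )
```

Every change appends to `history`, so the record shows when the transaction was escalated, when it was rescued, and who acted on it.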
Background schedulers for reconciliation, rate refresh, and on-chain confirmation monitoring recover automatically after a crash. They restart from their last committed state and continue without operator intervention. Upgrades carry a disruption budget so in-progress work drains before the new version takes over.