
Durable Execution Engine operator API

Reference for DALP operator API routes that check Durable Execution Engine health, re-register workflow services, remove stale deployments, and prepare a stuck workflow for retry.

The Durable Execution Engine operator API is an operator-only route set for workflow health checks and narrow recovery actions. Start with the doctor route, choose one write route that matches the failing component, then run doctor again to confirm the result.

These routes require an authenticated caller with the system operate permission. They are intended for platform operators and trusted automation, not tenant-facing product flows.

Recovery order

| Step | Route | Purpose |
| --- | --- | --- |
| 1 | POST /api/v2/admin/operator/restate/doctor | Inspect ingress, admin API, deployments, services, and invocation state without changing workflow state. |
| 2 | One write route | Re-register the service, remove stale deployments, or prepare one workflow key for retry. |
| 3 | POST /api/v2/admin/operator/restate/doctor | Verify that the affected component moved back to ok or that the remaining failure is understood. |

Each route path contains the segment restate, which is the current public API service segment. Treat that segment as part of the endpoint path, not as product terminology.
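The recovery order can be sketched as a small Python helper that builds, but does not send, the three requests. This is a minimal sketch under stated assumptions: the base URL `https://dalp.example.com` and the `X-Api-Key` header name are hypothetical placeholders, because this page does not document how the caller authenticates.

```python
import json
import urllib.request

# Assumptions: BASE_URL and the auth header name are placeholders,
# not values documented on this page.
BASE_URL = "https://dalp.example.com"
ROUTE_PREFIX = "/api/v2/admin/operator/restate"


def build_operator_request(route: str, body: dict, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST request for one operator route."""
    return urllib.request.Request(
        url=f"{BASE_URL}{ROUTE_PREFIX}/{route}",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "X-Api-Key": api_key,  # hypothetical header name
        },
        method="POST",
    )


def recovery_plan(write_route: str, write_body: dict) -> list[tuple[str, dict]]:
    """Return the documented order: doctor, one write route, doctor again."""
    return [("doctor", {}), (write_route, write_body), ("doctor", {})]
```

Passing each `(route, body)` pair from `recovery_plan` to `build_operator_request` yields requests that could be sent with `urllib.request.urlopen` once the real base URL and credentials are known.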

Component statuses

Doctor responses use the same status vocabulary for each component.

| Status | Meaning |
| --- | --- |
| ok | The component responded and its payload matched the expected shape. |
| degraded | The component responded, but it returned an error status or an unexpected payload. |
| unreachable | DALP could not reach the component within the route timeout or could not complete the probe. |

A degraded sub-check does not fail the whole doctor route. Read every component before choosing a recovery action.
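Because a degraded sub-check does not fail the route, a caller has to inspect each component itself. The following sketch shows one way to do that in Python, assuming the doctor response has already been parsed into a dictionary with the five top-level components shown in the success response. The `"missing"` label is an invention of this example for a component that is absent from the payload; it is not a documented status.

```python
COMPONENTS = ("ingress", "admin", "deployments", "services", "invocations")


def unhealthy_components(doctor: dict) -> dict[str, str]:
    """Map every component whose status is not "ok" to its reported status."""
    result = {}
    for name in COMPONENTS:
        # "missing" is a local placeholder, not part of the documented vocabulary.
        status = doctor.get(name, {}).get("status", "missing")
        if status != "ok":
            result[name] = status
    return result
```

An empty result means every component reported ok; anything else lists the components to read before choosing a write route.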

Doctor

doctor is the read-only entry point. It probes the Durable Execution Engine ingress URL, admin API, deployment list, service list, and invocation table.

POST /api/v2/admin/operator/restate/doctor

Request body

Send an empty JSON object.

{}

Success response

{
  "ingress": { "status": "ok", "latencyMs": 12 },
  "admin": { "status": "ok", "latencyMs": 9, "version": "1.4.0" },
  "deployments": {
    "status": "ok",
    "items": [
      {
        "id": "dp_01j8m7k2q3r4s5t6u7v8w9x0y1",
        "serviceUrl": "https://workflow-service.example.com",
        "createdAt": "2026-05-09T10:00:00.000Z"
      }
    ],
    "error": null
  },
  "services": {
    "status": "ok",
    "items": [{ "name": "IdentityRecoveryWorkflow", "revision": 3 }],
    "error": null
  },
  "invocations": {
    "status": "ok",
    "byStatus": { "invoked": 2, "suspended": 1 },
    "recentFailures": [],
    "error": null
  }
}

recentFailures returns up to 20 recent failed invocations. Each item includes id, serviceName, serviceKey, failedAt, and errorMessage.
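When several failures belong to the same workflow key, grouping them makes it easier to decide which key, if any, needs recovery. This is an illustrative Python sketch that assumes a parsed doctor response and uses only the item fields listed above.

```python
from collections import defaultdict


def failures_by_workflow(doctor: dict) -> dict[tuple[str, str], list[dict]]:
    """Group recentFailures items by their (serviceName, serviceKey) pair."""
    grouped = defaultdict(list)
    failures = doctor.get("invocations", {}).get("recentFailures", [])
    for item in failures:
        grouped[(item["serviceName"], item["serviceKey"])].append(item)
    return dict(grouped)
```

Each key in the result is a candidate (serviceName, serviceKey) pair to investigate further.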

Force redeploy

force-redeploy registers the durable workflow service URL with the Durable Execution Engine admin API. The route does not remove old deployments. Run stale deployment cleanup when doctor still shows old deployment records.

POST /api/v2/admin/operator/restate/force-redeploy

Request body

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| serviceUrl | URL string | Yes | Service URL to register. Use the URL returned by doctor or the configured durable workflow service endpoint. |
| force | boolean | No | Defaults to true. Passes a forced registration request to the admin API. |

{
  "serviceUrl": "https://workflow-service.example.com",
  "force": true
}

Success response

{
  "acknowledged": true,
  "deploymentId": "dp_01j8m7k2q3r4s5t6u7v8w9x0y1"
}
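A client can build the force-redeploy body with the documented default for force and reject obviously malformed URLs before calling the route. This is a sketch only: the http/https check is an assumption of this example, since the page does not state which URL validation the route applies.

```python
from urllib.parse import urlparse


def force_redeploy_body(service_url: str, force: bool = True) -> dict:
    """Build the force-redeploy request body; force defaults to true as documented."""
    parsed = urlparse(service_url)
    # Assumed client-side check: require an http(s) scheme and a host.
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"serviceUrl is not an absolute http(s) URL: {service_url!r}")
    return {"serviceUrl": service_url, "force": force}
```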

Cleanup stale deployments

cleanup-stale-deployments keeps the deployment matching serviceUrl. DALP drains every other registered deployment. Use this route after doctor or force redeploy confirms the active service URL.

POST /api/v2/admin/operator/restate/cleanup-stale-deployments

Request body

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| serviceUrl | URL string | Yes | Service URL of the deployment to keep. Every other registered deployment is treated as stale. |
| forceDrain | boolean | No | Defaults to false. Use true only when stale deployments point to dead services and cannot drain normally. |

{
  "serviceUrl": "https://workflow-service.example.com",
  "forceDrain": false
}

Success response

{ "acknowledged": true }

If DALP cannot list deployments or reach the admin API, the route returns an admin-unreachable error instead of acknowledging cleanup.
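Before calling cleanup, it can help to preview which deployments would be treated as stale, using the deployment list from a doctor response. The sketch below assumes an exact string comparison on serviceUrl; how the route itself matches URLs is not documented here, so treat the result as an estimate.

```python
def stale_deployment_ids(doctor: dict, keep_url: str) -> list[str]:
    """List ids of deployments whose serviceUrl differs from the one to keep."""
    items = doctor.get("deployments", {}).get("items", [])
    # Assumption: exact string equality decides whether a deployment is kept.
    return [item["id"] for item in items if item["serviceUrl"] != keep_url]
```

If the kept URL does not appear in the doctor output at all, every listed deployment shows up as stale, which is a signal to recheck the URL before sending the request.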

Recover stuck workflow

recover-stuck-workflow prepares one workflow key for retry. DALP kills and purges prior invocations for the supplied (serviceName, serviceKey) pair, then clears keyed workflow state so the next submission starts from a blank state.

POST /api/v2/admin/operator/restate/recover-stuck-workflow

Request body

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| serviceName | string | Yes | Workflow service name. It must contain only letters, numbers, underscores, and hyphens. |
| serviceKey | string | Yes | Workflow service key. It must contain only letters, numbers, underscores, and hyphens. |

{
  "serviceName": "IdentityRecoveryWorkflow",
  "serviceKey": "invitation_01j8m7k2q3r4s5t6u7v8w9x0y1"
}

Success response

{ "acknowledged": true }

DALP refuses to clear a workflow when an active invocation is still running or when the previous invocation already succeeded. That failure path returns a structured retry-blocked error with reason and invocationIds so the operator can inspect the active or completed work before trying again.
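The character rule for serviceName and serviceKey can be checked on the client before the request is sent. This is a minimal sketch that assumes "letters" and "numbers" mean ASCII letters and digits; the page does not say whether other alphabets are accepted.

```python
import re

# Assumption: only ASCII letters and digits count as "letters" and "numbers".
_ALLOWED = re.compile(r"[A-Za-z0-9_-]+")


def recover_stuck_workflow_body(service_name: str, service_key: str) -> dict:
    """Validate both fields and build the recover-stuck-workflow request body."""
    for field, value in (("serviceName", service_name), ("serviceKey", service_key)):
        if not _ALLOWED.fullmatch(value):
            raise ValueError(f"{field} contains characters outside letters, numbers, _ and -")
    return {"serviceName": service_name, "serviceKey": service_key}
```

Client-side validation only catches malformed input early. The route can still return the retry-blocked error described above.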

Error conditions

| Condition | Meaning | Operator response |
| --- | --- | --- |
| Missing system operate permission | The caller is not authorised for operator routes. | Use an operator account or API key with the required permission. |
| Admin API unreachable | DALP could not resolve or reach the Durable Execution Engine admin API. | Check admin connectivity, then rerun doctor. |
| Deployment not found | The supplied serviceUrl does not match a registered deployment when the route needs that mapping. | Run doctor and retry with the exact registered service URL. |
| Workflow retry blocked | Recovery found an active invocation, an already succeeded invocation, or a query or purge condition that prevents safe retry. | Inspect the returned reason and invocationIds before retrying or escalating. |
