Why orchestration is the backbone
Most agentic lending systems do not fail because the AI was bad. They fail because:
- Experian or another bureau times out mid-pull.
- The orchestrator loses state during a service blip.
- The application double-pulls credit on a page refresh.
- A Reg B / Reg Z timer quietly slips past during an incident.
The 7 principles — at a glance
- **1. Durable workflow engine**: workflow-as-code with deterministic replay.
- **2. Application as a saga**: long-lived state machine with a full event log.
- **3. Parallel fan-out, gated fan-in**: bureau / income / fraud in parallel, deterministic joins.
- **4. Idempotency is non-negotiable**: unique key on every external call.
- **5. Signals for events**: resume on doc upload, e-sign, HITL response.
- **6. Deployable versioning**: in-flight apps survive a redeploy.
- **7. Regulatory timers as first-class citizens**: Reg B / Reg Z / ECOA fire even through outages.
1. Durable workflow engine
**Failure mode it prevents:** worker crashes mid-application, state evaporates, applicant has to start over.
- Pick an engine that treats your code as the workflow definition and persists every step.
- Temporal is the canonical example: every `await` is checkpointed, so it replays deterministically after a crash.
- Airflow, Camunda, n8n, and CrewAI all have a place, but for credit decisioning, **deterministic replay is what keeps your audit story honest**.
// Temporal-style workflow: reads sequentially, replays deterministically.
// LoanInput, the activity functions, and waitForSignal are assumed to be defined elsewhere.
export async function loanApplicationWorkflow(input: LoanInput) {
  const bureau = await pullBureau(input.applicantId);
  const income = await verifyIncome(input.applicantId);
  const fraud = await runFraudChecks(input.applicantId);
  const decision = await decide({ bureau, income, fraud });
  if (decision.confidence < 0.85) {
    await escalateToHuman(decision); // pauses on a signal
  }
  await runComplianceChecks(decision);
  await issueDisclosures(decision);
  await waitForSignal('e-sign-completed');
  await fundLoan(decision);
}
2. Application as a saga
**Failure mode it prevents:** "what happened to application X?" requires trawling 10 services for partial truth.
- Each loan is a long-lived state machine — minutes for instant approval, weeks for doc collection.
- Every transition is an event written to a durable log.
- **Compensations matter** — if a downstream step fails, the saga knows how to roll visible state back without leaving the customer in limbo.
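The bullets above can be sketched as a small in-memory saga. This is a minimal illustration, not a production design: the state names, event types, and the `ApplicationSaga` class are hypothetical, and the array stands in for whatever durable log the engine provides.

```typescript
// Hypothetical states and events, for illustration only.
type AppState = "created" | "data_gathered" | "decided" | "disclosed" | "funded" | "withdrawn";

interface AppEvent {
  applicationId: string;
  type: string; // e.g. "disclosures_issued"
  to: AppState; // state the application is in after this event
  at: string;   // ISO timestamp
}

class ApplicationSaga {
  private readonly applicationId: string;
  private readonly log: AppEvent[] = [];                  // stand-in for a durable, append-only log
  private readonly compensations: Array<() => void> = []; // undo actions, in the order they were registered

  constructor(applicationId: string) {
    this.applicationId = applicationId;
  }

  // Record a transition and, optionally, how to undo its customer-visible effect.
  record(type: string, to: AppState, compensate?: () => void): void {
    this.log.push({ applicationId: this.applicationId, type, to, at: new Date().toISOString() });
    if (compensate) this.compensations.push(compensate);
  }

  // Current state is derived purely from the log.
  state(): AppState {
    const last = this.log[this.log.length - 1];
    return last ? last.to : "created";
  }

  // Run compensations newest-first, then log the rollback itself as an event.
  rollback(reason: string): void {
    let undo = this.compensations.pop();
    while (undo) {
      undo();
      undo = this.compensations.pop();
    }
    this.record(`rolled_back:${reason}`, "withdrawn");
  }

  history(): readonly AppEvent[] {
    return this.log;
  }
}

// Usage: two steps succeed, funding fails, and visible state is unwound in reverse order.
const saga = new ApplicationSaga("app-123");
const visible: string[] = [];
saga.record("disclosures_issued", "disclosed", () => visible.push("disclosures_voided"));
saga.record("funding_hold_placed", "disclosed", () => visible.push("funding_hold_released"));
saga.rollback("funding_failed");
```

Because the rollback is itself an event, "what happened to application X?" is answered by reading one log rather than querying each service.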
3. Parallel fan-out, gated fan-in
**Failure mode it prevents:** decisioning runs on stale or partial data because one source was slow.
- Pull credit, verify income, and run fraud checks in parallel.
- Do NOT advance until all required signals are back (or a deterministic timeout fires).
- A gated fan-in makes **"what we knew at decision time"** trivial to reconstruct.
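One way to picture the gate is a helper that starts every source at once, waits for each to finish or time out, and reports whether all required sources answered. This is a sketch under assumptions: `gatherForDecision`, `withTimeout`, and the source names are hypothetical, and inside a real workflow engine the timeout would be a durable timer rather than `setTimeout`.

```typescript
type SourceResult<T> =
  | { source: string; ok: true; value: T }
  | { source: string; ok: false; reason: string };

// Resolve with the task's value, or with a failure marker after `ms` milliseconds.
function withTimeout<T>(source: string, task: Promise<T>, ms: number): Promise<SourceResult<T>> {
  return new Promise((resolve) => {
    const timer = setTimeout(() => resolve({ source, ok: false, reason: "timeout" }), ms);
    task.then(
      (value) => { clearTimeout(timer); resolve({ source, ok: true, value }); },
      (err) => { clearTimeout(timer); resolve({ source, ok: false, reason: String(err) }); },
    );
  });
}

async function gatherForDecision(
  sources: Record<string, () => Promise<unknown>>,
  required: string[],
  timeoutMs: number,
) {
  // Fan-out: start every source before awaiting any of them.
  const snapshot = await Promise.all(
    Object.entries(sources).map(([name, start]) => withTimeout(name, start(), timeoutMs)),
  );
  // Gate: the decision may only proceed if no required source is missing.
  const missing = snapshot.filter((r) => !r.ok && required.includes(r.source)).map((r) => r.source);
  return { ready: missing.length === 0, missing, snapshot };
}
```

The returned `snapshot` is the record of what was known at decision time; persisting it alongside the decision is what makes later reconstruction straightforward.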
4. Idempotency is non-negotiable
**Failure mode it prevents:** a page refresh causes a second hard-pull, dings the customer, and creates a fair lending audit trail you do not want.
- Every outbound call (bureau pull, KYC, payment, e-sign) carries an idempotency key derived from `application + step + attempt`.
- Network hiccup, restarted worker, retried activity → still safe.
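// Idempotency key: derived, not random.
// Same (applicationId + step + attempt) → same key → bureau dedupes.
// `attempt` must stay fixed across automatic retries of the same call;
// if it changed on every retry, each retry would get a fresh key and nothing would dedupe.
function idempotencyKey(applicationId: string, step: string, attempt: number) {
  return `${applicationId}:${step}:${attempt}`;
}
// `input`, `applicationId`, and `attempt` come from the surrounding workflow.
await experian.softPull({
  applicantId: input.applicantId,
  idempotencyKey: idempotencyKey(applicationId, 'experian-soft-pull', attempt),
});
5. Signals for events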
**Failure mode it prevents:** burning money on polling and getting "which step is this on?" wrong.
- Workflows pause on real-world events — doc uploaded, e-sign captured, HITL response, manual override.
- Resume happens on a **signal**, not a poll.
- Signal-driven resumption keeps the workflow honest about which step it is on and **who took the last action**.
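To illustrate the difference from polling, here is a minimal in-process sketch of signal-driven resumption. The `SignalInbox` class and the signal names are hypothetical; a durable engine would persist the wait and the signal. Temporal's TypeScript SDK, for example, offers `defineSignal`, `setHandler`, and `condition` for this purpose.

```typescript
interface Signal {
  name: string;
  actor: string;      // who took the action, kept for the audit trail
  receivedAt: string; // ISO timestamp
}

class SignalInbox {
  private readonly waiters = new Map<string, (s: Signal) => void>();
  private readonly buffered = new Map<string, Signal>();

  // Called by the workflow: suspends until the named signal arrives. No polling loop.
  wait(name: string): Promise<Signal> {
    const early = this.buffered.get(name);
    if (early) {
      this.buffered.delete(name);
      return Promise.resolve(early);
    }
    return new Promise((resolve) => this.waiters.set(name, resolve));
  }

  // Called by the outside world (webhook, reviewer UI) when the event actually happens.
  signal(name: string, actor: string): void {
    const s: Signal = { name, actor, receivedAt: new Date().toISOString() };
    const waiter = this.waiters.get(name);
    if (waiter) {
      this.waiters.delete(name);
      waiter(s);
    } else {
      this.buffered.set(name, s); // signal arrived before the workflow reached this step
    }
  }
}

// Usage: the workflow step resumes only when the e-sign event is delivered.
async function awaitESign(inbox: SignalInbox): Promise<string> {
  const s = await inbox.wait("e-sign-completed");
  return `resumed by ${s.actor}`;
}
```

Carrying the `actor` on the signal is a design choice: the same message that resumes the workflow also records who caused it to resume.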
6. Deployable versioning
**Failure mode it prevents:** a redeploy silently changes the meaning of a step an application has already passed through.
- You WILL redeploy mid-application. Plan for it.
- Version your workflow code: old applications replay against the version that started them; new applications pick up the new version.
- Engines like Temporal have first-class versioning APIs — use them; do not hand-roll.
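The pinning idea can be shown without any engine. The sketch below is not a substitute for an engine's versioning API (Temporal's TypeScript SDK exposes `patched` for this). It only demonstrates the rule that an application keeps running the version it started on. The thresholds, version numbers, and function names are hypothetical.

```typescript
type Decision = "approve" | "review";
type StepImpl = (score: number) => Decision;

// Hypothetical thresholds, purely illustrative: version 2 tightened the cutoff.
const implementations: Record<number, StepImpl> = {
  1: (score) => (score >= 680 ? "approve" : "review"),
  2: (score) => (score >= 700 ? "approve" : "review"),
};
const LATEST_VERSION = 2;

interface StartedApplication {
  applicationId: string;
  workflowVersion: number; // recorded once at start, never rewritten by a redeploy
}

// New applications pick up the newest version.
function startApplication(applicationId: string): StartedApplication {
  return { applicationId, workflowVersion: LATEST_VERSION };
}

// Every step resolves its code through the pinned version.
function runDecisionStep(app: StartedApplication, score: number): Decision {
  const impl = implementations[app.workflowVersion];
  if (!impl) throw new Error(`No code registered for workflow version ${app.workflowVersion}`);
  return impl(score);
}

// Usage: an application started before the redeploy keeps the old meaning of the step.
const inFlight: StartedApplication = { applicationId: "app-old", workflowVersion: 1 };
const fresh = startApplication("app-new");
const oldOutcome = runDecisionStep(inFlight, 690);
const newOutcome = runDecisionStep(fresh, 690);
```

With the same input, the two applications reach different outcomes, and each outcome can be explained by the version recorded on the application.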
7. Regulatory timers as first-class citizens
**Failure mode it prevents:** missing a Reg B adverse action window because your cron host was down for 20 minutes.
- Reg B adverse action, Reg Z disclosure timing, ECOA notification windows — these are clocks that **must fire even mid-incident**.
- Model them as **durable timers inside the workflow engine**, not as cron jobs in a side service.
- If a timer comes due while workers are down or restarting, the engine triggers the right activity as soon as they come back up.
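The contrast with cron can be sketched as a timer that is stored as a deadline and swept on recovery, so an outage delays it but cannot lose it. Everything here is an assumption for illustration: the `TimerStore` class, the 25-day internal deadline, and the 30-day window it is meant to stay inside. The actual regulatory deadlines should come from counsel, not from this example.

```typescript
interface DurableTimer {
  applicationId: string;
  kind: string;
  dueAt: number; // epoch milliseconds
  fired: boolean;
}

class TimerStore {
  private readonly timers: DurableTimer[] = []; // stand-in for a persisted table

  schedule(applicationId: string, kind: string, dueAt: number): void {
    this.timers.push({ applicationId, kind, dueAt, fired: false });
  }

  // Fire everything that is due, including timers that came due during an outage.
  // Each timer fires at most once; returns how many fired in this sweep.
  fireDue(now: number, onFire: (t: DurableTimer) => void): number {
    let count = 0;
    for (const t of this.timers) {
      if (!t.fired && t.dueAt <= now) {
        onFire(t);
        t.fired = true;
        count++;
      }
    }
    return count;
  }
}

// Usage: the timer comes due while workers are down for 20 minutes.
const DAY_MS = 24 * 60 * 60 * 1000;
const receivedAt = Date.UTC(2024, 0, 1);
const store = new TimerStore();
// Hypothetical internal deadline of 25 days, leaving a buffer inside an assumed 30-day window.
store.schedule("app-123", "adverse-action-notice", receivedAt + 25 * DAY_MS);

const fired: string[] = [];
const recoveredAt = receivedAt + 25 * DAY_MS + 20 * 60 * 1000; // first sweep after the outage
const firstSweep = store.fireDue(recoveredAt, (t) => fired.push(`${t.applicationId}:${t.kind}`));
const secondSweep = store.fireDue(recoveredAt + 1000, (t) => fired.push(`${t.applicationId}:${t.kind}`));
```

A cron job that missed its tick during those 20 minutes would have nothing to catch up on; a stored deadline is simply still due when the sweep resumes.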
Putting it together
flowchart TD
  Start["Application created"] --> Saga["Application Saga (state + log)"]
  Saga --> FanOut["Parallel: bureau / income / fraud / KYC"]
  FanOut --> Gate["Gated fan-in (all required signals or timeout)"]
  Gate --> Decision["Decisioning agent"]
  Decision -->|approve| Compliance["Compliance + disclosures"]
  Decision -->|review| HITL["HITL signal awaited"]
  HITL --> Compliance
  Compliance --> Sign["E-sign signal awaited"]
  Sign --> Fund["Funding + close"]
  Saga --- Timers["Reg B / Reg Z / ECOA durable timers"]
Continue the series
- ← Part 1 — Orchestration beats the model
- ← Part 2 — A production architecture for agentic lending
- Part 4 (coming next) — implementing this in **Temporal vs. n8n vs. CrewAI** for the Decisioning, Compliance, and HITL agents.
- For more notes from the series, visit apicode.io.
Part 3 of 3 in the LinkedIn series on Building Lending Platform Orchestration.