Durable workflows, resilient agents.

In-process execution engine for Rust. Key-based memoisation. No external dependencies. Your binary is the engine.

$ cargo add memable
See how it works ↓
pipeline.rs
async fn sync_pipeline(ctx: Context)
    -> Result<(), EngineError>
{
    // Each step has a key. Already done? Cached.
    let data: Vec<Record> = ctx
        .step("extract:v1", async || {
            fetch_from_api().await
        }).await?;

    let cleaned = ctx
        .step("transform:v1", async || {
            Ok(normalise(data))
        }).await?;

    ctx.step("load:v1", async || {
        write_to_warehouse(cleaned).await
    }).await?;

    Ok(())
}
[Illustration: workflow nodes connected by colourful pathways, with key icons and checkmarks marking cached steps]

A library, not a platform

memable runs inside your process. There's no separate server, no sidecar, no coordinator to deploy. Add it to Cargo.toml, write your workflow as an async function, and your steps survive crashes, restarts, and redeployments.

If you need distributed orchestration across a fleet of services, tools like Temporal and Restate are built for that. memable is for when your workflows run in a single process and you want them to be durable without adding infrastructure.

Data pipelines that checkpoint automatically. AI agents that resume after a restart. Integration jobs that don't lose progress. That's the use case.

What's in the crate

  • Embedded storage (redb, pure Rust, no FFI)
  • Key-based memoisation for crash recovery
  • Suspend/resume without holding memory
  • Configurable retries with backoff
  • In-memory backend for tests

Keys, not positions

Most durable execution engines replay your workflow from the top, matching each step by its position in the code. Reorder something? Insert a step? That's a breaking change for every in-flight workflow.

memable matches by key. You name each step yourself. On recovery, the engine looks up what's already done by key and returns the cached result. It doesn't care what order your code runs in.

Deploy new code while workflows are in flight. Bump a key version to re-execute a buggy step. Remove a step and its cached result sits there harmlessly. No replay. No migration. No ceremonies.

After deploying v2

extract:v1 → cached
transform:v2 → re-executes (key bumped)
validate:v1 → runs fresh (new step)
load:v1 → cached
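In code, the v2 deploy is just renamed keys. A sketch reusing the pipeline.rs workflow above (validate_records is a hypothetical new step, not part of the crate):

```rust
async fn sync_pipeline(ctx: Context)
    -> Result<(), EngineError>
{
    // Unchanged key: in-flight runs get the cached result back.
    let data: Vec<Record> = ctx
        .step("extract:v1", async || {
            fetch_from_api().await
        }).await?;

    // Bumped key: "transform:v1" in the store is ignored,
    // so the fixed transform re-executes.
    let cleaned = ctx
        .step("transform:v2", async || {
            Ok(normalise(data))
        }).await?;

    // New key: no cached entry exists, so this runs fresh.
    ctx.step("validate:v1", async || {
        validate_records(&cleaned).await
    }).await?;

    // Unchanged key: cached, skipped on recovery.
    ctx.step("load:v1", async || {
        write_to_warehouse(cleaned).await
    }).await?;

    Ok(())
}
```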

What you get

Everything runs in your process. No external services, no network calls to a coordinator. Just your code and a local embedded store.

Crash recovery

Steps are journaled before execution. Crash mid-workflow? On restart, completed steps return cached results. Execution picks up where it stopped.

Explicit retries

You decide what's retryable. Return StepError::retryable() for transient failures. StepError::permanent() when it's done for. The engine doesn't guess.
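Inside a step, that split might look like this sketch. It assumes the StepError constructors above; call_api and ApiError are illustrative, not part of the crate:

```rust
ctx.step("call-api:v1", async || {
    match call_api().await {
        Ok(response) => Ok(response),
        // Transient: the engine retries with the configured backoff.
        Err(ApiError::Timeout) | Err(ApiError::RateLimited) => {
            Err(StepError::retryable("upstream unavailable"))
        }
        // Permanent: no retry, the step fails immediately.
        Err(other) => Err(StepError::permanent(other.to_string())),
    }
}).await?;
```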

Suspend & resume

ctx.suspend("key") drops the workflow to disk. Nothing held in memory. Signal it later with a payload and execution continues from cached state.
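A human-approval gate is the classic shape. This sketch assumes ctx.suspend resolves to the payload it is later signalled with; Approval and provision_account are hypothetical, and the signalling side is not shown:

```rust
// Everything before this point is already journaled, so nothing
// is lost while the workflow sits on disk waiting for a human.
let approval: Approval = ctx.suspend("await-approval:v1").await?;

if approval.granted {
    ctx.step("provision:v1", async || {
        provision_account().await
    }).await?;
}
```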

Child workflows

Fan out to child workflows, each with its own key space. Configurable concurrency limits. Results collected with ctx.join_all().
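Fan-out might look like this sketch. Only ctx.join_all() is named above; spawn_child and sync_region are assumptions for illustration:

```rust
// One child workflow per region, each in its own key space,
// so steps can't collide across siblings or with the parent.
let children = regions.iter().map(|region| {
    ctx.spawn_child(format!("sync:{region}"), sync_region(region.clone()))
});

// Wait for every child; the engine applies any configured
// concurrency limit while they run.
let results = ctx.join_all(children).await?;
```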

Durable timers

ctx.timer("key", duration) suspends until a deadline. The workflow drops from memory. A background task handles expiry. No cron. No external scheduler.
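A follow-up sequence, sketched with ctx.timer (send_reminder and send_follow_up are hypothetical; the duration is a std::time::Duration):

```rust
ctx.step("send-reminder:v1", async || {
    send_reminder().await
}).await?;

// The workflow drops from memory here; a background task
// fires the deadline three days later and resumes it.
ctx.timer("wait-3d:v1", Duration::from_secs(3 * 24 * 60 * 60))
    .await?;

ctx.step("follow-up:v1", async || {
    send_follow_up().await
}).await?;
```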

Built on tracing

Every step, retry, suspend, and resume is a tracing span with structured fields. Bring your own subscriber. No proprietary observability layer.
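Any subscriber works. For example, the stock tracing-subscriber formatter picks the spans up unchanged:

```rust
// Print memable's spans (steps, retries, suspends, resumes)
// alongside the rest of your application's tracing output.
tracing_subscriber::fmt()
    .with_max_level(tracing::Level::DEBUG)
    .init();
```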

See it work

A workflow with retry logic. The second step fails on the first attempt, then succeeds on resume. The first step stays memoised throughout.

basic.rs
async fn greeting_workflow(ctx: Context)
    -> Result<(), EngineError>
{
    // Step 1: always succeeds, gets cached
    let name: String = ctx
        .step("fetch-name:v1", async || {
            Ok("Rust".to_string())
        }).await?;

    // Step 2: transient failure on first try
    let greeting: String = ctx
        .step("format-greeting:v1", async || {
            if sometimes_fails() {
                return Err(
                    StepError::retryable("timeout")
                );
            }
            Ok(format!("Hello, {name}!"))
        }).await?;

    println!("{greeting}");
    Ok(())
}

First run

Step 1 completes and is journaled. Step 2 hits a transient failure and returns StepError::retryable().

On resume

The engine re-enters the workflow. Step 1 returns the cached result instantly, no re-execution. Step 2 runs again and succeeds.

The point

You don't write recovery logic. You don't manage checkpoints. You write normal async Rust and let the engine sort out the rest.

Get started

1. Add the crate

$ cargo add memable

2. Write a workflow

An async function that takes Context and calls ctx.step() for each durable operation.

3. Run it

let mut engine = Engine::builder()
    .in_memory()
    .build();
engine.register("my-job", my_workflow);
engine.start().await?;

engine.invoke("my-job").await?;