Early-stage software — the API is unstable and will change. Follow along on GitHub.

Your process crashes. Your workflow doesn't.

memable is an in-process durable execution engine for Rust. Steps are identified by keys you choose. Completed steps return cached results on restart. No external services to deploy.

$ cargo add memable
pipeline.rs
async fn sync_pipeline(ctx: Context)
    -> Result<(), EngineError>
{
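    // Extract -- result cached under "extract:v1" once it completes
    let data: Vec<Record> = ctx
        .step("extract:v1")
        .run(async || {
            Ok(fetch_from_api().await)
        }).await?;

    // Transform -- works on the extracted data, cached under its own key
    let cleaned = ctx
        .step("transform:v1")
        .run(async || {
            Ok(normalise(&data))
        }).await?;

    // Load -- returns the cached result on restart if it already completed
    ctx.step("load:v1")
        .run(async || {
            write_to_warehouse(&cleaned).await;
            Ok(())
        }).await?;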

    Ok(())
}

A library, not a platform

memable runs inside your process. There's no separate server, no sidecar, no coordinator to deploy. Add it to Cargo.toml, write your workflow as an async function, and your steps survive crashes, restarts, and redeployments.

If you need distributed orchestration across a fleet of services, tools like Temporal and Restate are built for that. memable is for when your workflows run in a single process and you want them to be durable without adding infrastructure.

Data pipelines that checkpoint automatically. AI agents that resume after a restart. Integration jobs that don't lose progress.

What's in the crate

  • Embedded storage (redb, pure Rust, no FFI)
  • Key-based memoisation for crash recovery
  • Suspend/resume without holding memory
  • Configurable retries with backoff
  • In-memory backend for tests

Keys, not positions

Most durable execution engines replay your workflow from the top, matching each step by its position in the code. Reorder something or insert a step and that's a breaking change for every in-flight workflow.

memable matches by key. You name each step yourself. On recovery, the engine looks up what's already done by key and returns the cached result. It doesn't care what order your code runs in.

Deploy new code while workflows are in flight. Bump a key version to re-execute a buggy step. Remove a step and its cached result just sits there harmlessly. No replay, no migration.

After deploying v2

extract:v1 → cached
transform:v2 → re-executes (key bumped)
validate:v1 → runs fresh (new step)
load:v1 → cached
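
The lookup behind that table can be modelled in a few lines. This is a minimal sketch, not memable's implementation: `ToyJournal`, `run`, and `deploy_v2_demo` are hypothetical names, and a plain in-memory `HashMap` stands in for the embedded database. It only illustrates why matching by key makes reordering, inserting, and version-bumping steps safe.

```rust
use std::collections::HashMap;

/// Toy stand-in for a durable journal: step key -> cached result.
/// Hypothetical type for illustration; not memable's storage API.
#[derive(Default)]
struct ToyJournal {
    completed: HashMap<String, String>,
}

impl ToyJournal {
    /// Returns the cached result if `key` has already completed. Otherwise runs
    /// `work`, records its result under `key`, and returns it.
    /// The bool reports whether the result came from the cache.
    fn step(&mut self, key: &str, work: impl FnOnce() -> String) -> (String, bool) {
        if let Some(cached) = self.completed.get(key) {
            return (cached.clone(), true);
        }
        let result = work();
        self.completed.insert(key.to_string(), result.clone());
        (result, false)
    }
}

/// Runs the given step keys in order and reports, per key, whether it was cached.
fn run(journal: &mut ToyJournal, keys: &[&'static str]) -> Vec<(&'static str, bool)> {
    keys.iter()
        .map(|&key| {
            let (_, cached) = journal.step(key, || format!("output of {key}"));
            (key, cached)
        })
        .collect()
}

/// Simulates the deploy: v1 runs to completion, then v2 runs against the same journal.
fn deploy_v2_demo() -> Vec<(&'static str, bool)> {
    let mut journal = ToyJournal::default();
    // Before the deploy: all three v1 steps execute and are journaled.
    run(&mut journal, &["extract:v1", "transform:v1", "load:v1"]);
    // After the deploy: the transform key is bumped and a validate step is inserted.
    run(
        &mut journal,
        &["extract:v1", "transform:v2", "validate:v1", "load:v1"],
    )
}
```

In this model the old `transform:v1` entry is never looked up again after the deploy; it simply stays in the map, which mirrors the "sits there harmlessly" behaviour described above.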

What you get

Everything runs in your process. No network calls to a coordinator, no external state store. Your code and a local embedded database.

Crash recovery

Steps are journaled before execution. Crash mid-workflow and on restart, completed steps return cached results. Execution picks up where it stopped.

Explicit retries

You decide what's retryable: StepError::retryable() for transient failures, StepError::permanent() for failures that will never succeed. The engine doesn't guess.
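
To make the retryable/permanent split concrete, here is a self-contained sketch of a retry loop with exponential backoff. It is an assumption-laden model, not memable's code: `ToyStepError` and `run_with_retries` are hypothetical, the doubling backoff schedule is one common choice, and the delays are recorded instead of slept so the example stays fast.

```rust
use std::time::Duration;

/// Toy error type mirroring the retryable/permanent split.
/// Hypothetical; this is not memable's `StepError`.
#[derive(Debug, PartialEq)]
enum ToyStepError {
    Retryable(String),
    Permanent(String),
}

/// Calls `op` at least once and at most `max_attempts` times, passing the
/// 1-based attempt number. Only `Retryable` errors trigger another attempt;
/// `Permanent` errors return immediately.
/// Returns the final result plus the backoff delays that would have been waited.
fn run_with_retries<T>(
    max_attempts: u32,
    base_delay: Duration,
    mut op: impl FnMut(u32) -> Result<T, ToyStepError>,
) -> (Result<T, ToyStepError>, Vec<Duration>) {
    let mut delays = Vec::new();
    let mut attempt = 1;
    loop {
        match op(attempt) {
            Ok(value) => return (Ok(value), delays),
            Err(ToyStepError::Retryable(_)) if attempt < max_attempts => {
                // Exponential backoff: base, 2 * base, 4 * base, ...
                let delay = base_delay * 2u32.pow(attempt - 1);
                delays.push(delay); // a real engine would sleep here
                attempt += 1;
            }
            // Permanent error, or a retryable one with no attempts left.
            Err(err) => return (Err(err), delays),
        }
    }
}
```

The caller classifies the failure and the loop only obeys that classification, which is the design point: nothing in the loop inspects the error message to decide whether a retry is worthwhile.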

Suspend and resume

ctx.suspend("key") drops the workflow to disk. Nothing held in memory. Signal it later with a payload and execution continues from cached state.

Child workflows

Fan out to child workflows, each with its own key space. Configurable concurrency limits. Results collected with ctx.join_all().

Durable timers

ctx.timer("key", duration) suspends until a deadline. The workflow drops from memory. A background task handles expiry. No cron, no external scheduler.
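
One way a timer can survive a restart is to persist an absolute deadline and not a countdown. The sketch below assumes that approach; whether memable stores timers this way is not stated here. `ToyTimer` and its fields are hypothetical names, and the record is kept in memory where a real engine would write it to disk.

```rust
use std::time::{Duration, SystemTime, UNIX_EPOCH};

/// Toy persisted timer record: a key plus an absolute wall-clock deadline.
/// Hypothetical type; not memable's on-disk format.
struct ToyTimer {
    key: String,
    deadline_unix_secs: u64,
}

/// Whole seconds since the Unix epoch (0 if the clock is before the epoch).
fn unix_secs(t: SystemTime) -> u64 {
    t.duration_since(UNIX_EPOCH)
        .map(|d| d.as_secs())
        .unwrap_or(0)
}

impl ToyTimer {
    /// Computes the deadline once, at creation time. This is the value
    /// that would be written to durable storage.
    fn start(key: &str, now: SystemTime, wait: Duration) -> Self {
        ToyTimer {
            key: key.to_string(),
            deadline_unix_secs: unix_secs(now) + wait.as_secs(),
        }
    }

    /// Time left until the deadline, or `None` if it has already passed.
    /// Because the deadline is absolute, the answer stays correct no matter
    /// how long the process was down in between.
    fn remaining(&self, now: SystemTime) -> Option<Duration> {
        let now_secs = unix_secs(now);
        if now_secs >= self.deadline_unix_secs {
            None
        } else {
            Some(Duration::from_secs(self.deadline_unix_secs - now_secs))
        }
    }
}
```

After a restart, a background task in this model would reload each record, call `remaining`, and either fire the timer immediately (`None`) or sleep for the returned duration.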

Built on tracing

Every step, retry, suspend, and resume is a tracing span with structured fields. Bring your own subscriber.

Built for agents

LLM calls cost money and take seconds. If your agent crashes after a planning step, you don't want to re-run the same prompt and pay for the same tokens again. memable caches the result by key. On recovery, it's instant and free.

Tool calls have side effects. Sending an email, charging a card, writing to a database. These can't just be replayed. A keyed step that already succeeded returns its cached result. The side effect doesn't fire twice.

Agent loops run for minutes or hours. A research agent that searches, reads, summarises, and iterates can run dozens of steps. Each one gets a dynamic key. A crash at step 30 doesn't mean re-running steps 1 through 29.

agent.rs
async fn research_agent(ctx: Context)
    -> Result<(), EngineError>
{
    // LLM plans the research -- cached on recovery
    let plan: Plan = ctx
        .step("plan:v1")
        .run(async || {
            Ok(llm("What topics should we cover?").await)
        }).await?;

    // Each topic gets its own keyed steps
    for topic in &plan.topics {
        let results = ctx
            .step(&format!("search:{topic}:v1"))
            .run(async || {
                Ok(web_search(topic).await)
            }).await?;

        ctx.step(&format!("summarise:{topic}:v1"))
            .run(async || {
                Ok(llm(&format!("Summarise: {results}")).await)
            }).await?;
    }

    // Final report -- all prior steps cached
    ctx.step("report:v1")
        .run(async || {
            Ok(llm("Write the final report").await)
        }).await?;

    Ok(())
}

Get started

1. Add the crate

$ cargo add memable

2. Write a workflow

An async function that takes Context and calls ctx.step() for each durable operation.

3. Run it

let engine = Engine::builder()
    .storage(RedbStorage::open("./data")?)
    .build();

engine.register("my-job", my_workflow);
engine.start().await?;

engine.invoke("my-job")
    .await?.wait().await;