athena-sdk

A Python SDK for building, validating, and running data workflows. Open-source, self-contained, and runnable on your laptop in under a minute.

from athena_sdk import Workflow

wf = Workflow("hello-sdk")
wf.python_transform(
    "greet",
    code="def transform(row):\n    return {'message': 'hello, athena'}\n",
)

result = wf.run()
print(result.ok, result.node_results["greet"].output)

That's a complete, working program: no API key, no remote backend, no SaaS account. The bundled engine runs in-process; its only external dependencies are Postgres and S3-compatible storage, both started with a single docker compose up -d (see Getting started).
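
Because python_transform() receives the transform as a source string, a syntax error or a misnamed function would otherwise surface only when the node runs. The following is a minimal sketch for sanity-checking such a string with plain Python before handing it to the SDK. The helper name load_transform is hypothetical, the sketch does not use athena-sdk, and it makes no claim about how the bundled engine actually loads or isolates the code.

```python
# Hypothetical helper, not part of athena-sdk: compile a transform source
# string and return the `transform` function it defines.
def load_transform(source: str):
    namespace: dict = {}
    # compile() raises SyntaxError up front, before any of the code runs.
    exec(compile(source, "<transform>", "exec"), namespace)
    fn = namespace.get("transform")
    if not callable(fn):
        raise ValueError("source must define a callable named 'transform'")
    return fn


# Same source string as the "greet" node in the example above.
GREET_CODE = "def transform(row):\n    return {'message': 'hello, athena'}\n"

transform = load_transform(GREET_CODE)
print(transform({}))  # {'message': 'hello, athena'}
```

Since exec() runs whatever the string contains, this check is only appropriate for transform code you wrote yourself.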


What you get

  • Typed builders for every node: wf.postgres(), wf.s3(), wf.api(), wf.python_transform(), wf.if_(), wf.split(), plus a long tail of source connectors (Twitter, Reddit, PubMed, ClinicalTrials, EDGAR, …). Misuse fails at build time with a typed WorkflowBuildError, not at runtime.
  • Local execution: wf.run() executes the workflow in-process via the bundled engine. Same code runs unchanged against a deployed nexus-backend via wf.deploy_and_run() if you want hosted execution later.
  • Validation, visualization, serialization: wf.validate() returns structural issues before you run. wf.visualize() renders Mermaid / ASCII (and HTML in Jupyter). wf.to_json() / wf.to_yaml() round-trip cleanly so workflows can live in Git alongside the rest of your code.
  • Expressions: expr.node("Loader").get("rows") references upstream outputs without string-templating gymnastics.
  • CLI: athena validate, athena visualize, athena run for workflows that live as JSON/YAML files.
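
The validate-before-run pattern from the list above can be sketched as a small gate. This is an illustration under stated assumptions, not the SDK's documented behavior: the helper run_if_valid is hypothetical, and it assumes wf.validate() returns an iterable of issues that is empty (or None) when the workflow is structurally sound. Check the API reference for the actual return type. FakeWorkflow is a stand-in so the sketch runs without the SDK installed.

```python
# Hypothetical helper: works with any object exposing validate() and run().
# Assumption: validate() yields issues and is empty (or None) when clean.
def run_if_valid(wf):
    issues = list(wf.validate() or [])
    if issues:
        for issue in issues:
            print(f"validation issue: {issue}")
        return None  # refuse to run a workflow with known structural issues
    return wf.run()


# Stand-in used only to make this sketch runnable without athena-sdk.
class FakeWorkflow:
    def __init__(self, issues):
        self._issues = issues

    def validate(self):
        return self._issues

    def run(self):
        return "ran"


print(run_if_valid(FakeWorkflow([])))                 # ran
print(run_if_valid(FakeWorkflow(["missing input"])))  # prints the issue, then None
```

With the real SDK, the same function would be called with a Workflow instance in place of FakeWorkflow, provided the assumption about validate() holds.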

Install

pip install athena-sdk

Optional extras:

pip install 'athena-sdk[yaml]'          # YAML round-trip
pip install 'athena-sdk[cli]'           # `athena` CLI entry point
pip install 'athena-sdk[orchestrator]'  # AthenaWorkflowTask for DAG hosts
pip install 'athena-sdk[otel]'          # OpenTelemetry spans (opt-in)
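
To confirm which version ended up in the active environment after installing, a small check with the standard library's importlib.metadata may help. The helper name installed_version is hypothetical; the sketch only looks up the athena-sdk distribution itself and does not try to guess which packages each extra pulls in.

```python
from importlib.metadata import PackageNotFoundError, version


# Hypothetical helper: return the installed version of a distribution,
# or None if it is not installed in the current environment.
def installed_version(dist_name: str):
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


print("athena-sdk:", installed_version("athena-sdk") or "not installed")
```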

Where to next

  • Getting started — install, spin up the engine, run your first workflow end-to-end.
  • Concepts — Workflow, Node, Connection, Result. Read these once and the rest of the API makes sense.
  • Guides — task-oriented walkthroughs for building, running, expressing, and operating workflows.
  • API reference — every public symbol, rendered from the docstrings in src/athena_sdk/.