athena-sdk
A Python SDK for building, validating, and running data workflows. Open-source, self-contained, and runnable on your laptop in under a minute.

from athena_sdk import Workflow

wf = Workflow("hello-sdk")
wf.python_transform(
    "greet",
    code="def transform(row):\n return {'message': 'hello, athena'}\n",
)
result = wf.run()
print(result.ok, result.node_results["greet"].output)

That's a complete, working program: no API key, no remote backend, no
SaaS account. The bundled engine runs in-process; the only external
dependencies are Postgres and S3-compatible storage, both one
`docker compose up -d` away (see Getting started).
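
The `code=` argument in the quick-start program is ordinary Python source held in a string. As a rough illustration of what that string defines, the sketch below executes it with the standard library alone and calls the resulting `transform` function on a made-up row. This is an assumption-laden demonstration, not a description of how the bundled engine loads or sandboxes transform code, which this page does not specify.

```python
# The same source string passed as `code=` in the quick-start example.
code = "def transform(row):\n return {'message': 'hello, athena'}\n"

# Execute the source in an isolated namespace so `transform` does not leak
# into module globals. (Illustrative only: the engine may load code differently.)
namespace = {}
exec(code, namespace)
transform = namespace["transform"]

# The input row is a hypothetical placeholder; this transform ignores it.
output = transform({"id": 1})
print(output)
```

Running a transform string this way is a quick check that it parses and returns the shape you expect before wiring it into a workflow.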
What you get
- Typed builders for every node: `wf.postgres()`, `wf.s3()`, `wf.api()`, `wf.python_transform()`, `wf.if_()`, `wf.split()`, plus a long tail of source connectors (Twitter, Reddit, PubMed, ClinicalTrials, EDGAR, …). Misuse fails at build time with a typed `WorkflowBuildError`, not at runtime.
- Local execution: `wf.run()` executes the workflow in-process via the bundled engine. The same code runs unchanged against a deployed nexus-backend via `wf.deploy_and_run()` if you want hosted execution later.
- Validation, visualization, serialization: `wf.validate()` returns structural issues before you run. `wf.visualize()` renders Mermaid / ASCII (and HTML in Jupyter). `wf.to_json()` / `wf.to_yaml()` round-trip cleanly so workflows can live in Git alongside the rest of your code.
- Expressions: `expr.node("Loader").get("rows")` references upstream outputs without string-templating gymnastics.
- CLI: `athena validate`, `athena visualize`, `athena run` for workflows that live as JSON/YAML files.
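
Two of the ideas above, failing at build time rather than at runtime and serializing in a way that round-trips cleanly, can be sketched without the SDK. The toy below is not athena-sdk code: `MiniWorkflow`, `BuildError`, and the JSON layout are hypothetical stand-ins invented for illustration, and the real builders and `WorkflowBuildError` may behave differently.

```python
import json


class BuildError(ValueError):
    """Stand-in for a typed build-time error (hypothetical, not the SDK's class)."""


class MiniWorkflow:
    """Toy workflow builder; the names and JSON layout here are illustrative only."""

    def __init__(self, name):
        self.name = name
        self.nodes = {}  # node name -> {"type": ..., "params": {...}}

    def add_node(self, name, node_type, **params):
        # Reject misuse while the graph is being built, before anything runs.
        if not name:
            raise BuildError("node name must be non-empty")
        if name in self.nodes:
            raise BuildError(f"duplicate node name: {name!r}")
        self.nodes[name] = {"type": node_type, "params": params}
        return self

    def to_json(self):
        # sort_keys keeps the output stable, which makes Git diffs readable.
        return json.dumps({"name": self.name, "nodes": self.nodes}, sort_keys=True)

    @classmethod
    def from_json(cls, text):
        data = json.loads(text)
        wf = cls(data["name"])
        for node_name, spec in data["nodes"].items():
            wf.add_node(node_name, spec["type"], **spec["params"])
        return wf


wf = MiniWorkflow("hello-sdk").add_node("greet", "python_transform", code="...")

# A duplicate name is caught at build time, not when the workflow executes.
try:
    wf.add_node("greet", "python_transform", code="...")
    duplicate_rejected = False
except BuildError:
    duplicate_rejected = True

# Serializing, loading, and serializing again yields identical text.
restored = MiniWorkflow.from_json(wf.to_json())
round_trip_ok = restored.to_json() == wf.to_json()
print(duplicate_rejected, round_trip_ok)
```

Because loading goes through the same `add_node` checks as hand-written code, a malformed file fails in the same place and with the same error type as a malformed program would.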
Install
pip install athena-sdk
Optional extras:
pip install 'athena-sdk[yaml]' # YAML round-trip
pip install 'athena-sdk[cli]' # `athena` CLI entry point
pip install 'athena-sdk[orchestrator]' # AthenaWorkflowTask for DAG hosts
pip install 'athena-sdk[otel]' # OpenTelemetry spans (opt-in)
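
After installing, it can be useful to confirm which version ended up in the active environment. The sketch below uses only the standard library's `importlib.metadata`; the helper name `installed_version` is made up for this example, and the distribution name `athena-sdk` is taken from the `pip install` lines above.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("athena-sdk")
print("athena-sdk:", version if version else "not installed")
```

This reports on the base distribution only; it does not tell you which optional extras were requested at install time.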
Where to next
- Getting started: install, spin up the engine, run your first workflow end-to-end.
- Concepts: Workflow, Node, Connection, Result. Read these once and the rest of the API makes sense.
- Guides: task-oriented walkthroughs for building, running, expressing, and operating workflows.
- API reference: every public symbol, rendered from the docstrings in `src/athena_sdk/`.