
Vibe-coding workflows with GitHub Copilot

This guide is for users who want to build athena-sdk workflows by describing what they want — to GitHub Copilot Chat, Copilot inline suggestions, or the Copilot-powered features in VS Code — and have the AI generate the SDK code for them.

The SDK is built for this. The public API is small (~25 typed builders), the workflow shape is declarative, and the examples/ folder ships a curated bench of full DAG patterns that Copilot can read as context and adapt to your use case.

The trick is giving Copilot the right context, not better prompts.


The 30-second loop

1. Pick the closest example.
2. Open it next to your new file in VS Code.
3. Open AGENTS.md too — Copilot will pick up the patterns + rules.
4. Describe your workflow in a top-of-file docstring.
5. Let Copilot draft. Validate (`wf.validate()`). Iterate.

That's the loop. Everything below is just sharpening each step.


Step 1 — Pick the closest example

The bench of examples doubles as a prompt library. Each one is a self-contained, runnable workflow that demonstrates one shape:

| If your workflow is … | Open this example |
| --- | --- |
| ETL: pull from Postgres → reshape → write to S3 | 01_etl_postgres_to_s3.py |
| Source → AI tagging → write back | 02_enrichment_with_ai.py |
| Streaming social feed → filter → aggregate | 03_social_monitoring.py |
| REST API GET → map → Postgres upsert | 04_api_to_postgres_upsert.py |
| Deploy to backend + poll for completion | 05_deploy_and_poll.py |
| Render a workflow as Mermaid / ASCII | 06_visualize.py |
| Tenant/asset-parameterised DAG ({{name}} tokens) | 07_workflow_variables.py |
| Fan-out + fan-in (split → parallel branches → merge) | 08_split_merge_parallel.py |
| Multi-stage pipeline with ((var)) placeholders | 09_export_variables_pipeline.py |

If your use case crosses several rows, pick the one closest to the topology (linear / branched / fan-out + fan-in), not the data domain. Topology is the hard part for Copilot to invent; domain strings (table names, URLs) are easy to swap.
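If you are unsure which topology your workflow has, a quick degree count over the edges you have in mind settles it. The sketch below is plain Python and does not touch the SDK; `classify_topology` and its three labels are illustrative names, not part of athena-sdk.

```python
from collections import defaultdict


def classify_topology(edges):
    """Classify a DAG given as (upstream, downstream) pairs.

    Returns "linear", "branched", or "fan-out + fan-in".
    """
    out_degree = defaultdict(int)
    in_degree = defaultdict(int)
    for upstream, downstream in edges:
        out_degree[upstream] += 1
        in_degree[downstream] += 1

    has_fan_out = any(count > 1 for count in out_degree.values())
    has_fan_in = any(count > 1 for count in in_degree.values())

    if has_fan_out and has_fan_in:
        return "fan-out + fan-in"
    if has_fan_out or has_fan_in:
        return "branched"
    return "linear"


# Edges for a split -> two branches -> merge shape.
parallel = [
    ("split", "primary"),
    ("split", "secondary"),
    ("primary", "merge"),
    ("secondary", "merge"),
]
print(classify_topology(parallel))  # fan-out + fan-in
```

A result of "fan-out + fan-in" points at 08_split_merge_parallel.py regardless of what data flows through the nodes.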


Step 2 — Pin the right files into Copilot's context

GitHub Copilot Chat (and inline Copilot) reads from your open editor tabs and the active workspace. Open these alongside your new file:

your_new_workflow.py        ← active file you're typing in
examples/0N_<closest>.py    ← closest pattern from the table above
AGENTS.md                   ← rules + module map (top of repo)
docs/site/guides/build.md   ← canonical builder API + idioms

That's enough. AGENTS.md tells Copilot the module structure; the example shows the working DAG; build.md fills any gaps in the API surface. Don't pin every example — Copilot's context window is finite, and irrelevant examples dilute the signal.

For Copilot Chat (Ctrl/Cmd-I), you can also attach files explicitly with #file: references:

#file:examples/02_enrichment_with_ai.py
#file:AGENTS.md

Build me a workflow that pulls rows from a Snowflake table, runs a
GLiNER NER model on the `body` column via ai_tagging, and writes
the entities back into a sibling table.
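If you reuse the same attachments often, a small helper can assemble the prompt text for pasting into Copilot Chat. This is a sketch in plain Python: `build_chat_prompt` is a hypothetical helper, and the `max_files=4` default only mirrors the four-file list above; it is not a Copilot limit.

```python
def build_chat_prompt(request, files, max_files=4):
    """Prefix a request with one #file: reference per pinned file."""
    if len(files) > max_files:
        # Too many attachments dilute the signal; make the caller choose.
        raise ValueError(f"pin at most {max_files} files, got {len(files)}")
    references = [f"#file:{path}" for path in files]
    return "\n".join(references + ["", request.strip()])


prompt = build_chat_prompt(
    "Build me a workflow that pulls rows from a Snowflake table.",
    ["examples/02_enrichment_with_ai.py", "AGENTS.md"],
)
print(prompt)
```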

Step 3 — Write the workflow as a top-of-file docstring

Before you call any builder, tell Copilot what you're building. A DAG sketch + a one-line description per node is enough:

"""
order-anomaly-flag — flag orders with a suspicious amount-to-customer
ratio and post the flagged set to a Slack webhook.

DAG:
    Postgres SELECT (orders + customer averages)
      → python_transform (compute z-score per row)
      → filter (z > 3.0)
      → api POST to Slack
"""

from athena_sdk import Workflow, Trigger, Config

Copilot reads the docstring and the imports, then auto-completes the rest in the SDK's idiom. It now knows:

  • Which nodes you want (one line per node = one builder call).
  • The order they wire up in.
  • That the trigger / connection pattern matches the example you pinned.
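Because each line of the sketch should become one builder call, the docstring also works as a checklist for the draft. A minimal sketch, assuming the sketch sits under a `DAG:` heading and ends at the first blank line; `nodes_from_sketch` is an illustrative helper, not an SDK function.

```python
import re

SKETCH = """
order-anomaly-flag: flag orders with a suspicious amount-to-customer
ratio and post the flagged set to a Slack webhook.

DAG:
    Postgres SELECT (orders + customer averages)
      → python_transform (compute z-score per row)
      → filter (z > 3.0)
      → api POST to Slack
"""


def nodes_from_sketch(docstring):
    """Return the node descriptions listed under a 'DAG:' heading."""
    nodes = []
    in_dag = False
    for raw_line in docstring.splitlines():
        line = raw_line.strip()
        if line == "DAG:":
            in_dag = True
            continue
        if not in_dag:
            continue
        if not line:
            break  # a blank line ends the sketch
        # Drop the leading arrow ("→" or "->") if the line has one.
        nodes.append(re.sub(r"^(→|->)\s*", "", line))
    return nodes


expected = nodes_from_sketch(SKETCH)
print(len(expected))  # 4
```

If the draft ends up with more or fewer builder calls than `len(expected)`, either the sketch or the draft has drifted.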

Step 4 — Use the four-builder grammar in your prompts

The SDK groups everything into four categories (see build.md → The four kinds of nodes). Naming the category in your prompt locks Copilot to the right helper:

| Say | Copilot reaches for |
| --- | --- |
| "Add a Postgres SELECT for X" | wf.postgres(operation="select", query=...) |
| "Add an action to write to S3" | wf.s3(operation="write", ...) |
| "Add a transform that maps rows" | wf.map(...) (typed, not python_transform) |
| "Add a control to branch by X" | wf.if_(...) or wf.switch(...) |
| "Fan out to two parallel branches" | wf.split(...) + the two downstream nodes |
| "Merge two branches back" | wf.merge(how="join", on="...") |

If Copilot reaches for python_transform for something one of the typed transforms covers, push back: "use wf.filter instead". The typed transforms validate at build time and render in the diagram — Copilot sometimes defaults to python_transform because that's what an LLM "would" write in plain Python.
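Spotting stray python_transform calls by eye gets tedious on longer drafts. The sketch below uses the standard-library `ast` module to list them from the draft's source text; it never imports or runs the SDK. `find_python_transforms` is a hypothetical helper, and the sample draft omits the builders' real arguments.

```python
import ast

DRAFT = '''
wf.postgres("load", operation="select", query="SELECT 1")
wf.python_transform("keep_recent")
wf.filter("drop_old")
'''


def find_python_transforms(source):
    """Return (line, node_name) for every `.python_transform(...)` call."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Attribute)
            and node.func.attr == "python_transform"
        ):
            first = node.args[0] if node.args else None
            # The node name is taken from the first positional string, if any.
            name = first.value if isinstance(first, ast.Constant) else None
            hits.append((node.lineno, name))
    return sorted(hits, key=lambda hit: hit[0])


print(find_python_transforms(DRAFT))  # [(3, 'keep_recent')]
```

Each hit is a candidate for the "use wf.filter instead" push-back; whether a typed transform really covers the job is still your call.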


Step 5 — Common workflow shapes, Copilot-style

Linear ETL — "from X read, transform, write to Y"

Prompt:

Workflow that reads orders from Postgres, drops anything older than 30 days, and writes the rest as Parquet to S3.

Copilot draft:

wf = Workflow("orders-archive")
wf.trigger(Trigger.cron("0 6 * * *", timezone="UTC"))

pg = Config.postgres(host="...", database="...", user="...", password="...")

read = wf.postgres(
    "load",
    operation="select",
    query="SELECT * FROM orders WHERE created_at > NOW() - INTERVAL '30 days'",
    connection=pg,
)
write = wf.s3(
    "archive",
    bucket="orders-archive",
    operation="write",
    file_path="snapshot.parquet",
    file_format="parquet",
)
read >> write

Branched flow — "if X, then Y, else Z"

Prompt:

If the order amount > $1000 post to Slack, else write to a low-value S3 bucket.

Copilot draft:

gate = wf.if_("is_high_value", condition="$input.data.get('amount') > 1000")
gate.true  >> wf.api("notify", url="https://hooks.slack.com/...", method="POST")
gate.false >> wf.s3("park", bucket="low-value", operation="write", file_path="parked.csv")

Fan-out + fan-in — "do X and Y in parallel, then combine"

This is the shape that's hardest to invent without a template. Sketch the two branches in your docstring, then let Copilot draft:

split = wf.split("by_relevance")
merge = wf.merge("recombine", how="join", on="record_id")

split >> filter_primary   >> bert_primary   >> insert_primary   >> merge
split >> filter_secondary >> bert_secondary >> insert_secondary >> merge

merge >> downstream
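To see why both chains must end in `merge`, it helps to model what `>>` plausibly does. The stand-in below is an assumption about the mechanics, not the SDK's implementation: `Node` is a toy class where `>>` records the left-hand node as an upstream of the right-hand node and returns the right-hand node so chains continue.

```python
class Node:
    """Toy stand-in for an SDK node; it only models `>>` wiring."""

    def __init__(self, name):
        self.name = name
        self.upstream = []

    def __rshift__(self, other):
        other.upstream.append(self)
        return other  # returning the right-hand node lets chains continue


split = Node("by_relevance")
merge = Node("recombine")
primary = Node("filter_primary")
secondary = Node("filter_secondary")

split >> primary >> merge
split >> secondary >> merge

print([node.name for node in merge.upstream])
# ['filter_primary', 'filter_secondary']
```

If Copilot drafts only one of the two chains, `merge` ends up with a single upstream and the fan-in never happens.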

UI export translation — "translate this exported flow"

Two-step prompt, very effective:

  1. Open the exported .json file in one tab and the example whose topology is closest (linear → 01, AI tagging → 02, fan-out → 08_split_merge_parallel.py).
  2. Ask Copilot Chat:

    Using the patterns from #file:08_split_merge_parallel.py, translate this workflow JSON into SDK code: #file:my_export.json

Copilot copies the example's structure (workflow shell + per-node builder + >> wiring) and fills in the configs from the JSON.
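Before prompting, it can help to know the order in which the exported nodes need to be built. The sketch below assumes a hypothetical export shape (a `nodes` list with `id` keys and an `edges` list with `from` / `to` keys); the real export schema is not described here, so adjust the key names to match your file. It relies on `graphlib`, which ships with Python 3.9+.

```python
import json
from graphlib import TopologicalSorter

# Hypothetical export; the real JSON layout may differ.
EXPORT = """
{
  "nodes": [{"id": "write"}, {"id": "load"}, {"id": "tag"}],
  "edges": [{"from": "load", "to": "tag"}, {"from": "tag", "to": "write"}]
}
"""


def build_order(export_text):
    """Return node ids so that every node comes after its upstreams."""
    export = json.loads(export_text)
    predecessors = {node["id"]: set() for node in export["nodes"]}
    for edge in export["edges"]:
        predecessors[edge["to"]].add(edge["from"])
    return list(TopologicalSorter(predecessors).static_order())


print(build_order(EXPORT))  # ['load', 'tag', 'write']
```

Pasting that order into the prompt gives Copilot the wiring sequence up front instead of leaving it to infer one from the JSON.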


Reference card — what to tell Copilot when it gets stuck

Common Copilot misses, and the corrections that unblock it:

| Symptom | Tell Copilot |
| --- | --- |
| Generated import openai / raw httpx.post() | "Use the SDK's typed builders. No raw HTTP." |
| Forgot the trigger | "Add wf.trigger(Trigger.manual()) after the workflow init." |
| Used a node name twice | "Node names must be unique. Differentiate by operation." |
| Built a python_transform for a typed-transform job | "Replace with wf.filter(...) / wf.map(...) — typed beats Python." |
| Forgot to wire merge's two upstreams | "merge is fan-in: both branches must >> into it." |
| Used {{var}} without wf.set_variable("var", ...) | "Add wf.set_variable("var", "<value>") near the top." |
| ((var)) placeholder in an api() body fails to render | "The producing python_transform needs export_variables={"var": "var"}." |
| Asked for a config kwarg that doesn't exist | Show the helper's docstring — Copilot will conform. |
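Three of these symptoms (missing trigger, duplicate node names, undeclared {{var}} tokens) can be spotted from the draft's source text alone. The sketch below is a hypothetical pre-check built on the standard-library `ast` module. It assumes the workflow variable is literally named `wf` and that each builder takes the node name as its first positional argument, as in the drafts above. wf.validate() may already report some of these; this only shortens the loop.

```python
import ast
import re
from collections import Counter

DRAFT = '''
wf.postgres("load", operation="select", query="SELECT * FROM {{table}}")
wf.s3("load", bucket="orders-archive", operation="write")
'''


def lint_draft(source):
    """Return a list of human-readable issues found in a draft's source."""
    node_names = Counter()
    declared = set()
    has_trigger = False
    for call in ast.walk(ast.parse(source)):
        if not isinstance(call, ast.Call):
            continue
        func = call.func
        # Only inspect method calls on a variable literally named `wf`.
        if not (
            isinstance(func, ast.Attribute)
            and isinstance(func.value, ast.Name)
            and func.value.id == "wf"
        ):
            continue
        if func.attr == "trigger":
            has_trigger = True
            continue
        first = call.args[0] if call.args else None
        if not (isinstance(first, ast.Constant) and isinstance(first.value, str)):
            continue
        if func.attr == "set_variable":
            declared.add(first.value)
        else:
            node_names[first.value] += 1

    issues = []
    if not has_trigger:
        issues.append("no wf.trigger(...) call found")
    for name, count in sorted(node_names.items()):
        if count > 1:
            issues.append(f"node name {name!r} is used {count} times")
    used = set(re.findall(r"\{\{\s*(\w+)\s*\}\}", source))
    for var in sorted(used - declared):
        issues.append("{{%s}} is used without wf.set_variable(%r, ...)" % (var, var))
    return issues


for issue in lint_draft(DRAFT):
    print(issue)
```

Each reported issue maps onto a row of the table, so the matching correction can be pasted straight back to Copilot.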

Validate, visualize, run — fast feedback for vibe-coding

Don't trust Copilot's output until wf.validate() agrees and wf.visualize() shows the shape you sketched:

issues = wf.validate()
assert not issues, issues
print(wf.visualize())   # mermaid / ASCII diagram

Both are pure-Python — no engine, no DB, no API key needed. Run them on every Copilot draft. A 30-second loop of describe → draft → validate → visualize → adjust converges much faster than typing the workflow by hand and beats trying to debug a broken DAG at runtime.

When the diagram looks right, switch to:

result = wf.run()           # local engine
# or, with an already-configured API client (setup not shown here):
client.workflows.deploy(wf) # hosted

What not to vibe-code

Some parts of the SDK have non-obvious idioms that Copilot can guess wrong. Be deliberate here:

  • Connection configs — passwords belong in env vars / secrets, not in the workflow file. Copilot will happily inline credentials.
  • Cron expressions — Copilot sometimes invents non-standard formats. Verify on crontab.guru before deploying.
  • Custom Python code inside python_transform — review every line. Copilot's Python is good, but the engine sandbox restricts imports; function mode loops per-row but script mode runs once. See build.md → python_transform.
  • Postgres queries with user-controlled inputs — review for SQL injection. The SDK doesn't escape automatically.
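The first two items lend themselves to small guards you write once and reuse. The sketch below is plain Python with illustrative helper names. It reads connection settings from the conventional libpq variables (`PGHOST`, `PGDATABASE`, `PGUSER`, `PGPASSWORD`) and does a cheap shape check on cron strings. Whether the engine expects exactly five cron fields is an assumption, and the check deliberately rejects names such as `MON`, so treat a failure as a prompt to verify rather than a verdict.

```python
import os


def postgres_settings_from_env(environ=None):
    """Collect connection settings from environment variables."""
    environ = os.environ if environ is None else environ
    required = {
        "host": "PGHOST",
        "database": "PGDATABASE",
        "user": "PGUSER",
        "password": "PGPASSWORD",
    }
    missing = [var for var in required.values() if not environ.get(var)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return {key: environ[var] for key, var in required.items()}


def is_five_field_cron(expression):
    """Shape check only: five fields made of digits and * / , - characters."""
    fields = expression.split()
    allowed = set("0123456789*/,-")
    return len(fields) == 5 and all(set(field) <= allowed for field in fields)


print(is_five_field_cron("0 6 * * *"))    # True
print(is_five_field_cron("0 0 6 * * *"))  # False (six fields)

# In a workflow file, the keys line up with the Config.postgres call used above:
# pg = Config.postgres(**postgres_settings_from_env())
```

Keeping the lookup in one helper means a draft that inlines a password stands out in review.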

Where to next

  • Build a workflow — the canonical reference for every builder method, with snippets.
  • Expression helpers — typed expr.node() / expr.variable() references for cross-node wiring.
  • Examples folder — every shape, runnable.
  • AGENTS.md (top of repo) — module map, patterns, what to avoid. Also worth pinning when working with Copilot.