# KiCI documentation bundle

This file is the concatenated markdown of every KiCI documentation page intended for LLM coding agents. See the index at /llms.txt for the same content as a curated link list.

# Getting started

## User guide

Source: https://docs.kici.dev/user/

Documentation for workflow authors -- people writing CI/CD pipelines in TypeScript using the KiCI SDK and compiler. If you are defining workflows, running local tests, or learning the SDK API, start here.

## Pages

### [Getting started with KiCI](getting-started.md)

Install the SDK and compiler, write your first workflow, compile it to a lock file, and test it locally with simulated events. Covers prerequisites (Node.js 24+, pnpm), the `kici init` command for scaffolding, and the relationship between workflows, the lock file, and the three-tier runtime.

### [SDK reference](sdk-reference.md)

Complete API reference for `@kici-dev/sdk`. Covers all factory functions (`workflow()`, `job()`, `step()`), trigger builders (`pr()`, `push()`), rule functions (`rule()`, `skip()`), matrix configuration (static arrays, static objects, dynamic functions, include/exclude), and the `StepContext` interface that steps receive at runtime.

### [Lock file and workflow drift](lock-file-and-drift.md)

Why the lock file must stay in sync with workflow source, how to commit both together, using pre-commit and CI to catch drift early, and the agent-side hash verification when compiling from source.

### [CLI reference](cli-reference.md)

All CLI commands provided by `@kici-dev/compiler`: `kici compile` (with watch mode and check mode), `kici run` (local and remote execution), `kici test` (event simulation with dry-run, filtering, debug output, and custom payloads), `kici login`/`logout`/`org` (authentication and org management), `kici status`/`cancel` (run management), `kici secrets` (secret listing), `kici types` (type generation), `kici fixture` (generate test payloads), `kici init` (interactive project scaffolding), `kici hook` (pre-commit hook installation), `kici endpoints` (webhook entrypoints), and `kici workflows` (workflow listing). Includes environment variables and exit codes.

### [Workflow patterns](workflow-patterns.md)

Common patterns for building real-world CI/CD workflows. Includes examples for basic CI pipelines with job dependencies, monorepo path-based triggering, conditional jobs with rules, matrix builds across Node versions, dynamic jobs generated at runtime, Docker-based step execution, and parallel test splitting.

### [Dashboard](dashboard.md)

Guide to the KiCI web dashboard. Covers navigation (sidebar, org switcher, mobile bottom tabs), run list (table columns, filters, pagination, empty states), run detail (resizable two-panel layout, job tree, step selection, metadata tabs), log viewer (ANSI color rendering, search, permalink, copy), settings page (tabbed layout), theme toggle, keyboard shortcuts, and error pages.

### [Testing guide](testing-guide.md)

How to run and write tests for KiCI workflows, including remote test execution with `kici run remote`, fixture-based testing, and overlay mode for uncommitted changes.

### [Environments](environments.md)

Configure deployment environments (staging, production, review/\*) with variables, scoped secrets, and protection rules. Covers the SDK API (`environment`, `env`, `concurrencyGroup` on jobs), the 8-layer variable merge precedence, protection rules (branch restrictions, required reviewers, wait timers, concurrency), dashboard management, type generation, and migration from the legacy contexts system.

### [Environment variables](env-vars.md)

Reference for all `KICI_*` environment variables supported by the CLI. Covers authentication overrides (OIDC issuer, client ID, project ID), browser behavior (custom browser command, fixed callback port), development mode, and usage examples for CI/CD, self-hosted, and headless environments.

### [CLI authentication](cli-auth.md)

Authenticate the KiCI CLI with browser-based OAuth (default), device authorization flow (for headless environments), or API key paste (for CI/CD pipelines). Covers org management and PATs.

### [Event system](events.md)

Event model concepts: event types, the registration model, event matching, and circuit breaker protection. Understanding these concepts is key to working with non-git triggers like schedules, custom events, and generic webhooks.

### [Lifecycle hooks](hooks.md)

SDK hook API for cancel, cleanup, success, failure, and step-level callbacks. Hooks run at specific points in the execution lifecycle to react to outcomes and perform cleanup.

### [Concurrency groups](concurrency.md)

Control parallel execution with auto-cancel and queue modes. Prevent multiple workflow runs from executing in parallel when they target the same resource.

### [Dynamic values](dynamic-values.md)

Compute `environment`, `env`, and `concurrencyGroup` at runtime based on the incoming event payload. Instead of hardcoding static strings, pass a function that receives the webhook event and returns the resolved value.

### [Secrets](secrets.md)

Access encrypted secrets in workflow steps via the explicit secrets API. Secrets are never auto-injected into `process.env` -- you must explicitly request each secret by name.

### [GitHub App provider](providers/github.md)

The flagship source. Covers creating the GitHub App on GitHub's side (permissions, webhook URL, private key), registering it with the orchestrator via `kici-admin source add github`, routing keys (`github:<appId>`), global-workflow policy, enriched Check runs on pull requests, private-key and webhook-secret rotation, and troubleshooting.

### [Universal-git provider](providers/universal-git.md)

Connect a non-GitHub-App forge (Forgejo, Gitea, Gogs, GitLab, plain GitHub) to KiCI via its webhook. Covers preset selection, PAT and SSH credential wiring, credential rotation, global workflow policy against `generic:<orgId>:<sourceId>` routing keys, and troubleshooting.

### [Global workflows](global-workflows.md)

Cross-repo workflows that let a single workflow repo define jobs which run on events from many source repos in the same org. Covers the mental model (workflow repo vs. source repo, authoring axis vs. source axis), SDK syntax for declaring globals via `repos:` patterns, the dashboard opt-in flow and per-setting semantics (master toggle, author allow-list, source deny-list, elevated-access list), the security model, and troubleshooting skipped dispatches.

---

## Getting started with KiCI

Source: https://docs.kici.dev/user/getting-started/

KiCI lets you define CI/CD workflows in TypeScript instead of YAML. You get full language power -- type safety, autocompletion, loops, conditionals, and async/await -- for your build pipelines.

## Prerequisites

- **Node.js 24+** (LTS recommended)
- **pnpm** (or npm/yarn -- examples use pnpm)
- Familiarity with TypeScript

## Quick start with kici init

The recommended way to start a new project is `kici init`. It scaffolds the directory structure, lets you pick a starter template, and installs dependencies for you:

```bash
npx kici init
```

This will:

1. Create `.kici/` directory with `workflows/`, `tests/`, `types/`, `package.json`, and `tsconfig.json`
2. Create a `.kiciignore` file with sensible defaults
3. Let you choose from starter workflow templates (hello-world, pr-checks)
4. Install dependencies using the package manager detected for your repo (npm, pnpm, or yarn)
5. Update `.gitignore` to exclude `.kici/node_modules/`
6. Optionally install a pre-commit hook to auto-compile workflows

The package manager is detected from your repo's `packageManager` field, lockfile, or the manager that invoked `kici`, defaulting to npm. Pass `--package-manager <npm|pnpm|yarn>` to override it.
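
The detection order described above can be sketched as a small pure function. This is an illustration only, not the CLI's actual code: the `DetectionInput` shape, the lockfile-to-manager table, and the user-agent parsing are assumptions made for the example.

```typescript
type PackageManager = 'npm' | 'pnpm' | 'yarn';

// Hypothetical input shape: everything the detector needs, gathered up front.
interface DetectionInput {
  packageManagerField?: string; // e.g. "pnpm@9.1.0" from package.json
  lockfiles: string[]; // lockfile names found in the repo
  userAgent?: string; // identifies the manager that invoked the CLI
  override?: PackageManager; // value passed via --package-manager
}

const KNOWN: PackageManager[] = ['npm', 'pnpm', 'yarn'];

// Assumed mapping from well-known lockfile names to their package manager.
const LOCKFILE_OWNERS: Record<string, PackageManager> = {
  'pnpm-lock.yaml': 'pnpm',
  'yarn.lock': 'yarn',
  'package-lock.json': 'npm',
};

// "pnpm@9.1.0" and "pnpm/9.1.0 node/v24.0.0" both start with the manager name.
function parseName(value: string | undefined): PackageManager | undefined {
  if (!value) return undefined;
  const name = value.split(/[@/\s]/)[0];
  return KNOWN.find((pm) => pm === name);
}

function detectPackageManager(input: DetectionInput): PackageManager {
  if (input.override) return input.override; // explicit flag wins
  const fromField = parseName(input.packageManagerField);
  if (fromField) return fromField; // 1. packageManager field
  for (const file of input.lockfiles) {
    const owner = LOCKFILE_OWNERS[file];
    if (owner) return owner; // 2. lockfile
  }
  return parseName(input.userAgent) ?? 'npm'; // 3. invoking manager, else npm
}
```

Keeping the function free of file-system access makes the precedence easy to unit test: each source of truth is passed in as plain data.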

### Options

| Flag                                  | Description                                                  |
| ------------------------------------- | ------------------------------------------------------------ |
| `--force`                             | Overwrite existing `.kici/` directory                        |
| `--skip-install`                      | Create files without installing dependencies                 |
| `--package-manager <npm\|pnpm\|yarn>` | Force a package manager for the install step (default: auto) |
| `--mjs`                               | JavaScript-only mode (no TypeScript, no deps)                |

### MJS mode

If you prefer plain JavaScript without TypeScript compilation:

```bash
npx kici init --mjs
```

This creates `.mjs` workflow files that run directly without a build step.

After running `kici init`, jump straight to [Compile the workflow](#compile-the-workflow) below to compile and preview the scaffolded workflow.

## Manual setup

If you'd rather wire things up by hand instead of using `kici init`, install the SDK (runtime definitions) and the compiler (CLI tooling) yourself, then create your first workflow.

### Install the SDK and compiler

```bash
pnpm add @kici-dev/sdk
pnpm add -D @kici-dev/compiler
```

The examples use pnpm, but npm and yarn work too. With npm:

```bash
npm install @kici-dev/sdk
npm install -D @kici-dev/compiler
```

With yarn:

```bash
yarn add @kici-dev/sdk
yarn add -D @kici-dev/compiler
```

### Create the workflow directory

KiCI looks for workflows in `.kici/workflows/`:

```bash
mkdir -p .kici/workflows
```

### Write a workflow

Create `.kici/workflows/ci.ts`:

```typescript
import { workflow, job, step, pr } from '@kici-dev/sdk';

const lint = job('lint', {
  runsOn: 'linux',
  steps: [
    step('install', async ({ $ }) => {
      await $`pnpm install --frozen-lockfile`;
    }),
    step('lint', async ({ $ }) => {
      await $`pnpm lint`;
    }),
  ],
});

const test = job('test', {
  runsOn: 'linux',
  needs: [lint],
  steps: [
    step('install', async ({ $ }) => {
      await $`pnpm install --frozen-lockfile`;
    }),
    step('run-tests', async ({ $ }) => {
      await $`pnpm test`;
    }),
  ],
});

export default workflow('ci', {
  on: pr({ target: 'main' }),
  jobs: [lint, test],
});
```

This workflow:

- Triggers on pull requests targeting `main`
- Runs a `lint` job first
- Runs a `test` job after lint succeeds (`needs: [lint]`)

**Single-step shortcut.** If a job only has one step, pass `run` directly to `job()` instead of building a `steps: [step(...)]` array:

```typescript
const deploy = job('deploy', {
  runsOn: 'default',
  run: async ({ $, log }) => {
    await $`./scripts/deploy.sh`;
    log.info('Deployed');
  },
});
```

`run` and `steps` are mutually exclusive. The shorthand is ideal for deploy/notify/smoke-test jobs. Outputs on the resulting `job.result` are flat, with no step-name nesting. See [Single-step job shorthand](sdk/core.md#single-step-job-shorthand) in the SDK reference for details.

## Compile the workflow

The compiler validates your workflow and generates a lock file:

```bash
npx kici compile
```

Expected output:

```
✓ Compiled workflows → .kici/kici.lock.json (1 workflow)
```

The lock file (`kici.lock.json`) is a JSON representation of your workflow that the KiCI agent uses for execution. Commit this file alongside your workflow source. See [Lock file and workflow drift](lock-file-and-drift.md) for why and how to keep them in sync.

## Preview trigger matching

Use `kici test` to preview which workflows match a trigger event (dry-run, no execution):

```bash
npx kici test pr:open
```

Expected output (simplified):

```
🔍 DRY RUN - No commands will be executed

Workflow: ci
  Triggers:
    - pr
  ✓ Matched trigger 1
  Jobs (2):
    lint
      runs-on: linux
    test
      runs-on: linux

Decision Summary:

  ci: ✓ matched

✓ Dry run complete
```

## Run locally

Execute matched workflows locally with `kici run local`:

```bash
npx kici run local pr:open
```

This compiles, matches triggers, and runs all matched jobs with DAG-based parallel scheduling.
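
To make "DAG-based parallel scheduling" concrete, here is a minimal sketch that groups jobs into waves based on their `needs`. It is a simplification under stated assumptions: the `JobNode` shape and `planWaves` are hypothetical names for this example, and a real scheduler may dispatch each job as soon as its own dependencies finish instead of waiting for a whole wave.

```typescript
// Hypothetical, simplified job shape: a name plus the names it depends on.
interface JobNode {
  name: string;
  needs: string[];
}

// Group jobs into waves: every job in a wave has all of its dependencies
// in earlier waves, so the jobs inside one wave can run in parallel.
function planWaves(jobs: JobNode[]): string[][] {
  const done = new Set<string>();
  let pending = [...jobs];
  const waves: string[][] = [];

  while (pending.length > 0) {
    const ready = pending.filter((job) => job.needs.every((dep) => done.has(dep)));
    if (ready.length === 0) {
      // Nothing can start: the remaining jobs form a cycle or need an unknown job.
      throw new Error('Unschedulable jobs: ' + pending.map((job) => job.name).join(', '));
    }
    waves.push(ready.map((job) => job.name));
    for (const job of ready) done.add(job.name);
    pending = pending.filter((job) => !done.has(job.name));
  }
  return waves;
}

// The lint -> test workflow from earlier yields two sequential waves.
console.log(planWaves([
  { name: 'lint', needs: [] },
  { name: 'test', needs: ['lint'] },
]));
```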

If you do not want to remember the event arg, pass `--pick` (or `-p`) to choose from a list of workflows instead:

```bash
npx kici run local --pick
```

The picker lists each workflow with a summary of its declared triggers, derives the event arg for the one you choose, and runs it through the same pipeline.

## Workflow dependencies

KiCI workflows can use any npm package. Dependencies are declared in `.kici/package.json`, which `kici init` generates automatically.

### Adding dependencies

To add a package to your workflows:

```bash
cd .kici
npm install lodash
```

This updates `.kici/package.json` and generates (or updates) `package-lock.json`.

### Dependency resolution contract

Every `.kici/` dependency must be resolvable from the **single cloned repository**. When a job runs, the agent clones only this repository and installs `.kici/` dependencies with your repo's package manager (npm or pnpm; yarn is not supported — the agent rejects it with an actionable error). A dependency that points outside the cloned repo cannot be resolved.

In practice:

- **From a registry** — the common case. Pin a published version (a private registry works — see [Private registries](./private-registries.md)). Available for any package manager.
- **From an in-repo workspace sibling** — if your `.kici/` is a member of a **pnpm workspace**, it can depend on a sibling package in the same repo via `workspace:*`. The whole repo is cloned, so the sibling is present and resolves; the agent also builds your `.kici/` dependency closure after install, so a sibling's build output exists before the workflow that imports it loads. A `file:`/`link:`/`portal:` path is allowed only when it stays inside the repository.

What fails fast (with an actionable error naming the dependency, not a raw package-manager error): a `workspace:` dependency in an **npm** project (npm has no workspace protocol — pin a published version or switch to pnpm), and any `file:`/`link:`/`portal:` path that points outside the cloned repo.
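
The contract above can be sketched as a check over dependency specs. This is an illustrative model, not the agent's implementation: `checkDependency`, `escapesRepo`, and the error wording are hypothetical, and paths are assumed to be POSIX-style and relative to `.kici/`.

```typescript
type PackageManager = 'npm' | 'pnpm';

// Resolve a relative POSIX path against a directory inside the repo and
// report whether the result climbs above the repository root.
function escapesRepo(baseDir: string, relative: string): boolean {
  if (relative.startsWith('/')) return true; // absolute paths are outside the clone
  const stack: string[] = [];
  for (const part of [...baseDir.split('/'), ...relative.split('/')]) {
    if (part === '' || part === '.') continue;
    if (part === '..') {
      if (stack.length === 0) return true; // climbed above the repo root
      stack.pop();
    } else {
      stack.push(part);
    }
  }
  return false;
}

// Returns an error message for a spec that cannot resolve, or null if it is fine.
function checkDependency(
  name: string,
  spec: string,
  pm: PackageManager,
  kiciDir = '.kici',
): string | null {
  if (spec.startsWith('workspace:')) {
    return pm === 'npm'
      ? `${name}: "workspace:" requires pnpm; pin a published version or switch to pnpm`
      : null;
  }
  const local = /^(file|link|portal):(.*)$/.exec(spec);
  if (local) {
    const target = local[2] ?? '';
    return escapesRepo(kiciDir, target)
      ? `${name}: "${spec}" points outside the cloned repository`
      : null;
  }
  return null; // registry versions, ranges, and tags
}
```

For example, `file:../packages/shared` resolves to a directory inside the repo and passes, while `file:../../other-repo/pkg` climbs above the root and is rejected.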

Once a package such as `lodash` is installed, import it in your workflow like any other module:

```typescript
import { workflow, job, step, push } from '@kici-dev/sdk';
import _ from 'lodash';

export default workflow('deploy', {
  on: push({ branches: 'main' }),
  jobs: [
    job('process', {
      runsOn: 'default',
      steps: [
        step('transform', async ({ log }) => {
          const data = _.merge({ a: 1 }, { b: 2 });
          log.info(`Merged: ${JSON.stringify(data)}`);
        }),
      ],
    }),
  ],
});
```

### How dependencies are cached

When the KiCI agent runs your workflow, dependencies are handled automatically:

1. **First run (cache miss):** A build agent installs dependencies from `.kici/package.json`, packs the resolved dependency tree into a tarball, and uploads it to cache storage. For a pnpm workspace this closure includes the shared store and any in-repo workspace siblings `.kici` resolves.
2. **Subsequent runs (cache hit):** The execution agent downloads the cached tarball and extracts it -- no install needed.
3. **Lockfile changes:** When your lockfile changes (`.kici/package-lock.json` for npm, or the repo-root `pnpm-lock.yaml` for a pnpm workspace), the cache is invalidated and a fresh build runs.

This means the first run after a dependency change is slower (build + execution), but all subsequent runs are fast.
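
The hit/miss behaviour can be modelled with a cache keyed on the lockfile content. The sketch below is an assumption-laden illustration: the key layout and the `DependencyCache` class are hypothetical, and a tiny FNV-1a hash stands in for whatever digest the real cache uses.

```typescript
// FNV-1a (32-bit): a dependency-free stand-in for a real content digest.
// A production cache would more likely use a cryptographic hash.
function fnv1a(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16).padStart(8, '0');
}

// Hypothetical key layout: package manager plus lockfile digest.
function cacheKey(pm: 'npm' | 'pnpm', lockfileContent: string): string {
  return `deps-${pm}-${fnv1a(lockfileContent)}`;
}

// In-memory model of the dependency cache, just to show hit/miss behaviour.
class DependencyCache {
  private store = new Map<string, string>();
  installs = 0;

  restore(pm: 'npm' | 'pnpm', lockfileContent: string): 'hit' | 'miss' {
    const key = cacheKey(pm, lockfileContent);
    if (this.store.has(key)) return 'hit'; // download and extract, no install
    this.installs += 1; // cache miss: install dependencies and pack them
    this.store.set(key, 'tarball'); // upload the packed tree under this key
    return 'miss';
  }
}
```

Because the key changes whenever the lockfile content changes, invalidation needs no separate bookkeeping: an edited lockfile simply maps to a key that has never been stored.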

### The .kici/package.json file

Every KiCI project needs a `.kici/package.json`. This file:

- Declares workflow dependencies (including `@kici-dev/sdk`)
- Signals the agent to run the dependency cache step
- Is generated automatically by `kici init`

If you are setting up a project manually (without `kici init`), create a minimal `.kici/package.json`:

```json
{
  "name": "@kici-dev/workflows",
  "private": true,
  "type": "module",
  "devDependencies": {
    "@kici-dev/sdk": "^0.0.1"
  }
}
```

Then run `npm install` in `.kici/` to generate the lockfile. Commit both `package.json` and `package-lock.json` to your repository.

## Development mode

When developing the KiCI SDK itself (or testing against a local fork), enable development mode.

### sdkPath in .kici/package.json

Point to a local SDK checkout for IDE autocompletion:

```json
{
  "name": "my-project-kici",
  "devDependencies": {
    "@kici-dev/sdk": ">=0.0.1-0"
  },
  "kici": {
    "sdkPath": "../../packages/sdk"
  }
}
```

The `sdkPath` field tells the compiler where to resolve TypeScript path mappings for `@kici-dev/sdk`.

### KICI_DEV environment variable

Set `KICI_DEV=true` to use a prerelease-compatible version range (`>=0.0.1-0`) in generated files, which resolves prerelease builds from a local Verdaccio registry:

```bash
KICI_DEV=true npx kici init
```

Or add the flag to your root `package.json`:

```json
{
  "kici": {
    "development": true
  }
}
```

## Authoring KiCI workflows with LLM coding agents

KiCI ships first-class context for LLM coding agents (Claude Code, Cursor, Aider, etc.). When you scaffold a project with `kici init`, the CLI writes `.kici/AGENTS.md`, a one-page briefing that tells the agent:

- where the SDK type declarations live (`node_modules/@kici-dev/sdk/dist/index.d.ts`)
- the five canonical authoring patterns with runnable examples
- the anti-patterns that catch agents off-guard (no YAML, no `/dist/...` imports, no top-level `await`)
- the local commands the agent should drive (`kici compile --check`, `kici test`, `kici run local`, `kici docs llm`)

If you don't want the file, pass `--no-agents-md` to `kici init`, or delete the file afterwards — KiCI never reads it at runtime.

For coding agents that want the entire documentation set up front, KiCI follows the [llms.txt convention](https://llmstxt.org/):

- `https://kici.dev/llms.txt` — curated link index grouped by SDK / patterns / CLI / architecture.
- `https://kici.dev/llms-full.txt` — concatenated markdown of every page indexed above.
- `kici docs llm` — print the same `llms-full.txt` bundle to stdout, offline, straight from the installed `@kici-dev/compiler` package. Add `--index` to print the curated `llms.txt` index instead. The agent can pipe the output into its own context buffer with no network call.
- `kici docs` — open the docs site in your browser.

The offline bundle is regenerated from `docs/` every time the package is built, so it always matches the version of KiCI you've installed.

## Watch mode

During development, run the compiler in watch mode to recompile automatically when workflows change:

```bash
npx kici compile --watch
```

The compiler watches `.kici/workflows/*.ts` and recompiles on every save.

## Next steps

- **[5-minute quickstart](quickstart.md)** -- ready to run your workflow on real infrastructure? Stand up an orchestrator + agent (Docker / Podman or bare metal)
- **[SDK reference](sdk-reference.md)** -- complete API for workflows, jobs, steps, triggers, rules, and matrix
- **[CLI reference](cli-reference.md)** -- all CLI commands with options and examples
- **[Workflow patterns](workflow-patterns.md)** -- common patterns for real-world CI/CD workflows

## How KiCI works

KiCI uses a three-layer architecture (SDK, compiler, agent), with the lock file passed from the compiler to the runtime:

```
SDK (define) -> Compiler (validate) -> Lock file -> Agent (execute)
```

1. **SDK**: You write workflows in TypeScript using factory functions (`workflow()`, `job()`, `step()`). The SDK provides type-safe definitions with full IDE support.

2. **Compiler**: The `kici compile` command loads your TypeScript workflows, validates the dependency graph (no cycles, no missing references), and generates `kici.lock.json`.

3. **Lock file**: A portable JSON file containing all workflow metadata. The lock file enables the orchestrator to evaluate triggers without cloning your repository.

4. **Agent**: The agent receives dispatch instructions, clones your repository, and executes the steps defined in your workflows. Agents are self-hosted and label-routed.

The lock file approach means the orchestrator stays git-agnostic -- it only needs the lock file to decide which jobs to run. The agent handles the actual code checkout and step execution.
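
The graph validation mentioned in the compiler step (no cycles, no missing references) can be sketched with a depth-first search. This is a generic illustration, not the compiler's code: `JobNode`, `validateGraph`, and the error strings are hypothetical.

```typescript
// Hypothetical, simplified job shape: a name plus the names it depends on.
interface JobNode {
  name: string;
  needs: string[];
}

// Returns a list of problems; an empty list means the graph is valid.
function validateGraph(jobs: JobNode[]): string[] {
  const byName = new Map<string, JobNode>();
  for (const job of jobs) byName.set(job.name, job);
  const errors: string[] = [];

  // Missing references: every entry in `needs` must name a known job.
  for (const job of jobs) {
    for (const dep of job.needs) {
      if (!byName.has(dep)) errors.push(`${job.name} needs unknown job "${dep}"`);
    }
  }

  // Cycles: reaching a job that is still being visited closes a loop.
  const state = new Map<string, 'visiting' | 'done'>();
  const visit = (name: string, trail: string[]): void => {
    if (state.get(name) === 'done') return;
    if (state.get(name) === 'visiting') {
      const start = trail.indexOf(name);
      errors.push(`cycle: ${[...trail.slice(start), name].join(' -> ')}`);
      return;
    }
    state.set(name, 'visiting');
    for (const dep of byName.get(name)?.needs ?? []) {
      if (byName.has(dep)) visit(dep, [...trail, name]);
    }
    state.set(name, 'done');
  };
  for (const job of jobs) visit(job.name, []);

  return errors;
}
```

Running a check like this before writing the lock file means a broken graph is reported at compile time, before anything is committed or dispatched.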

## See also

- [Architecture overview](../architecture/overview.md) -- how the three-tier runtime executes your workflows

---

## 5-minute quickstart

Source: https://docs.kici.dev/user/quickstart/

KiCI offers two equally supported quickstart paths. Pick the one that fits your machine; both end with the same working pipeline (orchestrator + agent + your first workflow run visible in the dashboard).

## Option A — Docker / Podman (recommended)

Two containers brought up with `docker compose up -d` (orchestrator + PostgreSQL), plus one short-lived agent container spawned per job by the container scaler. Minimal host setup, perfect for a laptop, home server, or a tiny VM. No need to install PostgreSQL or any other system service.

[Start with the Docker / Podman quickstart →](./quickstart/compose.md)

## Option B — Bare-metal install

Native systemd services managed by `kici-admin orchestrator install` / `kici-admin agent install` — the orchestrator and agents run as native processes. The backing PostgreSQL runs as a single container by default (one `docker compose up -d`), or you can install it natively if you'd rather not run a container runtime at all. Best for a long-lived Linux host.

[Start with the bare-metal quickstart →](./quickstart/bare-metal.md)

## Which should I pick?

|                   | Docker / Podman                          | Bare metal                                                                            |
| ----------------- | ---------------------------------------- | ------------------------------------------------------------------------------------- |
| Host requirements | `docker` or `podman` with compose v2.20+ | systemd, Node.js 24+, PostgreSQL 18 (container — needs `docker`/`podman` — or native) |
| Time to first run | ~5 minutes                               | ~10 minutes                                                                           |
| Upgrades          | `docker compose pull` + restart          | `kici-admin orchestrator restart` after `npm install -g kici-admin@latest`            |
| Best for          | Quick evaluation, ephemeral hosts        | Long-lived production hosts                                                           |

If you're not sure, pick Docker / Podman.

## Looking for the laptop-only path?

Both quickstarts deploy a real orchestrator + agent. If you only want to write a workflow and dry-run it on your laptop with no infrastructure, [Getting started](./getting-started.md) covers `kici test` and `kici run local` instead.

---

# Workflow patterns

## Basic workflow patterns

Source: https://docs.kici.dev/user/patterns/basic/

A standard lint-then-test pipeline using job dependencies (`needs`):

```typescript
import { workflow, job, step, pr } from '@kici-dev/sdk';

const lint = job('lint', {
  runsOn: 'linux',
  steps: [
    step('install', async ({ $ }) => {
      await $`pnpm install --frozen-lockfile`;
    }),
    step('check', async ({ $ }) => {
      await $`pnpm lint`;
      await $`pnpm format:check`;
    }),
  ],
});

const test = job('test', {
  runsOn: 'linux',
  needs: [lint],
  steps: [
    step('install', async ({ $ }) => {
      await $`pnpm install --frozen-lockfile`;
    }),
    step('test', async ({ $ }) => {
      await $`pnpm test`;
    }),
  ],
});

const typecheck = job('typecheck', {
  runsOn: 'linux',
  needs: [lint],
  steps: [
    step('install', async ({ $ }) => {
      await $`pnpm install --frozen-lockfile`;
    }),
    step('typecheck', async ({ $ }) => {
      await $`pnpm typecheck`;
    }),
  ],
});

export default workflow('ci', {
  on: pr({ target: 'main' }),
  jobs: [lint, test, typecheck],
});
```

The `test` and `typecheck` jobs both depend on `lint`, so they run in parallel after lint succeeds. KiCI validates the dependency graph at compile time -- cycles and missing references are caught before you commit.

At runtime, jobs are gated on upstream completion: a job only dispatches after every entry in its `needs` array reaches a terminal state. If an upstream fails, downstream jobs skip by default (override per-edge with `ifFailed: 'run'`). See [Job dependencies (`needs`)](../sdk/core.md#job-dependencies-needs) in the SDK reference for the full matrix of `needs` forms (string, `Job` ref, `{ name, ifFailed }`, `dynamicGroup()`) and [needs-scheduler](../../architecture/execution/needs-scheduler.md) for the dispatch semantics.
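
The runtime gating rule can be modelled as a small decision function. This is a sketch under assumptions, not the scheduler's code: `decide`, `Need`, and `GatedJob` are hypothetical names, and only the `success` and `failed` outcomes described above are modelled.

```typescript
type Outcome = 'success' | 'failed';

// Hypothetical edge shape mirroring the `{ name, ifFailed }` form of `needs`.
interface Need {
  name: string;
  ifFailed?: 'run';
}

interface GatedJob {
  name: string;
  needs: Need[];
}

// Decide what a job should do given the terminal outcomes recorded so far.
function decide(job: GatedJob, outcomes: Map<string, Outcome>): 'wait' | 'run' | 'skip' {
  // Gate: every upstream must have reached a terminal state first.
  for (const need of job.needs) {
    if (!outcomes.has(need.name)) return 'wait';
  }
  // Default: a failed upstream skips the job unless that edge opts in.
  for (const need of job.needs) {
    if (outcomes.get(need.name) === 'failed' && need.ifFailed !== 'run') return 'skip';
  }
  return 'run';
}
```

Because the override lives on the edge and not on the job, a cleanup job can run after a failed `test` while still being skipped if a different upstream fails.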

**Single-step jobs don't need a `steps` array.** When a job only does one thing, pass `run` to `job()` instead of wrapping it in `steps: [step(...)]`:

```typescript
import { job, workflow, push } from '@kici-dev/sdk';

const smoke = job('smoke', {
  runsOn: 'default',
  run: async ({ $, log }) => {
    await $`curl -fsS https://example.com/health`;
    log.info('Health check passed');
  },
});

export default workflow('smoke', {
  on: push({ branches: 'main' }),
  jobs: [smoke],
});
```

`run` is mutually exclusive with `steps`; setting both throws at compile time. Outputs are flat on `job.result` (no step-name nesting). See [Single-step job shorthand](../sdk/core.md#single-step-job-shorthand) in the SDK reference.

## PR-only workflow with branch filters

Use `pr()` to filter by events, target branches, source branches, and file paths:

```typescript
import { workflow, job, step, pr } from '@kici-dev/sdk';

// Only trigger on opened/synchronize events targeting main,
// and only when source code files change
const trigger = pr({
  events: ['opened', 'synchronize'],
  target: ['main', 'develop'],
  paths: ['src/**', 'packages/**', '!**/*.md', '!docs/**'],
});

const build = job('build', {
  runsOn: 'linux',
  steps: [
    step('build', async ({ $ }) => {
      await $`pnpm build`;
    }),
  ],
});

export default workflow('pr-checks', {
  on: trigger,
  jobs: [build],
});
```

### PR trigger options

| Option        | Type                                       | Description                                                                                 |
| ------------- | ------------------------------------------ | ------------------------------------------------------------------------------------------- |
| `target`      | `string \| RegExp \| (string \| RegExp)[]` | Match target branches (glob or regex)                                                       |
| `source`      | `string \| RegExp \| (string \| RegExp)[]` | Match source branches (glob or regex)                                                       |
| `events`      | `PrEvent[]`                                | Filter PR event types                                                                       |
| `paths`       | `string[]`                                 | Only trigger when matching files change. Use `!` prefix for exclusions (e.g., `'!docs/**'`) |
| `description` | `string`                                   | Add a human-readable description                                                            |

Default PR events (when `events` is not specified): `opened`, `synchronize`, `reopened`, `closed`.
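
How `paths` with `!` exclusions might be evaluated is sketched below. The semantics are assumptions for illustration: a file counts when it matches at least one positive pattern and no negated pattern, `*` stays within one path segment, and `**` spans segments. `globToRegExp`, `fileMatches`, and `shouldTrigger` are hypothetical helpers, not SDK exports.

```typescript
// Convert a glob to a RegExp. Assumed semantics: `*` stays within one path
// segment, `**` spans segments, and `**/` may match zero directories.
function globToRegExp(glob: string): RegExp {
  let source = '';
  let i = 0;
  while (i < glob.length) {
    if (glob.startsWith('**/', i)) {
      source += '(?:.*/)?';
      i += 3;
    } else if (glob.startsWith('**', i)) {
      source += '.*';
      i += 2;
    } else if (glob.charAt(i) === '*') {
      source += '[^/]*';
      i += 1;
    } else {
      // Escape characters that are special in regular expressions.
      source += glob.charAt(i).replace(/[.+?^${}()|[\]\\]/g, '\\$&');
      i += 1;
    }
  }
  return new RegExp(`^${source}$`);
}

// A file counts if some positive pattern matches and no `!` pattern matches.
function fileMatches(file: string, patterns: string[]): boolean {
  const include = patterns.filter((p) => !p.startsWith('!')).map((p) => globToRegExp(p));
  const exclude = patterns.filter((p) => p.startsWith('!')).map((p) => globToRegExp(p.slice(1)));
  return include.some((re) => re.test(file)) && !exclude.some((re) => re.test(file));
}

// The trigger fires when at least one changed file counts.
function shouldTrigger(changedFiles: string[], patterns: string[]): boolean {
  return changedFiles.some((file) => fileMatches(file, patterns));
}
```

With the patterns from the example above (`'src/**'`, `'packages/**'`, `'!**/*.md'`, `'!docs/**'`), a change to `src/index.ts` fires the trigger, while a change that only touches `src/README.md` or `docs/guide.md` does not.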

## Push trigger with branch filters

Use `push()` for push-based workflows:

```typescript
import { workflow, job, step, push } from '@kici-dev/sdk';

// Deploy on pushes to main
const deploy = job('deploy', {
  runsOn: 'linux',
  steps: [
    step('deploy', async ({ $ }) => {
      await $`pnpm build`;
      await $`pnpm deploy`;
    }),
  ],
});

export default workflow('deploy', {
  on: push({ branches: 'main' }),
  jobs: [deploy],
});
```

### Push trigger options

| Option        | Type                                       | Description                                                                                 |
| ------------- | ------------------------------------------ | ------------------------------------------------------------------------------------------- |
| `branches`    | `string \| RegExp \| (string \| RegExp)[]` | Match branch names (glob or regex)                                                          |
| `tags`        | `string \| RegExp \| (string \| RegExp)[]` | Match tag names (glob or regex)                                                             |
| `paths`       | `string[]`                                 | Only trigger when matching files change. Use `!` prefix for exclusions (e.g., `'!docs/**'`) |
| `description` | `string`                                   | Add a human-readable description                                                            |

### Regex branch patterns

Both `pr()` and `push()` accept regex patterns alongside glob strings:

```typescript
// Glob pattern
push({ branches: 'release/*' });

// Regex pattern
push({ branches: /^release\/v\d+\.\d+$/ });
```
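
A matcher that accepts the `string | RegExp | (string | RegExp)[]` shape could look like the sketch below. It is illustrative only: `matchesBranch` and `toRegExp` are hypothetical, and the glob handling assumes `*` matches within a single path segment.

```typescript
type BranchPattern = string | RegExp;

// Regex patterns are used as-is; glob strings are converted.
// Assumed glob semantics: `*` matches any characters except `/`.
function toRegExp(pattern: BranchPattern): RegExp {
  if (pattern instanceof RegExp) return pattern;
  const escaped = pattern.replace(/[.+?^${}()|[\]\\]/g, '\\$&');
  return new RegExp(`^${escaped.replace(/\*/g, '[^/]*')}$`);
}

// A branch matches when any pattern in the list matches it.
function matchesBranch(branch: string, patterns: BranchPattern | BranchPattern[]): boolean {
  const list = Array.isArray(patterns) ? patterns : [patterns];
  return list.some((pattern) => toRegExp(pattern).test(branch));
}
```

The glob `'release/*'` accepts any single-segment release branch, while the regex form from the example above only accepts names like `release/v1.2` and rejects `release/v1.2.3`.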

## Multiple triggers

A workflow can respond to multiple trigger types:

```typescript
import { workflow, job, step, pr, push } from '@kici-dev/sdk';

const test = job('test', {
  runsOn: 'linux',
  steps: [
    step('test', async ({ $ }) => {
      await $`pnpm test`;
    }),
  ],
});

export default workflow('ci', {
  on: [pr({ target: 'main' }), push({ branches: 'main' })],
  jobs: [test],
});
```

## Manual / local-only workflow (no git events)

Sometimes you want a workflow that does **not** fire on pushes, pull requests, tags, or any other git activity — only when you explicitly ask for it. Use `dispatch()` as the trigger: it corresponds to GitHub's `repository_dispatch` event, which is never emitted by commits, PRs, tags, releases, or any other automatic git action. The workflow stays idle until someone explicitly invokes it.

There are two ways to "explicitly invoke" a `dispatch()` workflow:

1. **Locally from your laptop**, with `kici run local dispatch` — no orchestrator, no agent, no webhook, nothing deployed. This is the only path while you haven't wired the repo to a deployed KiCI orchestrator.
2. **Remotely**, if the repo is connected to a KiCI orchestrator via a GitHub App, by calling GitHub's repository-dispatch API: `curl -X POST -H "Authorization: token <PAT>" -H "Accept: application/vnd.github+json" https://api.github.com/repos/<owner>/<repo>/dispatches -d '{"event_type":"hello"}'`. GitHub fans the webhook out to the App, the orchestrator normalizes it into a KiCI `dispatch` event (see `packages/orchestrator/src/providers/github/normalizer.ts`), and the matched workflow runs.

Note that GitHub's `workflow_dispatch` event (the "Run workflow" button / `/actions/workflows/.../dispatches` API) is GitHub-Actions-internal and is **not** delivered to KiCI. The SDK has no `workflowDispatch()` trigger. Only `repository_dispatch` reaches KiCI.

```typescript
import { workflow, job, step, dispatch } from '@kici-dev/sdk';

export default workflow('hello-world', {
  on: dispatch(),
  jobs: [
    job('greet', {
      runsOn: 'linux',
      steps: [
        step('say-hello', async ({ $ }) => {
          await $`echo "Hello, World!"`;
        }),
      ],
    }),
  ],
});
```

Run it locally, without any orchestrator or agent infrastructure:

```bash
npx kici compile           # regenerate .kici/kici.lock.json
npx kici run local dispatch
```

`kici run local` compiles the workflow, matches triggers against a simulated `dispatch` event, and executes the matched jobs directly on your machine with DAG-based scheduling. No webhook, no GitHub, no deployed orchestrator involved. See [`kici run local`](../cli-reference.md#kici-run-local) for options like `--job`, `--env`, `--json`, and `--junit`.

### Scoping to a single workflow

Because `kici run local dispatch` matches **every** workflow that listens for a `dispatch` event, running it in a repo with several dispatch-triggered workflows will fire all of them. Narrow execution to one with `--workflow <name>`:

```bash
npx kici run local dispatch --workflow hello-world
```

`--workflow` is a post-match filter: the workflow still has to have a trigger that matches the event argument. If `hello-world` does not list a `dispatch()` trigger, the command reports `No workflow named "hello-world" matched the event` and exits successfully without running anything.
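The two-stage selection described above can be sketched as follows. This is a minimal illustration, not the CLI's implementation: `WorkflowSummary` and `selectWorkflows` are hypothetical names, and triggers are reduced to plain event-name strings.

```typescript
// Hypothetical shape for illustration; not the SDK's real workflow type.
interface WorkflowSummary {
  name: string;
  triggers: string[]; // event names the workflow listens for, e.g. 'dispatch'
}

// Stage 1: keep workflows whose triggers match the event.
// Stage 2: if --workflow was given, keep only the workflow with that name.
function selectWorkflows(
  all: WorkflowSummary[],
  event: string,
  workflowFlag?: string,
): WorkflowSummary[] {
  const matched = all.filter((w) => w.triggers.includes(event));
  return workflowFlag === undefined
    ? matched
    : matched.filter((w) => w.name === workflowFlag);
}

const workflows: WorkflowSummary[] = [
  { name: 'hello-world', triggers: ['dispatch'] },
  { name: 'nightly', triggers: ['dispatch', 'schedule'] },
  { name: 'ci', triggers: ['pull_request'] },
];

const both = selectWorkflows(workflows, 'dispatch'); // every dispatch listener
const one = selectWorkflows(workflows, 'dispatch', 'hello-world'); // narrowed by name
const none = selectWorkflows(workflows, 'dispatch', 'ci'); // name exists, trigger does not match
```

Because the name filter runs after trigger matching, naming a workflow that does not listen for the event yields an empty selection rather than forcing it to run.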

If you do not want to memorise event args, use the interactive picker instead:

```bash
npx kici run local --pick
```

`--pick` (aliased as `-p`) lists every workflow alongside a compact summary of its triggers, lets you select one, and derives a matching event arg from the chosen trigger, so execution still flows through the normal trigger-matching pipeline and cannot produce an inconsistent run. Multi-trigger workflows show a second prompt asking which trigger to simulate. `--pick` is mutually exclusive with `--workflow`; in a non-TTY shell it prints the workflow list and exits without running anything.

### Unfiltered vs typed `dispatch()`

Leave `dispatch()` unfiltered while you drive it from `kici run local`. The CLI currently simulates a dispatch event with no event type (i.e. `action` is undefined), so a trigger defined as `dispatch({ types: ['deploy', 'rollback'] })` will not match `kici run local dispatch` — the typed form is intended for real `repository_dispatch` deliveries from the orchestrator.
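To see why a typed trigger never matches the local simulation, here is a rough model of the matching rule. The `DispatchTrigger` and `DispatchEvent` shapes and the `matchesDispatch` function are assumptions made for this sketch; the real matcher may differ in detail.

```typescript
// Hypothetical shapes; the real SDK and CLI types may differ.
interface DispatchTrigger {
  types?: string[]; // omitted for an unfiltered dispatch()
}
interface DispatchEvent {
  action?: string; // the event type; undefined under `kici run local dispatch`
}

function matchesDispatch(trigger: DispatchTrigger, event: DispatchEvent): boolean {
  // An unfiltered trigger accepts every dispatch event.
  if (trigger.types === undefined) return true;
  // A typed trigger needs an event type that appears in its list.
  return event.action !== undefined && trigger.types.includes(event.action);
}

const localEvent: DispatchEvent = {}; // what the local CLI simulates
const remoteEvent: DispatchEvent = { action: 'deploy' }; // a real typed delivery
const typed: DispatchTrigger = { types: ['deploy', 'rollback'] };

const unfilteredLocal = matchesDispatch({}, localEvent); // true
const typedLocal = matchesDispatch(typed, localEvent); // false
const typedRemote = matchesDispatch(typed, remoteEvent); // true
```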

## Conditional execution with rules

---

## Conditionals & matrix patterns

Source: https://docs.kici.dev/user/patterns/conditionals-matrix/

Rules control whether a workflow or job runs. Use `rule()` for conditions that must pass for execution to proceed, and `skip()` for conditions that, when true, skip execution.

### Workflow-level rules

```typescript
import { workflow, job, step, pr, rule } from '@kici-dev/sdk';

const test = job('test', {
  runsOn: 'linux',
  steps: [
    step('test', async ({ $ }) => {
      await $`pnpm test`;
    }),
  ],
});

export default workflow('ci', {
  on: pr(),
  rules: [
    rule('has source changes', async (ctx) => {
      return ctx.changedFiles.some((f) => f.startsWith('src/'));
    }),
  ],
  jobs: [test],
});
```

### Job-level rules

```typescript
import { workflow, job, step, pr, rule, skip } from '@kici-dev/sdk';

const unitTests = job('unit-tests', {
  runsOn: 'linux',
  steps: [
    step('test', async ({ $ }) => {
      await $`pnpm test:unit`;
    }),
  ],
});

const e2eTests = job('e2e-tests', {
  runsOn: 'linux',
  rules: [
    // Skip E2E when only docs change
    skip('docs only', async (ctx) => {
      return ctx.changedFiles.every((f) => f.endsWith('.md'));
    }),
  ],
  steps: [
    step('test', async ({ $ }) => {
      await $`pnpm test:e2e`;
    }),
  ],
});

export default workflow('ci', {
  on: pr(),
  jobs: [unitTests, e2eTests],
});
```

### Rule context

Rule check functions receive a `RuleContext` with:

| Property       | Type                                | Description                         |
| -------------- | ----------------------------------- | ----------------------------------- |
| `event`        | `EventPayload`                      | The triggering event data           |
| `changedFiles` | `string[]`                          | Files changed in this event         |
| `env`          | `Record<string, string\|undefined>` | Environment variables               |
| `$`            | zx shell                            | Shell executor for running commands |

### Marker rules

A rule without a check function always passes. This is useful for labeling entries in the decision trace:

```typescript
rule('ci: required check');
```
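Taken together, `rule()`, `skip()`, and marker rules combine roughly as in the sketch below. It works on already-evaluated outcomes to stay synchronous; `Outcome` and `shouldRun` are hypothetical names, and treating every entry without a result as passing is an assumption of this model rather than documented engine behavior.

```typescript
// Hypothetical model of the decision; not SDK internals.
interface Outcome {
  kind: 'rule' | 'skip';
  label: string;
  result?: boolean; // undefined for a marker rule (no check function)
}

function shouldRun(outcomes: Outcome[]): boolean {
  return outcomes.every((o) => {
    if (o.result === undefined) return true; // marker rule: always passes
    return o.kind === 'rule' ? o.result : !o.result; // rule must be true, skip must be false
  });
}

const runs = shouldRun([
  { kind: 'rule', label: 'ci: required check' }, // marker
  { kind: 'rule', label: 'has source changes', result: true },
  { kind: 'skip', label: 'docs only', result: false },
]);

const skipped = shouldRun([
  { kind: 'rule', label: 'has source changes', result: true },
  { kind: 'skip', label: 'docs only', result: true }, // skip condition matched
]);
```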

## Matrix builds

Matrix configurations run a job across multiple parameter combinations.

### Simple array matrix

Run a job for each value in an array:

```typescript
import { workflow, job, step, push } from '@kici-dev/sdk';

const test = job('test', {
  runsOn: 'linux',
  matrix: ['18', '20', '22'],
  steps: [
    step('test', async ({ $, matrix }) => {
      await $`nvm use ${matrix!.value}`;
      await $`pnpm test`;
    }),
  ],
});

export default workflow('test-matrix', {
  on: push(),
  jobs: [test],
});
```

With a single-dimension matrix, the current value is available as `matrix.value` in the step context.

### Multi-dimensional matrix

Use an object to define multiple dimensions. KiCI expands all combinations (capped at 256):

```typescript
const test = job('test', {
  runsOn: ['linux', 'kici:agent:container'],
  matrix: {
    os: ['linux', 'arm64'],
    node: ['18', '20', '22'],
  },
  steps: [
    step('test', async ({ $, matrix }) => {
      // matrix.os = 'linux' | 'arm64'
      // matrix.node = '18' | '20' | '22'
      await $`echo "Testing on ${matrix!.os} with Node ${matrix!.node}"`;
      await $`pnpm test`;
    }),
  ],
});
```

This creates 6 job instances (2 OS x 3 Node versions).

> **Labels are customer-defined.** `runsOn` values such as `linux` or `arm64` are scaler
> labels **you** define in your orchestrator's `labelSets` — they are matched by subset
> semantics, not by a hosted-runner name. You can also target reserved auto-injected labels
> in the `kici:` namespace (e.g. `kici:agent:firecracker`, `kici:agent:container`) to pin a
> job to a specific backend type.

### Include and exclude

Fine-tune matrix combinations:

```typescript
const test = job('test', {
  runsOn: 'linux',
  matrix: {
    os: ['linux', 'arm64', 'windows'],
    node: ['18', '20', '22'],
  },
  // Remove specific combination
  exclude: [{ os: 'windows', node: '18' }],
  // Add specific combination not in the matrix
  include: [{ os: 'linux', node: '23' }],
  steps: [
    step('test', async ({ $ }) => {
      await $`pnpm test`;
    }),
  ],
});
```

Exclude is applied first (removes matching combinations), then include adds additional entries.
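The ordering can be illustrated with a small standalone expansion. This is a sketch under assumptions, not KiCI's expander: `expandMatrix`, `product`, and `matches` are hypothetical helpers, and an exclude entry is assumed to match when every key it lists equals the combination's value.

```typescript
type Combo = Record<string, string>;

// Cartesian product of all dimensions.
function product(dims: Record<string, string[]>): Combo[] {
  return Object.entries(dims).reduce<Combo[]>(
    (acc, [key, values]) => acc.flatMap((c) => values.map((v) => ({ ...c, [key]: v }))),
    [{}],
  );
}

// True when every key in `pattern` has the same value in `combo`.
function matches(combo: Combo, pattern: Combo): boolean {
  return Object.entries(pattern).every(([k, v]) => combo[k] === v);
}

// Exclude first, then append the include entries.
function expandMatrix(
  dims: Record<string, string[]>,
  exclude: Combo[] = [],
  include: Combo[] = [],
): Combo[] {
  const kept = product(dims).filter((c) => !exclude.some((p) => matches(c, p)));
  return [...kept, ...include];
}

const combos = expandMatrix(
  { os: ['linux', 'arm64', 'windows'], node: ['18', '20', '22'] },
  [{ os: 'windows', node: '18' }],
  [{ os: 'linux', node: '23' }],
);
// 3 x 3 = 9 combinations, minus 1 excluded, plus 1 included = 9 job instances
```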

### Dynamic matrix

Compute matrix values at runtime using an async function:

```typescript
const test = job('test', {
  runsOn: 'linux',
  matrix: async ({ $ }) => {
    // Discover packages in a monorepo
    const result = await $`ls packages/`;
    return result.stdout.trim().split('\n');
  },
  steps: [
    step('test', async ({ $, matrix }) => {
      await $`cd packages/${matrix!.value} && pnpm test`;
    }),
  ],
});
```

Dynamic matrix functions receive the same context as dynamic job functions (`$`, `ctx`, `log`, `env`).

### Matrix type guards

Use type guards to inspect matrix configuration at compile time:

```typescript
import { isStaticArray, isStaticObject, isDynamicFunction } from '@kici-dev/sdk';

if (isStaticArray(myMatrix)) {
  // string[]
}
if (isStaticObject(myMatrix)) {
  // Record<string, string[]>
}
if (isDynamicFunction(myMatrix)) {
  // async function
}
```

## Dynamic job generation

Generate jobs at runtime using async factory functions. Useful for monorepos or when the set of jobs depends on the repository state:

```typescript
import { workflow, job, step, push } from '@kici-dev/sdk';
import type { DynamicJobFn } from '@kici-dev/sdk';

const discoverAndTest: DynamicJobFn = async ({ $ }) => {
  // Discover packages at runtime
  const result = await $`ls packages/`;
  const packages = result.stdout.trim().split('\n');

  return packages.map((pkg) =>
    job(`test-${pkg}`, {
      runsOn: 'linux',
      steps: [
        step('test', async ({ $ }) => {
          await $`cd packages/${pkg} && pnpm test`;
        }),
      ],
    }),
  );
};

export default workflow('monorepo-ci', {
  on: push(),
  jobs: [discoverAndTest],
});
```

### Mixing static and dynamic jobs

The `jobs` array accepts both static `Job` objects and `DynamicJobFn` functions:

```typescript
const lint = job('lint', {
  runsOn: 'linux',
  steps: [
    step('lint', async ({ $ }) => {
      await $`pnpm lint`;
    }),
  ],
});

export default workflow('monorepo-ci', {
  on: push(),
  jobs: [lint, discoverAndTest],
});
```

Static jobs and dynamic generators live side by side. The `isDynamicJobFn()` type guard distinguishes them at runtime.
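A type guard of this kind can be as simple as a `typeof` check. The sketch below uses stand-in types (`StaticJob`, `DynamicJobFnLike`) and a hypothetical `isDynamicJobFnLike` guard to show the idea; the SDK's own `isDynamicJobFn()` may be implemented differently.

```typescript
// Stand-in types for illustration; not the SDK's Job or DynamicJobFn.
interface StaticJob {
  name: string;
}
type DynamicJobFnLike = () => Promise<StaticJob[]>;
type JobEntry = StaticJob | DynamicJobFnLike;

// Assumed implementation: static jobs are objects, generators are functions.
function isDynamicJobFnLike(entry: JobEntry): entry is DynamicJobFnLike {
  return typeof entry === 'function';
}

// Split a mixed jobs array into static jobs and generators.
function partition(entries: JobEntry[]): { staticJobs: StaticJob[]; generators: DynamicJobFnLike[] } {
  const staticJobs: StaticJob[] = [];
  const generators: DynamicJobFnLike[] = [];
  for (const entry of entries) {
    if (isDynamicJobFnLike(entry)) generators.push(entry);
    else staticJobs.push(entry);
  }
  return { staticJobs, generators };
}

const lintJob: StaticJob = { name: 'lint' };
const discover: DynamicJobFnLike = async () => [{ name: 'test-core' }];
const parts = partition([lintJob, discover]);
```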

## Combining patterns

A full example combining triggers, rules, matrix, and job dependencies:

```typescript
import { workflow, job, step, pr, push, rule, skip } from '@kici-dev/sdk';

// Only run on PRs targeting main with source changes
const prTrigger = pr({ target: 'main', paths: ['src/**', 'packages/**', '!**/*.md'] });

// Also run on pushes to main
const pushTrigger = push({ branches: 'main' });

const lint = job('lint', {
  runsOn: 'linux',
  steps: [
    step('install', async ({ $ }) => {
      await $`pnpm install --frozen-lockfile`;
    }),
    step('lint', async ({ $ }) => {
      await $`pnpm lint`;
    }),
  ],
});

const test = job('test', {
  runsOn: 'linux',
  needs: [lint],
  matrix: { node: ['18', '20', '22'] },
  steps: [
    step('test', async ({ $ }) => {
      await $`pnpm test`;
    }),
  ],
});

const deploy = job('deploy', {
  runsOn: 'linux',
  needs: [test],
  rules: [
    // Only deploy from push events (not PRs)
    rule('push event only', async (ctx) => {
      return ctx.event.type === 'push';
    }),
  ],
  steps: [
    step('deploy', async ({ $ }) => {
      await $`pnpm build && pnpm deploy`;
    }),
  ],
});

export default workflow('full-pipeline', {
  on: [prTrigger, pushTrigger],
  rules: [
    skip('docs only', async (ctx) => {
      return ctx.changedFiles.every((f) => f.endsWith('.md'));
    }),
  ],
  jobs: [lint, test, deploy],
});
```

This workflow:

1. Triggers on PRs targeting main (with path filters) and pushes to main
2. Skips entirely if only docs files changed (workflow-level `skip` rule)
3. Runs lint first, then tests across 3 Node versions in parallel
4. Deploys only on push events (not on PRs), after all tests pass
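The execution order in this list follows from the `needs` edges. One way to picture it is to layer jobs into waves, where each wave contains the jobs whose dependencies are already done. The sketch below is a generic layering routine with hypothetical names (`JobNode`, `scheduleWaves`), not KiCI's scheduler.

```typescript
interface JobNode {
  name: string;
  needs: string[]; // names of jobs that must finish first
}

// Group jobs into waves; every job in a wave can start once the previous wave is done.
function scheduleWaves(jobs: JobNode[]): string[][] {
  const done = new Set<string>();
  const waves: string[][] = [];
  let remaining = jobs.slice();
  while (remaining.length > 0) {
    const ready = remaining.filter((j) => j.needs.every((n) => done.has(n)));
    if (ready.length === 0) throw new Error('cycle or unknown dependency in needs');
    waves.push(ready.map((j) => j.name));
    ready.forEach((j) => done.add(j.name));
    remaining = remaining.filter((j) => !done.has(j.name));
  }
  return waves;
}

const waves = scheduleWaves([
  { name: 'lint', needs: [] },
  { name: 'test', needs: ['lint'] },
  { name: 'deploy', needs: ['test'] },
]);
// waves: [['lint'], ['test'], ['deploy']]
```

In this model the three matrix instances of `test` would all belong to the second wave, which is why they can run in parallel.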

## Workflow chaining

---

## Integration patterns

Source: https://docs.kici.dev/user/patterns/integrations/

Use internal event triggers to chain workflows together. Workflow A completes, emits an event (or the system auto-emits a completion event), and Workflow B triggers in response.

### Using system completion events

The orchestrator automatically emits `workflow_complete` and `job_complete` events. Use `workflowComplete()` and `jobComplete()` triggers to listen for them:

```typescript
import { workflow, job, step, push, workflowComplete } from '@kici-dev/sdk';

// Workflow A: deploy on push to main
export const deploy = workflow('deploy', {
  on: push({ branches: 'main' }),
  jobs: [
    job('deploy', {
      runsOn: 'linux',
      steps: [
        step('deploy', async ({ $ }) => {
          await $`./scripts/deploy.sh`;
        }),
      ],
    }),
  ],
});

// Workflow B: runs after deploy succeeds
export const postDeploy = workflow('post-deploy', {
  on: workflowComplete({ name: 'deploy', status: ['success'] }),
  jobs: [
    job('notify', {
      runsOn: 'linux',
      steps: [
        step('slack', async ({ $ }) => {
          await $`./scripts/notify-slack.sh "Deploy succeeded"`;
        }),
      ],
    }),
  ],
});
```

### Using custom events

For richer payload data, emit custom events from steps using `ctx.emit()`:

```typescript
import { workflow, job, step, push, kiciEvent } from '@kici-dev/sdk';

// Workflow A: deploy and emit custom event with payload
export const deploy = workflow('deploy', {
  on: push({ branches: 'main' }),
  jobs: [
    job('deploy', {
      runsOn: 'linux',
      steps: [
        step('deploy', async ({ $ }) => {
          await $`./scripts/deploy.sh`;
        }),
        step('notify', async (ctx) => {
          await ctx.emit('deploy-complete', {
            env: 'prod',
            version: '1.2.3',
          });
        }),
      ],
    }),
  ],
});

// Workflow B: triggered by custom event with payload matching
export const postDeploy = workflow('post-deploy', {
  on: kiciEvent({ name: 'deploy-complete', match: { '$.env': 'prod' } }),
  jobs: [
    job('smoke-test', {
      runsOn: 'linux',
      steps: [
        step('test', async ({ $ }) => {
          await $`./scripts/smoke-test.sh`;
        }),
      ],
    }),
  ],
});
```

Custom events are delivered immediately (mid-workflow, not queued until workflow completion).
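For intuition about how a `match` object such as `{ '$.env': 'prod' }` selects events, here is a deliberately small matcher. It is a sketch under assumptions: it only understands dotted paths like `$.a.b`, compares with strict equality, and `resolvePath` and `matchesPayload` are hypothetical helpers rather than the orchestrator's JSONPath engine.

```typescript
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Resolve a path like '$.a.b'; returns undefined when any segment is missing.
function resolvePath(payload: Json, path: string): Json | undefined {
  if (!path.startsWith('$.')) return undefined;
  let current: Json | undefined = payload;
  for (const segment of path.slice(2).split('.')) {
    if (current === null || typeof current !== 'object' || Array.isArray(current)) {
      return undefined;
    }
    current = current[segment];
  }
  return current;
}

// Every path in `match` must resolve to exactly the expected value.
function matchesPayload(payload: Json, match: Record<string, Json>): boolean {
  return Object.entries(match).every(
    ([path, expected]) => resolvePath(payload, path) === expected,
  );
}

const prodEvent: Json = { env: 'prod', version: '1.2.3' };
const stagingEvent: Json = { env: 'staging', version: '1.2.3' };

const prodMatches = matchesPayload(prodEvent, { '$.env': 'prod' }); // true
const stagingMatches = matchesPayload(stagingEvent, { '$.env': 'prod' }); // false
```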

## Generic webhook integration

Trigger workflows from non-GitHub sources like ArgoCD, Jenkins, Grafana, or any HTTP service. Generic webhook sources are configured via the orchestrator admin API, and workflows listen using `genericWebhook()`.

```typescript
import { workflow, job, step, genericWebhook } from '@kici-dev/sdk';

// Triggered by ArgoCD deploy events
export default workflow('on-argocd-deploy', {
  on: genericWebhook({ source: 'argocd', events: ['deploy.success'] }),
  jobs: [
    job('post-deploy', {
      runsOn: 'linux',
      steps: [
        step('verify', async ({ $, rawPayload }) => {
          // rawPayload contains the full webhook body from ArgoCD
          await $`./scripts/verify-deploy.sh`;
        }),
      ],
    }),
  ],
});
```

See the [Operator guide: event routing](../../operator/event-routing.md) for how to set up generic webhook sources, verification methods, and trust relationships.

## Stripe webhook handler

Process payment events from Stripe using `genericWebhook()` with HMAC-SHA256 signature verification. This pattern applies to any external service that sends signed HTTP webhooks.

```typescript
import { workflow, job, step, genericWebhook } from '@kici-dev/sdk';

export default workflow('stripe-invoice-handler', {
  on: genericWebhook({
    source: 'stripe',
    events: ['invoice.paid'],
    auth: {
      method: 'hmac-sha256',
      secret: 'stripe-signing-key',
      signatureHeader: 'stripe-signature',
    },
    description: 'Process Stripe invoice.paid events',
  }),
  jobs: [
    job('process-invoice', {
      runsOn: 'linux',
      steps: [
        step('extract-customer', async ({ $, log }) => {
          log.info('Processing paid invoice from Stripe');
          await $`./scripts/process-invoice.sh`;
        }),
        step('update-billing', async ({ $ }) => {
          await $`./scripts/update-billing-records.sh`;
        }),
        step('notify-team', async ({ $ }) => {
          await $`./scripts/notify-billing-team.sh`;
        }),
      ],
    }),
  ],
});
```

**Prerequisites:**

- An operator must create a generic webhook source named `stripe` via the admin API. See [Operator guide: creating a source](../../operator/event-routing.md#creating-a-source).
- The `stripe-signing-key` secret must contain your Stripe webhook signing secret.
- This workflow uses the [registration model](../events.md#the-registration-model) -- it will not trigger until you push to your default branch.

## Self-hosted git forge (Gogs, Forgejo, Gitea)

KiCI has no native provider for Gogs, Forgejo, or Gitea, but these forges send HMAC-SHA256-signed webhooks with a predictable header layout. Model them as a generic webhook source: point the forge's webhook at the orchestrator (or the Platform relay), configure HMAC verification with the shared secret, and map the forge's event header so `genericWebhook()` can match on it.

**Operator setup:**

```bash
# Forgejo / Gitea send event name in X-Gitea-Event and signature in X-Gitea-Signature.
# Gogs uses X-Gogs-Event and X-Gogs-Signature (same HMAC-SHA256 hex-digest format).
kici-admin source add generic \
  --org my-org \
  --name forgejo-main \
  --verification hmac_sha256 \
  --secret @/path/to/webhook-secret.txt \
  --event-type-header X-Gitea-Event \
  --rate-limit 120
```

Note the returned source ID, then register a webhook in the forge pointing at `https://<platform>/webhooks/<orgId>/generic/<sourceId>` (or the orchestrator's direct URL). Set content type to `application/json` and paste the same secret.
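If deliveries are rejected, it can help to reproduce the signature check by hand. The sketch below computes the HMAC-SHA256 hex digest of the raw request body with Node's `crypto` module and compares it to the header value. It assumes the hex-digest format described above; `verifyHexSignature` is a hypothetical helper, and the orchestrator performs its own verification regardless.

```typescript
import { createHmac, timingSafeEqual } from 'node:crypto';

// True when `signatureHex` is the HMAC-SHA256 hex digest of `rawBody` under `secret`.
function verifyHexSignature(rawBody: string, secret: string, signatureHex: string): boolean {
  const expected = createHmac('sha256', secret).update(rawBody).digest();
  const received = Buffer.from(signatureHex, 'hex');
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  return received.length === expected.length && timingSafeEqual(received, expected);
}

// Example values; the secret and body here are placeholders.
const body = JSON.stringify({ ref: 'refs/heads/main' });
const secret = 'example-secret';
const goodSignature = createHmac('sha256', secret).update(body).digest('hex');

const accepted = verifyHexSignature(body, secret, goodSignature);
const tampered = verifyHexSignature(body + ' ', secret, goodSignature);
```

The digest must be computed over the exact bytes the forge sent, so re-serializing the JSON before hashing will usually produce a mismatch.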

**Workflow:**

```typescript
import { workflow, job, step, genericWebhook } from '@kici-dev/sdk';

export default workflow('on-forgejo-push', {
  on: genericWebhook({
    source: 'forgejo-main',
    events: ['push'], // Forgejo/Gitea sends 'push', 'pull_request', 'issues', etc.
    match: { '$.ref': 'refs/heads/main' }, // JSONPath filter on the payload
  }),
  jobs: [
    job('react-to-push', {
      runsOn: 'linux',
      steps: [
        step('log', async ({ rawPayload, log }) => {
          const ref = (rawPayload as { ref?: string }).ref;
          log.info(`Forgejo push to ${ref}`);
        }),
      ],
    }),
  ],
});
```

**Caveat — cloning:** generic webhook sources deliver only the payload; they do not carry a clone token, and KiCI's automatic pre-step clone (`packages/agent/src/checkout/git-clone.ts`) is GitHub-only today (HTTPS + `http.extraHeader` Basic auth with a GitHub installation token). Three practical patterns:

- **Mirror to GitHub and fan out.** Keep the repo on GitHub, register the workflow via a GitHub default-branch push, and have Gogs/Forgejo webhooks fan out via [cross-source delivery](../../architecture/webhooks/webhook-delivery.md#cross-source-delivery). The clone runs against the GitHub mirror using the GitHub App's token.
- **Clone yourself using a secret.** Set `checkout: false` on the job to skip the framework clone, store an SSH private key or forge personal access token as a secret, and run `git clone` explicitly in the first step. This works for any forge the agent can reach, no mirror needed. You still need a way to **register** the workflow — either keep a one-file GitHub repo whose only job is to own the registration, or bootstrap the registration manually against the orchestrator DB.
- **Self-contained workflow.** No clone at all. The step reads whatever it needs from `rawPayload` (e.g., `rawPayload.after`, `rawPayload.repository.clone_url`) and drives external systems — notifications, deploys, third-party CI triggers.

Manual-clone example (pattern 2) using an SSH deploy key:

```typescript
job('forgejo-ci', {
  runsOn: 'linux',
  checkout: false, // skip framework clone
  steps: [
    step('clone', async ({ $, ctx, rawPayload }) => {
      const sshKey = await ctx.secrets.get('FORGEJO_DEPLOY_KEY');
      await $`mkdir -p ~/.ssh`;
      await $`ssh-keyscan forgejo.example.com >> ~/.ssh/known_hosts`;
      await $({ input: sshKey })`tee ~/.ssh/id_ed25519 > /dev/null`;
      await $`chmod 600 ~/.ssh/id_ed25519`;
      const url = (rawPayload as { repository: { ssh_url: string } }).repository.ssh_url;
      const sha = (rawPayload as { after: string }).after;
      await $`git clone ${url} src && cd src && git checkout ${sha}`;
    }),
    step('test', async ({ $ }) => {
      await $`cd src && pnpm install && pnpm test`;
    }),
  ],
});
```

HTTPS with a forge PAT works the same way — store the token as a secret, `await ctx.secrets.expose('FORGEJO_TOKEN')`, then `git clone https://oauth2:$FORGEJO_TOKEN@forgejo.example.com/org/repo.git`.

**Prerequisites:**

- An operator must create a generic webhook source via `kici-admin source add generic` (see above).
- The forge's webhook secret must match the `--secret` value.
- The workflow uses the [registration model](../events.md#the-registration-model) -- push to the default branch of a registered repo before the first webhook fires.

## Plain GitHub repo webhooks (no GitHub App)

The Gogs/Forgejo/Gitea pattern above also applies when you want to trigger workflows from a GitHub repository **without installing the KiCI GitHub App** — for example because you lack org-admin rights, you're on a restricted GitHub Enterprise tenant, or you simply don't want an App installation. Model the repo-level webhook as a generic source, accepting the same `genericWebhook()`-only ergonomics.

**Operator setup:**

```bash
# GitHub sends event name in X-GitHub-Event and HMAC-SHA256 signature in X-Hub-Signature-256.
kici-admin source add generic \
  --org my-org \
  --name gh-repo-foo \
  --verification hmac_sha256 \
  --secret @/path/to/webhook-secret.txt \
  --event-type-header X-GitHub-Event \
  --rate-limit 120

# Patch the verificationConfig to use GitHub's signature header
# (the CLI has no --signature-header flag; use the admin REST API):
curl -X PATCH https://<orchestrator>/api/v1/admin/generic-sources/<sourceId> \
  -H "Authorization: Bearer <admin-token>" \
  -H "Content-Type: application/json" \
  -d '{"verificationConfig":{"secret":"<same-secret>","headerName":"x-hub-signature-256"}}'
```

Then in the GitHub repo, go to **Settings → Webhooks → Add webhook**, set:

- **Payload URL:** `https://<platform>/webhooks/<orgId>/generic/<sourceId>` (or the orchestrator's direct URL)
- **Content type:** `application/json`
- **Secret:** the same secret
- **Events:** pick what you care about (e.g., `push`, `pull_request`)

**Workflow:**

```typescript
import { workflow, job, step, genericWebhook } from '@kici-dev/sdk';

export default workflow('on-github-repo-push', {
  on: genericWebhook({
    source: 'gh-repo-foo',
    events: ['push'],
    match: { '$.ref': 'refs/heads/main' },
  }),
  jobs: [
    job('notify', {
      runsOn: 'linux',
      checkout: false, // no App token -> skip auto-clone
      steps: [
        step('log', async ({ rawPayload, log }) => {
          const sha = (rawPayload as { after?: string }).after;
          log.info(`GitHub push ${sha}`);
        }),
      ],
    }),
  ],
});
```

**What you lose compared to the GitHub App** (these are the same cloning / metadata caveats that apply to the Gogs/Forgejo pattern, plus GitHub-specific integrations):

- No auto-clone — `packages/agent/src/checkout/git-clone.ts` uses GitHub App installation tokens to fetch the repo; a generic source has none. Either set `checkout: false` and clone yourself with a PAT/Deploy Key secret (same pattern as the Forgejo manual-clone example above), or keep the workflow self-contained.
- No lock-file fetch — the orchestrator cannot fetch `.kici/kici.lock.json` at the pushed SHA via the GitHub API. The workflow must be pre-registered via the [registration model](../events.md#the-registration-model); ad-hoc per-commit workflow discovery that a GitHub App push gives you is not available.
- No changed-files enrichment — `event.changedFiles` is empty. Use JSONPath `match` on `rawPayload.commits[*].added/modified/removed` if you need path filters.
- No check-run integration — KiCI cannot post Check Run results back to GitHub.
- Workflow authors must use `genericWebhook()`, not `push()` / `pr()` / `webhook()` — the latter three only match events delivered through the native GitHub App provider.
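Where path-based decisions are still needed, a step can rebuild a changed-files list from the push payload's `commits` array. The helper below is a minimal sketch: `changedFilesFromPush` and the `PushPayload` shape are names introduced here, covering only the `added`, `modified`, and `removed` arrays that GitHub includes on each commit in a push delivery.

```typescript
// Subset of the GitHub push payload used by this sketch.
interface PushCommit {
  added?: string[];
  modified?: string[];
  removed?: string[];
}
interface PushPayload {
  commits?: PushCommit[];
}

// Collect every path touched by any commit in the push, de-duplicated and sorted.
function changedFilesFromPush(payload: PushPayload): string[] {
  const files = new Set<string>();
  for (const commit of payload.commits ?? []) {
    const touched = [
      ...(commit.added ?? []),
      ...(commit.modified ?? []),
      ...(commit.removed ?? []),
    ];
    for (const path of touched) files.add(path);
  }
  return Array.from(files).sort();
}

const samplePayload: PushPayload = {
  commits: [
    { added: ['src/a.ts'], modified: ['README.md'], removed: [] },
    { added: [], modified: ['src/a.ts'], removed: ['src/old.ts'] },
  ],
};

const changed = changedFilesFromPush(samplePayload);
// changed: ['README.md', 'src/a.ts', 'src/old.ts']
```

Inside a step this could be called as `changedFilesFromPush(rawPayload as PushPayload)`, with the cast carrying the same caveats as the other `rawPayload` casts on this page.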

**When to use it anyway:** trigger-only workflows that don't need the cloned repo — posting Slack messages, kicking off external deploys, forwarding to downstream systems, or exposing GitHub repo events as `genericWebhook` for same-org [cross-source fan-out](../../architecture/webhooks/webhook-delivery.md#cross-source-delivery). For anything that compiles, tests, or checks code, install the GitHub App instead.

## Nightly cron build

---

## Pattern reference

Source: https://docs.kici.dev/user/patterns/reference/

Every step receives a `StepContext` with these properties:

| Property            | Type                                | Description                                                      |
| ------------------- | ----------------------------------- | ---------------------------------------------------------------- |
| `$`                 | zx shell                            | Shell executor for running commands                              |
| `log`               | `Logger`                            | Structured logger (info, warn, error, debug)                     |
| `env`               | `Record<string, string\|undefined>` | Environment variables                                            |
| `setEnv()`          | `(key, value) => void`              | Set an env var visible to this step and all subsequent steps     |
| `addPath()`         | `(dir) => void`                     | Prepend a directory to PATH for this and all subsequent steps    |
| `inputs`            | `Record<string, unknown>`           | Typed inputs from dependency outputs                             |
| `workflow`          | `{ name: string }`                  | Current workflow metadata                                        |
| `job`               | `{ name: string, runsOn: string }`  | Current job metadata                                             |
| `matrix`            | `MatrixValues \| undefined`         | Matrix values for current job instance                           |
| `setSecretOutput()` | `(key, value) => void`              | Publish an encrypted secret output consumable by downstream jobs |

### Step outputs

Steps can declare typed outputs using Zod schemas:

```typescript
import { step } from '@kici-dev/sdk';
import { z } from 'zod';

const build = step('build', {
  outputs: {
    version: z.string(),
    artifacts: z.array(z.string()),
  },
  run: async ({ $ }) => {
    await $`pnpm build`;
    return {
      version: '1.0.0',
      artifacts: ['dist/main.js', 'dist/styles.css'],
    };
  },
});
```

## Examples repository

For more runnable examples, see the [examples/](https://github.com/kici-dev/kici-public/tree/main/examples) directory in the KiCI repository.

## GitHub check run output

When workflows run via GitHub pull requests or pushes, KiCI creates GitHub Check runs that show detailed execution feedback directly in the GitHub UI.

### What you see

- **Live progress:** As steps execute, the check run updates with a checklist showing which steps are running, completed, or pending
- **Step durations:** Each step shows its execution time (e.g., "Install deps (1.2s)")
- **Failure details:** When a step fails, the check run includes the error message, exit code, and the last 20 lines of log output
- **Source annotations:** Failed steps are annotated directly on your workflow file (`.kici/workflows/*.ts`) in the GitHub PR diff, linking the failure to the exact `step()` call that failed

### Source location annotations

KiCI captures the source location of each `step()` call during compilation and stores it in the lock file. When a step fails, GitHub displays an annotation on the corresponding line in your workflow file:

```typescript
// This step's source location is captured automatically
step('run tests', async ({ $ }) => {
  await $`pnpm test`; // If this fails, GitHub annotates this step() call
});
```

To enable source location annotations, recompile your workflows after updating KiCI. The compiler captures step locations starting from compile schema version 2.

```bash
pnpm kici compile  # Regenerates kici.lock.json with source locations
```

## See also

---

## Scheduling & event patterns

Source: https://docs.kici.dev/user/patterns/scheduling-and-events/

Run a full build and test suite on a schedule using `schedule()`. Schedule triggers are evaluated by the orchestrator's Raft leader in clustered deployments.

```typescript
import { workflow, job, step, schedule } from '@kici-dev/sdk';

const install = step('install', async ({ $ }) => {
  await $`pnpm install --frozen-lockfile`;
});

const fullTest = job('full-test', {
  runsOn: 'linux',
  steps: [
    install,
    step('test', async ({ $ }) => {
      await $`pnpm test`;
    }),
    step('typecheck', async ({ $ }) => {
      await $`pnpm typecheck`;
    }),
  ],
});

const publish = job('publish-nightly', {
  runsOn: 'linux',
  needs: [fullTest],
  steps: [
    install,
    step('build', async ({ $ }) => {
      await $`pnpm build`;
    }),
    step('publish', async ({ $ }) => {
      await $`./scripts/publish-nightly.sh`;
    }),
  ],
});

export default workflow('nightly-build', {
  on: schedule({ cron: '0 2 * * *', description: 'Every day at 2 AM UTC' }),
  jobs: [fullTest, publish],
});
```

**Notes:**

- The `cron` field uses standard 5-field cron syntax. Use the `timezone` option (defaults to `'UTC'`) to control evaluation in a specific timezone: `schedule({ cron: '0 2 * * *', timezone: 'America/New_York' })`.
- Schedule workflows use the [registration model](../events.md#the-registration-model) -- the cron will not start firing until you push to your default branch.
- In clustered orchestrator deployments, only the Raft leader evaluates cron schedules. If the leader changes, the new leader recovers missed schedules.

**Timing precision and scaling:**

- The orchestrator's cron evaluator wakes up every **30 seconds** (fixed interval, not configurable at runtime). A schedule due at 02:00:00 fires on the next tick after that moment, so expect **0-30 seconds of jitter after the scheduled time** -- never early. The event payload's `scheduledAt` field carries the exact cron-computed time (not the fire time), so downstream consumers can reason about the intended schedule rather than the dispatch moment.
- All cron schedules are evaluated **serially** within a single tick on the leader. Each evaluation does an in-memory cron computation plus two DB writes (atomic claim + event emit), so per-schedule cost is on the order of low tens of milliseconds. Practically, dozens of schedules firing in the same tick add up to well under a second of extra dispatch latency between the first and the last -- negligible compared to the 0-30 s tick alignment.
- If the leader fails over, the new leader recovers **at most one fire per schedule** -- the most recent past scheduled time. KiCI does not backfill multiple missed runs (for example, a 30-minute cron stuck for two hours fires once, not four times). The per-schedule lower bound on fire frequency is the cron expression's natural period; the upper bound on lateness is `30 s + (Raft election + restart time)`.
- Every-minute crons (`* * * * *`), the finest granularity that 5-field syntax can express, are supported but constrained by the 30-second tick: such a schedule fires roughly once per minute, but the actual fire time can lag each minute boundary by up to 30 seconds.
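The tick alignment described in these notes reduces to simple arithmetic. The sketch below computes the first evaluator tick at or after a scheduled time; `fireTime` and `tickOrigin` (the moment the evaluator started ticking) are assumptions of this model, since the real tick phase depends on when the leader started.

```typescript
const TICK_MS = 30_000; // evaluator interval from the notes above

// First tick at or after `scheduledAt`, given when ticking started (both in epoch ms).
function fireTime(scheduledAt: number, tickOrigin: number): number {
  const elapsed = Math.max(0, scheduledAt - tickOrigin);
  return tickOrigin + Math.ceil(elapsed / TICK_MS) * TICK_MS;
}

// Example: the evaluator started ticking at 01:59:47 UTC; the cron is due at 02:00:00 UTC.
const tickOrigin = Date.UTC(2025, 0, 1, 1, 59, 47);
const scheduledAt = Date.UTC(2025, 0, 1, 2, 0, 0);

const fired = fireTime(scheduledAt, tickOrigin);
const jitterSeconds = (fired - scheduledAt) / 1000;
// fired is 02:00:17 UTC, so jitterSeconds is 17
```

The fire never precedes `scheduledAt`, and the gap stays below one tick interval, which matches the 0-30 second jitter window.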

## Workflow-complete-triggered deploy

Trigger a deployment automatically when a build workflow succeeds, using `workflowComplete()`. This is one of the most common event chaining patterns.

```typescript
import { workflow, job, step, push, workflowComplete } from '@kici-dev/sdk';

// Workflow A: build and test on push to main
export const build = workflow('build', {
  on: push({ branches: 'main' }),
  jobs: [
    job('test', {
      runsOn: 'linux',
      steps: [
        step('install', async ({ $ }) => {
          await $`pnpm install --frozen-lockfile`;
        }),
        step('test', async ({ $ }) => {
          await $`pnpm test`;
        }),
        step('build', async ({ $ }) => {
          await $`pnpm build`;
        }),
      ],
    }),
  ],
});

// Workflow B: deploy when build succeeds
export const deploy = workflow('deploy-on-success', {
  on: workflowComplete({ name: 'build', status: ['success'] }),
  jobs: [
    job('deploy', {
      runsOn: 'linux',
      steps: [
        step('deploy-staging', async ({ $ }) => {
          await $`./scripts/deploy.sh staging`;
        }),
        step('run-smoke-tests', async ({ $ }) => {
          await $`./scripts/smoke-test.sh staging`;
        }),
        step('deploy-production', async ({ $ }) => {
          await $`./scripts/deploy.sh production`;
        }),
      ],
    }),
  ],
});
```

**Notes:**

- `workflowComplete()` is a system event trigger -- the orchestrator automatically emits these events when workflows finish. You do not need to call `ctx.emit()`.
- The `status` filter accepts `'success'`, `'failed'`, and `'cancelled'`. Omit `status` to trigger on any completion.
- The `deploy-on-success` workflow uses the [registration model](../events.md#the-registration-model) -- it will not trigger until you push to your default branch. The `build` workflow (using `push()`) works immediately.
- You can also use `jobComplete()` to trigger on individual job completions within a workflow.

## Custom event chaining

Two workflows communicating through custom events using `kiciEvent()` and `ctx.emit()`. Workflow A runs tests and emits a typed event with results. Workflow B listens for that event and triggers a deployment.

```typescript
import { workflow, job, step, push, kiciEvent, defineEvent, z } from '@kici-dev/sdk';

// Define a typed event contract
const testsPassedEvent = defineEvent(
  'tests-passed',
  z.object({
    branch: z.string(),
    commit: z.string(),
    testCount: z.number(),
    duration: z.number(),
  }),
);

// Workflow A: run tests and emit result event
export const testSuite = workflow('test-suite', {
  on: push({ branches: 'main' }),
  jobs: [
    job('test', {
      runsOn: 'linux',
      steps: [
        step('install', async ({ $ }) => {
          await $`pnpm install --frozen-lockfile`;
        }),
        step('run-tests', async ({ $ }) => {
          await $`pnpm test`;
        }),
        step('emit-results', async (ctx) => {
          await ctx.emit(testsPassedEvent.name, {
            branch: 'main',
            commit: 'abc123',
            testCount: 142,
            duration: 45,
          });
        }),
      ],
    }),
  ],
});

// Workflow B: deploy when tests pass (in the same or separate file)
export const autoDeploy = workflow('auto-deploy', {
  on: kiciEvent({ name: 'tests-passed' }),
  jobs: [
    job('deploy', {
      runsOn: 'linux',
      steps: [
        step('deploy', async ({ $ }) => {
          await $`./scripts/deploy.sh`;
        }),
        step('notify', async ({ $ }) => {
          await $`./scripts/notify-slack.sh "Deployment complete"`;
        }),
      ],
    }),
  ],
});
```

**Notes:**

- Both workflows can live in the same `.kici/workflows/` file or in separate files -- the event system routes by event name, not by file.
- `defineEvent()` creates a typed contract using Zod. This is optional but recommended for documenting event payloads.
- Custom events are delivered immediately when `ctx.emit()` is called (mid-workflow), not queued until the workflow completes.
- Payload matching is available via the `match` option: `kiciEvent({ name: 'tests-passed', match: { '$.branch': 'main' } })`.
- The `auto-deploy` workflow uses the [registration model](../events.md#the-registration-model) -- it will not trigger until you push to your default branch.
- The [circuit breaker](../events.md#circuit-breaker) limits chain depth (default: 10) and rate (default: 100/min per workflow) to prevent infinite loops.

## Step context

---

# SDK reference

## Caching

Source: https://docs.kici.dev/user/sdk/caching/

KiCI ships a general-purpose cache for any files or directories your workflow produces — compiled artifacts, downloaded toolchains, package manager stores, build outputs. A cache entry is keyed, immutable once written, and shared across runs of the same repository so a later run can restore what an earlier run produced instead of recomputing it.

Two surfaces drive the same cache:

- **Declarative** — a `cache` field on a job or a step. The runtime restores before the work runs and saves after it succeeds, with no code in your step body.
- **Imperative** — `ctx.cache.restore(spec)` / `ctx.cache.save(spec)` inside a step body, for fine-grained control over when restore and save happen.

The cache is backed by the orchestrator's object storage. Entries are isolated per organization and per ref scope (see [Isolation](#isolation)); no other tenant can read your cache, and an untrusted/fork ref can never poison the cache a trusted branch reads.

## CacheSpec

Both surfaces take the same shape:

```typescript
interface CacheSpec {
  /** Exact cache key. First save wins; re-saving an existing key is a no-op. */
  key: string;
  /** Files/directories to cache. Repo-root-relative or `~`-prefixed. */
  paths: string[];
  /** Ordered prefix fallbacks for partial restore; newest matching entry wins. */
  restoreKeys?: string[];
}
```

- **`key`** is the exact cache key. It is **immutable** — the first save under a given key wins, and any later save under the same exact key is a no-op (the existing entry is never overwritten). Build keys from inputs that change when the cached content should change, e.g. a hash of your lockfile: `` key: `deps-${await ctx.$`sha256sum pnpm-lock.yaml`}` ``.
- **`paths`** are the files and directories to archive, repo-root-relative or `~`-prefixed (the agent expands `~` to the workspace home). At least one path is required.
- **`restoreKeys`** are ordered **prefix** fallbacks tried only when the exact `key` misses on restore. Each prefix is matched against existing entries; the **newest** matching entry wins. This lets a run that changed its lockfile still restore the closest previous cache and rebuild incrementally.
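
As an illustration of building a content-derived key, the sketch below hashes the lockfile in-process with Node's `node:crypto` module. Shelling out to `sha256sum` prints the file name after the digest, so hashing in-process is one way to get a clean key. The helper names `contentKey` and `fileKey` are hypothetical and are not part of `@kici-dev/sdk`.

```typescript
import { createHash } from 'node:crypto';
import { readFileSync } from 'node:fs';

/** Build `<prefix>-<sha256 hex>` from raw content. Hypothetical helper, not an SDK export. */
function contentKey(prefix: string, content: string | Uint8Array): string {
  const digest = createHash('sha256').update(content).digest('hex');
  return `${prefix}-${digest}`;
}

/** Convenience wrapper that hashes a file on disk. */
function fileKey(prefix: string, path: string): string {
  return contentKey(prefix, readFileSync(path));
}

// Possible use in a spec (assumes pnpm-lock.yaml exists at the repo root):
// const spec = {
//   key: fileKey('deps', 'pnpm-lock.yaml'),
//   paths: ['node_modules'],
//   restoreKeys: ['deps-'],
// };
```

Because the key changes whenever the lockfile bytes change, a new lockfile produces a new immutable entry, while the `deps-` prefix still allows a fallback restore.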

## Declarative cache

Add a `cache` field to a job or a step. It accepts one `CacheSpec` or an array of them. The runtime restores every spec before the job/step runs (surfaced as a `cache:restore` pseudo-step) and saves every spec after it completes successfully (surfaced as a `cache:save` pseudo-step):

```typescript
import { job, step } from '@kici-dev/sdk';

job('build', {
  runsOn: 'linux-x64',
  cache: {
    key: 'mise-tools-v1',
    paths: ['~/.local/share/mise'],
  },
  steps: [
    step('install-tools', async (ctx) => {
      await ctx.$`mise install`;
    }),
    step('build', async (ctx) => {
      await ctx.$`mise exec -- pnpm build`;
    }),
  ],
});
```

Step-level cache scopes the restore/save to a single step:

```typescript
// `lockfileHash` is assumed to be computed earlier (for example, a SHA-256 of pnpm-lock.yaml).
step('deps', {
  cache: { key: `npm-${lockfileHash}`, paths: ['node_modules'], restoreKeys: ['npm-'] },
  run: async (ctx) => {
    await ctx.$`pnpm install --frozen-lockfile`;
  },
});
```

On a cache **hit**, the archived paths are restored before the step body runs, so `pnpm install` sees a warm `node_modules`. On a **miss**, the step runs cold and the resulting paths are saved under the exact key for the next run.

## Imperative cache (`ctx.cache`)

When you need to decide at runtime whether to restore or save — for example, save only when a build actually changed something — use the imperative API on the step context:

```typescript
// `sourceHash` is assumed to be computed earlier from the build inputs.
step('build', async (ctx) => {
  const result = await ctx.cache.restore({
    key: `build-${sourceHash}`,
    paths: ['dist'],
    restoreKeys: ['build-'],
  });

  if (result.hit) {
    ctx.log.info(`restored cache (matched ${result.matchedKey})`);
  }

  await ctx.$`pnpm build`;

  await ctx.cache.save({ key: `build-${sourceHash}`, paths: ['dist'] });
});
```

`restore(spec)` returns `{ hit, matchedKey? }`:

- `hit` is `true` when the exact `key` matched **or** a `restoreKeys` prefix matched.
- `matchedKey` is the full key that actually matched — the exact key on a direct hit, or the full key of the matched prefix entry on a fallback hit.

`save(spec)` archives `spec.paths` under `spec.key`. Like the declarative surface, it is immutable: the first save under an exact key wins, and re-saving the same key is a no-op.

## Restore semantics

A restore resolves in this order:

1. **Exact key.** If an entry exists under the exact `key`, it is restored and `matchedKey === key`.
2. **restoreKeys prefix fallback.** Each `restoreKeys` prefix is tried in order. Within a prefix, the **newest** matching entry wins; `matchedKey` is that entry's full key.
3. **Miss.** If nothing matches, `hit` is `false` and no paths are restored.

This mirrors the familiar lockfile-hash pattern: key the entry on the exact lockfile hash, and add a `restoreKeys` prefix so a changed lockfile still restores the most recent prior cache to rebuild from.
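
The resolution order above can be modelled with a small in-memory function. This is a sketch of the documented semantics, not the orchestrator's implementation; the `Entry` shape, the `savedAt` timestamp used to decide which entry is "newest", and the `resolveRestore` name are assumptions made for illustration.

```typescript
interface Entry {
  key: string;
  savedAt: number; // assumed ordering field: larger means newer
}

interface RestoreResult {
  hit: boolean;
  matchedKey?: string;
}

function resolveRestore(entries: Entry[], key: string, restoreKeys: string[] = []): RestoreResult {
  // 1. Exact key wins outright.
  if (entries.some((e) => e.key === key)) {
    return { hit: true, matchedKey: key };
  }
  // 2. Try each prefix in order; within a prefix, pick the newest entry.
  for (const prefix of restoreKeys) {
    const matches = entries.filter((e) => e.key.startsWith(prefix));
    if (matches.length > 0) {
      const newest = matches.reduce((a, b) => (b.savedAt > a.savedAt ? b : a));
      return { hit: true, matchedKey: newest.key };
    }
  }
  // 3. Nothing matched.
  return { hit: false };
}
```

Note that an earlier prefix takes priority over a later one even if the later prefix would match a newer entry, which is why the order of `restoreKeys` matters.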

## Immutability

Cache keys are write-once. The **first** save under an exact key wins; every subsequent save under that same exact key is a no-op and the original bytes are preserved. To publish new content, use a new key (typically by including a content hash in the key). Immutability is what makes a cache hit safe to trust — the bytes behind a given key never change after they are first written.
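
A minimal sketch of first-save-wins behaviour, using an in-memory map as a stand-in for the real object storage. The `WriteOnceStore` class is hypothetical and only illustrates the rule described above.

```typescript
class WriteOnceStore {
  private readonly entries = new Map<string, string>();

  /** Returns true if this call wrote the entry, false if the key already existed (no-op). */
  save(key: string, content: string): boolean {
    if (this.entries.has(key)) {
      return false; // original bytes are preserved
    }
    this.entries.set(key, content);
    return true;
  }

  read(key: string): string | undefined {
    return this.entries.get(key);
  }
}
```

Saving `'v2'` under a key that already holds `'v1'` leaves `'v1'` in place; publishing `'v2'` requires a different key.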

## Isolation

Each cache entry is scoped to your organization and to the ref's trust level:

- **Trusted refs** (your repository's own branches, default branch) read and write a **shared** scope visible to the whole org for that repository.
- **Untrusted / fork refs** read the shared scope as a fallback but write to an **isolated** per-run scope. A fork build can therefore benefit from a warm cache the trusted branch produced, but can never write into the shared scope — so a malicious fork cannot poison the cache a trusted branch later restores.

No tenant can read another tenant's cache; the org boundary is enforced in the cache key namespace.
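
The trust-to-scope mapping can be pictured as a small planning function. This is a sketch under stated assumptions: the scope names (`shared`, `isolated/<runId>`) and the `planScopes` helper are invented for illustration, and reading the isolated scope before the shared fallback is inferred from the description above.

```typescript
type Trust = 'trusted' | 'untrusted';

interface ScopePlan {
  readFrom: string[]; // scopes consulted in order on restore
  writeTo: string; // scope that receives saves
}

function planScopes(trust: Trust, runId: string): ScopePlan {
  const shared = 'shared'; // hypothetical scope name
  if (trust === 'trusted') {
    return { readFrom: [shared], writeTo: shared };
  }
  // Untrusted and fork refs never write into the shared scope.
  const isolated = `isolated/${runId}`; // hypothetical per-run scope name
  return { readFrom: [isolated, shared], writeTo: isolated };
}
```

The property that matters is that `writeTo` for an untrusted ref is never the shared scope, so nothing a fork saves can be restored by a trusted branch.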

## Eviction

Cache storage is bounded per organization. Two mechanisms keep it bounded:

- **Quota** — when a save pushes the org over its byte quota (`KICI_USER_CACHE_QUOTA_BYTES`, default 5 GiB), the oldest entries are evicted until the org is back under quota.
- **TTL** — entries unused for `KICI_USER_CACHE_TTL_MS` (default 7 days) expire. The TTL refreshes on read (touch-on-read), so an actively used cache stays warm.

Both knobs are operator-configured on the orchestrator — see [orchestrator storage layout](../../operator/orchestrator/storage-layout.md).
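
The two mechanisms can be sketched as pure functions over a list of entries. This is an illustrative model only: the `CacheEntry` shape and function names are hypothetical, and "oldest" is assumed here to mean least recently used, which the text above does not spell out.

```typescript
interface CacheEntry {
  key: string;
  bytes: number;
  lastUsedAt: number; // epoch milliseconds; refreshed on every read
}

const DEFAULT_QUOTA_BYTES = 5 * 1024 ** 3; // 5 GiB
const DEFAULT_TTL_MS = 7 * 24 * 60 * 60 * 1000; // 7 days

/** Touch-on-read: a read resets the TTL clock for the entry. */
function touchOnRead(entry: CacheEntry, now: number): CacheEntry {
  return { ...entry, lastUsedAt: now };
}

/** TTL: drop entries that have gone unused for ttlMs or longer. */
function dropExpired(entries: CacheEntry[], now: number, ttlMs = DEFAULT_TTL_MS): CacheEntry[] {
  return entries.filter((e) => now - e.lastUsedAt < ttlMs);
}

/** Quota: evict oldest entries until the total size fits within quotaBytes. */
function evictToQuota(entries: CacheEntry[], quotaBytes = DEFAULT_QUOTA_BYTES): CacheEntry[] {
  const newestFirst = [...entries].sort((a, b) => b.lastUsedAt - a.lastUsedAt);
  let total = newestFirst.reduce((sum, e) => sum + e.bytes, 0);
  while (total > quotaBytes && newestFirst.length > 0) {
    const oldest = newestFirst.pop()!;
    total -= oldest.bytes;
  }
  return newestFirst;
}
```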

## Observability

Each cache restore and save surfaces in the run timeline as a `cache:restore` / `cache:save` pseudo-step, reporting the outcome (hit/miss/saved, the matched key, bytes). The same outcomes are recorded as `cache.restore` / `cache.save` run events. See [data flows](../../architecture/data-flows.md#user-facing-cache-flow) for the restore/save protocol.

## See also

- [Core](./core.md) -- `job()` / `step()` factories the `cache` field attaches to
- [Runtime](./runtime.md) -- `StepContext`, where `ctx.cache` lives
- [Orchestrator storage layout](../../operator/orchestrator/storage-layout.md) -- cache prefix, quota, TTL, and eviction
- [Data flows](../../architecture/data-flows.md#user-facing-cache-flow) -- restore/save protocol and trust→scope mapping

---

## SDK reference: core

Source: https://docs.kici.dev/user/sdk/core/

## Factory functions

### workflow(name, options)

Create a workflow containing jobs.

```typescript
function workflow(name: string, options: WorkflowOptions): Workflow;
```

**Parameters:**

| Parameter             | Type                                                                   | Required | Description                                                                                                                                                 |
| --------------------- | ---------------------------------------------------------------------- | -------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `name`                | `string`                                                               | yes      | Unique workflow name                                                                                                                                        |
| `options.jobs`        | `JobOrFactory[]`                                                       | yes      | Static jobs and/or dynamic job generators                                                                                                                   |
| `options.on`          | `Trigger \| Trigger[]`                                                 | no       | When the workflow should trigger                                                                                                                            |
| `options.rules`       | `Rule[]`                                                               | no       | Conditions that must pass for execution                                                                                                                     |
| `options.description` | `string`                                                               | no       | Human-readable description                                                                                                                                  |
| `options.hashFiles`   | `string[]`                                                             | no       | Extra repo-relative paths or globs mixed into the workflow content hash. Changes invalidate the source cache.                                               |
| `options.registries`  | `Registry[]`                                                           | no       | Private npm registries the agent authenticates against before `npm install`. Each `tokenSecret` uses qualified `<environment>:<secret>` syntax.             |
| `options.installEnv`  | `string[]`                                                             | no       | Qualified `<environment>:<secret>` refs projected as env vars onto the install subprocess (used with a customer-committed `.kici/.npmrc`).                  |
| `options.onCancel`    | `HookInput`                                                            | no       | Runs when the workflow is cancelled                                                                                                                         |
| `options.cleanup`     | `HookInput`                                                            | no       | Always runs after the workflow (success, failure, or cancel)                                                                                                |
| `options.onSuccess`   | `HookInput`                                                            | no       | Runs on workflow success                                                                                                                                    |
| `options.onFailure`   | `HookInput`                                                            | no       | Runs on workflow failure                                                                                                                                    |
| `options.concurrency` | `{ group: (ctx) => string; cancelInProgress?: boolean; max?: number }` | no       | Workflow-scoped concurrency. See [Concurrency](../concurrency.md).                                                                                          |
| `options.timeout`     | `number`                                                               | no       | Whole-run wall-clock timeout in milliseconds across all jobs. On breach the orchestrator cancels the run and marks it timed out. See [Timeouts](#timeouts). |

**Returns:** `Workflow` -- an immutable workflow definition.

```typescript
export default workflow('ci', {
  on: [pr({ target: 'main' }), push({ branches: 'main' })],
  rules: [rule('has source changes')],
  jobs: [lint, test, deploy],
  description: 'Main CI pipeline',
});
```

Secret scoping happens at the job level via `environment` (see [job options](#jobname-options--joboptions) and [Secrets](../secrets.md)) — the workflow itself does not declare which secret environments it can read.

### job(name, options) / job(options)

Create a job with an explicit name or auto-generated ID.

```typescript
function job(name: string, options: JobOptions): Job;
function job(options: JobOptions): Job;
```

**Parameters:**

| Parameter                  | Type                                                            | Required             | Description                                                                                                                                              |
| -------------------------- | --------------------------------------------------------------- | -------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `name`                     | `string`                                                        | no                   | Job name (auto-generated UUID if omitted)                                                                                                                |
| `options.runsOn`           | `RunsOn`                                                        | yes                  | Runner label(s) and optional exclusions (see below)                                                                                                      |
| `options.steps`            | `StepInput[]`                                                   | yes (or use `run`)   | Steps to execute in order. Mutually exclusive with `run`.                                                                                                |
| `options.run`              | `(ctx) => Promise<unknown>`                                     | yes (or use `steps`) | Single-step shorthand -- see [Single-step job shorthand](#single-step-job-shorthand). Mutually exclusive with `steps`.                                   |
| `options.needs`            | `NeedsEntry[]`                                                  | no                   | Job dependencies (must complete first) -- see [Job dependencies (`needs`)](#job-dependencies-needs)                                                      |
| `options.rules`            | `Rule[]`                                                        | no                   | Conditions for conditional execution                                                                                                                     |
| `options.description`      | `string`                                                        | no                   | Human-readable description                                                                                                                               |
| `options.matrix`           | `Matrix`                                                        | no                   | Matrix configuration for job expansion                                                                                                                   |
| `options.include`          | `MatrixInclude[]`                                               | no                   | Additional matrix combinations                                                                                                                           |
| `options.exclude`          | `MatrixExclude[]`                                               | no                   | Matrix combinations to remove                                                                                                                            |
| `options.checkout`         | `boolean`                                                       | no (default: `true`) | When `false`, agent skips git clone. Useful for deploy/notify jobs.                                                                                      |
| `options.container`        | `string \| ContainerConfig`                                     | no                   | Docker image for job execution. String form is the image name; object form adds `env`. All steps run inside the container.                               |
| `options.environment`      | `string \| ((event) => string \| Promise<string>)`              | no                   | Deployment environment for this job. Static string or async/dynamic function -- see [Dynamic values](../dynamic-values.md).                              |
| `options.env`              | `Record<string, string> \| ((event) => Record<string, string>)` | no                   | Environment variables. Static object or async/dynamic function -- see [Dynamic values](../dynamic-values.md).                                            |
| `options.concurrencyGroup` | `string \| ((event) => string \| Promise<string>)`              | no                   | Concurrency group name (defaults to environment name) -- see [Concurrency](../concurrency.md).                                                           |
| `options.onCancel`         | `HookInput`                                                     | no                   | Hook that runs when the job is cancelled                                                                                                                 |
| `options.cleanup`          | `HookInput`                                                     | no                   | Hook that always runs after completion                                                                                                                   |
| `options.onSuccess`        | `HookInput`                                                     | no                   | Hook that runs when the job succeeds                                                                                                                     |
| `options.onFailure`        | `HookInput`                                                     | no                   | Hook that runs when the job fails                                                                                                                        |
| `options.beforeStep`       | `HookInput`                                                     | no                   | Hook that runs before each step                                                                                                                          |
| `options.afterStep`        | `HookInput`                                                     | no                   | Hook that runs after each step                                                                                                                           |
| `options.gracePeriod`      | `number`                                                        | no                   | Seconds before SIGKILL after SIGTERM during cancellation -- see [Hooks](../hooks.md#hook-timeout).                                                       |
| `options.timeout`          | `number`                                                        | no                   | Total job wall-clock timeout in milliseconds (init + all steps + hooks). On breach the job is aborted and reported timed out. See [Timeouts](#timeouts). |
| `options.resources`        | `ResourceRequest`                                               | no                   | Per-job CPU / memory request and limit. See [Per-job resources](#per-job-resources) below.                                                               |
| `options.init`             | `GenericInitConfig \| GenericInitConfig[] \| false`             | no                   | Per-job initialization run after clone, before steps -- provisions a toolchain. See [Per-job init](#per-job-init) below.                                 |

**Returns:** `Job` -- an immutable job definition.

```typescript
// Named job
const build = job('build', {
  runsOn: 'linux',
  steps: [checkout, install, compile],
  needs: [lint],
});

// Anonymous job (auto-generated UUID name)
const anonymousBuild = job({
  runsOn: 'linux',
  steps: [checkout, install],
});
```

#### runsOn forms

The `runsOn` parameter accepts three forms for targeting agents:

```typescript
// 1. Simple string -- agent must have this label
runsOn: 'linux'

// 2. Array of required labels -- agent must have ALL labels
runsOn: ['linux', 'docker']

// 3. Object form with exclusions -- agent must have ALL required labels
//    and NONE of the excluded labels
runsOn: { labels: ['linux', 'docker'], exclude: ['gpu'] }
```

**Semantics:**

- **Required labels:** The agent must have every label in the `labels` array (or the string/array form).
- **Excluded labels:** The agent must NOT have any label in the `exclude` array. This includes auto-derived labels like `kici:arch:arm64`, `kici:os:linux`, etc.
- **Compile-time validation:** The compiler will error if any label appears in both `labels` and `exclude` (overlap detection).
- **Operator-declared mandatory labels:** Operators may mark a scaler with `mandatoryLabels` (Kubernetes-taint-style opt-in). When a scaler declares a mandatory label, a job is only allowed to land on it if `runsOn.labels` includes that label. A workflow targeting such a scaler must explicitly list the mandatory label in `runsOn`. See the [auto-scaler mandatory labels](../../operator/orchestrator/auto-scaler/common-config.md#mandatory--exclude-labels) for details.

```typescript
// Route to any Linux agent that does NOT have the 'gpu' label
const build = job('build', {
  runsOn: { labels: ['linux'], exclude: ['gpu'] },
  steps: [checkout, compile],
});

// Route to arm64 Linux agents, excluding those with 'staging' label
const deploy = job('deploy', {
  runsOn: { labels: ['linux', 'arch:arm64'], exclude: ['staging'] },
  steps: [deployStep],
});
```
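
The matching semantics can be summarised as a predicate over an agent's label set. This is a sketch of the rules listed above, not the orchestrator's scheduler: `RunsOnSketch` is a local stand-in for the SDK's `RunsOn` type, and `agentMatches` and `findOverlap` are hypothetical helper names.

```typescript
type RunsOnSketch = string | string[] | { labels: string[]; exclude?: string[] };

function normalizeRunsOn(runsOn: RunsOnSketch): { labels: string[]; exclude: string[] } {
  if (typeof runsOn === 'string') return { labels: [runsOn], exclude: [] };
  if (Array.isArray(runsOn)) return { labels: runsOn, exclude: [] };
  return { labels: runsOn.labels, exclude: runsOn.exclude ?? [] };
}

/** True when the agent has ALL required labels, NONE of the excluded ones,
 *  and the job explicitly lists every mandatory label the scaler declares. */
function agentMatches(
  runsOn: RunsOnSketch,
  agentLabels: string[],
  mandatoryLabels: string[] = [],
): boolean {
  const { labels, exclude } = normalizeRunsOn(runsOn);
  const has = new Set(agentLabels);
  return (
    labels.every((l) => has.has(l)) &&
    !exclude.some((l) => has.has(l)) &&
    mandatoryLabels.every((l) => labels.includes(l))
  );
}

/** Labels that appear in both lists; the compiler is described as rejecting any overlap. */
function findOverlap(runsOn: RunsOnSketch): string[] {
  const { labels, exclude } = normalizeRunsOn(runsOn);
  return labels.filter((l) => exclude.includes(l));
}
```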

### step(name, run) / step(name, options)

Create a step with a run function or with typed outputs.

```typescript
// Simple form (no outputs)
function step(name: string, run: StepRunFn): Step;

// Full form (with outputs)
function step<TOutputs extends OutputSchema>(
  name: string,
  options: StepOptions<TOutputs>,
): Step<TOutputs>;
```

**Simple form:**

```typescript
const checkout = step('checkout', async ({ $ }) => {
  await $`git checkout`;
});
```

**With typed outputs:**

```typescript
import { step, z } from '@kici-dev/sdk';

const build = step('build', {
  outputs: {
    version: z.string(),
    artifacts: z.array(z.string()),
  },
  run: async ({ $ }) => {
    await $`pnpm build`;
    return { version: '1.0.0', artifacts: ['dist/main.js'] };
  },
});
```

**StepRunFn type:** `(ctx: StepContext) => Promise<void>`

### Per-job resources

`options.resources` declares the CPU and memory the job needs. The orchestrator's auto-scaler uses these numbers to:

1. **Bill against capacity caps** (`requests`). Decides whether the job can be admitted under the per-scaler `maxAgents`, per-scaler `resourceCap`, orchestrator-wide `globalResourceCap`, and machine-pool caps.
2. **Enforce kernel limits** (`limits`). Sets the cgroup `memory.max` and CPU quota on the running container / VM / scope.

The shape mirrors Kubernetes:

```typescript
const heavy = job('build', {
  runsOn: 'linux',
  resources: {
    requests: { memory: '2g', cpus: 1 },
    limits: { memory: '4g', cpus: 2 },
  },
  steps: [...],
});
```

Four input shapes are accepted; all normalise to the same `{ requests, limits }` pair:

| Shape          | Example                                                    | Effective behavior                               |
| -------------- | ---------------------------------------------------------- | ------------------------------------------------ |
| Both           | `{ requests: { memory: '2g' }, limits: { memory: '4g' } }` | Used as-is                                       |
| Request only   | `{ requests: { memory: '2g' } }`                           | `limits` mirrors the request                     |
| Limit only     | `{ limits: { memory: '4g' } }`                             | `requests` mirrors the limit                     |
| Flat shorthand | `{ memory: '2g', cpus: 1 }`                                | Both `requests` and `limits` set to these values |

Memory accepts container-style suffixes: `512m`, `4g`, `2048k`. CPUs are fractional cores (`0.5`, `2`).
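
The normalisation in the table can be sketched as follows. The types and function names are hypothetical stand-ins, not the SDK's `ResourceRequest`. Two assumptions are made: a missing side mirrors the other side as a whole object, and memory suffixes are binary multiples (as in Docker), which the text above does not state explicitly.

```typescript
interface Amounts {
  memory?: string;
  cpus?: number;
}

type ResourceInput = { requests?: Amounts; limits?: Amounts } | Amounts;

function normalizeResources(input: ResourceInput): { requests: Amounts; limits: Amounts } {
  if ('requests' in input || 'limits' in input) {
    const { requests, limits } = input as { requests?: Amounts; limits?: Amounts };
    // A missing side mirrors the side that was provided.
    return { requests: requests ?? limits ?? {}, limits: limits ?? requests ?? {} };
  }
  // Flat shorthand: the same values apply to both sides.
  const flat = input as Amounts;
  return { requests: flat, limits: flat };
}

/** Parse '512m', '4g', '2048k' into bytes, assuming binary (1024-based) units. */
function parseMemoryBytes(value: string): number {
  const match = /^(\d+(?:\.\d+)?)([kmg])$/i.exec(value.trim());
  if (!match) {
    throw new Error(`unrecognised memory value: ${value}`);
  }
  const units: Record<string, number> = { k: 1024, m: 1024 ** 2, g: 1024 ** 3 };
  return Number(match[1]) * units[match[2].toLowerCase()];
}
```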

If `resources` is omitted, the job inherits the matched scaler's label-set or default resources (configured by the operator in `scalers.yaml`). This keeps existing workflows behaving as they did before per-job resources existed.

Per-backend kernel enforcement of `limits`:

- **Container backend** (Docker / Podman): always enforced via cgroup.
- **Firecracker backend:** always enforced. Fractional CPU rounds up to the nearest integer vCPU.
- **Bare-metal backend:** advisory by default — the scaler caps still apply, but no cgroup is created. Operators can opt in to kernel enforcement via `enforceCgroups: true` on the scaler entry.

### Per-job init

`options.init` declares a hand-written command that runs **after the repo is cloned and before the job's steps execute**. Its purpose is to provision a repo-declared toolchain (a `mise` toolchain, a custom setup script, a language runtime) and put it on the environment every subsequent step sees.

```typescript
import { workflow, job, step, push } from '@kici-dev/sdk';

export const build = workflow('build', {
  on: [push()],
  jobs: [
    job('build', {
      runsOn: 'linux',
      init: {
        run: `
          set -euo pipefail
          command -v mise >/dev/null || curl -fsSL https://mise.run | sh
          export PATH="$HOME/.local/bin:$PATH"
          mise install
          mise env -s bash | sed -n 's/^export //p' >> "$KICI_ENV"
          echo "$HOME/.local/share/mise/shims" >> "$KICI_PATH"
        `,
        cache: { key: 'mise-jq-1.7.1', paths: ['~/.local/share/mise'] },
        timeout: 600_000,
      },
      steps: [
        step('show-jq-version', async (ctx) => {
          // jq is on PATH because the init phase appended the mise shims dir to $KICI_PATH.
          const { stdout } = await ctx.$`jq --version`;
          ctx.log.info(`jq version: ${stdout.trim()}`);
        }),
      ],
    }),
  ],
});
```

**`GenericInitConfig` shape:**

| Field     | Type                    | Required | Description                                                                                                                                    |
| --------- | ----------------------- | -------- | ---------------------------------------------------------------------------------------------------------------------------------------------- |
| `run`     | `string`                | yes      | Command run after clone, before steps. Runs in the job's sandbox at the clone root. Must be a non-empty command.                               |
| `shell`   | `string`                | no       | Shell used to run `run`. Defaults to `bash`.                                                                                                   |
| `cache`   | `CacheSpec`             | no       | Cache spec for binaries the command installs -- restored before the command, saved after on a key miss. See [Caching](./caching.md).           |
| `timeout` | `number`                | no       | Max wall-clock for this init command in milliseconds. Defaults to 10 minutes. On breach the init is aborted and the job is reported timed out. |
| `env`     | `Record<string,string>` | no       | Static environment variables available to the command.                                                                                         |

**The `$KICI_ENV` / `$KICI_PATH` handoff.** The init command does not mutate the step environment directly. Instead it writes what it wants visible to later steps to two files the agent allocates and exposes as environment variables:

- **`$KICI_ENV`** -- append one `KEY=value` line per environment variable. The agent reads the file after the command and makes each variable available to every subsequent step.
- **`$KICI_PATH`** -- append one directory per line. The agent prepends each directory to `PATH` for every subsequent step.

The agent reads both files after the command succeeds, applies the delta, and the resulting environment is visible to all steps that follow (and to any later init command).
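
To make the handoff concrete, the sketch below applies the contents of the two files to a base environment. It is a simplified model of the behaviour described above, not the agent's code: `applyInitDelta` is a hypothetical name, directories are assumed to keep their file order when prepended, and quoted values are passed through untouched because quote handling is not specified here.

```typescript
function applyInitDelta(
  baseEnv: Record<string, string>,
  envFileContents: string,
  pathFileContents: string,
): Record<string, string> {
  const env: Record<string, string> = { ...baseEnv };

  // $KICI_ENV: one KEY=value per line; only the first '=' separates key from value.
  for (const line of envFileContents.split('\n')) {
    const eq = line.indexOf('=');
    if (line.trim() === '' || eq <= 0) continue;
    env[line.slice(0, eq)] = line.slice(eq + 1);
  }

  // $KICI_PATH: one directory per line, prepended to PATH.
  const dirs = pathFileContents
    .split('\n')
    .map((l) => l.trim())
    .filter((l) => l !== '');
  if (dirs.length > 0) {
    env.PATH = [...dirs, ...(env.PATH ? [env.PATH] : [])].join(':');
  }
  return env;
}
```

With an array of inits, the same function could be applied once per init, feeding each result in as the next `baseEnv`.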

**Failure before steps.** If the init command exits non-zero or exceeds its `timeout`, the job **fails before any step runs** -- the init surfaces as a failed `init:<n>` pseudo-step in the run timeline (alongside the step list), its logs are attached, and the step loop never executes. This makes a broken toolchain a clear, early failure rather than a confusing mid-run error.

**Arrays run in order.** Passing `GenericInitConfig[]` runs the inits sequentially; each one's `$KICI_ENV` / `$KICI_PATH` delta is applied before the next runs, so a later init sees an earlier init's tools on `PATH`. The first init to fail stops the sequence and fails the job.

**`init: false`** is an explicit opt-out (reserved for a future auto-detect layer); it behaves the same as omitting `init`.

## Step & job authoring patterns

KiCI supports several authoring patterns for steps and jobs to reduce boilerplate and improve developer experience.

### Bare function steps

Async functions are accepted directly in a job's `steps` array without wrapping in `step()`. They receive auto-generated counter names (`step-1`, `step-2`) at compile time. Return values are captured at runtime.

```typescript
const myJob = job('example', {
  runsOn: 'default',
  steps: [
    async (ctx) => {
      ctx.log.info('hello from bare function');
    },
    step('named', async (ctx) => {
      // Named steps keep their explicit name
    }),
    async (ctx) => {
      // This becomes step-2 (counter skips named steps)
      return { value: 42 };
    },
  ],
});
```

### Id-less steps and jobs

Steps and jobs can be created without a name. The compiler assigns counter-based IDs at compile time.

**Id-less steps:**

```typescript
// Id-less step with just a function
const s = step(async (ctx) => {
  await ctx.$`echo hello`;
});

// Id-less step with full options
const withOptions = step({
  run: async (ctx) => {
    return { version: '1.0.0' };
  },
  timeout: 60000,
});
```

**Id-less jobs:**

```typescript
const deploy = job({
  runsOn: 'default',
  steps: [step('deploy', async (ctx) => { ... })],
});
// deploy.name is a UUID at definition time, replaced with job-1 at compile time
```

### Step output types

Steps have three output tiers:

| Tier | Syntax                         | Naming           | TypeScript Type       | Zod Validation |
| ---- | ------------------------------ | ---------------- | --------------------- | -------------- |
| 1    | Bare function                  | Auto (`step-N`)  | Inferred return type  | No             |
| 2    | `step(name, fn)` or `step(fn)` | Explicit or auto | Inferred return type  | No             |
| 3    | `step(name, { outputs, run })` | Explicit or auto | Inferred + Zod schema | Yes (runtime)  |

```typescript
import { step, z } from '@kici-dev/sdk';

// Tier 3: step with Zod outputs (validated at runtime)
const build = step('build', {
  outputs: {
    version: z.string(),
    artifact: z.string(),
  },
  run: async (ctx) => {
    return { version: '2.0.0', artifact: 'dist/main.js' };
  },
});
```

### Single-step job shorthand

Use the `run` property as an alternative to `steps` for jobs with a single step:

```typescript
const deploy = job('deploy', {
  runsOn: 'default',
  run: async (ctx) => {
    ctx.log.info('Deploying...');
    return { url: 'https://app.example.com' };
  },
});
```

The `run` function is stored as the job's only step with an auto-generated name (`step-1`). `run` and `steps` are mutually exclusive -- providing both throws an error.

### Timeouts

`timeout` (milliseconds) can be set at three levels. Each level caps **its own scope** independently — a workflow or job timeout is a separate wall-clock cap, **not** a default that flows down to steps.

| Level        | Field                        | Caps                                                   | Enforced by      | On breach                                                         |
| ------------ | ---------------------------- | ------------------------------------------------------ | ---------------- | ----------------------------------------------------------------- |
| **step**     | `step(..., { timeout })`     | A single step's wall-clock.                            | the agent        | The step fails. With no `timeout` set, the cap is 30 minutes.     |
| **job**      | `job(..., { timeout })`      | The job's total wall-clock (init + all steps + hooks). | the agent        | The job is aborted and reported failed with a "timed out" reason. |
| **workflow** | `workflow(..., { timeout })` | The whole run's wall-clock across all jobs.            | the orchestrator | The run is cancelled with a "timed out" reason.                   |

```typescript
export default workflow('ci', {
  timeout: 1_800_000, // whole run must finish within 30 minutes
  jobs: [
    job('build', {
      runsOn: 'linux',
      timeout: 600_000, // this job (init + steps + hooks) within 10 minutes
      steps: [
        step('compile', {
          timeout: 120_000, // this single step within 2 minutes
          run: async (ctx) => {
            await ctx.$`make build`;
          },
        }),
      ],
    }),
  ],
});
```

**Precedence: each level caps its own scope.** The three timeouts are independent caps, not a fallback chain:

- A **step** with no `timeout` falls back to the 30-minute agent default, regardless of the job or workflow timeout. A job timeout never becomes a step's default.
- A **job** `timeout` bounds the job's total wall-clock: its init, every step (each still subject to its own per-step timeout), and its hooks. It does not change any step's individual cap.
- A **workflow** `timeout` is a run-level deadline. The orchestrator records it when the run starts and cancels the run if its wall-clock exceeds the timeout, even when individual jobs and steps are still within their own caps.

Workflow and job timeouts surface with a distinct "timed out" reason so the dashboard labels the run or job as timed out rather than a generic failure or cancel.
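
The "no fallback chain" rule can be expressed as a small model. This is a sketch of the documented behavior, not SDK code: `TimeoutConfig` and `resolveCaps` are hypothetical names, and an `undefined` job or workflow cap simply means none was configured in the example.

```typescript
// Illustrative model of the scoping rules above (not SDK code).
const DEFAULT_STEP_TIMEOUT_MS = 30 * 60 * 1000; // the 30-minute agent default

interface TimeoutConfig {
  workflow?: number;
  job?: number;
  step?: number;
}

interface ResolvedCaps {
  step: number;
  job: number | undefined;
  workflow: number | undefined;
}

// Each level reads only its own field; nothing flows down to the step.
function resolveCaps(config: TimeoutConfig): ResolvedCaps {
  return {
    step: config.step ?? DEFAULT_STEP_TIMEOUT_MS,
    job: config.job,
    workflow: config.workflow,
  };
}

const caps = resolveCaps({ workflow: 3_600_000, job: 600_000 });
console.log(caps.step); // 1800000 (the agent default, not the job's 600000)
console.log(caps.job); // 600000
```

In this configuration the job's 10-minute cap would still abort the job before the step's own 30-minute cap is reached, because the two caps are enforced independently.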

### Output chaining

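Steps and jobs can read outputs from preceding steps and jobs in two ways: through the `.result` proxy on a step or job reference, or through an explicit context method (`ctx.outputsOf()` within a job, `ctx.jobOutputs()` across jobs).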

**Within-job output chaining:**

```typescript
const buildStep = step('build', async (ctx) => {
  return { version: '2.0.0' };
});

const lint = async (ctx) => {
  return { warnings: 0 };
};

const pipeline = job('pipeline', {
  runsOn: 'default',
  steps: [
    buildStep,
    lint,
    step(async (ctx) => {
      // Pattern 1: .result proxy on Step objects
      const version = buildStep.result.version;

      // Pattern 2: ctx.outputsOf() for Step or bare function references
      const lintOutputs = ctx.outputsOf(lint);
      console.log(lintOutputs.warnings); // 0
    }),
  ],
});
```

**Cross-job output chaining:**

```typescript
const setup = job('setup', {
  runsOn: 'default',
  run: async (ctx) => {
    return { env: 'production' };
  },
});

const build = job('build', {
  runsOn: 'default',
  needs: [setup],
  steps: [
    step('compile', async (ctx) => {
      return { version: '2.0.0' };
    }),
  ],
});

const deploy = job('deploy', {
  runsOn: 'default',
  needs: [build],
  steps: [
    step(async (ctx) => {
      // Multi-step job: jobRef.result.stepName.field
      const version = build.result.compile.version;

      // Single-step job (run shorthand): jobRef.result.field
      const env = setup.result.env;

      // Explicit context method
      const buildOutputs = ctx.jobOutputs(build);
    }),
  ],
});
```

**Access patterns summary:**

| Pattern                        | Scope                     | Notes                         |
| ------------------------------ | ------------------------- | ----------------------------- |
| `stepRef.result.field`         | Within-job                | Proxy on Step object          |
| `ctx.outputsOf(stepRef)`       | Within-job                | Works with bare function refs |
| `jobRef.result.stepName.field` | Cross-job (multi-step)    | Proxy on Job object           |
| `jobRef.result.field`          | Cross-job (run shorthand) | Flat for single-step jobs     |
| `ctx.jobOutputs(jobRef)`       | Cross-job                 | Explicit context method       |

**Important:** `needs` must be declared explicitly. Output chaining does not auto-infer dependencies -- you must list job dependencies in `needs` even if you access their outputs via `.result`.

Cross-job output chaining works in both local test mode (`kici test`) and remote pipeline execution. The orchestrator's needs-aware dispatch scheduler guarantees upstream jobs reach a terminal state before downstream jobs dispatch, and upstream outputs are transported to the downstream agent sandbox via the `upstreamJobOutputs` field on `job.dispatch`. See [needs-scheduler](../../architecture/execution/needs-scheduler.md) for the full dispatch semantics.

### Job dependencies (`needs`)

The `needs` array accepts four entry forms. Mix freely within the same array.

```typescript
// 1. Reference by Job object (type-safe, preferred)
const test = job('test', { needs: [lint], ... });

// 2. Reference by string name
const test = job('test', { needs: ['lint'], ... });

// 3. Object form with per-edge failure policy
const cleanup = job('cleanup', {
  needs: [{ name: 'build', ifFailed: 'run' }],
  ...
});

// 4. Dynamic group reference (for static jobs that depend on a dynamicJob group)
const deploy = job('deploy', {
  needs: [dynamicGroup('test-shards')],
  ...
});
```

**Failure policy (`ifFailed`):** controls what happens to a downstream job when an upstream reaches a non-success terminal state (`failed`, `cancelled`, `drift_dropped`).

| Value  | Behavior                                                                                    |
| ------ | ------------------------------------------------------------------------------------------- |
| `skip` | (Default) Downstream transitions directly to `skipped`. Failures cascade through the DAG.   |
| `run`  | Downstream dispatches anyway. Use for cleanup, notification, or "always-run" teardown jobs. |

String and `Job`-reference entries default to `ifFailed: 'skip'`. To override, use the object form (`{ name, ifFailed }` for static upstreams, `{ group, ifFailed }` for dynamic groups -- `dynamicGroup(name, { ifFailed: 'run' })` produces the latter).

**Dispatch gate:** `needs` is a hard dispatch gate. A job only dispatches after every upstream in its `needs` array reaches a terminal state (success, or failure with `ifFailed: 'run'`). Root jobs (empty `needs`, no dynamic group refs) dispatch immediately. The scheduler is DB-backed and fully recovers across orchestrator restarts.
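
The gate and the `ifFailed` policy combine roughly as sketched below. This is an illustration under stated assumptions, not the orchestrator's scheduler: `gate`, `NeedEdge`, and the `pending` state name are hypothetical, and the sketch assumes a downstream is skipped as soon as any upstream with `ifFailed: 'skip'` reaches a non-success terminal state.

```typescript
// Illustrative gate model (not the orchestrator's scheduler).
// 'pending' is a hypothetical stand-in for any non-terminal state.
type UpstreamState = 'pending' | 'success' | 'failed' | 'cancelled' | 'drift_dropped';

interface NeedEdge {
  state: UpstreamState;
  ifFailed: 'skip' | 'run';
}

type GateDecision = 'wait' | 'dispatch' | 'skip';

function gate(needs: NeedEdge[]): GateDecision {
  // Assumption: one non-success upstream with ifFailed 'skip' decides the outcome.
  const blocked = needs.some(
    (need) => need.state !== 'pending' && need.state !== 'success' && need.ifFailed === 'skip',
  );
  if (blocked) return 'skip';
  // Otherwise every upstream must be terminal before dispatch.
  if (needs.some((need) => need.state === 'pending')) return 'wait';
  return 'dispatch';
}

console.log(gate([])); // dispatch (root job)
console.log(gate([{ state: 'failed', ifFailed: 'run' }])); // dispatch
console.log(gate([{ state: 'failed', ifFailed: 'skip' }])); // skip
console.log(gate([{ state: 'success', ifFailed: 'skip' }, { state: 'pending', ifFailed: 'run' }])); // wait
```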

**DAG validation:** cycles are detected in three layers.

1. Compile time: `validateDag` (see below) catches static-to-static cycles.
2. Eval time: after dynamic jobs are generated, a full topological sort runs on the resolved graph. Cycles reject the run with a clear error.
3. Runtime: a defensive invariant check flags stuck jobs as an internal-bug backstop.
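
A topological sort that rejects cycles, as in the eval-time layer, can be sketched with Kahn's algorithm. This is a generic illustration and not the KiCI implementation: the `Graph` shape (job name mapped to the names it needs) and the `topoSort` helper are assumptions made for the example.

```typescript
// Generic Kahn's-algorithm sketch (not KiCI's validator).
// Hypothetical graph shape: job name -> names listed in its `needs`.
type Graph = Record<string, string[]>;

function topoSort(graph: Graph): string[] {
  const indegree = new Map<string, number>();
  const dependents = new Map<string, string[]>();
  for (const [job, needs] of Object.entries(graph)) {
    indegree.set(job, needs.length);
    for (const upstream of needs) {
      if (!Object.prototype.hasOwnProperty.call(graph, upstream)) {
        throw new Error(`unknown job in needs: ${upstream}`);
      }
      dependents.set(upstream, [...(dependents.get(upstream) ?? []), job]);
    }
  }

  // Start from root jobs (no upstreams) and release dependents as they clear.
  const ready = Array.from(indegree.entries())
    .filter(([, count]) => count === 0)
    .map(([job]) => job);
  const order: string[] = [];
  while (ready.length > 0) {
    const job = ready.shift() as string;
    order.push(job);
    for (const next of dependents.get(job) ?? []) {
      const remaining = (indegree.get(next) ?? 0) - 1;
      indegree.set(next, remaining);
      if (remaining === 0) ready.push(next);
    }
  }

  // Any job never released sits on a cycle or downstream of one.
  if (order.length !== indegree.size) {
    const stuck = Array.from(indegree.entries())
      .filter(([, count]) => count > 0)
      .map(([job]) => job);
    throw new Error(`cycle detected; unresolved jobs: ${stuck.join(', ')}`);
  }
  return order;
}

console.log(topoSort({ lint: [], test: ['lint'], deploy: ['test'] }).join(' -> '));
// lint -> test -> deploy
```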

### dynamicGroup(name, options?)

Create a reference to a dynamic job group, for use inside a static job's `needs` array.

```typescript
function dynamicGroup(name: string, options?: { ifFailed?: 'skip' | 'run' }): DynamicGroupRef;
```

Use when a static downstream must wait for every generated job tagged with a given group name to reach a terminal state. If the dynamic group produces zero jobs, the group counts as satisfied and the downstream does not wait on it.

```typescript
const shardedTests = dynamicJob('test-shards', async (ctx) => {
  return ctx.shardIndices.map((i) =>
    job(`test-shard-${i}`, { runsOn: 'linux', run: async () => {} }),
  );
});

const deploy = job('deploy', {
  runsOn: 'linux',
  needs: [dynamicGroup('test-shards')],
  run: async () => {
    // Runs after ALL test-shards jobs have reached a terminal state
  },
});
```

### dynamicJob(groupName, fn)

Tag a dynamic job generator function with a group name so other jobs can reference it via `dynamicGroup()`.

```typescript
function dynamicJob(groupName: string, fn: DynamicJobFn): DynamicJobFn;
```

The generator runs twice: once in the init phase (to register expected job names) and once inside the executing agent (to produce the actual jobs). Mismatches between the two evaluations are detected as determinism drift -- see [dynamic-jobs](../../architecture/execution/dynamic-jobs.md).

### Auto-generated IDs

Unnamed steps and jobs receive counter-based IDs at compile time:

- **Steps:** `step-1`, `step-2`, etc. Counter is scoped per job and only increments for unnamed entries. Named steps do not consume counter values.
- **Jobs:** `job-1`, `job-2`, etc. Counter is scoped per workflow and only increments for unnamed entries.

These IDs are stable as long as the order of unnamed entries does not change. Adding or removing unnamed entries shifts subsequent IDs.
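
The counter rules can be sketched as below. This is an illustration of the documented naming behavior, not the compiler's code; `assignIds` is a hypothetical helper, and `undefined` marks an unnamed entry.

```typescript
// Illustrative model of counter-based IDs (not compiler source).
// `undefined` marks an unnamed entry; named entries never consume a counter value.
function assignIds(names: Array<string | undefined>, prefix: 'step' | 'job'): string[] {
  let counter = 0;
  return names.map((name) => name ?? `${prefix}-${++counter}`);
}

console.log(assignIds([undefined, 'named', undefined], 'step').join(', '));
// step-1, named, step-2

// Inserting an unnamed entry shifts every later auto-generated ID.
console.log(assignIds([undefined, undefined, 'named', undefined], 'step').join(', '));
// step-1, step-2, named, step-3
```

Because inserting an unnamed entry renumbers the ones after it, giving a step or job an explicit name is the way to keep its ID fixed.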

---

## Event payload reference

Source: https://docs.kici.dev/user/sdk/event-payloads/

<!-- Generated by scripts/docs-gen-event-payloads.ts — do not edit by hand. Regenerate: pnpm docs:gen:events -->

## The envelope

The normalized event envelope is the single event contract in KiCI. Rules receive it as `ctx.event`, and every dynamic function — `environment:`, `env:`, and `concurrencyGroup:` resolvers, generated jobs, and a workflow's `concurrency.group` — receives the same envelope as its argument.

Narrow on the `type` field to branch per trigger kind (`if (event.type === 'push')`). The raw provider webhook body is nested at `payload`; the typed variants below describe its shape per event type.

These fields are present on every envelope (the `EventBase` shape):

| Field             | Type                      | Description                                                                     |
| ----------------- | ------------------------- | ------------------------------------------------------------------------------- |
| `type`            | `string`                  | Normalized event type discriminant.                                             |
| `action?`         | `string`                  | Sub-action (e.g. 'opened', 'created', 'submitted').                             |
| `targetBranch?`   | `string`                  | Target branch (push target, PR base, or default branch).                        |
| `sourceBranch?`   | `string`                  | Source branch (PR head branch). Only set for PR-like events.                    |
| `provider?`       | `string`                  | Provider that originated this event.                                            |
| `isForkPR?`       | `boolean`                 | Whether this PR comes from a fork. Only set for PR-like events.                 |
| `baseBranch?`     | `string`                  | Base branch ref for PR events.                                                  |
| `senderUsername?` | `string`                  | Sender username from the webhook payload.                                       |
| `sourceRepo?`     | `string`                  | Repository identifier (e.g. "owner/repo").                                      |
| `changedFiles?`   | `string[]`                | Files changed in this event (for path filtering).                               |
| `payload?`        | `Record<string, unknown>` | Raw webhook payload from the provider. May be absent in flattened event forms.  |
| `[key: string]`   | `unknown`                 | Index signature for backward compatibility — untyped fields resolve to unknown. |
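
Narrowing on `type` might look like the sketch below. The interfaces are local stand-ins that mirror a few documented fields so the example is self-contained; they are not the SDK's exported types, and `SampleEvent` and `describe` are hypothetical names.

```typescript
// Local stand-ins mirroring a subset of the documented envelope (not SDK exports).
interface EnvelopeBase {
  type: string;
  targetBranch?: string;
  isForkPR?: boolean;
  payload?: Record<string, unknown>;
}

interface PushEnvelope extends EnvelopeBase {
  type: 'push';
  payload?: { ref: string; forced?: boolean; [key: string]: unknown };
}

interface PullRequestEnvelope extends EnvelopeBase {
  type: 'pull_request';
  payload?: { number: number; [key: string]: unknown };
}

type SampleEvent = PushEnvelope | PullRequestEnvelope;

function describe(event: SampleEvent): string {
  const branch = event.targetBranch ?? 'unknown branch';
  if (event.type === 'push') {
    // Inside this branch, payload has the push shape.
    const forced = event.payload?.forced ? ' (forced)' : '';
    return `push to ${branch}${forced}`;
  }
  // Only 'pull_request' remains in this two-member union.
  const origin = event.isForkPR ? 'fork' : 'same-repo';
  return `${origin} PR #${event.payload?.number ?? '?'} into ${branch}`;
}

console.log(describe({ type: 'push', targetBranch: 'main', payload: { ref: 'refs/heads/main', forced: true } }));
// push to main (forced)
```

Because `payload` is optional on the envelope, the sketch reads it with optional chaining rather than assuming it is present.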

## Event types

One section per member of the `EventPayload` union. The heading is the `type` literal; the table lists the fields of that event's `payload` property when it declares a typed shape.

### `pull_request`

Carried by `PullRequestEventPayload`. The `payload` property has the following shape:

| Field           | Type                | Description |
| --------------- | ------------------- | ----------- |
| `action`        | `string`            |             |
| `number`        | `number`            |             |
| `pull_request`  | `GitHubPullRequest` |             |
| `repository`    | `GitHubRepository`  |             |
| `sender`        | `GitHubUser`        |             |
| `[key: string]` | `unknown`           |             |

### `push`

Carried by `PushEventPayload`. The `payload` property has the following shape:

| Field           | Type               | Description |
| --------------- | ------------------ | ----------- |
| `ref`           | `string`           |             |
| `after`         | `string`           |             |
| `before`        | `string`           |             |
| `head_commit?`  | `GitHubCommit`     |             |
| `commits?`      | `GitHubCommit[]`   |             |
| `repository`    | `GitHubRepository` |             |
| `sender?`       | `GitHubUser`       |             |
| `forced?`       | `boolean`          |             |
| `[key: string]` | `unknown`          |             |

### `tag`

Carried by `TagEventPayload`. The `payload` property has the following shape:

| Field           | Type               | Description |
| --------------- | ------------------ | ----------- |
| `ref`           | `string`           |             |
| `after`         | `string`           |             |
| `repository`    | `GitHubRepository` |             |
| `sender?`       | `GitHubUser`       |             |
| `[key: string]` | `unknown`          |             |

### `comment`

Carried by `CommentEventPayload`. The `payload` property has the following shape:

| Field           | Type                                                                                 | Description |
| --------------- | ------------------------------------------------------------------------------------ | ----------- |
| `action`        | `string`                                                                             |             |
| `comment`       | `GitHubComment`                                                                      |             |
| `issue?`        | `{ number: number; title?: string; pull_request?: unknown; [key: string]: unknown }` |             |
| `repository`    | `GitHubRepository`                                                                   |             |
| `sender`        | `GitHubUser`                                                                         |             |
| `[key: string]` | `unknown`                                                                            |             |

### `review`

Carried by `ReviewEventPayload`. The `payload` property has the following shape:

| Field           | Type                | Description |
| --------------- | ------------------- | ----------- |
| `action`        | `string`            |             |
| `review`        | `GitHubReview`      |             |
| `pull_request`  | `GitHubPullRequest` |             |
| `repository`    | `GitHubRepository`  |             |
| `sender`        | `GitHubUser`        |             |
| `[key: string]` | `unknown`           |             |

### `review_comment`

Carried by `ReviewCommentEventPayload`. The `payload` property has the following shape:

| Field           | Type                | Description |
| --------------- | ------------------- | ----------- |
| `action`        | `string`            |             |
| `comment`       | `GitHubComment`     |             |
| `pull_request`  | `GitHubPullRequest` |             |
| `repository`    | `GitHubRepository`  |             |
| `sender`        | `GitHubUser`        |             |
| `[key: string]` | `unknown`           |             |

### `release`

Carried by `ReleaseEventPayload`. The `payload` property has the following shape:

| Field           | Type               | Description |
| --------------- | ------------------ | ----------- |
| `action`        | `string`           |             |
| `release`       | `GitHubRelease`    |             |
| `repository`    | `GitHubRepository` |             |
| `sender`        | `GitHubUser`       |             |
| `[key: string]` | `unknown`          |             |

### `dispatch`

Carried by `DispatchEventPayload`. The `payload` property has the following shape:

| Field             | Type                      | Description |
| ----------------- | ------------------------- | ----------- |
| `action`          | `string`                  |             |
| `client_payload?` | `Record<string, unknown>` |             |
| `repository`      | `GitHubRepository`        |             |
| `sender?`         | `GitHubUser`              |             |
| `[key: string]`   | `unknown`                 |             |

### `create`

Carried by `CreateEventPayload`. The `payload` property has the following shape:

| Field           | Type               | Description |
| --------------- | ------------------ | ----------- |
| `ref`           | `string`           |             |
| `ref_type`      | `string`           |             |
| `repository`    | `GitHubRepository` |             |
| `sender`        | `GitHubUser`       |             |
| `[key: string]` | `unknown`          |             |

### `delete`

Carried by `DeleteEventPayload`. The `payload` property has the following shape:

| Field           | Type               | Description |
| --------------- | ------------------ | ----------- |
| `ref`           | `string`           |             |
| `ref_type`      | `string`           |             |
| `repository`    | `GitHubRepository` |             |
| `sender`        | `GitHubUser`       |             |
| `[key: string]` | `unknown`          |             |

### `status`

Carried by `StatusEventPayload`. The `payload` property has the following shape:

| Field           | Type                                              | Description |
| --------------- | ------------------------------------------------- | ----------- |
| `state`         | `string`                                          |             |
| `sha`           | `string`                                          |             |
| `context`       | `string`                                          |             |
| `description?`  | `string`                                          |             |
| `target_url?`   | `string`                                          |             |
| `branches?`     | `Array<{ name: string; [key: string]: unknown }>` |             |
| `repository`    | `GitHubRepository`                                |             |
| `sender`        | `GitHubUser`                                      |             |
| `[key: string]` | `unknown`                                         |             |

### `workflow_run`

Carried by `WorkflowRunEventPayload`. The `payload` property has the following shape:

| Field           | Type                                                                                                   | Description |
| --------------- | ------------------------------------------------------------------------------------------------------ | ----------- |
| `action`        | `string`                                                                                               |             |
| `workflow_run`  | `{ head_branch: string; name: string; conclusion?: string; status?: string; [key: string]: unknown; }` |             |
| `repository`    | `GitHubRepository`                                                                                     |             |
| `sender`        | `GitHubUser`                                                                                           |             |
| `[key: string]` | `unknown`                                                                                              |             |

### `fork`

Carried by `ForkEventPayload`. The `payload` property has the following shape:

| Field           | Type                                            | Description |
| --------------- | ----------------------------------------------- | ----------- |
| `forkee`        | `{ full_name: string; [key: string]: unknown }` |             |
| `repository`    | `GitHubRepository`                              |             |
| `sender`        | `GitHubUser`                                    |             |
| `[key: string]` | `unknown`                                       |             |

### `star`

Carried by `StarEventPayload`. The `payload` property has the following shape:

| Field           | Type               | Description |
| --------------- | ------------------ | ----------- |
| `action`        | `string`           |             |
| `repository`    | `GitHubRepository` |             |
| `sender`        | `GitHubUser`       |             |
| `[key: string]` | `unknown`          |             |

### `watch`

Carried by `WatchEventPayload`. The `payload` property has the following shape:

| Field           | Type               | Description |
| --------------- | ------------------ | ----------- |
| `action`        | `string`           |             |
| `repository`    | `GitHubRepository` |             |
| `sender`        | `GitHubUser`       |             |
| `[key: string]` | `unknown`          |             |

### `webhook`

Carried by `WebhookEventPayload`. Adds no typed fields beyond the shared envelope; `payload` is the raw provider body (`Record<string, unknown>`).

### `kici_event`

Carried by `KiciEventPayload`. Adds no typed fields beyond the shared envelope; `payload` is the raw provider body (`Record<string, unknown>`).

### `workflow_complete`

Carried by `WorkflowCompleteEventPayload`. Adds no typed fields beyond the shared envelope; `payload` is the raw provider body (`Record<string, unknown>`).

### `job_complete`

Carried by `JobCompleteEventPayload`. Adds no typed fields beyond the shared envelope; `payload` is the raw provider body (`Record<string, unknown>`).

### `generic_webhook`

Carried by `GenericWebhookEventPayload`. Adds no typed fields beyond the shared envelope; `payload` is the raw provider body (`Record<string, unknown>`).

### `schedule`

Carried by `ScheduleEventPayload`. Adds no typed fields beyond the shared envelope; `payload` is the raw provider body (`Record<string, unknown>`).

### `lifecycle`

Carried by `LifecycleEventPayload`. Adds no typed fields beyond the shared envelope; `payload` is the raw provider body (`Record<string, unknown>`).

### `rerun`

Carried by `RerunEventPayload`. Adds no typed fields beyond the shared envelope; `payload` is the raw provider body (`Record<string, unknown>`).

### `manual_schedule`

Carried by `ManualScheduleEventPayload`. Adds no typed fields beyond the shared envelope; `payload` is the raw provider body (`Record<string, unknown>`).

### `unknown`

Carried by `UnknownEventPayload`. Adds no typed fields beyond the shared envelope; `payload` is the raw provider body (`Record<string, unknown>`).

## Shared GitHub object shapes

The typed `payload` shapes above reference these partial GitHub object types. Each lists only the commonly accessed fields; the index signature on every shape resolves any other field to `unknown`.

### `GitHubRepository`

| Field            | Type                                        | Description |
| ---------------- | ------------------------------------------- | ----------- |
| `full_name`      | `string`                                    |             |
| `default_branch` | `string`                                    |             |
| `name?`          | `string`                                    |             |
| `owner?`         | `{ login: string; [key: string]: unknown }` |             |
| `private?`       | `boolean`                                   |             |
| `[key: string]`  | `unknown`                                   |             |

### `GitHubUser`

| Field           | Type      | Description |
| --------------- | --------- | ----------- |
| `login`         | `string`  |             |
| `id?`           | `number`  |             |
| `[key: string]` | `unknown` |             |

### `GitHubPullRequest`

| Field           | Type                                                                                                          | Description |
| --------------- | ------------------------------------------------------------------------------------------------------------- | ----------- |
| `number`        | `number`                                                                                                      |             |
| `draft?`        | `boolean`                                                                                                     |             |
| `title?`        | `string`                                                                                                      |             |
| `body?`         | `string`                                                                                                      |             |
| `state?`        | `string`                                                                                                      |             |
| `merged?`       | `boolean`                                                                                                     |             |
| `head`          | `{ ref: string; sha: string; repo?: { full_name: string; [key: string]: unknown }; [key: string]: unknown; }` |             |
| `base`          | `{ ref: string; repo?: { full_name: string; [key: string]: unknown }; [key: string]: unknown; }`              |             |
| `user?`         | `GitHubUser`                                                                                                  |             |
| `labels?`       | `Array<{ name: string; [key: string]: unknown }>`                                                             |             |
| `[key: string]` | `unknown`                                                                                                     |             |

### `GitHubCommit`

| Field           | Type                                                                           | Description |
| --------------- | ------------------------------------------------------------------------------ | ----------- |
| `id`            | `string`                                                                       |             |
| `message`       | `string`                                                                       |             |
| `author?`       | `{ name?: string; email?: string; username?: string; [key: string]: unknown }` |             |
| `timestamp?`    | `string`                                                                       |             |
| `added?`        | `string[]`                                                                     |             |
| `removed?`      | `string[]`                                                                     |             |
| `modified?`     | `string[]`                                                                     |             |
| `[key: string]` | `unknown`                                                                      |             |
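
Since `added`, `removed`, and `modified` are all optional, code that reads them needs defaults. The sketch below collects the paths touched across a push's `commits`; `CommitLike` is a local stand-in for the shape above and `touchedPaths` is a hypothetical helper, not how KiCI derives the envelope's `changedFiles`.

```typescript
// Local stand-in for the documented commit shape (not an SDK export).
interface CommitLike {
  id: string;
  message: string;
  added?: string[];
  removed?: string[];
  modified?: string[];
}

// Collect the unique paths touched by a list of commits, sorted for stable output.
function touchedPaths(commits: CommitLike[] = []): string[] {
  const paths = new Set<string>();
  for (const commit of commits) {
    const lists = [commit.added ?? [], commit.removed ?? [], commit.modified ?? []];
    for (const list of lists) {
      for (const path of list) paths.add(path);
    }
  }
  return Array.from(paths).sort();
}

console.log(
  touchedPaths([
    { id: 'a1', message: 'add readme', added: ['README.md'] },
    { id: 'b2', message: 'edit', modified: ['README.md', 'src/index.ts'] },
  ]).join(', '),
);
// README.md, src/index.ts
```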

### `GitHubComment`

| Field           | Type         | Description |
| --------------- | ------------ | ----------- |
| `id`            | `number`     |             |
| `body`          | `string`     |             |
| `user`          | `GitHubUser` |             |
| `[key: string]` | `unknown`    |             |

### `GitHubReview`

| Field           | Type         | Description |
| --------------- | ------------ | ----------- |
| `id`            | `number`     |             |
| `state`         | `string`     |             |
| `body?`         | `string`     |             |
| `user`          | `GitHubUser` |             |
| `[key: string]` | `unknown`    |             |

### `GitHubRelease`

| Field               | Type      | Description |
| ------------------- | --------- | ----------- |
| `id`                | `number`  |             |
| `tag_name`          | `string`  |             |
| `name?`             | `string`  |             |
| `body?`             | `string`  |             |
| `draft?`            | `boolean` |             |
| `prerelease?`       | `boolean` |             |
| `target_commitish?` | `string`  |             |
| `[key: string]`     | `unknown` |             |

---

## SDK reference: idempotent

Source: https://docs.kici.dev/user/sdk/idempotent/

The SDK exposes two idempotency helpers — a generic function `idempotent()` and a step factory `idempotentStep()` — for the common case where a workflow step should:

1. **Check** whether the desired state is already in place.
2. **Apply** the change only when drift is detected.
3. **Surface** the resource (or its identifier) on both branches, so downstream steps don't need to know whether work happened or was skipped.

Both helpers wrap the same underlying runner, so they share semantics and return shape. Pick `idempotentStep()` when the check/apply cycle is all the step does; use `idempotent()` from anywhere, such as inside a multi-action step, a hook, or a bare async function.

## `idempotent(options)`

Run a single check / apply cycle and return a discriminated result describing the outcome.

### Parameters

| Name         | Type                                   | Required | Description                                                                                      |
| ------------ | -------------------------------------- | -------- | ------------------------------------------------------------------------------------------------ |
| `name`       | `string`                               | No       | Label that appears in log lines. Defaults to `'idempotent'`.                                     |
| `check`      | `() => Promise<TDrift \| null>`        | Yes      | Read-only inspection. Return `null` when the system is already in the desired state.             |
| `apply`      | `(drift: TDrift) => Promise<TApplied>` | Yes      | Brings the system to the desired state when `check()` returned a non-null drift value.           |
| `whenInSync` | `() => Promise<TInSync>`               | No       | Runs when `check()` returned `null`. Use it to fetch the already-satisfied resource.             |
| `summarize`  | `(drift: TDrift) => string`            | No       | Human-readable, multi-line summary of what `apply()` would do. Defaults to a JSON dump of drift. |
| `log`        | `(line: string) => void`               | No       | Sink for status lines. Defaults to `console.log`.                                                |

### Result

`idempotent()` resolves to a discriminated `IdempotentResult` union:

| Outcome     | `drift`  | `result`                                                                      |
| ----------- | -------- | ----------------------------------------------------------------------------- |
| `'skipped'` | `null`   | The `whenInSync()` return value, or `undefined` when `whenInSync` is omitted. |
| `'applied'` | `TDrift` | The `apply()` return value.                                                   |

Narrow on `result.outcome` before reading `result.result` to get the correct typed shape.
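
The narrowing step can be sketched without the SDK. The union below is a local stand-in that mirrors the table above; `SketchResult`, `PortDrift`, and `describeOutcome` are hypothetical names, not SDK exports, and the real `IdempotentResult` may carry additional fields.

```typescript
// Hypothetical local mirror of the documented result union; not an SDK export.
type SketchResult<TDrift, TInSync, TApplied> =
  | { outcome: 'skipped'; drift: null; result: TInSync | undefined }
  | { outcome: 'applied'; drift: TDrift; result: TApplied };

interface PortDrift {
  port: number;
}

function describeOutcome(r: SketchResult<PortDrift, string, number>): string {
  if (r.outcome === 'applied') {
    // In this branch r.drift is PortDrift and r.result is number.
    return `opened port ${r.drift.port} (rule #${r.result})`;
  }
  // In this branch r.drift is null and r.result is string | undefined.
  return `already open (${r.result ?? 'no details'})`;
}
```

Checking `outcome` first is what lets the compiler pick the matching `drift` and `result` types in each branch.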

### Example

```typescript
import { idempotent } from '@kici-dev/sdk';

const result = await idempotent({
  name: 'create-dns-record',
  check: async () => {
    const existing = await dns.getRecord('api.example.com');
    return existing ? null : { fqdn: 'api.example.com', target: '203.0.113.10' };
  },
  whenInSync: async () => {
    const existing = await dns.getRecord('api.example.com');
    return { id: existing.id };
  },
  apply: async (drift) => {
    const created = await dns.createRecord(drift.fqdn, drift.target);
    return { id: created.id };
  },
  summarize: (drift) => `Create A record ${drift.fqdn} → ${drift.target}`,
});

// Both branches surface the record id.
const recordId = result.result.id;
```

## `idempotentStep(name, options)`

A factory returning an SDK `Step` whose `run` body executes `idempotent(...)` and routes status lines through the step's structured logger.

### Parameters

| Name      | Type                                       | Required | Description                                                                                            |
| --------- | ------------------------------------------ | -------- | ------------------------------------------------------------------------------------------------------ |
| `name`    | `string`                                   | Yes      | Step name. Appears in the run timeline and in log lines.                                               |
| `options` | `Omit<IdempotentOptions, 'name' \| 'log'>` | Yes      | Same shape as `idempotent()` minus `name` (already provided) and `log` (provided by the step context). |

### Result

`idempotentStep(...)` returns `Step<IdempotentResult<TDrift, TInSync, TApplied>>`. Other steps can consume the result through the standard step output mechanisms.

### Example

```typescript
import { idempotentStep, job } from '@kici-dev/sdk';

const ensureBucket = idempotentStep('ensure-bucket', {
  check: async () => {
    const exists = await s3.bucketExists('app-cache');
    return exists ? null : { bucket: 'app-cache', region: 'eu-central-1' };
  },
  whenInSync: async () => ({ arn: 'arn:aws:s3:::app-cache' }),
  apply: async (drift) => {
    const created = await s3.createBucket(drift.bucket, drift.region);
    return { arn: created.arn };
  },
  summarize: (drift) => `Create S3 bucket ${drift.bucket} in ${drift.region}`,
});

export const setup = job('setup', {
  runsOn: 'linux',
  steps: [ensureBucket],
});
```

## Worked example: create-if-missing returning a resource id

The typical use case is **resource provisioning that should be safe to re-run**. The helper guarantees the same downstream typed shape whether the resource already existed or was just created:

```typescript
import { idempotent } from '@kici-dev/sdk';

interface BucketDrift {
  bucket: string;
  region: string;
}

interface BucketHandle {
  arn: string;
}

async function ensureBucket(bucket: string, region: string): Promise<BucketHandle> {
  const result = await idempotent<BucketDrift, BucketHandle, BucketHandle>({
    name: `ensure-${bucket}`,
    check: async () => {
      const existing = await s3.describeBucket(bucket);
      return existing ? null : { bucket, region };
    },
    whenInSync: async () => {
      const existing = await s3.describeBucket(bucket);
      return { arn: existing.arn };
    },
    apply: async (drift) => {
      const created = await s3.createBucket(drift.bucket, drift.region);
      return { arn: created.arn };
    },
    summarize: (drift) => `Create S3 bucket ${drift.bucket} in ${drift.region}`,
  });

  return result.result;
}
```

The caller never has to branch on outcome — `result.result` is always a `BucketHandle`. A second invocation against the same bucket logs a single "in sync, skipping" line and returns the same ARN.
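
The re-run behavior can be illustrated with an in-memory stand-in. This sketch inlines the check / apply cycle by hand instead of calling the SDK; `bucketStore`, `applyCalls`, and `ensureBucketSketch` are hypothetical names used only for illustration.

```typescript
// In-memory stand-in for a bucket service; hypothetical, for illustration only.
const bucketStore = new Map<string, string>();
let applyCalls = 0;

type SketchOutcome = 'skipped' | 'applied';

async function ensureBucketSketch(
  bucket: string,
): Promise<{ outcome: SketchOutcome; arn: string }> {
  // check: a missing entry is the drift.
  const existing = bucketStore.get(bucket);
  if (existing !== undefined) {
    // In-sync branch: surface the existing resource without applying anything.
    return { outcome: 'skipped', arn: existing };
  }
  // apply: create the resource and surface the same shape.
  applyCalls += 1;
  const arn = `arn:aws:s3:::${bucket}`;
  bucketStore.set(bucket, arn);
  return { outcome: 'applied', arn };
}
```

Calling `ensureBucketSketch('app-cache')` twice in sequence applies once, skips once, and returns the same ARN both times.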

## See also

- [Core SDK reference](./core.md) — the `step()`, `job()`, and `workflow()` factories that `idempotentStep()` builds on.
- [Runtime types](./runtime.md) — `StepContext`, `Logger`, and other surface used inside the helpers.

---

## SDK reference: rules, matrix, dynamic jobs

Source: https://docs.kici.dev/user/sdk/rules-matrix-dynamic/

## Rules

Rules control conditional execution of workflows and jobs. A rule that returns `false` (or whose check function returns `false`) prevents execution.

### rule(label) / rule(label, check)

Create a rule.

```typescript
function rule(label: string): Rule;
function rule(label: string, check: RuleCheckFn): Rule;
```

**Without check function:** Always passes. Useful as a marker in the decision trace.

```typescript
rule('ci: required check');
```

**With check function:** Passes when the function returns `true`.

```typescript
rule('has source changes', async (ctx) => {
  return ctx.changedFiles.some((f) => f.startsWith('src/'));
});
```

### skip(label, check)

Create a rule that skips when the condition is met. Inverts the check function.

```typescript
function skip(label: string, check: RuleCheckFn): Rule;
```

- When the check returns `true` (condition met), the rule returns `false` and execution is skipped.
- When the check returns `false` (condition not met), the rule returns `true` and execution is allowed.

```typescript
// Skip when only docs changed
skip('docs only PR', async (ctx) => {
  return ctx.changedFiles.every((f) => f.endsWith('.md'));
});
```
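
The inversion itself can be sketched in a few lines. This is an illustration under simplified assumptions, not the SDK's implementation: `SketchRuleContext`, `SketchCheckFn`, and `invertCheck` are hypothetical names, and the real `RuleContext` carries more fields.

```typescript
// Hypothetical, simplified shapes for illustration; the SDK's own types carry more fields.
interface SketchRuleContext {
  changedFiles: string[];
}
type SketchCheckFn = (ctx: SketchRuleContext) => Promise<boolean> | boolean;

// Wrap a check so that "condition met" maps to "rule fails".
function invertCheck(check: SketchCheckFn): (ctx: SketchRuleContext) => Promise<boolean> {
  return async (ctx) => !(await check(ctx));
}

const docsOnly: SketchCheckFn = (ctx) => ctx.changedFiles.every((f) => f.endsWith('.md'));
const shouldRun = invertCheck(docsOnly);
```

Here `shouldRun({ changedFiles: ['README.md'] })` resolves to `false`, and adding `'src/index.ts'` to the list makes it resolve to `true`. Note that `Array.prototype.every` returns `true` for an empty array, so a docs-only check written this way is also met when `changedFiles` is empty.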

### RuleCheckFn

```typescript
type RuleCheckFn = (ctx: RuleContext) => Promise<boolean> | boolean;
```

Can be sync or async. Receives a `RuleContext`:

| Property       | Type                                | Description                                                           |
| -------------- | ----------------------------------- | --------------------------------------------------------------------- |
| `event`        | `EventPayload`                      | The triggering event payload (discriminated union — narrow on `type`) |
| `changedFiles` | `string[]`                          | Files changed in this event                                           |
| `env`          | `Record<string, string\|undefined>` | Environment variables                                                 |
| `$`            | zx shell                            | Shell executor for running commands                                   |

### evaluateRules(rules, context, label, onRuleResult?)

Evaluate an array of rules sequentially with fail-fast behavior. Stops on the first failure.

```typescript
function evaluateRules(
  rules: Rule[],
  context: RuleContext,
  label: string,
  onRuleResult?: (result: RuleResult) => void,
): Promise<RuleEvaluationResult>;
```

Returns a `RuleEvaluationResult`:

```typescript
interface RuleEvaluationResult {
  allPassed: boolean;
  results: RuleResult[];
}
```
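
Sequential, fail-fast evaluation can be sketched as follows. This is a simplified model rather than the SDK's code: `SketchRule`, `SketchRuleResult`, and `evaluateSequentially` are hypothetical names, the context argument is omitted, and it is an assumption that the failing rule's result is recorded and reported before evaluation stops.

```typescript
// Hypothetical simplified shapes; the SDK's Rule, RuleContext, and RuleResult are richer.
interface SketchRule {
  label: string;
  check?: () => Promise<boolean> | boolean;
}
interface SketchRuleResult {
  label: string;
  passed: boolean;
}

async function evaluateSequentially(
  rules: SketchRule[],
  onRuleResult?: (result: SketchRuleResult) => void,
): Promise<{ allPassed: boolean; results: SketchRuleResult[] }> {
  const results: SketchRuleResult[] = [];
  for (const rule of rules) {
    // A rule without a check function always passes.
    const passed = rule.check ? await rule.check() : true;
    const result: SketchRuleResult = { label: rule.label, passed };
    results.push(result);
    onRuleResult?.(result);
    // Fail fast: later rules are never evaluated.
    if (!passed) return { allPassed: false, results };
  }
  return { allPassed: true, results };
}
```

Because evaluation stops at the first failure, ordering cheap checks before expensive ones avoids unnecessary work.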

### isEventType(event, type)

Type guard that narrows an `EventPayload` to a specific event type variant. Use this in rule check functions to get autocomplete on provider-specific fields.

```typescript
function isEventType<T extends EventPayload['type']>(
  event: EventPayload,
  type: T,
): event is Extract<EventPayload, { type: T }>;
```

**Example — skip draft PRs:**

```typescript
rule('skip-draft-prs', (ctx) => {
  if (!isEventType(ctx.event, 'pull_request')) return true;
  // ctx.event is now PullRequestEventPayload — full autocomplete
  return !ctx.event.payload.pull_request.draft;
});
```

**Example — branch-based rule with push narrowing:**

```typescript
rule('only-main-pushes', (ctx) => {
  if (!isEventType(ctx.event, 'push')) return false;
  // ctx.event.payload.ref is typed as string
  return ctx.event.payload.ref === 'refs/heads/main';
});
```

You can also narrow directly with `if (ctx.event.type === 'pull_request')` — TypeScript's discriminated union narrowing works on the `type` field.

### EventPayload

`EventPayload` is a discriminated union over the `type` field. Each variant provides typed access to the normalized event fields and the raw webhook payload.

Every variant carries the shared `EventBase` fields — `type`, `action`, `targetBranch`, `sourceBranch`, `provider`, `isForkPR`, `baseBranch`, `senderUsername`, `sourceRepo`, `changedFiles`, and the raw `payload` — plus a per-type `payload` shape for the typed variants. The complete field-by-field schema, including every typed `payload` shape and the shared GitHub object types, is in the [event payload reference](./event-payloads.md).

**Typed variants** (with GitHub-specific payload fields): `pull_request`, `push`, `tag`, `comment`, `review`, `review_comment`, `release`, `dispatch`, `create`, `delete`, `status`, `workflow_run`, `fork`, `star`, `watch`.

**Generic variants** (payload is `Record<string, unknown>`): `webhook`, `kici_event`, `workflow_complete`, `job_complete`, `generic_webhook`, `schedule`, `lifecycle`.

## Matrix

Matrix configurations expand a single job into multiple instances, one per parameter combination. Maximum 256 combinations.

### Static array (single dimension)

```typescript
matrix: ['18', '20', '22'];
```

Creates 3 job instances. In steps, the current value is `matrix.value`:

```typescript
step('test', async ({ $, matrix }) => {
  console.log(matrix!.value); // '18', '20', or '22'
});
```

### Static object (multi-dimensional)

```typescript
matrix: {
  os: ['linux', 'arm64'],
  node: ['18', '20'],
}
```

Creates 4 job instances (2 x 2). The `os` values (`linux`, `arm64`) are **customer-defined scaler labels** matched by subset semantics against the labels your orchestrator advertises in its scaler `labelSets` — not hosted-runner names. In steps, values are named properties:

```typescript
step('test', async ({ $, matrix }) => {
  console.log(matrix!.os); // 'linux' or 'arm64'
  console.log(matrix!.node); // '18' or '20'
});
```
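
The instance count of a static matrix is the array length or the product of the dimension sizes, which can be compared against the 256-combination limit stated at the top of this section. A minimal sketch, assuming the limit applies to the raw product; how `include` and `exclude` entries count toward it is not covered here, and `countCombinations` and `exceedsLimit` are hypothetical helpers, not SDK exports.

```typescript
const MAX_COMBINATIONS = 256; // limit stated at the top of the Matrix section

// Instance count for a static matrix: array length, or the product of dimension sizes.
function countCombinations(matrix: string[] | Record<string, string[]>): number {
  if (Array.isArray(matrix)) return matrix.length;
  let count = 1;
  for (const dimension of Object.keys(matrix)) {
    count *= (matrix[dimension] ?? []).length;
  }
  return count;
}

function exceedsLimit(matrix: string[] | Record<string, string[]>): boolean {
  return countCombinations(matrix) > MAX_COMBINATIONS;
}
```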

### Dynamic function

Compute matrix values at runtime:

```typescript
matrix: async ({ $ }) => {
  const result = await $`ls packages/`;
  return result.stdout.trim().split('\n');
};
```

The function receives a `DynamicMatrixContext`:

| Property | Type                                | Description               |
| -------- | ----------------------------------- | ------------------------- |
| `$`      | zx shell                            | Shell executor            |
| `ctx`    | `{ workflow, job }`                 | Workflow and job metadata |
| `log`    | `Logger`                            | Structured logger         |
| `env`    | `Record<string, string\|undefined>` | Environment variables     |

The function must return `string[]` (single dimension) or `Record<string, string[]>` (multi-dimensional).
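
One detail worth guarding when building the array from command output: in JavaScript, `''.split('\n')` returns `['']`, so empty output yields a single empty-string value rather than an empty list. How KiCI treats an empty-string matrix value is not stated here, so the sketch below simply filters such entries out. `linesToMatrix` is a hypothetical helper, not part of the SDK.

```typescript
// Turn command output into matrix values, dropping blank lines and surrounding whitespace.
function linesToMatrix(stdout: string): string[] {
  return stdout
    .split('\n')
    .map((line) => line.trim())
    .filter((line) => line.length > 0);
}
```

Inside a dynamic matrix function this would replace the `trim().split('\n')` chain, for example `return linesToMatrix(result.stdout);`.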

### Include and exclude

Fine-tune matrix combinations on multi-dimensional matrices:

```typescript
matrix: {
  os: ['linux', 'arm64', 'windows'],
  node: ['18', '20', '22'],
},
exclude: [
  { os: 'windows', node: '18' },
],
include: [
  { os: 'linux', node: '23' },
],
```

- **Exclude** removes every combination that matches all keys of an exclude entry. It is applied first.
- **Include** appends exact combinations. It is applied after exclude.

Types:

```typescript
type MatrixInclude = Record<string, string>;
type MatrixExclude = Record<string, string>;
```

### MatrixValues

The shape of `matrix` in `StepContext`:

```typescript
interface MatrixValues {
  value?: string; // Single-dimension value
  [dimension: string]: string | undefined; // Named dimensions
}
```

### Matrix type guards

```typescript
import { isStaticArray, isStaticObject, isDynamicFunction } from '@kici-dev/sdk';

isStaticArray(matrix); // true if string[]
isStaticObject(matrix); // true if Record<string, string[]>
isDynamicFunction(matrix); // true if async function
```

### Matrix expansion utilities

```typescript
import { expandMatrix, applyIncludeExclude } from '@kici-dev/sdk';
```

`expandMatrix(matrix)` takes a `StaticMatrixArray` or `StaticMatrixObject` and returns all combinations as `MatrixValues[]`. For a single-dimension array, each value becomes `{ value: '...' }`. For multi-dimensional objects, it produces the Cartesian product.

`applyIncludeExclude(values, include?, exclude?)` filters an expanded matrix: removes combinations matching any exclude entry, then appends include entries. Returns the filtered `MatrixValues[]`.
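
The two steps can be sketched independently of the SDK. This is an illustrative model of the behavior described above, not the library's implementation: `cartesianProduct`, `matchesEntry`, and `excludeThenInclude` are hypothetical names, and only the multi-dimensional case is covered.

```typescript
type Combination = Record<string, string>;

// Cartesian product over the dimensions of a static matrix object.
function cartesianProduct(matrix: Record<string, string[]>): Combination[] {
  let combos: Combination[] = [{}];
  for (const dimension of Object.keys(matrix)) {
    const next: Combination[] = [];
    for (const combo of combos) {
      for (const value of matrix[dimension] ?? []) {
        next.push({ ...combo, [dimension]: value });
      }
    }
    combos = next;
  }
  return combos;
}

// An entry matches when every key it specifies has the same value in the combination.
function matchesEntry(combo: Combination, entry: Combination): boolean {
  return Object.keys(entry).every((key) => combo[key] === entry[key]);
}

// Drop anything matched by an exclude entry, then append the include entries.
function excludeThenInclude(
  combos: Combination[],
  include: Combination[] = [],
  exclude: Combination[] = [],
): Combination[] {
  const kept = combos.filter((combo) => !exclude.some((entry) => matchesEntry(combo, entry)));
  return [...kept, ...include];
}
```

Applied to the include/exclude example earlier in this section, the 3 x 3 matrix expands to nine combinations, the exclude entry removes one, and the include entry appends one, leaving nine.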

## Dynamic jobs

Generate jobs at runtime using async factory functions.

### DynamicJobFn

```typescript
type DynamicJobFn = (context: DynamicJobContext) => Promise<Job[]>;
```

Receives a `DynamicJobContext`:

| Property | Type                                | Description                 |
| -------- | ----------------------------------- | --------------------------- |
| `$`      | zx shell                            | Shell executor              |
| `ctx`    | `{ workflow, event? }`              | Workflow metadata and event |
| `log`    | `Logger`                            | Structured logger           |
| `env`    | `Record<string, string\|undefined>` | Environment variables       |

```typescript
const discoverJobs: DynamicJobFn = async ({ $ }) => {
  const result = await $`ls packages/`;
  const packages = result.stdout.trim().split('\n');
  return packages.map((pkg) =>
    job(`test-${pkg}`, {
      runsOn: 'linux',
      steps: [
        step('test', async ({ $ }) => {
          await $`cd packages/${pkg} && pnpm test`;
        }),
      ],
    }),
  );
};

export default workflow('ci', {
  jobs: [discoverJobs],
});
```

### JobOrFactory

The `jobs` array in `WorkflowOptions` accepts both static jobs and dynamic generators:

```typescript
type JobOrFactory = Job | DynamicJobFn;
```

### isDynamicJobFn(item)

Type guard to distinguish static jobs from dynamic generators:

```typescript
function isDynamicJobFn(item: JobOrFactory): item is DynamicJobFn;
```

```typescript
for (const item of workflow.jobs) {
  if (isDynamicJobFn(item)) {
    const generatedJobs = await item(context);
  } else {
    // item is Job
  }
}
```

---

## SDK reference: runtime

Source: https://docs.kici.dev/user/sdk/runtime/

## Types

All types are exported from `@kici-dev/sdk`; import them with `import type`.

### Core types

| Type              | Description                                                                           |
| ----------------- | ------------------------------------------------------------------------------------- |
| `Workflow`        | Workflow definition returned by `workflow()`                                          |
| `WorkflowOptions` | Options for `workflow()` factory                                                      |
| `Job`             | Job definition returned by `job()`                                                    |
| `JobOptions`      | Options for `job()` factory                                                           |
| `Step<TOutputs>`  | Step definition returned by `step()`                                                  |
| `StepOptions<T>`  | Options for `step()` factory (full form with outputs)                                 |
| `StepRunFn`       | Simple step function type: `(ctx) => Promise<void>`                                   |
| `BareStepFn`      | Bare step function (no options, just `(ctx) => ...`)                                  |
| `StepInput`       | Union of step input forms accepted by `job()`                                         |
| `OutputSchema`    | Record of Zod types for step outputs                                                  |
| `InferOutputs<T>` | Infer output type from output schema                                                  |
| `ContainerConfig` | Container config for job execution (`image`, `env?`)                                  |
| `RunsOn`          | Union of `runsOn` forms: `string \| string[] \| RunsOnSelector`                       |
| `RunsOnSelector`  | Object form for `runsOn` with `labels` (required) and `exclude` (optional) properties |
| `Fixture`         | Test fixture definition returned by `fixture()`                                       |
| `FixtureOptions`  | Options for `fixture()` factory                                                       |
| `Registry`        | Private npm registry declaration used in `WorkflowOptions.registries`                 |

### Trigger types

| Type                            | Description                                                           |
| ------------------------------- | --------------------------------------------------------------------- |
| `Trigger`                       | Trigger definition (trigger config + source location)                 |
| `TriggerConfig`                 | Union of all 22 trigger config types                                  |
| `PrTriggerConfig`               | PR trigger configuration (from `pr()`)                                |
| `PushTriggerConfig`             | Push trigger configuration (from `push()`)                            |
| `TagTriggerConfig`              | Tag trigger configuration (from `tag()`)                              |
| `CommentTriggerConfig`          | Comment trigger configuration (from `comment()`)                      |
| `ReviewTriggerConfig`           | Review trigger configuration (from `review()`)                        |
| `ReviewCommentTriggerConfig`    | Review comment trigger configuration (from `reviewComment()`)         |
| `ReleaseTriggerConfig`          | Release trigger configuration (from `release()`)                      |
| `DispatchTriggerConfig`         | Repository dispatch trigger configuration (from `dispatch()`)         |
| `CreateTriggerConfig`           | Ref creation trigger configuration (from `create()`)                  |
| `DeleteTriggerConfig`           | Ref deletion trigger configuration (from `delete()`)                  |
| `StatusTriggerConfig`           | Commit status trigger configuration (from `status()`)                 |
| `WorkflowRunTriggerConfig`      | Workflow run trigger configuration (from `workflowRun()`)             |
| `ForkTriggerConfig`             | Fork trigger configuration (from `fork()`)                            |
| `StarTriggerConfig`             | Star trigger configuration (from `star()`)                            |
| `WatchTriggerConfig`            | Watch trigger configuration (from `watch()`)                          |
| `WebhookTriggerConfig`          | Catch-all webhook trigger configuration (from `webhook()`)            |
| `KiciEventTriggerConfig`        | Custom event trigger configuration (from `kiciEvent()`)               |
| `WorkflowCompleteTriggerConfig` | Workflow completion trigger configuration (from `workflowComplete()`) |
| `JobCompleteTriggerConfig`      | Job completion trigger configuration (from `jobComplete()`)           |
| `GenericWebhookTriggerConfig`   | Generic webhook trigger configuration (from `genericWebhook()`)       |
| `ScheduleTriggerConfig`         | Schedule trigger configuration (from `schedule()`)                    |
| `LifecycleTriggerConfig`        | Lifecycle trigger configuration (from `lifecycle()`)                  |
| `PrConfigInput`                 | Config object for `pr()` factory                                      |
| `PushConfigInput`               | Config object for `push()` factory                                    |
| `BranchPattern`                 | `{ type: 'glob', pattern } \| { type: 'regex', pattern, flags? }`     |
| `PrEvent`                       | PR event string literal union (17 event types)                        |
| `GenericWebhookConfigInput`     | Config object for `genericWebhook()` factory                          |
| `GenericWebhookAuth`            | Union of generic webhook auth types (HMAC or API key)                 |
| `GenericWebhookHmacAuth`        | HMAC-SHA256 auth configuration for generic webhooks                   |
| `GenericWebhookApiKeyAuth`      | API key auth configuration for generic webhooks                       |
| `GenericWebhookAuthMethod`      | Auth method string literal (`'hmac-sha256' \| 'api-key'`)             |

### Rule types

| Type                   | Description                                                             |
| ---------------------- | ----------------------------------------------------------------------- |
| `Rule`                 | Rule definition returned by `rule()` / `skip()`                         |
| `RuleCheckFn`          | `(ctx: RuleContext) => Promise<boolean> \| boolean`                     |
| `RuleContext`          | Context passed to rule check functions                                  |
| `RuleResult`           | Result of rule evaluation (label, passed, duration)                     |
| `EventPayload`         | Discriminated union over event type (narrow on `type` for autocomplete) |
| `RuleEvaluationResult` | Result of `evaluateRules()` (allPassed + results)                       |

### Matrix types

| Type                   | Description                                                         |
| ---------------------- | ------------------------------------------------------------------- |
| `Matrix`               | Union: `StaticMatrixArray \| StaticMatrixObject \| DynamicMatrixFn` |
| `StaticMatrixArray`    | `string[]`                                                          |
| `StaticMatrixObject`   | `Record<string, string[]>`                                          |
| `DynamicMatrixFn`      | `(ctx) => Promise<StaticMatrixArray \| StaticMatrixObject>`         |
| `DynamicMatrixContext` | Context passed to dynamic matrix functions                          |
| `MatrixValues`         | Values exposed to steps (`value?` + named dimensions)               |
| `MatrixInclude`        | `Record<string, string>` -- additional combinations                 |
| `MatrixExclude`        | `Record<string, string>` -- removed combinations                    |

### Hook types

| Type              | Description                                                     |
| ----------------- | --------------------------------------------------------------- |
| `HookConfig`      | Hook definition returned by hook factories (`onCancel()`, etc.) |
| `HookFn`          | Hook function type: `(ctx: HookContext) => Promise<void>`       |
| `HookInput`       | Hook input: `HookFn \| { run: HookFn; timeout?: number }`       |
| `HookContext`     | Context passed to hook functions                                |
| `OutcomeMetadata` | Metadata about the outcome that triggered the hook              |

### Dynamic job types

| Type                | Description                        |
| ------------------- | ---------------------------------- |
| `DynamicJobFn`      | `(ctx) => Promise<Job[]>`          |
| `DynamicJobContext` | Context for dynamic job generators |
| `JobOrFactory`      | `Job \| DynamicJobFn`              |

### Context types

| Type                  | Description                                                        |
| --------------------- | ------------------------------------------------------------------ |
| `StepContext<T>`      | Context passed to step run functions                               |
| `Logger`              | Logger interface (info, warn, error, debug)                        |
| `WorkflowInfo`        | Workflow metadata: `{ name: string }`                              |
| `JobInfo`             | Job metadata: `{ name: string, runsOn: string }`                   |
| `RepoInfo`            | Repository metadata available in step context                      |
| `StepSecrets`         | Async accessor interface for step secrets (`get`, `expose`, `has`) |
| `StepSecretsTyped`    | Typed step secrets with known key inference                        |
| `KnownSecretKeys`     | String literal union of declared secret context keys               |
| `SecretNotFoundError` | Thrown when accessing a nonexistent key in secrets                 |

## StepContext

The context object passed to every step's `run` function:

```typescript
interface StepContext<TInputs = Record<string, unknown>> {
  /** zx shell executor for running commands */
  $: typeof Shell;
  /** Structured logger */
  log: Logger;
  /** Environment variables */
  env: Record<string, string | undefined>;
  /** Set an environment variable visible to this step and all subsequent steps */
  setEnv(key: string, value: string): void;
  /** Prepend a directory to PATH, visible to this step and all subsequent steps */
  addPath(dir: string): void;
  /** Typed inputs from dependency step outputs */
  inputs: TInputs;
  /** Current workflow metadata */
  workflow: WorkflowInfo;
  /** Current job metadata */
  job: JobInfo;
  /** Matrix values for current job instance (undefined without matrix) */
  matrix?: MatrixValues;
  /** Raw webhook payload from the git provider */
  rawPayload?: Record<string, unknown>;
  /** Which git provider triggered this workflow (e.g. 'github', 'gitlab') */
  provider?: string;
  /** Whether this execution was triggered by `kici test` (remote test run) */
  isTestRun: boolean;
  /** The resolved deployment environment name for this job (undefined without environment) */
  environment?: string;
  /** Flat secrets resolved for this job. Throws SecretNotFoundError on missing key. */
  secrets: StepSecrets;
  /** Emit a custom event that can trigger other workflows */
  emit(
    eventName: string,
    payload?: Record<string, unknown>,
    options?: EventEmitOptions,
  ): Promise<{ deliveryId: string }>;
  /** Resolve outputs from a preceding step by reference */
  outputsOf<T>(ref: { _tag: 'Step'; name: string } | ((...args: any[]) => any)): T;
  /** Resolve outputs from a preceding job by reference */
  jobOutputs(ref: { name: string }): Record<string, unknown>;
  /** Publish a secret output value from this job (encrypted before leaving the agent) */
  setSecretOutput(key: string, value: string): void;
}
```

### Logger

```typescript
interface Logger {
  info(message: string, ...args: unknown[]): void;
  warn(message: string, ...args: unknown[]): void;
  error(message: string, ...args: unknown[]): void;
  debug(message: string, ...args: unknown[]): void;
}
```

### Usage

```typescript
step('example', async ({ $, log, env, matrix, workflow, job }) => {
  log.info(`Running in workflow: ${workflow.name}`);
  log.info(`Job: ${job.name} on ${job.runsOn}`);

  if (matrix) {
    log.info(`Matrix value: ${matrix.value}`);
  }

  const token = env.GITHUB_TOKEN;
  await $`echo "Building..."`;
});
```

### `rawPayload` and rule-context parity

`ctx.rawPayload` carries the same data that rule contexts access via `ctx.event.payload` — the unmodified webhook body from the git provider. A rule that branches on `ctx.event.payload.client_payload.foo` and a step body that reads `ctx.rawPayload.client_payload.foo` see the same value. Use it inside steps when the operator's dispatch payload (or any other provider-specific field) needs to drive runtime behavior — e.g. a `--dry-run` toggle or a deploy target — without bouncing the data through an env var.
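
Because `rawPayload` is typed as `Record<string, unknown>`, nested fields need narrowing before use. A minimal sketch, assuming a hypothetical boolean `dry_run` field inside `client_payload`; `isRecord` and `readDryRun` are illustrative helpers, not SDK exports.

```typescript
// Narrow an unknown value to a plain object so its keys can be read safely.
function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === 'object' && value !== null && !Array.isArray(value);
}

// Read a nested boolean flag from an untyped payload without unchecked casts.
// `dry_run` is an illustrative field name, not a KiCI contract.
function readDryRun(rawPayload: Record<string, unknown> | undefined): boolean {
  const clientPayload = rawPayload?.['client_payload'];
  if (!isRecord(clientPayload)) return false;
  // Only a literal boolean true enables the flag; the string 'true' does not.
  return clientPayload['dry_run'] === true;
}
```

Inside a step this would be called as `readDryRun(ctx.rawPayload)`; a missing payload or a missing field falls back to `false`.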

**What's captured in the dashboard log viewer.** KiCI captures user output from every place in a workflow that can run TypeScript:

- **Inside a step body** — the agent merges three streams into the step's log: `ctx.log.*` structured calls, subprocess stdout/stderr from `ctx.$`, and any direct `console.log` / `.error` / `.warn` / `.info` / `.debug` (or other library that writes to `process.stdout` / `process.stderr`).
- **Inside hooks** (`beforeStep`, `afterStep`, `onSuccess`, `onFailure`, `onCancel`, `cleanup`) — the same three streams are captured; per-step hooks share the step's log, post-loop hooks get their own dashboard row.
- **At workflow module top-level, in rule `check` functions, and in the workflow `concurrency.group` function** — captured to the workflow-level `prepare` log bucket for the job, alongside KiCI's own setup narration.
- **Inside a dynamic `environment` / `env` / `concurrencyGroup` function** on a static job — captured to the `__init__` job's synthetic step-0 log, which appears in the timeline as "Init: _jobname_".
- **Inside a `DynamicJobFn` body and the per-generated-job `environment` / `env` / `concurrencyGroup` / `matrix` functions** — captured to the `__dynamic__` job's synthetic step-0 log ("Evaluate: _jobname_" in the timeline). The `$` parameter in that context is a scoped zx shell, so `` await $`...` `` subprocess output is captured too.

Use whichever style is convenient — you don't have to wrap `console.log` in the provided `log` parameter to make it visible. One limitation applies to in-process contexts only (init, build, dynamic-eval): direct `process.stdout.write` / `printf` is not captured there, because the agent's own logger uses that path and we don't want agent-internal output leaking into your step logs. Use `console.*` or the `log` parameter instead. See [Log streaming](../../architecture/execution/job-execution.md#log-streaming) for the full capture surface and limits (default 10 MB per step, backpressure behavior).

### setEnv(key, value)

Export an environment variable to later steps in the same job. This is the canonical way to hand a value computed in one step to the steps that follow — the equivalent of `echo "KEY=VALUE" >> $GITHUB_ENV` in GitHub Actions. The value is visible to the current step and all subsequent steps in the job.

```typescript
step('setup', async (ctx) => {
  // Install a tool and record its version
  await ctx.$`npm install -g some-tool`;
  const version = (await ctx.$`some-tool --version`).stdout.trim();
  ctx.setEnv('TOOL_VERSION', version);
});

step('use', async (ctx) => {
  // TOOL_VERSION is available here
  ctx.log.info(`Using tool version: ${ctx.env.TOOL_VERSION}`);
});
```

**Behavior:**

- Last-write-wins -- if multiple steps set the same key, the last value is used
- Cannot override operator-injected secrets (the operator value takes precedence)
- Changes take effect immediately in the current step and persist for all subsequent steps
- Shell commands export the same way by appending to `$KICI_ENV` (see [Exporting env from shell commands](#exporting-env-from-shell-commands-kici_env--kici_path) below)

### addPath(dir)

Prepend a directory to `PATH` for the current step and all subsequent steps in the same job. Useful for tools installed to non-standard locations.

```typescript
step('install-go', async (ctx) => {
  await ctx.$`curl -L https://go.dev/dl/go1.22.0.linux-amd64.tar.gz | tar -C /tmp -xz`;
  ctx.addPath('/tmp/go/bin');
});

step('build', async (ctx) => {
  // `go` is now on PATH
  await ctx.$`go build ./...`;
});
```

### Exporting env from shell commands ($KICI_ENV / $KICI_PATH)

`setEnv` and `addPath` are the TypeScript form of "export env to later steps". A shell command — including a non-JS toolchain installer — exports env the same way by appending to two files the agent points at before every step:

- **`$KICI_ENV`** — append `KEY=value` lines. Each becomes an environment variable visible to subsequent steps, exactly like `ctx.setEnv('KEY', 'value')`.
- **`$KICI_PATH`** — append one directory per line. Each is prepended to `PATH` for subsequent steps, exactly like `ctx.addPath(dir)`. The first directory appended ends up first on `PATH`.

```typescript
step('install-tool', async (ctx) => {
  await ctx.$`./install-mytool.sh`; // installs to /opt/mytool
  // Export from the shell, no JS round-trip needed:
  await ctx.$`echo "MYTOOL_HOME=/opt/mytool" >> "$KICI_ENV"`;
  await ctx.$`echo "/opt/mytool/bin" >> "$KICI_PATH"`;
});

step('build', async (ctx) => {
  // MYTOOL_HOME is set and /opt/mytool/bin is on PATH here.
  await ctx.$`mytool build`;
});
```

**Format (v1):**

- One `KEY=value` per line in `$KICI_ENV`. The split is on the first `=`, so the value may contain `=`. Blank lines and lines without a `=` are ignored.
- One directory per line in `$KICI_PATH`. Blank lines are ignored.
- Values must be single-line — embedded newlines are not supported in v1.
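
The format rules above can be sketched as two small parsers. This is a minimal illustration only: `parseEnvFile` and `parsePathFile` are hypothetical names, not SDK exports, and the handling of surrounding whitespace, `\r\n` line endings, and empty keys is left unspecified here because the rules above do not cover it.

```typescript
// Hypothetical helpers for illustration; not part of the KiCI SDK or agent.

/** Parse $KICI_ENV-style content into ordered [key, value] pairs. */
function parseEnvFile(content: string): Array<[string, string]> {
  const pairs: Array<[string, string]> = [];
  for (const line of content.split('\n')) {
    const eq = line.indexOf('=');
    // Blank lines and lines without '=' are ignored.
    if (line.trim() === '' || eq === -1) continue;
    // Split on the FIRST '=', so the value may itself contain '='.
    pairs.push([line.slice(0, eq), line.slice(eq + 1)]);
  }
  return pairs;
}

/** Parse $KICI_PATH-style content into directories, in file order. */
function parsePathFile(content: string): string[] {
  // One directory per line; blank lines are ignored.
  return content.split('\n').filter((line) => line.trim() !== '');
}
```

Returning ordered pairs rather than an object keeps repeated keys visible, so the caller can decide how to resolve them.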

**Behavior (shared with `setEnv` / `addPath`):**

- Applied after the step completes and visible to every later step in the job (`ctx.setEnv` additionally takes effect immediately in the current step).
- Last-write-wins on a repeated key.
- Cannot override an operator-injected secret — a collision is ignored and logged, and the operator value is preserved.
- The files are reset before each step, so each step sees only its own appended lines.
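
As a rough model of these rules, the sketch below applies a list of exported pairs to a base environment. Everything in it is an assumption made for illustration: `applyExports`, `prependPath`, the `operatorKeys` set, and the `:` path separator (POSIX) are hypothetical and do not describe the agent's actual implementation.

```typescript
// Hypothetical sketch; names and shapes are illustrative only.
interface ApplyResult {
  env: Record<string, string>;
  ignored: string[]; // keys skipped because they collide with operator-injected values
}

function applyExports(
  baseEnv: Record<string, string>,
  exportsInOrder: Array<[string, string]>,
  operatorKeys: ReadonlySet<string>,
): ApplyResult {
  const env: Record<string, string> = { ...baseEnv };
  const ignored: string[] = [];
  for (const [key, value] of exportsInOrder) {
    if (operatorKeys.has(key)) {
      // Collision with an operator-injected key: keep the operator value.
      ignored.push(key);
      continue;
    }
    env[key] = value; // Last-write-wins on a repeated key.
  }
  return { env, ignored };
}

/** Prepend directories so the first one listed ends up first on PATH. */
function prependPath(currentPath: string, dirs: string[]): string {
  return [...dirs, currentPath].filter((part) => part !== '').join(':');
}
```

A real agent would log each ignored key; the sketch returns them instead so the behavior is easy to inspect.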

### setSecretOutput(key, value)

Publish an encrypted secret output from this job. Downstream jobs that list this job in their `needs` array receive the value merged into `ctx.secrets`.

```typescript
const generateToken = job('generate-token', {
  steps: [
    step('create', async (ctx) => {
      const token = (await ctx.$`vault write -f auth/token/create`).stdout.trim();
      ctx.setSecretOutput('DEPLOY_TOKEN', token);
    }),
  ],
});

const deploy = job('deploy', {
  needs: [generateToken],
  steps: [
    step('deploy', async (ctx) => {
      // DEPLOY_TOKEN is available as a secret (decrypted by the orchestrator)
      const token = await ctx.secrets.get('DEPLOY_TOKEN');
      await ctx.$`DEPLOY_TOKEN=${token} ./deploy.sh`;
    }),
  ],
});
```

**Security model:**

- The value is encrypted on the agent before leaving the machine (X25519 ECDH + AES-256-GCM)
- The orchestrator decrypts and re-encrypts with its own key before storing
- The ephemeral key pair is deleted when the run completes (forward secrecy)
- Downstream agents never see the plaintext -- they receive it as part of their injected secrets

**Limits:**

- Maximum 20 secret outputs per job
- Maximum 64 KB per value

## Secrets

Workflows access secrets through `ctx.secrets` on `StepContext`. Use `await ctx.secrets.get('KEY')` to retrieve a value (it rejects with `SecretNotFoundError` if the key is missing, so typos fail fast), `ctx.secrets.has('KEY')` for a synchronous existence check, and `await ctx.secrets.expose('KEY')` when you need the value as a `process.env` entry for a child process.

### Declaring the secret environment

Each job picks its secret environment via the `environment` option on `job()`. The orchestrator resolves the environment's scoped-secret store at dispatch time, evaluates access rules, and sends the decrypted secrets to the agent:

```typescript
const deploy = job('deploy', {
  runsOn: 'linux',
  environment: 'production',
  steps: [
    /* ... */
  ],
});

export default workflow('deploy', {
  on: push({ branches: 'main' }),
  jobs: [deploy],
});
```

`environment` accepts either a static string or an async function `(event) => string | Promise<string>` for dynamic resolution at trigger-evaluation time. The resolved environment's secrets are flattened into `ctx.secrets`.

### Accessing secrets (ctx.secrets)

`ctx.secrets` provides flat access to the secrets resolved for the job's environment.

```typescript
step('deploy', async ({ secrets }) => {
  // get() rejects with SecretNotFoundError if DEPLOY_TOKEN is not found
  const token = await secrets.get('DEPLOY_TOKEN');

  // Safe check before access (no throw, synchronous)
  if (secrets.has('OPTIONAL_KEY')) {
    const optional = await secrets.get('OPTIONAL_KEY');
  }
});
```

**Throw behavior:** When the key is missing, `get()` rejects with `SecretNotFoundError`, and the error message lists all available keys. This catches typos immediately rather than producing silent `undefined` values.

### Complete example

```typescript
import { workflow, job, step, push } from '@kici-dev/sdk';

const deploy = job('deploy', {
  runsOn: 'linux',
  environment: 'production',
  steps: [
    step('deploy', async (ctx) => {
      const token = await ctx.secrets.get('DEPLOY_TOKEN');

      // Safe check before access
      if (ctx.secrets.has('OPTIONAL_NOTIFICATION_URL')) {
        const url = await ctx.secrets.get('OPTIONAL_NOTIFICATION_URL');
        ctx.log.info('Sending notification...');
      }

      // Pass to subprocess explicitly (secrets are NOT auto-injected as env vars)
      await ctx.$`DEPLOY_TOKEN=${token} ./scripts/deploy.sh`;
    }),
  ],
});

export default workflow('deploy-production', {
  on: push({ branches: 'main' }),
  jobs: [deploy],
});
```

### Security notes

- Secrets are **not** automatically injected as environment variables. You must explicitly pass them to subprocesses.
- All secret values are automatically **masked** in log output. If a step logs a string containing a secret value, the value is replaced with `***`.
- Secrets flow from the orchestrator to the agent via the authenticated WebSocket channel. The Platform tier never handles secret material.
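
To make the masking rule concrete, here is a minimal sketch of replacing secret values with `***` in a log line. `maskSecrets` is a hypothetical helper written for this illustration; the real masking logic is not shown in this document and may differ (for example in how it handles encoded or partial values).

```typescript
// Hypothetical helper; not the agent's actual masking implementation.
function maskSecrets(line: string, secretValues: string[]): string {
  // Mask longer values first so a secret that contains another secret
  // is replaced as a whole rather than leaving a visible suffix.
  const ordered = secretValues
    .filter((value) => value.length > 0)
    .sort((a, b) => b.length - a.length);
  let masked = line;
  for (const value of ordered) {
    masked = masked.split(value).join('***');
  }
  return masked;
}
```

Plain string splitting is used instead of a regular expression so that secret values containing regex metacharacters need no escaping.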

### Enumerating available keys (ctx.secrets.list)

`ctx.secrets.list()` returns every secret key available to the step, sorted alphabetically. It is synchronous, never throws, and returns names only; call `getMeta(key)` to inspect the backend and scope of a key. This is useful when the set of provisioned keys isn't known at workflow-author time, for example to pick up every `AGE_KEY_*` the operator has seeded:

```typescript
step('discover', async (ctx) => {
  const ageKeys = ctx.secrets.list().filter((k) => k.startsWith('AGE_KEY_'));
  ctx.log.info(`Found ${ageKeys.length} age keys`);
});
```

### File-mounted secrets (ctx.secrets.mountFile / exposeFile)

Tools that require a file path on disk (sops `SOPS_AGE_KEY_FILE`, kubectl `KUBECONFIG`, gcloud `GOOGLE_APPLICATION_CREDENTIALS`) get a typed step-side API: `ctx.secrets.mountFile(opts)` writes the concatenation of one or more existing secrets to a per-step tmpfile and returns the path; `ctx.secrets.exposeFile(envVar, opts)` additionally sets `process.env[envVar] = path`. Files are removed and env vars are unset automatically when the step completes (success, failure, or timeout) — no manual cleanup. See [Mounting secrets as files](../secrets.md#mounting-secrets-as-files) for the full options table, lifecycle details, and the canonical sops example.

### Local test mode secrets

When running `kici test`, you can provide secrets locally without an orchestrator.

#### .kici/.secrets file

Create a `.kici/.secrets` file in your project (auto-gitignored by `kici init`):

```ini
# Flat secrets (before any section)
DEPLOY_TOKEN=my-deploy-token
API_KEY=my-api-key

# Context-scoped secrets
[production]
DB_PASSWORD=prod-secret
API_KEY=prod-key

[npm-publish]
NPM_TOKEN=npm-abc123
```

Lines before any `[section]` header are flat secrets. Lines within a section become context-scoped secrets. Comments start with `#`. Values are everything after the first `=` (so values can contain `=` characters).
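
The file format described above can be read with a short parser along these lines. This is a sketch under stated assumptions: `parseSecretsFile` and `ParsedSecrets` are hypothetical names, lines are trimmed before parsing, and lines without `=` are skipped. None of that is confirmed as the behavior of `kici test`.

```typescript
// Hypothetical parser for illustration; not the CLI's actual implementation.
interface ParsedSecrets {
  flat: Record<string, string>;
  contexts: Record<string, Record<string, string>>;
}

function parseSecretsFile(content: string): ParsedSecrets {
  const result: ParsedSecrets = { flat: {}, contexts: {} };
  // Lines before any [section] header go into the flat bucket.
  let current: Record<string, string> = result.flat;
  for (const rawLine of content.split('\n')) {
    const line = rawLine.trim(); // Assumption: surrounding whitespace is not significant.
    if (line === '' || line.startsWith('#')) continue; // Blank lines and comments.
    if (line.startsWith('[') && line.endsWith(']')) {
      const name = line.slice(1, -1);
      const bucket = result.contexts[name] ?? {};
      result.contexts[name] = bucket;
      current = bucket;
      continue;
    }
    const eq = line.indexOf('=');
    if (eq === -1) continue; // Assumption: malformed lines are ignored.
    // The value is everything after the first '=', so it may contain '='.
    current[line.slice(0, eq)] = line.slice(eq + 1);
  }
  return result;
}
```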

#### CLI flags

Override or supplement file-based secrets with CLI flags:

```bash
# Inject flat secrets (repeatable)
kici test push --secret DEPLOY_TOKEN=my-token --secret API_KEY=my-key

# Inject context-scoped secrets (repeatable)
kici test push --context production.DB_PASSWORD=prod-secret --context npm-publish.NPM_TOKEN=abc123
```

**Precedence:** CLI flags override `.kici/.secrets` file values. Context secrets are auto-flattened into `ctx.secrets` using the same merge logic as production (last context wins).

## Fixtures

Test fixtures define event replicas for `kici run remote`. They simulate trigger events without requiring real webhooks.

### fixture(id, options)

```typescript
function fixture(
  id: string,
  options: FixtureOptions | (() => FixtureOptions | Promise<FixtureOptions>),
): Fixture;
```

**Parameters:**

- `id` — unique fixture name (no whitespace). Used in `kici run remote <id>`.
- `options` — a `FixtureOptions` object, or an async factory function returning one.

```typescript
import { fixture, push } from '@kici-dev/sdk';

export const pushMain = fixture('push-main', {
  event: push({ branches: ['main'] }),
});
```

### FixtureOptions

| Property       | Type                     | Description                                                |
| -------------- | ------------------------ | ---------------------------------------------------------- |
| `event`        | `TriggerConfig`          | The trigger event to simulate (required)                   |
| `branch`       | `string`                 | Override branch name (defaults to git-detected)            |
| `sha`          | `string`                 | Override commit SHA (defaults to HEAD)                     |
| `repo`         | `string`                 | Override repository (defaults to git-detected)             |
| `pr`           | `number`                 | For PR events, override PR number                          |
| `secrets`      | `Record<string, string>` | Secret context mappings: `{ localName: 'remote-context' }` |
| `workflowName` | `string`                 | Bypass trigger matching and run this workflow directly     |

Options can also be provided as an async factory function for dynamic fixture generation.

---

## SDK reference: triggers

Source: https://docs.kici.dev/user/sdk/triggers/

## Triggers

Triggers define when a workflow runs. KiCI provides 22 trigger types: 16 GitHub webhook triggers and 6 internal/generic triggers for event routing, scheduling, and non-GitHub sources. Each trigger returns a frozen config object with a unique `_tag` discriminator.

All triggers use a config object form -- pass an options object to configure the trigger.

### pr()

Create a pull request trigger. Returns a frozen `PrTriggerConfig` directly.

```typescript
function pr(config?: PrConfigInput): PrTriggerConfig;
```

**Config options:**

```typescript
interface PrConfigInput {
  events?: PrEvent[];
  target?: string | RegExp | (string | RegExp)[];
  source?: string | RegExp | (string | RegExp)[];
  paths?: string[]; // Use '!' prefix for exclusions (e.g., '!docs/**')
  repos?: string | RegExp | (string | RegExp)[]; // Cross-repo source patterns -- see global-workflows.md
  description?: string;
}
```

**PrEvent values:** `'opened'`, `'synchronize'`, `'reopened'`, `'closed'`, `'assigned'`, `'unassigned'`, `'labeled'`, `'unlabeled'`, `'edited'`, `'converted_to_draft'`, `'ready_for_review'`, `'locked'`, `'unlocked'`, `'review_requested'`, `'review_request_removed'`, `'auto_merge_enabled'`, `'auto_merge_disabled'`

**Default events** (when `events` is not specified): `opened`, `synchronize`, `reopened`, `closed`

**Examples:**

```typescript
// All PRs with default events
pr();

// PRs targeting main with path filter
pr({ target: 'main', events: ['opened', 'synchronize'], paths: ['src/**'] });

// Regex branch pattern
pr({ target: /^release\/v\d+$/ });
```

### push()

Create a push trigger. Returns a frozen `PushTriggerConfig` directly.

```typescript
function push(config?: PushConfigInput): PushTriggerConfig;
```

**Config options:**

```typescript
interface PushConfigInput {
  branches?: string | RegExp | (string | RegExp)[];
  tags?: string | RegExp | (string | RegExp)[];
  paths?: string[]; // Use '!' prefix for exclusions (e.g., '!docs/**')
  repos?: string | RegExp | (string | RegExp)[]; // Cross-repo source patterns -- see global-workflows.md
  description?: string;
}
```

**Examples:**

```typescript
// Any push
push();

// Push to main only
push({ branches: 'main' });

// Push with branch and path filters
push({ branches: ['main', 'develop'], paths: ['src/**'] });

// Tag pushes
push({ tags: ['v*'] });
```

### tag()

Create a tag trigger. Returns a frozen `TagTriggerConfig`.

```typescript
function tag(config?: TagConfigInput): TagTriggerConfig;
```

**Config options:** `patterns` (string/RegExp/array), `description`

```typescript
tag(); // Any tag
tag({ patterns: ['v*'] }); // Tags starting with "v" (e.g. v1.2.3)
tag({ patterns: /^v\d+\.\d+$/ }); // Regex match
```

### comment()

Create an issue/PR comment trigger. Returns a frozen `CommentTriggerConfig`.

```typescript
function comment(config?: CommentConfigInput): CommentTriggerConfig;
```

**Config options:** `actions` (created/edited/deleted), `source` (issue/pr), `bodyMatch` (string or RegExp), `description`

```typescript
comment(); // Any comment
comment({ bodyMatch: '/deploy' }); // Glob match on body
comment({ bodyMatch: /^\/deploy/i }); // Regex match on body
comment({ source: 'pr', actions: ['created'] }); // PR comments only
```

### review()

Create a pull request review trigger. Returns a frozen `ReviewTriggerConfig`.

```typescript
function review(config?: ReviewConfigInput): ReviewTriggerConfig;
```

**Config options:** `actions` (submitted/edited/dismissed), `states` (approved/changes_requested/commented/dismissed), `description`

```typescript
review(); // Any review
review({ states: ['approved'] }); // Approvals only
review({ actions: ['submitted'], states: ['approved'] }); // Submitted approvals
```

### reviewComment()

Create a PR review comment trigger. Returns a frozen `ReviewCommentTriggerConfig`.

```typescript
function reviewComment(config?: ReviewCommentConfigInput): ReviewCommentTriggerConfig;
```

**Config options:** `actions` (created/edited/deleted), `description`

```typescript
reviewComment(); // Any review comment
reviewComment({ actions: ['created'] }); // New review comments only
```

### release()

Create a release trigger. Returns a frozen `ReleaseTriggerConfig`.

```typescript
function release(config?: ReleaseConfigInput): ReleaseTriggerConfig;
```

**Config options:** `actions` (published/unpublished/created/edited/deleted/prereleased/released), `description`

```typescript
release(); // Any release event
release({ actions: ['published'] }); // Published releases only
```

### dispatch()

Create a repository_dispatch trigger. Returns a frozen `DispatchTriggerConfig`.

```typescript
function dispatch(config?: DispatchConfigInput): DispatchTriggerConfig;
```

**Config options:** `types` (string[]), `description`

```typescript
dispatch(); // Any dispatch
dispatch({ types: ['deploy', 'rollback'] }); // Specific event types
```

### create()

Create a ref creation trigger (branches/tags). Returns a frozen `CreateTriggerConfig`.

```typescript
function create(config?: CreateConfigInput): CreateTriggerConfig;
```

**Config options:** `refTypes` (branch/tag), `patterns` (string/RegExp/array), `description`

```typescript
create(); // Any ref creation
create({ refTypes: ['tag'], patterns: ['v*'] }); // Tag creation only
```

### delete()

Create a ref deletion trigger (branches/tags). Returns a frozen `DeleteTriggerConfig`.

Note: Since `delete` is a JavaScript reserved word, import as `del`: `import { delete as del } from '@kici-dev/sdk'`

```typescript
function del(config?: DeleteConfigInput): DeleteTriggerConfig;
```

**Config options:** `refTypes` (branch/tag), `patterns` (string/RegExp/array), `description`

```typescript
del(); // Any ref deletion
del({ refTypes: ['branch'], patterns: ['temp/*'] }); // Temp branch cleanup
```

### status()

Create a commit status trigger. Returns a frozen `StatusTriggerConfig`.

```typescript
function status(config?: StatusConfigInput): StatusTriggerConfig;
```

**Config options:** `contexts` (picomatch strings like `'ci/*'`), `states` (error/failure/pending/success), `description`

```typescript
status(); // Any status
status({ contexts: ['ci/*'], states: ['success'] }); // CI success
```

### workflowRun()

Create a workflow_run trigger. Returns a frozen `WorkflowRunTriggerConfig`.

```typescript
function workflowRun(config?: WorkflowRunConfigInput): WorkflowRunTriggerConfig;
```

**Config options:** `actions` (requested/completed/in_progress), `workflows` (name filters), `conclusions` (success/failure/cancelled), `description`

```typescript
workflowRun(); // Any workflow run
workflowRun({ workflows: ['CI'], actions: ['completed'], conclusions: ['success'] });
```

### fork()

Create a fork trigger. No filter fields. Returns a frozen `ForkTriggerConfig`.

```typescript
function fork(config?: ForkConfigInput): ForkTriggerConfig;
```

```typescript
fork(); // Any fork event
fork({ description: 'Track forks' }); // With description
```

### star()

Create a star trigger. Returns a frozen `StarTriggerConfig`.

```typescript
function star(config?: StarConfigInput): StarTriggerConfig;
```

**Config options:** `actions` (created/deleted), `description`

```typescript
star(); // Any star event
star({ actions: ['created'] }); // New stars only
```

### watch()

Create a watch trigger. Returns a frozen `WatchTriggerConfig`.

```typescript
function watch(config?: WatchConfigInput): WatchTriggerConfig;
```

**Config options:** `actions` (started), `description`

```typescript
watch(); // Any watch event
watch({ actions: ['started'] }); // Watch started only
```

### webhook()

Create a catch-all webhook trigger for any GitHub event. Returns a frozen `WebhookTriggerConfig`. Unlike other triggers, `events` is **required** -- catch-all must specify what to catch.

```typescript
function webhook(config: WebhookConfigInput): WebhookTriggerConfig;
```

**Config options:** `events` (required string[]), `actions` (optional string[]), `repos` (optional cross-repo source patterns -- see [global workflows](../global-workflows.md)), `description`

```typescript
webhook({ events: ['deployment'] }); // Deployment events
webhook({ events: ['deployment', 'deployment_status'] }); // Multiple events
webhook({ events: ['deployment'], actions: ['created'] }); // With action filter
```

#### Cross-source delivery

A `webhook()` trigger fires whenever a matching event arrives via **any inbound webhook source within the same org**, not just the source the workflow's repository is bound to. If your repo is registered through a github source and a separate generic source in the same org POSTs an event with a matching name, the workflow still runs.

Two important rules govern the cross-source path:

1. **The registration's source owns dispatch credentials.** The runtime clone, auth, and check-status posting come from the source the workflow was registered with (via its default-branch push), never from the inbound source. A generic webhook fanning out to a github-registered workflow uses the github bundle's clone token provider — the generic source contributes only the event payload.
2. **Org isolation is structural.** A webhook delivered to org A can never trigger a workflow registered against org B. The lookup index is keyed on `(customerId, eventName)` so cross-org leakage is impossible.
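
The isolation property in rule 2 follows directly from the shape of the key. The sketch below is a hypothetical in-memory model, not the orchestrator's real data structure; `RegistrationIndex` and the string workflow IDs are invented for illustration.

```typescript
// Hypothetical model of a lookup index keyed on (customerId, eventName).
class RegistrationIndex {
  private readonly byKey = new Map<string, string[]>();

  private static key(customerId: string, eventName: string): string {
    // JSON-encode the tuple so two different pairs can never share a key.
    return JSON.stringify([customerId, eventName]);
  }

  register(customerId: string, eventName: string, workflowId: string): void {
    const key = RegistrationIndex.key(customerId, eventName);
    const list = this.byKey.get(key) ?? [];
    list.push(workflowId);
    this.byKey.set(key, list);
  }

  /** Workflows reachable by an inbound event; only the given org is consulted. */
  lookup(customerId: string, eventName: string): string[] {
    return [...(this.byKey.get(RegistrationIndex.key(customerId, eventName)) ?? [])];
  }
}
```

Because the inbound source is not part of the key, any source in the org fans out to the same workflows, while a different `customerId` always resolves to a different entry.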

The orchestrator emits `kici_cross_source_fanout_size` (histogram) per inbound webhook so operators can observe how many workflows each event reaches.

### Event triggers

The following 6 trigger types support internal event routing, scheduling, lifecycle orchestration, and non-GitHub webhook sources.

### kiciEvent()

Create a custom event trigger. Fires when a named internal event is emitted from a workflow step via `ctx.emit()`. Returns a frozen `KiciEventTriggerConfig`.

```typescript
function kiciEvent(config: KiciEventConfigInput): KiciEventTriggerConfig;
```

**Config options:**

```typescript
interface KiciEventConfigInput {
  name: string; // Required: event name to listen for
  match?: Record<string, unknown>; // JSONPath payload matching (e.g., { '$.env': 'prod' })
  not?: Record<string, unknown>; // Negative JSONPath filter
  source?: string; // Cross-repo source filter (e.g., 'org/infra-repo')
  description?: string;
}
```

```typescript
kiciEvent({ name: 'deploy-complete' }); // Match by name
kiciEvent({ name: 'deploy-complete', match: { '$.env': 'prod' } }); // With payload filter
kiciEvent({ name: 'deploy-complete', not: { '$.env': 'staging' } }); // Negative filter
kiciEvent({ name: 'deploy-complete', source: 'org/infra-repo' }); // Cross-repo
```
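
To show how `match` and `not` filters might be evaluated, here is a deliberately small sketch. It makes several assumptions that the text above does not confirm: only dotted paths such as `$.deploy.env` are handled (not full JSONPath), values are compared with strict equality, every `match` entry must hold, and any matching `not` entry rejects the event. `resolvePath` and `matchesFilters` are hypothetical names.

```typescript
// Hypothetical evaluator; the real matcher may support richer JSONPath syntax.
type Payload = Record<string, unknown>;

/** Resolve a dotted path such as '$.deploy.env' against a payload. */
function resolvePath(payload: Payload, path: string): unknown {
  if (!path.startsWith('$.')) return undefined;
  let current: unknown = payload;
  for (const segment of path.slice(2).split('.')) {
    if (typeof current !== 'object' || current === null) return undefined;
    current = (current as Record<string, unknown>)[segment];
  }
  return current;
}

function matchesFilters(payload: Payload, match: Payload = {}, not: Payload = {}): boolean {
  // Assumption: all `match` entries must equal the resolved value.
  const allMatch = Object.entries(match).every(
    ([path, expected]) => resolvePath(payload, path) === expected,
  );
  // Assumption: a single equal `not` entry is enough to reject the event.
  const noneExcluded = Object.entries(not).every(
    ([path, excluded]) => resolvePath(payload, path) !== excluded,
  );
  return allMatch && noneExcluded;
}
```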

### workflowComplete()

Create a workflow completion trigger. Fires automatically when another workflow finishes execution. Returns a frozen `WorkflowCompleteTriggerConfig`.

```typescript
function workflowComplete(config?: WorkflowCompleteConfigInput): WorkflowCompleteTriggerConfig;
```

**Config options:**

```typescript
interface WorkflowCompleteConfigInput {
  name?: string; // Filter by workflow name
  status?: WorkflowCompleteStatus[]; // Filter by completion status
  source?: string; // Cross-repo source filter
  description?: string;
}
type WorkflowCompleteStatus = 'success' | 'failed' | 'cancelled';
```

```typescript
workflowComplete(); // Any workflow completion
workflowComplete({ name: 'CI' }); // Specific workflow
workflowComplete({ name: 'CI', status: ['success'] }); // Success only
workflowComplete({ name: 'CI', status: ['success'], source: 'org/repo' }); // Cross-repo
```

### jobComplete()

Create a job completion trigger. Fires automatically when a specific job within a workflow finishes. Returns a frozen `JobCompleteTriggerConfig`.

```typescript
function jobComplete(config?: JobCompleteConfigInput): JobCompleteTriggerConfig;
```

**Config options:**

```typescript
interface JobCompleteConfigInput {
  workflow?: string; // Filter by workflow name
  job?: string; // Filter by job name
  status?: JobCompleteStatus[]; // Filter by completion status
  source?: string; // Cross-repo source filter
  description?: string;
}
type JobCompleteStatus = 'success' | 'failed' | 'cancelled' | 'skipped';
```

```typescript
jobComplete(); // Any job completion
jobComplete({ workflow: 'CI', job: 'build' }); // Specific workflow + job
jobComplete({ workflow: 'CI', job: 'build', status: ['success'] }); // Success only
jobComplete({ workflow: 'CI', job: 'build', source: 'org/repo' }); // Cross-repo
```

### genericWebhook()

Create a generic webhook trigger. Fires when a non-GitHub webhook is received from an external source configured via the admin API. Returns a frozen `GenericWebhookTriggerConfig`.

```typescript
function genericWebhook(config: GenericWebhookConfigInput): GenericWebhookTriggerConfig;
```

**Config options:**

```typescript
interface GenericWebhookConfigInput {
  source: string; // Required: must match `--name` from `kici-admin source add generic`
  events?: string[]; // Filter by event types
  match?: Record<string, unknown>; // JSONPath payload matching
  not?: Record<string, unknown>; // Negative JSONPath filter
  auth?: GenericWebhookAuth; // HMAC or API key authentication
  path?: string; // URL path pattern (replaces source for URL matching)
  description?: string;
}
```

```typescript
genericWebhook({ source: 'argocd' }); // Any event from ArgoCD
genericWebhook({ source: 'argocd', events: ['deploy.success'] }); // Specific events
genericWebhook({ source: 'argocd', match: { '$.env': 'prod' } }); // With payload filter
genericWebhook({ source: 'argocd', not: { '$.dry_run': true } }); // Negative filter
genericWebhook({
  source: 'stripe',
  auth: { method: 'hmac-sha256', secret: 'stripe-key', signatureHeader: 'stripe-signature' },
}); // HMAC auth
genericWebhook({ source: 'slack', auth: { method: 'api-key', secret: 'slack-token' } }); // API key auth
genericWebhook({ source: 'stripe', path: 'stripe/payments' }); // URL path pattern
```

### schedule()

Create a cron-based schedule trigger. Returns a frozen `ScheduleTriggerConfig`.

```typescript
function schedule(config: ScheduleConfigInput): ScheduleTriggerConfig;
```

**Config options:**

```typescript
interface ScheduleConfigInput {
  cron: string; // Required: cron expression (5-field)
  timezone?: string; // Timezone for cron evaluation (default: 'UTC')
  description?: string; // Human-readable description of the schedule
}
```

```typescript
schedule({ cron: '0 * * * *' }); // Every hour
schedule({ cron: '0 0 * * *' }); // Daily at midnight UTC
schedule({ cron: '0 9 * * 1', timezone: 'America/New_York' }); // Monday 9am ET
schedule({ cron: '*/15 * * * *', description: 'health check every 15 min' });
```

### lifecycle()

Create a lifecycle trigger for cross-workflow orchestration events. Returns a frozen `LifecycleTriggerConfig`.

```typescript
function lifecycle(config: LifecycleConfigInput): LifecycleTriggerConfig;
```

**Config options:**

```typescript
interface LifecycleConfigInput {
  events: LifecycleEvent[]; // Required: lifecycle events to listen for
  sources?: string[]; // Optional: filter by source repo (e.g., 'org/repo')
  description?: string; // Human-readable description
}

type LifecycleEvent = 'workflow_complete' | 'job_complete' | 'job_failed' | 'registration_updated';
```

```typescript
lifecycle({ events: ['workflow_complete'] }); // Any workflow completion
lifecycle({ events: ['job_failed'], sources: ['org/deploy-repo'] }); // Job failures from specific repo
lifecycle({ events: ['registration_updated'] }); // Workflow registration changes
```

### Branch patterns

`pr()`, `push()`, `tag()`, `create()`, and `delete()` all accept glob strings and RegExp literals for pattern matching:

```typescript
// Glob patterns (micromatch syntax)
pr({ target: ['main', 'release/*', 'feature/**'] });

// Regex patterns
pr({ target: /^release\/v\d+\.\d+$/ });

// Mixed
push({ branches: ['main', /^hotfix\//] });
```

Glob patterns use micromatch syntax. Regex patterns use standard JavaScript `RegExp`.
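
The sketch below illustrates how a mixed list of globs and regular expressions can be tested against a branch name. It is not micromatch: `globToRegExp` is a hypothetical, simplified converter that understands only `*` (any characters except `/`) and `**` (any characters including `/`), so it covers the examples above but not the full glob syntax.

```typescript
// Simplified stand-in for glob matching; real matching uses micromatch syntax.
type Pattern = string | RegExp;

/** Convert a simple glob to a RegExp: '**' crosses '/', '*' does not. */
function globToRegExp(glob: string): RegExp {
  const source = glob
    .split('**')
    .map((part) =>
      part
        .split('*')
        // Escape regex metacharacters in the literal pieces.
        .map((literal) => literal.replace(/[.+?^${}()|[\]\\]/g, '\\$&'))
        .join('[^/]*'),
    )
    .join('.*');
  return new RegExp(`^${source}$`);
}

/** True when the value matches at least one glob string or RegExp. */
function matchesAny(value: string, patterns: Pattern | Pattern[]): boolean {
  const list = Array.isArray(patterns) ? patterns : [patterns];
  return list.some((pattern) =>
    typeof pattern === 'string' ? globToRegExp(pattern).test(value) : pattern.test(value),
  );
}
```

Note that a RegExp with the `g` flag keeps state between `test()` calls, so patterns passed to a helper like this are best written without it.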

---

## SDK reference: validation & events

Source: https://docs.kici.dev/user/sdk/validation-events/

## Validation

### validateDag(nodes)

Validate a directed acyclic graph for correctness.

```typescript
function validateDag(nodes: DagNode[]): DagValidationResult;
```

**DagNode:**

```typescript
interface DagNode {
  id: string;
  needs: string[];
}
```

**DagValidationResult** (discriminated union):

```typescript
// Valid graph with topological sort order
{ valid: true; sortedOrder: string[] }

// Cycle detected
{ valid: false; error: 'cycle'; nodesInCycle: string[] }

// Job depends on itself
{ valid: false; error: 'self-reference'; nodeId: string }

// Job depends on non-existent job
{ valid: false; error: 'missing-dependency'; nodeId: string; missingDep: string }
```

Checks (in order): self-references, missing dependencies, cycles (Kahn's algorithm).

```typescript
const result = validateDag([
  { id: 'lint', needs: [] },
  { id: 'test', needs: ['lint'] },
  { id: 'deploy', needs: ['test'] },
]);

if (result.valid) {
  console.log(result.sortedOrder); // ['lint', 'test', 'deploy']
}
```
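
For readers unfamiliar with Kahn's algorithm, the following sketch runs the three checks in the documented order. It is an independent illustration, not the SDK's source: `validateDagSketch` is a hypothetical name, the types are redeclared locally so the block stands alone, and `nodesInCycle` is approximated here as every node that could not be sorted (which can include nodes downstream of a cycle).

```typescript
// Illustrative re-implementation; the SDK's validateDag may differ in detail.
interface DagNode {
  id: string;
  needs: string[];
}

type DagValidationResult =
  | { valid: true; sortedOrder: string[] }
  | { valid: false; error: 'cycle'; nodesInCycle: string[] }
  | { valid: false; error: 'self-reference'; nodeId: string }
  | { valid: false; error: 'missing-dependency'; nodeId: string; missingDep: string };

function validateDagSketch(nodes: DagNode[]): DagValidationResult {
  const ids = new Set(nodes.map((node) => node.id));

  // 1. Self-references.
  for (const node of nodes) {
    if (node.needs.includes(node.id)) {
      return { valid: false, error: 'self-reference', nodeId: node.id };
    }
  }

  // 2. Missing dependencies.
  for (const node of nodes) {
    for (const dep of node.needs) {
      if (!ids.has(dep)) {
        return { valid: false, error: 'missing-dependency', nodeId: node.id, missingDep: dep };
      }
    }
  }

  // 3. Cycles, via Kahn's algorithm: repeatedly emit nodes with no unmet needs.
  const inDegree = new Map<string, number>();
  const dependents = new Map<string, string[]>();
  for (const node of nodes) {
    inDegree.set(node.id, node.needs.length);
    dependents.set(node.id, []);
  }
  for (const node of nodes) {
    for (const dep of node.needs) dependents.get(dep)?.push(node.id);
  }

  const queue = nodes.filter((node) => node.needs.length === 0).map((node) => node.id);
  const sortedOrder: string[] = [];
  while (queue.length > 0) {
    const id = queue.shift() as string;
    sortedOrder.push(id);
    for (const next of dependents.get(id) ?? []) {
      const remaining = (inDegree.get(next) ?? 0) - 1;
      inDegree.set(next, remaining);
      if (remaining === 0) queue.push(next);
    }
  }

  // Any node never emitted still has an unmet dependency, so a cycle exists.
  if (sortedOrder.length < nodes.length) {
    const nodesInCycle = nodes.map((node) => node.id).filter((id) => !sortedOrder.includes(id));
    return { valid: false, error: 'cycle', nodesInCycle };
  }
  return { valid: true, sortedOrder };
}
```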

## Event definitions

The `defineEvent()` helper creates typed event definitions with Zod validation schemas. Event definitions serve as contracts for custom event payloads used with `ctx.emit()` and `kiciEvent()`.

### defineEvent(name, schema)

```typescript
function defineEvent<T extends z.ZodTypeAny>(name: string, schema: T): EventDefinition<T>;
```

**Parameters:**

| Parameter | Type        | Required | Description                       |
| --------- | ----------- | -------- | --------------------------------- |
| `name`    | `string`    | yes      | Unique event name                 |
| `schema`  | `z.ZodType` | yes      | Zod schema for payload validation |

**Returns:** `EventDefinition<T>` -- a frozen event definition with `name` and `schema`.

```typescript
import { defineEvent, z } from '@kici-dev/sdk';

const deployComplete = defineEvent(
  'deploy-complete',
  z.object({
    env: z.string(),
    version: z.string(),
    services: z.array(z.string()),
  }),
);
```

The `z` (Zod) module is re-exported from `@kici-dev/sdk` so you can define event schemas without adding Zod as a direct dependency.

## Emitting events

Workflow steps can emit custom events via `ctx.emit()`. Emitted events are delivered immediately (mid-workflow, not queued until completion) and can trigger other workflows that listen with `kiciEvent()`, `workflowComplete()`, or `jobComplete()` triggers.

### ctx.emit(eventName, payload?, options?)

```typescript
emit(
  eventName: string,
  payload?: Record<string, unknown>,
  options?: EventEmitOptions,
): Promise<{ deliveryId: string }>;
```

**Parameters:**

| Parameter        | Type                      | Required | Description                                   |
| ---------------- | ------------------------- | -------- | --------------------------------------------- |
| `eventName`      | `string`                  | yes      | Name of the event to emit                     |
| `payload`        | `Record<string, unknown>` | no       | Event payload data                            |
| `options.target` | `{ repos?: string[] }`    | no       | Target specific repos for cross-repo delivery |

**Returns:** `Promise<{ deliveryId: string }>` -- a delivery receipt after the event is persisted and routed.

**Examples:**

```typescript
// Emit a simple event
step('notify', async (ctx) => {
  await ctx.emit('deploy-complete', { env: 'prod', version: '1.2.3' });
});

// Cross-repo targeting
step('notify-other-repos', async (ctx) => {
  await ctx.emit(
    'deploy-complete',
    { env: 'prod' },
    {
      target: { repos: ['org/other-repo', 'org/monitoring'] },
    },
  );
});
```

### Cross-repo event delivery

Events emitted from one repo can trigger workflows in another repo, provided:

1. A trust relationship exists between the source and target repos (configured via the admin API)
2. The target workflow uses a trigger with `source` filter matching the emitting repo

```typescript
// In repo A: emit event
step('deploy', async (ctx) => {
  await ctx.emit(
    'deploy-complete',
    { env: 'prod' },
    {
      target: { repos: ['org/repo-B'] },
    },
  );
});

// In repo B: listen for event from repo A
workflow('post-deploy', {
  on: kiciEvent({ name: 'deploy-complete', source: 'org/repo-A' }),
  jobs: [postDeployJob],
});
```

### System events

The orchestrator automatically emits system events for workflow and job completions. You do not need to call `ctx.emit()` for these -- they are generated by the orchestrator after execution. Listen for them with `workflowComplete()` and `jobComplete()` triggers.

---

## SDK reference: waitFor

Source: https://docs.kici.dev/user/sdk/wait-for/

The SDK exposes two wait-for helpers — a generic function `waitFor()` and a step factory `waitForStep()` — for the common case where a workflow step should:

1. **Poll** a condition on a fixed interval.
2. **Proceed** as soon as the condition is met, optionally running a success action.
3. **Fail or recover** gracefully when the deadline is exceeded, with an optional timeout action.

Both helpers wrap the same polling loop, so they share semantics and return shape. Pick `waitForStep()` when the wait is the whole job of a step; use `waitFor()` from anywhere — inside a multi-action step, a hook, or a bare async function.
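
As a mental model for that shared loop, here is a minimal fixed-interval poller. It is a sketch under assumptions, not the SDK's implementation: `pollUntil`, the `'condition-met'` outcome label, and the injectable `sleep` / `now` options are hypothetical, and details such as error swallowing, initial delay, callbacks, and exact deadline handling are left out.

```typescript
// Hypothetical polling loop for illustration; not the SDK's waitFor().
interface PollOptions<T> {
  check: () => Promise<T | null>; // Return a value when done, or null to keep polling.
  intervalMs: number;
  timeoutMs: number;
  sleep?: (ms: number) => Promise<void>; // Injectable so tests need no real timers.
  now?: () => number; // Injectable clock.
}

type PollResult<T> =
  | { outcome: 'condition-met'; value: T; attempts: number }
  | { outcome: 'timed-out'; attempts: number; elapsedMs: number };

async function pollUntil<T>(options: PollOptions<T>): Promise<PollResult<T>> {
  const sleep =
    options.sleep ?? ((ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms)));
  const now = options.now ?? (() => Date.now());
  const startedAt = now();
  let attempts = 0;
  for (;;) {
    attempts += 1;
    const value = await options.check();
    if (value !== null) return { outcome: 'condition-met', value, attempts };
    const elapsedMs = now() - startedAt;
    // Assumption: the deadline is evaluated after each unsuccessful check.
    if (elapsedMs >= options.timeoutMs) return { outcome: 'timed-out', attempts, elapsedMs };
    await sleep(options.intervalMs);
  }
}
```

Returning a discriminated result instead of throwing on timeout lets the caller decide whether an expired deadline is a failure or a recoverable branch.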

## `waitFor(options)`

Poll `check()` on a fixed interval until it returns a non-null value or the deadline is exceeded. Resolves to a discriminated result describing which outcome occurred.

### Parameters

| Name             | Type                                                                   | Required | Description                                                                                                |
| ---------------- | ---------------------------------------------------------------------- | -------- | ---------------------------------------------------------------------------------------------------------- |
| `name`           | `string`                                                               | No       | Label that appears in log lines and in the timeout error. Defaults to `'waitFor'`.                         |
| `check`          | `() => Promise<TValue \| null>`                                        | Yes      | Polled inspection. Return the resolved value when the condition is met, or `null` to keep polling.         |
| `intervalMs`     | `number`                                                               | No       | Time between successive `check()` invocations. Defaults to `2000` milliseconds.                            |
| `timeoutMs`      | `number`                                                               | No       | Total time budget for the wait. Defaults to `60000` milliseconds.                                          |
| `initialDelayMs` | `number`                                                               | No       | Time to wait before the first `check()` invocation. Defaults to `0`.                                       |
| `onSuccess`      | `(value: TValue) => Promise<TSuccess>`                                 | No       | Runs once after `check()` returns a non-null value. Its return value is surfaced as `result` on success.   |
| `onTimeout`      | `(info: { elapsedMs: number; attempts: number }) => Promise<TTimeout>` | No       | Runs when the deadline is exceeded. Its return value is surfaced as `result` on the `'timed-out'` outcome. |
| `swallowErrors`  | `boolean`                                                              | No       | When `true` (default), errors thrown by `check()` are logged and polling continues.                        |
| `log`            | `(line: string) => void`                                               | No       | Sink for status lines. Defaults to `console.log`.                                                          |

### Result

`waitFor()` resolves to a discriminated `WaitForResult` union:

| Outcome       | Branch fields                                                                                         |
| ------------- | ----------------------------------------------------------------------------------------------------- |
| `'succeeded'` | `value: TValue`, `elapsedMs`, `attempts`, `result: TSuccess` (the `onSuccess` return or `undefined`). |
| `'timed-out'` | `elapsedMs`, `attempts`, `result: TTimeout` (the `onTimeout` return).                                 |

Narrow on `result.outcome` before reading the branch-specific fields.

When `onTimeout` is **not** supplied, the helper throws a `WaitForTimeoutError` instead of returning a `'timed-out'` result. The error exposes `stepName`, `elapsedMs`, and `attempts` as instance fields so a catch block can branch on them.
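
The narrowing described above can be sketched as follows. This is a minimal, self-contained illustration: the union is re-declared locally to mirror the Result table (the type exported by `@kici-dev/sdk` may differ in detail), and `describeOutcome` is a hypothetical helper, not part of the SDK.

```typescript
// Local re-declaration mirroring the Result table above. This is an
// assumption for illustration; the SDK's exported type may differ.
type WaitForResult<TValue, TSuccess, TTimeout> =
  | { outcome: 'succeeded'; value: TValue; elapsedMs: number; attempts: number; result: TSuccess }
  | { outcome: 'timed-out'; elapsedMs: number; attempts: number; result: TTimeout };

// Hypothetical helper: narrow on `outcome` before reading branch fields.
function describeOutcome(
  r: WaitForResult<{ digest: string }, { digest: string }, { aborted: boolean }>,
): string {
  if (r.outcome === 'succeeded') {
    // `value` exists only on the 'succeeded' branch.
    return `ready ${r.value.digest} after ${r.attempts} polls`;
  }
  // On this branch `r.result` is whatever `onTimeout` returned.
  return `gave up after ${r.elapsedMs} ms (aborted=${r.result.aborted})`;
}
```

Reading `r.value` before the `outcome` check would be a compile-time error, which is the point of the discriminated union.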

### Cancellation and the deadline check

The loop inspects the deadline at the top of each iteration. A `check()` that takes longer than `intervalMs` is not aborted mid-flight; the helper has no `AbortSignal` plumbing. The step's own `timeout` field is the hard kill if the step needs to be interrupted unconditionally.
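
To make the deadline behavior concrete, here is a simplified polling loop written for this page. It is a sketch under stated assumptions, not the SDK source: `pollUntil`, `PollResult`, and `sleep` are hypothetical names, and error swallowing, logging, and the `onSuccess` / `onTimeout` hooks are left out.

```typescript
type PollResult<T> =
  | { outcome: 'succeeded'; value: T; attempts: number }
  | { outcome: 'timed-out'; attempts: number };

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Illustrative loop only; the real helper has more options.
async function pollUntil<T>(
  check: () => Promise<T | null>,
  intervalMs: number,
  timeoutMs: number,
): Promise<PollResult<T>> {
  const start = Date.now();
  let attempts = 0;
  while (true) {
    // The deadline is inspected only here, at the top of each iteration.
    if (Date.now() - start >= timeoutMs) return { outcome: 'timed-out', attempts };
    attempts += 1;
    // A slow check() is awaited to completion; nothing aborts it mid-flight.
    const value = await check();
    if (value !== null) return { outcome: 'succeeded', value, attempts };
    await sleep(intervalMs);
  }
}
```

In a loop shaped like this, a `check()` that is already running when the deadline passes still finishes, so the wait can overrun `timeoutMs` by up to the duration of one check plus one interval.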

### Example

```typescript
import { waitFor } from '@kici-dev/sdk';

const result = await waitFor({
  name: 'await-build-artifact',
  check: async () => {
    const artifact = await registry.findArtifact('myapp', 'v1.2.3');
    return artifact ?? null;
  },
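  onSuccess: async (artifact) => ({ digest: artifact.digest }),
  // Without onTimeout, waitFor() throws WaitForTimeoutError instead of
  // returning a 'timed-out' result, and the else branch below never runs.
  onTimeout: async ({ attempts }) => ({ attempts }),
  intervalMs: 5000,
  timeoutMs: 5 * 60 * 1000,
});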

if (result.outcome === 'succeeded') {
  console.log(`Artifact ready: ${result.result.digest} (${result.attempts} polls)`);
} else {
  console.log(`Gave up after ${result.elapsedMs} ms`);
}
```

## `waitForStep(name, options)`

A factory returning an SDK `Step` whose `run` body executes `waitFor(...)` and routes status lines through the step's structured logger.

### Parameters

| Name      | Type                                    | Required | Description                                                                                         |
| --------- | --------------------------------------- | -------- | --------------------------------------------------------------------------------------------------- |
| `name`    | `string`                                | Yes      | Step name. Appears in the run timeline and in log lines.                                            |
| `options` | `Omit<WaitForOptions, 'name' \| 'log'>` | Yes      | Same shape as `waitFor()` minus `name` (already provided) and `log` (provided by the step context). |

### Result

`waitForStep(...)` returns `Step<WaitForResult<TValue, TSuccess, TTimeout>>`. Other steps can consume the result through the standard step output mechanisms.

### Example

```typescript
import { waitForStep, job } from '@kici-dev/sdk';

const awaitMarker = waitForStep('await-marker', {
  check: async () => {
    const stat = await tryStatMarker('/tmp/build-ready');
    return stat ? { path: '/tmp/build-ready' } : null;
  },
  intervalMs: 1000,
  timeoutMs: 60_000,
  onTimeout: async ({ attempts }) => ({ aborted: true, attempts }),
});

export const release = job('release', {
  runsOn: 'linux',
  steps: [awaitMarker],
});
```

If `check()` throws while polling, the error is logged and polling continues — the default `swallowErrors: true` matches the "poll until healthy" pattern. Pass `swallowErrors: false` to fail fast on the first error instead.

## See also

- [Core SDK reference](./core.md) — the `step()`, `job()`, and `workflow()` factories that `waitForStep()` builds on.
- [Idempotent helpers](./idempotent.md) — `idempotent()` and `idempotentStep()` for check / apply patterns.
- [Runtime types](./runtime.md) — `StepContext`, `Logger`, and other surface used inside the helpers.

---

## SDK reference

Source: https://docs.kici.dev/user/sdk-reference/

Reference documentation for `@kici-dev/sdk`. The reference is split across eight pages by topic.

| Page                                                         | Covers                                                                                                                                                                                                                      |
| ------------------------------------------------------------ | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| [Core](./sdk/core.md)                                        | `workflow()`, `job()`, `step()` factory functions and step / job authoring patterns (bare functions, output chaining, `needs`, dynamic groups).                                                                             |
| [Triggers](./sdk/triggers.md)                                | All 22 trigger factories -- GitHub events (`pr`, `push`, `tag`, `comment`, ...), event triggers (`kiciEvent`, `workflowComplete`, `jobComplete`), `genericWebhook`, `schedule`, `lifecycle`, plus branch-pattern semantics. |
| [Rules, matrix, dynamic jobs](./sdk/rules-matrix-dynamic.md) | `rule()`, `skip()`, matrix builds (static + dynamic), and `dynamicJob()` / `dynamicGroup()`.                                                                                                                                |
| [Caching](./sdk/caching.md)                                  | `CacheSpec`, declarative `cache` on jobs/steps, imperative `ctx.cache.restore()` / `ctx.cache.save()`, immutable keys, `restoreKeys` prefix fallback, per-org + per-ref isolation.                                          |
| [Validation & events](./sdk/validation-events.md)            | `validateDag()`, `defineEvent()`, event emission patterns.                                                                                                                                                                  |
| [Runtime](./sdk/runtime.md)                                  | Types index, `StepContext`, secrets, and fixtures.                                                                                                                                                                          |
| [Idempotent helpers](./sdk/idempotent.md)                    | `idempotent()` and `idempotentStep()` — check / apply pattern with typed results on both the skipped and applied branches.                                                                                                  |
| [Wait-for helpers](./sdk/wait-for.md)                        | `waitFor()` and `waitForStep()` — poll a condition on an interval, run an optional success action, recover gracefully on timeout.                                                                                           |

The `@kici-dev/sdk` package re-exports the entire surface from a single entry point. Pick what you need:

```typescript
import { workflow, job, step, pr, push, rule, defineEvent } from '@kici-dev/sdk';
```

For the complete list of every named export (factory functions, triggers, rules, validation, hook factories, types), see the per-topic pages above.

## See also

- [Getting started](getting-started.md) -- install the SDK, write your first workflow, test locally
- [CLI reference](cli-reference.md) -- compile, test, and manage workflows from the command line
- [Workflow patterns](workflow-patterns.md) -- common patterns using the SDK features documented above
- [Secrets management (operator)](../operator/security/secrets.md) -- configure encrypted secret storage and admin API
- [Secrets architecture](../architecture/security/secrets.md) -- encryption model, multi-backend, and data flow
- [State machine](../architecture/execution/state-machine.md) -- how execution states map to the lifecycle of jobs and steps

---

# CLI and authoring

## CLI authentication

Source: https://docs.kici.dev/user/cli-auth/

The KiCI CLI supports three authentication methods: browser-based OAuth (default), device authorization flow (for headless environments), and API key paste (for CI/CD pipelines).

## Authentication methods

### Browser OAuth (default)

The default `kici login` flow:

1. Opens your default browser to the KiCI identity provider
2. You authenticate in the browser
3. The CLI receives a token via localhost callback
4. A personal access token (PAT) is created and stored locally

```bash
kici login
```

The CLI auto-detects headless environments (SSH sessions, CI runners) and switches to device flow automatically.

### Device flow (headless)

For environments without a browser (SSH, remote servers):

```bash
kici login --device
```

This displays a URL and a code. Open the URL on any device, enter the code, and authenticate. The CLI polls for completion.

### API key paste

For CI/CD pipelines and automated environments, paste an API key directly:

```bash
kici login --token kici_sk_abc123...
```

The API key (starts with `kici_sk_`) is passed directly as the flag value and stored in your local config file.

## kici logout

Revoke your PAT and clear local authentication:

```bash
kici logout
```

This:

1. Revokes the PAT on the server (preventing further use)
2. Clears auth fields from the local config file
3. Preserves non-auth settings (endpoint, routing key, connection mode)

## Organization management

### List organizations

```bash
kici org list
```

Shows all organizations you belong to, with your role in each. The active organization is marked with an asterisk.

### Switch active organization

```bash
kici org use <name-or-id>
```

Name matching is case-insensitive. You can also use the organization ID directly.

### Show current organization

```bash
kici org current
```

Displays the currently active organization name and ID.

## Auth status

The `kici status <run-id>` command displays run details including your authentication state:

```bash
kici status <run-id>
```

The auth-related output includes:

- Login state (logged in / not logged in)
- Active organization name
- PAT expiry date and time remaining
- Warning if PAT expires within 7 days
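
The expiry fields in the list above can be derived from a stored expiry date roughly as follows. This is a minimal sketch: `patExpiryStatus` is a hypothetical helper, the 7-day threshold comes from the list, and the rounding and return shape are assumptions rather than the CLI's actual behavior.

```typescript
const DAY_MS = 24 * 60 * 60 * 1000;

// Hypothetical helper. Rounding down to whole days is an assumption.
function patExpiryStatus(
  expiresAt: Date,
  now: Date,
): { daysLeft: number; expired: boolean; warn: boolean } {
  const msLeft = expiresAt.getTime() - now.getTime();
  const expired = msLeft <= 0;
  const daysLeft = expired ? 0 : Math.floor(msLeft / DAY_MS);
  // Warn only for tokens that are still valid but expire within 7 days.
  return { daysLeft, expired, warn: !expired && msLeft <= 7 * DAY_MS };
}
```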

## Personal access tokens

Personal access tokens (PATs) are created automatically when you log in via OAuth. You can also create and manage PATs through the dashboard.

### How PATs work

- **User-scoped**: PATs work across all organizations you belong to
- **120-day default expiry**: Configurable when creating from the dashboard
- **Named per machine**: Each login creates a PAT named after the machine hostname
- **Permission inheritance**: PATs inherit your effective role permissions in each org

### PATs vs API keys

|            | Personal access tokens | API keys        |
| ---------- | ---------------------- | --------------- |
| Scope      | User (cross-org)       | Organization    |
| Prefix     | `kici_pat_`            | `kici_sk_`      |
| Created by | CLI login or dashboard | Dashboard       |
| Expiry     | 120 days (default)     | No expiry       |
| Use case   | Developer CLI access   | CI/CD pipelines |

### Dashboard management

Create, view, and revoke PATs from the dashboard:

1. Click your avatar in the sidebar
2. Select **Account settings**
3. Navigate to the **Personal access tokens** tab

From here you can:

- Create PATs with custom names and expiry periods
- View active PATs with their prefixes and expiry dates
- Revoke PATs that are no longer needed

## Reaching the Platform API directly

The Platform exposes a versioned REST API under `/api/v1/*`. The same endpoints back the dashboard SPA, the `kici` CLI, and any third-party automation. There is no separate "public" surface — the dashboard's API is the API.

### Base URL

| Deployment  | Base URL pattern                                                                                                                                       |
| ----------- | ------------------------------------------------------------------------------------------------------------------------------------------------------ |
| KiCI Cloud  | `https://<your-platform-host>/api/v1/`                                                                                                                 |
| Self-hosted | `https://<orchestrator-host>/<deployment-slug>/api/v1/` (the slug is optional; `KICI_BASE_PATH` may add a prefix when the Platform is reverse-proxied) |

`/api/v1/*` requires authentication (see below). `/health`, `/metrics`, and `/ws` (WebSocket) sit outside that prefix and have their own access posture (`/metrics` is meant for Prometheus scrape, not public exposure).

### Authentication

Every request to `/api/v1/*` carries an `Authorization: Bearer <token>` header. The Platform routes on the prefix:

| Prefix      | Token type                    | Created via                             | Scope            |
| ----------- | ----------------------------- | --------------------------------------- | ---------------- |
| `kici_pat_` | Personal access token         | `kici login` or dashboard               | User (cross-org) |
| `kici_sk_`  | User API key                  | Dashboard → Settings → API keys         | Org              |
| `kici_sa_`  | Service account key           | Dashboard → Settings → Service accounts | Org              |
| (other)     | OIDC JWT or opaque OIDC token | OIDC login (browser SPA)                | User (cross-org) |

JWT and opaque OIDC tokens are validated against the configured OIDC issuer (JWKS for JWTs, the issuer's UserInfo endpoint for opaque ones). All `kici_*` tokens are validated by SHA-256 hash lookup against the Platform DB. See [RBAC: authentication methods](../architecture/security/rbac.md#authentication-methods) for the full model.

> **Note:** `kici_ok_` keys are **not** for the HTTP API — they authenticate orchestrator-to-Platform WebSocket connections only. Use `kici_sk_` (or `kici_pat_`) for HTTP calls.
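
For client code that needs to know which kind of credential it is holding, the prefix routing in the table can be mirrored like this. It is a rough client-side sketch, not the Platform's implementation: `classifyToken` and `bearerHeader` are hypothetical names, and `kici_ok_` keys are not modeled because they are not valid for the HTTP API.

```typescript
type TokenKind = 'pat' | 'user-api-key' | 'service-account' | 'oidc';

// Prefixes come from the table above; the function itself is illustrative.
function classifyToken(token: string): TokenKind {
  if (token.startsWith('kici_pat_')) return 'pat';
  if (token.startsWith('kici_sk_')) return 'user-api-key';
  if (token.startsWith('kici_sa_')) return 'service-account';
  // "(other)" row: an OIDC JWT or an opaque OIDC token.
  return 'oidc';
}

// Every /api/v1/* request carries the token in this header.
function bearerHeader(token: string): { Authorization: string } {
  return { Authorization: `Bearer ${token}` };
}
```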

### Permissions

Tokens authenticate; RBAC authorizes. Every org-scoped route runs `orgContextMiddleware` (verifies you are a member of the target org) followed by `requirePermission(resource, level)`. The 15 resources and 5 levels are documented in [RBAC](../architecture/security/rbac.md#permission-model). User API keys carry their own permission matrix bounded above by the creator's effective permissions; PATs inherit the user's role permissions (or are capped further by their `scopes` field).
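
The "bounded above" rule for user API keys amounts to a per-resource minimum. The sketch below illustrates that idea only: levels are modeled as integers 0 to 4, the resource names are placeholders, and `capPermissions` is a hypothetical helper. The real resources and level names are listed on the RBAC page.

```typescript
// Levels are integers 0..4 purely for illustration (an assumption).
type PermissionMatrix = Record<string, number>;

// Cap each requested level at the creator's effective level for that resource.
// A resource the creator has no entry for is treated as level 0.
function capPermissions(requested: PermissionMatrix, creator: PermissionMatrix): PermissionMatrix {
  const capped: PermissionMatrix = {};
  for (const [resource, level] of Object.entries(requested)) {
    capped[resource] = Math.min(level, creator[resource] ?? 0);
  }
  return capped;
}
```

With this shape, a key can never end up with more access than the user who created it, whatever matrix was requested.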

### Configurable surfaces

The dashboard is a browser SPA on top of the same `/api/v1/*` surface, so anything you can configure in the dashboard you can configure over HTTP. The mounted route groups include:

- **Auth & identity:** `/cli/exchange-token`, `/pats`, `/user`, `/identity-links`, `/github-oauth`, `/invites`, `/invites/pending`, `/invites/:inviteId/{accept,decline}`
- **Org & membership:** `/orgs`, `/orgs/:customerId`, `/orgs/:customerId/{members,roles,api-keys,orchestrator-keys,service-accounts,billing,trust-policies}`
- **Workflows & runs:** `/orgs/:customerId/{runs,registrations,workflows,held-runs,environments,secrets,global-workflows}`
- **Webhooks & event log:** `/orgs/:customerId/{sources,webhook-endpoints,event-log}`
- **Diagnostics & activity:** `/orgs/:customerId/{diagnostics,activity,access-log}`
- **Admin (kici-admin org only):** `/admin/{orgs,connections,audit-log,grafana/*}`

The full route tree is the source of truth — every method, request schema, and response schema is enumerated server-side. There is currently no auto-generated OpenAPI spec; the typed `DashboardApiType` export is the canonical contract for TypeScript clients.

### Calling the API

Two short examples — adapt the base URL and token to your deployment.

**curl (PAT or API key):**

```bash
TOKEN="$(grep -E '^pat=' ~/.kici/config | cut -d= -f2)"   # or paste a kici_sk_…
ORG="<your-org-id>"
curl -sS \
  -H "Authorization: Bearer $TOKEN" \
  "https://<orchestrator-host>/<deployment-slug>/api/v1/orgs/$ORG/runs?limit=5" | jq
```

**Browser console (after dashboard login):**

```js
const ns = Object.keys(localStorage).find((k) => k.startsWith('oidc.user:'));
const { access_token } = JSON.parse(localStorage.getItem(ns));
const res = await fetch('/<deployment-slug>/api/v1/orgs/<your-org-id>/runs?limit=5', {
  headers: { Authorization: `Bearer ${access_token}` },
});
console.log(await res.json());
```
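
The same request can be assembled from TypeScript. The helper below is a sketch that only builds the URL and headers used by the curl example; `buildRunsRequest` and `ApiRequest` are hypothetical names, and the host, slug, and org ID in the usage note are placeholders.

```typescript
interface ApiRequest {
  url: string;
  headers: Record<string, string>;
}

// Hypothetical helper: builds the "list runs" request from the curl example.
function buildRunsRequest(baseUrl: string, orgId: string, token: string, limit = 5): ApiRequest {
  // Drop trailing slashes so the join produces exactly one separator.
  const base = baseUrl.replace(/\/+$/, '');
  return {
    url: `${base}/orgs/${encodeURIComponent(orgId)}/runs?limit=${limit}`,
    headers: { Authorization: `Bearer ${token}` },
  };
}
```

Pass the result to any HTTP client, for example `fetch(req.url, { headers: req.headers })`.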

### Rate limits and body size

There is currently no per-token rate limit on `/api/v1/*`. A single global body-size cap applies to webhook ingress and dashboard API requests alike.

### Audit trail

Every `/api/v1/*` mutation that touches tenant-plane data is recorded in the upstream tenant-plane audit log, stamped with the actor (user, API key, service account, or upstream operator on a break-glass support read). Reads on customer data go through the orchestrator over the WebSocket proxy and land in the orchestrator's `access_log` table. See [Audit log](../operator/security/audit-log.md) for the orchestrator schema and the dashboard's "Activity" page for the federated view.

## Token storage

The CLI stores authentication data in `~/.kici/config` with `0600` permissions (owner read/write only). The config file contains:

- PAT token
- PAT expiry date
- Active organization ID
- Server endpoint URL

## Troubleshooting

### Browser doesn't open

If `kici login` can't open a browser:

- Use `kici login --device` for the device flow
- Or set the `KICI_BROWSER_CMD` environment variable to your browser command (e.g., `KICI_BROWSER_CMD='firefox {url}'`)

### Device flow timeout

The device flow has a 5-minute timeout. If it expires:

- Run `kici login --device` again to get a new code
- Ensure you're using the correct URL displayed by the CLI

### Expired PAT

If you see "Personal access token has expired":

- Run `kici login` to create a new PAT
- The old expired PAT is automatically superseded

### "Not a member" errors

If authenticated commands return 403:

- Check your active org: `kici org current`
- List available orgs: `kici org list`
- Switch to the correct org: `kici org use <name>`

### Connection refused

If the CLI can't reach the server:

- Verify the endpoint: check `~/.kici/config` for the correct URL
- Test connectivity: `curl <your-platform-url>/health`

---

## CLI reference

Source: https://docs.kici.dev/user/cli-reference/

The `@kici-dev/compiler` package provides the `kici` CLI for compiling, testing, and managing workflows.

## Installation

```bash
pnpm add -D @kici-dev/compiler
```

The examples use pnpm, but npm and yarn work too — `npm install -D @kici-dev/compiler` or `yarn add -D @kici-dev/compiler`.

Run commands with `npx kici` or add scripts to your `package.json`:

```json
{
  "scripts": {
    "kici:compile": "kici compile",
    "kici:test": "kici test"
  }
}
```

## Commands

### kici compile

Compile workflows from `.kici/workflows/` to `kici.lock.json`.

```bash
kici compile [options]
```

**Options:**

| Option              | Default | Description                                  |
| ------------------- | ------- | -------------------------------------------- |
| `--check`           | `false` | Validate workflows without writing lock file |
| `--watch`           | `false` | Watch for changes and recompile              |
| `--kici-dir <path>` | `.kici` | Path to .kici directory                      |
| `--verbose`         | `false` | Detailed output                              |

**Examples:**

```bash
# Compile all workflows
kici compile

# Validate only (CI-friendly, no file writes)
kici compile --check

# Watch mode for development
kici compile --watch

# Custom .kici directory location
kici compile --kici-dir packages/app/.kici

# Verbose output for debugging
kici compile --verbose
```

**Exit codes:**

| Code | Meaning                     |
| ---- | --------------------------- |
| 0    | Compilation successful      |
| 1    | Compilation failed (errors) |

The `--check` flag is useful in CI pipelines and pre-commit hooks. It validates that workflows are syntactically and semantically correct without writing the lock file or any other files.

**Auto-type regeneration:** When authenticated (via `kici login`), `kici compile` automatically refreshes `.kici/types/secrets.d.ts` after each successful compilation. This keeps type declarations in sync with your orchestrator's secret contexts. The type regeneration is non-blocking -- if the orchestrator is unreachable, compilation still succeeds with a warning. The `--check` flag skips type regeneration since no files are written.

### kici run

Execute workflows locally or remotely. The `run` command has two subcommands: `local` for direct execution without infrastructure, and `remote` for fixture-based execution through an orchestrator.

#### kici run local

Execute workflows locally without orchestrator infrastructure. Compiles workflows, matches triggers against the specified event, expands matrices, and runs jobs with DAG-based parallel scheduling.

```bash
kici run local [event] [options]
```

**Arguments:**

| Argument | Required               | Description                                      |
| -------- | ---------------------- | ------------------------------------------------ |
| `event`  | when `--pick` is unset | Event type (e.g., `push`, `pr:open`, `schedule`) |

**Options:**

| Option              | Default   | Description                                                                                                                                                                                                       |
| ------------------- | --------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `-p, --pick`        | `false`   | Interactively pick a workflow + trigger (see below)                                                                                                                                                               |
| `--workflow <name>` | none      | Run only the specified workflow (mutex with `--pick`)                                                                                                                                                             |
| `--job <name>`      | none      | Run only the specified job (and its dependencies)                                                                                                                                                                 |
| `--branch <name>`   | detected  | Override detected git branch                                                                                                                                                                                      |
| `--sha <hash>`      | detected  | Override detected git SHA                                                                                                                                                                                         |
| `--payload <path>`  | none      | Path to explicit event payload JSON file                                                                                                                                                                          |
| `--concurrency <n>` | CPU cores | Max parallel jobs **within one run** (job-level only). Cross-run [concurrency groups](concurrency.md) declared in `workflow({ concurrency: ... })` are enforced separately — see "Concurrency enforcement" below. |
| `--keep-going`      | `false`   | Continue after job failure                                                                                                                                                                                        |
| `--container`       | `false`   | Use Podman container isolation                                                                                                                                                                                    |
| `--env <KEY=VALUE>` | none      | Environment variable override (repeatable)                                                                                                                                                                        |
| `--files <path>`    | git diff  | Override changed file paths (repeatable, default: git diff)                                                                                                                                                       |
| `--quiet`           | `false`   | Suppress streaming output (summary only)                                                                                                                                                                          |
| `--json`            | `false`   | Output structured JSON result                                                                                                                                                                                     |
| `--junit <path>`    | none      | Output JUnit XML result to file                                                                                                                                                                                   |
| `--debug`           | `false`   | Verbose internals                                                                                                                                                                                                 |
| `--kici-dir <path>` | `.kici`   | Path to .kici directory                                                                                                                                                                                           |
| `--in-place`        | `false`   | Run against the real working directory instead of an isolated tmp checkout (see "Execution isolation" below)                                                                                                      |
| `--keep`            | `false`   | Always retain the isolated tmp checkout (default: keep only on failure)                                                                                                                                           |

**Interactive workflow selection (`--pick` / `-p`):**

When you do not remember the event arg for a workflow, pass `--pick` (or `-p`) to open an interactive picker. It lists every workflow with a compact summary of its triggers, lets you choose one, and (for multi-trigger workflows) prompts again for which trigger to simulate. The selected trigger is converted back into an event arg and fed through the normal pipeline.

```bash
# Open the picker across all triggerable workflows
kici run local --pick

# Scope the picker to a trigger family (e.g. only workflows that react to pr:*)
kici run local pr:open --pick
```

Rules:

- `--pick` is mutually exclusive with `--workflow`. Passing both exits with code 2.
- When `stdin` is not a TTY, `--pick` prints the available workflows and exits without running anything — fall back to `kici run local <event> --workflow <name>` in scripts.
- Passing an event arg together with `--pick` narrows the picker to workflows that declare at least one trigger in that event family (e.g. `schedule --pick` shows only scheduled workflows).
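
The rules above can be read as a small decision function. The sketch below is a reading aid, not the CLI's code: `resolvePick`, `PickDecision`, and the error message text are hypothetical, and only the exit code 2 for the `--pick` / `--workflow` conflict comes from the rules themselves.

```typescript
type PickDecision =
  | { kind: 'error'; exitCode: 2; message: string }
  | { kind: 'list-only' } // non-TTY: print workflows, run nothing
  | { kind: 'interactive'; eventFilter?: string }
  | { kind: 'direct' }; // no picker, normal pipeline

function resolvePick(opts: {
  pick: boolean;
  workflow?: string;
  event?: string;
  stdinIsTTY: boolean;
}): PickDecision {
  if (opts.pick && opts.workflow !== undefined) {
    // Message wording is illustrative.
    return { kind: 'error', exitCode: 2, message: '--pick and --workflow are mutually exclusive' };
  }
  if (!opts.pick) return { kind: 'direct' };
  if (!opts.stdinIsTTY) return { kind: 'list-only' };
  // An event arg narrows the picker to that trigger family.
  return { kind: 'interactive', eventFilter: opts.event };
}
```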

**Concurrency enforcement:**

When the workflow declares a `concurrency` block, `kici run local` enforces it across concurrent local invocations on the same machine and user account. The behavior mirrors the orchestrator:

- The `group` callback is evaluated against the simulated event (same `{ branch, event }` context that the agent sees), and the resulting key is used as the lock identity. Throwing from `group` aborts the workflow run with a clear error — there is no fallback to the workflow name.
- `cancelInProgress: true` interrupts the holder via `SIGTERM`, then escalates to `SIGKILL` after a grace window if the holder does not exit, and proceeds with the new run.
- Otherwise the new invocation waits in FIFO order. A status line is printed when the wait starts and roughly every five seconds thereafter.
- Locks live under `$XDG_RUNTIME_DIR/kici-local-locks/` on Linux, falling back to `os.tmpdir()/kici-local-locks-<uid>/` on platforms without a per-user runtime dir. Each lock file records the holder PID, hostname, workflow name, group key, and start timestamp so concurrent invocations can describe what they are waiting for.
- Stale locks (the recorded holder PID is gone, per `process.kill(pid, 0)`) are reclaimed automatically.
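
The stale-lock probe mentioned in the last bullet relies on signal `0`, which checks for a process without delivering a signal. A minimal Node.js sketch of that check follows. `isProcessAlive` is a hypothetical name, and treating `EPERM` as "alive" is an assumption about how such a probe is usually written, not a statement about KiCI's code.

```typescript
// Probe whether a PID exists. Signal 0 performs the existence and
// permission check only; no signal is actually sent.
function isProcessAlive(pid: number): boolean {
  try {
    process.kill(pid, 0);
    return true;
  } catch (err) {
    // EPERM: the process exists but belongs to another user.
    // ESRCH (or anything else): treat the holder as gone.
    return (err as { code?: string }).code === 'EPERM';
  }
}
```

A lock whose recorded holder fails this probe can be reclaimed. PID reuse can make a dead holder look alive, and a probe of this kind cannot detect that on its own.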

Coordination is local only — running the same workflow on two different machines does not serialize across them. That requires the orchestrator.

The `SIGTERM`-to-`SIGKILL` grace window defaults to 30 000 ms. Override it with the `KICI_LOCAL_LOCK_KILL_GRACE_MS` environment variable (positive integer, milliseconds) when iterating on workflows that need longer to clean up on cancellation.
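
Resolving the grace window from the environment could look like the sketch below. The variable name and the 30 000 ms default come from the paragraph above. `resolveKillGraceMs` is a hypothetical helper, and falling back to the default on an invalid value is an assumption (the CLI may reject it instead).

```typescript
const DEFAULT_KILL_GRACE_MS = 30_000;

// Accepts only positive integers; anything else yields the default.
// That fallback is an assumption, not documented CLI behavior.
function resolveKillGraceMs(env: Record<string, string | undefined>): number {
  const raw = env['KICI_LOCAL_LOCK_KILL_GRACE_MS']?.trim();
  if (raw === undefined || !/^\d+$/.test(raw)) return DEFAULT_KILL_GRACE_MS;
  const parsed = Number.parseInt(raw, 10);
  return parsed > 0 ? parsed : DEFAULT_KILL_GRACE_MS;
}
```

In a Node.js program the argument would typically be `process.env`.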

**Execution isolation:**

By default, `kici run local` executes steps inside an **isolated tmp checkout** rather than against your real working directory. Any file a step writes, builds, or deletes — and any `git` mutation a step performs — lands in that throwaway copy, so casual local runs never touch your tree.

The isolated checkout contains exactly what `kici run remote` reconstructs: your current working tree minus gitignored files, with `.kiciignore` applied to local changes, over a real `.git` directory. Concretely, the checkout is built from a clone pinned to your current `HEAD`; your local overlay (modified, staged, and untracked-but-not-ignored files) is then copied on top, and locally deleted files are removed. Workflows that read git metadata work because the `.git` directory is present and pinned to your `HEAD`.

The path is logged at run start (for example, `running in /tmp/kici-run-ab12cd`) so you can inspect it.

Cleanup policy:

- On a fully successful run, the isolated checkout is removed.
- On failure, it is retained and its path is logged so you can inspect the failed state.
- `--keep` always retains it, even on success.
- Retained checkouts are garbage-collected after 72 hours by the next `kici run local` invocation — copy a checkout elsewhere if you need it longer.
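
The cleanup policy above reduces to two small decisions, sketched here. The names `shouldRetainCheckout` and `isCollectable` are hypothetical, and treating exactly 72 hours as already collectable is an assumption about the boundary.

```typescript
const RETENTION_MS = 72 * 60 * 60 * 1000; // 72 hours

interface RunResult {
  succeeded: boolean; // true only for a fully successful run
  keep: boolean; // true when --keep was passed
}

// Retain the checkout on failure, or whenever --keep was passed.
function shouldRetainCheckout(run: RunResult): boolean {
  return run.keep || !run.succeeded;
}

// A retained checkout is eligible for garbage collection once it is 72 hours old.
// Assumption: the boundary itself counts as expired.
function isCollectable(retainedAtMs: number, nowMs: number): boolean {
  return nowMs - retainedAtMs >= RETENTION_MS;
}
```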

Set the `KICI_RUN_DIR` environment variable to place the isolated checkout under a base directory other than the system temp directory.

Secrets are always sourced from your real `.kici/` directory, not from the isolated checkout. Gitignored secret files (such as `.kici/.env.local` and `.kici/secrets.yaml`) are never copied into the checkout, so a step that reads a secret still gets it from the original location.

Pass `--in-place` to run against the real working directory instead, which is useful when you explicitly want in-tree execution. `--in-place` does not require a git repository. The default isolated mode does, and it fails with an actionable error pointing at `--in-place` when the directory is not a git repository.

**Examples:**

```bash
# Run workflows matching a push event
kici run local push

# Run only a specific workflow
kici run local push --workflow ci

# Run only a specific job (and its dependencies)
kici run local push --job test

# JSON output for CI scripting
kici run local push --json

# JUnit XML for CI integration
kici run local push --junit results.xml

# Quiet mode (summary only, no streaming)
kici run local push --quiet

# Override branch and SHA
kici run local push --branch main --sha abc1234

# Environment variable overrides
kici run local push --env NODE_ENV=test --env CI=true

# Continue running other jobs after one fails
kici run local push --keep-going
```

**Exit codes:**

| Code | Meaning                 |
| ---- | ----------------------- |
| 0    | All workflows succeeded |
| 1    | One or more jobs failed |

**Output formats:**

- **Default:** Streaming job output during execution, followed by a tree-format summary with per-step timing
- **`--json`:** Structured JSON with workflows, jobs, steps, timing, and matrix values
- **`--junit <path>`:** Standard JUnit XML for CI integration (Jenkins, GitLab, etc.)
- **`--quiet`:** Summary only, no streaming output during execution

#### kici run remote

Execute fixtures remotely through the full CI pipeline. Fixtures are defined in `.kici/tests/*.ts` using the `fixture()` factory function. Without arguments, lists available fixtures.

Requires `kici login` (an authenticated session) and a target orchestrator that has **cache storage configured** (`KICI_STORAGE_TYPE` = `s3` or `filesystem`) — the command uploads your working-tree overlay to that storage for the agent to fetch. The quickstart orchestrators do not enable storage by default; see the [testing guide](testing-guide.md) for setup (including non-public / self-hosted S3 endpoints).

```bash
kici run remote [fixture] [options]
```

**Arguments:**

| Argument  | Required | Description                                     |
| --------- | -------- | ----------------------------------------------- |
| `fixture` | no       | Fixture name or glob pattern (omit to list all) |

**Options:**

| Option                      | Default | Description                                                                                                           |
| --------------------------- | ------- | --------------------------------------------------------------------------------------------------------------------- |
| `--all`                     | `false` | Run all fixtures                                                                                                      |
| `--workflow <name>`         | none    | Run a specific workflow directly (bypass triggers)                                                                    |
| `--parallel`                | `false` | Run multiple fixtures concurrently                                                                                    |
| `--no-wait`                 | -       | Fire and forget (print runIds, don't stream)                                                                          |
| `--quiet`                   | `false` | Minimal output (only final result)                                                                                    |
| `--json`                    | `false` | Machine-readable JSON output                                                                                          |
| `--junit <path>`            | none    | JUnit XML output to file for CI integration                                                                           |
| `--history`                 | `false` | Show table of recent test runs                                                                                        |
| `--routing-key <key>`       | none    | Override routing key for this run                                                                                     |
| `--context <ctx.key=value>` | none    | Inject a namespaced context secret, uploaded encrypted to the orchestrator (repeatable)                               |
| `--env <KEY=VALUE>`         | none    | Provide a per-run secret, uploaded encrypted to the orchestrator (repeatable) — see [testing guide](testing-guide.md) |
| `--debug`                   | `false` | Verbose internals                                                                                                     |
| `--kici-dir <path>`         | `.kici` | Path to .kici directory                                                                                               |

**Examples:**

```bash
# List available fixtures
kici run remote

# Run a single fixture
kici run remote push-main

# Run all push-related fixtures (quoted so the shell does not expand the glob)
kici run remote 'push-*'

# Run everything
kici run remote --all

# Run a specific workflow directly (bypass trigger matching)
kici run remote --workflow ci

# Quiet mode -- just pass/fail
kici run remote push-main --quiet

# JSON output for scripting
kici run remote push-main --json

# Fire and forget
kici run remote push-main --no-wait

# View recent test run history
kici run remote --history
```

**Exit codes:**

| Code | Meaning                      |
| ---- | ---------------------------- |
| 0    | All matched workflows passed |
| 1    | One or more workflows failed |

#### Fresh repos (no GitHub remote)

`kici run remote` works even if the repo has never been pushed to GitHub. When no remote is detected:

- The entire repo content is uploaded (not just a diff overlay)
- A synthetic routing key `local:<repo-name>` is used
- The lock file is sent inline (no GitHub API fetch)
- Steps that use git commands will fail (no `.git` directory in the remote workspace)
- Build cache (`__build__` jobs) is skipped for local repos
- Environments must have `allowLocalExecution: true` to be accessible from local runs (default is `false`)

For a detailed guide on writing fixtures, configuring secrets, and understanding the upload flow, see [Testing guide](testing-guide.md).

### kici test

Preview which workflows match a trigger event (dry-run, no execution). Useful for verifying trigger configurations during development.

```bash
kici test [event] [options]
```

**Arguments:**

| Argument | Required | Description                                                 |
| -------- | -------- | ----------------------------------------------------------- |
| `event`  | no       | Event type to preview (e.g., `push`, `pr:open`, `schedule`) |

**Options:**

| Option                      | Default | Description                                                  |
| --------------------------- | ------- | ------------------------------------------------------------ |
| `--workflow <name>`         | none    | Filter to specific workflow                                  |
| `--job <name>`              | none    | Filter to specific job                                       |
| `--branch <name>`           | `main`  | Override target branch for trigger matching                  |
| `--sha <hash>`              | none    | Override commit SHA                                          |
| `--files <path>`            | none    | Simulate changed file path for trigger matching (repeatable) |
| `--secret <key=value>`      | none    | Inject flat secret (repeatable)                              |
| `--context <ctx.key=value>` | none    | Inject context secret (repeatable)                           |
| `--debug`                   | `false` | Verbose internals                                            |
| `--kici-dir <path>`         | `.kici` | Path to .kici directory                                      |

**Examples:**

```bash
# Preview which workflows match a push event
kici test push

# Preview PR trigger matching
kici test pr:open

# Preview with branch override
kici test push --branch develop

# Filter to specific workflow
kici test push --workflow ci

# Simulate changed files for path-filtered triggers
kici test push --files src/index.ts --files README.md
```

**Exit codes:**

| Code | Meaning                                    |
| ---- | ------------------------------------------ |
| 0    | Preview completed (including zero matches) |
| 1    | Error                                      |

**Migration from old `kici test <fixture>`:** If you were using `kici test <fixture-name>` for remote fixture execution, use `kici run remote <fixture-name>` instead. For local workflow execution, use `kici run local <event>`.

### kici login

Authenticate with KiCI via browser-based OAuth (default) or API key (`--token`).

By default, `kici login` opens your browser for OIDC authentication using PKCE. In headless environments (SSH, CI, containers), it automatically switches to the RFC 8628 device authorization flow where you visit a URL and enter a code.

After OAuth, the CLI exchanges the OIDC token for a personal access token (PAT) stored in the config directory (`~/.kici/config` by default, overridable with `KICI_CONFIG_DIR`).
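
For readers unfamiliar with PKCE, the verifier and challenge pair it relies on can be sketched as below. This follows the generic `S256` method from RFC 7636 and is not taken from the KiCI source. The function names are hypothetical, and the 32-byte verifier length is an assumption (any length within the RFC's limits works).

```typescript
import { createHash, randomBytes } from 'node:crypto';

// RFC 7636: the verifier is a high-entropy random string.
// 32 random bytes encode to 43 base64url characters.
function createCodeVerifier(): string {
  return randomBytes(32).toString('base64url');
}

// RFC 7636 S256: challenge = BASE64URL(SHA-256(ASCII(verifier))).
function codeChallengeFor(verifier: string): string {
  return createHash('sha256').update(verifier, 'ascii').digest('base64url');
}
```

The client sends the challenge with the authorization request and the verifier with the token request, so an intercepted authorization code cannot be redeemed without the verifier.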

```bash
kici login [options]
```

**Options:**

| Option                      | Default | Description                                    |
| --------------------------- | ------- | ---------------------------------------------- |
| `--token <key>`             | none    | API key for direct authentication (legacy)     |
| `--device`                  | `false` | Force device authorization flow (headless/SSH) |
| `--endpoint <url>`          | none    | Orchestrator URL for direct connection         |
| `--platform-endpoint <url>` | none    | Platform relay URL                             |
| `--routing-key <key>`       | none    | Routing key for webhook source identification  |

**Environment variables:**

| Variable              | Default                                      | Description                                                            |
| --------------------- | -------------------------------------------- | ---------------------------------------------------------------------- |
| `KICI_PLATFORM_URL`   | `https://api.kici.dev`                       | Platform API base URL (override for a self-hosted Platform)            |
| `KICI_OIDC_ISSUER`    | `https://auth.kici.dev/realms/kici-internal` | OIDC issuer URL (override for a self-hosted Platform)                  |
| `KICI_OIDC_CLIENT_ID` | `kici-cli`                                   | OIDC client ID (override for a self-hosted Platform)                   |
| `KICI_BROWSER_CMD`    | uses `open` package                          | Custom browser command with `{url}` placeholder, or `none` to suppress |
| `KICI_CALLBACK_PORT`  | random                                       | Fixed port for OAuth PKCE callback server                              |
| `KICI_CONFIG_DIR`     | `~/.kici`                                    | Override config directory                                              |

**Examples:**

```bash
# Browser-based OAuth login (default)
kici login

# Force device flow (for SSH/headless)
kici login --device

# Legacy API key login
kici login --token kici_sk_abc123...

# Direct connection to orchestrator with API key
kici login --token kici_sk_abc123... --endpoint https://my-orchestrator.example.com

# Platform relay connection with routing key
kici login --token kici_sk_abc123... --platform-endpoint https://platform.kici.dev --routing-key github:42

# Suppress browser opening (print authorize URL to stdout)
KICI_BROWSER_CMD=none kici login

# Use custom browser command
KICI_BROWSER_CMD='firefox {url}' kici login

# Fixed callback port and custom config directory
KICI_CALLBACK_PORT=19876 KICI_CONFIG_DIR=/tmp/kici-test kici login
```

**Headless detection:** The CLI automatically detects headless environments by checking for `SSH_CLIENT`, `SSH_TTY`, `CI`, `GITHUB_ACTIONS`, `GITLAB_CI`, `container`, or `DOCKER_CONTAINER` env vars, and on Linux, the absence of `DISPLAY` and `WAYLAND_DISPLAY`.
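
The rule above can be expressed as a small predicate. This is a sketch only: `isHeadless` is a hypothetical name, and treating an empty-string variable as unset is an assumption.

```typescript
type Env = Record<string, string | undefined>;

// Variables whose presence indicates a headless environment, per the text above.
const HEADLESS_MARKERS = [
  'SSH_CLIENT',
  'SSH_TTY',
  'CI',
  'GITHUB_ACTIONS',
  'GITLAB_CI',
  'container',
  'DOCKER_CONTAINER',
];

function isHeadless(env: Env, platform: string): boolean {
  // Any marker variable being set is enough.
  if (HEADLESS_MARKERS.some((name) => Boolean(env[name]))) return true;
  // On Linux, no X11 display and no Wayland display also counts as headless.
  if (platform === 'linux') return !env.DISPLAY && !env.WAYLAND_DISPLAY;
  return false;
}
```

Called as `isHeadless(process.env, process.platform)`, a `true` result would correspond to the case where the CLI switches to the device authorization flow.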

### kici logout

Revoke your personal access token on the server and clear local credentials.

If the server is unreachable, local credentials are still cleared (the PAT will expire automatically). Non-auth config fields (endpoint, routing key, etc.) are preserved.

```bash
kici logout
```

**Examples:**

```bash
# Log out and revoke PAT
kici logout
```

### kici org

Manage organization context. Requires a PAT (run `kici login` first).

#### kici org list

List organizations you belong to. The active org is marked with an asterisk (`*`).

```bash
kici org list
```

**Example output:**

```
Organizations:

  * Personal          (owner)  abc123def456
    My team           (admin)  xyz789ghi012
```

#### kici org use

Switch the active organization by name (case-insensitive) or ID.

```bash
kici org use <name>
```

**Arguments:**

| Argument | Required | Description             |
| -------- | -------- | ----------------------- |
| `name`   | yes      | Organization name or ID |

**Examples:**

```bash
# Switch by name
kici org use "My team"

# Switch by ID
kici org use xyz789ghi012
```

#### kici org current

Show the current active organization.

```bash
kici org current
```

### kici status

Show details for a specific test run. Fetches from the orchestrator with fallback to local history.

The status output includes an **auth section** showing login state, active organization, and PAT expiry. A warning appears when the PAT expires within 7 days.

For a failed run, the output prints a `Reason:` line with the run's failure reason and shows the failed job's error inline, so you can see why a run failed without opening the dashboard. When agent provisioning fails before any step runs, this reason is the captured scaler error (for example, a missing binary or an unpullable image) rather than a generic "no agents available" message.

```bash
kici status <run-id> [options]
```

**Arguments:**

| Argument | Required | Description    |
| -------- | -------- | -------------- |
| `run-id` | yes      | Run identifier |

**Options:**

| Option         | Default | Description                                                  |
| -------------- | ------- | ------------------------------------------------------------ |
| `--logs`       | `false` | Stream logs (live for active runs, historical for completed) |
| `--job <name>` | all     | Filter logs to a specific job                                |
| `--json`       | `false` | Machine-readable JSON output                                 |

**Examples:**

```bash
# Show run summary
kici status abc123

# Show full logs (historical for completed runs, live streaming for active runs)
kici status abc123 --logs

# Show logs for a specific job
kici status abc123 --logs --job build

# Machine-readable output
kici status abc123 --json
```

When `--json` is set, `kici` emits only the JSON document on stdout (the `kici v<version>` banner is suppressed), so the output is safe to pipe into `jq` or `JSON.parse`. The same holds for the other `--json` commands (`kici run remote --json`, `kici workflows list --json`) and for `--quiet`.

### kici cancel

Cancel a running workflow or all runs on a branch.

```bash
kici cancel [run-id] [options]
```

**Arguments:**

| Argument | Required | Description      |
| -------- | -------- | ---------------- |
| `run-id` | no       | Run ID to cancel |

**Options:**

| Option            | Default | Description                                 |
| ----------------- | ------- | ------------------------------------------- |
| `--force`         | `false` | Force cancel (kill immediately, skip hooks) |
| `--branch <name>` | none    | Cancel all in-progress runs on this branch  |

**Examples:**

```bash
# Cancel a specific run
kici cancel abc123

# Force cancel (kill immediately)
kici cancel abc123 --force

# Cancel all runs on a branch
kici cancel --branch feature/wip
```

### kici secrets list

List secret contexts available for test runs. Shows context names and key names (not values).

```bash
kici secrets list [options]
```

**Options:**

| Option             | Default | Description               |
| ------------------ | ------- | ------------------------- |
| `--endpoint <url>` | none    | Orchestrator URL override |

Each "context" corresponds to an environment configured on the orchestrator. The output lists every environment whose `allowLocalExecution` flag is `true` (the gate that lets CLI-initiated test runs resolve secrets through that environment), along with the secret key names reachable from the environment's bound scopes.

Only key names are shown — secret values are never returned over this endpoint.

**Prerequisites:** authenticate via `kici login` and select an active organization with `kici org use <name>`.

### kici types

Generate TypeScript declaration files from orchestrator environment metadata. The generated `.d.ts` file augments the SDK's `KnownSecretKeys` and `EnvironmentSecrets` interfaces, providing compile-time autocomplete and type checking for secret key names.

```bash
kici types [options]
```

**Options:**

| Option              | Default | Description               |
| ------------------- | ------- | ------------------------- |
| `--kici-dir <path>` | `.kici` | Path to .kici directory   |
| `--endpoint <url>`  | none    | Orchestrator URL override |

**Prerequisites:** Must be authenticated via `kici login`.

**Output:** `.kici/types/secrets.d.ts`

**Examples:**

```bash
# Generate types from orchestrator
kici types

# Use custom .kici directory
kici types --kici-dir packages/app/.kici

# Override orchestrator endpoint
kici types --endpoint https://my-orchestrator.example.com
```

**How it works:**

1. Fetches all environment metadata (environment names and secret key names) from the orchestrator
2. Generates a `.d.ts` file that augments `@kici-dev/sdk`'s `KnownSecretKeys` and `EnvironmentSecrets` interfaces
3. Writes the file to `.kici/types/secrets.d.ts`

After generating types, `ctx.secrets.get('MY_KEY')` and `ctx.secrets.expose('DB_HOST')` gain autocomplete and type checking in your IDE.
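
The autocomplete comes from TypeScript interface merging. The self-contained sketch below imitates the mechanism with a local stand-in for `KnownSecretKeys`. The property shape, the `getSecret` helper, and the sample values are all hypothetical, and the real generated file presumably augments the `@kici-dev/sdk` module rather than a local interface.

```typescript
// Stand-in for the SDK's extensible interface (hypothetical shape).
interface KnownSecretKeys {}

// What a generated augmentation might contribute: one property per secret key name.
interface KnownSecretKeys {
  MY_KEY: string;
  DB_HOST: string;
}

// The merged interface yields the union 'MY_KEY' | 'DB_HOST'.
type SecretKey = keyof KnownSecretKeys;

// Hypothetical typed accessor: only known key names type-check.
function getSecret(store: Record<string, string>, key: SecretKey): string {
  const value = store[key];
  if (value === undefined) throw new Error(`secret ${key} is not set`);
  return value;
}

const sampleSecrets = { MY_KEY: 'abc', DB_HOST: 'db.internal' };
const host = getSecret(sampleSecrets, 'DB_HOST'); // accepted by the compiler
// getSecret(sampleSecrets, 'DB_HSOT'); // rejected at compile time: not a known key
```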

**Git workflow:** Commit the generated `.kici/types/secrets.d.ts` so team members get type checking without needing orchestrator access. Run `kici types` to refresh when environments change.

**Auto-regeneration:** `kici compile` automatically runs `kici types` after successful compilation when authenticated. See the [kici compile](#kici-compile) section for details.

**Escape hatch:** For dynamic keys not in the generated types, use a cast: `(ctx.secrets as any).DYNAMIC_KEY`.

### kici fixture

Generate a fixture template for an event type. Useful for creating custom test payloads.

```bash
kici fixture <event> [options]
```

**Arguments:**

| Argument | Required | Description                   |
| -------- | -------- | ----------------------------- |
| `event`  | yes      | Event to generate fixture for |

**Valid events:** `pr:open`, `pr:sync`, `pr:close`, `pr:reopen`, `push`, `tag`, `comment`, `review`, `review_comment`, `release`, `dispatch`, `create`, `delete`, `status`, `workflow_run`, `fork`, `star`, `watch`, `kici_event`, `workflow_complete`, `job_complete`, `generic_webhook`, `schedule`, `lifecycle` (many support `:action` suffixes, e.g. `comment:edited`, `release:published`, `lifecycle:workflow_complete`). `webhook:<source>` is a shorthand alias for `generic_webhook:<source>`.
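
The suffix notation and the `webhook:` alias can be read as a simple split on the first colon, sketched below. `parseEventSpec` and the `qualifier` field are hypothetical names, and splitting only on the first colon is an assumption.

```typescript
interface EventSpec {
  event: string;
  qualifier?: string; // the part after the first colon: an action or a webhook source
}

// Hypothetical parser for the `event[:suffix]` notation.
function parseEventSpec(spec: string): EventSpec {
  const idx = spec.indexOf(':');
  const rawEvent = idx === -1 ? spec : spec.slice(0, idx);
  // `webhook` is documented as a shorthand alias for `generic_webhook`.
  const event = rawEvent === 'webhook' ? 'generic_webhook' : rawEvent;
  return idx === -1 ? { event } : { event, qualifier: spec.slice(idx + 1) };
}
```

Under this reading, `webhook:stripe` and `generic_webhook:stripe` parse to the same value.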

**Options:**

| Option            | Default | Description                     |
| ----------------- | ------- | ------------------------------- |
| `--output <path>` | stdout  | Write to file instead of stdout |

**Examples:**

```bash
# Print fixture to stdout
kici fixture pr:open

# Write fixture to file
kici fixture pr:open --output fixtures/pr-open.json

# Generate push fixture
kici fixture push --output fixtures/push.json
```

Use generated fixtures as reference when writing test fixture files in `.kici/tests/`:

```bash
kici fixture pr:open --output fixtures/pr-open-reference.json
# Use the generated JSON as reference when writing .kici/tests/pr-open.ts
```

### kici init

Initialize a `.kici/` directory with default workflow templates.

```bash
kici init [options]
```

**Options:**

| Option                                | Default                | Description                                                                                             |
| ------------------------------------- | ---------------------- | ------------------------------------------------------------------------------------------------------- |
| `--force`                             | `false`                | Overwrite existing `.kici/` directory                                                                   |
| `--skip-install`                      | `false`                | Create files without installing dependencies                                                            |
| `--package-manager <npm\|pnpm\|yarn>` | auto-detect            | Force a package manager for the install step (default: detect from your repo)                           |
| `--mjs`                               | `false`                | JavaScript-only mode (no TypeScript, no deps)                                                           |
| `--no-agents-md`                      | writes `AGENTS.md`     | Skip writing `.kici/AGENTS.md` (the LLM authoring context file)                                         |
| `--private-registry <url>`            | none                   | Scaffold a workflow `registries:` entry pointing at `<url>` (e.g. CodeArtifact, GH Packages, Verdaccio) |
| `--private-registry-scope <scope>`    | none                   | Optional npm package scope (e.g. `@my-org`) for the private registry                                    |
| `--private-registry-secret <ref>`     | `production:NPM_TOKEN` | Qualified secret reference (`env:NAME`) the private registry token comes from                           |

**Examples:**

```bash
# Interactive initialization
kici init

# Overwrite existing setup
kici init --force

# Skip dependency install (faster, install manually later)
kici init --skip-install

# Force a specific package manager (default: detect from your repo)
kici init --package-manager pnpm

# JavaScript mode (no TypeScript)
kici init --mjs

# Skip writing the AGENTS.md LLM authoring context file
kici init --no-agents-md

# Scaffold a workflow registries entry for a private npm registry
kici init --private-registry https://npm.pkg.github.com/ \
          --private-registry-scope @my-org \
          --private-registry-secret production:GITHUB_PACKAGES_TOKEN
```

**What it creates:**

```
.kici/
  workflows/
    hello-world.ts    # Minimal push workflow
    pr-checks.ts      # Comprehensive PR workflow
  tests/
    push-test.ts      # Sample test fixture
  types/              # Directory for generated type declarations (kici types)
  package.json        # Dependencies (@kici-dev/sdk)
  tsconfig.json       # TypeScript configuration (includes types/**/*.d.ts)
.kiciignore           # Default exclusion patterns for test uploads
```

In interactive mode (TTY), `kici init` prompts you to:

1. Select which workflow templates to include
2. Optionally install a pre-commit hook

**Package manager:** the dependency install step uses the package manager detected for your repo — the `packageManager` field in the nearest `package.json` (Corepack convention), then a lockfile in the project root (`pnpm-lock.yaml` → pnpm, `yarn.lock` → yarn, `package-lock.json` → npm), then the package manager that invoked `kici` (`pnpm dlx` / `yarn dlx` / `npx`), defaulting to npm. Pass `--package-manager <npm|pnpm|yarn>` to override detection, or `--skip-install` to set up the files and install later yourself.
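
The detection order above can be sketched as a pure function. The input shape and the name `detectPackageManager` are hypothetical, and reading the invoking package manager from an `npm_config_user_agent`-style string (for example `pnpm/9.1.0 npm/? node/v24.0.0`) is an assumption about how the invoker is identified.

```typescript
type PackageManager = 'npm' | 'pnpm' | 'yarn';

interface DetectionInput {
  packageManagerField?: string; // e.g. "pnpm@9.1.0" from the nearest package.json
  lockfiles: string[]; // file names present in the project root
  userAgent?: string; // assumption: value of npm_config_user_agent
}

function fromName(name: string | undefined): PackageManager | undefined {
  return name === 'npm' || name === 'pnpm' || name === 'yarn' ? name : undefined;
}

function detectPackageManager(input: DetectionInput): PackageManager {
  // 1. Corepack-style field: "<name>@<version>".
  const declared = fromName((input.packageManagerField ?? '').split('@')[0]);
  if (declared) return declared;
  // 2. Lockfile in the project root, in the documented order.
  if (input.lockfiles.includes('pnpm-lock.yaml')) return 'pnpm';
  if (input.lockfiles.includes('yarn.lock')) return 'yarn';
  if (input.lockfiles.includes('package-lock.json')) return 'npm';
  // 3. The package manager that invoked the CLI, defaulting to npm.
  return fromName((input.userAgent ?? '').split('/')[0]) ?? 'npm';
}
```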

**Development mode:** When `KICI_DEV=true` or `package.json` has `"kici": { "development": true }`, the generated `package.json` uses prerelease-compatible version ranges (`>=0.0.1-0`) so npm resolves Verdaccio's prerelease builds.

### kici hook install

Install a pre-commit hook that runs `kici compile` before each commit.

```bash
kici hook install [options]
```

**Options:**

| Option  | Default | Description                                |
| ------- | ------- | ------------------------------------------ |
| `--git` | `false` | Use raw git hook (`.git/hooks/pre-commit`) |

**Examples:**

```bash
# Auto-detect hook tool (husky, lint-staged, etc.)
kici hook install

# Force raw git hook
kici hook install --git
```

The command auto-detects existing hook tools in your project:

- **Husky**: Adds to `.husky/pre-commit`
- **lint-staged**: Adds to lint-staged configuration
- **Raw git**: Writes `.git/hooks/pre-commit`

If multiple tools are detected, you are prompted to choose.

### kici endpoints

List all webhook entrypoints for the current project. Reads the compiled lock file and displays webhook URLs grouped by type (git provider, generic webhooks, scheduled, event-driven).

```bash
kici endpoints [options]
```

**Options:**

| Option              | Default | Description             |
| ------------------- | ------- | ----------------------- |
| `--kici-dir <path>` | `.kici` | Path to .kici directory |

**Prerequisites:** Run `kici compile` first to generate the lock file.

**Examples:**

```bash
# List all webhook entrypoints
kici endpoints

# Custom .kici directory
kici endpoints --kici-dir packages/app/.kici
```

### kici workflows list

List permanently registered workflows on the orchestrator.

```bash
kici workflows list [options]
```

**Options:**

| Option                  | Default | Description                                    |
| ----------------------- | ------- | ---------------------------------------------- |
| `--json`                | `false` | Output as JSON                                 |
| `--stale <duration>`    | none    | Filter stale registrations (e.g., `30d`, `7d`) |
| `--trigger-type <type>` | none    | Filter by trigger type                         |
| `--repo <repo>`         | none    | Filter by repository                           |

**Examples:**

```bash
# List all registered workflows
kici workflows list

# JSON output for scripting
kici workflows list --json

# Show workflows not updated in 30 days
kici workflows list --stale 30d

# Filter by trigger type
kici workflows list --trigger-type push

# Filter by repository
kici workflows list --repo my-org/my-repo
```
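
As a rough model of the `--stale` filter, a duration such as `30d` can be converted to milliseconds and compared against a registration's last update time. The helper names are hypothetical. Only the `d` suffix appears in this reference, so the sketch rejects anything else rather than guessing at other units, and the strict "older than" comparison is an assumption.

```typescript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Hypothetical parser: accepts whole days only, e.g. "30d" or "7d".
function parseStaleDuration(input: string): number {
  const match = /^(\d+)d$/.exec(input.trim());
  if (!match) throw new Error(`unsupported duration: ${input}`);
  return Number(match[1]) * MS_PER_DAY;
}

// A registration is stale when its last update is older than the given duration.
function isStale(lastUpdatedMs: number, nowMs: number, duration: string): boolean {
  return nowMs - lastUpdatedMs > parseStaleDuration(duration);
}
```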

### kici docs

Open the KiCI documentation site in the default browser. With the `llm` subcommand, print the LLM-friendly documentation bundle that ships with `@kici-dev/compiler` — pipe it into a coding agent's context buffer to brief the agent on authoring conventions without an internet round-trip.

```bash
kici docs               # open https://kici.dev/docs/
kici docs --no-open     # print the URL instead of opening a browser
kici docs llm           # print llms-full.txt (the full bundle) to stdout
kici docs llm --index   # print llms.txt (the curated link index) to stdout
kici docs llm --out path/to/file.md   # write the bundle to a file
```

**Examples:**

```bash
# Open the docs site in your browser
kici docs

# Pipe the full LLM bundle into a coding agent
kici docs llm | claude -- "Read this and help me author a deploy workflow"

# Save the curated index for offline reference
kici docs llm --index --out kici-llms.txt
```

The bundle is regenerated from `docs/` every time `@kici-dev/compiler` is built, so it always matches your installed CLI version. The same content is available online at <https://kici.dev/llms.txt> and <https://kici.dev/llms-full.txt> following the [llms.txt convention](https://llmstxt.org/).

### kici admin

Operator-facing commands for running instances.

#### kici admin drain-worker

Trigger graceful drain on a worker instance. Sends a POST request to the worker's `/drain` endpoint.

```bash
kici admin drain-worker [options]
```

**Options:**

| Option        | Required | Description                                   |
| ------------- | -------- | --------------------------------------------- |
| `--url <url>` | yes      | Worker URL (e.g., `http://worker-host:10143`) |

**Examples:**

```bash
# Drain a local worker
kici admin drain-worker --url http://localhost:10143

# Drain a remote worker
kici admin drain-worker --url http://worker-2.internal:10143
```
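
For scripting without the CLI, the same request can be approximated directly. This is a sketch under assumptions: `drainEndpoint` and `drainWorker` are hypothetical names, the endpoint is assumed to need no authentication or request body, and any non-2xx response is treated as a failure.

```typescript
// Build the drain endpoint from the value that would be passed to --url.
// Note: an absolute '/drain' path replaces any path already present on the URL.
function drainEndpoint(workerUrl: string): string {
  return new URL('/drain', workerUrl).toString();
}

// Hypothetical equivalent of the command; resolves to 0 on success and 1 on error.
async function drainWorker(workerUrl: string): Promise<number> {
  try {
    const response = await fetch(drainEndpoint(workerUrl), { method: 'POST' });
    return response.ok ? 0 : 1;
  } catch {
    return 1; // worker unreachable
  }
}
```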

**Exit codes:**

| Code | Meaning                                      |
| ---- | -------------------------------------------- |
| 0    | Drain request accepted                       |
| 1    | Error (worker unreachable or request failed) |

## Workflow discovery

The CLI discovers workflows by scanning `.kici/workflows/*.ts` (or `.mjs` in MJS mode). Each file should `export default` a single workflow:

```typescript
// .kici/workflows/ci.ts
import { workflow, job, step, pr } from '@kici-dev/sdk';

export default workflow('ci', {
  on: pr(),
  jobs: [
    /* ... */
  ],
});
```

Multiple workflow files are supported -- each becomes a separate workflow in `kici.lock.json`.

## Lock file

The `kici compile` command writes the lock file to `.kici/kici.lock.json`. This file:

- Contains all workflow definitions in a portable JSON format
- Is used by the orchestrator to evaluate triggers without code checkout
- Should be committed to version control
- Is regenerated on every `kici compile` run

Use `kici compile --check` in CI to validate that workflows are correct without writing files. For the full story on drift, pre-commit/CI, and agent-side verification, see [Lock file and workflow drift](lock-file-and-drift.md).

## Exit codes

All commands follow a consistent exit code convention:

| Code | Meaning              |
| ---- | -------------------- |
| 0    | Success              |
| 1    | Failure (see output) |

## Debug output

Use `--debug` (on `kici run local`, `kici run remote`, `kici test`) or `--verbose` (on `kici compile`) for detailed output:

```bash
# Shows trigger matching, rule evaluation, decision traces
kici run local push --debug

# Shows detailed compilation steps
kici compile --verbose

# Shows trigger matching preview
kici test pr:open --debug
```

Set `KICI_DEBUG=true` for additional internal debug output across all commands.

## Environment variables

| Variable     | Description                               |
| ------------ | ----------------------------------------- |
| `KICI_DEV`   | Set to `true` for development mode        |
| `KICI_DEBUG` | Set to `true` for verbose internal output |
| `CI`         | When `true`, disables interactive prompts |

## See also

- [Getting started](getting-started.md) -- install the SDK and write your first workflow
- [Testing guide](testing-guide.md) -- writing fixtures, remote test runs, secret contexts, and repo state transfer
- [SDK reference](sdk-reference.md) -- complete API for the workflow definitions that the CLI compiles
- [Workflow patterns](workflow-patterns.md) -- example workflows to compile and test with these commands

---

## Lifecycle hooks

Source: https://docs.kici.dev/user/hooks/

Hooks are callbacks that run at specific points in the execution lifecycle. They let you react to outcomes (cancellation, success, failure) and perform cleanup without affecting the execution flow.

## Hook types

KiCI supports six hook types at three levels (step, job, workflow):

| Hook         | When it runs                         | Available on        |
| ------------ | ------------------------------------ | ------------------- |
| `onCancel`   | After step/job/workflow is cancelled | Step, Job, Workflow |
| `cleanup`    | Always (success, failure, or cancel) | Step, Job, Workflow |
| `onSuccess`  | After job/workflow succeeds          | Job, Workflow       |
| `onFailure`  | After job/workflow fails             | Job, Workflow       |
| `beforeStep` | Before each step in a job            | Job                 |
| `afterStep`  | After each step in a job             | Job                 |

## Basic usage

### Job-level hooks

```typescript
import { workflow, job, step, push } from '@kici-dev/sdk';

export default workflow('deploy', {
  on: push({ branches: ['main'] }),
  jobs: [
    job('deploy-prod', {
      runsOn: 'linux',
      steps: [
        step('deploy', async ({ $ }) => {
          await $`kubectl apply -f manifests/`;
        }),
      ],
      onCancel: async (ctx) => {
        console.log(`Deploy cancelled: ${ctx.outcome.reason}`);
        await ctx.$`kubectl rollout undo deployment/app`;
      },
      cleanup: async (ctx) => {
        // Always runs -- release lock, notify team, etc.
        await ctx.$`curl -X POST https://slack.com/webhook -d '{"text": "Deploy ${ctx.outcome.status}"}'`;
      },
      onSuccess: async (ctx) => {
        console.log(`Deploy succeeded in ${ctx.outcome.duration}ms`);
      },
      onFailure: async (ctx) => {
        console.log(`Deploy failed at step: ${ctx.outcome.failedStep}`);
      },
      gracePeriod: 60, // 60 seconds before SIGKILL on cancel
    }),
  ],
});
```

### Step-level hooks

```typescript
step('download-artifacts', {
  run: async ({ $ }) => {
    await $`wget https://artifacts.example.com/build.tar.gz`;
  },
  onCancel: async (ctx) => {
    // Clean up partial downloads
    await ctx.$`rm -f build.tar.gz`;
  },
  cleanup: async (ctx) => {
    await ctx.$`rm -rf /tmp/staging`;
  },
});
```

### Workflow-level hooks

```typescript
workflow('ci', {
  on: push({ branches: ['main'] }),
  jobs: [
    /* ... */
  ],
  onCancel: async (ctx) => {
    // Notify when any job in the workflow is cancelled
    console.log('CI workflow cancelled');
  },
  cleanup: async (ctx) => {
    // Always runs after all jobs complete
    console.log(`CI workflow finished with status: ${ctx.outcome.status}`);
  },
});
```

## Hook context

Hook functions receive the same `StepContext` as regular steps (`$`, `ctx`, `log`, `env`), plus an `outcome` object with metadata about the execution result.

### ctx.outcome

```typescript
interface OutcomeMetadata {
  /** Final status of the job/workflow. */
  status: 'cancelled' | 'success' | 'failed';
  /** Reason for cancellation (e.g., "User requested", "Superseded by run #42"). */
  reason?: string;
  /** Name of the step that caused failure (for onFailure hooks). */
  failedStep?: string;
  /** Outputs from all completed steps. */
  stepOutputs: Record<string, unknown>;
  /** Total execution duration in milliseconds. */
  duration: number;
}
```
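
To show how the `outcome` fields combine in practice, here is a minimal sketch of a helper that turns an `OutcomeMetadata` value into a one-line summary, for example for a notification sent from a `cleanup` hook. The `summarizeOutcome` function is hypothetical; the interface is restated locally so the snippet stands alone.

```typescript
// Local copy of the documented shape so the sketch is self-contained.
interface OutcomeMetadata {
  status: 'cancelled' | 'success' | 'failed';
  reason?: string;
  failedStep?: string;
  stepOutputs: Record<string, unknown>;
  duration: number;
}

// Hypothetical helper: builds a short human-readable summary line.
function summarizeOutcome(outcome: OutcomeMetadata): string {
  const seconds = (outcome.duration / 1000).toFixed(1);
  switch (outcome.status) {
    case 'success':
      return `succeeded in ${seconds}s`;
    case 'failed':
      // failedStep is optional, so fall back to a placeholder.
      return `failed at step "${outcome.failedStep ?? 'unknown'}" after ${seconds}s`;
    case 'cancelled':
      // reason is optional as well.
      return `cancelled (${outcome.reason ?? 'no reason given'}) after ${seconds}s`;
  }
}
```

Because `reason` and `failedStep` are optional, a hook that reads them should handle the `undefined` case, as the fallbacks here do.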

### Capabilities

Hooks can do everything regular steps can:

- Run shell commands via `$`
- Set environment variables via `ctx.setEnv()` and prepend to `PATH` via `ctx.addPath()`
- Access previous step outputs via `ctx.outputsOf()` and `ctx.jobOutputs()`
- Publish encrypted secret outputs via `ctx.setSecretOutput()`
- Log via `log.info()`, `log.error()`, etc.

## Hook timeout

Each hook has a timeout (default: 5 minutes). You can customize it per-hook:

```typescript
job('deploy', {
  runsOn: 'linux',
  steps: [
    /* ... */
  ],
  cleanup: {
    run: async (ctx) => {
      await ctx.$`./lengthy-cleanup.sh`;
    },
    timeout: 10 * 60 * 1000, // 10 minutes in ms
  },
});
```
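
A hook can therefore be written either as a bare function or as an object with `run` and `timeout`. The sketch below shows one way the two forms could be reduced to a single shape with the 5-minute default applied. The names `normalizeHook` and `DEFAULT_HOOK_TIMEOUT_MS` are hypothetical, and treating `timeout` as optional in the object form is an assumption; this is not the SDK's actual implementation.

```typescript
type HookFn<C> = (ctx: C) => Promise<void>;
// ASSUMPTION: `timeout` may be omitted in the object form.
type HookDefinition<C> = HookFn<C> | { run: HookFn<C>; timeout?: number };

const DEFAULT_HOOK_TIMEOUT_MS = 5 * 60 * 1000; // documented default: 5 minutes

// Hypothetical helper: resolve either hook form to { run, timeoutMs }.
function normalizeHook<C>(hook: HookDefinition<C>): { run: HookFn<C>; timeoutMs: number } {
  if (typeof hook === 'function') {
    return { run: hook, timeoutMs: DEFAULT_HOOK_TIMEOUT_MS };
  }
  return { run: hook.run, timeoutMs: hook.timeout ?? DEFAULT_HOOK_TIMEOUT_MS };
}
```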

## Hook execution order

Hooks execute inside-out on cancellation (like stack unwinding):

1. **Step-level** cleanup (on the cancelled step)
2. **Job-level** onCancel, then cleanup
3. **Workflow-level** onCancel, then cleanup

On success: step afterStep (after each step), then job onSuccess + cleanup, then workflow onSuccess + cleanup.

On failure: job onFailure + cleanup, then workflow onFailure + cleanup.

**cleanup always runs** -- regardless of whether the outcome was success, failure, or cancel.
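
The ordering above can be written down as a small pure function. This is only a model of the rules as stated in this section, assuming every hook is defined and leaving out `beforeStep`/`afterStep`; `hookOrder` and its `level.hook` labels are hypothetical names, not SDK API.

```typescript
type RunOutcome = 'success' | 'failed' | 'cancelled';

// Returns "<level>.<hook>" labels in the order described in this section.
function hookOrder(outcome: RunOutcome): string[] {
  const outcomeHook =
    outcome === 'success' ? 'onSuccess' : outcome === 'failed' ? 'onFailure' : 'onCancel';
  const order: string[] = [];
  if (outcome === 'cancelled') {
    // Inside-out unwinding starts at the cancelled step.
    order.push('step.cleanup');
  }
  for (const level of ['job', 'workflow']) {
    // The outcome-specific hook runs first, then cleanup at the same level.
    order.push(`${level}.${outcomeHook}`, `${level}.cleanup`);
  }
  return order;
}
```

For a cancelled run this yields the step cleanup first, then the job pair, then the workflow pair, matching the numbered list.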

## Hooks are observers

Hooks follow the "one mechanism per concern" principle:

- **Rules** control whether a step/job executes (conditional logic)
- **Hooks** react to execution outcomes (lifecycle callbacks)

Hooks cannot short-circuit step execution or change the execution flow. They observe and respond.

## beforeStep and afterStep

These job-level hooks run around every step in the job:

```typescript
job('test', {
  runsOn: 'linux',
  beforeStep: async (ctx) => {
    console.log(`Starting step at ${new Date().toISOString()}`);
  },
  afterStep: async (ctx) => {
    console.log(`Step completed with status: ${ctx.outcome.status}`);
  },
  steps: [
    step('lint', async ({ $ }) => {
      await $`pnpm lint`;
    }),
    step('test', async ({ $ }) => {
      await $`pnpm test`;
    }),
  ],
});
```

`afterStep` runs immediately after its step, before the next step starts (not deferred to the end of the job).

## Step-level rules

Step-level rules control whether a step executes, evaluated at runtime by the agent:

```typescript
import { step, rule, skip, isEventType } from '@kici-dev/sdk';

step('deploy', {
  run: async ({ $ }) => {
    await $`kubectl apply -f manifests/`;
  },
  rules: [
    rule('only on main pushes', (ctx) => {
      if (!isEventType(ctx.event, 'push')) return false;
      return ctx.event.payload.ref === 'refs/heads/main';
    }),
  ],
});

// Or use skip() for explicit skip with a reason
step('optional-check', {
  run: async ({ $ }) => {
    await $`./optional-check.sh`;
  },
  rules: [skip('not needed in CI', () => true)],
});
```

When a rule returns `false`, the step is reported as `skipped` and subsequent steps continue normally. Skipped steps don't cause the job to fail.

Step rules have access to runtime context via `RuleContext`: `event` (typed discriminated union), `changedFiles`, `env`, and `$`. They evaluate agent-side (unlike job-level rules which evaluate at the orchestrator during trigger matching).
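
As a simplified model of the behavior described above, the sketch below evaluates a list of rules and reports the step as skipped as soon as one returns `false`. It covers only `rule()`-style predicates, uses synchronous checks for brevity, and does not model `skip()`. `SketchRule`, `StepDecision`, and `evaluateStepRules` are hypothetical names, not the agent's real evaluator.

```typescript
// Hypothetical shapes for illustration only.
interface SketchRule<C> {
  name: string;
  check: (ctx: C) => boolean;
}

type StepDecision = { run: true } | { run: false; skippedBy: string };

function evaluateStepRules<C>(rules: SketchRule<C>[], ctx: C): StepDecision {
  for (const candidate of rules) {
    if (!candidate.check(ctx)) {
      // A false result marks the step as skipped; the job itself does not fail.
      return { run: false, skippedBy: candidate.name };
    }
  }
  return { run: true };
}
```

A step with no rules always runs in this model, since the loop never finds a failing rule.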

## Hook failure behavior

If a hook throws an error or times out:

- The job status changes to `failed` with a compound reason (e.g., "cancelled (onCancel hook failed: Connection timeout)")
- Remaining hooks for that level are skipped
- The failure is visible in the dashboard as a failed hook step
- Force cancel kills running hooks immediately via SIGKILL

This behavior is consistent across all hook types.

---

_Source: `packages/sdk/src/hooks/`, `packages/sdk/src/types.ts`_

---

## Lock file and workflow drift

Source: https://docs.kici.dev/user/lock-file-and-drift/

KiCI uses a **two-artifact model**: TypeScript workflows are the source of truth; the lock file (`kici.lock.json`) is the execution contract. The orchestrator reads only the lock file to match triggers and decide cache vs build, so the two artifacts must stay in sync at every commit.

## Why the lock file matters

- **Orchestrator** fetches the lock file at the commit SHA and uses it to evaluate triggers and to look up the cached `.kici/` source tarball + `node_modules` tarball. It never runs your TypeScript.
- **Agents** download the cached source tarball (or, on cold cache, the build agent clones + packs it), register the shared TypeScript loader hook, and dynamic-`import()` the workflow `.ts` directly. The lock file's per-workflow `contentHash` identifies the expected raw-source bytes and is verified against the extracted source before any step runs.

If you change a workflow file (`.ts`) but do **not** regenerate and commit the lock file, the repo at that commit has **drift**: the lock file no longer matches the source. Triggers and cache keys can be wrong, and runs can fail with a clear “stale lock file” error once the agent verifies the hash.

## Lock file structure

The lock file (`kici.lock.json`) is a JSON file with the following top-level fields:

| Field           | Description                                                                                                                                                                                                                                                                                                                                           |
| --------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `schemaVersion` | Lock file schema version (currently 15). Incremented on breaking format changes.                                                                                                                                                                                                                                                                      |
| `source`        | Reference to the source file and export (e.g., `{ file: “.kici/workflows/ci.ts”, export: “#default” }`).                                                                                                                                                                                                                                              |
| `contentHash`   | SHA-256 of the serialized lock file content (excluding itself). Changes when any workflow, trigger, or job changes.                                                                                                                                                                                                                                   |
| `lockfileHash`  | SHA-256 of the detected package manager's lockfile, used as the dependency cache key. The lockfile is `.kici/package-lock.json` for npm, or the repo-root `pnpm-lock.yaml` / `yarn.lock` for a pnpm/yarn workspace; the hash input is prefixed with the manager name so a manager change is a guaranteed cache miss. Omitted when no lockfile exists. |
| `workflows`     | Array of workflow entries, each with its own `contentHash`, `compileSchemaVersion`, triggers, and jobs.                                                                                                                                                                                                                                               |

Each workflow entry includes:

| Field                  | Description                                                                                                                                                                                                                                                                                                                                                                                                                                                     |
| ---------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `name`                 | Workflow name.                                                                                                                                                                                                                                                                                                                                                                                                                                                  |
| `source`               | Per-workflow source file and export reference.                                                                                                                                                                                                                                                                                                                                                                                                                  |
| `contentHash`          | SHA-256 of the raw workflow source mixed with `compileSchemaVersion` (and an `assetDigest` of declared `hashFiles` when present): `SHA-256(compileSchemaVersion + ":" + rawSource [+ "\0" + assetDigest])`. The orchestrator uses this as the source-tarball cache key and the agent re-computes it against the extracted source to detect drift.                                                                                                               |
| `compileSchemaVersion` | Compiler schema version used when computing `contentHash` (currently `5`). The hash input is line-ending-normalized (CRLF → LF) so a lock file produced on Linux matches the agent's hash on Windows where Git's `core.autocrlf=true` rewrites checked-out text to CRLF. Bumping the schema version invalidates every existing source cache entry even if source is unchanged, which is the correct behavior when the compile-time or runtime contract changes. |
| `triggers`             | Trigger definitions extracted from the workflow (used by the orchestrator for event matching).                                                                                                                                                                                                                                                                                                                                                                  |
| `jobs`                 | Job definitions with scheduling metadata (runsOn, needs, matrix, environment, concurrency, container, checkout, gracePeriod, label routing, dynamic fields, etc.).                                                                                                                                                                                                                                                                                              |
| `rules`                | Workflow-level conditional rules (optional). Stored as dynamic references since rule functions cannot be serialized.                                                                                                                                                                                                                                                                                                                                            |
| `description`          | Optional workflow description.                                                                                                                                                                                                                                                                                                                                                                                                                                  |
| `hashFiles`            | Declared glob patterns for extra files included in the content hash (optional). See [extra files in the content hash](#extra-files-in-the-content-hash-hashfiles).                                                                                                                                                                                                                                                                                              |
| `resolvedHashFiles`    | Resolved file paths from `hashFiles` at compile time (optional). Recorded so the agent can verify without re-discovering.                                                                                                                                                                                                                                                                                                                                       |
| `contexts`             | Secret contexts declared by the workflow (optional). The orchestrator validates access to each context before dispatch.                                                                                                                                                                                                                                                                                                                                         |
| `registries`           | Private npm registry declarations the agent authenticates against before install (optional): `url`, `scope`, `tokenSecret` reference, `alwaysAuth`. Resolved token bytes never appear in the lock file. See [private registries](private-registries.md).                                                                                                                                                                                                        |
| `installEnv`           | Extra qualified secret refs (`<environment>:<secret-name>`) projected as env vars on the install subprocess for use with a committed `.kici/.npmrc` (optional). See [private registries](private-registries.md).                                                                                                                                                                                                                                                |
| `concurrency`          | Workflow-level concurrency config: `hasGroup`, `cancelInProgress`, `max` (optional). See [concurrency groups](concurrency.md).                                                                                                                                                                                                                                                                                                                                  |
| `timeout`              | Whole-run wall-clock timeout in milliseconds (optional). The orchestrator reads this at run creation to set the run deadline.                                                                                                                                                                                                                                                                                                                                   |
| Hook flags             | Boolean flags (`hasOnCancel`, `hasCleanup`, `hasOnSuccess`, `hasOnFailure`) indicating which lifecycle hooks are defined. Job entries additionally have `hasBeforeStep` and `hasAfterStep`.                                                                                                                                                                                                                                                                     |
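
For tooling that reads the lock file, the documented fields can be captured in a TypeScript type. The sketch below covers a subset of the fields from the tables above. The field names come from those tables, but the TypeScript types (for example, arrays for `triggers` and `jobs`) are assumptions, and `workflowContentHash` is a hypothetical helper.

```typescript
// Sketch only: field names are from the tables above; the types are assumptions.
interface SourceRef {
  file: string;
  export: string;
}

interface LockWorkflowEntry {
  name: string;
  source: SourceRef;
  contentHash: string;
  compileSchemaVersion: number;
  triggers: unknown[];
  jobs: unknown[];
  hashFiles?: string[];
  resolvedHashFiles?: string[];
  timeout?: number;
  hasOnCancel?: boolean;
  hasCleanup?: boolean;
  hasOnSuccess?: boolean;
  hasOnFailure?: boolean;
}

interface LockFile {
  schemaVersion: number;
  source: SourceRef;
  contentHash: string;
  lockfileHash?: string; // omitted when no package-manager lockfile exists
  workflows: LockWorkflowEntry[];
}

// Hypothetical helper: look up the per-workflow content hash by workflow name.
function workflowContentHash(lock: LockFile, workflowName: string): string | undefined {
  return lock.workflows.find((w) => w.name === workflowName)?.contentHash;
}
```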

## Rule: commit both together

**Always commit `.kici/kici.lock.json` in the same commit as the workflow source files it was generated from.**

1. After editing `.kici/workflows/*.ts`, run:
   ```bash
   npx kici compile
   ```
2. Stage both the workflow file(s) and `.kici/kici.lock.json`.
3. Commit them together.

That way the lock file at every commit SHA matches the workflow source at that SHA.

## Catch drift early: pre-commit and CI

Use automation so drift is caught before it reaches the repo.

### Pre-commit hook

Install a hook that compiles and stages the lock file before each commit:

```bash
npx kici hook install
```

This runs `kici compile && git add .kici/kici.lock.json` before each commit: if compilation fails the commit is blocked; if it succeeds the updated lock file is automatically staged. See [CLI Reference — kici hook](cli-reference.md#kici-hook) for options (husky, lefthook, pre-commit, prek, raw git).

### CI check

In your CI pipeline, verify that the workflow source compiles without errors:

```bash
kici compile --check
```

This validates all workflows and generates the lock file in memory without writing it. If any workflow has syntax errors or invalid configuration, the command exits non-zero. Pair this with the agent-side hash verification (below) for full drift detection -- `--check` catches broken source, while the agent catches source-lock-file mismatches at run time.

## Extra files in the content hash (`hashFiles`)

By default, the per-workflow content hash is `SHA-256(compileSchemaVersion + ":" + rawSource)` where `rawSource` is the TypeScript text of the workflow entry file. If your workflow depends on files outside `.kici/workflows/` -- configuration files, scripts, Dockerfiles, etc. -- changes to those files will **not** invalidate the cache unless you declare them.

Use the `hashFiles` option on a workflow to include additional paths or glob patterns (relative to the repo root) in the content hash:

```typescript
export default workflow('deploy', {
  hashFiles: ['config.json', 'scripts/*.sh'],
  jobs: [
    /* ... */
  ],
});
```

When `hashFiles` is declared, the content hash formula becomes `SHA-256(compileSchemaVersion + ":" + rawSource + "\0" + assetDigest)` where `assetDigest` is a deterministic encoding of the resolved file paths and their contents. A change to any matched file therefore changes the content hash, which busts the source-tarball cache and forces the build agent to pack and upload a fresh `source/{contentHash}.tar.gz`. The resolved file paths are recorded in the lock file under `resolvedHashFiles` so the agent can verify without re-discovering the workflow.
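
The formula can be sketched with Node's built-in `crypto` module. Only the outer formula and the CRLF-to-LF normalization come from this page. The `assetDigest` encoding used here (files sorted by path, each hashed as path, NUL, content, NUL), the hex output format, and all function names are assumptions for illustration, so hashes produced by this sketch will not necessarily match the ones the compiler writes.

```typescript
import { createHash } from 'node:crypto';

interface HashedAsset {
  path: string;
  content: string;
}

// CRLF -> LF, as described for the hash input.
function normalizeLineEndings(text: string): string {
  return text.replace(/\r\n/g, '\n');
}

// ASSUMPTION: the real assetDigest encoding is not documented on this page.
// This sketch sorts by path and feeds "path\0content\0" per file into SHA-256.
function assetDigest(assets: HashedAsset[]): string {
  const hasher = createHash('sha256');
  const sorted = [...assets].sort((a, b) => (a.path < b.path ? -1 : a.path > b.path ? 1 : 0));
  for (const asset of sorted) {
    hasher.update(`${asset.path}\0${asset.content}\0`);
  }
  return hasher.digest('hex');
}

// SHA-256(compileSchemaVersion + ":" + rawSource [+ "\0" + assetDigest])
function contentHash(compileSchemaVersion: number, rawSource: string, assets: HashedAsset[] = []): string {
  let input = `${compileSchemaVersion}:${normalizeLineEndings(rawSource)}`;
  if (assets.length > 0) {
    input += `\0${assetDigest(assets)}`;
  }
  return createHash('sha256').update(input).digest('hex');
}
```

Sorting the assets before hashing keeps the digest independent of glob expansion order, which is one way to satisfy the "deterministic" requirement.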

## Agent-side safety net

If drift still occurs (e.g. someone committed only the `.ts` change), the agent detects it at run time before any step runs:

- After extracting the `.kici/` source tarball (or loading source from a `git clone` on the build path), the agent reads the workflow entry file and re-computes `contentHash = SHA-256(compileSchemaVersion + ":" + rawSource [+ "\0" + assetDigest])` using the same formula as the compiler.
- If the orchestrator sent a `contentHash` (from the lock file) and the computed hash does **not** match, the agent fails the run with an error like: **lock file is out of date** (workflow source changed without regenerating the lock file). The error includes the baked agent `@kici-dev/sdk` version + bundle hash so operators can debug cross-host compile mismatches.

So even without a pre-commit or CI check, a stale lock file will cause the run to fail with a clear message instead of running with the wrong workflow.
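
The comparison itself is simple. The sketch below shows its shape, taking the already recomputed hash as input. `StaleLockFileError` and `assertNoDrift` are hypothetical names, and the error text is illustrative rather than the agent's exact message.

```typescript
// Hypothetical error type; only the comparison rule comes from the text above.
class StaleLockFileError extends Error {
  constructor(workflowName: string, expected: string, actual: string) {
    super(
      `lock file is out of date for workflow "${workflowName}": ` +
        `expected contentHash ${expected}, computed ${actual}`,
    );
    this.name = 'StaleLockFileError';
  }
}

function assertNoDrift(workflowName: string, expected: string | undefined, computed: string): void {
  // No hash sent by the orchestrator: nothing to compare against.
  if (expected === undefined) return;
  if (expected !== computed) {
    // Fail before any step runs rather than execute a mismatched workflow.
    throw new StaleLockFileError(workflowName, expected, computed);
  }
}
```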

## Summary

| Goal                         | What to do                                                                                 |
| ---------------------------- | ------------------------------------------------------------------------------------------ |
| Keep lock file in sync       | Commit `kici.lock.json` with the workflow `.ts` changes; run `kici compile` before commit. |
| Catch drift before commit    | Install a pre-commit hook with `kici hook install`.                                        |
| Catch broken source in CI    | Run `kici compile --check` in CI.                                                          |
| Bust cache on external files | Add `hashFiles: ['config.json']` to include non-workflow files in the content hash.        |
| Fail fast when drift remains | Rely on the agent’s hash verification when it compiles from source.                        |

## See also

- [Getting Started](getting-started.md) — compile and commit the lock file
- [CLI Reference](cli-reference.md) — `kici compile`, `kici compile --check`, `kici hook`
- [Architecture — Data flows](../architecture/data-flows.md) — how the lock file is used in the pipeline

---

## Testing guide

Source: https://docs.kici.dev/user/testing-guide/

Test your workflows remotely against the full CI pipeline from your local machine. `kici run remote` uploads your current repo state (including uncommitted changes), triggers the pipeline, and streams execution logs back in real time.

## Overview

`kici run remote` connects your local development environment to the remote orchestrator/agent pipeline. Instead of pushing a commit and waiting for CI, you can:

- Run any workflow against your current working tree (including unstaged changes)
- Get real-time log output streamed back to your terminal
- Give test runs test-scoped secrets — your local secret files and `--env` values (uploaded encrypted) plus any environment flagged `allowLocalExecution: true` — while production environments stay unreachable
- Detect test mode in workflow code via `ctx.isTestRun`

The command is remote-only -- all execution happens on the orchestrator and agent. For local-only trigger matching previews, use `kici test <event>`.

:::note[Orchestrator prerequisite: cache storage]
`kici run remote` uploads your working-tree overlay to the orchestrator's **cache storage** via a pre-signed URL, and the agent fetches it from there (see [Repo state transfer](#repo-state-transfer)). The target orchestrator must therefore have cache storage enabled (`KICI_STORAGE_TYPE` = `s3` or `filesystem`).

- **The [Docker / Podman quickstart](quickstart/compose.md) wires this up for you** — it ships a SeaweedFS service, so `kici run remote` works there out of the box (see its "run a workflow without pushing" step).
- **The [bare-metal quickstart](quickstart/bare-metal.md) does not configure storage by default** — enable a backend before using `kici run remote`:
  - **`filesystem`** — simplest for a single-host orchestrator: set `KICI_STORAGE_TYPE=filesystem` and `KICI_STORAGE_FS_PATH=/var/lib/kici/cache`. No external service needed; blobs are served through the orchestrator's own HMAC-signed HTTP route.
  - **`s3`** — any S3-compatible bucket. **A non-public / self-hosted endpoint works**: set `KICI_STORAGE_TYPE=s3`, `KICI_STORAGE_BUCKET`, `KICI_STORAGE_ENDPOINT=https://your-endpoint` and (for most self-hosted services) `KICI_STORAGE_FORCE_PATH_STYLE=true`. If the developer machine running `kici run remote` reaches the bucket at a different address than the orchestrator, set `KICI_STORAGE_UPLOAD_ENDPOINT` to the developer-reachable address; if agents reach it at yet another address (e.g. agents in containers), set `KICI_STORAGE_EXTERNAL_ENDPOINT` to the agent-routable URL.

See [Storage layout](../operator/orchestrator/storage-layout.md) for the full env-var reference.
:::

## Getting started

### 1. Authenticate

```bash
kici login
```

This opens your browser for OAuth authentication and stores a personal access token in `~/.kici/config`. For CI/CD pipelines or headless environments, use `kici login --token <your-api-key>` or `kici login --device` instead. See [CLI authentication](cli-auth.md) for details.

### 2. Write a test fixture

Fixtures define the events you want to simulate. They live in `.kici/tests/*.ts` and use the same SDK trigger functions as workflows.

```typescript
// .kici/tests/push-tests.ts
import { fixture, push } from '@kici-dev/sdk';

export const pushMain = fixture('push-main', {
  event: push({ branches: ['main'] }),
});

export const pushDevelop = fixture('push-develop', {
  event: push({ branches: ['develop'] }),
});
```

Each file can export multiple fixtures. The `fixture()` factory takes an ID (used on the command line) and options including the event to simulate.

### 3. Run a fixture

```bash
# List available fixtures
kici run remote

# Run a specific fixture
kici run remote push-main

# Run all fixtures matching a glob (quoted so the shell does not expand it)
kici run remote 'push-*'

# Run everything
kici run remote --all
```

## Fixture reference

### Event types

Fixtures accept any SDK trigger function as their event:

```typescript
import { fixture, push, pr, comment, tag, release } from '@kici-dev/sdk';

// Push event
export const pushMain = fixture('push-main', {
  event: push({ branches: ['main'] }),
});

// PR event
export const prOpen = fixture('pr-open', {
  event: pr({ branches: ['main'], actions: ['opened'] }),
});

// Comment event
export const prComment = fixture('pr-comment', {
  event: comment({ actions: ['created'] }),
});

// Tag event
export const tagRelease = fixture('tag-release', {
  event: tag({ tags: ['v*'] }),
});

// Release event
export const published = fixture('release-published', {
  event: release({ actions: ['published'] }),
});
```

### Overrides

Override default payload values per fixture:

```typescript
export const pushFeature = fixture('push-feature', {
  event: push({ branches: ['feature/*'] }),
  branch: 'feature/auth', // Override branch name
  sha: 'abc123def456', // Override commit SHA
  repo: 'myorg/myrepo', // Override repository
  pr: 42, // Override PR number (for PR events)
});
```

When not specified, these default to values detected from your local git repo (current branch, HEAD SHA, remote URL).
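
The fallback behaves like a per-field default. The sketch below models it with plain objects; `GitDefaults`, `FixtureOverrides`, and `resolveFixtureValues` are hypothetical names, and how the CLI actually detects the git values is not shown.

```typescript
// Values the CLI would detect from the local git repo (hypothetical shape).
interface GitDefaults {
  branch: string;
  sha: string;
  repo: string;
}

// The optional override fields from the fixture options above.
interface FixtureOverrides {
  branch?: string;
  sha?: string;
  repo?: string;
  pr?: number;
}

// Each override wins when present; otherwise the detected value is used.
function resolveFixtureValues(detected: GitDefaults, overrides: FixtureOverrides) {
  return {
    branch: overrides.branch ?? detected.branch,
    sha: overrides.sha ?? detected.sha,
    repo: overrides.repo ?? detected.repo,
    pr: overrides.pr, // no git-derived default is described for PR numbers
  };
}
```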

### Secret context mappings

Map secret contexts to your fixture:

```typescript
export const pushWithSecrets = fixture('push-with-secrets', {
  event: push({ branches: ['main'] }),
  secrets: {
    db: 'test-database',
    api: 'test-api-keys',
  },
});
```

This maps the `db` secret context to the `test-database` context, and `api` to `test-api-keys`.

This mapping is honored by **both** `kici run local` and `kici run remote`:

- For **`kici run local`** (see [`kici run local`](cli-reference.md#kici-run-local)), each named context is resolved from your local secret files (`.kici/.secrets`, `.env.local`, `secrets.yaml`, and `--env` flags).
- For **`kici run remote`**, each named context maps to an orchestrator **environment**, and the orchestrator resolves that environment's secrets for the run. The target environment must be flagged `allowLocalExecution: true` — mapping a context to a missing or non-test environment rejects the run (see [Secret contexts for testing](#secret-contexts-for-testing) below).
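
The remote-side gate can be pictured as a validation pass over the mapping. This is a sketch under stated assumptions: `OrchestratorEnvironment` and `validateSecretMapping` are hypothetical names, and the real orchestrator check may differ in detail. It only encodes the rule that a target environment must exist and be flagged `allowLocalExecution: true`.

```typescript
// Hypothetical view of an orchestrator environment.
interface OrchestratorEnvironment {
  name: string;
  allowLocalExecution: boolean;
}

// Returns one error per rejected mapping; an empty array means the run may proceed.
function validateSecretMapping(
  mapping: Record<string, string>,
  environments: OrchestratorEnvironment[],
): string[] {
  const byName = new Map(environments.map((e) => [e.name, e] as const));
  const errors: string[] = [];
  for (const [context, envName] of Object.entries(mapping)) {
    const target = byName.get(envName);
    if (!target) {
      errors.push(`context "${context}": environment "${envName}" does not exist`);
    } else if (!target.allowLocalExecution) {
      errors.push(`context "${context}": environment "${envName}" is not flagged allowLocalExecution`);
    }
  }
  return errors;
}
```

Treating a missing environment the same as a non-test one keeps the check fail-closed: anything that is not explicitly allowed is rejected.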

### Async fixtures

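For dynamic fixture configuration, pass an async function as the second argument to `fixture()`:

```typescript
import { fixture, push } from '@kici-dev/sdk';

// Placeholder for your own helper, e.g. one that shells out to `git rev-parse HEAD`.
declare function getCurrentSha(): Promise<string>;

export const dynamicFixture = fixture('dynamic', async () => ({
  event: push({ branches: ['main'] }),
  sha: await getCurrentSha(),
}));
```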

## Running tests

### Basic commands

```bash
# List all available fixtures (discovers .kici/tests/*.ts)
kici run remote

# Run a single fixture by ID
kici run remote push-main

# Glob matching -- run all push-related fixtures (quoted so the shell does not expand it)
kici run remote 'push-*'

# Run all fixtures sequentially
kici run remote --all

# Run all fixtures in parallel
kici run remote --all --parallel
```

### Direct workflow run

Bypass trigger matching and run a specific workflow directly:

```bash
kici run remote --workflow ci
```

This skips the trigger evaluation step and runs all jobs in the named workflow.

### Output modes

```bash
# Default: full log streaming with colored job prefixes
kici run remote push-main

# Quiet: minimal output (just pass/fail result)
kici run remote push-main --quiet

# JSON: machine-readable structured output
kici run remote push-main --json

# JUnit XML: for CI integration
kici run remote push-main --junit results.xml
```

### Non-blocking execution

```bash
# Fire and forget -- returns immediately with run ID
kici run remote push-main --no-wait

# Check status later
kici status <run-id>
```

### Cancellation

Press Ctrl+C during a running test to send a cancel signal to the orchestrator. The agent job will be terminated gracefully.

## Repo state transfer

When you run `kici run remote`, the CLI:

1. Detects all files differing from HEAD (staged, unstaged, and untracked)
2. Creates a compressed tarball of changed files
3. Encrypts the tarball using X25519 ECDH key exchange
4. Uploads the encrypted tarball to storage via a signed URL
5. Triggers the pipeline with a reference to the upload

The agent clones your repo at HEAD, then applies the overlay tarball on top -- giving you the exact same file state as your local working tree.
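
The overlay step can be pictured with a small in-memory model. This is only a sketch under stated assumptions: `FileTree`, `OverlaySketch`, and `applyOverlay` are hypothetical names, and the real tarball format is not described here.

```typescript
// Hypothetical in-memory model: each path maps to its file contents.
type FileTree = Record<string, string>;

interface OverlaySketch {
  written: FileTree; // modified tracked files plus new untracked files
  deleted: string[]; // tracked files that were deleted locally
}

// Start from the clone at HEAD, lay the changed files on top, then drop deletions.
function applyOverlay(headTree: FileTree, overlay: OverlaySketch): FileTree {
  const tree: FileTree = { ...headTree, ...overlay.written };
  for (const deletedPath of overlay.deleted) {
    delete tree[deletedPath];
  }
  return tree;
}

const workingTree = applyOverlay(
  { 'README.md': 'old', 'src/a.ts': 'a', 'src/legacy.ts': 'x' },
  { written: { 'README.md': 'new', 'src/b.ts': 'b' }, deleted: ['src/legacy.ts'] },
);
```

Untouched files (`src/a.ts` here) come straight from HEAD, which is why only the differences need to be uploaded.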

### What gets included

- Modified tracked files (staged and unstaged)
- New untracked files (not in `.gitignore`)
- File deletions (tracked files you deleted locally)

### What gets excluded

- Files matching `.gitignore` patterns
- Files matching `.kiciignore` patterns (additional exclusions)
- The `.git` directory itself

### `.kiciignore`

Create a `.kiciignore` file in your repo root to exclude additional files from the upload:

```
# Large binaries
*.bin
*.iso
data/fixtures/large-dataset.csv

# Local-only configs
.env.local
docker-compose.override.yml
```

The format is the same as `.gitignore` -- one glob pattern per line, `#` for comments.

### Size limits

| Threshold | Behavior                                                                          |
| --------- | --------------------------------------------------------------------------------- |
| < 50 MB   | Normal upload                                                                     |
| 50-500 MB | Warning displayed, upload proceeds                                                |
| > 500 MB  | Error -- reduce bundle size via `.kiciignore` or check for unintended large files |

The CLI always shows a pre-upload summary before transferring:

```
12 files changed, 3 new, 1 deleted (2.3 MB compressed)
```
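
The thresholds in the table above amount to a three-way classification. The sketch below is illustrative only: `classifyBundleSize` is a hypothetical helper, and it assumes the limits apply to the compressed size and that 1 MB means 1024 × 1024 bytes, neither of which is stated above.

```typescript
// Hypothetical helper, not part of the KiCI CLI.
type BundleVerdict = 'ok' | 'warn' | 'error';

// Assumption: binary megabytes; the docs do not say which definition is used.
const BYTES_PER_MB = 1024 * 1024;

function classifyBundleSize(compressedBytes: number): BundleVerdict {
  const sizeMb = compressedBytes / BYTES_PER_MB;
  if (sizeMb < 50) return 'ok'; // normal upload
  if (sizeMb <= 500) return 'warn'; // warning displayed, upload proceeds
  return 'error'; // too large: trim via .kiciignore
}
```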

## Secret contexts for testing

The goal of the test-secret model is to let test runs reach **test-only credentials** while keeping production credentials out of reach. `kici run remote` combines two sources of secrets for a test run, then merges them with a clear precedence and a fail-closed gate.

### CLI-uploaded local secrets

`kici run remote` collects the same local secret values that `kici run local` reads — `.kici/.secrets`, `.kici/.env.local`, `.kici/secrets.yaml`, and any `--env KEY=VALUE` flags — and uploads them **encrypted** to the orchestrator alongside the run. The orchestrator decrypts them only to inject them into the agent for that run; the control plane never sees the values.

```bash
# Provide an ad-hoc test value for a single remote run
kici run remote push-main --env KICI_DATABASE_URL=postgresql://localhost/test
```

`--env` provides a **flat** per-run override; `--context <ctx>.<KEY>=<value>` is its sibling for a **namespaced** per-run override, placing the value under the named context `ctx`. Both are uploaded **encrypted** and follow the same precedence rule below — a CLI-supplied value wins over the orchestrator test-environment secret on a key collision.

```bash
# Provide a namespaced per-run value under the 'db' context
kici run remote push-db --context db.KICI_DATABASE_URL=postgresql://localhost/test
```

Because these values originate on your machine, they are the natural place to put throwaway test credentials without touching any orchestrator-stored secret.

### Orchestrator test-environment secrets

In addition to your uploaded values, the orchestrator resolves test-scoped secrets from its own store for a remote test run:

- The job's own declared `environment` contributes its resolved secrets (flat). Static strings and **pure dynamic functions** both participate: a pure `environment:` function (see [Dynamic values](dynamic-values.md)) is evaluated against the fixture's simulated event, and the resolved name is gated and resolved like a static one. Impure dynamic functions (those requiring an init job) are not evaluated for test runs — use a fixture `secrets:` mapping (or `--context`) to supply such a job's secrets.
- Each fixture `secrets: { ctx: envName }` mapping resolves the named environment's secrets under the namespaced context `ctx`.

Both paths are restricted to environments flagged `allowLocalExecution: true`. A production environment left at the default `false` is never resolvable for a test run.

```typescript
export const pushWithDb = fixture('push-db', {
  event: push({ branches: ['main'] }),
  secrets: { db: 'test-database' }, // 'test-database' must be allowLocalExecution: true
});
```

```typescript
step('migrate', async (ctx) => {
  const dbUrl = await ctx.secrets.get('KICI_DATABASE_URL');
  await ctx.$`npx prisma migrate deploy`;
});
```

### Precedence: CLI values win

When a key exists in both sources, the **CLI-uploaded local value wins** over the orchestrator test-environment value. This makes a local override a per-run knob: set `--env KICI_DATABASE_URL=...` (or put it in `.kici/.secrets`) to shadow the test environment's value for just that run, without changing anything on the orchestrator.
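
As a rough illustration of this precedence rule, the merge behaves like an object spread in which the CLI values come last. `mergeTestSecrets` is a hypothetical name, and the sketch covers only flat keys, not namespaced contexts.

```typescript
type SecretMap = Record<string, string>;

// Later spreads overwrite earlier ones, so CLI-supplied values shadow
// the orchestrator test-environment values on a key collision.
function mergeTestSecrets(orchestratorSecrets: SecretMap, cliSecrets: SecretMap): SecretMap {
  return { ...orchestratorSecrets, ...cliSecrets };
}

const mergedSecrets = mergeTestSecrets(
  { KICI_DATABASE_URL: 'postgresql://test-env/db', API_TOKEN: 'from-test-env' },
  { KICI_DATABASE_URL: 'postgresql://localhost/test' },
);
```

Keys that exist only on the orchestrator side (`API_TOKEN` above) pass through unchanged.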

### Fail-closed on non-test environments

Test-run secret resolution is fail-closed:

- If a fixture maps a context to an environment that does not exist, the run is **rejected**.
- If a fixture maps a context to an environment whose `allowLocalExecution` is `false`, the run is **rejected**.
- The `allowLocalExecution` gate applies to **all** remote test runs: a run whose matched workflow targets an environment with the flag off is rejected, so a test run can never resolve production secrets.
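
The first two rules above can be sketched as a gate over the fixture's `secrets` mapping. The names `EnvPolicySketch`, `GateResult`, and `gateFixtureSecrets` are hypothetical, and the rejection messages are made up for illustration; the orchestrator's real checks and wording may differ.

```typescript
interface EnvPolicySketch {
  allowLocalExecution: boolean;
}

type GateResult = { ok: true } | { ok: false; reason: string };

// Fail closed: any unknown or non-test environment rejects the whole run.
function gateFixtureSecrets(
  mapping: Record<string, string>, // fixture `secrets: { ctx: envName }`
  environments: Map<string, EnvPolicySketch>, // hypothetical view of known environments
): GateResult {
  for (const ctxName of Object.keys(mapping)) {
    const envName = mapping[ctxName];
    const policy = environments.get(envName);
    if (policy === undefined) {
      return { ok: false, reason: `context '${ctxName}': environment '${envName}' does not exist` };
    }
    if (!policy.allowLocalExecution) {
      return { ok: false, reason: `context '${ctxName}': environment '${envName}' is not test-accessible` };
    }
  }
  return { ok: true };
}

const knownEnvironments = new Map<string, EnvPolicySketch>([
  ['test-database', { allowLocalExecution: true }],
  ['production', { allowLocalExecution: false }],
]);
```

Note that there is no "skip and continue" branch: a single bad mapping rejects the run rather than silently dropping that context.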

### The `allowLocalExecution` environment flag

Each environment carries an `allowLocalExecution` flag (default `false`) that controls test-run access to that environment and to its secrets. Production environments should leave it at `false`; create a dedicated test environment with `allowLocalExecution: true` that binds only test-only secret scopes for the contexts you want test runs to use.

The flag is set by the orchestrator operator, either via the CLI:

```bash
kici-admin environment set-policy --env test-database --allow-local-execution true
```

or via the dashboard's "Test runs" toggle on the environment detail page. `kici secrets list` only surfaces contexts whose owning environment has `allowLocalExecution: true`, so production environments are never advertised as test-accessible.

### Local execution as an alternative

`kici run local` resolves the same local secret files entirely on your machine and honors the fixture `secrets: { ... }` mapping to pick which local context backs each name (see [`kici run local`](cli-reference.md#kici-run-local)). Because the values never leave your machine, it's a good fit when you want to exercise secret-dependent steps without involving the orchestrator at all.

### Discovering available contexts

```bash
# List test-accessible secret contexts and their key names (not values)
kici secrets list
```

## Detecting test mode in workflows

Use `ctx.isTestRun` to conditionally skip destructive operations:

```typescript
step('deploy', async (ctx) => {
  if (ctx.isTestRun) {
    ctx.log.info('Skipping deployment in test mode');
    return;
  }
  await ctx.$`kubectl apply -f k8s/`;
});
```

## Run history

### Viewing history

```bash
# Show recent test runs (from local history)
kici run remote --history
```

### Run details

```bash
# Show run summary (tries orchestrator first, falls back to local history)
kici status <run-id>

# Show full logs
kici status <run-id> --logs

# Show logs for a specific job
kici status <run-id> --logs --job build

# Machine-readable output
kici status <run-id> --json
```

## Scaffolding with kici init

Running `kici init` in a new project scaffolds a sample test fixture alongside the workflow templates:

```
.kici/
  workflows/
    hello-world.ts     # Sample workflow
    pr-checks.ts       # Sample PR workflow
  tests/
    push-test.ts       # Sample push fixture
  package.json
  tsconfig.json
.kiciignore            # Default exclusion patterns
```

The generated fixture uses the detected default branch:

```typescript
// .kici/tests/push-test.ts
import { fixture, push } from '@kici-dev/sdk';

export const pushMain = fixture('push-main', {
  event: push({ branches: ['main'] }),
});
```

## See also

- [CLI reference](cli-reference.md) -- complete command reference for all `kici` commands
- [SDK reference](sdk-reference.md) -- trigger functions, step context, and workflow API
- [Workflow patterns](workflow-patterns.md) -- example workflows to test against

---

## Workflow patterns

Source: https://docs.kici.dev/user/workflow-patterns/

Practical patterns for building real-world KiCI workflows in TypeScript. The patterns are organised across five pages -- start with [Basic CI](./patterns/basic.md) if you're new, or jump to [Integrations](./patterns/integrations.md) if you're wiring up a non-GitHub forge or a generic webhook.

| Page                                                       | Covers                                                                                                                                             |
| ---------------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------- |
| [Basic CI](./patterns/basic.md)                            | Single-job CI, PR-only / push-only filters, multiple triggers on one workflow, manual-only workflows.                                              |
| [Conditionals & matrix](./patterns/conditionals-matrix.md) | Conditional execution with rules, matrix builds (static + dynamic), and dynamic job generation.                                                    |
| [Integrations](./patterns/integrations.md)                 | Workflow chaining, generic webhooks, Stripe handlers, self-hosted git forges (Forgejo / Gitea / Gogs), plain GitHub repo webhooks (no GitHub App). |
| [Scheduling & events](./patterns/scheduling-and-events.md) | Nightly cron, workflow-complete-triggered deploys, custom event chaining.                                                                          |
| [Pattern reference](./patterns/reference.md)               | Step context, the examples repository, and GitHub check run output -- cross-cutting reference shared by every pattern above.                       |

## See also

- [Event system](events.md) -- event model concepts, registration model, circuit breaker
- [SDK reference](sdk-reference.md) -- complete API reference for all functions used in these patterns
- [CLI reference](cli-reference.md) -- how to compile and test these workflows locally
- [Getting started](getting-started.md) -- installation and first workflow setup
- [Job execution lifecycle](../architecture/execution/job-execution.md) -- how agents execute the jobs defined in these patterns
- [GitHub checks architecture](../architecture/webhooks/github-checks.md) -- deep dive into the check run system

---

# Workflow features

## Concurrency groups

Source: https://docs.kici.dev/user/concurrency/

Concurrency groups prevent multiple workflow runs from executing in parallel when they target the same resource. Common use cases include preventing parallel deploys to the same environment or serializing database migrations.

## Basic usage

```typescript
import { workflow, job, step, push } from '@kici-dev/sdk';

export default workflow('deploy', {
  on: push({ branches: ['main', 'staging'] }),
  concurrency: {
    group: (ctx) => `deploy-${ctx.branch}`,
    cancelInProgress: true,
    max: 1,
  },
  jobs: [
    job('deploy', {
      runsOn: 'linux',
      steps: [
        step('deploy', async ({ $ }) => {
          await $`./deploy.sh`;
        }),
      ],
    }),
  ],
});
```

## Configuration

The `concurrency` option on a workflow accepts:

| Field              | Type     | Default  | Description                                |
| ------------------ | -------- | -------- | ------------------------------------------ |
| `group`            | Function | Required | Returns the concurrency group key string   |
| `cancelInProgress` | boolean  | `true`   | Cancel older runs when a newer run arrives |
| `max`              | number   | `1`      | Maximum concurrent runs in the same group  |

### Group key function

The group key function receives a context with the branch name and event payload. Runs with the same group key are subject to concurrency limits.

```typescript
// Per-branch concurrency (most common)
group: (ctx) => `deploy-${ctx.branch}`;

// Global concurrency (across all branches)
group: () => 'deploy';

// Per-target-branch concurrency
group: (ctx) => `deploy-${ctx.event.targetBranch ?? 'default'}`;
```

The workflow-level group function is always evaluated **agent-side** at runtime -- the lock file records only that a group function exists (`hasGroup: true`), not the function itself. The agent loads the workflow source, calls the group function with `{ branch, event }`, and reports the evaluated key back to the orchestrator before step execution begins. This differs from job-level `concurrencyGroup` (see [Environments](environments.md#concurrency-groups)), where the compiler performs purity analysis and can inline pure functions for orchestrator-side evaluation.

## cancelInProgress mode

When `cancelInProgress: true`, a newer run supersedes older runs in the same group:

```
Run #1 starts deploying to main        -> running
Run #2 arrives for deploy-main group   -> Run #1 cancelled ("Superseded by run in concurrency group 'deploy-main'")
Run #2 continues                       -> running
```

This is the most common mode for deploy workflows -- you want the latest code deployed, not an outdated version.

The cancelled run:

- Receives a cancellation with reason "Superseded by run in concurrency group 'deploy-main'"
- Goes through the normal cancel flow (grace period, hooks if graceful)
- Has its GitHub Check status updated to `cancelled` with the superseded reason

```typescript
workflow('deploy', {
  concurrency: {
    group: (ctx) => `deploy-${ctx.branch}`,
    cancelInProgress: true,
  },
  jobs: [
    /* ... */
  ],
});
```
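
To make the supersede behavior concrete, here is a minimal model for a group with `max: 1`. `supersede` and `SupersedeResult` are hypothetical names, and the sketch deliberately does not cover `max > 1`, since this page does not say which run is cancelled in that case.

```typescript
interface SupersedeResult {
  running: string; // run that now holds the group
  cancelled: { runId: string; reason: string } | null;
}

// With cancelInProgress: true and max: 1, the incoming run always takes the
// group and the previous holder (if any) is cancelled with a superseded reason.
function supersede(groupKey: string, holder: string | null, incoming: string): SupersedeResult {
  const cancelled =
    holder === null
      ? null
      : { runId: holder, reason: `Superseded by run in concurrency group '${groupKey}'` };
  return { running: incoming, cancelled };
}

const supersedeExample = supersede('deploy-main', 'run-1', 'run-2');
```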

## Queue mode

When `cancelInProgress: false`, newer runs will wait until older runs complete:

```
Run #1 starts deploying                -> running
Run #2 arrives for same group          -> queued ("Waiting for deploy-main (1 ahead)")
Run #1 completes                       -> success
Run #2 starts                          -> running
```

In queue mode, the agent that picked up the queued run **stays connected** to the orchestrator and parks on a long-poll wait. When the holder finishes (success, failure, or cancel), the orchestrator dequeues the FIFO-next entry and pushes a `proceed` notification over the same WebSocket; the queued agent then continues with normal step execution against the workspace it already has. The agent's slot is therefore held for the duration of the queue wait, bounded by `KICI_CONCURRENCY_WAIT_TIMEOUT_MS` (default 1 hour).

```typescript
workflow('migrate-db', {
  concurrency: {
    group: () => 'migrations',
    cancelInProgress: false,
    max: 1,
  },
  jobs: [
    /* ... */
  ],
});
```

The dashboard shows a "Queued" badge with the reason, for example "Waiting for migrations (1 ahead)" for the workflow above.
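
The FIFO behavior can be sketched with a small in-memory queue. `createGroupQueue` is a hypothetical name; the sketch assumes that the "N ahead" count includes both running and already-queued runs, and it leaves out the wait timeout and the WebSocket notification.

```typescript
// Minimal in-memory model of one concurrency group in queue mode.
function createGroupQueue(groupKey: string, max: number = 1) {
  const running: string[] = [];
  const waiting: string[] = [];
  return {
    // Returns 'running' if a slot is free, otherwise a queued reason string.
    arrive(runId: string): string {
      if (running.length < max) {
        running.push(runId);
        return 'running';
      }
      const ahead = running.length + waiting.length; // assumption: holders count as "ahead"
      waiting.push(runId);
      return `queued: Waiting for ${groupKey} (${ahead} ahead)`;
    },
    // Frees the slot and returns the run that proceeds next, or null if none is waiting.
    finish(runId: string): string | null {
      const index = running.indexOf(runId);
      if (index === -1) return null;
      running.splice(index, 1);
      const next = waiting.shift();
      if (next === undefined) return null;
      running.push(next);
      return next;
    },
  };
}

const migrationsQueue = createGroupQueue('migrations');
```

Calling `arrive` twice and then `finish` on the first run reproduces the timeline shown at the top of this section: the second run is queued, then proceeds once the holder completes.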

## Max concurrent runs

The `max` field controls how many runs can execute simultaneously in the same group:

```typescript
// Allow up to 3 parallel test runs per branch
workflow('test', {
  concurrency: {
    group: (ctx) => `test-${ctx.branch}`,
    cancelInProgress: false,
    max: 3,
  },
  jobs: [
    /* ... */
  ],
});
```

When `max: 1` (default), runs are fully serialized within the group.

## Group key examples

### Deploy per environment

```typescript
workflow('deploy', {
  concurrency: {
    group: (ctx) => `deploy-${ctx.branch}`,
    cancelInProgress: true,
  },
  jobs: [
    job('deploy-staging', {
      runsOn: 'linux',
      environment: 'staging',
      steps: [
        /* ... */
      ],
    }),
  ],
});
```

### Global singleton

```typescript
// Only one migration can run at a time, regardless of branch
workflow('migrate', {
  concurrency: {
    group: () => 'db-migration',
    cancelInProgress: false,
  },
  jobs: [
    /* ... */
  ],
});
```

### Environment-aware groups

```typescript
// Serialize deploys per environment
workflow('deploy', {
  concurrency: {
    group: (ctx) => {
      const env = ctx.branch === 'main' ? 'production' : 'staging';
      return `deploy-${env}`;
    },
    cancelInProgress: true,
  },
  jobs: [
    /* ... */
  ],
});
```

## Interaction with environment protection

When a workflow has both `concurrency` and `environment` protection rules:

1. Environment protection gates (required reviewers, wait timer) apply first
2. Concurrency group check happens after protection gates pass
3. If the run is queued by concurrency, it keeps its protection approval

This means a run that passed approval won't need re-approval if it gets queued by concurrency.

## Cancelling queued runs

Queued runs can be cancelled before they start executing. The cancel request removes them from the queue immediately -- they don't go through the grace period since no step is running.

## Job-level concurrency groups

In addition to workflow-level concurrency, individual jobs can define their own concurrency group via the `concurrencyGroup` property. This controls concurrent execution at the job level rather than the workflow level. See [Environments — concurrency groups](environments.md#concurrency-groups) for details.

## Local execution

`kici run local` honors workflow-level `concurrency` per-machine, per-user. The `group` callback is evaluated against the simulated event identically to the remote orchestrator path; `cancelInProgress` carries the same semantics — `true` interrupts the holder via `SIGTERM` (escalating to `SIGKILL` after a grace window) and proceeds with the new run, while `false` queues the new invocation in FIFO order until the holder finishes.

Coordination is local only. Running the same workflow on two different machines does not serialize across them — that requires the orchestrator. For full cross-host enforcement (queueing across agents, dashboard visibility, `max > 1`), use `kici run remote` against a deployed orchestrator.

Lock files live under `$XDG_RUNTIME_DIR/kici-local-locks/` on Linux, falling back to `os.tmpdir()/kici-local-locks-<uid>/`. A workflow whose `group` callback throws aborts the run with a clear error rather than running unprotected. See [`kici run local` — Concurrency enforcement](cli-reference.md#concurrency-enforcement) for the `KICI_LOCAL_LOCK_KILL_GRACE_MS` override and the diagnostic output emitted while contending on a busy lock.
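
The directory choice described above might look roughly like the following. `resolveLocalLockDir` is a hypothetical helper that takes its inputs as parameters instead of reading `process.env` and `os.tmpdir()` directly, and it assumes the fallback is chosen purely by whether `XDG_RUNTIME_DIR` is set.

```typescript
// Hypothetical sketch of the lock directory resolution.
function resolveLocalLockDir(
  xdgRuntimeDir: string | undefined, // value of $XDG_RUNTIME_DIR, if any
  tmpDir: string, // what os.tmpdir() would return
  uid: number, // current user's numeric id
): string {
  if (xdgRuntimeDir !== undefined && xdgRuntimeDir !== '') {
    return `${xdgRuntimeDir}/kici-local-locks/`;
  }
  // Fallback is suffixed with the uid so users sharing a tmpdir do not collide.
  return `${tmpDir}/kici-local-locks-${uid}/`;
}
```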

The `kici run local --concurrency <n>` flag is a separate concept — it caps **job-level** parallelism within a single run (how many jobs from one workflow run at once), not cross-run serialization.

---

_Source: `packages/sdk/src/types.ts` (WorkflowOptions.concurrency, JobOptions.concurrencyGroup)_

---

## Dashboard

Source: https://docs.kici.dev/user/dashboard/

The KiCI dashboard is a web-based interface for monitoring workflow runs, inspecting job and step details, and reading log output. It is a browser single-page application that authenticates via OIDC and communicates with the Platform tier through REST API endpoints.

## Getting started

<!-- help:getting-started-overview#getting-started -->

The getting-started page is a six-step checklist that takes you from zero to your first workflow run.

- **Self-checked steps** -- install the CLI, scaffold a workflow, and run it locally. These run on your own machine, so you tick them off yourself; the dashboard remembers your choices in the browser.
- **Auto-detected steps** -- connect an orchestrator, add a webhook source, and trigger your first run. These tick automatically as the dashboard observes the matching activity in your organization.

Each step links to the relevant settings page or documentation. A progress bar tracks overall completion, and the sidebar entry shows a `done/total` badge until you finish or dismiss reminders.

<!-- /help:getting-started-overview -->

When you first sign in to a brand-new organization with no orchestrator, no webhook source, and no runs, the dashboard opens this page automatically. Once your organization has any activity, the run list becomes your landing page instead. The **Getting started** sidebar entry stays available so you can return to the checklist at any time.

The six steps are:

1. **Install the kici CLI** -- `npm install -g kici`.
2. **Create a workflow** -- `kici init` scaffolds a `.kici/` directory in your repository.
3. **Run a workflow locally** -- `kici run local pr:open` executes a workflow on your machine with no orchestrator required.
4. **Connect an orchestrator** -- deploy an orchestrator and connect it with a join token from **Settings → Orchestrator keys**.
5. **Add a webhook source** -- register a source under **Settings → Sources** so pushes and pull requests trigger runs.
6. **Trigger your first run** -- push to your repository to produce your first run through the relay.

## Navigation

### Sidebar

The left sidebar provides persistent navigation across all org-scoped pages:

- **Org switcher** -- dropdown at the top to switch between organizations
- **Getting started** -- onboarding checklist (shows a `done/total` badge until complete or dismissed)
- **Runs** -- the default landing page, showing your workflow run history
- **Workflows** -- permanently registered workflows listening for events
- **Diagnostics** -- infrastructure health, execution metrics, and recent errors
- **Metrics** -- time-series charts of orchestrator health (dispatch & agents, execution, webhooks, caching, logs, errors), scoped to this org
- **Environments** -- deployment environments with protection rules
- **Secrets** -- secret scope management with environment bindings
- **Approval queue** -- held runs pending approval (shows a badge with pending count)
- **Activity** -- federated forensic log merging upstream tenant-plane mutations and orchestrator reads (`access_log`) into one chronological stream
- **DLQ** -- dead-letter queue of internal events whose dispatch retries were exhausted (shows a badge with the current depth)
- **Settings** -- organization settings with tabbed sub-pages

The sidebar footer shows the WebSocket connection indicator, your user profile, UTC/local time toggle, theme toggle, and a collapse button.

<!-- help:sidebar-build-info#sidebar -->

Below the KiCI logo, the sidebar shows build information for both the dashboard UI and the Platform API backend:

- Git commit hash.
- Relative build timestamp (e.g. "2h ago").

This makes it easy to confirm which version is currently deployed.

<!-- /help:sidebar-build-info -->

### Mobile navigation

On screens narrower than 768px (the `sm` breakpoint), the sidebar collapses and is replaced by a bottom tab bar with six navigation items: Runs, Workflows, Envs (environments), Secrets, Health (diagnostics), and Settings. Note that the mobile tab bar shows a subset of the full sidebar navigation -- activity and approval queue are only available in the full desktop sidebar.

<!-- help:run-list-overview#run-list -->

The run list is your organization's default landing page, showing all workflow runs with status, trigger, branch, and timing. Use filters and sorting to find specific runs, or enable commit grouping to see all runs triggered by a single push.

<!-- /help:run-list-overview -->

<!-- help:run-list-commit-grouping#commit-grouped-view -->

Commit grouping collapses runs that share the same commit SHA under a single header. This is useful when a push triggers multiple workflows -- you can see their aggregate status at a glance instead of scanning individual rows.

<!-- /help:run-list-commit-grouping -->

## Run list

The run list is the default page when entering an organization (`/orgs/:customerId/runs`).

### Columns

Each run is displayed in a table row (desktop) or card (mobile) with:

- **Status** -- colored badge (green = success, red = failed/error/timed out, amber = running/cancelling, yellow = queued/pending, gray = cancelled/skipped)
- **Trigger** -- icon indicating the event type (push, pull request, tag, dispatch, etc.)
- **Workflow** -- the workflow name from your `.kici/workflows/` directory
- **Branch** -- the git ref that triggered the run
- **Commit** -- the first 7 characters of the commit SHA, linked to the provider (GitHub)
- **Duration** -- how long the run took (e.g. "2m 30s")
- **Time** -- relative timestamp (e.g. "5 minutes ago")

### Filters

Dropdown filters appear above the table:

- **Status** -- filter by success, failed, running, or cancelled
- **Workflow** -- filter by workflow name
- **Branch** -- filter by git branch
- **Repository** -- filter by repository

A "More filters" button reveals additional filters:

- **Trigger type** -- filter by push, pull_request, tag, dispatch, etc.

Filters persist in URL query parameters (e.g. `/runs?status=failed&branch=main`), making filtered views shareable and bookmark-friendly. A "Clear filters" button appears when any filter is active.

### Sorting

Click any column header to sort the table by that column. Clicking the same header toggles between ascending and descending order. The current sort is reflected in the URL (e.g. `?sort=workflowName&dir=desc`), so sorted views are shareable.

Sorting is server-side -- the API returns results in the requested order.
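
Because filters and sort order both live in the query string, a shareable link is just a serialized set of parameters. The helper below is a sketch, not dashboard code: `buildRunListUrl` is a hypothetical name, and the `/runs` path is taken from the examples above.

```typescript
// Serialize active filters and sort settings into a run-list URL.
function buildRunListUrl(params: Record<string, string | undefined>): string {
  const pairs: string[] = [];
  for (const key of Object.keys(params)) {
    const value = params[key];
    if (value === undefined || value === '') continue; // inactive filters are omitted
    pairs.push(`${encodeURIComponent(key)}=${encodeURIComponent(value)}`);
  }
  return pairs.length === 0 ? '/runs' : `/runs?${pairs.join('&')}`;
}
```

For example, `buildRunListUrl({ status: 'failed', branch: 'main' })` yields `/runs?status=failed&branch=main`, matching the filtered-view example above.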

### Column visibility

A gear icon button (labeled "Toggle columns") next to the filter bar opens a menu of toggleable columns. Uncheck a column to hide it from the table. Column visibility preferences are saved per organization in `localStorage`.

### Commit grouped view

A "Group by commit" toggle switch groups runs by their commit SHA. When enabled, runs sharing the same commit are collapsed under a group header showing the commit SHA (first 7 characters), commit message, and aggregate status dots. This is useful for seeing all workflow runs triggered by a single push.
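
Conceptually, commit grouping is a group-by on the commit SHA. The following sketch uses hypothetical types (`RunRowSketch`, `CommitGroupSketch`) and keeps groups in first-seen order; how the dashboard actually orders groups and computes aggregate status is not specified here.

```typescript
interface RunRowSketch {
  id: string;
  commitSha: string;
  workflowName: string;
}

interface CommitGroupSketch {
  shortSha: string; // first 7 characters, as shown in the group header
  runs: RunRowSketch[];
}

function groupRunsByCommit(runs: RunRowSketch[]): CommitGroupSketch[] {
  const groups = new Map<string, CommitGroupSketch>();
  for (const run of runs) {
    let group = groups.get(run.commitSha);
    if (group === undefined) {
      group = { shortSha: run.commitSha.slice(0, 7), runs: [] };
      groups.set(run.commitSha, group);
    }
    group.runs.push(run);
  }
  return Array.from(groups.values());
}

const groupedRuns = groupRunsByCommit([
  { id: '1', commitSha: 'abc1234def5678', workflowName: 'ci' },
  { id: '2', commitSha: 'abc1234def5678', workflowName: 'deploy' },
  { id: '3', commitSha: '9990000aaa1111', workflowName: 'ci' },
]);
```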

### Compile indicator

Runs where the lock file was recompiled during execution show a hammer icon next to the workflow name. Hover over the icon to see the tooltip "Lock file recompiled".

### Pagination

The run list shows 20 runs per page with numbered pagination controls. A footer displays the current range and total count (e.g. "Showing 1-20 of 237 runs").
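
The footer range follows directly from the page number and page size. `paginationFooter` is a hypothetical helper that assumes 1-based page numbers and a non-empty result set.

```typescript
// Compute the "Showing X-Y of N runs" footer for a 1-based page number.
function paginationFooter(page: number, total: number, pageSize: number = 20): string {
  const first = (page - 1) * pageSize + 1;
  const last = Math.min(page * pageSize, total); // the last page may be partial
  return `Showing ${first}-${last} of ${total} runs`;
}
```

With 237 runs, page 1 gives "Showing 1-20 of 237 runs" and the final page (12) gives "Showing 221-237 of 237 runs".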

### Empty states

- **No runs, WS disconnected** -- "No orchestrator connected" with guidance to check orchestrator configuration and a link to settings.
- **No runs, WS connected** -- "No runs yet" with guidance to push code to trigger a workflow run.
- **No filter matches** -- "No matching runs" with guidance to adjust filters.

<!-- help:run-detail-job-tree#job-tree -->

The job tree shows the hierarchical structure of your run's jobs and steps. Click a job to see combined logs from all its steps, or expand a job to select an individual step.

Failed runs auto-expand the first failed job for quick diagnosis.

<!-- /help:run-detail-job-tree -->

<!-- help:run-detail-metadata#metadata -->

The metadata panel displays detailed context about the selected run, job, or step:

- IDs, status, duration.
- Orchestrator and agent assignment.
- Matrix values (when present).
- Provider links — commit SHA, trigger event, workflow source file on GitHub.

Use it as the quick-reference card when you need to jump from the dashboard to the underlying VCS or infrastructure.

<!-- /help:run-detail-metadata -->

<!-- help:run-detail-source#metadata -->

The source row identifies which webhook source produced this run. Orchestrators register sources at startup with a friendly name and a fine-grained subtype — GitHub App, generic webhook, universal Git, or internal.

The dimmed routing key under the name (e.g. `github:12345` or `generic:org:src-id`) is the unique identifier Platform uses to route the webhook back to your orchestrator.

Two repos with the same path served by different sources are distinguished here, so you can tell at a glance which deployment a run came from.

<!-- /help:run-detail-source -->

<!-- help:run-detail-job-labels#metadata -->

Labels show the routing constraints used to match this job to an agent.

Label categories:

- **Platform labels:** `kici:os:linux`, `kici:arch:x64`, etc.
- **Scaler labels:** added by the scaler that provisioned the agent.
- **Role labels:** e.g. `kici:role:builder`.
- **Custom labels:** anything you set via `runsOn` in your workflow definition.

<!-- /help:run-detail-job-labels -->

<!-- help:run-detail-trust-context#metadata -->

The trust context shows the security evaluation for PR-triggered runs:

- **Trust tier:** trusted, known, or unknown contributor.
- **Lock file source:** head branch or base branch.
- **Secrets access level:** what the run was permitted to read.

Use this to understand why a run was held for approval or ran with restricted permissions.

<!-- /help:run-detail-trust-context -->

<!-- help:summary-job-contexts#tabs -->

The job contexts section in the run summary shows execution context per job:

- **Sandbox type** — container, Firecracker, bare-metal.
- **Runtime environment** — image, OS, arch.
- **Dependency cache status** — hit, miss, or skipped.
- **Available secret keys** — which scopes the job could read.

This gives a quick overview of every job's execution environment without needing to select each job individually.

<!-- /help:summary-job-contexts -->

<!-- help:summary-scaler-context#tabs -->

The scaler configuration section shows the execution mode and backend-specific settings used to provision the agent that ran this job. The execution mode describes how steps run — for example, "bare-metal" means direct processes, even inside containers where the container is the isolation layer.

Backend-specific fields:

- **Container backends:** image name, runtime (Docker/Podman), resource limits, and network isolation settings.
- **Firecracker backends:** rootfs and kernel paths, vCPU/memory allocation, and the VM's IP address.
- **Bare-metal backends:** binary path and resource hints.

Hover over the execution mode badge for a contextual explanation.

<!-- /help:summary-scaler-context -->

<!-- help:summary-job-outputs#tabs -->

Shows plain outputs and secret output keys produced by this job.

- **Plain outputs:** values returned by step functions, grouped by step name.
- **Secret outputs:** set via `ctx.setSecretOutput()`, shown here as masked key names — the values are encrypted and never sent to the dashboard.

Downstream jobs that declare this job in their `needs` array can read these outputs.

<!-- /help:summary-job-outputs -->

<!-- help:summary-step-secrets#tabs -->

When a step is selected, the secrets-accessed section shows which secret keys the step read via `ctx.secrets.get()` or `ctx.secrets.expose()` during execution.

Only key names are shown, never values — use this to audit which steps access sensitive credentials.

Available for runs executed after this feature was deployed; older runs show no data.

<!-- /help:summary-step-secrets -->

## Run detail

Click any run in the list to open its detail page (`/orgs/:customerId/runs/:runId`).

### Layout

The page uses a responsive multi-panel layout that adapts to screen width:

- **Wide desktop (>= 1200px)** -- three-panel layout with a resizable job tree (left), content area (center), and metadata sidebar (right). Two draggable dividers between the panels let you resize them. Panel sizes persist to `localStorage`.
- **Medium desktop (768px to 1199px)** -- two-panel layout with the job tree and content area. Metadata is accessible via a "Show metadata" drawer button.
- **Mobile (< 768px)** -- stacked layout with the job tree at the top and content below. Metadata is available as a tab alongside Logs, Payload, Timeline, and Summary.
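
The breakpoints above reduce to a simple width check. In this sketch, `pickRunDetailLayout` is a hypothetical name, and the medium layout is assumed to cover widths from 768px up to 1199px.

```typescript
type RunDetailLayout = 'three-panel' | 'two-panel' | 'stacked';

function pickRunDetailLayout(viewportWidthPx: number): RunDetailLayout {
  if (viewportWidthPx >= 1200) return 'three-panel'; // job tree, content, metadata sidebar
  if (viewportWidthPx >= 768) return 'two-panel'; // metadata moves into a drawer
  return 'stacked'; // metadata becomes a tab
}
```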

### Run header

A summary bar above the panels shows:

- **Breadcrumbs** -- Runs > github > owner/repo > commit SHA > #runId > workflow name (each segment is clickable and filters the run list by that dimension)
- **Status badge** -- the current run status
- **Trigger icon** -- visual indicator of the event type
- **Branch** -- the git ref with a branch icon
- **Commit SHA** -- linked to the provider's commit page
- **Duration** -- total run time
- **Timestamp** -- relative time since the run started (hover for absolute time)
- **Re-run button** -- available for terminal-state runs (success, failed, cancelled, error, timed out) triggered by webhooks. Opens a confirmation dialog before re-running on the same commit. After confirmation, navigates to the new run.
- **Cancel button** -- available for pending, queued, running, or cancelling runs. For a running run, it sends a graceful cancel; for a run that is already cancelling, a "Force cancel" button appears instead, which kills the run immediately without cleanup.
- **Lineage badge** -- if the run is a re-run, a badge shows the parent/child relationship with a link to the original run.

### Job tree

The left panel shows a tree of jobs and their steps:

- Each job shows a **status dot**, **name**, and **duration** (live timer while running)
- Click a job row to select the job and view its combined logs (all steps merged with sticky headers)
- Click the expand chevron on a job to expand/collapse its steps
- Each step shows a **status dot**, **name**, and **duration**
- Click a step to select it and view its individual logs

**Job-level selection** -- clicking a job row selects it and shows combined logs from all of its steps, with sticky step headers separating each step's output. This provides a unified view of the entire job's execution without needing to click through steps individually.

**Matrix jobs** are grouped under a parent node. For example, a matrix with 3 Node.js versions appears as "Test (3 variants)" with expandable sub-entries like "Test (node:18)", "Test (node:20)", "Test (node:22)".

**Hook steps** -- lifecycle hook steps (e.g. `onCancel`, `cleanup`, `onSuccess`) are displayed with a distinct badge to differentiate them from regular steps.

<!-- help:run-detail-setup-jobs#job-tree -->

**Setup jobs** are rows prefixed with `__init__`, `__build__`, or `__dynamic__`. They run before (or alongside) your workflow jobs.

In the tree they appear with:

- A pretty display name — `Init: foo`, `Build: foo`, `Evaluate: foo`.
- A muted "setup" visual variant that distinguishes them from regular jobs without hiding them.
- A single synthetic step-0 log that captures everything the workflow source and dynamic functions write — explicit `log.*` calls, `console.*` output, and subprocess stdout from `await $` inside a `DynamicJobFn`.

Their elapsed time is intentionally visible: clone, dependency install, and dynamic evaluation can take noticeable time, and hiding it would obscure where the run actually spends its time.

<!-- /help:run-detail-setup-jobs -->
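
The mapping from setup-job prefix to display name described above can be sketched as a small lookup. This assumes the raw job name is the prefix immediately followed by the workflow name (for example `__init__foo`), which the text does not spell out; `setupDisplayName` is a hypothetical helper, not part of the SDK.

```typescript
// Prefixes and labels are taken from the text above; the raw-name
// format (prefix directly followed by the name) is an assumption.
const SETUP_PREFIXES: ReadonlyArray<[string, string]> = [
  ["__init__", "Init"],
  ["__build__", "Build"],
  ["__dynamic__", "Evaluate"],
];

// Returns the pretty name for a setup job, or null for a regular job.
function setupDisplayName(rawName: string): string | null {
  for (const [prefix, label] of SETUP_PREFIXES) {
    if (rawName.startsWith(prefix)) {
      return `${label}: ${rawName.slice(prefix.length)}`;
    }
  }
  return null;
}
```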

**Auto-expand on failure** -- when viewing a failed run, the first failed job is automatically expanded and the failed step is selected.

**URL sync** -- selecting a job updates the URL to `/runs/:runId/jobs/:jobId`, and selecting a step updates it to `/runs/:runId/jobs/:jobId/steps/:stepIndex`, making selections bookmarkable and shareable.
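
As an illustration of the URL shapes above, the sketch below builds the path for a given selection. The `Selection` interface and `selectionPath` function are hypothetical; only the path templates come from the text.

```typescript
// Hypothetical shape for the current selection in the job tree.
interface Selection {
  runId: string;
  jobId?: string;
  stepIndex?: number;
}

// Builds /runs/:runId, /runs/:runId/jobs/:jobId, or
// /runs/:runId/jobs/:jobId/steps/:stepIndex depending on what is selected.
function selectionPath(sel: Selection): string {
  let path = `/runs/${encodeURIComponent(sel.runId)}`;
  if (sel.jobId !== undefined) {
    path += `/jobs/${encodeURIComponent(sel.jobId)}`;
    // A step can only be addressed inside a job.
    if (sel.stepIndex !== undefined) path += `/steps/${sel.stepIndex}`;
  }
  return path;
}
```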

### Keyboard navigation

The job tree supports keyboard navigation:

- **Arrow Up/Down** -- move focus through tree items
- **Enter** -- select a job (show combined logs) or select a step (show step logs)
- **Escape** -- deselect the current selection and navigate to the first job

### Tabs

The content area has the following tabs:

- **Logs** (default) -- shows log output for the selected job or step
- **Payload** -- webhook payload viewer showing the raw event payload that triggered the run. This tab appears only for runs triggered by a webhook event (and re-runs of those, which copy the original payload); runs started by a schedule, manual schedule, lifecycle event, or another run carry no payload, so the tab is hidden for them
- **Timeline** -- CSS Gantt chart showing the execution timeline of all jobs, with percentage-based bars and striped animation for running jobs. A **Provisioning** milestones section between the dispatch and execution phases plots scaler lifecycle events for the run — including a **Provisioning failed** marker when the scaler could not bring an agent up
- **Summary** -- contextual overview scoped to the current selection (run-level trigger/repo/timing info, or job-level execution context with environment variables, runtime info, and sandbox details)

On wide desktop (>= 1200px), Metadata is shown in a dedicated sidebar panel instead of as a tab.

### Metadata

The metadata panel shows detailed information organized into sections:

- **Run metadata** -- run ID, status, trigger event, branch, commit SHA (linked to provider), workflow name (linked to source file on provider), duration, and timestamps
- **Job metadata** -- job name, status, agent ID, matrix values (if present), duration
- **Step metadata** -- step name, step index, status, duration
- **Trust context** (PR-triggered runs only) -- shows the contributor's trust tier (trusted, known, or unknown), lock file source (head or base branch), and secrets access level

Provider-specific links (e.g., GitHub commit URL, branch URL, PR link, workflow source file link) are automatically generated based on the repository context. The workflow name in the metadata panel is a clickable link to the `.kici/workflows/<name>.ts` source file on the provider (e.g. GitHub blob view).

### WebSocket connection indicator

A small indicator in the sidebar footer shows the real-time WebSocket connection status:

- **Green dot** -- connected and receiving live updates
- **Red dot (pulsing)** -- disconnected

## Log viewer

The log viewer renders step output with full terminal color support.

### ANSI color rendering

Log lines containing ANSI escape codes are rendered with color. Supported sequences include:

- Standard 16 colors (red, green, blue, etc.) and bright variants
- 256-color palette
- Truecolor (24-bit RGB)
- Bold, faint, italic, underline, and inverse text

Colors use CSS classes with a dark background (similar to a terminal), regardless of the dashboard's light/dark theme setting.

### Timestamps

A clock icon button next to the search bar toggles per-line timestamps in the log viewer. When enabled, each log line shows the timestamp in the gutter alongside the line number. The timestamp format respects the UTC/local time preference. The setting persists to `localStorage`.

### Search

A search bar at the top of the log viewer provides:

- **Debounced search** -- type a query and matches are highlighted after 300ms
- **Match count** -- shows "N of M" with the current and total match count
- **Navigation** -- up/down arrows to jump between matches (also Enter/Shift+Enter)
- **Clear** -- press Escape or click the X button to clear the search
- **Wraparound** -- navigation wraps from the last match back to the first
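
The wraparound behavior can be expressed with modular arithmetic over the match index. This is a sketch under assumptions: indices are zero-based internally, and backward navigation wraps from the first match to the last (the text only states the forward case). `nextMatch` and `matchLabel` are illustrative names.

```typescript
// Returns the index of the next match, wrapping at either end.
// Returns -1 when there are no matches.
function nextMatch(current: number, total: number, direction: 1 | -1): number {
  if (total === 0) return -1;
  // Adding `total` keeps the left operand non-negative when stepping back from 0.
  return (current + direction + total) % total;
}

// Formats the "N of M" counter from a zero-based index.
function matchLabel(current: number, total: number): string {
  return `${current + 1} of ${total}`;
}
```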

### Permalink

Click any line number in the gutter to:

1. Highlight that line with a blue tint
2. Update the URL hash to `#L42` (for line 42)

Sharing the URL scrolls the recipient directly to the highlighted line.
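
A minimal sketch of how a `#L42`-style hash could be parsed back into a line number follows. The `parseLineHash` helper is hypothetical, and treating line numbers as starting at 1 is an assumption.

```typescript
// Parses a permalink hash such as "#L42" into a line number.
// Returns null for anything that is not "#L" followed by a positive integer.
function parseLineHash(hash: string): number | null {
  const match = /^#L([1-9]\d*)$/.exec(hash);
  return match ? Number(match[1]) : null;
}
```

In a browser, this would typically be called with `window.location.hash` on page load to decide which line to scroll to.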

### Copy to clipboard

Hover over any line to reveal a copy button on the right. Clicking it copies the line's **plain text** (ANSI escape codes are stripped) to the clipboard. A "Copied!" tooltip confirms the action.
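
Stripping ANSI codes before copying can be approximated with a single regular expression over CSI sequences (`ESC [` followed by parameter bytes, intermediate bytes, and a final byte). This is a sketch of the general technique, not the dashboard's implementation; it does not handle other escape families such as OSC.

```typescript
// Matches CSI sequences: ESC [ , parameter bytes 0x30-0x3F,
// intermediate bytes 0x20-0x2F, and one final byte 0x40-0x7E.
const CSI_SEQUENCE = /\x1b\[[0-?]*[ -\/]*[@-~]/g;

// Returns the line with color and style sequences removed.
function stripAnsi(line: string): string {
  return line.replace(CSI_SEQUENCE, "");
}
```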

### Live log streaming

When viewing a running job, logs appear in real time as the agent executes steps. The dashboard maintains a WebSocket connection to the Platform tier and subscribes to log updates for the currently selected step.

**Auto-scroll** -- new lines automatically scroll into view as they arrive. If you scroll up to review earlier output, auto-scroll pauses and a **"Jump to bottom"** button appears. Clicking it resumes auto-scroll.

**Streaming indicator** -- a pulsing "Streaming" badge appears next to the Logs tab header while a step is actively running.

**Completion banner** -- when a step finishes, a banner appears at the bottom of the log viewer showing the final status (success or failed) and total line count.

**Status updates** -- the run list and run detail pages update live as jobs and steps change state. You do not need to refresh the page to see a run complete.

**Known limitations**:

- Live streaming requires an active WebSocket connection. Some corporate proxies may block WebSocket upgrades.
- If the WS connection drops, the dashboard reconnects automatically and refetches all cached data to catch up on missed updates.
- Log lines received during streaming are held in memory. For very long-running steps with massive output, the REST endpoint is the authoritative source for complete logs.

### Provisioning logs

Above the step logs, a collapsible **Provisioning logs** section shows the orchestrator-side lifecycle of the agent that ran the job — the scaler lifecycle events emitted while bringing an agent up. It starts expanded while provisioning is in progress (no step logs yet) and collapses once steps begin producing output.

When the scaler **fails** to provision an agent (for example a missing binary, an unpullable container image, or a microVM that fails to boot), the failure appears here along with a bounded tail of the agent process's own stdout/stderr captured by the scaler. This is the surface to check for a run that fails with no step logs at all — the agent never started, so the cause lives in the provisioning lifecycle rather than in any step's output.

### Performance

The log viewer uses virtualized scrolling to handle large outputs. Only the visible lines plus a small buffer are rendered in the DOM, keeping performance smooth even for logs with 10,000+ lines.
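
The core of virtualized scrolling is computing which slice of lines to render for the current scroll position. The sketch below assumes a fixed line height and a default buffer of 5 lines on each side; both values, and the `visibleRange` name, are illustrative rather than taken from the dashboard.

```typescript
interface VisibleRange {
  start: number; // first line index to render (inclusive)
  end: number; // last line index to render (exclusive)
}

function visibleRange(
  scrollTop: number,
  viewportHeight: number,
  lineHeight: number,
  totalLines: number,
  overscan = 5,
): VisibleRange {
  const first = Math.floor(scrollTop / lineHeight);
  const count = Math.ceil(viewportHeight / lineHeight);
  return {
    start: Math.max(0, first - overscan),
    end: Math.min(totalLines, first + count + overscan),
  };
}
```

With a 400px viewport and 20px lines, at most 30 lines are rendered at a time regardless of whether the log has 100 lines or 100,000.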

<!-- help:settings-general#settings -->

General settings show your organization's basic information, including the org name (editable by owners) and the unique organization ID. Use this to rename your org or reference the ID for API calls and configuration.

<!-- /help:settings-general -->

<!-- help:settings-members#settings -->

The members tab lets you manage your team:

- Invite new members by email.
- Assign roles.
- Suspend or remove members.
- Configure per-user CI trust levels.

Each member's linked provider accounts (e.g. GitHub) are also visible here.

<!-- /help:settings-members -->

<!-- help:settings-roles#settings -->

Roles define granular permissions across 15 resource categories (runs, secrets, members, etc.) with 5 access levels: `none`, `read`, `read_payload`, `write`, `admin`.

Create custom roles to restrict what team members can do, or use the built-in **Owner** role for full access.

<!-- /help:settings-roles -->

<!-- help:settings-api-keys#settings -->

API keys allow programmatic access to the KiCI API for automation, scripts, and CI integrations.

Each key is scoped to this organization with a custom permission matrix and an optional expiry date. Keys can be revoked individually.

Use a key's clone button to open the creation modal prefilled with that key's name, expiry, and permissions — handy for recreating an expired key or deriving a new key from an existing one.

<!-- /help:settings-api-keys -->

<!-- help:settings-orchestrator-keys#orchestrator-keys -->

Orchestrator keys authenticate the WebSocket connection between your orchestrator and the KiCI Platform relay.

Create a key here and set it as the `KICI_PLATFORM_TOKEN` environment variable in your orchestrator configuration. Keys can optionally be restricted to specific routing patterns.

Use a key's clone button to open the creation modal prefilled with that key's name and description.

<!-- /help:settings-orchestrator-keys -->

<!-- help:settings-sources#sources -->

Webhook sources are registered automatically when an orchestrator connects to the Platform and sends a `source.register` message.

Each source shows its routing key and full webhook URL — configure this URL in your provider's webhook settings (e.g. GitHub App).

To retrieve the webhook secret for signature verification, use the `kici-admin source get-webhook-secret <routingKey>` command shown below each source.

<!-- /help:settings-sources -->

<!-- help:settings-billing#settings -->

The billing tab shows your current plan (Free, Pro, or Team), resource usage meters, and lets you upgrade to a paid tier.

Choose Monthly or Annual billing, click "Upgrade to Pro" or "Upgrade to Team" to start a Stripe Checkout, or use "Manage payment" to switch tiers and update payment methods via the Stripe Billing Portal.

The usage meters track:

- **Members:** invited users in this org.
- **Orchestrator connections:** direct WebSocket connections from your orchestrators to the Platform. Only coordinators (and standalone orchestrators) open a connection; peer/worker nodes in a Raft cluster share their coordinator's connection and don't count separately.
- **Relayed webhooks (this month):** webhooks delivered through the Platform relay during the current billing window.
- **Live log minutes (today):** log streaming time consumed in the current UTC day.
- **Retention period:** how long execution history is kept.

The diagnostics page may show a higher orchestrator count than this tab — diagnostics counts cluster nodes, billing counts billable connections.

<!-- /help:settings-billing -->

<!-- help:settings-billing-orch-connections#settings -->

The orchestrator-connections counter measures the number of **direct WebSocket connections** that your orchestrator processes hold open against the KiCI Platform — one count per live connection.

**What counts as one connection:**

- One standalone orchestrator (single process, no cluster) → **1 connection**.
- One Raft cluster (1 coordinator + N peers) → **1 connection** — only the coordinator opens a Platform WebSocket. The peers gossip through the coordinator and never connect to Platform directly, so they do **not** count toward your billing limit.
- N independent orchestrator deployments (e.g., one per environment, one per region) → **N connections**.

This is why the diagnostics page can show more orchestrator **nodes** than the billing page shows **connections**: diagnostics counts every node in your topology (coordinator + peers), while billing only counts the WebSocket connections you pay for. A 4-connection org running two 3-node clusters and two standalones will show 4 on the billing meter and 8 on the diagnostics page — both numbers are correct, they measure different things.

When you hit the cap, the next coordinator that tries to connect is rejected with WebSocket close code 4020 (`WS_CLOSE_PLAN_LIMIT`). Existing connections are never disconnected. Upgrade your plan to lift the cap; the meter updates immediately.

<!-- /help:settings-billing-orch-connections -->
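
The difference between the billing meter and the diagnostics count in the orchestrator-connections help text can be reproduced with a short calculation. The `Deployment` type and `countTopology` function are hypothetical; the counting rules (one connection per deployment, one node per coordinator, peer, or standalone) follow the description above.

```typescript
type Deployment =
  | { kind: "standalone" }
  | { kind: "cluster"; peers: number }; // one coordinator plus `peers` peers

function countTopology(deployments: Deployment[]): { connections: number; nodes: number } {
  let nodes = 0;
  for (const d of deployments) {
    nodes += d.kind === "cluster" ? 1 + d.peers : 1;
  }
  // Each deployment holds exactly one Platform WebSocket, whether it is
  // a standalone orchestrator or a cluster's coordinator.
  return { connections: deployments.length, nodes };
}

// Two 3-node clusters and two standalones: 4 connections, 8 nodes.
const example = countTopology([
  { kind: "cluster", peers: 2 },
  { kind: "cluster", peers: 2 },
  { kind: "standalone" },
  { kind: "standalone" },
]);
```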

<!-- help:settings-billing-relayed-webhooks#settings -->

The relayed-webhooks counter only includes webhooks delivered through the KiCI Platform relay — the route at `kici.dev` that signature-verifies an inbound webhook and forwards it over WebSocket to your orchestrator.

Webhooks pointed directly at your orchestrator's public ingest endpoint never reach the Platform, so they're invisible to this counter and uncapped on every Hosted tier. If you have a public orchestrator ingress, you can mix-and-match: use the relay for sources you can't expose publicly, and point GitHub (or any provider / generic webhook) straight at your orchestrator for the rest.

Every webhook the relay forwards counts — **including ones your workflows ultimately ignore**. Trigger matching runs on your orchestrator, not on the Platform, so the relay forwards each signature-verified webhook before any trigger is evaluated. A source that sends many events you filter down to a handful of runs still consumes one relayed webhook per event. If a high-volume source mostly produces no run, point it directly at your orchestrator (see above) to keep it off this counter entirely.

When you hit the cap, new relayed webhooks are rejected with `429 Plan limit reached`. Upgrade in the Stripe Billing Portal to lift the cap immediately; usage resets monthly on your billing anniversary.

<!-- /help:settings-billing-relayed-webhooks -->

<!-- help:settings-billing-currency#settings -->

Switch the prices shown on the tier cards between US dollars and euros. The currency you pick here is also the one Stripe charges in when you click "Upgrade".

The default is detected from your browser language. EU, EFTA, and UK locales default to euros; everywhere else defaults to dollars.

Your choice persists in a 90-day cookie (`kici_pricing_currency`), so it survives across reloads and applies on every billing page.

<!-- /help:settings-billing-currency -->

<!-- help:billing-payment-failure#settings -->

This banner appears when your organization's latest payment through Stripe has failed. Your subscription remains active during the retry period, but you should update your payment method promptly to avoid service interruption.

<!-- /help:billing-payment-failure -->

<!-- help:activity-overview#activity -->

Activity is your forensic log — every Platform mutation (invites, role changes, sources, plans) and every orchestrator action (reads, run cancels, secret reveals, environment edits) merged into one chronological stream.

Each row shows the actor, the action, the target, and the outcome.

- **Audit rows:** expand for field-level change tracking.
- **Access rows:** expand for the request ID, origin, and any error message.

<!-- /help:activity-overview -->

<!-- help:activity-filters#activity -->

Filters live entirely in the URL — bookmark or share a filtered view to replay it.

- **Search:** full-text match against access-log error messages and the JSON body of audit entries.
- **Run ID:** combine with another filter to scope all activity touching a specific run.

Click a row's run target to jump straight to the run detail page.

<!-- /help:activity-filters -->

## DLQ

The DLQ (dead-letter queue) page lists internal events whose dispatch attempts were exhausted (or that hit a non-retryable error). Each row shows when the event landed in the DLQ, the event name, the attempt count, the failure reason, and the last error message.

<!-- help:dlq#dlq -->

The DLQ holds events your org emitted that could not be dispatched within the retry budget. The sidebar badge shows the current depth so you can spot a building backlog without opening the page.

Per-row actions (visible when you have `event_dlq:write`):

- **Retry:** clears the DLQ flag and re-publishes the event. A healthy orchestrator picks it up immediately.
- **Discard:** permanently deletes the row. Use when the payload is corrupt or the routing target no longer exists.

Members with only `event_dlq:read` see the list but cannot retry or discard. Org owners have both actions by default.

<!-- /help:dlq -->

<!-- help:settings-ci-trust#settings -->

CI trust policy controls how your organization handles PR-triggered runs from different contributor types. Configure the default trust level for unknown contributors and set per-member overrides to control who can run workflows with full secrets access.

<!-- /help:settings-ci-trust -->

<!-- help:settings-global-workflows#settings -->

Global workflows let a single "workflow repo" define jobs that run when events happen in other repos in the same org.

This tab exposes the security knobs as independent axes:

- **Master enable toggle:** turn the whole feature on or off.
- **Authoring allow-list:** which repos may **define** global workflows.
- **Source deny-list:** **source** repos whose events never trigger globals (forks, public-contrib).
- **Elevated-access list:** authoring repos that need source-repo secrets during execution.

See the [user guide](global-workflows.md) and the [architecture reference](../architecture/global-workflows.md) for the full model.

<!-- /help:settings-global-workflows -->

<!-- help:settings-global-workflows-enable#settings -->

Master kill-switch for global workflows in this org.

- **OFF:** the orchestrator will **not register** any workflow that declares `repos:` patterns, and will **not dispatch** cross-repo triggers — effectively rolling the org back to per-repo-only semantics. All other settings on this page are ignored.
- **ON:** the other toggles become your safety rails. Turn ON to opt in.

<!-- /help:settings-global-workflows-enable -->

<!-- help:settings-global-workflows-authors#settings -->

Restricts which repos in this org may **define** global workflows (the "authoring axis").

- **OFF:** any repo in the org may declare a workflow with `repos:` patterns and have it registered.
- **ON:** only repos whose identifier matches one of the entries below may author globals. Non-matching repos have their global workflows dropped at registration time, with a warning in the orchestrator log.
- **ON + empty list:** **no repo** may author globals — use as a temporary lock-down.

Each entry has two parts:

- **Source:** pick a configured source (a specific GitHub App or universal-git source) to pin the entry to that source only, or leave it as **Any source** to match across every source in the org.
- **Pattern:** a glob matched against the authoring repo identifier (e.g. `myorg/ci-*`, `myorg/platform-*`).

Pinning by source is useful when the same `owner/repo` could legitimately exist on more than one configured source and you only want to trust one of them as an author.

<!-- /help:settings-global-workflows-authors -->

<!-- help:settings-global-workflows-blocked-sources#settings -->

Deny-list for **source** repos whose events must never trigger a global workflow (the "source axis").

Use this for untrusted territory — forks, public-contrib mirrors, sandboxes — where a single push shouldn't be able to fan out org-wide automation.

Evaluated at dispatch time against the repo that emitted the event, independently of the authoring allow-list: a global workflow whose author is allowed will still be skipped if the _source_ repo is denied. Both lists can be active simultaneously.

Each entry has two parts:

- **Source:** pick a configured source to deny only events delivered on that source, or leave it as **Any source** to deny across the org.
- **Pattern:** a glob matched against the source repo identifier (e.g. `myorg/fork-*`, `myorg/public-*`).

Pinning by source is the right move when the same `owner/repo` is reachable through more than one configured source (e.g. a public forge and a trusted mirror) and you want to drop deliveries from only one of them.

<!-- /help:settings-global-workflows-blocked-sources -->
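
How the master toggle, the authoring allow-list, and the source deny-list combine can be sketched as one decision function. This is an illustration only: it collapses registration-time and dispatch-time checks into a single call, omits per-source pinning, and assumes a simple glob dialect in which `*` matches any run of characters. `Policy`, `globToRegExp`, and `mayTrigger` are hypothetical names.

```typescript
// Assumed glob semantics: `*` matches any characters; all else is literal.
function globToRegExp(glob: string): RegExp {
  const escaped = glob.replace(/[.+?^${}()|[\]\\]/g, "\\$&").replace(/\*/g, ".*");
  return new RegExp(`^${escaped}$`);
}

interface Policy {
  enabled: boolean; // master toggle
  authorAllowList: string[] | null; // null means the restriction is OFF
  sourceDenyList: string[];
}

function matchesAny(patterns: string[], repo: string): boolean {
  return patterns.some((p) => globToRegExp(p).test(repo));
}

function mayTrigger(policy: Policy, authorRepo: string, sourceRepo: string): boolean {
  if (!policy.enabled) return false;
  // ON with an empty list allows no authoring repo at all.
  if (policy.authorAllowList !== null && !matchesAny(policy.authorAllowList, authorRepo)) {
    return false;
  }
  // The deny-list is checked independently of the allow-list.
  return !matchesAny(policy.sourceDenyList, sourceRepo);
}
```

With an allow-list of `myorg/ci-*` and a deny-list of `myorg/fork-*`, a workflow authored in `myorg/ci-deploy` triggers for events from `myorg/app` but is skipped for events from `myorg/fork-app`.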

<!-- help:settings-global-workflows-elevated#settings -->

Authoring repos listed here receive **elevated access to source-repo secrets** during global workflow execution.

- **Without elevation:** a global workflow job runs with only the workflow repo's own credentials — it can clone both repos but can't read the source repo's scoped secrets.
- **With elevation:** the job gets the source repo's secret context injected, so deploy / release / cross-repo automation flows work.

Treat elevated repos as effective owners of every source repo's CI secrets — only add repos you fully trust.

Each entry has two parts:

- **Source:** pick a configured source to elevate only when the authoring repo lives on that source, or leave it as **Any source** to elevate across the org.
- **Pattern:** a glob matched against the **workflow-authoring** repo, not the source repo (e.g. `myorg/ci-deploy`, `myorg/release-automation`).

Pinning by source narrows the trust window: if the same `owner/repo` is configured on more than one source, only the source you pick will grant elevation.

<!-- /help:settings-global-workflows-elevated -->

<!-- help:settings-webhooks#settings -->

Configure outbound webhook endpoints to receive notifications when runs and jobs change status. Each endpoint receives HMAC-SHA256 signed payloads with event details.

For each endpoint you can:

- **Subscribe to event types:** `run.started`, `run.completed`, `run.failed`, `job.started`, `job.completed`, `job.failed`.
- **View delivery logs:** HTTP response codes and retry counts.
- **Send a test ping:** verify connectivity before going live.

<!-- /help:settings-webhooks -->
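
On the receiving side, an HMAC-SHA256 signature is typically checked by recomputing it over the raw request body and comparing in constant time. The sketch below shows that general technique with Node's `node:crypto` module. How KiCI transports the signature (header name, hex versus base64 encoding, any prefix) is not stated here, so hex encoding over the raw body is an assumption, and `sign` and `verify` are hypothetical helpers.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Computes a hex-encoded HMAC-SHA256 of the body (encoding is an assumption).
function sign(secret: string, body: string): string {
  return createHmac("sha256", secret).update(body, "utf8").digest("hex");
}

// Compares the received signature against the expected one in constant time.
function verify(secret: string, body: string, signatureHex: string): boolean {
  const expected = Buffer.from(sign(secret, body), "hex");
  const received = Buffer.from(signatureHex, "hex");
  // timingSafeEqual throws on length mismatch, so check lengths first.
  return expected.length === received.length && timingSafeEqual(expected, received);
}
```

Verification should run against the exact bytes received, before any JSON parsing or re-serialization, since even whitespace changes alter the digest.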

<!-- help:settings-security-dashboard-policy#settings -->

Read-only view of the orchestrator's dashboard-write policy.

Each row toggles one mutating dashboard action — setting a secret, approving a held run, retrying a dead-lettered webhook, and so on. The orchestrator operator decides which actions stay on the dashboard and which become **CLI-only**. The dashboard cannot change the policy itself — that's the point: disabled actions stay out of the SaaS Platform's trust path.

Manage the policy with:

- **Show the full policy:** `kici-admin org-settings dashboard-writes show`
- **Disable an operation:** `kici-admin org-settings dashboard-writes set --op <name>=false`
- **Reset to permissive defaults:** `kici-admin org-settings dashboard-writes reset`

The summary strip at the top shows total / enabled / disabled counts plus whether your orchestrator is currently connected. A disconnected orchestrator means the page falls back to the cached policy from the most recent connection.

<!-- /help:settings-security-dashboard-policy -->

<!-- help:settings-support-access#settings -->

Controls whether KiCI support staff may open read-only support sessions against your organization. Sessions are **off by default** — nobody outside your org can read your data until you opt in here.

When enabled:

- KiCI staff can open time-boxed, read-only sessions to investigate an issue.
- Every read they perform is recorded in your audit trail with the support reason.
- No writes are ever possible during a session.

Disabling the toggle immediately ends any in-progress support session. Only users with the `support:admin` permission (owners by default) can change this setting.

<!-- /help:settings-support-access -->

<!-- help:settings-webhooks-delivery-log#settings -->

The delivery log shows recent webhook deliveries for an endpoint, including the HTTP status code, number of retry attempts, and the event payload.

Retry behavior:

- Deliveries are retried up to 3 times with exponential backoff.
- After 10 consecutive failures, the endpoint is automatically disabled — you can re-enable it from this view.

<!-- /help:settings-webhooks-delivery-log -->
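
The retry rules in the delivery-log help text can be modeled as follows. Only the retry count (3) and the disable threshold (10 consecutive failures) come from the text; the 1-second base delay and doubling factor are assumptions chosen to illustrate exponential backoff, and both function names are hypothetical.

```typescript
// Delay before each retry, in milliseconds. Base delay and factor are
// illustrative assumptions, not documented KiCI values.
function retryDelaysMs(maxRetries = 3, baseMs = 1000, factor = 2): number[] {
  return Array.from({ length: maxRetries }, (_, attempt) => baseMs * factor ** attempt);
}

// An endpoint is disabled once it reaches the consecutive-failure threshold.
function shouldDisableEndpoint(consecutiveFailures: number, threshold = 10): boolean {
  return consecutiveFailures >= threshold;
}
```

With these assumed defaults, `retryDelaysMs()` yields delays of 1000, 2000, and 4000 ms.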

<!-- help:settings-event-log#event-log -->

The event log shows every inbound webhook this organization has received, regardless of whether it came in via the Platform relay or directly to an orchestrator.

Each row joins two records:

- **Platform side:** event metadata and a SHA-256 hash of the body (no payload stored).
- **Orchestrator side:** full payload and processing outcome.

Filter by routing key, event type, status, or delivery ID. Click a row for the full per-tier breakdown.

<!-- /help:settings-event-log -->

<!-- help:settings-event-log-detail#event-log -->

The detail panel shows both tiers' projections side-by-side.

- **Platform record:** answers "did the delivery arrive at the relay and where was it routed".
- **Orchestrator record:** answers "what was the body and what happened next" — including the matched workflow count and spawned run links.
- **Payload:** the raw webhook body. Streams over the dashboard's existing WebSocket connection in 64 KiB chunks so Platform never buffers the full body and you see progress for large deliveries. Requires `event_log:read_payload`.

Oversized or storage-failed payloads show an "omitted" badge, with the hash preserved for correlation against raw logs.

<!-- /help:settings-event-log-detail -->
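
Splitting a payload into 64 KiB chunks, as described for the payload view, can be sketched with a generator so that no chunk is copied and no full list is built up front. The `chunkPayload` function is illustrative; the actual wire framing of each chunk is not covered here.

```typescript
const CHUNK_SIZE = 64 * 1024; // 64 KiB, as stated above

// Yields views over consecutive slices of the body without copying.
function* chunkPayload(body: Uint8Array, chunkSize = CHUNK_SIZE): Generator<Uint8Array> {
  for (let offset = 0; offset < body.length; offset += chunkSize) {
    // subarray clamps the end index, so the last chunk may be shorter.
    yield body.subarray(offset, offset + chunkSize);
  }
}

// A 150 KiB body produces two full chunks and one 22 KiB remainder.
const chunkSizes = Array.from(chunkPayload(new Uint8Array(150 * 1024)), (c) => c.length);
```

A receiver can report progress as the running sum of chunk lengths divided by the total size, if the total is known in advance.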

## Activity

The activity page (`/orgs/:customerId/activity`) is the org-level forensic log. It federates two streams into one chronological view: the upstream tenant-plane audit log (every tenant-plane mutation -- invites, role changes, source registrations, plan changes) and orchestrator `access_log` rows (every read and admin action -- run cancels, secret reveals, environment edits, dashboard data fetches via the Platform proxy). Filters live in the URL via search params so a filtered view is bookmarkable and shareable. The page uses cursor-based pagination and supports filtering by source (audit / access_log / all), free-text search, run ID, and other dimensions. Requires `audit:read` permission. The legacy `/orgs/:customerId/audit-log` URL redirects here to preserve bookmarks.
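
Because filters live in the URL, a filtered view can be constructed programmatically. The sketch below uses the standard `URLSearchParams` API; the query parameter names (`source`, `search`, `runId`) are assumptions for illustration, since the exact names are not listed here.

```typescript
// Parameter names are hypothetical; the set of filter dimensions
// (source, free-text search, run ID) comes from the paragraph above.
interface ActivityFilters {
  source?: "audit" | "access_log" | "all";
  search?: string;
  runId?: string;
}

function activityUrl(customerId: string, filters: ActivityFilters): string {
  const params = new URLSearchParams();
  if (filters.source) params.set("source", filters.source);
  if (filters.search) params.set("search", filters.search);
  if (filters.runId) params.set("runId", filters.runId);
  const query = params.toString();
  const base = `/orgs/${encodeURIComponent(customerId)}/activity`;
  return query ? `${base}?${query}` : base;
}
```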

## Settings

The settings page (`/orgs/:customerId/settings`) uses a tabbed layout:

1. **General** -- displays the organization name (editable by owners via inline click-to-edit) and the organization ID
2. **Members** -- team management with invite, role assignment, and member removal
3. **Roles** -- custom role management with granular permission matrix
4. **API keys** -- API key creation and revocation for dashboard/programmatic access
5. **Orchestrator keys** -- orchestrator API key management for Platform WebSocket connections
6. **Sources** -- read-only list of registered webhook sources (see below)
7. **Billing** -- plan and payment management (hidden in the `kici-admin` org)
8. **CI trust** -- trust policy configuration for CI runs (visible with `ci_trust:read` permission)
9. **Global workflows** -- org-level security knobs for cross-repo workflows (visible with `org_settings:read` permission)
10. **Webhooks** -- outbound webhook endpoint management with delivery logs and test ping
11. **Event log** -- inbound webhook delivery log (visible with `event_log:read` permission)
12. **Security** -- read-only view of the orchestrator's dashboard-write policy matrix (visible with `org_settings:read` permission)
13. **Support access** -- opt-in switch that controls whether KiCI support staff may open read-only support sessions against your org (visible with `support:read`; toggled with `support:admin`)

Audit-log-style entries are no longer a settings tab; they live on the dedicated **Activity** page accessible from the sidebar.

Tab selection syncs with the URL path (`/settings/members`, `/settings/api-keys`, etc.), making tabs bookmarkable.

### Support access

The Support access tab controls whether KiCI support staff may open a read-only **support session** against your organization to help diagnose an issue. The setting is **off by default** -- until you opt in here, no one outside your org can read your data.

When support access is enabled:

- A KiCI operator can open a time-boxed (30-minute, renewable), read-only support session scoped to a stated reason.
- A support session is **runs-only**: the operator can browse your run list and, by confirming each run individually, view that run's detail and step logs. Nothing else is visible, and no write is ever possible.
- Every run an operator opens is recorded in your [Activity](#activity) audit trail, attributed to the operator with the support reason -- so you can see exactly what was looked at and why.

**Disabling immediately ends any active session.** Toggling the switch off closes every in-progress support session for your org at once. Enabling and disabling the setting is itself audited, attributed to the user who changed it.

Viewing the setting requires the `support:read` permission; changing it requires `support:admin` (granted to owners by default).

### Orchestrator keys

The orchestrator keys tab manages API keys used to authenticate orchestrator-to-Platform WebSocket connections. These are separate from user API keys (which grant dashboard/API access).

**List view** -- shows all active orchestrator keys with name, description, key prefix, creation date, and last used date.

**Create** -- opens a modal to enter a name and optional description. After creation, the raw key is shown once in a copyable box. Set this key as the `KICI_PLATFORM_TOKEN` environment variable in your orchestrator configuration.

**Revoke** -- opens a confirmation modal before soft-deleting the key. Any orchestrators using the revoked key will be disconnected.

### Sources

The sources tab shows webhook sources registered by connected orchestrators. Sources appear here **automatically** when an orchestrator connects to the Platform via WebSocket and sends a `source.register` message -- there is no manual "add source" action in the UI.

**What causes a source to appear:**

1. An orchestrator is configured with one or more providers (e.g., a GitHub App with `appId: 12345`)
2. The orchestrator connects to the Platform using an orchestrator API key for your organization
3. On connection, the orchestrator sends `source.register` with its provider sources (e.g., `github:12345`)
4. The Platform records the source against your organization
5. The source immediately appears in the dashboard

**Each source displays:**

- **Routing key** -- the source identifier (e.g., `github:12345` for a GitHub App, `generic:my-source` for a generic webhook)
- **Webhook URL** -- the URL to configure in your provider's webhook settings (constructed by the Platform based on the provider type and org ID)
- **Registered at** -- when the orchestrator first registered this source
- **Copy button** -- copies the webhook URL to the clipboard
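To illustrate the routing key format, the sketch below splits a key into its provider and identifier. `parseRoutingKey` is a hypothetical helper, not part of any KiCI package, and it assumes every key has the `<provider>:<id>` shape shown in the examples above.

```typescript
// Hypothetical helper: splits a routing key such as "github:12345".
// Assumes the "<provider>:<id>" shape shown in the examples above.
interface RoutingKey {
  provider: string;
  id: string;
}

function parseRoutingKey(key: string): RoutingKey | undefined {
  const separator = key.indexOf(':');
  // Reject keys with no separator, an empty provider, or an empty id.
  if (separator <= 0 || separator === key.length - 1) return undefined;
  return { provider: key.slice(0, separator), id: key.slice(separator + 1) };
}
```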

**Read-only** -- sources cannot be created, edited, or deleted from the dashboard. They are managed entirely by orchestrator connections. When an orchestrator disconnects, its sources remain visible (they are not automatically removed).

**Empty state** -- if no orchestrator has connected yet, the tab shows "No webhook sources registered" with a link to the operator setup guide.

**Webhook secrets** -- webhook HMAC secrets are not visible in the dashboard and cannot be configured through the UI. They are stored in the orchestrator's database (`webhook_secrets` table) and pushed to the Platform via the `source.secrets` WebSocket message after registration. The Platform uses these secrets to verify incoming webhook signatures.

**Adding a new source** requires:

1. Configure a new provider in the orchestrator (e.g., add a GitHub App to the orchestrator's provider config)
2. Seed the webhook secret in the orchestrator's `webhook_secrets` database table
3. Restart the orchestrator -- it will register the new source with the Platform on connection
4. Configure the webhook URL (shown in the sources tab) in the provider's settings (e.g., GitHub App webhook URL)

### Event log

The event log tab (`/orgs/:customerId/settings/event-log`) shows every inbound webhook this organization has received. Each row joins two tiers of records:

1. **Platform record** -- written by the Platform relay on every delivery: routing key, event, action, repo, routing target, status, SHA-256 payload hash. The Platform never persists the payload (trust boundary).
2. **Orchestrator record** -- written by the destination orchestrator when it processes the delivery: full payload (in object storage), processing outcome (`processed` / `duplicate` / `lockfile_missing` / `failed`), matched workflow count, first run spawned (if any), and a payload hash that matches the Platform record for cross-tier correlation.

The list view supports filters for routing key, event type, status, and free-text delivery ID search. Click a row to open a detail panel with both tiers' projections side-by-side, plus the payload viewer.
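The cross-tier join described above can be pictured with a small sketch. The record shapes and field names here are assumptions made for illustration, not the real schema, and `joinTiers` is a hypothetical function. It keys on the payload hash alone, whereas a real join would presumably also consider the delivery ID so that two deliveries with identical payloads stay distinct.

```typescript
// Illustrative shapes only: field names are assumptions, not the real schema.
interface PlatformRecord {
  deliveryId: string;
  routingKey: string;
  payloadHash: string; // SHA-256 hex digest of the payload
}

interface OrchestratorRecord {
  payloadHash: string;
  outcome: 'processed' | 'duplicate' | 'lockfile_missing' | 'failed';
  matchedWorkflows: number;
}

interface EventLogRow {
  payloadHash: string;
  platform?: PlatformRecord;
  orchestrator?: OrchestratorRecord;
}

// Join the two tiers on the payload hash; either side may be missing.
function joinTiers(
  platform: PlatformRecord[],
  orchestrator: OrchestratorRecord[],
): EventLogRow[] {
  const rows = new Map<string, EventLogRow>();
  for (const record of platform) {
    rows.set(record.payloadHash, { payloadHash: record.payloadHash, platform: record });
  }
  for (const record of orchestrator) {
    const existing = rows.get(record.payloadHash);
    if (existing) {
      existing.orchestrator = record;
    } else {
      // No Platform record: the delivery never crossed the Platform.
      rows.set(record.payloadHash, { payloadHash: record.payloadHash, orchestrator: record });
    }
  }
  return Array.from(rows.values());
}
```

A row with only a `platform` side corresponds to the "orchestrator unavailable" case, and a row with only an `orchestrator` side corresponds to an orchestrator-only delivery.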

**Permissions:**

- `event_log:read` -- list rows and view metadata in the detail panel.
- `event_log:read_payload` -- additionally view the raw webhook payload body. (Owners and admins inherit this. Lower-tier roles see "Payload not available" with a hint to ask for an elevated role.)

**Edge cases the UI surfaces:**

- **Payload omitted** -- when the inbound payload exceeded the orchestrator's `eventLog.maxPayloadBytes` soft cap (default 5 MB) or the object-storage write failed, the row is still recorded with `payload_omitted=true`. The hash is preserved so operators can correlate against `KICI_WEBHOOK_PAYLOAD_DIR` or raw logs.
- **Orchestrator unavailable** -- when the orchestrator does not respond within 2 seconds of the merge fan-out, the list still loads with Platform-side metadata only, marked with an `orchestrator_unavailable` banner.
- **Orchestrator-only deliveries** -- direct-ingress deliveries (independent / hybrid mode) that never crossed the Platform appear with `platform.status = orchestrator_only`.

Retention is 30 days on both tiers, matching the Platform `event_log` audit window.

<!-- help:personal-profile#account -->

Account settings let you view your profile information (name, email) and manage your KiCI account. Changes here apply across all organizations you belong to.

<!-- /help:personal-profile -->

<!-- help:personal-pats#account -->

Personal access tokens (PATs) are long-lived credentials for programmatic API access.

Create a PAT to authenticate CLI tools or scripts without going through the OIDC login flow. Tokens can be revoked at any time.

Use a token's clone button to open the creation modal prefilled with that token's name, expiry, and permissions.

<!-- /help:personal-pats -->

<!-- help:personal-linked-accounts#account -->

Linked accounts connect your external provider identities (like GitHub) to your KiCI account. Linking enables features like showing your provider username in run metadata and associating your commits with your KiCI identity.

<!-- /help:personal-linked-accounts -->

<!-- help:orgs-list#organizations -->

Organizations are the top-level container for your CI/CD resources. Each org has its own runs, settings, environments, secrets, and team members. Select an organization to manage its workflows and configuration.

<!-- /help:orgs-list -->

<!-- help:orchestrators-list#orchestrators -->

The Orchestrators page lists every orchestrator currently connected to this org, keyed by **cluster name**. Each row shows:

- **Cluster** -- the human-friendly cluster name set on the orchestrator via `kici-admin cluster-name set <name>`, or an auto-generated `cluster-<6hex>` if no operator has renamed it.
- **Role** -- `coordinator` (talks to the Platform directly) or `worker` (relays through a coordinator).
- **Version**, **mode**, **routing keys**, and **last heartbeat**.

Click a cluster to drill into its per-orchestrator surfaces (security policy, environments, secrets, DLQ, registrations, global workflows). Different clusters in the same org can have different settings, so this page is the entry point for picking which cluster you are configuring.

<!-- /help:orchestrators-list -->

<!-- help:orchestrators-scope#orchestrators -->

Every panel inside this view scopes to the named cluster. Settings shown here come from that orchestrator's own database — a sibling orchestrator in the same org may have a different security policy, different environments, and different secrets.

When the cluster shows **disconnected**, the orchestrator is offline and its current state can't be queried. Most child pages return 404 in that state; return to the orchestrator list to find a connected cluster.

To rename a cluster, run `kici-admin cluster-name set <new>` on the orchestrator host and restart the orchestrator service so the new name reaches the Platform on the next `source.register`.

<!-- /help:orchestrators-scope -->

## Workflows

The workflows page (`/orgs/:customerId/workflows`) shows permanently registered workflows listening for events. It displays a filterable table with columns for workflow name, repository, trigger types, last triggered time, next fire time (for scheduled workflows), source repos, and actions.

Each row is expandable to show trigger configuration details. Rows include action controls: a "Run now" button for manual triggering, a toggle switch to enable/disable the workflow, and a delete button with a confirmation modal (optionally cancelling active runs). Stale workflows (no triggers in the last 30 days) show a yellow "Stale" badge. Registry health indicators (version, sync status, last updated) appear above the table.
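The "Stale" badge rule can be expressed as a simple time comparison. The sketch below is hypothetical: `isStale` is not a KiCI API, and treating a workflow that has never been triggered as stale is an assumption of this example rather than documented behavior.

```typescript
const STALE_AFTER_DAYS = 30;
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Hypothetical helper: true when the workflow has not fired within the window.
// Treating a never-triggered workflow as stale is an assumption of this sketch.
function isStale(lastTriggeredAt: Date | undefined, now: Date): boolean {
  if (lastTriggeredAt === undefined) return true;
  return now.getTime() - lastTriggeredAt.getTime() > STALE_AFTER_DAYS * MS_PER_DAY;
}
```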

Filters include trigger type, repository, and workflow name.

## Diagnostics

The diagnostics page (`/orgs/:customerId/diagnostics`) provides infrastructure health monitoring. It has four sections:

1. **Execution metrics** -- cards showing total runs (24h), success rate, average duration, and active jobs (queued + running). Refreshes every 30 seconds.
2. **Infrastructure alerts** -- banner summarizing any critical or warning alerts from connected orchestrators
3. **Infrastructure tree** -- hierarchical view of orchestrators, their scalers, and agents. Refreshes every 10 seconds. Each orchestrator row shows:
   - **`orchestrator:`** (bold monospace, left group) -- the orchestrator's cluster instance ID, set via `KICI_CLUSTER_INSTANCE_ID` env var or auto-generated as a UUID. If no instance ID is set, the first 8 characters of the connection ID are shown here instead.
   - **`conn:`** (dimmed monospace, left group) -- first 8 characters of the WebSocket connection ID assigned by the Platform relay. Only shown when an explicit instance ID is present.
   - Connection status badge, role badge (coordinator or worker), version badge (left group, after the ID labels)
   - **`host:`** badge (right side) -- the system hostname of the machine running the orchestrator process
   - Additional badges on the right side: running-as user, CPU count, memory usage, uptime

   Each orchestrator lists its **scalers** (indented at level 1) showing scaler name, type badge (container/firecracker/bare-metal), active/max agent count, and a config info popover. Below each scaler, its **agents** (indented at level 2) display agent ID, platform/arch, heartbeat age, hostname, running-as user, CPU count, memory, uptime, and version. Labels (both user-defined and auto-generated `kici:` prefixed) are shown on a separate row beneath scalers and stateful agents, with a tooltip distinguishing user labels from auto labels.

4. **Secret backends** -- health cards for each configured secret backend (e.g. OpenBao), showing connection status with sync and test actions. Allows triggering a manual sync or connectivity test per backend.
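The ID-label rules for orchestrator rows in the infrastructure tree can be sketched as follows. `idLabels` and the `OrchestratorLabels` shape are hypothetical names used only for illustration; the logic simply restates the `orchestrator:` and `conn:` rules described above.

```typescript
interface OrchestratorLabels {
  orchestrator: string; // text shown after "orchestrator:"
  conn?: string; // text shown after "conn:", when present
}

// Hypothetical helper mirroring the label rules described for the tree.
function idLabels(connectionId: string, instanceId?: string): OrchestratorLabels {
  const shortConn = connectionId.slice(0, 8);
  if (instanceId) {
    // Explicit instance ID: show it, plus the shortened connection ID.
    return { orchestrator: instanceId, conn: shortConn };
  }
  // No instance ID: the shortened connection ID takes its place.
  return { orchestrator: shortConn };
}
```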

## Environments

The environments page (`/orgs/:customerId/environments`) lists all deployment environments for the organization. Each environment shows its name, type (fixed or glob pattern), protection status (branch restrictions, concurrency limits, required reviewers, wait timers), and enabled/disabled state.

Users with `environments:admin` permission can create new environments via a modal dialog, choosing between fixed and glob (pattern-matching) types. Clicking an environment row navigates to the environment detail page.

### Environment detail

The environment detail page (`/orgs/:customerId/environments/:environmentId`) shows a header with the environment name, type badge, enabled/disabled toggle, and a delete button. Below the header, a tabbed layout provides four sections:

1. **Variables** (default) -- environment-scoped variables
2. **Secrets** -- secrets bound to this environment
3. **Protection** -- protection rules (branch restrictions, concurrency limits, required reviewers, wait timers)
4. **History** -- audit history of changes to this environment

Tab selection syncs with the URL path (`/orgs/:customerId/environments/:environmentId/variables`, `/orgs/:customerId/environments/:environmentId/protection`, etc.).

## Secrets

The secrets page (`/orgs/:customerId/secrets`) provides a scope-centric view of all secrets in the organization. Secrets are organized into a scope tree with environment binding checkboxes, allowing you to control which secret scopes are available in which environments.

The page is permission-gated: `secrets:read` is required to view scopes, `secrets:write` to add or delete secrets, and `environments:write` to modify environment bindings.

### Where secrets live

Secret values are stored in the orchestrator's secret store and authorized through the orchestrator's RBAC. The dashboard surfaces secret **names** and scope membership for every secret regardless of where the value was entered.

Whether secret **values** can be set from the dashboard depends on the orchestrator's [dashboard-write policy](/operator/security/dashboard-write-policy):

- **Permissive (default):** the "Add secret" and "Edit value" controls accept plaintext directly in the dashboard. This is how a typical SaaS CI tool works and is the right default for small teams.
- **`secrets.set` disabled by policy:** the controls render with a lock icon, grayed out. Hovering shows the exact `kici-admin secret set` invocation needed; a copy button puts it on the clipboard. The control is inert — the dashboard issues no mutating request. Use the CLI to enter values; the dashboard refreshes within ~30 seconds and shows the new secret name.

The policy state is visible at three layers in the UI:

- A **lock-icon prefix** on every disabled control, with a per-control CLI hint.
- A **per-page banner** on any page containing at least one disabled operation, listing every disabled op on that page and its CLI equivalent.
- The **Security policy page** (Settings → Security → Dashboard policy), which renders the full 24-row read-only matrix with the current state and the `kici-admin` command for each row.

The policy itself cannot be changed from the dashboard; the orchestrator operator manages it via `kici-admin org-settings dashboard-writes`. See [Dashboard-write policy](/operator/security/dashboard-write-policy) for the operator-side details.

## Approval queue

The approval queue page (`/orgs/:customerId/approval-queue`) shows held runs that are pending approval. Runs can be held due to environment protection rules (required reviewers, wait timers). The page supports filtering by status (pending, approved, rejected, expired) and provides approve/reject actions for users with `environments:write` permission. Users with `environments:admin` permission can skip wait timers.

## Account

The standalone account page (`/account`) provides access to personal settings outside of any organization context. It has three tabs:

- **Profile** -- view your name and email
- **Personal access tokens** -- create and revoke PATs for programmatic API access
- **Linked accounts** -- connect external provider identities (e.g. GitHub) to your KiCI account

This page is also accessible within an org context via the user menu in the sidebar (`/orgs/:customerId/account`).

## Admin section

When viewing the `kici-admin` organization, the dashboard switches to an admin-mode interface for platform-wide management. The admin pages are:

- **Overview** (`/orgs/kici-admin/admin`) -- embedded Grafana dashboards with three tabs: System, Orgs, and Execution
- **Organizations** (`/orgs/kici-admin/admin/orgs`) -- table of all organizations with plan type, member count, Stripe status, and creation date; rows link to org detail pages
  - **Org detail** (`/orgs/kici-admin/admin/orgs/:orgId`) -- org info summary, plan limit controls, current usage stats with over-limit warnings, quick actions, and a tabbed section with audit log
- **Connections** (`/orgs/kici-admin/admin/connections`) -- table of connected orchestrators showing org, routing keys, heartbeat age, running jobs, and force-disconnect action
- **Scheduled jobs** (`/orgs/kici-admin/admin/jobs`) -- table of Platform scheduled background jobs with cron schedule, last run status, consecutive failure count, estimated next run time, and a "Run now" action to trigger immediate execution
- **Audit log** (`/orgs/kici-admin/admin/audit-log`) -- paginated table of platform-level admin actions with expandable JSON details
- **Metrics** -- external link to the Grafana instance

## Organizations

The organizations page (`/orgs`) lists all organizations your account has access to.

Organizations are sorted alphabetically by display name. Each entry shows your role (owner or member). A "Create organization" button opens an inline form to create a new org by name.

## Theme

The dashboard supports three theme modes:

- **System** (default) -- follows your operating system's dark/light preference
- **Dark** -- forced dark mode
- **Light** -- forced light mode

Toggle between modes using the sun/moon icon in the sidebar footer. The selection persists to `localStorage`.

## Date and time preferences

A toggle button in the sidebar lets you switch between **local time** and **UTC time** display. When UTC mode is enabled:

- All timestamps in the run list, run detail header, metadata panel, and log viewer show UTC times
- Tooltips on relative timestamps (e.g. "5 minutes ago") show the absolute time in UTC
- The timeline Gantt chart uses UTC for time labels

The preference persists to `localStorage`.
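As a rough illustration of the local/UTC toggle, the sketch below formats the same instant either way. `formatTimestamp` and its output format are assumptions for this example; the dashboard's actual rendering is not specified here.

```typescript
// Hypothetical formatter: the dashboard's real output format is not documented here.
function formatTimestamp(date: Date, useUtc: boolean): string {
  const pad = (n: number): string => String(n).padStart(2, '0');
  if (useUtc) {
    // UTC getters ignore the viewer's time zone.
    return (
      `${date.getUTCFullYear()}-${pad(date.getUTCMonth() + 1)}-${pad(date.getUTCDate())} ` +
      `${pad(date.getUTCHours())}:${pad(date.getUTCMinutes())}:${pad(date.getUTCSeconds())} UTC`
    );
  }
  // Local getters apply the viewer's time zone offset.
  return (
    `${date.getFullYear()}-${pad(date.getMonth() + 1)}-${pad(date.getDate())} ` +
    `${pad(date.getHours())}:${pad(date.getMinutes())}:${pad(date.getSeconds())}`
  );
}
```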

## Keyboard shortcuts

| Key           | Context    | Action                 |
| ------------- | ---------- | ---------------------- |
| Arrow Up/Down | Job tree   | Move focus             |
| Enter         | Job tree   | Select job or step     |
| Escape        | Job tree   | Navigate to first job  |
| Enter         | Log search | Jump to next match     |
| Shift+Enter   | Log search | Jump to previous match |
| Escape        | Log search | Clear search           |

## Error pages

The dashboard shows informative error pages instead of blank screens:

- **404** -- "Page not found" with a "Go home" button linking to the organizations page
- **500** -- "Failed to load" with an error message, a trace ID for support, and a "Go home" button (shown when API requests fail)
- **Client-side rendering errors** -- caught by the error boundary, showing "Something went wrong" with a trace ID and a "Reload page" button
- **Auth errors** -- authentication failures on the OIDC callback page show the error message with a retry mechanism and a "Back to login" link

---

## Dynamic values

Source: https://docs.kici.dev/user/dynamic-values/

Dynamic values let you compute `environment`, `env`, and `concurrencyGroup` at runtime based on the incoming event. Instead of hardcoding static strings, you pass a function that receives the normalized event envelope and returns the resolved value.

```typescript
job('deploy', {
  runsOn: ['default'],
  environment: (event) => event.targetBranch,
  env: (event) => ({ BRANCH: event.targetBranch }),
  concurrencyGroup: (event) => `deploy-${event.targetBranch}`,
  steps: [
    /* ... */
  ],
});
```

```typescript
job('deploy', {
  runsOn: 'default',
  // One shape everywhere: branch on the normalized event type.
  environment: (event) => (event.type === 'pull_request' ? 'preview' : 'production'),
  steps: [
    /* ... */
  ],
});
```

## How it works

When you define a dynamic value as a function, the compiler analyzes it at compile time to determine whether it is **pure** (can be evaluated without cloning the repo or running an init job).

### Pure functions (inline evaluation)

A pure function is one that:

- Is synchronous (no `async`/`await`)
- Only references its parameters and local variables
- Does not import or require external modules
- Does not access globals like `process`, `fetch`, `console`, `setTimeout`, etc.
- Uses only safe built-in globals: `String`, `Number`, `Boolean`, `Array`, `Object`, `JSON`, `Math`, `parseInt`, `parseFloat`, `isNaN`, `isFinite`, `encodeURIComponent`, `decodeURIComponent`, `encodeURI`, `decodeURI`
- Does not use `this`, `new`, `class`, `throw`, `try`/`catch`, `delete`, `var`, `yield`, or mutation operators (`++`, `--`, `+=`, etc.)

When the compiler detects a pure function, it serializes the function source directly into the lock file as an inline expression. At dispatch time, the orchestrator evaluates the expression in a sandboxed VM context (~0ms overhead) instead of dispatching an init job.
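To make inline evaluation concrete, here is a minimal sketch using Node's built-in `node:vm` module. It is an assumption about how such an evaluator could work, not the orchestrator's actual implementation: the stored source format, the `evaluateInline` helper, and the 50 ms timeout are all hypothetical. Note that Node documents `node:vm` as an isolation aid rather than a security boundary.

```typescript
import { runInNewContext } from 'node:vm';

// Source text as it might be stored in the lock file (assumed format).
const inlineSource = '(event) => `deploy-${event.targetBranch}`';

// Evaluate the stored expression against an event in a fresh context.
// The context exposes only `event`, so globals such as `process` are absent.
function evaluateInline(source: string, event: Record<string, unknown>): unknown {
  return runInNewContext(`(${source})(event)`, { event }, { timeout: 50 });
}

// Resolves the concurrency group for a push to `main`.
const group = evaluateInline(inlineSource, { type: 'push', targetBranch: 'main' });

// Shows that `process` is not defined inside the fresh context.
const processType = evaluateInline('() => typeof process', {});
```

Because the expression needs nothing beyond the event, it can be resolved this way without a repository checkout.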

**Examples of pure functions:**

```typescript
// Simple branch extraction
environment: (event) => event.targetBranch;

// Object literal with string operations
env: (event) => ({ BRANCH: event.targetBranch });

// Concatenation with event data
concurrencyGroup: (event) => `deploy-${event.targetBranch}`;

// Using safe globals
env: (event) => ({ UPPER: String(event.targetBranch).toUpperCase() });

// Local variables are fine
environment: (event) => {
  const parts = event.targetBranch.split('/');
  return parts[parts.length - 1];
};
```

### Impure functions (init-job evaluation)

If the compiler determines a function is impure, it emits a warning during compilation and falls back to the two-phase init model. This means:

1. The orchestrator dispatches a special `__init__` job to a builder agent
2. The builder agent clones the repository and evaluates the function
3. The resolved values are sent back to the orchestrator
4. The orchestrator dispatches the real execution job with the resolved values

This adds approximately 5-10 seconds of overhead for cloning and evaluation.

**Examples of impure functions (will use init job):**

```typescript
// Async functions cannot be inlined
environment: async (event) => await lookupEnv(event.targetBranch);

// External module references
env: (event) => {
  const config = require('./config');
  return config.env;
};

// Process/global access
environment: (event) => process.env.DEFAULT_ENV || 'staging';

// Dynamic imports
env: async (event) => {
  const m = await import('./config.js');
  return m.default;
};
```

## Performance comparison

| Evaluation path                      | Overhead | When used                                                           |
| ------------------------------------ | -------- | ------------------------------------------------------------------- |
| Static value (string/object literal) | ~0ms     | `environment: 'staging'`                                            |
| Inline expression (pure function)    | ~0ms     | `environment: (event) => event.targetBranch`                        |
| Init job (impure function)           | ~5-10s   | `environment: async (event) => await lookupEnv(event.targetBranch)` |

## Tips

- **Write pure functions whenever possible** to avoid the init-job delay. Most environment and env computations only need the event payload data.
- **Check compiler warnings** -- the compiler tells you when a function is classified as impure and explains why.
- **Runtime errors in inline expressions cause immediate job failure.** There is no fallback to the init-job path. If your pure function throws at runtime (e.g., accessing a property on `undefined`), the job fails immediately.
- **The event parameter is the normalized event envelope** — the same shape rules receive as `ctx.event`: `{ type, action, targetBranch, sourceBranch, changedFiles, payload, … }` (see the [event payload reference](./sdk/event-payloads.md) for the complete schema). Narrow on `event.type` (`'push'`, `'pull_request'`, `'tag'`, …) to branch per trigger kind. The raw provider webhook body is nested at `event.payload` (for GitHub pushes: `payload.ref`, `payload.after`, `payload.repository`, …).
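The tips above can be combined into one defensive pure function. This is a sketch under stated assumptions: the `EventEnvelope` interface is a minimal local stand-in rather than the SDK's type, `environmentFor` is a hypothetical name, and it assumes the compiler's purity analysis accepts optional chaining.

```typescript
// Minimal stand-in for the normalized event envelope; only the fields used here.
interface EventEnvelope {
  type: string;
  targetBranch?: string;
  payload?: { pull_request?: { number?: number } };
}

// Pure: synchronous, no globals, no mutation, only parameters and locals.
const environmentFor = (event: EventEnvelope): string => {
  // Optional chaining avoids throwing when the payload lacks a pull request.
  const prNumber = event.payload?.pull_request?.number;
  if (event.type === 'pull_request' && prNumber !== undefined) {
    return `review/PR-${prNumber}`;
  }
  return event.targetBranch === 'main' ? 'production' : 'staging';
};
```

Guarding each property access matters because a throw in an inline expression fails the job with no fallback.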

---

## Environment variables

Source: https://docs.kici.dev/user/env-vars/

The KiCI CLI reads the following environment variables to customize its behavior. OAuth login (`kici login` without `--token`) defaults `KICI_PLATFORM_URL`, `KICI_OIDC_ISSUER`, and `KICI_OIDC_CLIENT_ID` to the hosted KiCI Platform, so `kici login` works with no configuration. Set them only to target a self-hosted Platform or a testing environment.

## Authentication

| Variable              | Description                            | Default                                      |
| --------------------- | -------------------------------------- | -------------------------------------------- |
| `KICI_OIDC_ISSUER`    | OIDC issuer URL for authentication     | `https://auth.kici.dev/realms/kici-internal` |
| `KICI_OIDC_CLIENT_ID` | OIDC client ID for the CLI application | `kici-cli`                                   |
| `KICI_PLATFORM_URL`   | Platform API base URL                  | `https://api.kici.dev`                       |
| `KICI_CONFIG_DIR`     | Override the KiCI config directory     | `~/.kici`                                    |

## Browser behavior

| Variable             | Description                                                                                                                                          | Default                                                |
| -------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------ |
| `KICI_BROWSER_CMD`   | Custom browser command for OAuth login. Supports `{url}` placeholder. Set to `none` to suppress browser opening and print the URL to stdout instead. | Uses the system default browser via the `open` package |
| `KICI_CALLBACK_PORT` | Fixed port for the OAuth PKCE callback server. Useful when firewall rules require a known port.                                                      | Random available port                                  |

## Development

| Variable     | Description                                                                                                                                                                | Default |
| ------------ | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------- |
| `KICI_DEV`   | Enable development mode. When `true`, uses prerelease-compatible version ranges (`>=0.0.1-0`) for dev dependencies and skips npm version resolution.                       | unset   |
| `KICI_DEBUG` | Enable debug logging. When `true`, prints verbose diagnostics (SDK alias resolution, step-level debug logs, stack traces on errors). Equivalent to the `--debug` CLI flag. | unset   |

## Usage examples

### CI/CD environment

Authenticate with a pre-existing API key (no browser needed):

```bash
kici login --token <<< "$KICI_API_KEY"
```

### Self-hosted Platform or custom OIDC provider

`kici login` targets the hosted KiCI Platform by default. To point the CLI at a self-hosted Platform or a testing OIDC provider, override the defaults:

```bash
export KICI_OIDC_ISSUER=https://your-idp.example.com
export KICI_OIDC_CLIENT_ID=your-client-id
export KICI_PLATFORM_URL=https://your-platform.example.com
kici login
```

### Headless SSH session

The CLI auto-detects headless environments and uses the device flow. To force PKCE with URL output instead:

```bash
export KICI_BROWSER_CMD=none
kici login
```

This prints the authorization URL to stdout as `KICI_AUTH_URL=<url>`. Open the URL in any browser to complete authentication.

### Fixed callback port

When behind a firewall or using port forwarding:

```bash
export KICI_CALLBACK_PORT=19876
kici login
```

### Custom config location

Store the KiCI config in a non-default location:

```bash
export KICI_CONFIG_DIR=/tmp/kici-test
kici login
```

---

## Environments

Source: https://docs.kici.dev/user/environments/

<!-- help:environments-list#overview -->

Environments are named deployment targets (like staging or production) that control where your workflow jobs run. Each environment can have its own variables, secrets, and protection rules to gate deployments.

<!-- /help:environments-list -->

<!-- help:environments-protection#protection-rules -->

Protection rules control when jobs targeting an environment can execute.

Available rules:

- **Branch restrictions** — only allow specific branches to deploy.
- **Required reviewer approvals** — gate the run on human sign-off.
- **Wait timers** — delay execution for a fixed period.
- **Concurrency limits** — prevent collisions between parallel deployments.

<!-- /help:environments-protection -->

Environments represent deployment targets like `staging`, `production`, or `review/PR-*`. Each environment can have its own variables, bound secrets, and protection rules that control when and how jobs targeting that environment can execute.

## Overview

An environment in KiCI provides:

- **Variables** -- non-secret key-value configuration (e.g., `API_URL`, `CLUSTER_NAME`)
- **Scoped secrets** -- encrypted values bound to the environment via scope bindings
- **Protection rules** -- branch restrictions, required reviewers, wait timers, and concurrency limits
- **Per-source overrides** -- repositories can override unlocked variables for their own deployments

## SDK API

### Job-level environment property

The `environment` property is set on a job, not a workflow or step:

```typescript
import { workflow, job, step, push } from '@kici-dev/sdk';

export default workflow('deploy', {
  on: [push({ branches: ['main'] })],
  jobs: [
    job('deploy-staging', {
      runsOn: 'default',
      environment: 'staging',
      steps: [
        step('deploy', async (ctx) => {
          // ctx.environment is the resolved environment name
          console.log(`Deploying to ${ctx.environment}`);
          // ctx.secrets provides async get/expose and a synchronous has for environment-bound secrets
          const dbPassword = await ctx.secrets.get('DB_PASSWORD');
          // Environment variables are in ctx.env
          const apiUrl = ctx.env.API_URL;
          await ctx.$`deploy --target ${ctx.environment}`;
        }),
      ],
    }),
  ],
});
```

### Dynamic environments

The environment name can be a string or a function (sync or async) for dynamic environments (e.g., per-PR review environments). The function receives the normalized event envelope, with the raw provider body nested at `event.payload`:

```typescript
job('deploy-review', {
  runsOn: 'default',
  environment: (event) => `review/PR-${event.payload.pull_request.number}`,
  steps: [
    step('deploy', async (ctx) => {
      // ctx.environment is 'review/PR-123' (resolved at runtime)
      await ctx.$`deploy-preview --env ${ctx.environment}`;
    }),
  ],
});
```

A pure function like the one above (see [Dynamic values](dynamic-values.md)) is evaluated inline at dispatch with no init-job overhead. Dynamic environments that match a glob pattern (e.g., `review/*`) inherit the pattern's configuration, variables, and protection rules.
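To show how a resolved name might be matched against a glob environment, the sketch below converts a pattern into a regular expression. `matchesPattern` is a hypothetical helper, and the matching semantics are an assumption: here `*` matches any run of characters, including `/`, which may differ from KiCI's actual rules.

```typescript
// Assumption: `*` matches any run of characters, including `/`.
function matchesPattern(pattern: string, name: string): boolean {
  const escaped = pattern
    .split('*')
    // Escape regex metacharacters in the literal parts of the pattern.
    .map((part) => part.replace(/[.+?^${}()|[\]\\]/g, '\\$&'))
    .join('.*');
  return new RegExp(`^${escaped}$`).test(name);
}
```

With this sketch, `review/PR-123` matches `review/*`, so a job resolving to that name would pick up the pattern's configuration.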

### Job-level environment variables

The `env` property on a job provides static or dynamic environment variables:

```typescript
job('deploy', {
  runsOn: 'default',
  environment: 'production',
  env: { DEPLOY_TARGET: 'us-east-1' },
  // Or dynamic:
  // env: (event) => ({ DEPLOY_SHA: event.payload.after?.slice(0, 7) }),
  steps: [
    step('deploy', async (ctx) => {
      // DEPLOY_TARGET is available in ctx.env
      await ctx.$`deploy --region ${ctx.env.DEPLOY_TARGET}`;
    }),
  ],
});
```

### Concurrency groups

Jobs can define their own concurrency groups to control concurrent deployments to the same environment. For workflow-level concurrency (which applies to all jobs in a workflow), see [Concurrency groups](concurrency.md).

Set `concurrencyGroup` on the job:

```typescript
job('deploy', {
  runsOn: 'default',
  environment: 'production',
  concurrencyGroup: 'production-api',
  // Or dynamic:
  // concurrencyGroup: (event) => `review-${event.payload.pull_request.number}`,
  steps: [
    /* ... */
  ],
});
```

If no `concurrencyGroup` is specified, the environment name is used as the default concurrency group.
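The defaulting rule can be sketched as a small resolver. The types and the `resolveConcurrencyGroup` function are hypothetical and exist only to illustrate the fallback to the environment name; they are not SDK exports.

```typescript
// Minimal local stand-ins; not the SDK's real types.
interface EventEnvelope {
  targetBranch?: string;
}

type Dynamic<T> = T | ((event: EventEnvelope) => T);

interface JobConcurrencyInput {
  environment?: string; // already-resolved environment name
  concurrencyGroup?: Dynamic<string>;
}

function resolveConcurrencyGroup(
  job: JobConcurrencyInput,
  event: EventEnvelope,
): string | undefined {
  const group = job.concurrencyGroup;
  // No explicit group: fall back to the environment name.
  if (group === undefined) return job.environment;
  return typeof group === 'function' ? group(event) : group;
}
```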

### Step context

Inside a step, the `ctx` object provides:

| Property          | Type                                  | Description                                                                  |
| ----------------- | ------------------------------------- | ---------------------------------------------------------------------------- |
| `ctx.environment` | `string \| undefined`                 | Resolved environment name (undefined for jobs without environment)           |
| `ctx.env`         | `Record<string, string \| undefined>` | Environment variables (merged from system, org, source, and job-level `env`) |
| `ctx.secrets`     | `StepSecretsTyped`                    | Accessor for bound secrets (get, expose, has, getMeta)                       |

The secret-related methods on the step context are:

| Method                          | Returns                   | Description                                                                                                                       |
| ------------------------------- | ------------------------- | --------------------------------------------------------------------------------------------------------------------------------- |
| `await ctx.secrets.get(key)`    | `string`                  | Retrieve a secret value. Throws `SecretNotFoundError` if not found.                                                               |
| `await ctx.secrets.expose(key)` | `void`                    | Inject a secret into the step's environment variables (`ctx.env`). Throws `SecretNotFoundError` if not found.                     |
| `ctx.secrets.has(key)`          | `boolean`                 | Check if a secret key exists. Synchronous, never throws.                                                                          |
| `ctx.secrets.getMeta(key)`      | `SecretMeta \| undefined` | Retrieve metadata (value, backend name, scope path) for a resolved secret. Returns `undefined` if not found.                      |
| `ctx.setSecretOutput(key, val)` | `void`                    | Publish an encrypted secret output from this job, consumable by downstream jobs via `needs`. Never logged or stored in plaintext. |

## Environment variable merge precedence

When a job targets an environment, variables are merged in this order (last wins):

1. **Allowed system vars** -- `PATH`, `HOME`, etc. from the agent process
2. **Sandbox defaults** -- `FORCE_COLOR=1`
3. **KICI\_\* system vars** -- orchestrator-generated metadata
4. **Org-level environment vars** -- from the dashboard, managed per-environment
5. **Source-level overrides** -- per-repository overrides (skips locked vars)
6. **Job env** -- from the `env` property in the SDK
7. **`setEnv()` calls** -- runtime modifications within steps

> **Note:** Secrets are NOT part of the environment variable merge. They are delivered to the step context via IPC and accessed through `ctx.secrets`, not through `process.env`. See the [step context](#step-context) section above.
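
The merge order above can be sketched as a sequence of object spreads where later layers win. This is an illustration under stated assumptions: the `EnvLayers` shape, the `mergeEnv` helper, and the `KICI_RUN_ID` variable name are hypothetical, and the orchestrator's actual implementation is not shown in this guide.

```typescript
// Hypothetical layer names for illustration; the real internals may differ.
type EnvMap = Record<string, string | undefined>;

interface EnvLayers {
  system: EnvMap; // 1. allowed system vars from the agent process
  sandboxDefaults: EnvMap; // 2. e.g. FORCE_COLOR=1
  kici: EnvMap; // 3. KICI_* metadata
  org: EnvMap; // 4. org-level environment vars
  lockedOrgKeys: string[]; // org vars that source overrides must skip
  source: EnvMap; // 5. per-repository overrides
  job: EnvMap; // 6. `env` property on the job
  setEnvCalls: EnvMap; // 7. runtime setEnv() modifications
}

function mergeEnv(layers: EnvLayers): EnvMap {
  const locked = new Set(layers.lockedOrgKeys);
  const sourceOverrides: EnvMap = {};
  for (const [key, value] of Object.entries(layers.source)) {
    // Source-level overrides skip locked vars.
    if (!locked.has(key)) sourceOverrides[key] = value;
  }
  // Later spreads overwrite earlier ones: last wins.
  return {
    ...layers.system,
    ...layers.sandboxDefaults,
    ...layers.kici,
    ...layers.org,
    ...sourceOverrides,
    ...layers.job,
    ...layers.setEnvCalls,
  };
}

const merged = mergeEnv({
  system: { PATH: '/usr/bin' },
  sandboxDefaults: { FORCE_COLOR: '1' },
  kici: { KICI_RUN_ID: 'run-1' }, // hypothetical variable name
  org: { DEPLOY_TARGET: 'us-east-1', LOG_LEVEL: 'info' },
  lockedOrgKeys: ['DEPLOY_TARGET'],
  source: { DEPLOY_TARGET: 'eu-west-1', LOG_LEVEL: 'debug' },
  job: { FORCE_COLOR: '0' },
  setEnvCalls: {},
});
```

In this example the locked `DEPLOY_TARGET` keeps its org-level value, the unlocked `LOG_LEVEL` takes the source override, and the job-level `FORCE_COLOR` replaces the sandbox default.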

## Protection rules

Environments can have protection rules that gate job execution:

### Branch restrictions

Limit which branches can deploy to an environment:

```
Allowed branches: main, release/*
```

Jobs from other branches are rejected immediately with an error message.

### Required reviewers

Require manual approval before a job can proceed:

```
Required reviewers: alice, bob
```

When reviewers are required, the job enters a "held" state. Reviewers can approve or reject via the dashboard or API. Held runs expire after a configurable timeout (default: 1 hour).

### Wait timer

Add a mandatory delay before deployment starts:

```
Wait timer: 300 seconds
```

The job waits for the specified duration before proceeding. Useful for staged rollouts.

### Minimum trust

Gate job execution based on the contributor's trust tier for PR-triggered runs:

```
Minimum trust: known
```

| Value     | Effect                                              |
| --------- | --------------------------------------------------- |
| `known`   | Blocks unknown contributors; allows known + trusted |
| `trusted` | Blocks unknown + known; allows only trusted         |

When a contributor does not meet the minimum trust level, the job is held in the security approval queue. Someone with `ci_trust:write` or higher must approve it before execution proceeds.

Trust tier is determined by the contributor's identity link and CI trust RBAC level:

- **Trusted** -- identity-linked org member with `ci_trust:write+` AND provider write access
- **Known** -- identity-linked member or verified collaborator via provider API
- **Unknown** -- no identity link and no provider access, fork PRs

The trust tier also affects which lock file is used for PR-triggered runs: trusted contributors use the PR head lock file, while known and unknown contributors use the base branch lock file. This prevents untrusted workflow modifications from affecting execution.
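
The gating and lock file rules above can be sketched as two small functions. The names `meetsMinimumTrust` and `lockFileSource` are hypothetical, chosen for illustration; they only encode the ordering and the lock file choice described in this section.

```typescript
type TrustTier = 'unknown' | 'known' | 'trusted';
type MinimumTrust = 'known' | 'trusted';

// Higher rank means more trust.
const TRUST_RANK: Record<TrustTier, number> = { unknown: 0, known: 1, trusted: 2 };

// True when the contributor may run without a security hold.
function meetsMinimumTrust(tier: TrustTier, minimum: MinimumTrust): boolean {
  return TRUST_RANK[tier] >= TRUST_RANK[minimum];
}

// Which lock file a PR-triggered run uses for the given tier.
function lockFileSource(tier: TrustTier): 'pr-head' | 'base-branch' {
  return tier === 'trusted' ? 'pr-head' : 'base-branch';
}

const knownPassesKnown = meetsMinimumTrust('known', 'known');
const knownPassesTrusted = meetsMinimumTrust('known', 'trusted');
const unknownPassesKnown = meetsMinimumTrust('unknown', 'known');
const knownLockFile = lockFileSource('known');
const trustedLockFile = lockFileSource('trusted');
```

A contributor that fails the check is not rejected outright; per the text above, the job is held in the security approval queue instead.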

See the [CI security architecture docs](../architecture/security/ci-security.md) for the full trust resolution flow.

### Security approval queue

When a PR is held for security review (unknown contributor, workflow modification, or trust policy violation), it enters the security approval queue. This is separate from environment-level approval queues.

Held runs can be approved:

- Via the **dashboard** in Settings > CI trust > Approval queue
- Via a PR comment: `/kici approve` (commenter must have `ci_trust:write+`)

Security holds expire after a configurable timeout (default 1 hour).

### Concurrency limits

Control how many jobs can run simultaneously in an environment:

```
Concurrency limit: 1
Strategy: queue (or cancel-pending)
```

- **queue** -- new jobs wait in a FIFO queue (with configurable timeout, default 1 hour)
- **cancel-pending** -- pending (queued) jobs are cancelled when the limit is reached

## Dashboard management

### Creating environments

Navigate to **Settings > Environments** in the dashboard. Click **New environment** to choose the environment name and type (Fixed or Glob).

- **Fixed** -- applies to jobs that declare exactly this environment name, like `staging` or `production`
- **Glob** -- applies to any environment name a job declares that matches the pattern, e.g. `review/*` matches a job with `environment: 'review/PR-123'`
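
As a rough sketch of how a declared environment name could be matched against Fixed and Glob definitions: the `EnvironmentDefinition` shape and `findEnvironment` helper are hypothetical, and two behaviors here are assumptions rather than documented facts (fixed definitions are checked before glob ones, and `*` stays within a single path segment).

```typescript
interface EnvironmentDefinition {
  name: string; // exact name for fixed, pattern for glob
  type: 'fixed' | 'glob';
}

// Converts a glob to a RegExp. Assumption: `*` does not cross `/`.
function globToRegExp(pattern: string): RegExp {
  const escaped = pattern.replace(/[.+?^${}()|[\]\\]/g, '\\$&');
  return new RegExp(`^${escaped.replace(/\*/g, '[^/]*')}$`);
}

function findEnvironment(
  declared: string,
  definitions: EnvironmentDefinition[],
): EnvironmentDefinition | undefined {
  // Assumption: an exact fixed match takes priority over any glob match.
  const fixed = definitions.find((d) => d.type === 'fixed' && d.name === declared);
  if (fixed) return fixed;
  return definitions.find((d) => d.type === 'glob' && globToRegExp(d.name).test(declared));
}

const definitions: EnvironmentDefinition[] = [
  { name: 'staging', type: 'fixed' },
  { name: 'review/*', type: 'glob' },
];

const reviewMatch = findEnvironment('review/PR-123', definitions);
const stagingMatch = findEnvironment('staging', definitions);
const noMatch = findEnvironment('qa', definitions);
```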

The environments list shows each environment's type, whether test runs may use it (the `allowLocalExecution` flag -- see the [testing guide](./testing-guide.md)), and whether it is enabled.

### Environment detail page

Each environment has four tabs:

1. **Variables** -- manage key-value pairs with lock toggles. Locked variables cannot be overridden by source-level overrides. Source overrides are managed in a sub-tab.

2. **Secrets** -- view bound secret scopes and their resolved secret count. Add bindings by specifying scope glob patterns (e.g., `aws/prod/**`).

3. **Protection** -- configure branch restrictions, required reviewers, wait timers, and concurrency limits with enable toggles for each section.

4. **History** -- view filtered runs targeting this environment.

### Secrets management

Secrets are individual encrypted values organized by scope paths (e.g., `aws/prod`, `databases/postgres`). Scopes are bound to environments via bindings:

- **Scope-centric view** (Secrets page): tree view of scopes with per-scope environment binding checkboxes
- **Environment-centric view** (inside environment detail): bound scopes, resolved secrets, add binding

When scope paths collide on the same key name, the longer (more specific) path wins.
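
The collision rule can be illustrated with a short resolver. This is a sketch only: `ScopedSecret` and `resolveSecrets` are hypothetical names, the values are placeholders, and measuring specificity by path string length is an assumption (the real system might compare path segments instead).

```typescript
interface ScopedSecret {
  scope: string; // e.g. 'aws/prod'
  key: string; // e.g. 'DB_PASSWORD'
  value: string;
}

// For each key, keep the secret from the longest scope path.
function resolveSecrets(secrets: ScopedSecret[]): Map<string, ScopedSecret> {
  const resolved = new Map<string, ScopedSecret>();
  for (const secret of secrets) {
    const current = resolved.get(secret.key);
    if (!current || secret.scope.length > current.scope.length) {
      resolved.set(secret.key, secret);
    }
  }
  return resolved;
}

const resolvedSecrets = resolveSecrets([
  { scope: 'aws/prod', key: 'DB_PASSWORD', value: 'placeholder-prod' },
  { scope: 'aws', key: 'DB_PASSWORD', value: 'placeholder-shared' },
  { scope: 'aws', key: 'API_KEY', value: 'placeholder-key' },
]);
```

Both bound scopes define `DB_PASSWORD`, so the entry from `aws/prod` wins; `API_KEY` has no collision and resolves from `aws`.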

## Type generation

Running `kici types` generates two augmented interfaces: `KnownSecretKeys` (union of all secret keys across all environments) and `EnvironmentSecrets` (per-environment key unions):

```typescript
interface KnownSecretKeys {
  DB_PASSWORD: string;
  API_KEY: string;
}

interface EnvironmentSecrets {
  production: 'DB_PASSWORD' | 'API_KEY';
  staging: 'DB_PASSWORD';
}
```

`KnownSecretKeys` narrows `ctx.secrets.get()` and `ctx.secrets.expose()` key parameters to valid key names. `EnvironmentSecrets` maps each environment to its available secret key names as a string union. Dynamic environments fall back to the full `KnownSecretKeys` union.
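
To see how the two interfaces could drive key narrowing, here is a self-contained sketch. The `SecretKeysFor` helper, the `lookup` function, and the in-memory store are hypothetical and exist only for this illustration; the SDK's actual typings for `ctx.secrets` may be structured differently.

```typescript
interface KnownSecretKeys {
  DB_PASSWORD: string;
  API_KEY: string;
}

interface EnvironmentSecrets {
  production: 'DB_PASSWORD' | 'API_KEY';
  staging: 'DB_PASSWORD';
}

// Hypothetical helper: keys for a declared environment, else every known key.
type SecretKeysFor<E extends string> = E extends keyof EnvironmentSecrets
  ? EnvironmentSecrets[E]
  : keyof KnownSecretKeys;

// Stand-in secret store with placeholder values.
const fakeStore: Record<string, string> = {
  DB_PASSWORD: 'placeholder-db-password',
  API_KEY: 'placeholder-api-key',
};

function lookup(key: string): string {
  const value = fakeStore[key];
  if (value === undefined) throw new Error(`Secret not found: ${key}`);
  return value;
}

// A known environment only accepts its own keys.
const stagingSecrets = { get: (key: SecretKeysFor<'staging'>) => lookup(key) };
// A dynamic environment (plain string) accepts the full union.
const dynamicSecrets = { get: (key: SecretKeysFor<string>) => lookup(key) };

const stagingDb = stagingSecrets.get('DB_PASSWORD');
const dynamicApiKey = dynamicSecrets.get('API_KEY');
// stagingSecrets.get('API_KEY'); // rejected at compile time: not bound to staging
```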

---

## Event system

Source: https://docs.kici.dev/user/events/

KiCI supports two broad categories of workflow triggers: **git-based triggers** that work immediately, and **event-based triggers** that use a registration model. Understanding this distinction is key to working effectively with non-git triggers like schedules, custom events, and generic webhooks.

## Overview

Git-based triggers (`push()`, `pr()`, `tag()`, `comment()`, `review()`, `release()`, etc.) work immediately after you commit your lock file. When a GitHub webhook arrives, the orchestrator fetches your lock file and evaluates triggers on the spot -- no advance setup needed.

Event-based triggers work differently. The orchestrator needs to know about them _before_ the event arrives. This is because event-based triggers are matched against a pre-built registration index rather than being evaluated per-event from a lock file fetch. The six event-based trigger types are:

- `kiciEvent()` -- custom events emitted from workflow steps
- `workflowComplete()` -- fires when a workflow finishes
- `jobComplete()` -- fires when a specific job finishes
- `genericWebhook()` -- HTTP webhooks from external services
- `schedule()` -- cron-based time triggers
- `lifecycle()` -- orchestrator lifecycle events (workflow completion, job failure, registration updates)

All six require the **registration model** to function -- covered in detail below.

## Event types

### Custom events

Custom events are user-defined events emitted from workflow steps using `ctx.emit()`. Use `kiciEvent()` to listen for them.

```typescript
import { kiciEvent } from '@kici-dev/sdk';

// Listen for a custom event by name
kiciEvent({ name: 'deploy-complete' });

// With payload matching (JSONPath)
kiciEvent({ name: 'deploy-complete', match: { '$.env': 'prod' } });

// With negative filter
kiciEvent({ name: 'deploy-complete', not: { '$.env': 'staging' } });

// From a specific repository
kiciEvent({ name: 'deploy-complete', source: 'org/infra-repo' });
```

**Config options:** `name` (required), `match`, `not`, `source`, `description`.

### System events

The orchestrator automatically emits completion events when workflows and jobs finish, so no manual emission is needed.

**Workflow completion:**

```typescript
import { workflowComplete } from '@kici-dev/sdk';

// Any workflow completion
workflowComplete();

// Specific workflow by name
workflowComplete({ name: 'build' });

// Only successful completions
workflowComplete({ name: 'build', status: ['success'] });
```

**Config options:** `name`, `status` (`'success'`, `'failed'`, `'cancelled'`), `source`, `description`.

**Job completion:**

```typescript
import { jobComplete } from '@kici-dev/sdk';

// Any job completion
jobComplete();

// Specific workflow + job
jobComplete({ workflow: 'build', job: 'test' });

// Only failures
jobComplete({ workflow: 'build', job: 'test', status: ['failed'] });
```

**Config options:** `workflow`, `job`, `status` (`'success'`, `'failed'`, `'cancelled'`, `'skipped'`), `source`, `description`.

### External events

Generic webhooks let you trigger workflows from any HTTP service -- Stripe, ArgoCD, Slack, Grafana, or your own internal services.

```typescript
import { genericWebhook } from '@kici-dev/sdk';

// Match any event from a source
genericWebhook({ source: 'stripe' });

// Match specific event types
genericWebhook({ source: 'stripe', events: ['invoice.paid'] });

// With HMAC-SHA256 signature verification
genericWebhook({
  source: 'stripe',
  events: ['invoice.paid'],
  auth: {
    method: 'hmac-sha256',
    secret: 'stripe-signing-key',
    signatureHeader: 'stripe-signature',
  },
});

// With API key auth
genericWebhook({
  source: 'slack',
  auth: { method: 'api-key', secret: 'slack-token' },
});
```

**Config options:** `source` (required), `events`, `match`, `not`, `auth`, `path`, `description`.

The `source` field MUST match the `--name` that an operator passed to `kici-admin source add generic --name <name>` when the source was created — that string is the source's identifier in the orchestrator. Generic webhook sources must be created by an operator before events can be received; see [Operator guide: event routing](../operator/event-routing.md) for setup instructions.

### Schedule events

Cron-based triggers are evaluated by the orchestrator on a periodic interval. In a clustered deployment, only the Raft leader evaluates schedules.

```typescript
import { schedule } from '@kici-dev/sdk';

// Run every hour
schedule({ cron: '0 * * * *' });

// Run daily at 2 AM UTC
schedule({ cron: '0 2 * * *' });

// Run weekly on Mondays at 9 AM Eastern
schedule({ cron: '0 9 * * 1', timezone: 'America/New_York' });
```

**Config options:** `cron` (required), `timezone` (defaults to `'UTC'`), `description`.

### Lifecycle events

Lifecycle triggers listen for orchestrator-level events related to workflow execution and system state changes.

```typescript
import { lifecycle } from '@kici-dev/sdk';

// Trigger when any workflow completes
lifecycle({ events: ['workflow_complete'] });

// Trigger on job failures from a specific repo
lifecycle({ events: ['job_failed'], sources: ['org/deploy-repo'] });

// Trigger when registrations are updated
lifecycle({ events: ['registration_updated'] });
```

**Available events:** `'workflow_complete'`, `'job_complete'`, `'job_failed'`, `'registration_updated'`.

**Config options:** `events` (required), `sources`, `description`.

## The registration model

This is the most important concept for understanding event-based triggers.

### Why registrations exist

When a GitHub webhook arrives (push, PR, etc.), the orchestrator fetches your lock file from the repository and evaluates triggers on the spot. This works because the event itself tells the orchestrator _which repository_ to look at.

Event-based triggers are different. When a cron timer fires or a custom event is emitted, there is no incoming webhook pointing to a specific repository. The orchestrator needs to know _in advance_ which workflows care about which events. That is what the registration model provides: a pre-built index of event-based workflows.

### How registration works

1. You define a workflow with an event-based trigger (e.g., `schedule()`, `kiciEvent()`, `genericWebhook()`)
2. You compile the workflow (`kici compile`), which produces a lock file
3. You push the lock file to your repository's **default branch** (e.g., `main` or `master`)
4. The orchestrator receives the push webhook, detects it targets the default branch, and extracts all workflows with event-based triggers from the lock file
5. Those workflows are stored in the orchestrator's registration database
6. From that point on, matching events will trigger those workflows

### Key implications

- **Event-based workflows do not trigger until you push to the default branch.** If you add a new `schedule()` workflow, it will not start running until you merge to your default branch. This is by design -- the orchestrator cannot match events to workflows it does not know about.

- **Registration is automatic.** There is no manual setup. Push your code, and the orchestrator handles the rest.

- **Registrations refresh on every default-branch push.** If you add, remove, or modify event-based workflows and push to the default branch, the orchestrator updates its registration index automatically. Removed workflows stop triggering. New workflows start triggering.

- **Git-based triggers are unaffected.** Triggers like `push()`, `pr()`, and `tag()` do not use registrations. They work immediately from any branch because the orchestrator evaluates them per-event from the lock file.
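
The registration steps and their implications can be sketched as a single push handler. Everything here is a simplified model: `LockedWorkflow`, `PushEvent`, `Registry`, and `handlePush` are hypothetical names, and the real lock file and registration database carry far more detail.

```typescript
type TriggerKind =
  | 'push' | 'pr' | 'tag'
  | 'kiciEvent' | 'workflowComplete' | 'jobComplete'
  | 'genericWebhook' | 'schedule' | 'lifecycle';

const EVENT_BASED: ReadonlySet<TriggerKind> = new Set<TriggerKind>([
  'kiciEvent', 'workflowComplete', 'jobComplete', 'genericWebhook', 'schedule', 'lifecycle',
]);

interface LockedWorkflow {
  name: string;
  triggers: TriggerKind[];
}

interface PushEvent {
  repo: string;
  branch: string;
  defaultBranch: string;
}

// repo -> names of registered event-based workflows
type Registry = Map<string, string[]>;

// Returns true when the push refreshed the registrations.
function handlePush(registry: Registry, push: PushEvent, lockFile: LockedWorkflow[]): boolean {
  // Pushes to other branches never register anything.
  if (push.branch !== push.defaultBranch) return false;
  const eventBased = lockFile
    .filter((w) => w.triggers.some((t) => EVENT_BASED.has(t)))
    .map((w) => w.name);
  // Full replace: removed workflows stop triggering, new ones start.
  registry.set(push.repo, eventBased);
  return true;
}

const registry: Registry = new Map();
const lockFile: LockedWorkflow[] = [
  { name: 'nightly-build', triggers: ['schedule'] },
  { name: 'ci', triggers: ['push', 'pr'] },
];

const featureRegistered = handlePush(
  registry,
  { repo: 'org/app', branch: 'feature/nightly', defaultBranch: 'main' },
  lockFile,
);
const afterFeature = registry.get('org/app');

handlePush(registry, { repo: 'org/app', branch: 'main', defaultBranch: 'main' }, lockFile);
const afterMerge = registry.get('org/app');

handlePush(registry, { repo: 'org/app', branch: 'main', defaultBranch: 'main' }, [
  { name: 'ci', triggers: ['push', 'pr'] },
]);
const afterRemoval = registry.get('org/app');
```

The feature-branch push leaves the registry untouched, the merge registers `nightly-build`, and a later default-branch push without that workflow removes it again. The `ci` workflow never appears because it only uses git-based triggers.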

### Practical example

You create a nightly build workflow:

```typescript
import { workflow, job, step, schedule } from '@kici-dev/sdk';

export default workflow('nightly-build', {
  on: schedule({ cron: '0 2 * * *' }),
  jobs: [
    job('build', {
      runsOn: 'linux',
      steps: [
        step('build', async ({ $ }) => {
          await $`pnpm build`;
        }),
      ],
    }),
  ],
});
```

You compile it, commit the lock file, and push to a feature branch. **Nothing happens** -- the cron will not fire because the orchestrator has not registered this workflow yet.

You merge the feature branch into `main`. On the merge push, the orchestrator extracts the `nightly-build` workflow (it has a `ScheduleTrigger`) and registers it. Starting at the next 2 AM UTC, the workflow will trigger.

## How events are matched

When an event arrives, the orchestrator follows this flow:

1. **Event received** -- a custom event is emitted by a step, a cron timer fires, or a generic webhook arrives
2. **Registration lookup** -- the orchestrator queries its registration index for workflows matching the event type (e.g., all workflows with `ScheduleTrigger` for a cron fire, or all workflows with `KiciEventTrigger` for a custom event)
3. **Trigger evaluation** -- for each candidate workflow, the orchestrator evaluates the trigger conditions: event name patterns, payload matching, status filters, source filters
4. **Dispatch** -- matched workflows are dispatched to agents for execution, following the same job queue and agent routing as git-triggered workflows

This lookup is fast because the registration index is held in memory and refreshed only when the registry version changes (on default-branch pushes).
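
The lookup and evaluation steps can be sketched with an in-memory index keyed by event name. This is a simplified model under several assumptions: `Registration`, `readPath`, and `findWorkflows` are hypothetical names, only dotted paths such as `$.env` are supported (real JSONPath is much richer), and a `not` filter is assumed to reject the event when any of its entries matches.

```typescript
interface Registration {
  workflow: string;
  eventName: string;
  match?: Record<string, unknown>;
  not?: Record<string, unknown>;
}

// Resolves a simple `$.a.b` path against a payload.
function readPath(payload: unknown, path: string): unknown {
  const segments = path.replace(/^\$\.?/, '').split('.').filter((s) => s.length > 0);
  let current: unknown = payload;
  for (const segment of segments) {
    if (typeof current !== 'object' || current === null) return undefined;
    current = (current as Record<string, unknown>)[segment];
  }
  return current;
}

function triggerMatches(registration: Registration, payload: unknown): boolean {
  const matchOk = Object.entries(registration.match ?? {}).every(
    ([path, expected]) => readPath(payload, path) === expected,
  );
  const notHit = Object.entries(registration.not ?? {}).some(
    ([path, rejected]) => readPath(payload, path) === rejected,
  );
  return matchOk && !notHit;
}

// Step 2 (lookup by event name) followed by step 3 (trigger evaluation).
function findWorkflows(index: Map<string, Registration[]>, eventName: string, payload: unknown): string[] {
  const candidates = index.get(eventName) ?? [];
  return candidates.filter((r) => triggerMatches(r, payload)).map((r) => r.workflow);
}

const index = new Map<string, Registration[]>([
  [
    'deploy-complete',
    [
      { workflow: 'post-deploy', eventName: 'deploy-complete', match: { '$.env': 'prod' } },
      { workflow: 'audit', eventName: 'deploy-complete' },
      { workflow: 'non-staging-report', eventName: 'deploy-complete', not: { '$.env': 'staging' } },
    ],
  ],
]);

const prodMatches = findWorkflows(index, 'deploy-complete', { env: 'prod' });
const stagingMatches = findWorkflows(index, 'deploy-complete', { env: 'staging' });
const unknownMatches = findWorkflows(index, 'build-complete', { env: 'prod' });
```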

### Cross-source webhook delivery

The catch-all `webhook()` trigger (see [SDK reference: webhook()](sdk/triggers.md#webhook)) participates in this same registration lookup, but with one twist: it fires for matching events arriving via **any** inbound webhook source in the same org, not just the source the workflow's repo is bound to. The orchestrator maintains a `(customerId, eventName)` index over webhook trigger registrations and consults it on every inbound generic webhook.

The lookup is structurally org-isolated — a generic webhook delivered to org A can never reach a workflow registered against org B, because foreign-org rows live in a different bucket of the index. When a webhook fires across sources, the runtime clone token, repo URL, and check-status posting all come from the **registration's** source bundle, not the inbound source. The inbound source contributes only the event payload.

## Circuit breaker

Events can trigger workflows that emit more events, creating chains. The circuit breaker prevents runaway event storms.

### Chain depth limit

Each event carries a `chainDepth` counter. When a workflow triggered by an event emits a new event, the new event's chain depth increments. The orchestrator rejects events that exceed the maximum chain depth.

- **Default limit:** 10 levels deep
- **What happens when hit:** the event is dropped and logged. It is not queued for later delivery.

For example: Workflow A emits event X (depth 0) -> Workflow B triggers, emits event Y (depth 1) -> ... -> at depth 10, any further emitted events are dropped.
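
The depth check can be sketched as follows. The names `emitFrom` and `acceptEvent` are hypothetical, and the exact boundary is an assumption based on the wording above: an event is treated as accepted while its depth does not exceed the limit.

```typescript
const MAX_CHAIN_DEPTH = 10;

interface ChainedEvent {
  name: string;
  chainDepth: number;
}

// A root event starts at depth 0; an event emitted by a triggered workflow is one deeper.
function emitFrom(parent: ChainedEvent | undefined, name: string): ChainedEvent {
  return { name, chainDepth: parent ? parent.chainDepth + 1 : 0 };
}

// Assumption: events are rejected only when they exceed the maximum depth.
function acceptEvent(event: ChainedEvent, maxDepth: number = MAX_CHAIN_DEPTH): boolean {
  return event.chainDepth <= maxDepth;
}

// Simulate a runaway chain where every triggered workflow emits another event.
let current: ChainedEvent | undefined = undefined;
let delivered = 0;
let droppedAt: number | undefined;
for (let i = 0; i < 20; i++) {
  const next = emitFrom(current, `event-${i}`);
  if (!acceptEvent(next)) {
    droppedAt = next.chainDepth; // dropped and logged, never queued
    break;
  }
  delivered++;
  current = next;
}
```

Under this assumption the chain delivers events at depths 0 through 10 and drops the first event at depth 11, which stops the chain.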

### Rate limiting

Each workflow is rate-limited on how many events it can process per minute, using a sliding window.

- **Default limit:** 100 events per workflow per minute
- **What happens when hit:** additional events for that workflow are dropped and logged until the window clears.
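
A per-workflow sliding window can be sketched like this. It is a minimal in-memory model, not the orchestrator's implementation; the `SlidingWindowLimiter` class is a hypothetical name, and the demo uses a limit of 3 so the behavior is easy to follow.

```typescript
class SlidingWindowLimiter {
  private readonly hits = new Map<string, number[]>();
  private readonly limit: number;
  private readonly windowMs: number;

  constructor(limit = 100, windowMs = 60_000) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns true when the event is accepted, false when it should be dropped.
  tryAccept(workflow: string, nowMs: number): boolean {
    const cutoff = nowMs - this.windowMs;
    // Keep only timestamps that are still inside the window.
    const recent = (this.hits.get(workflow) ?? []).filter((t) => t > cutoff);
    const accepted = recent.length < this.limit;
    if (accepted) recent.push(nowMs);
    this.hits.set(workflow, recent);
    return accepted;
  }
}

const limiter = new SlidingWindowLimiter(3, 60_000);
const results = [
  limiter.tryAccept('build', 0),
  limiter.tryAccept('build', 1_000),
  limiter.tryAccept('build', 2_000),
  limiter.tryAccept('build', 3_000), // fourth event inside the same window
  limiter.tryAccept('build', 60_001), // the event at t=0 has left the window
];
const otherWorkflow = limiter.tryAccept('deploy', 3_000); // limits are per workflow
```

The fourth event is dropped because three events already sit inside the window. Once the oldest timestamp ages out, events are accepted again, and a different workflow is unaffected throughout.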

These defaults are hardcoded in the orchestrator and are not currently configurable via environment variables.

## Delivery guarantees

KiCI's event router delivers every accepted event with **at-least-once** semantics:

- An event that passes the circuit breaker (chain depth + rate limit) and commits
  to the `kici_events` table is guaranteed to dispatch to all matching workflows
  at least once.
- Each dispatch attempt acquires a short-lived lease (default 60 s) on the row.
  If the dispatching node crashes or the handler throws, the lease expires (or
  is released on failure) and the event is automatically retried.
- The retry policy is exponential backoff with full jitter: base 5 s, cap 5 min,
  up to 5 attempts before the event lands in the **DLQ** (dead-letter queue).
  Operators triage DLQ entries via `kici-admin event-dlq list / count / retry / discard`.
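
The retry policy can be sketched with the common "full jitter" formula, where each delay is drawn uniformly from zero up to an exponentially growing ceiling. This is an illustration only: the function names are hypothetical, and both the attempt indexing (the first retry uses the base as its ceiling) and the point at which an event moves to the DLQ (after the fifth failed attempt) are assumptions.

```typescript
const BASE_MS = 5_000; // base 5 s
const CAP_MS = 300_000; // cap 5 min
const MAX_ATTEMPTS = 5;

// failedAttempts is 1 after the first failure, 2 after the second, and so on.
function retryDelayMs(failedAttempts: number, random: () => number = Math.random): number {
  const ceiling = Math.min(CAP_MS, BASE_MS * 2 ** (failedAttempts - 1));
  // Full jitter: uniform in [0, ceiling).
  return Math.floor(random() * ceiling);
}

type NextAction = { kind: 'retry'; delayMs: number } | { kind: 'dlq' };

function nextAction(failedAttempts: number, random: () => number = Math.random): NextAction {
  if (failedAttempts >= MAX_ATTEMPTS) return { kind: 'dlq' };
  return { kind: 'retry', delayMs: retryDelayMs(failedAttempts, random) };
}

// A fixed "random" source makes the example deterministic.
const half = () => 0.5;
const firstRetry = retryDelayMs(1, half);
const thirdRetry = retryDelayMs(3, half);
const cappedRetry = retryDelayMs(10, half);
const afterFourFailures = nextAction(4, half);
const afterFiveFailures = nextAction(5, half);
```

Jitter spreads retries from many events over time, so a burst of failures does not produce a synchronized burst of retries.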

**What this means for workflow authors:**

- **Make event handlers idempotent.** A retried dispatch may run a handler more
  than once (e.g. if the first attempt threw after a partial side-effect).
  Workflows that mutate external state should use idempotency keys, conditional
  writes, or other deduplication patterns — same advice as for any distributed
  CI system.
- **Schedule fires are at-least-once too.** If a cron schedule fires while the
  leader is being killed, the fire and the `cron_last_fired` update commit
  atomically or roll back together, never half. Recovery on the new leader does
  not backfill multiple missed instants; if your workflow must run once for
  every scheduled instant across an outage, drive it from a different mechanism
  (e.g. a workflow that runs more frequently and emits its own custom event).
- **Drops are still possible — and visible.** Events rejected by the circuit
  breaker (chain depth or rate limit exceeded) are dropped and logged, not
  retried. That's a deliberate safety mechanism; the metric to watch is
  `kici_orch_events_dropped_total{reason}`.
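
One way to make a handler idempotent is to derive a key from the event and skip work that was already applied. The sketch below uses an in-memory `Set` as a stand-in for a durable store; `DeployEvent`, `deployId`, and `handleDeployComplete` are hypothetical names chosen for this example.

```typescript
interface DeployEvent {
  deployId: string; // hypothetical unique identifier carried in the payload
  version: string;
}

// In-memory stand-in for a durable store such as a database table.
const appliedKeys = new Set<string>();
const sideEffects: string[] = [];

function handleDeployComplete(event: DeployEvent): 'applied' | 'skipped' {
  const key = `deploy-complete:${event.deployId}`;
  // A retried dispatch finds the key and does nothing.
  if (appliedKeys.has(key)) return 'skipped';
  sideEffects.push(`tagged release ${event.version}`);
  appliedKeys.add(key);
  return 'applied';
}

const firstDelivery = handleDeployComplete({ deployId: 'd-1', version: '1.2.3' });
const retriedDelivery = handleDeployComplete({ deployId: 'd-1', version: '1.2.3' });
```

In a real workflow the key check and the side effect should be atomic (for example a conditional write), since a crash between the two lines would still allow a duplicate.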

## Emitting custom events

Custom events are emitted from workflow steps using `ctx.emit()`. You can optionally define typed event schemas using `defineEvent()`.

### Basic emission

```typescript
import { workflow, job, step, push } from '@kici-dev/sdk';

export default workflow('build', {
  on: push({ branches: 'main' }),
  jobs: [
    job('build', {
      runsOn: 'linux',
      steps: [
        step('build', async ({ $ }) => {
          await $`pnpm build`;
        }),
        step('notify', async (ctx) => {
          await ctx.emit('build-complete', {
            version: '1.0.0',
            success: true,
          });
        }),
      ],
    }),
  ],
});
```

### Typed event definitions

Use `defineEvent()` with Zod schemas to create a typed contract for event payloads:

```typescript
import { defineEvent, z } from '@kici-dev/sdk';

export const deployComplete = defineEvent(
  'deploy-complete',
  z.object({
    env: z.string(),
    version: z.string(),
    services: z.array(z.string()),
  }),
);
```

Then emit using the definition's name:

```typescript
step('emit', async (ctx) => {
  await ctx.emit(deployComplete.name, {
    env: 'prod',
    version: '1.2.3',
    services: ['api', 'web'],
  });
});
```

And consume in another workflow:

```typescript
import { workflow, job, step, kiciEvent } from '@kici-dev/sdk';

export default workflow('post-deploy', {
  on: kiciEvent({ name: 'deploy-complete', match: { '$.env': 'prod' } }),
  jobs: [
    job('smoke-test', {
      runsOn: 'linux',
      steps: [
        step('test', async ({ $ }) => {
          await $`./scripts/smoke-test.sh`;
        }),
      ],
    }),
  ],
});
```

Custom events are delivered immediately when emitted (mid-workflow, not queued until workflow completion). See the [SDK reference: emitting events](sdk/validation-events.md#emitting-events) section for the full `ctx.emit()` API.

## See also

- [SDK reference: event triggers](sdk/triggers.md#event-triggers) -- complete API signatures for all trigger builders
- [SDK reference: emitting events](sdk/validation-events.md#emitting-events) -- `ctx.emit()` and `defineEvent()` API
- [Workflow patterns: workflow chaining](patterns/integrations.md#workflow-chaining) -- examples of event-driven workflow chains
- [Operator guide: event routing](../operator/event-routing.md) -- configuring generic webhook sources, trust relationships, and event routing
- [Architecture: event system](../architecture/webhooks/event-system.md) -- internal event routing design, registration model, cluster synchronization

---

## Global workflows

Source: https://docs.kici.dev/user/global-workflows/

Global workflows let one **workflow repo** define jobs that run on events from many **source repos** in the same org. They're the answer to "I want one CI policy / release pipeline / security scan to fire on every repo without copy-pasting `.kici/` folders everywhere."

If you've only ever used per-repo workflows so far, start with the mental model section — global workflows add two new concepts (workflow repo vs. source repo, and authoring vs. source axes) that show up everywhere from SDK syntax to dashboard settings.

## Mental model

| Term           | Meaning                                                                                                                                                |
| -------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------ |
| Workflow repo  | The repo whose `.kici/workflows/*.ts` file **declares** the global workflow. Holds the steps. Also known as the _authoring_ repo.                      |
| Source repo    | The repo that **emits** the event (push / PR / tag / ...) that causes the global workflow to fire. The agent checks out this repo as the working copy. |
| Global         | A workflow whose trigger carries one or more `repos:` glob patterns. The presence of `repos:` is what classifies a workflow as global.                 |
| Authoring axis | Policy that answers "which repos may **author** global workflows?" Controlled by the allow-list in the dashboard's _Workflow authors_ setting.         |
| Source axis    | Policy that answers "which **source** repos' events are allowed to trigger global workflows?" Controlled by the deny-list in _Blocked source repos_.   |

The two axes are independent. A global workflow fires only if it passes **both** — its authoring repo is allowed AND the source repo is not denied.

## Declaring a global workflow

Add `repos:` to any trigger. Any workflow with at least one `repos:`-bearing trigger becomes global automatically; no separate flag is required.

```ts
import { workflow, job, step, push } from '@kici-dev/sdk';

export default workflow('org-lint', {
  on: [
    push({
      repos: ['myorg/*', '!myorg/archived-*'],
      branches: ['main'],
    }),
  ],
  jobs: [
    job('lint', {
      steps: [
        step('lint-all', async ({ $, env }) => {
          await $`echo source=${env.KICI_SOURCE_REPO_PATH ?? 'unknown'}`;
          await $`npm run lint`;
        }),
      ],
    }),
  ],
});
```

Patterns in `repos:` use the same globbing as `branches:` / `paths:` — plain globs (`myorg/*`), a leading `!` for exclusions (`!myorg/fork-*`), and a fully-qualified `owner/repo` identity for exact matches (`myorg/platform`). A bare `**` matches every repo in the org.
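
To make the pattern rules concrete, here is a rough matcher. It is a sketch under assumptions, not the orchestrator's matcher: `repoGlobToRegExp` and `matchesRepos` are hypothetical names, a single `*` is assumed to stay within one path segment, and a repo is assumed to match when at least one positive pattern matches and no `!` pattern does.

```typescript
function repoGlobToRegExp(glob: string): RegExp {
  const escaped = glob.replace(/[.+?^${}()|[\]\\]/g, '\\$&');
  // `**` crosses `/`; a single `*` stays within one path segment (assumption).
  const source = escaped
    .replace(/\*\*/g, '\u0000')
    .replace(/\*/g, '[^/]*')
    .replace(/\u0000/g, '.*');
  return new RegExp(`^${source}$`);
}

function matchesRepos(repo: string, patterns: string[]): boolean {
  const includes = patterns.filter((p) => !p.startsWith('!'));
  const excludes = patterns.filter((p) => p.startsWith('!')).map((p) => p.slice(1));
  const included = includes.some((p) => repoGlobToRegExp(p).test(repo));
  const excluded = excludes.some((p) => repoGlobToRegExp(p).test(repo));
  return included && !excluded;
}

const patterns = ['myorg/*', '!myorg/archived-*'];
const apiMatches = matchesRepos('myorg/api', patterns);
const archivedMatches = matchesRepos('myorg/archived-old', patterns);
const foreignMatches = matchesRepos('other/api', patterns);
const exactMatches = matchesRepos('myorg/platform', ['myorg/platform']);
const catchAllMatches = matchesRepos('myorg/api', ['**']);
```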

### Dual-repo checkout

The agent receives two sets of context during a global workflow execution:

| `env` var                 | Points to                                                                                                                                    |
| ------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------- |
| `KICI_SOURCE_REPO_PATH`   | The **source** repo's working tree (the repo that emitted the event). This is the repo the job's `$` / `git` commands operate on by default. |
| `KICI_WORKFLOW_REPO_PATH` | The **workflow** repo's working tree (the repo that authored the workflow). Useful for reading shared scripts or config from your CI repo.   |

Source repo secrets are **not** available to a global workflow's job by default — see _Elevated access_ below.

## Enabling global workflows

Global workflows are **opt-in per org**. In a fresh org, `repos:`-bearing workflows are registered but never dispatched.

1. Open the dashboard → **Settings → Global workflows**.
2. Turn on **Enable global workflows** (the master toggle). This is the kill-switch — every other toggle below is ignored while this is off.
3. Decide which authoring/source controls you need:

| Setting              | What it controls                                                                                                                                                                                     | Typical use                                                                                  |
| -------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------------- |
| Workflow authors     | Restricts which repos can **author** (register) global workflows. Globs matched against the authoring repo identifier. When OFF, any repo in the org may author globals.                             | Lock authoring to `myorg/ci-*` so random product repos can't ship org-wide automation.       |
| Blocked source repos | Blocks dispatch for events emitted from these **source** repos, regardless of authoring. Globs matched against the event source repo identifier. When OFF, events from any repo may trigger globals. | Protect against fork spam, e.g. block `myorg/fork-*`.                                        |
| Elevated access      | Authoring repos listed here get **read access to source-repo secrets** during execution. Globs matched against the authoring repo identifier.                                                        | A `myorg/ci-deploy` repo that needs to read a source repo's `NPM_TOKEN` to publish releases. |

All three lists accept globs. A leading `!` inside a pattern is not supported here. Negation comes from each list's own semantics instead (for an allow-list, anything not listed is implicitly denied), so keep patterns simple (`myorg/ci-*`, `myorg/platform-*`).

### Saving and reverting

The page is a two-state editor — changes are local until you click **Save changes**, and you can abandon them with **Discard changes**. There is no partial save; the PATCH is all-or-nothing per save click.

## Security model

### Two independent axes

A global workflow fires only if:

1. **The authoring repo is allowed.** If _Workflow authors_ is ON, the workflow's authoring repo must match at least one allow-list glob. If OFF, any repo may author. Enforced at two points:
   - At registration time (extraction from the lock file — non-matching globals are dropped with a warning).
   - At dispatch time (defense-in-depth — policy changes after registration still take effect).
2. **The source repo is not denied.** If the event's source repo matches any glob in _Blocked source repos_, the global workflow is skipped. Enforced at dispatch time.

Both checks are logged to the orchestrator. Grep the logs for `Skipping global workflow` to see enforcement in action.
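
The dispatch-time decision can be sketched as a function of the org settings plus the two repo identifiers. The `GlobalWorkflowSettings` shape and `decideDispatch` helper are hypothetical, the reason strings are illustrative rather than actual log messages, and an `undefined` list stands in for a setting that is switched OFF.

```typescript
interface GlobalWorkflowSettings {
  enabled: boolean; // master toggle
  workflowAuthors?: string[]; // allow-list; undefined means the setting is OFF
  blockedSourceRepos?: string[]; // deny-list; undefined means the setting is OFF
}

type Decision = { dispatch: true } | { dispatch: false; reason: string };

// Simple glob check where `*` matches any run of characters.
function matchesAny(repo: string, globs: string[]): boolean {
  return globs.some((glob) => {
    const escaped = glob.replace(/[.+?^${}()|[\]\\]/g, '\\$&').replace(/\*/g, '.*');
    return new RegExp(`^${escaped}$`).test(repo);
  });
}

function decideDispatch(
  settings: GlobalWorkflowSettings,
  authoringRepo: string,
  sourceRepo: string,
): Decision {
  if (!settings.enabled) return { dispatch: false, reason: 'global workflows disabled' };
  // Authoring axis: the workflow repo must be on the allow-list when it is ON.
  if (settings.workflowAuthors && !matchesAny(authoringRepo, settings.workflowAuthors)) {
    return { dispatch: false, reason: 'authoring repo not allowed' };
  }
  // Source axis: the event's repo must not be on the deny-list.
  if (settings.blockedSourceRepos && matchesAny(sourceRepo, settings.blockedSourceRepos)) {
    return { dispatch: false, reason: 'source repo blocked' };
  }
  return { dispatch: true };
}

const settings: GlobalWorkflowSettings = {
  enabled: true,
  workflowAuthors: ['myorg/ci-*'],
  blockedSourceRepos: ['myorg/fork-*'],
};

const allowed = decideDispatch(settings, 'myorg/ci-pipelines', 'myorg/backend');
const badAuthor = decideDispatch(settings, 'myorg/backend', 'myorg/frontend');
const blockedSource = decideDispatch(settings, 'myorg/ci-pipelines', 'myorg/fork-backend');
const disabled = decideDispatch({ enabled: false }, 'myorg/ci-pipelines', 'myorg/backend');
```

Because the axes are independent, a workflow from an allowed authoring repo is still skipped when the event comes from a blocked source repo, and the reverse also holds.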

### Elevated access (source-repo secrets)

By default a global workflow's job runs with credentials scoped to the **workflow** repo — it can clone both repos but cannot read the source repo's scoped secrets. That's the safe default: a random workflow in `myorg/ci-pipelines` does not get read access to secrets in `myorg/backend` just because it runs on a push there.

Adding the authoring repo to the _Elevated access_ list flips that: the job receives the source repo's secret context, so deploy and release flows that need `NPM_TOKEN` / `AWS_ROLE_ARN` / etc. from the source repo can read them. Treat elevated repos as effective owners of every source repo's CI secrets — only add repos you fully trust.

## When does it fire?

Same-repo globals (a workflow in `myorg/app` with `repos: ['myorg/app']`) fire on pushes to `myorg/app`. Cross-repo globals fire on pushes to any source repo whose identifier matches a glob on the authoring workflow's trigger. The orchestrator de-duplicates between the per-repo and cross-repo matching passes, so a single event produces at most one run per (workflow, source-repo, trigger) triple.

Non-push triggers work too — `pr()`, `tag()`, `comment()`, `release()`, `workflowRun()`, etc. all accept `repos:`. `kiciEvent()` / `schedule()` / cron-like triggers have no source repo, so they're always per-org-registered regardless of `repos:`.

## Troubleshooting

| Symptom                                                      | Likely cause                                                                                       | Where to look                                                                                                      |
| ------------------------------------------------------------ | -------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------ |
| Global workflow registered but never runs                    | Master toggle OFF, or allow-list blocks the authoring repo, or deny-list blocks the source repo    | Orchestrator log: `Skipping global workflow dispatch` / `Skipping global workflow registration: not permitted`     |
| `repos:` has no effect — workflow only fires on its own repo | Master toggle OFF. Without opt-in, the orchestrator treats the workflow as per-repo-only.          | Dashboard → Settings → Global workflows (top toggle)                                                               |
| Source repo secrets unavailable in a global job              | Expected default — elevate the authoring repo to grant access.                                     | Dashboard → Settings → Global workflows → _Elevated access_                                                        |
| Dashboard shows workflow twice after registering             | Both a generic webhook source and a provider source (github, generic) re-registered the same repo. | Check `workflow_registrations` via `kici-admin workflow list` and confirm the right routing key owns the workflow. |

## See also

- [Architecture — global workflows](../architecture/global-workflows.md) — dual-query dispatch flow, cross-provider auth, security model, lock-file schema.
- [Universal-git provider](providers/universal-git.md#global-workflows) — how global workflows interact with `generic:<orgId>:<sourceId>` routing keys.
- [SDK reference](sdk-reference.md) — the full set of triggers that accept `repos:`.

---

## Private npm registries

Source: https://docs.kici.dev/user/private-registries/

A workflow's `.kici/package.json` may depend on packages published to a private registry — your org's internal CodeArtifact, a GitHub Packages scope, a self-hosted Verdaccio, JFrog, Cloudsmith, GitLab, etc. KiCI ships two ways to authenticate `npm install` against those registries from inside a job, plus an escape hatch for short-lived tokens.

## Choose a path

| Path                                                    | When to pick it                                                                                                                                                                                               |
| ------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| **Option A — `registries:` block in the workflow**      | The token is a long-lived secret you rotate manually (GH Packages PAT, CodeArtifact IAM access key, Verdaccio service token). KiCI manages the `.npmrc` for you.                                              |
| **Option C — Committed `.kici/.npmrc` + `installEnv:`** | You already have an `.npmrc` you want to keep verbatim (e.g. it carries an `audit=false` line, a custom CA, or a complex multi-scope mapping). KiCI just supplies the env vars your `${VAR}` references need. |
| **Setup-step pattern (short-lived tokens)**             | The token is minted at workflow time (CodeArtifact authorization token, GCP Artifact Registry token). A `setup` job runs the cloud CLI, writes a fresh `.kici/.npmrc`, and the install jobs read it.          |

The two channels (Option A and Option C) compose. If you declare both, the agent's auto-generated lines come **after** your committed `.npmrc`, so npm's last-wins semantics let agent-managed registries override committed ones — never the other way around.
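
To see why ordering matters, here is a deliberately simplified `key=value` parser in which a later assignment replaces an earlier one. It is an illustration of last-wins behaviour, not npm's actual config parser, and the agent-generated line shown is a hypothetical example.

```typescript
// Parse `key=value` lines; a later assignment to the same key replaces an earlier one.
function parseNpmrc(text: string): Map<string, string> {
  const config = new Map<string, string>();
  for (const raw of text.split('\n')) {
    const line = raw.trim();
    if (line === '' || line.startsWith('#') || line.startsWith(';')) continue;
    const eq = line.indexOf('=');
    if (eq === -1) continue;
    config.set(line.slice(0, eq).trim(), line.slice(eq + 1).trim());
  }
  return config;
}

const committedNpmrc = '@my-org:registry=https://npm.example.com/\naudit=false\n';
// Hypothetical agent-generated line, appended after the committed content.
const agentNpmrc = '@my-org:registry=https://npm.pkg.github.com/\n';
const effectiveConfig = parseNpmrc(committedNpmrc + agentNpmrc);
```

In this sketch the `@my-org` scope resolves to the agent-managed registry, while unrelated committed settings such as `audit=false` survive untouched.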

## Option A — `registries:` block

Declare the registry in your workflow file and point its `tokenSecret` at a scoped secret using the qualified `<environment>:<secret-name>` syntax. The orchestrator resolves the token at dispatch time and the agent applies it for one `npm install` only.

```typescript
import { workflow, job, step, push } from '@kici-dev/sdk';

export default workflow('build', {
  on: [push({ branches: ['main'] })],
  registries: [
    {
      url: 'https://npm.pkg.github.com/',
      scope: '@my-org',
      tokenSecret: 'production:GITHUB_PACKAGES_TOKEN',
    },
  ],
  jobs: [
    job('build', {
      runsOn: 'default',
      environment: 'production',
      steps: [
        step('install-and-build', async (ctx) => {
          // .kici/package.json can now reference @my-org/* packages
          await ctx.$`npm run build`;
        }),
      ],
    }),
  ],
});
```

Per-field rules:

- **`url`** — Must be HTTPS. HTTP is permitted only for `localhost` / `127.0.0.0/8` / `::1` / `*.local` hosts, or when an operator has flipped the org-level `allow_http_npm_registries` toggle (see [`kici-admin org-settings allow-http-npm`](/operator/kici-admin-cli#allow-http-npm--permit-non-https-private-npm-registries)).
- **`scope`** — Optional. When present, the registry serves only that scope (`@my-org`). When absent, this entry becomes the **default** registry — at most one entry may omit `scope`.
- **`tokenSecret`** — Mandatory `<environment>:<secret-name>`. The orchestrator looks up the secret in the named environment via the per-environment secret resolver. The bare name **must not** contain a colon.
- **`alwaysAuth`** — Defaults to `true`. Forces npm to send the token on every request (even GETs), which is what most managed-registry providers require.
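
The per-field rules above can be checked mechanically before a workflow is committed. The following is a minimal sketch of such a check; `RegistryEntry`, `validateRegistries`, and the `allowHttp` parameter are illustrative names, and the compiler's own validation may differ in detail.

```typescript
interface RegistryEntry {
  url: string;
  scope?: string;
  tokenSecret: string;
  alwaysAuth?: boolean;
}

// True for hosts that the rules above exempt from the HTTPS requirement.
function isLoopbackOrLocal(hostname: string): boolean {
  const host = hostname.replace(/^\[|\]$/g, ''); // URL keeps brackets around IPv6 literals
  return (
    host === 'localhost' ||
    host === '::1' ||
    /^127\.\d{1,3}\.\d{1,3}\.\d{1,3}$/.test(host) ||
    host.endsWith('.local')
  );
}

// Returns a list of human-readable problems; an empty list means the block passes.
function validateRegistries(entries: RegistryEntry[], allowHttp = false): string[] {
  const errors: string[] = [];
  let defaultEntries = 0;
  entries.forEach((entry, i) => {
    let parsed: URL | undefined;
    try {
      parsed = new URL(entry.url);
    } catch {
      errors.push(`registries[${i}]: url is not a valid URL`);
    }
    if (parsed !== undefined && parsed.protocol !== 'https:') {
      const httpOk =
        parsed.protocol === 'http:' && (allowHttp || isLoopbackOrLocal(parsed.hostname));
      if (!httpOk) errors.push(`registries[${i}]: url must be HTTPS`);
    }
    if (entry.scope === undefined) defaultEntries += 1;
    // Exactly one colon: <environment>:<secret-name>, both parts non-empty.
    if (!/^[^:]+:[^:]+$/.test(entry.tokenSecret)) {
      errors.push(`registries[${i}]: tokenSecret must be <environment>:<secret-name>`);
    }
  });
  if (defaultEntries > 1) errors.push('at most one entry may omit scope');
  return errors;
}
```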

### How tokens reach `npm install`

The agent never writes the token bytes to your `.kici/.npmrc`. Each registry token is exposed to the install subprocess as a job-scoped env var (`KICI_NPM_TOKEN_<jobIdShort>_<i>`), and the on-disk auth line carries a `${VAR}` reference that npm substitutes at read time. The `<jobIdShort>` component ties the name to a single job, so it is not reused across jobs and is hard to guess from outside the install subprocess.

After the install completes (success or failure), the agent restores the original `.kici/.npmrc` — your committed file is never permanently modified.
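
As a rough illustration of the mechanism, the sketch below derives the env var name and the on-disk auth line for one registry. The function name `authLineFor` is hypothetical, and the exact line format the agent writes is an assumption based on the standard `.npmrc` auth syntax; only the `KICI_NPM_TOKEN_<jobIdShort>_<i>` naming and the 8-character job id prefix come from this page.

```typescript
// Hypothetical helper: env var name plus .npmrc auth line for registry number `index`.
function authLineFor(
  jobId: string,
  index: number,
  registryUrl: string,
): { envVar: string; line: string } {
  const jobIdShort = jobId.slice(0, 8); // first 8 characters of the dispatched job id
  const envVar = `KICI_NPM_TOKEN_${jobIdShort}_${index}`;
  const { host, pathname } = new URL(registryUrl);
  // The literal text ${VAR} goes to disk; npm expands it when it reads the file.
  const line = `//${host}${pathname}:_authToken=\${${envVar}}`;
  return { envVar, line };
}
```

The token value itself only ever lives in the subprocess environment under `envVar`; the file contains the reference, never the secret.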

## Option C — committed `.kici/.npmrc` + `installEnv:`

If you'd rather hand-craft the `.npmrc`, commit it under `.kici/.npmrc` with `${VAR}` placeholders, then list each variable in the workflow's `installEnv:` block using the same qualified syntax as `tokenSecret`.

```ini
# .kici/.npmrc
@my-org:registry=https://npm.example.com/
//npm.example.com/:_authToken=${MY_NPM_TOKEN}
//npm.example.com/:always-auth=true
audit=false
```

```typescript
import { workflow, job, step, push } from '@kici-dev/sdk';

export default workflow('build', {
  on: [push({ branches: ['main'] })],
  installEnv: ['production:MY_NPM_TOKEN'],
  jobs: [
    job('build', {
      runsOn: 'default',
      environment: 'production',
      steps: [step('build', async (ctx) => ctx.$`npm run build`)],
    }),
  ],
});
```

The orchestrator resolves `MY_NPM_TOKEN` from the `production` environment's secret store and seeds it as `MY_NPM_TOKEN` (bare name) in the install subprocess. Your committed `.npmrc` reads it through `${MY_NPM_TOKEN}`.

This path is the right answer when:

- The `.npmrc` carries non-auth knobs (`audit=false`, `legacy-peer-deps=true`, custom CA bundles).
- You want a single source of truth for registry topology that `npm` tooling outside KiCI can consume too.
- The auth lines reference the **same** env var across multiple registries.
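
With Option C, every `${VAR}` placeholder in the committed `.npmrc` needs a matching `installEnv:` entry. The following sketch lists placeholders that have no matching entry; `placeholders` and `missingInstallEnv` are hypothetical helper names, and the check is an illustration rather than something the KiCI compiler is documented to run.

```typescript
// Collect the distinct ${VAR} placeholder names used in an .npmrc body.
function placeholders(npmrc: string): string[] {
  const names = new Set<string>();
  const pattern = /\$\{([A-Za-z_][A-Za-z0-9_]*)\}/g;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(npmrc)) !== null) {
    const name = match[1];
    if (name !== undefined) names.add(name);
  }
  return Array.from(names);
}

// Placeholders that no `<environment>:<secret-name>` entry in installEnv would supply.
function missingInstallEnv(npmrc: string, installEnv: string[]): string[] {
  // The bare name is everything after the single colon.
  const provided = new Set(installEnv.map((ref) => ref.slice(ref.indexOf(':') + 1)));
  return placeholders(npmrc).filter((name) => !provided.has(name));
}
```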

## Short-lived tokens (CodeArtifact, GCP Artifact Registry)

AWS CodeArtifact authorization tokens expire after 12 hours by default (the maximum lifetime); GCP access tokens, which Artifact Registry accepts, expire after 60 minutes by default. Storing one as a long-lived `tokenSecret` does not work: by the time a build runs, the token may have expired.

The supported pattern is a **setup job** that mints a fresh token, writes `.kici/.npmrc`, and downstream jobs install with it.

```typescript
import { workflow, job, step, push } from '@kici-dev/sdk';

export default workflow('build', {
  on: [push({ branches: ['main'] })],
  jobs: [
    job('mint-codeartifact-token', {
      runsOn: 'default',
      environment: 'production',
      steps: [
        step('mint', async (ctx) => {
          // expose() sets process.env[key] for this step, so the aws CLI below can read both values.
          await ctx.secrets.expose('AWS_ACCESS_KEY_ID');
          await ctx.secrets.expose('AWS_SECRET_ACCESS_KEY');

          const token = (
            await ctx.$`aws codeartifact get-authorization-token --domain my-domain --query authorizationToken --output text`
          ).stdout.trim();

          // Write directly into the workspace's .kici/ — the next job reuses the same workspace.
          const npmrc = [
            '@my-org:registry=https://my-domain-1234567890.d.codeartifact.eu-central-1.amazonaws.com/npm/workflow-deps/',
            `//my-domain-1234567890.d.codeartifact.eu-central-1.amazonaws.com/npm/workflow-deps/:_authToken=${token}`,
            '//my-domain-1234567890.d.codeartifact.eu-central-1.amazonaws.com/npm/workflow-deps/:always-auth=true',
            '',
          ].join('\n');
          await ctx.$`tee .kici/.npmrc`.stdin(npmrc);
        }),
      ],
    }),
    job('build', {
      runsOn: 'default',
      environment: 'production',
      needs: ['mint-codeartifact-token'],
      steps: [step('build', async (ctx) => ctx.$`npm run build`)],
    }),
  ],
});
```

The same pattern works for GCP Artifact Registry — replace the `aws codeartifact` call with `gcloud auth print-access-token`. The manual setup-step shown here is the supported path for these short-lived flows.
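
For the GCP variant, the `.npmrc` body could be assembled along these lines. This is a sketch under assumptions: `artifactRegistryNpmrc` and its option names are hypothetical, the host follows Artifact Registry's `<location>-npm.pkg.dev/<project>/<repository>/` layout, and the values in the usage note are placeholders.

```typescript
// Hypothetical helper: build the .npmrc text for one Artifact Registry npm repository.
function artifactRegistryNpmrc(opts: {
  scope: string;
  location: string;
  project: string;
  repository: string;
  token: string; // e.g. the trimmed stdout of `gcloud auth print-access-token`
}): string {
  const base = `${opts.location}-npm.pkg.dev/${opts.project}/${opts.repository}/`;
  return [
    `${opts.scope}:registry=https://${base}`,
    `//${base}:_authToken=${opts.token}`,
    `//${base}:always-auth=true`,
    '', // trailing newline
  ].join('\n');
}
```

Inside the setup step, the returned string would take the place of the `npmrc` value that the CodeArtifact example writes to `.kici/.npmrc`.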

## Provider-specific examples

### GitHub Packages

```typescript
registries: [
  {
    url: 'https://npm.pkg.github.com/',
    scope: '@my-org',
    tokenSecret: 'production:GITHUB_PACKAGES_TOKEN',
  },
],
```

Create a personal access token (classic) with the `read:packages` scope and store it as a scoped secret in the `production` environment.

### GitLab Packages

```typescript
registries: [
  {
    url: 'https://gitlab.example.com/api/v4/projects/123/packages/npm/',
    scope: '@my-group',
    tokenSecret: 'production:GITLAB_DEPLOY_TOKEN',
  },
],
```

Use a project- or group-level deploy token with `read_package_registry` scope.

### Verdaccio (self-hosted)

```typescript
registries: [
  {
    url: 'https://npm.internal.example.com/',
    tokenSecret: 'production:VERDACCIO_TOKEN',
  },
],
```

For local development against a Verdaccio container, point at `http://localhost:4873/` — the loopback exemption means the operator does NOT need to flip `allow_http_npm_registries`.

### JFrog Artifactory

```typescript
registries: [
  {
    url: 'https://artifactory.example.com/artifactory/api/npm/npm-virtual/',
    scope: '@my-org',
    tokenSecret: 'production:JFROG_API_KEY',
  },
],
```

### Cloudsmith

```typescript
registries: [
  {
    url: 'https://npm.cloudsmith.io/my-org/my-repo/',
    scope: '@my-org',
    tokenSecret: 'production:CLOUDSMITH_TOKEN',
  },
],
```

## Security model

- **Per-environment scoping.** Every `tokenSecret` and `installEnv` entry is qualified with an environment name. The orchestrator runs the same protection-rule pipeline (branch / trust / concurrency / reviewer / wait-timer) against each named environment **before** resolving any secret, so a workflow that wants a `production` token from a feature branch is rejected exactly like a job that tries to deploy to `production` from a feature branch.
- **Untrusted contributors get no tokens.** When a fork PR is dispatched and the contributor-trust resolution returns anything other than `trusted`, the orchestrator strips both `npmRegistries` and `installEnvSecrets` out of the dispatch. The install runs without auth and fails naturally on the first private dep — fork PRs cannot ever observe a registry token, even if a misconfigured environment lacks an explicit `requiredTrustTier`.
- **Lifecycle scripts disabled.** Whenever a private registry is in scope, the agent runs the install with `--ignore-scripts` (npm or pnpm alike). A malicious `preinstall` / `postinstall` hook in committed `package.json` cannot read the synthesized token env vars, even though they exist in the install subprocess. For a pnpm workspace, the agent builds your in-repo dependency closure as a separate step **after** the install's auth is torn down, so build scripts never see the tokens either.
- **Stderr is redacted.** If the install fails, the agent masks every token literal out of the surfaced stderr / stdout chunks before logging.
- **Job-scoped env-var names.** The synthesized auth env var is `KICI_NPM_TOKEN_<jobIdShort>_<i>` where `jobIdShort` is the first 8 chars of the dispatched job id. The name is unguessable from outside the install subprocess and not reused across jobs.
- **`.npmrc` restored.** Whatever the agent appended for one install is stripped (or the file unlinked) on cleanup, so the workspace is never permanently modified.
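
Two of the guarantees above, stripping for untrusted contributors and output redaction, are easy to picture in code. The sketch below is illustrative only: the `InstallDispatch` shape, the tier names other than `trusted`, and both function names are assumptions, not the orchestrator's or agent's real types.

```typescript
type TrustTier = 'trusted' | 'known' | 'unknown';

// Hypothetical dispatch shape carrying the two install-secret channels.
interface InstallDispatch {
  jobId: string;
  npmRegistries?: { url: string; token: string }[];
  installEnvSecrets?: Record<string, string>;
}

// Anything other than `trusted` loses both channels before the job is dispatched.
function stripForTrust(dispatch: InstallDispatch, tier: TrustTier): InstallDispatch {
  if (tier === 'trusted') return dispatch;
  const stripped: InstallDispatch = { ...dispatch };
  delete stripped.npmRegistries;
  delete stripped.installEnvSecrets;
  return stripped;
}

// Replace every occurrence of each token literal before the output is logged.
function redactTokens(output: string, tokens: string[]): string {
  let result = output;
  for (const token of tokens) {
    if (token.length === 0) continue; // an empty token would match between every character
    result = result.split(token).join('***');
  }
  return result;
}
```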

## Limitations

- **`registries:` is workflow-level only in v1.** Per-job overrides aren't supported: each workspace has a single shared `.kici/` directory, so a per-job `registries:` block would have no separate install to apply to.
- **A non-pass protection-rule outcome rejects the whole workflow dispatch.** Today, if the named environment requires reviewer approval (`hold` action) or hits concurrency (`queue` / `wait`) for the install gate, the entire workflow dispatch is rejected with a clear reason. Workflow-scoped held-runs (which would let the install wait for an approver instead of failing outright) are tracked as a follow-up — until then, choose an environment whose protection rules `pass` for the branches that need to install private deps.
- **Container registries (Docker Hub, ECR, GHCR) are out of scope.** This feature covers **npm** registry auth only. Container image pulls travel through the executor backend's own credential paths.

## Observability

The orchestrator exposes Prometheus counters and a histogram under the `kici_orch_install_secrets_*` prefix on its `/metrics` endpoint. They populate the **Install secrets resolution** Grafana dashboard and let operators graph install-secrets activity without digging through Loki.

| Metric                                                        | Type      | Labels                         | What it tells you                                                                                                                                                                              |
| ------------------------------------------------------------- | --------- | ------------------------------ | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `kici_orch_install_secrets_decisions_total`                   | Counter   | `decision`, `reason`           | Pass / reject volume. Reject reasons enumerate the failure mode: `malformed_ref`, `invalid_url_scheme`, `env_not_found`, `protection_rule_block`, `missing_token`, `missing_install_env`, etc. |
| `kici_orch_install_secrets_npm_registry_used_total`           | Counter   | `channel`, `provider`, `scope` | Per-channel + per-scope usage. `channel=registries` is Option A, `channel=install_env` is Option C. `scope=default` marks a no-scope default registry; `scope=-` marks Option C entries.       |
| `kici_orch_install_secrets_contributor_stripped_total`        | Counter   | `trust_tier`                   | Number of dispatches where registry tokens were stripped because the contributor tier wasn't `trusted` (fork PRs from unknown / known contributors). Expected to be 0 in single-tenant orgs.   |
| `kici_orch_install_secrets_token_resolution_duration_seconds` | Histogram | `environment`                  | Latency of per-environment secret resolution. Pathological tails (>500ms) usually mean a Vault timeout or a slow Postgres replica.                                                             |

The dashboard JSON lives at `infra/terraform/modules/grafana/dashboards/install-secrets.json`; if you maintain your own monitoring stack, you can import it directly.

## See also

- [Secrets](secrets.md) — how to seed the `<environment>:<secret-name>` values referenced by `tokenSecret` / `installEnv`.
- [Environments](environments.md) — protection rules (`branch_restrictions`, `requires_review`, `minimum_trust`) that the install gate inherits.
- [Operator: `kici-admin org-settings`](/operator/kici-admin-cli#org-settings----org-level-security-policy) — the `allow_http_npm_registries` toggle and other org-scoped knobs.

---

## Secrets

Source: https://docs.kici.dev/user/secrets/

KiCI provides an explicit secrets API that gives workflow steps controlled access to secrets stored in the orchestrator's secret store. Secrets are never auto-injected into `process.env` -- you must explicitly request each secret by name.

## Overview

Secrets are managed per-environment in the orchestrator (see [operator docs](/operator/orchestrator/configuration) for setup). When a job runs with an `environment` binding, the agent receives the secret keys available for that environment but does **not** inject their values into the step's process environment. Instead, steps access secrets through the `ctx.secrets` API.

This design prevents accidental secret leakage through child processes, log output, or error messages. Only secrets you explicitly request are loaded into memory.

## Where secret values come from

Secret values are written either through the dashboard or through `kici-admin` running against the orchestrator. The orchestrator operator decides — per organization — which surface accepts secret writes. From the workflow author's perspective, the resolution path at run time is identical either way; the difference is where you (or your ops team) **enter** the value.

### Default — dashboard or CLI

A fresh orchestrator starts in **permissive** mode: both surfaces are available.

- **Dashboard:** Settings → Secrets → pick a scope → enter the secret name and value.
- **CLI:** `kici-admin secret set --scope <scope> <KEY>` against the orchestrator's HTTP admin API.

Use whichever fits the workflow — most small teams stay on the dashboard; ops engineers and CI scripts use the CLI.

### When the operator has disabled dashboard writes

The orchestrator operator can switch `secrets.set` (and `variables.set`) to **CLI-only** as part of the [dashboard-write policy](/operator/security/dashboard-write-policy). When CLI-only mode is enabled:

- The dashboard's "Add secret" / "Edit value" controls render with a lock icon. Clicking them shows a tooltip with the exact `kici-admin secret set` invocation needed.
- The dashboard's secrets page still lists secret **names**, scopes, and bindings — only the value-entry path moves to the CLI.
- `kici-admin secret set` becomes the single entry point for new and updated secret values.

This configuration is common for SOC2-prep and regulated workloads, where the customer requirement is "the SaaS control plane process never receives plaintext customer secret values." The dashboard remains usable for everything else (read paths, name CRUD, environment bindings).

### CLI input modes

`kici-admin secret set` accepts five input modes — pick the one that fits your workflow:

```bash
# Interactive prompt (default when stdin is a TTY). No echo, no shell history.
kici-admin secret set --scope production DB_PASSWORD --prompt

# Pipe from another tool (default when stdin is not a TTY).
pass show prod/db | kici-admin secret set --scope production DB_PASSWORD --from-stdin

# Read from a file (handy after `sops -d` to a tmpfile).
kici-admin secret set --scope production DB_PASSWORD --from-file ./db.pass

# Read from a named environment variable (CI-friendly).
KICI_SECRET_VALUE=$(my-secrets-fetcher prod db) \
  kici-admin secret set --scope production DB_PASSWORD --from-env KICI_SECRET_VALUE

# Direct argv — discouraged. Prints a stderr warning ("visible in shell history").
kici-admin secret set --scope production DB_PASSWORD --value "<plaintext>"
```

Two cross-cutting flags help every mode:

- `--confirm-fingerprint <hex>` — pre-compute SHA-256 of the value and pass it. The CLI rejects the call if the value's fingerprint doesn't match. Catches paste corruption.
- `--dry-run` — parse and validate the value, print `[dry-run] would set <key> in scope <scope> sha256=<hex>`, exit without writing.
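
A fingerprint for `--confirm-fingerprint` can be computed with Node's built-in `crypto` module. This sketch assumes the CLI hashes the exact UTF-8 bytes of the value; whether a trailing newline is included is not stated here, so compare against the `sha256=<hex>` printed by `--dry-run` before relying on it.

```typescript
import { createHash } from 'node:crypto';

// Hex-encoded SHA-256 of the value's UTF-8 bytes.
function sha256Hex(value: string): string {
  return createHash('sha256').update(value, 'utf8').digest('hex');
}
```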

`kici-admin variable set` uses the same flags for non-encrypted variables, plus `--locked` to mark a variable as immutable from subsequent dashboard writes.

A full reference of input modes — including the default-mode resolution rules and the security trade-offs of each — lives in [Dashboard-write policy → CLI input modes](/operator/security/dashboard-write-policy#cli-input-modes-for-the-plaintext-path).

## Accessing secrets

Use `ctx.secrets.get(key)` to retrieve a secret value. The method is async to support process-level step isolation in future versions.

```typescript
import { workflow, job, step, push } from '@kici-dev/sdk';

export default workflow('deploy', {
  on: [push({ branches: ['main'] })],
  jobs: [
    job('deploy', {
      runsOn: 'default',
      environment: 'production',
      steps: [
        step('deploy', async (ctx) => {
          const token = await ctx.secrets.get('DEPLOY_TOKEN');
          await ctx.$`deploy --token ${token}`;
        }),
      ],
    }),
  ],
});
```

If the secret does not exist, `get()` throws a `SecretNotFoundError` with a descriptive message.

## Exposing secrets to shell commands

When you need a secret available as an environment variable for shell commands (e.g., tools that read `$API_KEY` from the environment), use `ctx.secrets.expose(key)`:

```typescript
step('run-tool', async (ctx) => {
  // Injects MY_API_KEY into process.env for this step only
  await ctx.secrets.expose('MY_API_KEY');

  // Now child processes can read it from the environment
  await ctx.$`some-tool --use-env-auth`;
});
```

`expose()` sets `process.env[key]` to the secret value. The change is confined to the step that called it: child processes started by that step can read the variable, and it does not leak to other steps or jobs.

## Checking secret existence

Use `ctx.secrets.has(key)` to check whether a secret is available without retrieving its value:

```typescript
step('conditional-notify', async (ctx) => {
  if (ctx.secrets.has('SLACK_WEBHOOK')) {
    const webhook = await ctx.secrets.get('SLACK_WEBHOOK');
    await ctx.$`curl -X POST ${webhook} -d '{"text": "Deploy complete"}'`;
  } else {
    console.log('Slack webhook not configured, skipping notification');
  }
});
```

`has()` is synchronous and does not load the secret value.

## Mounting secrets as files

Some tools refuse to read credentials from environment variables and require a file path on disk (for example, `sops` reads `SOPS_AGE_KEY_FILE`, `kubectl` reads `KUBECONFIG`, and `gcloud` reads `GOOGLE_APPLICATION_CREDENTIALS`). The secrets API materialises one or more existing string secrets to a tmpfile for the lifetime of the step.

### list()

`ctx.secrets.list()` returns every secret key available to the step, sorted alphabetically. It is synchronous, never throws, and returns names only — call `getMeta(key)` to inspect the backend and scope for a specific key.

```typescript
step('discover-keys', async (ctx) => {
  // Pick up every age key the operator has provisioned.
  const ageKeys = ctx.secrets.list().filter((k) => k.startsWith('AGE_KEY_'));
  ctx.log.info(`Found ${ageKeys.length} age keys`);
});
```

### mountFile(opts)

`ctx.secrets.mountFile(opts)` writes the concatenation of one or more existing secrets to a tmpfile inside a per-step tmpdir and returns an object whose `path` property is the file's absolute path. The file is removed automatically when the step completes (success, failure, or timeout).

Options:

- `sources: string[]` — secret keys to concatenate (in order). Required.
- `divider?: string` — separator written between concatenated values. Default: no divider.
- `mode?: number` — permission bits to chmod the file to. Default: `0o600` (owner read/write only).
- `name?: string` — filename inside the per-step tmpdir. Default: auto-generated.

If any source key is missing, `mountFile` rejects with `SecretNotFoundError` listing every missing key.

```typescript
step('decrypt', async (ctx) => {
  const ageKeys = ctx.secrets.list().filter((k) => k.startsWith('AGE_KEY_'));
  const keyFile = await ctx.secrets.mountFile({
    sources: ageKeys,
    divider: '\n',
  });
  await ctx.$`sops --age-key-file ${keyFile.path} -d secrets.enc.yaml`;
});
```
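
To make the options concrete, here is a rough stand-alone approximation of what `mountFile` is described as doing, written against Node's `fs` module. It is a sketch, not the SDK implementation: `mountFileSketch`, the in-memory `store`, the plain `Error`, and the default filename are all stand-ins.

```typescript
import { chmodSync, mkdtempSync, readFileSync, rmSync, writeFileSync } from 'node:fs';
import { tmpdir } from 'node:os';
import { join } from 'node:path';

interface SecretFileOptions {
  sources: string[];
  divider?: string;
  mode?: number;
  name?: string;
}

// `store` plays the role of the step's secret store.
function mountFileSketch(
  store: Map<string, string>,
  opts: SecretFileOptions,
): { path: string; dir: string } {
  const missing = opts.sources.filter((key) => !store.has(key));
  if (missing.length > 0) {
    // The documented API rejects with SecretNotFoundError; a plain Error stands in here.
    throw new Error(`secrets not found: ${missing.join(', ')}`);
  }
  const body = opts.sources.map((key) => store.get(key) ?? '').join(opts.divider ?? '');
  const dir = mkdtempSync(join(tmpdir(), 'kici-sketch-'));
  const path = join(dir, opts.name ?? 'secret-0');
  const mode = opts.mode ?? 0o600;
  writeFileSync(path, body, { mode });
  chmodSync(path, mode); // the creation mode is filtered by umask, so set it explicitly
  return { path, dir };
}

// Usage: mount two keys, read the file back, then clean up by hand.
const sketchStore = new Map([
  ['AGE_KEY_1', 'key-one'],
  ['AGE_KEY_2', 'key-two'],
]);
const mounted = mountFileSketch(sketchStore, {
  sources: ['AGE_KEY_1', 'AGE_KEY_2'],
  divider: '\n',
});
const mountedContents = readFileSync(mounted.path, 'utf8');
rmSync(mounted.dir, { recursive: true, force: true });
```

In a real step the runtime owns the tmpdir and removes it for you; the manual `rmSync` exists only because this sketch runs outside a step.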

### exposeFile(envVar, opts)

`ctx.secrets.exposeFile(envVar, opts)` is `mountFile` plus `process.env[envVar] = path`. The env var is unset and the file is removed when the step completes. You choose every env var name; there is no implicit `KICI_SECRET_FILE_*` naming.

```typescript
step('deploy', async (ctx) => {
  await ctx.secrets.exposeFile('SOPS_AGE_KEY_FILE', {
    sources: ctx.secrets.list().filter((k) => k.startsWith('AGE_KEY_')),
    divider: '\n',
  });

  // sops reads SOPS_AGE_KEY_FILE from the environment.
  await ctx.$`sops -d secret.enc.yaml`;
});
```

### Lifecycle and cleanup

- **Lazy allocation:** no tmpdir is created until the first `mountFile` / `exposeFile` call. Steps that never mount pay nothing.
- **Per-step tmpdir:** allocated under the OS temp directory and bound to a single step. Two mounts in the same step share the same tmpdir; the runtime auto-suffixes filenames when no `name` is supplied.
- **Automatic cleanup:** when the step returns (success), throws (failure), or times out, the runtime removes the tmpdir and unsets any env var set via `exposeFile`. There is nothing to clean up by hand.
- **Sandbox container:** when the agent runs the step inside a container or microVM, the tmpdir lives on the sandbox's `/tmp` (a fresh tmpfs in the production sandbox profile). The file is gone when the sandbox is torn down.
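
The lifecycle rules above amount to lazy allocation plus a `finally` block. The sketch below shows that shape with a synchronous step body for brevity (real steps are async); `StepTmpdir` and `runStep` are hypothetical names, not runtime internals.

```typescript
import { existsSync, mkdtempSync, rmSync } from 'node:fs';
import { tmpdir } from 'node:os';
import { join } from 'node:path';

class StepTmpdir {
  private dir: string | undefined;
  private exposed: string[] = [];

  // Lazy allocation: the directory is created on first use only.
  path(): string {
    if (this.dir === undefined) this.dir = mkdtempSync(join(tmpdir(), 'kici-step-'));
    return this.dir;
  }

  // Record every env var we set so cleanup can unset it again.
  expose(envVar: string, value: string): void {
    process.env[envVar] = value;
    this.exposed.push(envVar);
  }

  cleanup(): void {
    for (const name of this.exposed) delete process.env[name];
    this.exposed = [];
    if (this.dir !== undefined) {
      rmSync(this.dir, { recursive: true, force: true });
      this.dir = undefined;
    }
  }
}

// Run a step body; cleanup happens whether the body returns or throws.
function runStep<T>(body: (tmp: StepTmpdir) => T): T {
  const tmp = new StepTmpdir();
  try {
    return body(tmp);
  } finally {
    tmp.cleanup();
  }
}
```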

### Log masking

Mounted file contents are registered with the log masker, so when a subprocess echoes the credential (e.g. a tool that prints its loaded credential on `--debug`), the streamed log shows `***` instead of the raw value. This covers the case where `mountFile` joins two source secrets into a new byte sequence that neither original value would mask on its own.

### Canonical sops example

```typescript
import { workflow, job, step, push } from '@kici-dev/sdk';

export default workflow('deploy', {
  on: [push({ branches: ['main'] })],
  jobs: [
    job('decrypt-and-deploy', {
      runsOn: 'default',
      environment: 'production',
      steps: [
        step('decrypt', async (ctx) => {
          const ageKeys = ctx.secrets.list().filter((k) => k.startsWith('AGE_KEY_'));
          await ctx.secrets.exposeFile('SOPS_AGE_KEY_FILE', {
            sources: ageKeys,
            divider: '\n',
          });
          await ctx.$`sops -d secret.enc.yaml > config.yaml`;
          // No cleanup -- the tmpdir + the SOPS_AGE_KEY_FILE env var
          // are removed automatically when this step returns.
        }),
      ],
    }),
  ],
});
```

## API reference

| Method       | Signature                                                                        | Description                                                                        |
| ------------ | -------------------------------------------------------------------------------- | ---------------------------------------------------------------------------------- |
| `get`        | `get(key: string): Promise<string>`                                              | Retrieve a secret value. Throws `SecretNotFoundError` if not found.                |
| `expose`     | `expose(key: string): Promise<void>`                                             | Set `process.env[key]` to the secret value for child process access.               |
| `has`        | `has(key: string): boolean`                                                      | Check if a secret key is available (synchronous).                                  |
| `getMeta`    | `getMeta(key: string): SecretMeta \| undefined`                                  | Get metadata (backend name, scope) for a secret. Returns `undefined` if not found. |
| `list`       | `list(): string[]`                                                               | Sorted array of every secret key available to the step. Synchronous, never throws. |
| `mountFile`  | `mountFile(opts: SecretFileOptions): Promise<{ path: string }>`                  | Materialise one or more secrets as a tmpfile. Auto-cleanup at step end.            |
| `exposeFile` | `exposeFile(envVar: string, opts: SecretFileOptions): Promise<{ path: string }>` | `mountFile` plus `process.env[envVar] = path`. Env var unset at step end.          |

## Migration from property access

If you are upgrading from a previous version that used property access (`ctx.secrets.KEY`), update your workflow code:

```typescript
// Before (old API)
const token = ctx.secrets.DEPLOY_TOKEN;

// After (new API)
const token = await ctx.secrets.get('DEPLOY_TOKEN');
```

For conditional access:

```typescript
// Before (old API)
if (ctx.secrets.DEPLOY_TOKEN) { ... }

// After (new API)
if (ctx.secrets.has('DEPLOY_TOKEN')) { ... }
```

Note that `get()` is async -- you must `await` the result.

## Typed secrets

When you run `kici types`, the compiler generates a `.kici/secrets.d.ts` file that provides type-safe autocompletion for your secret keys. The generated types augment the `StepSecrets` interface so that `ctx.secrets.get('...')` and `ctx.secrets.has('...')` offer suggestions for known keys.

See [CLI reference](/user/cli) for the `kici types` command.

---

# Providers

## GitHub App provider

Source: https://docs.kici.dev/user/providers/github/

The **GitHub App** is KiCI's flagship source. A single App:

1. receives `push`, `pull_request`, and related events from every repo it's installed on,
2. clones repos with a short-lived installation token (no deploy key to manage),
3. posts workflow / job / step Check runs back to the pull request (see
   [GitHub checks architecture](../../architecture/webhooks/github-checks.md)).

You don't need an App for every scenario — if you only care about `push`
events, don't want to install an App, or are using a non-GitHub forge,
use the [universal-git provider](universal-git.md) instead.

## GitHub App vs. `github-repo` preset

Both paths reach the same trigger pipeline; they differ in what the
forge side looks like:

| Capability                               | GitHub App (this guide)                | `github-repo` preset on universal-git       |
| ---------------------------------------- | -------------------------------------- | ------------------------------------------- |
| Webhook source                           | App-level webhook (one per App)        | Per-repo webhook (one per repo)             |
| Clone auth                               | Installation token (auto, short-lived) | PAT or SSH deploy key (you manage rotation) |
| Check runs on pull requests              | Yes — full KiCI Checks UI              | No (status post only via custom step)       |
| Cross-repo install in seconds            | Yes (install the App on more repos)    | No (new webhook per repo)                   |
| Works without a GitHub org admin         | No (App creation is org-scoped)        | Yes (per-repo webhook is repo-admin)        |
| Works on Forgejo / Gitea / Gogs / GitLab | No                                     | Yes (other presets)                         |

Use the App when you can; the `github-repo` preset is a fallback for
repos where you can't install an App.

## Create the GitHub App on GitHub's side

1. **Decide the App scope.** User-owned Apps can only be installed on
   repos you own; organization-owned Apps can be installed anywhere in
   the org. For production, create the App under the org.

2. **Create the App.** Go to _Settings -> Developer settings -> GitHub
   Apps -> New GitHub App_ (org-level is _Settings -> Developer
   settings -> GitHub Apps_ on the org page).

3. **Set the webhook URL.** KiCI exposes one webhook endpoint per org:

   ```
   https://<platform-host>/webhook/<orgId>/github
   ```

   GitHub App webhooks are always delivered to this Platform endpoint and
   relayed to your orchestrator over its outbound connection — platform and
   hybrid orchestrators both receive GitHub events this way. Independent-mode
   orchestrators have no Platform connection and therefore no GitHub-App
   ingress; use a generic webhook source instead. The `<orgId>` segment is
   the KiCI organization ID the source belongs to; the `<appId>` is
   discovered from `X-GitHub-Hook-Installation-Target-ID` at request time and
   is _not_ part of the URL.

4. **Set the webhook secret.** Generate a random hex string (e.g.
   `openssl rand -hex 32`) and save it for the `--webhook-secret` flag in
   the orchestrator registration below. GitHub uses this secret to
   HMAC-sign every webhook; KiCI rejects mismatches.

5. **Pick permissions.** Minimum required:

   | Scope                       | Access       | Why                                                                                                                                                       |
   | --------------------------- | ------------ | --------------------------------------------------------------------------------------------------------------------------------------------------------- |
   | Repository -> Contents      | Read         | Clone the repo to read the lock file                                                                                                                      |
   | Repository -> Metadata      | Read (auto)  | Default for every App; also lets KiCI look up a pull-request author's repository access level for CI trust                                                |
   | Repository -> Pull requests | Read         | Match `pull_request` triggers                                                                                                                             |
   | Repository -> Checks        | Read & write | Post KiCI's enriched Check runs                                                                                                                           |
   | Organization -> Members     | Read         | (optional, org installs) Receive `organization` / `membership` / `team` events so KiCI's CI-trust permission cache invalidates promptly on access changes |

   The first four rows cover the core flow (clone, trigger matching,
   Check runs). The **Organization -> Members** row is only relevant if
   you use [CI trust tiers](../../architecture/security/ci-security.md)
   on an org-level install — see the event note below.

6. **Subscribe to events.** At minimum: `push`, `pull_request`,
   `check_run`, `check_suite`. Add others (`issues`, `release`, ...) if
   your workflows use those triggers.

   **For CI trust (optional but recommended on org installs):** also
   subscribe to `member`, `organization`, `membership`, and `team`.
   KiCI caches each pull-request author's repository access level (used
   to decide whether workflow changes take effect immediately or are
   held for approval — see
   [CI security](../../architecture/security/ci-security.md)). These
   events let the orchestrator drop stale cache entries the moment a
   contributor's access changes. They are not required for correctness:
   without them the cache simply ages out on its own 15-minute TTL, so a
   permission change can take up to 15 minutes to take effect. The
   `organization` / `membership` / `team` events require the
   **Organization -> Members** read permission and an org-level
   installation; `member` is a repository event covered by the default
   Metadata permission.

7. **Generate a private key.** Scroll to the bottom of the App settings
   and click _Generate a private key_. A `.pem` file downloads —
   store it safely; you cannot redownload it.

8. **Copy the App ID.** It's the numeric ID near the top of the App
   settings page. You'll need it for `--app-id` below.

9. **Install the App on target repos.** Under the App's _Install App_
   tab, install it on the repos (or whole org) that should trigger
   KiCI runs. Re-install to add repos later — this is live and
   revocable without redeploying the App.
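
The permission-cache behaviour described in step 6 can be pictured with a small sketch. This is an illustration under stated assumptions, not KiCI's implementation: the class name `PermissionCache`, the key format, and the `invalidateUser` method are hypothetical. Only the 15-minute TTL and the idea of dropping entries when an access-change event arrives come from the text above.

```typescript
// Hypothetical sketch: names and shapes are illustrative, not KiCI internals.
type AccessLevel = "admin" | "write" | "read" | "none";

interface CacheEntry {
  level: AccessLevel;
  expiresAt: number; // epoch milliseconds
}

const TTL_MS = 15 * 60 * 1000; // the 15-minute TTL mentioned in step 6

class PermissionCache {
  private readonly entries = new Map<string, CacheEntry>();

  private key(repo: string, user: string): string {
    return `${repo}#${user}`;
  }

  set(repo: string, user: string, level: AccessLevel, now: number): void {
    this.entries.set(this.key(repo, user), { level, expiresAt: now + TTL_MS });
  }

  // Returns the cached level, or undefined when the entry is missing or expired.
  get(repo: string, user: string, now: number): AccessLevel | undefined {
    const k = this.key(repo, user);
    const entry = this.entries.get(k);
    if (entry === undefined) return undefined;
    if (now >= entry.expiresAt) {
      this.entries.delete(k);
      return undefined;
    }
    return entry.level;
  }

  // Would be called when a member / organization / membership / team event arrives.
  invalidateUser(user: string): void {
    for (const k of Array.from(this.entries.keys())) {
      if (k.endsWith(`#${user}`)) this.entries.delete(k);
    }
  }
}
```

Without the optional events only the `get` expiry path runs, which is why a permission change can take up to the full TTL to be noticed.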

## Register the App with the orchestrator

With the App ID, private key `.pem`, and webhook secret in hand:

```bash
kici-admin --url http://<orchestrator-host>:4000 --token $KICI_BOOTSTRAP_ADMIN_TOKEN \
  source add github \
  --name my-org \
  --app-id 12345 \
  --private-key @/path/to/private-key.pem \
  --webhook-secret <the-webhook-secret-from-step-4>
```

The command prints the routing key (always `github:<appId>`) and the public
webhook URL to paste into the GitHub App's "Webhook URL" field:

```
Source added: github:<appId> (my-org)
Webhook URL:  https://<platform-host>/webhook/<orgId>/github
  ↳ Paste this into your GitHub App's "Webhook URL" field.
```

When the orchestrator runs in independent mode (no Platform connection) the
URL line reads `(unavailable — this orchestrator runs in independent mode)`,
because GitHub-App ingress is Platform-relayed. The private key and webhook
secret are stored encrypted in the orchestrator database under
`KICI_SECRET_KEY`; no restart needed — the orchestrator accepts webhooks from
this App immediately.

**Secret input modes** (for `--private-key` and `--webhook-secret`):

| Mode                 | Syntax                  | Example                                                   |
| -------------------- | ----------------------- | --------------------------------------------------------- |
| Direct value         | `--private-key <value>` | `--webhook-secret mysecret`                               |
| File (`@` prefix)    | `--private-key @<path>` | `--private-key @/path/to/key.pem`                         |
| Environment variable | `--from-env <var>`      | `--from-env GITHUB_PRIVATE_KEY`                           |
| Standard input       | `--stdin`               | `cat key.pem \| kici-admin source add github --stdin ...` |

Use `@file` for private keys — it reads the full PEM including
newlines without quoting pitfalls.
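
To make the `@` prefix concrete, here is a minimal sketch of how a CLI could resolve the "direct value" and "file" rows of the table. The helper name `resolveSecretArg` and its injectable `readFile` parameter are hypothetical and only illustrate the convention; `kici-admin`'s actual parsing may differ.

```typescript
import { readFileSync } from "node:fs";

// Hypothetical helper: mirrors the "direct value" and "@file" rows above.
function resolveSecretArg(
  arg: string,
  readFile: (path: string) => string = (p) => readFileSync(p, "utf8"),
): string {
  if (arg.startsWith("@")) {
    // "@/path/to/key.pem" -> file contents, with newlines preserved as-is.
    return readFile(arg.slice(1));
  }
  // Anything else is taken literally as the secret value.
  return arg;
}
```

Because the file is read as a whole, a multi-line PEM never passes through shell quoting, which is the pitfall the `@file` mode avoids.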

To list and inspect:

```bash
kici-admin source list                             # All configured sources
kici-admin source get-webhook-secret github:12345  # Fetch the secret (for debugging)
```

For the full CLI reference see the `source` section of the
[kici-admin CLI reference](../../operator/orchestrator/kici-admin-cli.md).

## Routing keys

Every GitHub App source has routing key `github:<appId>`. It's the
identifier every other KiCI surface uses to talk about the source:

- `kici-admin source update github:<appId> ...` for rotation / updates
- `kici-admin source remove github:<appId>` to decommission
- `kici-admin org-settings global-workflows ... --customer-id <orgId> [--source github:<appId>]` for policy (org-scoped row, optional per-entry source qualifier)
- The orchestrator's source records and event-log entries key on
  `github:<appId>`; org-level settings key on `customer_id` (one row
  per org)

If you install the same App across multiple KiCI orgs, each org has
its own source record and the orchestrator looks up the right one by
combining the URL's `<orgId>` with the App ID from the
`X-GitHub-Hook-Installation-Target-ID` header.
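
The lookup described above can be sketched as follows. This is a hedged illustration, not the orchestrator's code: `resolveGithubSource`, `SourceLookup`, and the 404 message for a malformed path are hypothetical, and the sketch assumes header names arrive lower-cased (as Node's HTTP server delivers them). The 400 message is the one listed under Troubleshooting.

```typescript
// Hypothetical sketch; function and type names are illustrative only.
interface SourceLookup {
  orgId: string; // taken from the URL path
  routingKey: string; // "github:<appId>", taken from the header
}

type LookupResult =
  | { ok: true; lookup: SourceLookup }
  | { ok: false; status: number; message: string };

function resolveGithubSource(
  urlPath: string,
  headers: Record<string, string | undefined>,
): LookupResult {
  // Expected path shape: /webhook/<orgId>/github
  const match = /^\/webhook\/([^/]+)\/github\/?$/.exec(urlPath);
  const orgId = match === null ? undefined : match[1];
  if (orgId === undefined) {
    return { ok: false, status: 404, message: "Not a GitHub webhook path" };
  }
  const targetType = headers["x-github-hook-installation-target-type"];
  const targetId = headers["x-github-hook-installation-target-id"];
  if (targetType !== "integration" || targetId === undefined || !/^\d+$/.test(targetId)) {
    return { ok: false, status: 400, message: "Missing GitHub App target headers" };
  }
  return { ok: true, lookup: { orgId, routingKey: `github:${targetId}` } };
}
```

The pair `(orgId, routingKey)` is what distinguishes two KiCI orgs that share one App ID.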

## Global workflows

A GitHub App source opts in to org-wide global workflows using the
org-scoped settings row. Pass `--customer-id <orgId>` (alias `--org`)
to select the row; on `*-add` mutators, pass `--source github:<appId>`
when you want a list entry pinned to this specific App rather than
applying to any source in the org:

```bash
# Enable global workflows for the org
kici-admin org-settings global-workflows set-enabled true \
  --customer-id <orgId>

# Allow the listed repo as an author for any source in the org
kici-admin org-settings global-workflows allow-add 'my-org/ci-workflows/*' \
  --customer-id <orgId>

# Allow the listed repo as an author only when the workflow lives on this App
kici-admin org-settings global-workflows allow-add 'my-org/ci-workflows/*' \
  --customer-id <orgId> --source github:12345

# Deny events from untrusted repos delivered on this App
kici-admin org-settings global-workflows deny-add 'my-org/contrib/*' \
  --customer-id <orgId> --source github:12345
```

Global workflows authored in a GitHub App repo can dispatch against
events from universal-git sources in the same org, and vice versa,
with each clone using its own source's credentials. See
[Global workflows](../../architecture/global-workflows.md) for the
policy model and cross-source dispatch contract.

## Check runs

Once registered, the App's Check-runs permission lets KiCI post
enriched Check runs:

- `kici/{workflowName}` — overall pass/fail for the workflow
- `kici/{workflowName}/job/{jobName}` — per-job detail with step progress
- `kici/{workflowName}/setup` — (optional) build / dependency-install check

Step progress, log tails, and source-location annotations are all
driven by the orchestrator's reporting module; no workflow
configuration is required beyond installing the App with the
`checks: write` permission.

For architecture details see
[GitHub checks architecture](../../architecture/webhooks/github-checks.md).
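
If you need the Check run names ahead of time (for example to list them as required checks), the naming scheme above is easy to reproduce. The helper below is a hypothetical convenience, not part of the KiCI SDK; it only applies the three patterns listed in this section.

```typescript
// Hypothetical helper: applies the three naming patterns listed above.
function checkRunNames(
  workflowName: string,
  jobNames: string[],
  includeSetup: boolean,
): string[] {
  const names: string[] = [`kici/${workflowName}`];
  if (includeSetup) {
    // The setup check is optional, so callers opt in explicitly.
    names.push(`kici/${workflowName}/setup`);
  }
  jobNames.forEach((job) => names.push(`kici/${workflowName}/job/${job}`));
  return names;
}
```

For a workflow named `ci` with jobs `lint` and `test`, this yields `kici/ci`, `kici/ci/job/lint`, and `kici/ci/job/test` (plus `kici/ci/setup` when requested).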

## Rotation

### Rotate the webhook secret

1. Generate a new random hex: `openssl rand -hex 32`.
2. Update GitHub: _App settings -> Webhook -> Webhook secret_. GitHub
   will sign new deliveries with this immediately.
3. Update the orchestrator:

   ```bash
   kici-admin source update github:12345 --webhook-secret <new-secret>
   ```

   The orchestrator verifies signatures against every cached secret
   during a dual-secret window, so brief mismatches during rotation
   don't drop deliveries. The HMAC verifier iterates over all stored
   secrets for the routing key.
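
The dual-secret window can be illustrated with a short sketch. It assumes GitHub's `X-Hub-Signature-256: sha256=<hex>` header format and uses Node's `node:crypto`; the function names `sign` and `verifyAgainstAny` are hypothetical and this is not the orchestrator's actual verifier.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Sketch only: computes the value GitHub would send in X-Hub-Signature-256.
function sign(secret: string, body: Buffer): string {
  return "sha256=" + createHmac("sha256", secret).update(body).digest("hex");
}

// Accepts the delivery if ANY stored secret produces the received signature.
function verifyAgainstAny(
  secrets: string[],
  body: Buffer,
  signatureHeader: string,
): boolean {
  const received = Buffer.from(signatureHeader, "utf8");
  let matched = false;
  for (const secret of secrets) {
    const expected = Buffer.from(sign(secret, body), "utf8");
    // timingSafeEqual throws on length mismatch, so compare lengths first.
    if (expected.length === received.length && timingSafeEqual(expected, received)) {
      matched = true;
    }
  }
  return matched;
}
```

While both the old and the new secret are stored, a delivery signed with either one passes, so the order in which you update GitHub and the orchestrator does not matter.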

### Rotate the private key

1. In GitHub's App settings click _Generate a private key_ — this
   does **not** revoke existing keys. Download the new `.pem`.
2. Push it to the orchestrator:

   ```bash
   kici-admin source update github:12345 --private-key @/path/to/new-key.pem
   ```

3. After confirming clones work with the new key, delete the old key
   from GitHub's App settings.

### Decommission

```bash
kici-admin source remove github:12345
```

After removal the routing-key row and its secrets are purged; GitHub
deliveries to the endpoint will be rejected as "Unknown routing key".
Uninstall the App from GitHub's side separately.

## Troubleshooting

**Webhook hits the endpoint but KiCI replies 404 `Unknown
organization`.** The `<orgId>` segment of the webhook URL doesn't
match the org that owns the source. Check the URL registered in
_App settings -> Webhook_ against `kici-admin source list`.

**Webhook hits the endpoint but KiCI replies 401 `Invalid
signature`.** The webhook secret in the App settings doesn't match
the one stored with the source. Rotate it via the steps above.

**Webhook hits the endpoint but KiCI replies 400 `Missing GitHub App
target headers`.** The request isn't actually from a GitHub App
(missing `X-GitHub-Hook-Installation-Target-Type: integration` +
`X-GitHub-Hook-Installation-Target-ID`). If you're test-firing a
webhook, use the App's _Recent Deliveries_ tab on GitHub to re-send a
real one.

**Webhook arrives but no run fires.** The orchestrator accepted the
webhook but no workflow registration matched. Causes (in order of
likelihood): the repo isn't registered with the orchestrator yet
(push a commit that touches `.kici/kici.lock.json` first), the event
type isn't one the workflow's triggers list, or
`global_workflow_denied_repos` filtered out the source repo. Check
`kici-admin event-log list --routing-key github:12345` and the
orchestrator logs for `no registrations for event`.

**Clone fails with 401 / 403.** The installation token minted from
the App private key was refused. Usually means the App was uninstalled
from the repo, or the private key on the orchestrator no longer matches
the one GitHub knows about (rotate it).

**Check runs don't appear on pull requests.** The App is missing the
`checks: write` permission or wasn't installed on the target repo.
Re-request permissions in _App settings -> Permissions & events_
(GitHub will prompt installers to accept the new scope on next visit)
and confirm the App is installed on that repo.

## See also

- [Universal-git provider](universal-git.md) — for Forgejo / Gitea /
  Gogs / GitLab, and for plain-GitHub repos without an App
- [GitHub checks architecture](../../architecture/webhooks/github-checks.md)
- [Global workflows](../../architecture/global-workflows.md)
- [kici-admin CLI reference](../../operator/orchestrator/kici-admin-cli.md)
- [Event routing](../../operator/event-routing.md) — operator-level
  routing-key mechanics

---

## Universal-git provider

Source: https://docs.kici.dev/user/providers/universal-git/

The **universal-git** provider lets KiCI treat any git forge that speaks a
GitHub-shaped webhook payload as a first-class source. That covers Forgejo,
Gitea, Gogs, GitLab, plain GitHub (without the App), and any custom
webhook-driven forge you can describe in JSONPath.

> **Want Check runs on pull requests?** Use the [GitHub App
> provider](github.md) instead — it clones via short-lived installation
> tokens and drives KiCI's enriched Checks UI out of the box. The
> universal-git `github-repo` preset is the right fallback when you
> can't install an App.

The orchestrator:

1. receives the forge's webhook,
2. clones the repo via HTTPS (PAT) or SSH (deploy key) to read the lock
   file at `.kici/kici.lock.json`,
3. dispatches workflows that match the push / pull_request event.

No mirror, no GitHub App, no `checkout: false` escape hatch. The same
trigger matching, global-workflow policy, and agent execution pipeline
that back the GitHub App source also serve universal-git sources.

## Which preset do I need?

KiCI ships canonical presets so you don't have to spell out JSONPath for
every forge:

| Preset        | Forge                                   | Webhook header   |
| ------------- | --------------------------------------- | ---------------- |
| `forgejo`     | Forgejo                                 | `X-Gitea-Event`  |
| `gitea`       | Gitea                                   | `X-Gitea-Event`  |
| `gogs`        | Gogs                                    | `X-Gogs-Event`   |
| `gitlab-repo` | GitLab (per-project webhooks)           | `X-Gitlab-Event` |
| `github-repo` | Plain GitHub (per-repo webhook, no App) | `X-GitHub-Event` |
| `custom`      | Anything else                           | You supply it    |

Pick `custom` only when the forge's payload structure or event header
deviates from GitHub's — you'll then supply `payloadPaths` and
`eventMapping` explicitly.

## Create a source (PAT)

```bash
kici-admin source add generic \
  --org <orgId> \
  --name forgejo-main \
  --verification hmac_sha256 \
  --secret <random-hex> \
  --preset forgejo \
  --git-url-template 'https://forgejo.example.com/{owner}/{name}.git' \
  --credential-ref pat \
  --credential-type pat \
  --credential-user bot-user
```

Then seed the PAT under the source's own secret scope:

```bash
# The scope __source__/<sourceId> is the orchestrator's convention for
# source-level credentials. Use the sourceId printed by `source add`.
kici-admin secret set <orgId> "__source__/<sourceId>" pat --value "<your-forgejo-pat>"
```

Finally, configure the forge to deliver webhooks to:

```
https://<platform-host>/webhook/<orgId>/generic/<source-name>
```

with the same secret you passed to `--secret`.
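
The `--git-url-template` placeholders can be read as a plain substitution of the repository's owner and name. The sketch below shows that reading; `expandGitUrlTemplate` is a hypothetical helper, and splitting on the last `/` (so nested groups end up in `{owner}`) is an assumption rather than documented KiCI behaviour.

```typescript
// Hypothetical helper: expands {owner} and {name} in a git URL template.
function expandGitUrlTemplate(template: string, fullName: string): string {
  // Assumption: everything before the last "/" is the owner (covers nested groups).
  const slash = fullName.lastIndexOf("/");
  if (slash <= 0 || slash === fullName.length - 1) {
    throw new Error(`expected "owner/name", got "${fullName}"`);
  }
  const owner = fullName.slice(0, slash);
  const name = fullName.slice(slash + 1);
  return template.split("{owner}").join(owner).split("{name}").join(name);
}
```

With the template from the example above, `acme/widgets` expands to `https://forgejo.example.com/acme/widgets.git`.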

## SSH deploy key

For SSH instead of HTTPS:

1. **Generate an Ed25519 deploy key.** Ed25519 is the recommended default.

   ```bash
   ssh-keygen -t ed25519 -N '' -C 'kici-forgejo-deploy-key' -f ~/.ssh/forgejo-deploy-key
   ```

   This produces `~/.ssh/forgejo-deploy-key` (private, OpenSSH PEM) and
   `~/.ssh/forgejo-deploy-key.pub` (public).

2. **Register the public key as a deploy key on the forge.** On Forgejo
   / Gitea this is _Repository -> Settings -> Deploy Keys -> Add Key_
   (paste the `.pub` contents). On GitLab it's _Settings -> Repository
   -> Deploy keys_. On plain GitHub it's _Settings -> Deploy keys_.
   Read-only access is enough — KiCI only clones.

3. **Capture the forge's host keys** (needed only for
   `--ssh-host-key-policy pinned`):

   ```bash
   ssh-keyscan -t ed25519,rsa forgejo.example.com > forgejo.known_hosts
   ```

   Inspect the file before trusting it (compare against what the forge
   publishes in its docs) — this is your one chance to pin the key
   out-of-band rather than trust-on-first-use.

4. **Create the source:**

   ```bash
   kici-admin source add generic \
     --org <orgId> \
     --name forgejo-ssh \
     --verification hmac_sha256 \
     --secret <random-hex> \
     --preset forgejo \
     --git-url-template 'ssh://git@forgejo.example.com:22/{owner}/{name}.git' \
     --credential-ref deploy-key \
     --credential-type ssh \
     --ssh-host-key-policy pinned \
     --ssh-known-hosts-pem "@/path/to/forgejo.known_hosts"
   ```

   The `@` prefix on `--ssh-known-hosts-pem` tells the CLI to read the
   file contents.

5. **Store the private key PEM under the source scope:**

   ```bash
   kici-admin secret set <orgId> "__source__/<sourceId>" deploy-key \
     --value "$(cat ~/.ssh/forgejo-deploy-key)"
   ```

   The orchestrator materialises this PEM into a tempfile (mode `0600`)
   at every clone and drives `git` with a purpose-built
   `GIT_SSH_COMMAND` (`IdentitiesOnly=yes`, `BatchMode=yes`, plus the
   host-key flags below). The tempdir is cleaned up as soon as the
   clone finishes.

**Host-key policy:** `accept-new` (default) auto-trusts the forge on
first connection (TOFU) and logs a one-time warning. `pinned` sets
`StrictHostKeyChecking=yes` with `UserKnownHostsFile=<the PEM you
supplied>` and rejects any host key that doesn't match — use this for
production supply-chain hardening. `pinned` requires
`--ssh-known-hosts-pem` (or the equivalent `sshKnownHostsPem` field on
update); the CLI rejects the request otherwise.
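
As a rough picture of how the two policies translate into SSH options, the sketch below assembles a `GIT_SSH_COMMAND` string. It is an assumption-laden illustration: `buildGitSshCommand` is hypothetical, the exact option set KiCI uses beyond those named above is not documented here, and mapping `accept-new` to OpenSSH's `StrictHostKeyChecking=accept-new` is a guess at the mechanism. Paths are assumed to contain no spaces.

```typescript
type HostKeyPolicy = "accept-new" | "pinned";

// Hypothetical sketch: builds an ssh command line for use as GIT_SSH_COMMAND.
function buildGitSshCommand(
  keyPath: string,
  policy: HostKeyPolicy,
  knownHostsPath?: string,
): string {
  const opts: string[] = ["-i", keyPath, "-o", "IdentitiesOnly=yes", "-o", "BatchMode=yes"];
  if (policy === "pinned") {
    if (knownHostsPath === undefined) {
      // Mirrors the rule that `pinned` requires a known-hosts file.
      throw new Error("pinned policy requires a known-hosts file");
    }
    opts.push("-o", "StrictHostKeyChecking=yes", "-o", `UserKnownHostsFile=${knownHostsPath}`);
  } else {
    // Trust on first use: unknown hosts are accepted, changed keys are rejected.
    opts.push("-o", "StrictHostKeyChecking=accept-new");
  }
  return ["ssh", ...opts].join(" ");
}
```

`IdentitiesOnly=yes` keeps SSH from offering any other key it might find, and `BatchMode=yes` turns interactive prompts into immediate failures, which is what an unattended clone needs.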

**Updating an existing source:** use `kici-admin source update-generic
<id>` with the same flags to switch an HTTPS/PAT source to SSH, rotate
the host-key policy, or flip presets. Pass `--clear-git-config` to
revert the source back to a payload-only generic webhook.

## Credential rotation

To rotate a PAT or SSH key, overwrite the value under the same scope +
key and the next clone picks it up:

```bash
kici-admin secret set <orgId> "__source__/<sourceId>" pat --value "<new-pat>"
```

The orchestrator re-reads the secret at each clone. No source update
needed.

## Global workflows

Universal-git sources participate in the org-wide global-workflow model
exactly like GitHub App sources — a global workflow authored in one
source can dispatch against pushes from a different source in the same
org (including across forges), with each clone using its own source's
credentials.

Enable and tune the policy via the org-settings CLI. Settings are
org-scoped (one row per `customer_id`); each list entry can optionally
pin to a specific source via `--source <routingKey>`:

```bash
# Enable global workflows for the org
kici-admin org-settings global-workflows set-enabled true \
  --customer-id <orgId>

# Allow authors from any source in the org
kici-admin org-settings global-workflows allow-add \
  'forgejo.example.com/ci-workflows/*' \
  --customer-id <orgId>

# Allow authors only when the workflow lives on a specific source
kici-admin org-settings global-workflows allow-add \
  'forgejo.example.com/ci-workflows/*' \
  --customer-id <orgId> \
  --source "generic:<orgId>:<sourceId>"

# Forbid events from a specific source from firing any global workflow
kici-admin org-settings global-workflows deny-add \
  'forgejo.example.com/untrusted/*' \
  --customer-id <orgId> \
  --source "generic:<orgId>:<sourceId>"
```

See [Global workflows](../../architecture/global-workflows.md) for the
policy model (`isWorkflowRepoAllowed` + `isSourceRepoAllowed` +
`isElevatedAccessAllowed`) and the cross-provider dispatch contract.

## Routing-key collisions

When a user has both a GitHub App source and a universal-git source
targeting the same `owner/repo`, each creates its own registration and
each fires its own run on a matching push. This is intentional: the two
sources are independently authenticated and may resolve different lock
files. If you want deduplication, either:

- constrain one side via `global_workflow_denied_repos`, or
- don't create both sources.

## Troubleshooting

**The webhook hits the orchestrator but no run fires.** Check the
orchestrator log for `Skipping global workflow dispatch` or
`no registrations for event`. Most common cause: the webhook event
header doesn't match the preset's `eventMapping`. For `custom` sources,
make sure the `eventMapping` array includes every value the forge
actually sends (they can vary by event type).

**Clone fails with 401.** The source-scoped secret is missing or
wrong. Verify with:

```bash
kici-admin secret list <orgId> "__source__/<sourceId>"
```

**Clone fails with 403 on the `default branch` fetch.** The PAT lacks
read access to the repo, or the SSH deploy key isn't registered on it.

**SSH clone fails with host-key rejection.** If you set
`sshHostKeyPolicy: pinned`, verify the known-hosts PEM matches the
forge's current key. If you're still using `accept-new`, the
orchestrator's `~/.ssh/known_hosts` has a stale entry; clear it or flip
to `pinned` with the right PEM.

---

# Architecture overview

## Data flows

Source: https://docs.kici.dev/architecture/data-flows/

This document describes the key data flows through the KiCI architecture: webhook delivery, job execution, dependency caching, re-run and cancel, trace ID propagation, internal event routing, and generic webhook ingestion.

> **Lock file schema version:** The lock file uses schema version 15, which adds per-job init config on top of v14's declarative cache specs, v11's `LockInlineValue` for pure function inline evaluation, v10's simplified negative patterns (! prefix in repos/paths arrays), v9's global workflow repos matching, and v8's runsOn polymorphic type support.

## Webhook delivery flow

A webhook event from a provider (e.g., GitHub) travels through three tiers before execution begins.

```
GitHub  -->  Platform Relay  -->  Orchestrator  -->  Agent
        1. Webhook          2. WebSocket        3. Job dispatch
           POST                relay               + execution
```

### Step by step

1. **Provider sends webhook** to the Platform relay endpoint.
2. **Platform routes the webhook** to the right orchestrator over WebSocket and forwards the body bytes verbatim. Platform never sees customer HMAC secrets — signature verification happens entirely on the orchestrator after reassembly.
3. **Orchestrator verifies signature** (HMAC-SHA256 against per-source webhook secret, with dual-secret rotation support).
4. **Orchestrator checks for duplicates** against the dual-layer `DedupCache` (in-memory set + `dedup_cache` DB table).
5. **Orchestrator resolves provider** by looking up the provider bundle from the `ProviderRegistry` using `getByRoutingKey()` (exact match first, falls back to provider type prefix for backward compatibility). Skips processing if the provider is unknown.
6. **Orchestrator normalizes** the webhook via the provider's `WebhookNormalizer` (extracts branch, event type, action, sender).
7. **Orchestrator extracts repo and credentials** from payload (repository identifier from `repository.full_name`, provider credentials such as GitHub installation ID).
8. **Orchestrator handles /kici commands** in `issue_comment` events: intercepts `/kici approve` and `/kici reject` approval commands before trigger matching, delegating to `handleApprovalComment()` for security hold management.
9. **Orchestrator resolves trust** for PR events (determines lock file source: head vs base branch).
10. **Orchestrator fetches lock file** via the provider's `LockFileFetcher` (cached with LRU). For untrusted PR events, fetches both base and head lock files in parallel; for trusted PRs and pushes, fetches from head SHA.
11. **Orchestrator detects workflow modifications** for untrusted PR events by comparing base and head lock files via `detectWorkflowModifications()`, applying security holds when non-trusted contributors modify workflow files.
12. **Orchestrator extracts registrations** on default-branch pushes: persists registerable workflows (event, schedule, lifecycle triggers) for cluster-wide event matching.
13. **Orchestrator notifies the event router** on default-branch pushes: after the registrations are persisted, emits a `registration.updated` event via `eventRouter.emit()` (if event routing is active). Workflow event subscriptions are the persisted registrations themselves, matched at emit time through the registration index.
14. **Orchestrator fetches changed files** via the provider's `ChangedFilesFetcher` for path-based trigger filtering (skipped when no workflow uses path filters).
15. **Orchestrator matches triggers** against lock file using `matchAllWorkflows()` from `@kici-dev/engine`.
16. **Orchestrator checks caches** for source tarballs and dependency tarballs.
17. **Orchestrator dispatches jobs** to agents via the job queue and WebSocket.
18. **Orchestrator persists a delivery row** keyed by `(org_id, delivery_id)` to its own `event_log`, including a pointer to the gzipped payload in object storage. The orchestrator's delivery log is surfaced in the dashboard's Settings → Event log tab. See [`webhook-delivery.md`](./webhooks/webhook-delivery.md#delivery-log).
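
The provider lookup in step 5 (exact routing key first, then the provider type prefix) can be sketched like this. The class below is a hypothetical stand-in for `ProviderRegistry`; only the method name `getByRoutingKey()` and the two-stage lookup order come from the text, and the `ProviderBundle` shape is invented for illustration.

```typescript
// Hypothetical stand-in for a provider bundle; the real shape is not shown here.
interface ProviderBundle {
  label: string;
}

class ProviderRegistrySketch {
  private readonly bundles = new Map<string, ProviderBundle>();

  register(key: string, bundle: ProviderBundle): void {
    this.bundles.set(key, bundle);
  }

  // Exact match first; otherwise fall back to the provider type before the ":".
  getByRoutingKey(routingKey: string): ProviderBundle | undefined {
    const exact = this.bundles.get(routingKey);
    if (exact !== undefined) return exact;
    const prefix = routingKey.split(":")[0];
    return prefix === undefined ? undefined : this.bundles.get(prefix);
  }
}
```

A key such as `github:999` with no exact entry resolves to whatever is registered under plain `github`; a key whose type is unknown returns `undefined`, and processing is skipped.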

## Job execution flow

Once the orchestrator has matched triggers and resolved caches, jobs are dispatched to agents.

```
Orchestrator                         Agent                    Sandbox (child process)
    |                                  |                          |
    |-- job.dispatch (WS) ------------>|                          |
    |   (jobConfig, sourceTarUrl,      |                          |
    |    sourceTarHash, depsUrl,       |-- Create sandbox ------->|
    |    depsHash)                     |   (container/bare-metal/ |
    |                                  |    firecracker)          |
    |                                  |                          |-- Restore .kici/ source (tarball)
    |                                  |                          |-- Restore deps (tarball)
    |                                  |                          |-- Load workflow (TS loader hook)
    |                                  |                          |-- Evaluate rules
    |                                  |                          |-- Execute steps
    |                                  |<-- IPC (step status, ----|
    |                                  |    log lines, events)    |
    |<-- job.status (WS) -------------|                          |
    |   (step progress, completion)    |-- Teardown sandbox ----->|
    |                                  |                          |
```

### Agent pipeline

The agent delegates job execution to an `ExecutionSandbox` (container, bare-metal, or firecracker). The sandbox runs customer code in an isolated child process -- never in the agent's V8 isolate. Four job types are handled: execution jobs (sandbox), build-only jobs (in-process, cache population), init-only jobs (in-process, dynamic field resolution), and DynamicJobFn evaluation jobs (in-process, runtime job generation). See [Job execution lifecycle](./execution/job-execution.md) for details.

1. **Report running** -- Send `job.status: running` immediately upon accepting the dispatch
2. **Sandbox selection** -- Determine execution mode (container, bare-metal, firecracker) from job config and environment
3. **Sandbox setup** -- Create and start the execution environment (container: `docker create`/`start`; bare-metal: validate; firecracker: detect)
4. **Context emission** -- Send `job.context` to orchestrator with runtime details (Node version, OS, arch, sandbox type)
5. **Sandbox execution** -- The sandbox child process handles the inner pipeline: `.kici/` source tarball restore, deps tarball restore, workflow loading (dynamic-import `.ts` via the shared TypeScript ESM loader hook), step extraction, rule evaluation, step execution sequentially with timeout and abort support. There is no runtime bundling step — workflow TS is transformed on import, not ahead of time.
6. **IPC callbacks** -- Step status, log lines, event emissions, and concurrency reports flow from the sandbox to the agent via IPC, then to the orchestrator via WebSocket
7. **Report** -- Send final `job.status` back to orchestrator with step results and timing
8. **Cleanup** -- Tear down sandbox and remove work directory

## Source and dependency caching flow

KiCI runs two orchestrator-side caches — the **source tarball cache** (raw `.kici/` directory minus `node_modules/`) and the **dependency tarball cache** (packed `node_modules/`). Both use a build-then-execute pattern: the orchestrator checks the caches before dispatching execution jobs, and if the source cache is cold a build agent populates both in one pass.

With the shared TypeScript loader hook plus source tarball, execution agents do not run `git clone` or compile anything at runtime — they perform exactly two S3 GETs (source + deps) and extract.

### Cache miss flow

When the source cache is cold and the dep cache is also missing:

```
Webhook
  |
  v
Trigger Match
  |
  v
Cache Check (source: MISS, deps: MISS)
  |
  v
Build Job Dispatch --> Build Agent (kici:role:builder + matching kici:os:/kici:arch:)
  |                      |
  |                      |-- git clone + checkout SHA
  |                      |-- npm ci in .kici/
  |                      |-- Pack .kici/ source (portable tar.gz, excludes node_modules)
  |                      |-- Pack .kici/node_modules (portable tar.gz)
  |                      |-- Upload source tarball to cache (source/{contentHash}.tar.gz)
  |                      |-- Upload deps tarball to cache (deps/{plat}-{arch}/{lockfileHash}.tar.gz)
  |                      |-- Upload deps companion .hash file
  |                      |-- Report success (cache.upload.complete × 2)
  |                      |
  v                      v
Build Complete <---------+
  |
  v
Get sourceTarUrl + depsUrl from cache (pre-signed S3 GETs)
  |
  v
Execution Job Dispatch --> Execution Agent
  |                          |
  |                          |-- Download source tarball (sourceTarUrl) -> extract to workDir/.kici/
  |                          |-- Download deps tarball (depsUrl) -> verify SHA-256 -> extract to .kici/node_modules/
  |                          |-- Register @kici-dev/shared/ts-loader-hook
  |                          |-- Verify workflow contentHash against lock file (drift guard)
  |                          |-- Dynamic-import workflow .ts
  |                          |-- Execute steps
  |                          |-- Report result
  |                          |
  v                          v
Done <-----------------------+
```

The execution agent never clones the repo. The source tarball IS the workflow repo's `.kici/` directory.
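
The "verify SHA-256" step in the flow above amounts to hashing the downloaded bytes and comparing against the expected digest. A minimal sketch using Node's `node:crypto` follows; `verifySha256` is a hypothetical helper, and treating the expected value as a hex string (for example the `depsHash` from the dispatch message) is an assumption.

```typescript
import { createHash } from "node:crypto";

// Hypothetical helper: true when the data hashes to the expected hex digest.
function verifySha256(data: Buffer, expectedHex: string): boolean {
  const actual = createHash("sha256").update(data).digest("hex");
  // Normalise case and surrounding whitespace before comparing.
  return actual === expectedHex.trim().toLowerCase();
}
```

An agent would run a check like this before extracting, so a corrupted or tampered tarball is rejected instead of unpacked.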

### Cache hit flow

When both caches have valid entries (the common case after the first run at a commit SHA):

```
Webhook
  |
  v
Trigger Match
  |
  v
Cache Check (source: HIT, deps: HIT)
  |
  v
Get sourceTarUrl + depsUrl from cache (pre-signed S3 GETs)
  |
  v
Execution Job Dispatch --> Execution Agent
  |                          |
  |                          |-- Download source tarball (sourceTarUrl) -> extract
  |                          |-- Download deps tarball (depsUrl) -> verify SHA-256 -> extract
  |                          |-- Register TS loader hook
  |                          |-- Dynamic-import workflow .ts
  |                          |-- Execute steps
  |                          |-- Report result
  |                          |
  v                          v
Done <-----------------------+
```

No build job is dispatched. The execution agent performs exactly two S3 GETs and extracts.

### Partial cache hit

The source cache and dep cache are independent. Four combinations are possible:

| Source | Deps | Behavior                                                                             |
| ------ | ---- | ------------------------------------------------------------------------------------ |
| HIT    | HIT  | Direct execution dispatch (fastest, two S3 GETs)                                     |
| HIT    | MISS | No build job; execution agent falls back to inline `npm ci` after restoring source   |
| MISS   | HIT  | Build job for source only (agent still packs deps opportunistically); then execution |
| MISS   | MISS | Build job for source + deps (single job packs both), then execution                  |

Dep cache misses alone do **not** trigger a build job. Deps are platform-specific (`deps/{platform}-{arch}/{hash}.tar.gz`) so a build job would need a builder agent matching the target platform, which may not exist (e.g., an arm64 builder when only x64 builders are available). When the source cache misses, the dispatched build job piggy-backs dep packing if deps are also missing. A single build job handles both artifacts when both miss, avoiding duplicate builds.
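The four combinations in the table can be written as a small decision function. This is a minimal sketch, not the orchestrator's code: `CacheState`, `DispatchPlan`, and `planDispatch` are hypothetical names, and only the behavior per combination is taken from the table above.

```typescript
// Hypothetical types and names; only the four-way behavior comes from the table above.
type CacheState = "HIT" | "MISS";

interface DispatchPlan {
  buildJob: boolean;       // dispatch a build job before execution?
  buildPacksDeps: boolean; // must that build job also pack deps?
  inlineInstall: boolean;  // does the execution agent run `npm ci` itself?
}

function planDispatch(source: CacheState, deps: CacheState): DispatchPlan {
  if (source === "HIT") {
    // A dep miss alone never triggers a build job; the agent installs inline.
    return { buildJob: false, buildPacksDeps: false, inlineInstall: deps === "MISS" };
  }
  // Source miss: a single build job runs and also packs deps when they are missing.
  // (Opportunistic dep packing on a dep HIT is not modeled in this sketch.)
  return { buildJob: true, buildPacksDeps: deps === "MISS", inlineInstall: false };
}
```

For example, `planDispatch("HIT", "MISS")` yields no build job and an inline install, which matches the second row of the table.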

### Cross-source / no-contentHash workflows

- **Lock files without `contentHash`** (schema v1) skip the source cache entirely; agents compile from source. Regenerate lock files with `kici compile` to enable caching. The current lock file schema version is 15.
- **Cross-source / global-workflow dispatch** (a workflow registered against source A fired by a webhook on source B) bypasses both caches. The registration's lock file entry still carries `contentHash`, but the cross-source path always clone-and-installs — the eval temp dir doesn't ship `@kici-dev/sdk`. The execution agent still verifies `contentHash` against the cloned source for drift detection.

### Build deduplication

When multiple webhooks trigger simultaneously for the same repository state, the `BuildCoordinator` coalesces concurrent build requests using a combined key (`contentHash:lockfileHash`). Only one build job runs; all waiting dispatches share the result.
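One common way to coalesce concurrent requests is to keep a map of in-flight promises keyed by the combined key. The sketch below illustrates that idea only; `BuildCoalescer` and its `run` method are hypothetical and do not reflect the real `BuildCoordinator` API, which this document does not show.

```typescript
// Hypothetical sketch of request coalescing; not the actual BuildCoordinator.
class BuildCoalescer<T> {
  private inFlight = new Map<string, Promise<T>>();

  run(contentHash: string, lockfileHash: string, build: () => Promise<T>): Promise<T> {
    // Combined key format taken from the text: contentHash:lockfileHash
    const key = `${contentHash}:${lockfileHash}`;
    const existing = this.inFlight.get(key);
    if (existing) return existing; // join the build that is already running

    // Start the build once and forget it when it settles (success or failure).
    const started = build().finally(() => this.inFlight.delete(key));
    this.inFlight.set(key, started);
    return started;
  }
}
```

Every caller that arrives while the build is running receives the same promise, so all waiting dispatches observe the same result or the same failure.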

### Graceful degradation

If cache storage is unavailable or a download fails:

- **Source tarball download failure:** Hard failure today — the agent does not fall back to `git clone` on the execution path. (The build path is where cloning happens.) In practice this is rare because the same orchestrator that issued the pre-signed URL controls the cache backend.
- **Dep tarball download failure:** Agent falls back to running `npm ci` / `npm install` inline.
- **Dep tarball hash mismatch:** Agent retries the download twice (3 total attempts), then fails the job (no fallback for integrity failures).
- **Source tarball drift (extracted `contentHash` ≠ lock file):** Hard failure with "Lock file is out of date: workflow source changed without regenerating kici.lock.json" — see [Lock file and drift](../user/lock-file-and-drift.md).
- **Build failure:** Execution is skipped entirely with a "Build failed" check status. Workflows that contain dynamic job entries (`DynamicJobFn`) are allowed to proceed with their dynamic eval jobs since those compile from source.
- **No cache configured:** Agent runs inline install for every job (pre-caching behavior).
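The dep tarball integrity rule (three total attempts, then a hard failure) can be sketched as follows. This is an illustration under stated assumptions: `download` is a hypothetical synchronous stand-in for the S3 GET, the hash is assumed to be hex-encoded, and the function names are not taken from the agent's code.

```typescript
import { createHash } from "node:crypto";

function sha256Hex(bytes: Uint8Array): string {
  return createHash("sha256").update(bytes).digest("hex");
}

// `download` is a hypothetical stand-in for the pre-signed S3 GET.
// Only hash mismatches are modeled; a failed download would instead
// fall back to an inline install, per the list above.
function fetchVerifiedDeps(
  download: () => Uint8Array,
  expectedSha256Hex: string,
  maxAttempts = 3, // one initial attempt plus two retries
): Uint8Array {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const bytes = download();
    if (sha256Hex(bytes) === expectedSha256Hex) return bytes;
  }
  // No fallback for integrity failures: the job fails.
  throw new Error(`deps tarball hash mismatch after ${maxAttempts} attempts`);
}
```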

## Cache storage architecture

Both source and dep caches use `S3CacheStorage` as the sole backend. The `CacheStorage` interface provides a consistent API, but S3 (or any S3-compatible service: SeaweedFS, MinIO, LocalStack) is the only supported implementation.

```
                    +------------------+
                    | CacheStorage     |
                    | (interface)      |
                    +--------+---------+
                             |
                   +---------+---------+
                   | S3CacheStorage     |
                   | (AWS S3, SeaweedFS,|
                   |  MinIO, LocalStack)|
                   +--------------------+
```

### Cache key design

Cache keys reflect that source tarballs and deps have different platform characteristics:

- **Source:** `source/{contentHash}.tar.gz` — platform-agnostic. Raw TypeScript source is identical regardless of CPU architecture, so one entry is shared across all platforms. `contentHash` is the per-workflow hash from the lock file (`SHA-256(schemaVersion + ":" + rawSource [+ "\0" + assetDigest])`).
- **Deps:** `deps/{platform}-{arch}/{lockfileHash}.tar.gz` (e.g., `deps/linux-arm64/def456.tar.gz`) — platform-specific. Native dependencies in `node_modules` differ across architectures, so each platform/arch combination gets its own cache entry.
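The key formats above can be made concrete with a short sketch. The function names are hypothetical, and the hex output encoding, UTF-8 input encoding, and decimal rendering of `schemaVersion` are assumptions; only the key templates and the hash input layout come from the text.

```typescript
import { createHash } from "node:crypto";

// SHA-256(schemaVersion + ":" + rawSource [+ "\0" + assetDigest])
// Hex output and UTF-8 input are assumptions of this sketch.
function workflowContentHash(schemaVersion: number, rawSource: string, assetDigest?: string): string {
  let input = `${schemaVersion}:${rawSource}`;
  if (assetDigest !== undefined) input += `\0${assetDigest}`;
  return createHash("sha256").update(input, "utf8").digest("hex");
}

// Platform-agnostic: one entry shared across all platforms.
function sourceCacheKey(contentHash: string): string {
  return `source/${contentHash}.tar.gz`;
}

// Platform-specific: one entry per platform/arch combination.
function depsCacheKey(platform: string, arch: string, lockfileHash: string): string {
  return `deps/${platform}-${arch}/${lockfileHash}.tar.gz`;
}
```

With these helpers, `depsCacheKey("linux", "arm64", "def456")` produces the example key from the list, `deps/linux-arm64/def456.tar.gz`.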

The orchestrator derives the target platform/arch for dep cache lookups by probing `AgentRegistry.findAvailable()` with the `runsOn` labels of the workflow's first job to find a representative matching agent, then uses that agent's platform and arch. It falls back to `linux/x64` if no matching agents are registered.

### TTL and eviction (touch-on-read)

Both caches refresh TTL on read via `touch-on-read`. An entry's lifetime is reset every time an orchestrator issues a pre-signed GET URL for it. Default TTL is `KICI_CACHE_TTL_DAYS=30`; entries unused for 30 days expire at the storage level. Actively used sources and deps stay in cache indefinitely as long as they continue to be referenced by inbound webhooks or reruns. See [`docs/operator/dependency-caching.md`](../operator/dependency-caching.md#cache-behavior) for configuration.
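The touch-on-read rule amounts to measuring the TTL from the last read rather than from creation. A minimal model, assuming a hypothetical `CacheEntryMeta` record and treating the exact 30-day boundary as expired (the real expiry happens at the storage level and may differ at the boundary):

```typescript
const DAY_MS = 24 * 60 * 60 * 1000;

// Hypothetical metadata record; real entries carry TTL metadata in object storage.
interface CacheEntryMeta {
  lastTouchedMs: number;
}

// Issuing a pre-signed GET URL resets the entry's lifetime.
function touchOnRead(entry: CacheEntryMeta, nowMs: number): CacheEntryMeta {
  return { ...entry, lastTouchedMs: nowMs };
}

// ttlDays mirrors the KICI_CACHE_TTL_DAYS default of 30.
function isExpired(entry: CacheEntryMeta, nowMs: number, ttlDays = 30): boolean {
  return nowMs - entry.lastTouchedMs >= ttlDays * DAY_MS;
}
```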

For the full per-package bucket and prefix inventory — cache, logs, cold-store, and the observability sidecar buckets — see [orchestrator storage layout](../operator/orchestrator/storage-layout.md).

### Pre-signed URL upload flow

Agents upload artifacts directly to S3 using pre-signed PUT URLs. This eliminates the orchestrator as a data proxy — only coordination messages flow through WebSocket.

```
Agent                         Orchestrator                    S3
  |                                |                           |
  |-- cache.upload.request ------->|                           |
  |   { type: "source"|"dep",     |                           |
  |     key: "source/..." or       |                           |
  |          "deps/..." }          |                           |
  |                                |-- getUploadUrl(key) ----->|
  |                                |   (PutObject pre-sign)    |
  |<-- cache.upload.response ------|                           |
  |   { url: "https://s3.../..." } |                           |
  |                                |                           |
  |-- HTTP PUT (artifact body) --------------------------->|
  |   (direct S3 upload)           |                           |
  |                                |                           |
  |-- cache.upload.complete ------>|                           |
  |   { type, key, depsHash? }     |-- initMeta(key) -------->|
  |                                |   (CopyObject to set     |
  |                                |    TTL metadata)          |
  |                                |-- put(hashKey) --------->|
  |                                |   (companion .hash file   |
  |                                |    for deps integrity)    |
```

The two-phase metadata approach (`upload via PUT` then `initMeta via CopyObject`) works around the limitation that S3 pre-signed URLs cannot include custom metadata headers. For dependency tarballs, the agent also reports the SHA-256 content hash in `cache.upload.complete`; the orchestrator stores it as a companion `.hash` file alongside the tarball. When dispatching execution jobs, the orchestrator reads this hash and includes it as `depsHash` in `job.dispatch`, enabling agent-side integrity verification on download. Source tarballs do not use a companion `.hash` file — the workflow `contentHash` carried in `sourceTarHash` is used to verify the extracted source against the lock file after extraction, which covers drift end-to-end.

### URL delivery (downloads)

Agents receive pre-signed S3 GET URLs (15-minute expiry) directly in `job.dispatch` messages. Agents download artifacts from S3, bypassing the orchestrator for all data transfer.

## User-facing cache flow

The source/dep cache above is internal: the orchestrator owns its keys and decides when to hit or build. The **user-facing cache** is driven by the workflow author — the declarative `cache: { key, paths, restoreKeys? }` on a job/step, or the imperative `ctx.cache.restore()` / `ctx.cache.save()` API (see [SDK caching reference](../user/sdk/caching.md)). It reuses the same object-storage backend and the same direct-to-storage presigned-URL transport, but the agent — not the orchestrator — initiates each restore and save over WebSocket.

The agent's cache module archives `paths` into a gzipped tarball (computing a SHA-256 over the bytes) and streams downloads back through a checksum-verified extract pipeline. The orchestrator's `UserCache` owns the `cache/<orgId>/<repoId>/<scope>/<key>` namespacing, the immutable first-save check, the `restoreKeys` prefix scan, the two-phase atomic save, and per-org quota/TTL eviction.

### Restore flow

```
Agent                              Orchestrator (UserCache)          Object storage
  |                                      |                                |
  |-- cache.user.restore.request ------->|                                |
  |   { key, restoreKeys? }              |-- exact key in read prefixes ->|
  |                                      |   (isolated: iso/<runId>/      |
  |                                      |    then shared/; trusted:      |
  |                                      |    shared/ only)               |
  |                                      |-- restoreKeys prefix scan ---->|
  |                                      |   (newest match wins)          |
  |                                      |-- getUrl(matched) + touch ---->|
  |<-- cache.user.restore.response ------|                                |
  |   { hit, matchedKey?,                |                                |
  |     downloadUrl?, tarHash? }         |                                |
  |                                      |                                |
  |-- HTTP GET (tarball body) -------------------------------------->|
  |   (direct download; verify tarHash, extract paths)                |
```

The restore resolves the exact `key` across the ref's read prefixes first, then each `restoreKeys` prefix in order (newest matching entry wins). A trusted ref reads only `shared/`; an untrusted/fork ref reads its own `iso/<runId>/` scope and then falls back to `shared/`. On a hit the response carries a presigned GET URL plus the tarball's `tarHash`, which the agent verifies before extracting.
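The lookup order can be illustrated with an in-memory model. This sketch makes several assumptions: entries are identified by their scope-relative key (the `cache/<orgId>/<repoId>/` part is omitted), `createdAtMs` stands in for whatever recency signal the orchestrator uses, and a `restoreKeys` prefix is matched across all read prefixes at once. None of the names below come from the `UserCache` implementation.

```typescript
type RefScope = "shared" | "isolated";

// Hypothetical listing entry; `key` includes the scope prefix, e.g. "shared/deps-abc".
interface StoredEntry {
  key: string;
  createdAtMs: number;
}

// Trusted refs read shared/ only; isolated refs read their own scope, then shared/.
function readPrefixes(scope: RefScope, runId: string): string[] {
  return scope === "isolated" ? [`iso/${runId}/`, "shared/"] : ["shared/"];
}

function resolveRestore(
  entries: StoredEntry[],
  scope: RefScope,
  runId: string,
  key: string,
  restoreKeys: string[] = [],
): string | undefined {
  const prefixes = readPrefixes(scope, runId);

  // 1. Exact key, checked across the read prefixes in order.
  for (const prefix of prefixes) {
    const exact = entries.find((e) => e.key === prefix + key);
    if (exact) return exact.key;
  }

  // 2. Each restoreKeys prefix in order; the newest matching entry wins.
  for (const restoreKey of restoreKeys) {
    const matches = entries.filter((e) => prefixes.some((p) => e.key.startsWith(p + restoreKey)));
    if (matches.length > 0) {
      return matches.reduce((a, b) => (b.createdAtMs > a.createdAtMs ? b : a)).key;
    }
  }
  return undefined; // cache miss
}
```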

### Save flow (two-phase atomic)

```
Agent                              Orchestrator (UserCache)          Object storage
  |                                      |                                |
  |-- cache.user.save.request --------->|                                |
  |   { key }                            |-- has(final key)? ------------>|
  |                                      |   (immutable: skip if exists)  |
  |                                      |-- getUploadUrl(.tmp-<uuid>) -->|
  |<-- cache.user.save.response ---------|                                |
  |   { uploadUrl?, skip }               |                                |
  |                                      |                                |
  |-- HTTP PUT (tarball body) ----------------------------------->|
  |   (direct upload to temp object)                              |
  |                                      |                                |
  |-- cache.user.save.complete -------->|                                |
  |   { key, tarHash, sizeBytes }        |-- copy(temp -> final) -------->|
  |                                      |-- delete(temp) --------------->|
  |                                      |-- initMeta(final) ------------>|
  |                                      |-- put(.hash) + put(.size) ---->|
  |                                      |-- enforce per-org quota ------>|
```

The save is **immutable** and **atomic**. The orchestrator declines (`skip: true`) up front if the exact key already exists. Otherwise the agent uploads to a `.tmp-<uuid>` object via a presigned PUT, then `cache.user.save.complete` triggers a server-side copy temp→final, a delete of the temp, an `initMeta` to stamp TTL metadata, and `.hash` / `.size` companion writes. Because the final key only appears after the copy, a crashed upload never leaves a corrupt committed entry. The committing save then enforces the per-org byte quota, evicting oldest entries until the org is back under `KICI_USER_CACHE_QUOTA_BYTES`.
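A small in-memory model shows why the temp-then-copy sequence keeps the final key clean. This is a sketch only: `UserCacheSaveSketch` and its methods are hypothetical, a `Map` stands in for object storage, the temp key shape is an assumption, and `initMeta`, the `.size` companion, and quota enforcement are left out.

```typescript
import { randomUUID } from "node:crypto";

type SaveResponse = { skip: true } | { skip: false; tempKey: string };

class UserCacheSaveSketch {
  // Stand-in for object storage: key -> body.
  readonly objects = new Map<string, string>();

  // Models cache.user.save.request: decline if the exact key already exists.
  requestSave(finalKey: string): SaveResponse {
    if (this.objects.has(finalKey)) return { skip: true };
    return { skip: false, tempKey: `${finalKey}.tmp-${randomUUID()}` };
  }

  // Models the agent's direct HTTP PUT to the temp object.
  upload(tempKey: string, body: string): void {
    this.objects.set(tempKey, body);
  }

  // Models cache.user.save.complete: copy temp -> final, delete temp, write .hash.
  completeSave(finalKey: string, tempKey: string, tarHash: string): void {
    const body = this.objects.get(tempKey);
    if (body === undefined) throw new Error("temp object missing; nothing to commit");
    this.objects.set(finalKey, body);
    this.objects.delete(tempKey);
    this.objects.set(`${finalKey}.hash`, tarHash);
  }
}
```

If the agent crashes after `upload` but before `completeSave`, only the temp object exists, so a later restore of the final key is simply a miss.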

### Trust → scope mapping

The orchestrator threads a `cacheRefScope` onto each `job.dispatch`. A **trusted** ref (the repo's own branches, default branch) maps to the `shared` write scope; any other ref (a fork PR) maps to `isolated`, writing to a per-run `iso/<runId>/` scope. This is the cache-isolation model: a fork can restore from the trusted `shared/` cache but can never write into it, so it cannot poison the entries a trusted branch later restores. The org segment of the key namespace (`cache/<orgId>/`) is the per-tenant boundary — no tenant can read another tenant's cache. See [orchestrator storage layout](../operator/orchestrator/storage-layout.md#user-cache) for the full prefix map and quota/TTL knobs.

## Internal event routing flow

Internal events (custom events from `ctx.emit()` and system events from workflow/job completion) flow through the event router for fan-out delivery to matching workflows.

```
Step ctx.emit('event-name', payload)
  |
  v
Agent IPC (fork channel or stdout JSON-lines)
  |
  v
Agent -> event.emit WS message -> Orchestrator
  |
  v
EventRouter.emit()
  |-- CircuitBreaker check (chain depth, rate limit) — fail-fast, in-memory
  |-- BEGIN TRANSACTION
  |     |-- EventStore.writeWith(tx) -> INSERT into kici_events table
  |     |-- pg_notify('kici_event_channel', eventId) (queued; fires on commit)
  |-- COMMIT (rollback discards both insert and notify atomically)
  |
  v
All Orchestrators LISTEN on 'kici_event_channel' channel
  |
  v
EventRouter.onNotification(eventId) [private]
  |-- EventStore.tryLeaseForProcessing(eventId, nodeId, leaseDurationMs)
  |     (atomic UPDATE: claim only if processed=false AND dlq_at IS NULL
  |      AND (claimed_at IS NULL OR claimed_at < NOW() - leaseDurationMs);
  |      increments attempts and records claimed_at/claimed_by atomically)
  |-- If lease acquired:
  |     |-- processSubscriptions(event):
  |     |     |-- If RegistrationIndex available:
  |     |     |     Look up registrations by trigger type
  |     |     |     TrustStore.isTrusted() (for cross-repo events)
  |     |     |     matchAllWorkflows() against registered workflows
  |     |     |-- Else (no RegistrationIndex):
  |     |     |     TrustStore.isTrusted() (for cross-routing-key events)
  |     |     |     matchAllWorkflows() against in-memory lock file subscriptions
  |     |     |-- For each match: onEventMatched(event, lockFile, matchedWorkflows)
  |     |
  |     |-- On success: markProcessed (commits processed=true, clears lease)
  |     |-- On failure (any onEventMatched throws):
  |           |-- If attempts >= maxDispatchAttempts: markDlq('exhausted_retries')
  |           |-- Else: recordDispatchFailure (sets next_retry_at via exponential
  |                     backoff with full jitter; clears lease)
  |
  v
Job dispatch to agents (standard pipeline)
```

### At-least-once delivery + DLQ

Three invariants keep events from being silently lost:

- **Cron-fire atomicity:** `tryClaimFire` (advances `cron_last_fired`) and the
  event-row INSERT + `pg_notify` execute inside the same database transaction.
  If the leader process is killed between the two writes, the transaction
  rolls back and no `last_fired_at` advance leaks. The next tick re-evaluates
  and fires cleanly.
- **Dispatch retries:** the lease pattern (`tryLeaseForProcessing`) marks an
  event as in-flight without committing it as processed. When a handler
  throws, the lease wrapper records the failure, schedules a retry, and on
  the leader's retry-scanner tick the event is re-published via `pg_notify`.
  After `maxDispatchAttempts` (default 5) the
  event lands in the DLQ (`dlq_at` set, `dlq_reason='exhausted_retries'`)
  and is surfaced via Prometheus (`kici_orch_event_dlq_*`), Grafana
  (`event-delivery` dashboard), and the kici-admin CLI
  (`kici-admin event-dlq {list,count,retry,discard}`).
- **Crash detection:** when a node crashes mid-dispatch, its lease ages out
  after `leaseDurationMs` (default 60 s). The leader's
  `EventRetryScanner` releases the expired lease and re-publishes
  `pg_notify` so a healthy node picks the event up. Each release increments
  `kici_orch_event_lease_expirations_total` — a steady > 0 rate is the
  visible signal that an orchestrator instance is dying mid-dispatch.
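The lease predicate and the retry decision can be expressed as pure functions. This is a sketch under assumptions: `EventRow` is a hypothetical in-memory shape for a `kici_events` row, and the backoff `baseMs` and `capMs` values are placeholders because the document does not state them. Only the claim condition, the default of 5 attempts, the 60 s lease, and the use of exponential backoff with full jitter come from the text.

```typescript
// Hypothetical in-memory view of a kici_events row.
interface EventRow {
  processed: boolean;
  dlqAt: number | null;
  claimedAt: number | null;
  attempts: number;
}

// Mirrors the claim condition: unprocessed, not in the DLQ, and either
// never claimed or claimed longer ago than the lease duration.
function canLease(row: EventRow, nowMs: number, leaseDurationMs = 60_000): boolean {
  return (
    !row.processed &&
    row.dlqAt === null &&
    (row.claimedAt === null || row.claimedAt < nowMs - leaseDurationMs)
  );
}

// After a handler throws: dead-letter once attempts are exhausted, else retry.
function onDispatchFailure(attempts: number, maxDispatchAttempts = 5): "dlq" | "retry" {
  return attempts >= maxDispatchAttempts ? "dlq" : "retry";
}

// Exponential backoff with full jitter: uniform in [0, min(cap, base * 2^(attempts-1))).
// baseMs and capMs are assumed values, not documented defaults.
function backoffDelayMs(
  attempts: number,
  baseMs = 1_000,
  capMs = 60_000,
  random: () => number = Math.random,
): number {
  const ceiling = Math.min(capMs, baseMs * 2 ** (attempts - 1));
  return Math.floor(random() * ceiling);
}
```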

### System events

The orchestrator auto-emits system events after execution completes:

- **`workflow_complete`** -- emitted when all jobs in a workflow finish (carries workflow name, status, duration)
- **`job_complete`** -- emitted when a single job finishes (carries workflow name, job name, status, duration)

These events are stored in the same `kici_events` table and matched against `workflowComplete()` and `jobComplete()` triggers in the lock file.

### Event.emit WS protocol

```
Agent                          Orchestrator
  |                                |
  |-- event.emit ----------------->|
  |   { jobId, requestId,         |
  |     eventName, payload,        |
  |     target? }                  |
  |                                |-- store event
  |                                |-- NOTIFY
  |<-- event.emit.response --------|
  |   { requestId, deliveryId? }   |
  |                                |
```

### Registration extraction flow

When code is pushed to the default branch, the orchestrator extracts event-triggered workflows from the lock file and stores them as registrations for cluster-wide event matching.

```
Git Push to Default Branch
==========================

GitHub Webhook -> Platform Relay -> Orchestrator Processor

  Processor (on default-branch push):
    |-- lockFileCache.get() (fetch/cache lock file by blob SHA)
    |-- extractRegisterableWorkflows(fullLockFile)
    |       |-- For each workflow entry in lock file:
    |       |     Check if any trigger type is registerable
    |       |     (kici_event, workflow_complete, job_complete,
    |       |      generic_webhook, schedule, lifecycle)
    |       |-- Return array of registerable workflows
    |
    |-- globalWorkflowPolicy.isWorkflowRepoAllowed() (if policy configured)
    |       |-- Filter out global workflows from repos not on the allow-list
    |
    |-- registrationStore.replaceAll(repoIdentifier, workflows, routingKey, credentials, { commitSha })
    |       |-- BEGIN TRANSACTION
    |       |-- DELETE FROM workflow_registrations WHERE routing_key AND repo_identifier
    |       |-- INSERT new registrations (with commit SHA for lock file pinning)
    |       |-- COMMIT
    |
    |-- registrationStore.bumpVersion()
    |       |-- UPDATE registry_versions SET version = version + 1
    |
    |-- registrationIndex.refreshIfNeeded(newVersion)
    |       |-- If local version != remote version:
    |       |     Load all registrations from DB
    |       |     Rebuild primary index (by customer:repo)
    |       |     Rebuild secondary index (by trigger type)
    |       |     Update local version
    |
    |-- cronScheduler.refreshCache() (defense-in-depth)
    |
    |-- eventRouter.emit('registration.updated', { repo, workflows })
```
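The filtering step in `extractRegisterableWorkflows` can be approximated as shown here. The trigger type names are taken from the flow above; the `LockFileWorkflow` shape and the function name `extractRegisterable` are hypothetical, since the lock file schema is not reproduced in this section.

```typescript
// Trigger types that qualify a workflow for registration (from the flow above).
const REGISTERABLE_TRIGGER_TYPES: readonly string[] = [
  "kici_event",
  "workflow_complete",
  "job_complete",
  "generic_webhook",
  "schedule",
  "lifecycle",
];

// Hypothetical, simplified lock file entry.
interface LockFileWorkflow {
  name: string;
  triggers: { type: string }[];
}

// Keep a workflow if any of its triggers is registerable.
function extractRegisterable(workflows: LockFileWorkflow[]): LockFileWorkflow[] {
  return workflows.filter((w) =>
    w.triggers.some((t) => REGISTERABLE_TRIGGER_TYPES.indexOf(t.type) !== -1),
  );
}
```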

### Cron schedule evaluation flow

Cron schedules are evaluated periodically by the Raft leader only.

```
Cron Schedule Evaluation
========================

CronScheduler (runs every 30 seconds, Raft leader only):
  |-- registrationIndex.getCronSchedules()
  |-- For each schedule:
  |     |-- new Cron(cronExpression, { timezone })
  |     |-- cron.previousRuns(1) -> most recent past scheduled time
  |     |-- Check last-fired cache (prevent double-fire)
  |     |-- If due and not recently fired:
  |           |-- BEGIN TRANSACTION
  |           |     |-- cronStore.tryClaimFire(registrationId, previousRun, tx)
  |           |     |     (atomic DB claim — prevents duplicate fires in
  |           |     |      multi-orchestrator clusters via WHERE last_fired_at <
  |           |     |      firedAt guard)
  |           |     |-- If claim successful:
  |           |           |-- eventRouter.emitInTx(__schedule_fire, tx)
  |           |                 |-- EventStore.writeWith(tx) -> INSERT kici_events
  |           |                 |-- pg_notify('kici_event_channel', id) on tx
  |           |-- COMMIT (rollback discards both writes; pg_notify fires on commit)
  |           |-- On commit:
  |                 |-- Update local last-fired cache
  |                 |-- EventRouter matches against registered workflows
  |                 |-- Matched workflows dispatched via standard pipeline
```

Recovery on leader election loads the `cron_last_fired` table into the
last-fired cache and fires at most one event per schedule that missed a run.
Because the claim and the event-row insert share a transaction, a crash
between the two cannot leave `last_fired_at` advanced with no event row: the
rollback discards both writes and the next tick fires cleanly.
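The `last_fired_at < firedAt` guard is what makes the claim safe when several nodes evaluate the same schedule. A minimal in-memory model of that guard, where a `Map` stands in for the `cron_last_fired` table and `tryClaimFireSketch` is a hypothetical name (the real claim is an atomic SQL upsert inside a transaction):

```typescript
// Map of registrationId -> last fired timestamp (ms); stands in for cron_last_fired.
function tryClaimFireSketch(
  lastFired: Map<string, number>,
  registrationId: string,
  firedAtMs: number,
): boolean {
  const previous = lastFired.get(registrationId);
  // Guard: only advance when the stored value is strictly older than this fire.
  if (previous !== undefined && previous >= firedAtMs) return false;
  lastFired.set(registrationId, firedAtMs);
  return true;
}
```

Two nodes racing on the same scheduled instant both call the claim with the same `firedAtMs`; the first succeeds and the second sees an equal stored value and skips the emit.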

#### Timing characteristics

- **Tick interval:** Hardcoded at 30 s (`evaluationIntervalMs` defaults to `30_000` in `CronScheduler` and is not exposed via orchestrator config or env vars). Changing it requires a code change.
- **Fire jitter:** A schedule due at time `T` fires at the first tick `>= T`, i.e. `0–30 s` after the scheduled moment, never before. The event payload's `scheduledAt` carries the cron-computed time (not the dispatch time), so downstream consumers see the intended schedule.
- **Per-tick concurrency:** All schedules are processed serially in a single `for` loop on the leader (`packages/orchestrator/src/cron/cron-scheduler.ts`, `evaluate()`). Each registration costs one in-memory cron computation plus two DB writes (the `tryClaimFire` upsert and the `eventStore.write` insert) and one `pg_notify`. Throughput is therefore bounded by sequential DB write latency: at ~5–15 ms per registration, 50 schedules firing in the same tick take roughly 250–750 ms between the first and last fire.
- **Recovery semantics:** On leader election, `recoverMissedSchedules()` calls `cron.previousRuns(1)` per schedule -- it fires at most one event per schedule regardless of how long the cluster was leaderless. There is no backfill for multiple missed scheduled instants.
- **Multi-node deduplication:** During cluster startup multiple nodes may transiently self-elect (dormant mode). The atomic `tryClaimFire` upsert with a `last_fired_at < firedAt` `WHERE` guard ensures only one node's emit succeeds; losing nodes update their local cache and skip emit.
- **Sub-minute crons:** Supported but bounded by the 30 s tick. `* * * * *` fires roughly once per minute with up to 30 s of drift; sub-30-second cadences are not achievable without lowering the interval in code.
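The fire-jitter bound follows from rounding the scheduled time up to the next tick. The sketch below assumes ticks occur at fixed multiples of the interval from a known origin, which is a simplification; `firstTickAtOrAfter` and `tickOriginMs` are hypothetical names.

```typescript
// Returns the time of the first scheduler tick at or after the scheduled moment.
// Assumes ticks fire at tickOriginMs + n * intervalMs for n = 0, 1, 2, ...
function firstTickAtOrAfter(scheduledMs: number, tickOriginMs: number, intervalMs = 30_000): number {
  const elapsed = scheduledMs - tickOriginMs;
  const ticks = Math.max(0, Math.ceil(elapsed / intervalMs));
  return tickOriginMs + ticks * intervalMs;
}

// Delay between the scheduled moment and the actual fire.
function fireDelayMs(scheduledMs: number, tickOriginMs: number, intervalMs = 30_000): number {
  return firstTickAtOrAfter(scheduledMs, tickOriginMs, intervalMs) - scheduledMs;
}
```

A schedule that lands exactly on a tick fires with no delay, while one that lands 1 s after a tick waits 29 s for the next one.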

## Generic webhook flow

Generic webhooks from non-GitHub sources follow a parallel ingestion path. The webhook can arrive directly at the orchestrator or be relayed through the Platform.

```
External Service (ArgoCD, Jenkins, Grafana, etc.)
  |
  v
POST /webhook/:orgId/generic/:sourceId
  |
  +--> Platform path:
  |      |-- Resolve source by routing key generic:<orgId>:<sourceId>
  |      |-- Relay via WebSocket to orchestrator (see internal/platform/data-flows.md)
  |      v
  +--> Orchestrator path (direct or via Platform relay):
         |-- GenericSourceManager.getByOrgAndName(orgId, sourceId)
         |-- Payload size check (per-source maxPayloadBytes)
         |-- Rate limit check (per-source rateLimitRpm)
         |-- Verify signature (HMAC-SHA256, bearer token, IP allowlist, or none)
         |-- Deduplication check (idempotency key within dedup window)
         |-- GenericWebhookNormalizer.normalizeEvent() -> SimulatedEvent
         |-- Match against lock file triggers (genericWebhook type)
         |-- Dispatch matched jobs to agents
```
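For the HMAC-SHA256 option in the signature step, verification typically recomputes the MAC over the raw request body and compares it in constant time. The sketch below uses Node's `node:crypto` module; the hex encoding of the signature and the function names are assumptions, since the per-source signature header format is not specified in this section.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Helper that produces the signature a sender would attach (hex is an assumption).
function signHmacSha256(secret: string, rawBody: string): string {
  return createHmac("sha256", secret).update(rawBody, "utf8").digest("hex");
}

// Recompute the MAC over the raw body and compare in constant time.
function verifyHmacSha256(secret: string, rawBody: string, signatureHex: string): boolean {
  const expected = createHmac("sha256", secret).update(rawBody, "utf8").digest();
  const provided = Buffer.from(signatureHex, "hex");
  // timingSafeEqual throws on length mismatch, so check lengths first.
  return provided.length === expected.length && timingSafeEqual(provided, expected);
}
```

The comparison must run over the exact bytes that were received, before any JSON parsing or re-serialization.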

### Generic vs GitHub webhook differences

| Aspect         | GitHub Webhooks             | Generic Webhooks                                |
| -------------- | --------------------------- | ----------------------------------------------- |
| Signature      | HMAC-SHA256 (always)        | Configurable (HMAC, bearer, IP, none)           |
| Event type     | X-GitHub-Event header       | Configurable header or payload field            |
| Delivery ID    | X-GitHub-Delivery header    | Configurable header or auto-generated UUID      |
| Lock file      | Fetched from repo           | Cached from lock file subscription              |
| Git operations | Clone, fetch, changed files | None (optional -- non-repo workflows supported) |

## Database topology

The orchestrator owns its own PostgreSQL database, with the authoritative `execution_runs`, `execution_jobs`, `execution_steps`, `dispatch_queue`, `dedup_cache`, `workflow_registrations`, `environments` / `scoped_secrets` / `environment_bindings`, `agent_tokens`, `cluster_meta`, and related tables. Each orchestrator deployment uses its own `KICI_DATABASE_URL`; database users are scoped per service.

## Execution reporting flow

After job execution, results flow back through the tiers:

```
Agent                    Orchestrator             Platform              GitHub
  |                          |                     |                  |
  |-- job.status ----------->|                     |                  |
  |   (completed/failed)     |                     |                  |
  |                          |-- execution.status ->|                  |
  |                          |   (run metadata)     |-- upsert        |
  |                          |                      |   execution_runs |
  |                          |-- job.status.forward>|                  |
  |                          |   (job metadata)     |-- upsert        |
  |                          |                      |   execution_jobs |
  |                          |-- GitHub Checks API ---------------------->|
  |                          |   (check run update)  |                  |
  |                          |                      |                  |
```

The orchestrator updates:

1. **GitHub Check Runs** via the Checks API (conclusion, summary, duration)
2. **Execution runs** in the orchestrator's own database (authoritative source)
3. **Platform execution status** via WebSocket (`execution.status` and `job.status.forward` messages, which the Platform upserts into its own projection tables)

## Re-run and cancel flows

The dashboard enables users to re-run completed workflows and cancel running workflows. Both flows use a REST-over-WS proxy pattern: the Platform receives a REST request from the dashboard, forwards it to the orchestrator via WebSocket, and returns the orchestrator's response.

### Re-run flow

```
Dashboard                    Platform                         Orchestrator
    |                          |                              |
    |-- POST /orgs/:id/runs/ ->|                              |
    |   :runId/rerun (auth)    |-- Cooldown check             |
    |                          |   (last_rerun_at < 5s ago?)  |
    |                          |                              |
    |                          |-- run.rerun.request (WS) --->|
    |                          |   { runId, triggeredBy }     |
    |                          |                              |-- Load original run from DB
    |                          |                              |-- Read webhook payload from storage
    |                          |                              |-- Re-fetch lock file at original SHA
    |                          |                              |-- Dispatch new jobs via Dispatcher
    |                          |                              |-- Record execution with parent_run_id + original_run_id
    |                          |                              |-- execution.status (WS, via callback)
    |                          |                              |   { parentRunId, originalRunId, triggeredBy }
    |                          |<- run.rerun.response (WS) ---|
    |                          |   { newRunId }               |
    |                          |                              |
    |<- 200 { newRunId } ------|                              |
    |                          |-- UPDATE last_rerun_at       |
    |-- Navigate to new run    |                              |
    |                          |                              |
```

Key design points:

- **Cooldown enforcement:** The Platform enforces a 5-second cooldown per original run via the `last_rerun_at` column. Rapid re-run attempts receive 429 Too Many Requests.
- **Payload reuse:** The orchestrator reads the original webhook payload from filesystem/object storage and stores a copy for the new run (enabling re-run of re-runs).
- **Lock file at original SHA:** The lock file is re-fetched at the original commit SHA, ensuring the re-run uses the same workflow definition.
- **Lineage tracking:** The new run has `parent_run_id` pointing to the immediate parent run, `original_run_id` pointing to the root ancestor run (for chain traversal), and `triggered_by` recording the user identity.
- **No trigger matching:** Re-runs skip deduplication, normalization, and trigger matching. They go directly from lock file parse to job dispatch.

### Cancel flow

```
Dashboard                    Platform                         Orchestrator          Agent(s)
    |                          |                              |                    |
    |-- POST /orgs/:id/runs/ ->|                              |                    |
    |   :runId/cancel (auth)   |                              |                    |
    |                          |-- run.cancel.request (WS) -->|                    |
    |                          |   { runId, cancelledBy }     |                    |
    |                          |                              |-- Find active jobs  |
    |                          |                              |   from dispatch queue
    |                          |                              |-- job.cancel (WS) ->|
    |                          |                              |   (for each agent)  |-- Abort step
    |                          |                              |                    |-- Cleanup
    |                          |<- run.cancel.response (WS) --|                    |
    |                          |   { cancelledJobs: N }       |                    |
    |                          |                              |<- job.status -------|
    |<- 200 { cancelledJobs } -|                              |   (cancelled)      |
    |                          |-- UPDATE cancelled_by        |                    |
    |                          |                              |                    |
```

The cancel flow is asynchronous: the orchestrator sends `job.cancel` to agents and immediately responds with the count. Agents asynchronously abort their current step, clean up, and report `job.status: cancelled` back to the orchestrator.

### Payload storage flow

Webhook payloads are stored during initial processing and retrieved later for re-runs and the payload viewer.

```
Webhook arrives                          Payload retrieved
    |                                        |
    v                                        v
processWebhook()                     GET /orgs/:id/runs/:runId/payload
    |                                        |
    v                                        v
logStorage.append(                   Platform -> dashboard.payload (WS)
  executions/{runId}/                        |
  webhook-payload.json,                      v
  JSON.stringify(payload)            Orchestrator -> logStorage.read(
)                                      executions/{runId}/
    |                                  webhook-payload.json
    v                                )
Filesystem or object storage               |
                                           v
                                    dashboard.payload.response (WS)
                                      { payload: {...} }
```
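
A minimal sketch of the write and read halves of this flow, assuming a storage object with `append` and `read` methods as in the diagram. `InMemoryLogStorage`, `storePayload`, and `loadPayload` are hypothetical stand-ins, and the stand-in is synchronous for brevity where a real filesystem or object-storage backend would presumably be asynchronous.

```typescript
// Hypothetical in-memory stand-in for the logStorage backend.
class InMemoryLogStorage {
  private files = new Map<string, string>();

  append(path: string, data: string): void {
    this.files.set(path, (this.files.get(path) ?? "") + data);
  }

  read(path: string): string | undefined {
    return this.files.get(path);
  }
}

// Path layout taken from the diagram above.
const payloadPath = (runId: string): string =>
  `executions/${runId}/webhook-payload.json`;

// Webhook side: serialize the payload once during initial processing.
function storePayload(storage: InMemoryLogStorage, runId: string, payload: unknown): void {
  storage.append(payloadPath(runId), JSON.stringify(payload));
}

// Retrieval side: read and parse it for re-runs or the payload viewer.
function loadPayload(storage: InMemoryLogStorage, runId: string): unknown {
  const raw = storage.read(payloadPath(runId));
  return raw === undefined ? undefined : JSON.parse(raw);
}
```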

### Event-log payload streaming flow

The dashboard's event-log detail panel reads webhook bodies through a chunked transport so the dashboard can render progress as bytes arrive. The orchestrator slices the payload into 64 KiB chunks and streams them up to the browser.
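
As a rough illustration of the slicing step, the sketch below cuts a byte payload into 64 KiB pieces. The chunk envelope (`offset`, `total`, `bytes`) and the function name `sliceIntoChunks` are assumptions made for this example, not the documented message schema.

```typescript
const CHUNK_SIZE = 64 * 1024; // 64 KiB, as described above

// Yields consecutive slices of the payload, each at most chunkSize bytes.
function* sliceIntoChunks(payload: Uint8Array, chunkSize: number = CHUNK_SIZE) {
  for (let offset = 0; offset < payload.byteLength; offset += chunkSize) {
    yield {
      offset,
      total: payload.byteLength,
      bytes: payload.subarray(offset, offset + chunkSize),
    };
  }
}

// Usage: a 150,000-byte body splits into two full chunks and one remainder.
const body = new TextEncoder().encode("x".repeat(150_000));
const chunks = Array.from(sliceIntoChunks(body));
```

Carrying `offset` and `total` with each chunk is one way to let a receiver compute progress as bytes arrive.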

### Lineage query

The lineage endpoint (`GET /orgs/:customerId/runs/:runId/reruns`) returns all runs with `parent_run_id` matching the given run ID.

## Trace ID propagation

Every webhook event is assigned a trace ID (`requestId`) at ingestion. A second ID (`runId`) is added at dispatch time. Both propagate through the three tiers via WebSocket protocol messages and are automatically injected into every log line using AsyncLocalStorage.

```
GitHub         Platform                  Orchestrator              Agent
  |              |                        |                       |
  |-- webhook -->|                        |                       |
  |              |-- generate requestId   |                       |
  |              |-- requestContext.run()  |                       |
  |              |   (requestId)          |                       |
  |              |                        |                       |
  |              |-- webhook.relay (WS) ->|                       |
  |              |   { ..., requestId }   |                       |
  |              |                        |-- requestContext.run() |
  |              |                        |   (requestId)         |
  |              |                        |                       |
  |              |                        |-- generate runId      |
  |              |                        |-- enrichRequestContext |
  |              |                        |   ({ runId })         |
  |              |                        |                       |
  |              |                        |-- job.dispatch (WS) ->|
  |              |                        |   { ..., requestId }  |
  |              |                        |                       |-- requestContext.run()
  |              |                        |                       |   (requestId, runId,
  |              |                        |                       |    jobId)
  |              |                        |                       |
  |              |                        |                       |-- log: "Run: X | Trace: Y"
  |              |                        |                       |-- execute steps
  |              |                        |                       |
```

### How it works

1. **Platform ingestion:** The webhook handler generates a `requestId` (UUID) and wraps the entire request in `requestContext.run()`. All log lines within this async scope automatically include `requestId`.

2. **WebSocket relay:** The `requestId` is included in the `webhook.relay` message sent to the orchestrator. For cross-instance relay (via Valkey pub/sub), the `requestId` is serialized in the notification payload.

3. **Orchestrator processing:** The orchestrator wraps webhook processing in `requestContext.run()` using the `requestId` from the relay message (falling back to a new UUID for backward compatibility). When a `runId` is generated for job dispatch, it is enriched into the existing context via `enrichRequestContext()`.

4. **Job dispatch:** Both `requestId` and `runId` are included in the `job.dispatch` WebSocket message to the agent.

5. **Agent execution:** The agent wraps each `onJobDispatch` callback in `requestContext.run()` with `requestId`, `runId`, and `jobId`. A trace header is printed once at job start. All subsequent log lines carry all trace fields automatically.

6. **Check run summaries:** GitHub Check Run updates include `Trace: <requestId> | Run: <runId>` in the summary text, giving operators a direct link from the GitHub UI to Loki queries.

### Implementation

Trace propagation uses Node.js `AsyncLocalStorage`, exposed through the `requestContext` helpers in `@kici-dev/shared`. A logger format reads the current context and injects its fields into every JSON log line -- no changes are needed at individual call sites.
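
The mechanism can be sketched with plain `AsyncLocalStorage` from `node:async_hooks`. The names `traceStorage`, `enrichTraceContext`, and `formatLogLine` are hypothetical stand-ins for the shared helpers. Their real signatures are not shown in this document.

```typescript
import { AsyncLocalStorage } from "node:async_hooks";
import { randomUUID } from "node:crypto";

type TraceContext = { requestId: string; runId?: string; jobId?: string };

// Hypothetical stand-in for the requestContext store.
const traceStorage = new AsyncLocalStorage<TraceContext>();

// Adds fields to the context that is active in the current async scope.
function enrichTraceContext(fields: Partial<TraceContext>): void {
  const current = traceStorage.getStore();
  if (current) Object.assign(current, fields);
}

// Logger format: merges whatever context is active into the JSON line.
function formatLogLine(message: string): string {
  return JSON.stringify({ message, ...traceStorage.getStore() });
}

const lines: string[] = [];
traceStorage.run({ requestId: randomUUID() }, () => {
  lines.push(formatLogLine("webhook received")); // carries requestId only
  enrichTraceContext({ runId: "run-123" });
  lines.push(formatLogLine("job dispatched")); // carries requestId and runId
});
lines.push(formatLogLine("outside any request")); // carries no trace fields
```

Because the store is looked up inside the formatter, call sites only pass a message. The trace fields follow the async call chain on their own.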

Tier identification is handled at the infrastructure level: the `service` Loki label (set by Grafana Alloy from the systemd unit / log source) identifies which service produced the log (`platform`, `orchestrator`, `agent`, etc.). For agent logs forwarded through the orchestrator's stdout, the parsed JSON also carries an inner `service: 'agent'` field -- query both with `{service="orchestrator"} | json | service="agent"` to disambiguate.

## Output chaining data flow

Output chaining allows steps to consume outputs from preceding steps (within a job) and jobs to consume outputs from preceding jobs (across jobs). The data flows through several phases.

### Definition time

When workflow code runs at definition time (`step()`, `job()` calls):

- `step()` creates an `OutputProxy<T>` via `createStepOutputProxy(stepName)` and attaches it as `.result`
- `job()` creates an `OutputProxy<any>` via `createJobOutputProxy(jobName)` and attaches it as `.result`
- The proxy is an ES6 `Proxy` object that defers all property access to a module-global `OutputsMap`
- No outputs exist yet -- accessing `.result.field` before execution throws "has not produced outputs yet"
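
A deferred-read proxy of this kind can be sketched as below. `stepOutputs`, `setOutputsMap`, and `createOutputProxy` are simplified, hypothetical versions of the SDK internals named above, and the exact error text is an assumption apart from the quoted phrase.

```typescript
// Hypothetical module-global map: step name -> outputs produced at run time.
let stepOutputs = new Map<string, Record<string, unknown>>();

function setOutputsMap(map: Map<string, Record<string, unknown>>): void {
  stepOutputs = map;
}

// Builds a proxy whose property reads are resolved lazily against the map.
function createOutputProxy<T extends object>(stepName: string): T {
  return new Proxy({} as T, {
    get(_target, prop) {
      const outputs = stepOutputs.get(stepName);
      if (outputs === undefined) {
        throw new Error(`Step "${stepName}" has not produced outputs yet`);
      }
      return outputs[String(prop)];
    },
  });
}

// At definition time the proxy exists, but reading a field would throw.
const build = { result: createOutputProxy<{ version: string }>("build") };
```

Once a runner injects a map that contains an entry for `build`, the same `build.result.version` expression resolves to the stored value.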

### Compile time

The compiler processes the workflow definition:

- Unnamed steps (bare functions and id-less `step()` calls) receive counter IDs: `step-1`, `step-2`, etc.
- Unnamed jobs (id-less `job()` calls with UUID names) receive counter IDs: `job-1`, `job-2`, etc.
- The lock file records `hasOutputs: true` for steps with Zod output schemas
- Step counters are scoped per job; job counters are scoped per workflow
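
The counter scoping can be illustrated with a small sketch. `assignCounterIds` and its input types are hypothetical, and the sketch assumes the counter advances only for unnamed entries, which this document does not state explicitly.

```typescript
type StepDef = { id?: string };
type JobDef = { id?: string; steps: StepDef[] };

// Assigns job-N per workflow and step-N per job to entries without an id.
function assignCounterIds(jobs: JobDef[]): { id: string; steps: string[] }[] {
  let jobCounter = 0; // scoped to the workflow
  return jobs.map((job) => {
    let stepCounter = 0; // reset for every job
    return {
      id: job.id ?? `job-${++jobCounter}`,
      steps: job.steps.map((step) => step.id ?? `step-${++stepCounter}`),
    };
  });
}
```

With this scheme, two different jobs can each contain a `step-1` without clashing, because step IDs are only meaningful within their job.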

### Execution time (local test runner)

When `kici test` runs a workflow:

1. **SDK module resolution:** The runner resolves `setStepOutputsMap` / `setJobOutputsMap` from the same `@kici-dev/sdk` module instance that the workflow uses (ensures the proxy reads from the same map)
2. **Map injection:** Fresh `OutputsMap` and `StepRefMap` are created and injected via `setStepOutputsMap()` / `setStepRefMap()` before each job
3. **Step execution:** Each step runs sequentially. If the step returns a value, it is stored in the `OutputsMap` keyed by step name
4. **Bare function normalization:** Bare functions in the steps array are assigned counter names and registered in the `StepRefMap` (maps function reference to step name)
5. **Proxy resolution:** When a subsequent step accesses `stepRef.result.field`, the proxy reads from the `OutputsMap`
6. **`ctx.outputsOf()`:** Resolves step outputs by reference (Step object or bare function). For bare functions, looks up the step name in the `StepRefMap`
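
Steps 3 to 6 can be sketched together. Everything here is a simplified assumption: steps are synchronous, `runJob` and `outputsOf` are hypothetical free functions rather than the runner's real API, and a bare function is registered just before it runs.

```typescript
type StepFn = () => unknown;
type NamedStep = { name: string; run: StepFn };

const outputsMap = new Map<string, unknown>();
const stepRefMap = new Map<StepFn, string>();

// Runs steps in order; bare functions get counter names and a StepRefMap entry.
function runJob(steps: (NamedStep | StepFn)[]): void {
  let counter = 0;
  for (const entry of steps) {
    const step: NamedStep =
      typeof entry === "function" ? { name: `step-${++counter}`, run: entry } : entry;
    if (typeof entry === "function") stepRefMap.set(entry, step.name);
    const value = step.run();
    if (value !== undefined) outputsMap.set(step.name, value);
  }
}

// Resolves outputs by reference: a step object or a bare function.
function outputsOf(ref: NamedStep | StepFn): unknown {
  const name = typeof ref === "function" ? stepRefMap.get(ref) : ref.name;
  return name === undefined ? undefined : outputsMap.get(name);
}

// Usage: a named step consumes the output of a preceding bare function.
const fetchVersion = () => ({ version: "1.2.3" });
const publish: NamedStep = {
  name: "publish",
  run: () => ({ tag: `v${(outputsOf(fetchVersion) as { version: string }).version}` }),
};
runJob([fetchVersion, publish]);
```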

### Cross-job output aggregation

After each job completes in the local test runner:

1. Step outputs from the completed job are aggregated into the `jobOutputsMap`
2. **Multi-step jobs:** Outputs are nested under step names: `{ stepName: { field: value }, ... }`
3. **Single-step jobs (run shorthand):** Outputs are flattened directly: `{ field: value }` (no step-name nesting)
4. The `jobOutputsMap` is injected via `setJobOutputsMap()`, enabling `jobRef.result.stepName.field` or `jobRef.result.field` access
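
The nesting rule can be sketched as a single function. `aggregateJobOutputs` and the `isRunShorthand` flag are hypothetical. How the runner actually detects the run shorthand is not described here.

```typescript
type StepOutputs = Record<string, unknown>;

// Multi-step jobs nest outputs under step names; a single-step job declared
// with the run shorthand exposes its step's outputs directly.
function aggregateJobOutputs(
  stepOutputs: Map<string, StepOutputs>,
  isRunShorthand: boolean,
): Record<string, unknown> {
  if (isRunShorthand) {
    const only = Array.from(stepOutputs.values())[0];
    return { ...(only ?? {}) };
  }
  const nested: Record<string, unknown> = {};
  for (const [stepName, outputs] of stepOutputs) {
    nested[stepName] = outputs;
  }
  return nested;
}
```

For a multi-step job with steps `build` and `test`, the result has the shape `{ build: {...}, test: {...} }`. For a shorthand job, the step's fields sit at the top level.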

### IPC transport (agent sandbox)

In the agent sandbox (remote pipeline execution):

1. Step return values are captured and included in `step.complete` IPC messages (optional `outputs` field)
2. The agent aggregates step outputs and includes them in the `job.complete` IPC message
3. **Within-job chaining:** The sandbox populates the `OutputsMap` as steps complete, so `.result` and `ctx.outputsOf()` resolve correctly within a single job
4. **Cross-job chaining:** The orchestrator collects plain outputs from completed upstream jobs at dispatch time (querying the DB for jobs listed in `needs`), then passes them as `upstreamJobOutputs` in the `job.dispatch` message. The sandbox receives this map and populates the `jobOutputsMap` via `setJobOutputsMap()`, enabling `ctx.jobOutputs()` and `jobRef.result` access across job boundaries. Secret outputs follow a separate encrypted path via `SecretOutputStore`.

#### Within-job output flow

```
Step A completes          Step B accesses A.result
    |                          |
    v                          v
Return value             Proxy.get('field')
    |                          |
    v                          v
OutputsMap.set('A', val)  OutputsMap.get('A')
    |                          |
    v                          v
Stored in shared map     Returns val.field
```

#### Cross-job output flow

```
Job A completes               Orchestrator dispatches Job B
    |                              |
    v                              v
Outputs stored in DB          Query upstream job outputs (needs)
                                   |
                                   v
                              job.dispatch includes upstreamJobOutputs
                                   |
                                   v
                              Sandbox populates jobOutputsMap
                                   |
                                   v
                              ctx.jobOutputs('A') resolves
```
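
The dispatch-time collection can be sketched as below, with an in-memory array standing in for the database query. `JobRecord`, `collectUpstreamOutputs`, and `buildDispatch` are hypothetical names, and the dispatch message is reduced to the fields relevant here.

```typescript
type JobOutputs = Record<string, unknown>;
type JobRecord = { name: string; outputs?: JobOutputs };

// Collects plain outputs for each job named in `needs`; `completed` stands in
// for the rows the orchestrator would read from its database.
function collectUpstreamOutputs(
  needs: string[],
  completed: JobRecord[],
): Record<string, JobOutputs> {
  const upstream: Record<string, JobOutputs> = {};
  for (const name of needs) {
    const job = completed.find((candidate) => candidate.name === name);
    if (job?.outputs) upstream[name] = job.outputs;
  }
  return upstream;
}

// Builds a reduced job.dispatch message carrying the upstream outputs.
function buildDispatch(job: { name: string; needs: string[] }, completed: JobRecord[]) {
  return {
    type: "job.dispatch",
    jobName: job.name,
    upstreamJobOutputs: collectUpstreamOutputs(job.needs, completed),
  };
}
```

Only jobs listed in `needs` are included, so a downstream job never receives outputs from jobs it does not depend on.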

## Browser protocol (Platform to dashboard)

The Platform tier exposes a `/ws/browser` WebSocket endpoint for dashboard clients. It carries authentication, log subscription, streaming, and gap messages, run / job / step status updates, and the `run.event` / `job.context` messages used by the Summary tab.

## See also

- [Architecture overview](overview.md) -- three-tier model and component responsibilities
- [Protocol messages](protocol-messages.md) -- WebSocket message schemas
- [Event system internals](./webhooks/event-system.md) -- event router, registration model, cron scheduler
- [State machine](./execution/state-machine.md) -- job execution state transitions
- [Webhook delivery](./webhooks/webhook-delivery.md) -- detailed webhook processing pipeline
- [Operator: dependency caching](../operator/dependency-caching.md) -- configuration guide
- [Operator: monitoring & tracing](../operator/observability/monitoring.md) -- trace fields and Loki queries
- [Operator: event routing & generic webhooks](../operator/event-routing.md) -- generic source setup and trust management
- [SDK reference: output chaining](../user/sdk/core.md#output-chaining) -- user-facing output chaining API

---

## Architecture overview

Source: https://docs.kici.dev/architecture/overview/

KiCI uses a three-tier relay model that separates webhook routing from code execution. Customer code never leaves customer infrastructure -- the Platform tier handles only webhook verification and routing, while the orchestrator and agent tiers run on customer-managed servers.

## Three-tier relay model

The system is organized into three deployment tiers connected by WebSocket channels:

```mermaid
flowchart LR
    GH["GitHub"]
    PLATFORM["Platform\nWebhook router"]
    ORCH_A["Orchestrator A\nExecution brain"]
    ORCH_B["Orchestrator B\nExecution brain"]
    AGENT_A["Agent\n(x64)"]
    AGENT_B["Agent\n(arm64)"]

    GH -- "HTTP\n(webhooks)" --> PLATFORM
    PLATFORM <-- "WebSocket\n(relay + telemetry)" --> ORCH_A
    PLATFORM <-- "WebSocket\n(relay + telemetry)" --> ORCH_B
    ORCH_A <-- "WebSocket P2P\n(reroute + progress\n+ Raft)" --> ORCH_B
    ORCH_A <-- "WebSocket\n(dispatch + status)" --> AGENT_A
    ORCH_B <-- "WebSocket\n(dispatch + status)" --> AGENT_B
    ORCH_A -- "GitHub API\n(lock file, checks)" --> GH
    AGENT_A -- "git clone" --> GH
    AGENT_B -- "git clone" --> GH
```

**Why three tiers?** Trust boundaries. The Platform relay never sees customer code -- it only verifies webhook signatures and forwards payloads. The orchestrator matches triggers against the lock file without cloning repositories. Only the agent, running on customer infrastructure, clones code and executes steps.

This model also enables fully self-hosted deployment: all three tiers can run on customer infrastructure, with the Platform tier receiving webhooks directly from GitHub.

## Component responsibilities

### Platform

The Platform tier is a thin webhook router. It verifies inbound webhook signatures, relays the payload to the correct orchestrator over WebSocket, and writes a delivery row to its event log. It does not process, store, or execute any customer code. Multi-tenant SaaS specifics (Stripe-driven billing, dashboard surface, identity provider integration) are documented in the internal docs.

### Orchestrator (`@kici-dev/orchestrator`)

The orchestrator is the execution brain. It decides what to run and dispatches work to agents.

- **Trigger matching** -- Evaluates lock file triggers against webhook payloads to determine which jobs to run. Uses branch, path, and event matching via picomatch.
- **Lock file caching** -- Fetches `kici.lock.json` from the configured provider's API (GitHub, generic webhook, universal-git, or internal). An LRU cache wraps the per-provider fetcher, keyed by `{provider}:{repo}:{ref}` so cross-provider fallback resolutions stay isolated.
- **Agent registry** -- Tracks connected agents with label-based routing for job dispatch.
- **Job queue** -- PostgreSQL-backed FIFO queue for reliable dispatch.
- **Webhook pipeline** -- Dedup, event mapping, lock file fetch, trigger matching, and job dispatch in a single pipeline.
- **Multi-orchestrator clustering** -- Optional peer-to-peer coordination via direct WebSocket connections. Enables cross-architecture job routing (e.g., x64 coordinator reroutes arm64 jobs to a peer), high availability, and dedicated coordinator topologies. Uses Raft consensus for leader election (orphan recovery). See [Multi-Orchestrator Architecture](./clustering/multi-orchestrator.md).
- **Auto-scaler** -- Optional pluggable module for ephemeral agent provisioning. Supports containers (Docker/Podman), bare-metal processes, and Firecracker microVMs as backends. Spawns agents on demand when no matching agent is connected, with label-based routing, two-level capacity limits (global + per-backend), warm pools, YAML configuration (`scalers.d/` directory support), and SIGHUP reload. Disabled by default -- orchestrator works without it.
- **Independent database** -- Has its own PostgreSQL database separate from the Platform. Stores execution runs/jobs/steps, dispatch queue, webhook secrets, dedup cache, and scaler state. The orchestrator's `execution_runs` and `execution_jobs` are the authoritative source of truth. The Platform receives execution status updates via WebSocket messages (`execution.status`, `job.status.forward`).

> Source: `packages/orchestrator/src/pipeline/processor.ts` (webhook pipeline), `packages/orchestrator/src/cluster/` (P2P coordination), `packages/orchestrator/src/scaler/` (auto-scaler module), `packages/orchestrator/src/server.ts` (Platform/hybrid entry point)
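
The lock file cache keying described above can be sketched with a small LRU built on `Map` insertion order. `LockFileCache`, its capacity, and the synchronous fetcher are assumptions for illustration. The real cache wraps per-provider API calls.

```typescript
type Fetcher = (provider: string, repo: string, ref: string) => string;

// Minimal LRU on top of Map insertion order; not the orchestrator's cache.
class LockFileCache {
  private entries = new Map<string, string>();
  private capacity: number;
  private fetcher: Fetcher;

  constructor(capacity: number, fetcher: Fetcher) {
    this.capacity = capacity;
    this.fetcher = fetcher;
  }

  get(provider: string, repo: string, ref: string): string {
    // Provider is part of the key, so the same repo and ref fetched through
    // different providers never share an entry.
    const key = `${provider}:${repo}:${ref}`;
    const hit = this.entries.get(key);
    if (hit !== undefined) {
      this.entries.delete(key); // re-insert to mark as most recently used
      this.entries.set(key, hit);
      return hit;
    }
    const value = this.fetcher(provider, repo, ref);
    this.entries.set(key, value);
    if (this.entries.size > this.capacity) {
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
    return value;
  }
}
```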

### Agent (`@kici-dev/agent`)

The agent is the execution worker. It runs on customer infrastructure and has full access to customer code.

- **Repository cloning** -- Clones the target repo with token-based auth (token in HTTP headers, not URLs, to prevent leakage).
- **Step execution** -- Runs steps sequentially with full `StepContext` (zx shell, logger, environment, workflow/job metadata).
- **Docker support** -- Container-based step execution via `docker exec` for isolated environments.
- **Log streaming** -- Chunked log streaming back to the orchestrator with configurable size limits.
- **Graceful shutdown** -- SIGTERM with 10s grace period, SIGUSR1 for drain mode.

> Source: `packages/agent/src/execution/job-runner.ts` (job lifecycle), `packages/agent/src/server.ts` (entry point)

## Supporting packages

### `@kici-dev/engine`

Shared business logic used by all three tiers. Single source of truth for cross-tier concerns. Has no internal `@kici-dev/*` dependencies (only zod, picomatch, and jsonpath-plus).

- Protocol message schemas (Zod-based, direction-specific unions including dashboard REST-over-WS, browser live streaming, test run lifecycle, observer channel, log pull, run events, peer-to-peer, cluster join, and source registration)
- Provider interfaces (WebhookNormalizer, LockFileFetcher, ChangedFilesFetcher, CloneTokenProvider, RepoUrlBuilder, ContributorResolver, CheckStatusPoster)
- Trigger matching engine (branch, path, event evaluation)
- Execution state machine (11 states, 16 events, pure functions)
- Webhook signature verification (HMAC-SHA256, timing-safe)
- WebSocket close codes (unified across all tiers)
- WebSocket rate limiting (WsRateLimiter)
- Environment allowlist (safe env var filtering)
- Secrets management (secret context resolution)
- Environment model (scoped secrets, env merge, protection gates)
- Label utilities (platform label derivation, runsOn normalization, `kici:*` reserved namespace, role labels)
- Audit policy and retention (per-action access-log sampling, warm-retention windows for cold-store eligibility, federated activity row schema)
- Scaler backend type enum (`container`, `bare-metal`, `firecracker`, `kubernetes`)
- Registration trigger type enum (registerable trigger discriminator)
- Bundler config (shared bundler configuration consumed by `e2e/helpers/service-deploy.ts`; the agent runtime uses the `@kici-dev/core/ts-loader-hook` to transform TypeScript on import, with no runtime bundler step)

> Source: `packages/engine/src/`
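
As an illustration of the HMAC-SHA256, timing-safe verification listed above, the sketch below checks a GitHub-style `sha256=<hex>` signature header using `node:crypto`. The function name `verifySignature` is hypothetical, and the engine's real implementation may differ in shape.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Verifies a "sha256=<hex>" signature header against the raw request body.
function verifySignature(secret: string, rawBody: string, header: string): boolean {
  const expected =
    "sha256=" + createHmac("sha256", secret).update(rawBody).digest("hex");
  const expectedBytes = Buffer.from(expected);
  const receivedBytes = Buffer.from(header);
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  return (
    expectedBytes.length === receivedBytes.length &&
    timingSafeEqual(expectedBytes, receivedBytes)
  );
}
```

The signature must be computed over the raw body bytes as received. Re-serializing parsed JSON can change whitespace or key order and break the comparison.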

### `@kici-dev/sdk`

User-facing SDK for defining workflows in TypeScript. Provides factory functions (`workflow()`, `job()`, `step()`), trigger builders (`pr()`, `push()`), rules (`rule()`, `skip()`), matrix utilities, and DAG validation.

> Source: `packages/sdk/src/`

### `@kici-dev/compiler`

CLI tooling for workflow authors. Compiles `.kici/workflows/*.ts` to `.kici/kici.lock.json`, provides watch mode, local test execution, project initialization, and pre-commit hook integration.

> Source: `packages/compiler/src/`

### `@kici-dev/core`

Light shared utilities with no server-side dependencies -- JSON-structured logging, error helpers, human-readable formatting (`formatBytes`/`formatDuration`/`formatUptime`), cryptographic helpers (`sha256`/`sha256File`/`deriveSharedSecret`), zx initialization (`initZx()`), and the TypeScript loader hook that transforms TypeScript on import. It is the dependency-light core that the SDK, compiler, and `kici` CLI consume directly so they stay free of heavier server-only dependencies. `@kici-dev/shared` re-exports it, so existing `@kici-dev/shared` import paths keep working.

> Source: `packages/core/src/`

### `@kici-dev/shared`

Shared utilities used across packages, including everything from `@kici-dev/core` (re-exported) plus server-side helpers.

- `initZx()` -- zx initialization
- `createLogger()` -- JSON-structured logging with TTY-aware formatting
- `createPool()` / `createDb()` -- typed PostgreSQL connections
- `createMetricsRoutes()` / `createHealthRoutes()` -- HTTP route factories (Prometheus metrics and health endpoints)
- `RingBuffer` -- bounded collections
- `requestContext` / `getRequestContext()` / `enrichRequestContext()` -- async local storage request context
- `getReconnectDelay()` -- exponential backoff
- `formatBytes` / `formatDuration` / `formatUptime` -- human-readable formatting
- `sha256` / `sha256File` / `deriveSharedSecret` -- cryptographic utilities
- `initTelemetry` / `createMeter` -- OpenTelemetry integration
- `setupGracefulShutdown` -- coordinated service shutdown with ordered steps

> Source: `packages/shared/src/`

### Dashboard

Web UI for KiCI. A browser single-page application that provides the operator dashboard with execution run listing, run detail views, real-time log streaming, settings management, and keyboard shortcut support. Authenticates via OIDC against the identity provider and communicates with the Platform REST-over-WebSocket API.

### `kici` (wrapper)

Unscoped wrapper package that provides the `kici` CLI command. Re-exports `@kici-dev/compiler/cli` so users can install `kici` globally or use it via `npx kici`.

> Source: `packages/kici/`

### `kici-admin` (admin CLI wrapper)

Unscoped wrapper package that provides the `kici-admin` CLI command. Re-exports `@kici-dev/orchestrator/cli` for orchestrator administration tasks.

> Source: `packages/kici-admin/`

## Package dependency graph

The following diagram shows how `@kici-dev/*` packages depend on each other. Solid arrows are direct dependencies; dashed arrows are peer or dev dependencies (labeled).

```mermaid
flowchart TD
    CORE["@kici-dev/core"]
    SDK["@kici-dev/sdk"]
    COMPILER["@kici-dev/compiler"]
    SHARED["@kici-dev/shared"]
    ENGINE["@kici-dev/engine"]
    PLATFORM["Platform"]
    ORCH["@kici-dev/orchestrator"]
    AGENT["@kici-dev/agent"]
    DASH["Dashboard"]

    DASH --> ENGINE
    DASH -.->|dev| PLATFORM
    SHARED --> CORE
    SDK --> ENGINE
    SDK --> CORE
    COMPILER --> ENGINE
    COMPILER --> CORE
    COMPILER -.->|peer| SDK
    PLATFORM --> ENGINE
    PLATFORM --> SHARED
    ORCH --> ENGINE
    ORCH --> SHARED
    ORCH -.->|dev| AGENT
    AGENT --> ENGINE
    AGENT --> SDK
    AGENT --> SHARED
    AGENT --> CORE
    KICI["kici (wrapper)"]
    KICI --> COMPILER
    KICI --> CORE
    KICIADMIN["kici-admin (admin CLI)"]
    KICIADMIN --> ORCH
    KICIADMIN --> AGENT
```

**Leaf packages** (no `@kici-dev/*` dependencies): `@kici-dev/core` and `@kici-dev/engine`. These can be tested and built independently. `@kici-dev/shared` builds on `@kici-dev/core` and re-exports it. The dashboard depends on `@kici-dev/engine` for shared types (protocol schemas, state machine) and imports the Platform's API type definitions as a dev dependency; at runtime it talks to backend services over HTTP/WebSocket rather than importing their code.

**Runtime tiers** (Platform, orchestrator, agent) all depend on `@kici-dev/engine` for shared business logic and `@kici-dev/shared` for utilities. Of the three, only the agent depends on `@kici-dev/sdk` (it loads workflow definitions at runtime).

## Connection overview

KiCI uses three WebSocket layers for real-time communication.

### Platform ↔ Orchestrator

The orchestrator connects outbound to the Platform WebSocket endpoint. After authentication (API key validated via SHA-256 hash lookup), the connection is used for webhook relay, execution telemetry (events, status, logs), source registration, and peer discovery. The Platform can also relay `job.reroute` messages between orchestrators that cannot reach each other directly.
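
The SHA-256 hash lookup mentioned above can be sketched as follows. The in-memory table and the names `registerApiKey` and `authenticateOrchestrator` are hypothetical. The point is only that the stored value is the hash of the key, and validation hashes the presented key before the lookup.

```typescript
import { createHash } from "node:crypto";

const hashKey = (apiKey: string): string =>
  createHash("sha256").update(apiKey).digest("hex");

// Hypothetical table: only the hash of each key is stored, never the key itself.
const orchestratorsByKeyHash = new Map<string, string>();

function registerApiKey(apiKey: string, orchestratorId: string): void {
  orchestratorsByKeyHash.set(hashKey(apiKey), orchestratorId);
}

// Returns the orchestrator id for a valid key, or undefined to reject.
function authenticateOrchestrator(presentedKey: string): string | undefined {
  return orchestratorsByKeyHash.get(hashKey(presentedKey));
}
```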

### Orchestrator ↔ Orchestrator (P2P)

When multiple orchestrators are deployed, they establish direct WebSocket connections to each other on the `/ws/peer` endpoint. Peers are discovered via the Platform matchmaker (Platform/hybrid modes) or static configuration (`KICI_CLUSTER_PEERS` env var, independent mode). Connections are authenticated with a mutual pre-shared key (PSK). Traffic includes agent inventory heartbeats, job rerouting, progress reporting, cancel propagation, and Raft leader election. These messages never transit the Platform tier.

> See [Multi-Orchestrator Architecture](./clustering/multi-orchestrator.md) for clustering details and [Protocol Messages](protocol/dashboard.md#orchestrator-orchestrator-messages-peer-to-peer) for message schemas.

### Orchestrator ↔ Agent

The agent connects outbound to the orchestrator WebSocket endpoint. After registration (agent ID, labels, concurrency), the connection is used for job dispatch, status reporting, and log streaming.

> See [Protocol Messages](protocol-messages.md) and [Webhook Delivery](./webhooks/webhook-delivery.md) for detailed message flows and schemas.

## Authentication and multi-tenancy

KiCI uses application-level tenant isolation. The Platform dashboard API accepts three authentication methods (PATs, API keys, JWTs) and enforces org membership on every `/api/v1/orgs/:customerId/*` request.

## See also

- [Multi-Orchestrator Architecture](./clustering/multi-orchestrator.md) -- P2P clustering, Raft consensus, job rerouting
- [State Machine](./execution/state-machine.md) -- execution lifecycle tracking across all tiers
- [Protocol Messages](protocol-messages.md) -- WebSocket message schemas for all three layers
- [Webhook Delivery](./webhooks/webhook-delivery.md) -- end-to-end trace of a webhook through all three tiers

---
