# Architecture Overview (Production)

This document describes the hardened production architecture for
`agent-learning-compounder`, with explicit trust boundaries and operational
contracts for runtime hooks, cron integration, and state.

## 1) System Goals

- Produce durable, evidence-backed agent context without persisting raw prompts,
  raw tool output, transcript chunks, or raw network traffic.
- Keep hook/event telemetry bounded, and queryable at bounded cost, for
  skill-health and remediation decisions.
- Make repo onboarding reproducible by standardizing what is written, where it is
  written, and how it is validated.
- Preserve local security posture: do not leak absolute paths, and do not write
  to mutable config without explicit operator approval.
- Keep automation opt-in: a manifest declares the task, and scheduling requires
  an explicit operator action.

## 2) Trust Boundaries

- **Untrusted input**: Repo instruction files, transcript exports, runtime hook
  events, and prior session artifacts can contain adversarial content.
- **Trust boundary A (runtime package)**: Installed scripts and references in
  `<skill-root>/agent-learning-compounder` are trusted only through checksum /
  source provenance checks plus local readback checks.
- **Trust boundary B (repo-local integration)**: `.agent-learning.json`,
  `.agents/skills/...` and `.claude/...` integrations are repo-specific and may
  be modified by users; installers refuse tracked-path writes.
- **Trust boundary C (runtime state)**: Telemetry and refresh artifacts are write
  sinks that must enforce regular-file checks, bounded schema shape, and
  secret-safe normalization.
- **Trust boundary D (operator)**: Scheduler registration, hook runtime apply,
  and durable automation commands are operator decisions and must remain explicit.

## 3) Component Map

- **Installer layer**
  - `install.sh`: installs package copy under target skills root and can run
    package smoke checks.
- **Repo bootstrap layer**
  - `scripts/init_learning_system.py`: creates repo integration, state,
    refresh manifest, and optional hook integration.
- **Ingestion layer**
  - `scripts/build_repo_baseline.py` and `scripts/map_active_skills.py` produce
    baseline context and active skill inventories.
  - `scripts/extract_sessions.py` / `scripts/distill_learning.py` convert
    session logs into compact reports.
- **Hook layer**
  - `scripts/collect_hook_event.py` appends normalized hook events.
  - `scripts/install_runtime_hooks.py` writes runtime adapters after explicit `--apply`.
  - `bin/agent_dispatch.py` owns the shared dispatch field policy used by
    collector, runtime adapters, and MCP reporting.
- **Refresh layer**
  - `scripts/refresh_learning_state.py` is the intended periodic task, represented
    in a manifest.
- **Health/load surfaces**
  - `reports/latest-approved-gates.md`
  - `reports/latest-skill-context.md`

## 4) State Topology

### 4.1 Config root

For production hardening, the intended state root is:

```text
<repo>/.agent-learning
```

This keeps all per-repo state colocated and avoids cross-repo root-leak
ambiguity.

`init_learning_system.py` also supports explicit overrides via:

- `--state-dir`
- `AGENT_LEARNING_STATE_DIR`
- `--personal`

### 4.2 Derived paths

State is split by deterministic repository identity:

```text
<state-root>/repos/<repo-id>/hooks/*
<state-root>/repos/<repo-id>/reports/*
<state-root>/repos/<repo-id>/hook-events.jsonl
<state-root>/repos/<repo-id>/improvement-queue.jsonl
<repo>/.agent-learning.json
```
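
As a minimal sketch of this layout, the helper below builds the derived paths from a state root and a repository id. It assumes the `<repo-id>` has already been computed elsewhere (this document does not specify how); `derived_paths` and its dictionary keys are hypothetical names used only for illustration.

```python
from pathlib import Path


def derived_paths(state_root: Path, repo_id: str, repo: Path) -> dict[str, Path]:
    """Return the per-repo layout listed above (illustrative only)."""
    # Refuse ids that could escape <state-root>/repos/ when joined.
    if not repo_id or repo_id in {".", ".."} or "/" in repo_id or "\\" in repo_id:
        raise ValueError(f"unsafe repo id: {repo_id!r}")
    base = state_root / "repos" / repo_id
    return {
        "hooks_dir": base / "hooks",
        "reports_dir": base / "reports",
        "hook_events": base / "hook-events.jsonl",
        "improvement_queue": base / "improvement-queue.jsonl",
        # The integration payload lives in the repo itself, not the state root.
        "integration": repo / ".agent-learning.json",
    }
```

Rejecting path separators in the id is a design choice of this sketch: it keeps every derived path under `<state-root>/repos/<repo-id>` regardless of how the id was produced.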

### 4.3 Repo-local artifacts (bootstrap minimum)

- `.agent-learning.json` (integration payload and canonical outputs)
- `<state-root>/repos/<repo-id>/config.json` (`state_version`, retention, runtime markers)
- `<state-root>/repos/<repo-id>/baseline.json`
- `<state-root>/repos/<repo-id>/domain-rules.active.json`
- `<state-root>/repos/<repo-id>/skill-map.json`
- `<state-root>/repos/<repo-id>/reports/latest-approved-gates.md`
- `<state-root>/repos/<repo-id>/reports/latest-skill-context.md`
- `<state-root>/repos/<repo-id>/automation/agent-learning-refresh.manifest.json`
- `<state-root>/repos/<repo-id>/hooks/collect-agent-learning-event.sh`
- `<state-root>/repos/<repo-id>/hooks/agent-learning-hooks.manifest.json`
- `<state-root>/repos/<repo-id>/hook-events.jsonl`

## 5) Runtime Hook Contract

Hook integration is a two-step contract:

1. The repo bootstrap creates a repo-scoped event wrapper and manifest.
2. Runtime integration is added only with explicit `--apply` and only after
   dry-run review.

The shared wrapper expects a single JSON object on stdin and forwards it to the
shared collector. Runtime-specific installer paths:

- Codex: `.codex/hooks.json`
- Claude: `.claude/settings.local.json`

The runtime adapter contract requires:

- event normalization + repo association,
- bounded field set (`ts`, `event`, `runtime`, `repo`, `skill`, `tool`, `outcome`,
  `path`, `command_class`, plus short tags),
- agent-dispatch aliases normalized only through the shared dispatch policy,
- no raw transcript persistence,
- explicit command validation before runtime execution.
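
The bounded field set can be illustrated with a small allowlist filter. This is a sketch under stated assumptions, not the collector's actual code: the `tags` key, the size limits, and the `normalize_event` name are hypothetical, and dispatch alias handling (owned by the shared dispatch policy) is deliberately left out.

```python
from typing import Any

# Field names taken from the contract above.
ALLOWED_FIELDS = (
    "ts", "event", "runtime", "repo", "skill",
    "tool", "outcome", "path", "command_class",
)
MAX_VALUE_LEN = 256  # assumed limit, not from the source
MAX_TAGS = 8         # assumed limit, not from the source
MAX_TAG_LEN = 32     # assumed limit, not from the source


def normalize_event(raw: dict[str, Any]) -> dict[str, Any]:
    """Keep only allowlisted scalar fields; drop everything else."""
    event: dict[str, Any] = {}
    for key in ALLOWED_FIELDS:
        value = raw.get(key)
        if isinstance(value, str):
            event[key] = value[:MAX_VALUE_LEN]
        elif isinstance(value, (int, float)) and not isinstance(value, bool):
            event[key] = value
        # Nested objects, lists, and None are dropped rather than serialized.
    tags = raw.get("tags")  # hypothetical key for the "short tags"
    if isinstance(tags, list):
        short = [t[:MAX_TAG_LEN] for t in tags if isinstance(t, str)]
        event["tags"] = short[:MAX_TAGS]
    return event
```

Because the filter copies from an allowlist instead of deleting from a denylist, an unexpected key such as a raw prompt never reaches the persisted record.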

## 6) Runtime Adapter Matrix

| Runtime | Default Config Target | Installer Mode | Event Source | Notes |
| --- | --- | --- | --- | --- |
| codex | `.codex/hooks.json` | `--runtime codex` | Codex CLI hook lifecycle | default scope is repo-local |
| claude | `.claude/settings.local.json` | `--runtime claude` | Claude settings hooks | default scope is repo-local |

Both runtimes share the same manifest-driven adapter command and wrapper target.

### Scheduler semantics

`agent-learning-refresh.manifest.json` is a **declaration only**.
It does not register anything with cron, systemd, launchd, or any other
scheduler. Operators must explicitly register the manifest command with their
scheduler of choice.
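
To make the declaration-only behavior concrete, the sketch below reads a refresh manifest and renders its command as a shell-quoted string for an operator to review and register by hand. It never touches a scheduler. The `command` field name and the `render_refresh_command` helper are assumptions for illustration; the source only says the manifest carries a command tuple.

```python
import json
import shlex
from pathlib import Path


def render_refresh_command(manifest_path: Path) -> str:
    """Return the manifest command as a shell-quoted line; registers nothing."""
    manifest = json.loads(manifest_path.read_text(encoding="utf-8"))
    command = manifest["command"]  # hypothetical field name
    is_argv = (
        isinstance(command, list)
        and len(command) > 0
        and all(isinstance(arg, str) for arg in command)
    )
    if not is_argv:
        raise ValueError("manifest command must be a non-empty list of strings")
    # shlex.join quotes each argument so the line is safe to paste into a shell.
    return shlex.join(command)
```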

## 7) Config and Schema Versioning

- `.agent-learning.json`:
  - `state_version` (currently 1)
  - canonical pointers to reports and integration state.
- `<state-root>/repos/<repo-id>/config.json`:
  - `state_version` (currently 1), retention, telemetry flags, and state root metadata.
- `agent-learning-hooks.manifest.json`:
  - `schema_version` (currently 1), hook command, manifest event allowlist, telemetry defaults.
- `agent-learning-refresh.manifest.json`:
  - `schema_version` (currently 1), command tuple, outputs, install note.

Every manifest-bearing object must validate as JSON, with all required fields
present, before it is written.
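
A validate-before-write step might look like the following sketch. Only `schema_version` and its current value of 1 come from this document; the `serialize_manifest` helper and any other field names passed in `required` are hypothetical.

```python
import json

SUPPORTED_SCHEMA_VERSION = 1  # "currently 1" per the list above


def serialize_manifest(manifest: dict, required: tuple[str, ...]) -> str:
    """Validate a manifest and return the JSON text that is safe to write."""
    missing = [key for key in required if key not in manifest]
    if missing:
        raise ValueError(f"manifest missing required fields: {missing}")
    if manifest.get("schema_version") != SUPPORTED_SCHEMA_VERSION:
        raise ValueError("unsupported schema_version")
    # json.dumps raises TypeError if the object is not JSON-serializable.
    text = json.dumps(manifest, indent=2, sort_keys=True)
    # Round-trip once so the caller only ever writes text that parses back.
    json.loads(text)
    return text + "\n"
```

Returning the serialized text, rather than writing inside the validator, keeps the file-safety checks (regular file, no symlink destination) in one separate place.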

## 8) Extensibility

- **New runtime adapter**:
  - add runtime path handling in installer,
  - define event mapping,
  - enforce command integrity,
  - add regression coverage for dry-run/apply behavior and command validation.
- **New transcript source**:
  - add a source extractor entry in source-adapters,
  - preserve bounded normalization and redaction policies.
- **New output destination**:
  - add a manifest schema and readback validation in onboarding and refresh.

## 9) Error Handling

- Installer refuses to write tracked files (repo-local config and runtime configs).
- Runtime adapter rejects:
  - symlink commands,
  - non-regular files in execution path,
  - manifest/command mismatch,
  - commands outside expected state root.
- Telemetry writes reject symlink destinations.
- Hook ingestion normalizes input but persists only allowed fields.
- Init can fail fast via `--self-test`, reporting missing files explicitly.
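
The runtime adapter rejections listed above could be sketched as follows. This is an illustrative check, not the adapter's implementation: `validate_hook_command` is a hypothetical name, the manifest command is assumed to be stored as a path string, and only the final path component is tested for being a symlink (parent symlinks are covered indirectly by the resolved containment check).

```python
import os
import stat
from pathlib import Path


def validate_hook_command(command: Path, manifest_command: str, state_root: Path) -> Path:
    """Raise ValueError for any of the rejected cases; return the resolved path."""
    if str(command) != manifest_command:
        raise ValueError("manifest/command mismatch")
    # lstat inspects the path itself and does not follow symlinks.
    info = os.lstat(command)
    if stat.S_ISLNK(info.st_mode):
        raise ValueError("command is a symlink")
    if not stat.S_ISREG(info.st_mode):
        raise ValueError("command is not a regular file")
    resolved = command.resolve()
    # Path.is_relative_to requires Python 3.9 or newer.
    if not resolved.is_relative_to(state_root.resolve()):
        raise ValueError("command is outside the expected state root")
    return resolved
```

A missing command file surfaces as `FileNotFoundError` from `os.lstat`, which a caller may want to report separately from the policy violations.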

## 10) Onboarding Runbook

1. Install package to the chosen root.
2. Run repo bootstrap with state in `<repo>/.agent-learning`.
3. Confirm health contract: `.agent-learning.json`, baseline artifacts,
   manifest files, and approved outputs exist.
4. Review hook plan with `--dry-run`, then apply only with explicit `--apply`.
5. Register the refresh manifest command with a scheduler only when automation
   is explicitly desired.
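
Step 3 of the runbook can be approximated with a small existence check. This is a sketch, not the behavior of `--self-test`: the file list is a subset of the bootstrap artifacts in section 4.3, and `missing_artifacts` is a hypothetical helper that takes the repo root and the already-derived `<state-root>/repos/<repo-id>` directory.

```python
from pathlib import Path

# Relative to <state-root>/repos/<repo-id>; subset of section 4.3.
REQUIRED_STATE_FILES = (
    "config.json",
    "baseline.json",
    "skill-map.json",
    "reports/latest-approved-gates.md",
    "reports/latest-skill-context.md",
    "automation/agent-learning-refresh.manifest.json",
    "hooks/agent-learning-hooks.manifest.json",
)


def missing_artifacts(repo: Path, repo_state: Path) -> list[str]:
    """Return the expected files that are absent or not regular files."""
    expected = [repo / ".agent-learning.json"]
    expected += [repo_state / rel for rel in REQUIRED_STATE_FILES]
    return [str(path) for path in expected if not path.is_file()]
```

An empty list means the checked artifacts exist; it says nothing about whether their contents validate, which remains a separate schema check.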
