Flag & Observability Conventions
One-page operator reference for the cross-vertical conventions every vertical in the consumer-agent stack inherits. Source: spec PR #300.
Flag naming grammar
Every feature flag follows:

```
ai_assistant_<domain>_<purpose>[_<modifier>]
```

- <domain>: one of the enumerated vertical names (shop, rewards, play, ereceipts, offerdetails, restaurant, retailer, support), a platform component (orchestrator, gateway, worker, relay), a special scope (pool, infra, user), or a sub-agent id (subagent_<id>).
- <purpose>: one of experiment, pool, killswitch, enabled, ramp, test.
- <modifier> (optional): version (v3), variant, or platform constraint.
The validator picks the last recognized purpose token, so ai_assistant_rewards_v3_experiment_killswitch parses as purpose=killswitch, domain=rewards_v3_experiment.
Enforced by: scripts/lint_flag_names.py (wired into make lint + pre-commit). Legacy consumer_agent_* flags are grandfathered via LEGACY_FLAG_NAMES in src/consumer_agent/platform/flag_naming.py.
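The "last recognized purpose token" rule can be illustrated with a short sketch. This is not the code in flag_naming.py: the names parse_flag_name and ParsedFlag are hypothetical, and the sketch deliberately skips checking the domain against the enumerated list.

```python
from typing import NamedTuple, Optional

PREFIX = "ai_assistant_"
# Purpose tokens as listed in the grammar above.
PURPOSES = frozenset({"experiment", "pool", "killswitch", "enabled", "ramp", "test"})


class ParsedFlag(NamedTuple):
    domain: str
    purpose: str
    modifier: Optional[str]


def parse_flag_name(name: str) -> ParsedFlag:
    """Split a flag name on the LAST recognized purpose token (illustrative only)."""
    if not name.startswith(PREFIX):
        raise ValueError(f"{name!r} does not start with {PREFIX!r}")
    tokens = name[len(PREFIX):].split("_")
    # Scan right to left; index 0 is reserved so the domain is never empty.
    for i in range(len(tokens) - 1, 0, -1):
        if tokens[i] in PURPOSES:
            domain = "_".join(tokens[:i])
            modifier = "_".join(tokens[i + 1:]) or None
            return ParsedFlag(domain, tokens[i], modifier)
    raise ValueError(f"{name!r} contains no recognized purpose token")
```

Under these assumptions, parse_flag_name("ai_assistant_rewards_v3_experiment_killswitch") yields domain rewards_v3_experiment and purpose killswitch, matching the example above.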
Lifecycle (five computed states)
created → ramping → fully-rolled-out → eligible-for-removal → removed

State is computed from Flipper API + code-grep + flip history, never authored. A flag at 100% for ≥30 days with zero churn moves to eligible-for-removal. Cleanup PRs follow the PLT-553 / PLT-554 working example.
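As a rough illustration of the eligible-for-removal rule, the sketch below assumes "zero churn" means no recorded flips inside the 30-day window and that flip history is available as a list of timestamps. The function name and input shape are hypothetical, not the real lifecycle computation.

```python
from datetime import datetime, timedelta
from typing import Sequence

MIN_STABLE_DAYS = 30  # the "≥30 days" threshold from the rule above


def is_eligible_for_removal(
    rollout_percent: int,
    flip_times: Sequence[datetime],
    now: datetime,
) -> bool:
    """Return True when a flag has sat at 100% with no flips for 30+ days.

    flip_times is assumed to hold the timestamp of every recorded state
    change, including the flip that brought the flag to 100%.
    """
    if rollout_percent != 100:
        return False
    if not flip_times:
        # Without history there is no evidence of how long the flag was stable.
        return False
    window_start = now - timedelta(days=MIN_STABLE_DAYS)
    # The most recent flip must be at or before the start of the window.
    return max(flip_times) <= window_start
```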
Kill-switch convention
Every production-impacting flag must have a paired flag with the _killswitch suffix in the same enum. Example:
- Cohort/test: ai_assistant_rewards_v3_experiment
- Paired killswitch: ai_assistant_rewards_v3_experiment_killswitch
The killswitch is operator-flippable (no engineering approval needed at flip time). It defaults to ON; flipping it OFF drops the feature to the previous stable state at the next factory invocation.
Internal-only flags (FORCE_* env-var overrides, debug toggles) are exempt — register them in INTERNAL_ONLY_FLAGS in flag_naming.py with a one-line comment.
Enforced by: scripts/lint_kill_switch_presence.py.
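The pairing rule can be pictured with a minimal check like the one below. It is a sketch only, not the logic of lint_kill_switch_presence.py: missing_killswitches is a hypothetical helper, and it treats every flag outside the internal-only set as production-impacting.

```python
from typing import FrozenSet, Iterable, List

KILLSWITCH_SUFFIX = "_killswitch"


def missing_killswitches(
    flags: Iterable[str],
    internal_only: FrozenSet[str] = frozenset(),
) -> List[str]:
    """List flags that have no paired <name>_killswitch flag in the same set."""
    known = set(flags)
    missing = []
    for name in sorted(known):
        # Killswitches themselves and exempt internal flags need no pair.
        if name.endswith(KILLSWITCH_SUFFIX) or name in internal_only:
            continue
        if name + KILLSWITCH_SUFFIX not in known:
            missing.append(name)
    return missing
```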
Parent/child pool-test pattern
```
ai_assistant_user_pool (platform-owned)
├── ai_assistant_rewards_v3_test (rewards-owned)
├── ai_assistant_shop_pricing_test (shop-owned)
└── ai_assistant_restaurant_v1_test (restaurant-owned)
```

Cohort key is user_id. Sticky assignment via Feature Flipper. Multiple test flags hang off one pool without coordination.
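One plausible reading of the parent/child relationship is that a child test flag only applies to users who are already in the parent pool. The sketch below assumes that reading. The is_enabled callable is a hypothetical stand-in for a Feature Flipper lookup keyed by user_id, not a real client API.

```python
from typing import Callable

POOL_FLAG = "ai_assistant_user_pool"

# Hypothetical lookup signature: (flag_name, user_id) -> enabled?
FlagCheck = Callable[[str, str], bool]


def in_test(test_flag: str, user_id: str, is_enabled: FlagCheck) -> bool:
    """A user is in a child test only if they are also in the parent pool."""
    return is_enabled(POOL_FLAG, user_id) and is_enabled(test_flag, user_id)
```

Because each child test is evaluated independently against the same pool, adding another test flag does not require changes to the existing ones.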
Required Grafana panels per vertical
Every vertical with production traffic has:
- Engagement — turns/episode, attach rate, intent-completion, p50/p95 latency.
- Judge scores — per-judge score distribution from PC5; threshold lines visible.
- Kill-switch state — current state of every kill-switch scoped to the vertical; flips in the last 7 days.
Verticals that emit DM types additionally have:
- DM funnel (per PLT-629): Targeted → Sent → Delivered → Displayed → Tapped → CTA; sliced by dm_type and experiment_arm.
Templates live in PLT-628’s implementation repo; per-vertical dashboards inherit them.
Five mandatory slicing dimensions
Every per-traffic metric label, span attribute, and PS5 event-store column carries:
| Dimension | Source | Null when |
|---|---|---|
| vertical | consumer_agent.platform.verticals | Never null; platform for cross-vertical orchestrator/gateway |
| sub_agent_id | PC1 Agent Definition id | Non-agent emissions (e.g., gateway routing) |
| dm_type | PD3 DM type id | Chat-flow traffic |
| experiment_arm | PC6 cohort assignment | No experiment active |
| agent_definition_version | PC1 + PF1 | Non-Agent-Definition emissions |
Import from consumer_agent.platform.slicing_dimensions:
```python
from consumer_agent.platform.slicing_dimensions import (
    VERTICAL,
    SUB_AGENT_ID,
    DM_TYPE,
    EXPERIMENT_ARM,
    AGENT_DEFINITION_VERSION,
)
```

Enforced by: scripts/lint_metric_dimensions.py. Existing sites that pre-date the convention are in LEGACY_METRIC_SITES; they shrink toward zero as code is touched. Sites that are genuinely not per-traffic opt out with a # convention: not-per-traffic line comment.
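A small helper can make it harder to forget one of the five dimensions. The sketch below is illustrative only: slicing_labels is a hypothetical function, and the string values assigned to the constants are assumed stand-ins so the example runs without the real module.

```python
from typing import Dict, Optional

# Stand-ins for the constants in consumer_agent.platform.slicing_dimensions;
# the string values here are assumptions made for this example.
VERTICAL = "vertical"
SUB_AGENT_ID = "sub_agent_id"
DM_TYPE = "dm_type"
EXPERIMENT_ARM = "experiment_arm"
AGENT_DEFINITION_VERSION = "agent_definition_version"


def slicing_labels(
    vertical: str,
    sub_agent_id: Optional[str] = None,
    dm_type: Optional[str] = None,
    experiment_arm: Optional[str] = None,
    agent_definition_version: Optional[str] = None,
) -> Dict[str, Optional[str]]:
    """Build a label dict that always carries all five dimension keys."""
    if not vertical:
        # vertical is never null; cross-vertical emitters pass "platform".
        raise ValueError("vertical must be set")
    return {
        VERTICAL: vertical,
        SUB_AGENT_ID: sub_agent_id,
        DM_TYPE: dm_type,
        EXPERIMENT_ARM: experiment_arm,
        AGENT_DEFINITION_VERSION: agent_definition_version,
    }
```

For gateway routing, slicing_labels("platform") returns all five keys with only vertical populated, which mirrors the "Null when" column in the table.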
Cardinality denylist
Raw user_id, request_id, principal, and session_id are forbidden as label keys on metric/span emissions (FR-8). For debug-trail attribution use hashed_user_id():
```python
from consumer_agent.platform.hashing import hashed_user_id

counter.add(1, {
    VERTICAL: "rewards",
    # ...
    "user_hash": hashed_user_id(user_id),  # bounded-cardinality
})
```

Even hashed identifiers shouldn't appear on metric labels; keep them to trace attributes, where operational cardinality bounds can be enforced.
Enforced by: scripts/lint_metric_cardinality.py.
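At its simplest, the denylist amounts to a membership test on label keys. The sketch below shows that check at runtime. It is not the implementation of lint_metric_cardinality.py, which may well work on source code rather than live dicts, and forbidden_keys is a hypothetical name.

```python
from typing import Iterable, List

# Keys named in the denylist above.
FORBIDDEN_LABEL_KEYS = frozenset({"user_id", "request_id", "principal", "session_id"})


def forbidden_keys(label_keys: Iterable[str]) -> List[str]:
    """Return the denylisted keys present in a set of label keys, sorted."""
    return sorted(key for key in label_keys if key in FORBIDDEN_LABEL_KEYS)
```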
Required alerts per vertical
Wired to Rootly (per PLT-628 alert-routing track):
| Alert | Condition | Severity |
|---|---|---|
| Safety/refusal judge floor breach | eval.judge.score_distribution{judge_classification=safety} < floor | page |
| Quality judge regression | same, quality < baseline - tolerance | notify |
| Kill-switch flipped OFF | flag state change observed | page |
| Engagement drop | engagement metric drops > threshold% vs baseline | notify |
| DM funnel anomaly (DM verticals only) | tap rate or CTA rate deviates from baseline by > threshold% | notify |
Noise budget: ≤ 1 page per on-call week at steady state.
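The two judge-score conditions in the table reduce to simple comparisons. The sketch below shows them as plain predicates. The function names are hypothetical, and where floor, baseline, and tolerance come from is left open because the table does not say.

```python
from typing import List, Tuple


def safety_floor_breached(score: float, floor: float) -> bool:
    """Safety/refusal judge alert: score fell below the floor (severity: page)."""
    return score < floor


def quality_regressed(score: float, baseline: float, tolerance: float) -> bool:
    """Quality judge alert: score < baseline - tolerance (severity: notify)."""
    return score < baseline - tolerance


def judge_alerts(
    safety_score: float,
    safety_floor: float,
    quality_score: float,
    quality_baseline: float,
    quality_tolerance: float,
) -> List[Tuple[str, str]]:
    """Return (alert name, severity) pairs for the conditions that fire."""
    fired = []
    if safety_floor_breached(safety_score, safety_floor):
        fired.append(("Safety/refusal judge floor breach", "page"))
    if quality_regressed(quality_score, quality_baseline, quality_tolerance):
        fired.append(("Quality judge regression", "notify"))
    return fired
```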
Onboarding a new vertical
See .claude/skills/onboarding-a-vertical-to-conventions/SKILL.md for the five-step sequence. PF5 (Vertical Scaffolding + Validation) invokes it.
Where the code lives
| Surface | Path |
|---|---|
| Domain enum | src/consumer_agent/platform/verticals.py |
| Dimension constants | src/consumer_agent/platform/slicing_dimensions.py |
| Flag-name validator | src/consumer_agent/platform/flag_naming.py |
| Legacy allow-lists | src/consumer_agent/platform/legacy_metric_sites.py |
| Hashing helper | src/consumer_agent/platform/hashing.py |
| Lint scripts | scripts/lint_flag_names.py, lint_kill_switch_presence.py, lint_metric_dimensions.py, lint_metric_cardinality.py |
Go-side parity lives at internal/platform/ in consumer-graph-worker, rover-agent, and consumer-context-service.