Noma Agent Safety Proof

/app/examples/agent-plan.noma

PASS

Patch result: applied

Validation: ok → ok

Lines preserved: 99.4%

Operations: 1

Pre SHA: 99057ceb

Post SHA: 98ded78c

Source bytes: 5648 → 5648

Write allowed: yes
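
The figures above are consistent with a one-line, same-length edit: 160 of 161 lines unchanged rounds to 99.4%, and `proposed` and `accepted` are both eight bytes, so the byte count does not move. A minimal sketch of how such a summary could be computed is shown here. The function names, the choice of SHA-256, and the 8-character truncation are assumptions for illustration, not Noma's actual implementation.

```python
import difflib
import hashlib


def short_sha(text: str, length: int = 8) -> str:
    """Hex digest prefix of the UTF-8 bytes (SHA-256 is an assumption)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()[:length]


def lines_preserved(before: str, after: str) -> float:
    """Percentage of original lines that survive unchanged in the patched text."""
    old, new = before.splitlines(), after.splitlines()
    if not old:
        return 100.0
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    kept = sum(block.size for block in matcher.get_matching_blocks())
    return 100.0 * kept / len(old)


def summarize(before: str, after: str) -> dict:
    """Collect the hypothetical summary fields for one simulated patch."""
    return {
        "pre_sha": short_sha(before),
        "post_sha": short_sha(after),
        "source_bytes": (len(before.encode("utf-8")), len(after.encode("utf-8"))),
        "lines_preserved": round(lines_preserved(before, after), 1),
    }
```

Comparing the pre and post digests also gives a cheap way to detect that the file changed on disk between simulation and write, which is one plausible input to a "write allowed" decision.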

Agent Loop

1. Read: 161 source lines

2. Discover IDs: 21 canonical IDs

3. Export Context: 5882 LLM chars

4. Simulate Patch: applied

5. Validate: ok

6. Preview: sandboxed artifact

Patch Operations

#	Op	Target	Payload
1	`update_attribute`	decision-q3-direction	{ "op": "update_attribute", "id": "decision-q3-direction", "key": "status", "value": "accepted" }
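
The operation above addresses a block by ID and changes a single attribute, leaving every other byte alone. The sketch below shows one way such an `update_attribute` payload could be applied to source text. It assumes directive openers sit on one line in the form `::name{id="..." key="value"}`, as in the Source Diff; `apply_update_attribute` is a hypothetical helper, not Noma's patch engine.

```python
import re


def apply_update_attribute(source: str, op: dict) -> str:
    """Rewrite one attribute on the directive whose id matches op["id"].

    Illustrative only: handles single-line directive openers and
    double-quoted attribute values.
    """
    if op.get("op") != "update_attribute":
        raise ValueError(f"unsupported op: {op.get('op')!r}")

    id_pattern = re.compile(rf'(?<![\w-])id="{re.escape(op["id"])}"')
    key_pattern = re.compile(rf'(?<![\w-]){re.escape(op["key"])}="[^"]*"')
    new_attr = f'{op["key"]}="{op["value"]}"'

    lines = source.split("\n")  # split/join on "\n" keeps a trailing newline intact
    for i, line in enumerate(lines):
        if line.startswith("::") and id_pattern.search(line):
            if key_pattern.search(line):
                # Replace the existing value; a callable avoids escape handling in the value.
                lines[i] = key_pattern.sub(lambda _m: new_attr, line, count=1)
            else:
                # Attribute absent: append it just before the closing brace.
                lines[i] = line.rstrip("}") + f" {new_attr}" + "}"
            return "\n".join(lines)
    raise KeyError(f"no directive with id {op['id']!r}")
```

Raising on an unknown ID rather than silently returning the input keeps a simulated patch from being reported as applied when it matched nothing.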

Pre-Validation

No issues found.

Post-Validation

No issues found.

ID Registry

ID	Type	Line	Title / aliases
`q3-roadmap-decision`	section	8	Q3 Roadmap Decision
`options-at-a-glance`	section	17	Options at a glance
`decision`	section	33	Decision
`decision-q3-direction`	directive:decision	35
`decision-matrix`	section	40	Decision matrix
`claims-and-evidence`	section	50	Claims and evidence
`claim-research-wedge`	directive:claim	52
`claim-docs-too-slow`	directive:claim	64
`risks`	section	75	Risks
`risk-narrow-icp`	directive:risk	77
`risk-format-not-sticky`	directive:risk	83
`risk-llm-export-quality`	directive:risk	89
`timeline`	section	94	Timeline
`open-questions`	section	114	Open questions
`oq-pricing-model`	directive:open_question	116
`oq-pdf-engine`	directive:open_question	121
`agent-tasks`	section	126	Agent tasks
`task-validate-claim-research-wedge`	directive:agent_task	128
`task-watch-stale-evidence`	directive:agent_task	134
`task-export-as-review-prompt`	directive:agent_task	140
`export`	section	145	Export
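
The registry above mixes two kinds of entries: section IDs that look like slugs of the heading text, and directive IDs taken from an explicit `id` attribute. A rough sketch of that discovery pass follows. The slug rule and the regular expressions are assumptions inferred from the table, and the sketch ignores frontmatter and fenced code, which a real parser would need to skip.

```python
import re

HEADING = re.compile(r"^(#{1,6})\s+(.*\S)\s*$")
DIRECTIVE = re.compile(r'^::([A-Za-z_][\w-]*)\{[^}]*?(?<![\w-])id="([^"]+)"')


def slugify(title: str) -> str:
    """Lowercase the title and collapse non-alphanumeric runs to single hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


def discover_ids(source: str) -> list[tuple[str, str, int, str]]:
    """Return (id, type, 1-based line, title) rows in document order."""
    rows = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        if m := HEADING.match(line):
            rows.append((slugify(m.group(2)), "section", lineno, m.group(2)))
        elif m := DIRECTIVE.match(line):
            # Directives without an id attribute (e.g. evidence blocks) are skipped.
            rows.append((m.group(2), f"directive:{m.group(1)}", lineno, ""))
    return rows
```

For example, the heading `## Options at a glance` would yield the row `("options-at-a-glance", "section", <line>, "Options at a glance")` under this slug rule.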

Source Diff

 
 ## Decision
 
-::decision{id="decision-q3-direction" status="proposed"}
+::decision{id="decision-q3-direction" status="accepted"}
 Start with **Option B — Research Workflows**. Narrowest wedge, fastest signal,
 keeps the door open to A and C as adjacent expansions.
 ::
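
A diff in the shape shown above (context lines prefixed with a space, one `-` line, one `+` line) can be produced with Python's standard `difflib`. This is a sketch of the general technique, assuming the report simply omits file headers and hunk markers; `source_diff` is a hypothetical name.

```python
import difflib


def source_diff(before: str, after: str, context: int = 3) -> str:
    """Unified diff of two source strings, without file headers or hunk markers."""
    lines = list(
        difflib.unified_diff(
            before.splitlines(), after.splitlines(), lineterm="", n=context
        )
    )
    # The first two lines are the '---' / '+++' file headers; '@@' lines mark hunks.
    body = [line for line in lines[2:] if not line.startswith("@@")]
    return "\n".join(body)
```

Because every emitted line keeps its one-character prefix, a blank source line appears in the output as a single space, which matches the near-empty first line of the diff above.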

LLM Context Used For Agent Work

# Agent Planning Artifact — Q3 Roadmap Decision
# Q3 Roadmap Decision  [#q3-roadmap-decision]

[SUMMARY]
Three candidate directions for next quarter. This document captures the
options, trade-offs, risks, and timeline as structured blocks so an agent
can revisit and update each section independently — and so the recommendation
can be exported as a prompt for a follow-on review pass.

[/SUMMARY]

## Options at a glance  [#options-at-a-glance]

[GRID columns=3 min="14rem" gap="0.9rem" wide=true]
[CARD title="A · Docs Platform" icon="docs"]
Build a hosted publishing target for Noma. Highest revenue ceiling, longest path to value.

[/CARD]

[CARD title="B · Research Workflows" icon="search"]
Lean into claims/evidence/risk blocks for analyst teams. Narrow ICP, fastest to first paying customer.

[/CARD]

[CARD title="C · General Reports" icon="report"]
Position Noma as the default format for AI-generated reports across domains. Broadest TAM, weakest wedge.

[/CARD]

[/GRID]

## Decision  [#decision]

[DECISION id="decision-q3-direction" status="proposed"]
Start with Option B — Research Workflows. Narrowest wedge, fastest signal,
keeps the door open to A and C as adjacent expansions.

[/DECISION]

## Decision matrix  [#decision-matrix]

| Dimension             | A · Docs | B · Research | C · Reports |
| --------------------- | -------- | ------------ | ----------- |
| Time to first revenue | 6–9 mo   | 6–10 wk      | 4–6 mo      |
| Wedge sharpness       | Medium   | High         | Low         |
| Existing block fit    | Strong   | Native       | Medium      |
| Defensibility         | Network  | Workflow     | Brand only  |
| 18-month revenue cap  | High     | Medium       | High        |

## Claims and evidence  [#claims-and-evidence]

[CLAIM id="claim-research-wedge" confidence=0.74]
Research and analyst teams are the sharpest wedge for Noma because their
existing tools (Word, Notion, Confluence) lack first-class claim/evidence/risk
primitives, and they already structure documents this way mentally.

[/CLAIM]

[EVIDENCE for="claim-research-wedge" source="user-interviews-apr-2026"]
Of 11 analyst-team interviews in April, 9 described their current workflow as
"copy-paste claims into a doc and hope someone catches stale ones." All 9 said
they would pay for a tool that flagged stale evidence automatically.

[/EVIDENCE]

[CLAIM id="claim-docs-too-slow" confidence=0.68]
A docs platform is the higher revenue ceiling, but time-to-revenue is too long
to be the wedge. Better as a follow-on once Noma has format adoption.

[/CLAIM]

[EVIDENCE for="claim-docs-too-slow" source="docs-platform-benchmark-2026"]
Comparable docs-platform launches (Mintlify, GitBook) took 12–18 months to
reach $100k ARR; research-tool launches (Mem, Reflect) hit it in 6–9 months
with a tighter ICP.

[/EVIDENCE]

## Risks  [#risks]

[RISK id="risk-narrow-icp" severity="medium" owner="ferax564"]
Research-team ICP is small (~3k orgs globally). Even high conversion caps the
business below docs-platform scale. Mitigation: use research wedge to drive
format adoption, then expand to docs.

[/RISK]

[RISK id="risk-format-not-sticky" severity="high" owner="ferax564"]
If teams don't keep editing in Noma after their first artifact, the workflow value
disappears. Mitigation: ship the agent patch protocol in week 3 so updates
flow back into source automatically.

[/RISK]

[RISK id="risk-llm-export-quality" severity="low" owner="ferax564"]
LLM export quality determines whether agents trust Noma source as canonical.
Easy to verify, easy to fix. Tracked separately.

[/RISK]

## Timeline  [#timeline]

[GRID columns=4 min="12rem" compact=true wide=true]
[CARD title="Wk 1 · Format"]
Parser, AST, frontmatter, JSON export, basic validation.

[/CARD]

[CARD title="Wk 2 · Artifact"]
HTML renderer, default theme, cards/grids/tabs/charts, mobile.

[/CARD]

[CARD title="Wk 3 · Agent"]
LLM export, patch protocol, copy-as-prompt buttons.

[/CARD]

[CARD title="Wk 4 · Launch"]
3 demos, README, spec, comparison page, OSS release.

[/CARD]

[/GRID]

## Open questions  [#open-questions]

[OPEN_QUESTION id="oq-pricing-model"]
Per-seat, per-document, or per-render? Research-team workflows favor per-seat;
agent-driven artifacts favor per-render. Decide before week 3.

[/OPEN_QUESTION]

[OPEN_QUESTION id="oq-pdf-engine"]
Keep Puppeteer as the report-PDF path or add Typst for longer books? Puppeteer
now covers first-class PDFs; Typst may still be worth evaluating for book output.

[/OPEN_QUESTION]

## Agent tasks  [#agent-tasks]

[AGENT_TASK id="task-validate-claim-research-wedge"]
Re-run the interview tally each month. If claim-research-wedge evidence base
drops below 8 of 11 supporting interviews (or new interviews contradict),
lower the claim's confidence attribute and add a counterevidence block.

[/AGENT_TASK]

[AGENT_TASK id="task-watch-stale-evidence"]
Every two weeks, scan evidence blocks for source attributes older than
60 days. Flag any whose underlying source has changed materially. Do not
auto-edit — propose a replace_block patch for human approval.

[/AGENT_TASK]

[AGENT_TASK id="task-export-as-review-prompt"]
On request, package this document's decision, top three claims, and all
risks of severity ≥ medium into an LLM prompt for a second-opinion review.

[/AGENT_TASK]

## Export  [#export]

[EXPORT_BUTTON format="prompt" target="decision-q3-direction"]
Label: Copy decision + risks as a review prompt

[/EXPORT_BUTTON]

[EXPORT_BUTTON format="markdown" target="summary"]
Label: Copy summary as Markdown

[/EXPORT_BUTTON]

[EXPORT_BUTTON format="json" target="document"]
Label: Copy full document AST

[/EXPORT_BUTTON]

> The point of this artifact is not the prose — it's that an agent can re-open
> it next month, walk the decision/claim/risk graph, and update only the parts
> that changed. Everything else stays put, and the Git diff stays clean.
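
The `task-export-as-review-prompt` task in the context above asks for all risks of severity ≥ medium. A small sketch of that selection step over the bracketed LLM export is given below. The ordering low < medium < high and the tag grammar are assumptions read off the `[RISK ...]` lines, and `risks_at_or_above` is a hypothetical helper.

```python
import re

SEVERITY_RANK = {"low": 0, "medium": 1, "high": 2}  # assumed ordering
RISK_TAG = re.compile(r"^\[RISK\s+([^\]]*)\]\s*$")
ATTR = re.compile(r'([\w-]+)=(?:"([^"]*)"|(\S+))')


def parse_attrs(text: str) -> dict:
    """Parse key="value" and bare key=value pairs from a tag's attribute text."""
    attrs = {}
    for m in ATTR.finditer(text):
        attrs[m.group(1)] = m.group(2) if m.group(2) is not None else m.group(3)
    return attrs


def risks_at_or_above(context: str, minimum: str = "medium") -> list[str]:
    """IDs of [RISK ...] blocks whose severity is at least `minimum`."""
    floor = SEVERITY_RANK[minimum]
    selected = []
    for line in context.splitlines():
        m = RISK_TAG.match(line)
        if not m:
            continue
        attrs = parse_attrs(m.group(1))
        # Unknown or missing severities rank below every known level.
        if SEVERITY_RANK.get(attrs.get("severity", ""), -1) >= floor:
            selected.append(attrs.get("id", ""))
    return selected
```

Applied to the three risk blocks in this export, the selection would keep `risk-narrow-icp` and `risk-format-not-sticky` and leave out `risk-llm-export-quality`.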