Noma Agent Safety Proof

/app/examples/agent-plan.noma

PASS
Patch result: applied
Validation: ok → ok
Lines preserved: 99.4%
Operations: 1
Pre SHA: 99057ceb
Post SHA: 98ded78c
Source bytes: 5648 → 5648
Write allowed: yes
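
The summary figures above can be reproduced from the pre-patch and post-patch source text. The following is a minimal sketch, not the tool's actual implementation: the function names are hypothetical, the report does not say which hash produces the 8-character SHA values (SHA-256 truncated to 8 hex characters is assumed here), and "lines preserved" is assumed to mean lines that are identical at the same position.

```python
import hashlib


def short_sha(text: str, length: int = 8) -> str:
    """First `length` hex chars of a SHA-256 digest (hash choice is an assumption)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()[:length]


def lines_preserved(pre: str, post: str) -> float:
    """Percentage of pre-patch lines left unchanged at the same position.

    Position-based comparison fits an in-place attribute update; a patch
    that inserts or deletes lines would need a real diff instead.
    """
    pre_lines = pre.splitlines()
    post_lines = post.splitlines()
    if not pre_lines:
        return 100.0
    same = sum(1 for a, b in zip(pre_lines, post_lines) if a == b)
    return round(100.0 * same / len(pre_lines), 1)


def summarize(pre: str, post: str) -> dict:
    """Collect the figures shown in the report summary."""
    return {
        "lines_preserved": lines_preserved(pre, post),
        "pre_sha": short_sha(pre),
        "post_sha": short_sha(post),
        "source_bytes": (len(pre.encode("utf-8")), len(post.encode("utf-8"))),
    }
```

With 161 source lines and one changed line, this gives 160 / 161, which rounds to the 99.4% shown above. Replacing "proposed" with "accepted" swaps eight characters for eight, which is why the byte count stays at 5648.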

Agent Loop

1. Read: 161 source lines
2. Discover IDs: 21 canonical IDs
3. Export Context: 5882 LLM chars
4. Simulate Patch: applied
5. Validate: ok
6. Preview: sandboxed artifact
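
Steps 4 and 5 of the loop above (simulate, then validate, then decide whether a write is allowed) could be sketched as follows. This is an illustration under stated assumptions, not Noma's API: `simulate_patch` is a hypothetical name, the patch applier and validator are passed in as plain callables, and the rule "write only if the patch applied and post-validation is clean" is an assumed reading of the report's "Write allowed" field.

```python
from typing import Callable, Dict, List


def simulate_patch(
    source: str,
    ops: List[dict],
    apply_op: Callable[[str, dict], str],
    validate: Callable[[str], List[str]],
) -> Dict[str, object]:
    """Apply ops to an in-memory copy and report whether writing looks safe.

    `apply_op` and `validate` are caller-supplied stand-ins (hypothetical);
    `validate` returns a list of issue strings, empty when the text is ok.
    """
    pre_issues = validate(source)
    candidate = source
    try:
        for op in ops:
            candidate = apply_op(candidate, op)
        patch_result = "applied"
    except (KeyError, ValueError) as exc:
        # Nothing is written on failure: fall back to the original text.
        candidate = source
        patch_result = f"rejected: {exc}"
    post_issues = validate(candidate)
    return {
        "patch_result": patch_result,
        "validation": (
            "ok" if not pre_issues else "issues",
            "ok" if not post_issues else "issues",
        ),
        "operations": len(ops),
        # Assumed gate: the patch applied and left no validation issues.
        "write_allowed": patch_result == "applied" and not post_issues,
        "candidate": candidate,
    }
```

The source file is never touched inside this function. The caller writes `candidate` to disk only when `write_allowed` is true, which keeps a failed or invalid patch from reaching the source.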

Patch Operations

Operation 1: update_attribute
Target: decision-q3-direction
Payload:
{
  "op": "update_attribute",
  "id": "decision-q3-direction",
  "key": "status",
  "value": "accepted"
}
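
To make the effect of this payload concrete, here is a minimal sketch of applying an `update_attribute` operation to source text. It assumes the directive syntax visible in the source diff (`::name{id="..." key="..."}` on a single line) and is not the actual patch engine. The function name and error behaviour are hypothetical, and attribute values containing double quotes are not handled.

```python
import re


def update_attribute(source: str, op: dict) -> str:
    """Rewrite one attribute on the directive line whose id matches op["id"]."""
    if op.get("op") != "update_attribute":
        raise ValueError(f"unsupported op: {op.get('op')!r}")
    id_pattern = re.compile(r'\bid="' + re.escape(op["id"]) + r'"')
    attr_pattern = re.compile(r"\b" + re.escape(op["key"]) + r'="[^"]*"')
    new_attr = f'{op["key"]}="{op["value"]}"'

    # Split on "\n" (not splitlines) so a trailing newline survives the join.
    lines = source.split("\n")
    for i, line in enumerate(lines):
        if line.startswith("::") and id_pattern.search(line):
            if not attr_pattern.search(line):
                raise ValueError(f"attribute {op['key']!r} not present on {op['id']!r}")
            # A callable replacement avoids backslash escapes in the new value.
            lines[i] = attr_pattern.sub(lambda _m: new_attr, line, count=1)
            return "\n".join(lines)
    raise KeyError(op["id"])
```

Because only the matched attribute on one line is rewritten, every other line of the source is returned byte for byte, which is the property the report's "Lines preserved" figure measures.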

Pre-Validation

No issues found.

Post-Validation

No issues found.

ID Registry

| ID                                 | Type                    | Line | Title / aliases     |
| ---------------------------------- | ----------------------- | ---- | ------------------- |
| q3-roadmap-decision                | section                 | 8    | Q3 Roadmap Decision |
| options-at-a-glance                | section                 | 17   | Options at a glance |
| decision                           | section                 | 33   | Decision            |
| decision-q3-direction              | directive:decision      | 35   |                     |
| decision-matrix                    | section                 | 40   | Decision matrix     |
| claims-and-evidence                | section                 | 50   | Claims and evidence |
| claim-research-wedge               | directive:claim         | 52   |                     |
| claim-docs-too-slow                | directive:claim         | 64   |                     |
| risks                              | section                 | 75   | Risks               |
| risk-narrow-icp                    | directive:risk          | 77   |                     |
| risk-format-not-sticky             | directive:risk          | 83   |                     |
| risk-llm-export-quality            | directive:risk          | 89   |                     |
| timeline                           | section                 | 94   | Timeline            |
| open-questions                     | section                 | 114  | Open questions      |
| oq-pricing-model                   | directive:open_question | 116  |                     |
| oq-pdf-engine                      | directive:open_question | 121  |                     |
| agent-tasks                        | section                 | 126  | Agent tasks         |
| task-validate-claim-research-wedge | directive:agent_task    | 128  |                     |
| task-watch-stale-evidence          | directive:agent_task    | 134  |                     |
| task-export-as-review-prompt       | directive:agent_task    | 140  |                     |
| export                             | section                 | 145  | Export              |
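
A registry like the one above could be built with a single pass over the source. The sketch below is an assumption-laden illustration rather than Noma's parser: it treats Markdown-style `#` headings as sections with slug IDs, treats `::name{... id="..."}` lines as directives, and ignores cases such as headings inside code fences. All names (`slugify`, `discover_ids`) are hypothetical.

```python
import re

HEADING = re.compile(r"^(#{1,6})\s+(.*\S)\s*$")
DIRECTIVE = re.compile(r"^::([A-Za-z_][\w-]*)\{([^}]*)\}")
ID_ATTR = re.compile(r'\bid="([^"]+)"')


def slugify(title: str) -> str:
    """Lowercase the title and join alphanumeric runs with hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


def discover_ids(source: str) -> list:
    """Return one registry entry per heading and per directive that has an id."""
    registry = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        heading = HEADING.match(line)
        if heading:
            title = heading.group(2)
            registry.append(
                {"id": slugify(title), "type": "section", "line": lineno, "title": title}
            )
            continue
        directive = DIRECTIVE.match(line)
        if directive:
            found = ID_ATTR.search(directive.group(2))
            if found:
                registry.append(
                    {
                        "id": found.group(1),
                        "type": f"directive:{directive.group(1)}",
                        "line": lineno,
                        "title": "",
                    }
                )
    return registry
```

Line numbers are 1-based, matching the Line column, so a patch operation can resolve an ID such as `decision-q3-direction` to a source location before touching the file.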

Source Diff

 
 ## Decision
 
-::decision{id="decision-q3-direction" status="proposed"}
+::decision{id="decision-q3-direction" status="accepted"}
 Start with **Option B — Research Workflows**. Narrowest wedge, fastest signal,
 keeps the door open to A and C as adjacent expansions.
 ::
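
A diff of this shape can be generated from the pre-patch and post-patch text with Python's standard `difflib`. This is a sketch of one way to do it, not necessarily how the report was produced: the function name and the `pre`/`post` file labels are hypothetical, and `unified_diff` also emits `---`, `+++` and `@@` header lines that the report above omits.

```python
import difflib


def source_diff(pre: str, post: str, context: int = 3) -> str:
    """Unified diff of two source texts with `context` unchanged lines per side."""
    diff_lines = difflib.unified_diff(
        pre.splitlines(),
        post.splitlines(),
        fromfile="pre",
        tofile="post",
        n=context,
        lineterm="",
    )
    return "\n".join(diff_lines)


def changed_line_count(diff_text: str) -> int:
    """Count removed lines, skipping the '---' file header."""
    return sum(
        1
        for line in diff_text.splitlines()
        if line.startswith("-") and not line.startswith("---")
    )
```

For a single `update_attribute` operation, `changed_line_count` should return 1, which gives a cheap check that the patch touched only the targeted directive.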

LLM Context Used For Agent Work

# Agent Planning Artifact — Q3 Roadmap Decision
# Q3 Roadmap Decision  [#q3-roadmap-decision]

[SUMMARY]
Three candidate directions for next quarter. This document captures the
options, trade-offs, risks, and timeline as structured blocks so an agent
can revisit and update each section independently — and so the recommendation
can be exported as a prompt for a follow-on review pass.

[/SUMMARY]

## Options at a glance  [#options-at-a-glance]

[GRID columns=3 min="14rem" gap="0.9rem" wide=true]
[CARD title="A · Docs Platform" icon="docs"]
Build a hosted publishing target for Noma. Highest revenue ceiling, longest path to value.

[/CARD]

[CARD title="B · Research Workflows" icon="search"]
Lean into claims/evidence/risk blocks for analyst teams. Narrow ICP, fastest to first paying customer.

[/CARD]

[CARD title="C · General Reports" icon="report"]
Position Noma as the default format for AI-generated reports across domains. Broadest TAM, weakest wedge.

[/CARD]

[/GRID]

## Decision  [#decision]

[DECISION id="decision-q3-direction" status="proposed"]
Start with Option B — Research Workflows. Narrowest wedge, fastest signal,
keeps the door open to A and C as adjacent expansions.

[/DECISION]

## Decision matrix  [#decision-matrix]

| Dimension             | A · Docs | B · Research | C · Reports |
| --------------------- | -------- | ------------ | ----------- |
| Time to first revenue | 6–9 mo   | 6–10 wk      | 4–6 mo      |
| Wedge sharpness       | Medium   | High         | Low         |
| Existing block fit    | Strong   | Native       | Medium      |
| Defensibility         | Network  | Workflow     | Brand only  |
| 18-month revenue cap  | High     | Medium       | High        |

## Claims and evidence  [#claims-and-evidence]

[CLAIM id="claim-research-wedge" confidence=0.74]
Research and analyst teams are the sharpest wedge for Noma because their
existing tools (Word, Notion, Confluence) lack first-class claim/evidence/risk
primitives, and they already structure documents this way mentally.

[/CLAIM]

[EVIDENCE for="claim-research-wedge" source="user-interviews-apr-2026"]
Of 11 analyst-team interviews in April, 9 described their current workflow as
"copy-paste claims into a doc and hope someone catches stale ones." All 9 said
they would pay for a tool that flagged stale evidence automatically.

[/EVIDENCE]

[CLAIM id="claim-docs-too-slow" confidence=0.68]
A docs platform is the higher revenue ceiling, but time-to-revenue is too long
to be the wedge. Better as a follow-on once Noma has format adoption.

[/CLAIM]

[EVIDENCE for="claim-docs-too-slow" source="docs-platform-benchmark-2026"]
Comparable docs-platform launches (Mintlify, GitBook) took 12–18 months to
reach $100k ARR; research-tool launches (Mem, Reflect) hit it in 6–9 months
with a tighter ICP.

[/EVIDENCE]

## Risks  [#risks]

[RISK id="risk-narrow-icp" severity="medium" owner="ferax564"]
Research-team ICP is small (~3k orgs globally). Even high conversion caps the
business below docs-platform scale. Mitigation: use research wedge to drive
format adoption, then expand to docs.

[/RISK]

[RISK id="risk-format-not-sticky" severity="high" owner="ferax564"]
If teams don't keep editing in Noma after first artifact, the workflow value
disappears. Mitigation: ship the agent patch protocol in week 3 so updates
flow back into source automatically.

[/RISK]

[RISK id="risk-llm-export-quality" severity="low" owner="ferax564"]
LLM export quality determines whether agents trust Noma source as canonical.
Easy to verify, easy to fix. Tracked separately.

[/RISK]

## Timeline  [#timeline]

[GRID columns=4 min="12rem" compact=true wide=true]
[CARD title="Wk 1 · Format"]
Parser, AST, frontmatter, JSON export, basic validation.

[/CARD]

[CARD title="Wk 2 · Artifact"]
HTML renderer, default theme, cards/grids/tabs/charts, mobile.

[/CARD]

[CARD title="Wk 3 · Agent"]
LLM export, patch protocol, copy-as-prompt buttons.

[/CARD]

[CARD title="Wk 4 · Launch"]
3 demos, README, spec, comparison page, OSS release.

[/CARD]

[/GRID]

## Open questions  [#open-questions]

[OPEN_QUESTION id="oq-pricing-model"]
Per-seat, per-document, or per-render? Research-team workflows favor per-seat;
agent-driven artifacts favor per-render. Decide before week 3.

[/OPEN_QUESTION]

[OPEN_QUESTION id="oq-pdf-engine"]
Keep Puppeteer as the report-PDF path or add Typst for longer books? Puppeteer
now covers first-class PDFs; Typst may still be worth evaluating for book output.

[/OPEN_QUESTION]

## Agent tasks  [#agent-tasks]

[AGENT_TASK id="task-validate-claim-research-wedge"]
Re-run the interview tally each month. If claim-research-wedge evidence base
drops below 8 of 11 supporting interviews (or new interviews contradict),
lower the claim's confidence attribute and add a counterevidence block.

[/AGENT_TASK]

[AGENT_TASK id="task-watch-stale-evidence"]
Every two weeks, scan evidence blocks for source attributes older than
60 days. Flag any whose underlying source has changed materially. Do not
auto-edit — propose a replace_block patch for human approval.

[/AGENT_TASK]

[AGENT_TASK id="task-export-as-review-prompt"]
On request, package this document's decision, top three claims, and all
risks of severity ≥ medium into an LLM prompt for a second-opinion review.

[/AGENT_TASK]

## Export  [#export]

[EXPORT_BUTTON format="prompt" target="decision-q3-direction"]
Label: Copy decision + risks as a review prompt

[/EXPORT_BUTTON]

[EXPORT_BUTTON format="markdown" target="summary"]
Label: Copy summary as Markdown

[/EXPORT_BUTTON]

[EXPORT_BUTTON format="json" target="document"]
Label: Copy full document AST

[/EXPORT_BUTTON]

> The point of this artifact is not the prose — it's that an agent can re-open
> it next month, walk the decision/claim/risk graph, and update only the parts
> that changed. Everything else stays put, and the Git diff stays clean.

Post-Patch Artifact Preview