Skip to main content

11. Superiority Program

Goal

Make Seraph superior for a power-user guardian use case, not merely “more capable” in the abstract.

Implementation mirror:

  • docs/implementation/10-superiority-delivery.md owns the shipped-on-develop translation of this program
  • docs/implementation/00-master-roadmap.md owns the live 10-PR queue

That means winning on:

  • durable human modeling
  • intervention quality
  • operator legibility
  • safe real-world execution
  • workflow leverage
  • dense, efficient control surfaces

Where Seraph Can Realistically Win

1. Guardian state, not just session state

Seraph already has the right foundations: soul, goals, vector memory, observer inputs, strategist, proactive delivery, and a first explicit guardian-state layer. The next step is to deepen its quality, confidence, and reuse so the guardian state becomes the default backbone rather than a thin synthesis pass.

2. Intervention quality, not just proactive activity

Seraph already ships an explicit intervention-policy baseline plus a first observer salience/confidence/interruption-cost model. The product should win by deciding better when to act, defer, bundle, or stay silent through stronger calibration, outcome learning, and a real feedback loop.

3. Reliability that is visible and testable

Seraph already has a stronger deterministic runtime-eval story than the reviewed official Hermes materials, and at least clearer documented eval rigor than the reviewed OpenClaw and IronClaw sources. That advantage should expand into broader behavioral contracts, not stop at runtime seams.

Where Seraph Is Currently Behind

1. Primary interface

Seraph’s guardian cockpit is the right product direction, but it is still thinner than the best reference surfaces. OpenClaw, Hermes, and IronClaw still present denser operator-first surfaces with stronger artifact, workflow, and trace control today.

2. Workflow composition

Seraph now has delegation and specialist foundations plus first-class reusable workflow composition. It is still behind on operator-facing workflow control, artifact round-tripping, and broader composition breadth.

3. Execution hardening

Seraph has approvals, tool policy modes, secret redaction, and sandboxed shell paths, but the reviewed OpenClaw and IronClaw materials document stronger isolation and execution-boundary posture.

4. Reach outside the browser

Seraph has a browser surface, WebSocket path, and native daemon foundation, but the reviewed competitors document richer channel and operator reach today.

Program Of Work

Interface and control plane

  • deepen the first guardian cockpit shell into the unquestioned primary workflow surface
  • add stronger linked widgets for state, evidence, artifacts, interventions, workflows, and traces
  • keep a fixed command/composer surface and explicit interrupt/approval controls

Guardian intelligence

  • deepen guardian-state synthesis from the first shipped salience/confidence layer into a richer calibrated backbone
  • evolve intervention policy from a first shipped baseline into a stronger adaptive decision layer
  • capture intervention outcomes and user feedback
  • deepen observer salience and confidence modeling into a stronger learning loop

Runtime and execution

  • deepen provider routing with scoring and broader guardian-flow behavioral evals
  • close safe-execution gaps through stronger policy visibility and isolation hardening

Presence and leverage

  • extend the first native-notification baseline into broader non-browser reach
  • deepen first-class workflow composition into clearer operator-facing workflow control, artifact round-tripping, and broader leverage on top of tools, skills, MCP, and specialists

Proof Of Superiority

Seraph should only claim superiority on an axis when all three are true:

  1. the benchmark doc shows a source-backed Ahead
  2. the shipped implementation docs show the capability on develop
  3. there is a repeatable demo or eval path that makes the claim inspectable

Translation To Delivery

This research program maps directly to the implementation tree, but the live queue should not be duplicated here.

Use:

  • docs/implementation/10-superiority-delivery.md for the current implementation translation
  • docs/implementation/00-master-roadmap.md for the live rolling 10-PR sequence