Sanshar Prototype Details

Sanshar Swarm

Event-driven multi-agent operating lab

Sanshar is my prototype lab for turning frontier models into an operating system of peers: worker, reviewer, witness, verifier, and assistant-style peer. The hard problem is not "can a model answer?" It is whether a system can observe the right event, choose the right tool, act safely, leave proof, and improve without silently learning the wrong thing.

Public sanitized architecture repo

Problem

Long-running assistants drift, miss surface delivery, over-trust summaries, and lose state across sessions.

Built

Manager-Agent-Verifier packet loops, source packets, reason codes, handoffs, postproof, and peer-specific lanes.

Proof

Expected-vs-observed metrics, read-back contracts, cursor state, local canaries, and retickets for repeated failures.

Why It Matters

This maps directly to enterprise Claude adoption: tool use, evals, safety gates, workflow reliability, and customer trust.

Observe: Event surfaces, terminal state, files, voice, images, and peer outputs.
Packetize: Source refs, hashes, claim type, freshness, confidence, and privacy state.
Decide: Dynamic routing across risk, surface, model, verifier, and autonomy dimensions.
Act: Local reversible work first; external or high-risk changes require approval.
Close: Postproof, read-back, reason codes, retickets, and learning candidates.
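
The Act rule above (local reversible work first, approval for anything external or high-risk) can be sketched as a small decision function. This is an illustration only: `ProposedAction`, `Decision`, and the two-level `risk` scale are hypothetical names, not taken from the Sanshar codebase.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    ACT = "act"
    ASK_APPROVAL = "ask_approval"


@dataclass(frozen=True)
class ProposedAction:
    description: str
    is_local: bool       # touches only local state
    is_reversible: bool  # can be undone without lasting side effects
    risk: str            # "low" or "high" (assumed two-level scale)


def decide(action: ProposedAction) -> Decision:
    """Allow local, reversible, low-risk work; route everything else to approval."""
    if action.is_local and action.is_reversible and action.risk == "low":
        return Decision.ACT
    return Decision.ASK_APPROVAL
```

Defaulting to `ASK_APPROVAL` means an action with any unknown or unexpected attribute fails closed rather than open.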

Dynamic Dimensions

Language, modality, surface, user intent, urgency, privacy, trust, resource pressure, model choice, and autonomy level.

Failure Handling

False positives, false negatives, stale context, missed promises, timeouts, and repeated-output misses become reason-coded evidence.

Resource Gates

Memory pressure, rate limits, active agents, latency budget, and tool availability shape whether the system acts or backs off.
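
One way such a gate might look is a function that returns reason codes for backing off, with an empty list meaning the system is clear to act. The field names, thresholds, and reason-code strings below are assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResourceSnapshot:
    memory_used_fraction: float  # 0.0 to 1.0
    rate_limit_remaining: int    # calls left in the current window
    active_agents: int
    expected_latency_s: float
    tool_available: bool


@dataclass(frozen=True)
class ResourceLimits:
    # Hypothetical defaults; real limits would be tuned per deployment.
    max_memory_fraction: float = 0.85
    min_rate_limit_remaining: int = 5
    max_active_agents: int = 4
    latency_budget_s: float = 10.0


def resource_gate(snap: ResourceSnapshot, limits: ResourceLimits = ResourceLimits()) -> list[str]:
    """Return reason codes for backing off; an empty list means clear to act."""
    reasons = []
    if snap.memory_used_fraction > limits.max_memory_fraction:
        reasons.append("MEMORY_PRESSURE")
    if snap.rate_limit_remaining < limits.min_rate_limit_remaining:
        reasons.append("RATE_LIMIT")
    if snap.active_agents > limits.max_active_agents:
        reasons.append("TOO_MANY_AGENTS")
    if snap.expected_latency_s > limits.latency_budget_s:
        reasons.append("LATENCY_BUDGET")
    if not snap.tool_available:
        reasons.append("TOOL_UNAVAILABLE")
    return reasons
```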

Learning Gate

Runtime hypotheses stay separate from durable memory, canon, knowledge graph, and training candidates until verified.
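
A minimal sketch of that separation, assuming a simple key-value model: hypotheses live in a staging store and only move to durable memory once a verifier has confirmed them. `LearningGate`, `propose`, and `promote` are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class LearningGate:
    hypotheses: dict[str, str] = field(default_factory=dict)  # unverified runtime guesses
    durable: dict[str, str] = field(default_factory=dict)     # verified entries only

    def propose(self, key: str, claim: str) -> None:
        """Record a runtime hypothesis without touching durable memory."""
        self.hypotheses[key] = claim

    def promote(self, key: str, verified: bool) -> bool:
        """Move a hypothesis to durable memory only if verification passed."""
        if key not in self.hypotheses:
            raise KeyError(f"no hypothesis named {key!r}")
        if not verified:
            return False  # stays in staging for later review
        self.durable[key] = self.hypotheses.pop(key)
        return True
```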

Why It Matters

Enterprises need agents that know when to act, when to ask, and how to explain the decision afterward.

Signal: Detect what changed and classify the event.
Zoom: Select the smallest fresh context window.
Gate: Check risk, permission, resources, and privacy.
Proof: Compare expected vs observed and close or reticket.
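
The Proof step can be illustrated with a small comparison helper. This sketch assumes expected and observed outcomes are flat dictionaries; the function name and the `"closed"` / `"reticket"` status strings are hypothetical.

```python
def close_or_reticket(expected: dict, observed: dict) -> dict:
    """Compare expected vs observed fields; close on a full match, else reticket."""
    mismatches = {
        key: {"expected": value, "observed": observed.get(key)}
        for key, value in expected.items()
        if key not in observed or observed[key] != value
    }
    status = "closed" if not mismatches else "reticket"
    return {"status": status, "mismatches": mismatches}
```

Returning the mismatching fields alongside the status gives the follow-up ticket concrete evidence to work from.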

Event Capture

Gateway events are used for live state. Bounded REST reads are reserved for catch-up and verification.
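
A bounded catch-up read might look like the sketch below. It assumes a caller-supplied `fetch_after(cursor, limit)` function that returns messages newer than the cursor, oldest first, each with an `"id"` field. That interface, like the default bounds, is hypothetical and not tied to any specific REST API.

```python
from typing import Callable


def catch_up(
    fetch_after: Callable[[int, int], list[dict]],
    cursor: int,
    max_messages: int = 200,
    page_size: int = 50,
) -> tuple[list[dict], int]:
    """Read pages after `cursor`, stopping at `max_messages` or when caught up."""
    collected: list[dict] = []
    while len(collected) < max_messages:
        limit = min(page_size, max_messages - len(collected))
        page = fetch_after(cursor, limit)
        if not page:
            break  # nothing newer than the cursor
        collected.extend(page)
        cursor = page[-1]["id"]  # advance the cursor to the newest message seen
    return collected, cursor
```

The returned cursor can be persisted so the next catch-up resumes where this one stopped.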

Source Packets

Important messages become source refs with timestamp, content hash, attachment metadata, claim type, and confidence.
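
As an illustration, a source packet could be modeled as an immutable record built from the message content. The field names, the choice of SHA-256, and the 0.0 to 1.0 confidence range are assumptions made for this sketch.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class SourcePacket:
    source_ref: str      # e.g. a message or file identifier
    timestamp: str       # ISO 8601, UTC
    content_hash: str    # SHA-256 hex digest of the content
    attachments: tuple   # attachment metadata records
    claim_type: str      # e.g. "observation" or "summary" (assumed labels)
    confidence: float    # 0.0 to 1.0


def packetize(
    source_ref: str,
    content: str,
    claim_type: str,
    confidence: float,
    attachments: tuple = (),
    now: Optional[datetime] = None,
) -> SourcePacket:
    """Bind a piece of content to its source, time, hash, and confidence."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0.0 and 1.0")
    stamp = (now or datetime.now(timezone.utc)).isoformat()
    digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
    return SourcePacket(source_ref, stamp, digest, attachments, claim_type, confidence)
```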

Read-Back

Posts and reactions require read-back before the system claims delivery or attention.
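
A read-back contract can be sketched as "post, then fetch, then compare". The `post` and `fetch` callables stand in for whatever surface client is in use; their signatures and the `READ_BACK_MISMATCH` reason code are assumptions for this example.

```python
from typing import Callable, Optional


def post_with_read_back(
    post: Callable[[str], str],
    fetch: Callable[[str], Optional[dict]],
    text: str,
) -> dict:
    """Post a message, then re-read it before claiming delivery."""
    message_id = post(text)
    returned = fetch(message_id)
    delivered = returned is not None and returned.get("content") == text
    return {
        "message_id": message_id,
        "delivered": delivered,
        "reason": None if delivered else "READ_BACK_MISMATCH",
    }
```

Delivery is claimed only when the surface returns the same content that was sent; a successful `post` call alone is not treated as proof.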

Boundaries

Private/no-agent-attention surfaces are excluded from live attention and only reviewed through explicit bounded requests.

Why It Matters

Customer-facing AI needs reliable surfaces: acknowledgements, attachments, delivery checks, and human-readable state.

Gateway: Capture live events rather than polling for state.
Source Packet: Bind claims to source, freshness, and proof grade.
React: Use low-noise attention markers only after capture.
Read Back: Verify surface delivery with returned message metadata.
Reticket: Missing output becomes reusable pipeline work.

Voice

Attachment metadata, bounded download, hashing, STT, language detection, intent classification, and confidence reporting.
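
The bounded download and hashing steps can be combined so that an oversized attachment is rejected while it is still being read. This sketch assumes the attachment is available as a binary stream; the function name and chunk size are hypothetical, and SHA-256 is an assumed choice of hash.

```python
import hashlib
import io
from typing import BinaryIO


def hash_bounded(stream: BinaryIO, max_bytes: int, chunk_size: int = 65536) -> str:
    """Hash a stream in chunks, refusing to read past `max_bytes`."""
    hasher = hashlib.sha256()
    total = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        total += len(chunk)
        if total > max_bytes:
            raise ValueError("attachment exceeds size bound")
        hasher.update(chunk)
    return hasher.hexdigest()


# Usage: an in-memory stream stands in for a downloaded voice attachment.
digest = hash_bounded(io.BytesIO(b"example audio bytes"), max_bytes=1_000_000)
```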

Vision

Image and screenshot handling as inspectable evidence, not decorative input.

Chess

LLM coach/judge workflows wrapped around validated game state, legal moves, replay, and learning trails.
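
The idea of wrapping a model around validated state can be shown with a small check: a model-suggested move is accepted only if it appears in the set of legal moves produced by a deterministic validator. Here the legal-move set is passed in directly; in practice it would come from a chess engine or rules library, which this sketch does not assume. The function name and reason code are hypothetical.

```python
def accept_suggestion(suggested_move: str, legal_moves: set[str]) -> dict:
    """Accept a model-suggested move only if the validator lists it as legal."""
    move = suggested_move.strip()
    if move in legal_moves:
        return {"accepted": True, "move": move, "reason": None}
    return {"accepted": False, "move": None, "reason": "ILLEGAL_MOVE"}
```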

Product Goal

Make multimodal AI useful for real users: explain, verify, recover from misses, and improve the workflow.

Why It Matters

Applied AI work becomes real when models handle messy human inputs without feigning confidence they do not have.

Acquire: Collect metadata and bounded source evidence.
Interpret: Use STT, language routing, visual inspection, or deterministic validators.
Route: Select response surface, model, verifier, and escalation path.
Respond: Use text or voice only when allowed and useful.
Evaluate: Record confidence, misses, and improvement candidates.

Infrastructure

Minimal cloud footprint, infrastructure-as-code, DNS, TLS, web server hardening, and controlled ingress.

Operations

Connectivity tests, startup behavior, restart checks, reachability proof, and user-facing runbooks.

Security

Public web path separated from private management path, with rate limiting and security headers.

Support Fit

Combines AWS networking, customer-style troubleshooting, documentation, and proof-driven debugging.

Why It Matters

It shows the same operating discipline needed to help customers deploy AI systems securely and reliably.

Provision: Cloud instance, DNS, firewall policy, and deploy scripts.
Secure: TLS, headers, rate limits, and bounded access assumptions.
Verify: Reachability, service status, headers, and browser rendering.
Operate: Readable runbooks and postproof for future recovery.
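
The header part of the Verify step could be checked with a helper like the one below, which reports which expected security headers are absent from a response. The list of required headers is an assumption for this sketch; the original does not name the specific headers in use.

```python
# Assumed baseline; the headers actually enforced may differ.
REQUIRED_HEADERS = (
    "strict-transport-security",
    "x-content-type-options",
    "content-security-policy",
)


def missing_security_headers(
    headers: dict[str, str],
    required: tuple[str, ...] = REQUIRED_HEADERS,
) -> list[str]:
    """Return required header names (lowercase) that the response lacks."""
    present = {name.lower() for name in headers}  # header names are case-insensitive
    return [name for name in required if name not in present]
```

An empty result can be recorded as proof for the check; a non-empty one names exactly what to fix.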

A prototype that connects models to real surfaces

Dynamic policy instead of static behavior

Human-facing event surface with proof

Multimodal inputs routed through confidence and validators

Low-cost access path with operational proof
