/ technical brief · v0.4 · 2026-04 · private beta

Darkfield: a two-model architecture
for autonomous operational vision.

This brief describes the system as it stands in private beta. It covers the architecture, the data flow, the training mechanism, and the deployment topology. It is intended for technical leads at partner organizations evaluating an integration.

01

TL;DR

Darkfield is a vision platform built on two proprietary models. THINK plans pipelines and supervises evaluation; SEE perceives every frame in under 50 ms. THINK orchestrates SEE end-to-end (composing detection graphs, scoring outputs against its own annotations, and dispatching per-camera retrains) with no human in the loop during steady-state operation.

  • Onboarding takes three inputs: an operation description and a tracking prompt, both in plain language, plus an RTSP stream.
  • Output is a structured table whose schema THINK authors and revises. Notifications can be email, SMS, or voice call.
  • THINK audits SEE on rolling samples and triggers per-camera finetunes when precision falls below tolerance.
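
As an illustration of the three onboarding inputs, here is a minimal sketch of how a partner-side integration might bundle and sanity-check them before submission. The class name `OnboardingRequest`, its field names, and the example values are assumptions made for this sketch; the brief does not define an onboarding API.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass(frozen=True)
class OnboardingRequest:
    """Hypothetical container for the three onboarding inputs."""

    operation_description: str  # plain-language description of the operation
    rtsp_url: str               # camera stream, e.g. rtsp://host:554/path
    tracking_prompt: str        # plain-language statement of what to track

    def validate(self) -> list[str]:
        """Return a list of problems; an empty list means the request is usable."""
        problems = []
        if not self.operation_description.strip():
            problems.append("operation description is empty")
        if not self.tracking_prompt.strip():
            problems.append("tracking prompt is empty")
        parsed = urlparse(self.rtsp_url)
        if parsed.scheme not in ("rtsp", "rtsps") or not parsed.hostname:
            problems.append("stream must be an rtsp:// or rtsps:// URL with a host")
        return problems


request = OnboardingRequest(
    operation_description="Outbound loading dock, two shifts, forklifts and pallets.",
    rtsp_url="rtsp://cam.example.local:554/stream1",
    tracking_prompt="Pallets left in the aisle for more than ten minutes.",
)
print(request.validate())
```

The check is deliberately shallow: it confirms the stream address is well formed, not that the camera is reachable or adequately placed, which the brief assigns to THINK.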

02

Architecture

Five stages, two models, one closed loop. The arrows below describe the runtime data flow during steady-state operation; onboarding is the same path executed in slow motion under human review.

user · Inputs
  • operation description (text · pdf · docx)
  • RTSP stream
  • tracking prompt (natural language)

system 2 · THINK
  • inspect stream · verify camera sufficiency
  • author pipeline graph · declare output schema
  • score SEE on rolling samples · diagnose drift
  • dispatch per-camera finetunes · validate before swap

system 1 · SEE
  • open-vocabulary detection (boxes + masks)
  • type-aware recognition (subtype hierarchy)
  • persistent tracking (no third-party tracker)
  • per-camera adapters · hot-swappable weights

output · Events
  • structured rows · the schema THINK declared
  • email · SMS · voice alerts (modality chosen by THINK)
  • video clips · screenshots · full provenance

figure 02.1 · runtime data flow · arrows are causal, not synchronous
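
To make the Events stage concrete, the sketch below shows one way a THINK-declared schema and a conforming event row could look. The column names, the `SCHEMA` mapping, and the `conforms` helper are illustrative assumptions; the real schema is authored per deployment and is not specified in this brief.

```python
from datetime import datetime, timezone

# Hypothetical schema of the kind THINK might declare for a loading-dock camera.
SCHEMA = {
    "event_time": datetime,
    "camera_id": str,
    "object_type": str,
    "track_id": int,
    "confidence": float,
    "clip_uri": str,
}


def conforms(row: dict, schema: dict) -> bool:
    """True when the row has exactly the declared columns with the declared types."""
    if row.keys() != schema.keys():
        return False
    return all(isinstance(row[name], kind) for name, kind in schema.items())


# One illustrative event row; every value here is made up for the example.
row = {
    "event_time": datetime(2026, 4, 1, 8, 30, tzinfo=timezone.utc),
    "camera_id": "dock-03",
    "object_type": "pallet",
    "track_id": 1187,
    "confidence": 0.93,
    "clip_uri": "clips/dock-03/1187.mp4",
}
print(conforms(row, SCHEMA))
```

Because THINK can revise the schema, a consumer would likely re-fetch the declaration rather than hard-code it as this sketch does.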

03

Why two models.

A single foundation model that watches every frame is too expensive to run continuously and too slow to react inside a control loop. A single small model that runs every frame is, on its own, brittle: it cannot decide what to track, cannot revise its own pipeline, cannot tell you when the camera is in the wrong place.

Darkfield separates the two cognitive roles deliberately. THINK is the planner — multimodal, slow, expensive, and run only when a decision is needed. SEE is the perceiver — small, fast, specialized to a single camera, and run on every frame. THINK controls SEE the way a senior engineer controls a junior one: by configuration, by inspection, and by retraining — all of it autonomous from the partner's point of view.

The split is structural, not a packaging decision. The two models are trained separately, evaluated separately, and improved on different cadences. SEE specializes per-camera over hours; THINK improves across the entire fleet over weeks.

04

Model specifications.

                  THINK                                   SEE
role              planner · supervisor                    perceiver · per-frame
modality          multimodal (text + vision + tools)      vision · prompted by language
parameter count   undisclosed · private beta              undisclosed · private beta
typical latency   seconds to minutes per decision         < 50 ms per frame on a single GPU
invocation        on-demand · throttled                   continuous · per-camera
training cadence  monthly · across the fleet              continuous · per-camera adapter
output            plans · schemas · evaluations · alerts  boxes · masks · IDs · OCR strings

05

Continuous training.

A vision model deployed once and never retrained loses precision as the scene drifts: lighting changes with the seasons, conveyors are re-tooled, signage is replaced, cameras are bumped. Conventional CV pipelines respond to this with a calendar — a quarterly retrain whose effectiveness is unmeasurable until the next one.

Darkfield retrains on a signal, not a calendar. THINK pulls a rolling sample of recent events, scores SEE's outputs against its own annotations, and launches a per-camera finetune when precision drops below the tolerance declared at onboarding. The new weights are validated on a held-out slice before they are hot-swapped into production.
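
The signal-driven trigger described above can be sketched as follows. This is a minimal illustration under stated assumptions, not Darkfield's implementation: the `ScoredEvent` type, the function names, and in particular the swap criterion (the candidate must be at least as precise as the current weights on the held-out slice) are hypothetical, since the brief does not say how validation is decided.

```python
from dataclasses import dataclass


@dataclass
class ScoredEvent:
    see_positive: bool    # SEE reported a detection for this event
    think_positive: bool  # THINK's own annotation says the detection is correct


def precision(sample: list[ScoredEvent]) -> float:
    """Fraction of SEE's detections that THINK's annotations confirm."""
    detections = [e for e in sample if e.see_positive]
    if not detections:
        return 1.0  # no detections, so no false positives to count
    confirmed = sum(e.think_positive for e in detections)
    return confirmed / len(detections)


def should_finetune(sample: list[ScoredEvent], tolerance: float) -> bool:
    """Trigger a per-camera finetune when precision drops below the tolerance."""
    return precision(sample) < tolerance


def should_swap(candidate_precision: float, current_precision: float) -> bool:
    """Assumed rule: hot-swap only if the candidate is no worse on the held-out slice."""
    return candidate_precision >= current_precision


# Rolling sample: 8 confirmed detections and 2 false positives.
sample = [ScoredEvent(True, True)] * 8 + [ScoredEvent(True, False)] * 2
print(precision(sample), should_finetune(sample, tolerance=0.9))
```

The point of the structure is that retraining is gated twice: once by a measured drop, and once by a held-out check before any weights reach production.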

The result is that SEE is, in effect, a different model on every camera, and a slightly different one every week. On partner sites we have observed recall gaps of 6–11 percentage points closed within the first 48 hours of operation, with no human annotation involved. The methodology is written up internally and will be published once the private beta closes.

06

Performance characteristics.

Indicative numbers from steady-state operation across partner sites. Latency targets are met on a single L4 GPU per camera; THINK runs on a small shared cluster.

metric            SEE · per camera        THINK · cluster
p50 latency       22 ms / frame           2.4 s / decision
p99 latency       48 ms / frame           14 s / decision
throughput        ~30 fps · 1080p         up to 1,200 decisions/min
typical hardware  1× L4 (24 GB)           shared H100 pool
uptime SLO        99.5 % per stream       99.9 % control plane
finetune cost     ~ $4 · 90 min / cycle
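
Two back-of-envelope checks follow from these figures. The sketch below is arithmetic on the published numbers only. Treating the p50 decision latency as a stand-in for the mean is an assumption (the brief does not publish a mean), and the function names are illustrative.

```python
def frame_interval_ms(fps: float) -> float:
    """Time between consecutive frames at a given frame rate."""
    return 1000.0 / fps


def decisions_in_flight(decisions_per_min: float, latency_s: float) -> float:
    """Little's law: average concurrency = arrival rate x time in system."""
    return (decisions_per_min / 60.0) * latency_s


interval = frame_interval_ms(30)   # about 33.3 ms between frames at ~30 fps
p50_headroom = interval - 22.0     # positive: the median frame finishes within one interval
p99_overrun = 48.0 - interval      # positive: the slowest 1% of frames exceed one interval

# At the quoted ceiling of 1,200 decisions/min, with 2.4 s assumed as the mean latency.
peak_concurrency = decisions_in_flight(1200, 2.4)

print(round(interval, 1), round(p50_headroom, 1), round(p99_overrun, 1), round(peak_concurrency))
```

Read this way, the median frame leaves roughly a third of the frame interval spare, while a p99 frame takes longer than one interval, so the slowest frames presumably rely on some buffering or pipelining that the brief does not describe. On the THINK side, the quoted ceiling implies on the order of a few dozen decisions in flight at once on the shared pool.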

07

Deployment topology.

Three modes are supported. The choice is driven by network conditions, residency requirements, and how much hardware the partner is willing to host.

001 / cloud

Hosted

Cameras stream to Darkfield's regional ingest. SEE and THINK run in our infrastructure. Lowest operational burden, highest dependency on uplink bandwidth.

  • UK primary
  • SOC 2 Type II in progress
  • ≥ 4 Mbps per 1080p stream

002 / hybrid

Edge SEE · cloud THINK

SEE runs on a partner-hosted edge box at the camera. Only events and clips reach the cloud, where THINK supervises. A good fit for sites with constrained uplinks or sensitive footage.

  • 1× L4 / 4 cameras at the edge
  • Footage never leaves the site
  • ~ 50 KB/event in egress
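
For a rough sense of how hosted and hybrid modes compare on uplink and hardware, the sketch below applies the figures quoted for each mode. The event rate is a made-up input, the helper names are hypothetical, and 1 KB is assumed to mean 1,000 bytes. The 4-cameras-per-L4 ratio is taken as stated for hybrid mode, even though the performance section quotes latency on one L4 per camera.

```python
import math

HOSTED_MBPS_PER_STREAM = 4   # hosted: ">= 4 Mbps per 1080p stream"
CAMERAS_PER_EDGE_L4 = 4      # hybrid: "1x L4 / 4 cameras at the edge"
EGRESS_KB_PER_EVENT = 50     # hybrid: "~ 50 KB/event in egress"


def hosted_uplink_mbps(cameras: int) -> int:
    """Minimum sustained uplink if every camera streams to the cloud."""
    return cameras * HOSTED_MBPS_PER_STREAM


def edge_gpus(cameras: int) -> int:
    """L4 cards needed at the edge, rounding up to whole cards."""
    return math.ceil(cameras / CAMERAS_PER_EDGE_L4)


def hybrid_egress_kbps(events_per_hour: float) -> float:
    """Average egress in kilobits per second for a given site-wide event rate."""
    bits_per_hour = events_per_hour * EGRESS_KB_PER_EVENT * 1000 * 8
    return bits_per_hour / 3600 / 1000


cameras = 12
events_per_hour = 360  # hypothetical: one event every ten seconds across the site
print(hosted_uplink_mbps(cameras), edge_gpus(cameras), hybrid_egress_kbps(events_per_hour))
```

For this hypothetical 12-camera site, hosted mode needs tens of Mbps sustained, while hybrid mode averages tens of kbps at the cost of three edge GPUs. Averages hide bursts, so a real sizing exercise would also look at peak event rates.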

003 / on-prem

Air-gapped

Both models run on the partner's hardware. Updates are signed and shipped on disk. Required for some defense and critical-infrastructure deployments.

  • 1× H100 minimum for THINK
  • Signed weight updates · no telemetry
  • Quarterly on-site engineering review

08

Security and compliance.

Data residency
UK primary today. On-prem and air-gapped deployments available for partners with stricter requirements.
Footage retention
Default: clips and screenshots retained for 30 days; structured event rows retained indefinitely. Both configurable per-deployment.
Access control
SSO via OIDC. Per-camera and per-schema RBAC. Full audit log of every model decision, every tool invocation, and every operator action.
Compliance posture
UK GDPR-compliant by default. SOC 2 Type II in progress, expected Q3 2026. ISO 27001 on the roadmap for 2027.
Model security
Weight files signed and verified at load. Per-camera adapters are quarantined from the foundation model and from other adapters.
Privacy posture
No biometric identification by default. Optional face-blurring at the edge before any frame leaves the site. PII never used as a training signal.
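
The "signed and verified at load" behaviour under Model security can be illustrated with a minimal sketch. The brief does not name the signature scheme; this example substitutes HMAC-SHA256 from the Python standard library, a symmetric stand-in for whatever (plausibly asymmetric) scheme is actually used. The function names and the key handling are assumptions.

```python
import hashlib
import hmac


def sign_weights(weights: bytes, key: bytes) -> str:
    """Produce a hex signature over the raw weight bytes (HMAC-SHA256 stand-in)."""
    return hmac.new(key, weights, hashlib.sha256).hexdigest()


def load_weights(weights: bytes, signature: str, key: bytes) -> bytes:
    """Return the weights only if the signature matches; otherwise refuse to load."""
    expected = sign_weights(weights, key)
    # compare_digest runs in constant time, avoiding timing side channels.
    if not hmac.compare_digest(expected, signature):
        raise ValueError("weight file failed signature check; refusing to load")
    return weights


key = b"example-key-not-for-production"
adapter = b"\x00\x01per-camera-adapter-bytes"
signature = sign_weights(adapter, key)

loaded = load_weights(adapter, signature, key)  # matching signature: loads
try:
    load_weights(adapter + b"tampered", signature, key)
except ValueError as err:
    print(err)
```

With a symmetric key, anyone who can verify can also sign, which is why a production system would more likely verify against a public key; the control flow (verify first, load only on success) is the part this sketch is meant to show.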

/ next step

Bring us a stream and a question.

The brief is not the product. The product is a four-to-six-week onboarding in which the model writes its own pipeline against your operation. We're taking on a small number of partners this quarter.

hello@linox.co.uk
onboarding
4–6 weeks · weekly syncs
requirement
1 IP camera · RTSP
output
schema + alerting · day 14
document version
v0.4 · 2026-04