/ technical brief · v0.4 · 2026-04 · private beta

Darkfield: a two-model architecture
for autonomous operational vision.

This brief describes the system as it stands in private beta. It covers the architecture, the data flow, the training mechanism, and the deployment topology. It is intended for technical leads at partner organizations evaluating an integration.

contents

01 · TL;DR
02 · Architecture
03 · Why two models
04 · Model specifications
05 · Continuous training
06 · Performance
07 · Deployment topology
08 · Security and compliance

01

TL;DR

Darkfield is a vision platform consisting of two proprietary models. THINK plans pipelines and supervises evaluation; SEE perceives every frame at sub-fifty-millisecond latency. THINK orchestrates SEE end-to-end — composing detection graphs, scoring outputs against its own annotations, and dispatching per-camera retrains — with no human in the loop.

Onboarding is three inputs: an operation description and a tracking prompt, both in plain language, plus an RTSP stream.
Output is a structured table whose schema THINK authors and revises. Notifications can be email, SMS, or voice call.
THINK audits SEE on rolling samples and triggers per-camera finetunes when precision falls below tolerance.
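The three onboarding inputs can be pictured as a small record. This is a minimal sketch, not Darkfield's API: the class name `OnboardingRequest`, its field names, the validation rules, and the example URL are all hypothetical.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass(frozen=True)
class OnboardingRequest:
    """Hypothetical container for the three onboarding inputs."""

    operation_description: str  # plain-language description of the operation
    rtsp_url: str               # location of the camera stream
    tracking_prompt: str        # plain-language statement of what to track

    def validate(self) -> None:
        """Raise ValueError if any input is obviously unusable."""
        if not self.operation_description.strip():
            raise ValueError("operation description is empty")
        if not self.tracking_prompt.strip():
            raise ValueError("tracking prompt is empty")
        parsed = urlparse(self.rtsp_url)
        if parsed.scheme not in ("rtsp", "rtsps") or not parsed.hostname:
            raise ValueError(f"not an RTSP URL: {self.rtsp_url!r}")


# Illustrative values only; the host is a placeholder.
request = OnboardingRequest(
    operation_description="Outbound loading dock, two shifts, forklifts and pallets.",
    rtsp_url="rtsp://camera.example.invalid:554/stream1",
    tracking_prompt="Count pallets leaving bay 3 and flag any without a label.",
)
request.validate()
```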

02

Architecture

Five stages, two models, one closed loop. The arrows below describe the runtime data flow during steady-state operation; onboarding is the same path executed in slow motion under human review.

user

Inputs

operation description (text · pdf · docx)
RTSP stream
tracking prompt (natural language)

system 2

THINK

inspect stream · verify camera sufficiency
author pipeline graph · declare output schema
score SEE on rolling samples · diagnose drift
dispatch per-camera finetunes · validate before swap

system 1

SEE

open-vocabulary detection (boxes + masks)
type-aware recognition (subtype hierarchy)
persistent tracking (no third-party tracker)
per-camera adapters · hot-swappable weights

output

Events

structured rows · the schema THINK declared
email · SMS · voice alerts (modality chosen by THINK)
video clips · screenshots · full provenance

figure 02.1 · runtime data flow · arrows are causal, not synchronous
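To make "structured rows in a declared schema" concrete, the sketch below checks an event row against a list of columns. The column names, the `Column` type, and `validate_row` are assumptions made for illustration; the brief does not publish the schema format THINK actually emits.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Any


@dataclass(frozen=True)
class Column:
    name: str
    type_: type
    required: bool = True


# Hypothetical schema, standing in for one the planner might declare.
PALLET_SCHEMA = [
    Column("timestamp", datetime),
    Column("camera_id", str),
    Column("track_id", int),
    Column("label_present", bool),
    Column("clip_uri", str, required=False),
]


def validate_row(row: dict[str, Any], schema: list[Column]) -> list[str]:
    """Return a list of problems; an empty list means the row conforms."""
    problems = []
    known = {col.name for col in schema}
    for col in schema:
        if col.name not in row:
            if col.required:
                problems.append(f"missing column: {col.name}")
            continue
        if not isinstance(row[col.name], col.type_):
            problems.append(f"wrong type for {col.name}")
    problems.extend(f"unexpected column: {key}" for key in row if key not in known)
    return problems


row = {
    "timestamp": datetime.now(timezone.utc),
    "camera_id": "dock-3",
    "track_id": 17,
    "label_present": False,
}
print(validate_row(row, PALLET_SCHEMA))
```

Returning a list of problems rather than raising on the first one lets a supervisor see every way a row deviates from the schema in a single pass.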

03

Why two models.

A single foundation model that watches every frame is too expensive to run continuously and too slow to react inside a control loop. A single small model that runs every frame is, on its own, brittle: it cannot decide what to track, cannot revise its own pipeline, cannot tell you when the camera is in the wrong place.

Darkfield separates the two cognitive roles deliberately. THINK is the planner — multimodal, slow, expensive, and run only when a decision is needed. SEE is the perceiver — small, fast, specialized to a single camera, and run on every frame. THINK controls SEE the way a senior engineer controls a junior one: by configuration, by inspection, and by retraining — all of it autonomous from the partner's point of view.

The split is structural, not a packaging decision. The two models are trained separately, evaluated separately, and improved on different cadences. SEE specializes per-camera over hours; THINK improves across the entire fleet over weeks.

04

Model specifications.

	THINK	SEE
role	planner · supervisor	perceiver · per-frame
modality	multimodal (text + vision + tools)	vision · prompted by language
parameter count	undisclosed · private beta	undisclosed · private beta
typical latency	seconds to minutes per decision	< 50 ms per frame on a single GPU
invocation	on-demand · throttled	continuous · per-camera
training cadence	monthly · across the fleet	continuous · per-camera adapter
output	plans · schemas · evaluations · alerts	boxes · masks · IDs · OCR strings
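The two invocation patterns in the table (continuous for SEE, on-demand and throttled for THINK) can be sketched as one loop. Everything here is an assumption for illustration: the `ThrottledPlanner` class, the minimum-interval throttle, and the stub callables stand in for components the brief does not describe.

```python
from typing import Callable, Optional


class ThrottledPlanner:
    """Invoke a slow planner at most once per `min_interval_s` seconds."""

    def __init__(self, plan: Callable[[str], str], min_interval_s: float):
        self._plan = plan
        self._min_interval_s = min_interval_s
        self._last_call_s: Optional[float] = None

    def maybe_plan(self, reason: str, now_s: float) -> Optional[str]:
        """Run the planner unless it ran too recently; return its result or None."""
        if (
            self._last_call_s is not None
            and now_s - self._last_call_s < self._min_interval_s
        ):
            return None
        self._last_call_s = now_s
        return self._plan(reason)


def run(frames, perceive, planner: ThrottledPlanner, fps: float = 30.0):
    """Perceive every frame; escalate to the planner only when perception asks."""
    decisions = []
    for i, frame in enumerate(frames):
        _detections, needs_decision = perceive(frame)
        if needs_decision:
            decision = planner.maybe_plan(f"frame {i}", now_s=i / fps)
            if decision is not None:
                decisions.append(decision)
    return decisions


# Stub perceiver that requests a decision on every frame of a 3-second clip.
planner = ThrottledPlanner(lambda reason: f"plan for {reason}", min_interval_s=1.0)
decisions = run(range(90), lambda frame: ([], True), planner)
print(decisions)
```

With a one-second throttle, the 90 per-frame requests collapse into three planner calls, which is the cost asymmetry the two-model split relies on.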

05

Continuous training.

A vision model deployed once and never retrained loses precision as the scene drifts: lighting changes with the seasons, conveyors are re-tooled, signage is replaced, cameras are bumped. Conventional CV pipelines respond to this with a calendar — a quarterly retrain whose effectiveness is unmeasurable until the next one.

Darkfield retrains on a signal, not a calendar. THINK pulls a rolling sample of recent events, scores SEE's outputs against its own annotations, and launches a per-camera finetune when precision drops below the tolerance declared at onboarding. The new weights are validated on a held-out slice before they are hot-swapped into production.
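The loop described above (score a rolling sample, finetune on a precision drop, validate before swapping) can be sketched as follows. This is an illustration under stated assumptions, not the production mechanism: the `Sample` type, the callables passed to `supervise`, and the swap criterion (the candidate must beat the current model on the held-out slice) are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class Sample:
    predicted: bool  # the perceiver reported an event
    reference: bool  # the supervisor's own annotation for the same moment


def precision(samples: Sequence[Sample]) -> float:
    """True positives over all positive predictions; 1.0 if nothing was predicted."""
    predicted = [s for s in samples if s.predicted]
    if not predicted:
        return 1.0
    return sum(s.reference for s in predicted) / len(predicted)


def supervise(
    rolling: Sequence[Sample],
    held_out: Sequence,
    tolerance: float,
    finetune: Callable[[Sequence[Sample]], Callable],
    evaluate: Callable[[Callable, Sequence], float],
    current_model: Callable,
) -> Callable:
    """Return the model that should serve next: the current one or a validated candidate."""
    if precision(rolling) >= tolerance:
        return current_model  # precision within tolerance, no retrain
    candidate = finetune(rolling)  # per-camera finetune on recent data
    # Assumed criterion: swap only if the candidate scores higher on the held-out slice.
    if evaluate(candidate, held_out) > evaluate(current_model, held_out):
        return candidate
    return current_model
```

Keeping the held-out comparison inside `supervise` means a finetune that makes things worse is discarded rather than deployed.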

The result is that SEE is, in effect, a different model on every camera, and a slightly different one every week. We have observed a recall gap of 6–11 percentage points closed within the first 48 hours of operation on partner sites, with no human annotation involved. The methodology is written up internally and will be published once the private beta closes.

06

Performance characteristics.

Indicative numbers from steady-state operation across partner sites. Latency targets are met on a single L4 GPU per camera; THINK runs on a small shared cluster.

metric	SEE · per camera	THINK · cluster
p50 latency	22 ms / frame	2.4 s / decision
p99 latency	48 ms / frame	14 s / decision
throughput	~30 fps · 1080p	up to 1,200 decisions/min
typical hardware	1× L4 (24 GB)	shared H100 pool
uptime SLO	99.5 % per stream	99.9 % control plane
finetune cost	~ $4 · 90 min / cycle	—
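Two of the figures in the table are easier to read after a little arithmetic: the per-frame time budget implied by a frame rate, and the downtime an uptime SLO permits. The helper names below are hypothetical, and the 30-day window is an assumption; the brief does not state the SLO measurement window.

```python
def frame_budget_ms(fps: float) -> float:
    """Time available per frame if frames are processed strictly one at a time."""
    return 1000.0 / fps


def allowed_downtime_minutes(slo_percent: float, days: int = 30) -> float:
    """Downtime an availability SLO permits over a window of `days` days."""
    return (1.0 - slo_percent / 100.0) * days * 24 * 60


budget = frame_budget_ms(30)
print(f"frame budget at 30 fps: {budget:.1f} ms")
print(f"p50 of 22 ms fits the budget: {22 <= budget}")
print(f"p99 of 48 ms fits the budget: {48 <= budget}")
print(f"99.5 % per stream: {allowed_downtime_minutes(99.5):.0f} min per 30 days")
print(f"99.9 % control plane: {allowed_downtime_minutes(99.9):.1f} min per 30 days")
```

At 30 fps the budget is about 33.3 ms, so the median frame fits comfortably while a p99 frame does not. Under the one-frame-at-a-time assumption, tail frames would need buffering or pipelining to keep pace with the stream.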

07

Deployment topology.

Three modes are supported. The choice is driven by network conditions, residency requirements, and how much hardware the partner is willing to host.

001 / cloud

Hosted

Cameras stream to Darkfield's regional ingest. SEE and THINK run in our infrastructure. Lowest operational burden, highest dependency on uplink bandwidth.

UK primary
SOC 2 Type II in progress
≥ 4 Mbps per 1080p stream

002 / hybrid

Edge SEE · cloud THINK

SEE runs on a partner-hosted edge box at the camera. Only events and clips reach the cloud, where THINK supervises. A good fit for sites with constrained uplinks or sensitive footage.

1× L4 / 4 cameras at the edge
Continuous footage never leaves the site
~ 50 KB/event in egress

003 / on-prem

Air-gapped

Both models on the partner's hardware. Updates are signed and shipped on disk. Required for some defense and critical-infrastructure deployments.

1× H100 minimum for THINK
Signed weight updates · no telemetry
Quarterly on-site engineering review
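The per-mode figures above translate into rough sizing for a given site. The sketch below uses the brief's numbers as defaults (4 Mbps per hosted 1080p stream, four cameras per L4 in hybrid mode, about 50 KB per event). The function names and the example of 10 cameras with 2,000 events per day are hypothetical.

```python
import math


def hosted_uplink_mbps(cameras: int, mbps_per_stream: float = 4.0) -> float:
    """Minimum uplink for hosted mode, where every camera streams to the cloud."""
    return cameras * mbps_per_stream


def edge_gpus_needed(cameras: int, cameras_per_gpu: int = 4) -> int:
    """Number of edge GPUs for hybrid mode, rounding up to whole devices."""
    return math.ceil(cameras / cameras_per_gpu)


def hybrid_egress_mb_per_day(events_per_day: int, kb_per_event: float = 50.0) -> float:
    """Approximate daily egress in hybrid mode (1 MB taken as 1000 KB)."""
    return events_per_day * kb_per_event / 1000.0


cameras = 10
print(f"hosted uplink: at least {hosted_uplink_mbps(cameras):.0f} Mbps")
print(f"hybrid edge GPUs: {edge_gpus_needed(cameras)}")
print(f"hybrid egress: ~{hybrid_egress_mb_per_day(2000):.0f} MB/day")
```

For this hypothetical site, hosted mode needs at least 40 Mbps of sustained uplink, while hybrid mode needs three edge GPUs and on the order of 100 MB of egress per day.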

08

Security and compliance.

Data residency: UK primary today. On-prem and air-gapped deployments available for partners with stricter requirements.
Footage retention: by default, clips and screenshots are retained for 30 days and structured event rows are retained indefinitely. Both are configurable per deployment.
Access control: SSO via OIDC. Per-camera and per-schema RBAC. Full audit log of every model decision, every tool invocation, and every operator action.
Compliance posture: UK GDPR-compliant by default. SOC 2 Type II in progress, expected Q3 2026. ISO 27001 on the roadmap for 2027.
Model security: Weight files signed and verified at load. Per-camera adapters are quarantined from the foundation model and from other adapters.
Privacy posture: No biometric identification by default. Optional face-blurring at the edge before any frame leaves the site. PII never used as a training signal.
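The default retention rule can be expressed as a small policy object. This is a sketch under assumptions: `RetentionPolicy`, its field names, and `is_expired` are hypothetical and do not reflect how Darkfield stores or purges data.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass(frozen=True)
class RetentionPolicy:
    """Days to keep each artifact type; None means keep indefinitely."""

    clip_days: Optional[int] = 30         # clips and screenshots
    event_row_days: Optional[int] = None  # structured event rows


def is_expired(created: datetime, keep_days: Optional[int], now: datetime) -> bool:
    """True once an artifact is older than its retention period."""
    if keep_days is None:
        return False
    return now - created > timedelta(days=keep_days)


policy = RetentionPolicy()
now = datetime(2026, 4, 30, tzinfo=timezone.utc)
old_clip = now - timedelta(days=31)
print(is_expired(old_clip, policy.clip_days, now))
print(is_expired(old_clip, policy.event_row_days, now))
```

A per-deployment override is then a matter of constructing the policy with different values, for example `RetentionPolicy(clip_days=7, event_row_days=365)`.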

/ next step

Bring us a stream and a question.

The brief is not the product. The product is a four-to-six-week onboarding in which the model writes its own pipeline against your operation. We're taking on a small number of partners this quarter.

hello@linox.co.uk

onboarding: 4–6 weeks · weekly syncs
requirement: 1 IP camera · RTSP
output: schema + alerting · day 14
document version: v0.4 · 2026-04
