system 2 · planner · multimodal

Darkfield THINK: a model that designs the pipeline.

THINK is a multimodal language model trained to read industrial documentation, watch camera streams, author detection schemas, and supervise SEE. It runs when reasoning is needed — not on every frame.

/ capabilities

What THINK does.

01 / conversational planning

Ingests business context before detection begins.

THINK reads PDFs, SOPs, and KPIs. It asks clarifying questions in plain language, negotiates the output schema, and proposes notification rules — before a single frame is processed. The onboarding conversation is the specification.

02 / live video understanding

Watches the stream; decides if the camera is sufficient.

THINK pulls a frame from the stream and evaluates it against the declared task. If the angle is wrong for a plate read, or the lighting is insufficient for small-object detection, it says so — and tells you where to put a different camera.

03 / pipeline authorship

Composes the detection graph itself.

THINK chains detect / track / OCR / count / zone steps into a runnable pipeline. It writes the output schema. No CV engineer authored the graph — the model did.
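As a rough illustration of what an authored graph could look like as data, here is a minimal sketch assuming a linear chain of steps. `Step`, `Pipeline`, the `validate` rule, and the parameter names are hypothetical and chosen for this example; the source does not describe THINK's actual graph format.

```python
from dataclasses import dataclass, field

# Step kinds named in the text above; the set itself is an assumption.
ALLOWED_KINDS = {"detect", "track", "ocr", "count", "zone"}


@dataclass
class Step:
    kind: str                                   # one of ALLOWED_KINDS
    params: dict = field(default_factory=dict)  # hypothetical per-step settings


@dataclass
class Pipeline:
    steps: list
    output_columns: list

    def validate(self):
        """Return a list of problems; an empty list means the chain looks runnable."""
        problems = []
        kinds = [s.kind for s in self.steps]
        for kind in kinds:
            if kind not in ALLOWED_KINDS:
                problems.append(f"unknown step kind: {kind}")
        # Assumed rule: every downstream step needs detections to work on.
        if not kinds or kinds[0] != "detect":
            problems.append("pipeline must start with a detect step")
        return problems


# Example chain, loosely modelled on the pallet use case in the trace.
pipeline = Pipeline(
    steps=[
        Step("detect", {"target": "pallet"}),
        Step("track"),
        Step("ocr", {"target": "barcode"}),
        Step("zone", {"zones": ["bay_a", "bay_b", "bay_c", "yard"]}),
    ],
    output_columns=["camera", "timestamp", "pallet_id", "zone"],
)
```

Calling `pipeline.validate()` on the example returns an empty list; a chain that begins with `track` would be rejected under the assumed rule.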

04 / spatial grounding

Points at objects; resolves language to coordinates.

THINK can take a camera frame and locate a named zone, object class, or spatial relationship in pixel space. This grounds abstract language — "the staging area near dock B" — into the exact polygon SEE uses for zone detection.
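Once a phrase has been resolved to a polygon, checking whether a detection falls inside the zone is a standard geometry test. The sketch below uses the ray-casting method; the polygon coordinates and the name `staging_area` are made up for illustration, and nothing here is confirmed as SEE's actual zone logic.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if pixel (x, y) lies inside the polygon.

    polygon is a list of (x, y) pixel vertices in order.
    """
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through y?
        if (y1 > y) != (y2 > y):
            # x-coordinate where the edge crosses that horizontal line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# Hypothetical polygon for "the staging area near dock B", in pixel coordinates.
staging_area = [(100, 200), (400, 200), (400, 500), (100, 500)]
```

A detection would typically be reduced to one reference point first (for example the bottom centre of its bounding box, an assumption here) and then tested with `point_in_polygon`.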

05 / finetune controller

Evaluates SEE; decides when and how to retrain.

THINK pulls a rolling sample of recent events and scores SEE's output against ground truth it authored. When the score falls below tolerance, it diagnoses the failure mode (labelling, coverage, or model-fit) and launches a per-camera finetune accordingly. It can also swap to a different base model when one fits the problem better.
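The scoring step can be pictured as a rolling accuracy check against a tolerance. This is a minimal sketch only: `RollingEvaluator`, the window size, and the tolerance value are assumptions, and the diagnosis of the failure mode is deliberately left out because the source does not say how it is done.

```python
from collections import deque


class RollingEvaluator:
    """Keeps the most recent comparisons between SEE output and ground truth."""

    def __init__(self, window=200, tolerance=0.95):
        # window and tolerance are illustrative values, not published figures.
        self.results = deque(maxlen=window)
        self.tolerance = tolerance

    def record(self, predicted, truth):
        # Store True when the prediction matched the ground-truth label.
        self.results.append(predicted == truth)

    def accuracy(self):
        if not self.results:
            return None  # nothing sampled yet
        return sum(self.results) / len(self.results)

    def needs_finetune(self):
        acc = self.accuracy()
        return acc is not None and acc < self.tolerance
```

Because the deque has a fixed length, old events drop out automatically, so the score reflects recent behaviour rather than the camera's whole history.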

/ example trace

Reasoning over an operations onboarding.

A lightly edited trace from an actual onboarding session. Partner details anonymised. The full trace runs to approximately twelve turns; this is an excerpt.

think · onboarding session · cold-chain partner · 22 cameras
partner
We run a cold-chain distribution hub — 22 cameras across three bays and a yard. We want to know where every pallet is at all times, and which pallets haven't moved in more than two hours.
think
Understood. Before I propose a schema, let me ask two things. First: are pallets individually identified today — barcodes, QR codes, RFID labels? If so, I can add a pallet ID column to the schema; if not, I'll track by position and re-appearance. Second: when you say "hasn't moved in two hours" — is that two hours from arrival in the bay, or two hours of no movement at all from any camera angle?
partner
Pallets have barcodes — they're readable from roughly 3 metres. And two hours from last visible movement, not from arrival.
think
Good. I'll inspect CAM-04 and CAM-11 first, since those cover the bays where the barcode angle is most likely to be critical. Pulling a frame from each now and running quality checks.
tool result
CAM-04: angle sufficient for barcode at >85% of pallet positions. Recommend slight tilt down (≈8°) to improve coverage at the far corner.
CAM-11: angle insufficient — barcodes face away from camera at current position. Recommend repositioning to south wall, 2.4m height.
think
CAM-04 is almost there — a minor tilt will get you to full coverage. CAM-11 needs to move; barcodes are facing the wrong way at its current position. I'd suggest the south wall at about 2.4 metres before we go further. Can you reposition it?

Meanwhile, here's a draft schema for your approval:

table pallet_positions · columns: camera, timestamp, pallet_id, zone, last_moved, dwell_minutes, alert_flag
partner
CAM-11 repositioned. That schema looks right. What's the alert condition?
think
I'll set alert_flag = true when dwell_minutes ≥ 120. For notification: given this is a logistics operation, I'd suggest an SMS to the shift manager once dwell reaches 2 hours, and an escalation call if it reaches 4 hours with no acknowledgement. I'll configure both; you can adjust the thresholds later if you want.

pipeline · detect: pallet · track: on · OCR: barcode · zones: bay_a, bay_b, bay_c, yard · dwell threshold: 120 min
alert · tier: P2 · channel: SMS · condition: dwell ≥ 120 min · recipient: shift_manager
partner
Confirmed. Let's go.
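The alert rule agreed in the trace can be sketched as a small function. The two thresholds (120 and 240 minutes) and the SMS-then-call order come from the trace; the function names and return values are hypothetical, and de-duplicating messages that have already been sent is out of scope.

```python
from datetime import datetime

ALERT_MINUTES = 120     # from the trace: alert_flag when dwell_minutes >= 120
ESCALATE_MINUTES = 240  # from the trace: call at 4 hours with no acknowledgement


def dwell_minutes(last_moved, now):
    """Whole minutes since the pallet's last visible movement."""
    return int((now - last_moved).total_seconds() // 60)


def alert_tier(last_moved, now, acknowledged=False):
    """Return the highest notification tier that applies, or None."""
    dwell = dwell_minutes(last_moved, now)
    if dwell >= ESCALATE_MINUTES and not acknowledged:
        return "escalation_call"
    if dwell >= ALERT_MINUTES:
        return "sms"
    return None


# Example: a pallet last seen moving at 08:00.
last_moved = datetime(2024, 1, 1, 8, 0)
```

Dwell is measured from `last_moved` rather than from arrival, matching the partner's answer in the trace.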
/ architecture

Architecture notes.

THINK is a multimodal language model with a vision encoder, designed to orchestrate SEE end-to-end. It is a transformer-class model trained on a combination of real industrial CCTV footage, annotated SOP corpora, synthetic operational scenarios, and reasoning traces generated from partner onboarding sessions.

The context window is sufficient to hold a full SOP document, multiple camera frames, and the running plan simultaneously. The latency target is seconds to minutes per planning turn, and under five seconds for a single-frame stream-quality check.

/ safety & limits

Known failure modes.

THINK performs poorly on novel verticals not represented in training — highly bespoke manufacturing processes with no analogue in the SOP corpus are the current outer limit. Mitigation: the onboarding window keeps a human in the loop until the schema has been verified against at least 500 real events.

All autonomous actions are logged to an immutable audit trail. Every per-camera finetune records the evaluation that triggered it and the validation that cleared it. The audit log is visible to the partner — the model itself is not user-tunable, by design.
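The source does not say how the audit trail is made immutable. One common way to make a log tamper-evident is a hash chain, where each entry includes the hash of the previous one. The sketch below shows that technique under that assumption; `AuditLog` and its field names are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash for the first entry


def _digest(action, detail, prev_hash):
    # Canonical JSON so the same content always produces the same hash.
    body = {"action": action, "detail": detail, "prev": prev_hash}
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()


class AuditLog:
    def __init__(self):
        self.entries = []

    def append(self, action, detail):
        prev_hash = self.entries[-1]["hash"] if self.entries else GENESIS
        self.entries.append({
            "action": action,
            "detail": detail,
            "prev": prev_hash,
            "hash": _digest(action, detail, prev_hash),
        })

    def verify(self):
        """True if no entry has been altered, removed, or reordered."""
        prev_hash = GENESIS
        for entry in self.entries:
            expected = _digest(entry["action"], entry["detail"], entry["prev"])
            if entry["prev"] != prev_hash or entry["hash"] != expected:
                return False
            prev_hash = entry["hash"]
        return True
```

In this scheme a finetune would be logged as two linked entries, for instance the evaluation that triggered it and the validation that cleared it, and any later edit to either one makes `verify()` return `False`.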

/ specification

Model card.

model: Darkfield THINK
role: Planner · pipeline author · finetune controller · alert dispatcher
system: System 2; runs on reasoning events, not on every frame
modality: Multimodal; text, image, video frames, structured documents (PDF, DOCX)
context window: Sufficient to hold a full SOP, multiple frames, and the running plan
latency: Seconds–minutes per planning turn · <5s per single-frame stream-quality check
params: undisclosed in private beta
training data: Real industrial CCTV · SOP corpora · synthetic operational scenarios · partner onboarding traces
availability: Private beta · partner access only

// citations and linked papers → research.html#papers

See THINK plan against your operation.

We're onboarding a small number of partners in private beta.
