
Training pipeline — weapon detection fine-tuning, ArcFace domain adaptation, quality scoring, and benchmarking

Overview

All training scripts live in the securegate-ai/models repository. The production models are pre-trained open-source checkpoints. Fine-tuning is used for domain adaptation (venue-specific weapon detection) and quality scoring.

Training Scripts

| Script | Purpose | Base Model | Data Source | Best Result |
|---|---|---|---|---|
| train.py | Weapon detection fine-tuning | YOLO26n (COCO) | Roboflow Weapon-2 (3,839 imgs) | 94.9% mAP@50 |
| train_face.py | Face recognition domain adaptation | ArcFace glintr100 | Event attendee face crops (S3 / API / local) | |
| train_quality.py | Face quality scoring | QualityNet (small CNN) | Same face crops (auto-labeled by blur/angle/size) | |
| benchmark.py | Model comparison and evaluation | Any ONNX model | Test datasets | |
| collect_faces.py | Data collection from live system | N/A | SecureGate API | |

Weapon Detection (train.py)

Fine-tunes YOLO26n for weapon/knife/gun detection. The pipeline downloads the dataset, converts labels, trains, and exports to ONNX.

cd models
uv sync
export ROBOFLOW_API_KEY=<your_key>
uv run python train.py --dataset roboflow --epochs 100
uv run python train.py --export-only

Latest Results (March 2026)

Trained on Roboflow Weapon-2 dataset (3,455 train / 384 val / 259 test images):

| Epoch | Precision | Recall | mAP@50 | mAP@50-95 |
|---|---|---|---|---|
| 5 | 0.836 | 0.664 | 0.790 | 0.491 |
| 20 | 0.909 | 0.831 | 0.912 | 0.657 |
| 40 | 0.942 | 0.832 | 0.929 | 0.692 |
| 66 | 0.945 | 0.881 | 0.949 | 0.750 |

MPS-specific settings: batch=16 (a TAL bug triggers at batch sizes above 16), cache=ram, workers=0. See SG-TR-2026-001 for the full methodology.
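Assuming train.py drives the Ultralytics Python API, those settings would map onto a trainer call along these lines (a sketch only; the checkpoint name, data path, and epoch count are illustrative, not the script's exact code):

```python
from ultralytics import YOLO

# Sketch of the MPS-safe training configuration described above.
model = YOLO("yolo26n.pt")
model.train(
    data="weapon_dataset/data.yaml",
    epochs=100,
    batch=16,       # larger batches trigger the TAL bug on MPS
    cache="ram",    # cache decoded images in RAM instead of re-reading from disk
    workers=0,      # no dataloader worker processes on MPS
    device="mps",
)
```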

Data Format

Standard YOLO format:

weapon_dataset/
+-- images/
|   +-- train/
|   +-- val/
+-- labels/
|   +-- train/
|   +-- val/
+-- data.yaml

data.yaml:

train: images/train
val: images/val
nc: 3
names: ['Weapon', 'Knife', 'Gun']
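Each image has a matching .txt file under labels/ with one line per box: a class index followed by normalized center coordinates and size. A minimal parser for one label line (this is the standard YOLO convention, not repo-specific code):

```python
def parse_yolo_label(line: str) -> tuple[int, float, float, float, float]:
    """Parse one YOLO label line: '<class> <x_center> <y_center> <width> <height>',
    with all four coordinates normalized to [0, 1] relative to image size."""
    cls, xc, yc, w, h = line.split()
    values = (int(cls), float(xc), float(yc), float(w), float(h))
    assert all(0.0 <= v <= 1.0 for v in values[1:]), "coordinates must be normalized"
    return values

# Example: class 1 ('Knife'), centered slightly left, covering ~12% of the frame
parse_yolo_label("1 0.42 0.55 0.30 0.40")  # → (1, 0.42, 0.55, 0.3, 0.4)
```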

Export to ONNX

After training, export for deployment:

uv run python -c "
from ultralytics import YOLO
model = YOLO('runs/detect/train/weights/best.pt')
model.export(format='onnx', imgsz=640, simplify=True)
"

Face Recognition (train_face.py)

Fine-tunes ArcFace glintr100 on venue-specific attendee faces for improved recognition in specific lighting and camera conditions.

uv run python train_face.py \
  --base-model models/antelopev2/glintr100.onnx \
  --data faces/ \
  --epochs 10 \
  --lr 1e-4 \
  --batch 64

This performs supervised contrastive learning — pushing embeddings of the same person closer together and different people farther apart, starting from the pre-trained glintr100 weights.
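For intuition only, a supervised contrastive objective over a batch can be sketched in NumPy (train_face.py's actual loss and hyperparameters may differ):

```python
import numpy as np

def supcon_loss(embeddings: np.ndarray, labels: np.ndarray,
                temperature: float = 0.07) -> float:
    """Supervised contrastive loss: each anchor's positives are the other
    embeddings with the same identity label; all others act as negatives."""
    z = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sim = z @ z.T / temperature            # pairwise cosine similarity, scaled
    np.fill_diagonal(sim, -np.inf)         # exclude each anchor from its own row
    log_prob = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    pos = labels[:, None] == labels[None, :]
    np.fill_diagonal(pos, False)           # same-identity pairs, minus self
    # negative mean log-likelihood of positives, averaged over valid anchors
    per_anchor = -np.where(pos, log_prob, 0.0).sum(axis=1) / np.maximum(pos.sum(axis=1), 1)
    return float(per_anchor[pos.any(axis=1)].mean())
```

Tightly clustered identities yield a lower loss than interleaved ones, which is exactly the gradient signal that pulls same-person embeddings together.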

Data Collection

Face crops can be collected from the live system:

uv run python collect_faces.py \
  --api-url https://api.securegate.dev.satschel.com \
  --event-id evt_abc123 \
  --output faces/ \
  --token $TOKEN

This downloads registered attendee face crops (with tenant CEK decryption handled server-side) and organizes them into per-identity directories for training.
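The per-identity layout can be sketched with a hypothetical helper (`organize_crops`, the `(path, identity)` pairing, and the file naming are assumptions for illustration, not collect_faces.py's actual code):

```python
from pathlib import Path
import shutil

def organize_crops(crops: list[tuple[str, str]], output_dir: str) -> None:
    """Copy each (crop_path, identity_id) pair into output_dir/<identity_id>/,
    the one-directory-per-identity layout the training script consumes."""
    for crop_path, identity_id in crops:
        dest = Path(output_dir) / identity_id
        dest.mkdir(parents=True, exist_ok=True)
        shutil.copy2(crop_path, dest / Path(crop_path).name)
```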

Quality Scoring (train_quality.py)

Trains a lightweight CNN to predict face quality scores from face crops.

uv run python train_quality.py \
  --data faces/ \
  --epochs 50 \
  --batch 32

Training labels are auto-generated from objective quality metrics:

| Metric | Weight | Range |
|---|---|---|
| Blur (Laplacian variance) | 0.3 | 0-1 |
| Yaw angle | 0.25 | 0-90 degrees |
| Pitch angle | 0.25 | 0-90 degrees |
| Face size (bounding box area) | 0.2 | 0-1 (normalized) |

The composite score (0-1) labels each face crop for regression training. The trained model replaces the heuristic quality filter with a learned scorer.
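The composite is a weighted sum of the normalized metrics. In the sketch below only the weights come from the table above; the per-metric normalizations (blur capped at a variance of 100, linear falloff with angle) are illustrative assumptions:

```python
def quality_score(blur_var: float, yaw_deg: float, pitch_deg: float,
                  face_area: float) -> float:
    """Composite face-quality label in [0, 1] for regression training."""
    blur = min(blur_var / 100.0, 1.0)              # Laplacian variance, capped (cap is an assumption)
    yaw = 1.0 - min(abs(yaw_deg), 90.0) / 90.0     # frontal face scores 1, full profile 0
    pitch = 1.0 - min(abs(pitch_deg), 90.0) / 90.0
    size = min(max(face_area, 0.0), 1.0)           # normalized bounding-box area
    return 0.30 * blur + 0.25 * yaw + 0.25 * pitch + 0.20 * size
```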

Benchmarking (benchmark.py)

Compares model accuracy and latency on a test dataset:

uv run python benchmark.py \
  --models models/glintr100.onnx models/glintr100_finetuned.onnx \
  --test-data test_faces/ \
  --metrics rank1 tar_far001 latency

Metrics

| Metric | Description |
|---|---|
| Rank-1 | Percentage of queries where the correct match is the top result |
| TAR@FAR=0.1% | True accept rate at 0.1% false accept rate |
| TAR@FAR=1% | True accept rate at 1% false accept rate |
| Latency | Mean inference time per face (ms) |
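For intuition, Rank-1 can be computed from query and gallery embeddings like this (a sketch; benchmark.py's exact implementation may differ):

```python
import numpy as np

def rank1_accuracy(query: np.ndarray, gallery: np.ndarray,
                   query_ids: np.ndarray, gallery_ids: np.ndarray) -> float:
    """Fraction of queries whose nearest gallery embedding
    (by cosine similarity) belongs to the same identity."""
    q = query / np.linalg.norm(query, axis=1, keepdims=True)
    g = gallery / np.linalg.norm(gallery, axis=1, keepdims=True)
    top = (q @ g.T).argmax(axis=1)   # index of the best gallery match per query
    return float((gallery_ids[top] == query_ids).mean())
```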

Environment

All training uses uv for Python environment management:

cd models
uv sync
uv run python train.py --help

GPU training requires CUDA 12.x and at least 8GB VRAM. The training scripts auto-detect available GPUs and use DataParallel for multi-GPU training.
