Models
Complete model inventory — all pre-trained and fine-tuned models used in the SecureGate inference pipeline
Production Models
All production models are open-source pre-trained models, or fine-tuned variants of them (YOLO26n is fine-tuned from Ultralytics weights). No proprietary training data is required for base functionality.
| Model | Source | Format | Service | GPU | Purpose |
|---|---|---|---|---|---|
| scrfd_10g_bnkps | InsightFace antelopev2 (MIT) | ONNX | ingest | Yes | Face detection — bounding boxes, confidence, 5-point landmarks |
| glintr100 | InsightFace antelopev2 (MIT) | ONNX | embed | Yes | 512-d ArcFace face embeddings for recognition |
| CodeFormer | sczhou/CodeFormer | PyTorch | enhance | Yes | Face restoration (fidelity=0.7) — recovers degraded faces |
| RealESRGAN x4plus | xinntao/Real-ESRGAN (BSD-3) | PyTorch | enhance | Yes | 4x background upscaling for low-resolution captures |
| YOLO26n | Ultralytics (fine-tuned) | ONNX | browser | No (WASM) | NMS-free weapon/knife/gun detection in-browser |
| YOLOv8n Firearm | Subh775/Firearm_Detection_Yolov8n | ONNX | ingest | Yes | Server-side weapon/knife/gun detection |
| MediaPipe Face Mesh | Google (Apache 2.0) | TFLite | browser | No (WASM) | 468-point 3D face landmarks for browser overlay |
Model Details
Face Detection (scrfd_10g_bnkps)
- Architecture: SCRFD (Sample and Computation Redistribution for Face Detection)
- GFLOPs: 10
- Outputs: bounding boxes, confidence scores, 5-point landmarks (left eye, right eye, nose, left mouth corner, right mouth corner)
- Input: any resolution RGB image
- Fallback: buffalo_l model pack if antelopev2 unavailable
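The fallback rule above can be sketched as a small "first pack that loads wins" helper. This is a minimal illustration, not the pipeline's actual loader: `load_first_available`, `PACK_PREFERENCE`, and the injected `loader` callable are hypothetical names introduced here.

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")

# Preference order taken from the text: antelopev2 first, buffalo_l as fallback.
PACK_PREFERENCE = ("antelopev2", "buffalo_l")


def load_first_available(
    pack_names: Sequence[str],
    loader: Callable[[str], T],
) -> tuple[str, T]:
    """Try each model pack in order and return (name, model) for the first that loads.

    `loader` is a hypothetical callable that raises if the pack is unavailable.
    """
    errors: list[str] = []
    for name in pack_names:
        try:
            return name, loader(name)
        except Exception as exc:  # a missing or corrupt pack should not abort startup
            errors.append(f"{name}: {exc}")
    raise RuntimeError("no model pack could be loaded: " + "; ".join(errors))
```

With InsightFace, `loader` could plausibly wrap `insightface.app.FaceAnalysis(name=...)`. Returning the pack name alongside the model lets the caller log which pack is actually in use.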
Face Recognition (glintr100)
- Architecture: ResNet-100 trained with ArcFace loss (Additive Angular Margin)
- Embedding dimension: 512 (float32, L2-normalized)
- Input: 112×112 aligned RGB face crop
- Similarity metric: cosine similarity (dot product of L2-normalized vectors)
- Benchmark: 99.8% accuracy on LFW, 98.0% on CFP-FP
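Because the embeddings are L2-normalized, cosine similarity reduces to a plain dot product. The sketch below shows that relationship in pure Python. The function names are illustrative and the short vectors stand in for real 512-d embeddings.

```python
import math
from typing import Sequence


def l2_normalize(vec: Sequence[float]) -> list[float]:
    """Scale a vector to unit length, as the embed service is described as doing."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two vectors that are ALREADY L2-normalized.

    For unit vectors, cos(theta) = a . b, so no division by norms is needed.
    """
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    return sum(x * y for x, y in zip(a, b))


if __name__ == "__main__":
    probe = l2_normalize([3.0, 4.0])     # toy stand-in for a 512-d embedding
    gallery = l2_normalize([4.0, 3.0])
    print(round(cosine_similarity(probe, gallery), 4))
```

A match decision would compare this score against a threshold. The text does not state one, so none is assumed here.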
Face Restoration (CodeFormer)
- Architecture: Transformer-based codebook lookup
- Fidelity parameter: 0.7 on a 0 to 1 scale (higher values favor identity preservation, lower values favor visual quality)
- Input: degraded face crop (any size)
- Output: restored face (same size)
- Handles: blur, noise, compression artifacts, low resolution
Background Upscaling (RealESRGAN x4plus)
- Architecture: Enhanced SRGAN with U-Net discriminator
- Scale factor: 4x
- Input: any resolution RGB image
- Output: 4x upscaled image
- Handles: real-world degradation (blur, noise, JPEG artifacts)
Weapon Detection (YOLO26n)
- Architecture: YOLO26 nano (NMS-free head, 2.4M params, 5.4B FLOPs)
- Classes: Weapon, Knife, Gun
- Training: Fine-tuned on Weapon-2 dataset (3,839 images, Roboflow)
- Performance: 94.9% mAP@50, 75.0% mAP@50-95, 95.3% precision, 88.2% recall
- ONNX size: 9.7 MB (opset 12, FP32, static 640×640)
- Input: 640×640 RGB
- Runtime: onnxruntime-web (WebGPU → WebGL → WASM)
- Latency: ~30 ms per frame in browser
- Paper: SG-TR-2026-001
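As a quick sanity check on the figures above, the listed ONNX size and latency can be related to the parameter count and an approximate frame rate. This is back-of-the-envelope arithmetic, assuming decimal megabytes and that FP32 weights dominate the file size. The helper names are illustrative.

```python
BYTES_PER_FP32 = 4  # each FP32 weight occupies 4 bytes


def fp32_weight_megabytes(param_count: float) -> float:
    """Approximate size of the raw FP32 weights in decimal megabytes."""
    return param_count * BYTES_PER_FP32 / 1e6


def frames_per_second(latency_ms: float) -> float:
    """Upper bound on throughput if frames are processed one at a time."""
    return 1000.0 / latency_ms


if __name__ == "__main__":
    # Figures from the list above: 2.4M params, ~30 ms per frame.
    print(f"{fp32_weight_megabytes(2.4e6):.1f} MB of weights")
    print(f"{frames_per_second(30.0):.1f} frames per second")
```

About 9.6 MB of weights is consistent with the listed 9.7 MB file once graph metadata is included, and ~30 ms per frame corresponds to roughly 33 frames per second.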
Weapon Detection Server (YOLOv8n Firearm)
- Architecture: YOLOv8 nano
- Source: Subh775/Firearm_Detection_Yolov8n (Hugging Face)
- Training data: Subh775/WeaponDetection (9,493 images, Roboflow)
- Classes: Weapon, Knife, Gun
- Input: 640×640 RGB
- Runtime: ONNX Runtime with CUDA EP
- Latency: ~4 ms per frame on GH200
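A minimal sketch of how a session with the CUDA execution provider might be created in ONNX Runtime. The CPU fallback is an assumption added for illustration, since the text only mentions the CUDA EP. `pick_providers` and `create_session` are hypothetical helper names. `onnxruntime` is imported lazily so the selection logic can be exercised without it installed.

```python
from typing import Sequence

# Preference order: CUDA as stated in the text; the CPU fallback is an assumption.
PREFERRED_PROVIDERS = ("CUDAExecutionProvider", "CPUExecutionProvider")


def pick_providers(
    available: Sequence[str],
    preferred: Sequence[str] = PREFERRED_PROVIDERS,
) -> list[str]:
    """Return the preferred providers that are actually available, in preference order."""
    chosen = [p for p in preferred if p in available]
    if not chosen:
        raise RuntimeError(f"none of {list(preferred)} are available: {list(available)}")
    return chosen


def create_session(model_path: str):
    """Create an inference session using the best available execution provider."""
    import onnxruntime as ort  # lazy import: only needed when a session is built

    providers = pick_providers(ort.get_available_providers())
    return ort.InferenceSession(model_path, providers=providers)
```

Passing an ordered provider list lets ONNX Runtime fall back to the next entry if the first cannot be initialized. Logging `session.get_providers()` afterwards confirms which one is active.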
Face Mesh (MediaPipe)
- Architecture: MediaPipe Face Mesh
- Landmarks: 468 3D points
- Input: RGB video frame
- Runtime: TFLite (WebAssembly in browser)
- Purpose: browser-side face landmark overlay, head pose estimation
Model Storage
Models are stored under /models/ in the container and downloaded on first boot, using the following layout:
/models/
+-- antelopev2/
|   +-- scrfd_10g_bnkps.onnx    (face detection)
|   +-- glintr100.onnx          (face recognition)
+-- codeformer/
|   +-- codeformer.pth          (face restoration)
+-- realesrgan/
|   +-- RealESRGAN_x4plus.pth   (background upscaling)
+-- weapon/
|   +-- yolov8n_firearm.onnx    (server weapon detection)
+-- browser/
    +-- yolo26n.onnx            (browser weapon detection)
    +-- face_mesh.tflite        (browser face landmarks)

Models are fetched from Hugging Face or InsightFace model repositories. The hf CLI is used for Hugging Face downloads.
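A first-boot check along these lines could decide which files still need to be fetched. This is a sketch under stated assumptions: the relative paths mirror the layout above, and `missing_models` and `hf_download_command` are hypothetical helpers. The command is built as an argument list for `hf download <repo_id> --local-dir <dir>` but is not executed here.

```python
from pathlib import Path
from typing import Sequence

# Relative paths copied from the /models/ layout described above.
EXPECTED_MODELS = (
    "antelopev2/scrfd_10g_bnkps.onnx",
    "antelopev2/glintr100.onnx",
    "codeformer/codeformer.pth",
    "realesrgan/RealESRGAN_x4plus.pth",
    "weapon/yolov8n_firearm.onnx",
    "browser/yolo26n.onnx",
    "browser/face_mesh.tflite",
)


def missing_models(root: Path, expected: Sequence[str] = EXPECTED_MODELS) -> list[str]:
    """List expected model files that are not yet present under `root`."""
    return [rel for rel in expected if not (root / rel).is_file()]


def hf_download_command(repo_id: str, local_dir: Path) -> list[str]:
    """Build (but do not run) an hf CLI command that downloads a repo into `local_dir`."""
    return ["hf", "download", repo_id, "--local-dir", str(local_dir)]


if __name__ == "__main__":
    for rel in missing_models(Path("/models")):
        print("missing:", rel)
    # Repo ID taken from the inventory table; file names inside the repo may differ
    # from the local layout and might need renaming after download.
    print(hf_download_command("Subh775/Firearm_Detection_Yolov8n", Path("/models/weapon")))
```

Keeping the check separate from the download step makes a boot with all files present a no-op, and the returned list can be passed to `subprocess.run` once the per-model sources are confirmed.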