Real-Time Face Detection Across Surveillance Cameras for Headcount and Security

The challenge: knowing how many people are inside a customer-facing facility

Large automobile dealership networks operate a category of facility that sits in an awkward middle ground for traditional security infrastructure. On one side, you have the security requirements: valuable inventory, high-value transactions, controlled-area service workshops, and the occasional bad actor who presents a real safety or theft risk. On the other side, you have the business requirement: customers must be able to walk in freely. A dealership that makes customers pass through a turnstile or badge-in at reception is not a dealership that sells many cars.

This constraint rules out the conventional answers to occupancy monitoring. Badge-based access control requires enrolment and carries friction that is acceptable in an office but not in a retail environment. Manual door counts require a dedicated person at every entry point - expensive, inaccurate, and unavailable during peak hours when counts matter most. The existing CCTV infrastructure covered every entrance and public area, but the cameras were used exclusively for post-incident review. No system was extracting occupancy signals from that footage in real time.

The business problems this created were concrete. Security staff had no objective basis for knowing whether the facility was at normal occupancy or significantly above it - which matters for managing crowds during sales events, service drive rush periods, and promotional days. Staffing decisions were based on intuition and historical patterns rather than actual footfall. For security specifically, there was no systematic way to detect when a previously barred individual - a known fraudster, someone who had made threats, or someone involved in a prior incident - appeared at a facility. The CCTV footage would show them after the fact, not in time to act.

The solution: face detection across existing camera streams

The system operates as an analytics layer over the existing surveillance camera infrastructure. No cameras were replaced. The software ingests the live video streams from cameras positioned at facility entrances and exits, and runs face detection - not full facial recognition for identification of known individuals, but detection of faces as they appear in frame. The distinction is important: for occupancy purposes, the system counts and tracks faces without building a database of visitor identities. Matching against known individuals is confined to the separate blacklist module described later.

At each entrance and exit camera, the system maintains a directional count. A face detected moving inward increments the facility headcount. A face detected moving outward decrements it. The running headcount is updated continuously and displayed on a security dashboard. Every detected face - whether entering or exiting - is recorded as a snapshot with a timestamp and camera identifier. These snapshots are retained for a configurable period and are searchable by time range and camera location. If a security incident occurs and the question is "who was in the building between 14:00 and 15:30," the answer is a filtered view of the snapshot log rather than a manual review of hours of footage.
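The counting and snapshot logic described above can be sketched roughly as follows. This is a minimal illustration, not the deployed implementation: the `Snapshot` record layout, the `HeadcountTracker` class, and the decision to clamp the count at zero are all assumptions made for the example, and the direction of travel is taken as already determined by the detection step.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional


@dataclass(frozen=True)
class Snapshot:
    """One logged face detection (hypothetical record layout)."""
    timestamp: datetime
    camera_id: str
    direction: str  # "in" or "out"


@dataclass
class HeadcountTracker:
    """Keeps a running headcount and a searchable snapshot log."""
    headcount: int = 0
    log: List[Snapshot] = field(default_factory=list)

    def record(self, timestamp: datetime, camera_id: str, direction: str) -> int:
        """Log the detection, update the headcount, and return the new value."""
        if direction not in ("in", "out"):
            raise ValueError(f"unknown direction: {direction!r}")
        self.log.append(Snapshot(timestamp, camera_id, direction))
        if direction == "in":
            self.headcount += 1
        else:
            # Assumption: clamp at zero so a missed entry cannot push the
            # displayed count negative.
            self.headcount = max(0, self.headcount - 1)
        return self.headcount

    def search(self, start: datetime, end: datetime,
               camera_id: Optional[str] = None) -> List[Snapshot]:
        """Return snapshots with start <= timestamp <= end, optionally for one camera."""
        return [
            s for s in self.log
            if start <= s.timestamp <= end
            and (camera_id is None or s.camera_id == camera_id)
        ]
```

With a structure like this, the "who was in the building between 14:00 and 15:30" question becomes a single `search` call over the log, optionally narrowed to one camera.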

The multiple-camera architecture matters for accuracy. A single entrance camera captures faces well under controlled lighting conditions; it degrades when the sun angle changes, when groups of people enter together, or when someone moves quickly through the frame. The system handles these conditions by correlating across multiple camera angles where overlap exists, applying confidence thresholds to detections, and flagging low-confidence periods for review rather than silently dropping counts. The headcount figure on the dashboard is accompanied by a confidence indicator so operators can see when conditions have degraded detection quality.
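One way to picture the thresholding and review flagging is the sketch below. It is an assumption-laden illustration: the two threshold values, the function name `summarize_window`, and the choice of the mean detection score as the dashboard's confidence indicator are invented for the example, and cross-camera correlation is left out entirely.

```python
from statistics import mean
from typing import List, Optional, Tuple

DETECTION_THRESHOLD = 0.6  # assumed: minimum score for a detection to be counted
DEGRADED_THRESHOLD = 0.75  # assumed: mean score below this marks the window as degraded


def summarize_window(scores: List[float]) -> Tuple[int, Optional[float], bool]:
    """Summarize one time window of detection scores.

    Returns (counted detections, confidence indicator, needs_review).
    The indicator is None when nothing was counted in the window.
    """
    counted = [s for s in scores if s >= DETECTION_THRESHOLD]
    rejected = len(scores) - len(counted)
    indicator = mean(counted) if counted else None
    # Flag the window instead of silently dropping counts: either some
    # detections fell below the threshold, or the accepted ones are weak overall.
    needs_review = rejected > 0 or (
        indicator is not None and indicator < DEGRADED_THRESHOLD
    )
    return len(counted), indicator, needs_review
```

The point of returning the flag alongside the count is that the operator sees both: a number, and a signal about how far to trust it.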

Multi-cam: real-time across all streams

Standard hardware: Intel i7, 16GB RAM

Instant: blacklist match alerts

Blacklist detection and security alerts

A separate module handles what might be called the recognition function - matching detected faces against a defined blacklist of individuals. This is a smaller, purpose-specific task compared to general crowd identification. The blacklist is populated manually by the security team with photographs of individuals who have been formally flagged: prior theft suspects, individuals who have made threats, people banned following previous incidents.

When a face detected at any entrance camera matches a blacklist entry above the defined confidence threshold, the security team receives an immediate alert. The alert includes the matched photograph, the live snapshot from the camera, the camera location, and the timestamp. The security officer can assess the match and respond within the time window before the individual moves deeper into the facility. The alert does not wait for a human to review footage - it is generated and pushed automatically, in real time.
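The article does not say how faces are compared, so the following is only a sketch of one common approach: each face is reduced to a numeric embedding vector, and the live face is compared with each blacklist reference by cosine similarity. The `Alert` fields, the `match_blacklist` function, and the default threshold are hypothetical, and a real alert would also carry the matched photograph and the live snapshot.

```python
import math
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, Optional, Sequence


@dataclass(frozen=True)
class Alert:
    """Alert payload (hypothetical; image references omitted for brevity)."""
    blacklist_id: str
    camera_id: str
    timestamp: datetime
    confidence: float


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def match_blacklist(face: Sequence[float],
                    blacklist: Dict[str, Sequence[float]],
                    camera_id: str,
                    timestamp: datetime,
                    threshold: float = 0.5) -> Optional[Alert]:
    """Return an Alert for the best-scoring blacklist entry, or None below threshold."""
    best_id: Optional[str] = None
    best_score = float("-inf")
    for entry_id, reference in blacklist.items():
        score = cosine_similarity(face, reference)
        if score > best_score:
            best_id, best_score = entry_id, score
    if best_id is None or best_score < threshold:
        return None
    return Alert(best_id, camera_id, timestamp, best_score)
```

Including the score in the payload is what lets the security officer judge the match rather than simply react to it.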

The confidence threshold for blacklist alerts is tuned conservatively, in favour of over-alerting. A false positive in this context - alerting security about someone who is not actually a blacklisted individual - is a manageable nuisance. A false negative - failing to alert on someone who is - is a missed security event. The system therefore alerts on any match above a deliberately low confidence threshold and includes the confidence score in the alert, allowing the security officer to apply their own judgment. Multiple high-confidence matches on the same individual within a short time window generate a higher-priority escalation.
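The two-tier behaviour, a low bar for ordinary alerts and escalation on repeated strong matches, could look something like this sketch. Every constant here is an assumed value chosen for illustration, the `AlertPrioritizer` class is hypothetical, and matches are assumed to arrive in timestamp order.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta
from typing import Deque, Dict, Optional

ALERT_THRESHOLD = 0.5                     # assumed: deliberately low bar for any alert
HIGH_CONFIDENCE = 0.8                     # assumed: score treated as a strong match
ESCALATION_WINDOW = timedelta(minutes=5)  # assumed: "short time window"
ESCALATION_COUNT = 2                      # assumed: strong matches needed to escalate


class AlertPrioritizer:
    """Classifies blacklist matches as no alert, alert, or escalation."""

    def __init__(self) -> None:
        # Timestamps of recent high-confidence matches, per blacklist entry.
        self._recent: Dict[str, Deque[datetime]] = defaultdict(deque)

    def classify(self, blacklist_id: str, confidence: float,
                 timestamp: datetime) -> Optional[str]:
        """Return None, "alert", or "escalation" for one match."""
        if confidence < ALERT_THRESHOLD:
            return None
        if confidence < HIGH_CONFIDENCE:
            return "alert"
        recent = self._recent[blacklist_id]
        recent.append(timestamp)
        # Drop strong matches that have aged out of the escalation window.
        while timestamp - recent[0] > ESCALATION_WINDOW:
            recent.popleft()
        return "escalation" if len(recent) >= ESCALATION_COUNT else "alert"
```

Keeping the alert bar low and reserving escalation for repeated strong matches reflects the trade-off described above: a false positive costs an officer a moment of attention, while a false negative costs a missed event.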

The most useful insight from the deployment wasn't the blacklist capability - it was the hourly visitor pattern data. Dealerships had intuitions about peak hours; the system showed the actual distribution. The difference between what managers expected and what the data showed consistently surprised them, and the staffing adjustments that followed were immediate.

Hardware: no GPU cluster required

A common objection to real-time video analytics is infrastructure cost. The assumption is that processing multiple live camera streams simultaneously requires a GPU server or cloud-based inference infrastructure. This deployment proved that assumption wrong for the scale involved. The entire system - ingesting multiple camera streams, running face detection, maintaining headcount, and matching against the blacklist - runs on a standard Intel Core i7 workstation with 16GB RAM. No GPU, no dedicated server hardware, no cloud inference endpoint.

This matters beyond the initial cost. A system that runs on commodity workstation hardware can be maintained by any technically competent person. It doesn't require a specialist ML infrastructure team to keep running. If the workstation fails, a replacement can be sourced locally and the software reinstalled. The operational dependency on specialised infrastructure - which is one of the real costs of deploying ML in commercial settings - is eliminated.

Results: headcount, patterns, and caught incidents

Real-time facility headcount is now available on the security dashboard continuously. During peak periods - weekend sales events, end-of-month service rushes - the operations team can see occupancy trends as they develop and respond before situations become difficult to manage. Staffing on high-footfall days was adjusted based on the actual visitor pattern data rather than estimates, and the adjustments held up against subsequent observations.
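The hourly visitor pattern behind those staffing adjustments is a simple aggregation over the timestamps of inbound detections. A minimal sketch, assuming the entry timestamps have already been pulled from the snapshot log (the function name `hourly_entries` is hypothetical):

```python
from collections import Counter
from datetime import datetime
from typing import Dict, Iterable


def hourly_entries(entry_times: Iterable[datetime]) -> Dict[int, int]:
    """Count inbound detections per hour of day (0-23), including empty hours."""
    counts = Counter(t.hour for t in entry_times)
    return {hour: counts.get(hour, 0) for hour in range(24)}
```

Listing every hour, including the empty ones, makes it easier to compare the observed distribution against what managers expected.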

The blacklist detection capability has caught incidents that manual security monitoring missed. The combination of multiple camera streams, snapshot logging, and real-time alerting provides a level of coverage that a human security guard monitoring CCTV feeds cannot reliably replicate across a facility with multiple entry points. The incidents caught involved individuals who would otherwise not have been identified until post-incident footage review - by which time it would have been too late to intervene.

The deployment required no changes to the camera infrastructure and no network infrastructure beyond connecting the analysis workstation to the camera feeds. The existing CCTV system became more valuable without being replaced. That is the architecture principle this deployment validated: intelligence as an analytics layer on top of infrastructure that already exists, delivered at a price point that makes sense for a multi-site commercial operator.
