IT Brief UK - Technology news for CIOs & IT decision-makers

Milestone unveils traffic VLM & XProtect summarisation

Tue, 30th Dec 2025

Milestone Systems has introduced a vision language model focused on traffic video, alongside a new video summarisation tool for its XProtect video management software and a VLM-as-a-service offer for developers.

The company said the Hafnia VLM uses NVIDIA Cosmos Reason and targets one of the largest issues in video surveillance: the volume of footage that staff must review manually. The launch covers both end-users of XProtect and software builders that want to embed video analysis in their own products.

The vision language model specialises in understanding traffic scenes. It runs on cloud infrastructure or regional data centres and uses data sets from either Europe or the US. Milestone said it has fine-tuned the model on 75,000 hours of real-world video that it describes as responsibly sourced.

The new Video Summarization plug-in for the XProtect Smart Client converts segments of video into structured text summaries. Users send a short video clip and a text prompt, and the model returns a written description within the XProtect interface.

Milestone said the tool allows operators to search within text summaries based on what occurred in the video rather than relying on timestamps or manual tagging. Users can bookmark and filter summaries and link them to existing XProtect events and rules so that the system generates summaries when specific alarms or alerts occur.
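As an illustration of what searching by content rather than by timestamp can look like, the following is a minimal sketch only. It does not use or reproduce any XProtect API; the `ClipSummary` record, its fields and the `search_summaries` helper are hypothetical names invented for this example, and the sample summaries are made up.

```python
from dataclasses import dataclass


@dataclass
class ClipSummary:
    """Hypothetical record for one text summary of a video clip."""
    camera: str
    start: str            # ISO 8601 timestamp of the clip start
    text: str             # written description returned by the model
    bookmarked: bool = False


def search_summaries(summaries, query, bookmarked_only=False):
    """Return summaries whose text contains every word in the query.

    Matching is a case-insensitive substring test per query word,
    which is far simpler than whatever a real product would use.
    """
    terms = query.lower().split()
    hits = []
    for summary in summaries:
        if bookmarked_only and not summary.bookmarked:
            continue
        body = summary.text.lower()
        if all(term in body for term in terms):
            hits.append(summary)
    return hits


# Invented sample data, for illustration only.
SAMPLE = [
    ClipSummary("cam-01", "2025-12-30T08:15:00Z",
                "A truck stalled in the left lane; traffic is queuing behind it.",
                bookmarked=True),
    ClipSummary("cam-02", "2025-12-30T08:20:00Z",
                "Light traffic, no incidents observed."),
    ClipSummary("cam-03", "2025-12-30T08:25:00Z",
                "A cyclist crosses against the signal; a truck brakes sharply."),
]

if __name__ == "__main__":
    for hit in search_summaries(SAMPLE, "truck stalled"):
        print(hit.camera, hit.start, hit.text)
```

The query describes the event ("truck stalled") and the matching summary supplies the camera and time, which reverses the usual workflow of scrubbing footage around a known timestamp.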

The company said early reports from users indicate that summarisation may cut false-alarm fatigue among operators by up to 30%. It said the software filters out irrelevant movement or noise, which directs operator attention towards events it categorises as valid.

According to Milestone, the plug-in integrates directly into the XProtect Smart Client and installs within minutes. The company said it offers the tool as a free download and charges only when the VLM processes prompts.

Traffic-focused model

Milestone said Hafnia VLM uses NVIDIA Cosmos Curator for data preparation and runs as a regionalised system. It currently offers customised models for the US and EU markets and plans to add further regions.

The model follows prompt-based instructions for traffic-related operations. Milestone said this focus improves performance on tasks such as traffic monitoring and incident review when compared with general-purpose AI models.

The company said the training data set uses auditable data lineage and aligns with GDPR and the EU AI Act. It said this legal alignment supports deployments in regulated markets and in public-sector traffic management.

Milestone has positioned the new VLM as its underlying engine for both the XProtect summarisation plug-in and third-party integrations. It described the Hafnia platform as one of the more advanced video AI systems in the traffic sector due to its tuning on real-world operational footage.

VLM as a service

The Hafnia VLM as a Service, or VLMaaS, gives developers, integrators and partners API-based access to the traffic-focused model. Access runs over HTTPS and uses a pay-per-use structure based on API calls, rather than up-front licence or training fees.
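Milestone's actual API is not described in the announcement, so the following is a hedged sketch of what a pay-per-call request over HTTPS could look like in general. The endpoint URL, the JSON field names, the bearer-token header and the per-call price are all assumptions made up for illustration; only the ideas of sending a clip plus a text prompt, choosing an EU or US model, and paying per API call come from the description above. The code builds the request but does not send it.

```python
import base64
import json
import urllib.request
from decimal import Decimal

# Hypothetical endpoint; NOT a real Milestone URL.
ENDPOINT = "https://vlm.example.invalid/v1/summaries"

# The announcement mentions EU and US models; these codes are assumed.
REGIONS = {"eu", "us"}


def build_summary_request(clip_bytes, prompt, api_key, region):
    """Build (but do not send) an HTTPS POST with a clip and a prompt.

    All field names and the auth scheme are illustrative assumptions.
    """
    if region not in REGIONS:
        raise ValueError(f"unsupported region: {region!r}")
    payload = {
        "region": region,
        "prompt": prompt,
        # Binary video is base64-encoded so it can travel inside JSON.
        "video_b64": base64.b64encode(clip_bytes).decode("ascii"),
    }
    return urllib.request.Request(
        ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def estimate_cost(num_calls, price_per_call):
    """Pay-per-use estimate: calls multiplied by an assumed unit price."""
    return Decimal(num_calls) * Decimal(price_per_call)


if __name__ == "__main__":
    req = build_summary_request(b"\x00\x01\x02", "Describe any incidents.",
                                "DEMO_KEY", "eu")
    print(req.get_method(), req.full_url)
    # The price below is an invented placeholder, not a published rate.
    print(estimate_cost(1200, "0.01"))
```

Under a per-call model, cost scales with the number of clips submitted rather than with licence seats, which is why a small pilot and a large production platform can use the same interface.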

Milestone said VLMaaS enables developers to add video intelligence functions to applications without setting up or fine-tuning their own AI stacks. The service targets both small-scale pilots, such as minimum viable products, and larger-scale production platforms.

The company said development teams can reduce AI and analytics effort by a factor of up to 70 compared with tuning their own models to reach similar results. It said the service can support standalone offerings or integrate with Milestone's own product portfolio.

Cities already using XProtect for traffic management, including Genoa in Italy and Dubuque in the US state of Iowa, have signalled interest in the new tools. These customers plan to apply the technology in urban traffic control and incident response.

Andrew Burnett, Acting Chief Technology Officer at Milestone Systems, said the launch addresses long-standing workload issues in video monitoring and implementation.

"With the Vision Language Model as a Service and Video Summarization for XProtect, we're tackling some of the most challenging bottlenecks: video overload and time-consuming manual work. Operators get immediate insight directly within XProtect; builders get API-first access to production-ready intelligence without bespoke training or heavy infrastructure.

"Because this model is specialized for real-world traffic video and fine-tuned on responsibly sourced data, customers can trust the results, deploy with confidence, and enhance all existing solutions in place. It's the fastest, most advanced and impactful path to turning video into actionable outcomes," said Burnett.

Milestone plans further regional variants of Hafnia VLM and aims to extend both the XProtect plug-in and the VLMaaS offer as part of its traffic and urban management portfolio.