How deep learning is advancing video analytics

Video has taken the world by storm with a myriad of intelligent devices continuously capturing vast amounts of data about how people live and what they do.

At the end of 2014, IHS Technology estimated that more than 245 million cameras were in operation globally. London alone has 500,000 cameras dotted throughout the city, which works out at about one camera for every 16 people.

Thanks to smart cameras, CCTV devices, and even drones fitted with intelligent cameras, users are able to record video at an unprecedented scale and pace.

This vast store of data-rich content is used for a range of purposes – from gaming and law enforcement to crowd management at large events. In the retail industry, some stores are using heat mapping to understand and improve the shopping experience.

The sheer volume of video produced makes it impossible for people to manually process it effectively. Intelligent video analysis is required to filter and process data, trace moving objects, detect abnormalities, and generate alarms so that regulatory bodies can take appropriate action.
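As a rough illustration of the "detect abnormalities and generate alarms" step, the sketch below flags frames that differ sharply from the frame before them. It uses simple frame differencing rather than deep learning, and it is not taken from any product mentioned here. The frame representation (nested lists of greyscale values), the function names and both thresholds are assumptions made for the example.

```python
from typing import List

# Assumption: a frame is a grid of greyscale pixels (0-255), one inner list per row.
Frame = List[List[int]]


def changed_fraction(prev: Frame, curr: Frame, pixel_delta: int = 25) -> float:
    """Return the share of pixels whose brightness moved by more than pixel_delta."""
    total = 0
    changed = 0
    for prev_row, curr_row in zip(prev, curr):
        for p, c in zip(prev_row, curr_row):
            total += 1
            if abs(p - c) > pixel_delta:
                changed += 1
    return changed / total if total else 0.0


def frames_to_alarm(frames: List[Frame], alarm_fraction: float = 0.2) -> List[int]:
    """Return the indices of frames that differ sharply from the previous frame."""
    alarms = []
    for i in range(1, len(frames)):
        if changed_fraction(frames[i - 1], frames[i]) > alarm_fraction:
            alarms.append(i)
    return alarms


if __name__ == "__main__":
    quiet = [[10] * 4 for _ in range(4)]
    busy = [[10, 10, 200, 200] for _ in range(4)]  # half the pixels change
    print(frames_to_alarm([quiet, quiet, busy, busy]))
```

A real system would replace the pixel comparison with a trained model, but the overall shape stays the same: score each frame, compare against a threshold, and raise an alert.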

Deep learning technologies are making it possible to process and analyse vast streams of footage. It’s an area that’s been seeing significant investment and research.

Loosely modelled on the way the human brain processes information, the technique uses sophisticated, multi-level, “deep” neural networks to create systems that can perform feature detection from massive amounts of unlabelled training data.

Data scientists in both industry and academia are using graphics processing units (GPUs) to accelerate their deep learning algorithms. GPUs process highly parallel computing tasks – like video and graphics – quickly and efficiently.
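To see why this kind of workload suits a GPU, consider a single neural network layer. The sketch below is plain Python and runs sequentially, so it shows only the structure of the computation, not any speed-up. The function names and the choice of a sigmoid activation are assumptions made for illustration.

```python
import math
from typing import List


def neuron_output(weights: List[float], bias: float, inputs: List[float]) -> float:
    """One neuron: a weighted sum of the inputs passed through a sigmoid."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def layer_forward(
    weight_rows: List[List[float]], biases: List[float], inputs: List[float]
) -> List[float]:
    """Compute every neuron in a layer from the same input vector."""
    # Each neuron reads the shared inputs and produces its own output, so no
    # neuron depends on another. That independence is what allows the work
    # to be spread across many GPU threads at once.
    return [neuron_output(w, b, inputs) for w, b in zip(weight_rows, biases)]
```

Because the neurons in a layer do not depend on one another, the same arithmetic can be carried out for all of them simultaneously, which is the pattern GPUs are built for.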

GPU-powered deep learning has led to groundbreaking improvements across a variety of applications, including image classification, speech recognition, and natural language processing.

In video analytics, organisations can now mine massive amounts of visual data to glean valuable insight about what is happening in the world.

As GPU computing has advanced, researchers can train deep neural networks with larger datasets in less time, while also reducing their data centre infrastructure footprints and cost of operation.

The highly parallel architecture of GPUs has enabled a new software model (deep learning) whereby billions of software-neurons are trained to learn about the world.

The widespread adoption of CUDA, NVIDIA’s parallel computing platform, has also played a big role in the recent progress of deep learning solutions.

CUDA helps programmers to quickly and easily port their solutions from a laptop to a robot, drone or smart camera powered by Jetson (NVIDIA’s embedded computing platform). As a result, some advanced software will be able to manage itself, drawing on the GPU’s compute and parallel-processing capabilities.

Herta Security is an example of a company that relies on deep learning for intelligent video analytics. The Barcelona-based company specialises in facial recognition solutions that track and match faces instantaneously. Its video surveillance technology is used across a variety of settings, such as airports, banks, sports stadiums and shopping malls.
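The article does not describe how Herta's matching works, so the following is only a generic sketch of one common approach: a neural network turns each face into a numeric vector (an embedding), and a new face is compared against a watchlist using cosine similarity. The function names, the watchlist structure and the 0.9 threshold are all hypothetical.

```python
import math
from typing import Dict, List, Optional, Tuple


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Return the cosine of the angle between two vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_match(
    face: List[float],
    watchlist: Dict[str, List[float]],
    threshold: float = 0.9,  # assumed cut-off; a real system would tune this
) -> Optional[Tuple[str, float]]:
    """Return the closest watchlist entry at or above the threshold, or None."""
    best_id = None
    best_score = threshold
    for person_id, reference in watchlist.items():
        score = cosine_similarity(face, reference)
        if score >= best_score:
            best_id, best_score = person_id, score
    return (best_id, best_score) if best_id is not None else None
```

In this sketch, a result of `None` means nobody on the watchlist was similar enough, so no alert would be raised.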

In the retail industry, intelligent video surveillance could identify the small groups of repeat offenders responsible for most shoplifting at stores. Most facial recognition systems can now notify security personnel on their mobile devices within seven seconds.

This would never be possible through a manual process. Only automated solutions, delivered by deep learning and artificial intelligence, can competently and effectively analyse the huge amount of data that video generates.

Sports stadiums, concerts and other large-scale events also rely on video analytics to keep the public safe. Thanks to advances in facial recognition technology and video analytics, each person’s face can now be scanned, analysed and, if needed, flagged to security staff throughout a venue so that entry can be refused.

Today’s facial recognition algorithms are ten times more accurate than those of 2002 and 100 times more accurate than those of 1995, according to the National Institute of Standards and Technology.

Over time, the demand for intelligent video analytics is only going to increase as more and more industries realise the benefits of its application and as systems become able to recognise and analyse a growing number of behaviours.

As the infrastructure supporting video analytics grows, so does the need for deeper integration of servers, sensors and software. No one wants the hassle of being forced to recompile or recode their applications.

That’s why it’s important for the companies that are developing these tools to work toward creating an open and integrated platform so that these solutions can work seamlessly together.


Sourced from Serge Palaric, VP of sales, embedded and OEMs, NVIDIA Europe

Ben Rossi
