Ultra-compressed and more efficient AI models for industry

Here is why compressed AI models could offer new opportunities for your organisation and why they're more sustainable long-term

AI has rapidly evolved into an important strategic pillar. It is no longer just hype; it has already entered operational business across many sectors. Alongside this enthusiasm, however, lies a technical and economic reality: AI models, especially the most powerful large language models (LLMs), are becoming larger, more energy-intensive, and more expensive to scale effectively.

The growing size of these LLMs drives an ever-increasing demand for computing resources, high-end GPUs, and vast cloud infrastructure. For many companies, total operating costs are prohibitively high.

Quantum-inspired tensor networks offer more cost-efficient approaches

In response, a new approach is emerging that makes AI more accessible, efficient, and adaptable to local conditions: quantum-inspired tensor networks. These networks offer several advantages over conventional compression techniques. Instead of creating ever-larger models, the focus shifts to compressing existing models through two complementary techniques: tensorisation, which identifies layers in a neural network that are suitable for reduction and breaks the large matrices within those layers into smaller, interconnected matrices; and quantisation, which reduces numerical precision. Together, these techniques can shrink models by up to 95 per cent while maintaining performance and drastically improving efficiency.
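The two techniques can be sketched in a few lines of NumPy. This is a toy illustration only, not Multiverse Computing's actual method: tensorisation is approximated here by a truncated SVD (the simplest matrix factorisation in the tensor-network family), and quantisation by rounding one factor to 8-bit integers.

```python
import numpy as np

rng = np.random.default_rng(0)

# A toy "weight matrix" with low-rank structure, standing in for
# one layer of a neural network.
W = rng.standard_normal((512, 16)) @ rng.standard_normal((16, 512))

# --- Tensorisation (illustrated with a truncated SVD) ---
# Break the large matrix into two smaller, interconnected factors.
U, s, Vt = np.linalg.svd(W, full_matrices=False)
rank = 16
A = U[:, :rank] * s[:rank]          # 512 x 16
B = Vt[:rank, :]                    # 16 x 512

original_params = W.size                 # 512 * 512 = 262,144
compressed_params = A.size + B.size      # 16,384 -> ~94% fewer

# --- Quantisation: reduce numerical precision (float64 -> int8) ---
scale = np.abs(A).max() / 127.0
A_q = np.round(A / scale).astype(np.int8)   # stored in 8-bit integers
A_deq = A_q.astype(np.float64) * scale      # dequantised for use

# The factored, quantised layer still approximates W well.
W_approx = A_deq @ B
rel_error = np.linalg.norm(W - W_approx) / np.linalg.norm(W)
print(f"parameter reduction: {1 - compressed_params / original_params:.0%}")
print(f"relative reconstruction error: {rel_error:.4f}")
```

In a real LLM the factorisation is applied layer by layer, and only layers where the error stays acceptable are replaced, which is the "identifying layers suitable for reduction" step described above.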

At its core, the technology restructures the representation of neural networks to eliminate unnecessary parameters while preserving the network’s full functionality. The technique works by identifying and retaining only the most relevant correlations between data points.

The result is an AI model compact enough to run on devices previously excluded from AI deployment. By simplifying the internal architecture, compressed models also process queries faster (measured in tokens per second), leading to quicker user interaction, system responses, and results. Energy efficiency is also improved: as fewer operations are required per inference, energy demand can fall by up to 50 per cent, reducing operational costs. Finally, there is the decisive advantage of hardware independence. These ultra-compressed models can be deployed across a wide range of platforms, from large servers to edge devices, avoiding dependence on rare or expensive GPU clusters and internet connectivity.

While the theoretical foundations of tensor networks come from quantum mechanics, their application in AI is fully compatible with conventional digital infrastructure. In this way, ideas from quantum science directly benefit traditional computing environments.

The result is a much smaller AI model that performs just as well – and in some cases even better – than the original LLM.

From the cloud to the edge: localised AI models

Until now, a cloud-centred architecture has dominated the AI sector. But ultra-compressed AI models are fundamentally changing this paradigm. Being much smaller, more efficient, and more processor-friendly, they enable a shift towards local deployment models at the so-called edge. This approach is not only more practical but also opens up many new application possibilities.

Examples can be found across many industries. In vehicles, AI systems for navigation and safety can run directly on board, independent of cloud services that could fail in tunnels or remote areas. Consumer electronics and smart home devices can now offer AI features offline, greatly enhancing privacy and usability. In industrial automation, edge AI can monitor machines and optimise workflows without sending sensitive data externally – a particular advantage for regulated sectors such as life sciences or for locations without stable internet connections.

In healthcare, privacy is a core ethical requirement, and patient records are among the most sensitive data sets. Compression enables complex models to run on local devices or in secure, private clouds, keeping patient data within the organisation’s firewall.

Defence also benefits from compressed AI models. Modern military operations increasingly rely on real-time data analysis from drones, surveillance systems, and tactical decision aids. As these systems are often deployed in remote or hostile areas without stable cloud or internet connections, local AI solutions are essential.

Compressed AI models for more sustainable industrial processes

One of the most compelling validations of compressed AI models took place at a manufacturing plant in Europe. The goal was to reduce the size of the manufacturer’s existing AI model – used in the production of automotive components – without compromising performance.

Using advanced tensor network-based compression methods, the model size was significantly reduced. As a result, the compressed model delivered roughly twice the response speed, integrated more easily into existing plant systems, and cut the energy consumption for running the model by around 50 per cent. It enabled localised real-time decision-making – in robotics, quality control, or maintenance – without sending data to remote servers or relying on unstable internet access.

For manufacturing companies committed to lean production and environmental responsibility, these savings not only mean measurable cost reductions but also a faster route to smarter, more efficient production.

Compressed AI offers new opportunities for industry

From manufacturing to the operating theatre, compressed models enable faster insights, better energy efficiency, and greater data privacy for organisations without compromising accuracy.

AI is no longer defined by size but by ingenuity. Compressed AI represents a crucial evolution in the way we develop, deploy, and use machine learning models. This does not mean less performance – rather, it means an industry ready for both the present and the future.

Román Orús is co-founder and chief scientific officer at Multiverse Computing

