How AI search as a service is overcoming the unstructured data challenge

Data management startup Nuclia is helping organisations drive value from their data through its low-code AI search engine API

With 80 per cent of company data being unstructured, including text, images and video, getting the most value from rising amounts of these assets is proving a challenge across all business sectors. Businesses often run into the limits of keyword search, which fails to properly take context, formats or languages into account, leaving users with insufficient results.

To solve this challenge, Barcelona-headquartered data startup Nuclia is delivering an API that leverages what company CEO and co-founder Eudald Camprubi has named ‘AI search as a service’, capable of finding and indexing data across any source. An end-to-end solution, it can extract data from file repositories, audio, video, URLs and databases, split it into paragraphs, and present an index that shows exactly where any chosen piece of information is in the file. This is based on continuously trained language models, the creation of which owes much to data annotation.

Users can leverage NucliaDB for storage and indexing, with multi-cloud infrastructure also in place. The solution’s open source makeup provides flexibility around how customer data can be used compliantly and innovatively.


Nuclia presented its AI-powered low-code capabilities during the most recent edition of the IT Press Tour in Lisbon.

Founding and funding

Nuclia’s founding staff wrote their first lines of code in late 2018, following a pre-founding, consulting-focused stint for its leadership team in the US. The company has been operating fully remotely, and currently has a team of over 20 engineers on its payroll. The startup is backed by investors Elaia from France, and Crane in the UK, having secured seed funding totalling €5.5m.

Its customer base currently stands at around 20 businesses, all operating across Europe. While finding its footing in the search space, the company identified a gap in search capabilities: organisations were unable to get truly helpful results when using internal search environments such as SharePoint, showing that a solution was needed beyond keyword search.

A six-step process to AI search

With a digital skills gap continuing to affect the innovation efforts of organisations across all industries, Nuclia has opted for a low-code approach for its customers. Users can search in the same way they would on Google, or input a file or URL into Nuclia’s system.

This manifests itself through a six-step process adopted by the startup:

  1. Assets from any data source, including Google Drive, Amazon Drive and OneDrive, can be used to find and index data.
  2. Any kind of data, whether text, video or audio, can be scoured for information.
  3. Files are then transcribed if necessary, before text sorted into paragraphs can be extracted for the user.
  4. Insights are then presented around the context of any data.
  5. Data is vectorised, with the algorithm moving from a focus on a single value to handling multiple values at the same time, for optimisation going forward.
  6. Full text in paragraphs, as well as insights and vectors, is finally stored in NucliaDB, the company’s open source database.
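
The steps above can be illustrated with a minimal Python sketch. It is not Nuclia's API or NucliaDB: the names `Paragraph`, `split_paragraphs`, `vectorise` and `search`, the file names and the sample text are all hypothetical, and a simple word-count vector stands in for the trained language models the company describes. It only shows the general shape of splitting text into paragraphs, vectorising them, storing them with their location, and finding the closest match to a query.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass


@dataclass
class Paragraph:
    source: str      # hypothetical name of the originating file or URL
    position: int    # index of the paragraph within that source
    text: str
    vector: Counter  # toy word-count vector, standing in for a learned embedding


def vectorise(text):
    """Turn text into a bag-of-words vector (an assumption, not Nuclia's method)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def split_paragraphs(source, text):
    """Split text on blank lines and record where each paragraph sits."""
    blocks = [b.strip() for b in re.split(r"\n\s*\n", text) if b.strip()]
    return [Paragraph(source, i, b, vectorise(b)) for i, b in enumerate(blocks)]


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(index, query, top_k=1):
    """Return the stored paragraphs most similar to the query."""
    q = vectorise(query)
    ranked = sorted(index, key=lambda p: cosine(q, p.vector), reverse=True)
    return ranked[:top_k]


# An in-memory list plays the role of the database in this sketch.
index = []
index += split_paragraphs(
    "handbook.txt",
    "Holiday requests go through the HR portal.\n\n"
    "Expenses are reimbursed within thirty days.",
)
index += split_paragraphs(
    "transcript.txt",
    "Welcome to the onboarding video.\n\n"
    "Security training is mandatory every year.",
)

best = search(index, "when are expenses reimbursed")[0]
print(best.source, best.position, best.text)
```

Because each stored paragraph keeps its source and position, a match can point back to where the information sits in the original file. A production system would replace the word-count vectors with embeddings from a language model, which is what allows results to reflect context rather than exact keywords.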

Customer use cases

Use cases proving valuable for Nuclia’s small but burgeoning customer base across Europe include:

  • Customer service-focused employees are able to get to the bottom of queries more quickly, by sorting through archived conversation and behavioural data.
  • Unstructured data search for staff in educational institutions, including universities.
  • Data anonymisation for healthcare and pharmaceutical companies dealing with sensitive patient data, which enables compliance with GDPR.
  • Video training indexing capabilities for HR personnel in businesses, with companies starting to move away from text-based training methods and towards more interactive, multimedia means.

When it comes to pricing, Nuclia takes into account the database used by the customer, the amount of data that needs to be indexed, and the number of search queries required.


Aaron Hurst

Aaron Hurst is Information Age's senior reporter, providing news and features around the hottest trends across the tech industry.