The ins and outs of NoSQL data modelling

Think of a NoSQL data model as the visual blueprint for an application: the equivalent of the CAD printout an architect produces before building or renovating a house. In this analogy, the blueprint enables a discussion with the homeowner about how the rooms in the house will be laid out. It allows the architect to discuss the plan with the building contractor. It is the reference document for the bricklayer.

Without data modelling, the architect would only have words to describe the house design to the homeowner and to the bricklayer, which is obviously not productive, not collaborative, and quite risky. The same can be said for building out NoSQL applications without data modelling. Is it realistic to think that one can design an application with no structure, schema, or relationships?

With that in mind, it helps to recognise that the people involved in NoSQL data modelling vary. The typical users depend on the size of the development project, and data modelling for NoSQL is often viewed as best suited to larger organisations whose IT architecture and data governance departments work on complex projects and applications.

Data modelling tools are designed for use by functional analysts, designers, architects, and database administrators (DBAs). Naturally, only developers are comfortable looking at application code to figure out how data storage is organised, but other project stakeholders also need to understand the underlying data structure and this is possible with data modelling.

Data modelling is critical to understanding data, its interrelationships, and its rules. A data model is not just documentation, because it can be forward-engineered into a physical database.

In short, data modelling solves one of the biggest challenges when adopting NoSQL technology: harnessing the power and flexibility of dynamic schemas without falling into the traps that a lack of design structure can create for teams.

It eases the on-boarding of NoSQL databases and legitimises their adoption within the enterprise roadmap, corporate IT architecture, and organisational data governance requirements. More specifically, it allows us to define and marry all the various contexts, ontologies, taxonomies, relationships, graphs, and models into one overarching data model.

Benefits derived from data modelling

• Higher application quality – A data model is the equivalent of an architect’s blueprint, drawn up before construction starts. Data modelling is the visual expression of a development team’s understanding of the business and its rules.

The data modelling process is the most effective way to gather correct and complete business data requirements and business rules, so as to ensure that the system will operate in the intended manner.

The process generates more questions than any other modelling approach, leading to higher integrity and discovery of the relevant business rules. And its visual nature facilitates communication and collaboration between business users and subject matter experts.

• Quicker time to market – Thanks to proper data modelling, application developers don’t have to discover unknown requirements themselves, so they can focus on developing with fewer errors and on meeting their sprint commitments. This in turn leads to earlier delivery of high-quality, value-adding functionality, easier acceptance testing, and a quicker payback on development.

• Lower development and maintenance costs – Data modelling catches errors and inconsistencies early in the process, when they are easy and cheap to correct. Because the cost of fixing a bug grows steeply as a project progresses, it’s always better to evaluate and think through options early, rather than after the software has been written.

Even more so in an Agile development environment, development costs can be reduced significantly because a good data model will reveal upfront otherwise unknown or unanticipated requirements. And with NoSQL’s flexibility, the data model can rapidly evolve in an organised manner.

• Improved data quality – Data corruption and inaccurate data are even worse than application errors. A good data model defines the metadata so the data itself can be properly understood, queried, and reported on. To truly leverage the power and flexibility of NoSQL, it is still important to ensure the enforcement of domain definitions, field constraints, editing rules, and integrity of relationships.

This enforcement actually turns out to be more important with NoSQL, given that it is seldom possible at the database level and needs to be maintained in the application code. A data model provides developers with a roadmap and checklist for such enforcement.

• Data governance (GDPR & PII) – Companies around the world need to demonstrate compliance with privacy regulations on personally identifiable information. To do so, they need to document the proper handling of attributes concerned, and monitor daily that compliance is maintained. This monitoring becomes more of a challenge with Agile Development, self-managed teams, and dynamic schemas of NoSQL databases, but data modelling can rigorously manage this effort.

• Better performance – Data modelling provides DBAs with the means to understand the database and tune it for fast performance, without having to search through the code to discover the schema. Given the nature of NoSQL, the data modelling process outlines a method to start thinking in terms of queries and data representation, rather than in terms of storage.

• Business intelligence – What’s the use of possessing a great deal of data, only to have no efficient way – or no way at all – to use it? In other words: how can anyone effectively query Big Data if they do not know what is in it, or how it is structured? A good data model, built on query and reporting requirements, is a starting point for data mining, which can then spot trends and patterns and make predictions to help a business navigate challenges and opportunities.

• Documentation and knowledge transfer – Data modelling provides documentation to facilitate communication between business stakeholders and technical experts, using a common vocabulary and a business domain glossary. A data model is effective at expressing abstractions in a clear and succinct manner, and it serves as a training aid through staff turnover.

• Enhanced integration – When all corporate applications are data modelled, a meta repository can be created that provides a common vocabulary, identifies relationships and redundancies, and resolves discrepancies so that disparate systems are well integrated.
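
To make the "Improved data quality" point above more concrete, here is a minimal sketch of what application-level enforcement driven by a data model might look like. It uses only the Python standard library. The `CUSTOMER_MODEL` definition, its field names, and the `validate` helper are all hypothetical, invented purely for illustration; a real project would more likely generate such rules from its modelling tool or use an established validation library.

```python
# Hypothetical model for a "customer" document.
# Each field maps to (expected type, required?, constraint check).
CUSTOMER_MODEL = {
    "email": (str, True, lambda v: "@" in v),
    "age": (int, False, lambda v: 0 <= v <= 150),
    "country": (str, True, lambda v: len(v) == 2),
}


def validate(doc, model):
    """Return a list of human-readable violations (empty if the document conforms)."""
    errors = []
    for field, (expected_type, required, check) in model.items():
        if field not in doc:
            if required:
                errors.append(f"{field}: missing required field")
            continue
        value = doc[field]
        if not isinstance(value, expected_type):
            errors.append(f"{field}: expected {expected_type.__name__}")
        elif not check(value):
            errors.append(f"{field}: value {value!r} violates constraint")
    return errors


# A conforming document and a non-conforming one.
print(validate({"email": "a@example.com", "country": "GB"}, CUSTOMER_MODEL))
print(validate({"email": "nope", "age": 200}, CUSTOMER_MODEL))
```

The point of the sketch is that the rules live in one declarative place (the model) rather than being scattered through application code, which is what lets the model act as the checklist described above.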

What about ROI?

A discussion about NoSQL data modelling is not complete unless we take a look at Return on Investment (ROI). ROI is a widely used measure to compare the effectiveness of IT projects and investments.

The basic ROI calculation is to divide the net return from an investment by the cost of the investment, and express the result as a percentage.
The ROI formula for NoSQL data modelling is: ROI % = (Benefit due to data modelling – Cost of investment) / Cost of investment × 100.
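
As a quick worked example of this calculation, the sketch below computes ROI as a percentage. The function name `roi_percent` and the figures are hypothetical, chosen only to illustrate the arithmetic.

```python
def roi_percent(benefit, cost):
    """Return ROI as a percentage: (benefit - cost) / cost * 100."""
    if cost <= 0:
        raise ValueError("cost must be positive")
    return (benefit - cost) / cost * 100


# Hypothetical figures: 20,000 invested in data modelling,
# 50,000 of benefit attributed to it (for example, avoided rework).
print(f"ROI: {roi_percent(50_000, 20_000):.0f}%")  # prints "ROI: 150%"
```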

An alternative method is to calculate payback, or the length of time it takes for the cumulative gains from an investment to equal cumulative costs. In other words, how long it takes for an investment to pay for itself.
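
The payback idea can be sketched in a few lines. This simplified version assumes a single upfront cost and a list of gains per period (months, quarters, or years); the function name `payback_period` and the figures are hypothetical.

```python
from itertools import accumulate


def payback_period(cost, gains_per_period):
    """Return the first period (1-based) in which cumulative gains
    reach the cost, or None if they never do within the given periods."""
    for period, cumulative in enumerate(accumulate(gains_per_period), start=1):
        if cumulative >= cost:
            return period
    return None


# Hypothetical figures: 20,000 invested, followed by four periods of gains.
print(payback_period(20_000, [5_000, 7_000, 9_000, 9_000]))
```

With these figures the cumulative gains are 5,000, then 12,000, then 21,000, so the investment pays for itself in the third period.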

You can also use NPV (Net Present Value) to represent the return a project will make at a specified discount rate. Or you may calculate the IRR (Internal Rate of Return) to show the yearly return percentage of the investment.
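
For readers who want to see how these two measures relate, here is a minimal sketch using only the standard library. It assumes cash flows arrive at regular intervals, with the initial investment as a negative first entry, and finds the IRR by simple bisection, which is one of several possible numerical approaches. The function names and figures are hypothetical.

```python
def npv(rate, cash_flows):
    """Net present value of cash flows, where cash_flows[0] occurs now
    and cash_flows[t] occurs t periods from now."""
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(cash_flows))


def irr(cash_flows, low=-0.99, high=10.0, tol=1e-9):
    """Rate at which NPV is zero, found by bisection.
    Assumes NPV changes sign exactly once between `low` and `high`."""
    f_low = npv(low, cash_flows)
    if f_low * npv(high, cash_flows) > 0:
        raise ValueError("NPV does not change sign in the search interval")
    while high - low > tol:
        mid = (low + high) / 2
        f_mid = npv(mid, cash_flows)
        if f_low * f_mid <= 0:
            high = mid  # the root lies in [low, mid]
        else:
            low, f_low = mid, f_mid  # the root lies in [mid, high]
    return (low + high) / 2


# Hypothetical project: 20,000 invested now, then three yearly gains.
flows = [-20_000, 9_000, 9_000, 9_000]
print(f"NPV at 8%: {npv(0.08, flows):.2f}")
print(f"IRR: {irr(flows):.2%}")
```

A positive NPV at the chosen discount rate, or an IRR above that rate, both indicate that the investment returns more than the discount rate demands.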

In the end, no project or approach has an automatic right to approval or budget. Decisions to invest in an IT methodology or software have to compete with all other business needs and initiatives, so make sure to use arguments and evidence that demonstrate what is best suited to your company, business users, and technical staff.

 

Sourced by Pascal Desmarets, CEO of Hackolade

Nick Ismail

Nick Ismail is a former editor for Information Age (from 2018 to 2022) before moving on to become Global Head of Brand Journalism at HCLTech. He has a particular interest in smart technologies, AI and...