Hadoop in finance: big data in the pursuit of big bucks

Very few industries are as data-centric as banking and insurance. If one thing can be taken as certain in today’s rapidly changing financial world, it is that both the amount and the value of the data collected are constantly increasing.

Every interaction that a client or partner system has with a banking institution produces actionable data that has potential business value associated with it.

Since the 2008 crisis, regulators have required more data to be recorded and reported than ever before. As capital and liquidity reserve requirements have also increased, it is critical to know exactly how much capital needs to be reserved, based on current exposures.

Apache Hadoop, a platform for storing and analysing massive amounts of data, is gaining momentum as institutions try to keep up with market trends and run services more effectively. Its market is forecast to grow at a compound annual growth rate of 58%, surpassing $1 billion by 2020.


As data becomes digital, financial institutions can no longer ignore the importance of harnessing information. Major banks, insurers and capital markets firms are now able to leverage Hadoop for better decision making, fraud detection, new product development and more accurate forecasts.

This article looks at some of the opportunities Hadoop presents in the financial sector.

Distributed computing for smarter decision making

Hadoop is often used in financial services because of its power in both data processing and analysis. Banks, insurance companies and securities firms rely on it to store and process the huge amounts of data they accrue in the course of business, which would otherwise require prohibitively expensive hardware and software licences.

Large retail banks receive thousands of applications for checking and savings accounts every week. Bankers would normally consult third-party risk scoring services before opening an account or granting a loan.

Bankers can (and do) override do-not-open recommendations for applicants with poor banking histories. Many of these high-risk accounts are overdrawn and charged off because of mismanagement or fraud, costing banks millions of dollars in losses. Some of this cost is passed on to the customers who manage their accounts responsibly.

Applications built on Hadoop can store and analyse multiple data streams and help, for example, regional bank managers control new account risk in their branches.

They can match banker decisions with the risk information presented at the time of the decision, and thereby control risk by highlighting whether any individuals should be sanctioned, whether policies need updating, and whether patterns of fraud are identifiable.
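As a rough illustration of that matching step, the sketch below pairs each account decision with the risk recommendation that was on file when it was made, then counts overrides per banker. It is plain Python rather than a Hadoop job (in practice this logic would more likely run as a Hive or Spark query over the cluster), and every field name, identifier and value is a hypothetical assumption made for the example.

```python
from collections import defaultdict

# Hypothetical decision log: field names and values are illustrative assumptions.
decisions = [
    {"application_id": "A1", "banker": "b01", "opened": True},
    {"application_id": "A2", "banker": "b01", "opened": True},
    {"application_id": "A3", "banker": "b02", "opened": False},
    {"application_id": "A4", "banker": "b02", "opened": True},
]

# Hypothetical third-party recommendation presented at the time of each decision.
risk_at_decision = {
    "A1": "do_not_open",
    "A2": "open",
    "A3": "do_not_open",
    "A4": "do_not_open",
}


def find_overrides(decisions, risk_at_decision):
    """Return decisions where an account was opened despite a do-not-open recommendation."""
    return [
        d for d in decisions
        if d["opened"] and risk_at_decision.get(d["application_id"]) == "do_not_open"
    ]


def overrides_per_banker(overrides):
    """Count overrides by banker so that a manager can review outliers."""
    counts = defaultdict(int)
    for d in overrides:
        counts[d["banker"]] += 1
    return dict(counts)


overrides = find_overrides(decisions, risk_at_decision)
summary = overrides_per_banker(overrides)
print(summary)
```

A per-banker count like this is only a starting point: whether an override was justified still needs the later outcome of the account, which the same joined records could carry.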


Over time, the accumulated data informs algorithms that may detect subtle, high-risk behaviour patterns unseen by the bank’s risk analysts.

This is especially important to insurers, who seek to gather actionable intelligence on prospective clients quickly in order to make better decisions about risk, particularly in “sticky” financial relationships (like mortgages) which often last for more than 10 years.

Success here can mean millions in savings a year, which can then be passed on to low-risk clientele or company shareholders. Hadoop also supports predictive modelling, allowing enterprises to quantify future risk and develop market strategies accordingly.

This information can also then be turned into publicly reportable intelligence, increasing the perceived expertise (and thus brand value) of the enterprise engaged in data analysis.

Turning data into fraud protection

Not only does Hadoop help build a company’s prestige, it also protects the company from professional malfeasance that could harm it. Financial sector workers should study this carefully, since one of the main benefits of their data analysis efforts is protection against fraud.

Because of the massive storage capacity of data lakes, extensive records can be constantly collated and updated, including what decisions were made, what risks were present at the time of decision, how internal policies on the issue have changed over time, and whether there have been emerging patterns of fraud.

This also matters because maintaining control over data and understanding how to query it are set to become critically important for regulatory reporting and compliance. Banks also need to understand their existing customer data to predict and modify mortgages as appropriate for borrowers in financial distress.

Tracking money laundering

Following the establishment of international anti-money laundering (AML) requirements to prevent the cross-border flow of funds to criminal and terrorist organisations, malicious actors have chosen to hide in the ever more complex world of trading, away from banks.

This is a global business valued at more than $18.3 trillion, formed of an intricate web of different languages and legal systems.


Due to storage limitations, leading institutions have been unable to archive historical trading data logs, which reduces the amount of information available for risk analysis until after the close of business. This gap creates a window of time in which money laundering or rogue trading can go unnoticed.

Hadoop is now able to provide unprecedented speed-to-analytics and an extended data retention timeline, allowing more visibility into all trading activities for comprehensive scrutiny.

The trading risk group accesses this shared data lake to process more position, execution and balance data. They can run this analysis on data from the current workday, and the data remains highly available for at least five years, much longer than before.
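To make the idea of same-day analysis concrete, here is a minimal sketch of the map-and-reduce pattern that Hadoop popularised, applied to one day's executions: each record is mapped to a key and a signed quantity, the quantities are summed per key, and any net position above a limit is flagged. It runs in plain Python on a handful of records; the trader IDs, symbols, quantities and the position limit are all hypothetical assumptions, not details from any real trading desk.

```python
from collections import defaultdict

# Hypothetical executions for the current workday: (trader, symbol, signed quantity).
executions = [
    ("t1", "XYZ", 500),
    ("t1", "XYZ", -200),
    ("t2", "XYZ", 900),
    ("t2", "ABC", 400),
    ("t2", "XYZ", 300),
]

POSITION_LIMIT = 1000  # assumed limit per trader and symbol, for illustration only


def map_phase(records):
    """Emit one ((trader, symbol), quantity) pair per execution."""
    for trader, symbol, qty in records:
        yield (trader, symbol), qty


def reduce_phase(pairs):
    """Sum the quantities for each (trader, symbol) key to get net positions."""
    totals = defaultdict(int)
    for key, qty in pairs:
        totals[key] += qty
    return dict(totals)


net_positions = reduce_phase(map_phase(executions))

# Flag any net position whose absolute size exceeds the assumed limit.
breaches = {key: qty for key, qty in net_positions.items() if abs(qty) > POSITION_LIMIT}
print(breaches)
```

On a cluster the map and reduce steps would be distributed across many machines, which is what allows the same check to run over years of retained history rather than a single day.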

The bottom line

These benefits, ranging from faster data analysis to money-laundering protection, come from being able to act on hitherto undetectable evidence and customer patterns.

One of the great features of Hadoop is that it allows enterprises to store, analyse and share multiple data streams simultaneously, which is immensely useful in helping users of the technology detect anomalies in their company’s output.

A regional bank manager, for example, might have to control new account risk in their branches, the indicators of which can be extremely subtle and nigh imperceptible to the average employee.


Using Hadoop, this information can be collected, scanned for anomalies over various branches, and sent by an automated ticker plant in real time to the relevant decision maker, saving time and improving operational margins.
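A scan of that kind can be sketched very simply. The example below compares a per-branch charge-off rate against the median across all branches and flags any branch that exceeds a chosen multiple of it. The branch names, the rates and the two-times-median rule are hypothetical assumptions chosen for illustration; a production system would use whatever metric and threshold the bank’s risk team defines.

```python
from statistics import median

# Hypothetical charge-off rates per branch (fraction of new accounts charged off).
charge_off_rate = {
    "north": 0.021,
    "south": 0.019,
    "east": 0.020,
    "west": 0.022,
    "central": 0.058,
}

THRESHOLD_MULTIPLE = 2.0  # assumed rule: flag branches above twice the median rate


def flag_anomalous_branches(rates, multiple):
    """Return, in alphabetical order, branches whose rate exceeds multiple x median."""
    baseline = median(rates.values())
    return sorted(branch for branch, rate in rates.items() if rate > multiple * baseline)


flagged = flag_anomalous_branches(charge_off_rate, THRESHOLD_MULTIPLE)
print(flagged)
```

The median is used as the baseline because a single extreme branch barely moves it, whereas it would pull a simple average towards itself and make the outlier harder to spot.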

All the factors outlined above have real-world effects on the bottom line of modern institutions, and nowhere more so than in the financial industry. The fact that big data has been warmly welcomed by the financial community only makes the competitive disadvantage felt by tech laggards more pronounced. The point is that data matters more and more, just as it surrounds us more and more, so it is important that businesses accept this environment and learn to navigate it.


Sourced by Vamsi Chemitiganti, GM financial services, Hortonworks


Nick Ismail

Nick Ismail is a former editor for Information Age (from 2018 to 2022) before moving on to become Global Head of Brand Journalism at HCLTech. He has a particular interest in smart technologies, AI and...
