The U.S. Internal Revenue Service (IRS) has reported that the hack into its computer databases, which was revealed in May, was much more extensive than first thought.
The IRS said in late May the tax return information of about 114,000 U.S. taxpayers had been illegally accessed by cyber criminals over the preceding four months, with another 111,000 unsuccessful attempts made.
But last night, the Wall Street Journal reported that the attacks may have affected more than 300,000 taxpayer accounts – with another 600,000 suspected failed attempts by third parties to gain access to taxpayer data.
As Ken Westin, senior security analyst at Tripwire, explains, the news highlights the challenges of identifying the scope of a breach.
'This is a perfect example of how unrelated data breaches imperil us all,' says Westin. 'Cybercriminals have identified ways to correlate and aggregate data compromised in other breaches to increase their profits.'
It seems the attackers already had some information about their victims, which allowed them to log on in the first place. Since many taxpayers submit tax returns covering other people, such as spouses and dependents, each initial account breach then opened up avenues for identity theft and the exposure of personal information for malicious purposes.
As Dan Holden, director of the security and response team at Arbor Networks, explained back in May when the attack first came to light, this use of 'big data' aggregation is becoming increasingly common among cyber criminals:
'We are seeing a growing trend of cybercriminals becoming more advanced in the past year. There are several examples now where cyber criminals are using information from previous campaigns in new campaigns and this seems to be an example of that.'
'As we saw in Neverquest,' Holden continues, 'there is the primary use case of the campaign, and then the ability to steal additional information for a future or different campaign. It seems as though they’ve used information from a previous or another campaign – by either stealing or buying the information – in this very sophisticated campaign. Cybercrime is getting far more targeted as of late, and this is a great example of that.'
The information used, such as Social Security numbers, dates of birth, tax filing status (whether married or single) and street addresses, is the same type of information that was compromised in the breach at US health organisation Anthem and in a handful of other breaches.
In the case of the IRS attacks, the database itself was not compromised directly. Instead, the data was harvested through legitimate website forms, making it more difficult to distinguish fraudulent requests from legitimate ones.
'This attack highlights the fact that big data isn’t just something utilised by legitimate businesses but also by cyber criminals and fraudsters,' says Westin. 'The data used to perpetrate this attack was originally harvested from multiple sources, including open source data and data from other breaches.'
'In this case the criminals were able to quickly correlate disparate data sets to create complete profiles; once this was completed they then automated the IRS "Get Transcript" form submission to extract additional information that can then be used to file fraudulent tax returns.'