Firstly, let’s define what threat intelligence is in the context of cyber security. Threat intelligence is created by a process which takes raw data and information from a variety of sources and turns it into strategically, tactically or operationally valuable information.
The typical sources of this raw data and information include human-supplied information (human intelligence, HUMINT), Internet-published information (open-source intelligence, OSINT), network-traffic-derived information (signals intelligence, SIGINT), and technical artefacts (cyber intelligence, CYBINT, or cyber-specific technical intelligence, TECHINT).
The collected raw data and information is then categorised, analysed further, and given context and meaning, producing the required threat intelligence.
How is threat intelligence distributed?
Threat intelligence is provided in a variety of formats, depending on the audience. These formats include human-readable summary reports, machine-readable formats, and specific machine signatures.
The emerging de facto standard for machine-readable cyber threat intelligence is STIX (Structured Threat Information eXpression – a structured language for cyber threat intelligence).
Each STIX message may contain one or many cyber-observables (which are encoded in CybOX – a cyber-observable expression language). This machine-readable intelligence is increasingly shared, either within or between organisations, via TAXII-compatible machine-to-machine services.
STIX constructs on Intelligence, TTPs, and attribution
We will use the key STIX constructs when discussing what we can learn from malware analysis. The summary STIX architecture, which captures the constructs and their typical relationships, is shown below:
The acronym TTP above refers to adversary tactics, techniques, and procedures – what would be described in law enforcement terms as the adversary’s modus operandi. This acronym is important, as it is one of the outputs that are often used for attribution to a specific set of threat actors.
So what threat intelligence can we learn from malware analysis?
When considering malware analysis, it is first important to understand how malware samples are obtained, as this affects whether the analyst has sight of the original incident.
The most common ways for malware to be obtained are via cyber-incident response projects, honeypots, public sharing platforms such as VirusTotal, private or semi-private industry malware-sharing platforms, and private industry intelligence groups.
Organisations such as NCC Group typically obtain malware from all of these sources.
Things we can learn 1: observables
When analysing modern malware, one of the most common forms of intelligence obtained is what STIX would call an observable. This might be a DNS domain name, an IP address, an email address used by the threat actor to communicate with the malware, or a website URL used to propagate it.
The value and lifetime of different types of observable will vary depending on a number of factors.
Because many malware families use embedded configurations to control these parameters, tools such as NCC Group's MICE (Malware Inspection and Config Extraction) framework are a great help in extracting this type of information.
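As a rough illustration of the kinds of observable described above, the sketch below pulls candidate URLs, email addresses, IP addresses, and domain names out of text recovered from a sample (for example, a decoded configuration). It is a minimal sketch under stated assumptions, not how MICE or any other specific tool works: the function name `extract_observables`, the regular expressions, and the sample values are all hypothetical, and the patterns are deliberately simple, so expect false positives on real data.

```python
import re
from urllib.parse import urlsplit

# Deliberately simple, illustrative patterns (assumptions, not a standard).
URL_RE = re.compile(r"https?://[^\s\"'<>]+")
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")
DOMAIN_RE = re.compile(r"\b(?:[A-Za-z0-9-]+\.)+[A-Za-z]{2,}\b")


def _valid_ipv4(candidate):
    """Reject dotted quads with an octet above 255."""
    return all(int(octet) <= 255 for octet in candidate.split("."))


def extract_observables(text):
    """Return candidate observables found in text, grouped by kind."""
    urls = set(URL_RE.findall(text))
    emails = set(EMAIL_RE.findall(text))

    # Remove URLs before the domain search so that path components
    # such as "gate.php" are not mistaken for domain names.
    remainder = URL_RE.sub(" ", text)
    ips = {ip for ip in IPV4_RE.findall(remainder) if _valid_ipv4(ip)}
    domains = set(DOMAIN_RE.findall(remainder))

    # The host part of each URL is itself an observable.
    for url in urls:
        host = urlsplit(url).hostname
        if not host:
            continue
        if IPV4_RE.fullmatch(host):
            if _valid_ipv4(host):
                ips.add(host)
        else:
            domains.add(host)

    return {
        "url": sorted(urls),
        "email": sorted(emails),
        "ipv4": sorted(ips),
        "domain": sorted(domains),
    }


if __name__ == "__main__":
    # Hypothetical strings, using reserved documentation names and addresses.
    sample = ("beacon http://update.example.com/gate.php "
              "fallback 192.0.2.10 mail admin@example.org")
    for kind, values in extract_observables(sample).items():
        print(kind, values)
```

Grouping the results by kind keeps the output close to how observables are discussed here. In practice, each value would still need analyst review before being treated as intelligence.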
Things we can learn 2: indicators
Indicators are the observables we’ve seen in a specific pattern and given a specific context. These are used to facilitate detection with network threat sensors.
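To make the distinction between an observable and an indicator concrete, the sketch below models an indicator as a set of observables that must be seen together, plus analyst-supplied context and an optional expiry. This is a simplified illustration only: the class names, field names, and example values are hypothetical and do not follow the actual STIX or CybOX schemas.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import FrozenSet, Iterable, Optional


@dataclass(frozen=True)
class Observable:
    kind: str   # e.g. "domain", "ipv4", "url_path" (illustrative kinds)
    value: str


@dataclass(frozen=True)
class Indicator:
    title: str
    pattern: FrozenSet[Observable]          # observables that must all be present
    context: str                            # what the pattern means to an analyst
    valid_until: Optional[datetime] = None  # assumed expiry; None means no expiry

    def matches(self, seen: Iterable[Observable], when: datetime) -> bool:
        """True if every observable in the pattern was seen and the
        indicator has not expired at the time of the sighting."""
        if self.valid_until is not None and when > self.valid_until:
            return False
        return self.pattern <= set(seen)


if __name__ == "__main__":
    # Hypothetical indicator built from reserved documentation values.
    beacon = Indicator(
        title="Hypothetical implant beacon",
        pattern=frozenset({
            Observable("domain", "update.example.com"),
            Observable("url_path", "/gate.php"),
        }),
        context="Command-and-control check-in seen in analysed samples",
        valid_until=datetime(2015, 12, 31),
    )
    traffic = [
        Observable("domain", "update.example.com"),
        Observable("url_path", "/gate.php"),
        Observable("ipv4", "192.0.2.10"),
    ]
    print(beacon.matches(traffic, datetime(2015, 6, 1)))
```

A sensor applying this logic would only alert when the whole pattern appears within the indicator's validity window, which is what separates an indicator from a single observable sighting.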
Things we can learn 3: TTPs
Whether it is the first or the fiftieth time we have observed the behaviour of a threat actor, we gain insight into their tactics, techniques, and procedures.
As we see more activity and malware samples which can be attributed to the same actor, the picture becomes richer, especially as humans and groups are nothing if not creatures of habit. These habits might include things such as how their implant operates or obfuscation techniques they use which are unique to them.
Things we can learn 4: courses of action
Once we understand how the malware works, we are then in a position to advise preventative and reactive courses of action against that sample, potentially-related samples, and other samples used by the same threat actor.
These courses of action will typically include methods for detecting the sample at a network or host level, as NCC Group detailed in its Derusbi malware technical note earlier this year.
Why isn’t malware the oracle to all our intelligence requirements?
So while we’ve listed four key areas, we haven’t identified malware as being able to provide reliable information on incidents, exploit targets, campaigns, or threat actors.
While this information may become clear over time, after analysing many samples, it often cannot be found by malware analysis alone. Instead, further sources of data and information are often required to produce the necessary context and meaning.
There is a partial exception, however, in the case of targets – malware can sometimes, by revealing the type of information it is attempting to extract, provide information about a previously-unknown target.