Threat Hunting, but…¿Where and what? — Collection Management Framework

Apr 22, 2020

TL;DR

DFIR teams often have to perform threat hunting to proactively identify anomalous behavior in their own networks or in those of their customers. However, on many occasions we do not know where to search, or we do not even know which technologies we have available to collect this information.

A Collection Management Framework (CMF) can help us in these situations. It documents what information each data source gives us and which of these sources are available for hunting.

Objective

Before explaining what a CMF is, it is important to summarize what threat hunting really means, because the main job of the CMF is to support this work.

Threat hunting is the activity that aims to proactively discover malicious activity by an actor or group of actors, starting from previously generated hypotheses. Because it is a complex process, there are different maturity models for carrying it out.

Hunting Maturity Model (HMM). Source: Sqrrl

Threat hunting itself is not the objective of this post, so we will not go into more detail. More information about this activity and what each degree of maturity consists of can be found in the Sqrrl post listed in the references.

With threat hunting defined, organizations must be able to build a CMF covering the technology they have, one that the threat hunting team can use to identify threats on the network. Otherwise, discovering threats will be so exhausting that the team will not be able to meet the intelligence requirements demanded of it.

A CMF must be able to answer many of the questions that threat hunters ask themselves during their activity, such as:

Where do I get the data for hunting?
What data is available?
For how long is such data stored?
How can I consume the data?

Workflow

To carry out threat hunting we usually need to know what we are going to look for. To do this, we can generate hypotheses and work on them. However, in order to generate the hypotheses, we must know the threats that we face.

If we skip these steps, we end up searching for something we cannot define ourselves, or chasing generic threats that may pose no real risk to the company.

As an example, a possible flow to carry out threat hunting activities correctly with defined objectives would be the following:

First of all, it is important to know our adversaries and prioritize those who may have more activity and motivation in the sector in which the organization operates, since they pose more risk. This work is done by Cyber Threat Intelligence teams.

Something that can help us here is threat modeling, which lets us see in a structured way the groups that could potentially interact with us at some point. More information on threat modeling here.

Some of the information that we will obtain in the knowledge phase about the adversary could be the following:

TTPs

Tactics, techniques and procedures used by the actor or group of actors that we intend to hunt later.

Obtaining this information at a low level will help in the next two phases. It is therefore highly recommended to map the tactics and techniques to the MITRE ATT&CK knowledge base in any of its variants (Enterprise, Mobile and PRE). In addition, MITRE has just launched a beta that incorporates very comprehensive sub-techniques; more information about this here.

Tradecraft

When we talk about tradecraft we mean the techniques, methods and technologies used by an adversary during an intrusion. Although it may seem very similar to TTPs, tradecraft gives a different and more specific view, whereas TTPs partially abstract away the behavior.

An example of tradecraft could be when we talk about an adversary using spearphishing with an attached Word document and embedded macros. After the download and subsequent execution, it drops a RAT from the FlawedAmmyy family.

Expressed as TTPs, all of the above would stay at the level of MITRE tactics and techniques, without going into the detail of malware families.
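The difference between the two views can be sketched in a few lines of Python. This is only an illustration: the ATT&CK IDs use the current sub-technique numbering and are my own mapping of the example above, not one given by the author, and the `Observation` type and `ttp_view` helper are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    """One piece of tradecraft and the ATT&CK technique it abstracts to."""
    tradecraft: str     # specific behaviour, including tools and families
    tactic: str         # ATT&CK tactic name
    technique_id: str   # ATT&CK technique ID (assumed mapping)


# The spearphishing example from the text, step by step.
observations = [
    Observation("Spearphishing email with an attached Word document",
                "Initial Access", "T1566.001"),
    Observation("User opens the document and runs the embedded macros",
                "Execution", "T1204.002"),
    Observation("Macro downloads and drops a FlawedAmmyy RAT",
                "Command and Control", "T1105"),
]


def ttp_view(items):
    """Reduce tradecraft to (tactic, technique) pairs, dropping the detail."""
    return sorted({(o.tactic, o.technique_id) for o in items})


for tactic, technique_id in ttp_view(observations):
    print(tactic, technique_id)
```

The TTP view keeps the tactic and technique but no longer mentions Word, macros or FlawedAmmyy, which is exactly the detail that tradecraft preserves.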

Malware

These are all the malicious capabilities developed by the actor or, failing that, reused from existing ones. Knowing them is important, since it will later help in generating the scenario and in its execution by a red team. It is also possible to extract tactics and techniques from each piece of malware individually.

Software

As with malware, when we refer to software we are talking about capabilities that actors use during their intrusion, but ones that are not considered malware, since many of them are features built into today's operating systems (LOLBins). Examples of this category are wscript.exe and cmd.exe.
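Because these binaries are legitimate, hunting for them usually means looking at context rather than at the binary itself. A minimal sketch, assuming process-creation events are available as dictionaries (the field names, host names and the list of Office parent processes are hypothetical and not tied to any logging product):

```python
# Hypothetical process-creation events; field names are illustrative only.
events = [
    {"host": "ws-01", "parent": "explorer.exe", "image": "cmd.exe"},
    {"host": "ws-02", "parent": "WINWORD.EXE", "image": "wscript.exe"},
    {"host": "ws-03", "parent": "winword.exe", "image": "splwow64.exe"},
]

# Assumed set of Office applications that rarely need to start a shell.
OFFICE_PARENTS = {"winword.exe", "excel.exe", "powerpnt.exe"}
# The two LOLBins named in the text.
LOLBINS = {"wscript.exe", "cmd.exe"}


def suspicious_lolbin_launches(process_events):
    """Return events where an Office process starts one of the LOLBins."""
    hits = []
    for event in process_events:
        parent = event["parent"].lower()
        image = event["image"].lower()
        if parent in OFFICE_PARENTS and image in LOLBINS:
            hits.append(event)
    return hits


for hit in suspicious_lolbin_launches(events):
    print(hit["host"], hit["parent"], "->", hit["image"])
# prints: ws-02 WINWORD.EXE -> wscript.exe
```

cmd.exe started by explorer.exe is ignored here, since that is ordinary user activity; only the Office parent makes the launch worth a closer look.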

Generation of hypotheses based on the selected adversaries would be the next activity that should be carried out and, as in the previous case, it is the responsibility of the Cyber Threat Intelligence team.

Once you know who your adversaries are, what TTPs they use during their operations, what malware and/or software they use and what motivations and objectives they are following, you are able to create a scenario to run against your organization.

These scenarios can be very similar to campaigns that have been carried out in the past, but above all, they must be realistic so that the red team can emulate them in a satisfactory way.

All this information should be collected in a report that is then delivered to the red team, which will execute it in the next phase, the scenario execution.

Nowadays there are many tools that make scenario emulation and detection easier in an automated way. Both paid products like Cymulate and open source tools like Caldera and Mordor can help us here.

Any execution that is carried out will be under the responsibility of the red team, which is in charge of carrying out the operations and subsequently testing the threat hunting team for detection.

Lastly comes the hunting phase, where the team must identify malicious activity that has been carried out in the network. It does so with the support of a previously generated CMF, which tells the team where to obtain the information it is looking for and what type of information it is.

Hunting activities are closely linked to active defense, and there is a very significant factor here: humans. This means that the phase cannot be fully automated. However, certain things can help us, such as an alert generated by an IDS, which could lead to a subsequent hunting process.

Collection Management Framework (CMF)

At this point, it is time to start working on a CMF in order to make life easier for the threat hunting team. This work must be the result of collaboration between different departments such as telecommunications, IT, architecture, security and so on. Depending on the organization, teams will collaborate more or less in their day-to-day operations.

The main objective of working on the CMF is to have a list of all the technologies from which information can be obtained to carry out threat hunting and detect suspicious activity.

As mentioned before, it is important that the CMF can answer questions such as:

Where do I get the data for hunting?
What data is available?
For how long is such data stored?
How can I consume the data?

To do this, we first define what information we want to know about each data source available for hunting in the organization. The good thing is that we can adapt this to our needs, adding fields or removing those that do not interest us.

Type of data
Kill chain phase covered
Observable collection
Data storage time
Data storage location
Department owner
Data set
Category

Once these fields are defined (we will fill them in for each data source), it is time to identify the sources that could give us information during an intrusion, so that we can look for anomalies in the organization’s network. As an example for this post, we will focus on the following data sources:

Firewall
Windows systems
IDS
Web logs
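One way to hold these fields for each data source is a small record type. The sketch below is only an illustration: the field names follow the list above, but the `CmfEntry` class, the helper function and every value in the two sample entries (phases, observables, retention, owners) are assumptions made for the example, not the author's data.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CmfEntry:
    """One row of a CMF: what a data source offers and where it lives."""
    data_source: str
    data_type: str
    kill_chain_phases: List[str]
    observables: List[str]
    storage_days: int
    storage_location: str
    owner: str
    data_set: str
    category: str


# Hypothetical sample entries; all values are illustrative.
cmf = [
    CmfEntry("Firewall", "Connection logs",
             ["Delivery", "Command and Control"],
             ["source IP", "destination IP", "port"],
             90, "SIEM", "Telecommunications", "fw-traffic", "Network"),
    CmfEntry("Windows systems", "Event logs",
             ["Exploitation", "Installation"],
             ["process name", "user", "registry key"],
             30, "Local host", "IT", "win-events", "Endpoint"),
]


def sources_with_observable(entries, observable):
    """Answer 'where do I get this data?' for a given observable."""
    return [e.data_source for e in entries if observable in e.observables]


print(sources_with_observable(cmf, "destination IP"))
# prints: ['Firewall']
```

Keeping the CMF as structured records rather than free text makes it easy to query later by observable, phase or retention period.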

With the fields and the data sources defined, it is time to get down to work and make a simple table that can help us, like the following:

With just the information filled in in the table above, we can answer many questions in the event of an intrusion.

If a vulnerability in the web application has been exploited, we will only have the last 60 days of information to investigate. For anything that happened before that window, the threat hunting team will not be able to satisfy an intelligence requirement.

By contrast, if the adversary has intruded into the network and established some kind of communication with a C2, we could detect that behavior going back 90 days at best.
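The retention reasoning above is simple date arithmetic. A minimal sketch, using the 60 and 90 day figures from the text; attaching the 90 days to the firewall is my assumption, and the `can_investigate` helper and the dates are hypothetical:

```python
from datetime import date, timedelta

# Retention per data source, in days. The 60 and 90 day values come from
# the text; mapping 90 days to the firewall is an assumption.
RETENTION_DAYS = {"Web logs": 60, "Firewall": 90}


def can_investigate(source, event_day, today):
    """True if the event is still inside the source's retention window."""
    oldest_stored = today - timedelta(days=RETENTION_DAYS[source])
    return event_day >= oldest_stored


today = date(2020, 4, 22)

# Web exploitation 52 days ago: still covered by the 60-day window.
print(can_investigate("Web logs", date(2020, 3, 1), today))   # prints: True
# Web exploitation 81 days ago: outside the 60-day window.
print(can_investigate("Web logs", date(2020, 2, 1), today))   # prints: False
# C2 traffic 81 days ago: still inside the 90-day window.
print(can_investigate("Firewall", date(2020, 2, 1), today))   # prints: True
```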

The table can also tell us in which phases of the kill chain we have less visibility. In this example, given the data sources we have collected, if an endpoint were infected and the infection persisted, we would only have the Windows system logs to cover the “installation” phase, since the rest of the data sources do not cover this phase.
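This kind of visibility check can be derived mechanically from the CMF. In the sketch below, the phase names follow the Lockheed Martin kill chain, while the coverage assigned to each source is hypothetical, chosen only so that the "installation" phase matches the example in the text; the helper names are also mine.

```python
KILL_CHAIN = [
    "Reconnaissance", "Weaponization", "Delivery", "Exploitation",
    "Installation", "Command and Control", "Actions on Objectives",
]

# Hypothetical coverage per data source (illustrative values only).
coverage = {
    "Firewall": {"Delivery", "Command and Control"},
    "Windows systems": {"Exploitation", "Installation",
                        "Actions on Objectives"},
    "IDS": {"Reconnaissance", "Delivery", "Exploitation",
            "Command and Control"},
    "Web logs": {"Reconnaissance", "Delivery", "Exploitation"},
}


def sources_by_phase(cov):
    """Map each kill chain phase to the data sources that cover it."""
    return {
        phase: sorted(src for src, phases in cov.items() if phase in phases)
        for phase in KILL_CHAIN
    }


def weak_phases(cov, minimum=2):
    """Phases covered by fewer than `minimum` data sources."""
    return {
        phase: sources
        for phase, sources in sources_by_phase(cov).items()
        if len(sources) < minimum
    }


for phase, sources in weak_phases(coverage).items():
    print(phase, sources)
```

With these sample values the weak phases are Weaponization (no source at all), Installation and Actions on Objectives (Windows systems only), which is the kind of gap the CMF is meant to expose before an intrusion rather than during one.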

All the information we have discussed so far can be integrated into DeTT&CT, a platform developed by people at Rabobank.

As they put it, the objective of the tool is to help blue teams, using a scoring scheme based on ATT&CK, to assess the quality, visibility and detection that an organization has across its registered data sources. At this point, with the information from the CMF, we could start working with this tool.

To better understand how this application works, I recommend watching the presentation by Ismael Valenzuela, in which he gives practical examples of DeTT&CT.

LinkedIn: https://linkedin.com/in/joseluissm/

References

Threat Hunting and Hunting Maturity Models — https://medium.com/@sqrrldata/the-cyber-hunting-maturity-model-6d506faa8ad5

Adversary Emulation Plans — https://attack.mitre.org/resources/adversary-emulation-plans/

Collection Management Frameworks — Beyond Asset Inventories for Preparing for and Responding to Cyber Threats — https://dragos.com/resource/collection-management-frameworks-beyond-asset-inventories-for-preparing-for-and-responding-to-cyber-threats/