The origins of the Technique Inference Engine (TIE):

ipninichuck · 10-23-2024 07:03 AM

The origins of the Technique Inference Engine (TIE):

In a leaky basement with no windows, an alarm sounds. Defenders are given a set of observed adversary actions and must make a decision that will affect the direction of a security event investigation. The stakes are high: the wrong decision can either misdirect analysts or prove a costly mistake that gives the adversary room to maneuver further. This is not a SOC at a real company but a proof-of-concept game called Attack Poker. I presented the idea during a lightning talk at ATT&CKCon 2.0. Ever since working on that project, I have believed that only by analyzing chains of ATT&CK techniques can we begin to improve our ability to stop adversaries.

A small group of like-minded individuals discussed the future of using ATT&CK to make it easier to prioritize alert data at ATT&CKCon 4.0. These individuals included my colleague Andy Shepard and Ingrid Skoog, who was the head of R&D for the Center for Threat-Informed Defense (or Center) at the time. The discussion turned into a project proposal at the Center. Many Center members sponsored the project, and their input and data contributions further refined the idea. With the sponsors' support, Center researchers worked tirelessly on the project. This combined effort resulted in the publication of the Technique Inference Engine (TIE).

Building blocks of TIE:

The concepts that led to TIE were inspired by two previous Center research projects. The first was ATT&CK Flow, which provides a method to describe entire chains of techniques and their associated STIX objects. Flow shifted the emphasis away from isolated, atomic techniques and moved us closer to analyzing sequential adversary behavior. The second project was Top ATT&CK Techniques. This project was the first attempt to identify choke points within the matrix that would limit adversary movements depending on the techniques chosen. The existence of choke points in the matrix is a key point to understand. It is often said that adversaries can move freely among the techniques, but this is not necessarily true. Depending on their goal, adversaries will choose certain techniques at a given point in time, which makes it possible to find the techniques most likely to be part of the path they must take to achieve that goal. These two projects were important steps towards formulating the approach taken with TIE.

Obstacles faced:

The largest obstacle to achieving the goals of the project was collecting the dataset used to train the ML model. There were two main difficulties to overcome. The first was the amount of training data needed. A major risk when developing a model is overfitting, which happens when the dataset is either too small or does not vary enough to accurately represent the correlations between data points. An overfit model no longer captures a true relationship with reality; it simply reinforces the data it was trained on and provides no useful information. Its findings will not help the analyst, as they mostly repeat the small amount of data available.

The second difficulty was the bias that might be brought into the model depending on the data chosen and, more importantly, on the variation within that data. For this reason no emulated data was used, as it might represent the TTPs that security teams typically test rather than the adversary behavior actually observed in the wild. Although such data would have increased both volume and variation, the bias could have severely skewed the results.

The dataset also had to be chosen to allow the largest volume of data from CTI reports to be included. An early decision was to use concurrent techniques without regard for order. Choosing concurrent rather than sequential analysis meant that more of the available data could be used in the model. Currently there is not enough sequential data to provide a good dataset for the models considered. The dataset collected is one of the largest collections of concurrent TTPs available and is a significant outcome of the project.
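
To make the concurrent-versus-sequential decision concrete, here is a minimal Python sketch. The report contents and the helpers `to_technique_set` and `cooccurrence_counts` are hypothetical and purely illustrative; they are not the TIE dataset or code. The sketch shows that once order is discarded, reports listing the same techniques in a different order collapse into the same observation, and all that remains to count is which techniques appear together.

```python
from collections import Counter
from itertools import combinations

# Hypothetical CTI reports: each is a list of ATT&CK technique IDs.
# Illustrative values only, not taken from the TIE dataset.
reports = [
    ["T1566", "T1059", "T1105"],
    ["T1059", "T1566", "T1105"],  # same techniques, different order
    ["T1566", "T1059", "T1059"],  # repeated mention of one technique
]


def to_technique_set(report):
    """Drop order and duplicates, keeping only which techniques co-occur."""
    return frozenset(report)


def cooccurrence_counts(technique_sets):
    """Count how many reports contain each unordered pair of techniques."""
    counts = Counter()
    for techniques in technique_sets:
        # Sorting gives every pair a single canonical (a, b) key.
        for pair in combinations(sorted(techniques), 2):
            counts[pair] += 1
    return counts


technique_sets = [to_technique_set(r) for r in reports]
pair_counts = cooccurrence_counts(technique_sets)
```

Here the first two reports become the same set, which is why unordered data is easier to collect in volume: a report only has to say which techniques were seen, not when.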

How to use TIE:

The first available method is the web application. To use it, choose the techniques that have been observed; the results are the most likely concurrent techniques, listed in descending order of likelihood. Also included are a number of convenient filters and ordering choices. The filters include Platform, Group, and Campaign. One of the most convenient features is the ability to group the techniques by tactic, which gives better context about the goal of the adversary. For advanced users, a Python dev notebook is available.
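
The workflow above (pick observed techniques, get a ranked list, group by tactic) can be sketched in a few lines of Python. This is not TIE's model or API: the scoring is a deliberately naive co-occurrence count swapped in for illustration, and `TRAINING_SETS`, `TACTIC`, `rank_candidates`, and `group_by_tactic` are all hypothetical names with made-up data.

```python
from collections import defaultdict

# Hypothetical unordered technique sets, standing in for CTI reports.
TRAINING_SETS = [
    {"T1566", "T1059", "T1105"},
    {"T1566", "T1059", "T1547"},
    {"T1059", "T1105"},
    {"T1078", "T1021"},
]

# Simplified assumption: one tactic per technique. In ATT&CK a technique
# can belong to several tactics.
TACTIC = {
    "T1566": "initial-access",
    "T1059": "execution",
    "T1105": "command-and-control",
    "T1547": "persistence",
    "T1078": "defense-evasion",
    "T1021": "lateral-movement",
}


def rank_candidates(observed, training_sets):
    """Rank unobserved techniques by a naive co-occurrence score."""
    observed = set(observed)
    scores = defaultdict(int)
    for techniques in training_sets:
        overlap = len(observed & techniques)
        if overlap == 0:
            continue  # this report shares nothing with the observations
        for candidate in techniques - observed:
            scores[candidate] += overlap
    # Descending score, with technique ID as a deterministic tie-breaker.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


def group_by_tactic(ranked, tactic_of):
    """Group ranked (technique, score) pairs by tactic, keeping rank order."""
    grouped = defaultdict(list)
    for technique, score in ranked:
        grouped[tactic_of.get(technique, "unknown")].append((technique, score))
    return dict(grouped)


ranked = rank_candidates({"T1566", "T1059"}, TRAINING_SETS)
by_tactic = group_by_tactic(ranked, TACTIC)
```

With these toy inputs, `ranked` puts `T1105` ahead of `T1547`, and `by_tactic` places them under command-and-control and persistence respectively, which mirrors how the grouped view hints at what the adversary may be trying to do next.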

Use Cases:

Improved Incident Investigations: TIE can help investigators quickly identify and analyze techniques used in an incident, even those initially missed.
Enhanced Intelligence Analysis: TIE enables analysts to generate more comprehensive and insightful reports by connecting disparate pieces of information.
Real-time Investigative Guidance: A REST API leveraging TIE's Python implementation can provide analysts with real-time guidance and direction for their investigations.
Advanced Detection Engineering: TIE assists detection engineers in creating more effective, multi-event rule sets with fewer false positives and improved true positive identification.
Use Case Effectiveness Measurement: TIE provides a novel way to measure the effectiveness of Use Cases by assessing their ability to detect chains of techniques, going beyond traditional 1-1 ATT&CK technique mapping.
Improved ATT&CK Operationalization: By enabling more granular and realistic assessments of Use Case effectiveness, TIE significantly enhances the practical application and operationalization of the ATT&CK framework.
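
As a rough illustration of the effectiveness-measurement use case, the sketch below scores a set of detection Use Cases against a whole group of related techniques rather than one technique at a time. Everything here is an assumption made for the example: the `chain_coverage` function, the Use Case names, and the technique set (imagined as observed techniques plus ones an inference step suggested) are hypothetical and not part of TIE.

```python
def chain_coverage(technique_set, detections):
    """Return (coverage fraction, sorted uncovered techniques).

    `detections` maps a Use Case name to the set of technique IDs it detects.
    """
    techniques = set(technique_set)
    if not techniques:
        return 0.0, []
    # Union of every technique that at least one Use Case can detect.
    covered = set().union(*detections.values())
    uncovered = sorted(techniques - covered)
    return len(techniques & covered) / len(techniques), uncovered


# Hypothetical Use Cases and the techniques each one detects.
detections = {
    "uc_phishing_attachment": {"T1566"},
    "uc_script_execution": {"T1059"},
}

# Hypothetical set: two observed techniques plus two inferred ones.
chain = {"T1566", "T1059", "T1105", "T1547"}

coverage, gaps = chain_coverage(chain, detections)
```

A 1-1 mapping would report both Use Cases as "covering" their technique; scored against the whole set, the same Use Cases cover only half of it, and `gaps` names the techniques that still need detections.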

Closing Thoughts:

I'm most excited about the future direction of this research. With TIE complete, the project will lead to further research and innovative ways to analyze technique chains. My goals include exploring the use of ATT&CK Flow and LLMs to capture these sequential chains. ATT&CK is often called a common language, and ATT&CK Flow provides the perfect structure to describe sequential technique analysis within that language. The challenge lies in building a comprehensive database, but LLMs could help by automatically generating ATT&CK Flows from observed adversary behavior. I believe that as the community uses TIE, we'll see both novel applications and a clearer picture of its future potential.
