Bad guys hide their malicious commands using sneaky tricks – these obfuscation techniques are amazingly effective at bypassing modern defenses. A few common techniques are:
I have an idea for detecting some of these obfuscation techniques in Google SecOps: count the unusual substrings that show up in obfuscated commands. I'm going to explore it in this blog post.
The YARA-L strings.count_substrings function (documentation) will be doing the heavy lifting in this experiment – for more details, check out the Getting to Know Google SecOps: String Function: Counting Substrings blog and video.
These samples weren't picked because they're anything special – they just happened to have examples of the kind of obfuscation we often see.
https://www.virustotal.com/gui/file/72bc80e835674a1578a279db4ce4613a791474f276218f5544dbc4c850015ae2
How can we detect this kind of obfuscation? It's clear to a human that these commands look very different from normal commands. Can we count some of the unusual substrings we see in these examples, and alert when the sum exceeds some threshold?
We could go deep into data analysis to find the most distinctive features, but that's a post for another day! Let's test our hypothesis with just a few substrings that jump out to me: the caret (^), the [char] cast, the plus sign (+), and the \x escape prefix.
As a best practice to improve rule efficiency and minimize noise, let's match only on PROCESS_LAUNCH events and a few Windows commands commonly invoked with obfuscation.
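To get a feel for what that filter lets through, here is a rough offline approximation in Python. It is a sketch, not YARA-L: it assumes the rule's regex behaves as an unanchored, case-insensitive search, and the command lines and the `is_candidate` helper are made up for illustration.

```python
import re

# Same alternation as the rule's events section; re.IGNORECASE stands in for nocase.
CANDIDATE = re.compile(r"pwsh|powershell|cmd|bitsadmin|certutil", re.IGNORECASE)


def is_candidate(command_line: str) -> bool:
    """True when any of the listed names appears anywhere in the command line."""
    return CANDIDATE.search(command_line) is not None


# Hypothetical command lines, invented for illustration.
print(is_candidate(r"C:\Windows\System32\CertUtil.exe -urlcache -f"))  # True
print(is_candidate("notepad.exe report.txt"))                          # False
print(is_candidate("notepad.exe cmdline-notes.txt"))                   # True: "cmd" matches inside a file name
```

The last case shows why the filter only narrows the candidate set. Under that assumption, a short token like "cmd" can match inside unrelated arguments, so the score still has to do the real work.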
In the outcome section we'll capture the command line itself, the count of each feature using strings.count_substrings, and then we'll add them up into a single $sus_score that we'll use in the condition.
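The scoring itself is easy to try outside of SecOps. The Python sketch below mirrors the outcome logic under the assumption that Python's `str.count` (non-overlapping matches) is a fair stand-in for strings.count_substrings. The `score_command` helper and both sample command lines are hypothetical.

```python
# Offline approximation of the rule's outcome section (Python, not YARA-L).
FEATURES = ["^", "[char]", "+", "\\x"]  # the four substrings the rule counts


def score_command(command_line: str):
    """Return (per-feature counts, summed score) for one command line."""
    lowered = command_line.lower()  # mirrors strings.to_lower in the rule
    counts = {feature: lowered.count(feature) for feature in FEATURES}
    return counts, sum(counts.values())


# Hypothetical command lines, invented for illustration only.
obfuscated = 'cmd /c p^o^w^e^r^s^h^e^l^l -c "[Char]72+[Char]105+[Char]33"'
benign = "powershell -NoProfile -File backup.ps1"

print(score_command(obfuscated))  # ({'^': 9, '[char]': 3, '+': 2, '\\x': 0}, 14)
print(score_command(benign))      # ({'^': 0, '[char]': 0, '+': 0, '\\x': 0}, 0)
```

Lowercasing first is what lets "[Char]" and "[CHAR]" both count toward the same feature.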
To make sure the rule works, I'll start with a very low threshold for $sus_score.
In my environment this matches thousands of events even over a tiny time range, which is exactly what we want. Now we have confidence that the rule logic works, and we can begin tuning it.
After that initial test, it's clear that we need to increase the $sus_score threshold. The next step is to raise it enough to exclude false positives and run a retrohunt with the rule. We can look at the $sus_score values in the rule outcomes to get an idea of what is normal in our environment, then increase the threshold and run again. Once we get to a point where there are few or no false positives, the rule is ready for production.
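If you export the $sus_score values from a retrohunt, one way to pick a starting point for the next iteration is to look at a high percentile of the scores. This is my own shortcut rather than anything built into SecOps, and it does not replace reviewing the matches by hand. The `suggest_threshold` helper and the score list are hypothetical.

```python
import math


def suggest_threshold(scores, percentile=99):
    """Nearest-rank percentile: the smallest observed score that at least
    `percentile` percent of the observed scores do not exceed."""
    if not scores:
        raise ValueError("need at least one score")
    ordered = sorted(scores)
    rank = math.ceil(percentile * len(ordered) / 100)  # 1-indexed rank
    return ordered[max(rank, 1) - 1]


# Hypothetical $sus_score values from a retrohunt, invented for illustration.
observed = [0] * 60 + [1] * 25 + [2] * 10 + [4] * 3 + [7] + [19]

print(suggest_threshold(observed, 95))  # 2
print(suggest_threshold(observed, 99))  # 7
```

With these made-up numbers, a condition of `$sus_score > 7` would leave only the single outlier scoring 19, which is the kind of event worth a closer look.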
I was surprised at how well this simple rule worked in my environment, even using only the few features mentioned above. There's a ton of room for improvement! We could take another look at the distinctive features of obfuscated commands and add more ($Null might be a good one). We could also use a data-driven approach and write some code to identify the most distinctive features. For now, though, the results are pretty good.
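As a taste of what that data-driven approach might look like, here is a minimal Python sketch. It assumes you already have two labeled sets of command lines, and it ranks character n-grams by how much more often they appear per command in the obfuscated set. The function names and the tiny sample sets are hypothetical, and unlike strings.count_substrings this counts overlapping n-grams.

```python
from collections import Counter


def ngram_rates(commands, n=2):
    """Average occurrences of each lowercased character n-gram per command."""
    if not commands:
        return {}
    totals = Counter()
    for cmd in commands:
        lowered = cmd.lower()
        totals.update(lowered[i:i + n] for i in range(len(lowered) - n + 1))
    return {gram: count / len(commands) for gram, count in totals.items()}


def distinctive_ngrams(obfuscated, benign, n=2, top=5):
    """Rank n-grams by (rate in obfuscated set) minus (rate in benign set)."""
    obf = ngram_rates(obfuscated, n)
    ben = ngram_rates(benign, n)
    lift = {gram: rate - ben.get(gram, 0.0) for gram, rate in obf.items()}
    # Sort by descending lift, then alphabetically so ties are deterministic.
    return sorted(lift, key=lambda gram: (-lift[gram], gram))[:top]


# Tiny hypothetical sample sets, invented for illustration.
obfuscated_set = ["p^o^w^e^r", "c^m^d"]
benign_set = ["powershell", "cmd"]

print(distinctive_ngrams(obfuscated_set, benign_set, n=1, top=1))  # ['^']
```

Anything that floats to the top of a list like this on real data is a candidate for another strings.count_substrings outcome variable in the rule.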
This is just one example of using the strings.count_substrings function in Google SecOps to identify obfuscated commands. The approach can be applied to many other types of data, and there are many other functions and features implemented in YARA-L that allow some very cool analysis.
rule obfuscation_substrings {
  meta:
    rule_name = "Obfuscation Substrings"
    description = "Count substrings common in obfuscated commands and alert when the sum exceeds a threshold."
  events:
    $e.metadata.event_type = "PROCESS_LAUNCH"
    $e.principal.process.command_line = /pwsh|powershell|cmd|bitsadmin|certutil/ nocase
  outcome:
    $cmd = $e.principal.process.command_line
    $caret_count = strings.count_substrings(strings.to_lower($e.principal.process.command_line), "^")
    $bracket_char_count = strings.count_substrings(strings.to_lower($e.principal.process.command_line), "[char]")
    $plus_count = strings.count_substrings(strings.to_lower($e.principal.process.command_line), "+")
    $escaped_x_count = strings.count_substrings(strings.to_lower($e.principal.process.command_line), "\\x")
    $sus_score = $caret_count + $bracket_char_count + $plus_count + $escaped_x_count
  condition:
    $e and $sus_score > 7
}