Counting Characters to Find Obfuscated Commands

mosesschwartz

Bad guys hide their malicious commands using sneaky tricks – these obfuscation techniques are amazingly effective at bypassing modern defenses. A few common techniques are:

  • String concatenation, constructing commands by adding smaller parts together: "Hello"+" "+"World",
  • Representing characters in different ways, like escaped hex characters (\x42) and [char](123) syntax in PowerShell, and
  • Inserting caret (^) characters where Windows ignores them: "C:\Windows\system32\cmd.exe" /c "stArt /MIN "" P^oWErs^h^E^L^L.E^X^E -eNC.

I have an idea for detecting some of these obfuscation techniques in Google SecOps: count the unusual substrings that show up in obfuscated commands. I'm going to explore it in this blog post.

The YARA-L strings.count_substrings function (documentation) will be doing the heavy lifting in this experiment – for more details, check out the Getting to Know Google SecOps: String Function: Counting Substrings blog and video.

Examples of Obfuscation in the Wild

These samples weren't picked because they're anything special – they just happened to have examples of the kind of obfuscation we often see.

https://www.virustotal.com/gui/file/7779cb34f098db6f9f0433f4279c44fc59ff39bc52f57c421e437d6cfd776460...


https://www.virustotal.com/gui/file/053c49fa2eb75f2026ce9dd89e076b2be3418ba9af3fa80652d34d80454b9e50...


https://www.virustotal.com/gui/file/72bc80e835674a1578a279db4ce4613a791474f276218f5544dbc4c850015ae2


Detection Hypothesis

How can we detect this kind of obfuscation? It's clear to a human that these commands look very different from normal commands. Can we count some of the unusual substrings we see in these examples, and alert when the sum exceeds some threshold?

Feature Selection

We could go deep into data analysis to find the most distinctive features, but that's a post for another day! Let's test our hypothesis with just a few substrings that jump out to me:

  • ^ 
  • [char]
  • +
  • \x

Writing the Rule

As a best practice to improve rule efficiency and minimize noise, let's match only on PROCESS_LAUNCH events and a few Windows commands commonly invoked with obfuscation.

In the outcome section we'll capture the command line itself, the count of each feature using strings.count_substrings, and then we'll add them up into a single $sus_score that we'll use in the condition.


Testing the Rule

To make sure the rule works, I'll start with a very low threshold for $sus_score.


In my environment this matches thousands of events even over a tiny time range, which is exactly what we want. Now we have confidence that the rule logic works, and we can begin tuning it.

Tuning the Rule

After that initial test, it's clear that we need to increase the $sus_score threshold. The next step is to raise it to exclude false positives and run a retrohunt with the rule. We can look at the $sus_score values in the rule outcomes to get an idea of what is normal in our environment, then increase the threshold and run again. Once we get to a point where there are few or no false positives, the rule is ready for production.
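
If you export the $sus_score values from a retrohunt, a few lines of Python can show how many detections each candidate threshold would produce. This is a sketch under assumptions: the alerts_per_threshold helper is hypothetical, and the scores below are made up for illustration, not results from my environment.

```python
from typing import Dict, Iterable, List


def alerts_per_threshold(scores: List[int], candidates: Iterable[int]) -> Dict[int, int]:
    """For each candidate threshold, count the events whose score exceeds it."""
    return {t: sum(1 for s in scores if s > t) for t in candidates}


# Made-up scores standing in for exported rule outcomes.
observed_scores = [0, 0, 1, 1, 2, 2, 3, 4, 9, 31]
result = alerts_per_threshold(observed_scores, [1, 3, 7])
print(result)  # {1: 6, 3: 3, 7: 2}
```

Reviewing the events that remain at each step helps confirm that raising the threshold is dropping noise and not real obfuscation.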

Conclusion

I was surprised at how well this simple rule worked in my environment, even using only the few features mentioned above. There's a ton of room for improvement! We could take another look at the distinctive features of obfuscated commands and add more ($Null might be a good one). We could also use a data-driven approach and write some code to identify the most distinctive features. But for right now, this gives some pretty good results.
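
As a starting point for that data-driven idea, here's one possible sketch: compare how often each character appears in known-obfuscated commands versus benign ones, and rank characters by the difference. The function names and the tiny sample lists are hypothetical, and a real analysis would need far more data and would probably look at multi-character substrings too.

```python
from collections import Counter
from typing import Dict, List


def char_rates(commands: List[str]) -> Dict[str, float]:
    """Relative frequency of each lowercased character across all commands."""
    totals: Counter = Counter()
    for command in commands:
        totals.update(command.lower())
    total_chars = sum(totals.values()) or 1  # avoid dividing by zero on empty input
    return {ch: count / total_chars for ch, count in totals.items()}


def distinctive_chars(obfuscated: List[str], benign: List[str], top_n: int = 5) -> List[str]:
    """Characters that are most over-represented in the obfuscated set."""
    obf_rates = char_rates(obfuscated)
    ben_rates = char_rates(benign)
    diffs = {ch: rate - ben_rates.get(ch, 0.0) for ch, rate in obf_rates.items()}
    return sorted(diffs, key=diffs.get, reverse=True)[:top_n]


# Tiny made-up samples, for illustration only.
top = distinctive_chars(["p^o^w^e^r"], ["power"], top_n=1)
print(top)  # ['^']
```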

This is just one example of using the strings.count_substrings function in Google SecOps to identify obfuscated commands. The approach can be applied to many other types of data, and there are many other functions and features implemented in YARA-L that allow some very cool analysis.

Full Rule Text

 

rule obfuscation_substrings {
  meta:
    rule_name = "Obfuscation Substrings"
    description = "Count substrings common in obfuscated commands and alert when the sum exceeds a threshold."

  events:
    // Only look at process launches of commands commonly invoked with obfuscation.
    $e.metadata.event_type = "PROCESS_LAUNCH"
    $e.principal.process.command_line = /pwsh|powershell|cmd|bitsadmin|certutil/ nocase

  outcome:
    $cmd = $e.principal.process.command_line
    // Lowercase the command line so "[char]" and "\x" match regardless of case.
    $caret_count = strings.count_substrings(strings.to_lower($e.principal.process.command_line), "^")
    $bracket_char_count = strings.count_substrings(strings.to_lower($e.principal.process.command_line), "[char]")
    $plus_count = strings.count_substrings(strings.to_lower($e.principal.process.command_line), "+")
    $escaped_x_count = strings.count_substrings(strings.to_lower($e.principal.process.command_line), "\\x")
    // Add the per-feature counts up into a single score.
    $sus_score = $caret_count + $bracket_char_count + $plus_count + $escaped_x_count

  condition:
    // Tune this threshold to what is normal in your environment.
    $e and $sus_score > 7
}