New to Google SecOps: Top Ten YARA-L Rules Troubleshooting Tips

jstoner · 12-18-2024 08:00 AM

I’ve been asked a few times in the past month for tips that I use to troubleshoot YARA-L rules. As I thought about it, I realized this covers a lot of ground because, when building detection logic, we need to account for the UDM fields, the values within those fields and the structure of the rule to be successful. Based on this, I’m fairly confident I won’t cover everything in a single blog, but I’ve put together a “Top 10” list of items to leverage as you write and refine your YARA-L rules. These are ordered as a flow, from event generation to basic rule creation to refining and enhancing a rule. This also will not be my final attempt at writing a troubleshooting tips blog, so hopefully this is a good first pass that will help break down roadblocks you encounter as you build rules.

#1 - Simulate what you are trying to detect

Having data that accurately represents what you are trying to detect makes building rules much simpler. While it may require some additional setup and planning, being able to identify a set of events in your data set and focus your rule building on those events can simplify your troubleshooting because you already have the answer in your data set; you just need the logic to align with the events! This may not always be possible, but try to simulate the condition you are trying to detect.

#2 - Explore your data using search

Once the data is generated, or if you know the data is in your data set, find it! Use the UDM search interface to view the data, discover the fields in the UDM schema that are important to your detection, and take note of which of these fields are enriched fields that leverage asset or user entity context or even geo-ip enrichment. Use the UDM field viewer to select and copy these UDM fields, or take the UDM query you created, and paste them into the rules editor to start populating the events section of your rule.
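
As a rough sketch of that hand-off, assuming a search for the LSASS process-open events used later in this blog, a UDM search might look something like this (the field values are illustrative, not taken from a real tenant):

metadata.event_type = "PROCESS_OPEN" AND target.process.file.full_path = "c:\\windows\\system32\\lsass.exe" nocase

Pasted into the events section of a rule, each field gets an event variable prefix ($process is just a name chosen for this example) and each line is implicitly ANDed with the next:

 events:
   // Same criteria as the search, now bound to the $process event variable
   $process.metadata.event_type = "PROCESS_OPEN"
   $process.target.process.file.full_path = "c:\\windows\\system32\\lsass.exe" nocase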

#3 - Make sure the required YARA-L sections are defined

All rules will have meta, events and condition sections. If events are being aggregated or joined to other data sets, like the entity graph, a match section is also required. We’ve covered some of this a while back in earlier blogs, but it’s worth repeating. The rule will not compile without these sections, so keep an eye on the top left corner of the rule editor for the green check box.
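
A minimal skeleton showing where each section sits might look like the following. This is only a sketch: the rule name, meta values, event variable and 10 minute window are placeholders chosen for illustration, not values from a production rule.

rule required_sections_sketch {
 meta:
   // Free-form key/value pairs describing the rule (placeholder values)
   author = "example"
   description = "Skeleton showing the YARA-L sections"
   severity = "Low"

 events:
   // Criteria that events must meet; each line is implicitly ANDed
   $login.metadata.event_type = "USER_LOGIN"
   $login.target.user.userid = $user

 match:
   // Needed because the events are being aggregated by $user
   $user over 10m

 condition:
   // At least one event bound to $login must exist
   $login
}

For a single event rule with no aggregation or join, the match section can be dropped and the rest stays the same.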

#4 - Using the Test Rule Function

With the event(s) simulated and the events section populated, use one of the best features in the rule editor - the test rule function. Test the rule you’ve just written on the time range of the simulation you ran and see if the rule fires. Did you know you can test a rule without saving it? Just update your rule (add or remove criteria) and test. The rule test window supports a time range up to 14 days. Because you can test rules against any data in a tenant, it’s handy if you have a set of older events that you want to go back and validate a new rule against. The other lovely thing about the test rule function is that even if the rule is set to alerting, the test rule will not generate an alert.

#5 - Handling Case Sensitivity

Dealing with capitalization can be a pain anytime we build content, like rules, and when dealing with Windows systems, this can be extra challenging. The nocase modifier and the functions strings.to_lower and strings.to_upper are important to be mindful of when troubleshooting your rule. Might we have variability in the command line or file path (or other fields)? If so, we may want to consider using nocase like this:

$process.target.process.file.full_path = "c:\\windows\\system32\\lsass.exe" nocase

If you would prefer strings.to_lower, this works as well:

strings.to_lower($process.target.process.file.full_path) = "c:\\windows\\system32\\lsass.exe"

#6 - Escaping Backslashes

If the values being evaluated contain backslashes, like those separating folders and files, make sure they are being escaped with an additional backslash. This applies to both strings and regular expressions. If a user enters a command or file path with single backslashes, like c:\windows\system32\lsass.exe, they’ll be disappointed when no results are found. Pay attention to those escape characters, as in the rule below where every backslash in the path is doubled.

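rule mimikatz_behavior_serkurlsa {
 meta:
   // meta key/value pairs (author, description, severity, etc.) omitted in this example

 events:
   // A process opening a handle to another process
   $process.metadata.event_type = "PROCESS_OPEN"
   // Each backslash in the path is escaped with a second backslash
   $process.target.process.file.full_path = "C:\\Windows\\system32\\lsass.exe"
   // The access mask requested on the target process
   $process.target.resource.name = "0x1010"
   $process.target.resource.resource_subtype = "GrantedAccess"

 condition:
   $process
}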

Pro tip: When using the Copy UDM button in the UI and pasting the output into the rule editor, the extra backslashes are automatically populated even though they are not displayed in the event viewer.
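
Since the same escaping applies to regular expressions, here is a hedged sketch of the path match written as a regex instead of an exact string. The pattern is one I put together for illustration; adjust the anchoring to fit your data.

 events:
   $process.metadata.event_type = "PROCESS_OPEN"
   // Slash-delimited regex: \\ matches one literal backslash, \. matches a literal dot
   $process.target.process.file.full_path = /\\windows\\system32\\lsass\.exe$/ nocase

The same pattern can be passed to re.regex. Using back quotes around the pattern means the characters are taken literally, so the regex escapes do not need to be escaped a second time:

   re.regex($process.target.process.file.full_path, `\\windows\\system32\\lsass\.exe$`) nocase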

#7 - Grouping logic with parentheses

Parentheses are great for handling a mix of logic in a single rule. However, placement of the parentheses can shift that logic and cause rules not to trigger. The example event section here is based on an example from the Sigma rule set. Without dissecting the Sigma rule, at a high level we are expecting ((A or (B and C)) and ((D and E) or (F and G and H))) for the rule to trigger.

Here is a breakout of the parentheses and grouping in YARA-L with this complex logical expression. As you build out rules with logic that mixes AND/OR/NOT, laying out the logical expression ahead of time can make it easier to visualize the rule you are attempting to build. Just remember that within parentheses AND is not assumed, so AND and OR must be used explicitly to separate portions of the larger expression.

events:
   $registry.metadata.event_type = "REGISTRY_MODIFICATION"
   (
       strings.contains($registry.target.registry.registry_key, "\\System\\CurrentControlSet\\Services") or
       (
           strings.contains($registry.target.registry.registry_key, "\\System\\ControlSet") and
           strings.contains($registry.target.registry.registry_key, "\\Services")
       )
   )
   (
       (
           strings.contains($registry.target.registry.registry_value_data, "ADMIN$") and
           strings.contains($registry.target.registry.registry_value_data, ".exe")
       )
       or
       (
           strings.contains($registry.target.registry.registry_value_data, "%COMSPEC%") and
           strings.contains($registry.target.registry.registry_value_data, "start") and
           strings.contains($registry.target.registry.registry_value_data, "powershell")
       )
   )

#8 - Commenting out lines

When building logic like we just discussed, a good technique to troubleshoot portions of a logical expression is to use the double forward slash // to comment out a line. Alternatively, /* and */ will comment out a multiline portion of the rule. Using the above example, ((A or (B and C)) and ((D and E) or (F and G and H))), perhaps we want to just make sure detections are firing using the rule test with a less complex expression first, like (A or (B and C)). We could comment out the second half of our expression using /* and */ until we are satisfied with the results from the first part of our rule and then incrementally add additional logic to our rule. This commenting can be incredibly useful when trying to isolate a specific problematic line of criteria and can be used not just in the events section, but anywhere in the rule.
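
Applied to the registry example from tip #7, that approach could look roughly like this, with (A or (B and C)) left active and the second group wrapped in a block comment:

 events:
   $registry.metadata.event_type = "REGISTRY_MODIFICATION"
   (
       strings.contains($registry.target.registry.registry_key, "\\System\\CurrentControlSet\\Services") or
       (
           strings.contains($registry.target.registry.registry_key, "\\System\\ControlSet") and
           strings.contains($registry.target.registry.registry_key, "\\Services")
       )
   )
   /* Temporarily disabled while validating the registry key logic
   (
       (
           strings.contains($registry.target.registry.registry_value_data, "ADMIN$") and
           strings.contains($registry.target.registry.registry_value_data, ".exe")
       )
       or
       (
           strings.contains($registry.target.registry.registry_value_data, "%COMSPEC%") and
           strings.contains($registry.target.registry.registry_value_data, "start") and
           strings.contains($registry.target.registry.registry_value_data, "powershell")
       )
   )
   */

Once the first group returns the expected detections in the rule test, remove the /* and */ (or move them inward to re-enable one branch at a time) and test again.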

#9 - Outcomes to view values

The outcome section is a great place to perform troubleshooting because it is responsible for outputting values that can be used in other platforms when a rule triggers. These outcome variables are available for viewing in the test rule results by clicking the columns button and adding them to the view. In the example below, we want to manipulate the field that contains the command line and perform a regular expression capture and base64 decode. If we leave those expressions in the events section only, we won’t be able to view the transformed values. Adding outcome variables provides a method to view the values being manipulated in the rule and makes it easy to tune the expressions in the events section and re-test. Once we are happy with the results, we can remove these variables from the outcome section if we no longer need them.
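
A sketch of that idea is shown here. The rule name, the regular expression and the outcome variable names are my own assumptions for illustration, not a tested detection; re.capture and strings.base64_decode are the YARA-L functions doing the work.

rule encoded_command_outcome_sketch {
 meta:
   description = "Sketch: surface captured and decoded values as outcome variables"

 events:
   $process.metadata.event_type = "PROCESS_LAUNCH"
   // Hypothetical filter: command lines that carry an -enc argument
   re.regex($process.target.process.command_line, `-enc\s+\S+`) nocase

 outcome:
   // The raw value pulled out by the capture group
   $captured_value = re.capture($process.target.process.command_line, `-enc\s+(\S+)`)
   // The same value after base64 decoding, viewable as a column in the test results
   $decoded_value = strings.base64_decode(re.capture($process.target.process.command_line, `-enc\s+(\S+)`))

 condition:
   $process
}

If $captured_value comes back empty or truncated in the test results, the regular expression is the thing to tune; if it looks right but $decoded_value does not, the problem is further along in the chain.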

#10 - Adjusting our threshold in the Condition section

If your rule requires N events to have taken place before it triggers, or N unique users or assets to be associated with it, chances are the condition section will require more than just $login or whatever your event variable is. I saved this one for last because tuning the threshold will often use commenting and outcome variables to troubleshoot the output. Conditions are the last item evaluated in the rule, so we can include calculations from the outcome section. In the example of a password spray, we can output our computed counts of distinct users and total user logins to the outcome section and view them in the test rule output. We can then use the commenting feature in the condition section to adjust the output in our detection as we determine if we want our rule to use one or more computations for this threshold we are looking to set.
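
Here is a hedged sketch of what that could look like for a password spray. The field choices, the 15 minute window and the thresholds of 10 and 50 are placeholders to tune against your own data, not recommended values.

rule password_spray_threshold_sketch {
 meta:
   description = "Sketch: tune thresholds using outcome variables and comments"

 events:
   $login.metadata.event_type = "USER_LOGIN"
   $login.security_result.action = "BLOCK"
   $login.principal.ip = $src_ip

 match:
   // Aggregate failed logins by source IP
   $src_ip over 15m

 outcome:
   // Both counts appear as columns in the test rule results
   $distinct_user_count = count_distinct($login.target.user.userid)
   $total_login_count = count($login.metadata.id)

 condition:
   // Start with one threshold; the second is commented out until needed
   $login and $distinct_user_count > 10 // and $total_login_count > 50
}

Running the rule test with only the first threshold active shows the values both counts actually take in your data, which makes it easier to decide whether the second computation belongs in the condition at all.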

I hope these ten tips were handy and will assist you as you continue to develop your own rules. I’ll try to keep an eye out for more questions around troubleshooting that can be added to future blogs like this. Don’t forget to engage with us at secopscommunity.com for more tips, blogs and videos on rules, searches and much more in Google SecOps!
