Detection Rule Creation

Hey all, I'm attempting to create a detection rule that brings in data across 3 different log_types:

  • Azure_AD 
  • Microsoft Defender Endpoint
  • Microsoft Graph Alert

When I add Azure_AD into the mix, I run into an error that says "validating intermediate representation: event variables are not all joined by equalities, the joined groups are: (azure_event), (graph_event), (mde_event)".

Looking at my logic, I'm not quite sure what the error means. What I really want is for this detection rule to also bring in events from the other two log types, based on shared fields, whenever Defender generates an alert. Is that not possible with 3 log_types? It works perfectly if I remove Azure_AD. Any help is appreciated!

Here is my current rule:

 

rule microsoft_security_alerts {

  meta:
    author = "Cody Brandt"
    description = "This rule looks for MDE alerts with High or Medium severity across multiple log types"
    status = "Production"
    severity = "High"

  events:
    // Pulling all alerts created by Defender, specifically looking at the below severities
    $mde_event.metadata.log_type = "MICROSOFT_DEFENDER_ENDPOINT" and
    ($mde_event.security_result.severity = "CRITICAL" or
     $mde_event.security_result.severity = "HIGH" or
     $mde_event.security_result.severity = "MEDIUM" or
     $mde_event.security_result.severity = "LOW")

    // Creating the match variable ($threat) from the user ID on the Defender event
    $mde_event.principal.user.userid = $threat

    // Pulling all alerts created by Microsoft Graph, specifically looking at the below severities
    $graph_event.metadata.log_type = "MICROSOFT_GRAPH_ALERT" and
    ($graph_event.security_result.severity = "CRITICAL" or
     $graph_event.security_result.severity = "HIGH" or
     $graph_event.security_result.severity = "MEDIUM" or
     $graph_event.security_result.severity = "LOW")

    $azure_event.metadata.log_type = "AZURE_AD" and
    $azure_event.metadata.event_type = "USER_LOGIN"

    // Comparing events that are shared across the log_types
    ($graph_event.target.user.userid = $mde_event.principal.resource.attribute.labels["AadUserId"] or
    $mde_event.security_result.threat_name = $graph_event.security_result.rule_name or 
    $azure_event.target.user.user_display_name = $mde_event.principal.resource.attribute.labels["DisplayName"] or
    $graph_event.metadata.event_type = $azure_event.metadata.event_type)

  match:
    $threat over 30m

  condition:
    $mde_event and $graph_event and $azure_event
}

 

 


The error message "validating intermediate representation: event variables are not all joined by equalities, the joined groups are: (azure_event), (graph_event), (mde_event)" means that your YARA-L rule is attempting to correlate events from three different log types, Azure AD (azure_event), Microsoft Graph Alert (graph_event), and Microsoft Defender Endpoint (mde_event), but it does not join them in a way the rule engine accepts. Each event variable ends up in its own group.

The Problem: Lack of Direct Equality Joins

YARA-L requires that all event variables within a rule be joined together using direct equality comparisons between their fields. This ensures that the rule engine can correlate events based on shared attributes. Your rule attempts to join events using or conditions, which isn't how YARA-L correlation works.

In your rule, you have three separate blocks defining criteria for each log type:

$mde_event.metadata.log_type = "MICROSOFT_DEFENDER_ENDPOINT" and ...
$graph_event.metadata.log_type = "MICROSOFT_GRAPH_ALERT" and ...
$azure_event.metadata.log_type = "AZURE_AD" and ...

Then you have this block attempting to join them:

($graph_event.target.user.userid = $mde_event.principal.resource.attribute.labels["AadUserId"] or
$mde_event.security_result.threat_name = $graph_event.security_result.rule_name or
$azure_event.target.user.user_display_name = $mde_event.principal.resource.attribute.labels["DisplayName"] or
$graph_event.metadata.event_type = $azure_event.metadata.event_type)

The or conditions here don't create valid joins. Because any single comparison in the group is enough to satisfy it, none of them is guaranteed to hold, so YARA-L treats them as ordinary filter conditions rather than as links between events. This leads to the error message: the azure_event, graph_event, and mde_event variables are each left in their own group, not joined through equalities.
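One way to build intuition for this is to evaluate both shapes of predicate against a set of events where one event is unrelated. The Python sketch below only models the boolean logic; it is not how the rule engine is implemented, and the dictionary keys and values are made-up sample data.

```python
# Hypothetical, simplified events; the keys and values are illustrative only.
mde = {"aad_user_id": "u-100", "display_name": "Alice"}
graph = {"userid": "u-100"}
azure = {"display_name": "Bob"}  # a login by a different, unrelated user


def or_group(mde, graph, azure):
    # Mirrors the shape of the `or` block: any one comparison is enough.
    return (
        graph["userid"] == mde["aad_user_id"]
        or azure["display_name"] == mde["display_name"]
    )


def separate_equalities(mde, graph, azure):
    # Mirrors equalities written as separate predicates: all must hold.
    return (
        graph["userid"] == mde["aad_user_id"]
        and azure["display_name"] == mde["display_name"]
    )


print(or_group(mde, graph, azure))             # True: the unrelated login still passes
print(separate_equalities(mde, graph, azure))  # False: the unrelated login is rejected
```

With the or group, the Azure AD login for a different user is accepted as long as the MDE and Graph events agree, so nothing ties azure_event to the other two.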

How YARA-L Joins Work:

A valid join in YARA-L would look like this:

$event1.some_field = $event2.some_field

This establishes a direct correlation between event1 and event2 based on the shared value in some_field. All event variables in a rule must be connected in this way, either directly or transitively.
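The "directly or transitively" requirement can be pictured as a connectivity check: treat each event variable as a node, each plain equality as an edge, and require a single connected group. The Python sketch below is a rough model of that idea under the assumption that equalities inside an or group are not counted as edges; the function name joined_groups is hypothetical, and this is not the engine's actual validation code.

```python
def joined_groups(event_vars, equalities):
    """Group event variables that are connected by equality predicates.

    `equalities` holds (left, right) pairs. Each side is either an event
    variable name or a placeholder variable name such as "$user_id".
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for var in event_vars:
        find(var)
    for left, right in equalities:
        parent[find(left)] = find(right)

    # Report only event variables; placeholders just act as connectors.
    groups = {}
    for var in event_vars:
        groups.setdefault(find(var), []).append(var)
    return sorted(sorted(group) for group in groups.values())


events = ["mde_event", "graph_event", "azure_event"]

# Only one plain equality exists outside the `or` group, and it ties
# mde_event to the $threat placeholder, not to another event.
print(joined_groups(events, [("mde_event", "$threat")]))
# [['azure_event'], ['graph_event'], ['mde_event']]

# Two plain equalities chained through graph_event connect all three.
chained = [("mde_event", "graph_event"), ("graph_event", "azure_event")]
print(joined_groups(events, chained))
# [['azure_event', 'graph_event', 'mde_event']]

# Sharing one placeholder variable connects them as well.
shared = [(name, "$user_id") for name in events]
print(joined_groups(events, shared))
# [['azure_event', 'graph_event', 'mde_event']]
```

The first result has the same three single-member groups that the error message lists.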

Solution: Establish Direct Equality Joins

To fix your rule, you need to find fields that are common across the three log types and use those fields to create direct equality joins. For example, if all three log types contain a user ID field, you could use that to link them:

$mde_event.principal.user.userid = $user_id
$graph_event.target.user.userid = $user_id
$azure_event.target.user.userid = $user_id 

This would create a valid join, allowing the rule to correlate events where the user_id is the same across all three log types.

Additional Considerations:

  • Field Mapping: You'll need to carefully examine the UDM schema for each log type to identify the correct fields for joining. Ensure that the fields represent the same attribute across all three log types.
  • Transitive Joins: YARA-L also supports transitive joins, where events are indirectly linked through a shared placeholder variable rather than compared to each other directly. For example:
    $event1.fieldA = $temp_var
    $event2.fieldB = $temp_var
    $event3.fieldC = $temp_var
    
  • Logic Adjustments: You might need to revise your rule's logic (the condition section) to reflect the specific correlations you want to achieve.

Important Note: It's essential to understand that simply joining events from different log types doesn't guarantee a meaningful detection. You'll need to carefully consider the relationships between the events and define appropriate conditions to identify the specific malicious behavior you're targeting.

You could try something like: 

rule microsoft_security_alerts {
  meta:
    author = "Cody Brandt"
    description = "This rule looks for MDE alerts with High or Medium severity across multiple log types"
    status = "Production"
    severity = "High"

  events:
    // MDE Events
    $mde_event.metadata.log_type = "MICROSOFT_DEFENDER_ENDPOINT" 
    ($mde_event.security_result.severity = "CRITICAL" or 
     $mde_event.security_result.severity = "HIGH" or 
     $mde_event.security_result.severity = "MEDIUM" or 
     $mde_event.security_result.severity = "LOW") 
    $mde_event.principal.user.userid = $threat

    // Graph Events
    $graph_event.metadata.log_type = "MICROSOFT_GRAPH_ALERT"
    ($graph_event.security_result.severity = "CRITICAL" or 
     $graph_event.security_result.severity = "HIGH" or 
     $graph_event.security_result.severity = "MEDIUM" or 
     $graph_event.security_result.severity = "LOW") 

    // Azure AD Events
    $azure_event.metadata.log_type = "AZURE_AD" 
    $azure_event.metadata.event_type = "USER_LOGIN" 

    // Direct Joins Between Event Variables
    $mde_event.principal.resource.attribute.labels["AadUserId"] = $graph_event.target.user.userid  
    $graph_event.target.user.userid = $azure_event.target.user.userid

  match:
    $threat over 30m

  condition:
    $mde_event and $graph_event and $azure_event 
}

Explanation of Changes:

  1. Direct Joins: The key change is the removal of the OR conditions and the introduction of direct equality comparisons between the event variables. We're using $mde_event.principal.resource.attribute.labels["AadUserId"] = $graph_event.target.user.userid and $graph_event.target.user.userid = $azure_event.target.user.userid assuming these fields contain shared user identifiers across the log types. You might need to adjust these field selections based on the actual UDM schema of your data.

  2. Unchanged Condition: The condition section stays as it was, requiring all three event variables to be present.

Important Considerations:

  • Field Validity: Ensure that the fields used for joining (labels["AadUserId"] in the MDE events and target.user.userid in the Graph and Azure AD events) are present and contain comparable values. Refer to the UDM schema for field definitions and data types.
  • Correlation Logic: This rule is based on the assumption that the user ID is a reliable way to correlate events across these log sources. If there are other relevant fields for correlation (e.g., timestamps, IP addresses, resource names), consider incorporating them into the events section to strengthen the rule.
  • False Positives: Be mindful of potential false positives. Multiple users might share the same user ID in different contexts, leading to incorrect correlations. Test the rule thoroughly with real data and adjust the logic or time window as needed.
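Before relying on a join, it can help to export a sample of values from each candidate field and check how much they overlap. The Python sketch below is one rough way to do that offline, assuming comparisons are exact string matches; the function name shared_values and the sample values are hypothetical and do not come from any real log source.

```python
def shared_values(left_values, right_values, fold_case=False):
    """Return the values that appear on both sides of a candidate join."""

    def normalize(values):
        # Skip missing values and surrounding whitespace.
        cleaned = (v.strip() for v in values if v)
        return {(v.lower() if fold_case else v) for v in cleaned if v}

    return normalize(left_values) & normalize(right_values)


# Made-up sample values standing in for exported field contents.
mde_aad_user_ids = ["a1b2", "C3D4", None]
graph_user_ids = ["a1b2", "c3d4", "e5f6", ""]

print(shared_values(mde_aad_user_ids, graph_user_ids))
# {'a1b2'}

print(sorted(shared_values(mde_aad_user_ids, graph_user_ids, fold_case=True)))
# ['a1b2', 'c3d4']
```

A large gap between the exact and case-folded results suggests the two sources format the same identifier differently, so an exact equality join would miss matches.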

This revised rule should address the "event variables are not all joined by equalities" error and achieve the desired correlation across three log types, though it might take some tweaking.

Thank you so much for the detailed explanation. This makes way more sense now with the direct equality joins. I will go ahead and test this out, but so far it's looking like it returns the results that I am looking for. I think the biggest thing now is just finding appropriate conditions like you mentioned.

This is a big help though, because it will move me in a better direction!