Hi everyone,
I am working on SecOps Native Dashboards and ran into an issue with the sum function in YARA-L while creating visualizations.
Below is my query:
metadata.vendor_name="ABC"
principal.ip!=""
target.ip!=""
$query=query
match:
$query
outcome:
$total_bytes=sum(bytes)
The result values are 4 times the expected values. For example, if the required result is 30, the query returns 120 (4 × 30).
I also observed that when I remove the principal.ip and target.ip filters from the query, I get the correct result:
metadata.vendor_name="ABC"
$query=query
match:
$query
outcome:
$total_bytes=sum(bytes)
Why is this happening? Is there any alternative or solution to fix this?
Thanks,
Prashant Nakum
@prashant_nakum This is caused by how repeated fields are unnested and passed to the Match & Outcome sections. The behavior is described here and is worth reading: https://cloud.google.com/chronicle/docs/detection/yara-l-issues#outcome_aggregations_with_repeated_f...
There are some recommended workarounds in the doc, but it doesn't look like they can readily be applied in this scenario, since you are trying to use the SUM function, which doesn't have a distinct-event version.
You should be able to get this working by modifying your outcome section to include the ratio of unnested events to distinct events in your calculation with something like this:
$event_ratio = (count(metadata.id) / count_distinct(metadata.id))
$sum_bytes=sum(bytes)
$actual_bytes = $sum_bytes / $event_ratio
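To see why dividing by the ratio works, and where it stops working, here is a small Python sketch of the arithmetic. It is only a model, not how the engine is implemented: the `(event_id, bytes)` rows and the function name `ratio_corrected_sum` are made up for illustration. The point is that the correction is exact only when every event is duplicated the same number of times by the unnesting.

```python
def ratio_corrected_sum(rows):
    """Apply the ratio workaround to a list of (event_id, bytes) rows.

    `rows` stands in for the unnested rows seen by the outcome section
    (a hypothetical model, not an actual SecOps data structure).
    """
    total = sum(b for _, b in rows)                    # like sum(bytes)
    count = len(rows)                                  # like count(metadata.id)
    distinct = len({event_id for event_id, _ in rows}) # like count_distinct(metadata.id)
    event_ratio = count / distinct
    return total / event_ratio


# Every event duplicated 4 times: the naive sum is 120, the corrected sum is 30.
uniform_rows = [("e1", 10)] * 4 + [("e2", 20)] * 4
uniform_result = ratio_corrected_sum(uniform_rows)

# Events duplicated unevenly (4 times vs. once): the true total is still 30,
# but the corrected value comes out as 24, so the workaround is only approximate.
skewed_rows = [("e1", 10)] * 4 + [("e2", 20)] * 1
skewed_result = ratio_corrected_sum(skewed_rows)

print(uniform_result, skewed_result)
```

So if your events do not all carry the same number of principal.ip and target.ip values, treat the corrected number as an estimate and sanity-check it against the query without the IP filters.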
Hi @prashant_nakum. Some initial analysis on my side points to the two IP conditions (the same ones you pointed out) as the reason the two versions behave differently. I believe it has to do with how these conditions interact with the logic in the outcome section.
The outcome calculation is performed on the subset of initially matched events that also satisfy the principal.ip!="" and target.ip!="" conditions. The problem arises if the underlying query (query in your example) can match the same event multiple times under different matching contexts because of the additional conditions.
If the same event matches the query and also has non-empty principal.ip and target.ip values, each combination might be treated as a separate "match" in the context of the additional conditions. This could lead to the bytes values being summed multiple times.
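As a rough illustration of how one event could turn into several matches, here is a Python sketch that assumes each event is expanded into one row per combination of its principal.ip and target.ip values. The dictionary layout, the `unnest` helper, and the sample IPs are all hypothetical; this is a mental model, not the actual engine behavior.

```python
from itertools import product


def unnest(event):
    """Expand one event into a row per (principal_ip, target_ip) pair.

    Assumption for illustration: referencing both repeated fields yields
    the cross product of their values.
    """
    return [
        {"id": event["id"], "principal_ip": p, "target_ip": t, "bytes": event["bytes"]}
        for p, t in product(event["principal_ips"], event["target_ips"])
    ]


# One event with two principal IPs, two target IPs, and 30 bytes.
event = {
    "id": "e1",
    "principal_ips": ["10.0.0.1", "10.0.0.2"],
    "target_ips": ["192.0.2.1", "192.0.2.2"],
    "bytes": 30,
}

rows = unnest(event)
naive_total = sum(r["bytes"] for r in rows)

print(len(rows), naive_total)
```

Under that assumption, an event with two values in each field produces four rows, and summing bytes over them gives 4 × 30 = 120, which lines up with the factor of 4 reported in the question.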