Rule performance tuning

Does the order of the events section affect rule performance?

For example, in this rule, would moving the ref list further up in the section help?

$e.metadata.vendor_name = "Microsoft"

$e.metadata.product_name = "Windows"

$e.target.process.command_line = /powershell.exe/

$e.security_result.action = "BLOCK"

...

not $e.principal.user in %excludedusers


Thx for posting this.

I often wonder whether I need to change my SIEM searches or rule syntax to make them faster and more efficient.

I'm used to using the Job Inspector in Splunk, so I can know definitively which of my syntax practices are effective and which are wasteful.

I think I agree with you about writing SIEM rules and searches top down, i.e. leading with the parts of the search that cut down the scope of data to search the most. So, like you, I usually lead with vendor_name, product_name, service or asset, and so on, down to the elements that occur more widely across logs, e.g. event = ALLOW|BLOCK etc.

Also, from my experience, I don't think there is any pre-processing of the query in the way Splunk would optimize queries.

HTH

 

SecOps doesn't expect the user to put things in a certain order to get more performance out of the search itself; there is still some compilation that goes on behind the scenes.

Fields like metadata.event_type, principal.ip, target.ip, principal.hostname and target.hostname are great for eliminating other data, so use them where possible. I try to use metadata.event_type nearly everywhere if feasible.

If you can fold regex statements together, separating criteria with |, that is better than having, say, 3 separate regexes, but again it may not always be possible.
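
To make the "fold with |" idea concrete, here is a minimal sketch in plain Python (not YARA-L, and not a statement about how SecOps evaluates regexes). The sample command lines and the cmd.exe / wscript.exe patterns are made up for illustration; only powershell.exe comes from the rule above. It just shows that one alternation pattern selects the same lines as three separate patterns.

```python
import re

# Hypothetical sample command lines, for illustration only.
command_lines = [
    "C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe -enc AAAA",
    "cmd.exe /c whoami",
    "C:\\Windows\\System32\\wscript.exe script.vbs",
    "notepad.exe report.txt",
]

# Three separate patterns, each checked on its own.
separate = [
    re.compile(p)
    for p in (r"powershell\.exe", r"cmd\.exe", r"wscript\.exe")
]

# The same criteria folded into one pattern using | (alternation).
folded = re.compile(r"powershell\.exe|cmd\.exe|wscript\.exe")


def matches_separate(line: str) -> bool:
    """True if any of the three separate patterns matches the line."""
    return any(p.search(line) for p in separate)


def matches_folded(line: str) -> bool:
    """True if the single folded pattern matches the line."""
    return folded.search(line) is not None


separate_hits = [line for line in command_lines if matches_separate(line)]
folded_hits = [line for line in command_lines if matches_folded(line)]
```

Both lists contain the same three lines, so folding changes how many regex conditions you write, not what gets matched.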

We have some more guidance that perhaps we can put together in a blog more broadly, but hopefully this answers the immediate question.

I think I understand most of the filtering for performance and what I need to include. I'm really wondering whether the ordering of the filters makes a difference. For example, if I use the exclusion list from the example above (or even an inclusion list, since that may be smaller), should I move it up in the query?

@jstoner please also comment on ordering search syntax. From your reply, are you suggesting putting e.g. metadata.event_type etc. above vendor_name and product_name?

Do I need to break my Splunk habit?

 

It isn't on the user to worry about ordering. I start with metadata.event_type as a mental model, but there isn't a "you must do x or y" that users need to worry about for query optimization.

Nope, as mentioned, there isn't an expectation that putting your reference list first (or last) will have any bearing on performance.

@jstoner My personal experience suggests the order of the syntax may matter. I have run some tests comparing different orders. However, I know anecdotes are not data, and there's no easy visibility into the process or what is changing from day to day.

The lack of visibility into what is actually going on is frustrating yet understandable, as the platform and its processes are changing and improving over time.

I appreciate the feedback, @Chris_B. I had not seen anything published to date, but I went back and did some additional digging. Based on the feedback, it sounds like there are some nuances, and some content is in progress that should point you in the right direction. Apologies that I didn't previously have the information to share; we will have that additional information posted here shortly.

.

Maybe I am misunderstanding, but it seems like you are saying that reference list checks are expensive (makes sense) and that expensive filters should be placed later in the events section (makes sense), but then you also say that moving the reference list check earlier/higher is a good idea to improve efficiency (?).

Is it not the exact opposite? You want the early predicates to be the cheap filters so that the expensive filters have less to process afterwards.
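
To spell out the reasoning behind that question, here is a small generic sketch in Python. It is not a model of how SecOps compiles or runs rules (the replies above say users aren't expected to order predicates); it only shows the general short-circuit argument. The events, the exclusion list, and the CountingLookup helper are all hypothetical.

```python
# Hypothetical events and exclusion list, for illustration only.
events = [
    {"vendor": "Microsoft", "action": "BLOCK", "user": "alice"},
    {"vendor": "Microsoft", "action": "ALLOW", "user": "bob"},
    {"vendor": "Other", "action": "BLOCK", "user": "carol"},
    {"vendor": "Microsoft", "action": "BLOCK", "user": "svc_backup"},
]
excluded_users = {"svc_backup"}


class CountingLookup:
    """Stands in for an 'expensive' list check and counts how often it runs."""

    def __init__(self, members):
        self.members = members
        self.calls = 0

    def excludes(self, user):
        self.calls += 1
        return user in self.members


def cheap(event):
    """Cheap equality filters on vendor and action."""
    return event["vendor"] == "Microsoft" and event["action"] == "BLOCK"


def run(expensive_first):
    """Return (matching users, number of list lookups) for one ordering."""
    lookup = CountingLookup(excluded_users)
    hits = []
    for event in events:
        if expensive_first:
            keep = not lookup.excludes(event["user"]) and cheap(event)
        else:
            # `and` short-circuits: the lookup only runs if cheap() passed.
            keep = cheap(event) and not lookup.excludes(event["user"])
        if keep:
            hits.append(event["user"])
    return hits, lookup.calls


hits_cheap_first, calls_cheap_first = run(expensive_first=False)
hits_expensive_first, calls_expensive_first = run(expensive_first=True)
```

Both orderings return the same matches, but in this toy evaluator the cheap-first ordering does 2 list lookups and the list-first ordering does 4. Whether anything like this applies inside the platform depends on what its compilation step does with the rule.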

.

Ah okay, makes sense. Thanks.