Regex pattern not working as expected

Oct 15 17:08:29 |Check Point|VPN-1 & FireWall-1|Check Point|Log|http|Unknown|act=Accept app=HTTPS cn1Label=Elapsed  server_outbound_packets=30 service_id=https sig_id=4 src=10.25.18.12

I have the log line above and want to write a regex for log filtering that checks whether the text meets the following conditions:
1. contains Check Point
2. act=Accept (or Allow), in upper or lower case
3. src is not a private address


This is the regex I have written,
regexp: .*(?i)(Check Point|fortigate).*act=(?i)(Deny|Drop).*src=(?!(10[.]|172[.](?:1[6-9]|2[0-9]|3[0-1])[.]|192[.]168[.])).*


but this is not working. 
Can anyone help?


Hi @asinghz297,

As previously mentioned, I would recommend performing this data manipulation and extraction via a parser; however, the rule below should work. Note that the example event has been mapped to 'metadata.description' (if yours is not, replace every occurrence of metadata.description below with the relevant UDM field).

rule Regex_Example {
 
  meta:
    author = "Ayman C"
    description = "Regex example"


  events:

    //$inputstring = //"Oct 15 17:08:29 |Check Point|VPN-1 & FireWall-1|Check Point|Log|http|Unknown|act=Accept app=HTTPS cn1Label=Elapsed  server_outbound_packets=30 service_id=https sig_id=4 src=10.25.18.12"

    $EventType = re.capture($event.metadata.description, /:[0-9]{2}\s\|([^\|]+)/)
    $ActionOutcome = re.capture($event.metadata.description, /act=([^\s]+)/)
    $SourceIP = re.capture($event.metadata.description, /src=([^\s]+)$/)

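    // The capture above yields "Check Point" for the sample line; the private-range regex is anchored for all three ranges.
    $EventType = "Check Point" and ($ActionOutcome = "Accept" nocase or $ActionOutcome = "Allow" nocase) and not $SourceIP = /^(10[.]|172[.](?:1[6-9]|2[0-9]|3[0-1])[.]|192[.]168[.])/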

  outcome:

    $FullEvent = $event.metadata.description
    $EventTypeExtracted = $EventType
    $ActionOutcomeExtracted = $ActionOutcome
    $SourceIPExtracted = $SourceIP
  condition:
    $event
}




Hi @AymanC 
The point is that I don't want text extraction but pattern matching, so that only specific logs are allowed into the instance.
So my requirement is to write a single regex that takes care of all three conditions.

Hi @asinghz297 ,

If you want to filter on ingest using regexp, you need to investigate using the forwarder, the OTEL collector, or a third-party product such as Cribl. The rule above applies post-ingest: it creates detections based on the pattern match.

From a practical standpoint, I'd be hesitant to exclude firewall logs that don't match a pattern. One might not need them today, but what if one needed them tomorrow?

I need to do this to reduce the EPS on the instance. Basically, I want to drop the Check Point firewall logs for inbound traffic that the firewall denies. I can do that using regex.

The solution I came up with was:

.*(?i)(Check Point|fortigate).*act=(?i)(Deny|Drop).*src=(?!(10[.]|172[.](?:1[6-9]|2[0-9]|3[0-1])[.]|192[.]168[.])).*

Is it okay?
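One way to answer that question is to run the pattern against a few variations of the sample line. The sketch below is only an approximation of the target engine: it uses Python's re module, which supports negative lookahead, and it moves the inline (?i) into a compile flag because some engines reject it mid-pattern. The TEMPLATE string and the would_drop helper are hypothetical names introduced for this illustration; only act and src are varied.

```python
import re

# Pattern from the post, with the inline (?i) moved to a compile flag.
# Assumption: the target engine supports negative lookahead, as Python's re does.
DROP_PATTERN = re.compile(
    r".*(Check Point|fortigate).*act=(Deny|Drop)"
    r".*src=(?!(10[.]|172[.](?:1[6-9]|2[0-9]|3[0-1])[.]|192[.]168[.])).*",
    re.IGNORECASE,
)

# Hypothetical template built from the sample line; only act and src vary.
TEMPLATE = (
    "Oct 15 17:08:29 |Check Point|VPN-1 & FireWall-1|Check Point|Log|http|Unknown|"
    "act={act} app=HTTPS cn1Label=Elapsed  server_outbound_packets=30 "
    "service_id=https sig_id=4 src={src}"
)


def would_drop(line: str) -> bool:
    """Return True when the drop pattern matches the log line."""
    return DROP_PATTERN.search(line) is not None


if __name__ == "__main__":
    cases = [("Accept", "10.25.18.12"), ("Drop", "203.0.113.7"), ("deny", "192.168.1.5")]
    for act, src in cases:
        print(act, src, would_drop(TEMPLATE.format(act=act, src=src)))
```

Note that the sample line in the original question has act=Accept, so a pattern that looks for Deny or Drop will not match it regardless of the source address.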

matthewnichols
Community Manager

Hi @asinghz297, what are your ingestion methods and what are your current EPS needs? What do you see as your max EPS?

Hi @matthewnichols 
Regardless of the EPS, my concern is to drop the mentioned logs.

Can you check my regex query for the same and suggest any changes?

.*(?i)(Check Point|fortigate).*act=(?i)(Deny|Drop).*src=(?!(10[.]|172[.](?:1[6-9]|2[0-9]|3[0-1])[.]|192[.]168[.])).* 

Hi @asinghz297, it's a little tough to give regular expression advice when you haven't said what level of regular expression support your engine has. I did check your pattern, and the way you reference RFC 1918 doesn't work all that well in a few different engines. Inline case-insensitivity modifiers such as (?i) also aren't handled the same way on all platforms, particularly when they appear mid-pattern. Finally, RE2, which Google uses in most places, doesn't support lookaround assertions, and your expression uses a negative lookahead.

For RE2, this pattern worked on that one log sample above:

.*(Check Point|fortigate).*act=(Deny|Drop).*src=([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}).*

It also works when you use .+ rather than .*.
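The lookaround-free pattern above matches any IPv4 source, private or not, so the RFC 1918 exclusion has to happen somewhere else. If your pipeline lets you post-process a match, one option is a two-step check: match the line, then test the captured address. This is a minimal sketch in Python rather than RE2 itself (the pattern stays within syntax both accept); is_public_denied and RFC1918 are hypothetical names, and the pattern is case-sensitive as written.

```python
import ipaddress
import re

# The three RFC 1918 private ranges.
RFC1918 = [
    ipaddress.ip_network(net)
    for net in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]

# Pattern from the reply above; it uses no lookaround. Group 3 is the source IP.
LINE_PATTERN = re.compile(
    r".*(Check Point|fortigate).*act=(Deny|Drop)"
    r".*src=([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}).*"
)


def is_public_denied(line: str) -> bool:
    """True for a denied/dropped line whose src is outside the RFC 1918 ranges."""
    match = LINE_PATTERN.search(line)
    if match is None:
        return False
    try:
        src = ipaddress.ip_address(match.group(3))
    except ValueError:
        # Digits matched the pattern but are not a valid IPv4 address.
        return False
    return not any(src in net for net in RFC1918)
```

Doing the range check outside the regex keeps the pattern readable, at the cost of needing a filter stage that can run more than a single expression.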

If you want to stay closer to your original pattern and you're using something that supports a negative lookahead, this looks OK:

.+(Check Point|fortigate).+act=(Deny|Drop).+src=(?!10\.|172\.(?:1[6-9]|2[0-9]|3[0-1])\.|192\.168\.)\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}

This is all minimally tested advice so I'd verify it on some larger samples rather than relying on it!
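For checking a pattern against a larger sample, a small helper that splits lines into matched and unmatched groups makes it easy to eyeball both sides. This is only a sketch using Python's re module, which may behave differently from the engine doing the real filtering; split_by_pattern is a hypothetical helper name.

```python
import re
from typing import Iterable, List, Tuple


def split_by_pattern(
    pattern: str, lines: Iterable[str], flags: int = 0
) -> Tuple[List[str], List[str]]:
    """Split lines into (matched, unmatched) according to the pattern."""
    compiled = re.compile(pattern, flags)
    matched: List[str] = []
    unmatched: List[str] = []
    for line in lines:
        line = line.rstrip("\n")  # tolerate lines read straight from a file
        if compiled.search(line):
            matched.append(line)
        else:
            unmatched.append(line)
    return matched, unmatched
```

It accepts any iterable of strings, so an open file handle over an exported log sample works as well as a list. Reviewing the unmatched group is as useful as reviewing the matched one, since that is where wrongly kept lines show up.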