Deduplicating Email Addresses/Iterating over repeated fields in parser

Hey All

In the Workspace user parser, my users get the same email added to the entity.user.email_addresses field more than once.
(Image: sample user record with duplicated email addresses)

This is slightly annoying, so I tried to remove it. My idea was to just dedupe at the bottom of the parser so that I can merge in future changes relatively easily.

However, I'm struggling to understand what my options are for dealing with an array/repeated field. It looks like I'm not able to iterate over it, reference individual values, or flatten it back to a string. The only thing I can find that works is checking for existence, so I can have a bunch of if statements that look something like this:

 

 

if [var_email] =~ /.*specificallyExcludedDomain\.com$/ or [var_email] in [var_entity][user][email_addresses]{
}
else {
  mutate{
    merge => {
      "var_entity.user.email_addresses" => "var_email"
    }
  }
}

 

 


Am I missing an obvious way to interact with these repeated fields in the parser? Any ideas would be awesome. 

1 REPLY

There is a similar post about this here:

https://www.googlecloudcommunity.com/gc/SIEM-Forum/Parsers-selecting-last-element-in-an-array-withou...

The main idea is to build a variable, call it "listRegex", that holds the values seen so far in the format "value1|value2|value3|...". Each new value is then checked against "listRegex" in an if conditional using the regex match operator =~. Because "listRegex" is itself a regex alternation, the conditional either appends the new value or discards it as a duplicate, so only unique values are kept.
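
To make the logic concrete, here is a rough sketch of the same idea in Python. It is not parser syntax, only a model of the approach. The function name `dedupe_with_list_regex` and its parameters are made up for illustration, and escaping each value and matching the whole string are my own assumptions, added so that a "." in an address is not treated as a wildcard. Whether the parser lets you do the same is something you would need to test.

```python
import re


def dedupe_with_list_regex(emails, excluded_pattern=None):
    """Keep the first occurrence of each email, tracking seen values
    in a growing "value1|value2|..." alternation (hypothetical helper)."""
    list_regex = ""  # plays the role of the "listRegex" variable
    unique = []
    for email in emails:
        # Optional exclusion, mirroring the domain check in the question.
        if excluded_pattern and re.search(excluded_pattern, email):
            continue
        # Duplicate check: does the new value match the alternation exactly?
        if list_regex and re.fullmatch(list_regex, email):
            continue
        unique.append(email)
        # Assumption: escape the value so regex metacharacters match literally.
        escaped = re.escape(email)
        list_regex = escaped if not list_regex else list_regex + "|" + escaped
    return unique


if __name__ == "__main__":
    sample = ["a@example.com", "b@example.com", "a@example.com"]
    print(dedupe_with_list_regex(sample))
```

In the parser, the equivalent would presumably be one conditional per incoming value plus a mutate that extends "listRegex", so the repeated field itself never needs to be iterated.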