Error in Parsing Arrays in JSON Logs for UDM Mapping

Hello,

I have been trying to parse an array from a JSON log, where the size of the array varies between logs. I have attached a sample of the log below.

[Image: vishnu_manu_0-1724166673393.png]

I’ve run into an issue that I haven’t been able to resolve when trying to parse "name" from the log and merge it into the UDM field "security_result.threat_feed_name", which accepts string (repeated) values.


Here's the screenshot of the code and error:
[Image: vishnu_manu_1-1724167014062.png]
[Image: vishnu_manu_3-1724167068594.png]
At a high level, the code uses a for loop to iterate over the array, binding each JSON element to "elem". It then reads "elem.name" to pick out only the name and merges it into the UDM.
 
Similarly, I'm facing the same error when trying to map "confidenceLevel" to "security_result.confidence_score" in the UDM, which accepts float (repeated) values.

I’ve tried various possibilities to solve the error to the best of my knowledge, but I haven’t found a solution yet.
 
If anyone has a solution or suggestion that I can use to resolve this, please share it in a reply.

 

1 ACCEPTED SOLUTION

So the problem is that while security_result is a repeated field, threat_feed_name is not, so your first merge won't work, which is what the error indicates. You need to create multiple security_result entries, one per feed. (If you prefer to keep one main security result, you can use security_result.about.security_result and do the repeating there.) You can see the design pattern below.

 

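for feed in reportedFeeds {
  # Build a temporary security_result holding this feed's name
  mutate {
    copy => {
      "security_result.threat_feed_name" => "feed.name"
    }
  }
  # Append the temporary object to the repeated UDM field
  mutate {
    merge => {
      "udm.security_result" => "security_result"
    }
  }
  # Clear the temporary object before the next iteration
  mutate {
    remove_field => ["security_result"]
  }
}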
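
To make the target shape concrete, here is a rough Python model of what the loop above is meant to produce. It is not parser code, just a sketch of the data: one security_result entry per element of reportedFeeds, each carrying a single threat_feed_name. The function name and the sample feed values are made up for illustration.

```python
def build_security_results(reported_feeds):
    """Model of the parser loop: one security_result entry per feed."""
    udm = {"security_result": []}
    for feed in reported_feeds:
        # threat_feed_name holds a single string, so each feed gets its own entry
        security_result = {"threat_feed_name": feed["name"]}
        # Equivalent of merging the temporary object into the repeated field
        udm["security_result"].append(security_result)
        # The temporary object is rebuilt from scratch on the next pass,
        # which is the effect remove_field has in the parser
    return udm


# Hypothetical sample data, not taken from the real log
sample = [
    {"name": "A", "confidenceLevel": 80},
    {"name": "B", "confidenceLevel": 65},
]
result = build_security_results(sample)
```

With the sample above, result["security_result"] has two entries, the first named "A" and the second named "B".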

Citreno,

Thanks for the solution, the problem is solved.

I would like to understand this better. Can you explain at a high level how it works?

Specifically, what's the role of "remove_field" ?

mutate {
  remove_field => ["security_result"]
}

Ah, great question, and one I still struggle to explain properly; I'm not sure I understand it very well myself. There's a weird statefulness about for loops. Perhaps it's a bug in how copy/replace followed by merge behaves; maybe it merges by pointer or something, who knows?

Normally you would expect that if you set security_result.threat_feed_name = "A" inside a for loop, the variable would disappear on the next iteration because it is local to the loop body. I'm rusty with these things, but I think that's how it behaves in most programming languages. In CBN, even though you create the variable inside the for loop, it keeps its state and is not re-created: the value "A" remains in the state, as seen in the statedump. Worse, once you merge, the newly set "B" somehow replaces the previously merged "A", so "B" ends up in the array twice. It's definitely confusing.

In any case, the remedy is to make it behave like a normal for loop by removing the security_result variable entirely at the end of each iteration. Otherwise the array just duplicates the last item, in our case "B". The screenshot below shows the statedump output before and after the copy/merge combo when remove_field is left out.

[Image: 2024-08-21_09-47-19.jpg]
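
For what it's worth, the "by pointer" guess can be illustrated with a small Python model. This is only a sketch of that hypothesis, not a description of how the parser engine actually works: it assumes merge stores a reference to the temporary object rather than a copy. The function and parameter names are hypothetical.

```python
def merge_by_reference(feed_names, reset_each_iteration):
    """Model the loop under the ASSUMPTION that merge stores a reference."""
    merged = []
    security_result = {}
    for name in feed_names:
        # Plays the role of copy: set the field on the temporary object
        security_result["threat_feed_name"] = name
        # Plays the role of merge: append the object itself, not a copy
        merged.append(security_result)
        if reset_each_iteration:
            # Plays the role of remove_field: start over with a fresh object
            security_result = {}
    return [entry["threat_feed_name"] for entry in merged]


without_reset = merge_by_reference(["A", "B"], reset_each_iteration=False)
with_reset = merge_by_reference(["A", "B"], reset_each_iteration=True)
```

Under that assumption, without_reset comes out as ["B", "B"], because both list entries point at the same object, while with_reset comes out as ["A", "B"]. That matches the duplicated "B" in the statedump, though it doesn't prove this is the real cause.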

Honestly, I'm still confused about this, but I hope I'll figure it out eventually.

I've learned how Logstash works mostly by understanding the errors rather than just relying on the documentation.

Haha yes, sometimes it's the ultimate red, green, refactor! Just run at it until you encounter an error and you get to sort of memorize the patterns. 

I'm confused about the behavior too; I spent far too long staring at it! I strongly suspect it's a bug, to be honest, so unfortunately I'm just giving you the remedy here without explaining the root cause. Good luck. What I found is that remove_field rarely hurts at the end of a for loop if the variable is created inside it, so I just do it by muscle memory now.