Hi, I have nested JSON data from a log source. It has a URL field that can't be mapped directly to "url_back_to_product". We need to extract part of it and concatenate it to a standard top-level domain.
Sample JSON:
"info": {
  "data": "https://myvendor.com:443/string/string/string-abcd1-abcd2-abcd3-abcd4-abcd5"
}
Only the final part of the URL (string-abcd1-abcd2-abcd3-abcd4-abcd5) has to be extracted and concatenated to a standard static TLD - https://mycompany.com/strings/
Field Extraction
url_back_to_product - https://mycompany.com/strings/string-abcd1-abcd2-abcd3-abcd4-abcd5
Please advise on the possible configurations.
You'll need to use the GROK function to match on the value you want to capture. This can be assigned to a field, which is stored in state and used to concatenate with a static string.
Grok - https://www.elastic.co/guide/en/logstash/current/plugins-filters-grok.html
Supported patterns - https://github.com/elastic/logstash/blob/v1.4.2/patterns/grok-patterns
GROK is built on top of regex, so if a pattern doesn't exist you can write your own to use for the capture group.
1. Add a Grok function and use regex to create a pattern and capture group that stores the required string.
2. Use that new field to append to the known URL string and assign the concatenated string to the UDM field.
Below is an example of using regex to capture a value and assign it to the capture group `url_back_to_product`. That value is then assigned to the UDM field by appending it to the static URL, e.g.:
grok {
  match => {
    "info.data" => "https://myvendor.com:443/\\S+/\\S+/(?P<url_back_to_product>.+)"
  }
  on_error => "no_url_back_to_product"
  overwrite => ["url_back_to_product"]
}
mutate {
  replace => {
    "event.idm.read_only_udm.metadata.url_back_to_product" => "https://mycompany.com/strings/%{url_back_to_product}"
  }
  on_error => "url_back_to_product"
}
Field extractors have some limitations; what you want to do can only be done with a CBN snippet.
You might want to take a look at the split function: https://cloud.google.com/chronicle/docs/reference/parser-syntax#split_function
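As a rough sketch of that alternative (the field names url_parts and no_url_parts are assumptions, and the exact split arguments should be checked against the parser-syntax reference above), the URL could be split on "/" so that its last segment holds the value to append to the static URL:
mutate {
  split => {
    # assumed source field based on the sample log; breaks the URL into segments
    source => "info.data"
    separator => "/"
    target => "url_parts"
  }
  on_error => "no_url_parts"
}
The last element of url_parts would then contain string-abcd1-abcd2-abcd3-abcd4-abcd5, which can be prepended with https://mycompany.com/strings/ in a mutate/replace step, as in the GROK example above.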
Hi @alube @cbryant, thanks for your kind inputs. I took the GROK function approach and used URIPATH plus regex to capture the required values.
if [info][data] != "" {
  grok {
    match => {
      "info.data" => "%{URIPATH:full_path}(?<url_back_to_product>incident-[^/]+)"
    }
    on_error => "no_url_back_to_product"
    overwrite => ["url_back_to_product"]
  }
  mutate {
    replace => {
      "event.idm.read_only_udm.metadata.url_back_to_product" => "https://mycompany.com/strings/%{url_back_to_product}"
    }
    on_error => "url_back_to_product"
  }
}
Many Thanks again!