JSON parsing

Former Community Member

Hello everyone, we are currently developing a parser for logs that are in JSON format (not raw JSON). Is it possible to create a parser specifically designed for parsing JSON logs?

{
        "id": "*****",
        "asset": {
            "id": "****",
            "name": "******",
            "display_ipv4_address": "*****"
        }
}

Can we write a parser if the log is in the above format, without ingesting it as raw JSON?

If yes, can you please help me with a sample parser?

Solved
1 ACCEPTED SOLUTION

The log needs to be on a single line, e.g.,

{"k1":"v1","k2":"v2"}

versus being split over multiple lines, e.g.,

{
  "k1":"v1",
  "k2":"v2"
}

Usually the originating source can output newline-delimited JSON, but if not then it would need something ahead of ingestion to Chronicle to format the log, e.g., `jq -c` to create compact JSON.


8 REPLIES

If you didn't want to use the JSON filter (https://cloud.google.com/chronicle/docs/reference/parser-syntax#extract_json_formatted_logs) then you could use a GROK and regex each field (https://cloud.google.com/chronicle/docs/reference/parser-syntax#grok_extraction_syntax)
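For instance, a minimal sketch of the JSON filter approach, assuming the whole log line arrives in the default `message` field (the `on_error` label is just an example name):

filter {
  # Parse the entire log line as JSON; top-level keys such as "id"
  # become tokens like %{id} for later mapping statements.
  json {
    source => "message"
    on_error => "not_json"
  }
}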

If you have a JSON object in a log field, you can combine the two approaches above, e.g., use a GROK to extract the JSON from the log, then call the JSON filter.
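A rough sketch of that combination, assuming the log carries an ISO 8601 timestamp followed by the JSON payload (the prefix pattern and field names are illustrative assumptions):

filter {
  # Step 1: GROK out the JSON portion of the line into its own field.
  grok {
    match => {
      "message" => ["^%{TIMESTAMP_ISO8601:ts} %{GREEDYDATA:json_body}$"]
    }
    on_error => "grok_failed"
  }

  # Step 2: run the JSON filter over the extracted field.
  json {
    source => "json_body"
    on_error => "not_json"
  }
}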

 

Former Community Member

Hi Martin, thank you for the response. I want to know if we can parse a nested JSON log directly in Chronicle, without converting it into raw JSON.

If you have a single JSON object with nested fields you can use this syntax (https://cloud.google.com/chronicle/docs/reference/parser-syntax#manipulating_json_arrays)
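As a rough illustration using the sample log from the original post: once the JSON filter has run, nested keys are addressable as dotted tokens, so a mapping could look something like this (the UDM target field is an assumption for illustration, not a prescribed mapping):

filter {
  # Nested keys become dotted tokens, e.g. %{asset.id}, %{asset.name},
  # %{asset.display_ipv4_address}.
  json {
    source => "message"
    on_error => "not_json"
  }

  # Illustrative mapping of a nested value into a UDM field.
  mutate {
    replace => {
      "event.idm.read_only_udm.principal.asset.hostname" => "%{asset.name}"
    }
  }

  mutate {
    merge => {
      "@output" => "event"
    }
  }
}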

If it's a JSON Array, i.e., multiple JSON logs in one object, you need to split them in advance of Chronicle SIEM ingestion.
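For example, if a pre-processing step is possible, `jq` can split a top-level array into one compact object per line before forwarding (an illustrative shell command, not Chronicle parser syntax):

# Emit each element of the top-level array as a single-line JSON object.
jq -c '.[]' array.json > split.ndjson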

Former Community Member

Hello,

Is it possible to parse the below-mentioned log? The logs are getting broken during ingestion and are being ingested line by line.

{
"timestamp": "2022-12-23T12:34:56Z",
"level": "info",
"message": "User logged in",
"user_id": "abcdefghij",
"user_agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.88 Safari/537.36"
}

We require newline-delimited JSON, i.e., the JSON object needs to all be on one line. You'd require the source device to send in this format, or an intermediary pipeline to create a compact JSON object.
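For example, a pretty-printed object can be collapsed onto one line with `jq` as part of such an intermediary step (illustrative command and file names):

# Collapse a pretty-printed JSON object onto a single line (newline-delimited JSON).
jq -c . pretty.json > compact.ndjson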

Former Community Member

Can you please explain with an example?

The log needs to be on a single line, e.g.,

{"k1":"v1","k2":"v2"}

versus being split over multiple lines, e.g.,

{
  "k1":"v1",
  "k2":"v2"
}

Usually the originating source can output newline-delimited JSON, but if not then it would need something ahead of ingestion to Chronicle to format the log, e.g., `jq -c` to create compact JSON.

Hi Martin,

Can we parse a log in the below-mentioned format without making any changes while ingesting?

[
  {
    "header": {
      "name": "EcoScope Data",
      "well": "35/12-6S",
      "field": "Fram",
      "date": "2022-06-14",
      "operator": "GeoSoft",
      "startIndex": 2907.79,
      "endIndex": 2907.84,
      "step": 0.01
    },
    "curves": [
      {
        "name": "MD",
        "description": "Measured depth",
        "quantity": "length",
        "unit": "m",
        "valueType": "float",
        "dimensions": 1
      },
      {
        "name": "A40H",
        "description": "Attenuation resistivity 40 inch",
        "quantity": "electrical resistivity",
        "unit": "ohm.m",
        "valueType": "float",
        "dimensions": 1
      }
    ],
    "data": [
      [2907.79, 29.955],
      [2907.80, 28.892],
      [2907.81, 27.868],
      [2907.82, 31.451],
      [2907.83, 28.080],
      [2907.84, 27.733]
    ]
  }
]