Hello,
I recently integrated an S3 bucket for one of our application logs. The logs are in JSON format, and the files in the S3 bucket are named in the format FileName.json.gz.
SecOps is successfully pulling the logs, but I'm encountering an issue with the log format. The logs are breaking improperly (after each line, comma, or curly brace { }), resulting in an incorrect format when ingested.
I've double-checked the basics but haven't been able to pinpoint the root cause of this issue.
Has anyone faced a similar problem? If so, how did you resolve it? Any guidance or troubleshooting steps would be greatly appreciated.
Thanks in advance for your help!
Chronicle SIEM requires newline delimited JSON at present, e.g., each JSON log is on a single line and ends with a newline character.
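To illustrate the requirement, a pretty-printed JSON log needs to be re-serialized onto a single line before ingestion. A minimal sketch in Python (the sample log record is invented for illustration):

```python
import json

# A pretty-printed JSON log as it might appear in the source file
pretty = """{
    "timestamp": "2024-01-01T00:00:00Z",
    "level": "INFO",
    "message": "user logged in"
}"""

# Parse, then re-serialize compactly: one object per line,
# terminated with a newline character, as Chronicle expects
record = json.loads(pretty)
ndjson_line = json.dumps(record, separators=(",", ":")) + "\n"
print(ndjson_line)
```

The same round-trip (parse, then dump compactly) works for an array of objects by emitting one `json.dumps` line per element.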
Are your source logs pretty-printed (spread across multiple lines) rather than newline delimited? If so, that could be the cause. As far as I know there is no way to collect these with the native Feed Management; it requires either 1) a custom ingestion solution, or 2) updating the source logging to output single-line JSON.
Hello @cmmartin_google
Yes, the logs are in formatted (pretty-printed) JSON, and unfortunately there is no way to change the source logging to output single-line JSON.
To create a custom ingestion, I can write a Python script. But can you please help me with what needs to be imported to support the Boto3 library in my IDE?
Example Python scripts for Chronicle SIEM are available in the Google SecOps GitHub repo: https://github.com/chronicle/api-samples-python Specifically, you'd want to send the flattened JSON via the https://github.com/chronicle/api-samples-python/blob/master/ingestion/create_unstructured_log_entrie... endpoint (ideally as a batch, up to 1MB per request).
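To stay under the 1MB-per-request limit mentioned above, the flattened log lines can be grouped into batches before sending. A rough sketch of one way to do that (the function name and threshold handling are illustrative, not part of the Chronicle samples):

```python
def batch_entries(entries, max_bytes=1_000_000):
    """Group log entries (strings) into batches whose combined
    UTF-8 encoded size stays under max_bytes, so each batch can
    be sent as a single API request."""
    batches, current, size = [], [], 0
    for entry in entries:
        entry_size = len(entry.encode("utf-8"))
        # Start a new batch when adding this entry would exceed the cap
        if current and size + entry_size > max_bytes:
            batches.append(current)
            current, size = [], 0
        current.append(entry)
        size += entry_size
    if current:
        batches.append(current)
    return batches

# Example: three ~400KB entries fit into two requests, not three
entries = ["x" * 400_000 for _ in range(3)]
batches = batch_entries(entries)
print([len(b) for b in batches])  # [2, 1]
```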
An alternative that may work at low volume is to mount the S3 storage into a VM and use our Chronicle SIEM Collector (BindPlane) with the FileLog receiver, which supports multi-line logs: https://observiq.com/docs/resources/sources/filelog
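As a rough illustration of the FileLog approach, a receiver configured for multi-line JSON might look like the following. The mount path and the regex are assumptions for this example; check the linked docs for the exact schema and options:

```yaml
receivers:
  filelog:
    # Path where the S3 bucket is mounted inside the VM (assumed)
    include:
      - /mnt/s3-logs/*.json
    multiline:
      # Treat a line beginning with "{" as the start of a new log record,
      # so a pretty-printed object is collected as one entry
      line_start_pattern: '^\{'
```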
Hello @cmmartin_google
Thanks for sharing the repo, but I am unable to find anything related to S3 buckets.
For reading from an AWS S3 bucket, I would look at something like: https://docs.aws.amazon.com/code-library/latest/ug/python_3_s3_code_examples.html
You could adapt an example from https://github.com/chronicle/ingestion-scripts to work with S3.
Hello @cmmartin_google
Thanks for sharing, but I am not able to "import boto3" in my IDE. I'm getting this error: ModuleNotFoundError: No module named 'boto3'
I am also not able to find anything for AWS or S3 in https://github.com/chronicle/ingestion-scripts
I tried installing the AWS S3 integration from the Marketplace to explore the code and see how Boto3 is used. Surprisingly, I hit the same issue: when I try running the code, I get an error indicating that Boto3 is not recognized.
Apologies. If I understand correctly, the question is about setting up the AWS SDK (Boto3) in your IDE, which is beyond what I can offer support for.
That said, there is a Boto3 plugin for VSCode which may be a starting point - https://marketplace.visualstudio.com/items?itemName=Boto3typed.boto3-ide
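The usual fix for a ModuleNotFoundError is to install Boto3 into the same Python environment the IDE runs, for example:

```shell
# Install the AWS SDK for Python into the active environment;
# pip pulls in its dependencies (botocore, jmespath, s3transfer) automatically
pip install boto3

# Verify the install from the same interpreter the IDE is configured to use
python -c "import boto3; print(boto3.__version__)"
```

If the IDE points at a different interpreter or virtual environment than the one you installed into, the import will still fail, so check the interpreter setting first.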
Hello @cmmartin_google
Thank you.
I found the solution now. We need to add the "boto3" library to the integration, along with its dependency library "jmespath".