Hello again,
I've been trying all day to customize the log level of the Python logging library in my Beam data pipeline. I've been sending a DEBUG parameter through the UI and, depending on its value, setting something like:
The Apache Beam SDK for Python uses the Python logging module, and indeed Google Cloud Dataflow does have some control over the logging level.
However, your approach of setting the log level dynamically based on a parameter should generally work. Here are a few things you might want to check or consider:
Configuration in the Dataflow UI: In the Google Cloud Dataflow UI, you can specify default worker log levels. Ensure that these are not set to a level that would override your settings (for example, that they are not set to WARNING or ERROR, which would suppress INFO and DEBUG logs).
Logger Initialization: Make sure that you are setting the log level on the correct logger instance. The logging module in Python organizes loggers in a hierarchy. A child logger with no level of its own inherits the effective level of its nearest ancestor, and handlers attached higher up can apply their own level filter, so messages from a child logger may not be displayed even when you expect them. For testing purposes, try setting the log level on the root logger with logging.getLogger().setLevel(...) and see if that changes the behavior.
DEBUG Parameter: Ensure that the DEBUG parameter is being correctly passed and read in your pipeline code.
Logging Behavior: Note that setting the logging level to DEBUG will include all logs at the DEBUG level and above (i.e., INFO, WARNING, ERROR, CRITICAL). Setting it to INFO will include the INFO level and above. Make sure that your log messages are being logged at the appropriate level.
Dataflow Runner: When running pipelines using the DataflowRunner, the worker logs are sent to Cloud Logging. You can view these logs in the Google Cloud Console.
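To make the points about the logger hierarchy and level thresholds concrete, here is a minimal sketch using only the standard logging module. The logger name my_pipeline.transforms and the ListHandler class are made up for illustration; nothing here is specific to Beam or Dataflow.

```python
import logging


class ListHandler(logging.Handler):
    """Collects messages in memory so the effect of each level is visible."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(
            f"{record.levelname}:{record.name}:{record.getMessage()}"
        )


root = logging.getLogger()
child = logging.getLogger("my_pipeline.transforms")  # hypothetical logger name

handler = ListHandler()
root.addHandler(handler)

# The child has no level of its own, so it inherits the root's threshold.
root.setLevel(logging.WARNING)
child.debug("dropped: below the inherited WARNING threshold")
child.warning("kept: WARNING passes")

# Lowering the root level lets DEBUG and everything above it through.
root.setLevel(logging.DEBUG)
child.debug("kept: DEBUG now passes")

effective_level = logging.getLevelName(child.getEffectiveLevel())
```

After this runs, handler.messages holds only the WARNING message and the second DEBUG message, which is the behavior you would hope to see once the root level is lowered.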
You can set the `default_sdk_harness_log_level` flag to DEBUG when launching Dataflow jobs.
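One way this could be wired to a debug parameter is sketched below. It assumes a hypothetical `--debug` command-line flag (your parameter may be named or passed differently) and only builds the argument list; it does not import Beam or launch anything. As far as I understand, setting the root logger level here affects only the process that launches the job, while the appended option is what reaches the workers.

```python
import argparse
import logging


def split_args(argv):
    """Separate the hypothetical --debug flag from the other pipeline arguments."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--debug", action="store_true")
    known, pipeline_args = parser.parse_known_args(argv)
    return known, pipeline_args


def configure_logging(argv):
    """Pick a level from --debug and forward it as a Dataflow worker option."""
    known, pipeline_args = split_args(argv)
    level_name = "DEBUG" if known.debug else "INFO"

    # Applies to the launching process only.
    logging.getLogger().setLevel(getattr(logging, level_name))

    # Forwarded to Dataflow so the SDK harness on the workers uses the same level.
    pipeline_args.append(f"--default_sdk_harness_log_level={level_name}")
    return level_name, pipeline_args
```

The returned list would then be handed to `PipelineOptions(pipeline_args)` when constructing the pipeline.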
I also suggest you take a deep dive into how logging works in Python. This article is really useful: https://docs.python.org/3/howto/logging.html