Hello,
I am currently tasked with analyzing the historical resource consumption of our Dataproc ephemeral clusters on Google Cloud Platform. The goal is to make informed decisions about configuring an on-premises cluster equivalent to our cloud setup.
Because our Dataproc clusters are ephemeral and are deleted when their jobs complete, I need a way to retrospectively analyze resource metrics such as CPU usage, memory consumption, machine count, and core utilization over the lifespan of each cluster.
What is the simplest way to achieve this?