
Count the number of objects in a bucket/folder - custom metric

Hello,

I am trying to create a custom metric to "count the number of objects present in a bucket/folder/sub-folder".

Once I create the metric, I want to export the results to a BigQuery table via a sink, with the fields (project ID, bucket name, folder name, number of objects, and timestamp).

Could anyone tell me how to set this up?

For example: my project is X, the bucket name is Y, and the sub-folder is Z. The bucket may have many sub-folders, but I only need the count from one of them.

1 ACCEPTED SOLUTION

GCS object count metrics can be grouped by bucket, but not by object prefix (e.g. "folder/folder"). I wonder if you might have a better time using Storage Insights inventory reports (https://cloud.google.com/storage/docs/insights/inventory-reports) or BigQuery object tables (https://cloud.google.com/bigquery/docs/object-table-introduction) to get all object metadata into BigQuery, and then create your metrics as queries over that data in BigQuery.
 
Note that Inventory Reports are shipped as GCS objects. You can use them in BigQuery either by defining external tables in BigQuery or just running a regular BigQuery load job.
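Once the inventory report CSVs are in BigQuery, the per-prefix count becomes a simple query over the object names. A minimal sketch, where the dataset/table name (mydataset.gcs_inventory), the report bucket, and the prefix Z/ are all assumptions for illustration:

```shell
# Hypothetical one-time load of the inventory report CSVs into BigQuery:
# bq load --autodetect --source_format=CSV mydataset.gcs_inventory 'gs://my-report-bucket/*.csv'

# Compose a count-by-prefix query; STARTS_WITH filters on the inventory's
# object name column to restrict the count to one sub-folder.
PREFIX="Z/"
QUERY="SELECT COUNT(*) AS object_count FROM mydataset.gcs_inventory WHERE STARTS_WITH(name, '${PREFIX}')"
echo "${QUERY}"
# Run it with: bq query --use_legacy_sql=false "${QUERY}"
```

Scheduling that query (e.g. as a BigQuery scheduled query) then gives you the recurring count without a custom Cloud Monitoring metric.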
 




Thank you, KirTitievsky

I configured the daily inventory report and got the CSVs into another bucket, but we have a few pipelines that run hourly, and I wanted to run this inventory report hourly as well.


1. While configuring, there are only two options: a) Daily, b) Weekly (no configuration for the time of day), but I want to run the report on my own schedule (i.e. set the start time manually and run the report hourly).

2. Currently my daily schedule runs at 5:37 UTC (the time is assigned automatically, with no way to set it manually), but the report is only available after 6:30 UTC, so there is about a one-hour delay between the job run and report availability.

Could you please tell me if there is any way to custom-schedule the reports, choosing times and dates at my own convenience?

Regards,
Sandeep

Hi @sandeepguptha9,

Thanks for hopping in, @KirTitievsky. You can follow @KirTitievsky's suggestion, or you can also try this workaround:

1. List the total object count in a bucket/folder:

 

gsutil ls gs://bucket/foldername/** | wc -l

 

2. Append the stdout, along with the other fields you need, to a CSV file using awk or sed.
3. After that, load the CSV file into BigQuery.
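Putting the three steps together, a minimal sketch (the project X, bucket Y, folder Z, and the BigQuery dataset/table names are assumptions; a printf stands in for the gsutil listing so the pipeline can be illustrated end to end):

```shell
# Step 1: count the objects under the prefix. In practice this would be:
#   COUNT=$(gsutil ls gs://Y/Z/** | wc -l | tr -d ' ')
# Here a printf of three hypothetical object names stands in for the listing.
COUNT=$(printf 'gs://Y/Z/a.txt\ngs://Y/Z/b.txt\ngs://Y/Z/c.txt\n' | wc -l | tr -d ' ')

# Step 2: append one row (project, bucket, folder, count, timestamp) to a CSV.
echo "X,Y,Z,${COUNT},$(date -u +%Y-%m-%dT%H:%M:%SZ)" >> object_counts.csv
cat object_counts.csv

# Step 3: load the CSV into BigQuery (dataset/table names are hypothetical):
# bq load --source_format=CSV mydataset.object_counts object_counts.csv \
#   project:STRING,bucket:STRING,folder:STRING,object_count:INTEGER,ts:TIMESTAMP
```

Run on a schedule (e.g. from cron or Cloud Scheduler), this gives you the timestamped per-folder counts the original question asked for.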

Hope this also helps!