
Mysterious destination tables and schemas with weird behavior

I'm doing some cost attribution work and I've noticed some records in my `JOBS_BY_PROJECT` view that are exhibiting odd behavior. Can anyone explain what these jobs are?

These records have a `destination_table.dataset_id` that looks like '_123var456char' (an underscore followed by a 40-character string). The corresponding `destination_table.table_id` starts with 'anon' followed by a 64-character string.

Some of the odd behavior includes:
- Not being able to query the view with the following `where` clause:

select * 
from `region-us`.INFORMATION_SCHEMA.JOBS_BY_PROJECT
where destination_table.dataset_id like '_%'

Instead, I have to do some arbitrary replacing first:

select * 
from `region-us`.INFORMATION_SCHEMA.JOBS_BY_PROJECT
where replace(destination_table.dataset_id, '_', 'WEIRD') like 'WEIRD%'


- When I search for these tables/schemas in the UI, they do not exist
- When I search the BigQuery project job history for the job ID of the job that created these destination tables/schemas, it also does not exist.

1 REPLY

The behavior you're describing sounds a lot like temporary or intermediate tables created by BigQuery for various operations.

In BigQuery, every query job's results have to be written somewhere. If you don't specify a destination table, BigQuery writes the results to a temporary (cached results) table that it manages for you inside a hidden "anonymous" dataset. Those anonymous datasets have IDs that begin with an underscore ("_"), and the result tables inside them get names that begin with 'anon', which matches the pattern you're describing. The same kind of table is also used to hold intermediate results for scripts and multi-statement queries.
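A minimal illustration (the query itself is trivial; the point is what BigQuery records for it, assuming these are indeed temporary result tables):

-- Any ordinary query run without an explicit destination table, e.g.:
select current_date() as today;
-- has its result written to a temporary table that BigQuery manages itself.
-- In JOBS_BY_PROJECT, that job will show a destination_table.dataset_id
-- beginning with '_' and a table_id beginning with 'anon', even though you
-- never created either object.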

As for the trouble with the LIKE filter, that comes down to how LIKE patterns work rather than anything special about these tables: in a LIKE pattern, an underscore is itself a wildcard that matches any single character, so `like '_%'` means "any value with at least one character" rather than "starts with an underscore". Your REPLACE workaround succeeds because it turns the leading underscore into ordinary text before the pattern is applied. The more direct options are to escape the underscore in the pattern, or to skip LIKE and use STARTS_WITH.
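For instance, either of these sketches should isolate the underscore-prefixed dataset IDs (same view and region qualifier as your original query; in a regular string literal the escape is written with two backslashes):

select *
from `region-us`.INFORMATION_SCHEMA.JOBS_BY_PROJECT
where destination_table.dataset_id like '\\_%'   -- '\\_' matches a literal underscore

-- or, without a pattern at all:
select *
from `region-us`.INFORMATION_SCHEMA.JOBS_BY_PROJECT
where starts_with(destination_table.dataset_id, '_')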

Regarding not being able to find the tables or datasets in the UI: anonymous datasets are hidden from the Explorer panel in the console, and the result tables inside them expire on their own (typically within about 24 hours), so there is normally nothing left to browse. As for the job history, the console panels only surface a limited window of recent jobs, and personal history only shows jobs you ran yourself, so a job that still appears in INFORMATION_SCHEMA (which covers the past 180 days) may simply not show up there.

As far as cost attribution is concerned, you probably don't need to treat these as a separate line item. The compute cost shows up as the `total_bytes_billed` of the query job that produced the result, so it is already attributed to that job rather than to the anonymous table itself. If you're still seeing charges you can't account for, it might be worth reaching out to Google Cloud Support.
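And if it helps with the attribution work itself, something along these lines (a sketch only; the 30-day window and the per-user grouping are arbitrary choices to adapt) splits billed bytes between jobs that wrote to named tables and jobs that only produced anonymous result tables:

select
  user_email,
  starts_with(destination_table.dataset_id, '_') as wrote_to_anonymous_table,
  count(*) as jobs,
  sum(total_bytes_billed) as total_bytes_billed
from `region-us`.INFORMATION_SCHEMA.JOBS_BY_PROJECT
where job_type = 'QUERY'
  and creation_time > timestamp_sub(current_timestamp(), interval 30 day)
group by user_email, wrote_to_anonymous_table
order by total_bytes_billed desc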