
BigQuery datasets broken

Hello community,

I am moving the BQ datasets in all projects across my organization from the single-region europe-west3 into the multi-region "EU". This has been working fine for all projects except one. 

My usual procedure was to create a new replica per dataset, promote it to primary, and delete the old replica.
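For reference, the whole per-dataset procedure looks roughly like this. A sketch with placeholder project and dataset names, using the google-cloud-bigquery Python client; the DDL follows the documented cross-region dataset replication syntax, so double-check it against the current docs before running:

```python
# Sketch of the per-dataset migration: add an EU replica, promote it to
# primary, then drop the old europe-west3 replica. Project and dataset
# names are placeholders.
from google.cloud import bigquery

client = bigquery.Client(project="my-project")
dataset = "my_dataset"

# 1. Create a replica of the dataset in the EU multi-region.
client.query(
    f"ALTER SCHEMA `{dataset}` ADD REPLICA `eu` OPTIONS(location = 'eu');"
).result()

# 2. Once the replica has caught up, promote it to primary.
client.query(
    f"ALTER SCHEMA `{dataset}` SET OPTIONS(primary_replica = 'eu');"
).result()

# 3. Finally, drop the old single-region replica.
client.query(
    f"ALTER SCHEMA `{dataset}` DROP REPLICA `europe-west3`;"
).result()
```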

When I try this with datasets within this project, the replica creation process runs forever (I left it running for 5 days straight, with no result).

To work around this, I tried to clone a dataset (which works fine) into a swap dataset, delete the old one, and clone the swap dataset into a new one with the same name as the original. However, when I delete the dataset, it doesn't disappear from the dataset list; only its tables are gone. When I click on it, it says "Not found: Dataset [...]". But when I try to create a dataset with the original name, I get "Already Exists: Dataset [...]".
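For clarity, here is the swap procedure as a sketch (placeholder names; the cloning is shown as table-level CREATE TABLE ... CLONE statements, which assumes plain tables in a single location; views and cross-region moves need extra handling):

```python
# Sketch of the clone-swap workaround: clone every table into a swap
# dataset, drop the original dataset, then recreate it.
from google.cloud import bigquery

client = bigquery.Client(project="my-project")
src, swap = "my_dataset", "my_dataset_swap"

client.create_dataset(bigquery.Dataset(f"{client.project}.{swap}"), exists_ok=True)

# Clone every table into the swap dataset (zero-copy clones).
for table in client.list_tables(src):
    client.query(
        f"CREATE TABLE `{swap}.{table.table_id}` CLONE `{src}.{table.table_id}`;"
    ).result()

# Delete the original dataset together with its tables ...
client.delete_dataset(src, delete_contents=True)

# ... and this is the step that breaks: the old entry lingers in the
# dataset list as "Not found", while recreating the name fails with
# "Already Exists: Dataset [...]".
client.create_dataset(bigquery.Dataset(f"{client.project}.{src}"))
```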

So now I'm stuck. I cannot delete these datasets properly and I cannot create replicas for them. They seem to be broken somehow.

What can I do to fix them?

Thank you!

Cheers

Solved
1 ACCEPTED SOLUTION

The original issue was caused by the org policy "iam.allowedPolicyMemberDomains". Since the GCP project had been migrated into the organization, some users still had access to the dataset but didn't belong to the organization or its allowed domains. Removing those users' access from the datasets, or adding them to the organization (in Cloud Identity), solved the issue, and the replication processes then finished successfully.
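If you need to hunt down the offending entries, something like this sketch works (placeholder project, dataset, and domain):

```python
# Sketch: find and remove dataset access entries whose members fall
# outside the domains allowed by iam.allowedPolicyMemberDomains.
from google.cloud import bigquery

client = bigquery.Client(project="my-project")
dataset = client.get_dataset("my-project.my_dataset")

ALLOWED_DOMAINS = ("example.com",)  # your organization's allowed domains

def is_allowed(entry: bigquery.AccessEntry) -> bool:
    # Only user/group email bindings are affected; keep everything else
    # (specialGroup, view, routine, ...). Note that service accounts
    # also appear as userByEmail entries.
    if entry.entity_type not in ("userByEmail", "groupByEmail"):
        return True
    return entry.entity_id.endswith(tuple("@" + d for d in ALLOWED_DOMAINS))

offending = [e for e in dataset.access_entries if not is_allowed(e)]
print("Removing:", [(e.entity_type, e.entity_id) for e in offending])

dataset.access_entries = [e for e in dataset.access_entries if is_allowed(e)]
client.update_dataset(dataset, ["access_entries"])
```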

A special case is caused by search-console-data-export@system.gserviceaccount.com. You can find solution approaches for that one here: https://www.googlecloudcommunity.com/gc/Data-Analytics/Can-t-import-search-console-data-in-Big-Query...


9 REPLIES

Hi @Karsten189,

Welcome to Google Cloud Community!

I understand your frustration that this is not working in your current project, especially since it functions correctly in your other projects.

You can try the following steps to delete a dataset through the Google Cloud Console (a programmatic alternative is sketched right after the list):

  1. Navigate to BigQuery: Go to the Google Cloud Console and select BigQuery from the navigation menu.
  2. Select Your Project: Make sure you have the correct project selected in the project dropdown.
  3. Locate the Dataset: In the Explorer panel on the left side, find the dataset you want to delete.
  4. Open Dataset Menu: Click on the three dots ("...") next to the dataset name.
  5. Choose "Delete Dataset": Select "Delete dataset" from the menu.
  6. Confirm Deletion: A dialog box will appear. Important: If the dataset has tables, you'll be prompted to confirm that you want to delete the dataset and all its tables. Type the dataset ID (as prompted) to confirm the deletion, then click "Delete."
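If the Console itself is misbehaving, you can also attempt the same deletion programmatically, for example with the Python client (a sketch; the dataset ID is a placeholder):

```python
# Sketch: delete a dataset and all of its tables via the Python client,
# bypassing the Console UI.
from google.cloud import bigquery

client = bigquery.Client()
client.delete_dataset(
    "my-project.my_dataset",
    delete_contents=True,  # also drop all tables in the dataset
    not_found_ok=True,     # don't raise if it is already gone
)
```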

You can also find other ways to delete datasets in this documentation.

Also, regarding the dataset replication issue, check this documentation: replication is subject to some limitations that might be the cause of the problem.

If the issue continues, please contact Google Cloud Support. When reaching out, provide as much detail as possible and include any relevant screenshots of the errors you've encountered. This information will help them diagnose and resolve your problem more effectively.

Was this helpful? If so, please accept this answer as “Solution”. If you need additional assistance, reply here within 2 business days and I’ll be happy to help.

+1, faced with the same issue

I faced the same issue, with similar behavior, when using an Omni external connection to AWS S3.

Hi all,

@paulynann thank you for your reply. Interestingly, when I came back from vacation today, the dataset had disappeared and I could successfully copy back from the swap dataset into a new one with the original name. However, when I tried to apply this procedure to all the other datasets, the issue appeared again. In detail, the behaviour is slightly different now: I can delete the dataset successfully, but when I re-create it and click on it, I get "Dataset not found: [...]". So the result stays the same.

Also, when this happens (and also when I run into the dataset replication error) the BQ Studio UI becomes slower and slower until the browser tab eventually crashes. I tested this with different browsers and I let a colleague test this, too.

These issues look to me like several severe BigQuery bugs. It seems that BigQuery has trouble re-creating a dataset with the same name as one that was recently deleted and is still within the time-travel window. Also, when these issues appear, BQ Studio seems to start infinite loops that eventually overload the CPU.
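The suspected sequence is easy to write down as a minimal repro (a sketch with placeholder names):

```python
# Sketch of the suspected failure sequence: delete a dataset, then
# immediately recreate it under the same ID while the deleted one is
# still inside the time-travel window.
from google.cloud import bigquery
from google.api_core.exceptions import NotFound

client = bigquery.Client(project="my-project")
ds_id = "my-project.my_dataset"

client.delete_dataset(ds_id, delete_contents=True)
client.create_dataset(bigquery.Dataset(ds_id))  # reports success

try:
    client.get_dataset(ds_id)  # expected to succeed ...
except NotFound as exc:
    # ... but while the bug was active, this is where
    # "Not found: Dataset [...]" came back.
    print(exc)
```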

By the way, none of the limitations mentioned in the documentation apply to my case. @Qorh it could be that you are affected by the limitation on Omni locations.

+1 having the same issue. I was trying to create a dataset with default settings, found it had been created in the wrong region, deleted and recreated it in the correct region, and then it broke.

Please fix it.

Same here! Exact same behavior: I deleted the dataset in the US region and recreated it in the EU -> it's listed, but when I open it, it says the dataset doesn't exist (and indeed the UI becomes very slow).

+1 Same here. I changed the dataset location from Berlin (europe-west10) to Belgium (europe-west1), keeping the same name, and I face the same issue.

The issue with the workaround ("Dataset not found: [...]") should be fixed now. If you were affected, you should be able to see the related incident.

For the original issue (data replication stuck) I created a support ticket, which is currently being handled by Google.

The original issue was caused by the org policy "iam.allowedPolicyMemberDomains". Since the GCP project had been migrated into the organization, some users still had access to the dataset but didn't belong to the organization or its allowed domains. Removing those users' access from the datasets, or adding them to the organization (in Cloud Identity), solved the issue, and the replication processes then finished successfully.

A special case is caused by search-console-data-export@system.gserviceaccount.com. You can find solution approaches for that one here: https://www.googlecloudcommunity.com/gc/Data-Analytics/Can-t-import-search-console-data-in-Big-Query...