duplicates

My dataset is called zip_to_zip_moving and it contains four tables: zillow_corrected, realtor_corrected, trulia_corrected and redfin_corrected. I want to compare every pair of tables and delete the duplicates.

All four tables have the same 5 columns: string_field_0, string_field_1, string_field_2, string_field_3 and string_field_6. Rows count as duplicates when string_field_0, string_field_2 and string_field_3 all match. The table to delete from depends on which pair is being compared:

1. zillow_corrected and realtor_corrected: delete from realtor_corrected
2. realtor_corrected and trulia_corrected: delete from realtor_corrected
3. realtor_corrected and redfin_corrected: delete from realtor_corrected
4. zillow_corrected and trulia_corrected: delete from trulia_corrected
5. zillow_corrected and redfin_corrected: delete from redfin_corrected
6. trulia_corrected and redfin_corrected: delete from trulia_corrected

In addition, as the final result I want only unique addresses, combined from all four tables. The final list should contain all 5 columns: string_field_0, string_field_1, string_field_2, string_field_3 and string_field_6.

Can you give me a BigQuery query that finds and deletes these duplicates and returns the final unique list?
 
 
 
 

Hi @anigordel,

Welcome back to Google Cloud Community.

While looking into your issue, I came across a Stack Overflow topic that may be related to it.

You can find it here:
https://stackoverflow.com/questions/49909067/how-to-remove-duplicate-rows-in-google-bigquery-based-o...
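
For the specific rules in your post, here is a minimal sketch rather than a tested solution. Taken together, your six rules amount to a precedence order: a zillow_corrected row is never deleted, a redfin_corrected row is deleted only if zillow_corrected has it, a trulia_corrected row is deleted if zillow_corrected or redfin_corrected has it, and a realtor_corrected row is deleted if any other table has it. The sketch assumes all five columns are STRING with the same names in every table, and that the key columns contain no NULLs (a NULL never matches with `=`). The output table name `unique_addresses` is hypothetical, so please adjust it. Step 1 only reads the source tables; Step 2 is destructive, so consider copying the tables first.

```sql
-- Step 1: build the final unique list without modifying the source tables.
-- `unique_addresses` is a hypothetical output table name.
-- priority encodes the precedence implied by the six rules:
-- zillow (1) > redfin (2) > trulia (3) > realtor (4).
CREATE OR REPLACE TABLE `zip_to_zip_moving.unique_addresses` AS
WITH combined AS (
  SELECT string_field_0, string_field_1, string_field_2, string_field_3,
         string_field_6, 1 AS priority
  FROM `zip_to_zip_moving.zillow_corrected`
  UNION ALL
  SELECT string_field_0, string_field_1, string_field_2, string_field_3,
         string_field_6, 2 AS priority
  FROM `zip_to_zip_moving.redfin_corrected`
  UNION ALL
  SELECT string_field_0, string_field_1, string_field_2, string_field_3,
         string_field_6, 3 AS priority
  FROM `zip_to_zip_moving.trulia_corrected`
  UNION ALL
  SELECT string_field_0, string_field_1, string_field_2, string_field_3,
         string_field_6, 4 AS priority
  FROM `zip_to_zip_moving.realtor_corrected`
),
ranked AS (
  SELECT *,
         ROW_NUMBER() OVER (
           PARTITION BY string_field_0, string_field_2, string_field_3
           ORDER BY priority
         ) AS rn
  FROM combined
)
-- Keep one row per key, taken from the highest-precedence table that has it.
SELECT string_field_0, string_field_1, string_field_2, string_field_3, string_field_6
FROM ranked
WHERE rn = 1;

-- Step 2 (optional, destructive): one DELETE per rule in the question.
-- Rule 1: zillow vs realtor, delete from realtor.
DELETE FROM `zip_to_zip_moving.realtor_corrected` t
WHERE EXISTS (
  SELECT 1 FROM `zip_to_zip_moving.zillow_corrected` o
  WHERE o.string_field_0 = t.string_field_0
    AND o.string_field_2 = t.string_field_2
    AND o.string_field_3 = t.string_field_3);

-- Rule 2: realtor vs trulia, delete from realtor.
DELETE FROM `zip_to_zip_moving.realtor_corrected` t
WHERE EXISTS (
  SELECT 1 FROM `zip_to_zip_moving.trulia_corrected` o
  WHERE o.string_field_0 = t.string_field_0
    AND o.string_field_2 = t.string_field_2
    AND o.string_field_3 = t.string_field_3);

-- Rule 3: realtor vs redfin, delete from realtor.
DELETE FROM `zip_to_zip_moving.realtor_corrected` t
WHERE EXISTS (
  SELECT 1 FROM `zip_to_zip_moving.redfin_corrected` o
  WHERE o.string_field_0 = t.string_field_0
    AND o.string_field_2 = t.string_field_2
    AND o.string_field_3 = t.string_field_3);

-- Rule 4: zillow vs trulia, delete from trulia.
DELETE FROM `zip_to_zip_moving.trulia_corrected` t
WHERE EXISTS (
  SELECT 1 FROM `zip_to_zip_moving.zillow_corrected` o
  WHERE o.string_field_0 = t.string_field_0
    AND o.string_field_2 = t.string_field_2
    AND o.string_field_3 = t.string_field_3);

-- Rule 5: zillow vs redfin, delete from redfin.
DELETE FROM `zip_to_zip_moving.redfin_corrected` t
WHERE EXISTS (
  SELECT 1 FROM `zip_to_zip_moving.zillow_corrected` o
  WHERE o.string_field_0 = t.string_field_0
    AND o.string_field_2 = t.string_field_2
    AND o.string_field_3 = t.string_field_3);

-- Rule 6: trulia vs redfin, delete from trulia.
DELETE FROM `zip_to_zip_moving.trulia_corrected` t
WHERE EXISTS (
  SELECT 1 FROM `zip_to_zip_moving.redfin_corrected` o
  WHERE o.string_field_0 = t.string_field_0
    AND o.string_field_2 = t.string_field_2
    AND o.string_field_3 = t.string_field_3);
```

Note that Step 1 also collapses rows that repeat within a single table, which goes slightly beyond the six cross-table rules. If you only need the combined list, Step 1 alone should be enough and the source tables stay untouched.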

Thank you