Hi,
I see the following error when querying Iceberg data in S3 from BigQuery. Can you please help? Also, let me know if there is any workaround.
Error while reading data, error message: Only data files are currently supported. Positition and equality delete files for merge on read are not supported. File 's3://XXXXX-XXX-dev-XXXX/iceberg/XXXta/yNXXwl1g/app_id_bucket=8/server_date=20240517/XXXXX_XXXXX_00094_bvgih-5e34be59-98c3fd9345a4.parquet' has content type: 'Position Delete'
Regards,
Suraj.
I tried the OPTIMIZE and VACUUM commands in AWS Athena, but they didn't help. I still see delete files in S3, which is what causes this error. Is there a way I can continue querying this data from BigQuery?
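As the error message says, BigQuery's Iceberg connector doesn't yet support Iceberg delete files (position deletes or equality deletes). These files are part of Iceberg's merge-on-read mechanism: instead of rewriting a data file when rows are deleted, Iceberg records the deleted rows in separate delete files, and readers must apply them to get accurate query results.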
Here are several strategies you can consider:
Filter Out Delete Files in BigQuery: use the _file_metadata pseudo column and filter based on it.
Pre-Process Iceberg Data:
Data Transformation:
Example: Filtering in BigQuery (Conceptual)
SELECT *
FROM your_iceberg_table
WHERE _file_metadata['content_type'] != 'Position Delete'
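
For the pre-processing option, here is a rough sketch of what compacting the delete files away could look like. It assumes you can run Spark SQL against the same table with the Iceberg Spark extensions enabled, which the original post does not state. The catalog name my_catalog, the table name db.events, and the cutoff timestamp are placeholders, not values from this thread. Whether the current snapshot ends up completely free of delete files can depend on your Iceberg version, so treat this as a starting point rather than a confirmed fix.

```sql
-- Assumption: Spark SQL with the Iceberg extensions enabled.
-- my_catalog and db.events are hypothetical names; replace them with your own.

-- Rewrite every data file that has at least one delete file attached,
-- so the deletes are applied and clean data files are written.
CALL my_catalog.system.rewrite_data_files(
  table => 'db.events',
  options => map('delete-file-threshold', '1')
);

-- Expire older snapshots so the obsolete delete files are no longer referenced.
-- The timestamp is a placeholder; pick one that is later than the rewrite.
CALL my_catalog.system.expire_snapshots(
  table => 'db.events',
  older_than => TIMESTAMP '2024-05-18 00:00:00'
);
```

After a rewrite like this, the BigQuery table would presumably need to point at the newest Iceberg metadata file before the change becomes visible there.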
Important Considerations