
Dataplex data quality scan cost optimization

Hi,
I'm wondering whether there is a way to configure a custom query in Dataplex's data quality scans in a cost-optimized way. For example, I want to run a uniqueness check on id; the Auto DQ rule generates the following query:

 

WITH
    `3f00a224-2204-4ae5-bd7b-2ffa70afc926` AS (
        SELECT
            *
        FROM
            `my-project.my_dataset.my_table`
    )

SELECT
    *
FROM
    `3f00a224-2204-4ae5-bd7b-2ffa70afc926`
WHERE
    `id` IN (
        SELECT
            `id`
        FROM
            `3f00a224-2204-4ae5-bd7b-2ffa70afc926`
        GROUP BY
            `id`
        HAVING
            COUNT(`id`) > 1);

 

In my case I would prefer not to run "select *" and instead select only id, which should mean lower costs, since fewer columns are read.

 

SELECT
    id
FROM
    `3f00a224-2204-4ae5-bd7b-2ffa70afc926`
WHERE
    `id` IN (
        SELECT
            `id`
        FROM
            `3f00a224-2204-4ae5-bd7b-2ffa70afc926`
        GROUP BY
            `id`
        HAVING
            COUNT(`id`) > 1);
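
To illustrate what I mean, here is a small local sketch using SQLite through Python rather than BigQuery, so it says nothing about actual billing. It only shows that the id-only variant flags the same duplicated ids as the "select *" shape. The table name, the payload column, and the sample rows are made up for the example.

```python
import sqlite3

# Hypothetical sample data: ids 2 and 3 are duplicated, ids 1 and 4 are unique.
ROWS = [(1, "a"), (2, "b"), (2, "c"), (3, "d"), (3, "e"), (3, "f"), (4, "g")]

# Subquery shared by both variants: ids that occur more than once.
DUPLICATE_IDS = "SELECT id FROM my_table GROUP BY id HAVING COUNT(id) > 1"


def run_checks():
    """Run both query shapes against an in-memory table and return their rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE my_table (id INTEGER, payload TEXT)")
    conn.executemany("INSERT INTO my_table VALUES (?, ?)", ROWS)

    # Shape of the generated query: every column of every failing row.
    full_rows = conn.execute(
        f"SELECT * FROM my_table WHERE id IN ({DUPLICATE_IDS}) ORDER BY id, payload"
    ).fetchall()

    # Narrow variant: only the id column of every failing row.
    id_rows = conn.execute(
        f"SELECT id FROM my_table WHERE id IN ({DUPLICATE_IDS}) ORDER BY id"
    ).fetchall()

    conn.close()
    return full_rows, id_rows


full_rows, id_rows = run_checks()
print(len(full_rows), len(id_rows))
print(sorted({row[0] for row in id_rows}))
```

Both variants return one row per failing record, so the number of failing rows is the same; only the set of columns read differs.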

 

Is this achievable?

