Get hands-on experience with 20+ free Google Cloud products and $300 in free credit for new customers.

One Column Different Data Types

Please assist? I am pulling in data from Google Campaign Manager. I am facing this big issue. My Metric column contains data with different data types. This creates a lot of work and frustration and manual "data convert" task's. How do i solve this problem?

4 REPLIES 4

 

# START_DATE = DATE_ADD(CURRENT_DATE(), INTERVAL -31 DAY)
# END_DATE = DATE_ADD(CURRENT_DATE(), INTERVAL -1 DAY)
  SELECT
    Campaign_ID,
    _DATA_DATE AS Date,
    COUNT(*) AS count,
    COUNT(DISTINCT User_ID) AS du
  FROM
    `dataset.impression_campaign_manager_id`
  WHERE
    _DATA_DATE BETWEEN start_date
    AND end_date
  GROUP BY
    Campaign_ID,
    Date

 

Hi @yrstruly 

When you transfer the data from Google campaign manager there are some pre requistes we need to ensure.

1. Migration steps  must be correct : Here is the blog for help: 

2. After migration of data we need to sample it before pulling it to a excel sheet or any reporting platform . You can use SQL query to sort the data. For example : The following sample query analyzes the number of impressions and distinct users by campaign over the past 30 days. (Impressions and distinct users by campaign) Quoted above.

Thanks!

Hope this helps!!

 

 

 

Thank you for the feedback, but this is not relevant to my topic. Metric data from CM have different data types e.g int and decimals in one column. That is problematic for BI reporting. I am looking for a way to split this one column into two separate columns so that i can assigned to correct data type to each column.

Hi

Can you please consider cast in the bigquery

Example : 

SELECT
COUNTRY,
DATE_UPDATED,
SAFE_CAST(TOTAL_VACCINATIONS AS INT64) AS TOTAL_VACCINATIONS,
SAFE_CAST(PERSONS_VACCINATED_1PLUS_DOSE AS INT64) AS PERSONS_VACCINATED_1PLUS_DOSE,
SAFE_CAST( PERSONS_FULLY_VACCINATED AS INT64) AS PERSONS_FULLY_VACCINATED,
regexp_replace(VACCINES_USED, r','"\n") as VACCINES_USED,
FIRST_VACCINE_DATE,
SAFE_CAST(NUMBER_VACCINES_TYPES_USED AS INT64) AS NUMBER_VACCINES_TYPES_USED ,
SAFE_CAST(PERSONS_BOOSTER_ADD_DOSE AS INT64) AS PERSONS_BOOSTER_ADD_DOSE

FROM
  `tracing-matrix.covid19.WHO_Vaccination data`
WHERE
  DATE(_PARTITIONTIME) = "2022-05-12"
AND
COUNTRY='India'

GROUP BY 1,2,3,4,5,6,7,8,9
ORDER BY 1 DESC

Thank you for the response. The Cast function is a short term solution for a long term problem. I would like to follow strict guide lines in building a good data warehouse for reporting. This means, getting the data into the right format from the beginning. This might lead to big problems down the line, especially when you have lots of data. Have you ever came across "Derived columns" in bigquery?