Get hands-on experience with 20+ free Google Cloud products and $300 in free credit for new customers.

Converting SAS scripts containing macros, data step into BigQuery compatible

Hi Folks,

I have a SAS code which has sas macro comprising of data step and procedural SQLs within it. Below is the sample code,

%macro VoidRate(product_abbre);

data ndcs2_dummy;
set savedata.ndcs2_dummy;
run;

proc sql;
%connect_db(&dbuser_dummy.,libname=yes, dbmstemp=yes);
drop table &dbuser_dummy..ndcs2_dummy;

create table &dbuser_dummy..ndcs2_dummy(distribute=random bulkload=yes dbtype=(chndc='character varying(17)') bl_options="logdir '%sysfunc(pathname(work))'") as
select * from work.ndcs2_dummy;

data savedata.count_all_&product_abbre._dummy;
format chain_name $30. ichainID 3.;
set count_dummy count_ra_dummy count_kroger_dummy;
run;

proc means data=savedata.count_all_&product_abbre._dummy n mean ;
var first_void ;
output out=void_dummy mean=first_void;
format first_void percent8.1;
run;

%mend VoidRate;

 

%macro mcr_encodedups_perm(dwdb=xx,dsnin=dummy_encode,dsnout=xx,debug_flg=N,verbose_flg=N);

%local begin_timestamp;
*** Set beginning timestamp ***;
%let begin_timestamp=%sysfunc(datetime());
%let verbose_flg=%upcase(%substr(&verbose_flg,1,1));
%let debug_flg=%upcase(%substr(&debug_flg,1,1));


%if &verbose_flg=N %then %do; %let debug_flg=N; %end;

%let rc_data=%sysfunc(exist(admin_errout_dups));
%if %eval(&rc_data=0) %then
%do;
proc datasets library=work nolist;
delete admin_errout_dups;
quit;
%end;

%let rc_data=%sysfunc(exist(&dwdb..&dsnin));*User must have dsnin data set;
%if %eval(&rc_data=0) %then
%do;
%put -------------------------------------------------------------------------------;
%put Error - DSNIN DataSet &dsnin does not exist;
%put Processing Aborted ;
%put -------------------------------------------------------------------------------;
%goto exit;
%end;

%let nzlogs='%sysfunc(pathname(work))';


*** Check to be sure 3 key variables are on dsnin data set ***;
%let dsid=%sysfunc(open(&dwdb..&dsnin,i));
%let pat_key=%sysfunc(varnum(&dsid,pat_key));
%let tran_nbr=%sysfunc(varnum(&dsid,tran_nbr));
%let rc=%sysfunc(close(&dsid));
%if %eval(&dsid=0) %then
%do;
%put -------------------------------------------------------------------------------;
%put Error - DSNIN DataSet &dwdb..&dsnin does not exist;
%put Processing Aborted ;
%put -------------------------------------------------------------------------------;
%goto exit;
%end;
%if %eval(&pat_key=0) or %eval(&tran_nbr=0) %then
%do;
%put -------------------------------------------------------------------------------;
%put Error - Key Data Element does NOT exist in &dsnin. data set;
%if %eval(&pat_key=0) %then %put %str(Variable "pat_key" does not exist);
%if %eval(&tran_nbr=0) %then %put %str(Variable "tran_nbr" does not exist);
%put Processing Aborted ;
%put -------------------------------------------------------------------------------;
%goto exit;
%end;

proc sql;
/* %connect_db(pohus2,libname=yes);*/
%connect_db(&dwdb,auth=Adheris_Netezza_Power_Auth);

%if %sysfunc(exist(&dwdb..NDCGCN)) %then %do;
drop table &dwdb..NDCGCN;
%end;
%if %sysfunc(exist(&dwdb..ndc_keys)) %then %do;
drop table &dwdb..ndc_keys;
%end;
%if %sysfunc(exist(&dwdb..ptNdcXrf)) %then %do;
drop table &dwdb..ptNdcXrf;
%end;
%if %sysfunc(exist(&dwdb..NdcXrf)) %then %do;
drop table &dwdb..NdcXrf;
%end;
%if %sysfunc(exist(&dwdb..ptNdcXrf2)) %then %do;
drop table &dwdb..ptNdcXrf;
%end;
%if %sysfunc(exist(&dwdb..&dsnout.00)) %then %do;
proc sql; drop table &dwdb..&dsnout.00;
%end;
%if %sysfunc(exist(&dwdb..&dsnout.000)) %then %do;
proc sql; drop table &dwdb..&dsnout.000;
%end;
%if %sysfunc(exist(&dwdb..&dsnout)) %then %do;
proc sql; drop table &dwdb..&dsnout;
%end;
%if %sysfunc(exist(&dwdb..&dsnout)) %then %do;
proc sql; drop table &dwdb..&trx_sold_check;
%end;

quit;

%if &verbose_flg. = N %then %do;
proc sql;
%connect_db(&dwdb,auth=Adheris_Netezza_Power_Auth);

execute( create temp table &dsnout as
select t2.*
from &dsnin sq1
join asv_dupenc_v t2 using(pat_key,tran_nbr)
distribute on (pat_key)
) by &dwdb;
quit;
%goto exit;
%end;

%mend mcr_encodedups_perm;




I know we can convert SQL statements from SAS to BQ but I want to understand whether it is possible to convert sas macros, data step and all the functionalities in SAS compatible to BigQuery. If yes is there are any documentations, website or any cheat sheet for the same?

Any leads on this are much appreciated. 

0 1 1,758
1 REPLY 1

Yes, converting SAS code to BigQuery is feasible, but achieving success requires an understanding of the inherent differences between the two systems and recognizing that there's no one-size-fits-all guide for conversion. Here are some steps you can take:

SAS Macros to BigQuery: No Direct Translation

Challenge: SAS macros offer powerful code reusability and parameterization that doesn't have a direct counterpart in BigQuery.

Workarounds:

  • User-Defined Functions (UDFs): Encapsulate reusable logic within UDFs (JavaScript or SQL) in BigQuery. For instance, a SAS macro that calculates a complex metric could be efficiently rewritten as a UDF.

  • Stored Procedures: Use BigQuery stored procedures for parameterized, modular SQL, providing a structured approach similar to SAS macros.

  • Templating Engines: Employ tools like Jinja2 with SQL templates compatible with BigQuery to dynamically generate queries, mimicking SAS macro functionality. Jinja2 allows for SQL query templates with placeholders for dynamic substitution, akin to parameterization in SAS macros.

Data Step to BigQuery: SQL as the Bridge

  • The logic within SAS data steps often translates directly into BigQuery SQL, facilitating a straightforward conversion for this aspect of your code.
  • BigQuery's design for set-based operations across distributed data means iterative SAS processes might need reimagining to leverage BigQuery's strengths fully.

SAS Functions to BigQuery: Equivalents and Workarounds

Extensive Overlap: BigQuery's SQL function library covers many SAS functions, such as SUM, AVG, and SUBSTR.

Filling the Gaps: For SAS functions without direct BigQuery equivalents:

  • UDFs: Custom UDFs can replicate specific SAS calculations.

  • Alternative Approaches: Sometimes, combining BigQuery functions can achieve similar outcomes to a single SAS function.

Conversion Approach: 

  1. Dissect Your Code: Break down SAS macros, data steps, and functions to understand their core logic.

  2. Prioritize SQL Conversion: Directly translate SAS SQL and data step code to BigQuery SQL wherever feasible.

  3. Adapt Macro Logic: Utilize UDFs, stored procedures, and templating engines to replicate macro functionalities.

  4. Find Function Substitutes: Identify BigQuery equivalents for SAS functions or create necessary UDFs.

  5. Test, Refine, Repeat: Validate your BigQuery implementations against SAS outputs, optimizing for accuracy and performance.

The path from SAS to BigQuery involves understanding both platforms' capabilities and a customized, iterative approach. Consider a phased strategy and tap into online communities and professional support for complex conversions. There's no "SAS to BigQuery" magic button, but with a strategic approach and the right resources, successful conversion is within reach.