Google Cloud CSV to Google Sheets

TJ
Bronze 2

Hi support,

My goal is to write a custom Apps Script for a Google Sheet that automatically retrieves (i.e., via a trigger) a CSV file I have stored in a Google Cloud Storage bucket. My preference is to run this daily automated task with the CSV file kept private, for security purposes, and, if possible, to use OAuth for the internal retrieval of the CSV data and to parse it into the spreadsheet -- with the CSV file being updated with new data on a recurring basis.

Since I'm relatively new to the Google Cloud space, with basic but limited Apps Script and JSON coding knowledge, I found this article (from 2021) which already details the custom code instructions:

https://h-p.medium.com/google-cloud-csv-to-google-sheets-740243e04f3a

The author of the article mentions at the end that this is a good way to keep the data process internal, and that going through OAuth keeps it out of even a service account (again, better security practice). So I tried implementing the sample code provided in the article for testing purposes.

 

I'm running a test with the code from the article as the base. So far, after following the steps described, whenever I run the test in Apps Script I get to the popup asking me to authenticate the OAuth credentials for the web application. After I approve it, the execution log shows only the following:
 
--------------------
11:17:01 AM Notice Execution started
11:17:02 AM Notice Execution completed
--------------------
 
When I check the logging results in GCP, it doesn't show much further detail -- just that the web application script was called via the Google API. None of the other 'logging' statements from the code script appear in the logs.
 
As a result, I don't know where to troubleshoot further. It's unclear to me whether the OAuth credentials are failing -- although other users in forums have said that a broken OAuth flow would produce an error, and my test run doesn't appear to produce one. At least, the word 'error' appears nowhere in the Apps Script execution log or in the GCP Cloud Logging log. I can see that the linked Google Sheet is actually clearing/resizing the range I specified in the script, but no data from the CSV in my bucket is being written into the sheet. I've checked and re-checked all my IAM permissions, and even talked with a Google Support agent via chat; he said all the permissions looked good from what he could tell, and recommended I post here in the Community forum for feedback.
 
I've even added permissions for the OAuth email address (and service account) in my Google Workspace admin console. I've gone through the code line by line, and studied the resources linked from the article. I've also tried modifying the code extensively, based on the GitHub instructions and on reports from other users running similar tests, but the same issue has persisted.
 
I've tried adding all the appropriate scopes to the 'appsscript.json' manifest of the sheet's Apps Script project -- same issue, same execution log. And the initial popup where I first authorized the web application no longer appears (unless I make significant changes to the code, in which case I occasionally get the OAuth prompt again -- but approving it produces the same execution log, with no data in the spreadsheet).
 
So, it's unclear to me: is there an OAuth issue here? And if there is, why is there no OAuth error message in either the execution log or the Cloud Logging log? Or is there an issue with identifying or finding the CSV file in the bucket? I've triple-checked all my credentials and links, including the bucket name, file name (i.e., the bucket object), script ID, and the right URI added to OAuth. I've made sure the Google Cloud project ID is added to the Apps Script, the OAuth2 library is added, the Google Drive and Google Storage APIs are added, the APIs are enabled in GCP, etc.
 
I wonder if any API changes by Google since the article was written in 2021 may have discontinued this internal retrieval method. I would prefer to resolve the issue so the data parses and displays properly in the linked Google Sheet, but if I can't reproduce the test on my end, I may have to look into an alternate route.
 
If you have any feedback, please advise, and I would certainly appreciate any insights or suggestions.
 
Thanks,
TJ
 

 

Solved

29 REPLIES

Stephane_fr
Google Developer Expert

Hi

 

I checked the article, but I think there is an easier solution.

What I would recommend is to check this article instead: https://www.labnol.org/code/20074-upload-files-to-google-cloud-storage

 

One of the prerequisites is that the account that runs the script has Cloud Storage rights on GCP.

The article demonstrates how to upload a file, but in your case you will have to do a GET request, not a POST -- and don't forget to include the bearer token from Apps Script in the header.

 

In the appsscript.json file, also add the scope for Cloud Storage: https://www.googleapis.com/auth/devstorage.read_write
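A manifest along these lines should cover it -- this is only a sketch, and the time zone and exact scope list are my assumptions (trim scopes to what your script actually calls):

```json
{
  "timeZone": "Etc/UTC",
  "exceptionLogging": "STACKDRIVER",
  "oauthScopes": [
    "https://www.googleapis.com/auth/spreadsheets",
    "https://www.googleapis.com/auth/script.external_request",
    "https://www.googleapis.com/auth/devstorage.read_write"
  ]
}
```

(In the Apps Script editor, the manifest is shown via Project Settings > "Show appsscript.json manifest file in editor.")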

 

Stéphane

Hi Stéphane (@Stephane_fr),

Thank you for your reply.

I tried implementing the code from the article you linked. However, after I keyed in the appropriate params at the top of the code and changed the method to GET (instead of POST), it resulted in an error.

Please advise: what are you referring to when you say "don't forget to include in the header the bearer from apps script"? Is that what you explained further regarding the appsscript.json file? If so, I've already tested adding the right scopes to the script, so I wouldn't think that's the source of the issue in this code example. Or were you referring to something else for me to include in the 'headers' section of the code -- that is, this part? Should I try making a change to this:

 

    headers: {
      Authorization: 'Bearer ' + service.getAccessToken(),
    },

 

I also tried modifying the URL, since it appears the URL string is intended for an 'upload' (i.e., writing from Google Drive to a Google Cloud Storage bucket), while I'm trying to go the opposite direction (i.e., from the bucket into Google Drive). In other words, this line:

 

var url = 'https://www.googleapis.com/upload/storage/v1/b/BUCKET/o?uploadType=media&name=FILE'

became this:

var url = 'https://www.googleapis.com/download/storage/v1/b/BUCKET/o?downloadType=media&name=FILE'

However, that also didn't work. I tried to find a Google help page describing the proper URL structure for a download, but didn't find one in my initial research. Would you have any reference link available?
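For reference, the Cloud Storage JSON API exposes downloads not as a `/download/...` path but as a GET on the object resource with `alt=media`. A small helper can build that URL -- the bucket and object names below are placeholders, and note that the object name must be fully percent-encoded (including any `/` in its path):

```javascript
// Sketch: build the JSON API media-download URL for a private GCS object.
// The request must still carry an OAuth bearer token in its headers.
function buildGcsDownloadUrl(bucket, object) {
  // encodeURIComponent also escapes "/" so nested object paths work.
  return 'https://storage.googleapis.com/storage/v1/b/' +
      encodeURIComponent(bucket) + '/o/' +
      encodeURIComponent(object) + '?alt=media';
}
```

For example, `buildGcsDownloadUrl('my-bucket', 'daily/data.csv')` yields a URL with the object path encoded as `daily%2Fdata.csv`.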

Instead, while testing different URL structures in the above two examples, one test actually overwrote the data in my bucket (and also changed its metadata from text/csv to application/pdf -- and I have no idea where that came from in the example code, lol).

Also, would it help if I changed the function name from 'upload...' to 'download...'?

function uploadFileToGCS()

Or if you may have any other suggestions or feedback, I'd certainly appreciate it.

Thanks,

TJ

Stephane_fr
Google Developer Expert

Hello

 

I will try to make it simpler.

In fact, to download a CSV you need to do a UrlFetch to get the data. Here's a good example of that: https://www.fabiofranchino.com/blog/how-to-fetch-and-parse-a-csv-in-google-apps-script/

The difficulty in your case is that the Cloud Storage file is not public, so you need to provide a bearer token to access it. That is why in the URL fetch you have to specify:

var res = UrlFetchApp.fetch('http://www.example.com/my.csv', {
  method: 'GET',
  headers: {
    Authorization: 'Bearer ' + ScriptApp.getOAuthToken(),
  },
});
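Once the fetch succeeds, `res.getContentText()` gives you the raw CSV text. In Apps Script, `Utilities.parseCsv(text)` handles the general case (quoted fields, embedded commas); a minimal plain-JavaScript parse step, assuming a simple CSV with no quoted commas, might look like:

```javascript
// Sketch: turn raw CSV text into a 2-D array of cell values.
// Assumes a simple CSV: no quoted fields containing commas or newlines.
// (In Apps Script, prefer Utilities.parseCsv(text) for the general case.)
function parseSimpleCsv(text) {
  return text
      .trim()
      .split(/\r?\n/)           // one array entry per row
      .map(function (line) {
        return line.split(','); // one entry per cell
      });
}
```

The resulting 2-D array is in the right shape to hand to `Range.setValues()` on the target sheet.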

 

By using ScriptApp.getOAuthToken(), I assume the account that runs the Apps Script code also has access to the bucket.

For that, you need to display the manifest file in Apps Script and add the scope for Cloud Storage.

Stéphane

Hi Stephane_fr,

Thank you for your reply.

I was more interested in your initial response, since I thought there would be value in getting the entire CSV file and importing it directly into Google Drive. But since my tests implementing that didn't work, I went back to the direction of your second reply, using the 'UrlFetchApp.fetch' function.

Through more tests, I was eventually able to resolve the issue and successfully test the very first script function from this article:

https://h-p.medium.com/google-cloud-csv-to-google-sheets-740243e04f3a

 
It was two minor points that worked:
 
1) Added } at the end of two of the functions, in order to close out those two functions (otherwise, Apps Script wouldn't save the code script at all)
 
2) For the line of code where it says:
var file_clear = {"spreadsheetId": 'X',"ranges": ['SpreadsheetTabName!A1:L50']};
 
I had initially assumed -- inadvertently -- that the 'X' there was where the spreadsheet ID should go, similar to how each 'X' in the params section at the top is replaced with the specific bucket, filename, etc. But keying the spreadsheet ID into that particular 'X' caused the persistent failure of no data reaching the sheet.
 
Overall, the instructions outlined in the article by its original author were accurate, and kudos goes to hp for writing a nicely done, well-written script and walkthrough!
 
If anyone else may be interested in replicating that code, perhaps the above two points may help them implement it successfully as well.
 
Thanks,
TJ

Stephane_fr
Google Developer Expert
Google Developer Expert

Hi

Nice you were able to make it work.

Good job, and welcome to the Apps Script world 🙂 This is really advanced Apps Script coding -- judging by all the questions on the forum, it is not easy.

Have a good day,

Stéphane

Hi TJ,

Sorry to bother you. I am also testing the same code, and having similar issues.

Would you have the final script that worked for you? (Perhaps you can share it anonymously via the Forms used in Docs Editors.)

Many thanks.

Hi Mr_Shane,

Ok, I went ahead and copied my example custom script into the Forms used in Docs Editors, which you can see here at this link:

https://docs.google.com/spreadsheets/d/1iqIg8PRAIuy_iu9AeTTqHBkehpqA8ouc1zFyCrJVaOA/edit#gid=2100307...

I've replaced any individual key data with a red-labeled 'XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX' -- similar to the article, you just replace each 'XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX' with the proper credentials from your Google accounts (i.e., Drive, Storage buckets, etc.). It's also important to properly link your Sheet's Apps Script to your GCP project (done under the Apps Script settings). I also had to do a lot of testing with the bucket settings to set them up so the CSV file data could be retrieved under the right security and privacy protocols. When I did all of that, I got it working 100% initially (although lately I've been running into timeout errors, per my other post response).

If any of this helps with your testing, feel free to post feedback here if you're able to get your script working. I'd be eager to learn anything from your testing that might help me troubleshoot this script even further.

Have a nice day.

Thanks,

TJ

Hi TJ,

Thanks for that code, it cleared up where to put both of the closing }.

The appsscript.json code was useful to see how it related to the extra lines in the code.

Please see the update in the file. I have transferred the script into the Apps Script Editor, added the necessary APIs, and modified the appsscript.json file.

So this should mirror a real-life example.

Except for 1 difference... I have created a new var (line 49) which is then used on lines 51 and 53, so there's no need to input the spreadsheet ID twice.

Also, I have added comments throughout to identify where we need to input our custom variables, etc.

Please check it out, and let me know if this is accurate or if I have missed something 🙂

Hi Mr_Shane,

Yes, this looks great! Nice addition of the new var -- it simplifies the code and is more intuitive. Obviously, kudos to the article author for putting it all together, but you've added a nice finishing touch! The comments are very helpful too. Overall, our collaborative contributions will help anyone else who finds this code useful as well. 🙂

Thanks,

TJ

Hey TJ,

I have just 1 more question regarding the original article point 5:

"Ensure the redirect URL depicted in the Authorization Screen is entered into the service account’s scope."

Do you have any idea what that means, or where I should be looking in the Cloud Console, etc.?

Hi Mr_Shane,

This is a topic I still have a few questions about. It may also be the source of my 'timeout' issue, since at some point I may have tweaked the permissions on my Storage bucket, which could explain why my scripts can't access the files and are now hitting timeout errors. Lol

Here's my present understanding:

So there are two different kinds of 'service accounts.' By default, service accounts are created by GCP as a kind of general security mechanism, in order to de-link access from any email address (personal or business). Think of a service account as a general account manager with an email-style moniker (e.g., yourprojectname@developer.gserviceaccount.com) but no email inbox. If you go to your:

Google Cloud > IAM & Admin > Service Accounts

Using a hypothetical thought exercise, let's say we have one called "ABC", as the default Compute Engine auto-generated service account.

That is where you should find the service account linked to your Storage bucket permissions by default. With any new GCP project, Google auto-creates a default service account -- a 'Compute Engine' one was made in my project after I created the VM instance. You can either use the default service account or make a new one, but either way, I think it has to be granted permissions on the Storage bucket to function properly -- though I'm not certain of that part yet.

However, that's not the same 'service account' that the article is referring to in point #5 per your inquiry.  At least, I don't think so -- I think the article is referring to a different type of 'service account'.

Instead, once you link your GCP project ID number in the Apps Script settings of your Google Sheets file, another type of default 'service account' is auto-created by that link. This new one doesn't show up under 'Google Cloud > IAM & Admin > Service Accounts'; rather, you can find it under 'Google Cloud > APIs & Services > Credentials' in your GCP project.

In addition, when I tested these steps, I found an auto-generated 'OAuth 2.0 Client ID' called 'Apps Script' -- a web application with a randomly generated client ID. I don't know whether it's correct to call that one a 'service account' by Google's standards, but I think that is what the article means in Step 5. That client ID was not editable, and I found through testing that if I deleted it, the project was no longer linked to the Apps Script and nothing would work. So you have to leave that one as-is in the project. I think it gets auto-created as soon as the project ID is linked in the Apps Script settings -- or possibly when the script is first run; I don't recall offhand, since it's been a while. Either way, it gets auto-created early in the process.

Continuing the hypothetical, let's call this one "NOP" -- the auto-generated OAuth 2.0 client ID made by linking the project ID in the Apps Script settings (or by first authorizing the script during its first run, whichever it is).

On top of that, I also manually created a new 'OAuth 2.0 Client ID,' as described in Step 2 of 'Setting Up GCP' in the article. I think that is the 'service account' Step 5 refers to -- in other words, whatever client ID you create in Step 2 as a web application. It should be editable, and if you edit the OAuth client, there should be a section 'Authorized redirect URIs (for use with requests from a web server).' That's where I pasted the URL I had copied from the initial authorization screen upon first running the script in Apps Script.

Using the same hypothetical, let's call this one "XYZ" -- the OAuth 2.0 client ID that we created ourselves for Step 2 of the article instructions.

So in our hypothetical we now have three different 'service accounts' -- or rather, two of them may not be service accounts exactly, but 'OAuth 2.0 Client IDs.' Again: "ABC" is the default service account made by GCP (i.e., not in connection with the Apps Script code); "NOP" is the auto-generated client ID representing the link between the project ID and the Apps Script settings; and "XYZ" is the client ID we created in Step 2 of the article. These all have unique email-address-looking identifiers attached to them, but none of them have inboxes -- they are accounts that help us manage access within GCP.

You'll also likely want to ensure that the 'OAuth 2.0 Client ID' you create in Step 2 of the article is added to the Storage bucket's access as a 'Storage Admin.' What is unclear to me is which ones need to be added. In my testing, I added all three -- "ABC", "NOP", and "XYZ" -- to the Storage Admin role on the bucket, and I think that worked. That may be too much access, and maybe not all three need it, but I was having so much difficulty getting the script to run at first that I just kept adding permissions hoping anything would get through the security protocols! Lol

This is what I was referring to previously when I said I think Google and other coding bodies have made security too complex; it doesn't feel intuitively constructed. I now understand the principles better -- they're designed so large corporations and teams can manage projects and accounts and share access without giving away personal or business email accounts. But in doing so, the setup adds far too many layers for something as simple as copying a CSV into a Google Sheet. It would be nice if Google just wrote the code for this type of link and made it available with a few clicks for average coders and everyday users. That said, if we understand the principles involved, we can better manage more and larger projects with these various levels of access credentials in the future.

Now, a lot of my feedback on this topic is my best guess -- because I didn't find anywhere else to add the authorized redirect URL into the "ABC" service account (unless you can find where to add it?). Also, the "NOP" client ID is not editable, so even though it's necessary, the only place I could find to add the redirect URL was in the "XYZ" client ID made in Step 2 of the article.

Feel free to post any feedback per your testing.

Thanks,

TJ

TJ
Bronze 2

Hi Mr_Shane,

Sounds good. Although I got the code working per my description above earlier this year during the testing phase, when I deployed it in a real working environment, it began throwing errors against the Google Cloud VM instance and the Storage files in the buckets lately, which I haven't yet had the opportunity to troubleshoot and resolve.

My initial guess is that these errors stem from the VM instances I made with an auto-updating Google Cloud SDK. The issue is that whenever the SDK on the VM auto-updates, it changes its folder name to the latest version. I had set up an rclone sync to copy some of my CSV files from the VM into a Cloud Storage bucket at regular daily/hourly intervals and automate that process, but each SDK auto-update breaks that automation because the folder name changes. Whenever that occurs, the files in the buckets stop updating, and then the custom script retrieving CSV data from the bucket into the Google Sheet starts timing out.

At least, I think that's the source of the issue -- though there may be more to it, especially Apps Script's 6-minute execution limit. That's really the larger issue: I'm retrieving data from CSV files, at least one of which is ~75 MB, and Apps Script caps a script run at about 6 minutes -- any process running longer than that times out without completing.

Overall, I'm hesitant to post my script. Even though it had been working flawlessly for a few weeks before hitting the Apps Script timeout limit lately, it isn't working at all right now, and I wouldn't want to hand anyone a custom script that isn't working for me until I've had a chance to review it and get it running properly again. It may be because the CSV files in the buckets aren't being updated -- although conventional thinking suggests that stale CSVs would still copy fine, just with older data that wouldn't be current each day. I'm using the custom script across a handful of files, with a few tweaks, and all of them have run into timeout errors -- none are currently working.

I recently got back from vacation, which is when the errors began, so it's unclear whether this is a simple fix or whether I need to go back to the drawing board and develop an alternate script. I would really prefer to find a better way around the Apps Script 6-minute limit, if possible. I have heard of an alternate solution, but it would basically mean developing a new custom script elsewhere in Google Cloud's offerings, building off of Apps Script (i.e., to avoid the 6-minute limit on script processing).

I went back a moment ago to check one example of the custom script I wrote based on the article here:

https://h-p.medium.com/google-cloud-csv-to-google-sheets-740243e04f3a

And my code is exactly what is detailed in the above article. The only things I added were the keys specific to my Google Cloud and Google Sheets/Drive accounts -- that, and the two minor details I described in an earlier post that got it working 100% initially.

So if you're testing with smaller CSV files that won't hit the Apps Script 6-minute timeout, you wouldn't necessarily even need my custom script, because everything is already described in the above article. I had to go back and re-read the article about two dozen times to figure out every component -- you have to grasp each concept the author describes in fine detail to get the script working. In a sense, it's unfortunate that the security protocols implemented by Google (and many other code bodies at large) have made basic tasks (like copying CSV data from one source account into a Sheets data sheet) so complex, in my view. But those are the protocols they have set up, and the only way to operate a custom script is to fully understand them and build within their structure. If you keep testing and keep re-reading the article, you should be able to accomplish what you have in mind for your script.

Hope this helps and good luck. 🙂

Thanks,

TJ

Hi TJ, thanks for the fast response.

Other than the first 4 variables, which I have already figured out, I'm just a little confused about where the 2 closing brackets were placed... are they both at the end of the script, or do they close separate functions?

And for the 'X': according to the article's documentation, we need to replace that X in the middle of the code -- would that be correct?

If I have all these things lined up correctly based on what was a working model, then I know I need to look elsewhere for what else might be causing the failure. So far I have tried replacing the X and putting the 2 closing } in various places (both at the end, and one after each combination of 2 of the 3 functions), but I still get the same result: the long string.

Sidenote: the initial auth is not actually OAuth to Cloud; it is simply authorizing the Apps Script itself, so it has nothing to do with the script contacting Cloud or using OAuth.

Hi Mr_Shane,

Excellent questions.

Regarding the two brackets (i.e., "}"): I have color-labeled in bold orange the two brackets I added to the script from the article. The first bracket closes out the 'var file_write' section (which I think was missing from the article's example code) and the other closes out 'function getService()' within the authorization section (which was also missing from the article's code).

Regarding the 'X', if you're referring to the code line:

----------------------------------------------------------------------------------

var file_clear = {"spreadsheetId": 'X',"ranges": ['SpreadsheetTabName!A1:L50']};

----------------------------------------------------------------------------------

Then no, that 'X' doesn't need to be replaced. As I mentioned in my prior post, I had initially thought the 'X' on that line should be replaced with a credential value -- it was only after further testing, and re-reading the article several more times, that I realized I was inadvertently misunderstanding the code. In other words, the 'X' on that line stays as just an 'X'.

It's confusing how it's written in the article, because in the first section of the code, where it has these lines:

----------------------------------------------------------------------------------

//----Input Web Application Credentials and Bucket/File----
var params = {
CLIENT_ID: 'X',
CLIENT_SECRET: 'X',
BUCKET_NAME: 'X',
FILE_PATH: 'X.csv'
};

----------------------------------------------------------------------------------

In all of the above lines of code, the 'X' gets replaced by your Google account credentials.

But then later on, when the 'X' shows up again in the line of code that says:

----------------------------------------------------------------------------------

var file_clear = {"spreadsheetId": 'X',"ranges": ['SpreadsheetTabName!A1:L50']};

----------------------------------------------------------------------------------

That one stays as just 'X' and doesn't get replaced [however, you do need to replace SpreadsheetTabName on that same line with your tab name, which you can pull from the bottom of your Google Sheet]. I think what the author intended is that the 'X' is just a placeholder value for that line only; the spreadsheet actually targeted is specified in the next two lines, where you put in your spreadsheet ID:

 

var cleared = Sheets.Spreadsheets.Values.batchClear(file_clear, 'SpreadsheetID');
Logger.log(cleared);
var wrote = Sheets.Spreadsheets.Values.batchUpdate(file_write, 'SpreadsheetID');
Logger.log(wrote);

Where 'SpreadsheetID' is the actual ID from your Google Sheet that you key in twice in the above code lines.
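A sketch of that setup, with the spreadsheet ID defined once so it doesn't have to be keyed in twice (the ID value here is a placeholder):

```javascript
// Sketch: define the spreadsheet ID once and reuse it for both calls.
// Note that 'X' stays a literal 'X' -- it is just a dummy value inside the
// request body; the real target sheet is the second argument of each call.
var SPREADSHEET_ID = 'your-spreadsheet-id'; // placeholder -- use your own
var file_clear = {
  spreadsheetId: 'X',                        // stays as-is, per the article
  ranges: ['SpreadsheetTabName!A1:L50'],     // replace the tab name with yours
};
// With the Sheets advanced service enabled, the calls then become:
//   var cleared = Sheets.Spreadsheets.Values.batchClear(file_clear, SPREADSHEET_ID);
//   var wrote = Sheets.Spreadsheets.Values.batchUpdate(file_write, SPREADSHEET_ID);
```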

Try that and see if it tests okay.

Side note: when you say "The initial auth is not actually OAuth to Cloud", what are you referring to as the initial auth?

Thanks,

TJ


"what are you referring to as the initial auth?"

Depending on the Apps Script, it can require authorization on first run; here's the documentation: https://developers.google.com/apps-script/guides/services/authorization

"replaced automatically within the code by the next two lines, where you put in your SpreadsheetID credentials"

This should now be easier with the new var I created on line 49 in the Apps Script Editor 🙂

Regarding the closing brackets, I just noticed the different colour. They have now been added and documented/commented in the Apps Script Editor.

Between both our efforts, I think this edited Apps Script is more approachable than the original file from the article. Once I can get it working as-is, I'll start playing around with it to try to make the code easier to work with... or, as you mentioned, maybe we need to develop a new script 🙂

Hi Mr_Shane,

Good news. I did some troubleshooting today on the timeout error. It turns out the timeouts weren't connected with the script from the article.

Basically, to retrieve the data from the larger ~75 MB CSV in the Storage bucket and parse it into a Google Sheet, I had set up an alternate custom script, different from the one in the article. That script splits the CSV data across multiple smaller Google Sheets. This is partly due to another limitation of Google Sheets: you can only search within a limited range of rows and columns before it errors out (unless you upgrade your account to a paid Enterprise-level version). Then each of the smaller Sheets runs another similar custom script, one per hour, to combine all the data back together. I know it sounds counterintuitive -- why take a large CSV, split it into smaller sheet files, and then recombine them? It was one way to handle the Apps Script 6-minute execution limit. If you run the article's script against a larger CSV, it will error out every time from trying to copy too many rows and columns, and it won't complete.
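The splitting idea can be sketched as a pure helper that turns a total row count into batch ranges -- the helper name and batch size here are my own illustration, not from the article's script:

```javascript
// Sketch: split N data rows into batches of at most `size` rows, so each
// batch can be processed in its own short script run and stay well under
// the Apps Script ~6-minute execution limit.
function rowBatches(totalRows, size) {
  var batches = [];
  for (var start = 1; start <= totalRows; start += size) {
    batches.push({
      start: start,
      end: Math.min(start + size - 1, totalRows),
    });
  }
  return batches;
}
```

Each run (fired by a time-driven trigger) would then process one batch and record where it left off, instead of attempting the whole file in a single execution.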

Ultimately, the good news is that the script from the article, as we confirmed again today, is working properly -- it did not have any time-out errors, at least not on my end, since I wasn't using that script to copy CSV files larger than, say, 10 or 15 MB total.

Instead, I was getting time-out errors trying to run a script that was copying data from one Google Sheet to another -- it seems the copying speed within Google Sheets has slowed down since I first started running the script, and it now takes twice as long to complete the copy from one sheet to another. I wonder if Google has been tweaking the server resources allocated to Google Sheets for my basic account? I know that I have a paid Google Drive account, but it comes with more server resource limitations than the higher professional and corporate-level tiers.

Anyway, thank you for your inquiry today; it gave me a reason to dive back in and troubleshoot the time-out error, which was loosely connected with the scripts that I developed after finding the article earlier this year. 🙂  I may still look for an alternative option or another custom script to address migrating the larger data from the ~75 MB CSV file over to Google Sheets. I want to 'future-proof' the solution as much as possible, to avoid errors or breakdowns as Google inevitably makes changes to its servers and code structure down the line. Right now, the setup I have in place -- parsing the larger CSV file and then combining it back together again -- is probably too cumbersome a solution, with too many moving parts. I've already researched two or three options, and that'll be next on my list of things to do going forward. If you hear of any alternatives, please kindly let me know.

Thanks,

TJ

Hi TJ,

I got your previous message by email, but when I click through, it seems to have been deleted here.

Anyhow, in the email, where you mentioned steps 5 and 2 being linked... that sent me on a mission. 🙂

Where the source post talks about the "service account's" scope, that's a misnomer; it is, as you mentioned, the "Authorised redirect URIs" of an "OAuth 2.0 Client".

So I deleted the orange-triangle "OAuth 2.0 Client" for "Apps Script" and created my own using the same "Apps Script" name. Now I finally get the Auth window that I think you were at during your first enquiry. However, I get an "Authorisation Error -- Error 401: deleted_client -- The OAuth client was deleted."

It's further than I got before, but there's still a small way to go now.

For my case, I am setting things up to import a single CSV, which will be a merge of many single-row CSV files (I have no idea how many yet; I'm just learning how to do this to help a friend who asked if it's possible to import many single-row CSVs). For the merge, I followed this very VERY simple procedure: https://medium.com/google-cloud/merge-clean-transform-your-csvs-easily-with-bigquery-3a73c0c26d57

And the reason for the merge, well... I can't find much in the way of clear and working methods for a Cloud newb to set up and get this working with multiple CSVs. I did try this method, but it's not working for me: https://codelabs.developers.google.com/codelabs/cloud-function2sheet#0

If the Apps Script method wasn't so volatile (failing when any one piece of the puzzle fails), could handle multiple or single CSV files, and didn't time out... then we would both be onto a winner 🙂 haha.

Currently I'm writing out the step-by-step detailed instructions for this Apps Script method. Here's a screenshot; maybe you can see something I've missed or done wrong:

Screen Shot 2022-09-22 at 08.54.38.png

 

 

Hi Mr_Shane,

Per my previous email, it is showing on my end within this community forum thread, just further up the page. Search for "Step 2" (for example) on this page and you should be able to find it.

Per your reply, when you say that you "deleted the orange triangle "OAuth 2.0 Client" for "Apps Script"" -- as I recently mentioned, during my testing, if you delete that auto-generated Client (i.e. the orange-triangle one), it breaks the connection between your GCP Project ID and the Apps Script. So I wouldn't recommend deleting that one; just leave it as-is. I think it serves the purpose of making a general link between the Apps Script and your GCP Project ID. I don't know that it serves any other purpose, but without it the Apps Script doesn't run, which would explain why, after you got to the Auth window, you saw the error "Authorisation Error -- Error 401: deleted_client -- The OAuth client was deleted." In other words, the error is saying that your OAuth client -- the 'orange triangle' Client -- was deleted.

In my prior post, when I mentioned that I had tried deleting the orange-triangle Client ID, I was saying that I had tried deleting it just for testing, but it didn't work and resulted in the same error that you got in the Auth window. So I was trying to communicate not to do that -- to save you and everyone else from that same error. 🙂 I hope that clarifies it. And like I said, the way Google and other code bodies have set it up, there are so many seemingly random 'service accounts' on various levels that it makes things more complex than is probably necessary, which is why it doesn't seem very intuitive. At some point, it would be nice if Google and others made a new setup without so many service accounts to keep track of. For example, if I log into one of my buckets and view the permissions, there are about a dozen different randomly auto-generated 'service accounts' or Client IDs -- what are they all doing, and why are they necessary? Gah! Lol

Most articles on Medium are very helpful. That's interesting that you're merging many single-row CSV files into one for your friend. Right on about the volatility of this method -- the downside is that if any one piece of the puzzle fails, it breaks the whole method, and Google may make updates in the future that require the method to be serviced or updated. The upside is that once the puzzle pieces fall into place and it's working, it's a really secure and robust way of moving data from GCP to your Google Drive account (though with relatively low data-transfer limits unless you upgrade to a higher-tier Google account).

Yeah, I had also previously tested the method found in: https://codelabs.developers.google.com/codelabs/cloud-function2sheet#0 -- but it wasn't working for me either.  I did a deep dive into learning the concepts behind that method, but ultimately I determined that the method is missing some pieces -- which is what led me to test the article for this Apps Script method (per our above discussion), and after a lot of extensive testing, that one eventually worked out okay.

The step-by-step detailed instructions that you've got for this Apps Script method are great, and hopefully helpful to other coders going forward. The only suggestion I have is that I don't know if enabling the Cloud Storage Legacy roles (Storage Legacy Bucket Owner / Object Owner) is necessary. For my testing, just enabling Storage Admin worked, since that grants all permissions on the bucket. But I guess adding more permissions would be alright. There are likely more options there that GCP is giving to account managers, and I'm somewhat unfamiliar with what 'Storage Legacy' refers to. Overall, it looks great!

Also, a quick comment on my testing: when I had tried deleting the auto-generated 'orange triangle' Client ID -- the one that appeared after linking the GCP Project ID in the Apps Script settings -- if I recall accurately, the only way to get that Client ID back was ultimately to delete the GCP Project entirely and start over with a brand-new Project ID from scratch. Once the GCP Project ID and Apps Script settings were linked and the 'orange triangle' Client ID was auto-generated, I couldn't find any other way to get it back, or to get a new Client ID auto-generated again, without starting over with another brand-new Project ID. FYI 🙂

Hey TJ,

Yes, you're right... the only reason I was "lucky" enough to get the popup window was due to deleting that orange-triangle "Apps Script" OAuth client.

What I did to fix it was quickly create a dummy project and set up basic OAuth for it, then link the Apps Script to this new Project Number, and then switch the Apps Script back to my original Project Number. What this did was push some kind of "new" request to the project, and the project recreated the orange triangle. So now I'm back to where I started... the long string of gobbledygook with no popup.

Knowing that you just had Storage Admin is a help. The instructions alluded to having to set up something that had read/write, and the legacy roles were the only ones that had those specific words in their descriptions. You know someone is green when they take things literally and then try to band-aid by adding everything else that looks good enough to maybe make something work.

I have now enabled only Storage Admin and removed the others.

Hi Mr_Shane,

Another suggestion: have you tried adding the orange triangle "Apps Script" OAuth -- into the access permissions of your GCP Storage bucket?

Also, if you haven't already tested to try adding the default service account as well on the same access permission for the bucket -- I'm referring to the auto-generated service account located at:

GCP > IAM & Admin > Service Accounts

Thinking back, I may have had to tweak the access permissions for some time before getting it to work properly -- including going into the Google Workspace admin settings (for the same Google Workspace account whose Google Drive holds the linked Google Sheet used by the Apps Script custom script per the article) and giving the default service account (the one I found auto-generated at GCP > IAM & Admin > Service Accounts) full access privileges for the Google Workspace account. I don't know offhand if that was even necessary -- it's been a while since I initially set this up, and I did so much testing and re-testing that it's a bit difficult for me to give the exact steps of what wound up working when it comes to access privileges -- but I think the default service account was part of the picture of what ended up working on my end.

And even though I don't know exactly all the steps that I took, what I can see when I check my account today is what I'm reporting here, exactly as it is currently working. It was a big sigh of relief when I eventually got the OAuth pop-up screen, after it initially kept resulting in the same long-string error message. Then it was another "Finally!" moment when I got through the Auth screen and approved the popup. But it took a lot of testing, and I kept adding the various service accounts or Client IDs in different places and settings to get there.

Storage Admin includes read/write access -- even if the description doesn't mention it, it's full privileges, including read/write. Without having extensively researched or tested it, my guess is that 'Legacy' has to do with older/outdated code by Google. See here for a more thorough description:

https://stackoverflow.com/questions/45533322/google-cloud-storage-confused-about-acl-iam-and-legacy-...

Anytime you see the word 'legacy' in connection with code in general, including Google Cloud Platform code, it usually refers to older or outdated code that is either being replaced by newer code (hopefully with some improvements) or will eventually be deprecated (i.e. no longer used or supported).

Just to double-check: did you set up your OAuth consent screen, located at:

GCP > APIs & Services > OAuth consent screen

?

Thanks,

TJ

Hi TJ,

I copied the long string that was showing in Apps Script and pasted it into a new browser window. It gave me the response "Success! You can close this tab.", which means there is sufficient access, but it's just not pulling the data from the CSV.

Then I tried adding the orange Apps Script OAuth client to the bucket permissions, but it doesn't show up.

Next, since I didn't have an auto-generated Service Account, I created one and added that to the bucket.

But the Apps Script then gave error 403.

So I removed the access, but I still get error 403.

This task is driving me insane. 🙂

EDIT:

What I found after adding and removing the service account's access to the bucket was that my OAuth client had its secret automatically removed. Creating a new OAuth client (with a client ID and secret pair) and updating the Apps Script hasn't fixed the problem... still error 403.

EDIT #2:

I found the fix for the 403 error.

Adding these two lines of code:

var service = getService();
service.reset();

I have added these to the sample file.

Hi Mr_Shane,

I recall that step from my testing too -- where you said you "copied the long string that was showing in Apps Script, and pasted into new browser window. It gave me the response of "Success! You can close this tab."" Once I had reached that point, I figured the link was complete between the GCP Project ID and the Apps Script -- and from that point on, I kind of left the service accounts as they were, since I didn't want anything to 'break' that successful Auth connection. Lol

However, I did get 403 error messages prior to that point. That's interesting to learn from your tests, and I'm glad you've found a solution. I think part of the answer to the puzzle is: in what order should the steps be completed? That's still somewhat of a question mark in my view. When I tested adding the various service accounts / Client IDs to the bucket access permissions, I don't recall it resulting in a 403 error -- but maybe I had not yet reached the 'Auth success' step for the pop-up by then..? It's unclear to me, since I still have them added as Storage Admin on the bucket access, yet there is no 403 error and no other issues with the scripts running daily on a timer.

Also, I've never heard of an OAuth client having its secret automatically removed -- that's another question mark. Ultimately, it makes sense that the two lines of code you've added -- resetting the service -- work. But then I wonder: are those two lines resetting the authorization service every time the script runs in Apps Script? In other words, does it require to re-authorize the OAuth link (via the pop-up) window upon every time the script runs?

Thanks,

TJ

"does it require to re-authorize the OAuth link (via the pop-up) window upon every time the script runs?"

The OAuth will need to get the appropriate credentials every time a connection is attempted. Whether there's another popup or not I don't know yet, because I still haven't managed to get the data to import even once... It's still a work in progress...

Speaking of work in progress: as I'm going through the code, I'm finding more places to shorten it.

From what I can understand, the 403 error is caused when one authorisation is still being held and the script attempts to make a second connection, which Cloud will reject, causing the 403 "insufficient permission" error. So the service.reset(); is necessary in order to prevent the 403 error.

I should also add that actually just one line of code is needed, not two, because var service = getService() was already declared, so it doesn't need to be declared again.

I got the script snippet from the OAuth Library github page:

https://github.com/googleworkspace/apps-script-oauth2#logout
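The stale-authorisation idea can be modelled in plain JavaScript to see why reset() helps. The class below is a toy illustration of mine, not the apps-script-oauth2 library code; in the real library, service.reset() clears the token stored in the script's property store so the next run authorises fresh:

```javascript
// Toy model of a token store, illustrating why a held stale credential leads
// to a 403-style rejection and why reset() fixes it. Illustration only.
class TokenStore {
  constructor() {
    this.token = null; // nothing stored yet
  }
  // Simulate a request: a held token that no longer matches what the server
  // expects is rejected, mirroring the 403 "insufficient permission" error.
  request(currentValidToken) {
    if (this.token !== null && this.token !== currentValidToken) {
      return 403; // stale credential held -> rejected
    }
    this.token = currentValidToken; // fresh authorisation succeeds
    return 200;
  }
  reset() {
    this.token = null; // what service.reset() does, conceptually
  }
}
```

In other words, clearing the stored credential before the next connection attempt prevents the "one authorisation holding while a second is attempted" situation described above.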

Hi Mr_Shane,

The only other thing I can think of is that once I got to the "Success!" tab (by copying the URL once the pop-up Auth screen was approved), the remaining step was to ensure that all of the personal keys entered into the various parts of the Apps Script were correct. We've covered them already to some extent.

One thing I had tested was in the section defining the params var, for the line:

FILE_PATH: "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX.csv" // Enter the filepath to the file to import
 
At first, I didn't know if that was referring to just the filename of the CSV or to a longer file-path string -- in Windows, for instance, you'd have something like "C:\Program Files\test.csv". But what I found was that here it requires only the filename, so for example "test.csv", as in:
 
FILE_PATH: "test.csv" // Enter the filepath to the file to import
 
That's if you've placed the file in the top level of the bucket you've created, with no additional structure. But obviously, if you put more 'folders' within the bucket, then they would be added to the file path.
 
Defining the bucket is simply the bucket name.
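Once the CSV text has been fetched, it has to become a 2-D array of rows before it can be written into the sheet with setValues(). Apps Script's built-in Utilities.parseCsv() does this (and handles quoted fields); as an illustration only, a naive plain-JavaScript equivalent for simple, unquoted CSVs might look like:

```javascript
// Naive CSV-to-rows parser, for illustration only: it does NOT handle quoted
// fields or commas embedded in values. In Apps Script, prefer
// Utilities.parseCsv(csvText), which returns the same 2-D array shape.
function parseSimpleCsv(csvText) {
  return csvText
    .trim()                 // drop a trailing newline, if any
    .split(/\r?\n/)         // one entry per line
    .map(function (line) {
      return line.split(","); // one entry per column
    });
}
```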
 
Another thing I noticed: under the Client ID (as in, editing the Client ID within the credentials section of GCP > APIs & Services), there is a section that says "Authorized JavaScript origins". There, I have added two lines:
 
URIs 1
 
URIs 2
 
And then below that section, in the 'Authorized redirect URIs', I have added another line (in addition to the URL which was added from the Auth pop-up previously, which should be something like "https://script.google.com/macros/XXXXXXXXXXXXXX"):
 
 
Somewhere along the line, I found all of the above additions recommended, either buried in Google's help section for OAuth or in user forums as recommended feedback. At the time, I wasn't getting the data flow to work, so I just kept adding or testing things to try to configure the security settings properly. And as I've said, I don't know exactly what caused it to work, but I eventually got there. I had assumed you'd be able to get there more easily or with less testing, but since it looks like you're running into similar issues to the ones I went through, I don't know which of these 'tweaks' may be the missing puzzle piece that ultimately unlocks the data flow.
 
Hope this helps. 🙂
 
Thanks,
TJ

Hi TJ,

For the "Authorised redirect URIs", it is the Apps Script ID embedded in the link, which i have given the apps script snippet to generate the complete link for copy pasting.

For the "FILE_PATH", you will see it being referenced in the apps script as "encodeURIComponent(FILE_PATH)". What that does is accommodates for file paths where the CSV is inside a folder, without the "encodeURIComponent" the forward slash creates issues.

Here's the one-liner that replaces the three lines of original code:

var url = "https://storage.googleapis.com/storage/v1/b/"+ BUCKET_NAME +"/o/"+ encodeURIComponent(FILE_PATH) +"?alt=media";

I have added the docs and drive JavaScript origins, and although they haven't gotten my script working, they may very well be part of the solution. My "instructions" are now greatly refined for simplicity, but I am now researching whether there is a specific order that the setup must follow so that everything ties together properly. As I have already experienced, the smallest change in one place strangely affects a different section, so I want to be sure all the steps are in the correct order to prevent anything from being tied together incorrectly.

See the screenshot of current steps:

Screen Shot 2022-09-24 at 05.30.27.png

Hi Mr_Shane,

The instructions look good.  Thank you for compiling those instructions.

For the "Authorised redirect URIs", yes, you are correct -- it is primarily the Apps Script ID embedded in the link.  However, there is the option to have more than one "Authorised redirect URIs" added, and per my previous note, I had added a second "Authorised redirect URIs" link -- although I don't know whether or not it was necessary.  It certainly wasn't mentioned by the original article, but -- I'm just identifying things I tested since -- perhaps anything may assist in helping to get your script working.  The second "Authorised redirect URIs" that I had added (which I've put on all of my Client IDs) is:

https://script.google.com/oauthcallback

I had found it mentioned somewhere in Google's help section, but I don't recall offhand where I found it, otherwise I could post it on here for your review.

Ultimately -- and I was mulling this over lately -- thinking back, I'm fairly certain that in order to get my script fully working, I eventually found through testing that I had to use a service account, located in the GCP > IAM & Admin section, on top of the other 'orange triangle' client and the manually created 'Client ID' -- and I gave that service account full authorization to my Google Workspace account within the Google Workspace account settings.

It should be noted that to do that, you'll have to go outside of GCP and into your Google Workspace admin console to find the account settings for your Google Drive. That is where you can grant the service account -- either the auto-generated one, as it was for me, or one you manually create within GCP > IAM & Admin -- full access to your Google Workspace account.

I recall thinking it was kind of unfortunate, since using the 'orange triangle' client and the 'Client ID' usefully limits the scope of the OAuth access permissions. By using the service account from GCP > IAM & Admin and giving it full access in the Google Workspace account settings, that one service account email address gets complete and total access -- far fewer scope limitations than the other two (the orange triangle and the Client ID). In other words, there is more downside risk from a potential leak or exposure of the service account ID compared to the other two, because that service account has been given so many access permissions. So I would really have preferred not to go that route and not to use the service account in that regard -- but, if I recall accurately, that was the only way via testing that I could get everything to work: not just the OAuth success screen, but also the data pushing and retrieval workflow running smoothly. I don't know if that's what the original author was referring to, but that's what I discovered worked on my end via testing.

And then, as I've mentioned in a prior post, I gave that same service account not only full access to my Google Workspace in the admin settings, but I also added it to each bucket's access permissions as 'Storage Admin', so that it has full read/write access to the file(s) in the buckets. Conventional thinking is that the service account shouldn't be necessary -- isn't access already being provided by the 'orange triangle' client and the manually created Client ID, linking the Apps Script with the bucket files in GCP? But, for whatever reason, based on how GCP and Workspace are set up and structured, I don't think I was able to get the script working without adding another service account on top of everything else to link the two and get the data flow processing properly. Hope this helps.

Have you tried any testing along these lines so far?

Thanks,

TJ

Hi TJ,

Thank you for taking the time to give the detailed information of things you have tried and tested to get to a working result.

This is supposed to be a solution where the Apps Script is getting access to the GCP, so it seems strange to have to give permissions for the GCP to have access to the Workspace.

Regarding the "App Engine default service account", what i discovered is that this is only created automatically after enabling  particular APIs in "APIs and services". And this does not include the 4 APIs that are mentioned in the article as being required.

You can force its creation by enabling another API, such as the "Cloud Functions API", which creates a "Google Cloud Functions Service Agent" principal on the bucket.

Having already enabled the "Cloud Functions API" and given everything full access everywhere, I still can't get this working.

I'm now looking for other solutions that are documented well enough to get a working result.

Hi Mr_Shane,

Sounds good, no worries.

That's interesting to learn regarding the App Engine default service account. I haven't ever used App Engine, but it may be another area I'd like to explore when I have the opportunity -- I've heard good things about it, and you might be able to find a solution that works via App Engine.

Well, if you find an alternate solution, please feel free to post it in this thread.

Thanks,

TJ
