
Application Integration - FTP and GCS Connector

I am currently attempting to build an Application Integration in GCP which downloads multiple .zip folders from an FTP server and uploads them into Google Cloud Storage whilst retaining the file format.

So far my integration uses two connector tasks - one to download an FTP file, one to upload the file to Google Cloud Storage, and a data mapping task between these two tasks. I am using a Schedule trigger as we would like this to be run automatically on a regular basis.

I have been testing the integration and so far it creates a text file in Google Cloud Storage. How would I go about uploading the folder to GCS in its original format (.zip) as opposed to a text file? Also, is there any way I can use a loop and run this on multiple files on the FTP server at once? I want to download multiple folders and upload them to GCS.

If anyone has any advice or examples, it would be much appreciated.


Hi @mylesa_rdg ,  Thanks for your question.  Yes, this is all possible.  The SFTP and GCS connectors both support a binary mode of transfer, which should allow you to preserve the .zip format.  The connectors will use a uuencoding/decoding mechanism to pass off the file to/from the integration layer.  

Please see a combination of these community posts and/or docs for the topics you need to know about:
- Using binary mode with GCS Connector: https://www.googlecloudcommunity.com/gc/Integration-Services/Application-Integration-GCS-Connector/m...
- Using binary mode with SFTP Connector: https://cloud.google.com/integration-connectors/docs/connectors/sftp/configure 
- Sample for SFTP to poll a directory, get a list of all the files, then process each file in a loop : https://www.googlecloudcommunity.com/gc/Integration-Services/SFTP-Connector-How-to-use-in-Applicatio... 
- If you want to create a folder on GCS: create a new zero-byte object whose name includes the (non-existing) folder path, which creates the folder. I.e. upload an empty string to the bucket with an object name that contains the folder path.
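As a rough illustration of the zero-byte-object trick in the last bullet, here is a minimal Python sketch using the google-cloud-storage client library. The bucket and folder names are placeholders, and the `folder_object_name` helper is mine, not part of any Google API:

```python
# Sketch: create a "folder" in GCS by uploading a zero-byte object.
# Assumes the google-cloud-storage client library is installed and
# application default credentials are configured.

def folder_object_name(path: str) -> str:
    """GCS has no real folders; an object name ending in '/' acts as one."""
    return path if path.endswith("/") else path + "/"

def create_gcs_folder(bucket_name: str, folder_path: str) -> None:
    # Imported here so the pure helper above works without the library installed.
    from google.cloud import storage
    client = storage.Client()
    bucket = client.bucket(bucket_name)
    # Upload an empty string to an object whose name is the folder path.
    bucket.blob(folder_object_name(folder_path)).upload_from_string("")

# Usage (placeholder names):
# create_gcs_folder("my-bucket", "incoming/zips")
```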

Hopefully these pointers help.  Let us know if you have any questions.

Hi again @shaaland

Thanks for responding to my question! I have created two integrations, one to poll the FTP server directory for files and another which will upload them to GCS. I've been able to transfer files from the FTP server to GCS (provided I don't get a "ParameterOversizedWarning" or server errors).

In the for each loop of the first integration to poll a directory, the logs show "Failure Location": "Sub-integration execution" under "loopMetadata". From what I can see, I have correctly configured the for each loop task to call the sub-integration, so I'm not sure why this is happening.

Also, I saw in the README of the integration-samples project on GitHub that only text files are supported at this time. Is that still the case? We are trying to download binary files. I'm also aware that Application Integration is still in preview, so I'm wondering if I'm running into these issues because of its limitations.

Hi @mylesa_rdg ,

Thanks for your feedback and questions!  

We do support binary mode now. This thread has details about how to configure it.  Basically, there is a "HasBytes" setting in the connectorInputPayload to turn on binary (set to true), and then you use connector[Input|Output]Payload.contentBytes instead of connector[Input|Output]Payload.contents to map to/from the file contents. https://cloud.google.com/integration-connectors/docs/connectors/sftp/configure  
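To make the contentBytes mapping concrete, here is a small Python sketch of preparing a local .zip for a connectorInputPayload. The base64 step and the exact payload keys are assumptions mirroring the field names mentioned in this thread, not an official schema; the connector docs linked above are authoritative:

```python
import base64

def build_connector_input_payload(object_name: str, raw: bytes) -> dict:
    """Hypothetical payload shape: HasBytes=true plus base64-encoded
    ContentBytes, mirroring the names discussed in this thread."""
    return {
        "ObjectName": object_name,   # placeholder key; check the connector docs
        "HasBytes": True,            # switch the connector into binary mode
        "ContentBytes": base64.b64encode(raw).decode("ascii"),
    }

def decode_content_bytes(payload: dict) -> bytes:
    """Reverse mapping: recover original bytes from a connectorOutputPayload."""
    return base64.b64decode(payload["ContentBytes"])

# Round trip: the decoded bytes match the original file contents.
zip_bytes = b"PK\x03\x04 fake zip bytes"
payload = build_connector_input_payload("archive.zip", zip_bytes)
assert decode_content_bytes(payload) == zip_bytes
```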

For the sub-integration execution error, my guess is that your API Trigger ID or Integration Name is from the sample and has not been changed to match with your actual integration API Trigger ID / Integration name.  Can you double check these configurations highlighted with the red boxes in the screenshot?

[Screenshot: Screenshot 2023-03-17 at 10.05.28 AM.png]

Hi @shaaland,

Really appreciate you getting back to me on this! 

Apologies, I forgot that I have in fact been testing the integration using binary mode. When I test the second integration (to get files), if I put in a filename it gets the file and uploads it to GCS as intended. I have been seeing a ParameterOversizedWarning in the logs; do you know how I would resolve this? It appears to be related to the size of the file.

I downloaded the integrations and noticed I had the wrong integration name in the configurations you highlighted. I have since updated them, but it's still not triggering the second one to get the files.

We do have a 10 MB limit for the file size.  If your file is larger than this, that would explain the oversized error.
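One way to avoid the oversized warning is to filter the FTP listing before the loop ever invokes the sub-integration. A Python sketch; the 10 MB figure comes from this thread, and everything else (names, listing shape) is illustrative:

```python
MAX_PAYLOAD_BYTES = 10 * 1024 * 1024  # 10 MB limit mentioned above

def partition_by_size(files):
    """Split (name, size_in_bytes) entries into transferable vs. oversized."""
    ok, too_big = [], []
    for name, size in files:
        (ok if size <= MAX_PAYLOAD_BYTES else too_big).append(name)
    return ok, too_big

# Example listing (made-up names and sizes):
listing = [("a.zip", 2_000_000), ("b.zip", 15_000_000), ("c.zip", 9_999_999)]
ok, too_big = partition_by_size(listing)
# ok -> ["a.zip", "c.zip"]; too_big -> ["b.zip"]
```

Oversized files could then be logged or routed to a different process (e.g. a direct server-side copy) instead of failing mid-loop.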

For the Integration that is being called, can you check the logs to see if there are any errors?  Also, go to the called integration and check the logs there.  It may be executing but having some other issue there?  If you can post screenshots or cut/paste the error messages that would help us find the problem. 
