
Running a GitHub-connected Dataform repo from Workflows

Hi all,
I wonder if anyone can help?
I've connected a Dataform repo to a GitHub repo and I trigger it via Workflows, following this doc: https://cloud.google.com/dataform/docs/schedule-executions-workflows
Here's the source code in Workflows:
main:
    steps:
    - init:
        assign:
          - repository: projects/<gcp_project_name>/locations/<dataform_repo_location>/repositories/<dataform_default_repo_name>
    - createCompilationResult:
        try:
            call: http.post
            args:
                url: ${"https://dataform.googleapis.com/v1beta1/" + repository + "/compilationResults"}
                auth:
                    type: OAuth2
                body:
                    gitCommitish: main
            result: compilationResult
        except:
            as: e
            steps:
                - known_errors:
                    switch:
                    - condition: ${not("HttpError" in e.tags)}
                      return: "Connection problem."
                    - condition: ${e.code == 404}
                      return: "Sorry, URL wasn't found."
                    - condition: ${e.code == 403}
                      return: "Authentication error."
                - unhandled_exception:
                    raise: ${e}
    - createWorkflowInvocation:
        try:
            call: http.post
            args:
                url: ${"https://dataform.googleapis.com/v1beta1/" + repository + "/workflowInvocations"}
                auth:
                    type: OAuth2
                body:
                    compilationResult: ${compilationResult.body.name}
                    invocationConfig: 
                        fullyRefreshIncrementalTablesEnabled: false
                        includedTags: ["ga4"]
                        includedTargets: []
                        transitiveDependenciesIncluded: true
                        transitiveDependentsIncluded: false
            result: workflowInvocation
        except:
            as: e_execute
            steps:
                - known_errors_execute:
                    switch:
                    - condition: ${not("HttpError" in e_execute.tags)}
                      return: "Connection problem."
                    - condition: ${e_execute.code == 404}
                      return: "Sorry, URL wasn't found."
                    - condition: ${e_execute.code == 403}
                      return: "Authentication error."
                - unhandled_exception_execute:
                    raise: ${e_execute}
    - complete:
        return: ${workflowInvocation.body.name + " complete"}
Before I connected the Dataform repo to the GitHub repo, I used the default Dataform repository and everything worked well. However, I got this error message in Workflows after I connected to GitHub:
HTTP server responded with error code 400
in step "unhandled_exception", routine "main", line: 33
{
  "body": {
    "error": {
      "code": 400,
      "message": "NPM Execution error: npm ERR! code EUSAGE\nnpm ERR! \nnpm ERR! The `npm ci` command can only install with an existing package-lock.json or\nnpm ERR! npm-shrinkwrap.json with lockfileVersion >= 1. Run an install with npm@5 or\nnpm ERR! later to generate a package-lock.json file, then try again.\nnpm ERR! \nnpm ERR! Clean install a project\nnpm ERR! \nnpm ERR! Usage:\nnpm ERR! npm ci\nnpm ERR! \nnpm ERR! Options:\nnpm ERR! [-S|--save|--no-save|--save-prod|--save-dev|--save-optional|--save-peer|--save-bundle]\nnpm ERR! [-E|--save-exact] [-g|--global] [--global-style] [--legacy-bundling]\nnpm ERR! [--omit <dev|optional|peer> [--omit <dev|optional|peer> ...]]\nnpm ERR! [--strict-peer-deps] [--no-package-lock] [--foreground-scripts]\nnpm ERR! [--ignore-scripts] [--no-audit] [--no-bin-links] [--no-fund] [--dry-run]\nnpm ERR! [-w|--workspace <workspace-name> [-w|--workspace <workspace-name> ...]]\nnpm ERR! [-ws|--workspaces] [--include-workspace-root] [--install-links]\nnpm ERR! \nnpm ERR! aliases: clean-install, ic, install-clean, isntall-clean\nnpm ERR! \nnpm ERR! Run \"npm help ci\" for more info\n\nnpm ERR! Log files were not written due to the config logs-max=0\n",
      "status": "FAILED_PRECONDITION"
    }
  },
  "code": 400,
  "headers": {
    "Alt-Svc": "h3=\":443\"; ma=2592000,h3-29=\":443\"; ma=2592000",
    "Cache-Control": "private",
    "Content-Length": "1316",
    "Content-Type": "application/json; charset=UTF-8",
    "Date": "Wed, 03 May 2023 12:46:50 GMT",
    "Server": "ESF",
    "Vary": "Origin",
    "X-Content-Type-Options": "nosniff",
    "X-Frame-Options": "SAMEORIGIN",
    "X-Xss-Protection": "0"
  },
  "message": "HTTP server responded with error code 400",
  "tags": [
    "HttpError"
  ]
}
I've pushed the changes to the GitHub repo, whose default branch is "main". (The Dataform repo name is different from the GitHub repo name. Not sure if that matters?) I apologise that I can't share the GitHub repo.

I thought maybe I needed to configure an NPM package? After I configured the NPM package, I got a different error in Workflows. Has anyone had the same issue before? Thanks in advance.

HTTP server responded with error code 500
in step "unhandled_exception", routine "main", line: 33
{
  "body": {
    "error": {
      "code": 500,
      "message": "An internal error has occurred",
      "status": "INTERNAL"
    }
  },
  "code": 500,
  "headers": {
    "Alt-Svc": "h3=\":443\"; ma=2592000,h3-29=\":443\"; ma=2592000",
    "Cache-Control": "private",
    "Content-Length": "151",
    "Content-Type": "application/json; charset=UTF-8",
    "Date": "Wed, 03 May 2023 14:34:44 GMT",
    "Server": "ESF",
    "Vary": "Origin",
    "X-Content-Type-Options": "nosniff",
    "X-Frame-Options": "SAMEORIGIN",
    "X-Xss-Protection": "0"
  },
  "message": "HTTP server responded with error code 500",
  "tags": [
    "HttpError"
  ]
}

 

 

 

ACCEPTED SOLUTION

Problem solved! The Dataform repo name needs to be the same as the GitHub repo name.


Thanks for the hint... 
Works for us. 
It took me a while until I found the solution.

Glad it helped.

Unfortunately, the same problem has appeared again. I removed all of the SQLX files and restarted with a single simple SQLX file, and it still shows the error code 500 internal error. I see that you now have to configure release and workflow configurations in Dataform?
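
On the release configuration question: a minimal sketch, assuming a release configuration already exists in the Dataform repository. Instead of compiling from `gitCommitish` at run time, the workflow reads the release configuration and runs the compilation result it currently points to. The `<release_config_id>` placeholder and the step names are hypothetical, and the `releaseCompilationResult` field is taken from the v1beta1 `releaseConfigs` resource as I understand it, so please check it against the API reference before relying on it.

```yaml
main:
    steps:
    - init:
        assign:
          - repository: projects/<gcp_project_name>/locations/<dataform_repo_location>/repositories/<dataform_default_repo_name>
          # Hypothetical placeholder: the ID of an existing release configuration.
          - releaseConfig: ${repository + "/releaseConfigs/<release_config_id>"}
    - getReleaseConfig:
        # Read the release configuration to find its latest compilation result.
        call: http.get
        args:
            url: ${"https://dataform.googleapis.com/v1beta1/" + releaseConfig}
            auth:
                type: OAuth2
        result: release
    - createWorkflowInvocation:
        call: http.post
        args:
            url: ${"https://dataform.googleapis.com/v1beta1/" + repository + "/workflowInvocations"}
            auth:
                type: OAuth2
            body:
                # Assumed field name on the release configuration resource.
                compilationResult: ${release.body.releaseCompilationResult}
                invocationConfig:
                    includedTags: ["ga4"]
                    transitiveDependenciesIncluded: true
        result: workflowInvocation
    - complete:
        return: ${workflowInvocation.body.name}
```

With this shape the workflow itself no longer compiles from GitHub when it runs; whether that helps with the 500 error here is untested.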

Hi there,

I apologise, I think I missed your message. I didn't manage to fix the 500 internal error; instead I started a new repo without a GitHub connection. Since then I have connected it to GitHub again. It was working well, but today it failed:

 

"HTTP server responded with error code 400
in step "unhandled_exception", routine "main", line: 28: {"body":{"error":{"code":400,"message":"Remote repository 'https://github.com/xxxx/xxxx.git' could not be reached.","status":"INVALID_ARGUMENT"}}

 

I checked the troubleshooting page in the Google Dataform docs: https://cloud.google.com/dataform/docs/troubleshooting#remote_github_repository_cannot_be_reached_ge...


So it could be a dropped GitHub or GitLab connection. The docs say there is no need to take any action: unless the GitHub or GitLab issues persist, the subsequent scheduled releases can be successful.

Any thoughts?

I built a simple pipeline in a Dataform repo (connected to a GitHub repo) and execute it via Workflows in GCP. This morning, around 2:37 am, it failed. I had the same issue before and it has happened again. This is the error message I see in the Workflows execution logs:

 

"HTTP server responded with error code 400 in step "unhandled_exception", routine "main", line: 28: {"body":{"error":{"code":400,"message":"Remote repository 'https://github.com/xxxx/xxxx.git' could not be reached.","status":"INVALID_ARGUMENT"}}

 


Based on the error message and the troubleshooting guide, it appears that the connection to GitHub might have been temporarily disrupted. This could be due to various reasons such as a network outage, issues with GitHub servers, or firewall restrictions.

Immediate Recommendations:

  1. GitHub Server Status: Check the current status of GitHub servers on the GitHub Status page to see if there are any ongoing issues.

  2. Firewall Settings: Ensure your firewall settings permit traffic to and from GitHub servers.

  3. Re-authenticate: Consider re-authenticating your connection with GitHub in Dataform to refresh the connection.

Additional Troubleshooting Steps:

  1. Repository URL: Double-check that the GitHub repository URL configured in your Dataform project is correct.

  2. Local Execution: Try executing the Dataform workflow locally. If the issue is reproducible, it might provide more insights.

  3. Proxy Server: If you're utilizing a proxy server, ensure it's correctly configured to allow access to GitHub.
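
If the disruption really is temporary, one option is to let the workflow retry the compilation step instead of failing on the first attempt. Below is a minimal sketch, assuming the error body has the same shape as the one posted in this thread. The `is_remote_unreachable` subworkflow and the retry counts and delays are hypothetical choices, not values from the Dataform docs. A custom predicate is used because, as far as I know, the built-in `http.default_retry_predicate` does not retry a 400 response.

```yaml
main:
    steps:
    - init:
        assign:
          - repository: projects/<gcp_project_name>/locations/<dataform_repo_location>/repositories/<dataform_default_repo_name>
    - createCompilationResult:
        try:
            call: http.post
            args:
                url: ${"https://dataform.googleapis.com/v1beta1/" + repository + "/compilationResults"}
                auth:
                    type: OAuth2
                body:
                    gitCommitish: main
            result: compilationResult
        retry:
            # Retry only when the custom predicate below returns true.
            predicate: ${is_remote_unreachable}
            max_retries: 3
            backoff:
                initial_delay: 30   # seconds; illustrative value
                max_delay: 300
                multiplier: 2
    - done:
        return: ${compilationResult.body.name}

# Hypothetical predicate: true only for the "could not be reached" 400 error.
# Assumes e.body.error.message exists, as in the error shown in this thread.
is_remote_unreachable:
    params: [e]
    steps:
    - check:
        switch:
        - condition: ${not("HttpError" in e.tags)}
          return: false
        - condition: ${e.code == 400 and text.match_regex(e.body.error.message, "could not be reached")}
          return: true
    - otherwise:
        return: false
```

Any other error, or the same error after the last retry, is raised as before, so the rest of the workflow and its `except` handling can stay unchanged.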