Hello,
Can we use GCP Artifact Registry for Dataform private packages?
In Google Cloud, the Artifact Registry is designed to store container images, language packages, and other artifacts. For Dataform, a tool for managing data transformation and modeling in BigQuery, the focus is on managing SQL-based project code and dependencies.
The recommended approach for Dataform projects, especially when dealing with private components or reusable code, is to use a version control system (VCS) like GitHub, GitLab, or Bitbucket. This allows you to version control your SQL scripts, Dataform scripts (.js files), and configurations, facilitating collaboration, code reviews, and essential development workflows.
Artifact Registry doesn't have a feature specifically for Dataform packages; it is better suited to Docker images, Maven packages, npm packages, and similar artifacts. To share and reuse code across Dataform projects, consider using a VCS to manage Dataform dependencies and private, reusable components, leveraging features like Git submodules.
Hello @himkush, we recently had a similar use case. We have a lot of Dataform repositories that share common JavaScript functions. We decided to publish them as a JavaScript npm package in GCP Artifact Registry. This let us version-control JavaScript dependencies more easily across independent Dataform repositories. We followed these steps:
[1] Generate the .npmrc file content by following this documentation: https://cloud.google.com/artifact-registry/docs/nodejs/authentication#auth-password
[2] Add that .npmrc file directly to the top-level folder of the Dataform repository. The recommended approach is to also store the "password" token generated in step 1 as a secret in Google Cloud Secret Manager, and then add that secret to the Dataform repository's settings. Documentation: https://cloud.google.com/dataform/docs/private-packages#npmrc-token
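For reference, here is a rough sketch of what the .npmrc content from step 1 tends to look like, written as a small JavaScript helper so the pieces are easy to see. The line layout is my reading of the password-authentication documentation linked in step 1, so please check it against that page. The function name buildNpmrc and every value passed to it (scope, location, project, repository, password) are made-up placeholders, not values from our setup.

```javascript
// Sketch only: assembles .npmrc content for an Artifact Registry npm repository.
// The layout is assumed from the password-authentication docs linked in step 1;
// buildNpmrc and all argument values below are hypothetical placeholders.
function buildNpmrc({ scope, location, projectId, repository, base64Password }) {
  // Artifact Registry npm hosts follow the pattern LOCATION-npm.pkg.dev.
  const host = `${location}-npm.pkg.dev/${projectId}/${repository}/`;
  return [
    // Route packages under this npm scope to the private registry.
    `@${scope}:registry=https://${host}`,
    // Credentials for that registry (base64-encoded service account key).
    `//${host}:_password="${base64Password}"`,
    `//${host}:username=_json_key_base64`,
    `//${host}:email=not.valid@email.com`,
    `//${host}:always-auth=true`,
  ].join("\n") + "\n";
}

const npmrc = buildNpmrc({
  scope: "my-scope",                 // hypothetical npm scope
  location: "europe-west1",          // hypothetical region
  projectId: "my-project",           // hypothetical project ID
  repository: "dataform-packages",   // hypothetical repository name
  base64Password: "BASE64_PASSWORD", // placeholder, never commit a real value
});

console.log(npmrc);
```

Since the password line is a credential, it should not be committed as plain text, which is why step 2 recommends keeping it in Secret Manager.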
Hope this helps!