I am migrating from Dataform Web to BigQuery (BQ) Dataform on GCP. I have created the Dataform repository, connected it to my GitLab repository, and created a development workspace, and all of that is working. I have also made the necessary changes in package.json and dataform.json. I have yet to create release configurations and workflow configurations. I have a few basic questions on the next steps.
Appreciate any advice on this migration.
You should turn off the scheduling on Dataform Web only after successfully migrating and verifying all schedules in BQ Dataform. It's crucial to ensure that the new setup in BQ Dataform is functioning as expected to avoid any disruptions in your data pipeline.
Estimated Migration Time
The migration time is subjective and depends on various factors including the complexity and quantity of your schedules and data. Allocate ample time for the migration, including additional time for testing, verification, and addressing any unforeseen issues.
Migration Advice
Additional Tips
Conclusion
Your migration endeavor is a significant task, and meticulous planning and execution are paramount. Your outlined approach is solid, and incorporating these additional suggestions can further enhance the migration process. Wishing you a smooth and successful migration!
Thank you for the tips and advice. I will follow them when migrating mine.
I have started the migration by connecting to the GitLab repository, and I created a token for GitLab access in GCP Dataform. Since then, however, I have been getting this error on Dataform Web:
Error: 2 UNKNOWN: fatal: unable to access 'https://dataform:[token]@gitlab.com/xxxxx/xxxx/': The requested URL returned error: 403
I have not changed anything on Dataform Web. The error is related to the GitLab token, but I do not remember changing anything on Dataform Web or deleting any token by accident. My schedules still run on Dataform Web, but any new code or changes will not be compiled by Dataform Web.
The error message you are seeing indicates that Dataform Web is unable to access your GitLab repository using the provided token. This could be due to several reasons:
Troubleshooting Steps:
Token Verification: Ensure the token is valid and hasn't expired by logging into GitLab and navigating to "Personal Access Tokens" under your profile. Here, you can view all tokens and their expiration dates.
Token Permissions: Verify that the token has the necessary permissions. This can be checked by editing the token's permissions in GitLab.
Private Repository Access: If your repository is private, ensure the token is linked to a GitLab user with appropriate access rights.
Restart Dataform Web: Sometimes, simply restarting the application can resolve connectivity issues.
Contact Support: If the issue persists, consider reaching out to Dataform's support team for further assistance.
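The token verification and permission checks above can also be scripted. The following is a minimal sketch, assuming the token is a personal access token and that your GitLab version exposes the `GET /api/v4/personal_access_tokens/self` endpoint. The helper names `fetch_token_info` and `check_token` are hypothetical, and the set of required scopes is an assumption about what repository read and write access over HTTPS needs.

```python
import json
import urllib.request
from datetime import date

# Assumption: scopes needed to read and push the repository over HTTPS.
REQUIRED_SCOPES = {"read_repository", "write_repository"}


def fetch_token_info(token, base_url="https://gitlab.com"):
    """Ask GitLab to describe the token that authenticates this request."""
    req = urllib.request.Request(
        f"{base_url}/api/v4/personal_access_tokens/self",
        headers={"PRIVATE-TOKEN": token},
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)


def check_token(info, today=None):
    """Return a list of problems found in the token metadata (empty if none)."""
    today = today or date.today()
    problems = []
    if info.get("revoked"):
        problems.append("token is revoked")
    expires_at = info.get("expires_at")  # "YYYY-MM-DD", or None if no expiry
    # Treat the expiry date itself as already expired.
    if expires_at and date.fromisoformat(expires_at) <= today:
        problems.append(f"token expired on {expires_at}")
    missing = REQUIRED_SCOPES - set(info.get("scopes", []))
    if missing:
        problems.append("missing scopes: " + ", ".join(sorted(missing)))
    return problems
```

Calling `check_token(fetch_token_info("<your token>"))` should return an empty list for a healthy token. An HTTP 401 raised by `fetch_token_info` would itself suggest that GitLab no longer accepts the token.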
Additional Tips:
New Token: Consider generating a new token and integrating it with Dataform Web.
Clone Test: Try cloning your repository using the token. Successful cloning indicates the token's validity and permissions.
Network Test: Attempt accessing GitLab from an alternate network to rule out any network-related issues.
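The clone test can be approximated without downloading the repository by using `git ls-remote`, which only lists remote refs. This is a rough sketch, assuming `git` is installed locally. The helper names are hypothetical, and the `dataform` username simply mirrors the URL shown in the error message.

```python
import os
import subprocess
from urllib.parse import quote


def build_remote_url(token, repo_path, host="gitlab.com", user="dataform"):
    """Build an HTTPS remote URL with the token embedded as the password."""
    return f"https://{user}:{quote(token, safe='')}@{host}/{repo_path.strip('/')}.git"


def can_read_repo(token, repo_path):
    """Return (ok, stderr) from `git ls-remote`, which lists refs without cloning."""
    result = subprocess.run(
        ["git", "ls-remote", "--heads", build_remote_url(token, repo_path)],
        capture_output=True,
        text=True,
        timeout=60,
        # Fail instead of prompting for credentials if authentication is rejected.
        env={**os.environ, "GIT_TERMINAL_PROMPT": "0"},
    )
    return result.returncode == 0, result.stderr.strip()
```

For example, `can_read_repo("<your token>", "group/project")` returning `(False, "... error: 403")` from your own machine would point at the token or its permissions rather than at Dataform Web. Note that this only exercises read access; write access would need an actual push to confirm. The returned stderr may include the URL, so avoid pasting it anywhere public.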
Impact on Schedules:
While your schedules will continue to run on Dataform Web, any new code changes or modifications to existing schedules won't be possible until the connectivity issue is resolved.
Thanks again for the quick response. I checked and tried the first three troubleshooting steps. Although I'm no longer getting the compilation error, I still receive the same error message. I have replaced the token with a new one that has these permissions:
api, read_api, read_user, create_runner, read_repository, write_repository, read_registry, write_registry
I am the owner of the repository, so my personal token should not be an issue. As for restarting Dataform Web, how do you even do that if it is running in the cloud? Is there a way to restart it? My apologies for not being aware of such a function.