Hello,
I have some custom legacy machine translation models that I'd like to upgrade since AutoML API will be deprecated in September this year.
I noticed that only models that don't have an associated legacy dataset are available to upgrade by themselves. All my legacy models have an associated legacy dataset so I was wondering what would be the best practice to upgrade these models.
Thanks in advance.
Hi @elgonher,
Welcome to Google Cloud Community!
You're correct: the restriction on upgrading legacy models that have an associated dataset is a real obstacle when migrating off AutoML Translation ahead of its deprecation.
The issue arises because the modern Translation API (the Advanced Translation API, generally referred to as "Custom Translation" in the Cloud Console) handles datasets and models separately, whereas legacy AutoML Translation tightly coupled them.
Since your legacy models are intrinsically tied to their datasets, you'll essentially need to recreate the models using the modern approach.
Given the associated legacy datasets, here are the steps you can follow to upgrade your custom legacy machine translation models from AutoML Translation:
1. Export Data: Export the parallel data (source & target language pairs) from your legacy AutoML Translation datasets.
2. Create New Datasets: Create new Translation Datasets within the Cloud Translation API, and import your exported data into these datasets.
3. Train New Models: Train new custom translation models using the Advanced Translation API, selecting the new Translation Datasets you created.
4. Evaluate: Thoroughly evaluate the performance of your new models against your legacy models.
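To make steps 1, 2, and 4 a little more concrete, here is a minimal sketch of preparing exported parallel data before importing it into a new dataset. It assumes you already have the segment pairs in memory as (source, target) tuples and that you will import them as tab-separated files, which to my knowledge is one of the accepted import formats. The function names (`clean_pairs`, `split_holdout`, `write_tsv`) and the 10% holdout fraction are illustrative choices of mine, not part of any Google API, and the code does not call the Translation API at all.

```python
import io
import random


def normalize(text):
    """Collapse all whitespace (including tabs and newlines) to single spaces."""
    return " ".join(text.split())


def clean_pairs(pairs):
    """Normalize segments, drop empty ones and exact duplicates, keep order."""
    seen = set()
    cleaned = []
    for source, target in pairs:
        pair = (normalize(source), normalize(target))
        if not pair[0] or not pair[1]:
            continue  # skip pairs with a missing side
        if pair in seen:
            continue  # skip exact duplicates
        seen.add(pair)
        cleaned.append(pair)
    return cleaned


def split_holdout(pairs, holdout_fraction=0.1, seed=0):
    """Shuffle deterministically and reserve a fraction for comparing
    the legacy model against the newly trained one (step 4)."""
    shuffled = list(pairs)
    random.Random(seed).shuffle(shuffled)
    n_holdout = int(len(shuffled) * holdout_fraction)
    return shuffled[n_holdout:], shuffled[:n_holdout]


def write_tsv(pairs, handle):
    """Write one 'source<TAB>target' pair per line."""
    for source, target in pairs:
        handle.write(f"{source}\t{target}\n")


if __name__ == "__main__":
    # Hypothetical sample data standing in for an exported legacy dataset.
    exported = [
        (" Hello ", "Hola"),
        ("Hello", "Hola"),          # duplicate after normalization
        ("", "sin origen"),         # empty source, dropped
        ("Good\tmorning", "Buenos\ndías"),
    ]
    buffer = io.StringIO()
    write_tsv(clean_pairs(exported), buffer)
    print(buffer.getvalue(), end="")
```

Keeping the held-out pairs out of the training file gives you a fixed set of sentences to translate with both the legacy and the new model, so the comparison in step 4 is made on data that neither import has been tuned against.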
You can also refer to the following documents for more details:
Was this helpful? If so, please accept this answer as “Solution”. If you need additional assistance, reply here within 2 business days and I’ll be happy to help.
Hi ibaui,
Thank you for your answer.
I was investigating a bit further and was wondering if I could just upgrade the legacy datasets first and then the legacy models. I found this option within the datasets section:
Select existing AutoML (legacy) datasets to manage through the Cloud Translation API instead of the AutoML API. During the upgrade process, a new dataset is created that is a copy of your existing dataset with a new ID. Models that are associated with the upgraded datasets are also upgraded. Your existing legacy datasets and models remain accessible and unchanged during and after the upgrade process.
As I have quite a few models, I guess this would save me from manually creating new datasets and training new models. Am I correct?
Thanks,
Elvira