
Glossary application in Google Translate custom AutoML models

I am using a TMS to perform translations by connecting to my custom AutoML models. When I don't use glossaries in the TMS setup, the engine seems to behave normally (with the usual mistakes an MT engine can make), but when I use a complex glossary of several thousand entries for my company's product names, I start seeing odd behavior such as duplicated terms and important parts of the text being dropped.

Is there a document that explains the mechanism Google Translate V3 uses when applying a glossary to a sentence? I don't think it's just a straight search and replace. Can you shed some light?


Good day @Michel_FC,

Welcome to Google Cloud Community!

One possible reason for this issue is that the model has not been re-trained in several months and was created with a previous version of AutoML (for example, v1beta1). Such models must be re-trained to ensure the stability and consistency of translations and to pick up the upgrades released since the original training.

Here is the documentation on how to create and use a glossary: https://cloud.google.com/translate/docs/advanced/glossary#create_a_glossary

Also take note of stopwords: Cloud Translation ignores some terms that are included in a glossary when they are stopwords. https://cloud.google.com/translate/docs/advanced/glossary#stopwords
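To narrow down whether the glossary or the model is responsible, it may help to call the V3 API directly, outside the TMS, and compare the output with and without the glossary for the same sentence. Below is a minimal sketch using the `google-cloud-translate` Python client. The function names, the project, model, and glossary IDs, and the `us-central1` default are placeholders and assumptions for illustration, not values from this thread. When a `glossary_config` is supplied, the response carries both `translations` (no glossary applied) and `glossary_translations` (glossary applied), so one call gives both versions side by side.

```python
from typing import Any, Dict, Optional, Tuple


def build_translate_request(
    project_id: str,
    text: str,
    source_lang: str,
    target_lang: str,
    model_id: Optional[str] = None,
    glossary_id: Optional[str] = None,
    location: str = "us-central1",  # assumption: model and glossary live here
) -> Dict[str, Any]:
    """Build a translate_text request dict (hypothetical helper)."""
    parent = f"projects/{project_id}/locations/{location}"
    request: Dict[str, Any] = {
        "parent": parent,
        "contents": [text],
        "mime_type": "text/plain",
        "source_language_code": source_lang,
        "target_language_code": target_lang,
    }
    if model_id:
        # Full resource name of the custom model.
        request["model"] = f"{parent}/models/{model_id}"
    if glossary_id:
        # Full resource name of the glossary.
        request["glossary_config"] = {
            "glossary": f"{parent}/glossaries/{glossary_id}"
        }
    return request


def translate_both_ways(request: Dict[str, Any]) -> Tuple[str, str]:
    """Return (without_glossary, with_glossary) for the first input text.

    Expects a request that includes a glossary_config.
    Requires google-cloud-translate and application default credentials.
    """
    # Imported here so the request builder can be used without the library.
    from google.cloud import translate_v3 as translate

    client = translate.TranslationServiceClient()
    response = client.translate_text(request=request)
    without_glossary = response.translations[0].translated_text
    with_glossary = response.glossary_translations[0].translated_text
    return without_glossary, with_glossary
```

Running `translate_both_ways(build_translate_request("my-project", "Some sentence with a product name.", "en", "fr", model_id="my-model-id", glossary_id="my-glossary"))` on a few of the problem sentences should show whether the duplicated or dropped text only appears in the glossary-applied output.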

You can also reach out to Google Cloud Support: https://cloud.google.com/support

We are using a third party to train the models, and I do know that the French model where I have seen this behavior was trained over a year ago (March 2022). Do you have a precise cut-off date, or is there a way for me to check which version of AutoML was used for the model?

Thank you.

I see that the last release of AutoML seems to be from 10/28/2022. Are you saying that all models created before 10/28/2022 may misbehave with glossaries? Thanks for the clarification.

https://cloud.google.com/translate/automl/docs/release-notes