Dear Community,
While training a Google Document Classifier, the training failed with the errors below. On inspecting the errors, I noticed that all failed documents belong to classification labels that had been disabled due to insufficient data at the time.
I am wondering why Google is trying to import documents that are matched against a disabled label. Shouldn't they be skipped by default? Any advice on how to solve this and where to go from here would be appreciated. Should I delete the documents from the bucket? (Side issue: re-syncing storage does not work. Items deleted from the bucket remain even after a sync, as the total document count does not decrease.) Or should I delete the label for now?
"code": 3, "message": "Invalid document.", "details": [ { "@type": "type.googleapis.com/google.rpc.ErrorInfo", "reason": "INVALID_DOCUMENT", "domain": "documentai.googleapis.com", "metadata": { "document": "gs://PATH TO DOCUMENT", "reason": "The document: doesn't contain any ground-truth entity defined in the Schema."
Hi @sonicsw,
Welcome to Google Cloud Community!
It appears you're experiencing issues with Google Document AI attempting to process documents linked to disabled or inadequately populated classification labels.
Here are potential solutions that might help you resolve the issue:
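As one illustration (a rough sketch, not an official procedure), you could check before training which labeled documents carry no entity of an enabled label, since those are the ones likely to trigger the INVALID_DOCUMENT error. The sketch assumes the labeled documents are available as Document JSON dictionaries in which the class label is stored as the `type` of an entry in `entities`. The function names, file names, and label names below are hypothetical.

```python
from __future__ import annotations


def has_enabled_label(document: dict, enabled_labels: set[str]) -> bool:
    """Return True if at least one entity type in the document is an enabled label."""
    # Assumption: the class label is stored as the "type" of an entity.
    entities = document.get("entities") or []
    return any(entity.get("type") in enabled_labels for entity in entities)


def find_documents_without_enabled_label(
    documents: dict[str, dict], enabled_labels: set[str]
) -> list[str]:
    """Return the names of documents that have no entity of an enabled label."""
    return sorted(
        name
        for name, document in documents.items()
        if not has_enabled_label(document, enabled_labels)
    )


# Hypothetical sample data: parsed Document JSON keyed by file name.
documents = {
    "invoice_01.json": {"entities": [{"type": "invoice"}]},
    "memo_07.json": {"entities": [{"type": "memo"}]},  # label "memo" is disabled
    "blank_03.json": {},  # no labels at all
}
enabled = {"invoice"}

print(find_documents_without_enabled_label(documents, enabled))
```

The documents it reports are candidates to relabel, move out of the dataset, or delete before retraining. Whether that resolves the error depends on how your dataset is set up.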
For more information about Custom Document Classifiers, you can read this documentation.
If the issue persists, I suggest contacting Google Cloud Support as they can provide more insights to see if the behavior you've encountered is a known issue or specific to your project.
I hope the above information is helpful.