Hi @Pedrohenr1,
Welcome to Google Cloud Community!
The error message "Internal error encountered" with code 13 and the worker pool exit code 1, along with the inaccessible logs, points to a problem within the Google Cloud Document AI infrastructure during your processor training.
Understanding the Error:
- Internal Error Code 13: Code 13 is the standard gRPC status INTERNAL. It indicates an issue within the Document AI service itself, not necessarily related to your training data.
- Replica Workerpool Exit: Exit code 1 means a training worker process terminated with an error rather than finishing normally, so the failure happened while the training job was running.
- Limited Information: Lack of access to detailed logs due to permission errors makes pinpointing the exact cause challenging.
Troubleshooting Steps:
Retry the Training: Internal errors are sometimes transient. Rerunning the training job a few times, with some time between attempts, may resolve the issue.
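If you start training from a script rather than the console, the retry advice can be sketched as a small backoff loop. This is only an illustration: `job` is a hypothetical callable standing in for whatever you use to launch training and wait for the result, and it is assumed to raise `RuntimeError` on failure. Neither is part of the Document AI client library.

```python
import time


def retry_with_backoff(job, attempts=3, base_delay=60.0, sleep=time.sleep):
    """Call job() until it succeeds or the attempts run out.

    job is a hypothetical callable that launches training and raises
    RuntimeError on failure (an assumption, not a Document AI API).
    The wait doubles after each failed attempt.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            return job()
        except RuntimeError as err:
            last_error = err
            # Do not wait after the final failed attempt.
            if attempt < attempts - 1:
                sleep(base_delay * 2 ** attempt)
    raise last_error
```

If every attempt fails with the same internal error, that is a good sign the problem is not transient and is worth raising with support.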
Resource Limits:
- Document Size: Document AI has size limits for training documents. While 1916 documents might not be too many, their combined size could exceed the limit. Try splitting your training data into smaller batches and retry the training.
- Memory: Processor training runs on Google-managed infrastructure, so you cannot resize a VM or cluster for it yourself. If the job is running out of memory, reducing the size or number of training documents is the lever you control.
- CPU: Likewise, CPU for the training job is allocated by the service. If you suspect the job is under-resourced, mention that when you contact support rather than trying to change it on your side.
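To try the suggestion of splitting the training data into smaller batches, something like the sketch below could group files so that each batch stays under a size cap. The cap (`max_batch_bytes`) is a placeholder you would choose yourself; it is not a documented Document AI limit.

```python
def split_into_batches(files, max_batch_bytes):
    """Greedily group (name, size_in_bytes) pairs into batches.

    max_batch_bytes is an assumed cap chosen by you, not an official limit.
    A single file larger than the cap ends up alone in its own batch.
    """
    batches = []
    current = []
    current_size = 0
    for name, size in files:
        # Start a new batch when adding this file would exceed the cap.
        if current and current_size + size > max_batch_bytes:
            batches.append(current)
            current = []
            current_size = 0
        current.append(name)
        current_size += size
    if current:
        batches.append(current)
    return batches
```

Training on one smaller batch first can also help narrow down whether a particular group of documents triggers the failure.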
Data Issues:
- Incorrect File Formats: Document AI supports specific file types (such as PDF, TIFF, JPEG, and PNG). Ensure your documents and datasets are in a supported format.
- Data Quality: Check for corrupt files, incomplete data, or inconsistent formatting within the training documents. Even a small number of bad files can disrupt the training process.
- Data Balance: Ensure you have a sufficiently balanced dataset. If one type of document dominates the dataset, it might hinder the model's ability to generalize effectively.
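A quick local check can cover the data points above before you retry. The sketch below flags files with unexpected extensions or zero size, and reports how the labels are distributed. The allowed extensions, and the assumption that you have one label per document in a Python list, are illustrative; adjust them to your dataset.

```python
from collections import Counter
from pathlib import Path

# Assumed allow-list for illustration; extend it to match the formats you use.
ALLOWED_SUFFIXES = {".pdf", ".jpg", ".jpeg", ".png"}


def list_files(root):
    """Return (path, size_in_bytes) for every file under root."""
    return [(str(p), p.stat().st_size) for p in Path(root).rglob("*") if p.is_file()]


def find_suspect_files(files):
    """Flag files with an unexpected extension or zero length."""
    suspects = []
    for name, size in files:
        suffix = Path(name).suffix.lower()
        if suffix not in ALLOWED_SUFFIXES:
            suspects.append((name, "unsupported extension"))
        elif size == 0:
            suspects.append((name, "empty file"))
    return suspects


def label_shares(labels):
    """Return the fraction of documents carrying each label."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: count / total for label, count in counts.items()}
```

For example, `find_suspect_files(list_files("training_docs"))` would scan a local folder (the folder name here is made up). This only catches obvious problems; a file can have the right extension and still be corrupt, so opening a sample of documents by hand is still worthwhile.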
You can read through this documentation for more details on how Document AI allows you to train new processor versions.
If the issue persists, I suggest contacting Google Cloud Support, as they can provide more insights and take a deeper look at your issue. Provide them with detailed information about the problem, including error messages, steps you've taken, and the urgency of resolving the issue.
I hope the above information is helpful.