"Vertex AI was unable to import data into dataset "[dataset]"
Hello Vertex AI Customer,
Due to an error, Vertex AI was unable to import data into dataset "[dataset]".
Additional Details: Operation State: Failed with errors
Resource Name: [resource]
Error Messages: Internal error occurred. Please retry in a few minutes. If you still experience errors, contact Vertex AI.
To view the error on Cloud Console, go back to your dataset using [link]
Sincerely, The Google Cloud AI Team
Following the link takes me back to the dataset import page, where it says "Unable to import data due to errors". If I click "Details" it has the same error message as the email. It won't allow me to browse the dataset or train a new model, because it failed to import.
What puzzles me is that sometimes it imports successfully or partially, sometimes it doesn't, with nearly identical datasets on the very same day, leading me to believe the file size, CSV formatting, and character encoding must be acceptable. Even so, I have tried:
I don't know what the problem is because the error message does not say anything meaningful. Any help is appreciated.
The error message is indeed vague, but this failure is commonly caused by CSV formatting. I see that you have already checked your CSV; you can compare it against the documentation on accepted CSV formats for AutoML (or whichever AutoML training type applies to your use case) here. If the problem persists, I would suggest filing a support case.
As another reference, there is a community post with an answer to a similar question here.
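If it helps to rule out formatting before filing a support case, a local check can be sketched roughly as follows. This is only a minimal sketch under stated assumptions: the function name `check_import_csv` is made up for illustration, and the assumed layout (content in the first column, labels in the remaining columns) is not taken from this thread or confirmed against the official requirements. It checks only two generic things, UTF-8 decoding and a non-empty first column.

```python
import csv
import io


def check_import_csv(raw: bytes) -> list[str]:
    """Return a list of human-readable problems found in an import CSV.

    Assumption (hypothetical layout): the first column holds the text or a
    gs:// URI, and any remaining columns hold labels.
    """
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError as exc:
        # Stop early: nothing else can be checked reliably.
        return [f"not valid UTF-8 at byte {exc.start}"]

    problems = []
    # newline="" leaves line endings untouched so the csv module handles them.
    reader = csv.reader(io.StringIO(text, newline=""))
    for row_number, row in enumerate(reader, start=1):
        # csv.reader yields an empty list for a blank line.
        if not row or not row[0].strip():
            problems.append(f"row {row_number}: empty content column")
    return problems
```

To use it, read the import file in binary mode (for example `open("import.csv", "rb").read()`, where the filename is a placeholder) and pass the bytes in. An empty result only means these two checks passed, not that the file meets every import requirement.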
Thank you for the response. Just to be sure, I checked my CSV file against the formatting and data requirements for importing a multi-label text classification dataset into AutoML:
My intuition is that the dataset may be too large. However, larger datasets have imported successfully in the past: for example, one was 263 MB with 220,863 documents, and it partially imported. Because some documents were lost, I had to manually remove labels on the dataset page that no longer applied to at least 10 documents before I could train a new model, but that is acceptable to me. It had a lot of error messages like this (where IMPORTFILE is the CSV dataset import file):
Error: Unable to get storage client in 10 retries for element: for: gs://IMPORTFILE line X
As far as I can tell, there is nothing special about the lines where it fails. Maybe something similar is happening behind the scenes when a dataset entirely fails to import. It looks like some kind of internal network error. I am still open to any ideas.
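For the partial-import case, one way to act on those messages would be to collect the failing line numbers and re-import only those rows. The following is a rough sketch, not a confirmed workflow. It assumes the error messages can be copied out as plain text, that "line X" is a 1-based line number in the import file, and that every row occupies exactly one line. The function names and the regular expression are illustrative, modelled only on the single error line quoted above.

```python
import re

# Matches only the error shape quoted in this post; other error variants
# are unknown and are deliberately ignored.
FAILED_LINE = re.compile(r"Unable to get storage client .* line (\d+)\s*$")


def failed_line_numbers(error_messages):
    """Return the sorted, de-duplicated line numbers named in the errors."""
    numbers = set()
    for message in error_messages:
        match = FAILED_LINE.search(message)
        if match:
            numbers.add(int(match.group(1)))
    return sorted(numbers)


def rows_to_retry(csv_lines, line_numbers):
    """Pick the rows at the given 1-based line numbers (assumed one row per line)."""
    wanted = set(line_numbers)
    return [line for number, line in enumerate(csv_lines, start=1) if number in wanted]
```

The selected rows could then be written to a smaller CSV and imported again. If the failures really are transient, a second attempt on just those rows may succeed, though that is a guess rather than something the error message confirms.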