
Training pipeline failed with error message: The timestamp column must have valid timestamp entries.

I'm getting this error message for a simple tabular text classification training job. 

"Training pipeline failed with error message: The timestamp column must have valid timestamp entries."

The CSV file has just two columns - text and tags. Sample records attached.

There is no timestamp data. Why is the training job forcing me to create a timestamp column?

Thanks!

"text","tags"
"My invoice INV-20245348 is showing an incorrect amount; could you please review?","Enquiry_Invoice"
"I need a copy of my contract which was signed on 2023-08-16","Contract"
"I'd like to renew my current contract which expires on 2024-04-02","Renew_Contract"
"Could you please provide the pricing details for the current contract?","Pricing_Enquiry"
"I'm having trouble processing my payment","Complaint_Payment"

ACCEPTED SOLUTION

Hi @rameshsnew

Welcome to the Google Cloud Community!

A Vertex AI tabular training job can require a timestamp column when it is set up for time-based behavior, such as forecasting or a chronological (time-column) data split. Even though your task is text classification, the job may still look for a timestamp column if one of those settings is active.

It's also possible that your pipeline settings or dataset configuration mistakenly assume there's a timestamp column in your data.

Here are some possible solutions that might help resolve the issue:

  1. Add a Dummy Timestamp Column - The simplest fix is to add a dummy timestamp column to your CSV file. For instance, you could add a column named "timestamp" and fill it with the current date and time, or any constant value, for all rows (see the pandas sketch after this list). After adding the column, you may need to configure your training job in Vertex AI to explicitly exclude this dummy column from being used as a feature.
  2. Verify Dataset Configuration in Vertex AI - Open the Vertex AI section of the Google Cloud Console and go to the Datasets page to locate the dataset used for training. Review the dataset configuration carefully, paying close attention to settings such as time columns, data splitting methods, and column specifications or transformations. Verify that no column is mistakenly labeled as a timestamp column.
  3. Check Data Splitting Settings - When setting up your training job, make sure the data is split correctly into training, validation, and test sets. If there is an option to split by a time column, disable it or point it at a valid time-related column (if you actually need one). A manual or random split avoids timestamp-related issues; a rough SDK sketch showing a random fraction split follows the documentation link below.
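
For option 1, here is a minimal sketch of adding a constant dummy timestamp with pandas. The file names and the "timestamp" column name are placeholders for illustration, not something Vertex AI requires:

    import pandas as pd

    # Load the original two-column file (text, tags).
    df = pd.read_csv("tickets.csv")

    # Add a "timestamp" column holding the same constant RFC 3339 value for every row.
    df["timestamp"] = "2024-01-01T00:00:00Z"

    # Write the file you will upload to Cloud Storage / Vertex AI.
    df.to_csv("tickets_with_timestamp.csv", index=False)

Just remember to exclude the dummy column from the feature set when you configure the training job, so it has no influence on the model.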

For more information, you may read the documentation on Best practices for creating tabular training data.
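
For points 2 and 3, a rough sketch with the Vertex AI Python SDK (google-cloud-aiplatform) might look like the following. It declares the column transformations explicitly and uses a random fraction split, so nothing is treated as a time column. The project, bucket path, and display names are placeholders, and you should double-check the parameter names against the current SDK documentation:

    from google.cloud import aiplatform

    # Placeholders: replace with your own project, region, and Cloud Storage path.
    aiplatform.init(project="my-project", location="us-central1")

    dataset = aiplatform.TabularDataset.create(
        display_name="support-tickets",
        gcs_source="gs://my-bucket/tickets_with_timestamp.csv",
    )

    job = aiplatform.AutoMLTabularTrainingJob(
        display_name="ticket-classifier",
        optimization_prediction_type="classification",
        # Only the columns listed here are used as features, which keeps the
        # dummy timestamp column out of the model.
        column_specs={"text": "text"},
    )

    model = job.run(
        dataset=dataset,
        target_column="tags",
        # Random fraction split instead of a chronological (time-column) split.
        training_fraction_split=0.8,
        validation_fraction_split=0.1,
        test_fraction_split=0.1,
        model_display_name="ticket-classifier-model",
    )

This is only a sketch of the idea; the key points are the explicit column_specs and the fraction-based split arguments, which avoid any implicit timestamp handling.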

Was this helpful? If so, please accept this answer as “Solution”. If you need additional assistance, reply here within 2 business days and I’ll be happy to help.

 


Thank you!