I'm trying to get a batch prediction from a time series forecasting model trained with AutoML. I'm seeing the following error in the BigQuery "errors_validation" table:
"There are rows with non-empty target values after this row. The time series has been excluded from predictions."
(There are 7 such error messages, one for each time series. Each error message indicates "01/01/2023" as the timestamp.†)
I can't see how the error message could possibly be true.
- The granularity is weekly, and the forecast horizon is 26 weeks.
- For the batch prediction, I'm using a CSV file which consists of all the data used in training, plus an additional 26 weeks of future timestamps appended to it. (To be precise, each of the 26 future timestamps appears 7 times - once for each time series.)
- For each row with a future timestamp, the target column is empty (i.e., in the CSV there is nothing in the column whatsoever).
- I also tried a batch prediction sourced from a BigQuery table. I created the table from the CSV file described above. I examined the table and confirmed that there are null values in the target column wherever there is a future timestamp. I got the same errors.
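For anyone who wants to reproduce the input layout described above, here is a minimal sketch of how the future rows could be generated. The function name `future_rows` is hypothetical, and the category names and last historical week are taken from the CSV excerpt further down. It only illustrates the shape of the data and is not the exact script I used.

```python
from datetime import date, timedelta


def future_rows(last_week, categories, horizon_weeks):
    """Build one empty-target row per category for each future week."""
    rows = []
    for step in range(1, horizon_weeks + 1):
        week = last_week + timedelta(weeks=step)
        for category in categories:
            # The target column (middle field) is left empty on purpose.
            rows.append([week.strftime("%m/%d/%Y"), "", category])
    return rows


categories = [
    "Food",
    "Health & Fitness",
    "Learning & Development",
    "Office",
    "Other",
    "Team Activities",
    "Travel & Accommodations",
]

# Last historical week is 11/27/2022; the horizon is 26 weeks.
rows = future_rows(date(2022, 11, 27), categories, 26)
print(len(rows), rows[0], rows[-1])
```

These rows would then be appended to the training data to form the batch prediction input.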
I'm at a loss. Any help would be greatly appreciated.
† I'm not sure if this is significant in any way, but the forecast horizon actually starts on 12/04/2022 (earlier than the timestamp indicated by the errors). In any case, there is no data in the forecast horizon with a non-empty/non-null target value.
If it helps, here is an excerpt of the CSV file I'm using to get the batch predictions.
submission_week,weekly_expense_total,category
...
11/27/2022,320.49,Food
11/27/2022,385.88,Health & Fitness
11/27/2022,0,Learning & Development
11/27/2022,1318.37,Office
11/27/2022,450,Other
11/27/2022,0,Team Activities
11/27/2022,980.23,Travel & Accommodations   <- End of historical data
12/04/2022,,Food   <- First week of the forecast horizon
12/04/2022,,Health & Fitness
12/04/2022,,Learning & Development
12/04/2022,,Office
12/04/2022,,Other
12/04/2022,,Team Activities
12/04/2022,,Travel & Accommodations
... (followed by 25 additional weeks - forecast horizon is 26 weeks long)
As you can see, the target column is blank where the forecast horizon starts. In fact, it's blank for every row with a timestamp in the forecast horizon.
Thus the "There are rows with non-empty target values after this row" error message is really puzzling ("this row" being 01/01/2023 - not sure why that's the timestamp it calls out).
I also checked to see if the CSV contains any non-printable characters, but I couldn't find any.
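A check along these lines can be scripted. The sketch below is only a rough diagnostic: the helper names `rows_followed_by_targets` and `as_date` are hypothetical, and the three-row sample is a cut-down illustration in the same format as the excerpt above. For each series it reports empty-target rows that are followed by a non-empty target under a given ordering. It runs the check twice, once with the timestamps parsed as MM/DD/YYYY dates and once with them compared as plain text. Whether the prediction service ever orders the column as text is purely an assumption on my part, not something I have confirmed.

```python
import csv
import io
from collections import defaultdict
from datetime import datetime


def rows_followed_by_targets(csv_text, time_col, target_col, series_col, sort_key):
    """Per series, list timestamps of empty-target rows that come before
    a non-empty target when rows are ordered by sort_key."""
    series = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        target = (row[target_col] or "").strip()
        series[row[series_col]].append((row[time_col], target))

    offending = {}
    for name, rows in series.items():
        rows.sort(key=lambda r: sort_key(r[0]))
        # Position of the last row that still carries a target value.
        last_filled = max((i for i, (_, t) in enumerate(rows) if t), default=-1)
        bad = [ts for i, (ts, t) in enumerate(rows) if not t and i < last_filled]
        if bad:
            offending[name] = bad
    return offending


def as_date(value):
    # Assumes the MM/DD/YYYY format used in the excerpt.
    return datetime.strptime(value, "%m/%d/%Y")


sample = """submission_week,weekly_expense_total,category
11/27/2022,320.49,Food
12/04/2022,,Food
01/01/2023,,Food
"""

cols = ("submission_week", "weekly_expense_total", "category")
chronological = rows_followed_by_targets(sample, *cols, sort_key=as_date)
as_text = rows_followed_by_targets(sample, *cols, sort_key=str)
print(chronological)  # {}
print(as_text)        # {'Food': ['01/01/2023']}
```

With real dates the sample is clean, but under a plain text ordering "01/01/2023" sorts ahead of the 2022 rows. That would at least be consistent with the timestamp named in the errors, though it remains a guess.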
I was able to get batch predictions after making the following changes to the data and training a new model: