Hello, I am running into an issue with a Vertex AI dataset. I created a dataset and imported JSONL data from a GCS bucket. After that first import, every data point is labeled correctly. However, when I import new JSONL data into the same, already populated dataset, the new items show up as unlabeled in the console. The new JSONL data uses the same schema as the earlier data, and a file in this format is labeled correctly when I import it into an empty dataset.
I hit this issue while implementing the import in Python, and I have reviewed some of the Google documentation, such as:
The linked pages focus on importing JSONL data into an empty dataset, but I need to import data into an existing, already populated Vertex AI dataset.
If anybody has a solution to the above issue, please share the code or any relevant links. Thank you.
There might be inconsistencies in the values of the JSONL file that you are appending. It would help to post a sample of the added records and of the original JSONL file for comparison.
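As a rough way to do that comparison yourself, here is a minimal Python sketch that lists the key paths present in one JSONL file but not the other. It uses only the standard library and works on the file contents as strings, so it assumes you have already downloaded both files from GCS. The function names and the sample records (`textContent`, `classificationAnnotation`) are made up for illustration and may not match your actual schema.

```python
import json


def key_paths(value, prefix=""):
    """Return the set of dotted key paths found in a decoded JSON value."""
    paths = set()
    if isinstance(value, dict):
        for key, child in value.items():
            path = f"{prefix}.{key}" if prefix else key
            paths.add(path)
            paths |= key_paths(child, path)
    elif isinstance(value, list):
        # List elements share one path, marked with "[]".
        for child in value:
            paths |= key_paths(child, f"{prefix}[]")
    return paths


def jsonl_key_paths(text):
    """Union of key paths over every non-blank line of a JSONL string."""
    paths = set()
    for line in text.splitlines():
        if line.strip():
            paths |= key_paths(json.loads(line))
    return paths


def compare_jsonl(original_text, appended_text):
    """Report key paths that appear in only one of the two JSONL strings."""
    original = jsonl_key_paths(original_text)
    appended = jsonl_key_paths(appended_text)
    return {
        "only_in_original": sorted(original - appended),
        "only_in_appended": sorted(appended - original),
    }


# Hypothetical sample records, not taken from the poster's data.
original_sample = '{"textContent": "good", "classificationAnnotation": {"displayName": "positive"}}'
appended_sample = '{"textContent": "bad"}'
print(compare_jsonl(original_sample, appended_sample))
```

If both lists come back empty, the two files have the same field structure, and the difference is more likely in the values or in how the import is called than in the schema of the records.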
This is the sample JSONL file I am trying to insert into the existing dataset. It contains only two JSON records. When I insert it through Python code into the existing dataset, which already contains data, the records are recognized as unlabeled.