Hi,
I'm seeing an Internal Error with no details when I try to start a tuning job in the Generative AI Studio console. The JSONL file I uploaded should be correct, since I've validated it in code. I'm using 1000 training steps and the default learning rate multiplier of 1. I've selected us-central1 as the location for tuning.
Any idea what's happening here?
The JSONL should follow this format, with one example (record) per row:
{"input_text": "<prompt>", "output_text": "<expected response>"}
Here is some sample code to generate it:
import json

from datasets import load_dataset

# Load the Alpaca training split and convert it to a pandas DataFrame.
train_dataset = load_dataset("tatsu-lab/alpaca", split="train")
df = train_dataset.to_pandas()

# Build the two columns used for tuning: input_text and output_text.
df["input_text"] = df.text.astype(str) + ': ' + df.instruction.astype(str)
df["output_text"] = df.output.astype(str)
df = df[["input_text", "output_text"]]

# Write one JSON object per line.
data_list = df.to_dict(orient='records')
with open('output_alpaca.jsonl', 'w') as file:
    for example in data_list:
        file.write(json.dumps(example) + '\n')
My data is already in the correct JSONL format. It's 11 MB in size.
Used https://jsonlines.org/validator/ for validation.
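A generic JSONL validator only confirms that each line parses as JSON. As a rough sketch, the check below also confirms that every record is an object with non-empty `input_text` and `output_text` strings, which are the field names used in the sample code above. The function name `find_bad_records` is made up for illustration, and the exact requirements of the tuning service may be stricter than what is checked here.

```python
import json

# Field names taken from the sample generator code in this thread.
REQUIRED_KEYS = ("input_text", "output_text")


def find_bad_records(path, required_keys=REQUIRED_KEYS):
    """Return (line_number, problem) tuples for records that look wrong."""
    problems = []
    with open(path, encoding="utf-8") as f:
        for line_number, line in enumerate(f, start=1):
            if not line.strip():
                problems.append((line_number, "blank line"))
                continue
            try:
                record = json.loads(line)
            except json.JSONDecodeError as exc:
                problems.append((line_number, f"invalid JSON: {exc.msg}"))
                continue
            if not isinstance(record, dict):
                problems.append((line_number, "record is not a JSON object"))
                continue
            # Each required field must be present and hold a non-empty string.
            for key in required_keys:
                value = record.get(key)
                if not isinstance(value, str) or not value.strip():
                    problems.append((line_number, f"missing or empty '{key}'"))
    return problems
```

Calling `find_bad_records("output_alpaca.jsonl")` returns an empty list when every row passes these checks; otherwise each tuple points at the offending line.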
I'm seeing this error when I try using the Vertex AI SDK for Python:
Creating PipelineJob
Traceback (most recent call last):
  File "/Users/addarsh/virtualenvs/work-buddy/lib/python3.8/site-packages/google/api_core/grpc_helpers.py", line 72, in error_remapped_callable
    return callable_(*args, **kwargs)
  File "/Users/addarsh/virtualenvs/work-buddy/lib/python3.8/site-packages/grpc/_channel.py", line 1161, in __call__
    return _end_unary_response_blocking(state, call, False, None)
  File "/Users/addarsh/virtualenvs/work-buddy/lib/python3.8/site-packages/grpc/_channel.py", line 1004, in _end_unary_response_blocking
    raise _InactiveRpcError(state) # pytype: disable=not-instantiable
grpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
    status = StatusCode.INTERNAL
    details = "Internal error encountered."
    debug_error_string = "UNKNOWN:Error received from peer ipv4:142.250.191.42:443 {created_time:"2023-08-17T21:48:08.818915-07:00", grpc_status:13, grpc_message:"Internal error encountered."}"
>

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "tune_google_model.py", line 45, in <module>
    tuning()
  File "tune_google_model.py", line 34, in tuning
    model.tune_model(
  File "/Users/addarsh/virtualenvs/work-buddy/lib/python3.8/site-packages/vertexai/language_models/_language_models.py", line 185, in tune_model
    pipeline_job = _launch_tuning_job(
  File "/Users/addarsh/virtualenvs/work-buddy/lib/python3.8/site-packages/vertexai/language_models/_language_models.py", line 1134, in _launch_tuning_job
    job = _launch_tuning_job_on_jsonl_data(
  File "/Users/addarsh/virtualenvs/work-buddy/lib/python3.8/site-packages/vertexai/language_models/_language_models.py", line 1198, in _launch_tuning_job_on_jsonl_data
    job.submit()
  File "/Users/addarsh/virtualenvs/work-buddy/lib/python3.8/site-packages/google/cloud/aiplatform/pipeline_jobs.py", line 418, in submit
    self._gca_resource = self.api_client.create_pipeline_job(
  File "/Users/addarsh/virtualenvs/work-buddy/lib/python3.8/site-packages/google/cloud/aiplatform_v1/services/pipeline_service/client.py", line 1347, in create_pipeline_job
    response = rpc(
  File "/Users/addarsh/virtualenvs/work-buddy/lib/python3.8/site-packages/google/api_core/gapic_v1/method.py", line 113, in __call__
    return wrapped_func(*args, **kwargs)
  File "/Users/addarsh/virtualenvs/work-buddy/lib/python3.8/site-packages/google/api_core/grpc_helpers.py", line 74, in error_remapped_callable
    raise exceptions.from_grpc_error(exc) from exc
google.api_core.exceptions.InternalServerError: 500 Internal error encountered.
It worked after I clicked 'Enable All Recommended APIs' in the Vertex AI Dashboard, per this answer: https://stackoverflow.com/questions/76297835/internal-error-encountered-data-fetching-exception-vert...
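For anyone who would rather check this from a script than from the dashboard, here is a minimal sketch that compares the project's enabled services against a required list by shelling out to `gcloud services list --enabled`. The contents of `REQUIRED_SERVICES` are an assumption: the thread does not say which APIs the dashboard button enables, so treat the two names below as placeholders and adjust them to what the dashboard shows for your project. The helper names are also made up for illustration.

```python
import subprocess

# Assumption: placeholder list, not the dashboard's actual "recommended" set.
REQUIRED_SERVICES = ("aiplatform.googleapis.com", "storage.googleapis.com")


def missing_services(enabled, required=REQUIRED_SERVICES):
    """Return the required service names that are absent from `enabled`."""
    enabled_set = {name.strip() for name in enabled if name.strip()}
    return [name for name in required if name not in enabled_set]


def list_enabled_services(project_id):
    """Ask the gcloud CLI (must be installed and authenticated) for enabled services."""
    result = subprocess.run(
        [
            "gcloud", "services", "list", "--enabled",
            "--project", project_id,
            "--format=value(config.name)",
        ],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.splitlines()
```

With a real project ID, `missing_services(list_enabled_services("my-project"))` lists anything still disabled; each missing entry can then be turned on with `gcloud services enable <service-name>`.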
Hi @addarsh. Did you get a resource exhausted error after that?