
Invalid return type: <class 'dict'> in Vertex AI endpoint response

Hi:

I deployed a custom container on a Vertex AI endpoint. I am able to make a request, the model evaluates correctly, and I am returning the following as the model output:

{'predictions': [tensor([0.0004, 0.0017, 0.0039, 0.0054, 0.0078, 0.0088, 0.0111, 0.0741])]}

But I still get the following error:

MODEL_LOG - model: biobert, Invalid return type: <class 'dict'>.

 

Any ideas on what the issue is?
 
Thanks

AndrewB
Community Manager

Hey @yamskonline - the issue likely lies in the format of the data your model is returning. The documentation says:

"Individual values in an instance object can be strings, numbers, or lists. You can't embed JSON objects."

Right now you are returning a tensor within the 'predictions' list. Try converting it into a plain Python list or a NumPy array before returning it.
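
For example, a minimal sketch of that conversion (assuming the value in your output is a torch.Tensor):

import torch

# A tensor like the one in your current output
scores = torch.tensor([0.0004, 0.0017, 0.0039, 0.0054, 0.0078, 0.0088, 0.0111, 0.0741])

# Convert it to a plain Python list so the response is JSON-serializable
output = {'predictions': [scores.detach().cpu().numpy().tolist()]}
print(output)  # a dict containing plain floats, no tensors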

Thanks @AndrewB. I changed the output to a list of lists and that worked (a NumPy array did not). But that unfortunately did not solve the Invalid return type: <class 'dict'> error. It looks like I cannot return a JSON/dict as the output from my container, like this:

{'predictions': [[0.0004, 0.0017, 0.0039, 0.0054, 0.0078, 0.0088, 0.0111, 0.0741]]}

I don't get the error if I just return [[0.0004, 0.0017, 0.0039, 0.0054, 0.0078, 0.0088, 0.0111, 0.0741]] from my container. Do I need to return the output in the above JSON format, or is that wrapper created by Vertex AI when returning the response?

Now I don't get a server-side error, but I get the following client-side error:

 

Traceback (most recent call last):
  File "/home/yk/anaconda3/envs/phamily/lib/python3.11/site-packages/google/api_core/grpc_helpers.py", line 75, in error_remapped_callable
    return callable_(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/yk/anaconda3/envs/phamily/lib/python3.11/site-packages/grpc/_channel.py", line 1160, in __call__
    return _end_unary_response_blocking(state, call, False, None)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/yk/anaconda3/envs/phamily/lib/python3.11/site-packages/grpc/_channel.py", line 1003, in _end_unary_response_blocking
    raise _InactiveRpcError(state)  # pytype: disable=not-instantiable
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
grpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
status = StatusCode.FAILED_PRECONDITION
details = "Container returned invalid prediction response. endpoint_id: 43973538112, deployed_model_id: 723690059904, response: [
0.0003649094433058053,
0.001728175790049136,
0.003946424461901188,
0.005377720575779676,
0.00779323186725378,
0.00883102510124445,
0.01114107202738523,
0.07414358109235764
]
."
debug_error_string = "UNKNOWN:Error received from peer ipv6:%5B2a00:1450:4016:80b::200a%5D:443 {created_time:"2024-09-13T17:46:28.689560675+02:00", grpc_status:9, grpc_message:"Container returned invalid prediction response. endpoint_id: 439735381877850112, deployed_model_id: 7236907468696059904, response: [\n\t0.0003649094433058053,\n\t0.001728175790049136,\n\t0.003946424461901188,\n\t0.005377720575779676,\n\t0.00779323186725378,\n\t0.00883102510124445,\n\t0.01114107202738523,\n\t0.07414358109235764\n]\n."}"
>
 

Any help on this would be appreciated.

Hi @yamskonline,

In addition to @AndrewB's response, and to address your latest post: the problem lies in the response format that the Vertex AI service expects from custom container deployments.

When you deploy a custom container to Vertex AI, the service expects the container to respond with a JSON object that has a particular structure.

At the moment your container is returning a bare list, which doesn't conform to that expected JSON format.

As a workaround, try wrapping your prediction results in a JSON structure with a 'predictions' key. Here's how to do it:

import json

def predict(input_data):
    # ... Your model prediction logic ...
    predictions = [0.0004, 0.0017, 0.0039, 0.0054, 0.0078, 0.0088, 0.0111, 0.0741] 

    # Return JSON response
    return json.dumps({"predictions": predictions})

We use json.dumps() to convert the Python dictionary {"predictions": predictions} into a JSON string. This is the format that Vertex AI expects.

The predictions key is crucial. Vertex AI uses this key to extract the predictions from your container's response.

Here’s an example of a request/response:

Request (Input to your container):

{"instances": [{"text": "This is a sample text"}]}

 Response (Expected output from your container):

{"predictions": [0.0004, 0.0017, 0.0039, 0.0054, 0.0078, 0.0088, 0.0111, 0.0741]}

Here are additional approaches you may consider:

Documentation: Always refer to the Vertex AI documentation for your specific model type or prediction task. It will provide detailed guidelines on the required response format.

Error Handling: It's wise to implement error handling to capture any exceptions during JSON serialization.

Batch Predictions: If you're handling batch predictions, the predictions key might contain a list of lists, with each inner list representing the predictions for a single instance in the batch.
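
For example, a response for a request with two instances might look like this (the values are illustrative):

{"predictions": [
    [0.0004, 0.0017, 0.0039],
    [0.0121, 0.0356, 0.0789]
]}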

I hope the above information is helpful.

Thanks for your response @ruthseki. This is very helpful, though it unfortunately did not solve my problem. As you suggested, I am now returning a JSON string from my container, using the postprocess method of my custom handler as follows:

 

    def inference(self, model_input):
        """
        Internal inference methods
        :param model_input: transformed model input data
        :return: list of inference output in NDArray
        """
        # Do some inference call to engine here and return output
        output = self.model_evaluation.eval_multilabel_model(model_input, self._explanations, device=self.device).detach().cpu().numpy().tolist()
        return [output]

    def postprocess(self, inference_output):
        """
        Return inference result.
        :param inference_output: list of inference output
        :return: list of predict results
        """
        # Take output from network and post-process to desired format
        postprocess_output = json.dumps({"predictions": inference_output})
        
        return postprocess_output

    def handle(self, data, context):
        """
        Invoked by TorchServe for prediction requests.
        Do pre-processing of data, prediction using model and post-processing of prediction output
        :param data: Input data for prediction
        :param context: Initial context contains model server system properties.
        :return: prediction output
        """
        model_input = self.preprocess(data)
        model_output = self.inference(model_input)
        model_output = self.postprocess(model_output)
        print('\n\n\n\nMODULE OUTPUT', model_output, '\n\n\n\n')
                
        return model_output

 

My input is JSON:

 

{ 
   "instances": [
     { 
       "body": "Balance Problem Walker Spinal stenosis Nerve injury"
     }
   ]
}

 

But now I get an "Invalid return type: <class 'str'>." error, as you can see in the attached log data:

[Attached screenshot: yamskonline_0-1726691093014.png]

 

So I am again at a loss 😏

@ruthseki @AndrewB I finally figured out the problem. But I would not have gotten to it without your help. Thank you!

The problem was actually a TorchServe one. I am using TorchServe as my HTTP server. Turns out TorchServe expects the custom handler to return only a list as the prediction result. In its predict() method there is actually a check that constrains the return value to be a list:

https://pytorch.org/serve/_modules/ts/service.html#Service.predict

if not isinstance(ret, list):
    logger.warning(
        "model: %s, Invalid return type: %s.",
        self.context.model_name,
        type(ret),
    )
    return create_predict_response(
        None, req_id_map, "Invalid model predict output", 503
    )

So the error message I was seeing is actually from TorchServe, not from Vertex AI. The error was raised because I was returning a JSON string while TorchServe expected a list. So all I had to do was change my container output to wrap the JSON string expected by Vertex AI in a list:

postprocess_output = json.dumps({"predictions": inference_output})
return [postprocess_output]
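
For anyone hitting the same thing, here is the corrected postprocess in context (a sketch; the rest of the handler is unchanged from my post above):

    def postprocess(self, inference_output):
        """
        Return inference result.
        :param inference_output: list of inference output
        :return: list containing the JSON string expected by Vertex AI
        """
        # TorchServe's Service.predict() requires a list, so wrap the JSON string in one
        postprocess_output = json.dumps({"predictions": inference_output})
        return [postprocess_output]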

Thanks again!

 

Glad you got it working! Thanks for sharing the solution – that TorchServe detail is super helpful for others.

Keep the updates coming and consider joining Innovators to stay in the loop on all things Vertex AI.

Happy coding!