Error while trying to get explanation from (custom container) model deployed on Vertex AI

Hi,

I created a custom Docker container to deploy my model on Vertex AI. The model uses LightGBM, so I can't use the pre-built container images available for TensorFlow/scikit-learn/XGBoost. I was able to deploy the model and get predictions, but I get errors when requesting explanations from the model. I have followed the Vertex AI guidelines for configuring a model for explanations.
The example below shows a simplified version of the model that still reproduces the issue, with only two input features 'A' and 'B'.

Please take a look and tell me if the explanation metadata is supposed to be set differently, or if there is something wrong with this approach.

Environment details

  • Google Cloud Notebook
  • Python version: 3.7.12
  • pip version: 21.3.1
  • google-cloud-aiplatform version: 1.15.0

Reference

https://cloud.google.com/vertex-ai/docs/explainable-ai/configuring-explanations#custom-container

explanation-metadata.json

(The model output is unkeyed; the Vertex AI guide suggests using any memorable string as the output key, hence "Y" below.)

{
    "inputs": {
        "A": {},
        "B": {}
    },
    "outputs": {
        "Y": {}
    }
}
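As a sanity check, the metadata file above can be validated with a few lines of stdlib Python (this is just a sketch mirroring the JSON shown, not part of any Vertex AI API):

```python
import json

# The same explanation metadata as in explanation-metadata.json
metadata = json.loads("""
{
    "inputs": {"A": {}, "B": {}},
    "outputs": {"Y": {}}
}
""")

# Input keys must match the feature names the prediction server expects,
# and there must be exactly one output key.
assert set(metadata["inputs"]) == {"A", "B"}
assert len(metadata["outputs"]) == 1
```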

Model upload with explanation parameters and metadata

! gcloud ai models upload \
  --region=$REGION \
  --display-name=$MODEL_NAME \
  --container-image-uri=$PRED_IMAGE_URI \
  --artifact-uri=$ARTIFACT_LOCATION_GCS \
  --explanation-method=sampled-shapley \
  --explanation-path-count=10 \
  --explanation-metadata-file=explanation-metadata.json

Prediction/Explanation Input

instances = [{"A": 1.1, "B": 20}, {"A": 2.2, "B": 21}]
# Prediction (works fine):
endpoint.predict(instances=instances)
# Prediction output: predictions=[0, 1], deployed_model_id='<>', model_version_id='', model_resource_name='<>', explanations=None
endpoint.explain(instances=instances) # Returns error (1) shown in stack trace below

# Another example
instances_2 = [[1.1,20], [2.2,21]]
# Prediction (works fine):
endpoint.predict(instances=instances_2)
# Prediction output: predictions=[0, 1], deployed_model_id='<>', model_version_id='', model_resource_name='<>', explanations=None
endpoint.explain(instances=instances_2) # Returns error
# Error: Nameless inputs are allowed only if there is a single input in the explanation metadata.
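The second error is consistent with the metadata: since explanation-metadata.json declares two named inputs, nameless (list-form) instances are rejected, and they would have to be keyed by feature name first. A small sketch of that conversion (assuming the feature order is A, B; `to_keyed_instances` is a hypothetical helper, not a Vertex AI function):

```python
FEATURE_NAMES = ["A", "B"]  # must match the keys in explanation-metadata.json

def to_keyed_instances(nameless):
    """Map list-form instances to dicts keyed by feature name."""
    return [dict(zip(FEATURE_NAMES, inst)) for inst in nameless]

instances_2 = [[1.1, 20], [2.2, 21]]
keyed = to_keyed_instances(instances_2)
# keyed == [{"A": 1.1, "B": 20}, {"A": 2.2, "B": 21}]
# endpoint.explain(instances=keyed)  # same shape as the first example
```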

Prediction Server (Flask)

# Custom Flask server to serve online predictions
from flask import Flask, jsonify, request
import pandas as pd

app = Flask(__name__)

# Route must match the predictRoute configured in the container spec
@app.route("/predict", methods=["POST"])
def predict():
    # Input for prediction
    raw_input = request.get_json()
    instances = raw_input["instances"]
    df = pd.DataFrame(instances, columns=["A", "B"])
    # Prediction from the LightGBM model (loaded from the GCS bucket at startup)
    predictions = model.predict(df).tolist()  # e.g. [0, 1]
    return jsonify({"predictions": predictions})
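The 400 message in the stack trace ("The browser (or proxy) sent a request that this server could not understand") is what Flask returns when `request.get_json()` fails, so one possible cause (an assumption, not confirmed) is that the explanation service's request body differs from the predict request and fails JSON parsing. A defensive parsing sketch (`parse_instances` is a hypothetical helper) that accepts both keyed and nameless instances and surfaces the offending payload instead of a bare 400:

```python
import json

FEATURES = ["A", "B"]  # feature order assumed by the model

def parse_instances(body: str):
    """Parse a predict/explain request body into rows of feature values.

    Accepts both keyed instances ({"A": 1.1, "B": 20}) and nameless
    instances ([1.1, 20]); raises ValueError with the offending payload
    rather than letting the framework return an opaque 400.
    """
    try:
        payload = json.loads(body)
    except json.JSONDecodeError as exc:
        raise ValueError(f"Request body is not valid JSON: {body!r}") from exc
    rows = []
    for inst in payload["instances"]:
        if isinstance(inst, dict):
            rows.append([inst[f] for f in FEATURES])
        elif isinstance(inst, (list, tuple)):
            rows.append(list(inst))
        else:
            raise ValueError(f"Unrecognized instance: {inst!r}")
    return rows
```

Logging the raw body on failure (e.g. from `request.data`) would also show exactly what the explanation service sends to the container.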

Stack trace of error (1)

---------------------------------------------------------------------------
_InactiveRpcError                         Traceback (most recent call last)
/opt/conda/lib/python3.7/site-packages/google/api_core/grpc_helpers.py in error_remapped_callable(*args, **kwargs)
     49         try:
---> 50             return callable_(*args, **kwargs)
     51         except grpc.RpcError as exc:

/opt/conda/lib/python3.7/site-packages/grpc/_channel.py in __call__(self, request, timeout, metadata, credentials, wait_for_ready, compression)
    945                                       wait_for_ready, compression)
--> 946         return _end_unary_response_blocking(state, call, False, None)
    947 

/opt/conda/lib/python3.7/site-packages/grpc/_channel.py in _end_unary_response_blocking(state, call, with_call, deadline)
    848     else:
--> 849         raise _InactiveRpcError(state)
    850 

_InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
	status = StatusCode.INVALID_ARGUMENT
	details = "{"error": "Unable to explain the requested instance(s) because: Invalid response from prediction server - the response field predictions is missing. Response: {'error': '400 Bad Request: The browser (or proxy) sent a request that this server could not understand.'}"}"
	debug_error_string = "{"created":"@1658310559.755090975","description":"Error received from peer ipv4:74.125.133.95:443","file":"src/core/lib/surface/call.cc","file_line":1069,"grpc_message":"{"error": "Unable to explain the requested instance(s) because: Invalid response from prediction server - the response field predictions is missing. Response: {'error': '400 Bad Request: The browser (or proxy) sent a request that this server could not understand.'}"}","grpc_status":3}"
>

The above exception was the direct cause of the following exception:

InvalidArgument                           Traceback (most recent call last)
/tmp/ipykernel_2590/4024017963.py in <module>
----> 3 print(endpoint.explain(instances=instances, parameters={}))

~/.local/lib/python3.7/site-packages/google/cloud/aiplatform/models.py in explain(self, instances, parameters, deployed_model_id, timeout)
   1563             parameters=parameters,
   1564             deployed_model_id=deployed_model_id,
-> 1565             timeout=timeout,
   1566         )
   1567 

~/.local/lib/python3.7/site-packages/google/cloud/aiplatform_v1/services/prediction_service/client.py in explain(self, request, endpoint, instances, parameters, deployed_model_id, retry, timeout, metadata)
    917             retry=retry,
    918             timeout=timeout,
--> 919             metadata=metadata,
    920         )
    921 

/opt/conda/lib/python3.7/site-packages/google/api_core/gapic_v1/method.py in __call__(self, timeout, retry, *args, **kwargs)
    152             kwargs["metadata"] = metadata
    153 
--> 154         return wrapped_func(*args, **kwargs)
    155 
    156 

/opt/conda/lib/python3.7/site-packages/google/api_core/grpc_helpers.py in error_remapped_callable(*args, **kwargs)
     50             return callable_(*args, **kwargs)
     51         except grpc.RpcError as exc:
---> 52             raise exceptions.from_grpc_error(exc) from exc
     53 
     54     return error_remapped_callable

InvalidArgument: 400 {"error": "Unable to explain the requested instance(s) because: Invalid response from prediction server - the response field predictions is missing. Response: {'error': '400 Bad Request: The browser (or proxy) sent a request that this server could not understand.'}"}
---------------------------------------------------------------------------
Related issue: https://github.com/googleapis/python-aiplatform/issues/1526