Is there an incompatibility between AutoML Vision on Vertex AI and ML Kit?

Hello, I am developing an app in Android/Kotlin.
The app must use ML Kit for object detection.
I have added these ML Kit dependencies:

implementation 'com.google.mlkit:object-detection:17.0.1'
implementation 'com.google.mlkit:object-detection-custom:17.0.1'

I have trained an AutoML model for object detection on Vertex AI, but when I try to use the .tflite file in my app, I receive the error:

"Unexpected number of dimensions for output index 0: got 3D, expected either 2D (BxN with B=1) or 4D (BxHxWxN with B=1, W=1, H=1)"

I have saved the .tflite file in the assets folder.
This is the code that should call the model:

val localModel = LocalModel.Builder()
    .setAssetFilePath("my-model.tflite")
    .build()
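For reference, the wiring around this snippet typically looks roughly like the sketch below, based on ML Kit's custom object detection API; the threshold and label-count values are illustrative, not required:

```kotlin
import com.google.mlkit.common.model.LocalModel
import com.google.mlkit.vision.objects.ObjectDetection
import com.google.mlkit.vision.objects.custom.CustomObjectDetectorOptions

val localModel = LocalModel.Builder()
    .setAssetFilePath("my-model.tflite")
    .build()

// ML Kit pairs its built-in detector with a custom *classifier* model;
// the bundled .tflite is only used to label the detected objects.
val options = CustomObjectDetectorOptions.Builder(localModel)
    .setDetectorMode(CustomObjectDetectorOptions.SINGLE_IMAGE_MODE)
    .enableClassification()
    .setClassificationConfidenceThreshold(0.5f)   // illustrative value
    .setMaxPerObjectLabelCount(3)                 // illustrative value
    .build()

val objectDetector = ObjectDetection.getClient(options)
```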

It would be astonishing if there were really an incompatibility between an AutoML Vision-generated .tflite file and ML Kit. Could you please clarify the issue? Thank you


The error message indicates that the model's output tensor has an unexpected number of dimensions.

For custom models, ML Kit expects the output tensor to be either 2D (BxN with B=1) or 4D (BxHxWxN with B=1, W=1, H=1), where B is the batch size, H and W are spatial dimensions, and N is the number of output values (for example, one score per label).
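The shape check implied by the error message can be sketched in pure Kotlin like this (the function name is hypothetical, not an ML Kit API):

```kotlin
// Hypothetical re-creation of the shape check behind the error message:
// ML Kit's custom-model path accepts a classifier-style output tensor,
// either 2D [B, N] with B = 1, or 4D [B, H, W, N] with B = H = W = 1.
fun isMlKitCompatibleOutputShape(shape: IntArray): Boolean = when (shape.size) {
    2 -> shape[0] == 1
    4 -> shape[0] == 1 && shape[1] == 1 && shape[2] == 1
    else -> false
}

fun main() {
    // Classifier-style outputs pass:
    println(isMlKitCompatibleOutputShape(intArrayOf(1, 5)))       // true
    println(isMlKitCompatibleOutputShape(intArrayOf(1, 1, 1, 5))) // true
    // A 3D detection tensor, e.g. [1, N, 4] bounding boxes, fails:
    println(isMlKitCompatibleOutputShape(intArrayOf(1, 40, 4)))   // false
}
```

This is why a full detection model, whose first output is usually a 3D boxes tensor, is rejected, while an image-classification model passes.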

Since you trained your model using AutoML Vision, the output tensor's shape might differ from what ML Kit expects by default. Here are a few steps you can take to troubleshoot and potentially resolve the issue:

  1. Verify Model Output Shape: Double-check the output tensor shape of your AutoML Vision model. You can use tools like TensorFlow Lite Interpreter to inspect the model's input and output tensors.

  2. Custom Post-processing: ML Kit allows you to provide custom post-processing logic to handle the model's output. You might need to implement post-processing that converts the output tensor from your AutoML Vision model into a format that ML Kit expects.

  3. Convert Model: If the output tensor's shape is indeed incompatible with ML Kit's expectations, you might need to convert your AutoML Vision model into a format that is compatible with ML Kit. This could involve retraining the model using TensorFlow or TensorFlow Lite with specific configurations.

  4. Check Documentation and Support: Make sure to check the documentation for ML Kit and AutoML Vision to see if there are any specific guidelines or considerations for using AutoML Vision models with ML Kit. You can also reach out to Google Cloud support for assistance with compatibility issues between AutoML Vision and ML Kit.
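Step 1 can be sketched with the TensorFlow Lite Interpreter API (the `org.tensorflow:tensorflow-lite` dependency on the JVM/Android; "my-model.tflite" stands in for your exported file):

```kotlin
import org.tensorflow.lite.Interpreter
import java.io.File

// Inspect the input and output tensors of the exported model.
fun main() {
    val interpreter = Interpreter(File("my-model.tflite"))
    val input = interpreter.getInputTensor(0)
    println("input: ${input.shape().contentToString()} ${input.dataType()}")
    for (i in 0 until interpreter.outputTensorCount) {
        val output = interpreter.getOutputTensor(i)
        println("output $i: ${output.shape().contentToString()} ${output.dataType()}")
    }
    interpreter.close()
}
```

An AutoML object-detection export typically reports detection-style outputs (boxes [1, N, 4], classes [1, N], scores [1, N], count [1]); the 3D boxes tensor matches the "3D" in the error, whereas ML Kit's custom-model slot expects a single classification output.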

This is an AI-generated answer that does not solve the main issue I mentioned. I have asked why there is an incompatibility between two environments that should be naturally complementary, like Vertex AI and ML Kit. Will this incompatibility be overcome in the future?

@antelionPoala_Tenorio wrote:

If the output tensor's shape is indeed incompatible with ML Kit's expectations, you might need to convert your AutoML Vision model into a format that is compatible with ML Kit. This could involve retraining the model using TensorFlow or TensorFlow Lite with specific configurations.


Are there any guides for doing this? Training a TFLite model so that it's compatible with ML Kit should be straightforward and well documented if AutoML Vision on Vertex AI does not generate a compatible model out of the box.