
Image annotation: batch vs online json response

Hi everyone,

I am trying to convert an online image annotation process to a batch process, following the Python recipe here: https://cloud.google.com/vision/docs/batch (I'm using google-cloud-vision 3.1.4 and importing with 'from google.cloud import vision').

The problem is that the properties of the JSON response I extract from the batch annotation differ from those I get from the online annotation, so my existing postprocessing pipeline does not work with the batch outputs.

In the online version, I use two steps to generate a JSON response:

  1. api_response = vision.ImageAnnotatorClient().annotate_image(
       vision.AnnotateImageRequest(image=image, features=features))
  2. api_response_json_str = vision.AnnotateImageResponse.to_json(api_response)
where the only feature I am specifying is vision.Feature(type_="DOCUMENT_TEXT_DETECTION").
 
In the batch version, a single step dumps the response as a JSON file into the GCS bucket specified in the output config:
  1. vision.ImageAnnotatorClient().async_batch_annotate_images(requests=requests, output_config=output_config)
     
Looking at vision.ImageAnnotatorClient.async_batch_annotate_images(), the second step of the online process seems to be executed already (through image_annotator.AsyncBatchAnnotateImagesResponse), which makes sense. But the behavior of AnnotateImageResponse and AsyncBatchAnnotateImagesResponse appears to differ.
 
Here is a segment of the JSON from the online version:
{
    "property": {
        "detectedBreak": {
            "type": 1,
            "isPrefix": false
        },
        "detectedLanguages": []
    },
    "boundingBox": {
        "vertices": [
            {
                "x": 99,
                "y": 99
            },
            {
                "x": 106,
                "y": 99
            },
            {
                "x": 106,
                "y": 109
            },
            {
                "x": 99,
                "y": 109
            }
        ],
        "normalizedVertices": []
    },
    "text": "n",
    "confidence": 0.98761535
}
 
And here is the same segment from the batch prediction:
{
    "property": {
        "detectedBreak": {
            "type": "SPACE"
        }
    },
    "boundingBox": {
        "vertices": [
            {
                "x": 99,
                "y": 99
            },
            {
                "x": 106,
                "y": 99
            },
            {
                "x": 106,
                "y": 109
            },
            {
                "x": 99,
                "y": 109
            }
        ]
    },
    "text": "n",
    "confidence": 0.98761535
}
Note how all the numbers match, but not all the content.
 
I could not find any relevant documentation. As a last resort I can of course hack my code or the JSON files to make things work, but I feel there must be a way to reproduce the behavior of the online request (and arguably that should be the default!).
 
Any help would be greatly appreciated!
 
Best,
Onur

After updating google-cloud-vision to 3.7.4 (the latest version) and importing with 'from google.cloud import vision_v1', the issue persists.

Hi @OnurKerimoglu,

Welcome to Google Cloud Community!

The differences between the JSON responses from the online and batch annotations in the Google Cloud Vision API are possibly due to how the responses are structured and the level of detail provided for certain attributes.

The online image annotation gives a detailed, structured response tailored for immediate use, while the batch method returns responses optimized for bulk processing. As you can see in the JSON responses, the detectedBreak type in the online response is represented as an integer (with 1 indicating a break like "SPACE"), while in the batch response it is a string. Because of this, your post-processing code will need to convert data types to handle the different formats.
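To see the mismatch concretely: the two JSON segments in the original post show that "type": 1 (online) and "type": "SPACE" (batch) refer to the same break. A minimal, library-free sketch of that name-to-integer mapping follows; the values are taken from the Vision API's TextAnnotation.DetectedBreak.BreakType enum and should be treated as an assumption to verify against your installed client version:

```python
# Name <-> integer mapping for TextAnnotation.DetectedBreak.BreakType
# (assumed values; verify against your google-cloud-vision version).
BREAK_TYPE_VALUES = {
    "UNKNOWN": 0,
    "SPACE": 1,
    "SURE_SPACE": 2,
    "EOL_SURE_SPACE": 3,
    "HYPHEN": 4,
    "LINE_BREAK": 5,
}

def break_type_to_int(value):
    """Accept either the string name (batch JSON) or the integer (online JSON)."""
    if isinstance(value, int):
        return value
    return BREAK_TYPE_VALUES[value]

print(break_type_to_int("SPACE"))  # -> 1, matching "type": 1 in the online JSON
print(break_type_to_int(1))        # -> 1, already an integer, returned unchanged
```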

Here are possible workarounds that might help you address this issue:

  • Normalize Response - The normalize_batch_response converts the batch response to match the structure of the online response.
  • Update Your Pipeline - To ensure consistent handling in your existing pipeline, normalize the response before processing. This will allow your post-processing logic to handle both response types smoothly.
  • Testing - Thoroughly test the normalization function with various responses to ensure all edge cases are covered, especially for missing or differently structured fields.

For more detailed information about the Cloud Vision API, you can read this documentation.

I hope the above information is helpful.

 

Hi @MJane,

thank you for your response!

About the 'Normalize Response' workaround you suggested: which Google Cloud library provides the 'normalize_batch_response' function? On a quick search, I could not find it. Or are you suggesting that I write such a custom function?

Also, is the required normalization process described in any of the documentation pages you referred to? As I mentioned in my original post, I was unable to find a documentation page that specifically addresses this issue.

Best,
Onur