How to understand the JSON returned from Gemini with respect to citations

I'd like some help understanding the meaning of the JSON returned by Gemini with respect to citations. (I checked Google's API documentation, but I found it to be sparse and unclear.)

I've grounded Gemini 1.5 Pro in a Vertex AI Search app and data store. I've observed that when I prompt the model, the JSON response may contain the following information pertaining to citations:

1) candidates[].groundingMetadata.groundingChunks[]
This is a list of objects with a retrievedContext property, which indicates the title and URI of a particular document in the data store.

2) candidates[].groundingMetadata.groundingSupports[]
As far as I've seen, this list contains a single object, which includes:
-- the model's text response to the prompt (segment.text)
-- a groundingChunkIndices[] list (Google's documentation indicates that this is a list of indexes into the groundingChunks[] list mentioned above.)

My questions:
Can someone please explain the actual meaning of groundingChunks[] vs. groundingChunkIndices[]? I don't know how to interpret these fields in terms of what they mean for citations.

For example, I've seen something like the following:

"groundingChunks": [
  {
    retrievedContext: {
      uri: 'gs://my-bucket/0.pdf',
      title: 'A'
    }
  },
  {
    retrievedContext: {
      uri: 'gs://my-bucket/1.pdf',
      title: 'B'
    }
  },
  {
    retrievedContext: {
      uri: 'gs://my-bucket/2.pdf',
      title: 'C'
    }
  },
  {
    retrievedContext: {
      uri: 'gs://my-bucket/3.pdf',
      title: 'D'
    }
  },
  {
    retrievedContext: {
      uri: 'gs://my-bucket/4.pdf',
      title: 'E'
    }
  }
],
"groundingSupports": [
  {
    segment: {
      endIndex: 69,
      text: '(Here is the text that the model produced in response to the prompt.)'
    },
    groundingChunkIndices: [ 0, 3 ],
    confidenceScores: [ 0.9919262, 0.9919262 ]
  }
]

As you can see, groundingChunks[] lists five documents, but groundingChunkIndices[] references only two of them (indices 0 and 3, which would be documents A and D if the indices are zero-based). What does this mean?
Did the model produce a response based on all 5 documents in groundingChunks[]? But then what does it mean that just two of them are indicated in groundingChunkIndices[]?
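
For reference, this is how I currently read the documented relationship between the two lists. It's a minimal Python sketch, not anything taken from Google's docs: it assumes the response has already been parsed into a dict, that groundingChunkIndices[] holds zero-based positions into groundingChunks[], and that confidenceScores[] lines up one-to-one with those indices. The helper name cited_sources is my own.

```python
# Hypothetical helper (not part of any Google SDK): resolve each support's
# groundingChunkIndices into the title/URI of the chunk it points at.
def cited_sources(grounding_metadata):
    chunks = grounding_metadata.get("groundingChunks", [])
    results = []
    for support in grounding_metadata.get("groundingSupports", []):
        indices = support.get("groundingChunkIndices", [])
        scores = support.get("confidenceScores", [])
        sources = []
        for position, index in enumerate(indices):
            # Assumption: index is a zero-based position into groundingChunks.
            context = chunks[index]["retrievedContext"]
            # Assumption: scores are parallel to indices; None if missing.
            score = scores[position] if position < len(scores) else None
            sources.append((context["title"], context["uri"], score))
        results.append({"text": support["segment"]["text"], "sources": sources})
    return results


# The example data from above, rewritten as a plain Python dict.
metadata = {
    "groundingChunks": [
        {"retrievedContext": {"uri": f"gs://my-bucket/{i}.pdf", "title": title}}
        for i, title in enumerate("ABCDE")
    ],
    "groundingSupports": [
        {
            "segment": {
                "endIndex": 69,
                "text": "(Here is the text that the model produced in response to the prompt.)",
            },
            "groundingChunkIndices": [0, 3],
            "confidenceScores": [0.9919262, 0.9919262],
        }
    ],
}

resolved = cited_sources(metadata)
for entry in resolved:
    print(entry["text"])
    for title, uri, score in entry["sources"]:
        print(f"  cited: {title} ({uri}), confidence {score}")
```

With the example data this lists A and D as the sources for the segment. Whether B, C, and E were used at all is the part I can't tell.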

Thanks for your help!
