Missing PageNumber Value in Vertex AI Search Unstructured Search Results

Hello,

I'm working with a Vertex AI Search application over an unstructured data store (documents imported with a metadata JSONL file). The PageNumber value is missing from my search results: the extractive segments return only relevanceScore, id, and content. In my previous experience with similar search implementations, this value is typically present and indicates which page of the document the result comes from.

I would appreciate clarification on:

  1. Under what conditions should the PageNumber value be present in the search results?
  2. Are there specific scenarios or document types where PageNumber information might not be available?

Has anyone else encountered this, or can anyone explain the expected behavior of PageNumber values in Vertex AI Search?

Thank you in advance for any insights.

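Search code reference:

# Assumed import (not shown in the original snippet); the API version in use may differ.
from google.cloud import discoveryengine_v1 as discoveryengine

# client, serving_config, and user_query are defined elsewhere.
response = client.search(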
    request=discoveryengine.SearchRequest(
        query=user_query,
        # filter=f"category: ANY(\"{filter}\")",
        page_size=10,
        serving_config=serving_config,
        content_search_spec=discoveryengine.SearchRequest.ContentSearchSpec(
            extractive_content_spec=discoveryengine.SearchRequest.ContentSearchSpec.ExtractiveContentSpec(
                max_extractive_segment_count=4,
                return_extractive_segment_score=True,
                num_previous_segments=1,
                num_next_segments=1,
            )
        ),
        query_expansion_spec=discoveryengine.SearchRequest.QueryExpansionSpec(
            condition=discoveryengine.SearchRequest.QueryExpansionSpec.Condition.AUTO,
            pin_unexpanded_results=True,
        ),
        spell_correction_spec=discoveryengine.SearchRequest.SpellCorrectionSpec(
            mode=discoveryengine.SearchRequest.SpellCorrectionSpec.Mode.AUTO
        )
    )
)
print(response)

 



SearchPager reference:

 

SearchPager<results {
  id: "doc-164"
  document {
    name: "projects/REPLACE VALUE/locations/global/collections/default_collection/dataStores/REPLACE VALUE_1733723908768/branches/0/documents/doc-164"
    id: "doc-164"
    struct_data {
      fields {
        key: "title"
        value {
          string_value: "REPLACE VALUE.pdf"
        }
      }
      fields {
        key: "start_year"
        value {
          string_value: "2020"
        }
      }
      fields {
        key: "model"
        value {
          list_value {
            values {
              string_value: "VENUE"
            }
            values {
              string_value: "ALL"
            }
          }
        }
      }
      fields {
        key: "end_year"
        value {
          string_value: "Now"
        }
      }
      fields {
        key: "category"
        value {
          list_value {
            values {
              string_value: "QX1.6"
            }
            values {
              string_value: "ALL"
            }
          }
        }
      }
    }
    derived_struct_data {
      fields {
        key: "link"
        value {
          string_value: "gs://REPLACE VALUE.pdf"
        }
      }
      fields {
        key: "extractive_segments"
        value {
          list_value {
            values {
              struct_value {
                fields {
                  key: "relevanceScore"
                  value {
                    number_value: 0.83465969562530518
                  }
                }
                fields {
                  key: "id"
                  value {
                    string_value: "c1"
                  }
                }
                fields {
                  key: "content"
                  value {
                    string_value: "REPLACE VALUE"
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}
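
To check every segment rather than reading the printed output by eye, something like the following could be used. It is only a sketch under two assumptions: that derived_struct_data has already been converted to a plain Python dict (the conversion is not shown), and that the page number, when present, would appear under a key named pageNumber, which is a guess based on the camelCase of relevanceScore. The helper name segment_page_numbers is hypothetical.

```python
from typing import Any, Dict, List, Optional


def segment_page_numbers(
    derived: Dict[str, Any], key: str = "pageNumber"
) -> List[Optional[str]]:
    """Return one entry per extractive segment: the page number as a string,
    or None when the segment has no such key.

    `key` is an assumed field name, not confirmed by the output above.
    """
    segments = derived.get("extractive_segments", [])
    return [str(seg[key]) if key in seg else None for seg in segments]


# Plain-dict form of the derived_struct_data shown in the output above.
sample = {
    "link": "gs://REPLACE VALUE.pdf",
    "extractive_segments": [
        {
            "relevanceScore": 0.83465969562530518,
            "id": "c1",
            "content": "REPLACE VALUE",
        },
    ],
}

print(segment_page_numbers(sample))
```

For the result above this reports None for the single segment, which matches what I see: no page number field is returned.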

 

 
