
Advice for relevant search & Vertex AI Search for retail

I have been working on Vertex AI Search for retail for large retail services, and I really need an expert's advice. This will be a long story, so thank you for your consideration. I would appreciate as much advice as you can give, as soon as possible.

My first goal is efficient relevance and good faceted search.
I would like to share one product that has already been indexed in the catalog:
{
  "name": "projects/295037490706/locations/global/catalogs/default_catalog/branches/1/products/10000",
  "id": "10000",
  "type": "PRIMARY",
  "primaryProductId": "10000",
  "categories": [
    "Bearings",
    "Mounted Bearings",
    "Mounted Ball Bearings",
    "Take-Up Bearings",
    "Take-Up Bearing Units"
  ],
  "title": "AMI UEST207-23NP 1-7/16 Wide Accu-Loc Nickel Wide SL Plated Take-Up Unit",
  "brands": [
    "AMI Bearings"
  ],
  "languageCode": "en-US",
  "attributes": {
    "count": {
      "numbers": [
        6
      ]
    },
    "Locking": {
      "text": [
        "Concentric collar"
      ],
      "catalogLevelSearchableSnapshot": false,
      "catalogLevelIndexableSnapshot": true
    },
    "has_image": {
      "text": [
        "True"
      ],
      "catalogLevelSearchableSnapshot": false,
      "catalogLevelIndexableSnapshot": true
    },
    "Insert_Material": {
      "text": [
        "Bearing steel"
      ],
      "catalogLevelSearchableSnapshot": false,
      "catalogLevelIndexableSnapshot": true
    },
    "has_aux_data": {
      "text": [
        "False"
      ],
      "catalogLevelSearchableSnapshot": false,
      "catalogLevelIndexableSnapshot": true
    },
    "Housing": {
      "text": [
        "Nickel Plated Cast Iron"
      ],
      "catalogLevelSearchableSnapshot": false,
      "catalogLevelIndexableSnapshot": true
    },
    "Duty": {
      "text": [
        "Standard"
      ],
      "catalogLevelSearchableSnapshot": false,
      "catalogLevelIndexableSnapshot": true
    },
    "shipping": {
      "text": [
        "0.0"
      ],
      "catalogLevelSearchableSnapshot": false,
      "catalogLevelIndexableSnapshot": true
    },
    "has_alternates": {
      "text": [
        "False"
      ],
      "catalogLevelSearchableSnapshot": false,
      "catalogLevelIndexableSnapshot": true
    },
    "mpn": {
      "text": [
        "UEST207-23NP"
      ],
      "catalogLevelSearchableSnapshot": false,
      "catalogLevelIndexableSnapshot": true
    },
    "shipping_weight": {
      "text": [
        "3.6"
      ],
      "catalogLevelSearchableSnapshot": false,
      "catalogLevelIndexableSnapshot": true
    },
    "I_D_": {
      "text": [
        "1 7/16 in"
      ],
      "catalogLevelSearchableSnapshot": false,
      "catalogLevelIndexableSnapshot": true
    },
    "Seal": {
      "text": [
        "Contact Seal with Slinger"
      ],
      "catalogLevelSearchableSnapshot": false,
      "catalogLevelIndexableSnapshot": true
    },
    "has_tech_specs": {
      "text": [
        "True"
      ],
      "catalogLevelSearchableSnapshot": false,
      "catalogLevelIndexableSnapshot": true
    }
  },
  "priceInfo": {
    "price": 96.83,
    "originalPrice": 96.83,
    "priceRange": {}
  },
  "availability": "IN_STOCK",
  "uri": "uri",
  "images": [
    {
      "uri": "uri"
    }
  ],
  "conditions": [
    "new"
  ],
  "publishTime": "2024-12-16T07:09:29.949806Z"
}
I would like to know whether this product structure is good enough for retail search, what is missing, and what I should update. Thank you.
 
Secondly, facets. I am really struggling with this part.
As you can see, the product has several attribute keys such as "Seal", "shipping_weight", "mpn", "Duty", "Housing", and "Insert_Material".
Often, among the more than 100k products, products have more than 30 different facet keys.
I extracted all possible facet keys and found more than 4000, which is far over the limit of 200 (the Vertex AI default).
 
I tried dynamicFacetSpec, which seems meant to help with this, but it is not working.
The facet keys I see in the search results are always the facet keys I configured when sending the request to the service.
I have never seen new facet keys appear after running the search service, but you said it's possible.
I hope it works the way you mentioned, but I haven't found the correct way yet.
For example, when I run this service I have to limit the facet keys to 200, and then the main attributes of the products are missing.
 
So I found my own solution, which may be bad, but let me share it.
- Each product has a different number of attribute keys, so I added an attribute to each product containing its attribute key count.
- When someone searches, I run the query ordered by attribute count descending and fetch the 10 products that seem to have the most facet keys for the search query. (Sounds okay? I know it's not perfect.)
- Then I build the facet keys from those products (this is programmatic, not semantic search) and re-search the products based on the search query and the generated facet keys.
- This looks good, but I am not sure the facet keys are relevant. The result was actually not perfect, and worse than Searchspring in terms of relevancy.
 
----- 

 

Please help me find the best solution for this dynamic search, especially with 4000 attribute keys.
 
 
 
Thirdly, we also need to incorporate the facetable attribute data into the searchable description.

For example, if I search for "1" pillow block bearing".

Thank you for your feedback.

 

 


Hi @pavlodidushko,

Welcome to Google Cloud Community!

Your current product structure provides a solid foundation, particularly with key identifiers like name, id, type, and primaryProductId. You're also on the right track with hierarchical categories, descriptive title, brands, languageCode, essential priceInfo and availability, display elements like uri and images, conditions for used goods, and publishTime for new arrivals. However, there's room for optimization, especially to improve retail search capabilities.

Here’s what's missing or needs improvement:

  • attributes Structure:
    • text arrays: Using text arrays for single values is redundant. If an attribute only has one value (like Duty: "Standard"), just use a single string. Use arrays only if there are multiple values for a particular attribute.
    • catalogLevelSearchableSnapshot/catalogLevelIndexableSnapshot: These flags are not particularly useful for the Vertex AI Search for Retail service. They are more relevant for the underlying Google Cloud infrastructure of data storage. Vertex AI handles the indexing automatically so you do not need these flags. You should also be careful about setting catalogLevelSearchableSnapshot = true, because this will make all of the text data of this attribute searchable in full-text format, which can severely impact search relevance.
    • Consistency: Some attributes like "I_D_" seem to have inconsistent capitalization and formatting. You should normalize them to id with units, e.g., "id": "1 7/16 in" or separate "id": "1 7/16" and "id_unit": "in". The unit is also crucial for searches like "1 inch bearing".
    • Data Types: You have "count" as numbers but other attributes as text. Consider what type each attribute should be. Some could be numerical or date/time. Ensure you correctly identify which fields are numbers and date, since those values will not be searched as text unless specified in the search or facet settings.
  • Searchable Description/Summary: You don't have a dedicated field for a summary or description of the product. This is critical for long-tail searches. I would recommend generating a summary text that combines the most important attributes such as "brand", "material", "type", "application", "seal" to be used for full-text matching.
  • Synonyms/Variants: You'll need a way to handle synonyms (e.g., "pillow block" and "mounted bearing") and product variants (e.g., different sizes of the same product line). Vertex AI supports some synonyms in search configuration but it is good to add some synonyms in the data fields themselves.
  • Rich text description: If you have rich text description of the product, you can also store it as searchable data, and Vertex AI will index it.
  • Structured attributes: if possible, it would be helpful to add structured attributes, for example, in addition to I_D_, also have something like inner_diameter_mm with numerical value in mm, or inner_diameter_in with numerical value in inch. This will help with unit conversion, and numerical search filtering.

With regard to facets, you've identified the core problem: too many facet keys and a dynamic need for them. Your approach of using attribute counts is clever, but as you've seen, it's not optimal for relevance.

To improve dynamic faceting, I would suggest a combination of approaches: 

  1. Prioritize Facet Keys:
    • Manual Configuration: Identify your most important and common attributes and explicitly define them as facetable in your Search Service. Examples: brands, categories, duty, housing, seal, and id_unit
    • Count based: Count the number of occurrences of each attribute key in the whole product catalog, and select the ones that occur most frequently.
    • Query based: As you see in your approach, count the number of occurrences of each attribute in the top-n relevant products according to search query.
  2. Create a Rich Description:
    • The description should be detailed and include many terms and keywords for long tail searches.
  3. Use dynamicFacetSpec Effectively:
    • Instead of requesting every attribute as a facet, enable dynamicFacetSpec. This allows the Search Service to determine which attributes are best suited for facets based on the search context.
  4. Precompute Most Common Attributes (Optional):
    • For performance reasons, you can precompute the most common (and other frequently used) attributes during data ingestion (or update) and store them as product metadata. This is helpful if you have a large catalog and want to avoid real-time computation for attribute ranking.
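The count-based and query-based ideas in item 1 can be sketched in a few lines of Python. This is only an illustration under some assumptions: products are plain dictionaries shaped like your catalog JSON, count_attribute_keys and pick_facet_keys are hypothetical helper names, and the 30% threshold is an arbitrary value to tune. Custom attributes are referenced as facet keys in the form attributes.KEY.

```python
from collections import Counter


def count_attribute_keys(products: list) -> Counter:
    """Count how many products carry each attribute key."""
    counts = Counter()
    for product in products:
        counts.update(product.get("attributes", {}).keys())
    return counts


def pick_facet_keys(core_keys: list, top_results: list,
                    limit: int = 200, min_share: float = 0.3) -> list:
    """Core facet keys first, then attribute keys that appear in at
    least `min_share` of the top results for the current query."""
    counts = count_attribute_keys(top_results)
    threshold = min_share * max(len(top_results), 1)
    selected = list(core_keys)
    for key, seen in counts.most_common():
        facet_key = f"attributes.{key}"
        if seen >= threshold and facet_key not in selected:
            selected.append(facet_key)
    # Stay under the per-request facet limit mentioned in the question.
    return selected[:limit]
```

Calling count_attribute_keys on the whole catalog gives the count-based ranking; calling pick_facet_keys on the top results of a query gives the query-based one, so a search for "shirt" never inherits bearing facets.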

In terms of incorporating facetable attributes into the searchable description, you're on the right track with the "1" pillow block bearing" example.

  1. Add Attributes to description: Include key facets like "pillow block" (which can also be a category), "bearing", the ID (1"), in your product description.
  2. Full Text Search: Since description is indexed as a full-text field, Vertex AI Search will match against your search query.
  3. Benefit: Users can find products based on their specific attributes even when they are not explicitly set as a searchable facet.
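As a rough sketch of step 1, the description can be generated from the fields you already have. The helper name build_description and the choice of which attributes to include are assumptions for illustration; the output would go into the product's description field before import.

```python
def build_description(product: dict, attribute_keys: list,
                      extra_terms: list = None) -> str:
    """Join title, brand, leaf category and selected attribute values
    into one searchable text. `extra_terms` can carry synonyms such as
    'pillow block' that are not present in the raw data."""
    parts = [product.get("title", "")]
    parts.extend(product.get("brands", []))
    categories = product.get("categories", [])
    if categories:
        parts.append(categories[-1])  # most specific category
    attrs = product.get("attributes", {})
    for key in attribute_keys:
        label = key.replace("_", " ").strip()
        for value in attrs.get(key, {}).get("text", []):
            parts.append(f"{label}: {value}")
    parts.extend(extra_terms or [])
    return ". ".join(part for part in parts if part)
```

For the product in the question, passing ["I_D_", "Seal", "Housing", "Duty"] would add the inner diameter, seal, housing, and duty values to the text, so a query that mentions a size or a seal type has something to match against.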

Here are steps that you may follow:

  1. Revise your data structure.
  2. Identify your core facets.
  3. Configure facets using the API with your core attributes.
  4. Enable dynamic faceting.
  5. Test, iterate, and analyze. Check how the facets and overall search relevancy perform in real-world scenarios.
  6. Consider adding synonyms in attributes or a separate synonym configuration to address search variations (e.g. pillow block vs mounted bearing).
  7. Consider A/B testing of different facet configurations and your overall search setup to better understand your specific dataset.
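For steps 3 and 4, a search request body might look roughly like the sketch below. It only builds the JSON for the search method and inspects a response dictionary; build_search_request and split_facets are hypothetical helper names, and the field names (facetSpecs, dynamicFacetSpec, dynamicFacet) are written from my reading of the REST reference, so please verify them against the current documentation. As far as I understand, dynamic facets are only generated for textual attributes that are enabled for dynamic faceting in the catalog's attribute controls, and newer versions of the reference may direct you to enable dynamic facets in the serving config rather than in the request. If the request field has no visible effect, those settings are the first thing to check.

```python
def build_search_request(query: str, visitor_id: str,
                         core_facet_keys: list, page_size: int = 20) -> dict:
    """Build a search request body with a few fixed facets plus
    dynamic faceting turned on (field names assumed from the REST docs)."""
    return {
        "query": query,
        "visitorId": visitor_id,
        "pageSize": page_size,
        # Facets you always want, e.g. "brands" or "attributes.Housing".
        "facetSpecs": [{"facetKey": {"key": key}} for key in core_facet_keys],
        # Ask the service to add facets it considers relevant to the query.
        "dynamicFacetSpec": {"mode": "ENABLED"},
    }


def split_facets(response: dict) -> tuple:
    """Separate facet keys the service generated dynamically from the
    ones that were requested explicitly."""
    dynamic, requested = [], []
    for facet in response.get("facets", []):
        target = dynamic if facet.get("dynamicFacet") else requested
        target.append(facet["key"])
    return dynamic, requested
```

Logging the result of split_facets for a few queries is a quick way to confirm whether any facet in the response was actually generated dynamically.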

Finally, here is some documentation that you may find useful:

Was this helpful? If so, please accept this answer as “Solution”. If you need additional assistance, reply here within 2 business days and I’ll be happy to help.

Thank you for your advice.

It means a lot to me. I have a question regarding the dynamicFacetSpec now.

I tried to use it in my codebase, but the result is the same as when I don't use it.

Could you please walk me through how to use it in code?

I expected that when I add dynamicFacetSpec in my code, I would get facet keys automatically from the Vertex AI search engine, but it does not return any keys.

For now, facet key search is the main issue.

I also calculated the frequencies of all facet keys and extracted the top 200, but that is not a good solution.

For example, the top 200 were all related to bearings.

 

But when users search for "shirt" or "food", the bearing facet keys have nothing to do with "shirt" or "food".

So in my engine, the search actually has to be done twice: once to get the relevant facet keys for the user query, and once to search with the query and those facet keys (without facet search).

 

 

Thank you for your further assistance.

Pavlo