Research: Product Search Tool Consolidation

Date: 2026-04-24

Context: Agent latency scales with tool call count × per-call latency. BrightData web search is a primary latency driver: repeated searches cluster around comparison-style product queries that domain-specific tools could handle.

Current Search Landscape

Tools Available to the Agent

Tool	Backend	What It Returns	When Used
`search_products`	pam-ml-search (via rover-mcp)	fidoId, brand, category, description — no prices, no images	Default product search
`search_fidora_products`	FIDORA (via rover-mcp)	fidoId, prices, images, retailer URLs, availability — rich data	Feature-gated (`product_card=true`)
`web_search` (custom)	BrightData SERP API	title, retailer, productUrl, price, imageUrl — real PDPs from Button partners	Feature-gated (`web_search=true`), requires BrightData credentials
`web_search` (OpenAI native)	OpenAI built-in web search	General web results	Fallback when custom `web_search` flag is off
`search_offers`	Offer Search Service (via rover-mcp)	offerId, points, category, description, score	Offer discovery
`search_nearby_offers`	Lidar + Offer Search (via rover-mcp)	Nearby retailers + matching offers	Location-based queries
`fetch_webpage`	Direct HTTP fetch (via rover-mcp)	Page title + body text	Enriching URLs from search results

How the Agent Uses Search Today

From the system prompt (prompts/conversational.txt):

Product discovery with constraints (price, purpose, size, type): Agent is instructed to call web_search to find current product detail pages (PDPs)
Product-card component (prompts/components/product-card.yaml): Designed for dual-source results — catalog (search_products/search_fidora_products) merged with web (web_search)
Merge strategy: If a product appears in both catalog and web results, prefer catalog version (has fidoId for offer matching)
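
The merge strategy above can be sketched as follows. This is a minimal illustration, not the agent's actual implementation: the `ProductResult` type and the title-plus-retailer match key are assumptions, since the prompt does not specify how "the same product" is detected across sources.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ProductResult:
    title: str
    retailer: str
    price_cents: Optional[int] = None
    fido_id: Optional[str] = None  # present only on catalog results


def match_key(p: ProductResult) -> tuple[str, str]:
    # Assumption: normalized title + retailer identifies the same product.
    return (" ".join(p.title.lower().split()), p.retailer.lower().strip())


def merge_results(
    catalog: list[ProductResult], web: list[ProductResult]
) -> list[ProductResult]:
    merged: dict[tuple[str, str], ProductResult] = {}
    # Catalog entries are inserted first so they win on key collisions.
    for p in catalog:
        merged.setdefault(match_key(p), p)
    # A web entry is kept only if no catalog entry shares its key.
    for p in web:
        merged.setdefault(match_key(p), p)
    return list(merged.values())
```

Inserting catalog results first keeps the fidoId on any duplicated product, which is what offer matching needs.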

The Latency Problem

BrightData web search is expensive:

2-step pipeline: Step 1 (Google Shopping search) + Step 2 (product detail page fetch for each result)
top_n=5 means up to 5 product detail page fetches per search
Each step involves an external proxy call through BrightData’s infrastructure
Total latency per web_search call: likely 3-10 seconds depending on network

Agent fan-out pattern for comparison queries:

User: “Compare Tide vs Gain vs All laundry detergent”
Agent calls web_search 3 times (once per brand) → 9-30 seconds of search latency alone, assuming the calls run sequentially at 3-10 seconds each
Then may call fetch_webpage on individual PDPs → additional latency
This same query could be answered by a single search_fidora_products call with all three brands
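
The arithmetic behind the 9-30 second figure, as a small sketch. It assumes the calls run one after another and uses the 3-10 second per-call estimate from this note; the function name is illustrative.

```python
def sequential_latency(
    calls: int, per_call_range_s: tuple[float, float]
) -> tuple[float, float]:
    """Return the (best, worst) total latency when calls run back to back."""
    low, high = per_call_range_s
    return (calls * low, calls * high)


# Per-call estimate for web_search taken from this note (not a measurement).
WEB_SEARCH_RANGE_S = (3.0, 10.0)

print(sequential_latency(3, WEB_SEARCH_RANGE_S))  # (9.0, 30.0)
```

Any fetch_webpage calls that follow would add to these totals.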

When BrightData is genuinely needed vs when it’s overkill:

Query Type	Current Tool	Could Use Instead
“Find me laundry detergent offers”	search_offers	search_offers ✓ (already correct)
“Compare Tide vs Gain”	web_search (×2-3)	search_fidora_products (×1)
“What’s the cheapest cereal at Target?”	web_search + search_nearby_offers	search_fidora_products + search_nearby_offers
“Find me a good blender under $50”	web_search	Genuinely needs web search (non-grocery, not in catalog)
“What’s the price of Organic Oat Milk?”	web_search	search_fidora_products (if in catalog)
“Is Brand X recalled?”	web_search	Genuinely needs web search (safety/news)
“Best rated dog food 2026”	web_search	Genuinely needs web search (reviews/ratings)

Backend Capabilities

FIDORA (rich product data, already in CCS)

FIDORA already provides most of what BrightData returns for in-catalog products:

Data	BrightData	FIDORA	Notes
Product name	✓	✓
Price	✓ (current web price)	✓ (PriceCents)	FIDORA has retailer-specific prices
Image URL	✓ (web scraped)	✓ (NormalizedImageURL)	FIDORA has CDN-normalized images
Product URL	✓ (real PDP)	✓ (URL to retailer page)
Retailer	✓ (merchant name)	✓ (RetailerID → name)
Availability	✗	✓ (IsAvailable)	FIDORA is better here
Offers/Points	✗	Via FidoIndex + Offer Guardian (OG)	Can be composed
Reviews/Ratings	✓	✗	Only web search has this
Non-grocery products	✓	✗ (grocery/CPG focused)	Web search needed for non-catalog

Key insight: For the ~80% of product queries that involve grocery/CPG products in Fetch’s catalog, FIDORA + Offer Guardian covers every field BrightData returns except reviews/ratings, adds offer/point data, and avoids the latency penalty.

CCS Product Pipeline (from prior research)

CCS already has a comprehensive product enrichment pipeline:

FPS → product metadata (name, brand, category)
FIDORA → retailer-specific data (price, image, URL, availability)
Button → PPD, merchant logo, stickers
FidoIndex + Offer Guardian → matching offers with points
NELI → user eligibility filtering

All of this is available via GET /v1/products/{fido_id}/enrich or POST /v1/products/enrich (batch).
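
A minimal sketch of how a caller might build requests against these two endpoints. The paths come from this note; the base URL and the JSON body shape (`fido_ids`) are assumptions, because the request schema is not documented here. The function only constructs the request and does not send it.

```python
import json
from urllib.parse import quote
from urllib.request import Request

# Hypothetical base URL; replace with the real CCS host.
CCS_BASE_URL = "http://ccs.internal.example"


def build_enrich_request(fido_ids: list[str]) -> Request:
    """Use the single-product GET for one ID, the batch POST otherwise."""
    if not fido_ids:
        raise ValueError("at least one fido_id is required")
    if len(fido_ids) == 1:
        path = f"/v1/products/{quote(fido_ids[0], safe='')}/enrich"
        return Request(CCS_BASE_URL + path, method="GET")
    # Assumed body shape: {"fido_ids": [...]}. Check the CCS API contract.
    body = json.dumps({"fido_ids": fido_ids}).encode("utf-8")
    return Request(
        CCS_BASE_URL + "/v1/products/enrich",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

For comparison queries the batch form matters: three brands can be enriched in one round-trip instead of three.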

Where the New `product_search` Tool Fits

The proposed product_search tool would:

Accept: Natural language product query + optional filters (brand, category, price range, store)
Call CCS: Use the existing FIDORA search + enrichment pipeline
Return: Rich product data (name, price, image, URL, availability, offers, points) in a single call
Replace: Multiple web_search calls for in-catalog product queries
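
One possible shape for the tool's input and output, sketched as plain dataclasses. All class and field names are hypothetical; they mirror the accept/return lists above and the FIDORA fields named earlier, and the real tool class may look different.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ProductSearchInput:
    query: str  # natural language product query
    brand: Optional[str] = None
    category: Optional[str] = None
    min_price_cents: Optional[int] = None
    max_price_cents: Optional[int] = None
    store: Optional[str] = None

    def __post_init__(self) -> None:
        if not self.query.strip():
            raise ValueError("query must not be empty")
        if (
            self.min_price_cents is not None
            and self.max_price_cents is not None
            and self.min_price_cents > self.max_price_cents
        ):
            raise ValueError("min_price_cents exceeds max_price_cents")


@dataclass
class Offer:
    offer_id: str
    points: int


@dataclass
class ProductSearchResult:
    fido_id: str
    name: str
    price_cents: Optional[int] = None
    image_url: Optional[str] = None
    url: Optional[str] = None
    is_available: Optional[bool] = None
    offers: list[Offer] = field(default_factory=list)
```

Keeping the filters optional lets the agent pass a bare query for simple lookups and add constraints only when the user states them.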

The flow:

Agent receives: "Compare Tide vs Gain vs All"
    ↓
Old: web_search("Tide") + web_search("Gain") + web_search("All") = 3 tool calls, 9-30s
    ↓
New: product_search("Tide vs Gain vs All laundry detergent") = 1 tool call, target <2s
    ↓
Backend: CCS → FIDORA search → batch enrich (FPS + Button + Offers) → rich results

Services Involved

Service	Role	Changes Needed
consumer-agent	New `product_search` tool + prompt updates	New tool class, prompt rewrite, feature flag
consumer-context-service	Backend for product search + enrichment	May need a combined search+enrich endpoint
rover-mcp	Remove `search_products` tool (Fido-only)	Tool removal (replaced by CCS)

OpenAI Native Web Search

For non-product queries, OpenAI’s built-in web search tool can handle general web context:

No BrightData credentials needed
Integrated into the Responses API
Handles: news, reviews, general knowledge, non-grocery products
Configured as a dict-based tool in LangChain (already implemented when web_search flag is off)
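
A sketch of the flag-based choice between the custom and native web search tools. The tool type string `web_search_preview` is the one shown in LangChain's documentation for OpenAI's built-in web search; whether the codebase uses that exact string is an assumption, and the helper name is illustrative.

```python
from typing import Any

# Assumed tool spec; the codebase may use a different type string.
NATIVE_WEB_SEARCH_TOOL: dict[str, Any] = {"type": "web_search_preview"}


def select_web_search_tool(flags: dict[str, bool], custom_web_search: Any) -> Any:
    """Return the custom BrightData-backed tool when the web_search flag
    is on, otherwise the dict-based OpenAI native tool spec."""
    if flags.get("web_search", False):
        return custom_web_search
    return NATIVE_WEB_SEARCH_TOOL
```

The selected tool would then be included in the list passed to the chat model's `bind_tools` call alongside the agent's other tools.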

Implementation Considerations

CCS Endpoint: Does CCS need a new combined “search + enrich” endpoint? Currently search_products returns bare fidoIds (no prices or images) and enrichment is a separate call. A single call that searches and enriches would reduce round-trips.
Search Quality: FIDORA’s BM25 search may not match BrightData’s Google Shopping quality for fuzzy/natural language queries. May need to combine with pam-ml-search for better recall.
Offer Integration: A key advantage over web search — the product_search tool can return offers/points alongside product data, which web search cannot.
Fallback: When product_search returns no results (product not in catalog), the agent should fall back to OpenAI native web search, not BrightData.
Prompt Engineering: The system prompt needs significant updates to route product queries to product_search first and only use web search for genuinely external queries (reviews, recalls, non-grocery, etc.).
Feature Flags: Need a product_search flag to gate rollout. Can coexist with existing web_search and product_card flags during transition.
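
The fallback and feature-flag considerations above, expressed as a small sketch. In practice the agent's routing is driven by the system prompt, so this only illustrates the intended behavior; the function name and the injected search callables are hypothetical.

```python
from typing import Callable

SearchFn = Callable[[str], list[dict]]


def route_product_query(
    query: str,
    flags: dict[str, bool],
    product_search: SearchFn,
    native_web_search: SearchFn,
    legacy_search: SearchFn,
) -> tuple[str, list[dict]]:
    """Return (tool_used, results) for a product query."""
    # Flag off: keep the existing behavior during the transition.
    if not flags.get("product_search", False):
        return ("legacy", legacy_search(query))
    results = product_search(query)
    if results:
        return ("product_search", results)
    # Not in catalog: fall back to OpenAI native web search, not BrightData.
    return ("native_web_search", native_web_search(query))
```

Passing the search functions in as arguments keeps the sketch testable with stubs and leaves the existing web_search and product_card paths untouched while the flag is off.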
