
Phase 2 Tool Development Plan (Revised)

Key Principle: Focus on graph-backed features that require traversal, relationships, and temporal patterns. The LLM can handle location finding, price comparison (via external APIs), recipe generation, and substitution reasoning.

Phase 1 Complete: 20 graph tools + 2 Points tools implemented.

Phase 2 Scope: 8 new tools focusing on graph-unique capabilities while maintaining a small, efficient graph.


Store in the graph:

  • User behavior patterns - purchase history, preferences, temporal patterns
  • Product relationships - similarity, category hierarchies, brand relationships
  • User-product affinities - what users buy together, collaborative filtering
  • Offer eligibility - personalized, time-bound relationships
  • Retailer visit patterns - where users shop, frequency, spending

Keep out of the graph (external APIs or the LLM):

  • Real-time pricing - Use external pricing API, too volatile for graph
  • Stock/inventory - Use external inventory API, changes too frequently
  • Store locations - Use geocoding service, not graph traversal
  • Recipe instructions - LLM can generate from product lists
  • Product substitutions - LLM can reason about similarity

Keep the graph small:

  • Keep graph to users, products, offers, retailers (core entities)
  • Avoid cartesian explosion (Product × Retailer × Price × Time)
  • Use aggregates and patterns, not granular transactions
  • Prune old data (>2 years) periodically

What’s Missing: Comprehensive product metadata beyond name/brand/category

Why Graph, Not LLM:

  • User’s historical relationship with products (how often, when, spending)
  • Product attributes that enable filtering (organic, gluten-free, sustainable)
  • Dynamic attributes based on user context (previously_purchased, in_usual_rotation)

Solution: Enhance Product node with rich attributes


What’s Missing: “What stores are near me?” without querying every time

Why Graph, Not External API:

  • User’s habitual stores are already in graph (VISITED relationship)
  • Historical store preferences inform recommendations
  • Store-offer associations require graph traversal

Solution: Pre-compute user’s “primary stores” cluster, avoid storing all products×stores


What’s Missing: Track spending patterns over time

Why Graph, Not External:

  • Temporal spending patterns by category/brand/store
  • Behavioral changes (spending more/less on category)
  • Budget adherence tracking over time

Solution: Add spending data to PURCHASED relationship (aggregate)


What’s Missing: Which offers work together, optimal activation strategy

Why Graph, Not LLM:

  • Offer compatibility rules (can’t stack certain offers)
  • User eligibility across multiple offers
  • Product-offer mappings for user’s predicted purchases

Solution: Add offer stacking metadata, graph traversal for optimization


What’s Missing: New product discovery, discontinuation alerts

Why Graph:

  • Track when products enter/leave user’s purchase rotation
  • Identify emerging patterns in user behavior
  • Collaborative trends (similar users trying new products)

What’s Missing: Multi-user households, shared preferences

Why Graph:

  • Household relationships between users
  • Shared purchase patterns
  • Preference conflicts within household

Phase 2: Enhanced Context & Intelligence (8-10 weeks)


Add rich product metadata, spending analytics, smarter offer optimization, household context, and social discovery without exploding graph size.


Replaces: get_product_context (enhance existing tool)

Why This Tool: LLM needs structured, personalized product data that only graph can provide:

  • User’s purchase history with this product
  • How it relates to user’s preferences
  • Dynamic attributes based on user context

Tool Spec:

{
"name": "get_product_details_enhanced",
"description": "Get comprehensive product details with user-specific context and attributes",
"input": {
"user_id": "string (required)",
"product_id": "string (required)"
},
"output": {
"product": {
"product_id": "string",
"name": "Greek Yogurt",
"brand": "Chobani",
"category": "Dairy",
"attributes": {
"organic": true,
"gluten_free": true,
"sustainable_packaging": false,
"tags": ["high-protein", "probiotic", "low-sugar"]
}
},
"user_context": {
"purchased_times": 15,
"last_purchase": "2025-01-01",
"avg_interval_days": 14,
"next_predicted": "2025-01-15",
"in_usual_rotation": true,
"spending_on_product": 74.85,
"price_trend": "stable"
},
"alternatives": [
{
"product_id": "PROD456",
"name": "Fage Greek Yogurt",
"similarity_score": 0.92,
"user_has_tried": false
}
],
"applicable_offers": [
{
"offer_id": "O123",
"title": "Dairy Bonus",
"points": 500
}
]
}
}
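
The user_context fields in the spec above can be derived from the purchase timestamps already stored on the PURCHASED relationship. The following is a minimal sketch of that derivation, not the tool's actual implementation: the function name derive_user_context is hypothetical, and it assumes one unit per purchase at a constant unit price.

```python
from datetime import date, timedelta


def derive_user_context(purchase_dates: list[date], unit_price: float) -> dict:
    """Derive user_context-style fields from one user's purchase dates for one product.

    Assumption: one unit per purchase at a constant unit_price.
    """
    if len(purchase_dates) < 2:
        raise ValueError("need at least two purchases to estimate an interval")
    ordered = sorted(purchase_dates)
    # Days between consecutive purchases
    gaps = [(later - earlier).days for earlier, later in zip(ordered, ordered[1:])]
    avg_interval = round(sum(gaps) / len(gaps))
    return {
        "purchased_times": len(ordered),
        "last_purchase": ordered[-1].isoformat(),
        "avg_interval_days": avg_interval,
        "next_predicted": (ordered[-1] + timedelta(days=avg_interval)).isoformat(),
        "spending_on_product": round(unit_price * len(ordered), 2),
    }


# 15 purchases spaced two weeks apart, the latest on 2025-01-01
dates = [date(2025, 1, 1) - timedelta(days=14 * i) for i in range(15)]
context = derive_user_context(dates, unit_price=4.99)
```

With this input the sketch reproduces the sample values in the spec: 15 purchases, a 14-day interval, and a next predicted purchase on 2025-01-15.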

Schema Changes:

Product {
product_id: "P123",
name: "Greek Yogurt",
brand: "Chobani",
category: "Dairy",
// NEW: Attributes for filtering (not food-specific)
attributes: {
organic: true,
gluten_free: true,
vegan: false,
sustainable: false,
local: false,
handmade: false,
eco_friendly: true
},
tags: ["high-protein", "probiotic", "low-sugar"],
// Keep generic - works for any product type
product_type: "food", // or "household", "personal_care", "pet"
// Optional food-specific (if product_type = "food")
nutrition: {
calories: 150,
protein_g: 12,
sugar_g: 4
}
}

Implementation Effort: 1 week (enhance existing tool)


New Tool: track_spending_patterns

Why This Tool: Graph stores temporal purchase patterns with spending - LLM can’t do time-series analysis without this data.

Tool Spec:

{
"name": "track_spending_patterns",
"description": "Analyze user spending patterns by category, brand, store, and time period",
"input": {
"user_id": "string (required)",
"lookback_days": "number (default: 90)",
"group_by": "enum: category|brand|store|week|month"
},
"output": {
"spending_summary": {
"total_spent": 487.56,
"period_start": "2024-10-08",
"period_end": "2025-01-07",
"breakdown": [
{
"group_name": "Dairy",
"spent": 125.43,
"percent_of_total": 25.7,
"purchase_count": 23,
"vs_prev_period": "+12%",
"trend": "increasing"
}
]
},
"insights": {
"top_spending_category": "Dairy",
"fastest_growing_category": "Snacks (+35%)",
"declining_categories": ["Beverages (-10%)"]
}
}
}
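
As an illustration of how the breakdown rows might be computed, here is a small sketch that turns per-group spend for the current and previous period into percent_of_total, vs_prev_period, and trend. The helper name build_breakdown and the ±5% threshold for "increasing"/"declining" are assumptions made for this example; the plan does not define them.

```python
def build_breakdown(current: dict[str, float], previous: dict[str, float]) -> list[dict]:
    """Build breakdown rows, largest spend first."""
    total = sum(current.values())
    rows = []
    for group, spent in sorted(current.items(), key=lambda kv: kv[1], reverse=True):
        prev = previous.get(group)
        if prev:
            change = (spent - prev) / prev * 100
            vs_prev = f"{change:+.0f}%"
            # Assumed threshold: a change within +/-5% counts as stable
            trend = "increasing" if change > 5 else "declining" if change < -5 else "stable"
        else:
            # No spend in the previous period, so no percentage can be computed
            vs_prev, trend = None, "new"
        rows.append({
            "group_name": group,
            "spent": round(spent, 2),
            "percent_of_total": round(spent / total * 100, 1) if total else 0.0,
            "vs_prev_period": vs_prev,
            "trend": trend,
        })
    return rows


rows = build_breakdown(
    current={"Dairy": 125.43, "Snacks": 362.13},
    previous={"Dairy": 112.00, "Snacks": 362.13},
)
dairy = next(row for row in rows if row["group_name"] == "Dairy")
```

The two groups sum to the 487.56 total used in the sample output, so the Dairy row comes out at 25.7% of total.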

Schema Changes:

// Add to existing PURCHASED relationship
PURCHASED {
times: 5,
qty: 10,
first: datetime,
last: datetime,
timestamps: [datetime, ...],
// NEW: Aggregated spending data
total_spent: 49.95,
avg_price_per_unit: 4.99,
min_price: 4.49,
max_price: 5.49
}
// Optional: Add budget tracking to User
User {
user_id: "U123",
zip: "60601",
// NEW: Budget preferences (optional)
monthly_budget: 500.0,
budget_alerts: true
}
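
To show what "aggregate, not granular" could look like in the ingestion path, the sketch below folds each new purchase into running totals that mirror the PURCHASED properties, so individual transactions never need to be stored. The class PurchasedAggregate and its record method are hypothetical names for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PurchasedAggregate:
    """In-memory mirror of the aggregated PURCHASED properties."""
    times: int = 0
    qty: int = 0
    total_spent: float = 0.0
    min_price: Optional[float] = None
    max_price: Optional[float] = None

    def record(self, qty: int, unit_price: float) -> None:
        """Fold one purchase into the running aggregates."""
        self.times += 1
        self.qty += qty
        self.total_spent = round(self.total_spent + qty * unit_price, 2)
        self.min_price = unit_price if self.min_price is None else min(self.min_price, unit_price)
        self.max_price = unit_price if self.max_price is None else max(self.max_price, unit_price)

    @property
    def avg_price_per_unit(self) -> float:
        return round(self.total_spent / self.qty, 2) if self.qty else 0.0


agg = PurchasedAggregate()
for price in (4.49, 4.99, 4.99, 5.49, 4.99):
    agg.record(qty=2, unit_price=price)
```

The per-purchase timestamps list from the schema is omitted here; it would be appended in the same record step.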

Implementation Effort: 2 weeks


New Tool: optimize_offer_activation

Why This Tool: Graph stores offer eligibility, stacking rules, and product applicability - without this structured data, an LLM cannot reliably solve the underlying constraint-satisfaction problem.

Tool Spec:

{
"name": "optimize_offer_activation",
"description": "Find optimal offer activation strategy for maximum points based on predicted purchases",
"input": {
"user_id": "string (required)",
"prediction_window_days": "number (default: 7)",
"include_shopping_list": "array of product_ids (optional)"
},
"output": {
"strategy": {
"total_points_potential": 2500,
"offers_to_activate": [
{
"offer_id": "O123",
"title": "Dairy Bonus - 500 pts",
"points": 500,
"products_needed": ["PROD1", "PROD2"],
"user_likely_to_buy": ["PROD1", "PROD2"],
"confidence": "high",
"stackable_with": ["O456"]
}
],
"recommended_purchases": [
{
"product_id": "PROD1",
"name": "Greek Yogurt",
"triggers_offers": ["O123", "O456"],
"total_points": 800,
"next_predicted_purchase": "2025-01-10"
}
]
},
"stacking_rules": {
"max_points_per_transaction": 5000,
"incompatible_offer_pairs": [["O789", "O801"]]
}
}
}

Schema Changes:

// Add to Offer node
Offer {
offer_id: "O123",
title: "Dairy Bonus",
points: 500,
start: date,
end: date,
priority: 10,
venue_type: "grocery",
// NEW: Stacking rules
stackable: true,
max_stacks_per_transaction: 3,
incompatible_offers: ["O789"]
}
// NEW: Offer compatibility relationship
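// MATCH the existing offers first; CREATE on the full pattern would add duplicate Offer nodes
MATCH (o1:Offer {offer_id: "O123"}), (o2:Offer {offer_id: "O456"})
MERGE (o1)-[:STACKS_WITH {
max_combined_points: 5000
}]->(o2)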
// NEW: Offer incompatibility
CREATE (o1:Offer)-[:CONFLICTS_WITH]->(o2:Offer)
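
Once the stacking and conflict data has been read from the graph, choosing which offers to activate is a small constraint problem. The sketch below uses exhaustive search, which is only reasonable for a handful of eligible offers per user; it is one possible approach, not the planned algorithm. The function name best_offer_set and the point values for O456, O789, and O801 are assumptions for illustration.

```python
from itertools import combinations


def best_offer_set(
    points: dict[str, int],
    conflicts: set[frozenset[str]],
    max_points: int,
) -> tuple[list[str], int]:
    """Return the conflict-free offer set with the highest (capped) point total."""
    best: list[str] = []
    best_total = 0
    offer_ids = sorted(points)
    for size in range(1, len(offer_ids) + 1):
        for combo in combinations(offer_ids, size):
            # Skip any set that contains a CONFLICTS_WITH pair
            if any(frozenset(pair) in conflicts for pair in combinations(combo, 2)):
                continue
            # Apply the per-transaction cap
            total = min(sum(points[o] for o in combo), max_points)
            if total > best_total:
                best, best_total = list(combo), total
    return best, best_total


points = {"O123": 500, "O456": 300, "O789": 400, "O801": 450}  # hypothetical values
conflicts = {frozenset({"O789", "O801"})}
chosen, total = best_offer_set(points, conflicts, max_points=5000)
```

Because O789 and O801 conflict, the search keeps only the higher-value one alongside O123 and O456.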

Implementation Effort: 2 weeks


New Tool: get_user_location_context

Why This Tool: Graph stores user’s habitual stores with visit frequency - don’t need to store all products at all stores.

Tool Spec:

{
"name": "get_user_location_context",
"description": "Get user's primary shopping locations and venue preferences",
"input": {
"user_id": "string (required)"
},
"output": {
"primary_stores": [
{
"retailer_id": "R123",
"name": "Target",
"address": "123 Main St, Chicago IL 60601",
"venue_type": "grocery",
"visit_frequency": "weekly",
"times_visited": 47,
"last_visit": "2025-01-05",
"avg_basket_size": 45.67,
"user_preference_rank": 1
}
],
"venue_preferences": {
"most_frequent_venue": "grocery",
"venue_distribution": {
"grocery": 65,
"convenience": 20,
"pharmacy": 15
}
},
"location_hints": {
"user_zip": "60601",
"typical_shopping_radius_miles": 5.2
}
}
}

Why Not Store All Product Locations:

  • Graph explosion: 100K products × 10K stores = 1B relationships
  • Real-time data: Inventory/pricing changes minute-by-minute
  • Solution: Give LLM user’s preferred stores, then LLM calls external location/inventory API

Schema Changes:

// Enhance existing Retailer (add city-level data only)
Retailer {
retailer_id: "R123",
name: "Target",
address: "123 Main St",
city: "Chicago",
state: "IL",
zip: "60601",
venue_type: "grocery",
// Don't add: lat/lon (use geocoding service)
// Don't add: inventory (use external API)
}
// Enhance VISITED relationship
VISITED {
times: 47,
first: datetime,
last: datetime,
total_spent: 2145.67,
// NEW: Visit patterns
typical_visit_frequency_days: 7,
preferred_day_of_week: "Saturday",
avg_basket_size: 45.67
}
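
The new VISITED properties are all derivable from visit dates and total spend. A minimal sketch of that derivation follows, assuming the visit dates are available at ingestion time; summarize_visits is a hypothetical helper name, and using the mean gap as the "typical" frequency is a simplification.

```python
from collections import Counter
from datetime import date

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]


def summarize_visits(visit_dates: list[date], total_spent: float) -> dict:
    """Compute VISITED-style aggregates for one user and one retailer."""
    if not visit_dates:
        raise ValueError("no visits to summarize")
    ordered = sorted(visit_dates)
    gaps = [(later - earlier).days for earlier, later in zip(ordered, ordered[1:])]
    day_counts = Counter(WEEKDAYS[d.weekday()] for d in ordered)
    return {
        "times": len(ordered),
        "first": ordered[0].isoformat(),
        "last": ordered[-1].isoformat(),
        "total_spent": total_spent,
        # Mean gap between visits; None when there is only one visit
        "typical_visit_frequency_days": round(sum(gaps) / len(gaps)) if gaps else None,
        "preferred_day_of_week": day_counts.most_common(1)[0][0],
        "avg_basket_size": round(total_spent / len(ordered), 2),
    }


visits = [date(2024, 12, 14), date(2024, 12, 21), date(2024, 12, 28), date(2025, 1, 4)]
pattern = summarize_visits(visits, total_spent=182.68)
```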

Implementation Effort: 1 week


New Tool: discover_new_products

Why This Tool: Graph stores emerging user preferences and collaborative trends - LLM doesn’t have this temporal/social context.

Tool Spec:

{
"name": "discover_new_products",
"description": "Discover products user hasn't tried based on evolving preferences and similar users",
"input": {
"user_id": "string (required)",
"category": "string (optional)",
"limit": "number (default: 20)"
},
"output": {
"new_products": [
{
"product_id": "PROD999",
"name": "Overnight Oats",
"brand": "Quaker",
"category": "Breakfast",
"why_recommended": "Similar users in your zip code recently started buying this",
"confidence": "medium",
"similar_to_products_you_buy": ["PROD1", "PROD5"],
"trending": true,
"adoption_rate_in_area": 23.5
}
]
}
}

Why Graph:

  • Track when products enter graph (new product launches)
  • Find similar users trying new products (collaborative discovery)
  • Identify category expansion patterns (user buying from new categories)

Schema Changes:

Product {
// ... existing fields ...
// NEW: Product lifecycle
first_seen_date: date("2024-12-01"),
is_new_product: true // Products added in last 90 days
}
// NEW: Track product adoption
CREATE (u:User)-[:DISCOVERED {
date: date("2025-01-07"),
source: "recommendation|trending|search"
}]->(p:Product)
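
One point worth considering for is_new_product: a stored boolean goes stale unless a job flips it after 90 days, whereas first_seen_date never changes. The sketch below derives the flag at read time instead. This is a suggestion under that assumption, not part of the plan; the function and constant names are hypothetical.

```python
from datetime import date

NEW_PRODUCT_WINDOW_DAYS = 90  # "Products added in last 90 days"


def is_new_product(first_seen_date: date, today: date) -> bool:
    """True when the product first appeared within the last 90 days."""
    age_days = (today - first_seen_date).days
    return 0 <= age_days <= NEW_PRODUCT_WINDOW_DAYS


recent = is_new_product(date(2024, 12, 1), today=date(2025, 1, 7))
old = is_new_product(date(2024, 9, 1), today=date(2025, 1, 7))
```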

Implementation Effort: 1.5 weeks


New Tool: get_household_context

Why This Tool: Graph stores household relationships between users - critical for shared shopping.

Tool Spec:

{
"name": "get_household_context",
"description": "Get household members and shared purchase patterns",
"input": {
"user_id": "string (required)"
},
"output": {
"household": {
"household_id": "HH123",
"member_count": 4,
"members": [
{
"user_id": "U123",
"role": "primary_shopper",
"contribution_to_shopping": 70
},
{
"user_id": "U456",
"role": "secondary_shopper",
"contribution_to_shopping": 30
}
]
},
"shared_preferences": {
"common_categories": ["Dairy", "Produce", "Snacks"],
"common_brands": ["Chobani", "Horizon"],
"divergent_preferences": {
"U123_only": ["Coffee"],
"U456_only": ["Tea"]
}
},
"combined_purchase_power": {
"monthly_spending": 950.0,
"points_earned": 15000
}
}
}

Schema Changes:

// NEW: Household node
Household {
household_id: "HH123",
member_count: 4,
combined_monthly_budget: 1000.0
}
// NEW: User belongs to household
CREATE (u:User)-[:MEMBER_OF {
role: "primary_shopper",
joined_date: date("2023-01-01")
}]->(h:Household)
// NEW: Household-level preferences
CREATE (h:Household)-[:PREFERS {
strength: 0.9
}]->(p:Product)
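
The contribution_to_shopping and divergent_preferences fields in the tool output could be computed from per-member spend and per-member category sets, roughly as sketched here. Both helper names are hypothetical, and the input dictionaries stand in for whatever the graph query returns.

```python
def contribution_shares(spend_by_member: dict[str, float]) -> dict[str, int]:
    """Each member's share of household spend, as a whole percentage."""
    total = sum(spend_by_member.values())
    if total == 0:
        return {member: 0 for member in spend_by_member}
    return {member: round(spent / total * 100) for member, spent in spend_by_member.items()}


def divergent_preferences(categories_by_member: dict[str, set[str]]) -> dict[str, list[str]]:
    """Categories that only one member of the household buys."""
    result = {}
    for member, categories in categories_by_member.items():
        others = set().union(
            *(cats for other, cats in categories_by_member.items() if other != member)
        )
        result[f"{member}_only"] = sorted(categories - others)
    return result


shares = contribution_shares({"U123": 665.0, "U456": 285.0})
divergent = divergent_preferences({
    "U123": {"Dairy", "Produce", "Coffee"},
    "U456": {"Dairy", "Produce", "Tea"},
})
```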

Implementation Effort: 2 weeks


New Tool: predict_category_expansion

Why This Tool: Graph tracks category evolution over time - predict when users will try new categories.

Tool Spec:

{
"name": "predict_category_expansion",
"description": "Predict which new product categories user is likely to explore next",
"input": {
"user_id": "string (required)"
},
"output": {
"current_categories": ["Dairy", "Produce", "Snacks"],
"expansion_predictions": [
{
"category": "Organic Foods",
"confidence": "high",
"reasoning": "You've been buying more organic dairy products",
"similar_users_who_expanded": 45,
"suggested_entry_products": ["PROD888", "PROD999"]
}
],
"category_trends": {
"growing": ["Organic Foods (+30%)"],
"stable": ["Dairy", "Produce"],
"declining": ["Beverages (-5%)"]
}
}
}

Why Graph:

  • Track category adoption timeline per user
  • Find similar users’ category expansion patterns
  • Identify bridge products (connect categories)
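
A very simple way to turn "similar users' category expansion patterns" into a ranking is to count, for each category the user has not bought from, how many similar users have. The sketch below does only that; it ignores timelines and bridge products and is meant as a starting point rather than the intended model. The function name is hypothetical.

```python
from collections import Counter


def rank_expansion_candidates(
    user_categories: set[str],
    similar_users_categories: list[set[str]],
) -> list[tuple[str, int]]:
    """Rank categories the user has not bought by how many similar users buy them."""
    counts = Counter(
        category
        for categories in similar_users_categories
        for category in categories - user_categories
    )
    # Highest count first; ties broken alphabetically for stable output
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


candidates = rank_expansion_candidates(
    user_categories={"Dairy", "Produce", "Snacks"},
    similar_users_categories=[
        {"Dairy", "Organic Foods"},
        {"Produce", "Organic Foods", "Pet"},
        {"Snacks", "Organic Foods"},
    ],
)
```

The count per category corresponds to the similar_users_who_expanded field in the spec.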

Implementation Effort: 2 weeks


New Tool: get_community_insights

Why This Tool: Graph stores collaborative filtering data - “users like you” patterns.

Tool Spec:

{
"name": "get_community_insights",
"description": "Get insights from similar users in your area",
"input": {
"user_id": "string (required)",
"category": "string (optional)"
},
"output": {
"similar_users_count": 234,
"community_trends": [
{
"product_id": "PROD555",
"name": "Oat Milk",
"adoption_rate": 45.2,
"growth_rate": "+67% this quarter",
"user_has_tried": false
}
],
"emerging_brands": [
{
"brand": "Oatly",
"product_count": 5,
"users_trying": 89,
"growth": "+120%"
}
]
}
}

Why Graph:

  • Find similar users via collaborative filtering (SIMILAR_TO relationship)
  • Track community adoption of new products/brands
  • Geo-filtered trends (same zip/city)

Schema Changes:

// NEW: User similarity (collaborative filtering)
CREATE (u1:User)-[:SIMILAR_TO {
similarity_score: 0.87,
common_products: 45,
computed_date: date("2025-01-07")
}]->(u2:User)
// Pre-compute periodically, don't compute on every query
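
The plan does not say how similarity_score is computed. As one possibility, the batch job could use Jaccard similarity over purchased product sets and keep only edges above a threshold, which also bounds the number of SIMILAR_TO relationships. In the sketch below, Jaccard, the 0.5 threshold, and both function names are assumptions.

```python
def jaccard(a: set[str], b: set[str]) -> float:
    """Size of the intersection divided by size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similar_to_edges(baskets: dict[str, set[str]], min_score: float = 0.5) -> list[dict]:
    """Build SIMILAR_TO edge payloads for user pairs at or above min_score."""
    edges = []
    users = sorted(baskets)
    for i, u1 in enumerate(users):
        for u2 in users[i + 1:]:
            score = jaccard(baskets[u1], baskets[u2])
            if score >= min_score:
                edges.append({
                    "from": u1,
                    "to": u2,
                    "similarity_score": round(score, 2),
                    "common_products": len(baskets[u1] & baskets[u2]),
                })
    return edges


edges = similar_to_edges({
    "U1": {"P1", "P2", "P3"},
    "U2": {"P2", "P3", "P4"},
    "U3": {"P9"},
})
```

The pairwise loop is quadratic in the number of users, so a real job would likely restrict comparisons to users who share a zip code or at least one product.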

Implementation Effort: 2 weeks


Phase 2 tools:

  1. get_product_details_enhanced - Rich product metadata with user context (1 week)
  2. track_spending_patterns - Spending analytics by category/brand/store (2 weeks)
  3. optimize_offer_activation - Offer stacking optimization (2 weeks)
  4. get_user_location_context - Primary stores without graph explosion (1 week)
  5. discover_new_products - Emerging product discovery (1.5 weeks)
  6. get_household_context - Household members and shared patterns (2 weeks)
  7. predict_category_expansion - Category growth predictions (2 weeks)
  8. get_community_insights - Similar users and community trends (2 weeks)

Schema changes:

  • Enhance Product with attributes/tags and lifecycle tracking
  • Add spending data to PURCHASED relationship
  • Add offer stacking rules and relationships
  • Add visit patterns to VISITED relationship
  • Add Household nodes and MEMBER_OF relationships
  • Add User similarity relationships (SIMILAR_TO, pre-computed)
  • Add category adoption timestamps

Handled by the LLM and external APIs (no graph tool):

  • Store location finding → LLM calls geocoding API with user_zip
  • Price comparison → LLM calls pricing API with product_id + store_ids
  • Inventory checking → LLM calls inventory API
  • Recipe generation → LLM generates from product list
  • Product substitution → LLM reasons about similarity + user context

Integration Strategy: Graph + External APIs

User Query
     ↓
LLM Agent
     ↓
┌─────────────────────┬──────────────────────┐
│ Graph Tools         │ External APIs        │
│ (via MCP)           │ (via LLM)            │
├─────────────────────┼──────────────────────┤
│ User patterns       │ Store locations      │
│ Product context     │ Real-time pricing    │
│ Offer eligibility   │ Inventory status     │
│ Purchase history    │ Recipe generation    │
│ Spending trends     │ Substitution logic   │
│ Collaborative data  │ Nutrition lookup     │
└─────────────────────┴──────────────────────┘
     ↓
Combined Response

Example: “Where can I buy Greek yogurt near me?”


LLM Orchestration:

1. LLM calls: get_user_location_context(user_id)
→ Returns: Primary stores = ["Target Main St", "Jewel Oak St"]
2. LLM calls: get_product_details_enhanced(user_id, product_id)
→ Returns: User buys "Chobani Greek Yogurt" regularly; applicable offer "Dairy Bonus" (500 points)
3. LLM uses function calling to: location_api.find_nearby_stores(
zip="60601",
store_names=["Target", "Jewel"],
max_distance=5
)
→ Returns: Store addresses with lat/lon
4. LLM uses function calling to: inventory_api.check_product_availability(
product="Chobani Greek Yogurt",
stores=[store_ids]
)
→ Returns: Stock status and prices
5. LLM synthesizes:
"Based on your shopping history, you usually buy Chobani Greek Yogurt.
Here are your nearby stores:
- Target (2.3 mi): In stock, $4.99
- Jewel (3.1 mi): In stock, $4.49 (cheapest!)
You also have a 500-point offer on dairy purchases this week."
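
The same flow can be written as code to make the data hand-offs explicit. In this sketch the two graph tools and the two external API calls are replaced by stub functions returning canned data; the stub bodies, the store IDs, and the where_to_buy wrapper are all hypothetical, and the zip parameter is named zip_code to avoid shadowing Python's built-in.

```python
# Hypothetical stand-ins for the two graph tools and the two external APIs.
def get_user_location_context(user_id: str) -> dict:
    return {
        "primary_stores": [{"name": "Target"}, {"name": "Jewel"}],
        "location_hints": {"user_zip": "60601"},
    }


def get_product_details_enhanced(user_id: str, product_id: str) -> dict:
    return {
        "product": {"name": "Greek Yogurt", "brand": "Chobani"},
        "applicable_offers": [{"offer_id": "O123", "title": "Dairy Bonus", "points": 500}],
    }


def find_nearby_stores(zip_code: str, store_names: list[str], max_distance: float) -> list[dict]:
    return [
        {"store_id": "S1", "name": "Target", "miles": 2.3},
        {"store_id": "S2", "name": "Jewel", "miles": 3.1},
    ]


def check_product_availability(product: str, stores: list[str]) -> dict:
    return {"S1": {"in_stock": True, "price": 4.99}, "S2": {"in_stock": True, "price": 4.49}}


def where_to_buy(user_id: str, product_id: str) -> dict:
    location = get_user_location_context(user_id)                # step 1: graph
    details = get_product_details_enhanced(user_id, product_id)  # step 2: graph
    stores = find_nearby_stores(                                  # step 3: external API
        zip_code=location["location_hints"]["user_zip"],
        store_names=[store["name"] for store in location["primary_stores"]],
        max_distance=5,
    )
    product_name = f'{details["product"]["brand"]} {details["product"]["name"]}'
    stock = check_product_availability(                           # step 4: external API
        product_name, [store["store_id"] for store in stores]
    )
    options = [
        {**store, **stock[store["store_id"]]}
        for store in stores
        if stock[store["store_id"]]["in_stock"]
    ]
    options.sort(key=lambda option: option["price"])  # cheapest first
    return {"product": product_name, "options": options, "offers": details["applicable_offers"]}


answer = where_to_buy("U123", "P123")
```

The returned dictionary is what the LLM would turn into the final natural-language answer in step 5.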

Key Insight: Graph provides user context, external APIs provide real-time data.


Phase 2: Enhanced Context & Social (10 weeks)

  • 8 new graph tools
  • Enhanced product metadata
  • Spending analytics
  • Offer optimization
  • Household features
  • Community insights
  • Category predictions
  • Cost: $65-85K (2-3 engineers)

Technical targets:

  • Graph size: <5M nodes, <50M relationships
  • Query latency: P95 <500ms
  • Data freshness: User purchase data updated daily
  • Pruning: Archive purchases >2 years old

Adoption targets:

  • Spending insights: 50% of users engage
  • Offer optimization: 35% adoption, +20% points earned
  • Household features: 25% of users are in households
  • Community insights: 40% view trending products
  • DON’T: Store Product×Retailer×Price (1B+ relationships)
  • DO: Store user’s primary retailers (50-100 per user)
  • DO: Aggregate spending, not individual transactions
  • DO: Prune old data (>2 years)
  • DON’T: Try to keep pricing/inventory in graph
  • DO: Use external APIs for volatile data
  • DO: Cache API responses (15-30 min TTL)
  • Graph: Relationships, patterns, user context
  • LLM: Reasoning, substitution, recipe generation
  • External APIs: Real-time pricing, inventory, locations
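
For the "cache API responses (15-30 min TTL)" guideline, a small time-based cache in front of the pricing and inventory calls is one option. The sketch below is a minimal in-process version with an injectable clock so expiry can be tested; the TTLCache class is hypothetical, and a shared cache would be needed if several workers serve requests.

```python
import time


class TTLCache:
    """Cache values for ttl_seconds; refetch after they expire."""

    def __init__(self, ttl_seconds: float, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # key -> (stored_at, value)

    def get_or_fetch(self, key, fetch):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]  # still fresh
        value = fetch()
        self._entries[key] = (now, value)
        return value


# Demonstration with a fake clock and a counted fetch function
fake_now = [0.0]
fetch_calls = []


def fetch_price():
    fetch_calls.append(1)
    return 4.49


cache = TTLCache(ttl_seconds=15 * 60, clock=lambda: fake_now[0])
first = cache.get_or_fetch(("PROD1", "S2"), fetch_price)   # miss: calls the API
second = cache.get_or_fetch(("PROD1", "S2"), fetch_price)  # hit: served from cache
fake_now[0] = 16 * 60                                      # 16 minutes later
third = cache.get_or_fetch(("PROD1", "S2"), fetch_price)   # expired: calls the API again
```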

  1. Review and prioritize Phase 2 tools by business value
  2. Validate external API availability (pricing, inventory, geocoding)
  3. Prototype get_product_details_enhanced (1 week POC)
  4. Design data ingestion pipeline for spending data
  5. Begin Phase 2 Week 1 implementation

✅ Phase 2 Graph-Backed Tools (Build These - 8 tools)

  • get_product_details_enhanced
  • track_spending_patterns
  • optimize_offer_activation
  • get_user_location_context
  • discover_new_products
  • get_household_context
  • predict_category_expansion
  • get_community_insights

🤖 LLM-Native Capabilities (Don’t Build Tools)

  • Store location finding (geocoding API + LLM reasoning)
  • Price comparison (pricing API + LLM comparison)
  • Recipe generation (LLM generates from ingredients)
  • Product substitution (LLM reasons about similarity)
  • Inventory checking (inventory API)
  • Nutrition analysis (nutrition API + LLM interpretation)

🔮 Future Consideration (Not in Phase 2)

  • track_offer_performance - Offer activation history tracking
  • optimize_multi_user_shopping - Multi-user household optimization