Phase 2 Tool Development Plan (Revised)
Executive Summary
Key Principle: Focus on graph-backed features that require traversal, relationships, and temporal patterns. The LLM can handle location finding, price comparison (via external APIs), recipe generation, and substitution reasoning without a graph tool.
Phase 1 Complete: 20 graph tools + 2 Points tools implemented.
Phase 2 Scope: 8 new tools focusing on graph-unique capabilities while maintaining a small, efficient graph.
Design Principles
✅ What Belongs in the Graph
- User behavior patterns - purchase history, preferences, temporal patterns
- Product relationships - similarity, category hierarchies, brand relationships
- User-product affinities - what users buy together, collaborative filtering
- Offer eligibility - personalized, time-bound relationships
- Retailer visit patterns - where users shop, frequency, spending
❌ What Should Stay External
- Real-time pricing - Use external pricing API, too volatile for graph
- Stock/inventory - Use external inventory API, changes too frequently
- Store locations - Use geocoding service, not graph traversal
- Recipe instructions - LLM can generate from product lists
- Product substitutions - LLM can reason about similarity
🎯 Graph Size Strategy
- Keep graph to users, products, offers, retailers (core entities)
- Avoid cartesian explosion (Product × Retailer × Price × Time)
- Use aggregates and patterns, not granular transactions
- Prune old data (>2 years) periodically
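The pruning rule above can be sketched in a few lines. This is only an illustration: the function name `prune_timestamps` and the 730-day cutoff (as an approximation of "2 years") are assumptions, not part of the plan.

```python
from datetime import datetime, timedelta

def prune_timestamps(timestamps, now, max_age_days=730):
    """Keep only purchase timestamps newer than the cutoff (roughly 2 years)."""
    cutoff = now - timedelta(days=max_age_days)
    return [ts for ts in timestamps if ts >= cutoff]

now = datetime(2025, 1, 7)
history = [datetime(2022, 6, 1), datetime(2023, 3, 15), datetime(2024, 12, 20)]
recent = prune_timestamps(history, now)
print(recent)  # the 2022 purchase falls outside the window and is dropped
```

In a real deployment the dropped entries would presumably be archived rather than discarded, as the Success Metrics section suggests.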
Revised Gap Analysis
🔴 Gap 1: Product Details & Attributes
What’s Missing: Comprehensive product metadata beyond name/brand/category
Why Graph, Not LLM:
- User’s historical relationship with products (how often, when, spending)
- Product attributes that enable filtering (organic, gluten-free, sustainable)
- Dynamic attributes based on user context (previously_purchased, in_usual_rotation)
Solution: Enhance Product node with rich attributes
🔴 Gap 2: Location-Aware User Context
What’s Missing: “What stores are near me?” without querying every time
Why Graph, Not External API:
- User’s habitual stores are already in graph (VISITED relationship)
- Historical store preferences inform recommendations
- Store-offer associations require graph traversal
Solution: Pre-compute each user’s “primary stores” cluster instead of storing every product×store combination
🔴 Gap 3: Spending & Budget Analytics
What’s Missing: Track spending patterns over time
Why Graph, Not External:
- Temporal spending patterns by category/brand/store
- Behavioral changes (spending more/less on category)
- Budget adherence tracking over time
Solution: Add spending data to PURCHASED relationship (aggregate)
🔴 Gap 4: Offer Stacking & Optimization
What’s Missing: Which offers work together, optimal activation strategy
Why Graph, Not LLM:
- Offer compatibility rules (can’t stack certain offers)
- User eligibility across multiple offers
- Product-offer mappings for user’s predicted purchases
Solution: Add offer stacking metadata, graph traversal for optimization
🟡 Gap 5: Product Lifecycle & Trends
What’s Missing: New product discovery, discontinuation alerts
Why Graph:
- Track when products enter/leave user’s purchase rotation
- Identify emerging patterns in user behavior
- Collaborative trends (similar users trying new products)
🟡 Gap 6: Household & Family Context
What’s Missing: Multi-user households, shared preferences
Why Graph:
- Household relationships between users
- Shared purchase patterns
- Preference conflicts within household
Phase 2: Enhanced Context & Intelligence (8-10 weeks)
Add rich product metadata, spending analytics, smarter offer optimization, household context, and social discovery without exploding graph size.
2.1: get_product_details_enhanced
Replaces: get_product_context (enhance existing tool)
Why This Tool: LLM needs structured, personalized product data that only graph can provide:
- User’s purchase history with this product
- How it relates to user’s preferences
- Dynamic attributes based on user context
Tool Spec:
{
  "name": "get_product_details_enhanced",
  "description": "Get comprehensive product details with user-specific context and attributes",
  "input": {
    "user_id": "string (required)",
    "product_id": "string (required)"
  },
  "output": {
    "product": {
      "product_id": "string",
      "name": "Greek Yogurt",
      "brand": "Chobani",
      "category": "Dairy",
      "attributes": {
        "organic": true,
        "gluten_free": true,
        "sustainable_packaging": false,
        "tags": ["high-protein", "probiotic", "low-sugar"]
      }
    },
    "user_context": {
      "purchased_times": 15,
      "last_purchase": "2025-01-01",
      "avg_interval_days": 14,
      "next_predicted": "2025-01-15",
      "in_usual_rotation": true,
      "spending_on_product": 74.85,
      "price_trend": "stable"
    },
    "alternatives": [
      {
        "product_id": "PROD456",
        "name": "Fage Greek Yogurt",
        "similarity_score": 0.92,
        "user_has_tried": false
      }
    ],
    "applicable_offers": [
      { "offer_id": "O123", "title": "Dairy Bonus", "points": 500 }
    ]
  }
}
Schema Changes:
Product {
  product_id: "P123",
  name: "Greek Yogurt",
  brand: "Chobani",
  category: "Dairy",

  // NEW: Attributes for filtering (not food-specific)
  attributes: {
    organic: true,
    gluten_free: true,
    vegan: false,
    sustainable: false,
    local: false,
    handmade: false,
    eco_friendly: true
  },
  tags: ["high-protein", "probiotic", "low-sugar"],

  // Keep generic - works for any product type
  product_type: "food", // or "household", "personal_care", "pet"

  // Optional food-specific (if product_type = "food")
  nutrition: { calories: 150, protein_g: 12, sugar_g: 4 }
}
Implementation Effort: 1 week (enhance existing tool)
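As a rough illustration of how the `avg_interval_days` and `next_predicted` fields in `user_context` might be derived, the sketch below averages the gaps between purchase dates. The helper name `predict_next_purchase` and the simple mean-gap heuristic are assumptions; the plan does not specify a prediction method.

```python
from datetime import date, timedelta

def predict_next_purchase(purchase_dates):
    """Return (avg_interval_days, next_predicted) from chronologically sorted dates."""
    if len(purchase_dates) < 2:
        return None, None  # not enough history to estimate an interval
    gaps = [(b - a).days for a, b in zip(purchase_dates, purchase_dates[1:])]
    avg_interval = round(sum(gaps) / len(gaps))
    return avg_interval, purchase_dates[-1] + timedelta(days=avg_interval)

dates = [date(2024, 12, 4), date(2024, 12, 18), date(2025, 1, 1)]
avg_days, next_date = predict_next_purchase(dates)
print(avg_days, next_date)
```

With a last purchase on 2025-01-01 and a 14-day average gap, this yields the same 2025-01-15 prediction shown in the tool spec.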
2.2: track_spending_patterns
New Tool
Why This Tool: Graph stores temporal purchase patterns with spending - LLM can’t do time-series analysis without this data.
Tool Spec:
{
  "name": "track_spending_patterns",
  "description": "Analyze user spending patterns by category, brand, store, and time period",
  "input": {
    "user_id": "string (required)",
    "lookback_days": "number (default: 90)",
    "group_by": "enum: category|brand|store|week|month"
  },
  "output": {
    "spending_summary": {
      "total_spent": 487.56,
      "period_start": "2024-10-08",
      "period_end": "2025-01-07",
      "breakdown": [
        {
          "group_name": "Dairy",
          "spent": 125.43,
          "percent_of_total": 25.7,
          "purchase_count": 23,
          "vs_prev_period": "+12%",
          "trend": "increasing"
        }
      ]
    },
    "insights": {
      "top_spending_category": "Dairy",
      "fastest_growing_category": "Snacks (+35%)",
      "declining_categories": ["Beverages (-10%)"]
    }
  }
}
Schema Changes:
// Add to existing PURCHASED relationship
PURCHASED {
  times: 5,
  qty: 10,
  first: datetime,
  last: datetime,
  timestamps: [datetime, ...],

  // NEW: Aggregated spending data
  total_spent: 49.95,
  avg_price_per_unit: 4.99,
  min_price: 4.49,
  max_price: 5.49
}

// Optional: Add budget tracking to User
User {
  user_id: "U123",
  zip: "60601",

  // NEW: Budget preferences (optional)
  monthly_budget: 500.0,
  budget_alerts: true
}
Implementation Effort: 2 weeks
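The aggregate fields on PURCHASED could be computed at ingestion time from individual transactions, so that only the summary reaches the graph. The sketch below assumes each transaction is a `(quantity, unit_price)` pair; that input shape, the function name, and the sample prices are illustrative only.

```python
def aggregate_purchases(transactions):
    """Collapse (quantity, unit_price) transactions into PURCHASED-style aggregates."""
    qty = sum(q for q, _ in transactions)
    total = sum(q * p for q, p in transactions)
    prices = [p for _, p in transactions]
    return {
        "times": len(transactions),
        "qty": qty,
        "total_spent": round(total, 2),
        "avg_price_per_unit": round(total / qty, 2),
        "min_price": min(prices),
        "max_price": max(prices),
    }

# Hypothetical transactions: five trips, two units each
txns = [(2, 4.49), (2, 4.99), (2, 4.99), (2, 4.99), (2, 5.49)]
agg = aggregate_purchases(txns)
print(agg)
```

Only the resulting dictionary would be written to the relationship; the raw transactions stay in the source system.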
2.3: optimize_offer_activation
New Tool
Why This Tool: Graph stores offer eligibility, stacking rules, and product applicability - without that data the LLM cannot reliably solve the underlying constraint-satisfaction problem.
Tool Spec:
{
  "name": "optimize_offer_activation",
  "description": "Find optimal offer activation strategy for maximum points based on predicted purchases",
  "input": {
    "user_id": "string (required)",
    "prediction_window_days": "number (default: 7)",
    "include_shopping_list": "array of product_ids (optional)"
  },
  "output": {
    "strategy": {
      "total_points_potential": 2500,
      "offers_to_activate": [
        {
          "offer_id": "O123",
          "title": "Dairy Bonus - 500 pts",
          "points": 500,
          "products_needed": ["PROD1", "PROD2"],
          "user_likely_to_buy": ["PROD1", "PROD2"],
          "confidence": "high",
          "stackable_with": ["O456"]
        }
      ],
      "recommended_purchases": [
        {
          "product_id": "PROD1",
          "name": "Greek Yogurt",
          "triggers_offers": ["O123", "O456"],
          "total_points": 800,
          "next_predicted_purchase": "2025-01-10"
        }
      ]
    },
    "stacking_rules": {
      "max_points_per_transaction": 5000,
      "incompatible_offer_pairs": [["O789", "O801"]]
    }
  }
}
Schema Changes:
// Add to Offer node
Offer {
  offer_id: "O123",
  title: "Dairy Bonus",
  points: 500,
  start: date,
  end: date,
  priority: 10,
  venue_type: "grocery",

  // NEW: Stacking rules
  stackable: true,
  max_stacks_per_transaction: 3,
  incompatible_offers: ["O789"] // O456 removed: it stacks with O123 below
}

// NEW: Offer compatibility relationship
// MATCH the existing nodes first; a bare CREATE would add duplicate Offer nodes
MATCH (o1:Offer {offer_id: "O123"}), (o2:Offer {offer_id: "O456"})
MERGE (o1)-[:STACKS_WITH { max_combined_points: 5000 }]->(o2)

// NEW: Offer incompatibility
MATCH (o1:Offer {offer_id: "O123"}), (o2:Offer {offer_id: "O789"})
MERGE (o1)-[:CONFLICTS_WITH]->(o2)
Implementation Effort: 2 weeks
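To make the optimization concrete, here is a minimal sketch of selecting a conflict-free set of offers that maximizes points under a per-transaction cap. It is a brute-force search, which is an assumption on my part rather than the plan's chosen algorithm, and the point values for every offer other than O123 are hypothetical.

```python
from itertools import combinations

def best_offer_set(offers, conflicts, max_points):
    """Pick the conflict-free offer subset with the most points, capped at max_points.

    offers: dict of offer_id -> points
    conflicts: set of frozenset({offer_a, offer_b}) pairs that cannot be combined
    Brute force over subsets; reasonable only for a handful of eligible offers.
    """
    best, best_points = (), 0
    ids = sorted(offers)
    for size in range(1, len(ids) + 1):
        for combo in combinations(ids, size):
            if any(frozenset(pair) in conflicts for pair in combinations(combo, 2)):
                continue  # skip sets containing an incompatible pair
            points = min(sum(offers[o] for o in combo), max_points)
            if points > best_points:
                best, best_points = combo, points
    return list(best), best_points

offers = {"O123": 500, "O456": 300, "O789": 400, "O801": 600}  # mostly hypothetical values
conflicts = {frozenset(("O789", "O801"))}
chosen, total = best_offer_set(offers, conflicts, max_points=5000)
print(chosen, total)
```

In the graph, the `conflicts` set would come from CONFLICTS_WITH edges and the cap from the stacking rules; a production version would likely need something smarter than exhaustive search once the number of eligible offers grows.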
2.4: get_user_location_context
New Tool
Why This Tool: Graph stores the user’s habitual stores with visit frequency, so there is no need to store every product at every store.
Tool Spec:
{
  "name": "get_user_location_context",
  "description": "Get user's primary shopping locations and venue preferences",
  "input": {
    "user_id": "string (required)"
  },
  "output": {
    "primary_stores": [
      {
        "retailer_id": "R123",
        "name": "Target",
        "address": "123 Main St, Chicago IL 60601",
        "venue_type": "grocery",
        "visit_frequency": "weekly",
        "times_visited": 47,
        "last_visit": "2025-01-05",
        "avg_basket_size": 45.67,
        "user_preference_rank": 1
      }
    ],
    "venue_preferences": {
      "most_frequent_venue": "grocery",
      "venue_distribution": { "grocery": 65, "convenience": 20, "pharmacy": 15 }
    },
    "location_hints": {
      "user_zip": "60601",
      "typical_shopping_radius_miles": 5.2
    }
  }
}
Why Not Store All Product Locations:
- Graph explosion: 100K products × 10K stores = 1B relationships
- Real-time data: Inventory/pricing changes minute-by-minute
- Solution: Give LLM user’s preferred stores, then LLM calls external location/inventory API
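The scale argument can be checked with quick arithmetic. The product and store counts come from the list above, and the 50M figure is the relationship target from the Success Metrics section.

```python
PRODUCTS = 100_000
STORES = 10_000
RELATIONSHIP_BUDGET = 50_000_000  # Success Metrics target: <50M relationships

full_mapping = PRODUCTS * STORES
print(f"{full_mapping:,} product-store relationships")
print(f"{full_mapping // RELATIONSHIP_BUDGET}x the relationship budget")
```

A full product-by-store mapping alone would exceed the whole graph's relationship target by a factor of 20, before adding price or time dimensions.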
Schema Changes:
// Enhance existing Retailer (add city-level data only)
Retailer {
  retailer_id: "R123",
  name: "Target",
  address: "123 Main St",
  city: "Chicago",
  state: "IL",
  zip: "60601",
  venue_type: "grocery",

  // Don't add: lat/lon (use geocoding service)
  // Don't add: inventory (use external API)
}

// Enhance VISITED relationship
VISITED {
  times: 47,
  first: datetime,
  last: datetime,
  total_spent: 2145.67,

  // NEW: Visit patterns
  typical_visit_frequency_days: 7,
  preferred_day_of_week: "Saturday",
  avg_basket_size: 45.67
}
Implementation Effort: 1 week
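One plausible way to derive `primary_stores` and `user_preference_rank` from VISITED data is to rank retailers by visit count. The sketch below assumes that ranking rule and a hypothetical `rank_primary_stores` helper; the plan itself does not define how rank is computed.

```python
def rank_primary_stores(visits, top_n=3):
    """Return the top_n most-visited retailers with a 1-based user_preference_rank."""
    ranked = sorted(visits, key=lambda v: v["times_visited"], reverse=True)
    return [
        {**v, "user_preference_rank": rank}
        for rank, v in enumerate(ranked[:top_n], start=1)
    ]

visits = [
    {"retailer_id": "R456", "times_visited": 12},  # hypothetical retailer
    {"retailer_id": "R123", "times_visited": 47},
    {"retailer_id": "R789", "times_visited": 3},   # hypothetical retailer
]
primary = rank_primary_stores(visits, top_n=2)
print(primary)
```

Recency or spend could be folded into the sort key if visit count alone turns out to be too coarse.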
2.5: discover_new_products
New Tool
Why This Tool: Graph stores emerging user preferences and collaborative trends - LLM doesn’t have this temporal/social context.
Tool Spec:
{
  "name": "discover_new_products",
  "description": "Discover products user hasn't tried based on evolving preferences and similar users",
  "input": {
    "user_id": "string (required)",
    "category": "string (optional)",
    "limit": "number (default: 20)"
  },
  "output": {
    "new_products": [
      {
        "product_id": "PROD999",
        "name": "Overnight Oats",
        "brand": "Quaker",
        "category": "Breakfast",
        "why_recommended": "Similar users in your zip code recently started buying this",
        "confidence": "medium",
        "similar_to_products_you_buy": ["PROD1", "PROD5"],
        "trending": true,
        "adoption_rate_in_area": 23.5
      }
    ]
  }
}
Why Graph:
- Track when products enter graph (new product launches)
- Find similar users trying new products (collaborative discovery)
- Identify category expansion patterns (user buying from new categories)
Schema Changes:
Product {
  // ... existing fields ...

  // NEW: Product lifecycle
  first_seen_date: date("2024-12-01"),
  is_new_product: true // Products added in last 90 days
}

// NEW: Track product adoption
CREATE (u:User)-[:DISCOVERED {
  date: date("2025-01-07"),
  source: "recommendation|trending|search"
}]->(p:Product)
Implementation Effort: 1.5 weeks
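A stored `is_new_product` flag goes stale once a product passes the 90-day mark, so one option is to derive it from `first_seen_date` at query time. The helper below is a sketch of that idea, not a decision the plan has made.

```python
from datetime import date

def is_new_product(first_seen, today, window_days=90):
    """True if the product entered the graph within the last window_days."""
    return (today - first_seen).days <= window_days

print(is_new_product(date(2024, 12, 1), date(2025, 1, 7)))  # 37 days old
print(is_new_product(date(2024, 12, 1), date(2025, 4, 1)))  # 121 days old
```

If the boolean is kept on the node for query speed, a periodic job would need to reset it using the same rule.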
2.6: get_household_context
New Tool
Why This Tool: Graph stores household relationships between users - critical for shared shopping.
Tool Spec:
{
  "name": "get_household_context",
  "description": "Get household members and shared purchase patterns",
  "input": {
    "user_id": "string (required)"
  },
  "output": {
    "household": {
      "household_id": "HH123",
      "member_count": 4,
      "members": [
        { "user_id": "U123", "role": "primary_shopper", "contribution_to_shopping": 70 },
        { "user_id": "U456", "role": "secondary_shopper", "contribution_to_shopping": 30 }
      ]
    },
    "shared_preferences": {
      "common_categories": ["Dairy", "Produce", "Snacks"],
      "common_brands": ["Chobani", "Horizon"],
      "divergent_preferences": {
        "U123_only": ["Coffee"],
        "U456_only": ["Tea"]
      }
    },
    "combined_purchase_power": {
      "monthly_spending": 950.0,
      "points_earned": 15000
    }
  }
}
Schema Changes:
// NEW: Household node
Household {
  household_id: "HH123",
  member_count: 4,
  combined_monthly_budget: 1000.0
}

// NEW: User belongs to household
CREATE (u:User)-[:MEMBER_OF {
  role: "primary_shopper",
  joined_date: date("2023-01-01")
}]->(h:Household)

// NEW: Household-level preferences
CREATE (h:Household)-[:PREFERS {
  strength: 0.9
}]->(p:Product)
Implementation Effort: 2 weeks
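The `contribution_to_shopping` percentages in the tool spec could be derived from each member's share of household spend. The sketch below assumes that definition and uses hypothetical per-member amounts chosen to add up to the 950.0 monthly spending in the example output.

```python
def contribution_shares(spend_by_user):
    """Return each member's share of household spend as a whole-number percent."""
    total = sum(spend_by_user.values())
    if total == 0:
        return {user: 0 for user in spend_by_user}
    return {user: round(100 * spent / total) for user, spent in spend_by_user.items()}

# Hypothetical split of the 950.0 monthly household spend
shares = contribution_shares({"U123": 665.0, "U456": 285.0})
print(shares)
```

Members with no purchases of their own, such as children, would simply not appear in the input and so receive no share.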
2.7: predict_category_expansion
New Tool
Why This Tool: Graph tracks category evolution over time - predict when users will try new categories.
Tool Spec:
{
  "name": "predict_category_expansion",
  "description": "Predict which new product categories user is likely to explore next",
  "input": {
    "user_id": "string (required)"
  },
  "output": {
    "current_categories": ["Dairy", "Produce", "Snacks"],
    "expansion_predictions": [
      {
        "category": "Organic Foods",
        "confidence": "high",
        "reasoning": "You've been buying more organic dairy products",
        "similar_users_who_expanded": 45,
        "suggested_entry_products": ["PROD888", "PROD999"]
      }
    ],
    "category_trends": {
      "growing": ["Organic Foods (+30%)"],
      "stable": ["Dairy", "Produce"],
      "declining": ["Beverages (-5%)"]
    }
  }
}
Why Graph:
- Track category adoption timeline per user
- Find similar users’ category expansion patterns
- Identify bridge products (connect categories)
Implementation Effort: 2 weeks
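The `category_trends` buckets could be produced by comparing spend across two periods. The sketch below is one simple possibility; the 3% threshold and the sample spend figures are assumptions picked so the results line up with the example output.

```python
def classify_trend(previous, current, threshold_pct=3.0):
    """Label a category by percent change in spend between two periods."""
    if previous == 0:
        return "growing" if current > 0 else "stable"
    change = 100.0 * (current - previous) / previous
    if change > threshold_pct:
        return "growing"
    if change < -threshold_pct:
        return "declining"
    return "stable"

# Hypothetical (previous, current) spend per category
spend = {
    "Organic Foods": (100.0, 130.0),
    "Dairy": (200.0, 202.0),
    "Beverages": (80.0, 76.0),
}
trends = {name: classify_trend(prev, cur) for name, (prev, cur) in spend.items()}
print(trends)
```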
2.8: get_community_insights
New Tool
Why This Tool: Graph stores collaborative filtering data - “users like you” patterns.
Tool Spec:
{
  "name": "get_community_insights",
  "description": "Get insights from similar users in your area",
  "input": {
    "user_id": "string (required)",
    "category": "string (optional)"
  },
  "output": {
    "similar_users_count": 234,
    "community_trends": [
      {
        "product_id": "PROD555",
        "name": "Oat Milk",
        "adoption_rate": 45.2,
        "growth_rate": "+67% this quarter",
        "user_has_tried": false
      }
    ],
    "emerging_brands": [
      { "brand": "Oatly", "product_count": 5, "users_trying": 89, "growth": "+120%" }
    ]
  }
}
Why Graph:
- Find similar users via collaborative filtering (SIMILAR_TO relationship)
- Track community adoption of new products/brands
- Geo-filtered trends (same zip/city)
Schema Changes:
// NEW: User similarity (collaborative filtering)
CREATE (u1:User)-[:SIMILAR_TO {
  similarity_score: 0.87,
  common_products: 45,
  computed_date: date("2025-01-07")
}]->(u2:User)

// Pre-compute periodically, don't compute on every query
Implementation Effort: 2 weeks
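The plan does not say how `similarity_score` is computed. As one possibility, a periodic batch job could use Jaccard similarity over the sets of products each user has bought and emit only pairs above a threshold. Both the metric and the 0.5 threshold below are assumptions.

```python
def jaccard(a, b):
    """Jaccard similarity of two product-id collections."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def similar_pairs(baskets, threshold=0.5):
    """Yield (u1, u2, score, common_products) for user pairs at or above threshold."""
    users = sorted(baskets)
    pairs = []
    for i, u1 in enumerate(users):
        for u2 in users[i + 1:]:
            score = jaccard(baskets[u1], baskets[u2])
            if score >= threshold:
                common = len(set(baskets[u1]) & set(baskets[u2]))
                pairs.append((u1, u2, round(score, 2), common))
    return pairs

baskets = {
    "U1": ["P1", "P2", "P3", "P4"],
    "U2": ["P2", "P3", "P4", "P5"],
    "U3": ["P9"],
}
print(similar_pairs(baskets))
```

Each emitted tuple maps onto one SIMILAR_TO edge. The all-pairs loop is quadratic in users, so a real job would likely restrict candidates first, for example to users in the same zip or city.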
Phase 2 Summary
Tools to Build (8)
- get_product_details_enhanced - Rich product metadata with user context (1 week)
- track_spending_patterns - Spending analytics by category/brand/store (2 weeks)
- optimize_offer_activation - Offer stacking optimization (2 weeks)
- get_user_location_context - Primary stores without graph explosion (1 week)
- discover_new_products - Emerging product discovery (1.5 weeks)
- get_household_context - Household members and shared patterns (2 weeks)
- predict_category_expansion - Category growth predictions (2 weeks)
- get_community_insights - Similar users and community trends (2 weeks)
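Summing the per-tool estimates gives a quick sanity check against the overall timeline. The sketch assumes each estimate is in engineer-weeks, which the plan does not state explicitly.

```python
effort_weeks = {
    "get_product_details_enhanced": 1,
    "track_spending_patterns": 2,
    "optimize_offer_activation": 2,
    "get_user_location_context": 1,
    "discover_new_products": 1.5,
    "get_household_context": 2,
    "predict_category_expansion": 2,
    "get_community_insights": 2,
}
total = sum(effort_weeks.values())
print(f"Total effort: {total} engineer-weeks")
for engineers in (2, 3):
    print(f"{engineers} engineers: {total / engineers} weeks if perfectly parallel")
```

Under that assumption the tools total 13.5 engineer-weeks, so a team of 2-3 engineers would have room inside a 10-week timeline for schema migration, ingestion work, and testing.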
Schema Changes
- Enhance Product with attributes/tags and lifecycle tracking
- Add spending data to PURCHASED relationship
- Add offer stacking rules and relationships
- Add visit patterns to VISITED relationship
- Add Household nodes and MEMBER_OF relationships
- Add User similarity relationships (SIMILAR_TO, pre-computed)
- Add category adoption timestamps
What LLM Handles (No Tool Needed)
- Store location finding → LLM calls geocoding API with user_zip
- Price comparison → LLM calls pricing API with product_id + store_ids
- Inventory checking → LLM calls inventory API
- Recipe generation → LLM generates from product list
- Product substitution → LLM reasons about similarity + user context
Timeline: 10 weeks
Cost: $65-85K (2-3 engineers)
Integration Strategy: Graph + External APIs
Architecture Pattern
User Query
    ↓
LLM Agent
    ↓
┌─────────────────────┬──────────────────────┐
│ Graph Tools         │ External APIs        │
│ (via MCP)           │ (via LLM)            │
├─────────────────────┼──────────────────────┤
│ User patterns       │ Store locations      │
│ Product context     │ Real-time pricing    │
│ Offer eligibility   │ Inventory status     │
│ Purchase history    │ Recipe generation    │
│ Spending trends     │ Substitution logic   │
│ Collaborative data  │ Nutrition lookup     │
└─────────────────────┴──────────────────────┘
    ↓
Combined Response
Example: “Where can I buy Greek yogurt near me?”
LLM Orchestration:
1. LLM calls: get_user_location_context(user_id)
   → Returns: Primary stores = ["Target Main St", "Jewel Oak St"]

2. LLM calls: get_product_details_enhanced(user_id, product_id)
   → Returns: User buys "Chobani Greek Yogurt" regularly

3. LLM uses function calling to:
   location_api.find_nearby_stores(
     zip="60601",
     store_names=["Target", "Jewel"],
     max_distance=5
   )
   → Returns: Store addresses with lat/lon

4. LLM uses function calling to:
   inventory_api.check_product_availability(
     product="Chobani Greek Yogurt",
     stores=[store_ids]
   )
   → Returns: Stock status and prices

5. LLM synthesizes:
   "Based on your shopping history, you usually buy Chobani Greek Yogurt.
   Here are your nearby stores:
   - Target (2.3 mi): In stock, $4.99
   - Jewel (3.1 mi): In stock, $4.49 (cheapest!)

   You also have a 500-point offer on dairy purchases this week."
Key Insight: Graph provides user context, external APIs provide real-time data.
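The five orchestration steps can be sketched end to end with stubbed services. Every function below is a stand-in returning canned data from the example; the real graph tools and the location and inventory APIs would have their own signatures and payloads.

```python
# Stubs standing in for the graph tools (hypothetical return shapes)
def get_user_location_context(user_id):
    return {"primary_stores": [{"name": "Target"}, {"name": "Jewel"}], "user_zip": "60601"}

def get_product_details_enhanced(user_id, product_id):
    return {"name": "Chobani Greek Yogurt"}

# Stubs standing in for the external APIs (hypothetical)
def find_nearby_stores(zip_code, store_names, max_distance):
    distances = {"Target": 2.3, "Jewel": 3.1}
    return [{"name": n, "miles": distances[n]} for n in store_names if distances[n] <= max_distance]

def check_product_availability(product, stores):
    prices = {"Target": 4.99, "Jewel": 4.49}
    return [{**s, "in_stock": True, "price": prices[s["name"]]} for s in stores]

def answer_where_to_buy(user_id, product_id):
    """Graph context first, then real-time lookups, then a combined answer."""
    ctx = get_user_location_context(user_id)
    product = get_product_details_enhanced(user_id, product_id)
    names = [s["name"] for s in ctx["primary_stores"]]
    nearby = find_nearby_stores(ctx["user_zip"], names, max_distance=5)
    stock = check_product_availability(product["name"], nearby)
    cheapest = min((s for s in stock if s["in_stock"]), key=lambda s: s["price"])
    return {"product": product["name"], "stores": stock, "cheapest": cheapest["name"]}

result = answer_where_to_buy("U123", "P123")
print(result["cheapest"])
```

The ordering matters: the graph calls narrow the search to a couple of stores, so the external APIs are queried for a handful of locations rather than every retailer in range.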
Total Investment
Phase 2: Enhanced Context & Social (10 weeks)
- 8 new graph tools
- Enhanced product metadata
- Spending analytics
- Offer optimization
- Household features
- Community insights
- Category predictions
- Cost: $65-85K (2-3 engineers)
Total: 8 new tools, 10 weeks, $65-85K
Success Metrics
Graph Efficiency
- Graph size: <5M nodes, <50M relationships
- Query latency: P95 <500ms
- Data freshness: User purchase data updated daily
- Pruning: Archive purchases >2 years old
Feature Adoption
- Spending insights: 50% of users engage
- Offer optimization: 35% adoption, +20% points earned
- Household features: 25% of users are in households
- Community insights: 40% view trending products
Risk Mitigation
Graph Explosion Risk
- ❌ DON’T: Store Product×Retailer×Price (1B+ relationships)
- ✅ DO: Store user’s primary retailers (50-100 per user)
- ✅ DO: Aggregate spending, not individual transactions
- ✅ DO: Prune old data (>2 years)
Real-Time Data Risk
- ❌ DON’T: Try to keep pricing/inventory in graph
- ✅ DO: Use external APIs for volatile data
- ✅ DO: Cache API responses (15-30 min TTL)
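A small time-to-live cache is enough to illustrate the caching recommendation. The class below is a sketch with an injectable clock so expiry can be demonstrated without waiting; the class name and the 15-minute TTL (the lower end of the suggested range) are illustrative choices.

```python
import time

class TTLCache:
    """Cache values for ttl_seconds, refetching once an entry has expired."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._store = {}

    def get(self, key, fetch):
        now = self.clock()
        hit = self._store.get(key)
        if hit is not None and now - hit[0] < self.ttl:
            return hit[1]  # still fresh
        value = fetch()
        self._store[key] = (now, value)
        return value

calls = []
def fetch_price():
    calls.append(1)  # count trips to the (stubbed) pricing API
    return 4.99

fake_now = [0.0]
cache = TTLCache(ttl_seconds=15 * 60, clock=lambda: fake_now[0])
cache.get(("PROD1", "R123"), fetch_price)  # miss: calls the API
cache.get(("PROD1", "R123"), fetch_price)  # hit: served from cache
fake_now[0] = 16 * 60                      # 16 minutes later
cache.get(("PROD1", "R123"), fetch_price)  # expired: calls the API again
print(len(calls))
```

Keying on the product and retailer pair keeps cached prices from leaking across stores.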
LLM vs Graph Boundary Risk
- ✅ Graph: Relationships, patterns, user context
- ✅ LLM: Reasoning, substitution, recipe generation
- ✅ External APIs: Real-time pricing, inventory, locations
Next Steps
- Review and prioritize Phase 2 tools by business value
- Validate external API availability (pricing, inventory, geocoding)
- Prototype get_product_details_enhanced (1 week POC)
- Design data ingestion pipeline for spending data
- Begin Phase 2 Week 1 implementation
Appendix: Tool Classification
✅ Phase 2 Graph-Backed Tools (Build These - 8 tools)
- get_product_details_enhanced
- track_spending_patterns
- optimize_offer_activation
- get_user_location_context
- discover_new_products
- get_household_context
- predict_category_expansion
- get_community_insights
🤖 LLM-Native Capabilities (Don’t Build Tools)
- Store location finding (geocoding API + LLM reasoning)
- Price comparison (pricing API + LLM comparison)
- Recipe generation (LLM generates from ingredients)
- Product substitution (LLM reasons about similarity)
- Inventory checking (inventory API)
- Nutrition analysis (nutrition API + LLM interpretation)
🔮 Future Consideration (Not in Phase 2)
- track_offer_performance - Offer activation history tracking
- optimize_multi_user_shopping - Multi-user household optimization