Graph Schema Reference

This document describes the Neo4j graph database schema powering the Consumer Graph MCP.

User

Represents a shopper with purchase history and preferences.

Properties:

  • user_id (string, unique) - User identifier
  • zip (string) - ZIP code for location context
  • created_at (datetime) - Account creation timestamp
  • profile_type (string) - Shopping profile (e.g., “Health-Conscious”, “Family-Bulk”)
  • latitude (float, optional) - Geographic coordinate
  • longitude (float, optional) - Geographic coordinate

Relationships:

  • PURCHASED → Product
  • ELIGIBLE_FOR → Offer
  • SHOPS_AT → Retailer
  • MEMBER_OF → Household
  • MEMBER_OF → Community
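
As a rough illustration of how these properties and relationships fit together, the following Cypher sketch creates a user and links it to a household. The identifiers and property values (USER_EXAMPLE_1, HH_EXAMPLE_1, the ZIP code) are hypothetical sample data, not part of the demo dataset.

```cypher
// Hypothetical sample data; IDs and values are illustrative only.
MERGE (u:User {user_id: 'USER_EXAMPLE_1'})
  ON CREATE SET u.zip = '60614',
                u.created_at = datetime(),
                u.profile_type = 'Health-Conscious'
// Create the household if it does not exist yet, then link the user to it.
MERGE (h:Household {household_id: 'HH_EXAMPLE_1'})
MERGE (u)-[:MEMBER_OF]->(h)
RETURN u.user_id, h.household_id
```

MERGE is used instead of CREATE so that re-running the statement does not duplicate the node or the relationship.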

Product

Represents a purchasable product.

Properties:

  • product_id (string, unique) - Product identifier
  • upc (string) - Universal Product Code
  • name (string) - Product name
  • brand (string) - Brand name
  • category (string) - Product category
  • organic (boolean) - Organic status
  • vegan (boolean, optional) - Vegan status
  • trending (boolean) - Trending flag
  • tags (array of strings) - Descriptive tags

Relationships:

  • IN_CATEGORY → Category
  • SIMILAR_TO → Product (with similarity_score property)
  • HAS_OFFER ← Offer

Offer

Represents a promotional offer for points/rewards.

Properties:

  • offer_id (string, unique) - Offer identifier
  • points (integer) - Point value
  • description (string) - Offer description
  • start (date) - Start date
  • end (date) - End date
  • priority (string) - Priority level (“High”, “Medium”, “Low”)
  • offer_type (string) - Type of offer
  • venue_type (string) - Where offer applies

Relationships:

  • HAS_OFFER → Product (specific product offer)
  • HAS_OFFER → Category (brand-level offer, attached via a Category node)
  • HAS_OFFER → Category (category-level offer)
  • ELIGIBLE_FOR ← User
  • AVAILABLE_AT → Retailer

Retailer

Represents a store or shopping venue.

Properties:

  • retailer_id (string, unique) - Retailer identifier
  • name (string) - Retailer name
  • venue_type (string) - Type of venue (“Grocery”, “Drugstore”, “Warehouse”, etc.)
  • tier (string) - Price tier (“Premium”, “Mid-Range”, “Discount”, etc.)

Relationships:

  • AVAILABLE_AT ← Offer
  • SHOPS_AT ← User

Demo Retailers:

  • RET_WHOLE_FOODS - Premium grocery
  • RET_ALDI - Discount grocery
  • RET_TARGET - Big box
  • RET_COSTCO - Warehouse club
  • RET_MARIANOS - Regional chain
  • RET_WALGREENS - Drugstore
  • RET_TRADER_JOES - Specialty
  • RET_AMAZON_FRESH - Online delivery

Category

Intermediary node for product categorization and graph optimization.

Properties:

  • category_id (string, unique) - Category identifier
  • name (string) - Category name
  • product_count (integer) - Number of products in category

Relationships:

  • IN_CATEGORY ← Product
  • SHOPS_IN ← User
  • HAS_OFFER ← Offer (category-level offers)

Purpose: Reduces graph complexity by serving as an intermediary between users and products. Instead of relying on direct User-Product edges, queries traverse the User→Category←Product pattern (SHOPS_IN followed by IN_CATEGORY in reverse).
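
The intermediary pattern described above can be sketched as a single traversal. This is an illustrative query rather than one taken from the MCP itself; it assumes the SHOPS_IN and IN_CATEGORY relationships documented in this reference and a $userId parameter.

```cypher
// For one user, list the categories they shop in and a few products in each.
MATCH (u:User {user_id: $userId})-[s:SHOPS_IN]->(c:Category)<-[:IN_CATEGORY]-(p:Product)
RETURN c.name AS category,
       s.purchase_count AS affinity,
       collect(p.name)[0..5] AS sample_products   // at most five product names per category
ORDER BY affinity DESC
```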

Demo Categories:

  • Dairy (milk, cheese, eggs, yogurt)
  • Produce (fruits, vegetables)
  • Meat & Seafood (proteins)
  • Bakery (bread, baked goods)
  • Pantry (dry goods, canned items)
  • Beverages (drinks, juices)
  • Snacks (chips, bars, treats)
  • Frozen (frozen meals, ice cream)
  • Personal Care (hygiene products)
  • Household (cleaning, paper goods)

Community

Intermediary node representing a group of similar shoppers.

Properties:

  • community_id (string, unique) - Community identifier
  • name (string) - Community name
  • member_count (integer) - Number of members
  • primary_category (string) - Main category affinity
  • zip_code (string, optional) - Geographic location

Relationships:

  • MEMBER_OF ← User
  • POPULAR_IN ← Product

Purpose: Graph optimization pattern that contributes to the roughly 70% edge reduction (2,182 → 662 edges in the demo data). Groups similar shoppers to enable efficient collaborative filtering and trend analysis.

Benefits:

  • Faster recommendation queries
  • Better scalability
  • Enables community-based insights
  • Reduces graph memory footprint

Household

Represents a shared living/shopping unit.

Properties:

  • household_id (string, unique) - Household identifier
  • member_count (integer) - Number of household members
  • combined_monthly_budget (float) - Total household budget
  • household_type (string) - Type (“Couple”, “Family”, “Roommates”)

Relationships:

  • MEMBER_OF ← User

Purpose: Enables household-level shopping insights and recommendations considering shared context.
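
A household-level insight could be computed along these lines. The query is a sketch that assumes only the MEMBER_OF and PURCHASED relationships documented here; the aliases (buyers, total_quantity) are illustrative.

```cypher
// Find the user's household, then aggregate purchases across all of its members.
MATCH (:User {user_id: $userId})-[:MEMBER_OF]->(h:Household)
// A second MATCH is used so the requesting user is included among the members.
MATCH (h)<-[:MEMBER_OF]-(member:User)-[p:PURCHASED]->(prod:Product)
RETURN prod.name AS product,
       count(DISTINCT member) AS buyers,
       sum(p.quantity) AS total_quantity
ORDER BY total_quantity DESC
LIMIT 10
```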


PURCHASED

User purchased a product.

Properties:

  • quantity (integer) - Items purchased
  • timestamp (datetime) - Purchase time
  • price (float, optional) - Purchase price

Direction: User → Product

Usage: Purchase history, frequency analysis, preference modeling
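
As one possible shape for the frequency analysis mentioned above, the following sketch aggregates the PURCHASED relationship properties per product. It assumes a $userId parameter; the column aliases are illustrative.

```cypher
// Most frequently purchased products for one user.
MATCH (u:User {user_id: $userId})-[p:PURCHASED]->(prod:Product)
RETURN prod.name AS product,
       count(p) AS purchases,              // number of purchase events
       sum(p.quantity) AS total_quantity,  // total items across those events
       max(p.timestamp) AS last_purchased
ORDER BY purchases DESC
LIMIT 10
```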


ELIGIBLE_FOR

User is eligible to redeem an offer.

Properties: None

Direction: User → Offer

Usage: Personalized offer filtering, targeted promotions
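
Personalized filtering might combine eligibility with the retailers a user already visits. The following is a sketch under that assumption, using only relationships and properties documented in this reference.

```cypher
// Active offers the user is eligible for, restricted to retailers they shop at.
MATCH (u:User {user_id: $userId})-[:ELIGIBLE_FOR]->(o:Offer)
      -[:AVAILABLE_AT]->(r:Retailer)<-[:SHOPS_AT]-(u)
WHERE o.start <= date() AND date() <= o.end
RETURN o.offer_id, o.description, o.points, r.name AS retailer
ORDER BY o.points DESC
```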


HAS_OFFER

Offer applies to product, brand, or category.

Properties: None

Direction: Offer → Product/Category

Usage: Offer discovery, product-to-offer matching
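
Product-to-offer matching can be sketched as follows, collecting offers attached directly to a product separately from those attached to its category. The $productId parameter and the column aliases are assumptions for illustration.

```cypher
MATCH (p:Product {product_id: $productId})
// Offers attached directly to the product.
OPTIONAL MATCH (direct:Offer)-[:HAS_OFFER]->(p)
// Offers attached to the product's category.
OPTIONAL MATCH (p)-[:IN_CATEGORY]->(:Category)<-[:HAS_OFFER]-(viaCategory:Offer)
RETURN collect(DISTINCT direct.offer_id) AS product_offers,
       collect(DISTINCT viaCategory.offer_id) AS category_offers
```

OPTIONAL MATCH keeps the product row even when one of the two offer types is absent, in which case the corresponding list is empty.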


SIMILAR_TO

Products are similar based on collaborative filtering.

Properties:

  • similarity_score (float) - Similarity strength (0.0 to 1.0)

Direction: Product ↔ Product (treated as bidirectional; Neo4j stores every relationship with a direction, so queries match it without an arrow)

Usage: Product recommendations, alternative suggestions, explainable AI


IN_CATEGORY

Product belongs to a category.

Properties: None

Direction: Product → Category

Usage: Category browsing, categorization


SHOPS_AT

User shops at a retailer.

Properties:

  • frequency (integer) - Visit count
  • last_visit (datetime) - Most recent visit

Direction: User → Retailer

Usage: Retailer preferences, location-based offers
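
One way the frequency and last_visit properties could be maintained is an upsert on each visit. This is a sketch, not the project's actual write path; the $retailerId parameter is an assumption.

```cypher
MATCH (u:User {user_id: $userId}), (r:Retailer {retailer_id: $retailerId})
MERGE (u)-[s:SHOPS_AT]->(r)
  ON CREATE SET s.frequency = 1                 // first recorded visit
  ON MATCH  SET s.frequency = s.frequency + 1   // subsequent visits
SET s.last_visit = datetime()
RETURN s.frequency, s.last_visit
```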


MEMBER_OF

User belongs to household or community.

Properties: None

Direction: User → Household/Community

Usage: Household context, community insights, collaborative filtering


POPULAR_IN

Product is popular within a community.

Properties:

  • purchase_count (integer) - Community purchases

Direction: Product → Community

Usage: Community trending, social proof
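
Community trending could be read straight off the POPULAR_IN relationship, for example as below. The query is illustrative and assumes a $userId parameter.

```cypher
// Products popular in the communities the user belongs to.
MATCH (u:User {user_id: $userId})-[:MEMBER_OF]->(c:Community)<-[pop:POPULAR_IN]-(p:Product)
RETURN c.name AS community,
       p.name AS product,
       pop.purchase_count AS purchases
ORDER BY purchases DESC
LIMIT 10
```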


SHOPS_IN

User shops in a category.

Properties:

  • purchase_count (integer) - Category purchases

Direction: User → Category

Usage: Category affinity, expansion prediction
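
Expansion prediction is not specified further in this reference. One plausible approach, shown here purely as an assumption, is to suggest categories that community peers shop in but the user does not yet.

```cypher
MATCH (u:User {user_id: $userId})-[:MEMBER_OF]->(:Community)
      <-[:MEMBER_OF]-(peer:User)-[s:SHOPS_IN]->(c:Category)
WHERE NOT (u)-[:SHOPS_IN]->(c)   // skip categories the user already shops in
RETURN c.name AS category,
       count(DISTINCT peer) AS peers,
       sum(s.purchase_count) AS peer_purchases
ORDER BY peers DESC, peer_purchases DESC
LIMIT 5
```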


User patterns:

(User)-[:MEMBER_OF]->(Household)
(User)-[:MEMBER_OF]->(Community)
(User)-[:SHOPS_IN]->(Category)
(User)-[:SHOPS_AT]->(Retailer)
(User)-[:PURCHASED]->(Product)

Product patterns:

(Product)-[:IN_CATEGORY]->(Category)
(Product)-[:SIMILAR_TO]-(Product)
(Offer)-[:HAS_OFFER]->(Product)
(Product)-[:POPULAR_IN]->(Community)

Offer patterns:

(User)-[:ELIGIBLE_FOR]->(Offer)
(Offer)-[:HAS_OFFER]->(Product/Category)
(Offer)-[:AVAILABLE_AT]->(Retailer)

Community-based recommendation pattern:

(User)-[:MEMBER_OF]->(Community)<-[:MEMBER_OF]-(OtherUser)
(OtherUser)-[:PURCHASED]->(Product)

Old pattern (slow): (User)-[:SIMILAR_TO]-(OtherUser)
New pattern (fast): Uses Community as intermediary


Problem: Direct user-to-user and user-to-product relationships created 2,182 edges, causing:

  • Slow queries
  • High memory usage
  • Poor scalability

Solution: Introduce intermediary nodes:

  1. Community Nodes: Group similar users
  2. Category Nodes: Group related products

Results:

  • Edges reduced: 2,182 → 662 (70% reduction)
  • Query performance: ~3x faster
  • Scalability: Linear growth instead of quadratic
  • Memory: Reduced graph memory footprint

Before: (User)-[:PURCHASED]->(Product) [2,182 edges]
After: (User)-[:SHOPS_IN]->(Category)<-[:IN_CATEGORY]-(Product) [662 edges]

The following indexes are created for query performance:

User:

  • user_id (unique constraint)

Product:

  • product_id (unique constraint)
  • name (text index for search)
  • brand (index)
  • category (index)

Offer:

  • offer_id (unique constraint)
  • start, end (composite index for date range queries)

Category:

  • category_id (unique constraint)
  • name (index)

Community:

  • community_id (unique constraint)
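
The constraints and indexes listed above could be declared roughly as follows. This sketch assumes Neo4j 4.4 or later syntax; the constraint and index names (user_id_unique, product_name_text, and so on) are hypothetical and may differ from those used in the actual deployment.

```cypher
// Unique constraints (each one also creates a backing index).
CREATE CONSTRAINT user_id_unique IF NOT EXISTS
FOR (u:User) REQUIRE u.user_id IS UNIQUE;

CREATE CONSTRAINT product_id_unique IF NOT EXISTS
FOR (p:Product) REQUIRE p.product_id IS UNIQUE;

CREATE CONSTRAINT offer_id_unique IF NOT EXISTS
FOR (o:Offer) REQUIRE o.offer_id IS UNIQUE;

// Text index for product name search.
CREATE TEXT INDEX product_name_text IF NOT EXISTS
FOR (p:Product) ON (p.name);

// Single-property index on brand.
CREATE INDEX product_brand IF NOT EXISTS
FOR (p:Product) ON (p.brand);

// Composite index supporting date range filters on offers.
CREATE INDEX offer_date_range IF NOT EXISTS
FOR (o:Offer) ON (o.start, o.end);
```

The Category and Community constraints would follow the same shape as the first three statements. IF NOT EXISTS makes the script safe to re-run.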

// Purchase history: a user's 20 most recent purchases
MATCH (u:User {user_id: $userId})-[p:PURCHASED]->(prod:Product)
RETURN prod, p.quantity, p.timestamp
ORDER BY p.timestamp DESC
LIMIT 20

// Active offers the user is eligible for, highest points first
MATCH (u:User {user_id: $userId})-[:ELIGIBLE_FOR]->(o:Offer)
WHERE date() >= o.start AND date() <= o.end
RETURN o
ORDER BY o.points DESC

// Products similar to a given product, above a minimum similarity score
MATCH (p1:Product {product_id: $productId})-[s:SIMILAR_TO]-(p2:Product)
WHERE s.similarity_score >= $minSimilarity
RETURN p2, s.similarity_score
ORDER BY s.similarity_score DESC
LIMIT 10

// Community-based recommendations: products bought by community peers
// that the user has not purchased yet
MATCH (u:User {user_id: $userId})-[:MEMBER_OF]->(c:Community)
MATCH (c)<-[:MEMBER_OF]-(other:User)-[:PURCHASED]->(p:Product)
WHERE NOT (u)-[:PURCHASED]->(p)
RETURN p, count(*) AS purchase_count
ORDER BY purchase_count DESC
LIMIT 10

Node counts:

  • Users: 6 demo + 120+ similar users
  • Products: 135+ items across 10 categories
  • Offers: 120+ active offers
  • Retailers: 8 venues
  • Categories: 47 categories
  • Communities: 45 communities
  • Households: 3 households

Edge counts:

  • Total edges: ~662 (after optimization)
  • User relationships: ~180
  • Product relationships: ~300
  • Offer relationships: ~150
  • Household/Community: ~32
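
These figures can be checked against a running instance with a generic count query such as the one below. It is a sketch that makes no assumption beyond a populated database, and the exact numbers returned will depend on the loaded data.

```cypher
// Count relationships per type; the counts should sum to the total edge figure.
MATCH ()-[r]->()
RETURN type(r) AS relationship_type, count(r) AS edges
ORDER BY edges DESC
```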

Original design:

  • User → Product (PURCHASED)
  • User → User (SIMILAR_TO)
  • Simple but slow at scale

Current design (optimized):

  • User → Community → User (implicit similarity)
  • User → Category → Product (categorized shopping)
  • 70% edge reduction, 3x faster queries

Possible future additions:

  • Temporal nodes for time-based patterns
  • Brand nodes for brand-level analysis
  • Location nodes for geographic clustering

Query optimization:

  1. Always use indexes (user_id, product_id, etc.)
  2. Limit result sets with LIMIT clause
  3. Use date filters on active offers
  4. Leverage Community nodes for recommendations
  5. Profile queries with PROFILE or EXPLAIN

Schema design:

  1. Use intermediary nodes for many-to-many relationships
  2. Store frequently accessed properties on nodes
  3. Use relationship properties for metadata
  4. Normalize data appropriately
  5. Consider query patterns in schema design

Performance:

  1. Create indexes on frequently queried properties
  2. Use parameterized queries
  3. Batch write operations
  4. Monitor query execution time
  5. Use APOC procedures for complex operations
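
Parameterized queries and batched writes can be combined with UNWIND, as in the sketch below. The $purchases parameter is an assumption: a list of maps with user_id, product_id, quantity, and an ISO 8601 timestamp string.

```cypher
// One round trip writes many purchases; user and product lookups use the unique indexes.
UNWIND $purchases AS row
MATCH (u:User {user_id: row.user_id})
MATCH (p:Product {product_id: row.product_id})
CREATE (u)-[:PURCHASED {
  quantity: row.quantity,
  timestamp: datetime(row.timestamp)   // parses the ISO 8601 string
}]->(p)
```

Prefixing a statement like this with PROFILE or EXPLAIN shows whether the lookups are served by index seeks rather than label scans.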

Access Neo4j Browser at http://localhost:7474 to visualize the graph:

// Visualize demo user's shopping pattern
MATCH path = (u:User {user_id: 'DEMO_USER_HEALTH'})-[*1..2]-(n)
RETURN path
LIMIT 50

// Visualize product category structure
MATCH (p:Product)-[:IN_CATEGORY]->(c:Category)
RETURN p, c
LIMIT 100

// Visualize community structure
MATCH (u:User)-[:MEMBER_OF]->(c:Community)<-[:POPULAR_IN]-(p:Product)
WHERE c.community_id = 'COMM_001'
RETURN u, c, p