Scalability Analysis: Consumer Graph System

Executive Summary

This Neo4j-based consumer recommendation system has excellent scalability for repurchase predictions but potential bottlenecks in the recommendations query at scale. The system is optimized for read-heavy workloads with pre-computed metrics.

1. Data Model Scale Analysis

Node Counts (Projected)

Users: 1M - 100M nodes
Products: 10K - 100K nodes
Offers: 100 - 1K active nodes
Total Nodes: ~1M - 100M (dominated by Users)

Relationship Counts (Projected)

PURCHASED: 10M - 10B relationships (10-100 purchases per user avg)
SIMILAR_TO: 100K - 10M relationships (sparse product similarity graph)
ELIGIBLE: 100M - 10B relationships (many users × many offers)
APPLIES_TO: 1K - 100K relationships (offers × applicable products)
Total Relationships: ~10M - 20B+

Storage Estimates

Small scale (1M users): ~10GB
Medium scale (10M users): ~100GB
Large scale (100M users): ~1TB+

2. Query Performance Analysis

2.1 Repurchase Prediction Query ✅ EXCELLENT

Query: GetRepurchasePredictions (lines 183-200)

MATCH (u:User {user_id: $user_id})-[bp:PURCHASED]->(p:Product)
WHERE bp.last >= datetime() - duration({days: $lookback_days})
  AND bp.times >= $min_intervals
  AND bp.avg_interval_days > 0
WITH p, bp,
     bp.last + duration({days: toInteger(bp.avg_interval_days)}) AS next_expected_date
RETURN p.product_id, p.name, bp.last, bp.avg_interval_days,
       next_expected_date, bp.times, bp.times - 1
ORDER BY next_expected_date ASC
LIMIT $limitN

Complexity Analysis:

Time Complexity: O(P) where P = products purchased by user
Index Usage: user_id_unique constraint (indexed lookup)
Traversal Pattern: Single-hop from User → PURCHASED → Product
Filtering: On relationship properties (pre-computed values)

Scalability Assessment:

Metric	Small (10 purchases)	Medium (100 purchases)	Large (1000 purchases)
Query Time	`<5ms`	`<10ms`	`<50ms`
Scaling Factor	O(1) effectively	O(1) effectively	O(1) effectively

Why It Scales:

✅ Pre-computed intervals: avg_interval_days stored on relationships
✅ User-scoped: Query starts with unique user lookup (indexed)
✅ Bounded result: Limited by user’s purchase history
✅ Simple traversal: Single relationship type, no graph expansion
✅ Minimal computation: Just date arithmetic on relationship properties

Bottlenecks: ⚠️ NONE for typical use cases

Even power users with 1000s of purchases will see <100ms response times
Scales linearly with individual user purchase history (typically bounded)

2.2 Recommendations Query ⚠️ POTENTIAL BOTTLENECK

Query: GetRecommendations (lines 56-86)

MATCH (u:User {user_id:$user_id})
MATCH (u)-[bp:PURCHASED]->(p:Product)
WHERE bp.last >= datetime() - duration({days:$lookback_days})

MATCH (p)-[s:SIMILAR_TO]->(cand:Product)
MATCH (u)-[e:ELIGIBLE]->(o:Offer)-[:APPLIES_TO]->(cand)
WHERE e.start <= $as_of
  AND o.start <= $as_of AND $as_of <= o.end

OPTIONAL MATCH (u)-[rp:PURCHASED]->(cand)
WITH p, cand, s, o, rp
WHERE rp IS NULL OR rp.last < datetime() - duration({days:$exclude_days})

WITH p, cand, s, o
ORDER BY p.product_id, s.score DESC
WITH p, collect({cand:cand, score:s.score, o:o})[..$per_seed_cap] AS perSeed
UNWIND perSeed AS row
WITH row.cand AS cand, row.score AS sim_score, row.o AS o

WITH cand, o, sum(sim_score) AS agg_sim
WITH cand, o, toInteger(agg_sim * (log(1 + o.points)/log(2)) * 100) AS rank
ORDER BY rank DESC, o.priority DESC, o.end ASC
RETURN cand.product_id, cand.name, o.offer_id, o.title, o.points, o.end, rank
LIMIT $limitN

Complexity Analysis:

Time Complexity: O(P × S × E) where:
- P = products purchased by user (10-100 typical)
- S = similar products per seed (capped at per_seed_cap = 80)
- E = eligible offers per user (10-1000s potential)
Worst Case: O(100 × 80 × 1000) = 8M relationship traversals

Traversal Pattern:

User → PURCHASED → Product (P products)
Product → SIMILAR_TO → Candidate Products (P × S candidates)
User → ELIGIBLE → Offer (E offers)
Offer → APPLIES_TO → Candidate (filter join)
User → PURCHASED → Candidate (existence check)

Scalability Assessment:

Scale	Purchases	Similar/Product	Offers	Traversals	Est. Time
Small	10	20	10	2K	20-50ms
Medium	50	50	100	250K	100-200ms
Large	100	80	1000	8M	500-2000ms

Bottlenecks Identified:

❌ ELIGIBLE relationship fan-out
- Each user may have 100s-1000s of ELIGIBLE relationships
- This creates a large cartesian product with SIMILAR_TO candidates
- Impact: Quadratic growth with number of offers
❌ Missing index on relationship properties
- bp.last filtered but no index (full scan of PURCHASED relationships)
- e.start and o.start/end filtered but limited index coverage
- Impact: Linear scan of relationships
❌ Graph expansion
- Query expands from 1 user → P seeds → (P × S) candidates
- Then filters against E offers
- Impact: Potentially millions of intermediate paths
✅ Mitigations in place:
- per_seed_cap limits expansion per product (80 max)
- LIMIT caps final results (20 default)
- Offers are time-bounded (active offers only)

3. Index Coverage Analysis

Current Indexes (schema.go:51-71)

Index	Covers	Used By	Effectiveness
`user_id_unique`	User.user_id	All queries	✅ Excellent
`product_id_unique`	Product.product_id	Lookups	✅ Excellent
`offer_id_unique`	Offer.offer_id	Lookups	✅ Excellent
`offer_dates`	Offer.(start, end)	Recommendations	⚠️ Partial
`product_category`	Product.category	Not used	❌ Unused
`user_zip`	User.zip	Not used	❌ Unused
`product_brand`	Product.brand	Not used	❌ Unused

Missing Critical Indexes ❌

Relationship property index on PURCHASED.last
```
WHERE bp.last >= datetime() - duration({days:$lookback_days})
```
Impact: Full scan of user’s purchase relationships
Relationship property index on ELIGIBLE.start
```
WHERE e.start <= $as_of
```
Impact: Full scan of user’s eligibility relationships
Composite index on PURCHASED(user_id, last, product_id)
- Would accelerate both queries significantly
- Neo4j doesn’t support relationship property indexes in Community Edition

4. Scale Projections

4.1 Users: 1M → 10M → 100M

Repurchase Endpoint:

✅ Linear scaling: Each user query is independent
✅ Bounded complexity: O(user’s purchases) regardless of total users
✅ Expected: <50ms @ 100M users

Recommendations Endpoint:

⚠️ Degradation possible: Depends on ELIGIBLE relationship growth
⚠️ If offers scale with users: Query time could grow to 1-2s
⚠️ If offers stay bounded (likely): 200-500ms stable

4.2 Products: 10K → 100K → 1M

Repurchase Endpoint:

✅ No impact: Product count doesn’t affect user-scoped queries

Recommendations Endpoint:

⚠️ Moderate impact: More SIMILAR_TO relationships
⚠️ Mitigated by per_seed_cap limiting expansion
⚠️ Expected: +50-100ms per 10x product growth

4.3 Purchases: 10M → 1B → 10B

Repurchase Endpoint:

✅ No global impact: Only user’s own purchases matter
✅ User-level bounded: Even power users <1000 purchases

Recommendations Endpoint:

⚠️ Indirect impact: More historical purchase data
⚠️ Filtered by lookback window: Typically only last 120 days matter
⚠️ Archival strategy needed: Older purchases should be moved to cold storage

5. Performance Over Time

Data Growth Patterns

Linear Growth (Expected):

New users added steadily
New purchases added daily
Older data remains but is filtered by time windows

Impact on Query Performance:

Timeline	Users	Purchases	Repurchase Query	Recommendations Query
Month 1	100K	1M	5ms	50ms
Month 6	500K	10M	5ms	100ms
Year 1	1M	50M	5-10ms	200ms
Year 2	5M	500M	5-10ms	500ms
Year 5	20M	5B	10-20ms	1-2s ⚠️

Critical Threshold: ~10M users / 1B purchases

Recommendations query may exceed 500ms SLA
Requires optimization (see Section 6)

6. Optimization Recommendations

🔴 High Priority (Do Before Scale)

Add relationship property indexes (requires Neo4j Enterprise)

CREATE INDEX purchased_last FOR ()-[r:PURCHASED]-() ON (r.last)
CREATE INDEX eligible_start FOR ()-[r:ELIGIBLE]-() ON (r.start)

Expected improvement: 50-70% reduction in query time

Denormalize offer eligibility
- Pre-compute eligible offers per user
- Store as User property: eligible_offer_ids: [id1, id2, ...]
- Update on eligibility changes Expected improvement: 60-80% reduction for large offer sets
Add read replicas
- All queries are read-only
- Scale horizontally with replicas
- Route traffic via load balancer Expected improvement: 5-10x throughput capacity

🟡 Medium Priority (Before Year 2)

Implement caching layer
- Cache recommendations per user (5-15 min TTL)
- Redis/Memcached for hot data
- 80-90% cache hit rate expected Expected improvement: 10-20x reduction in DB load
Optimize SIMILAR_TO graph
- Prune low-score similarities (< 0.6 threshold)
- Limit max similarities per product to 50
- Reduce graph fan-out Expected improvement: 30-40% reduction in traversals
Partition historical data
- Move purchases >180 days to archive table/graph
- Keep hot data (last 6 months) in main graph
- Archive accessible for long-term queries only Expected improvement: 40-50% reduction in relationship scans

🟢 Low Priority (Nice to Have)

Pre-compute recommendations
- Batch job to generate recommendations nightly
- Store top 20 recommendations per user
- Serve from pre-computed table Expected improvement: Near-instant queries (<5ms)
Implement graph partitioning
- Shard by user_id ranges
- Dedicated Neo4j instances per shard
- Requires sophisticated routing layer Expected improvement: Unbounded horizontal scaling

7. Monitoring Recommendations

Key Metrics to Track

Query Performance:

P50, P95, P99 response times per endpoint
Alert threshold: P95 > 500ms
Track over time to detect degradation

Database Health:

Neo4j heap usage (alert >80%)
Page cache hit rate (alert <85%)
Transaction throughput
Connection pool saturation

Business Metrics:

Queries per user per day
Purchase history growth rate
Offer creation rate
SIMILAR_TO relationship density

Recommended Alerts

alerts:
  - name: "Slow Recommendations Query"
    condition: "p95_latency > 500ms for 5 minutes"
    action: "Scale read replicas or optimize query"

  - name: "Slow Repurchase Query"
    condition: "p95_latency > 100ms for 5 minutes"
    action: "Investigate index issues"

  - name: "Database Memory Pressure"
    condition: "heap_usage > 85%"
    action: "Increase heap or add replicas"

  - name: "Purchase Growth Anomaly"
    condition: "daily_purchases > 2x 7-day average"
    action: "Check for data quality issues"

8. Conclusion

✅ Strengths

Repurchase prediction endpoint is exceptionally scalable
Pre-computed metrics eliminate runtime complexity
User-scoped queries naturally partition workload
Graph database well-suited for relationship-heavy queries

⚠️ Concerns

Recommendations query has quadratic growth potential
Missing critical relationship property indexes
ELIGIBLE relationship fan-out is primary bottleneck
No caching layer for frequently accessed data

🎯 Recommended Scale Plan

Phase 1: Current State (0-1M users)

System works well out of the box
Monitor query performance
No immediate optimizations needed

Phase 2: Growth (1-10M users)

Implement caching layer (Redis)
Add read replicas (2-3 instances)
Prune similarity graph
Cost: ~$500-1000/month

Phase 3: Scale (10-50M users)

Upgrade to Neo4j Enterprise (relationship indexes)
Denormalize offer eligibility
Partition historical data
Add 5-10 read replicas
Cost: ~$5000-10000/month

Phase 4: Massive Scale (50M+ users)

Pre-compute recommendations (batch jobs)
Implement graph sharding
Dedicated recommendation clusters
Cost: ~$20000+/month

Scalability Analysis: Consumer Graph System

Scalability Analysis: Consumer Graph System

Executive Summary

1. Data Model Scale Analysis

Node Counts (Projected)

Relationship Counts (Projected)

Storage Estimates

2. Query Performance Analysis

2.1 Repurchase Prediction Query ✅ EXCELLENT

2.2 Recommendations Query ⚠️ POTENTIAL BOTTLENECK

3. Index Coverage Analysis

Current Indexes (schema.go:51-71)

Missing Critical Indexes ❌

4. Scale Projections

4.1 Users: 1M → 10M → 100M

4.2 Products: 10K → 100K → 1M

4.3 Purchases: 10M → 1B → 10B

5. Performance Over Time

Data Growth Patterns

6. Optimization Recommendations

🔴 High Priority (Do Before Scale)

🟡 Medium Priority (Before Year 2)

🟢 Low Priority (Nice to Have)

7. Monitoring Recommendations

Key Metrics to Track

Recommended Alerts

8. Conclusion

✅ Strengths

⚠️ Concerns

🎯 Recommended Scale Plan

Final Verdict: PRODUCTION-READY with optimization roadmap in place