Website Interaction Tracking
Overview
The Website Interaction Tracking system captures user behavior on client applications (web, mobile, desktop) and processes it into actionable insights. This is the Phase 1 component of CROW and the current focus for development.
JavaScript SDK
CROW provides a JavaScript SDK for easy integration into web, mobile (React Native), and desktop (Electron) applications.
Supported Platforms
- Web: Any website or web application
- Mobile: React Native applications (with DOM-like structure)
- Desktop: Electron applications (Chromium-based)
The SDK works on any platform that exposes a DOM or DOM-like structure, which is what lets it cover web, mobile, and desktop interfaces.
Installation
# npm
npm install @crow/tracking-sdk
# pnpm
pnpm add @crow/tracking-sdk
# bun
bun add @crow/tracking-sdk
Basic Setup
// React Application
import { CROWTracker } from '@crow/tracking-sdk';
const tracker = new CROWTracker({
apiKey: 'your-api-key-here',
endpoint: 'https://api.crow.example.com'
});
// The SDK automatically tracks:
// - Page views
// - Clicks
// - Form interactions
// - Product views
// - Performance metrics
// Custom events are sent explicitly with tracker.track()
Platform-Specific Examples
React Native
// React Native Application
import { CROWTracker } from '@crow/tracking-sdk';
const tracker = new CROWTracker({
apiKey: 'your-api-key-here',
endpoint: 'https://api.crow.example.com',
platform: 'react-native'
});
// Works with React Native's component structure
Electron
// Electron Application (renderer process; the main process has no DOM to auto-track)
import { CROWTracker } from '@crow/tracking-sdk';
const tracker = new CROWTracker({
apiKey: 'your-api-key-here',
endpoint: 'https://api.crow.example.com',
platform: 'electron'
});
// Works with Electron's Chromium-based rendering
SDK Features
The CROW tracking SDK is designed to be lightweight and non-intrusive:
- Lightweight Wrapper: Minimal bundle size impact (< 10KB gzipped)
- Auto-Tracking: Automatic capture of common events
- Custom Events: Support for application-specific tracking
- Batching: Efficient event batching to reduce network calls
- Offline Support: Queue events when offline, sync when online
- Privacy Controls: GDPR/CCPA compliance features
- No Dependencies: Zero external dependencies
- TypeScript Support: Full TypeScript definitions included
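The batching and offline-queue behavior listed above could work roughly as follows. This is a minimal sketch, not the SDK's actual implementation: `EventBatcher`, `TrackedEvent`, `SendBatch`, and the default batch size of 20 are all hypothetical names and values chosen for illustration. The transport is modeled as a synchronous function to keep the example short; a real transport would be an asynchronous network call.

```typescript
// Hypothetical sketch of client-side batching with an offline queue.
// None of these names come from the CROW SDK.
interface TrackedEvent {
  type: string;
  timestamp: number;
  properties?: Record<string, unknown>;
}

// Returns true when the batch was accepted, false when the network is
// unavailable or the request failed.
type SendBatch = (events: TrackedEvent[]) => boolean;

class EventBatcher {
  private queue: TrackedEvent[] = [];
  private readonly send: SendBatch;
  private readonly maxBatchSize: number;

  constructor(send: SendBatch, maxBatchSize: number = 20) {
    this.send = send;
    this.maxBatchSize = maxBatchSize;
  }

  // Queue an event and try to flush once a full batch has accumulated.
  add(event: TrackedEvent): void {
    this.queue.push(event);
    if (this.queue.length >= this.maxBatchSize) {
      this.flush();
    }
  }

  // Send queued events batch by batch. On the first failure, stop and keep
  // the unsent events queued so a later flush can retry them.
  flush(): void {
    while (this.queue.length > 0) {
      const batch = this.queue.slice(0, this.maxBatchSize);
      if (!this.send(batch)) {
        return;
      }
      this.queue.splice(0, batch.length);
    }
  }

  // Number of events still waiting to be sent.
  get pending(): number {
    return this.queue.length;
  }
}
```

Calling `flush()` again when connectivity returns drains whatever accumulated while offline, in the original order.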
Tracked Events
Automatic Tracking
The SDK automatically tracks:
- Page Views
  - URL, title, referrer
  - Timestamp and session ID
  - Viewport size and device type
- Clicks
  - Element clicked (button, link, etc.)
  - Element attributes and text content
  - Position on page
- Form Interactions
  - Form submissions
  - Field focus/blur events
  - Validation errors
- Product Views
  - Product ID and name
  - View duration
  - Scroll depth
- Performance Metrics
  - Page load time
  - Time to interactive
  - Core Web Vitals
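One way to picture the automatically tracked events is as a discriminated union keyed on `type`. The shapes below are an assumption built from the field lists above; the real SDK payloads are not documented here, so every interface and property name in this sketch is hypothetical.

```typescript
// Hypothetical payload shapes derived from the lists above.
interface BaseEvent {
  sessionId: string;
  timestamp: number;
}

interface PageViewEvent extends BaseEvent {
  type: 'page_view';
  url: string;
  title: string;
  referrer: string;
  viewport: { width: number; height: number };
}

interface ElementClickEvent extends BaseEvent {
  type: 'click';
  element: string;
  text: string;
  x: number;
  y: number;
}

interface FormInteractionEvent extends BaseEvent {
  type: 'form';
  action: 'submit' | 'focus' | 'blur' | 'validation_error';
  field?: string;
}

interface ProductViewEvent extends BaseEvent {
  type: 'product_view';
  productId: string;
  productName: string;
  durationMs: number;
  scrollDepth: number; // fraction of the page scrolled, 0 to 1
}

interface PerformanceMetricsEvent extends BaseEvent {
  type: 'performance';
  pageLoadMs: number;
  timeToInteractiveMs: number;
}

type AutoTrackedEvent =
  | PageViewEvent
  | ElementClickEvent
  | FormInteractionEvent
  | ProductViewEvent
  | PerformanceMetricsEvent;

// The switch is exhaustive: adding a new event type without handling it
// here becomes a compile-time error.
function describeEvent(event: AutoTrackedEvent): string {
  switch (event.type) {
    case 'page_view':
      return `viewed ${event.url}`;
    case 'click':
      return `clicked ${event.element}`;
    case 'form':
      return event.field === undefined
        ? `form ${event.action}`
        : `form ${event.action} on ${event.field}`;
    case 'product_view':
      return `viewed product ${event.productId} for ${event.durationMs}ms`;
    case 'performance':
      return `page loaded in ${event.pageLoadMs}ms`;
  }
}
```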
Custom Events
Developers can track custom events:
tracker.track('purchase_completed', {
productId: '12345',
amount: 99.99,
currency: 'USD'
});
Ingestion Service Architecture
Session Management with Durable Objects
How It Works
- SDK Sends Events: User interactions sent to API Gateway
- Gateway Routes: Requests routed to Web Ingest Worker
- Durable Object Storage: Events stored in Durable Object per session
- Inactivity Alarm: Durable Object sets alarm for 1 hour
- Alarm Triggers: After 1 hour of inactivity, alarm fires
- Queue Dispatch: Session data sent to Interaction Service queue
- Processing: Interaction Service processes session
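The store-then-reset-alarm cycle in the steps above can be modeled in memory as shown here. This is a simplified sketch under stated assumptions: `SessionBuffer`, `checkAlarm`, and the `dispatch` callback are hypothetical, and the clock is passed in explicitly so the behavior is easy to follow. In an actual Durable Object the same roles would be played by storage writes, the storage `setAlarm` call, and the object's `alarm()` handler.

```typescript
// In-memory model of one session's Durable Object. All names are hypothetical.
const INACTIVITY_MS = 60 * 60 * 1000; // 1 hour, per the session definition

interface SessionEvent {
  type: string;
  timestamp: number;
}

interface CompletedSession {
  sessionId: string;
  events: SessionEvent[];
}

class SessionBuffer {
  private events: SessionEvent[] = [];
  private alarmAt: number | null = null;
  private readonly sessionId: string;
  private readonly dispatch: (session: CompletedSession) => void;

  constructor(sessionId: string, dispatch: (session: CompletedSession) => void) {
    this.sessionId = sessionId;
    this.dispatch = dispatch;
  }

  // Store the incoming batch and push the alarm one hour past "now".
  ingest(batch: SessionEvent[], now: number): void {
    this.events.push(...batch);
    this.alarmAt = now + INACTIVITY_MS;
  }

  // Stand-in for the alarm handler: if the alarm is due, hand the session
  // to the queue and clear local state. Returns true when it dispatched.
  checkAlarm(now: number): boolean {
    if (this.alarmAt === null || now < this.alarmAt) {
      return false;
    }
    this.dispatch({ sessionId: this.sessionId, events: this.events });
    this.events = [];
    this.alarmAt = null;
    return true;
  }
}
```

Because every `ingest` call overwrites `alarmAt`, a session only completes once a full hour passes with no new events.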
Why Durable Objects?
- Fast State Management: Instant read/write without database roundtrips
- Automatic Alarms: Built-in alarm system for time-based triggers
- Session Isolation: Each session has independent state
- Scalability: Automatically scales with traffic
- Consistency: Strong consistency guarantees
Durable Objects vs Traditional Session Storage
| Capability | Durable Objects | Traditional Redis |
|---|---|---|
| Latency | Low-latency access from Workers | Depends on Redis region vs client |
| Consistency | Strong consistency per object | Strong on a single primary; replica reads can lag (asynchronous replication) |
| Scaling | Automatic | Manual sharding |
| Billing | Per request + duration | Instance-based |
Session Definition
Web Session: A period of user activity ending after 1 hour of inactivity. When 1 hour passes without new events, the session is considered complete and sent for processing.
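The session definition can be expressed as a small function that splits a stream of event timestamps wherever the gap reaches one hour. This is an illustrative sketch only; `splitIntoSessions` is a hypothetical helper, and treating a gap of exactly one hour as a session boundary is an assumption about how the edge case is handled.

```typescript
const SESSION_GAP_MS = 60 * 60 * 1000; // 1 hour of inactivity ends a session

// Groups timestamps (milliseconds) into sessions. A new session starts
// whenever the gap since the previous event is at least gapMs.
function splitIntoSessions(timestamps: number[], gapMs: number = SESSION_GAP_MS): number[][] {
  const sorted = [...timestamps].sort((a, b) => a - b);
  const sessions: number[][] = [];
  let current: number[] = [];
  let previous: number | null = null;

  for (const t of sorted) {
    if (previous !== null && t - previous >= gapMs) {
      sessions.push(current);
      current = [];
    }
    current.push(t);
    previous = t;
  }
  if (current.length > 0) {
    sessions.push(current);
  }
  return sessions;
}
```

For events at minutes 0, 10, 50, 120, and 125, the 70-minute gap between minute 50 and minute 120 splits the stream into two sessions.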
Web Ingest Worker
Service Details
Service Name: web-ingest-worker
Technology: Cloudflare Workers (TypeScript) with Durable Objects
Deployment: Cloudflare Workers edge network
Why Cloudflare Workers?
- Global Edge Network: Ingestion happens at 300+ edge locations
- Sub-50ms Latency: Response times under 50ms globally
- Instant Scaling: Handles traffic spikes automatically
- Near-Zero Cold Starts: V8 isolates start in milliseconds, avoiding container boot time
- Integrated Storage: Direct access to Durable Objects, D1, and Queues
- Durable Objects: Per-session state with built-in alarms, matching the inactivity-based session model
Ingestion Process
- Receive Events: SDK sends batched events to nearest edge location
- API Gateway: Validates auth, converts session/API key to JWT
- Web Ingest Worker: Receives events with JWT
- Durable Object: Stores events in session-specific Durable Object
- Alarm Management: Sets/resets alarm for 1-hour inactivity
- Queue Dispatch: After alarm fires, sends session to Interaction Queue
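The worker-side part of this process (accept a batch, check the token, pick the session's Durable Object) might look like the sketch below. Everything here is an assumption for illustration: the claim names, the `organizationId:sessionId` object naming scheme, and `prepareIngest` itself are hypothetical. The sketch also assumes the API Gateway has already verified the JWT signature, so only the standard `exp` claim (seconds since epoch) is checked.

```typescript
// Hypothetical claims carried by the gateway-issued JWT.
interface GatewayClaims {
  organizationId: string;
  sessionId: string;
  exp: number; // expiry, seconds since epoch
}

interface IngestEvent {
  type: string;
  timestamp: number;
}

type IngestResult =
  | { ok: true; objectName: string; events: IngestEvent[] }
  | { ok: false; httpStatus: number; error: string };

function isIngestEvent(value: unknown): value is IngestEvent {
  if (typeof value !== 'object' || value === null) return false;
  const candidate = value as Record<string, unknown>;
  return typeof candidate['type'] === 'string' && typeof candidate['timestamp'] === 'number';
}

// Validates the request and derives the name of the per-session object.
// In a Worker, that name would then be turned into an object ID via the
// Durable Object namespace binding.
function prepareIngest(claims: GatewayClaims, body: unknown, nowSeconds: number): IngestResult {
  if (claims.exp <= nowSeconds) {
    return { ok: false, httpStatus: 401, error: 'token expired' };
  }
  if (typeof body !== 'object' || body === null) {
    return { ok: false, httpStatus: 400, error: 'body must be a JSON object' };
  }
  const events = (body as Record<string, unknown>)['events'];
  if (!Array.isArray(events) || events.length === 0 || !events.every(isIngestEvent)) {
    return { ok: false, httpStatus: 400, error: 'events must be a non-empty array of { type, timestamp }' };
  }
  return {
    ok: true,
    objectName: `${claims.organizationId}:${claims.sessionId}`,
    events,
  };
}
```

Scoping the object name by organization as well as session keeps two tenants with colliding session IDs from sharing state.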
Session Processing Pipeline
Pipeline Stages
1. Event Collection
- SDK collects user interactions
- Batches events to reduce network calls
- Sends to nearest edge location via API Gateway
2. Authentication & Routing
- API Gateway validates session or API key
- Converts to JWT for internal use
- Routes to Web Ingest Worker
3. Durable Object Storage
- Events stored in session-specific Durable Object
- Alarm set for 1-hour inactivity
- Alarm reset on each new event
4. Session Completion
- After 1 hour without events, alarm fires
- Session marked as complete
- Session data sent to Cloudflare Queues
5. Async Processing Queue
- Session IDs added to Interaction Queue
- Queue acts as buffer between ingestion and processing
- Enables scaling of processing layer independently
6. Interaction Service Processing
- Interaction Service consumes queue messages
- Retrieves Organization and Product context
- Analyzes session using Gemini AI via Vercel AI SDK
- Generates human-readable interaction records
- Stores embeddings in Vectorize and metadata in D1
- Notifies Analytics Service for billing
Cloudflare Workflows Processing
Cloudflare Workflows orchestrates multi-step session processing with built-in durability.
How Workflows Work
- Durable Execution: Steps are checkpointed automatically
- Retry Logic: Built-in retries for failed steps
- Long-running: Can run for extended periods without timeouts
- Event-driven: Triggered by Cloudflare Queues
Workflow Steps
- Validate Session: Check session data completeness
- Enrich Context: Add product and organization context
- AI Processing: Send to ai-processing-service
- Generate Embeddings: Create vector embeddings
- Store Results: Save to D1 and Vectorize
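To illustrate what checkpointing and per-step retries buy, here is a simplified synchronous model of durable steps. It is not the Cloudflare Workflows API (which is asynchronous and exposes a comparable per-step call); `runStep`, `processSession`, the step names, and the retry counts are hypothetical, and the checkpoint store is a plain `Map` standing in for durable state.

```typescript
// Completed step results, keyed by step name. Stand-in for durable state.
type Checkpoints = Map<string, unknown>;

// Runs a step at most maxAttempts times. A step that already has a
// checkpoint is skipped and its saved result is returned instead.
function runStep<T>(checkpoints: Checkpoints, stepName: string, maxAttempts: number, work: () => T): T {
  if (checkpoints.has(stepName)) {
    return checkpoints.get(stepName) as T;
  }
  let lastError: unknown = new Error(`step ${stepName} was not attempted`);
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      const result = work();
      checkpoints.set(stepName, result);
      return result;
    } catch (error) {
      lastError = error; // remember the failure and try again
    }
  }
  throw lastError;
}

// Two of the workflow steps above, chained. The analyze callback stands in
// for the call to ai-processing-service.
function processSession(
  checkpoints: Checkpoints,
  sessionId: string,
  analyze: (context: string) => string,
): string {
  const context = runStep(checkpoints, 'enrich-context', 1, () => `context for ${sessionId}`);
  return runStep(checkpoints, 'ai-processing', 3, () => analyze(context));
}
```

If the workflow is restarted with the same checkpoints, completed steps are not executed again, so an expensive AI call is not repeated after a later step fails.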
Session Understanding with AI
The ai-processing-service analyzes user sessions to generate meaningful interactions.
Processing Capabilities
1. Behavior Analysis
- Interprets user actions
- Identifies patterns in clicks and navigation
- Detects user intent signals
2. Intent Detection
- Determines user intent (browsing, comparing, purchasing)
- Classifies session goals
- Predicts next actions
3. Interaction Generation
- Creates natural language interactions
- Translates technical events to business insights
- Generates human-readable summaries
4. Context Enrichment
- Adds product and business context
- Links to product catalog
- Enriches with organizational data
Example Processing
Raw Events:
[
{"type": "page_view", "url": "/products/laptop-x1"},
{"type": "click", "element": "add_to_cart"},
{"type": "page_view", "url": "/cart"},
{"type": "click", "element": "checkout"}
]
Generated Interaction:
User viewed Laptop X1 product page, added item to cart, and proceeded to checkout.
Indicates high purchase intent for Laptop X1.
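For intuition about the mapping from raw events to an intent label, the sketch below uses a few hand-written rules. This is a deliberately simple rule-based stand-in, not the AI-based analysis the service performs; `classifyIntent`, the `/products/` URL convention, and the rule thresholds are assumptions made for this example.

```typescript
type Intent = 'browsing' | 'comparing' | 'purchasing';

interface RawEvent {
  type: string;
  url?: string;
  element?: string;
}

// Rule-based stand-in for intent detection:
// - any cart or checkout click means purchasing
// - two or more distinct product pages means comparing
// - anything else is browsing
function classifyIntent(events: RawEvent[]): Intent {
  const clicked = new Set(
    events.filter((e) => e.type === 'click').map((e) => e.element),
  );
  if (clicked.has('add_to_cart') || clicked.has('checkout')) {
    return 'purchasing';
  }
  const productPages = new Set(
    events
      .filter((e) => e.type === 'page_view' && e.url !== undefined && e.url.startsWith('/products/'))
      .map((e) => e.url),
  );
  return productPages.size >= 2 ? 'comparing' : 'browsing';
}
```

Applied to the raw events in the example above, the `add_to_cart` click is enough for these rules to label the session as purchasing.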
Data Flow
Complete Flow
SDK → API Gateway → Web Ingest Worker → Durable Object (per session) → Interaction Queue → Interaction Service → D1, Vectorize, and R2
Integration with Data Storage
The interactions generated by the session processors feed into the CROW Data Lake where they are:
- Stored as raw data in R2: Full interaction data for archival
- Vectorized and indexed in Vectorize: Enables semantic search
- Cataloged in D1 with metadata: Source, timestamp, products, organization
For more information, see the Data Storage Architecture documentation.
Privacy & Compliance
Data Privacy
- Anonymization: Personal data can be anonymized
- User Consent: SDK respects user consent preferences
- Data Retention: Configurable retention policies
- Right to Deletion: Support for GDPR deletion requests
Compliance Features
- GDPR: Consent handling, anonymization, retention policies, and deletion requests, as listed under Data Privacy
- CCPA: California Consumer Privacy Act compliance
- Cookie Consent: Integration with cookie consent tools
- Do Not Track: Respects DNT browser settings
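A consent and Do Not Track gate in front of event collection could be as small as the following. The names `PrivacySignals`, `shouldTrack`, and `anonymize` are hypothetical and not part of the documented SDK. The only external fact relied on is that browsers exposing a Do Not Track value report `"1"` when the user has enabled it.

```typescript
interface PrivacySignals {
  consentGiven: boolean;
  // Value of the browser's Do Not Track setting, or null when unavailable.
  doNotTrack: string | null;
}

// Track only with explicit consent and when Do Not Track is not enabled.
function shouldTrack(signals: PrivacySignals): boolean {
  return signals.consentGiven && signals.doNotTrack !== '1';
}

// Returns a copy of the properties with the listed personal keys removed.
function anonymize(
  properties: Record<string, unknown>,
  personalKeys: string[],
): Record<string, unknown> {
  const result: Record<string, unknown> = {};
  for (const [key, value] of Object.entries(properties)) {
    if (!personalKeys.includes(key)) {
      result[key] = value;
    }
  }
  return result;
}
```

Running both checks before an event enters the send queue means nothing personal is buffered locally for users who have opted out.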
Performance & Reliability
Performance Metrics
- Ingestion Latency: < 50ms (p95)
- Processing Latency: < 5 minutes from queue dispatch to stored insights (excludes the 1-hour inactivity window)
- Throughput: 100K+ events/second
- Availability: 99.9% uptime SLA
Error Handling
- Retry Logic: Automatic retries with exponential backoff
- Dead Letter Queue: Failed events moved to DLQ for investigation
- Monitoring: Real-time alerts on error rates
- Graceful Degradation: SDK continues to work if backend is down
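The retry and dead-letter rules above combine into a simple decision: wait longer after each failure, and give up after a fixed number of attempts. The sketch below shows one common formulation; the 500 ms base delay, 30 s cap, and limit of 5 attempts are assumed values, and `backoffDelayMs` and `decideRetry` are hypothetical names.

```typescript
// Delay before retry number `attempt` (1-based): base * 2^(attempt - 1),
// capped so that late retries do not wait unboundedly long.
function backoffDelayMs(attempt: number, baseMs: number = 500, capMs: number = 30000): number {
  return Math.min(capMs, baseMs * Math.pow(2, attempt - 1));
}

type RetryDecision =
  | { action: 'retry'; delayMs: number }
  | { action: 'dead_letter' };

// After `failedAttempts` failures, either schedule another attempt or
// route the event to the dead letter queue.
function decideRetry(failedAttempts: number, maxAttempts: number = 5): RetryDecision {
  if (failedAttempts >= maxAttempts) {
    return { action: 'dead_letter' };
  }
  return { action: 'retry', delayMs: backoffDelayMs(failedAttempts) };
}
```

Production implementations often add random jitter to the delay so that many clients recovering at once do not retry in lockstep; that is omitted here for clarity.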
API Key Management
Creating API Keys
- Organization admin creates API key in dashboard
- Key scoped to organization
- Optional domain restrictions
- Rate limiting per key
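Domain restrictions and per-key rate limits can both be checked at the gateway before an event batch is accepted. The following is a rough sketch under assumptions: `ApiKeyPolicy`, `isDomainAllowed`, and `PerKeyRateLimiter` are hypothetical, subdomains are assumed to inherit their parent domain's permission, and a fixed one-minute window is used because it is the simplest limiter to reason about.

```typescript
interface ApiKeyPolicy {
  organizationId: string;
  allowedDomains: string[]; // empty means no domain restriction
  maxRequestsPerMinute: number;
}

// A hostname is allowed if it equals an allowed domain or is a subdomain of one.
function isDomainAllowed(policy: ApiKeyPolicy, hostname: string): boolean {
  if (policy.allowedDomains.length === 0) {
    return true;
  }
  const host = hostname.toLowerCase();
  return policy.allowedDomains.some((domain) => {
    const allowed = domain.toLowerCase();
    return host === allowed || host.endsWith('.' + allowed);
  });
}

const WINDOW_MS = 60 * 1000;

// Fixed-window counter per API key. Times are in milliseconds.
class PerKeyRateLimiter {
  private readonly windows = new Map<string, { startedAt: number; count: number }>();

  allow(apiKey: string, policy: ApiKeyPolicy, now: number): boolean {
    if (policy.maxRequestsPerMinute < 1) {
      return false;
    }
    const current = this.windows.get(apiKey);
    if (current === undefined || now - current.startedAt >= WINDOW_MS) {
      this.windows.set(apiKey, { startedAt: now, count: 1 });
      return true;
    }
    if (current.count >= policy.maxRequestsPerMinute) {
      return false;
    }
    current.count += 1;
    return true;
  }
}
```

A fixed window allows short bursts at window boundaries; a sliding window or token bucket would smooth that out at the cost of slightly more state per key.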
Security
- Encryption: Keys encrypted at rest
- Rotation: Support for key rotation
- Revocation: Instant key revocation
- Audit Logging: All key usage logged
Monitoring & Analytics
Real-Time Metrics
- Events ingested per second
- Session processing queue depth
- Error rates and types
- Geographic distribution of events
Dashboards
- Organization-level analytics
- Product interaction heatmaps
- User journey visualization
- Conversion funnel analysis
Related Documentation
- Data Storage Architecture - How interaction data is stored
- System Architecture - Overall platform architecture
- Integration API - API for accessing interaction data