Website Interaction Tracking
Overview
The Website Interaction Tracking system captures user behavior on client websites and processes it into actionable insights. This system provides CROW clients with detailed analytics on how users interact with their products and content.
JavaScript SDK
CROW provides a JavaScript SDK for easy integration into client websites.
Installation
// Initialize the tracker once at the application root
import { CROWTracker } from '@crow/tracking-sdk';

const tracker = new CROWTracker({
  apiKey: 'your-api-key-here',
  endpoint: 'https://ingest.crow.example.com'
});

// SDK automatically tracks:
// - Page views
// - Clicks
// - Form interactions
// - Product views
// - Performance metrics
// Custom events are sent explicitly with tracker.track()
SDK Features
The CROW tracking SDK is designed to be lightweight and non-intrusive:
- Lightweight Wrapper: Minimal bundle size impact (< 10KB gzipped)
- Auto-Tracking: Automatic capture of common events
- Custom Events: Support for application-specific tracking
- Batching: Efficient event batching to reduce network calls
- Offline Support: Queue events when offline, sync when online
- Privacy Controls: GDPR/CCPA compliance features
- No Dependencies: Zero external dependencies
- TypeScript Support: Full TypeScript definitions included
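The batching and offline behavior listed above could be sketched roughly as follows. This is not the SDK's actual implementation: `EventBatcher`, `BatchSender`, `maxBatchSize`, and the synchronous `send` callback are hypothetical names chosen for illustration.

```typescript
// Hypothetical sketch of client-side batching with an offline queue.
// None of these names are taken from the CROW SDK.
interface TrackedEvent {
  type: string;
  timestamp: number;
  properties?: Record<string, unknown>;
}

// Returns true when the batch was delivered, false when it could not be sent.
type BatchSender = (batch: TrackedEvent[]) => boolean;

class EventBatcher {
  private queue: TrackedEvent[] = [];
  private send: BatchSender;
  private maxBatchSize: number;

  constructor(send: BatchSender, maxBatchSize: number = 20) {
    this.send = send;
    this.maxBatchSize = maxBatchSize;
  }

  // Queue an event and flush once a full batch has accumulated.
  add(event: TrackedEvent): void {
    this.queue.push(event);
    if (this.queue.length >= this.maxBatchSize) {
      this.flush();
    }
  }

  // Send queued events batch by batch. A failed batch stays at the front
  // of the queue, so it is retried on the next flush (for example, when
  // the browser comes back online).
  flush(): void {
    while (this.queue.length > 0) {
      const batch = this.queue.slice(0, this.maxBatchSize);
      if (!this.send(batch)) {
        return;
      }
      this.queue.splice(0, batch.length);
    }
  }

  // Number of events still waiting to be delivered.
  get pending(): number {
    return this.queue.length;
  }
}
```

In a browser, `flush()` could be called from an `online` event listener, and a real sender would normally be asynchronous; the synchronous callback only keeps the sketch short.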
Tracked Events
Automatic Tracking
The SDK automatically tracks:
- Page Views
  - URL, title, referrer
  - Timestamp and session ID
  - Viewport size and device type
- Clicks
  - Element clicked (button, link, etc.)
  - Element attributes and text content
  - Position on page
- Form Interactions
  - Form submissions
  - Field focus/blur events
  - Validation errors
- Product Views
  - Product ID and name
  - View duration
  - Scroll depth
- Performance Metrics
  - Page load time
  - Time to interactive
  - Core Web Vitals
Custom Events
Developers can track custom events:
tracker.track('purchase_completed', {
  productId: '12345',
  amount: 99.99,
  currency: 'USD'
});
Ingestion Service Architecture
Ingestion Worker
Service Details
Service Name: crow-ingest
Technology: Cloudflare Workers (JavaScript/TypeScript)
Deployment: Cloudflare Workers edge network
Why Cloudflare Workers?
- Global Edge Network: Ingestion happens at 300+ edge locations
- Sub-50ms Latency: Ingestion responses return in under 50ms (p95) globally
- Instant Scaling: Handles traffic spikes automatically
- Near-Zero Cold Starts: V8 isolates start in milliseconds, with no container to boot
- Integrated Storage: Direct access to D1, R2, and Queues
Ingestion Process
- Receive Events: SDK sends batched events to nearest edge location
- Validate: Check API key, event schema, data types
- Transform: Normalize data format, enrich with metadata
- Store: Save to D1 database for session storage
- Queue: Enqueue to Cloudflare Queues for async processing
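The validate and transform steps above might look something like the following pure functions. This is a sketch under assumptions: the event fields, `validateBatch`, `enrichEvent`, and the `orgId` lookup are hypothetical, and the real crow-ingest schema is not described in this document. The store and queue steps are omitted because they depend on Worker bindings.

```typescript
// Hypothetical event shape; the actual crow-ingest schema is not shown here.
interface IncomingEvent {
  type: string;
  sessionId: string;
  timestamp: number; // assumed to be Unix epoch milliseconds
  url?: string;
}

interface EnrichedEvent extends IncomingEvent {
  orgId: string;      // assumed to be resolved from the API key
  receivedAt: number; // set at the edge on arrival
}

interface ValidationResult {
  events: IncomingEvent[]; // empty when validation fails
  error: string | null;    // null when validation succeeds
}

// Validate: the body must be a non-empty array of events with the expected field types.
function validateBatch(body: unknown): ValidationResult {
  if (!Array.isArray(body) || body.length === 0) {
    return { events: [], error: "expected a non-empty array of events" };
  }
  const events: IncomingEvent[] = [];
  for (let i = 0; i < body.length; i++) {
    const item: unknown = body[i];
    if (typeof item !== "object" || item === null) {
      return { events: [], error: `event ${i} is not an object` };
    }
    const record = item as Record<string, unknown>;
    const type = record["type"];
    const sessionId = record["sessionId"];
    const timestamp = record["timestamp"];
    const url = record["url"];
    if (typeof type !== "string" || typeof sessionId !== "string" || typeof timestamp !== "number") {
      return { events: [], error: `event ${i} has missing or mistyped fields` };
    }
    events.push(typeof url === "string" ? { type, sessionId, timestamp, url } : { type, sessionId, timestamp });
  }
  return { events, error: null };
}

// Transform: normalize the event type and enrich with edge-side metadata.
function enrichEvent(event: IncomingEvent, orgId: string, receivedAt: number): EnrichedEvent {
  return { ...event, type: event.type.trim().toLowerCase(), orgId, receivedAt };
}
```

Rejecting the whole batch on the first malformed event is one possible policy; dropping only the bad events would be an equally reasonable choice.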
Session Processing Pipeline
Pipeline Stages
1. Event Collection
- SDK collects user interactions
- Batches events to reduce network calls
- Sends to nearest edge location
2. Edge Ingestion
- Validates API key and permissions
- Performs initial data validation
- Stores raw session data in D1
- Returns acknowledgment to SDK
3. Session Storage
- Events stored in D1 database
- Organized by session ID
- Includes metadata (timestamp, user agent, IP)
4. Async Processing Queue
- Session IDs added to Cloudflare Queues
- Queue acts as buffer between ingestion and processing
- Enables scaling of processing layer independently
5. Session Processing
- Cloudflare Workflows orchestrates processing
- ai-processing-service analyzes session
- Generates human-readable interactions
- Stores embeddings in Vectorize and metadata in D1
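Stage 3 describes events organized by session ID. As an in-memory illustration only (actual storage is in D1, and `StoredEvent` and `groupBySession` are hypothetical names), grouping and ordering events per session could be sketched as:

```typescript
// Hypothetical stored-event shape for illustration.
interface StoredEvent {
  sessionId: string;
  type: string;
  timestamp: number;
}

// Group events by session ID and order each session chronologically,
// so a downstream processor sees one ordered event list per session.
function groupBySession(events: StoredEvent[]): Map<string, StoredEvent[]> {
  const sessions = new Map<string, StoredEvent[]>();
  for (const event of events) {
    const bucket = sessions.get(event.sessionId) ?? [];
    bucket.push(event);
    sessions.set(event.sessionId, bucket);
  }
  sessions.forEach((bucket) => bucket.sort((a, b) => a.timestamp - b.timestamp));
  return sessions;
}
```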
Cloudflare Workflows Processing
Cloudflare Workflows orchestrates multi-step session processing with built-in durability.
How Workflows Work
- Durable Execution: Steps are checkpointed automatically
- Retry Logic: Built-in retries for failed steps
- Long-running: Can run for extended periods without timeouts
- Event-driven: Triggered by Cloudflare Queues
Workflow Steps
- Validate Session: Check session data completeness
- Enrich Context: Add product and organization context
- AI Processing: Send to ai-processing-service
- Generate Embeddings: Create vector embeddings
- Store Results: Save to D1 and Vectorize
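To make the idea of checkpointed, retried steps concrete, here is a small in-memory simulation of the five steps above. It deliberately does not use the Cloudflare Workflows API: `StepRunner`, `SessionSteps`, `processSession`, and the step names are hypothetical, and the step functions are synchronous stand-ins for what would be asynchronous service calls.

```typescript
// In-memory stand-in for durable step execution. One runner represents one
// workflow instance; a real workflow engine would persist the checkpoints.
class StepRunner {
  private checkpoints = new Map<string, unknown>();
  private maxAttempts: number;

  constructor(maxAttempts: number = 3) {
    this.maxAttempts = maxAttempts;
  }

  // Run a named step. A step that already succeeded returns its saved
  // result without running again; a step that throws is retried up to
  // maxAttempts times before the error is surfaced.
  run<T>(name: string, fn: () => T): T {
    if (this.checkpoints.has(name)) {
      return this.checkpoints.get(name) as T;
    }
    let lastError: unknown;
    for (let attempt = 1; attempt <= this.maxAttempts; attempt++) {
      try {
        const result = fn();
        this.checkpoints.set(name, result);
        return result;
      } catch (error) {
        lastError = error;
      }
    }
    throw new Error(`step "${name}" failed after ${this.maxAttempts} attempts: ${String(lastError)}`);
  }
}

// Hypothetical stand-ins for the services each step would call.
interface SessionSteps {
  validate: (sessionId: string) => string[];
  enrich: (events: string[]) => string;
  analyze: (context: string) => string; // stand-in for ai-processing-service
  embed: (summary: string) => number[];
  store: (summary: string, vector: number[]) => void;
}

function processSession(runner: StepRunner, sessionId: string, steps: SessionSteps): string {
  const events = runner.run("validate-session", () => steps.validate(sessionId));
  const context = runner.run("enrich-context", () => steps.enrich(events));
  const summary = runner.run("ai-processing", () => steps.analyze(context));
  const vector = runner.run("generate-embeddings", () => steps.embed(summary));
  runner.run("store-results", () => {
    steps.store(summary, vector);
    return true;
  });
  return summary;
}
```

Because each step result is checkpointed by name, re-running `processSession` with the same runner after a crash would skip the steps that already completed.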
Session Understanding with AI
The ai-processing-service analyzes user sessions to generate meaningful interactions.
Processing Capabilities
1. Behavior Analysis
- Interprets user actions
- Identifies patterns in clicks and navigation
- Detects user intent signals
2. Intent Detection
- Determines user intent (browsing, comparing, purchasing)
- Classifies session goals
- Predicts next actions
3. Interaction Generation
- Creates natural language interactions
- Translates technical events to business insights
- Generates human-readable summaries
4. Context Enrichment
- Adds product and business context
- Links to product catalog
- Enriches with organizational data
Example Processing
Raw Events:
[
{"type": "page_view", "url": "/products/laptop-x1"},
{"type": "click", "element": "add_to_cart"},
{"type": "page_view", "url": "/cart"},
{"type": "click", "element": "checkout"}
]
Generated Interaction:
User viewed Laptop X1 product page, added item to cart, and proceeded to checkout.
Indicates high purchase intent for Laptop X1.
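The input and output shape of intent detection can be illustrated with a toy, rule-based stand-in. This is not how ai-processing-service works; the rules, the `/products/` URL prefix convention, and the names `RawEvent`, `Intent`, and `classifyIntent` are assumptions made only for this example.

```typescript
// Event shape mirroring the raw events in the example above.
interface RawEvent {
  type: "page_view" | "click";
  url?: string;
  element?: string;
}

type Intent = "browsing" | "comparing" | "purchasing";

// Toy rules (assumed, not taken from the actual service):
// - any add-to-cart or checkout click means "purchasing"
// - views of two or more distinct product pages mean "comparing"
// - everything else is "browsing"
function classifyIntent(events: RawEvent[]): Intent {
  const purchaseClicks = new Set<string>(["add_to_cart", "checkout"]);
  const productPages = new Set<string>();
  for (const event of events) {
    if (event.type === "click" && event.element !== undefined && purchaseClicks.has(event.element)) {
      return "purchasing";
    }
    if (event.type === "page_view" && event.url !== undefined && event.url.startsWith("/products/")) {
      productPages.add(event.url);
    }
  }
  return productPages.size >= 2 ? "comparing" : "browsing";
}
```

Applied to the raw events shown above, this sketch returns `"purchasing"`, matching the high purchase intent noted in the generated interaction.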
Data Flow
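Complete Flow
SDK (browser) -> crow-ingest Worker (edge) -> D1 (session storage) -> Cloudflare Queues -> Cloudflare Workflows -> ai-processing-service -> Vectorize (embeddings), D1 (metadata), and R2 (raw archive)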
Integration with Data Storage
The interactions generated by the session processors feed into the CROW Data Lake where they are:
- Stored as raw data in R2: Full interaction data for archival
- Vectorized and indexed in Vectorize: Enables semantic search
- Cataloged in D1 with metadata: Source, timestamp, products, organization
For more information, see the Data Storage Architecture documentation.
Privacy & Compliance
Data Privacy
- Anonymization: Personal data can be anonymized
- User Consent: SDK respects user consent preferences
- Data Retention: Configurable retention policies
- Right to Deletion: Support for GDPR deletion requests
Compliance Features
- GDPR: Full GDPR compliance support
- CCPA: California Consumer Privacy Act compliance
- Cookie Consent: Integration with cookie consent tools
- Do Not Track: Respects DNT browser settings
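A consent and Do Not Track gate could be as small as the sketch below. The names `ConsentState`, `PrivacySignals`, and `shouldTrack` are hypothetical, and the default-deny policy for unknown consent is an assumption rather than documented SDK behavior.

```typescript
type ConsentState = "granted" | "denied" | "unknown";

interface PrivacySignals {
  consent: ConsentState;
  // Do Not Track value as reported by the browser; "1" means the user opted out.
  doNotTrack: string | null;
}

// Track only when the user has not enabled Do Not Track and has
// explicitly granted consent. Unknown consent is treated as "no".
function shouldTrack(signals: PrivacySignals): boolean {
  if (signals.doNotTrack === "1") {
    return false;
  }
  return signals.consent === "granted";
}
```

Checking this gate before an event is queued, rather than before it is sent, would avoid holding data the user never agreed to share.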
Performance & Reliability
Performance Metrics
- Ingestion Latency: < 50ms (p95)
- Processing Latency: < 5 minutes (queue to insights)
- Throughput: 100K+ events/second
- Availability: 99.9% uptime SLA
Error Handling
- Retry Logic: Automatic retries with exponential backoff
- Dead Letter Queue: Failed events moved to DLQ for investigation
- Monitoring: Real-time alerts on error rates
- Graceful Degradation: SDK queues events locally and syncs them later if the backend is unavailable
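Exponential backoff with a dead letter cutoff can be sketched as follows. The base delay, cap, attempt limit, and the names `backoffDelayMs` and `decideRetry` are illustrative assumptions; the document does not state the actual values. Jitter is left out to keep the example deterministic.

```typescript
// Delay before retry number `attempt` (0-based): base * 2^attempt, capped at maxMs.
function backoffDelayMs(attempt: number, baseMs: number = 500, maxMs: number = 30000): number {
  return Math.min(maxMs, baseMs * Math.pow(2, attempt));
}

type RetryDecision =
  | { action: "retry"; delayMs: number }
  | { action: "dead-letter" };

// After maxAttempts failures the event goes to the dead letter queue
// instead of being retried again.
function decideRetry(attempt: number, maxAttempts: number = 5): RetryDecision {
  if (attempt >= maxAttempts) {
    return { action: "dead-letter" };
  }
  return { action: "retry", delayMs: backoffDelayMs(attempt) };
}
```

With these assumed defaults, the delays run 500 ms, 1 s, 2 s, 4 s, 8 s, after which the event is dead-lettered.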
API Key Management
Creating API Keys
- Organization admin creates API key in dashboard
- Key scoped to organization
- Optional domain restrictions
- Rate limiting per key
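Domain restrictions and per-key rate limiting might be checked along these lines. This is a sketch, not the platform's implementation: `isOriginAllowed`, `FixedWindowLimiter`, the subdomain-matching rule, and the fixed-window algorithm are all assumptions.

```typescript
// Returns true when the request origin matches an allowed domain or one of
// its subdomains. An empty list is treated as "no restriction configured".
function isOriginAllowed(origin: string, allowedDomains: string[]): boolean {
  if (allowedDomains.length === 0) {
    return true;
  }
  let hostname: string;
  try {
    hostname = new URL(origin).hostname;
  } catch {
    return false; // not a parseable origin
  }
  return allowedDomains.some((domain) => hostname === domain || hostname.endsWith("." + domain));
}

// Fixed-window counter per API key. The clock is passed in so the
// behavior is easy to test.
class FixedWindowLimiter {
  private windows = new Map<string, { windowStart: number; count: number }>();
  private limit: number;
  private windowMs: number;

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns true if the request fits in the key's budget for the current window.
  allow(apiKey: string, nowMs: number): boolean {
    const entry = this.windows.get(apiKey);
    if (entry === undefined || nowMs - entry.windowStart >= this.windowMs) {
      this.windows.set(apiKey, { windowStart: nowMs, count: 1 });
      return true;
    }
    if (entry.count >= this.limit) {
      return false;
    }
    entry.count += 1;
    return true;
  }
}
```

An in-memory counter like this only sees the requests handled by one process, so a deployment spread across many edge locations would likely need shared state to enforce a global limit.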
Security
- Encryption: Keys encrypted at rest
- Rotation: Support for key rotation
- Revocation: Instant key revocation
- Audit Logging: All key usage logged
Monitoring & Analytics
Real-Time Metrics
- Events ingested per second
- Session processing queue depth
- Error rates and types
- Geographic distribution of events
Dashboards
- Organization-level analytics
- Product interaction heatmaps
- User journey visualization
- Conversion funnel analysis
Related Documentation
- Data Storage Architecture - How interaction data is stored
- System Architecture - Overall platform architecture
- Integration API - API for accessing interaction data