Services Architecture
Overview
CROW is built on Cloudflare Workers with a TypeScript-based microservices architecture. Each service is a dedicated Worker with specific responsibilities, communicating through mTLS, Cloudflare Queues, and service bindings. All AI work uses Google Gemini models via the Vercel AI SDK.
Service Architecture
API Gateway
The API Gateway is the single entry point for all client requests, centralizing cross-cutting concerns.
Purpose: Request routing, authentication verification, and rate limiting.
Responsibilities:
- Route requests to appropriate backend services
- Validate sessions and API keys
- Convert sessions/API keys to JWTs for internal communication
- Enforce rate limiting per key and per IP
- Validate and shape incoming requests
- Add correlation IDs for distributed tracing
Technology:
- TypeScript on Cloudflare Workers
- mTLS for service-to-service communication
Bindings:
- All backend services via mTLS
- Auth Service for session/key validation
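The per-key and per-IP rate limiting above can be pictured as a token bucket. The sketch below is illustrative only; the class name, limits, and in-memory state are assumptions, not CROW's actual implementation (in a Worker, shared state like this would realistically live in a Durable Object or KV, since isolates do not share memory):

```typescript
// Illustrative token-bucket rate limiter, keyed by API key or client IP.
// Names and defaults are hypothetical; real limits would come from config.
class TokenBucketLimiter {
  private buckets = new Map<string, { tokens: number; last: number }>();

  constructor(
    private capacity: number,     // maximum burst size
    private refillPerSec: number, // sustained requests per second
  ) {}

  // Returns true if the request is allowed, false if rate-limited.
  allow(key: string, now: number = Date.now()): boolean {
    const b = this.buckets.get(key) ?? { tokens: this.capacity, last: now };
    // Refill proportionally to elapsed time, capped at capacity.
    b.tokens = Math.min(
      this.capacity,
      b.tokens + ((now - b.last) / 1000) * this.refillPerSec,
    );
    b.last = now;
    const allowed = b.tokens >= 1;
    if (allowed) b.tokens -= 1;
    this.buckets.set(key, b);
    return allowed;
  }
}
```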
Core Services
Auth Service
Purpose: Central service for all authentication and authorization.
Responsibilities:
- Session-based authentication (Better Auth + Drizzle)
- API key management and validation
- JWT token issuance for internal services
- Public request JWT generation (per IP, rate-limited)
- System-level request authorization
- Session caching (1-5 min TTL)
Technology:
- TypeScript on Cloudflare Workers
- Better Auth library for session management
- Drizzle ORM for database access
Bindings:
- D1: User sessions, API keys
- KV: Session cache
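The 1-5 minute session cache reduces to a TTL lookup. In production this role is played by the KV binding (which supports native expiration); the in-memory class below is only a sketch of the logic, with illustrative names:

```typescript
// In-memory sketch of a TTL cache like the Auth Service's session cache.
// In production this is backed by Workers KV; names here are illustrative.
class TtlCache<V> {
  private entries = new Map<string, { value: V; expiresAt: number }>();

  constructor(private ttlMs: number) {}

  set(key: string, value: V, now: number = Date.now()): void {
    this.entries.set(key, { value, expiresAt: now + this.ttlMs });
  }

  get(key: string, now: number = Date.now()): V | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (now >= entry.expiresAt) {
      this.entries.delete(key); // lazily evict expired entries
      return undefined;
    }
    return entry.value;
  }
}
```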
User Service
Purpose: User management and profile operations.
Responsibilities:
- User signup and login
- User profile management
- Team member references
- Chat creator tracking
- Overview page data
Bindings:
- D1: User data
- Auth Service: Session validation
Organization Service
Purpose: Organization data and context management.
Responsibilities:
- Organization CRUD operations
- Maintains Vectorize store for organization context
- Provides context to Interaction and Pattern services
- Builds organization profiles from usage
- GDPR-compliant data clearing
Technology:
- TypeScript on Cloudflare Workers
- Gemini AI via Vercel AI SDK for context building
Bindings:
- D1: Organization data
- Vectorize: Organization context embeddings
Product Service
Purpose: Product catalog and web scraping.
Responsibilities:
- Product CRUD operations
- Web scraping with Cloudflare Browser Rendering
- Sitemap.xml analysis
- Product refinement interface
- Vectorize store for product data
Technology:
- TypeScript on Cloudflare Workers
- Gemini AI via Vercel AI SDK for product extraction
- Cloudflare Browser Rendering for scraping
Bindings:
- D1: Product data
- Vectorize: Product embeddings
- Browser Rendering: Web scraping
Notification Service
Purpose: User notification management.
Responsibilities:
- Store notification history
- Queue-based notification creation
- Accept queue messages as trusted (no per-message authentication)
- Notification display
Bindings:
- D1: Notifications
- Queues: Notification events
Analytics Service
Purpose: Usage tracking and billing.
Responsibilities:
- Queue-based event processing
- Tracks interactions and patterns
- Usage analytics
- Billing calculations
Bindings:
- D1: Usage data, billing records
- Queues: Usage events
Data Processing Services
Interaction Service
Purpose: Core service for processing any input into insights.
Responsibilities:
- Processes video, images, text from any source
- Retrieves context from Organization and Product services
- Generates interaction records with textual descriptions
- Identifies products involved
- Exposed via MCP server
Technology:
- TypeScript on Cloudflare Workers
- Gemini AI via Vercel AI SDK
- Cloudflare AI Gateway for LLM routing
Trigger: Queue-based (Cloudflare Queues)
Output: Interaction records with text, metadata, product associations
Bindings:
- Queues: Input data
- D1: Metadata
- R2: Raw data
- Vectorize: Interaction embeddings
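An interaction record as described above (text, metadata, product associations) might take the following shape. The field names are assumptions inferred from the responsibilities listed, not a documented schema:

```typescript
// Hypothetical shape of an interaction record, inferred from the
// responsibilities above; the real schema may differ.
interface InteractionRecord {
  id: string;
  orgId: string;
  source: 'web' | 'social' | 'cctv'; // which component worker produced it
  description: string;               // AI-generated textual description
  productIds: string[];              // products identified in the input
  createdAt: string;                 // ISO 8601 timestamp
  metadata: Record<string, unknown>; // source-specific details
}

const example: InteractionRecord = {
  id: 'int_001',
  orgId: 'org_42',
  source: 'web',
  description: 'Visitor compared two pricing tiers before starting a trial.',
  productIds: ['prod_7'],
  createdAt: new Date(0).toISOString(),
  metadata: { sessionId: 'sess_9' },
};
```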
Pattern Service
Purpose: Discovers patterns across interactions.
Responsibilities:
- Cron-triggered daily, processing the previous 24 hours of interactions
- Finds patterns across sessions
- Aggregates: day, week, month, year, all-time
- Self-triggers via queues for multi-period aggregation
- Uses Organization and Product context
Technology:
- TypeScript on Cloudflare Workers
- Gemini AI via Vercel AI SDK
- Cloudflare AI Gateway for LLM routing
Trigger: Cron job (daily) + self-triggered queues
Output: Pattern records with text, metadata, product associations
Bindings:
- Cron Triggers: Daily execution
- Queues: Self-triggering
- D1: Metadata, pattern records
- Vectorize: Pattern embeddings
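The multi-period aggregation above needs window boundaries for each period. A minimal sketch, assuming UTC windows ending at the run time (the function name and period set are taken from the list above, not from CROW's code):

```typescript
type Period = 'day' | 'week' | 'month' | 'year' | 'all';

// Returns the UTC start of the aggregation window ending at `end`.
// For 'all', returns the epoch (i.e. no lower bound).
function windowStart(period: Period, end: Date): Date {
  const d = new Date(end);
  if (period === 'day') d.setUTCDate(d.getUTCDate() - 1);
  else if (period === 'week') d.setUTCDate(d.getUTCDate() - 7);
  else if (period === 'month') d.setUTCMonth(d.getUTCMonth() - 1);
  else if (period === 'year') d.setUTCFullYear(d.getUTCFullYear() - 1);
  else return new Date(0); // 'all': everything since the epoch
  return d;
}
```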
Chat Service (Dashboard BFF)
Purpose: Backend for Frontend for dashboard chat.
Responsibilities:
- Chat history management
- Message storage
- Artifacts and assets management
- Multi-agentic orchestration
- Concurrent agent dispatching
Technology:
- TypeScript on Cloudflare Workers
- Gemini AI via Vercel AI SDK
- Cloudflare AI Gateway for LLM routing
- Multi-agent system via Vercel AI SDK
Integration: Dashboard-specific, not generic
Bindings:
- D1: Chat history
- R2: Artifacts, assets
- Vectorize: Semantic search
- Interaction/Pattern Services: Data access
Component Workers
Web Ingest Worker
Purpose: Receive events from JavaScript SDK and manage web sessions.
Responsibilities:
- Event validation and transformation
- API Gateway routing
- Session management via Durable Objects
- Hourly inactivity trigger (1 hour = session end)
- Queue dispatch to Interaction Service
Technology:
- TypeScript on Cloudflare Workers
- Durable Objects for session state
Bindings:
- Durable Objects: Session state with alarm
- Queues: Session processing
- D1: Temporary event storage
Session Definition: 1 hour of user inactivity triggers session processing.
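The Durable Object alarm reduces to a simple decision: when the alarm fires, end the session only if no event arrived in the last hour; otherwise reschedule for the remaining inactivity window. A sketch with illustrative names (the real handler would call the Durable Object alarm API and dispatch to the queue):

```typescript
const SESSION_INACTIVITY_MS = 60 * 60 * 1000; // 1 hour

// When the alarm fires, decide whether the web session has ended
// (>= 1 hour since the last event) or the alarm should be rescheduled.
function onAlarm(
  lastEventAt: number,
  now: number,
): { action: 'end_session' } | { action: 'reschedule'; alarmAt: number } {
  const idleMs = now - lastEventAt;
  if (idleMs >= SESSION_INACTIVITY_MS) {
    return { action: 'end_session' }; // dispatch session to the interaction queue
  }
  return { action: 'reschedule', alarmAt: lastEventAt + SESSION_INACTIVITY_MS };
}
```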
Social Worker
Purpose: Scrape social media data via cron jobs.
Responsibilities:
- AI-generated web search: Generate queries, search, extract unique links
- Social media scraping: Use provided links to fetch content
- Web extraction for all discovered content
- Duplicate detection
- Queue dispatch to Interaction Service
Technology:
- TypeScript on Cloudflare Workers
- Gemini AI via Vercel AI SDK for query generation
- Cloudflare Browser Rendering for web extraction
Trigger: Cron-based scheduled execution
Bindings:
- Cron Triggers: Scheduled execution
- D1: Scrape state, seen links
- Queues: Extracted content
- Browser Rendering: Web scraping
Session Definition: One cron job trigger = one session.
CCTV Worker
Purpose: Handle CCTV video streaming and processing.
Responsibilities:
- Receive WebRTC streams via Cloudflare Realtime SFU
- Convert WebRTC to WebSocket
- Forward to Gemini Live API
- Hourly session creation
- Queue dispatch to Interaction Service
Technology:
- TypeScript/Bash on device and Cloudflare Workers
- Cloudflare Realtime SFU for WebRTC
- WebRTC-to-WebSocket adapter
- Gemini Live API for real-time analysis
Deployment:
- Background daemon on CCTV server
- Starts on system startup
- Bash script for initial setup
Trigger: Continuous streaming with hourly sessions
Bindings:
- Realtime SFU: WebRTC streams
- Queues: Hourly session data
- Gemini Live: Video analysis
Session Definition: 1 hour of continuous footage = one session.
External Integration Services
MCP Server
Purpose: Model Context Protocol server for LLM integrations.
Responsibilities:
- Exposes Interaction and Pattern services
- API key authentication
- Routes through API Gateway as external party
- Standard MCP protocol compliance
Technology:
- TypeScript on Cloudflare Workers
- Built on Cloudflare MCP Workers
- MCP protocol specification
Integrations:
- ChatGPT, Claude, other LLMs
- Developer tools
- External AI agents
Bindings:
- API Gateway: External auth
- Interaction Service: Data access
- Pattern Service: Data access
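On the wire, MCP is JSON-RPC 2.0. A `tools/list` response exposing the Interaction and Pattern services might look like the sketch below; the tool names and schemas are hypothetical, not CROW's published interface:

```typescript
// Hypothetical MCP `tools/list` response advertising CROW data tools.
// MCP uses JSON-RPC 2.0; tool names and input schemas are illustrative.
const toolsListResponse = {
  jsonrpc: '2.0',
  id: 1,
  result: {
    tools: [
      {
        name: 'search_interactions',
        description: 'Semantic search over interaction records',
        inputSchema: {
          type: 'object',
          properties: { query: { type: 'string' } },
          required: ['query'],
        },
      },
      {
        name: 'get_patterns',
        description: 'Fetch discovered patterns for a time period',
        inputSchema: {
          type: 'object',
          properties: {
            period: { type: 'string', enum: ['day', 'week', 'month', 'year', 'all'] },
          },
          required: ['period'],
        },
      },
    ],
  },
};
```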
A2A Service
Purpose: Agent-to-agent communication protocol.
Responsibilities:
- Calls MCP Server for data
- Agent communication protocol
- Authentication required
- Enterprise AI integrations
Technology:
- TypeScript on Cloudflare Workers
Bindings:
- MCP Server: Data access
- Auth Service: Authentication
Service Communication
mTLS (Mutual TLS)
Services communicate via mTLS for secure bidirectional verification:
- API Gateway is configured to trust internal services
- Each service validates the identity of the calling service, an additional verification layer on top of the mTLS handshake itself
- Avoids vendor lock-in (a portable alternative to Service Bindings)
JWT Tokens
API Gateway issues JWTs for internal communication:
```typescript
// API Gateway converts a session/API key to a JWT
const jwt = await issueJWT({
  userId: session.userId,
  orgId: session.orgId,
  permissions: session.permissions,
});

// Services validate the JWT
const decoded = await validateJWT(request.headers.get('Authorization'));
```
JWT Characteristics:
- No TTL/expiration (performance optimization)
- Contains user, org, permissions
- Cached mapping (1-5 min TTL)
- Optimized for high-volume data workloads
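To make the `issueJWT`/`validateJWT` pair concrete, here is a minimal HS256 sketch using Node's crypto. This is for illustration only; production code would use a vetted library (e.g. jose) and a timing-safe signature comparison. Note the payload deliberately carries no `exp` claim, matching the no-TTL design above:

```typescript
import { createHmac } from 'node:crypto';

// Minimal HS256 JWT sketch illustrating the internal token shape.
// Illustrative only: use a vetted JWT library in production.
const b64url = (data: string): string => Buffer.from(data).toString('base64url');

function issueJWT(payload: object, secret: string): string {
  const header = b64url(JSON.stringify({ alg: 'HS256', typ: 'JWT' }));
  const body = b64url(JSON.stringify(payload)); // no `exp` claim, per the design above
  const sig = createHmac('sha256', secret)
    .update(`${header}.${body}`)
    .digest('base64url');
  return `${header}.${body}.${sig}`;
}

function validateJWT(token: string, secret: string): Record<string, unknown> | null {
  const [header, body, sig] = token.split('.');
  const expected = createHmac('sha256', secret)
    .update(`${header}.${body}`)
    .digest('base64url');
  // A real implementation would use a timing-safe comparison here.
  if (sig !== expected) return null;
  return JSON.parse(Buffer.from(body, 'base64url').toString('utf8'));
}
```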
Queue-based Communication
Asynchronous communication via Cloudflare Queues:
```typescript
// Producer (Web Ingest Worker)
await env.INTERACTION_QUEUE.send({
  type: 'web_session',
  sessionId: '...',
  events: [...],
  orgId: '...',
});

// Consumer (Interaction Service)
export default {
  async queue(batch, env) {
    for (const message of batch.messages) {
      await processInteraction(message.body, env);
    }
  },
};
```
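Producers with many events can use `sendBatch`, which caps the number of messages per call (100 at the time of writing; check Cloudflare's current Queues limits). A small helper for splitting, with a hypothetical usage sketch:

```typescript
// Split messages into chunks small enough for Queue.sendBatch().
// The 100-message cap reflects Cloudflare's documented batch limit at the
// time of writing; verify against current Queues limits.
function toBatches<T>(messages: T[], maxBatch = 100): T[][] {
  const batches: T[][] = [];
  for (let i = 0; i < messages.length; i += maxBatch) {
    batches.push(messages.slice(i, i + maxBatch));
  }
  return batches;
}

// Usage sketch (inside a Worker):
// for (const batch of toBatches(events)) {
//   await env.INTERACTION_QUEUE.sendBatch(batch.map((body) => ({ body })));
// }
```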
Service Bindings (Limited Use)
While Cloudflare Service Bindings exist, CROW primarily uses mTLS to avoid vendor lock-in. Service Bindings may be used for specific internal-only services where portability is not a concern.
Technology Stack
| Service Type | Language | AI Framework | Key Libraries |
|---|---|---|---|
| All Services | TypeScript | Vercel AI SDK | Hono, Drizzle ORM |
| AI Processing | TypeScript | Gemini models | Cloudflare AI Gateway |
| Web Scraping | TypeScript | Gemini models | Browser Rendering |
| Auth | TypeScript | N/A | Better Auth |
Note on Hono: Hono is a lightweight, fast web framework used across all CROW microservices. It provides routing, middleware, and request handling optimized for Cloudflare Workers.
Programming Language Selection
| Language | Use Case | Justification |
|---|---|---|
| TypeScript | All core services on Cloudflare Workers | Native Worker support, strong ecosystem, type safety |
| Python | AI/ML services (if needed) | Strong ML tooling, useful for experiments |
| Go | Container-based services (future) | High performance, good concurrency |
Note: TypeScript is the primary language. Python services may be required if AI libraries lack TypeScript support. Go remains a future option if container-based services become necessary.
Repository Naming Conventions
- Core APIs: `organization-service`, `user-service`, `product-service`, `auth-service`
- Component Workers: `web-ingest-worker`, `social-worker`, `cctv-worker`
- Processing Services: `interaction-service`, `pattern-service`, `chat-service`
- Integration: `mcp-server`, `a2a-service`
- Clients: `landing-client`, `auth-client`, `dashboard-client`
Deployment
Wrangler Configuration
Each service has its own wrangler.toml:
```toml
name = "crow-interaction-service"
main = "src/index.ts"
compatibility_date = "2024-01-01"

[[d1_databases]]
binding = "DB"
database_id = "..."

[[r2_buckets]]
binding = "STORAGE"
bucket_name = "crow-storage"

[[vectorize]]
binding = "VECTORIZE"
index_name = "interaction-embeddings"

[[queues.consumers]]
queue = "interaction-queue"
max_batch_size = 10
max_retries = 3

[ai]
binding = "AI"
```
Local Development
- Kubernetes with Kind: Docker-based local environment
- Consistency: Same config for local and production
- Future-ready: Enables cloud migration if needed
CI/CD Pipeline
Pipeline Stages
| Stage | Action |
|---|---|
| Lint | ESLint, Prettier, TypeScript checking |
| Test | Bun test runner with coverage reporting |
| Deploy Dev | Staging deployment on pull requests |
| Deploy Prod | Production deployment on main branch merge |
| Notify | Slack notifications for deployment status |
Deployments use Wrangler CLI for Workers and Containers with environment-specific configuration managed through Cloudflare's secrets store.
Observability and Monitoring
Metrics Collection
Cloudflare Analytics Engine provides edge-native metrics collection without impacting request latency:
- Request volumes and throughput
- Latency distributions (p50, p95, p99)
- Error rates by service
- AI processing times
Logging
Structured logs stream to Axiom for centralized analysis:
- Debugging: Trace requests across services
- Audit trails: Compliance and access logging
- Performance analysis: Identify bottlenecks
- Long-term retention: Historical analysis
Alerting
- Real-time alerts on error rate thresholds
- Latency degradation notifications
- Queue depth monitoring
- AI Gateway usage alerts
Related Documentation
- System Architecture - Overall platform architecture
- Integration API - API details
- Data Storage Architecture - Storage design
- Frontend Architecture - Client applications
- User Permission Levels - Access control