System Architecture
Overview
CROW (Cognitive Reasoning Observation Watcher) is a unified customer interaction intelligence platform built entirely on Cloudflare's edge infrastructure. It collects human-product interaction signals across web, in-store (CCTV), and social media channels, analyzes them with organizational context, and discovers patterns using AI.
The platform runs as a constellation of Cloudflare Workers communicating through a central API gateway, with D1 databases, R2 object storage, Vectorize indexes, and Queues providing the storage and messaging backbone. All AI inference runs through Cloudflare Workers AI models (Meta Llama, BAAI BGE, LLaVA) via the Cloudflare AI Gateway, with some services using the Vercel AI SDK workers-ai-provider adapter for structured generation.
Service Map
CROW consists of 25 deployable units (24 Cloudflare Workers + 1 CLI tool):
| Service | Type | Domain (prod) | Purpose |
|---|---|---|---|
| core-api-gateway | Gateway | api.crowai.dev | Central routing, auth, org resolution, rate limiting |
| core-auth-service | Core | internal.auth-api.crowai.dev | Better Auth sessions, API keys, JWT, JWKS, onboarding |
| core-user-service | Core | internal.users.crowai.dev | User CRUD, profile pictures, search, permissions |
| core-organization-service | Core | internal.orgs.crowai.dev | Org CRUD, AI context generation, member management |
| core-product-service | Core | internal.products.crowai.dev | Product catalog, crawler jobs, image analysis, vector search |
| core-interaction-service | Processing | interactions.crowai.dev | Interaction ingestion, CCTV batch processing (Workers Containers) |
| core-pattern-service | Processing | patterns.crowai.dev | Cron-triggered pattern analysis (Workers Containers) |
| core-billing-service | Core | internal.billing.crowai.dev | Stripe checkout, subscriptions, webhooks |
| core-notification-service | Core | internal.notifications.crowai.dev | Email via Resend, notification queues |
| core-analytics-service | Core | internal.analytics.crowai.dev | Usage event tracking and summaries |
| core-chat-service | Core | internal.chat.crowai.dev | Multi-agent chat with tool calling via Workers AI |
| core-social-collector | Processing | social-collector.crowai.dev | Cron-triggered social media data collection (Tavily search, AI query generation) |
| core-social-processor | Processing | N/A (queue consumer) | Social media content enrichment and forwarding to interaction queue |
| bff-chat-service | BFF | internal.chat.crowai.dev | Multi-agent chat with tool calling, agentic loop |
| bff-qna-service | BFF | internal.qna.crowai.dev | RAG Q&A with Vectorize embeddings |
| mcp-service | Integration | mcp.crowai.dev | MCP server for LLM tool integrations |
| a2a-service | Integration | a2a.crowai.dev | Agent-to-Agent protocol service (Google A2A SDK) |
| infra-crawl-service | Infrastructure | N/A (Workers Containers) | Product catalog web crawler (Playwright containers, R2 storage) |
| cctv-ingest-service | Ingestion | cctv.crowai.dev | CCTV frame analysis via vision AI, R2 frame storage |
| web-ingest-service | Ingestion | internal.ingest-worker.crowai.dev | Website interaction tracking SDK backend |
| cctv-cli | CLI | N/A | CLI tool for capturing and streaming CCTV video+audio |
| rogue-store | Frontend | rogue.crowai.dev | Next.js demo e-commerce store (test/showcase) |
| dashboard-client | Frontend | app.crowai.dev | Next.js dashboard SPA |
| auth-client | Frontend | auth.crowai.dev | Next.js auth pages (sign-up, sign-in, onboarding) |
| landing-client | Frontend | crowai.dev | Astro + Preact marketing/landing pages |
Dev Environment Domains
All dev domains follow the pattern dev.<domain>. Internal services use dev.internal.{service}.crowai.dev, and public-facing services use dev.{service}.crowai.dev. For example, the gateway is dev.api.crowai.dev and the auth service is dev.internal.auth-api.crowai.dev.
Request Routing Through API Gateway
The API gateway (core-api-gateway) is the single entry point for all client requests at api.crowai.dev. It routes requests based on the URL path segment after the version prefix:
/api/v1/{service-path}/...
The gateway maps path segments to internal service URLs:
| Path | Target Service |
|---|---|
| auth, better-auth | core-auth-service |
| users | core-user-service |
| organizations | core-organization-service |
| products, crawler-jobs | core-product-service |
| interactions | core-interaction-service |
| patterns | core-pattern-service |
| billing | core-billing-service |
| notifications | core-notification-service |
| analytics | core-analytics-service |
| chat | bff-chat-service |
| qna | bff-qna-service |
| mcp | mcp-service |
| cctv | cctv-ingest-service |
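The lookup described by the table can be sketched as a small pure function. This is a minimal illustration, not the gateway's actual code: the names `ROUTES` and `resolveService` are hypothetical, and only the path-to-service pairs come from the table.

```typescript
// Path segment -> internal service name, copied from the routing table above.
const ROUTES = new Map<string, string>([
  ["auth", "core-auth-service"],
  ["better-auth", "core-auth-service"],
  ["users", "core-user-service"],
  ["organizations", "core-organization-service"],
  ["products", "core-product-service"],
  ["crawler-jobs", "core-product-service"],
  ["interactions", "core-interaction-service"],
  ["patterns", "core-pattern-service"],
  ["billing", "core-billing-service"],
  ["notifications", "core-notification-service"],
  ["analytics", "core-analytics-service"],
  ["chat", "bff-chat-service"],
  ["qna", "bff-qna-service"],
  ["mcp", "mcp-service"],
  ["cctv", "cctv-ingest-service"],
]);

// Hypothetical helper: returns the target service for a request path,
// or null when the path is not under /api/v1/ or the segment is unknown.
function resolveService(pathname: string): string | null {
  const match = /^\/api\/v1\/([^/]+)/.exec(pathname);
  if (match === null) return null;
  const segment = match[1] ?? "";
  return ROUTES.get(segment) ?? null;
}
```

A caller would typically answer with a 404 when `resolveService` returns null, although the source does not say how unknown segments are handled.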
Request Pipeline
Every request to an authenticated service flows through these gateway middleware layers in order:
- CORS -- origin validation against allowlist
- Security headers -- standard response hardening
- Rate limiting -- per-IP throttling (stricter on auth endpoints)
- Authentication -- session cookie or API key validation, JWT token retrieval
- Organization resolution -- resolves active org to internal UUID, sets X-Organization-Id
- Cache -- KV-backed response cache keyed on path + org
- Forward -- proxies to internal service with X-Internal-Key, X-Organization-Id, and Authorization: Bearer <JWT> injected
Special routes bypass some middleware:
- Auth routes (/api/v1/auth/*, /api/v1/better-auth/*) skip the auth/org/cache middleware
- Product image routes (/api/v1/products/images/*) skip auth
- JWT routes (/api/v1/auth/jwt/*) skip auth
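One way to picture the ordering and the bypass rules is a function that returns the layers applied to a given path. This is a sketch under assumptions: the layer names and `layersFor` are hypothetical, and it assumes product image routes still pass through organization resolution and cache, which the source does not state.

```typescript
type Layer =
  | "cors"
  | "security-headers"
  | "rate-limit"
  | "auth"
  | "org-resolution"
  | "cache"
  | "forward";

// Full pipeline in the order listed above.
const FULL_PIPELINE: Layer[] = [
  "cors",
  "security-headers",
  "rate-limit",
  "auth",
  "org-resolution",
  "cache",
  "forward",
];

// Hypothetical helper: returns the layers that run for a request path.
function layersFor(pathname: string): Layer[] {
  const skipped = new Set<Layer>();
  if (
    pathname.startsWith("/api/v1/auth/") ||
    pathname.startsWith("/api/v1/better-auth/")
  ) {
    // Auth routes (this prefix also covers /api/v1/auth/jwt/*).
    skipped.add("auth");
    skipped.add("org-resolution");
    skipped.add("cache");
  } else if (pathname.startsWith("/api/v1/products/images/")) {
    // Assumption: only the auth layer is skipped for product images.
    skipped.add("auth");
  }
  return FULL_PIPELINE.filter((layer) => !skipped.has(layer));
}
```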
Auth Flow: Session Cookie to JWT to Gateway to Downstream
Better Auth with Cross-Subdomain Cookies
CROW uses the Better Auth library for session management. Sessions are stored as HTTP-only cookies scoped to .crowai.dev, enabling cross-subdomain authentication across auth.crowai.dev, app.crowai.dev, and api.crowai.dev.
The auth flow:
- User signs up or signs in at auth.crowai.dev (auth-client)
- Better Auth creates a session in the auth service D1 database
- A better-auth.session_token cookie is set on .crowai.dev
- Subsequent requests to api.crowai.dev include the cookie automatically
- The gateway reads the cookie, calls auth service GET /api/v1/auth/get-session to validate
- The gateway calls GET /api/v1/auth/token to obtain a JWT for downstream services
JWT Flow
Services verify JWTs using the auth service JWKS endpoint:
- Auth service exposes GET /api/v1/auth/jwks returning the public key set
- Gateway obtains a JWT from GET /api/v1/auth/token after validating the session
- Gateway forwards the JWT as Authorization: Bearer <token> to downstream services
- Downstream services fetch the JWKS and verify the JWT signature locally
- JWKS is cached by services (5-minute TTL after security hardening)
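The 5-minute JWKS cache could be modelled roughly as follows. This is a minimal sketch, not the services' actual implementation: `JwksCache`, `Jwks`, and `JWKS_TTL_MS` are hypothetical names, the clock is passed in explicitly so the behaviour is easy to test, and the key shape is reduced to a `kid` field.

```typescript
// Minimal shape of a JWKS document; real keys carry more fields (kty, alg, ...).
interface Jwks {
  keys: Array<{ kid: string }>;
}

const JWKS_TTL_MS = 5 * 60 * 1000; // 5-minute TTL, as stated above

class JwksCache {
  private entry: { jwks: Jwks; fetchedAt: number } | null = null;

  // Returns the cached key set while it is fresh, otherwise null.
  get(now: number): Jwks | null {
    if (this.entry === null) return null;
    return now - this.entry.fetchedAt < JWKS_TTL_MS ? this.entry.jwks : null;
  }

  // Stores a freshly fetched key set together with its fetch time.
  set(jwks: Jwks, now: number): void {
    this.entry = { jwks, fetchedAt: now };
  }
}
```

When `get` returns null, a service would fetch GET /api/v1/auth/jwks again and call `set` with the result before verifying the token.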
API Key Authentication
External integrations authenticate with API keys prefixed crow_:
- Client sends X-API-Key: crow_... or Authorization: Bearer crow_...
- Gateway detects the crow_ prefix and does NOT forward it as a Bearer token to downstream (it is not a JWT)
- Gateway resolves the organization ID from the key owner's user record
- Gateway injects the resolved X-Organization-Id header
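The prefix check in the second step might be sketched like this. The `Credential` type and `classifyCredential` are hypothetical names, and headers are passed as a plain lowercase-keyed record so the sketch does not depend on a particular runtime.

```typescript
type Credential =
  | { kind: "api-key"; key: string }
  | { kind: "bearer"; token: string }
  | { kind: "none" };

const API_KEY_PREFIX = "crow_";

// Decides whether the caller presented an API key or a regular bearer token.
// Only "bearer" credentials would be forwarded downstream as Authorization.
function classifyCredential(
  headers: Record<string, string | undefined>,
): Credential {
  const apiKeyHeader = headers["x-api-key"];
  if (apiKeyHeader !== undefined && apiKeyHeader.startsWith(API_KEY_PREFIX)) {
    return { kind: "api-key", key: apiKeyHeader };
  }
  const authorization = headers["authorization"];
  if (authorization !== undefined && authorization.startsWith("Bearer ")) {
    const value = authorization.slice("Bearer ".length);
    return value.startsWith(API_KEY_PREFIX)
      ? { kind: "api-key", key: value }
      : { kind: "bearer", token: value };
  }
  return { kind: "none" };
}
```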
Email Domain Blocklist
Sign-up blocks consumer email domains: gmail.com, yahoo.com, outlook.com, hotmail.com, x.com, live.com, msn.com, icloud.com, me.com, aol.com, yandex.com, mail.com. Only business email addresses are accepted.
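A blocklist check of this kind could look like the following. The function name `isAllowedSignupEmail` is hypothetical; the domain list is taken from the paragraph above. The sketch matches exact domains only, since the source does not say whether subdomains of the listed domains are also blocked.

```typescript
const BLOCKED_DOMAINS = new Set<string>([
  "gmail.com", "yahoo.com", "outlook.com", "hotmail.com", "x.com", "live.com",
  "msn.com", "icloud.com", "me.com", "aol.com", "yandex.com", "mail.com",
]);

// Returns true when the address has a domain that is not on the blocklist.
function isAllowedSignupEmail(email: string): boolean {
  const at = email.lastIndexOf("@");
  // Reject addresses with no "@", an empty local part, or an empty domain.
  if (at <= 0 || at === email.length - 1) return false;
  const domain = email.slice(at + 1).trim().toLowerCase();
  return !BLOCKED_DOMAINS.has(domain);
}
```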
BOLA Pattern
All org-scoped services enforce object-level authorization checks to prevent Broken Object Level Authorization (BOLA). The gateway is the sole authority for setting the X-Organization-Id header:
- Gateway resolves the caller's organization from session or API key
- Gateway strips any client-supplied X-Organization-Id header (prevents header injection)
- Gateway injects the verified X-Organization-Id into the forwarded request
- Each service compares the header value against the resource's organizationId
- Mismatch results in a 403 Forbidden response
```typescript
// Inside a Hono route handler, after `resource` has been loaded
const callerOrgId = c.req.header('X-Organization-Id')
if (!callerOrgId || callerOrgId !== resource.organizationId) {
  return c.json({ error: 'Forbidden' }, 403)
}
```
This pattern is used consistently across: user, organization, product, interaction, pattern, billing, analytics, chat, notification, and QnA services.
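The gateway side of this pattern (strip the client-supplied header, then inject the verified one) might be sketched as below. `buildForwardHeaders` is a hypothetical helper that works on a plain record rather than the gateway's real request object.

```typescript
// Hypothetical helper: builds the header set forwarded to a downstream service.
function buildForwardHeaders(
  clientHeaders: Record<string, string>,
  verifiedOrgId: string,
): Record<string, string> {
  const forwarded: Record<string, string> = {};
  for (const name of Object.keys(clientHeaders)) {
    // Header names are case-insensitive, so compare in lowercase.
    if (name.toLowerCase() === "x-organization-id") continue; // drop client value
    const value = clientHeaders[name];
    if (value !== undefined) forwarded[name] = value;
  }
  // Only the gateway-resolved organization ID reaches downstream services.
  forwarded["X-Organization-Id"] = verifiedOrgId;
  return forwarded;
}
```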
Gateway Trust Model: X-Internal-Key
Services must reject requests that bypass the gateway and arrive directly at their internal URLs. The gateway injects a shared secret X-Internal-Key header into every forwarded request. Services validate this header in their /api/v1/* middleware:
```typescript
app.use('/api/v1/*', async (c, next) => {
  // Fail closed: without a configured key, no request can be trusted
  if (!c.env.INTERNAL_GATEWAY_KEY) {
    return c.json({ error: 'Service unavailable' }, 503)
  }
  // Reject requests that did not come through the gateway
  const key = c.req.header('X-Internal-Key')
  if (!key || key !== c.env.INTERNAL_GATEWAY_KEY) {
    return c.json({ error: 'Unauthorized' }, 401)
  }
  return next()
})
```
The INTERNAL_GATEWAY_KEY is a shared secret deployed to all services via wrangler secret put. This ensures that even though internal service URLs are reachable on the public internet (via Cloudflare custom domains), they reject direct access.
The gateway injects the key. The services that validate INTERNAL_GATEWAY_KEY are: user, organization, analytics, notification, pattern, interaction, billing, QnA, bff-chat, core-chat, MCP, and A2A.
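The middleware shown earlier compares the key with `!==`. One possible hardening, which the source does not describe, is a comparison that does not stop at the first differing character. The helper below is a hypothetical, best-effort sketch; JavaScript engines give no formal constant-time guarantee.

```typescript
// Compares two strings without returning early on the first mismatch.
function timingSafeEqualStrings(a: string, b: string): boolean {
  const length = Math.max(a.length, b.length);
  let diff = a.length ^ b.length; // non-zero when the lengths differ
  for (let i = 0; i < length; i++) {
    // charCodeAt returns NaN past the end of a string; "| 0" coerces NaN to 0.
    diff |= (a.charCodeAt(i) | 0) ^ (b.charCodeAt(i) | 0);
  }
  return diff === 0;
}
```

In the middleware, the check `key !== c.env.INTERNAL_GATEWAY_KEY` would then become `!timingSafeEqualStrings(key, c.env.INTERNAL_GATEWAY_KEY)`.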
Cross-Subdomain Cookie Setup on .crowai.dev
Better Auth sets cookies with domain=.crowai.dev, making them available to all subdomains:
- auth.crowai.dev -- sets the cookie on sign-in
- app.crowai.dev -- reads the cookie for dashboard access
- api.crowai.dev -- reads the cookie for API authentication
The session cookie is HTTP-only and is set with SameSite=None; Secure so that browsers include it on cross-origin credentialed requests.
The cookieCache feature is disabled on the auth service (cookieCache: { enabled: false }) to prevent stale session data after set-active organization changes.
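The cookie attributes described in this section correspond to a Set-Cookie value along these lines. Better Auth builds the real header itself; `buildSessionCookie` is a hypothetical helper shown only to make the attributes concrete, and `Path=/` is an assumption.

```typescript
// Illustrative only: assembles the attributes described above into one
// Set-Cookie value. Path=/ is assumed, not taken from the source.
function buildSessionCookie(token: string): string {
  return [
    `better-auth.session_token=${encodeURIComponent(token)}`,
    "Domain=.crowai.dev", // visible to auth., app. and api. subdomains
    "Path=/",
    "HttpOnly", // not readable from page JavaScript
    "Secure", // required by browsers when SameSite=None
    "SameSite=None",
  ].join("; ");
}
```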
Organization Resolution Flow
The gateway resolves a caller's organization through a multi-step process:
- Session path: Read session cookie, call GET /api/v1/auth/get-session, extract activeOrganizationId (a Better Auth org ID), then call GET /api/v1/organizations/by-auth-id/:id on the org service to resolve to the internal UUID
- API key path: Call POST /api/v1/auth/api-key/verify, extract the key owner's userId, then look up the user's organizationId via user service GET /api/v1/users/by-auth-id/:userId (NOT from attacker-controlled key metadata)
The resolved internal organization UUID is set as X-Organization-Id on the forwarded request.
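The two resolution paths can be sketched with the HTTP calls replaced by injected lookups. All names here (`Lookups`, `Caller`, `resolveOrganizationId`) are hypothetical, and the lookups are synchronous only to keep the sketch short; in a Worker they would be awaited fetch calls to the endpoints named in the comments.

```typescript
// Each lookup stands in for one of the HTTP calls described above.
interface Lookups {
  // GET /api/v1/auth/get-session
  getSession(cookie: string): { activeOrganizationId: string | null } | null;
  // GET /api/v1/organizations/by-auth-id/:id
  orgUuidByAuthId(authOrgId: string): string | null;
  // POST /api/v1/auth/api-key/verify
  verifyApiKey(key: string): { userId: string } | null;
  // GET /api/v1/users/by-auth-id/:userId
  userOrgByAuthId(userId: string): string | null;
}

type Caller = { sessionCookie: string } | { apiKey: string };

// Returns the internal organization UUID, or null when it cannot be resolved.
function resolveOrganizationId(caller: Caller, lookups: Lookups): string | null {
  if ("apiKey" in caller) {
    // API key path: trust the key owner's user record, not key metadata.
    const verified = lookups.verifyApiKey(caller.apiKey);
    if (verified === null) return null;
    return lookups.userOrgByAuthId(verified.userId);
  }
  // Session path: map the Better Auth org ID to the internal UUID.
  const session = lookups.getSession(caller.sessionCookie);
  if (session === null || session.activeOrganizationId === null) return null;
  return lookups.orgUuidByAuthId(session.activeOrganizationId);
}
```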
Technology Stack
| Category | Technology | Purpose |
|---|---|---|
| Compute | Cloudflare Workers | Serverless edge compute (TypeScript) |
| Containers | Workers Containers | Pattern analysis, interaction analysis, web crawling (Playwright) |
| Database | Cloudflare D1 | Per-service SQLite databases |
| Object Storage | Cloudflare R2 | Images, assets, raw data |
| Vector DB | Cloudflare Vectorize | Product embeddings, QnA index |
| Queues | Cloudflare Queues | Async message passing |
| KV | Cloudflare KV | Gateway response cache, Next.js ISR cache |
| Durable Objects | Cloudflare DO | Session state (CCTV ingest, web ingest), container orchestration |
| Browser Rendering | Cloudflare Browser | Product page scraping |
| Cron Triggers | Cloudflare Cron | Pattern analysis scheduling, social collection, QnA index refresh |
| AI | Cloudflare Workers AI | LLM inference (@cf/meta/llama-3.3-70b-instruct-fp8-fast, @cf/meta/llama-3.1-8b-instruct, @cf/llava-hf/llava-1.5-7b-hf, @cf/baai/bge-m3) |
| AI Gateway | Cloudflare AI Gateway | LLM request routing and caching (crow-ai-gateway) |
| AI SDK | Vercel AI SDK + workers-ai-provider | Structured generation (generateObject, streamText) for product extraction and org context |
| Web Framework | Hono / OpenAPIHono | HTTP routing and middleware |
| ORM | Drizzle | Type-safe D1 queries |
| Auth | Better Auth | Session management, org plugin, API keys |
| Payments | Stripe | Checkout, subscriptions, webhooks |
| Email | Resend | Transactional email delivery |
| Frontend | Next.js via OpenNext, Astro + Preact via @astrojs/cloudflare | Dashboard, auth, rogue-store, and landing clients |
Data Flow Summary
Web Interactions
SDK events flow through the web-ingest service Durable Objects (CrowWebSession for session buffering) into crow-interaction-queue, consumed by the interaction service for storage and AI analysis.
CCTV
The CCTV ingest service receives frames via API, runs vision analysis with @cf/llava-hf/llava-1.5-7b-hf, stores frames to R2 (crow-cctv-frames), and dispatches messages to crow-cctv-batch-queue for downstream processing by the interaction service.
Social Media Pipeline
Social data collection follows a two-stage pipeline:
- core-social-collector runs on a cron schedule (every 2 hours), generates AI-powered search queries, discovers social content via Tavily search, and dispatches raw items to crow-social-processing-queue
- core-social-processor consumes the queue, enriches content with AI analysis, and forwards processed items to crow-interaction-queue for the interaction service
Pattern Analysis
Cron triggers fire on the pattern service (hourly, daily at 02:00, weekly on Monday at 03:00, monthly on 1st at 04:00, yearly on Jan 1st at 05:00 UTC). The TypeScript worker fetches all organization IDs via the gateway, then dispatches each to a container (PatternAnalyzerContainer) for analysis. Embeddings use @cf/baai/bge-m3. Results are stored in the pattern_result D1 table and pattern embeddings in Vectorize.
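A worker with several cron triggers usually needs to tell which schedule fired. The mapping below is a sketch under assumptions: the cron expressions are inferred from the schedule described above (the source does not list them, and "hourly" is assumed to mean minute 0), and `Period` and `periodForCron` are hypothetical names.

```typescript
type Period = "hourly" | "daily" | "weekly" | "monthly" | "yearly";

// Assumed cron expressions (UTC) matching the schedule described above.
const CRON_TO_PERIOD = new Map<string, Period>([
  ["0 * * * *", "hourly"],
  ["0 2 * * *", "daily"], // 02:00 every day
  ["0 3 * * MON", "weekly"], // Monday 03:00
  ["0 4 1 * *", "monthly"], // 1st of the month, 04:00
  ["0 5 1 1 *", "yearly"], // Jan 1st, 05:00
]);

// Maps the cron expression that fired to an analysis period, or null if unknown.
function periodForCron(cron: string): Period | null {
  return CRON_TO_PERIOD.get(cron) ?? null;
}
```

In a Workers scheduled handler, the expression that fired is available on the controller's `cron` property, so the handler could call `periodForCron(controller.cron)` before dispatching organizations to the container.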
Chat
Two chat services work together:
- core-chat-service implements a lightweight multi-agent system with tool calling using @cf/meta/llama-3.3-70b-instruct-fp8-fast with three tools (search_products, get_interactions, get_patterns) in an agentic loop (max 5 iterations), routed through Cloudflare AI Gateway.
- bff-chat-service provides the dashboard-facing chat with richer tool calling (including search_org_context, get_interaction_summary, search_interactions, search_patterns, search_products), also using @cf/meta/llama-3.3-70b-instruct-fp8-fast via AI Gateway.
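An agentic loop with an iteration cap can be sketched as follows. This is a simplified model, not either service's code: the model call and the tools are injected as plain synchronous functions, and every type and function name is hypothetical. Only the limit of 5 iterations and the tool names used in the usage note come from the text above.

```typescript
type ToolCall = { name: string; args: Record<string, unknown> };
type ModelTurn = { text: string; toolCalls: ToolCall[] };
type Message = { role: "user" | "assistant" | "tool"; content: string };
type Tool = (args: Record<string, unknown>) => string;

const MAX_ITERATIONS = 5; // cap stated for core-chat-service

function runAgentLoop(
  question: string,
  callModel: (messages: Message[]) => ModelTurn,
  tools: Map<string, Tool>,
): string {
  const messages: Message[] = [{ role: "user", content: question }];
  let lastText = "";
  for (let i = 0; i < MAX_ITERATIONS; i++) {
    const turn = callModel(messages);
    lastText = turn.text;
    // No tool calls means the model produced its final answer.
    if (turn.toolCalls.length === 0) return turn.text;
    messages.push({ role: "assistant", content: turn.text });
    for (const call of turn.toolCalls) {
      const tool = tools.get(call.name);
      const result =
        tool !== undefined ? tool(call.args) : `unknown tool: ${call.name}`;
      // Feed each tool result back so the next model call can use it.
      messages.push({ role: "tool", content: result });
    }
  }
  return lastText; // iteration budget exhausted
}
```

For core-chat-service, the `tools` map would hold entries for search_products, get_interactions, and get_patterns. The cap keeps a model that never stops requesting tools from looping indefinitely.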
Product Crawling
- User creates a crawler job via the product service
- Job is sent to crow-product-crawl-queue
- Queue consumer crawls the URL using Browser Rendering or the infra-crawl-service (Playwright containers with R2 result storage)
- Products are extracted using @cf/meta/llama-3.1-8b-instruct via the Vercel AI SDK workers-ai-provider adapter (generateObject with Zod schema)
- Products are embedded using @cf/baai/bge-m3 and stored in Vectorize