Why is AI-era development fast? The common answer, “the model got better,” isn't the real answer. Pinning a single AI to everything actually creates new bottlenecks. The real engine is something else: putting an AI in the project-manager seat, auto-delegating work to whichever AI is strongest in that domain, and placing an experienced human leader above all of it. This post traces how that structure works through three concrete events from Hawkeye, ClickEye's own in-house product being built with this method. A system at Hawkeye's scale would traditionally take a 15-20 person team 12-18 months. The ClickEye approach moves it to a small core team in 3-4 months, and the mechanism behind that shift is what we're unpacking.
0. What is Hawkeye?
First, some brief stage-setting before the main argument. Hawkeye is ClickEye's in-house product, an AI operations platform for large-scale GPU data centers. Cloud providers and AI infrastructure operators share the same hard problem: thousands of GPU servers produce an endless stream of events (overheating, performance degradation, job failures, network anomalies) that no human team can keep up with manually. Hawkeye collects, analyzes, and judges those events automatically, and when action is needed an AI can execute it directly. High-impact decisions still require operator approval (human-in-the-loop). Where conventional monitoring tools stop at “here's what's wrong,” Hawkeye is a system that “tells you what's wrong, what to do, who has to approve it, and then executes it.”
It is designed to handle 5,000+ GPU devices, 50,000 events per second, and 3,300+ detection rules concurrently at production scale. Built the traditional SI (systems integration) way, a system of this size means a team of 15-20 people working for 12-18 months. ClickEye converged this system to an operable state in roughly 60 days using its own approach, and the same approach is applied directly to client projects.
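The human-in-the-loop rule described above can be pictured as a small gate in front of execution. The following is a minimal sketch under stated assumptions, not Hawkeye's actual code: the names `Impact`, `ProposedAction`, and `decide`, and the example event and action strings, are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Impact(Enum):
    LOW = 1
    HIGH = 2


@dataclass(frozen=True)
class ProposedAction:
    event: str      # hypothetical event name, e.g. "gpu_overheat"
    action: str     # hypothetical remediation, e.g. "throttle_node"
    impact: Impact  # how risky the remediation is judged to be


def decide(proposal: ProposedAction, operator_approved: bool = False) -> str:
    """Return what happens to a proposed action.

    High-impact actions wait for an operator; everything else runs directly.
    """
    if proposal.impact is Impact.HIGH and not operator_approved:
        return "pending_approval"
    return "execute"
```

A low-impact proposal returns "execute" immediately, while a high-impact one returns "pending_approval" until it is called again with `operator_approved=True`.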
1. The four layers that produce the speed
- An experienced human leader sits at the strategy layer, holding only product direction, scope, and the final gate.
- A project-manager AI (powered by Google's Gemini) dispatches every task centrally and decides which AI will handle it based on difficulty. The manager AI does not write code itself — it only dispatches.
- Specialist AIs receive the manager's routing and work in parallel. Code review and testing go to OpenAI's code-specialized AI (Codex); architecture, database, site-reliability and domain review go to Anthropic's strongest model (Claude Opus). The best AI for the domain handles the work.
- The ClickEye Workflow (Harness) binds the three layers above into one executable system — team promises, role definitions, and automation flows live in one place that copies forward to the next project.
The mechanism is best explained by looking at how each layer actually moved during 60 days on Hawkeye.
2. What happens when a project-manager AI sits at the center
The most decisive design choice in Hawkeye is that the project manager role itself is held by an AI, not a person. The manager AI (Gemini) does the following:
- Frames the product direction and requirements, breaks them into tickets.
- Assigns each ticket a difficulty tier (1, 2, 3) automatically — Tier 1 for simple/routine, Tier 2 for design-sensitive, Tier 3 for high-complexity, cross-domain, high-risk. Database work is always Tier 3; security-sensitive work is always Tier 3. This is a rule baked into the manager AI.
- Routes each ticket to the right AI based on tier — Tier 1 to a light model, Tier 3 to the strongest model available (Claude Opus with extended reasoning).
- Dual-reviews every plan and requirement artifact with the code-specialized AI (Codex).
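The tier rule and routing in the list above can be sketched roughly as follows. This is an illustration, not the manager AI's real policy: the `Ticket` fields, the model label strings, and the reading of "cross-domain" as "touches more than one domain" are assumptions, and the source does not say which model handles Tier 2.

```python
from dataclasses import dataclass, field

# Hypothetical labels. The source names the tiers and the Tier 1 / Tier 3
# targets; the Tier 2 target is an assumption made for this sketch.
TIER_TO_MODEL = {
    1: "light-model",
    2: "mid-model",
    3: "claude-opus-extended-reasoning",
}

# Domains the source says are always Tier 3.
ALWAYS_TIER_3 = {"database", "security"}


@dataclass
class Ticket:
    title: str
    domains: set = field(default_factory=set)  # domains the ticket touches
    design_sensitive: bool = False
    high_risk: bool = False


def assign_tier(ticket: Ticket) -> int:
    """Assign a difficulty tier, checking the strictest rules first."""
    if ticket.domains & ALWAYS_TIER_3:
        return 3
    if ticket.high_risk or len(ticket.domains) > 1:  # cross-domain or high-risk
        return 3
    if ticket.design_sensitive:
        return 2
    return 1


def route(ticket: Ticket) -> str:
    """Pick the model for a ticket from its tier."""
    return TIER_TO_MODEL[assign_tier(ticket)]
```

Checking the always-Tier-3 domains first means a database or security ticket cannot be downgraded by any later rule, which matches the "baked into the manager AI" framing.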
The result is that humans no longer pick the model for each task. The manager AI reads the domain and decides model strength automatically. The human role moves one layer up — market context, priority, scope freeze, and final sign-off. That is what ClickEye's tagline Execution by Experience actually points to. The leader's experience is encoded as policy; the manager AI enforces it.
3. Specialist AIs receive the routing and work in parallel
Once the manager AI dispatches, the work flows to that domain's champion AI immediately. Hawkeye has 14 specialist roles defined, and each role is intentionally bound to a specific AI:
- Project manager / task decomposition → Google's Gemini (large context window, broad planning).
- Code review / testing / QA → OpenAI's Codex (code-correctness, diff-level reasoning).
- Architecture / database / SRE / domain experts → Anthropic's Claude (database work specifically requires the strongest model, Claude Opus). Long-horizon reasoning, domain trade-off design.
Four core roles (manager, decomposer, code review, QA) are mandatory gates every requirement and code change must pass. The remaining ten, which include architecture, data pipeline, intelligence, platform, infrastructure, security, UI/UX, technical writing, and monitoring, are called automatically the moment that domain is touched. Unlike traditional SI where one person serializes every stage, here the moment the manager AI dispatches, 14 specialist AIs move concurrently.
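The split between mandatory gates and domain-triggered experts can be sketched as a small registry. This is a hypothetical illustration: the dictionary names and the `reviewers_for` function are assumptions, and only the domain experts the post explicitly ties to Claude are listed, so the registry is deliberately incomplete.

```python
# The four mandatory gates and the AI each is bound to, per the mapping above.
MANDATORY_GATES = {
    "manager": "gemini",
    "decomposer": "gemini",
    "code_review": "codex",
    "qa": "codex",
}

# A partial, illustrative subset of the domain experts (not the full ten).
DOMAIN_EXPERTS = {
    "architecture": "claude",
    "intelligence": "claude",
    "platform": "claude",
    "ui_ux": "claude",
    "monitoring": "claude",
}


def reviewers_for(touched_domains: set) -> dict:
    """Return role -> AI for one change: all gates plus every triggered expert."""
    roles = dict(MANDATORY_GATES)  # gates apply to every change
    for domain in sorted(touched_domains):
        if domain in DOMAIN_EXPERTS:
            roles[domain] = DOMAIN_EXPERTS[domain]
    return roles
```

A change that touches no expert domain still passes through the four gates, and touching a domain adds its expert without any human choosing a reviewer.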
4. The ClickEye Workflow (Harness) binds it together
Model routing alone is not enough. The order, the contracts, and the policies must be defined somewhere executable. That is what ClickEye's AI Workflow Execution System actually is. It comes as three bundles.
- Team promises — 18 principles encoded as enforceable policy. Five of the most decisive: usability first (no backend accuracy matters if the operator screen is confusing), no deferral (you cannot use “we'll do it in Phase 2” to ship a partial v1), design first (no code written before expert review), deploy from latest source (every release starts from a fresh pull of the main branch), evidence-traceable values (GPU specs must trace back to vendor datasheets and measurement, never hard-coded). Each is policy, not documentation — violations block the code change itself.
- Role definitions — the 14 roles above, with the AI bound to each role and the review responsibilities written down in advance. There is no “gut-call quality” stage.
- Automation flows — environment validation before every tool call, model routing, reproducible scripted operations.
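To make "policy, not documentation" concrete, here is a minimal sketch of a gate that blocks a code change when a promise is violated. It covers four of the promises named above. The `ChangeRequest` fields and function names are hypothetical, and a real Harness would derive these facts from the repository rather than from hand-set flags.

```python
from dataclasses import dataclass


@dataclass
class ChangeRequest:
    has_expert_design_review: bool     # design first
    defers_scope_to_later_phase: bool  # no deferral
    built_from_latest_main: bool       # deploy from latest source
    untraced_hardware_values: int      # evidence-traceable values


def violations(change: ChangeRequest) -> list:
    """List the names of the promises this change breaks."""
    found = []
    if not change.has_expert_design_review:
        found.append("design first")
    if change.defers_scope_to_later_phase:
        found.append("no deferral")
    if not change.built_from_latest_main:
        found.append("deploy from latest source")
    if change.untraced_hardware_values > 0:
        found.append("evidence-traceable values")
    return found


def may_merge(change: ChangeRequest) -> bool:
    """A change merges only when no promise is violated."""
    return not violations(change)
```

Returning the list of broken promises, rather than a bare boolean, lets the gate tell the author which promise blocked the change.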
Slogans live in meeting notes and get forgotten. The Harness keeps the promises in a form that copies forward to the next project. That is what the “reusable, verified Workflow” line in our marketing actually means.
5. Three events that show how it operates
Event ① — The operator dashboard showed 0/0/0
The AI operations dashboard's three headline metrics persistently displayed 0/0/0, even though the backend was computing the trust score correctly. The traditional SI answer is either “the design document separates these; it's intended” or “we'll harden it in Phase 2.” The usability-first promise forbids both responses. The human leader sent the ticket to the manager AI as a top priority; the manager classified it as a Tier 3 cross-domain task and simultaneously invoked five expert AIs: architecture, intelligence, UI/UX, monitoring, and platform (all Claude). The result: the backend design was realigned and the UI redesigned. The human held the gate, the manager AI routed, and five Claude instances audited their domains in parallel.
Event ② — The notification system did not ship in pieces
Operator notifications are the SI area most often deferred to “Phase 2.” The standard pattern is to ship email in v1, push Slack to the next cycle, leave escalation chains for v2, treat audit as “if we have time.” Hawkeye broke that pattern. The no-deferral promise blocked any partial release; the manager AI decomposed the work; specialist AIs executed in parallel. A single release went out containing five real channels (email, Slack, Teams, webhook, SMS), per-channel retry policy, per-tenant daily and monthly quotas, multi-tier escalation, audit trails, and SLA-breach reporting — all operable. Multiple AIs worked concurrently in their lanes; the release converged at one point.
Event ③ — Five gaps closed in the same week
A privileged agent that runs on each GPU server needed to be built. Because it was security-sensitive, the manager AI auto-classified it as the highest difficulty tier; architecture and platform-expert AIs (both Claude Opus) wrote the formal design document and committed a code skeleton in the same code change (design first and design-and-code-bundled-together firing simultaneously). A week later the platform-expert AI audited and found five gaps — a dead library with no callers, missing crash-recovery logic, missing termination state in the health-check schema, and two more. Traditional SI says “next cycle's backlog.” Here the manager AI converted each gap into a new ticket with re-assigned tiers, and three PATCH releases shipped within the same week. Expert audit gets re-injected as the manager's next-round tickets immediately. Rework doesn't stretch across cycles.
6. Why does it converge in 3-4 months?
Two things eat SI velocity — bottlenecks and rework. This structure blocks both.
- No bottleneck. The instant the manager AI dispatches, 14 specialist AIs move in parallel, unlike the serial structure where one person processes every stage in sequence.
- No rework. Design-first stops missing design; design-and-code-bundled-together stops spec-to-code drift; usability-first stops backend-versus-UI divergence; no-deferral stops the future cost of partial releases. Every common point where teams have to go back and fix is blocked at the policy layer.
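The "no bottleneck" point is essentially fan-out and fan-in. A minimal sketch using Python's standard `asyncio` shows the shape. Here `run_specialist` is a hypothetical placeholder for a real model call, and the role names are taken from Event ① only as an example.

```python
import asyncio


async def run_specialist(role: str, ticket: str) -> str:
    """Stand-in for one specialist AI reviewing a ticket (hypothetical)."""
    await asyncio.sleep(0)  # placeholder for the real, slow model call
    return f"{role}: reviewed {ticket}"


async def dispatch(ticket: str, roles: list) -> list:
    """Start every specialist at once and wait for all of them to finish."""
    return await asyncio.gather(*(run_specialist(r, ticket) for r in roles))


if __name__ == "__main__":
    roles = ["architecture", "intelligence", "ui_ux", "monitoring", "platform"]
    for line in asyncio.run(dispatch("dashboard shows 0/0/0", roles)):
        print(line)
```

`asyncio.gather` returns results in the order the roles were passed in, so the manager can match each result to its role even though the reviews overlap in time.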
ClickEye's comparison copy — “people-driven, hard to scale” vs “automation-first, instantly scalable”, “frequent delays” vs “~3x faster delivery”, “built from scratch every time” vs “reused verified workflows” — describes the outputs of this mechanism. Hawkeye shows the mechanism firing on a real build.
7. What we carry to the next project
After Hawkeye, the asset stays. The 18 promises, 14 role definitions, the manager AI's tier-assignment policy, and the automation flows are stored in a form that copies into the first day of the next project. That's what “reusable verified Workflow” means in practice.
The four cooperating layers, in summary:
- Strategy layer — the experienced human leader holds product direction, role design, scope freeze, final gate.
- Orchestration layer — the project-manager AI (Gemini) dispatches every task. Writes no code; only routes.
- Execution layer — specialist AIs (Codex, Claude Opus, etc.) work concurrently in their lanes.
- Harness — promises, roles, and automation tied together so the three layers above connect cleanly.
8. Closing
The real innovation isn't in any single AI. It is in the multi-layer structure — a manager AI at the center, specialist AIs receiving auto-delegation, the Harness binding promises into one system, and an experienced human leader at the strategy layer above. Hawkeye's 60 days show the structure operating. The speed, cost, and reliability we promise are its outputs.
If you need a system at similar scale with similar reliability, we start from the same policy and the same Harness. Where traditional SI starts from a headcount estimate, we start by bringing the manager AI and its routing policy in on day one. That's how we deliver the same outcome at lower cost and in less time.
The events, promises, and role-to-AI mapping cited above are drawn directly from the internal documents and design records of an internal project (Hawkeye) being built with the ClickEye workflow. Raw counts such as lines of code, commits, migrations, and pipeline jobs are intentionally excluded because they are vanity metrics that mix auto-generated code and boilerplate. The 15-20-person, 12-18-month traditional SI estimate is a customary industry figure for a NOC system at this scale (5,000+ GPUs, 50,000 events per second, 3,300+ detection rules); exact numbers vary by project.