Who owns the IP of AI-generated code?

100% of the deliverable IP belongs to the client. We include IP assignment clauses at contract signing, and ClickEye retains no rights to your code or data. (Our non-public internal Workflows remain ClickEye's asset.)

Can the build be done without sending data outside our environment?

Yes. We can deploy and run AI Workflows directly inside your environment (VPC, intranet, your own cloud). Sensitive data does not pass through ClickEye servers. Detailed security requirements are defined after NDA signing.

How do you prevent the schedule slippage that plagues traditional SI?

We decompose deliverables into Workflow units with completion criteria defined up front. Automated execution removes much of the people-dependency variance. Sprint (2–3 weeks) and Build (4–6 weeks) packages are duration-guaranteed, and we share weekly progress.

How are costs handled when scope expands?

Initial quotes are based on the agreed scope. Scope changes are priced per added Workflow unit. Because automation is the core, marginal cost growth is much lower than people-based SI. Every change is quoted before work begins.

How is the quality of AI-generated work guaranteed?

Through a Human-in-the-loop process. Every AI output passes multi-stage expert review and integration testing before delivery. Build and Scale packages include 2 weeks of free post-delivery stabilization.

Can our team take over and maintain the system?

Yes. We build on standard stacks (Next.js, TypeScript, Python, etc.) that any competent developer can take over. We provide operations manuals, architecture documentation, and handover sessions. Scale includes a dedicated project manager for continued operations.

What are the NDA and contract terms?

We sign an NDA from the first conversation. After scope agreement, contracts use a deposit / mid-payment / final payment structure. Detailed terms are shared during consultation.

Do you take on international projects? Is English or Bahasa Indonesia available?

Yes. We communicate fluently in English and Bahasa Indonesia. For wide time-zone gaps we work asynchronously first (shared docs, Slack), with regular sync meetings scheduled to fit both sides.

Without review, AI output becomes slop — why ClickEye encodes multi-stage verification as a directory

On May 8, 2024, developer and writer Simon Willison pinned a single word to the industry conversation: AI slop. The definition is tight — content that is (1) artificially generated without careful review and (2) pushed onto an audience that did not ask for it.^[1] A year and a half later Merriam-Webster named it the 2025 Word of the Year.^[2] When an industry gives a phenomenon a name, that itself is signal. This post lays out the data behind that signal and shows why ClickEye encodes a multi-stage verification structure as a directory from the first commit.

1. The definition — the line is not "did you use AI?" but "was there review?"

Willison's most important line is this:

“Sharing unreviewed content that has been artificially generated with other people is rude.”^[1]

He states the corollary in the same post: “Not all AI-generated content is slop.”^[1]

The dividing line is not whether AI was used. It is where review and accountability sit. AI output that passes through proper verification and human judgment is the output of a tool. The same output pushed forward unreviewed (into a codebase, a search result, an operations dashboard) becomes slop. That line is not an aesthetic point; data from 2022 through 2025 shows it is a measurable industry cost.

2. The industry has begun to measure the cost of slop

Code — the ratio curl reported

Daniel Stenberg, maintainer of the open-source HTTP library curl, reported that as of 2025 around 20% of incoming security submissions are AI slop, while only ~5% are real vulnerabilities. Each false report consumes 30 minutes to several hours from three to four maintainers.^[3] Unreviewed AI output has begun to eat into the cost base of trust infrastructure — open-source security disclosure channels in this case.

Stack Overflow saw the pattern back in December 2022

Within days of ChatGPT's launch, Stack Overflow temporarily banned ChatGPT-generated answers. The announcement, verbatim:

“The average rate of getting correct answers from ChatGPT is too low... the primary problem is that while the answers which ChatGPT produces have a high rate of being incorrect, they typically look like they might be good.”^[4]

“Looks plausible but is wrong at a high rate” — one sentence captures the common pattern under every slop case the industry has seen since. Without a verification layer, plausible-looking output passes straight through.

Package hallucination — academic measurement

A USENIX Security 2025 paper by Spracklen et al. quantifies the next failure mode. When LLMs generate code, they sometimes recommend package names that do not exist. The measured rates: commercial LLMs 5.2%, open-source LLMs 21.7%. Across 576,000 samples, 205,474 unique fake package names were extracted.^[5] One real-world experiment shows why these numbers are more than statistics.

Lasso Security researchers actually registered huggingface-cli, one of the most-hallucinated package names, on PyPI as a proof of concept. Within about three months it received over 30,000 downloads and was referenced by multiple companies and projects, including Alibaba.^[6] This opens up a new attack category called slopsquatting: attackers pre-register the package names that AIs hallucinate, and the supply chain gets poisoned. That unreviewed adoption leads to security incidents is now documented, not hypothetical.
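One way to put a review seat in front of hallucinated dependencies is to compare every requested package against a list that a human has already approved. The sketch below is a minimal illustration of that idea, not ClickEye's tooling: the allowlist contents, the function names, and the simplified requirement parsing are all assumptions. It checks against an allowlist because a plain "does this package exist on the index?" check would not help once an attacker has registered the hallucinated name.

```python
import re

# Hypothetical allowlist: packages a human has reviewed and approved.
# The entries are examples only.
APPROVED_PACKAGES = {"requests", "numpy", "huggingface-hub"}


def normalize(name: str) -> str:
    """Normalize a package name (case and runs of '-', '_', '.'), as PEP 503 does."""
    return re.sub(r"[-_.]+", "-", name).lower()


def package_name(requirement: str) -> str:
    """Extract the bare package name from a simple requirement line."""
    # Split at the first character that starts a version spec, extra, or marker.
    head = re.split(r"[\s<>=!~;\[@]", requirement.strip(), maxsplit=1)[0]
    return normalize(head)


def unreviewed_dependencies(requirements: list[str], approved: set[str]) -> list[str]:
    """Return requirement names that are not on the reviewed allowlist."""
    approved_normalized = {normalize(name) for name in approved}
    flagged = []
    for line in requirements:
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        name = package_name(line)
        if name not in approved_normalized:
            flagged.append(name)
    return flagged
```

Run against an AI-proposed requirements list, any name the function returns goes to a reviewer before installation. The parsing handles only simple requirement lines; a production check would rely on a full requirement parser.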

Copilot security

NYU researchers Pearce et al. evaluated 1,689 programs across 89 CWE scenarios and found roughly 40% of GitHub Copilot-generated code contained security vulnerabilities.^[7] That is not a defect in Copilot itself — it is the baseline cost of accepting AI-generated code without security review.

The code base itself is changing — GitClear's 211M-line study

GitClear's 2025 analysis of 211 million lines of code shows a structural shift in the codebase that lines up temporally with AI assistant adoption. Refactoring rates collapsed from 25% in 2021 to under 10% in 2024, while copy-paste clone rates rose from 8.3% to 12.3% over the same window.^[8] AI assistants generate quickly, but when the review step that cleans up afterward is skipped, the codebase slowly rots. The cost of speed gets paid later, in maintenance.

3. The dividing line is the review seat

Five data points point the same way. The problem isn't that AI was used. It is that AI output reached production, the codebase, or the operations floor without passing through a review-and-accountability layer. Willison's definition pins exactly that line: the unreviewed seat is where slop is made.

This conclusion is the industry evidence behind one of ClickEye's headline messages — “AI drafts, experts verify” (human-in-the-loop). It is why we encode a multi-stage verification structure as a directory from day one.

4. Where ClickEye's multi-stage verification actually sits

Review is split into layers rather than concentrated in one seat, and each layer is automated. Before any AI output reaches production, it has to pass through:

The PM AI (Gemini) classifying every task by tier — every ticket gets an automatic difficulty tier (1, 2, 3). Security-sensitive and database work are policy-forced to Tier 3 and routed to the strongest model (Claude Opus extended). The areas where ‘looks-plausible-but-wrong’ output is most dangerous (security, DB, cross-domain) automatically receive the deepest review.
Mandatory code-review AI (Codex) on every code change — every proposed code change goes through the code-specialist AI before merge. This is the first filter for the “plausible but wrong” output Stack Overflow warned about in 2022.
Domain-expert AI (Claude Opus) audit — architecture, database, site reliability, and security are audited by domain-expert AIs. A single model's hallucination is cross-checked by a different model from a different vantage point. (In our Hawkeye case study, this is exactly the seat where a platform-expert audit found five gaps a week after design — see the companion post.)
Design-first + design-and-code-bundled-together — no code goes in without a spec, and the spec lives in the same code change as the code itself. The drift between intent and implementation — fertile ground for hallucination — is structurally blocked.
Human leader's final merge gate — even after Codex review and Opus audit, no merge happens without a human sanity check. The last place where “unreviewed content” could reach the outside is held by a person.
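The tier rule in the first layer can be read as a policy override applied on top of the classifier's proposal. The sketch below illustrates that reading under stated assumptions: the label names, the `Task` shape, and the model identifiers in the routing table are hypothetical placeholders, not ClickEye's actual configuration.

```python
from dataclasses import dataclass, field

# Hypothetical labels that policy forces to the deepest review tier.
FORCED_TIER_3_LABELS = {"security", "database"}

# Hypothetical routing table; the identifiers are placeholders, not real model names.
MODEL_BY_TIER = {1: "fast-model", 2: "standard-model", 3: "strongest-model"}


@dataclass
class Task:
    title: str
    proposed_tier: int  # tier suggested by the classifier (1, 2, or 3)
    labels: set[str] = field(default_factory=set)


def effective_tier(task: Task) -> int:
    """Apply the policy override on top of the classifier's proposal."""
    if task.proposed_tier not in MODEL_BY_TIER:
        raise ValueError(f"unknown tier: {task.proposed_tier}")
    if task.labels & FORCED_TIER_3_LABELS:
        return 3  # policy wins over the classifier for sensitive work
    return task.proposed_tier


def route(task: Task) -> str:
    """Return the model identifier that should handle the task."""
    return MODEL_BY_TIER[effective_tier(task)]
```

The design point is that the override lives in code rather than in the classifier's judgment: even if the classifier rates a database migration as Tier 1, the task is still routed to the deepest review.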

The single purpose of these five layers is to block every path by which AI output reaches production without passing through review and accountability. The finding that multi-stage review outperforms single-pass review is also one of the most stable conclusions in software defect-detection research, going back to the Fagan inspections of the 1970s.
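"Block every path" can be expressed as a merge gate that fails closed: a change merges only when every review layer has signed off. The following is a minimal sketch of that logic, assuming a hypothetical `ChangeStatus` record with one flag per layer; the field names are illustrative and the tier classification step is left out because it happens before code is written.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChangeStatus:
    has_spec_in_same_change: bool  # design-first: spec ships with the code
    code_review_passed: bool       # automated code-review step
    domain_audit_passed: bool      # domain-expert audit step
    human_approved: bool           # final human merge gate


def missing_checks(status: ChangeStatus) -> list[str]:
    """List every review layer the change has not yet cleared."""
    checks = {
        "spec in same change": status.has_spec_in_same_change,
        "code review": status.code_review_passed,
        "domain audit": status.domain_audit_passed,
        "human approval": status.human_approved,
    }
    return [name for name, passed in checks.items() if not passed]


def can_merge(status: ChangeStatus) -> bool:
    """A change merges only when no layer is missing."""
    return not missing_checks(status)
```

Returning the list of missing layers, rather than a bare boolean, lets the gate tell the author which seat is still empty.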

5. What ClickEye is committing to

The comparison copy on the ClickEye site (“uncertain, inconsistent” vs “production-ready delivery guaranteed”) is backed by exactly this structure. AI speed is a starting point; the speed only becomes real value when the seats that hold accountability for the output are encoded as policy. ClickEye encodes those seats in the project's .claude/ directory (18 team promises, 14 explicit role definitions, automated invocation flows) and copies them into the first commit of every subsequent project.
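Copying the policy directory into a new project can be as small as one guarded directory copy. The sketch below is an assumption about how such a bootstrap step might look, not a description of ClickEye's scripts: the function name and the template location are hypothetical, and the internal layout of the directory is deliberately left unspecified.

```python
import shutil
from pathlib import Path


def bootstrap_policy_dir(template_dir: Path, project_dir: Path) -> list[str]:
    """Copy a policy template into a new project and list the copied files."""
    target = project_dir / ".claude"
    if target.exists():
        # Refuse to overwrite: an existing policy directory needs human review.
        raise FileExistsError(f"{target} already exists")
    shutil.copytree(template_dir, target)
    return sorted(
        str(path.relative_to(target)) for path in target.rglob("*") if path.is_file()
    )
```

The returned file list can be pasted into the first commit message, so the review seats a project starts with are visible in its history.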

Companion reading on the doctrine and the real-world case:

The environment makes the outcome — where real AI differentiation comes from (global industry doctrine: Anthropic's four engineering posts + UK government evaluation standard + coding evaluation moving from 1.96% to 82%)
Putting an AI in the project-manager seat — the dev-culture shift behind ClickEye (three concrete events from ClickEye's in-house product Hawkeye, where this structure actually operated)

6. Closing

AI is fast. It is going to get faster. But the industry is now paying the cost of speed that runs without accountability — in the time of curl maintainers, the trust of supply chains, the future maintainability of code bases, and above all the trust of clients who put a system in our hands. ClickEye blocks that cost from the start by encoding multi-stage review as a directory. If you need AI not merely adopted but delivered with the review seats designed in from day one, get in touch.

References

[1] Willison, S. (May 8, 2024). Slop is the new name for unwanted AI-generated content. “sharing unreviewed content that has been artificially generated with other people is rude”. simonwillison.net/2024/May/8/slop
[2] Merriam-Webster (2025). Word of the Year 2025: “slop”. merriam-webster.com/wordplay/word-of-the-year
[3] Stenberg, D. (2025). ~20% of curl security submissions are AI slop. Public reports by the curl maintainer (originals at daniel.haxx.se; covered by LWN, The Register, Hackster).
[4] Stack Overflow (Dec 2022). Temporary policy: Generative AI (e.g., ChatGPT) is banned. “The average rate of getting correct answers from ChatGPT is too low”. meta.stackoverflow.com/questions/421831
[5] Spracklen, J. et al. (USENIX Security 2025). We Have a Package for You! A Comprehensive Analysis of Package Hallucinations by Code Generating LLMs. Commercial LLMs 5.2%; open-source 21.7%; 205,474 unique fake package names. arxiv.org/abs/2406.10279
[6] Lasso Security (2024). Diving Deeper into AI Package Hallucinations: Slopsquatting in the wild. huggingface-cli PoC, 30,000+ downloads. lasso.security/blog/ai-package-hallucinations
[7] Pearce, H. et al. (2022, IEEE S&P). Asleep at the Keyboard? Assessing the Security of GitHub Copilot's Code Contributions. ~40% of 1,689 programs across 89 CWE scenarios contained security vulnerabilities. arxiv.org/abs/2108.09293
[8] GitClear (2025). AI Copilot Code Quality: 2025 Look at Refactoring, Reuse, and Read-Time. Analysis of 211M lines: refactoring 25% → under 10%, copy-paste clones 8.3% → 12.3%. gitclear.com/ai_assistant_code_quality_2025_research