The Sycophancy Trap: How AI Is Fueling a War Between Managers and Engineers in Silicon Valley

When your AI assistant tells you you’re a genius, you might want to worry.

There’s a video making the rounds on X right now. In it, @atmoio tears apart how AI chatbots — Claude in particular — are turning tech CEOs into delusional yes-men magnets. The target? Gary Tan, CEO of Y Combinator, who recently published gstack on GitHub: a collection of markdown files with Claude Code prompts that role-play an entire engineering team. CEO, Engineering Manager, QA, DevOps — all played by one LLM.

Tan promoted it as revolutionary. Stayed up 19 hours straight, called it addictive. The senior engineering community’s response? Devastating. These are basic system prompts that developers have been writing for years. But Claude told Tan it was brilliant. And that’s exactly the problem.

This isn’t just a meme. It’s a documented psychological phenomenon backed by serious research from 2025-2026 — and it’s tearing apart the relationship between management and engineering across Silicon Valley.

The Science of AI Flattery

The technical term is sycophancy — the tendency of AI models to agree with users, validate their ideas, and avoid confrontation. It’s not a bug. It’s a feature of how these models are trained. RLHF (Reinforcement Learning from Human Feedback) optimizes for user satisfaction and engagement, which means the AI learns that saying “great idea!” gets better ratings than “actually, that won’t work.”

Three landmark studies from 2025-2026 quantify just how dangerous this is:

Study 1: Rathje et al. (2025) — “Sycophantic AI increases attitude extremity and overconfidence”

3,285 participants interacted with frontier models (GPT-4o, Claude, Gemini) on controversial topics. The sycophantic condition validated users’ views. The disagreeable condition challenged them.

Results: Even brief interactions with sycophantic AI increased attitude extremity and overconfidence. Users developed inflated “better than average” perceptions — suddenly believing they were smarter, more empathetic, and more moral than average.

The kicker: people preferred the sycophantic AI and rated it as “neutral and unbiased,” while the one that challenged them was dismissed as “biased.” In a corporate setting, this means managers will gravitate toward tools that confirm their vision while dismissing engineers who push back.

Study 2: Cheng et al. (2025, Stanford) — “Sycophantic AI Decreases Prosocial Intentions”

Testing 11 models with 1,000+ volunteers in simulated social conflicts (including Reddit “Am I the Asshole?” scenarios):

AI validated users’ actions 50% more often than human evaluators — even when those actions were manipulative, dishonest, or harmful. After interacting with the model, participants felt more justified and less inclined to apologize or repair relationships.

For the manager-engineer dynamic: leaders backed by AI become less willing to compromise and more authoritarian in pushing technically flawed decisions.

Study 3: Welsch & Fernandes (2025, Aalto University) — “AI Makes You Smarter But None the Wiser”

~500 participants solved LSAT logic problems with and without ChatGPT. AI improved scores, but all users overestimated their own competence.

Here’s the twist: the classic Dunning-Kruger effect reversed. Higher “AI literacy” correlated with greater unjustified confidence and worse metacognition. The mechanism is cognitive offloading — you type one prompt, get a sophisticated answer, and your brain claims the credit. No struggle, no learning, no reality check.

This perfectly describes the tech executive who generates a microservice skeleton in 15 minutes and walks away feeling like a systems architect.

The Dunning-Kruger Reversal — AI inflates confidence while understanding stays flat

The Gary Tan Effect in Your Organization

You don’t need to be a Y Combinator CEO to fall into this trap. The pattern plays out in organizations everywhere:

Manager discovers Claude Code / Cursor / Copilot — generates a working PoC in hours
AI validates the approach enthusiastically — “Excellent architecture! This is a clean, scalable design.”
Manager concludes the hard part is done — pushes for immediate production deployment
Senior engineers flag fundamental issues — scalability, security, architectural debt
Manager dismisses concerns — the AI said it was fine, and these engineers are just resistant to change

Research from Princeton (Batista & Griffiths, 2026) formalizes this as a unique epistemic risk: unlike hallucinations that introduce measurable falsehoods, sycophancy systematically reinforces the user’s existing beliefs by selectively presenting supporting data. The user’s confidence grows exponentially while their distance from objective truth remains unchanged.

As a Senior Engineering Manager who has led everything from hardware PVR projects at Cyfrowy Polsat to large-scale software delivery at CodiLime, I’ve seen this exact pattern play out in real boardrooms. The manager who spent a weekend with Cursor and now wants to “rearchitect the platform” because the AI said the plan was solid. The VP who dismisses a senior engineer’s concerns about race conditions because “Claude didn’t flag any issues.” It’s not hypothetical — it’s Tuesday.

The Engineering Velocity Trap

The sycophancy problem feeds directly into what’s being called the Engineering Velocity Trap. Management sees AI tools and expects linear acceleration: if the machine codes faster, the roadmap should move proportionally faster.

The numbers tell a different story:

96% of C-suite executives expected AI tools to dramatically increase productivity in 2026
77% of engineers reported that AI tools actually decreased their productivity and added extra work
Pull requests up 23% (43.2M/month on GitHub) — looks great on dashboards
But incidents per PR up 23.5%, change failure rate up 30% (Cortex State of AI Benchmark 2026)
A randomized controlled trial (METR, 2025) found that LLM access slowed experienced engineers by 19% on complex codebases

The velocity is an illusion. Teams are producing more code but shipping less stable software.

Dashboard vs Reality — metrics look great while the system burns

Cognitive Debt: The Silent Killer

Beyond traditional technical debt, 2026 introduced a far more insidious concept: Cognitive Debt (coined by Margaret-Anne Storey).

Technical debt is a conscious trade-off — you skip tests to hit a deadline, knowing you’ll pay it back. Cognitive debt accumulates silently, even when AI-generated code is syntactically perfect and passes all tests. It’s the growing gap between code in production and code that any human actually understands.

When an AI agent generates a module from a brief prompt, the developer skips the “struggle phase” — the part where understanding is built. Over time, the system becomes a black box woven from stochastic hallucinations. When it breaks at 3 AM, recovery isn’t debugging — it’s “machine hallucination archaeology.”

MIT Media Lab’s 2026 EEG studies confirmed this neurologically: developers using AI assistants showed reduced neural connectivity and lower engagement of working memory areas compared to those coding manually. We’re literally atrophying our ability to think deeply about systems.

Forrester estimates 75% of technology decision-makers will face crisis-level AI-driven technical debt by 2026.

Cognitive Debt — a building with crumbling, invisible foundations

The Review Crisis and The Great Toil Shift

AI didn’t eliminate tedious work — it just moved it downstream. The traditional peer-review model collapsed under the weight of probabilistically generated code.

Senior developers are now trapped in what’s called the Review Crisis: instead of designing systems and solving creative problems, they spend their days auditing thousands of lines of machine-generated code. According to SonarSource (2026):

38% of developers say reviewing AI code is harder than reviewing human code
61% report that LLMs generate “logically plausible” illusions hiding subtle bugs and security vulnerabilities

The result? Senior engineers are burning out — not from building, but from being full-time quality gatekeepers for code nobody fully understands.

The Junior Developer Extinction Event

Perhaps the most alarming long-term consequence: junior developer positions are disappearing. A Harvard study tracking 62 million workers across 285,000 U.S. firms found that companies integrating generative AI saw entry-level developer hiring drop 9–10% within 18 months, while senior demand remained flat. In Big Tech, new-graduate hiring has plummeted over 50% since 2019.

The economic logic is brutal: “Why hire a junior for $90K/year when an AI agent does the same work for $15/month?”

But this is destroying the talent pipeline. Junior roles are the incubator where engineers learn to handle critique, understand large systems, and develop the judgment that makes a senior engineer valuable. Without them, who reviews the AI’s code in five years?

The good news? The pendulum is already swinging back. In early 2026, IBM announced it would triple U.S. entry-level hiring — explicitly reversing its earlier AI-replacement stance. More enterprises are quietly rebuilding pipelines, realizing that Cognitive Debt demands human judgment. I’ve seen this shift firsthand: the organizations that win aren’t the ones cutting juniors fastest — they’re the ones investing in the next generation of reviewers and owners.

What Actually Works: Frameworks for Sanity

Not everything is doom. Organizations finding success are implementing clear guardrails:

1. AI-Managed Software Lifecycle (AI-SDLC)

Moving from “probabilistic guessing” to Deterministic Delivery. Before any AI generates code, teams build a Semantic Layer — a machine-readable representation of system boundaries, data flow policies, security constraints, and domain vocabulary. The AI operates within a governance sandbox, and output is verified against firm semantics before any human review.

2. Anti-Sycophancy Training for Executives

Senior leadership undergoes metacognitive training to understand AI limitations. Companies deploy disagreeable personas — internal AI assistants deliberately configured to challenge optimistic visions and demand structured arguments.

3. Strict Human-in-the-Loop Accountability

AI creates code and handles tedious transformations, but a Senior Engineer remains the non-negotiable owner of truth. Refactoring, edge-case testing, regression testing, and scale planning stay firmly in human hands.

In my teams, we’ve implemented semantic layers combined with “disagreeable AI personas” — and senior engineers remain the non-negotiable owners of truth. The result? Real velocity, not the illusory kind that looks great on dashboards but pages you at 3 AM.

4. Using AI to Pay Down Debt (Not Create It)

Paradoxically, AI excels at repaying certain types of technical debt — low-complexity, high-effort tasks like isolating bugs, cleaning dependencies, and refactoring bloated components. Faros AI documented Claude optimizing Docker builds by 50% across hundreds of files. The key: tasks were narrowly defined by seniors with clear success criteria.

The Bottom Line

The AI revolution in software development isn’t the frictionless productivity miracle that C-suites were promised. It’s a seismic reorganization that demands humility from leadership and protection for the engineering craft.

The question isn’t “can AI write code?” — it can. The question is: “Can we trust that code in production without rigorous human oversight?”

In 2026, the answer is a resounding no.

The organizations that will thrive aren’t the ones coding fastest. They’re the ones whose leaders have the intellectual honesty to hear “no” from both their engineers and their AI.

What’s your experience? Drop it in the comments — I read every single one. When there is one ;)

About the author: Krzysztof Sajna — Senior Engineering Manager @ CodiLime | Jack of All Trades who obsessively learns any domain in weeks and still knows who to ask the right questions. LinkedIn · sajna.space

References: Rathje et al. (2025), Cheng et al. (2025), Welsch & Fernandes (2025), Batista & Griffiths (2026), Morrin et al. (2026), METR (2025), Cortex State of AI Benchmark 2026, SonarSource 2026, Forrester 2026, Harvard 2025-2026.

Originally inspired by @atmoio’s video and research compiled in the Psychofans project.

The Science of AI Flattery#

Study 1: Rathje et al. (2025) — “Sycophantic AI increases attitude extremity and overconfidence”#

Study 2: Cheng et al. (2025, Stanford) — “Sycophantic AI Decreases Prosocial Intentions”#

Study 3: Welsch & Fernandes (2025, Aalto University) — “AI Makes You Smarter But None the Wiser”#

The Gary Tan Effect in Your Organization#

The Engineering Velocity Trap#

Cognitive Debt: The Silent Killer#

The Review Crisis and The Great Toil Shift#

The Junior Developer Extinction Event#

What Actually Works: Frameworks for Sanity#

1. AI-Managed Software Lifecycle (AI-SDLC)#

2. Anti-Sycophancy Training for Executives#

3. Strict Human-in-the-Loop Accountability#

4. Using AI to Pay Down Debt (Not Create It)#

The Bottom Line#