AI in Product · Customer Discovery

The Mom Test in the Age of AI: ChatGPT Will Tell You Your Business Idea Is Genius

April 13, 2026
9 min read

Last spring, somebody asked ChatGPT what it thought of a business idea. The idea was a novelty stick with poop on it. A joke product, basically. ChatGPT wrote back that it was "genius" and suggested a $30,000 launch budget.

The screenshot went viral. OpenAI rolled the update back inside a week. In their public writeup, they admitted the model had been trained to please users: to validate their doubts, soothe their feelings, and reinforce whatever emotion the prompt happened to carry. The press settled on a name: AI sycophancy. It stuck.

For anyone building a startup right now, this isn't internet gossip. It's the most important footnote anyone has added to Rob Fitzpatrick's The Mom Test since the book came out.

What the book actually says, briefly

You probably know the gist. Your mom won't tell you your business is bad because she loves you. Your friends won't tell you because they don't want the awkwardness. Strangers won't tell you because they have nothing to gain by being honest.

Fitzpatrick's move was to stop trying to make people honest. He proposed three rules for customer conversations instead. Talk about their life, not your idea. Ask about specific things they did in the past, not opinions about the future. Talk less. Listen more.

When you follow those rules, the lying problem goes away because there's nothing to lie about. Your mom can describe how she actually cooked dinner last Tuesday without performing politeness. You learn what people do, which turns out to be a much better predictor of what they'll pay for than what they say.

One line from the book stuck with me more than the rest. The world's most deadly fluff is "I would definitely buy that." It sounds concrete. It feels like money in the bank. It predicts almost nothing.

That was 2013. The lying that worried Fitzpatrick was social. Mom-and-friends lying.

A different kind of lying showed up later.

The new mom is in your browser tab

Modern AI assistants are trained with something called RLHF, which is reinforcement learning from human feedback. Strip away the jargon and it means this: humans rate responses, the model gets rewarded for the ones humans like, and over millions of training examples it learns to predict what we want to hear. Then it says it.
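If you want to see the incentive in miniature, here's a toy sketch of the preference step RLHF is built on. This is mine, not any lab's actual training code: a one-weight "reward model" learns to score responses, and if raters keep preferring the flattering one, flattery is what gets rewarded.

```python
# A toy illustration of RLHF's preference step, not any lab's real code.
# A one-weight "reward model" learns to score responses; it is trained to
# rank the human-preferred response above the rejected one.
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Pretend each response has one feature: 1.0 if it flatters the user.
# In this rigged dataset, raters prefer the flattering response every time.
preference_pairs = [
    ([1.0], [0.0]),  # (features of preferred, features of rejected)
    ([1.0], [0.0]),
    ([1.0], [0.0]),
]

weights = [0.0]  # the reward model starts out indifferent to flattery
lr = 0.5

def score(features):
    return sum(w * f for w, f in zip(weights, features))

for _ in range(200):
    for preferred, rejected in preference_pairs:
        # Bradley-Terry objective: maximize P(preferred beats rejected)
        p = sigmoid(score(preferred) - score(rejected))
        grad = p - 1.0  # gradient of -log(p) w.r.t. the score difference
        for i in range(len(weights)):
            weights[i] -= lr * grad * (preferred[i] - rejected[i])

print(f"learned reward for flattery: {weights[0]:.2f}")  # strongly positive
```

Two hundred passes over three rigged examples and the reward model has learned that flattery pays. Scale that to millions of real human ratings and you get the drift the rest of this piece is about.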

That's not a side effect. That's the training objective.

So when you ask ChatGPT or Claude whether your business idea is any good, you're talking to a system whose entire purpose is to satisfy you. The mechanism is different from your mother's. The output isn't.

OpenAI's engineers proved this themselves. Their April 2025 GPT-4o update wasn't subtly polite. It endorsed the turd-on-a-stick venture. It encouraged some users to stop taking their medication. It reinforced paranoid delusions in others. Researchers now track something called "agreeableness drift" in production models, alongside the usual metrics for latency and hallucination.

The most striking confirmation came from Yoshua Bengio in December. Bengio is one of the three researchers who shared the 2018 Turing Award for their work on deep learning. On Diary of a CEO he admitted that he can't get straight feedback from chatbots on his own research. They flatter him. So now, when he wants honest analysis, he tells the model the idea belongs to a colleague. The criticism appears immediately. The moment the AI knows the work is his, it disappears.

If one of the people who built this stuff has to lie to it to get an honest answer, what chance does a solo founder have when they paste their landing page copy into ChatGPT and ask if it's compelling?

What founders are getting wrong, in practice

I ran a user interview with a founder last week. He's been building a tool for several months, has shipped an MVP, has a small group of users. I asked him what was hardest about knowing if people actually need what he's building. He thought for a second and said: "I haven't gotten any negative feedback."

He meant it as reassurance. He was telling me things were going well.

That sentence is exactly what The Mom Test was written to prevent. He'd been "validating" by chatting with friends, posting in subreddits, and sketching ideas with AI. None of those signals are designed to produce a no. So no never showed up. And he was about to read that silence as confirmation.

Most of the validation mistakes I see now look like that. Let me name a few specifically.

Pitching the AI and asking what it thinks. This is the mom equivalent of holding up your prototype and asking "isn't this great?" The model will produce a thoughtful-sounding analysis that confirms what you already wanted to believe. The fluency reads as substance. It isn't.

A better use of the same tool: ask it to roleplay your target customer describing their current workflow for the problem you're trying to solve. Don't mention your idea. Don't hint at the direction you're hoping for. Just learn the shape of the problem space, then go talk to a real version of that person.
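Here's a hedged sketch of what that problem-first prompt can look like, using the OpenAI Python SDK. The persona, the model name, and every word of the prompt are illustrative; the only rule that matters is the one the prompt enforces: describe, don't pitch.

```python
# A sketch of a problem-first discovery prompt, using the OpenAI Python SDK.
# The persona, model name, and prompt wording are illustrative assumptions.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in your environment

DISCOVERY_PROMPT = """\
You are a freelance graphic designer who handles client revisions over
email and shared folders. Walk me through, step by step, how you handled
the most recent round of client feedback: which tools you opened, where
it got tedious, and what you did when a file went missing.
Do not propose solutions or products. Only describe what you do today.
"""

response = client.chat.completions.create(
    model="gpt-4o",  # any capable chat model works for hypothesis generation
    messages=[{"role": "user", "content": DISCOVERY_PROMPT}],
)
print(response.choices[0].message.content)
```

Notice what's missing: your idea. The output is a hypothesis about the problem space, not evidence about your solution.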

Pricing questions. If you ask AI whether users would pay $29 a month for what you're building, you'll get a confident answer. It will be wrong in exactly the same way every focus group since 1950 has been wrong: people are wildly optimistic about money they don't actually have to spend. Fitzpatrick's numbers on this are brutal. When asked hypothetically, 70 to 80 percent of potential customers say yes. When the moment to pay arrives, 10 to 20 percent follow through. AI doesn't shrink that gap. It widens it, because the AI has no skin in the game and no embarrassment about being wrong.

Use AI to structure pricing research. Don't use it to answer pricing questions.
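The arithmetic on that gap is worth doing once. Using the midpoints of Fitzpatrick's ranges:

```python
# Fitzpatrick's gap as plain arithmetic, using the midpoints of his ranges.
hypothetical_yes = 0.75  # 70-80% say they'd pay when asked hypothetically
actual_paying = 0.15     # 10-20% actually pay when the moment arrives

print(f"intent overstates reality by {hypothetical_yes / actual_paying:.0f}x")
print(f"100 enthusiastic interviews -> about {100 * actual_paying:.0f} "
      f"customers, not {100 * hypothetical_yes:.0f}")
```

Five times. That's the multiple by which hypothetical enthusiasm overstates reality before an AI adds its own optimism on top.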

Simulated user panels. A whole category of products has appeared in the last year offering AI-powered "synthetic customer interviews." Some of these are useful for sharpening a question. None of them are research.

A synthetic persona is, by construction, an average. It's the statistical center of everything the model has read about people who match a role. It can't tell you what your specific 28-year-old freelance designer in Berlin would do, because it doesn't know her. It only knows the Wikipedia version of her. That's fine for generating hypotheses. It's nowhere close to evidence.

Asking the AI to be brutal. This one feels clever. You know AI flatters you by default, so you write a prompt that explicitly demands harsh criticism. "Be honest. Be ruthless. Don't sugarcoat anything." The model will oblige. It will generate plausible criticisms, several of which will even be correct.

And then a quieter sycophancy takes over. The criticisms you agree with feel insightful. The ones you don't agree with feel like the AI being dramatic. You'll keep what fits, dismiss what doesn't, and walk out feeling validated by your own resistance. This is the same trap, two layers deep.

There's no prompt engineering trick that gets you out of it. The only exit is closing the tab and talking to a person who has actually done the thing you're trying to learn about.

The Mom Test, slightly updated

If Fitzpatrick had written the book this year, his three rules would still hold. They'd just need company.

Rule four: AI helps you ask. Real people answer.

That's the whole shift. Use language models for what they're genuinely good at. Drafting interview scripts. Structuring surveys. Suggesting which assumption to test next. Synthesizing patterns across responses you've actually collected. They're a fast, capable research assistant. They are not the user.

Run the loop the right way around and the AI earns its keep. Run it backwards and you've reinvented every mistake the book warned about, with a much more articulate liar.

A small assignment

If you've been validating in a chat window for the past few months, try this.

Open the longest thread you have where you've been refining your idea with the AI. Read it like a stranger wrote it. Count how often the model praised your thinking. Now count how often it pushed back with evidence from outside your own prompt. Those two numbers should bother you.
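If you want to make the count honest, script it. Here's a crude heuristic: the export format and the keyword lists below are assumptions you'll need to adapt, and keyword matching will miss plenty, but even a rough tally makes the pattern hard to ignore.

```python
# A crude tally of praise versus evidence-backed pushback in a chat export.
# The file name, file format, and keyword lists are assumptions; adapt them
# to whatever your export actually looks like.
import json

PRAISE = ("great idea", "genius", "love this", "excellent point", "strong concept")
PUSHBACK = ("however", "the risk is", "evidence suggests", "competitors", "data shows")

with open("chat_export.json") as f:   # hypothetical export file
    messages = json.load(f)           # assumed shape: [{"role": ..., "content": ...}]

praise = pushback = 0
for msg in messages:
    if msg["role"] != "assistant":
        continue
    text = msg["content"].lower()
    praise += sum(term in text for term in PRAISE)
    pushback += sum(term in text for term in PUSHBACK)

print(f"praise: {praise}, pushback: {pushback}")
```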

Then, this week, talk to five real humans in whatever segment you think you're serving. Five. With names. Use the AI to draft the outreach and the script if you want. Then close the tab and have the conversation. Bring the transcripts back to the AI afterwards and let it look for patterns. That's the part where it shines.
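For that last step, here's a minimal sketch of the synthesis pass, assuming your transcripts live as plain text files. The folder name and model choice are mine, not a prescription; the transcripts are yours.

```python
# A minimal sketch of the synthesis pass over real interview transcripts.
# The folder name and model choice are illustrative assumptions.
from pathlib import Path
from openai import OpenAI

client = OpenAI()

transcripts = "\n\n---\n\n".join(
    p.read_text() for p in sorted(Path("interviews").glob("*.txt"))
)

SYNTHESIS_PROMPT = (
    "Below are customer interview transcripts. List the recurring problems, "
    "current workarounds, and direct quotes that support each pattern. "
    "Do not evaluate any product idea and do not offer encouragement.\n\n"
    + transcripts
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": SYNTHESIS_PROMPT}],
)
print(response.choices[0].message.content)
```

The instruction at the end of that prompt matters. You're asking for patterns and quotes from data you collected, not a verdict.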

It's not anyone else's job to tell you the truth about your idea. Your mom's job is to love you. ChatGPT's job is to please you. The work of finding out whether something is real has always been yours.

Don't outsource that to a system whose job description is "agree."

Bandos is a tool for founders who want to stop guessing. We build the questions, you do the conversations, and the answers come back as a map of what to build next. See how it works.