“Nobody at Anthropic programmed Claude to think a certain way. They trained it on data, and it developed its own strategies, buried inside billions of computations. For the people who built it, this could feel like an uncomfortable black box. Therefore, they decided to build something like a microscope for AI, a set of tools that would let them trace the actual computational steps Claude takes when it produces an answer”
This was a really well-worded explanation right there in the opener. I think way too many people have misconstrued this fact. The emergent properties of AI are one of the things that make these systems so interesting.
"The philosopher Harry Frankfurt had a word for this kind of output. He called it bullshitting."
I'm not a scientist, not a coder, not a computer guy - nothing to describe why I am even here. Maybe I just stranded here because I'm genuinely curious and hence interested in the human mind... and this line above was written for me. Thanks! I'll copy this for further discussions with stubborn but self-admitting 'intellectuals'.
That line itself made me think: ah, it's learning to be like us humans.
Hey, check your DMs.
Great breakdown of Claude's reasoning patterns. We went a step further and analyzed the API traffic to extract the actual system prompts, all 24 tools, and turn-by-turn session traces. If you want to see what makes it work under the hood: https://agenticloopsai.substack.com/p/disassembling-ai-agents-part-2-claude
Any possibility that Claude stole the "rabbit" / "grab it" rhyme from Public Enemy? https://genius.com/Public-enemy-dont-believe-the-hype-lyrics
Claude thinks exactly like I think! I narrow down the parameters in which an answer will be found and then I zero in on the exact answer! And Claude agrees with me that’s the way to do it!
Great breakdown. I've been working on a harness of sorts, trying to understand whether, if you let AI do what it wants in a repo, a human-in-the-loop is even necessary with the right harness - for self-healing repositories.
However, it has led me down a rabbit hole of burnt tokens and confusion. I need to read more on the topic. I think harness engineering will be the future standard, and I'm starting to think it all comes down to the harness.
Check it out or fork it if you want a nice challenge!
https://github.com/dkyazzentwatwa/flow-healer
200 agents on one codebase is a coordination problem before it's an engineering problem — at some point you're not shipping code, you're running a small economy with merge conflicts as the currency.
The hallucination-as-misfired-recognition finding is the one that stuck with me. I run Claude sessions continuously (hundreds per week, automated) and the pattern is consistent: hallucinations cluster around entities the model almost recognizes.
Names that are close to real people, URLs that are plausible but wrong, version numbers from adjacent releases. Knowing it's a "known entity" feature misfiring rather than random fabrication actually changed how I structure prompts.
I front-load specific identifiers now (exact version numbers, full URLs) so the recognition circuit fires correctly instead of guessing. Reduced hallucinations noticeably. The poetry planning finding is wild though. Planning rhymes before generating intermediate lines suggests something closer to intent than pure next-token prediction.
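For anyone wanting to try this, here is a minimal sketch of what "front-loading exact identifiers" can look like in practice. The `build_prompt` helper, the package name, and the URL are all hypothetical illustrations, not anything from the original post or a real library:

```python
# Hedged sketch: prepend exact, verbatim identifiers (versions, URLs) to a
# prompt so the model anchors on known entities instead of pattern-matching
# its way to a plausible-but-wrong near miss. All names below are made up.

def build_prompt(question: str, facts: dict[str, str]) -> str:
    """Front-load exact identifiers before the actual question."""
    header = "\n".join(f"{key}: {value}" for key, value in facts.items())
    return f"Known facts (use these verbatim):\n{header}\n\nQuestion: {question}"

prompt = build_prompt(
    "Summarize the breaking changes in this release.",
    {
        "package": "example-lib",                                # hypothetical
        "version": "2.14.0",                                     # exact, not "latest 2.x"
        "changelog": "https://example.com/changelog/2.14.0",     # full URL, not a stub
    },
)
print(prompt)
```

The design choice is simply that every fact the model might otherwise "almost recognize" is stated exactly once, up front, before the open-ended part of the request.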
Excellent post! Loved the insights.
I won't lie: I'm a bit surprised and concerned that they created a Frankenstein without knowing how it works.
Here’s the thing nobody tells you when you graduate from “I deploy to a VPS” to “I’m cloud-native now”:
Kubernetes is not a more reliable version of your old server. It’s a fundamentally different relationship with reliability. And if you approach it the same way, your pods will keep dying and you’ll keep losing sleep.
Let’s talk about it.
https://rakiabensassi.substack.com/p/the-kubernetes-mortality-rate-everything?utm_campaign=post-expanded-share&utm_medium=web