Excellent breakdown of how LLMs actually “learn”...especially the emphasis on loss functions and next-token prediction over true reasoning. The distinction between pattern matching and understanding is critical, and this piece explains it in a way that’s both technically accurate and accessible.
Don’t humans learn by repetition and feedback too? Just curious.
And we call computers “not intelligent” while we don’t even know what real intelligence means or how our brains really work.
Awesome
The way large language models actually learn (next-token prediction, not reasoning) changes how you should design instructions for them.
If LLMs are optimizing for what looks like a good completion rather than reasoning toward your goal, then your instruction patterns matter more than their logical structure. After 1,000+ sessions, I found consistency beats correctness every time. I rewrote my whole instruction file around this insight.
Wrote up what I learned: https://thoughts.jock.pl/p/how-i-structure-claude-md-after-1000-sessions
Does this pattern-matching framing change how you'd teach someone to write better prompts?
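To make the pattern-matching framing concrete, here is a minimal sketch of next-token prediction, reduced to bigram counts. The corpus and all the statistics are invented for illustration; a real LLM uses a transformer over a huge corpus, but the objective is the same shape: pick a plausible continuation, not a reasoned answer.

```python
# Toy sketch of next-token prediction: the model only learns
# "which token tends to follow this context", not what is true.
# The corpus and resulting counts are made up for illustration.
from collections import Counter, defaultdict

corpus = "the cat sat on the mat the cat ate the fish".split()

# Count how often each token follows each context token (bigram stats).
follows = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev][nxt] += 1

def predict_next(token):
    """Return the most frequent continuation: plausible, not reasoned."""
    return follows[token].most_common(1)[0][0]

print(predict_next("the"))  # "cat": the most frequent follower, not the "correct" one
```

The point of the sketch: the model’s notion of “good” is purely statistical, which is why consistent instruction patterns tend to work better than logically airtight ones.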
This is good, but it doesn’t explain why LLMs use transformers, or what a neural network is. Because if someone already knows neural networks, they know loss functions too. There’s a gap.
I’ve covered all the related concepts from the bottom up, with minimal math, on my Substack. Feel free to check it out and share your thoughts.
Cheers!
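For readers hitting the gap this comment describes: a loss function is just a score of how wrong a prediction was. Below is a hedged sketch of cross-entropy, the loss used for next-token prediction, with made-up probabilities for illustration.

```python
# Sketch of a loss function: cross-entropy on a toy next-token
# distribution. The probabilities are invented for illustration.
import math

def cross_entropy(probs, target_index):
    """Negative log-probability the model assigned to the correct token."""
    return -math.log(probs[target_index])

# Model's predicted distribution over 3 candidate next tokens.
probs = [0.7, 0.2, 0.1]

print(cross_entropy(probs, 0))  # low loss: the right token got high probability
print(cross_entropy(probs, 2))  # high loss: the right token got low probability
```

Training a neural network is then just nudging its weights to push this number down, averaged over the corpus.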
This is a great primer for anyone building on top of LLMs who wants to move beyond treating them as black boxes. Understanding how these models learn (pretraining, fine-tuning, RLHF) changes the way you think about what to delegate to them and where you still need human oversight. In my experience, the teams that get the most out of AI in production are the ones who understand the fundamentals well enough to know where the model's confidence is real and where it's just fluent guessing.
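The “fluent guessing” point above can be illustrated with a softmax sketch: a model’s apparent confidence is just the probability mass on its top token. The logits below are invented numbers, not from any real model.

```python
# Sketch: a model's "confidence" is the softmax probability of its
# top token. The logits are invented numbers for illustration.
import math

def softmax(logits):
    exps = [math.exp(x) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# One token dominates: the output *sounds* confident even if the
# underlying pattern match is wrong.
confident = softmax([8.0, 1.0, 0.5])
# Probability is spread out: the model is effectively hedging.
uncertain = softmax([2.0, 1.9, 1.8])

print(max(confident))  # ~0.999, high probability, not necessarily correct
print(max(uncertain))  # ~0.37, spread out across candidates
```

A peaked distribution means the pattern was common in training data, not that the claim is true, which is exactly why fluent output needs human oversight in the cases that matter.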
SynapseFoundry.net, LLC has fixed all 8 structural failures of AI with a concept and working prototype code. Alas, we have no money, no salesman, no engineer. We need them, and an angel investor who wants to get their hands dirty, to be the first team in a $500 billion company. How? What AI anywhere can design a complex Query and memory structure, save it for a year while running baby Queries for it with full expert User support, lock it in a Vault, hand off a full Family Morsels Bundle to ECL, which manages run execution and brings back full Project/run data, so the next run uses ALL of that data? Finally, we can do ALL of that sitting right beside a Model; we are 100% Model agnostic and cross-Model agnostic. That means we can run twenty Synapses in 1 LLM and talk to an off-site Synapsed LLM on ONE project for a decade. NOT ONE AI ON PLANET EARTH CAN DO THAT, AND WE ARE PROTOTYPE CODED.
man this was superb!! And very very well explained, thank you 💛
Cool