10 Comments
Xian

“The quality of AI’s output tells you something about the quality of your input.

Think of it like a detective solving a case. One clue (the suspect was in town that day) leaves hundreds of possibilities open. Two clues (they were in town and had a motive) narrows it down. Five independent clues might point to exactly one person.” - Julie Zhuo

ToxSec

Great read, and an awesome breakdown of the terminology as well. Considering the audience here, I think it's great that we all align on the terms as we discuss.

Noah Hirshon

The “lost in the middle” problem is one of the most practical insights in this space. In my experience building with LLMs, context placement matters more than context volume — you can stuff the window with relevant information and still get mediocre output if the structure is wrong.

Pawel Jozefiak

The "lost in the middle" problem is something I've hit repeatedly. My agent setup has an instruction file that grew to about 4,000 tokens over time - added sections as needed - and at some point noticed it was ignoring rules buried in the middle.

Ended up restructuring: the most critical rules at the top, the less frequent edge cases at the bottom. It helped immediately. The point about writing information externally to overcome statelessness is also the part I'd flag for anyone starting out: platform memory doesn't travel across models or cron runs, but a flat markdown file does. Took me longer than it should've to figure that out.
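As a rough illustration of the flat-file pattern Pawel describes, here is a minimal Python sketch. The agent_memory.md path, the example rules, and the call_model() stub are hypothetical placeholders, not anything from the article or his setup.

```python
# Minimal sketch of flat-file memory plus "critical rules first" prompt layout.
# MEMORY_PATH, the rule text, and call_model() are hypothetical placeholders.
from pathlib import Path
from datetime import datetime, timezone

MEMORY_PATH = Path("agent_memory.md")  # a plain markdown file any model can read

def load_memory() -> str:
    """Read whatever earlier runs (or other models) left behind."""
    return MEMORY_PATH.read_text() if MEMORY_PATH.exists() else "(no notes yet)"

def append_memory(note: str) -> None:
    """Append a timestamped note so it survives across cron runs and model swaps."""
    stamp = datetime.now(timezone.utc).isoformat(timespec="seconds")
    with MEMORY_PATH.open("a", encoding="utf-8") as f:
        f.write(f"\n- [{stamp}] {note}")

def build_prompt(task: str) -> str:
    """Critical rules first, memory and task next, rare edge cases last,
    so nothing important sits in the middle of a long context."""
    critical_rules = "RULES: always cite sources; never modify user data."      # example
    edge_cases = "EDGE CASES: if an API rate-limits, back off and retry once."  # example
    return "\n\n".join([critical_rules, load_memory(), f"TASK: {task}", edge_cases])

# Each scheduled run rebuilds its prompt from the same file:
# response = call_model(build_prompt("Summarize yesterday's tickets"))
# append_memory("Summarized tickets; 3 unresolved escalations remain")
```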

Mitchell Kosowski

The framing of context engineering as four strategies (write, select, compress, isolate) is really useful.

I think the "isolate" pattern is the one most teams underestimate. It's counterintuitive that splitting work across multiple smaller contexts outperforms one large context using a more powerful model, but Anthropic's 90% improvement with multi-agent delegation makes the case pretty clearly.

James Trageser

It will be interesting when everyone realizes that I have invented a way to fold proteins instantly (a separate project specifically for protein folding under the same organization on GitHub), and have created Infinite Context for LLMs, which is lossless and whose entropy drops the larger it scales. It can also retrieve data from the context history no matter how large the context grows (the context could be 1,000 years of typing complex code and math in the same chat/session with an LLM) with "Needle in a Haystack" precision and no data/memory loss ever. It is on GitHub and HuggingFace for all to see at: https://GitHub.com/Nexus-Resonance-Codex/Phi-Infinity-Lattice-Compression/LLM-Infinite-Context-Prompt.md (other, more detailed and effective methods beyond just instructions or a prompt are also in that repository). You can also fold proteins live for free on HuggingFace: https://huggingface.co/spaces/Nexus-Resonance-Codex/Resonance-Fold

You can see the infinite context demos here: https://GitHub.com/Nexus-Resonance-Codex/Phi-Infinity-Lattice-Compression/models (use the prompts and play around with them)

and the prompt:

https://github.com/Nexus-Resonance-Codex/Phi-Infinity-Lattice-Compression/blob/main/LLM-Infinite-Context-Prompt.md

and the HuggingFace.co space to test it live:

https://huggingface.co/Nexus-Resonance-Codex/LLM-Infinite-Context-Engine

you can email us at nexus resonance co d e x at gma i l c o m 😁

🫶🫶🫶

Charles Fonseca

Chroma’s research is pretty great, thanks for sharing!

Scenarica

The accuracy cliff at long context is the most underreported finding of the year. Everyone is racing to expand context windows while the models quietly get worse the more you give them. The implication for retrieval architectures is the opposite of what most teams are building. Less context, better selected, beats more context every time.
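A minimal sketch of the "less context, better selected" idea: rank chunks by similarity to the query and keep only what fits a deliberately small token budget. The embed() function and the crude token estimate below are hypothetical placeholders; a real retrieval stack would use its own embedding model and tokenizer.

```python
# Select a few highly relevant chunks under a small budget instead of
# stuffing the window. embed() is a hypothetical embedding stub.
import numpy as np

def embed(text: str) -> np.ndarray:
    raise NotImplementedError("plug in your embedding model here")

def rough_tokens(text: str) -> int:
    return max(1, len(text) // 4)  # crude approximation, not a real tokenizer

def select_context(query: str, chunks: list[str], budget_tokens: int = 2000) -> list[str]:
    q = embed(query)
    q = q / np.linalg.norm(q)
    scored = []
    for chunk in chunks:
        v = embed(chunk)
        scored.append((float(np.dot(v / np.linalg.norm(v), q)), chunk))
    scored.sort(key=lambda pair: pair[0], reverse=True)

    selected, used = [], 0
    for score, chunk in scored:
        cost = rough_tokens(chunk)
        if used + cost > budget_tokens:
            continue  # skip chunks that would blow the budget
        selected.append(chunk)
        used += cost
    return selected
```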

TRADE CRAFTERS

There’s something almost poetic about a system that becomes less precise as it’s given more to think about. You can feel the attention thinning out, like a crowd that grows so large it forgets why it gathered in the first place.

Most people assume intelligence compounds with input, but here it behaves more like pressure. Add enough of it and the structure doesn’t strengthen, it distorts. At some point the signal isn’t lost, it’s buried under too many things pretending to be it.