Category Archives: Retrieval Augmented Generation (RAG)

Context Engineering – Stop Stuffing the Window

When teams ship their first AI agent, they usually find out — within a few weeks — that the model wasn’t the problem. The agent hallucinates a customer ID three turns into a conversation and then cheerfully references it for the rest of the session. A 50-step task dies at step 42 because the context window is filled with tool output nobody needed. A “simple” migration tool that worked beautifully on 10 files collapses on 100 because the noise drowns out the signal. The team retries with a bigger model. The bugs move, but they don’t leave.

This is the pattern that has pushed an entire sub-discipline — context engineering — from niche jargon to what Cognition has called “effectively the #1 job of engineers building AI agents.” In April 2026, Thoughtworks moved context engineering into the Adopt ring of its Technology Radar, framing it as having “evolved from an optimization tactic into a foundational architectural concern for modern AI systems.” In their words, the context window is “a design surface,” and your job is to “intentionally construct the AI’s information environment.”

In the last six months, every serious agent builder has published essentially the same lesson: what separates a demo from a production agent is not which model you pick, but how you shape the information that model sees on every turn.

For engineering leaders, this matters beyond the mechanics. Context engineering is reshaping how we structure codebases, how we document systems, how we think about memory and observability, and which skills we value on our teams. This post is a tour of that landscape.

Retrieval-Augmented Generation (RAG) with Spring AI

Retrieval-Augmented Generation (RAG) is a powerful pattern that enhances Large Language Models (LLMs) by grounding their responses in your specific documents and data. While GPT-4 is incredibly capable, it doesn’t know about your proprietary documents, internal knowledge bases, or recent updates that occurred after its training cutoff date. RAG solves this problem by retrieving relevant context from your documents before generating responses.

Unlocking the Power of LLMs with LangChain

As an AI and software professional, you’ve likely heard the buzz around large language models (LLMs) like GPT-3 and ChatGPT, and around their growing capabilities. These powerful models can handle a wide range of natural language tasks, from text generation to question answering. However, effectively leveraging LLMs in your own applications can be a complex challenge. That’s where LangChain comes in.
