AI is changing how we build software. Large Language Models (LLMs) like GPT, Claude, and others have evolved from research curiosities into practical tools that can understand natural language, write code, and solve complex problems. However, while Python developers have enjoyed a rich AI ecosystem built around frameworks such as LangChain, Java developers, who power most enterprise applications, have been left behind.
Enter LangChain4j, a comprehensive Java library that brings the full power of modern AI to the enterprise Java ecosystem. It’s not just a wrapper around API calls; it’s a full framework that leverages Java’s strengths and addresses enterprise requirements.
A Few Key Concepts
Large Language Models (LLMs)
LLMs are AI models pre-trained on vast amounts of data, which can then generate human-like responses. Think of them as incredibly sophisticated pattern-recognition systems that can answer questions based on their training, generate code, documentation, and image content, follow complex instructions, and maintain context across conversations.
Prompt Engineering
The quality of AI responses depends heavily on how you phrase your requests (aka prompts). LangChain4j provides tools to create reusable prompt templates, inject context dynamically, and define system-level behavior.
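As a quick illustration, a reusable template can be built with LangChain4j’s PromptTemplate class (a minimal sketch; the template text and the model variable are placeholders of my choosing):

// Reusable template: {{style}} and {{text}} are filled in at call time.
PromptTemplate template = PromptTemplate.from(
        "Summarize the following text in {{style}} style:\n{{text}}");

Prompt prompt = template.apply(
        Map.of("style", "bullet-point", "text", "LangChain4j brings LLMs to Java..."));

String response = model.chat(prompt.text()); // `model` is a configured ChatModel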
Chains
A chain is a predefined pipeline of calls (models, retrievers, other chains, etc.). It encapsulates a multi-step workflow into a single callable interface. For example, a chain might take an input question, pass it through an LLM, then use another model on the LLM’s output, and then decide to call an external Tool to get more information before finalizing the outputs. LangChain’s docs explain: “Chains should be used to encode a sequence of calls to components like models, document retrievers… and provide a simple interface to this sequence”. In LangChain4j, Chains work the same way: you can compose prompts, LLM calls, and retrievers into a structured pipeline.
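In LangChain4j, such pipelines are most commonly composed with the AiServices builder (shown later in this post). A minimal sketch, assuming a configured ChatModel named model and a ContentRetriever named retriever:

// Hypothetical single-method interface; the builder returns a generated
// implementation that retrieves context, builds the prompt, and calls the LLM.
interface PipelineAssistant {
    String answer(String question);
}

PipelineAssistant pipeline = AiServices.builder(PipelineAssistant.class)
        .chatModel(model)              // the LLM call step
        .contentRetriever(retriever)   // the retrieval step runs before the LLM call
        .build();

String result = pipeline.answer("How should I structure my service tiers?");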
Memory
Memory components let a chain or agent retain context across calls. In a chat, for example, memory loads past messages and saves new ones. LangChain docs summarize: “Memory maintains Chain state, incorporating context from past runs”.
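A minimal sketch of per-conversation memory in LangChain4j (assuming a configured ChatModel named model; the ChatAssistant interface is hypothetical):

// Sliding-window memory: keeps the 10 most recent messages and feeds them
// back into each request so the model retains conversational context.
interface ChatAssistant {
    String chat(String message);
}

ChatMemory memory = MessageWindowChatMemory.withMaxMessages(10);

ChatAssistant assistant = AiServices.builder(ChatAssistant.class)
        .chatModel(model)
        .chatMemory(memory)
        .build();

assistant.chat("My name is Alice.");
String reply = assistant.chat("What is my name?"); // the earlier turn is supplied as context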
Tools
LLMs can decide when to use external functions (or REST APIs, or database queries) to complete tasks. Instead of just generating a text response to a math prompt from its pre-trained knowledge, an LLM can choose to call the math functions you make available to it. This bridges the gap between AI reasoning and real-world actions. In the code example, we will walk through a simple case of giving the LLM access to some math functions, which it can then decide to call before responding to questions.
Retrieval Augmented Generation (RAG)
RAG enables LLMs to answer questions using specific documents and data unique to your business domain. Instead of relying solely on training data, the AI first searches your knowledge base and then uses that information to provide accurate, contextual responses. LangChain’s conceptual guide explains: “Vector stores are specialized data stores that enable indexing and retrieving information based on vector representations… used to search over unstructured data”. LangChain4j mirrors this and supports various Java-accessible vector databases (e.g., Pinecone, Redis) and retrievers. You can load documents, embed them, and then retrieve them on demand, just as you would in Python LangChain.
Why LangChain4j Matters for Enterprise Java
LangChain4j brings the above concepts into the JVM world with a unified, type-safe Java API. Under the hood, it offers a familiar interface over dozens of LLM providers and vector stores. For example, whether you use OpenAI, Anthropic, or a local LLM (Ollama), you code against the same interface, thus making it easier to swap providers. Similarly, all vector stores implement a shared interface, so RAG pipelines are portable too.
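For example, here is a sketch of swapping providers behind the shared ChatModel interface (the model names, URL, and the OllamaChatModel builder come from the respective LangChain4j modules; treat the exact values as placeholders):

// Both builders yield a ChatModel, so the rest of your code is provider-agnostic.
ChatModel openAi = OpenAiChatModel.builder()
        .apiKey(System.getenv("OPENAI_API_KEY"))
        .modelName("gpt-4o-mini")
        .build();

ChatModel local = OllamaChatModel.builder()   // from the langchain4j-ollama module
        .baseUrl("http://localhost:11434")
        .modelName("llama3")
        .build();

String answer = openAi.chat("Hello!");        // the same call works against `local`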
LangChain4j benefits…
- Pure Java libraries that fit naturally into existing codebases
- Support for popular frameworks such as Spring Boot and Quarkus. In the remainder of this blog, we will use Spring Boot with LangChain4j.
- Proper logging, error handling, and configuration management
- Type safety with compile-time checking for AI interactions
- Uses annotations and interfaces that Java developers already know and use
LangChain4j Features
1. Declarative AI Services with @AiService
The traditional approach to AI integration often involves managing HTTP clients, handling responses, and parsing JSON. LangChain4j introduces a declarative approach:
@AiService
public interface ShakespeareanAssistant {

    @SystemMessage("You are a polite assistant but respond in shakespearean english. If asked math questions show the calculations and also end with a funny joke at the end")
    String chat(String userMessage);
}
Benefits…
- Zero Boilerplate: Spring Boot automatically creates the implementation
- Type Safety: Compile-time checking ensures your interface is valid
- Testability: Easy to mock for unit tests
- Maintainability: AI behavior is clearly defined in one place
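Because the starter registers the generated implementation as a Spring bean, using it is plain constructor injection (the controller below is a hypothetical usage sketch):

@RestController
class PoetryController {

    private final ShakespeareanAssistant assistant;

    PoetryController(ShakespeareanAssistant assistant) {
        this.assistant = assistant; // implementation generated by LangChain4j
    }

    @PostMapping("/poetry")
    String chat(@RequestBody String message) {
        return assistant.chat(message);
    }
}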
2. Function Calling with @Tool
Function calling allows LLMs to execute your Java methods when needed:
@Service
public class ChatService {

    @Tool("Calculate the sum of two numbers")
    public double sum(double a, double b) {
        return a + b;
    }

    @Tool("Calculate the square root of a number")
    public double squareRoot(double x) {
        return Math.sqrt(x);
    }
}
How it works…
- User asks: “What’s the square root of 144?”
- The LLM recognizes it needs the squareRoot tool
- LangChain4j calls your method: squareRoot(144)
- The result is returned to the LLM and included in the response
Benefits…
- Accuracy: Real calculations instead of AI approximations
- Integration: Leverage existing business logic
- Auditability: Function calls are logged and traceable
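With the Spring Boot starter, beans exposing @Tool methods can be discovered for @AiService interfaces automatically. If you wire things by hand instead, a sketch using the programmatic AiServices builder (assuming the ChatService bean above and a configured ChatModel named model) looks like this:

// .tools(...) registers the @Tool-annotated methods of the given object,
// letting the model request sum(...) or squareRoot(...) during a chat.
ShakespeareanAssistant assistant = AiServices.builder(ShakespeareanAssistant.class)
        .chatModel(model)
        .tools(chatService)
        .build();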
3. Prompt Management
System Messages
@AiService
public interface MyAssistant {

    @SystemMessage("You are a polite assistant but respond in shakespearean english. If asked math questions show the calculations and also end with a funny joke at the end")
    String shakespeareanChat(String message);
}
User Message Templates
@AiService
public interface MyAssistant {

    @UserMessage("""
            My name is {{name}}. Respond with answers politely by addressing my name.
            Question:
            ```{{question}}```
            """)
    String chatWithPromptTemplate(@V("name") String name, @V("question") String question);
}
Benefits…
- Reusability: Same template for different inputs
- Maintainability: Prompts are separate from logic
- Type Safety: Parameters are checked at compile time
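Invoking the templated method is then an ordinary Java call, with the placeholders filled from the arguments:

String reply = myAssistant.chatWithPromptTemplate("Alice", "What is polymorphism in Java?");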
Building Your First LangChain4j Application
Let’s build a practical AI-powered application step by step.
Project Setup
Add LangChain4j to your Spring Boot project:
<dependency>
    <groupId>dev.langchain4j</groupId>
    <artifactId>langchain4j-open-ai-spring-boot-starter</artifactId>
    <version>1.0.1-beta6</version>
</dependency>
Configure your OpenAI API key:
langchain4j.open-ai.chat-model.api-key=${OPENAI_API_KEY}
langchain4j.open-ai.chat-model.log-requests=true
langchain4j.open-ai.chat-model.log-responses=true
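You can also pin the model through configuration; the property below follows the starter’s kebab-case convention (verify the exact name against your starter version):

langchain4j.open-ai.chat-model.model-name=gpt-4o-mini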
Creating Your First AI Service
Start with a simple conversational service using the MyAssistant interface:
@AiService
public interface MyAssistant {

    @SystemMessage("You are a polite assistant but respond in shakespearean english. If asked math questions show the calculations and also end with a funny joke at the end")
    String shakespeareanChat(String userMessage);
}
Adding Business Logic with Tools
Enhance your assistant with real functionality using the actual ChatService:
@Service
public class ChatService {

    private final ChatModel model;

    public ChatService(@Value("${openai.api.key}") String apiKey) {
        this.model = OpenAiChatModel.builder()
                .apiKey(apiKey)
                .modelName(GPT_4_O_MINI)
                .build();
    }

    // Basic chat
    public String chat(String userMessage) {
        return model.chat(userMessage);
    }

    // Image analysis using multimodal capabilities
    public String whatIsThisImage(String imageURL) {
        UserMessage userMessage = UserMessage.from(
                TextContent.from("What do you see?"),
                ImageContent.from(imageURL)
        );
        ChatResponse chatResponse = model.chat(userMessage);
        return chatResponse.aiMessage().text(); // extract the text of the AI message
    }

    @Tool("Sums 2 given numbers")
    double sum(double a, double b) {
        System.out.println("sum called");
        return a + b;
    }

    @Tool("Returns a square root of a given number")
    double squareRoot(double x) {
        System.out.println("squareRoot called");
        return Math.sqrt(x);
    }
}
Creating REST Endpoints
Expose your AI services via REST APIs using the actual ChatController:
@RestController
@RequestMapping("/api")
public class ChatController {

    private final ChatService chatService;
    private final MyAssistant myAssistant;
    private final WellArchitectedAssistant wellArchitectedAssistant;

    public ChatController(ChatService chatService, MyAssistant myAssistant,
                          WellArchitectedAssistant wellArchitectedAssistant) {
        this.chatService = chatService;
        this.myAssistant = myAssistant;
        this.wellArchitectedAssistant = wellArchitectedAssistant;
    }

    // Basic chat
    @PostMapping("/chat")
    public String chat(@RequestBody String message) {
        return chatService.chat(message);
    }

    // Shakespearean chat using @AiService
    @PostMapping("/shakchat")
    public String shakespeareanChat(@RequestBody String message) {
        return myAssistant.shakespeareanChat(message);
    }

    // Chat with prompt template
    @PostMapping("/template")
    public String chatWithPromptTemplate(@RequestParam String name, @RequestParam String question) {
        return myAssistant.chatWithPromptTemplate(name, question);
    }

    // RAG-powered chat about the AWS Well-Architected Framework
    @PostMapping("/rag")
    public String ragChat(@RequestBody String question) {
        return wellArchitectedAssistant.chatWithArchitect(question);
    }

    // Given an image URL, describe what the image is of
    @PostMapping(value = "/whatisimage", consumes = "application/json", produces = "text/plain")
    public String whatIsThisImage(@RequestBody ImageRequest request) {
        return chatService.whatIsThisImage(request.getUrl());
    }
}
Testing Your Application
Ensure that the OS environment variable OPENAI_API_KEY is set, as this is required to make API calls to OpenAI.
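On macOS or Linux, for example:

export OPENAI_API_KEY=<your-openai-api-key>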
Start your application and test it with curl commands against the REST endpoints defined in ChatController. Startup will take about a minute (a lifetime for any production service, but acceptable for this blog’s illustration): at startup, the code reads the PDF and loads it into an in-memory embedding store for later retrieval by the LLM.
mvn clean spring-boot:run

curl -X POST \
  -H "Content-Type: text/plain" \
  -d "What's the square root of 144?" \
  http://localhost:8080/api/chat
The expected response is “The square root of 144 is 12.0”.
Test the Shakespearean-style responses of MyAssistant:
curl -X POST \
  -H "Content-Type: text/plain" \
  -d "What is 25 + 17?" \
  http://localhost:8080/api/shakchat
Test image analysis:
curl -X POST \
  -H "Content-Type: application/json" \
  -d '{"url": "https://www.cubesnjuliennes.com/wp-content/uploads/2020/08/Best-Indian-Punjabi-Samosa-1.jpg"}' \
  http://localhost:8080/api/whatisimage
Document-Based AI with RAG
One of LangChain4j’s most powerful features is RAG (Retrieval-Augmented Generation), which enables your AI to answer questions using your own knowledge base. In this code example, we will ingest the contents of the AWS Well-Architected Framework PDF. When asked architecture questions, the LLM can consult this indexed repository instead of defaulting to its trained knowledge. In practice, you would substitute your own proprietary data sets for the document chosen for this code sample.
Understanding RAG Architecture
RAG works in two phases:
Ingestion Phase:
- Document Loading: Parse PDFs, Word docs, web pages
- Text Splitting: Break documents into manageable chunks
- Embedding Generation: Convert text chunks to numerical vectors
- Vector Storage: Store embeddings for fast similarity search
Query Phase:
- Question Embedding: Convert the user’s question to a vector
- Similarity Search: Find the most relevant document chunks
- Context Injection: Add relevant chunks to the prompt
- AI Generation: LLM generates an answer using the retrieved context
Setting Up RAG
Add RAG dependencies:
<dependency>
    <groupId>dev.langchain4j</groupId>
    <artifactId>langchain4j-document-parser-apache-pdfbox</artifactId>
    <version>1.0.1-beta6</version>
</dependency>
<dependency>
    <groupId>dev.langchain4j</groupId>
    <artifactId>langchain4j-embeddings-all-minilm-l6-v2</artifactId>
    <version>1.0.1-beta6</version>
</dependency>
Create a document processing service using the actual RagService:
@Service
public class RagService {

    private ContentRetriever contentRetriever;
    private EmbeddingModel embeddingModel;
    private boolean isInitialized = false;

    @PostConstruct
    public void initialize() {
        try {
            // Initialize embedding model
            embeddingModel = new AllMiniLmL6V2EmbeddingModel();

            // Load PDF document from home directory
            String pdfPath = System.getProperty("user.home") + "/wellarchitected-framework.pdf";

            // Check if file exists
            if (!Paths.get(pdfPath).toFile().exists()) {
                System.err.println("PDF file not found at: " + pdfPath);
                return;
            }

            Document document = FileSystemDocumentLoader.loadDocument(
                    pdfPath, new ApachePdfBoxDocumentParser());

            // Split document into chunks
            DocumentSplitter splitter = DocumentSplitters.recursive(500, 100);
            List<TextSegment> segments = splitter.split(document);

            // Create embedding store and ingest segments
            EmbeddingStore<TextSegment> embeddingStore = new InMemoryEmbeddingStore<>();
            EmbeddingStoreIngestor ingestor = EmbeddingStoreIngestor.builder()
                    .documentSplitter(splitter)
                    .embeddingModel(embeddingModel)
                    .embeddingStore(embeddingStore)
                    .build();
            ingestor.ingest(document);

            // Create content retriever
            contentRetriever = EmbeddingStoreContentRetriever.builder()
                    .embeddingStore(embeddingStore)
                    .embeddingModel(embeddingModel)
                    .maxResults(5)
                    .minScore(0.6)
                    .build();

            isInitialized = true;
            System.out.println("RAG service initialized successfully with "
                    + segments.size() + " document segments");
        } catch (Exception e) {
            System.err.println("Failed to initialize RAG service: " + e.getMessage());
            e.printStackTrace();
        }
    }

    public ContentRetriever getContentRetriever() {
        return contentRetriever;
    }

    public boolean isInitialized() {
        return isInitialized;
    }
}
Create a document-aware AI service using the actual RagConfiguration:
@Configuration
public class RagConfiguration {

    @Bean
    public WellArchitectedAssistant wellArchitectedAssistant(ChatModel chatModel, RagService ragService) {
        if (!ragService.isInitialized()) {
            // Return a fallback implementation if RAG is not initialized
            return question -> "RAG service is not available. Please ensure the wellarchitected-framework.pdf file is in your home directory.";
        }

        return AiServices.builder(WellArchitectedAssistant.class)
                .chatModel(chatModel)
                .contentRetriever(ragService.getContentRetriever())
                .build();
    }
}
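The WellArchitectedAssistant interface itself is not shown in the snippets above; based on how it is used, a minimal version could look like this (the system message text is my assumption):

// Single abstract method, so the fallback lambda in RagConfiguration compiles.
public interface WellArchitectedAssistant {

    @SystemMessage("You are an AWS solutions architect. Answer using the retrieved excerpts from the AWS Well-Architected Framework.")
    String chatWithArchitect(String question);
}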
Testing RAG
With RAG, your AI can now answer questions like:
- “What are the key architecture principles to scale applications?”
- “What are the pillars of AWS Well-Architected Framework?”
curl -X POST \
  -H "Content-Type: text/plain" \
  -d "What are the pillars of the AWS Well-Architected Framework?" \
  http://localhost:8080/api/rag
The LLM will now search your already-indexed documents, find the relevant sections, and provide accurate answers based on your actual content.
Wrap-up!
LangChain4j is a strong addition to the Java developer community’s toolkit. The code repo is at https://github.com/thomasma/langchain4j, and additional recommended reading is at https://docs.langchain4j.dev/intro/.