AI is changing how we build software. Large Language Models (LLMs) like GPT, Claude, and others have evolved from research curiosities into practical tools that can understand natural language, write code, and solve complex problems. However, while Python developers have enjoyed a rich AI ecosystem built around frameworks such as LangChain, Java developers, who power most enterprise applications, have been left behind.
Enter LangChain4j, a comprehensive Java library that brings the full power of modern AI to the enterprise Java ecosystem. It’s not just a wrapper around API calls; it’s a full framework that leverages Java’s strengths and addresses enterprise requirements.
A Few Key Concepts
Large Language Models (LLMs)
LLMs are AI models pre-trained on vast amounts of data, which can then generate human-like responses. Think of them as incredibly sophisticated pattern-recognition systems that can answer questions based on their training, generate code, documentation, and image content, follow complex instructions, and maintain context across conversations.
Prompt Engineering
The quality of AI responses depends heavily on how you phrase your requests (aka prompts). LangChain4j provides tools to create reusable prompt templates, inject context dynamically, and define system-level behavior.
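As a quick illustration, a reusable template can be built with LangChain4j’s PromptTemplate class (a minimal sketch; the template text and the model variable are placeholders of my choosing):

// Reusable template: {{style}} and {{text}} are filled in at call time.
PromptTemplate template = PromptTemplate.from(
        "Summarize the following text in {{style}} style:\n{{text}}");

Prompt prompt = template.apply(
        Map.of("style", "bullet-point", "text", "LangChain4j brings LLMs to Java..."));

String response = model.chat(prompt.text()); // `model` is a configured ChatModel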
Chains
A chain is a predefined pipeline of calls (models, retrievers, other chains, etc.). It encapsulates a multi-step workflow into a single callable interface. For example, a chain might take an input question, pass it through an LLM, then use another model on the LLM’s output, and then decide to call an external Tool to get more information before finalizing the outputs. LangChain’s docs explain: “Chains should be used to encode a sequence of calls to components like models, document retrievers… and provide a simple interface to this sequence”. In LangChain4j, Chains work the same way: you can compose prompts, LLM calls, and retrievers into a structured pipeline.
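In LangChain4j, such pipelines are most commonly composed with the AiServices builder (shown later in this post). A minimal sketch, assuming a configured ChatModel named model and a ContentRetriever named retriever:

// Hypothetical single-method interface; the builder returns a generated
// implementation that retrieves context, builds the prompt, and calls the LLM.
interface PipelineAssistant {
    String answer(String question);
}

PipelineAssistant pipeline = AiServices.builder(PipelineAssistant.class)
        .chatModel(model)              // the LLM call step
        .contentRetriever(retriever)   // the retrieval step runs before the LLM call
        .build();

String result = pipeline.answer("How should I structure my service tiers?");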
Memory
Memory components let a chain or agent retain context across calls. In a chat, for example, memory loads past messages and saves new ones. LangChain docs summarize: “Memory maintains Chain state, incorporating context from past runs”.
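A minimal sketch of per-conversation memory in LangChain4j (assuming a configured ChatModel named model; the ChatAssistant interface is hypothetical):

// Sliding-window memory: keeps the 10 most recent messages and feeds them
// back into each request so the model retains conversational context.
interface ChatAssistant {
    String chat(String message);
}

ChatMemory memory = MessageWindowChatMemory.withMaxMessages(10);

ChatAssistant assistant = AiServices.builder(ChatAssistant.class)
        .chatModel(model)
        .chatMemory(memory)
        .build();

assistant.chat("My name is Alice.");
String reply = assistant.chat("What is my name?"); // the earlier turn is supplied as context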
Tools
LLMs can decide when to use external functions (or REST APIs, or database queries) to complete tasks. Instead of just generating a text response to a math prompt from its pre-trained knowledge, an LLM can choose to call the math functions you make available to it. This bridges the gap between AI reasoning and real-world actions. In the code example, we will walk through a simple case of giving the LLM access to some math functions, which it can then decide to call before responding to questions.
Retrieval Augmented Generation (RAG)
RAG enables LLMs to answer questions using specific documents and data unique to your business domain. Instead of relying solely on training data, the AI first searches your knowledge base and then uses that information to provide accurate, contextual responses. LangChain’s conceptual guide explains: “Vector stores are specialized data stores that enable indexing and retrieving information based on vector representations… used to search over unstructured data”. LangChain4j mirrors this and supports various Java-accessible vector databases (e.g., Pinecone, Redis) and retrievers. You can load documents, embed them, and then retrieve them on demand, just as you would in Python LangChain.
Why LangChain4j Matters for Enterprise Java
LangChain4j brings the above concepts into the JVM world with a unified, type-safe Java API. Under the hood, it offers a familiar interface over dozens of LLM providers and vector stores. For example, whether you use OpenAI, Anthropic, or a local LLM (Ollama), you code against the same interface, thus making it easier to swap providers. Similarly, all vector stores implement a shared interface, so RAG pipelines are portable too.
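For example, here is a sketch of swapping providers behind the shared ChatModel interface (the model names, URL, and the OllamaChatModel builder come from the respective LangChain4j modules; treat the exact values as placeholders):

// Both builders yield a ChatModel, so the rest of your code is provider-agnostic.
ChatModel openAi = OpenAiChatModel.builder()
        .apiKey(System.getenv("OPENAI_API_KEY"))
        .modelName("gpt-4o-mini")
        .build();

ChatModel local = OllamaChatModel.builder()   // from the langchain4j-ollama module
        .baseUrl("http://localhost:11434")
        .modelName("llama3")
        .build();

String answer = openAi.chat("Hello!");        // the same call works against `local`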
LangChain4j benefits…
- Pure Java libraries that fit naturally into existing codebases
- Support for popular frameworks such as Spring Boot and Quarkus. In the remainder of this blog, we will use Spring Boot with LangChain4j.
- Proper logging, error handling, and configuration management
- Type safety with compile-time checking for AI interactions
- Uses annotations and interfaces that Java developers already know and use
LangChain4j Features
1. Declarative AI Services with @AiService
The traditional approach to AI integration often involves managing HTTP clients, handling responses, and parsing JSON. LangChain4j introduces a declarative approach:
@AiService
public interface ShakespeareanAssistant {

    @SystemMessage("You are a polite assistant but respond in shakespearean english. If asked math questions show the calculations and also end with a funny joke at the end")
    String chat(String userMessage);
}
Benefits…
- Zero Boilerplate: Spring Boot automatically creates the implementation
- Type Safety: Compile-time checking ensures your interface is valid
- Testability: Easy to mock for unit tests
- Maintainability: AI behavior is clearly defined in one place
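Because the starter registers the generated implementation as a Spring bean, using it is plain constructor injection (the controller below is a hypothetical usage sketch):

@RestController
class PoetryController {

    private final ShakespeareanAssistant assistant;

    PoetryController(ShakespeareanAssistant assistant) {
        this.assistant = assistant; // implementation generated by LangChain4j
    }

    @PostMapping("/poetry")
    String chat(@RequestBody String message) {
        return assistant.chat(message);
    }
}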
2. Function Calling with @Tool
Function calling allows LLMs to execute your Java methods when needed:
@Service
public class ChatService {

    @Tool("Calculate the sum of two numbers")
    public double sum(double a, double b) {
        return a + b;
    }

    @Tool("Calculate the square root of a number")
    public double squareRoot(double x) {
        return Math.sqrt(x);
    }
}
How it works…
- User asks: “What’s the square root of 144?”
- The LLM recognizes it needs the squareRoot tool
- LangChain4j calls your method: squareRoot(144)
- The result is returned to the LLM and included in the response
Benefits…
- Accuracy: Real calculations instead of AI approximations
- Integration: Leverage existing business logic
- Auditability: Function calls are logged and traceable
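With the Spring Boot starter, beans exposing @Tool methods can be discovered for @AiService interfaces automatically. If you wire things by hand instead, a sketch using the programmatic AiServices builder (assuming the ChatService bean above and a configured ChatModel named model) looks like this:

// .tools(...) registers the @Tool-annotated methods of the given object,
// letting the model request sum(...) or squareRoot(...) during a chat.
ShakespeareanAssistant assistant = AiServices.builder(ShakespeareanAssistant.class)
        .chatModel(model)
        .tools(chatService)
        .build();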
3. Prompt Management
System Messages
@AiService
public interface MyAssistant {

    @SystemMessage("You are a polite assistant but respond in shakespearean english. If asked math questions show the calculations and also end with a funny joke at the end")
    String shakespeareanChat(String message);
}
User Message Templates
@AiService
public interface MyAssistant {

    @UserMessage("""
            My name is {{name}}. Respond with answers politely by addressing my name.
            Question:
            ```{{question}}```
            """)
    String chatWithPromptTemplate(@V("name") String name, @V("question") String question);
}
Benefits…
- Reusability: Same template for different inputs
- Maintainability: Prompts are separate from logic
- Type Safety: Parameters are checked at compile time
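Invoking the templated method is then an ordinary Java call, with the placeholders filled from the arguments:

String reply = myAssistant.chatWithPromptTemplate("Alice", "What is polymorphism in Java?");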
Building Your First LangChain4j Application
Let’s build a practical AI-powered application step by step.
Project Setup
Add LangChain4j to your Spring Boot project:
<dependency>
    <groupId>dev.langchain4j</groupId>
    <artifactId>langchain4j-open-ai-spring-boot-starter</artifactId>
    <version>1.0.1-beta6</version>
</dependency>
Configure your OpenAI API key:
langchain4j.open-ai.chat-model.api-key=${OPENAI_API_KEY}
langchain4j.open-ai.chat-model.log-requests=true
langchain4j.open-ai.chat-model.log-responses=true
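You can also pin the model through configuration; the property below follows the starter’s kebab-case convention (verify the exact name against your starter version):

langchain4j.open-ai.chat-model.model-name=gpt-4o-mini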
Creating Your First AI Service
Start with a simple conversational service using the MyAssistant interface:
@AiService
public interface MyAssistant {

    @SystemMessage("You are a polite assistant but respond in shakespearean english. If asked math questions show the calculations and also end with a funny joke at the end")
    String shakespeareanChat(String userMessage);
}
Adding Business Logic with Tools
Enhance your assistant with real functionality using the actual ChatService:
@Service
public class ChatService {

    private final ChatModel model;

    public ChatService(@Value("${openai.api.key}") String apiKey) {
        this.model = OpenAiChatModel.builder()
                .apiKey(apiKey)
                .modelName(GPT_4_O_MINI)
                .build();
    }

    // Basic chat
    public String chat(String userMessage) {
        return model.chat(userMessage);
    }

    // Image analysis using multimodal capabilities
    public String whatIsThisImage(String imageURL) {
        UserMessage userMessage = UserMessage.from(
                TextContent.from("What do you see?"),
                ImageContent.from(imageURL)
        );
        ChatResponse chatResponse = model.chat(userMessage);
        return chatResponse.aiMessage().text(); // extract the text of the AI message
    }

    @Tool("Sums 2 given numbers")
    double sum(double a, double b) {
        System.out.println("sum called");
        return a + b;
    }

    @Tool("Returns a square root of a given number")
    double squareRoot(double x) {
        System.out.println("squareRoot called");
        return Math.sqrt(x);
    }
}
Creating REST Endpoints
Expose your AI services via REST APIs using the actual ChatController:
@RestController
@RequestMapping("/api")
public class ChatController {

    private final ChatService chatService;
    private final MyAssistant myAssistant;
    private final WellArchitectedAssistant wellArchitectedAssistant;

    public ChatController(ChatService chatService, MyAssistant myAssistant,
                          WellArchitectedAssistant wellArchitectedAssistant) {
        this.chatService = chatService;
        this.myAssistant = myAssistant;
        this.wellArchitectedAssistant = wellArchitectedAssistant;
    }

    // Basic chat
    @PostMapping("/chat")
    public String chat(@RequestBody String message) {
        return chatService.chat(message);
    }

    // Shakespearean chat using @AiService
    @PostMapping("/shakchat")
    public String shakespeareanChat(@RequestBody String message) {
        return myAssistant.shakespeareanChat(message);
    }

    // Chat with prompt template
    @PostMapping("/template")
    public String chatWithPromptTemplate(@RequestParam String name, @RequestParam String question) {
        return myAssistant.chatWithPromptTemplate(name, question);
    }

    // RAG-powered chat about the AWS Well-Architected Framework
    @PostMapping("/rag")
    public String ragChat(@RequestBody String question) {
        return wellArchitectedAssistant.chatWithArchitect(question);
    }

    // Given an image URL, describe what the image is of
    @PostMapping(value = "/whatisimage", consumes = "application/json", produces = "text/plain")
    public String whatIsThisImage(@RequestBody ImageRequest request) {
        return chatService.whatIsThisImage(request.getUrl());
    }
}
Testing Your Application
Ensure that the OS environment variable OPENAI_API_KEY is set, as this is required to make API calls to OpenAI.
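On macOS or Linux, for example:

export OPENAI_API_KEY=<your-openai-api-key>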
Start your application and test it with curl commands against the REST endpoints defined in ChatController. Startup will take about a minute (a lifetime for any production service, but acceptable for this blog’s illustration): at startup, the code reads the PDF and loads it into an in-memory embedding store for later retrieval by the LLM.
mvn clean spring-boot:run

curl -X POST \
  -H "Content-Type: text/plain" \
  -d "What's the square root of 144?" \
  http://localhost:8080/api/chat
The expected response is “The square root of 144 is 12.0”.
Test the Shakespearean-style responses of MyAssistant:
curl -X POST \
  -H "Content-Type: text/plain" \
  -d "What is 25 + 17?" \
  http://localhost:8080/api/shakchat
Test image analysis:
curl -X POST \
  -H "Content-Type: application/json" \
  -d '{"url": "https://www.cubesnjuliennes.com/wp-content/uploads/2020/08/Best-Indian-Punjabi-Samosa-1.jpg"}' \
  http://localhost:8080/api/whatisimage
Document-Based AI with RAG
One of LangChain4j’s most powerful features is RAG (Retrieval-Augmented Generation), which enables your AI to answer questions using your own knowledge base. In this code example, we will ingest the contents of the AWS Well-Architected Framework PDF. When asked architecture questions, the LLM can consult this indexed repository instead of defaulting to its trained knowledge. In practice, you would substitute your own proprietary data sets for the document chosen for this code sample.
Understanding RAG Architecture
RAG works in two phases:
Ingestion Phase:
- Document Loading: Parse PDFs, Word docs, web pages
- Text Splitting: Break documents into manageable chunks
- Embedding Generation: Convert text chunks to numerical vectors
- Vector Storage: Store embeddings for fast similarity search
Query Phase:
- Question Embedding: Convert the user’s question to a vector
- Similarity Search: Find the most relevant document chunks
- Context Injection: Add relevant chunks to the prompt
- AI Generation: LLM generates an answer using the retrieved context
Setting Up RAG
Add RAG dependencies:
<dependency>
    <groupId>dev.langchain4j</groupId>
    <artifactId>langchain4j-document-parser-apache-pdfbox</artifactId>
    <version>1.0.1-beta6</version>
</dependency>
<dependency>
    <groupId>dev.langchain4j</groupId>
    <artifactId>langchain4j-embeddings-all-minilm-l6-v2</artifactId>
    <version>1.0.1-beta6</version>
</dependency>
Create a document processing service using the actual RagService:
@Service
public class RagService {

    private ContentRetriever contentRetriever;
    private EmbeddingModel embeddingModel;
    private boolean isInitialized = false;

    @PostConstruct
    public void initialize() {
        try {
            // Initialize embedding model
            embeddingModel = new AllMiniLmL6V2EmbeddingModel();

            // Load PDF document from home directory
            String pdfPath = System.getProperty("user.home") + "/wellarchitected-framework.pdf";

            // Check if file exists
            if (!Paths.get(pdfPath).toFile().exists()) {
                System.err.println("PDF file not found at: " + pdfPath);
                return;
            }

            Document document = FileSystemDocumentLoader.loadDocument(
                    pdfPath, new ApachePdfBoxDocumentParser());

            // Split document into chunks
            DocumentSplitter splitter = DocumentSplitters.recursive(500, 100);
            List<TextSegment> segments = splitter.split(document);

            // Create embedding store and ingest segments
            EmbeddingStore<TextSegment> embeddingStore = new InMemoryEmbeddingStore<>();
            EmbeddingStoreIngestor ingestor = EmbeddingStoreIngestor.builder()
                    .documentSplitter(splitter)
                    .embeddingModel(embeddingModel)
                    .embeddingStore(embeddingStore)
                    .build();
            ingestor.ingest(document);

            // Create content retriever
            contentRetriever = EmbeddingStoreContentRetriever.builder()
                    .embeddingStore(embeddingStore)
                    .embeddingModel(embeddingModel)
                    .maxResults(5)
                    .minScore(0.6)
                    .build();

            isInitialized = true;
            System.out.println("RAG service initialized successfully with "
                    + segments.size() + " document segments");
        } catch (Exception e) {
            System.err.println("Failed to initialize RAG service: " + e.getMessage());
            e.printStackTrace();
        }
    }

    public ContentRetriever getContentRetriever() {
        return contentRetriever;
    }

    public boolean isInitialized() {
        return isInitialized;
    }
}
Create a document-aware AI service using the actual RagConfiguration:
@Configuration
public class RagConfiguration {

    @Bean
    public WellArchitectedAssistant wellArchitectedAssistant(ChatModel chatModel, RagService ragService) {
        if (!ragService.isInitialized()) {
            // Return a fallback implementation if RAG is not initialized
            return question -> "RAG service is not available. Please ensure the wellarchitected-framework.pdf file is in your home directory.";
        }

        return AiServices.builder(WellArchitectedAssistant.class)
                .chatModel(chatModel)
                .contentRetriever(ragService.getContentRetriever())
                .build();
    }
}
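The WellArchitectedAssistant interface itself is not shown in the snippets above; based on how it is used, a minimal version could look like this (the system message text is my assumption):

// Single abstract method, so the fallback lambda in RagConfiguration compiles.
public interface WellArchitectedAssistant {

    @SystemMessage("You are an AWS solutions architect. Answer using the retrieved excerpts from the AWS Well-Architected Framework.")
    String chatWithArchitect(String question);
}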
Testing RAG
With RAG, your AI can now answer questions like:
- “What are the key architecture principles to scale applications?”
- “What are the pillars of AWS Well-Architected Framework?”
curl -X POST \
  -H "Content-Type: text/plain" \
  -d "What are the pillars of the AWS Well-Architected Framework?" \
  http://localhost:8080/api/rag
The LLM will now search your already-indexed documents, find the relevant sections, and provide accurate answers based on your actual content.
Wrap-up!
LangChain4j is a strong addition to the Java developer community’s toolkit. The code repo is at https://github.com/thomasma/langchain4j, and additional recommended reading is at https://docs.langchain4j.dev/intro/.