Building Java Applications with LangChain4j & Spring

AI is changing how we build software. Large Language Models (LLMs) like GPT, Claude, and others have transformed from research curiosities into practical tools that can understand natural language, write code, and solve complex problems. However, while Python developers have enjoyed rich AI ecosystems such as LangChain, Java developers (who power most enterprise applications) have been left behind.

Enter LangChain4j, a comprehensive Java library that brings the full power of modern AI to the enterprise Java ecosystem. It’s not just a wrapper around API calls; it’s a full framework that plays to Java’s strengths and addresses enterprise requirements.

A Few Key Concepts

Large Language Models (LLMs)

LLMs are AI models pre-trained on vast amounts of data, which they use to generate human-like responses. Think of them as incredibly sophisticated pattern-recognition systems that can answer questions based on their training; generate code, documentation, and image content; follow complex instructions; and maintain context across conversations.

Prompt Engineering

The quality of AI responses depends heavily on how you phrase your requests (aka prompts). LangChain4j provides tools to create reusable prompt templates, inject context dynamically, and define system-level behavior.

Chains

A chain is a predefined pipeline of calls (models, retrievers, other chains, etc.). It encapsulates a multi-step workflow into a single callable interface. For example, a chain might take an input question, pass it through an LLM, then use another model on the LLM’s output, and then decide to call an external Tool to get more information before finalizing the outputs. LangChain’s docs explain: “Chains should be used to encode a sequence of calls to components like models, document retrievers… and provide a simple interface to this sequence”. In LangChain4j, Chains work the same way: you can compose prompts, LLM calls, and retrievers into a structured pipeline.
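As a sketch of the idea (class names such as ConversationalChain exist in LangChain4j but may shift between versions; an OpenAI key is assumed via the OPENAI_API_KEY environment variable), a chain composes a model and memory into one callable interface:

```java
import dev.langchain4j.chain.ConversationalChain;
import dev.langchain4j.memory.chat.MessageWindowChatMemory;
import dev.langchain4j.model.openai.OpenAiChatModel;

public class ChainSketch {
    public static void main(String[] args) {
        // Model + memory composed into a single pipeline with one entry point
        ConversationalChain chain = ConversationalChain.builder()
                .chatLanguageModel(OpenAiChatModel.withApiKey(System.getenv("OPENAI_API_KEY")))
                .chatMemory(MessageWindowChatMemory.withMaxMessages(10))
                .build();

        // One call runs the whole sequence: load memory, build prompt, call model, save memory
        String answer = chain.execute("What is LangChain4j?");
        System.out.println(answer);
    }
}
```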

Memory

Memory components let a chain or agent retain context across calls. In a chat, for example, memory loads past messages and saves new ones. LangChain docs summarize: “Memory maintains Chain state, incorporating context from past runs”.
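A minimal sketch of LangChain4j’s memory abstraction (MessageWindowChatMemory is a real class; the surrounding scaffolding here is illustrative):

```java
import dev.langchain4j.data.message.UserMessage;
import dev.langchain4j.memory.ChatMemory;
import dev.langchain4j.memory.chat.MessageWindowChatMemory;

public class MemorySketch {
    public static void main(String[] args) {
        // Keep only the last 10 messages so prompts stay within the context window
        ChatMemory memory = MessageWindowChatMemory.withMaxMessages(10);
        memory.add(UserMessage.from("My name is Ada."));
        // A chain or AI service configured with this memory loads these messages
        // on the next call and saves the new ones afterwards.
        System.out.println(memory.messages());
    }
}
```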

Tools

LLMs can decide when to use external functions (or REST APIs, or database queries) to complete tasks. Instead of generating a text answer to a math prompt purely from its pre-trained knowledge, an LLM can choose to call the math functions you make available to it. This bridges the gap between AI reasoning and real-world actions. In the code example, we will walk through providing the LLM access to some math functions, which it can then decide to call before responding to questions. For simplicity, I am using location functions in my code example.

Retrieval Augmented Generation (RAG)

RAG enables LLMs to answer questions using specific documents and data unique to your business domain. Instead of relying solely on training data, the AI first searches your knowledge base and then uses that information to provide accurate, contextual responses. LangChain’s conceptual guide explains: “Vector stores are specialized data stores that enable indexing and retrieving information based on vector representations… used to search over unstructured data”. LangChain4j mirrors this and supports various Java-accessible vector databases (e.g., Pinecone, Redis) and retrievers. You can load documents, embed them, and then retrieve them on demand, just as you would in Python LangChain.

Why LangChain4j Matters for Enterprise Java

LangChain4j brings the above concepts into the JVM world with a unified, type-safe Java API. Under the hood, it offers a familiar interface over dozens of LLM providers and vector stores. For example, whether you use OpenAI, Anthropic, or a local LLM (Ollama), you code against the same interface, thus making it easier to swap providers. Similarly, all vector stores implement a shared interface, so RAG pipelines are portable too.

LangChain4j benefits…

  1. Pure Java libraries that fit naturally into existing codebases
  2. Support for popular frameworks such as Spring Boot and Quarkus. In the remainder of this blog, we will use Spring Boot with LangChain4j.
  3. Proper logging, error handling, and configuration management
  4. Type safety with compile-time checking for AI interactions
  5. Uses annotations and interfaces that Java developers already know and use

LangChain4j Features

1. Declarative AI Services with @AiService

The traditional approach to AI integration often involves managing HTTP clients, handling responses, and parsing JSON. LangChain4j introduces a declarative approach:
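A minimal sketch of the declarative style (the interface name Assistant and its prompt are illustrative; @AiService comes from the LangChain4j Spring Boot starter, and exact package names may vary by version):

```java
import dev.langchain4j.service.SystemMessage;
import dev.langchain4j.service.spring.AiService;

// Spring Boot (with langchain4j-spring-boot-starter) generates the
// implementation of this interface at startup; inject it like any other bean.
@AiService
public interface Assistant {

    @SystemMessage("You are a helpful assistant. Answer concisely.")
    String chat(String userMessage);
}
```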

Benefits…

  • Zero Boilerplate: Spring Boot automatically creates the implementation
  • Type Safety: Compile-time checking ensures your interface is valid
  • Testability: Easy to mock for unit tests
  • Maintainability: AI behavior is clearly defined in one place

2. Function Calling with @Tool

Function calling allows LLMs to execute your Java methods when needed:
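For example, a math tool might look like this (a sketch; @Tool is LangChain4j’s real annotation, while the class name and description text are illustrative):

```java
import dev.langchain4j.agent.tool.Tool;
import org.springframework.stereotype.Component;

@Component
public class MathTools {

    // The LLM sees this description and can choose to invoke the method
    // instead of approximating the answer from its training data.
    @Tool("Returns the square root of a number")
    public double squareRoot(double x) {
        return Math.sqrt(x);
    }
}
```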

How it works…

  1. User asks: “What’s the square root of 144?”
  2. LLM recognizes it needs the squareRoot tool
  3. LangChain4j calls your method: squareRoot(144)
  4. The result is returned to the LLM and included in the response

Benefits…

  • Accuracy: Real calculations instead of AI approximations
  • Integration: Leverage existing business logic
  • Auditability: Function calls are logged and traceable

3. Prompt Management

LangChain4j supports sophisticated prompt templates using both system and user messages:

System Messages

System messages define the AI’s behavior and personality:
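For instance (a sketch; the persona and interface name are illustrative):

```java
import dev.langchain4j.service.SystemMessage;
import dev.langchain4j.service.spring.AiService;

@AiService
public interface SupportAssistant {

    // The system message sets persistent behavior for every call to this service
    @SystemMessage("You are a polite customer-support agent. Never reveal internal details.")
    String chat(String question);
}
```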

User Message Templates

User message templates parameterize the per-request prompt, injecting dynamic values into a reusable template:
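A sketch using LangChain4j’s @UserMessage and @V annotations (the translation use case is illustrative):

```java
import dev.langchain4j.service.UserMessage;
import dev.langchain4j.service.V;
import dev.langchain4j.service.spring.AiService;

@AiService
public interface TranslationAssistant {

    // {{text}} and {{language}} are filled from the annotated parameters;
    // a misnamed variable fails fast when the service proxy is created.
    @UserMessage("Translate the following text into {{language}}: {{text}}")
    String translate(@V("text") String text, @V("language") String language);
}
```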

Benefits…

  • Reusability: Same template for different inputs
  • Maintainability: Prompts are separate from logic
  • Type Safety: Parameters are checked at compile time

Building Your First LangChain4j Application

Let’s build a practical AI-powered application step by step.

Project Setup

Add LangChain4j to your Spring Boot project:
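Something along these lines in your pom.xml (artifact IDs are the starters’ real names; check Maven Central for the current version):

```xml
<!-- Core Spring Boot integration plus the OpenAI model starter -->
<dependency>
    <groupId>dev.langchain4j</groupId>
    <artifactId>langchain4j-spring-boot-starter</artifactId>
    <version>${langchain4j.version}</version>
</dependency>
<dependency>
    <groupId>dev.langchain4j</groupId>
    <artifactId>langchain4j-open-ai-spring-boot-starter</artifactId>
    <version>${langchain4j.version}</version>
</dependency>
```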

Configure your OpenAI API key:
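In application.properties, something like the following (property names are those used by the OpenAI starter; verify them against the LangChain4j docs for your version, and note the model name here is just an example):

```properties
# Read the key from the OPENAI_API_KEY environment variable rather than hardcoding it
langchain4j.open-ai.chat-model.api-key=${OPENAI_API_KEY}
langchain4j.open-ai.chat-model.model-name=gpt-4o-mini
```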

Creating Your First AI Service

Start with a simple conversational service using the MyAssistant:
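MyAssistant is the name used in this post’s repo; the system message below is an illustrative reconstruction of its Shakespearean persona:

```java
import dev.langchain4j.service.SystemMessage;
import dev.langchain4j.service.spring.AiService;

@AiService
public interface MyAssistant {

    // Every reply from this service is steered by the persona in the system message
    @SystemMessage("You are William Shakespeare. Answer every question in Elizabethan English.")
    String chat(String userMessage);
}
```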

Adding Business Logic with Tools

Enhance your assistant with real functionality using the actual ChatService:
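ChatService is the repo’s name; the tool wiring below is a sketch of how the Spring Boot starter exposes @Tool methods on Spring beans to the model:

```java
import dev.langchain4j.agent.tool.Tool;
import dev.langchain4j.service.spring.AiService;
import org.springframework.stereotype.Component;

// Tools are plain Spring beans; the starter makes their @Tool methods callable by the LLM
@Component
class MathTools {

    @Tool("Returns the square root of a number")
    double squareRoot(double x) {
        return Math.sqrt(x);
    }
}

// When asked a math question, the model can call squareRoot before answering
@AiService
public interface ChatService {
    String chat(String userMessage);
}
```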

Creating REST Endpoints

Expose your AI services via REST APIs using the actual ChatController:
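A sketch of the controller (the endpoint path and parameter name are illustrative; the repo’s ChatController may differ):

```java
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class ChatController {

    private final ChatService chatService;

    // The AI service implementation is generated by LangChain4j and injected by Spring
    public ChatController(ChatService chatService) {
        this.chatService = chatService;
    }

    @GetMapping("/chat")
    public String chat(@RequestParam String message) {
        return chatService.chat(message);
    }
}
```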

Testing Your Application

Ensure that the OS environment variable OPENAI_API_KEY is set, as this is required to make API calls to OpenAI.

Start your application and test it with curl commands against the REST endpoints defined in the ChatController. Startup will take a minute (a lifetime for any production service, but acceptable for this blog’s illustration) because, at startup, the code reads the PDF and loads it into an in-memory embedding store for retrieval by the LLM.
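For example, to exercise the square-root tool (the endpoint path is illustrative; use whatever your ChatController defines):

```shell
# URL-encoded: "What is the square root of 144?"
curl "http://localhost:8080/chat?message=What%20is%20the%20square%20root%20of%20144%3F"
```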

Expected response is “The square root of 144 is 12.0”

Test the MyAssistant Shakespearean style response assistant:
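Something like the following (the endpoint path is illustrative):

```shell
# Any question works; the reply should arrive in Elizabethan English
curl "http://localhost:8080/assistant?message=How%20is%20the%20weather%20today%3F"
```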

Test image analysis:
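Illustratively, assuming an endpoint that accepts an image URL for the model to describe:

```shell
# The endpoint path and parameter are assumptions; adapt to your controller
curl "http://localhost:8080/image?url=https%3A%2F%2Fexample.com%2Fphoto.png"
```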

Document-Based AI with RAG

One of LangChain4j’s most powerful features is RAG (Retrieval-Augmented Generation), which enables your AI to answer questions using your own knowledge base. In this code example, we will ingest the contents of the AWS Well-Architected Framework PDF. When asked Architecture questions, the LLM can refer to this indexed repository instead of defaulting to the LLM’s trained outputs. For now, assume that you will be using your own proprietary data sets in place of the document I chose to use in this code sample.

Understanding RAG Architecture

RAG works in two phases:

Ingestion Phase:

  1. Document Loading: Parse PDFs, Word docs, web pages
  2. Text Splitting: Break documents into manageable chunks
  3. Embedding Generation: Convert text chunks to numerical vectors
  4. Vector Storage: Store embeddings for fast similarity search

Query Phase:

  1. Question Embedding: Convert the user question to vector
  2. Similarity Search: Find the most relevant document chunks
  3. Context Injection: Add relevant chunks to the prompt
  4. AI Generation: LLM generates an answer using the retrieved context

Setting Up RAG

Add RAG dependencies:
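For an in-process setup, dependencies along these lines (a local embedding model plus a PDF parser; artifact IDs are real LangChain4j modules, versions omitted here):

```xml
<!-- In-JVM embeddings (no external service needed) and Apache PDFBox parsing -->
<dependency>
    <groupId>dev.langchain4j</groupId>
    <artifactId>langchain4j-embeddings-all-minilm-l6-v2</artifactId>
</dependency>
<dependency>
    <groupId>dev.langchain4j</groupId>
    <artifactId>langchain4j-document-parser-apache-pdfbox</artifactId>
</dependency>
```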

Create a document processing service using the actual RagService:
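RagService is the repo’s name; the body below is a sketch of the ingestion phase described above (exact package names, especially for the embedding model, vary between LangChain4j versions):

```java
import dev.langchain4j.data.document.Document;
import dev.langchain4j.data.document.loader.FileSystemDocumentLoader;
import dev.langchain4j.data.document.parser.apache.pdfbox.ApachePdfBoxDocumentParser;
import dev.langchain4j.data.document.splitter.DocumentSplitters;
import dev.langchain4j.data.segment.TextSegment;
import dev.langchain4j.model.embedding.onnx.allminilml6v2.AllMiniLmL6V2EmbeddingModel;
import dev.langchain4j.store.embedding.EmbeddingStoreIngestor;
import dev.langchain4j.store.embedding.inmemory.InMemoryEmbeddingStore;
import org.springframework.stereotype.Service;

import java.nio.file.Path;

@Service
public class RagService {

    private final InMemoryEmbeddingStore<TextSegment> store = new InMemoryEmbeddingStore<>();

    public void ingest(Path pdf) {
        // 1. Document loading: parse the PDF into text
        Document document = FileSystemDocumentLoader.loadDocument(pdf, new ApachePdfBoxDocumentParser());

        // 2-4. Split into chunks, embed each chunk, and store the vectors for similarity search
        EmbeddingStoreIngestor.builder()
                .documentSplitter(DocumentSplitters.recursive(300, 30))
                .embeddingModel(new AllMiniLmL6V2EmbeddingModel())
                .embeddingStore(store)
                .build()
                .ingest(document);
    }
}
```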

Create a document-aware AI service using the actual RagConfiguration:
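RagConfiguration is the repo’s name; a sketch of the wiring, where a ContentRetriever bean is picked up by the @AiService-annotated interface so relevant chunks are injected into each prompt:

```java
import dev.langchain4j.data.segment.TextSegment;
import dev.langchain4j.model.embedding.EmbeddingModel;
import dev.langchain4j.rag.content.retriever.ContentRetriever;
import dev.langchain4j.rag.content.retriever.EmbeddingStoreContentRetriever;
import dev.langchain4j.store.embedding.EmbeddingStore;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class RagConfiguration {

    // Query phase: embed the question, fetch the top matches, add them to the prompt
    @Bean
    ContentRetriever contentRetriever(EmbeddingStore<TextSegment> store, EmbeddingModel embeddingModel) {
        return EmbeddingStoreContentRetriever.builder()
                .embeddingStore(store)
                .embeddingModel(embeddingModel)
                .maxResults(3)
                .build();
    }
}
```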

Testing RAG

With RAG, your AI can now answer questions like:

  • “What are the key architecture principles to scale applications?”
  • “What are the pillars of AWS Well-Architected Framework?”

The LLM will now search your documents (that have already been indexed), find relevant sections, and provide accurate answers based on your actual content.
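For instance (the endpoint path is illustrative):

```shell
# URL-encoded: "What are the pillars of AWS Well-Architected Framework?"
curl "http://localhost:8080/rag?question=What%20are%20the%20pillars%20of%20AWS%20Well-Architected%20Framework%3F"
```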

Wrap-up!

LangChain4j represents a strong addition to the Java developer community’s toolkit. The Git code repo is at https://github.com/thomasma/langchain4j. Additional recommended reading: https://docs.langchain4j.dev/intro/.
