Building Intelligent Chatbots with Spring AI: Implementing Conversational Memory
Context retention capability is the defining feature separating basic Q&A tools from true conversational AI systems. This comprehensive guide explores how to implement persistent memory in chatbots using Spring AI framework for natural human-machine dialogues.
1. Environment Setup and Technology Stack
Core Component Dependencies
The solution leverages:
- Spring Boot 3.5.0: Microservice framework
- Spring AI 1.0.0-M6: Core AI integration library
- Java 17: Primary development language
- Ollama: Local LLM runtime environment
Maven Configuration
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 https://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <parent>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-parent</artifactId>
        <version>3.5.0</version>
    </parent>
    <groupId>com.example</groupId>
    <artifactId>test</artifactId>
    <version>0.0.1-SNAPSHOT</version>
    <properties>
        <java.version>17</java.version>
        <spring-ai.version>1.0.0-M6</spring-ai.version>
    </properties>
    <dependencies>
        <!-- Ollama integration (version managed by the Spring AI BOM below) -->
        <dependency>
            <groupId>org.springframework.ai</groupId>
            <artifactId>spring-ai-ollama-spring-boot-starter</artifactId>
        </dependency>
        <!-- Spring Boot core -->
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-web</artifactId>
        </dependency>
        <!-- Spring AI core library -->
        <dependency>
            <groupId>org.springframework.ai</groupId>
            <artifactId>spring-ai-core</artifactId>
        </dependency>
    </dependencies>
    <dependencyManagement>
        <dependencies>
            <dependency>
                <groupId>org.springframework.ai</groupId>
                <artifactId>spring-ai-bom</artifactId>
                <version>${spring-ai.version}</version>
                <type>pom</type>
                <scope>import</scope>
            </dependency>
        </dependencies>
    </dependencyManagement>
</project>
Model Configuration (application.properties)
spring.ai.ollama.chat.options.model=mistral
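Only the model name is strictly required when Ollama runs on its default port, but it is often convenient to set the endpoint and sampling options explicitly. A slightly fuller configuration might look like this (`http://localhost:11434` is Ollama's default address; the temperature value is just an example):

```properties
spring.ai.ollama.base-url=http://localhost:11434
spring.ai.ollama.chat.options.model=mistral
spring.ai.ollama.chat.options.temperature=0.7
```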
2. Basic Chatbot Implementation (Without Memory)
Core Controller Code
@RestController
public class BasicChatController {

    private final ChatModel chatModel;

    // Inject the Ollama chat model
    public BasicChatController(@Qualifier("ollamaChatModel") ChatModel chatModel) {
        this.chatModel = chatModel;
    }

    @PostMapping("/chat")
    public String chat(@RequestBody String userMessage) {
        // Create a single-turn prompt
        Prompt prompt = new Prompt(new UserMessage(userMessage));
        // Get the model's response
        return chatModel.call(prompt).getResult().getOutput().getText();
    }
}
Functional Limitations
- Conversational Amnesia: Each request is treated as an isolated interaction
- Context Fragmentation:
  User: My name is John
  AI: Nice to meet you, John!
  User: What's my name?
  AI: Sorry, I don't know your name
- Unnatural Flow: Unable to handle contextual references
3. Implementing Conversation Memory
Enhanced Memory Controller
@RestController
public class MemoryChatController {

    private final ChatModel chatModel;

    // Store the full conversation history (both user and assistant turns)
    private final List<Message> messageHistory = Collections.synchronizedList(new ArrayList<>());

    public MemoryChatController(@Qualifier("ollamaChatModel") ChatModel chatModel) {
        this.chatModel = chatModel;
    }

    @PostMapping("/chat")
    public String chat(@RequestBody String userMessage) {
        // Add the user's turn to the history
        messageHistory.add(new UserMessage(userMessage));
        // Build a prompt carrying the full context
        Prompt prompt = new Prompt(new ArrayList<>(messageHistory));
        // Get a contextual response
        AssistantMessage reply = chatModel.call(prompt).getResult().getOutput();
        // Store the assistant's turn too, so the model can refer back to its own answers
        messageHistory.add(reply);
        return reply.getText();
    }
}
Implementation Workflow
graph LR
A[User Input] --> B{Create UserMessage}
B --> C[Store in messageHistory]
C --> D[Build Contextual Prompt]
D --> E[Send to LLM]
E --> F[Return Contextual Response]
Core Benefits of Memory
- Context Continuity
  Example conversation:
  User: I'm a DevOps engineer specializing in cloud infrastructure
  AI: Great! What cloud platforms do you work with?
  User: What's my specialization?
  AI: You specialize in cloud infrastructure
- Natural Dialogue Flow
  Supports contextual follow-ups:
  User: Recommend Python books
  AI: "Fluent Python" and "Python Crash Course"
  User: Which is better for beginners?
  AI: "Python Crash Course" is more beginner-friendly
4. Production Environment Considerations
1. Token and Context Window Management
Optimization Strategies:
- History Summarization: Compress old conversations into a short summary
- Relevance Filtering: Prioritize key interactions
- Intelligent Truncation: Preserve the most recent dialogue turns
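As a concrete illustration of intelligent truncation, the sketch below keeps a sliding window of recent messages under a rough character budget. The class name and the characters-as-a-stand-in-for-tokens simplification are ours, not part of Spring AI; a production version would measure real tokens with the model's tokenizer.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Sliding-window history: keeps only the most recent messages that fit
// a rough character budget (a cheap stand-in for a real token count).
class SlidingWindowHistory {
    private final Deque<String> messages = new ArrayDeque<>();
    private final int maxChars;

    SlidingWindowHistory(int maxChars) {
        this.maxChars = maxChars;
    }

    void add(String message) {
        messages.addLast(message);
        trim();
    }

    // Drop the oldest messages until the total length fits the budget,
    // always keeping at least the latest message.
    private void trim() {
        int total = messages.stream().mapToInt(String::length).sum();
        while (total > maxChars && messages.size() > 1) {
            total -= messages.removeFirst().length();
        }
    }

    List<String> window() {
        return new ArrayList<>(messages);
    }
}
```

The same windowed list is what gets passed into the `Prompt`, so the model always sees the freshest context without the request growing unboundedly.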
2. Persistent Storage Solutions
Limitations of in-memory storage (ArrayList):
- Memory loss on server restart
- No multi-session support
- Lack of data durability
Production-Grade Architecture:
graph TB
A[Client Request] --> B[Chat Controller]
B --> C{Query Redis}
C -->|Existing session| D[Retrieve History]
C -->|New session| E[Create New Record]
D --> F[Build Prompt]
E --> F
F --> G[Call LLM]
G --> H[Save to Redis]
Recommended Technologies:
- Redis: In-memory database (millisecond response)
- MongoDB: Document database (flexible schema)
- PostgreSQL: Relational DB with JSONB support
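Whichever backend is chosen, the controller should depend only on a storage abstraction so the backend can be swapped without touching request handling. The interface and class names below are illustrative; the in-memory implementation stands in for a Redis- or database-backed one that would implement the same contract.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical storage abstraction: the chat controller talks only to
// this interface, so Redis, MongoDB, or PostgreSQL implementations can
// be substituted without changing the controller.
interface ConversationStore {
    void append(String sessionId, String message);
    List<String> history(String sessionId);
}

// Simple in-memory implementation, suitable for local development only.
class InMemoryConversationStore implements ConversationStore {
    private final Map<String, List<String>> sessions = new ConcurrentHashMap<>();

    @Override
    public void append(String sessionId, String message) {
        sessions.computeIfAbsent(sessionId, id -> new ArrayList<>()).add(message);
    }

    @Override
    public List<String> history(String sessionId) {
        // Return a copy so callers cannot mutate the stored history
        return List.copyOf(sessions.getOrDefault(sessionId, List.of()));
    }
}
```

Keying by `sessionId` is also what fixes the multi-session limitation above: each client gets its own isolated history rather than sharing one list.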
5. Performance Optimization Guide
Memory Management Strategies
Key Monitoring Metrics
- Tokens per Request
- Context Construction Time
- 90% History Utilization Rate
- Conversation Breakage Rate
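Tokens per request can be approximated cheaply for dashboard purposes: a common rule of thumb is that English text averages roughly four characters per token. The estimator below uses that heuristic; exact counts require the model's own tokenizer, and the class name is ours.

```java
// Rough token estimator for monitoring: assumes ~4 characters per token,
// a common heuristic for English text. Not a substitute for the model's
// real tokenizer when precise context-window accounting is needed.
class TokenEstimator {
    static int estimate(String text) {
        if (text == null || text.isBlank()) {
            return 0;
        }
        // Any non-blank text costs at least one token
        return Math.max(1, text.length() / 4);
    }
}
```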
6. Implementation Best Practices
Business Value of Memory
- User Experience: Human-like conversation flow
- Efficiency: Eliminate redundant information
- Intelligence: Handle complex dialogues
- Use Case Expansion: Support for customer service, education, etc.
Enterprise Implementation Guide
- Layered Storage Architecture:
  - Hot Data: Redis for recent conversations
  - Cold Data: PostgreSQL for historical storage
- Memory Management Middleware:
public class MemoryManager {
    private static final int MAX_HISTORY_ITEMS = 20;
    private static final int MAX_SESSIONS = 1000;

    // Access-ordered LinkedHashMap gives true LRU eviction of idle sessions
    private final Map<String, Deque<Message>> sessionMemories =
        Collections.synchronizedMap(new LinkedHashMap<>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, Deque<Message>> eldest) {
                return size() > MAX_SESSIONS;
            }
        });

    public void addMessage(String sessionId, Message message) {
        Deque<Message> history =
            sessionMemories.computeIfAbsent(sessionId, id -> new ArrayDeque<>());
        history.addLast(message);
        // Trim each session to its most recent turns
        while (history.size() > MAX_HISTORY_ITEMS) {
            history.removeFirst();
        }
    }

    public List<Message> getHistory(String sessionId) {
        Deque<Message> history = sessionMemories.get(sessionId);
        return history == null ? List.of() : new ArrayList<>(history);
    }
}
- Memory Effectiveness Evaluation:
  - Design continuity test scenarios
  - Monitor context retention rates
  - Implement user feedback mechanisms
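The hot/cold layering described above can be sketched as a tiered lookup: read from the hot store first and fall back to the cold archive on a miss. Plain maps stand in for Redis and PostgreSQL here, and all class and method names are illustrative, not a real Spring AI API.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Tiered lookup: a fast "hot" store for recent sessions backed by a
// durable "cold" archive. HashMaps stand in for Redis and PostgreSQL.
class TieredConversationStore {
    private final Map<String, String> hot = new HashMap<>();   // e.g. Redis
    private final Map<String, String> cold = new HashMap<>();  // e.g. PostgreSQL

    void saveRecent(String sessionId, String history) {
        hot.put(sessionId, history);
    }

    // Would be called by a background job when a session goes idle
    void archive(String sessionId) {
        String history = hot.remove(sessionId);
        if (history != null) {
            cold.put(sessionId, history);
        }
    }

    // Check the hot store first; fall back to the archive on a miss
    Optional<String> load(String sessionId) {
        String history = hot.get(sessionId);
        if (history == null) {
            history = cold.get(sessionId);
        }
        return Optional.ofNullable(history);
    }
}
```

The design choice worth noting is that archiving is invisible to readers: `load` answers the same way whether the session is still hot or already archived.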
Future Outlook: With emerging 128K+ token models, conversational AI will achieve true long-term memory. Spring AI continues to lower development barriers for intelligent chatbot systems within Java ecosystems.