In this example, inspired by Building Agents with AWS: Complete Tutorial, we will build a simple AI agent application using Spring AI, highlighting key features like:
- Chat Client API and its Advisors
- Model Context Protocol (MCP)
- Retrieval Augmented Generation (RAG)
- Testing with AI Model Evaluation 🤩
The original example uses AWS Bedrock, but one of the great things about Spring AI is that with just a few config tweaks and dependency changes, the same code works with any other supported model. In our case, we’ll use Ollama, which will hopefully let us run locally and in CI without heavy hardware requirements 🙏
The application features an AI agent that helps users book accommodations in tourist destinations.
Through MCP, the agent can use the following tools:
- Clock Tool: Provides the current date.
- Weather Tool: Retrieves weather information for a specific city and date.
- Booking Tool: Books accommodations in a city for a specific date.
The Clock and Weather tools will be implemented locally using MCP, while the Booking tool will be provided by a remote MCP server. Additional information about cities will be retrieved from a vector store using RAG.
As is often the case with Spring Boot, implementing the MCP Server is pretty straightforward. Following the MCP Server Boot Starter guide, you just need to:
- Add the spring-ai-starter-mcp-server-webflux or spring-ai-starter-mcp-server-webmvc dependency
- Create an instance and annotate it with @Tool and @ToolParam:
@Service // or @Bean / @Component
class BookingTool(private val bookingService: BookingService) {
    @Tool(
        description = "make a reservation for accommodation for a given city and date",
    )
    fun book(
        @ToolParam(description = "the city to make the reservation for")
        city: String,
        @ToolParam(description = "the check-in date, when the guest begins their stay")
        checkinDate: LocalDate,
        @ToolParam(description = "the check-out date, when the guest ends their stay")
        checkoutDate: LocalDate
    ): String = bookingService.book(city, checkinDate, checkoutDate) // Delegate to a service
}
- Register it as a MethodToolCallbackProvider:
@Configuration
class BookingToolConfiguration {
    @Bean
    fun bookingToolCallbackProvider(bookingTool: BookingTool) =
        MethodToolCallbackProvider.builder()
            .toolObjects(bookingTool)
            .build()
}
The Chat Server is a Spring Boot application built with the following dependencies:
- spring-boot-starter-web or -webflux - to expose a REST API for the chat interface
- spring-ai-starter-mcp-client - to use MCP
- spring-ai-starter-vector-store-pgvector and spring-ai-advisors-vector-store - to enable RAG with PGVector
- spring-ai-starter-model-ollama - to use Ollama models
- MCP Tools
  - Weather Tool - a local MCP tool that queries a WeatherService for the weather in a given city on a given date
  - Clock Tool - a local MCP tool that returns the system date, letting us control the date the AI agent uses to avoid unpredictability
  - Booking Tool - a remote MCP tool that connects to the Booking MCP Server to reserve accommodations
- Chat
  - Chat Client - a Spring AI ChatClient configured with:
    - A system prompt to define the AI agent’s role
    - The Ollama model autoconfigured by Spring Boot
    - The above MCP tools as part of the AI agent’s toolset
  - Chat Service - wraps the Chat Client and adds three advisors:
    - QuestionAnswerAdvisor - fetches context from a vector store and augments the user input (RAG)
    - PromptChatMemoryAdvisor - adds conversation history to the user input (chat memory)
    - SimpleLoggerAdvisor - logs the chat history to the console (for debugging)
  - Chat Controller - exposes a simple REST POST endpoint that takes user input, calls the Chat Service, and returns the AI agent’s response
- Vector Store Initializer - loads some sample data into the vector store at startup
Let's implement this step by step ...
Here's how the Weather Tool is implemented (the same applies to Clock Tool):
- Create an instance and annotate it with @Tool and @ToolParam:
@Service // or @Bean / @Component
class WeatherTool(private val weatherService: WeatherService) {
    @Tool(description = "get the weather for a given city and date")
    fun getWeather(
        @ToolParam(description = "the city to get the weather for")
        city: String,
        @ToolParam(description = "the date to get the weather for")
        date: LocalDate
    ): String = weatherService.getWeather(city, date) // Delegate to a service
}
- Register it as a MethodToolCallbackProvider:
@Configuration
class WeatherToolConfiguration {
    @Bean
    fun weatherToolCallbackProvider(weatherTool: WeatherTool) =
        MethodToolCallbackProvider.builder()
            .toolObjects(weatherTool)
            .build()
}
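The Clock Tool itself is not shown here. A minimal sketch of its core logic, assuming it simply wraps an injected java.time.Clock, might look like the following. The names ClockTool and getDate are hypothetical, and the Spring AI annotations are only mentioned in comments so that the snippet runs on its own:

```kotlin
import java.time.Clock
import java.time.Instant
import java.time.LocalDate
import java.time.ZoneOffset

// Hypothetical sketch: in the real application this class would be a Spring bean
// and getDate() would carry a @Tool(description = "...") annotation, as WeatherTool does.
class ClockTool(private val clock: Clock) {
    // Returns the current date according to the injected clock
    fun getDate(): LocalDate = LocalDate.now(clock)
}

fun main() {
    // A fixed clock makes the "current" date deterministic, which is handy in tests
    val fixedClock = Clock.fixed(Instant.parse("2025-04-15T10:00:00Z"), ZoneOffset.UTC)
    println(ClockTool(fixedClock).getDate()) // prints 2025-04-15
}
```

Injecting the Clock rather than calling LocalDate.now() directly is what lets a test swap in a fixed date.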
To set up the Booking Tool as a remote MCP tool, we just need to configure the MCP client SSE connection in application.yml:
spring:
  ai:
    mcp:
      client:
        sse:
          connections:
            booking-tool:
              url: http://localhost:8081
You can find all the alternative configurations in the MCP Client Boot Starter documentation.
We create the Chat Client using Spring AI's ChatClient.Builder, which is already autoconfigured via spring.ai configuration properties (we'll talk about that later in Configuration), and initialize it with a custom system prompt and the available MCP tools:
@Configuration
class ChatClientConfiguration {
    @Bean
    fun chatClient(
        builder: ChatClient.Builder,
        toolCallbackProviders: List<ToolCallbackProvider>
    ): ChatClient {
        return chatClientBuilder(builder, toolCallbackProviders).build()
    }
    private fun chatClientBuilder(
        builder: ChatClient.Builder,
        toolCallbackProviders: List<ToolCallbackProvider>
    ): ChatClient.Builder {
        val system = """
            You are an AI powered assistant to help people book accommodation in touristic cities around the world.
            If there is no information, then return a polite response suggesting you don't know.
            If the response involves a timestamp, be sure to convert it to something human-readable.
            Do not include any indication of what you're thinking.
            Use the tools available to you to answer the questions.
            Just give the answer.
        """.trimIndent()
        return builder
            .defaultSystem(system)
            .defaultTools(*toolCallbackProviders.toTypedArray())
    }
}
The Chat Service exposes a single chat method that takes a chat ID and a user question. It calls the Chat Client with the user question along with a set of advisors to enrich the interaction:
- QuestionAnswerAdvisor - retrieves relevant context from a vector store and injects it into the context (RAG)
- PromptChatMemoryAdvisor - retrieves or creates an InMemoryChatMemory for the given chat ID and adds it to the context
- SimpleLoggerAdvisor - logs internal advisor traces to the console (if logging.level.org.springframework.ai.chat.client.advisor is set to DEBUG)
Additionally, the question and answer are logged to the console.
Here’s the implementation:
@Service // or @Bean / @Component
class ChatService(
    vectorStore: VectorStore,
    private val chatClient: ChatClient
) {
    private val logger = LoggerFactory.getLogger(ChatService::class.java)
    private val questionAnswerAdvisor = QuestionAnswerAdvisor(vectorStore)
    private val simpleLoggerAdvisor = SimpleLoggerAdvisor()
    private val chatMemory = ConcurrentHashMap<String, PromptChatMemoryAdvisor>()
    fun chat(chatId: String, question: String): String? {
        val chatMemoryAdvisor = chatMemory.computeIfAbsent(chatId) {
            PromptChatMemoryAdvisor.builder(InMemoryChatMemory()).build()
        }
        return chatClient
            .prompt()
            .user(question)
            .advisors(questionAnswerAdvisor, chatMemoryAdvisor, simpleLoggerAdvisor)
            .call()
            .content().apply {
                logger.info("Chat #$chatId question: $question")
                logger.info("Chat #$chatId answer: $this")
            }
    }
}
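The per-chat memory lookup used above can be illustrated in isolation. This is a minimal sketch in plain Kotlin, where ConversationMemory and MemoryRegistry are hypothetical stand-ins rather than Spring AI classes. It only shows how computeIfAbsent gives each chat ID its own history:

```kotlin
import java.util.concurrent.ConcurrentHashMap

// Toy stand-in for a per-conversation memory (not a Spring AI class)
class ConversationMemory {
    private val messages = mutableListOf<String>()
    fun add(message: String) { messages += message }
    fun history(): List<String> = messages.toList()
}

class MemoryRegistry {
    private val memories = ConcurrentHashMap<String, ConversationMemory>()

    // Returns the existing memory for chatId, or creates one on first use
    fun memoryFor(chatId: String): ConversationMemory =
        memories.computeIfAbsent(chatId) { ConversationMemory() }
}

fun main() {
    val registry = MemoryRegistry()
    registry.memoryFor("1").add("I want to go to a city with a beach")
    registry.memoryFor("1").add("Book it for next Monday")
    registry.memoryFor("2").add("Hello!")
    println(registry.memoryFor("1").history().size) // prints 2
    println(registry.memoryFor("2").history().size) // prints 1
}
```

Note that a map like this never evicts entries, so a long-running service would probably want some expiry policy for idle chats.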
The Chat Controller exposes a simple REST POST endpoint that takes user input, calls the Chat Service, and returns the AI agent’s response:
@RestController
class ChatController(private val chatService: ChatService) {
    @PostMapping("/{chatId}/chat")
    fun chat(
        @PathVariable chatId: String,
        @RequestParam question: String
    ): String? {
        return chatService.chat(chatId, question)
    }
}
It's as simple as using Spring AI’s autoconfigured VectorStore and adding documents to it. This automatically invokes the embedding model to generate embeddings and store them in the vector store:
@Bean
fun vectorStoreInitializer(vectorStore: VectorStore) =
    ApplicationRunner {
        // TODO check if the vector store is empty ...
        // TODO load cities from a JSON file or any other source ...
        cities.forEach { city ->
            val document = Document(
                "name: ${city.name} " +
                    "country: ${city.country} " +
                    "description: ${city.description}"
            )
            vectorStore.add(listOf(document))
        }
    }
You can find the full version of vectorStoreInitializer in ChatServerApplication.kt.
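To give an intuition for what the vector store does with those embeddings at query time, here is a toy sketch of similarity search in plain Kotlin. It is not how PGVector or Spring AI implement it: ToyVectorStore, StoredDocument, and the 3-dimensional embeddings are made up for illustration, whereas real embeddings come from the embedding model.

```kotlin
import kotlin.math.sqrt

// Toy document with a precomputed embedding (hypothetical type)
data class StoredDocument(val text: String, val embedding: List<Double>)

// Cosine similarity: 1.0 means same direction, 0.0 means unrelated
fun cosineSimilarity(a: List<Double>, b: List<Double>): Double {
    require(a.size == b.size) { "Embeddings must have the same dimensions" }
    val dot = a.zip(b).sumOf { (x, y) -> x * y }
    val normA = sqrt(a.sumOf { it * it })
    val normB = sqrt(b.sumOf { it * it })
    return dot / (normA * normB)
}

class ToyVectorStore {
    private val documents = mutableListOf<StoredDocument>()

    fun add(document: StoredDocument) { documents += document }

    // Returns the topK documents most similar to the query embedding
    fun similaritySearch(query: List<Double>, topK: Int): List<StoredDocument> =
        documents.sortedByDescending { cosineSimilarity(it.embedding, query) }.take(topK)
}

fun main() {
    val store = ToyVectorStore()
    // Made-up embeddings with dimensions [beach, mountains, museums]
    store.add(StoredDocument("name: Barcelona country: Spain", listOf(0.9, 0.1, 0.6)))
    store.add(StoredDocument("name: Innsbruck country: Austria", listOf(0.0, 0.9, 0.3)))
    val beachQuery = listOf(1.0, 0.0, 0.2)
    println(store.similaritySearch(beachQuery, 1).first().text)
}
```

The require check mirrors why the vector store dimensions must match the embedding model: vectors of different sizes cannot be compared.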
In the main application.yml file, we define global configuration values:
- Set the active Spring profile to ollama, allowing us to configure specific properties for the Ollama model in the application-ollama.yml file.
- Configure the datasource to connect to a PostgreSQL database with PGVector support.
- Set the server port to 8080.
- Configure the URL of the remote Booking Tool MCP server.
- Set the logging level for chat advisor debug traces.
spring:
  profiles:
    active: "ollama"
  application:
    name: chat-server
  datasource:
    url: "jdbc:postgresql://localhost:5432/postgres"
    username: "postgres"
    password: "password"
    driver-class-name: org.postgresql.Driver
  ai:
    mcp:
      client:
        sse:
          connections:
            booking-tool:
              url: http://localhost:8081
server:
  port: 8080
logging:
  level:
    org.springframework.ai.chat.client.advisor: INFO
In the ollama profile configuration file, application-ollama.yml, we configure Spring AI to use Ollama models:
- Set the base URL for the Ollama server to http://localhost:11434.
- Set the chat model to llama3.1:8b (must be a tools-enabled model).
- Set the embedding model to nomic-embed-text.
- Use pull-model-strategy: when_missing to only pull models if they are not available locally.
- Configure PGVector as the vector store with 768 dimensions (matching the embedding model size).
spring:
  ai:
    ollama:
      base-url: "http://localhost:11434"
      init:
        pull-model-strategy: "when_missing"
      chat:
        enabled: true
        options:
          model: "llama3.1:8b"
      embedding:
        enabled: true
        model: "nomic-embed-text"
    vectorstore:
      pgvector:
        dimensions: 768
        initialize-schema: true
To test the MCP Server, we will use a McpClient to call the book method of the Booking Tool, mocking the downstream BookingService.
See the simplified test implementation below. For the complete implementation, including a test that verifies the list of available tools, refer to McpServerApplicationTest.kt.
@SpringBootTest(webEnvironment = RANDOM_PORT)
class McpServerApplicationTest {
    // 1. Inject the server port (it is random)
    @LocalServerPort
    val port: Int = 0
    // 2. Mock the BookingService instance
    @MockitoBean
    lateinit var bookingService: BookingService
    @Test
    fun `should book`() {
        // 3. Create a McpClient connected to the server
        val client = McpClient.sync(
            HttpClientSseClientTransport("http://localhost:$port")
        ).build()
        client.initialize()
        client.ping()
        // 4. Mock the bookingService using argument captors
        val bookResult = "Booking is done!"
        val cityCaptor = argumentCaptor<String>()
        val checkinDateCaptor = argumentCaptor<LocalDate>()
        val checkoutDateCaptor = argumentCaptor<LocalDate>()
        doReturn(bookResult)
            .whenever(bookingService)
            .book(
                cityCaptor.capture(),
                checkinDateCaptor.capture(),
                checkoutDateCaptor.capture()
            )
        // 5. Call the tool
        val city = "Barcelona"
        val checkinDate = "2025-04-15"
        val checkoutDate = "2025-04-18"
        val result = client.callTool(CallToolRequest(
            "book",
            mapOf(
                "city" to city,
                "checkinDate" to checkinDate,
                "checkoutDate" to checkoutDate
            )
        ))
        // 6. Verify the result
        assertThat(result.isError).isFalse()
        assertThat(result.content).singleElement()
            .isInstanceOfSatisfying(TextContent::class.java) {
                // TODO why is text double quoted?
                assertThat(it.text).isEqualTo("\"$bookResult\"")
            }
        // 7. Verify that the bookingService was called with
        // the correct parameters
        assertThat(cityCaptor.allValues).singleElement()
            .isEqualTo(city)
        assertThat(checkinDateCaptor.allValues).singleElement()
            .isEqualTo(LocalDate.parse(checkinDate))
        assertThat(checkoutDateCaptor.allValues).singleElement()
            .isEqualTo(LocalDate.parse(checkoutDate))
        // 8. Close the client
        client.close()
    }
}
To test the Chat Server, we will:
- Replace the remote Booking Tool with a local Booking Test Tool with the same signature.
  - Disable the MCP client with spring.ai.mcp.client.enabled: false in application-test.yml.
  - Create the local Booking Test Tool in BookingTestToolConfiguration.kt.
- Mock the downstream services Weather Service and Booking Service with MockitoBean.
- Create a fixed Clock to control the date in ClockTestToolConfiguration.kt.
  - Declare it as @Primary @Bean to override the default Clock bean.
- Start Docker Compose with both Ollama and PGVector using Testcontainers.
You might’ve noticed that the test doesn’t actually check the MCP client’s SSE connection, as it is disabled. I tried spinning up an McpServer using McpServerAutoConfiguration, and it almost worked. The problem? The client tries to connect before the server is up, which causes the whole application to fail on startup. Maybe it is just an ordering issue, and hopefully something that can be fixed in the future 🤞
Now for the interesting part, how do we test the AI agent’s response? This is where Evaluation Testing comes in:
One method to evaluate the response is to use the AI model itself for evaluation. Select the best AI model for the evaluation, which may not be the same model used to generate the response.
This aligns with the evaluation techniques described in Martin Fowler’s Evals GenAI pattern:
- Self-evaluation: The LLM evaluates its own response, but this can reinforce its own mistakes or biases.
- LLM as a judge: Another model scores the output, reducing bias by introducing a second opinion.
- Human evaluation: People manually review responses to ensure the tone and intent feel right.
To keep things simple, we’ll go with self-evaluation 🤓
Each test will follow this structure:
@Test
fun `should do something`() {
    // 1. Mock downstream service(s)
    // Optionally use argument captors depending on how you plan
    // to verify parameters in step 5
    // Example for BookingService:
    doReturn("Your booking is done!")
        .whenever(bookingTestService).book(any(), any(), any())
    // 2. Call the chat service
    val chatId = UUID.randomUUID().toString()
    val chatResponse = chatService.chat(
        chatId,
        "Can you book accommodation for Barcelona from 2025-04-15 to 2025-04-18?"
    )
    // 3. Evaluate the response using the AI model
    val evaluationResult = TestEvaluator(chatClientBuilder) { evaluationRequest, userSpec ->
        userSpec.text(
            """
            Your task is to evaluate if the answer given by an AI agent to a human user matches the claim.
            Return YES if the answer matches the claim and NO if it does not.
            After returning YES or NO, explain why.
            Assume that today is ${LocalDate.now(clock)}.
            Answer: {answer}
            Claim: {claim}
            """.trimIndent()
        )
            .param("answer", evaluationRequest.responseContent)
            .param("claim", evaluationRequest.userText)
    }.evaluate(EvaluationRequest(
        "Accommodation has been booked in Barcelona from 2025-04-15 to 2025-04-18",
        chatResponse
    ))
    // 4. Assert the evaluation result is successful, show feedback if not
    // (withFailMessage must be set before the assertion for it to take effect)
    assertThat(evaluationResult.isPass)
        .withFailMessage { evaluationResult.feedback }
        .isTrue()
    // 5. If applicable, verify the parameters passed to the service
    // You can verify with argument captors or use the `verify` method as in the example below:
    verify(bookingTestService).book(
        eq("Barcelona"),
        eq(LocalDate.parse("2025-04-15")),
        eq(LocalDate.parse("2025-04-18"))
    )
}
See the full test implementation in ChatServerApplicationTest.kt.
Each evaluation uses a custom prompt tailored to the specific response being tested, and as you experiment, you'll notice some surprisingly quirky behavior. That’s why I ended up creating a custom TestEvaluator, based on Spring AI’s RelevancyEvaluator and FactCheckingEvaluator, which may not yet offer the level of customization you might want.
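The real TestEvaluator lives in the repository and is not reproduced here. As a rough sketch of the two mechanical pieces such an evaluator needs (filling the {answer} and {claim} placeholders, and turning the model's reply into a pass or fail), the following plain Kotlin may help. Verdict, renderPrompt, and parseVerdict are hypothetical names, and the "reply starts with YES" rule is an assumption, not necessarily what the project does:

```kotlin
// Hypothetical result type; Spring AI has its own evaluation response class
data class Verdict(val pass: Boolean, val feedback: String)

// Fills the {answer} and {claim} placeholders of the evaluation prompt
fun renderPrompt(template: String, answer: String, claim: String): String =
    template.replace("{answer}", answer).replace("{claim}", claim)

// Assumption: the reply counts as a pass only if it starts with YES (case-insensitive).
// The whole reply is kept as feedback so a failing test can show the explanation.
fun parseVerdict(reply: String): Verdict {
    val trimmed = reply.trim()
    return Verdict(trimmed.startsWith("YES", ignoreCase = true), trimmed)
}

fun main() {
    val prompt = renderPrompt(
        "Answer: {answer}\nClaim: {claim}",
        answer = "Your booking is done!",
        claim = "Accommodation has been booked"
    )
    println(prompt)
    println(parseVerdict("YES. The answer confirms the booking.").pass) // prints true
    println(parseVerdict("No, the dates do not match.").pass) // prints false
}
```

Deciding on the first word alone is deliberately strict: a reply that hedges later in the explanation still counts by what it answered first.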
I had to adjust each prompt after running into odd results. For example, the evaluation model assumed it was still 2023 and refused to believe the AI agent could predict weather for 2025. Or it mistook "you" in the answer as referring to itself instead of the user. The weirdest? One evaluation just answered “NO” but the explanation said, “well, maybe I should have said YES” 🤣
For a production system, you'd definitely need a lot of prompt tuning and testing to get things right, for both the system and the evaluator. I suppose that’s part of the "fun" when working with GenAI.
This whole setup can run locally without needing powerful hardware, as the models are lightweight enough for a laptop. That said, it's slow. To speed things up for CI, I disabled all tests except for this basic one:
@Test
@EnabledIfCI
fun `should be up and running`() {
    val chatId = UUID.randomUUID().toString()
    val chatResponse = chatService.chat(chatId, "Hello!")
    assertThat(chatResponse).isNotNull() // Any response is good 👌
}
We run Ollama locally using docker compose, but you can also install it natively on your machine.
- Start MCP server
cd mcp-server
./gradlew bootRun
- Start docker compose
cd chat-server
docker compose up -d
- Start Chat Server
cd chat-server
./gradlew bootRun
- Execute queries
You can access the API at http://localhost:8080/swagger-ui.html or use any HTTP client of your choice. Below are some examples using curl:
curl -X POST "http://localhost:8080/2/chat" \
-H "Content-Type: application/x-www-form-urlencoded" \
-d "question=I want to go to a city with a beach. Where should I go?"
curl -X POST "http://localhost:8080/2/chat" \
-H "Content-Type: application/x-www-form-urlencoded" \
-d "question=How is the weather like in Madrid for the weekend?"
curl -X POST "http://localhost:8080/2/chat" \
-H "Content-Type: application/x-www-form-urlencoded" \
-d "question=Can I get a hotel for Berlin next monday for two nights?"
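If you would rather call the endpoint from code, the same form-encoded POST can be sketched with the JDK's built-in HTTP client (Java 11+). The helper name buildChatRequest is hypothetical, and the snippet assumes the chat ID is a simple value that needs no URL encoding:

```kotlin
import java.net.URI
import java.net.URLEncoder
import java.net.http.HttpClient
import java.net.http.HttpRequest
import java.net.http.HttpResponse
import java.nio.charset.StandardCharsets

// Builds the same form-encoded POST request as the curl examples
fun buildChatRequest(baseUrl: String, chatId: String, question: String): HttpRequest {
    val body = "question=" + URLEncoder.encode(question, StandardCharsets.UTF_8)
    return HttpRequest.newBuilder(URI.create("$baseUrl/$chatId/chat"))
        .header("Content-Type", "application/x-www-form-urlencoded")
        .POST(HttpRequest.BodyPublishers.ofString(body))
        .build()
}

fun main() {
    val request = buildChatRequest(
        "http://localhost:8080",
        "2",
        "I want to go to a city with a beach. Where should I go?"
    )
    // Requires the Chat Server to be running locally on port 8080
    val response = HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())
    println(response.body())
}
```

Reusing the same chat ID across calls keeps the conversation history, since the Chat Service keys its memory by that ID.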
To use any of the other AI models supported by Spring AI, follow these steps:
- Add the required dependencies
- Configure the model in its own application-<model>.yml file
- Activate the profile using spring.profiles.active=<model> in the main application.yml file
For example, check out the bedrock branch for the AWS Bedrock model configuration.
- Spring AI documentation
- Martin Fowler's GenAI patterns
- Inspired by sample project spring-ai-java-bedrock-mcp-rag
- Awesome Spring AI
Happy GenAI coding! 💙