foojay – a place for friends of OpenJDK

MongoDB as a Vector Database for AI Agents-MongoDB

Aasawari Sahasrabuddhe — Thu, 04 Jun 2026 10:00:00 +0000

Table of Contents

Why should you use MongoDB for building AI agents?
Understanding AI agents
Building a multi-agent application with MongoDB

Step 1: Creating a vector search index
Step 2: Creating the Trip
Step 3: Induce a disruption
Step 4: Replanning
Step 5: The Memory agents make use of vector search.
Step 6: Booking agent generates options
Step 7: The system finally makes the decision

Conclusion

Modern artificial intelligence systems are continually evolving. Large Language Models (LLMs) have become the backbone of modern applications, powering everything from conversational interfaces to more deeply integrated content features. However, LLMs are stateless: they have no memory and cannot retain context across interactions. These limitations led to the rise of AI agents, which go beyond simple prompt-response interactions into more autonomous, task-oriented workflows.

These agents are not just model invocations; rather, they are an orchestration layer that combines reasoning with capabilities like retrieval, memory, and tool execution. While developing these agents, a database with the ability to store and retrieve semantically meaningful data is needed, which is where vector databases come into the picture.

A vector database stores data as embeddings: dense numerical representations of text, images, or other unstructured data. These embeddings capture semantic meaning, enabling similarity search instead of exact matching. With MongoDB Atlas, developers can generate embeddings, store them alongside application data, and run vector search on the same platform. This lets AI agents combine operational data with semantic retrieval, simplifying the architecture while improving performance.

In this blog post, we’ll build an AI agent in Java using MongoDB as our database, by storing user queries, documents, agent memory, and embeddings in a single place. We will understand how MongoDB simplifies the implementation of retrieval-augmented generation and persistent memory systems.

Why should you use MongoDB for building AI agents?

Vector store and Voyage AI support – MongoDB Atlas offers a developer-friendly ecosystem in which you can create vector embeddings, store them, and run vector search directly from the platform. This reduces the need for separate systems when building an enterprise application.
Hybrid Search – With MongoDB Atlas, you can add filters and additional conditions to a vector search query. Unlike specialized vector stores, MongoDB can run semantic (vector) and classic structured or keyword queries together.
Developer Ecosystem – MongoDB has been a developer-first database from the start and continues to be one, which makes it straightforward to integrate into your application.
Operational Efficiency – If you already use MongoDB, adding vector search avoids the need to introduce new infrastructure. It simplifies schema, transactions, and ops.

Understanding AI agents

While we are building AI agents, it is important to understand the core principles of embeddings, retrieval-augmented generation (RAG), and agentic architectures.

Vector embeddings, or simply embeddings, are dense numerical vector representations derived from text, audio, video, or any other form of unstructured data. These vectors reside in a high-dimensional space where semantic similarity is preserved, which means semantically similar inputs are located closer together according to measures such as cosine similarity or dot product.

Vector search uses this representation to retrieve the top-K most similar vectors, effectively performing semantic retrieval rather than exact matching. This is critical for handling paraphrasing, ambiguity, and contextual queries.
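To make top-K retrieval concrete, here is a minimal brute-force sketch in plain Java. It is not how MongoDB implements vector search (which relies on an index); the `Item` record, the three-dimensional vectors, and the ids are invented purely for illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class TopKSearch {

    /** A stored item: an id plus its embedding (hypothetical shape, for illustration). */
    record Item(String id, double[] embedding) {}

    /** Cosine similarity: dot(a, b) / (|a| * |b|). Assumes equal-length, non-zero vectors. */
    static double cosine(double[] a, double[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    /** Brute-force top-K: sort all items by descending similarity and keep the first k ids. */
    static List<String> topK(double[] query, List<Item> items, int k) {
        List<Item> sorted = new ArrayList<>(items);
        sorted.sort(Comparator.comparingDouble((Item it) -> cosine(query, it.embedding())).reversed());
        return sorted.stream().limit(k).map(Item::id).toList();
    }

    public static void main(String[] args) {
        List<Item> items = List.of(
                new Item("flight-delay", new double[] {0.9, 0.1, 0.0}),
                new Item("hotel-refund", new double[] {0.0, 0.2, 0.9}),
                new Item("flight-cancelled", new double[] {0.8, 0.3, 0.1}));
        double[] query = {1.0, 0.2, 0.0};
        System.out.println(topK(query, items, 2)); // prints [flight-delay, flight-cancelled]
    }
}
```

Real embeddings have hundreds or thousands of dimensions, and scanning every vector becomes too slow at scale, which is why a vector index is used in practice.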

Retrieval-augmented generation, or RAG, builds this retrieval step into the generation pipeline: the model uses the results of semantic search as context when generating responses. One of the most common challenges with standard LLMs is hallucination, or the generation of incorrect or fabricated information when relying solely on parametric knowledge stored in model weights. RAG addresses this by grounding responses in retrieved documents rather than depending only on internal weights. As a result, it improves factual consistency, traceability, and the freshness of responses.

Building on these ideas, the concept of agents came into the picture. In agentic architectures, vector search becomes a core abstraction for implementing memory systems:

Short-term memory: recent interaction history embedded and retrieved for conversational continuity
Long-term memory: persisted embeddings of past interactions, documents, and tool outputs
Semantic recall: retrieving context dynamically based on similarity rather than rigid keys

In these architectures, vector databases serve as both the retrieval and the storage layer. Vector search is therefore no longer just a tool for semantic search but a foundational building block for agentic systems. It underpins how agents retrieve knowledge, maintain memory, and produce contextually relevant, low-hallucination outputs in real-world applications.

Building a multi-agent application with MongoDB

Before we get into the actual code for building the agents, let's first understand a few basic prerequisites for building the application.

A free-tier MongoDB Atlas cluster.
A free Voyage AI API key to generate embeddings.
A Spring Boot project set up to work with MongoDB, created with Spring Initializr.
The latest Java and Gradle/Maven versions installed.

To build the multi-agent system, we are using a travel replanning system as an example.

Here is a scenario to better understand this system: You are traveling from Toronto to San Francisco with a layover in New York. And then reality happens. The flight between New York and San Francisco is delayed by 9 hours, and now you need a better plan, because you have a client meeting where you are due to showcase your product.

At this point, you do not need a system that merely suggests another flight, but one that helps you replan the entire trip. This is where the multi-agent replanning system comes in. It consists of the following agents:

A Monitoring Agent that detects disruptions
A Planner Agent that orchestrates decisions
A Booking Agent that finds alternative routes
A Budget Agent that filters based on cost
A Preference Agent that aligns with user choices
A Memory Agent that recalls similar past situations

Each agent is simple on its own. But together, they behave like a coordinated system.

What makes this system powerful is the use of MongoDB as the database. MongoDB stores the real-time trip data and records every event, while Voyage AI embeddings and MongoDB’s vector search capabilities are used to store past travel incidents and retrieve similar cases during replanning.

To build this system, we will use four collections: trip_state, event, agent_decision, and incident_memory. The trip_state collection stores the current state of the trip, and all disruptions are recorded in event. Every agent logs its reasoning in agent_decision, and incident_memory stores past incidents along with their embeddings.

Let's do this step by step.

Step 1: Creating a vector search index

Before we build the system, we need a vector search index. The embeddings in this project are produced by Voyage AI's voyage-3-large model.

Go to MongoDB Atlas, create a collection named incident_memory, and create a vector search index with the JSON below.

{
  "fields": [
    {
      "numDimensions": 1024,
      "path": "embedding",
      "similarity": "cosine",
      "type": "vector"
    }
  ]
}

Step 2: Creating the Trip

The trip is created with the following API call. This request lands in the controller. Because the request body is optional, we use a default CreateTripRequest when none is supplied and pass that normalized request into the service. So, normalized is just the incoming request or a default placeholder when the client omits the body.

@PostMapping("/create")
public TripState createTrip(@RequestBody(required = false) CreateTripRequest request) {
    CreateTripRequest normalized = request == null
            ? new CreateTripRequest("demo-user", null, null)
            : request;
    return tripService.createTrip(normalized);
}

The service layer then creates the trip. For example:

curl -X POST "http://localhost:8080/trip/create" \
  -H "Content-Type: application/json" \
  -d '{
    "userId": "traveler-001",
    "preferences": {
      "airlinePreference": "SkyJet",
      "avoidRedEye": true,
      "maxAdditionalBudget": 250
    }
  }'

This returns:

{
  "id": "69dd6111674d2228e4db4b25",
  "userId": "traveler-001",
  "itinerary": [
    {
      "segmentId": "SEG-1",
      "type": "FLIGHT",
      "provider": "SkyJet",
      "fromLocation": "JFK",
      "toLocation": "SFO",
      "cost": 420.0
    }
  ],
  "status": "ON_TRACK"
}

This trip gets stored in trip_state. At this point, everything looks fine.

Step 3: Induce a disruption

In this step, we record a delay in the database by calling another POST endpoint:

curl -X POST "http://localhost:8080/event/simulate-delay" \
  -H "Content-Type: application/json" \
  -d '{
    "tripId": "69dd6111674d2228e4db4b25",
    "delayMinutes": 180,
    "severity": "HIGH"
  }'

The request is handled by another method in the controller:

@PostMapping("/simulate-delay")
public TravelEvent simulateDelay(@RequestBody SimulateDelayRequest request)

And at the same time, something critical happens:

tripState.setStatus(TripStatus.DISRUPTED);
tripService.saveTrip(tripState);

This is the first agent in action: the Monitoring Agent detects the problem, updates the state, and logs the decision.

The following delay is simulated:

{
  "id": "69dd6160674d2228e4db4b26",
  "tripId": "69dd6111674d2228e4db4b25",
  "type": "FLIGHT_DELAY",
  "severity": "HIGH",
  "metadata": {
    "from": "JFK",
    "to": "SFO",
    "delayMinutes": 180
  }
}

Step 4: Replanning

To trigger replanning, the PlannerAgent orchestrates the other agents. It asks MemoryAgent for similar incidents using MongoDB Vector Search and asks BookingAgent for alternative routes; then BudgetAgent and PreferenceAgent refine those options before PlannerAgent commits the final itinerary.

The request enters the following controller method:

@PostMapping("/plan/replan")
public TripState replan(@RequestBody ReplanRequest request)

From here, the planner agent takes over. For example:

curl -X POST http://localhost:8080/plan/replan \
  -H "Content-Type: application/json" \
  -d '{
    "tripId": "69dd6111674d2228e4db4b25"
  }'

The response is:

{
  "id": "69dd6111674d2228e4db4b25",
  "status": "REPLANNED",
  "itinerary": [
    {
      "segmentId": "OPT-CHI-1",
      "fromLocation": "JFK",
      "toLocation": "ORD",
      "cost": 320.0
    },
    {
      "segmentId": "OPT-CHI-2",
      "fromLocation": "ORD",
      "toLocation": "SFO",
      "cost": 320.0
    }
  ]
}

Here the planner suggests rerouting through Chicago (ORD) instead.

Step 5: The Memory agents make use of vector search.

First, the planner agent checks with the Memory Agent: "Have we seen something like this?" If so, similar cases are retrieved from incident_memory and used to suggest what could be done.

List results = vectorSearchService.findSimilar(query);
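The body of findSimilar is not shown here. As a rough sketch of what such a lookup could send to MongoDB, the code below builds the shape of a `$vectorSearch` aggregation stage using plain Java maps. The index name `incident_vector_index` is a hypothetical placeholder; `embedding` matches the path in the index definition from Step 1. In a real application, the map would be converted to the driver's document type and run as an aggregation on incident_memory.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class VectorSearchStage {

    /**
     * Builds the shape of a $vectorSearch aggregation stage as plain maps.
     * The index name is an assumed placeholder, not taken from the project.
     */
    static Map<String, Object> build(List<Double> queryVector, int numCandidates, int limit) {
        Map<String, Object> options = new LinkedHashMap<>();
        options.put("index", "incident_vector_index"); // assumed index name
        options.put("path", "embedding");              // field holding the stored vectors
        options.put("queryVector", queryVector);       // embedding of the current incident
        options.put("numCandidates", numCandidates);   // candidates considered during the search
        options.put("limit", limit);                   // number of results returned
        return Map.of("$vectorSearch", options);
    }

    public static void main(String[] args) {
        // A real query vector would have 1024 dimensions to match the index definition.
        System.out.println(build(List.of(0.12, -0.03, 0.88), 100, 3));
    }
}
```

Setting numCandidates well above limit lets the search consider more candidates before returning the closest matches, trading some latency for better recall.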

Step 6: Booking agent generates options

Next, if memory alone does not yield a usable answer, the Booking Agent generates its own options:

List options =
    bookingAgent.generateOptions(tripState, latestEvent, memories);

The Budget Agent then filters those options:

List budgeted =
    budgetAgent.filterOptions(tripState, options);
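The internals of the Budget Agent are not shown, so the following is only a sketch of one plausible rule: keep an option when its extra cost over the original trip stays within maxAdditionalBudget. The `Option` record and the `OPT-PREMIUM` entry are invented for illustration; the other numbers come from the requests and responses in this walkthrough.

```java
import java.util.List;

public class BudgetFilterSketch {

    /** A candidate replacement itinerary with its total cost (hypothetical shape). */
    record Option(String id, double totalCost) {}

    /** Keeps options whose extra cost over the original trip fits within the allowed budget. */
    static List<Option> filterOptions(double originalCost, double maxAdditionalBudget, List<Option> options) {
        return options.stream()
                .filter(o -> o.totalCost() - originalCost <= maxAdditionalBudget)
                .toList();
    }

    public static void main(String[] args) {
        // From the walkthrough: the original JFK-SFO segment cost 420.0, the Chicago
        // reroute costs 320.0 + 320.0, and maxAdditionalBudget is 250.
        List<Option> options = List.of(
                new Option("OPT-CHI", 320.0 + 320.0),
                new Option("OPT-PREMIUM", 900.0)); // invented option that exceeds the budget
        System.out.println(filterOptions(420.0, 250, options));
    }
}
```

Under this rule, the Chicago reroute adds 220 to the original cost and is kept, while the invented 900 option adds 480 and is dropped.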

Step 7: The system finally makes the decision

Finally, the trip is updated, and the system records the reason for the decision. At this point, when you call:

curl http://localhost:8080/trip/69dd6111674d2228e4db4b25

you get the following response:

{
  "status": "REPLANNED",
  "itinerary": [
    {
      "fromLocation": "JFK",
      "toLocation": "ORD"
    },
    {
      "fromLocation": "ORD",
      "toLocation": "SFO"
    }
  ]
}

In the end, the system didn’t just detect a delay: it used memory, coordinated multiple agents, and produced a better plan with a fully traceable decision history stored in MongoDB.

The complete code for this multi-agent system is available on the GitHub repository.

Conclusion

In this post, we built a multi-agent system that is adaptive, stateful, and intelligent, all using MongoDB.

Starting from a simple travel itinerary, we saw how a disruption triggered a chain of coordinated actions across multiple agents. The Monitoring Agent detected the issue, the Memory Agent recalled similar past incidents using vector search, and the Planner Agent orchestrated Booking, Budget, and Preference Agents to arrive at a better alternative. Most importantly, every step of this process was persisted, making the system not just intelligent, but also explainable.

What makes this architecture powerful is the role of MongoDB as a unified data platform. Instead of splitting operational data and AI memory across separate systems, MongoDB brings them together. This allows agents to move beyond stateless execution and operate with context and experience.

The vector search capability of MongoDB enables the system to retrieve similar past situations and apply that knowledge to new problems, reducing guesswork and improving decision quality.

The post MongoDB as a Vector Database for AI Agents-MongoDB appeared first on foojay.

What is Sharding in MongoDB and When Should You Use It?

Nancy Agarwal — Tue, 02 Jun 2026 22:15:00 +0000

Table of Contents

A Practical Introduction to Horizontal Scaling
1. Shards
2. Config Servers
3. Mongos Router
Large datasets
High write throughput
Rapid data growth

A Practical Introduction to Horizontal Scaling

When building applications, most developers start with a single database server.

At the beginning, everything works perfectly.

Your application might have:

A few thousand users
Manageable traffic
Datasets that easily fit on one machine

But as your application grows, something interesting starts to happen.

Queries take longer.
Write operations slow down.
The database server starts hitting CPU, RAM, or storage limits.

At this stage, many engineers ask an important question:

Should we upgrade the server or scale the database differently?

This is where horizontal scaling and sharding come into the picture.

If you're using MongoDB, sharding is the mechanism that allows your database to scale beyond the limits of a single machine.

In this article, we'll walk through:

What sharding actually is
Why horizontal scaling matters
How MongoDB implements sharding
When you should (and shouldn’t) use it

The Scaling Problem Most Databases Face

Imagine your application stores user data in a database.

Initially, the architecture looks like this:

Application
    │
Database Server

All reads and writes go to one machine.

When that machine reaches its limits, the usual first response is vertical scaling: upgrading the same server by adding:

More CPU
More RAM
Faster storage

While this works for a while, vertical scaling eventually hits limits:

Hardware upgrades become expensive
There is always a maximum server size
Downtime may be required during upgrades

Eventually, a single server becomes a bottleneck.

Instead of making one machine bigger, the better approach is to add more machines.

This approach is called horizontal scaling.

What is Horizontal Scaling?

Horizontal scaling means distributing data across multiple servers rather than relying on a single server.

Instead of storing all data on a single machine:

Server A → 2 TB of data

You distribute the data:

Server A → 500 GB
Server B → 500 GB
Server C → 500 GB
Server D → 500 GB

Each server stores only part of the dataset.

This is exactly what sharding does.

What is Sharding in MongoDB?

Sharding is the process of splitting large datasets across multiple database servers.

Each server stores a portion of the data, called a shard.

For example, imagine an application storing millions of users.

Instead of keeping all users on one server:

Shard	Data
Shard 1	Users with IDs 1–1M
Shard 2	Users with IDs 1M–2M
Shard 3	Users with IDs 2M–3M

Each shard contains only a subset of the collection.

When queries come in, MongoDB determines which shard contains the relevant data.

This allows the database to handle massive datasets and high traffic efficiently.
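The range-based lookup described above can be sketched in a few lines of Java. This is a simplified illustration, not MongoDB's routing code: the class name and the exact range boundaries are assumptions chosen to mirror the example table.

```java
import java.util.Map;
import java.util.TreeMap;

public class RangeRouter {

    // Lower bound of each ID range mapped to the shard that owns it,
    // mirroring the example table (boundaries are illustrative).
    private static final TreeMap<Long, String> RANGES = new TreeMap<>(Map.of(
            1L, "Shard 1",
            1_000_001L, "Shard 2",
            2_000_001L, "Shard 3"));

    /** Returns the shard whose range starts at or below the given user ID. */
    static String shardFor(long userId) {
        Map.Entry<Long, String> entry = RANGES.floorEntry(userId);
        if (entry == null) {
            throw new IllegalArgumentException("No shard owns user ID " + userId);
        }
        return entry.getValue();
    }

    public static void main(String[] args) {
        System.out.println(shardFor(42));         // Shard 1
        System.out.println(shardFor(1_500_000));  // Shard 2
        System.out.println(shardFor(2_999_999));  // Shard 3
    }
}
```

The sketch has no upper bound, so any ID above the last boundary also lands on Shard 3. A real cluster tracks explicit ranges in its metadata and rebalances them as data grows.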

MongoDB Sharded Cluster Architecture

A sharded cluster in MongoDB consists of three main components: shards, config servers, and mongos routers.

1. Shards

Shards are where the actual data is stored.

Each shard is usually deployed as a replica set to ensure high availability and fault tolerance.

2. Config Servers

Config servers store metadata about the cluster.

They maintain information such as:

Which shard contains which data
How data is distributed
Shard key ranges

Without config servers, the cluster would not know where data lives.

3. Mongos Router

Applications do not connect directly to shards.

Instead, they connect to mongos, which acts as a query router.

Its responsibilities include:

Receiving application queries
Determining which shard contains the data
Forwarding the query to the correct shard

A simplified architecture looks like this:

     Application
          │
        Mongos
      /   |   \
Shard1  Shard2   Shard3

This abstraction means the application does not need to know where the data is stored.

Choosing a Shard Key

A shard key determines how data is distributed across shards.

For example:

{ userId: 1 }

MongoDB uses the shard key to decide which shard a document belongs to.

Choosing a shard key is one of the most critical decisions in a sharded architecture.

A good shard key should:

Distribute data evenly
Avoid hotspots
Support common query patterns

For example, if most queries are based on userId, using it as the shard key makes sense.

However, choosing something like country might create imbalanced shards if most users are from one region.
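The effect of a skewed shard key can be illustrated with a small simulation. The sketch below is a simplified stand-in rather than MongoDB's actual chunk placement: it hashes each candidate key into one of four buckets and reports the share of documents in the fullest bucket. The `User` record and the 90/10 country split are made-up test data.

```java
import java.util.List;
import java.util.function.Function;
import java.util.stream.IntStream;

public class ShardKeySkew {

    /** A user document reduced to the two candidate shard-key fields. */
    record User(int userId, String country) {}

    /**
     * Spreads users over shardCount buckets by hashing the chosen key and
     * returns the share of documents held by the fullest bucket.
     */
    static double largestShare(List<User> users, Function<User, Object> key, int shardCount) {
        int[] counts = new int[shardCount];
        for (User u : users) {
            counts[Math.floorMod(key.apply(u).hashCode(), shardCount)]++;
        }
        int max = IntStream.of(counts).max().orElse(0);
        return (double) max / users.size();
    }

    public static void main(String[] args) {
        // 1,000 made-up users: 90% from one country, the rest from another.
        List<User> users = IntStream.range(0, 1000)
                .mapToObj(i -> new User(i, i < 900 ? "IN" : "CA"))
                .toList();
        System.out.println("userId:  " + largestShare(users, User::userId, 4)); // userId:  0.25
        System.out.println("country: " + largestShare(users, User::country, 4)); // country: 0.9
    }
}
```

With userId as the key, each bucket holds a quarter of the documents. With country, one bucket holds 90% of them no matter how many shards are added, which is the hotspot problem a good shard key avoids.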

Creating a Sharded Collection

Let’s look at a simple example.

First, enable sharding for a database.

sh.enableSharding("companyDB")

Next, shard a collection.

sh.shardCollection(
  "companyDB.employees",
  { employeeId: 1 }
)

MongoDB will now automatically distribute documents across shards.

Querying Data in a Sharded Cluster

One of the nice things about sharding in MongoDB is that application queries remain the same.

For example:

db.employees.find(
  { department: "Engineering" },
  { name: 1, managerName: 1, departmentName: 1 }
)

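Because this filter does not include the shard key (employeeId), mongos cannot target a single shard: it sends the query to every shard and merges the results. When a query does include the shard key, mongos routes it only to the shard or shards that hold the matching documents. Either way, from the application's perspective, it still feels like one database.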

When Should You Use Sharding?

Sharding is powerful, but it should be introduced only when needed.

Here are common situations where sharding makes sense.

Large datasets

If your dataset grows into hundreds of gigabytes or terabytes, a single server may not be sufficient.

Examples include:

Analytics platforms
Log storage systems
IoT platforms

High write throughput

Applications that generate large numbers of writes can benefit from sharding because writes can be distributed across multiple nodes.

Examples include:

Event tracking systems
Gaming platforms
Social media feeds

Rapid data growth

If you expect your dataset to grow rapidly, designing the system with sharding in mind early can save major architectural changes later.

When Sharding Might Be Overkill

Despite its benefits, sharding adds operational complexity.

You probably don’t need sharding if:

Your dataset is relatively small
Your workload is moderate
Vertical scaling still works

Many applications run perfectly fine with replication and proper indexing.

Sharding should usually be considered after other scaling strategies have been exhausted.

Sharding vs Replication

Developers sometimes confuse these two concepts.

Feature	Replication	Sharding
Purpose	High availability	Horizontal scaling
Data	Same data on every node	Data split across nodes
Scaling	Can scale reads	Scales reads and writes
Storage	Data duplicated	Data distributed

In practice, MongoDB often uses both together.

Each shard is typically configured as a replica set, ensuring both scalability and fault tolerance.

Final Thoughts

Sharding is one of the most powerful scaling mechanisms available in MongoDB.

It allows databases to handle:

Massive datasets
High query throughput
Continuously growing applications

However, like most architectural decisions, it should be introduced carefully and intentionally.

Understanding your data access patterns and choosing the right shard key are essential for a successful sharded deployment.

If you’re building applications expected to scale to millions of users or terabytes of data, sharding becomes a key tool in your database architecture.


Jakarta EE is Ready for AI – But Don’t Just Take My Word for It!

Dominika Tasarz — Tue, 02 Jun 2026 11:41:01 +0000

Table of Contents

Where Jakarta EE Comes From and Where It's Headed

The Past, Present, and Future of Enterprise Java - Ivar Grimstad (Eclipse Foundation)

Jakarta EE Meets AI: Three Angles on the Same Problem

The Intelligent Monolith: Supercharging Jakarta EE with Local AI - Luqman Saeed (Azul)
Jakarta EE 11 Meets AI: Building Intelligent Microservices with Virtual Threads and Jakarta Data - Luqman Saeed (Azul)
Production-ready Agentic AI: Building Enterprise-grade Java Systems with Jakarta EE and MicroProfile - Kenji Kazumura (Fujitsu)

Getting the Fundamentals Right

API = Some REST and HTTP, right? RIGHT?! - Rustam Mehmandarov (Miles)

Back in April I had the pleasure of attending Open Community Experience 2026 in Brussels - the Eclipse Foundation's flagship open source conference. It's always good to be in a room (or a few rooms 😉 ) with people who really care about the technology they work with. Several of my colleagues and friends were speaking - watching them present work they've spent serious time on is one of the better parts of this community.

This post is a roundup of five talks I think belong well together. They don't cover the same topic but they tell a story about where enterprise Java is, where it's going and what it means to build serious software with it in 2026.

Where Jakarta EE Comes From and Where It's Headed

The Past, Present, and Future of Enterprise Java - Ivar Grimstad (Eclipse Foundation)

If you're going to watch one talk from OCX26 to orient yourself before watching the others, make it this one. Ivar traces the full arc from J2EE's famously painful complexity, through the birth of the Spring framework and its eventual influence on the platform itself, all the way to where Jakarta EE is today.

He is really good at explaining why Jakarta EE looks the way it does: every simplification has a history, every specification carries a rationale. He walks through the TCK process, the platform profiles (full, web and core) and the key additions in Jakarta EE 10 and 11 - including Jakarta Data and virtual thread support - before turning to the EE 12 roadmap and the early moves towards AI standardisation.

Jakarta EE Meets AI: Three Angles on the Same Problem

The next three talks are best understood as a series. They each ask a version of the same question - how do you integrate AI into enterprise Java systems responsibly? - but approach it from different angles and with a slightly different focus.

The Intelligent Monolith: Supercharging Jakarta EE with Local AI - Luqman Saeed (Azul)

Luqman opens with a provocation that I think resonates with anyone who's been paying attention to how AI gets adopted in enterprise settings at the moment: what if the biggest risk in your AI strategy isn't the model - it's the dependency?

Most AI integration today is built on external API calls to hosted models. That means your application's intelligence is, well... rented. You're subject to someone else's pricing decisions, rate limits, latency, availability and - critically in regulated industries - data residency constraints. Luqman's talk is a detailed, practical demonstration of what it looks like to bring that intelligence back home.

The stack he demonstrates is very much Java-native: CDI for dependency injection, LangChain4j for AI orchestration, PostgreSQL with pgvector for embeddings and Ollama for running models locally. He builds a full retrieval-augmented generation (RAG) pipeline within the application itself - with your data, your model and your infrastructure.

Luqman walks through four progressive patterns: declarative RAG pipelines, agentic workflows with decision logic, multi-agent orchestration and finally fully in-process inference using Jlama - running the model directly inside the JVM, no external process required. Each step trades a little convenience for more control, and he's honest about the trade-offs at each stage.

Jakarta EE 11 Meets AI: Building Intelligent Microservices with Virtual Threads and Jakarta Data - Luqman Saeed (Azul)

Luqman’s second talk moves from monolithic architecture to microservices and, in doing so, highlights just how much Jakarta EE 11 has to offer for teams building AI-enabled systems.

The central architectural move here is using Jakarta Data repositories as the persistence layer for embeddings. Rather than reaching for a dedicated vector database, Luqman stores embeddings directly in JPA entities as byte arrays and implements cosine similarity search in plain Java. For many real-world use cases - where data volumes are moderate and operational simplicity matters - this is a very practical approach that avoids adding infrastructure complexity before you've validated whether you actually need it.
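The talk's own code is not reproduced here, so the following is only a guess at what "embeddings as byte arrays plus cosine similarity in plain Java" might look like. The class name, the 4-bytes-per-float encoding, and the sample vectors are all assumptions for illustration.

```java
import java.nio.ByteBuffer;

public class EmbeddingBytes {

    /** Packs a float vector into a byte array (4 bytes per value, ByteBuffer's default big-endian order). */
    static byte[] toBytes(float[] vector) {
        ByteBuffer buffer = ByteBuffer.allocate(vector.length * Float.BYTES);
        for (float v : vector) {
            buffer.putFloat(v);
        }
        return buffer.array();
    }

    /** Reverses toBytes: reads the byte array back into a float vector. */
    static float[] fromBytes(byte[] bytes) {
        ByteBuffer buffer = ByteBuffer.wrap(bytes);
        float[] vector = new float[bytes.length / Float.BYTES];
        for (int i = 0; i < vector.length; i++) {
            vector[i] = buffer.getFloat();
        }
        return vector;
    }

    /** Cosine similarity between two equal-length, non-zero vectors. */
    static double cosine(float[] a, float[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        // Round-trip a vector through its byte form, as it would be when loaded from an entity column.
        float[] stored = fromBytes(toBytes(new float[] {0.5f, 0.5f, 0.0f}));
        System.out.println(cosine(stored, new float[] {1.0f, 1.0f, 0.0f}));
    }
}
```

A search over such a column loads every candidate row and scores it in application code, which is why the approach suits moderate data volumes.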

The talk also makes excellent use of Jakarta Concurrency 3.1's virtual thread support. Embedding generation and model inference are I/O-bound operations, so running them on virtual threads yields a highly concurrent system without any manual thread pool management. MicroProfile Config handles runtime model switching, so you can move between model providers without redeployment.

Luqman is honest about when this approach reaches its limits. The in-memory vector search works… until it doesn't. The embedded model works… until your scale demands otherwise. He is clear about what signals should prompt you to reach for a dedicated vector database or GPU-based inference. That kind of honest guidance - knowing not just how to do something but when to stop doing it that way - is what makes this talk relevant and practical!

Production-ready Agentic AI: Building Enterprise-grade Java Systems with Jakarta EE and MicroProfile - Kenji Kazumura (Fujitsu)

Where Luqman's talks focus on architecture and implementation, Kenji's talk asks the harder question: what does it actually take to put an AI-enabled system into production?

The answer, it turns out, is the same thing it's always taken for distributed systems: security, observability and transactional consistency. Kenji's argument is that Jakarta EE and MicroProfile already give you most of what you need.

The reference architecture he demonstrates is an agent-based system where a supervisor agent coordinates specialised sub-agents that interact with external tools via MCP servers. He uses OpenID Connect and JWT propagation - standard MicroProfile Security capabilities - ensuring that authentication context flows correctly across service boundaries even as agents delegate to agents.

Transaction handling in distributed AI workflows can be tricky - local ACID transactions work where they can, compensation patterns handle the cases where they can't. Observability is implemented via OpenTelemetry, giving you end-to-end tracing across what can otherwise be an extremely opaque chain of agent interactions.

Kenji introduces Jakarta Agentic AI - a project I wrote about here - aimed at standardising agent lifecycle and integration patterns across the enterprise Java ecosystem.

This talk will be useful for anyone who has built an AI proof-of-concept and is now wondering how to make it something you'd actually trust in production.

Getting the Fundamentals Right

API = Some REST and HTTP, right? RIGHT?! - Rustam Mehmandarov (Miles)

Every AI-enabled service in the previous three talks exposes APIs. Every agent that communicates with another service does so over an API. Every system Kenji secures with JWT and OpenID Connect is secured at its API boundary. The sophistication of your AI architecture means very little if the APIs it's built on are fragile, inconsistently versioned and poorly documented.

Rustam is one of my favourite Java speakers – his talks are energetic and funny but at the same time full of practical examples and experience-led lessons. This talk follows that approach and is a good reminder of how much goes wrong with APIs in practice. He covers the gap between REST theory and REST reality, the widespread misuse of HTTP status codes, the underappreciated complexity of versioning strategies, and the operational challenges of deprecation and lifecycle management.

If the previously mentioned AI talks made you excited about what you're going to build, Rustam's talk is a good reminder to build it well.


Azul Payara May 2026 Release – What’s New

Luqman Saeed — Thu, 14 May 2026 11:40:02 +0000

Table of Contents

A critical security fix, patched across every supported branch
Azul Payara Community 7.2026.5
Azul Payara 6.38.0: Continued Jakarta EE 10 Support
Azul Payara 5.87.0: Jakarta EE 8 Support Continues
Azul Payara 4.1.2.191.55: Legacy Branch Still Maintained
Looking Ahead
Upgrading and Feedback

The May 2026 release is the largest Payara milestone since the project's inception. Azul Payara Server 7 and Azul Payara Micro 7 ship as generally available, both certified against Jakarta EE 11. This is the first major Payara product release under the Azul brand, arriving six months after Azul completed its acquisition of Payara in December 2025.

Azul Payara Community 7 (download here), the open-source distribution, was the first implementation of any kind to certify across all three Jakarta EE 11 profiles (Full, Web Profile, Core Profile). Azul Payara Server 7 brings that certification to a commercially supported product with enterprise SLAs, making it the first commercially supported Jakarta EE 11 runtime from a major enterprise application server vendor. Both products ship with MicroProfile 6.1 (Config, Metrics, Health, Fault Tolerance, JWT, OpenAPI, REST Client, Telemetry Tracing). Azul Payara Server 7 holds Final TCK certification across all three profiles:

Profile	Azul Payara Server 7	Azul Payara Micro
Full	Certified	--
Web Profile	Certified	Certified
Core Profile	Certified	Certified

No other major enterprise application server vendor holds Final certification across all three profiles at Jakarta EE 11. Oracle WebLogic 15.1.1 sits at Jakarta EE 9.1. IBM WebSphere tWAS is frozen at Java EE 7. Red Hat JBoss EAP ships Jakarta EE 10.

Existing Jakarta EE 10 applications deploy without code changes; the jakarta.* namespace is stable between EE 10 and EE 11, so Azul Payara 6 applications move to Payara 7 by upgrading the runtime, not rewriting the codebase. JDK 21 is the minimum (Docker images ship for JDK 21 and JDK 25, the latest LTS). The same .war runs on both Server and Micro without modification. Jakarta Data, the headline API addition in Jakarta EE 11, introduces the @Repository annotation and a standardized data access layer.

This release also ships a critical security fix across every version: Azul Payara Community 7.2026.5, and Azul Payara 6.38.0, 5.87.0, and 4.1.2.191.55.

A critical security fix, patched across every supported branch

A critical security issue has been addressed across Azul Payara Community 7.2026.5 and Azul Payara 6.38.0, 5.87.0, and 4.1.2.191.55.

The fix lands in Azul Payara branches dating back to 4.1.2. Shipping security patches across the full supported lifecycle, not only the latest major release, is one of the practices that long-running Azul customers rely on; this release is a clear example. Azul is a registered CVE Numbering Authority (CNA) under CISA/DHS oversight, with patches backported to all supported versions on a published monthly schedule.

Azul Payara Community 7.2026.5

Community 7.2026.5 tracks the Payara 7 development line and ships additional fixes ahead of the Enterprise cadence.

Security Fixes

Remote attacker can read arbitrary files via unsafe parsing of OpenMQ configuration
Restrict access to vulnerable EL expressions

Bug Fixes

Fix Admin Console freezing after upgrading from Payara 6 to 7

Improvements

Update JaccProviderCompatibilityStartup Service
Remove Audit Modules
Add warlibs support to redeployment via Admin Console
Reduce INFO logging for the Jakarta Data implementation
Create new deployment descriptors with deprecated properties removed
Fix Jakarta Data @Repository methods not throwing UnsupportedOperationException when no implementation logic can be injected at deploy time

Component Upgrades

Docker JDK images refreshed to 21.0.11 and 25.0.3. Dependency updates for Jakarta Faces, MicroProfile Config, Project Reactor, and other libraries.

Azul Payara 6.38.0: Continued Jakarta EE 10 Support

Azul Payara 6.38.0 continues the Jakarta EE 10 and MicroProfile 6.1 line for customers who are not yet on Payara 7.

Bug Fixes

Fix HTTP 403 Forbidden response on correctly authenticated and authorized calls to protected JAX-RS resources
Fix illegal reflective access by org.glassfish.pfl.basic.reflection.Bridge when starting Payara Server in Verbose mode

Improvements

Deprecate Audit Modules
Remove Yubikey Extension

Component Upgrades

Docker JDK images refreshed for JDK 21, 17, 11, and 8 (21.0.11, 17.0.19, 11.0.31, 8u492). Dependency updates for Mojarra and Project Reactor.

Azul Payara 5.87.0: Jakarta EE 8 Support Continues

Azul Payara 5.87.0 retains the javax.* namespace, Jakarta EE 8, and MicroProfile 4.1 platform for customers running long-lived applications that have not yet migrated to the jakarta.* namespace.

Bug Fixes

Fix illegal reflective access by org.glassfish.pfl.basic.reflection.Bridge when starting Payara Server in Verbose mode
Fix OIDC proxy support failing due to incorrect redirect URL comparison

Improvements

Deprecate Audit Modules
Remove Yubikey Extension

Component Upgrades

Docker JDK images refreshed for JDK 21, 17, 11, and 8 (21.0.11, 17.0.19, 11.0.31, 8u492).

Azul Payara 4.1.2.191.55: Legacy Branch Still Maintained

Azul Payara 4.1.2.191.55 receives security updates and targeted bug fixes for customers still running on the Payara 4 branch.

Bug Fixes

Fix Payara failing to start OpenMQ Broker in a separate JVM when using LOCAL mode on JDK 11 or later
Fix unclosed streams warnings from OpenMQ

Looking Ahead

With Payara 7 GA, the Azul Payara product line now covers the full enterprise Java surface: the JDK (Azul Zulu, Core and Azul Prime), the full application server (Azul Payara Server), and the cloud-native runtime (Azul Payara Micro). All three ship under one Azul contract with monthly security patches, a long-term lifecycle per major release, transparent per-vCore pricing, 24-48 hour bug fix SLAs, and 2-hour critical incident response with dedicated support engineers.

Azul Payara 6, 5, and 4 continue to receive monthly security and bug-fix releases on the published schedule. Migration assessments to Azul Payara 7 are available through your Azul account team for customers planning the move.

Upgrading and Feedback

We recommend upgrading to your version’s latest release in this cycle. A critical security patch is available across every supported branch, so there is no reason to delay the upgrade based on the major-version line you run.

For detailed upgrade instructions, see the Payara documentation. To report issues, contribute fixes, or follow the Payara 7 roadmap, visit the Payara GitHub repository. For commercial support, contact your Azul account team.

Happy deploying!

The post Azul Payara May 2026 Release – What’s New appeared first on foojay.

BoxLang AI Deep Dive — Part 6 of 7: Memory Systems & RAG — Building AI That Remembers

Cristobal Escobar — Tue, 05 May 2026 15:10:15 +0000

Table of Contents

Two Categories of Memory
Standard Memory Types

Summary Memory — How It Actually Works

Vector Memory Types

Hybrid Memory — The Best of Both

Per-Call Multi-Tenant Identity Routing
Document Loaders
Building a Complete RAG Pipeline

Step 1: Ingest
Step 2: Query
Step 3: Hybrid for Production

Token Management
Multiple Memories Per Agent
The aiPopulate() BIF — Structured Memory Without Live Calls
What's Next

BoxLang AI 3.0 Series · Part 6 of 7

A chatbot with no memory isn't a conversation — it's a series of isolated queries. Every message starts from scratch. The user has to re-explain who they are, what they're working on, and what was just said. It's exhausting, and it signals that the AI isn't really listening.

Memory is what separates a useful AI application from a toy. BoxLang AI ships with one of the most comprehensive memory systems in any AI framework — 20+ memory types across two major categories, vector embedding support for semantic retrieval, 30+ document loaders for RAG pipelines, and a per-call identity routing system that makes multi-tenant applications safe by default.

This post is a complete tour.

🧠 Two Categories of Memory

           +-----------------------------------+
           |         BoxLang AI Memory         |
           +-----------------------------------+
                        /           \
                       /             \
                      v               v

+--------------------------------+   +--------------------------------+
|        Standard Memory         |   |         Vector Memory          |
+--------------------------------+   +--------------------------------+
| Stores conversation history    |   | Stores semantic knowledge      |
| Sequential message thread      |   | Embeddings + retrieval         |
| Retrieves by recency/order     |   | Retrieves by meaning           |
| Example: remember prior fact   |   | Example: RAG knowledge lookup  |
+--------------------------------+   +--------------------------------+

                      \               /
                       \             /
                        v           v

         +-------------------------------------------+
         | Shared abstraction and usage model        |
         +-------------------------------------------+
         | IAiMemory interface                       |
         | aiMemory() BIF                            |
         | Per-call identity routing                 |
         | Minimal app-code changes between both     |
         +-------------------------------------------+

BoxLang AI memory breaks into two fundamentally different categories, solving two different problems.

Standard Memory stores conversation history — the sequential messages between user and assistant. It's what lets the agent remember "my name is Luis" from three messages ago.

Vector Memory stores semantic knowledge — embeddings of documents, past conversations, or domain content that can be retrieved by meaning, not by recency. It's what enables RAG: "find the three most relevant passages from our knowledge base for this query."

Both categories share the same IAiMemory interface, the same aiMemory() BIF, and the same per-call identity routing — your application code barely changes between them.

📋 Standard Memory Types

Create any memory with our lovely global function: aiMemory( type, config: {} ). Our default memory type is a window memory of 20 messages:

// Window memory — keeps the last N messages
mem = aiMemory( "window", config: { maxMessages: 20 } )

// Summary memory — auto-summarizes old messages to preserve context
mem = aiMemory( "summary", config: {
    maxMessages      : 30,
    summaryThreshold : 15,
    summaryModel     : "gpt-4o-mini"
} )

// Cache memory — CacheBox-backed, distributed-friendly
mem = aiMemory( "cache", config: { cacheName: "aiMemory" } )

// Session memory — scoped to the current web session
mem = aiMemory( "session" )

// File memory — persisted to disk for audit trails
mem = aiMemory( "file", config: { filePath: "/logs/conversations/" } )

// JDBC memory — stored in a database for enterprise multi-user scenarios
mem = aiMemory( "jdbc", config: {
    datasource : "myDB",
    table      : "ai_conversations"
} )

Type	Best For
`window`	Quick chats, cost-conscious apps, stateless APIs
`summary`	Long conversations where context must survive message limits
`session`	Multi-page web applications with PHP/BoxLang sessions
`file`	Audit trails, offline inspection, long-term storage
`cache`	Distributed applications, multi-server deployments
`jdbc`	Enterprise multi-user systems, full persistence
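
To make the window behavior concrete, here is a minimal Java sketch of a sliding-window store. It is not how BoxLang AI implements the window type: the class name WindowMemorySketch and its methods are hypothetical, and it assumes the oldest message is simply dropped once the limit is reached.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

/** Hypothetical sketch of a sliding-window memory; not the BoxLang AI implementation. */
final class WindowMemorySketch {
    private final int maxMessages;
    private final Deque<String> messages = new ArrayDeque<>();

    WindowMemorySketch(int maxMessages) {
        if (maxMessages < 1) {
            throw new IllegalArgumentException("maxMessages must be at least 1");
        }
        this.maxMessages = maxMessages;
    }

    /** Appends a message and evicts the oldest one once the window is full. */
    void add(String message) {
        if (messages.size() == maxMessages) {
            messages.removeFirst();
        }
        messages.addLast(message);
    }

    /** Returns the retained messages, oldest first. */
    List<String> getAll() {
        return List.copyOf(messages);
    }
}
```

The cost of a window is bounded by maxMessages, which is why it suits short, cost-conscious chats; anything older than the window is gone for good.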

Summary Memory — How It Actually Works

The summary type deserves special attention. When the message count exceeds summaryThreshold, it calls the configured LLM to produce a one-paragraph summary of the oldest messages, replaces them with that summary as a single system message, then continues accumulating. Conversation context survives without the token cost of carrying the full history.

agent = aiAgent(
    name   : "support-bot",
    memory : aiMemory( "summary", config: {
        maxMessages      : 40,    // keep up to 40 messages
        summaryThreshold : 20,    // summarize when we hit 20
        summaryModel     : "gpt-4o-mini"  // use a cheap model for summarization
    } )
)
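
The threshold mechanism can be sketched in plain Java. This is an illustration under assumptions, not the library's code: SummaryMemorySketch is a hypothetical class, the summarizer function stands in for the LLM call, and the keepRecent parameter (how many recent messages survive a summarization pass) is my own invention, since the post does not say how many of the oldest messages get folded.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

/** Hypothetical sketch of threshold-based summarization; the summarizer stands in for an LLM call. */
final class SummaryMemorySketch {
    private final int summaryThreshold;
    private final int keepRecent; // assumption: number of recent messages left untouched
    private final Function<List<String>, String> summarizer;
    private final List<String> messages = new ArrayList<>();

    SummaryMemorySketch(int summaryThreshold, int keepRecent,
                        Function<List<String>, String> summarizer) {
        if (keepRecent < 0 || keepRecent >= summaryThreshold) {
            throw new IllegalArgumentException("keepRecent must be in [0, summaryThreshold)");
        }
        this.summaryThreshold = summaryThreshold;
        this.keepRecent = keepRecent;
        this.summarizer = summarizer;
    }

    /** Adds a message; once the count exceeds the threshold, folds the oldest ones into one summary entry. */
    void add(String message) {
        messages.add(message);
        if (messages.size() > summaryThreshold) {
            int cut = messages.size() - keepRecent;
            List<String> oldest = new ArrayList<>(messages.subList(0, cut));
            List<String> recent = new ArrayList<>(messages.subList(cut, messages.size()));
            messages.clear();
            messages.add("[summary] " + summarizer.apply(oldest));
            messages.addAll(recent);
        }
    }

    /** Returns the current history: at most one summary entry followed by recent messages. */
    List<String> getAll() {
        return List.copyOf(messages);
    }
}
```

In a real setup the summarizer would call a cheap model, as the summaryModel setting above suggests; here any function from a list of messages to a string will do, which also makes the behavior easy to test.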

🔍 Vector Memory Types

Vector memory stores embeddings and retrieves by semantic similarity — the right tool when "find relevant context" matters more than "recall what was said recently."

// In-memory vectors — development and small datasets
mem = aiMemory( "boxvector" )

// ChromaDB — Python-based vector store
mem = aiMemory( "chroma", config: {
    collection       : "support_docs",
    embeddingProvider: "openai",
    embeddingModel   : "text-embedding-3-small"
} )

// PostgreSQL pgvector — works with your existing Postgres
mem = aiMemory( "postgres", config: {
    datasource       : "myDB",
    table            : "ai_embeddings",
    embeddingProvider: "openai"
} )

// Pinecone — managed cloud vector DB
mem = aiMemory( "pinecone", config: {
    apiKey     : getSystemSetting( "PINECONE_API_KEY" ),  // read the key from the environment
    index      : "knowledge-base",
    namespace  : "support"
} )

// OpenSearch — AWS OpenSearch or self-hosted
mem = aiMemory( "opensearch", config: {
    host             : "https://my-opensearch:9200",
    index            : "ai_embeddings",
    embeddingProvider: "openai"
} )

Full vector memory roster:

Type	Description
`boxvector`	In-memory, development/testing
`hybrid`	Recent window + semantic retrieval combined
`chroma`	ChromaDB integration
`postgres`	PostgreSQL pgvector
`mysql`	MySQL 9 native vectors
`opensearch`	AWS OpenSearch or self-hosted
`typesense`	Fast typo-tolerant search
`pinecone`	Managed cloud vector DB
`qdrant`	High-performance vector store
`weaviate`	GraphQL vector database
`milvus`	Enterprise-scale vector DB
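
Whatever the backend, "retrieve by meaning" comes down to ranking stored embeddings by their similarity to a query embedding. The Java sketch below shows that idea with cosine similarity over a small in-memory map. It is a sketch under assumptions: VectorSearchSketch and its methods are hypothetical names, the vectors are hand-written rather than produced by an embedding model, and real vector stores use approximate indexes instead of a full scan.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical brute-force vector search; real stores use approximate nearest-neighbor indexes. */
final class VectorSearchSketch {

    /** Cosine similarity of two equal-length vectors; returns 0 when either vector is all zeros. */
    static double cosine(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vectors must have the same dimension");
        }
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0 || normB == 0) {
            return 0.0;
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    /** Returns the ids of the k stored vectors most similar to the query, best match first. */
    static List<String> topK(double[] query, Map<String, double[]> store, int k) {
        List<Map.Entry<String, Double>> scored = new ArrayList<>();
        for (Map.Entry<String, double[]> e : store.entrySet()) {
            scored.add(Map.entry(e.getKey(), Double.valueOf(cosine(query, e.getValue()))));
        }
        scored.sort(Map.Entry.<String, Double>comparingByValue().reversed());
        List<String> ids = new ArrayList<>();
        for (int i = 0; i < Math.min(k, scored.size()); i++) {
            ids.add(scored.get(i).getKey());
        }
        return ids;
    }
}
```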

Hybrid Memory — The Best of Both

hybrid combines a recent message window with semantic vector retrieval — you get recency and relevance:

mem = aiMemory( "hybrid", config: {
    recentLimit   : 5,        // keep last 5 messages always
    semanticLimit : 5,        // add 5 semantically relevant past messages
    vectorProvider: "chroma"  // backed by ChromaDB
} )

For most production support-bot or assistant scenarios, hybrid is the sweet spot — recent context for coherence, semantic retrieval for depth.
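
As a rough illustration of how the two limits could interact, the Java sketch below always keeps the most recent messages and then adds the best-scoring older ones. The class HybridRecallSketch is hypothetical, the relevance function is a stand-in for a vector similarity score, and the ordering of the result (relevant older messages first, then the recent window) is an assumption rather than documented BoxLang AI behavior.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.function.ToDoubleFunction;

/** Hypothetical sketch of hybrid recall: a recent window plus the most relevant older messages. */
final class HybridRecallSketch {

    static List<String> recall(List<String> history, int recentLimit, int semanticLimit,
                               ToDoubleFunction<String> relevance) {
        if (recentLimit < 0 || semanticLimit < 0) {
            throw new IllegalArgumentException("limits must not be negative");
        }
        // Split the history into the recent window and everything before it.
        int split = Math.max(0, history.size() - recentLimit);
        List<String> older = new ArrayList<>(history.subList(0, split));
        List<String> recent = history.subList(split, history.size());

        // Rank older messages by relevance, highest score first (stable for ties).
        Comparator<String> byRelevance = Comparator.comparingDouble(relevance);
        older.sort(byRelevance.reversed());

        List<String> result = new ArrayList<>(older.subList(0, Math.min(semanticLimit, older.size())));
        result.addAll(recent);
        return result;
    }
}
```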

🏢 Per-Call Multi-Tenant Identity Routing

This is the architectural feature that makes BoxLang AI memory extensible. Memory instances are stateless and safe to use as singletons — userId and conversationId route each operation to the correct isolated conversation. Or you can create memories with seeded identities if you want a specific agent with specific memory; your choice.

Every memory operation accepts optional identity arguments:

sharedMemory = aiMemory( "cache" )

// Operations are fully tenant-isolated
sharedMemory.add( message, userId: "alice", conversationId: "sess-1" )
sharedMemory.add( message, userId: "bob",   conversationId: "sess-2" )

// Retrieval is scoped — alice never sees bob's messages
aliceHistory = sharedMemory.getAll( userId: "alice", conversationId: "sess-1" )
bobHistory   = sharedMemory.getAll( userId: "bob",   conversationId: "sess-2" )

// Clear only alice's conversation
sharedMemory.clear( userId: "alice", conversationId: "sess-1" )

In practice, you pass identity through AiAgent.run() options and it flows automatically to all memory operations:

sharedAgent = aiAgent( name: "support", memory: sharedMemory )

// One agent instance, many concurrent users — fully safe
sharedAgent.run( "Hello, I need help with my order",    {}, { userId: "alice", conversationId: "sess-1" } )
sharedAgent.run( "What did I just ask about?",          {}, { userId: "alice", conversationId: "sess-1" } ) // remembers
sharedAgent.run( "Can you help me reset my password?",  {}, { userId: "bob",   conversationId: "sess-2" } ) // isolated

No per-user agent factories. No thread-local hacks. No shared-state concurrency bugs. One instance, many tenants.
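
The routing idea is easy to picture in Java: one shared object, with every operation keyed by the identity passed on the call. The sketch below is hypothetical (TenantRoutedMemorySketch is not a BoxLang AI class) and keeps data in an in-process map, where the real memory types would delegate to a cache, file, or database.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

/** Hypothetical sketch: one shared instance, every call routed by (userId, conversationId). */
final class TenantRoutedMemorySketch {
    private final Map<String, List<String>> conversations = new ConcurrentHashMap<>();

    /** Builds a collision-safe key by prefixing the userId with its length. */
    private static String key(String userId, String conversationId) {
        return userId.length() + ":" + userId + "/" + conversationId;
    }

    void add(String message, String userId, String conversationId) {
        conversations
            .computeIfAbsent(key(userId, conversationId), k -> new CopyOnWriteArrayList<>())
            .add(message);
    }

    /** Returns only the messages stored under this exact identity. */
    List<String> getAll(String userId, String conversationId) {
        List<String> stored = conversations.get(key(userId, conversationId));
        return stored == null ? List.of() : List.copyOf(stored);
    }

    /** Clears one conversation and leaves every other tenant untouched. */
    void clear(String userId, String conversationId) {
        conversations.remove(key(userId, conversationId));
    }
}
```

Because the instance holds no "current user", it can be shared freely across threads; isolation comes from the key, not from creating one object per tenant.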

📚 Document Loaders

Document loaders are the ingestion layer for RAG pipelines. They normalize content from 30+ source types into the Document format that vector memory understands.

// Load a single PDF
docs = aiDocuments(
    source : "/path/to/product-manual.pdf",
    config : { type: "pdf" }
).load()

// Load all Markdown files in a directory (recursively)
docs = aiDocuments(
    source : "/knowledge-base",
    config : {
        type       : "directory",
        recursive  : true,
        extensions : [ "md", "txt", "pdf" ]
    }
).load()

// Load a live web page
docs = aiDocuments(
    source : "https://boxlang.ortusbooks.com/getting-started/overview",
    config : { type: "http" }
).load()

// Load from a database query
docs = aiDocuments(
    source : "SELECT title, content FROM articles WHERE published = 1",
    config : { type: "sql", datasource: "myDB" }
).load()

// Crawl an entire website
docs = aiDocuments(
    source : "https://docs.mycompany.com",
    config : {
        type     : "webcrawler",
        maxPages : 200,
        delay    : 500
    }
).load()

Built-in loaders:

Loader	Type	Handles
`TextLoader`	`text`	`.txt, .log`
`MarkdownLoader`	`markdown`	`.md` with header splitting
`HTMLLoader`	`html`	Web pages, strips scripts/styles
`CSVLoader`	`csv`	Rows as documents, column filtering
`JSONLoader`	`json`	Field extraction, array-as-documents
`PDFLoader`	`pdf`	Multi-page, page range selection
`XMLLoader`	`xml`	Structured XML content
`LogLoader`	`log`	Application log files
`HTTPLoader`	`http`	Single URL fetch
`FeedLoader`	`feed`	RSS / Atom feeds
`SQLLoader`	`sql`	Database query results
`DirectoryLoader`	`directory`	Batch file processing
`WebCrawlerLoader`	`webcrawler`	Multi-page crawl

🔗 Building a Complete RAG Pipeline

Here's the full picture — ingest documents into vector memory, then use an agent with that memory to answer questions grounded in your content.

Step 1: Ingest

// Create vector memory backed by ChromaDB
vectorMemory = aiMemory( "chroma", config: {
    collection       : "company_knowledge",
    embeddingProvider: "openai",
    embeddingModel   : "text-embedding-3-small"
} )

// Ingest everything in one call
result = aiDocuments(
    source : "/knowledge-base",
    config : {
        type       : "directory",
        recursive  : true,
        extensions : [ "md", "txt", "pdf" ]
    }
).toMemory(
    memory  : vectorMemory,
    options : { chunkSize: 1000, overlap: 200 }
)

// Rich ingestion report
println( "Documents loaded : #result.documentsIn#" )
println( "Chunks created   : #result.chunksOut#" )
println( "Vectors stored   : #result.stored#" )
println( "Duplicates skipped: #result.deduped#" )
println( "Estimated cost   : $#result.estimatedCost#" )

The toMemory() method handles chunking via aiChunk(), embedding via the configured provider, deduplication, and storage — everything in one fluent call with a detailed report back.

Step 2: Query

// Agent with the same vector memory — retrieves relevant chunks automatically
agent = aiAgent(
    name        : "knowledge-assistant",
    description : "Expert on all company documentation and policies",
    memory      : vectorMemory
)

// The agent retrieves semantically relevant chunks and grounds its answer
response = agent.run(
    "What is our refund policy for enterprise customers?",
    {},
    { userId: "support-team", conversationId: "ticket-12345" }
)

When the agent runs, vector memory retrieves the most semantically similar document chunks for the query and injects them as context before the LLM call. The LLM answers based on your actual content — not hallucinations.

Step 3: Hybrid for Production

For most production RAG scenarios, hybrid memory beats pure vector:

// Combines short-term conversation memory with long-term semantic retrieval
productionMemory = aiMemory( "hybrid", config: {
    recentLimit   : 8,
    semanticLimit : 6,
    vectorProvider: "chroma",
    collection    : "company_knowledge"
} )

agent = aiAgent(
    name   : "enterprise-assistant",
    memory : productionMemory
)

The 8 most recent messages keep conversations coherent. The semantic layer ensures relevant documentation is always surfaced. Together they handle both "what did I just ask?" and "what does our policy say about X?"

🔧 Token Management

Two BIFs help you reason about context window usage:

// Count tokens before sending (approximate)
tokenCount = aiTokens( "This is the text I want to count", { method: "words" } )

// Chunk a large document for ingestion
chunks = aiChunk( largeText, {
    chunkSize : 1000,  // tokens per chunk
    overlap   : 200    // overlap between chunks for context continuity
} )

aiChunk() is used internally by toMemory(), but you can call it directly when building custom ingestion pipelines.
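
If you are curious what chunking with overlap looks like mechanically, here is a small Java sketch. It is not the aiChunk() implementation: ChunkSketch is a hypothetical class, and it approximates tokens with whitespace-separated words, which is an assumption made to keep the example self-contained.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of overlapping chunking; words stand in for tokens. */
final class ChunkSketch {

    /** Splits text into chunks of up to chunkSize words, each sharing `overlap` words with the previous one. */
    static List<String> chunk(String text, int chunkSize, int overlap) {
        if (chunkSize < 1 || overlap < 0 || overlap >= chunkSize) {
            throw new IllegalArgumentException("need chunkSize >= 1 and 0 <= overlap < chunkSize");
        }
        List<String> chunks = new ArrayList<>();
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return chunks;
        }
        String[] words = trimmed.split("\\s+");
        int step = chunkSize - overlap; // how far the window advances each time
        for (int start = 0; start < words.length; start += step) {
            int end = Math.min(start + chunkSize, words.length);
            chunks.add(String.join(" ", Arrays.copyOfRange(words, start, end)));
            if (end == words.length) {
                break; // the last window already reached the end of the text
            }
        }
        return chunks;
    }
}
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, at the price of storing and embedding some words twice.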

🏗️ Multiple Memories Per Agent

Agents can have multiple memory instances simultaneously — useful when you want different retention policies for different types of information:

agent = aiAgent(
    name   : "research-assistant",
    memory : [
        // Short-term: current conversation
        aiMemory( "window", config: { maxMessages: 20 } ),
        // Long-term: semantic knowledge base
        aiMemory( "chroma", config: {
            collection       : "research_papers",
            embeddingProvider: "openai"
        } )
    ]
)

// Add another memory dynamically
agent.addMemory( aiMemory( "file", config: { filePath: "/audit/" } ) )

All memories are read from and written to in parallel. Messages retrieved from all memories are merged before each LLM call.
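
A parallel read followed by a merge could look roughly like the Java sketch below. MultiMemoryReadSketch is a hypothetical name, each memory is modeled as a plain Supplier of messages, and dropping exact duplicates during the merge is an assumption of this sketch, not something the post states about BoxLang AI.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.CompletableFuture;
import java.util.function.Supplier;

/** Hypothetical sketch: read several memories concurrently, then merge in declaration order. */
final class MultiMemoryReadSketch {

    static List<String> readAll(List<Supplier<List<String>>> memories) {
        // Start every read at once on the common pool.
        List<CompletableFuture<List<String>>> pending = new ArrayList<>();
        for (Supplier<List<String>> memory : memories) {
            pending.add(CompletableFuture.supplyAsync(memory));
        }
        // Join in declaration order so the merged result is deterministic.
        Set<String> merged = new LinkedHashSet<>();
        for (CompletableFuture<List<String>> read : pending) {
            merged.addAll(read.join());
        }
        return new ArrayList<>(merged);
    }
}
```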

📦 The `aiPopulate()` BIF — Structured Memory Without Live Calls

One often-overlooked feature: aiPopulate() fills a typed BoxLang class from JSON without making any LLM call. This is essential for caching and testing:

class CustomerProfile {
    property name="name"         type="string";
    property name="tier"         type="string";
    property name="openTickets"  type="numeric";
}

// From a live AI call
profile = aiChat(
    "Extract the customer profile from: John Doe, Gold tier, 3 open tickets",
    { returnFormat: new CustomerProfile() }
)

// Cache it as JSON
cachedJson = jsonSerialize( profile )

// Later — restore the typed object without another LLM call
restoredProfile = aiPopulate( new CustomerProfile(), cachedJson )
println( restoredProfile.getName() ) // "John Doe"

Perfect for: pre-populated test fixtures, cached AI extractions, converting existing JSON data to typed objects.

What's Next

In Part 7 — the final post in the series — we go deep on MCP: how to consume tools from any MCP server, how MCPTool proxies work, and how to expose your own BoxLang functions as an enterprise MCP server with full security, CORS, API key validation, and rate limiting.

📖 Full Documentation 🌐 BoxLang AI Site 📦Install Today: install-bx-module bx-ai 🫶Professional Support

The post BoxLang AI Deep Dive — Part 6 of 7: Memory Systems & RAG — Building AI That Remembers appeared first on foojay.

The Code Was Always the Door

Markus Westergren — Tue, 05 May 2026 08:59:45 +0000

Table of Contents

The doorman in a hoodie
The shepherd
Read the terrain
Choose the path
Watch for predators
Tend the flock
The doorman's dignity

The doorman in a hoodie

There's a story Rory Sutherland tells in his book Alchemy. A consultant is hired to find savings at a luxury hotel. He watches a doorman for twenty minutes and writes in his report: this man opens doors. Automatic doors also open doors. Automatic doors are cheaper. So the hotel removes him.

The lobby falls apart. Guests can't find the restaurant. Nobody hails a taxi. The ineffable sense that someone is in charge of the front of house disappears with him. The consultant measured the visible action and missed the actual function.

Right now, somewhere, someone is watching a senior developer type code and writing a similar report. These people produce code. ChatGPT produces code. The doorman fallacy, in a hoodie. The code was always the door opening. It was never the job.

That is the argument I want to make, and I want to make it specifically to other senior developers, because we are the people best positioned to see why it is right, and most at risk of forgetting it. The visible work has shifted. The judgement underneath has not. Everything you have built up over a career is exactly what AI lacks and exactly what shepherding it well requires: the context, the taste, the system thinking, the willingness to be on the hook for the 2am call.

The headline isn't that AI is replacing developers. The headline is that AI made the rest of the job, the part that was always the actual job, finally visible.

The shepherd

I have started using a particular word for the role I think senior developers are evolving into: shepherd. Not prompt engineer. Not vibe coder. Not "developer who uses AI." A shepherd guides AI through terrain it cannot see: the codebase's history, the team's constraints, the deployment realities, the business context that lives nowhere in any training set. The shepherd's value is not speed. It is judgement about where to apply speed.

It is worth grounding this in numbers, because the surrounding hype is loud. A Stanford analysis of more than a hundred thousand developers across six hundred companies, looking at real code in real repositories rather than lab experiments, found that the much-quoted productivity boost of thirty to forty percent shrinks to roughly fifteen to twenty percent net once you factor in the time spent fixing what the AI got wrong. The gain is real. It is also smaller and lumpier than the demos suggest. Where AI helps most is on well-trodden ground. Where it helps least, or actively hurts, is on the complex existing systems that describe most of our day jobs.

So the question is not whether to use the tool. It is how to use it well. And the answer comes apart, I think, into four things a shepherd does. Read the terrain. Choose the path. Watch for predators. Tend the flock.

Read the terrain

A shepherd's first job is knowing the ground.

I work on a government system. Like a lot of public-sector systems, the requirements are not a tidy specification. They are a sediment. Laws, policy decisions, edge cases that surfaced years ago and never got formally documented, exceptions that exist for reasons nobody on the current team can fully articulate. When I hand a piece of that work to a coding agent, the result is almost always wrong, even when it looks plausible. The agent reads what is in front of it. The actual requirement lives in the negative space, in the conversations and the precedents that were never written down.

Trying to brute-force this by stuffing the agent's context with every related document does not work either. The window fills up, the relevant signal gets diluted, and what comes back has the same confident tone whether it is pattern-matching to your real situation or to something superficially similar from training data. The model cannot tell which it is doing. You can.

That is the shepherd's first contribution. Not the prompt. The framing. Knowing which two paragraphs of which document actually matter for this change. Knowing that this requirement looks routine but interacts with that legacy module in a non-obvious way. Knowing when the gap between what the model can see and what the answer actually depends on is too wide to bridge with any prompt at all, and the right move is to not delegate this one.

The skill is unglamorous. It is the same skill senior developers have always used to onboard new hires and unblock stuck juniors. It just turns out to be the load-bearing skill for AI work too.

Choose the path

A shepherd decides what to delegate and what to keep close.

Last year my team did a refactoring that touched around thirty similar objects in our system. The temptation, given the tools, was to point an agent at the whole thing and let it grind. I did not. I picked one object and refactored it myself, slowly and deliberately. Not because I could not have got the agent to do it, but because I wanted the pattern to come out of my hands first, with the small decisions and the second thoughts still attached to it.

Once I was happy with that one, the work changed shape. I asked the agent to look at the refactored object and the remaining ones, and to produce a task list for bringing the others into the same form. I read the list and adjusted it. Some entries were sharper than I would have written, and others missed subtleties that came from having lived inside that first object. Then I let the agent work through the list, one object at a time, with me reviewing each result before it went anywhere near main.

The shape of the work is what matters. I owned the design, delegated the propagation, and owned the review. The agent did the repetitive part faster than I could have, and the parts where my judgement actually mattered stayed in my hands.

That is the move I think senior developers are best positioned to make. The interesting question for a senior is no longer can AI do this. It is should I be the one doing this, and if not, what does it need from me to do it well. The first question is mostly about the tool. The second is mostly about you.

Watch for predators

A shepherd verifies. Always.

Some time ago we migrated a Quarkus application from JPA to Jakarta Data. Almost immediately, our tests started failing in a strange way. Data we had updated inside a transaction was invisible when we read it back inside the same transaction. We asked a coding agent for help. The reply came fast and confident: flush the session.

It was wrong. Jakarta Data uses stateless sessions. There is nothing to flush. The advice was a fluent answer to a different and more familiar question. It was the one the model had seen many times in its training data, where flushing a JPA EntityManager genuinely is the fix. Our problem looked similar from the outside and was structurally different underneath.

We tried feeding the agent the relevant Jakarta Data documentation. It did not help. In the end we did what we would always have done. We built a minimal reproducer, narrowed the behaviour down to a specific interaction, and reported the issue upstream. The Quarkus team confirmed it, and the root cause turned out to live in Hibernate itself.

The lesson is not don't trust AI. The lesson is sharper. A model produces fluent output with the same tone whether it is right or wrong, so confidence is not a signal. It is noise. And the moment when it is most dangerous is precisely the moment a senior developer is in the best position to handle: when the answer pattern-matches to something common but the actual problem sits just outside the model's training. You have to know enough to smell it. The smell is the moat.

Tend the flock

A shepherd does not work alone.

My team had been using AI assistants for a while, with results all over the map. Some people loved them. Some people had quietly stopped trying. The difference, when I looked at it, was not talent or seniority. It was process. The people getting the worst results were trying to solve whole problems in a single prompt: here is the issue, fix it. The people getting the best results were doing what we have always done with hard problems, just with a collaborator: planning, then implementing, then validating, in distinct steps with their own outputs.

So I started sharing that explicitly. Not as a productivity hack but as a re-statement of the obvious. Do not ask the agent to do everything at once. Ask it to lay out a plan you can read. Then ask it to implement one piece of the plan. Then check that piece against what you actually wanted. Smaller, more specific steps almost always beat one ambitious prompt. We were rediscovering the software development process. AI had not changed it. It had just rewarded teams that already had one and punished teams that did not.

That is the framing I think is most useful for the people you work with. AI did not break engineering. It made the gap between teams with a real process and teams without one suddenly very visible, because the tool magnifies whatever habits it lands on. A shepherd's job is not to police prompts. It is to make those habits explicit, share what is working, and help less experienced developers build the instincts that would otherwise take ten years and a few production outages to acquire.

The doorman's dignity

The doorman in Sutherland's story was not insecure about his role. He knew what he was actually doing. The consultant was the one who was confused.

Right now, a lot of senior developers are letting consultants confuse them. The viral demos, the executive quotes about replacing engineers, the LinkedIn posts from someone who built a to-do app with a single prompt are all reports written by people watching us type. They measure the visible action and miss the actual function.

The typing was always the door opening. The job was reading the terrain, choosing the path, watching for predators, tending the flock. AI did not take any of that away. It just made it the part that obviously matters now, because the part it could automate has been automated.

Your seniority is not a liability in this transition. It is the moat. It always was.

This article expands on the AI Shepherd concept from a conference talk Elma Westergren and I have given on developer identity in the AI era. The core framing comes from Elma's work as an occupational therapist. What is really at stake when our tools change is occupational identity, not just productivity, and I am grateful for that perspective.

The post The Code Was Always the Door appeared first on foojay.

Explore Spring AI SDK – Amazon Bedrock AgentCore – Part 2

Mahendra Rao B — Mon, 27 Apr 2026 09:09:00 +0000

Table of Contents

Step 1: Add the AI model and AgentCore memory dependencies
Step 2: Create Short/Long Term in AWS Management Console
Step 3: Add the following memory-related properties.
Step 4: Add the below MemoryConfig class.
Step 5: Create the ChatRequest and ChatResponse classes as shown below.
Step 6: Add the below ShortTermController class.
Step 7: verify
End-to-End Flow
References

If you're joining us from Part 1 or need a quick refresher on the architecture, listen to this brief overview of how Spring AI and Amazon Bedrock work together.

Generated using NotebookLM for my previous article

In this article, we explore one of the AgentCore capabilities: memory.

Source: Amazon

To begin, enable AgentCore memory for the agent you built earlier.

Step 1: Add the AI model and AgentCore memory dependencies


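<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-model</artifactId>
</dependency>
<dependency>
    <groupId>org.springaicommunity</groupId>
    <artifactId>spring-ai-agentcore-memory</artifactId>
</dependency>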

Step 2: Create Short/Long-Term Memory in the AWS Management Console

Navigate to Amazon Bedrock AgentCore > Memory to create short/long-term memories.

AgentCore Memory

Step 3: Add the following memory-related properties.

application.yml

agentcore:
  memory:
    memory_id: memory_27vql-Vl7nIoHdf6
    total-events-limit: 100
    default-session: default
    page-size: 50
    ignore-unknown-roles: false

application.properties

agentcore.memory.memory_id=memory_27vql-Vl7nIoHdf6
agentcore.memory.total-events-limit=100
agentcore.memory.default-session=default
agentcore.memory.page-size=50
agentcore.memory.ignore-unknown-roles=false

Step 4: Add the below `MemoryConfig` class.

package com.bsmlabs.springai.config;

import org.springaicommunity.agentcore.memory.longterm.AgentCoreMemory;
import org.springframework.ai.chat.client.advisor.MessageChatMemoryAdvisor;
import org.springframework.ai.chat.memory.ChatMemory;
import org.springframework.ai.chat.memory.MessageWindowChatMemory;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

import java.util.List;

@Configuration
public class MemoryConfig {

    @Bean
    public ChatMemory chatMemory() {
        return MessageWindowChatMemory.builder()
                .maxMessages(20) // keeps last 20 messages
                .build();
    }

    @Bean
    public MessageChatMemoryAdvisor messageChatMemoryAdvisor(ChatMemory chatMemory) {
        return MessageChatMemoryAdvisor.builder(chatMemory).build();
    }

    @Bean
    public AgentCoreMemory agentCoreMemory(MessageChatMemoryAdvisor advisor) {
        return new AgentCoreMemory(advisor, List.of());
    }

}

Let’s break down the structure of the beans defined in the above configuration class.

4.1. ChatMemory Bean – The Core

@Bean
public ChatMemory chatMemory() {
    return MessageWindowChatMemory.builder()
            .maxMessages(20) // keeps last 20 messages
            .build();
}

This creates a sliding window memory that retains only the last 20 messages. Benefits include:

Prevents unbounded memory growth
Keeps recent context while discarding older, irrelevant messages
Reduces token usage when calling LLMs, making it cost-effective
Maintains conversation relevance
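
To make the sliding-window idea concrete, here is a minimal plain-Java sketch. It is not the Spring AI implementation: the SlidingWindowMemory class and its methods are hypothetical names used only for illustration, and messages are plain strings rather than Spring AI Message objects.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

// Hypothetical stand-in for a sliding-window chat memory; not the Spring AI class.
class SlidingWindowMemory {

    private final int maxMessages;
    private final Deque<String> window = new ArrayDeque<>();

    SlidingWindowMemory(int maxMessages) {
        if (maxMessages <= 0) {
            throw new IllegalArgumentException("maxMessages must be positive");
        }
        this.maxMessages = maxMessages;
    }

    // Adds a message and evicts the oldest ones once the window is full.
    void add(String message) {
        window.addLast(message);
        while (window.size() > maxMessages) {
            window.removeFirst();
        }
    }

    // Returns an immutable snapshot, oldest message first.
    List<String> messages() {
        return List.copyOf(window);
    }

    public static void main(String[] args) {
        SlidingWindowMemory memory = new SlidingWindowMemory(3);
        for (int i = 1; i <= 5; i++) {
            memory.add("message " + i);
        }
        // Only the three most recent messages remain.
        System.out.println(memory.messages());
    }
}
```

With a window of 3 and five messages added, only messages 3 to 5 survive, which is the same trade-off the 20-message window makes at a larger scale.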

4.2. MessageChatMemoryAdvisor – The Wrapper

@Bean
public MessageChatMemoryAdvisor messageChatMemoryAdvisor(ChatMemory chatMemory) {
    return MessageChatMemoryAdvisor.builder(chatMemory).build();
}

This advisor acts as an intermediary that:

Integrates the ChatMemory into Spring AI's advisor chain
Automatically injects conversation history into chat requests
Manages when and how memory is applied to prompts
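
The interceptor role described above can be sketched in plain Java. This is only an illustration of the pattern under simplifying assumptions: MemoryAdvisorSketch, advise, and the fake model are hypothetical names, not Spring AI APIs, and the real advisor chain is considerably richer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical sketch of the advisor (interceptor) idea; none of these types are Spring AI APIs.
class MemoryAdvisorSketch {

    private final List<String> history = new ArrayList<>();

    // Wraps a model call: prepends stored history before the call,
    // then records the new exchange after it.
    String advise(String userMessage, Function<List<String>, String> model) {
        List<String> prompt = new ArrayList<>(history);
        prompt.add("user: " + userMessage);

        String reply = model.apply(prompt);

        history.add("user: " + userMessage);
        history.add("assistant: " + reply);
        return reply;
    }

    List<String> history() {
        return List.copyOf(history);
    }

    public static void main(String[] args) {
        MemoryAdvisorSketch advisor = new MemoryAdvisorSketch();
        // Fake model (an assumption for the demo): reports how many messages it received.
        Function<List<String>, String> fakeModel =
                prompt -> "saw " + prompt.size() + " message(s)";

        System.out.println(advisor.advise("Hello", fakeModel));
        System.out.println(advisor.advise("Still there?", fakeModel));
    }
}
```

The second call sees three messages (the first exchange plus the new user message), even though the caller only passed one. That is the point of the wrapper: the calling code stays unaware of how history is injected.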

4.3. AgentCoreMemory – The Orchestrator

@Bean
public AgentCoreMemory agentCoreMemory(MessageChatMemoryAdvisor advisor) {
    return new AgentCoreMemory(advisor, List.of());
}

This combines the advisor with an empty list of additional strategies. It:

Coordinates memory across agent operations
Provides a unified interface for long-term memory management
Allows for extensibility (the List.of() can include custom memory strategies)

Step 5: Create the ChatRequest and ChatResponse classes as shown below.

Add the following classes to the models folder. We will use them in the next REST controller.

package com.bsmlabs.springai.models;

public record ChatRequest(String message) {
}

package com.bsmlabs.springai.models;

public record ChatResponse(String response) {
}

Step 6: Add the below `ShortTermMemoryController` class.

Adding memory to an existing agent helps improve response latency and relevance. The agent can store previous conversations in short-term memory (STM). It can also retain learned information over time using long-term memory (LTM).

The SDK integrates with AgentCore Memory through Spring AI’s advisor pattern. These advisors act as interceptors that enrich prompts with relevant context before sending them to the model.

The below RestController demonstrates how to build a stateful chat API that maintains conversation history by leveraging the memory configuration from the previous example to provide a persistent conversational context.

package com.bsmlabs.springai.agents;

import com.bsmlabs.springai.models.ChatRequest;
import com.bsmlabs.springai.models.ChatResponse;
import org.springaicommunity.agentcore.memory.longterm.AgentCoreMemory;
import org.springaicommunity.agentcore.memory.shorttem.AgentCoreShortTermMemoryRepository;
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.chat.memory.ChatMemory;
import org.springframework.ai.chat.messages.Message;
import org.springframework.web.bind.annotation.*;

import java.util.List;
import java.util.UUID;

@RestController
public class ShortTermMemoryController {

    private final ChatClient chatClient;
    private final ChatMemory chatMemory;
    private final AgentCoreMemory agentCoreMemory;

    private static final String CONVERSATION_ID = UUID.randomUUID().toString();

    public ShortTermMemoryController(ChatClient.Builder chatClientBuilder,
                                     ChatMemory chatMemory,
                                     AgentCoreMemory agentCoreMemory,
                                     AgentCoreShortTermMemoryRepository shortTermMemoryRepository) {
        this.chatClient = chatClientBuilder.build();
        this.chatMemory = chatMemory;
        this.agentCoreMemory = agentCoreMemory;

        // shortTermMemoryRepository.deleteByConversationId(CONVERSATION_ID);
    }

    @PostMapping("/api/short")
    public ChatResponse shortTermChat(@RequestBody ChatRequest chatRequest) {
        String response = chatClient.prompt()
                .user(chatRequest.message())
                .advisors(agentCoreMemory.advisors)
                .advisors(a -> a.param(ChatMemory.CONVERSATION_ID, CONVERSATION_ID))
                .call()
                .content();

        return new ChatResponse(response);
    }

    @GetMapping("/api/history")
    public List<Message> getHistory() {
        return chatMemory.get(CONVERSATION_ID);
    }

    @DeleteMapping("/api/history")
    public void clearHistory() {
        chatMemory.clear(CONVERSATION_ID);
    }

}

ChatClient: Sends prompts to the LLM
ChatMemory: Manages the sliding conversation window (20 messages)
AgentCoreMemory: Orchestrates memory across operations

POST `/api/short` – Chat Endpoint

@PostMapping("/api/short")
public ChatResponse shortTermChat(@RequestBody ChatRequest chatRequest) {
    String response = chatClient.prompt()
            .user(chatRequest.message())
            .advisors(agentCoreMemory.advisors)
            .advisors(a -> a.param(ChatMemory.CONVERSATION_ID, CONVERSATION_ID))
            .call()
            .content();

    return new ChatResponse(response);
}

What happens:

Receives user message in ChatRequest
Passes agentCoreMemory.advisors to the ChatClient to inject the MessageChatMemoryAdvisor
Passes CONVERSATION_ID to the advisor so it knows which conversation's history to retrieve
ChatClient automatically:
- Retrieves last 20 messages for this conversation
- Appends them to the current user message
- Sends the full context to the LLM
- Stores the user message + response in ChatMemory
Returns just the LLM response to the client

GET `/api/history` – Retrieve Conversation History

@GetMapping("/api/history")
public List<Message> getHistory() {
    return chatMemory.get(CONVERSATION_ID);
}

This method returns all messages (up to 20) for the given conversation ID. It is useful for:

Displaying chat history in the UI
Debugging the conversation context
Auditing interactions

DELETE `/api/history` – Clear History

@DeleteMapping("/api/history")
public void clearHistory() {
    chatMemory.clear(CONVERSATION_ID);
}

Step 7: Verify

### Tell name - STM
POST http://localhost:8080/api/short
Content-Type: application/json

{
  "message": "Mahendra is writing an article to Foojay on Spring AI SDK with Amazon Bedrock Agentcore"
}

### Ask name - STM
POST http://localhost:8080/api/short
Content-Type: application/json

{
  "message": "What is my name?"
}

### Get history
GET http://localhost:8080/api/history

### Clear history
DELETE http://localhost:8080/api/history

Using curl commands

# --- Short-Term Memory (STM) ---
# Tell your name and what you're talking about
curl -X POST http://localhost:8080/api/short \
    -H "Content-Type: application/json" \
    -d '{"message": "Mahendra is writing an article to Foojay on Spring AI SDK with Amazon Bedrock Agentcore"}'

# Ask for your name (memory recall)
curl -X POST http://localhost:8080/api/short \
    -H "Content-Type: application/json" \
    -d '{"message": "What is my name?"}'

# Get conversation history
curl http://localhost:8080/api/history

# Clear conversation
curl -X DELETE http://localhost:8080/api/history

End-to-End Flow

User Request
    ↓
[/api/short endpoint]
    ↓
ChatMemory retrieves last 20 messages for CONVERSATION_ID
    ↓
Messages + current user input sent to LLM
    ↓
LLM generates response
    ↓
Exchange stored in ChatMemory (sliding window)
    ↓
Response returned to user
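
The flow above can be simulated without Spring or AWS. The sketch below is a simplified model under stated assumptions: ConversationFlowSketch and fakeLlm are hypothetical names, the "LLM" only echoes how much context it received, and memory lives in an in-process map keyed by conversation ID rather than in AgentCore.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical simulation of the request flow; no Spring AI or AgentCore types involved.
class ConversationFlowSketch {

    private static final int WINDOW = 20; // mirrors maxMessages(20) in the article

    private final Map<String, Deque<String>> memory = new HashMap<>();

    // Runs one request through the flow: retrieve, call the model, store, return.
    String chat(String conversationId, String userMessage) {
        Deque<String> window = memory.computeIfAbsent(conversationId, id -> new ArrayDeque<>());
        List<String> context = List.copyOf(window);      // history for this conversation
        String reply = fakeLlm(context, userMessage);    // history + current input go to the model
        store(window, "user: " + userMessage);           // exchange is stored afterwards
        store(window, "assistant: " + reply);
        return reply;
    }

    List<String> history(String conversationId) {
        Deque<String> window = memory.get(conversationId);
        return window == null ? List.of() : List.copyOf(window);
    }

    void clear(String conversationId) {
        memory.remove(conversationId);
    }

    private static void store(Deque<String> window, String message) {
        window.addLast(message);
        while (window.size() > WINDOW) {
            window.removeFirst();
        }
    }

    // Stand-in for the model call (an assumption for the demo).
    private static String fakeLlm(List<String> context, String userMessage) {
        return "reply to '" + userMessage + "' with " + context.size() + " earlier message(s)";
    }

    public static void main(String[] args) {
        ConversationFlowSketch flow = new ConversationFlowSketch();
        System.out.println(flow.chat("conv-a", "My name is Mahendra"));
        System.out.println(flow.chat("conv-a", "What is my name?"));
        System.out.println(flow.chat("conv-b", "What is my name?"));
    }
}
```

Note that "conv-b" starts with no history. The controller in this article uses a single static CONVERSATION_ID, so every caller shares one conversation; keying memory by a per-user or per-session ID, as sketched here, is one way to separate them.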

In the next part, I will cover the remaining AgentCore services: built-in tools such as the browser and code interpreter, and deployment to the Amazon Bedrock AgentCore runtime.

Everything comes from the companion repo, which contains fully working implementations of each example.

Happy Learning Spring AI

References

https://spring.io/ai
Amazon Bedrock AgentCore: https://docs.aws.amazon.com/bedrock-agentcore/latest/devguide/what-is-bedrock-agentcore.html
Spring AI: https://spring.io/projects/spring-ai
AWS Blog Spring AI SDK: https://aws.amazon.com/blogs/machine-learning/spring-ai-sdk-for-amazon-bedrock-agentcore-is-now-generally-available/

The post Explore Spring AI SDK – Amazon Bedrock AgentCore – Part 2 appeared first on foojay.

Crossing the River Styx: Spring Boot 3.5 and the Zombie Dependency Problem

Steve Poole — Sun, 19 Apr 2026 13:37:13 +0000

Table of Contents

The CVE Blind Spot
The River Styx

The Rules Changed. The Habits Didn't.
What This Looks Like in Practice

When Dependencies Become Zombies
Spring Boot 3.5: The Next Crossing

We've Seen This Film Before
The Window Is Open. For Now.

The Map, Not Just the Landscape

Tomorrow I set off (oh so early) for JCON Europe in Cologne and then, at the tail end of the week, head to Devoxx France to give more talks. If you're at either, come say hi. HeroDevs has a booth at both.

After digging into the CVE stories behind Tomcat 8.5's end of life, I turned my attention to Spring Boot 3.5. Same question, different framework: what actually happens to your security posture when a project crosses the EOL line?

The CVE Blind Spot

Most of us understand the idea of a CVE. A vulnerability gets discovered, reported, assigned a severity score, and patched. We run our scanners, check our dashboards, update our dependencies. The system works.

Except it doesn't. Not after 'End Of Life'.

It seems we all have a collective blind spot about where CVEs come from. We think about the output: the advisory, the patch, the scanner alert. We rarely think about the process or the people who do this work. Who finds vulnerabilities? Who reports them? Who assigns the CVE identifier?

And critically: what happens to that pipeline when a project reaches end of life?

The answer is that it dries up. Not all at once. Not even dramatically. It just... stops.

The River Styx

Think of moving from active development and maintenance into EOL mode as crossing the River Styx. On the living side, you have maintainers actively looking at the code. Security researchers submitting reports. A CNA (CVE Numbering Authority) assigning identifiers. A disclosure process that, for all its flaws, at least functions.

On the other side? Silence.

The vulnerabilities don't stop existing. The code doesn't magically become secure because nobody's maintaining it. What stops is the reporting. Researchers move their attention to supported versions. Maintainers stop triaging issues against the older branch. Fewer reports reach the CNA. Fewer identifiers get assigned for a codebase nobody's going to patch.

Those on the living, active side know about problems downstream. They can see the vulnerable patterns in the dead code. But they tell no one in any readily discoverable way. There's no obligation to, and no mechanism for it. They don't report the problem because they have no intention of fixing it.

That's been the model forever.

It's actually amazing that any of the problems are fixed at all. I'm certainly not pointing fingers at anyone to say that the way this has worked before was wrong. I'm always grateful to the people who develop and share their creations. Open Source is, well, amazing, and our developer lives would be immeasurably worse off without it.

The Rules Changed. The Habits Didn't.

However, the world has changed and open source is being weaponised against us. Our old certainties are being destroyed, diluted, compromised in the face of the relentless army of bad actors. When once it was ok to accept that EOL meant 'stable' and meant nothing-to-see-here-move-on, well now that's not true.

The maintainers' muscle memory says that not reporting a CVE against an EOL stream is the right thing to do (because they have no intention of fixing it). That muscle memory now works against us.

The bad actors? They see everything...

They watch the CVEs reported on maintained streams, take the juicy ones, and try them against the older EOL streams.

And voilà: a compromise that the maintainers are conceptually aware of but that's not in any CVE database. A free ride for the bad actors.

What This Looks Like in Practice

A vulnerability exists in both the supported and the EOL branch. On the supported side, a researcher finds it, reports it, gets a CVE assigned, ships a patch.

On the EOL side? The same vulnerability sits in the same code. But fewer researchers are looking. Fewer reports get filed. The vulnerability doesn't appear in your scanner results. Not because it doesn't exist, but because nobody filed the paperwork.

When Dependencies Become Zombies

Pretty quickly the public CVE count against an EOL project drops. If you're lucky, it's because there are none to be found. The codebase is what we'd traditionally call stable. But it's more likely the software didn't get safer. All that happened was the system that records the problems wound down.

Nobody, to my knowledge, has done a rigorous study of this effect. But ask anyone who works in open-source security support. It's the pattern they see every time. It's the core reason companies like the one I work for exist.

Your dependencies end up in one of two states: actually stable, or more likely, zombies. Out of support and with hidden CVEs accumulating. Technically present in your stack. Functionally dead from a security standpoint. Slowly deteriorating whilst your scanners give you a green light.
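
One way to stop reading a green scan as "safe" is to combine the scanner result with the support status. The sketch below is a hypothetical illustration, not a real scanner integration: ZombieCheck, its Status values, and the inputs are all assumed names. The only date taken from this article is the Spring Boot 3.5 end of open-source support on June 30, 2026.

```java
import java.time.LocalDate;

// Hypothetical classifier: a clean scan on an EOL component is "unverified", not "clean".
class ZombieCheck {

    enum Status { SUPPORTED_CLEAN, SUPPORTED_VULNERABLE, EOL_VULNERABLE, EOL_UNVERIFIED }

    static Status classify(LocalDate endOfSupport, int knownCves, LocalDate today) {
        boolean eol = today.isAfter(endOfSupport);
        if (knownCves > 0) {
            return eol ? Status.EOL_VULNERABLE : Status.SUPPORTED_VULNERABLE;
        }
        // Zero known CVEs after EOL may simply mean nobody is reporting any more.
        return eol ? Status.EOL_UNVERIFIED : Status.SUPPORTED_CLEAN;
    }

    public static void main(String[] args) {
        LocalDate springBoot35End = LocalDate.of(2026, 6, 30);
        System.out.println(classify(springBoot35End, 0, LocalDate.of(2026, 4, 19)));
        System.out.println(classify(springBoot35End, 0, LocalDate.of(2026, 7, 1)));
    }
}
```

The same zero-CVE scan result moves from SUPPORTED_CLEAN to EOL_UNVERIFIED the day after support ends. Nothing about the code changed, only how much the silence can be trusted.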

We need to stop thinking silence means stability. It's frequently the opposite.

Spring Boot 3.5: The Next Crossing

Spring Boot 3.5 reaches end of open-source support on June 30, 2026. That's roughly 80 days from now.

When it crosses that line, it doesn't go alone. Spring Framework 6.2, Spring Security, and the entire Spring portfolio lose community patches simultaneously. The CVE reporting pipeline protecting a vast number of Java applications starts winding down for these versions.

We've Seen This Film Before

Spring Boot 2.7 went EOL in November 2023. Since then, multiple CVEs have surfaced for that branch. CVE-2024-38807, for example: a signature spoofing vulnerability in the boot loader. No open-source patches available. Teams still running 2.7 have to find the fix themselves, pay for commercial support, or accept the risk.

And the longer 2.7 sits in EOL, the quieter the CVE stream gets. Not safer. Quieter. Maybe you can hear the sound of dragging feet...

Based on that pattern, Spring Boot 3.5 will almost certainly follow the same trajectory. The transition from stable to zombie isn't a question of "if." It's a question of how fast the reporting pipeline dries up once the maintainers shift focus to 4.0.

The Window Is Open. For Now.

But here's the thing: it doesn't happen overnight. There is time. The zombie transition is gradual, and that window matters.

The quicker you assess the scale of the change from 3.5 to 4.0, the better positioned you'll be. Maybe that means migrating on your own terms. Maybe it means arranging commercial support to bridge the gap, or finding another path entirely. The worst move is to wait until the silence sets in and assume everything is fine.

The Map, Not Just the Landscape

That's the landscape. Now let's talk about the map.

In my recent JDK 8 to 25 review, I started walking through every major change across seventeen years of Java releases, mapping out what teams actually face when they finally modernise. I'm going to do the same for Spring Boot 3.5 to 4.0.

In the coming articles, I'll cover the technical challenges organised by severity. The obvious compilation errors, the runtime failures and hidden behavioural changes that may slip past your test suite. I'll look at the costs, explore the alternatives, and break down what a realistic migration timeline looks like.

The zombie transition is coming for Spring Boot 3.5. The only question is whether you'll be ready for it or surprised by it. If you're at JCON or Devoxx France this week, come find me at the HeroDevs booth. I'd love to swap migration war stories.

Steve Poole is a Java Champion, Oracle ACE, and IBM Champion. He is also a developer advocate at HeroDevs and author of the No Regressions newsletter. Find him at the HeroDevs booth at JCON or Devoxx France.

The post Crossing the River Styx: Spring Boot 3.5 and the Zombie Dependency Problem appeared first on foojay.

Grails Isn’t Done Yet (Part 2): EOL, Spring Boot, and What Comes Next

Steve Poole — Wed, 01 Apr 2026 08:48:56 +0000

Table of Contents

The inflexion point
Where Grails versions stand today
The Spring Boot gravitational pull
What the risk actually looks like
The practical middle ground
Upgrade is an action, not a strategy
Summary
Resources

In the companion article to this one, I looked at the revitalisation of Grails under the Apache Software Foundation: the 18-month migration, the technical modernisation, and the release of Grails 7 as a Top-Level ASF Project. That is the good-news story, and it is a genuinely impressive piece of community engineering.

This article is about the other side of the same coin.

While Grails moves forward, many of the applications built on it cannot move at the same pace. The result is a growing gap between where the framework is heading and where a significant number of production systems actually sit. Understanding that gap, and what options exist for managing it, is what this piece is about.

The inflexion point

The alignment of Grails with modern Spring Boot and Java baselines brings us to a critical inflexion point in 2026. While the framework is being revitalised, the “gravity” of the underlying ecosystem is shifting. Many legacy Grails applications remain tied to versions of Spring Boot and Java that are rapidly approaching or have already reached End of Life.

Where Grails versions stand today

The Apache Grails support schedule tells the story clearly:

Grails 3 and 4 have reached End of Support.
Grails 5 ended support in June 2024.
Grails 6 (the last pre-ASF release, with 6.2.3 shipping in January 2025) reached End of Support in June 2025.
Grails 7 is in Active Maintenance, with support through June 2026.
Grails 7.1 and 8 are in Active Development, with Grails 8 maintenance support targeted for December 2026.

That is the Grails layer. Beneath it, the picture gets more complicated.

The Spring Boot gravitational pull

Beneath all of this, the Spring Boot timelines create their own gravitational pull.

Spring Boot follows a six-month release cycle with roughly 13 months of open-source support per release. That sounds generous until you lay the dates out: Spring Boot 3.3 OSS support ended in June 2025. 3.4 ended in December 2025. 3.5 runs until June 2026. Spring Boot 4.0 (released November 2025) has OSS support through December 2026.
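
The arithmetic behind those dates is simple enough to sketch. The snippet below treats "roughly 13 months" as exactly 13 calendar months, which is an approximation for illustration rather than the official policy wording; SupportWindow and endOfOssSupport are hypothetical names.

```java
import java.time.YearMonth;

// Hypothetical helper: approximates the end of OSS support as release month + 13 months.
class SupportWindow {

    static final int SUPPORT_MONTHS = 13;

    static YearMonth endOfOssSupport(YearMonth release) {
        return release.plusMonths(SUPPORT_MONTHS);
    }

    public static void main(String[] args) {
        // Spring Boot 4.0 was released in November 2025 (from the article).
        YearMonth springBoot40Release = YearMonth.of(2025, 11);
        System.out.println(endOfOssSupport(springBoot40Release)); // prints 2026-12
    }
}
```

For 4.0 this lands on December 2026, matching the date quoted above. Always check the published support schedule for the authoritative dates rather than relying on the rule of thumb.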

For teams still running applications on Spring Boot 2.x, open-source support ended years ago. Only commercial extended support remains available. The window is not slamming shut. It is closing steadily, and each version that falls off the end makes the next upgrade harder.

Taken together, this is less a single deadline and more a slow-moving dependency cliff.

What the risk actually looks like

The primary corporate risk is rarely that these systems suddenly stop working. Mature Grails applications are typically very stable. The real exposure appears more slowly and more quietly, when organisations lose visibility of their dependency health and drift out of a supported posture without fully realising it.

In the Java ecosystem, supply chain health matters far more than announcement-day excitement.

A team running Grails 4 on Spring Boot 2.x and Java 11 does not wake up one morning to a broken application. What they wake up to, eventually, is a CVE that applies to their stack and no upstream patch to apply. Or a compliance audit that flags unsupported components. Or a new integration requirement that demands a Java version their framework cannot support.

The danger is not sudden failure. It is the slow accumulation of exposure that nobody is tracking.

The practical middle ground

In practice, most organisations are not choosing between “upgrade tomorrow” and “do nothing.” Reality is rarely that clean. Portfolio constraints, regulatory timelines, and simple engineering capacity mean many teams need a supported holding pattern while they plan their next move.

Increasingly, this is where commercial continuity support for end-of-life open source is emerging as a pragmatic middle ground. A small but growing number of providers now specialise in keeping critical open source components supported beyond their community end-of-life, giving teams breathing room without forcing rushed or poorly sequenced migrations.

Even the ASF itself acknowledges this reality: the Foundation does not offer commercial support, but it recognises that not everyone can keep pace with upstream release cadences.

Upgrade is an action, not a strategy

The reflex response to EOL exposure is “just upgrade.” And upgrading is, eventually, the right thing to do. But treating it as a strategy rather than an action ignores the complexity of real-world systems.

A Grails 4 application is not merely a Grails application. It is a Spring Boot 2.x application, running on a specific Java version, with a specific set of transitive dependencies, deployed into a specific infrastructure.

Upgrading Grails means upgrading Spring Boot, which means upgrading Java, which means re-validating every integration point, every test suite, every deployment pipeline.

For teams with a single application, that is manageable. For teams with a portfolio of services, some of which were built by people who have since left the organisation, it is a multi-quarter programme of work.

Pretending otherwise does not make the problem smaller. It just makes the plan worse.

What to actually do about it

Know what you’re running

This sounds obvious. It often is not. Fat JARs, shaded dependencies, containers, vendor forks, embedded runtimes: over time, even the teams shipping a system may no longer be certain what is actually inside it. SBOMs are not insight. They are institutional memory. Start there.

Understand your exposure window

Map your Grails and Spring Boot versions against the support schedules. Know which components are still covered, which are approaching EOL, and which have already passed it. This does not require a commercial tool. It requires someone spending a day with a spreadsheet.
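
If a spreadsheet feels too manual, the same mapping fits in a few lines of Java. This is a minimal sketch under assumptions: the months come from the dates quoted earlier in this article, but the exact day of the month and the 180-day "approaching" threshold are my own placeholders, and ExposureWindow is a hypothetical name.

```java
import java.time.LocalDate;
import java.time.temporal.ChronoUnit;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical exposure check against a hand-maintained support schedule.
class ExposureWindow {

    // Months are from the article; the day of month is assumed to be the last day.
    static final Map<String, LocalDate> SPRING_BOOT_OSS_END = new LinkedHashMap<>();
    static {
        SPRING_BOOT_OSS_END.put("3.3", LocalDate.of(2025, 6, 30));
        SPRING_BOOT_OSS_END.put("3.4", LocalDate.of(2025, 12, 31));
        SPRING_BOOT_OSS_END.put("3.5", LocalDate.of(2026, 6, 30));
        SPRING_BOOT_OSS_END.put("4.0", LocalDate.of(2026, 12, 31));
    }

    // Assumed threshold: anything within 180 days of EOL counts as "approaching".
    static final long APPROACHING_DAYS = 180;

    static String status(String version, LocalDate today) {
        LocalDate end = SPRING_BOOT_OSS_END.get(version);
        if (end == null) {
            return "unknown";
        }
        if (today.isAfter(end)) {
            return "past EOL";
        }
        long daysLeft = ChronoUnit.DAYS.between(today, end);
        return daysLeft <= APPROACHING_DAYS ? "approaching EOL" : "covered";
    }

    public static void main(String[] args) {
        LocalDate today = LocalDate.of(2026, 4, 1);
        for (String version : SPRING_BOOT_OSS_END.keySet()) {
            System.out.println("Spring Boot " + version + ": " + status(version, today));
        }
    }
}
```

Versions missing from the table come back as "unknown" rather than "covered", which is deliberate: an unlisted component is a gap in your inventory, not evidence of support.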

Buy time if you need it

If a full upgrade is not feasible in the near term, commercial continuity support for EOL components can keep your systems in a supported posture while you plan. This is not a permanent solution (unless you're aiming to retire the app real-soon-now), but it is a pragmatic one for teams that need breathing room.

Commercial EOL support

If you are currently assessing the support posture of older Grails estates, it is worth understanding the continuity support options available in the Java ecosystem. The landscape has evolved significantly in the past few years.

See here for some ‘official’ offerings

I work for one on that list: HeroDevs (see later for the full disclaimer), who provide support for Grails and many other Java and non-Java products.

Providing EOL support is not a simple undertaking. It requires particular skills and knowledge, which I like to believe HeroDevs excels at. Visit their website: https://herodevs.com

Plan the upgrade as a programme, not a task

If you have multiple Grails applications at different versions, sequence the work. Prioritise by exposure, not by convenience. Treat the dependency cliff as the engineering constraint it is, and fund it accordingly.

Summary

The Grails revitalisation under the ASF is real, and it matters. But it does not retroactively protect the applications that were built on earlier versions of the framework. Those systems need their own plan.

In an industry that celebrates only the new, the work of keeping older systems safe and supported is easy to overlook. It probably should not be.

Resources

Apache Grails Support Schedule
Spring Boot End of Life Dates
Grails 7.0.0 Release Announcement

Author’s note: Full Disclosure

In the interest of transparency: I work for HeroDevs, a company that provides extended security support for end-of-life open source components (including Java ecosystem frameworks) and funds open source maintainers through its sustainability programme.

Where HeroDevs tools or services are referenced in this article it’s because I truly believe that what they offer is significant and relevant. My views on open source sustainability and EOL risk are formed independently and predate that relationship.

The post Grails Isn’t Done Yet (Part 2): EOL, Spring Boot, and What Comes Next appeared first on foojay.

Grails Isn’t Done Yet (Part 1): Inside the ASF Reboot

Steve Poole — Wed, 25 Mar 2026 08:30:21 +0000

Table of Contents

The technology we stop seeing
Why the Apache move matters
Twenty years of changing hands
Eighteen months of migration
One hundred repositories become nine
Beyond the code: licensing and compliance
The modernisation you might have missed
What Grails 7 actually ships
Grails 8 and the release cadence
The humans behind the reboot
Where Grails realistically sits in 2026
The hard work that keeps software alive

Steve Poole | With contributions from James Fredley, Apache Grails PMC Chair

For a technology that many people filed under “legacy,” Grails has been unusually active. While much of the industry’s attention has drifted toward newer frameworks and shinier stacks, something more deliberate has been happening in the background.

Grails has been moving into the Apache Software Foundation (ASF), modernising and positioning itself for the next chapter.
If you have not looked at Grails recently, your mental model is likely several years out of date. And that, in many ways, is exactly the problem.

The technology we stop seeing

Software ecosystems rarely end with a bang; most of the time, they simply slip out of focus. Conference agendas move on, blog coverage thins out, and new frameworks capture the narrative. Eventually, we collectively “agree” that a technology is “basically done”.

Except in enterprise environments, that is often not true at all. There are still Grails applications in production, processing transactions and serving customers. But while the systems remain, the organisational spotlight has shifted.

There is a significant gap between what gets hype and what actually runs the web.

According to W3Techs, PHP powers roughly 71.8% of all websites whose server-side language is known. Between 40% and 60% of the web runs on WordPress alone.

JavaScript, for all its conference-circuit dominance, accounts for under 6% on the server side. The technologies that quietly keep the internet running and the ones that dominate the narrative are often not the same technologies at all.

Why the Apache move matters

The transition of Grails into the ASF is not merely administrative tidying. Moving under the ASF umbrella is one of the clearest signals an open-source project can send about its long-term intent.

ASF provides a neutral home, predictable release discipline, and a contributor model that reduces perceived vendor risk.

For Grails, this matters because mature platforms live or die on trust signals. The ASF move changes the risk conversation for organisations evaluating whether Grails still has a place on their servers.

Twenty years of changing hands

The context makes the move even more significant. Grails was primarily led by single organisations for most of its 20-year history: G2One from 2005 to 2008, then SpringSource through 2015, Object Computing through 2021, and the Grails Foundation/Unity Foundation through 2025.

Each transition introduced uncertainty about the project’s direction and sustainability.

The ASF model is designed to break that pattern, replacing single-organisation dependency with volunteer-driven governance, vendor neutrality, and the structured transparency of the Apache Way.

Eighteen months of migration

In October 2025, Grails officially graduated from incubation to become a Top-Level Project at the ASF, following a board vote in September.

That sounds like a single event. It wasn’t. The migration was an 18-month process that began in late spring 2024 with a volunteer team assessing project readiness and submitting an incubation proposal.

What followed was a substantial modernisation effort: merging repositories into a mono-repo, overhauling the build system and dependency management, upgrading Maven coordinates, and issuing releases under ASF governance. The first ASF release (Milestone 4) shipped in June 2025, with the 7.0.0 General Availability release arriving in October.

One hundred repositories become nine

The scale of the repository consolidation tells its own story. Grails originally had around 100 Git repositories, of which 43 were slated for ASF migration. By the time the move was complete, those had been consolidated to 18, with only 9 still in active use.

That is a lot of plumbing.

The mono-repo approach accelerated compliance with ASF policy but required integrating separate build systems and release processes across hundreds of commits.

Over 2,000 commits went into the grails-core mono-repo alone, and build times for a release dropped from over three weeks to approximately 30 minutes.

Read that again. Three weeks to thirty minutes.

Beyond the code: licensing and compliance

The code was only part of it. The team also had to meet ASF security and licensing requirements. Reproducible, verifiable builds were implemented (requiring upstream contributions to dependencies including Apache Groovy).

Every source file was reviewed for licence headers, and 327 separate artefacts were audited for licensing compliance. The team automated licence review by adopting Software Bill of Materials for every published jar, ensuring ongoing compliance with reduced future effort.

Migrating the fully automated Gradle and GitHub Actions workflows proved to be a novel challenge in its own right; other Gradle-based projects at the ASF are now looking at the result as a reference implementation.

The modernisation you might have missed

A significant amount of careful modernisation has been focused on keeping Grails aligned with the moving baseline of the JVM and the Spring ecosystem.

This is not cosmetic: dependencies have been pulled forward, and compatibility with newer Java runtimes has been tightened.

What Grails 7 actually ships

Grails 7.0.0 shipped in October 2025 as the first stable release under ASF stewardship. It brings major dependency upgrades including Java 17+ support (through to Java 25), Groovy 4, Spring Boot 3.5, Spring Framework 6.2, and Jakarta EE 10.

Alongside the platform alignment, the release introduced containerised browser testing via Testcontainers and Geb, optional Micronaut integration, SBOM generation for all published binaries, and reproducible builds and artefacts.

The grails-core mono-repo now produces over 325 published jar files across 109 Gradle projects, with local build times between two and ten minutes depending on caching and hardware.

Grails 8 and the release cadence

Grails 8 development started in late November 2025, tracking Spring Boot 4.0, which reached general availability at the end of that month.

The project now follows Spring Boot’s six-month release cadence, with 13 months of support per release, giving teams predictable timelines to plan around.

The humans behind the reboot

Open-source projects do not evolve by inertia. They move forward because a relatively small number of people decide the work is worth doing.

One of the challenges Grails faces today is not a lack of activity but a lack of visible narrative.

Much of the effort is concentrated in a tight group of committed maintainers. From the outside, that can appear to be silence even when meaningful progress is underway.

To make that work more visible, I spoke with James Fredley, the Apache Grails PMC Chair, about where the project stands and where it is heading.

What motivated the move to the Apache Software Foundation?

There were real questions about Grails’ future, and they were understandable.

The concerns crept in during the 4.x through 6.x era, when the project moved through several organisations and its direction felt uncertain. For most of its 20-year history, Grails was primarily led by a single organisation at any given time, with limited community contributions or input.

The move to the ASF was about addressing that directly: shifting from single-organisation dependency to a volunteer-driven, vendor-neutral model. The ASF’s structure (the Project Management Committee, mailing lists, consensus-based voting, and the incubation process) gives people confidence that the project is sustainable, not dependent on any one company’s priorities.

From inside the project, what kind of technical work has been happening?

The scale of it probably surprises people. Thousands of hours of volunteer time have gone into modernising the 7.x line and building toward 8.x.

We consolidated from around 100 repositories down to 18 (with 9 active), rewrote the build and release pipeline, achieved reproducible and verifiable builds, implemented SBOM generation, and ensured licensing compliance across hundreds of artefacts.

Grails 7 now produces over 325 published jar files across 109 Gradle projects, with local build times between two and ten minutes. The release process itself went from a three-week ordeal to about 30 minutes.

Migrating our fully automated Gradle and GitHub Actions workflows to the ASF was a novel challenge, but grails-core can now serve as a model for other Gradle-based projects joining the Foundation.

What people also need to understand is that a Grails application is a Spring Boot application.

With roughly 85–90% of Java applications running on Spring Boot, Grails is not some exotic outlier: it is extra developer-productivity layers on top of what everyone in the Java ecosystem already uses.

What do you hope the ASF transition unlocks?

Broader adoption and broader contribution. The ASF gives us credibility with enterprise decision-makers who need to know a framework will still be around in five or ten years.

But it also lowers the barrier for new contributors. The governance is transparent, the processes are well-documented, and the project is genuinely welcoming.

Grails now follows Spring Boot’s release cadence: a six-month cycle with 13 months of support, which gives teams predictable timelines to plan around.

What misconception about Grails would you most like to correct?

That it’s a legacy technology for legacy teams.

Grails is still the most productive way to build a web application in the Java ecosystem, and that should be a draw for newer engineers and greenfield projects, not just established estates.

The convention-over-configuration approach means less boilerplate, sensible defaults, and a gentle learning curve.

It is a “framework of frameworks,” built on Spring Boot, Spring Framework, Jakarta EE, and Hibernate.

If you know those, you already know a significant part of Grails.

Where Grails realistically sits in 2026

Grails is not trying to out-Spring Boot Spring Boot. Where it continues to make sense is in environments that value convention-heavy productivity and rapid delivery, particularly where there is already meaningful investment in the Groovy ecosystem.

For teams with established Grails estates, the question isn’t “does it work?” but “is it still safe to stay?”

The ASF graduation, the release of Grails 7 (supporting Java 17 through 25), and the active development of Grails 8 tracking Spring Boot 4 are designed to lower the perceived risk of that decision. But that safety is contingent on moving forward.

For teams evaluating new projects, the productivity argument deserves a fresh hearing. As Fredley puts it, Grails is extra developer-productivity layers on top of what 85–90% of the Java ecosystem already uses. That framing (not “legacy framework” but “productivity accelerator built on Spring Boot”) is a different proposition from the one most people have filed away in their mental models.

The hard work that keeps software alive

Software rarely dies because of a single technical flaw; it fades because attention moves somewhere else.

What the current maintainers are doing is the careful, methodical work required to keep a mature framework viable in a fast-moving ecosystem.

In an industry that celebrates only the new, that kind of work, and the difficult EOL conversations it requires, is easy to overlook.

It probably should not be.

Of course, none of this helps the teams still running Grails 3 or 4 on their servers. For them, the dependency cliff is already here. In part two, I want to look at what that cliff actually looks like and what the options are.

Author’s note:

In the interest of transparency, I work for HeroDevs, a company that provides extended support for end-of-life open-source components. If you are currently assessing the support posture of older Grails estates, it is worth understanding the continuity support options available in the Java ecosystem. The landscape has evolved significantly in the past few years.

The post Grails Isn’t Done Yet (Part 1): Inside the ASF Reboot appeared first on foojay.

What is Sharding in MongoDB and When Should You Use It?

A Practical Introduction to Horizontal Scaling

The Scaling Problem Most Databases Face

What is Horizontal Scaling?

What is Sharding in MongoDB?

MongoDB Sharded Cluster Architecture

1. Shards

2. Config Servers

3. Mongos Router

Choosing a Shard Key

Creating a Sharded Collection

Querying Data in a Sharded Cluster

When Should You Use Sharding?

Large datasets

High write throughput

Rapid data growth

When Sharding Might Be Overkill

Sharding vs Replication

Final Thoughts

Jakarta EE is Ready for AI – But Don’t Just Take My Word for It!

Where Jakarta EE Comes From and Where It's Headed

The Past, Present, and Future of Enterprise Java - Ivar Grimstad (Eclipse Foundation)

Jakarta EE Meets AI: Three Angles on the Same Problem

The Intelligent Monolith: Supercharging Jakarta EE with Local AI - Luqman Saeed (Azul)

Jakarta EE 11 Meets AI: Building Intelligent Microservices with Virtual Threads and Jakarta Data - Luqman Saeed (Azul)

Production-ready Agentic AI: Building Enterprise-grade Java Systems with Jakarta EE and MicroProfile - Kenji Kazumura (Fujitsu)

Getting the Fundamentals Right

API = Some REST and HTTP, right? RIGHT?! - Rustam Mehmandarov (Miles)

Azul Payara May 2026 Release – What’s New

A critical security fix, patched across every supported branch

Azul Payara Community 7.2026.5

Security Fixes

Bug Fixes

Improvements

Component Upgrades

Azul Payara 6.38.0: Continued Jakarta EE 10 Support

Bug Fixes

Improvements

Component Upgrades

Azul Payara 5.87.0: Jakarta EE 8 Support Continues

Bug Fixes

Improvements

Component Upgrades

Azul Payara 4.1.2.191.55: Legacy Branch Still Maintained

Bug Fixes

Looking Ahead

Upgrading and Feedback

BoxLang AI Deep Dive — Part 6 of 7: Memory Systems & RAG — Building AI That Remembers

🧠 Two Categories of Memory

📋 Standard Memory Types

Summary Memory — How It Actually Works

🔍 Vector Memory Types

Hybrid Memory — The Best of Both

🏢 Per-Call Multi-Tenant Identity Routing

📚 Document Loaders

🔗 Building a Complete RAG Pipeline

Step 1: Ingest

Step 2: Query

Step 3: Hybrid for Production

🔧 Token Management

🏗️ Multiple Memories Per Agent

📦 The aiPopulate() BIF — Structured Memory Without Live Calls

What's Next

The Code Was Always the Door

The doorman in a hoodie

The shepherd

Read the terrain

Step 4: Add the below `MemoryConfig` class.

Step 6: Add the below `ShortTermController` class.

POST `/api/short` – Chat Endpoint

GET `/api/history` – Retrieve Conversation History

DELETE `/api/history` – Clear History