foojay – a place for friends of OpenJDK

BoxLang AI 3.2.0 — Image Generation, Web Search, Fluent Audio, Agent Registry & MCP Observability

Cristobal Escobar — Tue, 02 Jun 2026 12:27:07 +0000

BoxLang AI 3.2.0 is here, and it's a landmark release. We're shipping five major features: image generation, web search, a fluent audio builder API, a centralized agent registry, and deep MCP observability along with a suite of analytics improvements and a critical bug fix. Let's dig in.

Image Generation — aiImage()
You can now generate images directly from BoxLang using any provider that supports text-to-image generation. The aiImage() BIF follows the same fluent, chainable philosophy as the rest of bx-ai then act on the result with expressive method calls.

// Generate and save in one fluent chain
aiImage( "A futuristic cityscape at sunset" )
    .saveToFile( "/images/cityscape.png" )

// Full control with params and provider
response = aiImage(
    "A watercolor painting of a mountain lake",
    { n: 2, size: "1024x1024", quality: "hd" },
    { provider: "openai" }
)

// Embed directly in HTML output
dataURI = response.toDataURI()

The returned AiImageResponse object gives you everything you need: hasImages(), getCount(), getFirstURL(), getFirstBase64(), saveToFile(), saveAllToDirectory(), toDataURI(), getMimeType(), and toStruct().

Supported providers out of the box:

Provider	Model	Env Var
OpenAI	gpt-image-1 (default), DALL-E models	OPENAI_API_KEY
Gemini	imagen-3.0-generate-008	GEMINI_API_KEY
Grok / xAI	grok-2-image	GROK_API_KEY
OpenRouter	FLUX Schnell (default), many others	OPENROUTER_API_KEY

A generateImage@bxai agent tool is auto-registered in the global tool registry at module startup, so your agents can generate images without any manual wiring:

agent = aiAgent( tools: [ "generateImage@bxai" ] )

Image Generation Docs

Web Search — aiWebSearch() & aiWebSearchAsync()
BoxLang AI now ships a unified web search system with provider abstraction and normalized results. Every provider returns the same fields — title, url, snippet, publishedDate, domain, score, thumbnail, language — so you can swap providers without touching your code.

// Synchronous search
results = aiWebSearch( "latest BoxLang AI updates", { provider: "brave", maxResults: 8 } )

// Async — returns a BoxFuture
future = aiWebSearchAsync( "BoxLang release highlights", { provider: "tavily" } )
results = future.get()

Supported providers:

Provider	Notes
http	URL fetching & parsing — no API key required
brave	Privacy-focused; country/language filters
google	Google Custom Search
tavily	Retrieval-focused, great for AI agents
exa	Semantic and neural search modes

The webSearch@bxai tool is auto-registered globally, so any agent can search the web immediately:

agent = aiAgent(
    name: "ResearchAgent",
    tools: [ "webSearch@bxai" ]
)

response = agent.run( "Find and summarize recent BoxLang AI release highlights" )

Web Search Docs

Fluent Builder API for Audio BIFs
aiSpeak(), aiTranscribe(), and aiTranslate() now support a full fluent builder API. Call any of them with no arguments to get the request object back, then chain your configuration before executing. The traditional positional-argument syntax continues to work exactly as before — the fluent builder is purely additive.

aiSpeak()

// Traditional syntax — still works
audio = aiSpeak( "Hello!", { voice: "nova" }, { provider: "openai" } )

// Fluent builder — expressive and self-documenting
audio = aiSpeak()
    .of( "Hello, world!" )
    .voice( "nova" )
    .provider( "openai" )
    .asMP3()
    .speak()

// Gender shortcuts
audio = aiSpeak()
    .of( "Welcome aboard!" )
    .male()
    .speed( 1.2 )
    .speak()

// Format shortcuts
audio = aiSpeak()
    .of( "System alert." )
    .asWav()
    .outputFile( "/audio/alert.wav" )
    .speak()

Key builder methods: .of(), .voice(), .male() / .female(), .speed(), .instructions(), .outputFile(), .asMP3() / .asWav() / .asFlac() / .asOpus() / .asPCM(), .provider(), .speak().

aiTranscribe()

// From file
text = aiTranscribe()
    .file( "/audio/meeting.mp3" )
    .withWordTimestamps()
    .asVerboseJSON()
    .transcribe()

// From URL
text = aiTranscribe()
    .url( "https://example.com/audio.mp3" )
    .language( "es" )
    .transcribe()

// Translate audio directly to English
english = aiTranscribe()
    .file( "/audio/french.mp3" )
    .translate()

Key builder methods: .file(), .url(), .data(), .language(), .withWordTimestamps(), .withSegmentTimestamps(), .diarize(), .asJSON() / .asText() / .asVerboseJSON() / .asSRT() / .asVTT(), .transcribe(), .translate().

aiTranslate()

english = aiTranslate()
    .file( "/audio/german.mp3" )
    .asText()
    .translate()

Audio Docs

Agent Registry — aiAgentRegistry()
3.2.0 introduces the AIAgentRegistry — a global singleton that gives you centralized discoverability, observability, and lifecycle management for all agents running in your BoxLang application.

// Auto-register at creation time
agent = aiAgent(
    name: "support-agent",
    description: "Customer support agent",
    register: true,
    module: "my-app"
)

// Or register manually
aiAgentRegistry().register( agent, "my-app" )

// Discover what's running
agents = aiAgentRegistry().listAgents()
info   = aiAgentRegistry().getAgentInfo( "support-agent@my-app" )

// Resolve a mixed array of string keys and live instances
resolved = aiAgentRegistry().resolveAgents( [
    "support-agent@my-app",
    anotherAgentInstance
] )

// Clean up
aiAgentRegistry().unregister( "support-agent@my-app" )
aiAgentRegistry().unregisterByModule( "my-app" )

Module Authors: First-Class Agent & Tool Registration
This is a big deal for the BoxLang ecosystem. Developers building BoxLang modules can now ship agents and tools that auto-register themselves globally when the module loads — no manual wiring by the application developer required.

Define your aiAgent() instances with register: true and a module namespace
Define your tools, scan them via aiToolRegistry().scan( new MyTools(), "my-module" ), and they appear globally as toolName@my-module
Application developers can consume your agents and tools by name, from any part of their app, the moment your module is installed
This makes bx-ai a genuine platform for building composable, discoverable AI ecosystems — publish a module to ForgeBox, and your agents and tools show up ready to use.

Two new interception points fire on registry changes: onAIAgentRegistryRegister and onAIAgentRegistryUnregister.

MCP Server Pause/Resume
MCPServer now supports pausing and resuming without tearing down configuration or losing registered tools. Ideal for maintenance windows, graceful degradation, or controlled rollouts.

server = MCPServer( "my-tools", "Provides custom tools" )
    .registerTool( myTool )

server.pause()

if ( server.isPaused() ) {
    println( "Server is paused — rejecting all non-ping requests" )
}

server.resume()

pause() — fires onMCPServerPause; all non-ping requests receive error code -32005
resume() — fires onMCPServerResume; normal handling restored
getSummary() now includes a paused boolean
MCP Server & Client Observability
Server Analytics
MCP server monitoring gets a major overhaul in 3.2.0:

Thread-safe counters using named locks across all stat operations
Security failure tracking — auth failures, API key rejections, body-size violations all get dedicated counters
Per-tool error tracking — byTool[name].errors with errors.byTool roll-up
Active concurrent request counter — activeRequests increments and decrements in real time
Requests-per-minute rate — exposed in getSummary()
X-Request-ID correlation — request IDs echoed in response headers and event payloads
Paused-request stats — rejected requests tracked when server is paused
onMCPError now fires for METHOD_NOT_FOUND
Client Stats — MCPClient
MCPClient gains full internal usage and performance tracking:

client = MCP( "http://localhost:3000" )

tools  = client.listTools()
result = client.callTool( "search", { query: "BoxLang" } )

// Inspect what's happening
stats   = client.getStats()   // per-operation, per-tool, per-URI breakdowns
summary = client.getSummary() // totalCalls, successRate, avgResponseTime

// Reset when needed
client.resetStats()

Three new interception points cover the full client lifecycle: onMCPClientRequest, onMCPClientResponse, onMCPClientError.

Type-Aware Tool Argument Support
Tool schemas in bx-ai are now generated directly from callable parameter metadata, so LLMs finally receive accurate JSON Schema types for every argument instead of a flat bag of strings. ClosureTool.getArgumentsSchema() maps BoxLang types naturally — numeric, integer, float, and double become "number", boolean becomes "boolean", array becomes "array" with "items": {}, and struct becomes "object" — meaning LLMs can send native JSON values for non-string arguments and tools behave exactly as their signatures declare. On the output side, BaseTool.invoke() continues to serialize results consistently for provider compatibility, converting simple values via toString() and complex values via JSON serialization, keeping the tool contract clean in both directions.

// Tool with numeric and boolean arguments
// LLM sends { "quantity": 3, "applyDiscount": true } — no casting needed
calculateTotal = aiTool(
    name: "calculateTotal",
    description: "Calculate order total with optional discount",
    tool: ( numeric price, numeric quantity, boolean applyDiscount = false ) -> {
        total = price * quantity
        if ( applyDiscount ) total *= 0.9
        return { summary: "Order total calculated", total: total }
    }
)

// Tool with an array argument
// LLM sends { "tags": ["boxlang", "ai", "tools"] } — native array
tagContent = aiTool(
    name: "tagContent",
    description: "Apply a list of tags to a content item",
    tool: ( string contentId, array tags ) -> {
        // tags arrives as a real BoxLang array
        return {
            summary : "Tags applied to #contentId#",
            applied : tags.len(),
            tags    : tags
        }
    }
)

// Tool with a struct argument
// LLM sends { "filter": { "status": "active", "minAge": 18 } } — native struct
queryUsers = aiTool(
    name: "queryUsers",
    description: "Query users by filter criteria",
    tool: ( struct filter, numeric limit = 10 ) -> {
        results = userService.query( filter, limit )
        return {
            summary : "Found #results.len()# users",
            count   : results.len(),
            data    : results
        }
    }
)

agent = aiAgent(
    tools: [ calculateTotal, tagContent, queryUsers ]
)

Bug Fix — ClosureTool.doInvoke() JSON Struct Handling
MCP clients that send JSON fields as real objects or arrays (rather than pre-stringified JSON) no longer cause "Can't cast Struct to a string" errors. doInvoke() now inspects declared parameters and calls jsonSerialize() on any non-simple value whose declared type is string. Silent, automatic, no code changes required.

Module Configuration
New image Settings Block

{
  "modules": {
    "bxai": {
      "settings": {
        "image": {
          "defaultProvider": "openai",
          "defaultApiKey": "",
          "defaultModel": "gpt-image-1",
          "defaultSize": "1024x1024",
          "defaultQuality": "standard",
          "defaultStyle": "vivid",
          "defaultInstructions": ""
        }
      }
    }
  }
}

New Interception Points
3.2.0 brings bx-ai to 50 total interception points, adding 10 new events:

Event	When Fired
beforeAIImageGeneration	Before image generation request
afterAIImageGeneration	After image generation response
onAIImageRequest	Image request object created
onAIImageResponse	Image response received
onAIAgentRegistryRegister	Agent registered
onAIAgentRegistryUnregister	Agent unregistered
onMCPServerPause	MCP server paused
onMCPServerResume	MCP server resumed
onMCPClientRequest	MCP client HTTP request
onMCPClientResponse	MCP client HTTP response
onMCPClientError	MCP client HTTP error

Upgrade Now

# CommandBox
box install bx-ai

# OS
install-bx-module bx-ai

Full Docs: ai.ortusbooks.com Community: community.ortussolutions.com GitHub: github.com/ortus-boxlang/bx-ai

BoxLang AI 3.2.0 is a platform release: image generation, web search, fluent audio, a global agent & tool registry, and deep observability all land together. We can't wait to see what you build.

The post BoxLang AI 3.2.0 — Image Generation, Web Search, Fluent Audio, Agent Registry & MCP Observability appeared first on foojay.

Why Enterprise Java Teams Need Quality Gates Even More in the Age of AI

Anton Lem — Fri, 29 May 2026 07:00:00 +0000

Table of Contents

Enterprise quality is a scaling problemLocal differences become delivery problemsNoisy diffs hurt review qualityIDE-based quality control is not enoughAI needs deterministic boundariesWhat enterprise quality gates should checkFormatting is only one source-code gateJava member ordering is harder than it looksThe missing layer: JHarmonizerWhere it fits in the Java quality stackConclusion

People and AI can write code together, but enterprise repositories still need deterministic quality gates to protect code quality.

Enterprise quality is a scaling problem

Enterprise Java development is not only about writing correct code. It is about keeping a large, long-lived codebase understandable, reviewable and safe to change while many people and many tools touch it over time.

In a small project, informal discipline can be enough. A few developers agree on conventions, use similar IDE settings and fix inconsistencies during review.

That model breaks down in larger organizations. Teams change, ownership moves, modules outlive their original authors, and code is edited through different IDEs, web interfaces, scripts, generators and AI-assisted workflows.

This is where quality gates become important. They are not bureaucracy around the build. They are executable engineering agreements. If a rule matters for long-term maintainability, it should be runnable, repeatable and enforceable from the build pipeline.

Local differences become delivery problems

Most quality problems look small in isolation. One developer uses a different IDE profile. Another ignores a local inspection warning. Someone forgets to run tests. A dependency is updated without checking the broader impact. A generated change touches many files in a slightly different style.

The damage comes from accumulation. Similar modules stop following the same structure. Reviewers work harder to find the real behavior change. Static analysis findings arrive too late. Test coverage becomes uneven. Dependency rules drift. Build behavior becomes less predictable.

A large team therefore needs common rules that run the same way for everyone. Formatting, source structure, static analysis, dependency checks, test coverage and license checks should not depend on who made the change or which local setup happened to be configured correctly.

Readable and maintainable code is a delivery concern, not an aesthetic preference. The main consumer of source code is another developer: the person reviewing it today, debugging it in six months or extending it next year.

Noisy diffs hurt review quality

Weak automation often shows up as noisy pull requests. A developer changes a few lines of behavior, but the diff also contains reordered methods, import cleanup, blank-line changes and unrelated formatting noise.

The reviewer has to dig for the real change inside layout churn. That is tiring, slow and bad for review quality.

Good tooling separates these concerns. If a project has a canonical representation of source code, developers can bring files back to that representation before review. The diff becomes smaller, and the reviewer focuses on behavior instead of formatting archaeology.

IDE-based quality control is not enough

The natural answer is: let the IDE handle it. Modern IDEs are powerful productivity tools. IntelliJ IDEA, Eclipse and other environments can format Java code, optimize imports, rearrange class members, run inspections, show test coverage and integrate static-analysis plugins. For local work, that feedback is valuable. It helps developers produce cleaner code before they even run the build.

The problem starts when this local workflow becomes the quality strategy for a large distributed team. An IDE can help one developer on one machine. It cannot guarantee that every change in every branch was produced with the same editor, plugins, settings, imported profile and manual actions.

At enterprise scale, that assumption fails quickly. Developers may use IntelliJ IDEA, Eclipse, VS Code, terminal tools, repository web editors, generated code or automated migrations. Some remember the right action. Others do not. One workstation has the correct profile. Another has a slightly different setup.

IDE support remains useful, but repository protection must live somewhere independent of the developer's workstation. If a rule matters for the project, it should be part of the build, reproducible in Maven or Gradle, and enforceable in CI.

AI agents make this even more obvious. They do not reliably use your IDE, inspection profile, formatter settings, rearrangement rules or local quality plugins. Depending on IDE-based quality control becomes even weaker when not all code is produced through an IDE.

AI needs deterministic boundaries

AI does not remove the need for quality gates. It increases it.

AI agents can generate, refactor and explain code quickly. That is useful, but it also means more code can be produced with less friction by humans, scripts and AI assistants together. The repository needs stronger automatic boundaries around what is accepted.

The tempting mistake is to turn AI itself into the quality gate. For deterministic rules, that is the wrong default.

A prompt is not a quality gate. Asking an AI agent to check whether code follows a style guide, uses the right dependency policy, has correct formatting, or follows a source-structure convention is not the same as enforcing a rule. A model may follow the instruction, partially follow it, misunderstand it, or produce a different judgment when the context changes. That is not how enterprise gates should work.

If a quality rule can be expressed as a deterministic algorithm, it should be enforced by deterministic code. Formatting, import cleanup, dependency checks, static analysis, license checks and reproducible source ordering should be fast, cheap and repeatable. The same input should produce the same result. The same check should fail locally and in CI for the same reason.

AI can still be useful around this process. It can suggest fixes, explain a failed check, generate tests, or help a developer understand a static-analysis warning. But the final repository boundary should not depend on a model interpreting a prompt. It should depend on executable rules.

What enterprise quality gates should check

A quality gate is not one vague checkbox called "quality". In a serious Java project, it is a set of concrete checks that protect different parts of the delivery process.

In the best case, the build and CI pipeline should verify:

Build reproducibility: expected JDK version, Maven or Gradle version, plugin versions, compiler target, generated sources and repeatable build behavior.
Dependency governance: banned dependencies, dependency convergence, snapshot dependencies, duplicated versions, vulnerable libraries and license compatibility.
Compilation: main sources, test sources, annotation processors, generated code and selected Java language level.
Automated tests: unit tests, integration tests, contract tests, smoke tests and other project-specific test suites.
Coverage: minimum line or branch coverage, module-level thresholds and protection against silent coverage drops.
Static analysis: bug patterns, duplicated code, excessive complexity, risky APIs, nullability problems and maintainability rules.
Security and compliance: dependency vulnerability scanning, secret scanning, required license headers, SPDX metadata and internal repository rules.
Source-code policy: formatting, imports, line wrapping, naming conventions, generated-code exclusions, package rules and class structure.
Build output discipline: controlled warnings, stable reports, useful failure messages and artifacts that can be inspected after CI failure.

For enterprise development, these rules should be part of the build, not only part of local IDE setup. A common Maven or Gradle configuration gives the project one executable contract.

Locally, developers should have commands that can fix what is safe to fix automatically. For example, a Maven profile or plugin goal may format code, clean imports, reorder source structure, regenerate reports, or apply other safe mechanical changes.

In CI, the same project should have check-only execution. The pipeline should not silently rewrite code. It should verify that the code already follows the required rules. If formatting is wrong, imports are dirty, tests fail, coverage drops, dependencies violate policy, or source structure is inconsistent, the build should fail with a clear message.

This is how a written convention becomes an executable rule. Instead of repeating the same review comments again and again, such as "run the formatter", "fix imports", "update tests", "do not use this dependency", or "this class structure is inconsistent", the team moves these checks into the build pipeline. Reviewers can then focus on design, behavior, risk and business logic instead of acting as manual linters.

Formatting is only one source-code gate

Many quality gates are already common in mature Java projects. Build checks, tests, coverage thresholds, static analysis and dependency rules are familiar parts of CI pipelines.

Source-code policy is one part of that broader picture.

Formatting is the most familiar source-code gate. It controls the text-level shape of the file: spaces, indentation, wrapping, imports, blank lines and syntax layout. Java already has strong tools for this layer. google-java-format and palantir-java-format can make formatting reproducible, and they can be integrated into the build.

But source-code policy does not end at formatting. It does not decide where constants, fields, constructors, public methods, private helpers, accessors and nested types belong inside a class. That is a separate layer: source structure.

A file can be perfectly formatted and still be hard to scan because every class follows a different internal order. This is the gap between formatting and source restructuring.

Java member ordering is harder than it looks

Java declarations are not always independent. One constant may depend on another constant. A field initializer may depend on a field declared above it. Static or instance initialization blocks may rely on members that already exist earlier in the class.

For example, this order is safe:

private static final int DEFAULT_TIMEOUT_SECONDS = 30;
private static final int API_REQUEST_TIMEOUT_SECONDS = DEFAULT_TIMEOUT_SECONDS * 2;

Blind alphabetical sorting may accidentally produce this:

private static final int API_REQUEST_TIMEOUT_SECONDS = DEFAULT_TIMEOUT_SECONDS * 2;
private static final int DEFAULT_TIMEOUT_SECONDS = 30;

Now the change is not only cosmetic. API_REQUEST_TIMEOUT_SECONDS depends on DEFAULT_TIMEOUT_SECONDS, so the base constant must stay above the derived one.

A source restructuring tool therefore has to respect declaration-order dependencies. Similar issues can appear around field initializers, static initializers, instance initializers, enum constants, annotation values and other class-level declarations where order may matter.

The missing layer: JHarmonizer

This was the missing layer I could not find in the Java tooling ecosystem: a way to make Java class structure reproducible outside the IDE, enforceable from the build, and safe enough to respect declaration-order dependencies.

So I built JHarmonizer.

JHarmonizer reorganizes Java class members into a canonical structure, making code easier to scan, safer to review, and more consistent across teams.

JHarmonizer is an open-source Java source harmonization tool. It focuses on one layer of the quality workflow: making Java source structure and formatting reproducible from Maven, CLI and CI.

It can:

reorder Java class members while respecting declaration-order dependencies;
keep accessors together;
use different ordering strategies for interfaces, DTOs, tests, utility classes and regular production classes;
format the reordered result with Palantir Java Format;
run from Maven or the command line in auto-fix mode;
run in check mode as a CI quality gate.

Where it fits in the Java quality stack

JHarmonizer is not meant to replace the Java quality ecosystem. Static analyzers, bug detectors, coverage tools, architectural tests and code review all solve different problems.

A typical Java quality setup may include Maven Enforcer, PMD, CPD, SpotBugs, JaCoCo, license checks, sortpom and JHarmonizer. Each tool protects a different layer: dependencies, static checks, duplication, bug patterns, coverage, legal metadata, build files and source structure.

There is no single magic tool. Enterprise quality comes from boring, deterministic checks that protect different parts of the codebase.

Conclusion

AI will continue to change how code is produced. That is not a reason to weaken engineering discipline. It is a reason to automate more of it.

Large teams need rules that do not depend on local IDE settings, personal habits, memory or how an AI model interprets a prompt.

Formatting, static analysis, tests, coverage, dependency rules and license checks are already part of that picture. Java source structure deserves to be part of it too.

Readable and understandable code does not happen automatically in large teams. It has to be protected by process, tooling and automation.

That is what quality gates are really for: not slowing teams down, but helping them deliver safely, predictably and with less avoidable noise.

The post Why Enterprise Java Teams Need Quality Gates Even More in the Age of AI appeared first on foojay.

Intro to the BoxLang Formatter

Cristobal Escobar — Thu, 28 May 2026 17:45:09 +0000

Table of Contents

Recommended Team Workflow

You know the drill. Someone opens a PR and half the review comments are about tabs vs spaces, where braces go, or why that one function has its arguments formatted differently from everything else. It's noise. And it's over.

The BoxLang Formatter is here, and it handles all of that for you.

You can find the docs here: https://boxlang.ortusbooks.com/getting-started/ide-tooling/boxlang-formatter

What Is It?

The BoxLang Formatter is a built-in code formatting tool that ships with BoxLang. It enforces consistent style across .bx, .bxs, .bxm, .cfm, .cfc, and .cfs files — automatically.

It's not a linter. It doesn't just complain. It fixes your code, or tells CI to fail when style drift sneaks in.

Getting Started in 60 Seconds

If you have BoxLang installed, you already have the formatter. No extra install needed.

Format everything in your current directory:

boxlang format

That's it. It recurses through your project and rewrites supported files in place.

Want to target a specific path or file?

# A directory
boxlang format --source ./src

# A single file
boxlang format --source ./models/User.bx

Multiple paths at once (v1.14+):

boxlang format --source commands,models,services

Configure Your Style

The formatter works great out of the box with sensible defaults, but you can customize it with a .bxformat.json file in your project root.

Bootstrap one instantly:

boxlang format --initConfig

This drops a starter config in your current directory. From there, tweak what you care about. Here's a minimal example:

{
  "maxLineLength": 120,
  "tabIndent": true,
  "singleQuote": false,
  "braces": {
    "style": "same-line",
    "require_for_single_statement": true
  },
  "operators": {
    "comparison_style": "symbols"
  }
}

You've got control over indentation, line length, brace style, struct/array formatting, operator style, SQL keyword casing, import sorting, and a lot more. Only override what you need — everything else uses sensible defaults.

Lock It Down in CI

This is where it gets really useful. Run the formatter in check mode as a quality gate:

boxlang format --check --source ./

Exits 0 if everything is already formatted correctly
Exits non-zero if any file has drift

Drop that into your CI pipeline and pull requests with messy formatting simply won't merge. One command, no separate linter needed.

Recommended Team Workflow

Developers run boxlang format before pushing
CI runs boxlang format --check on every PR
PRs that fail must reformat before merge

No more style debates in code review. The formatter wins.

Format on Save in VS Code

If you want formatting to happen automatically as you work, the BoxLang LSP supports experimental format-on-save.

Step 1 - Enable it in .bxlint.json:

{
  "formatting": {
    "experimental": {
      "enabled": true
    }
  }
}

Step 2 - Add this to your VS Code settings.json:

{
  "[boxlang]": {
    "editor.formatOnSave": true
  },
  "[boxlang-template]": {
    "editor.formatOnSave": true
  }
}

Step 3 - Open the Command Palette and run:

BoxLang: Select BoxLang Version (pick latest)
BoxLang: Select LSP Version (pick latest)
Developer: Reload Window

Save a .bx file and it just formats. Local fast feedback, CI enforcement as the source of truth.

Coming from cfformat?

Already using cfformat in your project? Migration is a two-step process, and your existing style intent is preserved.

Step 1 - Convert your config:

boxlang format --convertConfig --source ./

This transforms your .cfformat.json into a .bxformat.json, keeping your rules intact.

Step 2 - Validate with check mode:

boxlang format --check --source ./

See what (if anything) drifted. Run the formatter once in a cleanup commit, then turn on --check in CI and you're done.

A Few Other Handy Options

Preview without rewriting files — pipe output to stdout instead:

boxlang format --overwrite false --source ./handlers/MainHandler.cfc

Exclude directories (v1.14+):

boxlang format --source . --excludes generated,vendor

Use a custom config path:

boxlang format --config ./config/.bxformat.json --source ./

The Bottom Line

Stop spending review cycles on style. The formatter handles it — in your editor, in your pre-commit hook, in CI. One command, consistent output, zero arguments about semicolons ever again.

Go format something:

boxlang format

Questions? Hit us up on Community & Support or open a discussion on the BoxLang repo. We'd love to hear how you're using it.

The post Intro to the BoxLang Formatter appeared first on foojay.

Introducing bx-jwt: Enterprise-Grade JSON Web Tokens for BoxLang

Cristobal Escobar — Tue, 26 May 2026 10:14:00 +0000

Table of Contents

The Fluent Builder — jwtNew()
The BIF Functions
HMAC Sign and Verify
RSA Sign and Verify
JWE Encryption
alg:none Rejection
HMAC Minimum Key Lengths (RFC 7518 §3.2)
Algorithm Allowlist
Clock Skew Tolerance
Authentication Middleware
Token Refresh with Grace Period
Kid-Based Key Rotation
Signing (JWS)
Encryption (JWE)

JWT authentication is everywhere. But rolling it correctly — with proper algorithm enforcement, key management, clock skew handling, JWE encryption, and zero security footguns — is anything but trivial. Today, we're shipping bx-jwt, a production-ready JWT/JWE module for BoxLang that handles all of it out of the box, so you can focus on building, not fighting cryptography.

bx-jwt is part of the BoxLang+ and BoxLang++ subscription tiers — our enterprise-grade module collection built for teams that take security seriously.

What is bx-jwt?

bx-jwt is a full implementation of the JWT/JWE specification stack for BoxLang:

JWS (JSON Web Signature) — HMAC, RSA, and Elliptic Curve signing
JWE (JSON Web Encryption) — RSA and symmetric encryption
RFC 7518 — JSON Web Algorithms
RFC 7519 — JSON Web Token

It ships with two APIs that serve different tastes: a fluent builder for expressive, chainable token construction, and a suite of BIF functions for direct, functional-style usage. Both share the same engine, key registry, and security model.

Two APIs, One Module

The Fluent Builder — `jwtNew()`

When readability matters, the fluent builder gives you a clean, chainable surface for token construction. Call jwtNew() and chain your claims. Terminate with .sign() or .encrypt().

token = jwtNew()
    .subject( "user-123" )
    .issuer( "auth-service" )
    .audience( "mobile-client" )
    .claim( "roles", [ "admin", "user" ] )
    .expireIn( 3600 )
    .header( "kid", "v1" )
    .sign( secret, "HS256" );

Every standard claim has a named method. Custom claims go through .claim( key, val ). Headers via .header( key, val ). Swap .sign() for .encrypt() and you have a JWE. It reads like what it does. 🎯

The BIF Functions

For teams that prefer a direct, functional style, all operations are available as first-class BoxLang BIFs:

BIF	Purpose
`jwtCreate()`	Sign a payload struct into a compact JWS token
`jwtVerify()`	Verify signature and validate claims — throws on failure
`jwtValidate()`	Like `jwtVerify()` but returns `true`/`false`
`jwtDecode()`	Inspect header/payload without signature verification
`jwtRefresh()`	Re-issue a token with fresh `iat`, `jti`, and optional new `exp`
`jwtEncrypt()`	Encrypt a payload as a compact JWE token
`jwtDecrypt()`	Decrypt a JWE token and return claims
`jwtGenerateSecret()`	Cryptographically random HMAC secret (Base64-encoded)
`jwtGenerateKeyPair()`	RSA or EC key pair as PEM strings

Get Started in Seconds

HMAC Sign and Verify

secret  = jwtGenerateSecret( 256 );
token   = jwtCreate( { sub: "user-123", iss: "my-api", roles: [ "admin" ] }, secret, "HS256" );
payload = jwtVerify( token, secret, "HS256" );
writeOutput( payload.sub ); // user-123

RSA Sign and Verify

keys    = jwtGenerateKeyPair( "RS256" );
token   = jwtCreate( { sub: "user-123" }, keys.privateKey, "RS256" );
payload = jwtVerify( token, keys.publicKey, "RS256" );

JWE Encryption

Sensitive payloads — PII, PHI, internal claims that must stay opaque — belong in a JWE, not a JWS. bx-jwt handles both:

token   = jwtEncrypt(
    { sub: "patient-456", phi: { dob: "1990-01-15" } },
    secret32bytes,
    { keyAlgorithm: "dir", encAlgorithm: "A256GCM" }
);
payload = jwtDecrypt( token, secret32bytes, { keyAlgorithm: "dir", encAlgorithm: "A256GCM" } );

Or nest them — sign first, encrypt the signed token — for the full sign-then-encrypt pattern:

// Inner signed JWT
signedToken = jwtCreate( { sub: "u1", role: "admin" }, innerPrivKey, "RS256", {
    headers: { cty: "JWT" }
} );

// Outer encrypted JWE
encryptedToken = jwtEncrypt( signedToken, outerPubKey, {
    keyAlgorithm : "RSA-OAEP-256",
    encAlgorithm : "A256GCM"
} );

Enterprise Key Management with the Key Registry

This is where bx-jwt separates from basic JWT libraries. The Key Registry lets you define named keys once in configuration and reference them by name throughout your entire application. Keys never appear in application logic. Rotation is a config change, not a code change.

// ModuleConfig.bx
settings = {
    keys: {
        "api-signing": {
            algorithm : "HS256",
            secret    : "${Setting: env.JWT_HMAC_SECRET not found}"   // env var substitution built-in
        },
        "api-rsa": {
            algorithm  : "RS256",
            privateKey : "/etc/keys/api-private.pem",
            publicKey  : "/etc/keys/api-public.pem"
        },
        "partner-public": {
            algorithm : "RS256",
            publicKey : "/etc/keys/partner-public.pem"  // verify-only key
        }
    },
    defaultSigningKey : "api-signing",
    defaultVerifyKey  : "api-signing",
    defaultAlgorithm  : "HS256",
    defaultIssuer     : "my-api",
    defaultAudience   : "web",
    defaultExpiration : 3600,
    generateIat       : true,
    generateJti       : true
}

With defaults fully configured, the key and algorithm arguments become optional everywhere:

// No key argument, no algorithm argument — resolved from registry
token   = jwtCreate( { sub: "user-123" } );
payload = jwtVerify( token );

Keys can also be registered at runtime via the JWTService:

jwtService = getBoxContext().getRuntime().getGlobalService( "JWTService" );
jwtService.registerKey( "session-key", { algorithm: "HS256", secret: generateSecureKey() } );

Security by Default — Not by Configuration 🛡️

bx-jwt is built with the attack surface in mind. Security properties are unconditional — they cannot be turned off:

`alg:none` Rejection

The classic JWT attack. bx-jwt unconditionally rejects tokens with alg:none. Passing an unsigned token to jwtVerify() or jwtRefresh() always throws JWTVerificationException. No configuration switch, no override. It simply doesn't work.

HMAC Minimum Key Lengths (RFC 7518 §3.2)

Short HMAC secrets are a real-world vulnerability. bx-jwt enforces RFC 7518 minimums:

Algorithm	Minimum Key Length
HS256	32 bytes (256 bits)
HS384	48 bytes (384 bits)
HS512	64 bytes (512 bits)

Use jwtGenerateSecret( bits ) and you're always compliant.

Algorithm Allowlist

Algorithm-confusion attacks exploit servers that accept any algorithm the token header declares. Lock your application to a known set:

// Only HS256 and RS256 are accepted — anything else throws
allowedAlgorithms: [ "HS256", "RS256" ]

Clock Skew Tolerance

Distributed systems have clock drift. bx-jwt ships with a configurable clockSkew (default: 60 seconds) that prevents legitimate tokens from failing exp/nbf validation due to minor time differences between services. Tune it per environment:

// Strict environment
payload = jwtVerify( token, secret, "HS256", { clockSkew: 0 } );

// Distributed system with known drift
payload = jwtVerify( token, secret, "HS256", { clockSkew: 120 } );

Real-World Patterns

Authentication Middleware

function requireAuth() {
    var authHeader = getHttpRequestData().headers[ "Authorization" ] ?: ""
    if ( !authHeader.startsWith( "Bearer " ) ) {
        bx:header statusCode=401;
        abort;
    }

    var token = authHeader.removeFirst( "Bearer " )

    if ( !jwtValidate( token, application.jwtSecret, "HS256" ) ) {
        bx:header statusCode=401;
        abort;
    }

    request.currentUser = jwtVerify( token, application.jwtSecret, "HS256", {
        claims: { iss: "auth-service", aud: "api" }
    } );
}

Token Refresh with Grace Period

function refreshToken( token ) {
    try {
        return jwtRefresh( token, application.jwtSecret, "HS256", {
            allowExpired : true,   // honor recently expired tokens
            expireIn     : 3600,
            claims       : { iss: "auth-service" }
        } );
    } catch ( "bxjwt.JWTVerificationException" e ) {
        // Bad signature — not refreshable
        return "";
    }
}

Kid-Based Key Rotation

function verifyWithKeyRotation( token ) {
    var decoded = jwtDecode( token );
    var kid     = decoded.header.kid ?: "default";
    var key     = getKeyForKid( kid );
    return jwtVerify( token, key, decoded.header.alg );
}

Full Algorithm Support

Signing (JWS)

Algorithm	Type	Notes
HS256, HS384, HS512	HMAC	Symmetric
RS256, RS384, RS512	RSA	Asymmetric — private signs, public verifies
ES256, ES384, ES512	Elliptic Curve	Smaller keys than RSA, equivalent security

Encryption (JWE)

Key Algorithm	Content Encryption	Key Type
RSA-OAEP-256	A256GCM	RSA key pair
dir	A256GCM	256-bit symmetric secret

Installation

# CommandBox
box install bx-jwt

# BoxLang CLI
install-bx-module bx-jwt

bx-jwt requires a BoxLang+ or BoxLang++ subscription. 🔑

This module ships as part of our enterprise module collection — a growing library of production-ready, security-focused, professionally maintained modules available exclusively to BoxLang+ subscribers.

BoxLang+/++/Starter

bx-jwt is one of many enterprise modules available under BoxLang+/++/Starter. When you subscribe, you get:

🔐 bx-jwt and the full enterprise module library
⚡ Priority support from the Ortus team
🏗️ Access to upcoming enterprise modules as they ship
❤️ You fund the continued development of BoxLang as a community-supported open source project
View Plans & Subscribe → boxlang.io/plans

Resources

JSON Web Tokens are a solved problem. Now BoxLang has the enterprise solution to prove it. Install bx-jwt, protect your applications, and ship with confidence. 🚀

The post Introducing bx-jwt: Enterprise-Grade JSON Web Tokens for BoxLang appeared first on foojay.

Context Is a Budget — Eight levers and three workflow patterns

Soham Dasgupta — Fri, 22 May 2026 12:52:06 +0000

Table of Contents

Where the tokens actually goThe Eight Levers

A. Context engineering — scope your asks
B. Prompt caching — order matters
C. Tool & MCP hygiene — every schema is a tax
D. Custom instructions & skills — codify it once
E. Model routing — start cheap, escalate when stuck
F. Output discipline — diffs, not novels
G. Repo hygiene — what the indexer sees
H. Observability — latency is your token meter

Three workflow patterns that compound

1. The Ralph Wiggum loop
2. Auto-compact
3. Planner → Implementer → Reviewer (agent handover)

The Monday checklistClosing

Eight levers and three workflow patterns that pay for themselves in a week.

A team of fifty developers can quietly burn $30,000 a month on AI coding assistants without anyone noticing. Premium-request quotas vanish by the third week. The bill arrives. Nobody has a story for where it went.

The cost is the obvious pain. The other two are sneakier:

Latency. Bigger contexts take longer. The model thinks more, but you also wait more.
Context rot. This is the surprising one. Anthropic and Chroma have both shown that as the context window fills up, model recall and reasoning degrade — even well inside the advertised window. The 200K-token model is genuinely worse at the 150K mark than at the 20K mark. More context is not free; past a point, it's actively harmful.

The mental model that fixes all three: stop treating context as a free buffet. Treat it as a budget you spend on every turn.

This post is a practical guide to spending it well: where the tokens actually go, eight levers that move the needle, and three workflow patterns that compound on top of them.

Where the tokens actually go

Every request to a coding assistant is a stack of buckets. The shape varies by tool and session, but it tends to look like this:

Bucket	Typical share	Notes
System prompt / instructions	5–15%	Boilerplate that's been copy-pasted for months
Tool / function schemas	10–40%	Re-sent on every turn
Retrieved files & code chunks	20–60%	The biggest lever, almost always
Conversation history	10–30%	Grows linearly until you compact it
Model output	5–20%	Verbose prose is expensive to produce and to read

A few things to notice:

Tool schemas dominate more than people expect. Five connected MCP servers can easily contribute 5,000–10,000 tokens to every request before you've typed a word. The model doesn't have to use the tool — the schema ships either way.
Conversation history grows without bound. A 30-turn chat is paying for the first 29 turns on every new question, plus your fresh one.
Output is small in volume but expensive per token. On most direct APIs, output tokens cost three to five times input tokens. A reply that says "Sure! Let me explain what I'm about to do…" before doing it is pure tax.

Rule of thumb: profile your own traffic before optimizing. The bucket dominating your sessions is rarely the one your gut says.

In a Copilot context, you can't see token counts directly — but you can see the symptom. Open Output → "GitHub Copilot Chat" and watch the ccreq lines: each one shows the model, latency, and request type per turn. When the same question takes three times longer in chat #2 than chat #1, you've just watched your token meter the entire time.

The Eight Levers

These aren't in priority order — they're in the order you'd naturally encounter them in a session. The first three (context, caching, tools) are about the request shape. The next three (instructions, model, output) are about how you talk to the assistant. The last two (repo, observability) are the foundations that make all of the others stick.

A. Context engineering — scope your asks

The single biggest waste in most AI workflows is asking vague questions of agent-mode chat with full codebase access. The agent dutifully explores, reads ten files to find the two it needed, summarizes them all, and then answers. You pay for every step.

Compare:

Bad: "Refactor the order confirmation email to use the new template engine."
The agent opens four files under src/main/java/com/example/demo/email/, reads WelcomeEmailService.java for context it doesn't need, considers whether a templates/ resource directory should exist, and proposes a sprawling diff that renames a method on the way through.

Good: "Refactor #file:src/main/java/com/example/demo/email/OrderConfirmationService.java to call render on #file:src/main/java/com/example/demo/email/TemplateEngine.java instead of renderLegacy. Keep behaviour identical."
The agent opens two files. The diff is three semantically meaningful lines. The whole turn is roughly a tenth of the cost.

Specificity is free. Every #file: (Copilot) or explicit path (Claude Code) you provide is a chunk the agent doesn't have to find. Every "keep behaviour identical" is a sentence of guard-rails that prevents a 200-line side quest.

Do this Monday: make #file: your default. Use agent-mode-with-broad-retrieval only when you genuinely don't know what you don't know.

B. Prompt caching — order matters

Every major provider supports prompt caching now. Anthropic and OpenAI both charge roughly 10% of base input cost for cache hits. Google's Gemini does it explicitly. The mechanism is the same: a stable prefix at the front of your prompt is cached after the first request and read back cheaply on subsequent ones.

The cost discipline is therefore about order:

[ tool definitions ]    ← rarely change         ┐
[ system prompt ]       ← rarely changes      │ cache these
[ skills / rules ]      ← stable per repo           ┘
[ retrieved files ]     ← changes per task
[ conversation ]        ← changes every turn

Static at the top, dynamic at the bottom. The longest stable prefix you can construct is the most cacheable one.

The classic anti-pattern is innocent-looking and brutal is to have dynamic values/variables part of your instructions, custom agent files. It will most likely busts the cache on every request. You will pay full price for the same 10 KB of preamble all day. The fix is to push dynamic content down into the user message or tail of the prompt.

Do this Monday: audit the first 200 tokens of your system prompts. Anything that changes per-request belongs further down.

C. Tool & MCP hygiene — every schema is a tax

Each connected tool ships its full JSON schema with every request. A typical MCP server with 8–15 tools costs 400–2,500 tokens per turn. Five servers connected? You may be paying 5,000–10,000 tokens per turn for tool definitions the model never invokes.

Treat MCP servers like browser extensions: useful, but only the ones you actually need today.

// .vscode/mcp.json — keep this short
{
  "servers": {
    "filesystem": { "command": "npx", "args": ["-y", "@modelcontextprotocol/server-filesystem"] }
    // disable github, playwright, brave-search, etc. when you don't need them
  }
}

The same discipline applies to the tools you build yourself. A tool that returns { id, summary } is cheap. A tool that returns a 50-field JSON object is expensive — the model re-processes all 50 fields on every turn it's referenced. Default to compact responses with optional ?expand=... for the rare caller that needs the rest.

Do this Monday: open MCP server list, disable everything you didn't actively use this week. Re-enable on demand.

D. Custom instructions & skills — codify it once

Anything you find yourself re-typing in chats belongs in an instructions file. The exact filename varies — .github/copilot-instructions.md, CLAUDE.md, AGENTS.md, Cursor Rules — but the principle is identical: write your team conventions once, commit them, and let every chat in the repo inherit them.

A small example is worth more than a long one:

Six lines. Now no chat in this repo will propose Jest, no chat will dump a whole-file rewrite when a diff would do, and no chat will preface its answer with "Sure! Let me explain what I'm about to do…"

For stack-specific rules, use path-scoped instructions. In Copilot:

---
applyTo: "src/main/java/**/*.java"
---
# Test conventions for src/
- Use JUnit 5 via `mvn test`.
- Tests mirror the source tree under `src/test/java/...` as `Test.java`.

This file is loaded only when a matching file is in scope. Repo-wide rules go in the global instructions; stack-specific rules go in scoped ones. Both are committed, both are versioned, both are team artifacts — not personal preferences buried in someone's IDE settings.

Do this Monday: check what you've typed into chat windows in the last week. Anything that reappeared more than twice is a candidate for an instructions file.

E. Model routing — start cheap, escalate when stuck

Routine tasks pick the most expensive model by default if you let them. You probably just paid 10× for the same answer.

A defensible default routing table:

Task	Model	Multiplier (Copilot)
Inline completions, simple chat	Cheapest available (e.g. GPT-4.1)	0×
Real coding work	Mid-tier (GPT-5 / Claude Sonnet)	1×
Long-context refactor / agent mode	Mid-tier with long context	1×
Genuinely hard reasoning	Top-tier (Claude Opus)	10×

The rule is: start cheap, escalate only when stuck. "Stuck" means you've tried the mid-tier model with good context and it's plainly missing the point — not "I want to feel sure," not "I have time to spare."

The math compounds. A team of fifty doing twenty agent runs each per day at 10× costs five times more than at 2× — for the same diffs, on most days.

Do this Monday: pin your default to the mid-tier. Make Opus a deliberate choice with a reason.

F. Output discipline — diffs, not novels

Every model has a "let me explain what I'm about to do" reflex. It's polite. It's also pure cost.

Same fix, two ways to ask:

"In templateEngine.js, the welcome template is missing an exclamation mark. Show me the updated file."
→ 30 lines back. (With a 600-line file, 600 lines.)

"In templateEngine.js, the welcome template is missing an exclamation mark. Reply with a unified diff only, no commentary."
→ 5 lines back.

Output tokens are typically three to five times the price of input tokens on direct APIs. In a per-request model like Copilot's, verbose output still hurts: it increases latency, fills the context for subsequent turns, and evicts earlier useful content sooner.

The leverage is in the system prompt. Two lines in copilot-instructions.md make every chat in the repo behave better forever:

- Be concise. No preamble.
- Prefer diffs over full files.

Do this Monday: add those two lines.

G. Repo hygiene — what the indexer sees

The indexer that powers retrieval respects .gitignore. Tighten it.

+target/
+*.class
+*.jar
+.idea/
+*.iml
 *.log

Important gotcha: if a file is already tracked in git, adding the path to .gitignore does not untrack it — the indexer still sees it. You also need:

git rm --cached target/demo-0.0.1-SNAPSHOT.jar

For secrets, fixtures, and vendored deps, use content exclusions at the org/repo level (most coding-assistant providers expose this).

The other half of repo hygiene is summary comments at the top of each module:

// TemplateEngine — central renderer. Use render(id, data) for new emails.
// renderLegacy(id, data) is deprecated and only used by OrderConfirmationService.
// Templates registered: welcome, order_confirmation_v2.

Three lines, ~50 tokens. Now "what does the template engine do?" can be answered without reading the rest of the file. A 200-token summary at the top of each module beats re-reading 5,000 tokens of code, every single time.

Do this Monday: git rm --cached whatever shouldn't be indexed; add three-line summaries to your top-of-mind modules.

H. Observability — latency is your token meter

You can't see Copilot's token counts. You don't need to. Use the proxy you already have:

Reply latency	≈ Input tokens
< 5 s	20 s	Near limit — start a new chat

When the same question takes three times longer in your fourth chat than in a fresh one, you've just watched your context bloat in real time. The fix is "new chat with a summary," not "wait it out."

You can also lint for context bloat the same way you lint for bundle size. A 30-line script in CI is enough to catch the most common regressions:

// fail if any .github/instructions/*.md exceeds 150 lines
import { readdir, readFile } from "node:fs/promises";
const files = (await readdir(".github/instructions")).filter(f => f.endsWith(".md"));
let failed = false;
for (const f of files) {
  const lines = (await readFile(`.github/instructions/${f}`, "utf8")).split("\n").length;
  console.log(`${lines > 150 ? "❌" : "✅"} ${f}: ${lines} lines`);
  if (lines > 150) failed = true;
}
if (failed) process.exit(1);

Wire it into CI and context bloat stops accumulating silently across PRs.

Do this Monday: put a stopwatch next to your editor for one day. Count "Amsterdam" (not Mississippi's) . You'll know which chats to rotate.

Three workflow patterns that compound

The eight levers above shrink the cost of an individual turn. These three patterns shrink the number of expensive turns. Apply them on top.

1. The Ralph Wiggum loop

Named after the Simpsons character whose superpower is relentless dumbness. The recipe is unglamorous on purpose:

Write a TODO.md with checkbox tasks.
Open agent-mode chat with a cheap model.
Tell it: "Read TODO.md. Pick the first unchecked item. Implement only that. Run npm test. If green, check the box and commit. Pick the next. Repeat."

That's it. The agent burns through the list one item at a time.

Why it works:

Each iteration starts with a small, fresh context. The chat history isn't growing the way it would in a free-form conversation.
State lives on disk (TODO.md and git commits), not in conversation tokens.
A cheap model is good enough, because each task is small and self-contained.
It's restartable. Kill the chat halfway, start a new one, run the prompt again — it picks up where it left off.

After it runs, git log --oneline reads like a changelog: one commit per task, message starts with the task title, easy to revert any one step. Compare with the typical "fix things" mega-commit and you'll never go back.

2. Auto-compact

Most assistants don't compact aggressively on their own. You have to drive it.

When a chat hits 60–80% of the context window (you'll know — replies start to crawl), stop and ask:

Summarize what we've discussed: the goal, files we've touched, decisions made, open questions, and the next step. Keep it under 300 words and use bullet points.

Save the output to plan.md. Open a brand new chat. Attach it:

Continue from #file:plan.md. The next step is…

The new chat's first request is dramatically smaller than the old chat's last one. The model picks up the thread without missing a beat. Roughly: a 4 KB summary keeps 95% of the signal at 3% of the cost.

The bonus pattern: that summary file becomes a stable, cacheable prefix. Every future chat that references it benefits from prompt caching on top of the compaction. Two compounding wins for one summarization.

If you are interested in a sofisticated implementation of compaction, check this skill which is used by some of the custom agents.

Rule of thumb: one task per chat. New task → new chat with summary attached.

3. Planner → Implementer → Reviewer (agent handover)

This is the one that changes how features get built. Three short, focused chats with three different model choices and one shared artifact:

Planner — expensive model, one call. Reads the feature request, produces plan.md with goal, acceptance criteria, tasks, files expected to change, out-of-scope items, and risks. No code yet.
Implementer — cheap model, agent mode, fresh chat. Sees only plan.md. Runs a Ralph loop on it: pick first unchecked task, implement, test, check the box, commit, repeat.
Reviewer — expensive model, fresh chat. Sees only plan.md and the diff. Marks each acceptance criterion PASS or FAIL, lists bugs, smells, out-of-scope edits. Ends with VERDICT: APPROVE or VERDICT: REQUEST CHANGES.

Three chats, ~5–8 premium requests total for an end-to-end feature. Compare with one mega-chat using the most expensive model the whole way: easily 30+ requests at 10× the multiplier.

The crucial discipline: the handover artifact (plan.md, the diff, the review notes) is the only thing that crosses the boundary. Never chat history. That's how you keep each agent's context small, focused, and cheap.

The Monday checklist

Pin this to your team's wiki. Take what's useful, ignore the rest.

Repo setup

[ ] Add a top-level instructions file (copilot-instructions.md, AGENTS.md, CLAUDE.md, or your tool's equivalent) with build, test, lint, conventions, and output-style rules.
[ ] Add path-scoped instruction files for stack-specific rules (e.g. test conventions under src/).
[ ] .gitignore build outputs, snapshots, and large fixtures. git rm --cached anything already tracked.
[ ] Add three-line "what does this module do" summary comments to your top 10 modules.
[ ] Add a CI lint that fails if instruction files exceed ~150 lines or prompt files exceed ~250 lines.

Per-session habits

[ ] Disable MCP servers you don't need this session. Re-enable on demand.
[ ] Default to a mid-tier model. Escalate to a top-tier model only when stuck — and only with a reason.
[ ] Use #file: (or your tool's equivalent) instead of broad-retrieval / agent mode for scoped tasks.
[ ] Ask for diffs, not full files.
[ ] Start each new task in a fresh chat.
[ ] When responses start to crawl (~60% context), summarize to a plan.md and continue in a new chat.

Workflow patterns to try this week

[ ] Run a Ralph loop on a TODO.md of small chores.
[ ] Use the planner / implementer / reviewer split for one real feature. Notice the request count.
[ ] Treat latency as your token meter. Count Amsterdam for one day.

Closing

The mindset shift is small and the wins are not.

Prompt engineering used to be about clever phrasing. Context engineering — what this post was really about — is about what's in the window and what isn't. Smaller prompts, fewer tools, scoped retrieval, summaries instead of histories, cheap models for cheap work, expensive models for the rare hard parts.

None of it is novel. None of it is hard. Most teams don't actually have a token problem; they have a discipline problem. The levers are boring. The compounding is real: a team that adopts even half of the above will see latencies fall, premium-request burn drop noticeably, and, counterintuitively, answer quality go up, because the model isn't drowning in irrelevant context.

One sticky line to take with you:

The worst tokens are the ones you're paying for and not noticing.

Watch your ccreq lines. Count Amsterdams. Spend the budget like it's yours.

The post Context Is a Budget — Eight levers and three workflow patterns appeared first on foojay.

Introducing skills.boxlang.io — The Open Agent Skills Ecosystem for BoxLang & the Ortus World

Cristobal Escobar — Thu, 21 May 2026 11:42:26 +0000

Table of Contents

The Problem: AI Knowledge Doesn't Scale by Copy-Paste What Is a Skill? Install in Seconds: Two Paths, One Standard

Option 1 — npx skills (works everywhere)
Option 2 — ColdBox CLI (deep BoxLang/ColdBox integration)

Core Repositories — Curated by Ortus A Taste of What's Available Submit Your Own — Community Skills, Security First How Your Agent Actually Uses It Why This Matters Beyond BoxLang Get Started Now Resources

Today we're launching something we've been quietly building for months: skills.boxlang.io — a public, agent-agnostic directory for AI skills covering BoxLang, ColdBox, TestBox, CommandBox, and the entire Ortus ecosystem.

If you've ever pasted a 400-line system prompt into yet another AI agent, watched two of your bots drift onto subtly different versions of the same coding standard, or spent half a Friday afternoon trying to convince an LLM that BoxLang is not Java and is not CFML, or how to code for Modern CFML; this launch is for you. 🎯

The numbers at launch:

203+ curated skills available on day one
8,000+ installs already, before public announcement
3 core repositories maintained directly by Ortus Solutions
Multiple agents supported — Claude Code, Cursor, GitHub Copilot, Codex, OpenCode, and more
Let's dig into what it is, why we built it, and how to start using it in the next 30 seconds. 🚀

🤔 The Problem: AI Knowledge Doesn't Scale by Copy-Paste

Every team building with AI agents eventually hits the same wall.

You write a great system prompt that teaches an agent your SQL conventions. Then a teammate spins up a new bot and pastes a slightly older version. A month later there's a third variant in a Slack snippet that nobody can find. Your "single source of truth" is now three sources of conflict, and the agent's outputs reflect every one of them.

This isn't a discipline problem — it's an architecture problem. System prompts are plain strings, and plain strings don't have a source of truth. They aren't versioned, aren't audited, aren't shared, and aren't discoverable.

Anthropic's Agent Skills open standard — Markdown files with frontmatter metadata, distributed as SKILL.md — gave the industry a real answer. BoxLang AI 3.0 implemented it natively. And now skills.boxlang.io brings the missing piece: a public, curated, security-audited registry where these skills live, are versioned, and can be installed into any AI agent in seconds. 💚

🎓 What Is a Skill?

A skill is a portable, reusable unit of expertise — a SQL coding style guide, a tone-of-voice policy, a ColdBox conventions cheat sheet, an API design standard, a security ruleset. Anything your AI assistant should know before it starts answering.

Each skill is a Markdown file (SKILL.md) with optional YAML frontmatter:

---
description: Use this skill when writing, reviewing, or formatting any
  Ortus Solutions code (BoxLang, CFML, or Java) to ensure it follows
  the official Ortus coding standards.
tags: [boxlang, cfml, java, coding-standards, ortus]
---

# Ortus Coding Standards

Always use spacing inside parentheses and brackets for readability.
Prefer closures with `=>` over anonymous functions.
Use lambdas with `->` when no external scope is needed.
...

Define it once. Inject it everywhere. Let your codebase — not your clipboard — be the source of truth. 📚

📥 Install in Seconds: Two Paths, One Standard

We built skills.boxlang.io to be agent-agnostic. Whatever AI tool your team prefers, the skills work the same way. You have two install paths.

⚡ Option 1 — `npx skills` (works everywhere)

Powered by skills.sh, an open-source, agent-agnostic CLI for discovering, installing, and managing SKILL.md files across Claude Code, GitHub Copilot, Cursor, Codex, and more. It reads the BoxLang Skills Hub catalog, security-audits community content, and drops files into the correct agent directory in one command.

# Install an entire repository of skills
npx skills add ortus-boxlang/skills

# Or grab a single, focused skill
npx skills add ortus-boxlang/skills/coldbox-basics

No global install needed. Works with any Node.js. 🌐

🥊 Option 2 — ColdBox CLI (deep BoxLang/ColdBox integration)

If you're already living in the ColdBox world, the ColdBox CLI 8.11 release wires the directory directly into your project workflow:

# Browse the directory interactively
coldbox ai skills install --list

# Filter by source or category
coldbox ai skills install --list coldbox/skills
coldbox ai skills install --list coldbox/skills/coldbox-testing

# Install a specific skill
coldbox ai skills install ortus-boxlang/skills/async-programming

# Search the registry
coldbox ai skills find "rest api"

Bonus: when you box install a module that has skills published to the directory, coldbox ai refresh auto-installs them. Skills become infrastructure, not setup. 💚

🔷 Core Repositories — Curated by Ortus

Three core repositories are officially maintained by Ortus Solutions. Skills here are trusted by default and skip the community audit step.

Repository	Focus
`ortus-boxlang/skills`	BoxLang language, runtime, BIFs, and core modules
`coldbox/skills`	ColdBox MVC framework patterns and conventions
`ortus-solutions/skills`	WireBox, TestBox, LogBox, and the broader Ortus module library

Want a skill added to a core repo? Open a pull request. Add your SKILL.md inside a new folder, include valid YAML frontmatter, and the Ortus team will review and merge it. Once merged, it's automatically imported the next time the hub syncs. ⚡

⭐ A Taste of What's Available

A small sample of skills you'll find in the directory at launch:

code-documenter — Producing or improving developer-facing documentation for codebases, APIs, modules, and architecture decisions
ortus-java-coding-standards — Official Ortus formatting and structural conventions for BoxLang, CFML, and Java
javascript-expert — Modern JavaScript correctness, async flows, module design, and architectural refactors
alpinejs-expert — Alpine.js component state, directives, transitions, and reusable stores
vite-expert — Vite-based frontend builds, HMR diagnostics, plugin customization, and Vitest integration
vuejs-expert — Composition API patterns, routing, forms, testing, and SSR-aware component design
async-programming — BoxLang futures, parallel execution, and concurrency primitives
coldbox-basics — ColdBox MVC conventions, handlers, models, interceptors, and module architecture
…and 195+ more. Browse the full directory at skills.boxlang.io/skills. 🎯

🌐 Submit Your Own — Community Skills, Security First

Don't want to contribute to a core repo? Publish your own GitHub repository as a Community source or send us a Pull Request to any of our repos. Community skills are listed alongside core skills in the directory and go through automated security auditing before being made available, so consumers can install them with confidence.

The submission flow is straightforward:

Create a GitHub repository with one or more SKILL.md files, each in its own subfolder (e.g. my-skill/SKILL.md)
Add YAML frontmatter with at minimum name, description, and tags
Write clear, accurate documentation in the Markdown body
Submit your repo and we'll review it
You keep full ownership and control of your skills. The hub just makes them discoverable and installable. 💚

🛠 How Your Agent Actually Uses It

After installing, skills land in ~/.ai/skills/, ~/.claude/skills/, or the equivalent directory for your agent. Your AI assistant automatically discovers and loads them in each conversation.

The change in agent behavior is immediate. Ask things like:

"Write a ColdBox REST handler with full error handling"
"Create a WireBox-managed singleton service that queries SQLite"
"Show me how to use TestBox to write integration tests"
"Help me configure bx-migrations for my BoxLang app"

…and the agent answers using patterns and idioms from the installed skills, not scattered (and often outdated) snippets pulled from random internet training data. The hallucinations go down. The accuracy goes up. The output starts to feel like it was written by someone who actually knows the framework — because, in a sense, it now was. 🎓

🔮 Why This Matters Beyond BoxLang

We didn't build skills.boxlang.io as a marketing site. We built it because the Ortus ecosystem — BoxLang, ColdBox, TestBox, CommandBox, WireBox, LogBox, CacheBox, hundreds of modules across 18+ years of work — is too rich to fit into anyone's training data, and too valuable to be re-discovered through trial and error every time a developer opens a new chat with their AI assistant.

A public, curated, audited skills directory means:

Module authors can ship AI knowledge alongside their code
Teams can standardize agent behavior across every developer's workstation
Newcomers get accurate, idiomatic guidance from day one
The community owns and contributes to a shared knowledge layer that compounds over time

This is the same shift package managers brought to language ecosystems — except for AI knowledge. It's the era of skills, and now every BoxLang and ColdBox developer can participate. 🚀

🎯 Get Started Now

# Install your first skill in 10 seconds
npx skills add ortus-boxlang/skills

# Or via the ColdBox CLI
coldbox ai skills install --list

Then point your AI agent at your codebase and watch the difference. ⚡

📚 Resources

Skills Hub: skills.boxlang.io
Browse the Directory: skills.boxlang.io/skills
Documentation: skills.boxlang.io/docs
Submit a Repository: skills.boxlang.io/submit
skills.sh CLI: skills.sh
Core Repo — BoxLang: github.com/ortus-boxlang/skills
Core Repo — ColdBox: github.com/coldbox/skills
Core Repo — Ortus: github.com/ortus-solutions/skills
BoxLang AI: ai.boxlang.io
BoxLang Plans: boxlang.io/plans

Got a skill you'd love to publish, or one you wish existed? We'd love to hear from you — open a PR, submit your repo, or drop us a note. The directory grows because the community grows. 💚

The post Introducing skills.boxlang.io — The Open Agent Skills Ecosystem for BoxLang & the Ortus World appeared first on foojay.

From Zero (Really Zero) to OpenTelemetry

Geertjan Wielenga — Tue, 19 May 2026 13:36:11 +0000

Table of Contents

The Super Awesome PromptWhy This Prompt Works

After The Agent Finishes
Follow Up Prompts You'll Likely Want

Here's a super awesome prompt (e.g., for Claude Code) that you can use with https://github.com/dash0hq/agent-skills, the free collection of skills for AI coding agents to make applications observable with OpenTelemetry, such as with Dash0.

And the end result is this, a view into the traces of your application (without anything at all at the start of the process).

The Super Awesome Prompt

Take a careful look below: before doing this prompt, not only do we not have an application that is instrumented with OpenTelemetry yet. Not only do we not have the agent we need to do the instrumentation yet.

The application itself doesn't even exist yet.

So, here's the prompt that gets you from really zero to OpenTelemetry:

Create a minimal Spring Boot app from scratch in this directory with two endpoints: GET /hello returning a greeting, and GET /work that sleeps 50–200ms and returns a JSON payload. Use Maven and Java 21.

Then instrument it with OpenTelemetry to send traces, metrics, and logs to Dash0 using the otel-instrumentation skill.

Configuration:
Endpoint: (gRPC, so set OTEL_EXPORTER_OTLP_PROTOCOL=grpc)
Auth header: Authorization=Bearer
Dataset: default (via Dash0-Dataset header)
Service name: dash0-java-demo
Service version and namespace as appropriate resource attributes

Use the OpenTelemetry Java agent (-javaagent:opentelemetry-javaagent.jar) — download it into the project. Don't hardcode the token in source; put env vars in a run.sh script that's gitignored, and document everything in a README. Follow OpenTelemetry semantic conventions for any custom spans or attributes you add.

When done, show me the exact commands to run the app and generate some traffic.

(All you need to run this prompt is to get your endpoint and token from your Dash0 Settings dialog, and put them in the placeholders above.)

Why This Prompt Works

A few things in there are deliberate:

"using the otel-instrumentation skill" — naming it explicitly nudges the agent to load it. Skills are supposed to auto-trigger on description match, but being explicit removes ambiguity, especially in tools where skill triggering is less aggressive than in Claude Code.
Concrete config values — agents do much better with copy-pasteable specifics than "set up Dash0." You'd otherwise spend two turns answering "what's your endpoint?"
gRPC protocol callout — port 4317 needs OTEL_EXPORTER_OTLP_PROTOCOL=grpc; without it, the agent might wire up the default HTTP protocol and silently fail. Worth pinning.
run.sh + gitignore — keeps your token out of source control. The agent will do this if asked; less reliably if not.
"show me the commands to run" — forces it to surface the verification path, not just dump files.

After The Agent Finishes

You should end up with, roughly pom.xml, src/main/java/.../Application.java plus a controller, opentelemetry-javaagent.jar, run.sh, .gitignore, README.md.

Then run it (also fine to do in your AI prompt):

./run.sh
# in another terminal, generate traffic:
for i in {1..50}; do curl -s localhost:8080/hello; curl -s localhost:8080/work; done

Then in Dash0 go to the Trace Explorer — filter by service.name = dash0-java-demo, you should see GET /hello and GET /work spans within 10–30 seconds.

Next, go to Integrations → Java → Install all dashboards if you haven't yet, then open JVM Metrics for the heap/GC/thread charts.

Follow Up Prompts You'll Likely Want

Once data is flowing, these are the natural next asks (each triggers a different skill):

"Add a custom span around the work logic in /work with a work.difficulty attribute that follows semantic conventions." → triggers otel-semantic-conventions for naming guidance.
"My span names look wrong — they're showing the full URL instead of the route. Fix that." → common Spring Boot gotcha, the instrumentation skill covers it.
"Set up an OpenTelemetry Collector in front of the app instead of exporting directly to Dash0." → triggers otel-collector.

That's pretty cool, get it here: https://github.com/dash0hq/agent-skills

The post From Zero (Really Zero) to OpenTelemetry appeared first on foojay.

BoxLang v1.13.0: Compatibility, Concurrency, and Formatter Maturity

Cristobal Escobar — Tue, 19 May 2026 12:11:19 +0000

Table of Contents

New Features

Character-Aware Trimming — trim(), ltrim(), rtrim()
getClassMetadata() by Absolute Path
SystemExecute() Environment Controls

The BoxLang Formatter Goes Production-ReadyAsync & Concurrency HardeningMiniServer: Security & ReliabilityCompatibility WinsChangelog Highlights

BoxLang 1.13.0 is a stability-first release with deep compatibility work and runtime hardening. This build closes 48 issues, with the majority focused on CFML compatibility edge cases, concurrency correctness, formatting parity, and miniserver/runtime reliability under real production loads.

While this release is bug-fix heavy, it still introduces several meaningful features and quality-of-life improvements: character-aware trimming, class metadata lookup by absolute path, process environment control in SystemExecute(), SOAP headers, new query column rename capabilities, and safer miniserver routing/security defaults.

New Features

Three additions that materially expand what the runtime can do.

Character-Aware Trimming — `trim()`, `ltrim()`, `rtrim()`

The string trimming BIFs now accept an optional chars argument. Strip arbitrary character sets without reaching for rereplace().

"**Urgent**".trim( "*" )       // "Urgent"
"000123".ltrim( "0" )          // "123"
"report....".rtrim( "." )      // "report"
"//path/to/dir//".trim( "/" )  // "path/to/dir"

Each character in chars is treated as an independent trim target — the same behavior you'd expect from Python or JavaScript. One less regex workaround.

`getClassMetadata()` by Absolute Path

Class metadata can now be loaded directly from a filesystem path, bypassing the class loader and import resolution entirely.

meta = getClassMetadata( "/opt/apps/models/User.bx" )
writeDump( meta.name )        // "User"
writeDump( meta.properties )  // array of property definitions
writeDump( meta.functions )   // array of function signatures

This is a cornerstone API for tooling. Linters, IDE integrations, documentation generators, and migration scanners can now inspect .bx and .cfc files without booting them into the runtime, firing onApplicationStart, or wrestling with import edge cases. The kind of unglamorous primitive that makes an ecosystem possible.

`SystemExecute()` Environment Controls

Two new arguments give you deterministic control over the environment of spawned child processes:

inheritEnvironment (boolean, default true) — when false, the child starts with a clean slate

environment (struct) — an explicit map of variables to inject

result = systemExecute(
    name               = "env",
    arguments          = "",
    inheritEnvironment = false,
    environment        = {
        APP_ENV   : "production",
        DB_HOST   : "internal.db.example.com",
        FEATURE_X : "true"
    }
)

writeOutput( result.output )

Before 1.13.0, every systemExecute() call inherited the full parent environment — including secrets, tokens, and internal config. Security-conscious deployments now have an explicit, auditable way to lock that down.

The BoxLang Formatter Goes Production-Ready

This is a flagship moment. The formatter graduates from experimental to production-grade and lands with a complete CI/CD integration surface.

The outcome you actually care about: when formatting is enforced in CI, pull requests stop being about whitespace and start being about logic again. For mixed BoxLang/CFML codebases, the legacy .cfformat.json compatibility path means you can adopt the formatter on legacy code today and migrate to BoxLang-native defaults on your own timeline.

Capabilities:

In-place formatting — boxlang format --input ./ formats an entire project tree
CI check mode — boxlang format --check --input ./ exits non-zero on any unformatted file (drop straight into GitHub Actions, GitLab CI, or Jenkins)
Stdout mode — boxlang format --overwrite false --input ./models/User.cfc for diff-friendly previews
Multi-extension — .bx, .bxs, .bxm, .cfm, .cfc, .cfs in a single pass

Config discovery fallback chain:

.bxformat.json — BoxLang-native config (Ortus gold-standard defaults)
.cfformat.json — legacy CFFormat config, auto-converted with migration-safe defaults
Built-in defaults — sensible behavior with zero config

Migration tooling built in:

# Generate a fresh .bxformat.json with defaults
boxlang format --initConfig

# Convert an existing .cfformat.json to .bxformat.json
boxlang format --convertConfig --input ./

Async & Concurrency Hardening

Concurrency bugs are the worst kind of bug — intermittent, non-deterministic, catastrophic when they hit production. 1.13.0 closes several long-standing race conditions and lifecycle issues across the async subsystem and threading layer.

API surface normalization. Missing async methods are restored: all(), allApply(), thenAsync(), delay(), and shutdownAndAwaitTermination() now exist with correct signatures. Positional spread arguments (...args) are supported in calls — unblocking a common functional-programming pattern.

args     = [ "Ada", "Lovelace" ]
fullName = formatName( ...args )

BoxFuture() lifecycle. A BoxFuture created during an HTTP request used to throw scope-access errors if the parent request completed before the future resolved. The context lifecycle is now properly decoupled — background work survives request teardown without touching stale scopes.

Concurrent array iteration. for/in loops over arrays no longer throw ConcurrentModificationException when the array is mutated from another thread.

Atomic class file writes. Class generation now uses a temp-file-then-atomic-rename pattern. No more transient zero-byte .class artifacts surfacing under parallel compilation — a race condition that produced some genuinely painful ClassNotFoundException reports in production.

MiniServer: Security & Reliability

The headline: a misconfigured miniserver no longer accidentally serves your source code or configuration over HTTP. The static-serving security filter now blocks hidden files and dotfiles, framework config artifacts (.boxlang.json, boxlang.json), and source files (.bx, .cfc) when not routed through the engine.

Pass predicate is now configurable through three channels — pick whichever fits your deployment model:

# CLI
boxlang server start --pass-predicate "/api/*"

// boxlang.json
{
  "web": {
    "passPredicate": "/api/*"
  }
}

# Environment variable
export BOXLANG_PASS_PREDICATE="/api/*"

Transfer reliability fixes:

Chunked encoding truncation fixed for large file responses (above the default buffer size)
Empty text-file uploads no longer throw illegal-state errors
content-length headers correctly computed across all response paths

Compatibility Wins

CFML compatibility is a continuous workstream, not a one-time port. This release closes a handful of high-impact gaps that real applications were tripping over.

SOAP header support. Consumers can now include optional

blocks for WS-Security, transactional metadata, and routing.

soapService.call(
    method  = "processOrder",
    headers = { Security : { UsernameToken : { Username : "admin" } } }
)

query.setColumnNames(). Query objects now support column renaming through a dedicated method, matching the Adobe CF and Lucee API.

q = queryNew( "fname,lname", "varchar,varchar", [ [ "Ada", "Lovelace" ] ] )
q.setColumnNames( [ "firstName", "lastName" ] )
writeDump( q.columnList )  // "firstName,lastName"

CLI .box.env support. The CLI now reads ~/.box.env on startup, loading user-level environment variables that persist across sessions.

# ~/.box.env
DB_HOST=localhost
DB_PORT=5432

Runtime Hardening
The unsexy stuff that matters. A condensed view of the deeper fixes shipped in this release:

Area	What Changed
Abort semantics	Corrected in web runtime Java `try/catch` boundaries
AppCDS paths	Deterministic, per-binary paths on Windows
Superclass init	Failed init no longer blocks class recreation retries
Module `onLoad()`	Request-context setup fixed for `dump()` template behavior
REST CFC mapping	Service-name routing corrected
Class creation	Broad performance optimizations in class loading and locator
JSA packages	Path handling fixed for `BOXLANG_HOME` with spaces
Zero timespan	`createTimeSpan( 0, 0, 0, 0 )` now correctly interpreted as no-cache
Remote methods	Force-write correctly under `enableOutputOnly`
Binary writes	Valid downloaded ZIP output restored
Numeric parsing	Leading-zero strings parsed safely
QoQ nesting	Nested-parentheses predicate parsing corrected
Custom tags	this scope no longer leaks from custom-tag context
`numberFormat()`	Major mask compatibility sweep across multiple tickets

Changelog Highlights

New Features

BL-2348: trim(), ltrim(), rtrim() gain chars argument
BL-2349: getClassMetadata() accepts absolute filesystem path
BL-2390: SystemExecute() gains inheritEnvironment and environment arguments

Improvements

BL-2078: SOAP header support for auth and security blocks
BL-2333: query.setColumnNames() compatibility API
BL-2354: Miniserver pass predicate configurability (CLI, JSON, env var)
BL-2355: Miniserver security handler upgrades
BL-2378: CLI reads ~/.box.env on startup
BL-2393: Chunked encoding truncation fix for large file responses
BL-2398: BoxLang-native formatting defaults aligned with Ortus conventions

Notable Bug Fixes

BL-2269: Missing async methods and signatures restored
BL-2336: Abort semantics corrected in web runtime try/catch boundaries
BL-2360: Positional spread arguments supported in calls
BL-2372: Concurrent modification exception fixed for array for/in
BL-2373: Class-file write race fixed with atomic write pattern
BL-2376: BoxFuture() context lifecycle fix after HTTP request completion
BL-2382: Binary write path fixed for valid downloaded ZIP output
BL-2386: QoQ nested-parentheses predicate parsing corrected
BL-2394: Custom-tag context no longer leaks incorrect this scope

View the full release report

BoxLang 1.13.0 is available now. Head to boxlang.io to get started, dig into the docs, and join us on the Ortus Community Slack to share what you're building.

The post BoxLang v1.13.0: Compatibility, Concurrency, and Formatter Maturity appeared first on foojay.

BoxLang AI Series: Complete Guide to Building AI Agents

Cristobal Escobar — Thu, 14 May 2026 09:26:48 +0000

Table of Contents

Start Here: A Practical OverviewThe Full SeriesWhat You’ll LearnKey ResourcesWhy BoxLang AIReady to Start Building?

The world of AI development is moving fast, but building real, production-ready AI agents doesn’t have to be complex.

This series walks you step by step through how to design, build, and deploy AI agents using BoxLang AI. Whether you’re exploring AI for the first time or looking to modernize your current applications, these guides will help you move from concept to implementation with clarity.

Start Here: A Practical Overview

If you’re new to BoxLang AI or want to understand what’s possible before diving into the technical details, start here:

https://foojay.io/today/how-to-develop-ai-agents-using-boxlang-ai-a-practical-guide/

This guide provides a high-level view of how to build AI agents, integrate multiple models, and design real-world workflows using BoxLang.

The Full Series

Follow the series in order to go from fundamentals to advanced implementations:

What You’ll Learn

Across this series, you’ll learn how to:

Build AI agents with memory, tools, and reasoning capabilities
Connect to multiple AI providers with a single unified API
Implement Retrieval-Augmented Generation (RAG) pipelines
Work with vector databases and document ingestion
Design scalable, production-ready AI workflows
Deploy AI agents in modern cloud environments

Key Resources

To help you go deeper and start building right away:

BoxLang AI - Playgroundhttps://ai.boxlang.io/
Official BoxLang AI - Documentationhttps://ai.ortusbooks.com/
BoxLang Website - https://boxlang.io/
GitHub Examples and Integrations - https://github.com/ortus-boxlang

Why BoxLang AI

BoxLang AI is designed to remove the complexity of working with multiple AI providers and tools. With a single API, you can build powerful AI-driven applications without vendor lock-in, while maintaining full control over your architecture.

If you’re working with legacy systems, BoxLang also allows you to introduce AI capabilities incrementally without needing a full rewrite.

Ready to Start Building?

Explore the series, try the examples, and start building your own AI agents today.

If you have questions or want to see how this can apply to your existing systems, feel free to reach out to the Ortus team.

The post BoxLang AI Series: Complete Guide to Building AI Agents appeared first on foojay.

How to Develop AI Agents Using BoxLang AI: A Practical Guide

Cristobal Escobar — Tue, 12 May 2026 12:52:39 +0000

Table of Contents

What we'll CoverPrerequisites

Step 1 — Install BoxLang
Step 2 — Install the bx-ai Module
Step 3 — Set Up Your .env File
Step 4 — Configure config/boxlang.json
Step 5 — Run Your First Script

What Are AI Agents?What Is BoxLang AI?Core Concept 1: ToolsCore Concept 2: MemoryCore Concept 3: The AgentHow to Put It All Together

What the Middleware Does

Streaming Responses

How Streaming Works
Simple Streaming with aiChatStream()
Agent Streaming with agent.stream()
Streaming to a Web Browser (BoxLang Web)
Consuming the Stream on the Frontend
Streaming with Accumulated Memory
When to Use Streaming

How the Agent ThinksGoing Further

Adding a Knowledge Base (RAG)
Human-in-the-Loop Approvals
Multi-Agent Escalation

ConclusionResources

AI agents are transforming how we build software. Unlike traditional chatbots that just answer questions, agents can reason about what tools they need, decide when to use them, chain multiple actions together, and remember what happened earlier in a conversation.

In this tutorial, I'll show you how to build a real-world AI agent using BoxLang AI — the official AI framework for the BoxLang JVM language. We'll build SupportBot, an e-commerce customer support agent that can look up orders, check inventory, issue refunds, and answer questions grounded in your knowledge base.

By the end you'll understand how AI agents work under the hood, and you'll have a fully working agent you can adapt for your own domain.

What we'll Cover

Prerequisites
What Are AI Agents?
What Is BoxLang AI?
Core Concept 1: Tools
Core Concept 2: Memory
Core Concept 3: The Agent
How to Put It All Together
Streaming Responses
How the Agent Thinks
Going Further
Conclusion

Prerequisites

Before diving in, you should be comfortable with:

BoxLang basics — You should know how to write BoxLang scripts, work with structs and arrays, and understand closures. If you're new, start with the Quick Start Guide.

Basic LLM familiarity — Knowing what a large language model is and having used one (via aiChat() or similar) will help you follow along.

Step 1 — Install BoxLang

Download and install BoxLang from boxlang.io, or use BVM (BoxLang Version Manager) to manage multiple versions:

# Install BVM
/bin/bash -c "$(curl -fsSL https://downloads.ortussolutions.com/ortussolutions/bvm/install.sh)"

# Install the latest BoxLang
bvm install latest
bvm use latest

# Verify
boxlang --version

Step 2 — Install the `bx-ai` Module

Install bx-ai locally into your project using the built-in module installer:

# Creates a boxlang_modules/ folder in your project
install-bx-module bx-ai --local

Your project structure will look like this:

my-project/
├── boxlang_modules/
│   └── bxai/               ← installed here
├── config/
│   └── boxlang.json        ← BoxLang configuration
├── .env                    ← your API keys (never commit this)
├── .env.example            ← template to share with your team
├── .gitignore
└── agent.bxs               ← your BoxLang scripts

Step 3 — Set Up Your `.env` File

Copy .env.example to .env and fill in at least one provider API key. Never commit .env to source control.

.env.example — commit this template so your team knows what keys are needed:

# BoxLang Custom Configuration — points BoxLang at your config file
BOXLANG_CONFIG=./config/boxlang.json

# AI Provider API Keys — fill in at least one
OPENAI_API_KEY=your-api-key
CLAUDE_API_KEY=your-api-key
GEMINI_API_KEY=your-api-key
GROK_API_KEY=your-api-key
GROQ_API_KEY=your-api-key
PERPLEXITY_API_KEY=your-api-key
OPENROUTER_API_KEY=your-api-key
MISTRAL_API_KEY=your-api-key
HUGGINGFACE_API_KEY=your-api-key
VOYAGE_API_KEY=your-api-key
COHERE_API_KEY=your-api-key
# AWS Bedrock
AWS_ACCESS_KEY_ID=your-key
AWS_SECRET_ACCESS_KEY=your-secret
AWS_REGION=us-east-1

.env — your actual keys, never committed:

BOXLANG_CONFIG=./config/boxlang.json
OPENAI_API_KEY=sk-proj-...

Add .env to your .gitignore:

.env
boxlang_modules/

Step 4 — `Configure config/boxlang.json`

BoxLang reads its configuration from the file pointed to by BOXLANG_CONFIG. The ${Setting: VAR_NAME not found} syntax reads directly from your .env file — your keys never live in the config file itself.

config/boxlang.json:

{
    "modules": {
        "bxai": {
            "settings": {
                "provider": "openai",
                "apiKey": "${Setting: OPENAI_API_KEY not found}",
                "defaultParams": {
                    "model": "gpt-4o",
                    "temperature": 0.2
                }
            }
        }
    }
}

Step 5 — Run Your First Script

Create agent.bxs and run it:

// agent.bxs
answer = aiChat( "What is BoxLang AI in one sentence?" )
println( answer )

boxlang agent.bxs

That's it — no build step, no compile, no server. BoxLang reads .env automatically, loads the bxai module from boxlang_modules/, and runs.

Switching Providers

To switch from OpenAI to Claude, change two lines in config/boxlang.json and add the key to .env:

{
    "modules": {
        "bxai": {
            "settings": {
                "provider": "claude",
                "apiKey": "${Setting: CLAUDE_API_KEY not found}",
                "defaultParams": {
                    "model": "claude-sonnet-4-5-20251001"
                }
            }
        }
    }
}

Your agent.bxs code doesn't change at all. This is the zero-vendor-lock-in promise in practice.

"💡 bx-ai supports 17 providers — OpenAI, Claude, Gemini, Ollama, Groq, and more. You can also run fully local AI with Ollama — no API key required, zero cost, complete privacy. See the provider docs for per-provider configuration."

What Are AI Agents?

Think of an AI agent as a chatbot that can act, not just respond. A traditional chatbot answers questions from what it knows. An agent can reach out and do things — query databases, call APIs, read files, send emails — and chain those actions together to solve multi-step problems.

┌─────────────────────────────────────────────────────────────┐
│                                                             │
│   TRADITIONAL CHATBOT           AI AGENT                    │
│   ──────────────────            ────────                    │
│                                                             │
│   User ──► LLM ──► Answer       User ──► Agent              │
│                                           │                 │
│   One shot. No tools.                     ├──► Tool A       │
│   No memory.                              ├──► Tool B       │
│                                           ├──► Memory       │
│                                           └──► Answer       │
│                                                             │
│                                 Reasons. Acts. Remembers.   │
└─────────────────────────────────────────────────────────────┘

Here's a conversation with the SupportBot we'll build:

User:  "Where is order #ORD-78291? It was supposed to arrive yesterday."

Agent: [Thinks: I need to look up that order]
Agent: [Calls get_order( orderId: "ORD-78291" )]
Agent: [Gets back: { status: "In Transit", carrier: "FedEx",
                     tracking: "794644792798",
                     estimatedDelivery: "2026-04-04" }]

Agent: "Your order #ORD-78291 is in transit with FedEx
        (tracking: 794644792798). It was delayed by one day
        and is now estimated to arrive tomorrow, April 4th."

The agent broke the problem down, picked the right tool, and synthesized the answer. This matters when:

Queries don't fit into predefined categories
Answering requires combining data from multiple sources
Users need to follow up on previous answers

What Is BoxLang AI?

BoxLang AI (bx-ai) is the official AI framework for BoxLang — a modern, dynamic JVM language. It provides a unified, fluent API for building AI agents, multi-model workflows, RAG pipelines, and AI-powered applications.

┌────────────────────────────────────────────────────────────────┐
│                     BoxLang AI Stack                           │
├────────────────────────────────────────────────────────────────┤
│                                                                │
│   Your Application Code                                        │
│   ─────────────────────────────────────────────────────────    │
│   aiAgent()  aiChat()  aiEmbed()  aiMemory()  aiTool()         │
│                                                                │
│   ─────────────────────────────────────────────────────────    │
│   Skills │ Middleware │ Tool Registry │ Memory │ Pipelines     │
│                                                                │
│   ─────────────────────────────────────────────────────────    │
│   OpenAI │ Claude │ Gemini │ Ollama │ Groq │ + 12 more         │
│                                                                │
└────────────────────────────────────────────────────────────────┘

Key properties that make it great for building agents:

- One API, 17 providers — switch from OpenAI to Claude by changing a config value, not code
- aiAgent() BIF — a fully featured agent with tools, memory, skills, and middleware
- Fluent tool definition — turn any closure into an AI-callable tool with aiTool()
- Multi-tenant memory — one agent instance safely handles thousands of concurrent users
- JVM-native — runs everywhere Java runs, with full Java interop

Core Concept 1: Tools

Tools are functions your AI agent can call. The framework passes the tool's name, description, and parameter schema to the LLM, which decides when and how to call them. When the LLM decides to use a tool, BoxLang AI executes it and feeds the result back.

┌──────────────────────────────────────────────────────────────┐
│                    How Tools Work                            │
│                                                              │
│  ┌─────────┐    "I need order data"    ┌──────────────────┐  │
│  │   LLM   │ ─────────────────────── ► │  get_order()     │  │
│  │         │                           │  • name          │  │
│  │         │ ◄───────────────────────  │  • description   │  │
│  └─────────┘    { status, tracking }   │  • parameters    │  │
│                                        └──────────────────┘  │
│                                                              │
│  The LLM reads the description to decide WHEN to call.       │
│  BoxLang AI handles the execution and result passing.        │
└──────────────────────────────────────────────────────────────┘

Defining a Tool with `aiTool()`

The simplest way to create a tool is with the aiTool() BIF and a closure:

getWeatherTool = aiTool(
    "get_weather",
    "Get the current weather for a city. Use when the user asks about weather conditions.",
    ( required city ) => {
        // In a real app you'd call a weather API here
        return { temp: 72, condition: "sunny", city: arguments.city }
    }
)

The three arguments are: name, description, and callable. The description is what the LLM reads to decide whether this is the right tool — write it like you're telling a colleague when to use it.

A Real Tool: `get_order`

Here's the first tool for our SupportBot. It looks up an order by ID:

// OrderTools.bx
class {

    property name="orderService";

    function init( required any orderService ) {
        variables.orderService = arguments.orderService
        return this
    }

    @AITool( "Retrieve a single order by order ID. Use first when a customer mentions a specific order number. Always call this before attempting a refund or cancellation." )
    public struct function get_order( required string orderId ) {
        var order = variables.orderService.findById( arguments.orderId )

        if ( isNull( order ) ) {
            return {
                found   : false,
                orderId : arguments.orderId,
                message : "Order #arguments.orderId# was not found. Please verify the order ID."
            }
        }

        return {
            found            : true,
            orderId          : order.getId(),
            status           : order.getStatus(),
            carrier          : order.getCarrier(),
            trackingNumber   : order.getTrackingNumber(),
            estimatedDelivery: order.getEstimatedDelivery().dateFormat( "long" ),
            items            : order.getItems().map( item => {
                return { name: item.getName(), qty: item.getQty(), price: item.getPrice() }
            } ),
            total            : order.getTotal(),
            summary          : "Order ##arguments.orderId# — #order.getStatus()# — Est. delivery: #order.getEstimatedDelivery().dateFormat( 'long' )#"
        }
    }

}

A few things to notice:

The @AITool annotation tells the AIToolRegistry scanner that this method is an AI-callable tool. The annotation value becomes the tool's description. When you call aiToolRegistry().scan( new OrderTools( orderService ), "support" ), it registers get_order@support automatically.

The return value includes a summary field. Rather than making the LLM parse a raw struct, you pre-compute a one-sentence summary it can read directly. Return both the data (for detailed reasoning) and the summary (for quick reading).

The not-found case returns a helpful struct instead of throwing. The LLM sees found: false and the message and can relay that to the user clearly — far better than an unhandled exception.

The Full `OrderTools` Class

class {

    property name="orderService";

    function init( required any orderService ) {
        variables.orderService = arguments.orderService
        return this
    }

    @AITool( "Retrieve a single order by order ID. Use first when a customer mentions a specific order number." )
    public struct function get_order( required string orderId ) {
        var order = variables.orderService.findById( arguments.orderId )
        if ( isNull( order ) ) {
            return { found: false, message: "Order #arguments.orderId# not found." }
        }
        return {
            found            : true,
            orderId          : order.getId(),
            status           : order.getStatus(),
            carrier          : order.getCarrier(),
            trackingNumber   : order.getTrackingNumber(),
            estimatedDelivery: order.getEstimatedDelivery().dateFormat( "long" ),
            total            : order.getTotal(),
            summary          : "Order ##arguments.orderId# — #order.getStatus()#"
        }
    }

    @AITool( "Search a customer's order history. Use when the customer asks about past orders, spending history, or recent purchases." )
    public struct function search_orders(
        required string customerEmail,
        string  status = "",
        numeric limit  = 10
    ) {
        var orders = variables.orderService.findByEmail(
            email  : arguments.customerEmail,
            status : arguments.status,
            limit  : arguments.limit
        )
        return {
            count  : orders.len(),
            orders : orders.map( o => { id: o.getId(), status: o.getStatus(), total: o.getTotal(), date: o.getCreatedAt().dateFormat( "short" ) } ),
            summary: "Found #orders.len()# orders for #arguments.customerEmail#"
        }
    }

    @AITool( "Issue a refund for a specific order. IMPORTANT: Only call this after confirming the order exists and the customer has explicitly requested a refund." )
    public struct function issue_refund(
        required string orderId,
        required string reason
    ) {
        var result = variables.orderService.refund(
            orderId: arguments.orderId,
            reason : arguments.reason
        )
        return {
            success       : result.isSuccess(),
            refundId      : result.getRefundId(),
            amount        : result.getAmount(),
            processingDays: 5,
            summary       : result.isSuccess()
                ? "Refund of $#result.getAmount()# issued for order ##arguments.orderId#. Allow 5 business days."
                : "Refund failed: #result.getError()#"
        }
    }

}

Tool Design Principles

┌─────────────────────────────────────────────────────────────────┐
│                  The 4 Tool Design Rules                        │
│                                                                 │
│  1. DESCRIPTION ── Tell the LLM exactly when (and when NOT)     │
│                    to call this tool. Be specific.              │
│                                                                 │
│  2. SUMMARY     ── Always return a pre-computed one-liner       │
│                    alongside raw data. Saves tokens.            │
│                                                                 │
│  3. NO THROWS   ── Return { success: false, message: "..." }    │
│                    instead of throwing. LLM can relay errors.   │
│                                                                 │
│  4. CAP RESULTS ── Always use a limit param. Never return       │
│                    unbounded arrays to the LLM.                 │
└─────────────────────────────────────────────────────────────────┘

Write the description like you're training a new colleague:

// ❌ Vague — LLM won't know when to call this
@AITool( "Gets order information" )

// ✅ Clear — tells the LLM exactly when and what
@AITool( "Retrieve a single order by order ID. Use first when a customer mentions
          a specific order number. Do not call without an explicit order ID." )

Core Concept 2: Memory

Memory is what separates a stateful agent from a stateless API call. Without memory, every message is processed in isolation. With memory, the agent carries the full conversation thread.

┌────────────────────────────────────────────────────────────────┐
│               Without Memory  vs  With Memory                  │
│                                                                │
│  WITHOUT                       WITH                            │
│  ──────────────────            ────────────────────            │
│                                                                │
│  Turn 1:                       Turn 1:                         │
│  User: "My order is late"      User: "My order is late"        │
│  Agent: "Which order?"         Agent: "Which order?"           │
│                                                                │
│  Turn 2:                       Turn 2:                         │
│  User: "ORD-78291"             User: "ORD-78291"               │
│  Agent: "Which order?" ❌       Agent: [looks up ORD-78291] ✅ │
│                                                                │
│  Each call is isolated.        Full context is preserved.      │
└────────────────────────────────────────────────────────────────┘

BoxLang AI ships 20+ memory types. Here are the three you'll use most.

Window Memory — Short-Term Conversation History

Window memory keeps the last N messages. It's the minimum you need for a coherent conversation:

memory = aiMemory( "window", config: { maxMessages: 20 } )

What the memory stores as a conversation builds:

After Turn 1:
┌─────────────────────────────────────────────────────┐
│  user      │ "Where is order #ORD-78291?"           │
│  assistant │ "Your order is in transit..."          │
└─────────────────────────────────────────────────────┘

After Turn 2:
┌─────────────────────────────────────────────────────┐
│  user      │ "Where is order #ORD-78291?"           │
│  assistant │ "Your order is in transit..."          │
│  user      │ "When exactly will it arrive?"         │
│  assistant │ "It's estimated to arrive April 4th."  │
└─────────────────────────────────────────────────────┘

Without memory, "When exactly will it arrive?" has no context — "it" refers to nothing. With memory, the agent knows what "it" means.

Cache Memory — Multi-Tenant Production

For web applications serving multiple users, you need one agent instance that's safe across concurrent requests:

memory = aiMemory( "cache" )

Every memory operation accepts userId and conversationId to route each read/write to the right isolated conversation:

┌──────────────────────────────────────────────────────────────┐
│              One Memory Instance, Many Users                 │
│                                                              │
│  ┌──────────┐                                                │
│  │  Alice   │──► add( msg, userId:"alice", convId:"t-101" )  │
│  └──────────┘                    │                           │
│                                  ▼                           │
│                         ┌────────────────┐                   │
│                         │  Cache Memory  │                   │
│                         │  ──────────── │                    │
│  ┌──────────┐           │  alice/t-101  │                    │
│  │   Bob    │──────────►│  bob/t-102    │                    │
│  └──────────┘           │  carol/t-103  │                    │
│                         └────────────────┘                   │
│                                  │                           │
│  getAll( userId:"alice" ) ───────┘  Returns ONLY Alice's     │
│                                     messages. Bob isolated.  │
└──────────────────────────────────────────────────────────────┘

When you pass userId and conversationId through agent.run() options, they flow automatically to all memory operations — no explicit wiring needed:

// Same agent instance, fully isolated per user
agent.run( "My order is late.", {}, { userId: "alice@example.com", conversationId: "ticket-101" } )
agent.run( "I need a refund.",  {}, { userId: "bob@example.com",   conversationId: "ticket-102" } )

No per-user agent factories. No thread-local hacks. One instance handles thousands of concurrent users safely.

Summary Memory — Long Conversations

For long support sessions, summary memory auto-compresses old messages to preserve context without token bloat:

memory = aiMemory( "summary", config: {
    maxMessages      : 40,
    summaryThreshold : 20,
    summaryModel     : "gpt-4o-mini"   // use a cheap model for summarization
} )

              How Summary Memory Works

Messages 1-20 accumulate normally...

At message 21:
┌──────────────────────────────────────────────────┐
│  Messages 1–20  ──► LLM summarizes ──►           │
│  "Customer reported damaged item on order        │
│   ORD-78291. Refund of $89.99 discussed."        │
└──────────────────────────────────────────────────┘
       │
       ▼
┌──────────────────────────────────────────────────┐
│  [SUMMARY]  +  Messages 21–40                    │
│  Full context preserved, fraction of the tokens  │
└──────────────────────────────────────────────────┘

Core Concept 3: The Agent

With tools and memory defined, the agent is the piece that ties them together. In BoxLang AI, aiAgent() is a single BIF call that gives you a fully autonomous agent.

┌──────────────────────────────────────────────────────────────┐
│                    The Agent is the Glue                     │
│                                                              │
│   ┌──────────┐   ┌──────────┐   ┌──────────┐                 │
│   │  Tools   │   │  Memory  │   │  Skills  │                 │
│   └────┬─────┘   └────┬─────┘   └────┬─────┘                 │
│        │              │              │                       │
│        └──────────────┼──────────────┘                       │
│                       │                                      │
│                  ┌────▼─────┐                                │
│                  │  Agent   │◄── Instructions                │
│                  │          │◄── Middleware                  │
│                  └────┬─────┘                                │
│                       │                                      │
│                  ┌────▼─────┐                                │
│                  │   LLM    │  (any of 17 providers)         │
│                  └──────────┘                                │
└──────────────────────────────────────────────────────────────┘

The Simplest Possible Agent

// Window memory by default with 20 messages
agent = aiAgent(
    name   : "SupportBot",
    tools  : [ getOrderTool, searchOrdersTool, issueRefundTool ]
)

response = agent.run( "Where is order #ORD-78291?" )
println( response )

That's it. The agent handles the full reasoning loop: deciding when to call tools, passing results back to the LLM, and producing a final response.

Giving the Agent an Identity

A well-defined description and instructions dramatically improve agent behavior:

agent = aiAgent(
    name         : "SupportBot",
    description  : "Customer support specialist for Acme Store. Expert in orders, shipping, returns, and product questions.",
    instructions : "
        You are a friendly and efficient customer support agent.
        Always look up order details before discussing specific orders.
        Confirm refund requests explicitly before calling issue_refund.
        Lead with the direct answer, then add supporting detail.
        If you cannot resolve an issue, offer to escalate to a human agent.
    ",
    tools        : [ getOrderTool, searchOrdersTool, issueRefundTool ],
    memory       : aiMemory( "cache" )
)

The Agent Run Lifecycle

┌──────────────────────────────────────────────────────────────┐
│                  Agent Run Lifecycle                         │
│                                                              │
│  agent.run( "My order is late" )                             │
│        │                                                     │
│        ▼                                                     │
│  ┌─────────────────────────────────────────────────────┐     │
│  │  1. Resolve userId / conversationId for this call   │     │
│  │  2. Build system message (description + instructions│     │
│  │     + skills + tool list)                           │     │
│  │  3. Load conversation history from memory           │     │
│  │  4. Assemble: [system, ...history, user message]    │     │
│  └────────────────────┬────────────────────────────────┘     │
│                       │                                      │
│                       ▼                                      │
│              ┌────────────────┐                              │
│              │   LLM Call     │                              │
│              └───────┬────────┘                              │
│                      │                                       │
│              Tool calls?                                     │
│              ┌───────┴────────┐                              │
│             YES               NO                             │
│              │                │                              │
│              ▼                ▼                              │
│       ┌────────────┐   ┌────────────────┐                    │
│       │ Execute    │   │ Store in memory│                    │
│       │ each tool  │   │ Return answer  │                    │
│       └─────┬──────┘   └────────────────┘                    │
│             │                                                │
│             └──► back to LLM Call (loop)                     │
│                                                              │
└──────────────────────────────────────────────────────────────┘

This loop is what makes the agent autonomous — it keeps calling tools until it has everything it needs to produce a final answer.

How to Put It All Together

Here's the complete SupportBot:

// SupportBot.bx
import bxModules.bxai.models.middleware.core.LoggingMiddleware;
import bxModules.bxai.models.middleware.core.GuardrailMiddleware;
import bxModules.bxai.models.middleware.core.MaxToolCallsMiddleware;

class {

    property name="agent";

    /**
     * Wire up the agent with tools, memory, and middleware.
     *
     * @orderService   Your order data service
     * @kbVectorMemory Vector memory backed by your knowledge base (optional)
     */
    function init( required any orderService, any kbVectorMemory ) {
        // 1. Register tools by scanning the OrderTools class
        aiToolRegistry().scan( new OrderTools( arguments.orderService ), "support" )

        // 2. Build the agent
        variables.agent = aiAgent(
            name        : "SupportBot",
            description : "Customer support specialist for Acme Store.",
            instructions: "
                You are a friendly and efficient customer support agent.
                Always call get_order before discussing a specific order.
                Confirm refunds explicitly before calling issue_refund.
                Lead with the direct answer, then add supporting detail.
                If you cannot resolve an issue, offer to escalate.
            ",
            tools       : [ "get_order@support", "search_orders@support", "issue_refund@support", "now@bxai" ],
            memory      : aiMemory( "cache" ),
            middleware  : [
                new LoggingMiddleware( logToConsole: true, prefix: "[SupportBot]" ),
                new GuardrailMiddleware( blockedTools: [ "delete_order" ] ),
                new MaxToolCallsMiddleware( maxCalls: 8 )
            ]
        )

        // 3. Optionally seed with a knowledge base for RAG
        if ( !isNull( arguments.kbVectorMemory ) ) {
            variables.agent.addMemory( arguments.kbVectorMemory )
        }

        return this
    }

    /**
     * Handle a customer message — returns the full response string.
     */
    string function handle(
        required string message,
        required string userId,
        required string conversationId
    ) {
        return variables.agent.run(
            arguments.message,
            {},
            {
                userId        : arguments.userId,
                conversationId: arguments.conversationId
            }
        )
    }

}

What the Middleware Does

┌────────────────────────────────────────────────────────────────┐
│                  Middleware Stack                              │
│                                                                │
│  Every agent.run() call passes through:                        │
│                                                                │
│  ┌──────────────────────────────────────────────────────────┐  │
│  │  LoggingMiddleware   — logs every LLM call + tool call   │  │
│  │  GuardrailMiddleware — blocks forbidden tools (delete_*) │  │
│  │  MaxToolCallsMiddleware — stops runaway loops at 8 calls │  │
│  └──────────────────────────────────────────────────────────┘  │
│           │             │                │                     │
│           ▼             ▼                ▼                     │
│       ai.log      reject call       cancel run                 │
│                   with error        gracefully                 │
└────────────────────────────────────────────────────────────────┘

LoggingMiddleware logs every agent run, LLM call, and tool invocation to BoxLang's ai log file. In development you'll see exactly what the agent is doing. In production, disable logToConsole and write to the log for observability.

GuardrailMiddleware blocks delete_order permanently — even if the LLM somehow decides to call it. Defense-in-depth for high-stakes operations.

MaxToolCallsMiddleware prevents runaway agents. If the agent gets stuck in a tool-calling loop, it hits the cap and stops with a clear error rather than burning tokens indefinitely.

Streaming Responses

For web UIs and real-time applications, you want the agent's response to appear token-by-token as it's generated — like typing. This is what makes AI feel alive rather than frozen.

BoxLang AI supports streaming at every level: direct model calls, agent runs, and web responses.

How Streaming Works

┌──────────────────────────────────────────────────────────────┐
│                   Streaming vs Blocking                      │
│                                                              │
│  BLOCKING (default)                                          │
│  ──────────────────                                          │
│  User sends message                                          │
│  ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░  (waiting 2–8 seconds)     │
│  Full response arrives at once                               │
│                                                              │
│  STREAMING                                                   │
│  ─────────                                                   │
│  User sends message                                          │
│  "Your" ► " order" ► " #ORD" ► "-78291" ► " is" ► ...        │
│  Response appears immediately, token by token                │
└──────────────────────────────────────────────────────────────┘

Simple Streaming with `aiChatStream()`

For basic streaming without an agent:

// Stream a response token by token
aiChatStream(
    messages : "Explain how BoxLang AI handles tool calling",
    callback : chunk => {
        // Each chunk contains a delta with partial content
        var token = chunk.choices?.first()?.delta?.content ?: ""
        if ( token.len() ) {
            writeOutput( token )
            bx:flush;  // push each token to the browser immediately
        }
    },
    params   : { model: "gpt-4o" }
)

Agent Streaming with `agent.stream()`

The stream() method on AiAgent works exactly like run() but delivers the response token by token. Tool calls still execute synchronously under the hood — the streaming applies to the final text response:

// SupportBot.bx — add this alongside the handle() method
void function handleStream(
    required string   message,
    required string   userId,
    required string   conversationId,
    required function onChunk
) {
    variables.agent.stream(
        onChunk : arguments.onChunk,
        input   : arguments.message,
        options : {
            userId        : arguments.userId,
            conversationId: arguments.conversationId
        }
    )
}

Streaming to a Web Browser (BoxLang Web)

Here's how to wire streaming to a real HTTP response — tokens pushed to the browser as they arrive:

// handlers/SupportStreamHandler.bx
class {

    property name="supportBot" inject="SupportBot";

    function stream( event, rc, prc ) {
        var userId         = auth.getCurrentUser().getEmail()
        var conversationId = rc.ticketId
        
        // Use BoxLang's Native SSE Streamer
        SSE(
            callback          : ( emitter ) => {
                supportBot.handleStream(
                    message        : rc.message,
                    userId         : userId,
                    conversationId : conversationId,
                    onChunk        : chunk => {
                        if ( emitter.isClosed() ) {
                            return
                        }
                        var token = chunk.choices?.first()?.delta?.content ?: ""
                        if ( token.len() ) {
                            emitter.send( token, "token" )
                        }
                    }
                )
                emitter.send( { complete: true }, "done" )
                emitter.close()
            },
            keepAliveInterval : 30000,
            cors              : ""
        )
    }

}

Consuming the Stream on the Frontend

On the client side, use the standard EventSource API or fetch with a readable stream:

// JavaScript — connect to the SSE stream
const eventSource = new EventSource(
    `/support/stream?ticketId=${Setting: ticketId not found}&message=${Setting: encodeURIComponent(message) not found}`
);

const responseEl = document.getElementById( "agent-response" );

eventSource.onmessage = ( event ) => {
    if ( event.data === "[DONE]" ) {
        eventSource.close();
        return;
    }
    // Append each token as it arrives
    responseEl.textContent += event.data;
};

eventSource.onerror = () => eventSource.close();

Streaming with Accumulated Memory

One important detail: even in streaming mode, the full response is stored in memory after the stream completes. The AiAgent.stream() method accumulates tokens internally and saves them when done:

// From AiAgent.bx — the wrapped callback pattern
var accumulated = ""
var wrappedCallback = ( chunk ) => {
    var content = chunk.choices?.first()?.delta?.content ?: ""
    accumulated &= content        // accumulate for memory
    userOnChunk( chunk )          // forward to your callback
}

// After streaming completes, store the full response
storeInMemory( userMessage, { role: "assistant", content: accumulated }, userId, conversationId )

This means streaming and memory work seamlessly together — the user sees tokens as they arrive, and the next turn has the full conversation history.

When to Use Streaming

┌──────────────────────────────────────────────────────────────┐
│               Streaming Decision Guide                       │
│                                                              │
│  USE streaming when:                                         │
│  • Building a chat UI where responsiveness matters           │
│  • Responses are long (> 2-3 sentences)                      │
│  • You want a "typing" feel for the user                     │
│  • Delivering to a browser over HTTP                         │
│                                                              │
│  USE blocking (agent.run()) when:                            │
│  • Processing in a background job or batch pipeline          │
│  • The caller needs the complete response before proceeding  │
│  • Building an API that returns JSON                         │
│  • Writing tests (deterministic, easier to assert)           │
└──────────────────────────────────────────────────────────────┘

How the Agent Thinks

Let's trace exactly what happens for a real multi-step request: "My order #ORD-78291 arrived damaged. I want a refund."

┌──────────────────────────────────────────────────────────────┐
│              Full Agent Execution Trace                      │
│                                                              │
│  USER: "My order #ORD-78291 arrived damaged. I want          │
│         a refund."                                           │
│         │                                                    │
│         ▼                                                    │
│  ┌─────────────────────────────────────────────────────┐     │
│  │  LLM CALL 1                                         │     │
│  │  "Customer wants refund. Look up order first."      │     │
│  │  → tool_call: get_order( "ORD-78291" )              │     │
│  └───────────────────┬─────────────────────────────────┘     │
│                      │                                       │
│         ▼            ▼                                       │
│  ┌─────────────────────────────────────────────────────┐     │
│  │  TOOL: get_order                                    │     │
│  │  { found: true, status: "Delivered",                │     │
│  │    total: 89.99, summary: "Order #ORD-78291..." }   │     │
│  └───────────────────┬─────────────────────────────────┘     │
│                      │                                       │
│                      ▼                                       │
│  ┌─────────────────────────────────────────────────────┐     │
│  │  LLM CALL 2                                         │     │
│  │  "Order confirmed. Instructions say confirm         │     │
│  │  before issuing refund."                            │     │
│  │  → text: "Can you confirm the $89.99 refund?"       │     │
│  └───────────────────┬─────────────────────────────────┘     │
│                      │                                       │
│  USER: "Yes, please go ahead."                               │
│                      │                                       │
│                      ▼                                       │
│  ┌─────────────────────────────────────────────────────┐     │
│  │  LLM CALL 3                                         │     │
│  │  "Customer confirmed. Issue the refund."            │     │
│  │  → tool_call: issue_refund( "ORD-78291",            │     │
│  │                             "Item arrived damaged" )│     │
│  └───────────────────┬─────────────────────────────────┘     │
│                      │                                       │
│                      ▼                                       │
│  ┌─────────────────────────────────────────────────────┐     │
│  │  TOOL: issue_refund                                 │     │
│  │  { success: true, refundId: "REF-44821",            │     │
│  │    amount: 89.99, processingDays: 5 }               │     │
│  └───────────────────┬─────────────────────────────────┘     │
│                      │                                       │
│                      ▼                                       │
│  ┌─────────────────────────────────────────────────────┐     │
│  │  LLM CALL 4                                         │     │
│  │  "Refund confirmed. Compose final response."        │     │
│  │  → text: "Your refund of $89.99 has been            │     │
│  │           processed (REF-44821)..."                 │     │
│  └──────────────────────────────────────────────────── ┘     │
│                      │                                       │
│                      ▼                                       │
│  ┌─────────────────────────────────────────────────────┐     │
│  │  STORE in memory (scoped to this user + ticket)     │     │
│  │  RETURN to caller                                   │     │
│  └─────────────────────────────────────────────────────┘     │
└──────────────────────────────────────────────────────────────┘

The agent confirms before acting (because the instructions say to), executes the tool only after explicit confirmation, and builds the full response from the tool result. This is the multi-step reasoning that makes agents genuinely useful.

What the conversation history looks like at the end:

┌────────────────────────────────────────────────────────────┐
│  Role        │  Content                                    │
├──────────────┼─────────────────────────────────────────────┤
│  system      │  "You are SupportBot..."                    │
│  user        │  "My order arrived damaged..."              │
│  assistant   │  [tool_call: get_order]                     │
│  tool        │  { found:true, status:"Delivered"... }      │
│  assistant   │  "Can you confirm the $89.99 refund?"       │
│  user        │  "Yes, please go ahead."                    │
│  assistant   │  [tool_call: issue_refund]                  │
│  tool        │  { success:true, refundId:"REF-44821"... }  │
│  assistant   │  "Your refund of $89.99 has been issued..." │
└────────────────────────────────────────────────────────────┘

Going Further

The SupportBot above covers the essentials. Here's what to add for production.

Adding a Knowledge Base (RAG)

Ingest your documentation into vector memory and the agent retrieves relevant content automatically before answering:

// One-time ingestion (run when docs change)
vectorMemory = aiMemory( "chroma", config: {
    collection       : "support_kb",
    embeddingProvider: "openai",
    embeddingModel   : "text-embedding-3-small"
} )

result = aiDocuments(
    source : "/knowledge-base",
    config : { type: "directory", recursive: true, extensions: [ "md", "txt" ] }
).toMemory(
    memory  : vectorMemory,
    options : { chunkSize: 800, overlap: 150 }
)
println( "Loaded #result.documentsIn# docs → #result.chunksOut# chunks" )

┌──────────────────────────────────────────────────────────────┐
│                    RAG Pipeline                              │
│                                                              │
│  INGESTION (run once)                                        │
│  ─────────────────────────────────────────────────────────   │
│  /knowledge-base/*.md                                        │
│        │                                                     │
│        ▼                                                     │
│  aiDocuments() ──► chunk ──► embed ──► store in ChromaDB     │
│                                                              │
│  QUERY (every agent.run())                                   │
│  ─────────────────────────────────────────────────────────   │
│  User: "What is your return policy?"                         │
│        │                                                     │
│        ▼                                                     │
│  Vector search: find top-5 semantically similar chunks       │
│        │                                                     │
│        ▼                                                     │
│  Inject chunks into LLM context                              │
│        │                                                     │
│        ▼                                                     │
│  LLM answers from YOUR actual docs, not hallucinations       │
└──────────────────────────────────────────────────────────────┘

Human-in-the-Loop Approvals

For refunds above a threshold, require a supervisor to approve before the refund executes:

import bxModules.bxai.models.middleware.core.HumanInTheLoopMiddleware;

agent = aiAgent(
    name       : "SupportBot",
    middleware : [
        new LoggingMiddleware(),
        new GuardrailMiddleware( blockedTools: [ "delete_order" ] ),
        new MaxToolCallsMiddleware( maxCalls: 8 ),
        new HumanInTheLoopMiddleware(
            mode                  : "web",
            toolsRequiringApproval: [ "issue_refund" ]
        )
    ],
    checkpointer: aiMemory( "cache" )
)

┌──────────────────────────────────────────────────────────────┐
│            Human-in-the-Loop Flow                            │
│                                                              │
│  Agent reaches issue_refund tool call                        │
│        │                                                     │
│        ▼                                                     │
│  HumanInTheLoopMiddleware intercepts                         │
│        │                                                     │
│        ▼                                                     │
│  result.isSuspended() == true                                │
│  Agent saves checkpoint to cache memory                      │
│        │                                                     │
│        ▼                                                     │
│  Your code notifies supervisor (Slack, email, dashboard)     │
│        │                                                     │
│        ▼                                                     │
│  Supervisor approves / rejects / edits args                  │
│        │                                                     │
│        ├── approve ──► agent.resume( "approve", threadId )   │
│        ├── reject  ──► agent.resume( "reject",  threadId )   │
│        └── edit    ──► agent.resume( "edit", threadId,       │
│                             { correctedArgs: { amount:100 }} │
└──────────────────────────────────────────────────────────────┘

Multi-Agent Escalation

For complex issues, automatically delegate to a specialist:

billingAgent = aiAgent(
    name        : "BillingSpecialist",
    description : "Expert in billing disputes, chargebacks, and payment issues",
    tools       : [ "get_payment_history@billing", "dispute_charge@billing" ]
)

// SupportBot gets a delegate_to_billing-specialist tool automatically
supportBot = aiAgent(
    name      : "SupportBot",
    subAgents : [ billingAgent ]
)

┌──────────────────────────────────────────────────────────────┐
│               Multi-Agent Hierarchy                          │
│                                                              │
│            ┌─────────────────┐                               │
│            │   SupportBot    │  (coordinator)                │
│            │  (root agent)   │                               │
│            └────────┬────────┘                               │
│                     │                                        │
│          ┌──────────┴───────────┐                            │
│          │                      │                            │
│  ┌───────┴───────┐    ┌─────────┴──────────┐                 │
│  │   Billing     │    │     Returns &      │                 │
│  │  Specialist   │    │     Shipping       │                 │
│  └───────────────┘    └────────────────────┘                 │
│                                                              │
│  Each sub-agent appears as a "delegate_to_*" tool.           │
│  The LLM decides when to delegate — no routing code needed.  │
└──────────────────────────────────────────────────────────────┘

Conclusion

Building an AI agent with BoxLang AI comes down to three concepts:

┌──────────────────────────────────────────────────────────────┐
│                  The Three Core Concepts                     │
│                                                              │
│  1. TOOLS    ──  Functions your agent can call               │
│                  @AITool annotation or aiTool() BIF          │
│                  Registered once, referenced by name         │
│                                                              │
│  2. MEMORY   ──  Conversation history that makes it          │
│                  stateful and multi-tenant safe              │
│                  window / cache / summary / vector           │
│                                                              │
│  3. AGENT    ──  The reasoning loop that ties it together    │
│                  aiAgent() with instructions + middleware    │
│                  Handles the tool-call loop automatically    │
└──────────────────────────────────────────────────────────────┘

The framework handles the hard parts: the tool-calling loop, memory isolation, provider differences, lifecycle events, and cross-cutting concerns like logging and rate limiting. You focus on your domain logic — the tools that do the actual work.

The full SupportBot example shows how these pieces combine in a real application. The same patterns apply to any domain: financial assistants, developer tools, data analysis agents, document processors — whatever problem you're solving, the architecture is the same.

Resources

📖 BoxLang AI Documentation
🐙 BoxLang AI GitHub
🎓 AI BootCamp — hands-on course covering all concepts in this guide
💬 BoxLang Community Slack
📦 ForgeBox Package

# Start building
install-bx-module bx-ai
boxlang my-agent.bxs

The post How to Develop AI Agents Using BoxLang AI: A Practical Guide appeared first on foojay.

foojay – a place for friends of OpenJDK

BoxLang AI 3.2.0 — Image Generation, Web Search, Fluent Audio, Agent Registry & MCP Observability

Why Enterprise Java Teams Need Quality Gates Even More in the Age of AI

Enterprise quality is a scaling problem

Local differences become delivery problems

Noisy diffs hurt review quality

IDE-based quality control is not enough

AI needs deterministic boundaries

What enterprise quality gates should check

Formatting is only one source-code gate

Java member ordering is harder than it looks

The missing layer: JHarmonizer

Where it fits in the Java quality stack

Conclusion

Intro to the BoxLang Formatter

What Is It?

Getting Started in 60 Seconds

Configure Your Style

Lock It Down in CI

Recommended Team Workflow

Format on Save in VS Code

Coming from cfformat?

A Few Other Handy Options

The Bottom Line

Introducing bx-jwt: Enterprise-Grade JSON Web Tokens for BoxLang

What is bx-jwt?

Two APIs, One Module

The Fluent Builder — jwtNew()

The BIF Functions

Get Started in Seconds

HMAC Sign and Verify

RSA Sign and Verify

JWE Encryption

Enterprise Key Management with the Key Registry

Security by Default — Not by Configuration 🛡️

alg:none Rejection

HMAC Minimum Key Lengths (RFC 7518 §3.2)

Algorithm Allowlist

Clock Skew Tolerance

Real-World Patterns

Authentication Middleware

Token Refresh with Grace Period

Kid-Based Key Rotation

Full Algorithm Support

Signing (JWS)

Encryption (JWE)

Installation

BoxLang+/++/Starter

Resources

Context Is a Budget — Eight levers and three workflow patterns

Where the tokens actually go

The Eight Levers

A. Context engineering — scope your asks

B. Prompt caching — order matters

C. Tool & MCP hygiene — every schema is a tax

D. Custom instructions & skills — codify it once

E. Model routing — start cheap, escalate when stuck

F. Output discipline — diffs, not novels

G. Repo hygiene — what the indexer sees

H. Observability — latency is your token meter

Three workflow patterns that compound

1. The Ralph Wiggum loop

2. Auto-compact

3. Planner → Implementer → Reviewer (agent handover)

The Monday checklist

Closing

Introducing skills.boxlang.io — The Open Agent Skills Ecosystem for BoxLang & the Ortus World

🤔 The Problem: AI Knowledge Doesn't Scale by Copy-Paste

🎓 What Is a Skill?

📥 Install in Seconds: Two Paths, One Standard

⚡ Option 1 — npx skills (works everywhere)

🥊 Option 2 — ColdBox CLI (deep BoxLang/ColdBox integration)

🔷 Core Repositories — Curated by Ortus

⭐ A Taste of What's Available

🌐 Submit Your Own — Community Skills, Security First

🛠 How Your Agent Actually Uses It

🔮 Why This Matters Beyond BoxLang

🎯 Get Started Now

📚 Resources

From Zero (Really Zero) to OpenTelemetry

The Fluent Builder — `jwtNew()`

`alg:none` Rejection

⚡ Option 1 — `npx skills` (works everywhere)

Character-Aware Trimming — `trim()`, `ltrim()`, `rtrim()`

`getClassMetadata()` by Absolute Path

`SystemExecute()` Environment Controls

Step 2 — Install the `bx-ai` Module

Step 3 — Set Up Your `.env` File

Step 4 — `Configure config/boxlang.json`

Defining a Tool with `aiTool()`

A Real Tool: `get_order`

The Full `OrderTools` Class

Simple Streaming with `aiChatStream()`

Agent Streaming with `agent.stream()`