<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:cc="http://cyber.law.harvard.edu/rss/creativeCommonsRssModule.html">
    <channel>
        <title><![CDATA[Stories by Thiago Gonzaga on Medium]]></title>
        <description><![CDATA[Stories by Thiago Gonzaga on Medium]]></description>
        <link>https://medium.com/@devops_thiago?source=rss-437038ced80d------2</link>
        <image>
            <url>https://cdn-images-1.medium.com/fit/c/150/150/1*pMc4OTuRjYa7Js5qE0vj1w.jpeg</url>
            <title>Stories by Thiago Gonzaga on Medium</title>
            <link>https://medium.com/@devops_thiago?source=rss-437038ced80d------2</link>
        </image>
        <generator>Medium</generator>
        <lastBuildDate>Sun, 17 May 2026 10:16:39 GMT</lastBuildDate>
        <atom:link href="https://medium.com/@devops_thiago/feed" rel="self" type="application/rss+xml"/>
        <webMaster><![CDATA[yourfriends@medium.com]]></webMaster>
        <atom:link href="http://medium.superfeedr.com" rel="hub"/>
        <item>
            <title><![CDATA[RAG in Real World: How to Use Java and LangChain4j with Corporate Data]]></title>
            <link>https://arquivolivre.com.br/rag-in-real-world-how-to-use-java-and-langchain4j-with-corporate-data-b4b6d9691592?source=rss-437038ced80d------2</link>
            <guid isPermaLink="false">https://medium.com/p/b4b6d9691592</guid>
            <category><![CDATA[software-engineering]]></category>
            <category><![CDATA[llm]]></category>
            <category><![CDATA[java]]></category>
            <category><![CDATA[rags]]></category>
            <category><![CDATA[langchain4j]]></category>
            <dc:creator><![CDATA[Thiago Gonzaga]]></dc:creator>
            <pubDate>Fri, 21 Nov 2025 12:46:01 GMT</pubDate>
            <atom:updated>2025-11-21T12:46:01.395Z</atom:updated>
<content:encoded><![CDATA[<figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*S7-veG3X6GZkiMeS4NXWwQ.png" /></figure><h3>Overview</h3><p>It’s undeniable that generative AI has exploded in scale, kicking off a race among big tech companies for leadership in this space. Naturally, companies of all sizes want to ride this new technological wave.</p><p>However, when we’re talking about corporate data, several critical challenges arise: compliance, information security, intellectual property, and other factors that make it unfeasible to send sensitive data to public LLM APIs.</p><p>On top of that, many organizations face another issue: valuable information is scattered across multiple platforms, often poorly structured, making it hard for teams to access it efficiently.</p><p>Ironically, LLMs themselves have inherent limitations: studies show they can hallucinate in up to 15–20% of cases when they lack proper context. Token costs can make some solutions prohibitively expensive to run at scale.</p><p>So the big question is: how can we use AI safely with corporate data, deliver reliable results, and at the same time reduce token usage and costs?</p><p>The answer is RAG (Retrieval-Augmented Generation), and in this article, I’ll show how to implement it in Java using LangChain4j.</p><h3>What is RAG (Retrieval-Augmented Generation)?</h3><p>RAG is a technique that extends an LLM’s ability to answer questions without retraining the model. Instead of relying only on the model’s built-in knowledge, RAG fetches external data — documents, databases, APIs — to provide more specific, accurate, and up-to-date answers, without the cost of training or fine-tuning.</p><p>A simple analogy: imagine a meeting where an expert needs to answer questions.<br>Without RAG, they depend only on their memory.<br>With RAG, they can review documents, spreadsheets, and systems during the meeting to provide more precise, evidence-based answers.</p><p>Why is RAG needed? 
As mentioned earlier, “pure” LLMs suffer from critical limitations:</p><ul><li><strong>Hallucinations</strong>: They make things up when they don’t know the answer</li><li><strong>Outdated data</strong>: Training is frozen in time (knowledge cutoff)</li><li><strong>Lack of specificity</strong>: They don’t know your internal corporate data</li><li><strong>No sources</strong>: It’s hard to trace where the information came from</li></ul><p>RAG makes LLMs much more trustworthy by connecting them to real, verifiable data.</p><p>How does it work in practice? The RAG flow happens in well-defined steps:</p><ol><li><strong>User question</strong><br>The user asks something like: “What is the company’s expense reimbursement policy?”</li><li><strong>Retrieval<br></strong>The system: <br>- Detects that it needs internal information<br>- Searches corporate documents, wikis, and PDFs<br>- Uses semantic search (not just keyword search)</li><li><strong>Vectorization (Embeddings)</strong><br>The documents and the question are converted into mathematical representations (vectors).<br>This allows semantic similarity comparison between the question and the documents.<br>Example: “reimbursement” and “refund” are treated as similar.</li><li><strong>Context enrichment</strong><br>The most relevant snippets are selected and injected into the prompt sent to the LLM.</li><li><strong>Answer generation</strong><br>The LLM analyzes the question <strong>plus</strong> the retrieved context and generates an answer grounded in real data.</li></ol><h3>Benefits for companies</h3><p>With RAG, your organization can:</p><p>✅ <strong>Reduce operational costs</strong>: Fewer tokens consumed, no model retraining needed<br>✅ <strong>Keep information up to date</strong>: Update documents, not the model<br>✅ <strong>Have reliable, traceable answers</strong>: Each answer can reference the source<br>✅ <strong>Maintain complete control</strong>: Corporate data stays in your own infrastructure<br>✅ 
<strong>Integrate with legacy systems</strong>: Connect to existing databases and APIs</p><h3>When not to use RAG</h3><p>RAG is powerful, but it’s not a silver bullet. There are scenarios where other approaches are safer, cheaper, or simply more effective.</p><p><strong>For exact, auditable answers</strong><br>RAG is not a good fit when a wrong answer has serious consequences — for example, pricing rules, tax calculations, medical dosing, or regulatory reports. In these cases, the LLM should <em>not</em> be the source of truth.<br>Instead, call a deterministic API, rules engine, or database, and use the model only to explain the result in natural language if needed.</p><p><strong>When latency is critical</strong><br>RAG adds extra steps (retrieval + generation), which naturally increases latency. If your application needs near-instant responses (e.g., some trading, fraud scoring, or real-time user flows), a simpler model, aggressive caching, or a cache-augmented generation (CAG) approach can be more appropriate.</p><p><strong>When the data is highly structured</strong><br>If your data is already well-structured in a database and the question is essentially a query or aggregation (“total sales by region in the last 30 days”), a Text-to-SQL agent or direct SQL access is usually more precise and reliable than RAG over exported text.</p><p><strong>When you already have a strong system of record/system of truth</strong><br>If the answer already lives in a reliable system of record — a database, rules engine, ERP, or any other transactional system — you should read from that source directly instead of asking the model to guess. RAG makes much more sense for discovery, summarization, and knowledge navigation over unstructured content. 
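</p><p>A minimal sketch of that “deterministic source first” pattern (the <code>PolicyAnswerSketch</code> class and its hard-coded map are hypothetical stand-ins for a real system of record, not part of the article’s project):</p>

```java
import java.util.Map;

public class PolicyAnswerSketch {

    // Hypothetical stand-in for a real system of record (ERP, rules
    // engine, database). The exact, auditable value lives here.
    static String lookupPolicy(String key) {
        var policies = Map.of(
                "expense-reimbursement",
                "Receipts required; reimbursement within 30 days.");
        return policies.get(key);
    }

    static String answer(String key) {
        var value = lookupPolicy(key);
        if (value == null) {
            return "No policy found for: " + key;
        }
        // An LLM could rephrase 'value' conversationally here,
        // but the figure itself stays deterministic and traceable.
        return "Policy [" + key + "]: " + value;
    }

    public static void main(String[] args) {
        System.out.println(answer("expense-reimbursement"));
    }
}
```

<p>The authoritative value never comes from the model; at most, an LLM rephrases it, while RAG helps users discover and navigate the unstructured knowledge around such records.</p><p>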
If the underlying data is highly volatile or noisy (logs, transient events, temporary documents), keeping RAG indexes fresh can become expensive and operationally complex; in such cases, it’s usually better to first stabilize/structure the source data and apply RAG only to what actually represents long-lived knowledge.</p><h3>Why Java for Enterprise AI?</h3><p>Even though Python dominates generative AI tutorials and examples, the reality in enterprises is very different. According to TechRepublic, 90% of Fortune 500 companies use Java in their mission-critical systems.</p><p>And here’s the key point: AI needs to live where the data is. In corporate environments, that usually means:</p><ul><li>Integration with legacy systems (ERP, CRM, corporate databases)</li><li>Compatibility with existing Java infrastructure (Spring, Jakarta EE, microservices)</li><li>Strict requirements around security, performance, and maintainability</li><li>Development teams already specialized in Java</li></ul><p>Rewriting the entire infrastructure in Python to add AI is neither viable nor desirable.</p><p>LangChain4j bridges this gap by bringing RAG and LLM capabilities into the Java ecosystem, allowing you to integrate AI directly into your existing applications without disruptive architectural changes.</p><h3>Getting to Know LangChain4j</h3><p>LangChain4j is a Java library that dramatically simplifies integrating LLMs and RAG techniques into enterprise applications. 
It’s essentially the Java equivalent of the popular Python LangChain, but designed specifically for the Java ecosystem and conventions.</p><h3>Main features</h3><p>✅ <strong>Native Spring Boot integration</strong>: Configuration via properties and auto-configuration<br>✅ <strong>Multiple LLM providers</strong>: OpenAI, Gemini, Claude, local models via Ollama<br>✅ <strong>Enterprise-grade vector stores</strong>: Chroma, Pinecone, Elasticsearch, or custom implementations<br>✅ <strong>Full observability</strong>: Integration with OpenTelemetry and Micrometer<br>✅ <strong>Type safety</strong>: Strongly typed APIs, no “magic strings”</p><p>The library abstracts away the complexity of embeddings, vector stores, and LLMs so that you can focus on business logic rather than low-level details.</p><h3>Implementing RAG in Practice</h3><p>To demonstrate how to implement RAG with LangChain4j, I created a project that indexes the Java 25 documentation and supports natural language queries. The complete source code is available on <a href="https://github.com/devops-thiago/my-java-genie">GitHub</a> and can be used as a starting point for your own implementation. Let’s walk through the main components.</p><h3>1. 
Project Structure</h3><pre>src/main/java/com/javarag/<br>├── Application.java              # Spring Boot application entry point<br>├── config/                       # System configuration<br>│   ├── ModelConfig.java          # LLM and embeddings configuration<br>│   ├── VectorDbConfig.java       # Vector store configuration<br>│   ├── IngestionConfig.java      # Document ingestion configuration<br>│   └── QueryConfig.java          # Query configuration<br>├── service/                      # Core RAG services<br>│   ├── IngestionService.java     # Orchestrates document ingestion<br>│   ├── QueryService.java         # Orchestrates query processing<br>│   ├── RetrievalEngine.java      # Semantic search engine<br>│   ├── DocumentProcessor.java    # Interface for document chunking<br>│   └── RecursiveCharacterSplitter.java # Smart chunking implementation<br>├── repository/                   # Vector store access layer<br>│   └── VectorRepository.java     # Interface for vector operations<br>├── model/                        # Data models<br>│   ├── QueryResponse.java        # Query response<br>│   ├── DocumentChunk.java        # Document chunk<br>│   └── TokenUsageMetrics.java    # Token usage metrics<br>└── controller/                   # REST controllers<br>    └── ChatController.java       # Chat interaction API</pre><h3>2. 
System Configuration</h3><p>The first step is configuring the core components via application.properties:</p><pre># Language model configuration<br>rag.model.provider=openai<br>rag.model.name=gpt-4o-mini<br>rag.model.temperature=0.2<br>rag.model.max-tokens=2048<br><br># Embeddings configuration<br>rag.model.embedding.provider=openai<br>rag.model.embedding.model=text-embedding-3-small<br><br># Vector store configuration (Chroma)<br>rag.vector.provider=chroma<br>rag.vector.host=localhost<br>rag.vector.port=8000<br>rag.vector.collection=java-docs<br><br># Ingestion configuration<br>rag.ingestion.chunk-size=1000<br>rag.ingestion.chunk-overlap=100<br>rag.ingestion.batch-size=10<br><br># Query configuration<br>rag.query.max-retrieved-chunks=5<br>rag.query.similarity-threshold=0.75<br>rag.query.timeout-seconds=30</pre><p>The application follows standard Spring Boot configuration using @ConfigurationProperties classes:</p><pre>@ConfigurationProperties(prefix = &quot;rag.model&quot;)<br>public class ModelConfig {<br>    private String provider = &quot;openai&quot;;<br>    private String name = &quot;gpt-4o-mini&quot;;<br>    private Double temperature = 0.2;<br>    private Integer maxTokens = 2048;<br><br>    // getters and setters...<br>}</pre><h3>3. Document Ingestion</h3><p>Ingestion is the process of preparing documents for semantic search. Here’s the main service:</p><pre>@Service<br>public class IngestionService {<br><br>    private final DocumentLoader documentLoader;<br>    private final DocumentProcessor documentProcessor;<br>    private final EmbeddingModelProvider embeddingModel;<br>    private final VectorRepository vectorRepository;<br><br>    public IngestionResult ingestDocuments(Path documentPath) {<br>        logger.info(&quot;Starting document ingestion from path: {}&quot;, documentPath);<br><br>        // 1. 
Load documents from directory<br>        List&lt;Document&gt; documents = documentLoader.loadDocuments(documentPath);<br><br>        IngestionResult result = new IngestionResult();<br><br>        for (Document document : documents) {<br>            // 2. Process document (chunking)<br>            List&lt;DocumentChunk&gt; chunks = documentProcessor.processDocument(document);<br><br>            // 3. Process chunks in batches<br>            int batchSize = config.getBatchSize();<br>            for (int i = 0; i &lt; chunks.size(); i += batchSize) {<br>                List&lt;DocumentChunk&gt; batch = chunks.subList(<br>                    i,<br>                    Math.min(i + batchSize, chunks.size())<br>                );<br><br>                // 4. Generate embeddings for the batch<br>                List&lt;String&gt; texts = batch.stream()<br>                    .map(DocumentChunk::getContent)<br>                    .collect(Collectors.toList());<br><br>                List&lt;float[]&gt; embeddings = embeddingModel.embedBatch(texts);<br><br>                // 5. Store in vector store<br>                vectorRepository.storeBatch(batch, embeddings);<br>            }<br><br>            result.incrementDocumentsProcessed();<br>        }<br><br>        return result;<br>    }<br>}</pre><h4>Smart Chunking</h4><p>Chunking is crucial for RAG success. The project implements a recursive splitter that tries to break the text at natural boundaries:</p><pre>@Service<br>public class RecursiveCharacterSplitter implements DocumentProcessor {<br><br>    // Separators in order of preference<br>    private static final String[] SEPARATORS = {<br>        &quot;\n\n\n&quot;,  // Multiple paragraph breaks<br>        &quot;\n\n&quot;,    // Paragraph break<br>        &quot;\n&quot;,      // Line break<br>        &quot;. &quot;,      // End of sentence<br>        &quot;! &quot;,      // Exclamation<br>        &quot;? 
&quot;,      // Question<br>        &quot;; &quot;,      // Semicolon<br>        &quot;, &quot;,      // Comma<br>        &quot; &quot;,       // Space<br>        &quot;&quot;         // Character-level fallback<br>    };<br><br>    public List&lt;DocumentChunk&gt; chunkText(String text, DocumentMetadata metadata) {<br>        List&lt;String&gt; textChunks = splitText(<br>            text,<br>            config.getChunkSize(),<br>            config.getChunkOverlap()<br>        );<br><br>        List&lt;DocumentChunk&gt; chunks = new ArrayList&lt;&gt;();<br>        for (int i = 0; i &lt; textChunks.size(); i++) {<br>            // Create chunk-specific metadata<br>            DocumentMetadata chunkMetadata = new DocumentMetadata(<br>                metadata.getSourceFile(),<br>                metadata.getSection(),<br>                i  // chunk index<br>            );<br><br>            // Estimate token count (4 chars ≈ 1 token)<br>            int tokenCount = estimateTokenCount(textChunks.get(i));<br><br>            chunks.add(new DocumentChunk(textChunks.get(i), chunkMetadata, tokenCount));<br>        }<br><br>        return chunks;<br>    }<br>}</pre><h3>4. Retrieval Engine</h3><p>The RetrievalEngine is responsible for finding relevant document chunks for a given query:</p><pre>@Service<br>public class RetrievalEngine {<br><br>    public List&lt;DocumentChunk&gt; retrieveRelevantChunks(String query) {<br>        logger.debug(&quot;Retrieving relevant chunks for query: {}&quot;, query);<br><br>        // 1. Generate query embedding<br>        float[] queryEmbedding = embeddingModel.embed(query);<br>        logger.debug(&quot;Generated query embedding with {} dimensions&quot;,<br>                     queryEmbedding.length);<br><br>        // 2. 
Similarity search<br>        int topK = queryConfig.getMaxRetrievedChunks();<br>        double threshold = queryConfig.getSimilarityThreshold();<br><br>        List&lt;ScoredDocument&gt; scoredDocuments = vectorRepository<br>            .similaritySearch(queryEmbedding, topK, threshold);<br><br>        // 3. Filter and convert results<br>        List&lt;DocumentChunk&gt; relevantChunks = scoredDocuments.stream()<br>            .filter(doc -&gt; doc.getSimilarityScore() &gt;= threshold)<br>            .limit(queryConfig.getMaxRetrievedChunks())<br>            .map(ScoredDocument::getChunk)<br>            .collect(Collectors.toList());<br><br>        logger.info(&quot;Retrieved {} relevant chunks for query&quot;, relevantChunks.size());<br>        return relevantChunks;<br>    }<br>}</pre><h3>5. Query Processing</h3><p>QueryService orchestrates the full RAG flow:</p><pre>@Service<br>public class QueryService {<br><br>    public QueryResponse processQuery(String question) {<br>        logger.info(&quot;Processing query: {}&quot;, truncateForLog(question));<br>        long startTime = System.currentTimeMillis();<br><br>        try {<br>            // Step 1: Retrieve relevant chunks<br>            List&lt;DocumentChunk&gt; relevantChunks =<br>                retrievalEngine.retrieveRelevantChunks(question);<br><br>            if (relevantChunks.isEmpty()) {<br>                return createNoResultsResponse(question, startTime);<br>            }<br><br>            // Step 2: Build prompt with context<br>            String prompt = promptBuilder.buildPrompt(question, relevantChunks);<br><br>            // Step 3: Generate answer with LLM<br>            GenerationResponse generationResponse = generateWithTimeout(prompt);<br>            String answer = generationResponse.getText();<br><br>            // Step 4: Extract source references<br>            List&lt;SourceReference&gt; sources = extractSourceReferences(relevantChunks);<br><br>            // Step 5: Record token 
usage<br>            TokenUsageMetrics tokenMetrics = new TokenUsageMetrics(<br>                generationResponse.getPromptTokens(),<br>                generationResponse.getCompletionTokens(),<br>                generationResponse.getTotalTokens()<br>            );<br>            tokenTracker.recordTokenUsage(question, tokenMetrics);<br><br>            long responseTime = System.currentTimeMillis() - startTime;<br>            return new QueryResponse(answer, sources, tokenMetrics, responseTime);<br><br>        } catch (ModelTimeoutException | ModelInvocationException e) {<br>            // Specific handling for model errors<br>            logger.error(&quot;Model error processing query: {}&quot;, e.getMessage());<br>            throw e;<br>        } catch (Exception e) {<br>            logger.error(&quot;Unexpected error processing query&quot;, e);<br>            throw new RagSystemException(&quot;Failed to process query&quot;, e);<br>        }<br>    }<br>}</pre><h3>6. REST API for Integration</h3><p>The system exposes a clean REST API for integration:</p><pre>@RestController<br>@RequestMapping(&quot;/api/chat&quot;)<br>public class ChatController {<br><br>    @PostMapping(&quot;/query&quot;)<br>    public ResponseEntity&lt;ChatResponse&gt; query(@Valid @RequestBody ChatRequest request) {<br>        logger.info(&quot;Received chat query for session: {}&quot;, request.getSessionId());<br><br>        QueryResponse queryResponse = chatService.processMessage(<br>            request.getSessionId(),<br>            request.getMessage(),<br>            request.getWebSocketSessionId()<br>        );<br><br>        ChatResponse response = ChatResponse.fromQueryResponse(queryResponse);<br>        return ResponseEntity.ok(response);<br>    }<br><br>    @GetMapping(&quot;/history&quot;)<br>    public ResponseEntity&lt;List&lt;ChatMessage&gt;&gt; getHistory(@RequestParam String sessionId) {<br>        List&lt;ChatMessage&gt; history = chatService.getHistory(sessionId);<br>     
   return ResponseEntity.ok(history);<br>    }<br>}</pre><h3>Important Implementation Details</h3><h3>Observability and Monitoring</h3><p>The system includes full observability with OpenTelemetry to trace each step of the RAG process:</p><pre>// Automatic span instrumentation<br>Span span = tracer.spanBuilder(&quot;process-query&quot;).startSpan();<br>try (Scope scope = span.makeCurrent()) {<br>    span.setAttribute(&quot;query.text&quot;, truncateForLog(question));<br>    span.setAttribute(&quot;query.chunks_retrieved&quot;, relevantChunks.size());<br>    span.setAttribute(&quot;llm.tokens.total&quot;, generationResponse.getTotalTokens());<br><br>    // Processing logic...<br><br>    span.setStatus(StatusCode.OK);<br>} finally {<br>    // Always end the span, even on failure<br>    span.end();<br>}</pre><h3>Cost Control</h3><p>The project tracks token usage in real time, which is essential to control LLM costs:</p><pre>public class TokenUsageTracker {<br><br>    public void recordTokenUsage(String query, TokenUsageMetrics metrics) {<br>        logger.info(&quot;Query tokens - Prompt: {}, Completion: {}, Total: {}&quot;,<br>                   metrics.getPromptTokens(),<br>                   metrics.getCompletionTokens(),<br>                   metrics.getTotalTokens());<br><br>        // Record metrics for monitoring<br>        metricsService.recordTokenUsage(metrics);<br>    }<br>}</pre><h3>Robust Error Handling</h3><p>The system differentiates between error types and applies specific strategies:</p><pre>// Configurable timeout to avoid endless calls<br>private GenerationResponse generateWithTimeout(String prompt) {<br>    int timeoutSeconds = queryConfig.getTimeoutSeconds();<br><br>    CompletableFuture&lt;GenerationResponse&gt; future = CompletableFuture<br>        .supplyAsync(() -&gt; languageModel.generate(prompt));<br><br>    try {<br>        return future.get(timeoutSeconds, TimeUnit.SECONDS);<br>    } catch (TimeoutException e) {<br>        future.cancel(true);<br>        throw new ModelTimeoutException(&quot;Generation timed 
out&quot;, e);<br>    }<br>}</pre><h3>Testing the System</h3><p>To demonstrate RAG in action, let’s run a few queries against the Java 25 docs.</p><h3>Example 1: Query about Virtual Threads</h3><pre>curl -X POST http://localhost:8080/api/chat/query \<br>  -H &quot;Content-Type: application/json&quot; \<br>  -d &#39;{<br>    &quot;sessionId&quot;: &quot;test-session&quot;,<br>    &quot;message&quot;: &quot;How do I use Virtual Threads in Java 25?&quot;<br>  }&#39;</pre><p><strong>Response:</strong></p><pre>{<br>  &quot;answer&quot;: &quot;In Java 25, you can use Virtual Threads via Thread.ofVirtual() or Executors.newVirtualThreadPerTaskExecutor(). Virtual Threads are lightweight threads that let you write thread-per-task style concurrent code without the usual resource costs. Basic example:\n\nThread.ofVirtual().start(() -&gt; {\n    // your logic here\n});\n\nFor many tasks, use:\nvar executor = Executors.newVirtualThreadPerTaskExecutor();&quot;,<br>  &quot;sources&quot;: [<br>    {<br>      &quot;filename&quot;: &quot;virtual-threads.md&quot;,<br>      &quot;section&quot;: &quot;Basic Usage&quot;,<br>      &quot;chunkIndex&quot;: 2<br>    }<br>  ],<br>  &quot;tokenUsage&quot;: {<br>    &quot;promptTokens&quot;: 1250,<br>    &quot;completionTokens&quot;: 85,<br>    &quot;totalTokens&quot;: 1335<br>  },<br>  &quot;responseTimeMs&quot;: 1847<br>}</pre><h3>Example 2: Query about Pattern Matching</h3><pre>curl -X POST http://localhost:8080/api/chat/query \<br>  -H &quot;Content-Type: application/json&quot; \<br>  -d &#39;{<br>    &quot;sessionId&quot;: &quot;test-session&quot;,<br>    &quot;message&quot;: &quot;What’s new in pattern matching in Java?&quot;<br>  }&#39;</pre><p>The system returns specific information extracted from the documentation, with exact references to the sources.</p><h3>Lessons Learned and Best Practices</h3><h3>1. 
Choosing Chunk Size</h3><p>After some experimentation, I found that chunks of 1,000 characters with 100 characters of overlap gave the best balance between:</p><ul><li>Enough context for the LLM to understand the content</li><li>Good granularity for semantic search</li><li>Token cost control</li></ul><h3>2. Similarity Threshold</h3><p>A similarity threshold of 0.75 filtered out irrelevant results without being too strict.<br>Very high values (above 0.9) dropped relevant answers; very low values (below 0.6) introduced too much noise.</p><h3>3. Prompt Engineering</h3><p>The prompt needs careful structure. The final version included:</p><ul><li>An explicit instruction about the system’s role</li><li>Context extracted from documents</li><li>The user’s question</li><li>Instructions about answer format and source citation</li></ul><h3>4. Memory Management</h3><p>To avoid OutOfMemoryError with large document volumes:</p><ul><li>Use configurable batch processing</li><li>Explicitly release large objects</li><li>Monitor heap usage</li></ul><h3>5. 
Fallbacks and Graceful Degradation</h3><p>The system always tries to provide something useful:</p><ul><li>If no relevant documents are found, it explains this and suggests rephrasing</li><li>If the LLM fails, it returns an informative error</li><li>Configurable timeouts prevent endless requests</li></ul><h3>Production Considerations</h3><h3>Security</h3><p>For production, you should implement:</p><pre>@Configuration<br>public class SecurityConfig {<br><br>    // Authentication and authorization<br>    @Bean<br>    public SecurityFilterChain filterChain(HttpSecurity http) throws Exception {<br>        return http<br>            .oauth2ResourceServer(oauth2 -&gt; oauth2.jwt(withDefaults()))<br>            .authorizeHttpRequests(authz -&gt; authz<br>                .requestMatchers(&quot;/api/chat/**&quot;).hasRole(&quot;USER&quot;)<br>                .anyRequest().authenticated())<br>            .build();<br>    }<br><br>    // Rate limiting<br>    @Bean<br>    public RateLimiter rateLimiter() {<br>        return RateLimiter.create(10.0); // 10 requests/second<br>    }<br>}</pre><h3>Performance and Scalability</h3><pre>@Configuration<br>@EnableAsync<br>public class AsyncConfig {<br><br>    // Thread pool for async processing<br>    @Bean<br>    public TaskExecutor ragTaskExecutor() {<br>        ThreadPoolTaskExecutor executor = new ThreadPoolTaskExecutor();<br>        executor.setCorePoolSize(5);<br>        executor.setMaxPoolSize(20);<br>        executor.setQueueCapacity(100);<br>        executor.setThreadNamePrefix(&quot;rag-&quot;);<br>        return executor;<br>    }<br><br>    // Cache for recent embeddings<br>    @Bean<br>    public CacheManager cacheManager() {<br>        return new ConcurrentMapCacheManager(&quot;embeddings&quot;, &quot;queries&quot;);<br>    }<br>}</pre><h3>Monitoring</h3><p>Key metrics to track:</p><ul><li>Response time per query</li><li>Token usage over time</li><li>Success/error rate</li><li>Vector store latency</li><li>Memory usage during 
ingestion</li></ul><h3>Alternatives and Considerations</h3><h3>Vector Stores</h3><ul><li><strong>Chroma</strong> (used in the project): Great for prototyping and development</li><li><strong>Pinecone</strong>: Managed service, excellent performance, but the cost can be high</li><li><strong>Elasticsearch</strong>: If your company already uses ES, you can reuse the infrastructure</li><li><strong>PostgreSQL with pgvector</strong>: Natural fit if you already have relational data</li></ul><h3>Embedding Models</h3><ul><li><strong>OpenAI text-embedding-3-small</strong>: Good cost/benefit, 1536 dimensions</li><li><strong>OpenAI text-embedding-3-large</strong>: Higher quality, 3072 dimensions, more expensive</li><li><strong>Local models</strong>: sentence-transformers via Ollama for complete control</li></ul><h3>Chunking Strategies</h3><ul><li><strong>Recursive</strong> (implemented): Good for general text</li><li><strong>Semantic</strong>: Uses embeddings to choose smarter boundaries</li><li><strong>Type-specific</strong>: PDFs, code, and Markdown may need different strategies</li></ul><h3>Next Steps</h3><p>To evolve the system:</p><ol><li><strong>Reranking</strong>: Add a reranker model (like BGE) to improve retrieval quality</li><li><strong>Hybrid search</strong>: Combine semantic search with lexical search (BM25)</li><li><strong>Agentic RAG</strong>: Allow the system to decide when to fetch more information</li><li><strong>Multimodal</strong>: Add support for images, diagrams, and tables</li><li><strong>Feedback loop</strong>: Collect user feedback to improve continuously</li></ol><h3>Conclusion</h3><p>RAG with Java and LangChain4j offers a robust path for companies that want to leverage generative AI while keeping complete control over corporate data. 
The implementation I showed here delivers:</p><p>✅ <strong>Security</strong>: Data stays in your infrastructure<br>✅ <strong>Reliability</strong>: Answers grounded in verifiable documents<br>✅ <strong>Observability</strong>: Full tracing of the entire process<br>✅ <strong>Scalability</strong>: Spring Boot-based architecture<br>✅ <strong>Cost control</strong>: Detailed token usage monitoring</p><p>The combination of <strong>Java, LangChain4j, and RAG </strong>unlocks huge possibilities: from internal assistants that know your company policies to systems that can answer technical questions about your codebase.</p><p>The key is to start simple, measure everything, and evolve iteratively based on honest user feedback. With RAG, you’re not just “adding AI” — you’re building an intelligent system that grows together with your organization.</p><p>Source code: <a href="https://github.com/devops-thiago/my-java-genie">https://github.com/devops-thiago/my-java-genie</a></p><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=b4b6d9691592" width="1" height="1" alt=""><hr><p><a href="https://arquivolivre.com.br/rag-in-real-world-how-to-use-java-and-langchain4j-with-corporate-data-b4b6d9691592">RAG in Real World: How to Use Java and LangChain4j with Corporate Data</a> was originally published in <a href="https://arquivolivre.com.br">Arquivo Livre</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Java 25 in Action: Real Features Solving Real Developer Pain]]></title>
            <link>https://arquivolivre.com.br/java-25-in-action-real-features-solving-real-developer-pain-5afe293bf390?source=rss-437038ced80d------2</link>
            <guid isPermaLink="false">https://medium.com/p/5afe293bf390</guid>
            <category><![CDATA[software-development]]></category>
            <category><![CDATA[java25]]></category>
            <category><![CDATA[software-architecture]]></category>
            <category><![CDATA[software-engineering]]></category>
            <category><![CDATA[java]]></category>
            <dc:creator><![CDATA[Thiago Gonzaga]]></dc:creator>
            <pubDate>Tue, 09 Sep 2025 01:17:48 GMT</pubDate>
            <atom:updated>2025-09-16T22:53:11.381Z</atom:updated>
<content:encoded><![CDATA[<figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*Af--AgwAqlRX9xZ6DchskA.png" /></figure><p>Java 25 LTS is here. In this post, let&#39;s take a deep dive into what matters.</p><p>The most important thing to note: <strong>this is a Long-Term Support (LTS) version</strong>. That means stability and updates until at least 2033. A safe bet if you’re thinking about migration.</p><p>Of course, migration is never just a click away — especially if you’re still on Java 8 or 11. But here’s the deal: when new versions bring <strong>real performance gains</strong> and <strong>simpler code</strong>, the move starts to pay for itself.</p><p>Think about cloud costs. Memory consumption is money. Every GC improvement, every runtime optimization, every feature that reduces boilerplate… it all adds up. If your app runs well and you don’t feel the pain, staying put is fine. But if you’re serious about cutting expenses or want your team to be more productive, Java 25 is worth a look.</p><p>👉 In this post, I’ll go through the features that matter most for developers. The goal isn’t just to show <em>what changed</em>, but to explain <strong>how these changes solve your daily pains</strong>.</p><h3>JEP-519 Compact Object Headers: Save Memory, Save Money</h3><p>Every object in Java has a <strong>header</strong> stored in the heap. This metadata is important for things like synchronization and garbage collection. The problem? It adds overhead.</p><p>Depending on the object, the header can take up to <strong>16 bytes</strong>. 
For small objects (say around 64 bytes), that’s about <strong>25% of the memory</strong> wasted just on the header.</p><p><strong>Java 25 changes that.</strong> With <em>Compact Object Headers</em>, the JVM can reduce headers to <strong>8 bytes</strong>, which means:</p><ul><li>Up to <strong>12% memory saved</strong> per object</li><li>Up to <strong>22% less heap usage</strong> overall</li><li>Fewer GC cycles → less CPU usage</li></ul><p>That’s not just theory — on large apps this can translate into <strong>real cloud cost savings</strong>.</p><p>🔧 To enable it, you need to start your JVM with:</p><pre>$ java -XX:+UseCompactObjectHeaders -jar yourapp.jar</pre><p>⚡ Bonus: frameworks like <strong>Spring</strong> and <strong>Quarkus</strong>, which rely heavily on object creation, can become much more resource-efficient with this improvement. Perfect for microservices running at scale.</p><p><em>💡 Note for beginners: Think of it like packing your suitcase tighter. Same stuff, less space. You don’t need to understand the JVM internals to benefit — just know that enabling this flag means your apps will use less memory and CPU.</em></p><h3>JEP-514/JEP-515 Ahead-of-Time Command-Line Ergonomics &amp; Method Profiling</h3><p>Startup time matters. Whether you’re deploying microservices, scaling pods in Kubernetes, or just restarting apps during development, those seconds (or minutes) add up.</p><p>Back in <strong>Java 24</strong>, we got <strong>Ahead-of-Time (AOT) Class Loading and Linking</strong>. It worked by analyzing your most-used classes, preloading them, and reducing startup time by up to <strong>42%</strong>. 
Pretty amazing… but there was a catch:</p><ul><li>You had to run your app twice → first in <em>record</em> mode, then again in <em>create</em> mode to build the cache</li><li>Great for advanced use cases (like generating caches across platforms)</li><li>But for most developers, it felt like <strong>overkill</strong></li></ul><p><strong>Java 25 fixes that.</strong> Thanks to <strong>command-line ergonomics</strong>, you can now simplify the process into <strong>one step</strong>:</p><pre># train and generate cache in a single command<br>$ java -XX:AOTCacheOutput=appcache.aot -jar your_app.jar<br><br># run your app using the created cache<br>$ java -XX:AOTCache=appcache.aot -jar your_app.jar</pre><p>No more two-step dance. One run, one cache.</p><p>But there’s more. <strong>Java 25 also introduces Ahead-of-Time Method Profiling.</strong> This means the JVM doesn’t just look at classes — it also tracks your <strong>most-used methods</strong> and puts them into the cache.</p><p>🚀 The result:</p><ul><li>Faster warmup times</li><li>Apps hit <strong>peak performance quicker</strong></li><li>Ideal for services that need to <strong>scale fast, recover fast, or deploy fast</strong></li></ul><h3>Reducing Boilerplate: Write Less, Do More</h3><p>One of the common complaints about Java has always been <em>boilerplate</em>. Too much ceremony just to run something simple. <strong>Java 25 takes a big step forward</strong> to make the language leaner and more fluent.</p><h4>1. JEP-512 Compact Source Files</h4><p>For quick scripts, demos, or small apps, you no longer need to wrap everything inside a class. You can just write a main method, define global variables, or add helper methods without static.</p><pre>final String HELLO_WORLD_FORMAT = &quot;Hello, %s&quot;;<br>String name;<br><br>void main(){<br>  name = &quot;Thiago&quot;;<br>  IO.println(output(name));<br>}<br><br>String output(String name) {<br>  return String.format(HELLO_WORLD_FORMAT, name);<br>}</pre><h4>2. 
JEP-513 Flexible Constructors</h4><p>Before Java 25, super() or this() had to be the <strong>first line</strong> of any constructor. That made input validation awkward. Now you can run statements, such as argument validation, before the call to super() or this().</p><pre>class User {<br>  private final String name;<br><br>  User(String name) {<br>    if (name == null || name.isBlank()) {<br>      throw new IllegalArgumentException(&quot;Name cannot be empty&quot;);<br>    }<br>    super(); // now allowed after validation<br>    this.name = name;<br>  }<br>}</pre><h4>3. JEP-511 Simpler Module Imports</h4><p><a href="https://openjdk.org/jeps/261">Modules</a> were introduced back in Java 9 to better organize dependencies. But the import syntax was verbose, often requiring multiple imports or wildcards.</p><p><strong>Java 25 streamlines this</strong> by allowing you to import the module itself, while still respecting requires, exports, and transitive rules.</p><pre>import module java.sql;<br><br>void main() {<br>  Connection conn = DriverManager.getConnection(&quot;url&quot;, &quot;user&quot;, &quot;password&quot;);<br>}</pre><p>The practical impact of this boilerplate reduction:</p><ul><li>For <strong>newbies and instructors</strong>, Java is now easier to teach and learn.</li><li>For <strong>senior devs</strong>, writing scripts, microservices, or demos becomes faster and less noisy.</li><li>Overall, Java is catching up with the simplicity you’d expect in modern languages — without losing its power.</li></ul><h3><strong>JEP-506 </strong>Scoped Values: A Safer Alternative to ThreadLocal</h3><p>Since the early 2000s, ThreadLocal has been the go-to tool for passing context like user IDs, tokens, or request metadata. 
But let’s be honest—it has also caused more than a few headaches:</p><ul><li>Hard-to-reason-about data flow</li><li>Risk of memory leaks if you forget remove()</li><li>Mutable values that can be replaced accidentally</li></ul><p><strong>Java 25 changes the game.</strong> With <strong>Scoped Values</strong>, you now have a safer and clearer way to share context.</p><pre>final ScopedValue&lt;String&gt; USER = ScopedValue.newInstance();<br><br>void main() {<br>  IO.println(&quot;Before: &quot; + user());<br>  ScopedValue.where(USER, &quot;Admin&quot;).run(() -&gt; {<br>    IO.println(&quot; User: &quot; + user());<br>    ScopedValue.where(USER, &quot;Guest&quot;).run(() -&gt; {<br>      IO.println(&quot; Inner scope user: &quot; + user());<br>    });<br>    IO.println(&quot; User: &quot; + user());<br>  });<br>  IO.println(&quot;After: &quot; + user());<br>}<br><br>String user() { return USER.isBound() ? USER.get() : &quot;unbound&quot;; }</pre><p>🔑 Scoped Values fix the ThreadLocal pain points:</p><ul><li><strong>Immutable once bound</strong> — no accidental mutation</li><li><strong>Bounded lifetime</strong> — values disappear automatically when the scope ends</li><li><strong>Clear scope structure</strong> — values are passed explicitly</li></ul><p>Even better: <strong>child virtual threads</strong> can inherit Scoped Values, but only when used with <strong>Structured Concurrency</strong> (JEP-505, still in preview). 
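To make the inheritance rule concrete, here is a minimal sketch. It assumes JDK 25, where ScopedValue, Thread.ofVirtual, and isBound are the real platform APIs; the REQUEST_ID name and the printed labels are illustrative, not from the article:

```java
// Sketch: a scoped value is visible inside its binding scope, but a thread
// started by hand (not forked through a StructuredTaskScope) does not inherit it.
public class ScopedValueInheritance {
    static final ScopedValue<String> REQUEST_ID = ScopedValue.newInstance();

    public static void main(String[] args) {
        ScopedValue.where(REQUEST_ID, "req-42").run(() -> {
            // Inside the binding scope: the value is visible.
            System.out.println("caller sees: " + REQUEST_ID.get());

            // A plain virtual thread gets no binding.
            Thread t = Thread.ofVirtual().unstarted(() ->
                    System.out.println("plain thread sees binding? " + REQUEST_ID.isBound()));
            t.start();
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
    }
}
```

A subtask forked inside the binding scope through a StructuredTaskScope would see "req-42" instead, because structured forks capture the caller's bindings.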
Regular threads do not inherit scoped values by default.</p><p>💡 ThreadLocal still has valid use cases (like caching expensive mutable objects), but for <strong>one-way, immutable data flow</strong>, <strong>ScopedValue is the future</strong>.</p><h3>Other Stable Features</h3><ul><li><strong>Generational Shenandoah (JEP-521):</strong> A new generational mode for the Shenandoah GC that improves efficiency by separating young and old objects.</li><li><strong>Key Derivation Function API (JEP-510):</strong> Standard cryptography API for secure key derivation (like turning passwords into encryption keys).</li><li><strong>JFR Cooperative Sampling (JEP-518):</strong> Reduces overhead when profiling applications with Java Flight Recorder by coordinating thread sampling.</li><li><strong>JFR Method Timing &amp; Tracing (JEP-520):</strong> Adds detailed method-level timing and tracing to JFR for deeper performance insights.</li></ul><h3>Preview, Incubator &amp; Experimental Features</h3><ul><li><strong>PEM Encodings of Cryptographic Objects (JEP-470, Preview):</strong> Makes it easier to read/write PEM-encoded keys and certificates.</li><li><strong>Stable Values (JEP-502, Preview):</strong> Immutable values designed for predictable behavior in concurrent programming.</li><li><strong>Structured Concurrency (JEP-505, Fifth Preview):</strong> Simplifies managing multiple tasks running in parallel.</li><li><strong>Primitive Types in Patterns, </strong><strong>instanceof, and </strong><strong>switch (JEP-507, Third Preview):</strong> Lets you use pattern matching directly with primitives.</li><li><strong>Vector API (JEP-508, Tenth Incubator):</strong> Enables high-performance vector computations for data-parallel workloads.</li><li><strong>JFR CPU-Time Profiling (JEP-509, Experimental):</strong> Profiles threads based on CPU time usage, not just elapsed wall time.</li></ul><h3>Wrapping It Up</h3><p>Java 25 is not just another release — it’s a <strong>Long-Term Support 
version</strong> that balances runtime optimizations with language simplicity.</p><ul><li><strong>Memory &amp; Performance:</strong> Compact Object Headers, AOT profiling, and new GC modes reduce heap usage, CPU cycles, and startup times.</li><li><strong>Developer Productivity:</strong> Compact source files, flexible constructors, and simpler module imports cut boilerplate and make Java easier to learn and teach.</li><li><strong>Safer Concurrency:</strong> Scoped Values offer a modern, immutable alternative to ThreadLocal, solving issues devs have faced for decades.</li><li><strong>Observability &amp; Security:</strong> JFR enhancements and new crypto APIs make apps easier to monitor and more secure.</li><li><strong>Cleaner Platform:</strong> By removing support for <strong>32-bit x86</strong>, Java 25 focuses on modern 64-bit architectures (Intel/AMD), simplifying future development.</li></ul><p>Migration may feel like a big leap if you’re still on Java 8 or 11. However, the <strong>performance gains, cost savings, and language improvements</strong> in Java 25 make it worth serious consideration.</p><p>🚀 Whether you care about cloud bills, developer velocity, or simply cleaner code, Java 25 has something to ease your daily pains.</p><p><strong>Reference</strong>: <a href="https://openjdk.org/projects/jdk/25/">https://openjdk.org/projects/jdk/25/</a></p><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=5afe293bf390" width="1" height="1" alt=""><hr><p><a href="https://arquivolivre.com.br/java-25-in-action-real-features-solving-real-developer-pain-5afe293bf390">Java 25 in Action: Real Features Solving Real Developer Pain</a> was originally published in <a href="https://arquivolivre.com.br">Arquivo Livre</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[TreeMap vs HashMap in Java — and When to Use Each (With a Real Interview Story)]]></title>
            <link>https://arquivolivre.com.br/treemap-vs-hashmap-in-java-and-when-to-use-each-with-a-real-interview-story-6d4af6f386e9?source=rss-437038ced80d------2</link>
            <guid isPermaLink="false">https://medium.com/p/6d4af6f386e9</guid>
            <category><![CDATA[techinterviewtips]]></category>
            <category><![CDATA[developer]]></category>
            <category><![CDATA[programming]]></category>
            <category><![CDATA[java]]></category>
            <category><![CDATA[development]]></category>
            <dc:creator><![CDATA[Thiago Gonzaga]]></dc:creator>
            <pubDate>Fri, 18 Apr 2025 21:39:33 GMT</pubDate>
            <atom:updated>2025-04-18T21:39:33.618Z</atom:updated>
<content:encoded><![CDATA[<figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*arfh4j3KTGmgiD8wzUUobA.png" /></figure><h3>TreeMap vs HashMap in Java — and When to Use Each (With a Real Interview Story)</h3><p>I’ve been helping a few developers lately — some of them recently went through interview processes. They nailed some parts, but something still kept them from moving forward.</p><p>Unfortunately, <strong>the hiring process isn’t under our control</strong>. Each company has its own approach, and sometimes even inside the same company, different candidates get different experiences. You can never fully predict what’s coming.</p><p>That’s why I always tell people: <strong>ask the right questions early on</strong>:</p><ul><li>What kind of technical interview should I expect?</li><li>Will it involve live coding?</li><li>What technologies are they focusing on?</li><li>Are there questions I should prepare for?</li></ul><p>Asking these early not only helps you prepare — it also helps you <strong>decide if the company aligns with your expectations</strong>. I’ve wasted time on long, exhausting interview processes that ended with offers I had to refuse. So my advice: protect your time and energy, and ask early.</p><p>But this post isn’t just about interviews — it’s about <strong>storytelling</strong> and how you can answer interview questions even when they catch you off guard.</p><h3>My Story: “I Forgot… And That’s OK”</h3><p>Once, during a technical interview, I got asked a simple question:</p><blockquote><em>“Do you know the difference between a HashMap and a TreeMap?”<br>“When should you use one over the other?”<br>“Any performance differences? Big-O notation?”</em></blockquote><p>I froze for a second.</p><p>It’s not like I didn’t know the answer. I had learned it before. 
But… I hadn’t used it in a while.</p><p>So I answered honestly:</p><blockquote>“It’s been some time since I worked with that — I don’t remember exactly, but I’ll look it up right after this.”</blockquote><p>And I did. Straight to the docs.</p><p>Now I’ve done the work — and I’m sharing it here so <strong>you don’t have to dig around</strong> when it comes up in your interview.</p><h3>🆚 TreeMap vs HashMap in Java</h3><p>They both implement the Map&lt;K,V&gt; interface — but behave very differently.</p><h3>Key Differences:</h3><table><thead><tr><th>Aspect</th><th>HashMap</th><th>TreeMap</th></tr></thead><tbody><tr><td>Ordering</td><td>No guaranteed order</td><td>Keys kept sorted (natural order or a Comparator)</td></tr><tr><td>Underlying structure</td><td>Hash table</td><td>Red-Black tree</td></tr><tr><td>Null keys</td><td>Allows one null key</td><td>Rejects null keys (with natural ordering)</td></tr><tr><td>Lookup cost</td><td>O(1) on average</td><td>O(log n)</td></tr></tbody></table><h3>☝️ When to Use Which?</h3><h3>Use HashMap when:</h3><ul><li>You need speed — performance is critical.</li><li>Order doesn’t matter.</li><li>You’re doing lookups, caching, or counting things.</li></ul><h3>Use TreeMap when:</h3><ul><li>You need sorted keys.</li><li>You want to navigate ranges (e.g., min to max).</li><li>You care about iteration order.</li></ul><h3>🔎 Code Example — Order Matters</h3><pre>Map&lt;String, Integer&gt; hashMap = new HashMap&lt;&gt;();<br>hashMap.put(&quot;Orange&quot;, 1);<br>hashMap.put(&quot;Apple&quot;, 2);<br>hashMap.put(&quot;Banana&quot;, 3);<br>System.out.println(hashMap);<br>// Output might be: {Banana=3, Orange=1, Apple=2}</pre><pre>Map&lt;String, Integer&gt; treeMap = new TreeMap&lt;&gt;();<br>treeMap.put(&quot;Orange&quot;, 1);<br>treeMap.put(&quot;Apple&quot;, 2);<br>treeMap.put(&quot;Banana&quot;, 3);<br>System.out.println(treeMap);<br>// Output: {Apple=2, Banana=3, Orange=1}</pre><h3>⏱ Big-O Comparison</h3><table><thead><tr><th>Operation</th><th>HashMap</th><th>TreeMap</th></tr></thead><tbody><tr><td>get / put / remove / containsKey</td><td>O(1) average</td><td>O(log n)</td></tr><tr><td>Iteration</td><td>O(n), unordered</td><td>O(n), in sorted key order</td></tr></tbody></table><h3>🎯 Related Interview Questions</h3><ol><li><strong>What is 
LinkedHashMap?</strong><br>Like HashMap, but keeps <em>insertion order</em>. Great when you need predictable iteration.</li><li><strong>How does TreeMap sort keys?</strong><br>Internally uses a <strong>Red-Black Tree</strong> (self-balancing BST) to keep entries sorted.</li></ol><h3>✅ Wrap-Up</h3><p>So yeah — sometimes an interviewer will throw a curveball.<br>You might not use TreeMap every day. That’s fine.</p><p>But now that you’ve read this, you won’t just answer technically — you’ll <strong>tell a story</strong>, show your learning mindset, and explain things with confidence.</p><p><strong>Have you ever been asked this question in an interview?</strong><br>How did you answer it?</p><p>Let’s talk in the comments 👇</p><p>📬 Want more content like this — plus real stories about tech interviews, remote jobs, and DevOps?<br><strong>Join my newsletter:</strong><br>👉 <a href="https://sendfox.com/thiago">https://sendfox.com/thiago</a></p><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=6d4af6f386e9" width="1" height="1" alt=""><hr><p><a href="https://arquivolivre.com.br/treemap-vs-hashmap-in-java-and-when-to-use-each-with-a-real-interview-story-6d4af6f386e9">TreeMap vs HashMap in Java — and When to Use Each (With a Real Interview Story)</a> was originally published in <a href="https://arquivolivre.com.br">Arquivo Livre</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[What You Really Need to Know About Kubernetes in Interviews — Part I]]></title>
            <link>https://arquivolivre.com.br/what-you-really-need-to-know-about-kubernetes-in-interviews-part-i-1017dd97aa14?source=rss-437038ced80d------2</link>
            <guid isPermaLink="false">https://medium.com/p/1017dd97aa14</guid>
            <category><![CDATA[backend-development]]></category>
            <category><![CDATA[tech-career-advice]]></category>
            <category><![CDATA[devops]]></category>
            <category><![CDATA[troubleshooting]]></category>
            <category><![CDATA[kubernetes]]></category>
            <dc:creator><![CDATA[Thiago Gonzaga]]></dc:creator>
            <pubDate>Tue, 08 Apr 2025 17:16:06 GMT</pubDate>
            <atom:updated>2025-04-08T17:16:47.856Z</atom:updated>
            <content:encoded><![CDATA[<figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*eiCNpL1QHwA0bItybDmjdw.png" /></figure><h3>🔧 What You <em>Really</em> Need to Know About Kubernetes in Interviews — Part I</h3><p>Recently, a friend of mine reached out — he was struggling with Kubernetes interview questions.</p><p>He’s been through several interviews already, and as it turns out, <strong>Kubernetes knowledge is </strong>the thing that keeps holding him back.</p><p>Kubernetes is showing up more and more in tech interviews — especially for roles in <strong>backend</strong>, <strong>DevOps</strong>, or anything <strong>cloud-related</strong>.</p><p>But what do you really need to know to avoid freezing mid-interview?</p><p>Let’s break down the <strong>essentials</strong> 👇</p><h3>🧪 Part I — Troubleshooting</h3><p>Most interviews want to know:</p><p><strong>Have you ever gotten into real trouble with Kubernetes — and do you know how to dig deep and find the root cause?</strong></p><p>That’s why <strong>troubleshooting is one of the most important skills</strong> for Kubernetes interviews.</p><p>Knowing where to start brings you much closer to your interview goal.</p><p>Let’s go through the key places to look when things go wrong.</p><h3>1️⃣ Start with kubectl get events 📋</h3><p>If something’s not right in the cluster, your first clue is almost always in the <strong>events</strong>.</p><p>Use this:</p><pre>kubectl get events -A</pre><p>This will give you a quick overview of what’s going wrong — like:</p><ul><li>❌ Pods in CrashLoopBackOff</li><li>🚨 Node memory pressure</li><li>📦 Image pull errors</li><li>📈 HPA scale activity</li><li>🔁 Deployments not rolling out</li></ul><p>This is your <strong>first stop</strong> when troubleshooting.</p><h3>2️⃣ Check the Logs 🔍</h3><p>If events don’t tell the full story, it’s time to dig deeper.</p><p>Run this to check logs in the kube-system namespace:</p><pre>kubectl logs -n kube-system &lt;pod-name&gt; -c 
&lt;container-name&gt;</pre><p>Here are the key components to know — and what kind of issues they might reveal:</p><h3>🌐 Ingress Controller</h3><p><strong>What it does</strong>: Manages external access (e.g., NGINX ingress).</p><p><strong>Common issue</strong>: Can’t access your app via browser? Check for route or TLS misconfig.</p><h3>🧠 Kube Controller Manager</h3><p><strong>What it does</strong>: Manages Deployments, ReplicaSets, scaling, etc.</p><p><strong>Common issue</strong>: No pods being created? This might have the answer.</p><h3>🧭 Kube Scheduler</h3><p><strong>What it does</strong>: Decides which node runs a pod.</p><p><strong>Common issue</strong>: Pod stuck in Pending? Maybe no node meets the requirements.</p><h3>🔗 Kube Proxy</h3><p><strong>What it does</strong>: Handles networking between services.</p><p><strong>Common issue</strong>: Can’t reach a ClusterIP service? Could be a proxy config issue.</p><h3>📡 CoreDNS</h3><p><strong>What it does</strong>: Resolves internal DNS for services.</p><p><strong>Common issue</strong>: Services can’t talk to each other by name? Look here.</p><h3>🧱 CNI Plugin (e.g., Calico)</h3><p><strong>What it does</strong>: Manages pod networking and policies.</p><p><strong>Common issue</strong>: Pods can’t reach each other across nodes? 
Might be a denied route or misconfigured policy.</p><h3>🧾Don’t Forget to Check Your App Logs</h3><p>Sometimes, the problem isn’t in the infrastructure — it’s inside your <strong>application</strong>.</p><p>Whether it’s a Spring Boot app crashing on startup or throwing 500s, your logs are usually the first place to look.</p><p>Use this to get the logs for your deployed app:</p><pre>kubectl logs &lt;pod-name&gt; -c &lt;container-name&gt; -n &lt;namespace&gt;</pre><h3>3️⃣ Use kubectl top to Check Resources 📊</h3><p>Sometimes, performance problems are just due to resource limits.</p><p>Check pod usage with:</p><pre>kubectl top pod &lt;pod-name&gt; -n &lt;namespace&gt;</pre><p>You’ll see how much CPU and memory your pods are using — super helpful when debugging autoscaling or performance issues.</p><h3>🧪 Pro tip: Set resource requests &amp; limits</h3><p>Defining requests and limits helps the scheduler do its job and keeps things stable.</p><pre>resources:<br>  requests:<br>    memory: &quot;256Mi&quot;<br>    cpu: &quot;250m&quot;<br>  limits:<br>    memory: &quot;512Mi&quot;<br>    cpu: &quot;500m&quot;</pre><p>This ensures your pods don’t hog or starve resources 💥</p><h3>💭 Final Thoughts</h3><p>Kubernetes in interviews doesn’t have to be scary.</p><p>Knowing <strong>where to look</strong>, how to <strong>read the signals</strong>, and how to <strong>check usage</strong> already puts you ahead of the pack.</p><h3>👇 Your Turn</h3><p>Have you ever faced a frustrating Kubernetes issue in production — or during an interview?</p><p>Leave a comment — I’d love to hear your story (and maybe turn it into a lesson for others too).</p><p>📬 Want more content like this — plus real stories about tech interviews, remote jobs, Java and DevOps?<br> <strong>Join my newsletter:</strong><br> 👉 <a href="https://sendfox.com/thiago">https://sendfox.com/thiago</a></p><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=1017dd97aa14" width="1" height="1" 
alt=""><hr><p><a href="https://arquivolivre.com.br/what-you-really-need-to-know-about-kubernetes-in-interviews-part-i-1017dd97aa14">🔧 What You Really Need to Know About Kubernetes in Interviews — Part I</a> was originally published in <a href="https://arquivolivre.com.br">Arquivo Livre</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[How to Use Stream.gather in Java 24 for More Powerful Stream Processing]]></title>
            <link>https://arquivolivre.com.br/introduction-e845992c4516?source=rss-437038ced80d------2</link>
            <guid isPermaLink="false">https://medium.com/p/e845992c4516</guid>
            <category><![CDATA[java]]></category>
            <category><![CDATA[developer]]></category>
            <category><![CDATA[development]]></category>
            <category><![CDATA[programming]]></category>
            <category><![CDATA[java24]]></category>
            <dc:creator><![CDATA[Thiago Gonzaga]]></dc:creator>
            <pubDate>Mon, 10 Mar 2025 20:11:44 GMT</pubDate>
            <atom:updated>2025-03-10T21:03:01.669Z</atom:updated>
            <content:encoded><![CDATA[<figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*J1mTItXdfzZ6aNH3hMDfHA.png" /></figure><p>While traveling to Belo Horizonte — the capital of Minas Gerais, Brazil — to catch The Offspring live (one of my all-time favorite bands, by the way), I found myself thinking about my next blog post. Just last week, I introduced some Java 24 changes to a large audience — even though Java 24 hasn’t officially launched yet (it’s set for March 18, 2025). That’s when I had an insight: <strong>developers can leverage </strong><strong>Stream.gather 🎯 to write cleaner, more efficient stream-based code—elevating their knowledge and staying ahead of the launch.</strong></p><p>One of the major highlights in Java 24, Stream.gather is a powerful feature that enhances stream processing by allowing custom intermediate operations. This article provides an easy-to-follow guide on how Stream.gather works, the problems it solves, and how to use it effectively.</p><p>If you want to start using Stream.gather today, you can either <strong>download the latest release candidate for JDK 24</strong> from <a href="https://jdk.java.net/24/">jdk.java.net/24</a> or <strong>enable preview features</strong> by passing --enable-preview to javac while compiling and java while executing with JDK 23.</p><h3><strong>🔍 How It Works</strong></h3><p>Stream.gather introduces <em>gatherers</em>—user-defined processors that transform stream elements in flexible ways. 
Unlike traditional intermediate operations, gatherers can perform:</p><ul><li>🔄 One-to-one transformations (like map)</li><li>📦 One-to-many, many-to-many, and many-to-one aggregations</li><li>🧠 Stateful transformations that track previous elements</li><li>⏳ Short-circuiting operations to transform infinite streams into finite ones</li></ul><p>A gatherer consists of four optional functions:</p><ul><li>🛠️ <strong>Initializer</strong> — Creates an intermediate state object (if needed).</li><li>🔗 <strong>Integrator</strong> — Processes incoming elements, possibly using intermediate state, and optionally produces output elements.</li><li>⚡ <strong>Combiner</strong> — Merges intermediate states into one.</li><li>🏁 <strong>Finisher</strong> — Finalizes processing by handling the final intermediate state and performing a final action at the end of the input stream.</li></ul><h3><strong>🤔 Which Problem It Can Solve</strong></h3><p>Java streams have been great for functional programming, but they lacked a way to define custom intermediate operations. Stream.gather fills this gap, making complex tasks like grouping, scanning, or stateful filtering more intuitive and efficient.</p><p>For example, imagine you have a list of students and want to split them into groups of three. Before Java 24, you would do it like this:</p><pre>import java.util.stream.IntStream;<br><br>public class BeforeJava24 {<br>    private record Student(String name) {}<br><br>    public static void main(String[] args) {<br>        var batchSize = 3;<br>        var students = IntStream.rangeClosed(1, 10)<br>                .mapToObj(i -&gt; new Student(String.format(&quot;Student #%d&quot;, i)))<br>                .toList();<br><br>        var groups = IntStream<br>                .range(0, students.size() % batchSize == 0 ? 
students.size() / batchSize : students.size() / batchSize + 1)<br>                .mapToObj(i -&gt; students.subList(i * batchSize, Math.min((i + 1) * batchSize, students.size())))<br>                .toList();<br><br>        groups.forEach(System.out::println);<br>    }<br>}</pre><p>With Java 24, you can achieve the same result more cleanly and efficiently using Stream.gather:</p><pre>import java.util.stream.Gatherers;<br>import java.util.stream.IntStream;<br><br>public class AfterJava24 {<br>    private record Student(String name) {}<br><br>    public static void main(String[] args) {<br>        var students = IntStream.rangeClosed(1, 10)<br>                .mapToObj(i -&gt; new Student(String.format(&quot;Student #%d&quot;, i)))<br>                .toList();<br><br>        var groups = students.stream()<br>                .gather(Gatherers.windowFixed(3))<br>                .toList();<br><br>        groups.forEach(System.out::println);<br>    }<br>}</pre><p>Notice how the code becomes cleaner and less complex with Stream.gather. 
🚀</p><h3><strong>🛠️ Built-in Gatherers</strong></h3><p>Java 24 provides several ready-to-use gatherers in java.util.stream.Gatherers:</p><ul><li>📌 <strong>fold </strong>is a stateful many-to-one gatherer which performs an ordered reduction transformation.</li></ul><p>Before Java 24, you would probably do it as follows:</p><pre>import java.util.Arrays;<br><br>public class Fold {<br>    public static void main(String[] args) {<br>        var numbers = Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8, 9, 10);<br>        var total = numbers.stream().mapToInt(Integer::intValue).sum();<br>        System.out.println(total);<br>    }<br>}</pre><p>Using fold in Java 24, the code looks like this:</p><pre>import java.util.Arrays;<br>import java.util.stream.Gatherers;<br><br>public class Fold {<br>    public static void main(String[] args) {<br>        var numbers = Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8, 9, 10);<br>        // sums up all the integers in the list<br>        var total = numbers.stream()<br>                .gather(Gatherers.fold(() -&gt; 0, Integer::sum)) // starts at zero and sums the previous number with current one<br>                .findFirst().orElse(0);<br>        System.out.println(total);<br>    }<br>}</pre><p>Check in the example above — there was no need to use map since gather processes the stream pipeline and performs the sum in a single statement.</p><ul><li>⚙️ <strong>mapConcurrent</strong> is a stateful one-to-one gatherer which applies functions concurrently with a concurrency limit.</li></ul><p>In the example below, we have an implementation using Java 23:</p><pre>import java.net.URI;<br>import java.net.http.HttpClient;<br>import java.net.http.HttpRequest;<br>import java.net.http.HttpResponse;<br>import java.util.List;<br><br>public class MapConcurrent {<br>    public static void main(String[] args) {<br>        var urls = List.of(<br>                &quot;https://jsonplaceholder.typicode.com/todos/1&quot;,<br>                
&quot;https://jsonplaceholder.typicode.com/todos/2&quot;,<br>                &quot;https://jsonplaceholder.typicode.com/posts/1&quot;,<br>                &quot;https://jsonplaceholder.typicode.com/users/1&quot;,<br>                &quot;https://jsonplaceholder.typicode.com/comments/1&quot;<br>        );<br><br>        try (HttpClient client = HttpClient.newHttpClient()) {<br>            urls.stream()<br>                    .map(url -&gt; client.sendAsync(<br>                                    HttpRequest.newBuilder().uri(URI.create(url)).build(),<br>                                    HttpResponse.BodyHandlers.ofString()<br>                            ).thenApply(HttpResponse::body)<br>                            .exceptionally(e -&gt; &quot;Error: &quot; + e.getMessage()))<br>                    .toList()<br>                    .forEach(response -&gt; System.out.println(response.join())); // Wait and print<br>        }<br>    }<br>}</pre><p>With Java 24, we can simplify this task using mapConcurrent:</p><pre>import java.io.IOException;<br>import java.net.URI;<br>import java.net.http.HttpClient;<br>import java.net.http.HttpRequest;<br>import java.net.http.HttpResponse;<br>import java.util.List;<br>import java.util.function.Function;<br>import java.util.stream.Gatherers;<br><br>public class MapConcurrent {<br>    public static void main(String[] args) {<br>        var urls = List.of(<br>                &quot;https://jsonplaceholder.typicode.com/todos/1&quot;,<br>                &quot;https://jsonplaceholder.typicode.com/todos/2&quot;,<br>                &quot;https://jsonplaceholder.typicode.com/posts/1&quot;,<br>                &quot;https://jsonplaceholder.typicode.com/users/1&quot;,<br>                &quot;https://jsonplaceholder.typicode.com/comments/1&quot;<br>        );<br><br>        final Function&lt;String, String&gt; fetchData = url -&gt; {<br>            try (HttpClient client = HttpClient.newHttpClient()) {<br>                var resp = 
client.send(<br>                        HttpRequest.newBuilder().uri(URI.create(url)).build(),<br>                        HttpResponse.BodyHandlers.ofString());<br>                return resp.body();<br>            } catch (IOException | InterruptedException ex) {<br>                return &quot;&quot;;<br>            }<br>        };<br>        // fetches all 5 URLs concurrently<br>        urls.stream().gather(Gatherers.mapConcurrent(urls.size(), fetchData))<br>                .toList()<br>                .forEach(System.out::println);<br>    }<br>}</pre><p>Look how the code is now simpler and easier to maintain.</p><ul><li>🔄 <strong>scan</strong> is a stateful one-to-one gatherer that applies a function using the current state and element to produce the next.</li></ul><p>Here’s how we handled this before Java 24, in an example where we calculate the interest of each installment of a loan:</p><pre>import java.util.stream.IntStream;<br><br>public class Scan {<br>    public static void main(String[] args) {<br>        double principal = 10000; // Loan Amount<br>        double annualInterestRate = 12; // 12% per year<br>        int numInstallments = 12; // 12 months<br><br>        // Convert annual interest rate to monthly interest rate<br>        double monthlyInterestRate = (annualInterestRate / 100) / 12;<br>        double emi = (principal * monthlyInterestRate * Math.pow(1 + monthlyInterestRate, numInstallments)) /<br>                (Math.pow(1 + monthlyInterestRate, numInstallments) - 1);<br><br>        System.out.println(&quot;Reducing Balance Installment Schedule:&quot;);<br><br>        // Mutable array to track outstanding balance<br>        double[] balance = {principal};<br><br>        IntStream.rangeClosed(1, numInstallments)<br>                .forEach(i -&gt; {<br>                    double interest = balance[0] * monthlyInterestRate;<br>                    double principalRepayment = emi - interest;<br>                    balance[0] -= 
principalRepayment;  // Reduce principal balance<br><br>                    System.out.printf(&quot;Month %d: Installment = %.2f, Interest = %.2f, Principal = %.2f, Remaining Balance = %.2f\n&quot;,<br>                            i, emi, interest, principalRepayment, balance[0]);<br>                });<br>    }<br>}</pre><p>And here’s how it looks in Java 24:</p><pre>import java.util.HashMap;<br>import java.util.Map;<br>import java.util.stream.Gatherers;<br>import java.util.stream.IntStream;<br><br>public class Scan {<br>    public static void main(String[] args) {<br>        final double principal = 10000; // Loan Amount<br>        final double annualInterestRate = 12; // 12% per year<br>        final int numInstallments = 12; // 12 months<br><br>        // Convert annual interest rate to monthly interest rate<br>        final double monthlyInterestRate = (annualInterestRate / 100) / 12;<br>        final double emi = (principal * monthlyInterestRate * Math.pow(1 + monthlyInterestRate, numInstallments)) /<br>                (Math.pow(1 + monthlyInterestRate, numInstallments) - 1);<br>        final double initialInterest = principal * monthlyInterestRate;<br>        <br>        System.out.println(&quot;Reducing Balance Installment Schedule:&quot;);<br><br>        IntStream.rangeClosed(1, numInstallments)<br>                .mapToObj(i -&gt; new HashMap&lt;String, Double&gt;(Map.of(&quot;Month&quot;, (double) i)))<br>                .gather(Gatherers.scan(() -&gt; Map.of(&quot;Balance&quot;, principal, &quot;Interest&quot;, initialInterest, &quot;PrincipalRepayment&quot;, emi - initialInterest),<br>                        (current, downstream) -&gt; {<br>                            double interest = current.get(&quot;Balance&quot;) * monthlyInterestRate;<br>                            double principalRepayment = emi - interest;<br>                            downstream.put(&quot;Balance&quot;, current.get(&quot;Balance&quot;) - principalRepayment);<br>                
            downstream.put(&quot;Interest&quot;, interest);<br>                            downstream.put(&quot;PrincipalRepayment&quot;, emi - interest);<br>                            return downstream;<br>                        }))<br>                .forEach(entry -&gt; {<br>                    System.out.printf(&quot;Month %d: Installment = %.2f, Interest = %.2f, Principal = %.2f, Remaining Balance = %.2f\n&quot;,<br>                            entry.get(&quot;Month&quot;).intValue(), emi, entry.get(&quot;Interest&quot;), entry.get(&quot;PrincipalRepayment&quot;), entry.get(&quot;Balance&quot;));<br>                });<br>    }<br>}</pre><p>See that we no longer need an external mutable value to track the state — Gatherers.scan can maintain the state between iterations, allowing you to reuse the previous value to generate a new one. What do you think? The code is fancier now, isn&#39;t it? 😃</p><ul><li>📦 <strong>windowFixed </strong>groups elements into fixed-size lists.</li></ul><p>Imagine you need to process orders in batches of five orders each. To break them into groups, you would probably do something like this before Java 24:</p><pre>import java.util.stream.IntStream;<br><br><br>public class WindowFixed {<br>    record Order(int orderId) {}<br><br>    public static void main(String[] args) {<br>        final int batchSize = 5;<br>        var orders = IntStream.rangeClosed(1, 51)<br>                .mapToObj(Order::new)<br>                .toList();<br><br>        var batches = IntStream<br>                .range(0, orders.size() % batchSize == 0 ? 
orders.size() / batchSize : orders.size() / batchSize + 1)<br>                .mapToObj(i -&gt; orders.subList(i * batchSize, Math.min((i + 1) * batchSize, orders.size())))<br>                .toList();<br>        batches.forEach(System.out::println);<br>    }<br>}</pre><p>In Java 24, you can simply use windowFixed with the batch size, and it will look like this:</p><pre>import java.util.stream.Gatherers;<br>import java.util.stream.IntStream;<br><br>public class WindowFixed {<br>    record Order(int orderId) {}<br><br>    public static void main(String[] args) {<br>        final int batchSize = 5;<br>        var orders = IntStream.rangeClosed(1, 51)<br>                .mapToObj(Order::new)<br>                .toList();<br>        var batches = orders.stream()<br>                .gather(Gatherers.windowFixed(batchSize))<br>                .toList();<br>        batches.forEach(System.out::println);<br>    }<br>}</pre><p>Now it looks simpler and easier to read.</p><ul><li>🔍 <strong>windowSliding </strong>is<strong> </strong>similar to windowFixed, but with overlapping groups.</li></ul><p>Moving averages are a great example of where we can use a sliding window. 
Here&#39;s how it was done in Java prior to 24:</p><pre>import java.util.Arrays;<br>import java.util.stream.Collectors;<br>import java.util.stream.IntStream;<br><br>public class WindowSliding {<br>    public static void main(String[] args) {<br>        var data = Arrays.asList(1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0);<br>        int windowSize = 3;<br>        var movingAverages = IntStream.rangeClosed(0, data.size() - windowSize)<br>                .mapToObj(i -&gt; data.subList(i, i + windowSize)<br>                        .stream()<br>                        .mapToDouble(Double::doubleValue)<br>                        .average()<br>                        .orElse(0.0))<br>                .collect(Collectors.toList());<br>        System.out.println(movingAverages);<br>    }<br>}</pre><p>With Java 24, we can use windowSliding to reduce the complexity of the code above:</p><pre>import java.util.Arrays;<br>import java.util.stream.Gatherers;<br><br>public class WindowSliding {<br>    public static void main(String[] args) {<br>        var data = Arrays.asList(1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0);<br>        int windowSize = 3;<br>        var movingAverages = data.stream()<br>                .gather(Gatherers.windowSliding(windowSize))<br>                .gather(Gatherers.scan(() -&gt; 0.0,<br>                        (_, windows) -&gt; windows.stream().mapToDouble(Double::doubleValue)<br>                                .average().orElse(0.0)))<br>                .toList();<br>        System.out.println(movingAverages);<br>    }<br>}</pre><p>In the end, we see that we don&#39;t need to use map or collect to get the same results.</p><p><strong>⚡ Parallel Processing with Gatherers</strong></p><p>Parallel execution in Stream.gather operates in two modes:</p><ol><li><strong>Without a Combiner</strong> — The upstream and downstream run concurrently, similar to parallel().forEachOrdered().</li><li><strong>With a Combiner</strong> — Supports parallel reductions, akin to 
parallel().reduce().</li></ol><p>Example using a parallel gatherer to get the maximum prime number between 1 and 10,000:</p><pre>import java.util.Objects;<br>import java.util.Optional;<br>import java.util.stream.Gatherer;<br>import java.util.stream.IntStream;<br><br>class LargestPrimeGatherer {<br><br>    public static void main(String[] args) {<br>        Optional&lt;Integer&gt; largestPrime = IntStream.rangeClosed(1, 10000)<br>                .boxed()<br>                .filter(LargestPrimeGatherer::isPrime)    // Filter only prime numbers<br>                .gather(selectOne(Math::max))             // Use custom Gatherer<br>                .parallel()<br>                .findFirst();                             // Extract the largest prime<br><br>        System.out.println(&quot;Largest prime number between 1 and 10,000: &quot; +<br>                largestPrime.orElse(-1)); // Print result<br>    }<br><br>    // Custom Gatherer to find the largest prime number<br>    static Gatherer&lt;Integer, ?, Integer&gt; selectOne(java.util.function.BinaryOperator&lt;Integer&gt; selector) {<br>        Objects.requireNonNull(selector, &quot;selector must not be null&quot;);<br><br>        // Private state to track information across elements<br>        class State {<br>            Integer value = null;  // The current best value<br>        }<br><br>        return Gatherer.of(<br>                State::new,  // The initializer creates a new State instance<br><br>                // The integrator<br>                Gatherer.Integrator.ofGreedy((state, element, downstream) -&gt; {<br>                    if (state.value == null) {<br>                        state.value = element;  // First value<br>                    } else {<br>                        state.value = selector.apply(state.value, element); // Compare and update max<br>                    }<br>                    return true;<br>                }),<br><br>                // The combiner, used during parallel 
evaluation<br>                (leftState, rightState) -&gt; {<br>                    if (leftState.value == null) return rightState;  // If left is empty, take right<br>                    if (rightState.value == null) return leftState;  // If right is empty, take left<br>                    leftState.value = selector.apply(leftState.value, rightState.value);  // Select max<br>                    return leftState;<br>                },<br><br>                // The finisher<br>                (state, downstream) -&gt; {<br>                    if (state.value != null)<br>                        downstream.push(state.value);  // Emit the selected value<br>                }<br>        );<br>    }<br><br>    // Prime checking function<br>    private static boolean isPrime(int num) {<br>        if (num &lt; 2) return false;<br>        return IntStream.rangeClosed(2, (int) Math.sqrt(num))<br>                .noneMatch(divisor -&gt; num % divisor == 0);<br>    }<br>}</pre><p>Notice that in this example, we have a more complex implementation where we define a Gatherer interface with an integrator, combiner, and finisher. We also implement a function that returns a gatherer and accepts an operator to select the maximum value. In this case, we use Math.max, which finds the maximum between two numbers.</p><h3><strong>🛠️ Creating Your Own Gatherer</strong></h3><p>Developers can define custom gatherers using Gatherer.ofSequential() or by implementing Gatherer directly, as shown in the example above using Gatherer.of(). Here&#39;s an example of a gatherer that emits distinct names based on their length. 
If two names have the same length, it picks only the first one that appears:</p><pre>import java.util.Arrays;<br>import java.util.HashSet;<br>import java.util.Set;<br>import java.util.stream.Collectors;<br>import java.util.stream.Gatherer;<br><br>public class Test {<br>    public static void main(String[] args) {<br>        Gatherer&lt;String, Set&lt;Integer&gt;, String&gt; distinctByLength = Gatherer.ofSequential(<br>                HashSet::new,<br>                (set, str, downstream) -&gt; {<br>                    // If this length is new, send &#39;str&#39; downstream<br>                    if (set.add(str.length())) {<br>                        downstream.push(str);<br>                    }<br>                    return true;<br>                },<br>                (set, downstream) -&gt; {}<br>        );<br><br>        var names = Arrays.asList(&quot;amanda&quot;, &quot;samantha&quot;, &quot;carolina&quot;, &quot;davis&quot;, &quot;john&quot;, &quot;juliana&quot;);<br><br>        names.stream()<br>                // &quot;gather&quot; (JEP 485) is a standard Stream method as of Java 24<br>                .gather(distinctByLength)<br>                .collect(Collectors.toSet())<br>                .forEach(System.out::println);<br>    }<br>}</pre><h3><strong>⚖️ Gather vs. Collect</strong></h3><p>While Collector is used for terminal aggregation, Gatherer is designed for intermediate transformations. Key differences:</p><ul><li><strong>🔄 Integrator vs. BiConsumer</strong> — Gatherers integrate state with a Downstream object.</li><li><strong>🏁 Finisher with Side Effects</strong> — Unlike Collector’s Function, Gatherer’s BiConsumer operates on Downstream.</li><li><strong>⏳ Supports Short-Circuiting</strong> — Gatherers can stop processing early, unlike most collectors.</li></ul><h3><strong>🎯 Conclusion</strong></h3><p>Java 24’s Stream.gather makes stream processing more powerful and expressive. 
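</p><p>One difference listed above is worth seeing in action: short-circuiting. The sketch below is my own illustrative example (not from the JEP text), a sequential gatherer that emits at most three elements and then returns false from its integrator, which stops the upstream from producing further elements:</p><pre>

```java
import java.util.List;
import java.util.stream.Gatherer;
import java.util.stream.Stream;

public class ShortCircuitDemo {
    // A short-circuiting gatherer: pushes at most `limit` elements downstream,
    // then returns false so the rest of the stream is never evaluated.
    static <T> Gatherer<T, ?, T> takeAtMost(int limit) {
        return Gatherer.ofSequential(
                () -> new int[1],                 // mutable counter as private state
                (counter, element, downstream) -> {
                    downstream.push(element);
                    return ++counter[0] < limit;  // false = stop pulling elements
                }
        );
    }

    public static void main(String[] args) {
        // Works even on an infinite stream, because integration stops after 3 elements
        List<Integer> result = Stream.iterate(1, n -> n + 1)
                .gather(takeAtMost(3))
                .toList();
        System.out.println(result); // [1, 2, 3]
    }
}
```

</pre><p>A Collector could not do this, since collectors always consume the whole stream; a gatherer’s integrator can cut evaluation short, much like limit() or takeWhile().</p><p>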
Whether using built-in gatherers or creating custom ones, developers now have a tool that simplifies complex transformations while maintaining readability and efficiency. This feature represents a significant evolution in Java’s functional programming capabilities, bridging the gap between simplicity and flexibility. 🚀</p><hr><p><a href="https://arquivolivre.com.br/introduction-e845992c4516">🚀 How to Use Stream.gather in Java 24 for More Powerful Stream Processing</a> was originally published in <a href="https://arquivolivre.com.br">Arquivo Livre</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Why You Should Consider Migrating to Java 24 ]]></title>
            <link>https://arquivolivre.com.br/why-you-should-consider-migrating-to-java-24-ec44c6c63cef?source=rss-437038ced80d------2</link>
            <guid isPermaLink="false">https://medium.com/p/ec44c6c63cef</guid>
            <category><![CDATA[java]]></category>
            <category><![CDATA[programming]]></category>
            <category><![CDATA[java24]]></category>
            <category><![CDATA[development]]></category>
            <category><![CDATA[developer]]></category>
            <dc:creator><![CDATA[Thiago Gonzaga]]></dc:creator>
            <pubDate>Fri, 28 Feb 2025 02:26:46 GMT</pubDate>
            <atom:updated>2025-02-28T03:18:37.669Z</atom:updated>
            <content:encoded><![CDATA[<figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*9oAexg6hnaMYgdGlxxL_EA.jpeg" /></figure><p>This is my first blog post in <strong>many, many years</strong>, and I was wondering what to write about. I’m a <strong>community guy</strong> — I love sharing content! I’ve been part of the <strong>Java Noroeste JUG leadership</strong> for over a decade, and one thing I always enjoy is talking about the latest Java releases.</p><p>Back in <strong>September 2024</strong>, I gave a talk at a <strong>SouJava meetup</strong> about Java 23 features. So, I thought, why not do the same here but for <strong>Java 24</strong>? Let’s get straight to the point and talk about all the cool new things coming with <strong>Java 24</strong> and why you should be thinking about migrating!</p><h3>1️⃣ Virtual Threads Just Got Even Better! (JEP 491)</h3><p><strong>Virtual Threads (VTs)</strong> were a game-changer in <strong>Java 21</strong>, making concurrency easier and more scalable. But there was a problem: if you used <strong>synchronized methods</strong>, they could <strong>pin</strong> the carrier thread, blocking other tasks and hurting performance.</p><p>👉 <strong>What changed in Java 24?</strong><br>With <strong>JEP 491</strong>, synchronized methods no longer block the carrier thread! 
Now, virtual threads can <strong>park safely</strong>, making them much more efficient for applications that rely on traditional synchronization mechanisms.</p><p>💡 <strong>Example:</strong></p><p>Looking at the example below, pinning usually happens <strong>when a virtual thread is blocked inside a synchronized block for an extended period</strong>.</p><pre>import java.util.concurrent.*;<br><br>public class VirtualThreadPinningTest {<br>    private static final Object lock = new Object();<br><br>    static void blockingTask() {<br>        synchronized (lock) { // 🔴 This used to pin in Java 21<br>            System.out.println(Thread.currentThread() + &quot; - Holding lock...&quot;);<br>            try {<br>                Thread.sleep(3000); // Simulate a long blocking operation<br>            } catch (InterruptedException e) {<br>                e.printStackTrace();<br>            }<br>            System.out.println(Thread.currentThread() + &quot; - Released lock!&quot;);<br>        }<br>    }<br><br>    public static void main(String[] args) throws InterruptedException {<br>        var executor = Executors.newVirtualThreadPerTaskExecutor();<br><br>        long start = System.currentTimeMillis();<br><br>        executor.submit(VirtualThreadPinningTest::blockingTask);<br>        Thread.sleep(500); // Ensures the first thread locks first<br>        executor.submit(VirtualThreadPinningTest::blockingTask);<br><br>        executor.shutdown();<br>        executor.awaitTermination(5, TimeUnit.SECONDS);<br><br>        long elapsed = System.currentTimeMillis() - start;<br>        System.out.println(&quot;Total execution time: &quot; + elapsed + &quot;ms&quot;);<br>    }<br>}</pre><p>Let’s explore how things turn out with Java 21:</p><pre>$ javac --release 21 VirtualThreadPinningTest.java<br><br>$ java -Djdk.tracePinnedThreads=full VirtualThreadPinningTest<br>VirtualThread[#20]/runnable@ForkJoinPool-1-worker-1 - Holding 
lock...<br>VirtualThread[#20]/runnable@ForkJoinPool-1-worker-1 reason:MONITOR<br>    java.base/java.lang.VirtualThread$VThreadContinuation.onPinned(VirtualThread.java:199)<br>    java.base/jdk.internal.vm.Continuation.onPinned0(Continuation.java:393)<br>    java.base/java.lang.VirtualThread.parkNanos(VirtualThread.java:640)<br>    java.base/java.lang.VirtualThread.sleepNanos(VirtualThread.java:817)<br>    java.base/java.lang.Thread.sleepNanos(Thread.java:494)<br>    java.base/java.lang.Thread.sleep(Thread.java:527)<br>    VirtualThreadPinningTest.blockingTask(Main.java:11) &lt;== monitors:1<br>    java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:572)<br>    java.base/java.util.concurrent.FutureTask.run(FutureTask.java:317)<br>    java.base/java.lang.VirtualThread.run(VirtualThread.java:329)<br>VirtualThread[#20]/runnable@ForkJoinPool-1-worker-1 - Released lock!<br>VirtualThread[#25]/runnable@ForkJoinPool-1-worker-3 - Holding lock...<br>Total execution time: 5516ms</pre><p>If we do the same in Java 24 (where JEP 491 is final, so no preview flag is needed), the thread is no longer pinned within a synchronized block:</p><pre>$ javac --release 24 VirtualThreadPinningTest.java<br><br>$ java -Djdk.tracePinnedThreads=full VirtualThreadPinningTest<br>VirtualThread[#26]/runnable@ForkJoinPool-1-worker-1 - Holding lock...<br>VirtualThread[#26]/runnable@ForkJoinPool-1-worker-1 - Released lock!<br>VirtualThread[#30]/runnable@ForkJoinPool-1-worker-2 - Holding lock...<br>Total execution time: 5522ms</pre><p>Although this is a simple example, you’ll notice the total execution time is about the same. The difference is that the thread is no longer pinned, allowing other virtual threads to run on the carrier while it waits. If you’re using <strong>virtual threads</strong>, no changes are needed — your existing code will now scale more efficiently.</p><h3>2️⃣ G1 GC Just Got Smarter (JEP 475)</h3><p>If you’re running Java apps in the <strong>cloud</strong>, you’re probably using <strong>G1 GC</strong>. 
Java 24 introduces a performance optimization called <strong>Late Barrier Expansion</strong>, which <strong>reduces JVM overhead</strong> by delaying the insertion of GC barriers <strong>until later in the compilation process</strong>.</p><p>👉 <strong>What does this mean?</strong></p><ul><li>Lower <strong>GC overhead</strong> ✅</li><li>Faster <strong>JVM performance</strong> ✅</li><li><strong>More efficient cloud deployments</strong> ✅</li></ul><p>It’s a <strong>low-level change</strong> that makes Java <strong>even better</strong> for <strong>high-performance and cloud-native applications</strong>.</p><h3>3️⃣ Faster Startup with Ahead-of-Time Class Loading (JEP 483)</h3><p>Have you ever felt your <strong>Spring Boot</strong> app takes too long to start? <strong>Java 24</strong> brings <strong>Ahead-of-Time (AOT) Class Loading &amp; Linking</strong>, which <strong>reduces startup time by up to 42%</strong> when the JVM is trained properly.</p><p>💡 <strong>Example:</strong></p><p>Let&#39;s take this small Java program as an example; it uses the <strong>Stream API</strong>, which loads almost <strong>600 JDK classes</strong> when executed.</p><pre>import java.util.*;<br>import java.util.stream.*;<br><br>public class HelloStream {<br>    public static void main(String ... 
args) {<br>        long start = System.currentTimeMillis();<br>        var words = List.of(&quot;hello&quot;, &quot;fuzzy&quot;, &quot;world&quot;);<br>        var greeting = words.stream()<br>                .filter(w -&gt; !w.contains(&quot;z&quot;))<br>                .collect(Collectors.joining(&quot;, &quot;));<br>        System.out.println(greeting);  // Output: hello, world<br>        long elapsed = System.currentTimeMillis() - start;<br>        System.out.println(&quot;Total execution time: &quot; + elapsed + &quot;ms&quot;);<br><br>    }<br>}</pre><p>✅ First, compile the program:</p><pre>$ javac HelloStream.java</pre><p>✅ Run it once to observe normal startup time:</p><pre>$ java HelloStream<br>hello, world<br>Total execution time: 9ms</pre><p>✅ Then run the application <strong>in a training mode</strong> to let the JVM analyze and record its AOT configuration:</p><pre>$ java -XX:AOTMode=record -XX:AOTConfiguration=app.aotconf \<br>     -cp . HelloStream           <br>hello, world<br>Total execution time: 25ms</pre><p>This will generate an <strong>AOT configuration file (</strong><strong>app.aotconf)</strong>, storing details about the required classes and methods.</p><p>✅ Now, use the recorded configuration to create an <strong>AOT cache</strong>:</p><pre>$ java -XX:AOTMode=create -XX:AOTConfiguration=app.aotconf \<br>     -XX:AOTCache=app.aot -cp . HelloStream            <br>AOTCache creation is complete: app.aot</pre><p>This <strong>does not run the program</strong> but instead generates an optimized cache file (app.aot) that speeds up future executions.</p><p>✅ Now, run the program using the cache:</p><pre>$ java -XX:AOTCache=app.aot -cp . 
HelloStream<br>hello, world<br>Total execution time: 1ms</pre><p>This results in <strong>faster startup</strong> because classes are loaded <strong>instantly from the cache</strong> rather than being read, parsed, and linked at runtime.</p><p>To conclude, the JVM trains itself by learning which classes are used often and preloads them. This helps reduce startup time and improves performance.</p><p>👉 <strong>Great for:</strong></p><ul><li><strong>Spring Boot</strong> 🚀</li><li><strong>Micronaut, Quarkus</strong> ☁️</li><li><strong>Serverless applications</strong> ⚡</li></ul><p>Less waiting, <strong>more coding!</strong></p><h3>4️⃣ New Class-File API (JEP 484)</h3><p>If you’ve ever worked with <strong>bytecode manipulation</strong> using <strong>ASM</strong> or other low-level tools, you know how painful it can be. Java 24 introduces a <strong>standard Class-File API</strong>, making it much easier to <strong>read, write, and transform class files</strong> without low-level hacks.</p><p>💡 <strong>Example: Using the new Class-File API</strong></p><p>In this example, we will:<br>✔ Create a new <strong>HelloWorld </strong>class<br>✔ Generate the <strong>main()</strong> method that prints text<br>✔ Write the <strong>.class</strong> file that can be loaded and executed by the <strong>JVM</strong></p><pre>import java.lang.classfile.ClassFile;<br>import java.lang.constant.ClassDesc;<br>import java.lang.constant.MethodTypeDesc;<br>import java.nio.file.Path;<br><br>public class Main {<br>    public static void main(String[] args) throws Exception {<br>        Path classFilePath = Path.of(&quot;HelloWorld.class&quot;);<br>        ClassFile.of().buildTo(classFilePath, ClassDesc.of(&quot;HelloWorld&quot;), classBuilder -&gt; {<br>            // Set class metadata<br>            classBuilder<br>                    .withVersion(68, 0) // Java 24 = major version 68<br>                    .withSuperclass(ClassDesc.of(&quot;java.lang.Object&quot;)); // Superclass name<br><br>            
// Add the &quot;main&quot; method<br>            classBuilder.withMethod(<br>                    &quot;main&quot;,<br>                    MethodTypeDesc.ofDescriptor(&quot;([Ljava/lang/String;)V&quot;), // void method with String[] parameter<br>                    ClassFile.ACC_PUBLIC | ClassFile.ACC_STATIC, // public static<br>                    methodBuilder -&gt; methodBuilder.withCode(code -&gt; code // System.out.println(&quot;Hello, world from Java 24!&quot;)<br>                            .getstatic(ClassDesc.of(&quot;java.lang.System&quot;), &quot;out&quot;, ClassDesc.of(&quot;java.io.PrintStream&quot;))<br>                            .ldc(&quot;Hello, world from Java 24!&quot;)<br>                            .invokevirtual(ClassDesc.of(&quot;java.io.PrintStream&quot;), &quot;println&quot;, MethodTypeDesc.ofDescriptor(&quot;(Ljava/lang/String;)V&quot;))<br>                            .return_())<br>            );<br>        });<br>        System.out.println(&quot;Generated HelloWorld.class successfully!&quot;);<br>    }<br>}</pre><p>✅ <strong>Compile with Java 24</strong></p><pre>javac Main.java</pre><p>✅ <strong>Run &amp; Verify the Bytecode</strong></p><pre>$ java Main                   <br>Generated HelloWorld.class successfully!<br><br>$ java HelloWorld             <br>Hello, world from Java 24!</pre><p>If you’re building <strong>instrumentation tools, agents, or compilers</strong>, this is a <strong>huge win</strong>!</p><h3>5️⃣ Stream.gather() — More Powerful Stream Transformations (JEP 485)</h3><p>Java Streams just got a <strong>new intermediate operation</strong>: Stream.gather(). It allows <strong>more flexible</strong> transformations, including:<br>✅ <strong>One-to-one</strong><br>✅ <strong>One-to-many</strong><br>✅ <strong>Many-to-many</strong></p><p>💡 <strong>Example: Before vs. 
After</strong></p><p>Before Java 24, you had to <strong>build the batches manually</strong>:</p><p>In this example, we want to convert a stream like [1, 2, 3, 4, 5, 6, 7] into [[1, 2, 3], [4, 5, 6], [7]].</p><pre>import java.util.Arrays;<br>import java.util.List;<br>import java.util.stream.Stream;<br><br>public class Main {<br>    public static void main(String[] args) {<br>        // Collect all elements into a list first<br>        List&lt;Integer&gt; allNumbers = Arrays.asList(1, 2, 3, 4, 5, 6, 7);<br>        int batchSize = 3;<br><br>        // Generate batches from the list<br>        Stream&lt;List&lt;Integer&gt;&gt; batches = Stream.iterate(0, i -&gt; i &lt; allNumbers.size(), i -&gt; i + batchSize)<br>                .map(i -&gt; allNumbers.subList(i, Math.min(i + batchSize, allNumbers.size())));<br><br>        System.out.println(batches.toList());<br>    }<br>}</pre><p>With Stream.gather(), it’s much <strong>simpler</strong> and <strong>more efficient</strong>:</p><pre>import java.util.stream.Gatherers;<br>import java.util.stream.Stream;<br><br>public class Main {<br>    public static void main(String[] args) {<br>        System.out.println(Stream.of(1, 2, 3, 4, 5, 6, 7).gather(Gatherers.windowFixed(3)).toList());<br>    }<br>}</pre><p>Output for both should be a list of lists divided into blocks of 3 items:</p><pre>$ javac Main.java<br>$ java Main<br>[[1, 2, 3], [4, 5, 6], [7]]</pre><p>This makes working with <strong>nested or complex transformations</strong> a lot easier!</p><h3>6️⃣ JDK Linking Without JMODs (JEP 493)</h3><p>The JDK is getting smaller! With the new --enable-linkable-runtime option, you can build a JDK that allows jlink to create runtime images without JMOD files. 
This enables jlink to link JDK modules directly from the run-time image that contains it, resulting in a run-time image that is about 60% smaller than a full JDK run-time image.</p><p>✅ <strong>Same modules</strong><br>✅ <strong>Smaller footprint</strong><br>✅ <strong>Faster deployments</strong></p><p>If you’re shipping <strong>custom JDK run-time images</strong>, this is <strong>great news</strong>!</p><h3>7️⃣ Quantum-Resistant Cryptography (JEP 496 &amp; 497)</h3><p>With <strong>quantum computing</strong> advancing, traditional cryptographic algorithms like <strong>RSA and ECC</strong> could become vulnerable. Java 24 introduces <strong>post-quantum security</strong> with:</p><p>🔐 <strong>ML-KEM (Kyber)</strong> — For <strong>key exchange</strong><br>✍ <strong>ML-DSA (Dilithium)</strong> — For <strong>digital signatures</strong></p><p>This is <strong>huge</strong> for securing <strong>future-proof applications</strong>!</p><h3>🔍 What Else Is Changing in Java 24?</h3><h4>🆕 New Features &amp; Enhancements (Preview &amp; Experimental)</h4><p>Java 24 introduces several experimental and preview features that push the platform forward:</p><p><strong>🚀 Performance &amp; Memory Improvements</strong></p><ul><li><strong>JEP 404: Generational Shenandoah (Experimental)</strong> — Adds a generational mode to the Shenandoah garbage collector for better memory efficiency.</li><li><strong>JEP 450: Compact Object Headers (Experimental)</strong> — Reduces object header size to improve memory footprint.</li></ul><p><strong>🔐 Security &amp; Cryptography</strong></p><ul><li><strong>JEP 478: Key Derivation Function API (Preview)</strong> — Introduces a standardized API for key derivation functions to enhance cryptographic security.</li></ul><p><strong>🧵 Concurrency &amp; Scoped Values</strong></p><ul><li><strong>JEP 487: Scoped Values (Fourth Preview)</strong> — Improves efficiency over thread-local variables by providing immutable, inheritable values within a scope.</li><li><strong>JEP 499: 
Structured Concurrency (Fourth Preview)</strong> — Simplifies concurrent programming by treating related tasks as a single unit.</li></ul><p><strong>🖥️ Language &amp; Syntax Enhancements</strong></p><ul><li><strong>JEP 488: Primitive Types in Patterns, instanceof, and switch (Second Preview)</strong> — Enhances pattern matching to support primitive types.</li><li><strong>JEP 494: Module Import Declarations (Second Preview)</strong> — Simplifies module imports with new syntax.</li><li><strong>JEP 495: Simple Source Files and Instance Main Methods (Fourth Preview)</strong> — Allows instance main methods and streamlines single-file execution.</li></ul><p><strong>⏩ Performance-Oriented APIs</strong></p><ul><li><strong>JEP 489: Vector API (Ninth Incubator)</strong> — Continues enhancing SIMD vector operations for better CPU utilization.</li><li><strong>JEP 492: Flexible Constructor Bodies (Third Preview)</strong> — Adds more flexibility in constructor execution.</li></ul><h4>🗑️ Deprecations, Removals &amp; Restrictions</h4><p>Java 24 also phases out outdated features:</p><p><strong>📌 JNI &amp; Unsafe Restrictions</strong></p><ul><li><strong>JEP 472: Prepare to Restrict the Use of JNI</strong> — Lays groundwork for restricting the Java Native Interface (JNI) in future releases.</li><li><strong>JEP 498: Warn upon Use of Memory-Access Methods in sun.misc.Unsafe</strong> — Warns developers about unsafe memory operations.</li></ul><p><strong>🪦 Feature &amp; Platform Removals</strong></p><ul><li><strong>JEP 479: Remove the Windows 32-bit x86 Port</strong> — Ends support for 32-bit Windows.</li><li><strong>JEP 501: Deprecate the 32-bit x86 Port for Removal</strong> — Signals the eventual removal of 32-bit x86 support.</li><li><strong>JEP 486: Permanently Disable the Security Manager</strong> — Permanently disables Java’s outdated Security Manager.</li><li><strong>JEP 490: ZGC: Remove the Non-Generational Mode</strong> — Fully transitions ZGC to a generational
model.</li></ul><h3>Final Thoughts</h3><p>Java 24 brings <strong>major improvements</strong> in performance, scalability, and security. Whether you’re running <strong>cloud apps, microservices, or high-performance applications</strong>, there’s something valuable here.</p><p>So… are you upgrading to <strong>Java 24</strong>? Let me know! 🚀</p><h3>Happy coding! 💻</h3><hr><p><a href="https://arquivolivre.com.br/why-you-should-consider-migrating-to-java-24-ec44c6c63cef">Why You Should Consider Migrating to Java 24 🚀</a> was originally published in <a href="https://arquivolivre.com.br">Arquivo Livre</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
    </channel>
</rss>