
Adapting to Virtual Threads for Spring Developers

Migrate your Spring Boot services from thread-pool exhaustion to JDK 21 virtual threads — the practical Spring developer guide

Abstract Algorithms · 18 min read

TLDR: Platform threads (one OS thread per request) max out at a few hundred concurrent I/O-bound requests. Virtual threads (JDK 21+) allow millions, because a blocked virtual thread parks in heap at near-zero cost instead of holding an OS thread. Spring Boot 3.2 enables them with a single property. Avoid synchronized blocks that wrap I/O (they pin virtual threads to OS threads) and avoid CPU-bound work on virtual threads. For I/O-heavy Spring services, virtual threads are the most impactful JVM upgrade since Java 8 lambdas.

⚠️ When 2,000 Users Break a Service That Handled 200 Fine

It is 2 AM. PagerDuty fires. Your Spring Boot service is returning HTTP 503s. You open Grafana: CPU at 22%, heap at 40%, memory healthy. No database errors. Then you check the thread monitor: the Tomcat thread pool is at 200/200 active threads. Thousands of requests are queuing and timing out.

You pull a thread dump. The culprit is 187 threads sitting in SocketInputStream.read(), each waiting for a JDBC network round-trip to return. Each database query holds a platform thread hostage while it waits the ~10ms for the network. At 200 concurrent queries, the thread pool is spent — every new request either waits in Tomcat's accept queue or gets rejected.

Your CPU has capacity. Your database has capacity. But your JVM is out of threads.

This is the platform thread ceiling that has constrained Java concurrency for 25 years. Java 21 (LTS) solved it with virtual threads — a fundamentally different threading model where I/O blocking no longer monopolizes OS threads. Spring Boot 3.2 adopted it with a one-property configuration change. This post shows you what to change, what traps to avoid, and what to leave completely alone.

🔍 Why Platform Threads Break Under Concurrent I/O Load

Every thread managed by Tomcat, @Async, or a raw ExecutorService in classic Spring MVC maps 1:1 to an OS thread. OS threads are expensive along three dimensions:

  • Stack memory: ~1 MB of stack space reserved per thread, regardless of whether the thread is doing work or waiting on I/O
  • Context-switch overhead: When a thread blocks, the OS saves its register state, switches to another thread, and restores state on return — around 1–10 microseconds per switch, adding up fast at high concurrency
  • OS-level cap: Most Linux systems support 1,000–10,000 threads per JVM process before kernel overhead degrades the system

With Tomcat's default server.tomcat.threads.max=200, you can handle exactly 200 simultaneous blocking calls. Increase it to 1,000 and your JVM reserves ~1 GB in thread stacks alone. At 5,000 threads, context-switch thrashing begins to consume more CPU than your actual application logic.

The deeper problem: during an HTTP call, JDBC query, file read, or any I/O wait, the platform thread is completely idle — it holds memory and an OS slot while doing nothing except waiting for a kernel signal.

📖 Virtual Threads: JVM-Scheduled, Not OS-Scheduled

A virtual thread is a thread managed by the JVM rather than the OS. Internally, the JVM runs a small fixed pool of carrier threads — one OS thread per CPU core. Virtual threads are mounted onto a carrier thread when they have CPU work to perform. The moment a virtual thread hits a blocking I/O call, the JVM unmounts it from the carrier, suspends its stack in heap memory, and frees the carrier for another virtual thread that has work ready.

Analogy: imagine a restaurant where each waiter (platform thread) takes an order, walks to the kitchen, and stands there waiting for the food — unable to serve other tables. Virtual threads are like a ticket system: the waiter takes the order, drops a ticket at the kitchen pass, and immediately returns to the floor to serve other tables. When the food is ready, any available waiter picks it up.

You can create millions of virtual threads. Each consumes only a few hundred bytes of heap when suspended. The OS sees only the small carrier thread pool.
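A minimal JDK 21 sketch makes this concrete — the class name and counts below are illustrative, not part of any Spring API:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

public class VirtualThreadScaleDemo {

    // Starts `count` virtual threads, each incrementing a shared counter,
    // and waits for all of them to finish. With platform threads, 100,000
    // threads would reserve ~1 MB of stack each; virtual threads park in heap.
    public static int runAll(int count) {
        AtomicInteger completed = new AtomicInteger();
        List<Thread> threads = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            threads.add(Thread.ofVirtual().start(completed::incrementAndGet));
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return completed.get();
    }

    public static void main(String[] args) {
        System.out.println(runAll(100_000)); // prints 100000 on JDK 21+
    }
}
```

On a typical machine this completes in well under a second; the equivalent with Thread.ofPlatform() would reserve on the order of 1 MB of stack per thread and typically fail long before 100,000.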

📊 How Virtual Thread Scheduling Differs from Platform Threads

The diagram below shows why the platform thread model exhausts at 200 concurrent requests while the virtual thread model scales to millions. Read each subgraph left-to-right: in the platform model, every request arrow terminates at a blocked OS thread; in the virtual model, all virtual threads are suspended in heap while two carrier threads handle all active CPU work.

flowchart TD
    subgraph PlatformModel[Platform Thread Model - 300 concurrent requests]
        R1[Request 1] --> T1[OS Thread 1 - blocked on JDBC]
        R2[Request 2] --> T2[OS Thread 2 - blocked on HTTP call]
        R3[Request 3] --> T3[OS Thread 3 - blocked on file I/O]
        R200[Request 200] --> T200[OS Thread 200 - blocked on JDBC]
        R201[Request 201] --> Q1[Queued - thread pool exhausted]
        R300[Request 300] --> Q2[Queued - timeout approaching]
    end
    subgraph VirtualModel[Virtual Thread Model - 300 concurrent requests]
        VR1[Request 1] --> VT1[Virtual Thread 1 - suspended in heap]
        VR2[Request 2] --> VT2[Virtual Thread 2 - suspended in heap]
        VR300[Request 300] --> VT300[Virtual Thread 300 - suspended in heap]
        CT1[Carrier Thread 1 - 1 per CPU core] -.->|mounts VT with active work| VT1
        CT2[Carrier Thread 2 - 1 per CPU core] -.->|mounts VT with active work| VT2
    end

In the platform thread model, each of the 200 threads is blocked and idle — consuming memory and OS slots while waiting on I/O. Requests 201–300 cannot be served. In the virtual thread model, all 300 virtual threads exist concurrently, each suspended in heap (~300 bytes each) during their I/O waits. Only the two carrier threads are OS-level resources, and they only activate when a virtual thread has CPU work — like deserializing a response or executing business logic.

🧪 Worked Example: Thread Pool Exhaustion Before and After Virtual Threads

To see the benefit concretely, contrast the same OrderService in both models. The code is identical — only the threading model beneath it changes.

# application.properties — classic Tomcat platform thread configuration
# Hard ceiling: 200 concurrent blocking requests
server.tomcat.threads.max=200
server.tomcat.threads.min-spare=10
# Queue depth before new connections are rejected
server.tomcat.accept-count=100

// Before: OrderService — each method call holds a platform thread for its entire duration
@Service
public class OrderService {

    private final RestTemplate restTemplate;
    private final JdbcTemplate jdbcTemplate;
    // Maps result-set rows to Order objects (standard Spring RowMapper)
    private final RowMapper<Order> orderRowMapper = new BeanPropertyRowMapper<>(Order.class);

    public OrderService(RestTemplate restTemplate, JdbcTemplate jdbcTemplate) {
        this.restTemplate = restTemplate;
        this.jdbcTemplate = jdbcTemplate;
    }

    // PROBLEM: This method occupies platform thread #47 for the full ~60ms.
    // During the 10ms JDBC wait and the 50ms HTTP wait, thread #47 sits idle
    // in SocketInputStream.read() — holding 1MB of stack and an OS slot.
    // At 200 concurrent calls, the pool is exhausted regardless of CPU load.
    public Order processOrder(String orderId) {

        // Blocks platform thread ~10ms waiting for DB network round-trip
        Order order = jdbcTemplate.queryForObject(
            "SELECT * FROM orders WHERE id = ?",
            orderRowMapper,
            orderId
        );

        // Blocks platform thread ~50ms waiting for external HTTP response
        String enrichment = restTemplate.getForObject(
            "https://inventory-service/api/items/" + order.getItemId(),
            String.class
        );

        order.setEnrichment(enrichment);
        return order;
    }
}

// Before: Controller — Tomcat dispatches a platform thread per HTTP request.
// That thread is held until processOrder() returns (~60ms total).
@RestController
@RequestMapping("/orders")
public class OrderController {

    private final OrderService orderService;

    public OrderController(OrderService orderService) {
        this.orderService = orderService;
    }

    @GetMapping("/{orderId}")
    public ResponseEntity<Order> getOrder(@PathVariable String orderId) {
        return ResponseEntity.ok(orderService.processOrder(orderId));
    }
}

With 200 threads and a 60ms average request latency, theoretical throughput is 200 / 0.060 ≈ 3,300 req/s. In practice, the moment you have 201 concurrent long-running requests, the 201st queues. When one dependency slows down — a saturated DB connection pool, a slow external API — the thread pool drains fast and latency cascades across all endpoints.

⚙️ Enabling Virtual Threads in Spring Boot 3.2

Spring Boot 3.2 wires virtual threads automatically when this property is set and JDK 21+ is detected:

# application.yml — enable virtual threads (Spring Boot 3.2+, JDK 21+ required)
spring:
  threads:
    virtual:
      enabled: true

Under the hood, this replaces Tomcat's fixed thread pool executor with one that creates a new virtual thread per incoming request. There is no thread pool ceiling. The OrderService code is unchanged — the same blocking JDBC call, the same RestTemplate call — they now run on virtual threads. The JVM unmounts the virtual thread during each I/O wait, freeing carrier threads for other requests.

For teams that need explicit programmatic control, the same configuration can be expressed in Java:

// Explicit programmatic virtual thread configuration
@Configuration
public class VirtualThreadConfig {

    // Replaces Tomcat's fixed thread pool with a virtual-thread-per-task executor.
    // Tomcat now creates one virtual thread per incoming HTTP request.
    @Bean
    public TomcatProtocolHandlerCustomizer<?> virtualThreadTomcatCustomizer() {
        return protocolHandler ->
            protocolHandler.setExecutor(
                Executors.newVirtualThreadPerTaskExecutor()  // JDK 21 API
            );
    }

    // Replace the @Async executor so background tasks also run on virtual threads.
    // SimpleAsyncTaskExecutor with setVirtualThreads(true) is equivalent, but the
    // explicit executor gives more visibility in monitoring tools. The bean name
    // matters: @Async falls back to the bean named "taskExecutor" when no unique
    // TaskExecutor bean is present.
    @Bean("taskExecutor")
    public Executor virtualThreadAsyncExecutor() {
        return Executors.newVirtualThreadPerTaskExecutor();
    }
}

The service now handles 50,000 concurrent in-flight requests using the same synchronous, readable code style. No reactive WebFlux rewrite required.
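Outside Spring, the same executor demonstrates the effect directly. The sketch below (class name illustrative) submits 1,000 blocking 100 ms tasks to the JDK 21 virtual-thread-per-task executor; because each sleep parks its virtual thread off the carrier, the batch completes in roughly the time of a single task:

```java
import java.time.Duration;
import java.time.Instant;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class BlockingFanOutDemo {

    // Submits `tasks` simulated I/O calls (100 ms sleeps) to a
    // virtual-thread-per-task executor and returns the wall-clock time.
    public static long elapsedMillis(int tasks) {
        Instant start = Instant.now();
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < tasks; i++) {
                executor.submit(() -> {
                    try {
                        Thread.sleep(100); // simulated blocking I/O
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
        } // close() waits for all submitted tasks to finish
        return Duration.between(start, Instant.now()).toMillis();
    }

    public static void main(String[] args) {
        System.out.println("1,000 blocking tasks took " + elapsedMillis(1_000) + " ms");
    }
}
```

With a fixed pool of 200 platform threads, the same 1,000 tasks would need at least five 100 ms waves.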

🧠 Under the Hood: How the JVM Parks and Resumes Virtual Threads

Understanding the internals helps you reason about when virtual threads help, when they hurt, and why the pinning problem exists.

Virtual Thread Internals: Mounting, Unmounting, and Carrier Threads

When jdbcTemplate.queryForObject(...) sends a SQL query over a socket, the JVM intercepts the blocking I/O call and executes this lifecycle:

  1. The virtual thread calls SocketInputStream.read()
  2. The JVM detects this is a blocking I/O syscall via its internal socket implementation
  3. The virtual thread is unmounted from its carrier thread — its stack is moved to heap memory
  4. The carrier thread is freed to pick up another virtual thread that has CPU work ready
  5. The kernel signals via epoll/kqueue that the socket has data
  6. The JVM scheduler remounts the virtual thread onto any available carrier
  7. Execution resumes from exactly where it paused, with all local variables intact

This lifecycle is transparent to your code. A normal synchronous jdbcTemplate.queryForObject(...) call behaves identically to before, but the underlying thread model has changed fundamentally. The JVM's internal VirtualThread class implements this via continuation — a reified stack snapshot that can be stored and resumed.
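The transparency is easy to verify: the same blocking Runnable runs unchanged on either thread type, and only Thread.currentThread().isVirtual() reveals which model is underneath. A small sketch (names illustrative):

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class ThreadTransparencyDemo {

    // Runs the same blocking task on a virtual or platform thread and
    // reports which kind of thread actually executed it.
    public static boolean ranOnVirtual(boolean useVirtual) {
        AtomicBoolean virtualFlag = new AtomicBoolean();
        Runnable task = () -> {
            try {
                Thread.sleep(10); // parks a virtual thread, blocks a platform thread
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            virtualFlag.set(Thread.currentThread().isVirtual());
        };
        Thread t = useVirtual
                ? Thread.ofVirtual().start(task)
                : Thread.ofPlatform().start(task);
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return virtualFlag.get();
    }

    public static void main(String[] args) {
        System.out.println(ranOnVirtual(true));  // true
        System.out.println(ranOnVirtual(false)); // false
    }
}
```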

Performance Analysis: Throughput and Latency Under High Concurrency

The throughput improvement is not from making individual requests faster — a virtual thread does not execute JDBC faster than a platform thread. The gain is from eliminating the queue that forms when the thread pool is saturated.

| Metric | Platform Threads (pool=200) | Virtual Threads |
| --- | --- | --- |
| Max concurrent I/O requests | 200 | Millions |
| Memory per blocked thread | ~1 MB (stack) | ~300 bytes (heap) |
| p99 latency at 500 concurrent requests | Thread-queue delay | Near baseline |
| p99 latency at 2,000 concurrent requests | Rejections or timeouts | Near baseline |
| CPU-bound throughput | Identical | Identical |

The practical cutover point: if your service handles fewer than 100 concurrent I/O-bound requests and averages under 20ms response time, platform threads are adequate. Above that threshold, virtual threads eliminate the queue and dramatically improve tail latency.

The Pinning Problem: synchronized Blocks That Block the Carrier

There is one critical failure mode that eliminates all virtual thread benefits: a synchronized block pins a virtual thread to its carrier thread for the entire duration of the block.

When a virtual thread is pinned, the JVM cannot unmount it during I/O waits inside the synchronized block. The carrier thread stays occupied, turning your unlimited virtual thread pool into the equivalent of a thread pool sized to your carrier count (typically 8–16 on a modern server).

// DANGEROUS: synchronized + I/O inside the lock = virtual thread pins its carrier
@Service
public class CachedOrderService {

    private final Map<String, Order> cache = new HashMap<>();
    private final JdbcTemplate jdbcTemplate;
    private final RowMapper<Order> orderRowMapper = new BeanPropertyRowMapper<>(Order.class);

    public CachedOrderService(JdbcTemplate jdbcTemplate) {
        this.jdbcTemplate = jdbcTemplate;
    }

    // BAD: synchronized holds the carrier thread for the full JDBC call duration.
    // With 8 CPU cores (8 carrier threads), only 8 concurrent cached lookups
    // can proceed — far worse than the 200-thread platform model.
    public synchronized Order getOrLoad(String orderId) {
        if (cache.containsKey(orderId)) {
            return cache.get(orderId);
        }
        // This JDBC call (10ms+) runs while synchronized holds the carrier thread pinned.
        Order order = jdbcTemplate.queryForObject(
            "SELECT * FROM orders WHERE id = ?",
            orderRowMapper, orderId
        );
        cache.put(orderId, order);
        return order;
    }
}

// CORRECT: ReentrantLock allows the virtual thread to unmount during lock wait.
// When another virtual thread holds the lock, this one parks in heap — not the carrier.
@Service
public class CachedOrderServiceFixed {

    private final Map<String, Order> cache = new HashMap<>();
    private final ReentrantLock lock = new ReentrantLock();
    private final JdbcTemplate jdbcTemplate;
    private final RowMapper<Order> orderRowMapper = new BeanPropertyRowMapper<>(Order.class);

    public CachedOrderServiceFixed(JdbcTemplate jdbcTemplate) {
        this.jdbcTemplate = jdbcTemplate;
    }

    // GOOD: lockInterruptibly() allows the JVM to park the virtual thread if the lock
    // is held — it unmounts from the carrier and frees it for other virtual threads.
    public Order getOrLoad(String orderId) throws InterruptedException {
        lock.lockInterruptibly();
        try {
            if (cache.containsKey(orderId)) {
                return cache.get(orderId);
            }
            Order order = jdbcTemplate.queryForObject(
                "SELECT * FROM orders WHERE id = ?",
                orderRowMapper, orderId
            );
            cache.put(orderId, order);
            return order;
        } finally {
            lock.unlock();
        }
    }
}

Detect pinning before going to production:

java -Djdk.tracePinnedThreads=full -jar your-app.jar

This logs a stack trace every time a virtual thread is pinned, showing exactly which synchronized block is responsible.
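To confirm the flag is wired up correctly before hunting real pinning sites, a minimal known-positive reproducer helps: a virtual thread that sleeps inside a synchronized block. This is a hedged sketch for JDK 21, where monitors still pin (class name illustrative):

```java
public class PinnedThreadRepro {

    private static final Object LOCK = new Object();

    // Run with: java -Djdk.tracePinnedThreads=full PinnedThreadRepro
    // On JDK 21, sleeping inside synchronized pins the virtual thread to
    // its carrier, and the JVM prints a stack trace pointing at this block.
    public static boolean triggerPinning() {
        Thread t = Thread.ofVirtual().start(() -> {
            synchronized (LOCK) {      // monitor held across a blocking call
                try {
                    Thread.sleep(50);  // cannot unmount while the monitor is held
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        });
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return true;                   // completes either way; the trace is the signal
    }

    public static void main(String[] args) {
        System.out.println(triggerPinning());
    }
}
```

Running it with -Djdk.tracePinnedThreads=full should print a stack trace ending at the synchronized block; if it does not, the flag is not reaching the JVM.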

🧭 Spring Migration Decision Guide

| Area | Action | Priority |
| --- | --- | --- |
| Spring Boot version | Upgrade to 3.2+ on JDK 21 | Required |
| Enable virtual threads | Set spring.threads.virtual.enabled=true | Required |
| Thread pool ceiling | Remove or raise server.tomcat.threads.max — no longer the bottleneck | Recommended |
| synchronized blocks | Replace with ReentrantLock wherever blocking I/O sits inside the lock | Critical |
| JDBC / HikariCP | Upgrade to HikariCP 5.1+ — internal synchronized replaced with StampedLock | Required |
| @Async executor | Configure @Async to use newVirtualThreadPerTaskExecutor() | Recommended |
| CPU-bound work | Keep using a bounded ExecutorService or ForkJoinPool | Required |
| WebFlux | Do not mix WebFlux reactive pipelines with virtual threads — choose one model | Critical |
| ThreadLocal | Works correctly — no changes needed | None |
| @Scheduled tasks | Work correctly with virtual threads | None |
| Pinning diagnostics | Run -Djdk.tracePinnedThreads=full in staging before launch | Recommended |

⚖️ Trade-offs: I/O-Bound Gains vs CPU-Bound Limits

Virtual threads are a solution for I/O-bound concurrency. For CPU-bound work — image processing, PDF generation, JSON serialization of large objects, cryptographic operations — the virtual thread holds its carrier thread active for the full duration. No mounting/unmounting occurs because there is no I/O pause. You get identical throughput to platform threads with added scheduler overhead.

For CPU-intensive workloads, a bounded ExecutorService sized to CPU cores prevents oversubscription:

@Service
public class ReportGenerationService {

    // Bounded pool for CPU-bound work: sized to CPU cores to prevent oversubscription.
    // Virtual threads would not help here and would compete for carrier threads.
    private final ExecutorService cpuPool = Executors.newFixedThreadPool(
        Runtime.getRuntime().availableProcessors()
    );

    public CompletableFuture<byte[]> generatePdfAsync(ReportRequest request) {
        return CompletableFuture.supplyAsync(
            () -> renderPdf(request),   // CPU-intensive: iText PDF rendering
            cpuPool                      // explicit pool — not virtual thread executor
        );
    }

    private byte[] renderPdf(ReportRequest request) {
        // Heavy computation — no I/O — virtual threads give zero benefit
        return PdfRenderer.render(request.getTemplate(), request.getData());
    }
}

A good rule of thumb: if replacing your endpoint's downstream calls with Thread.sleep(50) reproduces its latency, the work is I/O-bound and virtual threads will help. If stripping out all I/O still saturates the CPU, the work is compute-bound — stick with a bounded pool.

🌍 Where Teams Are Deploying Virtual Threads in Production

Virtual threads have seen real-world adoption across industries since JDK 21's release in September 2023.

API gateway and microservice backends are the primary adoption area. Services that fan out to multiple downstream dependencies — inventory, pricing, enrichment — benefit most because each downstream call blocks independently. A service making 5 downstream calls in parallel goes from holding 5 platform threads per request to consuming almost no carrier-thread time while waiting.

High-concurrency REST APIs — the kind that back code-hosting platforms, where each endpoint involves multiple DB reads — are another strong fit. Teams report that enabling virtual threads in Spring Boot 3.2 cuts p99 latency spikes during traffic bursts by eliminating the thread-pool queue.

Batch-parallel processing in Spring Batch is another common use case. When a batch job processes records by making HTTP calls per record, virtual threads allow a single node to process thousands of records in-flight simultaneously without memory pressure from thread stacks.

Caution: reporting and analytics services that run heavy aggregation queries do not benefit. A 5-second SQL GROUP BY across millions of rows still takes 5 seconds: the virtual thread parks cheaply while it waits, but the database is doing the work. Virtual threads reduce the JVM-side cost of waiting; they do nothing for DB-side processing time.

| Service Type | Virtual Thread Benefit | Reason |
| --- | --- | --- |
| REST API with multiple downstream calls | High | Parallel I/O waits park independently |
| JDBC-heavy CRUD services | High | Per-query blocking eliminated |
| Message consumer (Kafka, SQS) | High | Poll wait parks without blocking OS threads |
| CPU-intensive calculation service | None | No I/O to park on |
| Reactive WebFlux service | None | Already non-blocking at the model level |
| Legacy JDBC with synchronized cache | Risk | Pinning can degrade performance below baseline |

🛠️ Spring Framework: How It Integrates Project Loom

Spring Framework 6.1 and Spring Boot 3.2 were designed in tandem with Project Loom (JEP 444, finalized in JDK 21). When spring.threads.virtual.enabled=true is set, Spring Boot's autoconfiguration wires virtual thread executors across all embedded server adapters:

  • Tomcat → a virtual-thread-per-request executor replaces the fixed worker pool
  • Jetty → virtual thread execution replaces the standard QueuedThreadPool
  • Undertow → virtual thread executor for request dispatch
  • @Async → SimpleAsyncTaskExecutor configured with setVirtualThreads(true)
  • Spring Security → SecurityContextHolder propagation works unchanged via virtual-thread-compatible ThreadLocal
  • Spring Data JPA / JDBC → no changes required; blocking calls park correctly at the JDBC socket layer

HikariCP 5.1.0 replaced its internal synchronized blocks with StampedLock, eliminating the main JDBC-level pinning risk. If you are on an older HikariCP version, this is the single most important dependency upgrade for virtual thread correctness.

Spring's SimpleAsyncTaskExecutor (the default for @Async when virtual threads are enabled) creates one virtual thread per submitted task with no pooling — which is the correct pattern since virtual threads are cheap to create and pool management adds overhead without benefit.

For a full reference on Project Loom's design, see JEP 444: Virtual Threads and the Spring Framework 6.1 release notes.

📚 Lessons Learned from Production Virtual Thread Migrations

  • One property, measurable impact. Enabling virtual threads in a Spring Boot 3.2 service handling 5,000 concurrent users dropped p99 latency by 40% with zero code change in the business logic. The thread pool was the ceiling — not the database or application logic.

  • HikariCP version is non-negotiable. Teams that upgraded to virtual threads without upgrading HikariCP saw worse performance because HikariCP's old synchronized connection acquisition code pinned carrier threads. Always upgrade to HikariCP 5.1+ first.

  • DB connection pool size still matters. Virtual threads eliminate the JVM thread ceiling, but they do not eliminate the database connection limit. With millions of virtual threads able to attempt DB calls concurrently, a connection pool of 20 becomes the new bottleneck. Right-size your connection pool based on DB server capacity, not JVM thread limits.

  • Naming virtual threads aids debugging. Anonymous virtual threads make thread dumps hard to read. Use Thread.ofVirtual().name("order-processor-", 0).factory() to produce readable names like order-processor-0, order-processor-1 in monitoring tools.

  • Do not migrate WebFlux services. If a service is already on Spring WebFlux, virtual threads offer no gain — both models solve the same I/O-blocking problem differently. Migrating a reactive codebase to virtual threads introduces synchronous blocking patterns without benefit.
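The connection-pool and thread-naming lessons combine naturally in plain JDK 21 code. In this hedged sketch (class and names are illustrative), a Semaphore sized to database capacity bounds in-flight queries — virtual threads waiting for a permit park in heap — while a named virtual-thread factory keeps thread dumps readable:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

public class BoundedDbAccessDemo {

    // Max simultaneous "queries" — sized to DB capacity, not thread count.
    private static final int DB_LIMIT = 20;
    private static final Semaphore DB_PERMITS = new Semaphore(DB_LIMIT);

    // Runs `tasks` virtual threads that each simulate a DB call, and
    // returns the highest number of calls ever in flight at once.
    public static int maxObservedConcurrency(int tasks) {
        AtomicInteger inFlight = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        // Produces readable names: order-db-0, order-db-1, ...
        ThreadFactory factory = Thread.ofVirtual().name("order-db-", 0).factory();
        try (ExecutorService executor = Executors.newThreadPerTaskExecutor(factory)) {
            for (int i = 0; i < tasks; i++) {
                executor.submit(() -> {
                    try {
                        DB_PERMITS.acquire();   // waiting here parks the virtual thread
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                        return;
                    }
                    try {
                        int now = inFlight.incrementAndGet();
                        peak.accumulateAndGet(now, Math::max);
                        Thread.sleep(5);        // simulated query
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    } finally {
                        inFlight.decrementAndGet();
                        DB_PERMITS.release();
                    }
                });
            }
        } // close() waits for all tasks
        return peak.get();
    }

    public static void main(String[] args) {
        System.out.println(maxObservedConcurrency(500)); // never exceeds 20
    }
}
```

This is the virtual-thread replacement for sizing a thread pool: concurrency is limited where the real constraint lives (the database), not at the thread layer.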

📌 TLDR & Key Takeaways

  • Platform threads are OS-level resources: ~1 MB stack, ~200–1,000 max before throughput degrades. Every blocking I/O call holds one.
  • Virtual threads (JDK 21) are JVM-managed: ~300 bytes per suspended thread, millions can coexist. I/O waits unmount them from OS threads.
  • Spring Boot 3.2 wires virtual threads across Tomcat, Jetty, Undertow, and @Async with a single property: spring.threads.virtual.enabled=true.
  • synchronized + I/O = pinning. Replace with java.util.concurrent locks (ReentrantLock, StampedLock) wherever blocking I/O is inside a synchronized block.
  • CPU-bound work does not benefit from virtual threads. Use bounded ExecutorService sized to CPU cores for heavy computation.
  • HikariCP 5.1+ is required for JDBC virtual thread correctness. Upgrade it before enabling virtual threads.
  • Run -Djdk.tracePinnedThreads=full in staging to find and fix every pinning site before production.

📝 Practice Quiz

  1. Your Spring Boot 3.2 service handles 10,000 concurrent JDBC queries well after enabling virtual threads, but every request hitting CacheService.getOrLoad() times out under load. What is the most likely cause?

     a) The thread pool is too small
     b) A synchronized block around a JDBC call in CacheService is pinning carrier threads
     c) JDK 21 does not support JDBC
     d) The HikariCP connection pool is exhausted

  Correct Answer: b — A synchronized block wrapping JDBC code pins the virtual thread to its carrier for the full I/O wait, collapsing effective concurrency to the carrier thread count (typically 8–16).

  2. You enable virtual threads and benchmark a CPU-intensive PDF generation endpoint. Performance is identical to the platform thread baseline. Why?

     a) Virtual threads are not enabled for @RestController endpoints
     b) CPU-bound work holds the carrier thread active with no I/O parking — virtual threads provide no throughput gain
     c) Spring Boot 3.2 requires WebFlux for virtual thread benefits
     d) Virtual threads are only beneficial when the thread pool is at maximum capacity

  Correct Answer: b — Virtual threads only yield scalability gains during I/O waits. CPU-bound work occupies the carrier thread the entire time, making virtual threads equivalent to platform threads.

  3. What is the correct replacement for a synchronized method that contains a blocking JDBC call?

     a) Add @VirtualThreadSafe to the method
     b) Remove the lock entirely since virtual threads handle concurrency automatically
     c) Use ReentrantLock.lockInterruptibly() so the virtual thread can park during lock acquisition instead of pinning
     d) Wrap the method body in CompletableFuture.runAsync()

  Correct Answer: c — ReentrantLock (and all java.util.concurrent locks) are virtual-thread-friendly; the JVM can unmount a waiting virtual thread during lock acquisition, unlike synchronized.

  4. After enabling virtual threads, a DB connection pool of size 20 becomes the new bottleneck at high concurrency. What is the correct resolution?

     a) Disable virtual threads — they are incompatible with HikariCP
     b) Increase the connection pool based on DB server capacity, not JVM thread limits
     c) Use synchronized to serialize all DB access through a single connection
     d) Add more heap memory to the JVM

  Correct Answer: b — Virtual threads remove the JVM thread ceiling but not the DB connection ceiling. Size HikariCP based on how many concurrent queries the database can actually sustain.

  5. Open-ended challenge: A microservice uses a third-party client library that internally uses synchronized blocks extensively for connection management. You cannot modify the library source. How would you safely adopt virtual threads without incurring pinning penalties, and what diagnostic steps would you take to confirm the approach is working?