Virtual Threads Did Not Kill WebFlux

Every few months, someone posts a benchmark showing that virtual threads handle 100,000 concurrent requests with simple imperative code, and concludes that WebFlux is obsolete. The comments fill with “finally, no more reactive spaghetti.” Managers forward the article to their teams: “Can we rewrite this in virtual threads?”
No. Well — it depends on what “this” does. But probably no.
Virtual threads and WebFlux solve different problems. The confusion comes from the fact that they both appear in the sentence “handle many concurrent requests efficiently.” But that’s like saying a hammer and a screwdriver are competitors because they both “join things together.”
What Virtual Threads Actually Solve
In the traditional thread-per-request model, every request occupies an OS (platform) thread for its entire duration. OS threads are expensive — each reserves a stack of roughly 1 MB by default, and the OS scheduler manages them. A typical server maxes out at a few thousand concurrent threads. If your request blocks on I/O (a database query, an HTTP call, a file read), that OS thread sits idle, doing nothing, but still consuming memory and a scheduler slot.
Virtual threads fix this. They are lightweight threads managed by the JVM, not the OS. When a virtual thread blocks on I/O, the JVM unmounts it from the carrier (OS) thread and mounts another virtual thread. The carrier thread is never idle. You can have millions of virtual threads with minimal memory overhead.
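A few lines of plain Java (JDK 21+) make this concrete. The sketch below is a minimal standalone example: it runs 10,000 blocking tasks on virtual threads, where a platform-thread pool of the same size would reserve on the order of 10 GB of stack space at the default ~1 MB per thread.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

public class VirtualThreadDemo {
    public static void main(String[] args) {
        AtomicInteger completed = new AtomicInteger();
        // One virtual thread per task; the JVM multiplexes them
        // onto a small pool of carrier (OS) threads.
        try (var executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < 10_000; i++) {
                executor.submit(() -> {
                    try {
                        Thread.sleep(100); // blocking call: the virtual thread
                                           // unmounts, the carrier stays busy
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                    completed.incrementAndGet();
                });
            }
        } // close() waits for all submitted tasks to finish
        System.out.println(completed.get());
    }
}
```

All 10,000 tasks sleep concurrently; the program finishes in roughly the duration of one sleep, not ten thousand of them.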
// Before: limited by OS thread pool size
@GetMapping("/order/{id}")
public Order getOrder(@PathVariable String id) {
    Order order = orderRepository.findById(id); // blocks OS thread
    Customer customer = customerClient.fetch(order.customerId()); // blocks OS thread again
    return enrichOrder(order, customer);
}

// After: same code, but on virtual threads — no OS thread wasted
// Just configure: spring.threads.virtual.enabled=true

That's the pitch: change one configuration flag, and your existing blocking code scales to thousands of concurrent requests. No code changes. No reactive rewrite. No Mono or Flux. The same imperative style you've always written.
And it works. For this specific problem — thread exhaustion from blocking I/O — virtual threads are the correct solution. If your app is a typical CRUD service that calls a database via JDBC and maybe one or two HTTP services, virtual threads are likely all you need.
So where does WebFlux fit?
What WebFlux Actually Solves
WebFlux is not “a way to handle many concurrent requests.” That’s a side effect, not the point. WebFlux is a reactive programming model built on Reactive Streams — a specification for asynchronous stream processing with backpressure.
The key concepts that virtual threads don’t provide:
1. Backpressure
Backpressure is a flow-control mechanism where the consumer tells the producer how much data it can handle. This matters when data flows are asymmetric — a fast producer and a slow consumer.
// WebFlux: the database emits rows only as fast as the HTTP response can flush them
@GetMapping(value = "/orders/stream", produces = MediaType.APPLICATION_NDJSON_VALUE)
public Flux<Order> streamOrders() {
    return orderRepository.findAllAsStream() // R2DBC reactive query
        .delayElements(Duration.ofMillis(10)); // simulate slow consumer
    // If the client reads slowly, the database query slows down too.
    // No buffering explosion. No OOM. Backpressure propagates upstream.
}

// Virtual threads: no backpressure
@GetMapping("/orders/stream")
public List<Order> getAllOrders() {
    return orderRepository.findAll(); // loads EVERYTHING into memory
    // 10 million rows? Hope you have heap space.
}

With virtual threads, you can make the blocking call without exhausting OS threads — but the call itself still loads all data into memory before returning. There's no mechanism to signal "slow down, I'm not ready for more." You could add pagination, but that's you manually implementing what backpressure gives you for free.
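That hand-rolled pagination looks something like this — a self-contained sketch where `fetchPage` is a stand-in for a hypothetical offset/limit repository query over 1,234 rows, processing one bounded page at a time instead of letting backpressure regulate the flow:

```java
import java.util.List;
import java.util.stream.IntStream;

public class ManualPagination {
    // Simulated data source standing in for a paged repository query
    static List<Integer> fetchPage(int offset, int limit, int total) {
        int end = Math.min(offset + limit, total);
        return IntStream.range(offset, end).boxed().toList();
    }

    public static void main(String[] args) {
        int total = 1_234, pageSize = 500, offset = 0, processed = 0;
        List<Integer> page;
        do {
            page = fetchPage(offset, pageSize, total);
            processed += page.size(); // only one bounded page in memory at a time
            offset += pageSize;
        } while (page.size() == pageSize); // a short page means we reached the end
        System.out.println(processed);
    }
}
```

It works, but the page size, the loop, and the termination condition are all your responsibility — exactly the flow control a reactive pipeline handles for you.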
2. Streaming and Real-Time Data
Some use cases are inherently streaming: Server-Sent Events, WebSocket connections, real-time dashboards, change data capture, event bus consumption. These aren’t request-response — they’re long-lived data flows.
// SSE stream: push order status updates to the client as they happen
@GetMapping(value = "/orders/{id}/events", produces = MediaType.TEXT_EVENT_STREAM_VALUE)
public Flux<ServerSentEvent<OrderEvent>> orderEvents(@PathVariable String id) {
    return orderEventService.subscribe(id)
        .map(event -> ServerSentEvent.<OrderEvent>builder()
            .data(event)
            .event(event.type().name())
            .build());
}

You could implement this with virtual threads — spawn a thread, loop, write to the output stream. But you'd be reinventing the reactive stream abstraction, badly. Flux gives you composition, error handling, timeout, retry, and cancellation out of the box.
3. Composition of Asynchronous Flows
Reactive code excels at composing complex async pipelines — fan-out, fan-in, merge, zip, retry with backoff, timeout, fallback:
// Fetch order, enrich with customer and inventory data in parallel,
// with timeout and fallback
public Mono<EnrichedOrder> getEnrichedOrder(String orderId) {
    Mono<Order> order = orderService.findById(orderId);
    return order.flatMap(o -> {
        Mono<Customer> customer = customerClient.fetch(o.customerId())
            .timeout(Duration.ofSeconds(2))
            .onErrorResume(e -> Mono.just(Customer.unknown()));
        Mono<Inventory> inventory = inventoryClient.check(o.productId())
            .timeout(Duration.ofSeconds(2))
            .onErrorResume(e -> Mono.just(Inventory.unavailable()));
        return Mono.zip(customer, inventory)
            .map(tuple -> EnrichedOrder.from(o, tuple.getT1(), tuple.getT2()));
    });
}

"But I can do the same with virtual threads and StructuredTaskScope!" — Yes, you can:
// Virtual threads with structured concurrency (Java 21+ preview)
public EnrichedOrder getEnrichedOrder(String orderId)
        throws InterruptedException, ExecutionException {
    Order order = orderService.findById(orderId);
    try (var scope = new StructuredTaskScope.ShutdownOnFailure()) {
        Subtask<Customer> customer = scope.fork(() ->
            withTimeout(() -> customerClient.fetch(order.customerId()), 2));
        Subtask<Inventory> inventory = scope.fork(() ->
            withTimeout(() -> inventoryClient.check(order.productId()), 2));
        scope.join().throwIfFailed();
        return EnrichedOrder.from(order, customer.get(), inventory.get());
    }
}

This works for fan-out/fan-in. But add retry-with-backoff, conditional branching based on intermediate results, merging multiple event streams, or chaining fallback strategies — and the imperative code becomes the spaghetti. Reactive operators exist because these patterns are common in event-driven systems, and encoding them imperatively every time is error-prone.
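To make that concrete, here is what just one of those operators — retry with exponential backoff, which Reactor expresses as `retryWhen(Retry.backoff(maxAttempts, firstBackoff))` — costs when hand-rolled imperatively. The `retryWithBackoff` helper and the simulated flaky call below are illustrative, not from any library:

```java
import java.time.Duration;
import java.util.concurrent.Callable;

public class RetryBackoff {
    // Hand-rolled retry with exponential backoff: the imperative
    // equivalent of a single reactive operator.
    static <T> T retryWithBackoff(Callable<T> call, int maxAttempts,
                                  Duration firstBackoff) throws Exception {
        Duration backoff = firstBackoff;
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return call.call();
            } catch (Exception e) {
                last = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(backoff.toMillis()); // cheap on a virtual thread
                    backoff = backoff.multipliedBy(2); // double the wait each time
                }
            }
        }
        throw last; // all attempts exhausted
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // Simulated flaky dependency: fails twice, then succeeds
        String result = retryWithBackoff(() -> {
            if (++calls[0] < 3) throw new RuntimeException("transient failure");
            return "ok";
        }, 5, Duration.ofMillis(10));
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

Roughly twenty lines for one pattern, and it still lacks jitter, a backoff cap, and exception filtering — each of which is a parameter on Reactor's `Retry.backoff` spec. Now imagine composing it with fallbacks and stream merging.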
4. Context Propagation
Every request carries context — user ID, trace ID, tenant, locale. In traditional servlet code, you store it in a ThreadLocal and read it anywhere in the call chain. This works because one thread handles one request from start to finish.
Virtual threads preserve this model. One virtual thread still handles one request, so ThreadLocal technically works. But with millions of virtual threads, inheriting thread-local copies becomes expensive. Java’s answer is ScopedValue — immutable, structured, and cheap:
// ScopedValue — the virtual-thread-native way to propagate context
private static final ScopedValue<RequestContext> REQUEST_CTX = ScopedValue.newInstance();

public void handleRequest(HttpServletRequest req) {
    var ctx = RequestContext.from(req); // user ID, trace ID, tenant
    ScopedValue.where(REQUEST_CTX, ctx).run(() -> {
        // Every method in this call tree — including across forked virtual threads
        // in StructuredTaskScope — can read REQUEST_CTX.get()
        orderService.process(req.getParameter("orderId"));
    });
}

// Deep in the call stack, no parameter passing needed:
public void auditLog(String action) {
    var ctx = REQUEST_CTX.get(); // implicitly available
    log.info("Action: {}, user: {}, trace: {}", action, ctx.userId(), ctx.traceId());
}

ScopedValue is immutable (no accidental mutation), bounded to the scope (automatically cleaned up), and inherited by child threads in a StructuredTaskScope. It's the clean replacement for ThreadLocal in the virtual thread world.
But in WebFlux, there is no “one thread per request.” A single request hops across multiple event-loop threads as it passes through operators. ThreadLocal is useless — by the time you read it, you might be on a different thread. ScopedValue doesn’t help either, because there’s no single thread scope to bind to.
Reactor’s answer is Context — an immutable map that flows through the reactive chain, attached to the subscription rather than the thread:
// Reactor Context — the reactive way to propagate context
public Mono<Order> processOrder(String orderId) {
    return orderService.findById(orderId)
        .flatMap(order -> auditLog("process", order).thenReturn(order))
        .contextWrite(ctx -> ctx.put("userId", currentUserId())
            .put("traceId", TraceId.generate()));
}

// Anywhere in the reactive chain:
public Mono<Void> auditLog(String action, Order order) {
    return Mono.deferContextual(ctx -> {
        log.info("Action: {}, user: {}, trace: {}",
            action, ctx.get("userId"), ctx.get("traceId"));
        return Mono.empty();
    });
}

Different problem, different solution. ScopedValue is elegant — but only when you have a thread to scope to. Reactor Context is more explicit — but it works across async boundaries that no thread-based mechanism can follow.
The Real Comparison
Let’s be precise about what each technology gives you:
| Capability | Virtual Threads | WebFlux / Reactor |
|---|---|---|
| Scale blocking I/O without thread exhaustion | Yes | N/A (doesn’t use blocking I/O) |
| Non-blocking I/O end-to-end | No (wraps blocking calls) | Yes (R2DBC, WebClient, etc.) |
| Backpressure | No | Yes (Reactive Streams spec) |
| Streaming responses (SSE, WebSocket) | Manual | Native (Flux<T>) |
| Composable async operators (retry, zip, merge) | Limited (StructuredTaskScope) | Rich (100+ operators) |
| Context propagation | ScopedValue (implicit, thread-bound) | Reactor Context (explicit, subscription-bound) |
| Code readability for simple CRUD | Imperative (familiar) | Reactive (learning curve) |
| Debugging / stack traces | Full stack traces | Fragmented (improving with context propagation) |
| Ecosystem maturity | JDBC, RestTemplate, all blocking libs | R2DBC, WebClient, reactive drivers |
| Integration with event-driven architectures | Possible but manual | Native fit |
The “Just Use Virtual Threads” Fallacy
The argument usually goes: “Virtual threads give us the concurrency of reactive with the readability of imperative. Why would anyone choose the harder programming model?”
Because concurrency is not the only axis.
Scenario 1: You’re building a REST API that reads from PostgreSQL via JPA and calls two downstream services.
Use virtual threads. You get massive concurrency with zero code changes. WebFlux would force you to rewrite your JPA repositories to R2DBC, replace RestTemplate with WebClient, and learn a new programming model — all for the same end result: handling more concurrent requests.
Scenario 2: You’re building an event-driven order processing pipeline that consumes from Kafka, enriches events from multiple sources with different latencies, writes to a reactive database, and pushes real-time updates to connected clients via SSE.
Use WebFlux. The entire data flow is a stream. Backpressure from the SSE connection propagates through enrichment all the way back to the Kafka consumer. If the client disconnects, the pipeline stops consuming. Virtual threads would handle the concurrency, but you’d have to build the streaming, backpressure, and composition yourself — which is just reimplementing Reactor, poorly.
Scenario 3: You have a Spring WebFlux application in production that’s working well.
Keep it. “Virtual threads exist” is not a reason to rewrite. The reactive code is already written, tested, and debugged. The alleged readability benefit of imperative code doesn’t apply — you’d be trading debugged reactive code for new imperative code that needs testing from scratch.
What Managers Get Wrong
The “WebFlux is dead” narrative appeals to managers because it promises simplification: one model instead of two, no reactive learning curve, easier hiring. I get it. Reactive code is harder to write, harder to debug, and harder to hire for.
But the decision isn’t “reactive or virtual threads.” It’s “what problem are you solving?”
If the answer is “our thread pool runs out under load” — virtual threads. Done.
If the answer is “we need streaming, backpressure, or event-driven composition” — WebFlux. Virtual threads don’t solve this.
If the answer is “we heard virtual threads are the future” — that’s not a problem statement. Go back and define the actual technical requirement.
The worst outcome is rewriting a working WebFlux application to virtual threads because a conference talk said reactive is dead, and then discovering six months later that you need SSE streaming or backpressure-aware data pipelines — and reimplementing the reactive patterns you just deleted.
They Coexist
Spring Boot 3.2+ supports both models in the same application. You can have:
- MVC controllers on virtual threads for traditional request-response endpoints
- WebFlux controllers for streaming and reactive endpoints
# application.yml
spring:
  threads:
    virtual:
      enabled: true  # MVC controllers use virtual threads
# WebFlux endpoints continue to use the reactive event loop

This isn't a compromise — it's using each tool where it fits. Your CRUD endpoints get the readability of imperative code with virtual thread scalability. Your streaming endpoints get native reactive composition with backpressure. Same application, same deployment.
The Decision Framework
Before choosing, ask these questions:
1. Do you need backpressure? If data flows are asymmetric (fast producer, slow consumer), you need reactive. Virtual threads don't provide flow control.
2. Do you need streaming? SSE, WebSocket, real-time feeds — these are reactive by nature. You can implement them imperatively, but you'll be fighting the abstraction.
3. Is your data access blocking? If you're on JDBC/JPA and not planning to change, virtual threads give you concurrency without a rewrite. R2DBC is not required.
4. Is your existing codebase reactive? Keep it. A rewrite to virtual threads is a rewrite — with all the risk that implies. It's not a simplification; it's a replacement.
5. Are you building event-driven / CQRS systems? Reactive fits naturally. Commands produce events, projections consume streams, and the entire pipeline benefits from backpressure and composition operators.
If you answered “no” to all of 1, 2, and 5 — virtual threads are likely the simpler choice. If you answered “yes” to any — WebFlux earns its complexity.
The Bottom Line
Virtual threads are a fantastic addition to Java. They eliminate the accidental complexity of thread pool management for blocking I/O. For a large category of applications — CRUD APIs, batch processors, traditional request-response services — they’re the right default.
But they didn’t make reactive programming obsolete. They made the concurrency argument for reactive obsolete. The other arguments — backpressure, streaming, event-driven composition — are exactly as valid as they were before Java 21.
The next time someone tells you “WebFlux is dead,” ask them how they’d implement backpressure with virtual threads. The silence that follows is your answer.


