
Solving N+1 with Data Loaders and Field Resolvers

Part 2 of 7 in the “Production GraphQL with Netflix DGS” series


GraphQL gives clients the power to ask for exactly the data they need. That’s its strength — and its trap. Without careful backend design, a single query can trigger hundreds of database calls. This article shows how DGS data loaders and field resolvers prevent that from happening.

The N+1 Problem, Visualized

Consider a simple query that fetches orders with their products:

GRAPHQL
query {
    orders(pageNumber: 0, pageSize: 20) {
        items {
            id
            status
            product {    # ← This field is the problem
                name
                price
            }
        }
    }
}

Without optimization, here’s what happens on the backend:

Mermaid
graph TD
    Q["Client Query"] --> O["1 query → fetch 20 orders"]
    O --> P1["query → product for order #1"]
    O --> P2["query → product for order #2"]
    O --> P3["query → product for order #3"]
    O --> PD["..."]
    O --> P20["query → product for order #20"]
    style Q fill:#4a9eff,stroke:#2171c7,color:#fff
    style O fill:#00bfa5,stroke:#00897b,color:#fff
    style P1 fill:#ff7043,stroke:#e64a19,color:#fff
    style P2 fill:#ff7043,stroke:#e64a19,color:#fff
    style P3 fill:#ff7043,stroke:#e64a19,color:#fff
    style PD fill:#ff7043,stroke:#e64a19,color:#fff
    style P20 fill:#ff7043,stroke:#e64a19,color:#fff

21 total database queries for a single GraphQL request.

This is the N+1 problem: 1 query for the parent list, plus one query for each of its N children. At 20 items, it’s tolerable. At 200 items with 3 nested fields each, you’re looking at 601 queries. Your database team will not be happy.
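The arithmetic generalizes: with F resolved child fields per item, a page of N items costs 1 + N × F queries. A quick sketch (the helper name is illustrative, not part of any API):

```java
// Illustrative helper: 1 parent query plus one query
// per item per resolved child field.
public class NPlusOneMath {
    static int queryCount(int items, int nestedFields) {
        return 1 + items * nestedFields;
    }

    public static void main(String[] args) {
        System.out.println(queryCount(20, 1));   // the query above: 21
        System.out.println(queryCount(200, 3));  // 601
    }
}
```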

Field Resolvers: Lazy Loading with @DgsData

The first piece of the puzzle is the field resolver. Instead of eagerly loading every field when the parent is fetched, DGS lets you resolve fields on demand — only when the client actually requests them.

Java
@DgsComponent
@RequiredArgsConstructor
public class OrderProductResolver {

    private final ProductService productService;

    @DgsData(parentType = "Order", field = "product")
    public Mono<Product> resolveProduct(DgsDataFetchingEnvironment dfe) {
        // DGS gives us the parent object
        Order order = dfe.getSource();
        String productId = order.getProductId();

        if (productId == null) {
            return Mono.empty();
        }

        return productService.findById(productId);
    }
}

Key concepts:

  • @DgsData(parentType = "Order", field = "product") — tells DGS “when someone queries Order.product, call this method.”
  • DgsDataFetchingEnvironment — provides access to the parent object (dfe.getSource()), the current field’s arguments, and — critically — data loaders.
  • Lazy evaluation — if the client doesn’t request the product field, this method never executes. No wasted queries.

This solves the “unnecessary work” problem, but not the N+1 problem. Each order still triggers its own product query. That’s where data loaders come in.

Data Loaders: Batching with MappedBatchLoader

A data loader collects individual load requests within a single GraphQL execution and combines them into one batched query. Instead of 20 individual findById calls, you get one findByIds call.

DGS supports two types:

  • BatchLoader<K, V> — returns a List<V> in the same order as the input keys
  • MappedBatchLoader<K, V> — returns a Map<K, V> where keys match the input

MappedBatchLoader is almost always the better choice. It handles missing entries gracefully (the key simply isn’t in the map) and doesn’t require you to maintain order between input and output. Here’s a real-world pattern:

Java
@DgsDataLoader(name = "products")
@RequiredArgsConstructor
public class ProductDataLoader implements MappedBatchLoader<String, Product> {

    private final ProductRepository productRepository;

    @Override
    public CompletionStage<Map<String, Product>> load(Set<String> productIds) {
        return productRepository.findByIds(new ArrayList<>(productIds))
                .collectList()
                .map(products -> products.stream()
                        .collect(Collectors.toMap(Product::getId, Function.identity())))
                .toFuture();
    }
}

Let’s break this down:

  1. @DgsDataLoader(name = "products") — registers this loader with DGS under the name "products". Field resolvers reference this name to use the loader.
  2. MappedBatchLoader<String, Product> — the key type is String (product ID), the value type is Product.
  3. load(Set<String> productIds) — DGS calls this once per GraphQL execution, with all the product IDs collected from every field resolver that used this loader.
  4. Returns CompletionStage<Map<String, Product>> — the map associates each ID with its loaded product. If a product isn’t found, it’s simply absent from the map (the client gets null for that field).

The .toFuture() call converts a reactive Mono to the CompletionStage that the DataLoader API expects. This is the standard bridge between reactive internals and the DataLoader contract.
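Outside a DGS project, the same contract can be sketched with plain CompletableFuture; the key point is that the returned map simply omits IDs that weren’t found. Everything here (the Product record, the fake findByIds pretending "p2" was deleted) is illustrative stand-in code:

```java
import java.util.*;
import java.util.concurrent.*;
import java.util.function.Function;
import java.util.stream.Collectors;

public class MapContractSketch {
    record Product(String id, String name) {}

    // Stand-in for a repository call: pretends "p2" was deleted.
    static CompletableFuture<List<Product>> findByIds(Set<String> ids) {
        List<Product> found = ids.stream()
                .filter(id -> !id.equals("p2"))
                .map(id -> new Product(id, "name-" + id))
                .toList();
        return CompletableFuture.completedFuture(found);
    }

    // Same shape as the MappedBatchLoader contract:
    // missing IDs are simply absent from the returned map.
    static CompletionStage<Map<String, Product>> load(Set<String> productIds) {
        return findByIds(productIds).thenApply(products -> products.stream()
                .collect(Collectors.toMap(Product::id, Function.identity())));
    }

    public static void main(String[] args) {
        Map<String, Product> result =
                load(Set.of("p1", "p2", "p3")).toCompletableFuture().join();
        System.out.println(result.containsKey("p1")); // true
        System.out.println(result.containsKey("p2")); // false — absent, not null
    }
}
```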

Wiring It Together

Now the field resolver uses the data loader instead of calling the service directly:

Java
@DgsData(parentType = "Order", field = "product")
public Mono<Product> resolveProduct(DgsDataFetchingEnvironment dfe) {
    Order order = dfe.getSource();
    String productId = order.getProductId();

    if (productId == null || productId.isBlank()) {
        return Mono.empty();
    }

    // Get the data loader by its registered name
    DataLoader<String, Product> productLoader = dfe.getDataLoader("products");

    // Queue this ID for batching — doesn't execute immediately
    return Mono.fromCompletionStage(productLoader.load(productId));
}

The magic happens in productLoader.load(productId). This doesn’t trigger a query immediately. Instead, it queues the product ID. After all field resolvers for the current execution level have been called, DGS dispatches the queued IDs to the ProductDataLoader.load() method in a single batch.
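The queue-then-dispatch mechanic can be illustrated with a toy loader in plain Java. This is a sketch of the idea only, not the real DataLoader, which also handles caching, error propagation, and scheduling:

```java
import java.util.*;
import java.util.concurrent.*;
import java.util.function.Function;

// Toy sketch of queue-then-dispatch: load() only queues a key;
// dispatch() resolves everything queued so far with one batch call.
public class MiniLoader<K, V> {
    private final Function<Set<K>, Map<K, V>> batchFn;
    private final Map<K, CompletableFuture<V>> pending = new LinkedHashMap<>();
    int batchCalls = 0;

    MiniLoader(Function<Set<K>, Map<K, V>> batchFn) { this.batchFn = batchFn; }

    // Queues the key; nothing executes yet.
    CompletableFuture<V> load(K key) {
        return pending.computeIfAbsent(key, k -> new CompletableFuture<>());
    }

    // Fires one batch for all queued keys.
    void dispatch() {
        batchCalls++;
        Map<K, V> results = batchFn.apply(pending.keySet());
        pending.forEach((k, f) -> f.complete(results.get(k)));
        pending.clear();
    }

    public static void main(String[] args) {
        MiniLoader<String, String> loader = new MiniLoader<>(keys -> {
            Map<String, String> m = new HashMap<>();
            keys.forEach(k -> m.put(k, "product-" + k));
            return m;
        });
        var f1 = loader.load("1");
        var f2 = loader.load("2");
        loader.dispatch();
        System.out.println(f1.join());         // product-1
        System.out.println(loader.batchCalls); // 1 — one batch served both keys
    }
}
```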

The result:

Mermaid
graph TD
    Q["Client Query"] --> O["1 query → fetch 20 orders"]
    O --> DL["DataLoader collects all 20 product IDs"]
    DL --> B["1 query → batch load all products"]
    style Q fill:#4a9eff,stroke:#2171c7,color:#fff
    style O fill:#00bfa5,stroke:#00897b,color:#fff
    style DL fill:#7c4dff,stroke:#5e35b1,color:#fff
    style B fill:#00bfa5,stroke:#00897b,color:#fff

2 total database queries — regardless of page size.

Why MappedBatchLoader Over BatchLoader

BatchLoader<K, V> requires you to return results in the exact same order as the input keys. This creates a subtle but dangerous contract:

Java
// BatchLoader — must maintain order, handle missing entries with nulls
@Override
public CompletionStage<List<Product>> load(List<String> keys) {
    return productRepository.findByIds(keys)
            .collectList()
            .map(products -> {
                // Must return products in the SAME ORDER as keys
                // Must include null for missing products
                Map<String, Product> byId = products.stream()
                        .collect(Collectors.toMap(Product::getId, Function.identity()));
                return keys.stream()
                        .map(byId::get)  // null if not found
                        .toList();
            })
            .toFuture();
}

MappedBatchLoader sidesteps this entirely. Return a map, and the framework handles the matching:

Java
// MappedBatchLoader — just return what you found
@Override
public CompletionStage<Map<String, Product>> load(Set<String> keys) {
    return productRepository.findByIds(new ArrayList<>(keys))
            .collectList()
            .map(products -> products.stream()
                    .collect(Collectors.toMap(Product::getId, Function.identity())))
            .toFuture();
}

Less code, no ordering bugs, no null-padding. In practice, MappedBatchLoader is the right choice for almost every use case.
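One caveat worth knowing with either variant: Collectors.toMap throws IllegalStateException if the repository ever returns two products with the same ID. If duplicates are possible, pass a merge function. A self-contained illustration (the Product record is a stand-in):

```java
import java.util.*;
import java.util.function.Function;
import java.util.stream.Collectors;

public class ToMapCaveat {
    record Product(String id, String name) {}

    public static void main(String[] args) {
        List<Product> withDuplicate = List.of(
                new Product("p1", "first"), new Product("p1", "second"));

        // Plain toMap rejects the duplicate key with IllegalStateException:
        try {
            withDuplicate.stream()
                    .collect(Collectors.toMap(Product::id, Function.identity()));
        } catch (IllegalStateException e) {
            System.out.println("duplicate key rejected");
        }

        // Defensive variant: a merge function keeps the first occurrence.
        Map<String, Product> byId = withDuplicate.stream()
                .collect(Collectors.toMap(Product::id, Function.identity(),
                        (first, second) -> first));
        System.out.println(byId.get("p1").name()); // first
    }
}
```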

Pagination: The *Page Pattern

GraphQL offers two pagination styles: Relay-style cursor-based pagination (using Connections and Edges) and offset-based pagination. Both are valid; the right choice depends on your data:

|                              | Offset-based                  | Cursor-based                     |
|------------------------------|-------------------------------|----------------------------------|
| Best for                     | Admin tables, search results  | Infinite scroll, real-time feeds |
| Client complexity            | Low — just pass page/size     | Higher — must track cursors      |
| Supports “jump to page 5”    | Yes                           | No                               |
| Handles insertions/deletions | Can skip or duplicate items   | Stable — cursor is a bookmark    |

For most CRUD applications with search and sorting, offset-based pagination is simpler and sufficient. Here’s the pattern:

Schema

GRAPHQL
type ProductPage {
    items: [Product]
    totalPages: Int
    currentPage: Int
    totalElements: Int
}

type Query {
    products(
        searchText: String,
        category: ProductCategory,
        pageNumber: Int!,
        pageSize: Int!,
        sortBy: String,
        sortOrder: SortOrder
    ): ProductPage
}

Data Fetcher

Java
@DgsQuery
public Mono<ProductPage> products(
        @InputArgument String searchText,
        @InputArgument ProductCategory category,
        @InputArgument Integer pageNumber,
        @InputArgument Integer pageSize,
        @InputArgument String sortBy,
        @InputArgument SortOrder sortOrder) {

    // Default pagination parameters
    int page = pageNumber != null ? pageNumber : 0;
    int size = pageSize != null ? pageSize : 10;
    String sort = sortBy != null ? sortBy : "name";

    return productService.search(searchText, category, page, size, sort, sortOrder);
}

The *Page wrapper type is repeated for each domain entity (ProductPage, OrderPage, UserPage). This is intentional — GraphQL doesn’t support generics, so each paginated type needs its own wrapper. The upside is that each page type can include domain-specific aggregate fields (e.g., ProductPage might include averagePrice).
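The page metadata itself is just arithmetic; totalPages, for instance, is a ceiling division of totalElements by pageSize. A minimal sketch (the helper name is illustrative):

```java
public class PageMath {
    // Ceiling division: 45 elements at page size 10 → 5 pages.
    static int totalPages(long totalElements, int pageSize) {
        return (int) ((totalElements + pageSize - 1) / pageSize);
    }

    public static void main(String[] args) {
        System.out.println(totalPages(45, 10)); // 5
        System.out.println(totalPages(40, 10)); // 4
        System.out.println(totalPages(0, 10));  // 0
    }
}
```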

Multiple Data Loaders in One Request

A complex query might hit several data loaders simultaneously:

GRAPHQL
query {
    orders(pageNumber: 0, pageSize: 20) {
        items {
            id
            status
            product {         # → ProductDataLoader
                name
                price
            }
            customer {        # → CustomerDataLoader
                name
            }
            warehouse {       # → WarehouseDataLoader
                location
            }
        }
    }
}

Each field resolver queues IDs to its respective data loader. DGS dispatches all three batches after the field resolvers return:

graph TD
    Q["Client Query"] --> O["1 query → fetch 20 orders"]
    O --> R1["Field resolver: product"]
    O --> R2["Field resolver: customer"]
    O --> R3["Field resolver: warehouse"]
    R1 --> DL1["ProductDataLoader<br/>1 batch query"]
    R2 --> DL2["CustomerDataLoader<br/>1 batch query"]
    R3 --> DL3["WarehouseDataLoader<br/>1 batch query"]
    style Q fill:#4a9eff,stroke:#2171c7,color:#fff
    style O fill:#00bfa5,stroke:#00897b,color:#fff
    style R1 fill:#7c4dff,stroke:#5e35b1,color:#fff
    style R2 fill:#7c4dff,stroke:#5e35b1,color:#fff
    style R3 fill:#7c4dff,stroke:#5e35b1,color:#fff
    style DL1 fill:#00bfa5,stroke:#00897b,color:#fff
    style DL2 fill:#00bfa5,stroke:#00897b,color:#fff
    style DL3 fill:#00bfa5,stroke:#00897b,color:#fff

4 total queries — linear in the number of distinct fields, not in the number of items: O(fields), not O(items × fields). Adding more orders doesn’t increase the query count.

Common Pitfalls

1. Calling the service directly instead of using the data loader

Java
// WRONG — bypasses batching, causes N+1
@DgsData(parentType = "Order", field = "product")
public Mono<Product> resolveProduct(DgsDataFetchingEnvironment dfe) {
    Order order = dfe.getSource();
    return productService.findById(order.getProductId());  // N queries!
}

// CORRECT — uses data loader for batching
@DgsData(parentType = "Order", field = "product")
public Mono<Product> resolveProduct(DgsDataFetchingEnvironment dfe) {
    Order order = dfe.getSource();
    DataLoader<String, Product> loader = dfe.getDataLoader("products");
    return Mono.fromCompletionStage(loader.load(order.getProductId()));
}

2. Forgetting null checks on the parent ID

Java
// WRONG — NullPointerException when productId is null
DataLoader<String, Product> loader = dfe.getDataLoader("products");
return Mono.fromCompletionStage(loader.load(order.getProductId()));

// CORRECT — guard against null
String productId = order.getProductId();
if (productId == null || productId.isBlank()) {
    return Mono.empty();
}
DataLoader<String, Product> loader = dfe.getDataLoader("products");
return Mono.fromCompletionStage(loader.load(productId));

3. Using BatchLoader when results may be missing

If your database query doesn’t return a result for every ID (e.g., deleted records), BatchLoader will misalign keys and values. Use MappedBatchLoader instead — missing keys simply aren’t in the map.
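A toy illustration of the misalignment, with no framework involved: if a naive BatchLoader returns rows positionally and one row is missing, every later value shifts to the wrong key (data here is made up):

```java
import java.util.*;

public class MisalignmentDemo {
    public static void main(String[] args) {
        List<String> keys = List.of("p1", "p2", "p3");

        // Suppose "p2" was deleted, so the query returns only two rows,
        // and the naive loader returns them positionally, without null-padding:
        List<String> naiveResults = List.of("product-p1", "product-p3");

        // Positional zip: "p2" silently receives p3's product.
        System.out.println(keys.get(1) + " -> " + naiveResults.get(1));
        // prints: p2 -> product-p3   (wrong!)
    }
}
```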

What’s Next

Data loaders prevent the performance floor from collapsing, but they don’t address another critical concern: who’s allowed to call these queries? In Part 3, we’ll cover securing your GraphQL API — authentication, authorization, error sanitization, and protecting against abusive queries.


Cover photo by Aakash Dhage on Unsplash.
