DGS at Scale: Testing, Schema Evolution, and Federation

Posted on March 26, 2026 · 9 min read

Tags: GraphQL, Netflix DGS, Testing, Federation, Java

Part 5 of 7 in the “Production GraphQL with Netflix DGS” series

Building a GraphQL API is one thing. Running hundreds of operations in production, testing them reliably, and evolving the schema without breaking clients is where the real engineering starts. This article covers the practices that matter when your DGS API grows beyond a handful of queries.

Testing DGS Components

DGS data fetchers are Spring beans with injected dependencies. This makes them straightforward to unit test — mock the dependencies, call the method, assert the result.

Unit Testing with Spock

Spock’s expressive syntax pairs well with DGS’s method-per-operation pattern:

import reactor.core.publisher.Mono
import spock.lang.Specification

class ProductDataFetcherSpec extends Specification {

    ProductService productService = Mock()
    CurrentUserProvider currentUser = Mock()
    ProductDataFetcher dataFetcher

    def setup() {
        dataFetcher = new ProductDataFetcher(productService, currentUser)
    }

    def "should fetch product by ID"() {
        given:
        def productId = "prod-123"
        def mockProduct = new Product(id: productId, name: "Widget")

        when:
        def result = dataFetcher.product(productId).block()

        then:
        1 * productService.findById(productId) >> Mono.just(mockProduct)
        result.id == productId
        result.name == "Widget"
    }

    def "should return null when product not found"() {
        when:
        def result = dataFetcher.product("non-existent").block()

        then:
        1 * productService.findById("non-existent") >> Mono.empty()
        result == null
    }

    def "should propagate service errors"() {
        given:
        def error = new RuntimeException("Database unavailable")

        when:
        dataFetcher.product("prod-123").block()

        then:
        1 * productService.findById(_) >> Mono.error(error)
        thrown(RuntimeException)
    }
}

Key patterns:

Constructor injection — create the data fetcher with mocked dependencies in setup(). No Spring context needed.
.block() — converts the reactive Mono to a synchronous result for assertions. Acceptable in tests, never in production.
Interaction verification — 1 * productService.findById(productId) verifies the service was called exactly once with the right argument.
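If your team does not use Spock, the same three patterns carry over to plain Java. The following is a minimal sketch under stated assumptions: ProductLookup, SimpleProductFetcher, and CountingLookup are hypothetical stand-ins for the types above, and CompletableFuture replaces Mono so the example needs only the JDK.

```java
import java.util.Optional;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for ProductService; CompletableFuture replaces Mono
// so that the sketch runs on the JDK alone.
interface ProductLookup {
    CompletableFuture<Optional<String>> findNameById(String id);
}

// Hypothetical stand-in for ProductDataFetcher: dependencies arrive via the constructor.
class SimpleProductFetcher {
    private final ProductLookup lookup;

    SimpleProductFetcher(ProductLookup lookup) {
        this.lookup = lookup;
    }

    CompletableFuture<Optional<String>> productName(String id) {
        return lookup.findNameById(id);
    }
}

// Hand-written test double that records how often it was called.
class CountingLookup implements ProductLookup {
    final AtomicInteger calls = new AtomicInteger();

    @Override
    public CompletableFuture<Optional<String>> findNameById(String id) {
        calls.incrementAndGet();
        Optional<String> result =
                "prod-123".equals(id) ? Optional.of("Widget") : Optional.<String>empty();
        return CompletableFuture.completedFuture(result);
    }
}

class PlainFetcherTest {
    static void run() {
        CountingLookup lookup = new CountingLookup();
        SimpleProductFetcher fetcher = new SimpleProductFetcher(lookup);

        // join() plays the role of block(): wait for the async result inside the test.
        Optional<String> name = fetcher.productName("prod-123").join();

        if (!name.equals(Optional.of("Widget"))) {
            throw new AssertionError("unexpected result: " + name);
        }
        if (lookup.calls.get() != 1) {
            throw new AssertionError("expected exactly one call");
        }
    }
}
```

The hand-written double is more verbose than Mock(), but it makes the interaction check explicit: the call counter is the plain-Java counterpart of 1 * productService.findById(productId).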

Testing Authenticated Operations

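Mock the auth supplier to return different user IDs. The feature methods below belong to a spec for an order data fetcher, with orderService and currentUser mocked the same way as in the product spec: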

def "should fetch authenticated user's orders"() {
    given:
    def userId = "user-42"
    def mockPage = new OrderPage(items: [new Order(id: "order-1")], totalElements: 1)

    when:
    def result = dataFetcher.myOrders(0, 10).block()

    then:
    1 * currentUser.getUserId() >> Mono.just(userId)
    1 * orderService.findByOwner(userId, 0, 10) >> Mono.just(mockPage)
    result.items.size() == 1
}

def "should reject unauthenticated access"() {
    when:
    dataFetcher.myOrders(0, 10).block()

    then:
    1 * currentUser.getUserId() >> Mono.error(new AccessDeniedException("Not authenticated"))
    thrown(AccessDeniedException)
}

Note that @PreAuthorize annotations are not evaluated in unit tests: Spring enforces them through AOP proxies, and a data fetcher constructed with new is not proxied. To test authorization rules, use integration tests with @SpringBootTest and a real security context.
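To see why a directly constructed object skips the check, it helps to imitate the mechanism. The sketch below uses a JDK dynamic proxy with a hypothetical AdminOps interface and a simple boolean in place of a security context. It only illustrates the idea of a check living in a wrapper; it is not how Spring Security is implemented.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;

// Hypothetical interface standing in for a secured component.
interface AdminOps {
    String deleteProduct(String id);
}

// The raw implementation contains no security logic at all.
class AdminOpsImpl implements AdminOps {
    @Override
    public String deleteProduct(String id) {
        return "deleted " + id;
    }
}

class SecurityProxyDemo {
    // Wraps the target in a proxy that checks a flag before delegating.
    // The flag is an assumption standing in for a real security context.
    static AdminOps secured(AdminOps target, boolean callerIsAdmin) {
        InvocationHandler handler = (proxy, method, methodArgs) -> {
            if (!callerIsAdmin) {
                throw new SecurityException("Access denied");
            }
            return method.invoke(target, methodArgs);
        };
        return (AdminOps) Proxy.newProxyInstance(
                AdminOps.class.getClassLoader(),
                new Class<?>[] {AdminOps.class},
                handler);
    }
}
```

Calling new AdminOpsImpl().deleteProduct("p1") succeeds regardless of who the caller is, while SecurityProxyDemo.secured(impl, false).deleteProduct("p1") throws. A unit test that constructs the data fetcher itself is in the first situation.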

Testing Data Loaders

Data loaders are independent classes with a single method — test them in isolation:

import reactor.core.publisher.Flux
import spock.lang.Specification

class ProductDataLoaderSpec extends Specification {

    ProductRepository productRepository = Mock()
    ProductDataLoader loader

    def setup() {
        loader = new ProductDataLoader(productRepository)
    }

    def "should batch load products by IDs"() {
        given:
        def ids = ["prod-1", "prod-2", "prod-3"] as Set
        def products = [
            new Product(id: "prod-1", name: "Widget"),
            new Product(id: "prod-2", name: "Gadget")
            // prod-3 intentionally missing — deleted product
        ]

        when:
        def result = loader.load(ids).toCompletableFuture().get()

        then:
        1 * productRepository.findByIds(ids as List) >> Flux.fromIterable(products)
        result.size() == 2
        result["prod-1"].name == "Widget"
        result["prod-2"].name == "Gadget"
        result["prod-3"] == null  // Missing entries return null
    }
}

The MappedBatchLoader contract means missing keys are simply absent from the map. The DataLoader framework translates this to null for the requesting field resolver.
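The translation from "absent key" to null can be pictured as a lookup of each requested key against the returned map. This is a simplified sketch of that idea with a hypothetical helper name, not the DataLoader library's actual code:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

class MappedBatchSketch {
    // Lines up a batch result map with the originally requested keys.
    // Keys absent from the map yield null, which is what the requesting
    // field resolver would see for a deleted product.
    static <K, V> List<V> alignToKeys(List<K> requestedKeys, Map<K, V> loaded) {
        List<V> values = new ArrayList<>();
        for (K key : requestedKeys) {
            values.add(loaded.get(key)); // Map.get returns null for a missing key
        }
        return values;
    }
}
```

With keys prod-1, prod-2, prod-3 and a map holding only the first two, the third position of the returned list is null. The practical consequence for your loader: never pad the map with placeholder objects for missing rows.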

Schema Evolution

GraphQL schemas are contracts. Changing them can break clients. Here’s how to evolve safely.

Additive Changes (Always Safe)

These never break existing clients:

type Product {
    id: ID!
    name: String!
    description: String
    price: Float!
    category: ProductCategory!
    # New fields — clients that don't request them are unaffected
    rating: Float              # ← Added
    reviewCount: Int           # ← Added
    tags: [String]             # ← Added
}

type Query {
    product(id: ID!): Product
    products(...): ProductPage
    # New queries — existing clients don't know about them
    productsByTag(tag: String!, ...): ProductPage   # ← Added
    featuredProducts(...): ProductPage               # ← Added
}

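Adding fields, queries, mutations, or types is backward-compatible, because a GraphQL response contains only the fields a client selected. One caveat: adding a value to an existing enum, or a member to an existing union, can still surprise clients that handle those exhaustively.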

Deprecation (The Gentle Removal)

When you need to rename or replace a field, deprecate first:

type Product {
    id: ID!
    # Phase 1: Add new field alongside old one
    productName: String!                          # ← New preferred field
    name: String! @deprecated(reason: "Use productName instead")  # ← Mark old one
}

GraphQL tooling (GraphiQL, Apollo Studio, code generators) surfaces deprecation warnings to clients. After monitoring confirms zero usage of the deprecated field, remove it.
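What "monitoring confirms zero usage" means in practice depends on your metrics stack. As a rough sketch, assuming you can hook into the resolution of the deprecated field, a counter keyed by field coordinate is enough to make the removal decision explicit. DeprecatedFieldUsage and its methods are hypothetical names, not part of DGS:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

class DeprecatedFieldUsage {
    private final Map<String, LongAdder> hits = new ConcurrentHashMap<>();

    // Call this wherever a deprecated field is resolved, e.g. record("Product.name").
    void record(String fieldCoordinate) {
        hits.computeIfAbsent(fieldCoordinate, key -> new LongAdder()).increment();
    }

    long count(String fieldCoordinate) {
        LongAdder adder = hits.get(fieldCoordinate);
        return adder == null ? 0 : adder.sum();
    }

    // A field is a removal candidate only if nothing requested it
    // during the observation window covered by this counter.
    boolean safeToRemove(String fieldCoordinate) {
        return count(fieldCoordinate) == 0;
    }
}
```

In a real deployment you would export these counts to your metrics system and pick an observation window long enough to cover slow-moving clients such as mobile apps.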

Breaking Changes (Avoid If Possible)

These break clients and should be a last resort:

Removing a field or type
Changing a field’s type (e.g., String to Int)
Making a nullable argument or input field non-nullable, or making a non-nullable output field nullable
Adding a required argument to an existing query

If unavoidable, coordinate with API consumers and version the change with a migration period.

Schema Linting

Catch breaking changes before they reach production. Tools like graphql-inspector can compare schema versions:

# CI pipeline — compare current schema against the deployed version
graphql-inspector diff schema-deployed.graphqls schema-current.graphqls

This catches field removals, type changes, and other breaking modifications as a CI gate.
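To make the idea behind such a gate concrete, here is a toy comparison of two versions of a single type, each modeled as a map from field name to type string. It is deliberately cruder than a real tool: it flags every type change, whereas graphql-inspector distinguishes breaking from safe changes. ToySchemaDiff is a hypothetical name used only for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class ToySchemaDiff {
    // Compares two versions of one type, given as fieldName -> GraphQL type string.
    // Reports removed fields and changed field types; added fields are ignored
    // because they are additive. Every type change is flagged, which is stricter
    // than what a real schema diff tool would report.
    static List<String> flaggedChanges(Map<String, String> deployed, Map<String, String> current) {
        List<String> problems = new ArrayList<>();
        // TreeMap gives a stable, alphabetical report order.
        for (Map.Entry<String, String> field : new TreeMap<>(deployed).entrySet()) {
            String newType = current.get(field.getKey());
            if (newType == null) {
                problems.add("removed: " + field.getKey());
            } else if (!newType.equals(field.getValue())) {
                problems.add("type changed: " + field.getKey()
                        + " " + field.getValue() + " -> " + newType);
            }
        }
        return problems;
    }
}
```

A CI step built on this idea fails the build whenever the returned list is non-empty, which forces the author of the change to either revert it or go through the deprecation path.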

Federation: One Graph, Multiple Services

As your system grows, a single backend service may become too large. GraphQL federation lets you split the schema across multiple services while presenting a unified API to clients.

The Gateway Pattern

Rather than federating at the DGS level (using @DgsEntityFetcher), many teams use a gateway router that composes schemas from multiple services:

graph TD
    Client --> GW["API Gateway<br/>Auth + rate limiting"]
    GW --> Router["GraphQL Router<br/>Schema composition + query planning"]
    Router --> A["Service A (DGS)<br/>Products, Inventory"]
    Router --> B["Service B (DGS)<br/>Orders, Payments"]
    Router --> C["Service C (DGS)<br/>Users, Notifications"]
    style Client fill:#4a9eff,stroke:#2171c7,color:#fff
    style GW fill:#7c4dff,stroke:#5e35b1,color:#fff
    style Router fill:#7c4dff,stroke:#5e35b1,color:#fff
    style A fill:#00bfa5,stroke:#00897b,color:#fff
    style B fill:#00bfa5,stroke:#00897b,color:#fff
    style C fill:#00bfa5,stroke:#00897b,color:#fff

The router (Apollo Router or similar federation-capable routers) reads each service’s schema, composes them into a supergraph, and routes incoming queries to the right service.
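At its simplest, "routing to the right service" is a lookup from field to owning service. The sketch below groups the top-level fields of a query by owner. It is a heavy simplification, since real routers plan at the level of nested fields and entities, and ToyRouter with its service names is purely hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class ToyRouter {
    private final Map<String, String> ownerByField;

    // ownerByField maps a top-level query field to the service that serves it.
    ToyRouter(Map<String, String> ownerByField) {
        this.ownerByField = ownerByField;
    }

    // Groups the requested top-level fields by owning service, so that each
    // service receives one sub-request containing only its own fields.
    Map<String, List<String>> plan(List<String> requestedFields) {
        Map<String, List<String>> plan = new LinkedHashMap<>();
        for (String field : requestedFields) {
            String service = ownerByField.get(field);
            if (service == null) {
                throw new IllegalArgumentException("No service owns field: " + field);
            }
            plan.computeIfAbsent(service, key -> new ArrayList<>()).add(field);
        }
        return plan;
    }
}
```

A query selecting product, myOrders, and products would be split into one sub-request for the product service (product, products) and one for the order service (myOrders). The client never sees the split.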

Why Gateway-Level Federation

Federating at the gateway rather than within DGS itself has practical benefits:

Services stay simple. Each DGS service owns its schema and data fetchers without federation-specific annotations. Standard @DgsQuery and @DgsMutation patterns work unchanged.
Independent deployment. Services can be deployed, rolled back, and scaled independently. The router recomposes the supergraph when a service’s schema changes.
Separation of concerns. The router handles cross-cutting concerns (query planning, deduplication, caching) that individual services shouldn’t own.
Technology freedom. Not every service needs to be DGS. The router composes any GraphQL service, regardless of the framework behind it.

Supergraph Composition

The supergraph schema is typically composed as part of CI/CD:

graph LR
    S1["Service A<br/>schema"] --> REG["Schema Registry"]
    S2["Service B<br/>schema"] --> REG
    S3["Service C<br/>schema"] --> REG
    REG --> COMP["Composition Step<br/>CI/CD pipeline"]
    COMP --> SG["Supergraph Schema"]
    SG --> R["Router reloads<br/>& validates"]
    style REG fill:#7c4dff,stroke:#5e35b1,color:#fff
    style COMP fill:#4a9eff,stroke:#2171c7,color:#fff
    style SG fill:#00bfa5,stroke:#00897b,color:#fff
    style R fill:#00bfa5,stroke:#00897b,color:#fff

Composition catches conflicts early — two services defining the same type, incompatible field types, or missing entity references surface as build failures, not runtime errors.
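The "incompatible field types" case is the easiest one to picture. Below is a toy composition step that merges per-service schemas, each modeled as a map from "Type.field" to a type string, and fails fast on disagreement. Real federation composition has considerably richer rules; ToyComposer is a hypothetical illustration of the fail-at-build-time idea only.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class ToyComposer {
    // Input: service name -> ("Type.field" -> GraphQL type string).
    // Output: one merged map. Throws when two services define the same
    // field with different types, so the conflict surfaces at build time.
    static Map<String, String> compose(Map<String, Map<String, String>> schemasByService) {
        Map<String, String> supergraph = new LinkedHashMap<>();
        Map<String, String> definedBy = new LinkedHashMap<>();
        for (Map.Entry<String, Map<String, String>> service : schemasByService.entrySet()) {
            for (Map.Entry<String, String> field : service.getValue().entrySet()) {
                String existing = supergraph.putIfAbsent(field.getKey(), field.getValue());
                if (existing != null && !existing.equals(field.getValue())) {
                    throw new IllegalStateException("Conflict on " + field.getKey() + ": "
                            + definedBy.get(field.getKey()) + " says " + existing + ", "
                            + service.getKey() + " says " + field.getValue());
                }
                definedBy.putIfAbsent(field.getKey(), service.getKey());
            }
        }
        return supergraph;
    }
}
```

Run as a pipeline step, an exception here blocks the deployment of the offending service, and the router keeps serving the last supergraph that composed cleanly.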

Organizing DGS Components at Scale

When you have dozens of domain modules with their own queries, mutations, and data loaders, organization matters.

Domain-Driven Structure

Group DGS components by business domain, not by technical role:

com.example.product.graphql/
├── ProductDataFetcher.java           # Queries
├── CreateProductMutation.java        # Mutations (one per operation)
├── UpdateProductMutation.java
├── ProductStockResolver.java         # Field resolvers
└── ProductDataLoader.java            # Data loaders

com.example.order.graphql/
├── OrderDataFetcher.java
├── PlaceOrderMutation.java
├── OrderProductResolver.java
└── OrderCustomerDataLoader.java

This is preferable to a technical grouping like com.example.graphql.queries/, com.example.graphql.mutations/, and so on: when you’re debugging the product search query, you want all product-related GraphQL code in one place.

One Class Per Mutation

A small API might put all mutations in one class. At scale, this creates a god class that grows linearly with every new operation. The pattern of one mutation per class keeps each file focused:

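// CreateProductMutation.java: ~40 lines, single responsibility
import com.netflix.graphql.dgs.DgsComponent;
import com.netflix.graphql.dgs.DgsMutation;
import com.netflix.graphql.dgs.InputArgument;
import lombok.RequiredArgsConstructor;
import org.springframework.security.access.prepost.PreAuthorize;
import reactor.core.publisher.Mono;
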
@DgsComponent
@RequiredArgsConstructor
public class CreateProductMutation {
    private final ProductService productService;
    private final CurrentUserProvider currentUser;

    @DgsMutation
    @PreAuthorize("hasRole('ADMIN')")
    public Mono<Product> createProduct(@InputArgument CreateProductInput input) {
        return currentUser.getUserId().flatMap(userId ->
                productService.create(input, userId));
    }
}

Each file is small, testable, and obvious in its purpose. The file name tells you exactly what it does.

Module Boundaries and the Anti-Corruption Layer

If you use Spring Modulith or a similar modular architecture, DGS components should respect module boundaries:

A product module’s DGS component can call the product service directly.
To access order data, it should go through a published API (query gateway, event, or public interface) — not import the order module’s internal classes.

This is the same Anti-Corruption Layer principle from Part 1 applied at the module level. In Part 1, we placed mappers between the GraphQL layer, the domain model, and the database to prevent one model from corrupting another. At scale, the same pattern applies between modules:

graph TD
    subgraph Product Module
        PG["ProductDataFetcher<br/>GraphQL layer"]
        PS["ProductService<br/>Domain layer"]
        PR["ProductRepository<br/>Persistence layer"]
        PG --> PS --> PR
    end
    subgraph Order Module
        OG["OrderDataFetcher<br/>GraphQL layer"]
        OS["OrderService<br/>Domain layer"]
        OR["OrderRepository<br/>Persistence layer"]
        OG --> OS --> OR
    end
    OS -->|"ACL: published API<br/>or event"| PS
    style PG fill:#4a9eff,stroke:#2171c7,color:#fff
    style PS fill:#00bfa5,stroke:#00897b,color:#fff
    style PR fill:#ff7043,stroke:#e64a19,color:#fff
    style OG fill:#4a9eff,stroke:#2171c7,color:#fff
    style OS fill:#00bfa5,stroke:#00897b,color:#fff
    style OR fill:#ff7043,stroke:#e64a19,color:#fff

The Order module’s service doesn’t import ProductEntity or call ProductRepository directly. Instead, it goes through an ACL — a published interface, a query, or a domain event — that translates between the two modules’ models. Each module owns its own types and can evolve independently.
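In code, the published-interface variant of that ACL might look like the sketch below. All names (ProductCatalog, ProductSummary, ProductCatalogAdapter, OrderLineDescriber) and the fields on ProductEntity are hypothetical, and an in-memory map stands in for the repository so the example runs on the JDK alone.

```java
import java.util.Map;
import java.util.Optional;

// Published API of the product module: the only product type other modules may see.
final class ProductSummary {
    final String id;
    final String displayName;

    ProductSummary(String id, String displayName) {
        this.id = id;
        this.displayName = displayName;
    }
}

interface ProductCatalog {
    Optional<ProductSummary> findSummary(String productId);
}

// Internal to the product module: never imported by the order module.
final class ProductEntity {
    final String id;
    final String name;
    final long priceCents;

    ProductEntity(String id, String name, long priceCents) {
        this.id = id;
        this.name = name;
        this.priceCents = priceCents;
    }
}

// The ACL itself: translates the internal entity into the published shape.
class ProductCatalogAdapter implements ProductCatalog {
    private final Map<String, ProductEntity> table; // stand-in for ProductRepository

    ProductCatalogAdapter(Map<String, ProductEntity> table) {
        this.table = table;
    }

    @Override
    public Optional<ProductSummary> findSummary(String productId) {
        return Optional.ofNullable(table.get(productId))
                .map(entity -> new ProductSummary(entity.id, entity.name));
    }
}

// Order module: depends only on the published interface.
class OrderLineDescriber {
    private final ProductCatalog catalog;

    OrderLineDescriber(ProductCatalog catalog) {
        this.catalog = catalog;
    }

    String describe(String productId, int quantity) {
        return catalog.findSummary(productId)
                .map(summary -> quantity + " x " + summary.displayName)
                .orElse(quantity + " x (unknown product)");
    }
}
```

Because OrderLineDescriber only knows ProductCatalog, the product module can rename columns or restructure ProductEntity without touching order code. If the modules are later split into services, only the adapter needs to change, for example to call the product service over the network.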

This matters because module boundaries become service boundaries during federation. If the Order module already communicates with Product through a well-defined ACL, extracting it into a separate DGS service is a deployment change, not a code rewrite. The anti-corruption layers you built within the monolith become the service contracts of your distributed system.

Recap: The Production GraphQL Checklist

Across this seven-part series, we’ve covered:

Concern	Solution	Article
Schema design	Schema-first with domain-specific type files	Part 1
Type safety	DGS code generation + MapStruct mappers	Part 1
Layer isolation	Anti-corruption layers between API, domain, and persistence	Part 1, Part 5
N+1 performance	`MappedBatchLoader` data loaders	Part 2
Lazy field loading	`@DgsData` field resolvers	Part 2
Pagination	Offset-based `*Page` wrapper types	Part 2
Authentication	`@PreAuthorize` + auth supplier	Part 3
Error handling	`DataFetcherExceptionHandler` with sanitization	Part 3
Query abuse protection	Depth + complexity limits	Part 3
Real-time updates	`@DgsSubscription` with WebSocket transport	Part 4
Async patterns	`CompletableFuture` for fetchers, `Mono` for services	Part 4
Observability	AOP metrics + structured logging	Part 4
Testing	Spock specs with mocked dependencies	Part 5 (this article)
Schema evolution	Additive changes + deprecation	Part 5 (this article)
Federation	Gateway-level composition	Part 5 (this article)
Frontend type safety	GraphQL Code Generator + TypedDocumentNode	Part 6
Observability	Operation metrics, SLO histograms, distributed tracing	Part 7
Performance	Persisted queries, complexity budgets, load testing	Part 7

The DGS framework, now deeply integrated with Spring for GraphQL, gives you a mature stack for building GraphQL APIs that are fast, safe, and maintainable at scale. The annotations-first programming model keeps your code readable. The Spring integration gives you the full ecosystem. And the patterns in this series keep you out of the pitfalls that trip up teams as their API grows.

What’s Next

In Part 6, we cross from the backend to the frontend — exploring how GraphQL Code Generator and TypedDocumentNode bring the same type safety to your client code that DGS codegen provides on the server. The result: a type-safe chain from schema to UI component, with compile-time guarantees at every layer.

Cover photo by Yuriy Vertikov on Unsplash.