API Query Batching Roadmap

This document describes planned optimizations for REST API endpoints that list objects, addressing the N+1 query problem that affects performance when returning large result sets.

Problem Statement

Many REST API endpoints that return lists of objects suffer from an N+1 query problem. When listing N objects, each object's external_view() method makes one or more additional database queries to fetch related data. This results in poor performance that scales linearly with result set size.

Example: Blob Listing

When listing blobs via the REST API:

  1. Query returns N blob records from etcd/MariaDB
  2. For each blob, external_view() calls get_valid_checksums(blob_uuid)
  3. Each get_valid_checksums() makes a separate database query
  4. Total queries: 1 + N

With 100 blobs, this means 101 database round-trips. With content-defined chunking (where blob counts could reach thousands), this becomes a significant bottleneck.
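The 1 + N pattern above can be reproduced in a few lines of Python. Here sqlite3 stands in for the real etcd/MariaDB backend, and the table layout is illustrative rather than the actual schema:

```python
import sqlite3

# In-memory stand-in for the real blob_hashes table (illustrative only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE blob_hashes (blob_uuid TEXT, algorithm TEXT, "
             "hash_value TEXT, verification_status TEXT)")
for i in range(100):
    conn.execute("INSERT INTO blob_hashes VALUES (?, 'sha512', ?, 'valid')",
                 (f"uuid-{i}", f"hash-{i}"))

query_count = 0

def get_valid_checksums(blob_uuid):
    """One query per blob -- the N+1 pattern this document describes."""
    global query_count
    query_count += 1
    rows = conn.execute(
        "SELECT algorithm, hash_value FROM blob_hashes "
        "WHERE blob_uuid = ? AND verification_status = 'valid'",
        (blob_uuid,)).fetchall()
    return dict(rows)

blob_uuids = [f"uuid-{i}" for i in range(100)]  # the "1" listing query
query_count += 1
checksums = {u: get_valid_checksums(u) for u in blob_uuids}
print(query_count)  # 101 round-trips for 100 blobs
```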

Affected Endpoints

The N+1 pattern likely affects any endpoint that:

  • Lists objects with external_view() rendering
  • Fetches related data (checksums, references, metadata) per object
  • Returns paginated results where page size can be large

Known affected areas:

  • Blob listings: Each blob fetches checksums from blob_hashes table
  • Instance listings: May fetch network info, disk info per instance
  • Artifact listings: May fetch version/blob info per artifact

Proposed Solution

Phase 1: Batch Query Infrastructure

Add batch query functions to the database layer that can fetch related data for multiple objects in a single query.

# Example batch function signature
def get_valid_checksums_batch(
    blob_uuids: list[str]
) -> dict[str, dict[str, str]]:
    """Get valid checksums for multiple blobs in a single query.

    Args:
        blob_uuids: List of blob UUIDs to fetch checksums for.

    Returns:
        Dict mapping blob_uuid to dict of algorithm -> hash_value.
    """
    # Single query: SELECT * FROM blob_hashes
    #               WHERE blob_uuid IN (:uuids)
    #               AND verification_status = 'valid'
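Fleshed out against an in-memory SQLite stand-in (the real code would run against MariaDB, and would presumably obtain its own connection rather than taking one as a parameter), the batch function could look like:

```python
import sqlite3

def get_valid_checksums_batch(conn, blob_uuids):
    """Get valid checksums for multiple blobs in a single query.

    Sketch: one IN-clause query instead of one query per blob.
    """
    if not blob_uuids:
        return {}
    placeholders = ",".join("?" * len(blob_uuids))
    rows = conn.execute(
        f"SELECT blob_uuid, algorithm, hash_value FROM blob_hashes "
        f"WHERE blob_uuid IN ({placeholders}) "
        f"AND verification_status = 'valid'",
        blob_uuids).fetchall()
    # Blobs with no valid checksums still get an (empty) entry.
    result = {u: {} for u in blob_uuids}
    for blob_uuid, algorithm, hash_value in rows:
        result[blob_uuid][algorithm] = hash_value
    return result

# Illustrative data: uuid-1 has two valid checksums, uuid-2 only pending.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE blob_hashes (blob_uuid TEXT, algorithm TEXT, "
             "hash_value TEXT, verification_status TEXT)")
conn.executemany("INSERT INTO blob_hashes VALUES (?, ?, ?, ?)", [
    ("uuid-1", "sha512", "abc", "valid"),
    ("uuid-1", "md5", "def", "valid"),
    ("uuid-2", "sha512", "ghi", "pending"),
])
checksums = get_valid_checksums_batch(conn, ["uuid-1", "uuid-2"])
```

Pre-populating the result with an empty dict per requested UUID keeps callers from having to special-case blobs that have no valid checksums yet.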

Phase 2: Prefetch Pattern in API Layer

Modify API endpoints to prefetch related data before rendering objects.

# Before (N+1 queries):
blobs = get_blobs()
return [b.external_view() for b in blobs]

# After (2 queries):
blobs = get_blobs()
blob_uuids = [str(b.uuid) for b in blobs]
checksums = mariadb.get_valid_checksums_batch(blob_uuids)
return [b.external_view(prefetched_checksums=checksums.get(str(b.uuid)))
        for b in blobs]
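One plausible shape for the external_view() side of this change: the method uses prefetched data when the caller supplies it, and falls back to its own per-object query otherwise (so existing single-object callers keep working). The class and helper here are simplified stand-ins, not the real implementation:

```python
class Blob:
    """Minimal stand-in; the real Blob is database-backed."""

    def __init__(self, uuid):
        self.uuid = uuid

    def get_valid_checksums(self):
        # Stand-in for the per-object database query (the N+1 path).
        return {"sha512": f"hash-for-{self.uuid}"}

    def external_view(self, prefetched_checksums=None):
        # Use prefetched data when the caller batched the lookup;
        # otherwise fall back to one query for this blob alone.
        if prefetched_checksums is None:
            prefetched_checksums = self.get_valid_checksums()
        return {"uuid": str(self.uuid), "checksums": prefetched_checksums}

blob = Blob("u1")
view = blob.external_view(prefetched_checksums={"sha512": "x"})
```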

Phase 3: Generic Prefetch Framework

Consider a more generic approach where external_view() can declare what related data it needs, and the API layer automatically batches those fetches.

class Blob(DatabaseBackedObject):
    @classmethod
    def prefetch_requirements(cls) -> list[PrefetchSpec]:
        return [
            PrefetchSpec(
                name='checksums',
                fetcher=mariadb.get_valid_checksums_batch,
                key_extractor=lambda b: str(b.uuid)
            )
        ]
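Neither PrefetchSpec nor the batching driver exists yet; a minimal sketch of what they could look like, with all names being proposals rather than existing API:

```python
from dataclasses import dataclass
from typing import Any, Callable

@dataclass
class PrefetchSpec:
    """Declares one kind of related data an object type wants prefetched."""
    name: str                              # keyword base for external_view()
    fetcher: Callable[[list], dict]        # batch fetch: keys -> values
    key_extractor: Callable[[Any], Any]    # object -> batch key

def render_with_prefetch(objects, specs):
    """Run each spec's fetcher once, then render every object."""
    prefetched = {}
    for spec in specs:
        keys = [spec.key_extractor(o) for o in objects]
        prefetched[spec.name] = spec.fetcher(keys)   # one batched query
    return [
        o.external_view(**{
            f'prefetched_{spec.name}': prefetched[spec.name].get(
                spec.key_extractor(o))
            for spec in specs
        })
        for o in objects
    ]

# Toy demonstration with a fake fetcher and a minimal object:
class _Blob:
    def __init__(self, uuid):
        self.uuid = uuid
    def external_view(self, prefetched_checksums=None):
        return {"uuid": self.uuid, "checksums": prefetched_checksums}

def _fake_batch(uuids):
    return {u: {"sha512": f"h-{u}"} for u in uuids}

spec = PrefetchSpec(name="checksums", fetcher=_fake_batch,
                    key_extractor=lambda b: b.uuid)
views = render_with_prefetch([_Blob("a"), _Blob("b")], [spec])
```

The driver runs one batched query per spec regardless of object count, which is exactly the constant-query behaviour Phases 1 and 2 aim for, generalised across object types.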

Alternative Approaches Considered

Denormalized Cache Tables

For frequently-accessed data like valid checksums, a denormalized table could reduce query complexity:

CREATE TABLE valid_checksums (
    blob_uuid UUID NOT NULL,
    algorithm VARCHAR(32) NOT NULL,
    hash_value VARCHAR(256) NOT NULL,
    PRIMARY KEY (blob_uuid, algorithm)
);

Pros:

  • Simpler queries (no filtering by status needed)
  • Smaller result sets (no node/timestamp fields)

Cons:

  • Must be kept in sync with source table
  • Adds write overhead when checksums are updated
  • Still requires batching to solve N+1

Verdict: Could be combined with batching for maximum benefit, but batching alone solves the core problem.

GraphQL / DataLoader Pattern

A more radical approach would be adopting GraphQL with DataLoader-style batching, which automatically coalesces related queries.

Verdict: Too large a change for the current architecture. Batching within the existing REST API is more pragmatic.

Implementation Order

  1. Blob checksums batching (highest impact due to blob count scaling)
  2. Audit affected endpoints (identify other N+1 patterns)
  3. Generic framework (if patterns are common enough to warrant it)

Success Metrics

  • Blob listing API response time should not scale linearly with result count
  • Database query count for listing N objects should be O(1) (a small constant number of batched queries), not O(N)
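The query-count metric lends itself to a regression test. A hedged sketch, assuming the database connection can be wrapped to count executed statements (the wrapper and listing function are illustrative, not existing code):

```python
import sqlite3

class CountingConnection:
    """Wraps a DB-API connection and counts execute() calls (illustrative)."""

    def __init__(self, conn):
        self._conn = conn
        self.query_count = 0

    def execute(self, *args, **kwargs):
        self.query_count += 1
        return self._conn.execute(*args, **kwargs)

conn = CountingConnection(sqlite3.connect(":memory:"))
conn.execute("CREATE TABLE blob_hashes (blob_uuid TEXT, algorithm TEXT, "
             "hash_value TEXT, verification_status TEXT)")

def list_blobs_batched(conn, blob_uuids):
    # 1 listing query + 1 batched checksum query, regardless of N.
    conn.execute("SELECT 1")  # stand-in for the blob listing query
    placeholders = ",".join("?" * len(blob_uuids))
    conn.execute(f"SELECT blob_uuid, algorithm, hash_value FROM blob_hashes "
                 f"WHERE blob_uuid IN ({placeholders})", blob_uuids)

for n in (10, 100):
    conn.query_count = 0
    list_blobs_batched(conn, [f"uuid-{i}" for i in range(n)])
    assert conn.query_count == 2  # constant, independent of n
```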

Dependencies

  • None (can be implemented independently)