Content API times out when pulp_label_select filter is used #7477

@dkliban

Problem

The Content API times out when the pulp_label_select filter is used. This is a performance issue caused by the current implementation of LabelsFilter.

Example Request

The following request triggers the timeout:

/api/pulp/public-copr/api/v3/content/rpm/packages/?q=%28pulp_label_select%3D%22build_id%3D10148801%22%29+AND+pulp_label_select%3D%22chroot%3Dfedora-42-x86_64%22&offset=0&limit=1000&fields=location_href

Decoded query: (pulp_label_select="build_id=10148801") AND pulp_label_select="chroot=fedora-42-x86_64"
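
For reference, the decoded form can be reproduced with Python's standard library. This is only a sketch for reading the request: it splits the query string into its parameters and does not involve any Pulp code.

```python
from urllib.parse import parse_qs

# Query string copied from the example request above (everything after "?").
raw_query = (
    "q=%28pulp_label_select%3D%22build_id%3D10148801%22%29"
    "+AND+pulp_label_select%3D%22chroot%3Dfedora-42-x86_64%22"
    "&offset=0&limit=1000&fields=location_href"
)

# parse_qs percent-decodes each value and turns "+" into a space.
params = parse_qs(raw_query)

print(params["q"][0])
print(params["limit"][0], params["fields"][0])
```

The `q` expression carries two `pulp_label_select` terms joined by `AND`, so both labels must match on the same content unit.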

Root Cause

The current filter uses a key-based lookup:

pulp_labels__key = value

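This lookup extracts a single key and compares its value, so it cannot use an index on the pulp_labels column itself (only a separate expression index per key would help). On large datasets the resulting queries are slow.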

Proposed Fix

1. Update LabelsFilter to use __contains

Change the filter to use a containment lookup instead:

pulp_labels__contains = {key: value}

This leverages PostgreSQL's JSONB containment operator (@>), which is compatible with GIN indexes.
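
The difference between the two lookups can be sketched as the keyword arguments that would be handed to a Django queryset filter. The helper names below (`split_selector`, `key_lookup`, `contains_lookup`) are hypothetical and are not part of LabelsFilter; the sketch also assumes a plain `key=value` selector and ignores any other selector syntax the real filter may accept.

```python
def split_selector(selector: str) -> tuple[str, str]:
    """Split a simple "key=value" label selector into its two parts."""
    key, sep, value = selector.partition("=")
    if not sep or not key:
        raise ValueError(f"expected key=value, got {selector!r}")
    return key, value


def key_lookup(selector: str) -> dict:
    """Current style: the label key becomes part of the lookup name."""
    key, value = split_selector(selector)
    return {f"pulp_labels__{key}": value}


def contains_lookup(selector: str) -> dict:
    """Proposed style: one containment lookup carrying {key: value}."""
    key, value = split_selector(selector)
    return {"pulp_labels__contains": {key: value}}


print(key_lookup("build_id=10148801"))
print(contains_lookup("build_id=10148801"))
```

Either dictionary would be applied as `queryset.filter(**lookup)`, where `queryset` stands for whatever content queryset the filter operates on. Only the containment form maps to the `@>` operator mentioned above.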

2. Add a GIN index on pulp_labels

Add a GIN index to the pulp_labels field to make the containment query efficient:

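from django.contrib.postgres.indexes import GinIndex

# Goes inside the model class that defines pulp_labels.
class Meta:
    indexes = [
        # GIN index so that @> containment queries can use an index scan.
        GinIndex(fields=["pulp_labels"]),
    ]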

Impact

Without this fix, any query using pulp_label_select on a repository with a large number of content units will time out.
