Description
Problem
The Content API times out when the pulp_label_select filter is used. This is a performance issue caused by the current implementation of LabelsFilter.
Example Request
The following request triggers the timeout:
/api/pulp/public-copr/api/v3/content/rpm/packages/?q=%28pulp_label_select%3D%22build_id%3D10148801%22%29+AND+pulp_label_select%3D%22chroot%3Dfedora-42-x86_64%22&offset=0&limit=1000&fields=location_href
Decoded query: (pulp_label_select="build_id=10148801") AND pulp_label_select="chroot=fedora-42-x86_64"
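As a quick check of the decoding above, the query string can be parsed with Python's standard library. This is only an illustrative sketch; the variable names are arbitrary and nothing here is part of Pulp itself.

```python
from urllib.parse import parse_qs, urlsplit

# Request path and query string copied from the example request above.
url = (
    "/api/pulp/public-copr/api/v3/content/rpm/packages/"
    "?q=%28pulp_label_select%3D%22build_id%3D10148801%22%29"
    "+AND+pulp_label_select%3D%22chroot%3Dfedora-42-x86_64%22"
    "&offset=0&limit=1000&fields=location_href"
)

# parse_qs percent-decodes each value and turns "+" into a space.
params = parse_qs(urlsplit(url).query)
decoded_q = params["q"][0]
print(decoded_q)
```

The request therefore combines two label conditions (build_id and chroot) in a single q expression, with a page size of 1000.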
Root Cause
The current filter uses a key-based lookup:
    pulp_labels__key = value
This lookup does not benefit from standard indexes and results in slow queries on large datasets.
Proposed Fix
1. Update LabelsFilter to use __contains
Change the filter to use a containment lookup instead:
    pulp_labels__contains = {key: value}
This leverages PostgreSQL's JSONB containment operator (@>), which is compatible with GIN indexes.
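To make the difference concrete, here is a minimal sketch of how a single "key=value" term could be turned into Django filter keyword arguments under each approach. The helper names key_lookup and containment_lookup are hypothetical, and this is not the actual LabelsFilter code, which may accept more selector forms than a plain equality term.

```python
def key_lookup(term: str) -> dict:
    """Current style: the label key becomes part of the lookup name."""
    key, _, value = term.partition("=")
    return {f"pulp_labels__{key}": value}


def containment_lookup(term: str) -> dict:
    """Proposed style: one fixed lookup name, with the key/value pair as data."""
    key, _, value = term.partition("=")
    return {"pulp_labels__contains": {key: value}}


term = "build_id=10148801"
print(key_lookup(term))          # kwargs for the key-based lookup
print(containment_lookup(term))  # kwargs for the containment lookup
```

Either dictionary would then be passed to the queryset as queryset.filter(**kwargs). With the containment form, the field and lookup stay the same for every label key, which is what lets a single index on pulp_labels serve all label queries.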
2. Add a GIN index on pulp_labels
Add a GIN index to the pulp_labels field to make the containment query efficient:
    from django.contrib.postgres.indexes import GinIndex

    class Meta:
        indexes = [
            GinIndex(fields=["pulp_labels"]),
        ]
Impact
Without this fix, any query using pulp_label_select on a repository with a large number of content units will time out.