otel_metrics task is too expensive — high DB CPU usage and long execution time #7475

@git-hyagi

The otel_metrics scheduled task (pulpcore.app.tasks.telemetry.otel_metrics) runs every 5 minutes and currently takes approximately 1 minute and 30 seconds to complete. During execution, the underlying query consumes ~3 vCPUs from the database.

Here is where we believe the issue is happening (pulpcore/app/tasks/telemetry.py:31-33):

  space_usage_per_domain = Artifact.objects.values("pulp_domain__name").annotate(
      total_size=Sum("size", default=0)
  )

  • A telemetry task that is meant to be lightweight is placing significant load on the database.
  • 3 vCPU consumption for a single periodic query reduces capacity available for actual content operations (sync, publish, etc.).
  • The task takes 1m30s out of every 5-minute cycle (30% duty cycle), meaning a worker is occupied with telemetry for a disproportionate amount of time.
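
For reference, here is a rough, self-contained sketch of the kind of SQL the ORM expression above should translate to: a join from artifacts to domains, grouped by domain name, with a `COALESCE(SUM(size), 0)` per group. The table and column names (`domain`, `artifact`, `pulp_domain_id`) and the sample rows are simplified assumptions for illustration, not the real Pulp schema, and it runs against an in-memory SQLite database rather than PostgreSQL. The point is only that the aggregate has to read every artifact row on each run, which is presumably why the cost grows with the number of artifacts.

```python
import sqlite3

# Hypothetical, simplified schema: the real Pulp tables have more columns
# and different names. This only mirrors the shape of the aggregation.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE domain (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE artifact (
        id INTEGER PRIMARY KEY,
        size INTEGER NOT NULL,
        pulp_domain_id INTEGER NOT NULL REFERENCES domain(id)
    );
    """
)
conn.executemany(
    "INSERT INTO domain (id, name) VALUES (?, ?)",
    [(1, "default"), (2, "team-a")],
)
conn.executemany(
    "INSERT INTO artifact (size, pulp_domain_id) VALUES (?, ?)",
    [(100, 1), (250, 1), (40, 2)],
)

# Approximate equivalent of:
#   Artifact.objects.values("pulp_domain__name").annotate(
#       total_size=Sum("size", default=0))
# ORDER BY is added here only to make the sample output deterministic.
AGGREGATE_SQL = """
    SELECT d.name AS pulp_domain__name,
           COALESCE(SUM(a.size), 0) AS total_size
    FROM artifact AS a
    JOIN domain AS d ON d.id = a.pulp_domain_id
    GROUP BY d.name
    ORDER BY d.name
"""


def space_usage_per_domain(connection):
    """Return one dict per domain with the summed artifact size."""
    rows = connection.execute(AGGREGATE_SQL)
    return [{"pulp_domain__name": name, "total_size": total} for name, total in rows]


usage = space_usage_per_domain(conn)
```

On a real deployment, running the actual query under `EXPLAIN (ANALYZE, BUFFERS)` in PostgreSQL would confirm whether the time goes into scanning the artifact table, the join, or the grouping step.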
