[Konflux] improve backend performance with targeted catalog lookups and response caching#2630

Open
rrosatti wants to merge 6 commits into redhat-developer:main from rrosatti:KFLUXUI-1159/konflux-perf-improvements

@rrosatti
Contributor

Fixes

Fixes https://redhat.atlassian.net/browse/KFLUXUI-1159

Summary

The Konflux plugin backend was experiencing slow API responses (~14-17s) in the RHDH stage instance (https://inscope.corp.stage.redhat.com/). The issue is not reproducible locally; my assumption is that it only happens in environments with large catalogs (thousands of entities).

Root cause (likely): getRelatedEntities was calling catalog.getEntities() with no filter, fetching the entire catalog on every resource request. Because this ran multiple times per page load, it may have been blocking the Node.js event loop.

Main changes

Backend:

  • Replace unfiltered catalog scan with targeted lookup using the parent entity's hasPart relations and catalog.getEntitiesByRefs()
  • Cache CustomObjectsApi clients per cluster to avoid recreating KubeConfig and TLS connections on every request
  • Add 30s TTL cache for catalog entity/config/combination lookups to prevent duplicate calls across parallel requests
  • Strip metadata.managedFields from K8s and Kubearchive responses to reduce payload size
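
As an illustration of the last backend item, stripping `metadata.managedFields` could look roughly like the sketch below. The `K8sObject` shape and the `stripManagedFields` name are assumptions made for this example, not necessarily what the PR uses.

```typescript
// Minimal shape of a Kubernetes object for this sketch (assumed, simplified).
interface K8sObject {
  metadata?: { managedFields?: unknown; [key: string]: unknown };
  [key: string]: unknown;
}

// Hypothetical helper: returns a copy without metadata.managedFields and
// leaves the input object untouched.
function stripManagedFields(resource: K8sObject): K8sObject {
  if (resource.metadata?.managedFields === undefined) {
    return resource; // nothing to strip
  }
  const metadata = { ...resource.metadata };
  delete metadata.managedFields;
  return { ...resource, metadata };
}
```

For a list response, the same helper would be applied per item, for example `items.map(stripManagedFields)`.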

Frontend:

  • Increase react-query staleTime from 30s to 5min
  • Disable refetchOnWindowFocus to reduce unnecessary re-fetches

These changes should improve performance, but since the issue is environment-specific, further investigation may be needed if the problem persists in the stage instance ☕

Visual references

This is just to show that the plugin continues working with the new changes.

konflux-plugin-after-changes-all-good.mov

Plan

My plan is to try to reproduce this in their ephemeral environment once a new OCI image is published for testing 🙏

✔️ Checklist

  • A changeset describing the change and affected packages. (more info)
  • Added or Updated documentation
  • Tests for new functionality and regression tests for bug fixes
  • Screenshots attached (for UI changes)

…okup

getRelatedEntities was calling catalog.getEntities() with no filter,
fetching the entire catalog on every request.

Now uses the parent entity's hasPart relations with
catalog.getEntitiesByRefs() to fetch only the needed subcomponents.
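
A minimal sketch of that targeted lookup, under stated assumptions: the types below are local stand-ins for the Backstage `Entity` and catalog client, `fetchSubcomponents` is a hypothetical name, and the per-request `credentials` option is omitted for brevity. It is not the PR's actual `getRelatedEntities`.

```typescript
// Minimal stand-in types so the sketch runs on its own; the plugin itself
// would use Entity and the catalog service types from Backstage.
interface EntityRelation {
  type: string;
  targetRef: string;
}

interface MinimalEntity {
  kind: string;
  metadata: { name: string; namespace?: string };
  relations?: EntityRelation[];
}

interface MinimalCatalog {
  getEntitiesByRefs(request: {
    entityRefs: string[];
  }): Promise<{ items: Array<MinimalEntity | undefined> }>;
}

// Hypothetical helper, not the PR's getRelatedEntities.
async function fetchSubcomponents(
  parent: MinimalEntity,
  catalog: MinimalCatalog,
): Promise<MinimalEntity[]> {
  // Collect only the refs the parent declares as parts.
  const refs = (parent.relations ?? [])
    .filter(relation => relation.type === 'hasPart')
    .map(relation => relation.targetRef);

  if (refs.length === 0) {
    return []; // nothing to look up, so skip the catalog call entirely
  }

  // One batched request for exactly those refs; unresolved refs come back
  // as undefined and are dropped.
  const { items } = await catalog.getEntitiesByRefs({ entityRefs: refs });
  return items.filter(
    (item): item is MinimalEntity => item !== undefined,
  );
}
```

The cost of this call scales with the number of `hasPart` relations on the parent rather than with the size of the catalog.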
…elds

- Cache CustomObjectsApi instances per cluster instead of creating
  new KubeConfig + client on every request. Auth headers are injected
  per-request via middleware, so cached clients are safe to share.
- Add a 30s TTL cache for catalog entity/config/combination lookups
  to avoid duplicate calls across parallel resource requests.
- Strip metadata.managedFields from K8s and Kubearchive responses
  to reduce payload size (~50% reduction).
Increase react-query staleTime from 30s to 5min and disable
refetchOnWindowFocus to reduce unnecessary re-fetches when
switching browser tabs.

Add changesets with the changes made for both konflux and konflux-backend
plugins.
@rhdh-gh-app

rhdh-gh-app bot commented Mar 27, 2026

Changed Packages

| Package Name | Package Path | Changeset Bump | Current Version |
|---|---|---|---|
| @red-hat-developer-hub/backstage-plugin-konflux-backend | workspaces/konflux/plugins/konflux-backend | patch | v0.1.4 |
| @red-hat-developer-hub/backstage-plugin-konflux | workspaces/konflux/plugins/konflux | patch | v0.1.4 |

@rrosatti rrosatti marked this pull request as ready for review March 27, 2026 12:57
@rhdh-qodo-merge

Review Summary by Qodo

Improve Konflux backend performance with targeted catalog lookups and response caching

✨ Enhancement 🐞 Bug fix

Walkthroughs

Description
• Replace unfiltered catalog scan with targeted entity lookup using hasPart relations
• Cache K8s API clients per cluster to avoid recreating connections
• Add 30s TTL cache for catalog lookups to prevent duplicate parallel requests
• Strip metadata.managedFields from K8s responses to reduce payload size
• Increase react-query staleTime to 5min and disable refetchOnWindowFocus
Diagram
flowchart LR
  A["Catalog Scan"] -->|Before: getEntities| B["Full Catalog"]
  B -->|Filter in-memory| C["Related Entities"]
  
  D["Catalog Scan"] -->|After: hasPart relations| E["Entity Refs"]
  E -->|getEntitiesByRefs| F["Targeted Entities"]
  
  G["API Client"] -->|Before: New per request| H["KubeConfig + TLS"]
  I["API Client"] -->|After: Cached per cluster| J["Reused Connection"]
  
  K["Response"] -->|Before: Full metadata| L["Large Payload"]
  M["Response"] -->|After: Strip managedFields| N["Reduced Payload"]

File Changes

1. workspaces/konflux/plugins/konflux-backend/src/helpers/config.ts 🐞 Bug fix +16/-13

Replace unfiltered catalog scan with targeted lookup


2. workspaces/konflux/plugins/konflux-backend/src/helpers/client-factory.ts ✨ Enhancement +63/-1

Add client caching and getOrCreateClient function


3. workspaces/konflux/plugins/konflux-backend/src/services/konflux-service.ts ✨ Enhancement +112/-15

Add 30s TTL cache for catalog entity lookups

4. workspaces/konflux/plugins/konflux-backend/src/services/kubearchive-service.ts ✨ Enhancement +12/-9

Use cached client and strip managedFields from responses


5. workspaces/konflux/plugins/konflux-backend/src/services/resource-fetcher.ts ✨ Enhancement +12/-9

Use cached client and strip managedFields from responses


6. workspaces/konflux/plugins/konflux/src/hooks/useKonfluxResource.ts ✨ Enhancement +3/-4

Increase staleTime to 5min and disable refetchOnWindowFocus


7. workspaces/konflux/plugins/konflux/src/api/KonfluxQueryProvider.tsx ✨ Enhancement +2/-4

Update default query client cache and refetch settings


8. workspaces/konflux/plugins/konflux-backend/src/helpers/__tests__/config.test.ts 🧪 Tests +14/-4

Update tests to use getEntitiesByRefs and hasPart relations


9. workspaces/konflux/plugins/konflux-backend/src/services/__tests__/kubearchive-service.test.ts 🧪 Tests +9/-15

Update tests to mock getOrCreateClient instead of createKubeConfig


10. workspaces/konflux/plugins/konflux-backend/src/services/__tests__/resource-fetcher.test.ts 🧪 Tests +8/-15

Update tests to mock getOrCreateClient instead of createKubeConfig


11. workspaces/konflux/.changeset/better-melons-shave.md 📝 Documentation +7/-0

Add changeset for backend performance improvements


12. workspaces/konflux/.changeset/fifty-moose-appear.md 📝 Documentation +5/-0

Add changeset for frontend query optimization


@rhdh-qodo-merge

rhdh-qodo-merge bot commented Mar 27, 2026

Code Review by Qodo

🐞 Bugs (2) 📘 Rule violations (0) 📎 Requirement gaps (0) 📐 Spec deviations (0)


Action required

1. Cross-user catalog cache reuse 🐞 Bug ⛨ Security
Description
KonfluxService caches catalog entities/config/combinations by entityRef only, so a request can
receive cached results that were fetched under different BackstageCredentials/user identity. This
can bypass per-request catalog authorization checks and leak entity/subcomponent-derived
configuration across users for the cache TTL window.
Code

workspaces/konflux/plugins/konflux-backend/src/services/konflux-service.ts[R181-247]

+  private async getCachedEntity(
+    entityRef: string,
+    credentials: BackstageCredentials,
+  ): Promise<Entity | undefined> {
+    const cacheKey = `entity:${entityRef}`;
+    let entity = this.catalogCache.get<Entity>(cacheKey);
+    if (!entity) {
+      entity =
+        (await this.catalog!.getEntityByRef(entityRef, {
+          credentials,
+        })) ?? undefined;
+      if (entity) {
+        this.catalogCache.set(cacheKey, entity);
+      }
+    }
+    return entity;
+  }
+
+  /**
+   * Fetch and cache the Konflux configuration for an entity
+   */
+  private async getCachedKonfluxConfig(
+    entityRef: string,
+    entity: Entity,
+    credentials: BackstageCredentials,
+  ): Promise<KonfluxConfig | undefined> {
+    const cacheKey = `config:${entityRef}`;
+    let konfluxConfig = this.catalogCache.get<KonfluxConfig>(cacheKey);
+    if (!konfluxConfig) {
+      konfluxConfig = await getKonfluxConfig(
+        this.config,
+        entity,
+        credentials,
+        this.catalog!,
+        this.konfluxLogger,
+      );
+      if (konfluxConfig) {
+        this.catalogCache.set(cacheKey, konfluxConfig);
+      }
+    }
+    return konfluxConfig;
+  }
+
+  /**
+   * Fetch and cache cluster-namespace combinations for an entity
+   */
+  private async getCachedCombinations(
+    entityRef: string,
+    entity: Entity,
+    credentials: BackstageCredentials,
+    konfluxConfig: KonfluxConfig,
+  ): Promise<SubcomponentClusterConfig[]> {
+    const cacheKey = `combinations:${entityRef}`;
+    let combinations =
+      this.catalogCache.get<SubcomponentClusterConfig[]>(cacheKey);
+    if (!combinations) {
+      combinations = await determineClusterNamespaceCombinations(
+        entity,
+        credentials,
+        konfluxConfig,
+        this.konfluxLogger,
+        this.catalog!,
+      );
+      this.catalogCache.set(cacheKey, combinations);
+    }
+    return combinations;
+  }
Evidence
KonfluxService is instantiated once per router and reused for all requests, while
getCachedEntity/getCachedKonfluxConfig/getCachedCombinations build cache keys that omit
credentials/user identity, returning cached data without calling the catalog with the current
request credentials. Elsewhere in this repo, entity access is explicitly permissioned per-request
credentials (catalog.entity.read), demonstrating that access can differ between users and must not
be bypassed by shared caches.

workspaces/konflux/plugins/konflux-backend/src/router.ts[151-201]
workspaces/konflux/plugins/konflux-backend/src/services/konflux-service.ts[178-247]
workspaces/scorecard/plugins/scorecard-backend/src/permissions/permissionUtils.ts[35-48]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
KonfluxService’s new TTL cache is keyed only by `entityRef` (e.g., `entity:${entityRef}`, `config:${entityRef}`, `combinations:${entityRef}`), but catalog access can be credential-dependent. Because the service instance is shared across requests, a cache hit can return results computed with a different user’s credentials.

### Issue Context
- Router creates a single `KonfluxService` instance and uses per-request `credentials`.
- This repo includes explicit permission checks for `catalog.entity.read` based on request credentials, demonstrating user-specific access control.

### Fix Focus Areas
- Include a stable caller identity in cache keys (e.g., `userEntityRef` when available, else user email), or change caching to be per-request/per-aggregateResources invocation rather than global to the service.
- Ensure cached `entity`, derived `konfluxConfig`, and `combinations` are all scoped consistently.

- workspaces/konflux/plugins/konflux-backend/src/services/konflux-service.ts[178-247]
- workspaces/konflux/plugins/konflux-backend/src/router.ts[151-201]
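
One way to apply the first fix-focus item is to fold the caller identity into every cache key. The sketch below assumes a credentials shape with a `principal` carrying `type`, `userEntityRef`, or `subject`, loosely modelled on Backstage's user and service principals; the interface and helper names are hypothetical. Callers that cannot be identified get no key, which is meant as "skip the cache for this request".

```typescript
// Stand-in for the relevant part of BackstageCredentials (assumed shape).
interface CallerCredentials {
  principal: { type: string; userEntityRef?: string; subject?: string };
}

type CacheKind = 'entity' | 'config' | 'combinations';

// Hypothetical helper: a stable identity string for the caller, or
// undefined when the caller cannot be identified.
function callerScope(credentials: CallerCredentials): string | undefined {
  const { principal } = credentials;
  if (principal.type === 'user' && principal.userEntityRef) {
    return `user=${principal.userEntityRef}`;
  }
  if (principal.type === 'service' && principal.subject) {
    return `service=${principal.subject}`;
  }
  return undefined;
}

// Hypothetical helper: one key format for all three cached values, so the
// entity, config, and combinations are scoped consistently.
function scopedCacheKey(
  kind: CacheKind,
  entityRef: string,
  credentials: CallerCredentials,
): string | undefined {
  const scope = callerScope(credentials);
  // '|' separates the parts because entity refs already contain ':' and '/'.
  return scope === undefined ? undefined : `${kind}|${scope}|${entityRef}`;
}
```

Per-caller keys multiply the number of entries, which makes a size bound on the cache more important.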

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools



Remediation recommended

2. Unbounded catalog TTL cache 🐞 Bug ⛯ Reliability
Description
CatalogCache is an unbounded Map with lazy expiry cleanup (only removed on get), so
high-cardinality entityRefs/users can cause memory growth over the backend process lifetime. Expired
entries for one-off keys are never reclaimed unless accessed again.
Code

workspaces/konflux/plugins/konflux-backend/src/services/konflux-service.ts[R56-84]

+const CATALOG_CACHE_TTL_MS = 30_000; // 30 seconds
+
+interface CacheEntry<T> {
+  data: T;
+  expiry: number;
+}
+
+/**
+ * Simple TTL cache for catalog lookups. Prevents duplicate catalog calls
+ * when multiple resource requests arrive in parallel for the same entity.
+ */
+class CatalogCache {
+  private readonly cache = new Map<string, CacheEntry<unknown>>();
+
+  get<T>(key: string): T | undefined {
+    const entry = this.cache.get(key);
+    if (entry && Date.now() < entry.expiry) {
+      return entry.data as T;
+    }
+    if (entry) {
+      this.cache.delete(key);
+    }
+    return undefined;
+  }
+
+  set<T>(key: string, data: T): void {
+    this.cache.set(key, { data, expiry: Date.now() + CATALOG_CACHE_TTL_MS });
+  }
+}
Evidence
The TTL cache stores entries in a Map with no max size and no background sweep; expiration is
enforced only when get is called for a key, so keys that are never re-read remain in memory
indefinitely.

workspaces/konflux/plugins/konflux-backend/src/services/konflux-service.ts[56-84]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
`CatalogCache` can grow without bound because it uses a `Map` with no max size and only deletes expired entries on access.

### Issue Context
This backend likely runs as a long-lived process; with many unique entityRefs (and potentially per-user scoping), this cache can accumulate indefinitely.

### Fix Focus Areas
- Replace Map with a bounded LRU+TTL cache (or implement max entries + periodic sweep).
- Consider tracking cache size and logging/metrics to detect growth.

- workspaces/konflux/plugins/konflux-backend/src/services/konflux-service.ts[56-84]
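
A minimal sketch of the "max entries + sweep" option, keeping the `get`/`set` surface of the `CatalogCache` quoted above. The class name, the injectable clock, and the choice to sweep on every write (rather than on a timer) are assumptions of this example. Eviction here is by insertion order, which is simpler than true LRU.

```typescript
interface CacheEntry<T> {
  data: T;
  expiry: number;
}

// Hypothetical bounded variant of the TTL cache.
class BoundedTtlCache {
  private readonly cache = new Map<string, CacheEntry<unknown>>();
  private readonly ttlMs: number;
  private readonly maxEntries: number;
  private readonly now: () => number;

  constructor(
    ttlMs: number,
    maxEntries: number,
    now: () => number = () => Date.now(), // injectable clock for tests
  ) {
    this.ttlMs = ttlMs;
    this.maxEntries = maxEntries;
    this.now = now;
  }

  get<T>(key: string): T | undefined {
    const entry = this.cache.get(key);
    if (!entry) {
      return undefined;
    }
    if (this.now() >= entry.expiry) {
      this.cache.delete(key);
      return undefined;
    }
    return entry.data as T;
  }

  set<T>(key: string, data: T): void {
    this.sweepExpired();
    // Delete first so a rewritten key moves to the newest position.
    this.cache.delete(key);
    while (this.cache.size >= this.maxEntries) {
      const oldest = this.cache.keys().next().value;
      if (oldest === undefined) {
        break;
      }
      this.cache.delete(oldest);
    }
    this.cache.set(key, { data, expiry: this.now() + this.ttlMs });
  }

  get size(): number {
    return this.cache.size;
  }

  // Drop every expired entry, including keys that are never read again.
  private sweepExpired(): void {
    const now = this.now();
    this.cache.forEach((entry, key) => {
      if (now >= entry.expiry) {
        this.cache.delete(key);
      }
    });
  }
}
```

Sweeping on write keeps the class free of timers, at the cost of an O(n) pass per `set`; with a small `maxEntries` that pass stays cheap.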



3. Unbounded K8s client cache 🐞 Bug ⛯ Reliability
Description
The global clientCache for CustomObjectsApi clients is an unbounded Map with no eviction, so
clusters/apiUrl churn can cause memory growth across the backend process lifetime. This can lead to
elevated memory usage and eventual OOM in long-running deployments.
Code

workspaces/konflux/plugins/konflux-backend/src/helpers/client-factory.ts[R21-31]

+/**
+ * Cache for CustomObjectsApi clients keyed by "cluster:apiUrl".
+ *
+ * Per-request auth headers are injected via middleware in the callers,
+ * so cached clients are safe to share across users.
+ *
+ *  Caching avoids creating a new KubeConfig, CustomObjectsApi, and TLS
+ * connection on every API call.
+ */
+const clientCache = new Map<string, CustomObjectsApi>();
+
Evidence
clientCache is a module-level Map that only adds entries and never removes them; it persists for
the life of the Node.js process.

workspaces/konflux/plugins/konflux-backend/src/helpers/client-factory.ts[21-31]
workspaces/konflux/plugins/konflux-backend/src/helpers/client-factory.ts[120-158]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
`clientCache` stores `CustomObjectsApi` clients forever and is never evicted.

### Issue Context
Even if the number of clusters is usually small, this is a global cache keyed by `cluster:apiUrl` and will accumulate across process lifetime if configs/environments vary.

### Fix Focus Areas
- Implement a bounded cache (LRU) with a reasonable max size and/or TTL.
- Optionally expose a method to clear cache (e.g., for tests or config reload).

- workspaces/konflux/plugins/konflux-backend/src/helpers/client-factory.ts[21-31]
- workspaces/konflux/plugins/konflux-backend/src/helpers/client-factory.ts[120-158]
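
A possible shape for a bounded client cache, sketched under assumptions: the class is generic so the example runs without the Kubernetes client library, `BoundedClientCache` and its method names are hypothetical, and the `cluster:apiUrl` key format follows the comment quoted above. It relies on `Map` iteration order to evict the least recently used client.

```typescript
// Hypothetical LRU-bounded cache; C would be CustomObjectsApi in the plugin.
class BoundedClientCache<C> {
  private readonly clients = new Map<string, C>();
  private readonly maxClients: number;

  constructor(maxClients: number) {
    this.maxClients = maxClients;
  }

  getOrCreate(cluster: string, apiUrl: string, create: () => C): C {
    const key = `${cluster}:${apiUrl}`;
    const existing = this.clients.get(key);
    if (existing !== undefined) {
      // Re-insert to mark this client as most recently used.
      this.clients.delete(key);
      this.clients.set(key, existing);
      return existing;
    }
    if (this.clients.size >= this.maxClients) {
      // The first key in iteration order is the least recently used.
      const leastRecent = this.clients.keys().next().value;
      if (leastRecent !== undefined) {
        this.clients.delete(leastRecent);
      }
    }
    const client = create();
    this.clients.set(key, client);
    return client;
  }

  // Useful for tests or after a configuration reload.
  clear(): void {
    this.clients.clear();
  }

  get size(): number {
    return this.clients.size;
  }
}
```

An evicted client is simply recreated on its next use, so a bound that is too small costs performance, not correctness.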


