Enhance tenant replication process by adding default configuration fetching and improving backfill validation. #1625
Conversation
…tching and improving backfill validation. Updated tenant service to include a method for fetching default tenant configuration and modified the tenant consumer to utilize this configuration during replication. Improved error handling in backfill script for missing tenant fields.
WalkthroughTenant backfilling workflow extended to support both newly created tenants and backfill messages. Default tenant configuration is now fetched once and passed through the Kafka consumer pipeline. CSV validation requirements strengthened, and configuration replication now deletes existing data before applying new configuration in a transaction. Changes
Sequence Diagram(s)sequenceDiagram
participant Script as Backfill Script
participant Service as Tenant Service
participant Queue as Kafka Queue
participant Consumer as Kafka Consumer
participant DB as Database
Script->>Service: fetchDefaultTenantConfig()
Service->>DB: Query default tenant config<br/>(templates, forms, entity types, etc.)
DB-->>Service: config object
Service-->>Script: defaultConfig
Script->>Script: Iterate CSV rows
Script->>Queue: Publish message with<br/>defaultConfig + tenant data
Consumer->>DB: Find/Create tenant
alt Tenant Created or Backfill
Consumer->>Service: replicateConfigFromDefaultTenant<br/>(code, orgId, orgCode, defaultConfig)
Service->>DB: DELETE FROM config tables<br/>WHERE tenant_code = :tenantCode
Service->>DB: INSERT config data from<br/>pre-fetched defaultConfig
Service->>DB: Rebuild materialized views
end
alt New Tenant Only
Service->>DB: Schedule periodic refresh job
end
DB-->>Consumer: Success
Consumer-->>Script: Acknowledgement
Estimated code review effort🎯 4 (Complex) | ⏱️ ~50 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/services/tenant.js (1)
199-204:⚠️ Potential issue | 🟠 Major
fetchExistingqueries wrongorganization_coderange; will not find inserted entity types whenorg_codein message differs fromDEFAULT_ORGANISATION_CODEenv var.The
transform(line 150) writes entity types withorganization_code: resolvedOrgCode(which isnewOrgCode || defaultOrgCode), butfetchExisting(line 201) queries only for[defaultOrgCode]. When the Kafka message contains anorg_codethat differs fromDEFAULT_ORGANISATION_CODE, the inserted records withorganization_code = org_codewon't be found during the fallback query. This causesreplicateWithIdMapto attempt re-insertion on the next run, potentially duplicating records.Per the design intent (configuration data should retain the default org's code), update the
baseTransformto usedefaultOrgCodeinstead ofresolvedOrgCode, or updatefetchExistingto query[resolvedOrgCode]. The matchKey comparison will also need to match the selected organization code.Suggested fix (preserve default org code intent)
- organization_code: resolvedOrgCode, + organization_code: defaultOrgCode,and
- fetchExisting: () => EntityTypeQueries.findAllEntityTypes([defaultOrgCode], newTenantCode, null), - matchKey: (oldET) => (c) => c.value === oldET.value && c.organization_code === oldET.organization_code, + fetchExisting: () => EntityTypeQueries.findAllEntityTypes([defaultOrgCode], newTenantCode, null), + matchKey: (oldET) => (c) => c.value === oldET.value,🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/services/tenant.js` around lines 199 - 204, The bug is that transform is writing entity types with the resolved org code but fetchExisting only looks up [defaultOrgCode], causing misses and duplicate inserts; to fix, make the transform produced by baseTransform use defaultOrgCode (not resolvedOrgCode/newOrgCode) so stored records use the default org code, ensure EntityTypeQueries.findAllEntityTypes is still called with [defaultOrgCode], and update the matchKey comparator in replicateWithIdMap (the matchKey closure that compares c.organization_code) to compare against defaultOrgCode so existing rows are correctly matched.
🧹 Nitpick comments (2)
src/scripts/backfillTenantData.js (1)
67-68: Consider deferringfetchDefaultTenantConfig()when--dry-runis specified.Currently, the default config is fetched unconditionally before processing tenants. In dry-run mode, this call is unnecessary and will fail if
DEFAULT_TENANT_CODE/DEFAULT_ORGANISATION_CODEenvironment variables are not configured, preventing users from validating their CSV file structure without a full environment setup.♻️ Suggested improvement
async function backfillTenants(tenants, options = {}) { const { dryRun = false } = options let success = 0 let failed = 0 - const defaultConfig = await TenantService.fetchDefaultTenantConfig() + let defaultConfig = null + if (!dryRun) { + defaultConfig = await TenantService.fetchDefaultTenantConfig() + } for (const tenant of tenants) {🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/scripts/backfillTenantData.js` around lines 67 - 68, The code unconditionally calls TenantService.fetchDefaultTenantConfig() into defaultConfig before tenant processing, which breaks --dry-run; change the logic to defer or skip calling fetchDefaultTenantConfig when dry-run is active (detect the existing dryRun flag or the presence of "--dry-run" in process.argv) and only fetch defaultConfig when actually performing writes; update any code paths that assume defaultConfig exists to handle it being undefined in dry-run mode (refer to TenantService.fetchDefaultTenantConfig, defaultConfig, and the dry-run flag/variable).src/services/tenant.js (1)
156-175: Clear-before-replicate pattern ensures idempotent reruns.The ordered DELETE statements correctly handle foreign key dependencies (e.g.,
entitiesbeforeentity_types). This approach guarantees a clean slate for backfill operations.One consideration: using raw SQL with hardcoded table names couples this code to the database schema. If table names change or soft deletes are introduced, this could silently break.
💡 Optional: Use model-based destroy for consistency
Consider using Sequelize model
destroymethods (e.g.,EntityModel.destroy({ where: { tenant_code: newTenantCode }, transaction })) to:
- Respect soft-delete behavior if added later
- Get automatic table name resolution from models
- Trigger any model hooks if needed
However, the current raw SQL approach is acceptable for performance-critical bulk operations.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/services/tenant.js` around lines 156 - 175, The current loop using configTables and db.sequelize.query deletes rows by table name and tenant_code (using newTenantCode and transaction) which can break if table names or soft-delete behavior change; replace these raw DELETEs with the corresponding Sequelize model destroy calls (e.g., EntityModel.destroy({ where: { tenant_code: newTenantCode }, transaction })) for each table represented in configTables so model table names, soft-deletes, and hooks are honored (or keep raw SQL but add a clear comment explaining intentional bypass of model behavior and ensure table names stay in sync).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In `@src/services/tenant.js`:
- Around line 199-204: The bug is that transform is writing entity types with
the resolved org code but fetchExisting only looks up [defaultOrgCode], causing
misses and duplicate inserts; to fix, make the transform produced by
baseTransform use defaultOrgCode (not resolvedOrgCode/newOrgCode) so stored
records use the default org code, ensure EntityTypeQueries.findAllEntityTypes is
still called with [defaultOrgCode], and update the matchKey comparator in
replicateWithIdMap (the matchKey closure that compares c.organization_code) to
compare against defaultOrgCode so existing rows are correctly matched.
---
Nitpick comments:
In `@src/scripts/backfillTenantData.js`:
- Around line 67-68: The code unconditionally calls
TenantService.fetchDefaultTenantConfig() into defaultConfig before tenant
processing, which breaks --dry-run; change the logic to defer or skip calling
fetchDefaultTenantConfig when dry-run is active (detect the existing dryRun flag
or the presence of "--dry-run" in process.argv) and only fetch defaultConfig
when actually performing writes; update any code paths that assume defaultConfig
exists to handle it being undefined in dry-run mode (refer to
TenantService.fetchDefaultTenantConfig, defaultConfig, and the dry-run
flag/variable).
In `@src/services/tenant.js`:
- Around line 156-175: The current loop using configTables and
db.sequelize.query deletes rows by table name and tenant_code (using
newTenantCode and transaction) which can break if table names or soft-delete
behavior change; replace these raw DELETEs with the corresponding Sequelize
model destroy calls (e.g., EntityModel.destroy({ where: { tenant_code:
newTenantCode }, transaction })) for each table represented in configTables so
model table names, soft-deletes, and hooks are honored (or keep raw SQL but add
a clear comment explaining intentional bypass of model behavior and ensure table
names stay in sync).
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: fda3e40f-7899-4987-84e7-370870851545
📒 Files selected for processing (3)
src/generics/kafka/consumers/tenant.jssrc/scripts/backfillTenantData.jssrc/services/tenant.js
Updated tenant service to include a method for fetching default tenant configuration and modified the tenant consumer to utilize this configuration during replication. Improved error handling in backfill script for missing tenant fields.
Release Notes
fetchDefaultTenantConfig()method to TenantService to load default tenant configuration from environment-driven source, including templates, forms, entity types, entities, questions, question sets, reports metadata, and role extensionscode,name,org_id,org_code) to be present in CSV, removing default environment variable fallbacks for organization fields