feat: PostgreSQL-backed search/document storage as ElasticSearch alternative #700

Open
pandor4u wants to merge 2 commits into moqui:master from pandor4u:feature/postgres-search-backend

@pandor4u

PostgreSQL-Backed Search & Document Storage

This PR adds a new postgres ElasticClient type that implements the full ElasticFacade.ElasticClient interface using PostgreSQL — eliminating the need for a separate ElasticSearch/OpenSearch cluster.

Motivation

Many Moqui deployments already run PostgreSQL. Requiring a separate ElasticSearch/OpenSearch cluster adds operational complexity, memory overhead, and cost — especially for small-to-medium deployments. This provides a zero-dependency alternative that uses PostgreSQL's native JSONB, tsvector full-text search, and GIN indexes.

Changes

New Files

  • PostgresElasticClient.groovy — Full ElasticClient implementation: index/get/update/delete/bulk/search/count operations backed by PostgreSQL tables
  • ElasticQueryTranslator.groovy — Translates ElasticSearch Query DSL (bool, term, terms, range, nested, exists, match_all, query_string, ids) into parameterized PostgreSQL SQL
  • PostgresSearchLogger.groovy — Log4j2 appender that writes structured logs to PostgreSQL instead of ES
  • SearchEntities.xml — Moqui entity definitions for moqui_search_index, moqui_document, moqui_logs, moqui_http_log with JSONB columns, tsvector, and GIN/BRIN indexes
  • moqui-postgres-only-compose.yml — Docker Compose for postgres-only deployment

Modified Files

  • ElasticFacadeImpl.groovy — Added type="postgres" client instantiation path
  • ElasticRequestLogFilter.groovy — Updated to work with postgres client
  • MoquiDefaultConf.xml — Default configuration for postgres search
  • moqui-conf-3.xsd — Schema update for type attribute on elastic-client
  • build.gradle — PostgreSQL JDBC driver dependency
  • MoquiSuite.groovy — Test suite registration

Tests (83 tests, all passing)

  • PostgresSearchTranslatorTests.groovy — 46 unit tests for query translation including 13 SQL injection prevention tests
  • PostgresElasticClientTests.groovy — 37 integration tests for CRUD, bulk, search, count, delete-by-query
  • PostgresSearchSuite.groovy — JUnit test suite

Security

  • All field names validated against ^[a-zA-Z0-9_@][a-zA-Z0-9_.\-]*$ regex before SQL interpolation
  • Double-dash (--) SQL comments explicitly blocked
  • LIMIT/OFFSET use JDBC parameterized queries (not string interpolation)
  • Docker compose uses environment variable references for credentials
  • TLS 1.2 minimum enforced in Docker nginx proxy
  • 13 dedicated security tests verify SQL injection rejection
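
The field-name rule above can be illustrated with a minimal sketch. The class and method names here (`FieldNameValidator`, `requireSafeFieldName`, `isSafe`) are hypothetical and not taken from the PR; only the regex and the `--` rule come from the list above. Note that the pattern allows single hyphens, which is presumably why the double-dash check is listed separately.

```java
import java.util.regex.Pattern;

public class FieldNameValidator {
    // Pattern quoted in the Security list; the class and method names are illustrative only.
    private static final Pattern SAFE_FIELD =
            Pattern.compile("^[a-zA-Z0-9_@][a-zA-Z0-9_.\\-]*$");

    /** Returns the name unchanged if it may be interpolated into SQL, otherwise throws. */
    public static String requireSafeFieldName(String name) {
        if (name == null || !SAFE_FIELD.matcher(name).matches()) {
            throw new IllegalArgumentException("Unsafe field name: " + name);
        }
        // The pattern accepts hyphens, so "a--b" would pass it; reject "--" explicitly.
        if (name.contains("--")) {
            throw new IllegalArgumentException("SQL comment sequence in field name: " + name);
        }
        return name;
    }

    /** Convenience wrapper that reports validity instead of throwing. */
    public static boolean isSafe(String name) {
        try {
            requireSafeFieldName(name);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }
}
```

Values would still go through JDBC parameters; a check like this only covers identifiers, which cannot be bound as parameters.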

Configuration

<elastic-client name="default" type="postgres" cluster-name="my-app"/>

Set type="postgres" on any elastic-client element in your Moqui XML configuration. The client will use the transactional entity group's database connection.

Testing

All 166 Postgres-related tests pass (83 per suite × 2 suite runs). No regressions in existing framework tests.

…rnative

Add a new 'postgres' ElasticClient type that implements the full
ElasticFacade.ElasticClient interface using PostgreSQL with JSONB
document storage, tsvector full-text search, and GIN indexes.

Key changes:
- PostgresElasticClient: full ElasticClient implementation backed by
  PostgreSQL tables (moqui_search_index, moqui_logs, moqui_http_log)
- ElasticQueryTranslator: translates ES Query DSL (bool, term, terms,
  range, nested, exists, match_all, query_string, ids) into
  parameterized PostgreSQL SQL with sanitized field names
- PostgresSearchLogger: Log4j2 appender writing to PostgreSQL
- SearchEntities.xml: entity definitions with JSONB, tsvector, GIN indexes
- Security hardening: field name sanitization, parameterized queries,
  env-var credentials in Docker, TLS 1.2 minimum
- Comprehensive test suite: 83 tests covering query translation, CRUD,
  bulk indexing, search, and SQL injection prevention

Configuration: set elastic-client type="postgres" in Moqui XML conf.
No external search engine dependency required.

Bug fixes for ElasticQueryTranslator:
- Add tsqueryParams field separation to prevent parameter binding
  mismatch when bool queries combine query_string with term/range
- Remove no-op AND→AND/OR→OR replacements in cleanLuceneQuery

Bug fixes for PostgresElasticClient:
- Fix bulk() delete pair skew: rewrite from fixed i+=2 stride to
  variable stride (delete consumes 1 item, others consume 2)
- Add prefixIndexName() on action-spec _index in bulk()
- Fix search() to use tq.tsqueryParams for SELECT score clause
  instead of tq.params (was misaligning parameter indexes)
- Wrap CREATE EXTENSION pg_trgm in try-catch for managed DB envs

Also adds AI instruction files (AGENTS.md, CLAUDE.md, GEMINI.md) and
updates .gitignore for development artifacts.
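
For readers unfamiliar with the bulk pairing issue mentioned in the commit message above, here is a rough sketch of variable-stride pairing. It assumes the bulk input is a flat list of maps in which each action entry has a single key (`index`, `create`, `update`, or `delete`) and every non-delete action is followed by its source document. The `BulkStride` and `BulkOp` names are hypothetical and this is not the PR's code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class BulkStride {
    /** One paired bulk operation; source is null for delete. Hypothetical helper type. */
    public static final class BulkOp {
        public final String action;
        public final Map<String, Object> spec;
        public final Map<String, Object> source;

        BulkOp(String action, Map<String, Object> spec, Map<String, Object> source) {
            this.action = action;
            this.spec = spec;
            this.source = source;
        }
    }

    /** Walks the flat list, consuming 1 item for delete and 2 items for every other action. */
    @SuppressWarnings("unchecked")
    public static List<BulkOp> pair(List<Map<String, Object>> items) {
        List<BulkOp> ops = new ArrayList<>();
        int i = 0;
        while (i < items.size()) {
            Map<String, Object> actionEntry = items.get(i);
            if (actionEntry.size() != 1) {
                throw new IllegalArgumentException("Expected a single-key action entry at position " + i);
            }
            Map.Entry<String, Object> entry = actionEntry.entrySet().iterator().next();
            String action = entry.getKey();
            Map<String, Object> spec = (Map<String, Object>) entry.getValue();
            if ("delete".equals(action)) {
                // delete carries no source document, so advance by one item only
                ops.add(new BulkOp(action, spec, null));
                i += 1;
            } else {
                if (i + 1 >= items.size()) {
                    throw new IllegalArgumentException("Missing source document after action at position " + i);
                }
                ops.add(new BulkOp(action, spec, items.get(i + 1)));
                i += 2;
            }
        }
        return ops;
    }
}
```

With a fixed `i += 2` stride, a delete followed by an index action would treat the index action as the delete's source and then read the document as an action, which is the skew the commit describes.
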
@schue

schue commented Mar 27, 2026

Here is some feedback from GLM-5.1

PR #700 Evaluation

Feature: Replaces ElasticSearch/OpenSearch with a PostgreSQL-backed search backend (JSONB + tsvector) via the existing ElasticFacade.ElasticClient interface.

What's in scope (appropriate for upstream)

  • Core framework changes in framework/: ElasticFacadeImpl, ElasticRequestLogFilter, MoquiDefaultConf.xml, XSD schema, build.gradle, MoquiSuite.groovy
  • New implementations: PostgresElasticClient (1276 lines), ElasticQueryTranslator (675 lines), PostgresSearchLogger (244 lines), SearchEntities.xml
  • Tests: PostgresSearchTranslatorTests (unit), PostgresElasticClientTests (integration), PostgresSearchSuite

What does NOT belong upstream

  • AGENTS.md, CLAUDE.md, GEMINI.md — fork-specific AI tooling instructions, all ~99% identical (510 lines of redundancy). They reference moqui-flutter, moqui-postgreonly, etc.
  • docker/moqui-postgres-only-compose.yml — fork-specific Docker setup
  • .gitignore additions — many are local dev tooling artifacts (.playwright-mcp/, *_PLAN.md, test_*.py, etc.)

Key technical concerns

  1. Fragile guessCastType() — uses field name heuristics (contains "date" → timestamp, "amount" → numeric). Will produce wrong SQL casts for unconventional naming.

  2. SQL copy-pasted 3x — the INSERT ... ON CONFLICT for documents appears identically in upsertDocument(), bulkIndex(), and bulkIndexDataDocument(). Should be a shared constant.

  3. minimum_should_match > 1 not implemented — code comment says "use CASE/SUM trick" but just uses OR, so the parameter is effectively ignored.

  4. Highlight support is a stub — uses Java regex instead of PostgreSQL ts_headline().

  5. Integration tests in main suite: PostgresSearchSuite is added to MoquiSuite, meaning the full test suite will fail without a live PostgreSQL. These should be opt-in or a separate profile.

  6. destroy() uses instanceof checks — could just call .destroy() polymorphically since both implement ElasticClient.

  7. Raw REST API methods throw UnsupportedOperationException, so any code using call(), callFuture(), or makeRestClient() will compile but fail at runtime.
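
On concern 3, the "CASE/SUM trick" could look roughly like the sketch below. It assumes each should clause has already been translated into a complete boolean SQL expression with `?` placeholders, and that the clauses' parameters are bound in the same order the clauses appear. `MinimumShouldMatch.combine` is a hypothetical helper, not something in the PR.

```java
import java.util.List;
import java.util.stream.Collectors;

public class MinimumShouldMatch {
    /**
     * Combines translated should clauses so that at least `minimum` of them must hold.
     * Clause order is preserved, so existing parameter binding order is unaffected.
     */
    public static String combine(List<String> shouldClauses, int minimum) {
        if (shouldClauses.isEmpty()) {
            throw new IllegalArgumentException("No should clauses");
        }
        if (minimum < 1) {
            throw new IllegalArgumentException("minimum must be at least 1");
        }
        if (minimum == 1) {
            // A plain OR already means "at least one clause matches".
            return shouldClauses.stream()
                    .map(c -> "(" + c + ")")
                    .collect(Collectors.joining(" OR ", "(", ")"));
        }
        // Count matching clauses: each CASE yields 1 or 0 (a NULL condition falls to ELSE 0).
        String sum = shouldClauses.stream()
                .map(c -> "CASE WHEN (" + c + ") THEN 1 ELSE 0 END")
                .collect(Collectors.joining(" + "));
        // minimum is an int, so appending it directly cannot inject SQL.
        return "((" + sum + ") >= " + minimum + ")";
    }
}
```

If `minimum` exceeds the number of clauses the expression is simply always false, which matches the intent of requiring more matches than there are clauses.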

Overall assessment

The architecture is solid — clean Strategy pattern, backward-compatible, good SQL injection prevention, proper parameterized queries. The main issue is that this PR mixes the legitimate framework feature with a lot of fork-specific baggage (3 identical AI instruction files, fork-specific gitignore entries, fork-specific docker compose). If you're evaluating this for upstream merge, those should be stripped out first, the integration test suite issue should be fixed, and the duplicated SQL should be consolidated.
