
Fix double newline in NDJSON bulk body when using RawEncoding#49557

Open
trilamsr wants to merge 3 commits into elastic:main from
trilamsr:fix/bulk-raw-encoding-double-newline

Conversation

@trilamsr

@trilamsr trilamsr commented Mar 19, 2026

Proposed commit message

Fix double newline in NDJSON bulk body when using RawEncoding

When events are pre-encoded in the queue (via event_encoder.go), the encoded bytes include a trailing newline from the Marshal/AddRaw call. When the bulk body assembler later writes these bytes via AddRaw(RawEncoding{...}), it unconditionally appends another newline, producing an empty line (\n\n) in the NDJSON bulk body.

While Elasticsearch tolerates empty lines in bulk requests, ES-compatible endpoints like Axiom and OpenSearch reject them with:

400 Bad Request: invalid event at index 1: ReadObject: expect { or , or } or n, but found \u0000

Closes #49558

What does this PR do?

Checks whether RawEncoding bytes already end with a newline and skips the additional WriteByte('\n') / Write(nl) if so. Applied to both jsonEncoder and gzipEncoder paths.

Backward compatible: RawEncoding bytes without a trailing newline still get the newline appended.

How to test this locally

cd libbeat/esleg/eslegclient
go test -run "TestRawEncodingNoDoubleNewline|TestEncoderHeaders" -v

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have made corresponding changes to the default configuration files
  • I have added tests that prove my fix is effective
  • I have added a changelog fragment

🤖 Generated with Claude Code

@trilamsr trilamsr requested review from a team as code owners March 19, 2026 07:15
@botelastic botelastic bot added the needs_team Indicates that the issue/PR needs a Team:* label label Mar 19, 2026
@botelastic

botelastic bot commented Mar 19, 2026

This pull request doesn't have a Team:<team> label.

@github-actions
Contributor

🤖 GitHub comments

Just comment with:

  • run docs-build : Re-trigger the docs validation. (use unformatted text in the comment!)

@cla-checker-service

cla-checker-service bot commented Mar 19, 2026

💚 CLA has been signed

@mergify
Contributor

mergify bot commented Mar 19, 2026

This pull request is now in conflicts. Could you fix it? 🙏
To fix up this pull request, you can check it out locally. See documentation: https://help.github.com/articles/checking-out-pull-requests-locally/

git fetch upstream
git checkout -b fix/bulk-raw-encoding-double-newline upstream/fix/bulk-raw-encoding-double-newline
git merge upstream/main
git push upstream fix/bulk-raw-encoding-double-newline

@mergify
Contributor

mergify bot commented Mar 19, 2026

This pull request does not have a backport label.
If this is a bug or security fix, could you label this PR @trilamsr? 🙏.
For such, you'll need to label your PR with:

  • The upcoming major version of the Elastic Stack
  • The upcoming minor version of the Elastic Stack (if you're not pushing a breaking change)

To fixup this pull request, you need to add the backport labels for the needed
branches, such as:

  • backport-8./d is the label to automatically backport to the 8./d branch. /d is the digit
  • backport-active-all is the label that automatically backports to all active branches.
  • backport-active-8 is the label that automatically backports to all active minor branches for the 8 major.
  • backport-active-9 is the label that automatically backports to all active minor branches for the 9 major.

When events are pre-encoded in the queue (via event_encoder.go), the
encoded bytes include a trailing newline from the Marshal/AddRaw call.
When the bulk body assembler later writes these bytes via
AddRaw(RawEncoding{...}), it unconditionally appends another newline,
producing an empty line (\n\n) in the NDJSON bulk body.

While Elasticsearch tolerates empty lines in bulk requests,
ES-compatible endpoints like Axiom and OpenSearch reject them with:

  400 Bad Request: invalid event at index 1: ReadObject: expect { or ,
  or } or n, but found \u0000

The fix checks whether RawEncoding bytes already end with a newline
and skips the additional one if so. This preserves backward
compatibility: RawEncoding bytes without a trailing newline (e.g. from
json.Marshal) still get the newline appended as before.

Applied to both jsonEncoder and gzipEncoder paths.
@trilamsr trilamsr force-pushed the fix/bulk-raw-encoding-double-newline branch from 138d525 to 1c08869 Compare March 19, 2026 07:17
@coderabbitai

coderabbitai bot commented Mar 19, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 19e53bc4-6f66-467c-829a-a865871e219c

📥 Commits

Reviewing files that changed from the base of the PR and between 457c007 and 4b9ffc6.

📒 Files selected for processing (3)
  • changelog/fragments/1773900000-fix-bulk-raw-encoding-double-newline.yaml
  • libbeat/esleg/eslegclient/enc.go
  • libbeat/esleg/eslegclient/enc_test.go
✅ Files skipped from review due to trivial changes (2)
  • libbeat/esleg/eslegclient/enc.go
  • libbeat/esleg/eslegclient/enc_test.go
🚧 Files skipped from review as they are similar to previous changes (1)
  • changelog/fragments/1773900000-fix-bulk-raw-encoding-double-newline.yaml

📝 Walkthrough

This change fixes a bug in NDJSON bulk request body encoding where pre-encoded events using RawEncoding were receiving an unintended double newline. The fix updates jsonEncoder.AddRaw and gzipEncoder.AddRaw to detect when v.Encoding already ends with '\n' and return without appending an extra newline. A unit test TestRawEncodingNoDoubleNewline was added to verify that combining a metadata line with a pre–raw-encoded JSON document does not produce "\n\n" in the resulting bulk body.

🚥 Pre-merge checks | ✅ 2
✅ Passed checks (2 passed)
Check name | Status | Explanation
Linked Issues check | ✅ Passed | The PR fully addresses all objectives from #49558: fixes double newline in RawEncoding by checking if bytes end with '\n' [49557], applies fix to both jsonEncoder and gzipEncoder paths, preserves backward compatibility for bytes without trailing newline, and includes tests verifying the fix.
Out of Scope Changes check | ✅ Passed | All changes are directly scoped to fixing the double-newline bug: changelog fragment, encoder logic fixes, and test cases. No unrelated modifications detected.


@trilamsr trilamsr force-pushed the fix/bulk-raw-encoding-double-newline branch from 457c007 to abeb91b Compare March 19, 2026 16:46

Labels

needs_team Indicates that the issue/PR needs a Team:* label

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Filebeat 9.x bulk requests contain double newline, breaking ES-compatible endpoints

2 participants