CLDSRV-863: add checksums to PutObject and UploadPart #6094

Draft
leif-scality wants to merge 1 commit into development/9.4 from improvement/CLDSRV-863-checksums-put-object-part

Conversation

@leif-scality
Contributor

No description provided.

@bert-e
Contributor

bert-e commented Mar 6, 2026

Hello leif-scality,

My role is to assist you with the merge of this
pull request. Please type @bert-e help to get information
on this process, or consult the user documentation.

Available options

| name | description | privileged | authored |
| --- | --- | --- | --- |
| /after_pull_request | Wait for the given pull request id to be merged before continuing with the current one. | | |
| /bypass_author_approval | Bypass the pull request author's approval | | |
| /bypass_build_status | Bypass the build and test status | | |
| /bypass_commit_size | Bypass the check on the size of the changeset TBA | | |
| /bypass_incompatible_branch | Bypass the check on the source branch prefix | | |
| /bypass_jira_check | Bypass the Jira issue check | | |
| /bypass_peer_approval | Bypass the pull request peers' approval | | |
| /bypass_leader_approval | Bypass the pull request leaders' approval | | |
| /approve | Instruct Bert-E that the author has approved the pull request. | | ✍️ |
| /create_pull_requests | Allow the creation of integration pull requests. | | |
| /create_integration_branches | Allow the creation of integration branches. | | |
| /no_octopus | Prevent Wall-E from doing any octopus merge and use multiple consecutive merge instead | | |
| /unanimity | Change review acceptance criteria from one reviewer at least to all reviewers | | |
| /wait | Instruct Bert-E not to run until further notice. | | |

Available commands

| name | description | privileged |
| --- | --- | --- |
| /help | Print Bert-E's manual in the pull request. | |
| /status | Print Bert-E's current status in the pull request TBA | |
| /clear | Remove all comments from Bert-E from the history TBA | |
| /retry | Re-start a fresh build TBA | |
| /build | Re-start a fresh build TBA | |
| /force_reset | Delete integration branches & pull requests, and restart merge process from the beginning. | |
| /reset | Try to remove integration branches unless there are commits on them which do not appear on the source branch. | |

Status report is not available.

@bert-e
Contributor

bert-e commented Mar 6, 2026

Incorrect fix version

The Fix Version/s in issue CLDSRV-863 contains:

  • None

Considering where you are trying to merge, I ignored possible hotfix versions and I expected to find:

  • 9.4.0

Please check the Fix Version/s of CLDSRV-863, or the target
branch of this pull request.

leif-scality changed the title from "tmp" to "CLDSRV-863: add checksums to PutObject and UploadPart" on Mar 6, 2026
config.json Outdated
"clusters": 1,
"log": {
"logLevel": "info",
"logLevel": "trace",

Log level changed to trace in committed config. This will produce extremely verbose logs in any environment using the default config.

— Claude Code

const mdOnlyHeader = request.headers['x-amz-meta-mdonly'];
const mdOnlySize = request.headers['x-amz-meta-size'];

// console.log('============== createAndStoreObject');

Commented-out console.log statements left in production code. Remove before merging.

— Claude Code

const { data } = require('../../../data/wrapper');
const { prepareStream, stripTrailingChecksumStream } = require('./prepareStream');
// const { prepareStream, prepareStream2, stripTrailingChecksumStream } = require('./prepareStream');
const { prepareStream2 } = require('./prepareStream');

Commented-out imports and console.log left in production code. The old prepareStream and stripTrailingChecksumStream imports are commented out instead of removed.

— Claude Code

// if (!dataStreamTmp) {
// return process.nextTick(() => cb(errors.InvalidArgument));
// }
// const dataStream = stripTrailingChecksumStream(dataStreamTmp, log, cbOnce);

Commented-out old code block (lines 63-67) should be removed, not left as dead code.

— Claude Code

const valid = checksumedStream.validateChecksum();
if (valid !== null) {
// console.log(valid);
return cbOnce(errors.BadDigest);

When checksum validation fails, the stored data is not cleaned up. checkHashMatchMD5 deletes stored data on MD5 mismatch, but here the data is left orphaned. Should call data.batchDelete on dataRetrievalInfo before returning the error.
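
One possible shape for that cleanup is sketched below. It is a minimal illustration rather than the PR's code: `finishPut`, the stubbed `errors` and `data` objects, and the three-argument `batchDelete` signature are assumptions made so the example runs on its own.

```javascript
// Stand-ins so the sketch runs on its own; the real modules are not reproduced here.
const errors = { BadDigest: new Error('BadDigest') };
const deletedLocations = [];
const data = {
    // Hypothetical simplified signature; the real data.batchDelete may take other arguments.
    batchDelete(locations, log, cb) {
        deletedLocations.push(...locations);
        cb(null);
    },
};

// Hypothetical helper: validate the checksum and delete the stored data on failure.
function finishPut(checksumedStream, dataRetrievalInfo, log, cb) {
    const checksumErr = checksumedStream.validateChecksum();
    if (checksumErr === null) {
        return cb(null, dataRetrievalInfo);
    }
    // Remove what was just written, then report the mismatch (or the delete error).
    return data.batchDelete([dataRetrievalInfo], log, err => cb(err || errors.BadDigest));
}
```

Passing `err || errors.BadDigest` means the client sees the digest failure unless the delete itself fails, in which case the delete error is surfaced instead.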

— Claude Code

// Authentication header, while the chunked upload method
// requires V4: in such case we don't get any V4 params
// and we should return an error to the client.
return null; // FIXME: use CB

return null with FIXME comment — the caller (dataStore in storeObject.js) passes the result directly to data.put without a null check. The old prepareStream had a null guard (if (!dataStreamTmp)) but it was commented out. This will crash with a TypeError when piping from null.
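
A minimal sketch of restoring the guard follows. The `prepareStream2` and `dataStore` bodies here are simplified stand-ins (the `hasV4Params` flag and the `put` parameter are hypothetical), kept only detailed enough to show where the null check sits.

```javascript
const errors = { InvalidArgument: new Error('InvalidArgument') };

// Hypothetical stand-in for prepareStream2: null means the V4 params were missing.
function prepareStream2(request) {
    return request.hasV4Params ? { readable: true } : null;
}

// Hypothetical caller modelled on dataStore; put stands in for data.put.
function dataStore(request, put, cb) {
    const dataStream = prepareStream2(request);
    if (!dataStream) {
        // Defer the callback so it is asynchronous on every path.
        return process.nextTick(() => cb(errors.InvalidArgument));
    }
    return put(dataStream, cb);
}
```

With the guard in place, `put` is never reached with a null stream, so the pipe that would throw a TypeError is not attempted.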

— Claude Code

checksumAlgo.expected,
checksumAlgo.isTrailer,
);
stream.pipe(checksumedStream);

No error listener on ChecksumTransform. The STREAMING-AWS4-HMAC-SHA256-PAYLOAD, UNSIGNED-PAYLOAD, and default cases pipe into a ChecksumTransform but never attach an error event handler. Unhandled stream errors will crash the process.

— Claude Code

return checksumedStream;
}
case 'STREAMING-UNSIGNED-PAYLOAD-TRAILER': {
// console.log(checksumAlgo);

Commented-out console.log left in production code. Remove before merging.

— Claude Code


_transform(chunk, encoding, callback) {
const input = Buffer.isBuffer(chunk) ? chunk : Buffer.from(chunk);
// console.log("chunk: '%s'", input);

Commented-out console.log left in production code.

— Claude Code


validateChecksum() {
if (this.isTrailer) {
// FIXME: Handle trailer is missing

FIXME comment indicates a known unhandled case: if setExpectedChecksum is never called, trailerChecksumValue is undefined, producing a MalformedChecksum error instead of a more appropriate missing trailer error. This should be resolved before merging.
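
As an illustration of separating the two cases, here is a small sketch. The `ChecksumError` member names and the free-standing `validateChecksum(state)` form are assumptions; the field names follow the ones mentioned in this review.

```javascript
// Hypothetical error names; the PR's ChecksumError members may be named differently.
const ChecksumError = Object.freeze({
    TrailerMissing: 'TrailerMissing',
    Mismatch: 'Mismatch',
});

// state mirrors the fields a ChecksumTransform might hold once the stream has ended.
function validateChecksum(state) {
    const expected = state.isTrailer ? state.trailerChecksumValue : state.expectedDigest;
    if (state.isTrailer && expected === undefined) {
        // The trailer was announced in the headers but never arrived.
        return { error: ChecksumError.TrailerMissing };
    }
    if (expected !== undefined && expected !== state.digest) {
        return { error: ChecksumError.Mismatch, details: { expected, calculated: state.digest } };
    }
    return null;
}
```

Checking for the missing trailer first keeps an absent value from being reported as a malformed one.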

— Claude Code

const vault = require('../vault');
const constructChunkStringToSign = require('./constructChunkStringToSign');

// Do we use this one or vaults?

Stray question comment added to an unrelated file. Remove before merging.

— Claude Code


const { unsupportedSignatureChecksums, supportedSignatureChecksums } = require('../../../../constants');

// FIXME: merge this

FIXME and question comments should be resolved or removed before merging.

— Claude Code

"test-gcp": "mocha -t 40000 test/GCP/ --exit",
"test-routes": "mocha -t 40000 test/routes/ --exit",
"test": "mocha -t 40000 test/ --exit",
"test": "mocha -t 40000 test/xAmzChecksum.js --exit",

Test script hardcoded to only run xAmzChecksum.js instead of the full test suite. This will cause other functional tests to stop running.

— Claude Code

}
const value = Buffer.alloc(parsedContentLength);
const cbOnce = jsutil.once(callback);
// TODO

Bare TODO comment with no context. If this Veeam code path needs updating for prepareStream2, track it as a follow-up issue.

— Claude Code


const trailerAlgo = trailer.slice('x-amz-checksum-'.length);
if (!(trailerAlgo in algorithms)) {
return { error: ChecksumError.AlgoNotSupported, details: { algorithm: trailerAlgo } };;

Double semicolons.

— Claude Code

@codecov

codecov bot commented Mar 6, 2026

❌ 1 Tests Failed:

| Tests completed | Failed | Passed | Skipped |
| --- | --- | --- | --- |
| 8163 | 1 | 8162 | 0 |
View the top 1 failed test(s) by shortest run time
"before each" hook for "should not respond to request with CORS headers, even if request was sent with Origin header"::Cross Origin Resource Sharing requests "before each" hook for "should not respond to request with CORS headers, even if request was sent with Origin header"
Stack Traces | 0.005s run time
The requested bucket name is not available. The bucket namespace is shared by all users of the system. Please select a different name and try again.


leif-scality force-pushed the improvement/CLDSRV-863-checksums-put-object-part branch from 163fccb to 5b6aa59 on March 6, 2026 17:51
@claude

claude bot commented Mar 6, 2026

Issues found:

1. Crash bug: prepareStream2 returns null but the caller has no null guard.
2. Data leak: stored data is not cleaned up when validateChecksum fails.
3. Missing stream error handlers on ChecksumTransform.
4. config.json logLevel changed to trace.
5. Test script narrowed to a single file.
6. Debug artifacts: commented-out console.log and code blocks.
7. Unresolved FIXME/TODO comments.
8. Double semicolons in validateChecksums.js:205.

Review by Claude Code

const mdOnlyHeader = request.headers['x-amz-meta-mdonly'];
const mdOnlySize = request.headers['x-amz-meta-size'];

// console.log('============== createAndStoreObject');

Commented-out debug console.log statements should be removed before merging. There are multiple instances throughout this file.

— Claude Code

// Authentication header, while the chunked upload method
// requires V4: in such case we don't get any V4 params
// and we should return an error to the client.
return null; // FIXME: use CB

This returns null with a FIXME comment. The caller in storeObject.js will pass null to data.put(), causing a crash. Should call errCb(errors.InvalidArgument) instead.

— Claude Code

checksumAlgo.expected,
checksumAlgo.isTrailer,
);
stream.pipe(checksumedStream);

No error handler on ChecksumTransform streams. If ChecksumTransform emits an error, it will crash the process. The TrailingChecksumTransform path correctly adds .on('error', errCb) at line 87, but ChecksumTransform never gets one in any branch. Add checksumedStream.on('error', errCb) in each case.
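
The sketch below shows the pattern with Node's standard `stream` module. `FailingFlush` is a toy class standing in for a ChecksumTransform whose digest step fails, and `attachChecksum` is a hypothetical helper name; neither comes from the PR.

```javascript
const { Transform, PassThrough } = require('stream');

// Toy transform that fails in _flush, standing in for a digest step that throws.
class FailingFlush extends Transform {
    _transform(chunk, encoding, callback) {
        callback(null, chunk);
    }

    _flush(callback) {
        callback(new Error('digest failed'));
    }
}

// Hypothetical helper: wire the source into the transform and register errCb first.
function attachChecksum(source, errCb) {
    const checksumedStream = new FailingFlush();
    // pipe() does not forward errors between streams, so this stream needs its own listener.
    checksumedStream.on('error', errCb);
    source.pipe(checksumedStream);
    return checksumedStream;
}

// Usage: the error is delivered to the handler instead of becoming an uncaught exception.
const source = new PassThrough();
const out = attachChecksum(source, err => console.error('handled:', err.message));
out.resume();
source.end('payload');
```

Registering the listener before calling `pipe()` avoids any window in which the stream could emit an error with no handler attached.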

— Claude Code


const { data } = require('../../../data/wrapper');
const { prepareStream, stripTrailingChecksumStream } = require('./prepareStream');
// const { prepareStream, prepareStream2, stripTrailingChecksumStream } = require('./prepareStream');

Commented-out imports should be removed.

— Claude Code

// return process.nextTick(() => cb(errors.InvalidArgument));
// }
// const dataStream = stripTrailingChecksumStream(dataStreamTmp, log, cbOnce);
// console.log('...................................');

Commented-out code and debug console.log should be removed.

— Claude Code


// console.log('================',
// checksumedStream.algoName, checksumedStream.digest, checksumedStream.expectedDigest);
const valid = checksumedStream.validateChecksum();

validateChecksum() reads this.digest set in _flush(). The data.put callback fires when the backend finishes consuming data, but _flush() runs when the stream ends. If _flush() has not completed, this.digest is undefined and validation silently passes (non-trailer) or produces wrong results (trailer). Consider validating inside _flush() or waiting for the finish event.
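
A minimal sketch of validating inside `_flush()` is shown below. It uses SHA-256 from Node's `crypto` module purely for illustration, not the CRC algorithms of this PR, and the class name `ValidatingChecksumTransform` is hypothetical.

```javascript
const crypto = require('crypto');
const { Transform } = require('stream');

// Illustrative only: SHA-256 replaces the PR's checksum algorithms.
class ValidatingChecksumTransform extends Transform {
    constructor(expectedBase64) {
        super();
        this.expected = expectedBase64;
        this.hash = crypto.createHash('sha256');
    }

    _transform(chunk, encoding, callback) {
        this.hash.update(chunk);
        callback(null, chunk);
    }

    _flush(callback) {
        // The digest is computed and compared at the same point, so it cannot be read too early.
        this.digest = this.hash.digest('base64');
        if (this.expected !== undefined && this.digest !== this.expected) {
            return callback(new Error('BadDigest'));
        }
        return callback();
    }
}
```

In this arrangement a mismatch surfaces as a stream error before `end` is emitted, so no code outside the stream has to guess whether the digest is ready.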

— Claude Code

const valid = checksumedStream.validateChecksum();
if (valid !== null) {
// console.log(valid);
return cbOnce(errors.BadDigest);

When checksum validation fails, stored data is not cleaned up. The existing checkHashMatchMD5 deletes data on MD5 mismatch via data.batchDelete. This path should do the same, otherwise failed-checksum objects leak storage.

— Claude Code


_transform(chunk, encoding, callback) {
const input = Buffer.isBuffer(chunk) ? chunk : Buffer.from(chunk);
// console.log("chunk: '%s'", input);

Debug console.log (commented out) should be removed.

— Claude Code

const input = Buffer.isBuffer(chunk) ? chunk : Buffer.from(chunk);
// console.log("chunk: '%s'", input);
this.hash.update(input, encoding);
callback(null, input, encoding);

The Transform _transform callback signature is callback(error, chunk). The third argument encoding is not part of the API and will be ignored.

— Claude Code

const vault = require('../vault');
const constructChunkStringToSign = require('./constructChunkStringToSign');

// Do we use this one or vaults?

This development note comment should not be committed.

— Claude Code

if (!trailer.startsWith('x-amz-checksum-')) {
return { error: 'invalid x-amz-trailer' };
}


Double semicolon.

— Claude Code


const { unsupportedSignatureChecksums, supportedSignatureChecksums } = require('../../../../constants');

// FIXME: merge this

FIXME and inline question on line 36 are dev notes that should be resolved or removed before merging.

— Claude Code

}
const value = Buffer.alloc(parsedContentLength);
const cbOnce = jsutil.once(callback);
// TODO

TODO without description. The prepareStream call here likely needs migrating to prepareStream2 as well.

— Claude Code

return checksumedStream;
}
case 'STREAMING-UNSIGNED-PAYLOAD-TRAILER': {
// console.log(checksumAlgo);

Commented-out console.log should be removed.

— Claude Code

return checksumedStream;
}
default: {
// console.log(checksumAlgo);

Commented-out console.log should be removed.

— Claude Code

@claude

claude bot commented Mar 6, 2026

Critical issues:
- prepareStream.js:69 — Returns null instead of calling errCb, will crash when caller passes null to data.put()
- prepareStream.js:81 — No error handler on ChecksumTransform in any branch; unhandled stream errors will crash the process
- storeObject.js:92 — Possible race: validateChecksum() reads this.digest before _flush() may have set it
- storeObject.js:95 — Checksum failure does not clean up already-stored data (unlike checkHashMatchMD5 which calls data.batchDelete)

Should fix:
- validateChecksums.js:202 — Double semicolon
- ChecksumTransform.js:71 — Extra encoding arg in _transform callback (not part of Transform API)

Cleanup needed (many commented-out console.logs, FIXMEs, TODOs):
- createAndStoreObject.js:215,218,220,255 — Commented-out console.log
- storeObject.js:4,6,63-67,68,90-91,94 — Commented-out imports, code, and console.log
- prepareStream.js:85,113 — Commented-out console.log
- ChecksumTransform.js:69 — Commented-out console.log
- V4Transform.js:9 — Dev note comment
- validateChecksumHeaders.js:5,36 — FIXME and inline question
- veeam/utils.js:38 — Empty TODO; prepareStream call not migrated to prepareStream2

Review by Claude Code

leif-scality force-pushed the improvement/CLDSRV-863-checksums-put-object-part branch from 5b6aa59 to aa38363 on March 7, 2026 03:45
v4Transform.headers = request.headers;
stream = v4Transform;

const checksumedStream = new ChecksumTransform(

No error event handler is attached to any ChecksumTransform instance in this function. If _flush rejects (e.g. CRC digest throws), the stream emits an unhandled error event which will crash the process. Add .on('error', errCb) like TrailingChecksumTransform does at line 92.

— Claude Code

const valid = checksumedStream.validateChecksum();
if (valid !== null) {
// console.log(valid);
return cbOnce(errors.BadDigest);

When checksum validation fails, the data that was already stored is not deleted. Compare with checkHashMatchMD5 just below, which calls data.batchDelete before returning errors.BadDigest. This will leave orphaned data in the backend.

— Claude Code


const { data } = require('../../../data/wrapper');
const { prepareStream, stripTrailingChecksumStream } = require('./prepareStream');
// const { prepareStream, prepareStream2, stripTrailingChecksumStream } = require('./prepareStream');

Commented-out code and debug console.log statements throughout this file. These should be removed before merging.

— Claude Code

const mdOnlyHeader = request.headers['x-amz-meta-mdonly'];
const mdOnlySize = request.headers['x-amz-meta-size'];

// console.log('============== createAndStoreObject');

Commented-out console.log debug statements should be removed before merging.

— Claude Code

if ('x-amz-trailer' in headers) {
const trailer = headers['x-amz-trailer'];
if (!trailer.startsWith('x-amz-checksum-')) {
return { error: 'invalid x-amz-trailer' };

This returns a raw error string 'invalid x-amz-trailer' instead of a ChecksumError enum value. arsenalErrorFromChecksumError won't match any case and will fall through to the default returning BadDigest, which is incorrect for an invalid trailer header. Use a proper ChecksumError value.
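
A sketch of the enum-based version follows. The `InvalidTrailer` member, the `parseTrailer` helper, and the `'InvalidRequest'` mapping are assumptions for illustration; only `ChecksumError.AlgoNotSupported` appears in the code under review, and the mapper here is a stand-in that returns error names as strings.

```javascript
// Hypothetical enum; only AlgoNotSupported is known from the reviewed code.
const ChecksumError = Object.freeze({
    AlgoNotSupported: 'AlgoNotSupported',
    InvalidTrailer: 'InvalidTrailer', // assumed new member
});

// Hypothetical helper covering just the x-amz-trailer check.
function parseTrailer(headers) {
    const trailer = headers['x-amz-trailer'];
    if (typeof trailer !== 'string' || !trailer.startsWith('x-amz-checksum-')) {
        return { error: ChecksumError.InvalidTrailer, details: { trailer } };
    }
    return { algorithm: trailer.slice('x-amz-checksum-'.length), isTrailer: true };
}

// Stand-in for arsenalErrorFromChecksumError, returning error names as strings.
function arsenalErrorFromChecksumError(checksumData) {
    switch (checksumData.error) {
    case ChecksumError.AlgoNotSupported:
    case ChecksumError.InvalidTrailer:
        return 'InvalidRequest'; // assumed mapping
    default:
        return 'BadDigest';
    }
}
```

Because every failure now carries an enum value, the `default` branch is reached only for genuinely unexpected inputs.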

— Claude Code


if (xAmzChecksumCnt === 0) {
// There was no x-amz-checksum- or x-amz-trailer
return { algorithm: 'crc64nvme', isTrailer: false, expected: undefined };

When no checksum headers are present, this defaults to crc64nvme. This means every regular request (without any checksum headers) will compute a CRC64 hash of the entire body, adding CPU overhead for no benefit since there is nothing to validate against (expected: undefined and validateChecksum() returns null for that case). Consider returning a sentinel value indicating no checksum is needed.
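
If the computed digest is not needed for anything else, the sentinel idea could look like the sketch below. `NO_CHECKSUM`, `maybeChecksum`, and `makeTransform` are hypothetical names, and the header scan is deliberately simplified (trailer handling and algorithm validation are left out).

```javascript
const { PassThrough } = require('stream');

// Hypothetical sentinel meaning "the client sent no checksum, skip hashing".
const NO_CHECKSUM = Object.freeze({ algorithm: null, isTrailer: false, expected: undefined });

// Simplified header scan; the real function also handles x-amz-trailer.
function getChecksumDataFromHeaders(headers) {
    const name = Object.keys(headers).find(h => h.startsWith('x-amz-checksum-'));
    if (name === undefined) {
        return NO_CHECKSUM;
    }
    return {
        algorithm: name.slice('x-amz-checksum-'.length),
        isTrailer: false,
        expected: headers[name],
    };
}

// makeTransform is a hypothetical factory for the hashing stream.
function maybeChecksum(stream, checksumData, makeTransform) {
    if (checksumData === NO_CHECKSUM) {
        return stream; // nothing to validate, so no hashing stage is added
    }
    return stream.pipe(makeTransform(checksumData));
}

// Usage: a request without checksum headers keeps its original stream.
const body = new PassThrough();
const result = maybeChecksum(body, getChecksumDataFromHeaders({ 'content-length': '3' }), () => new PassThrough());
console.log(result === body);
```

Comparing against a frozen sentinel by identity keeps the check cheap and makes the "no checksum" case explicit at the call site.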

— Claude Code


validateChecksum() {
if (this.isTrailer) {
// FIXME: Handle trailer is missing

Multiple FIXME/TODO comments in this file indicate unfinished work: line 24 ("FIXME: Handle trailer is missing"), line 69 (commented-out console.log). These should be resolved or tracked before merging.

— Claude Code

const checksumAlgo = getChecksumDataFromHeaders(request.headers);
if (checksumAlgo.error) {
log.debug('invalid checksum headers', checksumAlgo);
return errCb(arsenalErrorFromChecksumError(checksumAlgo)); // FIXME sometimes we use CB sometimes we use null

Double-callback bug: errCb() is called here (which calls cbOnce), and its return value (undefined) is returned. Back in storeObject.js:71, the !checksumedStream check then fires cbOnce(errors.InvalidArgument), but cbOnce was already invoked by errCb. Even with jsutil.once, only one error reaches the caller, and the real checksum error is replaced by a generic InvalidArgument.

Either return null (like the V4 params check below) and let the caller handle it, or don't return the result of errCb().
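
One way to keep a single reporting path is sketched here, under the assumption that failures travel through the return value only. The `{ error }` / `{ stream }` return shape and the simplified function bodies are hypothetical, not the PR's code.

```javascript
// Hypothetical shape: failures travel through the return value, never through a callback.
function prepareStream2(headers) {
    const trailer = headers['x-amz-trailer'];
    if (trailer !== undefined && !trailer.startsWith('x-amz-checksum-')) {
        return { error: 'InvalidTrailer' };
    }
    return { stream: { ready: true } };
}

// Hypothetical caller modelled on dataStore.
function dataStore(headers, cb) {
    const prepared = prepareStream2(headers);
    if (prepared.error) {
        // Only place the failure is reported, so the specific error is the one delivered.
        return cb(prepared.error);
    }
    return cb(null, prepared.stream);
}
```

With one exit point per outcome, the callback fires exactly once and the caller does not depend on a once-wrapper to hide a second invocation.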

— Claude Code


const trailerAlgo = trailer.slice('x-amz-checksum-'.length);
if (!(trailerAlgo in algorithms)) {
return { error: ChecksumError.AlgoNotSupported, details: { algorithm: trailerAlgo } };;

Double semicolon.

— Claude Code

@claude

claude bot commented Mar 7, 2026

- Double-callback bug (prepareStream.js:58): errCb() is called and the undefined return value triggers a duplicate callback in storeObject.js. Return null without calling errCb and let the caller map the error.
- Missing error handler (prepareStream.js:81+): No error handler on ChecksumTransform. A _flush error crashes the process. Add an error handler like the one on TrailingChecksumTransform.
- Data leak on checksum failure (storeObject.js:95-98): Stored data is not deleted on BadDigest. Call data.batchDelete like the MD5 path.
- Raw error string (validateChecksums.js:200): Falls through to the wrong error. Use a ChecksumError enum value.
- Unnecessary default CRC64 (validateChecksums.js:218): Hashes the body with nothing to validate against. Skip hashing when no checksum is requested.
- Debug leftovers: Remove commented-out console.log and dead code.
- FIXME/TODO: Resolve or track unresolved markers.
- Double semicolon (validateChecksums.js:205).

Review by Claude Code


2 participants