feat(iceberg): add IcebergBinaryRowWriter#366
charlesdong1991 wants to merge 7 commits into apache:main from
leekeiabstraction left a comment
TY for the PR, left some comments.
///
/// [`CompactedRowWriter`]: crate::row::compacted::CompactedRowWriter
/// [`IcebergBucketingFunction`]: crate::bucketing::IcebergBucketingFunction
pub struct IcebergBinaryRowWriter {
Should this PR also contain IcebergKeyEncoder? IcebergKeyEncoder implementation should be small, but the test cases there should provide a lot of value in verifying correctness of IcebergBinaryRowWriter.
See the CompactedKeyEncoder implementation and tests for an example.
Actually, similar to the previous comment, I noticed IcebergKeyEncoder is being worked on in #308.
So can I leave a TODO to verify and add tests after that PR is merged (since the key encoder is effectively a consumer of the row writer)?
I added a TODO comment in the function and will address it after #308.
Hi @leekeiabstraction @luoyuxia, what's your view on this PR? I've marked everything that depends on the KeyEncoder as a TODO, to be addressed once it is implemented and merged. This PR has been hanging for a while, so I'd like to give it a push ^^ If you'd prefer to have the IcebergKeyEncoder merged first, I'm also happy to take over its implementation and push it through, as I'd like to test fluss/iceberg in Python for a demo. Thanks again!
leekeiabstraction left a comment
Agreed on moving things along instead of blocking this PR. I've added an additional question. Once answered/addressed, we can approve/merge this PR and loop back to it once the blocking PR is merged.
impl BinaryWriter for IcebergBinaryRowWriter {
    fn reset(&mut self) {
        self.position = 0;
I think both the buffer and to_bytes functions only return [0..position], so stale bytes won't be included in the output even if we don't zero the buffer, IMHO @leekeiabstraction.
But zeroing it explicitly is straightforward insurance, so let me update it.
Updated, thanks! PTAL @leekeiabstraction
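For readers following the reset discussion above, here is a minimal sketch of the two options. It is not the PR's code: `SketchRowWriter`, `write_bytes`, `reset_lazy`, `reset_zeroed`, and `raw_buffer` are hypothetical names made up for illustration, and the only assumption taken from the thread is that output is limited to `[0..position]`.

```rust
/// Hypothetical stand-in for a row writer; NOT the PR's IcebergBinaryRowWriter.
pub struct SketchRowWriter {
    buffer: Vec<u8>,
    position: usize,
}

impl SketchRowWriter {
    pub fn new(capacity: usize) -> Self {
        Self { buffer: vec![0u8; capacity], position: 0 }
    }

    /// Append bytes at the current position, growing the buffer if needed.
    pub fn write_bytes(&mut self, value: &[u8]) {
        let end = self.position + value.len();
        if end > self.buffer.len() {
            self.buffer.resize(end, 0);
        }
        self.buffer[self.position..end].copy_from_slice(value);
        self.position = end;
    }

    /// Option 1: only rewind. Stale bytes remain in the backing buffer,
    /// but they sit past `position` and are not exposed by `to_bytes`.
    pub fn reset_lazy(&mut self) {
        self.position = 0;
    }

    /// Option 2: rewind and zero the whole backing buffer as insurance.
    pub fn reset_zeroed(&mut self) {
        self.buffer.fill(0);
        self.position = 0;
    }

    /// Only the first `position` bytes are ever returned.
    pub fn to_bytes(&self) -> &[u8] {
        &self.buffer[..self.position]
    }

    /// Illustration-only view of the whole backing buffer.
    pub fn raw_buffer(&self) -> &[u8] {
        &self.buffer
    }
}
```

With the lazy reset, a shorter second write leaves the tail of the first write in the backing buffer, yet `to_bytes` still returns only the new bytes. The zeroing reset costs one extra pass over the buffer and guards against any code path that reads beyond `position`.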
leekeiabstraction left a comment
TY for addressing the comments. Approved 🙌
Purpose
This PR implements IcebergBinaryRowWriter as part of the effort tracked in #194.
Tests
All tests pass locally.