Co-Authored-By: Josh Moore <josh@openmicroscopy.org>
|
Could be more clear about how necessary the prefixes are - whether a bare hex-encoded UUID is acceptable, or whether |
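To make the prefix question concrete, here is a minimal sketch of the spellings a single UUID can take, using Python's standard `uuid` module. The example value is arbitrary and not taken from the proposal, and which of these forms the spec should accept is exactly what is being asked here.

```python
import uuid

# Arbitrary example value, chosen only for illustration.
u = uuid.UUID("12345678-1234-5678-1234-567812345678")

bare_hex = u.hex      # 32 hex digits, no hyphens, no prefix
hyphenated = str(u)   # canonical 8-4-4-4-12 form
urn_form = u.urn      # URN form carrying the "urn:uuid:" prefix

# Python's parser is lenient and maps all three spellings to the same value;
# a reader following the spec could reasonably be stricter.
parsed = [uuid.UUID(s) for s in (bare_hex, hyphenated, urn_form)]
```

Because a lenient parser treats these as equal while a plain string comparison does not, the text probably needs to name one required form.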
|
This pull request has been mentioned on Image.sc Forum. There might be relevant details there: https://forum.image.sc/t/ngff-weekly-dev-update-thread/110810/72 |
|
can you explain what this is for? wouldn't you want a UUID for each array? |
|
I think you need to explain the semantics for identity here, and potentially describe a content-aware procedure for creating unique identifiers. If I rechunk arrays, does the identifier change? And I assume you would want an independently-created |
|
@d-v-b thanks for the thoughts. I think I can answer some of this, but not all. Identity, as I see it, refers explicitly to the multiscales object as a package of array data + metadata. Rechunking, for instance, will change the semantics of how the described data can be accessed, but it doesn't change the actual data. That being said, in practice the uuid would probably be generated on write, so an operation like
I guess the deeper question here is whether one would want to store different array layouts of the same data under the same uuid, which ties into the "same array, different metadata" discussion. In that context, these different metadata objects would probably end up with different uuids, which I think would be ok? That would mean:
It doesn't need to because rechunking is more about the modalities of data access, not description. But if you read -> rechunk -> write, it probably would change, and that would be ok.
Kind of the same answer: I think same metadata + same arrays (modulo chunk layout) should be allowed to have the same uuid, but they don't need to. |
|
I think the spec needs to be really clear on what data transformations require changing the UUID, and whether it's an error if two different multiscales use the same UUID. I think this requires defining what "identity" means for multiscales objects. |
|
If the identifier is content aware, does that mean I have to recalculate the identifier when the content changes? This would essentially become a checksum, and quite an onerous one at that if the checksum must depend on the content as a whole.
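A rough sketch of the trade-off, assuming a toy one-dimensional integer array and hypothetical helper names (`random_id`, `content_id`, `chunked`); none of this is from the proposal. A write-time identifier costs nothing to create but says nothing about the data, while a content-derived one survives rechunking but has to read every value again after any edit.

```python
import hashlib
import uuid


def random_id() -> str:
    """Identifier generated on write; independent of the data (hypothetical)."""
    return str(uuid.uuid4())


def content_id(values: list[int]) -> str:
    """Hypothetical content-aware identifier: a digest over every value
    in logical order, ignoring how the values are chunked on disk."""
    digest = hashlib.sha256()
    for v in values:
        digest.update(v.to_bytes(8, "little", signed=True))
    return digest.hexdigest()


def chunked(values: list[int], size: int) -> list[list[int]]:
    """Split values into consecutive chunks of at most `size` elements."""
    return [values[i:i + size] for i in range(0, len(values), size)]


data = list(range(16))

# Two chunk layouts of the same logical array.
flat_a = [v for chunk in chunked(data, 4) for v in chunk]
flat_b = [v for chunk in chunked(data, 8) for v in chunk]
same_after_rechunk = content_id(flat_a) == content_id(flat_b)

# A single changed value forces a full pass over the data to update the id.
edited = data.copy()
edited[3] = 99
changed_after_edit = content_id(edited) != content_id(data)

# Random ids differ on every write, even for identical content.
differs_per_write = random_id() != random_id()
```

For real multiscale data the full pass in `content_id` would touch every chunk at every resolution level, which is the cost being objected to.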
Fixes ome/ngff#463
Supersedes ome/ngff#115