
(Feature) HDN minimum example #396

Draft

CatEek wants to merge 92 commits into main from hdn_config

Conversation

@CatEek (Contributor) commented Feb 10, 2025

Description

Note

TL;DR: a minimum working HDN integrated into CAREamist.

Unfortunately, many of the modifications are artifacts of the merge with main. Also, keep in mind that this is a minimal example that is not supposed to be used as is, so I believe we can leave certain things up to future refactoring.

Background - why do we need this PR?

HDN is currently not available in CAREamics, but most of the required features are already present (noise model, LVAE model).
It will use the existing LVAE model code. The difference from MicroSplit is that HDN is unsupervised (the targets for loss computation are the inputs themselves) and the loss function is different (but uses the MicroSplit loss under the hood).
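To make the unsupervised setup concrete, here is a minimal, hypothetical sketch (the names `microsplit_style_loss` and `hdn_loss` are illustrative, not the CAREamics API): the HDN loss feeds the input back in as the reconstruction target while delegating to a MicroSplit-style loss.

```python
import torch
import torch.nn.functional as F


def microsplit_style_loss(reconstruction: torch.Tensor,
                          target: torch.Tensor,
                          kl: torch.Tensor) -> torch.Tensor:
    # Stand-in for the reused MicroSplit loss:
    # a reconstruction term plus a KL-divergence term.
    return F.mse_loss(reconstruction, target) + kl.mean()


def hdn_loss(reconstruction: torch.Tensor,
             x: torch.Tensor,
             kl: torch.Tensor) -> torch.Tensor:
    # Unsupervised: the target is the input itself.
    return microsplit_style_loss(reconstruction, target=x, kl=kl)
```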

Overview - what changed?

New configurations for HDN, plus modifications to the LVAE code base to make it compatible.

New features or files

  • HDN-relevant code in careamist.py
  • relevant config in hdn_algorithm_model.py
  • relevant changes in vae_algorithm_model.py
  • relevant changes in lvae_model.py
  • HDN configuration code in configuration_factory.py and relevant modules
  • HDN-relevant code in lightning_module.py
  • HDN code in losses.py and the loss factory
  • etc.

Configuration:

  • Added: HDNAlgorithm algorithm configuration.
  • Modified: added hdn to all relevant configurations.
  • Added: hdn_loss, which uses the MicroSplit losses internally.

How has this been tested?

Created a notebook in the examples repo to check performance on the BSD dataset without a noise model.

Related Issues

  • Resolves #

Breaking changes

Additional Notes and Examples

  • BMZ export doesn't work; it raises NotImplemented because the model outputs are incompatible.
  • 3D isn't tested and would need another PR.
  • HDN with a noise model isn't tested.

Please ensure your PR meets the following requirements:

  • Code builds and passes tests locally, including doctests
  • New tests have been added (for bug fixes/features)
  • Pre-commit passes
  • PR to the documentation exists (for bug fixes / features)

@jdeschamps jdeschamps marked this pull request as draft February 10, 2025 15:02
@CatEek CatEek changed the title HDN minimum example (Feature) HDN minimum example Feb 27, 2025
Comment on lines +67 to +74
# hdn
if self.algorithm == SupportedAlgorithm.HDN:
if self.loss.loss_type != SupportedLoss.HDN:
raise ValueError(
f"Algorithm {self.algorithm} only supports loss `hdn`."
)
if self.model.multiscale_count > 1:
raise ValueError("Algorithm `hdn` does not support multiscale models.")
Member

Hmm I see there should probably be separate child classes for MuSplit, DenoiSplit, but I guess this can wait and we create an issue

Comment on lines +19 to +20
input_shape: tuple[int, ...] = Field(default=(64, 64), validate_default=True)
"""Shape of the input patch (Z, Y, X) or (Y, X) if the data is 2D."""
Member

Changing this to a tuple has some serialization issues with the current way we do the model dump.

Note: it can actually be solved by using .model_dump(mode="json"), which automatically casts iterable Python types to list.
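A quick sketch of that behavior with Pydantic v2 (`PatchConfig` is a hypothetical stand-in for the configuration model):

```python
from pydantic import BaseModel, Field


class PatchConfig(BaseModel):
    # Illustrative stand-in for the LVAE configuration model.
    input_shape: tuple[int, ...] = Field(default=(64, 64), validate_default=True)


cfg = PatchConfig()
# Default mode="python" keeps native Python types.
assert cfg.model_dump()["input_shape"] == (64, 64)
# mode="json" casts the tuple to a JSON-serializable list.
assert cfg.model_dump(mode="json")["input_shape"] == [64, 64]
```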

Member

Interesting, I didn't know. And I guess when reading the list back, it gets cast into a tuple without any issue?

Should we open an issue for refactoring the way we export the configuration? It would be nice to support tuples; that would also allow immutable defaults in function signatures.
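On the round-trip question: with Pydantic v2, a list read back from a dump is coerced to a tuple during validation. A minimal sketch (`PatchConfig` is illustrative, not the CAREamics model):

```python
from pydantic import BaseModel


class PatchConfig(BaseModel):
    input_shape: tuple[int, ...] = (64, 64)


dumped = PatchConfig().model_dump(mode="json")  # {"input_shape": [64, 64]}
restored = PatchConfig.model_validate(dumped)   # the list is cast back to a tuple
assert restored.input_shape == (64, 64)
```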

Member

Yeah, the default mode argument is "python", which keeps objects as Python types; using "json" converts them to JSON-serializable objects. And I guess there is overlap with YAML in the model_dump API.

],
predict_logvar: Literal[None, "pixelwise"],
analytical_kl: bool,
model_params: Optional[dict[str, Any]] = None,
Member

What is the point of having this model_params argument? All the arguments above are non-optional and will overwrite any arguments included in model_params.

Comment on lines +236 to +239
def _create_unet_based_algorithm(
axes: str,
algorithm: Literal["n2v", "care", "n2n"],
loss: Literal["n2v", "mae", "mse"],
algorithm: Literal["n2v", "care", "n2n", "hdn"],
loss: Literal["n2v", "mae", "mse", "hdn"],
Member

Remove "hdn" as algorithm option in UNet-based algorithm creation function.

Comment on lines +253 to +255
algorithm : {"n2v", "care", "n2n", "hdn"}
Algorithm to use.
loss : {"n2v", "mae", "mse"}
loss : {"n2v", "mae", "mse", "hdn"}
Member

Remove "hdn" from the docstring as well.

Comment on lines +290 to +291
def _create_vae_based_algorithm(
algorithm: Literal["hdn"],
Member

Should this function be used for MuSplit and DenoiSplit algorithms as well?

Member
@jdeschamps jdeschamps Mar 9, 2025

In the future, probably! Currently there is no use for the convenience functions, since the MicroSplit code has its own wrapper. But at some point we should have the equivalent here.

Comment on lines 554 to 558
# algorithm
algorithm_params = _create_algorithm_configuration(
algorithm_params = _create_unet_based_algorithm(
axes=axes,
algorithm=algorithm,
loss=loss,
Member

Maybe _create_supervised_config_dict can be renamed, since it now seems to relate only to the UNet-based configs? But the MicroSplit algorithms are also supervised.

Member

Agreed. Technically, we could consider that N2N is also a CARE approach. So we could choose _create_care_config_dict or the more obvious _create_unet_based_config_dict.

Comment on lines +131 to +133
if isinstance(model, VAEModule):
raise ValueError("Export of VAE models is not supported.")

Member

Should this be a NotImplementedError with a note that we are planning to support it in the future?

Member

I agree!
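A sketch of the suggested change (`VAEModule` is stubbed here for illustration; the real class lives in CAREamics, and `export_model` is a hypothetical name):

```python
class VAEModule:
    """Stand-in stub for careamics' VAEModule, for illustration only."""


def export_model(model) -> None:
    # NotImplementedError signals "planned but not yet supported",
    # whereas ValueError suggests the input itself is invalid.
    if isinstance(model, VAEModule):
        raise NotImplementedError(
            "Export of VAE models is not supported yet; support is planned."
        )
```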

algorithm_config=self.cfg.algorithm_config,
)
elif isinstance(self.cfg.algorithm_config, VAEBasedAlgorithm):
self.model = VAEModule(
Member

Model should not even be instantiated.


loss: LVAELossConfig

model: LVAEModel # TODO add validators
Member

Can you open an issue and state what these validators should be? Otherwise you will have to figure it out again.


HDN = "HDN"

HDN_DESCRIPTION = ""
Member

Can you add a TODO here?

from careamics.config.architectures import LVAEModel
from careamics.config.loss_model import LVAELossConfig

HDN = "HDN"
Member

It's actually "Hierarchical DivNoising"

Comment on lines +67 to +74
# hdn
if self.algorithm == SupportedAlgorithm.HDN:
if self.loss.loss_type != SupportedLoss.HDN:
raise ValueError(
f"Algorithm {self.algorithm} only supports loss `hdn`."
)
if self.model.multiscale_count > 1:
raise ValueError("Algorithm `hdn` does not support multiscale models.")
Member

Yes they should be moved to child classes. Can you open an issue (maybe first a general MicroSplit issue, then this as a sub-issue)?

Member

The python module should be renamed to vae_loss_model.py

gaussian_likelihood: Optional[GaussianLikelihood],
noise_model_likelihood: Optional[NoiseModelLikelihood],
) -> Optional[dict[str, torch.Tensor]]:
"""Loss function for DenoiSplit.
Member

This says "Loss function for DenoiSplit"; it should be HDN.

Comment on lines -398 to +402
x, target = batch
x, *target = batch
Member

Isn't MicroSplit using the VAEModule without this modification? In that case, why would we need it?

Also, this feels totally out of PR scope. It adds overhead in reading and understanding the code.
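For reference, the difference between the two unpacking forms (plain Python semantics, independent of CAREamics):

```python
# `x, target = batch` requires exactly two elements;
# `x, *target = batch` accepts one or more, collecting the rest into a list.
batch_with_target = ("input", "target")
x, *target = batch_with_target
assert target == ["target"]

batch_without_target = ("input",)
x, *target = batch_without_target
assert target == []  # no targets: the unsupervised case still unpacks cleanly
```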

@jdeschamps jdeschamps self-requested a review June 17, 2025 12:43
masked = torch.zeros_like(batch)
mask = torch.zeros_like(batch, dtype=torch.uint8)

self.rng = (
Member
@jdeschamps jdeschamps Sep 2, 2025

Why has this changed? It implies to me that every N2V Manipulate call will be the same, rather than just following the same steps each time (instantiating a new generator with the same seed each time, vs having a seeded generator created once at the beginning).
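The distinction can be illustrated with NumPy (used here purely for illustration; N2VManipulate may use a different RNG):

```python
import numpy as np

SEED = 42


def draw_reseeded() -> np.ndarray:
    # A fresh generator with the same seed on every call:
    # every call yields identical values.
    return np.random.default_rng(SEED).random(3)


rng_once = np.random.default_rng(SEED)


def draw_from_shared() -> np.ndarray:
    # One generator seeded once at start-up: reproducible across runs,
    # but successive calls advance the stream and therefore differ.
    return rng_once.random(3)


assert np.allclose(draw_reseeded(), draw_reseeded())
assert not np.allclose(draw_from_shared(), draw_from_shared())
```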

@jdeschamps jdeschamps marked this pull request as draft September 2, 2025 19:29
@jdeschamps jdeschamps mentioned this pull request Sep 2, 2025
Member

jdeschamps commented Sep 3, 2025

Superseded by #511, leaving open for reference until the other one is merged.

@jdeschamps (Member)

@CatEek Do we still need this PR? Can we close it?



4 participants