Conversation
Pull Request Overview
This PR adds a new reinforcement-learning-based imaginary time evolution (RLIM) algorithm to the qmb quantum many-body simulation library. The implementation provides an alternative optimization approach that combines reinforcement learning concepts with imaginary time evolution for quantum state optimization.
- Implements a new RLIM algorithm class with configurable parameters for sampling, learning rates, and optimization steps
- Integrates the new algorithm into the existing CLI framework through subcommand registration (a generic sketch of this pattern follows the list)
- Provides comprehensive logging and TensorBoard monitoring for the optimization process
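
As a generic illustration of the subcommand-registration pattern mentioned above (this is not qmb's actual CLI code; the `rlim` flags and defaults below are hypothetical, loosely mirroring the configurable parameters listed in the first bullet):

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical sketch only: the real wiring in qmb/main.py may differ.
    parser = argparse.ArgumentParser(prog="qmb")
    subcommands = parser.add_subparsers(dest="command", required=True)

    rlim = subcommands.add_parser("rlim", help="RL-based imaginary time evolution")
    rlim.add_argument("--sampling-count", type=int, default=1000)   # samples per step (assumed)
    rlim.add_argument("--learning-rate", type=float, default=1e-3)  # optimizer learning rate (assumed)
    rlim.add_argument("--local-step", type=int, default=100)        # local optimization steps (assumed)
    return parser
```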
Reviewed Changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| qmb/rlim.py | Complete implementation of the RLIM algorithm with configuration class and main optimization loop |
| qmb/main.py | Registration of the new RLIM subcommand in the CLI interface |
```python
# pylint: disable=too-many-locals

model, network, data = self.common.main()
ref_network = network
```
Assigning `ref_network = network` creates a reference to the same object rather than an independent copy. This means both networks will be updated simultaneously during optimization, which may not be the intended behavior for a reference network that should remain stable.
Suggested change:
```diff
- ref_network = network
+ ref_network = type(network)()  # Create a new instance of the same model class
+ ref_network.load_state_dict(network.state_dict())  # Copy the parameters from the original network
```
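
Note that `type(network)()` assumes the constructor takes no required arguments, which may not hold here. A hedged alternative sketch, assuming `network` is a standard `torch.nn.Module`:

```python
import copy

# Hypothetical alternative: a deep copy avoids re-invoking the constructor,
# whose required arguments may not be available at this point.
ref_network = copy.deepcopy(network)
ref_network.requires_grad_(False)  # keep the reference network's parameters frozen
```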
qmb/rlim.py (outdated)
```python
a = torch.outer(psi_src.detach().conj(), ref_psi_src) - torch.outer(psi_src.conj(), ref_psi_src.detach())
b = torch.outer(hamiltonian_psi_dst.conj(), ref_psi_src) - torch.outer(psi_src.conj(), ref_hamiltonian_psi_dst)
diff = (a - self.evolution_time * b).flatten()
```
[nitpick] The variable name `a` is not descriptive. Consider renaming it to something more meaningful, like `overlap_diff` or `reference_term`, to clarify its role in the loss calculation.
Suggested change:
```diff
- a = torch.outer(psi_src.detach().conj(), ref_psi_src) - torch.outer(psi_src.conj(), ref_psi_src.detach())
- b = torch.outer(hamiltonian_psi_dst.conj(), ref_psi_src) - torch.outer(psi_src.conj(), ref_hamiltonian_psi_dst)
- diff = (a - self.evolution_time * b).flatten()
+ overlap_diff = torch.outer(psi_src.detach().conj(), ref_psi_src) - torch.outer(psi_src.conj(), ref_psi_src.detach())
+ b = torch.outer(hamiltonian_psi_dst.conj(), ref_psi_src) - torch.outer(psi_src.conj(), ref_hamiltonian_psi_dst)
+ diff = (overlap_diff - self.evolution_time * b).flatten()
```
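
To see the term in isolation, here is a self-contained sketch with random amplitudes standing in for the network outputs (the shapes and the value of `evolution_time` are assumptions; only the algebra mirrors the diff above):

```python
import torch

n = 8  # assumed basis size, for illustration only
evolution_time = 0.01  # assumed step size

psi_src = torch.randn(n, dtype=torch.complex64, requires_grad=True)
ref_psi_src = torch.randn(n, dtype=torch.complex64)
hamiltonian_psi_dst = torch.randn(n, dtype=torch.complex64)
ref_hamiltonian_psi_dst = torch.randn(n, dtype=torch.complex64)

overlap_diff = torch.outer(psi_src.detach().conj(), ref_psi_src) - torch.outer(psi_src.conj(), ref_psi_src.detach())
b = torch.outer(hamiltonian_psi_dst.conj(), ref_psi_src) - torch.outer(psi_src.conj(), ref_hamiltonian_psi_dst)
diff = (overlap_diff - evolution_time * b).flatten()

loss = (diff.conj() * diff).sum().real  # squared norm as a real scalar loss (assumed reduction)
loss.backward()  # gradients flow only through the non-detached psi_src factors
```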
qmb/rlim.py (outdated)
```python
b = torch.outer(hamiltonian_psi_dst.conj(), ref_psi_src) - torch.outer(psi_src.conj(), ref_hamiltonian_psi_dst)
diff = (a - self.evolution_time * b).flatten()
```
[nitpick] The variable name `b` is not descriptive. Consider renaming it to something more meaningful, like `hamiltonian_term` or `energy_term`, to clarify its role in the loss calculation.
Suggested change:
```diff
- b = torch.outer(hamiltonian_psi_dst.conj(), ref_psi_src) - torch.outer(psi_src.conj(), ref_hamiltonian_psi_dst)
- diff = (a - self.evolution_time * b).flatten()
+ hamiltonian_term = torch.outer(hamiltonian_psi_dst.conj(), ref_psi_src) - torch.outer(psi_src.conj(), ref_hamiltonian_psi_dst)
+ diff = (a - self.evolution_time * hamiltonian_term).flatten()
```
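
For context, a hedged reading of why this term enters weighted by `evolution_time` (inferred from the PR's description of imaginary time evolution, not verified against the full source): one imaginary-time step of size $\tau$ targets

```math
|\psi'\rangle \propto e^{-\tau H}\,|\psi\rangle \approx (1 - \tau H)\,|\psi\rangle + O(\tau^{2})
```

so a residual of the form (overlap term) minus $\tau$ times (Hamiltonian term) plausibly penalizes the first-order mismatch between the updated state and the evolved reference state.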
```python
loss.energy = energy  # type: ignore[attr-defined]
return loss

logging.info("Starting local optimization process")

for i in range(self.local_step):
    loss: torch.Tensor = optimizer.step(closure)  # type: ignore[assignment,arg-type]
    energy: float = loss.energy  # type: ignore[attr-defined]
```
Dynamically adding attributes to tensor objects is not a clean practice. Consider returning a tuple or using a dataclass to pass both loss and energy values instead of monkey-patching the tensor object.
Suggested change:
```diff
- loss.energy = energy  # type: ignore[attr-defined]
- return loss
+ return LossEnergy(loss=loss, energy=energy)

  logging.info("Starting local optimization process")

  for i in range(self.local_step):
-     loss: torch.Tensor = optimizer.step(closure)  # type: ignore[assignment,arg-type]
-     energy: float = loss.energy  # type: ignore[attr-defined]
+     loss_energy: LossEnergy = optimizer.step(closure)  # type: ignore[assignment,arg-type]
+     loss, energy = loss_energy.loss, loss_energy.energy
```
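
The suggestion above references a `LossEnergy` container that does not exist in the diff; a minimal sketch of what it might look like (the name and fields come from the suggestion, the definition is an assumption):

```python
from dataclasses import dataclass

import torch


@dataclass
class LossEnergy:
    """Bundle the optimization loss with the measured energy instead of
    attaching ad-hoc attributes to a tensor object."""
    loss: torch.Tensor
    energy: float
```

One caveat: optimizers such as `torch.optim.LBFGS` call `float()` on the closure's return value internally, so returning a dataclass from the closure only works with optimizers that simply pass the result through.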
```python
for i in range(self.local_step):
    loss: torch.Tensor = optimizer.step(closure)  # type: ignore[assignment,arg-type]
    energy: float = loss.energy  # type: ignore[attr-defined]
```
Accessing the dynamically added `energy` attribute requires `# type: ignore` comments and makes the code fragile. This is a consequence of the monkey-patching approach on line 117.