Disorder builder by ColinBundschu · Pull Request #1410 · materialsproject/emmet

ColinBundschu · 2026-04-01T19:18:41Z

This PR is for the disordered materials builder.

esoteric-ephemera · 2026-04-03T16:16:28Z

emmet-core/emmet/core/disorder.py

+    n: int = Field(..., description="Number of structures in this group.")
+    mae_per_site: float = Field(..., description="Mean absolute error per site.")
+    rmse_per_site: float = Field(..., description="Root-mean-square error per site.")
+    max_abs_per_site: float = Field(..., description="Maximum absolute error per site.")


More of a nitpick, but you can skip the ellipses (...) here and below

esoteric-ephemera · 2026-04-03T16:17:03Z

emmet-core/emmet/core/disorder.py

+    in_sample: CEFitMetrics = Field(...)
+    five_fold_cv: CEFitMetrics = Field(...)


Same here; for readability, this is identical:

    in_sample: CEFitMetrics
    five_fold_cv: CEFitMetrics

esoteric-ephemera · 2026-04-03T16:18:43Z

emmet-core/emmet/core/disorder.py

+    standardization: Literal["none", "column_zscore"] = Field(
+        ..., description="Column standardization mode applied before SVD."
+    )


Are there more column standardization methods you plan to apply, or does it make sense to change this to something like:

    standardization: bool = Field(description="True if column z-score standardization was applied before SVD.")

esoteric-ephemera · 2026-04-03T16:24:08Z

emmet-builders/emmet/builders/disorder/disorder.py

+def build_disorder_doc(
+    disordered_documents: list[DisorderedTaskDoc],
+    ordered_task_doc: CoreTaskDoc,
+    *,


Let's remove the kwarg delimiters (*)
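For context, a bare * in a signature makes every parameter after it keyword-only, so removing it lets callers pass those arguments positionally. The following is a minimal sketch of that difference; the function names and the min_bins parameter are hypothetical stand-ins and are not taken from the actual build_disorder_doc signature.

```python
def build_keyword_only(documents, ordered, *, min_bins=10):
    """Hypothetical signature with the delimiter: min_bins must be passed by name."""
    return min_bins


def build_positional_ok(documents, ordered, min_bins=10):
    """Same hypothetical signature without the delimiter: min_bins may be positional."""
    return min_bins


# Both accept the keyword form.
build_keyword_only([], None, min_bins=5)
build_positional_ok([], None, min_bins=5)

# Only the version without the delimiter accepts the positional form.
build_positional_ok([], None, 5)
```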

esoteric-ephemera · 2026-04-03T16:27:55Z

emmet-builders/emmet/builders/disorder/disorder.py

+    for doc in disordered_documents[1:]:
+        if doc.ordered_task_id != first.ordered_task_id:
+            raise ValueError("Ordered task IDs do not match across documents.")
+        if doc.supercell_diag != first.supercell_diag:
+            raise ValueError("Supercell diagonals do not match across documents.")
+        if doc.prototype != first.prototype:
+            raise ValueError("Prototypes do not match across documents.")
+        if doc.prototype_params != first.prototype_params:
+            raise ValueError("Prototype parameters do not match across documents.")
+        if doc.versions != first.versions:
+            raise ValueError("Versions do not match across documents.")


Not sure how many disordered_documents go into building the final doc, but you may just want to replace these with comprehension-style checks (generator expressions inside any()) to get a speedup from C:

    for attr, exc in {
        "ordered_task_id": "Ordered task IDs do not match across documents.",
        "supercell_diag": "Supercell diagonals do not match across documents.",
        # ... add the rest
    }.items():
        if any(getattr(doc, attr) != getattr(first, attr) for doc in disordered_documents[1:]):
            raise ValueError(exc)
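The suggestion above could be sketched as a runnable example along these lines. The attribute names and messages come from the checks in the diff; FakeDoc and check_consistent are hypothetical names used only for illustration, and the sketch assumes the list of documents is non-empty.

```python
from dataclasses import dataclass


@dataclass
class FakeDoc:
    """Hypothetical stand-in for DisorderedTaskDoc; only the compared fields are modeled."""

    ordered_task_id: str
    supercell_diag: tuple
    prototype: str
    prototype_params: dict
    versions: dict


# Attribute name -> error message, copied from the checks in the diff.
CONSISTENCY_CHECKS = {
    "ordered_task_id": "Ordered task IDs do not match across documents.",
    "supercell_diag": "Supercell diagonals do not match across documents.",
    "prototype": "Prototypes do not match across documents.",
    "prototype_params": "Prototype parameters do not match across documents.",
    "versions": "Versions do not match across documents.",
}


def check_consistent(docs):
    """Raise ValueError if any checked attribute differs from the first document."""
    first, rest = docs[0], docs[1:]
    for attr, message in CONSISTENCY_CHECKS.items():
        reference = getattr(first, attr)
        # any() stops at the first mismatch for this attribute.
        if any(getattr(doc, attr) != reference for doc in rest):
            raise ValueError(message)
```

One behavioral difference from the original loop: mismatches are reported in attribute order rather than document order, so the first error raised may differ when several documents disagree on different fields.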

esoteric-ephemera · 2026-04-03T16:29:50Z

emmet-builders/emmet/builders/disorder/disorder.py

+    )
+
+    num_bins = len(wl_block["state"].bin_indices)
+    while num_bins < min_bins or num_bins > max_bins:


Maybe set a maximum number of refinements here, unless there's a guarantee the while won't hang indefinitely?
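One way to bound the refinement loop is sketched below. It is only an illustration of the idea: refine_until_in_range, its callable arguments, the max_refinements default, and the choice of RuntimeError are all assumptions, not part of the PR.

```python
def refine_until_in_range(count_bins, refine, min_bins, max_bins, max_refinements=20):
    """Refine until min_bins <= count_bins() <= max_bins, or fail after max_refinements.

    count_bins and refine are hypothetical callables standing in for the
    builder's own bin-counting and refinement steps.
    """
    num_bins = count_bins()
    refinements = 0
    while num_bins < min_bins or num_bins > max_bins:
        if refinements >= max_refinements:
            raise RuntimeError(
                f"Bin count {num_bins} is still outside [{min_bins}, {max_bins}] "
                f"after {max_refinements} refinements."
            )
        refine(num_bins)
        refinements += 1
        num_bins = count_bins()
    return num_bins
```

Raising instead of silently breaking out of the loop keeps a non-converging build from producing a document with an out-of-range bin count.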

esoteric-ephemera · 2026-04-03T16:30:17Z

emmet-builders/emmet/builders/disorder/disorder.py

+        num_bins = len(wl_block["state"].bin_indices)
+
+    # --- WL convergence loop ---
+    while wl_block["state"].mod_factor > wl_convergence_threshold:


Same as above, let's impose a maximum number of iterations for the while

esoteric-ephemera · 2026-04-03T16:34:16Z

emmet-builders/emmet/builders/disorder/infinite_wang_landau.py

+        )
+        new_entropy = float(self._entropy_d.get(int(new_bin_id), 0.0))
+
+        assert self.mcusher is not None, "MCUsher is not initialized"


Linting will eventually complain about this assert, let's change to raise ValueError or a more specific DisorderedBuilderError
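A minimal sketch of the suggested change follows. Besides the lint warning, assert statements are removed entirely when Python runs with -O, so an explicit raise is the more robust guard. DisorderedBuilderError is the name floated in the comment above; its base class and the Sampler class here are hypothetical.

```python
class DisorderedBuilderError(RuntimeError):
    """Hypothetical builder-specific error; the base class is an assumption."""


class Sampler:
    """Hypothetical stand-in for the Wang-Landau kernel that owns an mcusher."""

    def __init__(self, mcusher=None):
        self.mcusher = mcusher

    def propose(self):
        # Explicit check instead of `assert self.mcusher is not None`.
        if self.mcusher is None:
            raise DisorderedBuilderError("MCUsher is not initialized")
        return self.mcusher()
```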

esoteric-ephemera · 2026-04-03T16:35:43Z

emmet-builders/emmet/builders/disorder/prototype_spec.py

+from ase.spacegroup import crystal
+
+
+class PrototypeStructure(str, Enum):


Just use StrEnum? Our base is py3.11

esoteric-ephemera · 2026-04-03T16:38:56Z

emmet-builders/pyproject.toml

+  "smol",
+  "ase",
+  "scikit-learn",


Can you move these to a separate dependency group like disorder?
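If these packages do move to an optional group, the disorder modules would likely need to fail with a clear message when the group is not installed. A minimal sketch of such a guard is below; the helper names are hypothetical, and the import names are assumptions (scikit-learn is imported as sklearn).

```python
from importlib.util import find_spec

# Assumed top-level import names for the three packages in the diff.
DISORDER_MODULES = ("smol", "ase", "sklearn")


def missing_disorder_deps(modules=DISORDER_MODULES):
    """Return the top-level module names that cannot be found."""
    return [name for name in modules if find_spec(name) is None]


def require_disorder_deps(modules=DISORDER_MODULES):
    """Raise ImportError naming the missing optional dependencies, if any."""
    missing = missing_disorder_deps(modules)
    if missing:
        raise ImportError(
            "The disorder builder needs optional dependencies that are not "
            f"installed: {', '.join(missing)}."
        )
```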

esoteric-ephemera · 2026-04-03T16:40:58Z

Thanks! Skimmed through mostly looking for structural stuff, will take a deeper look later

Maybe a major question for you: Do you see a benefit to moving some of the Wang-Landau code to pymatgen / ase for others to use?

ColinBundschu added 6 commits April 1, 2026 13:17

disorder builder first pass (9a4e661)
disorder integration (ced3f09)
removed debug print statements (2965cbd)
DisorderDoc now inherits from PropertyDoc (594b30a)
improvements to post processing (0fe9c3c)
code cleanup (5a8126a)

ColinBundschu changed the title from "WIP disorder builder first pass" to "Disorder builder" on Apr 3, 2026

esoteric-ephemera reviewed Apr 3, 2026

ColinBundschu wants to merge 6 commits into materialsproject:new-builders from ColinBundschu:new-builders

2 participants
