GGen: Accelerating Materials Discovery

Python 3.12+ | License: MIT | Code style: black

GGen is a toolkit for discovering novel crystalline materials through computational exploration. Given a set of elements, GGen systematically generates crystal structures across all feasible stoichiometries, optimizes them using machine learning interatomic potentials (MLIPs), and constructs phase diagrams to identify thermodynamically stable candidates. For the most promising phases, phonon calculations verify dynamical stability—a key indicator of synthesizability.

The goal: find materials that could actually exist in nature or be synthesized in a lab.

Installation

git clone https://github.com/ourofoundation/ggen.git
cd ggen
pip install -e .

For the latest orb-models v0.6.0 stack and faster GPU graph construction, use Python 3.12+. If you want the legacy cuML edge builders available as well, install:

pip install -e ".[gpu-cu12]"

GGen defaults to ORB's recommended knn_alchemi graph construction path. To override it at runtime, set GGEN_ORB_EDGE_METHOD to another method, for example knn_cuml_rbc or knn_cuml_brute.
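As a minimal sketch, and assuming GGEN_ORB_EDGE_METHOD is an environment variable that is read when the ORB calculator is created (the all-caps name suggests this, but the README does not say so explicitly), the override could be set from Python before any GGen object is built:

```python
import os

# Assumption: GGEN_ORB_EDGE_METHOD is an environment variable, so it has to be
# set before the ORB calculator is constructed.
os.environ["GGEN_ORB_EDGE_METHOD"] = "knn_cuml_rbc"

# Read it back, falling back to the documented default when it is unset.
edge_method = os.environ.get("GGEN_ORB_EDGE_METHOD", "knn_alchemi")
print(f"ORB edge method: {edge_method}")
```

The cuML methods presumably require the gpu-cu12 extra shown above.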

CLI Usage

GGen is designed for command-line exploration of chemical systems. The typical workflow:

  1. Explore a chemical system to find low-energy structures
  2. Analyze the phase diagram to identify stable candidates
  3. Validate promising phases with phonon calculations

Exploring a Chemical System

# Explore the Fe-Mn-Si ternary system
python scripts/explore.py Fe-Mn-Si

# Control search depth
python scripts/explore.py Fe-Mn-Si --max-atoms 24 --num-trials 25

# Focus on specific crystal systems
python scripts/explore.py Li-Co-O --crystal-systems hexagonal trigonal

# Bias towards Fe-rich compositions (Fe must be ≥40% of atoms)
python scripts/explore.py Fe-Co-Bi --min-fraction Fe:0.4

# Combined constraints: high Fe, low Bi
python scripts/explore.py Fe-Co-Bi --min-fraction Fe:0.3 --max-fraction Bi:0.2

What happens: GGen enumerates all stoichiometries up to --max-atoms (default: 20), generates --num-trials candidate structures per stoichiometry (default: 15), and relaxes each using the ORB force field. Results are stored in a unified SQLite database that persists across runs—structures from Fe-Mn explored in one run are automatically reused when you explore Fe-Mn-Co later.
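The enumeration step and the fraction constraints can be illustrated with a short, dependency-free sketch. This is not GGen's implementation: the function names sketch_stoichiometries and within_fractions are hypothetical, and the sketch assumes that only reduced formulas are listed (Fe2Mn2 counts as FeMn).

```python
from functools import reduce
from itertools import product
from math import gcd


def sketch_stoichiometries(elements, max_atoms=20, require_all=False):
    """List reduced formulas (element -> count) with at most max_atoms atoms."""
    formulas = []
    for counts in product(range(max_atoms + 1), repeat=len(elements)):
        total = sum(counts)
        if total == 0 or total > max_atoms:
            continue
        if reduce(gcd, counts) != 1:
            continue  # not reduced, e.g. Fe2Mn2 duplicates FeMn
        if require_all and 0 in counts:
            continue  # skip subsystems when every element is required
        formulas.append({el: n for el, n in zip(elements, counts) if n > 0})
    return formulas


def within_fractions(formula, min_fraction=None, max_fraction=None):
    """Check atomic-fraction bounds such as {"Fe": 0.4}."""
    total = sum(formula.values())
    for el, lower in (min_fraction or {}).items():
        if formula.get(el, 0) / total < lower:
            return False
    for el, upper in (max_fraction or {}).items():
        if formula.get(el, 0) / total > upper:
            return False
    return True


binary = sketch_stoichiometries(["Fe", "Mn"], max_atoms=3)
fe_rich = [f for f in binary if within_fractions(f, min_fraction={"Fe": 0.5})]
print(binary)
print(fe_rich)
```

The brute-force product is fine for two to four elements at these cell sizes; a real implementation would likely prune earlier and apply the --max-stoichiometries cap.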

Output:

============================================================
EXPLORING: Fe-Mn-Si
============================================================
Database: ./ggen.db

Starting exploration...
  [████████████████████████████████████████] 87/87 stoichiometries

============================================================
RESULTS: Fe-Mn-Si
============================================================
Newly generated:           52
Reused from database:      35
Time elapsed:              127.3s

Stable/Near-Stable Phases (E_hull < 150 meV/atom)
------------------------------------------------------------
  Fe3Si        E=-7.2341 eV/atom  SG=Fm-3m       E_hull=  0.0 meV  dyn:✓  [db]
  Mn3Si        E=-6.8924 eV/atom  SG=Fm-3m       E_hull=  0.0 meV  dyn:?  [new]
  FeMnSi       E=-6.5112 eV/atom  SG=P6_3/mmc    E_hull= 12.3 meV  dyn:?  [new]
  ...

CLI Options Reference

python scripts/explore.py --help

| Option | Default | Description |
| --- | --- | --- |
| --max-atoms | 20 | Maximum atoms per unit cell |
| --min-atoms | 2 | Minimum atoms per unit cell |
| --num-trials | 15 | Generation attempts per stoichiometry |
| --max-stoichiometries | 100 | Cap on stoichiometries to explore |
| --crystal-systems | all | Filter: cubic, hexagonal, tetragonal, etc. |
| --require-all-elements | off | Only generate formulas containing all elements (skip subsystems) |
| --min-fraction | none | Minimum element fractions, e.g. Fe:0.4 Co:0.2 |
| --max-fraction | none | Maximum element fractions, e.g. Bi:0.2 |
| -j, --workers | 1 | Parallel workers for generation |
| --e-above-hull | 0.15 | Energy cutoff (eV/atom) for "stable" phases |
| --compute-phonons | off | Run phonon calculations during exploration |
| --hide-unstable | off | Hide dynamically unstable phases from output |
| --skip-existing | off | Skip formulas that already exist in database |
| --db-path | ./ggen.db | Path to unified structure database |
| --output-dir | ./runs | Directory for CIF files and plots |
| --seed | random | Random seed for reproducibility |

Phonon Stability (Dynamical Stability)

A structure can be thermodynamically stable (on the convex hull) but dynamically unstable—meaning it would spontaneously distort into a different structure. Phonon calculations detect imaginary vibrational modes that indicate this instability.

Because phonon calculations are comparatively expensive (~10 s per structure), GGen separates them from the initial exploration:

# First, explore quickly (no phonons)
python scripts/explore.py Zn-Cu-Sn

# Then compute phonons for promising candidates
python scripts/phonons.py --e-above-hull 0.1

# Target a specific system
python scripts/phonons.py --system Zn-Cu-Sn --e-above-hull 0.05

# Preview what would be computed
python scripts/phonons.py --dry-run

# Limit the batch size
python scripts/phonons.py --max-structures 20

If you want phonons computed during exploration (slower but all-in-one):

python scripts/explore.py Zn-Cu-Sn --compute-phonons

Interpreting results:

  • dyn:✓ — Dynamically stable (no imaginary modes) → likely synthesizable
  • dyn:✗ — Dynamically unstable (imaginary modes present) → would distort
  • dyn:? — Not yet tested
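The three labels above can be sketched as a small classifier over a list of phonon frequencies. The sketch assumes the common convention (used by phonopy) of reporting imaginary modes as negative frequencies; the function name and the 0.05 THz tolerance are illustrative choices, not GGen's actual threshold.

```python
def dynamical_label(frequencies_thz, tolerance_thz=0.05):
    """Map phonon frequencies (THz) to the dyn labels used in the CLI output."""
    if frequencies_thz is None or len(frequencies_thz) == 0:
        return "dyn:?"  # phonons not computed yet
    # Imaginary modes show up as negative frequencies. The tolerance absorbs
    # numerical noise in the near-zero acoustic modes.
    if min(frequencies_thz) < -tolerance_thz:
        return "dyn:✗"
    return "dyn:✓"


print(dynamical_label([-0.01, 0.0, 2.3, 5.1]))
print(dynamical_label([-1.2, 0.5, 3.0]))
print(dynamical_label(None))
```

A tolerance that is too loose hides real soft modes, and one that is too tight flags noise, so the value usually needs tuning per calculator.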

Exporting Candidates

Export the best candidates as CIF files for further analysis or DFT validation:

# Export top 10 dynamically stable candidates
python scripts/export.py Co-Fe-Mn -n 10

# Filter by crystal system
python scripts/export.py Co-Fe-Mn -c tetragonal

# Export more candidates with energy cutoff
python scripts/export.py Co-Fe-Mn -n 50 --max-ehull 0.1

# Include dynamically unstable structures
python scripts/export.py Co-Fe-Mn --include-unstable

# Custom output directory
python scripts/export.py Co-Fe-Mn -o ./my_export/

Output directories are auto-generated based on filters:

exports/Co-Fe-Mn/                      # default
exports/Co-Fe-Mn-tetragonal/           # with -c tetragonal
exports/Co-Fe-Mn-50meV/                # with --max-ehull 0.05
exports/Co-Fe-Mn-cubic-100meV/         # combined filters

Each export includes:

  • CIF files named formula_spacegroup_ehull.cif (e.g., Co3FeMn2_P4-mmm_0meV.cif)
  • metadata.json with full details for each structure

| Option | Default | Description |
| --- | --- | --- |
| -n, --top | 10 | Number of candidates to export |
| -o, --output | auto | Output directory |
| -c, --crystal-system | all | Filter: cubic, tetragonal, etc. |
| --max-ehull | none | Maximum energy above hull (eV/atom) |
| --include-unstable | off | Include dynamically unstable structures |
| --no-metadata | off | Skip writing metadata.json |
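The naming conventions for export directories and CIF files can be reproduced with a few lines of string handling. The helpers below are hypothetical and only mirror the examples shown in this section; in particular, the sanitization rule (replace "/" with "-", drop "_") is inferred from names such as P4-mmm and P63-mmc.

```python
def cif_filename(formula, space_group_symbol, e_above_hull_ev):
    """Build a formula_spacegroup_ehull.cif name (E_hull given in eV/atom)."""
    # Assumption: "/" becomes "-" and "_" is dropped to keep the name path-safe.
    safe_symbol = space_group_symbol.replace("/", "-").replace("_", "")
    mev = round(e_above_hull_ev * 1000)
    return f"{formula}_{safe_symbol}_{mev}meV.cif"


def export_dir(system, crystal_system=None, max_ehull_ev=None, root="exports"):
    """Build the auto-generated export directory for a set of filters."""
    parts = [system]
    if crystal_system:
        parts.append(crystal_system)
    if max_ehull_ev is not None:
        parts.append(f"{round(max_ehull_ev * 1000)}meV")
    return f"{root}/{'-'.join(parts)}/"


print(cif_filename("Co3FeMn2", "P4/mmm", 0.0))
print(export_dir("Co-Fe-Mn", crystal_system="cubic", max_ehull_ev=0.1))
```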

Generating Reports

Query the database for statistics and structure lists:

# Summary report for a system (includes phase diagram)
python scripts/report.py Co-Fe-Mn

# List all explored systems with stats
python scripts/report.py --list

# Show only on-hull structures
python scripts/report.py Co-Fe-Mn --stable

# Show fully stable (on-hull + phonon stable)
python scripts/report.py Co-Fe-Mn --fully-stable

# Filter by crystal system
python scripts/report.py Co-Fe-Mn -c tetragonal

# Filter by space group
python scripts/report.py Co-Fe-Mn -s Fm-3m
python scripts/report.py Co-Fe-Mn -s 225

# Show stability breakdown
python scripts/report.py Co-Fe-Mn --breakdown

# Export as JSON
python scripts/report.py Co-Fe-Mn --json > report.json

Example output:

Co-Fe-Mn
──────────────────────────────────────────────────
1052 structures, 847 unique formulas

Thermodynamic Stability (per formula)
  42 on hull, 215 within 150 meV, 590 above

Dynamical Stability (phonon)
  377 stable, 412 unstable, 263 untested
  → 38 fully stable (on hull + phonon stable)

Crystal Systems
  cubic        ████       156 (14.8%)
  tetragonal   ███        102 ( 9.7%)
  orthorhombic ████████   298 (28.3%)
  ...

Phase diagram: Co-Fe-Mn_phase_diagram.html

Phase Diagrams

Every report automatically generates an interactive phase diagram as an HTML file (and PNG if kaleido is installed). The phase diagram shows all explored structures plotted by composition and energy above hull.

# Default: report + phase diagram
python scripts/report.py Co-Fe-Mn
# → Generates: Co-Fe-Mn_phase_diagram.html

# Exclude P1 structures (often low-symmetry noise)
python scripts/report.py Co-Fe-Mn --exclude-p1
# → Generates: Co-Fe-Mn_phase_diagram_noP1.html

# Only include phonon-tested structures
python scripts/report.py Co-Fe-Mn --tested-only
# → Generates: Co-Fe-Mn_phase_diagram_tested.html

# Include all polymorphs (not just lowest-energy per formula)
python scripts/report.py Co-Fe-Mn --all-polymorphs
# → Generates: Co-Fe-Mn_phase_diagram_allpoly.html

# Custom energy cutoff for display (default: 150 meV)
python scripts/report.py Co-Fe-Mn --energy-cutoff 0.2
# → Generates: Co-Fe-Mn_phase_diagram_cutoff200.html

# Combine filters
python scripts/report.py Co-Fe-Mn --exclude-p1 --tested-only
# → Generates: Co-Fe-Mn_phase_diagram_noP1_tested.html

The phase diagram filename automatically reflects the filters applied, making it easy to compare different views of the same system.
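For scripts that need to locate a generated plot, the filename pattern can be rebuilt from the flags. This is a sketch based only on the examples above; the function is hypothetical, and the suffix order for combinations other than noP1 plus tested is an assumption.

```python
def phase_diagram_filename(system, exclude_p1=False, tested_only=False,
                           all_polymorphs=False, energy_cutoff=0.150):
    """Rebuild the HTML filename that reflects the applied report filters."""
    suffixes = []
    if exclude_p1:
        suffixes.append("noP1")
    if tested_only:
        suffixes.append("tested")
    if all_polymorphs:
        suffixes.append("allpoly")
    cutoff_mev = round(energy_cutoff * 1000)
    if cutoff_mev != 150:  # the default cutoff adds no suffix
        suffixes.append(f"cutoff{cutoff_mev}")
    return "_".join([f"{system}_phase_diagram", *suffixes]) + ".html"


print(phase_diagram_filename("Co-Fe-Mn"))
print(phase_diagram_filename("Co-Fe-Mn", exclude_p1=True, tested_only=True))
print(phase_diagram_filename("Co-Fe-Mn", energy_cutoff=0.2))
```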

| Option | Description |
| --- | --- |
| --list, -l | List all systems in database with summary stats |
| --stable | Show on-hull structures |
| --fully-stable | Show structures that are on-hull AND phonon-stable |
| -c, --crystal-system | Filter by crystal system |
| -s, --space-group | Filter by space group (number or symbol) |
| --breakdown | Show stability category breakdown |
| --json, -j | Output full report as JSON |
| --energy-cutoff | Energy above hull cutoff in eV/atom (default: 0.150) |
| --exclude-p1 | Exclude P1 structures from phase diagram |
| --tested-only | Only include phonon-tested structures in phase diagram |
| --all-polymorphs | Include all polymorphs, not just lowest-energy per formula |

Scouting Candidate Elements

When you have a template like Fe-Bi-X and want to find which X is most promising, use the scout to systematically screen all candidates in an element group:

# See available element groups
python scripts/scout.py --list-groups

# Preview which systems would be explored
python scripts/scout.py "Fe-Bi-{X}" --group 3d_metals --dry-run

# Scan all 3d transition metals, scoring for tetragonal/hexagonal phases
python scripts/scout.py "Fe-Bi-{X}" --group 3d_metals \
    --crystal-systems tetragonal hexagonal

# Add composition constraints (Fe-rich) and use explicit element list
python scripts/scout.py "Fe-Bi-{X}" --elements Co Mn Ni Cr V Ti \
    --crystal-systems tetragonal hexagonal \
    --min-fraction Fe:0.3

What happens: For each candidate X, the scout runs a shallow exploration (fewer trials per stoichiometry for speed), then queries the convex hull to score how many stable structures appeared in your target crystal systems. Results are ranked in a summary table:

Rank  System           X     On-Hull  Near-Hull  Tgt-Hits   Best-E_hull  Formulas     Time
  1   Bi-Fe-Ti        Ti          3         8         5       12.0 meV        42      1.2h
  2   Bi-Co-Fe        Co          2         6         3       34.0 meV        38      1.1h
  3   Bi-Fe-Mn        Mn          1         5         2       89.0 meV        41      1.3h
  ...

The top-ranked systems are the ones worth a deep exploration run.
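The two mechanical parts of scouting, expanding the template and ranking the results, can be sketched as follows. The helper names are hypothetical, and the sort key (target hits first, then on-hull count, then best E_hull) is an assumption that happens to reproduce the order of the summary table above; the README does not state the actual scoring formula.

```python
def expand_template(template, candidates):
    """Map each candidate X to its alphabetically sorted chemical system."""
    fixed = [el for el in template.split("-") if el != "{X}"]
    systems = {}
    for x in candidates:
        if x in fixed:
            continue  # template elements are excluded automatically
        systems[x] = "-".join(sorted(fixed + [x]))
    return systems


def rank(rows):
    """Order scout results: more target hits, more on-hull phases, lower E_hull."""
    return sorted(
        rows,
        key=lambda r: (-r["target_hits"], -r["on_hull"], r["best_e_hull_mev"]),
    )


systems = expand_template("Fe-Bi-{X}", ["Mn", "Fe", "Ti", "Co"])

# Values copied from the example summary table above.
rows = [
    {"x": "Mn", "on_hull": 1, "target_hits": 2, "best_e_hull_mev": 89.0},
    {"x": "Ti", "on_hull": 3, "target_hits": 5, "best_e_hull_mev": 12.0},
    {"x": "Co", "on_hull": 2, "target_hits": 3, "best_e_hull_mev": 34.0},
]
for position, row in enumerate(rank(rows), start=1):
    print(position, systems[row["x"]], row["x"])
```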

| Option | Default | Description |
| --- | --- | --- |
| --group | | Named element group (3d_metals, transition_metals, metalloids, etc.) |
| --elements | | Explicit list of candidate elements |
| --exclude | | Elements to exclude (template elements are auto-excluded) |
| --crystal-systems | all | Target crystal systems for scoring |
| --shallow-trials | 5 | Generation attempts per stoichiometry |
| --min-fraction | none | Minimum element fractions, e.g. Fe:0.3 |
| --max-fraction | none | Maximum element fractions, e.g. Bi:0.2 |
| --max-atoms | 12 | Maximum atoms per unit cell |
| --e-above-hull | 0.15 | Energy cutoff (eV) for "near hull" |
| --dry-run | off | Show systems without running |
| --list-groups | off | Print available element groups |

Running Multiple Systems

Use GNU parallel to explore multiple systems concurrently:

parallel python scripts/explore.py ::: Fe-Mn-Si Li-Co-O Zn-Sn-Cu Na-P-S

Each run shares the unified database, so common subsystems (e.g., Fe-Mn) are explored once and reused.
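To see which subsystems two runs would have in common, it helps to normalize system names (the API sorts elements alphabetically, e.g. "Li-Co-O" becomes Co, Li, O) and list every proper subsystem. The helpers below are an illustrative sketch, not part of GGen.

```python
from itertools import combinations


def normalize_system(system):
    """Sort the elements so that 'Zn-Sn-Cu' and 'Cu-Sn-Zn' give the same key."""
    return "-".join(sorted(system.split("-")))


def subsystems(system):
    """List all proper subsystems (single elements, binaries, ...) of a system."""
    elements = sorted(system.split("-"))
    return [
        "-".join(combo)
        for size in range(1, len(elements))
        for combo in combinations(elements, size)
    ]


shared = set(subsystems("Fe-Mn-Si")) & set(subsystems("Fe-Mn-Co"))
print(normalize_system("Zn-Sn-Cu"))
print(sorted(shared))
```

Systems with a large shared set benefit most from the unified database, because those formulas only need to be generated once.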

Key Concepts

Phase Diagrams & Convex Hulls

A phase diagram shows which compositions are thermodynamically stable for a given set of elements. The convex hull is the lower convex envelope of formation energy versus composition: structures on the hull are stable against decomposition into competing phases. Structures above the hull are metastable at best; the energy above hull (E_hull, per atom) is the energy that would be released if they decomposed into the hull phases.

GGen computes phase diagrams automatically using pymatgen and generates interactive HTML plots (via Plotly) showing all explored structures.
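For intuition, the hull construction can be worked through by hand for a binary system. GGen itself relies on pymatgen; the sketch below is a dependency-free illustration for a hypothetical A-B system with invented formation energies, where x is the atomic fraction of B.

```python
def lower_hull(points):
    """Lower convex hull of (x, formation energy per atom) points."""
    best = {}
    for x, energy in points:  # keep only the lowest-energy polymorph per x
        best[x] = min(energy, best.get(x, energy))
    hull = []
    for p in sorted(best.items()):
        while len(hull) >= 2:
            (x1, y1), (x2, y2) = hull[-2], hull[-1]
            cross = (x2 - x1) * (p[1] - y1) - (y2 - y1) * (p[0] - x1)
            if cross > 0:
                break  # hull[-1] lies below the line hull[-2] -> p, keep it
            hull.pop()
        hull.append(p)
    return hull


def energy_above_hull(x, energy, hull):
    """Vertical distance (eV/atom) from a phase to the hull at composition x."""
    for (x1, y1), (x2, y2) in zip(hull, hull[1:]):
        if x1 <= x <= x2:
            hull_energy = y1 + (y2 - y1) * (x - x1) / (x2 - x1)
            return energy - hull_energy
    raise ValueError("composition outside the hull range")


# Invented energies for illustration only.
phases = {"A": (0.0, 0.0), "A3B": (0.25, -0.10), "AB": (0.5, -0.40),
          "AB3": (0.75, -0.30), "B": (1.0, 0.0)}
hull = lower_hull(phases.values())
for name, (x, energy) in phases.items():
    print(f"{name}: E_hull = {energy_above_hull(x, energy, hull) * 1000:.0f} meV/atom")
```

Here A3B sits above the line joining A and AB, so it would decompose into those two phases. Ternary and higher systems need a general convex hull in composition space, which is what pymatgen provides.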

Thermodynamic vs. Dynamical Stability

  • Thermodynamic stability (energy above hull): Can this structure exist without decomposing?
  • Dynamical stability (phonon modes): Is this structure a true local energy minimum (no imaginary phonon modes), or would it spontaneously distort?

A promising synthesis candidate should be both thermodynamically stable (or nearly so) and dynamically stable. GGen lets you filter for exactly this.
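Combining the two criteria amounts to a simple filter. The sketch below uses a stand-in Phase dataclass rather than GGen's own result classes; the field names follow the attributes used elsewhere in this README, the 0.025 eV/atom cutoff matches the default of get_stable_candidates, and the sample values echo the Fe-Mn-Si example output.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Phase:
    """Stand-in record; not GGen's CandidateResult class."""
    formula: str
    e_above_hull: float  # eV/atom
    is_dynamically_stable: Optional[bool]  # None means phonons not computed


def synthesis_candidates(phases, e_above_hull_cutoff=0.025):
    """Keep phases that are near the hull and confirmed phonon-stable."""
    return [
        p for p in phases
        if p.e_above_hull <= e_above_hull_cutoff and p.is_dynamically_stable is True
    ]


phases = [
    Phase("Fe3Si", 0.0, True),
    Phase("Mn3Si", 0.0, None),      # on hull, but untested
    Phase("FeMnSi", 0.0123, None),  # near hull, but untested
]
selected = [p.formula for p in synthesis_candidates(phases)]
print(selected)
```

Untested phases are excluded rather than assumed stable, which makes the filter conservative until phonons have been computed.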

Features

  • Crystal Generation: Generate structures from chemical formulas via PyXtal with automatic space group selection
  • MLIP Optimization: Geometry relaxation using ORB force field models (fast and accurate)
  • Phase Diagram Construction: Convex hull analysis with interactive Plotly visualizations
  • Phonon Calculations: Detect imaginary modes to assess dynamical stability
  • Unified Database: SQLite storage with cross-system structure sharing
  • Incremental Exploration: Skip already-explored formulas; build on previous runs
  • Structure Mutations: Lattice scaling, shearing, substitution, site operations for evolutionary optimization
  • Multiple Export Formats: CIF, XYZ, JSON, ASE trajectory files
  • Trajectory Tracking: Record structure evolution through mutations and optimizations

Python API

For programmatic access, GGen provides two main classes:

ChemistryExplorer — Systematic Exploration

from ggen import ChemistryExplorer

explorer = ChemistryExplorer(output_dir="./runs")

result = explorer.explore(
    chemical_system="Fe-Mn-Si",
    max_atoms=16,
    num_trials=15,
    optimize=True,
    preserve_symmetry=True,
    # Bias towards Fe-rich compositions
    min_fraction={"Fe": 0.4},
    max_fraction={"Si": 0.3},
)

print(f"Found {len(result.hull_entries)} phases on the convex hull")

# Get stable candidates
stable = explorer.get_stable_candidates(result, e_above_hull_cutoff=0.025)
for s in stable:
    print(f"{s.formula}: {s.energy_per_atom:.4f} eV/atom, E_hull={s.e_above_hull*1000:.1f} meV")

# Generate interactive phase diagram
fig = explorer.plot_phase_diagram(result)
fig.write_html("phase_diagram.html")

GGen — Single Structure Operations

from ggen import GGen

ggen = GGen()

# Generate a crystal
result = ggen.generate_crystal("BaTiO3", num_trials=10, optimize_geometry=True)
print(f"Space group: {result['final_space_group_symbol']}")
print(f"Energy: {result['best_crystal_energy']:.4f} eV")

# Apply mutations
ggen.scale_lattice(1.05)
ggen.substitute("Ba", "Sr", fraction=0.5)
ggen.jitter_sites(sigma=0.01)

# Export trajectory
ggen.export_trajectory("evolution.traj")

Output Structure

Each exploration run creates a timestamped directory:

runs/exploration_Fe-Mn-Si_20260105_143500/
├── structures/           # CIF files for all generated structures
│   ├── Fe2Mn_Im-3m.cif
│   ├── FeMnSi_P63-mmc.cif
│   └── ...
└── summary.json          # JSON summary of results

Use python scripts/report.py Fe-Mn-Si to generate reports and phase diagram visualizations.

The unified database (ggen.db by default) stores all structures across all runs.
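A run directory can also be inspected directly. The following sketch assumes only the layout shown above (a structures/ folder holding formula_spacegroup.cif files); the helper name is hypothetical, and the demo builds a temporary directory instead of reading a real run.

```python
import tempfile
from pathlib import Path


def list_run_structures(run_directory):
    """Return (formula, space group, path) for each CIF in a run directory."""
    entries = []
    for cif in sorted(Path(run_directory, "structures").glob("*.cif")):
        formula, _, space_group = cif.stem.partition("_")
        entries.append((formula, space_group, cif))
    return entries


# Demo on a throwaway directory that mimics the documented layout.
with tempfile.TemporaryDirectory() as tmp:
    structures = Path(tmp, "structures")
    structures.mkdir()
    for name in ("Fe2Mn_Im-3m.cif", "FeMnSi_P63-mmc.cif"):
        (structures / name).touch()
    found = [(formula, sg) for formula, sg, _ in list_run_structures(tmp)]

print(found)
```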

API Reference

GGen Class

Initialization:

GGen(
    calculator=None,        # ASE calculator (default: ORB)
    random_seed=None,       # For reproducibility
    enable_trajectory=True  # Track structure evolution
)

Key Methods:

  • generate_crystal(formula, space_group=None, num_trials=10, optimize_geometry=False)
  • set_structure(structure, add_to_trajectory=True)
  • get_structure()
  • optimize_geometry(max_steps=400, fmax=0.01, relax_cell=True)

Mutation Methods:

  • scale_lattice(scale_factor, isotropic=True)
  • shear_lattice(angle_deltas)
  • substitute(element_from, element_to, fraction=1.0)
  • add_site(element, coordinates, coords_are_cartesian=False)
  • remove_site(site_indices=None, element=None, max_remove=None)
  • move_site(site_index, displacement=None, new_coordinates=None)
  • jitter_sites(sigma=0.01, element=None)
  • symmetry_break(displacement_scale=0.01, angle_perturbation=0.1)
  • change_space_group(target_space_group, symprec=0.1)

Export Methods:

  • export_trajectory(filename) — Export to CIF, XYZ, JSON, or ASE .traj (auto-detected from extension)

ChemistryExplorer Class

Initialization:

ChemistryExplorer(
    calculator=None,     # ASE calculator (default: ORB, lazy-loaded)
    random_seed=None,    # For reproducibility
    output_dir=None      # Base directory for results
)

Main Methods:

  • explore(chemical_system, ...) — Run full exploration
  • get_stable_candidates(result, e_above_hull_cutoff=0.025) — Filter stable phases
  • plot_phase_diagram(result, show_unstable=0.1) — Generate interactive plot
  • export_summary(result, output_path=None) — Export JSON summary

Utility Methods:

  • parse_chemical_system(system) — Parse "Li-Co-O" → ["Co", "Li", "O"]
  • enumerate_stoichiometries(elements, max_atoms=12, ...)
  • find_previous_runs(chemical_system)
  • load_structures_from_previous_runs(chemical_system)
  • load_run(run_directory)

Data Classes

CandidateResult — A generated structure candidate:

  • formula, stoichiometry, energy_per_atom, total_energy
  • num_atoms, space_group_number, space_group_symbol
  • structure (pymatgen), cif_path, generation_metadata
  • is_valid, error_message
  • is_dynamically_stable, phonon_result

ExplorationResult — Complete exploration results:

  • chemical_system, elements
  • num_candidates, num_successful, num_failed
  • candidates, phase_diagram, hull_entries
  • run_directory, database_path, total_time_seconds

Dependencies

Core: numpy, scipy, pymatgen, pyxtal, ase, torch
ML Potentials: orb-models, torch-sim-atomistic
Phonons: phonopy, seekpath
Visualization: plotly, matplotlib
Utilities: tqdm, requests, pynanoflann

Contributing

Contributions welcome! Please open an issue to discuss major changes before submitting a PR.

License

MIT License — see LICENSE

Citation

@software{ggen2026,
  title={GGen: Crystal Generation and Mutation Library},
  author={Matt Moderwell},
  year={2026},
  url={https://github.com/ourofoundation/ggen}
}

Acknowledgments

Built on PyXtal, pymatgen, ASE, orb-models, phonopy, and torch-sim.

1. Import external structures (from the Materials Project, MP, by default)

python scripts/import.py --database ggen.db --provider mp

2. Relax with ORB (consistent energy domain)

python scripts/relax.py --source mp --unrelaxed --batch-size 32

3. Recompute hulls (now all energies are from ORB)

python scripts/report.py # or your hull recomputation workflow
