
Refactor Sem multi-group support#317

Open
alyst wants to merge 18 commits intoStructuralEquationModels:develfrom
alyst:refactor_sem_terms

Conversation

@alyst
Contributor

@alyst alyst commented Mar 9, 2026

This is the largest remaining part of #193, and it changes some interfaces.

Refactoring of the SEM types

  • AbstractLoss is the base type for all loss functions
  • SemLoss{O,I} <: AbstractLoss is the base type for all SEM losses; its subtypes are now required to have observed::O and implied::I fields
  • Since the SemLoss constructor should always be given observed and implied (as positional arguments), the meanstructure keyword is gone -- the loss should always respect the implied specification.
  • LossTerm is a thin wrapper around AbstractLoss that adds an optional id and an optional weight to the loss term
  • Sem is a container of LossTerm objects (accessible via loss_terms(sem), or loss_term(sem, id)), so it can handle multiple SEM terms (accessible via sem_terms(sem), a subset of loss_terms(sem), or via sem_term(sem, id)).
    It replaces both the old Sem and SemEnsemble.
  • AbstractSemSingle, AbstractSemCollection and SemEnsemble are gone.
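
The hierarchy described above can be pictured with a minimal, self-contained sketch. It is not the code from this PR: ToyML, ToyRidge and ToySem are hypothetical stand-ins, and the field layout of LossTerm is an assumption made only for illustration.

```julia
# Illustrative sketch only: type names follow the PR description,
# but all fields and method bodies here are assumptions.
abstract type AbstractLoss end

# Every SEM loss carries its observed data (O) and implied model (I).
abstract type SemLoss{O,I} <: AbstractLoss end

# Hypothetical concrete SEM loss with the two required fields.
struct ToyML{O,I} <: SemLoss{O,I}
    observed::O
    implied::I
end

# Hypothetical non-SEM loss (e.g. a regularization penalty).
struct ToyRidge <: AbstractLoss
    alpha::Float64
end

# Thin wrapper that adds an optional id and an optional weight.
struct LossTerm{L<:AbstractLoss}
    loss::L
    id::Union{Symbol,Nothing}
    weight::Union{Float64,Nothing}
end

# Container of loss terms (stand-in for the new Sem).
struct ToySem
    terms::Vector{LossTerm}
end

loss_terms(sem::ToySem) = sem.terms

# SEM terms are the subset of loss terms whose loss is a SemLoss.
sem_terms(sem::ToySem) = filter(t -> t.loss isa SemLoss, sem.terms)

# Look up a loss term by its id; throws KeyError if no term has that id.
function loss_term(sem::ToySem, id::Symbol)
    idx = findfirst(t -> t.id === id, sem.terms)
    idx === nothing && throw(KeyError(id))
    return sem.terms[idx]
end
```

In this sketch a model with one SEM term and one penalty term would report two loss terms but only one SEM term.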

Method changes

Multi-term SEMs can be created like this:

model = Sem(
    :Pasteur => SemML(obs_g1, RAMSymbolic(specification_g1)),
    :Grant_White => SemML(obs_g2, RAM(specification_g2)),
    ...
)

Or with weights specified:

model = Sem(
    :Pasteur => SemML(obs_g1, RAMSymbolic(specification_g1)) => 0.5,
    :Grant_White => SemML(obs_g2, RAM(specification_g2)) => 0.6,
)
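
A note on the syntax above: in Julia `=>` is right-associative, so `id => loss => weight` is parsed as `id => (loss => weight)`. The sketch below shows this with placeholder values; `unpack_term` is a hypothetical helper, not a function from this PR, and only illustrates how both forms could be told apart by dispatch.

```julia
# `=>` is right-associative: the weighted form is a Pair whose second
# element is itself a Pair (loss => weight).
weighted = :Pasteur => "loss placeholder" => 0.5
unweighted = :Grant_White => "loss placeholder"

# Hypothetical helper that normalizes both forms into (id, loss, weight).
unpack_term(p::Pair{Symbol,<:Pair}) = (p.first, p.second.first, p.second.second)
unpack_term(p::Pair{Symbol}) = (p.first, p.second, nothing)
```

One caveat of this assumed approach: a loss object that is itself a Pair would be misread as a weighted term, so a real constructor would likely restrict the loss type.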

The new Sem() and loss-term constructors rely less on keyword arguments and more on positional arguments, but some keyword support remains.

  • update_observed!() was removed. It was only used by replace_observed(),
    and in-place model modification with unclear semantics is error-prone.
  • replace_observed(sem, data) was simplified: it no longer supports additional keywords or requires passing the SEM specification.
    It only creates a copy of the given Sem with the observed data replaced,
    leaving the implied and loss definitions intact.
    Changing the observed variables is not supported -- that is use-case specific
    and should be implemented in user code.
  • check_single_lossfun() was renamed to check_same_semterm_type(), which
    better describes what it does. If the check succeeds, it returns the specific
    subtype of SemLoss.
  • bootstrap() and se_bootstrap() use the bootstrap!(acc::BootstrapAccumulator, ...)
    function to reduce code duplication
  • bootstrap() returns BootstrapResult{T} for better type inference
  • fit_measures() now also accepts a vector of functions, and includes CFI by default (DEFAULT_FIT_MEASURES constant)
  • test_fitmeasures() was tweaked to handle more of the repetitive code: calculating a subset of fit measures, comparing this subset against the lavaan references, and checking for measures that cannot be applied to the given loss types (SemWLS).

@alyst alyst changed the base branch from main to devel March 9, 2026 22:48
@alyst alyst force-pushed the refactor_sem_terms branch from 3c39941 to 32cea82 Compare March 11, 2026 20:31
Alexey Stukalov and others added 3 commits March 21, 2026 11:03
- for SemImplied require spec::SemSpec as positional
- for SemLossFunction require implied argument
@alyst alyst force-pushed the refactor_sem_terms branch from 32cea82 to eb039a2 Compare March 23, 2026 04:49
@alyst alyst force-pushed the refactor_sem_terms branch from eb039a2 to 88a1ff0 Compare March 23, 2026 07:33
@alyst alyst changed the title Refactor Sem mult-group support Refactor Sem multi-group support Mar 23, 2026
@alyst alyst marked this pull request as ready for review March 23, 2026 08:12
@alyst alyst force-pushed the refactor_sem_terms branch from 88a1ff0 to 0406f29 Compare March 23, 2026 17:51
Comment on lines +6 to +7
In this case, [`FiniteDiffWrapper`](@ref) method to generate a wrapper around the specific `SemLoss` term that only uses its objective
to calculate the gradient using the finite difference approximation.
Collaborator

Suggested change
In this case, [`FiniteDiffWrapper`](@ref) method to generate a wrapper around the specific `SemLoss` term that only uses its objective
to calculate the gradient using the finite difference approximation.
In this case, [`FiniteDiffWrapper`](@ref) can be used to generate a wrapper around the specific `SemLoss` term. This wrapper only uses the `LossTerm`'s objective, and calculates the gradient using finite difference approximation.

@alyst
Contributor Author

alyst commented Mar 24, 2026

@Maximilian-Stefan-Ernst It might be a nice idea to use copilot for catching typos, incorrect sentences, but also potential bugs.
I cannot select copilot as a reviewer -- I'm not exactly sure why, whether it is the organization/repository-level setting, or it's my status in the repository.
But I'm also fine if SEM.jl is kept AI-free :)

function χ²(fit::SemFit, model::AbstractSemSingle)
check_single_lossfun(model; throw_error = true)
return χ²(model.loss.functions[1], fit::SemFit, model::AbstractSemSingle)
return χ²(typeof(term1), fit, model)
Collaborator

Is there a reason to pass typeof(term1) instead of term1? I personally find the syntax a bit cleaner without the extra typeof call.
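
For readers comparing the two call styles, here is a minimal sketch using hypothetical DemoML and DemoWLS types (not the package's loss types). It only shows that Julia can dispatch either on the type object or on the instance; it does not claim which one the PR should use.

```julia
# Hypothetical stand-in types, for illustration only.
abstract type DemoLoss end
struct DemoML <: DemoLoss end
struct DemoWLS <: DemoLoss end

# Style 1: dispatch on the type, as in χ²(typeof(term1), fit, model).
label(::Type{<:DemoML}) = "ML"
label(::Type{<:DemoWLS}) = "WLS"

# Style 2: dispatch on the instance, as in χ²(term1, fit, model).
# Forwarding to the type-based methods keeps a single set of definitions.
label(term::DemoLoss) = label(typeof(term))
```

Both styles select the same method here; the type-based form may be preferable only when no instance is at hand.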

############################################################################################
function χ²(fit::SemFit, model::AbstractSem)
terms = sem_terms(model)
isempty(terms) && return 0.0
Collaborator

Maybe we should throw an error for a Sem with no terms?
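
A minimal sketch of what such a check could look like, assuming an ArgumentError is the desired behaviour. `require_sem_terms` is a hypothetical name and not part of this PR.

```julia
# Hypothetical guard: fail early instead of silently returning 0.0
# when the model contains no SEM terms.
function require_sem_terms(terms::AbstractVector)
    isempty(terms) &&
        throw(ArgumentError("model has no SEM terms, so χ² is undefined"))
    return terms
end
```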

Comment on lines +20 to +30
term1 = _unwrap(loss(terms[1]))
L = typeof(term1).name

# check that all SemLoss terms are of the same class (ML, FIML, WLS etc), ignore typeparams
for (i, term) in enumerate(terms)
lossterm = _unwrap(loss(term))
@assert lossterm isa SemLoss
if typeof(_unwrap(lossterm)).name != L
@error "SemLoss term #$i is $(typeof(_unwrap(lossterm)).name), expected $L. Heterogeneous loss functions are not supported"
end
end
Collaborator

I thought this is done in check_semterm_type?

Suggested change
term1 = _unwrap(loss(terms[1]))
L = typeof(term1).name
# check that all SemLoss terms are of the same class (ML, FIML, WLS etc), ignore typeparams
for (i, term) in enumerate(terms)
lossterm = _unwrap(loss(term))
@assert lossterm isa SemLoss
if typeof(_unwrap(lossterm)).name != L
@error "SemLoss term #$i is $(typeof(_unwrap(lossterm)).name), expected $L. Heterogeneous loss functions are not supported"
end
end

2 participants