Add lassosum RSS method with C++ coordinate descent #457
Port lassosum (Mak et al., 2017) to work with pre-computed LD matrices and summary statistics. Uses the same interface as prs_cs/sdpr: lassosum_rss(bhat, LD, n, ...) with LD as a list of block matrices.

- Add C++ coordinate descent solver (src/lassosum_rss.cpp) tracking Rbeta as a running sum, matching the original elnet() algorithm
- Add lassosum_rss() and lassosum_rss_weights() R wrappers
- Add shrinkage parameter to compute_LD() for LD regularization
- Add unit tests mirroring existing prs_cs test patterns

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
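The "Rbeta as a running sum" idea can be illustrated with a minimal sketch. It assumes the lasso objective is written as beta'R beta - 2 beta'r + 2*lambda*||beta||_1, with r the SNP-wise correlations; that scaling is an assumption here, not something the PR states. The names lasso_cd_sketch and soft_threshold are hypothetical and are not the functions in src/lassosum_rss.cpp.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Soft-thresholding operator: sign(z) * max(|z| - lambda, 0).
double soft_threshold(double z, double lambda) {
    if (z > lambda) return z - lambda;
    if (z < -lambda) return z + lambda;
    return 0.0;
}

// Hypothetical sketch: cyclic coordinate descent for
//   beta'R beta - 2 beta'r + 2*lambda*||beta||_1   (assumed scaling).
// Rbeta caches R * beta and is patched in O(p) whenever one coordinate moves,
// instead of being recomputed from scratch.
std::vector<double> lasso_cd_sketch(const Matrix& R, const std::vector<double>& r,
                                    double lambda, int max_iter = 1000,
                                    double tol = 1e-10) {
    const std::size_t p = r.size();
    assert(R.size() == p);
    std::vector<double> beta(p, 0.0), Rbeta(p, 0.0);
    for (int iter = 0; iter < max_iter; ++iter) {
        double max_change = 0.0;
        for (std::size_t j = 0; j < p; ++j) {
            if (R[j][j] <= 0.0) continue;  // skip degenerate coordinates
            // Correlation with the residual, excluding coordinate j itself.
            const double z = r[j] - (Rbeta[j] - R[j][j] * beta[j]);
            const double new_beta = soft_threshold(z, lambda) / R[j][j];
            const double delta = new_beta - beta[j];
            if (delta != 0.0) {
                // Running-sum update: add column j of R scaled by the change.
                for (std::size_t k = 0; k < p; ++k) Rbeta[k] += R[k][j] * delta;
                beta[j] = new_beta;
                max_change = std::max(max_change, std::fabs(delta));
            }
        }
        if (max_change < tol) break;  // no coordinate moved appreciably
    }
    return beta;
}
```

Caching Rbeta keeps each coordinate update at O(p) work only when the coefficient actually changes, which is cheap for sparse solutions.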
Validation Experiments

Setup

All experiments use simulated data: genotype matrix
The pecotmr version takes a pre-shrunk LD matrix

Experiment 1: Numerical identity with original lassosum

Compared beta coefficient matrices across 9 configurations (3 seeds × 3 shrinkage values), n=500, p=40, 20 lambda values each:

Values are max|beta_diff| across all p×20 = 800 coefficients per configuration.

Result: Betas are identical to within machine epsilon (~1e-15) across all seeds, shrinkage values, and lambda values. KKT optimality conditions verified with zero violations.

Experiment 2: Comparison with PRS-CS

Simulated n=1000, p=20, 3 causal SNPs with effects (0.3, -0.2, 0.15):

Result: Both methods recover the true signal direction well (cor > 0.95 with truth). Lassosum is sparser (L1 penalty), PRS-CS is denser (continuous shrinkage prior). They agree with each other at cor=0.98.

SDPR could not be compiled on ARM/Apple Silicon due to SSE intrinsics in

Note on
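For readers who want to reproduce the kind of check described in Experiment 1, here is a minimal sketch of LD shrinkage followed by a KKT violation count. It uses the formula R_s = (1-s)*R + s*I from the PR summary and assumes the objective beta'R beta - 2 beta'r + 2*lambda*||beta||_1, so that optimality means r_j - (R beta)_j equals lambda*sign(beta_j) for nonzero coefficients and lies within [-lambda, lambda] for zero ones. The function names shrink_ld and count_kkt_violations and the tolerance are hypothetical, not taken from the PR.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// R_s = (1 - s) * R + s * I for a square LD matrix R.
Matrix shrink_ld(const Matrix& R, double s) {
    assert(s >= 0.0 && s <= 1.0);
    Matrix Rs = R;
    for (std::size_t i = 0; i < Rs.size(); ++i) {
        for (std::size_t j = 0; j < Rs[i].size(); ++j) Rs[i][j] *= (1.0 - s);
        Rs[i][i] += s;  // add s on the diagonal
    }
    return Rs;
}

// Count coordinates that break the lasso KKT conditions (assumed scaling):
//   beta_j != 0  ->  r_j - (R beta)_j == lambda * sign(beta_j)
//   beta_j == 0  ->  |r_j - (R beta)_j| <= lambda
int count_kkt_violations(const Matrix& R, const std::vector<double>& r,
                         const std::vector<double>& beta, double lambda,
                         double tol = 1e-8) {
    const std::size_t p = r.size();
    assert(R.size() == p && beta.size() == p);
    int violations = 0;
    for (std::size_t j = 0; j < p; ++j) {
        double g = r[j];
        for (std::size_t k = 0; k < p; ++k) g -= R[j][k] * beta[k];
        if (beta[j] != 0.0) {
            const double target = beta[j] > 0.0 ? lambda : -lambda;
            if (std::fabs(g - target) > tol) ++violations;
        } else if (std::fabs(g) > lambda + tol) {
            ++violations;
        }
    }
    return violations;
}
```

A count of zero at a candidate solution indicates it satisfies the optimality conditions up to the chosen tolerance.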
The original SDPR (Zhou et al.) used x86 SSE intrinsics (log_ps, exp_ps, _mm_max_ps, _mm_hadd_ps from sse_mathfun.h) to compute log-sum-exp over M cluster probabilities in sample_assignment(). This prevented compilation on ARM/Apple Silicon.

Replace with Armadillo vectorized operations (arma::log, arma::exp, arma::max, arma::accu), which delegate to platform-optimal SIMD (NEON on ARM, SSE/AVX on x86) through compiler auto-vectorization, giving portable performance without architecture-specific intrinsics.

- Rewrite sample_assignment() using arma::vec for cluster probabilities
- Remove src/sse_mathfun.h (719 lines of x86 SSE math functions)
- Remove src/simde/ directory (~789K lines of x86→ARM translation headers)
- Remove SIMDE_ENABLE_NATIVE_ALIASES flag from Makevars.in

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
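The log-sum-exp step that the intrinsics implemented can be sketched portably. To stay dependency-free, this illustration uses only the C++ standard library rather than Armadillo; the max, exp, sum, and log steps correspond to the arma::max, arma::exp, arma::accu, and arma::log calls named above. The function names are hypothetical, and this is not the rewritten sample_assignment().

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Numerically stable log(sum_k exp(log_p[k])): subtract the maximum first
// so the largest exponent is exp(0) = 1 and nothing overflows or underflows
// to an all-zero sum.
double log_sum_exp(const std::vector<double>& log_p) {
    assert(!log_p.empty());
    const double m = *std::max_element(log_p.begin(), log_p.end());
    if (!std::isfinite(m)) return m;  // degenerate input: all -inf, or +inf present
    double acc = 0.0;
    for (double v : log_p) acc += std::exp(v - m);
    return m + std::log(acc);
}

// Turn unnormalised log cluster weights into probabilities that sum to 1.
std::vector<double> normalize_log_probs(const std::vector<double>& log_p) {
    const double lse = log_sum_exp(log_p);
    assert(std::isfinite(lse));
    std::vector<double> prob(log_p.size());
    for (std::size_t i = 0; i < log_p.size(); ++i) {
        prob[i] = std::exp(log_p[i] - lse);
    }
    return prob;
}
```

Plain loops like these contain no architecture-specific code, so the same source builds on both x86 and ARM.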
This reverts commit 767abc6.
Summary
- Add C++ coordinate descent solver in src/lassosum_rss.cpp, tracking Rbeta as a running sum to match the original elnet() algorithm
- Add lassosum_rss(bhat, LD, n, ...) and lassosum_rss_weights(stat, LD, ...) with identical interface to prs_cs/sdpr
- Add shrinkage parameter to compute_LD() for LD regularization: R_s = (1-s)*R + s*I
- Add unit tests mirroring existing prs_cs test patterns

Test plan
- devtools::test(filter="regularized_regression") passes
- R CMD check passes

🤖 Generated with Claude Code