Bug: l2_parser can fail to load ldscores #342

@yyoshiaki

Hi, I would like to report a bug and its fix.

In partitioned heritability calculations with multiple annotations and chromosomes, --h2 can fail in the l2_parser step. In my case, for example, the following command produced the error shown in the log below.

python ../../ldsc/ldsc.py \
 --h2 ../../data/ldsc/all_sumstats/${TRAIT}.sumstats \
 --ref-ld-chr annotations/Roadmap_U_ABC/merged. \
 --frqfile-chr ../../data/ldsc/1000G_Phase3_frq/1000G.EUR.QC. \
 --w-ld-chr ../../data/ldsc/1000G_Phase3_weights_hm3_no_MHC/weights.hm3_noMHC. \
 --overlap-annot --print-coefficients --print-delete-vals \
 --out heritability/Roadmap_U_ABC_h2/${TRAIT}_merged
*********************************************************************
* LD Score Regression (LDSC)
* Version 1.0.1
* (C) 2014-2019 Brendan Bulik-Sullivan and Hilary Finucane
* Broad Institute of MIT and Harvard / MIT Department of Mathematics
* GNU General Public License v3
*********************************************************************
Call: 
./ldsc.py \
--h2 ../../data/ldsc/all_sumstats/PASS_AdultOnsetAsthma_Ferreira2019.sumstats \
--ref-ld-chr annotations/Roadmap_U_ABC/merged. \
--out heritability/Roadmap_U_ABC_h2/PASS_AdultOnsetAsthma_Ferreira2019_merged \
--overlap-annot  \
--frqfile-chr ../../data/ldsc/1000G_Phase3_frq/1000G.EUR.QC. \
--w-ld-chr ../../data/ldsc/1000G_Phase3_weights_hm3_no_MHC/weights.hm3_noMHC. \
--print-coefficients  \
--print-delete-vals  

Beginning analysis at Tue Apr  5 15:51:00 2022
Reading summary statistics from ../../data/ldsc/all_sumstats/PASS_AdultOnsetAsthma_Ferreira2019.sumstats ...
Read summary statistics for 350506 SNPs.
Reading reference panel LD Score from annotations/Roadmap_U_ABC/merged.[1-22] ... (ldscore_fromlist)
Read reference panel LD Scores for 1190321 SNPs.
Removing partitioned LD Scores with zero variance.
Traceback (most recent call last):
  File "../../ldsc/ldsc.py", line 644, in <module>
    sumstats.estimate_h2(args, log)
  File "/mnt/media32TB/home/yyasumizu/bioinformatics/autoimmune_10x/sclinker/ldsc/ldscore/sumstats.py", line 326, in estimate_h2
    args, log, args.h2)
  File "/mnt/media32TB/home/yyasumizu/bioinformatics/autoimmune_10x/sclinker/ldsc/ldscore/sumstats.py", line 246, in _read_ld_sumstats
    M_annot, ref_ld, novar_cols = _check_variance(log, M_annot, ref_ld)
  File "/mnt/media32TB/home/yyasumizu/bioinformatics/autoimmune_10x/sclinker/ldsc/ldscore/sumstats.py", line 200, in _check_variance
    M_annot = M_annot[:, ii_m]
IndexError: boolean index did not match indexed array along dimension 1; dimension is 11 but corresponding boolean dimension is 10

Analysis finished at Tue Apr  5 15:51:07 2022
Total time elapsed: 6.9s

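This happens because older versions of pandas sort columns alphabetically during concatenation (https://stackoverflow.com/questions/39046931/column-order-in-pandas-concat). As a result, the LD score columns no longer line up with the annotation counts, which leads to the IndexError above. I fixed it by explicitly specifying the column order after concatenation in PR #341.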
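For illustration, here is a minimal sketch of the reordering and one way to undo it. It is not the code from PR #341. The table contents and column names (`baseL2`, `annot_a_L2`, `annot_b_L2`) are made up, and `sort=True` is passed explicitly to imitate, on a recent pandas, the sorting that older versions applied on their own when the columns of the inputs were not aligned.

```python
import pandas as pd

# Hypothetical per-chromosome LD score tables (names and values are illustrative only).
chr1 = pd.DataFrame({
    "SNP": ["rs1", "rs2"],
    "baseL2": [1.0, 2.0],
    "annot_b_L2": [0.1, 0.2],
    "annot_a_L2": [0.3, 0.4],
})
# Same columns, different order, so the column axes are not aligned.
chr2 = pd.DataFrame({
    "SNP": ["rs3"],
    "annot_a_L2": [0.5],
    "baseL2": [3.0],
    "annot_b_L2": [0.6],
})

# Remember the column order of the first table before concatenating.
expected_order = list(chr1.columns)

# sort=True is used here only to imitate the implicit sorting of older pandas.
merged = pd.concat([chr1, chr2], sort=True)

# Re-select the columns explicitly so the order no longer depends on pandas.
restored = merged[expected_order]

print(list(merged.columns))
print(list(restored.columns))
```

Selecting the columns by name after `pd.concat` makes the result independent of the installed pandas version, which is the general idea behind specifying the columns after concatenation.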
