Multivariate Fine-Mapping with mvSuSiE and mr.mash

Multivariate fine-mapping using mvSuSiE and mr.mash is also available in our pipeline.

Input

--genoFile: path to a text file listing the genotype files, with one ID and path per row. For example:

#id     #path
21      $PATH/protocol_example.genotype.chr21_22.21.bed
22      $PATH/protocol_example.genotype.chr21_22.22.bed
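
A genotype list in this layout can be sanity-checked before it is passed to --genoFile. The following is a minimal sketch, not part of the pipeline: the file names, the temporary directory, and the stand-in .bed files are all hypothetical and exist only so the example runs on its own.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build and check a genotype list like the one above.
set -euo pipefail

workdir=$(mktemp -d)
geno_list="$workdir/genotype_list.txt"   # hypothetical file name

# Empty stand-in genotype files so the sketch is self-contained.
touch "$workdir/example.21.bed" "$workdir/example.22.bed"

# Write the header, then one tab-separated row per chromosome.
printf '#id\t#path\n' > "$geno_list"
for chrom in 21 22; do
    printf '%s\t%s\n' "$chrom" "$workdir/example.$chrom.bed" >> "$geno_list"
done

# Count rows that lack a path, have extra fields, or point to a missing file.
bad_rows=0
while IFS=$'\t' read -r id path extra; do
    case "$id" in '#'*) continue ;; esac   # skip the header line
    if [ -z "$path" ] || [ -n "$extra" ] || [ ! -f "$path" ]; then
        bad_rows=$((bad_rows + 1))
    fi
done < "$geno_list"

echo "rows with problems: $bad_rows"
```

A non-zero count suggests a mistyped path or a row that is not tab-separated.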

--phenoFile: a tab-delimited file containing the chr, start, end, ID, and path for each region. For example:

#chr    start   end     ID      path
chr21   0       14120807        TADB_1297       $PATH/protocol_example.ha.bed.gz
chr21   10840000        16880069        TADB_1298       $PATH/protocol_example.ha.bed.gz
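
The five-column layout above can be checked with awk before running the pipeline. This is a sketch under stated assumptions: the file name and the phenotype path are placeholders, and the rule that start must be smaller than end is an assumption about what counts as a well-formed region, not a documented pipeline requirement.

```shell
#!/usr/bin/env bash
# Hypothetical check of a region list in the --phenoFile layout.
set -euo pipefail

workdir=$(mktemp -d)
pheno_list="$workdir/region_list.txt"              # hypothetical file name
pheno_path="$workdir/protocol_example.ha.bed.gz"   # placeholder path, not created

# Recreate the two example rows with real tab separators.
printf '#chr\tstart\tend\tID\tpath\n' > "$pheno_list"
printf 'chr21\t0\t14120807\tTADB_1297\t%s\n' "$pheno_path" >> "$pheno_list"
printf 'chr21\t10840000\t16880069\tTADB_1298\t%s\n' "$pheno_path" >> "$pheno_list"

# Count data rows that do not have five fields or where start >= end.
bad_lines=$(awk -F'\t' '
    !/^#/ && (NF != 5 || $2 + 0 >= $3 + 0) { n++ }
    END { print n + 0 }
' "$pheno_list")

echo "malformed rows: $bad_lines"
```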

--covFile: path to a gzipped file with covariates in the rows and sample IDs in the columns.

--customized-association-windows: a tab-delimited file containing the chr, start, end, and ID of each region. For example:

#chr    start   end     ID
chr21   0       14120807        TADB_1297
chr21   10840000        16880069        TADB_1298

--region-name: to analyze a single region, give the ID of a region listed in the customized-association-windows file.
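
Because the value given to --region-name has to match an ID in the windows file, it can help to look the ID up first. The sketch below is illustrative only: the windows file is rebuilt in a temporary directory and the variable names are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical lookup of a region ID in a customized association windows file.
set -euo pipefail

workdir=$(mktemp -d)
windows="$workdir/windows.bed"   # hypothetical file name

# Recreate the example windows file with tab separators.
printf '#chr\tstart\tend\tID\n' > "$windows"
printf 'chr21\t0\t14120807\tTADB_1297\n' >> "$windows"
printf 'chr21\t10840000\t16880069\tTADB_1298\n' >> "$windows"

region_name="TADB_1298"   # the ID that would be passed to --region-name

# Print chr:start-end for the row whose fourth column equals the ID.
match=$(awk -F'\t' -v id="$region_name" '
    !/^#/ && $4 == id { print $1 ":" $2 "-" $3 }
' "$windows")

if [ -n "$match" ]; then
    echo "$region_name found at $match"
else
    echo "$region_name not found in $windows" >&2
fi
```

An empty match means the ID is absent from the windows file, so the region cannot be selected from it.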

--mixture_prior: path to an RDS file from mr.mash.

Minimal Working Example Steps

iv. Run the Fine-Mapping with mvSuSiE

sos run $PATH/protocol/pipeline/mnm_regression.ipynb mnm \
    --name ROSMAP_mega_eQTL --cwd $PATH/output/ \
    --genoFile $PATH/genofile/ROSMAP_NIA_WGS.leftnorm.bcftools_qc.plink_qc.11.bed \
    --phenoFile $PATH/phenofile/Mic/analysis_ready/phenotype_preprocessing/snuc_pseudo_bulk.Mic.mega.normalized.log2cpm.region_list.txt \
                $PATH/phenofile/Ast/analysis_ready/phenotype_preprocessing/snuc_pseudo_bulk.Ast.mega.normalized.log2cpm.region_list.txt \
                $PATH/phenofile/Oli/analysis_ready/phenotype_preprocessing/snuc_pseudo_bulk.Oli.mega.normalized.log2cpm.region_list.txt \
                $PATH/phenofile/OPC/analysis_ready/phenotype_preprocessing/snuc_pseudo_bulk.OPC.mega.normalized.log2cpm.region_list.txt \
                $PATH/phenofile/Exc/analysis_ready/phenotype_preprocessing/snuc_pseudo_bulk.Exc.mega.normalized.log2cpm.region_list.txt \
                $PATH/phenofile/Inh/analysis_ready/phenotype_preprocessing/snuc_pseudo_bulk.Inh.mega.normalized.log2cpm.region_list.txt \
    --covFile $PATH/phenofile/Mic/analysis_ready/covariate_preprocessing/snuc_pseudo_bulk.Mic.mega.normalized.log2cpm.rosmap_cov.ROSMAP_NIA_WGS.leftnorm.bcftools_qc.plink_qc.snuc_pseudo_bulk_mega.related.plink_qc.extracted.pca.projected.Marchenko_PC.gz \
              $PATH/phenofile/Ast/analysis_ready/covariate_preprocessing/snuc_pseudo_bulk.Ast.mega.normalized.log2cpm.rosmap_cov.ROSMAP_NIA_WGS.leftnorm.bcftools_qc.plink_qc.snuc_pseudo_bulk_mega.related.plink_qc.extracted.pca.projected.Marchenko_PC.gz \
              $PATH/phenofile/Oli/analysis_ready/covariate_preprocessing/snuc_pseudo_bulk.Oli.mega.normalized.log2cpm.rosmap_cov.ROSMAP_NIA_WGS.leftnorm.bcftools_qc.plink_qc.snuc_pseudo_bulk_mega.related.plink_qc.extracted.pca.projected.Marchenko_PC.gz \
              $PATH/phenofile/OPC/analysis_ready/covariate_preprocessing/snuc_pseudo_bulk.OPC.mega.normalized.log2cpm.rosmap_cov.ROSMAP_NIA_WGS.leftnorm.bcftools_qc.plink_qc.snuc_pseudo_bulk_mega.related.plink_qc.extracted.pca.projected.Marchenko_PC.gz \
              $PATH/phenofile/Exc/analysis_ready/covariate_preprocessing/snuc_pseudo_bulk.Exc.mega.normalized.log2cpm.rosmap_cov.ROSMAP_NIA_WGS.leftnorm.bcftools_qc.plink_qc.snuc_pseudo_bulk_mega.related.plink_qc.extracted.pca.projected.Marchenko_PC.gz \
              $PATH/phenofile/Inh/analysis_ready/covariate_preprocessing/snuc_pseudo_bulk.Inh.mega.normalized.log2cpm.rosmap_cov.ROSMAP_NIA_WGS.leftnorm.bcftools_qc.plink_qc.snuc_pseudo_bulk_mega.related.plink_qc.extracted.pca.projected.Marchenko_PC.gz \
    --customized-association-windows $PATH/windows/TADB_enhanced_cis.coding.bed \
    --region-name ENSG00000073921 --save_data --no-skip-twas-weights \
    --phenotype-names Mic_mega_eQTL Ast_mega_eQTL Oli_mega_eQTL OPC_mega_eQTL Exc_mega_eQTL Inh_mega_eQTL \
    --mixture_prior /data/analysis_result/mash/mixture_prior.EZ.prior.rds \
    --max_cv_variants 5000 \
    --ld_reference_meta_file $PATH/ldref/ld_meta_file.tsv

Anticipated Results

For each gene, multivariate fine-mapping produces a file containing results for the top hits and a file containing TWAS weights produced by SuSiE.

ROSMAP_mega_eQTL.chr11_ENSG00000073921.multivariate_bvrs.rds:

  • For each gene of interest, this file contains:

    1. mrmash_fitted

    2. reweighted_mixture_prior

    3. reweighted_mixture_prior_cv

    4. mvsusie_fitted

    5. variant_names

    6. analysis_script

    7. other_quantities

    8. context_names

    9. top_loci

    10. susie_result_trimmed

    11. total_time_elapsed

    12. region_info

ROSMAP_mega_eQTL.chr11_ENSG00000073921.multivariate_data.rds: (from the --save_data argument)

ROSMAP_mega_eQTL.chr11_ENSG00000073921.multivariate_twas_weights.rds:

  • For each gene of interest and phenotype, this file contains:

    1. twas_weights - TWAS weights for the mrmash and mvsusie methods

    2. twas_predictions - TWAS predictions for the mrmash and mvsusie methods

    3. variant_names

    4. twas_cv_result

    5. total_time_elapsed

    6. region_info