Multivariate Fine-Mapping with mvSuSiE and mr.mash
Multivariate fine-mapping using mvSuSiE and mr.mash is also available in our pipeline.
Input
--genoFile
: path to a text file containing information on genotype files. For example:
#id #path
21 $PATH/protocol_example.genotype.chr21_22.21.bed
22 $PATH/protocol_example.genotype.chr21_22.22.bed
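A list like this can be generated instead of typed by hand. The following is a minimal sketch, assuming the columns are tab-separated and that the per-chromosome files follow the `<prefix>.<chr>.bed` naming seen in the example; the helper name `make_geno_list` and the `data_dir` variable are hypothetical and not part of the pipeline.

```shell
# Hypothetical helper: print a two-column genotype list for the given chromosomes.
# Assumes tab-separated columns and files named <prefix>.<chr>.bed.
make_geno_list() {
    prefix="$1"
    shift
    printf '#id\t#path\n'
    for chr in "$@"; do
        printf '%s\t%s.%s.bed\n' "$chr" "$prefix" "$chr"
    done
}

# data_dir is a stand-in for the directory holding the genotype files.
data_dir="/path/to/genotypes"
make_geno_list "$data_dir/protocol_example.genotype.chr21_22" 21 22
```

Redirecting the output to a file gives something that could be passed to `--genoFile`.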
--phenoFile
: a tab-delimited file containing chr, start, end, ID, and path for the regions. For example:
#chr start end ID path
chr21 0 14120807 TADB_1297 $PATH/protocol_example.ha.bed.gz
chr21 10840000 16880069 TADB_1298 $PATH/protocol_example.ha.bed.gz
--covFile
: path to a gzipped file containing covariates in the rows and sample IDs in the columns.
--customized-association-windows
: a tab-delimited file containing the chr, start, end, and ID of each region. For example:
#chr start end ID
chr21 0 14120807 TADB_1297
chr21 10840000 16880069 TADB_1298
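Malformed region files are easier to catch before a run than during one. The sketch below is one possible sanity check, not something the pipeline provides: it assumes a tab-delimited file whose header line starts with `#`, and verifies that every row has at least four columns with integer coordinates and start < end. The function name `check_region_file` is hypothetical. Because it only requires "at least four" columns, it also applies to the five-column `--phenoFile` region list.

```shell
# Hypothetical sanity check for a region list (tab-delimited, header starts with '#').
# Exits non-zero and prints a message for each offending line.
check_region_file() {
    awk -F'\t' '
        /^#/ { next }    # skip the header line
        NF < 4 { printf "line %d: expected at least 4 columns, got %d\n", NR, NF; bad = 1; next }
        $2 !~ /^[0-9]+$/ || $3 !~ /^[0-9]+$/ { printf "line %d: non-integer coordinates\n", NR; bad = 1; next }
        $2 + 0 >= $3 + 0 { printf "line %d: start >= end\n", NR; bad = 1 }
        END { exit bad + 0 }
    ' "$1"
}

# Demonstration on a temporary copy of the example rows above.
tmp_windows="$(mktemp)"
printf '#chr\tstart\tend\tID\n' > "$tmp_windows"
printf 'chr21\t0\t14120807\tTADB_1297\n' >> "$tmp_windows"
printf 'chr21\t10840000\t16880069\tTADB_1298\n' >> "$tmp_windows"
check_region_file "$tmp_windows" && echo "region file looks consistent"
rm -f "$tmp_windows"
```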
--region-name
: to analyze a single region, give the ID of a region listed in the customized-association-windows file.
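Since the region name has to match an ID in the customized-association-windows file, a quick lookup before launching the run can save a failed job. This is a minimal sketch under the assumption that the file is tab-delimited with the ID in the fourth column, as in the example; `region_in_windows` is a hypothetical helper name.

```shell
# Hypothetical helper: succeed only if region ID $2 appears in column 4 of file $1.
region_in_windows() {
    awk -F'\t' -v id="$2" '
        !/^#/ && $4 == id { found = 1 }
        END { exit !found }
    ' "$1"
}

# Demonstration on a temporary file with the example rows.
tmp_windows="$(mktemp)"
printf '#chr\tstart\tend\tID\n' > "$tmp_windows"
printf 'chr21\t0\t14120807\tTADB_1297\n' >> "$tmp_windows"
printf 'chr21\t10840000\t16880069\tTADB_1298\n' >> "$tmp_windows"
if region_in_windows "$tmp_windows" TADB_1298; then
    echo "TADB_1298 is listed"
fi
rm -f "$tmp_windows"
```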
--mixture_prior
: path to the RDS file containing the mixture prior from mr.mash.
Minimal Working Example Steps
iv. Run the Fine-Mapping with mvSuSiE
sos run $PATH/protocol/pipeline/mnm_regression.ipynb mnm \
--name ROSMAP_mega_eQTL --cwd $PATH/output/ \
--genoFile $PATH/genofile/ROSMAP_NIA_WGS.leftnorm.bcftools_qc.plink_qc.11.bed \
--phenoFile $PATH/phenofile/Mic/analysis_ready/phenotype_preprocessing/snuc_pseudo_bulk.Mic.mega.normalized.log2cpm.region_list.txt \
$PATH/phenofile/Ast/analysis_ready/phenotype_preprocessing/snuc_pseudo_bulk.Ast.mega.normalized.log2cpm.region_list.txt \
$PATH/phenofile/Oli/analysis_ready/phenotype_preprocessing/snuc_pseudo_bulk.Oli.mega.normalized.log2cpm.region_list.txt \
$PATH/phenofile/OPC/analysis_ready/phenotype_preprocessing/snuc_pseudo_bulk.OPC.mega.normalized.log2cpm.region_list.txt \
$PATH/phenofile/Exc/analysis_ready/phenotype_preprocessing/snuc_pseudo_bulk.Exc.mega.normalized.log2cpm.region_list.txt \
$PATH/phenofile/Inh/analysis_ready/phenotype_preprocessing/snuc_pseudo_bulk.Inh.mega.normalized.log2cpm.region_list.txt \
--covFile $PATH/phenofile/Mic/analysis_ready/covariate_preprocessing/snuc_pseudo_bulk.Mic.mega.normalized.log2cpm.rosmap_cov.ROSMAP_NIA_WGS.leftnorm.bcftools_qc.plink_qc.snuc_pseudo_bulk_mega.related.plink_qc.extracted.pca.projected.Marchenko_PC.gz \
$PATH/phenofile/Ast/analysis_ready/covariate_preprocessing/snuc_pseudo_bulk.Ast.mega.normalized.log2cpm.rosmap_cov.ROSMAP_NIA_WGS.leftnorm.bcftools_qc.plink_qc.snuc_pseudo_bulk_mega.related.plink_qc.extracted.pca.projected.Marchenko_PC.gz \
$PATH/phenofile/Oli/analysis_ready/covariate_preprocessing/snuc_pseudo_bulk.Oli.mega.normalized.log2cpm.rosmap_cov.ROSMAP_NIA_WGS.leftnorm.bcftools_qc.plink_qc.snuc_pseudo_bulk_mega.related.plink_qc.extracted.pca.projected.Marchenko_PC.gz \
$PATH/phenofile/OPC/analysis_ready/covariate_preprocessing/snuc_pseudo_bulk.OPC.mega.normalized.log2cpm.rosmap_cov.ROSMAP_NIA_WGS.leftnorm.bcftools_qc.plink_qc.snuc_pseudo_bulk_mega.related.plink_qc.extracted.pca.projected.Marchenko_PC.gz \
$PATH/phenofile/Exc/analysis_ready/covariate_preprocessing/snuc_pseudo_bulk.Exc.mega.normalized.log2cpm.rosmap_cov.ROSMAP_NIA_WGS.leftnorm.bcftools_qc.plink_qc.snuc_pseudo_bulk_mega.related.plink_qc.extracted.pca.projected.Marchenko_PC.gz \
$PATH/phenofile/Inh/analysis_ready/covariate_preprocessing/snuc_pseudo_bulk.Inh.mega.normalized.log2cpm.rosmap_cov.ROSMAP_NIA_WGS.leftnorm.bcftools_qc.plink_qc.snuc_pseudo_bulk_mega.related.plink_qc.extracted.pca.projected.Marchenko_PC.gz \
--customized-association-windows $PATH/windows/TADB_enhanced_cis.coding.bed \
--region-name ENSG00000073921 --save_data --no-skip-twas-weights \
--phenotype-names Mic_mega_eQTL Ast_mega_eQTL Oli_mega_eQTL OPC_mega_eQTL Exc_mega_eQTL Inh_mega_eQTL \
--mixture_prior /data/analysis_result/mash/mixture_prior.EZ.prior.rds \
--max_cv_variants 5000 \
--ld_reference_meta_file $PATH/ldref/ld_meta_file.tsv
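The per-cell-type arguments in the command above differ only in the cell-type label, so they can be assembled in a loop. The sketch below is one way to do that, assuming the path layout shown in the command; `data_dir`, `pheno_path`, `pheno_files`, and `pheno_names` are hypothetical names. Note that `$PATH` in this protocol is a placeholder for a data directory; in a real shell session `PATH` is the executable search path and should not be overwritten, which is why the sketch uses `data_dir` instead.

```shell
# data_dir stands in for the $PATH placeholder used in the command above.
data_dir="${data_dir:-/path/to/data}"
contexts="Mic Ast Oli OPC Exc Inh"

# Hypothetical helper: region-list path for one cell type, following the layout above.
pheno_path() {
    printf '%s/phenofile/%s/analysis_ready/phenotype_preprocessing/snuc_pseudo_bulk.%s.mega.normalized.log2cpm.region_list.txt' "$1" "$2" "$2"
}

pheno_files=""
pheno_names=""
for ctx in $contexts; do
    # Append with a single separating space (paths here contain no whitespace).
    pheno_files="${pheno_files:+$pheno_files }$(pheno_path "$data_dir" "$ctx")"
    pheno_names="${pheno_names:+$pheno_names }${ctx}_mega_eQTL"
done

# Print the assembled arguments for inspection before using them.
echo "--phenoFile $pheno_files"
echo "--phenotype-names $pheno_names"
```

The variables could then be passed unquoted (so that they split into separate arguments) as `--phenoFile $pheno_files --phenotype-names $pheno_names`. The same pattern would work for the `--covFile` list.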
Anticipated Results
For each gene, multivariate fine-mapping produces a file containing results for the top hits and a file containing TWAS weights from the mr.mash and mvSuSiE methods.
ROSMAP_mega_eQTL.chr11_ENSG00000073921.multivariate_bvrs.rds
: for each gene of interest, this file contains:
mrmash_fitted
reweighted_mixture_prior
reweighted_mixture_prior_cv
mvsusie_fitted
variant_names
analysis_script
other_quantities
context_names
top_loci
susie_result_trimmed
total_time_elapsed
region_info
ROSMAP_mega_eQTL.chr11_ENSG00000073921.multivariate_data.rds
: (produced by the --save_data argument)
See the pecotmr code for a description.
ROSMAP_mega_eQTL.chr11_ENSG00000073921.multivariate_twas_weights.rds
: for each gene of interest and phenotype, this file contains:
twas_weights - weights for the mrmash and mvsusie methods
twas_predictions - TWAS predictions for the mrmash and mvsusie methods
variant_names
twas_cv_result
total_time_elapsed
region_info
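After a run, it can be useful to confirm that the expected files are present. The following sketch assumes the naming pattern `<name>.<chr>_<region>.multivariate_<kind>.rds`, which is inferred from the single example above and may not hold for every configuration; the helper names are hypothetical. The `multivariate_data` file is only expected when `--save_data` was given.

```shell
# Hypothetical helper: list the output file names expected for one region,
# assuming the pattern <name>.<chr>_<region>.multivariate_<kind>.rds.
expected_outputs() {
    for kind in bvrs data twas_weights; do
        printf '%s.%s_%s.multivariate_%s.rds\n' "$1" "$2" "$3" "$kind"
    done
}

# Hypothetical helper: report which expected files are absent from directory $1.
missing_outputs() {
    out_dir="$1"
    shift
    expected_outputs "$@" | while IFS= read -r f; do
        [ -e "$out_dir/$f" ] || printf 'missing: %s\n' "$f"
    done
}

# Example call; prints one line per file not found in ./output.
missing_outputs "./output" ROSMAP_mega_eQTL chr11 ENSG00000073921
```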