Univariate Fine-Mapping of Functional (Epigenomic) Data with fSuSiE#
Univariate fine-mapping for functional (epigenomic) data is conducted with fSuSiE. The workflow is similar to standard univariate fine-mapping; the main difference is the use of epigenomic data.
--genoFile: path to a text file containing information on genotype files. For example:
#id #path
21 $PATH/protocol_example.genotype.chr21_22.21.bed
22 $PATH/protocol_example.genotype.chr21_22.22.bed
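Before launching the workflow, it can help to confirm that every path in the genotype list file exists. The following is a minimal sketch and not part of the pipeline: the function name check_geno_list is hypothetical, and it assumes a whitespace-delimited file with two columns (id, path) and a header line starting with #, as in the example above.

```shell
# Hypothetical helper (not part of the pipeline): report listed genotype files that are missing.
# Assumes two whitespace-delimited columns (id, path) and a header line starting with '#'.
check_geno_list() {
    list_file=$1
    missing=0
    # The "|| [ -n ... ]" clause also handles a final line without a trailing newline.
    while read -r id path || [ -n "$id" ]; do
        # Skip header/comment lines and blank lines.
        case $id in
            '#'*|'') continue ;;
        esac
        if [ ! -f "$path" ]; then
            echo "missing genotype file for id $id: $path" >&2
            missing=$((missing + 1))
        fi
    done < "$list_file"
    # Succeed only if no listed file was missing.
    [ "$missing" -eq 0 ]
}
```

The function returns a non-zero status when at least one listed file is absent, so it can be used as a guard in front of the sos run command.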
--phenoFile: a tab-delimited file containing chr, start, end, ID and path for the regions. For example:
#chr start end ID path
chr21 0 14120807 TADB_1297 $PATH/protocol_example.ha.bed.gz
chr21 10840000 16880069 TADB_1298 $PATH/protocol_example.ha.bed.gz
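A quick structural check of the region list can catch formatting problems early. This is a sketch under stated assumptions rather than anything the pipeline provides: the function name check_region_list is hypothetical, the file is assumed to be tab-delimited with the five columns shown above and a header line starting with #, and the start-before-end check is a generic interval sanity check, not a documented pipeline requirement.

```shell
# Hypothetical helper (not part of the pipeline): sanity-check a region list.
# Assumes tab-delimited columns chr, start, end, ID, path and a header line starting with '#'.
check_region_list() {
    awk -F '\t' '
        /^#/ || NF == 0 { next }   # skip header and blank lines
        NF != 5 {
            printf "line %d: expected 5 columns, found %d\n", NR, NF
            bad = 1
            next
        }
        $2 + 0 >= $3 + 0 {         # compare start and end numerically
            printf "line %d: start %s is not less than end %s\n", NR, $2, $3
            bad = 1
        }
        END { exit bad }           # exit status 1 if any line failed a check
    ' "$1"
}
```

Problems are printed one per line, and the exit status is non-zero if any line fails a check.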
--covFile: path to a gzipped file containing covariates in the rows and sample IDs in the columns.
--customized-association-windows: a tab-delimited file containing chr, start, end, and ID for the regions. For example:
#chr start end ID
chr21 0 14120807 TADB_1297
chr21 10840000 16880069 TADB_1298
--region-name: if you only wish to analyze one region, provide the ID of a region listed in the customized-association-windows file.
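Because the single-region option only makes sense for an ID that appears in the customized-association-windows file, a small lookup can confirm this before a job is submitted. The sketch below is illustrative only: region_in_windows is a hypothetical name, and the file is assumed to be tab-delimited with the ID in the fourth column and a header line starting with #.

```shell
# Hypothetical helper (not part of the pipeline): succeed if region ID $2 appears
# in the fourth column of the windows file $1.
# Assumes tab-delimited columns chr, start, end, ID and a header line starting with '#'.
region_in_windows() {
    awk -F '\t' -v id="$2" '
        /^#/ { next }                  # skip the header line
        $4 == id { found = 1; exit }   # exact match on the ID column; jumps to END
        END { exit !found }            # exit status 0 only if the ID was found
    ' "$1"
}

# Example use (file name and ID taken from the command in this protocol):
#   region_in_windows regions.reformat.txt TADB_1298 || echo "region ID not found" >&2
```

The match is exact, so a partial ID such as TADB_129 does not match TADB_1298.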
Minimal Working Example Steps#
iii. Run the Fine-Mapping with fSuSiE#
sos run $PATH/mnm_regression.ipynb fsusie \
--cwd $PATH/fsusie_test/ \
--name protocol_example_methylation \
--genoFile $PATH/mwe_data/protocol_data/output/genotype_by_chrom/protocol_example.genotype.chr21_22.genotype_by_chrom_files.txt \
--phenoFile $PATH/fsusie_test/protocol_example.ha.phenotype_by_region_files.corrected.reformat.txt \
--covFile $PATH/mwe_data/protocol_data/output/covariate/protocol_example.protein.protocol_example.samples.protocol_example.genotype.chr21_22.pQTL.plink_qc.prune.pca.Marchenko_PC.gz \
--container oras://ghcr.io/cumc/pecotmr_apptainer:latest \
--walltime 2h \
--numThreads 8 \
--customized-association-windows $PATH/fsusie_test/regions.reformat.txt \
-c ../scripts/csg.yml -q neurology \
--save-data \
--region-name TADB_1298
Anticipated Results#
Univariate fine-mapping for functional data will produce two files: one containing results for the top hits and one containing residuals from SuSiE.
For each region of interest, the results file contains:
susie_on_top_pc - ?
twas_weights - for each variant (for enet, lasso and mrash methods). no susie?
twas predictions - for each sample (for enet, lasso, mrash methods)
twas cross validation results - information on the best method; the data is split into five parts for cross-validation
fsusie results - ?
Y coordinates - ?
fsusie summary - ?
total time elapsed
region info - information on the region specified
For each gene of interest, the residuals file contains residuals for each sample and phenotype.
See the pecotmr code for a description.
The fsusie workflow uses a function whose arguments are explained under the similar load_regional_association_data function.