Univariate Fine-Mapping of Functional (Epigenomic) Data with fSuSiE
Description
Univariate fine-mapping for functional (epigenomic) data is conducted with fSuSiE. It is similar to standard univariate fine-mapping; the main difference is the use of epigenomic data.
Input
--genoFile
: path to a text file containing information on genotype files. For example:
#id #path
21 $PATH/protocol_example.genotype.chr21_22.21.bed
22 $PATH/protocol_example.genotype.chr21_22.22.bed
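A list like the one above can be written with a short loop. This is only a minimal sketch: it assumes the list is tab-delimited and that the per-chromosome files follow the naming shown in the example. `GENO_DIR` and the output file name are hypothetical placeholders, not names used by the pipeline.

```shell
# Hypothetical locations; adjust to your own layout.
GENO_DIR="${GENO_DIR:-/path/to/genotype_by_chrom}"
geno_list="genotype_by_chrom_files.txt"

# Header line, then one tab-separated row per chromosome (assumed delimiter).
printf '#id\t#path\n' > "$geno_list"
for chrom in 21 22; do
  printf '%s\t%s\n' "$chrom" \
    "$GENO_DIR/protocol_example.genotype.chr21_22.$chrom.bed" >> "$geno_list"
done

# Show the result for a quick visual check.
cat "$geno_list"
```

The resulting file is what would be passed to `--genoFile`.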
--phenoFile
: a tab-delimited file containing chr, start, end, ID, and path for the regions. For example:
#chr start end ID path
chr21 0 14120807 TADB_1297 $PATH/protocol_example.ha.bed.gz
chr21 10840000 16880069 TADB_1298 $PATH/protocol_example.ha.bed.gz
--covFile
: path to a gzipped file with covariates in the rows and sample IDs in the columns.
--customized-association-windows
: a tab-delimited file containing chr, start, end, and ID for the regions. For example:
#chr start end ID
chr21 0 14120807 TADB_1297
chr21 10840000 16880069 TADB_1298
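In the examples on this page the association windows have the same coordinates and IDs as the regions in the phenoFile. When that is the case, the windows file can be derived by keeping the first four columns of the phenoFile. The sketch below assumes both files are tab-delimited; the file names and the `/path/to` prefix are hypothetical, and whether your windows should coincide with the phenotype regions depends on your analysis.

```shell
# Hypothetical file names used only for this illustration.
pheno_list="phenotype_by_region_files.txt"
regions="regions.txt"

# Recreate the two example phenoFile rows (tab-delimited).
printf '#chr\tstart\tend\tID\tpath\n' > "$pheno_list"
printf 'chr21\t0\t14120807\tTADB_1297\t/path/to/protocol_example.ha.bed.gz\n' >> "$pheno_list"
printf 'chr21\t10840000\t16880069\tTADB_1298\t/path/to/protocol_example.ha.bed.gz\n' >> "$pheno_list"

# Keep only chr, start, end, and ID; the header line is carried over as well.
cut -f1-4 "$pheno_list" > "$regions"

cat "$regions"
```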
--region-name
: if you only wish to analyze one region, provide the ID of a region listed in the customized-association-windows file
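Before launching a run, it can help to confirm that the chosen ID actually appears in the windows file. The check below is a sketch, not part of the pipeline: it assumes a tab-delimited file with the ID in the fourth column, and the file name is hypothetical.

```shell
# Hypothetical file name; the rows mirror the example windows file.
regions="regions_check.txt"
region_name="TADB_1298"

printf '#chr\tstart\tend\tID\n' > "$regions"
printf 'chr21\t0\t14120807\tTADB_1297\n' >> "$regions"
printf 'chr21\t10840000\t16880069\tTADB_1298\n' >> "$regions"

# Count non-header rows whose fourth column equals the requested ID exactly.
n_match="$(awk -F'\t' -v id="$region_name" '!/^#/ && $4 == id' "$regions" | wc -l | tr -d ' ')"

if [ "$n_match" -eq 1 ]; then
  echo "OK: $region_name found in $regions"
else
  echo "ERROR: $region_name matched $n_match rows in $regions" >&2
fi
```

An exact comparison on the ID column avoids accidental partial matches (for example, `TADB_129` matching `TADB_1298`) that a plain `grep` would allow.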
Minimal Working Example Steps
iii. Run the Fine-Mapping with fSuSiE
sos run $PATH/mnm_regression.ipynb fsusie \
--cwd $PATH/fsusie_test/ \
--name protocol_example_methylation \
--genoFile $PATH/mwe_data/protocol_data/output/genotype_by_chrom/protocol_example.genotype.chr21_22.genotype_by_chrom_files.txt \
--phenoFile $PATH/fsusie_test/protocol_example.ha.phenotype_by_region_files.corrected.reformat.txt \
--covFile $PATH/mwe_data/protocol_data/output/covariate/protocol_example.protein.protocol_example.samples.protocol_example.genotype.chr21_22.pQTL.plink_qc.prune.pca.Marchenko_PC.gz \
--container oras://ghcr.io/cumc/pecotmr_apptainer:latest \
--walltime 2h \
--numThreads 8 \
--customized-association-windows $PATH/fsusie_test/regions.reformat.txt \
-c ../scripts/csg.yml -q neurology \
--save-data \
--region-name TADB_1298
Anticipated Results
Univariate fine-mapping for functional data produces two files: one containing results for the top hits and one containing residuals from SuSiE.
protocol_example_methylation.chr21_10840000_16880069.fsusie_mixture_normal_top_pc_weights.rds
:
For each region of interest, this file contains:
susie_on_top_pc - ?
twas_weights - for each variant (for enet, lasso and mrash methods). no susie?
twas predictions - for each sample (for enet, lasso, mrash methods)
twas cross validation results - information on the best method; the data are split into five parts for cross-validation
fsusie results - ?
Y coordinates - ?
fsusie summary - ?
total time elapsed
region info - information on the region specified
protocol_example_methylation.chr21_10840000_16880069.16_marks.dataset.rds
:
For each gene of interest, contains residuals for each sample and phenotype
See the pecotmr code for a description. fSuSiE uses the load_regional_functional_data function; an explanation of its arguments can be found under the similar load_regional_association_data function.