Univariate Fine-Mapping of Functional (Epigenomic) Data with fSuSiE

Description

Univariate fine-mapping of functional (epigenomic) data is performed with fSuSiE. The procedure mirrors standard univariate fine-mapping; the main difference is the use of epigenomic data.

Input

--genoFile: path to a text file listing the genotype files, with an id and a path on each row. For example:

#id     #path
21      $PATH/protocol_example.genotype.chr21_22.21.bed
22      $PATH/protocol_example.genotype.chr21_22.22.bed
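
Before launching a run, it can help to confirm that every path in the genotype list actually exists. The following is a minimal sketch, assuming the list is whitespace-delimited with the path in the second column and that paths contain no spaces. The scratch directory and the demo file names are hypothetical stand-ins created only for the demonstration.

```shell
# Hypothetical demo data: a scratch directory with one genotype file present
# and one deliberately missing.
workdir=$(mktemp -d)
touch "$workdir/demo.21.bed"
geno_list="$workdir/genotype_files.txt"
printf '#id\t#path\n21\t%s\n22\t%s\n' \
    "$workdir/demo.21.bed" "$workdir/demo.22.bed" > "$geno_list"

# Read the path column of every non-header row and collect paths that do not exist.
missing=$(awk '!/^#/ && NF >= 2 {print $2}' "$geno_list" |
    while IFS= read -r path; do
        [ -e "$path" ] || printf '%s\n' "$path"
    done)

echo "Missing genotype files: ${missing:-none}"
```

In real use, point geno_list at the file passed to --genoFile and skip the demo setup lines.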

--phenoFile: a tab-delimited file containing chr, start, end, ID, and path for each region. For example:

#chr    start   end     ID      path
chr21   0       14120807        TADB_1297       $PATH/protocol_example.ha.bed.gz
chr21   10840000        16880069        TADB_1298       $PATH/protocol_example.ha.bed.gz

--covFile: path to a gzipped file with covariates in the rows and sample IDs in the columns.

--customized-association-windows: a tab-delimited file containing chr, start, end, and ID for each region. For example:

#chr    start   end     ID
chr21   0       14120807        TADB_1297
chr21   10840000        16880069        TADB_1298
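
In the examples shown here, the association windows file has the same first four columns as the phenotype list. When that holds, the windows file can be derived by dropping the path column. This is a minimal sketch, assuming the windows are meant to coincide with the phenotype regions (they need not); the scratch file names are hypothetical.

```shell
# Hypothetical scratch files; the two data rows are copied from the phenoFile example.
workdir=$(mktemp -d)
pheno_list="$workdir/phenotype_by_region.txt"
windows="$workdir/regions.txt"

{
    printf '#chr\tstart\tend\tID\tpath\n'
    printf 'chr21\t0\t14120807\tTADB_1297\t%s\n' '$PATH/protocol_example.ha.bed.gz'
    printf 'chr21\t10840000\t16880069\tTADB_1298\t%s\n' '$PATH/protocol_example.ha.bed.gz'
} > "$pheno_list"

# Keep chr, start, end, and ID (columns 1-4); drop the path column.
cut -f1-4 "$pheno_list" > "$windows"
cat "$windows"
```

cut splits on tabs by default, which matches the tab-delimited layout described for both files.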

--region-name: to analyze a single region, provide its ID as listed in the customized-association-windows file.
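
Because the region ID must match an entry in the windows file, a quick lookup before submitting the job can catch typos. The sketch below assumes a tab-delimited windows file with the ID in the fourth column, as in the example above; the scratch file is a hypothetical copy made only for illustration.

```shell
# Hypothetical scratch copy of the windows file shown above.
windows=$(mktemp)
printf '#chr\tstart\tend\tID\n' > "$windows"
printf 'chr21\t0\t14120807\tTADB_1297\n' >> "$windows"
printf 'chr21\t10840000\t16880069\tTADB_1298\n' >> "$windows"

region_name="TADB_1298"

# Count non-header rows whose fourth column equals the requested ID.
matches=$(awk -F'\t' -v id="$region_name" \
    '!/^#/ && $4 == id {n++} END {print n + 0}' "$windows")

if [ "$matches" -eq 1 ]; then
    echo "Region $region_name found"
else
    echo "Region $region_name matched $matches rows; expected exactly 1" >&2
fi
```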

Minimal Working Example Steps

iii. Run the Fine-Mapping with fSuSiE

sos run $PATH/mnm_regression.ipynb fsusie \
    --cwd $PATH/fsusie_test/ \
    --name protocol_example_methylation \
    --genoFile $PATH/mwe_data/protocol_data/output/genotype_by_chrom/protocol_example.genotype.chr21_22.genotype_by_chrom_files.txt \
    --phenoFile $PATH/fsusie_test/protocol_example.ha.phenotype_by_region_files.corrected.reformat.txt \
    --covFile $PATH/mwe_data/protocol_data/output/covariate/protocol_example.protein.protocol_example.samples.protocol_example.genotype.chr21_22.pQTL.plink_qc.prune.pca.Marchenko_PC.gz \
    --container oras://ghcr.io/cumc/pecotmr_apptainer:latest \
    --walltime 2h \
    --numThreads 8 \
    --customized-association-windows $PATH/fsusie_test/regions.reformat.txt \
    -c ../scripts/csg.yml -q neurology \
    --save-data \
    --region-name TADB_1298

Anticipated Results

Univariate fine-mapping of functional data produces two files: one containing results for the top hits and one containing residuals from SuSiE.

protocol_example_methylation.chr21_10840000_16880069.fsusie_mixture_normal_top_pc_weights.rds:

  • For each region of interest, this file contains:

    1. susie_on_top_pc - ?

    2. twas_weights - for each variant (for enet, lasso and mrash methods). no susie?

    3. twas predictions - for each sample (for enet, lasso, mrash methods)

    4. twas cross validation results - information on the best-performing method; the data are split into five parts

    5. fsusie results - ?

    6. Y coordinates - ?

    7. fsusie summary - ?

    8. total time elapsed

    9. region info - information on the region specified

protocol_example_methylation.chr21_10840000_16880069.16_marks.dataset.rds:

  • For each gene of interest, this file contains residuals for each sample and phenotype.

  • See the pecotmr code for a description. fSuSiE uses the load_regional_functional_data function; an explanation of its arguments can be found under the similar load_regional_association_data function.
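
Both output file names above appear to share a prefix of the form name.chr_start_end, built from --name and the region's coordinates. This pattern is inferred only from the two example file names, not from documented behavior, so treat it as an assumption. Under that assumption, the prefix for a region can be reconstructed from the windows file (a hypothetical scratch copy is used here):

```shell
# Hypothetical scratch copy of the windows file from the Input section.
windows=$(mktemp)
printf '#chr\tstart\tend\tID\n' > "$windows"
printf 'chr21\t0\t14120807\tTADB_1297\n' >> "$windows"
printf 'chr21\t10840000\t16880069\tTADB_1298\n' >> "$windows"

name="protocol_example_methylation"   # value passed to --name
region_name="TADB_1298"               # value passed to --region-name

# Assumed naming pattern: <name>.<chr>_<start>_<end>
prefix=$(awk -F'\t' -v id="$region_name" -v name="$name" \
    '!/^#/ && $4 == id {printf "%s.%s_%s_%s\n", name, $1, $2, $3}' "$windows")

echo "$prefix"
```

Listing files that begin with this prefix in the directory given to --cwd is then a simple way to check whether a region's outputs were written.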