# Alternative polyadenylation

This document shows how to use several modules to prepare reference data, call APA events, apply PEER correction, and impute missing values. It relies on two notebooks:

* apa_calling.ipynb

* PEER_factor.ipynb

A minimal working example is available on [Google Drive](https://drive.google.com/drive/u/0/folders/1H6PfQjU44Sk0NEz_l_qzsP0Bqai2jv2_).

## Prerequisite

Make sure you have the comprehensive gene annotation file for the reference chromosomes in GTF format (the examples below use `gencode.v39.annotation.gtf`).
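Before starting, it can help to confirm that the annotation file is readable and well-formed. The sketch below is one possible check, not part of `apa_calling.ipynb`: `check_gtf` is a hypothetical helper, it assumes an uncompressed GTF, and the toy file stands in for the real annotation.

```shell
# Sketch only: check_gtf is a hypothetical helper, not part of the pipeline.
check_gtf() {
    gtf="$1"
    # The file must exist and be non-empty.
    [ -s "$gtf" ] || { echo "missing or empty: $gtf" >&2; return 1; }
    # Every non-comment GTF line should have exactly 9 tab-separated fields.
    bad=$(awk -F'\t' '!/^#/ && NF != 9 {n++} END {print n+0}' "$gtf")
    [ "$bad" -eq 0 ] || { echo "$bad malformed line(s) in $gtf" >&2; return 1; }
    echo "ok: $gtf"
}

# Toy input for illustration; replace with the path to your own annotation.
toy_gtf=$(mktemp)
printf 'chr1\tHAVANA\tgene\t11869\t14409\t.\t+\t.\tgene_id "ENSG00000223972.5";\n' > "$toy_gtf"
check_gtf "$toy_gtf"
```

The check only looks at the column count, so it catches truncated or space-delimited files but does not validate the attribute field.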

## Generate the 3'UTR region

In [None]:
sos run /mnt/mfs/statgen/ls3751/github/xqtl-protocol/code/molecular_phenotypes/calling/apa_calling.ipynb UTR_reference \
    --cwd /mnt/mfs/statgen/ls3751/MWE_dapars2/Output \
    --hg_gtf /mnt/mfs/statgen/ls3751/MWE_dapars2/gencode.v39.annotation.gtf \
    --container /mnt/mfs/statgen/ls3751/container/dapars2_final.sif

## Convert BAM files into wig files and flagstat files

In [None]:
sos run /mnt/mfs/statgen/ls3751/github/xqtl-protocol/code/molecular_phenotypes/calling/apa_calling.ipynb bam2tools \
    --n 0 1 2 3 4 5 6 7 8 \
    --container /mnt/mfs/statgen/ls3751/container/dapars2_final.sif

## Compile config files

In [None]:
sos run /mnt/mfs/statgen/ls3751/github/xqtl-protocol/code/molecular_phenotypes/calling/apa_calling.ipynb APAconfig \
    --cwd /mnt/mfs/statgen/ls3751/rosmap/dlpfcTissue/batch0 \
    --bfile /mnt/mfs/statgen/ls3751/rosmap/dlpfcTissue/batch0 \
    --annotation /mnt/mfs/statgen/ls3751/MWE_dapars2/Output/gencode.v39.annotation_3UTR.bed \
    --container /mnt/mfs/statgen/ls3751/container/dapars2_final.sif
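The `--annotation` path above points into the `--cwd` of the `UTR_reference` step. Judging only from the example paths in this document, the file appears to be named after the GTF with a `_3UTR.bed` suffix. The sketch below rebuilds that path and warns if the file is absent. The naming rule is an assumption inferred from those paths, not a documented behavior.

```shell
# Paths copied from the commands in this document.
cwd=/mnt/mfs/statgen/ls3751/MWE_dapars2/Output
hg_gtf=/mnt/mfs/statgen/ls3751/MWE_dapars2/gencode.v39.annotation.gtf

# Assumption: the output is named <GTF basename>_3UTR.bed inside --cwd,
# as the example paths suggest; verify against your own run.
annotation="$cwd/$(basename "$hg_gtf" .gtf)_3UTR.bed"
echo "$annotation"

# Warn early if the file is absent or empty.
[ -s "$annotation" ] || echo "warning: $annotation not found; run UTR_reference first" >&2
```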

## Use DaPars2 to quantify APA events

In [None]:
sos run /mnt/mfs/statgen/ls3751/github/xqtl-protocol/code/molecular_phenotypes/calling/apa_calling.ipynb APAmain \
    --cwd /mnt/mfs/statgen/ls3751/rosmap/dlpfcTissue/batch0 \
    --chrlist chr21 chr14 chr1 \
    --container /mnt/mfs/statgen/ls3751/container/dapars2_final.sif
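Rather than typing the `--chrlist` values by hand, the list could be derived from the chromosomes present in the 3'UTR annotation. The sketch below assumes the annotation is a tab-delimited BED file with the chromosome in column 1. `list_chromosomes` is a hypothetical helper, and the toy file stands in for the real annotation.

```shell
# Sketch only: list_chromosomes is a hypothetical helper, not part of the pipeline.
list_chromosomes() {
    # Column 1 of a BED file is the chromosome; skip comment and header lines,
    # then join the unique names with single spaces.
    awk '!/^(#|track|browser)/ {print $1}' "$1" | sort -u | paste -sd' ' -
}

# Toy input for illustration; replace with the real 3'UTR BED file.
toy_bed=$(mktemp)
printf 'chr21\t100\t200\tgeneA\nchr1\t50\t80\tgeneB\nchr21\t300\t400\tgeneC\n' > "$toy_bed"

chrlist=$(list_chromosomes "$toy_bed")
echo "--chrlist $chrlist"
```

Leave `$chrlist` unquoted when passing it to `sos run` so that each chromosome becomes a separate argument.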

## Use PEER to estimate confounders

In [None]:
sos run PEER_factor.ipynb PEER \
    --cwd output \
    --phenoFile AC.mol_phe.annotated.bed.gz \
    --covFile output/AC.APEX.pca.cov.gz \
    --name demo -N 3 \
    --container PEER.sif

## Impute missing values and run quality checks

In [None]:
sos run /mnt/mfs/statgen/ls3751/github/xqtl-protocol/pipeline/molecular_phenotypes/calling/apa_calling.ipynb APAimpute \
    --cwd /mnt/mfs/statgen/ls3751/MWE_dapars2/Output \
    --cov /data/example.cov.txt \
    --chrlist chr1 \
    --container /mnt/mfs/statgen/ls3751/container/dapars2.sif