Commit e68898d4 authored by Kevin Kunzmann, committed by GitHub

Update README.md

parent 93f95db1
# Impute gene expression data for CENTER-TBI using PrediXcan
# Impute gene expression for CENTER-TBI with PrediXcan
The singularity container is available for download under https://doi.org/10.5281/zenodo.3376504.
Data currently needs to be accessed manually due to access restrictions; this workflow should work for essentially any
vcf.gz file with dosage (DS) information.
More information on PrediXcan can be found here https://github.com/hakyimlab/PrediXcan and here in the publication:
The singularity container with most software dependencies is available at
https://doi.org/10.5281/zenodo.3376504.
Data currently needs to be accessed manually due to access restrictions.
This workflow is designed for *.vcf.gz files with dosage (DS) information.
More information on PrediXcan can be found here https://github.com/hakyimlab/PrediXcan and in:
> Gamazon ER†, Wheeler HE†, Shah KP†, Mozaffari SV, Aquino-Michaels K, Carroll RJ, Eyler AE, Denny JC,
Nicolae DL, Cox NJ, Im HK. (2015) A gene-based association method for mapping traits using reference transcriptome data.
Nat Genet. doi:10.1038/ng.3367.
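Because the workflow expects dosage (DS) information, it may help to check an input file before starting. The following is a minimal sketch, not part of the repository: `has_dosage_field` is a hypothetical helper name, and it assumes the file declares DS with a standard `##FORMAT=<ID=DS,...>` header line.

```shell
# Hypothetical helper (not part of the repository): succeeds if a gzip- or
# bgzip-compressed VCF declares a DS FORMAT field in its meta-information header.
has_dosage_field() {
    # Scan only the '##' header lines and stop at the first other line,
    # so large files are not decompressed in full.
    gzip -dc "$1" | awk '
        /^##FORMAT=<ID=DS,/ { found = 1 }
        !/^##/              { exit }
        END                 { exit !found }
    '
}

# Example call (the path is a placeholder):
#   has_dosage_field my_data.vcf.gz && echo "DS field declared"
```

A header declaration does not guarantee that every record carries a DS value, so this is only a quick sanity check.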
We use snakemake to organize the workflow (also pre-installed in the container) and support cluster execution.
> Johannes Köster, Sven Rahmann, Snakemake—a scalable bioinformatics workflow engine, Bioinformatics,
Volume 28, Issue 19, 1 October 2012, Pages 2520–2522, https://doi.org/10.1093/bioinformatics/bts480
## Dependencies
1. linux shell (`bash`), possibly via virtual machine on Windows/Mac
2. `wget` (pre-installed or via distribution package manager)
3. `singularity` container software (tested on 3.3.0) https://sylabs.io/guides/3.3/user-guide
4. `git`
3. `singularity` container software (tested on 3.3.0, https://sylabs.io/guides/3.3/user-guide)
4. `git` (https://git-scm.com/book/en/v2/Getting-Started-Installing-Git)
### Optional
5. python 3.7+ and snakemake
6. slurm cluster
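A quick way to see which of the listed tools are missing is sketched below. `missing_tools` is a hypothetical helper, not something shipped with the workflow; it only checks that each name resolves on `PATH` and does not verify versions.

```shell
# Hypothetical helper (not part of the repository): print the name of every
# listed tool that cannot be found on PATH; prints nothing if all are present.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}
```

Running `missing_tools bash wget singularity git` covers the required dependencies; `snakemake` can be added to the argument list to check the optional one as well.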
We use snakemake to organize the workflow (also pre-installed in the container) and support cluster execution.
Snakemake is available as a `pip` package for python 3.7.
> Johannes Köster, Sven Rahmann, Snakemake—a scalable bioinformatics workflow engine, Bioinformatics,
Volume 28, Issue 19, 1 October 2012, Pages 2520–2522, https://doi.org/10.1093/bioinformatics/bts480
## Execution
Download and extract the contents of this repository (might be access restricted)
@@ -46,4 +48,9 @@ Optionally, if snakemake is installed, the workflow can be run in parallel via
snakemake --use-singularity -j 8 impute
where '8' can be replaced by the number of available cores.
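Rather than hard-coding the core count, it can be looked up at run time. This sketch assumes `nproc` (GNU coreutils) is available and falls back to a single core otherwise; it prints the resulting command instead of running it.

```shell
# Number of cores available to this process; fall back to 1 if nproc is missing.
cores=$(nproc 2>/dev/null || echo 1)

# Print the command so it can be inspected before it is run.
echo "snakemake --use-singularity -j ${cores} impute"
```

On a shared machine it may be preferable to pass a smaller number than `nproc` reports.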
Cluster execution is enabled via the `scripts/slurm_snakemake.sh` script as
bash scripts/slurm_snakemake.sh impute