GOSe-6mo-imputation-paper
Commit 37cd8615 authored Mar 18, 2019 by Kevin
parent a5ac5123
Showing 8 changed files with 75 additions and 95 deletions
.gitignore +2 -1
README.md +58 -41
Snakefile +5 -21
docker/dockerfile +1 -2
download.R +0 -6
reports/imputations.Rmd +3 -11
reports/model_assessment.Rmd +2 -8
reports/prepare_data.Rmd +4 -5
.gitignore
data
output
.snakemake
.Rproj.user
...
README.md
 # CENTER-TBI six-months GOSe-Outcome Imputation
-# Prerequisites
-We assume a Unix command line workflow. The following software is required to take advantage of the pre-defined workflow:
-* curl for downloading the data (in case you do not have curl installed, it is also available from within the container)
-* [python](https://www.python.org/download/releases/3.5.1/) 3.5.1 (higher versions might work as well)
-* [snakemake](https://snakemake.readthedocs.io/en/stable/getting_started/installation.html) version 5.2.1 (higher versions will work as well)
-* [singularity](https://www.sylabs.io/guides/2.6/user-guide/index.html) 2.6.0 (higher versions might work as well)
-+ CENTER-TBI account and API key, store as NEUROBOT_USR and NEUROBOT_API
-environment variables.
-The entire analysis is containerized using a [docker container](https://cloud.docker.com/u/kkmann/repository/docker/kkmann/gose-6mo-imputation).
-The container can either be used to execute scripts individually inside the container, or it can be used to run the entire
-pre-defined snakemake workflow using the container via singularity (recommended).
-A [script](https://github.com/kkmann/center-6mo-gose-imputation/blob/master/snakemake_slurm) for running the entire analysis
-on a slurm cluster is provided.
-Make sure to adjust the parameters in the [cluster configuration file](https://github.com/kkmann/center-6mo-gose-imputation/blob/master/cluster.json) accordingly.
-A [script](https://github.com/kkmann/center-6mo-gose-imputation/blob/master/snakemake) for execution on a single desktop
-machine is provided as well. Depending on the number of cores and available RAM, the cross-validated model comparison may take several
-days (3+) to complete.
-The script can be invoked via
-```
-./snakemake_slurm [target]
-```
-where `[target]` is the build target (e.g. 'data_report_v1_1').
+This repository contains the entire source code to reproduce the imputation for
+the six-months GOSe in CENTER-TBI.
-# Executing the workflow
-The available rules can be listed by invoking
-```
-snakemake -lt
-```
-To reproduce the data extraction and population description on version v1.1 of the neurobot CENTER-TBI data, invoke
-```
-./snakemake_slurm data_report_v1_1
-```
+## Prerequisites
+### Data Access
-To reproduce the cross-validated model comparison on version v1.1 of the neurobot CENTER-TBI data, invoke
-```
-./snakemake_slurm cv_model_comparison_report_v1_1
-```
+To reproduce the analysis, access to the CENTER-TBI 'Neurobot' database at
+https://center-tbi.incf.org and a personal access token to the curl API
+is required.
+For information on how to get data access, see https://www.center-tbi.eu/data.
+### Software dependencies
+The workflow assumes a linux command line.
+To facilitate reproducibility, a [docker container](https://cloud.docker.com/u/kkmann/repository/docker/kkmann/gose-6mo-imputation) with all software dependencies
+(R packages etc.) is provided [here]().
+The workflow itself is automated using [snakemake]() 5.2.1.
+To fully leverage the container and snakemake workflow, the following software
+dependencies must be available:
+* [python](https://www.python.org/download/releases/3.5.1/) 3.5.1 (higher versions might work as well)
+* [snakemake](https://snakemake.readthedocs.io/en/stable/getting_started/installation.html) version 5.2.1 (higher versions will work as well)
+* [singularity](https://www.sylabs.io/guides/2.6/user-guide/index.html) 2.6.0 (higher versions might work as well)
+## How-To
+The download script requires the neurobot user name and the personal API key
+to be stored in the environment variables `NEUROBOT_USR` and `NEUROBOT_API`,
+respectively, i.e.
+```bash
+export NEUROBOT_USR=[my-neurobot-username]
+export NEUROBOT_API=[my-neurobot-api-key]
+```
-To reproduce the MSM model-based imputation for v1.1 of the neurobot CENTER-TBI data, invoke
-```
-./snakemake_slurm cv_model_comparison_report_v1_1
-```
+### Execute Workflow on Desktop
+The workflow can be executed on a powerful desktop machine, although cluster
+execution is recommended (cf. below).
+```bash
+./singularity manuscript_v1_1
+./singularity impute_msm_v1_1
+```
+All output is written to `output/`.
+Depending on the number of cores and available RAM,
+the cross-validated model comparison may take several days (3+) to complete.
+### Execute Workflow on Cluster
+Cluster execution requires slightly more [configuration](https://github.com/kkmann/center-6mo-gose-imputation/blob/master/cluster.json)
+and assumes existence of a slurm cluster.
+Simply modify the `cluster.json` accordingly and execute
+```bash
+./singularity_slurm manuscript_v1_1
+./singularity_slurm impute_msm_v1_1
+```
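The README's How-To asks for the Neurobot credentials to be exported as `NEUROBOT_USR` and `NEUROBOT_API` before the workflow is started. A minimal Python sketch of a pre-flight check for this is given below. The helper name `missing_credentials` is hypothetical and not part of the repository; only the two variable names come from the README.

```python
import os

# Environment variables named in the README How-To section.
REQUIRED_VARS = ("NEUROBOT_USR", "NEUROBOT_API")


def missing_credentials(env=None):
    """Return the required variable names that are unset or empty.

    `env` defaults to the current process environment; passing a plain
    dict instead makes the check easy to test.
    (Hypothetical helper for illustration, not part of the repository.)
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]
```

Calling `missing_credentials()` before invoking the workflow and aborting when the returned list is non-empty would surface a missing export early, rather than as a failed download inside the container.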
Snakefile
-singularity: "docker://kkmann/gose-6mo-imputation@sha256:42e72ea0ccaa938b50aea85b7ac3b5d5f8efada79f0b7e2411b1a70a2e037801"
+singularity: "docker://kkmann/gose-6mo-imputation@sha256:62540d4bc41b228639bce7e4fe764acfaaeef76e467b92d9c55b26f8ea4f4c5f"
 configfile: "config.yml"
 rule download_data:
     output:
         "data/{version}/df_baseline.rds",
@@ -20,8 +18,6 @@ rule download_data:
 rule prepare_data:
     input:
         rules.download_data.output,
@@ -29,7 +25,7 @@ rule prepare_data:
     output:
         "output/{version}/data/df_gose.rds",
         "output/{version}/data/df_baseline.rds",
-        "output/{version}/prepare_data.pdf",
+        "output/{version}/prepare_data.html",
         figures = "output/{version}/prepare_data_figures.zip"
     shell:
         """
@@ -41,8 +37,6 @@ rule prepare_data:
 rule impute_baseline:
     input:
         rules.prepare_data.output
@@ -56,8 +50,6 @@ rule impute_baseline:
 rule generate_validation_data:
     input:
         rules.prepare_data.output,
@@ -76,8 +68,6 @@ rule generate_validation_data:
 # adjust threads by model type
 def get_rule_threads(wildcards):
     if wildcards.model in ("locf", "msm"):
@@ -101,7 +91,6 @@ rule fit_model_validation_set:
 # helper rule to just build all posterior datasets
 rule model_posteriors:
     input:
@@ -113,9 +102,6 @@ rule model_posteriors:
 # rules for imputing on entire dataset
 rule generate_imputation_data:
     input:
@@ -133,8 +119,6 @@ rule generate_imputation_data:
 rule model_impute:
     input:
         "config.yml",
@@ -172,20 +156,20 @@ rule imputation_report:
         rules.post_process_imputations.output,
         markdown = "reports/imputations.Rmd"
     output:
-        pdf = "output/{version}/gose_imputations_{model}.pdf",
+        html = "output/{version}/gose_imputations_{model}.html",
         figures = "output/{version}/gose_imputations_{model}_figures.zip"
     shell:
         """
         mkdir -p output/{wildcards.version}
         Rscript -e "rmarkdown::render(\\"{input.markdown}\\", params = list(data_dir = \\"../output/{wildcards.version}/data\\", imputations = \\"../output/v1.1/data/imputation/{wildcards.model}/df_gose_imputed.csv\\"))"
-        mv reports/imputations.pdf {output.pdf}
+        mv reports/imputations.html {output.html}
         mv reports/figures.zip {output.figures}
         """
 # define corresponding target rule for ease of use
 rule impute_msm_v1_1:
     input:
-        pdf = "output/v1.1/gose_imputations_msm.pdf",
+        html = "output/v1.1/gose_imputations_msm.html",
         figures = "output/v1.1/gose_imputations_msm_figures.zip"
 ...
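The target rule `impute_msm_v1_1` requests concrete files that match the wildcard patterns declared in `imputation_report`. The Python sketch below mimics that substitution with plain `str.format`. It only illustrates how `{version}` and `{model}` are filled in and is not Snakemake's actual resolution mechanism; the helper name `expand_outputs` is hypothetical.

```python
# Output patterns copied from the imputation_report rule in this commit.
OUTPUT_PATTERNS = {
    "html": "output/{version}/gose_imputations_{model}.html",
    "figures": "output/{version}/gose_imputations_{model}_figures.zip",
}


def expand_outputs(version, model):
    """Fill the {version} and {model} placeholders in every pattern.

    (Hypothetical helper for illustration, not part of the repository.)
    """
    return {
        name: pattern.format(version=version, model=model)
        for name, pattern in OUTPUT_PATTERNS.items()
    }


# The impute_msm_v1_1 target corresponds to version "v1.1" and model "msm".
targets = expand_outputs("v1.1", "msm")
```

Read in the other direction, this is why requesting `output/v1.1/gose_imputations_msm.html` is enough for the generic rule to be run with `version = "v1.1"` and `model = "msm"`.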
docker/dockerfile
@@ -6,7 +6,7 @@ MAINTAINER Kevin Kunzmann kevin.kunzmann@mrc-bsu.cam.ac.uk
 RUN sudo apt-get update
 # install prerequisites
-RUN sudo apt-get -y install libcurl4-openssl-dev
+RUN sudo apt-get -y install libcurl4-openssl-dev curl
 # install required R packages
 RUN R -e "install.packages('rstan')"
@@ -18,5 +18,4 @@ RUN R -e "install.packages('msm')"
 RUN R -e "install.packages('cowplot')"
 RUN R -e "install.packages('pander')"
 RUN R -e "install.packages('DiagrammeR')"
-RUN R -e "devtools::install_github('kkmann/reportr')"
 RUN R -e "devtools::install_github('kkmann/describr')"
download.R
deleted 100644 → 0
-#!/usr/bin bash
-curl \
-    --user $NEUROBOT_USR:$NEUROBOT_API \
-    --digest https://neurobot-stage.incf.org/api/data/_5c8a757252dc3879e3b7cc35.csv
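The removed script wraps a single digest-authenticated curl call that reads its credentials from the environment. As a rough illustration, the Python sketch below assembles the same argument list (`--user`, `--digest`) without running curl. The helper name `curl_args` is hypothetical and not part of the repository.

```python
import os


def curl_args(url, env=None):
    """Build the argument list for a digest-authenticated curl download.

    Mirrors the flags used by the removed download script (--user and
    --digest). This helper only builds the list; it does not run curl.
    (Hypothetical helper for illustration, not part of the repository.)
    """
    env = os.environ if env is None else env
    # Raises KeyError if either credential variable is missing.
    credentials = "{}:{}".format(env["NEUROBOT_USR"], env["NEUROBOT_API"])
    return ["curl", "--user", credentials, "--digest", url]
```

The resulting list could be handed to `subprocess.run(args, check=True)`. Passing the arguments as a list avoids shell quoting problems if a key contains special characters.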
reports/imputations.Rmd
 ---
-title: "Imputing GOSE scores in CENTER-TBI"
+title: "Imputing GOSE scores in CENTER-TBI, assessing final imputations"
-subtitle: "assessing final imputations"
 date: "`r Sys.time()`"
-statistician: "Kevin Kunzmann (kevin.kunzmann@mrc-bsu.cam.ac.uk)"
+author: "Kevin Kunzmann (kevin.kunzmann@mrc-bsu.cam.ac.uk)"
-collaborator: "David Menon (dkm13@cam.ac.uk)"
-output: reportr::report
-git-commit-hash: "`r system('git rev-parse --verify HEAD', intern=TRUE)`"
-git-wd-clean: "`r ifelse(system('git diff-index --quiet HEAD') == 0, 'clean', 'file changes, working directory not clean!')`"
+output: html_document
 params:
   data_dir: "../output/v1.1/data"
 ...
reports/model_assessment.Rmd
@@ -3,15 +3,9 @@ title: "Imputing GOSE scores in CENTER-TBI"
 date: "`r Sys.time()`"
-statistician: "Kevin Kunzmann (kevin.kunzmann@mrc-bsu.cam.ac.uk)"
+author: "Kevin Kunzmann (kevin.kunzmann@mrc-bsu.cam.ac.uk)"
-collaborator: "David Menon (dkm13@cam.ac.uk)"
+output: html_document
-output: reportr::report
-git-commit-hash: "`r system('git rev-parse --verify HEAD', intern=TRUE)`"
-git-wd-clean: "`r ifelse(system('git diff-index --quiet HEAD') == 0, 'clean', 'file changes, working directory not clean!')`"
 bibliography: "references.bib"
 ...
reports/prepare_data.Rmd
 ---
 title: "Extract and prepare data"
-statistician: "Kevin Kunzmann (kevin.kunzmann@mrc-bsu.cam.ac.uk)"
+date: "`r Sys.time()`"
-collaborator: "David Menon (dkm13@cam.ac.uk)"
+author: "Kevin Kunzmann (kevin.kunzmann@mrc-bsu.cam.ac.uk)"
-output: reportr::report
+output: html_document
-date: "`r Sys.time()`"
 params:
   datapath: "../data/v1.1"
 ...