impute-gene-expression
Commit e2c4cf01 authored Sep 02, 2019 by Kevin Kunzmann
keep everything seperate
parent e211fe26
Showing 2 changed files with 7 additions and 40 deletions
Snakefile  +7 -11
scripts/combine_expression_data.R  +0 -29
Snakefile
@@ -10,15 +10,7 @@ localrules: impute, clean, download_container, generate_samples_file
 rule impute:
     input:
         "container.sif",
-        expand("{output_dir}/imputed-gene-expressions/{region}.expression.txt", region = config['brain_regions'], output_dir = config['output_dir'])
+        expand("{output_dir}/imputed-gene-expressions/{region}.expression.txt.gz", region = config['brain_regions'], output_dir = config['output_dir'])
-    output:
-        expand("{output_dir}/imputed-gene-expressionss_combined.rds", output_dir = config['output_dir'])
-    singularity:
-        "container.sif"
-    shell:
-        """
-        Rscript scripts/combine_expression_data.R
-        """

 # delete output and logs (if run on slurm cluster)
 rule clean:
@@ -108,9 +100,12 @@ rule impute_gene_expressions:
     input:
         "container.sif",
         samples_file = expand("{output_dir}/dosages/samples.txt", output_dir = config['output_dir']),
-        dosage_files = expand("{output_dir}/dosages/chr{i}.dosage.txt.gz", i = range(1, 23), output_dir = config['output_dir'])
+        dosage_files = expand("{output_dir}/dosages/chr{i}.dosage.txt.gz",
+            i = list(map(str, range(1, 23))) + ['X'],
+            output_dir = config['output_dir']
+        )
     output:
-        "{output_dir}/imputed-gene-expressions/{region}.expression.txt"
+        "{output_dir}/imputed-gene-expressions/{region}.expression.txt.gz"
     singularity:
         "container.sif"
     shell:
@@ -125,4 +120,5 @@ rule impute_gene_expressions:
             --output_prefix {wildcards.output_dir}/imputed-gene-expressions/{wildcards.region}
         mv {wildcards.output_dir}/imputed-gene-expressions/{wildcards.region}_predicted_expression.txt \
             {wildcards.output_dir}/imputed-gene-expressions/{wildcards.region}.expression.txt
+        gzip {wildcards.output_dir}/imputed-gene-expressions/{wildcards.region}.expression.txt
         """
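The changed `dosage_files` input extends the chromosome wildcard from the autosomes 1 to 22 to also cover X. As a rough illustration of what that expression evaluates to, here is a minimal sketch that runs without Snakemake installed. The helper `expand_paths` is a hypothetical stand-in for Snakemake's `expand`, and the `output_dir` value `"output"` is assumed; the Snakefile takes it from `config['output_dir']`.

```python
from itertools import product


def expand_paths(pattern, **wildcards):
    """Hypothetical stand-in for Snakemake's expand(): fill the pattern
    with every combination of the given wildcard values."""
    keys = list(wildcards)
    # Accept a single string as well as a list of values for each wildcard.
    values = [[v] if isinstance(v, str) else list(v) for v in wildcards.values()]
    return [
        pattern.format(**dict(zip(keys, combo)))
        for combo in product(*values)
    ]


# Chromosome labels as built in the commit: "1" .. "22" plus "X".
chromosomes = list(map(str, range(1, 23))) + ['X']

dosage_files = expand_paths(
    "{output_dir}/dosages/chr{i}.dosage.txt.gz",
    i=chromosomes,
    output_dir="output",  # assumed value, not taken from the repository
)

print(len(dosage_files))
print(dosage_files[0])
print(dosage_files[-1])
```

Under these assumptions the rule would wait for 23 dosage files instead of 22, with the chrX file listed last.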
scripts/combine_expression_data.R deleted 100644 → 0
-#!/usr/bin/env Rscript
-library(dplyr, warn.conflicts = FALSE)
-library(readr, warn.conflicts = FALSE)
-config <- yaml::read_yaml('config.yml')
-col_types <- cols(
-    .default = col_double(),
-    FID = col_character(),
-    IID = col_character()
-)
-# read and combine individual expression data in long format
-config$brain_regions %>%
-    purrr::map(function(x) {
-        sprintf("%s/imputed-gene-expressions/%s.expression.txt", config$output_dir, x) %>%
-            read_tsv(col_types = col_types, progress = FALSE) %>%
-            tidyr::gather('ensembl_gene_id', 'expression', -FID, -IID) %>%
-            mutate(tissue = x) %>%
-            select(tissue, everything())
-    }) %>%
-    {do.call(rbind, .)} %>%
-    # make missing genes explicit
-    tidyr::spread(ensembl_gene_id, expression, fill = NA_real_) %>%
-    tidyr::gather('ensembl_gene_id', 'expression', -FID, -IID, -tissue) %>%
-    write_rds(sprintf('%s/imputed-gene-expressionss_combined.rds', config$output_dir), compress = 'gz')
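With the combining script removed, each brain region's imputed expression stays in its own gzipped file. The following is a minimal sketch of how a downstream step might read one such file into long format. It assumes a tab-separated layout with `FID` and `IID` columns followed by one numeric column per gene, which is what the deleted script's `read_tsv` call and column types suggest. The function name `read_region_long` is hypothetical and not part of the repository.

```python
import csv
import gzip


def read_region_long(path, tissue):
    """Read one gzipped, tab-separated expression file and yield
    (tissue, FID, IID, ensembl_gene_id, expression) tuples.

    Assumes the columns FID and IID are present and that every other
    column holds the expression values of one gene.
    """
    with gzip.open(path, "rt", newline="") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        for row in reader:
            fid = row.pop("FID")
            iid = row.pop("IID")
            for gene, value in row.items():
                # Treat empty or NA cells as missing rather than failing.
                expression = None if value in ("", "NA") else float(value)
                yield (tissue, fid, iid, gene, expression)
```

A caller would loop over the configured brain regions and pass each region's `.expression.txt.gz` path together with the region name, which keeps the per-region files separate on disk while still allowing a combined long table to be built in memory when needed.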