• Kevin's avatar
    ... · 37cd8615
    Kevin authored
    37cd8615
imputations.Rmd 3.99 KB
---
title: "Imputing GOSE scores in CENTER-TBI, assessing final imputations"

date: "`r Sys.time()`"
      
author: "Kevin Kunzmann (kevin.kunzmann@mrc-bsu.cam.ac.uk)"

output: html_document

params:
  data_dir:    "../output/v1.1/data"
  imputations: "../output/v1.1/data/imputation/msm/df_gose_imputed.csv"
  config_file: "../config.yml"
---

```{r setup-chunk, include=FALSE}
options(tidyverse.quiet = TRUE) # supresses filter/lag conflicts
require(tidyverse, quietly = TRUE)

config <- yaml::read_yaml(params$config_file)

set.seed(config$seed)

df_imputations <- read_csv(params$imputations)
```



Since we do not use the raw imputed values but give preference to the per
protocol values (when a derived composite GOSE is availabel within 5-8 months
after injury), we start by comparing the final combined version with the raw
imputations.
Note that this plot also includes the confirmed deaths before 6 months!

```{r, fig.height=7, fig.width=7}
caret::confusionMatrix(
    df_imputations$Subject.DerivedImputed180DaysGOSE %>% factor(levels = 1:8),
    df_imputations$Subject.GOSE6monthEndpointDerived %>% factor(levels = 1:8)
  ) %>% 
  as.matrix %>% as_tibble %>%
  mutate(
    Subject.DerivedImputed180DaysGOSE = row_number() %>% as.character
  ) %>% 
  gather(Subject.GOSE6monthEndpointDerived, n, `1`:`8`) %>%
  ggplot(aes(Subject.GOSE6monthEndpointDerived, Subject.DerivedImputed180DaysGOSE, fill = n)) +
    geom_raster() +
    geom_hline(yintercept = c(2, 4, 6) + .5, color = "black") +
    geom_vline(xintercept = c(2, 4, 6) + .5, color = "black") +
    geom_text(aes(label = sprintf("%i", n)), vjust = 1) +
    scale_fill_gradient(low = "white", high = "black") +
    coord_fixed(expand = FALSE) + 
    theme_bw() +
    theme(
      panel.grid = element_blank()
    ) + 
    ggtitle("Confusion matrix (absolute counts)")

ggsave(filename = "confusion_matrix.pdf", width = 7, height = 7)
ggsave(filename = "confusion_matrix.png", width = 7, height = 7)
```

```{r}
df_gose <- readRDS(sprintf("%s/df_gose.rds", params$data_dir))

df_per_protocol_gose <- df_gose %>%
  filter(
    Outcomes.DerivedCompositeGOSEDaysPostInjury >= 5*30,
    Outcomes.DerivedCompositeGOSEDaysPostInjury <= 8*30
  ) %>%
  # pick closest to 180
  group_by(gupi) %>%
  summarize(
    closest_per_protocol_GOSE = Outcomes.DerivedCompositeGOSE[
      which.min(abs(Outcomes.DerivedCompositeGOSEDaysPostInjury - 180))
    ]
  )
```

Overall, only `r nrow(df_per_protocol_gose)` six-months GOSE are observed,
i.e., `r nrow(df_imputations %>% filter(complete.cases(.) & Subject.DerivedImputed180DaysGOSE > 1)) - nrow(df_per_protocol_gose)` 
model-based values are used.

```{r, fig.height=8}
df_posteriors <- df_imputations %>% 
  select(-Subject.GOSE6monthEndpointDerived) %>% 
  gather(GOSE, probability, 3:10) %>% 
  arrange(gupi, GOSE) %>% 
  mutate(
    GOSE = factor(GOSE) %>% as.numeric,
    t = 180
  )

idx <- unique(df_gose$gupi)[1:40]

df_gose %>% 
  filter(gupi %in% idx) %>% 
  transmute(
    gupi,
    t = Outcomes.DerivedCompositeGOSEDaysPostInjury,
    GOSE = Outcomes.DerivedCompositeGOSE %>% as.numeric
  ) %>% 
  ggplot(aes(t, GOSE)) + 
    geom_rect(
      aes(
        xmin = t - 14, 
        xmax = t + 14, 
        ymin = GOSE -.5, 
        ymax = GOSE + .5, 
        fill = probability
      ), 
      data = df_posteriors %>% filter(gupi %in% idx)
    ) +
    facet_wrap(~gupi, ncol = 5) + 
    geom_point(size = .5, color = "red") + 
    theme_bw() + 
    theme(
      panel.grid = element_blank(),
      strip.background = element_blank(),
      strip.text.x = element_blank()
    ) +
    scale_fill_gradient2(low = "white", high = "black", limits = c(0, 1)) +
    scale_y_continuous(breaks = 1:8, limits = c(.5, 8.5)) +
    ggtitle("First 30 GOSE trajectories with 6-months predicted probabilities.")

ggsave(filename = "sample_probs.pdf", width = 7, height = 8)
ggsave(filename = "sample_probs.png", width = 7, height = 8)
```



# Session Info

```{r zip-figures}
system("zip figures.zip *.png *.pdf")
system("rm *.png *.pdf")
```

```{r session-info}
sessionInfo()
```