Explainable machine learning for
time-to-event prediction in medicine and healthcare


 Hubert Baniecki


University of Warsaw, Poland


 BIRS Workshop, Banff, Canada

February 12, 2024

đź‘‹ Hi!

PhD student at the University of Warsaw, Poland

explainable machine learning, evaluating explanations, robustness

statistical software: dalex (JMLR 2021, John M. Chambers Award by ASA), modelStudio (JOSS 2019), survex (Bioinformatics 2023)

đź“Ł How to explain machine learning survival models?

  1. method: time-dependent explanations (Knowledge-Based Systems 2023)
  2. software: an R package survex (Bioinformatics 2023)
  3. application(s): finding bias in predictions from medical data (AIME 2023)


🔨 method:
time-dependent explanations

How to explain machine learning survival models?

For classification (or regression), we have a well-established toolbox of explanation methods, e.g. LIME, SHAP, and permutation feature importance.


What about time-to-event prediction like survival analysis?

SurvLIME

LIME idea (Ribeiro et al. 2016): approximate a black-box function with an interpretable surrogate model in the local neighbourhood of observation \(x\).
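A minimal sketch of this idea: a made-up black box and a weighted linear surrogate fitted on random perturbations around \(x\) (an illustration only, not the actual LIME algorithm, which additionally discretizes and selects features).

```r
# Toy local surrogate in the spirit of LIME (illustration only).
set.seed(1)
f <- function(X) 1 / (1 + exp(-(2 * X$x1 - X$x2^2)))   # black-box prediction function
x <- data.frame(x1 = 0.5, x2 = -1)                      # observation to explain

Z <- data.frame(x1 = rnorm(500, x$x1, 0.3),             # perturbations around x
                x2 = rnorm(500, x$x2, 0.3))
w <- exp(-((Z$x1 - x$x1)^2 + (Z$x2 - x$x2)^2))          # proximity (kernel) weights

surrogate <- lm(y ~ x1 + x2, data = cbind(Z, y = f(Z)), weights = w)
coef(surrogate)                                         # local linear explanation of f around x
```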

M. S. Kovalev et al. SurvLIME: A method for explaining machine learning survival models. Knowledge-Based Systems 2020

SurvLIME: example and limitations

  1. the explanation lacks the time dimension
  2. approximating a complex model with linear coefficients limits local accuracy (fidelity)
  3. coefficients are not importances (their magnitude depends on the feature values and scales)

SurvSHAP(t)

M. Krzyzinski, M. Spytek, H. Baniecki, P. Biecek. SurvSHAP(t): Time-dependent explanations of machine learning survival models. Knowledge-Based Systems 2023

SurvSHAP(t): Shapley values

SHAP idea (Lundberg et al. 2017): use game theory to estimate additive feature attributions \(\phi_{i}\) to the model’s prediction for observation \(x\).

\[ \phi_{i}(x)=\sum_{S\subseteq P\setminus \{i\}}{\frac {|S|!\;(|P|-|S|-1)!}{|P|!}}(v_{S\cup \{i\}}(x)-v_S(x)) \] \(P\) – feature set, \(v_S(x) = \mathbb{E}_{x|S} f(x)\) – the prediction with feature values in set \(S\) fixed to those of \(x\), marginalized over the features not included in \(S\).
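As a concrete illustration of the formula, a toy exact computation for three features, where \(v_S(x)\) is estimated by averaging \(f\) over a made-up background sample with the features in \(S\) fixed to their values in \(x\) (marginal expectation); the model and data below are invented for the sketch.

```r
set.seed(1)
f  <- function(X) X$x1 + 2 * X$x2 * X$x3                              # toy model to explain
bg <- data.frame(x1 = rnorm(100), x2 = rnorm(100), x3 = rnorm(100))   # background sample
x  <- data.frame(x1 = 1, x2 = 1, x3 = 1)                              # observation to explain

v <- function(S) {                                       # v_S(x): fix features in S to x,
  Z <- bg                                                # average over the background sample
  if (length(S)) Z[S] <- x[rep(1, nrow(bg)), S, drop = FALSE]
  mean(f(Z))
}

shapley <- function(i, P = c("x1", "x2", "x3")) {
  rest <- setdiff(P, i)
  subsets <- c(list(character(0)),                       # all coalitions S not containing i
               unlist(lapply(seq_along(rest), function(k)
                 combn(rest, k, simplify = FALSE)), recursive = FALSE))
  sum(sapply(subsets, function(S) {
    w <- factorial(length(S)) * factorial(length(P) - length(S) - 1) / factorial(length(P))
    w * (v(c(S, i)) - v(S))                              # weighted marginal contribution
  }))
}

sapply(c("x1", "x2", "x3"), shapley)                     # additive attributions phi_i(x)
```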


Over 24 algorithms to estimate Shapley value feature attributions:

KernelSHAP, TreeSHAP, marginal vs. conditional expectations, ...

SurvSHAP(t): example

Idea: explain \(f_t(x)\), the prediction for an observation \(x\) at time point \(t\), separately for every time point, which yields time-dependent attributions \(\phi_t(x, i)\).
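A self-contained sketch of that idea with two features: exact Shapley values are computed for \(f_t(x)\) on a grid of time points, so each feature gets an attribution curve over time. The survival function \(S(t \mid x)\) and the data below are made up; this illustrates only the time dimension, not the estimator from the SurvSHAP(t) paper.

```r
set.seed(1)
S_t <- function(X, t) exp(-0.02 * t * exp(X$x1 + X$x2^2))   # toy survival function S(t|x)
bg  <- data.frame(x1 = rnorm(50), x2 = rnorm(50))            # background sample
x   <- data.frame(x1 = 1, x2 = 1)                            # observation to explain

phi_t <- sapply(seq(1, 120, by = 2), function(t) {
  v <- function(keep) {                                      # v_S(x) for the prediction f_t
    Z <- bg
    if (length(keep)) Z[keep] <- x[rep(1, nrow(bg)), keep, drop = FALSE]
    mean(S_t(Z, t))
  }
  c(time = t,                                                # exact 2-feature Shapley values
    x1 = 0.5 * ((v("x1") - v(character(0))) + (v(c("x1", "x2")) - v("x2"))),
    x2 = 0.5 * ((v("x2") - v(character(0))) + (v(c("x1", "x2")) - v("x1"))))
})

matplot(phi_t["time", ], t(phi_t[c("x1", "x2"), ]), type = "l",
        xlab = "time", ylab = "attribution")                 # phi_t(x, i) curves over time
```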

đź“Š software: survex, survshap

survex: dalex for survival models

M. Spytek, M. Krzyzinski, S. H. Langbein, H. Baniecki, M. N. Wright, P. Biecek. survex: an R package for explaining machine learning survival models. Bioinformatics 2023

survex: code example
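A minimal sketch of the survex workflow on the classic veteran lung cancer data, based on the package documentation (defaults and outputs may differ between package versions).

```r
library(survex)
library(survival)
library(ranger)

vet <- survival::veteran
rsf <- ranger(Surv(time, status) ~ ., data = vet)           # random survival forest

# wrap any survival model in a unified explainer interface
exp_rsf <- explain(rsf,
                   data = vet[, -c(3, 4)],                  # features (drop time, status)
                   y = Surv(vet$time, vet$status))          # right-censored target

model_performance(exp_rsf)                                  # time-dependent performance metrics
plot(model_parts(exp_rsf))                                  # time-dependent feature importance

new_obs <- vet[1, -c(3, 4)]
plot(predict_parts(exp_rsf, new_obs, type = "survshap"))    # SurvSHAP(t) for one patient
plot(predict_parts(exp_rsf, new_obs, type = "survlime"))    # SurvLIME for comparison
```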

Find more on our GitHub page

🏥 application:
explaining hospital length of stay

Predicting hospital length of stay (LoS)

To what extent can the patient’s length of stay in a hospital be predicted using only an X-ray image?

Motivation: explain black-box models predicting hospital LoS
(K. Stone et al. PLOS Digital Health 2022)

Task: time-to-event prediction
instead of a single-value time regression or time-span classification

Dataset: tabular data, openly available on GitHub

We manually annotated textual radiology reports of X-ray images:

  • 1235 patients from a Polish hospital
  • target feature: time between the patient’s radiological examination and hospital discharge (in days; \(min=1\), \(median=7\), \(mean=13.73\), \(max=330\)); about 20% of outcomes are right-censored (encoded as in the sketch after this list)
  • 95 features:
    • 2 baseline: age & sex
    • 17 human-annotated pathology occurrences, e.g. pulmonary nodule, pleural effusion, medical devices
    • 76 algorithm-extracted image statistics using the pyradiomics tool (J. Van Griethuysen et al. Cancer Research 2017)
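A sketch of how such a right-censored time-to-event target can be encoded; the file and column names below are hypothetical placeholders, and the actual data and code are in the GitHub repository linked in the takeaways.

```r
library(survival)

# hypothetical file and column names, for illustration only
tlos <- read.csv("tlos_tabular_data.csv")
y <- Surv(time  = tlos$days_to_discharge,   # days from radiological exam to discharge
          event = tlos$discharged)          # 1 = discharge observed, 0 = right-censored (~20%)
summary(tlos$days_to_discharge)             # reported: min 1, median 7, mean 13.73, max 330
```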

Benchmarking machine learning models

Predicting the patient’s LoS from an X-ray image is indeed possible, but challenging.

Multi-modal feature performance

The performance of the interpretable model declines as the number of features increases.

Explaining LoS predictions to humans

H. Baniecki, B. Sobieski, P. Bombinski, P. Szatkowski, P. Biecek. Hospital Length of Stay Prediction Based on Multi-modal Data Towards Trustworthy Human-AI Collaboration in Radiomics. AIME 2023

Time-dependent feature importance and effects

H. Baniecki, B. Sobieski, P. Bombinski, P. Szatkowski, P. Biecek. Hospital Length of Stay Prediction Based on Multi-modal Data Towards Trustworthy Human-AI Collaboration in Radiomics. AIME 2023
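A sketch of how such global, time-dependent explanations can be produced with survex; the explainer object `exp_los` and the feature names are hypothetical placeholders (arguments assumed to mirror the dalex-style API), not the exact analysis from the paper.

```r
# assumes exp_los is a survex explainer wrapping the fitted LoS survival model
fi <- model_parts(exp_los)                   # permutation feature importance over time
plot(fi)

pdp <- model_profile(exp_los,                # time-dependent feature effects
                     variables = c("age", "pleural_effusion"))
plot(pdp)
```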

Takeaways & discussion

  1. Adapt explanations to the time-to-event use-case
  2. The TLOS dataset is openly available for research purposes at github.com/mi2datalab/xlungs-trustworthy-los-prediction

How to evaluate explanationsâť“ How to interpret explanationsâť“

One needs to be sure that humans properly interpret explanation visualizations: physicians and other stakeholders reading these predictions need to understand that explanations are only an approximation of the black-box model.


🧑‍🔬 Shameless plug: I’m open to a 3-month research visit in 2025 (self-)funded by “Polish NSF” as part of my PhD.

References

  • H. Baniecki, W. Kretowicz, P. Piatyszek, J. Wisniewski, P. Biecek. dalex: Responsible Machine Learning with Interactive Explainability and Fairness in Python. JMLR 2021
  • M. Krzyzinski, M. Spytek, H. Baniecki, P. Biecek. SurvSHAP(t): Time-dependent explanations of machine learning survival models. Knowledge-Based Systems 2023
  • M. Spytek, M. Krzyzinski, S. H. Langbein, H. Baniecki, M. N. Wright, P. Biecek. survex: an R package for explaining machine learning survival models. Bioinformatics 2023
  • H. Baniecki, B. Sobieski, P. Bombinski, P. Szatkowski, P. Biecek. Hospital Length of Stay Prediction Based on Multi-modal Data Towards Trustworthy Human-AI Collaboration in Radiomics. AIME 2023
  • K. Stone et al. A systematic review of the prediction of hospital length of stay: Towards a unified framework. PLOS Digital Health 2022
  • J. Van Griethuysen et al. Computational radiomics system to decode the radiographic phenotype. Cancer Research 2017