Hubert Baniecki

h.baniecki (at) uw.edu.pl

I am a third-year PhD student in Computer Science at the University of Warsaw, advised by Przemyslaw Biecek. As part of my doctoral studies, I’ve been a visiting researcher at LMU Munich, hosted by Bernd Bischl (2024) and Eyke Hüllermeier (2025). Before my PhD, I completed a Master’s in Data Science at the Warsaw University of Technology.

My research focuses on machine learning interpretability and explainable AI, with particular interest in efficient explanation estimation and the robustness of post-hoc explanations to adversarial manipulation. I also care about how humans interact with models through explanations, especially in the context of medical imaging and survival analysis. When time allows, I contribute to open-source software facilitating the responsible development of predictive models, for which I received the John M. Chambers Statistical Software Award (2022).

recent news

2025 May	The Foundation for Polish Science awarded me the START scholarship for young scientists.
2025 May	The paper "Interpreting CLIP with hierarchical sparse autoencoders" was accepted at ICML 2025.
2025 Mar	I am in Germany until April for a one-month research visit at LMU Munich, hosted by Eyke Hüllermeier.
2025 Jan	The paper "Efficient and accurate explanation estimation with distribution compression" was accepted as a Spotlight at ICLR 2025 (notable 5% of submissions).
2024 Nov	The paper "Increasing phosphorus loss despite widespread concentration decline in US rivers" was published in the Proceedings of the National Academy of Sciences.

selected publications

Efficient and accurate explanation estimation with distribution compression

H. Baniecki, G. Casalicchio, B. Bischl, P. Biecek

ICLR 2025 (Spotlight)
Compress then explain: Sample-efficient estimation of feature attributions, importance, and effects.

Abstract

We discover a theoretical connection between explanation estimation and distribution compression that significantly improves the approximation of feature attributions, importance, and effects. While the exact computation of various machine learning explanations requires numerous model inferences and becomes impractical, the computational cost of approximation increases with the ever-increasing size of data and model parameters. We show that the standard i.i.d. sampling used in a broad spectrum of algorithms for post-hoc explanation leads to an approximation error worthy of improvement. To this end, we introduce Compress Then Explain (CTE), a new paradigm of sample-efficient explainability. It relies on distribution compression through kernel thinning to obtain a data sample that best approximates its marginal distribution. CTE significantly improves the accuracy and stability of explanation estimation with negligible computational overhead. It often achieves an on-par explanation approximation error 2-3x faster by using fewer samples, i.e., requiring 2-3x fewer model evaluations. CTE is a simple, yet powerful, plug-in for any explanation method that now relies on i.i.d. sampling.

On the robustness of global feature effect explanations

H. Baniecki, G. Casalicchio, B. Bischl, P. Biecek

ECML PKDD 2024
Theoretical bounds for the robustness of feature effects to data and model perturbations.

Abstract

We study the robustness of global post-hoc explanations for predictive models trained on tabular data. Effects of predictor features in black-box supervised learning are an essential diagnostic tool for model debugging and scientific discovery in applied sciences. However, how vulnerable they are to data and model perturbations remains an open research question. We introduce several theoretical bounds for evaluating the robustness of partial dependence plots and accumulated local effects. Our experimental results with synthetic and real-world datasets quantify the gap between the best and worst-case scenarios of (mis)interpreting machine learning predictions globally.

The grammar of interactive explanatory model analysis

H. Baniecki, D. Parzych, P. Biecek

Data Mining and Knowledge Discovery, 2023
Interactive explanation of a model improves the performance of human decision making.

Abstract

The growing need for in-depth analysis of predictive models leads to a series of new methods for explaining their local and global properties. Which of these methods is the best? It turns out that this is an ill-posed question. One cannot sufficiently explain a black-box machine learning model using a single method that gives only one perspective. Isolated explanations are prone to misunderstanding, leading to wrong or simplistic reasoning. This problem is known as the Rashomon effect and refers to diverse, even contradictory, interpretations of the same phenomenon. Surprisingly, most methods developed for explainable and responsible machine learning focus on a single aspect of the model behavior. In contrast, we showcase the problem of explainability as an interactive and sequential analysis of a model. This paper proposes how different Explanatory Model Analysis (EMA) methods complement each other and discusses why it is essential to juxtapose them. The introduced process of Interactive EMA (IEMA) derives from the algorithmic side of explainable machine learning and aims to embrace ideas developed in cognitive sciences. We formalize the grammar of IEMA to describe human-model interaction. It is implemented in a widely used human-centered open-source software framework that adopts interactivity, customizability, and automation as its main traits. We conduct a user study to evaluate the usefulness of IEMA, which indicates that an interactive sequential analysis of a model may increase the accuracy and confidence of human decision making.

Fooling partial dependence via data poisoning

H. Baniecki, W. Kretowicz, P. Biecek

ECML PKDD 2022
Feature effect explanations can be manipulated in an adversarial manner.

Abstract

Many methods have been developed to understand complex predictive models, and high expectations are placed on post-hoc model explainability. It turns out that such explanations are neither robust nor trustworthy, and they can be fooled. This paper presents techniques for attacking Partial Dependence (plots, profiles, PDP), which are among the most popular methods of explaining any predictive model trained on tabular data. We showcase that PD can be manipulated in an adversarial manner, which is alarming, especially in financial or medical applications where auditability has become a must-have trait supporting black-box models. The fooling is performed via poisoning the data to bend and shift explanations in the desired direction using genetic and gradient algorithms. To the best of our knowledge, this is the first work using a genetic algorithm for attacking explanations, which is highly transferable as it generalizes both ways: in a model-agnostic and an explanation-agnostic manner.

dalex: Responsible machine learning with interactive explainability and fairness in Python

H. Baniecki, W. Kretowicz, P. Piatyszek, J. Wisniewski, P. Biecek

Journal of Machine Learning Research, 2021
2022 John M. Chambers Statistical Software Award

Abstract

In modern machine learning, we observe the phenomenon of opaqueness debt, which manifests itself by an increased risk of discrimination, lack of reproducibility, and deflated performance due to data drift. An increasing amount of available data and computing power results in the growing complexity of black-box predictive models. To manage these issues, good MLOps practice asks for better validation of model performance and fairness, higher explainability, and continuous monitoring. The necessity for deeper model transparency comes from both scientific and social domains and is also caused by emerging laws and regulations on artificial intelligence. To facilitate the responsible development of machine learning models, we introduce dalex, a Python package which implements a model-agnostic interface for interactive explainability and fairness. It adopts the design crafted through the development of various tools for explainable machine learning; thus, it aims at the unification of existing solutions. This library's source code and documentation are available under an open license at https://python.drwhy.ai.