MIX: A Multi-view Time-Frequency Interactive Explanation Framework for Time Series Classification

1Queen's University Belfast 2Institut Polytechnique de Paris 3Aarhus University
Neural Information Processing Systems (NeurIPS) 2025

*Indicates Equal Contribution

Abstract

Deep learning models for time series classification (TSC) have achieved impressive performance, but explaining their decisions remains a significant challenge. Existing post-hoc explanation methods typically operate solely in the time domain and from a single view, which limits both faithfulness and robustness. In this work, we propose MIX (Multi-view Time-Frequency Interactive EXplanation Framework), a novel framework that explains deep learning models in a multi-view setting by leveraging multi-resolution time-frequency views constructed with the Haar Discrete Wavelet Transform (DWT). MIX introduces an interactive cross-view refinement scheme in which explanatory information from one view is propagated to other views to enhance overall interpretability. To align with user-preferred perspectives, we propose a greedy selection strategy that traverses the multi-view space to identify the most informative features. Additionally, we present OSIGV, a user-aligned segment-level attribution mechanism based on overlapping windows for each view, and introduce keystone-first IG, a method that refines the explanation in each view using additional information from another view. Extensive experiments across multiple TSC benchmarks and model architectures demonstrate that MIX significantly outperforms state-of-the-art (SOTA) methods in terms of explanation faithfulness and robustness.
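As a rough illustration of the multi-resolution view construction mentioned above, the following minimal sketch computes Haar DWT approximation coefficients at successive levels. It assumes that each view corresponds to the approximation coefficients (cA) at one decomposition level and that the signal length is divisible by 2^levels; the function names `haar_step` and `haar_views` are hypothetical and are not taken from the MIX code.

```python
import math


def haar_step(signal):
    """One level of the orthonormal Haar DWT: returns (approximation, detail)."""
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    # Pairwise scaled sums give the low-pass (approximation) coefficients.
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    # Pairwise scaled differences give the high-pass (detail) coefficients.
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_views(signal, levels):
    """Return [raw signal, cA level 1, ..., cA level `levels`] (assumed view layout)."""
    views = [list(signal)]
    current = list(signal)
    for _ in range(levels):
        current, _detail = haar_step(current)
        views.append(current)
    return views


example = [4.0, 2.0, 6.0, 8.0, 1.0, 3.0, 5.0, 7.0]
views = haar_views(example, levels=2)
for level, view in enumerate(views):
    print(level, len(view))
```

Each level halves the length of the view, so coarser views summarize longer stretches of the original series per coefficient.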

Overview Framework

Overview of the MIX framework with three phases.

(A) Multi-view Construction and Independent Explanation: views V_r are constructed via Haar DWT, then explained independently using IGV and OSIGV.

(B) Cross-view Refinement: the best view V_q is selected using KAUCS̃top, then refined using KIGV and OSIGV guided by the top-h segments.

(C) Multi-view Greedy Selection: MIX traverses all views to select key features and maps them to the user-preferred view. Phase 3 is practical for selecting top features directly.

(D) Attribution mechanism in Phase 1: IGV is applied to each view, and scores are aggregated into overlapping segments via OSIGV.

(E) Attribution mechanism in Phase 2: Keystone-first IG for view (KIGV) prioritizes keystone features before generating importance scores for the remaining features, then OSIGV is applied again to overlapping segments.
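The segment-level aggregation described in (D) can be sketched as follows. This is an illustrative sketch only: it assumes point-wise attribution scores are already available for a view, and it assumes segments are scored by the mean of the scores inside each overlapping window. The aggregation rule, the `window` and `stride` parameters, and the function names are assumptions, not details confirmed by the paper.

```python
def overlapping_segment_scores(point_scores, window, stride):
    """Aggregate point-wise scores into overlapping (start, end, score) segments."""
    if window <= 0 or stride <= 0:
        raise ValueError("window and stride must be positive")
    if window > len(point_scores):
        raise ValueError("window is longer than the score sequence")
    segments = []
    # A stride smaller than the window makes consecutive segments overlap.
    for start in range(0, len(point_scores) - window + 1, stride):
        end = start + window
        mean_score = sum(point_scores[start:end]) / window
        segments.append((start, end, mean_score))
    return segments


def top_segments(segments, h):
    """Return the h highest-scoring segments, best first."""
    return sorted(segments, key=lambda seg: seg[2], reverse=True)[:h]


point_scores = [0.0, 0.1, 0.9, 0.8, 0.3, 0.0, 0.0, 0.2]
segments = overlapping_segment_scores(point_scores, window=4, stride=2)
best = top_segments(segments, h=1)
print(best)
```

Selecting the top-h segments this way mirrors the role the top-h segments play in guiding refinement in (B), although the actual selection criterion in MIX may differ.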

MIT-BIH Visualization

Visualization of explanations on the synthetic dataset (A, B) and MIT-BIH (C, D), without (A, C) and with (B, D) Phase 3. Labels use the format cA_{segment id}^{level}. Phase 3 enhances interpretability by highlighting the key features of the time series and the relevant granularity.

BibTeX

@inproceedings{
        anonymous2025mix,
        title={{MIX}: A Multi-view Time-Frequency Interactive Explanation Framework for Time Series Classification},
        author={Anonymous},
        booktitle={The Thirty-ninth Annual Conference on Neural Information Processing Systems},
        year={2025},
        url={https://openreview.net/forum?id=XDtwXau0BX}
}