Interpreting black-box machine learning models with decision rules and knowledge graph reasoning
Karim, Md. Rezaul; Decker, Stefan Josef (Thesis advisor); Rebholz-Schuhmann, Dietrich (Thesis advisor)
Aachen : RWTH Aachen University (2022)
Dissertation / PhD Thesis
Dissertation, RWTH Aachen University, 2022
Abstract
Machine learning (ML) algorithms are increasingly used to solve complex problems with high accuracy. However, due to highly non-linear and higher-order interactions between features, complex ML models tend to be less interpretable and increasingly become black boxes, exposing a clear trade-off between accuracy and interpretability. Further, using a black-box model, we don’t know how and why inputs are ultimately mapped to certain decisions. This may not be acceptable in many settings (e.g., in clinical situations where AI may significantly impact human lives). Consequently, legal landscapes have been moving fast in European and North American countries: under the EU GDPR, for example, explainability, adversarial robustness, transparency, and fairness are not only desirable properties of AI but also legal requirements. An interpretable ML model, on the other hand, can outline how input instances are mapped to certain outputs by identifying statistically significant features. This thesis aims to improve the interpretability and explainability of black-box ML models without sacrificing significant predictive accuracy. First, employing different representation learning techniques, a black-box multimodal convolutional autoencoder (MCAE) embeds multimodal data into a joint latent space; the learned representations are then used for the classification task. To improve the interpretability of the black-box model, interpretable ML methods such as probing, perturbing, and model surrogation are applied. Further, an interpretable surrogate model is trained to approximate the behavior of the black-box model; it is subsequently used to provide explanations in terms of decision rules and counterfactuals. To ensure that the models are robust to adversaries and behave as intended, adversarial retraining is performed to identify adversarial inputs. Since an adversarially robust model generates more consistent and reliable predictions, robustness is formulated as a property requiring that predictions remain stable under small variations of the input, i.e., adding a minor, imperceptible perturbation to the supplied input should not flip the prediction to a completely different cancer type. To add symbolic reasoning capability to a connectionist model (either the black-box MCAE or the surrogate), a domain-specific knowledge graph (KG) is constructed by integrating knowledge and facts from scientific literature and domain-specific ontologies. A semantic reasoner is then used to validate the associations of significant features with the different classes, based on relations learned from the KG. Finally, evidence-based decision rules are generated by combining decision rules, counterfactuals, and reasoning to mitigate prediction biases. In addition, a web application is developed to ease assessing the quality of explanations in terms of the system causability scale, comprehensiveness, and sufficiency via a user-friendly interface. Quantitative evaluation shows that our approach significantly outperformed existing approaches when evaluated on a held-out test set, indicating low bias and potentially high generalizability.
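To make the representation-learning step concrete, the following is a minimal sketch (not the thesis implementation) of a multimodal convolutional autoencoder in PyTorch: two modality-specific encoders are fused into a joint latent code, from which each modality is reconstructed. The input sizes, layer widths, and modality names are illustrative assumptions only.

```python
import torch
import torch.nn as nn

class MCAE(nn.Module):
    """Minimal multimodal convolutional autoencoder: two modality-specific
    convolutional encoders feed a shared (joint) latent space, from which
    modality-specific decoders reconstruct each input."""

    def __init__(self, latent_dim=64):
        super().__init__()
        # Encoder for modality A (e.g., copy-number profiles reshaped to 1 x 128)
        self.enc_a = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=5, padding=2), nn.ReLU(),
            nn.Flatten(), nn.Linear(16 * 128, latent_dim))
        # Encoder for modality B (e.g., gene-expression profiles, 1 x 128)
        self.enc_b = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=5, padding=2), nn.ReLU(),
            nn.Flatten(), nn.Linear(16 * 128, latent_dim))
        # Joint latent space: fuse the two modality embeddings
        self.fuse = nn.Linear(2 * latent_dim, latent_dim)
        # Decoders reconstruct each modality from the joint code
        self.dec_a = nn.Linear(latent_dim, 128)
        self.dec_b = nn.Linear(latent_dim, 128)

    def forward(self, xa, xb):
        z = self.fuse(torch.cat([self.enc_a(xa), self.enc_b(xb)], dim=1))
        return self.dec_a(z), self.dec_b(z), z

# Reconstruction loss over both modalities; the joint code z is later
# reused as the input representation for a downstream classifier.
model = MCAE()
xa, xb = torch.randn(8, 1, 128), torch.randn(8, 1, 128)
ra, rb, z = model(xa, xb)
loss = (nn.functional.mse_loss(ra, xa.squeeze(1))
        + nn.functional.mse_loss(rb, xb.squeeze(1)))
loss.backward()
```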
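The model-surrogation step can likewise be illustrated with a small, self-contained scikit-learn sketch. Here a random forest merely stands in for the actual black-box model, and the synthetic data and feature names are assumptions made purely for illustration; the point is that the interpretable tree is fitted to the black-box's predictions and then read out as decision rules.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier  # stand-in black-box
from sklearn.tree import DecisionTreeClassifier, export_text

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 5))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)

# Stand-in for the trained black-box model.
black_box = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)

# Model surrogation: fit an interpretable tree on the black-box's own
# predictions (not the ground truth), so the tree approximates its behavior.
surrogate = DecisionTreeClassifier(max_depth=3, random_state=0)
surrogate.fit(X, black_box.predict(X))

# Fidelity: how often the surrogate agrees with the black-box.
fidelity = (surrogate.predict(X) == black_box.predict(X)).mean()
print(f"surrogate fidelity: {fidelity:.2f}")

# Human-readable decision rules extracted from the surrogate.
print(export_text(surrogate, feature_names=[f"feature_{i}" for i in range(5)]))
```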
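The robustness property described above (a small perturbation of the input must not flip the predicted class) can be sketched roughly as follows. The FGSM-style perturbation, the toy classifier, and the epsilon value are illustrative assumptions, not the thesis's exact adversarial-retraining setup; in adversarial retraining, inputs like `x_adv` would additionally be fed back into training.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in classifier over the learned latent features.
clf = nn.Sequential(nn.Linear(64, 32), nn.ReLU(), nn.Linear(32, 3))

def fgsm_perturb(model, x, eps=0.01):
    """FGSM-style perturbation: take a small step in the direction that
    increases the loss with respect to the current prediction."""
    x = x.clone().requires_grad_(True)
    logits = model(x)
    loss = nn.functional.cross_entropy(logits, logits.argmax(dim=1))
    loss.backward()
    return (x + eps * x.grad.sign()).detach()

def is_stable(model, x, eps=0.01):
    """Robustness as a property: the predicted class must not change when a
    small perturbation is added to the input."""
    x_adv = fgsm_perturb(model, x, eps)
    return torch.equal(model(x).argmax(dim=1), model(x_adv).argmax(dim=1))

x = torch.randn(4, 64)
print("prediction stable under eps=0.01 perturbation:", is_stable(clf, x))
```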
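Finally, the KG-based validation step can be approximated with a few lines of rdflib. The namespace, gene, and cancer-type names below are hypothetical, and the thesis's actual KG and semantic reasoner are considerably richer than this simple ASK-query check; the sketch only shows the idea of confirming that a feature flagged as significant is actually associated with the predicted class in the KG.

```python
from rdflib import Graph, Namespace, RDF

EX = Namespace("http://example.org/onco#")  # hypothetical namespace for illustration
g = Graph()
g.bind("ex", EX)

# Tiny stand-in for the domain-specific KG: facts linking a gene flagged as a
# significant feature to a cancer type, as might be curated from literature
# and domain ontologies.
g.add((EX.TP53, RDF.type, EX.Gene))
g.add((EX.BreastCarcinoma, RDF.type, EX.CancerType))
g.add((EX.TP53, EX.associatedWith, EX.BreastCarcinoma))

def validate_association(graph, gene, cancer_type):
    """Ask the KG whether it supports an association between a significant
    feature and the predicted class."""
    result = graph.query(
        "ASK { ?gene ex:associatedWith ?cancer . }",
        initNs={"ex": EX},
        initBindings={"gene": gene, "cancer": cancer_type},
    )
    return result.askAnswer

print(validate_association(g, EX.TP53, EX.BreastCarcinoma))  # expected: True
print(validate_association(g, EX.TP53, EX.LungCarcinoma))    # expected: False
```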
Identifier
- DOI: 10.18154/RWTH-2022-07610
- RWTH PUBLICATIONS: RWTH-2022-07610