Improving Interpretation Faithfulness for Transformers

Overview

Abstract

The attention mechanism has become a standard fixture in most state-of-the-art NLP, vision, and GNN models, not only because of the outstanding performance it can deliver, but also because it provides a plausible innate explanation for the behavior of neural architectures, which is otherwise notoriously difficult to analyze. However, recent studies show that attention is unstable under randomness and perturbations during training or testing, such as changes of random seed or slight perturbations of the input or embedding vectors, which prevents it from serving as a faithful explanation tool. A natural question is therefore whether we can find a substitute for the current attention mechanism that is more stable while preserving attention's most important explanatory and predictive characteristics. In this talk, I will present some of our recent work on improving interpretation faithfulness of transformers for different types of data: text, images, and graphs.
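
As a rough, self-contained illustration of the kind of instability referred to above (a minimal sketch, not code from the talk), the snippet below perturbs the input embeddings of a toy single-head attention layer and measures how much the attention weights change. The dimensions, random projection matrices, noise scale, and the Jensen-Shannon divergence measure are all illustrative assumptions.

# Minimal sketch: sensitivity of attention weights to small embedding perturbations.
# All sizes, weights, and the noise scale are arbitrary assumptions, not the talk's setup.
import numpy as np

rng = np.random.default_rng(0)
n_tokens, d_model = 8, 64          # toy sequence length and embedding dimension

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention_weights(X, W_q, W_k):
    # Scaled dot-product attention weights for a single head.
    Q, K = X @ W_q, X @ W_k
    return softmax(Q @ K.T / np.sqrt(Q.shape[-1]), axis=-1)

# Random embeddings and projections stand in for a trained model.
X   = rng.normal(size=(n_tokens, d_model))
W_q = rng.normal(size=(d_model, d_model)) / np.sqrt(d_model)
W_k = rng.normal(size=(d_model, d_model)) / np.sqrt(d_model)

A_clean = attention_weights(X, W_q, W_k)
A_noisy = attention_weights(X + 0.01 * rng.normal(size=X.shape), W_q, W_k)

def js_divergence(p, q, eps=1e-12):
    # Jensen-Shannon divergence between each token's attention distribution
    # before and after the perturbation; larger values mean less stable attention.
    m = 0.5 * (p + q)
    def kl(a, b):
        return np.sum(a * np.log((a + eps) / (b + eps)), axis=-1)
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

print("mean JS divergence per token:", js_divergence(A_clean, A_noisy).mean())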

Brief Biography

Di Wang is currently an Assistant Professor of Computer Science and Adjunct Professor of Statistics at the King Abdullah University of Science and Technology (KAUST). Before that, he received his PhD in Computer Science and Engineering from the State University of New York (SUNY) at Buffalo, and he obtained his BS and MS degrees in mathematics from Shandong University and the University of Western Ontario, respectively. His research areas include privacy-preserving machine learning, interpretability, machine learning theory, and trustworthy machine learning.

Presenters