
Question about Self Attn Layer #7

Closed

shawei3000 opened this issue Oct 4, 2018 · 2 comments

Comments

@shawei3000

In the SelfAttnMatch layer, you manually modify the score tensor (line 362), as below. Can I ask what the rationale behind this is? Is this step important to model performance?

if not self.diag:
    x_len = x.size(1)
    for i in range(x_len):
        scores[:, i, i] = 0

@seanliu96
Collaborator

Hi shawei3000, we do not want a word to attend to itself, so the diagonal of the score tensor is masked to 0. You can see the details and Eq. (5) in https://arxiv.org/pdf/1705.02798v3.pdf
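
For reference, the same diagonal masking can be written without the Python loop (a minimal sketch, not the repository's code, assuming scores has shape [batch, x_len, x_len]):

import torch

# scores: [batch, x_len, x_len] self-attention logits (toy values here)
scores = torch.randn(2, 4, 4)

# diagonal() over the last two dims returns a writable view of each
# matrix's main diagonal, so zeroing it zeros scores[:, i, i] for all i
scores.diagonal(dim1=1, dim2=2).zero_()

# equivalent out-of-place form: mask the diagonal with an identity matrix
eye = torch.eye(scores.size(1), dtype=torch.bool, device=scores.device)
scores = scores.masked_fill(eye, 0.0)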

@shawei3000
Author

Got it, thank you, seanliu96! I will close this question shortly...
"the diagonal of the self-coattention matrix is set to be zero in case of the word being aligned with itself"

Updating the diagonal of a matrix/tensor via a slice update appears not as easy in TensorFlow as it is in PyTorch, or may not even be possible... In case someone has experience with this, any suggestion on how to accomplish it with a TensorFlow tensor?
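
One possible approach, sketched under the assumption that tf.linalg.set_diag is available (in older 1.x versions the same op is tf.matrix_set_diag):

import tensorflow as tf

# scores: [batch, x_len, x_len] attention logits (toy values here)
scores = tf.random.uniform([2, 4, 4])

# TensorFlow tensors do not support item assignment, so instead of a
# slice update, replace the main diagonal of every batch matrix at once
batch = tf.shape(scores)[0]
x_len = tf.shape(scores)[1]
scores = tf.linalg.set_diag(scores, tf.zeros([batch, x_len], dtype=scores.dtype))

# alternative: multiply by (1 - I) to zero the diagonal
# scores = scores * (1.0 - tf.eye(x_len, batch_shape=[batch]))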
