`tf.dynamic_stitch` gradient is incorrect #7397

drasmuss · 2017-02-09T19:09:54Z

If possible, provide a minimal reproducible example (We usually don't have time to read hundreds of lines of your code)

Original reproduction code (TensorFlow 1.0)

import tensorflow as tf

x = tf.zeros((1, 3))
y = tf.dynamic_stitch([[0], [0]], [x, tf.ones((1, 3))])

with tf.Session() as sess:
    print("y")
    print(sess.run(y))

    analytic, numeric = tf.test.compute_gradient(x, (1, 3), y, (1, 3))
    print("analytic")
    print(analytic)
    print("numeric")
    print(numeric)

Updated reproduction code (TensorFlow 2.16)

import tensorflow as tf

x = tf.zeros((1, 3))

analytic, numeric = tf.test.compute_gradient(
    lambda x: tf.dynamic_stitch([[0], [0]], [x, tf.ones((1, 3))]), [x]
)
print("analytic")
print(analytic)
print("numeric")
print(numeric)

The original (TensorFlow 1.0) code gives the output

y
[[ 1.  1.  1.]]
analytic
[[ 1.  0.  0.]
 [ 0.  1.  0.]
 [ 0.  0.  1.]]
numeric
[[ 0.  0.  0.]
 [ 0.  0.  0.]
 [ 0.  0.  0.]]

The numeric gradient correctly shows that x has no impact on y (since the value of x is completely overwritten by a constant in the dynamic_stitch). The analytic gradient is incorrect; it seems like the gradient calculation in dynamic_stitch does not handle the case where there are duplicate indices being merged.
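To illustrate why the numeric result is all zeros, here is a minimal pure-Python sketch of the same check. `stitch_example` and `numeric_jacobian` are hypothetical helpers written for this illustration, not TensorFlow functions; `stitch_example` only mimics the forward behaviour of the call in the reproduction code.

```python
def stitch_example(x_row):
    """Stand-in for tf.dynamic_stitch([[0], [0]], [x, tf.ones((1, 3))]) (illustrative only)."""
    merged = {}
    merged[0] = list(x_row)        # data[0][0] is written to index 0 first
    merged[0] = [1.0, 1.0, 1.0]    # data[1][0] then overwrites the same index
    return merged[0]


def numeric_jacobian(f, x, eps=1e-3):
    """Central-difference Jacobian; entry [i][j] approximates d f(x)[j] / d x[i]."""
    rows = []
    for i in range(len(x)):
        plus = list(x)
        plus[i] += eps
        minus = list(x)
        minus[i] -= eps
        f_plus, f_minus = f(plus), f(minus)
        rows.append([(p - m) / (2 * eps) for p, m in zip(f_plus, f_minus)])
    return rows


jacobian = numeric_jacobian(stitch_example, [0.0, 0.0, 0.0])
print(jacobian)  # → [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
```

Perturbing any entry of x leaves the output unchanged, so every finite difference is zero. An identity Jacobian would only be right if x were the last value written to index 0.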

girving · 2017-02-10T16:41:48Z

Ug. You're correct that the gradients are wrong, but I don't see how to fix it without a dramatic performance hit. Do you have any suggestions?

aselle · 2017-03-03T23:43:07Z

Automatically closing due to lack of recent activity. Please update the issue when new information becomes available, and we will reopen the issue. Thanks!

drasmuss · 2017-03-04T03:26:52Z

The bug still exists, meaning that the tf.dynamic_stitch gradients are incorrect. Is there any other information I can provide that would be helpful?

girving · 2017-03-06T16:22:20Z

Let's leave this open. Anyone interested should refer to the comments in #7487. The next step would have been to add a new C++ kernel to speed up the bookkeeping required by accurate gradients.
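As a rough illustration of the kind of bookkeeping involved, the sketch below records which `(m, i)` wrote each output index last and passes the upstream gradient only to that slice. This is an assumption about what such bookkeeping could look like, written in pure Python for flat index lists. It is not the code from #7487 or from TensorFlow, and `last_writer_mask` and `stitch_input_grads` are hypothetical names.

```python
def last_writer_mask(indices):
    """For each (m, i), report whether data[m][i] survives into the merged result."""
    last_writer = {}
    for m, idx_list in enumerate(indices):
        for i, index in enumerate(idx_list):
            last_writer[index] = (m, i)  # a later (m, i) replaces an earlier one
    return [
        [last_writer[index] == (m, i) for i, index in enumerate(idx_list)]
        for m, idx_list in enumerate(indices)
    ]


def stitch_input_grads(indices, upstream):
    """Gradient for each data[m][i]: the upstream row if it survived, zeros otherwise."""
    mask = last_writer_mask(indices)
    grads = []
    for m, idx_list in enumerate(indices):
        rows = []
        for i, index in enumerate(idx_list):
            row = upstream[index]
            rows.append(list(row) if mask[m][i] else [0.0] * len(row))
        grads.append(rows)
    return grads


# Same index layout as the reproduction: both inputs write to index 0.
grads = stitch_input_grads([[0], [0]], [[1.0, 1.0, 1.0]])
print(grads)  # → [[[0.0, 0.0, 0.0]], [[1.0, 1.0, 1.0]]]
```

The first input (x in the reproduction) receives a zero gradient because it is overwritten, which agrees with the numeric result reported above. The extra pass over all indices is the sort of overhead that a dedicated kernel would aim to make cheap.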

bhack · 2021-05-13T17:56:38Z

Can we close this?

drasmuss · 2021-05-13T18:13:30Z

The gradient implementation is still incorrect as of TF 2.5.0rc. Here is an updated example showing the same error:

import tensorflow as tf

x = tf.zeros((1, 3))

analytic, numeric = tf.test.compute_gradient(
    lambda x: tf.dynamic_stitch([[0], [0]], [x, tf.ones((1, 3))]), [x]
)
print("analytic")
print(analytic)
print("numeric")
print(numeric)

gives

analytic
(array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]], dtype=float32),)
numeric
(array([[0., 0., 0.],
       [0., 0., 0.],
       [0., 0., 0.]], dtype=float32),)

bhack · 2021-05-13T18:16:29Z

/cc @rthadur Can we update the label?

chunduriv · 2022-07-15T08:44:04Z

I was able to replicate the issue in tf-nightly 2.10.0-dev20220714. Please find the gist for reference. Thank you.

mohantym · 2022-09-30T01:38:50Z

Hi @drasmuss!
Just sharing my observations from yesterday's trials.
@drasmuss used the same indices, [0] and [0], to stitch two tensors ([0, 0, 0] and [1, 1, 1]). So when both tensors are written to the same index, it takes the max of the two tensors and the dynamic stitch fails.

If we use two different indices, such as [0], [1] or [0], [2], then the analytic and numeric results from tf.test.compute_gradient are the same.
@yongtang @bhack
Maybe we could add an assertion in the code itself to check whether the user is passing distinct indices.

Attached gist for reference.

Thank you!

drasmuss · 2022-09-30T01:56:11Z

Yes, this bug is caused by having duplicate indices. But that is defined and supported behaviour for dynamic stitch (e.g., see the documentation):

Values are merged in order, so if an index appears in both indices[m][i] and indices[n][j] for (m,i) < (n,j) the slice data[n][j] will appear in the merged result
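The documented merge order can be sketched in pure Python as follows. `dynamic_stitch_reference` is a hypothetical helper that only mimics the quoted rule for flat index lists; it is not TensorFlow's implementation, and it assumes every index from 0 to the maximum appears at least once.

```python
def dynamic_stitch_reference(indices, data):
    """Merge slices in (m, i) order so that a later write to an index wins."""
    merged = {}
    for idx_list, slices in zip(indices, data):
        for index, value in zip(idx_list, slices):
            merged[index] = value  # overwrites any earlier slice at this index
    # Assumes indices 0..max are all present (illustrative simplification).
    return [merged[k] for k in range(max(merged) + 1)]


# Index 1 appears in both lists; the slice from the later list ("c") is kept.
result = dynamic_stitch_reference([[0, 1], [1, 2]], [["a", "b"], ["c", "d"]])
print(result)  # → ['a', 'c', 'd']
```

Since "b" never reaches the output, a correct gradient with respect to it would be zero, which is the case the issue is about.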

mohantym · 2022-09-30T03:10:23Z

Ok @drasmuss !
Thanks for the update.

pjpratik · 2022-12-20T06:50:09Z

I was able to reproduce this issue in TF Nightly 2.12.0-dev20221218. Please find the gist here. Thank you.

synandi · 2023-04-20T08:37:50Z

I was able to replicate this issue in TF Nightly 2.13.0-dev20230419. Please find the gist here. Thank you.

sushreebarsa · 2024-05-21T07:00:55Z

Hi,

Thank you for opening this issue. Since this issue has been open for a long time, the code/debug information for this issue may no longer be relevant to the current state of the code base.

The TensorFlow team is constantly improving the framework by fixing bugs and adding new features. We suggest you try the latest TensorFlow version with the latest compatible hardware configuration, which could potentially resolve the issue. If you are still facing the issue, please create a new GitHub issue with your latest findings and all the debugging information that could help us investigate.

Please follow the release notes to stay up to date with the latest developments in the TensorFlow space.

Thank you!

drasmuss · 2024-05-21T19:41:17Z

This bug is still present in TensorFlow 2.16.1. The code from #7397 (comment) is still valid, and reproduces the bug. I have edited that into the original post for clarity.

girving added stat:awaiting response Status - Awaiting response from author type:bug Bug labels Feb 10, 2017

drasmuss added a commit to drasmuss/tensorflow that referenced this issue Feb 13, 2017

Fix dynamic_stitch gradient implementation

e17aaa4

Addresses tensorflow#7397 Also expanded unit tests to cover these cases.

drasmuss added a commit to drasmuss/tensorflow that referenced this issue Feb 13, 2017

Fix dynamic_stitch gradient implementation

5bcfe24

Addresses tensorflow#7397 Also expanded unit tests to cover these cases.

drasmuss added a commit to drasmuss/tensorflow that referenced this issue Feb 14, 2017

Fix dynamic_stitch gradient implementation

244e545

Addresses tensorflow#7397 Expanded unit tests to cover these cases.

drasmuss mentioned this issue Feb 14, 2017

Fix dynamic_stitch gradient implementation #7487

Closed

drasmuss added a commit to drasmuss/tensorflow that referenced this issue Feb 14, 2017

Fix dynamic_stitch gradient implementation

44e0df1

Addresses tensorflow#7397 Expanded unit tests to cover these cases.

drasmuss added a commit to drasmuss/tensorflow that referenced this issue Feb 14, 2017

Fix dynamic_stitch gradient implementation

edb0275

Addresses tensorflow#7397 Expanded unit tests to cover these cases.

drasmuss added a commit to drasmuss/tensorflow that referenced this issue Feb 15, 2017

Fix dynamic_stitch gradient implementation

9a811cb

Addresses tensorflow#7397 Expanded unit tests to cover these cases.

drasmuss added a commit to drasmuss/tensorflow that referenced this issue Feb 16, 2017

Fix dynamic_stitch gradient implementation

e4e7093

Addresses tensorflow#7397 Expanded unit tests to cover these cases.

aselle closed this as completed Mar 3, 2017

girving reopened this Mar 6, 2017

girving added stat:contribution welcome Status - Contributions welcome and removed stat:awaiting response Status - Awaiting response from author labels Mar 6, 2017

codrut3 mentioned this issue Oct 7, 2017

Fix the gradient computation of dynamic stitch. #13557

Closed

rthadur removed the stat:contribution welcome Status - Contributions welcome label May 13, 2021

rthadur assigned ymodak May 13, 2021

ymodak added comp:ops OPs related issues TF 2.5 Issues related to TF 2.5 stat:awaiting tensorflower Status - Awaiting response from tensorflower labels May 19, 2021

chunduriv added TF 2.9 Issues found in the TF 2.9 release (or RCs) and removed TF 2.5 Issues related to TF 2.5 labels Jul 15, 2022

chunduriv assigned chunduriv and unassigned ymodak Jul 15, 2022

chunduriv assigned mohantym and unassigned chunduriv Sep 29, 2022

mohantym removed their assignment Oct 3, 2022

sushreebarsa self-assigned this May 21, 2024

sushreebarsa added the stat:awaiting response Status - Awaiting response from author label May 21, 2024

google-ml-butler bot removed the stat:awaiting response Status - Awaiting response from author label May 21, 2024

sushreebarsa removed their assignment May 22, 2024
