

tf.map_fn on RaggedTensors crash during gradient computation on a GPU #55475

Closed
foxik opened this issue Apr 3, 2022 · 5 comments
Labels
comp:ops OPs related issues TF 2.8 type:bug Bug

Comments

@foxik
Contributor
foxik commented Apr 3, 2022

System information

  • Have I written custom code (as opposed to using a stock example script provided in TensorFlow): yes
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Colab
  • TensorFlow installed from (source or binary): binary
  • TensorFlow version (use command below): 2.8
  • Python version: 3.7

Describe the current behavior

When one of the losses tf.losses.SparseCategoricalCrossentropy, tf.losses.CategoricalCrossentropy, tf.losses.BinaryCrossentropy, or tf.losses.MeanSquaredError is used on ragged tensors, where it is computed via a tf.map_fn over a RaggedTensor, the gradient computation on a GPU crashes with

Node: 'Adam/gradients/zeros_like_2'
2 root error(s) found.
  (0) INTERNAL:  No unary variant unary_op function found for op ZEROS_LIKE Variant type_name: RaggedTensorVariant for device type: GPU
	 [[{{node Adam/gradients/zeros_like_2}}]]
	 [[binary_crossentropy/map/while/loop_body_control/_124/_67]]
  (1) INTERNAL:  No unary variant unary_op function found for op ZEROS_LIKE Variant type_name: RaggedTensorVariant for device type: GPU
	 [[{{node Adam/gradients/zeros_like_2}}]]
0 successful operations.
0 derived errors ignored. [Op:__inference_train_function_16690]

The computation does not crash on a CPU and it does not crash when tf.functions are executed eagerly.
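
For reference, the eager configuration mentioned above can be reproduced simply by turning on eager execution of tf.functions globally before compiling the model, roughly:

import tensorflow as tf

# Run every tf.function eagerly; in this mode the gradient computation above
# completes, at the cost of giving up graph execution (so this is only a
# diagnostic toggle, not a real workaround).
tf.config.run_functions_eagerly(True)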

Also, if the tf.map_fn is circumvented by passing the following loss argument to compile

  loss=lambda yt, yp: tf.losses.BinaryCrossentropy()(yt.values, yp.values)

it works on the GPU without a crash.
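
In full, that workaround looks roughly as follows (a sketch only -- model here stands for any Keras model producing RaggedTensor predictions):

import tensorflow as tf

# Sketch of the workaround: bypass the ragged tf.map_fn in the loss wrapper by
# computing the loss directly on the flat .values of the ragged tensors.
# `model` is a placeholder for a Keras model with ragged outputs.
model.compile(
    optimizer=tf.optimizers.Adam(),
    loss=lambda yt, yp: tf.losses.BinaryCrossentropy()(yt.values, yp.values))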

Describe the expected behavior

The code does not crash on a GPU.

  • Do you want to contribute a PR? (yes/no): no

Standalone code to reproduce the issue

A simple Colab reproducing the error is here: https://colab.research.google.com/drive/1OELAhvpQHhaz3sOYabf4SdBqKlQCjNjs?usp=sharing
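
In case the Colab is not accessible, the setup is roughly the following (a condensed, untested sketch with illustrative data and layer sizes, not the exact notebook):

import tensorflow as tf

# Illustrative ragged data: variable-length token sequences with a binary
# label per token (ragged_rank=1, inner dimension 1 to match the model output).
x = tf.ragged.constant([[1, 2, 3], [4, 5]], dtype=tf.int32)
y = tf.ragged.constant([[[0.0], [1.0], [0.0]], [[1.0], [0.0]]], ragged_rank=1)

# A tiny model mapping each ragged sequence to per-token sigmoid predictions.
inputs = tf.keras.Input(shape=[None], dtype=tf.int32, ragged=True)
hidden = tf.keras.layers.Embedding(10, 8)(inputs)
outputs = tf.keras.layers.Dense(1, activation="sigmoid")(hidden)
model = tf.keras.Model(inputs, outputs)

# Because y_true and y_pred are RaggedTensors, the loss goes through the ragged
# tf.map_fn wrapper in keras/losses.py; on a GPU the gradient pass then fails
# with the ZEROS_LIKE error above, while on a CPU (or eagerly) it trains fine.
model.compile(optimizer=tf.optimizers.Adam(), loss=tf.losses.BinaryCrossentropy())
model.fit(x, y, epochs=1)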

Other info / logs

The map_fn used is here: https://github.com/keras-team/keras/blob/2db5acf3e3c5904b014cb409d3c514bef44f9640/keras/losses.py#L1408
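
Paraphrased, the wrapper at that line does something like the following (not the exact Keras source, just the shape of it):

import tensorflow as tf

def ragged_tensor_apply_loss(loss_fn, y_true, y_pred):
    # Apply loss_fn row by row over the ragged inputs. tf.map_fn lowers to a
    # while loop whose iteration state carries RaggedTensorVariant values, and
    # presumably it is the gradient of this loop that needs a ZEROS_LIKE kernel
    # for RaggedTensorVariant, which is not registered on GPU.
    return tf.map_fn(
        lambda args: loss_fn(args[0], args[1]),
        elems=(y_true, y_pred),
        fn_output_signature=tf.TensorSpec([], y_pred.dtype))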

@foxik
Contributor Author
foxik commented Apr 3, 2022

Note that I also opened an issue in the Keras repository, keras-team/tf-keras#638, where we discuss whether the tf.map_fn on RaggedTensors should be avoided altogether -- it probably can be: the metrics take a different approach with ragged tensors and, instead of a ragged map, use flat_values, see https://github.com/keras-team/keras/blob/2db5acf3e3c5904b014cb409d3c514bef44f9640/keras/utils/metrics_utils.py#L800.
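
For comparison, the flat_values route amounts to something like this (again just a sketch of the idea, not the metrics_utils code):

import tensorflow as tf

def ragged_loss_via_flat_values(loss_fn, y_true, y_pred):
    # Flatten both ragged tensors to their dense flat_values and apply the loss
    # once; there is no per-row while loop, hence no zeros_like on
    # RaggedTensorVariant in the gradient.
    return loss_fn(y_true.flat_values, y_pred.flat_values)

# Toy usage with ragged labels/predictions:
y_true = tf.ragged.constant([[0.0, 1.0, 0.0], [1.0, 0.0]])
y_pred = tf.ragged.constant([[0.1, 0.9, 0.2], [0.8, 0.3]])
loss = ragged_loss_via_flat_values(tf.losses.BinaryCrossentropy(), y_true, y_pred)

(Note the weighting differs slightly: a single mean over all values rather than a mean of per-row losses.)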

@sushreebarsa
Contributor

@chunduriv I was able to reproduce the issue on Colab using TF v2.8.0 and tf-nightly on both GPU and CPU; please find the attached gists for reference. Thanks!

@foxik
Contributor Author
foxik commented Apr 7, 2022

Oh, it was just pointed out to me (by djoshea) that this is a duplicate of #46635, so closing.

@foxik
Contributor Author
foxik commented Apr 7, 2022

Closing as a duplicate of #46635 .

foxik closed this as completed Apr 7, 2022
@google-ml-butler

Are you satisfied with the resolution of your issue?
