Missing GPU op for zeros_like for RaggedTensorVariant, error occurs when Ragged Tensor fed thru tf.map_fn #46635
Comments
I ran the code on TF 2.4 and faced a different error on tf-nightly; please find the gist here.
As I mentioned, I don't think tf-nightly-gpu is running on the GPU in Colab. The issue is only present on GPU. In the gist you sent, the cell where it searches for the GPU returns:
Locally, where it is running on the GPU, TF nightly fails with the same error as TF 2.4, so this issue is still present in nightly, at least on my local machine.
@djoshea this looks like a reasonable request. Will you be able to send a PR for this?
I could potentially try developing it, but I unfortunately don't know where to start. Is there a guide somewhere to implementing new ops for the GPU?
I don't think we have a detailed guide for this, but if you're comfortable with C++ you could try to start at the check that's failing.
It would be nice if this could be implemented soon. Using map_fn with RaggedTensors is very convenient when working with data of different shapes. Unfortunately, one must run these operations on the CPU right now.
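Until a GPU kernel exists, one stopgap (my own suggestion, not an official fix from this thread) is to pin the ragged `map_fn` to the CPU explicitly, so the `RaggedTensorVariant` never has to be copied to the GPU while the rest of the model stays on the accelerator:

```python
import tensorflow as tf

# A ragged batch of variable-length rows.
rt = tf.ragged.constant([[1.0, 2.0, 3.0], [4.0]])

# Pin only the ragged map_fn to the CPU; other ops can still run on GPU.
with tf.device("/CPU:0"):
    # Each row of the RaggedTensor is passed to fn as a dense tensor.
    sums = tf.map_fn(tf.math.reduce_sum, rt, fn_output_signature=tf.float32)
```

The trade-off is a host-device copy around the pinned section, but it avoids the `zeros_like` failure entirely.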
I'm relying on map_fn to extract ragged image patches using bounding boxes, so I can then produce ragged bounding-box-centered feature maps via convolutions. I am encountering the same bug with the gradient step inside a tf.function (I can successfully run outside tf.function). The ragged-image-patches method enables a large speed-up with a reduced memory footprint, so it would be very useful to enable this in graph mode on GPUs.
For me, RaggedTensors + map_fn are quite an enabler, because I am implementing a kind of link prediction in a graph neural network. Having TensorFlow handle the individual samples allows for pretty safe programming, as I do not have to put all samples in a batch into one large, disconnected graph and make sure that no nodes of different graphs get connected. Using TensorFlow this way results in nice code that is fast to write, in contrast to solutions with masks, which may need to be changed every time you change your computation. Having this on GPU would be awesome. 💯
I would love to see this fixed, especially as iterating over RaggedTensors does not seem to work properly in a lot of cases.
I ran the code on TF v2.5 and faced the same error; please find the gist here. Thanks!
Hi @sanjoy, any updates on this? This is a really needed feature.
Hi, I would also very much appreciate it if this bug got fixed. Maybe this will be a quick win to fix.
Hi, thanks for raising the issue. According to @edloper "Basically, RaggedTensorVariant objects should never be copied to GPU, because we can't do anything useful with them there. But Placer isn't currently smart enough to figure that out (it just sees a Variant tensor, and doesn't know what kind of value it contains)." We have a project going on right now that hopefully will fix the issue. |
Hello. Could you please let us know if there has been any progress on this issue? For me, it resulted in a sudden sixfold increase in train-step duration due to having to move to the CPU (compared to resorting to a contrived solution with non-ragged tensors).
I also need this resolved.
I faced the same issue and resolved it by rolling my own replacement for `map_fn`:

```python
import tensorflow as tf

def map_fn2(fn, elems, fn_output_signature):
    # Iterate over the batch dimension manually, writing each result
    # into a TensorArray instead of going through tf.map_fn.
    batch_size = tf.shape(tf.nest.flatten(elems)[0])[0]
    arr = tf.TensorArray(
        fn_output_signature.dtype, size=batch_size,
        element_shape=fn_output_signature.shape)
    for i in tf.range(batch_size):
        arr = arr.write(i, fn(tf.nest.map_structure(lambda x: x[i], elems)))
    return arr.stack()
```
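For illustration, here is a self-contained usage sketch of the `map_fn2` workaround above (the helper is repeated so the snippet runs on its own; the dense example input and signature are my own, not from the thread):

```python
import tensorflow as tf

def map_fn2(fn, elems, fn_output_signature):
    # Manual per-example loop over the batch dimension.
    batch_size = tf.shape(tf.nest.flatten(elems)[0])[0]
    arr = tf.TensorArray(
        fn_output_signature.dtype, size=batch_size,
        element_shape=fn_output_signature.shape)
    for i in tf.range(batch_size):
        arr = arr.write(i, fn(tf.nest.map_structure(lambda x: x[i], elems)))
    return arr.stack()

# Sum each row of a small dense batch.
elems = tf.constant([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
out = map_fn2(tf.math.reduce_sum, elems,
              fn_output_signature=tf.TensorSpec(shape=[], dtype=tf.float32))
```

Note that because the loop is an ordinary `for` over `tf.range`, AutoGraph converts it to a graph-mode while loop inside a `tf.function`.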
As of TF v2.15, this is still not resolved.
System information
- Have I written custom code (as opposed to using a stock example script provided in TensorFlow): Yes, included below
- OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Ubuntu 20.10
- Mobile device (e.g. iPhone 8, Pixel 2, Samsung Galaxy) if the issue happens on mobile device: n/a
- TensorFlow installed from (source or binary): pip binary
- TensorFlow version (use command below): v1.12.1-49539-g18d8bcbe72b 2.5.0-dev20210123
- Python version: 3.8.6 | packaged by conda-forge | (default, Nov 27 2020, 19:31:52) [GCC 9.3.0]
- Bazel version (if compiling from source): n/a
- GCC/Compiler version (if compiling from source): n/a
- CUDA/cuDNN version: 11.0 / 8
- GPU model and memory: TITAN X (Pascal), compute capability 6.1
Describe the current behavior
I have a Keras layer `RescaleB` that accepts a ragged tensor with shape `[batch, (time), in_dim]`. The layer calls `map_fn` to process each example in the batch separately, scaling the values along the inner dimension by a trainable gain vector. (The details of the operation aren't critical, but the ragged tensor going into `map_fn` is.) Using this layer fails with

```
No unary variant unary_op function found for unary variant op enum: 1 Variant type_name: RaggedTensorVariant for device type: GPU
```

on a node whose name ends with `rescale_b/map/while/TensorArrayV2Write/TensorListSetItem_grad/zeros_like`, which suggests that the `zeros_like` operation isn't defined for ragged tensors on GPU.

In this simplified example, I also include `RescaleA`, which accomplishes the same task using `tf.ragged.map_flat_values`, although in my real use case I need `map_fn`.

Describe the expected behavior
I'd expect `RescaleB` and `RescaleA` to function identically.

Standalone code to reproduce the issue
https://colab.research.google.com/drive/1mHycCXJL94VuCGkXIJ0bIXtbYamyZo78
I've reproduced the issue locally with tf-nightly-gpu (TF 2.5), but I can't seem to get the nightly version to see the GPU on Colab. The Colab notebook therefore uses TF 2.4, but the issue remains in the TF 2.5 nightly.
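For readers without access to the Colab link, here is a minimal sketch of the two layers, reconstructed from the description above rather than from the notebook (the class names match the text; the `ones` initializer and the exact output signature are my assumptions):

```python
import tensorflow as tf

class RescaleA(tf.keras.layers.Layer):
    """Scales the inner dimension via tf.ragged.map_flat_values (works on GPU)."""
    def __init__(self, in_dim):
        super().__init__()
        self.gain = self.add_weight(name="gain", shape=[in_dim],
                                    initializer="ones")

    def call(self, rt):
        # Operates directly on the flat values, so no variant tensors
        # ever need to move between devices.
        return tf.ragged.map_flat_values(lambda v: v * self.gain, rt)

class RescaleB(tf.keras.layers.Layer):
    """Same computation via tf.map_fn over rows (triggers the GPU error)."""
    def __init__(self, in_dim):
        super().__init__()
        self.in_dim = in_dim
        self.gain = self.add_weight(name="gain", shape=[in_dim],
                                    initializer="ones")

    def call(self, rt):
        # Each batch example arrives as a dense [time, in_dim] tensor;
        # ragged_rank=0 in the spec marks the per-example output as dense.
        return tf.map_fn(
            lambda row: row * self.gain, rt,
            fn_output_signature=tf.RaggedTensorSpec(
                shape=[None, self.in_dim], dtype=rt.dtype, ragged_rank=0))

# A ragged batch with shape [batch, (time), in_dim].
rt = tf.ragged.constant(
    [[[1.0, 2.0], [3.0, 4.0]], [[5.0, 6.0]]], ragged_rank=1)
a = RescaleA(2)(rt)
b = RescaleB(2)(rt)
```

With the gain initialized to ones, both layers should return the input unchanged on CPU; on GPU, `RescaleB` hits the missing `zeros_like` kernel during the gradient pass.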
Other info / logs
This may be the same issue as #44231 but hopefully the additional detail here is helpful.