
No registered 'ResourceScatterNdUpdate' OpKernel for 'GPU' #45663

Open
Flamefire opened this issue Dec 14, 2020 · 9 comments
Labels
comp:gpu GPU related issues comp:ops OPs related issues TF 2.9 Issues found in the TF 2.9 release (or RCs) type:bug Bug

Comments

@Flamefire
Contributor

System information

  • Have I written custom code (as opposed to using a stock example script provided in TensorFlow): No
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): CentOS 7
  • TensorFlow installed from (source or binary): source
  • TensorFlow version (use command below): 2.4.0rc4
  • Python version: 3.7.4
  • Bazel version (if compiling from source): 3.4.1
  • GCC/Compiler version (if compiling from source): GCC 8.3.0
  • CUDA/cuDNN version: 10.1
  • GPU model and memory: V100

Describe the current behavior
A test shows that a GPU implementation of ResourceScatterNdUpdate for BOOL inputs is seemingly missing.
The test is //tensorflow/python/kernel_tests:batch_scatter_ops_test -> ScatterTest.testBooleanScatterUpdate

Standalone code to reproduce the issue
Run bazel test //tensorflow/python/kernel_tests:batch_scatter_ops_test (the failing case is ScatterTest.testBooleanScatterUpdate)

Other info / logs

ERROR: testBooleanScatterUpdate (__main__.ScatterTest)
ScatterTest.testBooleanScatterUpdate
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/tmp/bazel-tf/20db8ac50b74c328e6dea9b20829b459/execroot/org_tensorflow/bazel-out/ppc-opt/bin/tensorflow/python/kernel_tests/batch_scatter_ops_test.runfiles/org_tensorflow/tensorflow/python/kernel_tests/batch_scatter_ops_test.py", line 91, in testBooleanScatterUpdate
    update0 = state_ops.batch_scatter_update(var, [1], [True])
  File "/tmp/bazel-tf/20db8ac50b74c328e6dea9b20829b459/execroot/org_tensorflow/bazel-out/ppc-opt/bin/tensorflow/python/kernel_tests/batch_scatter_ops_test.runfiles/org_tensorflow/tensorflow/python/util/deprecation.py", line 340, in new_func
    return func(*args, **kwargs)
  File "/tmp/bazel-tf/20db8ac50b74c328e6dea9b20829b459/execroot/org_tensorflow/bazel-out/ppc-opt/bin/tensorflow/python/kernel_tests/batch_scatter_ops_test.runfiles/org_tensorflow/tensorflow/python/ops/state_ops.py", line 915, in batch_scatter_update
    ref, final_indices, updates, use_locking=use_locking)
  File "/tmp/bazel-tf/20db8ac50b74c328e6dea9b20829b459/execroot/org_tensorflow/bazel-out/ppc-opt/bin/tensorflow/python/kernel_tests/batch_scatter_ops_test.runfiles/org_tensorflow/tensorflow/python/ops/state_ops.py", line 368, in scatter_nd_update
    name=name))
  File "/tmp/bazel-tf/20db8ac50b74c328e6dea9b20829b459/execroot/org_tensorflow/bazel-out/ppc-opt/bin/tensorflow/python/kernel_tests/batch_scatter_ops_test.runfiles/org_tensorflow/tensorflow/python/ops/gen_state_ops.py", line 740, in resource_scatter_nd_update
    _ops.raise_from_not_ok_status(e, name)
  File "/tmp/bazel-tf/20db8ac50b74c328e6dea9b20829b459/execroot/org_tensorflow/bazel-out/ppc-opt/bin/tensorflow/python/kernel_tests/batch_scatter_ops_test.runfiles/org_tensorflow/tensorflow/python/framework/ops.py", line 6862, in raise_from_not_ok_status
    six.raise_from(core._status_to_exception(e.code, message), None)
  File "<string>", line 3, in raise_from
tensorflow.python.framework.errors_impl.NotFoundError: No registered 'ResourceScatterNdUpdate' OpKernel for 'GPU' devices compatible with node {{node ResourceScatterNdUpdate}}
	 (OpKernel was found, but attributes didn't match) Requested Attributes: T=DT_BOOL, Tindices=DT_INT32, use_locking=true
	.  Registered:  device='GPU'; T in [DT_COMPLEX128]; Tindices in [DT_INT64]
  device='GPU'; T in [DT_COMPLEX128]; Tindices in [DT_INT32]
  device='GPU'; T in [DT_COMPLEX64]; Tindices in [DT_INT64]
  device='GPU'; T in [DT_COMPLEX64]; Tindices in [DT_INT32]
  device='GPU'; T in [DT_DOUBLE]; Tindices in [DT_INT64]
  device='GPU'; T in [DT_DOUBLE]; Tindices in [DT_INT32]
  device='GPU'; T in [DT_FLOAT]; Tindices in [DT_INT64]
  device='GPU'; T in [DT_FLOAT]; Tindices in [DT_INT32]
  device='GPU'; T in [DT_HALF]; Tindices in [DT_INT64]
  device='GPU'; T in [DT_HALF]; Tindices in [DT_INT32]
  device='GPU'; T in [DT_INT64]; Tindices in [DT_INT64]
  device='GPU'; T in [DT_INT64]; Tindices in [DT_INT32]
  device='GPU'; T in [DT_INT32]; Tindices in [DT_INT64]
  device='GPU'; T in [DT_INT32]; Tindices in [DT_INT32]
  device='CPU'; T in [DT_BOOL]; Tindices in [DT_INT64]
  device='CPU'; T in [DT_BOOL]; Tindices in [DT_INT32]
  device='CPU'; T in [DT_STRING]; Tindices in [DT_INT64]
  device='CPU'; T in [DT_STRING]; Tindices in [DT_INT32]
  device='CPU'; T in [DT_COMPLEX128]; Tindices in [DT_INT64]
  device='CPU'; T in [DT_COMPLEX128]; Tindices in [DT_INT32]
  device='CPU'; T in [DT_COMPLEX64]; Tindices in [DT_INT64]
  device='CPU'; T in [DT_COMPLEX64]; Tindices in [DT_INT32]
  device='CPU'; T in [DT_DOUBLE]; Tindices in [DT_INT64]
  device='CPU'; T in [DT_DOUBLE]; Tindices in [DT_INT32]
  device='CPU'; T in [DT_FLOAT]; Tindices in [DT_INT64]
  device='CPU'; T in [DT_FLOAT]; Tindices in [DT_INT32]
  device='CPU'; T in [DT_BFLOAT16]; Tindices in [DT_INT64]
  device='CPU'; T in [DT_BFLOAT16]; Tindices in [DT_INT32]
  device='CPU'; T in [DT_HALF]; Tindices in [DT_INT64]
  device='CPU'; T in [DT_HALF]; Tindices in [DT_INT32]
  device='CPU'; T in [DT_INT32]; Tindices in [DT_INT64]
  device='CPU'; T in [DT_INT32]; Tindices in [DT_INT32]
  device='CPU'; T in [DT_INT8]; Tindices in [DT_INT64]
  device='CPU'; T in [DT_INT8]; Tindices in [DT_INT32]
  device='CPU'; T in [DT_UINT8]; Tindices in [DT_INT64]
  device='CPU'; T in [DT_UINT8]; Tindices in [DT_INT32]
  device='CPU'; T in [DT_INT16]; Tindices in [DT_INT64]
  device='CPU'; T in [DT_INT16]; Tindices in [DT_INT32]
  device='CPU'; T in [DT_UINT16]; Tindices in [DT_INT64]
  device='CPU'; T in [DT_UINT16]; Tindices in [DT_INT32]
  device='CPU'; T in [DT_UINT32]; Tindices in [DT_INT64]
  device='CPU'; T in [DT_UINT32]; Tindices in [DT_INT32]
  device='CPU'; T in [DT_INT64]; Tindices in [DT_INT64]
  device='CPU'; T in [DT_INT64]; Tindices in [DT_INT32]
  device='CPU'; T in [DT_UINT64]; Tindices in [DT_INT64]
  device='CPU'; T in [DT_UINT64]; Tindices in [DT_INT32]
 [Op:ResourceScatterNdUpdate]
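The "Registered:" list above can be read as a lookup table keyed by (device, T, Tindices). As a minimal illustrative sketch (this is not TensorFlow's actual registry code, and the entries are abridged from the log above), the matching logic that produces this NotFoundError looks roughly like:

```python
# Illustrative model of the kernel-registry lookup behind the error above.
# NOT TensorFlow's implementation; entries abridged from the "Registered:" list.
REGISTERED = {
    ("GPU", "DT_FLOAT", "DT_INT32"),
    ("GPU", "DT_INT32", "DT_INT32"),
    ("CPU", "DT_BOOL", "DT_INT32"),
    ("CPU", "DT_BOOL", "DT_INT64"),
}

def find_kernel(device, t, tindices):
    """Return a kernel description, or raise like NotFoundError does."""
    if (device, t, tindices) in REGISTERED:
        return f"{device} kernel: T={t}, Tindices={tindices}"
    raise LookupError(
        f"No registered 'ResourceScatterNdUpdate' OpKernel for '{device}' "
        f"with T={t}, Tindices={tindices}"
    )

# A bool request resolves on CPU but not on GPU, matching the failure above.
print(find_kernel("CPU", "DT_BOOL", "DT_INT32"))
```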
@amahendrakar
Contributor

@Flamefire,
TensorFlow 2.4 is tested and built against CUDA 11 and cuDNN 8. For more information, please take a look at the tested build configurations.

Could you please check if you are facing the same error with CUDA 11 and cuDNN 8 as well? Thanks!

@amahendrakar amahendrakar added comp:gpu GPU related issues stat:awaiting response Status - Awaiting response from author TF 2.4 for issues related to TF 2.4 labels Dec 15, 2020
@Flamefire
Contributor Author

As I can't use CUDA 11, I retested with TF 2.3 via python git/tensorflow/tensorflow/python/kernel_tests/batch_scatter_ops_test.py ScatterTest.testBooleanScatterUpdate and am seeing the same issue. So I'd assume this is not related to the CUDA/cuDNN version.

@amahendrakar amahendrakar removed the stat:awaiting response Status - Awaiting response from author label Dec 16, 2020
@amahendrakar amahendrakar assigned ymodak and unassigned amahendrakar Dec 16, 2020
@amahendrakar amahendrakar added comp:ops OPs related issues and removed comp:gpu GPU related issues labels Dec 16, 2020
@ymodak ymodak added the stat:awaiting tensorflower Status - Awaiting response from tensorflower label Dec 18, 2020
@Flamefire
Contributor Author
Flamefire commented Dec 18, 2020

Looking at the source, the issue is fully expected: the bool kernel is simply never declared for GPU. The relevant line is

TF_CALL_GPU_NUMBER_TYPES(DECLARE_GPU_SPECS);

That macro expands to

#define TF_CALL_GPU_NUMBER_TYPES(m) \
  TF_CALL_half(m) TF_CALL_float(m) TF_CALL_double(m)

As one can see, the bool type is missing. I guess using TF_CALL_GPU_ALL_TYPES would be the right choice here. And looking at the other CPU registrations, I'd say TF_CALL_INTEGRAL_TYPES should also be preferred over the individual TF_CALL_int32/TF_CALL_int64 calls.
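As a rough Python model of that X-macro registration pattern (the names mirror the TF macros quoted above, but this is a sketch, not TF source; the real TF_CALL_GPU_ALL_TYPES also pulls in the complex types):

```python
# Sketch of the X-macro registration pattern in Python; not TF source.
gpu_kernels = set()

def register_gpu(dtype):
    # Stands in for the DECLARE_GPU_SPECS(type) expansion.
    gpu_kernels.add(dtype)

def TF_CALL_GPU_NUMBER_TYPES(m):
    # Mirrors: TF_CALL_half(m) TF_CALL_float(m) TF_CALL_double(m)
    for t in ("half", "float", "double"):
        m(t)

def TF_CALL_GPU_ALL_TYPES(m):
    # Hypothetical broader list: the number types plus bool
    # (the real macro additionally covers complex types).
    TF_CALL_GPU_NUMBER_TYPES(m)
    m("bool")

TF_CALL_GPU_ALL_TYPES(register_gpu)
# "bool" is now in gpu_kernels -- exactly what the GPU path is missing today.
```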

@tilakrayal tilakrayal assigned tilakrayal and unassigned ymodak Jul 15, 2022
@mohantym mohantym assigned mohantym and unassigned tilakrayal Jul 22, 2022
@mohantym
Contributor
mohantym commented Jul 22, 2022

Hi @Flamefire!

We are checking whether you still need assistance with this issue. I could not find the above test in version 2.9.
Could you check this issue with Bazel 5.0.0 and TF 2.9 and let us know?

Thank you!

@mohantym mohantym added stat:awaiting response Status - Awaiting response from author and removed stat:awaiting tensorflower Status - Awaiting response from tensorflower labels Jul 22, 2022
@google-ml-butler

This issue has been automatically marked as stale because it has no recent activity. It will be closed if no further activity occurs. Thank you.

@google-ml-butler google-ml-butler bot added the stale This label marks the issue/pr stale - to be closed automatically if no activity label Jul 29, 2022
@google-ml-butler google-ml-butler bot removed stat:awaiting response Status - Awaiting response from author stale This label marks the issue/pr stale - to be closed automatically if no activity labels Aug 2, 2022
@mohantym
Contributor
mohantym commented Aug 3, 2022

Hi @Flamefire!
The test seems to be passing on TF 2.9, Bazel 5.0.0, and GCC 7.3; gist attached for reference. Could you let us know whether it passes on your end with the above specs?
Thank you!

@mohantym mohantym added the stat:awaiting response Status - Awaiting response from author label Aug 3, 2022
@Flamefire
Contributor Author

Sorry, but your gist doesn't show the test passing. The last lines I see are

 [5,459 / 9,765] 2 actions running
    Compiling .../mlir/expansions/segmentation_spmd_expander.cc; 13s local
    Compiling llvm/lib/Target/X86/X86FastTileConfig.cpp; 1s local

As shown in the source, there is simply no boolean type registered for 'ResourceScatterNdUpdate'.
So if you evaluate the last expression of

var = variables.Variable([True, False])
update0 = state_ops.batch_scatter_update(var, [1], [True])

on a GPU, it will try to find a GPU kernel for a bool tensor (see the code at https://github.com/tensorflow/tensorflow/blob/v2.9.1/tensorflow/python/ops/state_ops.py#L1041) and fail with the above error.

I'm not familiar enough with TF internals to know how to force the above to run on GPU, but doing so should be enough to reproduce.

Again: this only fails when run on GPU, as the CPU kernel is registered for bool: https://github.com/tensorflow/tensorflow/blob/v2.9.1/tensorflow/core/kernels/scatter_nd_op.cc#L501-L502

@google-ml-butler google-ml-butler bot removed the stat:awaiting response Status - Awaiting response from author label Aug 4, 2022
@mohantym
Contributor
mohantym commented Aug 5, 2022

Thanks @Flamefire for the update.

@mohantym mohantym removed their assignment Aug 5, 2022
@mohantym mohantym added TF 2.9 Issues found in the TF 2.9 release (or RCs) comp:gpu GPU related issues and removed TF 2.4 for issues related to TF 2.4 labels Aug 5, 2022