
No registered 'Const' OpKernel for GPU devices with constant folding #52200

Open
albertz opened this issue Sep 30, 2021 · 1 comment
Labels: 2.6.0, comp:gpu (GPU related issues), stat:awaiting tensorflower (Status - Awaiting response from tensorflower), type:bug (Bug)

Comments

albertz (Contributor) commented Sep 30, 2021

System information

  • Have I written custom code (as opposed to using a stock example script provided in TensorFlow): No
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Ubuntu 20.04
  • Mobile device (e.g. iPhone 8, Pixel 2, Samsung Galaxy) if the issue happens on mobile device:
  • TensorFlow installed from (source or binary): pip binary
  • TensorFlow version (use command below): v2.6.0-rc2-32-g919f693420e 2.6.0
  • Python version: 3.8.10
  • Bazel version (if compiling from source):
  • GCC/Compiler version (if compiling from source):
  • CUDA/cuDNN version: 11.4 / 8.2.4.15
  • GPU model and memory: NVIDIA GeForce RTX 2070

Describe the current behavior

The code below fails with an exception.
This is the full output:

TF: 2.6.0
2021-09-30 15:52:24.159169: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-09-30 15:52:24.162278: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-09-30 15:52:24.162637: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-09-30 15:52:24.163155: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2021-09-30 15:52:24.163754: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-09-30 15:52:24.164103: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-09-30 15:52:24.164431: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-09-30 15:52:24.456691: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-09-30 15:52:24.457036: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-09-30 15:52:24.457342: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-09-30 15:52:24.457640: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1510] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 5732 MB memory:  -> device: 0, name: NVIDIA GeForce RTX 2070, pci bus id: 0000:09:00.0, compute capability: 7.5
2021-09-30 15:52:24.466132: W tensorflow/core/grappler/utils/graph_view.cc:836] No registered 'Const' OpKernel for GPU devices compatible with node {{node ConstantFolding/Const_enter}}
         (OpKernel was found, but attributes didn't match) Requested Attributes: dtype=DT_STRING, value=Tensor<type: string shape: [] values: foo>, _device="/job:localhost/replica:0/task:0/device:GPU:0"
        .  Registered:  device='XLA_GPU'; dtype in [DT_UINT8, DT_QUINT8, DT_UINT16, DT_INT8, DT_QINT8, DT_INT16, DT_INT32, DT_QINT32, DT_INT64, DT_HALF, DT_FLOAT, DT_DOUBLE, DT_COMPLEX64, DT_COMPLEX128, DT_BOOL, DT_BFLOAT16]
  device='XLA_CPU'; dtype in [DT_UINT8, DT_QUINT8, DT_UINT16, DT_INT8, DT_QINT8, DT_INT16, DT_INT32, DT_QINT32, DT_INT64, DT_HALF, DT_FLOAT, DT_DOUBLE, DT_COMPLEX64, DT_COMPLEX128, DT_BOOL, DT_BFLOAT16]
  device='DEFAULT'; dtype in [DT_VARIANT]
  device='DEFAULT'; dtype in [DT_BOOL]
  device='DEFAULT'; dtype in [DT_QUINT16]
  device='DEFAULT'; dtype in [DT_QINT16]
  device='DEFAULT'; dtype in [DT_QINT32]
  device='DEFAULT'; dtype in [DT_QUINT8]
  device='DEFAULT'; dtype in [DT_QINT8]
  device='DEFAULT'; dtype in [DT_COMPLEX128]
  device='DEFAULT'; dtype in [DT_COMPLEX64]
  device='DEFAULT'; dtype in [DT_INT8]
  device='DEFAULT'; dtype in [DT_UINT8]
  device='DEFAULT'; dtype in [DT_INT16]
  device='DEFAULT'; dtype in [DT_UINT16]
  device='DEFAULT'; dtype in [DT_UINT32]
  device='DEFAULT'; dtype in [DT_INT64]
  device='DEFAULT'; dtype in [DT_UINT64]
  device='DEFAULT'; dtype in [DT_DOUBLE]
  device='DEFAULT'; dtype in [DT_FLOAT]
  device='DEFAULT'; dtype in [DT_BFLOAT16]
  device='DEFAULT'; dtype in [DT_HALF]
  device='DEFAULT'; dtype in [DT_INT32]
  device='CPU'
  device='TPU_SYSTEM'
  device='GPU'; dtype in [DT_VARIANT]
  device='GPU'; dtype in [DT_BOOL]
  device='GPU'; dtype in [DT_COMPLEX128]
  device='GPU'; dtype in [DT_COMPLEX64]
  device='GPU'; dtype in [DT_UINT64]
  device='GPU'; dtype in [DT_INT64]
  device='GPU'; dtype in [DT_QINT32]
  device='GPU'; dtype in [DT_UINT32]
  device='GPU'; dtype in [DT_QUINT16]
  device='GPU'; dtype in [DT_QINT16]
  device='GPU'; dtype in [DT_INT16]
  device='GPU'; dtype in [DT_UINT16]
  device='GPU'; dtype in [DT_QINT8]
  device='GPU'; dtype in [DT_INT8]
  device='GPU'; dtype in [DT_UINT8]
  device='GPU'; dtype in [DT_DOUBLE]
  device='GPU'; dtype in [DT_FLOAT]
  device='GPU'; dtype in [DT_BFLOAT16]
  device='GPU'; dtype in [DT_HALF]

Traceback (most recent call last):
  File "/home/az/.local/lib/python3.8/site-packages/tensorflow/python/client/session.py", line 1375, in _do_call
    return fn(*args)
  File "/home/az/.local/lib/python3.8/site-packages/tensorflow/python/client/session.py", line 1359, in _run_fn
    return self._call_tf_sessionrun(options, feed_dict, fetch_list,
  File "/home/az/.local/lib/python3.8/site-packages/tensorflow/python/client/session.py", line 1451, in _call_tf_sessionrun
    return tf_session.TF_SessionRun_wrapper(self._session, options, feed_dict,
tensorflow.python.framework.errors_impl.NotFoundError: No registered 'Const' OpKernel for 'GPU' devices compatible with node {{node ConstantFolding/Const_enter}}
         (OpKernel was found, but attributes didn't match) Requested Attributes: _XlaHasReferenceVars=false, dtype=DT_STRING, value=Tensor<type: string shape: [] values: foo>, _device="/job:localhost/replica:0/task:0/device:GPU:0"
        .  Registered:  device='XLA_CPU_JIT'; dtype in [DT_FLOAT, DT_DOUBLE, DT_INT32, DT_UINT8, DT_INT16, DT_INT8, DT_COMPLEX64, DT_INT64, DT_BOOL, DT_QINT8, DT_QUINT8, DT_QINT32, DT_BFLOAT16, DT_UINT16, DT_COMPLEX128, DT_HALF, DT_UINT32, DT_UINT64, DT_STRING]
  device='XLA_GPU_JIT'; dtype in [DT_FLOAT, DT_DOUBLE, DT_INT32, DT_UINT8, DT_INT16, DT_INT8, DT_COMPLEX64, DT_INT64, DT_BOOL, DT_QINT8, DT_QUINT8, DT_QINT32, DT_BFLOAT16, DT_UINT16, DT_COMPLEX128, DT_HALF, DT_UINT32, DT_UINT64, DT_STRING]
  device='XLA_GPU'; dtype in [DT_UINT8, DT_QUINT8, DT_UINT16, DT_INT8, DT_QINT8, DT_INT16, DT_INT32, DT_QINT32, DT_INT64, DT_HALF, DT_FLOAT, DT_DOUBLE, DT_COMPLEX64, DT_COMPLEX128, DT_BOOL, DT_BFLOAT16]
  device='XLA_CPU'; dtype in [DT_UINT8, DT_QUINT8, DT_UINT16, DT_INT8, DT_QINT8, DT_INT16, DT_INT32, DT_QINT32, DT_INT64, DT_HALF, DT_FLOAT, DT_DOUBLE, DT_COMPLEX64, DT_COMPLEX128, DT_BOOL, DT_BFLOAT16]
  device='DEFAULT'; dtype in [DT_VARIANT]
  device='DEFAULT'; dtype in [DT_BOOL]
  device='DEFAULT'; dtype in [DT_QUINT16]
  device='DEFAULT'; dtype in [DT_QINT16]
  device='DEFAULT'; dtype in [DT_QINT32]
  device='DEFAULT'; dtype in [DT_QUINT8]
  device='DEFAULT'; dtype in [DT_QINT8]
  device='DEFAULT'; dtype in [DT_COMPLEX128]
  device='DEFAULT'; dtype in [DT_COMPLEX64]
  device='DEFAULT'; dtype in [DT_INT8]
  device='DEFAULT'; dtype in [DT_UINT8]
  device='DEFAULT'; dtype in [DT_INT16]
  device='DEFAULT'; dtype in [DT_UINT16]
  device='DEFAULT'; dtype in [DT_UINT32]
  device='DEFAULT'; dtype in [DT_INT64]
  device='DEFAULT'; dtype in [DT_UINT64]
  device='DEFAULT'; dtype in [DT_DOUBLE]
  device='DEFAULT'; dtype in [DT_FLOAT]
  device='DEFAULT'; dtype in [DT_BFLOAT16]
  device='DEFAULT'; dtype in [DT_HALF]
  device='DEFAULT'; dtype in [DT_INT32]
  device='CPU'
  device='TPU_SYSTEM'
  device='GPU'; dtype in [DT_VARIANT]
  device='GPU'; dtype in [DT_BOOL]
  device='GPU'; dtype in [DT_COMPLEX128]
  device='GPU'; dtype in [DT_COMPLEX64]
  device='GPU'; dtype in [DT_UINT64]
  device='GPU'; dtype in [DT_INT64]
  device='GPU'; dtype in [DT_QINT32]
  device='GPU'; dtype in [DT_UINT32]
  device='GPU'; dtype in [DT_QUINT16]
  device='GPU'; dtype in [DT_QINT16]
  device='GPU'; dtype in [DT_INT16]
  device='GPU'; dtype in [DT_UINT16]
  device='GPU'; dtype in [DT_QINT8]
  device='GPU'; dtype in [DT_INT8]
  device='GPU'; dtype in [DT_UINT8]
  device='GPU'; dtype in [DT_DOUBLE]
  device='GPU'; dtype in [DT_FLOAT]
  device='GPU'; dtype in [DT_BFLOAT16]
  device='GPU'; dtype in [DT_HALF]

         [[ConstantFolding/Const_enter]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "tf-const-gpu.py", line 18, in <module>
    session.run(n)
  File "/home/az/.local/lib/python3.8/site-packages/tensorflow/python/client/session.py", line 967, in run
    result = self._run(None, fetches, feed_dict, options_ptr,
  File "/home/az/.local/lib/python3.8/site-packages/tensorflow/python/client/session.py", line 1190, in _run
    results = self._do_run(handle, final_targets, final_fetches,
  File "/home/az/.local/lib/python3.8/site-packages/tensorflow/python/client/session.py", line 1368, in _do_run
    return self._do_call(_run_fn, feeds, fetches, targets, options,
  File "/home/az/.local/lib/python3.8/site-packages/tensorflow/python/client/session.py", line 1394, in _do_call
    raise type(e)(node_def, op, message)  # pylint: disable=no-value-for-parameter
tensorflow.python.framework.errors_impl.NotFoundError: No registered 'Const' OpKernel for 'GPU' devices compatible with node {{node ConstantFolding/Const_enter}}
         (OpKernel was found, but attributes didn't match) Requested Attributes: _XlaHasReferenceVars=false, dtype=DT_STRING, value=Tensor<type: string shape: [] values: foo>, _device="/job:localhost/replica:0/task:0/device:GPU:0"
        .  Registered:  device='XLA_CPU_JIT'; dtype in [DT_FLOAT, DT_DOUBLE, DT_INT32, DT_UINT8, DT_INT16, DT_INT8, DT_COMPLEX64, DT_INT64, DT_BOOL, DT_QINT8, DT_QUINT8, DT_QINT32, DT_BFLOAT16, DT_UINT16, DT_COMPLEX128, DT_HALF, DT_UINT32, DT_UINT64, DT_STRING]
  device='XLA_GPU_JIT'; dtype in [DT_FLOAT, DT_DOUBLE, DT_INT32, DT_UINT8, DT_INT16, DT_INT8, DT_COMPLEX64, DT_INT64, DT_BOOL, DT_QINT8, DT_QUINT8, DT_QINT32, DT_BFLOAT16, DT_UINT16, DT_COMPLEX128, DT_HALF, DT_UINT32, DT_UINT64, DT_STRING]
  device='XLA_GPU'; dtype in [DT_UINT8, DT_QUINT8, DT_UINT16, DT_INT8, DT_QINT8, DT_INT16, DT_INT32, DT_QINT32, DT_INT64, DT_HALF, DT_FLOAT, DT_DOUBLE, DT_COMPLEX64, DT_COMPLEX128, DT_BOOL, DT_BFLOAT16]
  device='XLA_CPU'; dtype in [DT_UINT8, DT_QUINT8, DT_UINT16, DT_INT8, DT_QINT8, DT_INT16, DT_INT32, DT_QINT32, DT_INT64, DT_HALF, DT_FLOAT, DT_DOUBLE, DT_COMPLEX64, DT_COMPLEX128, DT_BOOL, DT_BFLOAT16]
  device='DEFAULT'; dtype in [DT_VARIANT]
  device='DEFAULT'; dtype in [DT_BOOL]
  device='DEFAULT'; dtype in [DT_QUINT16]
  device='DEFAULT'; dtype in [DT_QINT16]
  device='DEFAULT'; dtype in [DT_QINT32]
  device='DEFAULT'; dtype in [DT_QUINT8]
  device='DEFAULT'; dtype in [DT_QINT8]
  device='DEFAULT'; dtype in [DT_COMPLEX128]
  device='DEFAULT'; dtype in [DT_COMPLEX64]
  device='DEFAULT'; dtype in [DT_INT8]
  device='DEFAULT'; dtype in [DT_UINT8]
  device='DEFAULT'; dtype in [DT_INT16]
  device='DEFAULT'; dtype in [DT_UINT16]
  device='DEFAULT'; dtype in [DT_UINT32]
  device='DEFAULT'; dtype in [DT_INT64]
  device='DEFAULT'; dtype in [DT_UINT64]
  device='DEFAULT'; dtype in [DT_DOUBLE]
  device='DEFAULT'; dtype in [DT_FLOAT]
  device='DEFAULT'; dtype in [DT_BFLOAT16]
  device='DEFAULT'; dtype in [DT_HALF]
  device='DEFAULT'; dtype in [DT_INT32]
  device='CPU'
  device='TPU_SYSTEM'
  device='GPU'; dtype in [DT_VARIANT]
  device='GPU'; dtype in [DT_BOOL]
  device='GPU'; dtype in [DT_COMPLEX128]
  device='GPU'; dtype in [DT_COMPLEX64]
  device='GPU'; dtype in [DT_UINT64]
  device='GPU'; dtype in [DT_INT64]
  device='GPU'; dtype in [DT_QINT32]
  device='GPU'; dtype in [DT_UINT32]
  device='GPU'; dtype in [DT_QUINT16]
  device='GPU'; dtype in [DT_QINT16]
  device='GPU'; dtype in [DT_INT16]
  device='GPU'; dtype in [DT_UINT16]
  device='GPU'; dtype in [DT_QINT8]
  device='GPU'; dtype in [DT_INT8]
  device='GPU'; dtype in [DT_UINT8]
  device='GPU'; dtype in [DT_DOUBLE]
  device='GPU'; dtype in [DT_FLOAT]
  device='GPU'; dtype in [DT_BFLOAT16]
  device='GPU'; dtype in [DT_HALF]

         [[ConstantFolding/Const_enter]]

Describe the expected behavior

The code below should work without error on a GPU.

Standalone code to reproduce the issue

import tensorflow as tf


print("TF:", tf.__version__)
tf.compat.v1.disable_eager_execution()
tf.compat.v1.disable_control_flow_v2()


with tf.compat.v1.Session() as session:
  x = tf.constant("foo")

  def body(i):
    with tf.control_dependencies([tf.print(x)]):
      return i + 1

  n = tf.while_loop(cond=lambda i: tf.less(i, 1), body=body, loop_vars=[0])
  session.run(n)
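
A possible workaround, not suggested in the issue itself: the failing node is `ConstantFolding/Const_enter`, i.e. a constant re-materialized by Grappler's constant-folding pass and placed on the GPU, where no `Const` kernel exists for `DT_STRING`. Disabling that rewriter pass in the session config should keep the folded node from being created. A minimal sketch of the same repro with constant folding turned off (the `rewriter_config_pb2` proto is the standard way to reach Grappler options; whether this fully resolves the bug on GPU is an assumption):

```python
import tensorflow as tf
from tensorflow.core.protobuf import rewriter_config_pb2

print("TF:", tf.__version__)
tf.compat.v1.disable_eager_execution()
tf.compat.v1.disable_control_flow_v2()

# Disable Grappler's constant-folding pass so the string constant is
# never re-materialized as a Const node pinned to the GPU device.
config = tf.compat.v1.ConfigProto()
config.graph_options.rewrite_options.constant_folding = (
    rewriter_config_pb2.RewriterConfig.OFF)

with tf.compat.v1.Session(config=config) as session:
  x = tf.constant("foo")

  def body(i):
    # tf.print returns an op in graph mode; the control dependency
    # forces it to run on every loop iteration.
    with tf.control_dependencies([tf.print(x)]):
      return i + 1

  n = tf.while_loop(cond=lambda i: tf.less(i, 1), body=body, loop_vars=[0])
  result = session.run(n)
```

Disabling constant folding globally has a graph-wide optimization cost, so it is a diagnostic workaround rather than a fix for the underlying placement bug.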
tilakrayal (Contributor) commented:
@sachinprasadhs,
I was able to reproduce the issue in TF v2.5, v2.6, and nightly. Please find the gist of it here.
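
Editor's note, not from the thread: another thing worth trying is pinning the string constant to CPU explicitly, since the registration list above shows `Const` on CPU accepts all dtypes while the GPU kernel excludes `DT_STRING`. Whether Grappler respects the pin when folding is an open question, so this is only a sketch of the idea:

```python
import tensorflow as tf

tf.compat.v1.disable_eager_execution()
tf.compat.v1.disable_control_flow_v2()

with tf.compat.v1.Session() as session:
  # Explicitly place the string constant on CPU; there is no GPU
  # 'Const' kernel for DT_STRING, so a CPU pin avoids requesting one.
  with tf.device("/cpu:0"):
    x = tf.constant("foo")

  def body(i):
    with tf.control_dependencies([tf.print(x)]):
      return i + 1

  n = tf.while_loop(cond=lambda i: tf.less(i, 1), body=body, loop_vars=[0])
  result = session.run(n)
```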

@tilakrayal tilakrayal added 2.6.0 comp:gpu GPU related issues labels Sep 30, 2021
@sachinprasadhs sachinprasadhs added the stat:awaiting tensorflower Status - Awaiting response from tensorflower label Oct 13, 2021