cuBLAS Error in 2.14.0 #62002

Open
Romeo-CC opened this issue Sep 28, 2023 · 53 comments
Labels
comp:gpu GPU related issues stat:awaiting tensorflower Status - Awaiting response from tensorflower subtype: ubuntu/linux Ubuntu/Linux Build/Installation Issues TF2.14 For issues related to Tensorflow 2.14.x type:build/install Build and install issues

Comments

@Romeo-CC

Issue type

Bug

Have you reproduced the bug with TensorFlow Nightly?

No

Source

binary

TensorFlow version

2.14.0

Custom code

No

OS platform and distribution

Ubuntu 23.04

Mobile device

No response

Python version

3.11.5

Bazel version

No response

GCC/compiler version

No response

CUDA/cuDNN version

CUDA 11.8 CUDNN 8.9.4

GPU model and memory

Nvidia RTX 3080ti

Current behavior?

in the shell terminal

install tensorflow via pip

pip install tensorflow==2.14.0

In the python terminal

input

import tensorflow as tf

then the output

2023-09-28 19:19:50.298229: E tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:9342] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2023-09-28 19:19:50.298259: E tensorflow/compiler/xla/stream_executor/cuda/cuda_fft.cc:609] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2023-09-28 19:19:50.298302: E tensorflow/compiler/xla/stream_executor/cuda/cuda_blas.cc:1518] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2023-09-28 19:19:50.303578: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-09-28 19:19:50.982905: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT

Standalone code to reproduce the issue

no

Relevant log output

No response
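The startup output above mixes three severities (`I` info, `W` warning, `E` error) with unprefixed continuation lines. A minimal sketch for summarizing such a log is given here; the names `LOG_LINE`, `FACTORY`, and `summarize` are hypothetical helpers and not part of TensorFlow, and the pattern is inferred only from the lines quoted in this thread.

```python
import re
from collections import Counter

# Matches lines shaped like the ones quoted in this issue, e.g.
# "2023-09-28 19:19:50.298259: E tensorflow/.../cuda_fft.cc:609] Unable to register ..."
LOG_LINE = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+): "
    r"(?P<severity>[IWEF]) "
    r"(?P<source>\S+?):(?P<line>\d+)\] "
    r"(?P<message>.*)$"
)
FACTORY = re.compile(r"Unable to register (\w+) factory")


def summarize(log_text):
    """Count prefixed log lines per severity and list plugins named in
    'Unable to register ... factory' messages."""
    severities = Counter()
    plugins = []
    for raw in log_text.splitlines():
        match = LOG_LINE.match(raw.strip())
        if match is None:
            continue  # continuation lines and program output carry no prefix
        severities[match["severity"]] += 1
        factory = FACTORY.search(match["message"])
        if factory:
            plugins.append(factory.group(1))
    return dict(severities), plugins
```

Feeding it the captured stderr of `python -c "import tensorflow"` gives a quick count of how many lines are actually errors and which plugins they name.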

@google-ml-butler google-ml-butler bot added the type:bug Bug label Sep 28, 2023
@tilakrayal tilakrayal added TF2.14 For issues related to Tensorflow 2.14.x type:build/install Build and install issues labels Sep 29, 2023
@tilakrayal
Contributor

@Romeo-CC,
I tried installing the latest TensorFlow v2.14, and it installed on the Ubuntu 23.04 environment. Although it produced the informational warnings and the errors mentioned, I was able to import TensorFlow and run the code without any issues. Please find the output below for reference.

(tf) tilakrayal@tilak-214-gpu:~$ python3 -c "import tensorflow as tf; print(tf.__version__)"
2023-09-29 09:38:46.654714: I tensorflow/tsl/cuda/cudart_stub.cc:28] Could not find cuda drivers on your machine, GPU will not be used.
2023-09-29 09:38:46.724680: E tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:9342] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2023-09-29 09:38:46.724902: E tensorflow/compiler/xla/stream_executor/cuda/cuda_fft.cc:609] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2023-09-29 09:38:46.725069: E tensorflow/compiler/xla/stream_executor/cuda/cuda_blas.cc:1518] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2023-09-29 09:38:46.738490: I tensorflow/tsl/cuda/cudart_stub.cc:28] Could not find cuda drivers on your machine, GPU will not be used.
2023-09-29 09:38:46.739029: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-09-29 09:38:48.156496: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
2.14.0

Thank you!

@tilakrayal tilakrayal added the stat:awaiting response Status - Awaiting response from author label Sep 29, 2023
@Romeo-CC
Author

Hi @tilakrayal. Thank you for your reply!
In your test case, the output shows "2023-09-29 09:38:46.654714: I tensorflow/tsl/cuda/cudart_stub.cc:28] Could not find cuda drivers on your machine, GPU will not be used." You may not have a proper CUDA version installed in your environment, which means TF is running in CPU mode.
Have you tested tf-2.14.0 running in GPU mode?
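Whether a given environment is running in CPU or GPU mode can be checked directly. The sketch below is one way to do it, assuming only public TensorFlow calls (`tf.config.list_physical_devices`, `tf.test.is_built_with_cuda`, `tf.sysconfig.get_build_info`). The function name `gpu_status` and the report keys are hypothetical.

```python
import importlib.util


def gpu_status():
    """Return a small report on whether TensorFlow can see a GPU."""
    if importlib.util.find_spec("tensorflow") is None:
        return {"tensorflow_installed": False}

    # Imported lazily: this import is what prints the startup log lines.
    import tensorflow as tf

    build = tf.sysconfig.get_build_info()
    return {
        "tensorflow_installed": True,
        "version": tf.__version__,
        "built_with_cuda": tf.test.is_built_with_cuda(),
        # CUDA/cuDNN versions the wheel was built against (may be absent on CPU builds).
        "build_cuda_version": build.get("cuda_version"),
        "build_cudnn_version": build.get("cudnn_version"),
        "gpus": [d.name for d in tf.config.list_physical_devices("GPU")],
    }


if __name__ == "__main__":
    for key, value in gpu_status().items():
        print(f"{key}: {value}")
```

An empty `gpus` list together with `built_with_cuda: True` would suggest that the wheel supports CUDA but the driver or libraries were not found at runtime.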

@google-ml-butler google-ml-butler bot removed the stat:awaiting response Status - Awaiting response from author label Sep 29, 2023
@sk000007
sk000007 commented Oct 3, 2023

Any luck on this?

I seem to have the exact same problem on win10/wsl2/tensorflow2.14.

@MOmoharrum

I have the same problem on win11/wsl2/tensorflow 2.14.0.

@shkarupa-alex
Contributor
shkarupa-alex commented Oct 9, 2023

Same issue with Ubuntu 22.04 + Cuda 12.2 + cudnn 8.9 + build from source

alex@pc:~$ python
Python 3.10.12 (main, Jun 11 2023, 05:26:28) [GCC 11.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
2023-10-09 22:31:24.914587: E tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:9342] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2023-10-09 22:31:24.914620: E tensorflow/compiler/xla/stream_executor/cuda/cuda_fft.cc:609] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2023-10-09 22:31:24.914625: E tensorflow/compiler/xla/stream_executor/cuda/cuda_blas.cc:1518] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2023-10-09 22:31:24.919056: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: SSE3 SSE4.1 SSE4.2 AVX AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.

@shamikbosefj

I'm facing the same issue, but it is able to show all the physical devices

python test.py 
2023-10-10 11:43:10.106768: I tensorflow/core/util/port.cc:111] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2023-10-10 11:43:10.138994: E tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:9342] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2023-10-10 11:43:10.139027: E tensorflow/compiler/xla/stream_executor/cuda/cuda_fft.cc:609] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2023-10-10 11:43:10.139047: E tensorflow/compiler/xla/stream_executor/cuda/cuda_blas.cc:1518] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2023-10-10 11:43:10.145045: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX512F AVX512_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
[PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU'), PhysicalDevice(name='/physical_device:GPU:1', device_type='GPU'), PhysicalDevice(name='/physical_device:GPU:2', device_type='GPU'), PhysicalDevice(name='/physical_device:GPU:3', device_type='GPU'), PhysicalDevice(name='/physical_device:GPU:4', device_type='GPU'), PhysicalDevice(name='/physical_device:GPU:5', device_type='GPU'), PhysicalDevice(name='/physical_device:GPU:6', device_type='GPU'), PhysicalDevice(name='/physical_device:GPU:7', device_type='GPU')]
2.14.0
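The oneDNN message in the log above names the environment variable `TF_ENABLE_ONEDNN_OPTS=0`. Such variables have to be set before TensorFlow is imported. The sketch below shows one way to do that from Python; `configure_tf_env` is a hypothetical helper. `TF_CPP_MIN_LOG_LEVEL` is a separate TensorFlow logging variable that is not mentioned in this thread, and whether it hides the "Unable to register ... factory" lines is not confirmed here.

```python
import os


def configure_tf_env(disable_onednn=True, min_log_level=None):
    """Set TensorFlow-related environment variables.

    Must be called before `import tensorflow`, because the variables are
    read when the native library is loaded.
    """
    if disable_onednn:
        # Taken from the oneDNN log message quoted in this thread.
        os.environ["TF_ENABLE_ONEDNN_OPTS"] = "0"
    if min_log_level is not None:
        # Assumption: 0 shows everything; higher values filter more of the C++ log.
        os.environ["TF_CPP_MIN_LOG_LEVEL"] = str(min_log_level)
    return {k: v for k, v in os.environ.items() if k.startswith("TF_")}


configure_tf_env()
# import tensorflow as tf  # import only after the variables are set
```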

@Syndicateeee

I get the same error. rtx 3090 Driver Version: 536.99 cuda: 12.2

@Snailya
Snailya commented Oct 11, 2023

got the same error on win11/wsl2/ubuntu22.04.2LTS.

2023-10-11 08:35:26.693719: E tensorflow/compiler/xla/stream_executor/cuda/cuda_fft.cc:609] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2023-10-11 08:35:26.696431: E tensorflow/compiler/xla/stream_executor/cuda/cuda_blas.cc:1518] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2023-10-11 08:35:26.883564: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-10-11 08:35:28.856732: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT

@gjohnen1

Same issue with Ubuntu 22.04 + Cuda 12.2 + cudnn 8.9 + build from source

alex@pc:~$ python
Python 3.10.12 (main, Jun 11 2023, 05:26:28) [GCC 11.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
2023-10-09 22:31:24.914587: E tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:9342] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2023-10-09 22:31:24.914620: E tensorflow/compiler/xla/stream_executor/cuda/cuda_fft.cc:609] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2023-10-09 22:31:24.914625: E tensorflow/compiler/xla/stream_executor/cuda/cuda_blas.cc:1518] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2023-10-09 22:31:24.919056: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: SSE3 SSE4.1 SSE4.2 AVX AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.

Same here!

@DarshPareek
DarshPareek commented Oct 11, 2023
darsh@darsh-Dell-G15-5520 ~ python3 -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"

2023-10-11 15:56:05.907841: I tensorflow/core/util/port.cc:111] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2023-10-11 15:56:05.930153: E tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:9342] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2023-10-11 15:56:05.930185: E tensorflow/compiler/xla/stream_executor/cuda/cuda_fft.cc:609] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2023-10-11 15:56:05.930202: E tensorflow/compiler/xla/stream_executor/cuda/cuda_blas.cc:1518] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2023-10-11 15:56:05.934830: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-10-11 15:56:07.902179: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:894] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
2023-10-11 15:56:07.905331: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:894] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
2023-10-11 15:56:07.905438: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:894] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
[PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]

Same here! And this is making my kernel crash when I try to do subclassing.

@2EM34E13

Same! Has anyone found a solution?

@dubrovskiykot
dubrovskiykot commented Oct 12, 2023

Same here!

tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:9342] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2023-10-12 17:38:30.347657: E tensorflow/compiler/xla/stream_executor/cuda/cuda_fft.cc:609] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2023-10-12 17:38:30.347695: E tensorflow/compiler/xla/stream_executor/cuda/cuda_blas.cc:1518] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2023-10-12 17:38:30.355820: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.

@benrennison

Same again,

2023-10-13 09:41:44.982611: I tensorflow/core/util/port.cc:111] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable TF_ENABLE_ONEDNN_OPTS=0.
2023-10-13 09:41:45.005469: E tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:9342] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2023-10-13 09:41:45.005514: E tensorflow/compiler/xla/stream_executor/cuda/cuda_fft.cc:609] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2023-10-13 09:41:45.005529: E tensorflow/compiler/xla/stream_executor/cuda/cuda_blas.cc:1518] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2023-10-13 09:41:45.009825: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX512F AVX512_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
/usr/lib/python3/dist-packages/scipy/__init__.py:146: UserWarning: A NumPy version >=1.17.3 and <1.25.0 is required for this version of SciPy (detected version 1.26.0
warnings.warn(f"A NumPy version >={np_minversion} and <{np_maxversion}"

@sachinprasadhs
Contributor

cc: @learning-to-play

@sachinprasadhs sachinprasadhs added the stat:awaiting tensorflower Status - Awaiting response from tensorflower label Oct 13, 2023
@agentcoops

I had this issue and so I downgraded to CUDA 11.8. I modified the following script to handle the downgrade, as the process turned out to be a bit of a pain (lingering files and accidental NVIDIA linux driver installs can cause issues): https://gist.github.com/agentcoops/2c46871c151b32989908361516d08b2a

@McKone
McKone commented Oct 16, 2023

Same error, reproduced also in TF nightly build
Ubuntu 22.04 + CUDA 12.2 + NVIDIA 535.113.01 + cuDNN 8.9.5

Python 3.10.12 (main, Jun 11 2023, 05:26:28) [GCC 11.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
2023-10-16 20:53:16.857979: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:9261] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2023-10-16 20:53:16.858019: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:607] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2023-10-16 20:53:16.858727: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1515] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2023-10-16 20:53:16.862814: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-10-16 20:53:17.441781: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
>>> print(tf.__version__)
2.16.0-dev20231013

@Syndicateeee
Syndicateeee commented Oct 17, 2023

Here is my complete fix. Supposedly it's the newer kernel causing incompatibility issues, so use Ubuntu 20.04 LTS:

#62095 (comment)

@ItayXD
ItayXD commented Oct 23, 2023

Same here. Using NVIDIA 535 on Ubuntu 22.04, I installed TF with pip install tensorflow[and-cuda] and got the same errors.

@CREESTL
CREESTL commented Nov 18, 2023

Installing TF2.9 from this comment fixed the "Unable to register..." errors for me. This bug happens in TF2.14. Just downgrade to 2.9.

@Amyy
Amyy commented Nov 20, 2023

Installing TF2.9 from this comment fixed the "Unable to register..." errors for me. This bug happens in TF2.14. Just downgrade to 2.9.

Still happens in TF 2.15 - downgrading to TF 2.9 also fixed that error.

@ddoddii
ddoddii commented Nov 24, 2023

Same here. I was trying to run
!diffuzers api --port 10000 & npx localtunnel --port 10000 on google colab with T4 GPU,
and got the error message

2023-11-24 03:37:11.007629: E tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:9342] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2023-11-24 03:37:11.007686: E tensorflow/compiler/xla/stream_executor/cuda/cuda_fft.cc:609] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2023-11-24 03:37:11.007726: E tensorflow/compiler/xla/stream_executor/cuda/cuda_blas.cc:1518] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered

I tried
!pip install tensorflow[and-cuda], and it still doesn't seem to work.

@black7375

How about using !pip install tensorflow[and-cuda]==2.13.1?
I use poetry, so your environment may be different, but it worked for me in version 2.13.1.

@shkarupa-alex
Contributor

Same issue with v2.15

@maulberto3

same

@brunomendes1

Same for me.

@ltnetcase

Same for me, with Python 3.11, CUDA 12.3, cuDNN 8.9.5 on CentOS 7.9 with an NVIDIA 3090, driver version 545.23.06, and TensorFlow 2.15.0 installed by pip.

@Lucas-Servi

same

@ymodak ymodak added comp:gpu GPU related issues and removed type:bug Bug labels Nov 29, 2023
@Nuri-Tas

tensorflow is a disappointment

@BinyanHu
BinyanHu commented Dec 1, 2023

@BinyanHu , Could you please try in the Tensorflow 2.15rc0

Tried to install tf2.15 on Ubuntu with python3.11 using:

pip install tensorflow[and-cuda]

and

conda install tensorflow[and-cuda]

Neither command installed tf2.15; both installed tf2.13 or 2.14 instead. I guess that is because cudatoolkit 12.2 (which tf2.15 depends on) is not yet available. (P.S. I am using an HPC system without admin rights, so I rely on conda to handle the CUDA dependencies.)

In case pip install tensorflow[and-cuda] may not find the proper source of cuda dependencies, I tried to install tf2.14 and cuda dependencies manually using:

conda install cudatoolkit=11.8
pip install nvidia-cudnn-cu11==8.7.0.84
conda install -c nvidia cuda-nvcc
pip install tensorflow==2.14

The three lines of errors Unable to register ... persist.
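When pip and conda installs are mixed like this, it can help to list which TensorFlow and NVIDIA wheels pip actually sees. The following is a minimal standard-library sketch; `installed_versions` and its default prefixes are hypothetical, and conda-only packages such as `cudatoolkit` will not necessarily appear because they are not pip distributions.

```python
from importlib import metadata


def installed_versions(prefixes=("tensorflow", "nvidia-")):
    """Map installed distribution names starting with any prefix to their versions."""
    prefixes = tuple(p.lower() for p in prefixes)
    found = {}
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")  # may be None for broken installs
        if name and name.lower().startswith(prefixes):
            found[name] = dist.version
    return dict(sorted(found.items()))


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}=={version}")
```

Comparing this list with the CUDA and cuDNN versions the TensorFlow wheel was built against is one way to spot a mismatch.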

@D-shuoshuo

Can this problem be solved now?

@beijingrong

same

@dominik59

Same

@jhonalino

same

@Syndicateeee

These errors are not impacting performance!!!

I tested ResNet50 on a Google Colab V100 and on a local RTX 3090, and they yield almost the same performance.

https://github.com/tensorflow/tensorflow/issues/62075#issuecomment-1867470232

@nivance
nivance commented Jan 11, 2024

OS: centos 7

NVIDIA driver : latest
cuda: 11-8
tensorflow[and-cuda]==2.13.0

These versions can work

@ArbiterGe

Same, no solution.

@asheerali

tensorflow[and-cuda]==2.13.0

it says no match

@Cihagnir
Cihagnir commented Feb 2, 2024

I am not sure it will work for everyone, but it worked for me at least. I had the same problem. After my install finished, I ran the command "sudo apt-get install -y cuda-drivers".

Screenshot from 2024-02-02 12-50-20

During my install process, I ran the command below. If you ran the command above during your first install, you can try running the command below. As I said, I am not sure it will work, but it solved my problem at least. I hope it works for you.
Also, I am probably the rookiest guy in the chat, so take that into account when trying this solution.

@mozturan
mozturan commented Feb 3, 2024

I have been using Ubuntu 20.04 since September, and the same error was always there with TensorFlow 2.14. Although I was using the suggested driver and CUDA versions, I couldn't resolve it, so sadly I had to run on CPU for months.

I switched to Fedora 39 and tried installing the drivers, CUDA, cuDNN, etc. with three different methods: Fedora RPM repos, NVIDIA's repos, and .run files. All of them give the same error, just like Ubuntu 20.04, with TensorFlow 2.14 and 2.15:

Python: 3.11.7 and 3.10.13 (tried both)
Cuda: 12.3

2024-02-03 23:19:07.962719: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:9261] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN>
2024-02-03 23:19:07.962748: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:607] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT >
2024-02-03 23:19:07.963406: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1515] Unable to register cuBLAS factory: Attempting to register factory for plugin cuB>
2024-02-03 23:19:07.967310: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critica>
To enable the following instructions: AVX2 AVX_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-02-03 23:19:08.464693: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
2024-02-03 23:19:08.802898: I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:901] successful NUMA node read from SysFS had negative value (-1), but there must >
2024-02-03 23:19:08.826886: I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:901] successful NUMA node read from SysFS had negative value (-1), but there must >
2024-02-03 23:19:08.827011: I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:901] successful NUMA node read from SysFS had negative value (-1), but there must >
[PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]

TF finds the GPU but can't use cuDNN, and this error has been there for months. I tried the nightly version; even though I confirmed that CUDA is installed correctly and works fine, I get this error:

2024-02-03 22:44:56.398757: I tensorflow/core/util/port.cc:113] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2024-02-03 22:44:56.426233: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-02-03 22:44:56.882016: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
2024-02-03 22:44:57.163106: I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:1000] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
2024-02-03 22:44:57.184565: W tensorflow/core/common_runtime/gpu/gpu_device.cc:2251] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
Skipping registering GPU devices...
[]

This time the GPU cannot be found, yet 2.15 was able to find it with the same setup? Does anyone have a solution, please? I have spent days trying to solve this. I am thinking of switching to PyTorch, since the TensorFlow team has not handled this issue for months.

By the way, @asheerali, I believe 2.13 doesn't come with the CUDA libraries (2.14 and later do), so just download the cuDNN libraries and set them up as the guide on NVIDIA's site says, then pip install tensorflow==2.13. I haven't tried this yet, but I believe it should work this way.
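The "Cannot dlopen some GPU libraries" warning in the nightly log above means some shared libraries could not be loaded. A rough way to see which CUDA libraries the system linker can resolve is sketched below with `ctypes`. The list `CUDA_LIBS` and the function `probe_cuda_libraries` are assumptions for illustration. TensorFlow performs its own lookup with versioned names, and pip-installed `nvidia-*` wheels live inside site-packages, so a miss here does not prove that TensorFlow cannot find the library.

```python
import ctypes
import ctypes.util

# Short names as understood by the system linker; an assumed, incomplete list.
CUDA_LIBS = ("cudart", "cublas", "cufft", "cudnn")


def probe_cuda_libraries(names=CUDA_LIBS):
    """Report, per library, the resolved file name and whether it could be loaded."""
    report = {}
    for name in names:
        resolved = ctypes.util.find_library(name)  # None when the linker finds nothing
        loadable = False
        if resolved is not None:
            try:
                ctypes.CDLL(resolved)
                loadable = True
            except OSError:
                pass  # found on disk but failed to load (e.g. a missing dependency)
        report[name] = {"resolved": resolved, "loadable": loadable}
    return report


if __name__ == "__main__":
    for name, info in probe_cuda_libraries().items():
        print(f"{name}: {info}")
```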

@ManzarIMalik

No solution so far.

@ybhwang
ybhwang commented Mar 6, 2024

Same Issue... tensorflow == 1.15.0.post1

Latest version of tensorflow[and-cuda]

Environment

  • Ubuntu Server 22.04 LTS (x86_64)
  • Python 3.10.12 (OS Default)
  • CUDA 12.2.140 (repo from NVIDIA with APT)

@egehancosgun

Same problem with me as well.

@JasOleander

Still Issue...
