tf.device context manager does not restore cudaCurrentDevice under some conditions #61911
Labels
comp:apis (Highlevel API related issues)
comp:gpu (GPU related issues)
stat:awaiting tensorflower (Status - Awaiting response from tensorflower)
TF 2.13 (For issues related to Tensorflow 2.13)
type:bug (Bug)
Issue type
Bug
Have you reproduced the bug with TensorFlow Nightly?
No
Source
binary
TensorFlow version
2.13
Custom code
No
OS platform and distribution
Linux Ubuntu 20.04.5 LTS
Mobile device
No response
Python version
3.10
Bazel version
No response
GCC/compiler version
No response
CUDA/cuDNN version
12.0
GPU model and memory
No response
Current behavior?
When using the tf.device context manager, the current device of the CUDA runtime remains "dirty" even after exiting the context manager. This happens when:
1. TensorFlow initializes the GPU context on this line (tf.device), and
2. no tensors are materialized on the GPU.

For context, keeping the current device state clean is important for keeping TensorFlow in sync with other GPU-based libraries such as cuDF. RMM memory allocators also depend on the assumption that the context stays the same throughout the lifetime of their allocations.
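Until the behavior is fixed, one possible workaround is to save and restore the current device around the tf.device block by hand. The following is only a minimal sketch, not code from this report. The helper names (load_cudart, make_cuda_accessors, preserve_current_device) are hypothetical, and the library name "libcudart.so" is an assumption that may need a versioned suffix on a given system. It also assumes the ctypes handle refers to the same CUDA runtime instance that TensorFlow uses, which may not hold for every build. Only cudaGetDevice and cudaSetDevice from the CUDA runtime API are called.

```python
import contextlib
import ctypes


def load_cudart(name="libcudart.so"):
    """Return a ctypes handle to the CUDA runtime, or None if it cannot be loaded.

    The default library name is an assumption; adjust it for the local install.
    """
    try:
        return ctypes.CDLL(name)
    except OSError:
        return None


def make_cuda_accessors(lib):
    """Build (get_device, set_device) callables on top of a loaded CUDA runtime."""

    def get_device():
        dev = ctypes.c_int(-1)
        err = lib.cudaGetDevice(ctypes.byref(dev))
        if err != 0:
            raise RuntimeError(f"cudaGetDevice failed with error code {err}")
        return dev.value

    def set_device(index):
        err = lib.cudaSetDevice(ctypes.c_int(index))
        if err != 0:
            raise RuntimeError(f"cudaSetDevice failed with error code {err}")

    return get_device, set_device


@contextlib.contextmanager
def preserve_current_device(get_device, set_device):
    """Record the current device on entry and put it back on exit if it changed."""
    saved = get_device()
    try:
        yield saved
    finally:
        # Restore only when something inside the block left a different device set.
        if get_device() != saved:
            set_device(saved)


# Hypothetical usage (requires a GPU machine with TensorFlow installed):
#
#   import tensorflow as tf
#   lib = load_cudart()
#   if lib is not None:
#       get_device, set_device = make_cuda_accessors(lib)
#       with preserve_current_device(get_device, set_device):
#           with tf.device("/GPU:1"):
#               pass  # no tensors materialized on the GPU
#       print(get_device())  # compare with the value seen before the block
```

The guard takes the getter and setter as arguments so that the same wrapper can be exercised without a GPU, or pointed at another library's device API if cudart is not reachable through ctypes.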
Standalone code to reproduce the issue
Relevant log output