tf.data.Dataset prefetch not fetching data asynchronously #61084
Labels
comp:data (tf.data related issues)
stat:awaiting tensorflower (Status - Awaiting response from tensorflower)
TF 2.11 (Issues related to TF 2.11)
type:bug (Bug)
type:performance (Performance Issue)
Issue Type
Bug
Have you reproduced the bug with TF nightly?
No
Source
source
Tensorflow Version
2.11
Custom Code
Yes
OS Platform and Distribution
Debian/Linux 11
Mobile device
No response
Python version
3.7
Bazel version
No response
GCC/Compiler version
No response
CUDA/cuDNN version
No response
GPU model and memory
No response
Current Behaviour?
After implementing a data pipeline with tf.data.Dataset to pull image data from Google Cloud Storage, the TensorBoard profiler shows GPU compute and CPU prefetch running synchronously, one after the other, rather than overlapping. I used data.Dataset.AUTOTUNE to set the prefetch buffer size. Monitoring GPU usage while the model runs confirms this: the GPU sits at 0% utilization roughly twice as long as it spends computing (about a 2:1 ratio), which matches the profiler. CPU usage does not appear to max out either.
I expected prefetching to run concurrently with GPU processing, as described in the tf.data.Dataset documentation and tutorials.
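For readers unfamiliar with the expected behaviour, here is a minimal standard-library sketch of what "prefetching concurrently with compute" means. It is not the reporter's pipeline and not how tf.data implements prefetch; `load`, `train_step`, `prefetched`, and the sleep durations are hypothetical stand-ins for the input step, the GPU step, and a bounded prefetch buffer.

```python
import queue
import threading
import time


def load(i, delay=0.05):
    """Hypothetical stand-in for an I/O-bound input step (e.g. reading one image)."""
    time.sleep(delay)
    return i


def train_step(x, delay=0.05):
    """Hypothetical stand-in for the accelerator-side compute on one element."""
    time.sleep(delay)
    return x * 2


def prefetched(n, buffer_size=2):
    """Yield load(0), ..., load(n-1), loading on a background thread."""
    buf = queue.Queue(maxsize=buffer_size)  # bounded buffer, like a prefetch buffer
    done = object()  # sentinel marking the end of the stream

    def producer():
        for i in range(n):
            buf.put(load(i))  # blocks when the buffer is full
        buf.put(done)

    threading.Thread(target=producer, daemon=True).start()
    while True:
        item = buf.get()
        if item is done:
            return
        yield item


def run(source):
    """Consume every element with train_step and report the wall-clock time."""
    start = time.perf_counter()
    results = [train_step(x) for x in source]
    return results, time.perf_counter() - start


N = 8
# Synchronous: each element is loaded only when the consumer asks for it.
sync_results, sync_time = run(load(i) for i in range(N))
# Prefetched: the next elements are loaded while the current one is processed.
async_results, async_time = run(prefetched(N))
print(f"synchronous: {sync_time:.2f}s, prefetched: {async_time:.2f}s")
```

With overlap, total time approaches the slower of the two stages rather than their sum. The profile described above looks like the synchronous case.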
Standalone code to reproduce the issue
Relevant log output
No response