Test TensorFloat32 with conv2d #46168
Comments
I tried this in Colab with the GPU build of TF 2.4 and did not notice any major performance issue. Please find the gist here.
I saved the code to a file, "test_conv.py", and ran the command "nsys nvprof python3 test_conv.py" in a terminal. This is part of the output:

Time(%)  Total Time  Instances  Average  Minimum  Maximum  Name
96.7     1927958     28         68855.6  67072    71072    redzone_checker
Hi @WangTuoxyty, the benchmark you're using is very small, so enabling or disabling TF32 will not make a big difference. Do you see the same issue when you try larger convolutions?
I changed the shape of x_in from [1, 5, 5, 1] to [10, 5000, 5000, 1], but the result is the same.
Any updates on this issue? I have run into a similar situation. First, I tested a simple MatMul example (as given here: https://www.tensorflow.org/api_docs/python/tf/config/experimental/enable_tensor_float_32_execution). The TF32 and FP32 results do differ, as expected, which indicates that my environment uses TF32 mode by default. However, when I used tf.nn.conv2d() with randomly generated float32 data to test the accuracy of convolution under TF32 mode, the TF32 and FP32 results turned out to be identical. It seems that tf.nn.conv2d() failed to activate TF32?
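To make the expectation in the comment above concrete, here is a minimal pure-Python sketch of why a TF32 result would normally differ from an FP32 one. It assumes TF32 keeps 10 explicit mantissa bits of each float32 input (float32 has 23) and emulates that with round-to-nearest on the bit pattern. The helper names `round_to_fp32`, `round_to_tf32`, and `dot` are hypothetical and not part of TensorFlow, the rounding of ties is simplified, and the accumulation is done in Python floats rather than FP32, so this only illustrates the input rounding, not the exact hardware behavior.

```python
import struct


def round_to_fp32(x: float) -> float:
    """Round a Python float (double) to the nearest float32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def round_to_tf32(x: float) -> float:
    """Emulate TF32 input rounding: float32 with only 10 mantissa bits kept.

    Assumption: round-to-nearest on the 13 dropped bits; inf/NaN inputs
    are not handled specially in this sketch.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # Add half of the dropped range, then clear the 13 low mantissa bits.
    bits = (bits + 0x1000) & 0xFFFFE000
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def dot(xs, ys, rounder):
    """Dot product with each input passed through `rounder` first."""
    return sum(rounder(a) * rounder(b) for a, b in zip(xs, ys))


# 1 + 2**-12 is exactly representable in float32 but not in 10 mantissa bits.
x = [1.0 + 2.0 ** -12]
y = [3.0]
fp32_result = dot(x, y, round_to_fp32)
tf32_result = dot(x, y, round_to_tf32)
print(fp32_result, tf32_result, fp32_result == tf32_result)
```

If a convolution on random float32 data gives bit-identical outputs in both modes, one possible reading is that the kernel selected for it did not apply this kind of input rounding; the sketch does not establish which kernel TensorFlow actually chose.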
System information
You can collect some of this information using our environment capture script.
You can also obtain the TensorFlow version with:
TF 1.0: python -c "import tensorflow as tf; print(tf.GIT_VERSION, tf.VERSION)"
TF 2.0: python -c "import tensorflow as tf; print(tf.version.GIT_VERSION, tf.version.VERSION)"
Describe the current behavior
I ran this on an RTX 3090 and profiled it with Nsight Systems. With tf.config.experimental.enable_tensor_float_32_execution(True), the conv2d kernels are no faster than with tf.config.experimental.enable_tensor_float_32_execution(False).
Describe the expected behavior
With tf.config.experimental.enable_tensor_float_32_execution(True), the conv2d kernels should have higher performance.
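One way to compare the two settings outside a profiler is a small timing harness like the sketch below. It is not the reporter's script: the helper names `time_fn`, `bench_conv2d`, and `main` are hypothetical, and the default tensor shapes are assumptions chosen only to give the GPU a non-trivial amount of work. Warm-up calls are excluded from the measurement so that one-time setup on the first calls does not dominate, and a scalar is read back to the host on every step so that asynchronous GPU work is finished before the clock stops.

```python
import time


def time_fn(fn, warmup=3, iters=10):
    """Return mean wall-clock seconds per call of fn, excluding warm-up calls."""
    for _ in range(warmup):
        fn()
    start = time.perf_counter()
    for _ in range(iters):
        fn()
    return (time.perf_counter() - start) / iters


def bench_conv2d(tf32_enabled, in_shape=(8, 256, 256, 64), k_shape=(3, 3, 64, 64)):
    """Time tf.nn.conv2d with TF32 execution switched on or off.

    The shapes are illustrative assumptions, not the ones from the issue.
    """
    import tensorflow as tf  # imported lazily; time_fn itself needs no TensorFlow

    tf.config.experimental.enable_tensor_float_32_execution(tf32_enabled)
    x = tf.random.normal(in_shape)
    k = tf.random.normal(k_shape)

    def step():
        y = tf.nn.conv2d(x, k, strides=1, padding="SAME")
        # Copying a scalar to the host forces the convolution to complete.
        tf.reduce_sum(y).numpy()

    return time_fn(step)


def main():
    for enabled in (False, True):
        ms = bench_conv2d(enabled) * 1e3
        print(f"TF32 enabled={enabled}: {ms:.3f} ms per step")
```

Calling `main()` on a machine with TensorFlow and a supported GPU prints one timing per setting. The reduction adds the same overhead to both runs, so only the difference between the two numbers is meaningful.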
Standalone code to reproduce the issue
Provide a reproducible test case that is the bare minimum necessary to generate
the problem. If possible, please share a link to Colab/Jupyter/any notebook.
Other info / logs
Include any logs or source code that would be helpful to
diagnose the problem. If including tracebacks, please include the full
traceback. Large logs and files should be attached.