[NVIDIA TF] Support building against CUDA 12.0 #58867
As of CUDA 12, CUSPARSE_MM_ALG_DEFAULT is replaced by CUSPARSE_SPMM_ALG_DEFAULT.
Hi @cantonios, can you please review this PR? Thank you!
Fix #2176. See also tensorflow/tensorflow#58867. Note that CUDA Toolkit 12.0 requires CUDA driver 525.60.13. Signed-off-by: Jinzhe Zeng <jinzhe.zeng@rutgers.edu>
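The driver requirement in the note above can be checked with a plain version comparison. The following is a minimal sketch, not part of the PR: the helper names are hypothetical, the minimum version is taken verbatim from the note, and how the installed driver version string is obtained is left outside the sketch.

```python
from typing import Tuple

# Minimum driver quoted in the note above for CUDA Toolkit 12.0.
MIN_DRIVER_FOR_CUDA_12_0 = (525, 60, 13)


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn a dotted version string such as '525.60.13' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def driver_supports_cuda_12_0(driver_version: str) -> bool:
    """Hypothetical helper: True if the driver meets the quoted minimum.

    Tuples compare element by element, so '525.85' counts as newer than
    '525.60.13', while a bare '525' counts as older.
    """
    return parse_version(driver_version) >= MIN_DRIVER_FOR_CUDA_12_0
```

For example, `driver_supports_cuda_12_0("530.30.02")` is true under this comparison, while `driver_supports_cuda_12_0("520.61.05")` is false.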
Fixes issue where minor version was incorrectly included in dso name with cuda 12.
Added a fix to use cupti_version (major version only for CUDA 12 and later) rather than cuda_version (major.minor version) to load the libcupti DSO.
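The version rule described in this comment can be illustrated with a short sketch. This is not the code from the PR (which lives in TensorFlow's build and loader logic); the function names are hypothetical, and the `libcupti.so.<version>` file name assumes the usual Linux shared-library naming.

```python
def cupti_dso_version(cuda_version: str) -> str:
    """Return the version suffix used to locate libcupti.

    Hypothetical helper mirroring the rule in the comment above:
    CUDA 12 and later use the major version only, while earlier
    toolkits keep major.minor.
    """
    parts = cuda_version.split(".")
    major = int(parts[0])
    if major >= 12:
        return str(major)
    # Treat a missing minor component as 0 (an assumption of this sketch).
    minor = parts[1] if len(parts) > 1 else "0"
    return f"{major}.{minor}"


def cupti_dso_name(cuda_version: str) -> str:
    """Build the shared-object name, assuming Linux naming conventions."""
    return f"libcupti.so.{cupti_dso_version(cuda_version)}"
```

With this rule, a toolkit version of `12.0` maps to `libcupti.so.12`, whereas `11.8` maps to `libcupti.so.11.8`.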
@reedwm is this blocked? Anything I can do on my side to help?
It was blocked, but now it simply needs to be approved internally. It will probably be merged Monday.
Fix deepmodeling#2176. See also tensorflow/tensorflow#58867. Note that CUDA Toolkit 12.0 requires CUDA driver 525.60.13. Signed-off-by: Jinzhe Zeng <jinzhe.zeng@rutgers.edu>
Imported from GitHub PR tensorflow/tensorflow#58867

This PR updates TensorFlow to build against CUDA 12.0. Most changes are minor, with the exception of replacing the csrGemmV2 APIs with SpGEMM, since the former was removed from cusparse 12.0.

Attn: @hawkinsp

Copybara import of the project:

--
be94c459eaffd51e9cf1f96c13385f6fed9d6752 by Nathan Luehr <nluehr@nvidia.com>:
Use major version for cupti lib with CUDA>=12

--
9145a60e65165a517a3789bcb49b79c871f67641 by Nathan Luehr <nluehr@nvidia.com>:
Update CUDA stub libraries for CUDA 12

--
dee09e90a6a88caf94a1df7f6ca16bf5e5a1336f by Nathan Luehr <nluehr@nvidia.com>:
Migrate remaining calls to cusparseCsrmvEx to cusparseSpMV for CUDA 12

--
6481f1a35a5b153222ba280d1d74ee5239e90180 by Nathan Luehr <nluehr@nvidia.com>:
Switch from CUSPARSE_CSR2CSC_ALG2 to CUSPARSE_CSR2CSC_ALG1 for CUDA >= 12

--
4ff3164858c6b7741bfa494b1a40c65eae0fe171 by Nathan Luehr <nluehr@nvidia.com>:
Replace csrGemmV2 with calls to SpGEMM APIs when compiling for CUDA 12.

--
4aa8b6fab9a1b19c9ca8936aee5ec54eaeed54b1 by Nathan Luehr <nluehr@nvidia.com>:
Update algorithm enum sparse mat_mul_op for CUDA 12
As of CUDA 12, CUSPARSE_MM_ALG_DEFAULT is replaced by CUSPARSE_SPMM_ALG_DEFAULT.

--
25656bd776c759316db7ded528d50f6cb4c04266 by Nathan Luehr <nluehr@nvidia.com>:
Bump NCCL version to 2.16.2 to support CUDA 12 and NVIDIA Hopper GPUs.

--
c3b2dbbea466b54309c677b639387031c1e48604 by Nathan Luehr <nluehr@nvidia.com>:
Update cudaGraph APIs for CUDA 12.

--
51cb95a2ce37988f0d6bb6f100ffb0cfdfaa8291 by Nathan Luehr <nluehr@nvidia.com>:
Reduce memory overheads in sparse-sparse matmul.
Memory reduction comes at the cost of an additional device-side copy to concat the gemm results across the batch.

--
ae70777421a2c7f603171484a530bfa2143eedec by Nathan Luehr <nluehr@nvidia.com>:
Guard cuda_blas_utils include to fix ROCM build.

--
4a04c65383f333fc23d70dc72e8a76b605ccc465 by Nathan Luehr <nluehr@nvidia.com>:
Load cupti dso using correct version.
Fixes issue where minor version was incorrectly included in dso name with cuda 12.

Merging this change closes #58867

PiperOrigin-RevId: 502803087
Does this cover CUDA 12 in WSL2? I ran into an issue with TensorFlow right after pip install: the verification command was looking for the CUDA 11 library instead of the CUDA 12 one installed on the WSL2 instance. See #59413.
This PR enables building TF from source against CUDA 12. The nightly and release builds available from PyPI are currently still built against CUDA 11.8.
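To see which CUDA version an installed TensorFlow package was built against, one option is to inspect its build metadata. The sketch below assumes that `tf.sysconfig.get_build_info()` is available in the installed version and that it exposes the `is_cuda_build` and `cuda_version` keys; the `describe_cuda_build` helper is hypothetical and only formats that dictionary.

```python
def describe_cuda_build(build_info: dict) -> str:
    """Summarize the CUDA-related fields of a TensorFlow build-info dict.

    Hypothetical helper; the key names are assumed to match what
    tf.sysconfig.get_build_info() returns.
    """
    if not build_info.get("is_cuda_build", False):
        return "not a CUDA build"
    cuda = build_info.get("cuda_version", "unknown")
    return f"built against CUDA {cuda}"


if __name__ == "__main__":
    try:
        import tensorflow as tf
    except ImportError:
        print("TensorFlow is not installed")
    else:
        # Copy the mapping into a plain dict before formatting it.
        print(describe_cuda_build(dict(tf.sysconfig.get_build_info())))
```

If the reported version does not match the CUDA libraries installed on the machine, that mismatch is consistent with the situation described above, where a PyPI wheel looks for CUDA 11 libraries on a system that only has CUDA 12.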
Is there a plan to provide docker images that work with CUDA 12.0? Thank you.
Can anyone share the plan for when this will be released?
I tried
@alanwilter Presently you need to build TensorFlow from source to use it with CUDA 12.x. Either the master branch or the r2.12 release branch will build against CUDA 12.
Is there a timeline for building the official wheels against CUDA 12.x?
These 12.x packages need to be built on PyPI. CUDA 11.x is deprecated in some systems.
@Talador12 can you provide more information about where CUDA 11.8 is deprecated?
There could be a few reasons, chiefly FedRAMP compliance and the use of current Debian versions, namely Debian Bookworm. I was surprised that TensorFlow did not have prebuilt CUDA 12 support on PyPI.
Is there an update on this issue? There is still a need for a CUDA 12 build of TensorFlow on PyPI.
We unfortunately do not yet have official pip wheels with CUDA 12. It's possible that TensorFlow 2.14 will be built with CUDA 12, but this is not guaranteed.
Could we re-open this issue? This has not been resolved yet.
This is a PR that has been merged, not an issue, so it cannot be reopened. We have not yet released pip packages with CUDA 12 support, but are working on this. Feel free to file a new GitHub issue requesting CUDA 12 pip packages (please CC me on the issue if you file it).
Apologies - I thought the Python package would be built using the merged code in this pull request. I created a separate issue for the Python package at #60943.
This PR updates TensorFlow to build against CUDA 12.0. Most changes are minor, with the exception of replacing the csrGemmV2 APIs with SpGEMM, since the former was removed from cusparse 12.0.
Attn: @hawkinsp