Fix gemm_fusion_autotuner_test on Hopper #14796

sergey-kozub · 2024-07-11T10:15:38Z

Updated result type and error thresholds for the SelectsSplitK test.
Previously this failed on Hopper.
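For readers unfamiliar with why a split-K test needs its own error thresholds: a split-K GEMM sums partial dot products over chunks of the K dimension, so its rounding can differ from a single-pass accumulation. The PR text does not state the exact cause of the Hopper failure, so the following is only a minimal sketch of that general effect. It assumes float32 accumulation simulated with `struct`; all helper names (`f32`, `dot_f32`, `dot_split_k`, `is_close`) are hypothetical and not part of XLA.

```python
import math
import struct


def f32(x):
    """Round a Python float (float64) to the nearest float32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def dot_f32(a, b):
    """Single-pass dot product with a simulated float32 accumulator."""
    acc = 0.0
    for x, y in zip(a, b):
        acc = f32(acc + f32(x * y))
    return acc


def dot_split_k(a, b, split_k):
    """Dot product computed as split_k partial sums that are reduced afterwards."""
    n = len(a)
    chunk = (n + split_k - 1) // split_k  # ceiling division
    partials = [dot_f32(a[i:i + chunk], b[i:i + chunk]) for i in range(0, n, chunk)]
    acc = 0.0
    for p in partials:
        acc = f32(acc + p)
    return acc


def is_close(actual, expected, rtol, atol):
    """Combined relative/absolute tolerance check (hypothetical helper)."""
    return abs(actual - expected) <= atol + rtol * abs(expected)


# Illustrative inputs only; not taken from the SelectsSplitK test.
a = [f32(0.1 * i) for i in range(64)]
b = [f32(1.0 / (i + 1)) for i in range(64)]
reference = math.fsum(x * y for x, y in zip(a, b))  # float64 reference

single_pass = dot_f32(a, b)
split = dot_split_k(a, b, split_k=4)
```

Both results stay close to the float64 reference but are not guaranteed to match each other bit for bit, which is why a test that compares a split-K kernel against a reference needs tolerances that match the result type.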

Imported from GitHub PR openxla/xla#14796

Updated result type and error thresholds for the SelectsSplitK test.
Previously this failed on Hopper.

Copybara import of the project:

--
5005f288b67a2a34ec643cfcc3fbae815b5f0ef6 by Sergey Kozub <skozub@nvidia.com>:

Fix gemm_fusion_autotuner_test on Hopper

Merging this change closes #14796

FUTURE_COPYBARA_INTEGRATE_REVIEW=openxla/xla#14796 from openxla:skozub/gemm_fusion_autotuner_test 5005f288b67a2a34ec643cfcc3fbae815b5f0ef6
PiperOrigin-RevId: 651348522

We already convert the Triton GPU dialect to NVVM in TritonGPUTOLLVMPass, but SparseDot is lowered afterwards and that lowering generates a gpu.thread_id, so add a pattern that converts it to NVVM as well.

FUTURE_COPYBARA_INTEGRATE_REVIEW=openxla/xla#14796 from openxla:skozub/gemm_fusion_autotuner_test 5005f288b67a2a34ec643cfcc3fbae815b5f0ef6
PiperOrigin-RevId: 650924687

This is needed for M_LN2l. Without the include, the build fails on macOS.

FUTURE_COPYBARA_INTEGRATE_REVIEW=openxla/xla#14796 from openxla:skozub/gemm_fusion_autotuner_test 5005f288b67a2a34ec643cfcc3fbae815b5f0ef6
PiperOrigin-RevId: 651373744

`DetectUnusedVariables` can be expensive, but often we don't have symbols in the indexing map at all, so there is nothing to remove.

FUTURE_COPYBARA_INTEGRATE_REVIEW=openxla/xla#14796 from openxla:skozub/gemm_fusion_autotuner_test 5005f288b67a2a34ec643cfcc3fbae815b5f0ef6
PiperOrigin-RevId: 651370098

- Inline Clear and Consume into Start and Stop.
- Replace internal::g_trace_level with a private class, TraceLevel.
- Rename SplitEventTracker to IncompleteEventTracker.

FUTURE_COPYBARA_INTEGRATE_REVIEW=openxla/xla#14796 from openxla:skozub/gemm_fusion_autotuner_test 5005f288b67a2a34ec643cfcc3fbae815b5f0ef6
PiperOrigin-RevId: 645012398

Imported from GitHub PR openxla#14796

Updated result type and error thresholds for the SelectsSplitK test.
Previously this failed on Hopper.

Copybara import of the project:

--
5005f28 by Sergey Kozub <skozub@nvidia.com>:

Fix gemm_fusion_autotuner_test on Hopper

Merging this change closes openxla#14796

COPYBARA_INTEGRATE_REVIEW=openxla#14796 from openxla:skozub/gemm_fusion_autotuner_test 5005f28
PiperOrigin-RevId: 651359673

Fix gemm_fusion_autotuner_test on Hopper

5005f28

sergey-kozub requested review from akuegel and ddunl July 11, 2024 10:15

akuegel approved these changes Jul 11, 2024

copybara-service bot mentioned this pull request Jul 11, 2024

PR #14796: Fix gemm_fusion_autotuner_test on Hopper tensorflow/tensorflow#71656

Merged

copybara-service bot pushed a commit that referenced this pull request Jul 11, 2024

Automated Code Change

e36e15a

FUTURE_COPYBARA_INTEGRATE_REVIEW=#14796 from openxla:skozub/gemm_fusion_autotuner_test 5005f28 PiperOrigin-RevId: 650317473

copybara-service bot mentioned this pull request Jul 11, 2024

Automated Code Change #14637

Merged

copybara-service bot closed this in d8f3bc8 Jul 11, 2024

copybara-service bot mentioned this pull request Jul 11, 2024

Automated Code Change tensorflow/tensorflow#71160

Merged

copybara-service bot mentioned this pull request Jul 11, 2024

PR #14796: Fix gemm_fusion_autotuner_test on Hopper tensorflow/tensorflow#71183

Draft

copybara-service bot mentioned this pull request Jul 11, 2024

Automated Code Change tensorflow/tensorflow#71557

Merged

copybara-service bot mentioned this pull request Jul 11, 2024

Automated Code Change tensorflow/tensorflow#71427

Merged

copybara-service bot mentioned this pull request Jul 11, 2024

Add gpu.thread_id conversion to nvvm after sparse dot lowering tensorflow/tensorflow#71555

Merged

copybara-service bot mentioned this pull request Jul 11, 2024

Automated Code Change tensorflow/tensorflow#71299

Merged

copybara-service bot mentioned this pull request Jul 11, 2024

Automated Code Change tensorflow/tensorflow#71441

Merged

copybara-service bot mentioned this pull request Jul 11, 2024

Automated Code Change tensorflow/tensorflow#71413

Draft

copybara-service bot mentioned this pull request Jul 11, 2024

Automated Code Change tensorflow/tensorflow#71364

Merged

copybara-service bot mentioned this pull request Jul 11, 2024

[XLA:GPU] Use llvm::SmallVector instead of std::vector. tensorflow/tensorflow#71575

Merged

copybara-service bot mentioned this pull request Jul 11, 2024

Automated Code Change tensorflow/tensorflow#71649

Merged

copybara-service bot mentioned this pull request Jul 11, 2024

[XLA:GPU] Remove sparse pass from ROCm Triton emitter tensorflow/tensorflow#71661

Merged

copybara-service bot mentioned this pull request Jul 11, 2024

Add missing header include. tensorflow/tensorflow#71664

Merged

copybara-service bot mentioned this pull request Jul 11, 2024

[XLA:GPU] Return early in RemoveUnusedSymbols/Dimensions. tensorflow/tensorflow#71665

Merged

copybara-service bot mentioned this pull request Jul 11, 2024

Cleanup traceme_recorder tensorflow/tensorflow#70110

Draft
