
//tensorflow/compiler/xla/service/cpu:vectorized_reduce_with_no_vector_registers_test fails on s390x #44713

Open
skribm9 opened this issue Nov 9, 2020 · 7 comments

Labels: comp:xla XLA · stat:awaiting tensorflower Status - Awaiting response from tensorflower · TF 2.9 Issues found in the TF 2.9 release (or RCs) · type:bug Bug

skribm9 (Contributor) commented Nov 9, 2020

System information

  • Have I written custom code (as opposed to using a stock example script provided in TensorFlow): No
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Ubuntu 18.04
  • Mobile device (e.g. iPhone 8, Pixel 2, Samsung Galaxy) if the issue happens on mobile device: N/A
  • TensorFlow installed from (source or binary): source
  • TensorFlow version (use command below): 2.3.1
  • Python version: 3.6.9
  • Bazel version (if compiling from source): 3.4.1
  • GCC/Compiler version (if compiling from source): Ubuntu 7.5.0-3ubuntu1~18.04
  • CUDA/cuDNN version: N/A
  • GPU model and memory: N/A

Describe the current behavior
I am running TensorFlow 2.3.1 on s390x, and //tensorflow/compiler/xla/service/cpu:vectorized_reduce_with_no_vector_registers_test fails when running the Bazel tests.
The test case was initially failing with the error: Internal: TargetRegistry::lookupTarget failed: No available targets are compatible with triple "x86_64-pc-linux"

To fix this, I added support for s390x in test_target_triple_helper.h.
This is the code change I made:

diff --git a/tensorflow/compiler/xla/service/cpu/test_target_triple_helper.h b/tensorflow/compiler/xla/service/cpu/test_target_triple_helper.h
index 857de4a814..1fb04f821a 100644
--- a/tensorflow/compiler/xla/service/cpu/test_target_triple_helper.h
+++ b/tensorflow/compiler/xla/service/cpu/test_target_triple_helper.h
@@ -21,8 +21,13 @@ limitations under the License.
 static const char kTargetCpuForHost[] = "ppc";
 static const char kTargetTripleForHost[] = "ppc64le-ibm-linux-gnu";
 #else
+#if (defined(__s390x__) && (__BYTE_ORDER__ == __ORDER_BIG_ENDIAN__))
+static const char kTargetCpuForHost[] = "";
+static const char kTargetTripleForHost[] = "systemz-none-linux-gnu"; 
+#else
 static const char kTargetCpuForHost[] = "";
 static const char kTargetTripleForHost[] = "x86_64-pc-linux";
 #endif
+#endif

 #endif

Even after making this change, the test case still fails. LLVM now identifies systemz-none-linux-gnu as the target, but it is unable to compute vector_register_byte_size.
The error output looks like this:

==================== Test output for //tensorflow/compiler/xla/service/cpu:vectorized_reduce_with_no_vector_registers_test:
[==========] Running 1 test from 1 test suite.
[----------] Global test environment set-up.
[----------] 1 test from CodegenReduceOnArchWithNoVectorRegisters
[ RUN      ] CodegenReduceOnArchWithNoVectorRegisters.Test
2020-11-09 22:19:12.755885: I tensorflow/core/platform/profile_utils/cpu_utils.cc:104] CPU Frequency: 1555500000 Hz
tensorflow/compiler/xla/service/cpu/vectorized_reduce_with_no_vector_registers_test.cc:86: Failure
Expected equality of these values:
  vector_register_byte_size_for_x86_64
    Which is: 0
  16
[  FAILED  ] CodegenReduceOnArchWithNoVectorRegisters.Test (4 ms)
[----------] 1 test from CodegenReduceOnArchWithNoVectorRegisters (4 ms total)

[----------] Global test environment tear-down
[==========] 1 test from 1 test suite ran. (4 ms total)
[  PASSED  ] 0 tests.
[  FAILED  ] 1 test, listed below:
[  FAILED  ] CodegenReduceOnArchWithNoVectorRegisters.Test
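
The failing assertion is the test's sanity check that the target's vector registers are 16 bytes wide (as on x86_64 with SSE); here LLVM reports 0 instead. Below is a minimal sketch, against the LLVM 12-era C++ API, of how such a value can be queried. The function QueryVectorRegisterByteSize and its plumbing are illustrative, not the exact code of the XLA test. (In LLVM 13+, getRegisterBitWidth takes a RegisterKind enum and returns a TypeSize; from LLVM 14, TargetRegistry.h lives under llvm/MC/.)

#include <memory>
#include <string>

#include "llvm/Analysis/TargetTransformInfo.h"
#include "llvm/IR/DerivedTypes.h"
#include "llvm/IR/Function.h"
#include "llvm/IR/LLVMContext.h"
#include "llvm/IR/Module.h"
#include "llvm/Support/TargetRegistry.h"
#include "llvm/Support/TargetSelect.h"
#include "llvm/Target/TargetMachine.h"

// Returns the vector register size in bytes for `triple`/`cpu`, or -1 when
// the target cannot be found (e.g. the X86 backend is not linked in).
static int QueryVectorRegisterByteSize(const std::string& triple,
                                       const std::string& cpu) {
  llvm::InitializeAllTargetInfos();
  llvm::InitializeAllTargets();
  llvm::InitializeAllTargetMCs();

  std::string error;
  const llvm::Target* target =
      llvm::TargetRegistry::lookupTarget(triple, error);
  if (target == nullptr) return -1;  // "No available targets ..." ends up here.

  // An empty CPU string selects a generic model; the generic s390x model has
  // no vector facility, which is why the width below comes back as 0.
  std::unique_ptr<llvm::TargetMachine> target_machine(
      target->createTargetMachine(triple, cpu, /*Features=*/"",
                                  llvm::TargetOptions(), llvm::None));

  // TargetTransformInfo is per-function, so build a dummy function to query.
  llvm::LLVMContext context;
  llvm::Module module("test", context);
  module.setTargetTriple(triple);
  llvm::Function* function = llvm::Function::Create(
      llvm::FunctionType::get(llvm::Type::getVoidTy(context), false),
      llvm::Function::ExternalLinkage, "f", &module);

  llvm::TargetTransformInfo tti =
      target_machine->getTargetTransformInfo(*function);
  return tti.getRegisterBitWidth(/*Vector=*/true) / 8;
}

On an x86 build, QueryVectorRegisterByteSize("x86_64-pc-linux", "") would be expected to return 16; with the systemz triple and an empty CPU string it comes back as 0, which is exactly the mismatch in the log above.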

Describe the expected behavior
The LLVM function should return the correct value for vector_register_byte_size, and the test case should pass.

Other info / logs
I noticed the previous issue regarding XLA test cases and have already applied the changes from PR 39912.

skribm9 added the type:bug Bug label Nov 9, 2020
Saduf2019 added TF 2.3 Issues related to TF 2.3 and comp:xla XLA labels Nov 10, 2020
Saduf2019 assigned ymodak and unassigned Saduf2019 Nov 10, 2020
ymodak assigned r4nt and unassigned ymodak Nov 10, 2020
ymodak added the stat:awaiting tensorflower Status - Awaiting response from tensorflower label Nov 10, 2020
skribm9 (Contributor, Author) commented Jan 4, 2021

Hi @r4nt, a very happy new year to you! Have you been able to find anything on this issue?

sushreebarsa (Contributor) commented

@skribm9 It looks like you are using an older version of TensorFlow. Many bugs have been fixed in the latest version. Could you please run your code with the latest version, 2.5, and let us know if the issue still persists? Thanks!

sushreebarsa added stat:awaiting response Status - Awaiting response from author and removed stat:awaiting tensorflower Status - Awaiting response from tensorflower labels Aug 11, 2021
rposts (Contributor) commented Aug 11, 2021

@sushreebarsa These problems still exist on TF 2.5.0:

exec ${PAGER:-/usr/bin/less} "$0" || exit 1
Executing tests from //tensorflow/compiler/xla/service/cpu:vectorized_reduce_with_no_vector_registers_test
-----------------------------------------------------------------------------
[==========] Running 1 test from 1 test suite.
[----------] Global test environment set-up.
[----------] 1 test from CodegenReduceOnArchWithNoVectorRegisters
[ RUN      ] CodegenReduceOnArchWithNoVectorRegisters.Test
2021-08-11 14:25:59.351811: I tensorflow/core/platform/profile_utils/cpu_utils.cc:114] CPU Frequency: 1555500000 Hz
tensorflow/compiler/xla/service/cpu/vectorized_reduce_with_no_vector_registers_test.cc:81: Failure
Expected equality of these values:
  vector_register_byte_size_for_x86_64
    Which is: 0
  16
[  FAILED  ] CodegenReduceOnArchWithNoVectorRegisters.Test (64 ms)
[----------] 1 test from CodegenReduceOnArchWithNoVectorRegisters (64 ms total)

[----------] Global test environment tear-down
[==========] 1 test from 1 test suite ran. (65 ms total)
[  PASSED  ] 0 tests.
[  FAILED  ] 1 test, listed below:
[  FAILED  ] CodegenReduceOnArchWithNoVectorRegisters.Test

 1 FAILED TEST

sushreebarsa added stat:awaiting tensorflower Status - Awaiting response from tensorflower and TF 2.5 Issues related to TF 2.5 labels; removed TF 2.3 Issues related to TF 2.3 and stat:awaiting response Status - Awaiting response from author labels Aug 11, 2021
rposts (Contributor) commented Jun 29, 2022

@sushreebarsa Any update on the issue? Problem still exists in TF 2.9.1.

kun-lu20 (Contributor) commented

Hi All,

I've run this test case on TF v2.9.1 on s390x and found that GetTargetVectorRegisterByteSize() returns 0 because the code passes "" as the cpu string parameter to the LLVM function createTargetMachine(). This works on Intel, but on s390x we need to pass a specific CPU name (such as "z13") for this argument, since only z13 and newer CPU models support the vector facility on s390x.
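
A minimal sketch of the fix described above, assuming the same lookupTarget()/createTargetMachine() flow that the test uses (the kTestCpuWithVectors constant and the "z13" literal are this comment's suggestion, not code that exists in the tree):

// Hypothetical: on s390x, select a CPU model that has the vector facility
// instead of the empty string the test currently passes.
#if defined(__s390x__)
static const char kTestCpuWithVectors[] = "z13";  // first model with vector registers
#else
static const char kTestCpuWithVectors[] = "";     // generic CPU is sufficient on x86_64
#endif

std::unique_ptr<llvm::TargetMachine> target_machine(
    target->createTargetMachine(triple, /*CPU=*/kTestCpuWithVectors,
                                /*Features=*/"", llvm::TargetOptions(),
                                llvm::None));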

Also, this test case uses the triple i686-none-android for subsequent tests. Since this triple has the arch value x86, the lookupTarget() function from LLVM works on Intel but returns an error on other platforms. On s390x, for example, we can use kTargetTripleForHost instead to follow the original code logic and avoid this error.
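
For the second point, a sketch of the substitution being suggested (kTargetTripleForHost comes from test_target_triple_helper.h; the gtest assertion is illustrative):

std::string error;
// Hard-coding "i686-none-android" only resolves when the X86 backend is
// linked in; elsewhere lookupTarget() fails with "No available targets are
// compatible with triple ...". Looking up the host triple instead keeps the
// original code path working on every platform.
const llvm::Target* target =
    llvm::TargetRegistry::lookupTarget(kTargetTripleForHost, error);
ASSERT_NE(target, nullptr) << error;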

It looks like this test case was primarily designed for the Intel platform. I wonder whether we should skip it on other platforms or expand it to cover all platforms. Any ideas or suggestions would be highly appreciated.

Thanks!

Hi @sushreebarsa,

Could you please take a look? Thank you!

sushreebarsa added TF 2.9 Issues found in the TF 2.9 release (or RCs) and removed TF 2.5 Issues related to TF 2.5 labels Aug 17, 2022
kun-lu20 (Contributor) commented Dec 9, 2022

Hi @sushreebarsa ,

Are there any updates on this issue? Thanks!

rposts (Contributor) commented Apr 4, 2023

@sushreebarsa This can be closed, as it is x86 specific.
