Cannot build TensorFlow with --config=dbg #48919

Closed
bas-aarts opened this issue May 5, 2021 · 43 comments
Labels:
stat:awaiting tensorflower (Status - Awaiting response from tensorflower)
subtype:bazel (Bazel related Build_Installation issues)
subtype: ubuntu/linux (Ubuntu/Linux Build/Installation Issues)
type:build/install (Build and install issues)

Comments

@bas-aarts
bas-aarts commented May 5, 2021

When building open-source TensorFlow with

bazel build --config=dbg --config=cuda --cxxopt="-D_GLIBCXX_USE_CXX11_ABI=0" //tensorflow/tools/pip_package:build_pip_package

(for SM 7.0 only)

The build dies at link time with:

ERROR: /home/baarts/tensorflow-GH/tensorflow/python/BUILD:3373:24: Linking of rule '//tensorflow/python:_pywrap_tensorflow_internal.so' failed (Exit 1): crosstool_wrapper_driver_is_not_gcc failed: error executing command external/local_config_cuda/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc @bazel-out/k8-dbg/bin/tensorflow/python/_pywrap_tensorflow_internal.so-2.params
bazel-out/k8-dbg/bin/external/llvm-project/llvm/libScalar.a(AnnotationRemarks.pic.o):(.debug_aranges+0x6): relocation truncated to fit: R_X86_64_32 against `.debug_info'
bazel-out/k8-dbg/bin/external/llvm-project/llvm/libScalar.a(BDCE.pic.o):(.debug_aranges+0x6): relocation truncated to fit: R_X86_64_32 against `.debug_info'
bazel-out/k8-dbg/bin/external/llvm-project/llvm/libScalar.a(CallSiteSplitting.pic.o):(.debug_aranges+0x6): relocation truncated to fit: R_X86_64_32 against `.debug_info'
bazel-out/k8-dbg/bin/external/llvm-project/llvm/libScalar.a(ConstantHoisting.pic.o):(.debug_aranges+0x6): relocation truncated to fit: R_X86_64_32 against `.debug_info'
bazel-out/k8-dbg/bin/external/llvm-project/llvm/libScalar.a(ConstraintElimination.pic.o):(.debug_aranges+0x6): relocation truncated to fit: R_X86_64_32 against `.debug_info'
bazel-out/k8-dbg/bin/external/llvm-project/llvm/libScalar.a(CorrelatedValuePropagation.pic.o):(.debug_aranges+0x6): relocation truncated to fit: R_X86_64_32 against `.debug_info'
bazel-out/k8-dbg/bin/external/llvm-project/llvm/libScalar.a(DCE.pic.o):(.debug_aranges+0x6): relocation truncated to fit: R_X86_64_32 against `.debug_info'
bazel-out/k8-dbg/bin/external/llvm-project/llvm/libScalar.a(DeadStoreElimination.pic.o):(.debug_aranges+0x6): relocation truncated to fit: R_X86_64_32 against `.debug_info'
bazel-out/k8-dbg/bin/external/llvm-project/llvm/libScalar.a(DivRemPairs.pic.o):(.debug_aranges+0x6): relocation truncated to fit: R_X86_64_32 against `.debug_info'
bazel-out/k8-dbg/bin/external/llvm-project/llvm/libScalar.a(EarlyCSE.pic.o):(.debug_aranges+0x6): relocation truncated to fit: R_X86_64_32 against `.debug_info'
bazel-out/k8-dbg/bin/external/llvm-project/llvm/libScalar.a(FlattenCFGPass.pic.o):(.debug_aranges+0x6): additional relocation overflows omitted from the output
collect2: error: ld returned 1 exit status

Adding -mcmodel=large makes no difference, as the overflow is in a debug section.
I also tried -gdwarf64, which is not supported by this gcc (9.3.0).
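
A minimal sketch of why -mcmodel=large cannot help here, assuming the failing field is a 4-byte section offset as used by the 32-bit DWARF format: an R_X86_64_32 relocation must produce a value that zero-extends back to the original 64-bit value, so any offset into .debug_info at or beyond 4 GiB cannot be encoded. The helper names below are hypothetical and only illustrate the range check; they are not part of ld or gcc.

```python
import struct

# Largest value a 4-byte unsigned field can hold.
U32_MAX = 0xFFFFFFFF


def fits_r_x86_64_32(value: int) -> bool:
    """Return True if value survives truncation to 32 bits and zero-extension."""
    return 0 <= value <= U32_MAX


def encode_offset32(value: int) -> bytes:
    """Encode a section offset as a little-endian 4-byte field, or fail like the linker does."""
    if not fits_r_x86_64_32(value):
        raise OverflowError(f"relocation truncated to fit: {value:#x} needs more than 32 bits")
    return struct.pack("<I", value)


if __name__ == "__main__":
    # An offset just below 4 GiB still fits; one byte further does not.
    print(fits_r_x86_64_32(0xFFFFFFFF))
    print(fits_r_x86_64_32(0x100000000))
```

The code model only changes how code and data addresses are generated, whereas the offsets inside debug sections keep their 4-byte width unless a 64-bit DWARF format is emitted.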

Some platform info:

root@7fe23091cb5b:/opt/tensorflow# gcc --version
gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0
Copyright (C) 2019 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

root@7fe23091cb5b:/opt/tensorflow# uname -a
Linux 7fe23091cb5b 4.15.0-72-generic #81-Ubuntu SMP Tue Nov 26 12:20:02 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
@bas-aarts bas-aarts added the type:build/install Build and install issues label May 5, 2021
@reedwm
Member
reedwm commented May 6, 2021

I am unable to reproduce by running:

yes '' | TF_NEED_CUDA=1  ./configure
bazel build --config=dbg --config=cuda --cxxopt="-D_GLIBCXX_USE_CXX11_ABI=0" //tensorflow/tools/pip_package:build_pip_package

I ran at commit a3d9f9b with Ubuntu 18.04. The compute capabilities I compiled with defaulted to 7.0.

Can you give more information, like the commit and OS? Perhaps the issue was fixed at a later commit.

You can also try compiling only a subset of files with debugging symbols, which I highly recommend as it reduces build time and gdb startup time. This might help, since the "overflow" in the error message might go away with fewer debugging symbols, although admittedly I have no idea how debugging symbols work at all. I use the following command to compile a subset of files:

bazel build --per_file_copt=+tensorflow.*,-tensorflow/compiler.*,-tensorflow/lite.*,-tensorflow/core/kernels.*@-O0,-g --config=cuda //tensorflow/tools/pip_package:build_pip_package

Still, we should fix this even if the above command works for you.

@bas-aarts
Author

I'm at commit 9275e30

root@7fe23091cb5b:/home/baarts/tensorflow-GH# cat /etc/issue
Ubuntu 20.04.1 LTS \n \l

@reedwm
Member
reedwm commented May 6, 2021

I can reproduce on Ubuntu 20.04. I am not familiar with how debugging symbols work, but based on what you said it seems they are exceeding 2 GB which is causing the issue? You mention -gdwarf64, which is in gcc 11.1 but not in gcc 9.3, which is what Ubuntu 20.04 uses.

/CC @chsigg any ideas on what to do here? If it's impossible to support compiling all files with debug symbols, perhaps we should provide a config option and instructions on only building a subset of files with debugging symbols.

@UsharaniPagadala UsharaniPagadala added subtype: ubuntu/linux Ubuntu/Linux Build/Installation Issues subtype:bazel Bazel related Build_Installation issues labels May 7, 2021
@UsharaniPagadala

@bas-aarts
Please take a look at the comment above (from @reedwm) and let us know if you are still facing the same issue. Thanks!

@UsharaniPagadala UsharaniPagadala added the stat:awaiting response Status - Awaiting response from author label May 11, 2021
@bas-aarts
Author

Not sure what my action item is. @reedwm was able to reproduce, and asked @chsigg to chime in on an alternate way to build TF with debug symbols.
Yes, the issue is still there.

@bhack
Contributor
bhack commented May 13, 2021

@reedwm It would be really nice to have a practical solution for C++ contributors.

@tlemo
Contributor
tlemo commented May 14, 2021

Just a drive-by suggestion: could -gsplit-dwarf (a.k.a. DWARF Fission) help here?

Bazel has a handy --fission flag, which according to the documentation should be on by default for dbg configs, although according to this bug that is not quite the case, so maybe it's something worth looking into?

@bhack
Contributor
bhack commented May 16, 2021

See also #13295

@UsharaniPagadala

@bas-aarts

Could you please confirm whether the issue still persists? Thanks.

@UsharaniPagadala UsharaniPagadala added stat:awaiting response Status - Awaiting response from author and removed stat:awaiting response Status - Awaiting response from author labels May 18, 2021
@bas-aarts
Author

Confirmed.
Is there any reason to believe the problem would have been addressed?

@bhack
Contributor
bhack commented May 18, 2021

Other than the usability aspects @reedwm mentioned, which I think are quite important, and the feature request for precompiled releases with debug symbols in #13295, it seems that someone else is compiling in debug mode in #39521, or not?

@bas-aarts
Author

All external developers have a need for this. It would be great if Google could document and support the way for external developers to build TF with debug symbols.

@UsharaniPagadala UsharaniPagadala removed their assignment May 19, 2021
@UsharaniPagadala UsharaniPagadala removed the stat:awaiting response Status - Awaiting response from author label May 19, 2021
@jvishnuvardhan jvishnuvardhan added the stat:awaiting tensorflower Status - Awaiting response from tensorflower label May 19, 2021
@bas-aarts
Author
bas-aarts commented May 20, 2021

@sanjoy, I tried adding --copt=-O1 as well as --copt=-O2; neither addresses the problem.
I also tried to use clang instead of nvcc/gcc, which results in:

ERROR: /root/.cache/bazel/_bazel_root/21773b1a37c97ed8ddda3e8be78ee764/external/libjpeg_turbo/BUILD.bazel:41:11: undeclared inclusion(s) in rule '@libjpeg_turbo//:jpeg':
this rule is missing dependency declarations for the following files included by 'libjpeg_turbo/jfdctflt.c':
  '/usr/lib/clang/10.0.0/include/stddef.h'
  '/usr/lib/clang/10.0.0/include/__stddef_max_align_t.h'
  '/usr/lib/clang/10.0.0/include/stdarg.h'
Target //tensorflow/tools/pip_package:build_pip_package failed to build
Use --verbose_failures to see the command lines of failed build steps.
INFO: Elapsed time: 9.149s, Critical Path: 0.56s
INFO: 60 processes: 53 internal, 7 local.
FAILED: Build did NOT complete successfully

@reedwm
Member
reedwm commented Jun 10, 2021

I think the best solution here is to only include debugging information for certain files when --config=dbg is specified. We should also document how to compile with debugging information, perhaps in either the Build from source guide or the Contribute to TensorFlow guide.

On Ubuntu 18.04, _pywrap_tensorflow_internal.so still builds, but both the .debug_info and .debug_str sections are dangerously close to overflowing. When I run the following commands, the size of .debug_info is 0xfa5e36a5 bytes and the size of .debug_str is 0xf940ec6a bytes, both of which are close to the maximum value of 0xffffffff.

bazel build --config=dbg --config=cuda //tensorflow/python:_pywrap_tensorflow_internal.so
objdump -h bazel-out/k8-dbg/bin/tensorflow/python/_pywrap_tensorflow_internal.so
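
To put the two reported section sizes in perspective, here is a small sketch that computes how much room is left before the 32-bit limit. The sizes are the ones quoted above; the function names are hypothetical, and the script does not run objdump itself.

```python
# Maximum size addressable with a 4-byte offset.
U32_MAX = 0xFFFFFFFF

# Section sizes reported above for the Ubuntu 18.04 build.
reported = {
    ".debug_info": 0xFA5E36A5,
    ".debug_str": 0xF940EC6A,
}


def headroom(size: int) -> int:
    """Bytes remaining before the section size exceeds the 32-bit limit."""
    return U32_MAX - size


def usage_fraction(size: int) -> float:
    """Fraction of the 32-bit range already consumed by the section."""
    return size / U32_MAX


if __name__ == "__main__":
    for name, size in reported.items():
        print(
            f"{name}: {size / 2**30:.2f} GiB used, "
            f"{headroom(size) / 2**20:.1f} MiB left "
            f"({usage_fraction(size):.1%} of the 32-bit range)"
        )
```

With so little headroom, a slightly different compiler version or a few additional dependencies could plausibly push either section over the limit, which would be consistent with the build succeeding on one Ubuntu release and failing on another.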

I suggest that with --config=dbg, we only include debugging info in files under the tensorflow directory that are not also under tensorflow/core/kernels. This excludes TF dependencies and kernels: the former take up a lot of space in .debug_info, and the latter take up a lot of space in .debug_str (including kernels causes a lot of long Eigen symbols to appear in .debug_str). Adding the following flags reduces the size of .debug_info to 0x5ae89488 and the size of .debug_str to 0x474e9eda:

--copt=-g0 --per_file_copt=+tensorflow.*,-tensorflow/core/kernels.*@-g

@mihaimaruseac, @bas-aarts does adding the flags above to --config=dbg sound like a good plan? Not including debugging info in kernels is a shame, but they take up too much space in .debug_str. If one wants to have debugging symbols in a specific kernel, they can still do so with the flags --config=dbg --per_file_copt=+tensorflow/core/kernels/my_favorite_kernel.*@-g.
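
The include/exclude filter in the flags above can be sketched as follows. This is only an approximation of how Bazel interprets --per_file_copt: it assumes the value splits at the first "@", that each regex is matched unanchored against the source path, and that a file is selected when at least one "+" pattern matches and no "-" pattern matches. Escaped commas are ignored, the function names are hypothetical, and the file paths in the usage lines are illustrative examples only.

```python
import re


def parse_per_file_copt(spec: str):
    """Split '[+-]regex[,[+-]regex]...@opt[,opt]...' into (includes, excludes, options)."""
    filter_part, _, option_part = spec.partition("@")
    includes, excludes = [], []
    for item in filter_part.split(","):
        if item.startswith("-"):
            excludes.append(re.compile(item[1:]))
        elif item.startswith("+"):
            includes.append(re.compile(item[1:]))
        else:
            # Assumption: a pattern without a sign is treated as an include.
            includes.append(re.compile(item))
    return includes, excludes, option_part.split(",")


def options_for(path: str, spec: str):
    """Return the extra options a file would receive under the assumed semantics."""
    includes, excludes, options = parse_per_file_copt(spec)
    included = any(p.search(path) for p in includes)
    excluded = any(p.search(path) for p in excludes)
    return options if included and not excluded else []


if __name__ == "__main__":
    spec = "+tensorflow.*,-tensorflow/core/kernels.*@-g"
    for path in (
        "tensorflow/compiler/jit/mark_for_compilation_pass.cc",
        "tensorflow/core/kernels/example_op.cc",  # hypothetical kernel file
        "external/llvm-project/llvm/lib/Transforms/Scalar/BDCE.cpp",
    ):
        print(path, options_for(path, spec))
```

Under these assumptions, files outside the tensorflow directory and files under tensorflow/core/kernels keep the global --copt=-g0, while the remaining TensorFlow sources additionally receive -g.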

@mihaimaruseac
Collaborator

This sounds good to me, we can go with this path for now.

@reedwm reedwm assigned reedwm and unassigned chsigg Jun 10, 2021
@bas-aarts
Author

With the above diff, not all is well yet when compiling with --config=dbg. For mark_for_compilation_pass.cc, I see the following command line:

external/local_config_cuda/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc -MD -MF bazel-out/host/bin/tensorflow/compiler/jit/_objs/compilation_passes/mark_for_compilation_pass.pic.d '-frandom-seed=bazel-out/host/bin/tensorflow/compiler/jit/_objs/compilation_passes/mark_for_compilation_pass.pic.o' -DLLVM_ENABLE_STATS -D__STDC_LIMIT_MACROS -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -DLLVM_BUILD_GLOBAL_ISEL -DTENSORFLOW_USE_CUSTOM_CONTRACTION_KERNEL -DTENSORFLOW_USE_MKLDNN_CONTRACTION_KERNEL -DHAVE_SYS_UIO_H -DTF_USE_SNAPPY -DCURL_STATICLIB -DPLATFORM_LINUX -DENABLE_CURL_CLIENT -DOPENSSL_IS_BORINGSSL -DEIGEN_MPL2_ONLY '-DEIGEN_MAX_ALIGN_BYTES=64' -iquote . -iquote bazel-out/host/bin -iquote external/com_google_absl -iquote bazel-out/host/bin/external/com_google_absl -iquote external/nsync -iquote bazel-out/host/bin/external/nsync -iquote external/eigen_archive -iquote bazel-out/host/bin/external/eigen_archive -iquote external/gif -iquote bazel-out/host/bin/external/gif -iquote external/libjpeg_turbo -iquote bazel-out/host/bin/external/libjpeg_turbo -iquote external/com_google_protobuf -iquote bazel-out/host/bin/external/com_google_protobuf -iquote external/com_googlesource_code_re2 -iquote bazel-out/host/bin/external/com_googlesource_code_re2 -iquote external/farmhash_archive -iquote bazel-out/host/bin/external/farmhash_archive -iquote external/fft2d -iquote bazel-out/host/bin/external/fft2d -iquote external/highwayhash -iquote bazel-out/host/bin/external/highwayhash -iquote external/zlib -iquote bazel-out/host/bin/external/zlib -iquote external/double_conversion -iquote bazel-out/host/bin/external/double_conversion -iquote external/local_config_cuda -iquote bazel-out/host/bin/external/local_config_cuda -iquote external/local_config_rocm -iquote bazel-out/host/bin/external/local_config_rocm -iquote external/local_config_tensorrt -iquote bazel-out/host/bin/external/local_config_tensorrt -iquote external/snappy -iquote bazel-out/host/bin/external/snappy 
-iquote external/curl -iquote bazel-out/host/bin/external/curl -iquote external/boringssl -iquote bazel-out/host/bin/external/boringssl -iquote external/jsoncpp_git -iquote bazel-out/host/bin/external/jsoncpp_git -iquote external/aws -iquote bazel-out/host/bin/external/aws -iquote external/aws-c-common -iquote bazel-out/host/bin/external/aws-c-common -iquote external/aws-c-event-stream -iquote bazel-out/host/bin/external/aws-c-event-stream -iquote external/aws-checksums -iquote bazel-out/host/bin/external/aws-checksums -iquote external/llvm-project -iquote bazel-out/host/bin/external/llvm-project -iquote external/mkl_dnn_v1 -iquote bazel-out/host/bin/external/mkl_dnn_v1 -Ibazel-out/host/bin/external/local_config_cuda/cuda/_virtual_includes/cuda_headers_virtual -Ibazel-out/host/bin/external/local_config_tensorrt/_virtual_includes/tensorrt_headers -Ibazel-out/host/bin/external/local_config_cuda/cuda/_virtual_includes/cudnn_header -Ibazel-out/host/bin/external/llvm-project/mlir/_virtual_includes/BuiltinAttributesIncGen -Ibazel-out/host/bin/external/llvm-project/mlir/_virtual_includes/BuiltinDialectIncGen -Ibazel-out/host/bin/external/llvm-project/mlir/_virtual_includes/BuiltinLocationAttributesIncGen -Ibazel-out/host/bin/external/llvm-project/mlir/_virtual_includes/BuiltinOpsIncGen -Ibazel-out/host/bin/external/llvm-project/mlir/_virtual_includes/BuiltinTypesIncGen -Ibazel-out/host/bin/external/llvm-project/mlir/_virtual_includes/CallOpInterfacesIncGen -Ibazel-out/host/bin/external/llvm-project/mlir/_virtual_includes/CastOpInterfacesIncGen -Ibazel-out/host/bin/external/llvm-project/mlir/_virtual_includes/InferTypeOpInterfaceIncGen -Ibazel-out/host/bin/external/llvm-project/mlir/_virtual_includes/OpAsmInterfaceIncGen -Ibazel-out/host/bin/external/llvm-project/mlir/_virtual_includes/RegionKindInterfaceIncGen -Ibazel-out/host/bin/external/llvm-project/mlir/_virtual_includes/SideEffectInterfacesIncGen 
-Ibazel-out/host/bin/external/llvm-project/mlir/_virtual_includes/SymbolInterfacesIncGen -Ibazel-out/host/bin/external/llvm-project/mlir/_virtual_includes/ControlFlowInterfacesIncGen -Ibazel-out/host/bin/external/llvm-project/mlir/_virtual_includes/DerivedAttributeOpInterfaceIncGen -Ibazel-out/host/bin/external/llvm-project/mlir/_virtual_includes/LoopLikeInterfaceIncGen -Ibazel-out/host/bin/external/llvm-project/mlir/_virtual_includes/ParserTokenKinds -Ibazel-out/host/bin/external/llvm-project/mlir/_virtual_includes/StandardOpsIncGen -Ibazel-out/host/bin/external/llvm-project/mlir/_virtual_includes/TensorBaseIncGen -Ibazel-out/host/bin/external/llvm-project/mlir/_virtual_includes/TensorOpsIncGen -Ibazel-out/host/bin/external/llvm-project/mlir/_virtual_includes/VectorInterfacesIncGen -Ibazel-out/host/bin/external/llvm-project/mlir/_virtual_includes/ViewLikeInterfaceIncGen -Ibazel-out/host/bin/external/llvm-project/mlir/_virtual_includes/AffineMemoryOpInterfacesIncGen -Ibazel-out/host/bin/external/llvm-project/mlir/_virtual_includes/AffineOpsIncGen -Ibazel-out/host/bin/external/llvm-project/mlir/_virtual_includes/MemRefBaseIncGen -Ibazel-out/host/bin/external/llvm-project/mlir/_virtual_includes/MemRefOpsIncGen -Ibazel-out/host/bin/external/llvm-project/mlir/_virtual_includes/CopyOpInterfaceIncGen -Ibazel-out/host/bin/external/llvm-project/mlir/_virtual_includes/LinalgInterfacesIncGen -Ibazel-out/host/bin/external/llvm-project/mlir/_virtual_includes/LinalgStructuredOpsIncGen -Ibazel-out/host/bin/external/llvm-project/mlir/_virtual_includes/LinalgOpsIncGen -Ibazel-out/host/bin/external/llvm-project/mlir/_virtual_includes/LinalgSparseOpsIncGen -Ibazel-out/host/bin/external/llvm-project/mlir/_virtual_includes/SCFIncGen -Ibazel-out/host/bin/external/llvm-project/mlir/_virtual_includes/SCFPassIncGen -Ibazel-out/host/bin/external/llvm-project/mlir/_virtual_includes/ComplexBaseIncGen -Ibazel-out/host/bin/external/llvm-project/mlir/_virtual_includes/ComplexOpsIncGen 
-Ibazel-out/host/bin/external/llvm-project/mlir/_virtual_includes/PDLOpsIncGen -Ibazel-out/host/bin/external/llvm-project/mlir/_virtual_includes/PDLTypesIncGen -Ibazel-out/host/bin/external/llvm-project/mlir/_virtual_includes/PDLInterpOpsIncGen -Ibazel-out/host/bin/external/llvm-project/mlir/_virtual_includes/ConversionPassIncGen -Ibazel-out/host/bin/external/llvm-project/mlir/_virtual_includes/TransformsPassIncGen -Ibazel-out/host/bin/tensorflow/compiler/mlir/hlo/_virtual_includes/canonicalize_inc_gen -Ibazel-out/host/bin/tensorflow/compiler/mlir/hlo/_virtual_includes/chlo_ops_inc_gen -Ibazel-out/host/bin/tensorflow/compiler/mlir/hlo/_virtual_includes/hlo_ops_base_inc_gen -Ibazel-out/host/bin/tensorflow/compiler/mlir/hlo/_virtual_includes/hlo_ops_inc_gen -Ibazel-out/host/bin/tensorflow/compiler/mlir/hlo/_virtual_includes/hlo_ops_pattern_gen -Ibazel-out/host/bin/external/llvm-project/mlir/_virtual_includes/MLIRShapeCanonicalizationIncGen -Ibazel-out/host/bin/external/llvm-project/mlir/_virtual_includes/ShapeOpsIncGen -Ibazel-out/host/bin/tensorflow/compiler/mlir/hlo/_virtual_includes/lhlo_ops_inc_gen -Ibazel-out/host/bin/tensorflow/compiler/mlir/hlo/_virtual_includes/lhlo_ops_structs_inc_gen -Ibazel-out/host/bin/tensorflow/compiler/mlir/hlo/_virtual_includes/chlo_legalize_to_hlo_inc_gen -Ibazel-out/host/bin/tensorflow/compiler/mlir/hlo/_virtual_includes/lhlo_gpu_ops_enums_inc_gen -Ibazel-out/host/bin/tensorflow/compiler/mlir/hlo/_virtual_includes/lhlo_gpu_ops_inc_gen -Ibazel-out/host/bin/tensorflow/compiler/mlir/hlo/_virtual_includes/lhlo_gpu_ops_structs_inc_gen -Ibazel-out/host/bin/tensorflow/compiler/mlir/hlo/_virtual_includes/MhloPassIncGen -isystem external/nsync/public -isystem bazel-out/host/bin/external/nsync/public -isystem third_party/eigen3/mkl_include -isystem bazel-out/host/bin/third_party/eigen3/mkl_include -isystem external/eigen_archive -isystem bazel-out/host/bin/external/eigen_archive -isystem external/gif -isystem bazel-out/host/bin/external/gif 
-isystem external/com_google_protobuf/src -isystem bazel-out/host/bin/external/com_google_protobuf/src -isystem external/farmhash_archive/src -isystem bazel-out/host/bin/external/farmhash_archive/src -isystem external/zlib -isystem bazel-out/host/bin/external/zlib -isystem external/double_conversion -isystem bazel-out/host/bin/external/double_conversion -isystem external/local_config_cuda/cuda -isystem bazel-out/host/bin/external/local_config_cuda/cuda -isystem external/local_config_cuda/cuda/cuda/include -isystem bazel-out/host/bin/external/local_config_cuda/cuda/cuda/include -isystem external/local_config_rocm/rocm -isystem bazel-out/host/bin/external/local_config_rocm/rocm -isystem external/local_config_rocm/rocm/rocm/include -isystem bazel-out/host/bin/external/local_config_rocm/rocm/rocm/include -isystem external/local_config_rocm/rocm/rocm/include/rocrand -isystem bazel-out/host/bin/external/local_config_rocm/rocm/rocm/include/rocrand -isystem external/local_config_rocm/rocm/rocm/include/roctracer -isystem bazel-out/host/bin/external/local_config_rocm/rocm/rocm/include/roctracer -isystem external/curl/include -isystem bazel-out/host/bin/external/curl/include -isystem external/boringssl/src/include -isystem bazel-out/host/bin/external/boringssl/src/include -isystem external/jsoncpp_git/include -isystem bazel-out/host/bin/external/jsoncpp_git/include -isystem external/aws/aws-cpp-sdk-core/include -isystem bazel-out/host/bin/external/aws/aws-cpp-sdk-core/include -isystem external/aws/aws-cpp-sdk-s3/include -isystem bazel-out/host/bin/external/aws/aws-cpp-sdk-s3/include -isystem external/aws/aws-cpp-sdk-transfer/include -isystem bazel-out/host/bin/external/aws/aws-cpp-sdk-transfer/include -isystem external/aws-c-common/include -isystem bazel-out/host/bin/external/aws-c-common/include -isystem external/aws-c-event-stream/include -isystem bazel-out/host/bin/external/aws-c-event-stream/include -isystem external/aws-checksums/include -isystem 
bazel-out/host/bin/external/aws-checksums/include -isystem external/llvm-project/llvm/include -isystem bazel-out/host/bin/external/llvm-project/llvm/include -isystem external/mkl_dnn_v1/include -isystem bazel-out/host/bin/external/mkl_dnn_v1/include -isystem external/mkl_dnn_v1/src -isystem bazel-out/host/bin/external/mkl_dnn_v1/src -isystem external/mkl_dnn_v1/src/common -isystem bazel-out/host/bin/external/mkl_dnn_v1/src/common -isystem external/mkl_dnn_v1/src/common/ittnotify -isystem bazel-out/host/bin/external/mkl_dnn_v1/src/common/ittnotify -isystem external/mkl_dnn_v1/src/cpu -isystem bazel-out/host/bin/external/mkl_dnn_v1/src/cpu -isystem external/mkl_dnn_v1/src/cpu/gemm -isystem bazel-out/host/bin/external/mkl_dnn_v1/src/cpu/gemm -isystem external/mkl_dnn_v1/src/cpu/x64/xbyak -isystem bazel-out/host/bin/external/mkl_dnn_v1/src/cpu/x64/xbyak -isystem external/llvm-project/mlir/include -isystem bazel-out/host/bin/external/llvm-project/mlir/include -isystem tensorflow/compiler/mlir/tensorflow/include -isystem bazel-out/host/bin/tensorflow/compiler/mlir/tensorflow/include -isystem tensorflow/compiler/mlir/hlo/include -isystem bazel-out/host/bin/tensorflow/compiler/mlir/hlo/include -isystem tensorflow/compiler/mlir/xla/include -isystem bazel-out/host/bin/tensorflow/compiler/mlir/xla/include -Wno-builtin-macro-redefined '-D__DATE__="redacted"' '-D__TIMESTAMP__="redacted"' '-D__TIME__="redacted"' -fPIC -U_FORTIFY_SOURCE '-D_FORTIFY_SOURCE=1' -fstack-protector -Wall -fno-omit-frame-pointer -no-canonical-prefixes -fno-canonical-system-headers -DNDEBUG -g0 -O2 -ffunction-sections -fdata-sections -g0 -w -Wno-sign-compare -g0 '-std=c++14' -c tensorflow/compiler/jit/mark_for_compilation_pass.cc -o bazel-out/host/bin/tensorflow/compiler/jit/_objs/compilation_passes/mark_for_compilation_pass.pic.o)
Notice the -g0 -O2 -g0 -g0: this file is compiled with -O2 and has no debug information.
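
Spotting the effective flags in a command line of this length is error-prone by eye, so here is a minimal sketch that extracts them. It relies on the documented GCC behaviour that, when several -g or -O level options are given, the last one is the one that takes effect; it only looks at plain -g/-gN and -O/-ON style options and ignores other debug-related flags such as -gsplit-dwarf. The function names are hypothetical.

```python
import re
import shlex

# Plain debug-level and optimization-level options only.
_DEBUG_LEVEL = re.compile(r"-g[0-3]?")
_OPT_LEVEL = re.compile(r"-O([0-3sg]|fast)?")


def effective_flags(command_line: str) -> dict:
    """Return the last -g level and last -O level found in the command line."""
    debug = None
    opt = None
    for token in shlex.split(command_line):
        if _DEBUG_LEVEL.fullmatch(token):
            debug = token
        elif _OPT_LEVEL.fullmatch(token):
            opt = token
    return {"debug": debug, "opt": opt}


def has_debug_info(command_line: str) -> bool:
    """True if the effective debug level is anything other than absent or -g0."""
    debug = effective_flags(command_line)["debug"]
    return debug is not None and debug != "-g0"


if __name__ == "__main__":
    # Tail of the command line quoted above.
    tail = "-DNDEBUG -g0 -O2 -ffunction-sections -fdata-sections -g0 -w -Wno-sign-compare -g0 '-std=c++14'"
    print(effective_flags(tail))
    print(has_debug_info(tail))
```

Feeding the full output of bazel build -s through such a helper makes it easier to compare which files actually receive -g.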

@reedwm
Member
reedwm commented Jun 11, 2021

For me, it is compiled with debug info without optimizations. After adding the two lines in my previous post, when I run:

bazel build -s  --config=dbg --config=cuda //tensorflow/compiler/jit:compilation_passes

I get the command line:

external/local_config_cuda/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc -MD -MF bazel-out/k8-dbg/bin/tensorflow/compiler/jit/_objs/compilation_passes/mark_for_compilation_pass.pic.d '-frandom
-seed=bazel-out/k8-dbg/bin/tensorflow/compiler/jit/_objs/compilation_passes/mark_for_compilation_pass.pic.o' -DLLVM_ENABLE_STATS -D__STDC_LIMIT_MACROS -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -DLLV
M_BUILD_GLOBAL_ISEL -DTENSORFLOW_USE_CUSTOM_CONTRACTION_KERNEL -DTENSORFLOW_USE_MKLDNN_CONTRACTION_KERNEL -DHAVE_SYS_UIO_H -DTF_USE_SNAPPY -DCURL_STATICLIB -DEIGEN_MPL2_ONLY '-DEIGEN_MAX_ALIGN_BYTES=64' -
iquote . -iquote bazel-out/k8-dbg/bin -iquote external/com_google_absl -iquote bazel-out/k8-dbg/bin/external/com_google_absl -iquote external/nsync -iquote bazel-out/k8-dbg/bin/external/nsync -iquote exte
rnal/eigen_archive -iquote bazel-out/k8-dbg/bin/external/eigen_archive -iquote external/gif -iquote bazel-out/k8-dbg/bin/external/gif -iquote external/libjpeg_turbo -iquote bazel-out/k8-dbg/bin/external/l
ibjpeg_turbo -iquote external/com_google_protobuf -iquote bazel-out/k8-dbg/bin/external/com_google_protobuf -iquote external/com_googlesource_code_re2 -iquote bazel-out/k8-dbg/bin/external/com_googlesourc
e_code_re2 -iquote external/farmhash_archive -iquote bazel-out/k8-dbg/bin/external/farmhash_archive -iquote external/fft2d -iquote bazel-out/k8-dbg/bin/external/fft2d -iquote external/highwayhash -iquote
bazel-out/k8-dbg/bin/external/highwayhash -iquote external/zlib -iquote bazel-out/k8-dbg/bin/external/zlib -iquote external/double_conversion -iquote bazel-out/k8-dbg/bin/external/double_conversion -iquot
e external/local_config_cuda -iquote bazel-out/k8-dbg/bin/external/local_config_cuda -iquote external/local_config_rocm -iquote bazel-out/k8-dbg/bin/external/local_config_rocm -iquote external/local_confi
g_tensorrt -iquote bazel-out/k8-dbg/bin/external/local_config_tensorrt -iquote external/snappy -iquote bazel-out/k8-dbg/bin/external/snappy -iquote external/curl -iquote bazel-out/k8-dbg/bin/external/curl
 -iquote external/boringssl -iquote bazel-out/k8-dbg/bin/external/boringssl -iquote external/jsoncpp_git -iquote bazel-out/k8-dbg/bin/external/jsoncpp_git -iquote external/llvm-project -iquote bazel-out/k
8-dbg/bin/external/llvm-project -iquote external/mkl_dnn_v1 -iquote bazel-out/k8-dbg/bin/external/mkl_dnn_v1 -Ibazel-out/k8-dbg/bin/external/local_config_cuda/cuda/_virtual_includes/cuda_headers_virtual -
Ibazel-out/k8-dbg/bin/external/local_config_tensorrt/_virtual_includes/tensorrt_headers -Ibazel-out/k8-dbg/bin/external/local_config_cuda/cuda/_virtual_includes/cudnn_header -Ibazel-out/k8-dbg/bin/externa
l/llvm-project/mlir/_virtual_includes/BuiltinAttributesIncGen -Ibazel-out/k8-dbg/bin/external/llvm-project/mlir/_virtual_includes/BuiltinDialectIncGen -Ibazel-out/k8-dbg/bin/external/llvm-project/mlir/_vi
rtual_includes/BuiltinLocationAttributesIncGen -Ibazel-out/k8-dbg/bin/external/llvm-project/mlir/_virtual_includes/BuiltinOpsIncGen -Ibazel-out/k8-dbg/bin/external/llvm-project/mlir/_virtual_includes/Buil
tinTypesIncGen -Ibazel-out/k8-dbg/bin/external/llvm-project/mlir/_virtual_includes/CallOpInterfacesIncGen -Ibazel-out/k8-dbg/bin/external/llvm-project/mlir/_virtual_includes/CastOpInterfacesIncGen -Ibazel
-out/k8-dbg/bin/external/llvm-project/mlir/_virtual_includes/InferTypeOpInterfaceIncGen -Ibazel-out/k8-dbg/bin/external/llvm-project/mlir/_virtual_includes/OpAsmInterfaceIncGen -Ibazel-out/k8-dbg/bin/exte
rnal/llvm-project/mlir/_virtual_includes/RegionKindInterfaceIncGen -Ibazel-out/k8-dbg/bin/external/llvm-project/mlir/_virtual_includes/SideEffectInterfacesIncGen -Ibazel-out/k8-dbg/bin/external/llvm-proje
ct/mlir/_virtual_includes/SymbolInterfacesIncGen -Ibazel-out/k8-dbg/bin/external/llvm-project/mlir/_virtual_includes/TensorEncodingIncGen -Ibazel-out/k8-dbg/bin/external/llvm-project/mlir/_virtual_include
s/ParserTokenKinds -Ibazel-out/k8-dbg/bin/external/llvm-project/mlir/_virtual_includes/ControlFlowInterfacesIncGen -Ibazel-out/k8-dbg/bin/external/llvm-project/mlir/_virtual_includes/DerivedAttributeOpInt
erfaceIncGen -Ibazel-out/k8-dbg/bin/external/llvm-project/mlir/_virtual_includes/LoopLikeInterfaceIncGen -Ibazel-out/k8-dbg/bin/external/llvm-project/mlir/_virtual_includes/CopyOpInterfaceIncGen -Ibazel-o
ut/k8-dbg/bin/external/llvm-project/mlir/_virtual_includes/MemRefBaseIncGen -Ibazel-out/k8-dbg/bin/external/llvm-project/mlir/_virtual_includes/MemRefOpsIncGen -Ibazel-out/k8-dbg/bin/external/llvm-project
/mlir/_virtual_includes/StandardOpsIncGen -Ibazel-out/k8-dbg/bin/external/llvm-project/mlir/_virtual_includes/TensorOpsIncGen -Ibazel-out/k8-dbg/bin/external/llvm-project/mlir/_virtual_includes/VectorInte
rfacesIncGen -Ibazel-out/k8-dbg/bin/external/llvm-project/mlir/_virtual_includes/ViewLikeInterfaceIncGen -Ibazel-out/k8-dbg/bin/external/llvm-project/mlir/_virtual_includes/AffineMemoryOpInterfacesIncGen
-Ibazel-out/k8-dbg/bin/external/llvm-project/mlir/_virtual_includes/AffineOpsIncGen -Ibazel-out/k8-dbg/bin/external/llvm-project/mlir/_virtual_includes/LinalgInterfacesIncGen -Ibazel-out/k8-dbg/bin/extern
al/llvm-project/mlir/_virtual_includes/LinalgStructuredOpsIncGen -Ibazel-out/k8-dbg/bin/external/llvm-project/mlir/_virtual_includes/LinalgOpsIncGen -Ibazel-out/k8-dbg/bin/external/llvm-project/mlir/_virt
ual_includes/SCFIncGen -Ibazel-out/k8-dbg/bin/external/llvm-project/mlir/_virtual_includes/SCFPassIncGen -Ibazel-out/k8-dbg/bin/external/llvm-project/mlir/_virtual_includes/ComplexBaseIncGen -Ibazel-out/k
8-dbg/bin/external/llvm-project/mlir/_virtual_includes/ComplexOpsIncGen -Ibazel-out/k8-dbg/bin/external/llvm-project/mlir/_virtual_includes/PDLOpsIncGen -Ibazel-out/k8-dbg/bin/external/llvm-project/mlir/_
virtual_includes/PDLTypesIncGen -Ibazel-out/k8-dbg/bin/external/llvm-project/mlir/_virtual_includes/PDLInterpOpsIncGen -Ibazel-out/k8-dbg/bin/external/llvm-project/mlir/_virtual_includes/ConversionPassInc
Gen -Ibazel-out/k8-dbg/bin/external/llvm-project/mlir/_virtual_includes/TransformsPassIncGen -Ibazel-out/k8-dbg/bin/tensorflow/compiler/mlir/hlo/_virtual_includes/canonicalize_inc_gen -Ibazel-out/k8-dbg/b
in/tensorflow/compiler/mlir/hlo/_virtual_includes/chlo_ops_inc_gen -Ibazel-out/k8-dbg/bin/tensorflow/compiler/mlir/hlo/_virtual_includes/hlo_ops_base_inc_gen -Ibazel-out/k8-dbg/bin/tensorflow/compiler/mli
r/hlo/_virtual_includes/hlo_ops_inc_gen -Ibazel-out/k8-dbg/bin/tensorflow/compiler/mlir/hlo/_virtual_includes/hlo_ops_pattern_gen -Ibazel-out/k8-dbg/bin/external/llvm-project/mlir/_virtual_includes/MLIRSh
apeCanonicalizationIncGen -Ibazel-out/k8-dbg/bin/external/llvm-project/mlir/_virtual_includes/ShapeOpsIncGen -Ibazel-out/k8-dbg/bin/tensorflow/compiler/mlir/hlo/_virtual_includes/lhlo_ops_inc_gen -Ibazel-
out/k8-dbg/bin/tensorflow/compiler/mlir/hlo/_virtual_includes/lhlo_ops_structs_inc_gen -Ibazel-out/k8-dbg/bin/tensorflow/compiler/mlir/hlo/_virtual_includes/chlo_legalize_to_hlo_inc_gen -Ibazel-out/k8-dbg
/bin/tensorflow/compiler/mlir/hlo/_virtual_includes/lhlo_gpu_ops_enums_inc_gen -Ibazel-out/k8-dbg/bin/tensorflow/compiler/mlir/hlo/_virtual_includes/lhlo_gpu_ops_inc_gen -Ibazel-out/k8-dbg/bin/tensorflow/
compiler/mlir/hlo/_virtual_includes/lhlo_gpu_ops_structs_inc_gen -Ibazel-out/k8-dbg/bin/tensorflow/compiler/mlir/hlo/_virtual_includes/MhloPassIncGen -isystem external/nsync/public -isystem bazel-out/k8-d
bg/bin/external/nsync/public -isystem third_party/eigen3/mkl_include -isystem bazel-out/k8-dbg/bin/third_party/eigen3/mkl_include -isystem external/eigen_archive -isystem bazel-out/k8-dbg/bin/external/eig
en_archive -isystem external/gif -isystem bazel-out/k8-dbg/bin/external/gif -isystem external/com_google_protobuf/src -isystem bazel-out/k8-dbg/bin/external/com_google_protobuf/src -isystem external/farmh
ash_archive/src -isystem bazel-out/k8-dbg/bin/external/farmhash_archive/src -isystem external/zlib -isystem bazel-out/k8-dbg/bin/external/zlib -isystem external/double_conversion -isystem bazel-out/k8-dbg
/bin/external/double_conversion -isystem external/local_config_cuda/cuda -isystem bazel-out/k8-dbg/bin/external/local_config_cuda/cuda -isystem external/local_config_cuda/cuda/cuda/include -isystem bazel-
out/k8-dbg/bin/external/local_config_cuda/cuda/cuda/include -isystem external/local_config_rocm/rocm -isystem bazel-out/k8-dbg/bin/external/local_config_rocm/rocm -isystem external/local_config_rocm/rocm/
rocm/include -isystem bazel-out/k8-dbg/bin/external/local_config_rocm/rocm/rocm/include -isystem external/local_config_rocm/rocm/rocm/include/rocrand -isystem bazel-out/k8-dbg/bin/external/local_config_ro
cm/rocm/rocm/include/rocrand -isystem external/local_config_rocm/rocm/rocm/include/roctracer -isystem bazel-out/k8-dbg/bin/external/local_config_rocm/rocm/rocm/include/roctracer -isystem external/curl/inc
lude -isystem bazel-out/k8-dbg/bin/external/curl/include -isystem external/boringssl/src/include -isystem bazel-out/k8-dbg/bin/external/boringssl/src/include -isystem external/jsoncpp_git/include -isystem
 bazel-out/k8-dbg/bin/external/jsoncpp_git/include -isystem external/llvm-project/llvm/include -isystem bazel-out/k8-dbg/bin/external/llvm-project/llvm/include -isystem external/mkl_dnn_v1/include -isyste
m bazel-out/k8-dbg/bin/external/mkl_dnn_v1/include -isystem external/mkl_dnn_v1/src -isystem bazel-out/k8-dbg/bin/external/mkl_dnn_v1/src -isystem external/mkl_dnn_v1/src/common -isystem bazel-out/k8-dbg/
bin/external/mkl_dnn_v1/src/common -isystem external/mkl_dnn_v1/src/common/ittnotify -isystem bazel-out/k8-dbg/bin/external/mkl_dnn_v1/src/common/ittnotify -isystem external/mkl_dnn_v1/src/cpu -isystem ba
zel-out/k8-dbg/bin/external/mkl_dnn_v1/src/cpu -isystem external/mkl_dnn_v1/src/cpu/gemm -isystem bazel-out/k8-dbg/bin/external/mkl_dnn_v1/src/cpu/gemm -isystem external/mkl_dnn_v1/src/cpu/x64/xbyak -isys
tem bazel-out/k8-dbg/bin/external/mkl_dnn_v1/src/cpu/x64/xbyak -isystem external/llvm-project/mlir/include -isystem bazel-out/k8-dbg/bin/external/llvm-project/mlir/include -isystem tensorflow/compiler/mli
r/tensorflow/include -isystem bazel-out/k8-dbg/bin/tensorflow/compiler/mlir/tensorflow/include -isystem tensorflow/compiler/mlir/hlo/include -isystem bazel-out/k8-dbg/bin/tensorflow/compiler/mlir/hlo/incl
ude -isystem tensorflow/compiler/mlir/xla/include -isystem bazel-out/k8-dbg/bin/tensorflow/compiler/mlir/xla/include -Wno-builtin-macro-redefined '-D__DATE__="redacted"' '-D__TIMESTAMP__="redacted"' '-D__
TIME__="redacted"' -fPIC -U_FORTIFY_SOURCE '-D_FORTIFY_SOURCE=1' -fstack-protector -Wall -fno-omit-frame-pointer -no-canonical-prefixes -fno-canonical-system-headers -g -w -DAUTOLOAD_DYNAMIC_KERNELS -DDEB
UG_BUILD '-std=c++14' -DTF_LITE_DISABLE_X86_NEON -c tensorflow/compiler/jit/mark_for_compilation_pass.cc -o bazel-out/k8-dbg/bin/tensorflow/compiler/jit/_objs/compilation_passes/mark_for_compilation_pass.
pic.o

There is a -g at the end, and no -O2. What bazel command are you using?

@bas-aarts
Author
bas-aarts commented Jun 11, 2021

My command line shows up when building //tensorflow/tools/pip_package:build_pip_package:
bazel build -s --config=dbg -j 24 --cxxopt="-D_GLIBCXX_USE_CXX11_ABI=0" //tensorflow/tools/pip_package:build_pip_package
When I build //tensorflow/compiler/jit:compilation_passes, it looks correct (although I still want to verify that -g == -O0 -g).

copybara-service bot pushed a commit that referenced this issue Jun 11, 2021
Before, the build would fail with errors such as: "relocation truncated to fit: R_X86_64_32 against .debug_info'". The issue was that the debug info was too large. I believe this happened because offsets into the .debug_info section are stored as 32-bit integers, so that section cannot exceed 4GiB. To fix this, debug info is now only included for files under tensorflow/, excluding kernels. This brings the size of the .debug_info section down to about 1.4GiB, well under the 4GiB limit.

Unfortunately, TF kernels and TF dependencies do not have debugging info anymore, but I suspect these are rarely debugged. Debugging info for specific kernels/dependencies can still be explicitly included by the user, e.g. by passing the bazel flags: --config=dbg --per_file_copt=+tensorflow/core/kernels/identity_op.*@-g

See #48919 for more context.

PiperOrigin-RevId: 378910826
Change-Id: I4b94e3d53bb3ca00c30d5c83d2a57e4bd390c5a8
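
The 4GiB figure in the commit message above follows from simple arithmetic. The following is a minimal sketch, taking at face value the commit's stated belief that offsets into .debug_info are stored as 32-bit integers. The function name and the sample sizes are made up for illustration and are not part of TensorFlow or bazel.

```python
# Sketch only: models the size limit implied by unsigned 32-bit section offsets.
GIB = 1 << 30  # bytes in one GiB


def fits_in_32bit_offsets(section_size_bytes: int) -> bool:
    """Return True if every byte offset into the section fits in 32 bits."""
    # Offsets run from 0 to size - 1, so the section may hold at most 2**32 bytes.
    return section_size_bytes <= (1 << 32)


limit_gib = (1 << 32) / GIB     # 4.0
reduced_size = int(1.4 * GIB)   # approximate size reported after the fix
oversized = 5 * GIB             # hypothetical size above the limit

print(limit_gib, fits_in_32bit_offsets(reduced_size), fits_in_32bit_offsets(oversized))
```

With these numbers, the reduced 1.4GiB section fits, while anything larger than 4GiB cannot be addressed with a 32-bit offset.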
@reedwm
Member
reedwm commented Jun 11, 2021

I submitted d3bbd2f, which makes the changes to .bazelrc. Can you try again after that commit and check if you can set breakpoints in mark_for_compilation_pass.cc, or whatever other file you want to debug (except dependencies and kernels)? I was able to debug with the changes in that commit. I used the command:

bazel build --config=dbg --config=cuda //tensorflow/tools/pip_package:build_pip_package

Your command is slightly different, but it should still work.

@bas-aarts
Author

trying now

@reedwm
Member
reedwm commented Jun 11, 2021

Actually the build might be failing right now for an unrelated reason. So trying at d3bbd2f itself, with or without debugging info, might not work.

@reedwm
Member
reedwm commented Jun 11, 2021

Building is working again, as of bf36815. I ran the command you tried:

bazel build -s --config=dbg -j 24 --cxxopt="-D_GLIBCXX_USE_CXX11_ABI=0" //tensorflow/tools/pip_package:build_pip_package

The subcommands for mark_for_compilation_pass.cc output by bazel are here. TensorFlow actually builds mark_for_compilation_pass.cc twice: once with -O2 -g for the host platform and once with just -g for the target platform, the latter of which has no optimizations enabled. In the subcommands, you can see that only the command with the string "for host" has -O2. I believe the pip package and any bazel tests/binaries you build should include the version of mark_for_compilation_pass with just -g, and so will be debuggable. I'm not familiar with bazel, so I'm not entirely sure why it's being built for the host platform (maybe to run various genfile targets), but this shouldn't cause any issues when debugging.
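
The host/target check described above can be done mechanically. The following is a rough sketch, not real bazel output: the two sample entries are shortened, made-up stand-ins for the subcommands printed by bazel build -s, and the describe helper is hypothetical. It only records whether the action label contains "for host" and whether -O2 and -g appear among the compiler arguments.

```python
import shlex


def describe(label: str, command: str) -> dict:
    """Summarize one compile action: host build or not, optimized or not, debug info or not."""
    args = shlex.split(command)
    return {
        "host": "for host" in label,
        "optimized": "-O2" in args,
        "debug": "-g" in args,
    }


# Hypothetical, heavily abbreviated stand-ins for two subcommands of the same source file.
samples = [
    ("Compiling mark_for_compilation_pass.cc [for host]",
     "gcc -O2 -g -c mark_for_compilation_pass.cc"),
    ("Compiling mark_for_compilation_pass.cc",
     "gcc -g -c mark_for_compilation_pass.cc"),
]

results = [describe(label, command) for label, command in samples]
for (label, _), summary in zip(samples, results):
    print(label, summary)
```

Under these assumptions, only the entry labelled "for host" is reported as optimized, which matches the observation in the comment.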

@bas-aarts
Author

I just came to the same conclusion. Many files are compiled twice. What is 'host' used for? Is it required?
Seems like a lot of compilations if it's not strictly necessary.

@bas-aarts
Author

Debugging works fine. Fix looks good to me.
We should leave it open though until after it has been documented as well.
Thanks @reedwm

@reedwm
Member
reedwm commented Jun 11, 2021

I just came to the same conclusion. Many files are compiled twice. What is 'host' used for? Is it required?

The "Build configurations and cross-compilation" section of this page has details on the "host" vs "target" configuration (I used the word "platform" before, but I think the right word is "configuration"). There is probably a genrule somewhere that uses mark_for_compilation_pass along with other files, which causes it to be built for the host configuration in addition to the target configuration.

@sanjoy
Contributor
sanjoy commented Jun 12, 2021

You can use --nodistinct_host_configuration to avoid building these files twice, which should be fine for your local development. Maybe this needs to be in .bazelrc as well (if putting this in bazelrc works as expected).

@bas-aarts
Author

fwiw, building all of tensorflow/... (ie including tensorflow/core/kernels) with debug, still works as well.

@sanjoy
Contributor
sanjoy commented Jun 12, 2021

fwiw, building all of tensorflow/... (ie including tensorflow/core/kernels) with debug, still works as well.

I thought this is precisely what wasn't working right? Or did I misunderstand the issue?

@bas-aarts
Author

fwiw, building all of tensorflow/... (ie including tensorflow/core/kernels) with debug, still works as well.

I thought this is precisely what wasn't working right? Or did I misunderstand the issue?

This is still not everything, just the tensorflow directory. All external bits are still optimized.

@reedwm
Member
reedwm commented Jun 12, 2021

The .debug_str section was very close to 4 GiB, so I removed the kernels to reduce it, although I'm not sure if the section must be below 4 GiB or not. Also, removing debug info in the kernels makes gdb start up considerably faster.

@bas-aarts
Author
bas-aarts commented Jun 12, 2021

I just came to the same conclusion. Many files are compiled twice. What is 'host' used for? Is it required?

The "Build configurations and cross-compilation" section of this page has details on the "host" vs "target" configuration (I used the word "platform" before, but I think the right word is "configuration"). There is probably a genrule somewhere that uses mark_for_compilation_pass along with other files, which causes it to be built for the host configuration in addition to the target configuration.

So I added --distinct_host_configuration=false to the bazel build just to see. The build was much faster, as only one build is done. So far, everything seems to be working.

@bhack
Contributor
bhack commented Jun 12, 2021

so I added --distinct_host_configuration=false to the bazel build just to see. Build was so much faster, as only the build is done. So far, seems like stuff is working.

Is this specific to the debug build, or is it going to affect regular builds as well?

@bas-aarts
Author

so I added --distinct_host_configuration=false to the bazel build just to see. Build was so much faster, as only the build is done. So far, seems like stuff is working.

Is this related to this debug build or it is going to impact also regular build?

While I have not verified it, this should not be debug-related.

@bhack
Contributor
bhack commented Jun 12, 2021

so I added --distinct_host_configuration=false to the bazel build just to see. Build was so much faster, as only the build is done. So far, seems like stuff is working.

Is this related to this debug build or it is going to impact also regular build?

While i have not verified, this should not be debug related

This could be interesting /cc @angerson @perfinion

@bas-aarts
Author

@reedwm, just following up regarding the documentation part of this bug. Any updates?

@reedwm
Member
reedwm commented Jun 21, 2021

No update yet, will try to do this this week.

TensorFlow-Docs-Copybara pushed a commit to tensorflow/docs that referenced this issue Jun 22, 2021
@reedwm
Member
reedwm commented Jun 23, 2021

This is fixed and documented. But for some reason, using --config=dbg makes gdb take a lot longer to start up and start running the program compared to simply passing -O0 -g to the same set of files with the following:

--per_file_copt=+tensorflow.*,-tensorflow/core/kernels.*@-O0,-g

I would have thought passing -c dbg to bazel would simply make bazel pass -O0 -g to gcc, but perhaps it is doing something more that causes gdb to take longer to start up.
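
To make the file selection in the --per_file_copt value above easier to follow, here is a minimal sketch of how such an include/exclude filter could be evaluated. It is not bazel's implementation: the helper names are hypothetical, the example paths are only illustrative, and it assumes that each pattern is a regular expression matched anywhere in the file path, with "+" marking an include pattern and "-" an exclude pattern.

```python
import re

FLAG_VALUE = "+tensorflow.*,-tensorflow/core/kernels.*@-O0,-g"


def parse_per_file_copt(value: str):
    """Split a per-file-copt style value into include patterns, exclude patterns, and options."""
    filters, _, options = value.partition("@")
    includes, excludes = [], []
    for item in filters.split(","):
        if item.startswith("-"):
            excludes.append(re.compile(item[1:]))
        else:
            # A leading "+" (or no prefix at all) is treated as an include pattern.
            includes.append(re.compile(item[1:] if item.startswith("+") else item))
    return includes, excludes, options.split(",")


def options_for(path: str, value: str = FLAG_VALUE) -> list:
    """Return the extra compiler options the filter would apply to this path."""
    includes, excludes, options = parse_per_file_copt(value)
    included = any(p.search(path) for p in includes)  # assumption: unanchored match
    excluded = any(p.search(path) for p in excludes)
    return options if included and not excluded else []


for path in (
    "tensorflow/compiler/jit/mark_for_compilation_pass.cc",
    "tensorflow/core/kernels/identity_op.cc",
    "external/llvm-project/llvm/lib/Transforms/Scalar/BDCE.cpp",
):
    print(path, options_for(path))
```

Under these assumptions, only the file under tensorflow/ that is not a kernel receives -O0 and -g. The kernel and the external dependency receive nothing extra.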
