[go: nahoru, domu]

Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Quick Fix for Python 3.7 #21202

Merged
merged 8 commits into from
Aug 27, 2018
Merged

Quick Fix for Python 3.7 #21202

merged 8 commits into from
Aug 27, 2018

Conversation

bstriner
Copy link
Contributor
@bstriner bstriner commented Jul 28, 2018

Fix for 3.7 I made before I found #20766

Just built CUDA 9.2 CuDNN 7 compute 6.1 python 3.7 on VS2017 using CMake.

That seems to be the place for working on what will be the final version, but I'm just putting this out there if anyone wants a quick fix in the meantime (currently that PR conflicts). This version casts away consts, so it isn't exactly safe but should be fine if you just want things up and running. Tensorflow overuses const_cast but that is an issue for another day.

Just merging this branch into your own work is an easy option if you need to build for 3.7. I'll close this PR when the other branch gets finished.

Related reading materials:

Cheers

@caisq caisq requested a review from asimshankar July 28, 2018 14:44
@caisq caisq self-assigned this Jul 28, 2018
@caisq caisq added awaiting review Pull request awaiting review kokoro:force-run Tests on submitted change labels Jul 28, 2018
@kokoro-team kokoro-team removed the kokoro:force-run Tests on submitted change label Jul 28, 2018
@gunan
Copy link
Contributor
gunan commented Jul 31, 2018

@alextp @asimshankar Should we merge this PR instead of #20766 since there has been no updates on the other PR?

@asimshankar
Copy link
Contributor
asimshankar commented Jul 31, 2018

Whichever PR makes progress is fine with me :)
I'll close that other one since it seems the author of the other hasn't gotten around to it.

That said, the comments on that PR still apply here:

  1. Let's avoid the const_cast.
  2. Let's isolate the changes to the SWIG files instead of changing the C API.

@bstriner : Would you be able to update this PR to do 2.?
(I had a pending change for 1., so can submit that unless you want to update the PR)

@asimshankar asimshankar mentioned this pull request Jul 31, 2018
@tensorflowbutler tensorflowbutler removed the awaiting review Pull request awaiting review label Jul 31, 2018
tensorflow-copybara pushed a commit that referenced this pull request Aug 1, 2018
…char*"

instead of a "char*".

(See https://docs.python.org/3/whatsnew/3.7.html#c-api-changes)

There are additional changes needed for Python 3.7 compatibility,
this change just pulls out one of them
(and subsumes a related attempt in #21202 and #20766)

Helps with #20517

PiperOrigin-RevId: 207008013
@asimshankar
Copy link
Contributor

Commited abb903d for avoiding the const_cast.

@bstriner : Are you interested in addressing 2. in this PR?

@Hong-Xiang
Copy link
Hong-Xiang commented Aug 13, 2018

@asimshankar I'm trying to change SWIG interface to fix the problem as you suggested. However I got the following error:

ERROR: 
(cache directory)/external/protobuf_archive/BUILD:659:1: C++ compilation of rule '@protobuf_archive//:python/google/protobuf/pyext/_message.so' failed (Exit 1)
external/protobuf_archive/python/google/protobuf/pyext/descriptor_containers.cc: In function ‘bool google::protobuf::python::descriptor::_GetItemByKey(google::protobuf::python::PyContainer*, PyObject*, const void**)’:
external/protobuf_archive/python/google/protobuf/pyext/descriptor_containers.cc:69:45: error: invalid conversion from ‘const char*’ to ‘char*’ [-fpermissive]
        ((*(charpp) = PyUnicode_AsUTF8AndSize(ob, (sizep))) == NULL? -1: 0): \
                                             ^
external/protobuf_archive/python/google/protobuf/pyext/descriptor_containers.cc:172:13: note: in expansion of macro ‘PyString_AsStringAndSize’
         if (PyString_AsStringAndSize(key, &name, &name_size) < 0) {
             ^
external/protobuf_archive/python/google/protobuf/pyext/descriptor_containers.cc:69:45: error: invalid conversion from ‘const char*’ to ‘char*’ [-fpermissive]
        ((*(charpp) = PyUnicode_AsUTF8AndSize(ob, (sizep))) == NULL? -1: 0): \
                                             ^
external/protobuf_archive/python/google/protobuf/pyext/descriptor_containers.cc:189:13: note: in expansion of macro ‘PyString_AsStringAndSize’
         if (PyString_AsStringAndSize(key, &camelcase_name, &name_size) < 0) {
             ^
At global scope:
cc1plus: warning: unrecognized command line option ‘-Wno-writable-strings’
Target //tensorflow/tools/pip_package:build_pip_package failed to build
Use --verbose_failures to see the command lines of failed build steps.

It seems that const_cast problem still exists, and I just can't find file named descriptor_containers.cc. It seems this is caused by protobuf, can we solve this only modifying source of tensorflow?

Fix of protobuf

@ppwwyyxx
Copy link
Contributor
ppwwyyxx commented Aug 13, 2018

protobuf 3.6.1 has fixed the problem already.
EDIT: I read it wrong. See below.

@asimshankar
Copy link
Contributor

@ppwwyyxx : I might be reading incorrectly, but it seems that protobuf 3.6.1 does not include the change to make it compatible with Python 3.7 (commit protocolbuffers/protobuf@0a59054 isn't included in the 3.6.x branch)

@Hong-Xiang : You're right, we'll need to update the version of protobuf that TensorFlow depends on, perhaps after there is a release version of protobuf that is compatible with Python 3.7?

@martinwicke
Copy link
Member

I think this would also fix #20444.

@bstriner
Copy link
Contributor Author
bstriner commented Aug 13, 2018 via email

@bstriner
Copy link
Contributor Author

@asimshankar the renames seem cleaner if we do them in the cpps, but I'm willing to go either way. Bulky to have custom wrappers for a few functions just because of the async name conflict.

  • Unfortunately, no purely swig renaming like there is for function names
  • A macro would work, but just blanket replacing the word async feels like overkill
  • Could create new functions that wrap the old functions and have better parameter names
  • Could rename the parameters in the cpps

Any way we do it, any thoughts on the rename? async_ is closest typographically but maybe something like is_async is better semantically.

Thanks!

@bstriner
Copy link
Contributor Author

@asimshankar BTW, do the const_casts in TF bother anyone else? This idiom tastes like spaghetti: reinterpret_cast<T*>(const_cast<char*>((out->tensor_data().data())));.

The void pointer gets casted up to const then const_casted back, and gets casted from void to char to T. Getting out->buf_ would be great if there was an accessor.

@asimshankar
Copy link
Contributor

Thanks @bstriner : It seems that at this point this PR can be reduced to 2 lines, let's just do that: Change async to enable in the two functions in c_api.h (or if you want in the .cc file too). If you want to update the PR to do that, that would be awesome.

Regarding const_cast, I haven't audited all the uses, but in general, yeah we prefer to avoid them, though it may not always be possible.

@bstriner
Copy link
Contributor Author

This includes the protobuf bump as well just so I could get it to build, but let me know what you want to do about that. Moved the protobuf data into defines so it is just one place instead of 3 places.

I don't have a good way to get rid of the const_casting in pywrap_tfe.i. TFE_GetPythonString is now returning a const char*, which makes sense.

SWIG is currently generating

char *arg2 = (char *) 0 ;
arg2 = const_cast<char*>(TFE_GetPythonString(obj1));
TFE_ContextAddFunctionDef(arg1,(char const *)arg2,arg3,arg4);

I want it to generate this but I can't figure out how to make arg2 a const

const char *arg2 = (const char *) 0 ;
arg2 = TFE_GetPythonString(obj1);
TFE_ContextAddFunctionDef(arg1,arg2,arg3,arg4);

Does anyone know a way to get that const the way I want it? Something like typemap(arginit) but for declarations? I tried things like %typemap(in) const char* device_name (const char*) but can't figure out any syntax that will work.

@bstriner
Copy link
Contributor Author

@asimshankar the thing responsible for like 50% of the const_cast (just eyeballing it) is tensor_data. There is no way around using a bunch of const_cast. Tensors have a void* to the data, but the only getter (tensor_data) gets you a const char* even if you want to mutate the data. A getter like T* tensor::data_ptr<T>() would eliminate that mess and the casting.

@bstriner
Copy link
Contributor Author

Just built python3.7 successfully on both linux and windows. New protobuf has issues on windows so I had to make a patch file.

Add these 2 patch files to build on windows:

@joneepenk
Copy link

@bstriner nice work. hope the pr merged soon.

Copy link
Contributor
@asimshankar asimshankar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@asimshankar asimshankar added kokoro:force-run Tests on submitted change ready to pull PR ready for merge process and removed stat:awaiting response Status - Awaiting response from author labels Aug 27, 2018
@kokoro-team kokoro-team removed the kokoro:force-run Tests on submitted change label Aug 27, 2018
@bstriner
Copy link
Contributor Author

Thanks! Is there a separate issue already for tracking protobuf and eigen fixes? Also might want to start an issue in swig for that const issue.

@daa
Copy link
daa commented Aug 27, 2018

@bstriner protobuf issue - #20950

@gunan gunan added the kokoro:force-run Tests on submitted change label Aug 27, 2018
@kokoro-team kokoro-team removed the kokoro:force-run Tests on submitted change label Aug 27, 2018
@tensorflow-copybara tensorflow-copybara merged commit 6f46906 into tensorflow:master Aug 27, 2018
tensorflow-copybara pushed a commit that referenced this pull request Aug 27, 2018
PiperOrigin-RevId: 210447509
@bstriner bstriner deleted the py37 branch August 28, 2018 01:28
@pbk0
Copy link
pbk0 commented Sep 1, 2018

Come on guys please upgrade to python 3.7 and make official release..... I need typing and data classes from python 3.7

@bstriner
Copy link
Contributor Author
bstriner commented Sep 2, 2018

@SpikingNeurons I feel you, but can you post any issues you have building from source with the above?

@pbk0
Copy link
pbk0 commented Sep 2, 2018

@bstriner Unfortunately, I depend on windows binaries and don't compile from source. Its shame that I cannot add value.

@mattpepin
Copy link

I'm getting this error as soon as I change the protobuf URLs/checksum/prefix for the ones above:

ERROR: Analysis of target '//tensorflow/tools/pip_package:build_pip_package' failed; build aborted: error loading package 'tensorflow': Extension file not found. Unable to load package for '@bazel_skylib//:lib.bzl': The repository could not be resolved

Works fine when I change it for 3.6.1, but the fix for Python 3.7 is not included. Any idea what I'm doing wrong?

@bstriner
Copy link
Contributor Author
bstriner commented Sep 19, 2018

@mattpepin copy bazel_skylib from protobuf workspace to tensorflow workspace. Let me know if you have any issues after that.

@mattpepin
Copy link

@bstriner Thanks! It worked perfectly!

@RyanRosario
Copy link
RyanRosario commented Oct 14, 2018

@bstriner I don't understand if this issue has been resolved for Ubuntu with a PR+merge or not. I tried a git checkout r1.10, a bazel clean, reconfigure and build, but still get the Protobuf error. Am I supposed to do something manual for this to build on Python 3.7, Ubuntu 18.04? r1.12 also does not compile.

@Hoeze
Copy link
Hoeze commented Oct 15, 2018

I just tried to build the r1.12 branch:

ERROR: /home/[...]/.cache/bazel/_bazel_[...]/20b747b5896a556e0b493e9e90c43fcf/external/protobuf_archive/BUILD:659:1: C++ compilation of rule '@protobuf_archive//:python/google/protobuf/pyext/_message.so' failed (Exit 1)

external/protobuf_archive/python/google/protobuf/pyext/extension_dict.cc: In function 'PyObject* google::protobuf::python::extension_dict::_FindExtensionByName(google::protobuf::python::ExtensionDict*, PyObject*)':
external/protobuf_archive/python/google/protobuf/pyext/extension_dict.cc:56:45: error: invalid conversion from 'const char*' to 'char*' [-fpermissive]
        ((*(charpp) = PyUnicode_AsUTF8AndSize(ob, (sizep))) == NULL? -1: 0): \
                      ~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~
external/protobuf_archive/python/google/protobuf/pyext/extension_dict.cc:184:7: note: in expansion of macro 'PyString_AsStringAndSize'
   if (PyString_AsStringAndSize(arg, &name, &name_size) < 0) {
       ^~~~~~~~~~~~~~~~~~~~~~~~
At global scope:
cc1plus: warning: unrecognized command line option '-Wno-writable-strings'
Target //tensorflow/tools/pip_package:build_pip_package failed to build
Use --verbose_failures to see the command lines of failed build steps.

I'm using Arch linux with up-to-date packages, i.e. Python 3.7 and so on.

@bstriner
Copy link
Contributor Author

@RyanRosario correct. There is no stable release of protobuf with the required changes. See discussion above and in the linked post about how to switch protobuf versions. Feel free to check github for current master and try that version.

@Hoeze That is also a protobuf issue. Same story.

@NewUserHa
Copy link
INFO: Build options have changed, discarding analysis cache.
ERROR: C:/tensorflow/tensorflow/tools/pip_package/BUILD:223:1: error loading package 'tensorflow': Extension file not found. Unable to load package for '@bazel_skylib//:lib.bzl': The repository could not be resolved and referenced by '//tensorflow/tools/pip_package:build_pip_package'
ERROR: Analysis of target '//tensorflow/tools/pip_package:build_pip_package' failed; build aborted: error loading package 'tensorflow': Extension file not found. Unable to load package for '@bazel_skylib//:lib.bzl': The repository could not be resolved
INFO: Elapsed time: 0.380s

@mattpepin copy bazel_skylib from protobuf workspace to tensorflow workspace. Let me know if you have any issues after that.

what does this mean?
there's nothing in \tensorflow\third_party\protobuf nor any string match 'skylib' in \tensorflow\workspace.bzl

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cla: yes ready to pull PR ready for merge process
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet