[go: nahoru, domu]

Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

tracking issue for upgrading protobuf to 3.19.0+ (and gRPC as well) #5708

Closed
Tracked by #6
nfelt opened this issue May 17, 2022 · 9 comments · Fixed by #6147
Closed
Tracked by #6

tracking issue for upgrading protobuf to 3.19.0+ (and gRPC as well) #5708

nfelt opened this issue May 17, 2022 · 9 comments · Fixed by #6147

Comments

@nfelt
Copy link
Collaborator
nfelt commented May 17, 2022

Googlers, see also b/219030239.

This is a tracking issue for updating our protobuf dependency to 3.19.0 or greater. That would simplify our lives in several respects: it would solve #5703, and it would allow us to upgrade gRPC past 1.30.0 (blocked by grpc/grpc#23311).

That update would involve both updating the protoc compiler we use to generate Python bindings, as well as updating the Python protobuf runtime version we depend on (see below for what that means).

This is currently blocked on TensorFlow's own upgrade (which is blocked by tensorflow/tensorflow#53234; Googlers, see b/182876485). We're blocked on TF because a previous attempt to update protobuf to 3.18.1 resulted in build breakages for other TF-ecosystem projects that rely on us, since our generated protos were incompatible with the older protobuf runtimes required by those projects.


Explainer: the way we depend on protobuf is a bit subtle. We build our .proto files into python libraries (py_pb2 targets) using this tb_proto_library build rule: https://github.com/tensorflow/tensorboard/blob/master/tensorboard/defs/protos.bzl

(Example of tb_proto_library usage here)

The build rule depends on 2 components that are provided by the protobuf dependency:

The protoc compiler is the tool that at build time actually generates _pb2.py files from the .proto files. Note however that we check the generated pb2.py files into our pip package, so that users don't need to run protoc themselves to execute TensorBoard locally, but when building TensorBoard with Bazel we do run protoc for this reason (which has its own issues).

The protobuf Python runtime is a common library that provides much of the implementation of the generated _pb2.py files, and as such it is required to be present at runtime in order to actually use those proto messages. Also, the protobuf compatibility policy is that the runtime version must always be at least as high as the protoc version used to generated the _pb2.py files. The protobuf Python runtime provided as a Bazel build-time dependency when building with Bazel, which basically guarantees that we use the same version for both runtime and protoc. However, running the TensorBoard pip package, we don't have the Bazel-built dependencies available and instead get our dependencies from installed pip packages, where the protobuf runtime is provided by the pip package protobuf: https://pypi.org/project/protobuf/ That opens up the possibility of version skew, which is why we set a lower bound in our requirements.txt. This lower bound must always be at least as high as the protoc version used to compile the protos in that release of TensorBoard.

@nfelt nfelt changed the title tracking issue for upgrading protobuf to 3.19.0+ (and updating gRPC as well) tracking issue for upgrading protobuf to 3.19.0+ (and gRPC as well) May 17, 2022
@saitcakmak
Copy link

As of protobuf 4.21.1 release, we're getting import errors while importing certain tensorboard components.

Repro:

conda create -n test_tensorboard python=3.7
conda activate test_tensorboard
pip install tensorboard
python
from tensorboard.backend.event_processing import plugin_event_multiplexer as event_multiplexer

Trace:

>>> from tensorboard.backend.event_processing import plugin_event_multiplexer as event_multiplexer
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/saitcakmak/opt/anaconda3/envs/test_tensorboard/lib/python3.7/site-packages/tensorboard/backend/event_processing/plugin_event_multiplexer.py", line 24, in <module>
    from tensorboard.backend.event_processing import (
  File "/Users/saitcakmak/opt/anaconda3/envs/test_tensorboard/lib/python3.7/site-packages/tensorboard/backend/event_processing/plugin_event_accumulator.py", line 23, in <module>
    from tensorboard.backend.event_processing import event_file_loader
  File "/Users/saitcakmak/opt/anaconda3/envs/test_tensorboard/lib/python3.7/site-packages/tensorboard/backend/event_processing/event_file_loader.py", line 20, in <module>
    from tensorboard import data_compat
  File "/Users/saitcakmak/opt/anaconda3/envs/test_tensorboard/lib/python3.7/site-packages/tensorboard/data_compat.py", line 20, in <module>
    from tensorboard.compat.proto import event_pb2
  File "/Users/saitcakmak/opt/anaconda3/envs/test_tensorboard/lib/python3.7/site-packages/tensorboard/compat/proto/event_pb2.py", line 17, in <module>
    from tensorboard.compat.proto import summary_pb2 as tensorboard_dot_compat_dot_proto_dot_summary__pb2
  File "/Users/saitcakmak/opt/anaconda3/envs/test_tensorboard/lib/python3.7/site-packages/tensorboard/compat/proto/summary_pb2.py", line 17, in <module>
    from tensorboard.compat.proto import tensor_pb2 as tensorboard_dot_compat_dot_proto_dot_tensor__pb2
  File "/Users/saitcakmak/opt/anaconda3/envs/test_tensorboard/lib/python3.7/site-packages/tensorboard/compat/proto/tensor_pb2.py", line 16, in <module>
    from tensorboard.compat.proto import resource_handle_pb2 as tensorboard_dot_compat_dot_proto_dot_resource__handle__pb2
  File "/Users/saitcakmak/opt/anaconda3/envs/test_tensorboard/lib/python3.7/site-packages/tensorboard/compat/proto/resource_handle_pb2.py", line 16, in <module>
    from tensorboard.compat.proto import tensor_shape_pb2 as tensorboard_dot_compat_dot_proto_dot_tensor__shape__pb2
  File "/Users/saitcakmak/opt/anaconda3/envs/test_tensorboard/lib/python3.7/site-packages/tensorboard/compat/proto/tensor_shape_pb2.py", line 42, in <module>
    serialized_options=None, file=DESCRIPTOR),
  File "/Users/saitcakmak/opt/anaconda3/envs/test_tensorboard/lib/python3.7/site-packages/google/protobuf/descriptor.py", line 560, in __new__
    _message.Message._CheckCalledFromGeneratedFile()
TypeError: Descriptors cannot not be created directly.
If this call came from a _pb2.py file, your generated code is out of date and must be regenerated with protoc >= 3.19.0.
If you cannot immediately regenerate your protos, some other possible workarounds are:
 1. Downgrade the protobuf package to 3.20.x or lower.
 2. Set PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python (but this will use pure-Python parsing and will be much slower).

More information: https://developers.google.com/protocol-buffers/docs/news/2022-05-06#python-updates
>>> 

@lucasjinreal
Copy link

When wil lthis be merged?

@Thomas-MMJ
Copy link

any progress on this?

@nfelt
Copy link
Collaborator Author
nfelt commented Nov 11, 2022

This continues to be blocked by TensorFlow's issue, unfortunately.

@Thomas-MMJ
Copy link
Thomas-MMJ commented Nov 12, 2022

So many machine learning projects depend on latest protobuf, but many others rely on tensorboard - could google create a 'protobuf' that is renamed something that doesn't conflict with the latest probobuf (say protobuf_legacy), then have pip, conda, etc. use the 'protobuf_legacy' for tensorboard till this is fixed?

nfelt added a commit that referenced this issue Jan 19, 2023
Fixes #5708 and #5703.

This updates our protobuf dependency to 3.19.6 in an attempt to address
#5708, and provide a cleaner solution to #5703.

The choice of 3.19.6 is meant to satisfy two competing constraints:

- Current Python protobuf runtimes (the 4.x series) only support
generated code from protoc versions 3.19.0+, as discussed in
https://protobuf.dev/news/2022-05-06/. As a result, prior to this
change, TensorBoard's pip package had to force its pip package
dependency to `protobuf < 4` to avoid the errors seen in #5703. This PR
lifts that restriction.

- Current TensorFlow is still stuck on protobuf 3.x, the same as we have
been, and as a result pins its pip package dependency using `protobuf <
3.20` (this could presumably be relaxed to `< 4` but that would require
new TF releases). As a result, we must support at least one protobuf
runtime version that also works with TF's constraints.

Our previous attempt at this upgrade (to ~3.18 or so) caused test
failures for Keras (which depends on TB, via TF, for the summary API
code), apparently due to a protobuf runtime that was too old for our
generated code. At the time, this was puzzling because they were
pip-installing a protobuf runtime version that should have been recent
enough - but I suspect now that this was a red herring, and bazel test
was actually getting the protobuf runtime from the protobuf build
dependency, not from the installed Python packages. If we see this
failure mode again, we'll have to get Keras to update the protobuf
Python runtime available in bazel tests.

Lastly, this upgrade lets us clean up some additional issues we had to
work around:

- We can also upgrade gRPC now, to 1.48.2. I selected this version since
it appears to be the most recent version prior to gRPC adopting protobuf
4.x (see
grpc/grpc@41ec08c)
- We can revert the backported fixes to protobuf and grpc from
#5793, since the upgraded
dependencies don't require patching
- We can back out rules_apple reintroduction from
#5561 that we only needed
for gRPC
@vadimkantorov
Copy link

should this be reopened for lifting <3.2.0 requirement?

@nfelt
Copy link
Collaborator Author
nfelt commented Feb 23, 2023

@vadimkantorov there isn't a <3.2.0 requirement, and certainly not from TensorBoard. You might mean the the <3.20 requirement from TensorFlow, which has already been lifted in this commit but hasn't yet been released in a stable release.

If you have a separate problem, please open a new issue.

@vadimkantorov
Copy link

Hmm, okay, I was confused by this recent behavior: pytorch/pytorch#90374, but maybe it occurred with an older version of tensorboard. If it actually works with newer protobufs it would be nice to note this explicitly in README, since this bug was around for a long time...

@nfelt
Copy link
Collaborator Author
nfelt commented Feb 23, 2023

The issue was fixed between when you filed that issue and now; the PR fixing it in January is linked earlier on this issue thread. We've released TensorBoard 2.12 which contains this fix, so if you have that version of TensorBoard installed you shouldn't see the PyTorch issue. We did mention the change in our release notes: https://github.com/tensorflow/tensorboard/releases/tag/2.12.0

facebook-github-bot pushed a commit to facebook/Ax that referenced this issue Feb 23, 2023
Summary:
This is no longer needed since tensorflow/tensorboard#5708 was resolved.

Pull Request resolved: #1449

Reviewed By: Balandat

Differential Revision: D43521202

Pulled By: saitcakmak

fbshipit-source-id: ddf852ab5ac27e1b6479554e0682b50078cd1574
yatbear pushed a commit to yatbear/tensorboard that referenced this issue Mar 27, 2023
Fixes tensorflow#5708 and tensorflow#5703.

This updates our protobuf dependency to 3.19.6 in an attempt to address
tensorflow#5708, and provide a cleaner solution to tensorflow#5703.

The choice of 3.19.6 is meant to satisfy two competing constraints:

- Current Python protobuf runtimes (the 4.x series) only support
generated code from protoc versions 3.19.0+, as discussed in
https://protobuf.dev/news/2022-05-06/. As a result, prior to this
change, TensorBoard's pip package had to force its pip package
dependency to `protobuf < 4` to avoid the errors seen in tensorflow#5703. This PR
lifts that restriction.

- Current TensorFlow is still stuck on protobuf 3.x, the same as we have
been, and as a result pins its pip package dependency using `protobuf <
3.20` (this could presumably be relaxed to `< 4` but that would require
new TF releases). As a result, we must support at least one protobuf
runtime version that also works with TF's constraints.

Our previous attempt at this upgrade (to ~3.18 or so) caused test
failures for Keras (which depends on TB, via TF, for the summary API
code), apparently due to a protobuf runtime that was too old for our
generated code. At the time, this was puzzling because they were
pip-installing a protobuf runtime version that should have been recent
enough - but I suspect now that this was a red herring, and bazel test
was actually getting the protobuf runtime from the protobuf build
dependency, not from the installed Python packages. If we see this
failure mode again, we'll have to get Keras to update the protobuf
Python runtime available in bazel tests.

Lastly, this upgrade lets us clean up some additional issues we had to
work around:

- We can also upgrade gRPC now, to 1.48.2. I selected this version since
it appears to be the most recent version prior to gRPC adopting protobuf
4.x (see
grpc/grpc@41ec08c)
- We can revert the backported fixes to protobuf and grpc from
tensorflow#5793, since the upgraded
dependencies don't require patching
- We can back out rules_apple reintroduction from
tensorflow#5561 that we only needed
for gRPC
dna2github pushed a commit to dna2fork/tensorboard that referenced this issue May 1, 2023
Fixes tensorflow#5708 and tensorflow#5703.

This updates our protobuf dependency to 3.19.6 in an attempt to address
tensorflow#5708, and provide a cleaner solution to tensorflow#5703.

The choice of 3.19.6 is meant to satisfy two competing constraints:

- Current Python protobuf runtimes (the 4.x series) only support
generated code from protoc versions 3.19.0+, as discussed in
https://protobuf.dev/news/2022-05-06/. As a result, prior to this
change, TensorBoard's pip package had to force its pip package
dependency to `protobuf < 4` to avoid the errors seen in tensorflow#5703. This PR
lifts that restriction.

- Current TensorFlow is still stuck on protobuf 3.x, the same as we have
been, and as a result pins its pip package dependency using `protobuf <
3.20` (this could presumably be relaxed to `< 4` but that would require
new TF releases). As a result, we must support at least one protobuf
runtime version that also works with TF's constraints.

Our previous attempt at this upgrade (to ~3.18 or so) caused test
failures for Keras (which depends on TB, via TF, for the summary API
code), apparently due to a protobuf runtime that was too old for our
generated code. At the time, this was puzzling because they were
pip-installing a protobuf runtime version that should have been recent
enough - but I suspect now that this was a red herring, and bazel test
was actually getting the protobuf runtime from the protobuf build
dependency, not from the installed Python packages. If we see this
failure mode again, we'll have to get Keras to update the protobuf
Python runtime available in bazel tests.

Lastly, this upgrade lets us clean up some additional issues we had to
work around:

- We can also upgrade gRPC now, to 1.48.2. I selected this version since
it appears to be the most recent version prior to gRPC adopting protobuf
4.x (see
grpc/grpc@41ec08c)
- We can revert the backported fixes to protobuf and grpc from
tensorflow#5793, since the upgraded
dependencies don't require patching
- We can back out rules_apple reintroduction from
tensorflow#5561 that we only needed
for gRPC
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants