[go: nahoru, domu]

Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Increase default safety of Keras tar extraction #69752

Draft
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

WilliamRoyNelson
Copy link

Python 3.12 adds increased safety options for tarfile.extractall() but are disabled by default.

This enables an tarfile extraction filter if using Python 3.12 or higher.

The default for tarfile.extractall() will be set to 'data' in Python 3.14.

Applications and system integrators may wish to change extraction_filter of the TarFile class itself to set a global default. When using a function, they will generally want to wrap it in staticmethod() to prevent injection of a self argument.

References:
https://docs.python.org/3/library/tarfile.html#tarfile-extraction-filter
PEP 706 – Filter for tarfile.extractall

@google-ml-butler google-ml-butler bot added the size:S CL Change Size: Small label Jun 14, 2024
Copy link
google-cla bot commented Jun 14, 2024

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

@gbaned gbaned added this to Assigned Reviewer in PR Queue via automation Jun 17, 2024
Python 3.12 adds increased safety options for tarfile.extractall() but are disabled by default.

This enables an tarfile extraction filter.
https://docs.python.org/3/library/tarfile.html#tarfile-extraction-filter
@gbaned gbaned self-requested a review June 18, 2024 03:57
@google-ml-butler google-ml-butler bot added the awaiting review Pull request awaiting review label Jun 18, 2024
@mihaimaruseac
Copy link
Collaborator

I think this might need to be made on the keras org repo, instead of here

@WilliamRoyNelson
Copy link
Author

I think this might need to be made on the keras org repo, instead of here

I think you might be right, based on this:
https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/keras/README.md

The big thing to me is that even if the code is on the way out (for the last 3 years?), it's still there and still gets flagged by certain security scans as an unsafe method of extracting .tar files.

I'm looking at some of the demo notebooks, and unless I'm misunderstanding how the imports work, they use the Keras functions in this library, not the Keras org repo functions.

"source": [
"image_path = tf.keras.utils.get_file(\n",
" 'flower_photos.tgz',\n",
" 'https://storage.googleapis.com/download.tensorflow.org/example_images/flower_photos.tgz',\n",
" extract=True)\n",
"image_path = os.path.join(os.path.dirname(image_path), 'flower_photos')"
]

@mihaimaruseac
Copy link
Collaborator

So, there are 2 Keras(es): one is Keras 3 that should be multi-framework, write code once and change an environment variable and now you use PyTorch as backend, change it again and now you use JAX. The other one is Keras as it was inside TF between TF 2.0 and TF 2..

The latter is what is still in the repo, but it should be taken as a frozen copy. All new updates should occur on the newer multi-framework Keras instead.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
awaiting review Pull request awaiting review size:S CL Change Size: Small
Projects
PR Queue
  
Assigned Reviewer
Development

Successfully merging this pull request may close these issues.

None yet

3 participants