Initial read on nonexistent tf.gfile.GFile in w+ mode crashes #32090

Open
eirism opened this issue Aug 29, 2019 · 15 comments
Labels: comp:ops, stat:awaiting tensorflower, TF 2.11, type:bug

Comments

@eirism
eirism commented Aug 29, 2019

System information

  • Have I written custom code: Yes
  • OS Platform and Distribution: Linux Ubuntu 18.04
  • TensorFlow installed from: binary
  • TensorFlow version: v1.14.0-rc1-22-gaf24dc9 1.14.0
  • Python version: 3.7

Describe the current behavior
Python raises tensorflow.python.framework.errors_impl.NotFoundError when doing a first read (no writes before it) on a nonexistent tf.gfile.GFile in w+ mode.

Describe the expected behavior
Read on an empty w+ file should return an empty string.
One problem with the current behaviour is that numpy.savez() crashes when writing to a GFile.

Code to reproduce the issue

import tensorflow as tf

with tf.io.gfile.GFile('test.txt', 'w+') as f:
    f.read()

Other info / logs

Traceback (most recent call last):
  File "test_gfile.py", line 5, in <module>
    f.read()
  File "/VENV/lib/python3.7/site-packages/tensorflow/python/lib/io/file_io.py", line 122, in read
    self._preread_check()
  File "/VENV/lib/python3.7/site-packages/tensorflow/python/lib/io/file_io.py", line 84, in _preread_check
    compat.as_bytes(self.__name), 1024 * 512)
tensorflow.python.framework.errors_impl.NotFoundError: test.txt; No such file or directory
@ravikyram ravikyram self-assigned this Aug 30, 2019
@ravikyram ravikyram added comp:ops OPs related issues TF 1.14 for issues seen with TF 1.14 labels Aug 30, 2019
@ravikyram
Contributor

test.txt is not in your current working directory. Please make sure test.txt and your Python file or Jupyter notebook are in the same directory. Thanks!

@ravikyram ravikyram added the stat:awaiting response Status - Awaiting response from author label Aug 30, 2019
@eirism
Author
eirism commented Aug 30, 2019

I expect that the w mode (and w+, w+b) should have similar semantics to Python open, where w means truncate the file first.

Since this isn't the case it breaks things like numpy.savez().

Workarounds are to either write something, e.g. "", to the file before reading so that the file is created, or manually create the file first as you suggest. But those are workarounds to a problem that I think should be solved in GFile.

@mihaimaruseac mihaimaruseac self-assigned this Aug 30, 2019
@mihaimaruseac mihaimaruseac removed the TF 1.14 for issues seen with TF 1.14 label Aug 30, 2019
@mihaimaruseac
Collaborator

This happens on both nightly and 2.0.

As I'm working on modularizing filesystem support, I'm assigning this to me, although it will take a while until I can get to the python side of things.

In the end, the expected behavior should be similar to Python's:

>>> with open('this_file_does_not_exist_at_all', 'w+') as f: f.read()
... 
''

@eirism
Author
eirism commented Aug 30, 2019

Related to the fact that GFile does not truncate files the same way as Python, reading an existing file opened with w+ will return the text in the file instead of "".

Example:

import tensorflow as tf

with open('existing.txt', 'w') as f:
    f.write('txt')
    f.flush()

with tf.io.gfile.GFile('existing.txt', 'w+') as f:
    print(f.read())  # Prints txt, should print ""

@mihaimaruseac
Collaborator

I'll have to handle that too, thanks for pointing it out

@eirism
Author
eirism commented Aug 30, 2019

f.seek() also crashes in the same way as f.read().
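
The three observations so far (first read on a new file, truncation of an existing file, and seek on a new file) can be gathered into one small probe. This is only a sketch: `check_w_plus` and its `opener` parameter are hypothetical names, not part of TensorFlow. With the built-in `open` it records the reference behavior. Passing `tf.io.gfile.GFile` as `opener` is an assumption about how one might compare the two, and per the reports above it would be expected to raise NotFoundError instead of returning.

```python
import os
import tempfile


def check_w_plus(opener=open):
    """Probe how `opener` treats 'w+' mode and return the observed results."""
    results = {}
    with tempfile.TemporaryDirectory() as tmp:
        # 1. First read on a file that does not exist yet.
        new_path = os.path.join(tmp, "new.txt")
        with opener(new_path, "w+") as f:
            results["read_new"] = f.read()

        # 2. Seek on a file that does not exist yet.
        seek_path = os.path.join(tmp, "seek.txt")
        with opener(seek_path, "w+") as f:
            f.seek(0)
            results["seek_new"] = "ok"

        # 3. Read on an existing, non-empty file: 'w+' should truncate it.
        old_path = os.path.join(tmp, "existing.txt")
        with open(old_path, "w") as f:
            f.write("txt")
        with opener(old_path, "w+") as f:
            results["read_existing"] = f.read()
    return results


# Reference semantics of the built-in open():
print(check_w_plus())
# {'read_new': '', 'seek_new': 'ok', 'read_existing': ''}
```

Any exception raised by `opener` propagates unchanged, so a failing implementation shows up as a traceback at the first step that misbehaves.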

@tensorflowbutler tensorflowbutler removed the stat:awaiting response Status - Awaiting response from author label Aug 31, 2019
@ravikyram ravikyram removed their assignment Sep 4, 2019
@ravikyram ravikyram added the type:bug Bug label Sep 4, 2019
@lminer
lminer commented Jan 7, 2021

Any update on this? I'm running into this error when trying to write to S3 with gfile and pysoundfile.

@lgeiger
Contributor
lgeiger commented May 13, 2021

@mihaimaruseac Any updates on this? This still seems to be an issue with TF 2.5 and prevents the use of GFile together with np.savez.

@sanatmpa1

The issue still exists in 2.6.0 and nightly. Here's the gist. Thanks!

@mihaimaruseac
Collaborator

The problem is that the modularization effort stalled after members left the team last year.

We onboarded new members recently, so we should pick up these items again. Apologies for the delays.

@patzm
Contributor
patzm commented Oct 27, 2021

from @eirism in #32090 (comment):

Workarounds are to either write something, e.g. "", to the file before reading so that the file is created, or manually create the file first as you suggest. But those are workarounds to a problem that I think should be solved in GFile.

Could you give an example of such a workaround? I tried the following unsuccessfully:

import numpy as np
import tensorflow as tf

with tf.io.gfile.GFile("output.npz", "w") as file:
    file.write("")
    np.savez(file, content=np.array([1, 2, 3]))

Still getting

tensorflow.python.framework.errors_impl.PermissionDeniedError: File isn't open for reading

@eirism
Author
eirism commented Oct 27, 2021

@patzm Since numpy.savez both reads and writes to the file, you need to open it in w+ mode. You get the error because w is only for writing; you cannot read from a file opened in w mode.
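
As a hedged illustration of the w versus w+ distinction, the sketch below uses Python's built-in `open` rather than GFile. That substitution is an assumption made so the snippet runs without TensorFlow: the built-in raises `io.UnsupportedOperation` where the thread reports GFile raising PermissionDeniedError. `can_read` is a hypothetical helper name.

```python
import io
import os
import tempfile


def can_read(mode):
    """Return True if a file opened with `mode` supports read()."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "probe.bin")
        with open(path, mode) as f:
            try:
                f.read()
            except io.UnsupportedOperation:
                # Write-only modes refuse to read at all.
                return False
            return True


for mode in ("w", "wb", "w+", "w+b"):
    print(mode, can_read(mode))
# w False
# wb False
# w+ True
# w+b True
```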

@patzm
Contributor
patzm commented Oct 28, 2021

Sadly, that also doesn't work for me, with neither w+ nor wb+. Could you post a minimal example?
Btw, I think this is also related to #32975.

@cnsgsz
cnsgsz commented Dec 4, 2021

A similar approach that works for me:

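import io
import numpy as np

# `gfile` is the commenter's TensorFlow gfile module; its import is not shown in the original.
io_buffer = io.BytesIO()
np.savez(io_buffer, ...)
with gfile.Open(path, "wb") as f:
  f.write(io_buffer.getvalue())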
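
The same buffer-then-write pattern can be sketched with the standard library alone. np.savez produces a zip archive, so `zipfile` stands in for it here. `write_via_buffer` and its `opener` parameter are hypothetical names. Passing `tf.io.gfile.GFile` as `opener` is an assumption about how this would be adapted, not something verified in this thread.

```python
import io
import os
import tempfile
import zipfile


def write_via_buffer(path, members, opener=open):
    """Build a zip archive in memory, then write it to `path` in one call.

    The destination only ever receives a single write(), so it never has
    to support read(), seek(), or tell().
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name, payload in members.items():
            archive.writestr(name, payload)
    with opener(path, "wb") as sink:
        sink.write(buffer.getvalue())


# Usage with the built-in open() on a temporary local path.
with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "output.zip")
    write_via_buffer(target, {"a.txt": b"hello"})
    with zipfile.ZipFile(target) as archive:
        print(archive.namelist())
# ['a.txt']
```

The trade-off is that the whole archive is held in memory before it is written, which may matter for very large arrays.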

owenvallis added a commit to tensorflow/similarity that referenced this issue Apr 5, 2022
Writing to a buffer to avoid read error in np.savez when using GFile.
See: tensorflow/tensorflow#32090
@mihaimaruseac mihaimaruseac removed their assignment May 31, 2022
@chunduriv chunduriv self-assigned this Jul 19, 2022
@chunduriv
Contributor
chunduriv commented Jul 19, 2022

I was able to replicate the issue in tf-nightly 2.12.0-dev20221215. Please find the gist for reference. Thank you.

@chunduriv chunduriv added the TF 2.9 Issues found in the TF 2.9 release (or RCs) label Jul 19, 2022
@chunduriv chunduriv added the stat:awaiting tensorflower Status - Awaiting response from tensorflower label Jul 27, 2022
@mohantym mohantym self-assigned this Jan 16, 2023
@mohantym mohantym added TF 2.11 Issues related to TF 2.11 and removed TF 2.9 Issues found in the TF 2.9 release (or RCs) labels Jan 16, 2023
@mohantym mohantym removed their assignment Jan 16, 2023
abeltheo pushed a commit to abeltheo/similarity that referenced this issue Mar 23, 2023
Writing to a buffer to avoid read error in np.savez when using GFile.
See: tensorflow/tensorflow#32090