CelebA: Resource cannot infer ExtractMethod from filename #2321

Closed
ericpts opened this issue Aug 19, 2020 · 14 comments
Labels
bug Something isn't working

Comments

@ericpts
ericpts commented Aug 19, 2020

Short description

When trying to load the CelebA dataset, the code fails with the error KeyError: <ExtractMethod.NO_EXTRACT: 1>, because the
Resource() constructor is unable to figure out the proper method from the filename.

Environment information

  • Operating System: Debian Testing
  • Python version: 3.8.5
  • tensorflow-datasets version: '3.2.1'
  • tensorflow version: 2.3.0

Reproduction instructions

python3 -m tensorflow_datasets.scripts.download_and_prepare --datasets=celeb_a

Link to logs
https://paste.ubuntu.com/p/K8BRHzp27d/

Expected behavior
The dataset should be properly loaded.

Additional context

This happens in download_manager.py, in the function iter_archive.
resource is /home/ericpts/tensorflow_datasets/downloads/ucexport_download_id_0B7EVK8r0v71pZjFTYXZWM3FlDDaXUAQO8EGH_a7VqGNLRtW52mva1LzDrb-V723OQN8, and Resource is unable to guess the ExtractMethod from the path.

If I manually specify it as ExtractMethod.ZIP, then everything works correctly.
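A minimal sketch of the failure described above, assuming the extract method is guessed from the file extension and then used as a dictionary key. The enum, both lookup tables and both functions are hypothetical stand-ins written for this illustration, not the tensorflow-datasets source; only the names ExtractMethod, NO_EXTRACT, ZIP and iter_archive come from this issue.

```python
import enum
import os


class ExtractMethod(enum.Enum):
    """Hypothetical stand-in; only the member names come from the issue."""
    NO_EXTRACT = 1
    ZIP = 2


# Hypothetical extension table; the real library's rules may differ.
_EXTENSION_TO_METHOD = {".zip": ExtractMethod.ZIP}

# Hypothetical table of archive readers. NO_EXTRACT deliberately has no entry.
_ARCHIVE_READERS = {
    ExtractMethod.ZIP: lambda path: f"would iterate zip members of {path}",
}


def guess_extract_method(path):
    """Guess the method from the file extension, defaulting to NO_EXTRACT."""
    _, ext = os.path.splitext(os.path.basename(path))
    return _EXTENSION_TO_METHOD.get(ext.lower(), ExtractMethod.NO_EXTRACT)


def iter_archive(path, extract_method=None):
    """Look up a reader for the explicit method, or for the guessed one."""
    if extract_method is None:
        extract_method = guess_extract_method(path)
    # Raises KeyError when the method has no reader, e.g. NO_EXTRACT.
    return _ARCHIVE_READERS[extract_method](path)
```

With the extension-less download path from this report, guess_extract_method falls back to NO_EXTRACT and the lookup raises KeyError, while passing ExtractMethod.ZIP explicitly skips the guess, which matches the workaround above.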

@ericpts ericpts added the bug Something isn't working label Aug 19, 2020
@vijayphoenix
Contributor
vijayphoenix commented Sep 21, 2020

Hi @ericpts,
We recently fixed a similar bug, #2423 (fixed by bb2ce95).
This should appear in tfds-nightly soon.

Please try again and let us know the results.

@Conchylicultor
Member

The CelebA bug is likely a duplicate of #1482

@zaccharieramzi

Hi @vijayphoenix,

I tried just now with tfds-nightly and still got the KeyError: <ExtractMethod.NO_EXTRACT: 1> error.
Do you know when the next nightly release will be?
Releases seem to have stopped on September 9th for some reason (see https://pypi.org/project/tfds-nightly/#history).

@vijayphoenix
Contributor

Hi @zaccharieramzi,
Can you try again with updated tfds-nightly?

@oricou
oricou commented Oct 2, 2020

I have just tried with version 3.2.1.dev202010020107, and it still has the same issue:

KeyError: <ExtractMethod.NO_EXTRACT: 1>

@tqa236
tqa236 commented Oct 19, 2020

I just tried today with tfds-nightly and got the same error:

KeyError: <ExtractMethod.NO_EXTRACT: 1>

@sainivedh

import tensorflow_datasets as tfds

celeba_bldr = tfds.builder('celeb_a')
celeba_bldr.download_and_prepare()

Error:
KeyError: <ExtractMethod.NO_EXTRACT: 1>

@Conchylicultor
Member

As explained above, the CelebA bug is likely a duplicate of #1482

@zaccharieramzi
zaccharieramzi commented Nov 2, 2020

Hi @vijayphoenix, I just retried with the latest nightly, and it gave the same error.

@Conchylicultor I have tried several times at different hours, and I am not in China (and not using any VPN), so I guess the root cause of this issue is different from that of #1482.

However, when inspecting the downloads (for all files), I indeed see an html file that looks like the following (with title Google Drive - Quota exceeded):

Sorry, you can't view or download this file at this time.

Too many users have viewed or downloaded this file recently. Please try accessing the file again later. If the file you are trying to access is particularly large or is shared with many people, it may take up to 24 hours to be able to view or download the file. If you still can't access a file after 24 hours, contact your domain administrator.

Maybe there is a problem related to this particular file for some reason.
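The check described above (looking at what the downloaded files actually contain) can be sketched with the standard library alone. This is an illustration under assumptions: the helper names classify_download and report_downloads are made up for this example, and the HTML test is a simple heuristic on the first bytes of the file, not something TFDS does.

```python
import zipfile
from pathlib import Path


def classify_download(path):
    """Return 'zip', 'html' or 'other' based on file content, not its name."""
    path = Path(path)
    if zipfile.is_zipfile(str(path)):
        return "zip"
    # Read only the start of the file; an HTML error page begins with markup.
    with open(path, "rb") as f:
        head = f.read(512).lstrip().lower()
    if head.startswith(b"<!doctype html") or head.startswith(b"<html"):
        return "html"
    return "other"


def report_downloads(download_dir):
    """Classify every regular file directly under download_dir."""
    return {
        p.name: classify_download(p)
        for p in sorted(Path(download_dir).iterdir())
        if p.is_file()
    }
```

Running report_downloads on the downloads folder (for example ~/tensorflow_datasets/downloads, expanded to an absolute path) should show at a glance which entries are real archives and which are HTML pages such as the quota message quoted above.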

Do you know how we can work around this issue by downloading the files manually?
I see the Drive links listed here, but I don't know how to organise them in ~/tensorflow_datasets/celeb_a, or whether I need to extract them.

@zaccharieramzi

Ok, I think I understood how the manual downloading should work.

Basically you can download all the files listed here, put them in ~/tensorflow_datasets/downloads/manual/ and you can just run the tfds.load('celeb_a'... as normal.
You can change the manually downloaded path with the download config here.
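A rough sketch of that setup, assuming the files have already been downloaded by hand. tfds.download.DownloadConfig(manual_dir=...) and the download_and_prepare_kwargs argument of tfds.load are existing TFDS options; the helper names default_manual_dir and load_celeb_a are made up for this example, and whether the files are picked up depends on the installed TFDS version.

```python
import os


def default_manual_dir(data_dir="~/tensorflow_datasets"):
    """Return the manual-download folder mentioned in this thread."""
    return os.path.join(os.path.expanduser(data_dir), "downloads", "manual")


def load_celeb_a(manual_dir=None):
    """Load CelebA, pointing TFDS at a folder of manually downloaded files."""
    # Imported lazily so default_manual_dir works without TFDS installed.
    import tensorflow_datasets as tfds

    if manual_dir is None:
        manual_dir = default_manual_dir()
    config = tfds.download.DownloadConfig(manual_dir=manual_dir)
    return tfds.load(
        "celeb_a",
        download_and_prepare_kwargs={"download_config": config},
    )
```

Calling load_celeb_a() uses the default folder from the comment above; passing another path covers the case where the files live elsewhere.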

@vijayphoenix
Contributor

Duplicate of #1482

You can also follow https://www.tensorflow.org/datasets/overview#troubleshooting for instructions on manual download.

@liqinglin54951

celeb_a tfrecord files:
https://drive.google.com/drive/folders/1MKQ9sRwr5OOFk3OBzLz91SsgF3MBqvtP?usp=sharing
OR
you can follow " Create tfrecord files for 'test', 'train', 'validation' " on cp13_Parallelizing NN Training w TF_printoptions(precision)_squeeze_shuffle_batch_repeat_image process_map_tfrecords (https://blog.csdn.net/Linli522362242/article/details/112386820)

@liqinglin54951

KeyError: <ExtractMethod.NO_EXTRACT: 1> occurs because your code needs tfrecord files

@ghost
ghost commented Apr 24, 2021

I think this problem is still not fixed in Colab. I've tried on my laptop, and it works fine. But on Colab, it just won't work.

8 participants