CelebA: Resource cannot infer ExtractMethod from filename #2321

ericpts · 2020-08-19T09:40:23Z

Short description

When trying to load the CelebA dataset, the code fails with the error KeyError: <ExtractMethod.NO_EXTRACT: 1>, because the Resource() constructor is unable to infer the proper extraction method from the filename.

Environment information

Operating System: Debian Testing
Python version: 3.8.5
tensorflow-datasets version: '3.2.1'
tensorflow version: 2.3.0

Reproduction instructions

python3 -m tensorflow_datasets.scripts.download_and_prepare --datasets=celeb_a

Link to logs
https://paste.ubuntu.com/p/K8BRHzp27d/

Expected behavior
The dataset should be properly loaded.

Additional context

This happens in download_manager.py, in the function iter_archive.
The resource path is /home/ericpts/tensorflow_datasets/downloads/ucexport_download_id_0B7EVK8r0v71pZjFTYXZWM3FlDDaXUAQO8EGH_a7VqGNLRtW52mva1LzDrb-V723OQN8, and Resource is unable to guess the ExtractMethod from it because the file name has no extension.

If I manually specify it as ExtractMethod.ZIP, then everything works correctly.
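
The failure mode described above can be reproduced in isolation. The following is a simplified, self-contained sketch, not the tensorflow-datasets source: the enum, the extension table, and the extractor table are hypothetical stand-ins, written on the assumption that the library maps file extensions to an extraction method and then looks that method up in a table of extractors. A cached download name with no extension falls through to NO_EXTRACT, and the lookup then raises a KeyError like the one in the report.

```python
import enum


class ExtractMethod(enum.Enum):
    """Hypothetical stand-in for the library's enum (values are illustrative)."""
    NO_EXTRACT = 1
    ZIP = 2
    TAR = 3


# Assumed extension table; the real library may use a different one.
_EXTENSION_TO_METHOD = [
    (".zip", ExtractMethod.ZIP),
    (".tar", ExtractMethod.TAR),
]

# Assumed extractor table with no entry for NO_EXTRACT.
_EXTRACTORS = {
    ExtractMethod.ZIP: lambda path: f"unzip {path}",
    ExtractMethod.TAR: lambda path: f"untar {path}",
}


def guess_extract_method(path):
    """Guess the extraction method from the file name alone."""
    for extension, method in _EXTENSION_TO_METHOD:
        if path.endswith(extension):
            return method
    return ExtractMethod.NO_EXTRACT


def iter_archive(path, method=None):
    """Look up an extractor; raises KeyError when the method is NO_EXTRACT."""
    if method is None:
        method = guess_extract_method(path)
    return _EXTRACTORS[method](path)


# File name taken from the report above: it carries no extension.
cached = "ucexport_download_id_0B7EVK8r0v71pZjFTYXZWM3FlDDaXUAQO8EGH_a7VqGNLRtW52mva1LzDrb-V723OQN8"

try:
    iter_archive(cached)
except KeyError as err:
    print("KeyError:", err)

# Passing the method explicitly sidesteps the guess, as in the workaround above.
print(iter_archive(cached, ExtractMethod.ZIP))
```

Specifying the method by hand only helps if the cached file really is a ZIP archive; it does not help when the download itself returned something else.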

vijayphoenix · 2020-09-21T13:24:01Z

Hi @ericpts,
We recently fixed a similar bug, #2423 (fixed by bb2ce95).
This should appear in tfds-nightly soon.

Please try again and let us know the results.

Conchylicultor · 2020-09-21T13:32:20Z

The CelebA bug is likely a duplicate of #1482

zaccharieramzi · 2020-09-22T12:37:10Z

Hi @vijayphoenix,

I tried just now with tfds-nightly, and still got the KeyError: <ExtractMethod.NO_EXTRACT: 1> error.
Do you know when the next nightly release will be?
It seems that the nightly releases stopped on September 9th for some reason: https://pypi.org/project/tfds-nightly/#history.

vijayphoenix · 2020-10-01T20:36:40Z

Hi @zaccharieramzi,
Can you try again with updated tfds-nightly?

oricou · 2020-10-02T15:46:32Z

I have just tried with version 3.2.1.dev202010020107, and it still has the same issue:

KeyError: <ExtractMethod.NO_EXTRACT: 1>

tqa236 · 2020-10-19T09:24:17Z

I just tried today with tfds-nightly and got the same error:

KeyError: <ExtractMethod.NO_EXTRACT: 1>

sainivedh · 2020-10-24T09:26:42Z

import tensorflow_datasets as tfds

celeba_bldr = tfds.builder('celeb_a')
celeba_bldr.download_and_prepare()

Error:
KeyError: <ExtractMethod.NO_EXTRACT: 1>

Conchylicultor · 2020-10-26T09:59:40Z

As explained above, the CelebA bug is likely a duplicate of #1482

zaccharieramzi · 2020-11-02T14:05:22Z

Hi @vijayphoenix, I just retried with the latest nightly, and it gave the same error.

@Conchylicultor I have tried several times at different hours and I am not in China (and not using any VPN), so I guess the root cause of this issue is different from #1482.

However, when inspecting the downloads (for all files), I do see an HTML file that looks like the following (with the title Google Drive - Quota exceeded):

Sorry, you can't view or download this file at this time.

Too many users have viewed or downloaded this file recently. Please try accessing the file again later. If the file you are trying to access is particularly large or is shared with many people, it may take up to 24 hours to be able to view or download the file. If you still can't access a file after 24 hours, contact your domain administrator.

Maybe there is a problem related to this particular file for some reason.
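
One way to confirm this diagnosis is to check whether each file in the downloads directory is a real archive or a saved HTML page. The sketch below uses only the Python standard library; the function names classify_download and scan_downloads and the default directory are assumptions for illustration, and the HTML check is a simple heuristic on the first bytes of the file.

```python
import os
import zipfile


def classify_download(path):
    """Return 'zip', 'html', or 'other' based on the file's content."""
    if zipfile.is_zipfile(path):
        return "zip"
    with open(path, "rb") as f:
        head = f.read(512).lstrip().lower()
    if head.startswith(b"<!doctype html") or head.startswith(b"<html"):
        return "html"
    return "other"


def scan_downloads(directory):
    """Map each regular file in `directory` to its classification."""
    results = {}
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if os.path.isfile(path):
            results[name] = classify_download(path)
    return results


if __name__ == "__main__":
    # Assumed default location of the download cache.
    downloads = os.path.expanduser("~/tensorflow_datasets/downloads")
    if os.path.isdir(downloads):
        for name, kind in scan_downloads(downloads).items():
            print(f"{kind:5s} {name}")
```

Files reported as html are candidates for deletion and a later retry. Plain-text annotation files will show up as other, which is expected.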

Do you know how we can mitigate this issue by manually downloading the files?
I see the drive links listed here, but I don't know how to organise them in ~/tensorflow_datasets/celeb_a or whether I need to extract them.

zaccharieramzi · 2020-11-02T14:31:43Z

Ok, I think I understood how the manual downloading should work.

Basically, you can download all the files listed here, put them in ~/tensorflow_datasets/downloads/manual/, and then run the tfds.load('celeb_a'... as normal.
You can change the manual download path with the download config here.
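
A minimal sketch of that workflow, assuming tensorflow-datasets is installed and the files have already been downloaded by hand. DownloadConfig(manual_dir=...) and the download_and_prepare_kwargs argument of tfds.load are part of the public tfds API; the helper names default_manual_dir and load_celeb_a are hypothetical, and whether celeb_a picks up files from this directory is taken from the comment above rather than verified here.

```python
import os


def default_manual_dir(data_dir="~/tensorflow_datasets"):
    """Build the manual download path described in the comment above."""
    return os.path.join(os.path.expanduser(data_dir), "downloads", "manual")


def load_celeb_a(manual_dir=None):
    """Load celeb_a, pointing tfds at a directory of hand-downloaded files."""
    # Imported lazily so the path helper works without tfds installed.
    import tensorflow_datasets as tfds

    config = tfds.download.DownloadConfig(
        manual_dir=manual_dir or default_manual_dir(),
    )
    return tfds.load(
        "celeb_a",
        download_and_prepare_kwargs={"download_config": config},
    )


if __name__ == "__main__":
    print("Expecting manual files in:", default_manual_dir())
```

Calling load_celeb_a("/some/other/dir") would point the download config at a different location.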

vijayphoenix · 2020-12-29T14:58:15Z

Duplicate of #1482

You can also follow https://www.tensorflow.org/datasets/overview#troubleshooting for manual download

liqinglin54951 · 2021-01-28T19:30:49Z

celeb_a tfrecord files:
https://drive.google.com/drive/folders/1MKQ9sRwr5OOFk3OBzLz91SsgF3MBqvtP?usp=sharing
OR
you can follow " Create tfrecord files for 'test', 'train', 'validation' " on cp13_Parallelizing NN Training w TF_printoptions(precision)_squeeze_shuffle_batch_repeat_image process_map_tfrecords (https://blog.csdn.net/Linli522362242/article/details/112386820)

liqinglin54951 · 2021-01-28T19:49:46Z

KeyError: <ExtractMethod.NO_EXTRACT: 1> occurs since your code needs tfrecord files

ghost · 2021-04-24T17:53:10Z

I think that this problem is not fixed on Colab. I've tried on my laptop, and it works fine. But on Colab, nah, it just won't.

ericpts added the "bug" (Something isn't working) label Aug 19, 2020

Conchylicultor mentioned this issue Nov 30, 2020

get KeyError while loading celeb_a dataset with tfds(tensorflow_datasets) #2801

Closed

vijayphoenix closed this as completed Dec 29, 2020
