sc.read_h5ad randomly produces AnnDataReadError/OSError #1351
That problem occurs within h5py (we just wrap the underlying OSError) and isn't a consequence of how scanpy uses h5py. The relevant part of the traceback is:

```
OSError: Can't read data (file read failed:
  time = Sat Aug 1 13:27:54 2020,
  filename = '/path.../filtered_gene_bc_matrices.h5ad',
  file descriptor = 47,
  errno = 5,
  error message = 'Input/output error',
  buf = 0x55ec782e9031,
  total read size = 7011,
  bytes this sub-read = 7011,
  bytes actually read = 18446744073709551615,
  offset = 0)
```

The reported filename looks weird: I assume … See also: …
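Since the error above comes from the storage layer, one way to rule out scanpy, anndata, and h5py entirely is to read the raw bytes with plain Python I/O. This is a diagnostic sketch, not something from the thread; the function name `probe_read` is mine:

```python
def probe_read(path, chunk_size=1 << 20):
    """Sequentially read an entire file with plain Python I/O.

    If this also raises OSError with errno 5 ('Input/output error'),
    the fault is in the file system or network mount, not in scanpy,
    anndata, or h5py. Returns the number of bytes read on success.
    """
    total = 0
    with open(path, "rb") as f:
        while True:
            buf = f.read(chunk_size)
            if not buf:  # EOF reached without an I/O error
                return total
            total += len(buf)
```

If `probe_read` succeeds but `sc.read_h5ad` still fails, the problem is more likely in h5py's access pattern (many small seeks) interacting with the mount.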
This is quite a common error on our internal servers @Hrovatin. I have been getting around it by reading from a different server, and then it often just works. It would be great if you could figure out what the issue might be.
Which server do you suggest? I had tried a couple with no success. I am having a lot of trouble with it: I get errors when reading different parts of the file, even when using just h5py.
I have been moving between interactive servers not on the queue.
From my time in @theislab I infer this means it's a network-mount problem. You can probably fix it by putting the file somewhere in the local file system. Since /home/* is network-mounted, that means /localscratch/ or /tmp/, I assume.
Thank you very much @flying-sheep: copying to /tmp for the duration of the read currently seems to work.
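The copy-to-local workaround can be wrapped in a small helper. This is a sketch under the assumption that the system temporary directory lives on local disk (on clusters you may need to point `TMPDIR` at `/localscratch/`); `read_h5ad_local` and the `reader` parameter are my names, not scanpy API:

```python
import shutil
import tempfile
from pathlib import Path

def read_h5ad_local(path, reader):
    """Copy a file from a network mount to local scratch, then read it.

    `reader` is whatever parsing function you want to apply, e.g.
    scanpy.read_h5ad. The copy is one sequential read, which tends to
    survive flaky mounts better than h5py's many small seeks, and the
    parse itself then hits local disk. The local copy is deleted
    automatically when the context manager exits.
    """
    src = Path(path)
    with tempfile.TemporaryDirectory() as tmpdir:
        local = Path(tmpdir) / src.name
        shutil.copy2(src, local)  # single sequential read over the mount
        return reader(local)      # parse from local disk
```

Usage would look like `adata = read_h5ad_local("/home/me/data/adata.h5ad", sc.read_h5ad)`.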
Great to hear! Usually when there are weird, site-specific errors, I say I can't help because I don't have SSH access and "my crystal ball is currently out of order". Seems like my crystal ball worked just fine this time!
@flying-sheep just wait until tomorrow... when the next random error occurs ;).
Found the same error in our internal workflows. I saved the data to h5ad files, but could not open them anymore for some reason. Error:

```
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
/opt/conda/lib/python3.7/site-packages/anndata/_io/utils.py in func_wrapper(elem, *args, **kwargs)
    155     try:
--> 156         return func(elem, *args, **kwargs)
    157     except Exception as e:

/opt/conda/lib/python3.7/site-packages/anndata/_io/h5ad.py in read_group(group)
    531     if encoding_type:
--> 532         EncodingVersions[encoding_type].check(
    533             group.name, group.attrs["encoding-version"]

/opt/conda/lib/python3.7/enum.py in __getitem__(cls, name)
    356     def __getitem__(cls, name):
--> 357         return cls._member_map_[name]
    358

KeyError: 'dict'

During handling of the above exception, another exception occurred:

AnnDataReadError                          Traceback (most recent call last)
<ipython-input-20-38a594ec7d06> in <module>
----> 1 adata_ast=sc.read_h5ad('../../data_processed/Leng_2020/adata_ast.h5ad')

/opt/conda/lib/python3.7/site-packages/anndata/_io/h5ad.py in read_h5ad(filename, backed, as_sparse, as_sparse_fmt, chunk_size)
    424             d[k] = read_dataframe(f[k])
    425         else:  # Base case
--> 426             d[k] = read_attribute(f[k])
    427
    428     d["raw"] = _read_raw(f, as_sparse, rdasp)

/opt/conda/lib/python3.7/functools.py in wrapper(*args, **kw)
    838                             '1 positional argument')
    839
--> 840         return dispatch(args[0].__class__)(*args, **kw)
    841
    842     funcname = getattr(func, '__name__', 'singledispatch function')

/opt/conda/lib/python3.7/site-packages/anndata/_io/utils.py in func_wrapper(elem, *args, **kwargs)
    161         parent = _get_parent(elem)
    162         raise AnnDataReadError(
--> 163             f"Above error raised while reading key {elem.name!r} of "
    164             f"type {type(elem)} from {parent}."
    165         )

AnnDataReadError: Above error raised while reading key '/layers' of type <class 'h5py._hl.group.Group'> from /.
```
Has anyone found a solution to work around this issue?
I want to follow up and see if this has a solution.
Having the same issue.
Same.
I'm pretty sure none of you are having the same issue as the original one reported here. Compare @abuchin's error message of KeyError: 'dict' with the original OSError. The thing you're seeing is a new one stemming from an update to anndata: you're trying to read a file written by a newer anndata with an older version that doesn't recognise the newer on-disk encoding. Upgrade your anndata and you should be OK.
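A quick way to check whether your installed anndata predates the newer on-disk encodings is a simple version comparison. This is a sketch; the 0.8.0 threshold is taken from later comments in this thread, not from an official compatibility table, and the function name is mine:

```python
from importlib.metadata import version  # stdlib, Python 3.8+

def anndata_is_recent_enough(installed=None, required="0.8.0"):
    """Return True if `installed` (default: the installed anndata
    version) is at least `required`.

    Naive comparison of the first three dotted integer components;
    pre-release suffixes such as '0.7.8rc1' are not handled.
    """
    if installed is None:
        installed = version("anndata")  # PackageNotFoundError if absent

    def parse(v):
        return tuple(int(part) for part in v.split(".")[:3])

    return parse(installed) >= parse(required)
```

If this returns False, `pip install --upgrade anndata` (inside the right environment) is the fix suggested below.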
The solution from @ktpolanski fixed this for me.
Thanks @ktpolanski. Problem solved.
I was facing this issue in 0.7.8. Upgrading to 0.8.0 solved the problem.
Same here, thanks @ktpolanski!!!
How did you update? pip says that 0.7.8 is the latest version.
For me …
Then it's probably a case of having an old Python 3. I loaded up an environment where I have Python 3.6.9, and the newest version it saw was 0.7.8.
pip install anndata --upgrade works.
I am having the same problem; however, pip install anndata --upgrade didn't work for me. pip said it is already the latest version: Requirement already satisfied: anndata in d:\python3.10.9\lib\site-packages (0.9.1). Now I really don't know what to do. Could you help me with that? [crying]
@Sunyiqing2003 we certainly don't want you crying! People here had problems reading with older anndata versions, but you seem to have the newest one, so it's not the same issue. Could you file a new issue?
@flying-sheep Surely I can file a new issue, thank you very much!
Thank you for your time and attention; I really appreciate it. I have filed a new issue: https://github.com/scverse/scanpy/issues/2551. I might know why updating anndata didn't work: for me the main cause seems to be a big array and a memory error.
I am trying to load some datasets with sc.read_h5ad(file_name). Frequently, I get the below error. When I re-run the code multiple times or at different times it sometimes works, but often I get the error (using the same code and data). This happens when reading different h5ad datasets (i.e. it is not specific to one dataset). At all times there seems to be a similar amount of free RAM available. This happens both when using jupyter-notebook and plain python.

Error: (the OSError traceback quoted at the top of the thread)

Versions:
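For the intermittent errno 5 failures described in this thread, a retry loop is a blunt but sometimes effective workaround. This is a sketch: `read_with_retry` is my name, and `read` stands for whatever reader you use, e.g. sc.read_h5ad. It hides the symptom rather than fixing the mount:

```python
import time

def read_with_retry(read, path, attempts=3, wait=5.0):
    """Call `read(path)`, retrying on OSError with linear back-off.

    Transient 'Input/output error' (errno 5) failures on network
    mounts often succeed on a later attempt. Re-raises the last
    error if every attempt fails.
    """
    last_error = None
    for i in range(attempts):
        try:
            return read(path)
        except OSError as e:
            last_error = e
            time.sleep(wait * (i + 1))  # wait a little longer each time
    raise last_error
```

Usage would look like `adata = read_with_retry(sc.read_h5ad, file_name)`.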