# OSError: Can't read data - unexpectedly large 'bytes actually read' #1610
There was a similar error reported recently in #1592. 18446744073709551615 is 2**64 - 1, i.e. -1 read as an unsigned 64-bit integer. Are you parallelising anything with multiple processes, or could scanpy be doing some internal parallelism? Opening an HDF5 file before a fork can cause weird problems with reading it. I thought of this because you said it sometimes works and sometimes doesn't, which suggests a race condition. Other than that, we may just have to pass you on again, to HDF5 itself. You can email help@hdfgroup.org or use the HDF forum for that.
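For context, the fork-safe pattern is to open the HDF5 file inside each worker process rather than inheriting an open handle from the parent. A minimal sketch (the file name and dataset name below are placeholders, not from this thread):

```python
import h5py
from multiprocessing import Pool

def read_slice(args):
    path, dataset, start, stop = args
    # Open the file inside the worker, i.e. after the fork; an HDF5
    # handle inherited from the parent process is not safe to reuse.
    with h5py.File(path, "r") as f:
        return f[dataset][start:stop]

if __name__ == "__main__":
    # "data.h5" and "matrix" are hypothetical names for this sketch.
    tasks = [("data.h5", "matrix", i * 100, (i + 1) * 100) for i in range(4)]
    with Pool(processes=4) as pool:
        chunks = pool.map(read_slice, tasks)
```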
Thank you. I do not think that any of it is parallelized. My file is saved locally, on the server where I am trying to read it.
In that case, the only thing I can think to suggest is asking HDF group about it. The error is coming from HDF5, and h5py is just another layer of code wrapping around that. If it's not something they already recognise, someone will need to try to reduce it to a minimal example which reproduces the problem.
It seems that the problem was due to scverse/scanpy#1351 (comment). The suggestion in that comment seems to resolve my problem for now.
OK, thanks. I guess we should remember to check for network filesystems when these errors come up. Maybe we should have a troubleshooting section in the docs somewhere.
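One simple workaround when the file lives on a network filesystem is to copy it to local disk and read the local copy. A sketch, with placeholder paths:

```python
import os
import shutil
import tempfile

import h5py

network_path = "/mnt/nfs-share/data.h5ad"  # placeholder network path
local_dir = tempfile.mkdtemp()             # local scratch directory
local_path = os.path.join(local_dir, os.path.basename(network_path))

# Copy once over the network, then do all HDF5 reads against local storage.
shutil.copy(network_path, local_path)
with h5py.File(local_path, "r") as f:
    print(list(f.keys()))
```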
Hey everyone, I'm having the same issue as @Hrovatin. I'm not using scanpy but Hi-C tools to convert a huge matrix from one format to another. Here is the same error message I get for both runs:

```
WARNING:py.warnings:/home/me/miniconda3/lib/python3.8/site-packages/cooler/util.py:733: FutureWarning: is_categorical is deprecated and will be removed in a future version. Use is_categorical_dtype instead
INFO:cooler.cli.load:fields: {'bin1_id': 0, 'bin2_id': 1, 'count': 2}
```
Is it possible that
Hey,
Should I create a new issue for this, or could this one be reopened?
The error is coming from HDF5, so you might want to ask HDF group about it - help@hdfgroup.org, or on the HDF forum. But if you're accessing data on a network filesystem, there's a good chance that's the problem. You might be able to avoid the error by using SWMR mode, but that has its own limitations.
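For reference, a reader opens a file in SWMR mode like this. This is only a sketch: it assumes the file uses the newer HDF5 file format (written with `libver="latest"`), and the file and dataset names are placeholders:

```python
import h5py

# swmr=True lets this process read while a single writer updates the file;
# the file must use the newer format (libver="latest") for SWMR to work.
with h5py.File("data.h5", "r", swmr=True) as f:
    dset = f["matrix"]  # placeholder dataset name
    print(dset.shape)
```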
Hello everybody, I am new to Python and I am trying to use the h5py package following the tutorial available here: https://lpdaac.usgs.gov/resources/e-learning/getting-started-gedi-l1b-data-python/ When I try to list the keys with `list(gediL1B.keys())`, here is the error message that I receive:

```
TypeError                                 Traceback (most recent call last)
~\anaconda3\envs\geditutorial\lib\_collections_abc.py in __iter__(self)
~\anaconda3\envs\geditutorial\lib\site-packages\h5py\_hl\group.py in __iter__(self)
h5py\h5g.pyx in h5py.h5g.GroupID.__iter__()
h5py\h5g.pyx in h5py.h5g.GroupID.__iter__()
h5py\h5g.pyx in h5py.h5g.GroupIter.__init__()
h5py\_objects.pyx in h5py._objects.with_phil.wrapper()
h5py\_objects.pyx in h5py._objects.with_phil.wrapper()
h5py\h5g.pyx in h5py.h5g.GroupID.get_num_objs()

TypeError: Not a group (not a group)
```

Could you help me please?
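A "Not a group" TypeError from `GroupID.get_num_objs` usually means the object being iterated is no longer a valid open group, for example because the file handle was closed or was never opened successfully. A minimal sketch of that tutorial step, with a placeholder file name, keeping the file open while reading the keys:

```python
import h5py

path = "GEDI01_B_granule.h5"  # placeholder; substitute your downloaded file

# The with-block guarantees the file is open while we iterate its root group.
with h5py.File(path, "r") as gediL1B:
    print(list(gediL1B.keys()))
# After the block the handle is closed, and iterating it would fail.
```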
I was trying to load some datasets with scanpy.read_h5ad(file_name), which is based on h5py. The Scanpy developers told me that this is in fact an h5py problem and that they cannot help me with it (see scverse/scanpy#1351). When trying to load an h5ad dataset (based on HDF5) I frequently get the error below. When I re-run the code multiple times or at different times it sometimes works, but often I get the error (with the same code and data). This happens when reading different h5ad datasets (i.e. it is not specific to one dataset). At all times there seems to be a similar, ample amount of free memory: one of the h5ad files is 63M, and the server has more than 300GB of free RAM and 40GB of free swap. I also tried using different servers, but this did not help. It happens both in jupyter-notebook and in python without jupyter-notebook.
Please note that I deleted most of the path to my file from the stack trace; it is stored in my workspace on the server.