Cannot read loom file created in Seurat3 (column index exceeds matrix dimensions) #598

Closed
cakirb opened this issue Apr 9, 2019 · 23 comments

@cakirb
cakirb commented Apr 9, 2019

I have a loom file created from a Seurat object with the as.loom function in Seurat3. After closing the file with $close.all(), I tried to read it with the read_loom function in scanpy, but I got this error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-7-aed61d3d5eef> in <module>
      1 import scanpy as sc
----> 2 a = sc.read_loom('brain10x.loom')

/opt/conda/lib/python3.7/site-packages/anndata/readwrite/read.py in read_loom(filename, sparse, cleanup, X_name, obs_names, var_names, dtype)
    156 
    157         if X_name not in lc.layers.keys(): X_name = ''
--> 158         X = lc.layers[X_name].sparse().T.tocsr() if sparse else lc.layers[X_name][()].T
    159 
    160         layers = OrderedDict()

/opt/conda/lib/python3.7/site-packages/loompy/loom_layer.py in sparse(self, rows, cols)
    109                 col: List[np.ndarray] = []
    110                 i = 0
--> 111                 for (ix, selection, view) in self.ds.scan(items=cols, axis=1, layers=[self.name]):
    112                         if rows is not None:
    113                                 vals = view.layers[self.name][rows, :]

/opt/conda/lib/python3.7/site-packages/loompy/loompy.py in scan(self, items, axis, layers, key, batch_size)
    597                                 for key, layer in vals.items():
    598                                         lm[key] = loompy.MemoryLoomLayer(key, layer)
--> 599                                 view = loompy.LoomView(lm, self.ra[ordering], self.ca[ix + selection], self.row_graphs[ordering], self.col_graphs[ix + selection], filename=self.filename, file_attrs=self.attrs)
    600                                 yield (ix, ix + selection, view)
    601                                 ix += cols_per_chunk

/opt/conda/lib/python3.7/site-packages/loompy/graph_manager.py in __getitem__(self, thing)
     96                 if type(thing) is slice or type(thing) is np.ndarray or type(thing) is int:
     97                         gm = GraphManager(None, axis=self.axis)
---> 98                         for key, g in self.items():
     99                                 # Slice the graph matrix properly without making it dense
    100                                 (a, b, w) = (g.row, g.col, g.data)

/opt/conda/lib/python3.7/site-packages/loompy/graph_manager.py in items(self)
     55         def items(self) -> Iterable[Tuple[str, sparse.coo_matrix]]:
     56                 for key in self.keys():
---> 57                         yield (key, self[key])
     58 
     59         def __len__(self) -> int:

/opt/conda/lib/python3.7/site-packages/loompy/graph_manager.py in __getitem__(self, thing)
    116                         raise AttributeError(f"'{type(self)}' object has no attribute {thing}")
    117                 else:
--> 118                         return self.__getattr__(thing)
    119 
    120         def __getattr__(self, name: str) -> sparse.coo_matrix:

/opt/conda/lib/python3.7/site-packages/loompy/graph_manager.py in __getattr__(self, name)
    127                                 c = self.ds._file[a][name]["b"]
    128                                 w = self.ds._file[a][name]["w"]
--> 129                                 g = sparse.coo_matrix((w, (r, c)), shape=(self.ds.shape[self.axis], self.ds.shape[self.axis]))
    130                                 self.__dict__["storage"][name] = g
    131                         return g

/opt/conda/lib/python3.7/site-packages/scipy/sparse/coo.py in __init__(self, arg1, shape, dtype, copy)
    190             self.data = self.data.astype(dtype, copy=False)
    191 
--> 192         self._check()
    193 
    194     def reshape(self, *args, **kwargs):

/opt/conda/lib/python3.7/site-packages/scipy/sparse/coo.py in _check(self)
    279                 raise ValueError('row index exceeds matrix dimensions')
    280             if self.col.max() >= self.shape[1]:
--> 281                 raise ValueError('column index exceeds matrix dimensions')
    282             if self.row.min() < 0:
    283                 raise ValueError('negative row index found')

ValueError: column index exceeds matrix dimensions

I can read the loom file with loompy without any problem. All packages are at their latest versions (Seurat_3.0.0.9000, loomR_0.2.1.9000, scanpy==1.4).

Is there anything wrong with the call below?

a = scanpy.read_loom('brain10x.loom', sparse=True)

Thanks...

@ahy1221
ahy1221 commented Apr 27, 2019

I ran into the same issue today. It seems that the problem is caused by the col_graphs group in the loom file. If you remove the graphs slot from the Seurat3 object before writing it out to loom, the file can be read.
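
The traceback above ends in SciPy's bounds check on a graph that loompy builds from arrays stored in the graph group (`b` and `w` are visible in the traceback; the row array is presumably `a`). A minimal sketch of the same check in plain Python, which might help locate the offending edges before calling read_loom. The helper name `out_of_range_edges` and the sample arrays are hypothetical, and the idea that the stored indices could be 1-based is an assumption, not something confirmed in this thread.

```python
from typing import List, Sequence, Tuple


def out_of_range_edges(
    rows: Sequence[int], cols: Sequence[int], n: int
) -> List[Tuple[int, int, int]]:
    """Return (position, row, col) for every edge an n x n matrix cannot hold.

    Mirrors the checks shown in the traceback: an index is invalid when it
    is negative or when it is >= n.
    """
    if len(rows) != len(cols):
        raise ValueError("rows and cols must have the same length")
    bad = []
    for pos, (r, c) in enumerate(zip(rows, cols)):
        if not (0 <= r < n and 0 <= c < n):
            bad.append((pos, int(r), int(c)))
    return bad


# Hypothetical example: 3 cells, so the valid indices are 0, 1 and 2.
# The last edge points at column 3, which is what a 1-based index would look like.
example_rows = [0, 1, 2]
example_cols = [1, 2, 3]
print(out_of_range_edges(example_rows, example_cols, n=3))
```

With a real file, the index arrays would have to be read from the col_graphs group first (for example with h5py), and `n` would be the number of cells. The group layout assumed here follows the traceback and should be checked against the actual file.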

@cakirb
Author
cakirb commented Apr 29, 2019

Hi, thanks for your answer. How do you remove a graph slot from a Seurat object? When I try, I get this error:

> dataset@graphs <- NULL
Error in (function (cl, name, valueClass)  : 
  assignment of an object of class “NULL” is not valid for @‘graphs’ in an object of class “Seurat”; is(value, "list") is not TRUE

@ahy1221
ahy1221 commented Apr 29, 2019

Hi,
the graphs slot is expected to be a list. Just do this:
pbmc_small@graphs <- list()

@cakirb
Author
cakirb commented Apr 29, 2019

Thank you very much. I could remove the graph slot and this error is gone, but now I have a new error:

---------------------------------------------------------------------------
Exception                                 Traceback (most recent call last)
<ipython-input-2-aae861244dfa> in <module>
----> 1 adata = sc.read_loom('dataset.loom')

/opt/conda/lib/python3.7/site-packages/anndata/readwrite/read.py in read_loom(filename, sparse, cleanup, X_name, obs_names, var_names, dtype, **kwargs)
    184             var=var,
    185             layers=layers,
--> 186             dtype=dtype)
    187     return adata
    188 

/opt/conda/lib/python3.7/site-packages/anndata/base.py in __init__(self, X, obs, var, uns, obsm, varm, layers, raw, dtype, shape, filename, filemode, asview, oidx, vidx)
    670                 layers=layers,
    671                 dtype=dtype, shape=shape,
--> 672                 filename=filename, filemode=filemode)
    673 
    674     def _init_as_view(self, adata_ref: 'AnnData', oidx: Index, vidx: Index):

/opt/conda/lib/python3.7/site-packages/anndata/base.py in _init_as_actual(self, X, obs, var, uns, obsm, varm, raw, layers, dtype, shape, filename, filemode)
    848         # annotations
    849         self._obs = _gen_dataframe(obs, self._n_obs,
--> 850                                    ['obs_names', 'row_names', 'smp_names'])
    851         self._var = _gen_dataframe(var, self._n_vars, ['var_names', 'col_names'])
    852 

/opt/conda/lib/python3.7/site-packages/anndata/base.py in _gen_dataframe(anno, length, index_names)
    285                 _anno = pd.DataFrame(
    286                     anno, index=anno[index_name],
--> 287                     columns=[k for k in anno.keys() if k != index_name])
    288                 break
    289         else:

/opt/conda/lib/python3.7/site-packages/pandas/core/frame.py in __init__(self, data, index, columns, dtype, copy)
    390                                  dtype=dtype, copy=copy)
    391         elif isinstance(data, dict):
--> 392             mgr = init_dict(data, index, columns, dtype=dtype)
    393         elif isinstance(data, ma.MaskedArray):
    394             import numpy.ma.mrecords as mrecords

/opt/conda/lib/python3.7/site-packages/pandas/core/internals/construction.py in init_dict(data, index, columns, dtype)
    210         arrays = [data[k] for k in keys]
    211 
--> 212     return arrays_to_mgr(arrays, data_names, index, columns, dtype=dtype)
    213 
    214 

/opt/conda/lib/python3.7/site-packages/pandas/core/internals/construction.py in arrays_to_mgr(arrays, arr_names, index, columns, dtype)
     54 
     55     # don't force copy because getting jammed in an ndarray anyway
---> 56     arrays = _homogenize(arrays, index, dtype)
     57 
     58     # from BlockManager perspective

/opt/conda/lib/python3.7/site-packages/pandas/core/internals/construction.py in _homogenize(data, index, dtype)
    275                 val = lib.fast_multiget(val, oindex.values, default=np.nan)
    276             val = sanitize_array(val, index, dtype=dtype, copy=False,
--> 277                                  raise_cast_failure=False)
    278 
    279         homogenized.append(val)

/opt/conda/lib/python3.7/site-packages/pandas/core/internals/construction.py in sanitize_array(data, index, dtype, copy, raise_cast_failure)
    656     elif subarr.ndim > 1:
    657         if isinstance(data, np.ndarray):
--> 658             raise Exception('Data must be 1-dimensional')
    659         else:
    660             subarr = com.asarray_tuplesafe(data, dtype=dtype)

Exception: Data must be 1-dimensional

@ahy1221
ahy1221 commented Apr 29, 2019

It seems that something went wrong with the Seurat meta slot. The traceback shows that this error happened when AnnData tried to construct the obs attribute.
I am afraid this is beyond what I can help with, since I cannot access your data for further debugging.
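
The final frame of the traceback shows pandas rejecting an array with more than one dimension while AnnData builds obs. A minimal sketch for spotting such attributes ahead of time. The names `n_dims`, `multi_dimensional_attrs` and the sample dictionary are hypothetical, and the idea that a 2-D per-cell attribute (an embedding, for example) is the culprit is an assumption, not a confirmed diagnosis.

```python
from typing import Any, Dict, Mapping


def n_dims(value: Any) -> int:
    """Number of dimensions of a NumPy-like array or a nested list/tuple."""
    ndim = getattr(value, "ndim", None)  # NumPy arrays expose .ndim
    if ndim is not None:
        return int(ndim)
    depth = 0
    while isinstance(value, (list, tuple)):
        depth += 1
        if not value:  # empty sequence: nothing further to descend into
            break
        value = value[0]
    return depth


def multi_dimensional_attrs(attrs: Mapping[str, Any]) -> Dict[str, int]:
    """Return {name: ndim} for every attribute with more than one dimension."""
    return {name: n_dims(v) for name, v in attrs.items() if n_dims(v) > 1}


# Hypothetical per-cell attributes: two 1-D annotations and one 2-D embedding.
col_attrs = {
    "CellID": ["c1", "c2", "c3"],
    "cluster": [0, 1, 0],
    "embedding": [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]],
}
print(multi_dimensional_attrs(col_attrs))
```

With a real file, the column attributes would first have to be collected into a dictionary, for example from a loompy connection. Any name the helper reports is a candidate for the attribute that obs cannot hold.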

@lkmklsmn

I am getting the same error.

Exception: Data must be 1-dimensional

@cakirb
Author
cakirb commented May 1, 2019

@ahy1221
ahy1221 commented May 1, 2019

@cakirb
Your file is good.
I don't see any problem reading the loom file you provided, at least with scanpy v1.4.1.
image

@cakirb
Author
cakirb commented May 2, 2019

Hi, that's interesting. I'm also using scanpy v1.4.1. Which anndata and loompy versions are you using?

@PedroRaposo
PedroRaposo commented May 16, 2019

Hi @cakirb have you figured out the solution for this problem, by any chance?
Thanks

@cakirb
Author
cakirb commented May 16, 2019

Hi @PedroRaposo, unfortunately not. My colleague told me that this issue could be related to the versions of scanpy, anndata or loompy. I have the same scanpy version as in the successful test above. Maybe it is related to the loompy and anndata versions, but I'm not sure...

@MichaelPeibo
MichaelPeibo commented May 18, 2019

Hi, @cakirb
Try installing loompy using pip install -U loompy, and make sure you are not using version 2.0.2.
See
theislab/scvelo#20 (comment)

EDITED: I am encountering the same problem as you:

Exception: Data must be 1-dimensional

@MichaelPeibo

I have tried the latest version of anndata (0.16.9) and still got the same error.

@PedroRaposo

Me too. I had loompy version 2.0.17, then installed version 2.0.16, and I'm still getting the same issue.

@MichaelPeibo

Hi, @ahy1221
Could you provide more details about your environment and package versions?
Many thanks!

@ivirshup
Member
ivirshup commented Jun 1, 2019

I'd love to help close this issue, but it's difficult for us to debug without a complete reproducible example. Could someone who's been experiencing this please provide a complete script which reproduces this issue?

This script should include loading data into Seurat, whatever minimal set of intermediate steps are necessary, then writing out the file which scanpy fails to read. Ideally, the data is computationally generated, something as simple as x = matrix(1, nrow=10, ncol=10) or x = matrix(rpois(100, range(5)), ncol=10).

If someone who is having this issue can please provide an example like this, we'll be able to help much faster.

@PedroRaposo

For me, the versions of the Python modules were the problem. Now it works (for reference, see this thread: scverse/anndata#152).
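
Since the fix reported here came down to package versions, a quick way to compare environments is to print what is installed. A minimal sketch using only the standard library (`importlib.metadata`, which requires Python 3.8 or later, so it would not run on the Python 3.7 shown in the tracebacks). The helper name `installed_versions` is hypothetical, and the package list is simply the libraries that appear in this thread.

```python
from importlib import metadata
from typing import Dict, Iterable, Optional


def installed_versions(packages: Iterable[str]) -> Dict[str, Optional[str]]:
    """Map each package name to its installed version, or None if it is missing."""
    versions: Dict[str, Optional[str]] = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Libraries that appear in this thread and its tracebacks.
thread_packages = ["scanpy", "anndata", "loompy", "pandas", "scipy"]
for name, version in installed_versions(thread_packages).items():
    print(f"{name}: {version or 'not installed'}")
```

Posting this output alongside a failing traceback makes it easier to see whether two environments really match. scanpy also provides sc.logging.print_versions() for a similar purpose.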

@MichaelPeibo

@PedroRaposo Updating all these packages works! Thanks!

@cakirb
Author
cakirb commented Jun 5, 2019

It works for me too!! So I can close the issue.

cakirb closed this as completed Jun 5, 2019

@fentouxungui

pbmc@graphs <- list()
works for me! Thanks @ahy1221

@gauravsinghrathore

When I try to read a loom file created by Seurat through sc.read_loom, scv.read or loompy.connect, everything stalls: the cell stays busy with the * sign and nothing gets executed. Basically, it hangs. Any input?

@timlai4
timlai4 commented Jul 2, 2020

After removing the graphs and loading the loom file into scanpy with the now empty graphs slot, is there a way to manually add them back in? For example, before removing the graphs attribute, I called as.matrix() on it and saved the result as a CSV (there is probably a better way to do this that preserves sparsity). I can now read this CSV back into Python (e.g. with pandas), but what is the correct way to load it into the resulting AnnData object?
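
One way to recover sparsity on the Python side after a dense CSV round trip is to keep only the nonzero entries while reading. A minimal sketch using only the standard library. It assumes the CSV holds a plain square numeric matrix with no header row and no row names (an export from R would normally need both stripped first), and `dense_csv_to_triplets` is a hypothetical helper, not part of scanpy or anndata.

```python
import csv
import io
from typing import List, TextIO, Tuple


def dense_csv_to_triplets(
    handle: TextIO,
) -> Tuple[List[int], List[int], List[float], int]:
    """Read a dense matrix from CSV and keep only its nonzero entries.

    Returns (rows, cols, weights, n) in COO style, where n is the number
    of matrix rows read.
    """
    rows: List[int] = []
    cols: List[int] = []
    weights: List[float] = []
    n = 0
    for record in csv.reader(handle):
        if not record:  # skip blank lines
            continue
        i = n
        n += 1
        for j, cell in enumerate(record):
            value = float(cell)
            if value != 0.0:
                rows.append(i)
                cols.append(j)
                weights.append(value)
    return rows, cols, weights, n


# Hypothetical 3 x 3 graph with two nonzero edge weights.
sample = io.StringIO("0,0.5,0\n0,0,0\n0.25,0,0\n")
print(dense_csv_to_triplets(sample))
```

The triplets could then be turned into a sparse matrix with scipy.sparse.coo_matrix((weights, (rows, cols)), shape=(n, n)).tocsr(). Where scanpy expects to find the graph depends on the installed versions. In anndata 0.7 and later, pairwise cell graphs live in adata.obsp, but the exact key should be checked against the scanpy documentation, and the cell order must match adata.obs_names.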

@ivirshup
Member
ivirshup commented Jul 8, 2020

This issue has been mentioned on Scanpy. There might be relevant details there:

https://scanpy.discourse.group/t/importing-graphs-from-seurat/249/1
