Setting adata.raw does not copy data #2748

Closed
2 of 3 tasks
yuxiaokang-source opened this issue Nov 12, 2023 · 1 comment

yuxiaokang-source commented Nov 12, 2023

Please make sure these conditions are met

  • I have checked that this issue has not already been reported.
  • I have confirmed this bug exists on the latest version of scanpy.
  • (optional) I have confirmed this bug exists on the master branch of scanpy.

What happened?

import numpy as np
import pandas as pd
import anndata as ad
import scanpy as sc  # needed for sc.AnnData and sc.pp below
from scipy.sparse import csr_matrix
print(ad.__version__)

mtx = np.array([[1,2,3],[2,3,4],[4,5,6],[0,20,100]])

adata = sc.AnnData(mtx)
adata.raw = adata
print(adata)
print(adata.X)

sc.pp.normalize_total(adata,target_sum=1e4)
sc.pp.log1p(adata)
print(adata.X) 

print(adata.raw.X[0:10,0:10])

When adata.X has an integer dtype, the values in adata.raw are not changed, and I get the following result:
image

But when I change adata.X to a float dtype, the values in adata.raw change together with adata.X:

import numpy as np
import pandas as pd
import anndata as ad
import scanpy as sc  # needed for sc.AnnData and sc.pp below
from scipy.sparse import csr_matrix
print(ad.__version__)

mtx = np.array([[1.2,2.1,3.9],[2.01,3.99,4.23],[4.21,5.12,6.87],[0,20.12,100.96]])

adata = sc.AnnData(mtx)
adata.raw = adata
print(adata)
print(adata.X)

sc.pp.normalize_total(adata,target_sum=1e4)
sc.pp.log1p(adata)
print(adata.X) 

print(adata.raw.X[0:10,0:10])

I get the following result:
image

This seems strange to me. Can I not keep float data in adata.raw? Could you give some suggestions? My environment is:
image

Error output

No response

Versions

-----
anndata     0.10.1
scanpy      1.9.5
-----
CoreFoundation      NA
Foundation          NA
PIL                 9.4.0
PyObjCTools         NA
anyio               NA
appnope             0.1.2
asttokens           NA
attr                22.1.0
babel               2.11.0
backcall            0.2.0
bottleneck          1.3.5
brotli              NA
certifi             2023.07.22
cffi                1.15.1
chardet             4.0.0
charset_normalizer  2.0.4
cloudpickle         2.2.1
colorama            0.4.6
comm                0.1.2
cycler              0.10.0
cython_runtime      NA
cytoolz             0.12.0
dask                2023.6.0
dateutil            2.8.2
debugpy             1.6.7
decorator           5.1.1
defusedxml          0.7.1
dill                0.3.6
entrypoints         0.4
executing           0.8.3
fastjsonschema      NA
gmpy2               2.1.2
h5py                3.9.0
idna                3.4
igraph              0.10.8
ipykernel           6.25.0
ipython_genutils    0.2.0
jedi                0.18.1
jinja2              3.1.2
joblib              1.2.0
json5               NA
jsonpointer         2.1
jsonschema          4.17.3
jupyter_server      1.23.4
jupyterlab_server   2.22.0
kiwisolver          1.4.4
leidenalg           0.10.1
llvmlite            0.40.0
louvain             0.8.1
lz4                 4.3.2
markupsafe          2.1.1
matplotlib          3.7.2
mpl_toolkits        NA
mpmath              1.3.0
natsort             8.4.0
nbformat            5.9.2
numba               0.57.1
numexpr             2.8.4
numpy               1.24.3
numpydoc            1.5.0
objc                10.0
packaging           23.1
pandas              2.0.3
parso               0.8.3
pexpect             4.8.0
pickleshare         0.7.5
pkg_resources       NA
platformdirs        3.10.0
plotly              5.9.0
prometheus_client   NA
prompt_toolkit      3.0.36
psutil              5.9.0
ptyprocess          0.7.0
pure_eval           0.2.2
pvectorc            NA
pyarrow             11.0.0
pydev_ipython       NA
pydevconsole        NA
pydevd              2.9.5
pydevd_file_utils   NA
pydevd_plugins      NA
pydevd_tracing      NA
pygments            2.15.1
pyparsing           3.0.9
pyrsistent          NA
pytz                2023.3.post1
requests            2.31.0
rfc3339_validator   0.1.4
rfc3986_validator   0.1.1
ruamel              NA
scipy               1.11.1
send2trash          NA
session_info        1.0.0
setuptools          68.0.0
six                 1.16.0
sklearn             1.3.0
sniffio             1.2.0
socks               1.7.1
sphinxcontrib       NA
stack_data          0.2.0
sympy               1.11.1
tblib               1.7.0
terminado           0.17.1
texttable           1.7.0
threadpoolctl       2.2.0
tlz                 0.12.0
toolz               0.12.0
torch               2.1.0
torchgen            NA
tornado             6.3.2
tqdm                4.65.0
traitlets           5.7.1
typing_extensions   NA
urllib3             1.26.16
wcwidth             0.2.5
websocket           0.58.0
xxhash              2.0.2
yaml                6.0
zipp                NA
zmq                 23.2.0
zope                NA
-----
IPython             8.15.0
jupyter_client      7.4.9
jupyter_core        5.3.0
jupyterlab          3.6.3
notebook            6.5.4
-----
Python 3.11.5 (main, Sep 11 2023, 08:31:25) [Clang 14.0.6 ]
macOS-12.5-arm64-arm-64bit
-----
Session information updated at 2023-11-12 16:47

flying-sheep (Member) commented

I wouldn’t consider this a bug, but you’re right that it is surprising behavior.

You can fix it by doing

adata.raw = adata.copy()

instead of

adata.raw = adata

As for why it happens: I think sc.pp.normalize_total calls something like adata.X = adata.X.astype(float) at some point.

If adata.X already has a float dtype, this does nothing; if it holds e.g. integers, this creates a copy.

So use adata.raw = adata.copy() if you want to make sure .raw holds its own copy of everything.
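
The mechanism can be sketched with plain NumPy. This is only an illustration under assumptions: `normalize_like` is a hypothetical helper, not scanpy's actual code, and it uses `astype(..., copy=False)` to mimic a cast that is skipped when the dtype already matches.

```python
import numpy as np

def normalize_like(x):
    """Hypothetical stand-in for a preprocessing step: cast if needed, then modify in place."""
    # With copy=False, astype returns x itself when it is already float64,
    # and a new array when a real conversion (e.g. from int) is required.
    x = x.astype(np.float64, copy=False)
    np.log1p(x, out=x)
    return x

# Integer input: the cast creates a new array, so the alias keeps the old values.
int_x = np.array([[1, 2, 3], [4, 5, 6]])
int_raw = int_x  # a reference, analogous to adata.raw = adata
int_out = normalize_like(int_x)

# Float input: no cast happens, so the in-place step also changes the alias.
float_x = np.array([[1.2, 2.1, 3.9], [4.21, 5.12, 6.87]])
float_raw = float_x
float_out = normalize_like(float_x)

print(np.shares_memory(int_out, int_raw))      # False
print(np.shares_memory(float_out, float_raw))  # True
```

In the integer case the "raw" reference survives only by accident of the dtype conversion, which is why an explicit copy is the dependable option.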

flying-sheep changed the title from "bug about adata.raw" to "Setting adata.raw does not copy data" on Nov 13, 2023