how to proceed #1

Closed
flying-sheep opened this issue Feb 4, 2017 · 4 comments
Comments

@flying-sheep
Member
  1. make things work*
  2. find out how to store PCA and friends in/with the AnnData (see 1cec418#commitcomment-20744162)
  3. determine how to read/write AnnData. maybe fields named var_* in the HDF5 file will be var metadata, and so on?

*apart from things still crashing, the group plotting in particular should be fixed (it should probably be turned into a single scatter call with a list of all groups)
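
One way to read point 3, as a minimal sketch only: a plain dict stands in for the key/value view of the HDF5 file, and keys are routed by prefix. The function name `split_by_prefix`, the `smp_` prefix and the `X` key are assumptions made for illustration, not an existing reader.

```python
def split_by_prefix(store, prefixes=("smp_", "var_")):
    """Route flat keys such as 'var_gene_names' into per-annotation dicts.

    `store` stands in for the key/value view of an HDF5 file (hypothetical);
    keys without a known prefix are collected under 'other'.
    """
    out = {prefix.rstrip("_"): {} for prefix in prefixes}
    out["other"] = {}
    for key, value in store.items():
        for prefix in prefixes:
            if key.startswith(prefix):
                # Strip the prefix so 'var_gene_names' becomes 'gene_names'.
                out[prefix.rstrip("_")][key[len(prefix):]] = value
                break
        else:
            out["other"][key] = value
    return out


store = {
    "X": [[1.0, 0.0], [0.0, 2.0]],
    "smp_cell_type": ["a", "b"],
    "var_gene_names": ["g1", "g2"],
}
fields = split_by_prefix(store)
```

The same routing would work for writing by prepending the prefix again, so the file stays flat while the in-memory object keeps its fields separate.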

@falexwolf
Member
falexwolf commented Feb 5, 2017

Good! So, I'd really like to jump in and work on ann_matrix as well, if you think this is efficient. Of course, I don't want to mess up what you had in mind.

  1. yes, that's important - can I help?
  2. that's easy, simply put it in smp as a multicolumn object
  3. should be very easy as well. maybe a recarray can be written directly with a single key; if not, one has to separate the str and float columns -> shall I attack that? see this for how it was done with the ddata using its 'rowcat' attribute. should be straightforward to adapt, right?*

*sorry, I simply forgot to add readwrite.py on thursday night, which caused master to be non-working since then, of course. with readwrite.py added, master now works just fine. I guess the only change you made to utils.py was adding the AnnData.from_dict(...) in the function read()? so one could use readwrite.py from master within ann_matrix. or just create readwrite.py again by cutting out everything related to reading/writing from utils and pasting it into the new module readwrite.py.
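
The str/float separation mentioned in point 3 could look roughly like the following. This is a sketch assuming NumPy structured arrays; `split_str_and_numeric` and the example column names are hypothetical and not taken from readwrite.py.

```python
import numpy as np


def split_str_and_numeric(rec):
    """Split a structured array into string-typed and numeric-typed columns."""
    str_cols, num_cols = {}, {}
    for name in rec.dtype.names:
        kind = rec.dtype[name].kind  # 'U'/'S' = text, 'f'/'i'/'u' = numbers
        if kind in "US":
            str_cols[name] = rec[name]
        elif kind in "fiu":
            num_cols[name] = rec[name]
        else:
            raise TypeError(f"unsupported column type for {name!r}: {rec.dtype[name]}")
    return str_cols, num_cols


# Hypothetical sample annotation with one text and one float column.
smp = np.array(
    [("stem", 0.1), ("progenitor", 0.7)],
    dtype=[("cell_type", "U16"), ("pseudotime", "f8")],
)
str_cols, num_cols = split_str_and_numeric(smp)
```

Each group can then be written under its own key with a homogeneous dtype, which sidesteps mixed-type storage in the file.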

@falexwolf
Member
falexwolf commented Feb 5, 2017

Generally: What shall I do in order to merge ann_matrix with the master branch as quickly as possible? Starting tomorrow, Fiona would like to work on one tool using the nestorowa16 case I mentioned before. So if you allow me, I'll try to get everything running and polished tonight.
PS: During the day, I'll be offline.

@flying-sheep
Member Author

sure, go ahead, i’m occupied today preparing my Mitarbeitergespräch (staff appraisal meeting) :D

@falexwolf
Member

damn, I'm not fit enough to make ann_matrix work tonight. so, in order to get figures, analysis and bare-bones code ready for Fiona (we have a Skype conference with Fabian and the group in Cambridge tomorrow at 11am, and Fabian is quite pushy), I'll use the working master branch.

let's discuss merging with ann_matrix in person during the next days.

falexwolf pushed a commit that referenced this issue Oct 29, 2018
Updated read_10x_h5:
- Renamed the original `read_10x_h5` as `_read_legacy_10x_h5`;
- Added `_read_v3_10x_h5` to read the new Cell Ranger output format;
- The new `read_10x_h5` determines the version of HDF5 input by the presence of the matrix key, and wraps the above two functions. In addition, it takes a `gex_only` argument which filters out feature barcoding counts from the outcome object when it is True (default). Otherwise, the full matrix will be retained.
- For CR-v3, `feature_types` and `genome` were added into the outcome object as new attributes.

Updated read_10x_mtx:
- Renamed the original `read_10x_mtx` as `_read_legacy_10x_mtx`;
- Added `_read_v3_10x_mtx` to read the new Cell Ranger output format;
- The new `read_10x_mtx` determines the version of matrix input by the presence of the `genes.tsv` file under the input directory, and wraps the above two functions. In addition, it takes a `gex_only` argument which filters out feature barcoding counts from the outcome object when it is `True` (default). Otherwise, the full matrix will be retained.
- For CR-v3, `feature_types` was added into the outcome object as a new attribute.

Added test data and code for the revised functions.
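
The directory-based version check described above can be sketched as follows. This is not the code from the commit: `detect_10x_mtx_layout` is a hypothetical helper, and treating every directory without `genes.tsv` as the new format is a simplifying assumption.

```python
import tempfile
from pathlib import Path


def detect_10x_mtx_layout(path):
    """Return 'legacy' if `genes.tsv` exists in `path`, else 'v3' (assumed)."""
    return "legacy" if (Path(path) / "genes.tsv").is_file() else "v3"


with tempfile.TemporaryDirectory() as tmp:
    # Empty directory: no genes.tsv, so the sketch assumes the new layout.
    layout_without_genes = detect_10x_mtx_layout(tmp)
    # Add a legacy-style genes.tsv and check again.
    (Path(tmp) / "genes.tsv").write_text("ENSG0001\tGENE1\n")
    layout_with_genes = detect_10x_mtx_layout(tmp)
```

A wrapper would then call the matching private reader depending on the returned label.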

Note for the genome argument:
- There is a `genome` argument in Scanpy's `read_10x_h5` function but not in `read_10x_mtx`, as the genome is already specified by the path of the input directory. The outcome objects of the two functions should be the same, each always holding one genome at a time.
- In this PR, when there are multiple genomes (e.g. Barnyard), `read_10x_mtx` always reads them all, whereas `read_10x_h5` always needs one of them to be specified (mm10 by default). However, when `gex_only == False`, the `genome` argument is ignored and the whole matrix is read.
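
To make the interplay of `gex_only` and `genome` for `read_10x_h5` concrete, here is a small sketch over plain dicts. The helper `select_features`, the record layout and the feature names are hypothetical; the `"Gene Expression"` label is assumed to be the feature type Cell Ranger 3 assigns to genes.

```python
def select_features(features, genome="mm10", gex_only=True):
    """Return the names of the features kept for the given settings.

    With gex_only=False the genome argument is ignored and everything is kept.
    """
    if not gex_only:
        return [f["name"] for f in features]
    return [
        f["name"]
        for f in features
        if f["feature_type"] == "Gene Expression" and f["genome"] == genome
    ]


# Hypothetical mixed-species ("Barnyard") feature table with one antibody tag.
features = [
    {"name": "Actb", "feature_type": "Gene Expression", "genome": "mm10"},
    {"name": "ACTB", "feature_type": "Gene Expression", "genome": "hg19"},
    {"name": "CD3", "feature_type": "Antibody Capture", "genome": ""},
]

default_selection = select_features(features)
human_selection = select_features(features, genome="hg19")
full_selection = select_features(features, gex_only=False)
```

With the defaults only the mm10 gene survives, while `gex_only=False` returns all three features regardless of `genome`.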
falexwolf added a commit that referenced this issue Oct 29, 2018
Let Scanpy read from Cell Ranger 3.0 outputs (#1)
ivirshup pushed a commit that referenced this issue Apr 8, 2019
flying-sheep pushed a commit that referenced this issue Jun 28, 2019
update to match upstream
2 participants