Doublet filtering function #173
Comments
Sorry for the late response, I was on holidays. I'm happy to merge a pull request for this, if the package appears solid. Would you want to add a file |
Thanks to @yueqiw for the confidence. :) @falexwolf We have no issue with our package being included here, but we wouldn't be able to create a custom API for your package right now, if that's what you were suggesting? Given my quick overview of your package, two things you should note:
|
Hi @JonathanShor, you don't need to create a custom API. One point of Scanpy is to provide convenient access via If your package works reliably, both the restrictions you mention should in principle not prevent adding your package. Of course, in the future, we want all elements of Scanpy to scale to millions of cells, not just the core tools. But for a lot of people, it's right now helpful to have a large number of tools available also for relatively small datasets. The only problem is to avoid cluttering the Scanpy API with virtually any tool there is. Tools in the API should have passed a certain quality check. Doublet detection is a difficult problem. Already last autumn, we played around with @swolock 's tool but didn't end up using it - it was good, but in our situation, it didn't seem to apply (are you eventually going to distribute a package for it @swolock ?). I myself quickly wrote a tool, too, but it didn't work well. Just yesterday, this appeared. Then there is also this on "empty cell detection". There are more tools out there, I think... What I mean is: computationally detecting doublets is still something where the field has not agreed on a consensus. Just like batch correction. Therefore, I would not add a tool There are two options. Either we create a @flying-sheep @gokceneraslan @fidelram @dawe anyone opinions on such cases? |
Hi @falexwolf, yes I will be making my method available. A rough version is already on github, and I also played around with adding it to my scanpy fork (though not the right way -- I added it to |
Good to hear! Looking forward to learning more about it. |
It looks like this may have stalled a bit. Is anyone currently working on making some form of doublet detection available from scanpy? |
Hi @ivirshup, I've been meaning to get back to this. I've just started on an AnnData-compatible version of Scrublet which should be easy to hook up to Scanpy. Will keep you posted. |
Any updates on this? |
I guess, we should ask @swolock. 🙂 |
I can apply Scrublet from the original Python package pretty easily, so I
guess a wrapper should be quick to implement :) I was thinking
about doing it last week, but I am no expert in this kind of thing :/
|
@cartal @SamueleSoraggi |
Your way certainly sounds better: many parts of the Scrublet algorithm
duplicate components of Scanpy. It will surely look great :)
Just one thing: the Scrublet paper suggests always running the doublet
simulation first and comparing the expected with the estimated fraction of
doublets before removing anything. If those two values do not match, they
say one should rerun Scrublet and tune the expected fraction.
Does your script only run the doublet simulation and output the doublet
scores, or does it also remove doublets in the same step? If it does the
latter, one cannot simulate doublets more than once to adjust the expected
doublet fraction.
Cheers.
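To make the concern above concrete, here is a minimal NumPy sketch of why scoring and removal are best kept as separate steps. It is not Scrublet's code: the helper names `simulate_doublets` and `call_doublets` are hypothetical, the doublets are assumed to be modelled as the sum of two randomly chosen observed cells, and the scores are placeholders rather than real classifier output.

```python
import numpy as np


def simulate_doublets(counts, n_doublets, seed=0):
    """Build synthetic doublets by summing the counts of random cell pairs.

    Assumption for illustration: a doublet is the plain sum of two
    observed cells. Returns the synthetic matrix and the parent indices.
    """
    rng = np.random.default_rng(seed)
    pairs = rng.integers(0, counts.shape[0], size=(n_doublets, 2))
    return counts[pairs[:, 0]] + counts[pairs[:, 1]], pairs


def call_doublets(scores, threshold):
    """Flag cells whose score exceeds the threshold; no cell is removed."""
    return np.asarray(scores) > threshold


# Toy count matrix: 50 cells x 20 genes.
counts = np.random.default_rng(1).poisson(1.0, size=(50, 20))
doublets, pairs = simulate_doublets(counts, n_doublets=100)

# Placeholder scores standing in for the output of a real scoring step.
scores = np.linspace(0.0, 1.0, counts.shape[0])

# Because nothing was removed, the call can be repeated with new thresholds.
fractions = {t: call_doublets(scores, t).mean() for t in (0.5, 0.8)}
print(fractions)
```

Since the scores are only stored, the flagged fraction can be compared with the expected one and the threshold adjusted as often as needed before any cell is dropped.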
On Thu, 16 May 2019 at 05:15, Sam Wolock <notifications@github.com> wrote:
… @cartal <https://github.com/cartal> @SamueleSoraggi
<https://github.com/SamueleSoraggi>
For some reason I decided to integrate Scrublet using Scanpy's functions
where possible, rather than making a simple wrapper. The core functionality
is up and running in this fork <https://github.com/swolock/scanpy>, and
now I just need to add documentation, make some of the code more
Scanpythonic(?), and add an example.
|
@swolock why don't you submit a PR? I just tested your code and it seems to work. |
How is this work going? We'd love to integrate Scrublet into our workflows, which are currently quite Scanpy-centric. |
@pinin4fjords Given that there is no response from @swolock, I tried the following, which works well:
import scanpy as sc
import scrublet as scr  # the module name is lowercase
# Score cells on the matrix stored in adata.raw (Scrublet expects raw counts)
scrub = scr.Scrublet(adata.raw.X)
adata.obs['doublet_scores'], adata.obs['predicted_doublets'] = scrub.scrub_doublets()
scrub.plot_histogram()
sc.pl.umap(adata, color='doublet_scores') |
I have a problem installing and importing scrublet on Windows. Could you please help me?
Current channels:
To search for alternate channels that may provide the conda package you're
and use the search bar at the top of the page. |
I started work to move |
Hi,
I tried the DoubletDetection Python library on my data and got some interesting results. As it can be run directly on a NumPy array of the count matrix (adata.X), I thought it would be an interesting feature for scanpy.