gene co-expression networks #72
Comments
I would be very interested in helping to add any of these if they do not currently exist or aren't already in development. |
Dear both, sorry about the late response... I became the father of twins a few weeks ago... I will respond much more quickly again soon. Yes, we're working on this and will provide a solution within the next few days. @tcallies could you push what you wrote? You can then tell me if this does the job for you. |
Dear both, correlation matrices are available now. Following our usual split into tools and plotting, you can call
for correlation matrix calculation. adata is the usual AnnData object you are working with. The method basically wraps the pd.DataFrame.corr method, which allows you to specify the correlation method ('pearson', 'spearman', 'kendall'). I use it for smaller data so it has not been optimized for performance (yet), but I tested the method for 3k cells and 600 genes and ended up with a runtime of ~8 seconds. I hope that is conveniently fast enough for you (if not let us know). After calling the tool, you can plot correlation matrices (using a wrapper for seaborn heatmap) by calling
This function searches basically only the AnnData annotation (again, if no key specified, "Correlation_matrix" is the default). Hope this does the job! |
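For readers who want to see the idea in isolation, here is a minimal sketch of such a wrapper around `pd.DataFrame.corr`. It assumes a cells-by-genes expression DataFrame (with an AnnData object, `adata.to_df()` should give one). The helper name `gene_correlation_matrix` and the synthetic data are hypothetical, for illustration only; this is not the function described above and not scanpy API.

```python
import numpy as np
import pandas as pd


def gene_correlation_matrix(expr, genes=None, method="pearson"):
    """Hypothetical helper (not scanpy API): gene-by-gene correlation matrix.

    expr   : DataFrame with cells as rows and genes as columns.
    genes  : optional list of gene names to restrict the calculation to.
    method : 'pearson', 'spearman' or 'kendall', passed on to DataFrame.corr.
    """
    if genes is not None:
        expr = expr.loc[:, list(genes)]
    # DataFrame.corr computes pairwise correlations between columns (genes).
    return expr.corr(method=method)


# Synthetic stand-in for an expression matrix: 200 cells x 4 genes.
rng = np.random.default_rng(0)
base = rng.normal(size=200)
expr = pd.DataFrame({
    "geneA": base,
    "geneB": 2.0 * base + 1.0,      # linear function of geneA
    "geneC": -base,                 # mirror image of geneA
    "geneD": rng.normal(size=200),  # independent noise
})

corr = gene_correlation_matrix(expr, method="pearson")
```

The result is a symmetric genes-by-genes DataFrame, which could be handed directly to a heatmap function such as `seaborn.heatmap`.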
Cool, sounds great! Thank you! I will also play around with this. Why don't you add it to the documentation? Maybe here https://github.com/theislab/scanpy/blob/980aa00adca49f6aa994a6f870ad98c3ad9218af/scanpy/api/__init__.py#L60? |
Ah! And we should also think about the naming convention here. Maybe |
It will be hard to maintain an overview of what's going on with all the names that were not specific enough and had to be removed but still kept at some place to maintain backward compatibility. |
@seth-ament @jorvis Having the correlation matrix, you then want to cluster it using hierarchical clustering, right? So, in order to achieve this, shall we add this functionality to clustermap, which currently clusters the expression matrix itself? |
I will certainly update my new stuff at least once today (probably more often), change the name, and add the documentation. |
That sounds right, yes. Looking forward to this being available. |
Yes, thanks so much. This looks great. Typically, we cut the hierarchical tree to produce gene clusters, summarize these clusters as the mean expression of the genes within the cluster, then pass the mean expression profile to plotting functions like coloring tSNE plots and violin plots. |
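The workflow described above (cluster the genes hierarchically, cut the tree, average expression per cluster) can be sketched with pandas and SciPy. This is only an illustration under stated assumptions: the data are synthetic, `1 - correlation` is one of several possible dissimilarities, and average linkage with a fixed number of clusters is an arbitrary choice, not something the thread prescribes.

```python
import numpy as np
import pandas as pd
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform

# Synthetic cells-by-genes matrix: two latent programs drive two gene groups.
rng = np.random.default_rng(0)
n_cells = 300
prog1 = rng.normal(size=n_cells)
prog2 = rng.normal(size=n_cells)
expr = pd.DataFrame({
    "g1": prog1 + 0.1 * rng.normal(size=n_cells),
    "g2": prog1 + 0.1 * rng.normal(size=n_cells),
    "g3": prog1 + 0.1 * rng.normal(size=n_cells),
    "g4": prog2 + 0.1 * rng.normal(size=n_cells),
    "g5": prog2 + 0.1 * rng.normal(size=n_cells),
})

# Gene-gene correlation, converted to a dissimilarity (assumption: 1 - r).
corr = expr.corr(method="pearson")
dist = 1.0 - corr.to_numpy()
np.fill_diagonal(dist, 0.0)      # remove floating-point noise on the diagonal
dist = (dist + dist.T) / 2.0     # enforce exact symmetry

# Hierarchical clustering on the condensed distance vector.
tree = linkage(squareform(dist, checks=False), method="average")

# Cut the tree into at most two gene clusters.
labels = fcluster(tree, t=2, criterion="maxclust")
gene_cluster = pd.Series(labels, index=corr.columns, name="cluster")

# Summarize each cluster as the mean expression of its genes, per cell.
cluster_means = expr.T.groupby(gene_cluster).mean().T
```

`cluster_means` has one row per cell and one column per gene cluster. With an AnnData object, each column could be copied into `adata.obs` and then passed as `color=` or `keys=` to the usual plotting functions.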
Any updates here? I'd love to add this to an analysis tool UI I'm working on (and presenting at a conference this weekend). Very happy to promote scanpy there. |
Hi all—does anybody have a skeleton snippet they're willing to share here on how to run this in the current version of Scanpy? Thanks! |
Unfortunately, I have to admit that this discussion was not really pursued further. In principle, these are very simple things. However, I'm a bit wary of offering a canonical function, as I fear there are also a lot of bad ways of visualizing gene correlations and I don't feel capable of judging this. If no one else wants to make a pull request for that (maybe using what @tcallies already did, though I fear it doesn't really serve the purpose of the discussion here: here, here), it would be cool if someone sent me an example case that clearly shows what you want. Maybe @jorvis, you can send images for the examples you have in mind? |
It’s still not in the docs, and by now also broken… #392 |
Hello @tcallies @falexwolf @flying-sheep,
sc.tl.correlation_matrix(adata_sub2, name_list=['SMARCA4', 'TP53'], n_genes=20, annotation_key=None, method='pearson')
AttributeError Traceback (most recent call last)
~\AppData\Local\Temp/ipykernel_3196/1400689712.py in <module>
----> 1 sc.tl.correlation_matrix(adata_sub2, name_list=['SMARCA4', 'TP53'], n_genes=20, annotation_key=None, method='pearson')
AttributeError: module 'scanpy.tools' has no attribute 'correlation_matrix' |
same error, seconded -- is there an alternative approach built in? |
Looks like when |
A very dodgy workaround would be:
from scanpy.tools import _top_genes
from scanpy.plotting import _anndata
_top_genes.correlation_matrix(adata, names, annotation_key=None, method='pearson')
_anndata.correlation_matrix(adata, groupby='leiden') |
We are very impressed with the scalability of scanpy. We are interested in performing gene co-expression clustering on large single-cell RNAseq datasets. This typically involves calculating pairwise correlations between genes, then using these correlations as distance metrics for hierarchical and k-means clustering. Does scanpy already support these kinds of analyses?