Calculate differential expression between user defined groups of clusters #397

aditisk · 2018-12-13T21:06:12Z

Hello,

I want to be able to use sc.tl.rank_genes_groups() to calculate differential expression between two groups of my choice. For example, if I have 16 clusters in my UMAP plot and I want to compare group 1 (all cells in clusters 1 to 8) to group 2 (all cells in clusters 9 to 16) how can I do this ?

Thank you in advance for any help.

LuckyMD · 2018-12-13T22:17:18Z

Hi,

You could just create a new .obs variable with the two groups and the perform sc.tl.rank_genes_groups() over this variable. For example, you could do something like this:

adata.obs['groups'] = ['group 1' if int(i) < 9 else 'group 2' for i in adata.obs['louvain']]
sc.tl.rank_genes_groups(adata, groupby='groups', key_added='group_DE_results')

as there are only two groups the top-ranked genes for either groups will be the up-regulated genes in that group (and down-regulated in the other group) that are most differentially expressed between the groups.

You should however note that rank_genes_groups is not a particularly sensitive test for differential gene expression. While it is good for a quick exploratory analysis, other tools like limma or MAST may give you more DEG results.

aditisk · 2018-12-14T18:44:34Z

Hi,

Thank you so much for the prompt response. I was able to make the comparisons following your method. As you suggested, I am going to try using MAST or limma for DEG testing in the future.

Thanks.

falexwolf · 2018-12-15T21:27:03Z

Thank you so much for fielding so many of the issues, @LuckyMD! 😄

Can we elaborate a bit further on this one, though? For simple two-group comparisons, rank_genes_groups with method='wilcoxon' (Wilcoxon-Rank-Sum/Mann-Whitney U test) should be a legit choice, shouldn't it? It's used in many of this year's Nature, Cell and Science single-cell papers, it's the default test of Seurat (https://satijalab.org/seurat/de_vignette.html) and several people reported that it performs well in Sonison & Robinson, Nat Meth (2018). So, I don't think one needs to encourage people to immediately go to the great and powerful MAST, limma and DESeq2.

Can you point me to a reference that shows that a Wilcoxon-Rank-Sum test is less sensitive? How is this even a useful statement if you don't talk about the false positives you buy in? We should look at an AUC that scans different p-values, right? A bit more than a year ago, @tcallies and I had a full paper draft discussing AUCs for marker gene detection formulated as a classification problem, but we never finished it. In the general setting, it's not at all straightforward to make the evaluation a well-defined problem and other people will for sure have done a better job. Unfortunately, I have never fully caught up with the literature, I fear...

falexwolf · 2018-12-15T21:29:50Z

And, I think, we'd want to start advertising diffxpy in addition to the usual suspects, shouldn't we? Any comments from your side, @davidsebfischer?

LuckyMD · 2018-12-17T14:30:11Z

@falexwolf I try to answer where I can.

I should probably have clarified a bit above. I would argue that most real data DE tests benefit from accounting for technical covariates. For example, you should probably not perform batch correction on your data and then do a wilcoxon rank sum test, but instead take the normalized (and log transformed) data or the raw counts and include a batch covariate in the test. This also holds for technical covariates that describe the complexity of the data (such as size factors or n_genes). Often these factors are not sufficiently accounted for by simple normalization techniques (especially for plate-based data), and are thus included in the DE testing framework. This is done in MAST (and MAST performs better with this detRate covariate in the Soneson & Robinson paper you cite above), and it is also done in a recent negative binomial DE test from Mayer et al, Nature 2018. When you are not able to fit the background variability in your model, you will have a lower sensitivity. Accounting for covariates is obviously not possible with t-tests or wilcoxon rank sum tests. Hence my statement about lower sensitivity. They did perform comparatively well in the DE method comparison, which is why I'd argue that they're useful for first pass exploratory applications (and marker gene detection when you don't want to use more fancy approaches like this). However, if you can account for technical covariates, that's probably a good approach to use.

Also, according to the comparison paper you mention, there are not more false positives when using MAST or limma compared to t-tests or Wilcoxon rank sum tests.

davidsebfischer · 2018-12-17T14:58:27Z

I agree with @LuckyMD about the points regarding covariates.

With respect to two group comparisons without confounding, rank-sum tests have less statitical power than t-tests (https://stats.stackexchange.com/questions/130562/why-is-the-asymptotic-relative-efficiency-of-the-wilcoxon-test-3-pi-compared), disclaimer I haven't checked this proof, I think this is a standard statistics result though, this is also discussed here https://stats.stackexchange.com/questions/121852/how-to-choose-between-t-test-or-non-parametric-test-e-g-wilcoxon-in-small-sampl. I havent run simulations to check how big the influence of the difference in power is on the kind of data we encounter. However, as also pointed out by the second link, violations of the distributional assumptions for t-test impact these results and these violations will be major on scRNAseq. Intuitively I would therefore tend to rank-sum tests.

With respect to diffxpy: We can account for other noise models in the two-group comparisons by performing model fitting, tutorial here. The bioarxiv will hopefully be up in the next few weeks.

LuckyMD · 2018-12-21T11:02:31Z

Also, I just found time to read the links you posted @davidsebfischer. Shouldn't we always use a welch t-test instead of a t-test in marker gene detection according to your second stackexchange link? They state that If you don't have a good reason to assume equal variances in the groups, then use the Welch correction... if we have a group vs rest type of setup as we do in rank_genes_groups() at the moment, then we would definitely not expect a single cluster to have an equal variance to the combination of all other cells in other clusters.

I think the default is currently t-test-overestimate-var... being oblivious to exactly how that works, might it not be better to adapt that to a welch-t-test-overestimate-var or something like that @falexwolf?

falexwolf · 2018-12-26T20:56:55Z

@LuckyMD

Thank you for the whole in-depth discussion. It makes a lot of sense! 😄

To your question: Scanpy has used Welch's adaption of Student's t-test from the very beginning.

@davidsebfischer

Thank you! I guess it would be nice to have a single-cell tutorial in diffxpy that shows higher sensitivity by accounting for technical covariates.

LuckyMD · 2018-12-26T23:40:21Z

It's always a nice feeling when your understanding catches up to someone else's decision making a long time ago... ;)

sophiamaedler · 2019-02-12T19:03:41Z

Any news when diffxpy will be up on biorxiv? The package looks super interesting and I'd like to learn more about the background to the methods you implemented.

wangjiawen2013 · 2019-09-07T12:45:17Z

Now seurat performs DE analysis using alternative tests including MAST and DESeq2 in a convinent way, such as FindMarkers(pbmc, ident.1 = "CD14+ Mono", ident.2 = "FCGR3A+ Mono", test.use = "MAST"). So I hope that Scanpy could interated more methods too, such as diffxpy in this way:
sc.tl.rank_gene_groups(adata, method='diffxpy' or 'MAST'). Here is the hyperlink of DE analysis in Seurat:

https://satijalab.org/seurat/v3.0/de_vignette.html

DingWB · 2019-11-08T06:27:31Z

I agree with wangjiawen2013, Scanpy should add MAST and DESeq2 into sc.tl.rank_gene_groups too.

wangjiawen2013 · 2019-11-08T07:04:29Z

This field is developing very fast, more and more advanced DE test methos are emerging, it's better to adopt these powerful methods.

wangjiawen2013 · 2019-11-08T07:36:23Z

One of the shortcomings of scanpy's default DE testing is that p-values (or FDR) of a few genes are very significant (equal 0 or approximately 0 in some datasets), then it's impossible to execute -log transformation, even there is only one 0. The volcano plot will be not beautiful because of the high significance.
@falexwolf

flying-sheep · 2019-11-08T10:32:04Z

@davidsebfischer is developing https://github.com/theislab/diffxpy, which is a great way to do differential expression in a scanpy workflow. We should document this!

Do you think the p-values are inflated? If so, how could we propagate uncertainty to them?

wangjiawen2013 · 2019-11-08T10:48:09Z

please refer to this:
theislab/diffxpy#127
theislab/diffxpy#128

chris-rands · 2020-05-28T17:41:57Z

Thanks all for the interesting discussion- did a consensus emerge on the 'best' way to do differential expression testing in scanpy?

LuckyMD · 2020-05-28T18:09:20Z

We typically use diffxpy for this. I would check out that repo. It's fast, and quite versatile. Documentation is also getting to a pretty good state I think.

chris-rands · 2020-05-29T07:54:17Z

Thank you @LuckyMD. Naive question, but what is the advantage of diffxpy over sc.tl.rank_genes_groups? I read comments above about noise models and technical covariates, but I don't fully understand the model fitting aspect and both methods seem to offer similar tests like T-tests and Wilcoxon rank-sum.

LuckyMD · 2020-05-29T10:18:49Z

sc.tl.rank_genes_groups offers only the T-test (including a second version of this) and Wilcoxon rank-sum test. These tests are also in diffxpy, but there are fare more sophisticated parametric models which you can (and probably should) use to test for differential expression in different setups. Ideally, to run a DE test you would want to model raw count data as a negative binomial and then add covariates to the model (like size factors, sample, condition, etc.). This is only possible in diffxpy.

LuckyMD closed this as completed Dec 15, 2018

adkinsrs mentioned this issue Jul 16, 2021

Possible to set reference condition for two_sample tests? theislab/diffxpy#205

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Calculate differential expression between user defined groups of clusters #397

Calculate differential expression between user defined groups of clusters #397

Calculate differential expression between user defined groups of clusters #397

Calculate differential expression between user defined groups of clusters #397

Comments