
Sequential regulatory activity prediction across chromosomes with convolutional neural networks



Figure 5.

Basenji gene-specific variant scores enrich for eQTLs. (A) We defined SNP expression difference (SED) scores for each biallelic variant and gene combination as the difference between the model prediction for the two alleles at that gene's TSSs. (B) We computed the signed LD profile of the SED annotations (denoted by SED-LD) to more readily compare to eQTL measurements in human populations (Methods). |SED-LD| shows a strong relationship with eQTL statistics from GTEx. Here, we binned variants into five quantiles by the difference between their regression predictions including and excluding |SED-LD| and plotted the proportion of variants called significant eQTLs in pancreas. We chose five quantiles to represent the observed statistical trend parsimoniously and aesthetically. The proportion rises with greater |SED-LD| to 4.2× in the highest quantile over the average of the bottom three quantiles, which represented the median enrichment in a range of 3.2–5.8× across the 19 tissues. See Supplemental Figure S8 for all tissues and TSS-controlled analysis. (C) Plotting |SED-LD| versus the χ² statistics reveals a highly significant correlation.
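
The two computations named in panels A and B can be illustrated with a minimal sketch. It is not the authors' code: the function names `sed_score` and `quantile_enrichment` are hypothetical, the choice to sum the allele difference over a gene's TSSs and the alt-minus-ref sign are assumptions, and the input to the binning step is taken to be a precomputed per-variant score (the caption's difference between regression predictions including and excluding |SED-LD|), which is not computed here.

```python
def sed_score(ref_preds, alt_preds):
    """SED for one variant-gene pair.

    ref_preds / alt_preds: model predictions at each of the gene's TSSs
    for the reference and alternate allele. Summing over TSSs and using
    alt minus ref are assumptions made for this illustration.
    """
    return sum(alt - ref for ref, alt in zip(ref_preds, alt_preds))


def quantile_enrichment(scores, is_eqtl, n_bins=5):
    """Bin variants into equal-count quantiles by score and return
    (proportion of significant eQTLs per bin, top-bin proportion divided
    by the mean proportion of the bottom three bins)."""
    n = len(scores)
    if n < n_bins or n_bins < 4:
        raise ValueError("need at least n_bins variants and at least 4 bins")

    # Indices of variants sorted from lowest to highest score.
    order = sorted(range(n), key=lambda i: scores[i])

    proportions = []
    for b in range(n_bins):
        members = order[b * n // n_bins:(b + 1) * n // n_bins]
        proportions.append(sum(bool(is_eqtl[i]) for i in members) / len(members))

    baseline = sum(proportions[:3]) / 3
    if baseline == 0:
        raise ValueError("no significant eQTLs in the bottom three bins")
    return proportions, proportions[-1] / baseline


if __name__ == "__main__":
    # Toy data only; these are not values from the paper.
    toy_scores = list(range(10))
    toy_calls = [0, 1, 0, 0, 1, 0, 0, 1, 1, 1]
    props, ratio = quantile_enrichment(toy_scores, toy_calls)
    print(props, ratio)
```

With real data, the second return value corresponds to the kind of ratio the caption reports as 4.2× for pancreas.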

This Article

  1. Genome Res. 28: 739-750
