-
Notifications
You must be signed in to change notification settings - Fork 579
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Updated Python and PyMC, removed TensorFlow, and added PyTorch in conda environment. #8561
base: master
Are you sure you want to change the base?
Conversation
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
6534430
to
558ccaf
Compare
Thanks for your work on this @samuelklee! Testing on both wes and wgs would be ideal. For wgs we can use the gatk-sv reference panel, which is our standard (I can help with this once a docker is ready). For wes, 1kgp would work although it's definitely showing its age. Are the integration test differences large? |
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
OK, I think things are looking good! Updated a bunch of things, including the following:
These and other packages (numpy, scipy, etc.) are all pretty much at the latest available versions for python 3.10. I've also bumped version numbers for our internal python packages. I also made all of the changes to the gCNV code to accommodate any changes introduced by PyMC/Pytensor. For the most part, these were minor renamings of However, there were some more nontrivial changes, including to 1) model priors (since some of the distributions previously used were removed or are now supported differently), 2) the implementation of posterior sampling, 3) some shape/dimshuffle operations, and other things along these lines. Using a single test shard of 20 1kGP WES samples x 1000 intervals, I have verified determinism/reproducibility for DetermineGermlineContigPloidy COHORT/CASE modes, GermlineCNVCaller COHORT/CASE modes, and PostprocessGermlineCNVCalls. Numerical results are also relatively close to those from 4.4.0.0 for all identifiable call and model quantities (albeit far outside any reasonable exact-match thresholds, most likely due to differences in RNG, sampling, and the aforementioned priors). Some remaining TODOs:
|
Github actions tests reported job failures from actions build 7143821808
|
4bf5286
to
ed59372
Compare
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
Here's one, if anyone would like to try it out! |
Finishing up a tieout using old MalariaGEN Pf7 CNV genotyping runs, and I think things look good from this perspective, at least. This tieout uses a subset of the Pf7 samples containing 300 cohort and 1683 case samples (which were indeed treated as a cohort-case cluster in the original Pf7 CNV genotyping analysis). ~4k genomic bins are covered. We compare this branch against 4.5.0.0, as well as this branch against itself (checking for reproducibility). Costs for this branch ($10.92) and 4.5.0.0 ($10.96) were quite comparable. Note that a small portion of these costs derives from Pf7-specific genotyping steps, which I did not bother to remove from the workflow. Runtime for the ploidy modeling and postprocessing steps were comparable. Interestingly, runtime for the gCNV was ~20-25% longer with this branch than with 4.5.0.0, but memory usage fell by a factor of ~3 (~6GB to ~2GB)! I am not sure if we could recoup the runtime with some more tweaking of the environment (perhaps double checking that optimized BLAS/MKL/etc. packages are properly used, changing environment variables/flags, etc.), but I think the decrease in memory usage is quite nice. Concordance was checked for the following quantities (4.5.0.0 is on the x-axis and this branch is on the y-axis in all plots below):
|
Thanks for sharing these results @samuelklee ! That memory vs runtime tradeoff does not seem at all bad to me. What else needs to be done before we can feel confident about merging this (apart from deprecating the old CNN tools)? Is the posterior sampling issue resolved? |
Yes, that's been resolved! Phew. If anyone wants to start taking an initial review pass, that would be appreciated! There are probably a few very minor cleanups I could do (e.g., removing stray commented code and TODOs), but I don't think that should hold anything up. |
Also, I think @mwalker174 should have the final say on merging, as he's the current gCNV owner. Would be great if he could run any GATK-SV-specific tests and report back on runtime/accuracy! |
src/main/python/org/broadinstitute/hellbender/gcnvkernel/utils/math.py
Outdated
Show resolved
Hide resolved
@samuelklee It's great to see this inching closer to finally getting merged! I think we should plan on merging it shortly after the upcoming 4.6 release, assuming the remaining testing doesn't reveal any issues. Since 4.6 will contain a number of important bug fixes and user-requested changes, I want to get it out before making a big breaking change to the Python environment. After 4.6 is out, we can meet briefly to discuss how best to handle the deprecation of the legacy CNN tools, and hopefully get this in at last. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for documenting all the API changes. This looks reasonable to me (along with the malaria plots). There are a couple more TODOs that I think are to-done. @droazen is going to help us come up with a CNN deprecation/transition plan so we don't have to turn off tests to merge
src/main/java/org/broadinstitute/hellbender/tools/copynumber/PostprocessGermlineCNVCalls.java
Show resolved
Hide resolved
src/main/python/org/broadinstitute/hellbender/gcnvkernel/inference/deterministic_annealing.py
Show resolved
Hide resolved
src/main/python/org/broadinstitute/hellbender/gcnvkernel/io/io_commons.py
Outdated
Show resolved
Hide resolved
src/main/python/org/broadinstitute/hellbender/gcnvkernel/models/pytensor_hmm.py
Outdated
Show resolved
Hide resolved
@@ -295,7 +295,7 @@ jobs: | |||
runs-on: ubuntu-latest | |||
strategy: | |||
matrix: | |||
wdlTest: [ 'RUN_CNV_GERMLINE_COHORT_WDL', 'RUN_CNV_GERMLINE_CASE_WDL', 'RUN_CNV_SOMATIC_WDL', 'RUN_M2_WDL', 'RUN_CNN_WDL', 'RUN_VCF_SITE_LEVEL_FILTERING_WDL' ] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There may already be a WDL for the NVIDIA CNN in the Terra workspace, but this will be part of the CNNScoreVariants Deprecation Plan
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@droazen leaving this as a reminder that we'll need to deal with the CNN tests/code/etc. disabled here in a subsequent PR.
...st/java/org/broadinstitute/hellbender/tools/copynumber/GermlineCNVCallerIntegrationTest.java
Outdated
Show resolved
Hide resolved
src/test/java/org/broadinstitute/hellbender/utils/python/PythonEnvironmentIntegrationTest.java
Outdated
Show resolved
Hide resolved
@droazen When we finally remove CNNScoreVariants, don't forget that you can put an entry in the DeprecatedToolsRegistry with a helpful message about the replacement. |
Very excited that this is almost ready! I'm presently going to run some tests on matched BGE/exome samples to assess runtime performance and results with gatk-sv. |
@mwalker174 reported last week that exomes look good, with perhaps more details on cost/accuracy forthcoming! I'll be OOO next week, but will address comments after I get back---already looked through them and I think it should be quick (famous last words). |
We've run some basic WGS tests in gatk-sv, and the results look good. Using our 161-sample 30x NYGC 1KGP test panel, here is a summary of per-sample raw calls from gCNV+cnMOPs. That is, records look like:
Master: Branch: Note that this is a simple bedtools intersection requiring 90% reciprocal overlap, but not matching samples. Subsetting just to NA19420: Master: Branch: Further subsetting to NA19240 variants over 5kbp, which is the default gatk-sv threshold for depth-only calls: Master: Branch: While it does appear there's an appreciable difference here, if we subset to NA19240 variants that survive gatk-sv filtering to the "CleanVcf" stage (in the current master), the differences are much less: Master: Branch: I think the differences in raw calls is probably "in the noise" for WGS, given the high concordance in this test sample after gatk-sv filtering and genotyping are applied. Edit: Thanks to @kirtanav98 for running these tests. |
Also we've noticed a slight drop in peak memory usage from 8.7 GB to 7.2 (17% decrease) in both cohort and case modes for |
Here's the performance benchmarking NEW gCNV docker
1 0.2034884 0.4107143 0.9387755 0.8581560 344 280 98 141 |
Great, thanks @mwalker174 and @JMF47! I will try to get this tidied up before heading out for 7/4. |
Hi all, Any chance this will make it into a release soon? I was hoping this got merged with the recent docker image overhaul. Thanks |
…ker and conda environments.
f7a9760
to
4dc5caa
Compare
@matthdsm this was intentionally left out of the recent 4.6 release, but should go into the next minor release. Would of course appreciate any testing/feedback from the community before then! |
Released |
Copying over some discussion from Slack, with some slight modifications:
I've made some strides in this PR; as of 6b08f3a, I've made enough updates to accommodate API changes so that cohort-mode inference for both GermlineCNVCaller and DetermineGermlineContigPloidy runs successfully under Python 3.10 and PyMC 5.9.0---although note that 5.9.1 has been released in the interim!
However, our work has just begun. Results now produced in the numerical tests mentioned above are quite far off from the original expected results. It remains to be seen whether this is due to the randomness of inference, some slight changes to the model prior that were necessitated by the API changes, or some bugs introduced in other code updates. (Also note that I believe Andrey's PR in item 4 already broke these tests, although the numerical differences were much smaller and more reasonable---but perhaps he can confirm. Also noting here that I think determinism is still currently broken as of this commit---there have been some changes to PyTensor/PyMC seeding so that our previous theano/PyMC3 hack no longer applies.)
So I think the next step is to just go to scientific-level testing and see what the fallout is. Ideally, we'd still get good performance (or perhaps better! at least on the runtime side, hopefully...) and we can just update the numerical tests. But if performance tanks, then we might need to see whether I've introduced any bugs. @mwalker174 @asmirnov239 perhaps you can comment on what might be the appropriate test suite here----1kGP?
I'll also highlight again that this PR will remove TensorFlow and might require that the corresponding CNN implementations be supported by an alternate strategy, at least until the PyTorch implementation goes in.