retire FilterVcf in favor of VariantFiltration? #1215

Closed
akiezun opened this issue Dec 2, 2015 · 7 comments

Comments

@akiezun
Contributor
akiezun commented Dec 2, 2015

FilterVcf seems to be a subset of VariantFiltration. Should we retire it?

@akiezun
Contributor Author
akiezun commented Dec 2, 2015

WDYT @yfarjoun @ldgauthier ?

@ldgauthier
Contributor

I've never used the Picard one. I'm under the impression that the Picard version actually removes variants from the output, which I am against.

@yfarjoun
Contributor
yfarjoun commented Dec 2, 2015

OK, I looked at the code. They do the same thing: neither removes variants; both apply HARD filters to them.

The Picard version was written because GATK was too slow.
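
To illustrate the distinction being drawn here, the following is a minimal Python sketch of a hard filter that annotates the FILTER column of a VCF record instead of dropping the record. It is not the code of either tool: the function name `apply_hard_filter`, the `LowQD` filter name, and the QD threshold are all made up for illustration.

```python
def apply_hard_filter(vcf_line, min_qd=2.0, filter_name="LowQD"):
    """Return the VCF data line with FILTER updated; the record is never dropped.

    Hypothetical helper for illustration only, not taken from Picard or GATK.
    """
    fields = vcf_line.rstrip("\n").split("\t")
    # VCF columns: CHROM POS ID REF ALT QUAL FILTER INFO ...
    info = dict(
        item.split("=", 1) for item in fields[7].split(";") if "=" in item
    )
    failed = "QD" in info and float(info["QD"]) < min_qd
    current = fields[6]
    if failed:
        # Record the failure in FILTER, replacing "." or "PASS" if present.
        if current in (".", "PASS"):
            fields[6] = filter_name
        elif filter_name not in current.split(";"):
            fields[6] = current + ";" + filter_name
    elif current == ".":
        # No filter failed and none was recorded before: mark as passing.
        fields[6] = "PASS"
    return "\t".join(fields)


records = [
    "1\t100\t.\tA\tG\t50\t.\tQD=1.5;DP=30",
    "1\t200\t.\tC\tT\t90\t.\tQD=12.0;DP=41",
]
filtered = [apply_hard_filter(r) for r in records]
```

Both input records are still present in `filtered`; only the FILTER column differs, so downstream tools can decide whether to skip failing sites.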


@ldgauthier
Contributor

👍 for fast.

@akiezun
Contributor Author
akiezun commented Dec 2, 2015

Thanks for clarifying. I'll compare the performance against GATK4 and report the results here.

@akiezun akiezun assigned akiezun and unassigned vdauwera Dec 2, 2015
@akiezun
Contributor Author
akiezun commented Apr 18, 2016

It's true: FilterVcf is faster when filtering genotypes. I tested on one small VCF file with 2535 samples and 12632 lines (675M); timings are below.

I'm closing this, then. It's unclear whether we want to invest in speeding up VariantFiltration, given that we're moving away from VCF files at some point.

No filters:
VariantFiltration: Elapsed time: 0.20 minutes.
FilterVcf: Elapsed time: 1.11 minutes.

MIN_DP (per-genotype filter):
VariantFiltration: Elapsed time: 2.05 minutes.
FilterVcf: Elapsed time: 1.16 minutes.

MIN_GQ (per-genotype filter):
VariantFiltration: Elapsed time: 2.06 minutes.
FilterVcf: Elapsed time: 1.16 minutes.
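
For readers unfamiliar with per-genotype filters such as MIN_DP and MIN_GQ, here is a rough Python sketch of the idea: every sample's genotype at a site is checked against a depth and a genotype-quality threshold and tagged with an FT value. This is an assumption-laden illustration, not either tool's implementation; the helper names, the `LowDP`/`LowGQ` labels, and the dictionary layout are hypothetical.

```python
def genotype_filters(genotype, min_dp=0, min_gq=0):
    """Return the names of the per-genotype filters that a genotype fails."""
    failed = []
    dp = genotype.get("DP")
    gq = genotype.get("GQ")
    if dp is not None and dp < min_dp:
        failed.append("LowDP")  # hypothetical filter label
    if gq is not None and gq < min_gq:
        failed.append("LowGQ")  # hypothetical filter label
    return failed


def filter_site(genotypes, min_dp=0, min_gq=0):
    """Tag every sample's genotype with an FT value; no genotype is removed."""
    annotated = {}
    for sample, gt in genotypes.items():
        failed = genotype_filters(gt, min_dp, min_gq)
        annotated[sample] = dict(gt, FT=";".join(failed) if failed else "PASS")
    return annotated


site = {
    "sampleA": {"GT": "0/1", "DP": 8, "GQ": 45},
    "sampleB": {"GT": "1/1", "DP": 35, "GQ": 12},
    "sampleC": {"GT": "0/0", "DP": 40, "GQ": 99},
}
annotated = filter_site(site, min_dp=10, min_gq=20)
```

In this sketch the work per site grows with the number of samples, since each genotype is inspected individually.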

@akiezun akiezun closed this as completed Apr 18, 2016
@akiezun
Contributor Author
akiezun commented Apr 18, 2016

One clarification: FilterVcf is not a subset of VariantFiltration. It offers one additional option, a filter on allele balance computed on the fly. I added #1726.
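
As a sketch of what "allele balance computed on the fly" could look like, the Python below derives the balance from per-allele read depths (the AD field) of a heterozygous genotype rather than reading a precomputed annotation. It assumes allele balance means the fraction of reads supporting the less-represented of the two called alleles; the function names and the threshold are illustrative and not taken from either tool.

```python
def allele_balance(depths):
    """Fraction of reads supporting the less-represented allele, or None if no reads."""
    total = sum(depths)
    if total == 0:
        return None
    return min(depths) / total


def fails_min_ab(gt, ad, min_ab):
    """Return True if a heterozygous genotype has allele balance below min_ab.

    gt is a genotype string such as "0/1"; ad lists read depths per allele index.
    Homozygous, non-diploid, or missing genotypes are never flagged.
    """
    alleles = gt.replace("|", "/").split("/")
    if len(alleles) != 2 or "." in alleles or alleles[0] == alleles[1]:
        return False
    first, second = int(alleles[0]), int(alleles[1])
    ab = allele_balance([ad[first], ad[second]])
    return ab is not None and ab < min_ab


skewed = fails_min_ab("0/1", [18, 2], min_ab=0.2)     # 2 of 20 reads support ALT
balanced = fails_min_ab("0/1", [10, 10], min_ab=0.2)  # evenly split reads
hom_alt = fails_min_ab("1/1", [0, 20], min_ab=0.2)    # not heterozygous
```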
