-f, --trim_front1 option gives fewer than expected counts #155

Open
sujaikumar opened this issue May 2, 2019 · 2 comments

Comments

@sujaikumar

Thanks for an excellent piece of software - fastp is fast, accurate, and the defaults/options/reports for merging/trimming are perfect for all our needs!

However, assuming I've understood the options correctly, there is possibly a bug:

The SRA run SRR2413286 (for example) has a 20 bp adapter at the front (left) of each read that needs to be trimmed, as well as a 3' adapter.

I get different counts when I trim with and without the -f 20 option, even after adjusting the -l / --length_required limit accordingly.

Code to reproduce (using fastp 0.19.4):

wget ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR241/006/SRR2413286/SRR2413286.fastq.gz
fastp -i SRR2413286.fastq.gz -a GTGTCAGTCACTTCCAGCGG -f 20 -l 18 -n 0 -o SRR2413286.fastp.f20.l18.fq
fastp -i SRR2413286.fastq.gz -a GTGTCAGTCACTTCCAGCGG -l 38 -n 0 -o SRR2413286.fastp.l38.fq

wc -l SRR2413286.fastp*fq
  25822936 SRR2413286.fastp.f20.l18.fq
  26231448 SRR2413286.fastp.l38.fq

Is fastp doing something else that causes the number of reads in the first case (-f 20 -l 18) to be lower than in the second case (-l 38)?
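
For reference, wc -l reports lines rather than reads. A minimal sketch of the conversion, assuming the outputs are plain FASTQ with exactly four lines per record (the helper name reads_from_lines is made up for illustration):

```shell
# Convert a FASTQ line count into a read count.
# Assumption: uncompressed FASTQ with exactly 4 lines per record.
reads_from_lines() {
    echo $(( $1 / 4 ))
}

f20=$(reads_from_lines 25822936)   # line count of the -f 20 -l 18 output
l38=$(reads_from_lines 26231448)   # line count of the -l 38 output

echo "f20.l18 reads: $f20"
echo "l38 reads:     $l38"
echo "difference:    $(( l38 - f20 ))"
```

With the line counts quoted above, this works out to 6455734 versus 6557862 reads, so the -f 20 -l 18 run keeps 102128 fewer reads.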

@sfchen
Member
sfchen commented May 3, 2019

Can you please try the new version v0.20.0?

Some related changes have been made since v0.19.4.

@sujaikumar
Author

Thanks for the suggestion. I just tried it with v0.20.0:

wget ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR241/006/SRR2413286/SRR2413286.fastq.gz
fastp -i SRR2413286.fastq.gz -a GTGTCAGTCACTTCCAGCGG -f 20 -l 18 -n 0 -o SRR2413286.fastp.f20.l18.fq
fastp -i SRR2413286.fastq.gz -a GTGTCAGTCACTTCCAGCGG -l 38 -n 0 -o SRR2413286.fastp.l38.fq

wc -l SRR2413286.fastp*fq

  25800360 SRR2413286.fastp.f20.l18.fq
  26231448 SRR2413286.fastp.l38.fq

It is still discarding more reads with -f 20 -l 18 than with -l 38.
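
One way to narrow this down might be to list the reads that survive one run but not the other and inspect a few of them in the input. This is only a diagnostic sketch and does not explain the cause. It assumes plain four-line FASTQ records and that both outputs keep the original read IDs; the function names fastq_ids and ids_only_in_first are made up for illustration.

```shell
# Print the sorted read IDs of a FASTQ file (first token of each header, without '@').
# Assumption: plain FASTQ with exactly 4 lines per record.
fastq_ids() {
    awk 'NR % 4 == 1 { print substr($1, 2) }' "$1" | LC_ALL=C sort
}

# Print the read IDs that occur in the first file but not in the second.
ids_only_in_first() {
    a=$(mktemp)
    b=$(mktemp)
    fastq_ids "$1" > "$a"
    fastq_ids "$2" > "$b"
    LC_ALL=C comm -23 "$a" "$b"   # keep lines unique to the first list
    rm -f "$a" "$b"
}

# Usage with the file names from the commands above:
#   ids_only_in_first SRR2413286.fastp.l38.fq SRR2413286.fastp.f20.l18.fq | head
```

Comparing the per-filter totals in the filtering summary that fastp prints for each run may also show which filter accounts for the extra discarded reads.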
