calib and use with conda #28
Comments
Hello Chad, sorry for the (very) late response. The conda version does not install the error correction module; it only contains the clustering one, because of some issues I had adding the SPOA dependency on Bioconda. Please let me know if using conda is a must for your tests, and I can give it another try over the weekend, especially since the SPOA conda version has been updated recently.
Hey Baraa, No problem! Conda would be preferable, and I'm sure many others would appreciate it! However, if that is not possible, I'm happy to install it using your instructions in the README. I eventually installed calib following the README and was able to configure calib and calib_cons. However, after running : Extracting minimizers and barcodes... I received a binary testcluster file, which was ~2.7G and started with a
Reading cluster file: testcluster which results in empty msa and fastq files (I think resulting from the improper cluster file). Please let me know if you would like me to move this additional problem to a separate issue!
The Also, what is the length of each read mate?
863640 863640 17 ??6?ޠp?lÂóLN????T???@A?aJ,?}?????z?,?f;?>v?m;ǻ%p?ɔ????@???kv?8B
I'm using PE-125bp with 4bp UMIs
Oh, you need to add
Added to the README now. Also, about multi-threading: Calib's runtime does not scale well with more threads. If you have multiple samples, run them in parallel, each on a single thread. Also, if you want a bit of a speedup, run with
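A minimal sketch of the advice above: launching several samples in parallel while keeping each calib run on a single thread. The flags are the ones listed in calib's --help output. The sample names, the `_R1.fastq`/`_R2.fastq` naming scheme, and the helpers `build_calib_command` and `run_samples` are hypothetical, and the barcode length of 4 simply matches the 4bp UMIs mentioned in this thread.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def build_calib_command(sample, barcode_length=4):
    """Return a calib command line for one sample, pinned to a single thread.

    The <sample>_R1.fastq / <sample>_R2.fastq naming is an assumption.
    """
    return [
        "calib",
        "--input-forward", f"{sample}_R1.fastq",
        "--input-reverse", f"{sample}_R2.fastq",
        "--barcode-length", str(barcode_length),
        "--output-prefix", f"{sample}.",
        "--threads", "1",
    ]


def run_samples(samples, max_parallel=4, runner=subprocess.run):
    """Run one calib process per sample, at most max_parallel at a time."""
    commands = [build_calib_command(sample) for sample in samples]
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        # Each worker thread only waits on its own calib process.
        return list(pool.map(lambda cmd: runner(cmd, check=True), commands))


# Example (requires calib on PATH and the FASTQ files to exist):
# run_samples(["negative_control", "sample_A", "sample_B"])
```

Worker threads are enough here because the heavy lifting happens in the separate calib processes; `max_parallel` would typically be set to the number of cores available.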
That seems to be working, thank you :) I ran it quickly on my negative control and got the expected cluster TSV file (the larger data sets are still waiting in a queue)! Thanks for the additional comments on Calib's scalability. I'll close the issue and reopen it if something goes haywire over the weekend!
Alas, I've hit another snag. I was able to generate a proper cluster file with the additional
1654325 3012539 9 @HS27_336:2:1101:2628:2159 NACTGGGCCCAGCTTGCTAGACAAATAGGAGCCAGCCTGAATGATGACATTCTTTTCGGGGTGTTCGCACAAAGCAAGCCAGATTCTGCCGAACCAATGGATCGATCTGCCATGTGTGCATTCCC #<:??GDGGGGGGFGGGCEGFGCEGGGGGGGGGGGGGGGGGGGGGGFGGGGGGEGGGGGDGGGGGGGGGGDGFEFGGGGGGGGGGGG>GECGGGGGGF8FGGGGGGGGGGGGGDDDD<D=GGGGG @HS27_336:2:1101:2628:2159 CGAGTCATTGTTTTTGTTGACGATCTTGTTGAAGAAGTCGTTGACATATTTGATAGGGAATGCACACATGGCAGATCGATCCATTGGTTCGGCAGAATCTGGCTTGCTTTGTGCGAACACCCCGA A?ABBGGGGGGDGGGGGGBGGGGGGGGGGGBEGEGGGGGBFGGGGGGGGGGG1FGG>FGGDGGGGGGGGGGGGGGFDFFDFGGDGBFG>FGCD>GGE>F@GGGGGGGGGGGGGG<FGGGGBGGGG Then I was trying to pass the cluster file to calib_cons with the following command:
Reading cluster file: test.cluster
I'm then left with the following files in the working directory: 1.out.fastq
All files are empty except the test.cluster file :/. I have tried the calib_cons command with the various examples you've provided in the --help section, and they all result in the same message. Am I making another silly mistake?
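Given the binary output reported earlier in this thread, a quick sanity check on the cluster file may help before passing it to calib_cons. The sketch below assumes each line has nine whitespace-separated fields, as in the line quoted above: three integers (assumed here to be cluster, node, and read identifiers) followed by name, sequence, and quality for each mate. The function name `parse_cluster_line` and the field names are hypothetical, and read names that contain spaces would need tab splitting instead.

```python
def parse_cluster_line(line):
    """Parse one cluster-file line, raising ValueError if it looks malformed.

    Assumed layout (not confirmed by this thread): cluster id, node id,
    read id, then name/sequence/quality for mate 1 and for mate 2.
    """
    fields = line.split()
    if len(fields) != 9:
        raise ValueError(f"expected 9 fields, found {len(fields)}")

    # int() raises ValueError on non-numeric identifiers.
    cluster_id, node_id, read_id = (int(value) for value in fields[:3])
    name_1, seq_1, qual_1, name_2, seq_2, qual_2 = fields[3:]

    if not (name_1.startswith("@") and name_2.startswith("@")):
        raise ValueError("read names should start with '@'")
    if len(seq_1) != len(qual_1) or len(seq_2) != len(qual_2):
        raise ValueError("sequence and quality lengths differ")

    return {
        "cluster_id": cluster_id,
        "node_id": node_id,
        "read_id": read_id,
        "mate_1": (name_1, seq_1, qual_1),
        "mate_2": (name_2, seq_2, qual_2),
    }
```

Running this over the first few lines of a cluster file should fail fast on binary content while accepting lines shaped like the one quoted above.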
How did you generate the
I definitely misinterpreted the comment about the space-separated FASTQ list. I thought that since the sequences from the fastq.R1 and fastq.R2 files were already present in the cluster file, the original FASTQ files were no longer needed. I used the name of the unzipped
Hi @baraaorabi, Best, Sarah
Hi @sandmanns! Please give the latest conda release a try. It should now include the error correction module. Let me know if it works!
Perfect! It works.
Hey @baraaorabi,
I was wondering how the consensus and error correction steps are performed with the conda-installed version of calib?
I was able to generate the test.cluster with the following command:
calib --input-forward R1.fastq.gz --input-reverse R2.fastq.gz --barcode-length 4 --output-prefix test. --minimizer-count 7 --kmer-size 8 --error-tolerance 1 --minimizer-threshold 2
BUT, I'm unable to proceed with the consensus and error correction steps because the conda-installed version offers no additional calib arguments for them:
$ calib --help
Combined barcode lengths must be a positive integer and each mate barcode length must be non-negative! Note if both mates have the same barcode length you can use -l/--barcode-length parameter instead.
Calib: Clustering without alignment using LSH and MinHashing of barcoded reads
Usage: calib [--PARAMETER VALUE]
Example: calib -f R1.fastq -r R2.fastq -o my_out. -e 1 -l 8 -m 5 -t 2 -k 4 --silent
Calib's paramters arguments:
-f --input-forward (type: string; REQUIRED paramter)
-r --input-reverse (type: string; REQUIRED paramter)
-o --output-prefix (type: string; REQUIRED paramter)
-s --silent (type: no value; default: unset)
-q --no-sort (type: no value; default: unset)
-g --gzip-input (type: no value; default: unset)
-l --barcode-length (type: int; REQUIRED paramter unless -l1 and -l2 are provided)
-l1 --barcode-length-1 (type: int; REQUIRED paramter unless -l is provided)
-l2 --barcode-length-2 (type: int; REQUIRED paramter unless -l is provided)
-p --ignored-sequence-prefix-length (type: int; default: 0)
-m --minimizer-count (type: int; default: Depends on observed read length;)
-k --kmer-size (type: int; default: Depends on observed read length;)
-e --error-tolerance (type: int; default: Depends on observed read length;)
-t --minimizer-threshold (type: int; default: Depends on observed read length;)
-c --threads (type: int; default: 1)
-h --help
Am I missing something here?
Best,
Chad
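One quick way to see which Calib executables a given installation actually provides is to look them up on PATH. A minimal sketch, assuming the tools are named calib and calib_cons as in this thread; the helper `missing_tools` is hypothetical.

```python
import shutil


def missing_tools(tools=("calib", "calib_cons")):
    """Return the names from tools that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


# An empty list means every listed executable was found.
print(missing_tools())
```

If `calib_cons` shows up in the result, the active environment only ships the clustering step.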