Roary not generating pan_genome_reference.fa #223
Comments
Sounds like an interesting project, although plasmids vary so much you'd
Also, maybe turn off splitting paralogs, since the gene order varies so much,
Andrew,
Thanks for the help! No core genes were found in the summary_stats using the defaults, which was no surprise. I hadn't run it with -e -n because I wasn't expecting a core gene alignment (as there were no core genes). However, running Roary with -e -n does generate pan_genome_reference.fa as expected. Oddly enough, the post-analysis step still takes a while to run, which is odd because there should be no core genes to align. I apologize that it wasn't clear to me that -e -n were necessary to generate the pan_genome_reference. Running with -s to avoid splitting paralogs made no difference to whether pan_genome_reference.fa was generated.
Best,
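For anyone landing here with the same problem, the invocation described in the comment above can be sketched as a small Python helper. This is only an illustration under stated assumptions: the function name `build_roary_command`, the output directory, the thread count, and the GFF file names are all hypothetical, and `-p` (threads) and `-f` (output directory) are standard Roary options that were not part of this discussion. The flags `-e`, `-n`, and `-s` are the ones mentioned in the thread.

```python
import shlex


def build_roary_command(gff_files, out_dir="roary_out", threads=4, split_paralogs=True):
    """Assemble a Roary command line that requests a core gene alignment.

    Per this thread, -e -n were needed for pan_genome_reference.fa to be
    written. All names and default values here are illustrative assumptions.
    """
    cmd = ["roary", "-e", "-n", "-p", str(threads), "-f", out_dir]
    if not split_paralogs:
        # -s tells Roary not to split paralogs
        cmd.append("-s")
    cmd.extend(gff_files)
    return cmd


if __name__ == "__main__":
    # Print the command so it can be inspected before running it by hand
    # (or passing the list to subprocess.run).
    print(shlex.join(build_roary_command(["plasmid1.gff", "plasmid2.gff"])))
```

Building the command as a list rather than a single string keeps file names with spaces intact if the list is later handed to `subprocess.run`.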
Sorry if it wasn't clear in the documentation; it's on my list to improve it somewhat.
Moving on from my earlier work, I am attempting to generate a "panplasmidome" using reference plasmid sequences from GenBank. Of course, as these plasmids are diverse, there is no real "core" to speak of. However, having a "pan_genome_reference" could be useful for mapping short reads when looking for known plasmid gene content.
I've run Roary several times, but pan_genome_reference.fa isn't being generated. The other usual output files do appear to be generated, including gene_presence_absence.csv and all of the Rtab files. No errors or warnings are reported.
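Because Roary reported no errors here, a quick check of the output directory is one way to notice the missing file. The following is a minimal sketch, not part of Roary: the helper name `missing_outputs` and the directory name are hypothetical, and the list of expected files contains only the two file names mentioned in this issue (extend it as needed).

```python
from pathlib import Path

# Only the file names mentioned in this issue; an assumption, not a full list.
EXPECTED_OUTPUTS = ("gene_presence_absence.csv", "pan_genome_reference.fa")


def missing_outputs(out_dir, expected=EXPECTED_OUTPUTS):
    """Return the expected file names that are absent or empty in out_dir."""
    out = Path(out_dir)
    missing = []
    for name in expected:
        path = out / name
        # Treat a zero-byte file the same as a missing one.
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # "roary_out" is a placeholder for the actual Roary output directory.
    print(missing_outputs("roary_out"))
```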
Altering the threshold for calling genes "core" to a low percentage so that some core genes are called does not appear to change the script behavior. Is there an inherent problem with using Roary to generate a pan-plasmid reference? I've tried gff converted from gb files, as well as fresh FASTA files annotated by Prokka. Same result. If this is in fact an error, and not a known limitation, I'll upload sample data.
Best,
S. Wesley Long