value-less qualifiers get an empty value in EMBL export #902
Comments
Can you check what the parser did? It should be parsing https://github.com/biopython/biopython/blob/master/Bio/SeqIO/InsdcIO.py#L300
It's definitely an empty list in the example:
Edit: Technically, it's not even an empty list, but a list containing only an empty string.
Thanks - looks like something was broken in the parser, and we can't have any tests at the moment which check this explicitly.
OK many thanks, for now I will add a workaround. What do you mean by 'we can't have any tests which check this explicitly' -- are you saying there are no tests but there should be, or that you aren't intending to test this at all?
Sorry, poor phrasing. I meant that the fact this bug occurred implies we currently don't have a test checking for this. We should have a test for this.
It looks like the parser was changed a while back. Switching the parser to use
That is probably invalid, but might be stored as
This is to address biopython#902, and a possibly accidental change to the original parser behaviour. The GenBank and EMBL writers expect None to write a valueless qualifier (rather than an empty string).
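As a stop-gap on the user side, the behaviour described in that commit message can be sketched as follows. This is only an illustration, not code from Biopython: the helper name `blank_to_none` and its `names` parameter are hypothetical, and it assumes (as reported in this thread) that the parser yields `[""]` for a value-less qualifier and that the writers emit a bare qualifier when the value is `None`.

```python
def blank_to_none(qualifiers, names=("pseudo",)):
    """Replace blank values of the named qualifiers with None, in place.

    `qualifiers` is any dict-like mapping of qualifier name to value.
    `names` lists the qualifiers that are expected to carry no value;
    the default of ("pseudo",) is an assumption made for this example.
    """
    for name in names:
        # The thread reports a list containing only an empty string;
        # a bare empty string is treated the same way here.
        if qualifiers.get(name) in ("", [""]):
            qualifiers[name] = None
    return qualifiers
```

It would be applied to `feature.qualifiers` for each feature in `record.features` before writing. Restricting the change to an explicit list of names avoids touching qualifiers for which an empty quoted value may be intended.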
The good news is for the GenBank/EMBL output, you can use either
Since this overlaps with BioSQL, I filed biosql/biosql#6
Well, this would work for setting new pseudo entries, but what I wanted was keeping the pseudo qualifier untouched in features read from EMBL. At the moment I am writing the EMBL output into a StringIO and then cleaning it up afterwards anyway (not just the
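The clean-up step mentioned above might look roughly like this. It is a minimal sketch rather than the commenter's actual code: the function name `strip_empty_pseudo` is hypothetical, and it assumes the exported text contains feature-table lines of the form `FT   /pseudo=""`.

```python
import re

# Matches an EMBL feature-table line whose /pseudo qualifier was written
# with an empty quoted value, allowing trailing spaces or tabs.
_EMPTY_PSEUDO = re.compile(r'^(FT\s+/pseudo)=""[ \t]*$', re.MULTILINE)


def strip_empty_pseudo(embl_text):
    """Return the EMBL text with /pseudo="" rewritten as a bare /pseudo."""
    return _EMPTY_PSEUDO.sub(r"\1", embl_text)
```

With the output written to a `StringIO` as described, the function would be called on the buffer's `getvalue()` before saving the file. Only `/pseudo` is rewritten here; other qualifiers would need their own patterns.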
Pull request #905 makes the parser changes I outlined (so we can read and then write
Hi peterjc, I saw this is still handled like this in SeqIO.write and the code http://biopython.org/DIST/docs/api/Bio.GenBank-pysrc.html#_FeatureConsumer.feature_qualifier
@Carambakaracho Can you have a look at #905?
When reading/writing EMBL files, qualifiers without values like
/pseudo
get an empty value after exporting. Biopython 1.67. Example: on the following example file:
test.embl.zip
This fails validation by ENA's pre-submission validator.