Permissive Parsing and Phenopacket Versions #450

Open
iimpulse opened this issue Sep 21, 2022 · 2 comments

@iimpulse
iimpulse commented Sep 21, 2022

Hi Jules,

Daniel and I have found that our Exomiser analysis pipeline lets v2 phenopackets through the command-line runner because parsing is permissive. In the schema change from v1 to v2, "negated" was renamed to "excluded". Exomiser still runs, but it lets all phenotypes through, since the filtering step looks for negated terms, not excluded terms. This leads to incorrect analysis because excluded terms are interpreted as present.

One possible solution is to upgrade the phenopacket parser to first attempt a strict (non-permissive) v2 parse and, on failure, try to parse the input as a v1 phenopacket, failing hard if that also fails. Of course, how to move forward with parsing is your call.
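
A minimal sketch of that control flow, using only the Java standard library. Everything here is hypothetical and simplified: `StrictFallbackSketch`, `Feature`, and the reduced field sets are stand-ins for illustration, not Exomiser or phenopacket-schema code. If the real parser is built on protobuf's `JsonFormat`, "strict" would presumably mean not calling `ignoringUnknownFields()`, but that is an assumption about the implementation.

```java
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch: strict v2 parse first, strict v1 fallback, otherwise fail hard. */
final class StrictFallbackSketch {

    /** Simplified stand-in for a parsed phenotypic feature. */
    static final class Feature {
        final String termId;
        final boolean absent; // true if the term was negated (v1) or excluded (v2)

        Feature(String termId, boolean absent) {
            this.termId = termId;
            this.absent = absent;
        }
    }

    // Assumed, heavily reduced field sets for a phenotypic feature in each schema version.
    private static final Set<String> V1_FIELDS = Set.of("id", "negated");
    private static final Set<String> V2_FIELDS = Set.of("id", "excluded");

    /** Strict parse: a field outside the allowed set is an error, never silently ignored. */
    static Feature parseStrict(Map<String, ?> raw, Set<String> allowed, String absentKey) {
        for (String key : raw.keySet()) {
            if (!allowed.contains(key)) {
                throw new IllegalArgumentException("Unknown field: " + key);
            }
        }
        boolean absent = Boolean.TRUE.equals(raw.get(absentKey));
        return new Feature((String) raw.get("id"), absent);
    }

    /** Try v2 strictly, then v1 strictly; throw if neither schema accepts the input. */
    static Feature parse(Map<String, ?> raw) {
        try {
            return parseStrict(raw, V2_FIELDS, "excluded");
        } catch (IllegalArgumentException v2Failure) {
            try {
                return parseStrict(raw, V1_FIELDS, "negated");
            } catch (IllegalArgumentException v1Failure) {
                IllegalArgumentException failure =
                        new IllegalArgumentException("Input is neither a valid v2 nor v1 feature");
                failure.addSuppressed(v2Failure);
                failure.addSuppressed(v1Failure);
                throw failure;
            }
        }
    }
}
```

With this shape, an input carrying `"excluded": true` is accepted by the v2 branch and marked absent, an input carrying `"negated": true` falls through to the v1 branch, and anything else is rejected instead of being passed on with the flag dropped.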

We might also want to consider revving to v2 phenopackets for 14.0.

Below is a test phenopacket for checking the parsing.

{
  "id": "phenopacket-id",
  "subject": {
    "id": "subject-id"
  },
  "phenotypicFeatures": [{
    "type": {
      "id": "HP:0008066",
      "label": "Abnormal blistering of the skin"
    },
    "excluded": true
  }, {
    "type": {
      "id": "HP:0040189",
      "label": "Scaling skin"
    }
  }],
  "metaData": {
    "created": "2022-01-23T11:11:11.000000Z",
    "createdBy": "phenopacket-wizard",
    "resources": [{
      "id": "hp",
      "name": "human phenotype ontology",
      "url": "http://purl.obolibrary.org/obo/hp.owl",
      "version": "http://purl.obolibrary.org/obo/hp/releases/2022-06-11/hp.json",
      "namespacePrefix": "HP",
      "iriPrefix": "http://purl.obolibrary.org/obo/HP_"
    }],
    "phenopacketSchemaVersion": "2.0"
  }
}
@ielis
ielis commented Sep 21, 2022

An example to demonstrate the mishap (can be used in ProtoParserTest):

This code parses the phenopacket above:

@Test
void readV1PhenopacketJson() {
    // Path to the phenopacket from Mike's post
    Phenopacket phenopacket = parsePhenopacket(Paths.get("example.json"));

    assertThat(phenopacket.getPhenotypicFeaturesCount(), equalTo(2)); // passes

    long positivePhenotypicFeaturesCount = phenopacket.getPhenotypicFeaturesList().stream()
            .filter(pf -> !pf.getNegated())
            .count();
    // Fails: permissive parsing ignores the `excluded` field of the v2 phenopacket,
    // so `Abnormal blistering of the skin` is treated as a present term even though
    // it should have been omitted from the analysis.
    assertThat(positivePhenotypicFeaturesCount, equalTo(1L));
}

@julesjacobsen
Contributor

Thanks for the feedback Mike and Daniel. I'll add this to 14.0.0.

@julesjacobsen julesjacobsen added this to To do in Release 14.0.0 via automation Sep 26, 2022