
Deploy RR-language-agnostic batch version to prod
Closed, Resolved · Public · 3 Estimated Story Points

Description

In T348536, we tested the kserve batcher on RR-language-agnostic in staging and confirmed its effectiveness. For this test, I used my branch of knowledge integrity, which adds new functions to RR-language-agnostic to support batch feature extraction and classification.

To deploy the RR-language-agnostic batch version to production, we need to merge these changes into the latest knowledge integrity release, v0.6.0. However, this is currently on hold due to a problem with Pydantic (see T355742). We will start this task once the issue is resolved.

Event Timeline

calbon set the point value for this task to 3.

I'm reposting what I previously wrote here, as the issue is more related to deployment.

@kevinbazira posed a question - how can end users switch between batch and non-batch requests?

First, to clarify: the batch model can also handle single requests. For example, given this input:

{
    "instances": [
      {
        "lang": "en",
        "rev_id": 123456
      }
    ]
}

The main differences between the base model (currently in production) and the batch model (the new one) are:

  • The batch model supports multiple predictions in a single request.
  • The batch model uses a different input/output schema, required by the KServe batcher.
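For illustration (the revision IDs here are just examples), a batch request wraps multiple revisions in the `instances` list; assuming the model follows the KServe v1 protocol, the response carries a parallel `predictions` list:

```json
{
    "instances": [
        {"lang": "en", "rev_id": 123456},
        {"lang": "de", "rev_id": 654321}
    ]
}
```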

Regarding how end users access the batch model, there are three options:

  1. Replace the current model with the batch model

I think this was the plan when we set the goal in T348153. The concern here is that the input/output schema is a breaking change that could impact downstream applications. Given that the RevertRisk language-agnostic model currently handles production traffic, we would need to notify downstream product owners and provide support as needed. This switch would also introduce some inconsistency among our Lift Wing models, as this model server would be the first to use a different input/output schema.

  2. Create a new endpoint for the batch model

We could add a new endpoint, such as /v1/models/revertrisk-language-agnostic-batch, and document the changed schema and usage examples on the model card, the API Gateway doc, and the Lift Wing doc. We would then inform end users that they can use this new endpoint to request multiple predictions. However, this would add maintenance work, as we would essentially be providing two different services for the same model.

  3. Find a way to support both schemas in one endpoint

We could make the batch model backwards compatible with the current schema for single requests, but this would complicate our code, and the distinction between the base model and the batch model would become blurred, which is not desirable. Alternatively, there may be a way to redirect batch requests to the batch isvc; I'm not sure of its feasibility, but that would be ideal.
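A minimal sketch of the third option, with hypothetical helper names (this is not the actual inference-services code): the endpoint could normalize incoming payloads so that a base-schema single request is wrapped into the batch schema before prediction.

```python
# Hypothetical sketch: accept both the base schema
# ({"lang": ..., "rev_id": ...}) and the batch schema
# ({"instances": [...]}) at a single endpoint.
def normalize_payload(payload: dict) -> dict:
    """Return a batch-style payload regardless of the input schema."""
    if "instances" in payload:
        # Already batch-style; pass through unchanged.
        return payload
    # Base-style single request: wrap the revision in an instances list.
    return {"instances": [{"lang": payload["lang"], "rev_id": payload["rev_id"]}]}
```

This keeps a single endpoint, at the cost of the base/batch blurring described above.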

At first I leaned towards the second option to avoid introducing a breaking change to our production service. However, upon further consideration, it seems excessive to create a new endpoint for the batch model.

What do people think about this?

Change #1020835 had a related patch set uploaded (by AikoChou; author: AikoChou):

[machinelearning/liftwing/inference-services@main] revertrisk: add support for base model's payloads in batch model

https://gerrit.wikimedia.org/r/1020835

Change #1020835 merged by jenkins-bot:

[machinelearning/liftwing/inference-services@main] revertrisk: add support for base model's payloads in batch model

https://gerrit.wikimedia.org/r/1020835

Change #1021966 had a related patch set uploaded (by AikoChou; author: AikoChou):

[operations/deployment-charts@master] ml-services: update batch revertrisk LA image in staging

https://gerrit.wikimedia.org/r/1021966

Change #1021966 merged by jenkins-bot:

[operations/deployment-charts@master] ml-services: update batch revertrisk LA image in staging

https://gerrit.wikimedia.org/r/1021966

I got an error when testing the batch model after deploying the new image with KServe 0.12.1 for the revert risk models:

aikochou@deploy1002:~$ curl "https://inference-staging.svc.codfw.wmnet:30443/v1/models/revertrisk-language-agnostic:predict" -d@./input_some_succeed.json -H "Host: revertrisk-language-agnostic-batcher.revertrisk.wikimedia.org" --http1.1 -k | jq '.'
{
  "error": "AttributeError : 'JSONResponse' object has no attribute 'encode'"
}

It worked before, so a change in KServe 0.12.1 may be causing the problem. I'll debug this.
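For context, the fix that was later merged (change #1035012) modifies the response to a dict type. A hedged sketch of that kind of change, with illustrative names rather than the actual model-server code: returning a plain dict lets KServe serialize the response itself, instead of handing it a pre-built `JSONResponse` object that KServe 0.12.x then tries to encode again.

```python
# Illustrative sketch (not the actual model code): in KServe 0.12.x the
# server serializes the value returned by predict(), so returning a
# pre-built JSONResponse fails with
# "'JSONResponse' object has no attribute 'encode'".
def predict(instances: list) -> dict:
    # A real model would run batch inference here; we just echo inputs.
    results = [{"lang": i["lang"], "rev_id": i["rev_id"]} for i in instances]
    # Return a plain dict and let KServe build the JSON response.
    return {"predictions": results}
```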

Change #1035012 had a related patch set uploaded (by AikoChou; author: AikoChou):

[machinelearning/liftwing/inference-services@main] revertrisk: modify the response to dict type in batch model

https://gerrit.wikimedia.org/r/1035012

Change #1035012 merged by jenkins-bot:

[machinelearning/liftwing/inference-services@main] revertrisk: modify the response type in batch model

https://gerrit.wikimedia.org/r/1035012

Change #1038736 had a related patch set uploaded (by AikoChou; author: AikoChou):

[operations/deployment-charts@master] ml-services: update RevertRisk LA/ML/Wikidata's images

https://gerrit.wikimedia.org/r/1038736

Change #1038736 merged by jenkins-bot:

[operations/deployment-charts@master] ml-services: update RevertRisk LA/ML/Wikidata's images

https://gerrit.wikimedia.org/r/1038736

The new revertrisk images have been deployed to production.

Next steps:

  • Update API documentation to inform users that they can request multiple predictions in a single request.