User Details
- User Since
- Feb 15 2022, 2:51 PM (126 w, 3 d)
- Availability
- Available
- IRC Nick
- aiko
- LDAP User
- Unknown
- MediaWiki User
- AChou-WMF
Tue, Jul 9
Thu, Jul 4
Fri, Jun 28
The feature has been deployed to the revert risk models on ml-staging.
Wed, Jun 26
@kostajh yeah, we can keep it very simple: all fields need to be provided by the caller, and any missing field will result in an invalid JSON error. On the caller side, if they can ensure all required values are correctly calculated, it should work fine. The new endpoint should be kept for internal use only.
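The strict all-fields-required behavior described above could be sketched like this. This is a hypothetical illustration, not the actual endpoint code; the field names `lang` and `revision` are assumptions drawn from the request payloads shown elsewhere on this page.

```python
# Hypothetical sketch: every field must be supplied by the caller;
# any missing field is rejected as an invalid JSON error.
REQUIRED_FIELDS = {"lang", "revision"}  # illustrative, not the real schema

def validate_payload(payload: dict) -> list[str]:
    """Return the sorted list of missing required fields (empty if valid)."""
    return sorted(REQUIRED_FIELDS - payload.keys())

missing = validate_payload({"lang": "en"})
if missing:
    # The caller would receive something like this instead of a prediction.
    error = {"error": f"invalid json: missing fields {missing}"}
```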
Tue, Jun 25
I resumed work on this task this week. Upon reviewing the schema required by the model, I think that some fields should not require user input. Instead, these values should be assigned or calculated by the system. For example, fields such as revision.bytes and revision.timestamp should be system-generated since the model uses these values for prediction. For the revision.id field, we will simply assign -1, representing a pre-save edit.
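The system-assigned fields mentioned above could look roughly like this. A minimal sketch only, assuming the field names from the comment (`revision.id`, `revision.bytes`, `revision.timestamp`); the timestamp representation is an assumption, not the model's actual schema.

```python
import time

def fill_system_fields(revision: dict, wikitext: str) -> dict:
    """Hypothetical sketch: assign the values the system should compute
    rather than accept from user input."""
    revision = dict(revision)          # avoid mutating the caller's dict
    revision["id"] = -1                # -1 marks a pre-save (not yet stored) edit
    revision["bytes"] = len(wikitext.encode("utf-8"))
    revision["timestamp"] = int(time.time())  # representation is an assumption
    return revision
```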
Fri, Jun 21
Thu, Jun 20
Jun 19 2024
Jun 10 2024
Jun 5 2024
I tested the Revert Risk models with the transparent config in staging. It worked without any issues. Notably, it seems that the transparent config somehow increases the performance of the revertrisk multilingual isvc. The requests per second (RPS) increased from 2.9 to 6.79 based on load test results.
The new revertrisk images have been deployed to production.
Jun 3 2024
May 30 2024
May 29 2024
May 24 2024
Hi @dcausse @EBernhardson, I just wanted to sync with you on whether it is acceptable to lose some events in the stream for eqiad.mediawiki_page_outlink_topic_prediction_change_v1 and eqiad.mediawiki_revision_score_drafttopic when we transition from changeprop to cp-jobqueue. If I recall correctly, Search uses these streams to update Elasticsearch. I checked the consumer groups on the dashboards (outlink, drafttopic) and the cirrus-streaming-updater-producer-eqiad was there. :)
May 17 2024
May 16 2024
May 15 2024
May 14 2024
Thanks for sharing the use case!
Potentially called on all edit attempts by not-yet-logged-in users.
One thing to note is that for edits by not-yet-logged-in users, the revert risk multilingual (RRML) model might be more suitable than the revertrisk language-agnostic (RRLA) model, as it handles bias better. But RRML requires more resources and is much slower, with prediction latency ranging from hundreds of ms to a few seconds.
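The trade-off above suggests a simple routing rule. A hypothetical sketch only, assuming the caller knows the user's login state; the model names match the two models discussed on this page, but the routing itself is not an existing API.

```python
def pick_model(user_is_logged_in: bool) -> str:
    """Route not-yet-logged-in edits to RRML (handles bias better, but
    slower and heavier) and everything else to the faster RRLA."""
    if not user_is_logged_in:
        return "revertrisk-multilingual"   # RRML: hundreds of ms to seconds
    return "revertrisk-language-agnostic"  # RRLA: cheaper and faster
```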
May 6 2024
Can't believe I missed this :(
Apr 26 2024
I got an error when testing the batch model after deploying the new image of kserve 0.12.1 for the revert risk models:
aikochou@deploy1002:~$ curl "https://inference-staging.svc.codfw.wmnet:30443/v1/models/revertrisk-language-agnostic:predict" -d@./input_some_succeed.json -H "Host: revertrisk-language-agnostic-batcher.revertrisk.wikimedia.org" --http1.1 -k | jq '.'
{
  "error": "AttributeError : 'JSONResponse' object has no attribute 'encode'"
}
It worked before. There may be a change in kserve 0.12.1 that's causing the problem. I'll debug this.
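The error above can be reproduced in miniature. This is an assumption about the failure mode, not the confirmed kserve internals: if framework code calls `.encode()` on whatever the handler returns, a plain string works but a Response-like object does not.

```python
class JSONResponse:
    """Stand-in for a web framework's JSONResponse object (illustrative)."""
    def __init__(self, content):
        self.body = content

def serialize(result):
    # Framework code that assumes the handler returned a str.
    return result.encode()

def reproduce() -> str:
    """Return the error message produced by passing a JSONResponse
    where a string is expected."""
    try:
        serialize(JSONResponse({"ok": True}))
    except AttributeError as e:
        return str(e)
    return ""
```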
Hi @kostajh, yes, this is something we can work on this quarter. I am wondering if there's an ongoing project or product in development that needs this feature. If so, could you provide the links? Also, do you have an estimate of the expected traffic for this feature? I'm assuming it will be requested via the external endpoint, correct?
Apr 23 2024
Thanks, that is what I am proposing as well. @achou, how feasible do you think this is from your side? It would involve accepting a POST with all the features (https://gitlab.wikimedia.org/repos/research/knowledge_integrity/-/blob/main/knowledge_integrity/featureset.py?ref_type=heads) needed.
Apr 19 2024
Apr 16 2024
@kostajh @XiaoXiao-WMF thanks for tagging. Sorry I was unaware of the discussion here. The ML team is currently in the middle of quarterly planning. I will bring up the proposal during our planning and get back to you shortly!
Apr 12 2024
Apr 11 2024
Apr 9 2024
I built a RRML image locally using the Pytorch 2.2.x base image from T360638.
Apr 8 2024
Apr 5 2024
We have deployed the new RRLA model server to production.
Apr 4 2024
This task is complete. I've created T361881 to follow up on the above test results issue.
FYI @MunizaA :)
The new RRLA model server featuring KI v0.6 has been deployed to ML-staging. I used wrk to conduct load testing and compare the performance between the old and new versions. The results for the previous version are under P59447, and the results for the new version are under P59464. From these results, it's clear that the new KI version does not affect the performance metrics, such as average latency and RPS.
I'm reposting what I previously wrote here, as the issue is more related to deployment.
This task is complete. Check out these examples of new error messages:
$ curl "https://inference-staging.svc.codfw.wmnet:30443/v1/models/revertrisk-language-agnostic:predict" -d '{"rev_id": 15925124, "lang": "ro"}' -H "Host: revertrisk-language-agnostic.revertrisk.wikimedia.org" --http1.1 -k | jq '.'
{
  "detail": "Could not make prediction for revision 15925124 (ro). Reason: revision_missing"
}
Apr 3 2024
@kevinbazira posed a question: how can end users switch between batch and non-batch requests?
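One possible answer, inferred from the hostnames used elsewhere on this page: the batcher appears to be exposed under a separate `-batcher` Host header on the same endpoint. This is an assumption drawn from those curl examples, not a documented contract.

```python
def host_header(batch: bool) -> str:
    """Build the Host header for batch vs non-batch requests
    (hypothetical convention based on the hostnames seen on this page)."""
    base = "revertrisk-language-agnostic"
    suffix = "-batcher" if batch else ""
    return f"{base}{suffix}.revertrisk.wikimedia.org"
```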
Apr 2 2024
Mar 28 2024
@isarantopoulos do you remember the config values in locust.conf when you ran the revertrisk tests? I can't reproduce the result in revertrisk_stats.csv. I haven't deployed RRLA to staging yet, so it's the same model you tested.