-
Notifications
You must be signed in to change notification settings - Fork 216
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
adding reboot functionality #378
Conversation
Code Metrics Report=============================================================================== Language Files Lines Code Comments Blanks =============================================================================== Dockerfile 1 34 25 0 9 Happy 1 442 369 0 73 JSON 9 21 21 0 0 Python 31 1217 1038 37 142 TOML 16 440 400 1 39 ------------------------------------------------------------------------------- Jupyter Notebooks 1 0 0 0 0 |- Markdown 1 60 30 22 8 |- Python 1 96 87 1 8 (Total) 156 117 23 16 ------------------------------------------------------------------------------- Markdown 16 1135 0 836 299 |- BASH 5 100 97 0 3 |- Python 6 122 110 0 12 |- Rust 2 80 72 3 5 (Total) 1437 279 839 319 ------------------------------------------------------------------------------- Rust 115 34276 31041 583 2652 |- Markdown 57 641 13 594 34 (Total) 34917 31054 1177 2686 =============================================================================== Total 191 37565 32894 1457 3214 =============================================================================== |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for adding this functionality. My concern is that we will not know if the engine itself has crashed and therefore there is no more sender to reboot to. I think that if the sender was closed, we should check if the engine itself is alive somehow before restoring it.
Also, I assume that you left out the rebooting logic from interactive mode on purpose? I think that we should include this logic though in the mistralrs_pyo3::Runner
methods send_completion_request
and send_chat_completion_request
.
@EricLBuehler thanks for the thoughts!
Agreed - let me think more about this before my next commit here
Will add those when I've made another stab at the above |
Okay this is my next stab, here's a quick overview:
This is a little more robust than the last approach; previously, maybe somehow the sender was dropped without the engine going down. In this new approach, the engine has definitely crashed if the thread has returned. Also this now auto-reboots on get_sender(), so no extra logic is needed on the server side. I don't believe there's a potential for deadlock with the Thoughts on this newest version? Still some to-dos with the error stuff, but I'll handle that if this looks like a solid approach. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great, I think this is almost ready to merge after you fix the errors you mentioned.
Before merging, however, I would appreciate it if you run the updated code to make sure it works, and maybe throw in a panic!
in the engine to confirm that the reboot works?
Here's a version that adds some error types, reworks the logging (warns on a reboot needed), and resolves the merge conflict. I was able to test that adding a panic into the engine.run() did crash the runtime and it was able to restart (before immediately crashing again because of the panic). Not sure why the windows check failed, let me know if there's something I can do about that -- looks like a tokenizer didn't have permissions? |
Ok, great! The Windows check is flaky for some reason. |
I don't think the clippy stuff is from this PR -- let me know if there's anything else to do for this one! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, thank you!
This is something I whipped up, thoughts? The basic idea is to be able to restart the tokio runtime if it for some reason goes down.
It introduces a few new pieces, mainly:
reboot
function, which updates theMistralRs
senderMistralRs.sender
to have aRwLock
, so we can safely swap it out with a new sender if the engine goes downRwLock
on the sender, theget_sender()
function can potentially fail, so nowget_sender()
returns a Result.Before I keep going on this, wanted to get anyone's thoughts on it!
Maybe still to do