resctrl does not handle orphan mon_groups #3057

Open
srl11 opened this issue Feb 8, 2022 · 6 comments
@srl11
srl11 commented Feb 8, 2022

When resctrl_interval is set to a non-zero value, the resctrl collector in cadvisor creates a mon_group for each container and removes it when the container is destroyed.
However, cadvisor does not clean up the mon_group of a container that is destroyed while cadvisor is restarting.
Should we check for these orphan mon_groups and clean them up?

Related versions: v0.43.0, v0.42.0, v0.41.0 (commit)

My solution:
If resctrl_interval != 0, cache the mon_groups created by cadvisor (those whose names carry the cadvisor monGroupPrefix) and clean up the ones that have expired.

@Creatone @katarzyna-z @dashpole

@srl11
Author
srl11 commented Feb 8, 2022

another comment for resctrl comment

@srl11
Author
srl11 commented Feb 13, 2022

Is it appropriate to deal with the issue this way? Could you please help review the PR? Thanks. 🌹 @Creatone @katarzyna-z @dashpole

@Creatone
Collaborator
Creatone commented Mar 7, 2022

Maybe the simplest solution is to clear all "cadvisor" prefixed mon groups during setup?

@srl11
Author
srl11 commented Mar 10, 2022

Maybe the simplest solution is to clear all "cadvisor" prefixed mon groups during setup?

It is the simplest solution. But if cadvisor crashes frequently, it will clean up and re-create the mon_groups just as frequently. I'm not sure whether that is costly for the kernel, as I'm not familiar with how mon_groups work in the kernel 😭.

@Creatone
Collaborator

Frequent crashing is not expected behavior.
I understand the corner case. We previously fixed the same thing with the simplest solution in intel/workload-collocation-agent.

@Creatone
Collaborator

@srl11 Could you check if that commit is sufficient for you? Creatone@3e2feae
