[go: nahoru, domu]

Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Discussion: How paused state for a managed resource should be reported? #363

Open
ulucinar opened this issue Nov 3, 2022 · 3 comments
Open
Labels
enhancement New feature or request

Comments

@ulucinar
Copy link
Contributor
ulucinar commented Nov 3, 2022

What problem are you facing?

In a recent discussion with @luebken and @muvaf and me, there were some concerns around how the paused state of a managed resource is communicated. The current behavior is to put the resource into a Synced=False state with the reason ReconcilePaused as follows:

status:
  atProvider: {}
  conditions:
  - lastTransitionTime: "2022-09-28T12:30:45Z"
    reason: ReconcilePaused
    status: "False"
    type: Synced

So for example, if the managed resource is Ready=True, Synced=True state, pausing it will transition it into the Ready=True, Synced=False/ReconcilePaused state omitting a paused event on the managed resource. A subsequent kubectl get on the managed resource will also report SYNCED=False:

> k get vpc

NAME                      READY   SYNCED   ID                      CIDR             IPV6CIDR   AGE
my-resource-7q8tk-v4hjw   True    False    vpc-077f9c262d8241fe7   192.168.0.0/16              19h

and then a kubectl describe or a kubectl get -o yaml can be used the check if the Synced=False is due to a ReconcileError or a ReconcilePaused.

And if the crossplane.io/paused annotation is removed, or its value is changed to something other than "true", then the managed resource will transition into the Synced=True (if there are no reconcile errors), or the Synced=False/ReconcileError state.

We should also keep in mind that once all the events associated with a paused managed resource are drained from the controller's workerqueue, no further events are queued for the paused managed resource and we will not observe any more paused events for the resource unless an external entity (client) modifies/deletes the managed resource, causing new reconciliations.

With this issue, we would like to discuss:

  • Whether we should reserve the Synced=False status condition only to ReconcileErrors and remove the ReconcilePaused reason associated with the Synced=False condition.
  • If we no longer report the paused state of a managed resource via Synched=False condition, whether we want to report this state as part of the resource's status somewhere (like with a new status condition or elsewhere like Ready=False, etc.)
  • ...

How could Crossplane help solve your problem?

I'm opening this issue for collecting feedback and ideas and also to increase awareness for the new feature.

@ulucinar ulucinar added the enhancement New feature or request label Nov 3, 2022
@muvaf
Copy link
Member
muvaf commented Nov 8, 2022

Quoting my message from Slack.

This was a long discussion when we first discussed how to do observe-only resources, i.e. whether it should be an annotation/policy field or a separate kind. Where we sort of landed was a separate kind with data field instead of spec because ultimately, the fact that user wants it to be observe-only or paused overrides their desire of how the resource should look like but it is still confusing to have a spec that is simply ignored. This was the main thing that moved the needle towards a separate kind for me even if that solution requires much more effort.

There is a conceptual race there about whether the desire is spec or the annotation when both exist. I’d personally approach from another angle as well which is the fact that we set Synced to False only if the reconciliation doesn’t succeed and in the reconcilers we made the decision that annotation is the final desired state if given (we return if we see it). From that perspective, I think we already made a decision in the reconciler about which field has the final say so I think setting Synced to True would be inline with that. So I suggest we set it True.

@negz
Copy link
Member
negz commented Jan 24, 2023

Intuitively Synced=False seems correct to me. A resource that is paused may not not have its desired state synced, and a resource that is experiencing a reconcile error may not have its desired state synced. They're two different reasons that a resource may not be in sync.

there were some concerns around how the paused state of a managed resource is communicated.

Could you state explicitly what the concerns are?

@negz
Copy link
Member
negz commented Jan 24, 2023

I'm not sure how much we can really change at this point - I would consider status conditions as part of a resource's API and thus changing the status conditions (with perhaps the exception of the human readable message) to be a breaking API change.

That said, I think observedGeneration would probably do a better job of communicating what the synced status currently tries to communicate. From https://github.com/kubernetes-sigs/cli-utils/blob/master/pkg/kstatus/README.md#conditions:

If the controller is not running, or for some reason not able to update the resource, it will look like it is in a good state when that is not true. The solution to this issue is to adopt the pattern used by several of the built-in types where there is an observedGeneration property on the status object which is set by the controller during the reconcile loop. If the generation and the observedGeneration of a resource does not match, it means there are changes that the controller has not yet seen, and therefore not acted upon.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

3 participants