[go: nahoru, domu]

Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement the Gateway API support to configurate an ACME issuer #36

Merged
merged 2 commits into from
Sep 7, 2022

Conversation

lonelyCZ
Copy link
Contributor

Signed-off-by: lonelyCZ 531187475@qq.com

Fixes #33


# The following parameters must be used in combination
# 1. issuer_type is "SelfSigned" and envoy.service.type is "NodePort", which can be tested in a local kind environment.
# 2. issuer_type is "ACME" and envoy.service.type is "LoadBalancer", which can be tested in a cloud environment such as GKE.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now it is only tested, if there will not problem, I will do other works, like update readme.

@lonelyCZ
Copy link
Contributor Author

Hi @irbekrm, could you please help me to test this pr in GKE environment.

/assign @irbekrm

Copy link
Contributor
@irbekrm irbekrm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working on this @lonelyCZ !

I've tried the module, but am getting error during ACME issuer creation. I can reproduce this on kind cluster as well

│ Error: Plugin did not respond
│ 
│   with kubernetes_manifest.gateway-issuer-acme[0],
│   on main.tf line 121, in resource "kubernetes_manifest" "gateway-issuer-acme":
│  121: resource "kubernetes_manifest" "gateway-issuer-acme" {
│ 
│ The plugin encountered an error, and failed to respond to the
│ plugin.(*GRPCProvider).ApplyResourceChange call. The plugin logs may
│ contain more details.
╵

Stack trace from the terraform-provider-kubernetes_v2.12.1_x5 plugin:

panic: interface conversion: interface {} is nil, not map[string]interface {}
...

(As a sidenote, we probably want to add a dependency on gateway API install to cert-manager install module as the cert-manager pod crashes if the Gateway feature is enabled, but it does not find the Gateway CRDs)

@@ -13,3 +13,11 @@ variable "issuer_name" {
default = "gateway-issuer"
}

variable "issuer_type" {
type = string
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here we should validate that it is one of the accepted types (similar to what you did for vault auth mechanisms https://github.com/cert-manager/testing-addons/blob/master/cm-vault/vault-config/variables.tf#L10

resource "kubernetes_manifest" "selfsigned-issuer" {
count = var.issuer_type == "SelfSigned" ? 1 : 0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this issuer type should be called something like SelfSignedCA as it is a CA issuer that is actually used to sign the Gateway certs not a self-signed one

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agree!

cm-gateway/common.tfvars Show resolved Hide resolved
@irbekrm irbekrm assigned lonelyCZ and unassigned irbekrm Aug 24, 2022
@lonelyCZ
Copy link
Contributor Author
│ Error: Plugin did not respond
│ 
│   with kubernetes_manifest.gateway-issuer-acme[0],
│   on main.tf line 121, in resource "kubernetes_manifest" "gateway-issuer-acme":
│  121: resource "kubernetes_manifest" "gateway-issuer-acme" {
│ 
│ The plugin encountered an error, and failed to respond to the
│ plugin.(*GRPCProvider).ApplyResourceChange call. The plugin logs may
│ contain more details.
╵

Stack trace from the terraform-provider-kubernetes_v2.12.1_x5 plugin:

panic: interface conversion: interface {} is nil, not map[string]interface {}
...

I can reproduce this on kind cluster as well

I guess this is error coming from Kubernetes Provider, because it can run in my kind cluster environment.

The plugin encountered an error, and failed to respond to the plugin.(*GRPCProvider).ApplyResourceChange call

It seems that the previous resources haven't removed completely. Could you please help to retest it in your env.

But now there are some problems certainly, for example, if there are some errors occurring when executing terragrunt run-all destroy, we also can't run it secondly, I will open an issue to trace this problem and improve its stability.

As a sidenote, we probably want to add a dependency on gateway API install to cert-manager install module as the cert-manager pod crashes if the Gateway feature is enabled, but it does not find the Gateway CRDs.

From the Dependencies Graph, currently, the cert-manager should be deployed after gateway-api installed. Could you describe your encountered problems?

@lonelyCZ
Copy link
Contributor Author
lonelyCZ commented Aug 24, 2022

I also reproduced this error in my kind cluster.

Error: Plugin did not respond
│
│   with kubernetes_manifest.gateway-issuer-acme[0],
│   on main.tf line 121, in resource "kubernetes_manifest" "gateway-issuer-acme":
│  121: resource "kubernetes_manifest" "gateway-issuer-acme" {

I found the gateway-issuer was not ready, I guess the above error is caused by it.

[root@master68 cm-gateway]# kubectl get issuer -A
NAMESPACE      NAME             READY   AGE
gateway-demo   gateway-issuer   False   97m
status:
  acme: {}
  conditions:
  - lastTransitionTime: "2022-08-24T12:21:44Z"
    message: 'Failed to register ACME account: 400 urn:ietf:params:acme:error:invalidEmail:
      Error creating new account :: invalid contact domain. Contact emails @example.com
      are forbidden'
    observedGeneration: 1
    reason: ErrRegisterACMEAccount
    status: "False"
    type: Ready

@lonelyCZ lonelyCZ force-pushed the pr-issuer-acme branch 2 times, most recently from fec73f8 to 8322d6a Compare August 24, 2022 14:22
}
"spec" = {
"acme" = {
"email" : var.acme_email
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess the problem is the configuration of acme, could you please help me to deploy it alone to debug it?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahh yes you're right there is a problem with the example email

status:
  acme: {}
  conditions:
  - lastTransitionTime: "2022-08-24T15:12:58Z"
    message: 'Failed to register ACME account: 400 urn:ietf:params:acme:error:invalidEmail:
      Error creating new account :: invalid contact domain. Contact emails @example.com
      are forbidden'
    observedGeneration: 1
    reason: ErrRegisterACMEAccount
    status: "False"
    type: Read

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

... however I see that even with the correct email when the issuer does become ready it nevertheless crashes.

We should investigate for a bit more, if it doesn't help, we can try to temporarily remove the wait condition.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see that even with the correct email when the issuer does become ready it nevertheless crashes.

Is it the reason that is timeout? The current creation time is only 1 minute.

@lonelyCZ lonelyCZ removed their assignment Aug 29, 2022
@lonelyCZ
Copy link
Contributor Author

Hi, @irbekrm does it work if remove the wait condition? Or is it the reason that is timeout? The current creation time is only 1 minute.

/assign @irbekrm

@irbekrm
Copy link
Contributor
irbekrm commented Aug 31, 2022

Hi, @irbekrm does it work if remove the wait condition?

Hi @lonelyCZ yes, if the wait condition is removed the ACME issuer gets created correctly 👍🏼
I think it is a problem with how the conditions get parsed in the upstream terraform module. Perhaps we can investigate and try to open an issue upstream later, but for now we could just remove the wait condition.

I now see another error where the ACME challenge does not get solved correctly, looking into it..

@irbekrm
Copy link
Contributor
irbekrm commented Aug 31, 2022

I now see another error where the ACME challenge does not get solved correctly, looking into it..

I think we need another Gateway that would forward plain HTTP traffic from all routes and that cert-manager could use to attach the HTTPRoute it creates for the challenge. At the moment we configure the issuer to attach the challenge route to the same gateway that we are issuing the cert for, but this Gateway will only become Ready when the cert is issued, so it cannnot be used to route the traffik to the challenge pod. So, we need a plain HTTP Gateway like this as well

Let me know if that makes sense

@lonelyCZ
Copy link
Contributor Author

I think we need another Gateway that would forward plain HTTP traffic from all routes and that cert-manager could use to attach the HTTPRoute it creates for the challenge. At the moment we configure the issuer to attach the challenge route to the same gateway that we are issuing the cert for, but this Gateway will only become Ready when the cert is issued, so it cannnot be used to route the traffik to the challenge pod. So, we need a plain HTTP Gateway like this as well

Thanks for your investigations, I will do it as your suggestions today.

@irbekrm
Copy link
Contributor
irbekrm commented Aug 31, 2022

Thanks for your investigations, I will do it as your suggestions today.

Thank you @lonelyCZ 😄

@irbekrm irbekrm assigned lonelyCZ and unassigned irbekrm Aug 31, 2022
@lonelyCZ lonelyCZ removed their assignment Aug 31, 2022
@lonelyCZ
Copy link
Contributor Author

Please take a look. The dependencies graph also has been updated.

/assign @irbekrm

@irbekrm
Copy link
Contributor
irbekrm commented Sep 2, 2022

@lonelyCZ I think we still need to remove that wait condition from the issuer #36 (comment)

@lonelyCZ
Copy link
Contributor Author
lonelyCZ commented Sep 2, 2022

I think we still need to remove that wait condition from the issuer #36 (comment)

Ok, any other problems? If no, I am going to update the readme file.

/hold

@irbekrm
Copy link
Contributor
irbekrm commented Sep 2, 2022

Ok, any other problems?

I think after that this should be good 👍🏼

@irbekrm irbekrm assigned lonelyCZ and unassigned irbekrm Sep 2, 2022
@lonelyCZ lonelyCZ removed their assignment Sep 3, 2022
@lonelyCZ
Copy link
Contributor Author
lonelyCZ commented Sep 3, 2022

/unhold

/assign @irbekrm

Signed-off-by: lonelyCZ <531187475@qq.com>
Signed-off-by: lonelyCZ <531187475@qq.com>
@irbekrm
Copy link
Contributor
irbekrm commented Sep 7, 2022

Thanks @lonelyCZ!

/lgtm
/approve

@jetstack-bot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: irbekrm, lonelyCZ

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@jetstack-bot jetstack-bot merged commit 519ae43 into cert-manager:master Sep 7, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Make the Gateway API support to configurate an ACME issuer
3 participants