Kubernetes Image Policy Webhook Explained
How to create and deploy a Kubernetes image policy webhook.
Introduction
In this article, we will explore how webhooks work in Kubernetes and, more specifically, how the ImagePolicyWebhook works. The Kubernetes documentation about it is kind of vague, since there is no real example or implementation that you can get out of it, so here we will break it down into the different alternatives. In a real-world scenario, I would prefer to rely on OPA Gatekeeper, but I’m planning to make this trip worth it by adding a database and making the webhook allow or disallow images based on a vulnerability scan (for example, allow only medium or lower vulnerabilities in your containers), but that will be a post for another day. If you are interested, you can help in this repo. For more information in general, see here.
There are two ways to make this work, and each one has a slightly different behavior. One way is using the ImagePolicyWebhook admission controller and the other is using dynamic admission webhooks. Either a validating or a mutating webhook works, but here I used the validating one (ValidatingAdmissionWebhook). You can learn more here.
This admission controller will reject all the pods that are using images with the latest tag and, in the future, we will see how to reject pods that do not meet the required security levels.
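To make the "latest tag" rule concrete, here is a minimal sketch of such a check in Go. The function name usesLatestTag is hypothetical (the project has its own rules package), and the parsing is deliberately simplified: it assumes that an image without a tag resolves to latest and that an image pinned by digest does not.

```go
package main

import (
	"fmt"
	"strings"
)

// usesLatestTag reports whether an image reference resolves to the "latest" tag.
// Hypothetical helper for illustration; not a full image reference parser.
func usesLatestTag(image string) bool {
	// A digest pins the image content, so it is not treated as "latest" here.
	if strings.Contains(image, "@") {
		return false
	}
	// Only inspect the last path component, so a registry port
	// (for example localhost:5000/nginx) is not mistaken for a tag.
	name := image
	if i := strings.LastIndex(image, "/"); i >= 0 {
		name = image[i+1:]
	}
	i := strings.LastIndex(name, ":")
	if i < 0 {
		return true // no tag: the runtime pulls :latest
	}
	return name[i+1:] == "latest"
}

func main() {
	images := []string{"nginx", "nginx:latest", "nginx:1.13.8", "localhost:5000/nginx"}
	for _, img := range images {
		fmt.Printf("%-22s latest=%v\n", img, usesLatestTag(img))
	}
}
```

Keeping the rule in a small pure function like this makes it easy to unit test separately from the HTTP handling.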
Comparison
The ImagePolicyWebhook is an admission controller that evaluates only images. You need to parse the requests, apply your logic, and build the response in order to allow or deny images in the cluster.
The good parts of the ImagePolicyWebhook:
- The API server can be instructed to reject the images if the webhook endpoint is not reachable. This is quite handy, but it can also bring issues; for example, core pods won’t be able to start.
The bad parts of the ImagePolicyWebhook:
- The configuration is a bit more involved and requires access to the master nodes or to the API server configuration. The documentation is not clear, and it can be hard to make changes, update, etc.
- The deployment is not trivial. You need to deploy it with systemd or run it as a Docker container in the host, update the DNS, etc.
On the other hand, the ValidatingAdmissionWebhook can be used for way more things than just images (if you use the mutating one, well, you can inject or change things on the fly).
The good parts of the ValidatingAdmissionWebhook:
- Easier deployment since the service runs as a pod.
- Everything can be a Kubernetes resource.
- Less manual intervention, and access to the master is not required.
- If the pod or service is unavailable and failurePolicy is set to Ignore, all images are going to be allowed, which can be a security risk in some cases. If you are going down this path, be sure to make the webhook highly available, or set failurePolicy to Fail instead of Ignore (Fail is the default in admissionregistration.k8s.io/v1) so that requests are rejected when the webhook cannot be reached.
The bad parts of the ValidatingAdmissionWebhook:
- Anyone with enough RBAC permissions can update or change the configuration, since it’s just another Kubernetes resource.
Building
If you intend to use it as a plain service:
$ go get github.com/kainlite/kube-image-bouncer
You can also use this Docker image:
$ docker pull kainlite/kube-image-bouncer
Certificates
We can rely on the Kubernetes CA to generate the certificate that we need. If you want to learn more, go here:
Create a CSR:
$ cat <<EOF | cfssl genkey - | cfssljson -bare server
{
  "hosts": [
    "image-bouncer-webhook.default.svc",
    "image-bouncer-webhook.default.svc.cluster.local",
    "image-bouncer-webhook.default.pod.cluster.local",
    "192.0.2.24",
    "10.0.34.2"
  ],
  "CN": "system:node:image-bouncer-webhook.default.pod.cluster.local",
  "key": {
    "algo": "ecdsa",
    "size": 256
  },
  "names": [
    {
      "O": "system:nodes"
    }
  ]
}
EOF
Then apply it to the cluster:
$ cat <<EOF | kubectl apply -f -
apiVersion: certificates.k8s.io/v1
kind: CertificateSigningRequest
metadata:
  name: image-bouncer-webhook.default
spec:
  request: $(cat server.csr | base64 | tr -d '\n')
  signerName: kubernetes.io/kubelet-serving
  usages:
    - digital signature
    - key encipherment
    - server auth
EOF
Approve and get your certificate for later use:
$ kubectl certificate approve image-bouncer-webhook.default
$ kubectl get csr image-bouncer-webhook.default -o jsonpath='{.status.certificate}' | base64 --decode > server.crt
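Since the API server will only trust the webhook if the certificate matches the name it connects to, it can be worth checking the issued certificate before moving on. The following is a minimal sketch using only the Go standard library; certCoversHost and demoCert are hypothetical names, and demoCert generates a throwaway self-signed certificate only so the example runs without a cluster. In practice you would read server.crt from disk instead.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"encoding/pem"
	"errors"
	"fmt"
	"math/big"
	"time"
)

// certCoversHost parses a PEM certificate and checks that it is valid for host.
func certCoversHost(certPEM []byte, host string) error {
	block, _ := pem.Decode(certPEM)
	if block == nil || block.Type != "CERTIFICATE" {
		return errors.New("no PEM certificate found")
	}
	cert, err := x509.ParseCertificate(block.Bytes)
	if err != nil {
		return err
	}
	return cert.VerifyHostname(host)
}

// demoCert creates a short-lived self-signed certificate for the given DNS
// names. It stands in for server.crt so this sketch is self-contained.
func demoCert(dnsNames ...string) ([]byte, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber: big.NewInt(1),
		Subject:      pkix.Name{CommonName: "demo"},
		NotBefore:    time.Now().Add(-time.Minute),
		NotAfter:     time.Now().Add(time.Hour),
		DNSNames:     dnsNames,
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return nil, err
	}
	return pem.EncodeToMemory(&pem.Block{Type: "CERTIFICATE", Bytes: der}), nil
}

func main() {
	certPEM, err := demoCert("image-bouncer-webhook.default.svc")
	if err != nil {
		panic(err)
	}
	fmt.Println("matching name:", certCoversHost(certPEM, "image-bouncer-webhook.default.svc"))
	fmt.Println("other name:   ", certCoversHost(certPEM, "other.default.svc"))
}
```

A nil error means the name is covered by the certificate; any other value explains the mismatch, which is usually faster to diagnose here than from a TLS handshake error in the API server logs.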
ImagePolicyWebhook Path
There are two possible ways to deploy this controller (webhook). For this path to work, you will need the certificates created in the previous section, but first, we need to take care of other details. Add this to your hosts file in the master (or wherever the bouncer will run).
We use this name because it has to match with the names from the certificate. Since this will run outside Kubernetes, and it could even be externally available, we just fake it with a hosts entry.
$ echo "127.0.0.1 image-bouncer-webhook.default.svc" >> /etc/hosts
You also need to add these flags to the API server:
--admission-control-config-file=/etc/kubernetes/kube-image-bouncer/admission_configuration.json --enable-admission-plugins=ImagePolicyWebhook
If you do this, you don’t need to create the validating-webhook-configuration.yaml resource or apply the Kubernetes deployment to run in the cluster.
Create an admission control configuration file named /etc/kubernetes/kube-image-bouncer/admission_configuration.json with the following contents:
{
  "imagePolicy": {
    "kubeConfigFile": "/etc/kubernetes/kube-image-bouncer/kube-image-bouncer.yml",
    "allowTTL": 50,
    "denyTTL": 50,
    "retryBackoff": 500,
    "defaultAllow": false
  }
}
Adjust the defaults if you want to allow images by default.
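To illustrate what defaultAllow changes, here is a small sketch in Go that loads the configuration above and models the outcome of a review. The types and the decide function are hypothetical and only approximate the API server's behavior as I understand it: when the backend cannot be reached, the verdict falls back to defaultAllow; otherwise the backend's answer is used.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// imagePolicyConfig mirrors the fields of the configuration file shown above.
// Per the Kubernetes docs, the TTLs are in seconds and the backoff in milliseconds.
type imagePolicyConfig struct {
	KubeConfigFile string `json:"kubeConfigFile"`
	AllowTTL       int    `json:"allowTTL"`
	DenyTTL        int    `json:"denyTTL"`
	RetryBackoff   int    `json:"retryBackoff"`
	DefaultAllow   bool   `json:"defaultAllow"`
}

type admissionConfig struct {
	ImagePolicy imagePolicyConfig `json:"imagePolicy"`
}

// errBackendDown simulates a webhook that cannot be reached.
var errBackendDown = errors.New("webhook backend unreachable")

// parseConfig decodes the admission configuration JSON.
func parseConfig(raw []byte) (imagePolicyConfig, error) {
	var c admissionConfig
	err := json.Unmarshal(raw, &c)
	return c.ImagePolicy, err
}

// decide is a simplified model: on a backend error the result is DefaultAllow,
// otherwise it is whatever the backend answered.
func decide(cfg imagePolicyConfig, backendErr error, backendAllowed bool) bool {
	if backendErr != nil {
		return cfg.DefaultAllow
	}
	return backendAllowed
}

func main() {
	raw := []byte(`{"imagePolicy": {"allowTTL": 50, "denyTTL": 50, "retryBackoff": 500, "defaultAllow": false}}`)
	cfg, err := parseConfig(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println("backend down, allowed:", decide(cfg, errBackendDown, false))
	fmt.Println("backend up, allowed:  ", decide(cfg, nil, true))
}
```

With defaultAllow set to false, an unreachable bouncer blocks every new pod, including system ones, which is the trade-off mentioned in the comparison.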
Create a kubeconfig file /etc/kubernetes/kube-image-bouncer/kube-image-bouncer.yml with the following contents:
apiVersion: v1
kind: Config
clusters:
- cluster:
    certificate-authority: /etc/kubernetes/kube-image-bouncer/pki/server.crt
    server: https://image-bouncer-webhook.default.svc:1323/image_policy
  name: bouncer_webhook
contexts:
- context:
    cluster: bouncer_webhook
    user: api-server
  name: bouncer_validator
current-context: bouncer_validator
preferences: {}
users:
- name: api-server
  user:
    client-certificate: /etc/kubernetes/pki/apiserver.crt
    client-key: /etc/kubernetes/pki/apiserver.key
This configuration file instructs the API server to reach the webhook server at https://image-bouncer-webhook.default.svc:1323 and use its /image_policy endpoint. We're reusing the certificates from the API server and the one for kube-image-bouncer that we already generated.
Then run the bouncer. Be aware that you need to be in the folder with the certs for this to work:
$ docker run --rm -v `pwd`/server-key.pem:/certs/server-key.pem:ro \
  -v `pwd`/server.crt:/certs/server.crt:ro -p 1323:1323 \
  --network host kainlite/kube-image-bouncer \
  -k /certs/server-key.pem -c /certs/server.crt
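On this path, the API server posts an ImageReview object (imagepolicy.k8s.io/v1alpha1) to the /image_policy endpoint and expects the same object back with status.allowed filled in. The following is a minimal sketch of that exchange using only the Go standard library instead of the echo framework the project uses. The struct and function names are hypothetical, only the fields needed here are modeled, and the latest-tag check is simplified.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// Minimal hand-written subset of the ImageReview object.
type container struct {
	Image string `json:"image"`
}

type imageReviewSpec struct {
	Containers []container `json:"containers"`
	Namespace  string      `json:"namespace,omitempty"`
}

type imageReviewStatus struct {
	Allowed bool   `json:"allowed"`
	Reason  string `json:"reason,omitempty"`
}

type imageReview struct {
	APIVersion string            `json:"apiVersion"`
	Kind       string            `json:"kind"`
	Spec       imageReviewSpec   `json:"spec"`
	Status     imageReviewStatus `json:"status"`
}

// imagePolicyHandler decodes the review, denies images that use the latest
// tag (explicitly or by omitting a tag), and echoes the review with a status.
func imagePolicyHandler(w http.ResponseWriter, r *http.Request) {
	var review imageReview
	if err := json.NewDecoder(r.Body).Decode(&review); err != nil {
		http.Error(w, "invalid ImageReview", http.StatusBadRequest)
		return
	}
	review.Status.Allowed = true
	for _, c := range review.Spec.Containers {
		name := c.Image[strings.LastIndex(c.Image, "/")+1:]
		if strings.HasSuffix(name, ":latest") || !strings.ContainsAny(name, ":@") {
			review.Status.Allowed = false
			review.Status.Reason = "Images using latest tag are not allowed"
			break
		}
	}
	w.Header().Set("Content-Type", "application/json")
	json.NewEncoder(w).Encode(review)
}

// postReview sends a review for a single image to the handler in memory
// and returns the verdict and the reason.
func postReview(image string) (bool, string) {
	body := fmt.Sprintf(`{"apiVersion":"imagepolicy.k8s.io/v1alpha1","kind":"ImageReview","spec":{"containers":[{"image":%q}]}}`, image)
	req := httptest.NewRequest(http.MethodPost, "/image_policy", strings.NewReader(body))
	rec := httptest.NewRecorder()
	imagePolicyHandler(rec, req)
	var out imageReview
	_ = json.NewDecoder(rec.Body).Decode(&out)
	return out.Status.Allowed, out.Status.Reason
}

func main() {
	for _, img := range []string{"nginx:1.13.8", "nginx"} {
		allowed, reason := postReview(img)
		fmt.Printf("%s allowed=%v reason=%q\n", img, allowed, reason)
	}
}
```

Note that the handler always answers with HTTP 200 and carries the decision in the body, which matches the status=200 lines you will see in the bouncer logs even for rejected images.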
ValidatingAdmissionWebhook Path
If you are doing this, all you need to do is generate the certificates. Everything else can be done with kubectl. First of all, you have to create a TLS secret holding the webhook certificate and key (we just generated this in the previous step):
$ kubectl create secret tls tls-image-bouncer-webhook \
  --key server-key.pem \
  --cert server.crt
Then create a Kubernetes deployment for the image-bouncer-webhook:
$ kubectl apply -f kubernetes/image-bouncer-webhook.yaml
Finally, create a ValidatingWebhookConfiguration that makes use of our webhook endpoint. You can use this, but be sure to update the caBundle with the server.crt content in base64:
$ kubectl apply -f kubernetes/validating-webhook-configuration.yaml
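If you prefer to produce the caBundle value programmatically, the sketch below shows the idea in Go: the field expects the PEM file content encoded as a single base64 string with no line breaks. The function name caBundleFromPEM and the samplePEM placeholder are hypothetical; with a real cluster you would pass the bytes of server.crt.

```go
package main

import (
	"encoding/base64"
	"encoding/pem"
	"errors"
	"fmt"
)

// samplePEM is a placeholder with a dummy payload, used only so the example
// runs on its own. It is not a real certificate.
const samplePEM = "-----BEGIN CERTIFICATE-----\nAQID\n-----END CERTIFICATE-----\n"

// caBundleFromPEM checks that the input looks like PEM and returns it as a
// single-line base64 string, suitable for the caBundle field.
func caBundleFromPEM(certPEM []byte) (string, error) {
	if block, _ := pem.Decode(certPEM); block == nil {
		return "", errors.New("input is not PEM encoded")
	}
	return base64.StdEncoding.EncodeToString(certPEM), nil
}

func main() {
	bundle, err := caBundleFromPEM([]byte(samplePEM))
	if err != nil {
		panic(err)
	}
	fmt.Println("caBundle:", bundle)
}
```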
Or you can simply generate the validating-webhook-configuration.yaml file like this and apply it in one go:
$ cat <<EOF | kubectl apply -f -
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingWebhookConfiguration
metadata:
  name: image-bouncer-webook
webhooks:
  - name: image-bouncer-webhook.default.svc
    rules:
      - apiGroups:
          - ""
        apiVersions:
          - v1
        operations:
          - CREATE
        resources:
          - pods
    failurePolicy: Ignore
    sideEffects: None
    admissionReviewVersions: ["v1", "v1beta1"]
    clientConfig:
      caBundle: $(kubectl get csr image-bouncer-webhook.default -o jsonpath='{.status.certificate}')
      service:
        name: image-bouncer-webhook
        namespace: default
EOF
This could be easily automated (Helm chart coming soon...). Changes can take a bit to reflect, so wait a few seconds and give it a try.
Testing
Both paths should work the same way, and you will see a similar error message:
Error creating: pods "nginx-latest-sdsmb" is forbidden: image policy webhook backend denied one or more images: Images using latest tag are not allowed
or
Warning FailedCreate 23s (x15 over 43s) replication-controller Error creating: admission webhook "image-bouncer-webhook.default.svc" denied the request: Images using latest tag are not allowed
Create an nginx-versioned RC to validate that the versioned releases still work:
$ cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: ReplicationController
metadata:
  name: nginx-versioned
spec:
  replicas: 1
  selector:
    app: nginx-versioned
  template:
    metadata:
      name: nginx-versioned
      labels:
        app: nginx-versioned
    spec:
      containers:
        - name: nginx-versioned
          image: nginx:1.13.8
          ports:
            - containerPort: 80
EOF
Check that the replication controller is actually running:
$ kubectl get rc
NAME              DESIRED   CURRENT   READY   AGE
nginx-versioned   1         1         0       2h
Now create one for nginx-latest to validate that our controller/webhook can actually reject pods with images using the latest tag:
$ cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: ReplicationController
metadata:
  name: nginx-latest
spec:
  replicas: 1
  selector:
    app: nginx-latest
  template:
    metadata:
      name: nginx-latest
      labels:
        app: nginx-latest
    spec:
      containers:
        - name: nginx-latest
          image: nginx
          ports:
            - containerPort: 80
EOF
If we check the pod, it should not be created and the RC should show something similar to the following output. You can also check with kubectl get events --sort-by='{.lastTimestamp}':
$ kubectl describe rc nginx-latest
Name:         nginx-latest
Namespace:    default
Selector:     app=nginx-latest
Labels:       app=nginx-latest
Annotations:  <none>
Replicas:     0 current / 1 desired
Pods Status:  0 Running / 0 Waiting / 0 Succeeded / 0 Failed
Pod Template:
  Labels:  app=nginx-latest
  Containers:
   nginx-latest:
    Image:        nginx
    Port:         80/TCP
    Host Port:    0/TCP
    Environment:  <none>
    Mounts:       <none>
  Volumes:        <none>
Conditions:
  Type             Status  Reason
  ----             ------  ------
  ReplicaFailure   True    FailedCreate
Events:
  Type     Reason        Age                 From                    Message
  ----     ------        ----                ----                    -------
  Warning  FailedCreate  23s (x15 over 43s)  replication-controller  Error creating: admission webhook "image-bouncer-webhook.default.svc" denied the request: Images using latest tag are not allowed
Debugging
It’s always useful to check the API server logs if you are using the admission controller path, since they record why a request failed, and also the logs from the image-bouncer. For example, from the API server:
W0107 17:39:00.619560 1 dispatcher.go:142] rejected by webhook "image-bouncer-webhook.default.svc": &errors.StatusError{ErrStatus:v1.Status{TypeMeta:v1.TypeMeta{Kind:"", APIVersion:""}, ListMeta:v1.ListMeta{ SelfLink:"", ResourceVersion:"", Continue:"", RemainingItemCount:(*int64)(nil)}, Status:"Failure", Message:"admission webhook \"image-bouncer-webhook.default.svc\" denied the request: Images using latest tag are not allowed", Reason:"", Details:(*v1.StatusDetails)(nil), Code:400}}
kube-image-bouncer:
echo: http: TLS handshake error from 127.0.0.1:49414: remote error: tls: bad certificate
method=POST, uri=/image_policy?timeout=30s, status=200
method=POST, uri=/image_policy?timeout=30s, status=200
method=POST, uri=/image_policy?timeout=30s, status=200
The error is from a manual test; the others are successful requests from the API server.
The Code Itself
Let's take a really brief look at the critical parts of creating an admission controller or webhook.
This is a section of main.go. As we can see, we are handling two POST paths with different handlers, plus some other validations. What we need to know is that we will receive a POST request with a JSON payload that we need to convert to an admission review request.
app.Action = func(c *cli.Context) error {
	e := echo.New()
	// One route per path: ImagePolicyWebhook and ValidatingAdmissionWebhook.
	e.POST("/image_policy", handlers.PostImagePolicy())
	e.POST("/", handlers.PostValidatingAdmission())

	e.Use(middleware.LoggerWithConfig(middleware.LoggerConfig{
		Format: "method=${method}, uri=${uri}, status=${status}\n",
	}))

	if debug {
		e.Logger.SetLevel(log.DEBUG)
	}

	if whitelist != "" {
		handlers.RegistryWhitelist = strings.Split(whitelist, ",")
		fmt.Printf(
			"Accepting only images from these registries: %+v\n",
			handlers.RegistryWhitelist)
		fmt.Println("WARN: this feature is implemented only by the ValidatingAdmissionWebhook code")
	} else {
		fmt.Println("WARN: accepting images from ALL registries")
	}

	// Serve over TLS when both a certificate and a key were provided.
	var err error
	if cert != "" && key != "" {
		err = e.StartTLS(fmt.Sprintf(":%d", port), cert, key)
	} else {
		err = e.Start(fmt.Sprintf(":%d", port))
	}

	if err != nil {
		return cli.NewExitError(err, 1)
	}

	return nil
}

app.Run(os.Args)
This is a section from handlers/validating_admission.go. Basically, it parses the request and validates whether the image should be allowed or not, and then it sends an AdmissionResponse back with the Allowed flag set to true or false. If you want to learn more about the different types used here, you can explore the v1beta1.Admission documentation:
func PostValidatingAdmission() echo.HandlerFunc {
	return func(c echo.Context) error {
		var admissionReview v1beta1.AdmissionReview

		err := c.Bind(&admissionReview)
		if err != nil {
			c.Logger().Errorf("Something went wrong while unmarshalling admission review: %+v", err)
			return c.JSON(http.StatusBadRequest, err)
		}
		c.Logger().Debugf("admission review: %+v", admissionReview)

		// The pod being created travels as raw JSON inside the request.
		pod := v1.Pod{}
		if err := json.Unmarshal(admissionReview.Request.Object.Raw, &pod); err != nil {
			c.Logger().Errorf("Something went wrong while unmarshalling pod object: %+v", err)
			return c.JSON(http.StatusBadRequest, err)
		}
		c.Logger().Debugf("pod: %+v", pod)

		// Allow by default; the checks below flip the flag when a rule fails.
		admissionReview.Response = &v1beta1.AdmissionResponse{
			Allowed: true,
			UID:     admissionReview.Request.UID,
		}
		images := []string{}

		for _, container := range pod.Spec.Containers {
			images = append(images, container.Image)
			usingLatest, err := rules.IsUsingLatestTag(container.Image)
			if err != nil {
				c.Logger().Errorf("Error while parsing image name: %+v", err)
				return c.JSON(http.StatusInternalServerError, "error while parsing image name")
			}
			if usingLatest {
				admissionReview.Response.Allowed = false
				admissionReview.Response.Result = &metav1.Status{
					Message: "Images using latest tag are not allowed",
				}
				break
			}

			if len(RegistryWhitelist) > 0 {
				validRegistry, err := rules.IsFromWhiteListedRegistry(
					container.Image,
					RegistryWhitelist)
				if err != nil {
					c.Logger().Errorf("Error while looking for image registry: %+v", err)
					return c.JSON(
						http.StatusInternalServerError,
						"error while looking for image registry")
				}
				if !validRegistry {
					admissionReview.Response.Allowed = false
					admissionReview.Response.Result = &metav1.Status{
						Message: "Images from a non whitelisted registry",
					}
					break
				}
			}
		}

		if admissionReview.Response.Allowed {
			c.Logger().Debugf("All images accepted: %v", images)
		} else {
			c.Logger().Infof("Rejected images: %v", images)
		}

		c.Logger().Debugf("admission response: %+v", admissionReview.Response)

		return c.JSON(http.StatusOK, admissionReview)
	}
}
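The handler above delegates the registry check to rules.IsFromWhiteListedRegistry, which is not shown here. As a rough idea of what such a check involves, here is a minimal sketch in Go. The names registryOf and fromAllowedRegistry are hypothetical, and the logic assumes the usual Docker convention: the first path component is a registry only if it contains a dot or a colon or is localhost, otherwise Docker Hub (docker.io) is implied. The project's actual implementation may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// registryOf returns the registry host of an image reference, assuming the
// common Docker naming convention described above.
func registryOf(image string) string {
	i := strings.Index(image, "/")
	if i < 0 {
		return "docker.io" // e.g. "nginx"
	}
	first := image[:i]
	if strings.ContainsAny(first, ".:") || first == "localhost" {
		return first // e.g. "quay.io" or "localhost:5000"
	}
	return "docker.io" // e.g. "kainlite/kube-image-bouncer"
}

// fromAllowedRegistry reports whether the image's registry is in the list.
func fromAllowedRegistry(image string, allowed []string) bool {
	reg := registryOf(image)
	for _, a := range allowed {
		if reg == a {
			return true
		}
	}
	return false
}

func main() {
	allowed := []string{"quay.io"}
	for _, img := range []string{"quay.io/foo/bar:1.0", "nginx:1.13.8"} {
		fmt.Printf("%s registry=%s allowed=%v\n", img, registryOf(img), fromAllowedRegistry(img, allowed))
	}
}
```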
Everything is in this repo.
Closing Words
This example and the original post were done here, so thank you Flavio Castelli for creating such a great example. My changes are mostly about explaining how it works and the required changes for it to work in the latest Kubernetes release (at the moment v1.20.0), as I was learning to use it and to create my own.
The readme file in the project might not match this article but both should work. I haven't updated the entire readme yet.
Errata
If you spot an error or have any suggestions, please send me a message so it gets fixed.
Also, you can check the source code and changes in the generated code and the sources here.
Published at DZone with permission of Gabriel Garrido. See the original article here.