Hack your Kubernetes controller in Bash in 10 minutes!

Maël Valais

Antoine Le Squéren

Code and live slides!

Maël Valais, Software Engineer

Antoine Le Squéren, DevOps Engineer

A story about Vault, external-secrets, slow skaffold run, and how a one-liner Bash controller did the trick.

"I maintain the cert-manager project. I aim to build the best Let's Encrypt experience on Kubernetes."

"I improve the developer experience at OneStock by providing an efficient development environment."

OneStock

website

(3) order

(1) visit

(2) fetch stock

warehouse

(4) deliver

t-shirt in stock

OneStock

website

(3) order

(1) visit

warehouse

store

(4) deliver

t-shirt not in stock

t-shirt in stock

(2) fetch stock

Secrets at OneStock

60 clients

=> 300 secrets

Internal secrets

  • OneStock API
  • PostgreSQL
  • MongoDB
  • Elasticsearch
  • ...

10 secrets

3 environments

(dev, staging, prod)

External secrets:

  • Stock API
  • SFTP
  • Mailjet
  • Google Maps
  • ...

5 secrets

Back in the old days

Prod

deploy

ssh

docker-compose.yaml

secrets.env

bastion

Swarm

Staging (on 40 dev laptops)

secrets.env

docker daemon

deploy

docker-compose.yaml

🔥

(2) skaffold run

Dev1 laptop

A new dev environment

OVH managed cloud

Secret

Deployment

dev1-ns

Secret

Deployment

dev2-ns

skaffold.yaml

helm manifests

secrets.env

(1) load

(2) skaffold run

Dev laptop

A password leak

Kubernetes

Secret

Deployment

dev1-ns

Secret

Deployment

dev2-ns

skaffold.yaml

helm manifests

secrets.env

(1) load

leak!

Vault

Vault

Dev laptop

skaffold.yaml

helm manifests

(1) vault kv get

(2) skaffold run

Secret

Deployment

dev1-ns

/secrets/prod/postgres

/secrets/dev/postgres

Kubernetes

(1) skaffold run

Secret

"postgres"

(2) fetch

(3) create

External secrets

Kubernetes

ExternalSecret
"postgres"

/secrets/prod/postgres

/secrets/dev/postgres

Vault

External secrets operator

Developer commands

Self-healing, Consistency, Desired vs. Observed state

edge-triggered action

Kubernetes

Desired state

"replicas=5"

"linux processes=2"

Observed state

level-triggered action

user interaction

user interaction

action

transfer money

consistent state (desired = observed)
(transactional)

observed state

desired state

replicas=5

action

linux processes=2

kubelet creates container

but always consistent

cannot recover from data inconsistencies

no data consistency

but can recover from inconsistencies

SELECT SUM(balance)
FROM accounts;

fact

constraint

Bank

Desired state

"sum of balances is 0"

Observed state

"sum of balances in DB"

=

observed state

desired state

action
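The level-triggered idea from these slides can be sketched in a few lines of Bash. This is an illustration only: `reconcile` is a hypothetical helper, and the replica counts are passed in as plain numbers instead of being read from a cluster. The point is that the action is computed from the current difference between desired and observed state, so a missed event does not matter.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of a level-triggered reconcile step.
# It looks only at the current desired and observed values,
# never at the event that caused the difference.
reconcile() {
  local desired=$1 observed=$2
  if (( observed < desired )); then
    echo "start $(( desired - observed ))"
  elif (( observed > desired )); then
    echo "stop $(( observed - desired ))"
  else
    echo "in sync"
  fi
}

reconcile 5 2   # prints "start 3" (replicas=5, linux processes=2)
reconcile 5 5   # prints "in sync"
```

Running the same function again after a crash or a missed notification gives the same answer, which is what lets this kind of system recover from inconsistencies.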

Self-healing in external-secrets operator

postgres in sync

redis out of sync

postgres in sync

password in Vault matches Secret in Kubernetes

postgres out of sync

password in Vault does not match Secret in Kubernetes

vault get &&
kubectl patch secret
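A rough sketch of the "in sync / out of sync" check, assuming the Vault value arrives as plain text and the Kubernetes Secret value as base64 (which is how Secret data is stored). `is_in_sync` is a hypothetical helper and the password is made up; the real operator performs this comparison in Go, not in Bash.

```bash
#!/usr/bin/env bash
# is_in_sync VAULT_VALUE K8S_SECRET_B64
# Succeeds when the password in Vault matches the base64-encoded
# value stored in the Kubernetes Secret.
is_in_sync() {
  local vault_value=$1 k8s_b64=$2 k8s_value
  k8s_value=$(printf '%s' "$k8s_b64" | base64 -d) || return 2
  [[ "$vault_value" == "$k8s_value" ]]
}

# Example with made-up values: "hunter2" is stored as aHVudGVyMg==.
if is_in_sync "hunter2" "aHVudGVyMg=="; then
  echo "postgres in sync"
else
  echo "postgres out of sync: vault get && kubectl patch secret"
fi
```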

A bad actor

Kubernetes

leak!

Secret

ingress API

dev2-ns

Data

Secret

ingress API

dev1-ns

Data

Vault

/secrets/dev/api

dev1-ns

ingress API

Secret

40 developers

=> 400 random passwords

Secrets at OneStock (continued)

60 clients

Internal secrets

  • OneStock API
  • PostgreSQL
  • MongoDB
  • Elasticsearch
  • ...

10 secrets

External secrets:

  • Stock API
  • SFTP
  • Mailjet
  • Google Maps
  • ...

5 secrets

Slow solution

(0) vault put <randompass>

(1) skaffold run

Secret

"postgres"

(2) fetch

(3) create

Kubernetes

ExternalSecret
"postgres"

/secrets/dev-2/postgres

/secrets/dev-1/postgres

Vault

External secrets operator

Developer commands

Defining a controller: what are the desired and observed states?

Action:

$ vault kv put secret/dev-1/postgres password=random
======= Metadata =======
Key                Value
---                -----
created_time       2022-06-26T15:37:26.01313574Z
custom_metadata    <nil>
deletion_time      n/a
destroyed          false
version            30

"Run vault put password=random"

Desired state:

"No external secret is stuck with 'secret not found' due to a missing secret in Vault."

$ kubectl get externalsecret
NAME       KEY                     PROPERTY   READY   REASON             MESSAGE
redis      secret/dev-1/redis      password   True    SecretSynced       Secret was synced
postgres   secret/dev-1/postgres   password   False   SecretSyncedError  Could not get secret data from provider

This means 'secret not found'

apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
status:
  conditions:
  - type: Ready
    status: "False"
    reason: SecretSyncedError
    message: Secret key was not found

Observed state:

"Run kubectl get externalsecret and look for SecretSyncedError"

kubectl --watch to avoid polling ExternalSecrets

Get alerted as soon as SecretSyncedError appears
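To make the observation step concrete, here is a minimal sketch of a filter applied to a single ExternalSecret object. The JSON is a trimmed-down, hand-written sample modeled on the status shown above, and `failing_ref` is a hypothetical helper name. It assumes `jq` is installed, and it uses `any(...)` so that an object is reported once even if several conditions match.

```bash
#!/usr/bin/env bash
# failing_ref reads ExternalSecret objects (JSON) on stdin and prints
# "KEY PROPERTY" for each object that has a SecretSyncedError condition.
failing_ref() {
  jq -r 'select(any(.status.conditions[]?; .reason == "SecretSyncedError"))
         | .spec.data[0].remoteRef
         | "\(.key) \(.property)"'
}

# Hand-written sample, not real cluster output.
sample='{"spec":{"data":[{"remoteRef":{"key":"secret/dev-1/postgres","property":"password"}}]},
         "status":{"conditions":[{"type":"Ready","status":"False","reason":"SecretSyncedError"}]}}'

printf '%s\n' "$sample" | failing_ref
```

Objects whose conditions do not include `SecretSyncedError` produce no output, so nothing downstream is triggered for healthy ExternalSecrets.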

Writing our one-liner controller 

kubectl get externalsecret --watch -ojson \
  | jq 'select(.status.conditions[]?.reason == "SecretSyncedError")' --unbuffered \
  | jq '.spec.data[0].remoteRef' --unbuffered \
  | jq '"\(.key) \(.property)"' -r --unbuffered \
  | while read -r key property
do
  vault kv put "$key" "$property=somerandomvalue"
done

-ojson: so that we can use jq

--unbuffered: because jq's output is piped into another command

We only need to take action when SecretSyncedError exists

Action

Observe state
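The loop above writes the fixed placeholder `somerandomvalue`. One possible way to produce a different password on each run is sketched below; `random_password` is a hypothetical helper, and reading 16 bytes from `/dev/urandom` is an assumption about what counts as random enough for a dev environment.

```bash
#!/usr/bin/env bash
# random_password prints 32 hexadecimal characters (16 random bytes).
random_password() {
  head -c 16 /dev/urandom | od -An -tx1 | tr -d ' \n'
  echo
}

random_password
# Hypothetical use inside the loop (not executed here):
#   vault kv put "$key" "$property=$(random_password)"
```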

Our one-liner controller in action!

A real controller runs inside a Pod, right?

Kubernetes

helm install

dev-1

dev-1 namespace

ExternalSecret
"postgres"

Secret
"postgres"

 = external-secrets operator

 = our controller

./controller.sh

controller pod

kubectl get --watch

(observe state)

(reconcile action)

vault put

vault

secret/dev-1/postgres

Visualising the controller pod in the cluster

fetch

create

A real controller runs inside a Pod, right?

Let us write a Dockerfile and a Deployment manifest

#!/bin/bash

kubectl get externalsecret --watch -ojson \
  | jq 'select(.status.conditions[]?.reason == "SecretSyncedError")' --unbuffered \
  | jq '.spec.data[0].remoteRef' --unbuffered \
  | jq '"\(.key) \(.property)"' -r --unbuffered \
  | while read -r key property
do
  vault kv put "$key" "$property=somerandomvalue"
done

controller.sh

FROM alpine:3.16

# The "setcap -r" is detailed in https://github.com/hashicorp/vault/issues/10924.
RUN echo "@testing http://dl-cdn.alpinelinux.org/alpine/edge/testing" >> /etc/apk/repositories \
    && apk add --update --no-cache bash curl jq kubectl@testing vault libcap \
    && setcap -r /usr/sbin/vault

COPY controller.sh /usr/local/bin/controller.sh
CMD ["controller.sh"]

Dockerfile

apiVersion: apps/v1
kind: Deployment
metadata:
  name: controller
spec:
  replicas: 1
  selector:
    matchLabels: {name: controller}
  template:
    metadata:
      labels: {name: controller}
    spec:
      containers:
        - name: controller
          image: controller:local
          imagePullPolicy: Never
          env:
            - name: VAULT_ADDR
              value: http://vault.vault:8200
            - name: VAULT_TOKEN
              valueFrom:
                secretKeyRef:
                  name: vault-token
                  key: vault-token
      serviceAccountName: controller
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: controller
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: controller
subjects:
  - kind: ServiceAccount
    name: controller
roleRef:
  name: external-secrets-reader
  kind: Role
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: external-secrets-reader
rules:
  - apiGroups: [external-secrets.io]
    resources: [externalsecrets]
    verbs: [get, list, watch, update, patch]

deploy.yaml

A real controller runs inside a Pod, right?

The controller pod in action

What now?

Use conditions to alert user when something goes wrong

Users won't know when something goes wrong

apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: postgres
spec:
  data:
  - remoteRef:
      conversionStrategy: Default
      key: secret/dev-1/postgres
      property: password
    secretKey: password
  refreshInterval: 5s
  secretStoreRef:
    name: vault-backend
  target:
    name: postgres
status:
  conditions:
  - type: Ready
    status: "False"
    reason: SecretSyncedError
    message: Secret key was not found
  - type: Created
    status: "False"
    reason: VaultConnError
    message: Vault returned 403 unauthorized
    
    

What now?

Use Go and controller-runtime

Slow sequential processing
due to the while loop

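func main() {
  // Imports are omitted on this slide; v1 is assumed to alias the external-secrets API package.
  mgr, err := manager.New(config.GetConfigOrDie(), manager.Options{})
  if err != nil {
    log.Fatal(err)
  }
  _, err = controller.New("ext-secrets-vault-creator", mgr, controller.Options{
    Reconciler: reconcile.Func(func(ctx context.Context, r reconcile.Request) (reconcile.Result, error) {
      // Fetch the ExternalSecret that triggered this reconciliation.
      extsecret := v1.ExternalSecret{}
      if err := mgr.GetClient().Get(ctx, r.NamespacedName, &extsecret); err != nil {
        return reconcile.Result{}, client.IgnoreNotFound(err)
      }

      // vault kv put

      return reconcile.Result{}, nil
    }),
  })
  if err != nil {
    log.Fatal(err)
  }
  // Still missing for a working controller: a watch on ExternalSecret and mgr.Start.
}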

No more slow sequential processing: the controller can handle hundreds of ExternalSecrets

Maël Valais

Antoine Le Squéren