maelvls dev blog

maelvls dev blog

Systems software engineer. I write mostly about Kubernetes and Go. About

08 Jul2020

Why Kubernetes Compares With Hash

In Kubernetes, the replica set controller uses a hashing mechanism to compare objects. Pods that are owned by a replica set have a label that is used to remember what the hash of the pod template that led to this pod spec:

kind: Pod
metadata:
  labels:
    pod-template-hash: 775855699b

In this post, I want to present the advantages and limitations of using a hash function in this context.

Pros: works around the fact that child object might get mutated or defaulted, which means the reflect.DeepEqual can’t work (what about performance???)

Cons: if the child gets updated, the parent cannot know that it has been changed since the hash only works in one way. (talk about replica set mapping to multiple pods)

  • why is it a label and not an annotation?

Looks like the hash is created in the replicaset sync func and it uses the ComputeHash func which uses fnv.New32a from the std lib:

// ComputeHash returns a hash value calculated from pod template and
// a collisionCount to avoid hash collision. The hash will be safe encoded to
// avoid bad words.
func ComputeHash(template *v1.PodTemplateSpec, collisionCount *int32) string {
    podTemplateSpecHasher := fnv.New32a()
    hashutil.DeepHashObject(podTemplateSpecHasher, *template)

    // Add collisionCount in the hash if it exists.
    if collisionCount != nil {
        collisionCountBytes := make([]byte, 8)
        binary.LittleEndian.PutUint32(collisionCountBytes, uint32(*collisionCount))
        podTemplateSpecHasher.Write(collisionCountBytes)
    }

    return rand.SafeEncodeString(fmt.Sprint(podTemplateSpecHasher.Sum32()))
}

and the DeepHashObject looks like this:

// DeepHashObject writes specified object to hash using the spew library
// which follows pointers and prints actual values of the nested objects
// ensuring the hash does not change when a pointer changes.
func DeepHashObject(hasher hash.Hash, objectToWrite interface{}) {
    hasher.Reset()
    printer := spew.ConfigState{
        Indent:         " ",
        SortKeys:       true,
        DisableMethods: true,
        SpewKeys:       true,
    }
    printer.Fprintf(hasher, "%#v", objectToWrite)
}

So the hashing mechanism uses davecgh/go-spew to turn the object into a string, and then uses the fnv std library to hash that string.

Benchmarking two hashing functions

package main

import (
    "hash"
    "hash/fnv"
    "testing"

    "github.com/davecgh/go-spew/spew"
    "github.com/mitchellh/hashstructure"
)

type ComplexStruct struct {
    Name     string
    Age      uint
    Metadata map[string]interface{}
}

var v = ComplexStruct{
    Name: "mitchellh",
    Age:  64,
    Metadata: map[string]interface{}{
        "car":      true,
        "location": "California",
        "siblings": []string{"Bob", "John"},
    },
}

func BenchmarkMitchellhHashstructure(b *testing.B) {
    for i := 0; i < b.N; i++ {
        _, _ = hashstructure.Hash(v, nil)
    }
}

func BenchmarkKubernetesComputeHash(b *testing.B) {
    for i := 0; i < b.N; i++ {
        hasher := fnv.New32a()
        DeepHashObject(hasher, v)
        _ = hasher.Sum32()
    }
}

// From https://github.com/kubernetes/kubernetes/blob/7e75a5ef/pkg/util/hash/hash.go#L25-L37
func DeepHashObject(hasher hash.Hash, objectToWrite interface{}) {
    hasher.Reset()
    printer := spew.ConfigState{
        Indent:         " ",
        SortKeys:       true,
        DisableMethods: true,
        SpewKeys:       true,
    }
    printer.Fprintf(hasher, "%#v", objectToWrite)
}

The winner seems to be the Kubernetes' ComputeHash function, although it relies on spew to generate a string. I guess most of the time spent by hashstructure.Hash is due to the reflect package?

% go test -bench .
BenchmarkMitchellhHashstructure-8         368101              3.127 µs/op
BenchmarkKubernetesComputeHash-8          456028              2.704 µs/op

Other clues

  controller-revision-hash: 6f7768f569
  k8s-app: kube-proxy
  pod-template-generation: "7"

  k3s.io/node-config-hash
📝 Edit this page