The Charmed Kubernetes CI recently has been encountering an issue where an upgrade-charm
hook is triggered during the initial deploy for no apparent reason. For example:
13:27:47 [validate-ck-bugfix-upgrade-focal-1.17-stable] + juju deploy -m validate-b792948a:validate-upgrade --overlay overlay.yaml --force --channel stable 'cs:~containers/charmed-kubernetes'
13:27:48 [validate-ck-bugfix-upgrade-focal-1.17-stable] Located bundle "cs:~containers/bundle/charmed-kubernetes-432"
13:27:48 [validate-ck-bugfix-upgrade-focal-1.17-stable] Resolving charm: cs:~containers/containerd-64
13:27:48 [validate-ck-bugfix-upgrade-focal-1.17-stable] Resolving charm: cs:~containers/easyrsa-303
13:27:48 [validate-ck-bugfix-upgrade-focal-1.17-stable] Resolving charm: cs:~containers/etcd-501
13:27:48 [validate-ck-bugfix-upgrade-focal-1.17-stable] Resolving charm: cs:~containers/flannel-477
13:27:48 [validate-ck-bugfix-upgrade-focal-1.17-stable] Resolving charm: cs:~containers/kubeapi-load-balancer-715
13:27:48 [validate-ck-bugfix-upgrade-focal-1.17-stable] Resolving charm: cs:~containers/kubernetes-master-826
13:27:49 [validate-ck-bugfix-upgrade-focal-1.17-stable] Resolving charm: cs:~containers/kubernetes-worker-661
...
13:28:13 [validate-ck-bugfix-upgrade-focal-1.17-stable] Deploy of bundle completed.
13:28:13 [validate-ck-bugfix-upgrade-focal-1.17-stable] + juju::wait
13:28:13 [validate-ck-bugfix-upgrade-focal-1.17-stable] + echo 'Waiting for deployment to settle...'
13:28:13 [validate-ck-bugfix-upgrade-focal-1.17-stable] Waiting for deployment to settle...
13:28:13 [validate-ck-bugfix-upgrade-focal-1.17-stable] + timeout 45m juju-wait -e validate-b792948a:validate-upgrade -w
13:32:54 [validate-ck-bugfix-upgrade-focal-1.17-stable] ERROR:root:kubernetes-worker/0 failed: workload status is error
This resulted in:
$ grep -E '(ran|queued) .* hook|operation upgrade' kubernetes-worker_0/var/log/juju/unit-kubernetes-worker-0.log
2020-06-05 17:29:49 INFO juju.worker.uniter resolver.go:139 found queued "install" hook
2020-06-05 17:31:14 INFO juju.worker.uniter.operation runhook.go:142 ran "install" hook (via explicit, bespoke hook script)
2020-06-05 17:31:15 INFO juju.worker.uniter resolver.go:139 found queued "leader-elected" hook
2020-06-05 17:31:17 INFO juju.worker.uniter.operation runhook.go:142 ran "leader-elected" hook (via explicit, bespoke hook script)
2020-06-05 17:31:20 INFO juju.worker.uniter.operation runhook.go:142 ran "config-changed" hook (via explicit, bespoke hook script)
2020-06-05 17:31:20 INFO juju.worker.uniter resolver.go:139 found queued "start" hook
2020-06-05 17:31:51 INFO juju.worker.uniter.operation runhook.go:142 ran "start" hook (via explicit, bespoke hook script)
2020-06-05 17:31:52 DEBUG juju.worker.metrics.collect manifold.go:278 ran "collect-metrics" hook (via explicit, bespoke hook script)
2020-06-05 17:32:29 INFO juju.worker.uniter.operation runhook.go:142 ran "cni-relation-joined" hook (via explicit, bespoke hook script)
2020-06-05 17:32:32 INFO juju.worker.uniter.operation runhook.go:142 ran "kube-api-endpoint-relation-joined" hook (via explicit, bespoke hook script)
2020-06-05 17:32:44 INFO juju.worker.uniter.operation runhook.go:142 ran "container-runtime-relation-joined" hook (via explicit, bespoke hook script)
2020-06-05 17:32:44 DEBUG juju.worker.uniter.operation executor.go:83 running operation upgrade to cs:~containers/kubernetes-worker-661
2020-06-05 17:32:45 INFO juju.worker.uniter resolver.go:139 found queued "upgrade-charm" hook
In this case, the upgrade-charm
hook failed due to an issue that we were specifically testing the fix for, because it obviously didn’t have the new code from an actual upgrade, which is how we noticed this.
Why is upgrade-charm
running during the initial deploy? This was a regular bundle deployment