The issue in this particular case involves a Discourse application, which may need to run a database migration script after a configuration change. (You just got a request to run a new version of your application, and you need to migrate the database to the new schema.)
The catch is that you need to coordinate so that only one unit actually runs the migration script, but the script needs to be run inside the application pod (where the schema is defined).
If this were a normal IAAS charm, you could use “is-leader” to determine which of the units runs the script, and then run the script directly. However, with CAAS the pod that can run “is-leader” is decoupled from the pod that can run the script.
There are a few potential ways to attack this, and it would be good to know what we recommend.
- Talk directly to the K8s API from the operator pod to configure a Job that represents the migration script. It could potentially use the same container image, with a different environment variable to indicate how it is meant to operate. Note that the charm still needs a way to bring down the existing application pods, since they won’t be able to talk to the new db schema, and you don’t really want to start the new application pods until the schema has been migrated.
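As a sketch of that first option, the charm could build a batch/v1 Job manifest that reuses the application image with an environment variable selecting “migration mode”. Everything here (the `RUN_MODE` variable, image tag, and names) is a made-up illustration, not anything Discourse or Juju define; the resulting dict could then be submitted with the K8s client or `kubectl apply`:

```python
def build_migration_job(image, namespace="discourse"):
    """Return a batch/v1 Job manifest (plain dict) that runs the app
    image once in 'migration mode' instead of serving traffic.
    RUN_MODE and all names are hypothetical."""
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": "discourse-db-migrate", "namespace": namespace},
        "spec": {
            # Don't blindly retry a half-finished migration.
            "backoffLimit": 0,
            "template": {
                "spec": {
                    "restartPolicy": "Never",
                    "containers": [{
                        "name": "db-migrate",
                        "image": image,
                        # Same image as the app pods, different behaviour.
                        "env": [{"name": "RUN_MODE", "value": "migrate"}],
                    }],
                }
            },
        },
    }

job = build_migration_job("discourse/app:v2")
```

The charm would still have to watch the Job for completion before putting up the new application pods, which is the sequencing problem described above.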
- Be able to ‘kubectl exec’ a script inside an application pod. In this case the operator could probably configure the new pods to run, but they would go into a “suspended” state until the charm sees that the leader pod is ready, then triggers the script to run the db migration. Once that completes, it then triggers a script on all the pods so that they actually start the application.
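The exec-based sequencing might look roughly like the following. `exec_in_pod` stands in for a kubectl-exec style call (e.g. via the K8s client’s exec stream); the pod names and script paths are assumptions for illustration:

```python
def coordinate_migration(pods, leader, exec_in_pod):
    """Run the migration once in the leader's pod, then wake every
    suspended pod so it actually starts the application.
    exec_in_pod(pod, cmd) is an injected kubectl-exec-style callable."""
    # 1. Migration runs exactly once, inside the leader's application pod.
    rc = exec_in_pod(leader, ["/scripts/migrate-db"])
    if rc != 0:
        raise RuntimeError("migration failed; leaving pods suspended")
    # 2. Only after it succeeds do the pods leave their suspended state.
    for pod in pods:
        exec_in_pod(pod, ["/scripts/start-app"])

# Example with a fake exec that just records what would be run:
calls = []
def fake_exec(pod, cmd):
    calls.append((pod, cmd[0]))
    return 0

coordinate_migration(["app-0", "app-1", "app-2"], "app-0", fake_exec)
```

The point of the sketch is the ordering: one exec on one pod, then a fan-out to all pods only on success.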
- This could also potentially be done with explicit charm actions. (E.g., you don’t just exec something arbitrary in the application, but have pre-defined scripts that the charm can cause to be run.) IIRC, charms don’t have a way to trigger actions; they can currently only be initiated by ‘juju run’ from a user.
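If this went the actions route, the charm would declare the script up front in its actions.yaml, something like the following (the action name and description are made up; the gap is that nothing lets the charm itself invoke it today):

```yaml
# actions.yaml (hypothetical)
migrate-db:
  description: Run the database migration script inside the application pod.
```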
- From the application pods, be able to “juju-run is-leader” to know which pod should be running the migration script. You then need a way to communicate that the migration has been done and that the pods can resume normal operation. This one feels a bit clumsier: it likely means you bring up all 3 application pods with a “don’t actually run” flag set, then run the migration on the leader, which has to communicate back out when the migration has been done, which in turn causes the charm to change the pod spec to say “and actually run the application as normal”. It is certainly doable, but it does need a way for the charm to indicate to a pod that it is the ‘special’ one. (Would this be possible with an env var that says ‘X is the special one’ and some way for pods to tell if they are X?)
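The pod-side half of that last idea can be sketched as a small startup decision, assuming the charm announces the ‘special’ pod via an env var in the pod spec (`MIGRATION_LEADER` and the role names are hypothetical):

```python
import os

def role_for_pod(pod_name, special_pod):
    """Decide what this pod should do on startup, given the 'special one'
    the charm announced (e.g. via an env var in the pod spec).
    'wait' means poll for the migration-done signal before starting."""
    if special_pod is None:
        return "run"  # no migration pending, start the app normally
    return "migrate" if pod_name == special_pod else "wait"

# e.g. the charm sets MIGRATION_LEADER=app-0 in the pod spec, and each
# pod compares it against its own name (here taken from HOSTNAME):
role = role_for_pod(os.environ.get("HOSTNAME", "app-1"),
                    os.environ.get("MIGRATION_LEADER"))
```

This still leaves the “communicate back out” half (migration-done signalling) unsolved, which is the clumsy part noted above.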
- Have the charm running inside the application pod so it has direct access to the migration script.
Are there other ways to solve this problem? Are there good answers for it that are already available in Juju today? (I feel like talking to the K8s API might already be possible, but it is probably a bit clumsy to enable.)