Keeping State in Juju Controllers in Operator Framework

Charm State In Juju Controllers

One of the new features that landed in Operator Framework 0.7 is support for saving the state of your charm in the Juju controller instead of on local disk, via the state-get and state-set tools provided by Juju 2.8.

Currently you have to opt in to using it, since there are some side effects that I’ll outline below. To store state in Juju rather than locally, pass use_juju_for_storage=True to main:

  main(MyCharm, use_juju_for_storage=True)

Effects

With that one change, instead of creating a local SQLite database to maintain charm state, the framework writes the state to the Juju controller with state-set and reads it back with state-get. One major change to be aware of is that the state-* tools share a data transaction with relation-set. If your hook exits nonzero, it will be as if the hook never ran: the internal state is rolled back along with the external state (relation data). This is generally seen as a positive thing, since it keeps both parts in sync, but it is something to be aware of.
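The rollback behaviour can be pictured with a toy model. This is only an illustration of the semantics described above, not how Juju implements them: run_hook and the two example hooks are hypothetical names, and the dictionaries stand in for charm state and relation data.

```python
import copy


def run_hook(committed, hook):
    """Run `hook` against a staged copy; keep the changes only on exit code 0."""
    staged = copy.deepcopy(committed)
    exit_code = hook(staged)
    # A nonzero exit discards both the state and the relation changes together.
    return staged if exit_code == 0 else committed


def good_hook(data):
    data["state"]["event_count"] = 1      # stands in for state-set
    data["relation"]["ready"] = "true"    # stands in for relation-set
    return 0


def failing_hook(data):
    data["state"]["event_count"] = 1
    data["relation"]["ready"] = "true"
    return 1                              # hook fails after making changes


before = {"state": {}, "relation": {}}
after_failure = run_hook(before, failing_hook)
after_success = run_hook(before, good_hook)
```

After the failing hook, after_failure is identical to before: neither the state nor the relation data changed. Only the successful hook produces the new values, and it produces both of them together.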

In general, a charm should need very few other changes. All of the StoredState() objects in the charm (including the one the Framework itself uses) will now use Juju for their storage, which means that state persists across restarted K8s pods and similar events. It should also mean that K8s charms that use Actions have their state available in the action, since the data is no longer local to one machine.

One very nice feature of the new storage is that instead of storing your state as base64-encoded pickles in an SQLite database, we store it as human-readable YAML, so you can run:

$ juju run --unit uo/0 state-get
'#notices#': |
  []
StoredStateData[_stored]: |
  {event_count: 1773}
Ubuntu/StoredStateData[_stored]: |
  {load15min: 0.05, load1min: 0.16, load5min: 0.08}

This should give a bit more visibility into what your charm thinks its current state is.
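From inside a hook, a single key can be read in a similar way. The sketch below is only an illustration: read_state_key is a hypothetical helper, not part of the framework, and its run parameter exists so the function can be exercised without Juju installed. It assumes that state-get accepts a key name as its argument and prints that key's value.

```python
import subprocess


def read_state_key(key, run=subprocess.run):
    """Return the raw text that `state-get <key>` prints for one key.

    Hypothetical helper; only works inside a hook context on Juju 2.8+.
    """
    result = run(
        ["state-get", key],
        stdout=subprocess.PIPE,
        check=True,   # raise if the tool exits nonzero
        text=True,    # decode stdout to str
    )
    return result.stdout
```

With the output shown above, read_state_key("StoredStateData[_stored]") would be expected to return the YAML text for that entry, which the caller can then parse.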

Caveats

There are a few things to be aware of when enabling this in your charm.

  1. There isn’t an upgrade path from a charm that has been using local storage to one that uses Juju for storage. We plan to implement support for this, but it wasn’t required for the primary use case (newly developed charms targeting K8s deployments that have no storage).
  2. This is a strict ‘must use Juju for storage’. If you deploy your charm on Juju 2.7, it won’t work and will raise an error when it tries to call state-get. This is related to (1). Once we have support for upgrading, we will likely change the logic so that if state-get is available we prefer it, and if we see a charm database we migrate its content into Juju storage. Doing this correctly is non-trivial: Juju state is only persisted when the hook exits successfully, so if we copied the state, deleted the database, and then the hook failed, we would lose state completely, which we don’t want. We have plans; we just didn’t want to rush.
  3. If you enable use_juju_for_storage then you will no longer get the Charm.on.collect_metrics event inside your script. We will still exec a hooks/collect-metrics script if it exists, but in Juju 2.8.0 the restricted context of collect-metrics does not expose any of the state-* tools, so we are unable to read and write the Framework state to properly trigger events. Charms that want to avoid local storage but still produce metrics can write metrics.yaml and a collect-metrics hook themselves, just without framework support.
  4. There is some concern about performance, as executing an external process to read and write state is quite a bit slower than reading from an SQLite database. The code also isn’t yet designed around minimizing calls (batching, keeping an object you just wrote so you don’t have to read it again, and so on).
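To make the idea in caveat 4 concrete, here is a minimal sketch of a wrapper that batches writes and remembers values it has already seen. It is not how the framework works today. CachingStore and CountingBackend are hypothetical names, and the backend interface (get and set_many) is an assumption made for the example.

```python
class CachingStore:
    """Hypothetical wrapper that remembers values and batches writes."""

    def __init__(self, backend):
        self._backend = backend   # anything with get(key) and set_many(dict)
        self._cache = {}          # values already read or written
        self._dirty = {}          # writes not yet sent to the backend

    def get(self, key):
        if key not in self._cache:
            self._cache[key] = self._backend.get(key)
        return self._cache[key]

    def set(self, key, value):
        self._cache[key] = value
        self._dirty[key] = value

    def flush(self):
        """Send all pending writes in a single backend call."""
        if self._dirty:
            self._backend.set_many(dict(self._dirty))
            self._dirty.clear()


class CountingBackend:
    """Stand-in for the slow external tools; counts how often it is called."""

    def __init__(self):
        self.data = {}
        self.calls = 0

    def get(self, key):
        self.calls += 1
        return self.data.get(key)

    def set_many(self, items):
        self.calls += 1
        self.data.update(items)


backend = CountingBackend()
store = CachingStore(backend)
store.set("a", 1)
store.set("b", 2)
value = store.get("a")   # served from the cache, no backend call
store.flush()            # one backend call covers both writes
```

In this run the backend is called once in total, where a naive implementation would have made three calls. A real version would also need to decide when to flush, for example once just before the hook exits.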

Future Work

The next steps for this work are to make sure we have a solid upgrade story and a solid performance story, after which we can turn this on automatically. Then, when a charm is upgraded from an older version of the framework to a newer one, or when the Juju controller managing the charm is upgraded, the framework can migrate the data across and switch how it stores data.
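The selection logic that this plan implies might look roughly like the following. It is a sketch of the stated intention, not code from the framework: plan_storage, its parameters, and the example database path are all hypothetical, and checking whether state-get is on the PATH is an assumed way of detecting Juju support.

```python
import os
import shutil


def plan_storage(db_path, which=shutil.which, exists=os.path.exists):
    """Return (backend, needs_migration) for a unit.

    Hypothetical sketch: prefer Juju storage when state-get is available,
    and flag a migration when a local charm database is still present.
    """
    if which("state-get") is None:
        # Older Juju: keep using the local database.
        return ("local", False)
    return ("juju", exists(db_path))


# Example call with a placeholder path; the result depends on the machine.
backend, needs_migration = plan_storage("example-charm-state.db")
```

Deciding to migrate is the easy part. As caveat 2 explains, the copy itself must not delete the local database until the hook has exited successfully, since that is when the Juju state is actually persisted.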
