Critical Review of Juju Today

Hello everybody, my team and I at Katharos Technology wanted to share our critical review of what Juju is like today. None of this is meant to senselessly bash Juju and its problems; it is an honest evaluation of Juju based on our experiences with it. We want to share what we have learned, both the good and the bad, so that we can help Juju grow and become better.

We have the utmost respect for Juju and the people involved, both from Canonical and from the community. We aren’t trying to point fingers or tell people that “you should do better”. Some of these points are even well on their way to being addressed in one way or another, such as better documentation. This is all meant to be constructive criticism so that we can collaborate and make stuff happen! :muscle:

Just a warning, this is going to be a long one, so, here we go! :smiley:


Juju’s Strengths


Holistic Approach to Automation

Juju takes a unified approach to the enterprise automation problem. It doesn’t just cover one of provisioning, configuration management, deployment, or operations; it covers all of them. This is a requirement to truly automate things like scalable deployments of stateful applications such as databases. You can achieve the golden “push button scalability” where the only thing you have to do to add a server to your application cluster and get it fully configured and user-facing in production is to run a single command or click a button.

This is not something that is provided anywhere else.

You have tools such as Terraform, Rancher, Docker Swarm, Kubernetes, Chef, Ansible, and others, but none of them cover the entire landscape of infrastructure, application, and operations automation at the same time.

Declarative Configuration + Programmable Configuration

Juju allows for a declarative application stack definition with its YAML files to define Juju Models. Declarative configuration is a key component of making systems more reproducible and source controllable, but only having declarative configuration can make a system too opinionated to adapt to different users’ use-cases and applications.

Juju combines the declarative configuration of Models with the programmable configuration of Charms. This combination allows you to hook into the application’s automation life-cycle and share information with other charms to coordinate complicated application deployments in a way that other orchestrators like Swarm and Kubernetes do not allow.

This, too, is something we have not found anywhere else, and its flexibility was the only reason it was possible for us to make the Lucky Charming Framework.

Juju’s Weaknesses


Barriers to Charm Deployment & Management

These are some of the obstacles we’ve noticed when it comes to using Juju and deploying/managing charms with it:

Phasing Out the Interactive Juju GUI

Lately Juju has been working on a new Juju GUI that focuses on wide-scale visibility while allowing you to control the Juju cluster by using an embedded CLI in the web interface. This is contrary to the old Juju GUI which allowed for graphical drag-and-drop/point-and-click control of your Juju cluster in addition to having a built-in web CLI.

We believe the value of the old Juju GUI has been underestimated, and we would much rather preserve the ability to control the cluster through the GUI, not just with an embedded CLI.

To clarify, we are not foreign to the command-line in any respect. We spend all day on the command-line in our work, but that does not diminish the value of performant and responsive GUIs for managing wide-reaching changes faster than you can on the command-line. There are a number of reasons:

Operator Friendliness

In most enterprises, it is not the person who develops the app that manages the operations for that app in production, it is the operators. The operators are handed the controls for an application that they do not have an in-depth knowledge of and tasked with being able to handle the possible problems that could happen with an application at scale in production.

Intuitiveness is going to be key to making sure that hand-off from the developer to the operator goes smoothly and properly empowers the operator to take on the tasks that will be necessary to manage the application.

Having a GUI as obvious and intuitive as the old Juju GUI would be a huge advantage. If a new server needs to be created with specific resource constraints and a new unit for a specific application added to that machine, it could hardly be easier than to say, “add machine”, “set CPUs/Memory”, “add this unit over there”, “oh, and maybe this unit, too”. This is demonstrated in the LizardFS tutorial that I made with the Juju GUI:

There are improvements that could still be made to the old Juju GUI, such as a way to run charm actions, maybe, or a better “zoomed out” view allowing you to view the status of multiple models at a time, but the core functionality there is invaluable and something that my team and I have not experienced since leaving Rancher, the only software we have yet seen that delivers a truly amazing real-time, interactive orchestration dashboard.

Speed and Context During Dynamic Operations

Another advantage of the GUI is speed and context. When charms can change status every second and something goes into an error state, you want to know why and you need to find out quickly, or the status could change again before you could see what caused the problem.

You need to be able to see that a charm or unit is unhealthy or errored and click on that charm or a button next to it to get the logs for the charm instantly. Then you need to be able to close those logs and click the log button of another charm to see its logs. If I have to type out a long-winded command to get the logs on the command-line, and then type that command out again to get the logs for the other charm, I’m not going to get to it soon enough. I then have to scour the verbose debug logs of the charm to find out where it went wrong, after the fact.

Rancher was an example of an amazing workflow: all of the application units were represented by little circles that were either :white_check_mark:, :warning:, or :x:. You could right-click on any of them to get the logs or shell into them right from the web GUI instantly. You could also stop/restart the units by clicking them or modify the charm configuration.

The icons were small and organized by model and charm ( converting into Juju terms, anyway ). You could quickly find what you were looking for and because the unit icons were small you could see the status of tens of units at a glance. You could also click on the unit to get unit-specific information in a pop-out drawer. It even had graphs for the CPU, Memory, Disk, and Network usage!

The Juju GUI wasn’t quite there yet, but it was almost there. That workflow is amazing, and something we used extensively when managing our applications in Rancher. We were confident because we could control the cluster quickly and view and respond to changes without having to type long-winded commands or look up the help messages for the CLI just to figure out how to use it before we missed what we were looking for.

Wider User Base

Juju has a smaller user base than we want it to have. It has been reported that Juju kind of needs an expert to use or teach people to use it:

That being said, who do we expect the current users of Juju to be? They are most likely going to be DevOps experts or else very adventurous devs. If we want to improve and extend our user-base, then that will involve many people who are not necessarily the ones who spend all day on the command-line: users who would greatly appreciate an interactive GUI or maybe not even be able to use Juju without it. We can’t cater just to the people who may be using Juju now; we also have to cater to the people we want to be using Juju, which is a larger audience than Juju is currently successfully targeting.

Application Logs Need to Be Accessible ( through GUI and CLI )

Juju allows charms to log output that is accessible from the Juju GUI and from the Juju CLI, but it doesn’t have any facility for accessing the application’s logs. This is essential to understanding why your application is behaving a certain way.

We run all of our application workloads in containers where the standard output of the container is always accessible as the source for application logging. It provides a standardized way to get to the application logs without having to have a knowledge of the application and the way it was configured and where to find its logs on disk. It’s always in the docker log.

Even if you knew where to get the logs for your application it is still a two step process to get them with Juju: you have to juju ssh into the unit that you want to investigate, then you have to tail the log that you are interested in. That can be a lot of typing and, if you have a lot of units you need to check on, it can be very cumbersome and difficult to get the big picture.

Juju needs a mechanism not just to get the Charm logs for debugging the charm, but also to get the application logs for debugging the application.

Implementation idea: This could possibly be implemented as a UDP endpoint or maybe just a file descriptor that you could ncat or pipe the application logs to. There needs to be a way to capture the logs for the application without needing to call a command like juju-log for every line.
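To make the file descriptor variant of that idea a bit more concrete, here is a minimal Python sketch. It is not an existing Juju feature: the function name `collect_app_logs`, the unit name `grafana/0`, and the log lines are all made up for illustration, and a plain OS pipe stands in for whatever descriptor Juju might hand to the application.

```python
import os


def collect_app_logs(read_fd, unit):
    """Read log lines from a file descriptor until EOF, tagging each with the unit name."""
    tagged = []
    with os.fdopen(read_fd, "r", encoding="utf-8", errors="replace") as stream:
        for line in stream:
            text = line.rstrip("\n")
            tagged.append(f"{unit}: {text}")
    return tagged


# Stand-in for an application whose output is piped to the collector.
read_fd, write_fd = os.pipe()
with os.fdopen(write_fd, "w", encoding="utf-8") as app_output:
    app_output.write("listening on :3000\n")
    app_output.write("database connection failed\n")

# The write end is closed, so the collector sees EOF after the two lines.
logs = collect_app_logs(read_fd, "grafana/0")
```

The point of the sketch is that the application only writes plain lines to a descriptor; it never has to call a logging command once per line.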

Cascading Irrecoverable Error States and Automatic Machine Removal

Juju charms, for safety purposes, have a rather conservative behavior when any charm hook fails. The charm will go into an error state and essentially freeze all operations ( other than retrying the previous operation ) to prevent data loss. The issue with this is that it tends to result in cascading irrecoverable error states when faced with a bug in the charm.

For example, I ran into a situation where I had an HTTP proxy charm that had a bug that caused a hook failure under certain conditions. I related this HTTP proxy charm ( not knowing about the bug ) to a Grafana charm. When the proxy charm went into an errored state the only way to fix it was to force remove the charm. I force removed the proxy charm, but then Grafana’s hook errored out, not because of a bug in Grafana, but because the charm’s Juju agent was eternally trying to respond to a relation hook event on a relation that no longer existed, since the force removal of the proxy.

My only recourse was to force remove Grafana. If that Grafana charm had been related to a database, the database charm probably would have errored and had to be force removed as well, and so on.
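The worst case of that cascade can be sketched as a walk over the relation graph. This is a toy model in Python, not Juju's actual behavior: it simply assumes that every application related to a force-removed one errors out on the dangling relation and has to be force-removed as well. The application names and the function `worst_case_cascade` are hypothetical.

```python
from collections import deque


def worst_case_cascade(relations, first_removed):
    """Return every application reachable over relations from the first force-removed one.

    Assumes the worst case: each neighbour of a force-removed application
    errors on the dangling relation and must be force-removed too.
    """
    neighbours = {}
    for a, b in relations:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)

    removed = [first_removed]
    seen = {first_removed}
    queue = deque([first_removed])
    while queue:
        app = queue.popleft()
        for other in sorted(neighbours.get(app, ())):
            if other not in seen:
                seen.add(other)
                removed.append(other)
                queue.append(other)
    return removed


relations = [("http-proxy", "grafana"), ("grafana", "database"), ("wiki", "cache")]
cascade = worst_case_cascade(relations, "http-proxy")
```

In this model one buggy proxy takes out everything it is transitively related to, while unrelated applications ( `wiki` and `cache` here ) survive.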

That only happened in a dev environment but that is a very scary thing to put into production. One unforeseen situation in charm code could necessitate the removal of my entire production stack. And that’s not all.

Automatic Machine Removal

By default, when you remove a unit from a machine, Juju removes the machine as well if that was the last unit on the machine. Say that unit was the last unit of my database cluster. Juju’s default behavior is then to destroy all of the data that was stored on that machine’s disk. If I had to force remove an application because of an error state cascade, I could easily forget to first back up my data, and Juju would not just remove the charms, but the machine and all of the charm’s data, permanently. ( Obviously I should have backups, but that is never something we can assume a user has and just delete their stuff. )

To counteract this default, we at KatharosTech actually provision null charms on every host as a safety mechanism. These charms will reside on each machine and therefore prevent Juju from removing the machine automatically without me explicitly removing the null charm.

Juju should at least allow you to configure this, or maybe provide a way to have charms specify that they store persistent data that shouldn’t be destroyed without user confirmation.
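The default, the null charm workaround, and the kind of setting being asked for can be compared in a small Python toy model. Nothing here is Juju's real implementation: the `Machine` class and the `auto_remove` flag are hypothetical, and the unit names are only examples.

```python
class Machine:
    """Toy model of a machine that hosts units (not Juju's real implementation)."""

    def __init__(self, name, auto_remove=True):
        self.name = name
        self.auto_remove = auto_remove  # hypothetical per-machine setting
        self.units = set()
        self.destroyed = False

    def add_unit(self, unit):
        self.units.add(unit)

    def remove_unit(self, unit):
        self.units.discard(unit)
        # Default behaviour described above: the last unit takes the machine with it.
        if not self.units and self.auto_remove:
            self.destroyed = True
        return self.destroyed


# Default: removing the last unit destroys the machine and its disk.
default = Machine("0")
default.add_unit("postgresql/0")
default.remove_unit("postgresql/0")

# Workaround: a null charm unit keeps the machine occupied.
guarded = Machine("1")
guarded.add_unit("postgresql/1")
guarded.add_unit("null/0")
guarded.remove_unit("postgresql/1")

# Requested behaviour: a setting that turns automatic removal off.
configured = Machine("2", auto_remove=False)
configured.add_unit("postgresql/2")
configured.remove_unit("postgresql/2")
```

Only the first machine ends up destroyed; the other two keep their data until someone removes them explicitly.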

Barriers to Charm Development

If Juju is to be useful, you have to have charms to deploy. Obviously there is the charm store to go and download charms that people have made, but chances are, if you are not using Juju specifically because you want to deploy an app stack you found on the store, you will need to write your own charms for your applications at some point. If people are going to use Juju, then, we have to make it as easy as possible not just to use, but also to make charms.

Here are some obstacles we’ve noticed to writing charms:

Charming Documentation

The charming documentation, while actually somewhat well filled out, is still hard to approach for beginners to charming. Here are a couple of causes of that:

  • A little bit of a split between the reactive framework and the hook-based model ( though lots of the older documentation has been appropriately marked as such )
  • A lot of context is needed to understand charming that can’t really be found in just one place

The second point is probably the most important. There is a lot of investigation into charming that I had to do, reading through the hook tools documentation and such, before I actually understood how the charms worked. And it took a really long time ( relatively ) before I fully understood relations, something that essentially can’t make sense until you start actually using them.

The reactive framework and its Python and Bash variants further confused things. The reactive framework seems too magical. I couldn’t figure out where the database settings were coming from in the Python examples and I couldn’t figure out why some of the commands for the bash version were written in snake_case and some were written in kebab-case.

Also, it wasn’t clear right off that, if you wanted to write bash charms, you really had to scour the charms that you wanted to relate to for the information necessary to relate to them, because they only had documentation for reactive relations.

Charms Were Slow To Deploy

I also noticed when I first started following tutorials and getting into charm development that deploying and relating charms was slow. OK, not really that slow, but for someone used to using containers for everything, which start up very quickly, charms seemed super slow. Also, after the app was installed, just relating and un-relating the charms was taking a long time doing, according to the logs, nothing but printing trace messages. ( Thankfully, this has been fixed with a great effort by @jameinel to speed up the hook tools ).

These experiences with the documentation and slow deployment led us to make the Lucky charming framework to help address those issues.

How Lucky Tries to Address Charming Barriers

The goal of Lucky was to help provide a well documented and cohesive way to write charms that were fast to deploy.

Documentation

To help the documentation issue, Lucky consolidates everything you need to write charms into a single CLI with all of the commands documented both online and in an embedded CLI documentation viewer. All of the commands you need to interact with Juju and the charming system are in one CLI, so that you hardly need to go anywhere other than the Lucky documentation to learn to write charms ( though we still link to the Juju documentation where appropriate ). That way, in your charm code, you can clearly tell when you are interacting with Juju and the charming system, whenever you run a lucky command.

Bash

The other thing was, we didn’t want you to have to write Python. In my 5+ years as a system administrator writing containers, Bash has constantly been the tool for automating system installations and tasks. Most everything you need to do is accomplished by running commands on the system, and you shouldn’t need anything outside of that and some control flow with if statements and loops to accomplish the installation and management of system applications.

Python is not bad in any respect, and we have considered adding Python scripting support to Lucky later, but you shouldn’t need it and Bash should not be considered second class. Supporting Bash first-class opens up the charming world to a wider audience and lowers the barrier to entry with a language that any system administrator who knows how to install an application is probably already comfortable with.

Docker

Docker is crucial to maximizing charm deployment speed and simplifying the charming system by eliminating the need for layers. By using Docker containers you provide a standardized way to package applications that works across Linux distributions. Additionally, almost all popular apps nowadays have Docker containers written by the application maintainers or another respected software organization. This offloads a major portion of the engineering for a large collection of applications onto the application’s maintainers and off of the charm developer.

This comes with benefits to the deployment speed of charms as well. Charms take only as long to start as the container takes to download ( plus some housekeeping by Juju ). This greatly improves the user friendliness of deploying charms and the developer friendliness of developing and re-deploying charms many times. It is a big productivity booster.

Docker is arguably an increased barrier to entry because you have to learn Docker, but Lucky does not require you to use Docker. It has a full integration with Docker as a tool that can boost your charm development productivity, and it provides a way to turn off the Docker integration when you don’t need it.

Helpful Utils

Because you are writing Bash instead of Python, you don’t necessarily have a massive standard library at your disposal, so Lucky comes with some utilities for generating things like random passwords, finding random available ports on the system, and storing state in a local key-value store. It also comes with a built-in system-independent cron scheduler.
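For readers wondering what such helpers amount to, here is a rough Python sketch of three of them. It is not Lucky's implementation or its command names: `random_password`, `random_free_port`, and `KVStore` are hypothetical, and the JSON file is just one simple way to keep local state.

```python
import json
import os
import secrets
import socket
import string
import tempfile


def random_password(length=24):
    """Generate a password from letters and digits using a cryptographic RNG."""
    alphabet = string.ascii_letters + string.digits
    return "".join(secrets.choice(alphabet) for _ in range(length))


def random_free_port(host="127.0.0.1"):
    """Ask the OS for a currently unused TCP port by binding to port 0."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind((host, 0))
        return sock.getsockname()[1]


class KVStore:
    """Tiny key-value store persisted as a JSON file."""

    def __init__(self, path):
        self.path = path

    def _load(self):
        if not os.path.exists(self.path):
            return {}
        with open(self.path, encoding="utf-8") as handle:
            return json.load(handle)

    def set(self, key, value):
        data = self._load()
        data[key] = value
        with open(self.path, "w", encoding="utf-8") as handle:
            json.dump(data, handle)

    def get(self, key, default=None):
        return self._load().get(key, default)


store = KVStore(os.path.join(tempfile.mkdtemp(), "state.json"))
password = random_password()
store.set("db_password", password)
port = random_free_port()
```

Note that a port found this way is only free at the moment of the check; another process could claim it before the application binds to it.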

Platform Independence

Lucky does not depend on any system tools other than Bash. It does use platform specific procedures for installing Docker on the target platform ( only Ubuntu so far ), but that is easy to extend to other operating systems such as Centos. Additionally, if Python support is added to Lucky, Lucky will have the CPython interpreter and standard library built-in with no dependency on the system version of Python, greatly reducing the obstacles involved with porting charms to be multi-platform.

That combined with Docker for encapsulating the application installation should make it feasible to deploy one charm on Centos and Ubuntu seamlessly.

Remaining Issues

We believe that Lucky has taken a large step in the direction of addressing some of our biggest issues with Juju and has allowed us to start developing and using charms with success so far. We have a list of charms now made with Lucky on the charm store all of which have been put to use ( not full production yet, but going to be ).

Still, there are things that are non-optimal, some of which may require large changes to Juju itself.

Charm Store Rejects Cross-Distro Charms

It has been discussed a few times before that you are not allowed by the charm store policy to push a charm that supports both Centos and Ubuntu at the same time. This is essentially counteracting a goal of Juju by store policy!

If the user is innovative enough to do it, they should be allowed to deploy charms that support both distros. Just because the reactive framework doesn’t support Centos doesn’t mean that other users can’t produce frameworks that do or else just write plain ol’ bash and check the distro before doing anything distro specific.
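Checking the distro before doing anything distro specific is not much code. Here is a minimal sketch in Python ( the same logic is just as short in Bash ) that reads the `ID` field of `/etc/os-release` content and picks a package manager. The function names and the sample file contents are illustrative assumptions, not part of any charming framework.

```python
def parse_os_release(text):
    """Parse the KEY=value format of /etc/os-release into a dict."""
    fields = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        fields[key] = value.strip().strip('"').strip("'")
    return fields


def install_command(os_release_text, package):
    """Pick the install command for the distro named in os-release."""
    distro = parse_os_release(os_release_text).get("ID", "")
    if distro == "ubuntu":
        return ["apt-get", "install", "-y", package]
    if distro == "centos":
        return ["yum", "install", "-y", package]
    raise ValueError(f"unsupported distro: {distro!r}")


# Sample contents; a charm would read the real /etc/os-release instead.
ubuntu_sample = 'NAME="Ubuntu"\nID=ubuntu\nVERSION_ID="18.04"\n'
centos_sample = 'NAME="CentOS Linux"\nID="centos"\nVERSION_ID="7"\n'

ubuntu_cmd = install_command(ubuntu_sample, "curl")
centos_cmd = install_command(centos_sample, "curl")
```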

If you are really, really set on it being a quality control problem where people are publishing broken charms to the store, then don’t let them make it public and let them use it privately. I couldn’t even publish a null charm that didn’t do anything for both Centos and Ubuntu, so what did I do? I just skipped Centos support like almost every other charm on the charm store.

Versioned And Formally Defined Charm Interfaces

A major problem with the charming ecosystem is the fact that interfaces are not versioned nor are they formally defined.

For instance, I created a charm for the RethinkDB database and I developed the reql relation interface as a way to get the info charms would need to connect to the database such as username and password. Now there are two issues: how to document it, and how to change it.

Documenting Interfaces

Until very recently, interfaces had absolutely no standard way of being documented. It was expected that people would write reactive layers to officially support an interface, but that restricts their use to only Python and Reactive charms ( which don’t run on Centos! ). That is closing off the community and now not any charm can talk to any other charm!

A recent attempt at addressing this has been the #docs:interfaces category on the forum. This is the best thing for interface documentation yet, but I still feel that we need a formal definition of a charming interface, maybe a YAML interface definition or something, to really solidify officially how charm interfaces should be documented.

Related to this, it would be very good to have some form of type checking on this formal definition that could warn you when you are breaking the rules for that relation interface, helping the charm developer and preventing bugs.
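As a rough idea of what a formal definition plus type checking could look like, here is a small Python sketch. Everything in it is hypothetical: no such schema format exists in Juju today, the `check_relation_data` helper is invented, and the `host` and `port` fields are assumptions added next to the username and password mentioned above. A YAML file could carry the same information as the dict used here.

```python
# Hypothetical formal definition of the provider side of the reql interface.
REQL_INTERFACE = {
    "name": "reql",
    "provides": {"host": str, "port": int, "username": str, "password": str},
}


def check_relation_data(interface, side, data):
    """Return a list of human-readable problems; an empty list means the data conforms."""
    expected = interface[side]
    problems = []
    for field, field_type in expected.items():
        if field not in data:
            problems.append(f"missing field: {field}")
        elif not isinstance(data[field], field_type):
            problems.append(
                f"{field} should be {field_type.__name__}, got {type(data[field]).__name__}"
            )
    for field in data:
        if field not in expected:
            problems.append(f"unexpected field: {field}")
    return problems


good = {"host": "10.0.0.5", "port": 28015, "username": "admin", "password": "example-only"}
bad = {"host": "10.0.0.5", "port": "28015", "username": "admin"}

good_problems = check_relation_data(REQL_INTERFACE, "provides", good)
bad_problems = check_relation_data(REQL_INTERFACE, "provides", bad)
```

A check like this could run in a linter or at relation time and warn the charm developer before a malformed relation ever reaches another charm.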

This is a large topic as there are a lot of ways to approach it, but I think that solving the issue is essential to a solid charm ecosystem.

Versioning Interfaces

The next issue you have is that, even if you document your interfaces, you cannot version them. Juju and the charm store essentially build a sort of “package manager” for full blown applications, but what would a package manager be without package versions?

With the reql interface I designed, there is currently no way to join with any user other than the admin user. That is something I want to add to the interface, but what if that feature necessitated a breaking change to the relation key-value procedure? If I just up and changed the way that the interface worked, other charms on the store would just stop working with my charm with absolutely no explanation for what went wrong.

The only way to avoid this today is to manually put a version in the interface name, such as reql2, which is not a pattern I have seen suggested anywhere and yet is the only way I can imagine to make sure you don’t arbitrarily break relations with the way interfaces are set up today. This is another point where the community has no clear documentation or guidelines and where Juju has no constructs to enforce structure and compatibility.
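If the version lives in the name, tooling at least has something to parse. This Python sketch shows one possible convention; it is an assumption, not an established rule: a trailing number is the version, and a name without one counts as version 1. Both function names are hypothetical.

```python
import re


def split_interface_name(name):
    """Split a manually versioned name like 'reql2' into ('reql', 2).

    A name without a numeric suffix is treated as version 1 (an assumption).
    """
    match = re.fullmatch(r"(?P<base>.*?)(?P<version>\d*)", name)
    version = int(match.group("version")) if match.group("version") else 1
    return match.group("base"), version


def is_breaking_upgrade(old_name, new_name):
    """True when both names share a base but the version number went up."""
    old_base, old_version = split_interface_name(old_name)
    new_base, new_version = split_interface_name(new_name)
    return old_base == new_base and new_version > old_version


parsed_new = split_interface_name("reql2")
parsed_old = split_interface_name("reql")
upgrade = is_breaking_upgrade("reql", "reql2")
```

The weakness of the convention shows up immediately: an interface whose real name ends in a digit would be misread, which is one more argument for a proper version field.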

Key-Value Relation System Seems Flawed

That brings us to the way that relations are structured. I think most can probably agree that relations are the most powerful and the most confusing aspect of Juju charms. They are amazing and the key to Juju’s ability to organize application deployments, but they are a bitter pill to swallow when you are trying to use them in your own charms for the first time.

Relations Are Difficult to Learn

The documentation for them suffers from maybe ( in overly figurative terms :wink: ) the worst plague in project documentation: all of the information is there, you just can’t understand how it affects you practically.

The relation documentation is quite complete, actually ( not counting app relations which just came out recently ), but the problem is that it is very hard from that documentation to figure out how relations pan out practically in a project.

Only when you start messing with the relations, probably do some experimenting with juju-run, and start writing charms that both provide and use relations, do you actually start to figure out how they work.

Also, with the advent of app-scope relations, which solve a great, practical use-case, things get even more complicated. There are even more rules for when one relation can do what or get to what information when and which hooks are triggered, etc. It was difficult for me to distill, and I am an experienced software engineer.

Yes, I figured it out, I taught my colleague, and I wrote a tutorial about how to do it all with Lucky and it actually wasn’t super complicated, but it was hard to get to that point.

Shared Key-Value Pairs Seems Like a Bad Practice

The other thing about relations is that the concept of a shared key-value store and hook-based discovery of changes feels like a bad practice. Not only is it confusing to learn, but it also increases the chances that your charm is going to run into an unforeseen situation that causes a charm error and, potentially, the irrecoverable error cascade discussed earlier.

Even though both sides of the relation can only change their own side of the relation data, they are still sharing state and expecting each other to modify the state in a predictable manner in response to each other’s modifications. That setup, subjectively, sounds suspiciously like a concurrent programming anti-pattern that is prone to mistakes that can only be discovered by running the application. Hopefully you run the application and work out all of the mistakes before pushing it to production!

That is a little overstated, though, because that is to a certain, unavoidable extent, the nature of distributed system automation.

A Way Out? - The Actor Model

So, here is an idea that would take potentially monumental changes to Juju to support fully, but I think the idea has merit.

The Actor model is a concurrent programming model in which each actor is solely responsible for its own state, which it shares with no one. Actors accomplish work by communicating with each other by sending messages. The Actor model is a powerful foundation for building reliable distributed systems and it is, importantly, very easy to understand.

If we structured charms as actors, the Juju controller would send the charms messages whenever the controller needed to notify the charm of something, such as changed charm configuration. If the charm needed to communicate with another charm over a relation, it would send a message. If the other charm needed to respond, it would send another message. It is very easy to think about and explain to new users.

As for the messages, they could probably be simple key-value documents with a required type field that would be a unique identifier for the type of message that is being sent. Maybe they could be JSON documents and store more nested data. Also, the interface would be simple to document and type-check. Each interface would define the types of messages that it needs to handle on both the providing and the requiring sides and the types of fields that each type of message would have. We could define a simple YAML schema definition with versions for the schema.
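Here is a very small Python sketch of what charms-as-actors with typed messages might look like. It is only a thought experiment under our own assumptions: the class names, the message types `credentials-request` and `credentials`, and the in-memory mailboxes are all hypothetical, and a real system would deliver messages through the Juju controller rather than by direct method calls.

```python
from collections import deque


class CharmActor:
    """Toy actor: private state, a mailbox, and handlers keyed by message type."""

    def __init__(self, name):
        self.name = name
        self.mailbox = deque()
        self.state = {}  # only this actor ever mutates its own state

    def send(self, target, message):
        if "type" not in message:
            raise ValueError("every message needs a 'type' field")
        # Copy the message so sender and receiver never share a mutable object.
        target.mailbox.append({**message, "from": self})

    def process_all(self):
        while self.mailbox:
            message = self.mailbox.popleft()
            handler_name = "on_" + message["type"].replace("-", "_")
            handler = getattr(self, handler_name, None)
            if handler is not None:
                handler(message)


class DatabaseCharm(CharmActor):
    def on_credentials_request(self, message):
        reply = {"type": "credentials", "username": "admin", "password": "example-only"}
        self.send(message["from"], reply)


class WebAppCharm(CharmActor):
    def on_credentials(self, message):
        self.state["db_user"] = message["username"]


database = DatabaseCharm("rethinkdb")
webapp = WebAppCharm("webapp")

webapp.send(database, {"type": "credentials-request"})
database.process_all()  # database replies with a credentials message
webapp.process_all()    # webapp records what it received
```

Each side only reacts to messages it receives, so documenting the interface comes down to listing the message types and their fields.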

Also, we would most likely need to create a simple “schema package manager”: a centralized ( yet optionally self-hostable ) location where you push your schema definitions with their names and versions, very similar to npm, pypi, crates.io, etc. Whoever pushes the package first gets the name, and we can collaborate on the schemas like a typical Open Source package. That way we can unify the community around formal definitions that set a standard for which charms can reliably communicate with each other.
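The "first push gets the name" rule is simple enough to sketch in a few lines of Python. This is a toy in-memory registry under assumed rules, not a proposal for an actual service: the `SchemaRegistry` class is hypothetical, and treating published versions as immutable is our own assumption borrowed from how typical package registries behave.

```python
class SchemaRegistry:
    """Toy in-memory schema registry: first pusher owns the name, versions are immutable."""

    def __init__(self):
        self._owners = {}
        self._schemas = {}

    def push(self, owner, name, version, schema):
        # setdefault records the first pusher as the owner and returns it afterwards.
        current_owner = self._owners.setdefault(name, owner)
        if current_owner != owner:
            raise PermissionError(f"{name} belongs to {current_owner}")
        if (name, version) in self._schemas:
            raise ValueError(f"{name} version {version} is already published")
        self._schemas[(name, version)] = schema

    def fetch(self, name, version):
        return self._schemas[(name, version)]


registry = SchemaRegistry()
registry.push("katharostech", "reql", 1, {"messages": ["credentials"]})
registry.push("katharostech", "reql", 2, {"messages": ["credentials", "create-user"]})

try:
    registry.push("someone-else", "reql", 3, {"messages": []})
    rejected = False
except PermissionError:
    rejected = True
```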

Also, the actor model gets rid of the limitations of having to fit into Juju’s own hook system and only operate within those hooks. That was something that was a little difficult to break out of with Lucky, when we needed to implement a cron scheduler.

Footnote: Another potentially important aspect of the Actor model is that actors should be able to skip messages for later so that they can act as a Finite State Machine ( FSM ). This helps to make the charm life-cycle easier to reason about and make more reliable.
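Skipping messages for later could work roughly like this Python sketch of a two-state charm. The states, the message types, and the class name are hypothetical; the only idea being shown is that messages arriving in the wrong state are set aside and replayed, in arrival order, once the state changes.

```python
from collections import deque


class InstallingThenReady:
    """Toy FSM actor that defers every message until installation has finished."""

    def __init__(self):
        self.state = "installing"
        self.mailbox = deque()
        self.deferred = deque()
        self.handled = []

    def deliver(self, message):
        self.mailbox.append(message)

    def step(self):
        message = self.mailbox.popleft()
        if self.state == "installing":
            if message["type"] == "install-finished":
                self.state = "ready"
                # Replay skipped messages before anything newer, in arrival order.
                self.mailbox.extendleft(reversed(self.deferred))
                self.deferred.clear()
            else:
                self.deferred.append(message)
        else:
            self.handled.append(message["type"])

    def run(self):
        while self.mailbox:
            self.step()


charm = InstallingThenReady()
for kind in ["config-changed", "relation-joined", "install-finished", "update-status"]:
    charm.deliver({"type": kind})
charm.run()
```

The handlers for the ready state never have to ask whether the install has finished, which is what makes the life-cycle easier to reason about.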

Not a Silver Bullet

Obviously it isn’t like the Actor model would just magically solve our problems. A big issue right now is that the only way to attempt something like that would inherently eliminate the ability for any charms not built with the actor model to communicate with charms that are built with the actor model.

I’m not sure if I understood correctly, but I think that the Operator framework actually has this problem? That it only communicates well with other Operator charms? I may have misunderstood the work-in-progress getting-started doc I was reading on it. I would like some clarification on this.

Either way, I think that the way relations are structured today is sort of a problem from both a technical and a good-for-beginners perspective, but I don’t know that Juju can necessarily adapt to a new model without alienating the other charming frameworks and dividing the charm community, which would be tragic ( if it isn’t already going to happen with the Operator framework ).

Also, maybe the Actor model isn’t even a good idea. We think it is, but it hasn’t been tested. In the time that we are able to allocate for it, we are experimenting with our own actor model implementation that may one day find itself in Lucky or some other form of experimental charming framework, but we don’t know what the future looks like for that. We want to get some example of what Actor charms could look like and whether or not it is an effective strategy for writing charms.

Closing Thoughts


Forgive me for such a long post, but I wanted to be thorough. If you got this far, thank you for reading.

To reiterate, this was not written just to bash on Juju. This was written so that Juju and the community could benefit from hearing our honest opinion about the shortcomings that we have found in what is really a one-of-a-kind and amazing tool. If we didn’t think so, we actually wouldn’t spend the time to break this down like this!

I would encourage anybody who has any thoughts to quote the relevant pieces of this post, if any, and reply with their honest opinion. That is how Juju will improve, not by pretending that it is perfect when it isn’t, but by addressing what needs help. I fully hope that this will be helpful to Juju.


Interesting reading and I think many items you bring up have been raised many times.

  • The high learning curve.
  • Fragmented documentation
  • Weak Linux distribution coverage
  • Relation/interfaces difficult to understand.
  • GUIs are unstable and do not have a clear usage scenario.
  • The line between frameworks on top of juju, versus, juju core - is not easy to draw for new developers.

I do think it is changing slowly and taken seriously.


Wow, thanks for taking the time to write this. Such carefully thought out (and honest) feedback from experienced Juju users is worth its weight in gold.

There are a lot of interesting and valid points made.

There are things which have been recently under active discussion, like cross distro charms, and other things which are being worked on, albeit it takes time to get good results, like better doc.

Some things, like the preference for Python-based charms and hence the development of Python-based frameworks for writing them, are based on an opinionated point of view as to what best suits most people and what best gets bang for buck in terms of benefit vs cost of development etc. But the great thing is that Juju charms still support other languages like bash etc (as you well know :slight_smile: )

I won’t do justice to the excellent post by responding any more late on a Friday. Just wanted to leave a quick reply to say thanks for posting.


I also have to take some time to properly digest this post. But wanted to say thank you for taking the time to put together a thorough, detailed and useful perspective on Juju’s strengths and weaknesses.


On the GUI, I think there is room for parallel efforts as long as the underlying REST APIs are clean.

I like your focus on productivity - that is the main driver for us of the switch to a more tabular GUI. We found that the canvas approach, while helpful to show people who are used to whiteboarding architecture, wasn’t easy to make productive for operations. With a denser, more info-centric approach that we call the dashboard, I think we can achieve what you want, with keystroke and icon visualisations and in-place operations like access to logs.

I think the best way to progress your engagement on this is to have a detailed call with the design and implementation team for the dashboard. I think many of your ideas can be incorporated. If not, we can figure out how to keep the APIs you need to continue the current canvas approach further.

3 Likes

In terms of charming approach… I think we just have different perspectives, and that’s OK! I think the key is to keep the underlying Juju model language-neutral.

The driver for us focusing much more explicitly on Python in our new framework is that we have found that charms are real applications that should be tested and managed more like apps than integration code. Your experience may be different, and I don’t see a problem in that.

I would say that a key goal for us was to solve the documentation-of-interfaces problem by code reuse rather than documentation. Using the same language for both sides of the relation allows for nice code reuse between the authors of the charms that have to relate. We tried to do that in a language-neutral way in reactive, but the result, as you observed, was neither tasteful bash nor tasteful Python :slight_smile:

On this one I think we should run parallel efforts. You are happy with your approach, and we’re exploring a different one. I don’t think it’s something to try and predetermine now.

2 Likes

On cascading errors, yes, it seems we need to provide a way to choke off those problems and clear the issue from the active set.

On machine removal, yes it seems we need to adapt the thinking and be explicit about that.

On relation data, which is really interface documentation and debugging, the discussions we have had are twofold. First, focusing on code reuse so the details of the data changes are encapsulated in code which is written by one author for both halves of the relation. That means accepting a language-specific approach. And second, on making it easier to visualise and step through the evolving relation data, which would be part of making it easier to interoperate with an interface implemented by someone else in a different language.

The reason we feel code reuse is the more compelling approach is because it is easier to vendor and version code than data protocols.

2 Likes

This also leads to the discussion about formal specification of interfaces. I understand the desire to have clarity about interfaces. But it’s actually really hard to design a data messaging protocol well, especially one that involves multiple distributed systems interacting. If we expect charmers to do that really well, I think we will likely fail.

On the other hand, if we find a way for charmers to share code, and we encapsulate the (potentially messy) relation data inside code that one person has to make work, then everyone else can just use that code.
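
To make the idea concrete, a shared interface library might look roughly like the sketch below. This is only an illustration under stated assumptions: the `HttpInterface` class, its method names, and the `hostname`/`port`/`tls` keys are hypothetical and not part of any existing charm framework. The one Juju fact it relies on is that relation data is a flat map of string keys to string values.

```python
import json


class HttpInterface:
    """Hypothetical shared library that owns the wire format for one relation.

    One author maintains both halves, so charm authors never touch the raw keys.
    """

    @staticmethod
    def publish(host, port, tls=False):
        """Provider side: encode typed values as the string map stored in relation data."""
        return {"hostname": host, "port": str(port), "tls": json.dumps(tls)}

    @staticmethod
    def read(relation_data):
        """Requirer side: decode the string map back into typed values.

        Returns None while the other side has not published its data yet.
        """
        if "hostname" not in relation_data or "port" not in relation_data:
            return None
        return {
            "host": relation_data["hostname"],
            "port": int(relation_data["port"]),
            "tls": json.loads(relation_data.get("tls", "false")),
        }


# Usage: the providing charm publishes, the requiring charm reads.
data = HttpInterface.publish("10.0.0.5", 8080)
endpoint = HttpInterface.read(data)
```

If the encoding ever changes (say, `port` becomes a list of ports), only this one module needs a new version, and both charms pick it up by vendoring the update.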

Of course, this gets back to our divergent views on language neutrality. I think it’s good we’re exploring this problem from different perspectives.

2 Likes

All in all your write-up of your perspective is fantastic, thank you for taking the time, and let’s keep working on this together.

4 Likes

To be clear, the most valuable thing about the old GUI, to us, was not the canvas style, though that view was cool for presentation, but the ability to graphically control the GUI.

That sounds good. We did recognize that the density of the old GUI was not optimal for large deployments. My colleague ended up zooming way out in the web-browser tab so he could see the whole thing.

That would be great! My team would love to have a meeting. We use Jitsi Meet for our meetings and we could work out a time to have a web conference.

That sounds good. Me and my team have experience with both full-stack applications and web application/website frontends and we have entertained the idea of writing our own GUI prototype as well. As long as the API is there we will be free to experiment with the UX, and we can collaborate on any API changes that may be necessary.

Yes! I’m so glad that Juju was flexible enough that we could push it so far in the direction of our goals with the Lucky framework. And it is refreshing that Canonical and the community are welcoming to different approaches and ideas. :slight_smile:

I actually totally agree with that, and I was very pleased to see the new relation “mocking” setup that the Operator framework had for testing, something I definitely want to work into any frameworks that we make. We think, though, that writing charms demands its own application development tools, similar to how you have tools such as Cargo and NPM for Rust and NodeJS.

With the actor-based messaging strategy we think that this development workflow can be modularized in a way that is also language agnostic.

I agree. It is very difficult to tell how it will play out in practice before actually using it, and it also depends a lot on developer preference. I don’t think there is any value in trying to unify the approaches; for now, we should give both approaches some time to prove themselves out.

It does produce a little bit of a community split, maybe, with two different charming methodologies, but at the same time it allows different developers to take the path that is closest to the way that they think and work. Eventually it may be that the lessons learned from these frameworks will manifest in one framework that solves all of our problems. :slight_smile:

:+1:

Absolutely. Thank you for your responses as well. Your points have helped me understand better the motivation behind the designs you are currently pursuing.

We do have different perspectives which is actually good, because hopefully we collectively can represent a larger spectrum of potential charm developers and users of Juju.


It may seem ironic that we are already considering developing yet another charming framework (the design of which was outlined in [Idea] Actor Charming Framework For Juju) after producing Lucky, but Lucky was essential to helping us understand what we felt Juju still needed help with, even after solving a lot of our pain points.

We want to establish an easy-to-understand way to develop charms that is maximally cross-platform and language agnostic. We think the actor framework is a way to do that, but it needs proving out.

So while we use Juju and Lucky to make charms for now, we will experiment to see if we can bring Juju even further into our vision of a simple and powerful automation design framework.

Thanks Mark, and I’m glad you gave this thread such an extensive reply, which lends a lot of credibility to the attention Juju gets from a Canonical perspective. It provides a good indication that Juju is still in the game.

On the quote, I can let you know that I have had the pleasure of participating with the great design team twice already. I saw early work on the new GUI and had the chance to provide input and thoughts, although I have not yet been able to use it in a live situation.

I think much is to be won with a good GUI, but I think for Juju generally, the other items listed and mentioned are equally important to address.

I’m definitely interested in helping out there also moving forward.

3 Likes

Holy @#$%!

This post should be treated as the bible.

@zicklag you’ve perfectly verbalised the entire Juju experience from many perspectives. Developer, Researcher & Operator. :clap: You have an amazing talent for constructively relaying experiences in their entirety.

I am (and probably many others are) euphoric at everything that’s being discussed here.

2 Likes

Firstly I’d like to echo many of the others and say “Thank you” for taking the time to write up your experience. I’d like to comment on a few points, but that doesn’t take away from all the rest.

As I’m sure you have noticed, Juju will automatically retry hook errors. However there is a command that you can use to tell the Juju controller (and model) that you have handled the situation.

juju resolved --no-retry <unit-name>

This will effectively remove the error state from the unit for the current hook execution. Juju then moves on to the next thing it would run for that unit. Frequently this is another hook. However, sometimes hooks are written in such a way that the next hook also enters an error state. You can continue to run the resolved command to move the agent along from each hook, effectively telling it that you have handled it. Once the queue of hooks has been executed and resolved, the unit should enter its idle state. This would allow you to remove the application or upgrade the charm.

You may find that during the removal of the application, you’d also hit errors. You can mark these resolved too, and you should get to the stage where the application is removed without breaking the other applications it is related to.
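
If you find yourself repeating that command by hand, the loop can be scripted. The following is a minimal sketch, not an official tool. It assumes the `juju` CLI is on your PATH, that `juju status --format=json` reports each principal unit under `applications.<app>.units.<unit>` with a `workload-status.current` field that reads `error` while a hook has failed, and that the unit is not a subordinate. The function names, `max_attempts`, and `delay` are made up for illustration.

```python
import json
import subprocess
import time


def unit_in_error(status, unit_name):
    """Return True if the parsed status JSON shows the unit in the 'error' state."""
    app_name = unit_name.split("/")[0]
    unit = status["applications"][app_name]["units"][unit_name]  # assumed layout
    return unit["workload-status"]["current"] == "error"


def run_juju(args):
    """Run a juju CLI command and return its standard output."""
    result = subprocess.run(["juju", *args], check=True, capture_output=True, text=True)
    return result.stdout


def resolve_until_clear(unit_name, run=run_juju, max_attempts=20, delay=5.0):
    """Mark hook errors resolved, without retrying, until the unit leaves the error state.

    Returns the number of times `juju resolved --no-retry` was invoked.
    """
    for attempts in range(max_attempts + 1):
        status = json.loads(run(["status", "--format=json"]))
        if not unit_in_error(status, unit_name):
            return attempts
        if attempts == max_attempts:
            break  # still failing after the last allowed resolve
        run(["resolved", "--no-retry", unit_name])
        time.sleep(delay)  # give the agent time to move on to the next hook
    raise RuntimeError(f"{unit_name} is still in error after {max_attempts} attempts")
```

Calling `resolve_until_clear("mysql/0")` (a hypothetical unit name) would then step the unit past each failing hook. Note that this skips the failed hooks rather than fixing them, so it only makes sense when you have already dealt with the underlying problem or are about to remove the application anyway.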

This is definitely something I think we could do better. Perhaps we could allow setting a policy for the model, so that when you remove the last unit from a machine, Juju leaves the machine there. This should be very easy to implement.

As an operator of that type of model, you could still call remove-machine explicitly to get rid of the machine.

I like this idea, and it is definitely worth exploring.

There are other elements as well that I’m thinking on, but I wanted to explicitly comment on these.

1 Like

Isn’t this already juju model-config provisioner-harvest-mode=none?

Ah, that --no-retry is something I haven’t seen before. The problem I always had was that specifying resolved would only work for a second before it would immediately go back into an error. That’s good to know, thanks.

It would be cool if you could do a juju remove --force-resolve or something like that that would automatically go through and “gracefully” remove by ignoring all of the errors.

:+1:

Hello @zicklag, I’m Claudio from the Canonical design team. As mentioned by Mark and Erik, we are working on the new Dashboard and we would love to have a meeting with you and your team to discuss it more in detail. I’m sending a DM about it.
Thank you very much for your amazing feedback, it’s always meaningful to see such a great effort from our users.

@erik-lonroth we were planning to visit you in person, but of course now we need to do it remotely. We put together an agenda and some points to discuss with you and your teams. Are you available for a meeting next week to talk about it?

1 Like

Look, I would love to arrange a meeting with you again, together with my colleague @xinyuem, some time in the future. But at the moment, the covid19 situation leaves little room for us to engage.

How about scheduling this for later this autumn instead, to give the situation more time to pass?

1 Like

@erik-lonroth @xinyuem
Sure, no worries! In the meantime please let us know if you have any feedback.

2 Likes