Shrinking our binaries by 70%

Hi team, I thought I should share the results of a small experiment. We should consider running upx over our compiled binaries. This would add about 10 minutes to the build but should decrease the size of what we ship by at least 70%.

$ cp $GOPATH/bin/juju* /tmp
$ cd /tmp
$ upx -9 juju*
                       Ultimate Packer for eXecutables
                          Copyright (C) 1996 - 2018
UPX 3.95        Markus Oberhumer, Laszlo Molnar & John Reiser   Aug 26th 2018

        File size         Ratio      Format      Name
   --------------------   ------   -----------   -----------
 100954112 ->  27944028   27.68%   linux/amd64   juju                          
  52021592 ->  15768984   30.31%   linux/amd64   juju-blobstore-cleanup        
   3239584 ->   1231776   38.02%   linux/amd64   juju-bridge                   
   8822784 ->   3396680   38.50%   linux/amd64   jujuc                         
 124768256 ->  33763772   27.06%   linux/amd64   jujud                         
  51984728 ->  15762660   30.32%   linux/amd64   juju-force-upgrade            
  52091224 ->  15795124   30.32%   linux/amd64   juju-list-blobstore           
  85049344 ->  22593872   26.57%   linux/amd64   juju-metadata                 
   --------------------   ------   -----------   -----------
 478931624 -> 136256896   28.45%                 [ 8 files ]
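
As a quick sanity check on the table above, here is a minimal Python sketch that recomputes the savings from the byte counts upx printed. The sizes are copied from the output; the `SIZES` and `saving_percent` names are only illustrative. upx reports the packed size as a ratio of the original, so the saving is 100% minus that ratio.

```python
# (original_bytes, packed_bytes) copied from the upx table above.
SIZES = {
    "juju": (100954112, 27944028),
    "juju-blobstore-cleanup": (52021592, 15768984),
    "juju-bridge": (3239584, 1231776),
    "jujuc": (8822784, 3396680),
    "jujud": (124768256, 33763772),
    "juju-force-upgrade": (51984728, 15762660),
    "juju-list-blobstore": (52091224, 15795124),
    "juju-metadata": (85049344, 22593872),
}


def saving_percent(original: int, packed: int) -> float:
    """Percentage of the original size removed by packing."""
    return 100.0 * (1.0 - packed / original)


total_original = sum(orig for orig, _ in SIZES.values())
total_packed = sum(packed for _, packed in SIZES.values())

for name, (orig, packed) in sorted(SIZES.items()):
    print(f"{name:26s} {saving_percent(orig, packed):5.1f}% smaller")
print(f"{'total':26s} {saving_percent(total_original, total_packed):5.1f}% smaller")
```

The overall saving works out to roughly 71.5%, and the smaller binaries (juju-bridge, jujuc) compress noticeably less well than the large ones.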

Yes, the files are still perfectly fine. Naturally, they need to be decompressed before they can execute, so perhaps it’s not worth it for the clients. But in my opinion the agents would certainly benefit from the reduced file sizes.

See also: a blog post about using UPX with Go projects. (We already use -ldflags "-w -s" to strip debugging info from release builds.)

The results are pretty interesting, as a 70% improvement is certainly something
to note, especially when your binaries are sometimes over 100 megabytes
themselves.

I just don’t think that packing executables is a good thing to do for open
source projects that have a wide reach, and especially one that is used for
managing and provisioning critical infrastructure.

I like reverse engineering and, by extension, looking into how computer malware
works. Most malware these days that isn’t a toy is packed, usually with a
custom build of upx so that running “upx -d” with a vanilla build of upx fails,
which slows down malware researchers.

When I see a packed binary in the wild, I immediately think that someone is
trying to hide something, and the binary is likely malicious.
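
For anyone curious how a packed binary is spotted in the first place, here is a rough Python sketch of the simplest heuristic: stock upx builds embed the marker `UPX!` in the files they produce. The function name `looks_upx_packed` is made up for illustration. Custom upx builds like the ones mentioned above often remove or change this marker, so a negative result proves nothing, and a file that merely contains the string (upx itself, for example) would be a false positive.

```python
import sys
from pathlib import Path

UPX_MAGIC = b"UPX!"  # marker that stock upx builds embed in packed files


def looks_upx_packed(path: str) -> bool:
    """Heuristic only: True if the file contains the stock UPX marker."""
    # Reads the whole file into memory, which is fine for a one-off check.
    return UPX_MAGIC in Path(path).read_bytes()


if __name__ == "__main__":
    # Usage (hypothetical paths): python check_upx.py /tmp/juju /tmp/jujud
    for arg in sys.argv[1:]:
        verdict = "looks packed" if looks_upx_packed(arg) else "no UPX marker"
        print(f"{arg}: {verdict}")
```

Real AV engines use far more than a string match (entropy, section layout, behaviour of the stub), but even this trivial check is enough to single out an unmodified upx binary.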

AV companies are on the same wavelength these days, and will flag a packed
binary as suspicious. As we expect juju to grow and to enter more enterprise
environments, more and more users will run enterprise AV on their juju
controllers, on users’ client machines, or perhaps even on the deployed
machines themselves.

AV software really doesn’t like it when a binary runs a stub loader that
decompresses the real program and then executes it from memory. The program
running in memory differs from the binary on disk, which makes it difficult or
impossible to determine whether the in-memory program has been tampered with or
overwritten, and that can set off AV alarms about application integrity.

Packing can also change the program’s entry point, which causes problems down
the line when you want to debug your binaries, as machine addresses might no
longer match the debug symbols. That could be bad if a juju user hits a bug that
is critical to their production environment, because it would delay debugging
by SEG.

Decompressing the binary to run it also takes time, and at 100 MB a pop, this
would be noticeable to users. If a provisioned machine is running low on memory
and is battling the OOM killer, restarting a killed juju daemon gets harder, as
the unpacker needs additional space to read in the program, decompress it and
then jump to its entry point. That can sometimes cost as much memory as the
packed binary size plus the uncompressed binary size.
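
To put a number on that worst case, here is a small Python sketch using the jujud sizes from the upx output earlier in the thread. It only adds the two sizes together, as described above. It is an upper-bound estimate under that assumption, not a measurement of what upx actually allocates, and `peak_unpack_bytes` is just an illustrative name.

```python
MIB = 1024 * 1024

# jujud sizes in bytes, taken from the upx output earlier in the thread.
JUJUD_UNPACKED = 124768256
JUJUD_PACKED = 33763772


def peak_unpack_bytes(packed: int, unpacked: int) -> int:
    """Assumed worst case: packed image and unpacked image both in memory."""
    return packed + unpacked


jujud_peak = peak_unpack_bytes(JUJUD_PACKED, JUJUD_UNPACKED)
print(f"unpacked only : {JUJUD_UNPACKED / MIB:.1f} MiB")
print(f"worst case    : {jujud_peak / MIB:.1f} MiB")
```

Under that assumption, starting a packed jujud could briefly need about 151 MiB where the unpacked binary is about 119 MiB, which is exactly the wrong direction on a machine that is already short of memory.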

Since juju is really at the core of deployments, it is very important to think
about the security triad of Confidentiality, Integrity and Availability. I think
packing juju binaries could compromise the integrity and availability of juju,
and would introduce risks that outweigh the benefit of the disk space saved.

So, while packing can give you good results in reducing binary size, there are
good reasons why packing isn’t more widespread, and in the context of juju, I
think it is better to leave binaries as is, and live with large binary sizes.
