Containers, Now and Then
Posted on July 1, 2016
I’m excited to announce that Cuddletech is now 100% Docker Powered. This is particularly exciting for me because it comes after the site spent the previous 10 years hosted on containers. To celebrate, it seems like a good time to reflect on how containers have evolved over the last decade.
For those not familiar with Cuddletech, since 1999 it has been my personal website dedicated to all things Solaris (and Enlightenment, and my various books, etc.). I moved it to the Joyent cloud I helped to build in 2006 (it was called “Grid” then). I had been asked to come to Joyent because of my love of Solaris and to join in realizing the vision of “utility computing”, a vision that grew at Sun but was never realized there. We built the cloud on the Solaris Zones technology. I was able to use my connections within Sun to drive various technologies in directions that benefited Zones and, in turn, the cloud we were building.
Zones were an elegant solution to the problem of waste produced by solutions like Xen. At that time VMware was still getting its feet under it, but it was clear to everyone that as rack-mount servers grew in capability and resources they were far too large to be dedicated to a single application, and isolation was required to keep applications from becoming entangled. On the large SPARC servers we had Logical Domains (LDOMs) and other ways of carving up the system, but on x86 it was a free-for-all. Zones kept applications nicely separated and manageable, while avoiding any significant performance penalty.
Zones came in two varieties: Sparse and Whole Root. “Whole root” zones had a complete filesystem, independent of the host system (known as the “global zone”), which provided greater separation (you could write into /usr/bin, for instance) but at the cost of much duplication. “Sparse” zones, on the other hand, used read-only loopback mounts of the global zone’s filesystems inside the container, so only /etc needed to be regenerated as a writable copy, meaning each zone consumed only a couple of kilobytes of disk. With Sparse Zones you could spin up thousands of containers on even a modest computer.
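From memory, the Solaris 10 `zonecfg` defaults made the distinction concrete: a plain `create` used the sparse template, while `create -b` started from a blank, whole-root configuration. A rough sketch (zone names and paths are hypothetical):

```shell
# Sparse zone: the default template inherits /lib, /platform, /sbin, and
# /usr as read-only loopback (lofs) mounts from the global zone.
zonecfg -z sparse01 <<'EOF'
create
set zonepath=/zones/sparse01
EOF

# Whole root zone: start from a blank template (-b) so nothing is
# inherited and the zone gets its own writable copy of every filesystem.
zonecfg -z whole01 <<'EOF'
create -b
set zonepath=/zones/whole01
EOF

zoneadm -z sparse01 install   # fast and tiny: most of the OS is shared
zoneadm -z whole01 install    # copies the full package set into the zone
```

These commands only run on a Solaris host, so treat this as an illustration of the shape of the configuration rather than a tested recipe.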
The party kicked into over-drive when Solaris welcomed ZFS and Crossbow (the Solaris network virtualization stack). ZFS provided each container with fully isolated storage (a “dataset”) which could be snapshotted, cloned, rolled back, and so on. Crossbow provided dedicated virtual network interfaces plugged into a virtual switch. As time went on we got even more features, such as increasingly robust resource controls around CPU (including “bursting”), memory, network throttling, I/O throttling, and more. It was exciting to be involved, using the Joyent Cloud as a proving ground for the amazing technology Sun was producing.
I would famously grill new administrators at Joyent with the question “What is a Zone?”, to which the proper answer was that it was three things:
- A Definition (/etc/zones/myzone.xml)
- A Registration Entry (/etc/zones/index)
- A ZFS Dataset (/zones/myzone)
The real power was in the ZFS Dataset. Because we could snapshot easily and with virtually no overhead, we could easily migrate zones from one location to another, enable backups, etc. Most resource controls were embedded in the Zone Definition, which meant that moving a zone from Point A to Point B was as simple as a zfs snapshot, a zfs send/recv, a copy of the definition, and adding it to the registry. Roll that pattern into a script and you could move zones around the datacenter in just the time it took to move the data across the network; once we improved the scripts to take multiple snapshot passes and send only the deltas, transfer times dropped to seconds.
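That migration pattern can be sketched in a few commands. Hostnames, the zone name, and the exact index entry format here are illustrative, written from memory rather than a live system:

```shell
ZONE=myzone
TARGET=newhost

# 1. Snapshot the zone's dataset and stream it to the target host.
zfs snapshot zones/$ZONE@migrate
zfs send zones/$ZONE@migrate | ssh $TARGET zfs recv zones/$ZONE

# 2. Copy the zone definition.
scp /etc/zones/$ZONE.xml $TARGET:/etc/zones/

# 3. Register the zone on the target host.
ssh $TARGET "echo '$ZONE:installed:/zones/$ZONE' >> /etc/zones/index"

# Optional: a second, incremental pass shrinks the final cutover window
# to roughly the size of the delta.
zfs snapshot zones/$ZONE@migrate2
zfs send -i @migrate zones/$ZONE@migrate2 | ssh $TARGET zfs recv zones/$ZONE
```

The incremental `zfs send -i` pass is what turned minutes of transfer into seconds: only the blocks changed since the first snapshot cross the wire.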
Sparse Zones were amazing.
But as strange as it may seem now, they weren’t as appreciated as you’d think. There was one fundamental flaw in our use of containers at that time… we treated them as nothing more than really fast, lightweight VMs.
I went to battle with several Sun execs (who shall remain nameless) who nearly scrapped Zones in 2008, wanting to replace them with a port of Xen to Solaris (which they did under the name “xVM”). I supported the initial plan to port Xen because at Joyent we were losing business because we didn’t support Linux and Windows alongside Zones (many customers wanted to put custom apps on Zones but still required Windows for AD, or Linux for something that couldn’t be ported or wasn’t supported on Solaris). However, as it became clear to me that Xen was intended to ultimately replace Zones, I lashed out against the move.
The battle intensified when I learned that Sparse Zones were being removed from Solaris in favor of Whole Root only. That battle was ultimately lost: Sparse Zones were removed from Solaris, and Whole Root was later extended with “Kernel Zones”, which, thanks to ZFS Boot Environments, carried a completely separate installation of Solaris. From my perspective at Joyent, running a cloud built on the flexibility of Sparse Zones, it was an outrage, and for several years we were “stuck” on OpenSolaris Build 121 (“snv_121”; insiders will note this was also caused by a variety of other factors, such as ZFS de-duplication breaking all sorts of things, and problems in some of the later builds thereafter). Only when Oracle bought Sun and Joyent acquired many of Sun’s best engineers were we able to help fork OpenSolaris into Illumos and then SmartOS, and re-institute Sparse Zones as “Joyent Branded Zones”.
The history lesson here is to drive home the point that most of the world, even at Sun, insisted on viewing Zones as a lightweight VM, and an ineffective one at that. This is why many pushed hard to make Zones more and more like regular VMs, with the overhead of a full OS install, boot partitions, and kernels.
But even those of us at Joyent didn’t realize how much we continued to view Zones as VMs until LXC gave way to Docker.
Even today, in 2016, Linux has not caught up to what Solaris had nearly a decade ago; the container technology Linux provides doesn’t hold a candle to it. But it wasn’t actually technology that made Docker something new, it was a worldview we hadn’t fully appreciated.
While Docker loves the “shipping container” analogy, I prefer to think of them as game cartridges. Plug an app in, pull it out and replace it with a new one, hit the power button, and everything resumes because everything around the cartridge remains unchanged. This splits application deployments into small, composable, re-usable elements. Upgrades are simple, and I only need to back up my application data, not the application itself.
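The cartridge swap looks something like this in practice (the image and volume names here are made up for illustration):

```shell
# Application state lives in a named volume -- the "console" that stays put.
docker volume create blog-data

# Plug in version 1 of the app "cartridge".
docker run -d --name blog -v blog-data:/var/lib/app myapp:1.0

# Swap cartridges: remove the old container, start the new image.
# The volume, and everything else around the container, is untouched.
docker rm -f blog
docker run -d --name blog -v blog-data:/var/lib/app myapp:2.0
```

The container is disposable; only the named volume needs backing up, which is exactly the upgrade story described above.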
I love the Docker model because it gets us closer to the ideal espoused in 2009 by Chef co-founder Adam Jacob as the definition of “Infrastructure as Code”:
“I define it through its primary goal: Enable the reconstruction of the business from nothing but a source code repository, an application data backup, and bare metal resources.
Secondarily, there is another ideal here – the thing that should take the longest in this recovery process is the time it takes to restore your application data.”
CFEngine, Puppet, and Chef each inched us closer to the ideal of Infrastructure as Code, but in many environments they struggled to fully solve that secondary goal because the ways in which we installed our applications were inefficient: Chef could only go as fast as your applications’ installation mechanisms could go. The analogy here is building a house on location versus trucking in a pre-fabricated home. With the Docker paradigm available to us, we can now use Chef to build the solid foundation and then drop-ship the house. If it burns down, we just drop a new house onto our firm foundation, and we solve the secondary problem by allowing ourselves to rebuild as fast as our data can move.
In the Solaris world, particularly as envisioned by Joyent, we enjoyed Adam’s vision by means of Sparse Zones, ZFS, and Crossbow. Today, in the Linux world, we approach this vision more slowly. Docker gives us flexibility in application deployment, we still require configuration management to provide the foundations for Docker’s runtime environment, and thanks to Linux’s flexibility we can bring containers off the server and leverage them on our Mac or Windows laptops.
Sadly, we still have poor options for volume management and networking. The storage problem was “solved” by the “stateless applications!” propaganda machine, which contains only a fragment of truth in certain situations, but just enough to keep it alive. Networking has a variety of solutions, from overlay networks to service discovery to the service-specific IPs provided by Kubernetes, but they are still only partial, and not at all universally agreed upon.
Despite the challenges faced by the Docker ecosystem, I think we do all agree that the idea of treating containers as isolated, self-contained application runtime environments rather than as VMs is the right path. In this context, Microservices and Serverless seem like very logical progressions of the concept.
Today, in 2016, Docker has realized the vision of the Java JAR, a ready-to-run application with everything it needs, which can run anywhere provided a lightweight runtime is available. But this is only the first battle.
It was evident at DockerCon 2016 that the legendary “Ops/Dev” wall continues to exist. What Docker gave us was a small slot in that brick wall, just large enough to push a container through. Yet further work is required to bring down that wall… and bring it down we must.
I’m excited to embrace the future of containers. I have no doubt that ZFS will continue to come to Linux and solve our storage problems, and networking will continue to improve. Clustering and schedulers will continue to evolve and converge, further shrinking the Dev/Ops wall and perhaps even one day destroying it. New technologies for building intelligent containers, such as Habitat, will improve and make our applications not only self-contained but also self-aware. The future is indeed very bright.
… and so, because I love this future so much, today I moved Cuddletech off Joyent and onto a system running four Docker containers. 🙂