Over the last couple of days I added a bunch of SmartOS documentation on getting the major configuration management solutions working. Pointers for CFengine3 and Puppet are there with the basics on getting started. I added extensive documentation for Chef, suitable for even users entirely new to Chef. I’ve also populated a GitHub repo with cookbooks, Knife bootstraps, and a full framework for using Chef Solo with SmartOS.
Go find all the docs in the SmartOS Wiki: Configuration Management on SmartOS
“Pointers for CFengine3 and Puppet are there with the basics on getting started.”
You guys picked pkgsrc, I presume on the grounds that it can build components from source and is not screwed up in the %pre(un) and %post(un) installation order like RPM… was it really so hard to do the last 10% and start encapsulating automation inside of pkgsrc packages, then check them into some SCM like git or Mercurial, and plug them into some network installation technology like JumpStart?
Realize that if you had to implement a recipe-based framework to undo a previous hack gone wrong on the system or systems, you have already lost:
- someone has hacked the system via Puppet or Chef, indicating lack of process (dev, test, product acceptance, production), deployment windows, change management tracking system;
- no ability to plug in discrete, encapsulated automation (“components”) into a network provisioning framework like JumpStart, Kickstart, etcetera.
I fail to see how sharing pointers to installing CFengine3 or Puppet opens up the floor for a bitch fest about PKG-SRC… but so be it.
First, Jumpstart’s dead any way you cut it. Solaris 11 or Illumos, Jumpstart is dead. So let’s just ignore all that. And Kickstart, wtf does that have to do with SmartOS? PKG-SRC is used by SmartOS and we use none of these styles of installation; in fact we don’t “install” at all… so any reasoning down that line is fundamentally flawed.
Secondly, our choosing to use PKG-SRC had nothing to do with pre or post install scripts. Unlike IPS, eliminating them was not an architectural decision. We chose PKG-SRC because we have a strong BSD heritage and it was an excellent cross-platform solution for building packages while maintaining compatibility with the PKG-SRC community at large. We’re not trying to compete with RPM or Deb or any other system.
Thirdly, it seems you have a horribly broken understanding of declarative configuration management in a non-persistent and/or cloud based architecture.
I’m all for a good fight… feel free to better frame your argument and let’s duke it out. Please email me and we can have a more involved discussion, or if you’d like we can hook up on Skype some time and talk it out.
Ben, thanks for this, it is really useful.
Any chance you could skim over installing a monitoring solution – e.g. Zabbix – in the global zone?
Cheers.
Sure, I’ll put it on my todo list.
Any chance of sharing a decent SmartOS Zabbix template as well?
Hi Ben,
I have been a user of Solaris JumpStart for some time and want to start using Chef. You have previously mentioned that the initial learning curve on Chef is rather steep.
Therefore I am grateful to you for being so “open” in providing information about how Joyent is using Chef internally on SmartOS.
Being “open” is paramount. It is what will attract more and more developers, especially young developers, who are the key to influencing potential users and companies to start using SmartOS.
Being “closed” turns developers off. Keeping developers in the dark is a form of mushroom management in which once the lights are turned out the only thing that grows are mushrooms not market share.
It seems that the Sun is once again rising through the illumos community and SmartOS.
Keep that good “open” stuff coming.
“fail to see how sharing pointers to installing CFengine3 or Puppet opens up the floor for a bitch fest about PKG-SRC… but so be it.”
I have come to the conclusion that you are very much a proponent of Chef, and based on my experience, doing orchestration via automated hacking solutions, or as you call them “declarative configuration management” is not the optimal way to control a large number of systems.
To frame the discussion, what advantages does Chef offer over encapsulating automation inside of OS packages?
With Chef, Puppet or any other automated hacking solution I have no way to ask the operating system which version of the component or components I am running, and I have no way to ask the operating system which files the component contains, nor whether someone has modified those files in any way (as software management subsystems maintain an integrity check of the files). So in a way, my visibility into what is on the system is limited when I troubleshoot production, whereas if I can query the OS in a uniform way (pkginfo, pkg_info, rpm -q…) I have a consistent and simple interface to find out what is on the system without intimate knowledge of how it all works or where it is. The computer will tell me. It has all the information and it is easy to obtain.
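To make the integrity-check idea concrete, here is a minimal Python sketch of what such a check amounts to: record a checksum for each file at install time, then compare later. It is not how pkginfo, pkg_info or rpm actually store or verify their data; the function names (sha256_of, build_manifest, verify) and the choice of SHA-256 are assumptions made purely for illustration.

```python
import hashlib
import os


def sha256_of(path):
    """Return the hex SHA-256 digest of a file's contents."""
    with open(path, "rb") as fh:
        return hashlib.sha256(fh.read()).hexdigest()


def build_manifest(paths):
    """Record a checksum per file, as a package database might at install time."""
    return {path: sha256_of(path) for path in paths}


def verify(manifest):
    """Return the sorted list of paths that are missing or whose contents changed."""
    changed = []
    for path, digest in manifest.items():
        if not os.path.exists(path) or sha256_of(path) != digest:
            changed.append(path)
    return sorted(changed)
```

Calling verify() with an untouched manifest returns an empty list; any file edited by hand or removed since build_manifest() ran shows up in the result.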
“First, Jumpstart’s dead any way you cut it. Solaris 11 or Illumos, Jumpstart is dead.”
JumpStart might be dead in your world, but there are still plenty of government and financial institutions that live and die by it. Someone has to provide support for them, and that someone obviously is not Oracle. So JumpStart is dead for Solaris. Fine. What is the alternative, and how clearly has that alternative been communicated to everyone? All I saw and heard so far was Brian saying “we do PXE boot of SmartOS from a central server”.
Great. Is that documented anywhere so that the rest of us can use it, work on it, improve it, get our feet wet with the technology? I am no stranger to PXE and DHCP, having extended a DHCP client to support suboption 43, but if there is one thing I would like to avoid at all costs, it is reverse engineering and hacking. I can do it and I’m pretty decent at it, but I hate doing it because it usually ends up being a huge tangent to what I am trying to do, and my guess is, I am not alone.
“And Kickstart, wtf does that have to do with SmartOS?”
My point is that if one encapsulates automation inside of an OS package, one immediately gets the benefit of plugging it into the network installation framework. Whatever that network installation framework happens to be, it does not matter what it is. The concept is key here.
You have basically sidestepped the core of my question. To orchestrate via Chef, one writes recipes, so we can look at it as scripting automation on a large scale.
My question still stands: why didn’t you encapsulate that automation inside of OS packages? Let’s have a technical discussion. Did you find anything in OS packaging that technically prevented you from encapsulating work into OS packages?
“Unlike IPS, eliminating them was not an architectural decision.”
OK, fine. I am just asking because I am curious. Remember, I am looking at what you guys are doing from the outside (although I would certainly not mind looking at it from the inside given the opportunity). So to me as an outside observer it seemed curious, neither wrong nor correct, why pkgsrc was picked, and nowhere is that explained. So I am curious, and nobody is saying anything about that.
If one wants an answer, the best thing is to ask. So I’m asking.
If I have a “horribly broken understanding of declarative configuration management”, please teach me. For example, how is learning a whole new language for a system that blindly overwrites configuration advantageous over keeping a bunch of discrete, self-contained components in operating system packages? Wouldn’t it be easier to be able to deploy anything with just pkg_add, than to have to know the intricacies of each individual recipe?
I’m all for comparing notes. Have at it.
Where does the term “automated hacking solution” come from in relation to configuration management? A quick Google doesn’t find it in use anywhere, and I suspect your use of it implies a profound misunderstanding of what these systems do.
Chef simply hacks configuration files based on a “recipe”. That is in essence what it does, and it does it on a massive scale. It can also start and stop processes on a mass scale, and it can overwrite a file which does not match what it is told that a file should contain.
As you can imagine, I have a huge problem with a daemon arbitrarily overwriting a set of files it has been told to watch over, because to me, that means someone has logged onto the system and changed something manually. And that in turn signals that process is deficient.
In my view, other than to perhaps set up storage, nobody should ever be allowed to log into a system for any reason, let alone hack any files by hand so that a solution like Chef would have to overwrite them. That means that some work should have been packaged and it was not, or that some framework that a component could call needed to be designed and was not.
Solutions like Chef and Puppet are treatments for a symptom, not a cure for the root cause. That is my experience anyway.
Now, if I missed anything, feel free to correct me.
I have an idea. Pick a small configuration file for something, and let us each devise our own solution: you write a Chef recipe, I will make a SVR4 package. Then we will have something solid to look at and we can discuss scalability. Please pick some simple file and provide what the configuration inside of it should be, and let’s have at it. Then rather than philosophically, we can compare and contrast real solutions.
Let’s first understand that SmartOS is a distribution of Illumos intended to do one thing and do it extremely well… virtualization. It doesn’t compete with Ubuntu or Solaris, it competes with ESXi. There is no native packaging for OS installation because there is little to do there. It can be done by bootstrapping PKG-SRC; in the JPC I don’t do it, but others do, and that’s each administrator’s decision to make.
On to Configuration Management (CM) versus configuration rolled into packaging. This is not a new or unique conversation; in fact it was hotly debated at DevOps Days Mountain View in 2010 (pretty sure it was 2010). Some people were packaging their configuration up and overlaying it as a version controlled RPM/Deb; there were others who were adding it to packages as pre-/post-install scripts. The debate was fierce. Some people felt, as you obviously do, that the software and the configuration were inextricably linked and therefore should be managed as a single asset. Others felt, as I do, that the system’s configuration should be entirely independent of the packaging system so that it was both portable and consistent. There are advantages and disadvantages to both approaches. Package based configuration is extremely precise, however it is also extremely inflexible. If you’re managing a medical device (several do) the consistency is extremely important and flexibility isn’t. However, if you’re deploying a web service on Joyent and then wish to deploy a cold standby on SoftLayer, flexibility is of prime importance. So let’s be clear, the choice you make depends on the problem you’re solving.
You clearly have a very old and primitive view of Configuration Management, which is not uncommon. That is, you see it as little more than a hodge podge of shell scripts in a specialty language not worth your time to learn. Many have felt this way… “I can do all this in a bash script, screw Puppet” or the like.
The power of CM is in its consistency, idempotence, and versatility. With Chef, for instance, attributes can be applied to a server or group of servers to tailor a single generic cookbook for the task at hand. I have different Zabbix servers in each of my data centers, but I use a single cookbook for all of them; I just change the “zabbix/server” attribute as appropriate and I’m done. More importantly, I can change that configuration in seconds by simply updating the attribute file and having Chef re-run. If that configuration was in a package I’d have a heart attack reinstalling or updating the zabbix-client package.
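As a rough illustration of the attribute idea (in plain Python, not Chef’s Ruby DSL), one generic template can be rendered with per-data-center values. Everything named here is hypothetical: the data center names, the example.com hostnames, the zabbix_server key and the render_agent_conf function are made up for the sketch, and the two config keys are only a stand-in for a real agent configuration.

```python
from string import Template

# One generic template shared by every data center (a stand-in for a cookbook template).
AGENT_CONF = Template("Server=$server\nHostname=$hostname\n")

# Hypothetical per-data-center attributes; names and hostnames are invented.
ATTRIBUTES = {
    "dc-east": {"zabbix_server": "zabbix.east.example.com"},
    "dc-west": {"zabbix_server": "zabbix.west.example.com"},
}


def render_agent_conf(datacenter, hostname):
    """Render the shared template using the attributes of one data center."""
    attrs = ATTRIBUTES[datacenter]
    return AGENT_CONF.substitute(server=attrs["zabbix_server"], hostname=hostname)
```

Pointing a data center at a different Zabbix server then means changing one value in ATTRIBUTES and rendering again; the template itself is untouched.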
Keep in mind the type of automation you’re performing. If it’s just adding a user for a daemon, sure, that goes in the package post-install… if you’re configuring which of 10 different syslog aggregators it should point to, that’s not so easy.
Finally, keep in mind that CM is and should be idempotent. That is, you can re-run it over and over and it will perform no function unless it needs to be done. For instance, if a config file already matches the configuration I want, it does nothing; otherwise it makes it (declarative) right. Therefore, if I want to ensure everything in my datacenter is configured properly I just let my CM tool run and if everything is good, nothing happens; if anything is wrong, it fixes it. This is something you CAN NOT DO in a Jumpstarted environment. Trust me, I had a world class Jumpstart infrastructure for a very long time… everything was perfect when deployed, but after 6 months validating that things hadn’t drifted was nearly impossible and fixing anything that had changed was a daunting task.
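The idempotence described above can be sketched in a few lines of Python. This is not Chef’s implementation, just the general check-then-converge pattern; the function name ensure_file and its True/False return convention are assumptions for the example.

```python
import os
import tempfile


def ensure_file(path, desired):
    """Make the file at `path` contain `desired`; return True only if a change was made."""
    try:
        with open(path, "r", encoding="utf-8") as fh:
            if fh.read() == desired:
                return False  # already in the desired state, so do nothing
    except FileNotFoundError:
        pass  # a missing file counts as drift and gets created below

    # Write to a temporary file in the same directory, then rename it into place,
    # so a reader never sees a half-written config.
    directory = os.path.dirname(path) or "."
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        fh.write(desired)
    os.replace(tmp_path, path)
    return True
```

The first run on a fresh system returns True, every later run returns False, and a run after someone edits the file by hand returns True again while putting the declared content back.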
If that doesn’t convince you I don’t know what else to say. CM isn’t the future any more, it’s the present accepted standard. Those who haven’t moved to CM are now at least 2 years behind the curve. CM is industry standard in all but legacy environments and many of those will backfill it simply for sanity’s sake in the future.
There are many points you have made here which I would like to address, but before I do, I want to make sure I am understood in what I am writing about. This is why I have offered to write an example package. Once I do that, I will be happy for you to come and rip me apart if you still disagree with me. But the code will be there, and then what I mean will be clear.
Please give me a configuration file, and tell me what it needs to look like after the fact. I will implement each and every point you made in a package. Then feel free to rip me apart if you still think what I did is incorrect and will not scale.
I think that you and I want the same thing, but that we are speaking about completely different things, and want to get that out of the way first. What do you say?
No… it’s not worth my time. This is a troll, not a discussion. If you want to understand the practical advantages and disadvantages of both approaches you can do that work yourself. I did many many moons ago.
I did, and I found Chef to be the wrong way to do things. Needless to say, I view using “orchestration frameworks” instead of integrated change, asset and deployment management solutions as automation of hacking a file in vi, just on a massive scale.
You don’t have to argue with me, just give me a file and let me demonstrate my way, then do (or not) what you want after that. I still think you do not understand what I am talking about, and I think that because you mention things like “SmartOS is a hypervisor” — we all know that. It has been beaten to death enough times. It’s a hypervisor. It’s not perfect, but it’s the best. That’s not the issue.
And I’m not a troll; all I am asking you to do is provide a simple configuration file, and I can demonstrate.
First you say “let’s duke it out”, then you say “it’s a troll”. If I don’t see it your way (and people often disagree), then we’re suddenly not “duking it out” any more.
I am not interested in a philosophical discussion because I do not want to argue. I want to compare and demonstrate. Let the code speak.
I may be wrong but ISTRM having read his pseudonym before, on the c0t0d0s0 blog…
Why anybody would package up a set of configuration-files is beyond me, though.
Maybe for special circumstances (interstellar satellite?) – but not in a dynamic environment like cloud-hosted services – or just a “simple” datacenter with a few hundred servers.