Archive for the ‘Management’ Category

Writing a Better SOP

Tuesday, September 25th, 2012

Within an ops team you should have 3 primary types of governance enablers: controls, policies and processes. A control is a guiding principle, which is implemented as one or more policies (which are just rules), which are in turn standardized in a set of procedures. It’s important to have all 3: controls are very vague and policies are often general and broad in nature, which means that to provide consistent, quality results we require prescriptive procedures. At Joyent we call these “Standard Operating Procedures” (SOP).

The whole point of an SOP is to produce consistent results regardless of who’s using it. That means that all SOPs need to be in a similar, familiar, and easy to follow format that is suitable to anyone who may need to use them. To get those consistent results there can be no room for ambiguity; an SOP must be explicit and convey any necessary context along with it. Ambiguity is the mortal enemy of consistency. Case in point: if you’ve ever been asked to recompile software with a large number of configure flags and you’re unable to determine which flags were used in the past, you’ll go cold with anxiety over whether or not you’re building it properly. When you go back and ask how it was built in the past someone might say “Don’t you know how to compile software?” and the answer is likely going to be “Yes I do, but I don’t know how YOU compile software.” What’s important is that the person implementing a procedure be given all the information and context necessary to understand and, if necessary, interpret the information as appropriate for the given situation.

The first key to better SOPs is to provide a template for others to follow. Without a standard template each author will write the procedure in their own unique style. Some people will write you a book, others will just paste some lines from their terminal into a code block. The template therefore must enforce a certain flow that ensures we include all the needed information in a concise and complete way. Plus, we want SOP creators to focus entirely on writing the content, not debating the format.

Here is the template I use for Joyent Operations SOPs (in Confluence markup):

* Author:  {page-info:created-user}  created at: {page-info:created-date}
* Version: 1
* Revisions: {page-info:current-version}
* Reviewed by: (User @ date)
* Time to implement: 1hr
* Products this applies to: (SKU1)

{toc}

h1. Description & Scope

h1. Prerequisites

* Root access to node
* [SOP-222: Something|SOP-222: Something]
* [SOP-224: Something else|SOP-224: Something else]

h1. Procedure

h3. Step 1: Do this

{noformat}
Example
{noformat}

h3. Step 2: Do that

h3. ...

h1. Procedure Validation

# Login and verify external connectivity (ping google.com)
# curl zone IP address, page returns
# etc.

h1. Notes/Jira Examples

* [http://confluence.atlassian.com/display/DOC/JIRA+Issues+Macro]
* [http://confluence.atlassian.com/display/DOC/JIRA+Portlet+Macro]
* [https://studio.plugins.atlassian.com/browse/CONFJIRA-154]

Let’s step through the above template.

All SOPs must be numbered for easy reference. Even the template itself is SOP-000. The SOP title is in the form: “SOP-102 Creating LDAP Users”, for instance.
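A numbering convention like this is easy to check mechanically. As a minimal sketch (the regular expression and the `parse_sop_title` helper are hypothetical names for illustration, not Joyent tooling, and the "at least three digits" rule is an assumption based on SOP-000 and SOP-102), a title in the form above could be validated and split like this:

```python
import re
from typing import NamedTuple

# Hypothetical pattern for titles like "SOP-102 Creating LDAP Users".
# Assumption: at least three digits, and an optional colon after the
# number so the "SOP-222: Something" link style is accepted too.
SOP_TITLE = re.compile(r"^SOP-(\d{3,}):?\s+(\S.*)$")


class SopTitle(NamedTuple):
    number: int
    name: str


def parse_sop_title(title: str) -> SopTitle:
    """Split an SOP title into its number and its descriptive name."""
    match = SOP_TITLE.match(title.strip())
    if match is None:
        raise ValueError(f"not a valid SOP title: {title!r}")
    return SopTitle(number=int(match.group(1)), name=match.group(2))


print(parse_sop_title("SOP-102 Creating LDAP Users"))
```

A check like this could run whenever a page is created, so that a mistitled SOP is caught before anyone has to go looking for it.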

The top of the SOP is full of metadata: the author, creation date, major version number, number of revisions made, and the products (or projects or whatever) that this SOP applies to. You’ll notice 2 other fields: “Reviewed by” and “Time to implement”. These are perhaps the most important of all. After an SOP is created it must be reviewed by someone else in the group, preferably with as little knowledge of the subject as possible. They should read and follow the SOP as written, starting a timer when they begin and stopping the timer when they are complete… it is that stopwatch time which becomes the “time to implement”. This is extremely important: the time estimate for implementation by the author will be way too short because they know what they are doing, while the time it takes a complete n00b will be more useful and truthful.

Moving on through the template, “Description & Scope” is where we provide context. What are we talking about, what does it entail at a high level, what does this impact, etc. We want to include as much information as possible to set the stage for the procedure that follows. Then we include a bulleted list of “Prerequisites”. The single most common part of any procedure that gets skimped on is the prerequisites, and they are also generally the most time consuming.

The meat of the SOP is the procedure itself. I strongly believe these must be in a “Step 1… Step 2… Step 3” format; it must be easy and intuitive to follow and in some cases may be used as a checklist during sensitive procedures. It’s important that these truly start at the beginning and go to the end. “Step 1: Login to server X” may be overly simplistic but necessary for clarity if multiple machines are involved. I also like to have the final step be “Done” to make it clear that you have reached the end.

Just as important as the procedure is the “Procedure Validation” section… to ensure a quality job we must not only perform the procedure but validate it in one or more ways to ensure it was really done right. This has the added side effect of giving the person doing the work the satisfaction that it was done properly and they didn’t screw something up along the way.
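When the validation steps can be automated, they can be treated as a checklist in which every item must pass. The following is only a rough sketch under stated assumptions: `run_validation`, `all_passed` and the stand-in checks are hypothetical names, and real checks would do things like ping an external host or curl the zone IP, as in the template.

```python
from typing import Callable, List, Tuple

# A check is a human-readable description plus a callable
# that returns True on success.
Check = Tuple[str, Callable[[], bool]]


def run_validation(checks: List[Check]) -> List[Tuple[str, bool]]:
    """Run every check in order; a check that raises counts as a failure."""
    results = []
    for description, check in checks:
        try:
            passed = bool(check())
        except Exception:
            passed = False
        results.append((description, passed))
    return results


def all_passed(results: List[Tuple[str, bool]]) -> bool:
    """The procedure is validated only if every single check passed."""
    return all(passed for _, passed in results)


# Hypothetical stand-ins that always succeed, purely for illustration.
example_checks: List[Check] = [
    ("External connectivity", lambda: True),
    ("Zone IP returns a page", lambda: True),
]

for description, passed in run_validation(example_checks):
    print(f"[{'PASS' if passed else 'FAIL'}] {description}")
```

Running every check rather than stopping at the first failure is a deliberate choice here: the person doing the work sees the whole picture in one pass.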

Lastly is a place to include external links as appropriate. If possible I like to link in tickets (we use Jira) which have relied on the SOP before, so that if by chance there is some confusion they can find examples of the work being done in the past.

An optional section that I’ve used before is a “Rationale”. In this section you would include notes on why you chose to implement the procedure in the way that you did. This allows for continuous improvement of the SOP. In most cases there are many ways to solve a problem; conveying why you chose the method you did will help you hone the procedure in the future while learning from the past. Without it you’re likely to have regression or duplication crop up.

This is the model that we’ve used at Joyent for several years and it has stood the test of time. I believe it to be a very solid standard for writing SOPs, sharing knowledge within the organization, and avoiding any one single person becoming a constraint. If you have refinements or a better method, I’d love to learn about it.

Policy & Process in the Blood

Saturday, April 14th, 2012

I’m highly introspective… far more than I would actually like to be. I’m one of those strange individuals to whom if you said “Do you realize you’re being a jerk right now?” I’d actually admit “Yes, I’m sorry about that, I’m trying unsuccessfully to find a way to rectify it.”

Despite that obsessive level of awareness, nothing can tell you more about who you are than your children. In particular, by observing things your children do that you never taught them, things they just started doing of their own accord because “it seemed right”. Genetics at work.

I fight frequently with people about documenting processes.  But maybe I’m just anal?  Then the other day my son comes to me and shows me this:

This is Glenn, my eldest son (6 years old).  He wanted some lemonade, but mom and I were busy.  He decided it might help if he simplified his request into a process.  You can see here that we start with a bottle of lemonade, then we pour it into a glass, then WHAMO!  we have our amazingly refreshing beverage to enjoy.  It is the perfect process with an input, output, and processing in the middle.  Brilliant, and he hasn’t even been to business school yet.  How much simpler does process get?

What about policy? Policy is just a business word for “rules”, nothing more. In my opinion, the world’s most amazing and effective policy is this one:

That yellow line is policy. It’s not a brick wall, but we treat it like one. Thanks to that little bit of paint two cars can drive towards each other at 70 MPH, passing with only 6 ft between them, without fear. It doesn’t get simpler or more powerful than that.

Parents and authority figures in general tend to layer into a child the concept of right and wrong as absolutes. Take the cookie and you shall be punished, so don’t take the cookie! All throughout our culture we do this: define rules and corresponding punishments. The result is a general fear of rules, because they are seemingly there for the sole purpose of justifying punishment.

Any rule, any law, any policy, can be viewed as a guide or as a guillotine. When I asked many of my peers what they thought about policy a surprising number quickly answered “It’s there so that you can fire people.” It’s shocking how many people believe that. One would think that policy is there to enforce lessons learned in the past, as a guide for decision making, as pre-computed solutions to problems which might be difficult or conflicting. So then why is it that policies are considered simply a justification for punishment? Inconsistency of course… everyone seems to ignore, discount, or outright disregard policy on a day-to-day basis and it only comes to people’s attention when someone is being called out.

Policy and process are wonderful things. At least, they can be. They are the means by which we share knowledge within an organization. Common tasks, problems, and dilemmas can be quickly handled in a tried and true way, consistent throughout the organization, because we have policy and process. But in order for them to work, there are some ground rules; if you don’t follow them, policy and process are doomed to be the millstones of frustration most of us see them as:

  • They need to be simple and straight-forward for the average employee.
  • They need to be indexed, so that they can be easily found.
  • They need to be relevant to the business, not just copied from someone else.
  • They need to be consistent, so that they do not contradict each other.
  • They need to be helpful and solve real problems.
  • They need to be up to date. Old policy and process can be worse than none at all, because people doubt their reliability and may waste time debating a course of action, which is exactly what process and policy should speed up.

The last point is the hardest. Knowledge management is still something we’re shitty at. Wikis have helped a lot over the last decade by making everything searchable and empowering everyone to update documents quickly and easily. But the fundamental problem is that of scaling. Not scaling the infrastructure, but scaling the human mind. Many a sci-fi story has depicted the person who desires to know everything, and when the wish was granted, their head promptly exploded in one way or another. In many large companies when you hire on you’ll receive a book or binder with all the company policies… did you read it? Of course not: tl;dr.

Thus, what we’re really talking about here is culture. Genetics. Your children get them from you in the blood, but in a company we must teach them to others through words and actions. Preferably when employees are new, through on the job training/mentoring/tasking. Will you ignore policy and process? If you don’t care, they are likely useless crap anyways, and everyone can fend for themselves and hopefully get it right. But what if instead they were useful, and they were a reference available to simplify life? You don’t read the dictionary, but you know that it’s there and handy when you need it… so should be process and policy.

I feel passionate about these things because I hate to see employees stressed out because they aren’t sure what to do or how to do something. Useless anxiety. Wasted energy. Muda. I see managers beat on their people for not knowing… but whose fault is it really? There are hard problems in the world; let’s focus the energy on new problems and codify what we’ve learned in the past for everyone to benefit from. This is the nature of continuous improvement… building a collective body of corporate knowledge and continuously expanding, refining, and even replacing it when appropriate.

LISA Keynote 2011: The DevOps Transformation

Friday, December 16th, 2011

Last week I was given the incredible opportunity to not only speak at LISA but to deliver the opening keynote. I hadn’t expected to even go, but when I learned the topic was DevOps I made a last minute plea on the eve of the submission deadline for a slot to deliver a talk I was calling “The 60 Minute MBA”, a history of Operations Management. My hope was that I could get some obscure timeslot so a handful of people could geek out with me on Operations Management and LEAN and how it is helping to fuel and direct a lot of the DevOps thinking out there. To my great shock I was told I was given the keynote slot… frankly, something I didn’t want for fear of the stress associated with it, but Tom felt I should step up and that I’d do great.

I haven’t blogged much in the last year and when I have it’s on topics you probably wouldn’t expect from a “Solaris blogger”. I’ve held back most of what I want to talk about and only let the cream rise to the top. My already frantic reading backlog only intensified as I was trying to pack as much into my talk as possible and ensure I was accurate. Everything I read, watched, attended or did was reshaping my talk and I essentially spent 6 months “on stage” in my mind. The problem I really had was that I had maybe 6 hours of content that I needed to condense into a 1 hour slot, hitting the essentials but not diluting its potency. And, of course, I’m still learning every day. Only 2 weeks prior to my talk did I finally hammer out a rough slide deck and I then had to keep pushing it around into something moderately cohesive. Trying to find ways to address wisdom, systems thinking, agile, lean, TPS, OM and OR, and tie all this back to DevOps was a challenge.

To make things more challenging, Tamarah’s (my wife, seen above) due date for our 5th child is the 14th of Dec and the talk was to happen at 9:30AM Eastern time, which is 6:30AM Pacific and I’m not a morning person.  So… all things considered, I did pretty well, but you will notice in my talk that I was a little slower than I normally would be.  The upshot, however, was that I didn’t ramble much which kept me on my time marks.

What was interesting to me was what different people walked away with. Some people really keyed in on the value chain and asking “Why?”. Others wanted to rediscover ITIL because it was the first time they had heard it didn’t suck. Others got interested in operations management and LEAN, something they’d heard of but didn’t know where to start learning more. Others keyed on the collaboration of devops and bringing teams together. There was, I think, something for everyone and I didn’t hear any negative feedback on that talk beyond some people liking some parts and not caring about others… and it was designed that way.

Two things I want to note for viewers. First, when I said “by men I mean the human race”, I should have better explained that I think of “men” in a JRR Tolkien sense, the “race of man”. Secondly, at the very end I bagged on Sun TechPubs… I didn’t really explain myself and someone took offense to it. The fault was not on Sun’s writers, but rather on the engineering managers who wouldn’t permit writers the access to engineering they needed, so TechPubs was left to figure it out themselves. The fault was squarely on the engineering managers, NOT on the writers. Given the circumstances they have always turned out amazing documentation and I have nothing negative to say about the writers (as I noted in my answer, I wanted to be one at one time).

Anyway.  The following is the keynote, the slides can be found here.


 

I referenced a lot of books, and many have asked for the list of books, so here it is.

Please note! I do not profit from any of this in any way; I’m not getting a book kickback or whatever. My only source of income is my Joyent salary.

The Essential Books you should read to put DevOps, ITIL/ITSM, LEAN and Operations Management into perspective and educate yourself for the future:

  1. The Visible Ops Handbook: Implementing ITIL in 4 Practical and Auditable Steps
  2. Any Operations Management textbook
  3. Web Operations: Keeping the Data On Time
  4. Lean IT: Enabling and Sustaining Your Lean Transformation

The Advanced Books you can read to dig behind the ideas, this is my “Best Of” list:

  1. My Philosophy of Industry & Moving Forward, Henry Ford
  2. Today and Tomorrow, Henry Ford
  3. The Principles of Scientific Management, Taylor
  4. The Toyota Production System, Ohno
  5. Out of the Crisis, Deming
  6. The New Economics, Deming
  7. Management Challenges for the 21st Century, Drucker
  8. The Goal, Goldratt
  9. Critical Chain, Goldratt
  10. Creating the Corporate Future, Ackoff
  11. Future Shock, Toffler

One book mentioned in my talk that I do not own, nor have I read, is Lean Startup by Eric Ries, which is based largely on The Four Steps to the Epiphany, a book I did buy at the MIT Press bookstore after my keynote. “Lean Startup” is popular, but all he’s really doing is applying LEAN concepts and Agile methodologies to the startup. There are hundreds of “Lean XYZ” books. I am personally interested in the real deal, not books about other books. “LEAN IT” is my one exception because it can be a big time saver and I feel it gives proper credit to the history and sources of the ideas it espouses.

Finally, rather than give you a “fire hose” list of everything, I’ll simply include a picture of what I feel is a very complete library on these various topics. The handful of books missing from these shelves are PDFs on my iPad such as the official “ITILv3 2011 Update”, several books on Engineering Systems, etc. Click the image to see it high-res.

 

Nothing New Under the Sun: An Introduction to Operations Management (OM)

Thursday, July 21st, 2011

8 All things are full of weariness;
a man cannot utter it;
the eye is not satisfied with seeing,
nor the ear filled with hearing.
9 What has been is what will be,
and what has been done is what will be done,
and there is nothing new under the sun.
10 Is there a thing of which it is said,
“See, this is new”?
It has been already
in the ages before us.
11 There is no remembrance of former things,
nor will there be any remembrance
of later things  yet to be
among those who come after.

 

Ever been irritated by the subtle but constant reference by Agile and DevOps people to manufacturing?  You may not even realize they are doing it, but you’ll hear reference to a book called “The Goal”, quotes from Deming, analogies to factories, etc.  In many conference talks I could feel that there was some larger body of knowledge that speakers were alluding to, but not fully describing.  What was this secret knowledge?  Last year I finally stumbled upon the answer and I’ve been consumed by it ever since… long time readers of my blog will note a considerable change in tone and subject since Dec of last year.

This secret body of knowledge that is all around you, but not directly named is “Operations Management” (OM).

Classically, it is said that a company is made up of 3 primary organizational divisions: Finance, Marketing (which includes Sales), and Operations. Finance handles the books and internal resources, Marketing brings the market to the company and sells its products to that market, and Operations is the part of the company that does what your company does. This is an overly simplistic model, but it makes a complex organization easier to grok. If you run a hot dog stand, “operations” refers to ordering hot dog stuff, making hot dogs, serving customers, etc. If you make cars, “operations” refers to the factory floor managing supply chain, operating the assembly line, and delivering cars to dealers. If you run a web site, “operations” refers to the developers and sysadmins who make the product, run it, etc. So again, the model breaks down to bean counters, sellers, and makers/doers.

Have you ever thought about getting an MBA? I have. Except, when I looked at the curriculum my eyes somehow danced right over OM, because I didn’t know what I was looking for. Now I know. You can examine the OM departments at Harvard Business School and MIT Sloan. As with so many things today, the first step to knowledge is knowing what to look for; if you don’t know what it’s called you can search until you’re blue in the face and find nothing of real value.

My journey really took off when I found, at Church of all places, a donated text book entitled Fundamentals of Operations Management (4e). “WOW!” I thought, “that’s what I’ve been looking for!” One look at the table of contents and I knew I’d stumbled onto the elusive body of knowledge I’d sought for so long:

  1. Introduction to Operations Management
  2. Operations Strategy: Defining How Firms Compete
  3. New Product and Service Development, and Process Selection
  4. Project Management
  5. The Role of Technology in Operations
  6. Process Measurement and Analysis
  7. Financial Analysis in Operations Management
  8. Quality Management
  9. Quality Control Tools for Improving Processes
  10. Facility Decisions: Location and Capacity
  11. Facility Decisions: Layouts
  12. Forecasting
  13. Human Resource Issues in Operations Management
  14. Work Performance: Measurement
  15. Waiting Line Management
  16. Waiting Line Theory
  17. Scheduling
  18. Supply Chain Management
  19. Just-in-Time Systems
  20. Aggregate Planning
  21. Inventory Systems for Independent Demand
  22. Inventory Systems for Dependent Demand

Jackpot! If more than half of those chapters don’t seem pertinent to IT departments, then you’ve never tried to manage one. The focus may be slightly different, but the core issues, problem domains, and related disciplines are essentially identical. This explains why so many “experts” are making reference to OM, knowingly or unknowingly: in manufacturing they dealt with what are, in essence, the same problems we have in IT. The Web companies (Twitter, Facebook, Flickr/Etsy, etc) are the ones leading the charge because, more than traditional IT organizations, they really do look like the factory floor producing a single line of products.

So now… now I know what questions to ask.  And ask I did.  This opened up a whole new world to me that was right under my nose.  The Toyota Production System (TPS) which became known in the US as “Lean”… W. Edwards Deming and Total Quality Management (TQM)… ISO-9001…. the undertones of ITIL, CobiT, ISO-27001, and Agile…. it all came together and made sense for the first time.

This sent me into an epic journey as I sought out book after book after book by the cornerstone individuals of OM, because they all wrote books that formed the modern body of knowledge. I now own all of Henry Ford’s books, Shigeo Shingo’s books, Taiichi Ohno’s books, W. Edwards Deming’s books, Walter Shewhart’s book, Frederick Winslow Taylor’s book, Ludwig von Bertalanffy’s books, Peter Drucker’s books, and on and on and on. I couldn’t stop buying and reading these texts that describe the world we find ourselves in today, shaped by the work they did so long ago. All these points in my head started to be connected, one by one, and a fabric of knowledge appeared.

Friends, the point is this: there is nothing new under the sun. Things change, evolve, and morph, sure, but the principles are not new. If they were, we wouldn’t look back at Plato and Aristotle as wise today; much of what they debated 2400 years ago is just as pertinent now. So it is with Agile and DevOps: the core principles have been well explored and addressed in the last century of manufacturing as part of Operations Management. We only need adapt that knowledge, and the “experts” are doing exactly that.

Consider an example. As a consequence of the innovations Ohno was introducing at Toyota in building the Toyota Production System (TPS, aka Lean), and in particular that of Kanban (the basis of Just-in-Time production, which is pull rather than push based production), he needed a way to speed up the “changeover time” (setup time) of large pressing machines. These machines contain “dies” which press sheet metal into, say, a car door. The changeover time could be as much as 6 hours… that means, when you decide to stop making part A and want to make part B, you have to shut down for 6 hours to set up the machine for the new part before starting production again. The way this was typically handled was to simply make a shitload of parts to build up a big inventory so that you reduced the likelihood of needing to do another setup. They were after local efficiency (what the “Theory of Constraints” calls local optima) at all costs. This mass production method wasn’t going to work in Ohno’s new just-in-time world, yet with a 6 hour changeover the idea of stamping out only 20 parts and then changing over to make another part was completely idiotic. At least, it was until he put Shigeo Shingo on the job. It took Shingo years to make it happen, but ultimately he created a method known as “Single Minute Exchange of Dies” (SMED). With his method you can change dies in less than 10 minutes (single-digit minutes, not 60 seconds). This was the breakthrough that Ohno needed to make Kanban really work… and work it did. Without SMED, a technology approach, to complement Ohno’s other methods (Kanban, 5S, 5W, Andon, Muda, etc) Toyota just wouldn’t have been the industrial revolutionary that they became.
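The arithmetic behind that trade-off is worth spelling out. As a back-of-the-envelope sketch, the function below spreads one changeover across a batch; the 6 hour and 10 minute changeovers and the 20-part batch come from the story above, while the 2000-part batch and the `setup_minutes_per_part` name are assumptions added for illustration.

```python
def setup_minutes_per_part(changeover_minutes: float, batch_size: int) -> float:
    """Changeover time amortized across every part in one batch."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return changeover_minutes / batch_size


TRADITIONAL_CHANGEOVER = 6 * 60  # minutes; the 6 hour figure from the text
SMED_CHANGEOVER = 10             # minutes; the upper bound from the text

# Small batch with the old changeover: setup dominates everything.
print(setup_minutes_per_part(TRADITIONAL_CHANGEOVER, 20))    # 18.0

# Hypothetical large batch: setup looks cheap, but inventory piles up.
print(setup_minutes_per_part(TRADITIONAL_CHANGEOVER, 2000))  # 0.18

# Small batch with SMED: setup is cheap without the inventory.
print(setup_minutes_per_part(SMED_CHANGEOVER, 20))           # 0.5
```

Before SMED the only way to get the setup cost per part down was a bigger batch; after SMED a 20-part run carries half a minute of setup per part, which is what makes small, pull-based batches practical.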

Now, why the hell am I telling you all that? Look at what cloud did to IT. Just like Kanban, Cloud came along and showed us that our setup times are way too long, and changeover from one type of setup to another was awful. Configuration Management tools (CFengine, Chef, Puppet, etc) are the SMED of our industry. Same problems, same needs, different solutions, but similar approaches. There is no reason for us to re-invent all the wheels; a lot of these issues are solved problems, if you just know where to look and what questions to ask, and have an open mind.

If you are like me and have been looking for something, but you know not what, go find yourself a book on Operations Management and get your journey started.  You’ll have a massive head start over all your peers who won’t figure this out for another couple years (just as others already got a head start over us).

The Joy of Non-Functional Requirements

Saturday, April 30th, 2011

ITILv3’s “Service Design” book, in section 5.1 regarding Requirements Engineering, defines 3 types of requirements:

  • “Functional requirements are those specifically required to support a particular business function.”
  • “Management and operational requirements (sometimes referred to as non-functional requirements) address the need for a responsive, available, and secure service, and deal with such issues as ease of deployment, operability, management needs and security.”
  • “Usability requirements are those that address the ‘look and feel’ needs of the user and result in features of the service that facilitate its ease of use. This requirement type is often seen as part of management and operational requirements, but for the purposes of this section it will be addressed separately.”

Later in section 5.1.1.2 several categories of these “Management and operational requirements” are presented, including: manageability, efficiency, availability and reliability, maintainability, security, controllability, measurability and reportability, etc.

Non-Functional Requirements (NFR) are generally equivalent to operational “technical debt”. Every organization has some amount of this debt. That debt can have practical explanations such as lack of resources (i.e. staff, expertise, cash, etc) or simply be a result of geek perfectionism; after all, there is always something more that can be done.

I like the phrase “Non-Functional Requirements” because it adequately sums up the life of a sysadmin. Your job is generally to identify and implement all the “things” that need to be there but no one really cares about until there is an emergency. Backups, security, monitoring, synchronization, performance, capacity planning… burdens that many managers don’t want to be bothered with until it’s too late.

The news is buzzing about two examples of NFR biting companies in the butt… in particular the PlayStation Network (PSN) outage and the Amazon Web Services (AWS) EBS outage.  Both examples are easy to criticize, but consider what technical debt you have in your infrastructure.  Do you actually have a list of it all?  Do you review your infrastructure for NFR’s on an ongoing basis?

In the context of DevOps, you see a natural divide in requirements: dev tends to be concerned with functional requirements, what the solution does or does not do. Ops is then left to attend to all the various NFR’s after the fact, sometimes with very little guidance. One thing we’ve heard from a great many DevOps gurus is that Ops needs to be involved in development projects from day one… why? NFR.

What gives me pause is that I’m certain that at both AWS and PSN there was at least one person who had a “told ya so” moment when disaster struck.  Rarely do things like this happen where everyone was completely blindsided by the event.  Which is why NFR’s are a key focus of Risk Management.

ITILv3 defines Risk as: “A possible event that could cause harm or loss, or affect the ability to achieve Objectives.  A Risk is measured by the probability of a Threat, the Vulnerability of the Asset to that Threat, and the Impact it would have if it occurred.”

What bothers me is that ITIL’s, indeed most people’s, definition of “risk” differs from the classical definition of Risk Management, which is to analyze all potential outcomes of a given decision, good or bad. In IT, at least according to ITIL, we seem to over-focus on the negative. Webster defines risk as: “possibility of loss or injury”, so ITIL isn’t wrong, but it may blind us from finding potential win-win outcomes.
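The ITIL definition quoted above names three factors: the probability of the threat, the vulnerability of the asset, and the impact. As a rough sketch of how a list of outstanding NFRs could be ranked with them, the code below simply multiplies the three; that product is one common convention and an assumption here, not something the ITIL text prescribes, and the `Risk` class, the entries, and every number are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Risk:
    name: str
    threat_probability: float  # chance the threat occurs, 0.0 to 1.0
    vulnerability: float       # how exposed the asset is, 0.0 to 1.0
    impact: float              # cost if it happens, in whatever unit you choose

    def score(self) -> float:
        # Assumption: combine the three factors by multiplying them.
        return self.threat_probability * self.vulnerability * self.impact


def rank(risks: List[Risk]) -> List[Risk]:
    """Highest score first, so the worst technical debt tops the list."""
    return sorted(risks, key=lambda r: r.score(), reverse=True)


# Hypothetical entries and numbers, purely for illustration.
register = [
    Risk("Disk fills on log volume", 0.5, 0.5, 10.0),
    Risk("Backups never restore-tested", 0.25, 0.5, 100.0),
]

for risk in rank(register):
    print(f"{risk.score():6.2f}  {risk.name}")
```

Note that a score like this only captures the downside, which is exactly the limitation described above; potential upsides of a decision would have to be weighed separately.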

The life of a sysadmin is the joy of non-functional requirements. Those things that aren’t sexy, aren’t exciting, but indeed are requirements nonetheless.

I implore you all, if you take any one thing from the DevOps movement, to get operations involved in product development early so that you can solidify NFR from the beginning. No employee should be burdened with having to make a personal decision about how often something is backed up or what is or isn’t monitored. Combine functional requirements with non-functional requirements, do risk analysis and craft from that an SLA at the outset… because that’s when people are most likely to care and have their minds in the right place.

Lastly, if you do do this, include as many people from the ops team as possible. If only an ops manager is involved you are going to cut off a lot of potentially valuable feedback early when you need it, and you may have a very hard time motivating your ops team to get all those NFR’s implemented. Sysadmins are almost never without something to do, so giving them a motivating sense of urgency isn’t optional. That’s how that technical debt accumulates. Just because something is important doesn’t mean it’s important “to me”, so it gets put off and off and off… and then you’re on the front page of the Wall Street Journal.

Systems Thinking & The Wisdom of Ackoff

Wednesday, March 9th, 2011

Dr. Russell Ackoff was not the father of Systems Thinking, but he was in my opinion its best disciple. A voice of reason in the wilderness.

The following video is one of many you will find on YouTube (I recommend you watch as many as you can), and there are a great number of important points he makes that I’d ask you to carefully ponder:

  • “There are 5 types of content in the human mind: data, information, knowledge, understanding, and wisdom. It’s a hierarchy.” (See my previous post for details on the Wisdom Hierarchy)
  • Regarding Peter Drucker’s famous line, “There is a difference between doing things right and doing the right thing.” Dr. Ackoff says: “See, doing the right thing is wisdom, effectiveness. Doing things right is efficiency. The curious thing is that the righter you do the wrong thing, the wronger you become. If you’re doing the wrong thing and you make a mistake and correct it you become wronger. So it’s better to do the right thing wrong, than the wrong thing right.”
  • “So we’re now questioning, that it turns out every major social problem today is trying to do the wrong thing righter.”
  • “So instead of looking at the efficiency with which we are pursuing our objectives, we’re beginning to re-examine the objectives.”
  • Dr. Ackoff considers the education system. “Our system is not about learning, [...] it’s about teaching. We don’t recognize that teaching is a major obstruction to learning.”; “Who in the classroom learns the most…. the teacher. See the classroom is upside down.”
  • “You can take each system [...] and you can see that they are all pursuing objectives that are contrary to their intention.”
  • “You never learn by doing something right, because you’re already doing it right. You only learn by mistakes.”
  • “There are two kinds of mistakes, the kind you shouldn’t have done. [...] That’s called an error of commission. The other type of error is when you didn’t do something that you should have done. That’s an error of omission.” He goes on to point out that only errors of commission are recorded, and therefore if employees/managers can only get in trouble for doing something they shouldn’t have done, what will they do? Nothing.
  • “It’s our treatment of error that leads to a stability which prevents significant change.”

Once you’ve finished that, I recommend you watch a series of 3 videos from a single talk by Dr. Ackoff on Systems Thinking.

He really gets into it in Part 2, where he goes through examples of how Analytical Thought is insufficient for modern problems. Modern systems require a new pattern of thought. “‘Why?’ questions, about objects called systems, cannot be answered by the use of analysis.” He goes on to explain that analysis produces knowledge but not understanding… it tells us how something works, but not why it works the way it does.

From the second part, “Synthetic thinking consists of 3 steps, which are exactly the opposite of analysis, each one:”

  1. “In the first step of analysis, you take whatever it is you want to understand and you take it apart. The first step of synthesis is you take the thing you want to understand and you say ‘What is this a part of?’ You identify the containing whole of which this is a part. So if I want to understand an automobile I say it’s part of the transportation system first.”
  2. “In the second step of analysis, I try to identify the properties and behaviors of the parts taken separately. In the second step of synthesis I try to explain the behavior of the containing whole. What’s the transportation system?”
  3. “In the third step of [analysis] I try to aggregate the parts into an understanding of the whole. In the third step of synthesis, I dis-aggregate the understanding of the containing whole by identifying the role or function of what I’m trying to explain in that whole.”

Please do go through them; I think you’ll be enlightened. If you are new to Systems Thinking you’ll get an excellent crash course, and I think you’ll be very pleased with what you find and how it can help you better approach the day-to-day problems you face.

The System of Profound Knowledge & The Wisdom Hierarchy

Monday, March 7th, 2011

Wisdom, it is said by Lt. Cmdr. Data, “is the difference between knowledge and experience.”

Two of my heroes of the 20th Century are W. Edwards Deming and Russell L. Ackoff.  Deming is commonly thought of as the “father of quality”.  He taught the Japanese about quality and management in the ’50s, and from it the Japanese redefined manufacturing; the Toyota Production System (TPS) has become Lean, which is changing the way all companies do business.  If you hate ISO-9000, blame Deming.  Ackoff was an early pioneer of “Systems Thinking” in the realm of Operations Research (OR).  Incidentally, they were friends and worked together in the ’40s.

It is commonly understood that the most truly profound ideas are those which are the least surprising; they feel like something you’ve always known.  Indeed, you have always known… the wise teachers are changing something from unconscious and incidental to conscious and intentional.

What is wisdom?  How do we obtain it?  The “Wisdom Hierarchy” has existed in various forms for a long time, but I like Ackoff’s the best.  It is thus:

  • Wisdom: Clarity through experience, understanding of consequences
  • Understanding: Why?
  • Knowledge: How?
  • Information: What?
  • Data: Metrics

There is a progression here.  It hearkens back to how you learned to describe a story when you were in grade-school… the 5 W’s (and one H): Who, What, When, Where, Why, and How.  If you take these 5 W’s and lay them out in order of importance, you see the converse of the “Wisdom Hierarchy”.  Consider a child running to you saying “Danny’s hurt!”  In what order do we want the information?

  1. First, we need the context of the situation: Who is involved, Where and When did this happen?  This is just data; it tells us nothing of importance by itself, but it orients us for all following questions.
  2. Second, we need to turn the data into information: What happened?  But information alone can lead to rash decisions and incorrect conclusions.
  3. Thirdly, we turn that information into knowledge: How did it happen?  At this point we’re getting intelligent; with what and how we can repeat the process or see where things went wrong, but it’s still isolated… we need to connect this with the System around it.
  4. Fourthly, we turn knowledge into understanding: Why did it happen?  A system view emerges as we see interconnections between parts of the system.  Why did Johny push Danny?  Because Johny wants Danny’s bike.
  5. Finally, over time (experience) we can accumulate understanding into wisdom, which enables us to not only understand the past, but project that understanding into the future.  We can predict consequences and outcomes before we act.

So, as you can see, this is intuitive stuff, but Ackoff’s Wisdom Hierarchy makes it more intentional.  You may have a sense of what wisdom is, you may “know it when you see it”, but this Hierarchy gives us a map.

What’s interesting is that we commonly only go part way up the ladder of wisdom.  We get to information or knowledge and then simply assume the rest, using intuition.  But that’s dangerous.  We all know what they say about assumptions: “ass-u-me”.  It certainly isn’t repeatable.  To grow as an organization it is important to institutionalize wisdom, to be methodical about it.  When was the last time you made a decision without knowing the how or why?  Maybe you put out that fire, but it will come back again to be sure.

Deming’s System of Profound Knowledge gives a framework to evaluate problems (questions, things, whatever) in the real world.  It has four components:

  1. Appreciation of a System: Things don’t exist in the real world in isolation; things/people/ideas/etc. are systems.  In order to understand something we must realize this and examine the system.  The word “Why?” is a systems thinking word.  When a child asks why, they are trying to stitch information from two things together into a single system; they are exploring the inter-relations of the world.  “Why is the sky blue?”  “Why are orders taking so long?”  Profound knowledge starts with seeing the system.
  2. Understanding of Variation:  In statistical control there are two types of variation: common and special variation.  Difficult problems get even more difficult when we confuse the two.  Common variation is just a fact of life; things aren’t perfect all the time.  If your spouse gets home late by 30 minutes we may say that this is “common variation”: traffic or meetings may have caused them to be late, nothing unusual here.  However, if your spouse is late by 3 hours, that’s a “special variation”; it’s outside the control bounds and indicates something unusual that should be investigated.  We sometimes say, “pick your battles”, and this is an example of that: don’t nit-pick and waste time solving normal variation, focus on the really strange cases that indicate a real problem.
  3. Theory of Knowledge: The essence of scientific method is that of creating and evolving theories.  Theories make predictions and codify your existing understanding.  The system of profound knowledge is about building and refining these theories.
  4. Understanding of Psychology: Trying to understand anything without considering the effects on and assumptions of people is pointless.  You must consider the opinions, assumptions, reactions, and perceptions of people to bring reality into what is otherwise cold.  Just because something may be efficient doesn’t mean it will be effective; psychology is a big part of that.
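
To make the “Understanding of Variation” point a little more concrete, here is a minimal sketch of how control bounds might be computed and used to separate common from special variation.  It assumes the conventional three-sigma limits around the mean of a baseline sample; the function names and the “minutes late” numbers are hypothetical, made up purely for illustration, and are not taken from Deming’s writings.

```python
from statistics import mean, stdev


def control_limits(baseline, sigmas=3.0):
    """Return (lower, upper) control limits from a baseline sample.

    Assumption: limits sit `sigmas` sample standard deviations
    either side of the baseline mean (three by convention).
    """
    center = mean(baseline)
    spread = stdev(baseline)
    return center - sigmas * spread, center + sigmas * spread


def classify(value, limits):
    """Label one observation as common or special variation."""
    lower, upper = limits
    return "common" if lower <= value <= upper else "special"


# Hypothetical baseline: minutes late getting home on ordinary days.
baseline_minutes_late = [5, 12, 0, 20, 8, 30, 15, 10, 3, 25]
limits = control_limits(baseline_minutes_late)

# 30 minutes late falls inside the bounds; 3 hours does not.
for delay in (30, 180):
    print(delay, classify(delay, limits))
```

With this baseline, a 30 minute delay lands inside the bounds and is left alone, while a 180 minute delay lands outside them and is flagged for investigation.  The useful part is less the arithmetic than the discipline: decide what “normal” looks like before deciding what deserves your attention.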

Together, the Wisdom Hierarchy and the System of Profound Knowledge are two powerful tools (frameworks really) for approaching problems and change, going from where we are to where we wish to be.  For building something to be proud of, rather than just making ourselves busy with the latest fire.

While you may quickly dismiss both of these, I hope you will reflect on them and see if they can guide you toward being more structured and intentional about how you address life’s challenges, at home or at work.  I have pondered them for some time and continue to find new utility in them.

Note to the reader: The above explanations are my own based on Ackoff’s and Deming’s writings in whole and therefore may not match exactly with explanations you may find elsewhere.  Most bullet point explanations are too short and miss the true point, in my opinion.  This is an attempt on my part to better summarize their true intent.