ITILv3’s “Service Design” book, in section 5.1 on Requirements Engineering, defines three types of requirements:
- “Functional requirements are those specifically required to support a particular business function.”
- “Management and operational requirements (sometimes referred to as non-functional requirements) address the need for a responsive, available, and secure service, and deal with such issues as ease of deployment, operability, management needs and security.”
- “Usability requirements are those that address the ‘look and feel’ needs of the user and result in features of the service that facilitate its ease of use. This requirement type is often seen as part of management and operational requirements, but for the purposes of this section it will be addressed separately.”
Later, in section 18.104.22.168, several categories of these “Management and operational requirements” are presented, including manageability, efficiency, availability and reliability, maintainability, security, controllability, and measurability and reportability, among others.
Non-Functional Requirements (NFRs) that go unmet are generally equivalent to operational “technical debt”. Every organization has some amount of this debt. That debt can have practical explanations, such as lack of resources (i.e., staff, expertise, cash, etc.), or it can simply be a result of geek perfectionism. After all, there is always something more that can be done.
I like the phrase “Non-Functional Requirements” because it adequately sums up the life of a sysadmin. Your job is generally to identify and implement all the “things” that need to be there but that no one really cares about until there is an emergency. Backups, security, monitoring, synchronization, performance, capacity planning… burdens that many managers don’t want to be bothered with until it’s too late.
The news is buzzing about two examples of NFRs biting companies in the butt… in particular the PlayStation Network (PSN) outage and the Amazon Web Services (AWS) EBS outage. Both examples are easy to criticize, but consider what technical debt you have in your own infrastructure. Do you actually have a list of it all? Do you review your infrastructure for NFRs on an ongoing basis?
In the context of DevOps, you see a natural divide in requirements. Dev tends to be concerned with functional requirements: what the solution does or does not do. Ops is then left to attend to all the various NFRs after the fact, sometimes with very little guidance. One thing we’ve heard from a great many DevOps gurus is that Ops needs to be involved in development projects from day one… why? NFRs.
What gives me pause is that I’m certain that at both AWS and PSN there was at least one person who had a “told ya so” moment when disaster struck. Rarely do things like this happen with everyone completely blindsided by the event. That is why NFRs are a key focus of Risk Management.
ITILv3 defines Risk as: “A possible event that could cause harm or loss, or affect the ability to achieve Objectives. A Risk is measured by the probability of a Threat, the Vulnerability of the Asset to that Threat, and the Impact it would have if it occurred.”
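To make that definition concrete, here is a minimal sketch of how you might score and rank a few risks. Be warned that multiplying the three factors together is my own simplifying assumption; ITIL only says a Risk is “measured by” them. The risk names and numbers are invented purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Risk:
    """One entry in a hypothetical risk register."""
    name: str
    threat_probability: float  # chance the threat occurs in the period, 0..1
    vulnerability: float       # chance the asset is harmed if it occurs, 0..1
    impact: float              # estimated loss if harmed, in one consistent unit

    def exposure(self) -> float:
        # Assumption: combine the three ITIL factors by simple multiplication.
        return self.threat_probability * self.vulnerability * self.impact


def rank(risks):
    """Return the risks sorted from highest to lowest exposure."""
    return sorted(risks, key=lambda r: r.exposure(), reverse=True)


# Made-up example figures, not data from any real incident.
risks = [
    Risk("storage volume loss, no tested restore", 0.10, 0.9, 500_000),
    Risk("credential database breach", 0.05, 0.5, 2_000_000),
    Risk("disk fills up, no capacity alert", 0.50, 0.8, 20_000),
]

for r in rank(risks):
    print(f"{r.exposure():>12,.0f}  {r.name}")
```

The point is not the arithmetic but the ordering: once the guesses are written down, the debate about which NFR to fund first becomes a debate about the numbers rather than about who shouts loudest.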
What bothers me is that ITIL’s, and indeed most people’s, definition of “risk” differs from the classical definition of Risk Management, which is to analyze all potential outcomes of a given decision, good or bad. In IT, at least according to ITIL, we seem to over-focus on the negative. Webster defines risk as “possibility of loss or injury”, so ITIL isn’t wrong, but it may blind us to potential win-win outcomes.
The life of a sysadmin is the joy of non-functional requirements. Those things that aren’t sexy, aren’t exciting, but indeed are requirements nonetheless.
I implore you all, if you take any one thing from the DevOps movement, to get operations involved in product development early so that you can solidify NFRs from the beginning. No employee should be burdened with having to make a personal decision about how often something is backed up or what is or isn’t monitored. Combine functional requirements with non-functional requirements, do risk analysis and craft from that an SLA at the outset… because that’s when people are most likely to care and have their minds in the right place.
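As a rough sketch of what “deciding at the outset” could look like, the snippet below keeps functional and non-functional requirements in one record per service and flags whatever nobody has decided yet. The field names, the example service and the three NFRs checked are all hypothetical; a real list would be much longer.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ServiceRequirements:
    """Functional and non-functional requirements for one service (hypothetical schema)."""
    name: str
    functional: List[str]
    backup_interval_hours: Optional[int] = None      # None means nobody decided
    monitored_checks: List[str] = field(default_factory=list)
    availability_target: Optional[float] = None      # e.g. 0.999 for an SLA


def missing_nfrs(svc: ServiceRequirements) -> List[str]:
    """List the NFR decisions that are still unmade for this service."""
    gaps = []
    if svc.backup_interval_hours is None:
        gaps.append("backup interval")
    if not svc.monitored_checks:
        gaps.append("monitoring checks")
    if svc.availability_target is None:
        gaps.append("availability target")
    return gaps


# Invented example: backups were agreed on, the rest was left for "later".
billing = ServiceRequirements(
    name="billing",
    functional=["issue invoices"],
    backup_interval_hours=24,
)

print(billing.name, "is missing:", ", ".join(missing_nfrs(billing)))
```

Anything that shows up in that list is a decision some sysadmin will otherwise end up making alone, probably at 3am.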
Lastly, if you do do this, include as many people from the ops team as possible. If only an ops manager is involved, you are going to cut off a lot of potentially valuable feedback early, when you need it, and you may have a very hard time motivating your ops team to get all those NFRs implemented. Sysadmins are almost never without something to do, so giving them a motivating sense of urgency isn’t optional. That’s how that technical debt accumulates. Just because something is important doesn’t mean it’s important “to me”, so it gets put off and off and off… and then you’re on the front page of the Wall Street Journal.