This one graphic DOESN’T explain it all

How many times have you been reading Facebook, your favorite blog, or a site like BuzzFeed and seen an entry with a title like ‘this one simple graphic explains [insert topic here] once and for all,’ or something along those lines?

These titles suggest to the reader that if you just looked at some problem in a specific way, suddenly it would all become clear. Of course, the next step is that your Democrat or Republican friends post these items to Facebook with a helpful comment like “for my [opposing party] friends.” And really, nobody’s mind is changed.

First off, I’m not going to spend much time addressing cognitive dissonance. The reality is that giving someone evidence that isn’t in line with their worldview tends to strengthen their current views, not weaken them.

But second, for any sufficiently complicated topic (which, in my world, is pretty much all of them), there is no one graphic that explains it all. I suspect most of your situations are like that as well. Let me use an example: organizational productivity. We measure effort per function point, per industry norms, and we wanted to demonstrate in our “one chart that explains it all” that productivity had improved since a recent change. Except one chart won’t do it. The chart makes the main point, but we needed at least five other charts to check things like whether the measurement system had been tampered with, whether quality had suffered as apparent productivity rose, and so on. In IT, most of the things we measure are proxy measures for some outcome we care about. As proxy measures, we always have to worry about the measurement system itself and about the unintended consequences of our choices. As a result, no analysis is ever complete on a single chart.

Treat anyone and anything that is explained to you in “one simple chart” with suspicion. If it seems too simple and obvious, it probably is.

The “right” way for the business, but the “wrong” way for us

How many times have you been annoyed by a business partner who tells you exactly how something should be done? Let’s say you’re building a new application and it needs data from a third party. The third-party provider has a web service you can call to get a real-time result. However, another application in your organization is already getting that data periodically as well. The business person begins to question your design (and the associated cost of a new call to the web service): wouldn’t it be better just to take a nightly feed from the other system and load it? For oh so many reasons, we know this isn’t the right thing to do. The data is likely outdated, since it’s now 24 hours behind the real-time version. It ties together two unrelated systems that happen to need similar data. Now if the old system changes, you’ll have to modify and retest this new system as well.

If, as developers, we were purists about doing the right thing, it’d be a defensible stance. But how often have we kludged something together ourselves? Take, for instance, data we need about our own organization. It might come from a project-and-portfolio management (PPM) tool. Perhaps we want to look at data by “team,” but for whatever reason the team data isn’t available in the tool. Do we try to do the right thing and fix it? Not usually. Just the other day I encountered a data extract that had been joined, via Microsoft Excel and a vlookup(), to a separate list of people that mapped them to a team. Why was the team data not in the tool? Adding it would’ve been difficult, but it was the right thing to do. Now, instead of a clear single system of record for team membership, both the PPM tool’s org structure and this one-off spreadsheet have to be maintained. And while we are often hesitant to make compromises about the stuff we build for real customers, we will accept horrid messes that create work for ourselves. Why should we be unwilling to have messy solutions imposed on us but willing to impose them on ourselves?
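
To make the kludge concrete, here is roughly what that spreadsheet join amounts to. This is only a sketch; the file and column names are hypothetical.

```python
import csv

# Hypothetical files: "ppm_extract.csv" is the export from the PPM tool,
# "team_lookup.csv" is the hand-maintained list the vlookup() pointed at.
person_to_team = {}
with open("team_lookup.csv", newline="") as f:
    for row in csv.DictReader(f):
        person_to_team[row["person"]] = row["team"]

with open("ppm_extract.csv", newline="") as f:
    for row in csv.DictReader(f):
        # The join the vlookup() was doing: bolt a team onto each record.
        # Anyone missing from the side list silently becomes "unknown", and the
        # list has to be updated by hand every time team membership changes.
        row["team"] = person_to_team.get(row["person"], "unknown")
        print(row)
```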

I don’t think we can hold others to higher standards than we are willing to hold ourselves to as developers. Don’t create your own messes.

What if static analysis is all wrong?

I just got back from a meeting with one of my former college professors. I’ve kept in touch because the academic world and its research have much to teach us about how to operate in the business world. For one, without the same financial pressures, academia is free to explore crazier ideas that may one day create value.

In this recent meeting we were discussing static analysis and machine learning. Static analysis has proven frustrating in some of my analyses, since it has shown no evidence of predictive power over the outcomes we care about – defects the user would experience and team productivity. And yet we keep talking about doing more static analysis. Is it that the particular tool is bad, or is the idea fundamentally flawed in some way?
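
For illustration, here’s a minimal sketch of one way to look for a relationship between a tool’s findings and an outcome we actually care about. All of the numbers are invented, and a real analysis would involve far more applications and more careful controls.

```python
from statistics import correlation  # requires Python 3.10+

# Hypothetical per-application data, invented for illustration:
# static analysis findings per KSLOC and customer-reported defects per KSLOC.
findings_per_ksloc = [12.0, 30.5, 8.2, 22.1, 15.3, 40.0, 5.5, 18.9]
customer_defects_per_ksloc = [0.8, 0.9, 0.7, 0.6, 1.1, 0.8, 0.9, 0.7]

r = correlation(findings_per_ksloc, customer_defects_per_ksloc)
print(f"correlation between findings and customer-visible defects: r = {r:.2f}")

# If r stays near zero across enough applications, the tool's counts carry
# no information about the outcome you were hoping they would predict.
```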

What turned out to be a non-event for machine learning might be an interesting clue to the underlying challenges with static analysis. This particular group does research on genetic programming – essentially, evolving software to solve problems, which is valuable in spaces where the solution isn’t well understood. In this particular piece of research, the team was trying to see if modularity would help solve problems faster. That is, if the programs could evolve and share useful functions, would problems be solved more easily? The odd non-event was that it didn’t seem to help at all. No matter how they biased the experiments, the evolution of solutions preferred copying and tweaking code over using a shared function.

Although the team didn’t look into it much, they suspect that modularity actually creates fragility in software. That is, if you have a single function that many subsystems use, then changing that function can have disastrous ripple effects. If there are many copies of the function and one is changed, the impact is much smaller. One might argue that this applies to human-written code as well: perhaps making code more modular and reusable pays off only under certain circumstances. If true, it’d fly in the face of what we know about writing better software. And importantly, it would quickly devalue what static analysis tools do, which is push you towards a set of commonly agreed upon (but possibly completely wrong) rules.

What can a snowblower tell us about software?

If you’re in the northeastern United States, you’re probably thinking about snow right now. And if you’re responsible for clearing the snow from your drive or walkways, you might also be all too familiar with time behind a snow blower. For years I hand-shoveled my walkways, but when we moved to this new house they were simply far too long for that.

It takes me about an hour to do all the clearing I’m responsible for, so that’s a lot of time to think, which isn’t necessarily a bad thing. This particular snow is the deepest we’ve had yet. My snow blower has three forward speeds, and presumably you use a slower speed when you have more snow to clear. The slower speed gives the auger time to clear the snow before it gets all backed up.

So, as I was clearing the drive, I noticed something. Even at the lowest speed there was enough snow that some of it was being simply pushed out of the way by the blower. That meant that I’d have to do clean-up passes just to get that little bit of snow that the blower wouldn’t get on the first pass. And that got me to thinking. What if I just went faster? After all, if I was going to have to make a second pass anyway, who cares if it’s a tiny bit of snow or a little bit more?

And that got me to thinking about software. One approach might be to take it slow and carefully, but if you’re going to create bugs anyway, then perhaps going that slow isn’t the right answer. You’re still going to need the clean-up pass so you might as well let it happen and just clean up a bit more, right?

That sort of makes sense, if you think a second pass over the code is as effective as a second pass with the snow blower. In terms of dealing with snow, the blower is relentless. If it goes over the same ground twice it will do so with the same vigor as before. Testing, on the other hand, is imperfect. Each pass only gets about 35–50% of the defects (tending towards the 35% end). It isn’t like a snow blower at all. If you push aside a relatively big pile of snow with the snow blower, it’ll get it on the second go. If you create a big pile of bugs in the code on your first go, a round of testing will likely reduce the pile by less than half. Then you need another pass, and another, just to get to an industry-average 85%.
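
To put rough numbers on it, here’s a minimal sketch. The 40% per-pass removal rate and the starting defect count are illustrative assumptions within the range above, not measurements.

```python
# Toy model: each test pass removes a fixed fraction of the remaining defects.
initial_defects = 1000
removal_rate_per_pass = 0.40       # near the low end of the 35-50% range
target_cumulative_removal = 0.85   # rough industry-average removal efficiency

remaining = float(initial_defects)
passes = 0
while 1 - remaining / initial_defects < target_cumulative_removal:
    passes += 1
    remaining *= 1 - removal_rate_per_pass
    print(f"after pass {passes}: {remaining:.0f} defects left "
          f"({1 - remaining / initial_defects:.0%} removed)")

# At 40% per pass it takes four passes to clear roughly 87% of the original
# pile -- unlike the snow blower, the next pass never simply finishes the job.
```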

There’s one other thing about going too fast that I learned with my snow blower. Sometimes you get a catastrophic failure. In my case, going too fast broke the shear pin on the auger. It’s a safety feature to prevent damage to the engine, but it does make it impossible to keep moving snow. And software is a bit like that too. Go too fast and you may introduce a major issue that you’ll spend a whole lot of time cleaning up. It is not all about speed.

Tom DeMarco talks ethics

I never took ethics in college. I also never planned on attending a conference to hear a talk on ethics. After all, ethics were sort of a base assumption from my perspective, not given much thought beyond large companies having employees sign a code of ethics at some regular interval.

In fact, I thought I was attending a talk on decision making, and was thus expecting something about decision theory, game theory, maybe psychology. Certainly not ethics. But Tom took me from “where is he going with this?” to “wow!” I’ll attempt to do the talk justice, but I can’t promise that I will. To the best of my ability, here’s what I took away:

Aristotle believed that ethics were logically provable. Metaphysics contains all the things we can know about the world. Epistemology is built on that: it’s everything we can derive from what we know about the world. For example: Socrates is a man; all men are mortal; therefore Socrates is mortal. Ethics, Aristotle “promised,” were provable with logic. For something like 2,400 years, all kinds of philosophers tried to make good on this promise and were unable to. At some point – David Hume, perhaps – metaphysics was classified as “is” statements and ethics as “ought” statements, and it was argued that it is impossible to derive an ought statement from an is statement.

Along comes Alasdair MacIntyre. He argues that if something is defined by its purpose (this is a watch, for example), then the ought statements follow naturally. What ought a watch do? It should tell good time. So, that raises the question: what is the purpose of man?

We go back to Aristotle. Aristotle also created a mechanism for defining things. His definition requires that you group something and then differentiate it. So, a definition for man might be “an animal with the ability to speak.” That’s an is statement, for sure, but by MacIntyre’s requirements it doesn’t define man’s purpose. MacIntyre goes on to define man as an animal that practices. Creating practices becomes man’s purpose. A practice is something we attempt to systematically extend. Tennis, for example, is a practice. Current tennis players have extended the sport in new and interesting ways, such that although famous tennis players of yore would recognize the sport, they probably couldn’t compete anymore (even if they were young enough) because the sport has been systematically extended.

So, if that’s right, that man is an animal who practices, then for each practice we create, the oughts follow naturally. If you are a software developer and your manager comes to you and says “we need to take three months off this project” what ought you do? Well, first you ought to be honest – cutting the timeline will hurt the quality of my practice. Second, you must be just – quality matters to our customer, and we can’t deliver poor quality. It’s a disservice. And lastly, you must be courageous – go back in there and tell them no!

How many times has one of our employees, by this framework, acted ethically and we viewed it as a problem? Far too many times, I’d guess. The person with ethics, who values his or her practice and whose ought statements are clear, can be frustrating. But when viewed through the lens of Tom DeMarco’s talk, suddenly what they’re doing makes a lot of sense.

Like reading one chapter of a novel

A friend of mine shared an excellent analogy about commenting code the other day. He likened reading code without comments to reading one chapter of a novel and expecting to understand the whole story. It really doesn’t matter how well the chapter is written; it needs the surrounding story to make sense.

This stemmed from a conversation about commenting philosophy in code. I’ve heard a couple of times recently that code comments are a weakness, that they represent a failure to write the code in a transparent manner. That’s like saying the last chapter of The Great Gatsby is a failure because you can’t understand the whole story by reading just that chapter.

Code does not exist completely on its own. It must interact with other parts of the system and, depending on the design, may or may not do certain things. To understand what a function does may require understanding what the child functions do and don’t do and how the entire transaction hangs together. Particularly when we have to call external services, it’s helpful to have context. Code comments are like the CliffsNotes or margin notes in a book. They help you quickly tie back to other parts of the code and understand why the code is doing something, not necessarily what it is doing.

I wholeheartedly agree that writing “//increment i” just before the line “i++” is a useless comment, but there clearly are uses for comments. Comments are not just those things used by “weak” programmers who can’t write code clearly enough. They serve a valuable purpose, particularly in understanding the intent of the code during code reviews.
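
Here’s a small, hypothetical illustration of the difference; the retry logic and the vendor quirk are invented, but the contrast is the point – the first comment merely restates the code, while the second records a “why” that the code alone can’t tell you.

```python
import time

MAX_RETRIES = 3

def fetch_quote(get_quote):
    """Call an external pricing service, retrying on connection failures."""
    for attempt in range(1, MAX_RETRIES + 1):
        try:
            return get_quote()
        except ConnectionError:
            if attempt == MAX_RETRIES:
                raise
            # Useless "what" comment: sleep for `attempt` seconds.
            #
            # Useful "why" comment: the vendor throttles rapid reconnects,
            # so we back off a little longer on each attempt instead of
            # hammering the service and getting locked out.
            time.sleep(attempt)
```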

Could a high standard deviation be good for your business?

If you had asked me a few weeks ago, I would have told you that a high standard deviation in your process was never a good thing. After all, being unpredictable isn’t good. That’s what gets drilled into your head when you study Six Sigma.

The conversation started with a group of people discussing sports and, in particular, the Oakland A’s, who figure prominently in the book “Moneyball.” As stats geeks, we admire all that went into making the A’s a great team on a much smaller budget than many others. But the A’s are boring. You might get excited about Moneyball but, at least from where I stand, the A’s don’t fare well. If you use attendance as a measure, the A’s rank near the bottom – between 27th and 29th, depending on the site you use.

So, how do we rationalize this disconnect? Solid data suggests that a far less monied team can compete, but the fans don’t show up. And the reality is that sports is ultimately a business like any other, and you need the fans to show up. So why didn’t a better average performance draw in the fans?

Well, because predictable is… boring. There are places where predictable isn’t good and a high standard deviation is. What do people remember about the Red Sox? Eighty-some-odd years of coming close and failing, then the excitement of a few years of wins, and now we’re back to the flame-out stories. We like it when the game is exciting. And for winning to be exciting, there must be losses. Frankly, I’ll turn off a game if it’s a blowout. If you could design a team that was reliably better than every other team, I bet nobody would watch. Predictable outcomes aren’t fun outcomes.

Don’t believe me? Check out the analysis of JCPenney’s “no more coupons” approach. Lower prices are more predictable and overall should be better for customers, but it doesn’t work. Sometimes we have to appeal to something other than a reliable, repeatable experience. The trick is figuring out when to be predictable – a lot of the time, but not all the time.

The only outcome that really matters is the one the customer sees

The other day I was reviewing a set of proposed metrics with a group of business analysts. We had already been through one round with their metrics and cut the quantity from fifty-plus down to about twenty-five. It was an improvement, but it still wasn’t good enough.

In the time between the first meeting and this one, which had been a while, my thinking on metrics had evolved some. The issue wasn’t that the team wanted to measure their performance – that’s a good start. The issue was that the team wanted to measure only their performance.

In particular, one measure that caught my attention was the requirements defect rate. In essence, this is a simple measure: the number of requirements defects divided by total defects found. But while the definition is simple, the implementation is not. First off, what does it mean to have a requirements defect? Is a misunderstanding about a requirement a requirements defect or a coding defect? If the requirement is missing, is that a requirements defect, or was it simply an implied requirement that any reasonable developer ought to have known? For certain, there are some defects where the source of injection is well understood, but there are many others where it is not.

But more importantly, it finally clicked for me when Dr. Hammer said that, regardless of your role, you should be measured against the customer outcome. The example he used at the time was that a parcel delivery service ought to be measured on on-time delivery – and everyone in the organization should be, regardless of their job. Why? Because everyone has a hand, directly or indirectly, in making sure the package gets there. And if some teams are allowed to measure themselves differently, they can choose a set of measures that causes sub-optimization of the entire process. In essence, if the business analysts focused on what percentage of defects were requirements issues, quality could get worse (higher overall defect density) while the number of requirements defects stayed about the same. The end result is that the customer would be more unhappy, the business analysts would look unjustifiably good, and nobody would undertake appropriate corrective action.
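
A toy before-and-after calculation makes the sub-optimization concrete; every figure here is invented for illustration.

```python
# Hypothetical before/after releases -- all numbers invented for illustration.
releases = {
    "before": {"total_defects": 100, "requirements_defects": 20, "ksloc": 50},
    "after":  {"total_defects": 200, "requirements_defects": 20, "ksloc": 50},
}

for name, r in releases.items():
    req_rate = r["requirements_defects"] / r["total_defects"]
    density = r["total_defects"] / r["ksloc"]
    print(f"{name}: requirements defect rate {req_rate:.0%}, "
          f"overall defect density {density:.1f} per KSLOC")

# before: requirements defect rate 20%, density 2.0 per KSLOC
# after:  requirements defect rate 10%, density 4.0 per KSLOC
# The analysts' own measure improved while the customer's experience got worse.
```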

What end does it serve, as a business analyst or any other team, to have a measure that simply allows you to say “not my fault” while the customer suffers? No, I argue: if you want to make sure the customer is getting what they need and want, then the only measures that matter are the ones that measure what the customer experiences – not some slice that serves only to absolve one team of responsibility for helping the organization meet that goal.

That’s not root cause

Far too often I read some email or paper explaining in detail the “root cause” of an issue as something along the lines of “in module X when calling function Z, if a null is passed in variable Q then blah, blah, blah will happen. Root cause is to fix the module to test for a null in variable Q and…”

This is not a root cause. I know why so many people think it is. When we go to fix a bug we can fix it the right way or the wrong way. The wrong way, perhaps, is to fix it not at the source but to devise some sort of workaround. These things are often sloppy and, by most developers’ standards, highly undesirable. So to a developer, a root cause simply means where the problem begins. And, of course, even this is a bit gray. If function X calls function Y, passing an invalid value, which function is at fault? If you are a proponent of defensive programming, you’d argue that function Y is at fault for not checking its inputs. If you are a proponent of design by contract, you’d argue that function X should never have called function Y with an invalid value.
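
A minimal sketch of the two camps – the function names and checks are invented for illustration – shows the disagreement, and also why neither answer is the root cause: neither says anything about why the bad value was produced in the first place.

```python
def apply_discount_defensive(price, rate):
    # Defensive programming: the callee checks its own inputs and rejects
    # bad values, no matter who the caller is.
    if price is None or not (0.0 <= rate <= 1.0):
        raise ValueError(f"invalid arguments: price={price}, rate={rate}")
    return price * (1 - rate)


def apply_discount_by_contract(price, rate):
    # Design by contract: the callee assumes the caller honored the
    # precondition (price is a number, 0 <= rate <= 1) and does no checking.
    # If a bad value arrives here, the *caller* broke the contract.
    return price * (1 - rate)
```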

Frankly, it doesn’t matter to me, because neither is the root cause. The real root cause is the reason you made the mistake in the first place. Why did the code ever get written such that X could call Y with a bad value, or that Y wasn’t checking its inputs? Why did we make these design decisions? Why did we make these coding choices? Why was it built that way? When you start asking and answering these questions, instead of talking about where in the code a fix is needed, then you can start getting somewhere. If you stick to “where in the code was a fix needed,” you can never do anything with the information. You won’t make that exact same mistake in the exact same line of code – because you just removed that bug. But knowing where in the code you made the fix won’t help you figure out how to prevent the next mistake.

Three clues you’re on the wrong path with root cause

I’ve come to the conclusion that there are three things that I hear all the time which are now my clue that a team isn’t on the right path when it comes to root cause.

  1. “Nobody could have caught it” and its variants, like “once in a blue moon,” are clues that the person speaking believes there are defects nobody can catch. The funny thing is, if you look closely enough, and at enough defects, you will see that we make the same types of mistakes over and over and over again. For example, numerous organizations commonly make date/time errors. These errors are common because working with dates and times is tricky: leap years, end of year, and end of day trip us up constantly (see the sketch after this list).
  2. “It was so obvious [QA should have had a test case for it].” The implication here, sometimes stated and sometimes not, is that if only QA had written the right test case, the mistake would’ve been easy to see. Of course, in my mind, this raises the question: if it was so obvious, why didn’t the developer catch it? This is the “quality is not my problem” attitude. If only those testers had done their jobs properly, this wouldn’t have happened. But, frankly, if only the developer hadn’t made the mistake, we wouldn’t need QA at all.
  3. “Better communication.” Given the number of times I’ve heard this root cause (as in, “better communication of X is needed”), one would think we don’t talk to each other at all. Now, perhaps we don’t talk to each other enough sometimes, but the grim reality is that given enough communication, it all starts to sound like noise. I’ve had to beg to be taken off email distribution lists about all kinds of nonsense because we can’t separate signal from noise. Even with much of the noise removed, it’s impossible to communicate everything; we are neither capable of storing nor retrieving all that information at exactly the right time. More and better communication is a red herring. You are simply approaching the limits of how much information people can handle at one time.
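
As a concrete instance of the date/time trap in the first item, here’s a classic end-of-month mistake; the dates and the helper function are made up for illustration.

```python
from datetime import date

def naive_one_month_later(d):
    # Naive "add a month": keep the day, bump the month number.
    # It works for most dates, then fails at the edges.
    return d.replace(month=d.month % 12 + 1,
                     year=d.year + (1 if d.month == 12 else 0))

print(naive_one_month_later(date(2015, 3, 15)))    # 2015-04-15, looks fine

try:
    print(naive_one_month_later(date(2015, 1, 31)))
except ValueError as err:
    # There is no February 31st -- the end-of-month case blows up.
    print("end-of-month case fails:", err)
```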

If you’re hearing the above things when you’re doing root cause, then you’re doing it wrong, or at least not deeply enough.