16 March 2011

How to Write Software Right

I have been working on, testing, developing, breaking, engineering, architecting, designing, and creating software for over twenty years. Seven and a half of those years have been professional. I have spent seven of my twenty years in school, participating in Computer Science related activities. I have read thousands of books, essays, tutorials, forums, blog posts, and reports on subjects spanning the IT industry. I have worked with eight companies, on five open community projects run by others, on more than ten open source projects of my own, with hundreds of coworkers both domestically and abroad, and on several contracts, some of my own initiation and some because others wanted my help. I also participate casually in four local programming language and API user groups and a couple of global API user groups, and I have attended the Microsoft Tech Ed and Agile Roots conferences. My point is that I have spent a fairly large percentage of my life on software, across a lot of different economic situations, team and solo environments, and software communities. And in more than eighty percent of my overall experience, the resulting software has been unacceptably buggy; politics has played a greater role than discrete quality in deciding when software is complete; programmers haven't had the time, and in many cases the experience, to engineer software properly; and management, of both software activities and people, has been poorly executed, partly from lack of understanding of the problems being solved and their solutions, and partly, again, from lack of experience.

The Problem, As I See It

The problem, as I see it, has a lot to do with management. As projects grow, management becomes a greater necessity. This is well-founded, but projects need the right kind of management, and the right kind of management is an aspect of Software Engineering that, in my mind, has yet to be fully understood. MBA students learn strategies for succeeding in the marketplace. While those strategies work well in supply-demand situations where the product is a physical, tangible commodity, they do not work as well when the product is abstract and service-based, as is the case with software. Software does not follow supply-demand rules the way commodities like gasoline, vegetables, and fast food do. In software, the commodity isn't what the end-user gets; it is what the producer has. The commodity is not the software, it is the programmer, the software architect, the technical writer, the software engineer, and the software quality assurance professional. And that commodity needs more than just carnauba wax to preserve it. Programmers need training, motivation, and benefits. Good software engineers need to know that they are trusted and entrusted with the final product. Software architects need to be listened to and adhered to, not downplayed and criticized. I know a good number of these commodities who are extremely well-versed and know what they're doing, but when their skill-set is undercut by the bottom line, their performance suffers, and as a result so does the product they are working on.

A sub-problem of the management problem, it seems to me, is the bottom line. Just as in any other commercial industry, the product must be profitable to survive the market, but the commodity, the worker, must be able to engage in the highest possible quality of work to produce a profitable product. When a consumer purchases an engineered product other than software, such as an HD television, an electric mixer, or an automobile (excluding the software-driven components), it is expected to be of the highest quality.

A Case In Point

Nine years ago, my wife purchased a two-year-old 2000 Chevrolet Malibu sedan. Chevrolet is generally a good company that produces high quality vehicles at an astounding rate and at a lower cost to the consumer than many companies. A company like BMW, on the other hand, has been producing superbly engineered, high quality products for years as well, but its vehicles cost significantly more than a Chevrolet. As an example, a 2010 Chevrolet Malibu four door sedan has an invoice cost of $20,733, while a comparable 2010 BMW 3 Series four door sedan has an invoice cost of $30,500, costing the consumer about 47% more at the bank. The BMW 3 Series is a quality vehicle. The Chevrolet Malibu is a quality vehicle. And while I have never driven or owned a BMW 3 Series, somehow I expect that it has never had any of the engineering problems that my wife's 2000 Chevrolet Malibu has experienced: turn signals that stop working unless you jimmy the emergency signal button (why? because the connection between the turn signals, the emergency signals, the emergency signal button, and the turn signal wand was soldered together); brake pads and rotors that need to be replaced every three to six thousand miles (professionals expect to change brake pads and rotors around 20,000 to 50,000 miles depending on usage, but on my wife's Malibu they really wore out that fast, and we were purchasing performance grade pads); and finally several hose fittings to the cooling system that burst, costing hundreds of dollars in repairs, hundreds of dollars in coolant, and leaving a highly pressurized system that continued to fail.
Mechanics were finally able to solve the turn signal issue, but only after my persistent calls to GM to complain about it. After reading on the NHTSA's web site how many people had complained about the problem, and how many had received notice of a recall, I decided to find out why I hadn't received a recall notice. Their cop-out answer was that my vehicle hadn't been manufactured at the same plant as the vehicles that were recalled for the exact same turn signal problem in the same year, make, and model. After several months I finally received a letter stating that they would repair the problem for free at a certified Chevrolet mechanic, but that my situation was not considered a recall.
What was the real reason for the problem? In one word: management. Either somebody didn't manage the quality of the product at its inception, or somebody didn't manage the quality of the product in production, or both. Either way, I'm certain that any vehicle electrical system engineer would tell you that a soldered connection would not last very long under the stress a motor vehicle puts on it. Whether they were given time to engineer it properly is a good question. Across vehicle manufacturers in general, quality appears to be suffering more and more; the recent Toyota recalls, and similar recalls and issues from other manufacturers over the years, are cases in point. These are problems that can be solved by careful engineering and by quality assurance through testing and continuous integration. Motor vehicles should be of higher quality than this, especially given how expensive they are, but also because lives depend on that kind of quality.
Most of the time software isn't so system-critical, but sometimes it is even more so. Imagine if a commercial airliner suddenly lost altitude because the autopilot software thought it was nearing time to land, while in fact the airliner was flying over the middle of the ocean with no place to land, or over a jagged mountain range. Imagine if a space shuttle computer suddenly locked up due to a software defect, causing a fatal crash. Luckily that didn't happen. Imagine if an automated cannon suddenly discharged on its own military troops due to a software defect, killing some. While most of us in the IT industry don't work on projects of the life-threatening, system-critical nature of these scenarios, some do, and we shouldn't be blaming anybody but ourselves when these sorts of costly problems occur. If management is getting in the way of quality, then we need to step up and let them know. And for the MBAs out there: you need to know that the world doesn't happen according to a textbook.

Acceptable Defect Density

"The software needs to get released at some point,"
your managers are certainly saying. And I completely agree. If you don't release software, then you can't get a return on it. So is there an acceptable number of defects or issues that production software can contain? I don't feel it's fair to lay down an IT-industry-wide standard; rather, each industry using software to control any portion of its products must define its own standard. Better yet, each company must strive for the highest standard possible. In my opinion, zero defects is the most acceptable number of defects in production-ready software. It is pretty unlikely that all defects can be averted in any software project, but I believe that a 99% defect-free rate relative to the size of the system (however you choose to measure that, e.g. defects per line of code) is a realistic goal.
How do we attain a 99% defect-free rate? The answer, though simple in my opinion, is not necessarily easy to implement: you need to fully engineer software. "Fully engineering" means that all parties involved in creating the software participate in analyzing the problem you are trying to solve and the risks associated with it. Management needs to be there to help make the problem clear. Project management needs to be there so that you understand what resources are available to solve the problem. Developers, architects, engineers, and technical writers need to be there so that the project is understood by those who are actually solving the problem and doing the work. And the problem and its solutions need to be discussed and developed until the end goal is clear in everybody's mind. I don't mean hold meeting after meeting and fully document every single caveat or problem situation possible in the system, but don't be satisfied with the highest-level explanation of what is wanted. Give enough detail that questions can be asked. Answer the questions in a way that satisfactorily resolves any concerns. And always leave your door open in case issues arise during the development process.
After that comes the development, and then come the real challenges. It has been said that in order to fully understand how to solve a problem with software, you must begin solving it. Even with tools like programming language knowledge, computer architecture knowledge, design patterns knowledge, and all of the experience and wisdom in the world, every problem solved with software is different. You might even come to solve the same problem in a different way depending on the reason for solving it.
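To make "defects per line of code" concrete, here is a minimal sketch in Python of how a team might track defect density against a release bar. The project size, defect count, and threshold below are all hypothetical, purely for illustration; the point is that the bar is chosen by the team, not handed down as an industry standard.

    # A minimal sketch of tracking defect density, assuming system size is
    # measured in lines of code. All numbers here are hypothetical.
    def defects_per_kloc(open_defects, lines_of_code):
        """Defect density as defects per thousand lines of code."""
        return open_defects / (lines_of_code / 1000.0)

    density = defects_per_kloc(18, 120000)  # hypothetical: 18 defects, 120 KLOC
    threshold = 0.25                        # a team-chosen release bar
    print("Density: %.2f defects/KLOC" % density)
    if density > threshold:
        print("Hold the release; the quality bar is not met.")
    else:
        print("Within the team's release threshold.")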
Back to defect density in the real world, Steve McConnell, author of Code Complete, writing in IEEE Software, says [Bibliography-1]
"[A software company's] task is treacherous, treading the line between releasing poor quality software early and high quality software late. A good answer to the question, 'Is the software good enough to release now?' can be critical to a company's survival."
Once again the problem with the software industry is fully exposed. Why can't we just write software correctly? I recently read a compilation of reports on systems engineering needs, organized for the United States Air Force by Edward R. Comer of Software Productivity Solutions, Inc. It is not surprising to me that its conclusions are the same as those of most of the industry. For example, in the Needs Survey of the "1975 NRL Navy Software Development Problems Report", the "result of a year-long investigation into Navy software problems...based on interviews with...people associated with Navy software development", the following twelve recommendations were made [Bibliography-2]:
  1. Unify life cycle control of software. Development responsibilities for a system should not be split, and maintenance activity should not be independent of development activity.
  2. Require the participation of experienced software engineers in all system discussions. This is especially crucial for early decisions such as the determination of the system configuration, assignment of development responsibility, and choice of support software.
  3. Require the participation of system users in the development cycle from the time requirements are established until the system is delivered. Changes which are inexpensive and easy at system design time are often extremely expensive and difficult after the software has been written.
  4. Write acceptance criteria into software development contracts. This will help avoid unnecessary misunderstandings and delays for negotiation before a system is delivered.
  5. Develop software on a system that provides good support facilities. If necessary, consider developing support software prior to or in conjunction with system development.
  6. Design software for maximum compatibility and reusability. Premature design decisions should be avoided; logically related systems should have their differences isolated and easily traceable to a few design decisions.
  7. Allocate development time properly among design, coding and checkout. Since manpower-allocation estimates are based in part on the time estimates for different phases of development, improper estimation can be quite expensive.
  8. List, in advance of design, all areas in which requirements are likely to change. This can be done at the time requirements are stated and will help the designer partition the software to isolate areas most likely to change.
  9. Use state-of-the-art design principles, such as information hiding. Principles which optimize reliability, cost reduction, and maintainability should be emphasized.
  10. Critical design reviews should be active reviews and not passive tutorials. Sufficient time must be allowed to read design documents before the review, and the documents must be readable.
  11. Do not depend on progress reports to know the state of the system. Programmer estimates are typically biased; milestones are a more accurate indication of development progress.
  12. Require executable milestones that can be satisfactorily demonstrated. Milestones demonstrating system capabilities that will rest on major design decisions should be written into development contracts.
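Recommendation 9's mention of information hiding deserves a concrete illustration. The sketch below is my own minimal Python example, not the report's: callers depend only on a small interface, and the storage decision hidden behind it can change without breaking them.

    class MileageLog:
        """Tracks vehicle fill-ups; how entries are stored is a hidden detail."""
        def __init__(self):
            self._entries = []  # private: (odometer, gallons) pairs

        def record_fillup(self, odometer, gallons):
            self._entries.append((odometer, gallons))

        def average_mpg(self):
            if len(self._entries) < 2:
                return None  # need two fill-ups to measure consumption
            miles = self._entries[-1][0] - self._entries[0][0]
            gallons = sum(g for _, g in self._entries[1:])
            return miles / gallons

    # Callers use only record_fillup() and average_mpg(); the private list
    # could become a file or database table without touching any caller.

Hiding the decision most likely to change, here the storage format, behind a stable interface is also exactly the kind of partitioning that recommendations 6 and 8 call for.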
These twelve recommendations follow suit with what many people in the industry say is needed, and with what many software development processes try to provide. Perhaps the problem with software processes, however, is the exact same problem that exists with software generally: there is no silver bullet! I have worked with several companies and groups who have used Agile as part of their process. One of the attractions of Agile software development is that there are several different methods of implementing it. Back when Agile was young, many schools of thought said, "if you're not doing this, then you aren't Agile". More recently Alistair Cockburn basically told us, at the Agile Roots conference held in Salt Lake City, UT in July of 2010, that "Agile is agile", and there is no wrong way to implement the Agile software development methodology. More importantly, implementing only the pieces of the different Agile methods that matter to your organization is the best way to implement Agile. For example, make daily stand-up meetings part of your day-to-day process, and make your projects transparent; those are pieces of two different Agile methods. But again, processes don't solve software problems, people do. And if a process or bureaucratic red tape is causing your project to fall behind, then get rid of it. Processes, in my opinion, are meant to help teams with little or no direction find direction. But teams outgrow processes all the time, and making a team conform to a standard that isn't working for them holds back that team's potential.

And Finally There's Quality

Software quality assurance is becoming a larger part of the software industry. More and more software development shops are opening quality assurance departments. But this trend does not show that software is coming out at a higher quality; rather it underlines the quality issues that most companies have experienced in the past and will continue to experience. The companies building up their QA departments are simply the ones who realize that quality is becoming a bigger issue than they can manage without dedicated resources.
How many companies would be willing to release their software into the wild even if it had a defect to lines-of-code ratio of 10%? What if your QA department didn't sign off on it and told you that you would lose more money by releasing it now with the 10% defect ratio than you would by releasing it late with a 1% defect ratio? Would you still release it? In my experience, companies release on time with a higher defect ratio than what is internally acceptable more often than not. So where is the problem again? QA is telling you not to release it. Why? Because they know that the risk of needing to refund customers or do non-billable work is higher. QA should be listened to. Assessments made by QA should be regarded no less seriously than making sure that your developers know how to develop, that your architects have designed a complete and well-thought-out product, or that you're implementing time-tested design patterns that have been proven correct.
If you have a QA department, then you probably realize that you have a problem with your software. It's also probable that you don't know where the problem is. It is estimated that designing software well and completely at the beginning is far less expensive than relying on customers to find your issues; it is still better if an internal quality assurance pass finds the problems before you release the software. If you have a QA department and you aren't listening to their advice, then you might as well not have one, because now you're probably losing at least one and a half times the money: you're still relying on customers to find the issues before you'll fix them, because you're downplaying the value of what your QA department has to say. Of course, there's the other side of the coin: some companies use a software quality assurance team precisely to find the defects that their design, unit testing, and code reviews didn't find. Steve McConnell provides some insight in his book, Code Complete [Bibliography-3]:
...software testing alone has limited effectiveness -- the average defect detection rate is only 25 percent for unit testing, 35 percent for function testing, and 45 percent for integration testing. In contrast, the average effectiveness of design and code inspections are 55 and 60 percent.
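Taken at face value, those figures suggest why stacking practices matters. Here is a back-of-the-envelope calculation, assuming, naively, that each practice catches defects independently of the others (in reality the defect classes they catch overlap):

    # Rough arithmetic on McConnell's quoted detection rates, under the
    # simplifying assumption that each practice acts independently.
    rates = {
        "unit testing": 0.25,
        "function testing": 0.35,
        "integration testing": 0.45,
        "design inspection": 0.55,
        "code inspection": 0.60,
    }
    escape = 1.0  # probability a defect slips past every practice
    for practice, rate in rates.items():
        escape *= (1.0 - rate)
    print("Combined detection rate: %.0f%%" % ((1.0 - escape) * 100))
    # Testing alone stops roughly 73% of defects; adding inspections
    # pushes the combined rate past 95% under this assumption.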
In conclusion, engineering software is difficult. There are bad ways to do it, good ways to do it, and better ways to do it. My utopian dream of one day working on the perfect software project may always remain a dream. But if we don't start working smarter on software rather than just harder, the costly consequences are going to continue. Software isn't going away, and it will only continue to become more complex. Software will continue to be used in more and more applications. And software will only become more correct and defect-free if professionals in the software industry become better educated and try harder to make software the right way.

Bibliography
  1. Steve McConnell, "IEEE Software" [http://www.stevemcconnell.com/ieeesoftware/bp09.htm]
  2. Edward R. Comer, "System Engineering Concept Demonstration, Systems Engineering Needs", Software Productivity Solutions, Inc., 1992 [http://www.dtic.mil/cgi-bin/GetTRDoc?Location=U2&doc=GetTRDoc.pdf&AD=ADA265468]
  3. Steve McConnell, "Code Complete", 2nd Edition; quoted via [http://www.codinghorror.com/blog/2006/01/code-reviews-just-do-it.html]