Code Complete
Several months ago I finished reading Steve McConnell's Code Complete 2nd Edition. I learned a lot from it and took notes. Some of what the book contained were methodologies that I already knew and used on a daily basis. Other parts of the book opened my eyes to a higher road in programming. My goal in reading the book was to become a better programmer, a better program designer or architect, and to better understand the reasons behind doing things a certain way. Especially helpful was the fact that most of the practices described in the book were practices we had implemented internally at my current job at the time. So aside from learning new things, they were new things that I was expected to know.
As I said, I already bought into a lot of the material of "Code Complete" before I had read it, but there were some things that I did just because, yet I knew there must be a good reason why I did them. In this next series of posts I am going to review parts of the book by reviewing my notes. I recommend that every programmer read this book. It really applies to everyone at every level, and every programming language. Nobody is exempt. In the past I have talked to some programmers of certain languages and they tried to convince me that "Code Complete" was only for C, C++ and C#. Those are the primary languages referenced in the book, but "Code Complete" is deeper than programming language choice. As I review my notes I may draw information from other sources as well. Please bear with me as I strive to convey my new knowledge in a meaningful fashion.
Valid Reasons to Create a Routine
For all intents and purposes the term "routine" here means any language construct that resembles (per the language) a function, procedure, sub-routine, method, and in some cases accessors and mutators (getters and setters).
In order to reduce complexity
The first reason that Code Complete gives that a programmer may choose to introduce a new routine is to reduce complexity. One of the general examples given was a situation when you might have a nasty long elaborate conditional expression that you want to pop into an if statement. Here is an example:
    using System;

    namespace Nathandelane.Examples.Conditionals
    {
        public class ShowDifficultConditional
        {
            public ShowDifficultConditional()
            {
                string name = "Nathan Lane";
                string position = "Software Developer";
                int age = 29;

                if (name.StartsWith("Na")
                    && name.Contains("La")
                    && (position.Contains("Dev") || position.Contains("Quality"))
                    && (age < 35))
                {
                    Console.WriteLine("{0} is a young {1} at age {2}", name, position, age);
                }
                else
                {
                    Console.WriteLine("{0} is not such a young {1} at age {2}", name, position, age);
                }
            }
        }
    }
I define an "elaborate...conditional expression" as any conditional expression that combines more than a single conditional operator. In this case I have five conditional tests, two of which form their own compound conditional. (Aside: I don't normally like to stack my conditional statements, but because this blog has a relatively small amount of horizontal real estate, I chose to display them this way, which in my opinion is pretty ugly. But ignore it for now.)

The principle of creating a routine to reduce complexity extracts a nasty, long, elaborate conditional expression into its own routine and replaces its original nastiness with a call to that routine. So in the case of the above, following this principle might look something like this:
    using System;

    namespace Nathandelane.Examples.Conditionals
    {
        public class ShowDifficultConditional
        {
            public ShowDifficultConditional()
            {
                string name = "Nathan Lane";
                string position = "Software Developer";
                int age = 29;

                // The locals are passed in, because the extracted routine cannot see them otherwise.
                if (NastyConditionalIsTrue(name, position, age))
                {
                    Console.WriteLine("{0} is a young {1} at age {2}", name, position, age);
                }
                else
                {
                    Console.WriteLine("{0} is not such a young {1} at age {2}", name, position, age);
                }
            }

            private static bool NastyConditionalIsTrue(string name, string position, int age)
            {
                bool result = name.StartsWith("Na");
                result = result && name.Contains("La");

                bool subResult = (position.Contains("Dev") || position.Contains("Quality"));
                result = result && subResult;
                result = result && (age < 35);

                return result;
            }
        }
    }
Now as you can see the nastiness isn't really gone, but the code complexity has been reduced drastically, and now I can deal with problems in that conditional expression without worrying about making sure that the if statement is formatted correctly, because I can see that it is without thinking too much about it.

While this principle is cool I wouldn't recommend using it too often. If you have an ugly conditional expression, you may want to re-evaluate your program in general. For example a while back I wrote a parser for a calculator program that had a huge dispatch conditional block that consisted of about thirty nasty embedded conditional blocks. I reworked it several times and it just turned out to always be a mess, and there were always more "if's" and "else if's" to add in. So I changed the whole thing over to the state pattern, did away with all of the conditional guesswork, and simplified the work greatly. In a way I did follow this principle, but to a greater degree than simply creating a new routine; I created a new class of routines.
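To give a rough idea of what moving from a dispatch conditional to the state pattern can look like, here is a minimal sketch in JavaScript. It is not the calculator parser described above; the tokenizer, the state names (expectingAnything, readingNumber) and the token shapes are all made up for illustration. Each state is an object that handles one character and returns the state that should handle the next one.

```javascript
// Hypothetical sketch, not the original calculator parser.
// State used while the digits of a number are being collected.
const readingNumber = {
  handle(ch, ctx) {
    if (/[0-9]/.test(ch)) {
      ctx.buffer += ch; // keep accumulating digits
      return readingNumber;
    }
    ctx.flushNumber(); // the number has ended, emit it
    return expectingAnything.handle(ch, ctx); // let the other state decide
  }
};

// State used between tokens.
const expectingAnything = {
  handle(ch, ctx) {
    if (/[0-9]/.test(ch)) {
      ctx.buffer = ch;
      return readingNumber;
    }
    if ("+-*/".includes(ch)) {
      ctx.tokens.push({ type: "operator", value: ch });
      return expectingAnything;
    }
    if (ch === " ") {
      return expectingAnything; // skip spaces
    }
    throw new Error("Unexpected character: " + ch);
  }
};

function tokenize(input) {
  const ctx = {
    buffer: "",
    tokens: [],
    flushNumber() {
      if (this.buffer !== "") {
        this.tokens.push({ type: "number", value: Number(this.buffer) });
        this.buffer = "";
      }
    }
  };

  let state = expectingAnything;
  for (const ch of input) {
    state = state.handle(ch, ctx); // no central if/else-if chain
  }
  ctx.flushNumber(); // emit a trailing number, if any
  return ctx.tokens;
}
```

Supporting a new kind of input then means adding a new state object rather than threading another "else if" through one large block.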
In order to introduce an intermediate, understandable abstraction
In my HGrep program I have a lot of elaborate configuration settings. The software itself has 19 documented command-line argument options, some of which have multiple sub-options. The optional command-line arguments that are available are just one method of abstracting the interface for the program into intermediate, more understandable abstractions.
To better accommodate these abstractions in the code, I create separate methods that provide settings to the agent for each of the possible command-line argument states. The immediate benefit of this is that if one of my arguments changes for some reason, then I can accommodate that change simply in the code by making a change to the method that corresponds to the argument.
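The idea of one method per command-line argument might be sketched like this in JavaScript. This is not HGrep's code: the option names (--url, --timeout, --ignore-case), the settings fields, and the functions are all hypothetical, chosen only to show the shape of the approach.

```javascript
// Hypothetical options and settings, for illustration only.
// Each option gets its own small routine that knows how to apply it.
function applyUrl(settings, value) {
  settings.url = value;
}

function applyTimeout(settings, value) {
  const seconds = Number(value);
  if (Number.isNaN(seconds)) {
    throw new Error("--timeout needs a number, got: " + value);
  }
  settings.timeoutSeconds = seconds;
}

function applyIgnoreCase(settings) {
  settings.ignoreCase = true;
}

// Maps each option name to the routine that handles it.
const optionHandlers = new Map([
  ["--url", applyUrl],
  ["--timeout", applyTimeout],
  ["--ignore-case", applyIgnoreCase]
]);

function buildSettings(args) {
  const settings = { url: null, timeoutSeconds: 30, ignoreCase: false };

  for (const arg of args) {
    // Split on the first "=" only, so values may contain "=" themselves.
    const eq = arg.indexOf("=");
    const name = eq === -1 ? arg : arg.slice(0, eq);
    const value = eq === -1 ? undefined : arg.slice(eq + 1);

    const handler = optionHandlers.get(name);
    if (!handler) {
      throw new Error("Unknown option: " + name);
    }
    handler(settings, value);
  }

  return settings;
}
```

If the meaning of one option changes, only its own routine has to change; the loop in buildSettings stays as it is.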
In order to avoid duplicate code
This one should be a given to anybody who has been programming for more than a year, even if you're just a hobbyist. Duplicating code is generally speaking a great big no-no. Not that it's really that bad, except when it comes to maintenance. Think about it. What if you had duplicated six lines of code 12 times in your program, and one day you decided that two of those lines needed to change? You would really need to make the change to 24 lines, and what would happen if you missed or forgot one? Ideally you would extract those six lines of code into a single routine, and then call that routine in those 12 places in your program. Then when your two-line code change came along, you would only need to worry about changing those two lines of code.
See how simple that is? It even promotes increased productivity and reduces the likelihood of you making a mistake while making such a change and therefore introducing new defects.
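As a small illustration, here is a sketch in JavaScript where a few formatting steps that could easily have been copied around live in one routine instead. The names (formatPrice, receiptLine, cartTotalLine) and the price format are invented for the example, and it assumes prices are non-negative whole numbers of cents.

```javascript
// Hypothetical example. The formatting steps live in exactly one place.
// Assumes cents is a non-negative integer.
function formatPrice(cents) {
  const dollars = Math.floor(cents / 100);
  const remainder = String(cents % 100).padStart(2, "0");
  return "$" + dollars + "." + remainder;
}

// Both callers reuse the routine instead of repeating its lines.
function receiptLine(item) {
  return item.name + ": " + formatPrice(item.cents);
}

function cartTotalLine(items) {
  const total = items.reduce((sum, item) => sum + item.cents, 0);
  return "Total: " + formatPrice(total);
}
```

If the format ever needs to change, say to add a thousands separator, the change is made once in formatPrice and every caller picks it up.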
In order to support sub-classing
This principle is an indicator of another of the great design patterns: allowing for a hook routine. Hook routines are routines that may be called by a process, but that don't do anything unless they are implemented by the sub-class. Basically the unimplemented routine is the hook, and an inheriting class can override that routine to allow for special usage.
Hooks give child classes a little bit more say in what goes on behind the scenes. One example that I can remember from "Head First Design Patterns" is a pizza store program. This program allowed for the pizza store to implement its own toppings for its pizzas but maintain the same process of making the perfect pizza. The pizza store base class took care of that process, but each implementing pizza store had the ability to add certain toppings, like a special sauce or cheese, by overriding certain addTopping() routines.
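A hook along these lines could be sketched as follows in JavaScript. This is written from the description above, not copied from "Head First Design Patterns"; the class names, the step names, and the addToppings hook are my own assumptions.

```javascript
// Hypothetical sketch of a hook routine; all names are illustrative.
class PizzaStore {
  // The fixed process. Sub-classes are not expected to override this.
  makePizza() {
    const steps = ["roll dough", "add sauce", "add cheese"];
    steps.push(...this.addToppings()); // the hook is called here
    steps.push("bake");
    return steps;
  }

  // The hook: contributes nothing unless a sub-class overrides it.
  addToppings() {
    return [];
  }
}

class ChicagoPizzaStore extends PizzaStore {
  // Overriding the hook changes one step without touching the process.
  addToppings() {
    return ["add sausage", "add extra mozzarella"];
  }
}
```

Calling makePizza() on a plain PizzaStore runs the base process with no extra toppings, while a ChicagoPizzaStore gets its own toppings slotted in just before baking.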
In order to hide sequences
Probably the most common, and also important, sequence you might find in programming (especially object-oriented programming) is the initialization of an object. In object-oriented programming, objects are initialized by a class constructor, which is a special routine found in the class. The constructor is where we usually initialize instance variables, and we can call other private methods to do special tasks if one of the constructor arguments is a state variable.
Wanting to hide a sequence from the developer using your API is an honorable desire in creating routines. Above I talked about hooks, which are routines that are found in sequences but are left to the implementor to take care of. These hooks, when implemented, may cause minor or significant changes to a sequence. But the sequence is hidden in another routine that ensures the implementor doesn't mess with the overall sequence.
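Here is one way hiding a sequence might look in JavaScript. Everything in it is hypothetical (createSession, the three step functions, and the session fields); the point is only that the caller sees a single routine while the required order of steps stays in one place.

```javascript
// Hypothetical initialization steps that must run in this order.
function loadDefaults(session) {
  session.settings = { greeting: "Hello" };
  session.log.push("defaults");
}

function applyUser(session, userName) {
  // Relies on loadDefaults having created session.settings already.
  session.settings.greeting = session.settings.greeting + ", " + userName;
  session.user = userName;
  session.log.push("user");
}

function markReady(session) {
  session.ready = true;
  session.log.push("ready");
}

// The only routine a caller needs. The ordering is hidden inside it.
function createSession(userName) {
  const session = { log: [], settings: null, user: null, ready: false };
  loadDefaults(session);        // 1. must happen first
  applyUser(session, userName); // 2. needs the defaults
  markReady(session);           // 3. only after the first two steps
  return session;
}
```

A caller writes createSession("Nathan") and never has to know that applyUser would fail if it ran before loadDefaults.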
In order to hide pointer operations
In low-level programming languages, or programming languages that provide access to the lower levels of operating systems, such as C and C++, pointer operations are exposed at the rawest level. Pointer operations can often be confusing, and as such should be encapsulated so that they are more easily maintainable. Some such pointer operations might include allocating and de-allocating memory, instantiation of a class or struct, or even the de-referencing of a pointer. These can all be simplified by wrapping their operations in a high-level routine.
In order to improve portability
Most programming languages are portable across computer architectures to some extent, but almost all programming languages have some caveats. Some examples might include the inclusion of Win32 API extensions in order to better support Windows 32-bit architectures, or the Curses library to support text-terminal handling on Unix-like systems. One more common instance that we might see for using a method to improve portability comes from JavaScript. JavaScript, though an ECMA standard, still roughly comes in at least two different flavors: Internet Explorer and everything else (standards-compliant). Because of this many things still differ to some extent across these two general platforms of browsers. Let's take for example the method of attaching events to an object.
    /**
     * alertMe is a function that calls alert with the object passed to it.
     * @param {event} e
     */
    function alertMe(e) {
        alert(e);
    }

    var someElement = document.getElementById("someElementsId");

    if (someElement.attachEvent) {
        // Internet Explorer's method of attaching events:
        someElement.attachEvent('onclick', alertMe);
    } else {
        // W3C's standard:
        someElement.addEventListener('click', alertMe, false);
        // The third argument above (useCapture) selects the phase: false registers
        // the handler for the bubbling phase, true for the capture phase.
        // IE's attachEvent offers no control over this.
    }
Now I am certain that I would not want to type that code every time I wanted to attach an event to a particular element, so I might do something like this:

    /**
     * myAttachEvent attaches an event handler to an element based on the
     * browser compatibility.
     * @param {element} element
     * @param {string} eventName Like 'click' or 'mouseover'
     * @param {function} callback
     * @param {bool} bubbleUp (Optional) Passed as the third argument (useCapture)
     *        of addEventListener; defaults to false (ignored in IE)
     */
    function myAttachEvent(element, eventName, callback, bubbleUp) {
        if (!bubbleUp) {
            bubbleUp = false;
        }

        if (element.attachEvent) {
            element.attachEvent('on' + eventName, callback);
        } else {
            element.addEventListener(eventName, callback, bubbleUp);
        }
    }
So now I have a function that will consistently attach events to elements across all [modern] browsers, and it even handles bubbleUp as an optional argument. That value is handed to addEventListener as its third argument, which the DOM standard calls useCapture: false registers the handler for the bubbling phase, true for the capture phase. Most of the time the default is what we want, so if the argument is omitted, I set it to false. Once again this is a very good reason for creating a routine, and I use it regularly.

In order to simplify boolean tests
Several times in my experience in programming I have had to deal with the inevitable large list of boolean tests for a single result. Sometimes when you experience this, it means you have a design flaw. However, design flaw or not, you have to deal with them. Most languages also offer boolean short circuiting, which makes large boolean tests simpler, but large tests can still become overwhelming. Short circuiting in boolean tests means that the operands are evaluated from left to right, with parenthesized sub-expressions evaluated as a unit when their turn comes, and that evaluation stops as soon as the overall result is known. Because of short circuiting, the results of a test like this are easy to determine:
    if (true && !false)
    {
        Console.WriteLine("True");
    }
    else
    {
        Console.WriteLine("False");
    }
In this expression true is evaluated first; if that first operand were false, then !false would never be evaluated. In C and C++ the built-in && and || operators are guaranteed to short circuit in the same way (only user-defined overloads of those operators in C++ lose the behavior), so the same test can be written directly:

    if (true && !false) {
        printf("True");
    } else {
        printf("False");
    }
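Short circuiting is easy to observe by recording which operands actually get evaluated. The following JavaScript sketch does that with a made-up helper called check, which logs its label before returning its value.

```javascript
// Hypothetical helper: remembers that it was called, then returns value.
const calls = [];
function check(label, value) {
  calls.push(label);
  return value;
}

// "b" is never evaluated, because "a" already makes the && false.
const andResult = check("a", false) && check("b", true);

// "d" is never evaluated, because "c" already makes the || true.
const orResult = check("c", true) || check("d", false);

console.log(calls.join(","));
console.log(andResult, orResult);
```

Only the labels "a" and "c" end up in calls, which shows that the right-hand operands were skipped in both tests.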
Anyway, back to complex boolean expressions. Let's say that you had a number of criteria pertaining to an employee used to determine the number of paid days off they receive during a single year: years with company, number of hours worked per week, employment status (full-time, part-time, contractor, intern), and whether or not they are enrolled in the incentive program. Now let's pretend that you put all of that into a huge conditional (this is obviously bad programming but I'm trying to make a point with what little I have to go on):

    int paidDaysOff = 0;

    if (_yearsEmployed > 5
        && _weeklyHours >= 30
        && (_employmentStatus == Employment.FULL_TIME || _employmentStatus == Employment.PART_TIME)
        && _incentive == true)
    {
        paidDaysOff = 10;
    }
    ...
While this long conditional is not extremely complex, the fact that it is long makes it a good candidate for extraction into its own routine, to ensure ease of maintenance and readability:

    bool employeeDeserves10DaysOff()
    {
        bool result = _yearsEmployed > 5
            && _weeklyHours >= 30
            && (_employmentStatus == Employment.FULL_TIME || _employmentStatus == Employment.PART_TIME)
            && _incentive == true;

        return result;
    }

    int getPaidDaysOff()
    {
        int paidDaysOff = 0;

        if (employeeDeserves10DaysOff())
        {
            paidDaysOff = 10;
        }

        return paidDaysOff;
    }
See how much more readable that is?

In order to improve performance
The final good reason on this list to create a routine is to improve performance. I have had little to no experience with this particular scenario, as either I have not ever had a need to improve performance, or I haven't known of a need to improve performance. Either way I hope that if you do find yourself in this situation, rather than assuming that you need to make a routine, you step back and take a look at the big picture. Are you utilizing your compiler's option to optimize for performance? Are you doing things the way that the programming language publisher recommends you do them? Are you following design patterns or do you just have a mess of code? Also, do you have a good set of fully functioning and passing unit tests? These things will help your program to be more efficient, thus increasing performance.
I have many more note pages like this one, and I hope that I have the opportunity to review and share those as well. I hope this was as informative for you as it has been for me. Thanks for letting me take some of your time.