21 October 2010

Patterns and Practices: Scope and Type Inference Through Syntactic Sugar

Code Complete

Once again I'm referring to Code Complete. Code Complete taught me a lot of things, and coding style is one of those that I hold important to this day. Let me preface this by stating that Code Complete was built around C++ and later C# coding. Still, I believe that using a coding style that is indicative of scope and usage makes programs more maintainable. I also believe that more maintainable code can more easily be proven to work and be proven to be correct. Many programming languages, especially dynamically typed ones (typically scripting languages), use syntax to identify scope and possibly type. Here are some examples:

From C++
#include <iostream>
#include <string>
using namespace std;

class MyObject
{
private:
 string name;
public:
 MyObject(const string & name);
 string getName();
};

// The :: token indicates the relationship of this function definition.
MyObject::MyObject(const string & name)
{
 this->name = name; // this indicates the relationship of the left name versus the right name.
}

string MyObject::getName()
{
 return this->name;
}

int main()
{
 MyObject object("Object 1");

 cout << "First object's name: " << object.getName() << endl;

 return 0;
}
From Ruby
class MyObject
 attr_reader :name # The : prefix indicates that name is a symbol.
 
 def initialize(name)
  @name = name # The @ prefix indicates the relationship of the left name versus the right name.
 end
end

object = MyObject.new("Object 1")

puts "First object's name: " + object.name
From Python
# Indentation indicates blocks in Python - there are no curly braces or end statements.
class MyObject:
    def __init__(self, name):
        self._name = name # self and _ indicate the relationship of _name to the class.
        
    def getName(self):
        return self._name
        
object = MyObject('Object 1')

print("First object's name:", object.getName())
From Perl
package MyObject;

sub new {
 my $class = shift; # The $ prefix indicates that the variable is a scalar.
 my $self = {
  _name => shift
 };
 
 bless $self, $class;
 
 return $self;
}

sub getName {
 my($self) = @_; # The @ prefix indicates an array.
 
 return $self->{_name}; # The { and } dereference hash values by key.
}

package main;

$object = new MyObject("Object 1");

print "First object's name: " . $object->getName();
1;
Anyway, hopefully you get the point that many languages hint in some way at how certain names are meant to be used. C++, a statically typed language, requires you to signify which class a method belongs to. The main or global context is inferred, but a class context requires ::. In Ruby a leading colon, :, indicates a symbol, and an @ prefix marks an instance variable, so attr_reader :name exposes @name. Python requires consistent indentation to indicate contained code, such as the body of a function or a class. And finally in Perl there are several sigils that indicate what kind a variable is or how to use it, for example $ = scalar, % = hash, @ = array. All of these things are essentially syntactic sugar.

...and Scope Inference

In Code Complete there are several hints for inferring scope in a program. Some of what I am about to show you was developed somewhat further by a team I worked on recently.

Syntax: Conditions
PascalCasing
  • Namespaces
  • Class names
  • Method names*
  • Properties and public fields
IInterfaceName
  • All interfaces
TGenericType
  • All generic types
  • All template types
_camelCasing
  • Private class-scoped fields
__camelCasing
  • Private class-scoped static fields
camelCasing
  • Method arguments
  • Method-body-scoped variables
  • Method names*
* this may vary depending upon language
Using these scope hints it is very simple to see what the purpose and scope of each object is. For example if I see a variable named _name then I instantly know that it is privately owned by a class, or in the case of the Python example I know that I should not change it manually (Python doesn't have a private class scope).
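As a rough illustration, here is how these hints might look when carried over to Python. The class, field, and method names are made up for this sketch. Python enforces none of it (the underscores are purely a convention, apart from the name mangling noted in the comment), and idiomatic Python would normally use snake_case for methods and variables.

```python
class ReadingLog:
    """PascalCasing: a class name."""

    # __camelCasing: private class-scoped static field.
    # (Python additionally name-mangles a leading double underscore.)
    __instanceCount = 0

    def __init__(self, sensorName):
        # camelCasing: method argument. _camelCasing: private instance field.
        self._sensorName = sensorName
        self._readings = []
        ReadingLog.__instanceCount += 1

    def addReading(self, value):
        self._readings.append(value)

    def getAverage(self):
        # camelCasing: method-body-scoped variable.
        readingCount = len(self._readings)
        if readingCount == 0:
            return 0.0
        return sum(self._readings) / readingCount

    @classmethod
    def getInstanceCount(cls):
        return cls.__instanceCount


log = ReadingLog("probe-1")
log.addReading(20.0)
log.addReading(22.0)
print(log.getAverage())               # 21.0
print(ReadingLog.getInstanceCount())  # 1
```

Reading getAverage, the prefixes alone tell you that _readings belongs to the instance while readingCount lives and dies inside the method.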

11 February 2010

Patterns and Practices: Valid Reasons to Create a Routine

Code Complete

Several months ago I finished reading Steve McConnell's Code Complete 2nd Edition. I learned a lot from it and took notes. Some of what the book contained were methodologies that I already knew and used on a daily basis. Other parts of the book opened my eyes to a higher road in programming. My goal in reading the book was to become a better programmer, a better program designer or architect, and to better understand the reasons behind doing things a certain way. Especially helpful was the fact that most of the practices described in the book were practices we had implemented internally at my current job at the time. So aside from learning new things, they were new things that I was expected to know.

As I said, I already bought into a lot of the material of "Code Complete" before I had read it, but there were some things that I did just because, yet I knew there must be a good reason why I did them. In this next series of posts I am going to review parts of the book by reviewing my notes. I recommend that every programmer read this book. It really applies to everyone at every level, and every programming language. Nobody is exempt. In the past I have talked to some programmers of certain languages and they tried to convince me that "Code Complete" was only for C, C++ and C#. Those are the primary languages referenced in the book, but "Code Complete" is deeper than programming language choice. As I review my notes I may draw information from other sources as well. Please bear with me as I strive to convey my new knowledge in a meaningful fashion.

Valid Reasons to Create a Routine

For all intents and purposes the term "routine" here means any language construct that resembles (per the language) a function, procedure, sub-routine, method, and in some cases accessors and mutators (getters and setters).

  1. In order to reduce complexity

    The first reason that Code Complete gives that a programmer may choose to introduce a new routine is to reduce complexity. One of the general examples given was a situation when you might have a nasty long elaborate conditional expression that you want to pop into an if statement. Here is an example:

    using System;

    namespace Nathandelane.Examples.Conditionals
    {
     public class ShowDifficultConditional
     {
      public ShowDifficultConditional()
      {
       string name = "Nathan Lane";
       string position = "Software Developer";
       int age = 29;
       
       if (name.StartsWith("Na") &&
        name.Contains("La") &&
        (position.Contains("Dev") ||
         position.Contains("Quality"))
        && (age < 35))
       {
        Console.WriteLine("{0} is a young {1} at age {2}", name, position, age);
       }
       else
       {
        Console.WriteLine("{0} is not such a young {1} at age {2}", name, position, age);
       }
      }
     }
    }
    
    I define an "elaborate...conditional expression" as any conditional expression that combines more than a single conditional operator. In this case I have five conditions, two of which form their own compound conditional. (Aside: I don't normally like to stack my conditional statements, but because this blog has a relatively small amount of horizontal real estate, I chose to display them this way, which in my opinion is pretty ugly. But ignore it for now.)

    The principle of creating a routine to reduce complexity extracts a nasty long elaborate conditional expression into its own routine and replaces its original nastiness with a call to that routine. So in the case of the above, following this principle might look something like this:

    using System;

    namespace Nathandelane.Examples.Conditionals
    {
     public class ShowDifficultConditional
     {
      public ShowDifficultConditional()
      {
       string name = "Nathan Lane";
       string position = "Software Developer";
       int age = 29;
       
       if (NastyConditionalIsTrue(name, position, age))
       {
        Console.WriteLine("{0} is a young {1} at age {2}", name, position, age);
       }
       else
       {
        Console.WriteLine("{0} is not such a young {1} at age {2}", name, position, age);
       }
      }
    
      // The constructor's locals are not visible here, so they are passed in as arguments.
      private bool NastyConditionalIsTrue(string name, string position, int age)
      {
       bool result = name.StartsWith("Na");
       result = result && name.Contains("La");
    
       bool subResult = (position.Contains("Dev") || position.Contains("Quality"));
    
       result = result && subResult;
       result = result && (age < 35);
    
       return result;
      }
     }
    }
    
    Now as you can see the nastiness isn't really gone, but the code complexity has been reduced drastically, and now I can deal with problems in that conditional expression without worrying about making sure that the if statement is formatted correctly, because I can see that it is without thinking too much about it.

    While this principle is cool I wouldn't recommend using it too often. If you have an ugly conditional expression, you may want to re-evaluate your program in general. For example, a while back I wrote a parser for a calculator program that had a huge dispatch conditional block consisting of about thirty nasty embedded conditional blocks. I reworked it several times and it always turned out to be a mess, and there were always more "if's" and "else if's" to add in. So I changed the whole thing over to the state pattern, did away with all of the conditional guesswork, and simplified the work greatly. In a way I did follow this principle, but to a greater degree than simply creating a new routine; I created a new class of routines.
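    To give a feel for that kind of rework, here is a minimal sketch of the state pattern in Python. This is not the actual calculator parser; the Tokenizer, NumberState, and OperatorState names are hypothetical and it only splits a string into numbers and operators. The point is that each state decides what a character means and which state comes next, so there is no central block of nested conditionals.

```python
class NumberState:
    """Collects digits until something else shows up."""

    def handle(self, tokenizer, ch):
        if ch.isdigit():
            tokenizer.buffer += ch
            return self
        tokenizer.flush_number()
        # Not a digit: let the operator state decide what to do with it.
        return OperatorState().handle(tokenizer, ch)


class OperatorState:
    """Expects an operator or the start of a number."""

    def handle(self, tokenizer, ch):
        if ch.isdigit():
            return NumberState().handle(tokenizer, ch)
        if ch in "+-*/":
            tokenizer.tokens.append(ch)
        return self  # whitespace and unknown characters are skipped


class Tokenizer:
    """The context: it only delegates to whatever state is current."""

    def __init__(self):
        self.tokens = []
        self.buffer = ""
        self.state = OperatorState()

    def flush_number(self):
        if self.buffer:
            self.tokens.append(int(self.buffer))
            self.buffer = ""

    def tokenize(self, text):
        for ch in text:
            self.state = self.state.handle(self, ch)
        self.flush_number()
        return self.tokens


print(Tokenizer().tokenize("12 + 7*3"))  # [12, '+', 7, '*', 3]
```

    Supporting something new, such as parentheses, would mean adding a state class instead of threading another "else if" through an existing block.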

  2. In order to introduce an intermediate, understandable abstraction

    In my HGrep program I have a lot of elaborate configuration settings. The software itself has 19 documented command-line argument options, some of which have multiple sub-options. The optional command-line arguments that are available are just one method of abstracting the interface for the program into intermediate, more understandable abstractions.

    To better accommodate these abstractions in the code, I create separate methods that provide settings to the agent for each of the possible command-line argument states. The immediate benefit of this is that if one of my arguments changes for some reason, then I can accommodate that change simply in the code by making a change to the method that corresponds to the argument.

  3. In order to avoid duplicate code

    This one should be a given to anybody who has been programming for more than a year, even if you're just a hobbyist. Duplicating code is, generally speaking, a great big no-no. Not that it's really that bad, except when it comes to maintenance. Think about it. What if you had duplicated six lines of code 12 times in your program, and one day you decided that two of those lines needed to change? You would really need to make the change to 24 lines, and what would happen if you missed or forgot one? Ideally you would extract those six lines of code into a single routine, and then call that routine in those 12 places in your program. Then when your two-line code change came along, you would only need to worry about changing those two lines of code.

    See how simple that is? It even promotes increased productivity and reduces the likelihood of you making a mistake while making such a change and therefore introducing new defects.
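    A small sketch of the idea in Python, with made-up function names: the formatting rule lives in exactly one routine, and every caller goes through it.

```python
def format_price(amount, currency="USD"):
    """The one place that knows how a price is displayed."""
    return "{0:.2f} {1}".format(amount, currency)


def receipt_line(item, amount):
    return "{0}: {1}".format(item, format_price(amount))


def invoice_total(amounts):
    return "Total due: " + format_price(sum(amounts))


print(receipt_line("Coffee", 3.5))  # Coffee: 3.50 USD
print(invoice_total([3.5, 1.25]))   # Total due: 4.75 USD
```

    If the display format ever changes, only format_price needs to be touched, no matter how many callers there are.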

  4. In order to support sub-classing

    This principle points to another of the great design patterns: the hook routine. Hook routines are routines that are called by a process but don't do anything unless they are implemented by the subclass. Basically the unimplemented routine is the hook, and an inheriting class can override that routine to allow for special usage.

    Hooks give child classes a little bit more say in what goes on behind the scenes. One example that I can remember from "Head First Design Patterns" is a pizza store program. This program allowed for the pizza store to implement its own toppings for its pizzas but maintain the same process of making the perfect pizza. The pizza store base class took care of that process, but each implementing pizza store had the ability to add certain toppings, like a special sauce or cheese, by overriding certain addTopping() routines.
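    Here is a minimal sketch of a hook routine in Python, loosely modeled on that pizza store idea. It is not the book's code; the class and method names are hypothetical. The base class owns the sequence, and the hook does nothing unless a subclass overrides it.

```python
class PizzaStore:
    """Base class that owns the pizza-making sequence."""

    def make_pizza(self):
        steps = ["roll dough", "add tomato sauce"]
        steps.extend(self.add_toppings())  # the hook is called mid-sequence
        steps.append("bake")
        return steps

    def add_toppings(self):
        """Hook: contributes nothing unless a subclass overrides it."""
        return []


class ChicagoPizzaStore(PizzaStore):
    def add_toppings(self):
        return ["add extra mozzarella"]


print(PizzaStore().make_pizza())
# ['roll dough', 'add tomato sauce', 'bake']
print(ChicagoPizzaStore().make_pizza())
# ['roll dough', 'add tomato sauce', 'add extra mozzarella', 'bake']
```

    The subclass changes what goes on the pizza, but it has no way to reorder or skip the steps in make_pizza.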

  5. In order to hide sequences

    Probably the most common sequence you might find in programming (especially object-oriented programming), and an important one, is the initialization of an object. In object-oriented programming, objects are initialized by a class constructor, which is a special routine found in the class. The constructor is where we usually initialize instance variables, and we can call other private methods to do special tasks if one of the constructor arguments is a state variable.

    Wanting to hide a sequence from the developer using your API is an honorable desire in creating routines. Above I talked about hooks, which are routines that are found in sequences but are left to the implementor to take care of. These hooks, when implemented, may cause minor or significant changes to a sequence. But the sequence is hidden in another routine that ensures the implementor doesn't mess with the overall sequence.

  6. In order to hide pointer operations

    In low-level programming languages, or programming languages that provide access to the lower levels of operating systems, such as C and C++, pointer operations are exposed at the rawest level. Pointer operations can often be confusing, and as such should be encapsulated so that they are more easily maintainable. Some such pointer operations might include allocating and de-allocating memory, instantiation of a class or struct, or even the de-referencing of a pointer. These can all be simplified by wrapping their operations in a high-level routine.

  7. In order to improve portability

    Most programming languages are portable across computer architectures to some extent, but almost all programming languages have some caveats. Some examples might include the Win32 API extensions that target Windows specifically, or the curses library for terminal handling on Unix-like systems. One more common instance that we might see for using a routine to improve portability comes from JavaScript. JavaScript, though an ECMA standard, still roughly comes in at least two different flavors: Internet Explorer and everything else (standards-compliant). Because of this many things still differ to some extent across these two general platforms of browsers. Let's take for example the method of attaching events to an object.

    /**
     * alertMe is a function that calls alert with the object passed to it.
     * @param {event} e
     */
    function alertMe(e) {
     alert(e);
    }
    
    var someElement = document.getElementById("someElementsId");
    
    if(someElement.attachEvent) {
     // Internet Explorer's method of attaching events:
     someElement.attachEvent('onclick', alertMe);
    } else {
     // W3C's standard:
     someElement.addEventListener('click', alertMe, false);
     // The third argument above (useCapture) selects the capture phase (true) or the bubbling phase (false). IE's attachEvent has no equivalent control.
    }
    
    Now I am certain that I would not want to type that code every time I wanted to attach an event to a particular element, so I might do something like this:
    /**
     * myAttachEvent attaches an event handler to an element based on the
     * browser compatibility.
     * @param {element} element
     * @param {string} eventName Like 'click' or 'mouseover'
     * @param {function} callback
     * @param {bool} bubbleUp (Optional) Passed as the third argument (useCapture)
     * of addEventListener; defaults to false (not supported by IE's attachEvent)
     */
    function myAttachEvent(element, eventName, callback, bubbleUp) {
     if(!bubbleUp) {
      bubbleUp = false;
     }
     
     if(element.attachEvent) {
      element.attachEvent('on' + eventName, callback);
     } else {
      element.addEventListener(eventName, callback, bubbleUp);
     }
    }
    
    So now I have a function that will consistently attach events to elements across all [modern] browsers, and it even handles bubbleUp as an optional argument. Most of the time the handler should run in the normal bubbling phase, so if the argument is omitted (undefined), I set it to false. Once again this is a very good reason for creating a routine, and I use it regularly.

  8. In order to simplify boolean tests

    Several times in my programming experience I have had to deal with the inevitable large list of boolean tests for a single result. Sometimes when you experience this, it means you have a design flaw. However, design flaw or not, you have to deal with them. Most languages also offer boolean short-circuiting, which makes large boolean tests simpler, but large tests can still become overwhelming. Short-circuiting means that the operands are evaluated from left to right, with each parenthesized sub-expression evaluated as a unit, and that evaluation stops as soon as the overall result is known. Because of short-circuiting, the results of a test like this are easy to determine:

    if(true && !false)
    {
     Console.WriteLine("True");
    }
    else
    {
     Console.WriteLine("False");
    }
    
    This expression gets true evaluated first; if the left operand were false, then !false would never be evaluated. In C and C++ the built-in && and || operators are guaranteed to short-circuit in the same way (only overloaded operators in C++ lose this guarantee), so the extra parentheses below are not required; they only make the grouping explicit:
    if((true) && !false)
    {
     printf("True");
    }
    else
    {
     printf("False");
    }
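    To watch short-circuiting happen, here is a small sketch in Python, whose and/or operators short-circuit as well. The check helper is hypothetical; it simply records which operands actually get evaluated.

```python
calls = []


def check(label, value):
    """Record that this operand was evaluated, then return its value."""
    calls.append(label)
    return value


# The left operand is False, so the right operand is never evaluated.
and_result = check("left", False) and check("right", True)
and_calls = list(calls)

calls.clear()

# The left operand is True, so `or` already knows its answer.
or_result = check("left", True) or check("right", False)
or_calls = list(calls)

print(and_result, and_calls)  # False ['left']
print(or_result, or_calls)    # True ['left']
```

    In both cases "right" never shows up in the list, which is why it is safe to put a cheap guard first and an expensive or risky test after it.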
    
    Anyway, back to complex boolean expressions. Let's say that you had a number of criteria pertaining to an employee used to determine the number of paid days off they receive during a single year: years with company, number of hours worked per week, employment status (full-time, part-time, contractor, intern), and whether or not they are enrolled in the incentive program. Now let's pretend that you put all of that into a huge conditional (this is obviously bad programming but I'm trying to make a point with what little I have to go on):
    int paidDaysOff = 0;
    
    if(_yearsEmployed > 5 && _weeklyHours >= 30 && (_employmentStatus ==
     Employment.FULL_TIME || _employmentStatus == Employment.PART_TIME)
     && _incentive == true)
    {
     paidDaysOff = 10;
    }
    ...
    
    While this long conditional is not extremely complex, the fact that it is long makes it a good candidate for extraction into its own routine, to ensure ease of maintenance and readability:
    bool employeeDeserves10DaysOff()
    {
     bool result = _yearsEmployed > 5 && _weeklyHours >= 30 && 
     (_employmentStatus == Employment.FULL_TIME || 
     _employmentStatus == Employment.PART_TIME) && _incentive
     == true;
     
     return result;
    }
    
    int getPaidDaysOff()
    {
     int paidDaysOff = 0;
     
     if(employeeDeserves10DaysOff())
     {
      paidDaysOff = 10;
     }
     
     return paidDaysOff;
    }
    
    See how much more readable that is?

  9. In order to improve performance

    The final good reason on this list to create a routine is to improve performance. I have had little to no experience with this particular scenario, as either I have not ever had a need to improve performance, or I haven't known of a need to improve performance. Either way I hope that if you do find yourself in this situation, rather than assuming that you need to make a routine, you step back and take a look at the big picture. Are you utilizing your compiler's option to optimize for performance? Are you doing things the way that the programming language publisher recommends? Are you following design patterns or do you just have a mess of code? Also, do you have a good set of fully functioning and passing unit tests? These things will help your program to be more efficient, thus increasing performance.

I have many more note pages like this one, and I hope that I have the opportunity to review and share those as well. I hope this was as informative for you as it has been for me. Thanks for letting me take some of your time.

08 February 2010

Web Testing Tools

For six years now I have served my time as an IT professional in the Quality Assurance sector. Over that period of time I have tested a wide variety of systems from desktop applications to video games, and from web sites to embedded wifi networked systems. All of these systems required a different means to test them effectively. In each system the same mindset (detail-oriented) was employed, but different techniques were applied to utilize that mindset. In some cases a different toolset was required as well.

For the past three years I have developed and tested web-based solutions and I have come to enjoy the vast amount of software technology available to Quality Assurance professionals today. As I continue my career as a software developer I lean a lot on my past as a QA professional. The following is a list of tools that I have found to be extremely useful in my career.

Web-based Testing

Web-based testing tools are my most recent experience and so I will begin here. This first set of tools consists simply of browsers that I have found to increase my ability to ensure performance and compatibility metrics.

Browsers

Browsers are the primary tool of the Internet. Customers use browsers of all varieties including those that are no longer supported by the companies who created them. This part is sad, and hopefully we all have the guts to tell our customers that. Last year a man called me up to get support on his Netscape Navigator 9, and I had to tell him that not only do we not support that browser, but neither does the creator. I advised him to move to Firefox, which is similar, because that's what Netscape became, but it is highly supported. Anyway, here's the list of browsers and browser tools I use in my testing efforts:
  • IETester from DebugBar is a requirement in my book. IETester gives you the Internet Explorer suite of browsers. It even includes the last currently unsupported Microsoft browser, IE 5.5. But with the latest installment you get IE 6, IE 7, and IE 8 also. IETester is actively developed, and works very well. It doesn't use emulation; rather, it actually collects the old Trident-based rendering engines and allows you to see your web site in different tabs for each browser version.
  • Next on my list is Mozilla Firefox. This browser combines tabbed browsing, with ease-of-use, incredible stability, and a huge assortment of supported browser extensions. Some of my required extensions include Firebug, Firecookie, HttpFox, Screengrab, and Modify Headers. Having the ability to be extended with add-ons makes Firefox a versatile tool and makes it very valuable.
  • Of course new browsers get a lot of use in the market, and so Google Chrome and Apple Safari (both WebKit-based browsers) are required arsenal. While these browsers are built on the same rendering engine, their other internals are very different. Google and Apple are always trying to better the Web. A lot of people use these browsers also because of their stability and relatively high performance.
  • And finally the Opera Browser though not always of high use is an important asset. Out of the box it comes with Dragonfly, which is similar in functionality and purpose to Firebug. It also provides for another rendering standard, and Opera Mini is an important browser on the mobile platform, and uses the same rendering engine.

Automation

This next set of tools is a set of web testing tools that may or may not require a browser. Most of them I have used, some I have not used or haven't used very much, but their purpose is something I think highly of.

Automation: Browser-based

Browser-based automation tools come in at least two varieties. The first variety uses a browser like a Microsoft OLE (Object Linking and Embedding) object and the second utilizes the browser's core functionality to automate web-based testing.

Automation: Browser-based: OLE-type

  • Under free software licensing we've got several options. Python has a library named PyWinAuto for Windows that grabs a connection to the OLE server for Internet Explorer. The library then exposes the OLE API as a simpler Python API that can be used to control the browser.
  • Similarly, Ruby supports a system of grabbing a connection to Internet Explorer's OLE server and Watir (pronounced like "water") exposes this in a Ruby-based API. Watir's API is built in a familiar fashion such that people migrating from HP's QuickTest Pro web testing suite can grasp the concepts readily. Watir also has counterparts for other browsers, which are named for the browsers they support, such as FireWatir, ChromeWatir, and SafariWatir. Work is being done to integrate these all into a single codebase. Currently only FireWatir is integrated, and Watir uses a factory to determine which browser to launch. Firefox's only requirement is the JSSH (JavaScript SHell) extension which can be downloaded at the FireWatir page.
  • Finally .NET got its feet wet in this arena also with WatiN, which is loosely based on the API of Watir and thus follows some of the same conventions. Its licensing allows for free download, but modifications and additions are supported through paid support. The source is freely available and the licensing also supports personal (for business or private) modifications. The advantage of WatiN is its .NET foundation: if your testers know C# better than Ruby then it's a plus.
While there are probably more, they aren't likely to be as highly developed.

Automation: Browser-based: Browser-Type

Browser-type automation toolkits utilize a browser's core internal functionality. They may be browser extensions or they might use JavaScript to control suites of unit-like tests. Because of this some of these tools are inherently browser- and operating-system-independent.
  • The first on my list is the iMacros browser extension for Firefox. iMacros allows a user of the Firefox browser to record a series of steps taken on a webpage and then play it back at any time. You can also add in validations for particular elements on a web page. It appears as though Chrome also has an iMacros extension now. In order to use it, you will need to download and install the beta or developer branch for Chrome. Because this is a browser extension for specific browsers it is not browser- and operating-system-independent.
  • Next on my list is Selenium. Selenium comes in several flavors. The most independent variant is Selenium Core. Selenium Core is an HTML-JavaScript-based testing system which utilizes frames to automate your website and validate various points on it. Because of the browser's same-origin policy, Selenium Core must be installed next to your website and be accessed under the same domain. So normally you would probably not use this solution to test a web site in a production environment. Selenium RC, or Selenium Remote Control, on the other hand still utilizes the browser but creates a proxy to your site through a browser that uses Selenium Core to run tests outside of the actual domain. This means that no testing code resides on your server, but you still get the same effect. Selenium is highly developed and highly supported. There are several other offerings from Selenium HQ as well.
There are other tools that fall under this category, but these two are the best. And with Selenium's vast offerings, I don't think you could go wrong.

Automation: HTTP-based

HTTP-based automation of web sites offers some speed and simplicity to the mix. There is no reliance on browsers, whether they are buggy or not and whether they fully support JavaScript or not. On the other hand, HTTP-based testing solutions generally don't support JavaScript, so if your site relies heavily on it then this may not be what you're looking for. I have successfully used HTTP-based testing solutions to test web services, most static and dynamically generated web pages, and XML and RSS products. Here are a few of those products.
  • HttpUnit is the first tool on this list. It is written in Java so generally speaking you can use it on any operating system as long as Java is supported by the operating system. HttpUnit goes about web-based testing like a series of unit tests based on the xUnit testing pattern. Generally you would use a third-party xUnit library to build your tests on, such as JUnit, then you would also use that library to run the tests. Unit testing utilizes the assertion pattern which means you assert a condition to be true, like an element that exists.
  • HtmlUnit is another offering, again written in Java. The major difference between HttpUnit and HtmlUnit is that HtmlUnit allows you to test JavaScript. It sandboxes the JavaScript in a website similar to the way a browser does it, but it still remains headless and keeps the browser reliance low and the portability and stability high.
  • HGrep is one of my own tools that I developed because sometimes I just wanted to know about headers and see things faster than a browser could provide them for me. Again it is a headless tool that I have successfully used to write automation. The tool itself does not provide any automation hooks, but scripts can be easily written in PowerShell which utilize HGrep for automation.
  • Apache's JMeter is a load testing tool for the Web. It specifically focuses on HTTP-based testing and can read and write to SOAP, LDAP, JMS, POP and IMAP mail and regular HTTP protocols.
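To make the HTTP-based, xUnit-style approach a bit more concrete, here is a minimal sketch using only Python's standard library (unittest and http.client) rather than HttpUnit and JUnit. The HelloHandler server is a hypothetical stand-in for the site under test so that the example is self-contained; in practice you would point the test at your own host. No browser is involved: the test issues a plain HTTP request and asserts on the status, a header, and the body.

```python
import http.client
import threading
import unittest
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Stand-in for the site under test (hypothetical)."""

    def do_GET(self):
        body = b"<html><title>Hello</title></html>"
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the test output quiet


class HomePageTest(unittest.TestCase):
    @classmethod
    def setUpClass(cls):
        # Port 0 asks the operating system for any free port.
        cls.server = HTTPServer(("127.0.0.1", 0), HelloHandler)
        cls.thread = threading.Thread(target=cls.server.serve_forever)
        cls.thread.daemon = True
        cls.thread.start()

    @classmethod
    def tearDownClass(cls):
        cls.server.shutdown()
        cls.server.server_close()

    def test_home_page(self):
        conn = http.client.HTTPConnection(
            "127.0.0.1", self.server.server_port, timeout=5)
        try:
            conn.request("GET", "/")
            response = conn.getresponse()
            # The assertion pattern: state what must be true of the response.
            self.assertEqual(response.status, 200)
            self.assertEqual(response.getheader("Content-Type"), "text/html")
            self.assertIn(b"<title>Hello</title>", response.read())
        finally:
            conn.close()


suite = unittest.defaultTestLoader.loadTestsFromTestCase(HomePageTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Because nothing has to render, a test like this runs in milliseconds, which is the speed advantage of the HTTP-based tools; the trade-off is that any JavaScript on the page is never executed.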
This gives you a good variety of web application testing tools. These are all highly developed and I recommend them all. If you have any other favorites, please feel free to post them here. Thanks.