Monday, October 8, 2012

Code Contracts by Example

So how does one get his feet wet with Design-by-Contract? I remember when I first started using Microsoft Code Contracts I immediately made a huge mess of my projects from which it took several weeks to fully recover. After a while I figured out what worked and what didn't and I've developed a habitual way of doing things that seems to work okay. I won't go so far as to say they are best practices, but they keep me out of trouble. So without delving too deep into the theory of DbC, here are a few of the patterns I use.


Constructor-based Dependency Injection

While property-based DI is possible with code contracts, it gets ugly really quickly. The only way I've been able to make it work is with a backing field and explicit null checks in the getter. Basically, if you have a public setter then all bets are off as far as invariants are concerned. Constructor-based injection, on the other hand, is very clean and straightforward, and allows you to offload some of the work onto the caller. Further, the static checker can leverage readonly fields to infer pre- and postconditions.

public class Foo
{
    private readonly IBar _bar;
    private readonly IBaz _baz;

    public Foo(IBar bar, IBaz baz)
    {
        Contract.Requires(bar != null);
        Contract.Requires(baz != null);
        
        Contract.Ensures(_bar != null);
        Contract.Ensures(_baz != null);

        _bar = bar;
        _baz = baz;
    } 

    [ContractInvariantMethod]
    private void Invariant()
    {
        Contract.Invariant(_bar != null);
        Contract.Invariant(_baz != null);
    }
}

The invariants are probably not absolutely necessary, but I've found that pairing explicit postconditions in the constructor with explicit invariants speeds up the static analysis and generally produces better results.
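
For contrast, here is a rough sketch of the property-based workaround mentioned at the start of this section: a backing field plus an explicit null check in the getter. The class name PropertyInjectedFoo and the choice of InvalidOperationException are my own assumptions for illustration, not anything Code Contracts prescribes.

```csharp
using System;
using System.Diagnostics.Contracts;

public interface IBar { }

// Hypothetical class, for illustration only.
public class PropertyInjectedFoo
{
    // Cannot be readonly and cannot be covered by an invariant,
    // because it stays null until somebody calls the setter.
    private IBar _bar;

    public IBar Bar
    {
        get
        {
            Contract.Ensures(Contract.Result<IBar>() != null);

            // Explicit runtime check: the only way to honor the
            // postcondition when the dependency was never injected.
            if (_bar == null)
            {
                throw new InvalidOperationException("Bar has not been injected.");
            }
            return _bar;
        }
        set
        {
            Contract.Requires(value != null);
            _bar = value;
        }
    }
}
```

Every read of the dependency now pays for a check that the constructor-based version performs exactly once.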


Conversion of One Nullable Type to Another

I use this frequently for mapping DTOs to domain objects and vice-versa. The contract literally states that either the method parameter is null or the return value will not be null. Put another way, it guarantees that the return value will not be null so long as the method parameter is not null.

public DomainType ToValueObject(DtoType dto)
{
    Contract.Ensures(dto == null || Contract.Result<DomainType>() != null);
    if (dto == null)
    {
        return null;
    }
    return new DomainType(...);
}
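
To make the pattern concrete, here is a minimal sketch with filled-in types. CustomerDto, Customer, and CustomerMapper are hypothetical names invented for this example; only the shape of the postcondition is taken from the snippet above.

```csharp
using System.Diagnostics.Contracts;

// Hypothetical DTO and domain types, for illustration only.
public class CustomerDto
{
    public string Name { get; set; }
}

public class Customer
{
    private readonly string _name;

    public Customer(string name)
    {
        _name = name;
    }

    public string Name
    {
        get { return _name; }
    }
}

public static class CustomerMapper
{
    public static Customer ToDomain(CustomerDto dto)
    {
        // Either the input was null, or the result is not null.
        Contract.Ensures(dto == null || Contract.Result<Customer>() != null);

        if (dto == null)
        {
            return null;
        }
        return new Customer(dto.Name);
    }
}
```

A caller that has already established that its DTO is not null can use the result without a second null check, which is what the postcondition is there to let the static checker work out.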


Value Objects

When I first started migrating from if/throw blocks I had a lot of preconditions that looked like this:

Contract.Requires(str != null);
Contract.Requires(!string.IsNullOrWhiteSpace(str));
Contract.Requires(Regex.IsMatch(str, @"^[a-zA-Z0-9]+$"));
Contract.Requires(str.Length >= 3);
Contract.Requires(str.Length < 255);

And so on. Moreover, I found that I had a lot of methods sprinkled about with identical sets of preconditions--usually there was some value that wasn't central to my domain (a telephone number or domain name) that had certain validation requirements in the 3 or 4 places it was used. After performing this conversion I discovered a number of things:

  • The responsibility for validation shifted out of my method and onto the caller
  • There were far more call sites than methods being called, which meant that these preconditions now had to be checked in many more places than before
  • The static checker had to prove the preconditions were met at every call site
  • The maintenance burden had increased by an order of magnitude

This was obviously not a move in the right direction. However, all of these issues were resolved by replacing string typed arguments with value objects (not necessarily value types) that had the following characteristics:

  • Explicit verification in the constructor
  • Postconditions on the constructor ensuring the validity of the newly instantiated instance
  • Immutability

The result was that all of the explicit checks moved to one place (the value object constructor) and all instances of the value object type are guaranteed to always be valid (else the constructor would throw).

I found it helpful to put the validation logic into a public static IsValid method so that callers have a means to defensively validate input without catching exceptions. The result is something like this:

public class ValueObject
{
    private readonly string _string;

    public ValueObject(string str)
    {
        Contract.Ensures(!string.IsNullOrWhiteSpace(_string));
        Contract.Ensures(_string.Length >= 5);
        Contract.Ensures(_string.Length <= 15);
        if (!IsValid(str))
        {
            throw new ArgumentException();
        }
        _string = str;
    }

    public static bool IsValid(string str)
    {
        Contract.Ensures(Contract.Result<bool>() == false || !string.IsNullOrWhiteSpace(str));
        Contract.Ensures(Contract.Result<bool>() == false || str.Length >= 5);
        Contract.Ensures(Contract.Result<bool>() == false || str.Length <= 15);

        if (string.IsNullOrWhiteSpace(str)) return false;
        if (str.Length < 5) return false;
        if (str.Length > 15) return false;
        return true;
    }

    [ContractInvariantMethod]
    private void Invariant()
    {
        Contract.Invariant(!string.IsNullOrWhiteSpace(_string));
        Contract.Invariant(_string.Length >= 5);
        Contract.Invariant(_string.Length <= 15);
    }
}

The contracts for the method calls then could be reduced to simple Contract.Requires(arg != null) checks, which are much easier to prove. This also works well with numeric types such as prices, quantities, percentages, and other such range-restricted values.
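
As a sketch of the numeric case, a range-restricted percentage might look something like the following. Percentage, Pricing, and ApplyDiscount are hypothetical names, and the 0 to 100 range is an assumption chosen for the example.

```csharp
using System;
using System.Diagnostics.Contracts;

// Hypothetical value object restricted to the range 0..100 inclusive.
public sealed class Percentage
{
    private readonly decimal _value;

    public Percentage(decimal value)
    {
        Contract.Ensures(_value >= 0m);
        Contract.Ensures(_value <= 100m);

        if (!IsValid(value))
        {
            throw new ArgumentOutOfRangeException("value");
        }
        _value = value;
    }

    public decimal Value
    {
        get { return _value; }
    }

    // Lets callers validate input without catching exceptions.
    public static bool IsValid(decimal value)
    {
        return value >= 0m && value <= 100m;
    }

    [ContractInvariantMethod]
    private void Invariant()
    {
        Contract.Invariant(_value >= 0m);
        Contract.Invariant(_value <= 100m);
    }
}

public static class Pricing
{
    public static decimal ApplyDiscount(decimal price, Percentage discount)
    {
        // The range checks live in Percentage; the only thing left
        // for call sites to prove is that the argument is not null.
        Contract.Requires(discount != null);

        return price - (price * discount.Value / 100m);
    }
}
```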


Conclusion

DbC requires a lot of investment up front, but it does offer a lot of value. The trick is enforcing contracts that add value and letting other things go. For example, a telephone number may need to be validated against a regular expression, but it usually isn't necessary to express that as a postcondition, especially if none of the callers depend on that fact. Often it's enough to use an if/throw block and leave it at that. In fact, that last point bears repeating: contracts are not a wholesale replacement for if/throw blocks, and are actually inferior in many cases.

It should also be said that DbC alone is no substitute for good design and good programming. If a method has a laundry list of contracts, then it might be that there are simply too many parameters, or too many dependencies between parameters. Sometimes splitting a method into multiple methods with different parameters, or encapsulating a set of parameters into a new type of object will completely obviate the need for complex preconditions. Convoluted preconditions are probably an indication that a method is in need of refactoring, which is valuable in and of itself.
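
A small sketch of the "encapsulate a set of parameters" idea, using made-up names (DateRange, Reports, LengthInDays): a method taking separate start and end dates would need a precondition relating the two, whereas a dedicated type can own that rule once.

```csharp
using System;
using System.Diagnostics.Contracts;

// Hypothetical type that owns the "start must not be after end" rule.
public sealed class DateRange
{
    private readonly DateTime _start;
    private readonly DateTime _end;

    public DateRange(DateTime start, DateTime end)
    {
        Contract.Ensures(_start <= _end);

        if (start > end)
        {
            throw new ArgumentException("start must not be after end");
        }
        _start = start;
        _end = end;
    }

    public DateTime Start { get { return _start; } }
    public DateTime End { get { return _end; } }
    public TimeSpan Duration { get { return _end - _start; } }
}

public static class Reports
{
    // Before: LengthInDays(DateTime start, DateTime end) would have needed
    // Contract.Requires(start <= end) proven at every call site.
    // After: the dependency between the two values is enforced by DateRange.
    public static int LengthInDays(DateRange range)
    {
        Contract.Requires(range != null);
        return (int)range.Duration.TotalDays;
    }
}
```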

Bottom line, the point of DbC is to formally specify things that both the caller and the callee should be responsible for, so contracts should be limited to things that should have been obvious or inferred to begin with (such as null checks). There are many conditions that are not the responsibility of the caller and should continue to be handled during runtime within the method (such as business logic and complex validation requirements). Done right, DbC will simply help you prove what you already assumed in the first place and identify places in your code where the assumptions are not met.
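
As a rough illustration of that split, the sketch below states the null check as a contract and leaves the format validation as an ordinary if/throw. The Dialer class, the Normalize method, and the regular expression are all assumptions made up for this example; real telephone validation is more involved.

```csharp
using System;
using System.Diagnostics.Contracts;
using System.Text.RegularExpressions;

public static class Dialer
{
    // Hypothetical pattern: 7 to 15 digits with an optional leading '+'.
    private static readonly Regex PhonePattern = new Regex(@"^\+?[0-9]{7,15}$");

    public static string Normalize(string phoneNumber)
    {
        // Obvious, cheap to prove, and the caller's responsibility:
        // express it as a contract.
        Contract.Requires(phoneNumber != null);
        Contract.Ensures(Contract.Result<string>() != null);

        // Complex validation that no caller depends on statically:
        // keep it as a runtime check inside the method.
        if (!PhonePattern.IsMatch(phoneNumber))
        {
            throw new ArgumentException("Not a valid telephone number.", "phoneNumber");
        }
        return phoneNumber.TrimStart('+');
    }
}
```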

Thursday, October 4, 2012

More on Non-nullable Reference Types

I was thinking about this some more today, and it occurred to me that a very simple construct could achieve what is being asked for without making any changes to the language or tools. Nullable value types (structs) were first implemented with an explicit wrapper type, Nullable<T>; it wasn't until later that the ValueType? shorthand appeared. With implicit operators, we could basically do the same for non-nullable reference types:

public struct NotNull<T> : IEquatable<T>, IEquatable<NotNull<T>> where T: class
{
    private T _value;

    public NotNull(T value)
    {
        if (value == null)
        {
            throw new ArgumentNullException("value");
        }
        _value = value;
    }

    public T Dereference()
    {
        if (_value == null)
        {
            throw new NullReferenceException();
        }
        return _value;
    }

    public override bool Equals(object other)
    {
        if (ReferenceEquals(null, other)) 
        {
            return false;
        }
        if (other is NotNull<T>)
        {
            return Equals((NotNull<T>)other);
        }
        if (other is T)
        {
            return Equals((T)other);
        }
        
        return false;
    }

    public bool Equals(NotNull<T> other)
    {
        return Equals(other._value);
    }

    public bool Equals(T other)
    {
        if (ReferenceEquals(null, other))
        {
            return false;
        }
        return Equals(Dereference(), other);
    }

    public override int GetHashCode()
    {
        return Dereference().GetHashCode();
    }

    public static bool operator ==(NotNull<T> left, NotNull<T> right)
    {
        return Equals(left, right);
    }

    public static bool operator !=(NotNull<T> left, NotNull<T> right)
    {
        return !Equals(left, right);
    }

    public static implicit operator NotNull<T>(T value)
    {
        return new NotNull<T>(value);
    }

    public static implicit operator T(NotNull<T> notNullValue)
    {
        return notNullValue.Dereference();
    }
}

This does not get rid of NullReferenceExceptions, but it does bake null checks into the code and makes for some interesting syntax:

public static void Main(string[] args)
{
    if (args.Length < 1) throw new Exception();
    // Implicit conversion to NotNull<string> will throw at call site
    // if args[0] is null
    Method(args[0]);
}

public static NotNull<string> Method(NotNull<string> stringValue)
{
    // Implicit conversion from NotNull<string> to string will
    // call NotNull<T>.Dereference(), which will throw in the case
    // that default(NotNull<string>) was passed in.
    string foo = stringValue;
    return foo + "bar";
}

It's not perfect. I used a struct because it's a value type that can never be null, so we don't have to worry about the nullity of the NotNull<T> itself. However, all structs have a default constructor that cannot be overridden, which means that if default(NotNull<T>) is used then the internal _value field will not be properly initialized. So the Dereference() method is implemented such that it will throw a NullReferenceException if _value is null in this case.
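
The default-value hole can be demonstrated with a trimmed-down copy of the wrapper. The sketch below keeps only the constructor, Dereference(), and the implicit operators; the DefaultValueDemo helper is a hypothetical name added purely to show the behavior.

```csharp
using System;

// Trimmed-down copy of NotNull<T>: equality members omitted for brevity.
public struct NotNull<T> where T : class
{
    private readonly T _value;

    public NotNull(T value)
    {
        if (value == null)
        {
            throw new ArgumentNullException("value");
        }
        _value = value;
    }

    public T Dereference()
    {
        // Only reachable with a null _value when the struct was created
        // via default(NotNull<T>) or as an element of a new array.
        if (_value == null)
        {
            throw new NullReferenceException();
        }
        return _value;
    }

    public static implicit operator NotNull<T>(T value)
    {
        return new NotNull<T>(value);
    }

    public static implicit operator T(NotNull<T> notNullValue)
    {
        return notNullValue.Dereference();
    }
}

// Hypothetical helper, for illustration only.
public static class DefaultValueDemo
{
    // Returns true if reading the wrapper throws NullReferenceException.
    public static bool ThrowsOnRead(NotNull<string> wrapped)
    {
        try
        {
            string value = wrapped; // implicit conversion calls Dereference()
            return value == null;   // never true: Dereference() throws instead
        }
        catch (NullReferenceException)
        {
            return true;
        }
    }
}
```

Passing default(NotNull<string>), or an element of a freshly allocated NotNull<string>[] array, bypasses the constructor entirely, so the check in Dereference() is the only thing standing between the caller and a null.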

So what does this actually achieve? Well, a couple of things. First, it serves to document the method prototype: the method clearly expects a non-null argument and/or guarantees that its return value is not null. Second, it does enforce null checks as close to the call site as possible, which should prevent null references from making it too far. This is beneficial in that it would prevent NullReferenceExceptions from occurring within third-party code that may not have debugging enabled.

When dealing with anything more complicated than a single number or character, we need pointers, even with fancy managed languages like C# and Java. And unless they have something to point to, uninitialized references are here to stay. Syntactic candy and helper classes only go so far. I still contend that design-by-contract is the way to go.

Wednesday, October 3, 2012

Non-nullable Reference Types

A few days ago on InfoQ I saw a link to this blog post offering a solution to "The Billion Dollar Mistake": allowing null references. The proposal, non-nullable reference types, is designed to replace this:

/// <exception cref="ArgumentNullException">
///     Thrown if <paramref name="argument"/> is null.
/// </exception>
public void Method(SomeType argument)
{
    if (argument == null)
    {
        throw new ArgumentNullException("argument");
    }
    // Do stuff...
}

With something like this:

public void Method(SomeType! argument)
{
    // Do stuff...
}

Of course complications arise with arrays of non-nullable reference types and with the methods that feature non-nullable output parameters. There are workarounds for those issues, but the result is a convoluted syntax and unnecessary overhead. But I think the main problem is that we're looking at the problem from the wrong direction.

I think we have to accept that reference types are nullable, period. There will always be a case in which a reference has not yet been or may never be initialized. However, we can stipulate that a caller must provide a properly initialized reference when invoking a function, and we can promise that the results of our function will not be null under normal program flow.

This isn't a new idea. "Design by Contract" has been around since the 80s in the Eiffel programming language. More recently it has reemerged with Spec# and the Code Contracts tool set from Microsoft. And it works. For example, this:

public void Method(SomeType argument)
{
    Contract.Requires(argument != null);
    // Do something...
}

... will cause the static analysis tool to verify that every call to that method within scope has established that the argument reference is not null before making the call. And it doesn't require new keywords, new syntax, compiler enhancements, or breaking changes to existing code.
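
Here is a slightly fuller sketch showing both halves of the bargain: a precondition the caller has to satisfy and a postcondition the method promises in return. The Example class, the Describe and Caller methods, and the Name property are hypothetical; what the static checker actually reports depends on how it is configured.

```csharp
using System.Diagnostics.Contracts;

// Hypothetical type, for illustration only.
public class SomeType
{
    public string Name { get; set; }
}

public static class Example
{
    public static string Describe(SomeType argument)
    {
        // The caller must supply an initialized reference...
        Contract.Requires(argument != null);
        // ...and in return the result is promised to be non-null.
        Contract.Ensures(Contract.Result<string>() != null);

        return "SomeType: " + argument.Name;
    }

    public static string Caller(SomeType maybeNull)
    {
        // Without this check, the static checker would be expected to
        // flag the call below as an unproven precondition.
        if (maybeNull == null)
        {
            return string.Empty;
        }
        return Describe(maybeNull);
    }
}
```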

While I would like to see pre- and post-conditions baked into the language itself (à la Eiffel or Spec#), I think that the "Billion Dollar Mistake" is largely solved with the tools we already have.

Tuesday, October 2, 2012

Code Formatting in Blogger

After several hours of research, trial, and error, I've finally come across a workable solution for formatting code snippets in Blogger. Here are the basic steps:
  1. From your dashboard view, click "Template" and then "Edit HTML".  A modal confirmation dialog will appear.  Click "Proceed".
  2. Find the <head> tag and insert the following on the next line:
    <link href='https://sites.google.com/site/itswadesh/codehighlighter/shCore.css' rel='stylesheet' type='text/css' />
        <link href='http://sites.google.com/site/itswadesh/codehighlighter/shThemeDefault.css' rel='stylesheet' type='text/css' />
        <script src='http://sites.google.com/site/itswadesh/codehighlighter/shCore.js' type='text/javascript' />
        <script src='http://sites.google.com/site/itswadesh/codehighlighter/shBrushCSharp.js' type='text/javascript' />
        <script src='http://sites.google.com/site/itswadesh/codehighlighter/shBrushJava.js' type='text/javascript' />
        <script src='http://sites.google.com/site/itswadesh/codehighlighter/shBrushCss.js' type='text/javascript' />
        <script src='http://sites.google.com/site/itswadesh/codehighlighter/shBrushXml.js' type='text/javascript' />
        <script type='text/javascript'>
            // Required for Blogger:
            SyntaxHighlighter.config.bloggerMode = true;
            // Set to true to enable line numbers:
            SyntaxHighlighter.defaults['gutter'] = false;
            // Apply formatting:
            SyntaxHighlighter.all();
        </script>
    

  3. Save the template

Once your template is updated, simply wrap your code snippets in a <pre> tag with a special class:

<pre class="brush: csharp">
public class Test
{
    public string Property { get; set; }

    public void Method()
    {
    }
}
</pre>

The result should look something like this:
public class Test
{
    public string Property { get; set; }

    public void Method()
    {
    }
}

The full list of brushes can be found here. Note that the HTML snippet in step 2 above only imports the C#, Java, XML, and CSS brushes. If, for example, you wanted to use Erlang, you would also have to include the following <script> tag:

    <script src='http://sites.google.com/site/itswadesh/codehighlighter/shBrushErlang.js' type='text/javascript' />