Monday, December 19, 2011

Value Objects and Code Contracts

A while ago I came across an excellent presentation by Dan Bergh Johnsson on the topic of value objects. There is nothing revolutionary in this talk, but it was a good reminder of what value objects actually are and where they should be used.  While the ideas in the video stand on their own, it is interesting to see how they complement and simplify pre- and post-condition checks on methods with code contracts.

Consider the following example (using Microsoft code contracts) in which a domain name is added to a collection that does not allow duplicates:

using System;
using System.Collections.Generic;
using System.Diagnostics.Contracts;
using System.Linq;

public class DomainNameList {
    private readonly IList<string> _domainNames = new List<string>();
    public void AddDomainName(string domainName) {
        Contract.Requires(!string.IsNullOrWhiteSpace(domainName));
        Contract.Requires(!Contains(domainName));
        Contract.Requires(ValidationUtil.IsValid(domainName));
        _domainNames.Add(domainName.Trim());  // Trim to ensure whitespace does not affect Equals
    }
    [Pure]  // Methods called from within a contract must be free of side effects
    public bool Contains(string domainName) {
        Contract.Requires(!string.IsNullOrWhiteSpace(domainName));
        domainName = domainName.Trim(); // Trim to ensure whitespace does not affect Equals
        return _domainNames.Any(dn => dn.Equals(domainName, StringComparison.OrdinalIgnoreCase));
    }
}

So for the privilege of using a string value instead of a value object, the following must be done in multiple places:
  • Trim the domain name string so that string.Equals works properly
  • Specify a case-insensitive comparison so that mixed case does not affect equality
  • Call an external utility class to verify that the domain name is valid
These measures would be necessary wherever domain name strings are used, which might be dozens of places throughout the application. Since that is a clear violation of the DRY principle, let's put it all in one place:

using System;
using System.Text.RegularExpressions;

public class DomainName : IEquatable<DomainName> {
    // Anchored so the whole string must match; each label starts with a letter and is at most 63 characters long
    public const string DomainNamePattern = @"(?i)^[a-z][a-z0-9\-_]{0,62}(\.[a-z][a-z0-9\-_]{0,62})*$";
    private readonly string _domainName;
    public DomainName(string domainName) {
        if(string.IsNullOrWhiteSpace(domainName)) throw new ArgumentNullException("domainName");
        domainName = domainName.Trim();  // Normalize before validating so padded input is still accepted
        if(!IsValid(domainName)) throw new ArgumentException("Invalid domain name", "domainName");
        _domainName = domainName;
    }
    public static bool IsValid(string domainName) {
        return domainName != null && Regex.IsMatch(domainName, DomainNamePattern);
    }
    public bool Equals(DomainName other) {
        if(ReferenceEquals(this, other)) return true;
        if(ReferenceEquals(other, null)) return false;
        return _domainName.Equals(other._domainName, StringComparison.OrdinalIgnoreCase);
    }
    public override bool Equals(object obj) {
        return Equals(obj as DomainName);
    }
    public override int GetHashCode() {
        // Use the same comparer as Equals so that equal names always hash alike
        return StringComparer.OrdinalIgnoreCase.GetHashCode(_domainName);
    }
    public override string ToString() {
        return _domainName;
    }
    public static implicit operator DomainName(string domainName) {
        return domainName == null ? null : new DomainName(domainName);
    }
    public static implicit operator string(DomainName domainName) {
        return domainName == null ? null : domainName.ToString();
    }
    public static bool operator ==(DomainName left, DomainName right) {
        return Equals(left, right);
    }
    public static bool operator !=(DomainName left, DomainName right) {
        return !Equals(left, right);
    }
}

The DomainName type doesn't need any code contract pre-conditions; the validation checks live in the constructor. Because the type is immutable, a DomainName that has been successfully constructed is guaranteed to be valid: it is impossible to have a DomainName instance that is not valid. The implicit conversion operators also allow strings to be converted to DomainNames and vice versa, so existing callers of the DomainNameList class keep compiling after the change and pick up the built-in validation checks:

public class DomainNameList {
    private readonly IList<DomainName> _domainNames = new List<DomainName>();
    public void AddDomainName(DomainName domainName) {
        Contract.Requires(domainName != null);
        // No call to ValidationUtil!
        Contract.Requires(!Contains(domainName));
        _domainNames.Add(domainName); // No trim!
    }
    [Pure]  // Methods called from within a contract must be free of side effects
    public bool Contains(DomainName domainName) {
        // No null or whitespace contract is needed because the overridden Equals handles null
        // No trim!
        return _domainNames.Contains(domainName);
    }
}

Since Equals and GetHashCode are now provided, the entire DomainNameList class could be reduced to an ISet<DomainName>, such as a HashSet<DomainName>.

Domain names and phone numbers are somewhat obvious targets for value objects, especially if they are a significant part of the primary domain. However, there are many, many cases where providing some encapsulated validation, normalization, and Equals overrides makes life much better. For example, product and service codes; account, quote, invoice, and order numbers; quantities; prices; percentages; user or actor IDs; distinguished names (LDAP); and so on. Compare the following two method signatures and contracts:

Without value objects:
public void AddProduct(string userId, string accountNumber, string orderNumber, string productCode, int quantity, decimal price, decimal? discount) {
    Contract.Requires(!string.IsNullOrWhiteSpace(userId));
    Contract.Requires(ValidationUtil.IsValidEmail(userId));
    Contract.Requires(!string.IsNullOrWhiteSpace(accountNumber));
    Contract.Requires(ValidationUtil.IsValidAccountNumber(accountNumber));
    Contract.Requires(!string.IsNullOrWhiteSpace(orderNumber));
    Contract.Requires(ValidationUtil.IsValidOrderNumber(orderNumber));
    Contract.Requires(!string.IsNullOrWhiteSpace(productCode));
    Contract.Requires(ValidationUtil.IsValidProductCode(productCode));
    Contract.Requires(quantity > 0);
    Contract.Requires(price > 0);
    Contract.Requires(discount == null || discount >= 0);
    Contract.Requires(discount == null || discount <= 100);
    ...
}

With value objects:
public void AddProduct(UserId userId, AccountNumber accountNumber, OrderNumber orderNumber, ProductCode productCode, Quantity quantity, Price price, Discount discount) {
    Contract.Requires(userId != null);
    Contract.Requires(accountNumber != null);
    Contract.Requires(orderNumber != null);
    Contract.Requires(productCode != null);
    Contract.Requires(quantity != null);
    Contract.Requires(price != null);
    ...
}

Using value objects throughout allows complex preconditions to be removed from many methods, which simplifies the static analysis and avoids a whole category of contract warnings and other issues. The Equals and GetHashCode overrides enable the use of stock framework classes like lists and sets without having to implement special EqualityComparers and the like. Furthermore, the implicit operators make migration smooth: conversions back and forth occur automatically, so the refactoring can proceed gradually rather than all at once.

But most importantly, the validation, equality, hash code, and all other operations are now encapsulated in one type, which decreases coupling and increases maintainability. For example, suppose that user IDs need to change from e-mail addresses to LDAP DNs: all that would need to change is the UserId type and anything that makes use of the value(s) it exposes.

So while contracts do a decent job of making sure you dot your i's and cross your t's, runtime checks cannot be eliminated. Encapsulating those checks in strong types reduces the static analysis to a handful of null checks that are easy to process and prove.

Friday, December 16, 2011

Free to Write Bad Code

The other day I stumbled across "Is it Possible to do Object-Oriented Programming in Java", a video presentation by Kevlin Henney hosted at InfoQ.  I had expected the same tired comparisons to Smalltalk and C++, but I was pleasantly surprised.  Despite the talk's title, it had very little to do with Java;  the bulk of the discussion centered around abstraction.  The one thing that really stuck with me was the purist notion of writing code such that concrete classes appear only to the right of "new" operators, and using interfaces everywhere else (even in concrete implementations of equals and hash code functions).
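As a rough illustration of that rule (my own reading of it, not code from the talk), here is a minimal Java sketch in which a concrete class defines equals in terms of its interface. ProductCode, SimpleProductCode, and EqualsDemo are hypothetical names invented for this example.

```java
import java.util.Locale;
import java.util.Objects;

// Hypothetical value interface; every name in this sketch is illustrative.
interface ProductCode {
    String value(); // canonical, trimmed, upper-case form
}

final class SimpleProductCode implements ProductCode {
    private final String value;

    SimpleProductCode(String raw) {
        // Normalize once, at construction time.
        this.value = Objects.requireNonNull(raw, "raw").trim().toUpperCase(Locale.ROOT);
    }

    @Override public String value() { return value; }

    @Override public boolean equals(Object other) {
        if (this == other) return true;
        // Test against the interface, not against SimpleProductCode.
        if (!(other instanceof ProductCode)) return false;
        return value.equals(((ProductCode) other).value());
    }

    @Override public int hashCode() { return value.hashCode(); }

    @Override public String toString() { return value; }
}

final class EqualsDemo {
    public static void main(String[] args) {
        // The concrete class name appears only to the right of "new".
        ProductCode a = new SimpleProductCode(" ab-123 ");
        ProductCode b = new SimpleProductCode("AB-123");
        System.out.println(a.equals(b)); // prints "true"
    }
}
```

For equality to stay symmetric, every implementation of the interface has to follow the same equals and hashCode rules, much as java.util.List specifies them for all lists.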

So in a new project I've been experimenting with this.  I started by identifying all of the types and services I would need to deal with and defined them as interfaces.  The arguments of each method call are defined in terms of either primitives or interfaces.  Whenever I had to deal with concrete types in the underlying framework I defined interfaces to wrap them.
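A minimal sketch of what that can look like in Java, assuming java.util.Properties stands in for "a concrete framework type". Settings, PropertiesSettings, Greeter, and ConfiguredGreeter are hypothetical names made up for the illustration, not types from the project described here.

```java
import java.util.Properties;

// Interface that hides the concrete framework type from the rest of the code.
interface Settings {
    String get(String key, String fallback);
}

// Thin wrapper: the only class that knows about java.util.Properties.
final class PropertiesSettings implements Settings {
    private final Properties properties;

    PropertiesSettings(Properties properties) { this.properties = properties; }

    @Override public String get(String key, String fallback) {
        return properties.getProperty(key, fallback);
    }
}

// A service whose method takes and returns only primitives, strings, or interfaces.
interface Greeter {
    String greet(String name);
}

final class ConfiguredGreeter implements Greeter {
    private final Settings settings;

    ConfiguredGreeter(Settings settings) { this.settings = settings; }

    @Override public String greet(String name) {
        return settings.get("greeting", "Hello") + ", " + name;
    }
}

final class WrappingDemo {
    public static void main(String[] args) {
        // The framework type is touched only at the boundary.
        Properties raw = new Properties();
        raw.setProperty("greeting", "Welcome");

        Settings settings = new PropertiesSettings(raw);
        Greeter greeter = new ConfiguredGreeter(settings);
        System.out.println(greeter.greet("Ada")); // prints "Welcome, Ada"
    }
}
```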

The first thing I noticed was that I was able to model the entire project without a single line of functional code.  What is interesting here is that I was able to design it without knowing how to implement it.  In this particular case I was very familiar with the problem domain, but I did not have a clue about how to achieve that functionality using the underlying framework.  But since I wasn't dealing with concrete types I didn't have to.

The second thing I noticed was how many pieces of the system could be injected using Spring or some other IoC container. This would enable me to stub or mock many pieces of the application and easily swap them out during the container initialization steps as the actual implementations were completed. Interfaces declare no constructors, so it is easy to imagine any of the types being constructed from some kind of factory or IoC container.
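To make the swapping concrete without pulling in Spring, here is a hand-wired Java sketch in which a single method plays the role of the container. PriceLookup, Checkout, and the implementation classes are hypothetical names invented for this example.

```java
import java.util.HashMap;
import java.util.Map;

interface PriceLookup {
    long priceInCents(String productCode);
}

interface Checkout {
    long totalInCents(Map<String, Integer> quantityByCode);
}

// Stub used while the real lookup is unfinished: every product costs the same.
final class FlatPriceLookup implements PriceLookup {
    private final long cents;
    FlatPriceLookup(long cents) { this.cents = cents; }
    @Override public long priceInCents(String productCode) { return cents; }
}

// A later implementation backed by an in-memory table.
final class TablePriceLookup implements PriceLookup {
    private final Map<String, Long> centsByCode;
    TablePriceLookup(Map<String, Long> centsByCode) {
        this.centsByCode = new HashMap<>(centsByCode);
    }
    @Override public long priceInCents(String productCode) {
        Long cents = centsByCode.get(productCode);
        if (cents == null) throw new IllegalArgumentException("Unknown product code: " + productCode);
        return cents;
    }
}

final class SummingCheckout implements Checkout {
    private final PriceLookup prices;
    SummingCheckout(PriceLookup prices) { this.prices = prices; }
    @Override public long totalInCents(Map<String, Integer> quantityByCode) {
        long total = 0;
        for (Map.Entry<String, Integer> line : quantityByCode.entrySet()) {
            total += prices.priceInCents(line.getKey()) * line.getValue();
        }
        return total;
    }
}

final class WiringDemo {
    // Stand-in for container configuration: callers only ever see Checkout.
    static Checkout wire(PriceLookup prices) { return new SummingCheckout(prices); }

    public static void main(String[] args) {
        Map<String, Integer> cart = new HashMap<>();
        cart.put("A", 2);
        cart.put("B", 1);

        Checkout stubbed = wire(new FlatPriceLookup(100));
        System.out.println(stubbed.totalInCents(cart)); // prints 300

        Map<String, Long> table = new HashMap<>();
        table.put("A", 250L);
        table.put("B", 999L);
        Checkout real = wire(new TablePriceLookup(table));
        System.out.println(real.totalInCents(cart)); // prints 1499
    }
}
```

Replacing the stub with the table-backed lookup changes only the argument passed to wire; nothing that depends on Checkout has to be touched.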

Lastly, but perhaps most importantly, when I took the previous observations into consideration I realized that I didn't have to obsess about the details of the implementation. The implementation classes may be ugly, horrendous masses of spaghetti code, but that doesn't matter. They can be swapped out whenever it is convenient. Had the rest of the code been written against concrete classes, replacing them would be a much more difficult undertaking. Concrete classes leak all kinds of public, internal (package-private), and protected members, and refactoring them is a difficult task. However, if all of the interactions between classes are constrained to an interface or function prototype, then there are no surprises. Any implementation of the interface is interchangeable.

I've read some articles on test-driven development that present the notion of writing tests before writing code.  And while that has always made sense on an academic level, I've never believed it to be practical until now.  By modeling all interactions with interfaces, the tests are obvious and can actually compile (but not run) before the first line of actual code is written.
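One way this can play out, sketched in plain Java without a test framework: the check is written purely against an interface and takes a factory, so it compiles before any implementation exists. Counter, CounterContract, and IntCounter are hypothetical names, and the assertions are an assumption about what such a contract might contain.

```java
import java.util.function.Supplier;

interface Counter {
    void increment();
    int value();
}

// Written first: it refers only to the interface and a factory for it.
final class CounterContract {
    static void verify(Supplier<Counter> factory) {
        Counter counter = factory.get();
        require(counter.value() == 0, "a new counter starts at zero");
        counter.increment();
        counter.increment();
        require(counter.value() == 2, "two increments yield two");
        require(factory.get().value() == 0, "each counter has its own state");
    }

    private static void require(boolean condition, String message) {
        if (!condition) throw new AssertionError(message);
    }
}

// Written afterwards: any implementation can be handed to the same check.
final class IntCounter implements Counter {
    private int count;
    @Override public void increment() { count++; }
    @Override public int value() { return count; }
}

final class ContractDemo {
    public static void main(String[] args) {
        CounterContract.verify(IntCounter::new);
        System.out.println("contract satisfied");
    }
}
```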

Other than a few extra interface types in the project, I haven't really noticed a downside to this approach--I've even extracted interfaces for aggregate root and entity types in DDD.  It really helps put the focus on abstraction and encapsulation, and removes a lot of pressure when coding the actual implementation.