GetHashCode Made Easy

Update (2019-06-12): I updated my HashCode implementation to cover a few more scenarios which I discuss below.

Update (2018-08-14): I updated this article to talk about a new HashCode class included in .NET Core 2.1 and licensing information for my code since I’ve been asked repeatedly.

Implementing GetHashCode is hard work and little understood. If you take a look on MSDN or StackOverflow for a few pointers, you’ll see a plethora of examples with all kinds of little used C# operators and magic numbers with little explanation for developers (Especially the newbies) what they are for and why we need them. This, for a method which exists on the Object class and is the root of all that is good and wholesome in C# is surprising.

Before I continue, I recommend reading Eric Lippert’s blog post about the subject. He does not show any code, just goes into when and where we need to implement the GetHashCode method. Eric does a much better job than I could do but in short, GetHashCode is implemented wherever you implement the Equals method and ideally your class should be immutable.

Now down to the nitty gritty. How do we make implementing GetHashCode easy. Well, suppose we have the following class:

In our example we have an immutable object with a variety of fields of different types, including a collection. One possible implementation of GetHashCode according to the highest rated StackOverflow post (If modified to fit our example and deal with nulls) may be:

I don’t know about you but that code looks awfully unwieldy to me. For a start we’ve got two different magic numbers 17 and 23. Why? As it happens these are prime numbers and reduces the chance of getting collisions between hashes (Two un-equal objects are supposed to have different hash codes but sometimes this is not the case due to hash collisions that can occur).

We’ve also got the unchecked C# keyword which stops overflow checking to improve performance (That’s not something you see every day). Bear in mind that the whole point of the GetHashCode method is to allow things like the Dictionary to quickly retrieve objects.

I personally would not be able to remember how to do this each time I need to implement GetHashCode and it seems like you could very easily introduce bugs by making a typo. How about a helper class (Well actually a struct)?

The helper struct can be used in our SuperHero class like so:

Now isn’t that pretty? All the nasty magic numbers and unchecked code has been hidden away. It is a very lightweight and simple struct, so although we create new instances of it, it’s stored in the stack rather than the memory heap. What’s more, is that is code is just as fast (I’ve timed it)! We’re using generics so there is no boxing or unboxing going on. We’re still using the unchecked keyword, so overflow checking is still disabled.

One interesting edge case is what to do when hashing a collection and you get either a null or empty collection. Should you use a zero to represent both scenarios (zero is usually used to represent a null value) or differentiate them somehow. I managed to get a response from Jon Skeet himself on StackOverflow:

if both states are valid, it seems perfectly reasonable to differentiate between them. (Someone carrying an empty box isn’t the same as someone not carrying a box at all…)

Jon Skeet

This is why we use the prime number 19 (it could have been any prime number) to represent an empty collection. Whether this matters or not depends on your use case. If an empty collection means something different in your scenario, then we’ve got you covered. Generally speaking though, if you are exposing a collection property in your class you should consider making it a getter only and initializing it in the constructor, so that it is never null in the first place but here we’re trying to cover all scenarios.

.NET Core HashCode

If you are using .NET Core 2.1, consider using the System.HashCode struct instead of my code. There are two ways to use it:

HashCode.Combine

The Combine method can be used to create a hash code, given up to eight objects.

HashCode.Add

The Add method is similar to my code but it does not handle collections and is not fluent:

Advantages and Disadvantages

There are several advantages to using the .NET Core HashCode:

  • It’s part of the .NET Core base class library (BCL), so it’s maintained and officially supported.
  • It has good performance. I haven’t written a benchmark against comparing my code to the .NET Core code. If you do, I’d be interested to hear the results.
  • Like my code, it handles nulls automatically.

The disadvantages are:

  • You have to be using .NET Core 2.1 or above.
  • It does not handle collections, you have to write a for loop to handle that yourself. You could be clever and write a quick extension method to handle this scenario.
  • It’s not fluent, so you end up with uglier code. If you are a fan of C# 7 expression bodied members, then my fluent code lets you use them, while the .NET Core HashCode.Add method does not.

License

I’ve been asked repeatedly for licensing of my code above. Developers have been asking permission to use it in a Kerbal Space program plugin and even in the excellent Chocolatey project which has been totally unexpected for me because this is code I wrote years ago. Please consider the code as MIT licensed, do good with it and be excellent to each other!