Monday, January 31, 2011

Using Except() Method in LINQ

There is a method that I rarely use in LINQ called "Except" - but it comes handy in my latest coding. So I thought I'd share how I use it. This method is for an IEnumerable and has multiple signatures: one that takes a second IEnumerable to exclude and the other one takes an IEnumerable and a comparer. You can read the spec in full here.

Here is an example:

int[] oneToTen = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
int[] oneToTenEven = { 2, 4, 6, 8, 10 };
int[] oneToTenOdd;

// do some exclusion here
oneToTenOdd = oneToTen.Except(oneToTenEven);

// so that oneToTenOdd will have { 1, 3, 5, 7, 9 };
OK - that looks simple. But what about when the IEnumerable is of type T or some other complex type? This is where the second signature comes in handy. In this scenario, we will have to make our own comparer class to specify how we want both list items to be compared against each other.

Imagine this hypothetical situation where you are at a dealership and want to separate the cars that have been washed and ones that have not. Cars may have the same make, model, color, type, but each has different VIN number.

IEnumerable<Car> allCars = GetAllCars();
IEnumerable<Car> carAlreadyWashed = GetCarAlreadyWashed();
IEnumerable<Car> carNotWashed;

// do some exclusion here
carNotWashed = allCars.Except(carAlreadyWashed);

The above code which will normally work for simple comparison, won't work because the run-time will have no idea that it has to compare based on VIN number. We have to tell it to use that field to do comparison.

public class CarComparer : IEqualityComparer<Car> {
    public bool Equals(Car x, Car y) {
        //Check whether the compared objects reference the same data.
        if (Object.ReferenceEquals(x, y)) return true;

        //Check whether any of the compared objects is null.
        if (Object.ReferenceEquals(x, null) || Object.ReferenceEquals(y, null))
            return false;

         //Check whether the Car' properties are equal.
        return x.VIN == y.VIN;
    }

    // If Equals() returns true for a pair of objects
    // then GetHashCode() must return the same value for these objects.
    public int GetHashCode(Car car) {
        //Check whether the object is null
        if (Object.ReferenceEquals(plan, null)) return 0;
        //Get hash code for the VIN field.
        int hashCarVIN = car.VIN.GetHashCode();
        return hashCarVIN;
    }
}
Now then we can modify our code to be such as this:

IEnumerable<Car> allCars = GetAllCars();
IEnumerable<Car> carAlreadyWashed = GetCarAlreadyWashed();
IEnumerable<Car> carNotWashed;

// do some exclusion here
carNotWashed = allCars.Except(carAlreadyWashed, new CarComparer());


2 comments:

Aaron Stemen said...

Nice post Joe! There are more than a handful of Linq methods that I don't realize exist. Any() was one that I learned about recently instead of using Count() and comparing to 0.

Thanks Joe!

Johannes Setiabudi said...

I almost blogged about Any(), but decided to do Except() instead. O well, so that works out.