Tuesday, 11 November 2008

Day 2 - 15:15 : How LINQ works: A deep dive into the C# implementation of LINQ.

Luke Hoban presented this 400-level session. Luke is the program manager for Visual Studio languages.

Having not really used LINQ in anger, I was interested in this session as a way to understand a little more about LINQ and how it works. I got lost at about the time he started talking about expression trees for D-LINQ, which takes the query and, rather than executing it, builds an SQL query from it, but I managed to get the basic principles down. If any of the following is wrong, then that's purely my fault, not the presenter's.

LINQ to Objects - translation from query to code
Despite appearances, the following query isn't compiled directly to IL. Instead it is first translated into method invocations, and in fact you could have written the same query in its translated form:

var query = from c in GetSomeList()
    where c.City == "London"
    select c;

The translated form:

bool IsCustomerFromLondon(SomeType obj)
{
    return obj.City == "London";
}

IEnumerable<SomeType> query = GetSomeList().Where(IsCustomerFromLondon);

In fact, if we were doing it this way, we could make life easier for ourselves by using a Lambda expression:

IEnumerable<SomeType> query = GetSomeList().Where( c => c.City == "London" );

The C# compiler does exactly this: it transforms the query into the method invocations above, and that is what gets turned into IL and built into your assemblies. The main difference is that the compiler optimises the delegate construction (which is expensive) by caching the delegate, so when the query is used again the cached instance is re-used. The lambda expression itself is compiled into a method that resembles our IsCustomerFromLondon method, but with a munged name and marked with the [CompilerGenerated] attribute.
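To make the translation concrete, here is a minimal sketch that runs the same filter three ways: as query syntax, as a lambda, and as a hand-written stand-in for what the compiler generates. The Customer class, the sample data and the CachedFilter field are my own assumptions for illustration; the real compiler-generated names are munged and the real cache is filled lazily on first use.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical element type standing in for SomeType from the post.
public class Customer
{
    public Customer(string name, string city) { Name = name; City = city; }
    public string Name { get; }
    public string City { get; }
}

public static class QueryTranslationDemo
{
    // Hypothetical stand-in for GetSomeList(); the data is made up.
    public static IEnumerable<Customer> GetSomeList()
    {
        return new List<Customer>
        {
            new Customer("Alice", "London"),
            new Customer("Bob", "Paris"),
            new Customer("Carol", "London"),
        };
    }

    // Hand-written equivalent of the method generated for the lambda.
    private static bool IsCustomerFromLondon(Customer c)
    {
        return c.City == "London";
    }

    // Simplified delegate cache: created once and reused on every call.
    // (Assumption: the compiler does this lazily; a static field is close enough here.)
    private static readonly Func<Customer, bool> CachedFilter = IsCustomerFromLondon;

    public static List<string> QuerySyntax()
    {
        var query = from c in GetSomeList()
                    where c.City == "London"
                    select c;
        return query.Select(c => c.Name).ToList();
    }

    public static List<string> LambdaSyntax()
    {
        return GetSomeList().Where(c => c.City == "London").Select(c => c.Name).ToList();
    }

    public static List<string> HandTranslated()
    {
        return GetSomeList().Where(CachedFilter).Select(c => c.Name).ToList();
    }

    public static void Main()
    {
        // All three lines print: Alice, Carol
        Console.WriteLine(string.Join(", ", QuerySyntax()));
        Console.WriteLine(string.Join(", ", LambdaSyntax()));
        Console.WriteLine(string.Join(", ", HandTranslated()));
    }
}
```

All three forms give the same result, which is the point: the query syntax is just a friendlier spelling of the method calls.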

The GetSomeList() method above returns an IEnumerable<T> object. LINQ provides an extension method on IEnumerable<T>, the Where method, which takes a Func<T, bool> delegate used to test each item. It is implemented roughly as follows:

public static IEnumerable<T> Where<T>(this IEnumerable<T> source, Func<T, bool> filter)
{
    foreach (T item in source)
    {
        // Only hand back the items that pass the filter.
        if (filter(item))
            yield return item;
    }
}

Deferred Execution
Be aware though that the body of this Where method isn't executed when the query is built; rather, it runs on demand. Using our original query, the filtering only happens once we enumerate the results:

foreach (SomeType item in query)
{
    // do something to item
}

The body of the Where method starts running at this point. As soon as it finds a match, it yields back to the caller; when the caller asks for the next item, it picks up where it left off. This is all done under the covers, but you can see it in operation by implementing your own Where extension method on IEnumerable<T> and stepping through the code in the debugger.
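If you don't fancy stepping through in the debugger, a rough sketch like the one below shows the same thing with a log. TracingWhere and the Log list are names I've made up for this example (I've avoided calling it Where so it can't clash with the real one), and the city list is sample data.

```csharp
using System;
using System.Collections.Generic;

public static class TracingExtensions
{
    // Records the order in which things happen.
    public static readonly List<string> Log = new List<string>();

    // Same shape as the Where method above, with logging added.
    public static IEnumerable<T> TracingWhere<T>(this IEnumerable<T> source, Func<T, bool> filter)
    {
        Log.Add("filter started");
        foreach (T item in source)
        {
            if (filter(item))
            {
                Log.Add("yielding " + item);
                yield return item;
            }
        }
        Log.Add("filter finished");
    }
}

public static class DeferredExecutionDemo
{
    public static List<string> Run()
    {
        TracingExtensions.Log.Clear();
        var cities = new List<string> { "London", "Paris", "London" };

        // Building the query does not run any of the filter body yet.
        IEnumerable<string> query = cities.TracingWhere(c => c == "London");
        TracingExtensions.Log.Add("query built");

        // Enumerating is what drives the filter, one item at a time.
        foreach (string city in query)
        {
            TracingExtensions.Log.Add("caller got " + city);
        }

        return new List<string>(TracingExtensions.Log);
    }

    public static void Main()
    {
        foreach (string line in Run())
            Console.WriteLine(line);
    }
}
```

The log starts with "query built" and only then "filter started", and the "yielding" and "caller got" lines alternate, which is the filter and the caller handing control back and forth.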

Expression trees
Luke then went on to describe expression trees, where LINQ operates on the IQueryable interface instead of IEnumerable. The result is that the Where extension method and its siblings build expression trees that describe the code passed to them, rather than holding references to delegates. The data provider can then use this description to build SQL queries or similar. To be honest, I got lost at this point of the presentation as I've had only limited exposure to LINQ and haven't ever looked at the inner workings before. I'm hoping to find out more over the coming days of Tech Ed.
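Having gone away and poked at it, the basic idea can be sketched as below. This is my own small example, not something from the session: the Customer class and method names are assumptions, and it only inspects the tree rather than building any SQL. Assigning a lambda to Expression<Func<T, bool>> instead of Func<T, bool> makes the compiler emit a data structure describing the code, which can be examined or compiled into a delegate later.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

// Hypothetical element type for the example.
public class Customer
{
    public Customer(string city) { City = city; }
    public string City { get; }
}

public static class ExpressionTreeDemo
{
    // The same lambda as before, but captured as a tree instead of a delegate.
    private static readonly Expression<Func<Customer, bool>> Filter =
        c => c.City == "London";

    // Walks the tree and describes it, much as a provider would before emitting SQL.
    public static string Describe()
    {
        var body = (BinaryExpression)Filter.Body;      // c.City == "London"
        var left = (MemberExpression)body.Left;        // c.City
        var right = (ConstantExpression)body.Right;    // "London"
        return left.Member.Name + " " + body.NodeType + " " + right.Value;
    }

    // A tree can still be turned into an ordinary delegate and executed.
    public static bool CompileAndRun(string city)
    {
        Func<Customer, bool> compiled = Filter.Compile();
        return compiled(new Customer(city));
    }

    // On IQueryable, Where records the call in the query's expression tree.
    public static string QueryableMethodName()
    {
        IQueryable<Customer> query =
            new List<Customer> { new Customer("London") }.AsQueryable().Where(Filter);
        return ((MethodCallExpression)query.Expression).Method.Name;
    }

    public static void Main()
    {
        Console.WriteLine(Describe());             // City Equal London
        Console.WriteLine(CompileAndRun("Paris")); // False
        Console.WriteLine(QueryableMethodName());  // Where
    }
}
```

A provider such as LINQ to SQL walks a tree like this to work out that it needs something along the lines of a WHERE clause on City, instead of running the filter in memory.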
