LINQ wonder world

Using LINQ to group on two columns

If you are a developer, I am sure that at some point in your career you have heard about LINQ and asked yourself: what is it? If not, no problem, you’ll find out now: LINQ stands for Language-Integrated Query, and it is a collection of extensions to the .NET Framework that encompasses language-integrated query, set, and transform operations. It helps developers by extending both C# and Visual Basic with native querying capabilities. It also comes with class libraries that let you take full advantage of these capabilities.

As we discussed in my previous article, Import CSV file and query it with LINQ, we are able to load and parse the file into an IQueryable collection, which gives us the opportunity to run more and more SQL-like operations on it.

The C# 3.0 specification defines a so-called Query Expression Pattern along with translation rules from a LINQ expression to an expression in a subset of C# 3.0 without LINQ expressions. The translation thus defined is actually untyped, which, in addition to lambda expressions being interpretable as either delegates or expression trees, allows for a great degree of flexibility for libraries wishing to expose parts of their interface as LINQ expression clauses. For example, LINQ to Objects works on IEnumerable<T> sequences with delegates, whereas LINQ to SQL makes use of expression trees.
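To make the translation rule concrete, here is a minimal sketch over LINQ to Objects. The class and method names (QueryTranslationDemo, EvenSquaresQuery, EvenSquaresMethods) and the sample numbers are made up for illustration. The first method uses a query expression; the second spells out roughly the method calls the compiler turns it into.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical demo class; none of these names come from the article.
public static class QueryTranslationDemo
{
    // Query expression syntax.
    public static List<int> EvenSquaresQuery(IEnumerable<int> numbers)
    {
        var query = from n in numbers
                    where n % 2 == 0
                    select n * n;
        return query.ToList();
    }

    // Roughly what the compiler translates the query above into:
    // plain extension-method calls that take lambda expressions.
    public static List<int> EvenSquaresMethods(IEnumerable<int> numbers)
    {
        return numbers.Where(n => n % 2 == 0)
                      .Select(n => n * n)
                      .ToList();
    }

    public static void Main()
    {
        int[] numbers = { 1, 2, 3, 4, 5, 6 };
        Console.WriteLine(string.Join(", ", EvenSquaresQuery(numbers)));   // 4, 16, 36
        Console.WriteLine(string.Join(", ", EvenSquaresMethods(numbers))); // 4, 16, 36
    }
}
```

Because the rewrite is purely syntactic, any type that offers suitable Where and Select methods can take part in a query expression, which is what lets different providers plug into the same syntax.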

LINQ works through providers: code that lets you query a specific type of data. Visual Studio ships with LINQ providers for in-memory objects (LINQ to Objects), SQL (LINQ to SQL, formerly DLINQ) and XML (LINQ to XML, formerly XLINQ). Looking on the internet you will find many more providers, for example for Excel, Flickr, Google, SharePoint, WMI, etc.

The best thing is that the query syntax is the same for every provider, so you won’t have to learn a new syntax each time you use a different provider. You can even query multiple providers in one query. The LINQ query syntax shares many keywords with SQL and looks similar to it; only the order of the clauses is different, because a query starts with the ‘from’ clause. The main reason for this is that IntelliSense can give you hints as soon as you have specified the object in the from clause.
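As a small illustration of mixing sources in one query, the sketch below joins an in-memory array (LINQ to Objects) with elements of an XML fragment (LINQ to XML). The data, the XML layout and the names MixedSourcesDemo and DescribeShares are all hypothetical. Note how the query opens with the from clause, so the type of s is known before the rest of the query is written.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical demo class; the users and the XML layout are invented.
public static class MixedSourcesDemo
{
    public static List<string> DescribeShares()
    {
        // In-memory source, queried through LINQ to Objects.
        var users = new[]
        {
            new { Id = 1, Name = "Ann" },
            new { Id = 2, Name = "Bob" }
        };

        // XML source, queried through LINQ to XML.
        XElement shares = XElement.Parse(
            "<shares>" +
            "<share userId=\"1\" book=\"Dune\" />" +
            "<share userId=\"2\" book=\"Emma\" />" +
            "<share userId=\"1\" book=\"Ulysses\" />" +
            "</shares>");

        // One query over both sources: join each XML element to its user.
        var query = from s in shares.Elements("share")
                    join u in users on (int)s.Attribute("userId") equals u.Id
                    orderby u.Name, (string)s.Attribute("book")
                    select u.Name + " shared " + (string)s.Attribute("book");

        return query.ToList();
    }

    public static void Main()
    {
        foreach (string line in DescribeShares())
        {
            Console.WriteLine(line);
        }
    }
}
```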

So much for the theory; now let’s look at some examples. A good starting point is 101 LINQ Samples, a comprehensive collection of queries grouped into logical categories.

The most challenging query for me so far is grouping data from a table based on two columns. Let’s assume that we have a table of users who share books. For the example, we further assume that the table also keeps history, so a pair of users can share the same book many times (and thus produce multiple rows). The question is: how do we count how many books the users share, grouped by user pair?

var Links = (from b in BooksShared
       // The grouping key is the anonymous type new { b.SharerID, b.TargetID },
       // so each distinct (sharer, target) pair forms one group
       group b by new { b.SharerID, b.TargetID } into g
       select new { UsersLinked = g.Key, LinksCount = g.Count() });

// writing the links into an XML file

XElement root = new XElement("root");

XElement links = new XElement("links");

// Note: each Link has the UsersLinked structure (SharerID, TargetID) as its key and LinksCount as an integer, the number of links between sharer and target
foreach (var Link in Links) {
  links.Add(new XElement("link",
                new XElement("UserId", Link.UsersLinked.SharerID),
                new XElement("TargetID", Link.UsersLinked.TargetID),
                new XElement("Count", Link.LinksCount)));