LINQ is one of the best reasons to use C# as your main programming language. It provides a simple way to query and manipulate groups of objects, and does so in an easy-to-read manner while still allowing for complex queries to be run.

Blest be the link that binds? Photo by JJ Ying / Unsplash

The Sample Solution

exceptionnotfound/CSharpInSimpleTerms
Project for this post: 14LINQBasics

What Is LINQ?

As mentioned above, LINQ (Language Integrated Query) allows us to query and manipulate groups of objects in C#. It does this in two ways: a query syntax which looks a lot like SQL queries, and an API syntax which consists of a set of method calls.

Here's an example of the query syntax:

List<int> myNumbers = new List<int> { 1, 2, 3, 4, 5, 6, 7, 8 };

var evenNumbers = from x in myNumbers
                  where x % 2 == 0
                  select x; //Get all even numbers
                  
foreach(var num in evenNumbers)
{
    Console.WriteLine(num.ToString());
}
This code block selects the even numbers from the number set and outputs them to the console.

Here's that same query using the API syntax:

List<int> myNumbers = new List<int> { 1, 2, 3, 4, 5, 6, 7, 8 };

var evenNumbers = myNumbers.Where(x => x % 2 == 0);
                  
foreach(var num in evenNumbers)
{
    Console.WriteLine(num.ToString());
}

In most situations, the API syntax is more concise, but certain queries are simpler to write and more easily understood with the query syntax.
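To make the comparison concrete, here is a minimal sketch that runs both forms of the even-numbers query and checks that they return the same elements. It uses only the list from the samples above; SequenceEqual() is a standard LINQ method that compares two sequences element by element.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var myNumbers = new List<int> { 1, 2, 3, 4, 5, 6, 7, 8 };

// Query syntax
var fromQuery = from x in myNumbers
                where x % 2 == 0
                select x;

// API syntax
var fromMethods = myNumbers.Where(x => x % 2 == 0);

// Both forms produce the same elements in the same order
bool same = fromQuery.SequenceEqual(fromMethods);

Console.WriteLine(same);                         // True
Console.WriteLine(string.Join(", ", fromQuery)); // 2, 4, 6, 8
```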

Namespace

LINQ operations can be found in the System.Linq namespace:

using System.Linq;

Anatomy of a Query

Let's break down the query we saw earlier:

List<int> myNumbers = new List<int> { 1, 2, 3, 4, 5, 6, 7, 8 };

var evenNumbers = from x in myNumbers
                  where x % 2 == 0
                  select x;

A basic LINQ query has three parts:

  1. A from and in clause. The variable after the from specifies a name for an iterator; think of it as representing each individual object in the collection. The in clause specifies the collection we are querying from.
  2. An optional where clause. This uses the variable defined by the from keyword to create conditions that objects must match in order to be returned by the query.
  3. A select clause. The select keyword specifies what parts of the object to select. This can include the entire object or only specific properties.

Here's a slightly more complex query, using a custom class:

public class User
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public int BirthYear { get; set; }
}

var users = new List<User>()
{
    new User()
    {
        FirstName = "Terrance",
        LastName = "Johnson",
        BirthYear = 2005
    },
    new User()
    {
        FirstName = "John",
        LastName = "Smith",
        BirthYear = 1966
    },
    new User()
    {
        FirstName = "Eva",
        LastName = "Birch",
        BirthYear = 2002
    }
};

//Get the first and last names of people born in 1990 or later
var fullNames = from x in users
                where x.BirthYear >= 1990
                select new { x.FirstName, x.LastName };

This shows an example of a projection: we can use LINQ to select properties of types without needing to select the entire instance, and the resulting collection consists of only the properties we selected, not the entire object.

For comparison, here's that same query using API syntax:

//Get the first and last names of people born in 1990 or later
var fullNames = users.Where(x => x.BirthYear >= 1990)
                     .Select(x => new { x.FirstName, x.LastName }); //Projection
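Here is a small runnable sketch of what a projection gives us back. To keep it self-contained, anonymous objects stand in for the User class (that substitution is an assumption of this sketch, not part of the sample project); the data matches the users above.

```csharp
using System;
using System.Linq;

// Anonymous objects stand in for the User class so the sketch is self-contained
var users = new[]
{
    new { FirstName = "Terrance", LastName = "Johnson", BirthYear = 2005 },
    new { FirstName = "John", LastName = "Smith", BirthYear = 1966 },
    new { FirstName = "Eva", LastName = "Birch", BirthYear = 2002 }
};

var fullNames = users.Where(x => x.BirthYear >= 1990)
                     .Select(x => new { x.FirstName, x.LastName });

foreach (var name in fullNames)
{
    // Only FirstName and LastName exist on the projected objects;
    // name.BirthYear would not compile here
    Console.WriteLine(name.FirstName + " " + name.LastName);
}
// Terrance Johnson
// Eva Birch
```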

The rest of the samples in this post will be in API syntax unless otherwise noted.

Filtering

There are many ways to filter the results of a query, besides using a where clause.

First

For example, we may want only the first item returned. Calling First() with no arguments returns the first element in the collection. We can also pass it a condition which records must match in order to be selected; to write that condition we use the => operator, which is the "goes to" operator.

var first = users.First(); //First element in the collection

//First element that matches a condition
var firstWithCondition = users.First(x => x.BirthYear > 2001);

The First() method throws an exception (an InvalidOperationException) if no items are found. We can have it instead return a default value by using FirstOrDefault() (for all C# classes, the default value will be null; for value types such as int, it is that type's default, e.g. 0):

//First element in collection or default value
var firstOrDefault = users.FirstOrDefault();

//First element that matches a condition OR default value
var firstOrDefaultWithCondition = users.FirstOrDefault(x => x.BirthYear > 2005);
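The difference between the two methods is easiest to see when nothing matches. The following is a minimal sketch using plain lists of years and names (made up for illustration, loosely based on the users above) rather than the User class.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var birthYears = new List<int> { 2005, 1966, 2002 };
var names = new List<string> { "Terrance", "John", "Eva" };

// No match: value types fall back to their default (0 for int),
// reference types fall back to null
int noYear = birthYears.FirstOrDefault(y => y > 2005);
var noName = names.FirstOrDefault(n => n.StartsWith("Z"));

Console.WriteLine(noYear);         // 0
Console.WriteLine(noName == null); // True

// First() throws InvalidOperationException when nothing matches
bool threw = false;
try
{
    birthYears.First(y => y > 2005);
}
catch (InvalidOperationException)
{
    threw = true;
}

Console.WriteLine(threw); // True
```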

Single

We can also get exactly one item using Single() or SingleOrDefault():

var singleUser = users.Single(x => x.FirstName == "John");

var singleUserOrDefault = users.SingleOrDefault(x => x.LastName == "Johnson");

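Both Single() and SingleOrDefault() will throw an exception if more than one item matches the condition. If no item matches, Single() also throws an exception, while SingleOrDefault() returns the default value.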
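As a rough sketch of how these two methods behave when zero, one, or several items match, consider the following. The list of last names is made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var lastNames = new List<string> { "Johnson", "Smith", "Birch", "Smith" };

// Exactly one match: the element is returned
string birch = lastNames.Single(n => n == "Birch");

// No match: SingleOrDefault() returns the default value (null for strings)
var none = lastNames.SingleOrDefault(n => n == "Jones");

// More than one match: SingleOrDefault() throws, just as Single() would
bool threwOnMany = false;
try { lastNames.SingleOrDefault(n => n == "Smith"); }
catch (InvalidOperationException) { threwOnMany = true; }

// No match: Single() throws
bool threwOnNone = false;
try { lastNames.Single(n => n == "Jones"); }
catch (InvalidOperationException) { threwOnNone = true; }

Console.WriteLine(birch);        // Birch
Console.WriteLine(none == null); // True
Console.WriteLine(threwOnMany);  // True
Console.WriteLine(threwOnNone);  // True
```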

Distinct

LINQ can even return all distinct items in a collection:

var indistinctNumbers = new List<int> { 4, 2, 6, 4, 6, 1, 7, 2, 7 };

var distinctNumbers = indistinctNumbers.Distinct();
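To see the result, here is a short sketch that prints the distinct values. The OrderBy() call is added here only to make the output order predictable; it is not required by Distinct().

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var indistinctNumbers = new List<int> { 4, 2, 6, 4, 6, 1, 7, 2, 7 };

// Remove duplicates, then sort so the printed order is predictable
var distinctNumbers = indistinctNumbers.Distinct()
                                       .OrderBy(x => x)
                                       .ToList();

Console.WriteLine(string.Join(", ", distinctNumbers)); // 1, 2, 4, 6, 7
Console.WriteLine(distinctNumbers.Count);              // 5
```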

Ordering

We can order results from a LINQ query by their properties using the methods OrderBy() and ThenBy().

//Same User class as earlier
List<User> users = SomeOtherClass.GetUsers();

var orderedUsers = users.OrderBy(x => x.FirstName)
                        .ThenBy(x => x.LastName); //Alphabetical order 
                                                  //by first name
                                                  //then last name

Note that we cannot use ThenBy() without first having an OrderBy() call.
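One way to see why ThenBy() exists: calling OrderBy() a second time does not refine the first ordering, it re-sorts the whole sequence by the new key. The sketch below uses made-up anonymous objects in place of the User class to compare the two.

```csharp
using System;
using System.Linq;

// Hypothetical data; anonymous objects stand in for the User class
var people = new[]
{
    new { FirstName = "John", LastName = "Smith" },
    new { FirstName = "Eva", LastName = "Birch" },
    new { FirstName = "John", LastName = "Adams" }
};

// ThenBy() breaks ties left by OrderBy(): first name, then last name
var refined = people.OrderBy(x => x.FirstName)
                    .ThenBy(x => x.LastName)
                    .Select(x => x.FirstName + " " + x.LastName)
                    .ToList();

// A second OrderBy() re-sorts everything by last name instead
var resorted = people.OrderBy(x => x.FirstName)
                     .OrderBy(x => x.LastName)
                     .Select(x => x.FirstName + " " + x.LastName)
                     .ToList();

Console.WriteLine(string.Join(", ", refined));  // Eva Birch, John Adams, John Smith
Console.WriteLine(string.Join(", ", resorted)); // John Adams, Eva Birch, John Smith
```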

There are also descending-order versions of these methods:

var descendingOrderUsers 
    = users.OrderByDescending(x => x.FirstName)
            .ThenByDescending(x => x.LastName); //Reverse alphabetical order by
                                                //first name, then 
                                                //by last name

We can also use the orderby and descending keywords in the query syntax:

var users = new List<User>();

var myUsers = from x in users
              orderby x.BirthYear descending, x.FirstName descending
              select x;

Aggregation

When operating on a collection of numeric values, LINQ provides a few aggregation methods, such as Sum(), Min(), Max(), Count(), and Average(). Each of them can optionally be used after a Where() clause.

var numbers = new List<int> { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };

Console.WriteLine("Sum: " + numbers.Sum()); //55

Console.WriteLine("Min: " + numbers.Where(x => x >= 2).Min()); //2
Console.WriteLine("Max: " + numbers.Where(x => x < 7).Max()); //6

//Returns the number of elements: 10
Console.WriteLine("Count: " + numbers.Count()); 

//Returns the average of numbers whose value is > 3. Result: 7
Console.WriteLine("Average: " + numbers.Where(x => x > 3).Average()); 
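The aggregation methods also have overloads that take a lambda, which lets us aggregate over a property of an object rather than over plain numbers. This is a minimal sketch, with anonymous objects standing in for the User class from earlier; as above, the last line combines Where() with Average().

```csharp
using System;
using System.Linq;

// Anonymous objects stand in for the User class so the sketch is self-contained
var users = new[]
{
    new { FirstName = "Terrance", BirthYear = 2005 },
    new { FirstName = "John", BirthYear = 1966 },
    new { FirstName = "Eva", BirthYear = 2002 }
};

int earliest = users.Min(x => x.BirthYear);                 // 1966
int latest = users.Max(x => x.BirthYear);                   // 2005
int bornAfter2000 = users.Count(x => x.BirthYear > 2000);   // 2

// Average of 2005 and 2002
double averageRecent = users.Where(x => x.BirthYear > 2000).Average(x => x.BirthYear);

Console.WriteLine("Earliest: " + earliest);
Console.WriteLine("Latest: " + latest);
Console.WriteLine("Born after 2000: " + bornAfter2000);
Console.WriteLine("Average: " + averageRecent);
```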

Method Chaining

Note the last line in the previous example, the one that uses the Average() method. The great thing about LINQ's API syntax is that we can chain methods to produce concise, readable code, even for complicated queries.

For example: say we have a collection of users, and we need to get all combined user names (first + last) ordered by the first name alphabetically, where the first letter of the last name is J and the birth year is between 2000 and 2015.

The resulting LINQ method calls look like this:

var resultUsers = moreUsers.Where(x => x.LastName[0] == 'J'
                                       && x.BirthYear >= 2000
                                       && x.BirthYear <= 2015)
                           .OrderBy(x => x.FirstName)
                           .Select(x => x.FirstName + " " + x.LastName);

In this way, even complex queries become relatively simple LINQ calls.
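The moreUsers collection is not defined above, so here is a self-contained sketch with made-up data to show the chain running end to end. The four users are hypothetical, and anonymous objects stand in for the User class.

```csharp
using System;
using System.Linq;

// Hypothetical data for illustration only
var moreUsers = new[]
{
    new { FirstName = "Terrance", LastName = "Johnson", BirthYear = 2005 },
    new { FirstName = "Amy", LastName = "Jones", BirthYear = 2010 },
    new { FirstName = "John", LastName = "Smith", BirthYear = 1966 },
    new { FirstName = "Carl", LastName = "Jacobs", BirthYear = 1998 }
};

var resultUsers = moreUsers.Where(x => x.LastName[0] == 'J'
                                       && x.BirthYear >= 2000
                                       && x.BirthYear <= 2015)
                           .OrderBy(x => x.FirstName)
                           .Select(x => x.FirstName + " " + x.LastName)
                           .ToList();

// John Smith fails the last-name check; Carl Jacobs fails the birth-year check
Console.WriteLine(string.Join(", ", resultUsers)); // Amy Jones, Terrance Johnson
```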

IEnumerable<T> and Conversion

When using LINQ, the return type of a query is often of type IEnumerable<T>. This is a generic interface that collections implement in order to be enumerable, which means they can create an iterator over the collection which can return elements within it. We will discuss generics more thoroughly in the next post.

Most of the time, operating on an IEnumerable<T> is fine if we just need certain values or a projection. We can even iterate over an IEnumerable<T> directly in a foreach loop, as we saw way back in the first two code samples in this post.

However, sometimes what we really want is a full-blown collection. For these times, LINQ includes methods that will convert IEnumerable<T> to a concrete collection, such as a List<T> or an array.

var numbers = new List<int> { 1, 2, 3, 4, 5, 6, 7, 8, 9 };

var evenNumbers = numbers.Where(x => x % 2 == 0);

List<int> list = evenNumbers.ToList();

int[] array = evenNumbers.ToArray();
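One practical reason to convert is that a LINQ query such as Where() is evaluated lazily: it runs each time we enumerate it, whereas ToList() and ToArray() run it once and store the results. The following sketch illustrates that behavior with a small list; the variable names are chosen just for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var numbers = new List<int> { 1, 2, 3, 4 };

// The query is only a description at this point; nothing has been filtered yet
var evenNumbers = numbers.Where(x => x % 2 == 0);

// ToList() runs the query now and stores the results: 2, 4
List<int> snapshot = evenNumbers.ToList();

// Change the source collection after the snapshot was taken
numbers.Add(6);

// Enumerating the query again re-runs it against the updated list
int liveCount = evenNumbers.Count();
int snapshotCount = snapshot.Count;

Console.WriteLine(liveCount);     // 3
Console.WriteLine(snapshotCount); // 2
```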

Existence Operations

LINQ can check for the existence of objects in a collection that match given conditions. For example, let's say we have a list of users, and we want to know if any of the users were born in the year 1997.

bool isAnyoneBornIn1997 = users.Any(x => x.BirthYear == 1997);

We might also use Any() with no condition to check if there are any elements in a collection:

var users = SomeOtherClass.GetCertainUsers();

bool hasAny = users.Any(); //True if there are any elements, false otherwise.

We can also check if all the users in a particular collection were born in the year 1997:

bool isEveryoneBornIn1997 = users.All(x => x.BirthYear == 1997);

We can even check if a collection contains a particular value:

List<int> newNumbers = new List<int> { 1, 2, 3, 4, 5, 6, 7, 8, 9 };

bool hasAFive = newNumbers.Contains(5);
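It can help to know how these methods behave on an empty collection: Any() returns false, while All() returns true because there is no element that fails the condition. A minimal sketch, using a plain list of birth years made up for illustration:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var birthYears = new List<int> { 2005, 1966, 2002 };
var empty = new List<int>();

bool anyBefore1970 = birthYears.Any(y => y < 1970); // true: 1966 matches
bool allAfter1970 = birthYears.All(y => y > 1970);  // false: 1966 does not match

bool anyInEmpty = empty.Any();                      // false: no elements at all
bool allInEmpty = empty.All(y => y > 1970);         // true: nothing fails the condition

Console.WriteLine(anyBefore1970); // True
Console.WriteLine(allAfter1970);  // False
Console.WriteLine(anyInEmpty);    // False
Console.WriteLine(allInEmpty);    // True
```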

Set Operations

LINQ allows us to perform set operations against two or more sets of objects.

Intersection

An intersection is the group of objects that appear in both of two lists.

var intersectionList1 = new List<int> { 1, 2, 3, 4, 5, 6, 7, 8, 9 };
var intersectionList2 = new List<int> { 2, 4, 6, 8, 10, 12, 14 };

var intersection = intersectionList1.Intersect(intersectionList2); 
//{ 2, 4, 6, 8 }

Union

A union is the combined list of unique objects from two separate lists. An element which appears in both lists will only be listed in the union object once.

var unionList1 = new List<int> { 5, 7, 3, 2, 9, 8 };
var unionList2 = new List<int> { 9, 4, 6, 1, 5 };

var union = unionList1.Union(unionList2); //{ 5, 7, 3, 2, 9, 8, 4, 6, 1 }

Except

There is also the LINQ method Except(), which produces the elements that are in the first set, but not in the second set.

var exceptList1 = new List<int> { 1, 2, 3, 4, 5, 6, 7, 8, 9 };
var exceptList2 = new List<int> { 7, 2, 8, 5, 0, 10, 3 };

var except = exceptList1.Except(exceptList2); //{ 1, 4, 6, 9 }
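The set operations treat their inputs as sets, so duplicates within a single list are dropped too. For contrast, the sketch below also uses Concat(), a LINQ method not covered above, which simply appends one sequence to another and keeps every duplicate. The lists are made up for illustration, and the OrderBy() calls are only there to make the printed order predictable.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var first = new List<int> { 1, 2, 2, 3 };
var second = new List<int> { 3, 4, 4 };

// Set operations return each value at most once
var union = first.Union(second).OrderBy(x => x).ToList();         // 1, 2, 3, 4
var intersection = first.Intersect(second).ToList();              // 3
var except = first.Except(second).OrderBy(x => x).ToList();       // 1, 2

// Concat() is not a set operation: all seven elements are kept
var concatenated = first.Concat(second).ToList();

Console.WriteLine(string.Join(", ", union));        // 1, 2, 3, 4
Console.WriteLine(string.Join(", ", intersection)); // 3
Console.WriteLine(string.Join(", ", except));       // 1, 2
Console.WriteLine(concatenated.Count);              // 7
```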

Grouping

Imagine we have the following Book class:

public class Book 
{
    public long ID { get; set; }
    public string Title { get; set; }
    public string AuthorName { get; set; }
    public int YearOfPublication { get; set; }
}

Also imagine that we have the following set of Book instances in a collection:

var books = new List<Book>()
{
    new Book()
    {
        ID = 1,
        Title = "Title 1",
        AuthorName = "Author 1",
        YearOfPublication = 2015
    },
    new Book()
    {
        ID = 2,
        Title = "Title 2",
        AuthorName = "Author 2",
        YearOfPublication = 2015
    },
    new Book()
    {
        ID = 3,
        Title = "Title 3",
        AuthorName = "Author 1",
        YearOfPublication = 2017
    },
    new Book()
    {
        ID = 4,
        Title = "Title 4",
        AuthorName = "Author 3",
        YearOfPublication = 1999
    },
    new Book()
    {
        ID = 5,
        Title = "Title 5",
        AuthorName = "Author 4",
        YearOfPublication = 2017
    },
};

One query we might want to run is to list the books grouped by publication year, with the years in ascending order. For this query, we don't care about the order of titles or author names; we only care about which books fall into each publication year.

We can execute this query using a group by query. A group by query has the following format:

var results = from collectionVar in collectionName
              group collectionVar by collectionVar.PropertyName 
                  into varGroupName
              orderby varGroupName.Key //orderby is optional
              select new { 
                  Key = varGroupName.Key, 
                  Objects = varGroupName.ToList() 
              };

Using this format, our query to get all books in order by publishing year looks like this:

List<Book> books = SomeOtherClass.GetBooks();

var results = from b in books
              group b by b.YearOfPublication into g
              orderby g.Key
              select new { Year = g.Key, Books = g.ToList() };

We could then use a nested foreach loop to output all the books:

foreach(var result in results)
{
    Console.WriteLine("Books published in " + result.Year.ToString());

    var yearBooks = result.Books;
    foreach(var book in yearBooks)
    {
        Console.WriteLine(book.Title + " by " + book.AuthorName);
    }
}

Which gives these results:

Books published in 1999
Title 4 by Author 3
Books published in 2015
Title 1 by Author 1
Title 2 by Author 2
Books published in 2017
Title 3 by Author 1
Title 5 by Author 4

Which you can see for yourself if you clone and run the sample project.
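Grouping is also available in API syntax through the GroupBy() method. The sketch below is one possible way to count the books in each publication year; it uses anonymous objects with the same titles and years as the books above in place of the Book class, so that it can run on its own.

```csharp
using System;
using System.Linq;

// Anonymous objects stand in for the Book class so the sketch is self-contained
var books = new[]
{
    new { Title = "Title 1", YearOfPublication = 2015 },
    new { Title = "Title 2", YearOfPublication = 2015 },
    new { Title = "Title 3", YearOfPublication = 2017 },
    new { Title = "Title 4", YearOfPublication = 1999 },
    new { Title = "Title 5", YearOfPublication = 2017 }
};

// Group by year, order the groups by their key, then count each group
var countsByYear = books.GroupBy(b => b.YearOfPublication)
                        .OrderBy(g => g.Key)
                        .Select(g => new { Year = g.Key, Count = g.Count() })
                        .ToList();

foreach (var entry in countsByYear)
{
    Console.WriteLine(entry.Year + ": " + entry.Count);
}
// 1999: 1
// 2015: 2
// 2017: 2
```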

Glossary

  • Query Syntax - LINQ queries which use the from, in, where, and select keywords.
  • API Syntax - LINQ queries which use methods, e.g. Where() or First().
  • Iterator - An object which iterates over elements in a collection. In LINQ queries, iterators are given names and are used in conditions.
  • Conditions - When referring to LINQ queries, boolean expressions which must be true for an individual element in order for that element to be returned by the query. These are sometimes called predicates.
  • Projection - A set of properties from a class that we select as part of a LINQ query. Can also be properties from multiple classes in the select set.
  • Set operations - Operations on two or more collections. Can produce the intersection, union, or except group.

New Keywords and Operators

  • from - In a LINQ query, specifies a variable name to use for the iterator over a collection.
  • in - In a LINQ query, specifies the source collection the query will execute against.
  • where - In a LINQ query, specifies one or more conditions that objects in the collection must satisfy in order to be selected.
  • select - In a LINQ query, specifies the objects or projections that will be created by the query.
  • orderby - In a LINQ query, specifies one or more properties to order the results by.
  • descending - In a LINQ query, specifies that the objects are to be ordered by the given property in descending order.
  • => - The "goes to" operator. Used to create lambda expressions in LINQ statements.

Summary

LINQ (Language Integrated Query) is a set of technologies that allow us to operate on and select elements from collections. Among the many operations we can perform are queries, ordering, conversion, set operations, existence operations, and grouping. Filtering, ordering, projection, and grouping are available in either query syntax or API syntax, while operations such as First(), Distinct(), and Any() exist only as methods. API syntax is favored most of the time, but some queries are easier to write in query syntax.

There are quite a few more advanced things we can do with LINQ. If you want more samples, check out the 101 LINQ Samples group in this repository.

dotnet/try-samples

You might have noticed the <T> syntax; it appears a lot in this post. This is representative of a generic, and we will discuss generics in the next post of this series. Check that out here:

C# in Simple Terms - Generics
Let’s make types that use other types!

You might also have noticed the => operator; this is the lambda operator and we read this aloud as "goes to". It is representative of both lambdas and expressions, which we will discuss in a later post.

Got questions about LINQ? I wanna hear them! Ask away in the comments below. And yes, I do know that's Zelda, not Link, in the page photo. :)

Happy Coding!