LINQ is one of the best reasons to use C# as your main programming language. It provides a simple way to query and manipulate groups of objects, and does so in an easy-to-read manner while still allowing for complex queries to be run.
What Is LINQ?
As mentioned above, LINQ (Language Integrated Query) allows us to query and manipulate groups of objects in C#. It does this in two ways: a query syntax which looks a lot like SQL queries, and an API syntax which consists of a set of method calls.
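Here's an example of the query syntax:
List<int> myNumbers = new List<int> { 1, 2, 3, 4, 5, 6, 7, 8 };
var evenNumbers = from x in myNumbers
                  where x % 2 == 0
                  select x;
foreach(var num in evenNumbers)
{
    Console.WriteLine(num.ToString());
}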
Here's that same query using the API syntax:
List<int> myNumbers = new List<int> { 1, 2, 3, 4, 5, 6, 7, 8 };
var evenNumbers = myNumbers.Where(x => x % 2 == 0);
foreach(var num in evenNumbers)
{
    Console.WriteLine(num.ToString());
}
In most situations, the API syntax is more concise, but certain queries are simpler to write and more easily understood with the query syntax.
Namespace
LINQ operations can be found in the System.Linq namespace:
using System.Linq;
Anatomy of a Query
Let's break down the query we saw earlier:
List<int> myNumbers = new List<int> { 1, 2, 3, 4, 5, 6, 7, 8 };
var evenNumbers = from x in myNumbers
                  where x % 2 == 0
                  select x;
A basic LINQ query has three parts:
- A from and in clause. The variable after the from specifies a name for an iterator; think of it as representing each individual object in the collection. The in clause specifies the collection we are querying from.
- An optional where clause. This uses the variable defined by the from keyword to create conditions that objects must match in order to be returned by the query.
- A select clause. The select keyword specifies what parts of the object to select. This can include the entire object or only specific properties.
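To make the three parts concrete, here's a small sketch (not from the sample project) with one query that skips the optional where clause and one that uses all three parts. The names squares and bigSquares are made up for illustration, and the snippet assumes C# 9 or later so it can run as top-level statements.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

List<int> myNumbers = new List<int> { 1, 2, 3, 4 };

// from/in + select only: the where clause is optional.
// select can return a computed value rather than the element itself.
var squares = from x in myNumbers
              select x * x;

// All three parts: from/in, where, select.
var bigSquares = from x in myNumbers
                 where x > 2
                 select x * x;

Console.WriteLine(string.Join(", ", squares));    // 1, 4, 9, 16
Console.WriteLine(string.Join(", ", bigSquares)); // 9, 16
```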
Here's a slightly more complex query, using a custom class:
public class User
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public int BirthYear { get; set; }
}
var users = new List<User>()
{
    new User()
    {
        FirstName = "Terrance",
        LastName = "Johnson",
        BirthYear = 2005
    },
    new User()
    {
        FirstName = "John",
        LastName = "Smith",
        BirthYear = 1966
    },
    new User()
    {
        FirstName = "Eva",
        LastName = "Birch",
        BirthYear = 2002
    }
};
//Get the first and last names of people born in 1990 or later
var fullNames = from x in users
                where x.BirthYear >= 1990
                select new { x.FirstName, x.LastName };
This shows an example of a projection: we can use LINQ to select properties of types without needing to select the entire instance, and the resulting collection consists of only the properties we selected, not the entire object.
For comparison, here's that same query using API syntax:
//Get the first and last names of people born in 1990 or later
var fullNames = users.Where(x => x.BirthYear >= 1990)
                     .Select(x => new { x.FirstName, x.LastName }); //Projection
The rest of the samples in this post will be in API syntax unless otherwise noted.
Filtering
There are many ways to filter the results of a query besides using a where clause.
First
For example, we may want only the first item returned, which we can get with the First() method. We can also pass First() a lambda expression, written with the => operator (the "goes to" operator), to define a condition which records must match in order to be selected.
var first = users.First(); //First element in the collection
//First element that matches a condition
var firstWithCondition = users.First(x => x.BirthYear > 2001);
The First() method throws an exception if no items are found. We can have it instead return a default value by using FirstOrDefault() (for classes and other reference types, the default value is null; for value types such as int, it is that type's default, e.g. 0):
//First element in collection or default value
var firstOrDefault = users.FirstOrDefault();
//First element that matches a condition OR default value
var firstOrDefaultWithCondition = users.FirstOrDefault(x => x.BirthYear > 2005);
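Here's a minimal sketch of what "default value" means for different element types. The lists and variable names are invented for illustration, and the snippet assumes C# 9 or later (top-level statements).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var numbers = new List<int> { 1, 2, 3 };
var names = new List<string> { "Eva", "John" };

// No int is greater than 10, so we get default(int), which is 0.
int noNumber = numbers.FirstOrDefault(x => x > 10);

// No name starts with "Z", so we get default(string), which is null.
var noName = names.FirstOrDefault(x => x.StartsWith("Z"));

Console.WriteLine(noNumber);       // 0
Console.WriteLine(noName == null); // True
```

Because 0 could also be a legitimate element, it's worth keeping in mind that a default result from a collection of value types doesn't always mean "nothing matched".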
Single
We can also get exactly one item using Single() or SingleOrDefault():
var singleUser = users.Single(x => x.FirstName == "John");
var singleUserOrDefault = users.SingleOrDefault(x => x.LastName == "Johnson");
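Both Single() and SingleOrDefault() will throw an exception if more than one item matches the condition. Single() also throws if no item matches, whereas SingleOrDefault() returns the default value in that case.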
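The sketch below walks through how these two methods behave when there is one match, no match, or several matches. The list and flag names are hypothetical, and the snippet assumes C# 9 or later (top-level statements).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var numbers = new List<int> { 1, 2, 2, 3 };

int one = numbers.Single(x => x == 3);            // exactly one match: 3
int none = numbers.SingleOrDefault(x => x == 99); // no match: default(int), which is 0

// Single() throws InvalidOperationException when nothing matches.
bool singleThrewOnNoMatch = false;
try { numbers.Single(x => x == 99); }
catch (InvalidOperationException) { singleThrewOnNoMatch = true; }

// SingleOrDefault() still throws when more than one element matches.
bool threwOnDuplicates = false;
try { numbers.SingleOrDefault(x => x == 2); }
catch (InvalidOperationException) { threwOnDuplicates = true; }

Console.WriteLine(one);                  // 3
Console.WriteLine(none);                 // 0
Console.WriteLine(singleThrewOnNoMatch); // True
Console.WriteLine(threwOnDuplicates);    // True
```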
Distinct
LINQ can even return all distinct items in a collection:
var indistinctNumbers = new List<int> { 4, 2, 6, 4, 6, 1, 7, 2, 7 };
var distinctNumbers = indistinctNumbers.Distinct();
Ordering
We can order results from a LINQ query by their properties using the methods OrderBy() and ThenBy().
//Same User class as earlier
List<User> users = SomeOtherClass.GetUsers();
var orderedUsers = users.OrderBy(x => x.FirstName)
                        .ThenBy(x => x.LastName); //Alphabetical order
                                                  //by first name
                                                  //then last name
Note that we cannot use ThenBy() without first having an OrderBy() call.
There are also descending-order versions of these methods:
var descendingOrderUsers
    = users.OrderByDescending(x => x.FirstName)
           .ThenByDescending(x => x.LastName); //Reverse alphabetical order by
                                               //first name, then
                                               //by last name
We can also use the orderby and descending keywords in the query syntax:
var users = new List<User>();
var myUsers = from x in users
              orderby x.BirthYear descending, x.FirstName descending
              select x;
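To see how a primary and a secondary sort key interact, here's a small runnable sketch that mixes a descending key with an ascending one. The data is hypothetical: anonymous objects stand in for the User class so the snippet is self-contained, and "Adam" is added only to create a tie on birth year. It assumes C# 9 or later (top-level statements).

```csharp
using System;
using System.Linq;

// Hypothetical data: anonymous objects standing in for the User class.
var users = new[]
{
    new { FirstName = "Terrance", BirthYear = 2005 },
    new { FirstName = "John", BirthYear = 1966 },
    new { FirstName = "Eva", BirthYear = 2002 },
    new { FirstName = "Adam", BirthYear = 2002 }
};

// Newest birth year first; ties are broken alphabetically by first name.
var ordered = users.OrderByDescending(x => x.BirthYear)
                   .ThenBy(x => x.FirstName)
                   .Select(x => x.FirstName)
                   .ToList();

Console.WriteLine(string.Join(", ", ordered)); // Terrance, Adam, Eva, John
```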
Aggregation
When operating on a collection of numeric values, LINQ provides a few aggregation methods, such as Sum(), Min(), Max(), Count(), and Average(). Each of them can optionally be used after a Where() call.
var numbers = new List<int> { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
Console.WriteLine("Sum: " + numbers.Sum()); //55
Console.WriteLine("Min: " + numbers.Where(x=> x >= 2).Min()); //2
Console.WriteLine("Max: " + numbers.Where(x => x < 7).Max()); //6
//Returns the number of elements: 10
Console.WriteLine("Count: " + numbers.Count());
//Returns the average of numbers whose value is > 3. Result: 7
Console.WriteLine("Average: " + numbers.Where(x => x > 3).Average());
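The aggregation methods also have overloads that take a lambda, which is handy when the collection holds objects rather than plain numbers. Here's a minimal sketch, assuming anonymous objects with the same names and birth years as the earlier User sample; it assumes C# 9 or later (top-level statements).

```csharp
using System;
using System.Linq;

// Hypothetical data: anonymous objects standing in for the User class.
var users = new[]
{
    new { FirstName = "Terrance", BirthYear = 2005 },
    new { FirstName = "John", BirthYear = 1966 },
    new { FirstName = "Eva", BirthYear = 2002 }
};

int latest = users.Max(x => x.BirthYear);                  // 2005
int earliest = users.Min(x => x.BirthYear);                // 1966
double average = users.Average(x => x.BirthYear);          // (2005 + 1966 + 2002) / 3 = 1991
int bornSince1990 = users.Count(x => x.BirthYear >= 1990); // 2

Console.WriteLine("Latest: " + latest);
Console.WriteLine("Earliest: " + earliest);
Console.WriteLine("Average: " + average);
Console.WriteLine("Born since 1990: " + bornSince1990);
```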
Method Chaining
Note the last line in the previous example, the one that uses the Average() method. The great thing about LINQ's API syntax is that we can chain methods to produce concise, readable code, even for complicated queries.
For example: say we have a collection of users, and we need to get all combined user names (first + last) ordered by the first name alphabetically, where the first letter of the last name is J and the birth year is between 2000 and 2015.
The resulting LINQ method calls look like this:
var resultUsers = moreUsers.Where(x => x.LastName[0] == 'J'
                                       && x.BirthYear >= 2000
                                       && x.BirthYear <= 2015)
                           .OrderBy(x => x.FirstName)
                           .Select(x => x.FirstName + " " + x.LastName);
In this way, even complex queries become relatively simple LINQ calls.
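If you'd like to run that chain yourself, here's a self-contained sketch. The contents of moreUsers are invented for illustration (the post doesn't show them), and anonymous objects stand in for the User class. It assumes C# 9 or later (top-level statements).

```csharp
using System;
using System.Linq;

// Hypothetical data; only the first two entries satisfy every condition.
var moreUsers = new[]
{
    new { FirstName = "Terrance", LastName = "Johnson", BirthYear = 2005 },
    new { FirstName = "Amy", LastName = "Jones", BirthYear = 2010 },
    new { FirstName = "John", LastName = "Smith", BirthYear = 1966 },   // last name doesn't start with J
    new { FirstName = "Zed", LastName = "Jackson", BirthYear = 1995 },  // born too early
    new { FirstName = "Bob", LastName = "James", BirthYear = 2016 }     // born too late
};

var resultUsers = moreUsers.Where(x => x.LastName[0] == 'J'
                                       && x.BirthYear >= 2000
                                       && x.BirthYear <= 2015)
                           .OrderBy(x => x.FirstName)
                           .Select(x => x.FirstName + " " + x.LastName)
                           .ToList();

Console.WriteLine(string.Join(", ", resultUsers)); // Amy Jones, Terrance Johnson
```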
IEnumerable<T> and Conversion
When using LINQ, the return type of a query is often of type IEnumerable<T>. This is a generic interface that collections implement in order to be enumerable, which means they can create an iterator over the collection which can return elements within it. We will discuss generics more thoroughly in the next post.
Most of the time, operating on an IEnumerable<T> is fine if we just need certain values or a projection. We can even iterate over an IEnumerable<T> in a foreach loop, as we saw way back in the first two code samples in this post.
However, sometimes what we really want is a full-blown collection. For these times, LINQ includes methods that will convert IEnumerable<T> to a concrete collection, such as a List<T> or an array.
var numbers = new List<int> { 1, 2, 3, 4, 5, 6, 7, 8, 9 };
var evenNumbers = numbers.Where(x => x % 2 == 0);
List<int> list = evenNumbers.ToList();
int[] array = evenNumbers.ToArray();
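One practical reason to convert: a query like Where() is evaluated lazily, each time we enumerate it, while ToList() runs it once and stores the results. The sketch below illustrates that difference under those assumptions. It reuses the names from the sample above and assumes C# 9 or later (top-level statements).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var numbers = new List<int> { 1, 2, 3, 4, 5, 6, 7, 8, 9 };
var evenNumbers = numbers.Where(x => x % 2 == 0); // a query, not yet a stored collection
List<int> list = evenNumbers.ToList();            // runs the query now and stores { 2, 4, 6, 8 }

numbers.Add(10); // change the source after converting

Console.WriteLine(list.Count);          // 4: the list is a snapshot
Console.WriteLine(evenNumbers.Count()); // 5: the query runs again and now sees 10
Console.WriteLine(list[0]);             // 2: List<T> supports indexing
```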
Existence Operations
LINQ can check for the existence of objects in a collection that match given conditions. For example, let's say we have a list of users, and we want to know if any of the users were born in the year 1997.
bool isAnyoneBornIn1997 = users.Any(x => x.BirthYear == 1997);
We might also use Any() with no condition to check if there are any elements in a collection:
var users = SomeOtherClass.GetCertainUsers();
bool hasAny = users.Any(); //True if there are any elements, false otherwise.
We can also check if all the users in a particular collection were born in the year 1997:
bool isEveryoneBornIn1997 = users.All(x => x.BirthYear == 1997);
We can even check if a collection contains a particular value:
List<int> newNumbers = new List<int> { 1, 2, 3, 4, 5, 6, 7, 8, 9 };
bool hasAFive = newNumbers.Contains(5);
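One edge case worth knowing is how these methods treat an empty collection: Any() returns false, but All() returns true, because no element fails the condition. Here's a minimal sketch with made-up variable names, assuming C# 9 or later (top-level statements).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var birthYears = new List<int> { 2005, 1966, 2002 };
var empty = new List<int>();

bool anyBefore1970 = birthYears.Any(y => y < 1970); // true: 1966 matches
bool allAfter1960 = birthYears.All(y => y > 1960);  // true: every year matches
bool anyInEmpty = empty.Any();                      // false: nothing to find
bool allInEmpty = empty.All(y => y == 1997);        // true: no element fails the test

Console.WriteLine(anyBefore1970); // True
Console.WriteLine(allAfter1960);  // True
Console.WriteLine(anyInEmpty);    // False
Console.WriteLine(allInEmpty);    // True
```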
Set Operations
LINQ allows us to perform set operations against two or more sets of objects.
Intersection
An intersection is the group of objects that appear in both of two lists.
var intersectionList1 = new List<int> { 1, 2, 3, 4, 5, 6, 7, 8, 9 };
var intersectionList2 = new List<int> { 2, 4, 6, 8, 10, 12, 14 };
var intersection = intersectionList1.Intersect(intersectionList2);
//{ 2, 4, 6, 8 }
Union
A union is the combined list of unique objects from two separate lists. An element which appears in both lists will only be listed in the union object once.
var unionList1 = new List<int> { 5, 7, 3, 2, 9, 8 };
var unionList2 = new List<int> { 9, 4, 6, 1, 5 };
var union = unionList1.Union(unionList2); //{ 5, 7, 3, 2, 9, 8, 4, 6, 1 }
Except
There is also the LINQ method Except(), which produces the elements that are in the first set, but not in the second set.
var exceptList1 = new List<int> { 1, 2, 3, 4, 5, 6, 7, 8, 9 };
var exceptList2 = new List<int> { 7, 2, 8, 5, 0, 10, 3 };
var except = exceptList1.Except(exceptList2); //{ 1, 4, 6, 9 }
Grouping
Imagine we have the following Book class:
public class Book
{
    public long ID { get; set; }
    public string Title { get; set; }
    public string AuthorName { get; set; }
    public int YearOfPublication { get; set; }
}
Also imagine that we have the following set of Book instances in a collection:
var books = new List<Book>()
{
    new Book()
    {
        ID = 1,
        Title = "Title 1",
        AuthorName = "Author 1",
        YearOfPublication = 2015
    },
    new Book()
    {
        ID = 2,
        Title = "Title 2",
        AuthorName = "Author 2",
        YearOfPublication = 2015
    },
    new Book()
    {
        ID = 3,
        Title = "Title 3",
        AuthorName = "Author 1",
        YearOfPublication = 2017
    },
    new Book()
    {
        ID = 4,
        Title = "Title 4",
        AuthorName = "Author 3",
        YearOfPublication = 1999
    },
    new Book()
    {
        ID = 5,
        Title = "Title 5",
        AuthorName = "Author 4",
        YearOfPublication = 2017
    },
};
One query we might want to run is to list the books in order by publication year, grouped so that all books published in the same year appear together.
We can execute this query using a group by query. A group by query has the following format:
var results = from collectionVar in collectionName
              group collectionVar by collectionVar.PropertyName
              into varGroupName
              orderby varGroupName.Key //orderby is optional
              select new {
                  Key = varGroupName.Key,
                  Objects = varGroupName.ToList()
              };
Using this format, our query to get all books in order by publishing year looks like this:
List<Book> books = SomeOtherClass.GetBooks();
var results = from b in books
              group b by b.YearOfPublication into g
              orderby g.Key
              select new { Year = g.Key, Books = g.ToList() };
We could then use a nested foreach loop to output all the books:
foreach(var result in results)
{
    Console.WriteLine("Books published in " + result.Year.ToString());
    var yearBooks = result.Books;
    foreach(var book in yearBooks)
    {
        Console.WriteLine(book.Title + " by " + book.AuthorName);
    }
}
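With the collection of books shown above, this gives these results:
Books published in 1999
Title 4 by Author 3
Books published in 2015
Title 1 by Author 1
Title 2 by Author 2
Books published in 2017
Title 3 by Author 1
Title 5 by Author 4
You can see this for yourself if you clone and run the sample project.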
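Since most samples in this post use API syntax, here's a sketch of the same grouping written with the GroupBy() method, with a per-year count added. The anonymous objects are a stand-in for the Book class (only the properties used here), and the snippet assumes C# 9 or later (top-level statements).

```csharp
using System;
using System.Linq;

// Hypothetical stand-in data mirroring the Book collection above.
var books = new[]
{
    new { Title = "Title 1", AuthorName = "Author 1", YearOfPublication = 2015 },
    new { Title = "Title 2", AuthorName = "Author 2", YearOfPublication = 2015 },
    new { Title = "Title 3", AuthorName = "Author 1", YearOfPublication = 2017 },
    new { Title = "Title 4", AuthorName = "Author 3", YearOfPublication = 1999 },
    new { Title = "Title 5", AuthorName = "Author 4", YearOfPublication = 2017 }
};

// GroupBy produces one group per distinct key; each group exposes its Key
// and can itself be enumerated, counted, or converted to a list.
var results = books.GroupBy(b => b.YearOfPublication)
                   .OrderBy(g => g.Key)
                   .Select(g => new { Year = g.Key, Count = g.Count(), Books = g.ToList() })
                   .ToList();

foreach (var result in results)
{
    Console.WriteLine(result.Year + ": " + result.Count + " book(s)");
}
// prints:
// 1999: 1 book(s)
// 2015: 2 book(s)
// 2017: 2 book(s)
```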
Glossary
- Query Syntax - LINQ queries which use the from, in, where, and select keywords.
- API Syntax - LINQ queries which use methods, e.g. Where() or First().
- Iterator - An object which iterates over elements in a collection. In LINQ queries, iterators are given names and are used in conditions.
- Conditions - When referring to LINQ queries, boolean values which must be true for an individual element in order for that element to be returned by the query. These are sometimes called predicates.
- Projection - A set of properties from a class that we select as part of a LINQ query. Can also be properties from multiple classes in the select set.
- Set operations - Operations on two or more collections. Can produce the intersection, union, or except group.
New Keywords and Operators
- from - In a LINQ query, specifies a variable name to use for the iterator over a collection.
- in - In a LINQ query, specifies the source collection the query will execute against.
- where - In a LINQ query, specifies one or more conditions that objects in the collection must satisfy in order to be selected.
- select - In a LINQ query, specifies the objects or projections that will be created by the query.
- orderby - In a LINQ query, specifies one or more properties to order the results by.
- descending - In a LINQ query, specifies that the objects are to be ordered by the given property in descending order.
- => - The "goes to" operator. Used to create lambda expressions in LINQ statements.
Summary
LINQ (Language Integrated Query) is a set of technologies that allow us to operate on and select elements from collections. Among the many operations we can perform are queries, ordering, conversion, set operations, existence operations, and grouping. All of these functionalities are available in either query syntax or API syntax; the latter is favored most of the time, but some functionalities are easier in the former.
There are quite a few more advanced things we can do with LINQ. If you want more samples, check out the 101 LINQ Samples group in this repository.
You might have noticed the <T> syntax; it appears a lot in this post. This is representative of a generic, and we will discuss generics in the next post of this series.
You might also have noticed the => operator; this is the lambda operator and we read this aloud as "goes to". It is representative of both lambdas and expressions, which we will discuss in a later post.
Got questions about LINQ? I wanna hear them! Ask away in the comments below. And yes, I do know that's Zelda, not Link, in the page photo. :)
Happy Coding!