Entity Framework Core Outer Join
Table of Contents
Introduction
Entity Framework Core (EF Core) + LINQ to Entities can express outer joins.
- Orders table
OrderID | CustomerID | OrderDate |
---|---|---|
1 | 101 | 2025-10-01 |
2 | 101 | 2025-10-05 |
3 | 102 | 2025-10-03 |
4 | 102 | 2025-10-05 |
- Discounts table
CustomerID | StartDate | DiscountRate |
---|---|---|
101 | 2025-10-01 | 10 |
101 | 2025-10-04 | 15 |
102 | 2025-10-04 | 5 |
102 | 2025-10-07 | 20 |
We will look at how to perform an outer join using these two tables as an example.
Basics of Outer Join
LINQ to Entities provides two syntaxes for writing queries: Query Syntax
and Method Syntax
.
Query Syntax
Before discussing outer joins, let's first look at inner joins.
// Inner Join
var query = from o in context.Orders
join d in context.Discounts
on o.CustomerID equals d.CustomerID
select new
{
o.OrderID,
o.CustomerID,
o.OrderDate,
d.StartDate,
d.DiscountRate
};
Outer joins are expressed as follows.
// Outer Join
var query = from o in context.Orders
join d in context.Discounts
on o.CustomerID equals d.CustomerID into discountGroup
from d in discountGroup.DefaultIfEmpty()
select new
{
o.OrderID,
o.CustomerID,
o.OrderDate,
d.StartDate,
d.DiscountRate
};
For outer joins, we add into discountGroup
and from d in discountGroup.DefaultIfEmpty()
.
into discountGroup
- Groups Discounts where CustomerID matches for each record
o
in Orders. discountGroup
is of typeIEnumerable<Discount>
, representing a 1:N relationship between Orders and Discounts.
- Groups Discounts where CustomerID matches for each record
from d in discountGroup.DefaultIfEmpty()
- If
discountGroup
is empty,DefaultIfEmpty()
assigns null tod
. This ensures that Orders without corresponding Discounts are also included in the result (i.e., outer join). - If
DefaultIfEmpty()
is not used and onlyfrom d in discountGroup
is written, Orders without Discounts will not be included in the result, making it an inner join.
- If
This concludes the explanation of the syntax.
This query
is converted into SQL by each Entity Framework database provider and then executed.
You can check what SQL is converted by query.ToQueryString()
.
For SQL Server, the following SQL was output.
SELECT [o].[OrderID], [o].[CustomerID], [o].[OrderDate], [d].[StartDate], [d].[DiscountRate]
FROM [Orders] AS [o]
LEFT JOIN [Discounts] AS [d] ON [o].[CustomerID] = [d].[CustomerID]
For PostgreSQL, the following SQL was generated.
SELECT o."OrderID", o."CustomerID", o."OrderDate", d."StartDate", d."DiscountRate"
FROM "Orders" AS o
LEFT JOIN "Discounts" AS d ON o."CustomerID" = d."CustomerID"
For Oracle.
SELECT "o"."OrderID", "o"."CustomerID", "o"."OrderDate", "d"."StartDate", "d"."DiscountRate"
FROM "Orders" "o"
LEFT JOIN "Discounts" "d" ON "o"."CustomerID" = "d"."CustomerID"
Method Syntax (GroupJoin + SelectMany)
Let's first look at inner joins in method syntax.
// Inner Join
var query = context.Orders
.Join(context.Discounts,
o => o.CustomerID,
d => d.CustomerID,
(o, d) => new
{
o.OrderID,
o.CustomerID,
o.OrderDate,
d.StartDate,
d.DiscountRate
});
For outer joins, it changes like this.
// Outer Join
var query = context.Orders
.GroupJoin(context.Discounts,
o => o.CustomerID,
d => d.CustomerID,
(o, discounts) => new { o, discounts })
.SelectMany(
od => od.discounts .DefaultIfEmpty(),
(od, d) => new
{
od.o.OrderID,
od.o.CustomerID,
od.o.OrderDate,
d.StartDate,
d.DiscountRate
});
GroupJoin
- Groups Discounts where CustomerID matches for each record
o
in Orders. discounts
is of typeIEnumerable<Discount>
, representing a 1:N relationship between Orders and Discounts.- The return value is an anonymous type
{ o, discounts }
.
- Groups Discounts where CustomerID matches for each record
SelectMany
SelectMany
serves to flatten the collection.DefaultIfEmpty()
assigns null ifdiscounts
is empty (i.e., outer join).- If
DefaultIfEmpty()
is not used, it becomes an inner join. - Generates a new anonymous type using
od
(which contains the original Order) and each elementd
from the flatteneddiscounts
.
This concludes the explanation of the syntax.
Let's also check what kind of SQL is generated.
-- SQL Server
SELECT [o].[OrderID], [o].[CustomerID], [o].[OrderDate], [d].[StartDate], [d].[DiscountRate]
FROM [Orders] AS [o]
LEFT JOIN [Discounts] AS [d] ON [o].[CustomerID] = [d].[CustomerID]
-- PostgreSQL
SELECT o."OrderID", o."CustomerID", o."OrderDate", d."StartDate", d."DiscountRate"
FROM "Orders" AS o
LEFT JOIN "Discounts" AS d ON o."CustomerID" = d."CustomerID"
-- Oracle
SELECT "o"."OrderID", "o"."CustomerID", "o"."OrderDate", "d"."StartDate", "d"."DiscountRate"
FROM "Orders" "o"
LEFT JOIN "Discounts" "d" ON "o"."CustomerID" = "d"."CustomerID
The exact same SQL as in the query syntax case is generated.
Executing Queries and Retrieving Results
While this note focuses on "outer joins" and is not directly related to this topic, the query is not executed when query
is created. The query is executed when its results are needed.
This is known as "deferred execution" in LINQ to Entities.
It might be unclear what "when its results are needed" means, but specifically, it occurs when iterating through query
with foreach
, calling query.ToList()
, query.FirstOrDefault()
, etc.
The method for executing the query and retrieving results is the same for both Query Syntax
and Method Syntax
.
To output all results to the console, you would write it as follows.
foreach (var result in query)
{
Console.WriteLine(result);
}
Execution Result
OrderID | CustomerID | OrderDate | StartDate | DiscountRate |
---|---|---|---|---|
1 | 101 | 2025-10-01 | 2025-10-01 | 10 |
1 | 101 | 2025-10-01 | 2025-10-04 | 15 |
2 | 101 | 2025-10-05 | 2025-10-01 | 10 |
2 | 101 | 2025-10-05 | 2025-10-04 | 15 |
3 | 102 | 2025-10-03 | 2025-10-04 | 5 |
3 | 102 | 2025-10-03 | 2025-10-07 | 20 |
4 | 102 | 2025-10-05 | 2025-10-04 | 5 |
4 | 102 | 2025-10-05 | 2025-10-07 | 20 |
Joining with Inequality Conditions
Sometimes, you need to include inequality conditions in the join clause.
For example, you might want to join only "discounts applicable on the order date" with OrderDate >= StartDate
.
Query Syntax
var query = from o in context.Orders
join d in context.Discounts
on o.CustomerID equals d.CustomerID into discountGroup
from d in discountGroup
.Where(d => o.OrderDate >= d.StartDate)
.DefaultIfEmpty()
select new
{
o.OrderID,
o.CustomerID,
o.OrderDate,
d.StartDate,
d.DiscountRate,
};
Let's execute it.
{ OrderID = 1, CustomerID = 101, OrderDate = 2025/10/01 0:00:00, StartDate = 2025/10/01 0:00:00, DiscountRate = 10 }
{ OrderID = 2, CustomerID = 101, OrderDate = 2025/10/05 0:00:00, StartDate = 2025/10/01 0:00:00, DiscountRate = 10 }
{ OrderID = 2, CustomerID = 101, OrderDate = 2025/10/05 0:00:00, StartDate = 2025/10/04 0:00:00, DiscountRate = 15 }
Nullable object must have a value.
An error occurred on the 4th line.
This is because Discounts
is null on the 4th line, but the type of StartDate
in the result set is DateTime
(non-nullable).
We can rewrite it as follows to inform the compiler of the appropriate type.
var query = from o in context.Orders
join d in context.Discounts
on o.CustomerID equals d.CustomerID into discountGroup
from d in discountGroup
.Where(d => o.OrderDate >= d.StartDate)
.DefaultIfEmpty()
select new
{
o.OrderID,
o.CustomerID,
o.OrderDate,
StartDate = (DateTime?)d.StartDate,
DiscountRate = (int?)d.DiscountRate,
};
The expected execution result was obtained as follows.
-- SQL Server
SELECT [o].[OrderID], [o].[CustomerID], [o].[OrderDate], [d].[StartDate], [d].[DiscountRate]
FROM [Orders] AS [o]
LEFT JOIN [Discounts] AS [d] ON [o].[CustomerID] = [d].[CustomerID] AND [o].[OrderDate] >= [d].[StartDate]
-- PostgreSQL
SELECT o."OrderID", o."CustomerID", o."OrderDate", d."StartDate", d."DiscountRate"
FROM "Orders" AS o
LEFT JOIN "Discounts" AS d ON o."CustomerID" = d."CustomerID" AND o."OrderDate" >= d."StartDate"
-- Oracle
SELECT "o"."OrderID", "o"."CustomerID", "o"."OrderDate", "d"."StartDate", "d"."DiscountRate"
FROM "Orders" "o"
LEFT JOIN "Discounts" "d" ON (("o"."CustomerID" = "d"."CustomerID") AND ("o"."OrderDate" >= "d"."StartDate"))
OrderID | CustomerID | OrderDate | StartDate | DiscountRate |
---|---|---|---|---|
1 | 101 | 2025-10-01 | 2025-10-01 | 10 |
2 | 101 | 2025-10-05 | 2025-10-01 | 10 |
2 | 101 | 2025-10-05 | 2025-10-04 | 15 |
3 | 102 | 2025-10-03 | ||
4 | 102 | 2025-10-05 | 2025-10-04 | 5 |
Although the expected result was obtained, the query syntax includes a method syntax call Where(d => o.OrderDate >= d.StartDate)
. While such a mix of query and method syntax is not necessarily bad, it shows the limits of what can be achieved with query syntax alone.
Method Syntax (GroupJoin + SelectMany)
var query = context.Orders
.GroupJoin(context.Discounts,
o => o.CustomerID,
d => d.CustomerID,
(o, d) => new { o, d }
)
.SelectMany(x => x.d
.Where(d => x.o.OrderDate >= d.StartDate)
.DefaultIfEmpty(),
(x, d) => new
{
x.o.OrderID,
x.o.CustomerID,
x.o.OrderDate,
StartDate = (DateTime?)d.StartDate,
DiscountRate = (int?)d.DiscountRate,
}
);
The result is as follows.
-- SQL Server
SELECT [o].[OrderID], [o].[CustomerID], [o].[OrderDate], [d].[StartDate], [d].[DiscountRate]
FROM [Orders] AS [o]
LEFT JOIN [Discounts] AS [d] ON [o].[CustomerID] = [d].[CustomerID] AND [o].[OrderDate] >= [d].[StartDate]
-- PostgreSQL
SELECT o."OrderID", o."CustomerID", o."OrderDate", d."StartDate", d."DiscountRate"
FROM "Orders" AS o
LEFT JOIN "Discounts" AS d ON o."CustomerID" = d."CustomerID" AND o."OrderDate" >= d."StartDate"
-- Oracle
SELECT "o"."OrderID", "o"."CustomerID", "o"."OrderDate", "d"."StartDate", "d"."DiscountRate"
FROM "Orders" "o"
LEFT JOIN "Discounts" "d" ON (("o"."CustomerID" = "d"."CustomerID") AND ("o"."OrderDate" >= "d"."StartDate"))
OrderID | CustomerID | OrderDate | StartDate | DiscountRate |
---|---|---|---|---|
1 | 101 | 2025-10-01 | 2025-10-01 | 10 |
2 | 101 | 2025-10-05 | 2025-10-01 | 10 |
2 | 101 | 2025-10-05 | 2025-10-04 | 15 |
3 | 102 | 2025-10-03 | ||
4 | 102 | 2025-10-05 | 2025-10-04 | 5 |
Method Syntax (SelectMany + Where)
Outer joins can also be written as follows without using GroupJoin.
var query = context.Orders
.SelectMany(o => context.Discounts
.Where(d => o.CustomerID == d.CustomerID && o.OrderDate >= d.StartDate)
.DefaultIfEmpty(),
(o, d) => new
{
o.OrderID,
o.CustomerID,
o.OrderDate,
StartDate = (DateTime?)d.StartDate,
DiscountRate = (int?)d.DiscountRate,
}
);
Only SelectMany is used, and join conditions are grouped in Where.
The generated SQL will be exactly the same as the one using GroupJoin.
-- SQL Server
SELECT [o].[OrderID], [o].[CustomerID], [o].[OrderDate], [d].[StartDate], [d].[DiscountRate]
FROM [Orders] AS [o]
LEFT JOIN [Discounts] AS [d] ON [o].[CustomerID] = [d].[CustomerID] AND [o].[OrderDate] >= [d].[StartDate]
-- PostgreSQL
SELECT o."OrderID", o."CustomerID", o."OrderDate", d."StartDate", d."DiscountRate"
FROM "Orders" AS o
LEFT JOIN "Discounts" AS d ON o."CustomerID" = d."CustomerID" AND o."OrderDate" >= d."StartDate"
-- Oracle
SELECT "o"."OrderID", "o"."CustomerID", "o"."OrderDate", "d"."StartDate", "d"."DiscountRate"
FROM "Orders" "o"
LEFT JOIN "Discounts" "d" ON (("o"."CustomerID" = "d"."CustomerID") AND ("o"."OrderDate" >= "d"."StartDate"))
OrderID | CustomerID | OrderDate | StartDate | DiscountRate |
---|---|---|---|---|
1 | 101 | 2025-10-01 | 2025-10-01 | 10 |
2 | 101 | 2025-10-05 | 2025-10-01 | 10 |
2 | 101 | 2025-10-05 | 2025-10-04 | 15 |
3 | 102 | 2025-10-03 | ||
4 | 102 | 2025-10-05 | 2025-10-04 | 5 |
This form is more compact than GroupJoin, so it is probably preferable.
Summary
We introduced the basic syntax for outer joins and the technique to perform outer joins using only SelectMany without GroupJoin.
While official documentation often describes outer joins using GroupJoin + SelectMany, it is possible to write outer joins more simply with just SelectMany.
However, it's important to note that the generated and executed SQL may vary depending on the database provider and its version. We covered SQL Server, PostgreSQL, and Oracle here, but other databases may generate unintended SQL or throw exceptions at runtime. Always verify what SQL is being converted.
The versions of the database providers used are as follows:
- Microsoft.EntityFrameworkCore.SqlServer 9.0.9
- Npgsql.EntityFrameworkCore.PostgreSQL 9.0.4
- Oracle.EntityFrameworkCore 9.23.90