Entity Framework Core Outer Join

Programming
Published on October 7, 2025 Last updated on October 13, 2025

Introduction

Entity Framework Core (EF Core) + LINQ to Entities can express outer joins.

  • Orders table
OrderID CustomerID OrderDate
1 101 2025-10-01
2 101 2025-10-05
3 102 2025-10-03
4 102 2025-10-05

  • Discounts table
CustomerID StartDate DiscountRate
101 2025-10-01 10
101 2025-10-04 15
102 2025-10-04 5
102 2025-10-07 20

We will look at how to perform an outer join using these two tables as an example.

Basics of Outer Join

LINQ to Entities provides two syntaxes for writing queries: Query Syntax and Method Syntax.

Query Syntax

Before discussing outer joins, let's first look at inner joins.

// Inner Join
var query = from o in context.Orders
            join d in context.Discounts
            on o.CustomerID equals d.CustomerID
            select new
            {
                o.OrderID,
                o.CustomerID,
                o.OrderDate,
                d.StartDate,
                d.DiscountRate
            };

Outer joins are expressed as follows.

// Outer Join
var query = from o in context.Orders
            join d in context.Discounts
            on o.CustomerID equals d.CustomerID into discountGroup
            from d in discountGroup.DefaultIfEmpty()
            select new
            {
                o.OrderID,
                o.CustomerID,
                o.OrderDate,
                d.StartDate,
                d.DiscountRate
            };

For outer joins, we add into discountGroup and from d in discountGroup.DefaultIfEmpty().

  • into discountGroup

    • Groups Discounts where CustomerID matches for each record o in Orders.
    • discountGroup is of type IEnumerable<Discount>, representing a 1:N relationship between Orders and Discounts.
  • from d in discountGroup.DefaultIfEmpty()

    • If discountGroup is empty, DefaultIfEmpty() assigns null to d. This ensures that Orders without corresponding Discounts are also included in the result (i.e., outer join).
    • If DefaultIfEmpty() is not used and only from d in discountGroup is written, Orders without Discounts will not be included in the result, making it an inner join.

This concludes the explanation of the syntax.

This query is converted into SQL by each Entity Framework database provider and then executed.
You can check what SQL is converted by query.ToQueryString().
For SQL Server, the following SQL was output.

SELECT [o].[OrderID], [o].[CustomerID], [o].[OrderDate], [d].[StartDate], [d].[DiscountRate]
FROM [Orders] AS [o]
LEFT JOIN [Discounts] AS [d] ON [o].[CustomerID] = [d].[CustomerID]

For PostgreSQL, the following SQL was generated.

SELECT o."OrderID", o."CustomerID", o."OrderDate", d."StartDate", d."DiscountRate"
FROM "Orders" AS o
LEFT JOIN "Discounts" AS d ON o."CustomerID" = d."CustomerID"

For Oracle.

SELECT "o"."OrderID", "o"."CustomerID", "o"."OrderDate", "d"."StartDate", "d"."DiscountRate"
FROM "Orders" "o"
LEFT JOIN "Discounts" "d" ON "o"."CustomerID" = "d"."CustomerID"

Method Syntax (GroupJoin + SelectMany)

Let's first look at inner joins in method syntax.

// Inner Join
var query = context.Orders
    .Join(context.Discounts,
        o => o.CustomerID,
        d => d.CustomerID,
        (o, d) => new
        {
            o.OrderID,
            o.CustomerID,
            o.OrderDate,
            d.StartDate,
            d.DiscountRate
        });

For outer joins, it changes like this.

// Outer Join
var query = context.Orders
    .GroupJoin(context.Discounts,
        o => o.CustomerID,
        d => d.CustomerID,
        (o, discounts) => new { o, discounts  })
    .SelectMany(
        od => od.discounts .DefaultIfEmpty(),
        (od, d) => new
        {
            od.o.OrderID,
            od.o.CustomerID,
            od.o.OrderDate,
            d.StartDate,
            d.DiscountRate
        });
  • GroupJoin

    • Groups Discounts where CustomerID matches for each record o in Orders.
    • discounts is of type IEnumerable<Discount>, representing a 1:N relationship between Orders and Discounts.
    • The return value is an anonymous type { o, discounts }.
  • SelectMany

    • SelectMany serves to flatten the collection.
    • DefaultIfEmpty() assigns null if discounts is empty (i.e., outer join).
    • If DefaultIfEmpty() is not used, it becomes an inner join.
    • Generates a new anonymous type using od (which contains the original Order) and each element d from the flattened discounts.

This concludes the explanation of the syntax.

Let's also check what kind of SQL is generated.

-- SQL Server
SELECT [o].[OrderID], [o].[CustomerID], [o].[OrderDate], [d].[StartDate], [d].[DiscountRate]
FROM [Orders] AS [o]
LEFT JOIN [Discounts] AS [d] ON [o].[CustomerID] = [d].[CustomerID]
-- PostgreSQL
SELECT o."OrderID", o."CustomerID", o."OrderDate", d."StartDate", d."DiscountRate"
FROM "Orders" AS o
LEFT JOIN "Discounts" AS d ON o."CustomerID" = d."CustomerID"
-- Oracle
SELECT "o"."OrderID", "o"."CustomerID", "o"."OrderDate", "d"."StartDate", "d"."DiscountRate"
FROM "Orders" "o"
LEFT JOIN "Discounts" "d" ON "o"."CustomerID" = "d"."CustomerID

The exact same SQL as in the query syntax case is generated.

Executing Queries and Retrieving Results

While this note focuses on "outer joins" and is not directly related to this topic, the query is not executed when query is created. The query is executed when its results are needed. This is known as "deferred execution" in LINQ to Entities.

It might be unclear what "when its results are needed" means, but specifically, it occurs when iterating through query with foreach, calling query.ToList(), query.FirstOrDefault(), etc.

The method for executing the query and retrieving results is the same for both Query Syntax and Method Syntax.
To output all results to the console, you would write it as follows.

foreach (var result in query)
{
    Console.WriteLine(result);
}

Execution Result

OrderID CustomerID OrderDate StartDate DiscountRate
1 101 2025-10-01 2025-10-01 10
1 101 2025-10-01 2025-10-04 15
2 101 2025-10-05 2025-10-01 10
2 101 2025-10-05 2025-10-04 15
3 102 2025-10-03 2025-10-04 5
3 102 2025-10-03 2025-10-07 20
4 102 2025-10-05 2025-10-04 5
4 102 2025-10-05 2025-10-07 20

Joining with Inequality Conditions

Sometimes, you need to include inequality conditions in the join clause.
For example, you might want to join only "discounts applicable on the order date" with OrderDate >= StartDate.

Query Syntax

var query = from o in context.Orders
            join d in context.Discounts
            on o.CustomerID equals d.CustomerID into discountGroup
            from d in discountGroup
                .Where(d => o.OrderDate >= d.StartDate)
                .DefaultIfEmpty()
            select new
            {
                o.OrderID,
                o.CustomerID,
                o.OrderDate,
                d.StartDate,
                d.DiscountRate,
            };

Let's execute it.

{ OrderID = 1, CustomerID = 101, OrderDate = 2025/10/01 0:00:00, StartDate = 2025/10/01 0:00:00, DiscountRate = 10 }
{ OrderID = 2, CustomerID = 101, OrderDate = 2025/10/05 0:00:00, StartDate = 2025/10/01 0:00:00, DiscountRate = 10 }
{ OrderID = 2, CustomerID = 101, OrderDate = 2025/10/05 0:00:00, StartDate = 2025/10/04 0:00:00, DiscountRate = 15 }
Nullable object must have a value.

An error occurred on the 4th line. This is because Discounts is null on the 4th line, but the type of StartDate in the result set is DateTime (non-nullable).

We can rewrite it as follows to inform the compiler of the appropriate type.

var query = from o in context.Orders
            join d in context.Discounts
            on o.CustomerID equals d.CustomerID into discountGroup
            from d in discountGroup
                .Where(d => o.OrderDate >= d.StartDate)
                .DefaultIfEmpty()
            select new
            {
                o.OrderID,
                o.CustomerID,
                o.OrderDate,
                StartDate = (DateTime?)d.StartDate,
                DiscountRate = (int?)d.DiscountRate,
            };

The expected execution result was obtained as follows.

-- SQL Server
SELECT [o].[OrderID], [o].[CustomerID], [o].[OrderDate], [d].[StartDate], [d].[DiscountRate]
FROM [Orders] AS [o]
LEFT JOIN [Discounts] AS [d] ON [o].[CustomerID] = [d].[CustomerID] AND [o].[OrderDate] >= [d].[StartDate]
-- PostgreSQL
SELECT o."OrderID", o."CustomerID", o."OrderDate", d."StartDate", d."DiscountRate"
FROM "Orders" AS o
LEFT JOIN "Discounts" AS d ON o."CustomerID" = d."CustomerID" AND o."OrderDate" >= d."StartDate"
-- Oracle
SELECT "o"."OrderID", "o"."CustomerID", "o"."OrderDate", "d"."StartDate", "d"."DiscountRate"
FROM "Orders" "o"
LEFT JOIN "Discounts" "d" ON (("o"."CustomerID" = "d"."CustomerID") AND ("o"."OrderDate" >= "d"."StartDate"))
OrderID CustomerID OrderDate StartDate DiscountRate
1 101 2025-10-01 2025-10-01 10
2 101 2025-10-05 2025-10-01 10
2 101 2025-10-05 2025-10-04 15
3 102 2025-10-03
4 102 2025-10-05 2025-10-04 5

Although the expected result was obtained, the query syntax includes a method syntax call Where(d => o.OrderDate >= d.StartDate). While such a mix of query and method syntax is not necessarily bad, it shows the limits of what can be achieved with query syntax alone.

Method Syntax (GroupJoin + SelectMany)

var query = context.Orders
    .GroupJoin(context.Discounts,
        o => o.CustomerID,
        d => d.CustomerID,
        (o, d) => new { o, d }
    )
    .SelectMany(x => x.d
        .Where(d => x.o.OrderDate >= d.StartDate)
        .DefaultIfEmpty(),
        (x, d) => new
        {
            x.o.OrderID,
            x.o.CustomerID,
            x.o.OrderDate,
            StartDate = (DateTime?)d.StartDate,
            DiscountRate = (int?)d.DiscountRate,
        }
    );

The result is as follows.

-- SQL Server
SELECT [o].[OrderID], [o].[CustomerID], [o].[OrderDate], [d].[StartDate], [d].[DiscountRate]
FROM [Orders] AS [o]
LEFT JOIN [Discounts] AS [d] ON [o].[CustomerID] = [d].[CustomerID] AND [o].[OrderDate] >= [d].[StartDate]
-- PostgreSQL
SELECT o."OrderID", o."CustomerID", o."OrderDate", d."StartDate", d."DiscountRate"
FROM "Orders" AS o
LEFT JOIN "Discounts" AS d ON o."CustomerID" = d."CustomerID" AND o."OrderDate" >= d."StartDate"
-- Oracle
SELECT "o"."OrderID", "o"."CustomerID", "o"."OrderDate", "d"."StartDate", "d"."DiscountRate"
FROM "Orders" "o"
LEFT JOIN "Discounts" "d" ON (("o"."CustomerID" = "d"."CustomerID") AND ("o"."OrderDate" >= "d"."StartDate"))
OrderID CustomerID OrderDate StartDate DiscountRate
1 101 2025-10-01 2025-10-01 10
2 101 2025-10-05 2025-10-01 10
2 101 2025-10-05 2025-10-04 15
3 102 2025-10-03
4 102 2025-10-05 2025-10-04 5

Method Syntax (SelectMany + Where)

Outer joins can also be written as follows without using GroupJoin.

var query = context.Orders
    .SelectMany(o => context.Discounts
        .Where(d => o.CustomerID == d.CustomerID && o.OrderDate >= d.StartDate)
        .DefaultIfEmpty(),
        (o, d) => new
        {
            o.OrderID,
            o.CustomerID,
            o.OrderDate,
            StartDate = (DateTime?)d.StartDate,
            DiscountRate = (int?)d.DiscountRate,
        }
    );

Only SelectMany is used, and join conditions are grouped in Where.
The generated SQL will be exactly the same as the one using GroupJoin.

-- SQL Server
SELECT [o].[OrderID], [o].[CustomerID], [o].[OrderDate], [d].[StartDate], [d].[DiscountRate]
FROM [Orders] AS [o]
LEFT JOIN [Discounts] AS [d] ON [o].[CustomerID] = [d].[CustomerID] AND [o].[OrderDate] >= [d].[StartDate]
-- PostgreSQL
SELECT o."OrderID", o."CustomerID", o."OrderDate", d."StartDate", d."DiscountRate"
FROM "Orders" AS o
LEFT JOIN "Discounts" AS d ON o."CustomerID" = d."CustomerID" AND o."OrderDate" >= d."StartDate"
-- Oracle
SELECT "o"."OrderID", "o"."CustomerID", "o"."OrderDate", "d"."StartDate", "d"."DiscountRate"
FROM "Orders" "o"
LEFT JOIN "Discounts" "d" ON (("o"."CustomerID" = "d"."CustomerID") AND ("o"."OrderDate" >= "d"."StartDate"))
OrderID CustomerID OrderDate StartDate DiscountRate
1 101 2025-10-01 2025-10-01 10
2 101 2025-10-05 2025-10-01 10
2 101 2025-10-05 2025-10-04 15
3 102 2025-10-03
4 102 2025-10-05 2025-10-04 5

This form is more compact than GroupJoin, so it is probably preferable.

Summary

We introduced the basic syntax for outer joins and the technique to perform outer joins using only SelectMany without GroupJoin.
While official documentation often describes outer joins using GroupJoin + SelectMany, it is possible to write outer joins more simply with just SelectMany.
However, it's important to note that the generated and executed SQL may vary depending on the database provider and its version. We covered SQL Server, PostgreSQL, and Oracle here, but other databases may generate unintended SQL or throw exceptions at runtime. Always verify what SQL is being converted.

The versions of the database providers used are as follows:

  • Microsoft.EntityFrameworkCore.SqlServer 9.0.9
  • Npgsql.EntityFrameworkCore.PostgreSQL 9.0.4
  • Oracle.EntityFrameworkCore 9.23.90