Using selectinload vs joinedload for N+1 Prevention in SQLAlchemy 2.0

Q: Why does joinedload on a collection require .unique() but selectinload does not?

joinedload rewrites the SQL to include a JOIN, so the database returns one row per (parent, child) pair. For a User with three Order rows, the result cursor contains three rows all with the same user columns. SQLAlchemy 2.0 requires you to call .unique() to signal that deduplication should happen in Python. selectinload fetches parent and child in separate queries, so the parent cursor always has exactly one row per parent — no deduplication needed.

Q: How deep can I chain selectinload ?

There is no hard limit. Each level adds one batch SELECT to the total query count. A three-level chain ( User → orders → products → categories ) costs four queries total. That is almost always preferable to the exponential row counts a deeply nested joinedload chain would produce. Monitor the IN clause size at each level — if any level loads tens of thousands of IDs, paginate the parent query first.

Q: Does selectinload work correctly with LIMIT and OFFSET on the parent query?

Yes. SQLAlchemy runs the parent query with your LIMIT / OFFSET first, collects the primary keys from that result, then issues the secondary IN query scoped to exactly those keys. The child load is always consistent with the paginated parent set, with no risk of the windowing problem that a JOIN-based approach would suffer.

Use selectinload for collections and joinedload for single-object relationships — both prevent N+1 queries in Complex Joins and Relationship Loading Strategies without blocking the async event loop.

Quick Answer

Before — implicit lazy loading that breaks in async:

# Python 3.11+
import asyncio
from sqlalchemy import select
from sqlalchemy.ext.asyncio import AsyncSession, create_async_engine, async_sessionmaker
from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column, relationship

engine = create_async_engine("postgresql+asyncpg://user:pass@localhost/db")
AsyncSessionLocal = async_sessionmaker(engine, expire_on_commit=False)


class Base(DeclarativeBase):
    pass


class User(Base):
    __tablename__ = "users"
    id: Mapped[int] = mapped_column(primary_key=True)
    email: Mapped[str]
    orders: Mapped[list["Order"]] = relationship(back_populates="user")


class Order(Base):
    __tablename__ = "orders"
    id: Mapped[int] = mapped_column(primary_key=True)
    user_id: Mapped[int] = mapped_column()
    total: Mapped[float]
    user: Mapped["User"] = relationship(back_populates="orders")


async def bad_get_users_with_orders() -> None:
    async with AsyncSessionLocal() as session:
        result = await session.execute(select(User))
        users = result.scalars().all()
        for user in users:
            # MissingGreenlet or LazyLoadError raised here —
            # async sessions cannot issue a synchronous SELECT
            print(user.orders)

After — explicit eager loading, fully async-safe:

# Python 3.11+
import asyncio
from sqlalchemy import select
from sqlalchemy.ext.asyncio import AsyncSession, create_async_engine, async_sessionmaker
from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column, relationship, selectinload, joinedload

engine = create_async_engine("postgresql+asyncpg://user:pass@localhost/db")
AsyncSessionLocal = async_sessionmaker(engine, expire_on_commit=False)


class Base(DeclarativeBase):
    pass


class User(Base):
    __tablename__ = "users"
    id: Mapped[int] = mapped_column(primary_key=True)
    email: Mapped[str]
    orders: Mapped[list["Order"]] = relationship(back_populates="user")
    invoice: Mapped["Invoice | None"] = relationship(back_populates="user", uselist=False)


class Order(Base):
    __tablename__ = "orders"
    id: Mapped[int] = mapped_column(primary_key=True)
    user_id: Mapped[int] = mapped_column()
    total: Mapped[float]
    user: Mapped["User"] = relationship(back_populates="orders")


class Invoice(Base):
    __tablename__ = "invoices"
    id: Mapped[int] = mapped_column(primary_key=True)
    user_id: Mapped[int] = mapped_column()
    amount: Mapped[float]
    user: Mapped["User"] = relationship(back_populates="invoice")


async def get_users_with_orders(session: AsyncSession) -> list[User]:
    # selectinload: issues one extra SELECT … WHERE user_id IN (…)
    # safe for one-to-many collections, no row multiplication
    stmt = select(User).options(selectinload(User.orders))
    result = await session.execute(stmt)
    return result.scalars().all()


async def get_users_with_invoice(session: AsyncSession) -> list[User]:
    # joinedload: LEFT OUTER JOIN, single round-trip
    # safe for many-to-one / one-to-one (uselist=False)
    # .unique() collapses any duplicated parent rows
    stmt = select(User).options(joinedload(User.invoice))
    result = await session.execute(stmt)
    return result.scalars().unique().all()


async def main() -> None:
    async with AsyncSessionLocal() as session:
        users = await get_users_with_orders(session)
        for u in users:
            print(u.email, [o.total for o in u.orders])

        users_with_invoices = await get_users_with_invoice(session)
        for u in users_with_invoices:
            print(u.email, u.invoice)

asyncio.run(main())

Execution Context & Async Workflow Integration

SQLAlchemy's async session layer wraps every database interaction in the asyncio event loop via asyncpg. Lazy loading — the SQLAlchemy 1.x default — works by issuing a fresh SELECT the moment you access an unloaded attribute. In a synchronous context that is transparent. In an async context it requires the ORM to start a new coroutine mid-attribute-access, which is impossible without a running greenlet. The result is an immediate MissingGreenlet error.

Eager loaders solve this by scheduling all necessary queries before you touch any attribute:

selectinload fires a second await session.execute(select(Order).where(Order.user_id.in_([1, 2, 3]))) call after the parent query returns. asyncpg can pipeline and cache the prepared statement for the IN query across requests, making repeat calls cheap. Because no JOIN is involved, the parent result set is never inflated — 100 users with 50 orders each returns exactly 100 parent rows plus one batch query of up to 5 000 child rows.
joinedload rewrites the parent query to include a LEFT OUTER JOIN. The database returns one row per (parent, child) combination. For a many-to-one or one-to-one relationship that means one row per parent — no inflation, single round-trip. For a one-to-many collection every child produces a duplicate parent row; SQLAlchemy deduplicates in Python, but you must call .unique() to signal that the result cursor has duplicates. Without .unique(), SQLAlchemy 2.0 raises a warning and may return duplicate ORM objects.

Choosing between them:

Relationship cardinality	Recommended loader	Why
`many-to-one` / `one-to-one`	`joinedload`	No row multiplication, single round-trip
`one-to-many` collection	`selectinload`	No row inflation, bounded `IN` clause
`many-to-many` collection	`selectinload`	Association table makes JOIN result set explode
Deep graph (3+ levels)	Chained `selectinload`	Each level adds one focused batch query

Async variant with dependency injection:

# Python 3.11+
from collections.abc import AsyncGenerator
from sqlalchemy import select
from sqlalchemy.ext.asyncio import AsyncSession, create_async_engine, async_sessionmaker
from sqlalchemy.orm import selectinload, joinedload

DATABASE_URL = "postgresql+asyncpg://user:pass@localhost/db"
engine = create_async_engine(DATABASE_URL, pool_size=10, max_overflow=20)
AsyncSessionLocal = async_sessionmaker(engine, expire_on_commit=False)


async def get_session() -> AsyncGenerator[AsyncSession, None]:
    async with AsyncSessionLocal() as session:
        yield session


async def load_orders_for_tenant(tenant_id: int, session: AsyncSession) -> list:
    """Load all users for a tenant with their orders and each order's products."""
    from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column, relationship

    # Chained selectinload for a three-level graph:
    # User → orders → products
    stmt = (
        select(User)
        .where(User.tenant_id == tenant_id)
        .options(
            selectinload(User.orders).selectinload(Order.products)
        )
    )
    result = await session.execute(stmt)
    return result.scalars().all()

Each .selectinload() in the chain translates to one additional batch SELECT, so a three-level graph costs three total queries regardless of row count — far better than the N+1 (or N×M+1) that lazy loading would produce.

Resolving Warnings, Errors & Common Mistakes

Warning/Error	Root Cause	Production Fix
`sqlalchemy.exc.MissingGreenlet: greenlet_spawn has not been called`	Relationship accessed on an async session without an active coroutine context — triggered by implicit lazy load	Add `selectinload` or `joinedload` to the originating query via `.options()`; never access unloaded relationships outside `await` scope
`sqlalchemy.exc.InvalidRequestError: 'User.orders' is not available due to lazy='raise'` (LazyLoadError)	Relationship declared with `lazy="raise"` or `lazy="raise_on_sql"` and accessed without explicit loading	Intentional safety net — add the appropriate loader to the query; do not relax `lazy="raise"` in production
`sqlalchemy.orm.exc.DetachedInstanceError: Instance <User> is not bound to a Session`	ORM object used after its session closed or was garbage-collected	Call `result.scalars().all()` inside the `async with AsyncSession()` block; use `expire_on_commit=False` on the session factory so objects survive `commit()`
`SAWarning: SELECT statement has a cartesian product`	`joinedload` used alongside an explicit `.join()` on the same table without a proper ON clause, or multiple `joinedload` on sibling collections	Switch sibling collection loads to `selectinload`; use `contains_eager` (see below) when you own the JOIN; check for missing foreign-key constraints
`SAWarning: Duplicate column … result set contains X rows` (implicit N+1 duplicate warning)	`joinedload` on a one-to-many without `.unique()` — SQLAlchemy detects inflated rows	Always call `.scalars().unique().all()` when any `joinedload` targets a collection

Advanced Loader Optimization

`contains_eager` with an explicit JOIN

joinedload lets SQLAlchemy control the JOIN. contains_eager lets you write the JOIN yourself — useful when you need to filter, order, or index-hint the joined table — while still populating the relationship on the loaded objects.

# Python 3.11+
from sqlalchemy import select, and_
from sqlalchemy.orm import contains_eager
from sqlalchemy.ext.asyncio import AsyncSession


async def get_users_with_recent_orders(session: AsyncSession) -> list[User]:
    """
    Load users and only their orders placed in the last 30 days.
    A plain joinedload cannot filter the JOIN condition; contains_eager can.
    """
    from datetime import datetime, timedelta, UTC

    cutoff = datetime.now(UTC) - timedelta(days=30)

    stmt = (
        select(User)
        .join(User.orders)
        .where(Order.created_at >= cutoff)
        .options(contains_eager(User.orders))
        .order_by(User.id, Order.created_at.desc())
    )
    result = await session.execute(stmt)
    # .unique() required: the explicit JOIN inflates rows just like joinedload
    return result.scalars().unique().all()

contains_eager tells SQLAlchemy "the columns for this relationship are already present in the result set — hydrate from there." This avoids the second query that selectinload would issue while giving you full control over the JOIN predicate.

raiseload as a production safety net:

Declare relationships with lazy="raise" globally on the model, then explicitly opt in to loading at the query site. Any forgotten relationship access becomes an immediate InvalidRequestError during development instead of a silent N+1 in production.

# Python 3.11+
from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column, relationship


class Base(DeclarativeBase):
    pass


class User(Base):
    __tablename__ = "users"
    id: Mapped[int] = mapped_column(primary_key=True)
    email: Mapped[str]
    # Raises immediately on attribute access if not explicitly loaded
    orders: Mapped[list["Order"]] = relationship(back_populates="user", lazy="raise")
    invoice: Mapped["Invoice | None"] = relationship(
        back_populates="user", uselist=False, lazy="raise"
    )

You can also apply raiseload("*") as a query option to lock down every relationship on a specific query without changing the model default:

# Python 3.11+
from sqlalchemy import select
from sqlalchemy.orm import selectinload, raiseload
from sqlalchemy.ext.asyncio import AsyncSession


async def get_users_orders_only(session: AsyncSession) -> list[User]:
    stmt = (
        select(User)
        .options(
            selectinload(User.orders),   # explicitly loaded — fine
            raiseload("*"),              # everything else raises on access
        )
    )
    result = await session.execute(stmt)
    return result.scalars().all()

Frequently Asked Questions

Why does joinedload on a collection require .unique() but selectinload does not?

joinedload rewrites the SQL to include a JOIN, so the database returns one row per (parent, child) pair. For a User with three Order rows, the result cursor contains three rows all with the same user columns. SQLAlchemy 2.0 requires you to call .unique() to signal that deduplication should happen in Python. selectinload fetches parent and child in separate queries, so the parent cursor always has exactly one row per parent — no deduplication needed.

Can I use both selectinload and joinedload in the same query?

Yes. You can mix loaders for different relationships on the same entity. A common pattern is joinedload for a many-to-one (e.g. Order → User) combined with selectinload for a one-to-many (e.g. Order → OrderLine). Apply them as separate .options() arguments or chain them inside a single .options() call.

How deep can I chain selectinload?

There is no hard limit. Each level adds one batch SELECT to the total query count. A three-level chain (User → orders → products → categories) costs four queries total. That is almost always preferable to the exponential row counts a deeply nested joinedload chain would produce. Monitor the IN clause size at each level — if any level loads tens of thousands of IDs, paginate the parent query first.

Does selectinload work correctly with LIMIT and OFFSET on the parent query?

Yes. SQLAlchemy runs the parent query with your LIMIT/OFFSET first, collects the primary keys from that result, then issues the secondary IN query scoped to exactly those keys. The child load is always consistent with the paginated parent set, with no risk of the windowing problem that a JOIN-based approach would suffer.

Complex Joins and Relationship Loading Strategies — parent topic covering join types, relationship mapping, and loading strategy trade-offs
Fixing Cartesian Product Warnings in SQLAlchemy Joins — sibling page on diagnosing and resolving SAWarning cartesian product errors that often appear alongside incorrect joinedload usage
Common Table Expressions, CTEs and Recursive Queries — adjacent topic for advanced query composition that pairs well with eager loading strategies
Session Lifecycle and Scope Management — cross-topic reference for understanding how session boundaries affect DetachedInstanceError and expire_on_commit behaviour