Using selectinload vs joinedload for N+1 Prevention in SQLAlchemy 2.0
Use selectinload for collections and joinedload for single-object relationships — both prevent N+1 queries in Complex Joins and Relationship Loading Strategies without blocking the async event loop.
Quick Answer
Before — implicit lazy loading that breaks in async:
# Python 3.11+
import asyncio
from sqlalchemy import select
from sqlalchemy.ext.asyncio import AsyncSession, create_async_engine, async_sessionmaker
from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column, relationship
engine = create_async_engine("postgresql+asyncpg://user:pass@localhost/db")
AsyncSessionLocal = async_sessionmaker(engine, expire_on_commit=False)
class Base(DeclarativeBase):
pass
class User(Base):
__tablename__ = "users"
id: Mapped[int] = mapped_column(primary_key=True)
email: Mapped[str]
orders: Mapped[list["Order"]] = relationship(back_populates="user")
class Order(Base):
__tablename__ = "orders"
id: Mapped[int] = mapped_column(primary_key=True)
user_id: Mapped[int] = mapped_column()
total: Mapped[float]
user: Mapped["User"] = relationship(back_populates="orders")
async def bad_get_users_with_orders() -> None:
async with AsyncSessionLocal() as session:
result = await session.execute(select(User))
users = result.scalars().all()
for user in users:
# MissingGreenlet or LazyLoadError raised here —
# async sessions cannot issue a synchronous SELECT
print(user.orders)
After — explicit eager loading, fully async-safe:
# Python 3.11+
import asyncio
from sqlalchemy import select
from sqlalchemy.ext.asyncio import AsyncSession, create_async_engine, async_sessionmaker
from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column, relationship, selectinload, joinedload
engine = create_async_engine("postgresql+asyncpg://user:pass@localhost/db")
AsyncSessionLocal = async_sessionmaker(engine, expire_on_commit=False)
class Base(DeclarativeBase):
pass
class User(Base):
__tablename__ = "users"
id: Mapped[int] = mapped_column(primary_key=True)
email: Mapped[str]
orders: Mapped[list["Order"]] = relationship(back_populates="user")
invoice: Mapped["Invoice | None"] = relationship(back_populates="user", uselist=False)
class Order(Base):
__tablename__ = "orders"
id: Mapped[int] = mapped_column(primary_key=True)
user_id: Mapped[int] = mapped_column()
total: Mapped[float]
user: Mapped["User"] = relationship(back_populates="orders")
class Invoice(Base):
__tablename__ = "invoices"
id: Mapped[int] = mapped_column(primary_key=True)
user_id: Mapped[int] = mapped_column()
amount: Mapped[float]
user: Mapped["User"] = relationship(back_populates="invoice")
async def get_users_with_orders(session: AsyncSession) -> list[User]:
# selectinload: issues one extra SELECT … WHERE user_id IN (…)
# safe for one-to-many collections, no row multiplication
stmt = select(User).options(selectinload(User.orders))
result = await session.execute(stmt)
return result.scalars().all()
async def get_users_with_invoice(session: AsyncSession) -> list[User]:
# joinedload: LEFT OUTER JOIN, single round-trip
# safe for many-to-one / one-to-one (uselist=False)
# .unique() collapses any duplicated parent rows
stmt = select(User).options(joinedload(User.invoice))
result = await session.execute(stmt)
return result.scalars().unique().all()
async def main() -> None:
async with AsyncSessionLocal() as session:
users = await get_users_with_orders(session)
for u in users:
print(u.email, [o.total for o in u.orders])
users_with_invoices = await get_users_with_invoice(session)
for u in users_with_invoices:
print(u.email, u.invoice)
asyncio.run(main())
Execution Context & Async Workflow Integration
SQLAlchemy's async session layer wraps every database interaction in the asyncio event loop via asyncpg. Lazy loading — the SQLAlchemy 1.x default — works by issuing a fresh SELECT the moment you access an unloaded attribute. In a synchronous context that is transparent. In an async context it requires the ORM to start a new coroutine mid-attribute-access, which is impossible without a running greenlet. The result is an immediate MissingGreenlet error.
Eager loaders solve this by scheduling all necessary queries before you touch any attribute:
selectinloadfires a secondawait session.execute(select(Order).where(Order.user_id.in_([1, 2, 3])))call after the parent query returns. asyncpg can pipeline and cache the prepared statement for theINquery across requests, making repeat calls cheap. Because no JOIN is involved, the parent result set is never inflated — 100 users with 50 orders each returns exactly 100 parent rows plus one batch query of up to 5 000 child rows.joinedloadrewrites the parent query to include aLEFT OUTER JOIN. The database returns one row per (parent, child) combination. For a many-to-one or one-to-one relationship that means one row per parent — no inflation, single round-trip. For a one-to-many collection every child produces a duplicate parent row; SQLAlchemy deduplicates in Python, but you must call.unique()to signal that the result cursor has duplicates. Without.unique(), SQLAlchemy 2.0 raises a warning and may return duplicate ORM objects.
Choosing between them:
| Relationship cardinality | Recommended loader | Why |
|---|---|---|
many-to-one / one-to-one | joinedload | No row multiplication, single round-trip |
one-to-many collection | selectinload | No row inflation, bounded IN clause |
many-to-many collection | selectinload | Association table makes JOIN result set explode |
| Deep graph (3+ levels) | Chained selectinload | Each level adds one focused batch query |
Async variant with dependency injection:
# Python 3.11+
from collections.abc import AsyncGenerator
from sqlalchemy import select
from sqlalchemy.ext.asyncio import AsyncSession, create_async_engine, async_sessionmaker
from sqlalchemy.orm import selectinload, joinedload
DATABASE_URL = "postgresql+asyncpg://user:pass@localhost/db"
engine = create_async_engine(DATABASE_URL, pool_size=10, max_overflow=20)
AsyncSessionLocal = async_sessionmaker(engine, expire_on_commit=False)
async def get_session() -> AsyncGenerator[AsyncSession, None]:
async with AsyncSessionLocal() as session:
yield session
async def load_orders_for_tenant(tenant_id: int, session: AsyncSession) -> list:
"""Load all users for a tenant with their orders and each order's products."""
from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column, relationship
# Chained selectinload for a three-level graph:
# User → orders → products
stmt = (
select(User)
.where(User.tenant_id == tenant_id)
.options(
selectinload(User.orders).selectinload(Order.products)
)
)
result = await session.execute(stmt)
return result.scalars().all()
Each .selectinload() in the chain translates to one additional batch SELECT, so a three-level graph costs three total queries regardless of row count — far better than the N+1 (or N×M+1) that lazy loading would produce.
Resolving Warnings, Errors & Common Mistakes
| Warning/Error | Root Cause | Production Fix |
|---|---|---|
sqlalchemy.exc.MissingGreenlet: greenlet_spawn has not been called | Relationship accessed on an async session without an active coroutine context — triggered by implicit lazy load | Add selectinload or joinedload to the originating query via .options(); never access unloaded relationships outside await scope |
sqlalchemy.exc.InvalidRequestError: 'User.orders' is not available due to lazy='raise' (LazyLoadError) | Relationship declared with lazy="raise" or lazy="raise_on_sql" and accessed without explicit loading | Intentional safety net — add the appropriate loader to the query; do not relax lazy="raise" in production |
sqlalchemy.orm.exc.DetachedInstanceError: Instance <User> is not bound to a Session | ORM object used after its session closed or was garbage-collected | Call result.scalars().all() inside the async with AsyncSession() block; use expire_on_commit=False on the session factory so objects survive commit() |
SAWarning: SELECT statement has a cartesian product | joinedload used alongside an explicit .join() on the same table without a proper ON clause, or multiple joinedload on sibling collections | Switch sibling collection loads to selectinload; use contains_eager (see below) when you own the JOIN; check for missing foreign-key constraints |
SAWarning: Duplicate column … result set contains X rows (implicit N+1 duplicate warning) | joinedload on a one-to-many without .unique() — SQLAlchemy detects inflated rows | Always call .scalars().unique().all() when any joinedload targets a collection |
Advanced Loader Optimization
contains_eager with an explicit JOIN
joinedload lets SQLAlchemy control the JOIN. contains_eager lets you write the JOIN yourself — useful when you need to filter, order, or index-hint the joined table — while still populating the relationship on the loaded objects.
# Python 3.11+
from sqlalchemy import select, and_
from sqlalchemy.orm import contains_eager
from sqlalchemy.ext.asyncio import AsyncSession
async def get_users_with_recent_orders(session: AsyncSession) -> list[User]:
"""
Load users and only their orders placed in the last 30 days.
A plain joinedload cannot filter the JOIN condition; contains_eager can.
"""
from datetime import datetime, timedelta, UTC
cutoff = datetime.now(UTC) - timedelta(days=30)
stmt = (
select(User)
.join(User.orders)
.where(Order.created_at >= cutoff)
.options(contains_eager(User.orders))
.order_by(User.id, Order.created_at.desc())
)
result = await session.execute(stmt)
# .unique() required: the explicit JOIN inflates rows just like joinedload
return result.scalars().unique().all()
contains_eager tells SQLAlchemy "the columns for this relationship are already present in the result set — hydrate from there." This avoids the second query that selectinload would issue while giving you full control over the JOIN predicate.
raiseload as a production safety net:
Declare relationships with lazy="raise" globally on the model, then explicitly opt in to loading at the query site. Any forgotten relationship access becomes an immediate InvalidRequestError during development instead of a silent N+1 in production.
# Python 3.11+
from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column, relationship
class Base(DeclarativeBase):
pass
class User(Base):
__tablename__ = "users"
id: Mapped[int] = mapped_column(primary_key=True)
email: Mapped[str]
# Raises immediately on attribute access if not explicitly loaded
orders: Mapped[list["Order"]] = relationship(back_populates="user", lazy="raise")
invoice: Mapped["Invoice | None"] = relationship(
back_populates="user", uselist=False, lazy="raise"
)
You can also apply raiseload("*") as a query option to lock down every relationship on a specific query without changing the model default:
# Python 3.11+
from sqlalchemy import select
from sqlalchemy.orm import selectinload, raiseload
from sqlalchemy.ext.asyncio import AsyncSession
async def get_users_orders_only(session: AsyncSession) -> list[User]:
stmt = (
select(User)
.options(
selectinload(User.orders), # explicitly loaded — fine
raiseload("*"), # everything else raises on access
)
)
result = await session.execute(stmt)
return result.scalars().all()
Frequently Asked Questions
Why does joinedload on a collection require .unique() but selectinload does not?
joinedload rewrites the SQL to include a JOIN, so the database returns one row per (parent, child) pair. For a User with three Order rows, the result cursor contains three rows all with the same user columns. SQLAlchemy 2.0 requires you to call .unique() to signal that deduplication should happen in Python. selectinload fetches parent and child in separate queries, so the parent cursor always has exactly one row per parent — no deduplication needed.
Can I use both selectinload and joinedload in the same query?
Yes. You can mix loaders for different relationships on the same entity. A common pattern is joinedload for a many-to-one (e.g. Order → User) combined with selectinload for a one-to-many (e.g. Order → OrderLine). Apply them as separate .options() arguments or chain them inside a single .options() call.
How deep can I chain selectinload?
There is no hard limit. Each level adds one batch SELECT to the total query count. A three-level chain (User → orders → products → categories) costs four queries total. That is almost always preferable to the exponential row counts a deeply nested joinedload chain would produce. Monitor the IN clause size at each level — if any level loads tens of thousands of IDs, paginate the parent query first.
Does selectinload work correctly with LIMIT and OFFSET on the parent query?
Yes. SQLAlchemy runs the parent query with your LIMIT/OFFSET first, collects the primary keys from that result, then issues the secondary IN query scoped to exactly those keys. The child load is always consistent with the paginated parent set, with no risk of the windowing problem that a JOIN-based approach would suffer.
Related
- Complex Joins and Relationship Loading Strategies — parent topic covering join types, relationship mapping, and loading strategy trade-offs
- Fixing Cartesian Product Warnings in SQLAlchemy Joins — sibling page on diagnosing and resolving SAWarning cartesian product errors that often appear alongside incorrect joinedload usage
- Common Table Expressions, CTEs and Recursive Queries — adjacent topic for advanced query composition that pairs well with eager loading strategies
- Session Lifecycle and Scope Management — cross-topic reference for understanding how session boundaries affect DetachedInstanceError and expire_on_commit behaviour