Understanding Session.expunge vs Session.clear in Python
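Session.expunge() removes a single instance from the session, whereas Session.clear() removes every instance at once. Note that clear() is the legacy spelling: it was renamed to Session.expunge_all() in the SQLAlchemy 0.5 series and is not defined on Session or AsyncSession in 1.4 or 2.0, so on current releases call expunge_all() wherever this guide says clear(). A persistent object that is removed becomes detached; a pending object becomes transient. Neither method emits SQL or triggers flush() or commit(), so the database row is untouched and any unflushed changes on the removed objects are discarded. Use expunge() to hand selected objects to external caches; use clear() / expunge_all() for stateless request teardown and batch boundaries. Both operations are explained in full context in the Session Lifecycle and Scope Management guide.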
Quick Answer
The fundamental distinction is identity map scope and state transition granularity. The before/after behavior for each operation:
```python
from sqlalchemy import inspect

# Before: user and order are both persistent (tracked by session)
assert inspect(user).persistent and inspect(order).persistent

# expunge(): surgical removal of one object
session.expunge(user)
assert inspect(user).detached  # user is now detached
assert inspect(order).persistent  # other objects unchanged

# clear(): full identity map reset. On SQLAlchemy 1.x/2.x this method is
# spelled expunge_all(); Session.clear() no longer exists.
session.expunge_all()
assert inspect(order).detached  # every tracked object is now detached
assert len(session.identity_map) == 0
```
Both operations execute synchronously on the in-memory identity map. On AsyncSession, expunge() and expunge_all() are plain methods rather than coroutines, so neither requires await: they delegate to the underlying synchronous Session and perform no database I/O.
Execution Context & Async Workflow Integration
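In high-concurrency async endpoints, a session that outlives a single request keeps accumulating state. The identity map holds one Python object per database row loaded. It references clean objects weakly, so they are garbage collected once application code drops them, but any object that is still referenced elsewhere (a result list, a cache, a parent's relationship collection) or that has unflushed changes stays tracked until it is removed or the session is closed. A long-lived session can also hold a pooled connection for as long as its transaction stays open, which is what puts pressure on the connection pool.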
AsyncSession wraps a regular Session internally. Calls to expunge() and expunge_all() (the current name for clear()) dispatch to the synchronous layer immediately, making them safe to call inside coroutines without await:
```python
from __future__ import annotations

from typing import Any

from sqlalchemy import ForeignKey, Integer, String, select
from sqlalchemy.ext.asyncio import AsyncSession, async_sessionmaker, create_async_engine
from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column, relationship, selectinload


class Base(DeclarativeBase):
    pass


class User(Base):
    __tablename__ = "users"

    id: Mapped[int] = mapped_column(Integer, primary_key=True)
    name: Mapped[str] = mapped_column(String(100))
    orders: Mapped[list["Order"]] = relationship("Order", lazy="select")


class Order(Base):
    __tablename__ = "orders"

    id: Mapped[int] = mapped_column(Integer, primary_key=True)
    # The foreign key lets relationship() work out the User.orders join
    user_id: Mapped[int] = mapped_column(Integer, ForeignKey("users.id"))
    total: Mapped[float] = mapped_column()


engine = create_async_engine("postgresql+asyncpg://user:pass@localhost/db", pool_size=10)
AsyncSessionLocal = async_sessionmaker(engine, class_=AsyncSession, expire_on_commit=False)


async def process_and_cache_user(user_id: int) -> dict[str, Any]:
    """Load a user, pre-fetch relationships, expunge for external caching."""
    async with AsyncSessionLocal() as session:
        stmt = (
            select(User)
            .where(User.id == user_id)
            .options(selectinload(User.orders))
        )
        result = await session.execute(stmt)
        user = result.scalar_one()

        # Pre-load complete, so it is safe to detach (no await needed)
        session.expunge(user)

        # user.orders is accessible because selectinload fetched it eagerly
        return {
            "id": user.id,
            "name": user.name,
            "order_count": len(user.orders),
        }


async def stateless_worker_tick(session: AsyncSession) -> None:
    """Background worker that processes a batch then clears memory."""
    result = await session.execute(select(Order).limit(500))
    orders = result.scalars().all()
    for order in orders:
        # process each order ...
        pass

    # Discard all tracked state before the next batch iteration so the
    # identity map does not keep growing across ticks.
    # expunge_all() is the current name for the legacy clear().
    session.expunge_all()
```
The key distinction for async workflows: expunge() is appropriate when you need to pass a fully hydrated object to a serializer, background task queue, or external cache that lives outside the session scope. clear() (expunge_all() on current releases) is appropriate at batch boundaries in long-running workers where you want a clean slate but keep reusing the same session object. Reuse does not save a database connection by itself, since connections come from the engine's pool either way; it simply avoids constructing a new session per batch.
Resolving Warnings, Errors & Common Mistakes
| Error / Warning | Root Cause | Production Fix |
|---|---|---|
| sqlalchemy.orm.exc.DetachedInstanceError: Instance <User> is not bound to a Session; attribute refresh operation cannot proceed | Accessing an expired or unloaded attribute, or a lazy relationship, after expunge() or clear(). The instance no longer has a session through which to load it. | Pre-load all required attributes with selectinload() / joinedload() before detaching, or re-attach the instance with session.add(obj). |
| Pending INSERT silently skipped (no exception is raised) | Expunging an object that is still pending (never flushed). It becomes transient and leaves session.new, so a later flush() emits nothing for it. | Flush pending objects before expunging them, and call session.expunge_all() only after the transaction boundary. |
| sqlalchemy.exc.PendingRollbackError (a subclass of InvalidRequestError): This Session's transaction has been rolled back due to a previous exception during flush | Re-using a session after a failed flush without calling rollback() first. clear() / expunge_all() discards objects but does not reset the failed transaction. | Call await session.rollback() before any further database work on that session. |
| sqlalchemy.exc.InvalidRequestError: Object '<Order at 0x...>' is already attached to session '...' (this is '...') | Calling session.add(obj) while the object is still attached to a different session, that is, it was never expunged from the session that loaded it. | Expunge the object from its original session first, or use session.merge(obj) to copy its state into the current session. |
| Stale attribute values returned after clear() followed by re-query | The identity map was cleared, so the re-query builds a new instance, but application code still holds the old detached instance with its old values. | Discard Python references to objects before calling clear(), or use session.expire(obj) selectively instead. |
Interaction with Pending and Deleted Objects
The behavior of both operations shifts depending on whether the object is pending (never flushed) or deleted (marked for deletion):
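```python
from __future__ import annotations

from sqlalchemy import inspect
from sqlalchemy.orm import Session

# User is the mapped class defined in the earlier example.


def demonstrate_state_edge_cases(session: Session) -> None:
    # Pending object: added to the session but not yet in the database
    new_user = User(name="Bob", id=99)
    session.add(new_user)
    assert inspect(new_user).pending

    # Expunging a pending object makes it transient, not detached,
    # because it never acquired a database identity
    session.expunge(new_user)
    assert inspect(new_user).transient

    # Object marked for deletion (assumes a row with id=1 exists)
    existing = session.get(User, 1)
    session.delete(existing)
    # Until flush() emits the DELETE the object is still persistent and is
    # listed in session.deleted; inspect(obj).deleted is True only after flush
    assert existing in session.deleted
    assert inspect(existing).persistent

    # Removing everything (clear(), spelled expunge_all() on 1.x/2.x)
    # detaches the object and discards the pending deletion
    session.expunge_all()
    assert inspect(existing).detached  # DELETE never issued; row still exists
```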
This edge case matters in batch jobs: if you call session.clear() while the session has deleted objects pending, those deletions are silently abandoned. The rows remain in the database. Always call session.flush() before session.clear() if any deletions must be committed.
Advanced Identity Map Optimization
Selective Expunge in Cache Population Pipelines
In read-heavy APIs that maintain an application-level cache (Redis, Memcached), the pattern of load → expunge → serialize → store reduces ORM memory overhead while preserving safe serialization outside the session boundary:
```python
from __future__ import annotations

import json
from decimal import Decimal

from sqlalchemy import Integer, Numeric, String, select
from sqlalchemy.ext.asyncio import AsyncSession
from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column


class Base(DeclarativeBase):
    pass


class Product(Base):
    __tablename__ = "products"

    id: Mapped[int] = mapped_column(Integer, primary_key=True)
    sku: Mapped[str] = mapped_column(String(50))
    name: Mapped[str] = mapped_column(String(200))
    price: Mapped[Decimal] = mapped_column(Numeric(10, 2))


async def warm_product_cache(
    session: AsyncSession,
    redis_client,  # any async client exposing set(key, value, ex=seconds)
    product_ids: list[int],
) -> None:
    """Bulk-load products, expunge each after caching to reclaim ORM memory."""
    stmt = select(Product).where(Product.id.in_(product_ids))
    result = await session.execute(stmt)
    products = result.scalars().all()

    for product in products:
        payload = json.dumps({
            "id": product.id,
            "sku": product.sku,
            "name": product.name,
            "price": str(product.price),  # Decimal is not JSON serializable
        })
        await redis_client.set(f"product:{product.id}", payload, ex=300)
        session.expunge(product)
        # product is now detached; it becomes collectable once `products`
        # (and any other reference) no longer points at it
```
This avoids holding a large identity map in memory for the duration of a cache warming job. An expunged product can only be garbage collected once nothing else references it, and here the products list still holds every instance until the function returns. For very large batches, process the IDs in chunks so that no single list keeps the whole result set alive.
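One way to chunk the work is a small generator that slices the ID list before each query. The chunked helper below is a hypothetical utility written for illustration, not part of SQLAlchemy, and the chunk size is an arbitrary assumption.

```python
from collections.abc import Iterable, Iterator
from itertools import islice


def chunked(ids: Iterable[int], size: int) -> Iterator[list[int]]:
    """Yield successive lists of at most `size` IDs from `ids`."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(ids)
    # islice pulls up to `size` items per pass; an empty list ends the loop
    while chunk := list(islice(iterator, size)):
        yield chunk


# Sketch of intended use inside a coroutine (names from the example above):
#     for chunk in chunked(product_ids, 500):
#         await warm_product_cache(session, redis_client, chunk)
```

Each call then loads, caches, and releases at most one chunk of products, so peak memory is bounded by the chunk size rather than by the full ID list.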
expunge_all() as the Current Name for clear()
session.expunge_all() is not a lighter variant of session.clear(); it is the same operation under its current name. It is equivalent to calling expunge() on every object in the session, so it removes pending objects as well as persistent ones and leaves the new, dirty, and deleted collections empty:

```python
# expunge_all(): removes every object and discards all unflushed work
session.expunge_all()
assert not session.new and not session.dirty and not session.deleted
assert len(session.identity_map) == 0

# clear(): legacy name for the same behavior. It is not defined on
# SQLAlchemy 1.x/2.x sessions, where calling it raises AttributeError.
# session.clear()
```

To release some objects to external code while still intending to commit or roll back pending changes tracked in the session, call expunge() on just those objects. Use expunge_all() only when you are certain no unflushed change should survive.
Frequently Asked Questions
Does Session.clear() trigger a database rollback?
No. clear() (expunge_all() on current releases) only empties the identity map and resets the session's tracking collections; it does not end or roll back the database transaction. Changes that were already flushed stay in the open transaction until commit() or rollback() is explicitly invoked. Changes that were never flushed are discarded along with the objects and never reach the database. The result can be confusing: an open transaction that may contain flushed writes with no corresponding ORM state.
Can I use Session.expunge() with AsyncSession in SQLAlchemy 2.0?
Yes. AsyncSession.expunge() is a regular method that proxies to the underlying synchronous Session. It operates on the in-memory identity map, performs no I/O, and does not require await. The same applies to expunge_all(), which is the name clear() goes by in SQLAlchemy 2.0.
When should I prefer clear() over expunge() in high-throughput APIs?
Use clear() (expunge_all()) at the end of stateless request handlers or between batch iterations in long-running workers, where the session should stop tracking everything it has loaded. Memory is actually reclaimed only once application code also drops its own references to those objects. Use expunge() only when specific objects must persist in an application-level cache or be passed to code outside the ORM session while retaining their loaded attribute values.
How do I resolve DetachedInstanceError after calling expunge()?
The error means code accessed an attribute that was not loaded before detachment. Fix it by pre-loading required attributes and relationships using selectinload() or joinedload() in the query that hydrates the object, or re-attach the object to a new session via session.add(obj) (if primary key is set and DB row exists) before accessing unloaded attributes.
Related
- Session Lifecycle and Scope Management: Parent guide covering all five object states, identity map semantics, and request-scoped session patterns.
- Fixing DetachedInstanceError After Commit in SQLAlchemy: Sibling guide focused on the most common detachment scenario, attribute access after commit() with expire_on_commit=True.
- Transaction Isolation and Commit Strategies: Covers how commit(), rollback(), and savepoints interact with session state.
- Using expire_on_commit=False in FastAPI Dependencies: Framework-specific pattern that avoids post-commit lazy-load issues without manual expunge.