Excluding Tables and Schemas from Alembic Autogenerate
Use the include_object or include_name hook in env.py to tell Alembic which tables and schemas to ignore during autogenerate — this prevents it from emitting spurious op.drop_table directives for tables your application does not own, and is a critical step in any autogenerate workflow.
Quick Answer
# BEFORE — no filter: autogenerate drops spatial_ref_sys, celery_taskmeta, etc.
context.configure(
connection=connection,
target_metadata=target_metadata,
)
# AFTER — include_object hook keeps autogenerate scoped to your app's tables
EXCLUDE_TABLES = frozenset({
# PostGIS geometry tables
"spatial_ref_sys",
"raster_columns",
"raster_overviews",
"geometry_columns",
# Celery result backend
"celery_taskmeta",
"celery_tasksetmeta",
# Django admin tables (shared database)
"django_migrations",
"django_content_type",
"auth_user",
"auth_permission",
"auth_group",
})
EXCLUDE_SCHEMAS = frozenset({"tiger", "tiger_data", "topology"}) # PostGIS
def include_object(object, name, type_, reflected, compare_to):
if type_ == "table":
if name in EXCLUDE_TABLES:
return False
# object.schema is None for the default schema
if getattr(object, "schema", None) in EXCLUDE_SCHEMAS:
return False
if type_ == "schema" and name in EXCLUDE_SCHEMAS:
return False
return True
context.configure(
connection=connection,
target_metadata=target_metadata,
include_schemas=True,
include_object=include_object,
)
Execution Context & Async Workflow Integration
Alembic calls include_object during the diff phase — after reflecting the live schema but before writing any op-directives. The function receives five arguments:
| Argument | Type | Meaning |
|---|---|---|
object | SchemaItem | The reflected or model object (Table, Column, Index, etc.) |
name | str | The object's name |
type_ | str | "table", "column", "index", "unique_constraint", "foreign_key_constraint", "schema" |
reflected | bool | True if the object came from database reflection |
compare_to | SchemaItem | None | The counterpart from the other side of the diff; None if no counterpart exists |
Return True to include the object in the diff, False to exclude it entirely. When you exclude a table, Alembic skips its columns, indexes, and constraints too — you do not need a separate column-level filter for tables you have already excluded at the table level.
Choosing between include_object and include_name
The include_name hook fires earlier — it filters by name alone before full reflection occurs, making it more efficient for large schemas:
# alembic/env.py — include_name for prefix-based and allowlist filtering
OWNED_SCHEMAS = {"public", "app"}
TABLE_PREFIX_EXCLUDE = ("celery_", "django_", "spatial_", "raster_")
def include_name(name, type_, parent_names):
"""
Filter objects by name before reflection.
parent_names is a dict with keys 'schema_name' and 'table_name'
depending on context.
"""
if type_ == "schema":
# Only reflect schemas owned by this service
return name in OWNED_SCHEMAS
if type_ == "table":
if name.startswith(TABLE_PREFIX_EXCLUDE):
return False
return True
context.configure(
connection=connection,
target_metadata=target_metadata,
include_schemas=True,
include_name=include_name,
)
Use include_name for simple prefix or allowlist filtering. Use include_object when you need to inspect the actual schema object's attributes — for example, checking whether a table has a specific column, or whether a foreign key references an excluded table. The two hooks can be used together: include_name reduces the reflection surface, then include_object applies finer-grained logic to what remains.
In async projects the hook functions themselves are synchronous — they run inside connection.run_sync(do_run_migrations), so no event loop is available inside them. Keep hook functions pure Python with no await expressions and no I/O.
Scoping Autogenerate to Tables Owned by Your Application
The cleanest pattern for shared-database architectures is to derive the exclusion set dynamically from your own metadata, so you never need to maintain a manual exclusion list:
# alembic/env.py — exclude everything not in our metadata
from myapp.models import Base
import myapp.models.users # noqa: F401
import myapp.models.orders # noqa: F401
import myapp.models.products # noqa: F401
import myapp.models.invoices # noqa: F401
target_metadata = Base.metadata
# All table names this service owns, keyed by (schema, name) for schema-aware DBs
OWNED_TABLES = frozenset(
(t.schema, t.name) for t in target_metadata.tables.values()
)
def include_object(object, name, type_, reflected, compare_to):
if type_ == "table" and reflected:
schema = getattr(object, "schema", None)
if (schema, name) not in OWNED_TABLES:
# Table is in the DB but not owned by this service — skip it
return False
return True
def do_run_migrations(connection):
context.configure(
connection=connection,
target_metadata=target_metadata,
compare_type=True,
include_schemas=True,
include_object=include_object,
)
with context.begin_transaction():
context.run_migrations()
This approach is self-maintaining: when you add a new model, OWNED_TABLES automatically expands to include it. No manual updates to the exclusion set are needed.
Resolving Warnings, Errors & Common Mistakes
| Warning / Error | Root Cause | Production Fix |
|---|---|---|
autogenerate produces op.drop_table("spatial_ref_sys") | PostGIS geometry tables visible in reflected schema but absent from target_metadata | Add spatial_ref_sys, raster_columns, raster_overviews, geometry_columns to the exclusion set |
autogenerate produces op.drop_table("celery_taskmeta") | Celery result backend tables in same DB; not in model metadata | Add all celery_* tables to exclusion set, or migrate Celery to a dedicated database |
autogenerate produces op.drop_schema("tiger") | PostGIS tiger schema visible when include_schemas=True | Add tiger, tiger_data, topology to EXCLUDE_SCHEMAS; also handle in include_name |
include_object not called for reflected schema objects | include_schemas=False (the default); schema-level objects not reflected at all | Set include_schemas=True in context.configure |
TypeError: include_object() takes 4 positional arguments but 5 were given | Old four-argument hook signature missing the compare_to parameter | Update the signature to (object, name, type_, reflected, compare_to) |
autogenerate still emits changes for excluded table | include_name returned True but include_object was not wired; Alembic reflected it fully | Ensure both hooks are passed to context.configure; include_name alone does not block object-level diffs if the table was already reflected |
autogenerate drops the alembic_version table | Custom version_table name is not the default alembic_version so Alembic's internal skip logic does not trigger | Add the custom version table name to EXCLUDE_TABLES |
AttributeError: 'NoneType' object has no attribute 'schema' | include_object accesses object.schema on a non-Table object such as an Index | Guard with if type_ == "table": before accessing .schema; use getattr(object, "schema", None) |
autogenerate emits spurious op.alter_column for unchanged columns | compare_server_default=True combined with dialect rendering differences | Add a process_revision_directives hook to filter out no-op server-default changes, or disable compare_server_default for known-noisy columns |
autogenerate emits op.drop_table for alembic_version in tenant schema | Running include_schemas=True without version_table_schema causes Alembic to create alembic_version in every schema it encounters | Set version_table_schema="public" in context.configure |
Advanced Exclusion Optimization
Regex-Based Exclusion for Dynamic Table Names
For projects with many third-party integrations or dynamically created tables (per-tenant partitions, time-sharded tables), a regex-based approach is more maintainable than a static set:
# alembic/env.py — regex exclusion pattern
import re
from myapp.models import Base
target_metadata = Base.metadata
EXCLUDE_PATTERNS = [
re.compile(r"^celery_"),
re.compile(r"^django_"),
re.compile(r"^spatial_"),
re.compile(r"^raster_"),
re.compile(r"^topology\."), # PostGIS topology schema tables
re.compile(r"^tenant_\d+_archive$"), # archived tenant shards
re.compile(r"^_timescaledb_"), # TimescaleDB internal tables
]
OWNED_SCHEMAS = {"public", "app", "reporting"}
def include_object(object, name, type_, reflected, compare_to):
if type_ == "table":
# Reject tables in schemas we do not own
schema = getattr(object, "schema", None)
if schema is not None and schema not in OWNED_SCHEMAS:
return False
# Reject tables matching the exclusion patterns
if any(p.search(name) for p in EXCLUDE_PATTERNS):
return False
if type_ == "schema" and name not in OWNED_SCHEMAS:
return False
return True
Reading Exclusions from Environment Variables
For CI pipelines that need to vary exclusion lists without modifying env.py, read the list from an environment variable:
# alembic/env.py — environment-variable-driven exclusion
import os
_extra_excludes = os.environ.get("ALEMBIC_EXCLUDE_TABLES", "")
EXCLUDE_TABLES = frozenset(
t.strip() for t in _extra_excludes.split(",") if t.strip()
) | {
"spatial_ref_sys",
"celery_taskmeta",
"celery_tasksetmeta",
}
With this pattern, a CI job running against a shared staging environment can pass ALEMBIC_EXCLUDE_TABLES=legacy_product_catalog,old_sessions without touching the codebase.
Column-Level Exclusion within Managed Tables
To exclude a specific column from comparison within an otherwise managed table — for example, a search_vector column managed by a PostgreSQL trigger rather than SQLAlchemy:
EXCLUDE_COLUMNS = {
("products", "search_vector"), # tsvector maintained by trigger
("orders", "tsv_description"),
}
def include_object(object, name, type_, reflected, compare_to):
if type_ == "column":
table_name = getattr(object, "table", None)
if table_name is not None:
key = (table_name.name, name)
if key in EXCLUDE_COLUMNS:
return False
if type_ == "table" and name in EXCLUDE_TABLES:
return False
return True
Note that excluding a column does not automatically exclude foreign key constraints or indexes that reference it. Add those to the exclusion logic separately if needed.
Verifying That Exclusions Work Correctly
After wiring include_object or include_name, verify the hooks are effective by running autogenerate against a database that contains the excluded tables and checking the output:
# Run autogenerate and inspect what Alembic proposes
alembic revision --autogenerate -m "verify_exclusions_test"
# Open the generated file and confirm excluded tables are absent
cat alembic/versions/*verify_exclusions_test*.py
The generated file should not contain any op.drop_table("spatial_ref_sys") or similar directives. If it does, the hook is not being called — which most commonly means include_schemas=True is missing from context.configure, so Alembic is not reflecting those schemas at all and the tables appear via a different path.
A second verification approach: run autogenerate twice in a row without making any model changes. If both runs produce revision files with no ops inside upgrade() and downgrade(), the exclusion is stable and idempotent. If the second run still proposes changes, there is a spurious diff loop — typically caused by compare_server_default=True combined with dialect rendering differences for columns that were not excluded.
Auditing Hook Coverage with Logging
Enable Alembic's diff logging to see which objects are being considered and whether hooks are firing:
# alembic/env.py — debug logging for hook troubleshooting
import logging
logging.getLogger("alembic.autogenerate.compare").setLevel(logging.DEBUG)
With this set, Alembic logs each object it compares. You can grep the output for the table names you expect to be excluded and confirm they are absent from the comparison log.
Testing include_object in Isolation
For larger projects, unit-test the hook function directly to avoid running a full migration cycle during CI:
# tests/test_alembic_env.py
from unittest.mock import MagicMock
from alembic.env import include_object # import the hook function directly
def make_table_mock(name, schema=None):
obj = MagicMock()
obj.schema = schema
return obj
def test_excludes_spatial_ref_sys():
obj = make_table_mock("spatial_ref_sys")
assert include_object(obj, "spatial_ref_sys", "table", True, None) is False
def test_excludes_tiger_schema_tables():
obj = make_table_mock("counties", schema="tiger")
assert include_object(obj, "counties", "table", True, None) is False
def test_includes_owned_table():
obj = make_table_mock("orders")
assert include_object(obj, "orders", "table", True, None) is True
def test_includes_columns_in_owned_table():
# Column objects do not have schema on them directly
col = MagicMock()
col.table = make_table_mock("products")
assert include_object(col, "price", "column", True, None) is True
Running these as standard pytest tests catches hook regressions without needing a live database.
Frequently Asked Questions
Does include_object prevent autogenerate from detecting new columns in excluded tables?
Yes. Once a table is excluded at the table level, Alembic skips all of its columns, indexes, and constraints. You do not need column-level filters for excluded tables.
Can I exclude a single column within an otherwise managed table?
Yes. Check type_ == "column" and inspect the parent table name via object.table.name inside include_object. Return False for the specific column. Be aware that foreign keys and indexes referencing that column may also need to be excluded; handle them by checking type_ == "foreign_key_constraint" or type_ == "index" and inspecting the referenced column names.
Why does setting include_schemas=False still show foreign-schema tables?include_schemas=False means Alembic only reflects the default schema (usually public in PostgreSQL). But if your target_metadata contains Table objects with an explicit schema= argument, Alembic reflects those schemas regardless of this flag. Set include_schemas=True and use include_object to control which schemas are diffed.
Is there a way to exclude tables without editing env.py?
Alembic does not support exclusion via alembic.ini keys. The include_object and include_name hooks in env.py are the only supported mechanism. To allow runtime overrides without code changes, read the exclusion list from an environment variable as shown in the advanced optimization section above.
Related
- Autogenerating and reviewing migration scripts — parent guide: how autogenerate works end-to-end, naming conventions, editing revisions, and branch/merge workflows.
- Configuring Alembic with async SQLAlchemy engines — wiring
env.pyfor asyncpg, which is required beforeinclude_objecthooks can run. - Zero-downtime schema migration strategies — safely applying migrations to live systems where table exclusion matters most for avoiding accidental data loss.