[{"data":1,"prerenderedAt":1230},["ShallowReactive",2],{"page-\u002Fadvanced-query-patterns-and-bulk-data-operations\u002Fhigh-performance-bulk-inserts-and-updates\u002F":3},{"id":4,"title":5,"body":6,"description":16,"extension":1224,"meta":1225,"navigation":178,"path":1226,"seo":1227,"stem":1228,"__hash__":1229},"content\u002Fadvanced-query-patterns-and-bulk-data-operations\u002Fhigh-performance-bulk-inserts-and-updates\u002Findex.md","High-Performance Bulk Inserts and Updates in SQLAlchemy 2.0",{"type":7,"value":8,"toc":1200},"minimark",[9,13,17,22,27,43,47,88,97,101,105,119,410,414,429,433,437,456,460,474,669,673,677,680,684,691,859,863,867,874,878,892,1063,1067,1144,1148,1157,1174,1187,1196],[10,11,5],"h1",{"id":12},"high-performance-bulk-inserts-and-updates-in-sqlalchemy-20",[14,15,16],"p",{},"Modern data ingestion pipelines demand deterministic latency, predictable memory footprints, and strict async\u002Fawait boundaries. SQLAlchemy 2.0 introduces a unified execution model that bridges the ORM's unit-of-work with Core's raw execution engine. However, achieving production-grade throughput requires deliberate architectural choices around transaction scoping, dialect-specific optimizations, and cursor management. This guide details production-ready patterns for bulk operations, emphasizing performance trade-offs and safe async execution boundaries.",[18,19,21],"h2",{"id":20},"architectural-foundations-for-bulk-data-workflows","Architectural Foundations for Bulk Data Workflows",[23,24,26],"h3",{"id":25},"orm-vs-core-execution-boundaries","ORM vs Core Execution Boundaries",[14,28,29,30,34,35,38,39,42],{},"The SQLAlchemy ORM excels at state tracking, relationship hydration, and identity map management, but these features introduce measurable overhead during high-volume data operations. Each session flush triggers attribute instrumentation, dirty-state tracking, and dependency resolution. 
For bulk workflows exceeding a few thousand rows, bypassing the ORM's unit-of-work in favor of Core's ",[31,32,33],"code",{},"Connection"," or ",[31,36,37],{},"Engine"," execution layer eliminates this tax. Core operates directly against the DBAPI, allowing dialects like ",[31,40,41],{},"asyncpg"," to leverage native prepared statement batching and zero-copy buffer transfers.",[23,44,46],{"id":45},"async-driver-selection-and-pool-configuration","Async Driver Selection and Pool Configuration",[14,48,49,50,52,53,56,57,60,61,64,65,68,69,72,73,76,77,34,80,83,84,87],{},"Driver selection dictates the ceiling of your throughput. ",[31,51,41],{}," is the de facto standard for PostgreSQL in async environments, offering native C-level batch execution and efficient binary protocol encoding. When configuring ",[31,54,55],{},"create_async_engine()",", pool sizing must align with your event loop concurrency rather than traditional synchronous thread models. A ",[31,58,59],{},"pool_size"," of 5–10 with ",[31,62,63],{},"max_overflow=20"," typically suffices for I\u002FO-bound async workers. Enable ",[31,66,67],{},"pool_pre_ping=True"," to handle stale connections gracefully, and tune ",[31,70,71],{},"statement_cache_size"," based on query cardinality. For SQLite, ",[31,74,75],{},"aiosqlite"," requires careful transaction isolation tuning (",[31,78,79],{},"BEGIN IMMEDIATE",[31,81,82],{},"BEGIN EXCLUSIVE",") to prevent ",[31,85,86],{},"database is locked"," errors during concurrent writes.",[14,89,90,91,96],{},"Understanding how these execution boundaries interact with broader data pipeline architectures is essential. 
For a comprehensive breakdown of scalable ingestion patterns, consult the ",[92,93,95],"a",{"href":94},"\u002Fadvanced-query-patterns-and-bulk-data-operations\u002F","Advanced Query Patterns and Bulk Data Operations"," framework, which contextualizes bulk workflows within distributed data architectures.",[18,98,100],{"id":99},"core-level-batch-execution-and-parameter-binding","Core-Level Batch Execution and Parameter Binding",[23,102,104],{"id":103},"fast-execution-paths","Fast Execution Paths",[14,106,107,108,111,112,115,116,118],{},"SQLAlchemy 2.0 standardizes bulk insertion via ",[31,109,110],{},"Connection.execute()"," called with an insert() construct and a sequence of parameter dictionaries. This path bypasses ORM instrumentation and maps directly to the DBAPI's ",[31,113,114],{},"executemany"," protocol. When paired with ",[31,117,41],{},", the driver automatically compiles a single prepared statement and streams parameters in binary format, avoiding repeated SQL parsing and network round-trips.",[120,121,126],"pre",{"className":122,"code":123,"language":124,"meta":125,"style":125},"language-python shiki shiki-themes github-light github-dark","from typing import List, Dict, Any\nfrom sqlalchemy.ext.asyncio import AsyncEngine, create_async_engine\nfrom sqlalchemy import insert, Table, MetaData\n\n# Assume a pre-defined metadata\u002Ftable object\nmetadata = MetaData()\nmetrics_table = Table(\"metrics\", metadata)\n\nasync def bulk_insert_metrics(\n engine: AsyncEngine,\n payload: List[Dict[str, Any]],\n chunk_size: int = 2500\n) -> None:\n \"\"\"Execute chunked bulk inserts using Core execution boundaries.\"\"\"\n if not payload:\n return\n\n stmt = insert(metrics_table)\n \n # Explicit transaction boundary prevents implicit autocommit fragmentation\n async with engine.begin() as conn:\n for i in range(0, len(payload), chunk_size):\n chunk = payload[i : i + chunk_size]\n await conn.execute(stmt, 
chunk)\n","python","",[31,127,128,147,160,173,180,187,199,217,222,238,244,257,272,284,290,302,308,313,324,330,336,354,384,401],{"__ignoreMap":125},[129,130,133,137,141,144],"span",{"class":131,"line":132},"line",1,[129,134,136],{"class":135},"szBVR","from",[129,138,140],{"class":139},"sVt8B"," typing ",[129,142,143],{"class":135},"import",[129,145,146],{"class":139}," List, Dict, Any\n",[129,148,150,152,155,157],{"class":131,"line":149},2,[129,151,136],{"class":135},[129,153,154],{"class":139}," sqlalchemy.ext.asyncio ",[129,156,143],{"class":135},[129,158,159],{"class":139}," AsyncEngine, create_async_engine\n",[129,161,163,165,168,170],{"class":131,"line":162},3,[129,164,136],{"class":135},[129,166,167],{"class":139}," sqlalchemy ",[129,169,143],{"class":135},[129,171,172],{"class":139}," insert, Table, MetaData\n",[129,174,176],{"class":131,"line":175},4,[129,177,179],{"emptyLinePlaceholder":178},true,"\n",[129,181,183],{"class":131,"line":182},5,[129,184,186],{"class":185},"sJ8bj","# Assume a pre-defined metadata\u002Ftable object\n",[129,188,190,193,196],{"class":131,"line":189},6,[129,191,192],{"class":139},"metadata ",[129,194,195],{"class":135},"=",[129,197,198],{"class":139}," MetaData()\n",[129,200,202,205,207,210,214],{"class":131,"line":201},7,[129,203,204],{"class":139},"metrics_table ",[129,206,195],{"class":135},[129,208,209],{"class":139}," Table(",[129,211,213],{"class":212},"sZZnC","\"metrics\"",[129,215,216],{"class":139},", metadata)\n",[129,218,220],{"class":131,"line":219},8,[129,221,179],{"emptyLinePlaceholder":178},[129,223,225,228,231,235],{"class":131,"line":224},9,[129,226,227],{"class":135},"async",[129,229,230],{"class":135}," def",[129,232,234],{"class":233},"sScJk"," bulk_insert_metrics",[129,236,237],{"class":139},"(\n",[129,239,241],{"class":131,"line":240},10,[129,242,243],{"class":139}," engine: AsyncEngine,\n",[129,245,247,250,254],{"class":131,"line":246},11,[129,248,249],{"class":139}," payload: 
List[Dict[",[129,251,253],{"class":252},"sj4cs","str",[129,255,256],{"class":139},", Any]],\n",[129,258,260,263,266,269],{"class":131,"line":259},12,[129,261,262],{"class":139}," chunk_size: ",[129,264,265],{"class":252},"int",[129,267,268],{"class":135}," =",[129,270,271],{"class":252}," 2500\n",[129,273,275,278,281],{"class":131,"line":274},13,[129,276,277],{"class":139},") -> ",[129,279,280],{"class":252},"None",[129,282,283],{"class":139},":\n",[129,285,287],{"class":131,"line":286},14,[129,288,289],{"class":212}," \"\"\"Execute chunked bulk inserts using Core execution boundaries.\"\"\"\n",[129,291,293,296,299],{"class":131,"line":292},15,[129,294,295],{"class":135}," if",[129,297,298],{"class":135}," not",[129,300,301],{"class":139}," payload:\n",[129,303,305],{"class":131,"line":304},16,[129,306,307],{"class":135}," return\n",[129,309,311],{"class":131,"line":310},17,[129,312,179],{"emptyLinePlaceholder":178},[129,314,316,319,321],{"class":131,"line":315},18,[129,317,318],{"class":139}," stmt ",[129,320,195],{"class":135},[129,322,323],{"class":139}," insert(metrics_table)\n",[129,325,327],{"class":131,"line":326},19,[129,328,329],{"class":139}," \n",[129,331,333],{"class":131,"line":332},20,[129,334,335],{"class":185}," # Explicit transaction boundary prevents implicit autocommit fragmentation\n",[129,337,339,342,345,348,351],{"class":131,"line":338},21,[129,340,341],{"class":135}," async",[129,343,344],{"class":135}," with",[129,346,347],{"class":139}," engine.begin() ",[129,349,350],{"class":135},"as",[129,352,353],{"class":139}," conn:\n",[129,355,357,360,363,366,369,372,375,378,381],{"class":131,"line":356},22,[129,358,359],{"class":135}," for",[129,361,362],{"class":139}," i ",[129,364,365],{"class":135},"in",[129,367,368],{"class":252}," range",[129,370,371],{"class":139},"(",[129,373,374],{"class":252},"0",[129,376,377],{"class":139},", ",[129,379,380],{"class":252},"len",[129,382,383],{"class":139},"(payload), 
chunk_size):\n",[129,385,387,390,392,395,398],{"class":131,"line":386},23,[129,388,389],{"class":139}," chunk ",[129,391,195],{"class":135},[129,393,394],{"class":139}," payload[i : i ",[129,396,397],{"class":135},"+",[129,399,400],{"class":139}," chunk_size]\n",[129,402,404,407],{"class":131,"line":403},24,[129,405,406],{"class":135}," await",[129,408,409],{"class":139}," conn.execute(stmt, chunk)\n",[23,411,413],{"id":412},"chunking-and-memory-profiling","Chunking and Memory Profiling",[14,415,416,417,420,421,423,424,428],{},"Unbounded batch sizes trigger WAL (Write-Ahead Log) exhaustion, transaction log bloat, and heap fragmentation. Chunking at 1,000–5,000 rows per transaction balances I\u002FO throughput with memory stability. Monitor ",[31,418,419],{},"pg_stat_activity"," and application heap metrics to adjust chunk sizes dynamically. For deep dives into heap allocation strategies and dialect-specific ",[31,422,114],{}," tuning, refer to ",[92,425,427],{"href":426},"\u002Fadvanced-query-patterns-and-bulk-data-operations\u002Fhigh-performance-bulk-inserts-and-updates\u002Fbatch-inserting-millions-of-rows-with-sqlalchemy-coreexecute\u002F","Batch Inserting Millions of Rows with SQLAlchemy core.execute",".",[18,430,432],{"id":431},"upsert-logic-and-conflict-resolution-patterns","Upsert Logic and Conflict Resolution Patterns",[23,434,436],{"id":435},"native-on-conflict-mapping","Native ON CONFLICT Mapping",[14,438,439,440,443,444,447,448,451,452,455],{},"Idempotent data ingestion requires conflict resolution at the database level. SQLAlchemy 2.0 exposes ",[31,441,442],{},"Insert.on_conflict_do_update()",", which compiles directly to PostgreSQL's ",[31,445,446],{},"ON CONFLICT ... DO UPDATE"," or SQLite's equivalent. 
This avoids the ",[31,449,450],{},"SELECT","-then-",[31,453,454],{},"INSERT\u002FUPDATE"," race condition and executes atomically within a single statement.",[23,457,459],{"id":458},"conditional-update-expressions","Conditional Update Expressions",[14,461,462,463,34,466,469,470,473],{},"Targeting specific constraints requires explicit ",[31,464,465],{},"index_elements",[31,467,468],{},"constraint"," parameters. The ",[31,471,472],{},"EXCLUDED"," pseudo-table allows referencing incoming values during the update phase. For production-tested conflict resolution strategies, see Bulk Updating with ON CONFLICT DO UPDATE in PostgreSQL.",[120,475,477],{"className":122,"code":476,"language":124,"meta":125,"style":125},"from sqlalchemy import insert, func\nfrom sqlalchemy.dialects.postgresql import insert as pg_insert\n\nasync def upsert_device_readings(\n engine: AsyncEngine,\n readings: List[Dict[str, Any]]\n) -> None:\n \"\"\"Perform atomic upserts targeting a composite unique constraint.\"\"\"\n stmt = pg_insert(metrics_table).values(readings)\n \n upsert_stmt = stmt.on_conflict_do_update(\n index_elements=[\"device_id\", \"timestamp\"],\n set_={\n \"value\": stmt.excluded.value,\n \"updated_at\": func.now()\n },\n where=stmt.excluded.value > metrics_table.c.value\n )\n \n async with engine.begin() as conn:\n await conn.execute(upsert_stmt)\n",[31,478,479,490,507,511,522,526,536,544,549,558,562,572,594,604,612,620,625,641,646,650,662],{"__ignoreMap":125},[129,480,481,483,485,487],{"class":131,"line":132},[129,482,136],{"class":135},[129,484,167],{"class":139},[129,486,143],{"class":135},[129,488,489],{"class":139}," insert, func\n",[129,491,492,494,497,499,502,504],{"class":131,"line":149},[129,493,136],{"class":135},[129,495,496],{"class":139}," sqlalchemy.dialects.postgresql ",[129,498,143],{"class":135},[129,500,501],{"class":139}," insert ",[129,503,350],{"class":135},[129,505,506],{"class":139}," 
pg_insert\n",[129,508,509],{"class":131,"line":162},[129,510,179],{"emptyLinePlaceholder":178},[129,512,513,515,517,520],{"class":131,"line":175},[129,514,227],{"class":135},[129,516,230],{"class":135},[129,518,519],{"class":233}," upsert_device_readings",[129,521,237],{"class":139},[129,523,524],{"class":131,"line":182},[129,525,243],{"class":139},[129,527,528,531,533],{"class":131,"line":189},[129,529,530],{"class":139}," readings: List[Dict[",[129,532,253],{"class":252},[129,534,535],{"class":139},", Any]]\n",[129,537,538,540,542],{"class":131,"line":201},[129,539,277],{"class":139},[129,541,280],{"class":252},[129,543,283],{"class":139},[129,545,546],{"class":131,"line":219},[129,547,548],{"class":212}," \"\"\"Perform atomic upserts targeting a composite unique constraint.\"\"\"\n",[129,550,551,553,555],{"class":131,"line":224},[129,552,318],{"class":139},[129,554,195],{"class":135},[129,556,557],{"class":139}," pg_insert(metrics_table).values(readings)\n",[129,559,560],{"class":131,"line":240},[129,561,329],{"class":139},[129,563,564,567,569],{"class":131,"line":246},[129,565,566],{"class":139}," upsert_stmt ",[129,568,195],{"class":135},[129,570,571],{"class":139}," stmt.on_conflict_do_update(\n",[129,573,574,578,580,583,586,588,591],{"class":131,"line":259},[129,575,577],{"class":576},"s4XuR"," index_elements",[129,579,195],{"class":135},[129,581,582],{"class":139},"[",[129,584,585],{"class":212},"\"device_id\"",[129,587,377],{"class":139},[129,589,590],{"class":212},"\"timestamp\"",[129,592,593],{"class":139},"],\n",[129,595,596,599,601],{"class":131,"line":274},[129,597,598],{"class":576}," set_",[129,600,195],{"class":135},[129,602,603],{"class":139},"{\n",[129,605,606,609],{"class":131,"line":286},[129,607,608],{"class":212}," \"value\"",[129,610,611],{"class":139},": stmt.excluded.value,\n",[129,613,614,617],{"class":131,"line":292},[129,615,616],{"class":212}," \"updated_at\"",[129,618,619],{"class":139},": 
func.now()\n",[129,621,622],{"class":131,"line":304},[129,623,624],{"class":139}," },\n",[129,626,627,630,632,635,638],{"class":131,"line":310},[129,628,629],{"class":576}," where",[129,631,195],{"class":135},[129,633,634],{"class":139},"stmt.excluded.value ",[129,636,637],{"class":135},">",[129,639,640],{"class":139}," metrics_table.c.value\n",[129,642,643],{"class":131,"line":315},[129,644,645],{"class":139}," )\n",[129,647,648],{"class":131,"line":326},[129,649,329],{"class":139},[129,651,652,654,656,658,660],{"class":131,"line":332},[129,653,341],{"class":135},[129,655,344],{"class":135},[129,657,347],{"class":139},[129,659,350],{"class":135},[129,661,353],{"class":139},[129,663,664,666],{"class":131,"line":338},[129,665,406],{"class":135},[129,667,668],{"class":139}," conn.execute(upsert_stmt)\n",[18,670,672],{"id":671},"memory-management-and-async-streaming-pipelines","Memory Management and Async Streaming Pipelines",[23,674,676],{"id":675},"event-loop-preservation","Event Loop Preservation",[14,678,679],{},"Blocking the event loop during large dataset ingestion causes cascading latency across async services. Python's garbage collector can trigger unpredictable pauses when allocating millions of ORM instances. Generator-based chunking and explicit cursor management maintain stable heap allocation and prevent event loop starvation.",[23,681,683],{"id":682},"cursor-based-iteration","Cursor-Based Iteration",[14,685,686,687,690],{},"SQLAlchemy 2.0 fully supports ",[31,688,689],{},"yield_per"," in async contexts, but requires server-side cursor configuration. By default, async drivers fetch all results into memory. Enabling server-side cursors shifts buffering to the database, allowing incremental consumption. 
For implementation details on backpressure handling and cursor lifecycle management, review Using yield_per for Streaming Large Query Results.",[120,692,694],{"className":122,"code":693,"language":124,"meta":125,"style":125},"from sqlalchemy.ext.asyncio import AsyncSession, async_sessionmaker\nfrom sqlalchemy import select, func\nfrom typing import AsyncGenerator\n\nasync def stream_metrics_for_processing(\n session: AsyncSession,\n device_id: str\n) -> AsyncGenerator[Any, None]:\n \"\"\"Stream results using yield_per with explicit transaction boundaries.\"\"\"\n stmt = select(metrics_table).where(\n metrics_table.c.device_id == device_id\n ).order_by(metrics_table.c.timestamp)\n\n # Server-side cursor requires an active transaction\n async with session.begin():\n result = await session.stream(stmt.execution_options(yield_per=1000))\n async for row in result:\n yield row\n",[31,695,696,707,718,729,733,744,749,757,767,772,781,792,797,801,806,815,837,851],{"__ignoreMap":125},[129,697,698,700,702,704],{"class":131,"line":132},[129,699,136],{"class":135},[129,701,154],{"class":139},[129,703,143],{"class":135},[129,705,706],{"class":139}," AsyncSession, async_sessionmaker\n",[129,708,709,711,713,715],{"class":131,"line":149},[129,710,136],{"class":135},[129,712,167],{"class":139},[129,714,143],{"class":135},[129,716,717],{"class":139}," select, func\n",[129,719,720,722,724,726],{"class":131,"line":162},[129,721,136],{"class":135},[129,723,140],{"class":139},[129,725,143],{"class":135},[129,727,728],{"class":139}," AsyncGenerator\n",[129,730,731],{"class":131,"line":175},[129,732,179],{"emptyLinePlaceholder":178},[129,734,735,737,739,742],{"class":131,"line":182},[129,736,227],{"class":135},[129,738,230],{"class":135},[129,740,741],{"class":233}," stream_metrics_for_processing",[129,743,237],{"class":139},[129,745,746],{"class":131,"line":189},[129,747,748],{"class":139}," session: AsyncSession,\n",[129,750,751,754],{"class":131,"line":201},[129,752,753],{"class":139}," 
device_id: ",[129,755,756],{"class":252},"str\n",[129,758,759,762,764],{"class":131,"line":219},[129,760,761],{"class":139},") -> AsyncGenerator[Any, ",[129,763,280],{"class":252},[129,765,766],{"class":139},"]:\n",[129,768,769],{"class":131,"line":224},[129,770,771],{"class":212}," \"\"\"Stream results using yield_per with explicit transaction boundaries.\"\"\"\n",[129,773,774,776,778],{"class":131,"line":240},[129,775,318],{"class":139},[129,777,195],{"class":135},[129,779,780],{"class":139}," select(metrics_table).where(\n",[129,782,783,786,789],{"class":131,"line":246},[129,784,785],{"class":139}," metrics_table.c.device_id ",[129,787,788],{"class":135},"==",[129,790,791],{"class":139}," device_id\n",[129,793,794],{"class":131,"line":259},[129,795,796],{"class":139}," ).order_by(metrics_table.c.timestamp)\n",[129,798,799],{"class":131,"line":274},[129,800,179],{"emptyLinePlaceholder":178},[129,802,803],{"class":131,"line":286},[129,804,805],{"class":185}," # Server-side cursor requires an active transaction\n",[129,807,808,810,812],{"class":131,"line":292},[129,809,341],{"class":135},[129,811,344],{"class":135},[129,813,814],{"class":139}," session.begin():\n",[129,816,817,820,822,824,827,829,831,834],{"class":131,"line":304},[129,818,819],{"class":139}," result ",[129,821,195],{"class":135},[129,823,406],{"class":135},[129,825,826],{"class":139}," session.scalars(stmt.execution_options(",[129,828,689],{"class":576},[129,830,195],{"class":135},[129,832,833],{"class":252},"1000",[129,835,836],{"class":139},"))\n",[129,838,839,841,843,846,848],{"class":131,"line":310},[129,840,341],{"class":135},[129,842,359],{"class":135},[129,844,845],{"class":139}," row ",[129,847,365],{"class":135},[129,849,850],{"class":139}," result:\n",[129,852,853,856],{"class":131,"line":315},[129,854,855],{"class":135}," yield",[129,857,858],{"class":139}," row\n",[18,860,862],{"id":861},"advanced-bulk-transformations-with-ctes-and-joins","Advanced Bulk Transformations with CTEs and 
Joins",[23,864,866],{"id":865},"pre-fetching-relationship-graphs","Pre-Fetching Relationship Graphs",[14,868,869,870,428],{},"When bulk updates depend on related entities, post-insert hydration queries create severe N+1 bottlenecks. Pre-fetching relationship graphs via joined or subquery loading eliminates redundant round-trips. By aligning bulk operations with eager loading strategies, you can maintain referential integrity without sacrificing throughput. For optimized loading patterns, explore ",[92,871,873],{"href":872},"\u002Fadvanced-query-patterns-and-bulk-data-operations\u002Fcomplex-joins-and-relationship-loading-strategies\u002F","Complex Joins and Relationship Loading Strategies",[23,875,877],{"id":876},"hierarchical-data-migrations","Hierarchical Data Migrations",[14,879,880,881,34,884,887,888,428],{},"Complex data migrations often require dependency resolution before bulk updates can safely execute. Common Table Expressions (CTEs) allow staging intermediate results, computing topological sorts, and joining derived sets directly into ",[31,882,883],{},"UPDATE",[31,885,886],{},"DELETE"," statements. This approach pushes computation to the database engine, minimizing application-side sorting and memory overhead. 
For recursive dependency resolution patterns, consult ",[92,889,891],{"href":890},"\u002Fadvanced-query-patterns-and-bulk-data-operations\u002Fcommon-table-expressions-ctes-and-recursive-queries\u002F","Common Table Expressions (CTEs) and Recursive Queries",[120,893,895],{"className":122,"code":894,"language":124,"meta":125,"style":125},"from sqlalchemy import select, update, text, func\nfrom sqlalchemy.sql import cte\n\nasync def bulk_update_from_staging(\n engine: AsyncEngine,\n staging_table: Table,\n target_table: Table\n) -> None:\n \"\"\"Execute a CTE-driven bulk update targeting a derived staging set.\"\"\"\n staging_cte = select(staging_table).cte(\"staging_data\")\n \n update_stmt = (\n update(target_table)\n .where(target_table.c.id == staging_cte.c.source_id)\n .values(\n status=staging_cte.c.new_status,\n processed_at=func.now()\n )\n )\n \n async with engine.begin() as conn:\n await conn.execute(update_stmt)\n",[31,896,897,908,920,924,935,939,944,949,957,962,978,982,992,997,1007,1012,1022,1032,1036,1040,1044,1056],{"__ignoreMap":125},[129,898,899,901,903,905],{"class":131,"line":132},[129,900,136],{"class":135},[129,902,167],{"class":139},[129,904,143],{"class":135},[129,906,907],{"class":139}," select, update, text, func\n",[129,909,910,912,915,917],{"class":131,"line":149},[129,911,136],{"class":135},[129,913,914],{"class":139}," sqlalchemy.sql ",[129,916,143],{"class":135},[129,918,919],{"class":139}," cte\n",[129,921,922],{"class":131,"line":162},[129,923,179],{"emptyLinePlaceholder":178},[129,925,926,928,930,933],{"class":131,"line":175},[129,927,227],{"class":135},[129,929,230],{"class":135},[129,931,932],{"class":233}," bulk_update_from_staging",[129,934,237],{"class":139},[129,936,937],{"class":131,"line":182},[129,938,243],{"class":139},[129,940,941],{"class":131,"line":189},[129,942,943],{"class":139}," staging_table: Table,\n",[129,945,946],{"class":131,"line":201},[129,947,948],{"class":139}," target_table: 
Table\n",[129,950,951,953,955],{"class":131,"line":219},[129,952,277],{"class":139},[129,954,280],{"class":252},[129,956,283],{"class":139},[129,958,959],{"class":131,"line":224},[129,960,961],{"class":212}," \"\"\"Execute a CTE-driven bulk update targeting a derived staging set.\"\"\"\n",[129,963,964,967,969,972,975],{"class":131,"line":240},[129,965,966],{"class":139}," staging_cte ",[129,968,195],{"class":135},[129,970,971],{"class":139}," select(staging_table).cte(",[129,973,974],{"class":212},"\"staging_data\"",[129,976,977],{"class":139},")\n",[129,979,980],{"class":131,"line":246},[129,981,329],{"class":139},[129,983,984,987,989],{"class":131,"line":259},[129,985,986],{"class":139}," update_stmt ",[129,988,195],{"class":135},[129,990,991],{"class":139}," (\n",[129,993,994],{"class":131,"line":274},[129,995,996],{"class":139}," update(target_table)\n",[129,998,999,1002,1004],{"class":131,"line":286},[129,1000,1001],{"class":139}," .where(target_table.c.id ",[129,1003,788],{"class":135},[129,1005,1006],{"class":139}," staging_cte.c.source_id)\n",[129,1008,1009],{"class":131,"line":292},[129,1010,1011],{"class":139}," .values(\n",[129,1013,1014,1017,1019],{"class":131,"line":304},[129,1015,1016],{"class":576}," status",[129,1018,195],{"class":135},[129,1020,1021],{"class":139},"staging_cte.c.new_status,\n",[129,1023,1024,1027,1029],{"class":131,"line":310},[129,1025,1026],{"class":576}," 
processed_at",[129,1028,195],{"class":135},[129,1030,1031],{"class":139},"func.now()\n",[129,1033,1034],{"class":131,"line":315},[129,1035,645],{"class":139},[129,1037,1038],{"class":131,"line":326},[129,1039,645],{"class":139},[129,1041,1042],{"class":131,"line":332},[129,1043,329],{"class":139},[129,1045,1046,1048,1050,1052,1054],{"class":131,"line":338},[129,1047,341],{"class":135},[129,1049,344],{"class":135},[129,1051,347],{"class":139},[129,1053,350],{"class":135},[129,1055,353],{"class":139},[129,1057,1058,1060],{"class":131,"line":356},[129,1059,406],{"class":135},[129,1061,1062],{"class":139}," conn.execute(update_stmt)\n",[18,1064,1066],{"id":1065},"production-pitfalls-and-mitigation-strategies","Production Pitfalls and Mitigation Strategies",[1068,1069,1070,1086,1096,1102,1111,1124],"ol",{},[1071,1072,1073,1077,1078,1081,1082,1085],"li",{},[1074,1075,1076],"strong",{},"ORM Session Flush Overhead:"," Inserting >10k rows via ",[31,1079,1080],{},"session.add_all()"," triggers exponential latency due to identity map synchronization and relationship traversal. Switch to Core ",[31,1083,1084],{},"insert().values()"," for raw throughput.",[1071,1087,1088,1091,1092,1095],{},[1074,1089,1090],{},"Unbounded Transaction Scopes:"," Holding a single transaction open for millions of rows exhausts WAL space and causes table bloat. Implement explicit chunked commits with ",[31,1093,1094],{},"async with engine.begin()"," boundaries.",[1071,1097,1098,1101],{},[1074,1099,1100],{},"Deadlocks on Unsorted PK Ranges:"," Concurrent bulk updates hitting overlapping index ranges in random order cause lock contention. Sort payloads by primary key before chunking to enforce deterministic lock acquisition.",[1071,1103,1104,1107,1108,1110],{},[1074,1105,1106],{},"Memory Leaks from Unbounded Async Result Sets:"," Fetching entire result sets into memory without server-side cursors triggers OOM kills. 
Always pair large queries with ",[31,1109,689],{}," and explicit transaction scopes.",[1071,1112,1113,1119,1120,1123],{},[1074,1114,1115,1116,1118],{},"Silent Type Coercion in ",[31,1117,114],{},":"," Incorrect parameter binding can cause dialect-level silent casting failures. Validate payload schemas against SQLAlchemy ",[31,1121,1122],{},"Column.type"," definitions before execution.",[1071,1125,1126,1133,1134,1136,1137,1139,1140,1143],{},[1074,1127,1128,1129,1132],{},"Missing ",[31,1130,1131],{},"RETURNING"," Clause Optimization:"," Omitting ",[31,1135,1131],{}," forces redundant ",[31,1138,450],{}," queries for post-insert hydration. Use ",[31,1141,1142],{},"stmt.returning()"," to capture generated IDs or computed columns in a single round-trip.",[18,1145,1147],{"id":1146},"frequently-asked-questions","Frequently Asked Questions",[14,1149,1150,1153,1154,1156],{},[1074,1151,1152],{},"When should I bypass the ORM and use Core for bulk inserts?","\nUse Core when inserting >5,000 rows, when relationship hydration is unnecessary, or when leveraging native fast-execution paths like ",[31,1155,41],{}," prepared statements. The ORM's unit-of-work overhead becomes prohibitive at scale.",[14,1158,1159,1162,1163,1166,1167,1170,1171,1173],{},[1074,1160,1161],{},"How do I prevent asyncpg connection timeouts during massive batches?","\nImplement chunked execution with explicit ",[31,1164,1165],{},"COMMIT"," intervals, configure ",[31,1168,1169],{},"statement_timeout"," at the connection level, and use ",[31,1172,689],{}," to maintain steady I\u002FO pressure. Avoid holding cursors open across event loop yields without active transactions.",[14,1175,1176,1179,1180,1182,1183,1186],{},[1074,1177,1178],{},"Can I use yield_per with async SQLAlchemy sessions?","\nYes, SQLAlchemy 2.0 fully supports ",[31,1181,689],{}," in async contexts, but requires server-side cursor configuration and explicit transaction boundaries to prevent cursor invalidation. 
Ensure ",[31,1184,1185],{},"execution_options(yield_per=N)"," is applied to the statement before execution.",[14,1188,1189,1192,1193,1195],{},[1074,1190,1191],{},"What is the optimal chunk size for millions of rows?","\nTypically 1,000 to 5,000 rows per transaction, depending on row width, index count, and available WAL space. Profile heap usage and adjust based on ",[31,1194,419],{}," latency metrics. Start conservative and scale upward until WAL pressure or GC pauses appear.",[1197,1198,1199],"style",{},"html pre.shiki code .szBVR, html code.shiki .szBVR{--shiki-default:#D73A49;--shiki-dark:#F97583}html pre.shiki code .sVt8B, html code.shiki .sVt8B{--shiki-default:#24292E;--shiki-dark:#E1E4E8}html pre.shiki code .sJ8bj, html code.shiki .sJ8bj{--shiki-default:#6A737D;--shiki-dark:#6A737D}html pre.shiki code .sZZnC, html code.shiki .sZZnC{--shiki-default:#032F62;--shiki-dark:#9ECBFF}html pre.shiki code .sScJk, html code.shiki .sScJk{--shiki-default:#6F42C1;--shiki-dark:#B392F0}html pre.shiki code .sj4cs, html code.shiki .sj4cs{--shiki-default:#005CC5;--shiki-dark:#79B8FF}html .default .shiki span {color: var(--shiki-default);background: var(--shiki-default-bg);font-style: var(--shiki-default-font-style);font-weight: var(--shiki-default-font-weight);text-decoration: var(--shiki-default-text-decoration);}html .shiki span {color: var(--shiki-default);background: var(--shiki-default-bg);font-style: var(--shiki-default-font-style);font-weight: var(--shiki-default-font-weight);text-decoration: var(--shiki-default-text-decoration);}html .dark .shiki span {color: var(--shiki-dark);background: var(--shiki-dark-bg);font-style: var(--shiki-dark-font-style);font-weight: var(--shiki-dark-font-weight);text-decoration: var(--shiki-dark-text-decoration);}html.dark .shiki span {color: var(--shiki-dark);background: var(--shiki-dark-bg);font-style: var(--shiki-dark-font-style);font-weight: var(--shiki-dark-font-weight);text-decoration: var(--shiki-dark-text-decoration);}html 
pre.shiki code .s4XuR, html code.shiki .s4XuR{--shiki-default:#E36209;--shiki-dark:#FFAB70}",{"title":125,"searchDepth":149,"depth":149,"links":1201},[1202,1206,1210,1214,1218,1222,1223],{"id":20,"depth":149,"text":21,"children":1203},[1204,1205],{"id":25,"depth":162,"text":26},{"id":45,"depth":162,"text":46},{"id":99,"depth":149,"text":100,"children":1207},[1208,1209],{"id":103,"depth":162,"text":104},{"id":412,"depth":162,"text":413},{"id":431,"depth":149,"text":432,"children":1211},[1212,1213],{"id":435,"depth":162,"text":436},{"id":458,"depth":162,"text":459},{"id":671,"depth":149,"text":672,"children":1215},[1216,1217],{"id":675,"depth":162,"text":676},{"id":682,"depth":162,"text":683},{"id":861,"depth":149,"text":862,"children":1219},[1220,1221],{"id":865,"depth":162,"text":866},{"id":876,"depth":162,"text":877},{"id":1065,"depth":149,"text":1066},{"id":1146,"depth":149,"text":1147},"md",{},"\u002Fadvanced-query-patterns-and-bulk-data-operations\u002Fhigh-performance-bulk-inserts-and-updates",{"title":5,"description":16},"advanced-query-patterns-and-bulk-data-operations\u002Fhigh-performance-bulk-inserts-and-updates\u002Findex","K57vPpD1EWmunV-7orHRKRxUsMoDLTxQUclUbaFeulc",1778149144398]