Execution Plan Cost Model

Parent: SQL Server Cost Based Optimizer


Microsoft SQL Server has come a long way over 30 years. A highly desired goal is a product that can be used by knowledge workers whose expertise is in their data without too much assistance from specialists in the internal workings of the underlying database engine. The key to this objective is to bring software intelligence to fully utilize the power of modern computer systems. The Adaptive Join feature that was introduced in version 2017 is now part of the Intelligent Query Processing feature family in 2019 does much in this regard. But this effort is hampered by a weakness in one of the cornerstone foundations of the query optimizer.

Of the many nearly intractable challenges in query optimization, the least difficult is a model for the cost of operations in the execution plan. Yet it is this aspect that is highly in need of overhaul. The existing cost model has been in-place since version 2005 and that being an adjustment of the one used since version 7. It is based on the assumption that leaf level page access involves disk IO. Some of the details of the execution plan model in the parent page SQL Server Cost Based Optimizer

In brief, the cost model is largely an expression of the difference between random and sequential IO characteristics of hard drives. The basic from SQL Server version 2005 and forward is the following.

  The initial page access is assumed to have random IO access with cost 0.003125,
  corresponding to a storage system capable of 320 IOPS (1/320 = 0.003125).

  IndexSeekCL   IndexSeekNC

  The cost of accessing successive pages is modeled as sequential IO with cost 0.00074074 per page,
  corresponding to 1350 pages per sec (1/1350 = 0.00074074), or 10,800KB/s (10.5MB/s).


The Key Lookup for a single row has the same cost structure as for a single row index seek.

  KeyLookup   RID

In a query plan having multiple executes of the Key Lookup component, the Estimate I/O Cost is shown as 0.003125, representing a single execution.


The Subtree cost is a multiple of the IO cost based on the estimate of how likely a leaf level pages was previously accessed in this specific operation and whether it was flushed out or not, plus the single execute CPU cost of 0.0001581 times the full Estimated Number of Executions.


In the above, the Subtree Cost is 12.0854. We can subtract 0.0001581 x 4258.68 for the CPU contribution, leaving 11.412. This is 3651.87 times 0.003125, meaning the model says IO costs are incurred for 85.75% of executes.


The scan operation is much simpler. The above table has 13500 pages, the first page contributes 0.003125, and each of the remaining 13499 pages contributes 1/1350 for a total IO of 10.002384.





The CPU for each additional row is 0.0000011. Note that this value is smaller than the successive page cost by a factor 673. The typical table might have 50-200 rows per page. What this means is that the full cost of a scan is largely in the IO portion. A very narrow index, 4 byte key + 4 byte cluster key could have 579 rows per page, but this is likely an extreme case.

The net result of the SQL Server cost model is that the execution plan at DOP 1 shifts from an Index Seek + Key Lookup to a Scan when the number of rows requiring a key lookup is about 31.6% the number of pages. Alternatively, the model has a Key Lookup as approximately equal in cost to scanning 3.16 pages. The shift point comes sooner at higher DOP.

The assumption of disk IO was reasonably valid in the very long ago past, when system memory was tens of MB's. It was not really so at tens of GB's. For a long time now, most properly tuned SQL environments operate with active data largely in memory. The cost model of execution plan operations for data in memory is very different. Even when there is IO, now that data is on flash storage, the random versus sequential IO characteristics are no longer particularly important.

Actual Cost Structure for Data in Buffer Cache

This is potentially an overly complicated topic. However, consider that the objective for the cost model is to determine the best data access method and join order. We do not need to have a complete model for every aspect of query execution. In particular, we need not be concerned with the cost of logic that must be processed regardless of the execution plan unless the number of executions necessary is impacted.

Variabilities in query cost include the lock level, from Row, Page, Table and NOLOCK. Consider when SQL Server default changes from row to table lock. This was possibly the reason for the anomaly in versions 7 and 2000 for the very high cost assigned to the first page of a scan. The higher isolation levels beyond Read Committed also have high cost, for which scans should be avoided as much as possible.

The contents of the processor cache is a factor. Operations that work on a set of data that fits within the L2 will have very low cost for each subsequent operation. Note that for a substantial period from 2010 to 2016, Intel held the L2 cache fixed at 256K. The Intel Xeon Scalable processors, however, have 1M L2. Also, the Meltdown and Spectre vulnerabilities were patched with fixes that appear to flush L1 (and L2?) caches at certain points?

Regardless of the complexities, my preference is to adopt a simple model under conservative performance assumptions. The apparent pattern is the initial part of a multi-row or multiple number of executes operation has a much higher cost than the subsequent rows or executions within an execution plan operation. For the cost model, we are more interested in the cost of incremental rows and executes because first steps must be completed in any execution plan. It is the cost of subsequent elements that determines the optimal plan

As a very rough model, the preliminary proposal for data already residing in the buffer cache is a page access cost of approximately 0.8µs, a row access cost of 0.05 µs per row. The cost of either a key lookup to a heap appears to be about 2.5 µs per execute for row lock, and lower if a coarser lock is in effect.

The cost of key lookups to a clustered index appears to be 4 µs per execute at lower counts, falling to 2 per execute at higher counts. This should be similar to a loop join inner source execute.

Key Lookup ‐ Scan Cross-over

Now consider the implications of the actual cost structure versus the existing IO based cost model. As stated earlier, the SQL Server query optimizer thinks the Key Lookup execute is about 3X higher than a Scan of one page.

Suppose our table has 80 rows page, or an average of 100 bytes per row. The cost of each page scanned is 0.8 µs for the page plus 80 x 0.05 µs for the rows = 4.8 µs combined. (This translates to a data consumption rate of 1.6GB/s which might seem high. In a query aggregating several column, data consumption rate might be several hundred MB/s due to the cost of the logic.)

Assume the high row count cost is in effect for the key lookup, or approximately 2 µs per row, which is lower, not higher than a scan of one page.

The query optimizer gives up on the index seek + key lookup, switching to a scan too soon by a factor of 6. In other words, SQL Server significantly underestimates the cost of a scan, more so for narrow indexes than fat tables, and overestimates the cost for Key Lookups.


There are serious discrepancies between the SQL Server execution plan cost model and the true cost of operations for data in memory. Not having a reasonably accurate model places serious limitations on what can be done with software intelligence working with such inaccuracies. Even more serious are the limitations with the existing model of parallel execution plan operations. The result is that the execution plan provides very little insight on expected scaling of parallel execution.