MySQL: is out

2018-07-02 10:07:03

One of the main reasons given for using an auto-increment PK in MySQL is that it guarantees all inserts into the clustered PK index will be in order and hence fast. I understand that.

But what about secondary indexes? Say my table has a secondary index. Inserts will be in-order with respect to the PK clustered index, but out-of-order with respect to the secondary index B+ Tree.

So wouldn't the inserts still be slow because MySQL needs to be constantly re-arranging the secondary index B+ Tree as inserts are coming in?

I just wondered if using auto-increment here really is buying me anything in terms of insert performance. Would greatly appreciate some clarifications here.

The primary key will be clustered (in InnoDB), which means that the index leaf pages hold the full rows themselves. Having to rearrange that data means that full records must be moved around. A secondary index, by contrast, is really just a bunch of small entries: the indexed column values plus a reference back to the row (in InnoDB, the primary key value). The secondary index has nothing to do with the physical ordering of the records, so having to shift entries around in a secondary index is just that, moving small entries. This is a much faster operation than having to move full records.
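To put rough numbers on that size difference, here is a back-of-the-envelope sketch. The 16 KB page size is InnoDB's default; the row size and the secondary entry size are made-up figures for illustration, and page headers and fill factor are ignored.

```python
PAGE_SIZE = 16 * 1024  # InnoDB's default page size, in bytes


def entries_per_page(entry_bytes: int, page_size: int = PAGE_SIZE) -> int:
    """Rough number of entries that fit in one page (ignores headers and fill factor)."""
    return page_size // entry_bytes


# Both sizes below are assumptions for illustration, not measured values.
ROW_BYTES = 200        # hypothetical average size of a full row
SEC_ENTRY_BYTES = 16   # hypothetical: 8-byte indexed column + 8-byte BIGINT PK

rows_per_page = entries_per_page(ROW_BYTES)        # 81
sec_per_page = entries_per_page(SEC_ENTRY_BYTES)   # 1024

print(f"full rows per page:        {rows_per_page}")
print(f"secondary entries per page: {sec_per_page}")
```

Under these assumptions a secondary index page holds roughly twelve times as many entries as a clustered page holds rows, so an out-of-order insert there disturbs far fewer bytes.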

Your basic assumption is only true if you never delete from the table (an insert-only, or at least insert-and-update-only, table). If you are deleting records, the PKs for new records will be inserted non-sequentially (physically).

Efficiency of index inserts is almost always a secondary consideration and messing with it is a premature optimization antipattern. Have you considered the typically more significant issues of cardinality, key field lengths, cache sizes, etc.?

Using auto-increment surrogate PKs is usually suboptimal in the first place: there's usually a more useful unique key with real values that cluster in more meaningful ways. (And you can only cluster with InnoDB tables. You realize that, right?)

"Clustering" means the index essentially is the table. So it has a benefit when inserting with a surrogate key: the next index value is always higher than any previous one, so everything gets added to the end of the table (as you already know).

Unless you're filling holes created by deleted records. This may happen indirectly, but it can be an overhead issue because entire records must be relocated, which is self-evidently more work than just moving index key values and pointers.
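The difference between appending at the end and inserting into the middle can be seen in a toy model of an index leaf level. This is only a sketch, not InnoDB's or MyISAM's actual algorithm: the page capacity of 64 keys, the split-in-half rule, and the "open a fresh page when appending at the far right" shortcut are all simplifying assumptions.

```python
import bisect
import random


def simulate_inserts(keys, capacity=64):
    """Insert distinct keys into a toy leaf level.

    Returns (mid_page_splits, page_count, average_fill_ratio).
    """
    pages = [[]]              # each page is a sorted list of keys
    lows = [float("-inf")]    # lower bound of each page, kept in step with pages
    mid_splits = 0
    for key in keys:
        i = bisect.bisect_right(lows, key) - 1   # page whose range covers key
        page = pages[i]
        if len(page) < capacity:
            bisect.insort(page, key)
            continue
        if i == len(pages) - 1 and key > page[-1]:
            # Appending past the last key: start a new page, nothing moves.
            pages.append([key])
            lows.append(key)
            continue
        # Full page hit in the middle: move half of its keys to a new page.
        mid = capacity // 2
        right = page[mid:]
        del page[mid:]
        pages.insert(i + 1, right)
        lows.insert(i + 1, right[0])
        mid_splits += 1
        bisect.insort(right if key >= right[0] else page, key)
    avg_fill = sum(len(p) for p in pages) / (len(pages) * capacity)
    return mid_splits, len(pages), avg_fill


N = 10_000
sequential = list(range(N))          # like an auto-increment PK
shuffled = sequential[:]
random.Random(0).shuffle(shuffled)   # like values arriving in a secondary index

seq_splits, seq_pages, seq_fill = simulate_inserts(sequential)
rnd_splits, rnd_pages, rnd_fill = simulate_inserts(shuffled)
print(f"sequential: splits={seq_splits} pages={seq_pages} fill={seq_fill:.2f}")
print(f"random:     splits={rnd_splits} pages={rnd_pages} fill={rnd_fill:.2f}")
```

In this model, sequential keys never trigger a mid-page split and leave pages almost full, while random-order keys split pages repeatedly and leave them partly empty. What the model does not show is the cost per split, which depends on how large each entry is.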

Clustered records don't provide much benefit for queries for single records so much as for ranges of records (e.g. items for an order, a customer, a user). If you can pick up several (or several hundred) records for the same user, for instance, that's worth clustering for. It's much less likely that records will be contiguously inserted for a single user (in most scenarios), so clustering chronologically doesn't help much. But your requirements may differ.
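A quick way to see the range-query point is to count how many pages hold one user's rows under each clustering order. This is a sketch with invented numbers: 100,000 rows, 1,000 users assigned at random, and 80 rows per page are all assumptions, and `pages_touched` is a hypothetical helper, not anything MySQL exposes.

```python
import random


def pages_touched(rows, cluster_key, user_id, rows_per_page=80):
    """Count distinct pages holding user_id's rows when rows are stored sorted by cluster_key."""
    ordered = sorted(rows, key=cluster_key)
    return len({
        pos // rows_per_page
        for pos, (_, uid) in enumerate(ordered)
        if uid == user_id
    })


rng = random.Random(0)
# Each row is (id, user_id); ids arrive in order, users arrive interleaved.
rows = [(i, rng.randrange(1000)) for i in range(100_000)]

# Clustered on the auto-increment id: one user's rows are spread over the table.
by_id = pages_touched(rows, lambda r: r[0], user_id=42)
# Clustered on (user_id, id): one user's rows sit next to each other.
by_user = pages_touched(rows, lambda r: (r[1], r[0]), user_id=42)

print(f"pages read for user 42, clustered by id:            {by_id}")
print(f"pages read for user 42, clustered by (user_id, id): {by_user}")
```

With chronological clustering nearly every row for the user lands on a different page, while clustering on the user key packs them into a handful of adjacent pages. The price is that inserts are no longer append-only.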

You didn't specify InnoDB, so I answered primarily for MyISAM (the default before MySQL 5.5), where only an auto-increment or chronological index would simulate clustering, since there's no explicit option.

Link: http://www.djcxy.com/p/90378.html
