MySQL performance: unique varchar field vs. unique bigint
I'm working on an application that will use a hex value as a business key (in addition to an auto-increment field as the primary key), similar to the URL ID seen in Gmail. I will add a unique constraint to the column. I was originally thinking of storing the value as a bigint to avoid searching a varchar field, but I'm wondering whether that's necessary if the field is unique.
Internal joins would be done using the auto-increment field, and the hex value would be used in the WHERE clause for filtering.
What sort of performance hit would there be in simply storing the value as a varchar(x), or perhaps a char(x), compared with the additional work of converting to and from hex in order to store the value as an integer in the database? Is it worth the additional complexity?
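For reference, the conversion in question can be quite small on the application side. The following is a minimal sketch in Python, assuming the key is at most 16 hex digits so that it fits in a MySQL BIGINT UNSIGNED column; the helper names `hex_to_int` and `int_to_hex` are hypothetical and not part of any library.

```python
# Hypothetical helpers for converting a hex business key to and from an integer.
# Assumption: the column is BIGINT UNSIGNED, whose maximum value is 2**64 - 1.
UNSIGNED_BIGINT_MAX = 2**64 - 1


def hex_to_int(hex_key: str) -> int:
    """Parse a hex string and check that it fits in BIGINT UNSIGNED."""
    value = int(hex_key, 16)
    if not 0 <= value <= UNSIGNED_BIGINT_MAX:
        raise ValueError("key does not fit in BIGINT UNSIGNED")
    return value


def int_to_hex(value: int, width: int = 16) -> str:
    """Format an integer as a zero-padded, lowercase hex string."""
    if not 0 <= value <= UNSIGNED_BIGINT_MAX:
        raise ValueError("value does not fit in BIGINT UNSIGNED")
    return format(value, f"0{width}x")


if __name__ == "__main__":
    key = "13c4a1f0b2d9e877"      # made-up example key
    stored = hex_to_int(key)      # value written to the bigint column
    print(stored, int_to_hex(stored))
```

Zero-padding to a fixed width keeps the displayed key the same length regardless of its numeric value, which matters if the hex form appears in URLs.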
I did a quick test on a small number of rows (50k) and saw similar search times for both. If there is a large performance issue, would it grow linearly or exponentially with the number of rows?
I'm using InnoDB as the engine.
Is your hex value a GUID? Although I used to worry about the performance of such long values as index keys, I have found that on modern databases the performance difference, even with millions of records, is fairly insignificant.
A potentially larger problem is the memory that the index consumes (a 16-byte key vs. a 4-byte int, for example), but on servers that I control I can allocate memory for that. As long as the index fits in memory, I find that other operations add enough overhead that the size of the index element doesn't make a noticeable difference.
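To put rough numbers on the memory point, here is a back-of-the-envelope sketch in Python. It counts raw key bytes only and ignores InnoDB page overhead, fill factor, and the primary key value stored alongside each secondary index entry, so real indexes will be larger. The row count, the `raw_key_bytes` helper, and the 32-byte figure for CHAR(32) (which assumes a single-byte character set) are illustrative assumptions.

```python
def raw_key_bytes(rows: int, key_bytes: int) -> int:
    """Raw bytes used by the key values alone (no page or row overhead)."""
    return rows * key_bytes


if __name__ == "__main__":
    rows = 10_000_000  # assumed table size, for illustration only
    key_sizes = (
        ("INT", 4),
        ("BIGINT", 8),
        ("BINARY(16)", 16),   # a GUID stored as raw bytes
        ("CHAR(32)", 32),     # a GUID stored as hex text, single-byte charset
    )
    for label, size in key_sizes:
        mib = raw_key_bytes(rows, size) / 2**20
        print(f"{label:<11} {mib:8.1f} MiB")
```

Under these assumptions the key data stays in the tens to low hundreds of MiB even at ten million rows, which is consistent with the point that the index can usually be kept in memory.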
On the upside, if you use a GUID you gain server independence for the records you create and more flexibility in merging data from multiple servers (which is something I care about, as our system aggregates data from child systems).
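As an illustration of that server independence, a GUID can be generated in the application without asking the database for the next value. This is a minimal sketch using Python's standard `uuid` module; the choice of a random (version 4) UUID and the variable names are assumptions, not something the setup above requires.

```python
import uuid

# Generate a random (version 4) UUID locally; no database round trip is needed.
guid = uuid.uuid4()

as_text = guid.hex      # 32 hex characters, suitable for a CHAR(32) column
as_bytes = guid.bytes   # 16 raw bytes, suitable for a BINARY(16) column

# The compact 16-byte form converts back to the same UUID.
restored = uuid.UUID(bytes=as_bytes)

if __name__ == "__main__":
    print(as_text, len(as_bytes), restored == guid)
```

Storing the 16-byte form rather than the 32-character text halves the key size, at the cost of converting whenever the value is displayed.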
There is a graph in this article that seems to back up my suspicion: Myths, GUID vs Autoincrement