何时使用 SELECT ... FOR UPDATE?

问题描述:

请帮助我理解 SELECT ... FOR UPDATE 背后的用例.

Please help me understand the use-case behind SELECT ... FOR UPDATE.

问题 1:以下是否是应该使用 SELECT ... FOR UPDATE 的好例子?

Question 1: Is the following a good example of when SELECT ... FOR UPDATE should be used?

给定:

  • 房间[id]
  • 标签[id, name]
  • room_tags[room_id, tag_id]
    • room_id 和 tag_id 是外键

    应用程序想要列出所有房间及其标签,但需要区分没有标签的房间和已删除的房间.如果不使用 SELECT ... FOR UPDATE,可能发生的情况是:

    The application wants to list all rooms and their tags, but needs to differentiate between rooms with no tags versus rooms that have been removed. If SELECT ... FOR UPDATE is not used, what could happen is:

    • 最初:
      • 房间包含 [id = 1]
      • 标签包含[id = 1, name = 'cats']
      • room_tags 包含 [room_id = 1, tag_id = 1]
      • 返回 [id = 1]
      • 返回一个空列表

      现在线程 1 认为房间 1 没有标签,但实际上房间已被移除.为了解决这个问题,线程1应该SELECT id FROM rooms FOR UPDATE,从而防止线程2从rooms中删除,直到线程1完成.对吗?

      Now Thread 1 thinks that room 1 has no tags, but in reality the room has been removed. To solve this problem, Thread 1 should SELECT id FROM rooms FOR UPDATE, thereby preventing Thread 2 from deleting from rooms until Thread 1 is done. Is that correct?

      问题 2:何时应该使用 SERIALIZABLE 事务隔离与 READ_COMMITTEDSELECT ... FOR UPDATE?

      Question 2: When should one use SERIALIZABLE transaction isolation versus READ_COMMITTED with SELECT ... FOR UPDATE?

      答案应该是可移植的(不是特定于数据库的).如果这不可能,请解释原因.

      Answers are expected to be portable (not database-specific). If that's not possible, please explain why.

实现房间和标签之间的一致性并确保房间在删除后永远不会返回的唯一可移植方法是使用 SELECT FOR UPDATE 锁定它们.

The only portable way to achieve consistency between rooms and tags and making sure rooms are never returned after they had been deleted is locking them with SELECT FOR UPDATE.

但是在某些系统中,锁定是并发控制的副作用,您无需显式指定 FOR UPDATE 即可获得相同的结果.

However in some systems locking is a side effect of concurrency control, and you achieve the same results without specifying FOR UPDATE explicitly.

为了解决这个问题,线程1应该SELECT id FROM rooms FOR UPDATE,从而防止线程2从rooms中删除,直到线程1完成.对吗?

To solve this problem, Thread 1 should SELECT id FROM rooms FOR UPDATE, thereby preventing Thread 2 from deleting from rooms until Thread 1 is done. Is that correct?

这取决于您的数据库系统使用的并发控制.

This depends on the concurrency control your database system is using.

  • MyISAMMySQL(和其他几个旧系统)中会在查询期间锁定整个表.

  • MyISAM in MySQL (and several other old systems) does lock the whole table for the duration of a query.

SQL Server 中,SELECT 查询在他们检查过的记录/页面/表上放置共享锁,而 DML 查询放置更新锁(后来被提升为独占锁或降级为共享锁).排他锁与共享锁不兼容,因此 SELECTDELETE 查询将锁定,直到另一个会话提交.

In SQL Server, SELECT queries place shared locks on the records / pages / tables they have examined, while DML queries place update locks (which later get promoted to exclusive or demoted to shared locks). Exclusive locks are incompatible with shared locks, so either SELECT or DELETE query will lock until another session commits.

在使用 MVCC 的数据库中(如 OraclePostgreSQLMySQLInnoDB),DML 查询会创建记录的副本(以一种或另一种方式),并且通常读取器不会阻止写入器,反之亦然.对于这些数据库,SELECT FOR UPDATE 会派上用场:它会锁定 SELECTDELETE 查询,直到另一个会话提交,就像 SQL Server 可以.

In databases which use MVCC (like Oracle, PostgreSQL, MySQL with InnoDB), a DML query creates a copy of the record (in one or another way) and generally readers do not block writers and vice versa. For these databases, a SELECT FOR UPDATE would come handy: it would lock either SELECT or the DELETE query until another session commits, just as SQL Server does.

什么时候应该使用 REPEATABLE_READ 事务隔离与 READ_COMMITTEDSELECT ... FOR UPDATE ?

When should one use REPEATABLE_READ transaction isolation versus READ_COMMITTED with SELECT ... FOR UPDATE?

一般来说,REPEATABLE READ 不禁止幻像行(在另一个事务中出现或消失的行,而不是被修改)

Generally, REPEATABLE READ does not forbid phantom rows (rows that appeared or disappeared in another transaction, rather than being modified)

  • Oracle 和更早的PostgreSQL 版本中,REPEATABLE READ 实际上是SERIALIZABLE 的同义词.基本上,这意味着事务在开始后不会看到所做的更改.所以在这个设置中,最后一个 Thread 1 查询将返回房间,就好像它从未被删除过一样(这可能是也可能不是你想要的).如果你不想在房间被删除后显示它们,你应该用 SELECT FOR UPDATE

  • In Oracle and earlier PostgreSQL versions, REPEATABLE READ is actually a synonym for SERIALIZABLE. Basically, this means that the transaction does not see changes made after it has started. So in this setup, the last Thread 1 query will return the room as if it has never been deleted (which may or may not be what you wanted). If you don't want to show the rooms after they have been deleted, you should lock the rows with SELECT FOR UPDATE

InnoDB 中,REPEATABLE READSERIALIZABLE 是不同的东西:SERIALIZABLE 模式下的读者设置next-key 锁定它们评估的记录,有效地防止它们上的并发 DML.因此,在可序列化模式下,您不需要 SELECT FOR UPDATE,但在 REPEATABLE READREAD COMMITED 中确实需要它们.

In InnoDB, REPEATABLE READ and SERIALIZABLE are different things: readers in SERIALIZABLE mode set next-key locks on the records they evaluate, effectively preventing the concurrent DML on them. So you don't need a SELECT FOR UPDATE in serializable mode, but do need them in REPEATABLE READ or READ COMMITED.

请注意,隔离模式标准确实规定您在查询中看不到某些怪癖,但没有定义如何(使用锁定或使用 MVCC 或其他方式).

Note that the standard on isolation modes does prescribe that you don't see certain quirks in your queries but does not define how (with locking or with MVCC or otherwise).

当我说你不需要SELECT FOR UPDATE"时,我真的应该加上因为某些数据库引擎实现的副作用".

When I say "you don't need SELECT FOR UPDATE" I really should have added "because of side effects of certain database engine implementation".