How to add an identity column to an existing database table that has a large number of rows

Question:

I have a database table with about 40,000,000 rows. I want to add an identity column to this table. How can I do it in a log-friendly manner?

When I do the following:

ALTER TABLE table_1
  ADD id INT IDENTITY

it just fills up the entire log space.

Is there any way to do it in a log-friendly manner? The database is on SQL Server 2008.

Thanks, Mohan.

The overall process will probably be a lot slower and incur more total locking overhead, but if you only care about transaction log size you could try the following.

  1. Add a nullable, non-identity integer column (a metadata-only change).
  2. Write code to populate this column with unique sequential integers in batches. This reduces the size of each individual transaction and keeps the log size down (assuming the simple recovery model). My code below does this in batches of 100. Hopefully you have an existing PK you can leverage to pick up where you left off, rather than relying on repeated scans that take increasingly long towards the end.
  3. Use ALTER TABLE ... ALTER COLUMN to mark the column as NOT NULL. This requires the entire table to be locked and scanned to validate the change, but does not require much logging.
  4. Use ALTER TABLE ... SWITCH to make the column an identity column. This is a metadata-only change.
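
As a rough illustration of the "pick up where you left off" idea in step 2, here is a minimal sketch that walks an existing key instead of rescanning for NULLs. It assumes a hypothetical INT primary key called pk_col on table_1 (the test table in this answer has no such column), and the batch size of 10000 is an arbitrary placeholder rather than a recommendation.

```sql
/* Sketch only: pk_col is a hypothetical INT primary key on table_1 */
DECLARE @BatchSize INT = 10000,        -- arbitrary; tune to your log capacity
        @LastKey   INT = -2147483648,  -- assumes no pk_col equals the minimum INT
        @NextId    INT = 0,            -- last id value handed out so far
        @Rows      INT = 1;
DECLARE @Done TABLE (pk_col INT NOT NULL);

WHILE @Rows > 0
BEGIN
    DELETE FROM @Done;

    /* Seek past the last processed key and number the next batch */
    WITH T AS (SELECT TOP (@BatchSize)
                      id, pk_col,
                      ROW_NUMBER() OVER (ORDER BY pk_col) + @NextId AS new_id
               FROM   dbo.table_1
               WHERE  pk_col > @LastKey
               ORDER BY pk_col)
    UPDATE T
    SET    id = new_id
    OUTPUT inserted.pk_col INTO @Done (pk_col);

    SET @Rows = @@ROWCOUNT;

    IF @Rows > 0
    BEGIN
        /* Remember where this batch ended so the next one starts after it */
        SELECT @LastKey = MAX(pk_col) FROM @Done;
        SET @NextId = @NextId + @Rows;
    END
END
```

Each UPDATE still commits on its own, so the log behaviour should be similar to the loop in the example code; the difference is that every batch can start from an index seek on pk_col rather than searching the table for rows where id IS NULL. If you use this variant, @NextId plays the role of @Counter when choosing the identity seed in step 4.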

Example code below:

/*Set up test table with just one column*/

CREATE TABLE table_1 ( original_column INT )
INSERT  INTO table_1
        SELECT DISTINCT
                number
        FROM    master..spt_values



/*Step 1 */
ALTER TABLE table_1 ADD id INT NULL



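/*Step 2: assign sequential ids in batches of 100.
  Each UPDATE commits on its own, so individual transactions stay small.
  ORDER BY @@SPID is just a constant ordering, as no particular row order is needed.
  The loop ends when a batch updates no rows. */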
DECLARE @Counter INT = 0 ,
    @PrevCounter INT = -1

WHILE @PrevCounter <> @Counter 
    BEGIN
        SET @PrevCounter = @Counter;
        WITH    T AS ( SELECT TOP 100
                                * ,
                                ROW_NUMBER() OVER ( ORDER BY @@SPID )
                                + @Counter AS new_id
                       FROM     table_1
                       WHERE    id IS NULL
                     )
            UPDATE  T
            SET     id = new_id
        SET @Counter = @Counter + @@ROWCOUNT
    END


BEGIN TRY;
    BEGIN TRANSACTION ;
     /*Step 3 */
    ALTER TABLE table_1 ALTER COLUMN id INT NOT NULL

    /*Step 4 */
    DECLARE @TableScript NVARCHAR(MAX) = '
    CREATE TABLE dbo.Destination(
        original_column INT,
        id INT IDENTITY(' + CAST(@Counter + 1 AS VARCHAR) + ',1)
        )

        ALTER TABLE dbo.table_1 SWITCH TO dbo.Destination;
    '       

    EXEC(@TableScript)


    DROP TABLE table_1 ;

    EXECUTE sp_rename N'dbo.Destination', N'table_1', 'OBJECT' ;


    COMMIT TRANSACTION ;
END TRY
BEGIN CATCH
    IF XACT_STATE() <> 0 
        ROLLBACK TRANSACTION ;
    PRINT ERROR_MESSAGE() ;
END CATCH ;
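
A few optional checks that may be useful around the script above; this is only a sketch and assumes the table lives in the dbo schema of the current database. The first two queries can be run before and during step 2 to confirm the recovery model and watch log usage, and the remaining ones after step 4 to confirm the result.

```sql
/* Recovery model and what, if anything, is preventing log reuse */
SELECT name, recovery_model_desc, log_reuse_wait_desc
FROM   sys.databases
WHERE  name = DB_NAME();

/* Log size and percentage used for every database on the instance */
DBCC SQLPERF(LOGSPACE);

/* After step 4: is id now an identity column, and what is its seed? */
SELECT COLUMNPROPERTY(OBJECT_ID(N'dbo.table_1'), 'id', 'IsIdentity') AS is_identity,
       IDENT_SEED(N'dbo.table_1') AS identity_seed;

/* Both counts should match if every row received a unique id
   (this scans the whole table, so it is not free on 40 million rows) */
SELECT COUNT(*) AS total_rows,
       COUNT(DISTINCT id) AS distinct_ids
FROM   dbo.table_1;
```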