Change Column Types in a Huge Table

Change column types in a huge table

Depending on what change you are making, sometimes it can be easier to take a maintenance window. During that window (where nobody should be able to change the data in the table) you can:

  1. drop any indexes/constraints pointing to the old column, and disable triggers
  2. add a new nullable column with the new data type (even if it is meant to be NOT NULL)
  3. update the new column setting it equal to the old column's value (and you can do this in chunks of individual transactions (say, affecting 10000 rows at a time using UPDATE TOP (10000) ... SET newcol = oldcol WHERE newcol IS NULL) and with CHECKPOINT to avoid overrunning your log)
  4. once the updates are all done, drop the old column
  5. rename the new column (and add a NOT NULL constraint if appropriate)
  6. rebuild indexes and update statistics

The key here is that it allows you to perform the update incrementally in step 3, which you can't do in a single ALTER TABLE command.

This assumes the column is not playing a major role in data integrity - if it is involved in a bunch of foreign key relationships, there are more steps.

EDIT

Also, and just wondering out loud, I haven't done any testing for this (but adding it to the list). I wonder if page + row compression would help here? If you change an INT to a BIGINT, with compression in place SQL Server should still treat all values as if they still fit in an INT. Again, I haven't tested if this would make an alter faster or slower, or how much longer it would take to add compression in the first place. Just throwing it out there.

Change huge table PK column data type

In addition to KLE's suggestion, the following queries might help:

To disable all constraints on the tables that reference oldTable try to execute the output of the following query:

SELECT 'ALTER TABLE ' + OBJECT_NAME(fk.parent_object_id) + ' NOCHECK CONSTRAINT ' + fk.name
FROM sys.foreign_keys fk
INNER JOIN sys.foreign_key_columns AS fkc ON fk.OBJECT_ID = fkc.constraint_object_id
WHERE OBJECT_NAME (fk.referenced_object_id) = 'oldTable'

To move all data into the new table, with alteration of the field try this:

INSERT INTO newTable
SELECT CONVERT(BIGINT, ID) AS ID, COL1, COL2, ..., COLN
FROM oldTable

To drop the old table:

DROP TABLE oldTable

To rename the new table to the old name:

sp_rename newTable, oldTable

To reenable all the constraints on the tables that reference oldTable, try to execute the output of the following query:

SELECT 'ALTER TABLE ' + OBJECT_NAME(fk.parent_object_id) + ' CHECK CONSTRAINT ' + fk.name
FROM sys.foreign_keys fk
INNER JOIN sys.foreign_key_columns AS fkc ON fk.OBJECT_ID = fkc.constraint_object_id
WHERE OBJECT_NAME (fk.referenced_object_id) = 'oldTable'

Hope it helps...

Changing a datatype in a table without reaching the transaction log limits

You might find it faster to create a new table, truncate the old, and then re-insert:

select t.*
into temp_my_table
from my_table t;

truncate table my_table; -- back it up first!

alter table mytable alter column comment nvarchar(max); -- no need to be cheap here

insert into mytable
select t.*
from t;

One caveat: You may have to pay attention if you have insert triggers or identity columns.

SQL Server performance for alter table alter column change data type

Coincidentally, I had to do something very similar about 3 hours ago. The table was 35m rows, it is fairly wide, and it was taking forever to just do this:

alter table myTable add myNewColumn int not null default 0;

Here's what what I ended up going with:

alter table myTable add myNewColumn int null;

while 1=1
begin
update top (100000) myTable
set
myNewColumn = 0
where
myNewColumn is null;

if @@ROWCOUNT = 0 break;
end

alter table myTable alter column myNewColumn int not null;
alter table myTable add constraint tw_def_myNewColumn default (0) for myNewColumn;

This time around, the alter table statements were near-instant. It took about 7-8 minutes (on a slow server) to do the update batches. I'm speculating that SQL Server was generating undo in my original query to restore the values, but I didn't expect that starting off.

Anyway, in your case, maybe something similar would help. You could try adding a new bigint column, update the new column in batches, then set the constraints on it.

Strategies to modify huge database

After many tests and backups, we finally used the following aproach:

  1. Create a new column [columnName_NEW] with the desired format change. Allow NULLS
  2. Create a trigger for INSERTS to update the new column with the value in the column to be replaced
  3. Copy the old column value to the new column by batches
    This operation is very time consuming. We ran a batch every day in a maintenance window (2h during 4 days). Our batch filled the values taking oldest rows first, we counted on the trigger filling the new ones
  4. Once #3 is complete, don't allow NULLS anymore on the new column, but set a default value to avoid the INSERT trigger to crash
  5. Create all the needed indexes and views on the new column. This is very time consuming but can be done ONLINE
  6. Allow NULLS on the old column
  7. Remove the insert trigger - start downtime now!
  8. Rename the old column to [columnName_OLD], the new to [columnName]. This requires few downtime seconds!

--> You can consider it is finally done!


  1. After some safe time, you can backup the result and remove [columnName_OLD] with all of its dependencies

I selected the other answer, because I think it could be also useful in most situations. This one has more steps but has a very little downtime and is reversible at any step but the last.

Unable to alter column data type in Big Query

Unfortunately, as far as I know there is no way to convert data type from STRING to DATE using ALTER TABLE but to create it again with the schema you want.

CREATE OR REPLACE TABLE testset.tbl AS 
SELECT 'a' AS col1, '2022-05-16' AS create_date
UNION ALL
SELECT 'a' AS col1, '2022-05-14' AS create_date
;

-- ALTER TABLE testset.tbl ALTER COLUMN create_date SET DATA TYPE DATE;

-- Query error: ALTER TABLE ALTER COLUMN SET DATA TYPE requires that
-- the existing column type (STRING) is assignable to the new type (DATE) at [7:25]

-- Create it again.
CREATE OR REPLACE TABLE testset.tbl AS
SELECT * REPLACE(SAFE_CAST(create_date AS DATE) AS create_date)
FROM testset.tbl;


Related Topics



Leave a reply



Submit