How to Add a Column to Large SQL Server Table

How do I add a column to a large SQL Server table?

ALTER TABLE table1 ADD
newcolumn int NULL
GO

should not take that long. What takes a long time is inserting a column in the middle of the existing columns, because then the engine has to create a new table, copy the data into it, and drop the old one (see the sketch below).
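For context, this is roughly what a "reorder columns" script (for example, one generated by the SSMS table designer) ends up doing; the table and column names here are illustrative:

-- A minimal sketch, assuming an illustrative table dbo.table1 (Id, ColA, ColB)
-- and a new column that must sit between ColA and ColB.
CREATE TABLE dbo.Tmp_table1
(
    Id        int NOT NULL,
    ColA      int NULL,
    newcolumn int NULL,   -- the "middle" column
    ColB      int NULL
);

INSERT INTO dbo.Tmp_table1 (Id, ColA, ColB)
SELECT Id, ColA, ColB FROM dbo.table1;   -- full copy of every row

DROP TABLE dbo.table1;
EXEC sp_rename 'dbo.Tmp_table1', 'table1';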

Adding column with default value to large table

Thanks Aaron for your detailed approach, but I did a quick test and a simpler approach is the following.

Some background: I'm adding a CompanyID column to an existing large table. The ID refers to the company the record belongs to, and its default value is 0. But since this change is going into an existing customer's production database, their company ID is 1. We have a generic upgrade script for all our clients, and it turns out a slight modification to that script for this specific customer yields a significant performance improvement.

INSTEAD OF:

ALTER TABLE myTable ADD CompanyID int NOT NULL CONSTRAINT DF_Constraint DEFAULT 0;  -- takes about 1 min to complete
UPDATE myTable SET CompanyID = 1;                                                    -- takes over an hour

I JUST DO THIS:

ALTER TABLE myTable ADD CompanyID int NOT NULL CONSTRAINT DF_Constraint DEFAULT 1;  -- takes about 1 min to complete

Then just set the default value back to 0 (as sketched below). Now the table will have CompanyID = 1 for all records. BOOM!
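For completeness, a minimal sketch of putting the default back to 0 afterwards; in SQL Server that means dropping and re-creating the default constraint (the constraint name DF_Constraint is taken from the snippet above):

-- Switch the default back to 0 for future inserts; existing rows keep CompanyID = 1.
ALTER TABLE myTable DROP CONSTRAINT DF_Constraint;
ALTER TABLE myTable ADD CONSTRAINT DF_Constraint DEFAULT 0 FOR CompanyID;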

Add a new column to big database table

Yes, it is eminently doable.

Adding a column that allows NULL and has no default value does not require a long-running lock, because no data has to be written to the existing rows.

If you supply a default value, then SQL Server has to go and update each record in order to write that new column value into the row.

How it works in general:

+---------------------+------------------------+-----------------------+
| Column is Nullable? | Default Value Supplied | Result                |
+---------------------+------------------------+-----------------------+
| Yes                 | No                     | Quick Add (caveat)    |
| Yes                 | Yes                    | Long running lock     |
| No                  | No                     | Error                 |
| No                  | Yes                    | Long running lock     |
+---------------------+------------------------+-----------------------+
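In T-SQL terms, the four rows of that table map roughly to the statements below (dbo.BigTable and the column/constraint names are illustrative; exact timing also depends on your SQL Server version and edition, so test it):

-- Nullable, no default: effectively a metadata-only change, returns quickly.
ALTER TABLE dbo.BigTable ADD NewColA int NULL;

-- Nullable, with a default applied to existing rows: the value has to be written into every row.
ALTER TABLE dbo.BigTable ADD NewColB int NULL
    CONSTRAINT DF_BigTable_NewColB DEFAULT 0 WITH VALUES;

-- NOT NULL, no default: fails on a table that already contains rows.
ALTER TABLE dbo.BigTable ADD NewColC int NOT NULL;

-- NOT NULL, with a default: every existing row receives the default value.
ALTER TABLE dbo.BigTable ADD NewColD int NOT NULL
    CONSTRAINT DF_BigTable_NewColD DEFAULT 0;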

The caveat bit:

I can't remember off the top of my head what happens when you add a column that causes the size of the NULL bitmap to be expanded. I'd like to say that the NULL bitmap represents the nullability of all the columns currently in the row, but I can't put my hand on my heart and say that's definitely true.

Edit -> @MartinSmith pointed out that the NULL bitmap will only expand when the row is changed, many thanks. However, as he also points out, if the size of the row expands past the 8,060-byte limit in SQL Server 2012 then a long-running lock may still be required. Many thanks * 2.

Second caveat:

Test it.

Third and final caveat:

No really, test it.

Add column to a huge table

My preference for tables of this size is to create a new table and then batch the records into it (BCP, Bulk Insert, SSIS, whatever you like); a sketch of the batching loop follows below. This may take longer, but it keeps your log from blowing out. You can also copy the most relevant data (say, the last 30 days) first, swap out the table, then batch in the remaining history so that you can take advantage of the new column immediately...if your application lines up with that strategy.
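A minimal T-SQL sketch of that batching approach, assuming an illustrative table dbo.BigTable with an ascending integer key Id and a pre-created dbo.BigTable_New that already includes the extra column:

DECLARE @BatchSize int = 50000,
        @LastId    int = 0,
        @MaxId     int;

SELECT @MaxId = MAX(Id) FROM dbo.BigTable;

WHILE @LastId < @MaxId
BEGIN
    -- Copy one key range at a time to keep each transaction (and the log) small.
    INSERT INTO dbo.BigTable_New (Id, Col1, Col2, NewColumn)
    SELECT Id, Col1, Col2, NULL          -- populate NewColumn however you need
    FROM dbo.BigTable
    WHERE Id > @LastId AND Id <= @LastId + @BatchSize;

    SET @LastId = @LastId + @BatchSize;
END

-- When the copy is done, swap the tables during a short outage window.
EXEC sp_rename 'dbo.BigTable', 'BigTable_Old';
EXEC sp_rename 'dbo.BigTable_New', 'BigTable';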

How to optimally add columns to a table of large size?

Don't worry too much about the computing power needed to add columns - it's mostly a metadata (logical) change, not a physical rewrite of the data.

I did this test on a table with 8.4M rows:


create table clone1
clone my_big_table;

alter table clone1
add column justtest NUMBER(38,0);

-- adding a column was a quick operation, just metadata probably (210ms)

create table clone2
clone my_big_table;

alter table clone2
add column justtest NUMBER(38,0) default 7;

-- adding a column with a default value was a quick operation too, just metadata probably (256ms)

select justtest
from clone2
limit 10;

-- correct data returned

create table clone3
clone my_big_table;

alter table clone3
add column justtest NUMBER(38,0) default 7;

-- again, adding a column with a default value was quick

update clone3
set justtest=1;

-- this took a longer time - changing an existing value for a new one (1min 18s)

Adding a column to a table shouldn't be a problem - just test the operation on a clone of the table beforehand.

Adding computed persisted column to large table

One thing you might consider is creating a new table with the computed persisted column in the definition. Then you could populate this new table in batches from the existing table, which would minimize the downtime and blocking. It is similar to the batching process you already did, except that in the end you would have a second copy of the data. Once it completes, you would drop the original table and rename the new one. You might also want to consider adding the index from the beginning.
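As a rough sketch, assuming hypothetical column names, the new table's definition might look like this, with the persisted computed column and its index declared up front before the batched copy begins:

CREATE TABLE dbo.MyTable_New
(
    Id        int            NOT NULL PRIMARY KEY,
    Amount    decimal(18, 2) NOT NULL,
    Quantity  int            NOT NULL,
    LineTotal AS (Amount * Quantity) PERSISTED   -- computed and stored on disk
);

-- Index the persisted column from the start so it is built as the batches arrive.
CREATE INDEX IX_MyTable_New_LineTotal ON dbo.MyTable_New (LineTotal);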

Add column taking too long

The amount of time it takes to add a new column depends on the amount of data in your table. If you run sp_who2 'active' to find the session ID, you can kill the job. The ALTER will go into rollback, which puts the table back the way it was.
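A minimal sketch of that, where the SPID value is illustrative and must be taken from the sp_who2 output:

EXEC sp_who2 'active';   -- find the SPID of the long-running ALTER TABLE

KILL 57;                 -- 57 is illustrative; the statement then rolls back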

You should never try altering a table on a production system without knowing how long the process will take.


