Fillfactor for a Sequential Index That Is PK

FILLFACTOR

With only INSERT and SELECT you should use a FILLFACTOR of 100 for tables (which is the default anyway). There is no point in leaving wiggle room per data page if you are not going to "wiggle" with UPDATEs.

The mechanism behind FILLFACTOR is simple. INSERTs only fill data pages (usually 8 kB blocks) up to the percentage declared by the FILLFACTOR setting. Also, whenever you run VACUUM FULL or CLUSTER on the table, the same wiggle room per block is re-established. Ideally, this allows UPDATE to store new row versions in the same data page, which can provide a substantial performance boost when dealing with lots of UPDATEs. It is also beneficial in combination with H.O.T. (heap-only tuple) updates. See:

  • Redundant data in update statements
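For illustration, a minimal sketch of how the table-level setting could be changed if the table did receive UPDATEs. The table name matches the one used further down; the value 90 is just an assumed example, not a recommendation:

```sql
-- Assumed example value: leave 10 % free space per data page for UPDATEs.
ALTER TABLE dev_transactions SET (fillfactor = 90);

-- The new setting only applies to pages written from now on.
-- Rewriting the table re-establishes the wiggle room in existing pages
-- (takes an exclusive lock on the table while it runs).
VACUUM FULL dev_transactions;
```

For an INSERT / SELECT only table you would simply leave the default of 100 alone.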

Indexes need more wiggle room by design. They have to store new entries at the right position in leaf pages. Once a page is full, a relatively costly "page split" is needed. So indexes tend to bloat more than tables. The default FILLFACTOR for a (default) B-Tree index is 90 (varies per index type). And wiggle room makes sense for just INSERTs, too. The best strategy heavily depends on write patterns.

Example: If new inserts have steadily growing values (typical case for a serial or timestamp column), then there are basically no page splits, and you might go with FILLFACTOR = 100 (or a bit lower to allow for some noise).

For a random distribution of new values, you might go below the default 90 ...
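A rough sketch of both cases. The index name dev_transactions_pkey assumes PostgreSQL's default naming for the PK index, the second index is hypothetical, and the value 70 is only a placeholder to show the syntax:

```sql
-- Steadily growing key (serial PK): pack the leaf pages completely.
ALTER INDEX dev_transactions_pkey SET (fillfactor = 100);
REINDEX INDEX dev_transactions_pkey;  -- rebuild so existing pages follow the new setting

-- Randomly distributed values: leave more room to delay page splits.
-- Hypothetical index, placeholder value.
CREATE INDEX dev_transactions_token_idx
    ON dev_transactions (token)
  WITH (fillfactor = 70);
```

Whether a lower value actually pays off depends on your write pattern, so test with your own data.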

Basic source of information: the manual for CREATE TABLE and CREATE INDEX.

Other optimization

But you can do something else - since you seem to be a sucker for optimization ... :)

CREATE TABLE dev_transactions(
  transaction_id   serial PRIMARY KEY
, gateway          integer NOT NULL
, moment           timestamp NOT NULL
, device           integer NOT NULL
, transaction_type smallint NOT NULL
, status           smallint NOT NULL
, controller       smallint NOT NULL
, token            integer
, et_mode          character(1)
);

This optimizes your table with regard to data alignment and avoids padding on a typical 64-bit server. It saves a few bytes, probably just 8 bytes on average. You typically can't squeeze out much with "column tetris":

  • Calculating and saving space in PostgreSQL

Keep NOT NULL columns at the start of the table for a very small performance bonus.

Your table has 9 columns. The initial ("cost-free") 1-byte NULL bitmap covers 8 columns. The 9th column triggers an additional 8 bytes for the extended NULL bitmap - if there are any NULL values in the row.

If you make et_mode and token NOT NULL, all columns are NOT NULL and there is no NULL bitmap, freeing up 8 bytes per row.

This even works per row if some columns can be NULL. If all fields of the same row have values, there is no NULL bitmap for the row. In your special case, this leads to the paradox that filling in values for et_mode and token can make your storage size smaller or at least stay the same:

  • Do nullable columns occupy additional space in PostgreSQL?
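If you want to go the all-NOT-NULL route, it could look roughly like this. The replacement values 0 and ' ' are pure placeholders; pick whatever stands for "unknown" in your data:

```sql
-- Placeholder values: replace existing NULLs first, or SET NOT NULL will fail.
UPDATE dev_transactions SET token   = 0   WHERE token   IS NULL;
UPDATE dev_transactions SET et_mode = ' ' WHERE et_mode IS NULL;

ALTER TABLE dev_transactions
  ALTER COLUMN token   SET NOT NULL
, ALTER COLUMN et_mode SET NOT NULL;
```

Keep in mind that the UPDATEs write new row versions, so on a big table this is something to do once, ideally before the table fills up.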

Basic source of information: the manual on Database Physical Storage.

Compare the size of rows (filled with values) with your original table to get definitive proof:

SELECT pg_column_size(t) FROM dev_transactions t;

(Plus maybe padding between rows, as the next row starts at a multiple of 8 bytes.)
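To see the effect of the NULL bitmap within a single table, a sketch like the following groups rows by whether they contain a NULL in one of the two nullable columns and compares the average row size:

```sql
SELECT (token IS NULL OR et_mode IS NULL) AS has_null
     , count(*)                           AS row_count
     , round(avg(pg_column_size(t)), 1)   AS avg_row_bytes
FROM   dev_transactions t
GROUP  BY 1
ORDER  BY 1;
```

The numbers include the row header, so look at the difference between the two groups rather than at the absolute values.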

Using GUIDs in Primary Keys / Clustered Indexes

If you are doing any kind of volume, GUIDs are extremely bad as a PK unless you use sequential GUIDs, for the exact reasons you describe. Page fragmentation is severe:

                 Average                  Average
                 Fragmentation  Fragment  Fragment  Page   Average
Type             in Percent     Count     Size      Count  Space Used

id               4.35           7         16.43     115    99.89
newidguid        98.77          162       1         162    70.90
newsequentualid  4.35           7         16.43     115    99.89
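If you want to collect the same kind of numbers for your own table, something along these lines should work on SQL Server 2005 and later. The table name dbo.GuidTest is hypothetical; the view and column names come from sys.dm_db_index_physical_stats:

```sql
SELECT i.name                           AS index_name,
       s.avg_fragmentation_in_percent,
       s.fragment_count,
       s.avg_fragment_size_in_pages,
       s.page_count,
       s.avg_page_space_used_in_percent
FROM   sys.dm_db_index_physical_stats(
           DB_ID(),                      -- current database
           OBJECT_ID(N'dbo.GuidTest'),   -- hypothetical table name
           NULL, NULL,                   -- all indexes, all partitions
           'DETAILED') AS s              -- space-used column is NULL in LIMITED mode
JOIN   sys.indexes AS i
       ON  i.object_id = s.object_id
       AND i.index_id  = s.index_id
WHERE  s.index_level = 0;                -- leaf level only
```

DETAILED mode reads every page, so run it off-hours on big tables.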

And as this comparison between GUIDs and integers shows:

Test1 caused a tremendous amount of page splits, and had a scan density around 12% when I ran a DBCC SHOWCONTIG after the inserts had completed. The Test2 table had a scan density around 98%

If your volume is very low, however, it just doesn't matter that much.

If you really do need a globally unique ID but have high volume (and can't use sequential IDs), just put the GUIDs in a separate indexed column rather than making them the clustered PK.
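A minimal sketch of that layout, with made-up table, column and constraint names: the clustered PK stays a small sequential integer, and the GUID lives in its own column with a unique nonclustered index.

```sql
-- All names here are hypothetical.
CREATE TABLE dbo.Orders
(
    OrderId   INT IDENTITY(1,1) NOT NULL
        CONSTRAINT PK_Orders PRIMARY KEY CLUSTERED,
    OrderGuid UNIQUEIDENTIFIER NOT NULL
        CONSTRAINT DF_Orders_OrderGuid DEFAULT NEWID(),
    CreatedAt DATETIME NOT NULL
);

-- Lookups by GUID go through this index; only the index fragments,
-- not the table data itself.
CREATE UNIQUE NONCLUSTERED INDEX IX_Orders_OrderGuid
    ON dbo.Orders (OrderGuid);
```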

SQL Server Filtered Index on PK Index/Constraint?

You can't define a filtered index on a PK. Is the following index useful for you?

CREATE NONCLUSTERED INDEX MyIndex
ON biglog(logtext)
WHERE rowstate IS NOT NULL ;

Best way to change clustered index (PK) in SQL 2005

If your table is getting up to 1 TB in size and probably has LOTS of rows in it, I would strongly recommend NOT making the clustered index that much fatter!

First of all, dropping and recreating the clustered index will shuffle around ALL your data at least once - that alone will take ages.

Secondly, the big compound clustered index you're trying to create will significantly increase the size of all your non-clustered indices (since they contain the whole clustered index value on each leaf node, for the bookmark lookups).

The question is more: why are you trying to do this?? Couldn't you just add another non-clustered index with those columns, to potentially cover your queries? Why does this have to be the clustered index?? I don't see any advantage in that....
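Just to show what I mean, with invented table and column names since I don't know your schema: a non-clustered index on the columns you search by, with the extra columns you select tacked on via INCLUDE (available since SQL Server 2005), could cover the query without touching the clustered index.

```sql
-- Hypothetical names; replace with your own table and columns.
CREATE NONCLUSTERED INDEX IX_BigTable_ColA_ColB
    ON dbo.BigTable (ColA, ColB)   -- columns used in WHERE / JOIN
    INCLUDE (ColC);                -- extra column(s) the query selects
```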

For more information on indexing and especially the clustered index debate, see Kimberly Tripp's blog on SQL Server indexes - very helpful!

Marc

CREATE TABLE with FK and index options declarations

The

  WITH ( PAD_INDEX = OFF, /*... */ ALLOW_PAGE_LOCKS = ON )

defines options for the index associated with the PK constraint, not the foreign key. So it needs to go inside the PK constraint definition. You are trying to include it as part of the FK definition. It should be:

CREATE TABLE dbo.Calendar
(
    ScenarioKey INT NOT NULL,
    Bucket      SMALLDATETIME NOT NULL,
    BucketEnd   SMALLDATETIME NOT NULL,
    CONSTRAINT [PK-C_dbo.Calendar]
        PRIMARY KEY CLUSTERED (ScenarioKey, Bucket)
        WITH ( PAD_INDEX = OFF,
               FILLFACTOR = 100,
               IGNORE_DUP_KEY = OFF,
               STATISTICS_NORECOMPUTE = OFF,
               ALLOW_ROW_LOCKS = ON,
               ALLOW_PAGE_LOCKS = ON ),
    CONSTRAINT [FK_dbo.Calendar_dbo.Scenario]
        FOREIGN KEY (ScenarioKey)
        REFERENCES dbo.Scenario (ScenarioKey)
        ON DELETE CASCADE
        ON UPDATE CASCADE
)
ON [PRIMARY]

