Fillfactor for a sequential index that is PK
FILLFACTOR
With only INSERT and SELECT you should use a FILLFACTOR of 100 for tables (which is the default anyway). There is no point in leaving wiggle room per data page if you are not going to "wiggle" with UPDATEs.
The mechanism behind FILLFACTOR is simple. INSERTs only fill data pages (usually 8 kB blocks) up to the percentage declared by the FILLFACTOR setting. Also, whenever you run VACUUM FULL or CLUSTER on the table, the same wiggle room per block is re-established. Ideally, this allows UPDATE to store new row versions in the same data page, which can provide a substantial performance boost when dealing with lots of UPDATEs. This is also beneficial in combination with H.O.T. updates. See:
- Redundant data in update statements
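As a minimal sketch of how the setting is applied to a table in PostgreSQL (the table name upd_heavy and the value 90 are made up for illustration and do not come from the question):

```sql
-- Hypothetical table that receives many UPDATEs
CREATE TABLE upd_heavy (
    id      serial PRIMARY KEY,
    counter integer NOT NULL
) WITH (fillfactor = 90);   -- INSERTs fill each data page to about 90 %

-- Change the setting on an existing table; existing pages are not rewritten
ALTER TABLE upd_heavy SET (fillfactor = 90);

-- Rewrite the table so that all pages get the free space
-- (takes an exclusive lock for the duration of the rewrite)
VACUUM FULL upd_heavy;
```

For an INSERT-and-SELECT-only table none of this is needed, since the default of 100 already applies.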
Indexes need more wiggle room by design. They have to store new entries at the right position in leaf pages. Once a page is full, a relatively costly "page split" is needed. So indexes tend to bloat more than tables. The default FILLFACTOR for a (default) B-Tree index is 90 (varies per index type). And wiggle room makes sense for just INSERTs, too. The best strategy heavily depends on write patterns.
Example: If new inserts have steadily growing values (typical case for a serial or timestamp column), then there are basically no page splits, and you might go with FILLFACTOR = 100 (or a bit lower to allow for some noise).
For a random distribution of new values, you might go below the default 90 ...
Basic source of information: the manual for CREATE TABLE and CREATE INDEX.
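A hedged sketch of the index side in PostgreSQL. The table log_seq, its columns, and the values 70 and 95 are hypothetical; only the defaults mentioned above come from the answer:

```sql
-- Ever-increasing key: leaf pages can be packed completely
CREATE TABLE log_seq (
    id     serial PRIMARY KEY WITH (fillfactor = 100),
    bucket integer NOT NULL
);

-- Randomly distributed values: leave more room than the default 90
CREATE INDEX log_seq_bucket_idx ON log_seq (bucket) WITH (fillfactor = 70);

-- Changing an existing index only takes full effect once it is rebuilt
ALTER INDEX log_seq_pkey SET (fillfactor = 95);
REINDEX INDEX log_seq_pkey;
```

The right numbers depend on the actual write pattern, so treat these values as placeholders to test against, not as recommendations.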
Other optimization
But you can do something else - since you seem to be a sucker for optimization ... :)
CREATE TABLE dev_transactions(
  transaction_id   serial PRIMARY KEY
, gateway          integer NOT NULL
, moment           timestamp NOT NULL
, device           integer NOT NULL
, transaction_type smallint NOT NULL
, status           smallint NOT NULL
, controller       smallint NOT NULL
, token            integer
, et_mode          character(1)
);
This optimizes your table with regard to data alignment, avoids padding on a typical 64-bit server, and saves a few bytes, probably just 8 bytes on average. You typically can't squeeze out much with "column tetris":
- Calculating and saving space in PostgreSQL
Keep NOT NULL columns at the start of the table for a very small performance bonus.
Your table has 9 columns. The initial ("cost-free") 1-byte NULL bitmap covers 8 columns. The 9th column triggers an additional 8 bytes for the extended NULL bitmap - if there are any NULL values in the row.
If you make et_mode and token NOT NULL, all columns are NOT NULL and there is no NULL bitmap, freeing up 8 bytes per row.
This even works per row if some columns can be NULL. If all fields of the same row have values, there is no NULL bitmap for the row. In your special case, this leads to the paradox that filling in values for et_mode and token can make your storage size smaller, or at least leave it unchanged:
- Do nullable columns occupy additional space in PostgreSQL?
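If you decide to declare both columns NOT NULL, a sketch along these lines should work in PostgreSQL. It assumes that no existing row holds a NULL in either column; otherwise the ALTER TABLE fails and the NULLs have to be replaced first:

```sql
-- Check first: this should return 0 before the constraint can be added
SELECT count(*) AS rows_with_null
FROM   dev_transactions
WHERE  token IS NULL OR et_mode IS NULL;

-- Both changes in a single statement, so the table is scanned only once
ALTER TABLE dev_transactions
    ALTER COLUMN token   SET NOT NULL,
    ALTER COLUMN et_mode SET NOT NULL;
```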
Basic source of information: the manual on Database Physical Storage.
Compare the size of rows (filled with values) with your original table to get definitive proof:
SELECT pg_column_size(t) FROM dev_transactions t;
(Plus maybe padding between rows, as the next row starts at a multiple of 8 bytes.)
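To compare the two layouts across the whole table instead of row by row, an aggregate along these lines may help (a sketch only; run it against each version of the table and compare the numbers):

```sql
SELECT count(*)                         AS row_count,
       round(avg(pg_column_size(t)), 1) AS avg_row_bytes,
       pg_size_pretty(pg_relation_size('dev_transactions'::regclass)) AS heap_size
FROM   dev_transactions t;
```

pg_relation_size() reports the on-disk size of the main table data, which also reflects per-row overhead and padding that pg_column_size() does not show.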
Using GUIDs in Primary Keys / Clustered Indexes
If you are doing any kind of volume, GUIDs are extremely bad as a PK unless you use sequential GUIDs, for the exact reasons you describe. Page fragmentation is severe:
Type             Average         Fragment  Average   Page   Average
                 Fragmentation   Count     Fragment  Count  Space Used
                 in Percent                Size
id               4.35            7         16.43     115    99.89
newidguid        98.77           162       1         162    70.90
newsequentualid  4.35            7         16.43     115    99.89
And as this comparison between GUIDs and integers shows:
Test1 caused a tremendous amount of page splits, and had a scan density around 12% when I ran a DBCC SHOWCONTIG after the inserts had completed. The Test2 table had a scan density around 98%
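Fragmentation figures like those in the table above can be collected on SQL Server 2005 and later with sys.dm_db_index_physical_stats. The following is a sketch, not necessarily the query used for the numbers above, and dbo.GuidTest is a hypothetical table name:

```sql
SELECT  i.name AS index_name,
        ips.avg_fragmentation_in_percent,
        ips.fragment_count,
        ips.avg_fragment_size_in_pages,
        ips.page_count,
        ips.avg_page_space_used_in_percent  -- NULL in LIMITED mode, hence SAMPLED
FROM    sys.dm_db_index_physical_stats(
            DB_ID(), OBJECT_ID(N'dbo.GuidTest'), NULL, NULL, 'SAMPLED') AS ips
JOIN    sys.indexes AS i
          ON  i.object_id = ips.object_id
          AND i.index_id  = ips.index_id;
```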
If your volume is very low, however, it just doesn't matter that much.
If you really do need a globally unique ID but have high volume (and can't use sequential IDs), just put the GUIDs in an indexed column instead of using them as the primary key.
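One way to read that advice, sketched in T-SQL with hypothetical table, column, and constraint names: keep a narrow, ever-increasing integer as the clustered primary key and store the GUID in its own column with a nonclustered unique index.

```sql
CREATE TABLE dbo.Orders
(
    OrderId   INT IDENTITY(1,1) NOT NULL,
    OrderGuid UNIQUEIDENTIFIER  NOT NULL
        CONSTRAINT DF_Orders_OrderGuid DEFAULT NEWID(),
    CONSTRAINT PK_Orders PRIMARY KEY CLUSTERED (OrderId)
);

-- Random GUIDs still fragment this index, but not the table data itself
CREATE UNIQUE NONCLUSTERED INDEX IX_Orders_OrderGuid
    ON dbo.Orders (OrderGuid);
```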
SQL Server Filtered Index on PK Index/Constraint?
You can't define a filtered index on a PK. Is the following index useful for you?
CREATE NONCLUSTERED INDEX MyIndex
ON biglog(logtext)
WHERE rowstate IS NOT NULL ;
Best way to change clustered index (PK) in SQL 2005
If your table is getting up to 1 TB in size and probably has LOTS of rows in it, I would strongly recommend NOT making the clustered index that much fatter!
First of all, dropping and recreating the clustered index will shuffle around ALL your data at least once - that alone will take ages.
Secondly, the big compound clustered index you're trying to create will significantly increase the size of all your non-clustered indices (since they contain the whole clustered index value on each leaf node, for the bookmark lookups).
The question is more: why are you trying to do this? Couldn't you just add another non-clustered index with those columns, to potentially cover your queries? Why does this have to be the clustered index? I don't see any advantage in that.
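A covering non-clustered index of the kind suggested here might look like the sketch below. The table and column names are hypothetical, since the question's schema is not shown:

```sql
CREATE NONCLUSTERED INDEX IX_BigTable_CustomerId_CreatedAt
    ON dbo.BigTable (CustomerId, CreatedAt)  -- columns the queries filter or sort on
    INCLUDE (Amount, Status);                -- extra columns the queries only read
```

The INCLUDE columns are stored at the leaf level only, so the index can cover a query without widening the index key or touching the clustered index.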
For more information on indexing and especially the clustered index debate, see Kimberly Tripp's blog on SQL Server indexes - very helpful!
Marc
CREATE TABLE with FK and index options declarations
The WITH ( PAD_INDEX = OFF, /*... */ ALLOW_PAGE_LOCKS = ON ) clause defines options for the index associated with the PK constraint, not the foreign key, so it needs to go in the PK constraint definition. You are trying to include it as part of the FK definition. It should be:
CREATE TABLE dbo.Calendar
(
    ScenarioKey INT NOT NULL,
    Bucket      SMALLDATETIME NOT NULL,
    BucketEnd   SMALLDATETIME NOT NULL,
    CONSTRAINT [PK-C_dbo.Calendar]
        PRIMARY KEY CLUSTERED (ScenarioKey, Bucket)
        WITH ( PAD_INDEX = OFF,
               FILLFACTOR = 100,
               IGNORE_DUP_KEY = OFF,
               STATISTICS_NORECOMPUTE = OFF,
               ALLOW_ROW_LOCKS = ON,
               ALLOW_PAGE_LOCKS = ON ),
    CONSTRAINT [FK_dbo.Calendar_dbo.Scenario]
        FOREIGN KEY (ScenarioKey)
        REFERENCES dbo.Scenario (ScenarioKey)
        ON DELETE CASCADE
        ON UPDATE CASCADE
)
ON [PRIMARY]