How to add an auto-incrementing primary key to an existing table, in PostgreSQL?
(Updated - Thanks to the people who commented)
Modern Versions of PostgreSQL
Suppose you have a table named test1, to which you want to add an auto-incrementing, primary-key id (surrogate) column. The following command should be sufficient in recent versions of PostgreSQL:
ALTER TABLE test1 ADD COLUMN id SERIAL PRIMARY KEY;
Older Versions of PostgreSQL
In old versions of PostgreSQL (prior to 8.x?) you had to do all the dirty work. The following sequence of commands should do the trick:
ALTER TABLE test1 ADD COLUMN id INTEGER;
CREATE SEQUENCE test_id_seq OWNED BY test1.id;
ALTER TABLE test1 ALTER COLUMN id SET DEFAULT nextval('test_id_seq');
UPDATE test1 SET id = nextval('test_id_seq');
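Again, in recent versions of Postgres this is roughly equivalent to the single command above. The one difference is that the manual sequence of commands does not make id the primary key; that takes an additional ALTER TABLE test1 ADD PRIMARY KEY (id);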
How to write a table in PostgreSQL from R?
OK, I'm not sure why dbWriteTable() would be failing; there may be some kind of version/protocol mismatch. Perhaps you could try installing the latest versions of R and the RPostgreSQL package, and upgrading the PostgreSQL server on your system, if possible.
Regarding the insert into workaround failing for large data: when a large amount of data must be moved and a one-shot transfer is infeasible, impractical, or flaky, the usual approach is batching (batch processing). You divide the data into smaller chunks and send one chunk at a time.
As a random example, a few years ago I wrote some Java code to query for employee information from an HR LDAP server which was constrained to only provide 1000 records at a time. So basically I had to write a loop to keep sending the same request (with the query state tracked using some kind of weird cookie-based mechanism) and accumulating the records into a local database until the server reported the query complete.
Here's some code that manually constructs the SQL to create an empty table based on a given data.frame, and then inserts the content of the data.frame into the table using a parameterized batch size. It's mostly built around calls to paste() to build the SQL strings, and dbSendQuery() to send the actual queries. I also use postgresqlDataType() for the table creation.
## connect to the DB
library('RPostgreSQL'); ## loads DBI automatically
drv <- dbDriver('PostgreSQL');
con <- dbConnect(drv,host=...,port=...,dbname=...,user=...,password=...);
## define helper functions
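createEmptyTable <- function(con,tn,df) {
    ## one quoted column definition per data.frame column, typed via postgresqlDataType()
    sql <- paste0("create table \"",tn,"\" (",paste0(collapse=',','"',names(df),'" ',sapply(df[0,],postgresqlDataType)),");");
    dbSendQuery(con,sql);
    invisible();
};
insertBatch <- function(con,tn,df,size=100L) {
    if (nrow(df)==0L) return(invisible());
    cnt <- (nrow(df)-1L)%/%size+1L; ## number of batches, rounding up
    for (i in seq(0L,len=cnt)) {
        ## rows i*size+1 .. min(nrow(df),(i+1)*size) become one multi-row insert statement
        ## shQuote() single-quotes each value; fine for numbers, not safe for strings that contain quotes
        sql <- paste0("insert into \"",tn,"\" values (",do.call(paste,c(sep=',',collapse='),(',lapply(df[seq(i*size+1L,min(nrow(df),(i+1L)*size)),],shQuote))),");");
        dbSendQuery(con,sql);
    };
    invisible();
};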
## generate test data
NC <- 1e2L; NR <- 1e3L; df <- as.data.frame(replicate(NC,runif(NR)));
## run it
tn <- 't1';
if (dbExistsTable(con,tn)) dbRemoveTable(con,tn); ## drop any table left over from a previous run
createEmptyTable(con,tn,df);
insertBatch(con,tn,df);
res <- dbReadTable(con,tn);
all.equal(df,res);
## [1] TRUE
Note that I didn't bother prepending a row.names column to the database table, unlike dbWriteTable(), which includes such a column by default (passing row.names=FALSE should suppress it).
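The same batching idea can be sketched in Python. This is only an illustration under stated assumptions: the helper names chunks and insert_batch are hypothetical, and an in-memory sqlite3 database stands in for the PostgreSQL connection so the sketch runs without a server. It also uses placeholders instead of pasting quoted values into the SQL string.

```python
import sqlite3
from typing import Iterator, Sequence


def chunks(rows: Sequence[tuple], size: int) -> Iterator[Sequence[tuple]]:
    """Yield consecutive slices of `rows`, each at most `size` long."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


def insert_batch(con: sqlite3.Connection, table: str,
                 rows: Sequence[tuple], size: int = 100) -> int:
    """Insert `rows` into `table` one batch at a time; return the batch count."""
    if not rows:
        return 0
    # Placeholders keep the values out of the SQL text, so no manual quoting is needed.
    placeholders = ",".join("?" for _ in rows[0])
    # The table name is interpolated directly; it is assumed to come from trusted code.
    sql = f'INSERT INTO "{table}" VALUES ({placeholders})'
    sent = 0
    for batch in chunks(rows, size):
        con.executemany(sql, batch)
        con.commit()  # one transaction per batch
        sent += 1
    return sent


# Assumption: sqlite3 stands in for a real PostgreSQL connection.
con = sqlite3.connect(":memory:")
con.execute('CREATE TABLE "t1" (a REAL, b REAL)')
rows = [(i / 10.0, i * 2.0) for i in range(1050)]
batches = insert_batch(con, "t1", rows, size=100)
stored = con.execute('SELECT COUNT(*) FROM "t1"').fetchone()[0]
print(batches, stored)  # 11 1050
```

Committing after each batch means a failure partway through loses at most one batch, at the cost of leaving the table partially loaded.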
Change primary key to auto increment
I figured it out: just add an auto-increment default value to the playerID:
create sequence player_id_seq;
alter table player alter playerid set default nextval('player_id_seq');
SELECT setval('player_id_seq', 2000051); -- set to the highest current value of playerID
How to set auto increment primary key in PostgreSQL?
Try this command:
ALTER TABLE your_table ADD COLUMN key_column BIGSERIAL PRIMARY KEY;
Run it as the same database user that created the table.
postgres autoincrement not updated on explicit id inserts
That's how it's supposed to work: nextval('test_id_seq') is only called when the system needs a value for this column and you have not provided one. If you provide a value, no such call is performed and consequently the sequence is not "updated".
You could work around this by manually setting the value of the sequence after your last insert with explicitly provided values:
SELECT setval('test_id_seq', (SELECT MAX(id) from "test"));
For a SERIAL column the name of the sequence is autogenerated, normally as tablename_columnname_seq. If in doubt, pg_get_serial_sequence('test', 'id') returns the actual name.
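To make the behaviour concrete, here is a toy model in Python. It is not PostgreSQL and not a real API: the Sequence class and the insert function are hypothetical stand-ins that only mimic "call nextval when no id is supplied".

```python
from typing import Dict, Optional


class Sequence:
    """Toy stand-in for a database sequence (hypothetical, for illustration only)."""

    def __init__(self) -> None:
        self.last_value = 0

    def nextval(self) -> int:
        self.last_value += 1
        return self.last_value

    def setval(self, value: int) -> int:
        self.last_value = value
        return value


def insert(table: Dict[int, str], seq: Sequence, value: str,
           row_id: Optional[int] = None) -> int:
    # The column default (nextval) is evaluated only when no id is supplied.
    if row_id is None:
        row_id = seq.nextval()
    if row_id in table:
        raise KeyError(f"duplicate key: id={row_id}")
    table[row_id] = value
    return row_id


test_table: Dict[int, str] = {}
seq = Sequence()
insert(test_table, seq, "a")            # id comes from the sequence: 1
insert(test_table, seq, "b", row_id=2)  # explicit id; the sequence stays at 1
try:
    insert(test_table, seq, "c")        # the sequence hands out 2, which is taken
    collided = False
except KeyError:
    collided = True

# Counterpart of: SELECT setval('test_id_seq', (SELECT MAX(id) from "test"));
seq.setval(max(test_table))
new_id = insert(test_table, seq, "c")
print(collided, new_id)  # True 3
```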
PostgreSQL - create an auto-increment column for non-primary key
You may try making the item_id column SERIAL. I don't know whether or not it's possible to alter the current item_id column to make it serial, so we might have to drop that column and then add it back, something like this:
ALTER TABLE yourTable DROP COLUMN item_id;
ALTER TABLE yourTable ADD COLUMN item_id SERIAL;
If there is data in the item_id column already, it may not make sense from a serial point of view, so hopefully there is no harm in deleting it.
What's the PostgreSQL datatype equivalent to MySQL AUTO INCREMENT?
Yes, SERIAL is the equivalent. It is a pseudo-type rather than a function:
CREATE TABLE foo (
id SERIAL,
bar varchar
);
INSERT INTO foo (bar) VALUES ('blah');
INSERT INTO foo (bar) VALUES ('blah');
SELECT * FROM foo;
 id | bar
----+------
  1 | blah
  2 | blah
SERIAL is just a macro around sequences that is expanded when the column is created. You cannot alter an existing column to SERIAL.
loading dataframe into table postgres and pandas with auto-incrementing id
Try using if_exists="append" in your to_sql function.
If you use "replace" instead it might recreate the table using only the columns in the Excel file.
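A minimal sketch of the difference, assuming pandas is installed. An in-memory sqlite3 database stands in for PostgreSQL here, and the table name people and its columns are made up for the illustration; with PostgreSQL the id column would be SERIAL and the connection would typically be a SQLAlchemy engine.

```python
import sqlite3

import pandas as pd

# Assumption: sqlite3 stands in for PostgreSQL so the sketch runs without a server.
con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE people (id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT)"
)

# The DataFrame has no id column; the database fills it in.
df = pd.DataFrame({"name": ["ann", "bob"]})

# "append" inserts into the existing table, so the id column survives.
df.to_sql("people", con, if_exists="append", index=False)
appended = con.execute("SELECT id, name FROM people ORDER BY id").fetchall()

# "replace" drops the table and recreates it from the DataFrame columns only.
df.to_sql("people", con, if_exists="replace", index=False)
columns_after_replace = [row[1] for row in con.execute("PRAGMA table_info(people)")]

print(appended)               # [(1, 'ann'), (2, 'bob')]
print(columns_after_replace)  # ['name']
```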