Psycopg2: Insert Multiple Rows with One Query

Inserting multiple rows using psycopg2

To use the execute method, place the data to be inserted in a list. psycopg2 adapts a Python list to a PostgreSQL array. You then unnest the array and cast the values as necessary:

import psycopg2

insert = """
insert into history ("timestamp")
select value
from unnest(%s) s(value timestamp)
returning *
;"""

data = [('2014-04-27 14:07:30.000000',), ('2014-04-27 14:07:35.000000',)]
conn = psycopg2.connect("host=localhost4 port=5432 dbname=cpn")
cursor = conn.cursor()
cursor.execute(insert, (data,))
print(cursor.fetchall())
conn.commit()
conn.close()

I'm not sure whether the performance difference from executemany will be significant, but I think the above is neater. The returning clause, as the name suggests, returns the inserted tuples.

BTW, timestamp is a reserved word in the SQL standard and a type name in PostgreSQL, so it is best avoided as a column name.
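One way to reduce the casting needed on the SQL side is to pass datetime objects instead of strings, since psycopg2 adapts datetime.datetime values to timestamp literals. The following is a minimal sketch under that assumption, not the method from the answer above: parse_timestamps, insert_timestamps and INSERT_SQL are hypothetical names, and the history table is the one from the example.

```python
from datetime import datetime

# Hypothetical helper: convert timestamp strings to datetime objects up front.
def parse_timestamps(strings, fmt="%Y-%m-%d %H:%M:%S.%f"):
    return [datetime.strptime(s, fmt) for s in strings]

# Assumes the same history table as the example above.
INSERT_SQL = """
insert into history ("timestamp")
select unnest(%s)
returning *
"""

def insert_timestamps(cursor, strings):
    """Insert all timestamps with one execute() call.

    cursor is expected to be an open psycopg2 cursor.
    """
    values = parse_timestamps(strings)
    # The list of datetimes is passed as a single parameter,
    # so psycopg2 adapts it to one array value.
    cursor.execute(INSERT_SQL, (values,))
    return cursor.fetchall()
```

Calling insert_timestamps(cursor, ['2014-04-27 14:07:30.000000']) would then return the inserted rows, and the caller still has to commit.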

psycopg2: update multiple rows with one query

Splitting the list into chunks whose size equals page_size worked well for me:

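import psycopg2
import psycopg2.extras
from tqdm import tqdm

def update_items(rows_to_update):
    # rows_to_update is a list of (id, name) tuples
    sql_query = """UPDATE contact AS t SET
        name = data.name
        FROM (VALUES %s) AS data (id, name)
        WHERE t.id = data.id"""
    conn = get_db_connection()  # defined elsewhere; returns a psycopg2 connection
    cur = conn.cursor()
    n = 100  # chunk size, also passed as page_size
    with tqdm(total=len(rows_to_update)) as pbar:
        for i in range(0, len(rows_to_update), n):
            psycopg2.extras.execute_values(
                cur, sql_query, rows_to_update[i:i + n], template=None, page_size=n
            )
            conn.commit()
            # Each chunk fits in one page, so it runs as a single statement
            # and rowcount covers the whole chunk.
            pbar.update(cur.rowcount)
    cur.close()
    conn.close()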
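The chunking step can also be pulled out into a small generator, which keeps the slicing logic separate from the database calls. This is only a sketch: chunked is a hypothetical helper name and the sample rows are made up for illustration.

```python
def chunked(rows, size):
    """Yield successive slices of rows, each at most `size` items long."""
    if size < 1:
        raise ValueError("size must be a positive integer")
    for start in range(0, len(rows), size):
        yield rows[start:start + size]

# Illustrative data only: (id, name) tuples like those passed to update_items.
sample_rows = [(1, "Ann"), (2, "Bob"), (3, "Cy"), (4, "Di"), (5, "Eve")]
for chunk in chunked(sample_rows, 2):
    print(len(chunk), chunk)
```

Inside update_items, the loop could then iterate over chunked(rows_to_update, n) and pass each chunk to execute_values with page_size=n.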

Efficiently insert massive amount of rows in Psycopg2

Based on the answers given here, COPY is the fastest method. COPY reads from a file or file-like object.

Since memory I/O is many orders of magnitude faster than disk I/O, it is faster to write the data to a StringIO file-like object than to write to an actual file.
The psycopg docs show an example of calling copy_from with a StringIO as input.

Therefore, you could use something like:

try:
    # Python2
    from cStringIO import StringIO
except ImportError:
    # Python3
    from io import StringIO

def db_insert_spectrum(curs, visual_data, recording_id):
    f = StringIO()
    # visual_data is a 2D array (a nx63 matrix)
    for rowIndex, rowData in enumerate(visual_data):
        items = []
        for colIndex, colData in enumerate(rowData):
            value = (rowIndex, colIndex, colData, recording_id)
            # one tab-separated line per cell, as COPY expects
            items.append('\t'.join(map(str, value))+'\n')
        f.writelines(items)
    # rewind so copy_from reads from the start of the buffer
    f.seek(0)
    curs.copy_from(f, 'spectrums', columns=('row', 'col', 'value', 'recording_id'))
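The function above writes values with str(), which is fine for numbers but would break if a value contained a tab, newline or backslash, or was None. A hedged sketch of building the buffer with COPY text-format escaping follows; copy_escape and rows_to_copy_buffer are hypothetical helper names, and the escaping rules assume copy_from's default tab separator and \N null marker.

```python
from io import StringIO

def copy_escape(value):
    """Render one value in COPY text format (hypothetical helper)."""
    if value is None:
        return r"\N"  # default NULL marker in COPY text format
    text = str(value)
    # Escape the backslash first so the later escapes are not doubled.
    return (text.replace("\\", "\\\\")
                .replace("\t", "\\t")
                .replace("\n", "\\n")
                .replace("\r", "\\r"))

def rows_to_copy_buffer(rows):
    """Write rows as tab-separated lines into an in-memory buffer."""
    buf = StringIO()
    for row in rows:
        buf.write("\t".join(copy_escape(v) for v in row) + "\n")
    buf.seek(0)  # rewind so the consumer reads from the start
    return buf
```

The returned buffer could then be passed to copy_from in place of f, for example curs.copy_from(rows_to_copy_buffer(rows), 'spectrums', columns=('row', 'col', 'value', 'recording_id')).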

