Django: Using custom raw SQL inserts with executemany and MySQL
Here's a solution that actually uses executemany() !
Basically the idea in the example here will work.
But note that in Django, you need to use the %s placeholder rather than the question mark.
Also, you will want to manage your transactions. I'll not get into that here as there is plenty of documentation available.
from django.db import connection,transaction
cursor = connection.cursor()
query = ''' INSERT INTO table_name
(var1,var2,var3)
VALUES (%s,%s,%s) '''
query_list = build_query_list()
# here build_query_list() represents some function to populate
# the list with multiple records
# in the tuple format (value1, value2, value3).
cursor.executemany(query, query_list)
transaction.commit()
Django: Bulk operations
Use cursor.executemany(query, param_list)
if you need a raw SQL.
param_list = [("something_1", 1), ("something_2", 2),...]
# or everything like [(some_number_1, 1)...]. Apostrophes around the substituted
# "%s" and the complete necessary escaping is added automatically for string
# parameters.
cursor.executemany("""UPDATE database.table
SET field_to_modify=%s
WHERE primary_key_field=%s""",
param_list)
It has many advantages:
- The query string is much shorter than a big query and can be fast analyzed/optimized without consuming unnecessary resources by the database planner. (If you parse parameters into SQL manually, you get many different SQL command that must be analyzed individually.)
- Parsing strings into SQL manually is a bad practise because it can be a security issue (SQL injection attack) if you don't escape unexpected apostrophes and backslashes from the user input correctly.
It is an undocumented method though both methods execute(self, sql, params=())
and executemany(self, sql, param_list)
are supported for cursor objects by all native db backends (mysql, postgesql_psycopg2, sqlite3, oracle) for long time since Django-0.96 to the current 1.5-beta. A useful similar answer is https://stackoverflow.com/a/6101536/448474 .
The method executemany had two fixed issues related to exception handling in previous years. So, verify for your Django version that you get helpful error messages if you intentionally cause a database exception, too much %s
or too little etc. Yet, a few minutes of initial tinkering/testing are faster than many hours of waiting for slow methods.
Accelerate bulk insert using Django's ORM?
Django 1.4 provides a bulk_create()
method on the QuerySet object, see:
- https://docs.djangoproject.com/en/dev/ref/models/querysets/#django.db.models.query.QuerySet.bulk_create
- https://docs.djangoproject.com/en/dev/releases/1.4/
- https://code.djangoproject.com/ticket/7596
MySQL INSERT ... ON DUPLICATE KEY UPDATE with django 1.4 for bulk insert
So I created a custom manager. Here is the manager:
class BulkInsertManager(models.Manager):
def _bulk_insert_or_update(self, create_fields, update_fields, values):
from django.db import connection, transaction
cursor = connection.cursor()
db_table = self.model._meta.db_table
values_sql = []
values_data =[]
for value_lists in values:
values_sql.append( "(%s)" % (','.join([ "%s" for i in range(len(value_lists))]),) )
values_data.extend(value_lists)
base_sql = "INSERT INTO %s (%s) VALUES " % (db_table, ",".join(create_fields))
on_duplicates = []
for field in update_fields:
on_duplicates.append(field + "=VALUES(" + field +")")
sql = "%s %s ON DUPLICATE KEY UPDATE %s" % (base_sql, ", ".join(values_sql), ",".join(on_duplicates))
cursor.executemany(sql, [values_data])
transaction.commit_unless_managed()
And a sample model:
class User_Friend(models.Model):
objects = BulkInsertManager() # assign a custom manager to handle bulk insert
id = models.CharField(max_length=255)
user = models.ForeignKey(User, null=False, blank=False)
first_name = models.CharField(max_length=30)
last_name = models.CharField(max_length=30)
city = models.CharField(max_length=50, null=True, blank=True)
province = models.CharField(max_length=50, null=True, blank=True)
country = models.CharField(max_length=30, null=True, blank=True)
And sample implementation:
def save_user_friends(user, friends):
user_friends = []
for friend in friends:
create_fields = ['id', 'user_id', 'first_name', 'last_name', 'city', 'province', 'country']
update_fields = ['first_name', 'last_name', 'city', 'province', 'country']
user_friends.append(
[
str(user.id),
str(friend['id']),
friend['first_name'],
friend['last_name'],
friend['city'],
friend['province'],
friend['country'],
]
)
User_Friend.objects._bulk_insert_or_update(create_fields, update_fields, user_friends)
Here is the gist.
Django cursor.executemany what's the preferred batch size for each executemany
MySQL documentation helps elaborate on that:
In most cases, the executemany() method iterates through the sequence of parameters, each time passing the current parameters to the the execute() method.
An optimization is applied for inserts: The data values given by the parameter sequences are batched using multiple-row syntax.
Why is executemany slow in Python MySQLdb?
Try lowercasing the word 'values' in your query - this appears to be a bug/regression in MySQL-python 1.2.3.
MySQL-python's implementation of executemany() matches the VALUES clause with a regular expression and then just clones the list of values for each row of data, so you end up executing exactly the same query as with your first approach.
Unfortunately the regular expression lost its case-insensitive flag in that release (subsequently fixed in trunk r622 but never backported to the 1.2 branch) so it degrades to iterating over the data and firing off a query per row.
Related Topics
Selecting Nth Record in an SQL Query
Exec Failed Because the Name Not a Valid Identifier
How to Add Results of Two Select Commands in Same Query
How to Remove Leading and Trailing Quotes in SQL Server
Create Table Permission Denied in Database 'Master'
Doing "Points of Interest Along a Route" in Google Maps
Confusion with Oracle Connect By
What's the Simplest Way to Import an SQLite SQL File into a Web SQL Database
Postgres Date Overlapping Constraint
Fastest Way to Find String by Substring in SQL
Easiest Way to Copy a MySQL Database
Postgresql Copy/Transfer Data from One Database to Another
Ms SQL Server - How to Create a View from a Cte
Multiple Aggregate Functions in One SQL Query from the Same Table Using Different Conditions