How to Escape Strings for SQLite Table/Column Names in Python

How do you escape strings for SQLite table/column names in Python?

To convert any string into a SQLite identifier:

  • Ensure the string can be encoded as UTF-8.
  • Ensure the string does not include any NUL characters.
  • Replace all " with "".
  • Wrap the entire thing in double quotes.

Implementation

import codecs

def quote_identifier(s, errors="strict"):
encodable = s.encode("utf-8", errors).decode("utf-8")

nul_index = encodable.find("\x00")

if nul_index >= 0:
error = UnicodeEncodeError("NUL-terminated utf-8", encodable,
nul_index, nul_index + 1, "NUL not allowed")
error_handler = codecs.lookup_error(errors)
replacement, _ = error_handler(error)
encodable = encodable.replace("\x00", replacement)

return "\"" + encodable.replace("\"", "\"\"") + "\""

Given a string single argument, it will escape and quote it correctly or raise an exception. The second argument can be used to specify any error handler registered in the codecs module. The built-in ones are:

  • 'strict': raise an exception in case of an encoding error
  • 'replace': replace malformed data with a suitable replacement marker, such as '?' or '\ufffd'
  • 'ignore': ignore malformed data and continue without further notice
  • 'xmlcharrefreplace': replace with the appropriate XML character reference (for encoding only)
  • 'backslashreplace': replace with backslashed escape sequences (for encoding only)

This doesn't check for reserved identifiers, so if you try to create a new SQLITE_MASTER table it won't stop you.

Example Usage

import sqlite3

def test_identifier(identifier):
"Tests an identifier to ensure it's handled properly."

with sqlite3.connect(":memory:") as c:
c.execute("CREATE TABLE " + quote_identifier(identifier) + " (foo)")
assert identifier == c.execute("SELECT name FROM SQLITE_MASTER").fetchone()[0]

test_identifier("'Héllo?'\\\n\r\t\"Hello!\" -☃") # works
test_identifier("北方话") # works
test_identifier(chr(0x20000)) # works

print(quote_identifier("Fo\x00o!", "replace")) # prints "Fo?o!"
print(quote_identifier("Fo\x00o!", "ignore")) # prints "Foo!"
print(quote_identifier("Fo\x00o!")) # raises UnicodeEncodeError
print(quote_identifier(chr(0xD800))) # raises UnicodeEncodeError

Observations and References

  • SQLite identifiers are TEXT, not binary.

    • SQLITE_MASTER schema in the FAQ
    • Python 2 SQLite API yelled at me when I gave it bytes it couldn't decode as text.
    • Python 3 SQLite API requires queries be strs, not bytes.
  • SQLite identifiers are quoted using double-quotes.

    • SQL as Understood by SQLite
  • Double-quotes in SQLite identifiers are escaped as two double quotes.
  • SQLite identifiers preserve case, but they are case-insensitive towards ASCII letters. It is possible to enable unicode-aware case-insensitivity.

    • SQLite FAQ Question #18
  • SQLite does not support the NUL character in strings or identifiers.

    • SQLite Ticket 57c971fc74
  • sqlite3 can handle any other unicode string as long as it can be properly encoded to UTF-8. Invalid strings could cause crashes between Python 3.0 and Python 3.1.2 or thereabouts. Python 2 accepted these invalid strings, but this is considered a bug.

    • Python Issue #12569
    • Modules/_sqlite/cursor.c
    • I tested it a bunch.

Can I use python to simplify an SQLite query involving all columns?

There are a couple of stack overflow issues that address this already. See here and here. In short, column names are not a security issue, so if you want to do anything dynamic with them, you need to use regular python string operations for that part and use parameter binding for just the values.
Something like this should work.

for column in data.description:
query = f'''SELECT {column[0]} FROM table1 WHERE {column[0]} = ?'''
cur.execute(query,(str(word),))
output = cur.fetchone()
print(output)

Python Sqlite Escape String Select Column

You cannot parameterize the column or table names. You have to use string formatting:

def ChangeItemQuantity(self, column_name, item_name, incrament_quantity):
try:
self.c.execute('''
SELECT {column_name}
FROM items
WHERE itemName=?
'''.format(column_name=column_name), (item_name,))

Make sure though that you either trust your source, or validate the column name before inserting into the query.

SQL String throws error with special character in SQLIte column name

Here is the solution. The column name just needs double-quotes around it i.e. on the inside of the single quotes such that c2='"Prob-Column"'. Anyway hope this helps someone else.

#imports
import numpy as np
import pandas as pd
from sqlalchemy import create_engine

#create df
df = pd.DataFrame(np.random.rand(10,2),columns=['Column1','Prob-Column'])
# create engine to connect to db
engine = create_engine('sqlite://')
#create table in db
df.to_sql('my_table',engine,if_exists='replace')

# variables
vals = '(?)'
fil = ('key',)

# create sql string
sq = 'SELECT * FROM {t} WHERE {c1} IN {vals} GROUP BY {c2}'\
.format(t='my_table',c1='Column1',c2='"Prob-Column"',vals = vals)

#write query to pandas df
df = pd.read_sql_query(sq,engine,params=(fil))

Substituting column names in Python sqlite3 query

As Rob just indicated in his comment, there was a related SO post that contains my answer. These substitution constructions are called "placeholders," which is why I did not find the answer on SO initially. There is no placeholder pattern for column names, because dynamically specifying columns is not a code safety issue:

It comes down to what "safe" means. The conventional wisdom is that
using normal python string manipulation to put values into your
queries is not "safe". This is because there are all sorts of things
that can go wrong if you do that, and such data very often comes from
the user and is not in your control. You need a 100% reliable way of
escaping these values properly so that a user cannot inject SQL in a
data value and have the database execute it. So the library writers do
this job; you never should.

If, however, you're writing generic helper code to operate on things
in databases, then these considerations don't apply as much. You are
implicitly giving anyone who can call such code access to everything
in the database; that's the point of the helper code. So now the
safety concern is making sure that user-generated data can never be
used in such code. This is a general security issue in coding, and is
just the same problem as blindly execing a user-input string. It's a
distinct issue from inserting values into your queries, because there
you want to be able to safely handle user-input data.

So, the solution is that there is no problem in the first place: inject the values using string formatting, be happy, and move on with your life.

Can't escape table or column names when importing data model from XML using SQLite

I posted a solution that seems to work in the other question. Here's the function:

def quote_identifier(s, errors="strict"):
encodable = s.encode("utf-8", errors).decode("utf-8")

nul_index = encodable.find("\x00")

if nul_index >= 0:
error = UnicodeEncodeError("NUL-terminated utf-8", encodable,
nul_index, nul_index + 1, "NUL not allowed")
error_handler = codecs.lookup_error(errors)
replacement, _ = error_handler(error)
encodable = encodable.replace("\x00", replacement)

return "\"" + encodable.replace("\"", "\"\"") + "\""

It doesn't warn you about reserved identifiers, so you have to worry about that yourself.

Escaping chars in Python and sqlite

You're doing it wrong. Literally. You should be using parameters, like this:

c.execute("UPDATE movies SET rating = ? WHERE name = ?", (8.7, "'Allo 'Allo! (1982)"))

Like that, you won't need to do any quoting at all and (if those values are coming from anyone untrusted) you'll be 100% safe (here) from SQL injection attacks too.



Related Topics



Leave a reply



Submit