Pandas Read_SQL With Parameters

Pandas read_sql with parameters

The read_sql documentation says the params argument can be a list, tuple or dict.

To pass the values in the sql query, there are different syntaxes possible: ?, :1, :name, %s, %(name)s (see PEP249).

But not every database driver supports all of these possibilities; which syntax works depends on the driver you are using (psycopg2 in your case, I suppose).
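One way to check which style a driver declares is its module-level paramstyle attribute, which PEP 249 requires. The sketch below uses the standard-library sqlite3 module only because it needs no server; the table `t` and its rows are made up for illustration.

```python
import sqlite3

# Every DB-API driver exposes a module-level `paramstyle` attribute (PEP 249).
print(sqlite3.paramstyle)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, name TEXT)")
conn.executemany("INSERT INTO t VALUES (?, ?)", [(1, "a"), (2, "b"), (3, "c")])

# Positional "?" placeholders take their values from a tuple.
qmark_rows = conn.execute(
    "SELECT name FROM t WHERE id >= ? ORDER BY id", (2,)).fetchall()

# Named ":name" placeholders take their values from a dict.
named_rows = conn.execute(
    "SELECT name FROM t WHERE id >= :min_id ORDER BY id", {"min_id": 2}).fetchall()
```

sqlite3 happens to accept both of these styles; other drivers may accept only one, so it is worth checking the attribute before writing the query.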

In your second case, when using a dict, you are using 'named arguments'. According to the psycopg2 documentation, it supports the %(name)s style (and so not :name, I suppose); see http://initd.org/psycopg/docs/usage.html#query-parameters.

So using that style should work:

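from datetime import datetime
import pandas.io.sql as psql

# db is an already-open psycopg2 connection
df = psql.read_sql(('select "Timestamp","Value" from "MyTable" '
                    'where "Timestamp" BETWEEN %(dstart)s AND %(dfinish)s'),
                   db, params={"dstart": datetime(2014, 6, 24, 16, 0), "dfinish": datetime(2014, 6, 24, 17, 0)},
                   index_col=['Timestamp'])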

How to use params from pandas.read_sql to import data with Python pandas from SQLite table between dates

You have the syntax for named parameter passing wrong. See the example in the sqlite3 docs:

# And this is the named style:
cur.execute("select * from people where name_last=:who and age=:age", {"who": who, "age": age})

So for your case it should be:

query = '''SELECT *
FROM "Historique"
WHERE "index" BETWEEN :dstart AND :dfinish'''

pd.read_sql(query, con=conn, params={"dstart":start, "dfinish":end})
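A minimal, self-contained sketch of the same call against an in-memory SQLite database follows. The table contents and the choice to store timestamps as ISO-formatted text are assumptions made for the example, not details from the original question.

```python
import sqlite3

import pandas as pd

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE "Historique" ("index" TEXT, value REAL)')
conn.executemany('INSERT INTO "Historique" VALUES (?, ?)', [
    ("2014-06-24 15:30:00", 1.0),
    ("2014-06-24 16:15:00", 2.0),
    ("2014-06-24 16:45:00", 3.0),
    ("2014-06-24 17:30:00", 4.0),
])

query = '''SELECT *
FROM "Historique"
WHERE "index" BETWEEN :dstart AND :dfinish
ORDER BY "index"'''

# ISO-formatted text compares in chronological order, so BETWEEN works on it.
result = pd.read_sql(query, con=conn,
                     params={"dstart": "2014-06-24 16:00:00",
                             "dfinish": "2014-06-24 17:00:00"})
```

Only the two rows inside the 16:00 to 17:00 window should come back.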

How to automate parameters passed into pandas.read_sql?

You can use parametrized queries by wrapping the query in sqlalchemy.text and converting lists to tuples. For example:

import pandas as pd
import sqlalchemy

def my_func(conn, min_number, letters):
    # convert lists to tuples
    letters = tuple(letters)

    # wrap sql in sqlalchemy.text
    sql = sqlalchemy.text("""
        SELECT *
        FROM letters
        WHERE
            number >= :min_number AND
            letter in :letters""")

    # read and return the resulting dataframe, binding only the names the query uses
    params = {"min_number": min_number, "letters": letters}
    df = pd.read_sql(sql, conn, params=params)
    return df

my_func(conn, 10, ['a', 'b', 'c', 'x', 'y', 'z'])

Output:

  letter  number
0      x      23
1      y      24
2      z      25

For completeness of the example, the following was used as a test table:

import string

df = pd.DataFrame({
    'letter': list(string.ascii_lowercase),
    'number': range(len(string.ascii_lowercase))})
df.to_sql('letters', conn, index=False)
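Whether a tuple can be bound to an IN clause depends on the driver. As an alternative sketch, SQLAlchemy's bindparam accepts expanding=True, which asks SQLAlchemy to render one placeholder per list element. The in-memory SQLite engine below is an assumption chosen so the example is self-contained, and this has not been checked against every backend.

```python
import string

import pandas as pd
import sqlalchemy

# In-memory SQLite engine, used only so the sketch runs without a server.
engine = sqlalchemy.create_engine("sqlite://")

with engine.connect() as conn:
    # Same test table as above: letters a..z numbered 0..25.
    pd.DataFrame({
        "letter": list(string.ascii_lowercase),
        "number": range(len(string.ascii_lowercase)),
    }).to_sql("letters", conn, index=False)

    # Mark :letters as "expanding" so a plain list can be bound to it.
    sql = sqlalchemy.text("""
        SELECT letter, number
        FROM letters
        WHERE number >= :min_number AND letter IN :letters
        ORDER BY number
    """).bindparams(sqlalchemy.bindparam("letters", expanding=True))

    result = pd.read_sql(
        sql, conn,
        params={"min_number": 10, "letters": ["a", "b", "c", "x", "y", "z"]})
```

With this approach the list is passed as-is, without converting it to a tuple first.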

Update: Here's a possible workaround for Oracle to make it work with lists:

def get_query(sql, **kwargs):
    # NOTE: values are pasted into the SQL text, so only use trusted input
    for k, v in kwargs.items():
        vs = "','".join(v)
        sql = sql.replace(f':{k}', f"('{vs}')")
    return sql

def my_func(conn, min_number, letters):
    sql_template = """
        SELECT *
        FROM letters
        WHERE
            number >= :min_number AND
            letter in :letters
    """
    # pass list variables to `get_query` function as named parameters
    # to get parameters replaced with ('value1', 'value2', ..., 'valueN')
    sql = sqlalchemy.text(
        get_query(sql_template, letters=letters))

    # :letters is already inlined, so only min_number remains to bind
    df = pd.read_sql(sql, conn, params={"min_number": min_number})
    return df

my_func(conn, 10, ['a', 'b', 'c', 'x', 'y', 'z'])

Update 2: Here's the get_query function that works with both strings and numbers (enclosing in quotes strings, but not numbers):

def get_query(sql, **kwargs):
    # enclose in quotes strings, but not numbers
    def q(x):
        quote = '' if isinstance(x, (int, float)) else "'"
        return f'{quote}{x}{quote}'

    # replace with values
    for k, v in kwargs.items():
        sql = sql.replace(f':{k}', f"({','.join([q(x) for x in v])})")

    return sql

For example:

sql = """
SELECT *
FROM letters
WHERE
    number in :numbers AND
    letter in :letters
"""

get_query(sql,
          numbers=[1, 2, 3],
          letters=['A', 'B', 'C'])

Output:

SELECT *
FROM letters
WHERE
    number in (1,2,3) AND
    letter in ('A','B','C')
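Because get_query pastes values straight into the SQL text, a value containing a quote can break or alter the query. One possible variation, sketched here under the assumption that the driver accepts :name placeholders, is to expand each list into numbered named parameters and let the driver bind the values. The helper name expand_in is hypothetical, and sqlite3 is used only to keep the example self-contained.

```python
import re
import sqlite3
import string

def expand_in(sql, **lists):
    """Hypothetical helper: turn `:name` into `(:name_0, :name_1, ...)`.

    Returns the rewritten SQL and a dict of the values to bind.
    """
    params = {}
    for name, values in lists.items():
        keys = [f"{name}_{i}" for i in range(len(values))]
        group = "(" + ", ".join(f":{k}" for k in keys) + ")"
        # \b keeps `:letters` from matching inside a longer name
        sql = re.sub(rf":{re.escape(name)}\b", group, sql)
        params.update(zip(keys, values))
    return sql, params

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE letters (letter TEXT, number INTEGER)")
conn.executemany("INSERT INTO letters VALUES (?, ?)",
                 [(c, i) for i, c in enumerate(string.ascii_lowercase)])

sql, params = expand_in(
    "SELECT letter, number FROM letters "
    "WHERE number IN :numbers AND letter IN :letters ORDER BY number",
    numbers=[1, 2, 3], letters=["b", "c", "x"])
rows = conn.execute(sql, params).fetchall()
```

The returned pair can be handed to pd.read_sql(sql, conn, params=params) in the same way.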

Python pandas pd.read_sql with parameter from dropdown text value

When dropdown1_value is a str, use:

sql = f'''
SELECT CONVERT(int,month) as month
,CONVERT(int, revenue) as revenue
FROM dbo.Sales_data where region = '{dropdown1_value}'
'''
df = pd.read_sql(sql, cnxn)

When dropdown1_value is a list, use:

sql = f'''
SELECT CONVERT(int,month) as month
,CONVERT(int, revenue) as revenue
FROM dbo.Sales_data where region in {tuple(dropdown1_value)}
'''
df = pd.read_sql(sql, cnxn)
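Two caveats apply to the f-string versions: a dropdown value containing a quote changes the SQL, and a one-element list formats as ('US',), which SQL parsers generally reject because of the trailing comma. A hedged alternative is to build ? placeholders and pass the values separately. This assumes the connection uses a driver with the qmark style (pyodbc does); the helper name build_region_query is hypothetical.

```python
def build_region_query(dropdown_value):
    """Hypothetical helper: return (sql, params) for a str or a list of regions."""
    # Normalise a single string to a one-element list.
    if isinstance(dropdown_value, str):
        values = [dropdown_value]
    else:
        values = list(dropdown_value)

    # One "?" per value; the values themselves never enter the SQL text.
    placeholders = ", ".join("?" * len(values))
    sql = f"""
    SELECT CONVERT(int, month) AS month
          ,CONVERT(int, revenue) AS revenue
    FROM dbo.Sales_data
    WHERE region IN ({placeholders})
    """
    return sql, tuple(values)

sql_one, params_one = build_region_query("US")
sql_many, params_many = build_region_query(["US", "EU"])
```

The result would then be used as pd.read_sql(sql_one, cnxn, params=params_one), which covers both the str and the list case with one code path.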

How can I use multiple parameters using pandas pd.read_sql_query?

Try passing the strings as a tuple; you can also take out the () in the query. So you could do something like:

query = "SELECT LicenseNo FROM License_Mgmt_Reporting.dbo.MATLAB_NNU_OPTIONS WHERE Region = ? and FeatureName = ? and NewUser = ?"
region = 'US'
feature = 'tall'
newUser = 'john'
data_df = pd.read_sql_query(query, engine, params=(region, feature, newUser))

Binding list to params in Pandas read_sql_query with other params

Break this up into three parts to help isolate the problem and improve readability:

  1. Build the SQL string
  2. Set parameter values
  3. Execute pandas.read_sql_query


Build SQL

First ensure the ? placeholders are set correctly. Use str.format with str.join and len to fill in the right number of ?s based on the length of member_list. The examples below assume member_list has three elements.

Example

member_list = (1,2,3)
sql = """select member_id, yearmonth
from queried_table
where yearmonth between {0} and {0}
and member_id in ({1})"""
sql = sql.format('?', ','.join('?' * len(member_list)))
print(sql)

Returns

select member_id, yearmonth
from queried_table
where yearmonth between ? and ?
and member_id in (?,?,?)


Set Parameter Values

Now ensure the parameter values are organized into a flat tuple, in the same order as the ? placeholders.

Example

# generator to flatten values of irregular nested sequences,
# modified from answers http://stackoverflow.com/questions/952914/making-a-flat-list-out-of-list-of-lists-in-python
def flatten(l):
    for el in l:
        # strings are iterable too; yield them whole to avoid endless recursion
        if isinstance(el, (str, bytes)):
            yield el
            continue
        try:
            yield from flatten(el)
        except TypeError:
            yield el

params = tuple(flatten((201601, 201603, member_list)))
print(params)

Returns

(201601, 201603, 1, 2, 3)


Execute

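Finally, bring the sql and params values together in the read_sql_query call. Pass params by keyword, because the third positional argument of read_sql_query is index_col:

query = pd.read_sql_query(sql, db2conn, params=params)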
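The three steps can be exercised end to end against an in-memory SQLite database, as in the sketch below. The table rows are invented for the example, and tuple unpacking stands in for the flatten generator since the nesting here is only one level deep.

```python
import sqlite3

import pandas as pd

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE queried_table (member_id INTEGER, yearmonth INTEGER)")
conn.executemany("INSERT INTO queried_table VALUES (?, ?)",
                 [(1, 201601), (2, 201602), (3, 201604), (4, 201602)])

member_list = (1, 2, 3)

# 1. Build the SQL string with one "?" per member.
sql = """select member_id, yearmonth
from queried_table
where yearmonth between ? and ?
and member_id in ({})
order by member_id""".format(",".join("?" * len(member_list)))

# 2. Flat tuple of values, in placeholder order.
params = (201601, 201603, *member_list)

# 3. Execute, passing params by keyword.
result = pd.read_sql_query(sql, conn, params=params)
```

Member 3 falls outside the month range and member 4 is not in member_list, so only members 1 and 2 should be returned.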

