SQL-Injection in duckdb-queries on pandas dataframes

In a project I am working with duckdb to perform some queries on dataframes. For one of the queries, I have some user-input that I need to add to the query. That is why I am wondering if SQL-Injection is possible in this case. Is there a way a user could harm the application or the system through the input? And if so, how could I prevent this case? It seems that duckdb has no PreparedStatement for queries on dataframes.

I already looked up in the documentation (https://duckdb.org/docs/api/python/overview.html) but couldn’t find anything useful. The method duckdb.execute(query, parameters) only seems to work on databases with a real sql-connection and not on dataframes.

There is another question on stackoverflow (Syntax for Duckdb > Python SQL with Parameter\Variable) about this topic but the answer only works on real sql-connections and the version with f-strings seems insecure to me.

Here is a small code sample to show what I mean:

import duckdb
import pandas as pd

df_data = pd.DataFrame({'id': [1, 2, 3, 4], 'student': ['student_a', 'student_a', 'student_b', 'student_c']})
    
user_input = 3  # fetch some user_input here
    
# How to prevent sql-injection, if its even possible in this case?
result = duckdb.query("SELECT * FROM df_data WHERE id={}".format(user_input))

So how would you approach this problem? Is sql-injection even possible? Thanks for your help and feel free to ask for more details, if you need some more information!

EDIT: Fixed a syntax error in the code

The method duckdb.execute(query, parameters) only seems to work on databases with a real sql-connection and not on dataframes.

It seems it’s possible:

>>> duckdb.execute("""SELECT * FROM df_data WHERE id=?""", (user_input,)).df()

   id    student
0   3  student_b

Leave a Comment