In a project I am working with duckdb to perform some queries on dataframes. For one of the queries, I have some user-input that I need to add to the query. That is why I am wondering if SQL-Injection is possible in this case. Is there a way a user could harm the application or the system through the input? And if so, how could I prevent this case? It seems that duckdb has no PreparedStatement for queries on dataframes.
I already looked up in the documentation (https://duckdb.org/docs/api/python/overview.html) but couldn’t find anything useful. The method duckdb.execute(query, parameters)
only seems to work on databases with a real sql-connection and not on dataframes.
There is another question on stackoverflow (Syntax for Duckdb > Python SQL with Parameter\Variable) about this topic but the answer only works on real sql-connections and the version with f-strings seems insecure to me.
Here is a small code sample to show what I mean:
import duckdb
import pandas as pd
df_data = pd.DataFrame({'id': [1, 2, 3, 4], 'student': ['student_a', 'student_a', 'student_b', 'student_c']})
user_input = 3 # fetch some user_input here
# How to prevent sql-injection, if its even possible in this case?
result = duckdb.query("SELECT * FROM df_data WHERE id={}".format(user_input))
So how would you approach this problem? Is sql-injection even possible? Thanks for your help and feel free to ask for more details, if you need some more information!
EDIT: Fixed a syntax error in the code
The method
duckdb.execute(query, parameters)
only seems to work on databases with a real sql-connection and not on dataframes.
It seems it’s possible:
>>> duckdb.execute("""SELECT * FROM df_data WHERE id=?""", (user_input,)).df()
id student
0 3 student_b