How Does SQL Query Parameterisation Work

How does SQL query parameterisation work?

A parameterized query doesn't actually do string replacement. If you use string substitution, then the SQL engine actually sees a query that looks like

SELECT * FROM mytable WHERE user='wayne'

If you use a ? parameter, then the SQL engine sees a query that looks like

SELECT * FROM mytable WHERE user=<some value>

Which means that before it even sees the string "wayne", it can fully parse the query and understand, generally, what the query does. It sticks "wayne" into its own representation of the query, not the SQL string that describes the query. Thus, SQL injection is impossible, since we've already passed the SQL stage of the process.

(The above is generalized, but it more or less conveys the idea.)

How do parameterized queries help against SQL injection?

Parameterized queries do proper substitution of arguments prior to running the SQL query. It completely removes the possibility of "dirty" input changing the meaning of your query. That is, if the input contains SQL, it can't become part of what is executed because the SQL is never injected into the resulting statement.

What is parameterized query?

A parameterized query (also known as a prepared statement) is a means of pre-compiling a SQL statement so that all you need to supply are the "parameters" (think "variables") that need to be inserted into the statement for it to be executed. It's commonly used as a means of preventing SQL injection attacks.

You can read more about these on PHP's PDO page (PDO being a database abstraction layer), although you can also make use of them if you're using the mysqli database interface (see the prepare documentation).

SQL parameterization: How does this work behind the scenes?

I'm sure that the way that your command and parameters are handled will vary depending on the particular database engine and client library.

However, speaking from experience with SQL Server, I can tell you that parameters are preserved when sending commands using ADO.NET. They are not folded into the statement. For example, if you use SQL Profiler, you'll see a remote procedure call like:

exec sp_executesql N'INSERT INTO Test (Col1) VALUES (@p0)',N'@p0 nvarchar(4000)',@p0=N'p1'

Keep in mind that there are other benefits to parameterization besides preventing SQL injection. For example, the query engine has a better chance of reusing query plans for parameterized queries because the statement is always the same (just the parameter values change).

In response to update:
Query parameterization is so common I would expect MySQL (and really any database engine) to handle it similarly.

Based on the MySQL protocol documentation, it looks like prepared statements are handled using COM_PREPARE and COM_EXECUTE packets, which do support separate parameters in binary format. It's not clear if all parameterized statements will be prepared, but it does look like unprepared statements are handled by COM_QUERY which has no mention of parameter support.

When in doubt: test. If you really want to know what's sent over the wire, use a network protocol analyzer like Wireshark and look at the packets.

Regardless of how it's handled internally and any optimizations it may or may not currently provide for a given engine, there's very little (nothing?) to gain from not using parameters.

How does using parameters prevent SQL injection?

Because simply entering "drop database" into the field on its own is not sufficient. You'd have to add some other characters to convince the SQL interpreter to terminate the previous statement and start a new one (to execute the DROP). Parameterisation will, among other things, prevent those kind of sequences from successfully being injected and interpreted in the intended manner by SQL. Characters such as ' for instance, which might close a string, will be escaped so that they are considered part of a variable, not part of the SQL.

Specifically, http://bobby-tables.com/about contains a worked example. In the example, there is some PHP code to insert a row into a table:

$sql = "INSERT INTO Students (Name) VALUES ('" . $studentName . "');";
execute_sql($sql);

If $studentName is set to something normal like "John", then the final SQL string produced by the code is benign:

INSERT INTO Students (Name) VALUES ('John');

Equally, similar to your example, if $studentName was set to "DROP TABLE Students", it still wouldn't have any effect. The final SQL would be:

INSERT INTO Students (Name) VALUES ('DROP TABLE Students');

No harm done.

But...if $studentName is set to something a bit more subtle, like this:

Robert'); DROP TABLE Students;--

The final string looks like this:

INSERT INTO Students (Name) VALUES ('Robert'); DROP TABLE Students;--');

This is then passed to the SQL engine, which interprets it as two statements (with, incidentally, a comment at the end) and executes them both, with unpleasant consequences.

However, if the value in $studentName had been passed as a parameter instead of just joined to a standard string, the final query would have ended up as

INSERT INTO Students (Name) VALUES ('Robert\'); DROP TABLE Students;--');

Notice the escaped ' in the middle, so it's now considered part of the string. It no longer causes the string to stop after "t".

Therefore what gets entered into the Name field in the table will be

Robert'); DROP TABLE Students;--

The DROP TABLE statement won't get executed because the SQL interpreter never sees it - it just treats it as part of the value to be added to the table.

Parameterisation means that anything within the parameter value is treated as a string (or possibly number) only - it can never escape outside that and be considered as part of the SQL statement itself.

How do SQL parameters work internally?

The parameters make it to the SQL engine separately from the query. Execution plan calculated or reused for the parametrized query, and then query is executed by sql engine with parameters.

Why do we always prefer using parameters in SQL statements?

Using parameters helps prevent SQL Injection attacks when the database is used in conjunction with a program interface such as a desktop program or web site.

In your example, a user can directly run SQL code on your database by crafting statements in txtSalary.

For example, if they were to write 0 OR 1=1, the executed SQL would be

 SELECT empSalary from employee where salary = 0 or 1=1

whereby all empSalaries would be returned.

Further, a user could perform far worse commands against your database, including deleting it If they wrote 0; Drop Table employee:

SELECT empSalary from employee where salary = 0; Drop Table employee

The table employee would then be deleted.

In your case, it looks like you're using .NET. Using parameters is as easy as:

string sql = "SELECT empSalary from employee where salary = @salary";

using (SqlConnection connection = new SqlConnection(/* connection info */))
using (SqlCommand command = new SqlCommand(sql, connection))
{
    var salaryParam = new SqlParameter("salary", SqlDbType.Money);
    salaryParam.Value = txtMoney.Text;

    command.Parameters.Add(salaryParam);
    var results = command.ExecuteReader();
}

Dim sql As String = "SELECT empSalary from employee where salary = @salary"
Using connection As New SqlConnection("connectionString")
    Using command As New SqlCommand(sql, connection)
        Dim salaryParam = New SqlParameter("salary", SqlDbType.Money)
        salaryParam.Value = txtMoney.Text

        command.Parameters.Add(salaryParam)

        Dim results = command.ExecuteReader()
    End Using
End Using

Edit 2016-4-25:

As per George Stocker's comment, I changed the sample code to not use AddWithValue. Also, it is generally recommended that you wrap IDisposables in using statements.

How does SQLParameter prevent SQL Injection?

Basically, when you perform a SQLCommand using SQLParameters, the parameters are never inserted directly into the statement. Instead, a system stored procedure called sp_executesql is called and given the SQL string and the array of parameters.

When used as such, the parameters are isolated and treated as data, instead of having to be parsed out of the statement (and thus possibly changing it), so what the parameters contain can never be "executed". You'll just get a big fat error that the parameter value is invalid in some way.

Can parameterized statement stop all SQL injection?

The links that I have posted in my comments to the question explain the problem very well. I've summarised my feelings on why the problem persists, below:

Those just starting out may have no awareness of SQL injection.
Some are aware of SQL injection, but think that escaping is the (only?) solution. If you do a quick Google search for php mysql query, the first page that appears is the mysql_query page, on which there is an example that shows interpolating escaped user input into a query. There's no mention (at least not that I can see) of using prepared statements instead. As others have said, there are so many tutorials out there that use parameter interpolation, that it's not really surprising how often it is still used.
A lack of understanding of how parameterized statements work. Some think that it is just a fancy means of escaping values.
Others are aware of parameterized statements, but don't use them because they have heard that they are too slow. I suspect that many people have heard how incredibly slow paramterized statements are, but have not actually done any testing of their own. As Bill Karwin pointed out in his talk, the difference in performance should rarely be used as a factor when considering the use of prepared statements. The benefits of prepare once, execute many, often appear to be forgotten, as do the improvements in security and code maintainability.
Some use parameterized statements everywhere, but with interpolation of unchecked values such as table and columns names, keywords and conditional operators. Dynamic searches, such as those that allow users to specify a number of different search fields, comparison conditions and sort order, are prime examples of this.
False sense of security when using an ORM. ORMs still allow interpolation of SQL statement parts - see 5.
Programming is a big and complex subject, database management is a big and complex subject, security is a big and complex subject. Developing a secure database application is not easy - even experienced developers can get caught out.
Many of the answers on stackoverflow don't help. When people write questions that use dynamic SQL and parameter interpolation, there is often a lack of responses that suggest using parameterized statements instead. On a few occasions, I've had people rebut my suggestion to use prepared statements - usually because of the perceived unacceptable performance overhead. I seriously doubt that those asking most of these questions are in a position where the extra few milliseconds taken to prepare a parameterized statement will have a catastrophic effect on their application.

How Does SQL Query Parameterisation Work