PHP & SQL Injection - Utf8 Poc

Is it safe using addslashes in php when parameter charset is utf-8?

In fact, there are two questions in one. And so it's better to voice them separately.

For the question

Is it safe using addslashes() if charset is utf8?

The answer is YES, it is safe.

Taken by itself, with isolated example, addslashes can produce a safe sequence to be used in the SQL string literal if your charset is utf8.

However, taken as a protection measure, intended, as it is commonly used, to "process all the input data to make it safe" it is proven to be fatally insecure. Which for the question

Is it safe using addslashes() to prevent SQL injection

makes it the only answer:

NO WAY!

Simply because that this honest function has nothing to do with protection from any injections. And never has been.

What you have to understand, is that the main threat is coming not from the semi-mythical GBK vulnerability, but entirely from the misuse of this function. As it's just not intended to protect you from injections. The topic of protection is much more complex than simple string escaping.

The problem is that there are a lot of rules to keep in mind. And there are a lot of points of possible failure.

For these reasons, a simple string escaping just cannot be considered as an all-embracing protection rule.

From this point of view, parametrized queries, although not offering the 100% protection, can be considered a WAY better measure anyway, eliminating three most dangerous threats:

because numbers also covered, there is no way to inject via numeric literal
because of complete formatting, a wrongly escaped identifier becomes not a breach but a development stage error.
because of automated formatting, a human error is eliminated

The above these three reasons I consider enough for changing your approach.

Besides, properly implemented parametrized queries make your code DRAMATICALLY cleaner. Give me your addslashes-based code snippet, and I'll show you how to make it 3-5 times shorter and cleaner.

Are mysql_real_escape_string() and mysql_escape_string() sufficient for app security?

@Charles is extremely correct!

You put yourself at risk for multiple types of known SQL attacks, including, as you mentioned

SQL injection: Yes! Mysql_Escape_String probably STILL keeps you susceptible to SQL injections, depending on where you use PHP variables in your queries.

Consider this:

$sql = "SELECT number FROM PhoneNumbers " .
       "WHERE " . mysql_real_escape_string($field) . " = " . mysql_real_escape_string($value);

Can that be securely and accurately escaped that way? NO! Why? because a hacker could very well still do this:

Repeat after me:

mysql_real_escape_string() is only meant to escape variable data, NOT table names, column names, and especially not LIMIT fields.

LIKE exploits: LIKE "$data%" where $data could be "%" which would return ALL records ... which can very well be a security exploit... just imagine a Lookup by last four digits of a credit card... OOPs! Now the hackers can potentially receive every credit card number in your system! (BTW: Storing full credit cards is hardly ever recommended!)
Charset Exploits: No matter what the haters say, Internet Explorer is still, in 2011, vulnerable to Character Set Exploits, and that's if you have designed your HTML page correctly, with the equivalent of <meta name="charset" value="UTF-8"/>! These attacks are VERY nasty as they give the hacker as much control as straight SQL injections: e.g. full.

Here's some example code to demonstrate all of this:

// Contains class DBConfig; database information.
require_once('../.dbcreds');                       

$dblink = mysql_connect(DBConfig::$host, DBConfig::$user, DBConfig::$pass);
mysql_select_db(DBConfig::$db);
//print_r($argv);

$sql = sprintf("SELECT url FROM GrabbedURLs WHERE %s LIKE '%s%%' LIMIT %s",
               mysql_real_escape_string($argv[1]),
               mysql_real_escape_string($argv[2]),
               mysql_real_escape_string($argv[3]));
echo "SQL: $sql\n";
$qq = mysql_query($sql);
while (($data = mysql_fetch_array($qq)))
{
        print_r($data);
}

Here's the results of this code when various inputs are passed:

$ php sql_exploits.php url http://www.reddit.com id
SQL generated: SELECT url FROM GrabbedURLs 
               WHERE url LIKE 'http://www.reddit.com%'
               ORDER BY id;
Returns: Just URLs beginning w/ "http://www.reddit.com"

$ php sql_exploits.php url % id
SQL generated: SELECT url FROM GrabbedURLs 
               WHERE url LIKE '%%' 
               ORDER BY id;
Results: Returns every result Not what you programmed, ergo an exploit --

$ php sql_exploits.php 1=1
'http://www.reddit.com' id Results:
Returns every column and every result.

Then there are the REALLLY nasty LIMIT exploits:

$ php sql_exploits.php url 
> 'http://www.reddit.com'
> "UNION SELECT name FROM CachedDomains"
Generated SQL: SELECT url FROM GrabbedURLs 
               WHERE url LIKE 'http://reddit.com%' 
               LIMIT 1 
               UNION
               SELECT name FROM CachedDomains;
Returns:  An entirely unexpected, potentially (probably) unauthorized query
          from another, completely different table.

Whether you understand the SQL in the attacks or not is irrevelant. What this has demonstrated is that mysql_real_escape_string() is easily circumvented by even the most immature of hackers. That is because it is a REACTIVE defense mechism. It only fixes very limited and KNOWN exploits in the Database.

All escaping will NEVER be sufficient to secure databases. In fact, you can explicitly REACT to every KNOWN exploit and in the future, your code will most likely become vulnerable to attacks discovered in the future.

The proper, and only (really) , defense is a PROACTIVE one: Use Prepared Statements. Prepared statements are designed with special care so that ONLY valid and PROGRAMMED SQL is executed. This means that, when done correctly, the odds of unexpected SQL being able to be executed are drammatically reduced.

Theoretically, prepared statements that are implemented perfectly would be impervious to ALL attacks, known and unknown, as they are a SERVER SIDE technique, handled by the DATABASE SERVERS THEMSELVES and the libraries that interface with the programming language. Therefore, you're ALWAYS guaranteed to be protected against EVERY KNOWN HACK, at the bare minimum.

And it's less code:

$pdo = new PDO($dsn);

$column = 'url';
$value = 'http://www.stackoverflow.com/';
$limit = 1;

$validColumns = array('url', 'last_fetched');

// Make sure to validate whether $column is a valid search parameter.
// Default to 'id' if it's an invalid column.
if (!in_array($column, $validColumns) { $column = 'id'; }

$statement = $pdo->prepare('SELECT url FROM GrabbedURLs ' .
                           'WHERE ' . $column . '=? ' .
                           'LIMIT ' . intval($limit));
$statement->execute(array($value));
while (($data = $statement->fetch())) { }

Now that wasn't so hard was it? And it's forty-seven percent less code (195 chars (PDO) vs 375 chars (mysql_). That's what I call, "full of win".

EDIT: To address all the controversy this answer stirred up, allow me to reiterate what I have already said:

Using prepared statements allows one to harness the protective measures of
the SQL server itself, and therefore
you are protected from things that the
SQL server people know about. Because
of this extra level of protection, you
are far safer than by just using
escaping, no matter how thorough.

export sql in php UTF-8

Start your query after this:

$db = new MySQLi("host", "username", "password", "db");

if(!$db)
{
    die("your_error_msg" . mysqli_error($db));
}

$db->set_charset("utf8");'

EDIT

<?php
    //ENTER THE RELEVANT INFO BELOW
    $mysqlDatabaseName ='xxxxxx';
    $mysqlUserName ='xxxxxxx';
    $mysqlPassword ='myPassword';
    $mysqlHostName ='xxxxxxxx.net';
    $mysqlExportPath ='chooseFilenameForBackup.sql';

    //DO NOT EDIT BELOW THIS LINE
    //Export the database and output the status to the page
    $command='mysqldump --opt -h' .$mysqlHostName .' -u' .$mysqlUserName .' -p' .$mysqlPassword .' --default-character-set=utf8 ' .$mysqlDatabaseName .' > ~/' .$mysqlExportPath;
    exec($command);
 ?>

From: export database using php code in .sql Extension?, but I added the --default-character-set=utf8 part

Having hard time solving sql injection php

Updated because off an comment from Jason McCreary
prepared statements are more safe but always force an type when you bind the values.
But you still need to watch out for second order SQL injections they are still possible then even if you make use off prepared statements

Id should be an integer just cast it to an int and filter out NULL bytes, NULL bytes are also evil things

$id = (int)(str_replace("\0", "", $_GET['id']));

Are PDO prepared statements sufficient to prevent SQL injection?

The short answer is YES, PDO prepares are secure enough if used properly.

I'm adapting this answer to talk about PDO...

The long answer isn't so easy. It's based off an attack demonstrated here.

The Attack

So, let's start off by showing the attack...

$pdo->query('SET NAMES gbk');
$var = "\xbf\x27 OR 1=1 /*";
$query = 'SELECT * FROM test WHERE name = ? LIMIT 1';
$stmt = $pdo->prepare($query);
$stmt->execute(array($var));

In certain circumstances, that will return more than 1 row. Let's dissect what's going on here:

Selecting a Character Set
```
$pdo->query('SET NAMES gbk');
```
For this attack to work, we need the encoding that the server's expecting on the connection both to encode ' as in ASCII i.e. 0x27 and to have some character whose final byte is an ASCII \ i.e. 0x5c. As it turns out, there are 5 such encodings supported in MySQL 5.6 by default: big5, cp932, gb2312, gbk and sjis. We'll select gbk here.
Now, it's very important to note the use of SET NAMES here. This sets the character set ON THE SERVER. There is another way of doing it, but we'll get there soon enough.
The Payload
The payload we're going to use for this injection starts with the byte sequence 0xbf27. In gbk, that's an invalid multibyte character; in latin1, it's the string ¿'. Note that in latin1 and gbk, 0x27 on its own is a literal ' character.
We have chosen this payload because, if we called addslashes() on it, we'd insert an ASCII \ i.e. 0x5c, before the ' character. So we'd wind up with 0xbf5c27, which in gbk is a two character sequence: 0xbf5c followed by 0x27. Or in other words, a valid character followed by an unescaped '. But we're not using addslashes(). So on to the next step...
$stmt->execute()
The important thing to realize here is that PDO by default does NOT do true prepared statements. It emulates them (for MySQL). Therefore, PDO internally builds the query string, calling mysql_real_escape_string() (the MySQL C API function) on each bound string value.
The C API call to mysql_real_escape_string() differs from addslashes() in that it knows the connection character set. So it can perform the escaping properly for the character set that the server is expecting. However, up to this point, the client thinks that we're still using latin1 for the connection, because we never told it otherwise. We did tell the server we're using gbk, but the client still thinks it's latin1.
Therefore the call to mysql_real_escape_string() inserts the backslash, and we have a free hanging ' character in our "escaped" content! In fact, if we were to look at $var in the gbk character set, we'd see:
```
縗' OR 1=1 /*
```
Which is exactly what the attack requires.
The Query
This part is just a formality, but here's the rendered query:
```
SELECT * FROM test WHERE name = '縗' OR 1=1 /*' LIMIT 1
```

Congratulations, you just successfully attacked a program using PDO Prepared Statements...

The Simple Fix

Now, it's worth noting that you can prevent this by disabling emulated prepared statements:

$pdo->setAttribute(PDO::ATTR_EMULATE_PREPARES, false);

This will usually result in a true prepared statement (i.e. the data being sent over in a separate packet from the query). However, be aware that PDO will silently fallback to emulating statements that MySQL can't prepare natively: those that it can are listed in the manual, but beware to select the appropriate server version).

The Correct Fix

The problem here is that we used SET NAMES instead of C API's mysql_set_charset(). Otherwise, the attack would not succeed. But the worst part is that PDO didn't expose the C API for mysql_set_charset() until 5.3.6, so in prior versions it cannot prevent this attack for every possible command!
It's now exposed as a DSN parameter, which should be used instead of SET NAMES...

This is provided we are using a MySQL release since 2006. If you're using an earlier MySQL release, then a bug in mysql_real_escape_string() meant that invalid multibyte characters such as those in our payload were treated as single bytes for escaping purposes even if the client had been correctly informed of the connection encoding and so this attack would still succeed. The bug was fixed in MySQL 4.1.20, 5.0.22 and 5.1.11.

The Saving Grace

As we said at the outset, for this attack to work the database connection must be encoded using a vulnerable character set. utf8mb4 is not vulnerable and yet can support every Unicode character: so you could elect to use that instead—but it has only been available since MySQL 5.5.3. An alternative is utf8, which is also not vulnerable and can support the whole of the Unicode Basic Multilingual Plane.

Alternatively, you can enable the NO_BACKSLASH_ESCAPES SQL mode, which (amongst other things) alters the operation of mysql_real_escape_string(). With this mode enabled, 0x27 will be replaced with 0x2727 rather than 0x5c27 and thus the escaping process cannot create valid characters in any of the vulnerable encodings where they did not exist previously (i.e. 0xbf27 is still 0xbf27 etc.)—so the server will still reject the string as invalid. However, see @eggyal's answer for a different vulnerability that can arise from using this SQL mode (albeit not with PDO).

Safe Examples

The following examples are safe:

mysql_query('SET NAMES utf8');
$var = mysql_real_escape_string("\xbf\x27 OR 1=1 /*");
mysql_query("SELECT * FROM test WHERE name = '$var' LIMIT 1");

Because the server's expecting utf8...

mysql_set_charset('gbk');
$var = mysql_real_escape_string("\xbf\x27 OR 1=1 /*");
mysql_query("SELECT * FROM test WHERE name = '$var' LIMIT 1");

Because we've properly set the character set so the client and the server match.

$pdo->setAttribute(PDO::ATTR_EMULATE_PREPARES, false);
$pdo->query('SET NAMES gbk');
$stmt = $pdo->prepare('SELECT * FROM test WHERE name = ? LIMIT 1');
$stmt->execute(array("\xbf\x27 OR 1=1 /*"));

Because we've turned off emulated prepared statements.

$pdo = new PDO('mysql:host=localhost;dbname=testdb;charset=gbk', $user, $password);
$stmt = $pdo->prepare('SELECT * FROM test WHERE name = ? LIMIT 1');
$stmt->execute(array("\xbf\x27 OR 1=1 /*"));

Because we've set the character set properly.

$mysqli->query('SET NAMES gbk');
$stmt = $mysqli->prepare('SELECT * FROM test WHERE name = ? LIMIT 1');
$param = "\xbf\x27 OR 1=1 /*";
$stmt->bind_param('s', $param);
$stmt->execute();

Because MySQLi does true prepared statements all the time.

Wrapping Up

If you:

Use Modern Versions of MySQL (late 5.1, all 5.5, 5.6, etc) AND PDO's DSN charset parameter (in PHP ≥ 5.3.6)

Don't use a vulnerable character set for connection encoding (you only use utf8 / latin1 / ascii / etc)

Enable NO_BACKSLASH_ESCAPES SQL mode

You're 100% safe.

Otherwise, you're vulnerable even though you're using PDO Prepared Statements...

Addendum

I've been slowly working on a patch to change the default to not emulate prepares for a future version of PHP. The problem that I'm running into is that a LOT of tests break when I do that. One problem is that emulated prepares will only throw syntax errors on execute, but true prepares will throw errors on prepare. So that can cause issues (and is part of the reason tests are borking).

Questions switching to PDO from mysql_query

This web app will be installed on systems of different configurations and some will (unfortunately) have [magic quotes] on while others will be off.

As I noted in a comment on another answer, the PHP manual has an entire section on dealing with dealing with the horror that is magic quotes. You can usually either disable it locally in an .htaccess or correct the data as it comes in. I personally would not do business with a hosting provider that has it enabled by default.

Previously I was doing the whole if statement to addslashes() when it's off to input data

Yikes! addslashes is not adequate protection. When using the old MySQL extension, you need to use mysql_real_escape_string after setting the character set. Failing to do this can leave you open to a huge vulnerability.

... what needs to be done with PDO queries like these:

Other than setting the connection character set, nothing! PDO will use real prepared statements when it can. This means that it will first send the query with placeholders to the server so it can process it, then it will later separately send the arguments over. This makes the query immune from SQL injection.

(Some databases don't support prepared statements with PDO. For these, PDO will process the query, replacing the placeholders with quoted, escaped values. The result is the same -- immunity from SQL injection.)

Also, how necessary is it to close the database handler in the end? What is the detriment of not doing so?

Just like the other PHP database handlers, there is no need to close the connection -- when the script ends, the connection will close. (Don't even think about persistent connections.)

Is PDO a good option for a CMS that will be installed on many different server setups? Is PDO ubiquitous enough where it will be enabled on most servers?

PDO became standard in PHP 5.1, but that doesn't mean it's always available. Most Linux distributions split out all of PHP's database access options so that installing them also installs the mandatory libraries. Sometimes, incompetent or inexperienced hosting providers won't install the PDO packages to go along with the other database access options, not realizing what they are doing. This is usually corrected by simply asking them to enable PDO, and switching to another host if they are unwilling or unable.

Many modern frameworks prefer building on top of PDO, but make other options available. For example, Zend Framework's Zend_Db has adapters for PDO, mysqli, and a few others. If you fear PDO won't always be available, using an adapter layer might work well for you. (Like most ZF components, Zend_Db doesn't rely heavily on other ZF components, meaning you can easily bundle it with your code.)