What Are the Best PHP Input Sanitizing Functions

How can I sanitize user input with PHP?

It's a common misconception that user input can be filtered. PHP even had a "feature" called magic quotes (deprecated in PHP 5.3 and removed in 5.4) that builds on this idea. It's nonsense. Forget about filtering (or cleaning, or whatever people call it).

What you should do, to avoid problems, is quite simple: whenever you embed a piece of data within foreign code, you must treat it according to the formatting rules of that code. But you must understand that such rules can be too complicated to follow manually. For example, in SQL, the rules for strings, numbers and identifiers are all different. For your convenience, in most cases there is a dedicated tool for such embedding. For example, when you need to use a PHP variable in an SQL query, you have to use a prepared statement, which will take care of all the proper formatting/treatment.
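As a minimal sketch of the prepared-statement case, assuming PDO with the SQLite driver is available: the in-memory database, the users table, and the sample value are made up for illustration and are not from the original question.

```php
<?php
// Sketch only: an in-memory SQLite database stands in for a real one.
// The table and column names are hypothetical.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)');

// Hostile-looking input, as it might arrive from a form.
$name = "Robert'); DROP TABLE users;--";

// The value travels separately from the SQL text, so it is never parsed as SQL.
$insert = $pdo->prepare('INSERT INTO users (name) VALUES (:name)');
$insert->execute([':name' => $name]);

// Reading it back returns the exact string that was supplied.
$select = $pdo->prepare('SELECT name FROM users WHERE name = :name');
$select->execute([':name' => $name]);
$stored = $select->fetchColumn();
```

The value is stored and read back unchanged, with no escaping function called on it at any point.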

Another example is HTML: If you embed strings within HTML markup, you must escape them with htmlspecialchars. This means that every single echo or print statement that outputs such data should use htmlspecialchars.
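A small illustration of that rule, with a made-up comment string standing in for user data:

```php
<?php
// Hypothetical user-supplied text containing markup and special characters.
$comment = '<script>alert("xss")</script> & more';

// Escape at the moment of output, stating the quote style and charset explicitly.
$safe = htmlspecialchars($comment, ENT_QUOTES, 'UTF-8');

echo "<p>{$safe}</p>\n";
```

The browser then shows the script tag as text instead of running it.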

A third example could be shell commands: If you are going to embed strings (such as arguments) in external commands, and call them with exec, then you must use escapeshellarg (for individual arguments) and escapeshellcmd (for a whole command string).
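Here is a rough sketch of the shell case, assuming a Unix-like system where exec is enabled and a printf command is available. The argument value is invented and deliberately contains a quote and a command separator.

```php
<?php
// Hypothetical argument that tries to smuggle in a second command.
$argument = "it's mine; echo INJECTED";

// escapeshellarg() quotes the value so the shell sees one literal argument.
$command = 'printf %s ' . escapeshellarg($argument);

exec($command, $outputLines, $exitCode);
// $outputLines holds what printf received as its single argument.
```

The shell receives the whole value as one argument, so the text after the semicolon is printed rather than executed.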

Also, a very compelling example is JSON. The rules are so numerous and complicated that you would never be able to follow them all manually. That's why you should never ever create a JSON string manually, but always use a dedicated function, json_encode() that will correctly format every bit of data.
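For instance, a quick sketch with invented data (the JSON_THROW_ON_ERROR flag assumes PHP 7.3 or later):

```php
<?php
// Made-up data containing quotes and a newline, which JSON must escape.
$data = [
    'name'  => "O'Reilly \"Tim\"",
    'note'  => "line one\nline two",
    'tags'  => ['php', 'json'],
    'score' => 4.5,
];

// json_encode() applies all of the JSON formatting rules for us.
$json = json_encode($data, JSON_THROW_ON_ERROR);

// Decoding returns the original structure, which shows nothing was mangled.
$roundTrip = json_decode($json, true, 512, JSON_THROW_ON_ERROR);
```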

And so on and so forth ...

The only case where you need to actively filter data is if you're accepting preformatted input, for example if you let your users post HTML markup that you plan to display on the site. However, you would be wise to avoid this at all costs, since no matter how well you filter it, it will always be a potential security hole.

Is this a good Sanitization function?

My question is, is this the best way of using this sanitization function?

This is a good way to perform sanitization. All sanitization methods will improve over time.

Is there a better way of going about this?

If it is a web application that is providing you user input, you may want to guide the user on the UI with your expectations (e.g. "Enter apartment number, if applicable. Otherwise, leave it blank").

JavaScript can be used to enforce certain behavior.

When the data arrives at your PHP script, beyond the sanitization process also check whether its length is between X and Y. For example, if a customer is entering an age, check that the age is within a valid range.
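A sketch of such checks could look like the following. The function names and the 0 to 130 age range are assumptions picked for illustration, and strlen counts bytes rather than characters.

```php
<?php
// Hypothetical helper: returns the age as an int, or null if it is invalid.
function validateAge($raw): ?int
{
    $age = filter_var($raw, FILTER_VALIDATE_INT, [
        'options' => ['min_range' => 0, 'max_range' => 130], // assumed limits
    ]);
    return $age === false ? null : $age;
}

// Hypothetical helper: checks that a string's byte length is within bounds.
function hasLengthBetween(string $value, int $min, int $max): bool
{
    $length = strlen($value);
    return $length >= $min && $length <= $max;
}
```

Returning null for a bad age keeps a valid 0 distinguishable from a failure.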

If it is a string, and you prefer not to have any offensive tags, use strip_tags to remove them. Perform encoding/decoding/escaping of certain characters - think of using mysqli_real_escape_string, htmlspecialchars.

Additionally, if there are multiple insert statements running in one shot, use stored routines. Use transactions, regardless of routines. Rollback unless all the transactions complete as you desire.
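The transaction part might be sketched like this, assuming PDO with the SQLite driver. The order_items table and the insertItems helper are hypothetical.

```php
<?php
// Sketch only: in-memory SQLite database with a made-up table.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE order_items (order_id INTEGER NOT NULL, sku TEXT NOT NULL)');

// Hypothetical helper: inserts every item, or none of them if one fails.
function insertItems(PDO $pdo, int $orderId, array $skus): bool
{
    $pdo->beginTransaction();
    try {
        $stmt = $pdo->prepare('INSERT INTO order_items (order_id, sku) VALUES (?, ?)');
        foreach ($skus as $sku) {
            $stmt->execute([$orderId, $sku]);
        }
        $pdo->commit();
        return true;
    } catch (PDOException $e) {
        // Undo every insert made since beginTransaction().
        $pdo->rollBack();
        return false;
    }
}

$ok = insertItems($pdo, 1, ['A-100', 'B-200']);
$failed = insertItems($pdo, 2, ['C-300', null]); // null violates NOT NULL
$rowCount = (int) $pdo->query('SELECT COUNT(*) FROM order_items')->fetchColumn();
```

The second call fails on its last row, so the row inserted just before it is rolled back as well.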

Instead of using select * from ..., select required columns. It is possible that certain results, where only 2-3 columns are needed, may speed up dramatically if there is a covering index.

If your app uses system, exec etc., utilizing escapeshellarg/escapeshellcmd will be useful.

When displaying information on UI, ensure that relevant fields are displayed with htmlspecialchars in order to reduce/eliminate the chances of XSS. Also think of using urlencode/json_encode as necessary.
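As a rough example of combining two of these, here is a link whose query string carries user data. The /search path and the sample text are made up.

```php
<?php
// Hypothetical search term typed by a user.
$search = 'cats & "dogs" <3';

// First encode the value for the URL context...
$url = '/search?q=' . rawurlencode($search);

// ...then escape for the HTML context (attribute value and link text).
$link = '<a href="' . htmlspecialchars($url, ENT_QUOTES, 'UTF-8') . '">'
      . htmlspecialchars($search, ENT_QUOTES, 'UTF-8') . '</a>';
```

Each context gets its own encoding, applied in the order the data is nested.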

Take a look at http://www.wikihow.com/Prevent-Cross-Site-Request-Forgery-%28CSRF%29-Attacks-in-PHP that showcases, with examples, ways to prevent CSRF attacks.

Comments to your questions are all nice thoughts you should consider, and importantly - you have taken a step in the right direction already - so good for you!

PHP Sanitize Data

Your example script isn't great: the so-called sanitisation of a string just trims whitespace off each end. Relying on that would get you in a lot of trouble fast.

There isn't a one size fits all solution. You need to apply the right sanitisation for your application, which will completely depend on what input you need and where it's being used. And you should sanitise at multiple levels in any case - most likely when you receive data, when you store it and possibly when you render it.

Worth reading, possible dupes:

What's the best method for sanitizing user input with PHP?

Clean & Safe string in PHP

php functions for sanitizing web form data

1. Use mysqli_real_escape_string() to escape quotes (the older mysql_real_escape_string() was removed in PHP 7.0).

2. Use PHP's filter_input for other form data:

$search_html = filter_input(INPUT_GET, 'search', FILTER_SANITIZE_SPECIAL_CHARS);
$search_url = filter_input(INPUT_GET, 'search', FILTER_SANITIZE_ENCODED);

There are other options in the PHP manual; here are the links:

http://php.net/manual/en/function.filter-input.php

http://www.php.net/manual/en/filter.filters.sanitize.php
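Since filter_input only reads from a real request, the sketch below uses filter_var with the same two filter constants on a literal string, so it can be run from the command line. The sample value is invented.

```php
<?php
// Stand-in for $_GET['search'].
$search = '<b>php & "filters"</b>';

// Encodes HTML-special characters as numeric entities.
$searchHtml = filter_var($search, FILTER_SANITIZE_SPECIAL_CHARS);

// Percent-encodes the value for use inside a URL.
$searchUrl = filter_var($search, FILTER_SANITIZE_ENCODED);
```

Each filter targets one output context, so the two results are not interchangeable.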

Can you hack this input sanitize function?

Quite easily:

$testinput = "<script>alert('p0wned');</script >\n"
    . "<a href='http://example.org' onclick=\"alert('p0Wned again!')\">Click me!</a>";

var_export(cleanInput($testinput));

Also, htmlentities is almost always the wrong thing to use: without an explicit charset it can mangle UTF-8 input (before PHP 5.4 the default charset was ISO-8859-1). Also, you should not be storing html-escaped data in your DB. I'm not even sure why you use it here at all--won't you have to unescape the html to display it?

However, you are going about this the wrong way.

  1. Do not parse/sanitize html with regexes. Use a real html parser such as DOMDocument or html5lib or even tidylib. Unfortunately PHP doesn't seem to have anything as wonderful as Bleach on Python, so you will have to roll your own. An XSLT stylesheet with a whitelist seems like it might be a good way to handle this particular sanitization condition. Update: another user pointed out HTML Purifier, which is also a whitelist-based html sanitizer. I've never used it but it looks like "Bleach in PHP". You should definitely investigate.
  2. Prefer escaping to sanitization. PHP culture has an obsession with sanitization which is really just plain wrong. Escape data at the boundaries of your application (output and database). In the core of your application your data should be in its native form without any escaping.

A general outline of processing is like so:

  1. Input

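    1. Turn off magic quotes in your php settings if you are running a PHP version older than 5.4 (the feature was removed in 5.4). On such versions, include code at the top of your app to fail hard if it's on: if (get_magic_quotes_gpc()) die ('TURN OFF MAGIC QUOTES!!!!'); Note that get_magic_quotes_gpc() itself was removed in PHP 8.0, so drop this check on current PHP.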
    2. Validate and normalize/sanitize specific fields of your input according to the expected type of each field. For example, a "dollar amount" has different validation criteria than a whitelisted html fragment field. (Probably you should find and use a validation library.)
    3. If there are errors, send them back to the user with an appropriate HTTP response code.
    4. Save your data to the database using a database library that supports parameter binding, such as PDO library with prepared statements. This way you do not need to remember to escape data by hand.
    5. On success, redirect (code 303) to a page displaying the created or modified record.
  2. Output

    1. Retrieve data from the database.
    2. Feed the data to a template which is PHP code that only deals with html display of data structures. It should not know details of how that data is retrieved or contain any "application-driving" behavior. Treat a template like a function that accepts a data structure and returns a string.
    3. Escape your data inside your template. Individual fields of your data will need to be escaped differently. You almost always need to run it through htmlspecialchars before output; the only case you would not do that is when the data you need to display is already html (i.e. your whitelist-sanitized html fields). Define a helper function like this and use it in your templates:

      function h($str) {
          // Escape &, <, > and both quote styles for safe HTML output.
          return htmlspecialchars($str, ENT_QUOTES, 'UTF-8');
      }

      Even better, try to use a template library that automatically escapes strings for you and that requires you to turn off escaping explicitly. (The common case should be simple to avoid errors, and having to escape is the common case!)

    4. Your html page is the string returned from your template. You may now display it to the user.
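The "template as a function" idea from the output steps can be sketched roughly as below. renderUserCard and the field names are hypothetical, and a real application would more likely use a template library.

```php
<?php
// The escaping helper, with an explicit string cast added.
function h($str): string
{
    return htmlspecialchars((string) $str, ENT_QUOTES, 'UTF-8');
}

// Hypothetical template: takes a data structure, returns an HTML string.
// It knows nothing about where the data came from.
function renderUserCard(array $user): string
{
    return '<div class="user">'
        . '<h2>' . h($user['name']) . '</h2>'
        . '<p>' . h($user['bio']) . '</p>'
        . '</div>';
}

// Data is passed in its native, unescaped form; escaping happens in the template.
$html = renderUserCard(['name' => 'Ann <admin>', 'bio' => 'Likes "quotes" & tags']);
```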

Sanitize html inputs with php

Validation depends mainly on the context of your website: what needs to be checked to keep the database as consistent as possible.
There are also some general-purpose clean-up functions, such as trim() and stripslashes().

The main purpose of validation is to check user input that will be stored in the database and used later, such as a user's email, phone number and password at login or sign-up.
For example, you should validate that a phone number is numeric and has the expected length (say, exactly 12 digits), or that an email address is in the correct format.
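A minimal sketch of those two checks follows. The function names are made up, and the 12-digit rule is simply the example length from the text, not a universal phone format.

```php
<?php
// Hypothetical helper: true if the string is a syntactically valid email address.
function isValidEmail(string $email): bool
{
    return filter_var($email, FILTER_VALIDATE_EMAIL) !== false;
}

// Hypothetical helper: true if the string is exactly 12 digits and nothing else.
function isValidPhone(string $phone): bool
{
    return preg_match('/\A\d{12}\z/', $phone) === 1;
}
```

Checks like these only confirm the format; they do not replace escaping when the value is later put into SQL or HTML.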

As for what to use for validation, you can look into:

Filters: https://www.w3schools.com/php/php_filter.asp

Regular expressions: https://www.w3schools.com/php/php_regex.asp

String functions: https://www.w3schools.com/php/php_ref_string.asp


