How to prevent XSS (Cross Site Scripting) whilst allowing HTML input
Microsoft have produced their own anti-XSS library, Microsoft Anti-Cross Site Scripting Library V4.0:
The Microsoft Anti-Cross Site Scripting Library V4.0 (AntiXSS V4.0) is an encoding library designed to help developers protect their ASP.NET web-based applications from XSS attacks. It differs from most encoding libraries in that it uses the white-listing technique -- sometimes referred to as the principle of inclusions -- to provide protection against XSS attacks. This approach works by first defining a valid or allowable set of characters, and encodes anything outside this set (invalid characters or potential attacks). The white-listing approach provides several advantages over other encoding schemes.
New features in this version of the Microsoft Anti-Cross Site Scripting Library include:
- A customizable safe list for HTML and XML encoding
- Performance improvements
- Support for Medium Trust ASP.NET applications
- HTML Named Entity Support
- Invalid Unicode detection
- Improved Surrogate Character Support for HTML and XML encoding
- LDAP Encoding Improvements
- application/x-www-form-urlencoded encoding support
It takes a whitelist approach: anything outside the allowed set is encoded or stripped rather than trusted.
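To make the white-listing idea concrete, here is a minimal sketch in Python. It is not the AntiXSS API, only an illustration of the technique described in the quote: define a set of allowed characters and encode everything else. The safe list `SAFE_CHARS` and the function name `whitelist_encode` are assumptions made up for this example.

```python
import string

# Hypothetical safe list: ASCII letters, digits and a little punctuation.
# A real library ships a much larger, configurable list.
SAFE_CHARS = frozenset(string.ascii_letters + string.digits + " .,-_")


def whitelist_encode(text: str) -> str:
    """Encode every character outside SAFE_CHARS as a numeric HTML entity."""
    return "".join(
        ch if ch in SAFE_CHARS else f"&#{ord(ch)};"
        for ch in text
    )
```

For example, `whitelist_encode("<b>")` returns `&#60;b&#62;`, while plain text such as `Hello, world.` passes through unchanged. The point of the design is that an unknown character is encoded by default, so a new attack string does not need to be anticipated in advance.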
Here are some links related to AntiXSS:
- Anti-Cross Site Scripting Library
- Microsoft Anti-Cross Site Scripting Library V4.2 (AntiXSS V4.2)
- Microsoft Web Protection Library
.NET HTML whitelisting (anti-xss/Cross Site Scripting)
If you want to parse the input and are worried about invalid (X)HTML coming in, the HTML Agility Pack is probably the best tool for the parsing. Remember, though, that it's not just elements you need to allow but also the attributes on those elements. Work from a whitelist of allowed elements and their attributes rather than trying to strip anything that might be dodgy via a blacklist.
There's also the OWASP AntiSamy Project, which is still a work in progress. They also have a test site where you can try out XSS attacks.
Regex for this is probably too risky IMO.
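As a rough illustration of the parse-then-whitelist idea above, here is a sketch using Python's standard `html.parser` instead of the HTML Agility Pack. The `ALLOWED` table, the URL prefix list and the names `WhitelistSanitizer` and `sanitize` are assumptions for this example, and it is nowhere near as thorough as a maintained sanitizer.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical whitelist: allowed elements mapped to their allowed attributes.
ALLOWED = {"a": {"href"}, "b": set(), "i": set(), "p": set(), "br": set()}
VOID_TAGS = {"br"}                      # elements that never get a closing tag
DROP_WITH_CONTENT = {"script", "style"}  # drop these together with their content
SAFE_URL_PREFIXES = ("http://", "https://", "mailto:")  # relative URLs are dropped too


class WhitelistSanitizer(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skipping = None  # name of the element whose content is being dropped

    def handle_starttag(self, tag, attrs):
        if self.skipping:
            return
        if tag in DROP_WITH_CONTENT:
            self.skipping = tag
            return
        if tag not in ALLOWED:
            return  # unknown element: drop the tag, keep its text
        kept = []
        for name, value in attrs:
            if name not in ALLOWED[tag] or value is None:
                continue  # attribute is not on the whitelist
            if name == "href" and not value.strip().lower().startswith(SAFE_URL_PREFIXES):
                continue  # rejects javascript: and other unexpected schemes
            kept.append(f' {name}="{escape(value, quote=True)}"')
        self.out.append(f"<{tag}{''.join(kept)}>")

    def handle_endtag(self, tag):
        if self.skipping:
            if tag == self.skipping:
                self.skipping = None
            return
        if tag in ALLOWED and tag not in VOID_TAGS:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.skipping:
            self.out.append(escape(data, quote=False))  # re-escape text content


def sanitize(html_text: str) -> str:
    """Return html_text with only whitelisted elements and attributes kept."""
    parser = WhitelistSanitizer()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.out)
```

With this table, `sanitize('<p onclick="x()">Hi <b>there</b></p>')` gives `<p>Hi <b>there</b></p>`: the element survives, the event handler attribute does not. Anything the parser reports that is not in `ALLOWED` is dropped by default, which is the property a blacklist or a regex cannot give you.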
In C#, how to prevent XSS while allowing HTML input, including br's?
To resolve this, I decided to store the raw HTML as-is, apart from replacing Environment.NewLine with <br /> before storing it.
Then on the flip side, when showing it to visitors I use the MS AntiXSS code to clean it up. Not 100% the ideal way I'd like to do it, but gets the job done.
I do a bit of caching here to make sure it's not running through AntiXSS on every request too.
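The flow described in this answer (convert newlines before storing, clean on output, cache the cleaned result) could look roughly like the Python sketch below. The `sanitize` function here is only a stand-in that escapes everything except `<br />`; the answer uses MS AntiXSS for that step. The names `prepare_for_storage` and `render` and the cache size are assumptions for the example.

```python
from functools import lru_cache
from html import escape


def prepare_for_storage(raw: str) -> str:
    """Replace line breaks with <br /> before storing; no cleaning happens here."""
    normalised = raw.replace("\r\n", "\n").replace("\r", "\n")
    return normalised.replace("\n", "<br />")


def sanitize(html_text: str) -> str:
    """Stand-in for a real whitelist sanitizer: escape all markup except <br />."""
    return escape(html_text, quote=False).replace("&lt;br /&gt;", "<br />")


@lru_cache(maxsize=1024)
def render(stored_html: str) -> str:
    """Clean stored HTML for display; repeated calls with the same text hit the cache."""
    return sanitize(stored_html)
```

Keying the cache on the stored text means an edited post is cleaned again automatically, while unchanged posts are only run through the sanitizer once per process.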
What are the best practices for avoiding xss attacks in a PHP site
Escaping input is not the best you can do for successful XSS prevention; output must be escaped as well. If you use the Smarty template engine, you can use the |escape:'htmlall' modifier to convert all sensitive characters to HTML entities (I use my own |e modifier, which is an alias for the above).
My approach to input/output security is:
- store user input unmodified (no HTML escaping on input, only DB-aware escaping done via PDO prepared statements)
- escape on output, depending on what output format you use (e.g. HTML and JSON need different escaping rules)
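The two rules above can be sketched as follows. This uses Python with an in-memory SQLite database as a stand-in for PHP and PDO, so the table name `comments` and the helper names are assumptions for the example, not anything from the original answer.

```python
import html
import json
import sqlite3

# In-memory database standing in for the real one (the answer uses PDO).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE comments (body TEXT)")


def save_comment(body: str) -> None:
    """Store the input unmodified; the bound parameter handles DB escaping."""
    conn.execute("INSERT INTO comments (body) VALUES (?)", (body,))


def load_comments() -> list:
    """Return the stored bodies exactly as they were submitted."""
    return [row[0] for row in conn.execute("SELECT body FROM comments")]


def as_html(body: str) -> str:
    """Escape for an HTML context at output time."""
    return "<p>" + html.escape(body, quote=True) + "</p>"


def as_json(body: str) -> str:
    """Escape for a JSON context at output time; the rules differ from HTML."""
    return json.dumps({"body": body})
```

Because the stored value is never pre-escaped, the same row can be rendered safely as HTML or as JSON without double encoding. Note that JSON embedded directly inside an HTML page would need HTML-context handling on top of `json.dumps`.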