HTML Encode in PHP

Html encode in PHP

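By "encode", do you mean converting characters to HTML entities? If so, use one of these:

htmlspecialchars converts only the characters that are special in HTML (&, <, > and quotes).
htmlentities converts all applicable characters to HTML entities.

You can also use strip_tags if you want to remove all HTML tags.

Note: stripping tags alone will NOT stop all XSS attacks.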
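To make the difference concrete, here is a minimal sketch comparing the three functions on the same string. The sample inputs are made up for illustration, and the flags (ENT_QUOTES with an explicit UTF-8 charset) are one reasonable choice rather than the only one.

```php
<?php
// Hypothetical sample input, chosen only for illustration.
$input = '<b>Café</b> & "tea"';

// htmlspecialchars() touches only &, <, > and quotes.
$special = htmlspecialchars($input, ENT_QUOTES, 'UTF-8');

// htmlentities() also converts characters such as é that have a named entity.
$entities = htmlentities($input, ENT_QUOTES, 'UTF-8');

// strip_tags() removes the tags but does not encode what is left.
$stripped = strip_tags($input);

echo $special, PHP_EOL;  // &lt;b&gt;Café&lt;/b&gt; &amp; &quot;tea&quot;
echo $entities, PHP_EOL; // &lt;b&gt;Caf&eacute;&lt;/b&gt; &amp; &quot;tea&quot;
echo $stripped, PHP_EOL; // Café & "tea"

// A payload without any tags passes through strip_tags() untouched,
// which is why it is not an XSS defence on its own.
$attr = '" onmouseover="alert(1)';
var_dump(strip_tags($attr) === $attr); // bool(true)
```

If the last value were echoed into an attribute without escaping, it would close the attribute and add an event handler, even though no tag was ever present.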

Confused with html encoding

You have this confused. Character encoding is an attribute of YOUR systems. Your websites and your database are responsible for character encoding.

You have to decide what you will accept. I would say that, in general, the web has standardized on UTF-8. So if the websites that accept user input, your database, and all the connections between them are UTF-8, then you are in a position to accept input as UTF-8, and the character set and collation in the database should be configured to match.
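If you decide to accept only UTF-8, one way to enforce that at the boundary is to check incoming strings before storing them. This is a minimal sketch, assuming the mbstring extension is loaded; isValidUtf8 is a hypothetical helper name, not part of PHP.

```php
<?php
// Assumption: the mbstring extension is available.
// isValidUtf8() is a hypothetical helper name used for illustration.
function isValidUtf8(string $input): bool
{
    return mb_check_encoding($input, 'UTF-8');
}

// "é" encoded as UTF-8 is the two bytes C3 A9.
var_dump(isValidUtf8("caf\xC3\xA9")); // bool(true)

// The single byte E9 is "é" in Latin-1 and is not valid UTF-8.
var_dump(isValidUtf8("caf\xE9"));     // bool(false)
```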

At this point all your web pages should be HTML5, so the recommended HEAD section of your pages should at a minimum be this:

<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8"/>

Next you have SQL injection. You specified PHP. If you are using mysqli or PDO (which in my experience is the better choice) AND you bind every variable through a prepared statement (bindParam or bindValue in PDO, bind_param in mysqli), SQL injection through those values stops being an issue. The need for escaping input goes away too, because the values are sent separately from the SQL text, so the statement can no longer be confused by them.
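As a rough illustration of bound parameters, here is a sketch using PDO. It assumes the pdo_sqlite extension is available and uses an in-memory SQLite database as a stand-in for your real one; the table and the hostile input are made up.

```php
<?php
// Assumption: pdo_sqlite is installed; the in-memory database is a stand-in.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)');

// Hypothetical hostile input.
$name = "Robert'); DROP TABLE users;--";

// The value travels separately from the SQL text, so it is never parsed as SQL.
$insert = $pdo->prepare('INSERT INTO users (name) VALUES (:name)');
$insert->bindValue(':name', $name, PDO::PARAM_STR);
$insert->execute();

$select = $pdo->prepare('SELECT name FROM users WHERE name = :name');
$select->execute([':name' => $name]);
$stored = $select->fetchColumn();

// The table still exists and the input was stored exactly as submitted.
var_dump($stored === $name); // bool(true)
```

Note that binding only covers values. Table or column names built from user input would still need separate handling, such as a whitelist.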

Finally, you mentioned HTML Purifier. It exists to help people avoid XSS and similar exploits, which occur when you accept user input and those users inject HTML and JavaScript.

That is always going to be a concern, depending on the nature of the system and what you do with that output, but as others suggested in comments, you can run sanitizers and filters on the output after you've retrieved it from the database. Sitting inside a PHP string variable, there is no intrinsic danger until you weaponize it by injecting it into a live HTML page you are serving.

In terms of finding bad actors and people trying to mess with your system, you are obviously much better off having stored the original input as submitted. Then as you come to understand the nature of these exploits, you can search through your database looking for specific things, which you won't be able to do if you sanitize first and store the result.

PHP Encode string as HTML

Where did you use html_entity_decode? I've tried this:

printf(
    '<a class="row-title" href="%s" aria-label="%s">%s%s</a>',
    get_edit_post_link( $post->ID ),
    /* translators: %s: post title */
    esc_attr( sprintf( __( '“%s” (Edit)' ), $title ) ),
    $pad,
    html_entity_decode( $title )
);

and it seems to work.

But changing a core file is a BAD idea. You could write (or find) a plugin that adds icons to particular post titles, rather than to all posts, without changing the original file.

How do I JSON Encode HTML in PHP?

Encoding the html like this seems to add a \ before every / causing
the html to break

PHP escapes forward slashes when encoding. This can be prevented by passing the JSON_UNESCAPED_SLASHES flag to json_encode():

$data = "<html></html>";

$escaped = json_encode($data);
// string(16) ""<html><\/html>""
var_dump($escaped);

$unescaped = json_encode($data, JSON_UNESCAPED_SLASHES);
// string(15) ""<html></html>""
var_dump($unescaped);
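It may also help to confirm that the escaped form is still valid JSON and decodes back to the original string, so the HTML is only "broken" if the JSON text is used without being decoded. A small sketch using the same sample value:

```php
<?php
$data = "<html></html>";

$escaped   = json_encode($data);
$unescaped = json_encode($data, JSON_UNESCAPED_SLASHES);

// Both encodings decode to exactly the same string.
var_dump(json_decode($escaped) === $data);   // bool(true)
var_dump(json_decode($unescaped) === $data); // bool(true)
```

So if the consumer parses the JSON (for example with JSON.parse in the browser), either form should work; the flag mainly matters when the encoded text is read or compared as-is.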

When and how to encode/decode HTML when interacting with a database?

Unless I've misunderstood what you're asking, you seem to have the wrong impression about the effect of outputting HTML encoded strings into text inputs. Here's a basic example of what will happen. Let's say you have a user who wants to be named PB&J. Sure, it's weird, but not everyone can pick a nice non-weird username like "Bonvi" or "Don't Panic".

So you save that in your database as is.

Later, when you're using it in another form, you escape it for output.

<input type="text" name="username" value="<?= htmlspecialchars($username) ?>">

In your page source, you'll see

<input type="text" name="username" value="PB&amp;J">

with the ampersand converted to an HTML entity. (Which is what you want, in case they really wanted to be named bob"><script>alert("però!")</script><p class="ha or something worse.)

But the value displayed in the text box will be PB&J, and when the user submits the form, the value in $_POST['username'] will be PB&J, not PB&amp;J. It will not be changed to the encoded value.
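The round trip can be sketched in PHP alone. Here htmlspecialchars_decode stands in for the decoding the browser does when it parses the attribute, which is a simplification rather than what actually runs; the ENT_QUOTES flag and explicit charset are my own choice.

```php
<?php
$username = 'PB&J';

// What ends up in the page source.
$encoded = htmlspecialchars($username, ENT_QUOTES, 'UTF-8');
echo $encoded, PHP_EOL; // PB&amp;J

// Stand-in for the browser parsing the attribute value before submitting it.
$submitted = htmlspecialchars_decode($encoded, ENT_QUOTES);
var_dump($submitted === $username); // bool(true)

// The hostile name from the example is neutralised the same way.
$hostile = 'bob"><script>alert("però!")</script><p class="ha';
$safe = htmlspecialchars($hostile, ENT_QUOTES, 'UTF-8');
echo $safe, PHP_EOL;
```

The last line prints the name with every quote and angle bracket replaced by an entity, so it can no longer break out of the value attribute.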

(I used htmlspecialchars in this example, but the same would apply with your example using però with htmlentities.)

I'm trying to explain it basically, so I apologize if I did misunderstand you - I don't intend to sound condescending.

force html entity to display although encoding is enabled

There is actually no need to bypass the sanitizing of HTML entities. It's there for a purpose.

When you need to use the values on the server side or in other functions, decode them back to their original values.
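On the server side in PHP, html_entity_decode does this. The sketch below uses made-up sample strings; note that &nbsp; decodes to a non-breaking space (U+00A0), not to an ordinary space.

```php
<?php
// Hypothetical encoded values, chosen only for illustration.
$encoded = 'Tom &amp; Jerry &lt;3';
$decoded = html_entity_decode($encoded, ENT_QUOTES, 'UTF-8');
echo $decoded, PHP_EOL; // Tom & Jerry <3

// &nbsp; becomes U+00A0, which is a different character from ' '.
$spaced = html_entity_decode('string1&nbsp;string2', ENT_QUOTES, 'UTF-8');
var_dump($spaced === "string1\u{00A0}string2"); // bool(true)
var_dump($spaced === 'string1 string2');        // bool(false)
```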

In JS:

decodeHtml('string1 string2')

Live Example: http://jsfiddle.net/pranavq212/xasjyjtk/1/

function decodeHtml(html) {
    var txt = document.createElement("textarea");
    txt.innerHTML = html;
    return txt.value;
}

document.getElementById('form').onsubmit = function(e) {
    e.preventDefault();
    var input = document.getElementById('input').value;
    var output = decodeHtml(input);
    alert(output);
};

input {
    width: 100%;
    display: block;
}

<form id="form">
    <input type="text" id="input" placeholder="input" value="string1  string2">
    <input type="submit" value="alert(input)">
</form>

