Does html_entity_decode replace &nbsp; also? If not, how to replace it?
Quote from the html_entity_decode() manual:
You might wonder why trim(html_entity_decode('&nbsp;')); doesn't reduce the string to an empty string, that's because the '&nbsp;' entity is not ASCII code 32 (which is stripped by trim()) but ASCII code 160 (0xa0) in the default ISO 8859-1 characterset.
You can use str_replace() to replace the ASCII character #160 with a space:
<?php
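// Decode as ISO-8859-1 explicitly so that &nbsp; becomes the single byte 0xA0;
// since PHP 5.4 the default is UTF-8, where it would be "\xC2\xA0" instead.
$a = html_entity_decode('>&nbsp;<', ENT_QUOTES, 'ISO-8859-1');
echo 'before ' . $a . PHP_EOL;
$a = str_replace("\xA0", ' ', $a);
echo ' after ' . $a . PHP_EOL;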
PHP preg_replace
I think the problem is quite simply that highlight_string() is outputting its result immediately, rather than saving it to $note.
Instead, please try the following:
$note = html_entity_decode($note);
$note = highlight_string($note, true);
$note = str_replace('&nbsp;', ' ', $note);
The difference in my code is that I use highlight_string($note, true) with the second parameter set to true. The docs shed some light on the function's behavior:
mixed highlight_string ( string $str [, bool $return = false ] )
return
Set this parameter to TRUE to make this function return the highlighted code.
The regex function you have in your code block might work, but since this is a simple replacement, it will suffice to use str_replace in this case, as you have tried.
PHP Parsing Problem - &nbsp; and Â
The non-breaking space is encoded in UTF-8 as two bytes: 0xC2 and 0xA0.
When those bytes are interpreted as ISO-8859-1 (a single-byte encoding) instead of UTF-8 (a multi-byte encoding), they become the characters Â and another non-breaking space, respectively.
Apparently you're parsing the HTML using UTF-8 and echoing the results using ISO-8859-1. To fix this problem, you need to either parse the HTML using ISO-8859-1 or echo the results using UTF-8. I'd recommend using UTF-8 all the way. Go through the PHP UTF-8 cheatsheet to align everything.
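The byte-level effect described above can be reproduced directly. This is a minimal sketch, assuming the mbstring extension is loaded; the variable names are illustrative only.

```php
<?php
// A non-breaking space (U+00A0) encoded as UTF-8 is the two bytes C2 A0.
$nbspUtf8 = "\xC2\xA0";
echo bin2hex($nbspUtf8), PHP_EOL; // c2a0

// Misreading those two bytes as ISO-8859-1 treats each byte as its own
// character: 0xC2 is "Â" and 0xA0 is a non-breaking space.
// Converting that misreading to UTF-8 makes the extra character visible.
$misread = mb_convert_encoding($nbspUtf8, 'UTF-8', 'ISO-8859-1');
echo bin2hex($misread), PHP_EOL; // c382c2a0, i.e. "Â" followed by U+00A0
```

Seeing c382 in a hex dump of your output is a strong hint that UTF-8 data was treated as ISO-8859-1 somewhere along the way.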
replace characters that are hidden in text
This solution will work, I tested it:
$string = htmlentities($content, ENT_QUOTES, 'UTF-8');
$content = str_replace("&nbsp;", "", $string);
$content = html_entity_decode($content, ENT_QUOTES, 'UTF-8');
How to check if &nbsp; exists?
First of all, people get tripped up on this all the time:
strpos($string, "&nbsp;")
If &nbsp; is at the start of your string, then the evaluated result is 0 ("offset position"), and 0 is loosely compared to false in the way that you have crafted your conditional expression.
You need to explicitly check for false (strict check) from strpos() like this:
if (empty($string) || strpos($string, "&nbsp;") !== false || $string == " ") {
    // Do something.
}
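To make the offset pitfall concrete, here is a small sketch; the sample string and variable names are made up for illustration.

```php
<?php
$string = "&nbsp;leading entity";

// strpos() returns the offset 0 here, which is falsy in a loose check.
$position = strpos($string, "&nbsp;");
var_dump($position); // int(0)

$looseCheck  = (bool) $position;      // false: wrongly suggests "not found"
$strictCheck = ($position !== false); // true: the entity is present

var_dump($looseCheck);  // bool(false)
var_dump($strictCheck); // bool(true)
```

On PHP 8.0 or later, str_contains($string, "&nbsp;") sidesteps the offset question entirely because it returns a boolean.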
However, that is NOT your actual issue because...
You have a multibyte space, evidenced by the fact that when you "highlight" the character with your cursor it only has a character length of one, but when you call var_dump() there is a byte count of 2.
trim() can't help you. ctype_space() can't help you. You need something that is multibyte aware.
To allow the most inclusive match, I'll employ a regular expression that will search for all whitespace characters, invisible control characters, and unused code points.
if (empty($string) || preg_match('/^[\pZ\pC]+$/u', $string)) {
    // Do something.
}
This will check if the string is truly empty or is entirely composed of one or more of the aforementioned characters.
Here's a little demo: https://3v4l.org/u7eoK
(I don't really think this is a &nbsp; issue, so I am leaving that out of my solution.)
Scroll down this resource: https://www.regular-expressions.info/unicode.html
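As a rough illustration of why a multibyte-aware check is needed, the sketch below wraps the pattern in a helper. The name isBlank is hypothetical, and unlike empty() it does not treat the string "0" as blank.

```php
<?php
// Hypothetical helper: true when the string is empty or consists only of
// Unicode separators (\pZ) and control/unassigned characters (\pC).
function isBlank(string $string): bool
{
    return $string === '' || preg_match('/^[\pZ\pC]+$/u', $string) === 1;
}

$nbsp = "\xC2\xA0"; // UTF-8 non-breaking space

var_dump(strlen($nbsp));          // int(2): two bytes, one visible character
var_dump(trim($nbsp) === '');     // bool(false): trim() leaves it alone
var_dump(isBlank($nbsp));         // bool(true)
var_dump(isBlank(" \t" . $nbsp)); // bool(true)
var_dump(isBlank("a" . $nbsp));   // bool(false)
```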
How to remove html special chars?
Either decode them using html_entity_decode or remove them using preg_replace:
$Content = preg_replace("/&#?[a-z0-9]+;/i","",$Content);
(From here)
EDIT: Alternative according to Jacco's comment:
might be nice to replace the '+' with {2,8} or something. This will limit the chance of replacing entire sentences when an unencoded '&' is present.
$Content = preg_replace("/&#?[a-z0-9]{2,8};/i","",$Content);
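The difference between the two patterns can be seen on a contrived input. This is only a sketch; stripEntities is a hypothetical helper name and the sample strings are invented.

```php
<?php
// Hypothetical helper wrapping the bounded pattern from the answer.
function stripEntities(string $content): string
{
    return preg_replace('/&#?[a-z0-9]{2,8};/i', '', $content);
}

// Named and numeric entities are both removed.
echo stripEntities('Fish &amp; chips&nbsp;&#169;'), PHP_EOL; // Fish  chips

// An unencoded "&" followed by a long run of letters and a ";":
$input = 'key&valuevaluevalue;end';
echo preg_replace('/&#?[a-z0-9]+;/i', '', $input), PHP_EOL; // keyend
echo stripEntities($input), PHP_EOL; // key&valuevaluevalue;end
```

The bounded version leaves the second input alone because no run of 2 to 8 alphanumerics after the "&" is directly followed by a ";".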
PHP convert html &nbsp; to space, &gt; to > etc
Use htmlspecialchars_decode, which is the opposite of htmlspecialchars.
Example from the PHP documentation page:
$str = '<p>this -&gt; &quot;</p>';
echo htmlspecialchars_decode($str);
//Output: <p>this -> "</p>
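One caveat worth checking before relying on this: htmlspecialchars_decode only reverses the handful of entities that htmlspecialchars produces, so it leaves &nbsp; untouched. The sketch below contrasts it with html_entity_decode on an invented sample string.

```php
<?php
$str = '<p>this&nbsp;-&gt; &quot;</p>';

// htmlspecialchars_decode() converts only &amp;, &quot;, &lt;, &gt;
// (and &#039; when ENT_QUOTES is set); &nbsp; is left as-is.
$special = htmlspecialchars_decode($str, ENT_QUOTES);
echo $special, PHP_EOL; // <p>this&nbsp;-> "</p>

// html_entity_decode() also converts named entities such as &nbsp;,
// here into the UTF-8 bytes C2 A0, which can then be swapped for a space.
$all = html_entity_decode($str, ENT_QUOTES | ENT_HTML5, 'UTF-8');
$plain = str_replace("\xC2\xA0", ' ', $all);
echo $plain, PHP_EOL; // <p>this -> "</p>
```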
How to remove &nbsp; from a UTF-8 string?
This gets tricky; it's not as straightforward as replacing a normal string.
Try this:
$str = str_replace("\xc2\xa0", ' ', $str);
Or this (though the above should work):
$nbsp = html_entity_decode("&nbsp;");
$s = html_entity_decode("[&nbsp;]");
$s = str_replace($nbsp, " ", $s);
echo $s;
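Both approaches rely on the decoded entity being the UTF-8 byte pair C2 A0. A quick way to confirm that on your setup is sketched below; passing the charset explicitly is an assumption added here so the result does not depend on the PHP version's default encoding.

```php
<?php
// Decode with an explicit charset rather than relying on the default.
$nbsp = html_entity_decode('&nbsp;', ENT_QUOTES | ENT_HTML5, 'UTF-8');
var_dump($nbsp === "\xC2\xA0"); // bool(true)

$s = html_entity_decode('[&nbsp;]', ENT_QUOTES | ENT_HTML5, 'UTF-8');
var_dump(bin2hex($s)); // string(8) "5bc2a05d"

// Replace the two-byte sequence with an ordinary ASCII space.
$s = str_replace($nbsp, ' ', $s);
var_dump($s); // string(3) "[ ]"
```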
@ref: https://moovwebconfluence.atlassian.net/wiki/pages/viewpage.action?pageId=1081435
Replace &nbsp; with a blank or empty string PHP
$text_description = "&nbsp;Hello world! lorel ipsum";
$text_description = str_replace('&nbsp;', ' ', $text_description);
echo $text_description;
Output: Hello world! lorel ipsum