How to Replace Different Newline Styles in PHP the Smartest Way

How to replace different newline styles in PHP the smartest way?

$string = preg_replace('~\R~u', "\r\n", $string);

If you don't want to replace all Unicode newlines but only the CR, LF, and CRLF styles, use:

$string = preg_replace('~(*BSR_ANYCRLF)\R~', "\r\n", $string);

\R matches these newline sequences; the u modifier tells PCRE to treat the pattern and the input string as UTF-8.
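As a minimal sketch of the approach above, the call can be wrapped in a small helper. The function name normalize_newlines and its $eol parameter are hypothetical, not part of PHP. The null check covers the case where preg_replace() fails, for example on input that is not valid UTF-8.

```php
<?php
// Hypothetical helper (not a PHP built-in): convert every newline style to $eol.
function normalize_newlines(string $text, string $eol = "\r\n"): string
{
    // \R consumes "\r\n" as one unit, so a CRLF is not counted as two line breaks.
    $result = preg_replace('~\R~u', $eol, $text);

    // preg_replace() returns null on failure (e.g. input that is not valid UTF-8);
    // in that case this sketch falls back to the unchanged input.
    return $result ?? $text;
}

$mixed = "one\r\ntwo\rthree\nfour";
var_dump(normalize_newlines($mixed) === "one\r\ntwo\r\nthree\r\nfour"); // bool(true)
```

Passing "\n" as the second argument would normalize to Unix-style line endings instead.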


From the PCRE docs:

What \R matches

By default, the sequence \R in a pattern matches any Unicode newline
sequence, whatever has been selected as the line ending sequence. If
you specify

     --enable-bsr-anycrlf

the default is changed so that \R matches only CR, LF, or CRLF. Whatever
is selected when PCRE is built can be overridden when the library
functions are called.

and

Newline sequences

Outside a character class, by default, the escape sequence \R matches
any Unicode newline sequence. In non-UTF-8 mode \R is equivalent to the
following:

    (?>\r\n|\n|\x0b|\f|\r|\x85)

This is an example of an "atomic group", details of which are given
below. This particular group matches either the two-character sequence
CR followed by LF, or one of the single characters LF (linefeed,
U+000A), VT (vertical tab, U+000B), FF (formfeed, U+000C), CR (carriage
return, U+000D), or NEL (next line, U+0085). The two-character sequence
is treated as a single unit that cannot be split.

In UTF-8 mode, two additional characters whose codepoints are greater
than 255 are added: LS (line separator, U+2028) and PS (paragraph
separator, U+2029). Unicode character property support is not needed for
these characters to be recognized.

It is possible to restrict \R to match only CR, LF, or CRLF (instead of
the complete set of Unicode line endings) by setting the option
PCRE_BSR_ANYCRLF either at compile time or when the pattern is matched.
(BSR is an abbreviation for "backslash R".) This can be made the default
when PCRE is built; if this is the case, the other behaviour can be
requested via the PCRE_BSR_UNICODE option. It is also possible to
specify these settings by starting a pattern string with one of the
following sequences:

    (*BSR_ANYCRLF)   CR, LF, or CRLF only
    (*BSR_UNICODE)   any Unicode newline sequence

These override the default and the options given to pcre_compile() or
pcre_compile2(), but they can be overridden by options given to
pcre_exec() or pcre_dfa_exec(). Note that these special settings, which
are not Perl-compatible, are recognized only at the very start of a
pattern, and that they must be in upper case. If more than one of them
is present, the last one is used. They can be combined with a change of
newline convention; for example, a pattern can start with:

    (*ANY)(*BSR_ANYCRLF)

They can also be combined with the (*UTF8) or (*UCP) special sequences.
Inside a character class, \R is treated as an unrecognized escape
sequence, and so matches the letter "R" by default, but causes an error
if PCRE_EXTRA is set.
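To make the difference between the two settings concrete, here is a small sketch that counts \R matches in a string containing a vertical tab and a line feed. Both prefixes are spelled out explicitly so the result does not depend on how the PCRE library behind PHP was built; the sample string is made up for illustration.

```php
<?php
$text = "a\x0bb\nc"; // one vertical tab (VT) and one line feed (LF)

// (*BSR_UNICODE): \R also matches VT, so both characters count.
$unicode = preg_match_all('~(*BSR_UNICODE)\R~', $text);

// (*BSR_ANYCRLF): \R is limited to CR, LF, and CRLF, so only the LF counts.
$anycrlf = preg_match_all('~(*BSR_ANYCRLF)\R~', $text);

var_dump($unicode, $anycrlf); // int(2), int(1)
```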

How to change what PCRE regexp thinks are newlines in multi-line mode?

Did you try the (*CRLF) and related newline options? They are described on Wikipedia (under Newline/linebreak options) and seem to do the right thing in my testing. For example, '/(*CRLF)^two$/m' should match Windows \r\n newlines. (*ANYCRLF) should match both Linux and Windows newlines, but I haven't tested this.
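A quick way to check these newline options is to run the same anchored pattern against Windows-style and Unix-style input. This is only a sketch with made-up sample strings; (*LF) is written explicitly in the first call so the comparison does not rely on the library's compiled-in default.

```php
<?php
$windows = "one\r\ntwo\r\nthree";
$unix    = "one\ntwo\nthree";

// With (*LF), "$" must sit directly before "\n"; the "\r" after "two" blocks the match.
var_dump(preg_match('/(*LF)^two$/m', $windows));      // int(0)

// With (*CRLF), only the pair "\r\n" ends a line.
var_dump(preg_match('/(*CRLF)^two$/m', $windows));    // int(1)
var_dump(preg_match('/(*CRLF)^two$/m', $unix));       // int(0)

// With (*ANYCRLF), CR, LF, and CRLF all end a line.
var_dump(preg_match('/(*ANYCRLF)^two$/m', $windows)); // int(1)
var_dump(preg_match('/(*ANYCRLF)^two$/m', $unix));    // int(1)
```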

Remove new lines from string and replace with one empty space

You have to be cautious of double line breaks, which would cause double spaces. Use this really efficient regular expression:

$string = trim(preg_replace('/\s\s+/', ' ', $string));

Any run of two or more whitespace characters (spaces, tabs, newlines) is replaced with a single space.

Edit: As others have pointed out, this solution has issues matching single newlines in between words. This is not present in the example, but one can easily see how that situation could occur. An alternative is to do the following:

$string = trim(preg_replace('/\s+/', ' ', $string));
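The difference between the two patterns shows up on input that has a single newline between words. The following sketch uses a made-up sample string to compare them side by side.

```php
<?php
$text = "first\nsecond  third\n\nfourth";

// Needs at least two whitespace characters, so the single "\n" survives.
$partial = trim(preg_replace('/\s\s+/', ' ', $text));

// Matches any run of one or more whitespace characters.
$full = trim(preg_replace('/\s+/', ' ', $text));

var_dump($partial === "first\nsecond third fourth"); // bool(true)
var_dump($full === "first second third fourth");     // bool(true)
```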

character separator for newline textarea

By the spec, all newlines in a submitted textarea value should be converted to \r\n.

So you could indeed do a simple explode("\r\n", $theContent) no matter the platform used.

P.S.

\r is only used on old(er) Macs. Nowadays Macs also use the *nix style line breaks (\n).
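If the text might not come straight from a form submission (for instance, it was stored or edited elsewhere first), a more tolerant split is one option. This is a sketch under that assumption; the sample value stands in for real input.

```php
<?php
// Stand-in for a submitted textarea value; a real one would come from $_POST.
$theContent = "alpha\r\nbeta\ngamma\rdelta";

// Try CRLF first so "\r\n" is not treated as two separators.
$lines = preg_split('/\r\n|\r|\n/', $theContent);

var_dump($lines === ['alpha', 'beta', 'gamma', 'delta']); // bool(true)
```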

Split string by new line characters

You can use the explode function, using "\n" as separator:

$your_array = explode("\n", $your_string_from_db);

For instance, if you have this piece of code:

$str = "My text1\nMy text2\nMy text3";
$arr = explode("\n", $str);
var_dump($arr);

You'd get this output:

array
  0 => string 'My text1' (length=8)
  1 => string 'My text2' (length=8)
  2 => string 'My text3' (length=8)

Note that you have to use a double-quoted string, so \n is actually interpreted as a line-break.

(See the PHP manual's section on double-quoted strings for more details.)
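The quoting rule is easy to verify directly. The sketch below reuses the sample string from above and compares the two quoting styles.

```php
<?php
// Double quotes: "\n" is one line-feed character.
var_dump(strlen("\n")); // int(1)

// Single quotes: '\n' is a backslash followed by the letter n.
var_dump(strlen('\n')); // int(2)

$str = "My text1\nMy text2\nMy text3";
var_dump(count(explode("\n", $str))); // int(3)
var_dump(count(explode('\n', $str))); // int(1), the separator never occurs
```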

php strlen() getting different value on different environments for same string

Probably the newline characters: "\r\n" on Windows and only "\n" on Linux?

Can you give us the string?

EDIT:
To remove the end-of-line characters:

$xml = str_replace(PHP_EOL, '', $xml);
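The length difference can be reproduced with two versions of the same text. Note that PHP_EOL is "\n" on Linux and "\r\n" on Windows, so replacing PHP_EOL on Linux would leave the "\r" of a Windows-style file in place. The sketch below removes each style explicitly; the helper name strip_newlines and the sample strings are hypothetical.

```php
<?php
$windows = "<a>\r\n<b/>\r\n</a>";
$unix    = "<a>\n<b/>\n</a>";

var_dump(strlen($windows)); // int(15)
var_dump(strlen($unix));    // int(13)

// Hypothetical helper: remove CRLF, CR, and LF regardless of the platform PHP runs on.
function strip_newlines(string $s): string
{
    // "\r\n" is listed first so it is removed as a pair.
    return str_replace(["\r\n", "\r", "\n"], '', $s);
}

var_dump(strip_newlines($windows) === strip_newlines($unix)); // bool(true)
```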

How to use str_replace with a condition

Finally I got the answer. Since I'm using a textarea for the input, the other answers did not work for me: I still got the space at the end of the line.
I tried changing the parameters of my str_replace call and got the desired output.

SOLUTION

$refresheddata = str_replace(array(" \r\n","\r\n"), '<br>', $data);

Now the newline is gone, along with the space before it.
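The order of the search array matters here, because str_replace works through the search strings one after another. The sketch below, using a made-up $data value, shows what happens when the order is reversed.

```php
<?php
$data = "first line \r\nsecond line\r\nthird line";

// Longer pattern first: the trailing space is removed together with the newline.
$good = str_replace(array(" \r\n", "\r\n"), '<br>', $data);

// Reversed order: "\r\n" is consumed first, so " \r\n" never matches and the space stays.
$bad = str_replace(array("\r\n", " \r\n"), '<br>', $data);

var_dump($good === "first line<br>second line<br>third line"); // bool(true)
var_dump($bad === "first line <br>second line<br>third line"); // bool(true)
```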


