Replace Multiple Newlines, Tabs, and Spaces

Replace multiple newlines, tabs, and spaces

In theory, your regular expression does work, but the problem is that not all operating systems and browsers send only \n at the end of a line. Many also send a \r.

Try this simplified version:

$text = preg_replace("/(\r?\n){2,}/", "\n\n", $text);

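To also handle clients that send only \r, match each line-break style explicitly. (A plain [\r\n]{2,} would treat a single \r\n as two line breaks and double it.)

$text = preg_replace("/(?:\r\n|\r|\n){2,}/", "\n\n", $text);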

Based on your update:

// Replace multiple (one or more) line breaks with a single one.
$text = preg_replace("/[\r\n]+/", "\n", $text);

$text = wordwrap($text,120, '<br/>', true);
$text = nl2br($text);

Regex Python - Replace any combination of line breaks, tabs, spaces, by single space

Try using \s, which matches all whitespace characters.

>>> import re
>>> s = 'Copyright ©\n\t\t\t\n\t\t\t2019\n\t\t\tApple Inc. All rights reserved.'
>>> s = re.sub(r"\s+", " ", s)
>>> s
'Copyright © 2019 Apple Inc. All rights reserved.'
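If you would rather avoid a regex here, a split/join round trip should give the same result for this kind of input. This is only a sketch: the trailing spaces in the sample string are my addition, to show that leading and trailing whitespace needs a strip() in the regex version.

```python
import re

s = "Copyright ©\n\t\t\t\n\t\t\t2019\n\t\t\tApple Inc. All rights reserved.  "

# Regex version: collapse every whitespace run, then trim the ends.
with_regex = re.sub(r"\s+", " ", s).strip()

# Regex-free version: split() with no arguments splits on whitespace runs
# and drops empty strings, so joining with one space also trims the ends.
with_split = " ".join(s.split())

print(with_regex)
print(with_split)
```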

Replace Multiple New Lines in One New Line

Try using the following pattern:

/[\n\r]+/

as follows:

$text = preg_replace("/[\r\n]+/", "\n", $text);
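A rough Python counterpart without a regex, in case it helps. The function name collapse_blank_lines is hypothetical. Be aware that it is not an exact equivalent of the pattern above: str.splitlines() recognises a few rarer separators in addition to \r and \n, and leading or trailing line breaks are dropped entirely.

```python
def collapse_blank_lines(text: str) -> str:
    # splitlines() handles \n, \r and \r\n (plus some rarer separators such
    # as \v, \f and \u2028), so mixed line endings are normalized as well.
    # Empty lines are filtered out, which collapses runs of line breaks.
    return "\n".join(line for line in text.splitlines() if line)


print(repr(collapse_blank_lines("a\r\n\r\nb\n\n\nc\rd")))
```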

Replace multiple whitespace with single space but keep new lines in regex match (Python)

You can use the code below:

import re
test = "This is a test. \n Blah blah."
print(re.sub(r"(?:(?!\n)\s)+", " ",test))

Output:

This is a test. 
 Blah blah.
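The sample above contains no repeated spaces, so here is a sketch with a busier input (my own made-up string). It also shows an alternative spelling of the pattern, the negated class [^\S\n], which should match the same characters as the lookahead form.

```python
import re

text = "This   is\t\ta test.  \n\n   Blah \t blah."

# [^\S\n] is a double negative: "not non-whitespace and not newline",
# i.e. any whitespace character except \n.
collapsed = re.sub(r"[^\S\n]+", " ", text)

# The lookahead form from the answer, applied to the same input.
collapsed_lookahead = re.sub(r"(?:(?!\n)\s)+", " ", text)

print(repr(collapsed))
print(collapsed == collapsed_lookahead)
```

Both newlines survive, and the space run after each one is reduced to a single space rather than removed.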

Replace multiple spaces, tabs and newlines into one space except commented text

The new solution

After thinking a bit, I came up with the following solution with pure regex. Note that this solution will delete the newlines/tabs/multi-spaces instead of replacing them:

$new_string = preg_replace('#(?(?!<!--.*?-->)(?: {2,}|[\r\n\t]+)|(<!--.*?-->))#s', '$1', $string);
echo $new_string;

Explanation

(?                          # If
    (?!<!--.*?-->)          # There is no comment
    (?: {2,}|[\r\n\t]+)     # Then match 2 spaces or more, or newlines or tabs
    |                       # Else
    (<!--.*?-->)            # Match and group it (group #1)
)                           # End if

So basically, when there is no comment at the current position, it tries to match spaces/tabs/newlines. If it finds them, group 1 is empty, so the match is replaced with nothing (which deletes the whitespace). If there is a comment, it is captured in group 1 and replaced by itself, so it stays unchanged.
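As far as I know, Python's re module only supports conditionals that test a group reference, not a lookahead, so the pattern above cannot be ported literally. A sketch of the same "replace with group 1" trick using plain alternation instead, assuming Python 3.5+ (where an unmatched group in the replacement template is treated as an empty string); the function name and sample are hypothetical:

```python
import re

# Alternation order matters: a comment is tried first and captured in
# group 1; otherwise the pattern matches 2+ spaces or a run of CR/LF/tabs.
PATTERN = re.compile(r"(<!--.*?-->)| {2,}|[\r\n\t]+", re.DOTALL)


def strip_whitespace_outside_comments(html: str) -> str:
    # r"\1" puts a comment back unchanged. For whitespace matches group 1
    # did not participate, so the match is replaced by nothing (3.5+).
    return PATTERN.sub(r"\1", html)


sample = "<p>a    b</p>\n\t<!-- keep   this\n as is -->\n<p>c</p>"
print(repr(strip_whitespace_outside_comments(sample)))
```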



The old solution

I came up with a new strategy; this code requires PHP 5.3+:

$new_string = preg_replace_callback('#(?(?!<!--).*?(?=<!--|$)|(<!--.*?-->))#s', function($m){
    if(!isset($m[1])){ // If group 1 does not exist (the comment)
        return preg_replace('#\s+#s', ' ', $m[0]); // Then replace with 1 space
    }
    return $m[0]; // Else return the matched string
}, $string);

echo $new_string; // Output

Explaining the regex:

(?                  # If
    (?!<!--)        # Lookahead if there is no <!--
    .*?             # Then match anything (ungreedy) until ...
    (?=<!--|$)      # Lookahead, check for <!-- or end of string
    |               # Or
    (<!--.*?-->)    # Match and group a comment, this will make for us a group #1
)
# The s modifier is to match newlines with . (dot)


Note: What you are asking for and what you have provided as expected output contradict each other a bit. If you want to remove the whitespace instead of replacing it with one space, change '#\s+#s', ' ', $m[0] to '#\s+#s', '', $m[0].
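The callback approach carries over to Python quite directly. The following is a sketch, not a port of the exact PHP pattern: it matches either a comment or a whitespace run, and the callback decides what to return. The names (collapse_outside_comments, the replacement parameter) are my own, and the parameter covers the "one space" versus "remove" choice from the note above.

```python
import re

# Either a whole comment (captured in group 1) or a run of whitespace.
COMMENT_OR_SPACE = re.compile(r"(<!--.*?-->)|\s+", re.DOTALL)


def collapse_outside_comments(html: str, replacement: str = " ") -> str:
    def fix(match):
        comment = match.group(1)
        # Comments are returned untouched; whitespace runs are replaced.
        return comment if comment is not None else replacement

    return COMMENT_OR_SPACE.sub(fix, html)


sample = "a  \n b <!-- x   y\n z -->\t\tc"
print(repr(collapse_outside_comments(sample)))
print(repr(collapse_outside_comments(sample, "")))
```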

How can I remove multiple new lines and spaces (whitespace characters) from a string?

Update #2

Catch all leading whitespace (the caret ^ asserts that we are at the beginning of a line) and other consecutive spaces:

^\\s+|[\\t\\f ](?=[\\t\\f ])|[\\t\\f ]$|\\s+\\z

Replace the matches with nothing (the multi-line modifier must be on):

String str = "   I am   seal \n\n  \t   where are we? ";
String result = str.replaceAll("(?m)(^\\s+|[\\t\\f ](?=[\\t\\f ])|[\\t\\f ]$|\\s+\\z)", "");
System.out.println(result);


With character class intersection we can also use a shorter regex:

^\\s+|[\\s&&[^\\r\\n]](?=\\s|$)|\\s+\\z
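For readers working in Python, a sketch of how both Java patterns might be translated. Two things differ, to the best of my knowledge: Python's end-of-string anchor is spelled \Z, and the re module has no class intersection, so [\s&&[^\r\n]] is rewritten as the negated class [^\S\r\n]. The constant names are hypothetical.

```python
import re

# Port of the longer Java pattern; re.MULTILINE plays the role of (?m),
# and \Z is Python's "end of string" anchor (Java's \z).
LONG_FORM = re.compile(r"^\s+|[\t\f ](?=[\t\f ])|[\t\f ]$|\s+\Z", re.MULTILINE)

# No class intersection in Python's re, so "whitespace that is neither
# \r nor \n" is written as the double negative [^\S\r\n].
SHORT_FORM = re.compile(r"^\s+|[^\S\r\n](?=\s|$)|\s+\Z", re.MULTILINE)

text = "   I am   seal \n\n  \t   where are we? "
print(repr(LONG_FORM.sub("", text)))
print(repr(SHORT_FORM.sub("", text)))
```

On this input both patterns leave one space between words and one newline between the two lines.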

Replace tabs and spaces with a single space as well as carriage returns and newlines with a single newline

First, I'd like to point out that new lines can be either \r, \n, or \r\n depending on the operating system.

My solution:

echo preg_replace('/[ \t]+/', ' ', preg_replace('/[\r\n]+/', "\n", $string));

Which could be separated into 2 lines if necessary:

$string = preg_replace('/[\r\n]+/', "\n", $string);
echo preg_replace('/[ \t]+/', ' ', $string);

Update:

An even better solution would be this one:

echo preg_replace('/[ \t]+/', ' ', preg_replace('/\s*$^\s*/m', "\n", $string));

Or:

$string = preg_replace('/\s*$^\s*/m', "\n", $string);
echo preg_replace('/[ \t]+/', ' ', $string);

I've improved the regular expression that collapses multiple line breaks into a single one. It uses the "m" modifier (which makes ^ and $ match at the start and end of each line) and removes any \s characters (space, tab, newline, carriage return) that are at the end of one line and the beginning of the next. This solves the problem of empty lines that contain nothing but spaces. With my previous example, a line filled with spaces would still have left an extra line.
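Here is a two-step sketch of the same goal in Python. It deliberately does not port the $^ pattern; instead it uses a pattern of my own choosing that swallows a line break together with the whitespace around it, so whitespace-only lines collapse as well. The function name normalize_whitespace is hypothetical.

```python
import re


def normalize_whitespace(text: str) -> str:
    # Step 1: a line break plus any whitespace around it (trailing spaces,
    # whitespace-only lines, further line breaks) becomes a single \n.
    # Side effect: indentation at the start of the next line is removed too.
    text = re.sub(r"[ \t]*(?:\r\n|\r|\n)\s*", "\n", text)
    # Step 2: remaining runs of spaces/tabs inside a line become one space.
    return re.sub(r"[ \t]+", " ", text)


sample = "one \t two  \r\n   \r\n\t\nthree\t\tfour"
print(repr(normalize_whitespace(sample)))
```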


