Replace Tabs and Spaces with a Single Space as Well as Carriage Returns and Newlines with a Single Newline

First, I'd like to point out that new lines can be either \r, \n, or \r\n depending on the operating system.

My solution:

echo preg_replace('/[ \t]+/', ' ', preg_replace('/[\r\n]+/', "\n", $string));

Which could be separated into 2 lines if necessary:

$string = preg_replace('/[\r\n]+/', "\n", $string);
echo preg_replace('/[ \t]+/', ' ', $string);

Update:

An even better solution would be this one:

echo preg_replace('/[ \t]+/', ' ', preg_replace('/\s*$^\s*/m', "\n", $string));

Or:

$string = preg_replace('/\s*$^\s*/m', "\n", $string);
echo preg_replace('/[ \t]+/', ' ', $string);

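I've improved the regular expression that turns multiple line breaks into a single one. It uses the "m" modifier (which makes ^ and $ match at the start and end of each line) and is meant to remove any \s characters (space, tab, carriage return, newline) that are at the end of one line and the beginning of the next. The goal is to solve the problem of empty lines that have nothing but spaces. With my previous example, if a line was filled with spaces, it would have left an extra line in the output. One caveat: because $ and ^ sit next to each other, the pattern can only match at a position that is both directly before and directly after a \n, so it is worth testing against lines that contain only spaces and against \r\n input before relying on it.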
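To make the whitespace-only-line case concrete, here is a minimal Python sketch (Python rather than PHP purely for illustration). The function names `collapse_naive` and `collapse_strict` are hypothetical, and the second pattern is an assumption of this sketch, not the pattern from the answer above: it explicitly consumes spaces and tabs before a line break and any whitespace after it.

```python
import re

def collapse_naive(text):
    # First approach: runs of \r and \n become one \n,
    # then runs of spaces and tabs become one space.
    text = re.sub(r'[\r\n]+', '\n', text)
    return re.sub(r'[ \t]+', ' ', text)

def collapse_strict(text):
    # Assumed variant: also swallow spaces/tabs before a line break and any
    # whitespace after it, so lines holding only spaces or tabs disappear.
    text = re.sub(r'[ \t]*[\r\n]\s*', '\n', text)
    return re.sub(r'[ \t]+', ' ', text)

sample = "first \t line\r\n   \r\n\r\nsecond   line"
print(repr(collapse_naive(sample)))   # 'first line\n \nsecond line'
print(repr(collapse_strict(sample)))  # 'first line\nsecond line'
```

Note that the strict variant also removes leading indentation on the following line, which may or may not be what you want.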

Replace multiple newlines, tabs, and spaces

In theory, your regular expression does work, but the problem is that not all operating systems and browsers send only \n at the end of a line. Many will also send a \r.

Try this simplified version:

preg_replace("/(\r?\n){2,}/", "\n\n", $text);

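And to address the problem of some systems sending \r only (\r\n is tried first, as one atomic unit, so that a single Windows line break is not counted as two):

preg_replace("/(?>\r\n|\r|\n){2,}/", "\n\n", $text);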
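The same idea can be sketched in Python to show why the order of alternatives matters. This is an illustrative sketch, not part of the original answer; `limit_blank_lines` is a hypothetical name. A character class such as `[\r\n]{2,}` treats one `\r\n` as two line breaks, so the sketch matches whole line-break sequences instead and uses a lookahead to stop a lone `\r` from being split off a `\r\n` pair.

```python
import re

def limit_blank_lines(text):
    # Match \r\n as one unit; a bare \r only counts when no \n follows it,
    # so a single Windows line break can never satisfy {2,} on its own.
    return re.sub(r'(?:\r\n|\r(?!\n)|\n){2,}', '\n\n', text)

print(repr(limit_blank_lines("a\r\n\r\n\r\nb")))  # 'a\n\nb'
print(repr(limit_blank_lines("a\r\rb")))          # 'a\n\nb'
print(repr(limit_blank_lines("a\r\nb")))          # 'a\r\nb' (single break untouched)
```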

Based on your update:

// Replace multiple (one or more) line breaks with a single one.
$text = preg_replace("/[\r\n]+/", "\n", $text);

$text = wordwrap($text, 120, '<br/>', true);
$text = nl2br($text);

Regex Python - Replace any combination of line breaks, tabs, spaces, by single space

Try using \s, which matches all whitespace characters.

>>> import re
>>> s = 'Copyright ©\n\t\t\t\n\t\t\t2019\n\t\t\tApple Inc. All rights reserved.'
>>> s = re.sub(r"\s+", " ", s)
>>> s
'Copyright © 2019 Apple Inc. All rights reserved.'
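If no regular expression is needed, `str.split()` with no arguments followed by `" ".join()` gives a similar result. The sketch below is an illustration, not part of the original answer; the `padded` string is made up to show the one behavioural difference, namely that the split/join form also drops leading and trailing whitespace.

```python
import re

s = 'Copyright ©\n\t\t\t\n\t\t\t2019\n\t\t\tApple Inc. All rights reserved.'

# split() with no arguments splits on any run of whitespace and
# discards empty strings, so joining with one space collapses every run.
flattened = " ".join(s.split())
print(flattened)  # Copyright © 2019 Apple Inc. All rights reserved.

padded = "  a \n b  "
with_regex = re.sub(r"\s+", " ", padded)
with_split = " ".join(padded.split())
print(repr(with_regex))  # ' a b '
print(repr(with_split))  # 'a b'
```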

How can I replace newlines/line breaks with spaces in javascript?

You can use the .replace() function:

words = words.replace(/\n/g, " ");

Note that you need the g flag on the regular expression to get replace to replace all the newlines with a space rather than just the first one.

Also, note that you have to assign the result of .replace() to a variable because it returns a new string. It does not modify the existing string. Strings in JavaScript are immutable (they can't be modified in place), so any operation on a string, such as .slice(), .concat(), or .replace(), returns a new string.

let words = "a\nb\nc\nd\ne";
console.log("Before:");
console.log(words);
words = words.replace(/\n/g, " ");

console.log("After:");
console.log(words);

stripping tabs, newlines, and spaces from string output, but leave one space so that words are not connected

You can iterate over each string in the list and then use re.sub to replace each run of two or more whitespace characters with ': ', stripping the leftover ': ' from both ends:

>>> import re
>>> lst = ['\n\n\n Headquarters or Regional Office\n\n\n\n\n\t\t\t\t\t\t\t\t\tMain Headquarters\t\t\t\t\t\t\t\n\n', '\n\n\n Founders\n\n\n\n\n\t\t\t\t\t\t\t\t\tThomas Lon Van\t\t\t\t\t\t\t\n\n', '\n\n\n Founder Diversity\n\n\n\n\n\t\t\t\t\t\t\t\t\tN/A\t\t\t\t\t\t\t\n\n', '\n\n\n Year Founded\n\n\n\n\n\t\t\t\t\t\t\t\t\t2016\t\t\t\t\t\t\t\n\n', '\n\n\n # of Employees\n\n\n\n\n\t\t\t\t\t\t\t\t\t1-10\t\t\t\t\t\t\t\n\n', '\n\n\n Seeking Funding?\n\n\n\n\n\t\t\t\t\t\t\t\t\tNo \t\t\t\t\t\t\t\n\n', '\n\n\n Funding Phase\n\n\n\n\n\t\t\t\t\t\t\t\t\tN/A\t\t\t\t\t\t\t\n\n']
>>> [re.sub(r'\s\s+', ': ', word).strip(': ') for word in lst]
['Headquarters or Regional Office: Main Headquarters', 'Founders: Thomas Lon Van', 'Founder Diversity: N/A', 'Year Founded: 2016', '# of Employees: 1-10', 'Seeking Funding?: No', 'Funding Phase: N/A']
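If the label and value are needed separately rather than as one joined string, a variation is to split on the same runs of whitespace. This is a sketch under the assumption that each item contains exactly one label and one value; `to_pairs` is a hypothetical helper and the two-item list is a shortened version of the data above.

```python
import re

def to_pairs(items):
    # Hypothetical helper: trim each string, split on runs of two or more
    # whitespace characters, and keep items that yield a (label, value) pair.
    pairs = {}
    for item in items:
        parts = re.split(r'\s{2,}', item.strip())
        if len(parts) == 2:
            label, value = parts
            pairs[label] = value
    return pairs

lst = ['\n\n\n Year Founded\n\n\n\n\n\t\t\t2016\t\t\t\n\n',
       '\n\n\n Seeking Funding?\n\n\n\n\n\t\t\tNo \t\t\t\n\n']
print(to_pairs(lst))  # {'Year Founded': '2016', 'Seeking Funding?': 'No'}
```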

Remove new lines from string and replace with one empty space

You have to be cautious of double line breaks, which would cause double spaces. Use this really efficient regular expression:

$string = trim(preg_replace('/\s\s+/', ' ', $string));

Multiple spaces and newlines are replaced with a single space.

Edit: As others have pointed out, this pattern requires at least two consecutive whitespace characters, so a single newline between words is left in place. That case does not appear in the example, but one can easily see how it could occur. An alternative is to match one or more whitespace characters:

$string = trim(preg_replace('/\s+/', ' ', $string));
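The difference between the two patterns is easy to check. Here is a short Python sketch (used for illustration only, with a made-up input string) that applies both to the same text:

```python
import re

text = "one\ntwo\n\nthree  four"

# \s\s+ needs at least two whitespace characters, so the single \n survives.
two_or_more = re.sub(r'\s\s+', ' ', text).strip()
# \s+ matches any run of whitespace, including a single newline.
one_or_more = re.sub(r'\s+', ' ', text).strip()

print(repr(two_or_more))  # 'one\ntwo three four'
print(repr(one_or_more))  # 'one two three four'
```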

