How to replace all XHTML/HTML line breaks (br) with new lines?
I would generally say "don't use regex to work with HTML", but for this one I would probably go with a regex, considering that <br> tags generally look like either <br> or <br/>, with any number of spaces before the /.
I suppose something like this would do the trick:
$html = 'this <br>is<br/>some<br />text <br />!';
$nl = preg_replace('#<br\s*/?>#i', "\n", $html);
echo $nl;
A couple of notes on the pattern:
- it starts with <br
- followed by any number of whitespace characters: \s*
- optionally, a /: /?
- and, finally, a >
- all of this using a case-insensitive match (the i flag after the closing # delimiter), as <BR> would be valid in HTML
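For comparison, the same pattern can be sketched in JavaScript. This is only an illustration that assumes the input is a plain string; the function name brToNewline is made up for this example and is not part of the answer above.

```javascript
// Hypothetical helper: replaces <br>, <br/> and <br /> (in any letter case) with "\n".
function brToNewline(html) {
  // g = replace every occurrence, i = case-insensitive (so <BR> matches too)
  return html.replace(/<br\s*\/?>/gi, "\n");
}

// Same sample string as the PHP snippet above.
console.log(brToNewline("this <br>is<br/>some<br />text <br />!"));
```

Unlike preg_replace, which replaces every match by default, String.prototype.replace with a regex only replaces the first match unless the g flag is set.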
Convert (render) HTML to Text with correct line-breaks
The code below works correctly with the example provided and even deals with some odd cases like <div><br></div>. There are still some things to improve, but the basic idea is there. See the comments.
using HtmlAgilityPack;

//extension methods must live in a static, non-nested class; the class name here is arbitrary
public static class HtmlTextFormatter
{
    public static string FormatLineBreaks(string html)
    {
        //first - remove all the existing '\n' from HTML
        //they mean nothing in HTML, but break our logic
        html = html.Replace("\r", "").Replace("\n", " ");

        //now create an Html Agility Pack document object
        HtmlDocument doc = new HtmlDocument();
        doc.LoadHtml(html);

        //remove comments, head, style and script tags
        foreach (HtmlNode node in doc.DocumentNode.SafeSelectNodes("//comment() | //script | //style | //head"))
        {
            node.ParentNode.RemoveChild(node);
        }

        //now remove all "meaningless" inline elements like "span"
        foreach (HtmlNode node in doc.DocumentNode.SafeSelectNodes("//span | //label")) //add "b", "i" if required
        {
            node.ParentNode.ReplaceChild(HtmlNode.CreateNode(node.InnerHtml), node);
        }

        //block-elements - convert to line-breaks
        foreach (HtmlNode node in doc.DocumentNode.SafeSelectNodes("//p | //div")) //you could add more tags here
        {
            //we add a "\n" ONLY if the node contains some plain text as "direct" child
            //meaning - text is not nested inside children, but only one-level deep

            //use XPath to find direct "text" in element
            var txtNode = node.SelectSingleNode("text()");

            //no "direct" text - NOT ADDING the \n
            if (txtNode == null || txtNode.InnerHtml.Trim() == "") continue;

            //"surround" the node with line breaks
            node.ParentNode.InsertBefore(doc.CreateTextNode("\r\n"), node);
            node.ParentNode.InsertAfter(doc.CreateTextNode("\r\n"), node);
        }

        //todo: might need to replace multiple "\n\n" into one here, I'm still testing...

        //now BR tags - simply replace with "\n" and forget
        foreach (HtmlNode node in doc.DocumentNode.SafeSelectNodes("//br"))
            node.ParentNode.ReplaceChild(doc.CreateTextNode("\r\n"), node);

        //todo - you should probably add "&code;" processing, to decode entities such as &nbsp;
        //(HtmlEntity.DeEntitize may help here)

        //finally - return the text which will have our inserted line-breaks in it
        return doc.DocumentNode.InnerText.Trim();
    }

    //here's the extension method I use
    private static HtmlNodeCollection SafeSelectNodes(this HtmlNode node, string selector)
    {
        //SelectNodes returns null when nothing matches, so fall back to an empty collection
        return (node.SelectNodes(selector) ?? new HtmlNodeCollection(node));
    }
}
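The two TODOs in the code above (collapsing repeated line breaks and decoding "&code;" entities) are plain string post-processing steps, so they can be sketched on their own. The JavaScript below is a minimal illustration, not part of the original answer: the names decodeEntities, collapseLineBreaks and NAMED_ENTITIES are hypothetical, and the entity table is deliberately tiny.

```javascript
// Hypothetical helpers, independent of HtmlAgilityPack.

// Only a handful of named entities are covered; a real decoder needs the full table.
const NAMED_ENTITIES = new Map([
  ["amp", "&"], ["lt", "<"], ["gt", ">"],
  ["quot", '"'], ["apos", "'"], ["nbsp", "\u00A0"],
]);

// Decodes numeric references (&#65;, &#x42;) and the few named entities above.
function decodeEntities(text) {
  return text.replace(/&(#x[0-9a-f]+|#[0-9]+|[a-z]+);/gi, (match, body) => {
    if (body[0] === "#") {
      const isHex = body[1] === "x" || body[1] === "X";
      const code = parseInt(body.slice(isHex ? 2 : 1), isHex ? 16 : 10);
      // Leave out-of-range references untouched instead of throwing.
      return code <= 0x10ffff ? String.fromCodePoint(code) : match;
    }
    // Unknown names are left as they are.
    return NAMED_ENTITIES.has(body) ? NAMED_ENTITIES.get(body) : match;
  });
}

// Normalizes \r\n and \r to \n, then squeezes runs of newlines into one.
function collapseLineBreaks(text) {
  return text.replace(/\r\n?/g, "\n").replace(/\n{2,}/g, "\n");
}

console.log(collapseLineBreaks(decodeEntities("Tom &amp; Jerry\r\n\r\n&lt;end&gt;")));
```

Because the replacement runs in a single pass, input such as &amp;lt; is decoded once to &lt; and not a second time to <.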
How can I convert/replace every newline to 'br/'?
You need to use html_safe if you want to render embedded HTML:
<%= @the_string.html_safe %>
If it might be nil, raw(@the_string) won't throw an exception. I'm a bit ambivalent about raw; I almost never try to display a string that might be nil.
Removing newline after h1 tags?
Sounds like you want to format them as inline. By default, h1 and h2 are block-level elements which span the entire width of the line. You can change them to inline with CSS like this:
h1, h2 {
display: inline;
}
Here's an article that explains the difference between block and inline in more detail: http://www.webdesignfromscratch.com/html-css/css-block-and-inline/
To maintain vertical padding, use inline-block, like this:
h1, h2 {
display: inline-block;
}
How do I create a new line in Javascript?
Use \n for a newline character.
document.write("\n");
You can also have more than one:
document.write("\n\n\n"); // 3 new lines! My oh my!
However, if this is rendering to HTML, you will want to use the HTML tag for a newline:
document.write("<br>");
The string Hello\n\nTest in your source will look like this:
Hello

Test
The string Hello<br><br>Test will look like this in HTML source:
Hello<br><br>Test
The HTML one will render as line breaks for the person viewing the page; the \n just drops the text to the next line in the source (if it's on an HTML page).
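The difference between the two forms can be checked without a browser. The snippet below is a small sketch (the variable names are arbitrary) that splits the string on \n and rejoins it with <br>:

```javascript
// "\n" is a single newline character inside the string.
const text = "Hello\n\nTest";

// Splitting on it yields three lines, the middle one empty.
const lines = text.split("\n");

// Joining the same lines with <br> gives the markup a browser would render as line breaks.
const asHtml = lines.join("<br>");

console.log(lines.length);
console.log(asHtml);
```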
HTML 5: Is it br, br/, or br /?
Simply <br> is sufficient.
The other forms are there for compatibility with XHTML: they make it possible to write the same code as XHTML and have it also work as HTML. Some systems that generate HTML may be based on XML generators, and thus cannot output just a bare <br> tag; if you're using such a system, it's fine to use <br/>, it's just not necessary otherwise.
Very few people actually use XHTML, however. You need to serve your content as application/xhtml+xml for it to be interpreted as XHTML, and that will not work in old versions of IE. It also means that any small error you make will prevent your page from being displayed in browsers that do support XHTML. So, most of what looks like XHTML on the web is actually being served, and interpreted, as HTML. See "Serving XHTML as text/html Considered Harmful" for some more information.
Replacing line breaks with br tags in multi-line text nodes not enclosed in tags
So it's a little more complicated than what I said in my comment, but I think something like this might work:
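import java.util.List;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.nodes.Node;
import org.jsoup.nodes.TextNode;

public class BrReplacer // class name is arbitrary
{
    public static void main (String[] args)
    {
        String text = "text11\n"
            + "text 21<p>tagged text1\n"
            + "tagged text2</p>\n"
            + "text 2";
        // parseBodyFragment already places the input inside a <body> element
        Document doc = Jsoup.parseBodyFragment(text);
        Element body = doc.body();
        List<Node> children = body.childNodes();
        StringBuilder sb2 = new StringBuilder();
        for (Node n : children) {
            if (n instanceof TextNode) {
                // top-level text: emit it with newlines turned into <br/>
                // (getWholeText() is unescaped; re-escape it if it may contain < or &)
                sb2.append(((TextNode) n).getWholeText().replace("\n", "<br/>"));
            } else {
                // elements such as <p> are written back unchanged
                sb2.append(n.outerHtml());
            }
        }
        System.out.println(sb2.toString());
    }
}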
Basically, get all the Nodes, do a replace on the TextNodes, and put them back together. I'm not 100% sure this will work as-is, since I am not able to test it at the moment. But hopefully it gets the idea across.
What I said in my comment doesn't work because you have to be able to put the child elements back in place between the text. You can't do that if you just use ownText().
I haven't used Jsoup much myself, so improvements are welcome if anyone has any.
Keep line breaks in HTML string
HTML, in general, uses br tags to denote a new line. A plain textarea tag does not use this; it uses whatever the user's system uses to denote a new line. This can vary by operating system.
Your simplest solution is to use CSS:
<main role="main" class="container">
<p style="margin-bottom: 2rem;white-space:pre-wrap;">{{review.body}}</p>
</main>
This will maintain any "white space" formatting, including additional spaces.
If you want to actually replace the newline characters with br tags, you can use the following regex:
<main role="main" class="container">
<p style="margin-bottom: 2rem;" [innerHTML]="review.body.replace(/(?:\r\n|\r|\n)/g, '<br>')"></p>
</main>
Edit: Thanks to ConnorsFan for the heads-up on replace not working with interpolation.
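If the replacement is moved out of the template into a helper, it might look like the sketch below. The function names escapeHtml and newlinesToBr are hypothetical, and escaping the text before inserting the <br> tags is an added assumption here (that review.body is untrusted user input), not something the answer above does.

```javascript
// Hypothetical helper: escapes the characters that are special in HTML.
function escapeHtml(text) {
  return text
    .replace(/&/g, "&amp;") // must run first so later entities are not re-escaped
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

// Hypothetical helper: same newline regex as the template above,
// applied after escaping so that only the inserted <br> tags are real markup.
function newlinesToBr(text) {
  return escapeHtml(text).replace(/(?:\r\n|\r|\n)/g, "<br>");
}

console.log(newlinesToBr("first line\r\nsecond <b>line</b>\nthird"));
```

The alternation lists \r\n before \r and \n so that a Windows line ending becomes one <br> and not two.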
replace br tag from a string in php
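// matches <br>, <br/> and <br /> in any letter case; preg_replace returns the new string
$your_string = preg_replace('/<br\s*\/?>/i', "\n", $your_string);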