Remove HTML Tags in JavaScript with Regex

Try this, noting that the grammar of HTML is too complex for regular expressions to be correct 100% of the time:

var regex = /(<([^>]+)>)/ig
, body = "<p>test</p>"
, result = body.replace(regex, "");

console.log(result);
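To make the caveat concrete, here is a small sketch that wraps the same regex in a helper and runs it on one input it handles well and one it does not. The name stripTags is an assumption for illustration, not part of the answer. A ">" inside an attribute value ends the match early and leaves debris behind.

```javascript
// Hypothetical helper wrapping the regex from the answer above.
function stripTags(html) {
  return html.replace(/(<([^>]+)>)/gi, "");
}

const simple = stripTags("<p>test</p>");
// The ">" inside the title attribute is mistaken for the end of the tag.
const tricky = stripTags('<a title="a > b">link</a>');

console.log(simple);                 // test
console.log(JSON.stringify(tricky)); // " b\">link"
```

For markup you do not control, a real HTML parser is the safer choice.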

If you're willing to use a library such as jQuery, you could simply do this:

console.log($('<p>test</p>').text());

How to remove html tags from an Html string using RegEx?

You can use

.replace(/<br>(?=(?:\s*<[^>]*>)*$)|(<br>)|<[^>]*>/gi, (x,y) => y ? ' & ' : '')

See the JavaScript demo:

const text = '<div class="ExternalClassBE95E28C1751447DB985774141C7FE9C"><p>Tina Schmelz<br></p><p>Sascha Balke<br></p></div>';
const regex = /<br>(?=(?:\s*<[^>]*>)*$)|(<br>)|<[^>]*>/gi;
console.log(
text.replace(regex, (x,y) => y ? ' & ' : '')
);
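Read left to right, the pattern has three alternatives: a <br> followed only by tags up to the end of the string (dropped), any other <br> (captured and turned into ' & '), and every remaining tag (dropped). Below is a minimal sketch of the same replacement with the branches named. The function name joinLines and the separator parameter are assumptions added for illustration.

```javascript
// Hypothetical wrapper around the regex from the answer above.
function joinLines(html, separator = " & ") {
  const pattern = /<br>(?=(?:\s*<[^>]*>)*$)|(<br>)|<[^>]*>/gi;
  // innerBr is only defined when the second alternative matched,
  // i.e. for a <br> that still has text somewhere after it.
  return html.replace(pattern, (match, innerBr) => (innerBr ? separator : ""));
}

const html = "<div><p>Tina Schmelz<br></p><p>Sascha Balke<br></p></div>";
console.log(joinLines(html));       // Tina Schmelz & Sascha Balke
console.log(joinLines(html, ", ")); // Tina Schmelz, Sascha Balke
```

Note that only the exact spelling <br> is treated as a separator; variants such as <br/> fall through to the last alternative and are simply removed.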

Javascript replace regex all html tags except p,a and img

You may match the tags to keep in a capture group and then, using alternation, all other tags. Then replace with $1:

(<\/?(?:a|p|img)[^>]*>)|<[^>]+>

Demo: https://regex101.com/r/Sm4Azv/2

And the JavaScript demo:

var input = 'b<body>b a<a>a h1<h1>h1 p<p>p p</p>p img<img />img';
var output = input.replace(/(<\/?(?:a|p|img)[^>]*>)|<[^>]+>/ig, '$1');
console.log(output);
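One caveat: without a word boundary, (?:a|p|img)[^>]* also matches tags that merely start with those letters, such as <pre> or <abbr>. The sketch below builds the same pattern from an allowlist and adds \b after the tag name. keepTags and its parameters are hypothetical names, and the tag names are assumed to be plain alphanumerics that need no regex escaping.

```javascript
// Hypothetical helper: keep only the tags named in `allowed`, strip the rest.
function keepTags(html, allowed) {
  const names = allowed.join("|"); // assumes names contain no regex metacharacters
  const pattern = new RegExp("(<\\/?(?:" + names + ")\\b[^>]*>)|<[^>]+>", "gi");
  // "$1" is empty whenever the second alternative matched, so those tags vanish.
  return html.replace(pattern, "$1");
}

const sample = 'b<body>b a<a>a h1<h1>h1 p<p>p p</p>p img<img />img';
console.log(keepTags(sample, ["a", "p", "img"]));
// bb a<a>a h1h1 p<p>p p</p>p img<img />img

console.log(keepTags("<pre>x</pre><p>y</p>", ["a", "p", "img"]));
// x<p>y</p>
```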

JS Regex remove HTML Tags and Content

Create a dummy element, then discard its child elements and keep only the top-level text nodes:

function stripElements( str )
{
  var dummyDiv = document.createElement( "div" );
  dummyDiv.innerHTML = str;
  // nodeType 3 is a text node; element children (and their content) are dropped
  return Array.from( dummyDiv.childNodes ).filter( s => s.nodeType == 3 ).map( s => s.nodeValue ).join("");
}

Demo

var str = `Color<select><option value="">Show All</option><option value="Black">Black</option><option value="Blue">Blue</option></select>`;
function stripElements( str ){
  var dummyDiv = document.createElement( "div" );
  dummyDiv.innerHTML = str;
  return Array.from( dummyDiv.childNodes ).filter( s => s.nodeType == 3 ).map( s => s.nodeValue ).join("");
}
console.log( stripElements(str) );
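Where no DOM is available (for example in Node.js), a regex can approximate the same result. This is a rough sketch, not an equivalent of the DOM approach: removeElements is a hypothetical name, and the pattern assumes that every element has a closing tag and that elements with the same tag name are never nested inside each other.

```javascript
// Hypothetical regex-only approximation: drop each element together with its content.
function removeElements(html) {
  // <name ...> followed lazily by anything up to the first </name> with the same name.
  const pattern = /<([a-z][a-z0-9]*)\b[^>]*>[\s\S]*?<\/\1\s*>/gi;
  return html.replace(pattern, "");
}

const markup = `Color<select><option value="">Show All</option><option value="Black">Black</option><option value="Blue">Blue</option></select>`;
console.log(removeElements(markup)); // Color
```

With nested same-name elements such as a div inside a div, the lazy match stops at the inner closing tag and leaves the rest behind, which is why the DOM version is preferable whenever it is available.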

How to remove all html tags including '&nbsp;' from string?

The text looks to be double-escaped: first turn all the &amp;s into &s, so that the HTML entities can be properly recognized. Then .text() will give you the plain-text version of the HTML markup.

const input = `<p>Lorem Ipsum&amp;nbsp;is simply dummy text of the printing and typesetting industry.Lorem Ipsum has been the industry&amp;#39;s standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book. It has survived not only five centuries, but also the leap into electronic typesetting,remaining essentially unchanged. It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages, and more recently with desktop publishing software like Aldus PageMaker including versions of Lorem Ipsum.</p>\n\n<p> </p>\n\n<p>TItle </p>\n`;
// Undo the extra level of escaping so that &amp;nbsp; becomes &nbsp; again
const inputWithProperEntities = input.replaceAll('&amp;', '&');
console.log($(inputWithProperEntities).text());
<script src="https://cdnjs.cloudflare.com/ajax/libs/jquery/3.3.1/jquery.min.js"></script>
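If jQuery or a DOM is not available, the same three steps (undo the extra escaping, drop the tags, decode the entities) can be sketched with plain string operations. Everything named here (NAMED_ENTITIES, decodeEntities, toPlainText) is hypothetical, the entity table is deliberately tiny, and the tag removal is the naive regex discussed earlier, so treat it as an illustration rather than a full decoder.

```javascript
// Small, deliberately incomplete table of named entities (an assumption for this sketch).
const NAMED_ENTITIES = new Map([
  ["amp", "&"], ["lt", "<"], ["gt", ">"],
  ["quot", '"'], ["apos", "'"], ["nbsp", "\u00a0"],
]);

function decodeEntities(text) {
  return text.replace(/&(#[xX][0-9a-fA-F]+|#[0-9]+|[a-zA-Z]+);/g, (match, body) => {
    if (body[0] === "#") {
      const isHex = body[1] === "x" || body[1] === "X";
      const code = parseInt(body.slice(isHex ? 2 : 1), isHex ? 16 : 10);
      // Leave out-of-range references untouched instead of throwing.
      return code <= 0x10ffff ? String.fromCodePoint(code) : match;
    }
    // Unknown names are left as they are.
    return NAMED_ENTITIES.has(body) ? NAMED_ENTITIES.get(body) : match;
  });
}

function toPlainText(doubleEscapedHtml) {
  const html = doubleEscapedHtml.replace(/&amp;/g, "&"); // undo the extra escaping
  const withoutTags = html.replace(/<[^>]*>/g, "");      // naive tag removal
  return decodeEntities(withoutTags);
}

const sample = "<p>Lorem Ipsum&amp;nbsp;is the industry&amp;#39;s standard.</p>";
console.log(toPlainText(sample));
```

The space after "Ipsum" in the result is a non-breaking space (U+00A0), not a regular one; replace it with " " afterwards if that matters for your use case.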

Remove HTML tags in script

The regex /<{1}[^<>]{1,}>{1}/g matches each pair of angle brackets together with the text between them, so the replace below swaps every tag for a single space. The {1} quantifiers are redundant, so /<[^<>]+>/g behaves the same.

  var str = "<hi>How are you<hi><table><tr>I<tr><table>love cake<g>";
  str = str.replace(/<{1}[^<>]{1,}>{1}/g, " ");
  document.writeln(str);
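As a quick check, the sketch below runs that regex next to its shorter spelling and then collapses the leftover runs of spaces. The variable names are illustrative only, and the whitespace cleanup is an optional extra step that the answer itself does not include.

```javascript
const verbose = /<{1}[^<>]{1,}>{1}/g; // regex from the answer
const compact = /<[^<>]+>/g;          // same pattern without the redundant quantifiers

const source = "<hi>How are you<hi><table><tr>I<tr><table>love cake<g>";
const viaVerbose = source.replace(verbose, " ");
const viaCompact = source.replace(compact, " ");

console.log(viaVerbose === viaCompact); // true

// Optional: squeeze repeated whitespace and trim the ends.
const tidy = viaCompact.replace(/\s+/g, " ").trim();
console.log(tidy); // How are you I love cake
```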

Remove HTML tags and newline characters with Regex

This works for me. Are your CRLFs '\r' one escaped character, or two characters, being '\' and 'r'?

If your HTML contains the visible characters \n and \r (a backslash followed by a letter), they are literal text, and that would be really odd inside a div unless you are displaying source code. Plain ol' line breaks will end up as expected with a single escape character.

Also, it's not clear if your source is getting pulled from an element or is static text.

To handle the literal case, you might have to escape the backslashes in your regex:

replace(/(?:\\r\\n|\\r|\\n)/g, '<br>')
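The difference between the two cases is easy to see side by side. In this minimal sketch (all names are illustrative), one string holds real CR/LF control characters and the other holds the four visible characters backslash, r, backslash, n; each regex only matches its own case.

```javascript
const real = "line one\r\nline two";      // contains an actual CR and LF
const literal = "line one\\r\\nline two"; // contains the text \r\n (4 characters)

const realBreaks = /(?:\r\n|\r|\n)/g;       // matches control characters
const literalBreaks = /(?:\\r\\n|\\r|\\n)/g; // matches a backslash followed by r or n

console.log(real.replace(realBreaks, "<br>"));       // line one<br>line two
console.log(literal.replace(realBreaks, "<br>"));    // unchanged: line one\r\nline two
console.log(literal.replace(literalBreaks, "<br>")); // line one<br>line two
```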

const text = `<div class="text-danger ng-binding" ng-bind-html="message.causedBy ">javax.xml.ws.soap.SOAPFaultException: Response was of unexpected text/html ContentType.  Incoming portion of HTML stream: \r\n\r\n\r\n\r\n500 - Internal server error.\r\n\r\n\r\n\r\n<div><h1>Server Error</h1></div>\r\n<div>\r\n <div class="\"content-container\"">\r\n  <h2>500 - Internal server error.</h2>\r\n  <h3>There is a problem with the resource you are looking for, and it cannot be displayed.</h3>\r\n </div>\r\n</div>\r\n\r\n\r\n\n\t</div>`
const newText = text
  .replace(/<script.*?<\/script>/g, '<br>')
  .replace(/<style.*?<\/style>/g, '<br>')
  .replace(/(<([^>]+)>)/ig, "<br>")
  .replace(/(?:\r\n|\r|\n)/g, '<br>')
  // .replace(/(?:\\r\\n|\\r|\\n)/g, '<br>')
console.log(newText)

const text2 = document.getElementById('text').innerHTML
const newText2 = text2
  .replace(/<script.*?<\/script>/g, '<br>')
  .replace(/<style.*?<\/style>/g, '<br>')
  .replace(/(<([^>]+)>)/ig, "<br>")
  .replace(/(?:\r\n|\r|\n)/g, '<br>')
  // .replace(/(?:\\r\\n|\\r|\\n)/g, '<br>')
console.log(newText2)
<div id='text'>This
is
<script>// nothing here </script>
a
div
These are literal \r\n\r\n and will not get escaped unless you uncomment the special case.
</div>

