Regex Replace Text Outside HTML Tags

Regex replace text outside html tags

Okay, try using this regex:

(text|simple)(?![^<]*>|[^<>]*</)

I tested the example on regex101 and it worked.

Breakdown:

(             # Open capture group
  text        # Match 'text'
  |           # Or
  simple      # Match 'simple'
)             # End capture group
(?!           # Negative lookahead start (the match fails if its contents match)
  [^<]*       # Any number of non-'<' characters
  >           # A '>' character
  |           # Or
  [^<>]*      # Any number of non-'<' and non-'>' characters
  </          # The characters '<' and '/'
)             # End negative lookahead

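The negative lookahead prevents a match if text or simple sits inside a tag (the next angle bracket to the right is >) or directly before a closing tag with no other tag in between (the next angle bracket starts </), that is, between an element's opening and closing tags.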
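
As a quick check of the pattern above, here is a minimal JavaScript sketch. Only the regex comes from the answer; the sample string and the square-bracket replacement are assumptions made up for illustration.

```javascript
// Regex from the answer; the '/' in '</' is escaped for a JS regex literal.
const outsideTags = /(text|simple)(?![^<]*>|[^<>]*<\/)/g;

// Hypothetical sample input: the words appear in plain text,
// in an attribute value, and as link text.
const html = 'simple text <a href="simple.html">simple</a> text';

// $1 is the word captured by the first group.
const replaced = html.replace(outsideTags, '[$1]');
console.log(replaced);
// [simple] [text] <a href="simple.html">simple</a> [text]
```

The occurrence in the href attribute and the link text are left alone. Note that any occurrence directly followed by a closing tag, such as the last word inside a <p>...</p>, is skipped as well.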

Regex replace text outside script tag

My pattern uses the PCRE verbs (*SKIP)(*FAIL) to disqualify script tags and their contents: whatever the first alternative matches is discarded, and the search resumes right after it.

text and simple will be matched on every qualifying occurrence.

Regex Pattern: ~<script.*?/script>(*SKIP)(*FAIL)|text|simple~

Code:

$strings=['This has no replacements',
'This simple text has no script tag',
'This simple text ends with a script tag <script language="javascript">simple simple text text</script>',
'This is simple html text is split by a script tag <script language="javascript">simple simple text text</script> text',
'<script language="javascript">simple simple text text</script> this text starts with a script tag'
];

$strings=preg_replace('~<script.*?/script>(*SKIP)(*FAIL)|text|simple~','***replaced***',$strings);

var_export($strings);

Output:

array (
0 => 'This has no replacements',
1 => 'This ***replaced*** ***replaced*** has no script tag',
2 => 'This ***replaced*** ***replaced*** ends with a script tag <script language="javascript">simple simple text text</script>',
3 => 'This is ***replaced*** html ***replaced*** is split by a script tag <script language="javascript">simple simple text text</script> ***replaced***',
4 => '<script language="javascript">simple simple text text</script> this ***replaced*** starts with a script tag',
)
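
(*SKIP)(*FAIL) is a PCRE feature and is not available in every regex flavor. Where it is missing, a similar effect can usually be had by matching the script block in one alternative, capturing the target words in a group, and replacing only when that group took part in the match. The sketch below shows this swapped-in technique in JavaScript rather than PHP. The helper name replaceOutsideScript is hypothetical, and [\s\S]*? stands in for .*? so that multi-line script blocks are skipped too.

```javascript
// Hypothetical helper: replace 'text' / 'simple' everywhere
// except inside <script>...</script> blocks.
function replaceOutsideScript(input) {
  return input.replace(
    /<script[\s\S]*?<\/script>|(text|simple)/g,
    // 'word' is undefined when the script alternative matched;
    // in that case the matched block is returned unchanged.
    (match, word) => (word ? '***replaced***' : match)
  );
}

// One of the test strings from the PHP example above.
const scriptSample = 'This is simple html text is split by a script tag ' +
  '<script language="javascript">simple simple text text</script> text';

const scriptResult = replaceOutsideScript(scriptSample);
console.log(scriptResult);
```

For this input the result should agree with entry 3 of the PHP output above.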

RegEx replace only occurrences outside of h html tags

You can use

\bPlus\b(?![^>]*<\/h\d+>)

To use the whole match inside the replacement pattern, use the $& backreference in your VBA code.

Details:

  • \bPlus\b - a whole word Plus
  • (?![^>]*<\/h\d+>) - a negative lookahead that fails the match if, immediately to the right of the current location, there are
    • [^>]* - zero or more chars other than >
    • <\/h - </h string
    • \d+ - one or more digits
    • > - a > char.
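
The lookahead can be tried out as follows. This is a JavaScript sketch rather than VBA (JavaScript's replace also accepts $& for the whole match); the sample string and the [$&] replacement are assumptions for illustration.

```javascript
// Pattern from the answer: 'Plus' as a whole word, not followed by
// a closing heading tag before the next '>'.
const plusOutsideHeadings = /\bPlus\b(?![^>]*<\/h\d+>)/g;

// Hypothetical sample input.
const page = 'Plus <h1>Plus</h1> and Plus';

// $& inserts the whole match into the replacement.
const marked = page.replace(plusOutsideHeadings, '[$&]');
console.log(marked);
// [Plus] <h1>Plus</h1> and [Plus]
```

Because [^>]* cannot cross a > character, a Plus that is followed by a nested tag inside the heading (for example <h1>Plus <b>x</b></h1>) would still be matched by this pattern.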

Match text outside of html tags

The problem is that you are using ., which matches any character other than a line break, including < and >. Replace it with a negated character class such as [^<>], which matches any character except < and >, and apply a greedy quantifier, either * (0 or more occurrences) or + (1 or more occurrences):

(?<!<[^>]*)(?<Text>[^<>]*)

BTW, using (?<Text>.+?) at the end of the pattern makes the regex engine match only 1 char: +? is a lazy quantifier that matches 1 or more occurrences but as few as possible, and since 1 is enough, it always matches just 1 char. A lazily quantified subpattern usually needs some other pattern after it; otherwise it does not capture the intended text.
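
The lazy-quantifier behavior is easy to see in isolation. The following is a minimal JavaScript sketch (not the regex flavor of the original question) with a made-up input string; the quantifier semantics shown here are the same in most backtracking regex engines.

```javascript
const input = 'abcdef';

// Lazy quantifier at the very end of the pattern:
// one char satisfies '+?', so the engine stops there.
const lazyAtEnd = input.match(/b.+?/)[0];
console.log(lazyAtEnd); // bc

// Lazy quantifier followed by another pattern:
// it keeps expanding until 'e' can match.
const lazyBeforeE = input.match(/b.+?e/)[0];
console.log(lazyBeforeE); // bcde

// Greedy quantifier at the end: it takes everything that is left.
const greedyAtEnd = input.match(/b.+/)[0];
console.log(greedyAtEnd); // bcdef
```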

Need regex to find text outside the tags ONLY javascript

Just remove all the tags together with their contents; what is left is the text outside them.

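var s = '<tag>Some</tag>Text, you have <tag url="something">Here</tag>';
// Remove every <name ...>...</name> element; \1 makes the closing tag name match the opening one
alert(s.replace(/<(\w+)\b[^<>]*>[\s\S]*?<\/\1>/g, '')); // "Text, you have "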

Replace with RegExp only outside tags in the string

The regex itself to replace all bs with :blablabla: is not that hard:

.replace(/b/g, ":blablabla:")

It is a bit tricky to get the text nodes where we need to perform search and replace.

Here is a DOM-based example:

function replaceTextOutsideTags(input) {
  var doc = document.createDocumentFragment();
  var wrapper = document.createElement('myelt');
  wrapper.innerHTML = input;
  doc.appendChild(wrapper);
  return textNodesUnder(doc);
}

function textNodesUnder(el) {
  var n, walk = document.createTreeWalker(el, NodeFilter.SHOW_TEXT, null, false);
  while (n = walk.nextNode()) {
    // Only touch text nodes that are direct children of the wrapper,
    // i.e. text that is outside any tag of the input
    if (n.parentNode.nodeName.toLowerCase() === 'myelt')
      n.nodeValue = n.nodeValue.replace(/:\/(?!\/)/g, "smiley_here");
  }
  return el.firstChild.innerHTML;
}

var s = 'not feeling well today :/ check out this link <a href="http://example.com">http://example.com</a>';
console.log(replaceTextOutsideTags(s));

