Removing all script tags from html with JS Regular Expression
Attempting to remove HTML markup using a regular expression is problematic, because script content and attribute values can contain anything, including text that looks like tags. One way around this is to insert the markup as the innerHTML of a div, remove any script elements and return the innerHTML, e.g.
function stripScripts(s) {
  // Parse the markup into a detached element; it is never added to the document.
  var div = document.createElement('div');
  div.innerHTML = s;
  // getElementsByTagName returns a live collection, so walk it backwards while removing.
  var scripts = div.getElementsByTagName('script');
  var i = scripts.length;
  while (i--) {
    scripts[i].parentNode.removeChild(scripts[i]);
  }
  return div.innerHTML;
}
alert(
  stripScripts('<span><script type="text/javascript">alert(\'foo\');<\/script><\/span>')
);
Note that browsers do not execute script elements inserted through the innerHTML property, and here the div is never added to the document anyway. Inline event handler attributes (for example onerror on an img) are a separate matter and are not removed by this function.
Remove all scripts with javascript regex
Try this:
(/<.*?script.*?>.*?<\/.*?script.*?>/igm, '')
or
(/<script.*?>.*?<\/script>/igm, '')
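(Note that the 'm' switch only changes how ^ and $ match; it does not make . match line breaks. For a script that spans several lines, use [\s\S] in place of the dot, or the 's' switch where the engine supports it, which is ES2018 and later. The first pattern is also loose: <.*?script can start at an earlier tag on the same line and remove that tag too.)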
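A quick check of how these patterns treat line breaks, as a sketch only. The input string and the variable names are made up for illustration:

```javascript
// Input whose script body spans several lines.
var html = 'a<script>\nalert(1);\n</script>b';

// The dot stops at line breaks, and the m flag does not change that.
var withM = html.replace(/<script.*?>.*?<\/script>/igm, '');

// [\s\S] matches any character, including line breaks.
var withClass = html.replace(/<script[\s\S]*?>[\s\S]*?<\/script>/ig, '');

console.log(withM === html); // true: nothing was removed
console.log(withClass);      // ab
```

If this runs inside an inline script block in an HTML page, write the closing tag in the string as <\/script> so the HTML parser does not end the block early.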
Regex To Remove Script And Style Tags + Content Javascript
I want to have the equivalent of
<script(.*?)>(.*?)</script> //in javascript
/<script([\S\s]*?)>([\S\s]*?)<\/script>/ig
Use [\S\s]*? instead of .*? in your regex, because JavaScript engines older than ES2018 do not support the s modifier (DOTALL modifier). [\S\s]*? matches any character, whitespace or not, zero or more times non-greedily.
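Since the question asks about both script and style tags, here is a minimal sketch that handles the two in one pass. The function name stripScriptAndStyle is hypothetical, and this is still a regex over HTML, so it can be fooled by unusual markup (for example a closing tag inside a string literal):

```javascript
// Hypothetical helper, not taken from the original answer.
function stripScriptAndStyle(html) {
  // \b stops <scripted> or <styles> from matching.
  // \1 requires the closing tag to have the same name as the opening tag.
  return html.replace(/<(script|style)\b[\s\S]*?>[\s\S]*?<\/\1\s*>/gi, '');
}

var sample = '<p>a</p><style>\np{color:red}\n</style>' +
  '<script type="x">\nalert(1)\n</script><p>b</p>';

console.log(stripScriptAndStyle(sample)); // <p>a</p><p>b</p>
```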
How to remove all script tags from html file
If I understood your question correctly and you want to delete everything inside <script></script>, I think you have to split the sed command into parts (you can still write it as a one-liner with ;):
Using:
sed 's/<script>.*<\/script>//g;/<script>/,/<\/script>/{/<script>/!{/<\/script>/!d}};s/<script>.*//g;s/.*<\/script>//g'
The first piece (s/<script>.*<\/script>//g) handles scripts that open and close on the same line;
The second piece (/<script>/,/<\/script>/{/<script>/!{/<\/script>/!d}}) is almost a quote of @akingokay's answer, except that I kept the lines where the tags occur (in case they have something before or after). There is a great explanation of that in Using sed to delete all lines between two matching patterns;
The last two (s/<script>.*//g and s/.*<\/script>//g) finally take care of the lines that open a script without closing it, or close one without opening it.
Note that all of these patterns match only a bare <script> tag; an opening tag with attributes is not matched.
Now if you have an index.html that has:
<html>
<body>
foo
<script> console.log("bar") </script>
<div id="something"></div>
<script>
// Multiple Lines script
// Blah blah
</script>
foo <script> //Some
console.log("script")</script> bar
</body>
</html>
and you run this sed command, you will get:
cat index.html | sed 's/<script>.*<\/script>//g;/<script>/,/<\/script>/{/<script>/!{/<\/script>/!d}};s/<script>.*//g;s/.*<\/script>//g'
<html>
<body>
foo
<div id="something"></div>
foo
bar
</body>
</html>
The result will contain a lot of blank lines and leftover spaces where the scripts used to be, but the command should work as expected. Of course you could easily remove those with sed as well.
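For comparison, a rough JavaScript sketch of the same cleanup on the sample file, with the contents held in a string. The helper name stripScriptsAndBlankLines is hypothetical, and like the sed command it only matches a bare <script> tag:

```javascript
// Sample input modeled on the index.html above.
var html = [
  '<html>',
  '<body>',
  'foo',
  '<script> console.log("bar") </script>',
  '<div id="something"></div>',
  '<script>',
  '// Multiple Lines script',
  '// Blah blah',
  '</script>',
  'foo <script> //Some',
  'console.log("script")</script> bar',
  '</body>',
  '</html>'
].join('\n');

// Hypothetical helper: drop each <script>...</script> pair, then drop lines left blank.
function stripScriptsAndBlankLines(text) {
  return text
    .replace(/<script>[\s\S]*?<\/script>/g, '')
    .split('\n')
    .filter(function (line) { return line.trim() !== ''; })
    .join('\n');
}

console.log(stripScriptsAndBlankLines(html));
```

One difference from the sed result: because the regex works on the whole string, the text before and after a multi-line script ends up on a single line ("foo  bar") instead of two.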
Hope it helps.
PS: I think that @l0b0 is right, and this is not the correct tool.