Convert HTML to Data:Text/Html Link Using JavaScript

Characteristics of a data-URI

A data-URI with MIME-type text/html has to be in one of these formats:

data:text/html,<HTML HERE>
data:text/html;charset=UTF-8,<HTML HERE>

Base-64 encoding is not necessary. If your code contains non-ASCII characters, such as éé, charset=UTF-8 has to be added.

The following characters have to be escaped:

  • # - Firefox and Opera interpret this character as the marker of a hash (as in location.hash).
  • % - This character is used to escape characters. Escape this character to make sure that no side effects occur.

Additionally, if you want to embed the code in an anchor tag, the following characters should also be escaped:

  • " and/or ' - Quotes mark the value of the attribute.
  • & - The ampersand is used to mark HTML entities.
  • < and > do not have to be escaped inside an HTML attribute. However, if you're going to embed the link in the HTML, these should also be escaped (%3C and %3E).

JavaScript implementation

If the size of the data-URI does not matter, the easiest approach is encodeURIComponent:

var html = document.getElementById("html").innerHTML;
var dataURI = 'data:text/html,' + encodeURIComponent(html);

If size matters, collapse all consecutive white-space first (this can safely be done, unless the HTML contains a <pre> element/style). Then, only replace the significant characters:

var html = document.getElementById("html").innerHTML;
html = html.replace(/\s{2,}/g, ' ') // <-- Collapse runs of 2+ white-space characters into one space
    .replace(/%/g, '%25')           // <-- Escape % (must come first)
    .replace(/&/g, '%26')           // <-- Escape &
    .replace(/#/g, '%23')           // <-- Escape #
    .replace(/"/g, '%22')           // <-- Escape "
    .replace(/'/g, '%27');          // <-- Escape ' (to be 100% safe)
var dataURI = 'data:text/html;charset=UTF-8,' + html;
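
Both variants can be wrapped in one DOM-free helper, which makes it easy to check that a link decodes back to the original markup. This is only a sketch: the names toDataUri and fromDataUri and the compact flag are made up for illustration, and the compact branch assumes the HTML contains no <pre> content.

```javascript
// Hypothetical helper: builds a data:text/html link from an HTML string.
function toDataUri(html, compact) {
  if (!compact) {
    // Simple variant: escape everything that is not URI-safe.
    return 'data:text/html;charset=UTF-8,' + encodeURIComponent(html);
  }
  // Compact variant: collapse white-space, then escape only the significant characters.
  var escaped = html
    .replace(/\s{2,}/g, ' ')
    .replace(/%/g, '%25') // must run first, before the other steps add % signs
    .replace(/&/g, '%26')
    .replace(/#/g, '%23')
    .replace(/"/g, '%22')
    .replace(/'/g, '%27');
  return 'data:text/html;charset=UTF-8,' + escaped;
}

// Hypothetical inverse: strip the prefix and decode to get the markup back.
function fromDataUri(dataURI) {
  return decodeURIComponent(dataURI.slice(dataURI.indexOf(',') + 1));
}

var sample = '<p class="a">50% & #1 it\'s</p>';
console.log(toDataUri(sample, true));
console.log(fromDataUri(toDataUri(sample, false)) === sample);
```

The compact form stays readable and much shorter than the fully encoded one, at the cost of leaving characters such as < and > in the link.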

Convert Text link to Multiple HTML Format in Javascript with XSS Filter

I believe this is the change you are looking for. Compared with your code, the snippet below adds two replace statements:

  • The first searches for all URLs and replaces them with links; this makes sure the other patterns being replaced are not overridden.
  • The second searches for all links whose href has an image extension (I didn't add every image extension; add more as fits your app) and replaces the whole <a> tag with an <img> tag.

I hope this helps.

Update: added XSS filtering. If your input is simple, you can use a custom function that matches the patterns from the OWASP XSS prevention cheat sheet. The only one that didn't work for me was '/', which I had to whitelist. Otherwise I strongly suggest using a library such as js-xss or DOMPurify to filter potential XSS out of your input.

function sanitizeString(str) {
  // "/": '&#x2F;' is whitelisted (left out) so that URLs survive.
  const patterns = {
    '&': '&amp;',
    '<': '&lt;',
    '>': '&gt;',
    '"': '&quot;',
    "'": '&#x27;',
    "`": '&#x60;'
  };
  const reg = /[&<>"'`]/g;
  return str.replace(reg, (match) => patterns[match]);
}


function convertLink(article) {
  const cArticle = article
    // Turn every URL into a link first, so the patterns below can match on the <a> tags.
    .replace(/(\bhttps?:\/\/\S+)/g, '<a href="$1">linked tag</a>')
    // Replace links to images with an <img> tag.
    .replace(/<a href="((?:(https?):\/\/)([^\s]+)(\.(jpg|jpeg|gif|png))).">([\s\S]*?)<\/a>/g, '<img width="200" height="100" src="$1" />')
    // YouTube URLs become a clickable thumbnail.
    .replace(/(?:(https?):\/\/)?(?:www\.)?(?:youtube\.com|youtu\.be)\/(?:watch\?v=)?([^\ \r\n]+)/g, '<div id="$2" class="anotherClass" onclick="someFunction(\'$2\', \'$2\');"><img class="some-class" src="https://i.ytimg.com/vi/$2/0.jpg"></div>')
    // Vimeo links become an embedded player.
    .replace(/<a href="(?:(https?):\/\/)?(?:www\.)?(?:vimeo\.com)\/([^\ \r\n]+)">([\s\S]*?)<\/a>/g, '<div class="video-container"><iframe src="//player.vimeo.com/video/$2" width="100%" height="480" frameborder="0" webkitallowfullscreen mozallowfullscreen allowfullscreen></iframe></div>')
    // @mentions and #hashtags become links.
    .replace(/([\s+])@([^\s]+)/g, " <a href='https:\/\/example.com/$2'>@$2</a>")
    .replace(/([\r\n])/ig, "<br>")
    .replace(/([\s+])#([^\s]+)/g, " <a href='https:\/\/example.com/$2'>#$2</a>")
    .replace(/([\r\n])/ig, "<br>")
    .replace(/(\<br\>\<br\>)/, "<br>");
  return cArticle;
}

let stringIs = "In this article, we will be talking about Some of the interesting facts like: Youtube Flutterhttps://www.youtube.com/watch?v=i-Qy1VQUMuI as well as https://images.pexels.com/photos/414612/pexels-photo-414612.jpeg, https://vimeo.com/259411563 and #rain @stackoverflow more coming soon at https://example.com/link/me Let's talk. here is XSS sample <div>Testing xss</div>";
stringIs = sanitizeString(stringIs);
document.getElementById("demo").innerHTML = convertLink(stringIs);
<!DOCTYPE html>
<html>
<head>
  <meta charset="utf-8">
  <meta name="viewport" content="width=device-width">
  <title>Convert Link</title>
</head>
<body>
  <span id="demo"></span>
</body>
</html>

Convert HTML to plain text keeping links, bold and italic in Javascript

I had some time on my hands and played around. This is what I came up with:

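const container = document.getElementById("container");
// Work on a copy; turn newlines into spaces and drop tabs.
const copy = document.createElement("div");
copy.innerHTML = container.innerHTML.replace(/\n/g, " ").replace(/[\t\n]+/g, "");
// tag name: [<prefix>, <postfix>, <sequence-number>]
const tags = {
  B: ["**", "**", 1],
  I: ["*", "*", 2],
  H2: ["##", "\n", 3],
  P: ["\n", "\n", 4],
  DIV: ["", "\n", 5],
  TD: ["", "\t", 6]
};
// Process the elements in sequence-number order and wrap their content in the markers.
[...copy.querySelectorAll(Object.keys(tags).join(","))]
  .sort((a, b) => tags[a.tagName][2] - tags[b.tagName][2])
  .forEach(e => {
    const [a, b] = tags[e.tagName];
    // The first cell of each table row starts a new line.
    e.innerHTML = (e.matches("TD:first-child") ? "\n" : a) + e.innerHTML + b;
  });
console.log(copy.textContent.replace(/^ */mg, ""));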
<div id="container">
<H2>Second level heading</H2>
<div><div>
A <b>first div</b> with a
<a href="abc.html">link (abc)</a> and a
<p>paragraph having itself another <a href="def.html">link (def)</a> in it.</p>
</div>
</div>
And here is some more <i>"lost" text</i> ...
<table>
<tr><td>one</td><td><b>two</b></td><td>three</td></tr>
<tr><td>a</td><td>b</td><td>c</td></tr>
<tr><td>d</td><td>e</td><td>f</td></tr>
</table>
</div>

Javascript convert text to a link

Just change your code to this. You have to build an <a> element for each link.

var urlList = [];
function getUrlList() {
  var i = 0;
  var thisList = "";
  // Read the current input value and remember it.
  var url = { urlhtml: document.getElementById("urlhtml").value };
  urlList.push(url);
  // Build one <a> element per stored URL.
  for (i = 0; i < urlList.length; i++) {
    thisList += "<a target='_blank' href='http://" + urlList[i].urlhtml + "'>" + urlList[i].urlhtml + "</a><br>";
  }
  document.getElementById("showurls").innerHTML = thisList;
}
<form>
  <input type="text" id="urlhtml" size="30" placeholder="http://www.sait.ca" value="www.google.com">
  <br>
  <br>
  <input type="submit" value="Add Url" id="submit" onclick="getUrlList(); return false">
</form>
<br>
<h2> Your favorite urls are: </h2>
<a href target="_blank"><h3><span id="showurls"></span></h3></a>
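
The snippet above always prepends http:// and inserts the raw input into the markup. A slightly more defensive variant could look like the sketch below. The helper names escapeHtml and buildLinkList are hypothetical, and the scheme check is an assumption about what the form should accept.

```javascript
// Hypothetical helper: escapes characters that are special in HTML.
function escapeHtml(text) {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

// Builds one <a> element per URL, adding http:// only when no http(s) scheme is given.
function buildLinkList(urls) {
  return urls.map(function (url) {
    var href = /^https?:\/\//i.test(url) ? url : 'http://' + url;
    return '<a target="_blank" href="' + escapeHtml(href) + '">' +
      escapeHtml(url) + '</a><br>';
  }).join('');
}

console.log(buildLinkList(['www.google.com', 'https://example.com/?a=1&b=2']));
```

In the page, the result would be assigned to the innerHTML of the showurls element, just as thisList is.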

Converting a series of URL strings to HTML links

I suppose what you need is an .each() loop:

jQuery(document).ready(function ($) {
  var elements = $('.post-excerpt');
  if (elements.length > 0) {
    elements.each(function (index, element) {
      // Wrap every http/https URL in the excerpt in an anchor tag.
      $(element).html($(element).html().replace(/((http:|https:)[^\s]+[\w])/g, '<a href="$1" target="_blank">$1</a>'));
    });
  }
});
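
To see what the regular expression does without a page or jQuery, the replacement can be tried on a plain string. This is a small sketch; the function name linkify and the constant URL_PATTERN are made up here.

```javascript
// Same pattern as in the jQuery snippet: an http/https URL that ends in a word character.
var URL_PATTERN = /((http:|https:)[^\s]+[\w])/g;

// Hypothetical helper: wraps every matching URL in an anchor tag.
function linkify(text) {
  return text.replace(URL_PATTERN, '<a href="$1" target="_blank">$1</a>');
}

console.log(linkify('See https://example.com/a and http://b.org.'));
```

Because the match has to end in a word character, the trailing full stop stays outside the link. Note that the pattern is meant for plain text: running it over HTML that already contains anchors would also wrap the URLs inside their href attributes.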

Convert HTML to plain text in JS without browser environment

Convert HTML to plain text the way Gmail does:

html = html.replace(/<style([\s\S]*?)<\/style>/gi, '');
html = html.replace(/<script([\s\S]*?)<\/script>/gi, '');
html = html.replace(/<\/div>/ig, '\n');
html = html.replace(/<\/li>/ig, '\n');
html = html.replace(/<li>/ig, ' * ');
html = html.replace(/<\/ul>/ig, '\n');
html = html.replace(/<\/p>/ig, '\n');
html = html.replace(/<br\s*[\/]?>/gi, "\n");
html = html.replace(/<[^>]+>/ig, '');

If you can use jQuery (which needs a DOM to work with):

var html = jQuery('<div>').html(html).text();
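
The regex chain can be wrapped in a function that runs without any DOM, for example in Node.js. The sketch below uses the same replacements; the function name htmlToText and the final entity-decoding step are additions for illustration and cover only a handful of common entities.

```javascript
// Hypothetical wrapper around the replacement chain above.
function htmlToText(html) {
  var text = html
    .replace(/<style([\s\S]*?)<\/style>/gi, '')
    .replace(/<script([\s\S]*?)<\/script>/gi, '')
    .replace(/<\/div>/ig, '\n')
    .replace(/<\/li>/ig, '\n')
    .replace(/<li>/ig, ' * ')
    .replace(/<\/ul>/ig, '\n')
    .replace(/<\/p>/ig, '\n')
    .replace(/<br\s*[\/]?>/gi, '\n')
    .replace(/<[^>]+>/ig, '');
  // Assumption: decode a few common entities; &amp; goes last to avoid double decoding.
  return text
    .replace(/&nbsp;/g, ' ')
    .replace(/&lt;/g, '<')
    .replace(/&gt;/g, '>')
    .replace(/&quot;/g, '"')
    .replace(/&#39;/g, "'")
    .replace(/&amp;/g, '&');
}

console.log(htmlToText('<div>Hello<br>World</div><ul><li>one</li><li>two</li></ul>'));
```

Regular expressions do not parse HTML, so this is only suitable for simple, reasonably well-formed markup.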

Parse an HTML string with JS

Create a dummy DOM element and add the string to it. Then, you can manipulate it like any DOM element.

var el = document.createElement( 'html' );
el.innerHTML = "<html><head><title>titleTest</title></head><body><a href='test0'>test01</a><a href='test1'>test02</a><a href='test2'>test03</a></body></html>";

el.getElementsByTagName( 'a' ); // Live HTMLCollection of your anchor elements

Edit: adding a jQuery answer to please the fans!

var el = $( '<div></div>' );
el.html("<html><head><title>titleTest</title></head><body><a href='test0'>test01</a><a href='test1'>test02</a><a href='test2'>test03</a></body></html>");

$('a', el) // All the anchor elements
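
Another option in browsers is the standard DOMParser API, which parses the string into a separate document instead of a dummy element. A minimal sketch, assuming a browser-like environment; the function name extractHrefs is made up for illustration.

```javascript
// Assumes a browser-like environment where DOMParser is available.
function extractHrefs(htmlString) {
  var doc = new DOMParser().parseFromString(htmlString, 'text/html');
  // Collect the href attribute of every anchor in the parsed document.
  return Array.from(doc.querySelectorAll('a'), function (a) {
    return a.getAttribute('href');
  });
}

// Only run the demo where DOMParser exists (it is not defined in plain Node.js).
if (typeof DOMParser !== 'undefined') {
  console.log(extractHrefs("<a href='test0'>test01</a><a href='test1'>test02</a>"));
}
```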

