Trying to Use the DOMParser with Node.js

Many browser features, such as DOM manipulation or XHR, are not available natively in Node.js, because accessing the DOM is not a typical server task. You'll have to use an external library for that.

DOM capabilities depend a lot on the library. Here is a quick comparison of the main tools you can use:

  • jsdom: implements a large part of the WHATWG DOM and HTML standards, so most DOM operations you can do in a modern browser also work in jsdom (it does not do layout or rendering). It is the de facto standard for doing browser stuff on Node, used with Mocha, Vue Test Utils, Webpack Prerender SPA Plugin, and many others:

    const jsdom = require("jsdom");
    const dom = new jsdom.JSDOM(`<!DOCTYPE html><p>Hello world</p>`);
    dom.window.document.querySelector("p").textContent; // 'Hello world'
  • deno_dom: if using Deno instead of Node is an option, this library provides DOM parsing capabilities:

    import { DOMParser } from "https://deno.land/x/deno_dom/deno-dom-wasm.ts";
    const parser = new DOMParser();
    const document = parser.parseFromString('<p>Hello world</p>', 'text/html');
    document.querySelector('p').textContent; // 'Hello world'
  • htmlparser2: not a DOM implementation like jsdom, but a fast, forgiving HTML/XML parser driven by callbacks. It offers better performance and flexibility at the price of a lower-level, more complex API:

    const htmlparser = require("htmlparser2");
    const parser = new htmlparser.Parser({
      // Called once for every opening tag the parser encounters.
      onopentag: (name, attribs) => {
        if (name === 'p') console.log('a paragraph element is opening');
      }
    }, { decodeEntities: true });
    parser.write(`<!DOCTYPE html><p>Hello world</p>`);
    parser.end();
    // console output: 'a paragraph element is opening'
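  • cheerio: implements a subset of the jQuery API on top of a parsed HTML document. Older versions parse with htmlparser2; recent versions use parse5 for HTML by default and can optionally use htmlparser2:

    const cheerio = require('cheerio');
    const $ = cheerio.load(`<!DOCTYPE html><p>Hello world</p>`);
    $('p').text('Bye moon');
    $('p').text(); // 'Bye moon'
    $.html(); // the serialized document, now containing '<p>Bye moon</p>'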
  • xmldom: fully implements DOM Level 2 and partially implements DOM Level 3. It works with both XML and HTML.

  • dom-parser: a regex-based DOM parser that implements a few DOM methods like getElementById. Since parsing HTML with regular expressions is a very bad idea, I wouldn't recommend this one for production.

Is there a polyfill for window.DOMParser() for Node?

The closest equivalent of the browser's DOMParser in Node.js is the xmldom package (maintained today under the name @xmldom/xmldom): https://www.npmjs.com/package/xmldom
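As a rough sketch of how such a polyfill could be wired up, the helper below looks for a global DOMParser first and then falls back to whichever package happens to be installed. The name resolveDOMParser and the order of the candidates are assumptions made for illustration, not part of any library, and each package has to be installed separately.

```javascript
// Hypothetical helper: returns a DOMParser constructor, or null if none is available.
function resolveDOMParser() {
  // Browsers (and possibly some runtimes) expose DOMParser globally.
  if (typeof globalThis.DOMParser === "function") {
    return globalThis.DOMParser;
  }
  // Candidate packages, tried in order (assumed order, adjust to taste).
  const candidates = [
    () => require("@xmldom/xmldom").DOMParser,
    () => require("xmldom").DOMParser,
    () => new (require("jsdom").JSDOM)("").window.DOMParser,
  ];
  for (const load of candidates) {
    try {
      const Parser = load();
      if (typeof Parser === "function") return Parser;
    } catch (err) {
      // Package not installed (or failed to load): try the next candidate.
    }
  }
  return null;
}

const DOMParserImpl = resolveDOMParser();
if (DOMParserImpl) {
  const doc = new DOMParserImpl().parseFromString("<p>Hello world</p>", "text/xml");
  console.log(doc.documentElement.textContent);
} else {
  console.log("No DOMParser implementation found; install one of the packages above.");
}
```

Keep in mind that the candidates are not interchangeable in every respect: xmldom is aimed at XML, while the DOMParser exposed by jsdom also handles 'text/html' the way a browser does.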

How to override the errorHandler of DOMParser in node/xmldom?

Here is runnable code:

const { DOMParser } = require('xmldom');

// The parser records its current position (lineNumber, columnNumber) on this object.
const mylocator = {};

// Messages are stored per severity; errorLevel keeps the highest severity seen.
const parseLog = { errorLevel: 0 };

const parser = new DOMParser({
  locator: mylocator,
  errorHandler: {
    warning: (msg) => { manageXmlParseError(msg, 1, parseLog); },
    error: (msg) => { manageXmlParseError(msg, 2, parseLog); },
    fatalError: (msg) => { manageXmlParseError(msg, 3, parseLog); },
  },
});

// Stores msg under its severity (1 = warning, 2 = error, 3 = fatal error).
function manageXmlParseError(msg, errorLevel, errorLog) {
  if (errorLog.errorLevel == null || errorLog.errorLevel < errorLevel) {
    errorLog.errorLevel = errorLevel;
  }

  const key = errorLevel.toString();
  if (errorLog[key] == null) {
    errorLog[key] = [];
  }

  errorLog[key].push(msg);
}

// Deliberately malformed XML, so that the handlers get called.
const doc = parser.parseFromString(
  '<xml xmlns="a" xmlns:c="./lite">\n' +
  '\t<child>test</child>\n' +
  '\t<child22><<</child>\n' +
  '\t<child/>\n' +
  '</xml>',
  'text/xml'
);

console.info("parsestatus ==> " + parseLog.errorLevel + "\nlocator:" + mylocator.columnNumber + "/" + mylocator.lineNumber);
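To see what the resulting log looks like without involving the parser at all, the handler callbacks can also be called by hand. The sketch below repackages the same bookkeeping as manageXmlParseError into a small factory; createErrorLog and the sample messages are made up for illustration, and real xmldom messages will look different.

```javascript
// Hypothetical factory: builds a fresh log plus an errorHandler object that writes to it.
function createErrorLog() {
  const log = { errorLevel: 0 };
  const record = (level) => (msg) => {
    // Remember the highest severity seen so far.
    if (log.errorLevel < level) log.errorLevel = level;
    // Group messages by severity: 1 = warning, 2 = error, 3 = fatal error.
    if (log[level] == null) log[level] = [];
    log[level].push(msg);
  };
  return {
    log,
    errorHandler: { warning: record(1), error: record(2), fatalError: record(3) },
  };
}

const { log, errorHandler } = createErrorLog();

// Simulate what a parser would do on bad input (the messages are made up).
errorHandler.warning("sample warning");
errorHandler.error("sample error");

console.log(log.errorLevel); // 2
console.log(log[1], log[2]); // [ 'sample warning' ] [ 'sample error' ]
```

The returned errorHandler has the same shape as the one passed to the DOMParser constructor in the code above, so it could be handed over as the errorHandler option, giving each parse its own log instead of sharing one object.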

