How to Convert HTML with MathJax into LaTeX Using Pandoc

How to convert HTML with MathJax into LaTeX using Pandoc?

With pandoc 1.12.2 or later, you can enable the TeX math extensions on the HTML reader:

pandoc -f html+tex_math_dollars+tex_math_single_backslash -t latex

If you don't want to convert math delimited by \( and \), drop the tex_math_single_backslash extension:

pandoc -f html+tex_math_dollars -t latex
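
As a rough illustration of the commands above, the sketch below writes a small HTML snippet containing both $...$ and \(...\) math and runs the first command on it. The file name sample.html, its contents, and the temporary directory are assumptions made for this example; the conversion step is skipped when pandoc is not on the PATH.

```shell
# Hypothetical sample input: file name and contents are made up for illustration.
workdir=$(mktemp -d)
cat > "$workdir/sample.html" <<'EOF'
<p>Einstein's relation $E = mc^2$ and the identity \(e^{i\pi} + 1 = 0\).</p>
EOF

# Convert only if pandoc is installed; the LaTeX fragment goes to sample.tex.
if command -v pandoc >/dev/null 2>&1; then
  pandoc -f html+tex_math_dollars+tex_math_single_backslash -t latex \
    "$workdir/sample.html" -o "$workdir/sample.tex" \
    || echo "pandoc conversion failed" >&2
else
  echo "pandoc not found; skipping conversion" >&2
fi
```

With both extensions enabled, the two formulas should come out as LaTeX math rather than as escaped literal text. Removing +tex_math_single_backslash from the -f argument leaves only the $...$ formula treated as math.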

Convert HTML and inline MathJax math to LaTeX with pandoc ruby

With pandoc 1.12.2 or later, you can run:

pandoc -f html+tex_math_dollars+tex_math_single_backslash -t latex

Pandoc: generate an HTML file embedding LaTeX equations from Markdown input

The generated test.html file was not "complete": pandoc wrote only the body of the HTML, not the header. MathJax must be linked in the header for the equation to render properly.

To generate a "complete" HTML file with the <html>, <head>, and <body> tags, use pandoc's --standalone (aka -s) option:

pandoc --standalone test.md -o test.html --mathjax

More details

Using the invocation

pandoc --standalone test.md -o test.html --mathjax

generates the following test.html file:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<meta http-equiv="Content-Style-Type" content="text/css" />
<meta name="generator" content="pandoc" />
<title></title>
<style type="text/css">code{white-space: pre;}</style>
<script src="http://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML" type="text/javascript"></script>
</head>
<body>
<p><span class="math">\[ \frac{1}{2} = 0.5 \neq \sqrt{2} \]</span></p>
</body>
</html>

Note the <script> tag linking to MathJax in the <head> section. By contrast, the invocation

pandoc test.md -o test.html --mathjax

generates a file containing just a single line:

<p><span class="math">\[ \frac{1}{2} = 0.5 \neq \sqrt{2} \]</span></p>
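
The comparison above can be reproduced with a short script. The contents of test.md are not shown in the answer, so the single display equation below is an assumption inferred from the generated HTML; fragment.html and the temporary directory are likewise names chosen for this sketch.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Assumed contents of test.md: one display equation.
cat > test.md <<'EOF'
$$ \frac{1}{2} = 0.5 \neq \sqrt{2} $$
EOF

if command -v pandoc >/dev/null 2>&1; then
  # Without --standalone: body fragment only.
  pandoc test.md -o fragment.html --mathjax
  # With --standalone: full document including <head>.
  pandoc --standalone test.md -o test.html --mathjax
  # Print, per file, how many lines contain a <head> tag.
  grep -c '<head' fragment.html test.html
else
  echo "pandoc not found; skipping conversion" >&2
fi
```

Only the standalone file should report a <head> line, which is where the MathJax <script> tag is placed.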

Convert html mathjax to markdown with pandoc

You can write a short Haskell program unescape.hs:

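-- Disable backslash escaping of special characters when writing strings to markdown.
-- toJsonFilter is the older name; recent pandoc-types releases export
-- toJSONFilter from Text.Pandoc.JSON instead.
import Text.Pandoc

main = toJsonFilter unescape
  where unescape (Str xs) = RawInline "markdown" xs
        unescape x        = x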

Compile it with ghc --make unescape.hs, then use it like this:

pandoc -f html -t json | ./unescape | pandoc -f json -t markdown

This will disable escaping of special characters (like $) in markdown output.

A simpler approach might be to pipe pandoc's normal markdown output through sed:

pandoc -f html -t markdown | sed -e 's/\\\([$^_*]\)/\1/g'
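
To see what the sed expression does without running pandoc, it can be applied to a line written the way escaped markdown output might look. The sample line and the function name unescape_math are made up for this sketch.

```shell
# Remove the backslash in front of $, ^, _ and * (the characters listed in the bracket expression).
unescape_math() {
  sed -e 's/\\\([$^_*]\)/\1/g'
}

# Hypothetical line resembling escaped markdown output.
escaped='The energy is \$E\_0 = m c\^2\$ and \*not\* zero.'
unescaped=$(printf '%s\n' "$escaped" | unescape_math)
printf '%s\n' "$unescaped"
# prints: The energy is $E_0 = m c^2$ and *not* zero.
```

The substitution is applied everywhere, not only inside math, so an escaped \*not\* in ordinary text also loses its backslashes and becomes emphasis. That trade-off is the price of the simpler approach.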

Using pandoc to convert a Markdown document with LaTeX to HTML, with the LaTeX converted to SVG within the HTML

Use the --webtex option with a custom URL to get equations converted to SVG:

pandoc --webtex='https://latex.codecogs.com/svg.latex?' ...

The codecogs web service creates the images on the fly whenever the resulting HTML document is opened in a browser. To download the images ahead of time and embed them directly in the document, combine --webtex with the --self-contained option.
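
A complete invocation might look like the sketch below. The input file webtex-demo.md and its contents are assumptions for illustration. The URL is attached with = because --webtex takes an optional argument. Recent pandoc releases appear to deprecate --self-contained in favor of --embed-resources --standalone, so it is worth checking which form the installed version expects.

```shell
workdir=$(mktemp -d)

# Hypothetical input document with one inline formula.
cat > "$workdir/webtex-demo.md" <<'EOF'
The golden ratio is $\varphi = \frac{1 + \sqrt{5}}{2}$.
EOF

# URL prefix from the answer above; pandoc appends the URL-encoded formula.
webtex_url='https://latex.codecogs.com/svg.latex?'

if command -v pandoc >/dev/null 2>&1; then
  # --self-contained fetches the SVG images at conversion time and embeds them.
  pandoc --standalone --self-contained --webtex="$webtex_url" \
    "$workdir/webtex-demo.md" -o "$workdir/webtex-demo.html" \
    || echo "conversion failed (network access is needed to fetch the images)" >&2
else
  echo "pandoc not found; skipping conversion" >&2
fi
```

Dropping --self-contained keeps the <img> tags pointing at the web service, so the images are fetched when the page is viewed instead.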


