How to Convert HTML to BBCode

How to convert HTML to BBCode

It should be doable with XSLT in text output mode:

<xsl:output method="text"/>
…
<xsl:template match="b|strong">[b]<xsl:apply-templates/>[/b]</xsl:template>
<xsl:template match="br"><xsl:text>&#10;</xsl:text></xsl:template>
<xsl:template match="p"><xsl:text>&#10;</xsl:text><xsl:apply-templates/><xsl:text>&#10;</xsl:text></xsl:template>
<xsl:template match="a">[url="<xsl:value-of select="@href"/>"]<xsl:apply-templates/>[/url]</xsl:template>
<xsl:template match="text()"><xsl:value-of select="normalize-space(.)"/></xsl:template>

To get there, parse the HTML into a DOM and run the stylesheet through the platform's built-in XSLT processor.

JavaScript to convert HTML to BBCode, for images with links

Hi there. Using jQuery, you can attach a delegated click handler:

// When an image with class "bbcode" is clicked, append its BBCode to .message
$('body').on('click', 'img.bbcode', function() {
    var imgsrc = $(this).attr('src');
    $('.message').append('[url=' + imgsrc + '][img]' + imgsrc + '[/img][/url]');
    // Mark the image as already added
    $(this).css('border', '1px solid #000');
});

https://jsfiddle.net/x5sgycuk/1/

hope that helps

How to convert HTML to BBCode in C#

For some HTML tags, you can just do a simple string.Replace. BBCode is in many ways just a 1:1, tag-for-tag mapping, for example <b> and </b> mapping to [B] and [/B] respectively. So that's easily accomplished with just:

html.Replace("<b>", "[b]").Replace("</b>", "[/b]")

If it's really dead-simple HTML, and you don't mind the performance impact and code ugliness of doing this tag-by-tag, go for it. But beware of cross-site scripting vulnerabilities, if you plan to display the resulting BBCode on a web page somewhere; this is nowhere near good enough for sanitization.
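For comparison, here is roughly what the tag-by-tag approach looks like in JavaScript. It's only a sketch: replaceSimpleTags and the SIMPLE_TAGS list are names made up for this example, and it deliberately handles nothing but attribute-free tags.

```javascript
// Hypothetical list of tags that map 1:1 onto BBCode (an assumption for this sketch).
const SIMPLE_TAGS = ['b', 'i', 'u'];

// Replace <tag> with [tag] and </tag> with [/tag], one tag at a time.
// Tags with attributes, different casing, or extra whitespace are left untouched.
function replaceSimpleTags(html) {
  let out = html;
  for (const tag of SIMPLE_TAGS) {
    out = out
      .split(`<${tag}>`).join(`[${tag}]`)
      .split(`</${tag}>`).join(`[/${tag}]`);
  }
  return out;
}
```

Note that an opening tag such as <b class="x"> is not matched, so its closing tag gets converted on its own. That is the kind of mess that makes this approach fall over on anything but dead-simple input.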

But don't even bother trying to use regular expressions to sanitize the HTML and do automatic replacement of all tags. The <img> tag, for instance, looks completely different in HTML vs. BBCode. In HTML it's <img src="..."/> (trailing slash is optional) and in BBCode it's [IMG]...[/IMG]. Doing this with regex is... well, let's just say sub-optimal.

Regular expressions are designed for regular languages, and HTML is not a regular language; it's a context-free language. Consider using an actual HTML parser instead, such as the HTML Agility Pack. Then you can descend the DOM tree, whitelist the elements you want, and map them to BBCode or anything else however you like.
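To make the "descend the tree and whitelist" idea concrete, here is a minimal sketch in JavaScript rather than C#. It is not the HTML Agility Pack API; toBBCode, TAG_MAP and DROPPED are hypothetical names, and the tag list is just an assumption about what you might want to allow. It works on any DOM-like tree (nodeType, tagName, childNodes, getAttribute).

```javascript
const ELEMENT_NODE = 1;
const TEXT_NODE = 3;

// Elements whose content is thrown away entirely (assumed list).
const DROPPED = new Set(['SCRIPT', 'STYLE']);

// Whitelist: element name -> function wrapping the already converted children.
// Any element not listed here is unwrapped: its text survives, its markup does not.
const TAG_MAP = new Map([
  ['B', (inner) => `[b]${inner}[/b]`],
  ['STRONG', (inner) => `[b]${inner}[/b]`],
  ['I', (inner) => `[i]${inner}[/i]`],
  ['EM', (inner) => `[i]${inner}[/i]`],
  ['BR', () => '\n'],
  ['A', (inner, el) => {
    const href = el.getAttribute('href');
    return href ? `[url=${href}]${inner}[/url]` : inner;
  }],
  ['IMG', (inner, el) => {
    const src = el.getAttribute('src');
    return src ? `[img]${src}[/img]` : '';
  }],
]);

// Recursively convert a DOM node and its descendants to a BBCode string.
function toBBCode(node) {
  if (node.nodeType === TEXT_NODE) return node.nodeValue;
  const name = node.nodeType === ELEMENT_NODE ? node.tagName.toUpperCase() : null;
  if (name && DROPPED.has(name)) return '';
  let inner = '';
  for (const child of Array.from(node.childNodes || [])) {
    inner += toBBCode(child);
  }
  const convert = name ? TAG_MAP.get(name) : undefined;
  return convert ? convert(inner, node) : inner;
}

// In a browser, the tree could come from the built-in parser, e.g.:
//   toBBCode(new DOMParser().parseFromString(html, 'text/html').body)
```

The point of the whitelist is that unknown or unwanted elements never reach the output as markup, which is much harder to guarantee with string replacement.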

How can I convert HTML to bbcode using a regex?

$string = '<div class="postQuote"> <div class="postQuoteAuthor"><a href="http://www.siteurl.com/profile.php?user=Username">Username</a> wrote...</div> quoted text</div> comment ';

$string = preg_replace('|^<div class="postQuote".*user=([^"]+)".+</div>([^<]+)</div>(.+)$|', '[QUOTE=$1] $2 [/QUOTE] $3', $string);

echo $string; // [QUOTE=Username] quoted text [/QUOTE] comment

Converting nested HTML to BBCode tags

// Convert [header]...[/header] and [content]...[/content] into accordion markup.
function Acco_Tag($Text)
{
    $Text = preg_replace_callback('#\[header\](.*?)\[/header\]#muis', 'iNext4header', $Text);
    return preg_replace_callback('#\[content\](.*?)\[/content\]#muis', 'iNext4content', $Text);
}
// Plain functions (not class methods), so they can be passed by name as callbacks.
function iNext4header($match)
{
    static $i = 0; // running counter: the Nth header links to #collapseN
    return '<div class="accordion-heading"><a class="accordion-toggle" data-toggle="collapse" data-parent="#accordion2" href="#collapse'.(++$i).'">'.$match[1].'</a></div>';
}
function iNext4content($match)
{
    static $i = 0; // separate counter: the Nth content block gets id="collapseN"
    return '<div id="collapse'.(++$i).'" class="accordion-body collapse in"><div class="accordion-inner">'.$match[1].'</div></div>';
}

Thanks, I've found the answer.

How to convert bbcode url tag to an html hyperlink using the bbcode tag's values?

UPDATE: Casimir's commented solution is more direct/clean.

Code:

echo preg_replace('~\[url(?|]((https?://[^[]+))|(?:="(https?://[^"]+)")](.+?))\[/url]~i', '<a href="$1">$2</a>', $bbcode);

By doubling the capture of the first alternative in the pattern, you can ensure that you always have a $1 and $2 to apply to the replacement string.

Here is a slightly extended variation of the pattern that considers single quoting and no quoting.

(Start of previous solution)

By using preg_replace_callback() you can determine whether a url was provided inside the opening [url] tag, in which case you will want to preserve the text located between the opening and closing tags.

If the text between the tags IS the url, you use it in both places in the <a> tag string.

Invalid strings will not be converted.

Code:

$bbcodes = [
    '[URL]www.no.http.example.com[/URL]',
    '[url]https://any.com/any[/url]',
    '[url="nourl"]nourl[/url]',
    '[URL="https://any.com/any?any=333"]text text[/URL]',
    '[url="http://www.emptyTEXT.com"][/url]',
    '[url]http://www.any.com/any?any=44#sss[/url]',
    '[url="https://conflictinglink"]http://differenturl[/url]'
];

foreach ($bbcodes as $bbcode) {
    echo preg_replace_callback('~\[url(?:](https?://[^[]+)|(?:="(https?://[^"]+)")](.+?))\[/url]~i',
                          function($m) {
                              if (isset($m[2])) {
                                  return "<a href=\"{$m[2]}\">{$m[3]}</a>";
                              }
                              return "<a href=\"{$m[1]}\">{$m[1]}</a>";
                          },
                          $bbcode);
    echo "\n---\n";
}

Output:

[URL]www.no.http.example.com[/URL]
---
<a href="https://any.com/any">https://any.com/any</a>
---
[url="nourl"]nourl[/url]
---
<a href="https://any.com/any?any=333">text text</a>
---
[url="http://www.emptyTEXT.com"][/url]
---
<a href="http://www.any.com/any?any=44#sss">http://www.any.com/any?any=44#sss</a>
---
<a href="https://conflictinglink">http://differenturl</a>
---

Pattern Breakdown:

~                    #start of pattern delimiter
\[url                #match literally [url
(?:                  #start non-capturing group #1
  ]                  #match literally ]
  (https?://[^[]+)   #match and store as Capture Group #1 http , an optional s , colon , two forward slashes, then one or more non-opening square brackets (since valid href values cannot have square brackets)
  |                  #or
  (?:                #start non-capturing group #2
    ="               #match literally ="
    (https?://[^"]+) #match and store as Capture Group #2 (same logic as Capture Group #1)
    "                #match literally "
  )                  #end non-capturing group #2
  ]                  #match literally ]
  (.+?)              #match (lazily) and store as Capture Group #3 one or more characters (this is the innerHTML component)
)                    #end non-capturing group #1
\[/url]              #match literally [/url]
~                    #end of pattern delimiter

The callback function inspects the matches array ($m) and conditionally builds and returns the desired output. When there is a match, $m takes one of the following two shapes:

array(
    0 => [the fullstring match]
    1 => [the url of a bbcode tag that does not have a quoted url]
)

array(
    0 => [the fullstring match]
    1 => ''  // <-- empty string
    2 => [the quoted url of the bbcode tag]
    3 => [the text between the opening and closing bbcode tags]
)
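If you need the same branching outside PHP, here is a rough JavaScript port of the callback logic above. It's only a sketch under my own assumptions: bbUrlToHtml and URL_TAG are names invented for this example, and in JavaScript an unmatched group arrives as undefined rather than an empty string.

```javascript
// Same alternation as the PHP pattern: either [url]http(s)://...[/url]
// or [url="http(s)://..."]text[/url], case-insensitive, all occurrences.
const URL_TAG = /\[url(?:\](https?:\/\/[^\[]+)|="(https?:\/\/[^"]+)"\](.+?))\[\/url\]/gi;

// Hypothetical helper: convert valid [url] tags to <a> elements and
// leave anything that does not match the pattern untouched.
function bbUrlToHtml(bbcode) {
  return bbcode.replace(URL_TAG, (match, bareUrl, quotedUrl, text) =>
    quotedUrl !== undefined
      ? `<a href="${quotedUrl}">${text}</a>`   // url came from the opening tag
      : `<a href="${bareUrl}">${bareUrl}</a>`  // text between the tags is the url
  );
}
```

Fed the sample strings from the PHP loop, it should convert the same four and leave the other three unchanged.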

Converting nested HTML to BBCode quote tags

Here's a very basic example:

var html = $('#commentContent').html(),
    beingParsed = $('<div>' + html.replace(/<br>/g, '\n\r') + '</div>'),
    $quote;
while (($quote = beingParsed.find('.bbQuote:first')).length) {
    var $author = $quote.find('.quoteAuthor:first'),
        $content = $quote.find('.quoteContent:first'),
        toIndent = $author[0].previousSibling;

    toIndent.textContent = toIndent.textContent.substring(0, toIndent.textContent.length-4);
    $author.replaceWith('[quote=' + $author.text() + ']');
    $content.replaceWith($content.html());
    $quote.replaceWith($quote.html() + '[/quote]');
}

var parsedData = beingParsed.html();

Limitations:

It won't convert other HTML to BBCode (<b>, <i>, anchor tags etc);
Indentation/white space is not 100% accurate.

I'd use Ajax to fetch the actual post content from the DB or use a proper jQuery bbCode parsing library.
