Best Way to Parse BBCode

Best way to parse BBCode

There are both PECL and PEAR BBCode parsing libraries. Software is hard enough without reinventing years of work on your own.

If neither of those is an option, I'd concentrate on turning the BBCode into a valid XML string, and then running your favorite XML parsing routine on that. A very rough idea:

  1. Run the code through htmlspecialchars to escape any entities that need escaping

  2. Transform all [ and ] characters into < and > respectively

  3. Don't forget to account for the colon in cases like [tagname:

If the BBCode was nested properly, you should be all set to pass this string into an XML parsing object (SimpleXML, DOMDocument, etc.)
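Steps 1 and 2 above can be sketched as follows, in JavaScript rather than PHP. This is only an illustration: `escapeXml` is a hypothetical stand-in for htmlspecialchars, `bbcodeToXmlString` is an invented name, and the wrapping `<root>` element is an assumption added so the result has a single root node. Step 3 (the colon case) is not handled.

```javascript
// Hypothetical stand-in for PHP's htmlspecialchars: escape XML-significant characters.
// The ampersand must be replaced first so later entities are not double-escaped.
function escapeXml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Step 1: escape. Step 2: turn square brackets into angle brackets.
// The <root> wrapper is an assumption so the output has exactly one root element.
function bbcodeToXmlString(bbcode) {
  const escaped = escapeXml(bbcode);
  const swapped = escaped.replace(/\[/g, "<").replace(/\]/g, ">");
  return "<root>" + swapped + "</root>";
}

console.log(bbcodeToXmlString("[b]1 < 2 & [i]so on[/i][/b]"));
// <root><b>1 &lt; 2 &amp; <i>so on</i></b></root>
```

Tags that carry a value, such as [color=red], do not become well-formed XML through a plain bracket swap, so they would need extra rewriting before the string is handed to a parser.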

What is the best way to parse text and code in my PHP blog?

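You can use nl2br() to turn newlines into <br> tags. In the example below, preg_replace first drops a newline (and one optional whitespace character before it) that directly follows a closing bracket, so a tag at the end of a line does not pick up an extra break.

Example

$message = nl2br(preg_replace('#(\\]{1})(\\s?)\\n#Usi', ']', stripslashes($message)));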

Parsing BBCode in Javascript

This pattern does the trick for me:

\[code\]([\s\S]*?)\[\/code\]

See regexpal and enter the following:

[code]
code....
[/code]

[code]code.... [/code]

Update:
The corrected regex below works in the Chrome console for me:

/\[code\]([\s\S]*?)\[\/code\]/g.exec("[code]hello world \n[/code]")
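Note that exec with the g flag returns one match per call. To collect every [code] block at once, a minimal sketch using String.prototype.matchAll could look like this (`extractCodeBlocks` and `sample` are names made up for the example):

```javascript
// Collect the body of every [code]...[/code] block in the input.
function extractCodeBlocks(text) {
  const pattern = /\[code\]([\s\S]*?)\[\/code\]/g;
  // matchAll yields one match object per block; index 1 is the captured body.
  return Array.from(text.matchAll(pattern), (match) => match[1]);
}

const sample = "[code]\ncode....\n[/code]\n\n[code]code.... [/code]";
console.log(extractCodeBlocks(sample));
// [ '\ncode....\n', 'code.... ' ]
```

The lazy quantifier `*?` is what keeps two separate blocks from being merged into one match.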

PHP BBCode parser: [tag] in [tag]

You should look into existing solutions if you want to parse complex BBCode.

However, if you prefer to stick with your own implementation, you can use a recursive regex, for example:

<?php
function bbcodeColor($input)
{
    $regex = '#\[color=(.*?)\](((?R)|.)*?)\[\/color\]#is';
    // When invoked as the callback, $input is the array of matches
    if (is_array($input)) {
        $input = '<span style="color:'.$input[1].';">'.$input[2].'</span>';
    }
    // Run the pattern again so nested tags inside the span are converted too
    return preg_replace_callback($regex, 'bbcodeColor', $input);
}

echo bbcodeColor('[color=#f00]red[color=#0f0]green[/color][/color]');
// <span style="color:#f00;">red<span style="color:#0f0;">green</span></span>
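JavaScript regexes have no `(?R)` recursion, so the same nesting needs a different technique there. The sketch below swaps in an iterative approach: it repeatedly replaces the innermost [color] pair until nothing changes. The names `innermostColor` and `bbcodeColorIterative` are invented for this example, and limiting the color value to `[#a-z0-9]` is an assumption that the PHP version above does not make.

```javascript
// Matches a [color] pair whose body contains no further [color=...] opener,
// i.e. an innermost pair. Restricting the value to [#a-z0-9] is an assumption
// made here so that arbitrary text cannot leak into the style attribute.
const innermostColor =
  /\[color=([#a-z0-9]+)\]((?:(?!\[color=)[\s\S])*?)\[\/color\]/gi;

function bbcodeColorIterative(input) {
  let previous;
  let current = input;
  // Each pass converts the innermost pairs; stop once a pass changes nothing.
  do {
    previous = current;
    current = current.replace(innermostColor, '<span style="color:$1;">$2</span>');
  } while (current !== previous);
  return current;
}

console.log(bbcodeColorIterative("[color=#f00]red[color=#0f0]green[/color][/color]"));
// <span style="color:#f00;">red<span style="color:#0f0;">green</span></span>
```

Working from the inside out means each pass only ever sees flat, non-nested pairs, which a plain regex can handle.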

Parse BBCode in array

You can try this using a regex:

$code = '[date format="j M, Y" type="jalali"]';

// Grab the contents of every [...] tag
preg_match_all("/\[([^\]]*)\]/", $code, $matches);

$codes = [];

foreach ($matches[1] as $match) {
    // Normalize quotes into double quotes
    $match = str_replace("'", '"', $match);
    // Split by whitespace but ignore whitespace inside double quotes
    preg_match_all('/(?:[^\s"]+|"[^"]*")+/', $match, $tokens);
    $parsed = [];
    $prevToken = '';
    foreach ($tokens[0] as $token) {
        if (strpos($token, '=') !== false) {
            // name=value pair: attach it to the most recent tag name
            if ($prevToken !== '') {
                $parts = explode('=', $token, 2);
                $parsed[$prevToken][$parts[0]] = trim($parts[1], '"\'');
            }
        } else {
            // Bare token: treat it as a tag name
            $parsed[$token] = [];
            $prevToken = $token;
        }
    }

    $codes[] = $parsed;
}

var_dump($codes);

Result:

array(1) {
  [0]=>
  array(1) {
    ["date"]=>
    array(2) {
      ["format"]=>
      string(6) "j M, Y"
      ["type"]=>
      string(6) "jalali"
    }
  }
}

Any good javascript BBCode parser?

I haven't personally used any JavaScript BBCode parsers, but the top two Google results (bbcodejs and this blog post) seem pretty weak. The former appears to support only simple find-and-replace, and the latter has a pre-set list of BBCode tags built in, so you'd probably have to hack it a bit if you chose that solution.

Your best options are probably to roll your own solution (possibly basing it on one of the two results above), or to use AJAX so the server renders the preview, and move on. The AJAX route is probably the best way to ensure that previews are accurate, and previewing doesn't have to happen in real time on every keypress anyway; a delay before even sending the request is acceptable.
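The delay before sending a preview request can be sketched as a debounce: each keypress restarts a timer, and only the last text typed within the delay is actually sent. Everything named here (`debounce`, `sentPreviews`, `sendPreview`, the 20 ms delay) is hypothetical, and the array push stands in for the real AJAX call, which is left out on purpose.

```javascript
// Return a wrapper that postpones fn until delayMs have passed without a new call.
function debounce(fn, delayMs) {
  let timer = null;
  return function (...args) {
    clearTimeout(timer); // cancel the request scheduled by the previous keypress
    timer = setTimeout(() => fn.apply(this, args), delayMs);
  };
}

// Stand-in for the real request: records what would have been sent to the server.
const sentPreviews = [];
const sendPreview = debounce((text) => sentPreviews.push(text), 20);

// Three quick "keypresses"; only the last one should result in a request.
sendPreview("[b]h");
sendPreview("[b]hi");
sendPreview("[b]hi[/b]");
```

In a real page the debounced function would be called from the textarea's input handler, with a longer delay chosen to taste.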

Parse bbcode with arrays

There are several BBCode flavours with different syntaxes. The best approach is to define one clear rule and handle a single syntax, but for your specific problem you can change your pattern to something like this:

#\[\*]([^[]*(?:\[(?!/?\*]|/list])[^[]*)*)(?:\[/\*])?#i

Note that you also need to run the [*] replacement before the [list] replacements.

The idea is to describe all that isn't a [*], [/*] or [/list], and to add an optional closing tag at the end.

Details:

\[\*]                       # opening tag
(                           # capture group 1
    [^[]*                   # all that isn't an opening square bracket
    (?:
        \[ (?!/?\*]|/list]) # opening bracket not followed by *] or /*] or /list]
        [^[]*
    )*
)
(?:\[/\*])?                 # the optional closing tag
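As a quick check of the pattern and of the replacement order, here is a small sketch in JavaScript, where the same expression works once the forward slashes are escaped. The function name `convertList` and the choice of `<ul>`/`<li>` output are assumptions made for this example.

```javascript
// Same pattern as above, written as a JavaScript regex literal.
const listItem = /\[\*]([^[]*(?:\[(?!\/?\*]|\/list])[^[]*)*)(?:\[\/\*])?/gi;

function convertList(bbcode) {
  return bbcode
    // Items first: the pattern relies on [/list] still being present as a boundary.
    .replace(listItem, "<li>$1</li>")
    // Only then convert the list wrappers themselves.
    .replace(/\[list]/gi, "<ul>")
    .replace(/\[\/list]/gi, "</ul>");
}

console.log(convertList("[list][*]one[*]two [b]bold[/b][/*][/list]"));
// <ul><li>one</li><li>two [b]bold[/b]</li></ul>
```

The first item has no closing tag and the second one does; both end up as complete list items, and the [b] tag inside the second item is left alone for a later replacement.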

