How to Make a Bbcode to Parse Url Tags into Links

How do I make a BBCode to parse url tags into links?

$post = preg_replace('/\[url=(.+?)\](.+?)\[\/url\]/', '<a href="\1">\2</a>', $post);

That will turn:
[url=http://google.com]Google[/url]

Into parsed bbcode text:
<a href="http://google.com">Google</a>

You'll probably want to use more specific regex than just .+ to filter out potentially bad/dangerous input.
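
As a rough illustration of that advice, here is a sketch in JavaScript (the language used in the JavaScript section of this page) that only accepts http:// and https:// URLs and HTML-escapes both captured parts before building the link. The names escapeHtml and parseUrlTags are made up for this example, and the allowed-character rule is an assumption, not a complete URL validator.

```javascript
// Escape characters that have special meaning in HTML.
function escapeHtml(s) {
  return s
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

// Convert [url=http(s)://...]label[/url] into an anchor tag.
// Only http:// and https:// URLs without whitespace, quotes, or brackets match,
// so something like [url=javascript:alert(1)] is left untouched.
function parseUrlTags(post) {
  const pattern = /\[url=(https?:\/\/[^\s\]"'<>]+)\]([^\[]+?)\[\/url\]/gi;
  return post.replace(pattern, (match, url, label) =>
    `<a href="${escapeHtml(url)}">${escapeHtml(label)}</a>`);
}

console.log(parseUrlTags("[url=http://google.com]Google[/url]"));
console.log(parseUrlTags("[url=javascript:alert(1)]x[/url]"));
```

Text outside the matched tags is returned as-is here, so the rest of the post still needs its own escaping before it is sent to a browser.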

How to convert bbcode url tag to an html hyperlink using the bbcode tag's values?

UPDATE: Casimir's commented solution is more direct/clean.

Code:

echo preg_replace('~\[url(?|]((https?://[^[]+))|(?:="(https?://[^"]+)")](.+?))\[/url]~i', '<a href="$1">$2</a>', $bbcode);

By doubling the capture of the first alternative in the pattern, you can ensure that you always have a $1 and $2 to apply to the replacement string.

Here is a slightly extended variation of the pattern that considers single quoting and no quoting.


(Start of previous solution)

By using preg_replace_callback() you can determine whether a url was provided inside the opening [url] tag, in which case you will want to preserve the text that is located between the opening and closing tags.

If the text between the tags IS the url, you use it in both places in the <a> tag string.

Invalid strings will not be converted.

Code:

$bbcodes = [
    '[URL]www.no.http.example.com[/URL]',
    '[url]https://any.com/any[/url]',
    '[url="nourl"]nourl[/url]',
    '[URL="https://any.com/any?any=333"]text text[/URL]',
    '[url="http://www.emptyTEXT.com"][/url]',
    '[url]http://www.any.com/any?any=44#sss[/url]',
    '[url="https://conflictinglink"]http://differenturl[/url]'
];

foreach ($bbcodes as $bbcode) {
    echo preg_replace_callback(
        '~\[url(?:](https?://[^[]+)|(?:="(https?://[^"]+)")](.+?))\[/url]~i',
        function($m) {
            if (isset($m[2])) {
                return "<a href=\"{$m[2]}\">{$m[3]}</a>";
            }
            return "<a href=\"{$m[1]}\">{$m[1]}</a>";
        },
        $bbcode
    );
    echo "\n---\n";
}

Output:

[URL]www.no.http.example.com[/URL]
---
<a href="https://any.com/any">https://any.com/any</a>
---
[url="nourl"]nourl[/url]
---
<a href="https://any.com/any?any=333">text text</a>
---
[url="http://www.emptyTEXT.com"][/url]
---
<a href="http://www.any.com/any?any=44#sss">http://www.any.com/any?any=44#sss</a>
---
<a href="https://conflictinglink">http://differenturl</a>
---

Pattern Breakdown:

~                      #start of pattern delimiter
\[url                  #match literally [url
(?:                    #start non-capturing group #1
  ]                    #match literally ]
  (https?://[^[]+)     #match and store as Capture Group #1: http, an optional s, a colon, two forward slashes, then one or more characters that are not an opening square bracket (so the match stops where the closing [/url] tag begins)
  |                    #or
  (?:                  #start non-capturing group #2
    ="                 #match literally ="
    (https?://[^"]+)   #match and store as Capture Group #2: same logic as Capture Group #1, but stopping at the closing double quote
    "                  #match literally "
  )                    #end non-capturing group #2
  ]                    #match literally ]
  (.+?)                #match (lazily) and store as Capture Group #3 one or more characters (this is the innerHTML component)
)                      #end non-capturing group #1
\[/url]                #match literally [/url]
~i                     #end of pattern delimiter, followed by the case-insensitive flag

The callback function inspects the matches array ($m) and returns the appropriate replacement string. When the pattern matches, $m contains either:

array(
0 => [the fullstring match]
1 => [the url of a bbcode tag that does not have a quoted url]
)

or

array(
0 => [the fullstring match]
1 => '' // <-- empty string
2 => [the quoted url of the bbcode tag]
3 => [the text between the opening and closing bbcode tags]
)
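
For comparison, roughly the same callback logic can be sketched in JavaScript. This is only an illustration, not part of the PHP answer: the function name convertUrlTags is made up, and note that JavaScript reports an unmatched capture group as undefined rather than dropping it from the array or filling it with an empty string.

```javascript
// Convert [url]http(s)://...[/url] and [url="http(s)://..."]text[/url] to anchors.
// Anything that does not match (no scheme, empty text) is left as it was.
function convertUrlTags(bbcode) {
  const pattern = /\[url(?:\](https?:\/\/[^\[]+)|="(https?:\/\/[^"]+)"\](.+?))\[\/url\]/gi;
  return bbcode.replace(pattern, (match, bareUrl, quotedUrl, text) =>
    quotedUrl !== undefined
      ? `<a href="${quotedUrl}">${text}</a>`     // url given inside the opening tag
      : `<a href="${bareUrl}">${bareUrl}</a>`);  // url is the text between the tags
}

console.log(convertUrlTags('[url]https://any.com/any[/url]'));
console.log(convertUrlTags('[URL="https://any.com/any?any=333"]text text[/URL]'));
console.log(convertUrlTags('[URL]www.no.http.example.com[/URL]'));
```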

Convert urls to links unless they are in BBCode with PHP

This may not be the "best" solution, but you can use a negative lookbehind ((?<!...)) to make sure that the URL isn't prefixed by ', ", or =. The obvious limitations are if someone writes something like:

Let's visit "https://google.com" on our computers. Or the link=https://google.com.

Anyways, the negative lookbehind would go at the very beginning of your expression and contain a character class: ["'=].

(?<!["'=])(https?://([-\w\.]+[-\w])+(:\d+)?(/([\w/_\.#-]*(\?\S+)?[^\.\s])?)?)
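
A minimal sketch of how that expression might be applied, written in JavaScript for illustration. It assumes an engine with lookbehind support (current Node.js and modern browsers), and the function name linkifyBareUrls is made up for this example.

```javascript
// Wrap bare http(s) URLs in anchor tags unless they directly follow ', ", or =.
function linkifyBareUrls(text) {
  const pattern = /(?<!["'=])(https?:\/\/([-\w.]+[-\w])+(:\d+)?(\/([\w\/_.#-]*(\?\S+)?[^.\s])?)?)/g;
  return text.replace(pattern, '<a href="$1">$1</a>');
}

console.log(linkifyBareUrls("Visit https://google.com today"));
console.log(linkifyBareUrls("[url=https://google.com]Google[/url]"));
```

The second call returns its input unchanged because the URL is preceded by =. A URL written as [url]https://google.com[/url] is preceded by ] instead, so this lookbehind alone would not protect it.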


PHP Replacing HREF link with BBCODE URL

One more wildcard does the trick for that regex - you need to allow for the extra characters between the closing " and the >:

$str = preg_replace('#(<a href=[\'"])(.*?)([\'"].*>)(.*?)(</a>)#', '[URL=$2]$4[/URL]', $str);

Depending on your data, you might also want to consider other possible combinations, for example if the link looks like <a rel="nofollow" href="http://www.foo.com">http://www.foo.com</a>:

$str = preg_replace('#(<a\s.*href=[\'"])(.*?)([\'"].*>)(.*?)(</a>)#', '[URL=$2]$4[/URL]', $str);
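
Because .* is greedy, the patterns above can swallow text when several links sit on the same line. One possible tightening, sketched here in JavaScript, uses [^>]* so that each match stays inside a single opening tag. The function name anchorsToBbcode is made up for this example, and a regex like this is still no substitute for a real HTML parser.

```javascript
// Convert <a ... href="...">label</a> into [URL=...]label[/URL].
// [^>]* cannot cross a tag boundary, so each link is converted on its own.
function anchorsToBbcode(html) {
  const pattern = /<a\s[^>]*?href=(["'])(.*?)\1[^>]*>(.*?)<\/a>/gi;
  return html.replace(pattern, "[URL=$2]$3[/URL]");
}

console.log(anchorsToBbcode('<a rel="nofollow" href="http://www.foo.com">http://www.foo.com</a>'));
console.log(anchorsToBbcode('<a href="http://a.com">A</a> and <a href=\'http://b.com\' target="_blank">B</a>'));
```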

JavaScript parse bbcode url

You can try this regexp:

\[url=([^\s\]]+)\s*\](.*(?=\[\/url\]))\[\/url\]

So, in JavaScript you can use something like this:

text = text.replace(/\[url=([^\s\]]+)\s*\](.*(?=\[\/url\]))\[\/url\]/g, '<a href="$1">$2</a>')

If you'd like to parse the short format

[url]http://ya.ru[/url]

which must transform to

<a href="http://ya.ru">http://ya.ru</a>

You'll need the following regexp:

\[url\](.*(?=\[\/url\]))\[\/url\]

And the corresponding JavaScript:

text = text.replace(/\[url\](.*(?=\[\/url\]))\[\/url\]/g, '<a href="$1">$1</a>')
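
One caveat with the patterns above: .* is greedy, so when two tags appear on the same line the match runs from the first opening tag to the last closing tag. A possible variant that handles both formats with a lazy .*? is sketched below; the function name parseUrlBbcode is made up for this example.

```javascript
// Handle both [url=...]label[/url] and [url]...[/url], one tag at a time.
function parseUrlBbcode(text) {
  return text
    // Long form: [url=http://example.com]label[/url]
    .replace(/\[url=([^\s\]]+)\s*\](.*?)\[\/url\]/gi, '<a href="$1">$2</a>')
    // Short form: [url]http://example.com[/url]
    .replace(/\[url\](.*?)\[\/url\]/gi, '<a href="$1">$1</a>');
}

console.log(parseUrlBbcode("[url]http://ya.ru[/url]"));
console.log(parseUrlBbcode("[url=http://a.com]A[/url] [url]http://b.com[/url]"));
```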

PHP - BBCode parser - Parse both bbcode link tag and not tagged link

It's easy to work around with a lookbehind assertion.

preg_replace('#(?<![>/"])((http://)?www.........)#im', '<a href="$1">$1</a>'

Thus the regex will skip any URL that is immediately preceded by ", > or /

It's a workaround, not a solution.

PS: target="_blank" is user pestering. Cut it out.

How to extract the url+parameters out of a bbcode url tag?

Simplest or most efficient? You're asking two different questions it seems. Simplest would be something like this:

Change:

List<string> sides = parts[0].BreakIntoParts('=');
if (sides.Count > 1)
    return sides[1];

To:

List<string> sides = parts[0].BreakIntoParts('=');
if (sides.Count > 1)
    return parts[0].Replace(sides[0], "");

Edit: Looks like you changed title to remove "most efficient". Here's the simplest change (fewest lines of code changed) that I see.
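
Another way to keep the parameters is to cut the string at the first = only, so that any later = signs in the query string survive. The sketch below shows that idea in JavaScript. The function name valueAfterFirstEquals and the input shape ("url=" followed by the address) are assumptions, since the question's parts array and BreakIntoParts helper are not shown here.

```javascript
// Return everything after the first "=" in a tag body such as
// "url=http://example.com/page?a=1&b=2", or null when there is no "=".
function valueAfterFirstEquals(tagBody) {
  const index = tagBody.indexOf("=");
  return index === -1 ? null : tagBody.slice(index + 1);
}

console.log(valueAfterFirstEquals("url=http://example.com/page?a=1&b=2"));
```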

Best way to parse bbcode

There are both PECL and PEAR BBCode parsing libraries. Software's hard enough without reinventing years of work on your own.

If neither of those is an option, I'd concentrate on turning the BBCode into a valid XML string, and then using your favorite XML parsing routine on that. A very rough idea:

  1. Run the code through htmlspecialchars to escape any entities that need escaping

  2. Transform all [ and ] characters into < and > respectively

  3. Don't forget to account for the colon in cases like [tagname:

If the BBCode was nested properly, you should be all set to pass this string into an XML parsing object (SimpleXML, DOMDocument, etc.)
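
The steps above can be sketched roughly as follows, in JavaScript for illustration. Everything here is an assumption layered on the outline: the names escapeXml and bbcodeToXml are made up, a tag argument such as [url=...] is stored in a hypothetical value attribute, and the colon suffix is only handled directly after the tag name (as in [b:abc123]).

```javascript
// Step 1: escape characters that are special in XML.
function escapeXml(s) {
  return s
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Steps 2 and 3: turn [tag], [tag:uid], [tag=arg] and [/tag] into XML-style tags.
function bbcodeToXml(bbcode) {
  return escapeXml(bbcode)
    // Closing tags: [/b] or [/b:abc123] -> </b>
    .replace(/\[\/([a-z]+)(?::\w+)?\]/gi, (m, tag) => `</${tag.toLowerCase()}>`)
    // Opening tags: [b], [b:abc123] -> <b>; [url=X] -> <url value="X">
    .replace(/\[([a-z]+)(?::\w+)?(?:=([^\]]*))?\]/gi, (m, tag, value) =>
      value === undefined
        ? `<${tag.toLowerCase()}>`
        : `<${tag.toLowerCase()} value="${value}">`);
}

console.log(bbcodeToXml("[b]Hi & [url=http://a.com]bye[/url][/b]"));
```

The result still needs to be wrapped in a single root element before it is handed to an XML parser, and parsing will fail if the original BBCode was not nested properly.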

How do I replace custom BBCode style tag with a hyperlink

If the attributes are always in the order you showed, you can use

var text = "[MYLINK ID=\"1234\" URL=\"http://mywebsite.com\" TEXT=\"Website link\"]";
var pattern = "\\[MYLINK\\s+ID=\"([^\"]*)\"\\s+URL=\"([^\"]*)\"\\s+TEXT=\"([^\"]*)\"]";
var replacement = "<a href=\"$2?id=$1\">$3</a>";
var result = Regex.Replace(text, pattern, replacement, RegexOptions.IgnoreCase);
// => <a href="http://mywebsite.com?id=1234">Website link</a>

Details:

  • \[MYLINK - [MYLINK text
  • \s+ - any one or more whitespaces
  • ID=\" - ID=" text
  • ([^\"]*) - Group 1 ($1): zero or more chars other than "
  • \"\s+URL=\" - ", one or more whitespaces, URL=" text
  • ([^\"]*) - Group 2 ($2): zero or more chars other than "
  • \"\s+TEXT=\" - ", one or more whitespaces, TEXT=" text
  • ([^\"]*) - Group 3 ($3): zero or more chars other than "
  • \"] - "] text.
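
If the attribute order cannot be relied on, one option is to capture the whole attribute list first and then read the individual attributes from it. The sketch below shows that idea in JavaScript rather than C#. The function name replaceMyLinks is made up, and it assumes attribute values are double-quoted and never contain a ] character.

```javascript
// Replace [MYLINK ID="..." URL="..." TEXT="..."] with a hyperlink,
// accepting the three attributes in any order.
function replaceMyLinks(text) {
  return text.replace(/\[MYLINK\s+([^\]]*)\]/gi, (match, attrText) => {
    const attrs = {};
    // Collect every NAME="value" pair found inside the tag.
    for (const [, name, value] of attrText.matchAll(/(\w+)="([^"]*)"/g)) {
      attrs[name.toUpperCase()] = value;
    }
    // Leave the tag untouched if any required attribute is missing.
    if (!("ID" in attrs && "URL" in attrs && "TEXT" in attrs)) return match;
    return `<a href="${attrs.URL}?id=${attrs.ID}">${attrs.TEXT}</a>`;
  });
}

console.log(replaceMyLinks('[MYLINK TEXT="Website link" URL="http://mywebsite.com" ID="1234"]'));
```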

