Replace any URLs within a string of text with clickable links in PHP
You can use regexp to do this:
$html_links = preg_replace('"\b(https?://\S+)"', '<a href="$1">$1</a>', $text);
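One caveat with \S+ is that it also swallows punctuation directly following a URL, such as a sentence-ending period, and the one-liner does not HTML-escape the URL. Below is a minimal sketch of one way to handle both, assuming a hypothetical helper named linkify_urls (not part of the answer above) and a fixed set of trailing punctuation characters:

```php
<?php
// Hypothetical helper: link http(s) URLs, keeping trailing
// sentence punctuation outside the generated anchor.
function linkify_urls(string $text): string
{
    return preg_replace_callback(
        '~\bhttps?://\S+~',
        function (array $m): string {
            // Split off trailing punctuation such as "." or "," (assumed set).
            $url  = rtrim($m[0], '.,;:!?');
            $tail = substr($m[0], strlen($url));
            // Escape the URL before placing it in HTML.
            $safe = htmlspecialchars($url, ENT_QUOTES, 'UTF-8');
            return '<a href="' . $safe . '">' . $safe . '</a>' . $tail;
        },
        $text
    );
}

echo linkify_urls('See http://example.com/page. Thanks!');
```

The callback form is used here only because the trimming step cannot be expressed in a plain replacement string.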
Replace all URLs in text to clickable links in PHP
function convert($input) {
    $pattern = '@(http(s)?://)?(([a-zA-Z])([-\w]+\.)+([^\s\.]+[^\s]*)+[^,.\s])@';
    return preg_replace($pattern, '<a href="http$2://$3">$0</a>', $input);
}
Replace strings based on array items in a loop in PHP
If you don't need to extract the usernames, just replace them all at once with a single regexp:
preg_replace( '#(/u/[a-z0-9]+)#i', '<a href="$1">$1</a>', $comment );
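As a quick check of that one-liner, here is a small sketch with two similar usernames. The sample comment text is made up for illustration:

```php
<?php
// Made-up sample input containing two usernames that share a prefix.
$comment = 'Thanks /u/alice and /u/alice2 for the help.';

// Same pattern as above: every /u/name occurrence is replaced in one pass.
$linked = preg_replace('#(/u/[a-z0-9]+)#i', '<a href="$1">$1</a>', $comment);

echo $linked;
```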
No need to worry about similar usernames, since each one is matched and replaced without affecting the others.
Find and replace all URLs between [link] and [/link] with hyperlinks
<?php
function PregGet( $text, $regex ) {
preg_match_all( $regex, $text, $matches );
return $matches[2];
}
function PregReplace( $text, $regex, $replace ) {
return preg_replace( $regex, $replace, $text );
}
$text = '[link]https://google.com[/link] <br>[link]https://yahoo.com[/link]';
$matches = PregGet( $text, '(\[(link)\](.*?)\[\/(link)\])' );
foreach ( $matches as $match ) {
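$a = str_replace('[link]', '', str_replace('[/link]', '', $match));
// preg_quote() keeps characters such as "." in the URL from acting as regex metacharacters.
$text = PregReplace( $text, '(\[(link)\](' . preg_quote( $a ) . ')\[/(link)\])', '<a href="' . $a . '">' . $a . '</a>' );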
}
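The match-then-replace loop above can also be collapsed into a single pass. This is a minimal sketch, assuming a hypothetical helper name link_tags_to_anchors and that the text between the tags should be HTML-escaped before output:

```php
<?php
// Hypothetical helper: convert [link]URL[/link] tags into anchors in one pass.
function link_tags_to_anchors(string $text): string
{
    return preg_replace_callback(
        '#\[link\](.*?)\[/link\]#',
        function (array $m): string {
            // $m[1] is the text captured between the two tags.
            $url = htmlspecialchars($m[1], ENT_QUOTES, 'UTF-8');
            return '<a href="' . $url . '">' . $url . '</a>';
        },
        $text
    );
}

echo link_tags_to_anchors('[link]https://google.com[/link] <br>[link]https://yahoo.com[/link]');
```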
Replace all URLs with minified URLs within a string containing mixed content
Since you know the key of each URL to be replaced, you can simply loop over them and use str_replace to replace each original URL with its short URL:
<?php
$string = "The text you want to filter goes here. http://google.com, https://www.youtube.com/watch?v=K_m7NEDMrV0,https://instagram.com/hellow/";
preg_match_all('#\bhttps?://[^,\s()<>]+(?:\([\w\d]+\)|([^,[:punct:]\s]|/))#', $string, $match);
// Shorten array
$short = [ 'http://t.com/1xx', 'http://t.com/z112', 'http://t.com/3431' ];
// For each url
foreach ($match[0] as $key => $value) {
// Replace in original text
$string = str_replace($value, $short[$key], $string);
}
echo $string;
The text you want to filter goes here. http://t.com/1xx, http://t.com/z112,http://t.com/3431
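One thing to watch with str_replace is that it replaces every occurrence of the search string, so a URL that is a prefix of another URL (for example http://a.com/x and http://a.com/xy) can corrupt the longer one. A possible alternative, sketched here with a hypothetical helper name shorten_urls, replaces each match in place with preg_replace_callback and the same pattern:

```php
<?php
// Hypothetical helper: swap the n-th matched URL for the n-th short URL.
function shorten_urls(string $text, array $short): string
{
    $i = 0;
    return preg_replace_callback(
        '#\bhttps?://[^,\s()<>]+(?:\([\w\d]+\)|([^,[:punct:]\s]|/))#',
        function (array $m) use (&$i, $short): string {
            // Use the next short URL if one exists; otherwise keep the original.
            return $short[$i++] ?? $m[0];
        },
        $text
    );
}

echo shorten_urls(
    'http://a.com/x and http://a.com/xy',
    ['http://t.com/1', 'http://t.com/2']
);
```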
Replace URLs in text with HTML links
Let's look at the requirements. You have some user-supplied plain text, which you want to display with hyperlinked URLs.
- The "http://" protocol prefix should be optional.
- Both domains and IP addresses should be accepted.
- Any valid top-level domain should be accepted, e.g. .aero and .xn--jxalpdlp.
- Port numbers should be allowed.
- URLs must be allowed in normal sentence contexts. For instance, in "Visit stackoverflow.com.", the final period is not part of the URL.
- You probably want to allow "https://" URLs too, and perhaps other schemes as well.
- As always when displaying user supplied text in HTML, you want to prevent cross-site scripting (XSS). Also, you'll want ampersands in URLs to be correctly escaped as &amp;.
- You probably don't need support for IPv6 addresses.
- Edit: As noted in the comments, support for email addresses is definitely a plus.
- Edit: Only plain text input is to be supported – HTML tags in the input should not be honoured. (The Bitbucket version supports HTML input.)
Here's my take:
<?php
$text = <<<EOD
Here are some URLs:
stackoverflow.com/questions/1188129/pregreplace-to-detect-html-php
Here's the answer: http://www.google.com/search?rls=en&q=42&ie=utf-8&oe=utf-8&hl=en. What was the question?
A quick look at http://en.wikipedia.org/wiki/URI_scheme#Generic_syntax is helpful.
There is no place like 127.0.0.1! Except maybe http://news.bbc.co.uk/1/hi/england/surrey/8168892.stm?
Ports: 192.168.0.1:8080, https://example.net:1234/.
Beware of Greeks bringing internationalized top-level domains: xn--hxajbheg2az3al.xn--jxalpdlp.
And remember.Nobody is perfect.
<script>alert('Remember kids: Say no to XSS-attacks! Always HTML escape untrusted input!');</script>
EOD;
$rexProtocol = '(https?://)?';
$rexDomain = '((?:[-a-zA-Z0-9]{1,63}\.)+[-a-zA-Z0-9]{2,63}|(?:[0-9]{1,3}\.){3}[0-9]{1,3})';
$rexPort = '(:[0-9]{1,5})?';
$rexPath = '(/[!$-/0-9:;=@_\':;!a-zA-Z\x7f-\xff]*?)?';
$rexQuery = '(\?[!$-/0-9:;=@_\':;!a-zA-Z\x7f-\xff]+?)?';
$rexFragment = '(#[!$-/0-9:;=@_\':;!a-zA-Z\x7f-\xff]+?)?';
// Solution 1:
function callback($match)
{
// Prepend http:// if no protocol specified
$completeUrl = $match[1] ? $match[0] : "http://{$match[0]}";
return '<a href="' . $completeUrl . '">'
. $match[2] . $match[3] . $match[4] . '</a>';
}
print "<pre>";
print preg_replace_callback("&\\b$rexProtocol$rexDomain$rexPort$rexPath$rexQuery$rexFragment(?=[?.!,;:\"]?(\s|$))&",
'callback', htmlspecialchars($text));
print "</pre>";
- To properly escape < and & characters, I throw the whole text through htmlspecialchars before processing. This is not ideal, as the html escaping can cause misdetection of URL boundaries.
- As demonstrated by the "And remember.Nobody is perfect." line (in which remember.Nobody is treated as a URL, because of the missing space), further checking on valid top-level domains might be in order.
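Solution 2 below addresses both problems, at the cost of more code: it re-implements preg_replace_callback using preg_match, escaping each piece of text separately and checking the top-level domain.
// Solution 2: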
$validTlds = array_fill_keys(explode(" ", ".aero .asia .biz .cat .com .coop .edu .gov .info .int .jobs .mil .mobi .museum .name .net .org .pro .tel .travel .ac .ad .ae .af .ag .ai .al .am .an .ao .aq .ar .as .at .au .aw .ax .az .ba .bb .bd .be .bf .bg .bh .bi .bj .bm .bn .bo .br .bs .bt .bv .bw .by .bz .ca .cc .cd .cf .cg .ch .ci .ck .cl .cm .cn .co .cr .cu .cv .cx .cy .cz .de .dj .dk .dm .do .dz .ec .ee .eg .er .es .et .eu .fi .fj .fk .fm .fo .fr .ga .gb .gd .ge .gf .gg .gh .gi .gl .gm .gn .gp .gq .gr .gs .gt .gu .gw .gy .hk .hm .hn .hr .ht .hu .id .ie .il .im .in .io .iq .ir .is .it .je .jm .jo .jp .ke .kg .kh .ki .km .kn .kp .kr .kw .ky .kz .la .lb .lc .li .lk .lr .ls .lt .lu .lv .ly .ma .mc .md .me .mg .mh .mk .ml .mm .mn .mo .mp .mq .mr .ms .mt .mu .mv .mw .mx .my .mz .na .nc .ne .nf .ng .ni .nl .no .np .nr .nu .nz .om .pa .pe .pf .pg .ph .pk .pl .pm .pn .pr .ps .pt .pw .py .qa .re .ro .rs .ru .rw .sa .sb .sc .sd .se .sg .sh .si .sj .sk .sl .sm .sn .so .sr .st .su .sv .sy .sz .tc .td .tf .tg .th .tj .tk .tl .tm .tn .to .tp .tr .tt .tv .tw .tz .ua .ug .uk .us .uy .uz .va .vc .ve .vg .vi .vn .vu .wf .ws .ye .yt .yu .za .zm .zw .xn--0zwm56d .xn--11b5bs3a9aj6g .xn--80akhbyknj4f .xn--9t4b11yi5a .xn--deba0ad .xn--g6w251d .xn--hgbk6aj7f53bba .xn--hlcj6aya9esc7a .xn--jxalpdlp .xn--kgbechtv .xn--zckzah .arpa"), true);
$position = 0;
// Pass $match normally: call-time pass-by-reference (&$match) is a fatal error since PHP 5.4.
while (preg_match("{\\b$rexProtocol$rexDomain$rexPort$rexPath$rexQuery$rexFragment(?=[?.!,;:\"]?(\s|$))}", $text, $match, PREG_OFFSET_CAPTURE, $position))
{
list($url, $urlPosition) = $match[0];
// Print the text leading up to the URL.
print(htmlspecialchars(substr($text, $position, $urlPosition - $position)));
$domain = $match[2][0];
$port = $match[3][0];
$path = $match[4][0];
// Check if the TLD is valid - or that $domain is an IP address.
$tld = strtolower(strrchr($domain, '.'));
if (preg_match('{\.[0-9]{1,3}}', $tld) || isset($validTlds[$tld]))
{
// Prepend http:// if no protocol specified
$completeUrl = $match[1][0] ? $url : "http://$url";
// Print the hyperlink.
printf('<a href="%s">%s</a>', htmlspecialchars($completeUrl), htmlspecialchars("$domain$port$path"));
}
else
{
// Not a valid URL.
print(htmlspecialchars($url));
}
// Continue text parsing from after the URL.
$position = $urlPosition + strlen($url);
}
// Print the remainder of the text.
print(htmlspecialchars(substr($text, $position)));
Replace all urls as links in PHP
This can be done with the preg_replace function:
$string = "Hallo Studenten http://google.com/your/subpage and https://www.yahoo.com/my/subpage";
$regexp = "/\b(?:(?:https?|ftp):\/\/|www\.)[-a-z0-9+&@#\/%?=~_|!:,.;]*[-a-z0-9+&@#\/%=~_|]/i";
$anchorMarkup = "<a href=\"$0\" target=\"_blank\" >$0</a>";
echo preg_replace($regexp, $anchorMarkup, $string);
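Note that the pattern also matches bare www. addresses, which would end up in href without a scheme and be treated by browsers as relative links. A minimal sketch of one way around that, assuming a hypothetical helper named linkify_with_www and that http:// is an acceptable default scheme:

```php
<?php
// Hypothetical helper: same pattern as above, but prefixes bare "www." matches.
function linkify_with_www(string $text): string
{
    $regexp = "/\b(?:(?:https?|ftp):\/\/|www\.)[-a-z0-9+&@#\/%?=~_|!:,.;]*[-a-z0-9+&@#\/%=~_|]/i";
    return preg_replace_callback(
        $regexp,
        function (array $m): string {
            // A bare "www." match has no scheme, so add one (assumed default: http).
            $href = stripos($m[0], 'www.') === 0 ? 'http://' . $m[0] : $m[0];
            return '<a href="' . htmlspecialchars($href, ENT_QUOTES, 'UTF-8') . '" target="_blank">'
                . htmlspecialchars($m[0], ENT_QUOTES, 'UTF-8') . '</a>';
        },
        $text
    );
}

echo linkify_with_www('Visit www.example.com or http://example.org/a');
```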
Turn Plain Text URLs into Active Links using PHP
You may wonder how it works. I'll try to explain how it should be done by various methods. We'll start first with how regex works and how it is used.
Regex - Regular expression
Basic Syntax
In computing, a regular expression (abbreviated regex or regexp) is a sequence of characters that forms a search pattern, mainly for use in pattern matching with strings, or string matching, i.e. "find and replace"-like operations.
To use regular expressions first you need to learn the syntax. This syntax consists of a series of letters, numbers, dots, hyphens and special signs, which we can group together using different parentheses.
^ The circumflex symbol matches the beginning of the input string or line, although in some cases it can be omitted
$ Same as with the circumflex symbol, the dollar sign matches the end of the input string or line
. The period matches any single character
? It will match the preceding pattern zero or one times
+ It will match the preceding pattern one or more times
* It will match the preceding pattern zero or more times
| Boolean OR
- Describes a range of characters when used inside square brackets
() Groups pattern elements together
[] Matches any single character between the square brackets
{min,max} Matches the preceding pattern at least min and at most max times, where min and max are integers
\d Matches any single digit
\D Matches any single non-digit character
\w Matches any alphanumeric character, including the underscore (_)
\W Matches any non alpha numeric character excluding the underscore character
\s Matches any single whitespace character
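The symbols above can be tried out directly with preg_match, which returns 1 on a match and 0 otherwise. The following is a small sketch; the sample patterns and subjects are made up for illustration:

```php
<?php
// Each entry: pattern, a subject that should match, a subject that should not.
$checks = [
    ['/^\d{3}$/',     '123',    '1234'],    // exactly three digits
    ['/^\w+$/',       'user_1', 'user 1'],  // word characters only
    ['/^colou?r$/',   'color',  'colouur'], // "u" is optional (zero or one)
    ['/^(ab)+$/',     'abab',   'aba'],     // grouped pattern, one or more times
    ['/^gr[ae]y$/',   'grey',   'gruy'],    // one character from the set
    ['/^(cat|dog)$/', 'dog',    'cow'],     // boolean OR inside a group
];

foreach ($checks as [$pattern, $hit, $miss]) {
    printf(
        "%-15s %-7s => %d   %-8s => %d\n",
        $pattern,
        $hit,
        preg_match($pattern, $hit),
        $miss,
        preg_match($pattern, $miss)
    );
}
```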
Brackets
Brackets [] have a special meaning when used in the context of regular expressions. They are used to match a single character from a set or range of characters.
[0-9] Matches any decimal digit from 0 through 9.
[a-z] Matches any character from lowercase a through lowercase z.
[A-Z] Matches any character from uppercase A through uppercase Z.
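[a-zA-Z] Matches any character from lowercase a through lowercase z or from uppercase A through uppercase Z. (Writing [a-Z] is not valid, because a comes after Z in ASCII order.)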
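Ranges follow character-code order, so the start of a range must not come after its end. A short sketch to check this with PHP's PCRE functions (the sample subjects are made up):

```php
<?php
// [a-zA-Z] combines two ranges to cover every ASCII letter.
$lettersOnly = preg_match('/^[a-zA-Z]+$/', 'Hello');   // 1: only letters
$withDigit   = preg_match('/^[a-zA-Z]+$/', 'Hello1');  // 0: contains a digit

// [a-Z] is rejected at compile time because "a" (97) comes after "Z" (90).
// The @ operator hides the warning; preg_match() then returns false.
$badRange = @preg_match('/[a-Z]/', 'x');

var_dump($lettersOnly, $withDigit, $badRange);
```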
Examples
Let's look at how to use the operators properly. We will do this with an example of the word hello.
/hello/ Matches the word hello
/^hello/ Matches hello at the start of a string. Possible matches are hello or helloworld, but not worldhello
/hello$/ Matches hello at the end of a string or line.
/he.o/ Matches any character between he and o. Possible matches are helo or heyo, but not hello
/he?llo/ Matches either hllo or hello
/hello+/ Matches hell followed by one or more o characters. E.g. matches hello or hellooo; to repeat the whole word, group it: /(hello)+/
/he*llo/ Matches h, zero or more e characters, then llo. E.g. matches hllo, hello or heello
/hello|world/ Matches either hello or world
/[A-Z]/ Using the hyphen character inside square brackets to denote a range, matches every uppercase character from A to Z. E.g. A, B, C…
/[abc]/ Matches any single character a, b or c
/abc{1}/ Matches precisely one c character after the characters ab. E.g. matches abc, but not abcc
/abc{1,}/ Matches one or more c character after the characters ab. E.g. matches abc or abcc
/abc{2,4}/ Matches between two and four c character after the characters ab. E.g. matches abcc, abccc or abcccc, but not abc
The most common
[^a-zA-Z] Matches any single character that is not a letter from a through z or A through Z.
p.p Matches any string containing p, followed by any character, in turn followed by another p.
^.{2}$ Matches any string containing exactly two characters.
<b>(.*)</b> Matches any string enclosed within <b> and </b>.
p(hp)* Matches any string containing a p followed by zero or more instances of the sequence hp.
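A quantifier applies only to the element directly before it, and square brackets and parentheses behave differently, which is easy to verify with preg_match. Here is a short sketch, with the patterns anchored by ^ and $ so the whole subject has to match (the subjects are made up for illustration):

```php
<?php
// Each entry: pattern, subject, expected preg_match() result.
$tests = [
    ['/^hello+$/',   'hellooo',    1], // + repeats only the final "o"
    ['/^hello+$/',   'hellohello', 0], // ...so the word itself is not repeated
    ['/^(hello)+$/', 'hellohello', 1], // group first to repeat the whole word
    ['/^he*llo$/',   'hllo',       1], // * allows zero "e" characters
    ['/^he*llo$/',   'heello',     1], // ...or several
    ['/^[A-Z]$/',    'Q',          1], // brackets define a character range
    ['/^(A-Z)$/',    'A-Z',        1], // parentheses match the literal text "A-Z"
    ['/^abc{2,4}$/', 'abccc',      1], // between two and four "c" characters
    ['/^abc{2,4}$/', 'abc',        0], // a single "c" is too few
];

foreach ($tests as [$pattern, $subject, $expected]) {
    $result = preg_match($pattern, $subject);
    echo $pattern, ' vs "', $subject, '": ',
        $result === $expected ? 'as expected' : 'unexpected', "\n";
}
```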
Regex to match a URL
First, let's look at how a URL is built. We only have a couple of options:
http://example.com/
https://example.com/
ftp://example.com/
www.example.com
user@example.com
127.0.0.1
http://example.com:8080/
These cover http://, https://, ftp, www, mail, ip and port.
Method 1 (1/10 points)
// Only mails
$match = preg_match('/[^\x00-\x20()<>@,;:\\".[\]\x7f-\xff]+(?:\.[^\x00-\x20()<>@,;:\\".[\]\x7f-\xff]+)*\@[^\x00-\x20()<>@,;:\\".[\]\x7f-\xff]+(?:\.[^\x00-\x20()<>@,;:\\".[\]\x7f-\xff]+)+/', $string, $array);
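Method 2 (5/10 points)
// Without ports, www-s, ip-s and mails
// ereg_replace() was removed in PHP 7, so the same POSIX-style pattern is passed to preg_replace() here.
$text = preg_replace('~[[:alpha:]]+://[^<>[:space:]]+[[:alnum:]/]~', '<a href="$0">$0</a>', $text);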
Method 3 (10/10 points)
/* Proposed by:
* Søren Løvborg
* http://stackoverflow.com/users/136796/soren-lovborg
*/
$rexProtocol = '(https?://)?';
$rexDomain = '((?:[-a-zA-Z0-9]{1,63}\.)+[-a-zA-Z0-9]{2,63}|(?:[0-9]{1,3}\.){3}[0-9]{1,3})';
$rexPort = '(:[0-9]{1,5})?';
$rexPath = '(/[!$-/0-9:;=@_\':;!a-zA-Z\x7f-\xff]*?)?';
$rexQuery = '(\?[!$-/0-9:;=@_\':;!a-zA-Z\x7f-\xff]+?)?';
$rexFragment = '(#[!$-/0-9:;=@_\':;!a-zA-Z\x7f-\xff]+?)?';
function callback($match)
{
// Prepend http:// if no protocol specified
$completeUrl = $match[1] ? $match[0] : "http://{$match[0]}";
return '<a href="' . $completeUrl . '">'
. $match[2] . $match[3] . $match[4] . '</a>';
}
$text = preg_replace_callback("&\\b$rexProtocol$rexDomain$rexPort$rexPath$rexQuery$rexFragment(?=[?.!,;:\"]?(\s|$))&",
'callback', htmlspecialchars($text));
You can write your own ideas to my answer.
I am writing...