Extract URL from string
John Gruber has spent a fair amount of time perfecting the "one regex to rule them all" for link detection. Used with preg_replace(), as mentioned in the other answers, the following regex should be one of the most accurate, if not the most accurate, ways to detect a link:
(?i)\b((?:[a-z][\w-]+:(?:/{1,3}|[a-z0-9%])|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}/)(?:[^\s()<>]+|\(([^\s()<>]+|(\([^\s()<>]+\)))*\))+(?:\(([^\s()<>]+|(\([^\s()<>]+\)))*\)|[^\s`!()\[\]{};:'".,<>?«»“”‘’]))
If you only wanted to match HTTP/HTTPS:
(?i)\b((?:https?://|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}/)(?:[^\s()<>]+|\(([^\s()<>]+|(\([^\s()<>]+\)))*\))+(?:\(([^\s()<>]+|(\([^\s()<>]+\)))*\)|[^\s`!()\[\]{};:'".,<>?«»“”‘’]))
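As a rough illustration in Python (the answer itself targets PHP's preg functions), the HTTP/HTTPS pattern above can be compiled with the re module. The name find_links and the sample text are assumptions made for this sketch. Because the pattern contains nested capturing groups, the sketch reads group(0) from finditer() instead of relying on findall(), which would return tuples of groups.

```python
import re

# Gruber's HTTP/HTTPS pattern from above, copied unchanged into a raw string.
GRUBER_HTTP = re.compile(
    r"""(?i)\b((?:https?://|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}/)(?:[^\s()<>]+|\(([^\s()<>]+|(\([^\s()<>]+\)))*\))+(?:\(([^\s()<>]+|(\([^\s()<>]+\)))*\)|[^\s`!()\[\]{};:'".,<>?«»“”‘’]))"""
)


def find_links(text):
    """Return every substring of text that the pattern treats as a link."""
    # group(0) is the whole match; the numbered groups are internal helpers.
    return [m.group(0) for m in GRUBER_HTTP.finditer(text)]


print(find_links("Visit https://example.com/page. Thanks"))
```

The last alternative in the pattern refuses to end a match on sentence punctuation, so the period after "page" should be left out of the result. The nested quantifiers can backtrack heavily on unusual input, so it may be worth limiting the length of the text passed in.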
How do you extract a url from a string using python?
There may be a few ways to do this, but the cleanest is to use a regex:
>>> import re
>>> myString = "This is a link http://www.google.com"
>>> print(re.search(r"(?P<url>https?://[^\s]+)", myString).group("url"))
http://www.google.com
If there can be multiple links, you can use something like this:
>>> myString = "These are the links http://www.google.com and http://stackoverflow.com/questions/839994/extracting-a-url-in-python"
>>> print(re.findall(r'(https?://[^\s]+)', myString))
['http://www.google.com', 'http://stackoverflow.com/questions/839994/extracting-a-url-in-python']
Detect and extract url from a string?
m.group(1) gives you the first capturing group, that is, whatever the first pair of capturing parentheses matched. Here that is (https?|ftp|file), so you only get the scheme.
To get the whole URL, check m.group(0), or wrap the entire pattern in parentheses and use m.group(1) again.
To match the next URL, call find() again and read the groups from the new match.
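The same distinction between group(0) and group(1) exists in Python's re module, so it can be sketched there. Only the first group, (https?|ftp|file), comes from the answer; the rest of the pattern and the sample text are assumptions for illustration.

```python
import re

# Hypothetical pattern: only the leading (https?|ftp|file) group is from the answer.
pattern = re.compile(r"(https?|ftp|file)://\S+")
text = "Mirrors: http://example.com/a and ftp://example.org/b"

first = pattern.search(text)
scheme = first.group(1)  # first capturing group: the scheme only
whole = first.group(0)   # the entire match: the full URL

# finditer() plays the role of calling find() repeatedly on a matcher.
urls = [m.group(0) for m in pattern.finditer(text)]

print(scheme, whole, urls)
```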
Extracting a URL from a string using regex
Your regex is incorrect.
A correct regex for extracting the URL: /(https?:\/\/[^ ]*)/
Here is the snippet:
var urlRegex = /(https?:\/\/[^ ]*)/;
var input = "https://medium.com/aspen-ideas/there-s-no-blueprint-26f6a2fbb99c random stuff sd";
var url = input.match(urlRegex)[1];
alert(url);
Extracting a URL in Python
In response to the OP's edit, I borrowed from "Find Hyperlinks in Text using Python (twitter related)" and came up with this:
import re
myString = "This is my tweet check it out http://example.com/blah"
print(re.search(r"(?P<url>https?://[^\s]+)", myString).group("url"))
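One caveat with the one-liner: re.search() returns None when the text contains no URL, so calling .group() directly raises AttributeError. A minimal sketch of a guarded version follows; the helper name first_url is hypothetical and not part of the original answer.

```python
import re

URL_RE = re.compile(r"(?P<url>https?://[^\s]+)")


def first_url(text):
    """Return the first http(s) URL in text, or None if there is none."""
    match = URL_RE.search(text)
    # Only read the named group when something actually matched.
    return match.group("url") if match else None


print(first_url("This is my tweet check it out http://example.com/blah"))
print(first_url("no links in this one"))
```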
Extract URLs from a string using PHP
Regex is the answer to your problem. Taking Object Manipulator's answer as a base, all it is missing is the exclusion of commas, so you can try this code, which excludes them and returns three separate URLs:
$string = "The text you want to filter goes here. http://google.com, https://www.youtube.com/watch?v=K_m7NEDMrV0,https://instagram.com/hellow/";
preg_match_all('#\bhttps?://[^,\s()<>]+(?:\([\w\d]+\)|([^,[:punct:]\s]|/))#', $string, $match);
echo "<pre>";
print_r($match[0]);
echo "</pre>";
The output is:
Array
(
    [0] => http://google.com
    [1] => https://www.youtube.com/watch?v=K_m7NEDMrV0
    [2] => https://instagram.com/hellow/
)
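For comparison, here is a rough Python analogue. It is not a port of the PHP pattern: Python's re module has no POSIX [:punct:] class, so this sketch swaps in a simpler technique, matching up to the next comma, whitespace, parenthesis or angle bracket and then stripping sentence punctuation from the end. The set of stripped characters is an assumption, and unlike the PHP pattern it makes no attempt to keep a parenthesised suffix.

```python
import re

text = (
    "The text you want to filter goes here. http://google.com, "
    "https://www.youtube.com/watch?v=K_m7NEDMrV0,https://instagram.com/hellow/"
)

# Stop each candidate at whitespace, commas, parentheses and angle brackets.
candidates = re.findall(r"\bhttps?://[^,\s()<>]+", text)

# Drop sentence punctuation left at the end; a trailing slash is kept.
urls = [c.rstrip(".;:!?'\"") for c in candidates]

print(urls)
```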
PHP regex extract url with pattern from string
You can repeat all the allowed characters before and after matching /products/ using the same optional character class. As the character class is quite long, you can shorten the notation by wrapping it in a capture group and recursing into the first subpattern with (?1).
Note that you don't have to escape the forward slash if you use a different delimiter.
$re = '`\b(?:(?:https?|ftp)://|www\.)([-a-z0-9+&@#/%?=~_|!:,.;]*)/products/(?1)[-a-z0-9+&@#/%=~_|]`';
$str = <<<EOF
http://example.com/products/1/abc
This string is valid - http://example.com/products/1
This string is not valid - http://example.com/order/1
EOF;
preg_match_all($re, $str, $matches);
print_r($matches[0]);
Output:
Array
(
    [0] => http://example.com/products/1/abc
    [1] => http://example.com/products/1
)
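Python's standard re module does not support subpattern calls such as (?1), so the same "write the long class once" idea has to be expressed differently there. The sketch below keeps the character class in a variable and interpolates it twice; the names CHARS and PRODUCT_URL are assumptions for illustration, while the character classes and sample lines are taken from the PHP example.

```python
import re

# The long class is written once and reused on both sides of /products/.
CHARS = r"[-a-z0-9+&@#/%?=~_|!:,.;]*"

PRODUCT_URL = re.compile(
    rf"\b(?:(?:https?|ftp)://|www\.){CHARS}/products/{CHARS}[-a-z0-9+&@#/%=~_|]"
)

text = """http://example.com/products/1/abc
This string is valid - http://example.com/products/1
This string is not valid - http://example.com/order/1"""

# No capturing groups are used, so findall() returns the full matches.
matches = PRODUCT_URL.findall(text)
print(matches)
```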
regex for extracting all urls from string
This should get you started:
\b(?:https?://)?(?:(?i:[a-z]+\.)+)[^\s,]+\b
Broken down, this says:
\b                  # a word boundary
(?:https?://)?      # http:// or https://, optional
(?:(?i:[a-z]+\.)+)  # one or more dot-terminated labels (subdomains and domain), case-insensitive
[^\s,]+             # one or more characters that are neither whitespace nor a comma
\b                  # another word boundary
See a demo on regex101.com.
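The annotated form can also be kept in code. As a minimal sketch, Python's re.VERBOSE flag ignores whitespace and # comments inside the pattern, so the breakdown doubles as the regex itself. The name URL_RE and the sample text are assumptions for this example.

```python
import re

URL_RE = re.compile(
    r"""
    \b                    # a word boundary
    (?:https?://)?        # http:// or https://, optional
    (?:(?i:[a-z]+\.)+)    # one or more dot-terminated labels, any case
    [^\s,]+               # neither whitespace nor comma
    \b                    # another word boundary
    """,
    re.VERBOSE,
)

text = "Go to example.com or https://sub.example.org/path, thanks."
found = URL_RE.findall(text)
print(found)
```

Because the scheme is optional, the pattern also picks up bare host names such as example.com, which may or may not be what you want.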