Regular expression to extract href url
I'm not a regular Swift developer, but have you tried the withTemplate option of stringByReplacingMatches, like this?
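// Group 1 captures the href value, group 2 the link text (both lazy)
let regex = try! NSRegularExpression(pattern: "<a[^>]+href=\"(.*?)\"[^>]*>(.*?)</a>")
// NSRange is measured in UTF-16 code units, so build it from the String's indices
let range = NSRange(text.startIndex..., in: text)
let htmlLessString: String = regex.stringByReplacingMatches(in: text,
    options: [],
    range: range,
    withTemplate: "$2 ($1)")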
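For comparison, the same template-style replacement can be sketched in Python with re.sub, where \1 and \2 play the role of $1 and $2. The sample text and the names anchor_rx and plain are assumptions made for illustration, not part of the original answer.

```python
import re

# Group 1 captures the href value, group 2 the link text.
anchor_rx = re.compile(r'<a[^>]+href="(.*?)"[^>]*>(.*?)</a>')

# Hypothetical sample input.
text = 'See <a href="http://example.com">More Examples</a> now.'

# Replace each anchor with "link text (url)".
plain = anchor_rx.sub(r'\2 (\1)', text)
print(plain)  # See More Examples (http://example.com) now.
```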
Regex to extract URLs from href attribute in HTML with Python
import re
url = '<p>Hello World</p><a href="http://example.com">More Examples</a><a href="http://2.example">Even More Examples</a>'
urls = re.findall(r'https?://(?:[-\w.]|(?:%[\da-fA-F]{2}))+', url)
>>> print(urls)
['http://example.com', 'http://2.example']
Extracting for URL from string using regex
Your regex is incorrect.
A correct regex for extracting the URL is /(https?:\/\/[^ ]*)/
Check out this fiddle.
Here is the snippet.
var urlRegex = /(https?:\/\/[^ ]*)/;
var input = "https://medium.com/aspen-ideas/there-s-no-blueprint-26f6a2fbb99c random stuff sd";
var url = input.match(urlRegex)[1];
alert(url);
Extracting URL link using regular expression re - string matching - Python
re.findall(r'https?://[^\s<>"]+|www\.[^\s<>"]+', str(STRING))
The [^\s<>"]+ part matches one or more characters that are not whitespace, a double quote, or an angle bracket, so the match does not run on into the surrounding markup in strings like:
<a href="http://www.example.com/stuff">
http://www.example.com/stuff</br>
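A minimal check of that behaviour, using the two fragments above joined into one hypothetical sample string (the names sample, url_pattern and found are illustrative):

```python
import re

# The two fragments from the answer, joined by a newline (assumed sample input).
sample = '<a href="http://www.example.com/stuff">\nhttp://www.example.com/stuff</br>'

url_pattern = re.compile(r'https?://[^\s<>"]+|www\.[^\s<>"]+')

# The first match stops at the closing quote, the second at the "<" of </br>.
found = url_pattern.findall(sample)
print(found)  # ['http://www.example.com/stuff', 'http://www.example.com/stuff']
```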
Get all links from html page using regex
You need to use the global modifier /g to get multiple matches with RegExp#exec.
Besides, since your input is HTML code, you need to make sure you do not grab < with \S:
/(?:ht|f)tps?:\/\/[-a-zA-Z0-9.]+\.[a-zA-Z]{2,3}(\/[^"<]*)?/g
See the regex demo.
If for some reason this pattern does not match equal signs, add it as an alternative:
/(?:ht|f)tps?:\/\/[-a-zA-Z0-9.]+\.[a-zA-Z]{2,3}(?:\/(?:[^"<=]|=)*)?/g
See another demo (however, the first one should do).
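As a rough Python counterpart, re.findall already returns every match, so no global flag is needed. This sketch assumes a made-up page string and turns the optional path group into a non-capturing one so that findall returns whole URLs rather than just the group.

```python
import re

# Same pattern as above, with the path group made non-capturing for findall.
link_pattern = re.compile(r'(?:ht|f)tps?://[-a-zA-Z0-9.]+\.[a-zA-Z]{2,3}(?:/[^"<]*)?')

# Hypothetical HTML input.
page = ('<a href="https://example.com/a?b=1">one</a> '
        '<a href="ftp://files.example.org/pub">two</a>')

links = link_pattern.findall(page)
print(links)  # ['https://example.com/a?b=1', 'ftp://files.example.org/pub']
```

Note that the equal sign in the first URL is matched, since [^"<] excludes only the double quote and the opening angle bracket.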
Using a regular expression to extract URLs from links in an HTML document
I would suggest using DOMDocument for this purpose rather than a regex. Consider the following simple code:
$content = '
<div class="infobar">
<a href="/link/some-text">link 1</a>
<a href="/link/another-text">link 2</a>
<a href="/link/blabla">link 3</a>
<a href="/link/whassup">link 4</a>
</div>';
$dom = new DOMDocument();
$dom->loadHTML($content);
// To hold all your links...
$links = array();
// Get all divs
$divs = $dom->getElementsByTagName("div");
foreach ($divs as $div) {
    // Check the class attr of each div
    $cl = $div->getAttribute("class");
    if ($cl == "infobar") {
        // Find all hrefs and append them to our $links array
        $hrefs = $div->getElementsByTagName("a");
        foreach ($hrefs as $href) {
            $links[] = $href->getAttribute("href");
        }
    }
}
var_dump($links);
OUTPUT
array(4) {
[0]=>
string(15) "/link/some-text"
[1]=>
string(18) "/link/another-text"
[2]=>
string(12) "/link/blabla"
[3]=>
string(13) "/link/whassup"
}
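The same parser-based idea can be sketched in Python with the standard library's html.parser module. The class name InfobarLinkParser is hypothetical, and unlike the PHP check above, which compares the whole class attribute, this version accepts any div whose class list contains infobar.

```python
from html.parser import HTMLParser


class InfobarLinkParser(HTMLParser):
    """Collects href values of <a> tags nested inside <div class="infobar">."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._div_stack = []  # one bool per open <div>: is it an infobar?

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "div":
            classes = (attrs.get("class") or "").split()
            self._div_stack.append("infobar" in classes)
        elif tag == "a" and any(self._div_stack) and attrs.get("href"):
            self.links.append(attrs["href"])

    def handle_endtag(self, tag):
        if tag == "div" and self._div_stack:
            self._div_stack.pop()


content = """
<div class="infobar">
<a href="/link/some-text">link 1</a>
<a href="/link/another-text">link 2</a>
<a href="/link/blabla">link 3</a>
<a href="/link/whassup">link 4</a>
</div>"""

parser = InfobarLinkParser()
parser.feed(content)
parser.close()
print(parser.links)
```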
Regex - extract href of html tag a
Consider using an HTML parser instead. Regex often isn't powerful enough to parse HTML. For the example you posted, and fairly limited variations of it, the following should work:
<a[\s\S]*?href="([^"]+)"[\s\S]*?>
regular expression for finding 'href' value of a <a> link
I'd recommend using an HTML parser over a regex, but here is a regex that creates a capturing group over the value of the href attribute of each link. It matches whether double or single quotes are used.
<a\s+(?:[^>]*?\s+)?href=(["'])(.*?)\1
You can view a full explanation of this regex here.
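To see the backreference at work outside the browser, here is a small Python sketch. The sample markup and the names link_rx and hrefs are assumptions for illustration. Group 1 holds the opening quote character, \1 requires the same character to close the value, and group 2 is the href itself.

```python
import re

# Group 1: the quote character; group 2: the href value; \1: the matching closing quote.
link_rx = re.compile(r'''<a\s+(?:[^>]*?\s+)?href=(["'])(.*?)\1''')

# Hypothetical input mixing double and single quotes.
html = '''<a href="google.com">g</a> <a class="x" href='/local'>l</a>'''

hrefs = [m.group(2) for m in link_rx.finditer(html)]
print(hrefs)  # ['google.com', '/local']
```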
Snippet playground:
const linkRx = /<a\s+(?:[^>]*?\s+)?href=(["'])(.*?)\1/;
const textToMatchInput = document.querySelector('[name=textToMatch]');
document.querySelector('button').addEventListener('click', () => {
  console.log(textToMatchInput.value.match(linkRx));
});
<label> Text to match: <input type="text" name="textToMatch" value='<a href="google.com"'> <button>Match</button> </label>