Replacing link with additional information, how to exclude img src= http:// tags
Try the code below; it should give you the desired output:
require 'uri'
text = '<p>This is a link: http://www.url1.com/</p>
<p>http://www.url2.com/</p>
<p><img src="http://www.url3.com/image.jpg"> something</p>'
links = URI.extract(text)
# => ["link:", "http://www.url1.com/", "http://www.url2.com/", "http://www.url3.com/image.jpg"]
Then drop the stray "link:" entry and replace each remaining link with 'REPLACED' using gsub:
links.shift # => "link:"
links.each do |link|
  text = text.gsub(link, "REPLACED")
end
The resulting text is:
"<p>This is a link: REPLACED</p>\n<p>REPLACED</p>\n<p><img src=\"REPLACED\"> something</p>"
Note that the URL inside the img src attribute is replaced as well.
Hope that helps.
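The output above shows the img src URL being replaced too. One way to leave it alone is to replace URLs only in the text between tags. The following is a minimal sketch of that idea in JavaScript rather than Ruby, and it is not the approach used in the answer above. The function name replaceUrlsOutsideTags and the simple URL pattern are assumptions made for illustration, and the sketch presumes that no attribute value contains a literal > character.

```javascript
// Hypothetical helper: replaces http(s) URLs in the text between tags only.
// Assumes simple markup where ">" never appears inside an attribute value.
function replaceUrlsOutsideTags(html, replacement) {
  // Splitting on a capturing group keeps the tags in the result:
  // even indices are text between tags, odd indices are the tags themselves.
  return html
    .split(/(<[^>]*>)/)
    .map((part, i) =>
      i % 2 === 1 ? part : part.replace(/https?:\/\/[^\s<"]+/g, replacement)
    )
    .join("");
}

const text =
  '<p>This is a link: http://www.url1.com/</p>\n' +
  '<p>http://www.url2.com/</p>\n' +
  '<p><img src="http://www.url3.com/image.jpg"> something</p>';

console.log(replaceUrlsOutsideTags(text, "REPLACED"));
```

With the sample text from the answer, the two plain URLs become REPLACED while the img tag is passed through unchanged.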
Regex replace hyphens in text excluding urls, tags and mails
There are a few issues with your regex:
- You can't use | as an OR operator in your character class
- Your regex is greedy
- You can't use multiple "not" operators in a character class
- You don't need to have more than one space matched at the start and end
- Your character class swallows spaces
It seems to me that you're overthinking it; you could rephrase your task as: "replace hyphens in words"
(\s\w+)-(\w+\s)
(\s\w+) : Capture group matching a white space and then 1 or more of the characters [a-zA-Z0-9_]
- : Match a hyphen
(\w+\s) : Capture group matching 1 or more of the characters [a-zA-Z0-9_] followed by a white space
However, you can also use your more wide ranging character class like:
(\s[^@\/\s]+)-([^@\/\s]+\s)
(\s[^@\/\s]+) : Capture group matching a space followed by 1 or more characters which aren't @, /, or a space
- : Matches a hyphen
([^@\/\s]+\s) : Capture group matching 1 or more characters which aren't @, /, or a space, followed by a space
$string = "Some text with a link but also plain URL like http://another-domain.com and an e-mail info@some-domain.com and e-shop and some relative URL like /test-url/on-this-website.";
echo preg_replace("/(\s\w+)-(\w+\s)/", "$1‑$2", $string);
echo preg_replace("/(\s[^@\/\s]+)-([^@\/\s]+\s)/", "$1‑$2", $string);
Note: You may need to change the leading and trailing whitespace matches so that they also accept the start/end of the string.
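One possible way to handle the start and end of the string is sketched below in JavaScript. This is only an illustration under a few assumptions: the helper name replaceWordHyphens is made up, the replacement character is assumed to be a non-breaking hyphen (U+2011), and the trailing whitespace is checked with a lookahead instead of being captured, so that two hyphenated words separated by a single space are both matched.

```javascript
// Hypothetical helper; U+2011 (non-breaking hyphen) is an assumed replacement.
function replaceWordHyphens(text) {
  // (^|\s)      : start of the string or a whitespace character (kept as $1)
  // (\w+)-(\w+) : two runs of word characters joined by a hyphen
  // (?=\s|$)    : must be followed by whitespace or the end, without consuming it
  return text.replace(/(^|\s)(\w+)-(\w+)(?=\s|$)/g, "$1$2\u2011$3");
}

const sample =
  "e-shop and e-mail info@some-domain.com http://another-domain.com e-book";
console.log(replaceWordHyphens(sample));
```

Here e-shop, e-mail and e-book are rewritten, while the e-mail address and the URL keep their ordinary hyphens. Words with more than one hyphen are not matched by this pattern.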
Javascript regex: Find all URLs outside a tags - Nested Tags
It turned out that probably the best solution is the following:
((https?|ftps?):\/\/[^"<\s]+)(?![^<>]*>|[^"]*?<\/a)
It looks like the negative lookahead only works properly when its alternatives start with quantified character classes rather than literal strings, so in practice we have to rely on backtracking.
Again, we just want to make sure that nothing inside HTML tags (such as attribute values) is messed up. Then we backtrack from </a up to the first " symbol (it is not a valid URL character, whereas the < and > symbols do appear when tags are nested).
Now nested tags inside <a> tags are also handled properly. Of course, the code is not perfect, but it should work with almost any simple HTML markup. You just need to be a bit careful:
- avoid placing quotes within <a> tags;
- do not use this algorithm on <a> tags without any attribute (placeholders);
- avoid multiple nested tags/lines unless the URL inside the <a> tag comes after any double quote.
Here is a very good and messy example (the last match should not be found but it is):
https://regex101.com/r/pC0jR7/2
It is a pity that this lookahead does not work: (?!<a.*?<\/a>)
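To see the pattern in action, here is a small check that can be run in Node or a browser console. The regex is copied from the answer above with only the global flag added; the sample markup and the variable names are made up for illustration.

```javascript
// Pattern copied from the answer above, with the global flag added.
const urlOutsideTags = /((https?|ftps?):\/\/[^"<\s]+)(?![^<>]*>|[^"]*?<\/a)/g;

// Hypothetical sample markup: one plain URL, one linked URL, one image URL.
const html =
  'Visit http://plain.example/page now. ' +
  '<a href="http://linked.example/">http://linked.example/</a> ' +
  '<img src="http://img.example/a.png">';

// match() with a global regex returns the full matches only.
const found = html.match(urlOutsideTags) || [];
console.log(found);
```

With this sample, only http://plain.example/page is reported: the href and src values are rejected by the first lookahead alternative, and the link text is rejected by the second.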
jQuery using Regex to find links within text but exclude if the link is in quotes
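What about adding [^"'] in front of the scheme in the exp variable? Capturing it in its own group lets the replacement put that character back instead of swallowing it into the link:
var exp = /(^|[^"'])\b((?:https?|ftp|file):\/\/[-A-Z0-9+&@#\/%?=~_|!:,.;]*[-A-Z0-9+&@#\/%=~_|])/ig;
Snippet:
// Get the content
var str = jQuery("#text2replace").html();
// Set the regex: group 1 is the character before the URL (anything but a quote), group 2 is the URL
var exp = /(^|[^"'])\b((?:https?|ftp|file):\/\/[-A-Z0-9+&@#\/%?=~_|!:,.;]*[-A-Z0-9+&@#\/%=~_|])/ig;
var replaced_text = str.replace(exp, function(match, prefix, url) {
  // Strip the scheme for the visible link text
  var clean_url = url.replace(/https?:\/\//gi, '');
  // Put the preceding character back in front of the new link
  return prefix + '<a href="' + url + '">' + clean_url + '</a>';
});
jQuery("#text2replace").html(replaced_text);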
<script src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.1/jquery.min.js"></script>
<div id="text2replace">
The School of Computer Science and Informatics. She blogs at http://www.wordpress.com and can be found on Twitter <a href="https://twitter.com/abcdef">@Abcdef</a>.
</div>