C# Convert Relative to Absolute Links in HTML String

The most robust solution would be to use the HtmlAgilityPack, as others have suggested. However, a reasonable solution using regular expressions is possible with the Regex.Replace overload that takes a MatchEvaluator delegate, as follows:

// Requires: using System; using System.Text.RegularExpressions;
var baseUri = new Uri("http://test.com");
var pattern = @"(?<name>src|href)=""(?<value>/[^""]*)""";
var matchEvaluator = new MatchEvaluator(
    match =>
    {
        var value = match.Groups["value"].Value;
        Uri uri;

        // Resolve the root-relative value against the base URI.
        if (Uri.TryCreate(baseUri, value, out uri))
        {
            var name = match.Groups["name"].Value;
            return string.Format("{0}=\"{1}\"", name, uri.AbsoluteUri);
        }

        // Leave the attribute unchanged; returning null would drop the matched text.
        return match.Value;
    });
var adjustedHtml = Regex.Replace(originalHtml, pattern, matchEvaluator);

The above sample searches for attributes named src and href that contain double-quoted values starting with a forward slash. For each match, the static Uri.TryCreate method is used to check that the value is a valid relative URI and to combine it with the base URI.

Note that this solution doesn't handle single-quoted attribute values, and it certainly doesn't work on poorly formed HTML with unquoted values.
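If single-quoted values matter, one possible extension is to capture the opening quote and require the same character to close the value. The following is only a sketch under that assumption: the class name LinkRewriter and the method name MakeLinksAbsolute are made up for illustration, and unquoted or malformed attributes are still not handled.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LinkRewriter
{
    // Matches src/href attributes wrapped in either quote style. The
    // backreference \k<quote> requires the closing quote to be the same
    // character as the opening one.
    private static readonly Regex AttributePattern = new Regex(
        @"(?<name>src|href)=(?<quote>[""'])(?<value>/.*?)\k<quote>",
        RegexOptions.IgnoreCase);

    public static string MakeLinksAbsolute(string html, Uri baseUri)
    {
        return AttributePattern.Replace(html, match =>
        {
            Uri uri;
            if (!Uri.TryCreate(baseUri, match.Groups["value"].Value, out uri))
            {
                // Leave anything that cannot be resolved untouched.
                return match.Value;
            }

            // Re-emit the attribute with the quote style it originally used.
            var quote = match.Groups["quote"].Value;
            return match.Groups["name"].Value + "=" + quote + uri.AbsoluteUri + quote;
        });
    }
}
```

Calling LinkRewriter.MakeLinksAbsolute(originalHtml, new Uri("http://test.com")) would then rewrite both quote styles in one pass while leaving already-absolute links alone, because the value group only matches values that start with a forward slash.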

Relative to absolute paths in HTML

One possible way to solve this is to use the HtmlAgilityPack library.

An example that fixes the links:

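// Requires: using System.Net; using System.Text; using HtmlAgilityPack;
WebClient client = new WebClient();
byte[] requestHTML = client.DownloadData(sourceUrl);
string sourceHTML = new UTF8Encoding().GetString(requestHTML);

HtmlDocument htmlDoc = new HtmlDocument();
htmlDoc.LoadHtml(sourceHTML);

// SelectNodes returns null (not an empty collection) when nothing matches.
HtmlNodeCollection links = htmlDoc.DocumentNode.SelectNodes("//a[@href]");
if (links != null)
{
    foreach (HtmlNode link in links)
    {
        HtmlAttribute att = link.Attributes["href"];
        if (!string.IsNullOrEmpty(att.Value))
        {
            // AbsoluteUrlByRelative is the poster's own helper; it is not shown here.
            att.Value = this.AbsoluteUrlByRelative(att.Value);
        }
    }
}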
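The AbsoluteUrlByRelative helper called above is not included in the answer. A minimal sketch of what it might look like, assuming the base URI is passed in explicitly (the original is an instance method taking only the href, so this signature and the UrlHelper class name are guesses):

```csharp
using System;

public static class UrlHelper
{
    // Hypothetical stand-in for the helper used in the answer: resolves href
    // against baseUri and falls back to the original string if that fails.
    public static string AbsoluteUrlByRelative(Uri baseUri, string href)
    {
        Uri absolute;
        return Uri.TryCreate(baseUri, href, out absolute)
            ? absolute.AbsoluteUri
            : href;
    }
}
```

Because Uri.TryCreate returns an already-absolute href as is, the loop does not need a separate check for links that point to other sites.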

Help identifying issue in Relative to Absolute URL RegEx Replacement

Change

+ "/(?<url>[^\"'>\\\\]+)(?<delim2>[\"'\\\\]{0,2})";  

to

+ "(?<url>[^\"'>\\\\]+)(?<delim2>[\"'\\\\]{0,2})";  

i.e., drop the leading slash,

and in the css section change

+ "(?!http)\\s*/(?<url>[^\"')]+)['\")]{1,2}";  

to

+ "(?!http)\\s*(?<url>[^\"')]+)['\")]{1,2}";  

Converting anchor tag with relative URL to absolute URL in HTML content using Java

I wouldn't do this in Java; I like to handle view-specific logic in the view layer. I'm assuming this block of code is coming from an AJAX call. So what you can do is get the HTML from the AJAX call and then do this:

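// Parse into a detached wrapper so the rewritten markup can be read back.
var $content = jQuery("<div>").html(html);

$content.find("a[href]").each(function(index, value) {
    var $a = jQuery(value);
    var href = $a.attr("href");

    // Skip links that are already absolute (http or https).
    if (!/^https?:/i.test(href)) {
        $a.attr("href", "http://server-b.com" + href);
    }
});

html = $content.html();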

Or if you really want to do this in Java, Lauri's answer will work.

Convert relative path to full URL

You can use

string FullUrl = Request.Url.Scheme + System.Uri.SchemeDelimiter + Request.Url.Host + "/PDF/MyFile.pdf";

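It works in ASP.NET; I'm not sure about MVC, but it should work there too. Note that Request.Url.Host does not include the port, so use Request.Url.Authority instead if the site runs on a non-default port.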
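An alternative that avoids string concatenation is to let the Uri class resolve the root-relative path against the request URL, which also preserves a non-default port. This is a sketch only: the FullUrlBuilder class and ToFullUrl method are illustrative names, and the request URL is passed in as a plain Uri so the example does not depend on an HTTP context.

```csharp
using System;

public static class FullUrlBuilder
{
    // Resolves a root-relative path such as "/PDF/MyFile.pdf" against the
    // scheme, host and port of the given request URL.
    public static string ToFullUrl(Uri requestUrl, string rootRelativePath)
    {
        return new Uri(requestUrl, rootRelativePath).AbsoluteUri;
    }
}
```

In a page or controller this would be called as FullUrlBuilder.ToFullUrl(Request.Url, "/PDF/MyFile.pdf").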

How to use html_safe and convert links to absolute URLs (not relative URLs)

I've tested this in my Rails app and it works (note how I use ' and "):

@company.about = 'Some info about a <a href="http://google.com">company</a>.'

<%= @company.about.html_safe %>

Regex: absolute url to relative url (C#)

You should consider using the Uri.MakeRelativeUri method - your current algorithm depends on external files never containing "/Content/" in their path, which seems risky to me. MakeRelativeUri will determine whether a relative path can be made from the current Uri to the src or href regardless of changes you or the external file store make down the road.
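As a rough illustration of MakeRelativeUri, the sketch below computes the link from a page to a resource. The RelativeLinks class, the ToRelative method and the example.com URLs are assumptions made up for the example.

```csharp
using System;

public static class RelativeLinks
{
    // Returns resourceUri expressed relative to pageUri. When the two URIs
    // differ in scheme, host or port, no relative form exists and the
    // absolute resource URI is returned instead.
    public static string ToRelative(Uri pageUri, Uri resourceUri)
    {
        return pageUri.MakeRelativeUri(resourceUri).ToString();
    }
}
```

For a page at http://example.com/site/pages/index.html and a resource at http://example.com/site/Content/img/logo.png, this yields ../Content/img/logo.png without any assumption about which folder names appear in the path.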

How to get a relative url from an absolute Url programmatically?

I guess it depends on what you want it relative to, but assuming you're just looking for an absolute path...

string absolute = "http://example.com/this/is/a/test";

string rel = new Uri(absolute).AbsolutePath;

or with the query,

string rel = new Uri(absolute).PathAndQuery;

Admittedly I'm also a little confused about your attempt. Why are you splitting by a comma? Are we, perhaps, not on the same page about what a relative path should look like? In any event, this should do it.

Converting relative paths to absolute paths C#

I assume you're using ASP.NET here. In this case, I think you simply want the Server.MapPath function, which returns the actual physical path of the file on the server (a file system path, not a URL).

var absolutePath = this.Server.MapPath("../../../images/arrow.gif");
// absolutePath = "\\server\webroot\folder\images\arrow.gif"

(this refers to the current page of course. You can always use HttpContext.Current.Server instead, if that's not available for whatever reason.)

Note:
If you want to do things manually and you already have a specific string like "\\server\webroot\folder\", then the functionality of System.IO.Path should do the job, I would think:

// The verbatim string (@"...") keeps the backslashes from being read as escape sequences.
var absolutePath = Path.GetFullPath(Path.Combine(@"\\server\webroot\folder\",
    "../../../images/arrow.gif"));

