Parse HTML links using C#
SubSonic.Sugar.Web.ScrapeLinks seems to do part of what you want; however, it grabs the HTML from a URL rather than from a string. You can check out their implementation here.
Parsing HTML page to extract links
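You can use the following regular expression (the quotes are escaped as they would be inside a C# string literal):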
href=\"[^\"]+\"
Test here
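As a rough sketch of how that pattern could be applied in C#, the snippet below runs it over an HTML string with System.Text.RegularExpressions. The class name HrefRegexDemo, the method ExtractHrefs, the sample HTML, and the capture group around the attribute value are all assumptions added for illustration; they are not part of the original answer. Note that the pattern only sees double-quoted href attributes and matches them on any tag, not just on a tags.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class HrefRegexDemo
{
    // The pattern from the answer, with a capture group added around the
    // attribute value (an assumption, so the bare URL can be read back).
    private static readonly Regex HrefPattern =
        new Regex("href=\"([^\"]+)\"", RegexOptions.IgnoreCase);

    // Hypothetical helper: returns every double-quoted href value in the input.
    public static List<string> ExtractHrefs(string html)
    {
        var links = new List<string>();
        foreach (Match m in HrefPattern.Matches(html))
        {
            links.Add(m.Groups[1].Value);
        }
        return links;
    }

    public static void Main()
    {
        // Made-up sample markup.
        string html = "<p><a href=\"/docs\">Docs</a> <a href=\"http://example.com/a.pdf\">PDF</a></p>";
        foreach (string link in ExtractHrefs(html))
        {
            Console.WriteLine(link);
        }
    }
}
```

A regular expression is fine for quick jobs on markup you control; for arbitrary pages, an HTML parser is usually the more robust choice.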
Parsing Hyperlinks from a webpage
Try HtmlAgilityPack:
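HtmlWeb hw = new HtmlWeb();
HtmlDocument doc = hw.Load("http://www.msdn.com");
// SelectNodes returns null (not an empty collection) when nothing matches.
var links = doc.DocumentNode.SelectNodes("//a[@href]");
if (links != null)
{
    foreach (HtmlNode link in links)
    {
        Console.WriteLine(link.GetAttributeValue("href", null));
    }
}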
This prints every link found on the page at that URL.
If you want to store the links in a list:
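// Requires "using System.Linq;". SelectNodes returns null when nothing matches.
var linkList = doc.DocumentNode.SelectNodes("//a[@href]")
    ?.Select(i => i.GetAttributeValue("href", null)).ToList()
    ?? new List<string>();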
Parsing HTML with c#.net
Take a look at the HtmlAgilityPack. It's a pretty decent HTML parser.
http://html-agility-pack.net/?z=codeplex
Here's some code to get you started (it still needs error checking):
HtmlDocument document = new HtmlDocument();
string htmlString = "<html>blabla</html>";
document.LoadHtml(htmlString);
// Select only anchors that have an href; SelectNodes returns null when nothing matches.
HtmlNodeCollection collection = document.DocumentNode.SelectNodes("//a[@href]");
if (collection != null)
{
    foreach (HtmlNode link in collection)
    {
        string target = link.Attributes["href"].Value;
    }
}
How to extract specific link in c#?
Use an XPath expression as a selector:
// SelectSingleNode returns null when nothing matches, hence the ?. operator.
var alink = htmlDocument.DocumentNode
    .SelectSingleNode("//li/a[contains(@onclick, 'PDF')]")
    ?.GetAttributeValue("href", "");
Explanation of the XPath (as requested):
Match an li tag at any depth in the document with an immediate child a tag that has an onclick attribute containing the string 'PDF'.
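To see how that expression behaves, here is a small sketch that evaluates the same XPath with the built-in System.Xml.Linq and System.Xml.XPath APIs instead of HtmlAgilityPack. This is a swapped-in technique and only works on well-formed (XML-like) markup. The class name XPathDemo, the method FindPdfHref, and the sample markup are made up for illustration.

```csharp
using System;
using System.Xml.Linq;
using System.Xml.XPath;

public static class XPathDemo
{
    // Hypothetical helper: returns the href of the first <a> that is a direct
    // child of an <li> and whose onclick contains "PDF", or "" if none matches.
    public static string FindPdfHref(string xhtml)
    {
        XDocument doc = XDocument.Parse(xhtml); // throws on malformed markup
        XElement link = doc.XPathSelectElement("//li/a[contains(@onclick, 'PDF')]");
        return link?.Attribute("href")?.Value ?? "";
    }

    public static void Main()
    {
        // Made-up sample: the first link lacks 'PDF' in onclick, the second is
        // not an immediate child of its li, so only the third one matches.
        string xhtml =
            "<ul>" +
            "<li><a href=\"/view\" onclick=\"track('HTML')\">View</a></li>" +
            "<li><span><a href=\"/nested.pdf\" onclick=\"track('PDF')\">Nested</a></span></li>" +
            "<li><a href=\"/report.pdf\" onclick=\"track('PDF')\">Download</a></li>" +
            "</ul>";
        Console.WriteLine(FindPdfHref(xhtml)); // prints "/report.pdf"
    }
}
```

Real-world HTML is rarely well-formed XML, which is why the answers above reach for HtmlAgilityPack; the XPath expression itself is the same in both cases.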