Selecting attribute values with Html Agility Pack
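Html Agility Pack's SelectNodes and SelectSingleNode on HtmlNode return element nodes, so they do not support selecting an attribute directly with an XPath such as //meta/@content.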
Selecting attribute value using XPath and HtmlAgilityPack
There is a way, using HtmlNodeNavigator:
public static string TextfromOneNode(HtmlNode node, string xmlPath)
{
    string toReturn = "";
    var navigator = (HtmlAgilityPack.HtmlNodeNavigator)node.CreateNavigator();
    var result = navigator.SelectSingleNode(xmlPath);
    if (result != null)
    {
        toReturn = result.Value;
    }
    return toReturn;
}
The following console app example demonstrates how HtmlNodeNavigator.SelectSingleNode()
works both with an XPath that returns an element and with an XPath that returns an attribute:
var raw = @"<div>
<meta name='pubdate' content='2012-08-30' />
<span>foo</span>
</div>";
var doc = new HtmlAgilityPack.HtmlDocument();
doc.LoadHtml(raw);
var navigator = (HtmlAgilityPack.HtmlNodeNavigator)doc.CreateNavigator();
var xpath1 = "//meta[@name='pubdate']/@content";
var xpath2 = "//span";
var result = navigator.SelectSingleNode(xpath1);
Console.WriteLine(result.Value);
result = navigator.SelectSingleNode(xpath2);
Console.WriteLine(result.Value);
Output:
2012-08-30
foo
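When more than one attribute matches, the same navigator can be iterated instead of stopping at the first hit. The following is only a minimal sketch, assuming the HtmlAgilityPack NuGet package is referenced; the class and method names (AttributeValues, SelectValues) are made up for illustration and are not part of the library.

```csharp
using System.Collections.Generic;
using System.Xml.XPath;
using HtmlAgilityPack;

public static class AttributeValues
{
    // Hypothetical helper: returns the string value of every node
    // (element or attribute) matched by the XPath expression.
    public static List<string> SelectValues(string html, string xpath)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // HtmlDocument implements IXPathNavigable, so this yields an XPathNavigator.
        XPathNavigator navigator = doc.CreateNavigator();
        var values = new List<string>();

        // XPathNavigator.Select returns an iterator over all matches.
        XPathNodeIterator iterator = navigator.Select(xpath);
        while (iterator.MoveNext())
        {
            values.Add(iterator.Current.Value);
        }
        return values;
    }
}
```

Calling AttributeValues.SelectValues(raw, "//meta/@content") with the raw markup from the example above is expected to return the content value of each meta element.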
Get value of an attribute by HtmlAgilityPack
You can use XPath to query the document nodes to find the nodes you are looking for:
// Requires: using System; using System.Linq;
static void Main(string[] args)
{
    var html = @"<div class=""vcard-names-container py-3 js-sticky js-user-profile-sticky-fields"" style=""position: static;"">
<h1 class=""vcard-names"">
<span class=""vcard-fullname d-block"" itemprop=""name"">Name 001</span>
<span class=""vcard-username d-block"" itemprop=""additionalName"">Name 002</span>
</h1>
</div>";
    HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
    doc.LoadHtml(html);
    // Select every span and project each node to its inner text.
    var names = doc.DocumentNode.SelectNodes("//span").Select(x => x.InnerText);
    foreach (var name in names)
    {
        Console.WriteLine(name);
    }
    Console.ReadLine();
}
Select elements with attribute data-url using HTMLAgilityPack
The following should do what you want:
foreach (HtmlNode divNode in htmlDocument.DocumentNode.SelectNodes("//div[@data-url]"))
{
    HtmlAttribute attribute = divNode.Attributes["data-url"];
    links.Add(attribute.Value);
}
Effectively, the expression //div[@data-url] selects all div nodes that have a data-url attribute. We then pull out the value of that attribute.
If nodes other than divs carry this attribute, then //*[@data-url] should do the trick.
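One thing to watch for with the loop above: as far as I know, SelectNodes returns null rather than an empty collection by default when nothing matches, so the foreach can throw on pages without the attribute. A minimal sketch of a guarded version follows, assuming the HtmlAgilityPack NuGet package is referenced; DataUrlReader and GetDataUrls are hypothetical names used only for this illustration.

```csharp
using System.Collections.Generic;
using HtmlAgilityPack;

public static class DataUrlReader
{
    // Hypothetical helper: collects data-url values from any element type.
    public static List<string> GetDataUrls(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var links = new List<string>();

        // Guard against a null result when no node carries the attribute.
        HtmlNodeCollection nodes = doc.DocumentNode.SelectNodes("//*[@data-url]");
        if (nodes == null)
        {
            return links;
        }

        foreach (HtmlNode node in nodes)
        {
            // GetAttributeValue falls back to the supplied default if the attribute is missing.
            links.Add(node.GetAttributeValue("data-url", ""));
        }
        return links;
    }
}
```

With this shape the caller always gets a list back and can check its Count instead of handling null.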
Find all elements with a data- attribute using html-agility-pack
// the html block of text to parse
var a = @"<p> sample text <a href="""" data-glossaryid=""F776EB48BD""></a>
<p><img alt=""my pic"" src=""/~/media/Images/mypic.jpg"" /></p>
sample text <a href="""" data-glossaryid=""5D476EB49E""></a>
<p> more sample text </p>
<span data-glossaryid=""F776EB49EF""> </span>";
// create an HtmlDocument
var htmlDocument = new HtmlDocument();
htmlDocument.LoadHtml(a);
// get all elements with the data-glossaryid attribute and print their values
foreach (var item in htmlDocument.DocumentNode.SelectNodes("//*[@data-glossaryid]"))
    Console.WriteLine(item.GetAttributeValue("data-glossaryid", ""));
HtmlAgilityPack - Select td Attribute from Table
Try using:
var div = documentx.DocumentNode.SelectNodes("//*//table[3]//tr");
instead of:
var div = documentx.DocumentNode.SelectNodes("//*//table[2]//tr");
and use it like this:
var author = item.ChildNodes[0].InnerText;
var series = item.ChildNodes[1].InnerText;
var title = item.ChildNodes[2].InnerText;
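Indexing ChildNodes can be fragile, because the collection also contains whitespace text nodes between the cells when the markup is formatted across lines. A rough sketch of reading the cells by element name instead is shown below, assuming the HtmlAgilityPack NuGet package is referenced; TableReader and ReadRows are hypothetical names, and the row XPath is whatever expression fits your page.

```csharp
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

public static class TableReader
{
    // Hypothetical helper: returns the trimmed text of each td, row by row.
    public static List<string[]> ReadRows(HtmlDocument document, string rowXPath)
    {
        var rows = new List<string[]>();

        var rowNodes = document.DocumentNode.SelectNodes(rowXPath);
        if (rowNodes == null)
        {
            return rows; // no rows matched the XPath
        }

        foreach (HtmlNode row in rowNodes)
        {
            // Elements("td") yields only direct td children, skipping text nodes.
            string[] cells = row.Elements("td")
                                .Select(td => td.InnerText.Trim())
                                .ToArray();
            if (cells.Length > 0)
            {
                rows.Add(cells); // header rows with only th cells are skipped
            }
        }
        return rows;
    }
}
```

Each returned array can then be read positionally, for example cells[0] for the author, cells[1] for the series and cells[2] for the title, if that is the column order of the table.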
Html Agility Pack - Extract Node with “empty” class Attribute OR Select a PAIR of nodes (one and its immediate following node)
TBH, I don't quite understand your question, but here is my attempt to answer it.
Here is a bit of code to get “one node by class AND its first following node”.
I haven’t used XPath yet, so this uses LINQ instead:
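// Extension methods must be declared in a static class.
public static class HtmlNodeExtensions
{
    public static bool HasClass(this HtmlNode node, params string[] classValueArray)
    {
        var classValue = node.GetAttributeValue("class", "");
        var classValues = classValue.Split(' ');
        return classValueArray.All(c => classValues.Contains(c));
    }
}
// ?. avoids a NullReferenceException when no li has the class.
var next = doc.DocumentNode.Descendants("li").FirstOrDefault(_ => _.HasClass("classname"))?.NextSibling;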
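And, in case it helps, a way to get a “node which has the class Attribute but NO VALUE”. The extra null check is there because GetAttributeValue also returns the default when the attribute is missing altogether:
doc.DocumentNode.Descendants("li").Where(_ => _.Attributes["class"] != null && string.IsNullOrEmpty(_.GetAttributeValue("class", "")))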