Escape Single Quote in Xpath with Nokogiri

Escape single quote in XPath with Nokogiri?

XPath doesn’t have any way of escaping special characters, so this is a little tricky. A solution in this specific case would be to use double quotes instead of single quotes in the XPath expression:

text()="Frank's car"

If you did this, you’d have to escape the quotes from Ruby if you used double quotes around the whole expression:

"//li[text()=\"Frank's car\"]"

You could use single quotes here if you aren’t doing any interpolation, and then escape the single quote:

'//li[text()="Frank\'s car"]'

A better option would perhaps be to make use of Ruby’s flexible quoting, so that none of the quotes need escaping, e.g.:

%{//li[text()="Frank's car"]}

Note that all the examples here do their escaping in Ruby, so the string that reaches the XPath processor is always //li[text()="Frank's car"].
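A quick way to convince yourself of this is to compare the three Ruby spellings directly. This is only a small sketch; the constant names are made up for illustration.

```ruby
# Three Ruby spellings of the same XPath expression.
DOUBLE_QUOTED = "//li[text()=\"Frank's car\"]"
SINGLE_QUOTED = '//li[text()="Frank\'s car"]'
PERCENT_FORM  = %{//li[text()="Frank's car"]}

# All three are the same Ruby string, so the XPath processor
# cannot tell which spelling was used in the source file.
ALL_EQUAL = [DOUBLE_QUOTED, SINGLE_QUOTED, PERCENT_FORM].uniq.size == 1

puts PERCENT_FORM
puts ALL_EQUAL
```

Since the strings are identical, the choice between the three is purely about readability on the Ruby side.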

The more general case, where the text is a variable that could contain single or double quotes, is more difficult. An XPath string literal can’t contain both types of quotes, so you need to construct the string using the XPath concat function.

For example, if you wanted to match the string "That's mine", he said., you would need to do something like:

text()=concat('"That', "'", 's mine", he said.')

And then you’d have to escape the quotes from Ruby (using %{} would be easiest).
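If the text comes from a variable, the concat expression can be built in Ruby. The following is a minimal sketch under stated assumptions: xpath_literal is a hypothetical helper name, not something Nokogiri provides, and it targets XPath 1.0 string literals.

```ruby
# Hypothetical helper (not part of Nokogiri): turn any Ruby string into an
# XPath 1.0 expression that evaluates to that string.
def xpath_literal(value)
  # No double quote inside: a double-quoted literal is enough.
  return %("#{value}") unless value.include?('"')
  # No single quote inside: a single-quoted literal is enough.
  return "'#{value}'" unless value.include?("'")

  # Both kinds present: split on single quotes (keeping empty pieces),
  # wrap every piece in single quotes, and join the pieces with a
  # double-quoted single quote.
  parts = value.split("'", -1).map { |part| "'#{part}'" }
  "concat(#{parts.join(%(, "'", ))})"
end

puts xpath_literal(%{"That's mine", he said.})
# prints: concat('"That', "'", 's mine", he said.')
```

The result can then be interpolated into the query, for example "//li[text()=#{xpath_literal(text)}]", where text is whatever variable holds the value to match.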

I found another question on SO dealing with this issue in C#, and a thread on the Nokogiri mailing list, both of which might be worth looking at if you need to take this further.

How to escape single quote in xpath 1.0 in selenium for python

In XPath 1.0, which is used by browsers and therefore by Selenium, there is no native way of escaping quotes inside string literals (this was remedied in XPath 2.0). A few workarounds are mentioned by this poster, which include:

  • First off, make sure you understand the difference between escaping in Python, which is possible, and escaping within the XPath expression
  • Then, if you simply need a single quote, surround it by double quotes, and vice versa
  • Then, if one string literal contains both double and single quotes, use something like concat('"', "Here's Johnny", '"', ", said Johnny."), which combines to the literal: "Here's Johnny", said Johnny..

In your case, this would work:

driver.find_element_by_xpath(u"//span[text()=\"" + cat2 + "\"]").click()

Another way around this is to set an XPath variable to contain the value of your string literal, which helps in readability. But I couldn't find how to do it with the web drivers for Selenium, which typically means there is no such method available.

Escaping Underscore with Xpath in Nokogiri

The problem isn't the underscore, it's your XPath.

//v-product__details

is looking for a tag like <v-product__details>, not something with v-product__details in its class attribute.

I'd use CSS for this instead:

parse_page.css('.v-product__details')

But if you must use XPath:

parse_page.xpath('//div[contains(@class, "v-product__inner")]')
parse_page.xpath('//*[contains(@class, "v-product__inner")]')
parse_page.xpath('//div[@class="v-product__inner"]')
parse_page.xpath('//*[@class="v-product__inner"]')
...

And if parse_page came from Nokogiri::HTML.fragment(...) then you'll want to add a leading . to your XPath expressions:

parse_page.xpath('.//div[contains(@class, "v-product__inner")]')
...

But really, I'd go with CSS if possible.
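One caveat if you do stay with XPath: contains(@class, "...") matches substrings, so it would also match a longer class name that merely contains the one you asked for, while @class="..." only matches when that is the element's sole class. A common XPath 1.0 idiom pads the attribute with spaces and looks for the padded name. The sketch below only builds the expression string; class_xpath and its tag parameter are hypothetical names, and it assumes the class name contains no double quote.

```ruby
# Hypothetical helper: build an XPath 1.0 expression that matches elements
# whose class attribute contains class_name as a whole, space-separated token.
# Assumes class_name contains no double quote.
def class_xpath(class_name, tag = '*')
  padded_attr = %{concat(" ", normalize-space(@class), " ")}
  %{.//#{tag}[contains(#{padded_attr}, " #{class_name} ")]}
end

puts class_xpath('v-product__details')
puts class_xpath('v-product__inner', 'div')
```

The returned string can be passed to parse_page.xpath(...). The leading . keeps the expression relative to the node it is called on, which is what the fragment case above needs.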

Xpath matches with single quotes?

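Try this. matches() is an XPath 2.0 function, and XPath 2.0 lets you escape a quote inside a string literal by doubling it:

//faultstring[matches(text(), '''')]

or delimit the literal with double quotes instead:

//faultstring[matches(text(), "'")]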

For a more elegant solution see this post

Apostrophe (') in XPath query

This is surprisingly difficult to do.

Take a look at the XPath Recommendation, and you'll see that it defines a literal as:

Literal ::=   '"' [^"]* '"' 
| "'" [^']* "'"

Which is to say, string literals in XPath expressions can contain apostrophes or double quotes but not both.
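To make the grammar concrete, here is the Literal production transcribed into a regular expression. It is a rough sketch in Ruby (this page's main language); XPATH_LITERAL and xpath_literal? are names invented for the example.

```ruby
# The Literal production written as a Ruby regular expression:
# a double-quoted run with no double quote inside, or
# a single-quoted run with no single quote inside.
XPATH_LITERAL = /\A(?:"[^"]*"|'[^']*')\z/

def xpath_literal?(text)
  XPATH_LITERAL.match?(text)
end

puts xpath_literal?(%{"Frank's car"})  # true: apostrophe inside double quotes
puts xpath_literal?(%{'say "hi"'})     # true: double quotes inside single quotes
puts xpath_literal?(%{'Frank's car'})  # false: the apostrophe ends the literal early
```

Neither alternative has room for an escape sequence, which is why a value containing both kinds of quote cannot be written as a single literal.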

You can't use escaping to get around this. A literal like this:

'Some&apos;Value'

will match this XML text:

Some&amp;apos;Value

This does mean that it's possible for there to be a piece of XML text that you can't generate an XPath literal to match, e.g.:

<elm att="&quot;&apos;"/>

But that doesn't mean it's impossible to match that text with XPath; it's just tricky. In any case where the value you're trying to match contains both single and double quotes, you can construct an expression that uses concat to produce the text that it's going to match:

elm[@att=concat('"', "'")]

So that leads us to this, which is a lot more complicated than I'd like it to be:

/// <summary>
/// Produce an XPath literal equal to the value if possible; if not, produce
/// an XPath expression that will match the value.
///
/// Note that this function will produce very long XPath expressions if a value
/// contains a long run of double quotes.
/// </summary>
/// <param name="value">The value to match.</param>
/// <returns>If the value contains only single or double quotes, an XPath
/// literal equal to the value. If it contains both, an XPath expression,
/// using concat(), that evaluates to the value.</returns>
static string XPathLiteral(string value)
{
    // if the value contains only single or double quotes, construct
    // an XPath literal
    if (!value.Contains("\""))
    {
        return "\"" + value + "\"";
    }
    if (!value.Contains("'"))
    {
        return "'" + value + "'";
    }

    // if the value contains both single and double quotes, construct an
    // expression that concatenates all non-double-quote substrings with
    // the quotes, e.g.:
    //
    // concat("foo", '"', "bar")
    StringBuilder sb = new StringBuilder();
    sb.Append("concat(");
    string[] substrings = value.Split('\"');
    for (int i = 0; i < substrings.Length; i++)
    {
        bool needComma = (i > 0);
        if (substrings[i] != "")
        {
            if (i > 0)
            {
                sb.Append(", ");
            }
            sb.Append("\"");
            sb.Append(substrings[i]);
            sb.Append("\"");
            needComma = true;
        }
        if (i < substrings.Length - 1)
        {
            if (needComma)
            {
                sb.Append(", ");
            }
            sb.Append("'\"'");
        }
    }
    sb.Append(")");
    return sb.ToString();
}

And yes, I tested it with all the edge cases. That's why the logic is so stupidly complex:

// Requires: using System; using System.Text; using System.Xml;
static void Main(string[] args)
{
    // Assumed setup (the original snippet starts at the loop): an empty
    // document whose root element is named "root", matching the XPath below.
    XmlDocument d = new XmlDocument();
    d.LoadXml("<root/>");

    foreach (string s in new[]
    {
        "foo", // no quotes
        "\"foo", // double quotes only
        "'foo", // single quotes only
        "'foo\"bar", // both; double quotes in mid-string
        "'foo\"bar\"baz", // multiple double quotes in mid-string
        "'foo\"", // string ends with double quotes
        "'foo\"\"", // string ends with run of double quotes
        "\"'foo", // string begins with double quotes
        "\"\"'foo", // string begins with run of double quotes
        "'foo\"\"bar" // run of double quotes in mid-string
    })
    {
        Console.Write(s);
        Console.Write(" = ");
        Console.WriteLine(XPathLiteral(s));

        // add an element carrying the value, then try to find it again
        XmlElement elm = d.CreateElement("test");
        d.DocumentElement.AppendChild(elm);
        elm.SetAttribute("value", s);

        string xpath = "/root/test[@value = " + XPathLiteral(s) + "]";
        if (d.SelectSingleNode(xpath) == elm)
        {
            Console.WriteLine("OK");
        }
        else
        {
            Console.WriteLine("Should have found a match for {0}, and didn't.", s);
        }
    }
    Console.ReadKey();
}

XPATH in Nokogiri

You can call data.remove_namespaces! first (see http://www.rubydoc.info/github/sparklemotion/nokogiri/Nokogiri/XML/Document:remove_namespaces! ). After that, this query works:

data.xpath('/Envelope/Body/findResponse/result/service.SapEgrQosQueueStatsLogRecord')


