How to Use XPath contains() for Specific Text

How to use XPath contains() for specific text?

Be careful with the contains() function.

It is a common mistake to use it to test if an element contains a value. What it really does is test if a string contains a substring. So, td[contains(.,'8')] takes the string value of td (.) and tests if it contains any '8' substrings. This might be what you want, but often it is not.

This XPath,

//td[.='8']

will select all td elements whose string-value equals 8.

Alternatively, this XPath,

//td[normalize-space()='8']

will select all td elements whose normalize-space() string-value equals 8. (The normalize-space() XPath function strips leading and trailing whitespace and replaces sequences of whitespace characters with a single space.)

Notes:

  • Both will work even if the 8 is nested inside another element
    (a, b, span, div, etc.).
  • Neither will match <td>gr8t</td>, <td>123456789</td>, etc.
  • Using normalize-space() will ignore leading or trailing whitespace
    surrounding the 8.

See also:

  • Why is contains(text(), "string" ) not working in XPath?

XPath contains(text(),'some string') doesn't work when used with node with more than one Text subnode

The <Comment> tag contains two text nodes and two <br> nodes as children.

Your XPath expression was

//*[contains(text(),'ABC')]

To break this down,

  1. * is a selector that matches any element (i.e. tag) -- it returns a node-set.
  2. The [] are a conditional that operates on each individual node in that node set. It matches if any of the individual nodes it operates on match the conditions inside the brackets.
  3. text() is a selector that matches all of the text nodes that are children of the context node -- it returns a node set.
  4. contains is a function that operates on a string. If it is passed a node set, the node set is converted into a string by returning the string-value of the node in the node-set that is first in document order. Hence, it can match only the first text node in your <Comment> element -- namely BLAH BLAH BLAH. Since that doesn't match, you don't get a <Comment> in your results.

You need to change this to

//*[text()[contains(.,'ABC')]]

To break this down,

  1. * is a selector that matches any element (i.e. tag) -- it returns a node-set.
  2. The outer [] are a conditional that operates on each individual node in that node set -- here it operates on each element in the document.
  3. text() is a selector that matches all of the text nodes that are children of the context node -- it returns a node set.
  4. The inner [] are a conditional that operates on each node in that node set -- here each individual text node. Each individual text node is the starting point for any path in the brackets, and can also be referred to explicitly as . within the brackets. It matches if any of the individual nodes it operates on match the conditions inside the brackets.
  5. contains is a function that operates on a string. Here it is passed an individual text node (.). Since it is passed the second text node in the <Comment> tag individually, it will see the 'ABC' string and be able to match it.
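
The difference between the two expressions can be sketched as follows, assuming Python with the third-party lxml library (an XPath 1.0 engine). The <Comment> markup is a hypothetical reconstruction with two text nodes and two <br> children, not the asker's actual document.

```python
from lxml import etree  # third-party: pip install lxml

# Hypothetical reconstruction: two text nodes separated by two <br> elements.
doc = etree.fromstring(
    "<root><Comment>BLAH BLAH BLAH<br/><br/>ABC DEF</Comment></root>"
)

# contains() receives a node-set, so only the first text node
# ("BLAH BLAH BLAH") is converted to a string and tested.
first_only = [e.tag for e in doc.xpath("//*[contains(text(), 'ABC')]")]

# The inner predicate tests each text node child individually.
each_text = [e.tag for e in doc.xpath("//*[text()[contains(., 'ABC')]]")]

print(first_only, each_text)
```

With this sample, the first expression selects nothing and the second selects the Comment element.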

Using XPATH, how to select ANY node that contains a certain string

There's some ambiguity regarding the term nodes (see XPath difference between child::* and child::node()) and the term contains (see How to use XPath contains() for specific text?) when being less than perfectly precise, but one of the following XPaths will likely meet your needs:

  1. All nodes whose string value contains the substring, "John":

    //node()[contains(.,"John")]
  2. All such elements:

    //*[contains(.,"John")]
  3. All such attributes:

    //@*[contains(.,"John")]
  4. All such text nodes:

    //text()[contains(.,"John")]
  5. All elements with text node children that contain the substring, "John":

    //*[text()[contains(.,"John")]]

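Notice that #1 will include the books element, whose string value contains "John" only through its descendants, but #5 will exclude it because none of its own text node children contains the substring. See Testing text() nodes vs string values in XPath.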

You can replace contains(.,"John") with contains(lower-case(.),"john") in any of the above XPaths if you're using XPath 2.0. See also Case insensitive XPath contains() possible?
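
To see how the five XPath 1.0 variants differ, here is a small sketch assuming Python with the third-party lxml library. The books document and the helper name describe are hypothetical, made up for illustration.

```python
from lxml import etree  # third-party: pip install lxml

# Hypothetical document; written without whitespace between tags so that
# no whitespace-only text nodes appear in the results.
doc = etree.fromstring(
    "<books>"
    "<book author='John Doe'><title>Notes by John</title></book>"
    "<book author='Jane Roe'><title>Other</title></book>"
    "</books>"
)

def describe(node):
    # lxml returns text and attribute results as str subclasses,
    # and elements as element objects; report elements by tag name.
    return node if isinstance(node, str) else node.tag

def select(xpath):
    return [describe(n) for n in doc.xpath(xpath)]

any_node = select('//node()[contains(.,"John")]')       # variant 1
elements = select('//*[contains(.,"John")]')            # variant 2
attributes = select('//@*[contains(.,"John")]')         # variant 3
text_nodes = select('//text()[contains(.,"John")]')     # variant 4
text_parents = select('//*[text()[contains(.,"John")]]')  # variant 5

print(any_node, elements, attributes, text_nodes, text_parents)
```

In this sample, variant 2 returns books, book, and title because each has "John" somewhere in its string value, while variant 5 returns only title, the one element with a matching text node child.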

Selenium XPath: Find element with text OR other text

Yes, you can combine conditions with the XPath or operator (lowercase, since XPath keywords are case-sensitive) as follows:

  • //*[contains(text(),'A') or contains(text(),'B')]
  • //*[text()='exact_Text_1' or text()='exact_text_2']
  • //*[@class='abc' or @class='pqr']
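
A minimal sketch of these three expressions, assuming Python with the third-party lxml library rather than Selenium itself; the sample markup is hypothetical and only reuses the placeholder values from the list above.

```python
from lxml import etree  # third-party: pip install lxml

# Hypothetical markup, invented for illustration.
doc = etree.fromstring(
    "<div>"
    "<p class='abc'>A</p>"
    "<p class='pqr'>B</p>"
    "<p class='xyz'>C</p>"
    "<p>exact_Text_1</p>"
    "</div>"
)

def texts(xpath):
    """Return the text of each element selected by the XPath."""
    return [e.text for e in doc.xpath(xpath)]

# Substring match on either value (first text node child only).
by_substring = texts("//*[contains(text(),'A') or contains(text(),'B')]")

# Exact match on either value; the comparison is case-sensitive.
by_exact = texts("//*[text()='exact_Text_1' or text()='exact_text_2']")

# Either attribute value.
by_class = texts("//*[@class='abc' or @class='pqr']")

print(by_substring, by_exact, by_class)
```

The same expression strings could then be passed to a Selenium XPath locator; that step is not shown here.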

Using Xpath Contains function to find element that contains text

Your XPath doesn't work because the string "Discipline" does not occur in the first text node child of the td element, but in a subsequent text node. You are using an XPath 1.0 processor (heaven only knows why Selenium doesn't move to something more modern), and in XPath 1.0 the rule is that contains(X, Y), when X is a set of nodes, only considers the first node in that set, and ignores the others.

I would have thought the closest fit to your stated requirement is something like //td[contains(., 'Discipline')]

XPath for an element that contains another element with certain text?

You can try

//button[contains(div,'Save')]

to locate a button with a child div that contains the specific text.
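
One caveat follows from the XPath 1.0 first-node rule described earlier: contains(div,'Save') tests only the first div child of each button. The sketch below, assuming Python with the third-party lxml library and hypothetical markup and id values, compares it with a nested predicate that tests every div child. The nested form is offered as an alternative, not as part of the original answer.

```python
from lxml import etree  # third-party: pip install lxml

# Hypothetical markup; the id attributes exist only to label the results.
doc = etree.fromstring(
    "<form>"
    "<button id='b1'><div>Save</div></button>"
    "<button id='b2'><div>Icon</div><div>Save as</div></button>"
    "<button id='b3'><div>Cancel</div></button>"
    "</form>"
)

def ids(xpath):
    """Return the id attribute of each button selected by the XPath."""
    return [b.get("id") for b in doc.xpath(xpath)]

# Only the first div child of each button is converted to a string.
first_div_only = ids("//button[contains(div,'Save')]")

# The inner predicate is evaluated for every div child.
any_div = ids("//button[div[contains(.,'Save')]]")

print(first_div_only, any_div)
```

Here the first expression misses the second button because its first div is "Icon"; the nested predicate finds it.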

Why does XPath contains() select an unexpected node?

The reason that your XPath,

//*[contains(.,'http')]//text()

selects a surprise second result is that this XPath says to select all elements whose string-value contains an "http" substring, and return all descendant text nodes. These elements include not just the immediate parent element of the targeted text node but its ancestors as well:

  1. The loc element, as you expected.
  2. The urlset and url elements too, as you did not expect. Their string-values also contain "http" (through the loc descendant), and because they have a 2019-08-07T15:01:51+00:00 descendant text node as well, //text() returns that node too.

Alternatives to achieve desired result

  • Narrow the * all-elements wildcard to a single, named element:

    //loc[contains(.,'http')]/text()
  • Narrow the * all-elements wildcard to multiple, named elements:

    //*[(self::loc or self::e2) and contains(.,'http')]/text()
  • Select all text nodes containing the substring, "http" as noted by Michael Kay:

    //text()[contains(., 'http')]
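
The surprise result and two of the alternatives can be reproduced with a short sketch, assuming Python with the third-party lxml library. The sitemap fragment is a hypothetical reconstruction: the URL, the lastmod element name, and the omission of the sitemap namespace are assumptions made to keep the example small.

```python
from lxml import etree  # third-party: pip install lxml

# Hypothetical, namespace-free reconstruction of the sitemap entry.
doc = etree.fromstring(
    "<urlset><url>"
    "<loc>http://example.com/</loc>"
    "<lastmod>2019-08-07T15:01:51+00:00</lastmod>"
    "</url></urlset>"
)

# urlset, url, and loc all match the predicate, so the descendant text
# nodes of all three are returned (each node once, in document order).
all_text = [str(t) for t in doc.xpath("//*[contains(.,'http')]//text()")]

# Narrowed to the named element.
loc_text = [str(t) for t in doc.xpath("//loc[contains(.,'http')]/text()")]

# Testing the text nodes themselves.
http_text = [str(t) for t in doc.xpath("//text()[contains(., 'http')]")]

print(all_text, loc_text, http_text)
```

In this sample, the first expression returns both the URL and the timestamp, while the other two return only the URL.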

See also

  • Testing text() nodes vs string values in XPath
  • Using XPATH, how to select ANY node that contains a certain string
  • How to use XPath contains() for specific text?

