Case insensitive XPath searching in PHP
Gordon's recommendation to use a PHP function from within XPath will prove more flexible should you choose to use that. However, contrary to his answer, the translate string function is available in XPath 1.0, so you can use it; your problem is how.
First, there is the obvious typo that Charles pointed out in his comment to the question. Then there is the logic of how you're trying to match the text values.
In word form, you are currently asking, "does the text contain the lowercase form of the keyword?" This is not really what you want to be asking. Instead, ask, "does the lowercase text contain the lowercase keyword?" Translating (pardon the pun) that back into XPath-land would be:
(Note: truncated alphabets for readability)
//line[contains(translate(text(),'ABC...Z','abc...z'),'chicago')]
The above lowercases the text contained within the line node, then checks that it (the lowercased text) contains the keyword chicago.
And now for the obligatory code snippet (but really, the above idea is what you really need to take home):
$xml    = simplexml_load_file($data);
$search = strtolower($keyword); // lowercase the keyword once, in PHP
// translate() lowercases the node text before contains() compares it
$nodes  = $xml->xpath("//line[contains(translate(text(), 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz'), '$search')]");
echo 'Got ' . count($nodes) . ' matches!' . PHP_EOL;
foreach ($nodes as $node) {
    echo $node . PHP_EOL;
}
Edit after dijon's comment
Inside the foreach, you could access the line number, chapter number and book name like below.
Line number -- this is just an attribute on the <line> element, which makes accessing it super-easy. There are two ways, with SimpleXML, of accessing it: $node['number'] or $node->attributes()->number (I prefer the former).
Chapter number -- to get at this, as you rightly said, we need to traverse up the tree. If we were using the DOM classes, we would have a handy $node->parentNode property leading us directly to the <chapter> (since it is the immediate ancestor to our <line>). SimpleXML does not have such a handy property, but we can use a relative XPath query to get it. The parent axis allows us to traverse up the tree.
Since xpath() returns an array, we can cheat and use current() to access the first (and only) item in the array returned from it. Then it is just a matter of accessing the number attribute as above.
// On PHP 5.4 and later this can also be written as: current(...)['number']
$chapter = current($node->xpath('./parent::chapter'))->attributes()->number;
Book name -- the process for this is the same as that of accessing the chapter number. A relative XPath query from the <line> could make use of the ancestor axis like ./ancestor::book (or ./parent::chapter/parent::book). Hopefully you can figure out how to access its name attribute.
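Putting the three lookups together, here is a minimal sketch. The sample XML, its <library> root and the book, chapter and line values are made up for illustration; it assumes the real file nests <line> inside <chapter> inside <book> the way the answer describes.

```php
<?php
// Hypothetical sample document; the real file's layout is assumed to match.
$xml = simplexml_load_string(<<<'XML'
<library>
  <book name="Travels">
    <chapter number="3">
      <line number="12">We arrived in Chicago at dawn.</line>
      <line number="13">The lake was calm.</line>
    </chapter>
  </book>
</library>
XML
);

$upper  = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ';
$lower  = 'abcdefghijklmnopqrstuvwxyz';
$search = strtolower('CHICAGO');

$results = [];
$query   = "//line[contains(translate(text(), '$upper', '$lower'), '$search')]";
foreach ($xml->xpath($query) as $node) {
    // xpath() returns an array; current() picks its first element.
    $chapter = current($node->xpath('./parent::chapter'));
    $book    = current($node->xpath('./ancestor::book'));

    // Cast to string so we store plain values, not SimpleXMLElement objects.
    $results[] = [
        'book'    => (string) $book['name'],
        'chapter' => (string) $chapter['number'],
        'line'    => (string) $node['number'],
    ];
}

foreach ($results as $r) {
    echo "{$r['book']}, chapter {$r['chapter']}, line {$r['line']}" . PHP_EOL;
}
```

The string casts matter if you want to compare or serialize the values later, since attribute access on SimpleXML hands back objects.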
How do I make XPath search case insensitive?
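XPath 1.0:
$qry = "//channel/item[contains(
    translate(., 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz'),
    translate('$search', 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz'))]";
XPath 2.0:
$qry = "//channel/item[lower-case(.) = lower-case('$search')]";
Both convert all upper case to lower case before comparing. PHP interpolates $search into the string, so it has to sit inside quotes to become a valid XPath string literal. Note that the 1.0 query is a substring match, while the 2.0 query as written tests for equality.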
How do I perform a case insensitive search for a node in PHP xpath?
There is no case conversion function in XPath 1.0, which is the version PHP supports (see http://jp.php.net/manual/en/class.domxpath.php).
You could use the translate function, shown below, in a very limited sense. Note: not recommended, as it only covers the characters you list, so it won't work for non-English characters unless you add them.
/items/*[translate(node(),'ABCDEFGHIJKLMNOPQRSTUVWXYZ','abcdefghijklmnopqrstuvwxyz') = 'item']/text()
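Update: node() selects the child nodes, so the expression above compares the element's content. If that does not work for you, or you want to match the element name itself, try name() instead.
You could also do a union, as below: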
/items/ITEM/text() | /items/item/text()
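For non-English characters, one alternative to translate() is to call a PHP function from inside the query, as mentioned at the top of this page. A rough sketch using DOMXPath::registerPhpFunctions() with mb_strtolower() follows; it assumes the mbstring extension is loaded and that both the document and the source file are UTF-8. The sample items are made up.

```php
<?php
// Hypothetical sample data with non-ASCII letters.
$doc = new DOMDocument();
$doc->loadXML('<items><item>Ærø</item><item>ÆRØ</item><item>Oslo</item></items>');

$xpath = new DOMXPath($doc);
// The php: prefix must be bound to this namespace before calling PHP functions.
$xpath->registerNamespace('php', 'http://php.net/xpath');
// Only expose the one function the query needs.
$xpath->registerPhpFunctions('mb_strtolower');

// Lowercase the search term in PHP, and the node text inside the query.
$needle = mb_strtolower('ærØ', 'UTF-8');
$nodes  = $xpath->query(
    "/items/item[php:functionString('mb_strtolower', ., 'UTF-8') = '$needle']"
);

$matches = [];
foreach ($nodes as $node) {
    $matches[] = $node->textContent;
}
echo count($matches) . ' match(es)' . PHP_EOL;
```

This works with the DOM classes rather than SimpleXML; if you already have a SimpleXMLElement, dom_import_simplexml() can bridge the two.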
How do I make an XPath 1.0 query case insensitive?
Basically, translate is used to convert the dynamic value that you need to compare to all lower-case (or all upper-case). In this case, you want to apply translate() to the rel attribute value, and compare the result to the lower-case literal "canonical" (formatted for readability):
//link[
translate(@rel, 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz') = 'canonical'
]
How can I use XPath to perform a case-insensitive search and support non-english characters?
In XPath 1.0 (which is, I believe, the best you can get with PHP SimpleXML), you'd have to use the translate() function to produce all-lowercase output from mixed-case input.
For convenience, I would wrap it in a function like this:
function findStopPointByName($xml, $query) {
    $upper = "ABCDEFGHIJKLMNOPQRSTUVWXYZÆØÅ"; // add any characters...
    $lower = "abcdefghijklmnopqrstuvwxyzæøå"; // ...that are missing
    // Lowercase both the element text and the search term inside XPath
    $arg_stopname = "translate(StopName, '$upper', '$lower')";
    $arg_query    = "translate('$query', '$upper', '$lower')";
    return $xml->xpath("//StopPoint[contains($arg_stopname, $arg_query)]");
}
As a sanitizing measure I would either completely forbid or escape single quotes in $query, because they will break your XPath string if they are ignored.
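XPath 1.0 has no escape character inside string literals, so one common workaround is to pick whichever quote style the value does not contain, and fall back to concat() when it contains both. Below is a sketch of that idea; xpathLiteral() is a hypothetical helper name and the stop names are made up.

```php
<?php
// Hypothetical helper: turn any PHP string into a safe XPath 1.0 string expression.
function xpathLiteral(string $value): string
{
    if (strpos($value, "'") === false) {
        return "'" . $value . "'";   // no single quotes: wrap in single quotes
    }
    if (strpos($value, '"') === false) {
        return '"' . $value . '"';   // no double quotes: wrap in double quotes
    }
    // Both kinds present: split on ' and glue the pieces back with concat().
    $parts = array_map(function ($part) {
        return "'" . $part . "'";
    }, explode("'", $value));
    return 'concat(' . implode(', "\'", ', $parts) . ')';
}

$upper = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ';
$lower = 'abcdefghijklmnopqrstuvwxyz';

// Made-up sample data containing an apostrophe.
$xml = simplexml_load_string(
    '<Stops>'
    . '<StopPoint><StopName>St. John\'s Gate</StopName></StopPoint>'
    . '<StopPoint><StopName>Central</StopName></StopPoint>'
    . '</Stops>'
);

$query = "JOHN'S";
$expr  = sprintf(
    "//StopPoint[contains(translate(StopName, '%s', '%s'), translate(%s, '%s', '%s'))]",
    $upper, $lower, xpathLiteral($query), $upper, $lower
);
$hits = $xml->xpath($expr);
echo count($hits) . ' stop(s) found' . PHP_EOL;
```

Note that the helper returns the quotes as part of its result, so the surrounding expression must not add its own quotes around it.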
How to perform case insensitive search in XPath?
VBScript supports only XPath 1.0 and not XQuery, so first edit your question title.
In XPath 1.0 the translate() function is used for case insensitivity.
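//*[contains(translate(., 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz') , '" & search & "')]/ancestor-or-self::*/*[local-name()='home' and @locale='en']
Where search = LCase(V_SAEARCH)
The lowercased value has to be concatenated into the query string as a quoted literal. Without the quotes, XPath would read search as the name of a child element rather than as your variable's value.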
Another way to write this is:
//*[contains(translate(., 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz') , translate('" & search & "', 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz'))]/ancestor-or-self::*/*[local-name()='home' and @locale='en']
Here the search variable is lowercased inside the XPath expression itself.
XPath/XML lowercase query methods
If you use
$query = $xmldoc->xpath('/products/product[contains(translate(name, "ABCDEFGHIJKLMNOPQRSTUVWXYZ", "abcdefghijklmnopqrstuvwxyz"),"desk")]');
you have a case insensitive comparison, at least for the letters the translate call converts.
case-insensitive matching in XPath?
XPath 2 has a lower-case (and upper-case) string function. That's not quite the same as case-insensitive, but hopefully it will be close enough:
//CD[lower-case(@title)='empire burlesque']
If you are using XPath 1, there is a hack using translate.
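For reference, a sketch of what that hack could look like from PHP for the same query. lowerExpr() is a hypothetical helper name, and the catalog below is made-up sample data; only ASCII letters are covered.

```php
<?php
// Hypothetical helper: wrap an XPath expression in an ASCII-only lowercase translate().
function lowerExpr(string $expr): string
{
    return "translate($expr, 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz')";
}

// Made-up sample catalog.
$xml = simplexml_load_string(
    '<catalog>'
    . '<CD title="Empire Burlesque"/>'
    . '<CD title="EMPIRE BURLESQUE"/>'
    . '<CD title="Hide Your Heart"/>'
    . '</catalog>'
);

// XPath 1.0 equivalent of //CD[lower-case(@title)='empire burlesque']
$cds = $xml->xpath('//CD[' . lowerExpr('@title') . " = 'empire burlesque']");

foreach ($cds as $cd) {
    echo $cd['title'] . PHP_EOL;
}
```

Keeping the alphabet strings in one helper avoids retyping them (and mistyping them) in every query.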