Can Nokogiri Use Single Quotes for Attributes on Saving Xml

Can nokogiri use single quotes for attributes on saving xml?

It looks like the answer is no; not as the library is currently written, and maybe not at all. Tracing the call path for a node's serialization:

  • Nokogiri::XML::Node#to_s calls to_xml
  • Nokogiri::XML::Node#to_xml calls serialize (sets a few default options)
  • Nokogiri::XML::Node#serialize calls write_to
  • Nokogiri::XML::Node#write_to calls native_write_to
  • Nokogiri::XML::Node#native_write_to does the actual work, and looks like this:

def native_write_to(io, encoding, indent_string, options)
  set_xml_indent_tree_output 1
  set_xml_tree_indent_string indent_string
  savectx = LibXML.xmlSaveToIO(IoCallbacks.writer(io), nil, nil, encoding, options)
  LibXML.xmlSaveTree(savectx, cstruct)
  LibXML.xmlSaveClose(savectx)
  io
end

So, you are at the mercy of libxml at this point. Googling for libxml serialize single quote attributes does not immediately turn up any smoking guns.

I think you should file a feature request and see what sort of tenderlovin' you can get. :)

Can Nokogiri retain attribute quoting style?

No, it cannot. There is no information stored in a Nokogiri::XML::Attr (nor the underlying data structure in libxml2) about what type of quotes were (or should be) used to delimit an attribute. As such, all serialization (done by libxml2) uses the same attribute quoting style.

Indeed, this information is not even properly retained within the XML Information Set, as described by the specs:

Appendix D: What is not in the Information Set


The following information is not represented in the current version of the XML Information Set (this list is not intended to be exhaustive):

[...]

17) The kind of quotation marks (single or double) used to quote attribute values.

The good news is that the two XML serialization styles describe the exact same content. The bad news is that unless you're using a Canonical XML Serialization (which Nokogiri has only recently become able to produce), there is a large variety of ways to represent the same document, and they would look like many spurious 'changes' to a standard text-diffing tool.

Perhaps if you can describe why you wanted this functionality (what is the end goal you are trying to accomplish?) we could help you further.

You might also be interested in this similar question.

PHP DOM and single quotes

No. DOMDocument is a data-oriented access API for XML. And it serializes the documents however it wants to.

There is no flag for ->save() among PHP's libxml constants (http://www.php.net/manual/en/libxml.constants.php) to accomplish it. Other language bindings don't allow it either; see "Can nokogiri use single quotes for attributes on saving xml?" above.

This is because libxml itself provides no means to override this. libxml2/xmlsave.h and related headers mention no quote-style flags. So, I'm afraid you're really out of luck.

Escape single quote in XPath with Nokogiri?

XPath doesn’t have any way of escaping special characters, so this is a little tricky. A solution in this specific case would be to use double quotes instead of single quotes in the XPath expression:

text()="Frank's car"

If you did this, you’d have to escape the quotes from Ruby if you used double quotes around the whole expression:

"//li[text()=\"Frank's car\"]"

You could use single quotes here if you aren’t doing any interpolation, and then escape the single quote:

'//li[text()="Frank\'s car"]'

A better option would perhaps be to make use of Ruby’s flexible quoting, so that none of the quotes need escaping, e.g.:

%{//li[text()="Frank's car"]}

Note that all the examples here do their escaping in Ruby, so the string that reaches the XPath processor is always //li[text()="Frank's car"].

The more general case, where the text is a variable that could contain single or double quotes, is more difficult. An XPath string literal can’t contain both types of quotes; you need to construct the string using the XPath concat function.
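As a quick check of the point above, here is a minimal sketch in plain Ruby (no Nokogiri needed) showing that the three quoting styles produce the same string. The variable names are just for illustration.

```ruby
# Three ways of writing the same XPath expression as a Ruby string.
escaped_double = "//li[text()=\"Frank's car\"]"  # double-quoted, inner double quotes escaped
escaped_single = '//li[text()="Frank\'s car"]'   # single-quoted, apostrophe escaped
percent_form   = %{//li[text()="Frank's car"]}   # %{} literal, nothing to escape

puts percent_form
# prints: //li[text()="Frank's car"]

puts escaped_double == escaped_single && escaped_single == percent_form
# prints: true
```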

For example, if you wanted to match the string "That's mine", he said., you would need to do something like:

text()=concat('"That', "'", 's mine", he said.')

And then you’d have to escape the quotes from Ruby (using %{} would be easiest).
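If the text comes from a variable, building the concat call by hand gets tedious. Below is a rough sketch of a helper that does it for you. Note that xpath_literal is a hypothetical name of my own, not something Nokogiri provides; it only builds a string and assumes XPath 1.0 quoting rules.

```ruby
# Hypothetical helper (not part of Nokogiri): turn an arbitrary Ruby string
# into an XPath 1.0 string expression.
def xpath_literal(text)
  # No single quotes in the text: single-quote the whole thing.
  return "'#{text}'" unless text.include?("'")
  # Single quotes but no double quotes: double-quote the whole thing.
  return "\"#{text}\"" unless text.include?('"')

  # Both kinds present: split on single quotes (keeping them as pieces),
  # wrap each lone single quote in double quotes and every other piece in
  # single quotes, then glue the pieces together with concat().
  parts = text.split(/(')/).reject(&:empty?).map do |part|
    part == "'" ? %q{"'"} : "'#{part}'"
  end
  "concat(#{parts.join(', ')})"
end

puts xpath_literal("Frank's car")
# prints: "Frank's car"

puts xpath_literal(%q{"That's mine", he said.})
# prints: concat('"That', "'", 's mine", he said.')
```

You would then interpolate the result into the expression, for example "//li[text()=#{xpath_literal(value)}]", where value is whatever string you are matching.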

I found another question on SO dealing with this issue in C#, and a thread on the Nokogiri mailing list, both of which might be worth looking at if you need to take this further.

How can I get nokogiri to select node attributes and add them to other nodes?

next_sibling should do the job:

require 'rubygems'
require 'nokogiri'

frag = Nokogiri::XML(DATA)
frag.css('title').each { |t| t['id'] = "ID#{t.next_sibling.next_sibling['number']}" }
puts frag.to_xml

__END__
<root>
<title>Section X</title>
<paragraph number="1">Stuff</paragraph>
<title>Section Y</title>
<paragraph number="2">Stuff</paragraph>
</root>

Because whitespace is also a node, you have to call next_sibling twice. Nokogiri's next_element, which skips text nodes and returns the next element sibling, avoids this.

Alternatively, you can use an XPath expression to select the number attribute of the next paragraph:

t['id'] = "ID#{t.xpath('following-sibling::paragraph/@number').first}"

Ruby: How do I get attribute values from XML with Nokogiri?

require 'rubygems'
require 'nokogiri'

string = %Q{
<?xml version="1.0" encoding="UTF-8"?>
<response status="ok" permission_level="admin" message="ready to use" cached="0">
<title>kit</title>
</response>
}

doc = Nokogiri::XML(string)
doc.css("response").each do |response_node|
  puts response_node["message"]
end

Save and run this Ruby file, and you will get:

#=> ready to use

Nokogiri: How to select the value of an attribute that contains periods in its id?

You didn't show how you are parsing your document, but if I parse it as HTML and then use single quotes around the attribute value in the css selector, I can get the tag:

require 'nokogiri'

html = <<END_OF_HTML
<td data-reactid="hello">10</td>
<td data-reactid=".3.3.1:$contract_23.$=1$dataRow:0.1">94.280</td>
<td data-reactid="goodbye">20</td>
END_OF_HTML

html_doc = Nokogiri::HTML(html)

html_doc.css("td[data-reactid='.3.3.1:$contract_23.$=1$dataRow:0.1']").each do |tag|
  puts tag.text
end

--output:--
94.280

Check out the Mothereffing Unquoted Attribute Value Validator via this SO post:

CSS attribute selectors: The rules on quotes (", ' or none?)

How to use variable in Nokogiri contains

"main/key:contains(#{var})"

interpolates to

"main/key:contains(yyy)"

Note the absence of quotes. You want this:

"main/key:contains(\"#{var}\")"

or more prettily

%Q{main/key:contains("#{var}")}

Some careful escaping would also help if you are not sure about the content of var.
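One cautious way to handle unknown content, sketched below in plain Ruby, is to pick whichever quote character the value does not contain and refuse values that contain both. quote_for_selector is a hypothetical helper name, not a Nokogiri method, and whether backslash-escaped quotes inside :contains() would work instead is something I have not verified, so this sketch avoids relying on it.

```ruby
# Hypothetical helper (not part of Nokogiri): wrap a value in a quote
# character that does not occur inside the value.
def quote_for_selector(value)
  return %("#{value}") unless value.include?('"')  # safe to use double quotes
  return "'#{value}'" unless value.include?("'")   # fall back to single quotes

  # Both quote types present: give up rather than build a broken selector.
  raise ArgumentError, "value contains both quote types: #{value.inspect}"
end

var = "yyy"
selector = "main/key:contains(#{quote_for_selector(var)})"
puts selector
# prints: main/key:contains("yyy")
```

For values that contain both kinds of quotes, an XPath expression built with concat is probably the safer route.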


