Any yaml libraries in Python that support dumping of long strings as block literals or folded blocks?
import yaml

class folded_unicode(unicode): pass
class literal_unicode(unicode): pass

def folded_unicode_representer(dumper, data):
    return dumper.represent_scalar(u'tag:yaml.org,2002:str', data, style='>')
def literal_unicode_representer(dumper, data):
    return dumper.represent_scalar(u'tag:yaml.org,2002:str', data, style='|')

yaml.add_representer(folded_unicode, folded_unicode_representer)
yaml.add_representer(literal_unicode, literal_unicode_representer)

data = {
    'literal':literal_unicode(
        u'by hjw ___\n'
        ' __ /.-.\\\n'
        ' / )_____________\\\\ Y\n'
        ' /_ /=== == === === =\\ _\\_\n'
        '( /)=== == === === == Y \\\n'
        ' `-------------------( o )\n'
        ' \\___/\n'),
    'folded': folded_unicode(
        u'It removes all ordinary curses from all equipped items. '
        'Heavy or permanent curses are unaffected.\n')}

print yaml.dump(data)
The result:
folded: >
  It removes all ordinary curses from all equipped items. Heavy or permanent curses
  are unaffected.
literal: |
  by hjw ___
   __ /.-.\
   / )_____________\\ Y
   /_ /=== == === === =\ _\_
  ( /)=== == === === == Y \
   `-------------------( o )
   \___/
For completeness, one should also have str implementations, but I'm going to be lazy :-)
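For what it's worth, here is a rough sketch of what the str-based variant could look like on Python 3, where every string is already unicode. The class names FoldedStr and LiteralStr and the sample text are made up for illustration; only represent_scalar and add_representer come from PyYAML.

```python
import yaml


class FoldedStr(str):
    """Marker type (hypothetical name): dump as a folded block scalar ('>')."""


class LiteralStr(str):
    """Marker type (hypothetical name): dump as a literal block scalar ('|')."""


def represent_folded(dumper, data):
    return dumper.represent_scalar('tag:yaml.org,2002:str', data, style='>')


def represent_literal(dumper, data):
    return dumper.represent_scalar('tag:yaml.org,2002:str', data, style='|')


# Registered on the default Dumper, which is what yaml.dump() uses.
yaml.add_representer(FoldedStr, represent_folded)
yaml.add_representer(LiteralStr, represent_literal)

data = {
    'literal': LiteralStr('first line\nsecond line\n'),
    'folded': FoldedStr(
        'It removes all ordinary curses from all equipped items. '
        'Heavy or permanent curses are unaffected.\n'),
}

dumped = yaml.dump(data)
print(dumped)

# Loading gives back ordinary str values with the same content.
reloaded = yaml.safe_load(dumped)
```

Only values wrapped in one of the marker classes get a block style; plain str values are still dumped the way PyYAML would normally choose.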
How do I break a string in YAML over multiple lines?
Use YAML's folded style. The indentation of each line is ignored, and a line break is inserted at the end.
Key: >
  This is a very long sentence
  that spans several lines in the YAML
  but which will be rendered as a string
  with only a single carriage return appended to the end.
http://symfony.com/doc/current/components/yaml/yaml_format.html
You can use the "block chomping indicator" to eliminate the trailing line break, as follows:
Key: >-
  This is a very long sentence
  that spans several lines in the YAML
  but which will be rendered as a string
  with NO carriage returns.
In either case, each line break is replaced by a space.
There are other control tools available as well (for controlling indentation for example).
See https://yaml-multiline.info/
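If you want to check the folding and chomping behaviour yourself, a small sketch like the following loads the different forms with PyYAML and prints what the parser actually returns. The key names clip, strip and literal are just labels chosen for this example.

```python
import yaml

doc = """\
clip: >
  This is a very long sentence
  that spans several lines in the YAML.
strip: >-
  This is a very long sentence
  that spans several lines in the YAML.
literal: |
  These line breaks
  are kept as they are.
"""

loaded = yaml.safe_load(doc)

# repr() makes the trailing newline (or its absence) visible.
for key, value in loaded.items():
    print(key, repr(value))
```

The folded values come back as a single line with spaces where the line breaks were; ">" leaves one trailing newline and ">-" leaves none, while "|" keeps the inner line breaks.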
How to dump strings in YAML using literal scalar style?
require 'psych'

# Construct an AST
visitor = Psych::Visitors::YAMLTree.new({})
visitor << DATA.read
ast = visitor.tree

# Find all scalars and modify their formatting
ast.grep(Psych::Nodes::Scalar).each do |node|
  node.plain = false
  node.quoted = true
  node.style = Psych::Nodes::Scalar::LITERAL
end

begin
  # Call the `yaml` method on the ast to convert to yaml
  puts ast.yaml
rescue
  # The `yaml` method was introduced in later versions, so fall back to
  # constructing a visitor
  Psych::Visitors::Emitter.new($stdout).accept ast
end

__END__
{
  "page": 1,
  "results": [
    "item", "another"
  ],
  "total_pages": 0
}
How can I control what scalar form PyYAML uses for my data?
Based on the answer to "Any yaml libraries in Python that support dumping of long strings as block literals or folded blocks?":
import yaml
from collections import OrderedDict

class quoted(str):
    pass

def quoted_presenter(dumper, data):
    return dumper.represent_scalar('tag:yaml.org,2002:str', data, style='"')
yaml.add_representer(quoted, quoted_presenter)

class literal(str):
    pass

def literal_presenter(dumper, data):
    return dumper.represent_scalar('tag:yaml.org,2002:str', data, style='|')
yaml.add_representer(literal, literal_presenter)

def ordered_dict_presenter(dumper, data):
    return dumper.represent_dict(data.items())
yaml.add_representer(OrderedDict, ordered_dict_presenter)

d = OrderedDict(short=quoted("Hello"), long=literal("Line1\nLine2\nLine3\n"))
print(yaml.dump(d))
Output
short: "Hello"
long: |
  Line1
  Line2
  Line3
yaml.dump adding unwanted newlines in multiline strings
If that is the only thing going into your YAML file, you can dump with the option default_style='|', which gives you literal block style for all of your scalars (probably not what you want).
Your string contains no special characters (which would need \ escaping and double quotes), but because of the newlines PyYAML decides to represent it single-quoted. In single-quoted style, a double newline is the way to represent a single newline that occurred in the string being represented. This gets "undone" on loading, but is indeed not very readable.
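To see both behaviours side by side, here is a minimal sketch using a made-up two-line value (the key c and the text are placeholders, not the key from the question). It dumps the same data with the defaults and with default_style='|', then checks that both load back unchanged.

```python
import yaml

data = {'c': 'first line\nsecond line\n'}

# Default behaviour: the multi-line value is written single-quoted, and every
# newline in the value shows up as a blank line inside the quotes.
quoted = yaml.safe_dump(data)
print(quoted)

# default_style='|' requests literal block style for every scalar it can be
# used for, not only the multi-line ones.
literal = yaml.safe_dump(data, default_style='|')
print(literal)

# Both forms load back to the original value.
assert yaml.safe_load(quoted) == data
assert yaml.safe_load(literal) == data
```

The second dump also changes how the keys and any non-string scalars are written, which is why it is rarely what you want for a whole document.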
If you want to get block style literals on an individual basis, you can do one of several things:
Adapt the Representer to output all strings with embedded newlines using the literal block scalar style (assuming they don't need \ escaping of special characters, which will force double quotes):
import sys
import yaml
x = u"""\
-----BEGIN RSA PRIVATE KEY-----
MIIEogIBAAKCAQEA6oySC+8/N9VNpk0gJS7Gk8vn9sYN7FhjpAQnoHRqTN/Oaiyx
xk2AleP2vXpojA/DHldT1JO+o3j56AHD+yfNFFeYvgWKDY35g49HsZZhbyCEAB45
...
"""
yaml.SafeDumper.org_represent_str = yaml.SafeDumper.represent_str
def repr_str(dumper, data):
    if '\n' in data:
        return dumper.represent_scalar(u'tag:yaml.org,2002:str', data, style='|')
    return dumper.org_represent_str(data)
yaml.add_representer(str, repr_str, Dumper=yaml.SafeDumper)
yaml.safe_dump(dict(a=1, b='hello world', c=x), sys.stdout)
Make a subclass of str that has its own special representer:
import sys
import yaml
class PSS(str):
    pass
x = PSS("""\
-----BEGIN RSA PRIVATE KEY-----
MIIEogIBAAKCAQEA6oySC+8/N9VNpk0gJS7Gk8vn9sYN7FhjpAQnoHRqTN/Oaiyx
xk2AleP2vXpojA/DHldT1JO+o3j56AHD+yfNFFeYvgWKDY35g49HsZZhbyCEAB45
...
""")
def pss_representer(dumper, data):
    style = '|'
    # if sys.version_info < (3,) and not isinstance(data, unicode):
    #     data = unicode(data, 'ascii')
    tag = u'tag:yaml.org,2002:str'
    return dumper.represent_scalar(tag, data, style=style)
yaml.add_representer(PSS, pss_representer, Dumper=yaml.SafeDumper)
yaml.safe_dump(dict(a=1, b='hello world', c=x), sys.stdout)
Use ruamel.yaml:
import sys
from ruamel.yaml import YAML
from ruamel.yaml.scalarstring import PreservedScalarString as pss
x = pss("""\
-----BEGIN RSA PRIVATE KEY-----
MIIEogIBAAKCAQEA6oySC+8/N9VNpk0gJS7Gk8vn9sYN7FhjpAQnoHRqTN/Oaiyx
xk2AleP2vXpojA/DHldT1JO+o3j56AHD+yfNFFeYvgWKDY35g49HsZZhbyCEAB45
...
""")
yaml = YAML()
yaml.dump(dict(a=1, b='hello world', c=x), sys.stdout)
All of these give:
a: 1
b: hello world
c: |
  -----BEGIN RSA PRIVATE KEY-----
  MIIEogIBAAKCAQEA6oySC+8/N9VNpk0gJS7Gk8vn9sYN7FhjpAQnoHRqTN/Oaiyx
  xk2AleP2vXpojA/DHldT1JO+o3j56AHD+yfNFFeYvgWKDY35g49HsZZhbyCEAB45
  ...
Please note that it is not necessary to specify default_flow_style=False, as literal scalars can only appear in block style.
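A quick way to convince yourself of this is the sketch below. It assumes PyYAML, where passing default_flow_style=None lets the dumper choose the collection style itself (as far as I know that was the default before PyYAML 5.1). LiteralStr is a hypothetical marker class, like PSS above.

```python
import yaml


class LiteralStr(str):
    """Hypothetical marker type for values to dump in literal block style."""


def represent_literal(dumper, data):
    return dumper.represent_scalar('tag:yaml.org,2002:str', data, style='|')


yaml.add_representer(LiteralStr, represent_literal, Dumper=yaml.SafeDumper)

plain = dict(a=1, b='hello world')
with_literal = dict(a=1, b='hello world', c=LiteralStr('line 1\nline 2\n'))

# With default_flow_style=None, a mapping that holds only plain scalars is
# written in flow style ({...}); one holding a literal scalar is not.
flow_out = yaml.safe_dump(plain, default_flow_style=None)
block_out = yaml.safe_dump(with_literal, default_flow_style=None)
print(flow_out)
print(block_out)
```

The first mapping comes out on one line in braces, the second in block style with the literal scalar under c, without default_flow_style=False being set anywhere.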
Change the scalar style used for all multi-line strings when serialising a dynamic model using YamlDotNet
To answer my own question, I've now worked out how to do this by deriving from the ChainedEventEmitter class and overriding void Emit(ScalarEventInfo eventInfo, IEmitter emitter). See code sample below.
public class MultilineScalarFlowStyleEmitter : ChainedEventEmitter
{
    public MultilineScalarFlowStyleEmitter(IEventEmitter nextEmitter)
        : base(nextEmitter) { }

    public override void Emit(ScalarEventInfo eventInfo, IEmitter emitter)
    {
        if (typeof(string).IsAssignableFrom(eventInfo.Source.Type))
        {
            string value = eventInfo.Source.Value as string;
            if (!string.IsNullOrEmpty(value))
            {
                bool isMultiLine = value.IndexOfAny(new char[] { '\r', '\n', '\x85', '\x2028', '\x2029' }) >= 0;
                if (isMultiLine)
                    eventInfo = new ScalarEventInfo(eventInfo.Source)
                    {
                        Style = ScalarStyle.Literal
                    };
            }
        }
        nextEmitter.Emit(eventInfo, emitter);
    }
}