Adding Comment to Yaml Programmatically

require 'yaml'

str = <<-eol
root:
  label: 'Test'
  account: 'Account'
  add: 'Add'
  local_folder: 'Local folder'
  remote_folder: 'Remote folder'
  status: 'Status'
  subkey: 'Some value'
eol

h = YAML.load(str)
h["root"]["local_folder"] = h["root"]["local_folder"] + " !Test comment"
h["root"]["subkey"] = h["root"]["subkey"] + " !Test comment"

puts h.to_yaml

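# >> ---
# >> root:
# >>   label: Test
# >>   account: Account
# >>   add: Add
# >>   local_folder: Local folder !Test comment
# >>   remote_folder: Remote folder
# >>   status: Status
# >>   subkey: Some value !Test comment

Note that this appends the text to the string value itself. It is not a YAML comment (those start with #), so it is read back as part of the value.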

EDIT

more programmatically:

require 'yaml'

str = <<-eol
root:
  label: 'Test'
  account: 'Account'
  add: 'Add'
  local_folder: 'Local folder'
  remote_folder: 'Remote folder'
  status: 'Status'
  subkey: 'Some value'
eol

h = YAML.load(str)
%w(local_folder subkey).each {|i| h["root"][i] = h["root"][i] + " !Test comment" }

puts h.to_yaml

# >> ---
# >> root:
# >>   label: Test
# >>   account: Account
# >>   add: Add
# >>   local_folder: Local folder !Test comment
# >>   remote_folder: Remote folder
# >>   status: Status
# >>   subkey: Some value !Test comment

How to write comments in the YAML programmatically in OpenCV?

You can use the function:

/* writes a comment */
CVAPI(void) cvWriteComment( CvFileStorage* fs, const char* comment, int eol_comment );

This is a working example:

#include <opencv2/opencv.hpp>
#include <iostream>

using namespace cv;
using namespace std;

int main()
{
    {
        // Write: each comment is emitted before the value that follows it
        FileStorage fs("test.yml", FileStorage::WRITE);

        cvWriteComment(*fs, "a double value", 0);
        fs << "dbl" << 2.0;

        cvWriteComment(*fs, "a\nvery\nimportant\nstring", 0);
        fs << "str" << "Multiline comments work, too!";
    }

    {
        // Read the values back from the same file
        double d;
        string s;
        FileStorage fs("test.yml", FileStorage::READ);
        fs["dbl"] >> d;
        fs["str"] >> s;
    }

    return 0;
}

The test.yml file:

%YAML:1.0
# a double value
dbl: 2.
# a
# very
# important
# string
str: "Multiline comments work, too!"
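
The multi-line behaviour visible in test.yml can be sketched independently of OpenCV. The Python helper below is hypothetical (format_comment is not an OpenCV function) and only mimics the visible result, not OpenCV's implementation: one "# " prefixed line per line of the comment text.

```python
def format_comment(text):
    """Turn a possibly multi-line string into a list of YAML comment lines."""
    # Each line of the input becomes its own '# ...' line.
    return ["# " + line for line in text.split("\n")]


if __name__ == "__main__":
    for line in format_comment("a\nvery\nimportant\nstring"):
        print(line)
```

Splitting on the newline first is what keeps every line of the comment behind its own # marker, so no part of the text can leak into the document as data.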

Adding comments to YAML produced with PyYaml

You probably have some representer for the MyObj class, as by default dumping ( print(yaml.dump(MyObj())) ) with PyYAML will give you:

!!python/object:__main__.MyObj {}

PyYAML can only do one thing with the comments in your desired output: discard them. If you read that desired output back in, you end up with a dict containing a dict ({'boby': {'age': 34}}); you would not get a MyObj() instance because there is no tag information.

The enhanced version of PyYAML that I developed (ruamel.yaml) can read in YAML with comments, preserve the comments, and write comments when dumping.
If you read your desired output, the resulting data will look (and act) like a dict containing a dict, but in reality it is a more complex data structure that can handle the comments. You can, however, create that structure when ruamel.yaml asks you to dump an instance of MyObj, and if you add the comments at that time, you will get your desired output.

from __future__ import print_function

import sys
import ruamel.yaml
from ruamel.yaml.comments import CommentedMap


class MyObj():
    name = "boby"
    age = 34

    def convert_to_yaml_struct(self):
        # build the commented structure from this instance's attributes
        x = CommentedMap()
        a = CommentedMap()
        x[self.name] = a
        x.yaml_add_eol_comment('this is the name', self.name, 11)
        a['age'] = self.age
        a.yaml_add_eol_comment('in years', 'age', 11)
        return x

    @staticmethod
    def yaml_representer(dumper, data, flow_style=False):
        assert isinstance(dumper, ruamel.yaml.RoundTripDumper)
        return dumper.represent_dict(data.convert_to_yaml_struct())


ruamel.yaml.RoundTripDumper.add_representer(MyObj, MyObj.yaml_representer)

ruamel.yaml.round_trip_dump(MyObj(), sys.stdout)

Which prints:

boby:      # this is the name
  age: 34  # in years

There is no need to wait with creating the CommentedMap instances until you want to represent the MyObj instance. I would, for example, make name and age into properties that get/set values from/on the appropriate CommentedMap. That way you could more easily add the comments before the yaml_representer static method is called to represent the MyObj instance.
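
A minimal sketch of that property-based layout follows. It is an illustration under assumptions, not code from ruamel.yaml: the mapping_type parameter and the _outer/_inner attribute names are made up for this example, and a plain dict stands in for CommentedMap so that the sketch runs without ruamel.yaml installed.

```python
class MyObj:
    """Keeps its state in mappings that can be handed to the dumper as-is."""

    def __init__(self, name, age, mapping_type=dict):
        # mapping_type is an assumption of this sketch: pass CommentedMap
        # when ruamel.yaml is available; dict keeps the example stdlib-only.
        self._outer = mapping_type()
        self._inner = mapping_type()
        self._name = name
        self._outer[name] = self._inner
        self.age = age  # goes through the property setter below

    @property
    def name(self):
        return self._name

    @name.setter
    def name(self, value):
        # Re-key the outer mapping so it always reflects the current name.
        self._outer[value] = self._outer.pop(self._name)
        self._name = value

    @property
    def age(self):
        return self._inner["age"]

    @age.setter
    def age(self, value):
        self._inner["age"] = value

    def convert_to_yaml_struct(self):
        # Nothing to build here: the structure already exists.
        return self._outer
```

Because the mappings exist from construction onwards, comments could be attached to them at any time before dumping, and convert_to_yaml_struct only has to return what is already there.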

Ruby Appending comment block to YAML file

That the comments are lost with a dump is unfortunately normal.
You have two options:

  1. Convert your versions hash to YAML ({ :a => 'b' }.to_yaml), add the comments, and do the dump yourself with File.write; you could override the normal
    .dump method in YAML this way.
  2. Assign the comments to some dummy value at the end of your YAML file
    so that they are read into versions and saved as well.
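
Option 1 amounts to treating the dumped YAML as text. The sketch below shows the idea in Python rather than Ruby; add_comments is a hypothetical helper, and it assumes the dump only contains simple "key: value" lines with plain keys.

```python
def add_comments(yaml_text, comments):
    """Insert a '# ...' line before each line whose key is in `comments`.

    Only plain `key: value` lines are recognised; the comment takes the
    indentation of the key it precedes.
    """
    out = []
    for line in yaml_text.splitlines():
        stripped = line.lstrip()
        key = stripped.split(":", 1)[0]
        if ":" in stripped and key in comments:
            indent = line[: len(line) - len(stripped)]
            out.append(indent + "# " + comments[key])
        out.append(line)
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    dumped = "---\nversions:\n  a: b\n  c: d\n"
    print(add_comments(dumped, {"c": "pinned"}), end="")
```

The result is a string that can be written to the file directly, which is the "do the dump yourself" step of option 1.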

Insert a comment before a key in Ruamel.yaml

Yes that is possible, as you can check by doing a round-trip:

import sys
import ruamel.yaml

with open('your_input.yaml') as fp:
    data = ruamel.yaml.round_trip_load(fp)
ruamel.yaml.round_trip_dump(data, sys.stdout)

The printed output will match your input, so somehow the comment is inserted into the data hierarchy of structures, preserved, and written out when dumping.

In ruamel.yaml the comments are attached to wrapper classes for lists or dicts, which you can check with print(type(data['a'])): it is a CommentedMap (from ruamel.yaml.comments). The comment information for the value of a hangs off an attribute _yaml_comment that you can access via the property ca:

cm = data['a']
print(cm.ca)

gives:

items={'e': [None, [CommentToken(value='# This is my comment\n')], None, None]})

This shows the comment is associated with the key e, which follows the comment. Unfortunately the CommentToken cannot just be created by calling it the way it is represented (i.e. CommentToken(value='# This is my comment\n')); it needs a little more work, as it needs at least a start Mark.

There is no "helper" routine to create such a comment, but by looking at CommentedMap and its base class CommentedBase you can come up with the following ¹:

import sys
import ruamel.yaml

if not hasattr(ruamel.yaml.comments.CommentedMap, "yaml_set_comment_before_key"):
    def my_yaml_set_comment_before_key(self, key, comment, column=None,
                                       clear=False):
        """
        append comment to list of comment lines before key, '# ' is inserted
            before the comment
        column: determines indentation, if not specified take indentation from
            previous comment, otherwise defaults to 0
        clear: if True removes any existing comments instead of appending
        """
        key_comment = self.ca.items.setdefault(key, [None, [], None, None])
        if clear:
            key_comment[1] = []
        comment_list = key_comment[1]
        if comment:
            comment_start = '# '
            if comment[-1] == '\n':
                comment = comment[:-1]  # strip final newline if there
        else:
            comment_start = '#'
        if column is None:
            if comment_list:
                # if there already are other comments get the column from them
                column = comment_list[-1].start_mark.column
            else:
                column = 0
        start_mark = ruamel.yaml.error.Mark(None, None, None, column, None, None)
        comment_list.append(ruamel.yaml.tokens.CommentToken(
            comment_start + comment + '\n', start_mark, None))
        return self

    ruamel.yaml.comments.CommentedMap.yaml_set_comment_before_key = \
        my_yaml_set_comment_before_key

With CommentedMap extended with this method you can then do:

yaml_str = """\
a:
  b: banana
  c: apple
  d: orange
  e: pear
"""

data = ruamel.yaml.round_trip_load(yaml_str)
cm = data['a']

cm.yaml_set_comment_before_key('e', "This is Alex' comment", column=2)
cm.yaml_set_comment_before_key('e', 'and this mine')
ruamel.yaml.round_trip_dump(data, sys.stdout)

to get:

a:
  b: banana
  c: apple
  d: orange
  # This is Alex' comment
  # and this mine
  e: pear

Unless you read in a comment, there is no way to query cm for the column the comment should be in to align it with the key e (that column is determined when writing out the data structure). You might be tempted to store a special value (-1?) and try to determine this during output, but you have little context while streaming out. You can of course set the column to the nesting level (1) multiplied by the indent (the one you give to round_trip_dump, which defaults to 2).
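
The arithmetic in that last sentence can be written down as a small sketch. Both function names are hypothetical and the code does not touch ruamel.yaml; it only shows how nesting level and indent give the column.

```python
def comment_column(nesting_level, indent=2):
    """Column at which a key nested `nesting_level` deep starts."""
    return nesting_level * indent


def comment_line(text, nesting_level, indent=2):
    """Build a comment line aligned with keys at the given nesting level."""
    return " " * comment_column(nesting_level, indent) + "# " + text


if __name__ == "__main__":
    # key e sits one level below a, so with the default indent its column is 2
    print(comment_column(1))
    print(comment_line("and this mine", 1))
```

The value returned by comment_column(1) matches the column=2 passed explicitly in the example above.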

The comment facilities were meant for preservation in round-tripping, and not initially for modifying comments or inserting new ones, so the interface is not guaranteed to be stable. With that in mind, make sure you create a single routine or a set of routines around yaml_set_comment_before_key() to make your changes, so you only have a single module to update if the interface changes (the capability of attaching a comment will not go away; the method of doing so might, however, change).
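
One way such a single routine could look is sketched below. add_comment_before is a hypothetical name, and the fallback branch assumes a ruamel.yaml release that provides yaml_set_comment_before_after_key(key, before=..., indent=...); check this against the version you have installed.

```python
def add_comment_before(mapping, key, text, column=0):
    """Single entry point for attaching a comment before `key`.

    Callers only use this function, so a change in the underlying
    interface has to be handled in one place.
    """
    if hasattr(mapping, "yaml_set_comment_before_key"):
        # the method patched onto CommentedMap earlier in this answer
        mapping.yaml_set_comment_before_key(key, text, column=column)
    elif hasattr(mapping, "yaml_set_comment_before_after_key"):
        # assumed to be available in later ruamel.yaml releases
        mapping.yaml_set_comment_before_after_key(key, before=text, indent=column)
    else:
        raise TypeError("mapping cannot carry comments")
    return mapping
```

With this in place the earlier call would read add_comment_before(cm, 'e', 'and this mine', column=2).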


¹ Maybe not you, but since I am the author of ruamel.yaml, I should be able to find my way in the underdocumented code.

Save/dump a YAML file with comments in PyYAML

PyYAML throws away comments at a very low level (in Scanner.scan_to_next_token).

While you could adapt or extend it to handle comments in its whole stack, this would be a major modification. Dumping (=emitting) comments seems to be easier and was discussed in ticket 114 on the old PyYAML bug tracker.

As of 2020, the feature request about adding support for loading comments is still stalled.

Modify existing yaml file and add new data and comments

First, let me start off by saying that using yaml.Node does not produce valid YAML when marshalled from valid YAML, as shown by the following example. This probably warrants filing an issue.

package main

import (
	"fmt"
	"log"

	"gopkg.in/yaml.v3"
)

var (
	sourceYaml = `version: 1
type: verbose
kind : bfr

# my list of applications
applications:

# First app
  - name: app1
    kind: nodejs
    path: app1
    exec:
      platforms: k8s
      builder: test
`
)

func main() {
	t := yaml.Node{}

	err := yaml.Unmarshal([]byte(sourceYaml), &t)
	if err != nil {
		log.Fatalf("error: %v", err)
	}

	b, err := yaml.Marshal(&t)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(string(b))
}

This produces the following invalid YAML with go version go1.12.3 windows/amd64:

version: 1
type: verbose
kind: bfr

# my list of applications
applications:
- # First app
name: app1
kind: nodejs
path: app1
exec:
platforms: k8s
builder: test

Secondly, using a struct such as

type VTS struct {
	Version string    `yaml:"version" json:"version"`
	Types   string    `yaml:"type" json:"type"`
	Kind    string    `yaml:"kind,omitempty" json:"kind,omitempty"`
	Apps    yaml.Node `yaml:"applications,omitempty" json:"applications,omitempty"`
}

From Ubuntu's blog and the source documentation, it seemed that the package would correctly identify fields within the struct that are nodes and build that tree separately, but that is not the case.
When unmarshalled, it gives a correct node tree, but when remarshalled it produces the following YAML, with all of the fields that yaml.Node exposes. Sadly we cannot go this route and must find another way.

version: "1"
type: verbose
kind: bfr
applications:
kind: 2
style: 0
tag: '!!seq'
value: ""
anchor: ""
alias: null
content:
- # First app
name: app1
kind: nodejs
path: app1
exec:
platforms: k8s
builder: test
headcomment: ""
linecomment: ""
footcomment: ""
line: 9
column: 3

Overlooking the first issue and the marshal bug for yaml.Nodes in a struct (on gopkg.in/yaml.v3 v3.0.0-20190409140830-cdc409dda467), we can now go about manipulating the Nodes that the package exposes. Unfortunately, there is no abstraction that adds Nodes with ease, so usage might vary and identifying nodes can be a pain. Reflection might help here a bit, so I leave that as an exercise for you.

You will find commented-out spew.Dump calls that dump the entire node tree in a nice format; this helped with debugging when adding Nodes to the source tree.

You can certainly remove nodes as well; you just need to identify which particular nodes need to be removed, and make sure you remove the parent node if it is a map or sequence.

package main

import (
	"encoding/json"
	"fmt"
	"log"

	"gopkg.in/yaml.v3"
)

var (
	sourceYaml = `version: 1
type: verbose
kind : bfr

# my list of applications
applications:

# First app
  - name: app1
    kind: nodejs
    path: app1
    exec:
      platforms: k8s
      builder: test
`
	modifyJsonSource = `
[

    {
        "comment": "Second app",
        "name": "app2",
        "kind": "golang",
        "path": "app2",
        "exec": {
            "platforms": "dockerh",
            "builder": "test"
        }
    }
]
`
)

// VTS fields must be exported; otherwise unmarshalling will not fill them in.
type VTS struct {
	Version string       `yaml:"version" json:"version"`
	Types   string       `yaml:"type" json:"type"`
	Kind    string       `yaml:"kind,omitempty" json:"kind,omitempty"`
	Apps    Applications `yaml:"applications,omitempty" json:"applications,omitempty"`
}

type Applications []struct {
	Name string `yaml:"name,omitempty" json:"name,omitempty"`
	Kind string `yaml:"kind,omitempty" json:"kind,omitempty"`
	Path string `yaml:"path,omitempty" json:"path,omitempty"`
	Exec struct {
		Platforms string `yaml:"platforms,omitempty" json:"platforms,omitempty"`
		Builder   string `yaml:"builder,omitempty" json:"builder,omitempty"`
	} `yaml:"exec,omitempty" json:"exec,omitempty"`
	Comment string `yaml:"comment,omitempty" json:"comment,omitempty"`
}

func main() {
	t := yaml.Node{}

	err := yaml.Unmarshal([]byte(sourceYaml), &t)
	if err != nil {
		log.Fatalf("error: %v", err)
	}

	// Look for the Map Node with the seq array of items
	applicationNode := iterateNode(&t, "applications")

	// spew.Dump(iterateNode(&t, "applications"))

	var addFromJson Applications
	err = json.Unmarshal([]byte(modifyJsonSource), &addFromJson)
	if err != nil {
		log.Fatalf("error: %v", err)
	}

	// Delete the original applications using one of the following options:
	// applicationNode.Content = []*yaml.Node{}
	// deleteAllContents(applicationNode)
	deleteApplication(applicationNode, "name", "app1")

	for _, app := range addFromJson {
		// Build New Map Node for new sequences coming in from json
		mapNode := &yaml.Node{Kind: yaml.MappingNode, Tag: "!!map"}

		// Build Name, Kind, and Path Nodes
		mapNode.Content = append(mapNode.Content, buildStringNodes("name", app.Name, app.Comment)...)
		mapNode.Content = append(mapNode.Content, buildStringNodes("kind", app.Kind, "")...)
		mapNode.Content = append(mapNode.Content, buildStringNodes("path", app.Path, "")...)

		// Build the Exec Nodes and the Platform and Builder Nodes within it
		// NOTE: this emits the key "platform", whereas the source YAML uses "platforms"
		keyMapNode, keyMapValuesNode := buildMapNodes("exec")
		keyMapValuesNode.Content = append(keyMapValuesNode.Content, buildStringNodes("platform", app.Exec.Platforms, "")...)
		keyMapValuesNode.Content = append(keyMapValuesNode.Content, buildStringNodes("builder", app.Exec.Builder, "")...)

		// Add to parent map Node
		mapNode.Content = append(mapNode.Content, keyMapNode, keyMapValuesNode)

		// Add to applications Node
		applicationNode.Content = append(applicationNode.Content, mapNode)
	}
	// spew.Dump(t)
	b, err := yaml.Marshal(&t)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(string(b))
}

// iterateNode recursively looks for the node that follows the identifier node.
// go-yaml has one node for the key and another for the value,
// and we want to manipulate the value node.
func iterateNode(node *yaml.Node, identifier string) *yaml.Node {
	returnNode := false
	for _, n := range node.Content {
		if n.Value == identifier {
			returnNode = true
			continue
		}
		if returnNode {
			return n
		}
		if len(n.Content) > 0 {
			ac_node := iterateNode(n, identifier)
			if ac_node != nil {
				return ac_node
			}
		}
	}
	return nil
}

// deleteAllContents removes all the contents of a node.
// Make sure to pass the correct node in, otherwise bad things will happen.
func deleteAllContents(node *yaml.Node) {
	node.Content = []*yaml.Node{}
}

// deleteApplication expects a sequence node that holds all the applications.
// If the key/value pair is not found it logs no error and returns silently.
// It expects a map-like structure for each application.
func deleteApplication(node *yaml.Node, key, value string) {
	state := -1
	indexRemove := -1
	for index, parentNode := range node.Content {
		for _, childNode := range parentNode.Content {
			if key == childNode.Value && state == -1 {
				state += 1
				continue // found the expected key, move on to the next node
			}
			if value == childNode.Value && state == 0 {
				state += 1
				indexRemove = index
				break // found the target, exit the inner loop
			} else if state == 0 {
				state = -1
			}
		}
	}
	if state == 1 {
		// Remove node from contents
		// node.Content = append(node.Content[:indexRemove], node.Content[indexRemove+1:]...)
		// Don't do this: it can leak memory because the underlying nodes are pointers
		// (source: https://github.com/golang/go/wiki/SliceTricks)
		length := len(node.Content)
		copy(node.Content[indexRemove:], node.Content[indexRemove+1:])
		node.Content[length-1] = nil
		node.Content = node.Content[:length-1]
	}
}

// buildStringNodes builds Nodes for a single key: value instance
func buildStringNodes(key, value, comment string) []*yaml.Node {
	keyNode := &yaml.Node{
		Kind:        yaml.ScalarNode,
		Tag:         "!!str",
		Value:       key,
		HeadComment: comment,
	}
	valueNode := &yaml.Node{
		Kind:  yaml.ScalarNode,
		Tag:   "!!str",
		Value: value,
	}
	return []*yaml.Node{keyNode, valueNode}
}

// buildMapNodes builds Nodes for a key: map instance
func buildMapNodes(key string) (*yaml.Node, *yaml.Node) {
	n1, n2 := &yaml.Node{
		Kind:  yaml.ScalarNode,
		Tag:   "!!str",
		Value: key,
	}, &yaml.Node{
		Kind: yaml.MappingNode,
		Tag:  "!!map",
	}
	return n1, n2
}

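This produces the following YAML. The app1 entry is still present, which suggests the output was captured without the deleteApplication call: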

version: 1
type: verbose
kind: bfr

# my list of applications
applications:
- # First app
name: app1
kind: nodejs
path: app1
exec:
platforms: k8s
builder: test
- # Second app
name: app2
kind: golang
path: app2
exec:
platform: dockerh
builder: test
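
The key/value pairing that iterateNode relies on can be sketched outside Go. The Python below is a simplified stand-in, not the go-yaml API: the Node class and find_value_node are hypothetical names, and a mapping is modelled as a flat list of children in which each key node is followed directly by its value node.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """Minimal stand-in for yaml.Node: a value plus child nodes."""
    value: str = ""
    content: List["Node"] = field(default_factory=list)


def find_value_node(node: Node, key: str) -> Optional[Node]:
    """Return the node that directly follows the first child whose value is `key`.

    Like the Go version, this matches on the value alone, so a scalar
    value that happens to equal `key` would also match.
    """
    children = node.content
    for i, child in enumerate(children):
        if child.value == key and i + 1 < len(children):
            return children[i + 1]  # the value node sits right after its key
        found = find_value_node(child, key)
        if found is not None:
            return found
    return None


if __name__ == "__main__":
    apps = Node(content=[Node("first item")])
    mapping = Node(content=[Node("version"), Node("1"), Node("applications"), apps])
    root = Node(content=[mapping])
    print(find_value_node(root, "applications") is apps)
```

Returning the node after the key, rather than the key itself, is what allows the caller to append new children to the applications sequence.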

