A YAML File Cannot Contain Tabs as Indentation

A YAML file cannot contain tabs as indentation

A YAML file uses spaces for indentation. You can use 2 or 4 spaces per level, but not tabs. In other words, tab indentation is forbidden:

Why does YAML forbid tabs?

Tabs have been outlawed since they are treated differently by different editors and tools. And since indentation is so critical to proper interpretation of YAML, this issue is just too tricky to even attempt.

(source: YAML FAQ (thanks to Destiny Architect for the link))

For example, the Symfony configuration file can be written with 2 or 4 spaces as indentation:

4 spaces

doctrine:
    dbal:
        default_connection: default

2 spaces

doctrine:
  dbal:
    default_connection: default
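
Before handing a file to a parser, it can help to find out where tabs sit in the indentation. The following is a minimal sketch using only the Python standard library; the function name find_tab_indented_lines is made up for this illustration. It is a heuristic: it flags every line whose leading whitespace contains a tab, which may include lines where a leading tab is actually legal (for example inside a multi-line quoted scalar).

```python
def find_tab_indented_lines(text):
    """Return 1-based numbers of lines whose leading whitespace contains a tab."""
    offenders = []
    for number, line in enumerate(text.splitlines(), start=1):
        # Leading whitespace is everything before the first non-blank character.
        leading = line[: len(line) - len(line.lstrip(" \t"))]
        if "\t" in leading:
            offenders.append(number)
    return offenders


# Hypothetical sample: line 2 is indented with a tab, line 3 with spaces.
sample = "doctrine:\n\tdbal:\n    default_connection: default\n"
print(find_tab_indented_lines(sample))  # [2]
```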

Using a configuration of YAML which is tab indented

Although tab characters are valid in YAML, they cannot be used in
indentation, neither in the current version (1.2) nor in the older
1.1 or 1.0.

That does not imply a tab cannot occur at the start of a line, as the following
example shows:

import ruamel.yaml

# A single-quoted scalar spanning two lines; the second line starts with a tab.
yaml_str = """\
'xxx
\tyyy'
"""

yaml = ruamel.yaml.YAML()
yaml.explicit_start = True
data = yaml.load(yaml_str)
print(data)

which runs without error and gives:

xxx yyy

If you remove the single quotes from yaml_str, however, you will
get the error that you got (on line 2, column 1), because the parser
has to consider whether yyy starts a new token (while scanning the
single-quoted scalar it doesn't do that).

Without seeing the actual YAML, it is difficult to say definitively, but
your tool is probably to blame. You might get away with replacing the
tabs:

with open('yourfile.yaml') as fp:
    data = yaml.load(fp.read().replace('\t', ' '))
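
A blanket replace also rewrites tabs that are part of a scalar's content. A slightly more careful variant touches only the leading whitespace of each line. The sketch below uses only the standard library; the helper name expand_leading_tabs and the default width of 2 are assumptions made for illustration. It still rewrites leading tabs inside multi-line scalars, so treat it as a rescue tool and review the result.

```python
import re


def expand_leading_tabs(text, width=2):
    """Replace tabs in each line's leading whitespace with `width` spaces.

    Tabs that appear after the first non-blank character are left alone,
    because they may be part of a scalar's content.
    """
    def fix(match):
        return match.group(0).replace("\t", " " * width)

    # ^[ \t]+ matches only the run of blanks at the start of each line.
    return re.sub(r"^[ \t]+", fix, text, flags=re.MULTILINE)


# Hypothetical input: a tab as indentation and a tab inside a quoted value.
broken = "root:\n\tkey: \"a\tb\"\n"
print(repr(expand_leading_tabs(broken)))  # 'root:\n  key: "a\tb"\n'
```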

YAML How many spaces per indent?

There is no requirement in YAML to indent any concrete number of spaces. There is also no requirement to be consistent. So for example, this is valid YAML:

a:
 b:
     - c
     -  d
     - e
f:
    "ghi"

Some rules might be of interest:

  • Flow content (i.e. everything that starts with { or [) can span multiple lines, but must be indented at least as many spaces as the surrounding current block level.
  • Block list items can (but don't need to) have the same indentation as the surrounding block level because - is considered part of the indentation:
a:    # top-level key
- b   # value of that key, which is a list
- c
c:    # next top-level key
 d    # non-list value which must be more indented

YAML as a JSON superset and TAB characters

Tabs ARE allowed in YAML, but only where indentation does not apply.

According to YAML 1.2 Section 5.5:

YAML recognizes two white space characters: space and tab.

The following examples will use · to denote spaces and → to denote tabs. All examples can be validated using the official YAML Reference Parser.

YAML has a block style and flow style. In block style, indentation determines the structure of a document. The following document uses block style.

root:
··key: value

In flow style, special characters indicate the structure of the document. The following equivalent document uses flow style.

{
→ root: {
→ → key: value
→ }
}

You can even mix indentation in flow style.

{
→ root: {
··→ key: value
····}
}

If you're mixing block and flow style, the entire flow style part must respect the block style indentation.

root:
··{
····key: value
··}

But you can still mix your indentation within the flow style part.

root:
··{
··→ key: value
··}

If you have a single value document, you can surround the value with all manner of whitespace.

→ ··value··→ 

The point is, every JSON document that is parsed as YAML will put the document into flow style (because of the initial { or [ character) which supports tabs, unless it is a single value JSON document, in which case YAML still allows padding with whitespace.

If a YAML parser throws because of tabs in a JSON document, then it is not a valid parser.
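
The JSON side of this argument is easy to check. JSON treats tabs as insignificant whitespace between tokens, so a tab-indented document is perfectly valid JSON. The sketch below uses Python's standard json module and a made-up sample document; it only demonstrates the JSON behaviour, and whether a particular YAML library accepts the same text depends on how closely it follows the specification.

```python
import json

# A JSON document indented with tabs only (written with explicit \t escapes).
tab_indented = '{\n\t"root": {\n\t\t"key": "value"\n\t}\n}'

data = json.loads(tab_indented)
print(data)  # {'root': {'key': 'value'}}
```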

That being said, your example is failing because a block style mapping value must always be indented if it's not on the same line as the mapping name.

root: {
··key: value
}

is not valid, however

root:
··{
····key: value
··}

is valid, and

root: { key: value }

is also valid.

How to automatically re-indent a YAML file?

What you seem to want to do is making sure that your YAML files are uniformly indented (e.g. before being checked into a revision control system). Your idea of dedenting and then re-indenting will not work as you lose information if you flatten your structure. This:

foo:
  alice: female
  bob: male

consists of two mappings: a mapping with one key, whose value is itself a mapping of two keys to two values.

This:

foo:
alice: female
bob: male

is one mapping with three keys, and key foo has as value the null scalar (also writable, apart from the empty string, as ~, NULL, null in YAML files).
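
The loss of information can be made concrete without a YAML library. In the sketch below (standard library only; strip_indentation is a hypothetical helper standing in for the "dedent" step), the nested and the flat document collapse to the same text, so nothing that runs afterwards can tell which structure was meant.

```python
def strip_indentation(text):
    """Remove all leading blanks from every line (the 'dedent' step)."""
    return "\n".join(line.lstrip(" \t") for line in text.splitlines())


nested = "foo:\n  alice: female\n  bob: male"
flat = "foo:\nalice: female\nbob: male"

# Both documents collapse to the same text once indentation is removed.
print(strip_indentation(nested) == strip_indentation(flat))  # True
```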

Most YAML parsers will lose information when reading in a file into internal data:

  • comments are dropped
  • key ordering is not preserved for mappings
  • extra spaces around scalars are not preserved

The ruamel.yaml Python package (of which I am the author) is an enhanced parser that allows round-tripping a YAML file to data and back to YAML while preserving more of the original information. It will preserve comments and key ordering, but it drops e.g. extra spacing around single-line scalars.

This round-tripping normally stabilizes on a second round-trip and so this can be used to reindent a YAML file. The yaml utility included in the package ruamel.yaml.cmd, can be used for that without the need to program things yourself:

yaml round-trip your_file.yml --verbose

(round-trip can be shortened to rt) will check whether and how the file would change. It shows a unified diff if it does change. Based on that you can decide to save the file if it stabilizes:

yaml round-trip your_file.yml --save

the output for example.yml:

---
foo:
alice: female # verified
bob: male
bar:
- node: 42
name: none
- node: 43
name: none

would be:

example.yml:
stabilzes on second round trip, ok without comments
--- example.yml
+++ round trip YAML
@@ -1,9 +1,9 @@
---
foo:
alice: female # verified
- bob: male
+ bob: male
bar:
-- node: 42
+- node: 42
name: none
-- node: 43
- name: none
+- node: 43
+ name: none

and when saved look like:

---
foo:
  alice: female # verified
  bob: male
bar:
- node: 42
  name: none
- node: 43
  name: none

The indentation level is by default 2, but can be set with an option to yaml.

Tabs cannot be used for indentation when parsing YAML string

You are calling toString() on your StringReader, which returns the cryptic and rather useless implementation provided by Object.toString(), e.g. java.io.StringReader@329dbdbf. It doesn't tell you whether you have tabs or not.

Instead, check the original String you used before passing it to StringReader, and possibly apply .replace("\t", "\\t") to make any tabs visible. (Use replace rather than replaceAll here: in a replaceAll replacement string the backslash is an escape character, so "\\t" would come out as a plain t.)

Processing JSON with YAML Parser; throws on tab whitespace

Hits an error 'not allowed to use tab for indenting' <- which seems correct.

It is not.

This is the relevant production in the Spec:

[140]   c-flow-mapping(n,c) ::= "{" s-separate(n,c)?
                                ns-s-flow-map-entries(n,in-flow(c))? "}"

s-separate(n,c) resolves to s-separate-lines(n) here (because we are not inside block-key or flow-key). Skipping some steps, we reach s-separate-in-line which allows tab characters.

The bottom line is that this tab character in your JSON is not indentation. Indentation is only relevant in block style (i.e. not using [ or { for sequences and mappings respectively). In Flow style, whitespace is only for separation.

Edit: Removed example link because it was somewhat misleading.

Edit 2: To answer your second question: no, do not strip tabs. They may be content inside scalars! See this example, where a tab character actually determines the indentation of a block scalar.

Docker-compose: Issue while running the spring boot enabled and spring cloud config application

There are multiple errors.

  1. Make sure that you only use spaces for indentation (instead of tabs). If you are interested in why tabs don't work within YAML files, have a look at "A YAML file cannot contain tabs as indentation".
  2. Put your ports into strings (e.g. - "8060:8060" instead of - 8060:8060).
  3. I think you are misusing environment variables. They should/must look like e.g.:

environment:
  - JAVA_OPTS
  - EUREKA_SERVER=http://discovery:8761/eureka
  - ANOTHER_ENV_VARIABLE=/config-data

Have a look at the docs for details: https://docs.docker.com/compose/environment-variables/

After fixing your docker-compose.yml you can validate your file by running docker-compose config inside of the directory where your docker-compose.yml is located.

Parse tab indented list to JSON with jq

This is an approach for a deeply nested input. It splits on top-level items using a negative look-ahead regex on tabs following newlines, then separates the head and "unindents" the rest by removing one tab following a newline, which serves as input for a recursive call.

jq -Rs '
  def comp:
    reduce (splits("\n(?!\\t)") | select(length > 0)) as $item ({};
      ($item | index(":")) as $hpos | .[$item[:$hpos]] = (
        $item[$hpos + 1:] | gsub("\n\t"; "\n")
        | if test("\n") then comp else .[index("'\''") + 1: rindex("'\''")] end
      )
    );
  comp
'

{
  "Heading One": {
    "Sub One": "Value 1",
    "Sub Two": "Value 2"
  },
  "Heading Two": {
    "Sub Three": "Value 3",
    "Sub Four": "Value 4"
  },
  "Key One": "This key has no heading"
}
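
For readers more at home in Python, the same recursive idea can be sketched as follows. The input format is an assumption inferred from the jq filter and its output: one key: 'value' pair per line, values in single quotes, children indented by one tab per level. The function name comp simply mirrors the jq definition, and a leaf without quotes would raise a ValueError here.

```python
import json
import re


def comp(text):
    """Parse a tab-indented 'key: value' outline into nested dicts."""
    result = {}
    # Split on newlines that are NOT followed by a tab: each chunk is one
    # top-level item together with its indented children.
    for item in re.split(r"\n(?!\t)", text):
        if not item:
            continue
        head, _, rest = item.partition(":")
        # "Unindent" the children by dropping one tab after each newline.
        rest = rest.replace("\n\t", "\n")
        if "\n" in rest:
            result[head] = comp(rest)
        else:
            # Leaf: keep what lies between the first and last single quote.
            result[head] = rest[rest.index("'") + 1 : rest.rindex("'")]
    return result


# Hypothetical input in the assumed format.
sample = (
    "Heading One:\n"
    "\tSub One: 'Value 1'\n"
    "\tSub Two: 'Value 2'\n"
    "Key One: 'This key has no heading'\n"
)
print(json.dumps(comp(sample), indent=2))
```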

