Why Does Python PEP 8 Strongly Recommend Spaces Over Tabs for Indentation

Why does Python PEP 8 strongly recommend spaces over tabs for indentation?

The answer was given right there in the PEP [ed: this passage was edited out in 2013]. I quote:

The most popular way of indenting Python is with spaces only.

What other underlying reason do you need?

To put it less bluntly: Consider also the scope of the PEP as stated in the very first paragraph:

This document gives coding conventions for the Python code comprising the standard library in the main Python distribution.

The intention is to make all code that goes into the official Python distribution consistently formatted (I hope we can agree that this is universally a Good Thing™).

Since the decision between spaces and tabs for an individual programmer is a) really a matter of taste and b) easily dealt with by technical means (editors, conversion scripts, etc.), there is a clear way to end all discussion: choose one.

Guido was the one to choose. He didn't even have to give a reason, but he still did by referring to empirical data.

For all other purposes you can either take this PEP as a recommendation, or you can ignore it -- your choice, or your team's, or your team leader's.

But if I may give you one piece of advice: don't mix 'em ;-) [ed: Mixing tabs and spaces is no longer an option.]

Why does Python see a tab as 8 spaces?

Because the default tab size in Linux console is 8 spaces, and therefore most CLI text editors in Linux also default to 8 spaces. Most are also configurable, but it's been the default for ages.

Some older (C) code uses mixed indentation to fake a 4-space tab: 1 indent == 4 spaces, 2 indents == 1 tab, 3 indents == 1 tab + 4 spaces, etc. It was awful. I'm not sure whether it was done intentionally to make the code easier to read, or whether some editor did it automatically to simulate 4-space tabs. All I know is that I was using pico and it was a PITA working with those files, especially when you needed to indent/dedent a whole block. :)

Python: using 4 spaces for indentation. Why?

Everyone else uses 4 spaces. That is the only reason to use 4 spaces that I've come across and accepted. In my heart, I still want to use tabs: one indent character per indent makes sense, no? It separates indentation from other whitespace. I don't care that tabs can be displayed at different widths; that makes no syntactic difference. The worst that can happen is that some of the comments don't line up. The horror! But I've accepted that since the Python community as a whole uses 4 spaces, I use 4 spaces. This way, I can assemble code from snippets others have written, and it all works.

Python's interpretation of tabs and spaces to indent

Spaces are not treated as equivalent to a tab. A line indented with a tab is at a different indentation from a line indented with 1, 2, 4 or 8 spaces.

Proof by counter-example (erroneous, or, at best, limited - tab != 4 spaces):

x = 1
if x == 1:
^Iprint "fff\n"
    print "yyy\n"

The '^I' shows a TAB. When run through Python 2.5, I get the error:

  File "xx.py", line 4
print "yyy\n"
^
IndentationError: unindent does not match any outer indentation level

Thus showing that in Python 2.5, tabs are not equal to spaces (and in particular not equal to 4 spaces).


Oops - embarrassing; my proof by counter-example shows that tabs are not equivalent to 4 spaces. As Alex Martelli points out in a comment, in Python 2, tabs are equivalent to 8 spaces, and adapting the example with a tab and 8 spaces shows that this is indeed the case.

x = 1
if x != 1:
^Iprint "x is not 1\n"
        print "y is unset\n"

In Python 2, this code works, printing nothing.


In Python 3, the rules are slightly different (as noted by Antti Haapala). Compare:

  • Python 2 on Indentation
  • Python 3 on Indentation

Python 2 says:

First, tabs are replaced (from left to right) by one to eight spaces such that the total number of characters up to and including the replacement is a multiple of eight (this is intended to be the same rule as used by Unix). The total number of spaces preceding the first non-blank character then determines the line’s indentation. Indentation cannot be split over multiple physical lines using backslashes; the whitespace up to the first backslash determines the indentation.

Python 3 says:

Tabs are replaced (from left to right) by one to eight spaces such that the total number of characters up to and including the replacement is a multiple of eight (this is intended to be the same rule as used by Unix). The total number of spaces preceding the first non-blank character then determines the line’s indentation. Indentation cannot be split over multiple physical lines using backslashes; the whitespace up to the first backslash determines the indentation.

(Apart from the opening word "First," these are identical.)
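The tab rule quoted above can be sketched in a few lines. This is only an illustration of the documented rule, not CPython's actual tokenizer; the function name indentation_width and the tabsize parameter are made up for this example.

```python
def indentation_width(line, tabsize=8):
    """Return the width of the leading whitespace of `line`,
    expanding each tab to the next multiple of `tabsize`."""
    width = 0
    for ch in line:
        if ch == " ":
            width += 1
        elif ch == "\t":
            # A tab advances to the next multiple of tabsize (1 to 8 columns).
            width = (width // tabsize + 1) * tabsize
        else:
            # First non-blank character: indentation ends here.
            break
    return width


print(indentation_width("\tx = 1"))        # a single tab
print(indentation_width("        x = 1"))  # eight spaces
print(indentation_width("    \tx = 1"))    # four spaces, then a tab
print(indentation_width("    x = 1"))      # four spaces
```

Under this rule the first three lines all have width 8, and only the last has width 4, which is why a tab and 8 spaces lined up in Python 2 while a tab and 4 spaces did not.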

Python 3 adds an extra paragraph:

Indentation is rejected as inconsistent if a source file mixes tabs and spaces in a way that makes the meaning dependent on the worth of a tab in spaces; a TabError is raised in that case.

This means that the TAB vs 8-space example that worked in Python 2 would generate a TabError in Python 3. It is best (and in Python 3, necessary) to ensure that the sequence of characters making up the indentation on each line in a block is identical. PEP 8 says 'use 4 spaces per indentation level'. (Google's coding standards say 'use 2 spaces'.)
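One way to see the Python 3 behaviour without saving a file is to pass the source to the built-in compile() as a string. This is a small sketch for Python 3 only; the helper name check and the sample sources are invented for the illustration.

```python
def check(source):
    """Try to compile `source` and report whether indentation was accepted."""
    try:
        compile(source, "<example>", "exec")
    except TabError as exc:
        return "TabError: " + str(exc.msg)
    return "ok"


# First body line is indented with a tab, the second with 8 spaces.
mixed = "if True:\n\tx = 1\n        y = 2\n"

# Both body lines use the same indentation characters (one tab each).
consistent = "if True:\n\tx = 1\n\ty = 2\n"

print(check(mixed))
print(check(consistent))
```

The mixed source is rejected with a TabError even though both lines have width 8 under the tab rule, while the consistent source compiles.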

How to use tabs without causing PEP warnings

You can convert tabs to spaces (or spaces to tabs) from the menu: Edit -> Convert Indents -> To Spaces (or To Tabs). An easier way is to press Shift twice, type "to", and select the option you want to apply.

Also, please refer to the PyCharm docs (only the first part). PyCharm lets you configure it to insert spaces when Tab is pressed.
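Outside the IDE, the same conversion can be approximated with a short script. The following is a rough sketch, not what PyCharm does internally: the function name tabs_to_spaces is made up, it assumes one tab per indentation level, and it only touches tabs at the very start of a line (tabs after a leading space or inside the line are left alone).

```python
def tabs_to_spaces(text, indent=4):
    """Replace each leading tab with `indent` spaces, line by line."""
    converted = []
    for line in text.splitlines(keepends=True):
        stripped = line.lstrip("\t")
        n_tabs = len(line) - len(stripped)  # number of leading tabs removed
        converted.append(" " * (indent * n_tabs) + stripped)
    return "".join(converted)


sample = "if x:\n\tfor y in z:\n\t\tprint(y)\n"
print(tabs_to_spaces(sample))
```

For anything beyond a quick one-off, the editor's own conversion is the safer choice, since it understands mixed indentation.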

EDIT:

Also, please read this short PEP8 guide about tabs and spaces.
Basically:

Spaces are the preferred indentation method.

EDIT 2:

If you want to disable some warnings, when you see squiggly underline (usually yellow for warnings), place cursor at it and press Alt+Enter and choose Ignore errors like this.


You can also disable some types of warnings by Settings -> Editor -> Inspections and see more at Python category.

Why is it recommended that I replace tabs by spaces?

Because it's the community standard.

Guido van Rossum actually preferr{s,ed} tabs, but the community as a whole prefers spaces. If you want to share code, standards are especially helpful.

That's it. There are advantages to both spaces and tabs, but sticking to the standard tends to outweigh both.

Some advantages of spaces:

  • Spaces are less often mangled by badly-designed software
  • Some people align code in ways that no longer work when you use tabs
  • The indentation is more consistent if you move code between editors that have different tabstops

etc.

Why does Python use indentation to define blocks for conditional statements?

It's more or less hard-wired into the lexer: whatever follows the : that ends a compound statement header (if, for, def and so on) is treated as a block, and if the block starts on a new line, it must be indented by at least one space.

PEP 8 provides some clarity on what the formal style guides are for Python. In a nutshell - indentation makes for more readable code.

The lexical analysis page also provides a bit of insight into this as well:

The indentation levels of consecutive lines are used to generate
INDENT and DEDENT tokens, using a stack, as follows.

Before the first line of the file is read, a single zero is pushed on
the stack; this will never be popped off again. The numbers pushed on
the stack will always be strictly increasing from bottom to top. At
the beginning of each logical line, the line’s indentation level is
compared to the top of the stack. If it is equal, nothing happens. If
it is larger, it is pushed on the stack, and one INDENT token is
generated. If it is smaller, it must be one of the numbers occurring
on the stack; all numbers on the stack that are larger are popped off,
and for each number popped off a DEDENT token is generated. At the end
of the file, a DEDENT token is generated for each number remaining on
the stack that is larger than zero.

Since Python uses a stack to guarantee that the level of indentation is consistent for a particular block, not having indentation would break the lexer and prevent your Python code from being interpreted.

The flexibility of the lexer also allows you to do this (but don't do this, or Python programmers will despise you 'til the end of days):
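The stack procedure quoted above can be sketched directly. This is a simplified model rather than the real tokenizer: it takes a list of already-computed indentation widths, one per logical line, and the function name indent_tokens and the "LINE" placeholder token are invented for the illustration.

```python
def indent_tokens(levels):
    """Turn a list of per-line indentation widths into a token stream
    of "INDENT", "DEDENT" and "LINE" markers."""
    stack = [0]  # the initial zero is never popped
    tokens = []
    for level in levels:
        if level > stack[-1]:
            # Deeper than the current block: push and emit one INDENT.
            stack.append(level)
            tokens.append("INDENT")
        elif level < stack[-1]:
            # Shallower: the width must match some enclosing block.
            if level not in stack:
                raise IndentationError(
                    "unindent does not match any outer indentation level"
                )
            while stack[-1] > level:
                stack.pop()
                tokens.append("DEDENT")
        tokens.append("LINE")
    # At end of file, close every block that is still open.
    while stack[-1] > 0:
        stack.pop()
        tokens.append("DEDENT")
    return tokens


print(indent_tokens([0, 4, 4, 0]))
print(indent_tokens([0, 1, 7, 0]))
```

Note that the widths only have to be consistent within a block: 1 and 7 are as acceptable to this model as 4 and 8, whereas a sequence such as [0, 4, 2] is rejected because 2 is not on the stack.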

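# Each block only has to be indented consistently with itself.
def foo(n):
 for i in range(0, 10):
       print(i, i + 1)

i = 0
while i < 10:
             print(i)
             print(i - 1)
             print(i + 1)
             i += 1  # advance the counter so the loop terminates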

Oh, and if you have too many nested statements - perhaps you should consider refactoring your code to read more clearly, and reduce the amount of code complexity?

Is 4-space indentation better than tab indentation?

For indentation, it is better to use spaces instead of tabs, because different editors treat tabs differently. When you use tabs, different editors show different indentation. To make your code look the same in every editor, use spaces instead of tabs.


