Underscore VS Double Underscore with Variables and Methods

What is the meaning of single and double underscore before an object name?

Single Underscore

In a class, names with a leading underscore indicate to other programmers that the attribute or method is intended to be used only inside that class. However, privacy is not enforced in any way.
Using a leading underscore for a function in a module indicates that it should not be imported from somewhere else.

From the PEP-8 style guide:

_single_leading_underscore: weak "internal use" indicator. E.g. from M import * does not import objects whose name starts with an underscore.

Double Underscore (Name Mangling)

From the Python docs:

Any identifier of the form __spam (at least two leading underscores, at most one trailing underscore) is textually replaced with _classname__spam, where classname is the current class name with leading underscore(s) stripped. This mangling is done without regard to the syntactic position of the identifier, so it can be used to define class-private instance and class variables, methods, variables stored in globals, and even variables stored in instances. private to this class on instances of other classes.

And a warning from the same page:

Name mangling is intended to give classes an easy way to define “private” instance variables and methods, without having to worry about instance variables defined by derived classes, or mucking with instance variables by code outside the class. Note that the mangling rules are designed mostly to avoid accidents; it still is possible for a determined soul to access or modify a variable that is considered private.

Example

>>> class MyClass():
...     def __init__(self):
...         self.__superprivate = "Hello"
...         self._semiprivate = ", world!"
...
>>> mc = MyClass()
>>> print mc.__superprivate
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: MyClass instance has no attribute '__superprivate'
>>> print mc._semiprivate
, world!
>>> print mc.__dict__
{'_MyClass__superprivate': 'Hello', '_semiprivate': ', world!'}
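The session above uses the Python 2 print statement. As a rough Python 3 equivalent, the following sketch repeats the same experiment as a script; the variable names direct_access_failed and stored_names are made up for illustration.

```python
class MyClass:
    def __init__(self):
        self.__superprivate = "Hello"   # stored as _MyClass__superprivate
        self._semiprivate = ", world!"  # stored under its own name


mc = MyClass()

# Outside the class body the name is not mangled, so the lookup fails.
try:
    mc.__superprivate
except AttributeError:
    direct_access_failed = True
else:
    direct_access_failed = False
print(direct_access_failed)  # True

# The single-underscore attribute is reachable as usual.
print(mc._semiprivate)  # , world!

# The mangled name is still reachable if you spell it out.
print(mc._MyClass__superprivate)  # Hello

stored_names = sorted(vars(mc))
print(stored_names)  # ['_MyClass__superprivate', '_semiprivate']
```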

Underscore vs Double underscore with variables and methods

From PEP 8:

  • _single_leading_underscore: weak "internal use" indicator. E.g.

    from M import *

    does not import objects whose name starts with an underscore.

  • single_trailing_underscore_: used by convention to avoid conflicts with Python keyword, e.g.

    Tkinter.Toplevel(master, class_='ClassName')

  • __double_leading_underscore: when naming a class attribute, invokes name
    mangling (inside class FooBar, __boo becomes _FooBar__boo; see below).

  • __double_leading_and_trailing_underscore__: "magic" objects or
    attributes that live in user-controlled namespaces. E.g. __init__,
    __import__ or __file__. Never invent such names; only use them
    as documented.
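A short sketch of two items from the list above: the trailing underscore and the FooBar/__boo mangling. The html_tag function is hypothetical and exists only to show a parameter that would otherwise collide with the class keyword.

```python
def html_tag(name, text, class_=None):
    """Build a tiny HTML element; class_ avoids the reserved word 'class'."""
    attrs = f' class="{class_}"' if class_ else ""
    return f"<{name}{attrs}>{text}</{name}>"


print(html_tag("p", "hi", class_="note"))  # <p class="note">hi</p>


class FooBar:
    __boo = 1  # inside the class body this becomes _FooBar__boo


mangled_present = "_FooBar__boo" in vars(FooBar)
unmangled_present = "__boo" in vars(FooBar)
print(mangled_present, unmangled_present)  # True False
```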

Also, from David Goodger's Code Like a Pythonista:

Attributes: interface, _internal, __private

But try to avoid the __private form. I never use it. Trust me. If you
use it, you WILL regret it later.

Explanation:

People coming from a C++/Java background are especially prone to
overusing/misusing this "feature". But __private names don't work the
same way as in Java or C++. They just trigger a name mangling whose
purpose is to prevent accidental namespace collisions in subclasses:
MyClass.__private just becomes MyClass._MyClass__private. (Note that
even this breaks down for subclasses with the same name as the
superclass, e.g. subclasses in different modules.) It is possible to
access __private names from outside their class, just inconvenient and
fragile (it adds a dependency on the exact name of the superclass).

The problem is that the author of a class may legitimately think "this
attribute/method name should be private, only accessible from within
this class definition" and use the __private convention. But later on,
a user of that class may make a subclass that legitimately needs
access to that name. So either the superclass has to be modified
(which may be difficult or impossible), or the subclass code has to
use manually mangled names (which is ugly and fragile at best).

There's a concept in Python: "we're all consenting adults here". If
you use the __private form, who are you protecting the attribute from?
It's the responsibility of subclasses to use attributes from
superclasses properly, and it's the responsibility of superclasses to
document their attributes properly.

It's better to use the single-leading-underscore convention,
_internal. This isn't name mangled at all; it just indicates to
others to "be careful with this, it's an internal implementation
detail; don't touch it if you don't fully understand it". It's only a
convention though.
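The two weak spots mentioned in the explanation can be sketched as follows: a subclass that has to spell out the mangled name by hand, and a subclass that shares its superclass's name and therefore collides anyway. The classes Account and Savings and the helper make_same_named_subclass are hypothetical names chosen for this illustration.

```python
class Account:
    def __init__(self):
        self.__balance = 100          # stored as _Account__balance

    def balance(self):
        return self.__balance


class Savings(Account):
    def __init__(self):
        super().__init__()
        self.__balance = 5            # stored as _Savings__balance, no clash

    def parent_balance(self):
        # Manual mangling: works, but hard-codes the superclass name.
        return self._Account__balance


def make_same_named_subclass(base):
    """Return a subclass that is also called Account."""
    class Account(base):
        def __init__(self):
            super().__init__()
            self.__balance = -1       # also _Account__balance: overwrites the parent's
    return Account


savings = Savings()
print(savings.balance())         # 100
print(savings.parent_balance())  # 100

clashing = make_same_named_subclass(Account)()
print(clashing.balance())        # -1
```

In the last case the mangling offers no protection, because both classes produce the same prefix.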

single underscore vs double underscore encapsulation in python

Here's why that's the "standard," a little different from other languages.

  1. No underscore indicates it's a public thing that users of that class can touch/modify/use
  2. One underscore is more of an implementation detail that usually (note the term usually) should only be referenced/used in sub-classes or if you know what you're doing. The beautiful thing about Python is that we're all adults here and if someone wants to access something for some really custom thing then they should be able to.
  3. Two underscores trigger name mangling: behind the scenes the name is rewritten to include the class name, like so: _Temp__c. This prevents your variables from clashing with those of a subclass. However, I would stay away from defaulting to two because it's not a great habit and is generally unnecessary. There are arguments and other posts about it that you can read up on.

Note: Python itself treats variables/methods the same whether or not they have a single leading underscore. It's just a convention that's not enforced, but rather accepted by the community as meaning private.
Note #2: There is an exception, described by Matthias, for functions defined outside a class.

When to use one or two underscore in Python

Short answer: use a single leading underscore unless you have a really compelling reason to do otherwise (and even then think twice).

Long answer:

One underscore means "this is an implementation detail" (attribute, method, function, whatever), and is roughly the Python equivalent of "protected" in Java. This is what you should use for names that are not part of your class / module / package public API. It's a naming convention only (well, mostly: star imports will ignore them, but you're not doing star imports anywhere other than in your Python shell, are you?), so it won't prevent anyone from accessing this name, but then they're on their own if anything breaks (see this as a "warranty void if unsealed" kind of mention).

Two underscores trigger a name mangling mechanism. There are very few valid reasons to use this. Actually there's only one I can think of (and it is documented): protecting a name from being accidentally overridden in the context of a complex framework's internals. As an example, there might be about half a dozen or fewer instances of this naming scheme in the whole Django codebase (mostly in the django.utils.functional module).

As far as I'm concerned, I must have used this feature perhaps thrice in 15+ years, and even then I'm still not sure I really needed it.
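The documented use case mentioned above can be sketched with the example from the "Private Variables" section of the Python tutorial, lightly adapted here. A base class keeps a mangled alias of its own method so that a subclass can override the public name without breaking the base class's internals.

```python
class Mapping:
    def __init__(self, iterable):
        self.items_list = []
        self.__update(iterable)   # always calls Mapping's own version

    def update(self, iterable):
        for item in iterable:
            self.items_list.append(item)

    __update = update             # private copy, stored as _Mapping__update


class MappingSubclass(Mapping):
    def update(self, keys, values):
        # New signature for update(); __init__() keeps working because
        # it goes through _Mapping__update instead of self.update.
        for item in zip(keys, values):
            self.items_list.append(item)


m = MappingSubclass([1, 2])
m.update(["a"], [10])
print(m.items_list)  # [1, 2, ('a', 10)]
```

Had __init__ called self.update(iterable) instead, constructing MappingSubclass would fail with a TypeError because of the changed signature.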

Functions and variables beginning with a single or double underscore

Assuming normal conventions are used in PHP:

  • single underscore indicates a protected member variable or method
  • double underscore indicates a private member variable or method

This stems from when PHP had weak OOP support and did not have a concept of private and protected (everything was public). This convention allowed developers to indicate that a member variable or method was private or protected, to better communicate this to users of the code.

Users could still choose to ignore these semantics and call the "private" and "protected" member variables and methods if they so chose, though.

Double underscore vs single underscore in nodeJS

I could be wrong, but as far as I know, there is only one convention in JS: "if a method or variable is supposed to be private, put an underscore in front of it: _privateMethod". And even this one is kind of "unofficial". Double underscore is not a naming convention. Some Node developer just decided to name things like this.

What is the purpose of the double underscore, after a single underscore, as a parameter to a function / class method?

Nothing special. _ and __, just like a, are ordinary variable identifiers. _ is often used to name an unused variable.
Here there are two unused variables: the first one is named _ and the second one __.
With multiple unused variables it's common to name them _, __, ___ ... or _1, _2, _3...
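As a small illustration in Python (the original question may concern another language, so treat this as an analogy), here is a callback that ignores its first two arguments. The function on_event and its arguments are hypothetical.

```python
def on_event(_, __, payload):
    """Callback that only cares about its third argument."""
    return payload.upper()


result = on_event("sender", 1234, "ready")
print(result)  # READY

# The same convention works when unpacking values you do not need.
_, __, third = ("a", "b", "c")
print(third)  # c
```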

Double underscore in python

Leading double underscore names are private (meaning not available to derived classes)

This is not foolproof. It is implemented by mangling the name. Python Documentation says:

Any identifier of the form __spam (at least two leading underscores,
at most one trailing underscore) is textually replaced with
_classname__spam, where classname is the current class name with
leading underscore(s) stripped. This mangling is done without regard
to the syntactic position of the identifier, so it can be used to
define class-private instance and class variables, methods, variables
stored in globals, and even variables stored in instances. private to
this class on instances of other classes.

Thus __get is actually mangled to _A__get in class A. When class B attempts to reference __get, it gets mangled to _B__get which doesn't match.

In other words __plugh defined in class Xyzzy means "unless you are running as class Xyzzy, thou shalt not touch the __plugh."
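A minimal sketch of the A/B situation described above, under the assumption that B derives from A. The method names call_get and try_get are made up for the illustration.

```python
class A:
    def __get(self):              # stored as _A__get
        return "A.__get"

    def call_get(self):
        return self.__get()       # compiled as self._A__get()


class B(A):
    def try_get(self):
        return self.__get()       # compiled as self._B__get(), which does not exist


b = B()
print(b.call_get())  # A.__get

try:
    b.try_get()
except AttributeError as exc:
    error_name = type(exc).__name__
else:
    error_name = None
print(error_name)  # AttributeError

# Not foolproof: the mangled name can still be spelled out by hand.
print(b._A__get())  # A.__get
```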

Why are some methods with __ in front of them not private?

A double leading underscore does not actually make a method private. A single leading underscore hints that the method should be for internal use only, but this is not enforced by Python.

__init__ has a double leading and trailing underscore, which means that it is a magic method reserved for special use by Python. You should never invent your own method names with double leading and trailing underscores; only use the ones Python documents.
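To illustrate, here is a hedged sketch with a hypothetical Playlist class. Names with both leading and trailing double underscores are not mangled, and Python calls them on your behalf.

```python
class Playlist:
    def __init__(self, tracks):
        self._tracks = list(tracks)   # internal detail, by convention only

    def __len__(self):
        return len(self._tracks)

    def __repr__(self):
        return f"Playlist({self._tracks!r})"


playlist = Playlist(["intro", "outro"])

# Python invokes the documented special methods for us.
print(len(playlist))   # 2
print(repr(playlist))  # Playlist(['intro', 'outro'])

# They are stored under their own names, with no class prefix added.
print("__len__" in vars(Playlist))  # True
```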


