Instance Variables VS. Class Variables in Python

If you have only one instance anyway, it's best to make all variables per-instance, simply because they will be accessed (a little bit) faster (one less level of "lookup" due to the "inheritance" from class to instance), and there are no downsides to weigh against this small advantage.

Why have a class with both a class variable and an instance variable of the same name?

In Python, that can be used for defaults. For example:

class Foo:
    x = 1

a = Foo()
b = Foo()
print(a.x, b.x) # --> 1 1
a.x = 2
print(a.x, b.x) # --> 2 1
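To see where each value actually lives, one can inspect the instance and class namespaces with vars(). This is a minimal sketch that reuses the Foo class from the example above; the del step is an addition for illustration only:

```python
class Foo:
    x = 1  # class attribute, acting as a default

a = Foo()
b = Foo()
a.x = 2  # creates an instance attribute on a only

print(vars(a))  # {'x': 2}  -> a now has its own x
print(vars(b))  # {}        -> b still falls back to Foo.x
print(Foo.x)    # 1         -> the class attribute is untouched

del a.x         # remove the instance attribute again
print(a.x)      # 1         -> the class default shows through once more
```

Deleting the instance attribute "un-shadows" the class attribute, which is what makes the default pattern work.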

Difference between Class variables and Instance variables

Class variables are shadowed by instance attributes. This means that when looking up an attribute, Python first looks in the instance, then in the class. Furthermore, assigning to a plain attribute through an instance (e.g. self.x = ...) always creates or rebinds an instance attribute; it never changes the class variable.

This means that when, in your second example you do:

self.x += 1

which is (in this case, see footnote) equivalent to:

self.x = self.x + 1

what Python does is:

Look up self.x. At that point, self doesn't have the instance attribute x, so the class attribute A.x is found, with the value 10.
The RHS is evaluated, giving the result 11.
This result is assigned to a new instance attribute x of self.

So below that, when you look up x.x, you get this new instance attribute that was created in add(). When looking up y.x, you still get the class attribute. To change the class attribute, you'd have to use A.x += 1 explicitly: the fallback from instance to class only happens when reading the value of an attribute, never when assigning to it.
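The question's code is not reproduced here, so the following is only a reconstruction under assumptions: a class A with a class attribute x = 10 and a method add() that does self.x += 1, plus two instances named x and y as in the text.

```python
class A:
    x = 10  # class attribute

    def add(self):
        # Reads A.x (no instance attribute yet), then assigns the
        # result to a NEW instance attribute on self.
        self.x += 1

x = A()
y = A()
x.add()

print(x.x, y.x, A.x)     # 11 10 10
print(vars(x), vars(y))  # {'x': 11} {}

A.x += 1                 # explicitly rebinds the class attribute
print(x.x, y.x, A.x)     # 11 11 11
```

After A.x += 1, y.x follows the class attribute, while x.x keeps returning the instance attribute that shadows it.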

Your first example is a classical gotcha and the reason you shouldn't use class attributes as "default" values for instance attributes. When you call:

self.x.append(1)

there is no assignment to self.x taking place. (Changing the contents of a mutable object, like a list, is not the same as assignment.) Thus, no new instance attribute is added to x that would shadow it, and looking up x.x and y.x later on gives you the same list from the class attribute.
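Again as a reconstruction under assumptions (the question's first example is not shown here), a class A with a list as class attribute and an add() method that appends to it would behave like this:

```python
class A:
    x = []  # one list object, stored on the class

    def add(self):
        # Mutates the list found via lookup; no assignment to self.x,
        # so no instance attribute is created.
        self.x.append(1)

x = A()
y = A()
x.add()
y.add()

print(x.x, y.x)           # [1, 1] [1, 1]
print(x.x is y.x is A.x)  # True -> all three names reach the same list
print(vars(x), vars(y))   # {} {}
```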

Note: In Python, x += y is not always equivalent to x = x + y. Python allows you to override the in-place operators separately from the normal ones for a type. This mostly makes sense for mutable objects, where the in-place version will directly change the contents without a reassignment of the LHS of the expression. However, immutable objects (such as numbers in your second example) do not override in-place operators. In that case, the statement does get evaluated as a regular addition and a reassignment, explaining the behaviour you see.
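A small sketch may make the footnote concrete. The Counter class and its attribute names are hypothetical, chosen only for illustration. With a list, += mutates the shared object in place and then also binds it as an instance attribute; with an int, it is a plain addition plus reassignment.

```python
class Counter:
    items = []  # mutable class attribute
    count = 0   # immutable class attribute

    def bump(self):
        self.items += [1]  # list.__iadd__ mutates the class-level list,
                           # then the same list is bound to self.items
        self.count += 1    # int has no in-place add: new value, new
                           # instance attribute

c = Counter()
c.bump()

print(Counter.items)             # [1]  -> the class list was changed
print('items' in vars(c))        # True -> and an instance attribute exists
print(c.items is Counter.items)  # True -> pointing at the same list
print(c.count, Counter.count)    # 1 0
```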

(I lifted the above from this SO answer, see there for more details.)

How should I choose between using instance vs. class attributes?

Class variables are quite good for "constants" used by all the instances (that's all methods are technically). You could use module globals, but using a class variable makes it more clearly associated with the class.

There are often uses for class variables that you actually change, too, but it's usually best to stay away from them for the same reason you stay away from having different parts of your program communicate by altering global variables.

Instance variables are for data that is actually part of the instance. They could be different for each particular instance, and they often change over the lifetime of a single particular instance. It's best to use instance variables for data that is conceptually part of an instance, even if in your program you happen to only have one instance, or you have a few instances that in practice always have the same value.
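As a rough illustration of this guideline, here is a sketch with a hypothetical Account class: the currency code is a constant shared by every instance, while owner and balance are conceptually part of each instance.

```python
class Account:
    CURRENCY = "USD"  # class-level "constant", shared by all instances

    def __init__(self, owner, balance=0):
        # Per-instance data: differs between accounts and changes over time.
        self.owner = owner
        self.balance = balance

    def deposit(self, amount):
        self.balance += amount
        return self.balance

    def describe(self):
        # self.CURRENCY is found on the class via normal attribute lookup.
        return f"{self.owner}: {self.balance} {self.CURRENCY}"

alice = Account("alice")
bob = Account("bob", 10)
alice.deposit(50)

print(alice.describe())  # alice: 50 USD
print(bob.describe())    # bob: 10 USD
```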

Static variable vs class variable vs instance variable vs local variable

I believe that static and class variables are commonly used as synonyms.

What you say about the variables is correct from the convention point of view: this is how you should think about them most of the time.

However the above are just conventions: from the language point of view there is no distinction between class variables and instance variables.

Python is not like C++ or Java.

Everything is an object, including classes and integers:

 class C(object): pass
 print(id(C))
 C.a = 1
 assert C.__dict__['a'] == 1

There is no clear distinction between methods and instance variables: they are just attributes of an object.

Therefore, there is no language level distinction between instance variables and class variables: they are just attributes of different objects:

instance variables are attributes of the object (self)
class variables are attributes of the Class object.

The real magic happens in the order in which the . operator searches for attributes:

__dict__ of the object
__dict__ of the class of the object
MRO up to parent classes
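The search order listed above can be observed with a small experiment. The Base and Child classes are hypothetical, and the sketch deliberately ignores descriptors and __getattr__, which add further rules on top of this simplified picture.

```python
class Base:
    greeting = "from Base"

class Child(Base):
    pass

obj = Child()
print(obj.greeting)  # from Base      (found via the MRO)

Child.greeting = "from Child"
print(obj.greeting)  # from Child     (class of the object wins over parents)

obj.greeting = "from instance"
print(obj.greeting)  # from instance  (the instance __dict__ wins over all)

print([cls.__name__ for cls in Child.__mro__])  # ['Child', 'Base', 'object']
```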

You should read this great article before you get confused in the future.

Also beware of bound vs unbound methods.
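A short sketch of that distinction, written for Python 3, where accessing a method through the class gives a plain function rather than a Python 2 style "unbound method". The class C here is a hypothetical stand-in.

```python
class C:
    def method(self):
        return "called on " + type(self).__name__

c = C()

bound = c.method            # accessed via the instance: self is pre-filled
print(bound())              # called on C
print(bound.__self__ is c)  # True

plain = C.method            # accessed via the class: a plain function
print(plain(c))             # called on C  (instance passed explicitly)
```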

EDIT: attempt to address further questions by the OP made in his post.

Wow that was large! I'll try to read everything, but for the future you should try to keep questions more concise. More code, less talk =). You'll get better answers.

"should I just keep the idea in mind and not worry too much about reconciling my current point of view with that one until I become more experienced?": I do things.

I do as I feel necessary. When necessity calls, or I can't take magic behaviour anymore, I learn.

sort of imply that the Python documentation explains this point of view somewhere within it?

I don't know about the docs, but the language itself works that way.

Of course, the language was designed to give the impression that syntax works just like in C++ in the common cases, and it adds a thin layer of magic to classes to make it look like so.

But, since that is not how it truly works, you cannot account for all (useful) behaviour by thinking only in terms of C++ class syntax.

By using the code from the original post, is there maybe a sequence of commands that illustrates this point?

I'm not sure it can be illustrated in a sequence of commands. The point is: classes are objects, and the dot operator searches their attributes, following the MRO, in the same order as it does for any other object:

class C(object):
    i_static = 0
    def __init__(self):
        self.i = 1

# i is in the __dict__ of object c
c = C()
assert c.__dict__['i'] == 1
assert c.i == 1

# dot finds i_static because MRO looks at class
assert c.__class__.__dict__['i_static'] == 0
assert c.i_static == 0

# i_static is in the __dict__ of object C
assert C.__dict__['i_static'] == 0
assert C.i_static == 0

# __eq__ is found through type, which is the __class__ of C
# (in Python 3 it is inherited from object via type's MRO).
# `C, C` because the method is called unbound, with C passed explicitly.
assert C.__class__.__eq__(C, C)
assert C == C

are there just the two scopes of global and local involved in that program?

This is a point I don't know very clearly.

There is no program-wide global scope in Python, only module level.

Then there is a new local scope inside functions.

The rest is how the . looks for attributes.

can't pinpoint exactly what I was trying to ask

Ask: can I find a difference in syntax between classes, integers or functions?

If you think you have found one, ask: hmmm, how can I make an object with certain attributes that behaves just like that thing which does not look like an object?

You should find an answer every time.

Example:

def f(): pass

class C(object): pass

AHA: f is different than c = C() because I can do f() but not c()!

But then, no, it is just that the f.__class__.__dict__['__call__'] attribute is defined for f, and can be found via MRO.

But we can do that for c too:

class C(object):
    def __call__(self): pass

and now we can do c().

So they were not different in that aspect.

Class variable and instance variable in the constructor

The __init__ method of a class is simply the method that is called to initialize each new instance when it is created, hence the name. There isn't any special property of variables defined in __init__, except that they are defined first. In this case, the __init__ function is defining a local variable entry_gate and assigning it a value of 10. This is similar to:

def foo():
    entry_gate = 10

Within a class, this structure stays the same, with the addition that any self.xxx variables are tied to the instance of the class (defined by self) while non-self variables are not.
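The difference between the two kinds of names can be checked directly. The class name Employee and the argument values below are assumptions made for this sketch; only the __init__ signature and entry_gate come from the question.

```python
class Employee:
    def __init__(self, first, last, pay):
        entry_gate = 10     # local variable: gone when __init__ returns
        self.first = first  # instance attributes: stored on the object
        self.last = last
        self.pay = pay

e = Employee("Ada", "Lovelace", 100)

print(vars(e))                   # {'first': 'Ada', 'last': 'Lovelace', 'pay': 100}
print(hasattr(e, "entry_gate"))  # False
```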

In terms of the semicolon, there isn't much of a reason for it to exist in this code snippet. Semicolons can be used to denote a separation of statements, but they aren't required. In this instance, the semicolon has the effect of:

def __init__(self, first, last, pay):
    entry_gate = 10; self.first = first
    ...

For a more in-depth look at semicolons, I would point you to this question.

What is the difference between class and instance variables?

When you write a class block, you create class attributes (or class variables). All the names you assign in the class block, including methods you define with def become class attributes.

After a class instance is created, anything with a reference to the instance can create instance attributes on it. Inside methods, the "current" instance is almost always bound to the name self, which is why you are thinking of these as "self variables". Usually in object-oriented design, the code attached to a class is supposed to have control over the attributes of instances of that class, so almost all instance attribute assignment is done inside methods, using the reference to the instance received in the self parameter of the method.

Class attributes are often compared to static variables (or methods) as found in languages like Java, C#, or C++. However, if you want to aim for deeper understanding I would avoid thinking of class attributes as "the same" as static variables. While they are often used for the same purposes, the underlying concept is quite different. More on this in the "advanced" section below the line.

An example!

class SomeClass:
    def __init__(self):
        self.foo = 'I am an instance attribute called foo'
        self.foo_list = []

    bar = 'I am a class attribute called bar'
    bar_list = []

After executing this block, there is a class SomeClass, with 3 class attributes: __init__, bar, and bar_list.

Then we'll create an instance:

instance = SomeClass()

When this happens, SomeClass's __init__ method is executed, receiving the new instance in its self parameter. This method creates two instance attributes: foo and foo_list. Then this instance is assigned into the instance variable, so it's bound to a thing with those two instance attributes: foo and foo_list.

But:

print(instance.bar)

gives:

I am a class attribute called bar

How did this happen? When we try to retrieve an attribute through the dot syntax, and the attribute doesn't exist, Python goes through a bunch of steps to try and fulfill your request anyway. The next thing it will try is to look at the class attributes of the class of your instance. In this case, it found an attribute bar in SomeClass, so it returned that.

That's also how method calls work by the way. When you call mylist.append(5), for example, mylist doesn't have an attribute named append. But the class of mylist does, and it's bound to a method object. That method object is returned by the mylist.append bit, and then the (5) bit calls the method with the argument 5.
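The two steps described here (look up a bound method, then call it) can be separated explicitly. This is only a sketch using the built-in list type:

```python
mylist = [1, 2]

print('append' in vars(list))  # True -> append lives on the class

method = mylist.append         # step 1: lookup returns a bound method
print(method.__self__ is mylist)  # True
method(5)                      # step 2: call it with the argument 5

list.append(mylist, 6)         # same thing, going through the class

print(mylist)                  # [1, 2, 5, 6]
```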

The way this is useful is that all instances of SomeClass will have access to the same bar attribute. We could create a million instances, but we only need to store that one string in memory, because they can all find it.

But you have to be a bit careful. Have a look at the following operations:

sc1 = SomeClass()
sc1.foo_list.append(1)
sc1.bar_list.append(2)

sc2 = SomeClass()
sc2.foo_list.append(10)
sc2.bar_list.append(20)

print(sc1.foo_list)
print(sc1.bar_list)

print(sc2.foo_list)
print(sc2.bar_list)

What do you think this prints?

[1]
[2, 20]
[10]
[2, 20]

This is because each instance has its own copy of foo_list, so they were appended to separately. But all instances share access to the same bar_list. So when we did sc1.bar_list.append(2) it affected sc2, even though sc2 didn't exist yet! And likewise sc2.bar_list.append(20) affected the bar_list retrieved through sc1. This is often not what you want.

Advanced study follows. :)

To really grok Python, coming from traditional statically typed OO-languages like Java and C#, you have to learn to rethink classes a little bit.

In Java, a class isn't really a thing in its own right. When you write a class you're more declaring a bunch of things that all instances of that class have in common. At runtime, there's only instances (and static methods/variables, but those are really just global variables and functions in a namespace associated with a class, nothing to do with OO really). Classes are the way you write down in your source code what the instances will be like at runtime; they only "exist" in your source code, not in the running program.

In Python, a class is nothing special. It's an object just like anything else. So "class attributes" are in fact exactly the same thing as "instance attributes"; in reality there's just "attributes". The only reason for drawing a distinction is that we tend to use objects which are classes differently from objects which are not classes. The underlying machinery is all the same. This is why I say it would be a mistake to think of class attributes as static variables from other languages.

But the thing that really makes Python classes different from Java-style classes is that just like any other object each class is an instance of some class!

In Python, most classes are instances of a builtin class called type. It is this class that controls the common behaviour of classes, and makes all the OO stuff the way it does. The default OO way of having instances of classes that have their own attributes, and have common methods/attributes defined by their class, is just a protocol in Python. You can change most aspects of it if you want. If you've ever heard of using a metaclass, all that means is defining a class that is an instance of a class other than type.
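A minimal sketch of that idea in Python 3 syntax. The names Meta, Plain and Fancy are hypothetical; the only point is that a class is an instance of its metaclass and looks up attributes on it like any other object would.

```python
class Meta(type):
    # Defined on the metaclass, so it is available on classes
    # that are instances of Meta.
    def describe(cls):
        return "class " + cls.__name__

class Plain:
    pass

class Fancy(metaclass=Meta):
    pass

print(type(Plain) is type)      # True -> the default
print(type(Fancy) is Meta)      # True -> Fancy is an instance of Meta
print(isinstance(Fancy, type))  # True -> Meta is still a kind of type
print(Fancy.describe())         # class Fancy
```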

The only really "special" thing about classes (aside from all the builtin machinery to make them work the way they do by default), is the class block syntax, to make it easier for you to create instances of type. This:

class Foo(BaseFoo):
    def __init__(self, foo):
        self.foo = foo

    z = 28

is roughly equivalent to the following:

def __init__(self, foo):
    self.foo = foo

classdict = {'__init__': __init__, 'z': 28 }

Foo = type('Foo', (BaseFoo,), classdict)

And it will arrange for all the contents of classdict to become attributes of the object that gets created.

So then it becomes almost trivial to see that you can access a class attribute by Class.attribute just as easily as i = Class(); i.attribute. Both i and Class are objects, and objects have attributes. This also makes it easy to understand how you can modify a class after it's been created; just assign its attributes the same way you would with any other object!
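For instance, modifying a class after creation might look like the following sketch. Foo is reduced to a single attribute here, and double_z is a hypothetical method added purely for illustration.

```python
class Foo:
    z = 28

i = Foo()  # instance created BEFORE the class is modified

Foo.z = 99                              # rebind a class attribute
Foo.double_z = lambda self: self.z * 2  # add a method after the fact

print(i.z)           # 99  -> existing instances see the change
print(i.double_z())  # 198 -> and the new method
```

Nothing is copied into the instance when it is created; every lookup consults the class as it exists at that moment.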

In fact, instances have no particular special relationship with the class used to create them. The way Python knows which class to search for attributes that aren't found in the instance is by the hidden __class__ attribute. Which you can read to find out what class this is an instance of, just as with any other attribute: c = some_instance.__class__. Now you have a variable c bound to a class, even though it probably doesn't have the same name as the class. You can use this to access class attributes, or even call it to create more instances of it (even though you don't know what class it is!).
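A brief sketch of using __class__ this way; the Point class is a hypothetical example.

```python
class Point:
    scale = 2  # class attribute

    def __init__(self, x=0, y=0):
        self.x = x
        self.y = y

some_instance = Point(1, 2)

c = some_instance.__class__  # c is bound to the class object
print(c is Point)            # True
print(c.scale)               # 2 -> class attributes are reachable via c

another = c(3, 4)            # calling c creates a new instance
print(type(another) is Point, another.x, another.y)  # True 3 4
```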

And you can even assign to i.__class__ to change what class it is an instance of! If you do this, nothing in particular happens immediately. It's not earth-shattering. All that it means is that when you look up attributes that don't exist in the instance, Python will go look at the new contents of __class__. Since that includes most methods, and methods usually expect the instance they're operating on to be in certain states, this usually results in errors if you do it at random, and it's very confusing, but it can be done. If you're very careful, the thing you store in __class__ doesn't even have to be a class object; all Python's going to do with it is look up attributes under certain circumstances, so all you need is an object that has the right kind of attributes (some caveats aside where Python does get picky about things being classes or instances of a particular class).
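A small demonstration, with hypothetical Dog and Cat classes. It only covers the simple case of two ordinary Python classes with compatible layouts; Python rejects many other combinations.

```python
class Dog:
    def speak(self):
        return "Woof"

class Cat:
    def speak(self):
        return "Meow"

pet = Dog()
pet.name = "Rex"      # instance attribute, stored on pet itself
print(pet.speak())    # Woof

pet.__class__ = Cat   # only changes where missing attributes are looked up
print(pet.speak())    # Meow
print(pet.name)       # Rex -> instance attributes are unaffected
print(isinstance(pet, Cat), isinstance(pet, Dog))  # True False
```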

That's probably enough for now. Hopefully (if you've even read this far) I haven't confused you too much. Python is neat when you learn how it works. :)

In Python, do assignment operators access class or instance variables when passed as a default value in a class method definition?

The only way you can access an instance variable is as an attribute of self.

When you just refer to var, that's never an instance variable; it's always a local, enclosing, global, or builtin variable.

In your method definition:

def lookup(self, var=var):
    print(var)

… you have a parameter named var. Parameters are local variables. The print(var) inside the body prints that local variable.

What about this?

def lookup(self, var=var):
    var = var

Again, var is a local variable—a parameter. So, you're just assigning the current value of that local variable to the same variable. Which has no useful effect, but of course it's perfectly legal.

Where does the parameter's value come from? At function call time, if you pass an argument, that argument gets bound to the parameter; if you don't, it gets filled in with the default value.

OK, so where does the default value come from?

At function definition time (when the def statement is executed), var is looked up in the current scope—that is, the body of the class definition—and its value is stored as a default in the function object (it should be visible as foo.lookup.__defaults__[0]).

So, the default value is "value goes here".

Notice that it's not a closure capture or other reference to the class attribute. When the class statement is executed, it uses that same class body namespace to build the class's attributes, so you end up with foo.var as another name for the same value that's in foo.lookup.__defaults__[0]. But they're completely independent names for that value; you can reassign foo.var = 3, and the default value for lookup's parameter will still be "value goes here".
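This independence can be verified with a sketch along the following lines. The class body is reconstructed from the names used in this answer (foo, var, lookup), and lookup returns the value instead of printing it so that the result is easy to check.

```python
class foo:
    var = "value goes here"

    def lookup(self, var=var):  # default captured when def is executed
        return var

print(foo.lookup.__defaults__)   # ('value goes here',)

foo.var = 3                      # rebind the class attribute...
print(foo().lookup())            # value goes here  (default unchanged)
print(foo().lookup("explicit"))  # explicit
```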

So, to answer your specific questions:

Will it store it in the class variable, the instance variable, or a new variable with the same name?

None of the above. It stores it in a local variable that already exists, because it's a parameter.

Will it get the class variable, instance variable, or the method's variable?

If by "the method's variable" you mean the parameter, it's the last one.

How do I explicitly reference these variables?

The same way you explicitly reference anything else:

var is a local-enclosing-global-or-builtin variable.
self.var is an instance attribute, or a class attribute if there is no instance attribute.
type(self).var is a class attribute, even if there is an instance attribute.
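The three spellings can be compared side by side. This is a sketch with a hypothetical Demo class and a module-level var; note that the class body is not an enclosing scope for its methods, so a bare var inside show() resolves to the module-level name.

```python
var = "module-level"

class Demo:
    var = "class"

    def __init__(self):
        self.var = "instance"

    def show(self):
        # bare name, instance attribute, class attribute
        return var, self.var, type(self).var

d = Demo()
print(d.show())  # ('module-level', 'instance', 'class')

del d.var        # without the instance attribute, self.var falls back
print(d.show())  # ('module-level', 'class', 'class')
```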
