What's the Difference Between globals(), locals(), and vars()

What's the difference between globals(), locals(), and vars()?

Each of these returns a dictionary:

globals() always returns the dictionary of the module namespace
locals() always returns a dictionary of the current namespace
vars() returns either a dictionary of the current namespace (if called with no argument) or the dictionary of the argument.
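A minimal sketch of those three return values. The names report, Point, and scale are invented for illustration and are not part of the question:

```python
def report():
    scale = 3
    # Inside a function, vars() with no argument gives the same contents as locals().
    return locals(), vars(), globals()


class Point(object):
    def __init__(self, x, y):
        self.x = x
        self.y = y


local_ns, vars_ns, global_ns = report()
p = Point(1, 2)

print(local_ns == vars_ns)              # both describe report()'s local variables
print(global_ns is report.__globals__)  # the namespace of the defining module
print(vars(p) is p.__dict__)            # vars(obj) hands back obj.__dict__
```

The comparison of the two local dictionaries uses == rather than is, because whether repeated calls return the same dict object depends on the interpreter version.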

locals and vars could use some more explanation. If locals() is called inside a function, it updates a dict with the values of the current local variable namespace (plus any closure variables) as of that moment and returns it. Multiple calls to locals() in the same stack frame return the same dict each time - it's attached to the stack frame object as its f_locals attribute. The dict's contents are updated on each locals() call and each f_locals attribute access, but only on such calls or attribute accesses. It does not automatically update when variables are assigned, and assigning entries in the dict will not assign the corresponding local variables:

import inspect

def f():
    x = 1
    l = locals()
    print(l)
    locals()
    print(l)
    x = 2
    print(x, l['x'])
    l['x'] = 3
    print(x, l['x'])
    inspect.currentframe().f_locals
    print(x, l['x'])

f()

gives us:

{'x': 1}
{'x': 1, 'l': {...}}
2 1
2 3
2 2

The first print(l) only shows an 'x' entry, because the assignment to l happens after the locals() call. The second print(l), after calling locals() again, shows an l entry, even though we didn't save the return value. The third and fourth prints show that assigning variables doesn't update l and vice versa, but after we access f_locals, local variables are copied into locals() again.

Two notes:

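This behavior is CPython specific, and it describes CPython before 3.13; from 3.13 on (PEP 667), each locals() call in a function returns a fresh snapshot instead of the shared dict. Other Pythons may allow the updates to make it back to the local namespace automatically.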
In CPython 2.x it is possible to make this work by putting an exec "pass" line in the function. This switches the function to an older, slower execution mode that uses the locals() dict as the canonical representation of local variables.

If locals() is called outside a function it returns the actual dictionary that is the current namespace. Further changes to the namespace are reflected in the dictionary, and changes to the dictionary are reflected in the namespace:

class Test(object):
    a = 'one'
    b = 'two'
    huh = locals()
    c = 'three'
    huh['d'] = 'four'
    print(huh)

gives us:

{
  'a': 'one',
  'b': 'two',
  'c': 'three',
  'd': 'four',
  'huh': {...},
  '__module__': '__main__',
}

So far, everything I've said about locals() is also true for vars()... here's the difference: vars() accepts a single object as its argument, and if you give it an object it returns the __dict__ of that object. For a typical object, its __dict__ is where most of its attribute data is stored. This includes class variables and module globals:

class Test(object):
    a = 'one'
    b = 'two'
    def frobber(self):
        print(self.c)
t = Test()
huh = vars(t)
huh['c'] = 'three'
t.frobber()

which gives us:

three

Note that a function's __dict__ is its attribute namespace, not local variables. It wouldn't make sense for a function's __dict__ to store local variables, since recursion and multithreading mean there can be multiple calls to a function at the same time, each with their own locals:

def f(outer):
    if outer:
        f(False)
        print('Outer call locals:', locals())
        print('f.__dict__:', f.__dict__)
    else:
        print('Inner call locals:', locals())
        print('f.__dict__:', f.__dict__)

f.x = 3

f(True)

which gives us:

Inner call locals: {'outer': False}
f.__dict__: {'x': 3}
Outer call locals: {'outer': True}
f.__dict__: {'x': 3}

Here, f calls itself recursively, so the inner and outer calls overlap. Each one sees its own local variables when it calls locals(), but both calls see the same f.__dict__, and f.__dict__ doesn't have any local variables in it.

Difference between locals(), globals(), and dir() in Python

At global scope, both locals() and globals() return the same dictionary: the global namespace. Inside a function, locals() returns a dictionary holding a copy of the local namespace, whereas globals() still returns the global namespace (which contains the global names). So the difference between them is only visible inside a function. An example to show this -

>>> locals() == globals()  # global scope, that is, directly within the script (not inside a function)
True
>>> def a():
...     l = 1
...     print('locals() :',locals())
...     print('globals() :',globals())
...
>>> a()
locals() : {'l': 1}
globals() : {'BOTTOM': 'bottom', 'PROJECTING': 'proj....

From documentation of globals() -

globals()
Return a dictionary representing the current global symbol table. This is always the dictionary of the current module (inside a function or method, this is the module where it is defined, not the module from which it is called).

From documentation of locals() -

locals()
Update and return a dictionary representing the current local symbol table. Free variables are returned by locals() when it is called in function blocks, but not in class blocks.
Note: The contents of this dictionary should not be modified; changes may not affect the values of local and free variables used by the interpreter.

To answer the question about usage: one use is to access variables by name through a string. For example, if you have a variable named a and you want to access its value using the string 'a', you can use globals()['a'], which returns the value of the global variable a, or locals()['a'], which returns the value of a in the current namespace (the global namespace when directly inside the script, or the local namespace inside a function).
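As a rough sketch of that string-based lookup (the function lookup and the variable retries are made up for this example), a helper can check the local namespace first and fall back to the global one:

```python
def lookup(name):
    """Return the value bound to `name`, preferring this function's locals."""
    retries = 3                    # an ordinary local variable
    local_ns = dict(locals())      # copy taken now: holds 'name' and 'retries'
    if name in local_ns:
        return local_ns[name]
    return globals()[name]         # raises KeyError for unknown names


print(lookup("retries"))           # found among the locals
print(lookup("lookup") is lookup)  # found among the module globals
```

Reading values this way is fine; writing through locals() inside a function is a different matter and should not be relied on.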

dir() returns a list of attributes for the object passed in as its argument; without an argument it returns the list of names in the current local namespace (similar to locals().keys()). From the documentation of dir() -

dir([object])
Without arguments, return the list of names in the current local scope. With an argument, attempt to return a list of valid attributes for that object.
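A small sketch comparing the two forms of dir() with locals() and vars(). The names compare and Box are hypothetical:

```python
def compare():
    alpha = 1
    beta = 2
    # Both expressions are evaluated while only alpha and beta exist.
    return dir(), sorted(locals())


names_from_dir, names_from_locals = compare()
print(names_from_dir)
print(names_from_locals)


class Box(object):
    def __init__(self):
        self.width = 4


b = Box()
# With an argument, dir() also lists inherited attributes, so the keys of
# vars(b) are a subset of it.
print(set(vars(b)) <= set(dir(b)))
```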

Python3 globals() and locals() contents

Simple explanation

globals() refers to the current module's attribute dictionary.
locals() refers to the current local variables in your function/code-snippet.

Setting a variable will only ever change locals(). (Unless you tell python otherwise using the global or nonlocal keyword.)

Here is an example

By default on module-scope globals is the same dict as locals:

>>> globals() is locals()
True

Since globals is locals in this case, modifying the locals will also modify the globals.

If you create a function and look at locals in there, you will see that locals will differ

>>> def test():
...    print("globals is locals:", globals() is locals())
...    print("globals:", globals())
...    print("locals:", locals())
>>> test()
globals is locals: False
globals: {'__name__': '__main__', ...}
locals: {}

A new call to locals() picks up changes made to function-local variables since the previous call

>>> def test2():
...     print("locals 1:", locals())
...     x = 1
...     print("locals 2:", locals())
>>> test2()
locals 1: {}
locals 2: {'x': 1}

Something similar happens when creating new classes

>>> class Test:
...     print("locals:", locals())
locals: {'__module__': '__main__', '__qualname__': 'Test'}

More in-depth explanation

If you want to know why globals and locals are the way they are let's look at what happens under the hood of Python.

Some ground work

All Python code passes through what equates to the eval or exec function at some point. These functions accept three parameters: source, globals (defaults to the current globals) and locals (defaults to the current locals).

The functions globals() and locals() return whatever was passed into the eval or exec call described above.
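This can be checked directly by handing exec two dictionaries of our own and letting the executed code capture what it sees. The names global_ns, local_ns, seen_globals, and seen_locals are just for this sketch:

```python
global_ns = {}
local_ns = {}

source = (
    "seen_globals = globals()\n"
    "seen_locals = locals()\n"
)
# Top-level assignments in the executed code land in the locals dict.
exec(source, global_ns, local_ns)

print(local_ns["seen_globals"] is global_ns)  # globals() is the dict we passed
print(local_ns["seen_locals"] is local_ns)    # and so is locals()
print("seen_globals" in global_ns)            # nothing was assigned to globals
```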

What does the Python Shell do?

If you do

>>> print(globals())

The REPL will internally do something along the lines of

# This variable stores your globals.
_my_main_module = {}

def exec_some_line(line):
    # Passed positionally; older Python versions reject globals=/locals= keywords here.
    return eval(line, _my_main_module, _my_main_module)

# ...

exec_some_line("print(globals())")

As you can see, the Python Shell will at some point set globals and locals to the same dict.

Function execution

Internally, function execution will essentially do three things:

Parse the arguments passed to the function and add them to the local variables.
Execute the code of the function
Return its result.

Here is a pseudo-algorithm:

def __call__(*args, **kwargs):
    local_variables = parse_signature_with_args(args, kwargs)
    exec(function_source, function_globals, local_variables)
    return function_result
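To make the pseudo-algorithm concrete, here is a runnable emulation. It is only a sketch: emulate_call and greet are invented names, the body is passed as a source string, and the result is handed back through a variable called result because exec cannot return a value. Real CPython does not re-execute source text on each call.

```python
import inspect


def greet(name, punctuation="!"):
    return "Hello, " + name + punctuation


def emulate_call(func, body_source, *args, **kwargs):
    # 1. Bind the arguments to parameter names, as a call would.
    bound = inspect.signature(func).bind(*args, **kwargs)
    bound.apply_defaults()
    local_variables = dict(bound.arguments)
    # 2. Run the body with the function's globals and a fresh locals dict.
    exec(body_source, func.__globals__, local_variables)
    # 3. Hand back the result (stored under 'result' by convention here).
    return local_variables["result"]


body = 'result = "Hello, " + name + punctuation'
print(emulate_call(greet, body, "world"))
print(emulate_call(greet, body, "world", punctuation="?"))
```

Each emulated call gets its own local_variables dictionary, which is the point of the pseudo-algorithm: locals belong to a call, not to the function object.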

Creating new classes

When using the class statement, all indented code is executed separately:

A new dictionary is created that will act as locals().
Your code is executed with said locals.
The class is created, passing the locals in.

If you execute this code:

class Test:
   a = 5

This is approximately what happens:

# 1. A new dictionary is created
_dict = type.__prepare__()
_dict["__module__"] = __name__
_dict["__qualname__"] = "Test"

# 2. Execute the code (namespaces passed positionally)
exec("a = 5", globals(), _dict)

# 3. A class is created
Test = type("Test", (), _dict)

How this maps to module imports

If you import a module an intricate import mechanism starts. This is a simplified overview:

The interpreter will look if the module has already been imported.
The interpreter will find the file.
Then the file is read and parsed
A module object is created.
The Python script is executed and its globals and locals are set to the new module's __dict__ attribute.
The module object is returned.

It works something like this:

import sys
from types import ModuleType

def __import__(name):
    # 1. See if module is already imported
    if name in sys.modules:
        return sys.modules[name]

    # 2. Find file (find_out_path_to_file is a placeholder, not a real function).
    filename = find_out_path_to_file(name)

    # 3. Read and parse file
    with open(filename) as f:
        script = f.read()

    # 4. Create the new module
    module = ModuleType(name)

    # 5. Execute the code of the module.
    # The same dict is passed positionally as both globals and locals.
    exec(script, module.__dict__, module.__dict__)

    # 6. Return the new module.
    return module
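A self-contained variant that can actually be run, assuming the source comes from a string instead of a file so that no path lookup is needed. load_module_from_source and the module name demo_ns_module are made up for this sketch, and it skips most of what the real import system does:

```python
import sys
from types import ModuleType


def load_module_from_source(name, source):
    # Reuse the module if it was loaded before.
    if name in sys.modules:
        return sys.modules[name]
    module = ModuleType(name)
    # Register first, then run the code with the module's __dict__ as
    # its global (and therefore module-level local) namespace.
    sys.modules[name] = module
    exec(source, module.__dict__)
    return module


demo = load_module_from_source(
    "demo_ns_module",
    "x = 1\n"
    "def get_globals():\n"
    "    return globals()\n",
)

print(demo.x)
print(demo.get_globals() is vars(demo))  # globals() inside is the module dict
```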

Any difference between dir() and locals() in Python?

The output of dir() when called without arguments is almost the same as the keys of locals(), but dir() returns a list of strings while locals() returns a dictionary. At module scope (as in the session below) you can update that dictionary to add new variables.

dir(...)
    dir([object]) -> list of strings

    If called without an argument, return the names in the current scope.

locals(...)
    locals() -> dictionary

    Update and return a dictionary containing the current scope's local variables.

Type:

>>> type(locals())
<type 'dict'>
>>> type(dir())
<type 'list'>

Update or add new variables using locals():

In [2]: locals()['a']=2

In [3]: a
Out[3]: 2

using dir(), however, this doesn't work:

In [7]: dir()[-2]
Out[7]: 'a'

In [8]: dir()[-2]=10

In [9]: dir()[-2]
Out[9]: 'a'

In [10]: a
Out[10]: 2

Why set is not in locals, globals or vars dictionaries

You need to quote the name when checking.

>>> my_set = set()
>>> locals
<built-in function locals>
>>> locals()
{'__builtins__': <module '__builtin__' (built-in)>, '__name__': '__main__', 'my_set': set([]), '__doc__': None, '__package__': None}
>>> 'my_set' in locals()
True
>>>

about pandasql locals() and globals() method issue

locals() and globals() are Python built-in functions that return the corresponding namespace.

In Python, a namespace is a way to implement scope. The global namespace corresponds to the global scope, so variables (names) defined there are visible throughout the module.

The local namespace is the namespace that is local to a particular function.

globals() returns a dictionary representing the current global namespace.

What locals() returns depends on where it is called. When called directly at script scope (not inside a function), it returns the same dictionary as globals(), that is, the global namespace. When called inside a function, it returns the local namespace.

In pandasql, the second argument you need to pass is this namespace (dictionary) containing the variables that you use in the query. That is, suppose you create a DataFrame called a and then write your query on it. pandasql then needs to know which DataFrame corresponds to the name a; for this it needs the local or global namespace, and that is what the second argument is for.

So you need to decide what to pass in. For example, if your DataFrame is only defined inside a function and does not exist in global scope, you need to pass in the dictionary returned by locals(). If your DataFrame exists in global scope, you need to pass in the result of globals().
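The choice can be illustrated without pandasql itself. In this sketch, resolve_names is a hypothetical stand-in for any library call that receives a namespace dictionary and looks names up in it; it is not pandasql's API:

```python
def resolve_names(names, env):
    """Hypothetical stand-in for a library call that takes a namespace dict."""
    return {name: env[name] for name in names if name in env}


def query_inside_function():
    orders = [("widget", 2), ("gadget", 5)]  # exists only in this function
    found_in_locals = resolve_names(["orders"], locals())
    found_in_globals = resolve_names(["orders"], globals())
    return found_in_locals, found_in_globals


local_hit, global_hit = query_inside_function()
print(local_hit)   # the name is resolved through locals()
print(global_hit)  # nothing is found through globals()
```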

Writing to locals() works, in contrast to the documentation saying it does not

Where are you reading this? Both Py 2 docs and Py 3 docs have the following disclaimer:

Note: The contents of this dictionary should not be modified; changes may not affect the values of local and free variables used by the interpreter.

This shows exactly what this is: an implementation detail. Sure, it works in CPython, but it might not work in the various other interpreters, like IronPython and Jython. It's what would be called a hack.

Do not rely on it updating any variables. Do not even try to do it for anything serious, as it causes undefined behaviour.

In CPython 3.6.0, help(locals) has the following note:

NOTE: Whether or not updates to this dictionary will affect name lookups in
the local scope and vice-versa is *implementation dependent* and not
covered by any backwards compatibility guarantees.

CPython 2.7.13 has no such note, however.
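A short sketch of the two cases, as observed on CPython (try_to_rebind, created, and seen are names invented here). Inside a function the write goes to a dictionary and the variable is untouched; in module-style code run through exec, the dictionary is the namespace, so the write is visible:

```python
def try_to_rebind():
    value = 1
    locals()["value"] = 2  # writes to a dict, not to the local variable
    return value


namespace = {}
# With a single dict, exec uses it as both globals and locals.
exec("locals()['created'] = 5\nseen = created\n", namespace)

print(try_to_rebind())    # the local variable kept its original value
print(namespace["seen"])  # the module-level write was picked up
```

The second case is exactly the kind of thing the documentation note declines to guarantee, so treat it as an observation and not as a contract.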

Python 2.7 - local vs global vars

... what I don't understand is, when I call run and print s outside the function, why is it printing the local variable instead of the global one.

There is no local s within the function. The global s statement causes the Python VM to use the s in the global scope even when binding it.
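A minimal sketch of that effect. The function names are made up, since the asker's code is not shown here:

```python
s = "module-level value"


def run():
    global s          # bind the module-level name, do not create a local
    s = "rebound by run()"


def shadow():
    s = "local only"  # no global statement: this s is a new local
    return s


def read_s():
    return s          # plain read: falls through to the global scope


run()
shadow()
print(read_s())
```

After run() the module-level s has changed, while shadow() leaves it alone because its assignment created a separate local.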

Performance with global variables vs local

Locals should be faster

According to this page on locals and globals:

When a line of code asks for the value of a variable x, Python will search for that variable in all the available namespaces, in order:

local namespace - specific to the current function or class method. If the function defines a local variable x, or has an argument x, Python will use this and stop searching.

global namespace - specific to the current module. If the module has defined a variable, function, or class called x, Python will use that and stop searching.

built-in namespace - global to all modules. As a last resort, Python will assume that x is the name of built-in function or variable.

Based on that, I'd assume that local variables are generally faster. My guess is what you're seeing is something particular about your script.
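The search order can be seen by shadowing a built-in at each level. This is only a sketch: the global case runs inside a separate dictionary via exec so that the real abs is not overwritten in the surrounding module, and all function names are invented:

```python
def from_builtins():
    return abs(-2)  # not local, not global: found in the built-in namespace


def from_local():
    abs = lambda value: "local abs"  # a local name wins over everything else
    return abs(-2)


module_ns = {}
exec(
    "abs = lambda value: 'global abs'\n"
    "def from_global():\n"
    "    return abs(-2)\n",
    module_ns,
)

print(from_local())
print(module_ns["from_global"]())  # the global name hides the built-in
print(from_builtins())
```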

Locals are faster

Here's a trivial example using a local variable, which takes about 0.5 seconds on my machine (0.3 in Python 3):

def func():
    for i in range(10000000):
        x = 5

func()

And the global version, which takes about 0.7 (0.5 in Python 3):

def func():
    global x
    for i in range(10000000):
        x = 5

func()
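If you want to repeat the comparison yourself, a sketch along these lines keeps both versions in one script and uses timeit. The function names, the global name shared_x, and the loop counts are my own choices, and the numbers will differ between machines and Python versions:

```python
import timeit


def assign_local(n=100000):
    for _ in range(n):
        x = 5  # stored in a fast local slot


def assign_global(n=100000):
    global shared_x
    for _ in range(n):
        shared_x = 5  # stored in the module's global dictionary


# Take the best of a few repeats to reduce noise from other processes.
local_seconds = min(timeit.repeat(assign_local, number=10, repeat=3))
global_seconds = min(timeit.repeat(assign_global, number=10, repeat=3))

print("local:  %.4f s" % local_seconds)
print("global: %.4f s" % global_seconds)
```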

`global` does something weird to variables that are already global

Interestingly, this version runs in 0.8 seconds:

global x
x = 5
for i in range(10000000):
    x = 5

While this runs in 0.9:

x = 5
for i in range(10000000):
    x = 5

You'll notice that in both cases, x is a global variable (since there are no functions), and they're both slower than using locals. I have no clue why declaring global x helped in this case.

This weirdness doesn't occur in Python 3 (both versions take about 0.6 seconds).
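One thing that can be checked is the bytecode. In CPython, a plain module-level assignment compiles to STORE_NAME, while the same assignment after a global declaration compiles to STORE_GLOBAL. Whether that accounts for the timing gap above is a guess on my part, not something measured here:

```python
import dis

plain = compile("x = 5", "<plain>", "exec")
declared = compile("global x\nx = 5", "<declared>", "exec")

plain_ops = [ins.opname for ins in dis.get_instructions(plain)]
declared_ops = [ins.opname for ins in dis.get_instructions(declared)]

print(plain_ops)     # contains STORE_NAME
print(declared_ops)  # contains STORE_GLOBAL
```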

Better optimization methods

If you want to optimize your program, the best thing you can do is profile it. This will tell you what's taking the most time, so you can focus on that. Your process should be something like:

Run your program with profiling on.
Look at the profile in KCacheGrind or a similar program to determine which functions are taking the most time.
In those functions:
- Look for places where you can cache results of functions (so you don't have to do as much work).
- Look for algorithmic improvements like replacing recursive functions with closed-form functions, or replacing list searches with dictionaries.
- Re-profile to make sure the function is still a problem.
- Consider using multiprocessing.
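For the first two steps, the standard library is enough to get started. This sketch profiles a deliberately simple function (slow_sum is a made-up workload) with cProfile and prints the most expensive entries with pstats, as a text alternative to a graphical viewer:

```python
import cProfile
import io
import pstats


def slow_sum(n):
    return sum(i * i for i in range(n))


profiler = cProfile.Profile()
profiler.enable()
slow_sum(10000)
profiler.disable()

# Collect the report in a string and show the top entries by cumulative time.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats("cumulative").print_stats(5)
report = stream.getvalue()
print(report)
```

For a whole script, running python -m cProfile -o out.prof yourscript.py writes the same data to a file that other tools can read.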