What are named tuples in Python?
Named tuples are basically easy-to-create, lightweight object types. Named tuple instances can be referenced using object-like variable dereferencing or the standard tuple syntax. They can be used similarly to struct or other common record types, except that they are immutable. They were added in Python 2.6 and Python 3.0, although there is a recipe for implementation in Python 2.4.
For example, it is common to represent a point as a tuple (x, y). This leads to code like the following:
pt1 = (1.0, 5.0)
pt2 = (2.5, 1.5)
from math import sqrt
line_length = sqrt((pt1[0]-pt2[0])**2 + (pt1[1]-pt2[1])**2)
Using a named tuple it becomes more readable:
from collections import namedtuple
Point = namedtuple('Point', 'x y')
pt1 = Point(1.0, 5.0)
pt2 = Point(2.5, 1.5)
from math import sqrt
line_length = sqrt((pt1.x-pt2.x)**2 + (pt1.y-pt2.y)**2)
However, named tuples are still backwards compatible with normal tuples, so the following will still work:
Point = namedtuple('Point', 'x y')
pt1 = Point(1.0, 5.0)
pt2 = Point(2.5, 1.5)
from math import sqrt
# use index referencing
line_length = sqrt((pt1[0]-pt2[0])**2 + (pt1[1]-pt2[1])**2)
# use tuple unpacking
x1, y1 = pt1
Thus, you should use named tuples instead of tuples anywhere you think object notation will make your code more pythonic and more easily readable. I personally have started using them to represent very simple value types, particularly when passing them as parameters to functions. It makes the functions more readable, because the meaning of each field is clear without seeing the context of the tuple packing.
Furthermore, you can also use them to replace ordinary immutable classes that have only fields and no methods. You can even use your named tuple types as base classes:
class Point(namedtuple('Point', 'x y')):
    [...]
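As a minimal sketch of what such a subclass might look like (the distance_to method is a hypothetical helper, not part of the original answer), a method can be added on top of the generated fields:

```python
from collections import namedtuple
from math import sqrt


class Point(namedtuple('Point', 'x y')):
    # An empty __slots__ avoids a per-instance __dict__,
    # so instances stay as small as plain tuples.
    __slots__ = ()

    def distance_to(self, other):
        """Euclidean distance to another point (hypothetical helper)."""
        return sqrt((self.x - other.x) ** 2 + (self.y - other.y) ** 2)


pt1 = Point(1.0, 5.0)
pt2 = Point(4.0, 1.0)
print(pt1.distance_to(pt2))  # 5.0
```

The subclass is still a tuple, so indexing and unpacking keep working as before.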
However, as with tuples, attributes in named tuples are immutable:
>>> Point = namedtuple('Point', 'x y')
>>> pt1 = Point(1.0, 5.0)
>>> pt1.x = 2.0
AttributeError: can't set attribute
If you want to be able to change the values, you need another type. There is a handy recipe for mutable recordtypes which allows you to set new values to attributes.
>>> from rcdtype import *
>>> Point = recordtype('Point', 'x y')
>>> pt1 = Point(1.0, 5.0)
>>> pt1.x = 2.0
>>> print(pt1[0])
2.0
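If a modified copy is enough and in-place mutation is not really needed, the standard library's _replace method may be an alternative to a mutable record type. A small sketch:

```python
from collections import namedtuple

Point = namedtuple('Point', 'x y')
pt1 = Point(1.0, 5.0)

# _replace returns a new instance; pt1 itself is left untouched.
pt2 = pt1._replace(x=2.0)
print(pt1)  # Point(x=1.0, y=5.0)
print(pt2)  # Point(x=2.0, y=5.0)
```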
I am not aware of any form of "named list" that lets you add new fields, however. You may just want to use a dictionary in this situation. Named tuples can be converted to dictionaries using pt1._asdict(), which returns {'x': 1.0, 'y': 5.0} and can be operated upon with all the usual dictionary functions.
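A short sketch of the round trip, assuming the Point type from the examples above. On older Python versions _asdict() returns an OrderedDict rather than a plain dict, so the sketch wraps it in dict() to get the same result everywhere:

```python
from collections import namedtuple

Point = namedtuple('Point', 'x y')
pt1 = Point(1.0, 5.0)

# Convert to a dictionary; dict() normalizes the OrderedDict of older Pythons.
d = dict(pt1._asdict())
print(d)  # {'x': 1.0, 'y': 5.0}

# The dictionary accepts new keys, which the named tuple does not.
d['z'] = 0.0

# Going back requires keeping only the fields the named tuple defines.
pt_again = Point(**{name: d[name] for name in Point._fields})
print(pt_again)  # Point(x=1.0, y=5.0)
```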
As already noted, you should check the documentation for more information from which these examples were constructed.
Dictionary of (named) tuples in Python and speed/RAM performance
Cython's cdef-classes might be what you want: they use less memory than pure Python classes, although this comes at the cost of more overhead when accessing members (because fields are stored as C-values and not Python-objects).
For example:
%%cython
cdef class CTuple:
    cdef public unsigned long long int id
    cdef public str name
    cdef public bint isvalid

    def __init__(self, id, name, isvalid):
        self.id = id
        self.name = name
        self.isvalid = isvalid
which can be used as wished:
ob=CTuple(1,"mmm",3)
ob.id, ob.name, ob.isvalid # evaluates to (1, 'mmm', True): the bint field stores 3 as True
Timings/memory consumption:
First, the baseline on my machine:
0.258 s 252.4 MB # tuples
0.343 s 417.5 MB # dict
1.181 s 264.0 MB # namedtuple collections
with CTuple we get:
0.306 s 191.0 MB
which is almost as fast as plain tuples and needs considerably less memory.
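The benchmark behind these numbers is not shown here. As a rough sketch of how such a comparison could be measured with the standard library only, the following builds a dictionary of records in three ways and reports time and peak traced memory. The record layout, the size N, and the helper names are assumptions for illustration, and tracemalloc itself slows the run down, so absolute values will not match the figures above:

```python
import time
import tracemalloc
from collections import namedtuple

# Hypothetical record layout and size, chosen only for illustration.
Record = namedtuple('Record', 'id name isvalid')
N = 100_000


def build_tuples():
    return {i: (i, 'mmm', True) for i in range(N)}


def build_dicts():
    return {i: {'id': i, 'name': 'mmm', 'isvalid': True} for i in range(N)}


def build_namedtuples():
    return {i: Record(i, 'mmm', True) for i in range(N)}


def measure(builder):
    """Return (seconds, peak bytes, number of records) for one builder."""
    tracemalloc.start()
    start = time.perf_counter()
    data = builder()
    elapsed = time.perf_counter() - start
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return elapsed, peak, len(data)


for builder in (build_tuples, build_dicts, build_namedtuples):
    elapsed, peak, size = measure(builder)
    print(f"{builder.__name__}: {elapsed:.3f} s, "
          f"{peak / 1e6:.1f} MB peak, {size} records")
```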
If the C-type of members isn't clear at compile time, one could use simple python-objects:
%%cython
cdef class PTuple:
    cdef public object id
    cdef public object name
    cdef public object isvalid

    def __init__(self, id, name, isvalid):
        self.id = id
        self.name = name
        self.isvalid = isvalid
The timings are a little bit surprising:
0.648 s 249.8 MB
I didn't expect it to be so much slower than the CTuple-version, but at least it is almost twice as fast as named tuples.
One disadvantage of this approach is that it needs compilation. Cython however offers cython.inline, which can be used to compile Cython-code created on-the-fly.
I've released cynamedtuple, which can be installed via pip install cynamedtuple, and is based on the prototype below:
import cython

# for generation of cython code:
tab = "    "

def create_members_definition(name_to_ctype):
    members = []
    for my_name, my_ctype in name_to_ctype.items():
        members.append(tab+"cdef public "+my_ctype+" "+my_name)
    return members

def create_signature(names):
    return tab + "def __init__(self,"+", ".join(names)+"):"

def create_initialization(names):
    inits = [tab+tab+"self."+x+" = "+x for x in names]
    return inits

def create_cdef_class_code(classname, names):
    code_lines = ["cdef class " + classname + ":"]
    code_lines.extend(create_members_definition(names))
    code_lines.append(create_signature(names.keys()))
    code_lines.extend(create_initialization(names.keys()))
    return "\n".join(code_lines)+"\n"

# utilize cython.inline to generate and load pyx-module:
def create_cnamedtuple_class(classname, names):
    code = create_cdef_class_code(classname, names)
    code = code + "GenericClass = " + classname +"\n"
    ret = cython.inline(code)
    return ret["GenericClass"]
Which can be used as follows, to dynamically define CTuple from above:
CTuple = create_cnamedtuple_class("CTuple",
                                  {"id":"unsigned long long int",
                                   "name":"str",
                                   "isvalid":"bint"})
ob = CTuple(1,"mmm",3)
...
Another alternative could be to use jit-compilation and Numba's jitted-classes which offer this possibility. They however seem to be much slower:
# Note: in Numba 0.49 and later, jitclass is imported from numba.experimental instead.
from numba import jitclass, types

spec = [
    ('id', types.uint64),
    ('name', types.string),
    ('isvalid', types.uint8),
]

@jitclass(spec)
class NBTuple(object):
    def __init__(self, id, name, isvalid):
        self.id = id
        self.name = name
        self.isvalid = isvalid
and the results are:
20.622 s 394.0 MB
so numba jitted classes are not (yet?) a good choice.
NamedTuples, Hashable and Python
NamedTuple is based on the tuple class. See collections.namedtuple().
The hash of a tuple is the combined hash of all the elements. See tupleobject.c.
Since set is unhashable, it is not possible to hash a tuple or NamedTuple containing a set.
And since the hashing of a set is implemented in C, you don't see the traceback.
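A small sketch of this behaviour, using a hypothetical Tagged type: hashing fails while a field holds a set, and works once the set is replaced by a frozenset.

```python
from collections import namedtuple

# Hypothetical record type used only for this illustration.
Tagged = namedtuple('Tagged', 'name tags')

mutable = Tagged('a', {'x', 'y'})
try:
    hash(mutable)
except TypeError as exc:
    # The tuple hash fails because one of its elements is unhashable.
    print('unhashable:', exc)

# A frozenset is hashable, so the whole named tuple becomes hashable too.
frozen = Tagged('a', frozenset({'x', 'y'}))
print(frozen in {frozen})  # True
```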
Can namedtuples be used in set()?
Yes, they are hashable, and can be used in sets, like tuples.
The gotcha is that a tuple of mutable objects can change underneath you.
Tuples composed of immutable objects are safe in this regard.
Not sure it is a gotcha, but it is worth noting @user2357112supportsMonica's remark in the comments:
The other gotcha is that a namedtuple is still a tuple, and its __hash__ and __eq__ are ordinary tuple hash and equality. A namedtuple will compare equal to ordinary tuples or instances of unrelated namedtuple classes if the contents match.
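Both gotchas can be seen in a short sketch. The Size and Holder types are made up for the illustration:

```python
from collections import namedtuple

Point = namedtuple('Point', 'x y')
Size = namedtuple('Size', 'width height')  # unrelated type, same shape

p = Point(1, 2)

# Equality and hashing are plain tuple equality and hashing.
print(p == (1, 2))        # True
print(p == Size(1, 2))    # True
print(len({p, (1, 2), Size(1, 2)}))  # 1, all three collapse into one element

# A field holding a mutable object can change underneath you.
Holder = namedtuple('Holder', 'items')
h = Holder([1])
h.items.append(2)
print(h)  # Holder(items=[1, 2])
```

If distinct types must not collapse in a set, one option is to store (type(obj), obj) pairs instead of the bare instances.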
When and why should I use a namedtuple instead of a dictionary?
In dicts, only the keys have to be hashable, not the values. namedtuples don't have keys, so hashability isn't an issue.
However, they have a more stringent restriction -- their key-equivalents, "field names", have to be strings that are valid Python identifiers (not keywords, and not starting with an underscore).
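The field-name restriction can be illustrated with a brief sketch. The field names below are made up; rename=True is the standard-library option that replaces invalid names with positional ones:

```python
from collections import namedtuple

# A keyword and a name containing a space are both rejected.
try:
    namedtuple('Bad', ['class', 'my field'])
except ValueError as exc:
    print('rejected:', exc)

# With rename=True, invalid names are replaced by _<index>.
Row = namedtuple('Row', ['class', 'my field', 'ok'], rename=True)
print(Row._fields)  # ('_0', '_1', 'ok')
```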
Basically, if you were going to create a bunch of instances of a class like:
class Container:
    def __init__(self, name, date, foo, bar):
        self.name = name
        self.date = date
        self.foo = foo
        self.bar = bar

mycontainer = Container(name, date, foo, bar)
and not change the attributes after you set them in __init__, you could instead use
Container = namedtuple('Container', ['name', 'date', 'foo', 'bar'])
mycontainer = Container(name, date, foo, bar)
as a replacement.
Of course, you could create a bunch of dicts where you used the same keys in each one, but assuming you will have only valid Python identifiers as keys and don't need mutability,
mynamedtuple.fieldname
is prettier than
mydict['fieldname']
and
mynamedtuple = MyNamedTuple(firstvalue, secondvalue)
is prettier than
mydict = {'fieldname': firstvalue, 'secondfield': secondvalue}
Finally, namedtuples are ordered and support positional indexing and unpacking, so you always get the items in the order you defined the fields. Regular dicts only guarantee insertion order from Python 3.7 onward and cannot be indexed by position.
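A brief sketch of the difference, with made-up values for the Container fields used above:

```python
from collections import namedtuple

Container = namedtuple('Container', ['name', 'date', 'foo', 'bar'])
c = Container('box', '2020-12-24', 1, 2)  # made-up values

# Positional unpacking and indexing follow the declared field order.
name, date, foo, bar = c
print(c[0], c[-1])        # box 2
print(Container._fields)  # ('name', 'date', 'foo', 'bar')

# The equivalent dict is accessed by key only; d[0] would raise KeyError.
d = {'name': 'box', 'date': '2020-12-24', 'foo': 1, 'bar': 2}
print(d['name'], c.name)  # box box
```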
How are named tuples implemented internally in Python?
Actually, it's very easy to find out how a given namedtuple is implemented: if you pass the keyword argument verbose=True when creating it, its class definition is printed (the verbose parameter was removed in Python 3.7, so this only works on older versions):
>>> Point = namedtuple('Point', "x y", verbose=True)
from builtins import property as _property, tuple as _tuple
from operator import itemgetter as _itemgetter
from collections import OrderedDict

class Point(tuple):
    'Point(x, y)'

    __slots__ = ()

    _fields = ('x', 'y')

    def __new__(_cls, x, y):
        'Create new instance of Point(x, y)'
        return _tuple.__new__(_cls, (x, y))

    @classmethod
    def _make(cls, iterable, new=tuple.__new__, len=len):
        'Make a new Point object from a sequence or iterable'
        result = new(cls, iterable)
        if len(result) != 2:
            raise TypeError('Expected 2 arguments, got %d' % len(result))
        return result

    def _replace(_self, **kwds):
        'Return a new Point object replacing specified fields with new values'
        result = _self._make(map(kwds.pop, ('x', 'y'), _self))
        if kwds:
            raise ValueError('Got unexpected field names: %r' % list(kwds))
        return result

    def __repr__(self):
        'Return a nicely formatted representation string'
        return self.__class__.__name__ + '(x=%r, y=%r)' % self

    @property
    def __dict__(self):
        'A new OrderedDict mapping field names to their values'
        return OrderedDict(zip(self._fields, self))

    def _asdict(self):
        '''Return a new OrderedDict which maps field names to their values.
           This method is obsolete.  Use vars(nt) or nt.__dict__ instead.
        '''
        return self.__dict__

    def __getnewargs__(self):
        'Return self as a plain tuple.  Used by copy and pickle.'
        return tuple(self)

    def __getstate__(self):
        'Exclude the OrderedDict from pickling'
        return None

    x = _property(_itemgetter(0), doc='Alias for field number 0')

    y = _property(_itemgetter(1), doc='Alias for field number 1')
So, it's a subclass of tuple with some extra methods to give it the required behaviour, a _fields class-level constant containing the field names, and property methods for attribute access to the tuple's members.
As for the code behind actually building this class definition, that's deep magic.
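To make the idea less magical, here is a heavily simplified sketch of a factory that builds such a class with type(). It is not how the standard library does it; simple_namedtuple is a hypothetical name, and validation, _make, _replace and pickling support are all left out:

```python
from operator import itemgetter


def simple_namedtuple(typename, field_names):
    """Build a tuple subclass with named accessors (simplified sketch)."""
    fields = tuple(field_names.split())

    def __new__(cls, *args):
        if len(args) != len(fields):
            raise TypeError(
                f'Expected {len(fields)} arguments, got {len(args)}')
        return tuple.__new__(cls, args)

    def __repr__(self):
        body = ', '.join(
            f'{name}={value!r}' for name, value in zip(fields, self))
        return f'{typename}({body})'

    namespace = {
        '__slots__': (),       # no per-instance __dict__
        '_fields': fields,
        '__new__': __new__,
        '__repr__': __repr__,
    }
    # One read-only property per field, each fetching a fixed tuple index.
    for index, name in enumerate(fields):
        namespace[name] = property(
            itemgetter(index), doc=f'Alias for field number {index}')
    return type(typename, (tuple,), namespace)


Point = simple_namedtuple('Point', 'x y')
p = Point(1.0, 5.0)
print(p, p.x, p[1])  # Point(x=1.0, y=5.0) 1.0 5.0
```

Because the properties have no setter, assigning to p.x raises AttributeError, which mirrors the immutability of real named tuples.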
list of namedtuples; how to calculate the sum of individual elements
If you don't care what value the non-numeric values will have, you could use zip to form a new tuple with the totals:
R = Trade(*((max,sum)[isinstance(v[0],(int,float))](v)
            for v in zip(*listofstuff)))
print(R)
Trade(Ticker='foo', Date='{2020:12:24}', QTY=150, Sell=550.0, Buy=600.0, Profit=-50.0)
Alternatively, you could place a None value in the non-total fields:
R = Trade(*(sum(v) if isinstance(v[0],(int,float)) else None
            for v in zip(*listofstuff)))
print(R)
Trade(Ticker=None, Date=None, QTY=150, Sell=550.0, Buy=600.0, Profit=-50.0)
If the aggregation type varies with each field (e.g. min for some fields, sum for others, etc), you can prepare a dictionary of aggregation functions and use it in the comprehension:
aggregate = {'QTY':sum, 'Sell':min, 'Buy':max, 'Profit':lambda v:sum(v)/len(v)}
R = Trade(*(aggregate.get(f,lambda _:None)(v)
            for f,v in zip(Trade._fields,zip(*listofstuff))))
print(R)
Trade(Ticker=None, Date=None, QTY=150, Sell=50.0, Buy=500.0, Profit=-25.0)
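The snippets above rely on Trade and listofstuff from the question, which are not reproduced here. A self-contained sketch, with two made-up trades chosen so that the results line up with the outputs shown above, might look like this:

```python
from collections import namedtuple

Trade = namedtuple('Trade', 'Ticker Date QTY Sell Buy Profit')

# Made-up sample data, chosen only to match the outputs shown above.
listofstuff = [
    Trade('foo', '{2020:12:24}', 100, 50.0, 100.0, -50.0),
    Trade('foo', '{2020:12:24}', 50, 500.0, 500.0, 0.0),
]

# zip(*listofstuff) transposes the rows into one tuple per field.
totals = Trade(*(sum(v) if isinstance(v[0], (int, float)) else None
                 for v in zip(*listofstuff)))
print(totals)
# Trade(Ticker=None, Date=None, QTY=150, Sell=550.0, Buy=600.0, Profit=-50.0)

# Per-field aggregation functions; fields without an entry become None.
aggregate = {'QTY': sum, 'Sell': min, 'Buy': max,
             'Profit': lambda v: sum(v) / len(v)}
mixed = Trade(*(aggregate.get(f, lambda _: None)(v)
                for f, v in zip(Trade._fields, zip(*listofstuff))))
print(mixed)
# Trade(Ticker=None, Date=None, QTY=150, Sell=50.0, Buy=500.0, Profit=-25.0)
```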
Python namedtuples elementwise addition
This is a possible solution:
class Point(namedtuple("Point", "x y")):
    def __add__(self, other):
        return Point(**{field: getattr(self, field) + getattr(other, field)
                        for field in self._fields})
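A slightly more compact variant, sketched here as one possible alternative, pairs the fields positionally with map and operator.add. Note that either version overrides the usual tuple concatenation for the + operator:

```python
from collections import namedtuple
from operator import add


class Point(namedtuple('Point', 'x y')):
    __slots__ = ()

    def __add__(self, other):
        # type(self) keeps subclasses working; map pairs fields by position.
        return type(self)(*map(add, self, other))


print(Point(1, 2) + Point(3, 4))  # Point(x=4, y=6)
```

Because the pairing is positional, the right-hand operand may also be a plain tuple of the same length.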