How to group a list of tuples/objects by similar index/attribute in python?
collections.defaultdict is the usual way to do this. The for loop is still needed, but the if statements that check whether a key already exists are not.
from collections import defaultdict

groups = defaultdict(list)
for obj in old_list:
    groups[obj.some_attr].append(obj)

# In Python 3, values() returns a view; wrap it in list() to get a real list.
new_list = list(groups.values())
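A minimal runnable sketch of the same idea. The Pet class, its species attribute, and the sample data are hypothetical and only stand in for whatever objects and attribute you are grouping by:

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Pet:  # hypothetical class, for illustration only
    name: str
    species: str


old_list = [Pet("Rex", "dog"), Pet("Tom", "cat"), Pet("Fido", "dog")]

groups = defaultdict(list)
for obj in old_list:
    # A missing key is created with an empty list automatically.
    groups[obj.species].append(obj)

new_list = list(groups.values())
names = [[pet.name for pet in group] for group in new_list]
print(names)  # [['Rex', 'Fido'], ['Tom']]
```

Groups come out in the order their keys were first seen, since dicts preserve insertion order in Python 3.7+.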
How to group objects (tuples) based on having consecutive attribute value?
You can do the following:
from itertools import groupby, count
from operator import itemgetter
data = [('a', 12), ('b', 13), ('c', 15), ('c', 16), ('c', 17)]
def key(i, cursor=count(0)):
    """Generate the same key for consecutive numbers"""
    return i[1] - next(cursor)
ordered = sorted(data, key=itemgetter(1))
result = [list(group) for _, group in groupby(ordered, key=key)]
print(result)
Output
[[('a', 12), ('b', 13)], [('c', 15), ('c', 16), ('c', 17)]]
The above is based on an old example from the Python 2.6 documentation.
To illustrate what is happening, consider the following example:
lst = [12, 13, 15, 16, 17]
print([v - i for i, v in enumerate(lst)])
The generated keys are:
[12, 12, 13, 13, 13]
As can be seen, consecutive runs share the same key.
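The key function above keeps its counter as hidden state in a default argument. As a possible variant (a sketch, not the answer's own code), the position can be taken from enumerate instead, so the function carries no state between calls. The name consecutive_runs is made up for this example:

```python
from itertools import groupby
from operator import itemgetter


def consecutive_runs(pairs):
    """Group (label, number) tuples into runs of consecutive numbers."""
    ordered = sorted(pairs, key=itemgetter(1))
    # For consecutive integers, (value - position) is constant within a run.
    return [
        [item for _, item in group]
        for _, group in groupby(
            enumerate(ordered), key=lambda pair: pair[1][1] - pair[0]
        )
    ]


data = [('a', 12), ('b', 13), ('c', 15), ('c', 16), ('c', 17)]
first = consecutive_runs(data)
second = consecutive_runs(data)  # no shared counter between calls
print(first)
# [[('a', 12), ('b', 13)], [('c', 15), ('c', 16), ('c', 17)]]
```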
I want to group tuples based on similar attributes
As we fill in the components, at each stage there are three cases to consider (as you will have to match up overlapping groups):
- Neither x nor y is in any component found so far.
- Both are already in different sets, x in set_i and y in set_j.
- Only one of them is already in a component (x in set_i or y in set_j). If both are already in the same component, there is nothing to do.
We can use the built-in set to help (see @jwpat's and @DSM's trickier examples):
def connected_components(lst):
    components = []  # list of sets
    for (x, y) in lst:
        i = j = set_i = set_j = None
        for k, c in enumerate(components):
            if x in c:
                i, set_i = k, c
            if y in c:
                j, set_j = k, c

        # case 1 (or already in the same set)
        if i == j:
            if i is None:
                components.append(set([x, y]))
            continue

        # case 2
        if i is not None and j is not None:
            components = [components[k] for k in range(len(components)) if k != i and k != j]
            components.append(set_i | set_j)
            continue

        # case 3
        if j is not None:
            components[j].add(x)
        if i is not None:
            components[i].add(y)

    return components

lst = [(1, 2), (2, 3), (4, 3), (5, 6), (6, 7), (8, 2)]
connected_components(lst)
# [set([8, 1, 2, 3, 4]), set([5, 6, 7])]
list(map(list, connected_components(lst)))
# [[8, 1, 2, 3, 4], [5, 6, 7]]
connected_components([(1, 2), (4, 3), (2, 3), (5, 6), (6, 7), (8, 2)])
# [set([8, 1, 2, 3, 4]), set([5, 6, 7])] # @jwpat's example
connected_components([[1, 3], [2, 4], [3, 4]])
# [set([1, 2, 3, 4])] # @DSM's example
This certainly won't be the most efficient method, but it is perhaps similar to what they would expect. As Jon Clements points out, there is a library for this type of calculation, networkx, where it will be much more efficient.
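If pulling in a library is not an option, one commonly used alternative is a disjoint-set (union-find) structure. The following is a rough sketch of that different technique, not the method from the answer above; the function name connected_components_uf is made up for this example:

```python
def connected_components_uf(pairs):
    """Connected components via a simple union-find (illustrative sketch)."""
    parent = {}

    def find(node):
        # Register unseen nodes as their own root.
        parent.setdefault(node, node)
        root = node
        while parent[root] != root:
            root = parent[root]
        # Path compression: point every node on the path straight at the root.
        while parent[node] != root:
            parent[node], node = root, parent[node]
        return root

    for x, y in pairs:
        root_x, root_y = find(x), find(y)
        if root_x != root_y:
            parent[root_y] = root_x  # merge the two components

    components = {}
    for node in parent:
        components.setdefault(find(node), set()).add(node)
    return list(components.values())


lst = [(1, 2), (2, 3), (4, 3), (5, 6), (6, 7), (8, 2)]
result = sorted(sorted(c) for c in connected_components_uf(lst))
print(result)  # [[1, 2, 3, 4, 8], [5, 6, 7]]
```

Each pair is handled with a couple of near-constant-time lookups instead of a scan over every component found so far.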
Group elements in python-list by type
Use collections.defaultdict:
from collections import defaultdict
l = [[], 1, 2, 'a', 3, 'b', [5, 6]]
accumulation = defaultdict(list)
for e in l:
    accumulation[type(e)].append(e)
result = list(accumulation.values())
print(result)
Output
[[[], [5, 6]], [1, 2, 3], ['a', 'b']]
As an alternative you could use setdefault:
accumulation = {}
for e in l:
    accumulation.setdefault(type(e), []).append(e)
grouping list of tuples with itertools
This has tripped me up in the past as well. If you want it to group globally, it's best to sort the list first:
In [162]: import itertools, operator
In [163]: test = [(1,1),(3,1),(5,0),(3,0),(2,1)]
In [164]: crit = operator.itemgetter(1)
In [165]: test.sort(key=crit)
In [166]: result = [list(group) for key, group in itertools.groupby(test, crit)]
In [167]: result
Out[167]: [[(5, 0), (3, 0)], [(1, 1), (3, 1), (2, 1)]]
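To see why the sort matters, here is a small comparison using the same data. groupby only merges adjacent items with equal keys, so without sorting the key 1 shows up in two separate groups:

```python
from itertools import groupby
from operator import itemgetter

test = [(1, 1), (3, 1), (5, 0), (3, 0), (2, 1)]
crit = itemgetter(1)

# Without sorting: only adjacent items with equal keys are grouped.
unsorted_groups = [list(g) for _, g in groupby(test, crit)]
print(unsorted_groups)
# [[(1, 1), (3, 1)], [(5, 0), (3, 0)], [(2, 1)]]

# With sorting: each key appears in exactly one group.
sorted_groups = [list(g) for _, g in groupby(sorted(test, key=crit), crit)]
print(sorted_groups)
# [[(5, 0), (3, 0)], [(1, 1), (3, 1), (2, 1)]]
```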
Grouping lists based on a certain value in python and then returning the minimum of the group
If lst is sorted by the first elements (if not, sort it first using lst.sort(key=lambda x: x[0])), then you could use itertools.groupby to group the lists by the first element, then use min with a key that compares each group by the last elements:
from itertools import groupby
out = [min(g, key=lambda x: x[-1]) for k, g in groupby(lst, lambda x: x[0])]
Output:
[(1, 42, 15, 5), (2, 72, 39, 6), (3, 12, 15, 1)]
Or, if every index has the same number of tuples (three in this case), we could get the desired outcome with sorted + list slicing:
out = sorted(lst, key=lambda x: (x[0], x[-1]))[::3]
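The question's input list is not reproduced here, so the sketch below uses made-up data (three tuples per index) chosen to give the output shown above. It runs both approaches and checks that they agree:

```python
from itertools import groupby

# Hypothetical sample data: three tuples per leading index.
lst = [
    (1, 42, 15, 7), (1, 42, 15, 5), (1, 10, 20, 9),
    (2, 72, 39, 6), (2, 1, 1, 8), (2, 5, 5, 10),
    (3, 12, 15, 1), (3, 0, 0, 2), (3, 9, 9, 3),
]

# groupby needs the list sorted by the grouping key.
lst.sort(key=lambda x: x[0])
out_groupby = [min(g, key=lambda x: x[-1]) for _, g in groupby(lst, lambda x: x[0])]

# Sort by (index, last element), then take the first tuple of every block of 3.
out_slicing = sorted(lst, key=lambda x: (x[0], x[-1]))[::3]

print(out_groupby)  # [(1, 42, 15, 5), (2, 72, 39, 6), (3, 12, 15, 1)]
```

The slicing version only works while every index has exactly three tuples; the groupby version has no such restriction.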
How to combine values of int's with the same group in list of tuples?
You can use a dictionary to get the desired result without importing any extra module:
lst = [('a', 1), ('a', 2), ('b', 0), ('b', 1), ('c', 0)]
Dict = {}
for tup in lst:
    first = tup[0]
    second = tup[1]
    if first not in Dict:
        Dict[first] = 0
    Dict[first] += second

secondList = []
for key in Dict.keys():
    secondList.append((key, Dict[key]))
print(secondList)
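If a standard-library import is acceptable after all, the same sum can be written a little more compactly with collections.defaultdict. This is only an alternative sketch; the names totals and summed are made up here:

```python
from collections import defaultdict

lst = [('a', 1), ('a', 2), ('b', 0), ('b', 1), ('c', 0)]

totals = defaultdict(int)  # missing keys start at 0
for key, value in lst:
    totals[key] += value

summed = list(totals.items())
print(summed)  # [('a', 3), ('b', 1), ('c', 0)]
```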
List of Tuples to List of List of Tuples
Use itertools.groupby to group items based on the integers:
from itertools import groupby
lst = [list(g) for _, g in groupby(tuple_list, lambda x: x[0])]
print(lst)
[[(1, 'hello', 'apple'), (1, 'no', 'orange')],
[(2, 'bye', 'grape')],
[(3, 'okay', 'banana')],
[(4, 'how are you?', 'raisin'), (4, "I'm doing well", 'watermelon')]]
Group a list of tuples on two values, and return a list of all the third value
There is no need to use two nested groupby calls, each grouping by a single field. Instead, use itemgetter with two parameters or a lambda to group by the first two values at once, then a list comprehension to get the final elements.
>>> from itertools import groupby
>>> from operator import itemgetter
>>> lst = [(1, 1, 4), (1, 1, 9), (1, 1, 14), (2, 1, 12), (2, 1, 99), (2, 6, 14), (2, 6, 19)]
>>> [(*k, [x[2] for x in g]) for k, g in groupby(lst, key=itemgetter(0, 1))]
[(1, 1, [4, 9, 14]), (2, 1, [12, 99]), (2, 6, [14, 19])]
If, for whatever reason, you want to use two separate groupby calls, you can use this:
>>> [(k1, k2, [x[2] for x in g2]) for k1, g1 in groupby(lst, key=itemgetter(0))
... for k2, g2 in groupby(g1, key=itemgetter(1))]
[(1, 1, [4, 9, 14]), (2, 1, [12, 99]), (2, 6, [14, 19])]
Of course, this also works as a regular (nested) loop, more in line with your original code:
def sorter(lst):
    for k1, g1 in groupby(lst, key=itemgetter(0)):
        for k2, g2 in groupby(g1, key=itemgetter(1)):
            yield (k1, k2, [x[2] for x in g2])
Or with the single groupby, returning a generator object:
def sorter(lst):
    return ((*k, [x[2] for x in g]) for k, g in groupby(lst, key=itemgetter(0, 1)))
As always, this assumes that lst is already sorted by the same key. If it is not, sort it first.
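As a quick illustration of that last point, here is a sketch with the same tuples in shuffled order (the shuffled list is made up for this example). Sorting with the same itemgetter before grouping gives the expected result:

```python
from itertools import groupby
from operator import itemgetter

# Same tuples as above, deliberately out of order.
lst = [(2, 6, 14), (1, 1, 4), (2, 1, 12), (1, 1, 9),
       (2, 6, 19), (2, 1, 99), (1, 1, 14)]

key = itemgetter(0, 1)
# Sort and group with the same key so equal keys end up adjacent.
grouped = [(*k, [x[2] for x in g]) for k, g in groupby(sorted(lst, key=key), key=key)]
print(grouped)
# [(1, 1, [4, 9, 14]), (2, 1, [12, 99]), (2, 6, [14, 19])]
```

Because sorted is stable, the third values within each group keep their original relative order.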