Removing Elements That Have Consecutive Duplicates

>>> L = [1,1,1,1,1,1,2,3,4,4,5,1,2]
>>> from itertools import groupby
>>> [key for key, _group in groupby(L)]
[1, 2, 3, 4, 5, 1, 2]

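For the second part, dropping every element that has consecutive duplicates, keep only the keys whose group contains a single item: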

>>> [k for k, g in groupby(L) if len(list(g)) < 2]
[2, 3, 5, 1, 2]

If you don't want to create the temporary list just to take the length, you can use sum over a generator expression:

>>> [k for k, g in groupby(L) if sum(1 for i in g) < 2]
[2, 3, 5, 1, 2]

How do I remove consecutive duplicates from a list?

itertools.groupby() is your solution.

newlst = [k for k, g in itertools.groupby(lst)]

If you wish to group and limit the group size by the item's value, meaning eight 4's become [4,4] and nine 3's become [3,3,3], here are two options that do it:

import itertools

def special_groupby(iterable):
    last_element = 0
    count = 0
    state = False

    def key_func(x):
        nonlocal last_element
        nonlocal count
        nonlocal state
        # Start a new group when the value changes or the current group
        # already holds x items; flipping state changes the groupby key.
        if last_element != x or count >= x:
            last_element = x
            count = 1
            state = not state
        else:
            count += 1
        return state

    return [next(g) for k, g in itertools.groupby(iterable, key=key_func)]

special_groupby(lst)

OR

def grouper(iterable, n, fillvalue=None):
    "Collect data into fixed-length chunks or blocks"
    # grouper('ABCDEFG', 3, 'x') --> ABC DEF Gxx
    args = [iter(iterable)] * n
    return itertools.zip_longest(*args, fillvalue=fillvalue)

newlst = list(itertools.chain.from_iterable(next(zip(*grouper(g, k))) for k, g in itertools.groupby(lst)))

Choose whichever you deem appropriate. Both methods assume the items are integers greater than 0.
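To make the rule concrete, here is a minimal sketch of the same idea written a third way, assuming positive integers: a run of n copies of the value k is replaced by ceil(n / k) copies. The function name limit_runs is hypothetical and not part of either option above.

```python
from itertools import groupby

def limit_runs(items):
    """Replace each run of n copies of k with ceil(n / k) copies (k > 0)."""
    result = []
    for k, group in groupby(items):
        n = sum(1 for _ in group)          # length of this run
        result.extend([k] * -(-n // k))    # -(-n // k) is ceil(n / k)
    return result

print(limit_runs([4] * 8 + [3] * 9))
# [4, 4, 3, 3, 3]
```

A run that does not divide evenly still counts its partial chunk, so five 4's also give [4, 4].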

Python remove N consecutive duplicates from the list

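def remove_consecutive(l, length):
    amount = len(l)
    count = 1      # length of the current run
    start = 0      # index where the current run starts
    current = l[0]
    i = 1
    while i < len(l):
        if l[i] == current:
            count += 1
        else:
            if count >= length:
                # The run that just ended is long enough: remove it
                # and restart the scan from the beginning.
                for _ in range(count):
                    l.pop(start)
                start = 0
                i = 0
                current = l[0]
            else:
                start = i
                current = l[i]
            count = 1
        i += 1
    # Handle a run that reaches the end of the list.
    if count >= length:
        for _ in range(count):
            l.pop(start)
    return amount - len(l)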

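Wuff, I got it; it took a while. The function modifies l in place: it removes every run of at least length equal consecutive elements, restarts the scan from the beginning after each removal, and returns the number of elements removed.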
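For comparison, here is a shorter sketch built on itertools.groupby. It is not the same algorithm: it returns a new list instead of mutating the input, and it makes a single pass, so runs that only become adjacent after a removal are left alone. The name drop_long_runs is hypothetical.

```python
from itertools import groupby

def drop_long_runs(items, length):
    """Return a new list without runs of `length` or more equal items."""
    result = []
    for _, group in groupby(items):
        run = list(group)
        if len(run) < length:   # keep only runs shorter than the limit
            result.extend(run)
    return result

print(drop_long_runs([1, 1, 1, 2, 2, 3], 3))
# [2, 2, 3]
print(drop_long_runs([1, 2, 2, 2, 1, 1], 3))
# [1, 1, 1]  (single pass: the merged run of 1's is not removed)
```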

Remove consecutive duplicates from a list using yield generator?

There are several flaws, all of them described in the comments on the question:

  • a loop is missing, so the generator can never yield more than one value
  • you print ans, which is the generator object, instead of x.

Is this code working for you?

test = [5, 5, 5, 4, 5, 6, 6, 5, 5, 7, 8, 0, 0]

def compress(items):
    for i, d in enumerate(items[:-1]):
        if d == items[i+1]:
            continue
        yield d
    yield items[-1]

for x in compress(test):
    print(x)
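The version above needs a sequence it can slice and index, and it raises IndexError on an empty list. As a hedged alternative sketch, a generator that only remembers the previous item works on any iterable, including an empty one. The name compress_iter is hypothetical.

```python
def compress_iter(iterable):
    """Yield each item that differs from the item just before it."""
    sentinel = object()   # unique marker meaning "no previous item yet"
    prev = sentinel
    for item in iterable:
        if prev is sentinel or item != prev:
            yield item
        prev = item

test = [5, 5, 5, 4, 5, 6, 6, 5, 5, 7, 8, 0, 0]
print(list(compress_iter(test)))
# [5, 4, 5, 6, 5, 7, 8, 0]
```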

Fast removal of consecutive duplicates in a list and corresponding items from another list

Python has this groupby in the libraries for you:

>>> list1 = [1,1,1,1,1,1,2,3,4,4,5,1,2]
>>> from itertools import groupby
>>> [k for k,_ in groupby(list1)]
[1, 2, 3, 4, 5, 1, 2]

You can tweak it using the keyfunc argument, to also process the second list at the same time.

>>> list1 = [1,1,1,1,1,1,2,3,4,4,5,1,2]
>>> list2 = [9,9,9,8,8,8,7,7,7,6,6,6,5]
>>> from operator import itemgetter
>>> keyfunc = itemgetter(0)
>>> [next(g) for k,g in groupby(zip(list1, list2), keyfunc)]
[(1, 9), (2, 7), (3, 7), (4, 7), (5, 6), (1, 6), (2, 5)]

If you want to split those pairs back into separate sequences again:

>>> list(zip(*_))  # "unzip" them (in Python 3, zip returns an iterator)
[(1, 2, 3, 4, 5, 1, 2), (9, 7, 7, 7, 6, 6, 5)]
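Outside the interactive prompt there is no _ variable, so the steps can be wrapped in a function. This is a minimal sketch under that assumption; the name dedupe_parallel is hypothetical, and the empty-input branch is added here because zip(*[]) would yield nothing to unpack.

```python
from itertools import groupby
from operator import itemgetter

def dedupe_parallel(keys, values):
    """Drop consecutive duplicates in keys and the matching items in values."""
    # Keep the first (key, value) pair of every run of equal keys.
    pairs = [next(g) for _, g in groupby(zip(keys, values), key=itemgetter(0))]
    if not pairs:
        return [], []
    new_keys, new_values = zip(*pairs)   # "unzip" the pairs
    return list(new_keys), list(new_values)

list1 = [1, 1, 1, 1, 1, 1, 2, 3, 4, 4, 5, 1, 2]
list2 = [9, 9, 9, 8, 8, 8, 7, 7, 7, 6, 6, 6, 5]
print(dedupe_parallel(list1, list2))
# ([1, 2, 3, 4, 5, 1, 2], [9, 7, 7, 7, 6, 6, 5])
```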

Removing elements that have consecutive partial duplicates in Python

Here's one solution using itertools.groupby. The idea is to group items depending on whether the first character is equal to a given k. Then apply your 2 criteria; if they are not satisfied, you can yield the items.

L = ['#python', 'is', '#great', 'for', 'handling', 'text',
     '#python', '#text', '#nonsense', '#morenonsense', '.']

from itertools import groupby

def list_filter(L, k):
    # Group consecutive items by whether they start with k.
    grouper = groupby(L, key=lambda x: x[0]==k)
    for i, j in grouper:
        items = list(j)
        # Drop only groups of two or more items that start with k.
        if not (i and len(items) > 1):
            yield from items

res = list_filter(L, '#')

print(list(res))

['#python', 'is', '#great', 'for', 'handling', 'text', '.']

Eliminate consecutive duplicates of list elements with prolog

We can solve this problem in a single pass over the list. At each position we compare the current element with the next one: if they are the same we skip the current element, and if they differ we keep it.

rm_dup([], []).
rm_dup([X], [X]).
rm_dup([X1, X2 | Xs], [X1 | Ys]) :-
    dif(X1, X2), rm_dup([X2|Xs], Ys).
rm_dup([X, X | Xs], Ys) :-
    rm_dup([X | Xs], Ys).

The first and second clauses are base clauses in which there are no duplicate elements. The third and fourth clauses are recursive rules.

In the third clause we state that if the input list starts with two values X1 and X2 and they are different, dif(X1, X2), then we keep the current value X1.

In the fourth clause, if the list starts with two equal consecutive values, we ignore the current value.

The third and fourth clauses are mutually exclusive, so to make the predicate deterministic it is better to combine them into a single clause with an if-then-else, as follows (the base clause for the empty list stays as it was):

rm_dup([X], [X]) :- !.
rm_dup([X1, X2 | Xs], Ys) :-
    (   dif(X1, X2)
    ->  rm_dup([X2 | Xs], Ys1), Ys = [X1 | Ys1]
    ;   rm_dup([X2 | Xs], Ys)
    ).

Even better is to just use equality as a condition and flip the then and else clauses.

rm_dup([X], [X]) :- !.
rm_dup([X1, X2 | Xs], Ys) :-
    (   X1 = X2
    ->  rm_dup([X2 | Xs], Ys)
    ;   rm_dup([X2 | Xs], Ys1), Ys = [X1 | Ys1]
    ).

How to delete consecutive duplicates in a list of lists efficiently?

You can use groupby like so:

[[k for k, g in groupby(x)] for x in l]

This keeps one element from each run of repeated consecutive elements.

If you need to remove repeated consecutive elements completely, keeping only the elements that are not part of a run, use:

[[k for k, g in groupby(x) if len(list(g)) == 1] for x in l]

Example:

from itertools import groupby

l = [['GILTI', 'was', 'intended', 'to', 'to', 'stifle', 'multinationals', 'was'],
     ['like', 'technology', 'and', 'and', 'pharmaceutical', 'companies', 'like']]

print([[k for k, g in groupby(x)] for x in l])

# [['GILTI', 'was', 'intended', 'to', 'stifle', 'multinationals', 'was'],
# ['like', 'technology', 'and', 'pharmaceutical', 'companies', 'like']]
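For completeness, here is the same sample data run through the stricter variant, which drops every element that belongs to a run of two or more. The variable name strict is only for this illustration.

```python
from itertools import groupby

l = [['GILTI', 'was', 'intended', 'to', 'to', 'stifle', 'multinationals', 'was'],
     ['like', 'technology', 'and', 'and', 'pharmaceutical', 'companies', 'like']]

# Keep a key only when its run contains exactly one item.
strict = [[k for k, g in groupby(x) if len(list(g)) == 1] for x in l]
print(strict)
# [['GILTI', 'was', 'intended', 'stifle', 'multinationals', 'was'],
#  ['like', 'technology', 'pharmaceutical', 'companies', 'like']]
```

Non-consecutive repeats such as the two occurrences of 'was' are untouched in both variants, because groupby only looks at adjacent items.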

