How to Find Last Occurrence Index Matching a Certain Value in a Pandas Series

How to find last occurrence index matching a certain value in a Pandas Series?

Mask the non-matching values with where, then use last_valid_index:

import pandas as pd

s = pd.Series([False, False, True, True, False, False])
s.where(s).last_valid_index()

Output:

3

Using @user3483203's example:

s = pd.Series(['dog', 'cat', 'fish', 'cat', 'dog', 'horse'], index=[*'abcdef'])
s.where(s=='cat').last_valid_index()

Output:

'd'
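To see why this works, it can help to look at the masked Series itself. The following is a minimal sketch reusing the example above; the 'cow' lookup is an added illustration of the no-match case, and the variable names are only for this sketch.

```python
import pandas as pd

s = pd.Series(['dog', 'cat', 'fish', 'cat', 'dog', 'horse'], index=[*'abcdef'])

# where() keeps the matching values and replaces everything else with NaN
masked = s.where(s == 'cat')

# first/last_valid_index() return the label of the first/last non-NaN entry
first_label = masked.first_valid_index()
last_label = masked.last_valid_index()

# When nothing matches, every entry is NaN and the result is None
missing = s.where(s == 'cow').last_valid_index()

print(first_label, last_label, missing)
```

Because the no-match case returns None rather than raising, callers may want to check for None before using the label.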

Display the rows based on the last occurrence of an element in a column in a Pandas DataFrame

I believe you are trying to get the last row of each (name, qty) group:

df.groupby(['name', 'qty'], as_index=False).last()


   name  qty  price
0  Adam   10      8
1  Jack    5     11
2  Jack    6     23
3  Jack   10      4
4  Jack   12      9
5  Rose   11      4
6  Rose   15      4
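The original question's DataFrame is not shown here, so the following sketch uses a small hypothetical frame to illustrate the call. It also shows drop_duplicates(keep='last') as a possible alternative, which is not part of the answer above.

```python
import pandas as pd

# Hypothetical data: the original question's DataFrame is not shown here
df = pd.DataFrame({
    'name':  ['Adam', 'Jack', 'Jack', 'Jack', 'Rose'],
    'qty':   [10, 5, 5, 6, 11],
    'price': [8, 3, 11, 23, 4],
})

# last() keeps the final row of each (name, qty) group, in sorted key order
last_rows = df.groupby(['name', 'qty'], as_index=False).last()

# drop_duplicates(keep='last') selects the same rows but keeps the original index
same_rows = df.drop_duplicates(subset=['name', 'qty'], keep='last')

print(last_rows)
print(same_rows)
```

The two can differ when a column contains missing values, since last() skips NaN entries by default while drop_duplicates keeps whole rows as they are.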

Python - how to extract the last occurrence meeting a certain condition from a list

l = [['A', 'aa', '1', '300'],
     ['A', 'ab', '2', '30'],
     ['A', 'ac', '3', '60'],
     ['B', 'ba', '5', '50'],
     ['B', 'bb', '4', '10'],
     ['C', 'ca', '6', '50']]

import itertools
for key, group in itertools.groupby(l, lambda x: x[0]):
    # list(group)[-1] is the last row in each run of equal keys
    print(key, list(group)[-1])

No comment on "efficiency", because you haven't explained your conditions at all. This assumes the list is already sorted by the first element of each sublist.
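itertools.groupby only merges adjacent rows, which is why the sorting assumption matters. If it does not hold, one possible alternative is a dict keyed on the first element. This is a sketch of a different technique, not the answer's method; last_per_key and the sample rows are hypothetical.

```python
def last_per_key(rows):
    """Return the last row seen for each first-element key (input need not be sorted)."""
    latest = {}
    for row in rows:
        # Later rows overwrite earlier ones that share the same key
        latest[row[0]] = row
    # dicts preserve insertion order (Python 3.7+), so keys come out in first-seen order
    return list(latest.values())


# Hypothetical unsorted input
rows = [['B', 'ba', '5', '50'],
        ['A', 'aa', '1', '300'],
        ['B', 'bb', '4', '10'],
        ['A', 'ab', '2', '30']]

result = last_per_key(rows)
print(result)
```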

If the list is sorted, one run through should be enough:

def tidy(l):
    tmp = []
    prev_row = l[0]

    for row in l:
        if row[0] != prev_row[0]:
            tmp.append(prev_row)
        prev_row = row
    tmp.append(prev_row)
    return tmp

This was ~5x faster than itertools.groupby in a timeit test. Demonstration: https://repl.it/C5Af/0
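A comparison along those lines could be set up roughly as follows. This is only a sketch: last_by_groupby is a hypothetical wrapper around the groupby loop, the iteration count is arbitrary, and the timings will vary by machine and input, so no particular ratio is claimed here.

```python
import itertools
import timeit


def tidy(l):
    tmp = []
    prev_row = l[0]
    for row in l:
        if row[0] != prev_row[0]:
            tmp.append(prev_row)
        prev_row = row
    tmp.append(prev_row)
    return tmp


def last_by_groupby(l):
    # Hypothetical wrapper: collect the last row of each run of equal keys
    return [list(group)[-1] for _, group in itertools.groupby(l, lambda x: x[0])]


data = [['A', 'aa', '1', '300'],
        ['A', 'ab', '2', '30'],
        ['A', 'ac', '3', '60'],
        ['B', 'ba', '5', '50'],
        ['B', 'bb', '4', '10'],
        ['C', 'ca', '6', '50']]

# Both approaches should agree before their speed is compared
same_result = tidy(data) == last_by_groupby(data)

t_tidy = timeit.timeit(lambda: tidy(data), number=1000)
t_groupby = timeit.timeit(lambda: last_by_groupby(data), number=1000)
print(f"same result: {same_result}  tidy: {t_tidy:.4f}s  groupby: {t_groupby:.4f}s")
```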

[Edit: OP has updated their question to say they're already using Pandas to groupby, which is possibly way faster already]

How to find the last occurrence of an item in a Python list

If you are actually using just single letters as shown in your example, then str.rindex would work handily. It raises a ValueError if there is no such item, the same error class that list.index would raise. Demo:

>>> li = ["a", "b", "a", "c", "x", "d", "a", "6"]
>>> ''.join(li).rindex('a')
6

For the more general case you could use list.index on the reversed list:

>>> len(li) - 1 - li[::-1].index('a')
6

The slicing here creates a copy of the entire list. That's fine for short lists, but for the case where li is very large, efficiency can be better with a lazy approach:

def list_rindex(li, x):
    for i in reversed(range(len(li))):
        if li[i] == x:
            return i
    raise ValueError("{} is not in list".format(x))

One-liner version:

next(i for i in reversed(range(len(li))) if li[i] == 'a')
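One difference worth noting: unlike list_rindex, the one-liner raises StopIteration rather than ValueError when nothing matches. A small sketch of one way to handle that is to pass a default to next(); None is an arbitrary choice here, and the 'z' lookup is an added illustration.

```python
li = ["a", "b", "a", "c", "x", "d", "a", "6"]

# Without a default, next() raises StopIteration when nothing matches.
# Passing a default (None here) turns the miss into an ordinary return value.
found = next((i for i in reversed(range(len(li))) if li[i] == 'a'), None)
not_found = next((i for i in reversed(range(len(li))) if li[i] == 'z'), None)

print(found, not_found)
```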

Get all rows after the last occurrence of a specific value in pandas

Reverse your rows (this is important). Then call groupby and cumsum, and take all rows with (reversed) cumsum value equal to zero.

df[df.colA.eq('B')[::-1].astype(int).groupby(df.ID).cumsum().eq(0)]

   ID colA
1   1    D
3   2    D
4   2    C
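The one-liner is easier to follow when split into steps. The sketch below does that on a hypothetical input chosen to be consistent with the output shown above (the original DataFrame is not given); the intermediate names are only for illustration.

```python
import pandas as pd

# Hypothetical input chosen to be consistent with the output shown above
df = pd.DataFrame({'ID':   [1, 1, 2, 2, 2],
                   'colA': ['B', 'D', 'B', 'D', 'C']})

# 1 where the row is a 'B', 0 otherwise
is_b = df.colA.eq('B').astype(int)

# Walk each ID group bottom-up, counting the 'B' rows seen so far
seen_b = is_b[::-1].groupby(df.ID).cumsum()

# Zero means no 'B' at or below this row; sort_index restores the original row order
mask = seen_b.eq(0).sort_index()

result = df[mask]
print(result)
```

Sorting the mask back into the original order is an optional extra step in this sketch; it keeps the boolean mask aligned with the frame's index before it is used for selection.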

How to get the first and the last matching occurrence of an array in python?

Just for fun, an optimized solution that:

  1. Uses index to push the work of confirming at least one value exists, and retrieving its index, to the C layer
  2. Avoids scanning a single element more than needed, and doesn't make any copies (it doesn't access elements between the start_index and end_index at all):

def solution(array, key):
    try:
        return [array.index(key),
                len(array) - next(i for i, x in enumerate(reversed(array), 1) if x == key)]
    except ValueError:
        return [-1, -1]

It relies on array.index either computing the first element's index or raising ValueError (in which case no such element exists), then follows up by retrieving a single value from a genexpr running over the list in reverse order to get the last index (the genexpr is guaranteed to find something; at worst it runs until it finds the same element the index call found). Technically, this does mean that if the element occurs only once, we test it for equality twice (coming from either end), but avoiding said "cost", while possible via itertools.islice, is hardly worth the bother.
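A quick way to check the three cases described (several matches, a single match, no match) is sketched below. The function is repeated so the snippet runs on its own, and the sample list is hypothetical.

```python
def solution(array, key):
    try:
        return [array.index(key),
                len(array) - next(i for i, x in enumerate(reversed(array), 1) if x == key)]
    except ValueError:
        return [-1, -1]


# Hypothetical sample input
nums = [5, 7, 7, 8, 8, 10]

several = solution(nums, 8)   # key occurs at positions 3 and 4
single = solution(nums, 5)    # key occurs once, so both ends agree
missing = solution(nums, 6)   # key absent, index() raises ValueError

print(several, single, missing)
```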

If you really love one-liners, you could always do:

def solution(array, key):
    return [next((i for i, x in enumerate(array) if x == key), -1),
            next((len(array) - i for i, x in enumerate(reversed(array), 1) if x == key), -1)]

though that solution does involve scanning the whole of array twice (once forward, once backward) when the element is not found (if the element exists though, the minimal number of elements are scanned).

pandas series || get index of string if present

You can filter the Series with a boolean mask and then read .index to get the index of a value:

s = pd.Series(["koala", "dog", "chameleon"])
s[s == 'dog'].index

Similarly, you can get the first and last occurrence using min() and max():

s = pd.Series(["koala", "dog", "chameleon","dog"])
d_first, d_last = s[s == 'dog'].index.min(), s[s == 'dog'].index.max()
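Note that min() and max() compare the index labels themselves, so they match the first and last occurrence only when the index is increasing, as with the default RangeIndex used above. For other indexes, taking the first and last matching label by position may be safer. The sketch below uses a hypothetical Series with unsorted string labels to show the difference.

```python
import pandas as pd

# Hypothetical Series whose labels are not in sorted order
s = pd.Series(["koala", "dog", "chameleon", "dog"], index=["z", "y", "x", "a"])

# Labels of the matching rows, in row order
matches = s[s == "dog"].index

# min()/max() return the smallest/largest label, not the first/last row
by_min_max = (matches.min(), matches.max())

# Positional access returns the first/last occurrence in row order
by_position = (matches[0], matches[-1])

print(by_min_max, by_position)
```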

