Find Element's Index in Pandas Series

Find element's index in pandas Series

>>> myseries[myseries == 7]
3    7
dtype: int64
>>> myseries[myseries == 7].index[0]
3

Admittedly there should be a better way to do this, but it at least avoids an explicit Python loop over the object and moves the work to the C level.
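One thing to watch for: `.index[0]` raises an IndexError when the value is absent. A minimal sketch of a guard, assuming a small made-up `myseries` and a hypothetical helper name `first_index_of` (neither comes from the original answer):

```python
import pandas as pd

# Made-up sample data; the original answer does not show how myseries was built.
myseries = pd.Series([1, 4, 0, 7, 5])


def first_index_of(series, value):
    """Return the index label of the first match, or None if there is none."""
    matches = series[series == value].index
    return matches[0] if len(matches) else None


print(first_index_of(myseries, 7))   # 3
print(first_index_of(myseries, 99))  # None
```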

How to get the index of ith item in pandas.Series or pandas.DataFrame?

You could get it straight from the index

s.index[5]

Or

s.index.values[5]

It all depends on what you consider better. I can tell you that a numpy approach will probably be faster.

For example, numpy.argsort returns an array of positions: its first element is the position, in the array being sorted, of the value that belongs first; its second element is the position of the value that belongs second; and so on.

So you could do this to get the index label of the 6th item in sorted order.

s.index.values[s.values.argsort()[5]]

Or more transparently

s.sort_values().index[5]

Or more creatively

s.nsmallest(6).idxmax()
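To see that the three variants agree, here is a small sketch on made-up data (the Series `s` and its letter labels are assumptions for illustration). The values are distinct on purpose; with ties the variants are not guaranteed to pick the same label.

```python
import pandas as pd

# Made-up data with distinct values and string labels.
s = pd.Series([50, 10, 40, 30, 20, 70, 60], index=list("abcdefg"))

# Label at position 5, ignoring the values entirely.
by_position = s.index[5]

# Label of the 6th smallest value, three ways.
by_argsort = s.index.values[s.values.argsort()[5]]
by_sort_values = s.sort_values().index[5]
by_nsmallest = s.nsmallest(6).idxmax()

print(by_position)                               # f
print(by_argsort, by_sort_values, by_nsmallest)  # g g g
```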

pandas series || get index of string if present

You can filter with a boolean mask and read .index to get the index of a value in a Series.

s = pd.Series(["koala", "dog", "chameleon"])
s[s == 'dog'].index

Similarly, to get the first and last occurrences, use min() and max() on the resulting index:

s = pd.Series(["koala", "dog", "chameleon","dog"])
d_first, d_last = s[s == 'dog'].index.min(), s[s == 'dog'].index.max()
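Since the question asks about a string that may not be present, it can help to handle the empty case explicitly. A rough sketch, where the "cat" lookup and the variable names are my own additions:

```python
import pandas as pd

s = pd.Series(["koala", "dog", "chameleon", "dog"])

# All index labels where the value matches, as a plain list.
dog_positions = s[s == "dog"].index.tolist()
cat_positions = s[s == "cat"].index.tolist()  # no match, so an empty list

# Fall back to (None, None) when the string is absent.
if dog_positions:
    d_first, d_last = dog_positions[0], dog_positions[-1]
else:
    d_first, d_last = None, None

print(dog_positions)    # [1, 3]
print(cat_positions)    # []
print(d_first, d_last)  # 1 3
```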

Return the index of the last element in a pandas series

Try

df.last_valid_index()

Works for me.

Edit:

To store the last element you can use iloc (note that the slice -1: returns a one-row object rather than a scalar):

last_element = df.iloc[-1:]
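Note that last_valid_index() and the last position are not always the same thing: the former skips trailing missing values. A small sketch on a made-up Series (the answer uses a `df` whose contents are not shown, so this data is an assumption):

```python
import numpy as np
import pandas as pd

# Made-up Series whose last entry is missing.
s = pd.Series([1.0, 2.0, np.nan], index=["a", "b", "c"])

last_valid_label = s.last_valid_index()  # label of the last non-NA value
last_label = s.index[-1]                 # label of the last row, NA or not
last_scalar = s.iloc[-1]                 # the last value itself (NaN here)
last_slice = s.iloc[-1:]                 # a one-element Series, not a scalar

print(last_valid_label)  # b
print(last_label)        # c
print(len(last_slice))   # 1
```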

Python: How to get position of pandas.series element where conditions exist?

Use:

up_edge_x = x[edge_or_not > 0]
up_edge_y = y[edge_or_not > 0]

down_edge_x = x[edge_or_not < 0]
down_edge_y = y[edge_or_not < 0]

all_edges_x = x[edge_or_not != 0]
all_edges_y = y[edge_or_not != 0]

First create Series of consecutive positions, indexed by up_edge_x and down_edge_x:

up_edge = pd.Series(range(len(up_edge_x)), index=up_edge_x, name='pos')
down_edge = pd.Series(range(len(down_edge_x)), index=down_edge_x, name='pos')
print (up_edge)
0.9394 0
0.8955 1
Name: pos, dtype: int64

print (down_edge)
0.8574 0
0.9196 1
0.9388 2
0.9602 3
Name: pos, dtype: int64

Then join together:

pos = pd.concat([up_edge, down_edge])
print (pos)
0.9394 0
0.8955 1
0.8574 0
0.9196 1
0.9388 2
0.9602 3
Name: pos, dtype: int64

Finally, map the positions into the new column:

all_edges = pd.DataFrame({'y':all_edges_y,
'edge':edge_or_not[edge_or_not != 0].to_numpy(),
'pos': pd.Index(all_edges_x).map(pos)},
index=all_edges_x)

print (all_edges)
            y  edge  pos
0.9394  0.884     1    0
0.8574  0.880    -1    0
0.8955  0.861     1    1
0.9196  0.817    -1    1
0.9388  0.771    -1    2
0.9602  0.727    -1    3
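As an aside, the same pos column can probably be produced more directly with groupby plus cumcount. This is a different technique from the concat/map approach above, not a restatement of it. The x, y and edge values below are copied from the printed result; the single row with edge 0 is made up so that the filter has something to drop.

```python
import pandas as pd

# Values copied from the printed result above, plus one made-up non-edge
# row (edge_or_not == 0) added purely for illustration.
x = pd.Series([0.9394, 0.8574, 0.9000, 0.8955, 0.9196, 0.9388, 0.9602])
y = pd.Series([0.884, 0.880, 0.870, 0.861, 0.817, 0.771, 0.727])
edge_or_not = pd.Series([1, -1, 0, 1, -1, -1, -1])

mask = edge_or_not != 0
all_edges = pd.DataFrame(
    {"y": y[mask].to_numpy(), "edge": edge_or_not[mask].to_numpy()},
    index=x[mask].to_numpy(),
)

# cumcount numbers the rows inside each group 0, 1, 2, ... in order of
# appearance, i.e. the position of each edge among edges of the same sign.
all_edges["pos"] = all_edges.groupby("edge").cumcount()

print(all_edges["pos"].tolist())  # [0, 0, 1, 1, 2, 3]
```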

Get list of index of elements of a series in another series

Use NumPy broadcasting to compare the two arrays, then argmax to get the position of the first match for each value of ser2, in the right order:

out = (ser2.to_numpy() == ser1.to_numpy()[:, None]).argmax(axis=0)
[5, 4, 0, 8]

If some values may have no match, combine np.where with any:

ser1=pd.Series([10, 9, 6, 1, 3, 1, 12, 8, 13])
ser2 = pd.Series([1, 3, 10, 130])

m = ser2.to_numpy() == ser1.to_numpy()[:, None]
out = np.where(m.any(axis=0), m.argmax(axis=0), np.nan)
print (out)
[ 3. 4. 0. nan]
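The broadcast comparison builds a boolean matrix of shape (len(ser1), len(ser2)), which may get large for long inputs. One alternative, sketched here as my own variation rather than part of the answer, is a plain dict that records the first position of each value (`first_pos` is a hypothetical name):

```python
import pandas as pd

ser1 = pd.Series([10, 9, 6, 1, 3, 1, 12, 8, 13])
ser2 = pd.Series([1, 3, 10, 130])

# Map each distinct value in ser1 to the position of its first occurrence.
first_pos = {}
for pos, value in enumerate(ser1.tolist()):
    first_pos.setdefault(value, pos)  # keeps the earliest position only

# Look up every value of ser2; unmatched values come back as None.
out = [first_pos.get(value) for value in ser2.tolist()]
print(out)  # [3, 4, 0, None]
```

This returns None instead of nan for missing values, so the result stays a list of integers where matches exist.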

How to extract elements of a Pandas series using a list index

You can use boolean indexing based on the absence of an index in your list:

yields[~yields.index.isin([1, 3, 5])]

By the way, in your original case yields[[1,3,5]] is as good as yields.iloc[[1,3,5]].
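A quick sketch of both directions of the isin mask, on a made-up `yields` Series with a default integer index (the original data is not shown, so the values are assumptions):

```python
import pandas as pd

# Made-up data with the default integer index 0..5.
yields = pd.Series([0.1, 0.2, 0.3, 0.4, 0.5, 0.6])

wanted = [1, 3, 5]
kept = yields[yields.index.isin(wanted)]      # rows whose label is in the list
dropped = yields[~yields.index.isin(wanted)]  # everything else

print(kept.index.tolist())     # [1, 3, 5]
print(dropped.index.tolist())  # [0, 2, 4]
```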

Finding the index of an item in a list

>>> ["foo", "bar", "baz"].index("bar")
1

See the documentation for the built-in .index() method of the list:

list.index(x[, start[, end]])

Return zero-based index in the list of the first item whose value is equal to x. Raises a ValueError if there is no such item.

The optional arguments start and end are interpreted as in the slice notation and are used to limit the search to a particular subsequence of the list. The returned index is computed relative to the beginning of the full sequence rather than the start argument.
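A short illustration of the start and end arguments, using a made-up list. Note how the result is always a position in the full list:

```python
letters = ["a", "b", "c", "a", "b", "c"]

first_b = letters.index("b")           # search the whole list
later_b = letters.index("b", 2)        # search from position 2 onward
bounded_c = letters.index("c", 1, 3)   # search positions 1 and 2 only

print(first_b, later_b, bounded_c)  # 1 4 2
```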

Caveats

Linear time-complexity in list length

An index call checks every element of the list in order, until it finds a match. If the list is long, and if there is no guarantee that the value will be near the beginning, this can slow down the code.

This problem can only be completely avoided by using a different data structure. However, if the element is known to be within a certain part of the list, the start and end parameters can be used to narrow the search.

For example:

>>> import timeit
>>> timeit.timeit('l.index(999_999)', setup='l = list(range(0, 1_000_000))', number=1000)
9.356267921015387
>>> timeit.timeit('l.index(999_999, 999_990, 1_000_000)', setup='l = list(range(0, 1_000_000))', number=1000)
0.0004404920036904514

The second call is orders of magnitude faster, because it only has to search through 10 elements, rather than all 1 million.
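As for the "different data structure" mentioned above, one common option is a dict that maps each value to its first position. This is only a sketch under the assumptions that the items are hashable and the list is not modified afterwards; `index_of` is a name picked for illustration:

```python
items = ["foo", "bar", "baz", "bar"]

# Build the lookup once (linear time); setdefault keeps the first position
# of a repeated value, mirroring what list.index would return.
index_of = {}
for position, item in enumerate(items):
    index_of.setdefault(item, position)

print(index_of["bar"])      # 1
print(index_of.get("qux"))  # None
```

Each lookup after that is a hash-table access rather than a scan, at the cost of the extra memory for the dict.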

Only the index of the first match is returned

A call to index searches through the list in order until it finds a match, and stops there. If there could be more than one occurrence of the value, and all indices are needed, index cannot solve the problem:

>>> [1, 1].index(1) # the `1` index is not found.
0

Instead, use a list comprehension or generator expression to do the search, with enumerate to get indices:

>>> # A list comprehension gives a list of indices directly:
>>> [i for i, e in enumerate([1, 2, 1]) if e == 1]
[0, 2]
>>> # A generator expression gives us an iterable object...
>>> g = (i for i, e in enumerate([1, 2, 1]) if e == 1)
>>> # which can be used in a `for` loop, or manually iterated with `next`:
>>> next(g)
0
>>> next(g)
2

The list comprehension and generator expression techniques still work if there is only one match, and are more generalizable.

Raises an exception if there is no match

As noted in the documentation above, using .index will raise an exception if the searched-for value is not in the list:

>>> [1, 1].index(2)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: 2 is not in list

If this is a concern, either explicitly check first using item in my_list, or handle the exception with try/except as appropriate.

The explicit check is simple and readable, but it must iterate the list a second time. See What is the EAFP principle in Python? for more guidance on this choice.
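Both options might look like this in practice. The helper names and the -1 default are assumptions chosen for illustration:

```python
def index_or_default(items, value, default=-1):
    """EAFP style: try the lookup and handle the failure."""
    try:
        return items.index(value)
    except ValueError:
        return default


def index_if_present(items, value, default=-1):
    """LBYL style: check first, which scans the list a second time."""
    return items.index(value) if value in items else default


numbers = [1, 1]
print(index_or_default(numbers, 1))  # 0
print(index_or_default(numbers, 2))  # -1
print(index_if_present(numbers, 2))  # -1
```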

How to access the last element in a Pandas series

To select the last value, use Series.iloc or Series.iat, because df['col1'] returns a Series:

print (df['col1'].iloc[-1])
3
print (df['col1'].iat[-1])
3

Or convert the Series to a NumPy array and select the last element:

print (df['col1'].values[-1])
3

Or use DataFrame.iloc or DataFrame.iat, but then you need the position of the column, which Index.get_loc provides:

print (df.iloc[-1, df.columns.get_loc('col1')])
3
print (df.iat[-1, df.columns.get_loc('col1')])
3

Or it is possible to use the last value of the index (which must not be duplicated) and select with DataFrame.loc:

print (df.loc[df.index[-1], 'col1'])
3
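The variants above can be checked side by side, including why the loc version needs a non-duplicated index. The two small DataFrames here are made up for illustration:

```python
import pandas as pd

# Made-up frame with a unique default index.
df = pd.DataFrame({"col1": [1, 2, 3], "col2": [4, 5, 6]})
col = df.columns.get_loc("col1")

results = [
    df["col1"].iloc[-1],
    df["col1"].iat[-1],
    df["col1"].to_numpy()[-1],
    df.iloc[-1, col],
    df.iat[-1, col],
    df.loc[df.index[-1], "col1"],
]
print(all(r == 3 for r in results))  # True

# Made-up frame with a duplicated last label: loc now returns every row
# carrying that label, while positional access still gives one scalar.
dup = pd.DataFrame({"col1": [1, 2, 3]}, index=["a", "b", "b"])
by_label = dup.loc[dup.index[-1], "col1"]
by_position = dup["col1"].iloc[-1]

print(len(by_label))  # 2
print(by_position)    # 3
```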

