Find element's index in pandas Series
>>> myseries[myseries == 7]
3 7
dtype: int64
>>> myseries[myseries == 7].index[0]
3
There should arguably be a better way to do this, but it at least avoids an explicit Python loop over the object and pushes the work down to the C level.
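A minimal sketch of the same lookup that also covers the no-match case. The sample values for myseries are an assumption, since the original data is not shown here.

```python
import pandas as pd

# Assumed sample data; any Series containing the value 7 would do.
myseries = pd.Series([1, 4, 0, 7, 5])

# Boolean mask -> matching index labels, without a Python-level loop.
matches = myseries.index[(myseries == 7).to_numpy()]

# Indexing an empty result with [0] raises IndexError, so check the length first.
first_label = matches[0] if len(matches) > 0 else None
print(first_label)
```

Checking the length first matters because myseries[myseries == 7].index[0] fails when the value is absent.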
How to get the index of ith item in pandas.Series or pandas.DataFrame?
You can get it straight from the index:
s.index[5]
Or
s.index.values[5]
It all depends on what you consider better. I can tell you that a numpy approach will probably be faster.
For example, numpy.argsort returns an array of positions: its first element is the position, in the array being sorted, of the value that should come first; its second element is the position of the value that should come second; and so on.
So you could do this to get the index value of the 6th item after sorting:
s.index.values[s.values.argsort()[5]]
Or more transparently
s.sort_values().index[5]
Or more creatively
s.nsmallest(6).idxmax()
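To see that the three expressions agree, here is a small sketch on a made-up Series (the values and the letter index are assumptions chosen so that every value is distinct):

```python
import pandas as pd

# Hypothetical data with distinct values and a non-default index.
s = pd.Series([30, 10, 50, 20, 60, 40, 70], index=list("abcdefg"))

# Position 5 of the sort order is the 6th smallest value (60, labelled "e").
by_argsort = s.index.values[s.values.argsort()[5]]
by_sort_values = s.sort_values().index[5]
by_nsmallest = s.nsmallest(6).idxmax()

print(by_argsort, by_sort_values, by_nsmallest)
```

With repeated values the three approaches may break ties differently, so they are only guaranteed to agree when the values are distinct.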
pandas series || get index of string if present
You can use [].index to get the index of a value in a Series:
s = pd.Series(["koala", "dog", "chameleon"])
s[s == 'dog'].index
Similarly, to get the first and last occurrence, use min() and max():
s = pd.Series(["koala", "dog", "chameleon","dog"])
d_first, d_last = s[s == 'dog'].index.min(), s[s == 'dog'].index.max()
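Since the question asks for the index only "if present", a hedged sketch of a helper that returns None when the string is missing may be useful. The name label_range is hypothetical, not a pandas function.

```python
import pandas as pd


def label_range(series, value):
    """Return (first, last) index labels where series == value, or None if absent."""
    labels = series.index[(series == value).to_numpy()]
    if len(labels) == 0:
        return None
    return labels.min(), labels.max()


s = pd.Series(["koala", "dog", "chameleon", "dog"])
found = label_range(s, "dog")
missing = label_range(s, "cat")
print(found, missing)
```

Note that min() and max() correspond to the first and last occurrence only when the index is increasing, as it is with the default integer index.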
Return the index of the last element in a pandas series
Try last_valid_index(), which returns the index label of the last non-NA entry:
df.last_valid_index()
Works for me.
Edit: to store the last element itself, you can use iloc:
last_element = df.iloc[-1:]
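The difference between the last valid label and the last label only shows up when the data ends in missing values. A small sketch, using an assumed one-column frame:

```python
import numpy as np
import pandas as pd

# Assumed example frame whose last row is entirely NaN.
df = pd.DataFrame({"col1": [1.0, 2.0, np.nan]}, index=["a", "b", "c"])

last_valid = df.last_valid_index()  # label of the last row holding a non-NA value
last_label = df.index[-1]           # label of the last row, NA or not
last_row = df.iloc[-1:]             # one-row DataFrame containing the last row

print(last_valid, last_label)
```

If the data has no trailing NaN values, both labels coincide.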
Python: How to get position of pandas.series element where conditions exist?
Use:
up_edge_x = x[edge_or_not > 0]
up_edge_y = y[edge_or_not > 0]
down_edge_x = x[edge_or_not < 0]
down_edge_y = y[edge_or_not < 0]
all_edges_x = x[edge_or_not != 0]
all_edges_y = y[edge_or_not != 0]
First create a Series of running positions for each edge type, indexed by up_edge_x and down_edge_x:
up_edge = pd.Series(range(len(up_edge_x)), index=up_edge_x, name='pos')
down_edge = pd.Series(range(len(down_edge_x)), index=down_edge_x, name='pos')
print (up_edge)
0.9394 0
0.8955 1
Name: pos, dtype: int64
print (down_edge)
0.8574 0
0.9196 1
0.9388 2
0.9602 3
Name: pos, dtype: int64
Then join together:
pos = pd.concat([up_edge, down_edge])
print (pos)
0.9394 0
0.8955 1
0.8574 0
0.9196 1
0.9388 2
0.9602 3
Name: pos, dtype: int64
Finally, map the positions into a new column:
all_edges = pd.DataFrame({'y': all_edges_y,
                          'edge': edge_or_not[edge_or_not != 0].to_numpy(),
                          'pos': pd.Index(all_edges_x).map(pos)},
                         index=all_edges_x)
print (all_edges)
y edge pos
0.9394 0.884 1 0
0.8574 0.880 -1 0
0.8955 0.861 1 1
0.9196 0.817 -1 1
0.9388 0.771 -1 2
0.9602 0.727 -1 3
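As an alternative to the concat-and-map steps (a different technique, not the one used above), a per-group running count with groupby(...).cumcount() appears to give the same pos column. The sketch below rebuilds the frame from the printed output, so it assumes only the non-zero edges are present.

```python
import pandas as pd

# Rows copied from the printed all_edges output; the index holds the x values.
all_edges = pd.DataFrame(
    {
        "y": [0.884, 0.880, 0.861, 0.817, 0.771, 0.727],
        "edge": [1, -1, 1, -1, -1, -1],
    },
    index=[0.9394, 0.8574, 0.8955, 0.9196, 0.9388, 0.9602],
)

# cumcount numbers the rows within each edge type (1 or -1), starting at 0.
all_edges["pos"] = all_edges.groupby("edge").cumcount()
print(all_edges)
```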
Get list of index of elements of a series in another series
Use numpy broadcasting to compare the two arrays, then argmax to get the positions in the correct order:
out = (ser2.to_numpy() == ser1.to_numpy()[:, None]).argmax(axis=0)
[5, 4, 0, 8]
If some values may have no match, combine np.where with any:
ser1 = pd.Series([10, 9, 6, 1, 3, 1, 12, 8, 13])
ser2 = pd.Series([1, 3, 10, 130])
m = ser2.to_numpy() == ser1.to_numpy()[:, None]
out = np.where(m.any(axis=0), m.argmax(axis=0), np.nan)
print (out)
[ 3. 4. 0. nan]
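The reason for the extra np.where step can be sketched as follows: argmax on a column with no True entry returns 0, which is indistinguishable from a genuine match at position 0.

```python
import numpy as np
import pandas as pd

ser1 = pd.Series([10, 9, 6, 1, 3, 1, 12, 8, 13])
ser2 = pd.Series([1, 3, 10, 130])

# Rows follow ser1, columns follow ser2: m[i, j] is True when ser1[i] == ser2[j].
m = ser2.to_numpy() == ser1.to_numpy()[:, None]

naive = m.argmax(axis=0)                       # 130 has no match, yet yields 0
safe = np.where(m.any(axis=0), naive, np.nan)  # unmatched values become nan

print(naive)
print(safe)
```

Because nan is a float, the result array is float-valued; duplicated values in ser1 resolve to their first position.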
How to extract elements of a Pandas series using a list index
You can use boolean indexing to keep only the entries whose index is not in your list:
yields[~yields.index.isin([1, 3, 5])]
By the way, in your original case yields[[1,3,5]] is as good as yields.iloc[[1,3,5]].
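A short sketch of both points, using made-up data for yields (the values are an assumption). It also shows that label-based and position-based selection only coincide when the index labels equal the positions, as with the default index.

```python
import pandas as pd

# Hypothetical data with the default 0..6 integer index.
yields = pd.Series([10, 20, 30, 40, 50, 60, 70])

# Keep everything whose index label is NOT in the list.
kept = yields[~yields.index.isin([1, 3, 5])]

# With an index that starts at 1, labels and positions no longer line up.
shifted = pd.Series([10, 20, 30, 40, 50, 60, 70], index=range(1, 8))
by_label = shifted.loc[[1, 3, 5]]      # selects labels 1, 3, 5
by_position = shifted.iloc[[1, 3, 5]]  # selects positions 1, 3, 5

print(kept.tolist(), by_label.tolist(), by_position.tolist())
```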
Finding the index of an item in a list
>>> ["foo", "bar", "baz"].index("bar")
1
See the documentation for the built-in .index() method of the list:
list.index(x[, start[, end]])
Return zero-based index in the list of the first item whose value is equal to x. Raises a ValueError if there is no such item.
The optional arguments start and end are interpreted as in the slice notation and are used to limit the search to a particular subsequence of the list. The returned index is computed relative to the beginning of the full sequence rather than the start argument.
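A brief illustration of the start and end arguments, using an arbitrary example list. It shows that the result is still counted from the beginning of the full list:

```python
letters = ["a", "b", "c", "b", "a"]

first_b = letters.index("b")          # search the whole list
later_b = letters.index("b", 2)       # search from position 2 onwards
bounded_a = letters.index("a", 1, 5)  # search positions 1 through 4 only

print(first_b, later_b, bounded_a)
```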
Caveats
Linear time-complexity in list length
An index call checks every element of the list in order, until it finds a match. If the list is long, and if there is no guarantee that the value will be near the beginning, this can slow down the code.
This problem can only be completely avoided by using a different data structure. However, if the element is known to be within a certain part of the list, the start and end parameters can be used to narrow the search.
For example:
>>> import timeit
>>> timeit.timeit('l.index(999_999)', setup='l = list(range(0, 1_000_000))', number=1000)
9.356267921015387
>>> timeit.timeit('l.index(999_999, 999_990, 1_000_000)', setup='l = list(range(0, 1_000_000))', number=1000)
0.0004404920036904514
The second call is orders of magnitude faster, because it only has to search through 10 elements, rather than all 1 million.
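As one example of the "different data structure" mentioned above, a dictionary from value to first position makes each lookup constant time on average, at the cost of building it once. This is only a sketch: it assumes the items are hashable and that the list does not change afterwards.

```python
items = ["foo", "bar", "baz", "bar"]

# Build the lookup table once; setdefault keeps the FIRST position of each value.
first_position = {}
for i, value in enumerate(items):
    first_position.setdefault(value, i)

# Later lookups no longer scan the list.
bar_index = first_position["bar"]
missing_index = first_position.get("qux")  # None rather than ValueError

print(bar_index, missing_index)
```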
Only the index of the first match is returned
A call to index searches through the list in order until it finds a match, and stops there. If there could be more than one occurrence of the value, and all indices are needed, index cannot solve the problem:
>>> [1, 1].index(1) # the second match, at index 1, is not reported
0
Instead, use a list comprehension or generator expression to do the search, with enumerate to get indices:
>>> # A list comprehension gives a list of indices directly:
>>> [i for i, e in enumerate([1, 2, 1]) if e == 1]
[0, 2]
>>> # A generator expression gives us an iterable object...
>>> g = (i for i, e in enumerate([1, 2, 1]) if e == 1)
>>> # which can be used in a `for` loop, or manually iterated with `next`:
>>> next(g)
0
>>> next(g)
2
The list comprehension and generator expression techniques still work if there is only one match, and are more generalizable.
Raises an exception if there is no match
As noted in the documentation above, using .index will raise an exception if the searched-for value is not in the list:
>>> [1, 1].index(2)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: 2 is not in list
If this is a concern, either explicitly check first using item in my_list, or handle the exception with try/except as appropriate.
The explicit check is simple and readable, but it must iterate the list a second time. See What is the EAFP principle in Python? for more guidance on this choice.
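The two options can be sketched side by side. The function names find_lbyl and find_eafp are hypothetical, and returning None for a missing value is just one possible convention.

```python
def find_lbyl(items, target):
    """Check first ("look before you leap"); scans the list up to twice."""
    if target in items:
        return items.index(target)
    return None


def find_eafp(items, target):
    """Try the lookup and handle the failure; scans the list once."""
    try:
        return items.index(target)
    except ValueError:
        return None


data = [1, 1]
print(find_lbyl(data, 1), find_lbyl(data, 2))
print(find_eafp(data, 1), find_eafp(data, 2))
```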
How to access the last element in a Pandas series
To select the last value, use Series.iloc or Series.iat, because df['col1'] returns a Series:
print (df['col1'].iloc[-1])
3
print (df['col1'].iat[-1])
3
Or convert the Series to a numpy array and select the last element:
print (df['col1'].values[-1])
3
Or use DataFrame.iloc or DataFrame.iat, which need the position of the column from Index.get_loc:
print (df.iloc[-1, df.columns.get_loc('col1')])
3
print (df.iat[-1, df.columns.get_loc('col1')])
3
Or use the last value of the index (which must not be duplicated) and select by DataFrame.loc:
print (df.loc[df.index[-1], 'col1'])
3