Getting the Index of a Row in a Pandas Apply Function

getting the index of a row in a pandas apply function

To access the index in this case you access the name attribute:

In [182]:

import pandas as pd

df = pd.DataFrame([[1,2,3],[4,5,6]], columns=['a','b','c'])

def rowFunc(row):
    return row['a'] + row['b'] * row['c']

def rowIndex(row):
    # row.name holds the index label of the row being processed
    return row.name

df['d'] = df.apply(rowFunc, axis=1)
df['rowIndex'] = df.apply(rowIndex, axis=1)
df
Out[182]:
   a  b  c   d  rowIndex
0  1  2  3   7         0
1  4  5  6  34         1

Note that if this is really all you are trying to do, the following vectorised version works and is much faster:

In [198]:

df['d'] = df['a'] + df['b'] * df['c']
df
Out[198]:
   a  b  c   d
0  1  2  3   7
1  4  5  6  34

In [199]:

%timeit df['a'] + df['b'] * df['c']
%timeit df.apply(rowIndex, axis=1)
10000 loops, best of 3: 163 µs per loop
1000 loops, best of 3: 286 µs per loop

EDIT

Looking at this question 3+ years later, you could just do:

In[15]:
df['d'],df['rowIndex'] = df['a'] + df['b'] * df['c'], df.index
df

Out[15]:
   a  b  c   d  rowIndex
0  1  2  3   7         0
1  4  5  6  34         1

But assuming it isn't as trivial as this, whatever your rowFunc is really doing, you should look to use the vectorised functions and then combine them with the df index:

In[16]:
df['newCol'] = df['a'] + df['b'] + df['c'] + df.index
df

Out[16]:
   a  b  c   d  rowIndex  newCol
0  1  2  3   7         0       6
1  4  5  6  34         1      16

how to get the applying element's index while using pandas apply function?

I used the name attribute of the row being applied and it worked just fine! No need to add more columns to my DataFrame.

df.apply(lambda row: row.name, axis=1)
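One detail worth illustrating: row.name is the index label of the row, not its position. The sketch below uses a small made-up DataFrame with string labels (the data and the labels 'x' and 'y' are assumptions for illustration only) to show what the lambda returns when the index is not the default 0..n-1.

```python
import pandas as pd

# Hypothetical data with a non-default (string) index.
df = pd.DataFrame({"a": [1, 4], "b": [2, 5]}, index=["x", "y"])

# With axis=1 each row arrives as a Series whose .name is the row's index label.
labels = df.apply(lambda row: row.name, axis=1)

print(labels.tolist())  # ['x', 'y']
```

If you need the position rather than the label, resetting the index first is one option.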

Access index in pandas.Series.apply

I don't believe Series.apply has access to the index; it passes each value to the function as a numpy scalar, not as a Series, as you can see:

In [27]: s.apply(lambda x: type(x))
Out[27]:
a  b
1  2    <type 'numpy.float64'>
3  6    <type 'numpy.float64'>
4  4    <type 'numpy.float64'>

To get around this limitation, promote the indexes to columns, apply your function, and recreate a Series with the original index.

pd.Series(s.reset_index().apply(f, axis=1).values, index=s.index)

Other approaches might use s.index.get_level_values, which often gets a little ugly in my opinion, or iterate with s.items(), which is likely to be slower, perhaps depending on exactly what f does.
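As a rough sketch of the two routes described above, the example below builds a small Series with a two-level index and computes the same result twice: once by promoting the index to columns, and once by iterating over (index, value) pairs. The data, the Series name "value", and the body of f are all assumptions made up for illustration.

```python
import pandas as pd

# Hypothetical Series with a two-level index named a and b.
index = pd.MultiIndex.from_tuples([(1, 2), (3, 6), (4, 4)], names=["a", "b"])
s = pd.Series([10.0, 20.0, 30.0], index=index, name="value")

def f(row):
    # Illustrative function that needs both index levels and the value.
    return row["a"] * row["b"] + row["value"]

# Route 1: promote the index levels to columns, apply row-wise,
# then rebuild a Series on the original index.
via_reset = pd.Series(s.reset_index().apply(f, axis=1).values, index=s.index)

# Route 2: iterate over (index tuple, value) pairs directly.
via_items = pd.Series(
    [a * b + value for (a, b), value in s.items()],
    index=s.index,
)

print(via_reset)
print(via_items)
```

Which of the two is faster probably depends on the size of the Series and on what f does, so it is worth timing on your own data.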

getting the column of a row in a pandas apply function

You can directly modify the row Series and return the modified row Series.

def convert(row):
    for col in row.index:
        row[col] = f'({row.name}, {col}), {row[col]}'
    return row

df = df.apply(convert, axis=1)
print(df)

           X          Y          Z
a  (a, X), 1  (a, Y), 3  (a, Z), 5
b  (b, X), 2  (b, Y), 4  (b, Z), 6
c  (c, X), 3  (c, Y), 5  (c, Z), 7
d  (d, X), 4  (d, Y), 6  (d, Z), 8
e  (e, X), 5  (e, Y), 7  (e, Z), 9
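A variation on the same idea, sketched here as an alternative rather than the answer's own method: build and return a new Series instead of modifying the row in place. This avoids assigning strings into a numeric row, which newer pandas versions may warn about. The input DataFrame is reconstructed from the output shown above, and the function name label_cells is hypothetical.

```python
import pandas as pd

# Input reconstructed from the printed result (an assumption).
df = pd.DataFrame(
    {"X": [1, 2, 3, 4, 5], "Y": [3, 4, 5, 6, 7], "Z": [5, 6, 7, 8, 9]},
    index=list("abcde"),
)

def label_cells(row):
    # row.name is the row label; row.items() yields (column label, value) pairs.
    return pd.Series(
        [f"({row.name}, {col}), {val}" for col, val in row.items()],
        index=row.index,
    )

# Returning a Series from each row makes apply assemble a DataFrame.
labelled = df.apply(label_cells, axis=1)
print(labelled)
```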

Get index of row within a sub-DataFrame with pandas apply

I think you want the default index. The filtered rows keep their original index labels, so reset the index after filtering:

sub = df.query("seg==2").reset_index(drop=True)


print (df.query("seg==2"))
   seg text
3    2   do
4    2  you
5    2  see

print (df.query("seg==2").reset_index(drop=True))
   seg text
0    2   do
1    2  you
2    2  see
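To tie this back to apply, here is a minimal sketch showing that after reset_index(drop=True), row.name inside apply is the position within the sub-DataFrame. Only the three seg == 2 rows come from the output above; the first three rows of the sample data are made up.

```python
import pandas as pd

# Rows 3-5 match the output above; rows 0-2 are hypothetical filler.
df = pd.DataFrame({
    "seg": [1, 1, 1, 2, 2, 2],
    "text": ["how", "are", "things", "do", "you", "see"],
})

# Filter, then drop the original labels so the index restarts at 0.
sub = df.query("seg == 2").reset_index(drop=True)

# row.name now counts from 0 within the filtered frame.
positions = sub.apply(lambda row: f"{row.name}: {row['text']}", axis=1)
print(positions.tolist())  # ['0: do', '1: you', '2: see']
```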

pandas get row index list by condition in apply function

Use where after comparing each value with its rounded value:

out = df.where(df.astype(float).ne(df.astype(float).round())).stack()
Out[368]:
1  col1    4.5
2  col1    7.5
dtype: object

Index level 0 is your row index, and level 1 is the column.
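For completeness, a small sketch of how the row labels could be pulled out as a plain list. The sample DataFrame is an assumption, chosen so that it yields the same two non-integer cells as the output above. The explicit dropna() is added because stack does not discard NaN entries in every pandas version, and the mask.any(axis=1) route is an alternative that skips stack entirely.

```python
import pandas as pd

# Hypothetical data stored as strings, matching the object dtype above.
df = pd.DataFrame({"col1": ["1", "4.5", "7.5"], "col2": ["2", "5", "8"]})

as_float = df.astype(float)
# True where a value differs from its rounded value, i.e. it is not a whole number.
mask = as_float.ne(as_float.round())

# Keep only the matching cells; dropna() makes the result independent of
# whether stack() itself drops missing entries.
out = df.where(mask).stack().dropna()

# Level 0 of the stacked index holds the row labels.
rows_from_stack = out.index.get_level_values(0).unique().tolist()

# Alternative without stack: rows with at least one matching cell.
rows_from_mask = df.index[mask.any(axis=1)].tolist()

print(rows_from_stack)  # [1, 2]
print(rows_from_mask)   # [1, 2]
```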


