Find First Non-Zero Value in Each Column of Pandas Dataframe

Find first non-zero value in each column of pandas DataFrame

You're looking for idxmax which gives you the first position of the maximum. However, you need to find the max of "not equal to zero"

df.ne(0).idxmax()

first 2
second 1
third 3
dtype: int64

We can couple this with lookup and assign

df.ne(0).idxmax().to_frame('pos').assign(val=lambda d: df.lookup(d.pos, d.index))

pos val
first 2 4
second 1 10
third 3 3

Same answer packaged slightly differently.

m = df.ne(0).idxmax()
pd.DataFrame(dict(pos=m, val=df.lookup(m, m.index)))

pos val
first 2 4
second 1 10
third 3 3

Mark first non-zero element in each column with pandas

Try with broadcasting:

Given Series of Thresholds:

threshold = pd.Series([0.02, 0.06, 0.1])
m = df < threshold.values[:, None]
new_df = pd.DataFrame(
m & (m.index.values[:, None] == m.idxmax().values),
columns=df.columns,
dtype=int
)
   A  B  C
0 0 0 1
1 0 1 0
2 1 0 0

Explaination:

Find where values are less than the array values:

df < threshold.values[:, None]
       A      B      C
0 False False True
1 False True False
2 True True True

Then use idxmax to get the first instance of true per column:

m.idxmax()
A    2
B 1
C 0
dtype: int64

Then broadcast again to find where value is both True and max index for the column:

m & (m.index.values[:, None] == m.idxmax().values)
[[False False  True]
[False True False]
[ True False False]]

Lastly turn back into a DataFrame:

new_df = pd.DataFrame(
m & (m.index.values[:, None] == m.idxmax().values),
columns=df.columns,
dtype=int
)
   A  B  C
0 0 0 1
1 0 1 0
2 1 0 0

Find First Non-zero Value in Each Row of Pandas DataFrame

try this:

res = df[df != 0.0].bfill(axis=1)['A']

all i'm doing is replacing all non-zeros with nans and then filling them in from the right, which forces all resulting values in the first column to be the first non-zero value in the row.

or, a quicker way, as suggested by @piRSquared:

df.replace(0, np.nan).bfill(1).iloc[:, 0]

Finding first non-zero entry across columns in Pandas

Use filter for select columns with info and other, then add new column called NaN for general solution - this value is in new column if only 0 rows, last compare values by ne (!=) and get column of first True by idxmax:

print (df)
entry_name entry_id 052018 info_1 062018 info_2 052018 other_1 \
0 a 1 0 0 1
1 b 2 0 1 0
2 c 4 0 0 0
3 d 5 2 3 4

062018 other_2
0 2
1 0
2 0
3 4

df1 = df.filter(like='info').copy()
df1[np.nan] = 1
df['info'] = df1.ne(0).idxmax(axis=1)

df2 = df.filter(like='other').copy()
df2[np.nan] = 1
df['other'] = df2.ne(0).idxmax(axis=1)
print (df)
entry_name entry_id 052018 info_1 062018 info_2 052018 other_1 \
0 a 1 0 0 1
1 b 2 0 1 0
2 c 4 0 0 0
3 d 5 2 3 4

062018 other_2 info other
0 2 NaN 052018 other_1
1 0 062018 info_2 NaN
2 0 NaN NaN
3 4 052018 info_1 052018 other_1

Find first and last non-zero column in each row of a pandas dataframe

first_valid_index and last_valid_index

d = df.mask(df == 0).drop(['Name', 'count'], 1)
df.assign(
Start=d.apply(pd.Series.first_valid_index, 1),
Finish=d.apply(pd.Series.last_valid_index, 1)
)

Name Jan17 Jun18 Dec18 Apr19 count Start Finish
0 Nick 0.0 1.7 3.7 0.0 2 Jun18 Dec18
1 Jack 0.0 0.0 2.8 3.5 2 Dec18 Apr19
2 Fox 0.0 1.7 0.0 0.0 1 Jun18 Jun18
3 Rex 1.0 0.0 3.0 4.2 3 Jan17 Apr19
4 Snack 0.0 0.0 2.8 4.4 2 Dec18 Apr19
5 Yosee 0.0 0.0 0.0 4.3 1 Apr19 Apr19
6 Petty 0.5 1.3 2.8 3.5 4 Jan17 Apr19

stack then groupby

d = df.mask(df == 0).drop(['Name', 'count'], 1)
def fl(s): return s.xs(s.name).index[[0, -1]]
s, f = d.stack().groupby(level=0).apply(fl).str
df.assign(Start=s, Finish=f)

Name Jan17 Jun18 Dec18 Apr19 count Start Finish
0 Nick 0.0 1.7 3.7 0.0 2 Jun18 Dec18
1 Jack 0.0 0.0 2.8 3.5 2 Dec18 Apr19
2 Fox 0.0 1.7 0.0 0.0 1 Jun18 Jun18
3 Rex 1.0 0.0 3.0 4.2 3 Jan17 Apr19
4 Snack 0.0 0.0 2.8 4.4 2 Dec18 Apr19
5 Yosee 0.0 0.0 0.0 4.3 1 Apr19 Apr19
6 Petty 0.5 1.3 2.8 3.5 4 Jan17 Apr19

Identify the first and all non-zero values in every row in Pandas DataFrame

For first you can select only numeric columns and replace non 0 value by 1 in DataFrame.mask, then for second add cumulative sum per axis=1 with compare first 1 values by DataFrame.eq and boolean mask convert to integers by DataFrame.astype:

df1, df2 = data.copy(), data.copy()
cols = df1.select_dtypes(include=np.number).columns
df1[cols] = df1[cols].mask(data[cols].ne(0), 1)

df2[cols] = df1[cols].cumsum(axis=1).eq(1).astype(int)
print(df1)
Name Date1 Date2 Date3 Date4 Date5
0 Juan 0 0 1 1 1
1 Luis 1 1 1 1 1
2 Maria 0 1 1 1 1
3 Laura 0 0 0 1 1
4 Lina 0 0 0 0 1

print(df2)
Name Date1 Date2 Date3 Date4 Date5
0 Juan 0 0 1 0 0
1 Luis 1 0 0 0 0
2 Maria 0 1 0 0 0
3 Laura 0 0 0 1 0
4 Lina 0 0 0 0 1

Get the column name for the first non-zero value in that row with pandas

It seems like you can use idxmax here:

df.set_index('cnum').idxmax(axis=1).reset_index(drop=True)

0 sup1
1 sup1
2 sup3
3 sup2
dtype: object

df['output'] = df.set_index('cnum').idxmax(axis=1).reset_index(drop=True)
# Slightly faster,
# df['output'] = df.set_index('cnum').idxmax(axis=1).to_numpy()

df
cnum sup1 sup2 sup3 sup4 output
0 285414459 1 0 1 1 sup1
1 445633709 1 0 0 0 sup1
2 556714736 0 0 1 0 sup3
3 1089852074 0 1 0 1 sup2

Another option with dot (will give you all non-zero columns):

d = df.set_index('cnum') 
d.dot(d.columns + ',').str.rstrip(',').reset_index(drop=True)

0 sup1,sup3,sup4
1 sup1
2 sup3
3 sup2,sup4
dtype: object

Or,

(d.dot(d.columns + ',')
.str.rstrip(',')
.str.split(',', 1).str[0]
.reset_index(drop=True))

0 sup1
1 sup1
2 sup3
3 sup2
dtype: object

Find first non-zero element within a group in pandas

In the groupby, set sort to False, get the cumsum, then filter for rows not equal to 0 :

df.loc[df.groupby(["group"], sort=False).target.cumsum() != 0]

group value target
2 1 3 2
3 1 4 0
4 1 5 1
8 2 4 1
9 2 5 3

Pandas dataframe column forward fill from first non-zero value

Use .values attribute:

df['c']=df.groupby('ID',as_index = False)['c'].apply(lambda x: x.replace(to_replace=0, method='ffill')).values

Now if you print df you will get your desired output:

    ID  b   c
0 1 0 0
1 1 5 1
2 1 8 1
3 2 4 0
4 2 8 1
5 2 81 1


Related Topics



Leave a reply



Submit