Find the Column Name Which Has the Maximum Value for Each Row

You can use idxmax with axis=1 to find the column with the greatest value on each row:

>>> df.idxmax(axis=1)
0 Communications
1 Business
2 Communications
3 Communications
4 Business
dtype: object

To create the new column 'Max', use df['Max'] = df.idxmax(axis=1).

To find the row index at which the maximum value occurs in each column, use df.idxmax() (or equivalently df.idxmax(axis=0)).
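
As a minimal sketch of the difference between the two axes, assuming a small made-up frame (the column names Communications and Business come from the output above; the Other column and all values are hypothetical, since the original data is not shown):

```python
import pandas as pd

# Hypothetical data: only two of the column names come from the example above.
df = pd.DataFrame(
    {
        "Communications": [0.9, 0.2, 0.7],
        "Business": [0.1, 0.8, 0.3],
        "Other": [0.5, 0.4, 0.6],
    }
)

# axis=1: for each row, the label of the column holding the largest value.
row_winner = df.idxmax(axis=1)

# axis=0 (the default): for each column, the row label where it peaks.
col_winner = df.idxmax(axis=0)

# Store the per-row result as a new column.
df["Max"] = row_winner
print(df)
```

Both results are computed before 'Max' is assigned, because after the assignment the frame is no longer purely numeric and you would have to select the numeric columns first.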

Find the column name which has the 2nd maximum value for each row (pandas)

Use numpy.argsort to get the column positions ordered by descending value, then index the column names with those positions:

import numpy as np
import pandas as pd

np.random.seed(100)
df = pd.DataFrame(np.random.randint(10, size=(5,5)), columns=list('ABCDE'))
print(df)
   A  B  C  D  E
0  8  8  3  7  7
1  0  4  2  5  2
2  2  2  1  0  8
3  4  0  9  6  2
4  4  1  5  3  4

arr = np.argsort(-df.values, axis=1)
df1 = pd.DataFrame(df.columns[arr], index=df.index)
print (df1)
   0  1  2  3  4
0  A  B  D  E  C
1  D  B  C  E  A
2  E  A  B  C  D
3  C  D  A  E  B
4  C  A  E  D  B

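The column name of the 2nd maximum for each row is then df1[1] (column 0 holds the largest value, the last column the smallest).

Verify: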

#first column
print (df.idxmax(axis=1))
0 A
1 D
2 E
3 C
4 C
dtype: object

#last column
print (df.idxmin(axis=1))
0 C
1 A
2 D
3 B
4 B
dtype: object
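
If you only need the second-largest column and not the full ranking, a minimal sketch might look like this. The frame below is hand-written, hypothetical data, and kind="stable" is an assumption I add so that tied values keep their left-to-right column order:

```python
import numpy as np
import pandas as pd

# Small hand-written frame (hypothetical values, no ties within a row).
df = pd.DataFrame(
    [[8, 7, 3], [0, 5, 2], [6, 9, 4]],
    columns=list("ABC"),
)

# Column positions sorted by descending value; a stable sort keeps the
# left-most column first when values tie.
order = np.argsort(-df.to_numpy(), axis=1, kind="stable")

# Position 0 is the largest value per row, position 1 the second largest.
second = pd.Series(df.columns[order[:, 1]], index=df.index)
print(second)
```

Note that when a row has a tie for the maximum, the "second" column found this way holds the same value as the first.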

Finding the maximum value for each row and extract column names

You can use apply, like this:

maxColumnNames <- apply(x,1,function(row) colnames(x)[which.max(row)])

Since x is a numeric matrix, you can't add the names as an extra column (the whole matrix would be converted to character).
You can use a data.frame instead:

resDf <- cbind(data.frame(x),data.frame(maxColumnNames = maxColumnNames))

resulting in

resDf
  A B C maxColumnNames
X 1 4 7              C
Y 2 5 8              C
Z 3 6 9              C

For each row return the column name of the largest value

One option using your data (for future reference, use set.seed() to make examples using sample reproducible):

DF <- data.frame(V1=c(2,8,1),V2=c(7,3,5),V3=c(9,6,4))

colnames(DF)[apply(DF,1,which.max)]
[1] "V3" "V1" "V2"

A faster solution than using apply might be max.col:

colnames(DF)[max.col(DF,ties.method="first")]
#[1] "V3" "V1" "V2"

...where ties.method can be any of "random", "first" or "last" (the default is "random").

This of course causes issues if you happen to have two columns which are equal to the maximum. I'm not sure what you want to do in that instance as you will have more than one result for some rows. E.g.:

DF <- data.frame(V1=c(2,8,1),V2=c(7,3,5),V3=c(7,6,4))
apply(DF,1,function(x) which(x==max(x)))

[[1]]
V2 V3
2 3

[[2]]
V1
1

[[3]]
V2
2

R Create column which holds column name of maximum value for each row

You can use rowwise in dplyr to get the names of the columns that hold the maximum value in each row. Ties are joined with '_'.

library(dplyr)

df %>%
  rowwise() %>%
  mutate(Max = paste0(names(.)[c_across(everything()) == max(c_across(everything()))], collapse = '_'))

#      V1    V2    V3 Max
#   <dbl> <dbl> <dbl> <chr>
#1      2     7     7 V2_V3
#2      8     3     6 V1
#3      1     5     4 V2
#4      5     7     5 V2
#5      6     3     1 V1

In base R, you could use apply:

df$Max <- apply(df, 1, function(x) paste0(names(df)[x == max(x)],collapse = '_'))

Column name corresponding to largest value in pandas DataFrame

The fastest solution I can think of is DataFrame.dot:

df.eq(df.max(1), axis=0).dot(df.columns)

Details
First, compute the maximum per row:

df.max(1)
0 12
1 8
dtype: int64

Next, find the positions these values come from:

df.eq(df.max(1), axis=0)     
       x      y      a      b      c
0  False  False   True  False  False
1  False  False  False  False   True

I use eq with axis=0 so that the row maxima are aligned with the row index and the comparison is broadcast correctly across columns.

Next, compute the dot product with the column list:

df.eq(df.max(1), axis=0).dot(df.columns)
0 a
1 c
dtype: object
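
To see why the dot product yields a column name, here is a rough pure-Python sketch of what happens for a single row. The mask values mirror row 0 above; this illustrates the idea only and is not pandas' actual implementation. Multiplying a string by True keeps it, multiplying by False gives an empty string, and adding the pieces together concatenates them:

```python
columns = ["x", "y", "a", "b", "c"]
mask_row = [False, False, True, False, False]  # row 0 of the boolean frame

# str * bool repeats the string once (True) or zero times (False).
pieces = [name * flag for name, flag in zip(columns, mask_row)]

# "Summing" strings means concatenation, so only the flagged name survives.
label = "".join(pieces)
print(label)  # a
```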

If the max is not unique, use

df.eq(df.max(1), axis=0).dot(df.columns + ',').str.rstrip(',')

to get a comma-separated list of columns. For example, change a couple of values:

df.at[0, 'c'] = 12
df.at[1, 'y'] = 8

Everything is the same, but notice I append a comma to every column:

df.columns + ','
Index(['x,', 'y,', 'a,', 'b,', 'c,'], dtype='object')

df.eq(df.max(1), axis=0).dot(df.columns + ',')
0 a,c,
1 y,c,
dtype: object

From this, strip any trailing commas:

df.eq(df.max(1), axis=0).dot(df.columns + ',').str.rstrip(',') 
0 a,c
1 y,c
dtype: object
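
If you would rather not rely on the string dot product, the same comma-separated result can be sketched with a row-wise apply. This is likely slower on large frames. The frame below is hypothetical: only the column names and the positions of the tied maxima (12 in a and c, 8 in y and c) follow the example above.

```python
import pandas as pd

# Hypothetical values; the tied maxima match the example above.
df = pd.DataFrame(
    {"x": [1, 2], "y": [3, 8], "a": [12, 5], "b": [4, 0], "c": [12, 8]}
)

# True wherever a cell equals its row's maximum.
mask = df.eq(df.max(axis=1), axis=0)

# For each row, join the labels of all columns flagged True.
tied = mask.apply(lambda row: ",".join(row.index[row.to_numpy()]), axis=1)
print(tied)
```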

Finding maximum value in each row and report column-name

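We can use max.col on the absolute values of the data frame. Note that max.col breaks ties at random by default; pass ties.method = "first" if you need a deterministic result.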

df$MAX <- names(df)[max.col(abs(df))]

df
#    V1   V2  V3 MAX
#1  0.4 -0.9 0.6  V2
#2  0.8 -0.2 0.4  V1
#3 -0.6  0.1 0.8  V3

Similarly, we can use apply row-wise with which.max to get the position of the largest absolute value in each row:

names(df)[apply(abs(df), 1, which.max)]
#[1] "V2" "V1" "V3"

data

df <- structure(list(V1 = c(0.4, 0.8, -0.6), V2 = c(-0.9, -0.2, 0.1
), V3 = c(0.6, 0.4, 0.8)), class = "data.frame", row.names = c(NA,
-3L))
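
For comparison with the pandas answers earlier on this page, the same lookup could be sketched in Python as follows. The data is the same as in the R structure above; the variable name winner is just for illustration.

```python
import pandas as pd

# Same values as the R data frame above.
df = pd.DataFrame(
    {
        "V1": [0.4, 0.8, -0.6],
        "V2": [-0.9, -0.2, 0.1],
        "V3": [0.6, 0.4, 0.8],
    }
)

# Take absolute values first, then find the winning column per row.
winner = df.abs().idxmax(axis=1)
print(winner)
```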

How to select the max value of each row (not all columns) and mutate 2 columns which are the max value and name in R?

Method 1

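Use the pmax and max.col functions to get the maximum values and the columns they come from. max.col(df[,3:4]) returns 1 or 2, a position within the two selected columns, so add 2 to turn it into a position in colnames(df).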

library(dplyr)

df %>% mutate(max = pmax(a,b), type = colnames(df)[max.col(df[,3:4]) + 2 ])

Method 2

Or first re-shape your data to a "long" format for easier manipulation. Then use mutate to extract max values and names. Finally change it back to a "wide" format and relocate columns according to your target.

library(tidyr)  # pivot_longer() and pivot_wider()

df %>%
  pivot_longer(a:b, names_to = "colname") %>%
  group_by(lon, lat) %>%
  mutate(max = max(value),
         type = colname[which.max(value)]) %>%
  pivot_wider(names_from = "colname", values_from = "value") %>%
  relocate(max, type, .after = b)

Output

# A tibble: 4 × 6
# Groups: lon, lat [4]
    lon   lat     a     b   max type
  <dbl> <dbl> <dbl> <dbl> <dbl> <chr>
1   102    31     4     5     5 b
2   103    32     3     2     3 a
3   104    33     7     4     7 a
4   105    34     6     9     9 b
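
If you are working in pandas instead, restricting the search to a subset of columns could be sketched like this. The values are taken from the output above; the list name cols is my own.

```python
import pandas as pd

# Values copied from the output table above.
df = pd.DataFrame(
    {
        "lon": [102, 103, 104, 105],
        "lat": [31, 32, 33, 34],
        "a": [4, 3, 7, 6],
        "b": [5, 2, 4, 9],
    }
)

# Only these columns take part in the comparison.
cols = ["a", "b"]

# Add the row-wise maximum and the name of the column it came from.
result = df.assign(max=df[cols].max(axis=1), type=df[cols].idxmax(axis=1))
print(result)
```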

