Display Column Name with Max Value Between Several Columns

Find max value in multiple columns and add column name

library(tidyverse)

df <- data.frame(player=c('A', 'B', 'C', 'D', 'E', 'F', 'G'),
                 points=c(28, 17, 3, 14, 3, 26, 5),
                 rebounds=c(5, 6, 4, 7, 14, 12, 9),
                 assists=c(10, 13, 7, 8, 4, 5, 8))

df %>%
  pivot_longer(-player,
               names_to = "max_value") %>%
  group_by(player) %>%
  slice_max(value) %>%
  ungroup()

# A tibble: 7 × 3
  player max_value value
  <chr>  <chr>     <dbl>
1 A      points       28
2 B      points       17
3 C      assists       7
4 D      points       14
5 E      rebounds     14
6 F      points       26
7 G      rebounds      9
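
For readers who do not use R, here is a rough Python sketch of the same two steps (reshape to long format, then keep the largest value per player). It uses the data from the data frame above; the variable names long_rows, best and result are my own. Like slice_max() with its default with_ties = TRUE, it keeps every row that ties for the maximum.

```python
# Data copied from the data frame above.
players = ["A", "B", "C", "D", "E", "F", "G"]
stats = {
    "points":   [28, 17, 3, 14, 3, 26, 5],
    "rebounds": [5, 6, 4, 7, 14, 12, 9],
    "assists":  [10, 13, 7, 8, 4, 5, 8],
}

# Step 1 (like pivot_longer): one (player, column name, value) record per cell.
long_rows = [
    (player, name, values[i])
    for i, player in enumerate(players)
    for name, values in stats.items()
]

# Step 2 (like group_by + slice_max): find each player's maximum,
# then keep every record that equals it.
best = {}
for player, name, value in long_rows:
    best[player] = max(value, best.get(player, value))
result = [row for row in long_rows if row[2] == best[row[0]]]

for row in result:
    print(row)
```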

SQL MAX of multiple columns?

This is an old answer and is broken in many ways.

See https://stackoverflow.com/a/6871572/194653 which has way more upvotes, works with SQL Server 2008+, and handles NULLs, etc.

Original but problematic answer:

Well, you can use the CASE statement:

SELECT
    CASE
        WHEN Date1 >= Date2 AND Date1 >= Date3 THEN Date1
        WHEN Date2 >= Date1 AND Date2 >= Date3 THEN Date2
        WHEN Date3 >= Date1 AND Date3 >= Date2 THEN Date3
        ELSE Date1
    END AS MostRecentDate
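
One concrete way the CASE version goes wrong is with NULLs: any comparison against NULL is unknown, so the matching WHEN branches never fire and the query falls through to ELSE Date1. The sketch below reproduces this with Python's built-in sqlite3 module rather than SQL Server, purely for illustration. The one-row table t and its dates are made up, and the plain-Python fallback at the end is just one possible NULL-aware alternative, not the method from the linked answer.

```python
import sqlite3

# Hypothetical one-row table: Date2 is NULL and Date3 is the latest date.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (Date1 TEXT, Date2 TEXT, Date3 TEXT)")
conn.execute("INSERT INTO t VALUES ('2020-01-01', NULL, '2021-06-01')")

case_sql = """
SELECT CASE
         WHEN Date1 >= Date2 AND Date1 >= Date3 THEN Date1
         WHEN Date2 >= Date1 AND Date2 >= Date3 THEN Date2
         WHEN Date3 >= Date1 AND Date3 >= Date2 THEN Date3
         ELSE Date1
       END
FROM t
"""
# No WHEN branch is true because each one touches the NULL, so ELSE wins.
case_result = conn.execute(case_sql).fetchone()[0]

# NULL-aware alternative in plain Python: drop missing values, then take the max.
row = conn.execute("SELECT Date1, Date2, Date3 FROM t").fetchone()
null_aware_result = max((d for d in row if d is not None), default=None)

print(case_result, null_aware_result)
```

Here the CASE expression reports Date1 even though Date3 is later, while the NULL-aware version picks Date3.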

Select column names based on max value by row

You can use the function GREATEST() in a CASE expression:

SELECT id,
       CASE GREATEST(cars_cat1, cars_cat2, cars_cat3)
         WHEN cars_cat1 THEN 'cars_cat1'
         WHEN cars_cat2 THEN 'cars_cat2'
         WHEN cars_cat3 THEN 'cars_cat3'
       END max_category
FROM tablename
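
If you want to try the pattern without a database server, here is a small sketch using Python's built-in sqlite3 module. Two assumptions to flag: SQLite's multi-argument max() stands in for GREATEST(), and the three sample rows are invented. It also shows what happens on a tie: the first matching WHEN wins.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tablename ("
    "id INTEGER, cars_cat1 INTEGER, cars_cat2 INTEGER, cars_cat3 INTEGER)"
)
# Made-up sample data; row 2 has a tie between cars_cat1 and cars_cat3.
conn.executemany(
    "INSERT INTO tablename VALUES (?, ?, ?, ?)",
    [(1, 5, 9, 2), (2, 7, 3, 7), (3, 1, 1, 4)],
)

query = """
SELECT id,
       CASE max(cars_cat1, cars_cat2, cars_cat3)  -- stand-in for GREATEST()
         WHEN cars_cat1 THEN 'cars_cat1'
         WHEN cars_cat2 THEN 'cars_cat2'
         WHEN cars_cat3 THEN 'cars_cat3'
       END AS max_category
FROM tablename
ORDER BY id
"""
result = conn.execute(query).fetchall()
print(result)
```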

How to get column name with max value across multiple columns

This isn't a SQL solution, since the problem lends itself better to a SAS data step. VNAME() also doesn't work in SAS SQL.

Assuming you're using SAS, you can use a combination of the VNAME, MAX, and WHICHN functions. What would you want to happen if you had duplicates for the maximum value?

data want;
    set have;

    array col(3) col_1-col_3;

    index_of_max=whichn(max(of col(*)), of col(*));
    variable_name=vname(col(index_of_max));

run;
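
On the duplicates question: WHICHN returns the first matching position, so the data step above reports only the first column that holds the maximum. If you would rather see every tied column, the idea looks roughly like this. It is written in Python only to show the logic, not as SAS code; the function name columns_at_max and the sample row are hypothetical.

```python
def columns_at_max(row, columns):
    """Return the maximum value and every column name in `columns` holding it."""
    best = max(row[name] for name in columns)
    return best, [name for name in columns if row[name] == best]


# Made-up row where col_2 and col_3 tie for the maximum.
row = {"col_1": 4, "col_2": 9, "col_3": 9}
best, names = columns_at_max(row, ["col_1", "col_2", "col_3"])
print(best, names)
```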

Identifying maximum value in a row, from multiple columns, with an output including all columns in the dataset?

I am considering this in two steps

  1. find the max value of columns
  2. find the label that matches the max value (assuming no two columns tie for the maximum)

If you only have two columns N and P then this is straightforward to do using case_when.

data2 = data %>%
  mutate(max_val = pmax(N, P)) %>%               # find max
  mutate(source = case_when(max_val == N ~ "N",  # find label
                            max_val == P ~ "P"))

However, if the number of columns, or the column names, is dynamic then this becomes harder. I have the following working:

cols = c("N", "P")    # list of column names to work with

data2 = data %>%
  mutate(max_val = pmax(!!!syms(cols))) %>% # find max
  mutate(source = NA)                       # initialize blank labels

# iterate to find labels
data3 = data2
for(c in cols) {
  data3 = mutate(data3, source = ifelse(is.na(source) & max_val == !!sym(c), c, source))
}

There is probably a way to combine sym with case_when so you do not have to iterate over the labels. If someone finds it, please post an update to this answer.
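
This is not the sym/case_when answer, but for comparison, the same "dynamic list of columns" logic needs no loop over labels if you ask for the column name that maximizes the value directly. Here is a sketch in Python, with made-up rows and the same cols list. On ties it keeps the first name in cols, which matches what the loop above does.

```python
cols = ["N", "P"]  # list of column names to work with

# Made-up rows standing in for the data frame; the last one is a tie.
data = [
    {"N": 3.0, "P": 1.5},
    {"N": 0.2, "P": 0.9},
    {"N": 2.0, "P": 2.0},
]

data2 = []
for row in data:
    # max() with a key returns the first column name whose value is largest.
    source = max(cols, key=lambda name: row[name])
    data2.append({**row, "max_val": row[source], "source": source})

for row in data2:
    print(row)
```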

How to retrieve the column name which has maximum value by comparing the values from multiple columns using 'Case' statements

With CASE? Something like this, perhaps?

SQL> with test (a, b, c, d) as
  2    (select 1, 2, 3, 1 from dual)
  3  select
  4    case when a >= b and a >= c and a >= d then a
  5         when b >= a and b >= c and b >= d then b
  6         when c >= a and c >= b and c >= d then c
  7         when d >= a and d >= b and d >= c then d
  8    end result
  9  from test;

    RESULT
----------
         3

SQL>

Find the column name which has the maximum value for each row

You can use idxmax with axis=1 to find the column with the greatest value on each row:

>>> df.idxmax(axis=1)
0    Communications
1          Business
2    Communications
3    Communications
4          Business
dtype: object

To create the new column 'Max', use df['Max'] = df.idxmax(axis=1).

To find the row index at which the maximum value occurs in each column, use df.idxmax() (or equivalently df.idxmax(axis=0)).


