Find the column name which has the maximum value for each row
You can use idxmax with axis=1 to find the column with the greatest value on each row:
>>> df.idxmax(axis=1)
0 Communications
1 Business
2 Communications
3 Communications
4 Business
dtype: object
To create the new column 'Max', use df['Max'] = df.idxmax(axis=1).
To find the row index at which the maximum value occurs in each column, use df.idxmax() (or equivalently df.idxmax(axis=0)).
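A minimal sketch of both calls follows. The numbers are made up for illustration (the original frame is not shown); only the column names echo the output above.

```python
import pandas as pd

# Hypothetical data: the source does not show the original DataFrame.
df = pd.DataFrame(
    {
        "Communications": [0.9, 0.2, 0.7],
        "Business": [0.1, 0.8, 0.3],
    }
)

# Column label holding the largest value in each row.
row_winner = df.idxmax(axis=1)

# Row label holding the largest value in each column.
col_winner = df.idxmax(axis=0)

# Store the per-row result as a new column. This is done last so the
# string column does not take part in the numeric comparisons above.
df["Max"] = row_winner
print(df)
```

Computing both results before assigning 'Max' keeps the new string column out of the comparison.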
Find the column name which has the 2nd maximum value for each row (pandas)
Use numpy.argsort to get the column positions in descending order of value, then index the column names with those positions:
import numpy as np
import pandas as pd
np.random.seed(100)
df = pd.DataFrame(np.random.randint(10, size=(5,5)), columns=list('ABCDE'))
print (df)
A B C D E
0 8 8 3 7 7
1 0 4 2 5 2
2 2 2 1 0 8
3 4 0 9 6 2
4 4 1 5 3 4
arr = np.argsort(-df.values, axis=1)
df1 = pd.DataFrame(df.columns[arr], index=df.index)
print (df1)
0 1 2 3 4
0 A B D E C
1 D B C E A
2 E A B C D
3 C D A E B
4 C A E D B
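Column 0 of df1 holds the name of each row's maximum, and column 1 the name of the 2nd maximum. Verify the first and last columns: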
#first column
print (df.idxmax(axis=1))
0 A
1 D
2 E
3 C
4 C
dtype: object
#last column
print (df.idxmin(axis=1))
0 C
1 A
2 D
3 B
4 B
dtype: object
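To pull out only the 2nd-maximum column name, the sorted positions can be sliced directly. The sketch below hard-codes the frame printed above instead of relying on the random seed, and passes kind="stable" (an addition, not part of the original answer) so that tied values keep their left-to-right column order.

```python
import numpy as np
import pandas as pd

# Same values as the frame printed above, written out explicitly.
df = pd.DataFrame(
    [
        [8, 8, 3, 7, 7],
        [0, 4, 2, 5, 2],
        [2, 2, 1, 0, 8],
        [4, 0, 9, 6, 2],
        [4, 1, 5, 3, 4],
    ],
    columns=list("ABCDE"),
)

# Negate so that an ascending sort yields descending order of value.
# kind="stable" keeps tied values in their original column order.
order = np.argsort(-df.to_numpy(), axis=1, kind="stable")

# Position 1 in each row is the second entry of the descending order.
second = pd.Series(df.columns[order[:, 1]], index=df.index)
print(second)
```

Note that this picks the second position in sorted order. In row 0, where A and B are both 8, it returns B rather than the column holding the second distinct value.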
Finding the maximum value for each row and extract column names
You can use apply like this:
maxColumnNames <- apply(x,1,function(row) colnames(x)[which.max(row)])
Since you have a numeric matrix, you can't add the names as an extra column (the whole matrix would be coerced to a character matrix).
You can use a data.frame instead:
resDf <- cbind(data.frame(x),data.frame(maxColumnNames = maxColumnNames))
resulting in
resDf
A B C maxColumnNames
X 1 4 7 C
Y 2 5 8 C
Z 3 6 9 C
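For readers working in Python, a rough NumPy counterpart of the apply/which.max idea is sketched below. The matrix values are taken from the output above; the variable names are illustrative only.

```python
import numpy as np

# Rows X, Y, Z and columns A, B, C, as in the output above.
x = np.array(
    [
        [1, 4, 7],
        [2, 5, 8],
        [3, 6, 9],
    ]
)
col_names = np.array(["A", "B", "C"])

# argmax(axis=1) gives the position of each row's maximum
# (the first one if there are ties); use it to index the names.
max_column_names = col_names[x.argmax(axis=1)]
print(max_column_names)
```

Keeping the names in a separate array avoids mixing strings into the numeric matrix, which is the same concern raised above for R.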
For each row return the column name of the largest value
One option using your data (for future reference, use set.seed() to make examples using sample reproducible):
DF <- data.frame(V1=c(2,8,1),V2=c(7,3,5),V3=c(9,6,4))
colnames(DF)[apply(DF,1,which.max)]
[1] "V3" "V1" "V2"
A faster solution than using apply might be max.col:
colnames(DF)[max.col(DF,ties.method="first")]
#[1] "V3" "V1" "V2"
...where ties.method can be any of "random", "first" or "last".
This of course causes issues if two or more columns are equal to the maximum. I'm not sure what you want to do in that case, as you will have more than one result for some rows. E.g.:
DF <- data.frame(V1=c(2,8,1),V2=c(7,3,5),V3=c(7,6,4))
apply(DF,1,function(x) which(x==max(x)))
[[1]]
V2 V3
2 3
[[2]]
V1
1
[[3]]
V2
2
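The same tie-breaking question comes up in pandas. As a hedged sketch using the tied data above: idxmax returns the first column that reaches the maximum, and reversing the column order before calling it is one way to get the last one instead.

```python
import pandas as pd

# Same values as the tied R example above.
df = pd.DataFrame({"V1": [2, 8, 1], "V2": [7, 3, 5], "V3": [7, 6, 4]})

# idxmax reports the first column that reaches the row maximum.
first = df.idxmax(axis=1)

# Reversing the column order makes the "first" hit the right-most column.
last = df.iloc[:, ::-1].idxmax(axis=1)

print(pd.DataFrame({"first": first, "last": last}))
```

Only row 0 differs between the two results, since it is the only row with a tie.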
R Create column which holds column name of maximum value for each row
You can use rowwise in dplyr to get the names of the columns that hold the maximum value in each row.
library(dplyr)
df %>%
rowwise() %>%
mutate(Max = paste0(names(.)[c_across() == max(c_across())], collapse = '_'))
# V1 V2 V3 Max
# <dbl> <dbl> <dbl> <chr>
#1 2 7 7 V2_V3
#2 8 3 6 V1
#3 1 5 4 V2
#4 5 7 5 V2
#5 6 3 1 V1
In base R, you could use apply:
df$Max <- apply(df, 1, function(x) paste0(names(df)[x == max(x)],collapse = '_'))
Column name corresponding to largest value in pandas DataFrame
The fastest solution I can think of is DataFrame.dot:
df.eq(df.max(1), axis=0).dot(df.columns)
Details
First, compute the maximum per row:
df.max(1)
0 12
1 8
dtype: int64
Next, find the positions these values come from:
df.eq(df.max(1), axis=0)
x y a b c
0 False False True False False
1 False False False False True
I use eq with axis=0 to make sure the comparison is broadcast correctly across columns.
Next, compute the dot product with the column list:
df.eq(df.max(1), axis=0).dot(df.columns)
0 a
1 c
dtype: object
If the max is not unique, use
df.eq(df.max(1), axis=0).dot(df.columns + ',').str.rstrip(',')
to get a comma-separated list of columns. For example, change a couple of values:
df.at[0, 'c'] = 12
df.at[1, 'y'] = 8
Everything is the same, but notice I append a comma to every column:
df.columns + ','
Index(['x,', 'y,', 'a,', 'b,', 'c,'], dtype='object')
df.eq(df.max(1), axis=0).dot(df.columns + ',')
0 a,c,
1 y,c,
dtype: object
From this, strip any trailing commas:
df.eq(df.max(1), axis=0).dot(df.columns + ',').str.rstrip(',')
0 a,c
1 y,c
dtype: object
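The steps above can be run end to end as follows. The starting frame is an assumption: the original is not shown, so the values here were chosen only to reproduce the row maxima (12 and 8) and the column names x, y, a, b, c.

```python
import pandas as pd

# Hypothetical starting data, chosen to match the maxima shown above.
df = pd.DataFrame(
    {"x": [1, 5], "y": [2, 6], "a": [12, 7], "b": [3, 1], "c": [4, 8]}
)

# Boolean mask: True where a cell equals its row's maximum.
is_max = df.eq(df.max(axis=1), axis=0)

# With a unique maximum per row, the dot product keeps one name per row.
unique_winner = is_max.dot(df.columns)

# Introduce ties, then join all winning names with commas.
df.at[0, "c"] = 12
df.at[1, "y"] = 8
tied_mask = df.eq(df.max(axis=1), axis=0)
tied_winners = tied_mask.dot(df.columns + ",").str.rstrip(",")

print(unique_winner)
print(tied_winners)
```

The dot product works here because multiplying a string by True keeps it and multiplying by False yields an empty string, so summing across a row concatenates only the winning names.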
Finding maximum value in each row and report column-name
We can use max.col on the absolute values of the data frame.
df$MAX <- names(df)[max.col(abs(df))]
df
# V1 V2 V3 MAX
#1 0.4 -0.9 0.6 V2
#2 0.8 -0.2 0.4 V1
#3 -0.6 0.1 0.8 V3
Similarly, we can use a row-wise apply to find the position of the largest absolute value in each row:
names(df)[apply(abs(df), 1, which.max)]
#[1] "V2" "V1" "V3"
data
df <- structure(list(V1 = c(0.4, 0.8, -0.6), V2 = c(-0.9, -0.2, 0.1
), V3 = c(0.6, 0.4, 0.8)), class = "data.frame", row.names = c(NA,
-3L))
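A pandas counterpart, sketched with the same data, takes absolute values first and then asks for the column of each row's maximum.

```python
import pandas as pd

# Same values as the R data above.
df = pd.DataFrame(
    {
        "V1": [0.4, 0.8, -0.6],
        "V2": [-0.9, -0.2, 0.1],
        "V3": [0.6, 0.4, 0.8],
    }
)

# Compare magnitudes rather than signed values, then take the
# column label of each row's largest magnitude.
df["MAX"] = df.abs().idxmax(axis=1)
print(df)
```

Row 0 selects V2 because -0.9 has the largest magnitude, even though it is the smallest signed value.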
How to select the max value of each row (not all columns) and mutate 2 columns which are the max value and name in R?
Method 1
Simply use the pmax and max.col functions to identify the maximum values and their columns.
library(dplyr)
df %>% mutate(max = pmax(a,b), type = colnames(df)[max.col(df[,3:4]) + 2 ])
Method 2
Or first reshape your data to a "long" format for easier manipulation. Then use mutate to extract the max values and names. Finally, change it back to a "wide" format and relocate the columns according to your target.
library(tidyr) # provides pivot_longer and pivot_wider
df %>%
pivot_longer(a:b, names_to = "colname") %>%
group_by(lon, lat) %>%
mutate(max = max(value),
type = colname[which.max(value)]) %>%
pivot_wider(everything(), names_from = "colname", values_from = "value") %>%
relocate(max, type, .after = b)
Output
# A tibble: 4 × 6
# Groups: lon, lat [4]
lon lat a b max type
<dbl> <dbl> <dbl> <dbl> <dbl> <chr>
1 102 31 4 5 5 b
2 103 32 3 2 3 a
3 104 33 7 4 7 a
4 105 34 6 9 9 b
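For comparison, the same result can be sketched in pandas by restricting the computation to the columns of interest. The data is copied from the output above; the names max and type follow the R answer.

```python
import pandas as pd

# Values copied from the output table above.
df = pd.DataFrame(
    {
        "lon": [102, 103, 104, 105],
        "lat": [31, 32, 33, 34],
        "a": [4, 3, 7, 6],
        "b": [5, 2, 4, 9],
    }
)

# Only columns a and b take part in the comparison.
subset = df[["a", "b"]]

df["max"] = subset.max(axis=1)      # largest value per row
df["type"] = subset.idxmax(axis=1)  # column that holds it
print(df)
```

Selecting the subset once and reusing it keeps lon and lat out of both the value and the name lookup.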