Add a variable to a data frame containing max value of each row
You can use apply. For instance:
df[, "max"] <- apply(df[, 2:26], 1, max)
Here's a basic example:
> df <- data.frame(a=1:50, b=rnorm(50), c=rpois(50, 10))
> df$max <- apply(df, 1, max)
> head(df, 2)
  a          b  c max
1 1  1.3527115  9   9
2 2 -0.6469987 20  20
> tail(df, 2)
    a          b  c max
49 49 -1.4796887 10  49
50 50  0.1600679 13  50
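One thing to watch with this pattern: apply converts the data frame to a matrix first, so a single character column turns every value into text. The sketch below is a hedged illustration on a hypothetical data frame (the names id, a, b and num_cols are assumptions, not from the original answer); it restricts apply to the numeric columns and skips missing values.

```r
# Hypothetical example data: one character column, two numeric columns.
df <- data.frame(id = c("r1", "r2", "r3"),
                 a  = c(1, 7, NA),
                 b  = c(4, 2, 5),
                 stringsAsFactors = FALSE)

# Applying max to every column coerces the rows to character first.
naive <- apply(df, 1, max)

# Keep only the numeric columns, then take the row-wise max.
num_cols <- vapply(df, is.numeric, logical(1))
df$max <- apply(df[, num_cols, drop = FALSE], 1, max, na.rm = TRUE)
```

Here na.rm = TRUE is passed through apply to max, so a row with a missing value still gets the maximum of its remaining entries.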
Finding the maximum value for each row among 3 columns in R
You can use the apply function for this like so:
df$max<-apply(X=df, MARGIN=1, FUN=max)
The MARGIN=1 argument indicates that the function in FUN is applied to every row of X. With MARGIN=2 it is applied to every column, and with MARGIN=c(1,2) it is applied to every individual element (both rows and columns).
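To make the effect of MARGIN concrete, here is a minimal sketch on a small hypothetical matrix (the name x and its values are assumptions for illustration):

```r
# 2 x 3 matrix, filled column by column:
#      [,1] [,2] [,3]
# [1,]    1    3    2
# [2,]    5    4    6
x <- matrix(c(1, 5, 3, 4, 2, 6), nrow = 2)

row_max  <- apply(X = x, MARGIN = 1, FUN = max)        # one value per row
col_max  <- apply(X = x, MARGIN = 2, FUN = max)        # one value per column
cell_max <- apply(X = x, MARGIN = c(1, 2), FUN = max)  # one value per cell
```

With MARGIN=c(1,2) each call to max sees a single element, so the result has the same shape and values as x.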
Calculate the maximum value across all rows without manually typing the names of every column
First, create some data:
m <- tibble(matrix(runif(1000 * 500), ncol = 500))
Make sure every column is a double, then this should ideally work:
m_with_max_col <- m %>%
  rowwise() %>%
  mutate(max = max(c_across(where(is.numeric))))
This also works, but might be less desirable:
m_with_max_col <- m %>%
  rowwise() %>%
  mutate(max = max(across()))
The solution is taken from: Row-wise operations
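As a quick check of the c_across() approach, the sketch below runs it on a hypothetical three-row tibble (the names small, x, y and label are assumptions for illustration). The character column is skipped by where(is.numeric).

```r
library(dplyr)

# Hypothetical data: two numeric columns and one character column.
small <- tibble(x = c(1, 5, 3),
                y = c(4, 2, 6),
                label = c("a", "b", "c"))

small_with_max <- small %>%
  rowwise() %>%                                        # treat each row as its own group
  mutate(max = max(c_across(where(is.numeric)))) %>%   # max over the numeric columns only
  ungroup()                                            # drop the row-wise grouping again
```

Calling ungroup() at the end is a design choice: it keeps later operations from silently running row by row.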
Select the row with the maximum value in each group based on multiple columns in R dplyr
We can get the row-wise max of the 'count' columns with pmax, then, grouped by 'col1', filter the rows where 'Max' equals the group's maximum 'Max' value.
library(dplyr)
df1 %>%
  mutate(Max = pmax(count_col1, count_col2) ) %>%
  group_by(col1) %>%
  filter(Max == max(Max)) %>%
  ungroup %>%
  select(-Max)
Output:
# A tibble: 3 × 4
  col1   col2   count_col1 count_col2
  <chr>  <chr>       <dbl>      <dbl>
1 apple  aple            1          4
2 banana banan           4          1
3 banana bananb          4          1
We may also use slice_max:
library(purrr)
df1 %>%
  group_by(col1) %>%
  slice_max(invoke(pmax, across(starts_with("count")))) %>%
  ungroup
# A tibble: 3 × 4
  col1   col2   count_col1 count_col2
  <chr>  <chr>       <dbl>      <dbl>
1 apple  aple            1          4
2 banana banan           4          1
3 banana bananb          4          1
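The same idea can be sketched in base R without naming each count column. The input below is hypothetical: the original df1 is not shown, so the extra "appl" row and its values are assumptions chosen so that the three rows in the output above survive the filter. This uses do.call with pmax and ave, which is a swapped-in base R equivalent, not the dplyr code above.

```r
# Hypothetical input; only rows 1, 3 and 4 appear in the output shown above.
df1 <- data.frame(col1 = c("apple", "apple", "banana", "banana"),
                  col2 = c("aple", "appl", "banan", "bananb"),
                  count_col1 = c(1, 2, 4, 4),
                  count_col2 = c(4, 1, 1, 1),
                  stringsAsFactors = FALSE)

# Row-wise max over every column whose name starts with "count".
count_cols <- grep("^count", names(df1), value = TRUE)
row_max <- do.call(pmax, df1[count_cols])

# Largest row-wise max within each col1 group, repeated for every row.
group_max <- ave(row_max, df1$col1, FUN = max)

# Keep the rows that reach their group's maximum (ties are kept).
result <- df1[row_max == group_max, ]
```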
Finding the maximum value for each row and extract column names
You can use apply like this:
maxColumnNames <- apply(x,1,function(row) colnames(x)[which.max(row)])
Since you have a numeric matrix, you can't add the names as an extra column (the whole matrix would be converted to a character matrix).
Instead, you can build a data.frame:
resDf <- cbind(data.frame(x),data.frame(maxColumnNames = maxColumnNames))
resulting in
resDf
  A B C maxColumnNames
X 1 4 7              C
Y 2 5 8              C
Z 3 6 9              C
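For a self-contained run, the sketch below rebuilds a matrix consistent with the output above (the construction via matrix(1:9, ...) is an assumption, since the original x is not shown). It also shows max.col as a vectorised alternative to the apply call; ties.method = "first" is set explicitly because the default breaks ties at random.

```r
# Matrix consistent with the printed result: rows X, Y, Z and columns A, B, C.
x <- matrix(1:9, nrow = 3,
            dimnames = list(c("X", "Y", "Z"), c("A", "B", "C")))

# Row-by-row lookup with apply and which.max.
maxColumnNames <- apply(x, 1, function(row) colnames(x)[which.max(row)])

# Vectorised alternative: max.col returns the column index of each row's max.
maxColumnNamesAlt <- colnames(x)[max.col(x, ties.method = "first")]

# Combine into a data.frame so the numeric columns stay numeric.
resDf <- cbind(data.frame(x), data.frame(maxColumnNames = maxColumnNames))
```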
How to select the max value of each row (not all columns) and mutate 2 columns which are the max value and name in R?
Method 1
Simply use the pmax and max.col functions to identify the maximum values and the columns that hold them.
library(dplyr)
df %>% mutate(max = pmax(a,b), type = colnames(df)[max.col(df[,3:4]) + 2 ])
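A base R sketch of Method 1, using data reconstructed from the output shown for this answer (lon, lat, a, b). The helper name value_cols is an assumption for illustration. Indexing by column name avoids the "+ 2" offset, and ties.method = "first" makes ties deterministic, since max.col otherwise picks a tied column at random.

```r
# Data reconstructed from the printed output of this answer.
df <- data.frame(lon = 102:105,
                 lat = 31:34,
                 a = c(4, 3, 7, 6),
                 b = c(5, 2, 4, 9))

value_cols <- c("a", "b")   # columns to compare

# Row-wise maximum across the chosen columns.
df$max <- do.call(pmax, df[value_cols])

# Name of the column holding that maximum.
df$type <- value_cols[max.col(df[value_cols], ties.method = "first")]
```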
Method 2
Alternatively, first reshape your data to a "long" format for easier manipulation, then use mutate to extract the max values and their column names. Finally, change it back to a "wide" format and relocate the columns to match your target layout.
library(tidyr)  # pivot_longer() and pivot_wider()

df %>%
  pivot_longer(a:b, names_to = "colname") %>%
  group_by(lon, lat) %>%
  mutate(max = max(value),
         type = colname[which.max(value)]) %>%
  pivot_wider(id_cols = everything(), names_from = "colname", values_from = "value") %>%
  relocate(max, type, .after = b)
Output
# A tibble: 4 × 6
# Groups:   lon, lat [4]
    lon   lat     a     b   max type
  <dbl> <dbl> <dbl> <dbl> <dbl> <chr>
1   102    31     4     5     5 b
2   103    32     3     2     3 a
3   104    33     7     4     7 a
4   105    34     6     9     9 b
R Create column which holds column name of maximum value for each row
You can use rowwise in dplyr to get the names of the columns that hold the maximum value in each row. If several columns tie for the maximum, their names are joined with '_'.
library(dplyr)
df %>%
  rowwise() %>%
  mutate(Max = paste0(names(.)[c_across() == max(c_across())], collapse = '_'))
#     V1    V2    V3 Max
#  <dbl> <dbl> <dbl> <chr>
#1     2     7     7 V2_V3
#2     8     3     6 V1
#3     1     5     4 V2
#4     5     7     5 V2
#5     6     3     1 V1
In base R, you could use apply:
df$Max <- apply(df, 1, function(x) paste0(names(df)[x == max(x)],collapse = '_'))
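The difference between reporting every tied column and reporting only the first one can be seen in a short sketch. The data is reconstructed from the output shown above; the names value_cols, first_only and all_ties are assumptions for illustration.

```r
# Data reconstructed from the printed output of this answer.
df <- data.frame(V1 = c(2, 8, 1, 5, 6),
                 V2 = c(7, 3, 5, 7, 3),
                 V3 = c(7, 6, 4, 5, 1))

# Capture the value columns before adding any result column.
value_cols <- names(df)

# which.max returns only the first column that reaches the maximum.
first_only <- value_cols[apply(df[value_cols], 1, which.max)]

# Comparing against max(x) keeps every tied column.
all_ties <- apply(df[value_cols], 1,
                  function(x) paste0(value_cols[x == max(x)], collapse = "_"))
```

Only the first row differs here: V2 and V3 both equal 7, so first_only reports V2 while all_ties reports V2_V3.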
print from specific rows with highest value from multiple columns using R Studio
First, provide a reproducible version of your data (not a picture):
dput(dta)
structure(list(A = c(45, 20, 9, 6, 6), B = c(23, 34, 7, 10, 5
), C = c(12, 15, 8, 0, 4), D = c(4, 4, 6, 0, 3), E = c(5, 6,
3, 1, 2)), class = "data.frame", row.names = c("BOX_A", "BOX_B",
"BOX_C", "BOX_D", "BOX_E"))
Now find which column is the maximum:
idx <- apply(dta, 1, which.max)
Now display the rows where the maximum is in the first column. This is not what you asked for but it is what your picture shows:
dta[idx==1, ]
#        A  B  C D E
# BOX_A 45 23 12 4 5
# BOX_C  9  7  8 6 3
# BOX_E  6  5  4 3 2
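To see which column wins for every row at once, rather than only the rows where the first column is largest, one possible extension is sketched below. It reuses the dput data from above; the names max_col and rows_by_max_col are assumptions for illustration.

```r
# Data from the dput() output above.
dta <- structure(list(A = c(45, 20, 9, 6, 6), B = c(23, 34, 7, 10, 5),
                      C = c(12, 15, 8, 0, 4), D = c(4, 4, 6, 0, 3),
                      E = c(5, 6, 3, 1, 2)),
                 class = "data.frame",
                 row.names = c("BOX_A", "BOX_B", "BOX_C", "BOX_D", "BOX_E"))

# Column index of the maximum in each row.
idx <- apply(dta, 1, which.max)

# Translate the index into a column name.
max_col <- names(dta)[idx]

# Group the row names by the column that holds their maximum.
rows_by_max_col <- split(rownames(dta), max_col)
```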