Finding maximum value of one column (by group) and inserting value into another data frame in R
It sounds like you're just looking for aggregate:
> aggregate(cbind(x1, x2, x3, x4) ~ country1 + year, Data, max)
country1 year x1 x2 x3 x4
1 B 1998 30 10 30 2
2 A 2000 95 90 25 90
3 C 2005 90 90 5 40
It's not very clear from your question how you want to proceed from there, though.
Find maximum value of one column based on group_by multiple other columns
We can use slice_max instead of summarise to return all the columns after the select step:
library(dplyr)
df_k %>%
group_by(COUNTRY, date_start) %>%
select(-code) %>%
slice_max(order_by = ord, n = 1) # pass the column itself, not the string 'ord'
If we need to create a new column, use mutate
df_k %>%
group_by(COUNTRY, date_start) %>%
select(-code) %>%
mutate(ordMax = max(ord, na.rm = TRUE)) %>%
ungroup()
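As a minimal sketch of the slice_max approach above, assuming a small made-up df_k (the values below are hypothetical, not the asker's data), the following keeps the row with the largest ord in each COUNTRY/date_start group. Note that slice_max keeps ties by default; pass with_ties = FALSE if exactly one row per group is wanted.

```r
library(dplyr)

# Hypothetical stand-in for the asker's df_k
df_k <- data.frame(
  COUNTRY    = c("US", "US", "FR", "FR"),
  date_start = as.Date(c("2020-01-01", "2020-01-01", "2020-02-01", "2020-02-01")),
  code       = c("a", "b", "c", "d"),
  ord        = c(1, 3, 2, 5)
)

# One row per group: the row holding the group's largest ord
top_rows <- df_k %>%
  select(-code) %>%
  group_by(COUNTRY, date_start) %>%
  slice_max(order_by = ord, n = 1) %>%
  ungroup()

print(top_rows)
```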
How to find the maximum value within each group and then recode all other values in the group as zero?
You can try this
df %>%
group_by(Id) %>%
mutate(maxByGroup = (which.max(value) == seq_along(value)) * value) %>%
ungroup()
which gives
Id value maxByGroup
<dbl> <dbl> <dbl>
1 1 500 500
2 1 500 0
3 1 500 0
4 2 250 250
5 2 250 0
6 2 250 0
7 3 300 300
8 3 300 0
9 3 300 0
10 4 400 400
11 4 400 0
12 4 400 0
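For comparison, the same recoding can be sketched in base R with ave instead of dplyr. This is an alternative technique, not the answer's own code; the data frame is rebuilt from the output shown above, and keep_first_max is a hypothetical helper name. Because which.max returns only the first position of the maximum, tied values after the first are set to zero.

```r
# Data reconstructed from the output shown above
df <- data.frame(
  Id    = rep(1:4, each = 3),
  value = rep(c(500, 250, 300, 400), each = 3)
)

# Hypothetical helper: keep the first maximum in a vector, zero out the rest
keep_first_max <- function(v) (seq_along(v) == which.max(v)) * v

# Apply the helper within each Id group
df$maxByGroup <- ave(df$value, df$Id, FUN = keep_first_max)

print(df)
```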
Extract the maximum value within each group in a dataframe
There are many possibilities to do this in R. Here are some of them:
df <- read.table(header = TRUE, text = 'Gene Value
A 12
A 10
B 3
B 5
B 6
C 1
D 3
D 4')
# aggregate
aggregate(df$Value, by = list(df$Gene), max)
aggregate(Value ~ Gene, data = df, max)
# tapply
tapply(df$Value, df$Gene, max)
# split + lapply
lapply(split(df, df$Gene), function(y) max(y$Value))
# plyr
require(plyr)
ddply(df, .(Gene), summarise, Value = max(Value))
# dplyr
require(dplyr)
df %>% group_by(Gene) %>% summarise(Value = max(Value))
# data.table
require(data.table)
dt <- data.table(df)
dt[ , max(Value), by = Gene]
# doBy
require(doBy)
summaryBy(Value~Gene, data = df, FUN = max)
# sqldf
require(sqldf)
sqldf("select Gene, max(Value) as Value from df group by Gene", drv = 'SQLite')
# ave
df[as.logical(ave(df$Value, df$Gene, FUN = function(x) x == max(x))),]
Select the row with the maximum value in each group
Here's a data.table solution:
require(data.table) ## 1.9.2
group <- as.data.table(group)
If you want to keep all the entries corresponding to max values of pt within each group:
group[group[, .I[pt == max(pt)], by=Subject]$V1]
# Subject pt Event
# 1: 1 5 2
# 2: 2 17 2
# 3: 3 5 2
If you'd like just the first max value of pt:
group[group[, .I[which.max(pt)], by=Subject]$V1]
# Subject pt Event
# 1: 1 5 2
# 2: 2 17 2
# 3: 3 5 2
In this case, it doesn't make a difference, as there aren't multiple maximum values within any group in your data.
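To see where the two forms do differ, here is a small sketch with made-up data (not the asker's table) in which Subject 1 has two rows tied at the maximum pt. The names all_max and first_max are introduced only for illustration.

```r
library(data.table)

# Hypothetical data: Subject 1 has two rows tied at the maximum pt
group <- data.table(
  Subject = c(1, 1, 1, 2, 2),
  pt      = c(5, 5, 2, 17, 3),
  Event   = c(1, 2, 1, 2, 1)
)

# Keep every row that ties for the group maximum
all_max <- group[group[, .I[pt == max(pt)], by = Subject]$V1]

# Keep only the first row that reaches the group maximum
first_max <- group[group[, .I[which.max(pt)], by = Subject]$V1]

print(all_max)    # both tied rows for Subject 1, plus one row for Subject 2
print(first_max)  # one row per Subject
```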
Assigning max value of column grouped by another column dynamically in dplyr
If we need to apply the function to multiple columns, use mutate_at:
my.events %>%
group_by(x) %>%
mutate_at(vars(starts_with("type")), max)
# A tibble: 5 x 4
# Groups: x [4]
# x title typethx typesea
# <date> <chr> <dbl> <dbl>
#1 2016-11-24 Thanksgiving 1 0
#2 2016-11-25 Thanksgiving 2 0
#3 2016-11-26 Thanksgiving 3 1
#4 2016-11-26 Season 3 1
#5 2016-11-27 Season 0 2
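In dplyr 1.0.0 and later, mutate_at is superseded by across. A minimal sketch of the equivalent call follows, assuming a reduced, made-up version of my.events (the rows below are hypothetical, and the result name is introduced only for illustration).

```r
library(dplyr)

# Hypothetical, reduced stand-in for my.events
my.events <- data.frame(
  x       = as.Date(c("2016-11-24", "2016-11-26", "2016-11-26")),
  typethx = c(1, 3, 0),
  typesea = c(0, 0, 1)
)

# Replace each "type" column with its maximum within each date
result <- my.events %>%
  group_by(x) %>%
  mutate(across(starts_with("type"), max)) %>%
  ungroup()

print(result)
```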
In R, how do I add a max by group?
Try
# This is how you create your data.frame
group<-c("A","A","A","A","A","B","B","C","C","C")
replicate<-c(1,2,3,4,5,1,2,1,2,3)
x<-data.frame(group,replicate) # here you don't need c()
# Here's my solution
Max <- tapply(x$replicate, x$group,max)
data.frame(x, max.per.group=rep(Max, table(x$group)))
group replicate max.per.group
1 A 1 5
2 A 2 5
3 A 3 5
4 A 4 5
5 A 5 5
6 B 1 2
7 B 2 2
8 C 1 3
9 C 2 3
10 C 3 3
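Note that rep(Max, table(x$group)) lines up correctly only when the rows are already sorted by group, as they are in the data above. A sketch of an order-independent alternative using ave, shown here on deliberately unsorted, made-up data:

```r
# Hypothetical data with the groups deliberately out of order
x <- data.frame(
  group     = c("B", "A", "C", "A", "B"),
  replicate = c(1, 4, 2, 5, 2)
)

# ave() returns the group maximum aligned with each original row
x$max.per.group <- ave(x$replicate, x$group, FUN = max)

print(x)
```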
creating new column based on highest values
You can use the pmax function from base R to pull the max value across a defined set of columns in your data frame. In our case this will be inspecting the education and education_partner fields.
new_data <- data %>%
mutate(highest_degree = pmax(education, education_partner, na.rm = TRUE))
Output:
ID marital education education_partner highest_degree
1 1 1 14 18 18
2 2 4 18 NA 18
3 3 0 10 NA 10
4 4 2 12 14 14
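The role of na.rm = TRUE can be seen by calling pmax directly on the two columns. The vectors below are copied from the output above; the result names are introduced only for illustration.

```r
# Values taken from the education and education_partner columns shown above
education         <- c(14, 18, 10, 12)
education_partner <- c(18, NA, NA, 14)

# With na.rm = TRUE, a missing partner value is ignored
with_na_rm <- pmax(education, education_partner, na.rm = TRUE)

# Without it, any NA in a row makes that row's result NA
without_na_rm <- pmax(education, education_partner)

print(with_na_rm)
print(without_na_rm)
```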
Find the maximum value for a specific value in a column?
An option with base R
aggregate(cbind(Highest_Mutation_Frequency = Mutation_Frequency) ~ Gene.Name, data, FUN = max)
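A minimal sketch of that call on made-up data (the gene names and frequencies below are hypothetical). Wrapping the column in cbind(new_name = column) is what renames the aggregated column in the result.

```r
# Hypothetical data with the column names used in the answer
data <- data.frame(
  Gene.Name          = c("TP53", "TP53", "KRAS"),
  Mutation_Frequency = c(0.1, 0.4, 0.2)
)

# One row per gene, with the maximum stored under the new column name
result <- aggregate(
  cbind(Highest_Mutation_Frequency = Mutation_Frequency) ~ Gene.Name,
  data,
  FUN = max
)

print(result)
```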
How to get the maximum value by group
If you know SQL, this is easier to understand:
library(sqldf)
sqldf('select year, max(score) from mydata group by year')
Update (2016-01): Now you can also use dplyr
library(dplyr)
mydata %>% group_by(year) %>% summarise(max = max(score))