How to Extract the Row with Min or Max Values

How to extract the row with min or max values?

You can include your which.max call as the first argument to your subsetting call:

df[which.max(df$Temp),]
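
For comparison, here is a minimal Python sketch of the same idea: find the position of the largest (or smallest) value, then use that position to pull out the whole row. The sample rows and the Day field are invented for illustration; only the Temp column name comes from the answer above.

```python
# Hypothetical sample rows; only the Temp column name comes from the answer.
rows = [
    {"Day": 1, "Temp": 67},
    {"Day": 2, "Temp": 72},
    {"Day": 3, "Temp": 74},
    {"Day": 4, "Temp": 62},
]

# Position of the first maximum / minimum, analogous to which.max / which.min
# (positions are 0-based here, 1-based in R).
idx_max = max(range(len(rows)), key=lambda i: rows[i]["Temp"])
idx_min = min(range(len(rows)), key=lambda i: rows[i]["Temp"])

# Use the position to extract the full row.
max_row = rows[idx_max]
min_row = rows[idx_min]
print(max_row)
print(min_row)
```

As with which.max, ties resolve to the first matching row.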

How to extract minimum and maximum values based on conditions in R

I used data.table for this.

First, convert start and end to integers. They are stored as character strings, which would otherwise be ordered lexicographically rather than numerically.

Next, flag the rows where start is greater than the maximum of all prior ends, then use cumsum on that flag to assign an increasing sub-group number.

After that, it's just a simple min and max by sub-group.

The solution avoids explicit loops to keep it as fast as possible.

library(data.table)
df <- data.frame(group = c("1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1"),
                 class = c("2", "2", "2", "2", "2", "2", "2", "3", "3", "3", "3", "3", "3", "3", "3", "3", "3", "3", "3", "3"),
                 start = c("23477018","23535465","23567386","24708741","24708741","24708741","48339885","87274","87274","127819","1832772","1832772","1832772","6733569","7005524","7005524","7644572","8095433","8095433","8095433"),
                 end = c("47341413", "47341413", "47909872","42247834","47776347","47909872","53818713","3161655","3479466","3503792","3503792","4916249","5329014","8089225","12037894","13934484","12037894","12037894","13626119","13934484"))

setDT(df)
df[, c('start', 'end') := lapply(.SD, as.integer), .SDcols = c('start', 'end')]
df[, subgrp := cumsum(start > shift(cummax(.SD$end), fill = 0)), keyby = c('group', 'class')]
ans <- df[, .(start = min(start), end = max(end)), keyby = c('group', 'class', 'subgrp')]
ans[, subgrp := NULL][]

   group class    start      end
1:     1     2 23477018 47909872
2:     1     2 48339885 53818713
3:     1     3    87274  5329014
4:     1     3  6733569 13934484
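
The same steps can be sketched in plain Python to make the logic explicit: convert to integers, track the running maximum of end, and start a new sub-group whenever start exceeds it. This is an illustrative sketch, not a replacement for the data.table code. The merge_intervals helper is a made-up name, and sorting by start inside it is an added assumption (the data above is already in that order).

```python
# Values copied from the data frame above, still as strings.
starts = ["23477018", "23535465", "23567386", "24708741", "24708741",
          "24708741", "48339885", "87274", "87274", "127819", "1832772",
          "1832772", "1832772", "6733569", "7005524", "7005524", "7644572",
          "8095433", "8095433", "8095433"]
ends = ["47341413", "47341413", "47909872", "42247834", "47776347",
        "47909872", "53818713", "3161655", "3479466", "3503792", "3503792",
        "4916249", "5329014", "8089225", "12037894", "13934484", "12037894",
        "12037894", "13626119", "13934484"]
classes = ["2"] * 7 + ["3"] * 13

# Step 1: convert start/end to integers so comparisons are numeric.
rows = [("1", c, int(s), int(e)) for c, s, e in zip(classes, starts, ends)]


def merge_intervals(rows):
    """Hypothetical helper: merge overlapping (group, class, start, end) rows."""
    merged = []
    # Assumption: rows are processed in ascending start order within each group/class.
    for group, cls, start, end in sorted(rows, key=lambda r: (r[0], r[1], r[2])):
        last = merged[-1] if merged else None
        same_key = last is not None and last[0] == group and last[1] == cls
        if same_key and start <= last[3]:
            # Step 2: start does not exceed the running max end, so same sub-group.
            last[3] = max(last[3], end)
        else:
            # start > max of all prior ends (or a new group/class): new sub-group.
            merged.append([group, cls, start, end])
    # Step 3: each entry already holds min(start) and max(end) of its sub-group.
    return [tuple(m) for m in merged]


result = merge_intervals(rows)
for row in result:
    print(row)
```

The four tuples printed match the four rows of the data.table output above.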

Extract Min and Max Value from the row value delimited by #

You can loop over the columns in a list, using Series.str.extractall to get the negative and positive integers, reshaping with Series.unstack, and converting to floats so the values are numeric. Then get the minimum and maximum values, using Series.mask to set the maximum to a missing value when it equals the minimum:

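cols = ['Col_1', 'Col_2']
for c in cols:
    # Pull every signed integer out of the string, one column per match
    df1 = df[c].str.extractall(r'([-]?\d+)')[0].unstack().astype(float)
    min1 = df1.min(axis=1)
    max1 = df1.max(axis=1)

    df[f'{c}_min'] = min1
    # Blank out the max where it equals the min (rows with a single value)
    df[f'{c}_max'] = max1.mask(max1==min1)
print(df)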
      Col_1             Col_2  Col_1_min  Col_1_max  Col_2_min  Col_2_max
0       '0'         '-33#90#'          0        NaN        -33       90.0
1  '-1#65#'               '0'         -1       65.0          0        NaN
2      '90'  '-22#-44#90#250'         90        NaN        -44      250.0

If you need to remove the original columns, use DataFrame.pop:

cols = ['Col_1', 'Col_2']
for c in cols:
    # pop removes the original column from df while returning it
    df1 = df.pop(c).str.extractall(r'([-]?\d+)')[0].unstack().astype(float)
    min1 = df1.min(axis=1)
    max1 = df1.max(axis=1)
    df[f'{c}_min'] = min1
    df[f'{c}_max'] = max1.mask(max1==min1)
print(df)
   Col_1_min  Col_1_max  Col_2_min  Col_2_max
0          0        NaN        -33       90.0
1         -1       65.0          0        NaN
2         90        NaN        -44      250.0

EDIT:

Another solution with split:

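import pandas as pd

cols = ['Col_1', 'Col_2']
for c in cols:
    # Strip the surrounding quotes, then split on '#' into separate columns
    df1 = df.pop(c).str.strip("'").str.split('#', expand=True)
    # Empty strings and missing pieces become NaN
    df1 = df1.apply(pd.to_numeric, errors='coerce')
    min1 = df1.min(axis=1)
    max1 = df1.max(axis=1)
    df[f'{c}_min'] = min1
    df[f'{c}_max'] = max1.mask(max1==min1)
print(df)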
   Col_1_min  Col_1_max  Col_2_min  Col_2_max
0        0.0        NaN      -33.0       90.0
1       -1.0       65.0        0.0        NaN
2       90.0        NaN      -44.0      250.0
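
To make the per-cell logic easier to follow, here is a small sketch of the same extraction without pandas, using only the standard library. The min_max helper is a made-up name for illustration. It returns None for the maximum when it equals the minimum, mirroring the mask step above.

```python
import re


def min_max(cell):
    """Hypothetical helper: (min, max) of the signed integers found in cell.

    The max is None when it equals the min, for example when the cell
    holds a single number.
    """
    numbers = [int(tok) for tok in re.findall(r"-?\d+", cell)]
    if not numbers:
        return (None, None)
    lo, hi = min(numbers), max(numbers)
    return (lo, hi if hi != lo else None)


# Cell values taken from the example frame above (quotes are part of the data).
samples = ["'0'", "'-33#90#'", "'-1#65#'", "'-22#-44#90#250'"]
results = {s: min_max(s) for s in samples}
for cell, pair in results.items():
    print(cell, pair)
```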

How to extract min and max rows of a data frame and draw multiple graphs in a layout using R

Get the data in long format for the 'frac' columns, then get the min and max value for every X1.time value.

library(dplyr)
library(tidyr)

df %>%
  pivot_longer(cols = contains('frac')) %>%
  group_by(X1.time) %>%
  summarise(min_value = min(value),
            max_value = max(value))

#  X1.time min_value max_value
#    <dbl>     <dbl>     <dbl>
#1     945     0.904     0.965
#2     955     0.920     0.959
#3     965     0.921     0.982
#4     975     0.925     0.973

Another option is to use rowwise:

df %>%
  rowwise() %>%
  transmute(X1.time,
            min_value = min(c_across(contains('frac'))),
            max_value = max(c_across(contains('frac')))) %>%
  ungroup()

data

It is helpful if you share data in a reproducible format, which is easier to copy.

df <- structure(list(X1.time = c(945, 955, 965, 975), X1.frac = c(0.937752593, 
0.959463167, 0.982386049, 0.973241841), X1.time.1 = c(945, 955,
965, 975), X1.frac.1 = c(0.965208348, 0.954415107, 0.959723958,
0.925369792), X1.time.2 = c(945, 955, 965, 975), X1.frac.2 = c(0.904265228,
0.919962471, 0.920854173, 0.928773106)), row.names = c(NA, -4L),
class = "data.frame")
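
The row-wise variant boils down to collecting the 'frac' values for each row and taking their min and max. A minimal Python sketch of that step, using the data above and no external libraries (the variable names are illustrative):

```python
# Same values as the data frame above; the X1.time.1 / X1.time.2 columns
# repeat X1.time, so they are omitted here.
data = {
    "X1.time": [945, 955, 965, 975],
    "X1.frac": [0.937752593, 0.959463167, 0.982386049, 0.973241841],
    "X1.frac.1": [0.965208348, 0.954415107, 0.959723958, 0.925369792],
    "X1.frac.2": [0.904265228, 0.919962471, 0.920854173, 0.928773106],
}

# Select columns by name, like contains('frac').
frac_cols = [name for name in data if "frac" in name]

summary = []
for i, time in enumerate(data["X1.time"]):
    values = [data[col][i] for col in frac_cols]
    summary.append((time, min(values), max(values)))

for time, lo, hi in summary:
    print(time, round(lo, 3), round(hi, 3))
```

Rounded to three decimals, the values agree with the summarise output shown earlier.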

Finding the maximum value for each row and extract column names

You can use apply with which.max on each row, like this:

maxColumnNames <- apply(x,1,function(row) colnames(x)[which.max(row)])

Since you have a numeric matrix, you can't add the names as an extra column (the whole matrix would be coerced to a character matrix).
Instead, you can convert to a data.frame and do

resDf <- cbind(data.frame(x),data.frame(maxColumnNames = maxColumnNames))

resulting in

resDf
  A B C maxColumnNames
X 1 4 7              C
Y 2 5 8              C
Z 3 6 9              C
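
For readers who think in Python, the apply/which.max combination amounts to the sketch below: for each row, find the position of the largest value and look up the column name at that position. The matrix values are taken from the output above; the dictionary layout is just one convenient way to write it down.

```python
columns = ["A", "B", "C"]
# Rows of the example matrix, keyed by row name.
x = {
    "X": [1, 4, 7],
    "Y": [2, 5, 8],
    "Z": [3, 6, 9],
}

max_column_names = {}
for row_name, values in x.items():
    # Position of the first maximum in this row, like which.max(row).
    pos = max(range(len(values)), key=lambda i: values[i])
    max_column_names[row_name] = columns[pos]

print(max_column_names)
```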

R output BOTH maximum and minimum value by group in dataframe

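You can use range to get the min and max values and use it in summarise, which gives a separate row for each of them per Name. Note that in dplyr 1.1.0 and later, returning more than one row per group from summarise() is deprecated in favor of reframe().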

library(dplyr)

df %>%
  group_by(Name) %>%
  summarise(Value = range(Value), .groups = "drop")

#  Name  Value
#  <chr> <int>
#1 A        27
#2 A        57
#3 B        20
#4 B        89
#5 C        58
#6 C        97

If you have a large dataset, data.table might be faster.

library(data.table)
setDT(df)[, .(Value = range(Value)), Name]
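
The grouping logic itself is simple enough to sketch in plain Python: collect the values per Name, then emit the minimum and the maximum as two separate rows. The input records below are invented (the original question's data is not shown here) and were chosen so that the extremes match the output above.

```python
from collections import defaultdict

# Hypothetical input; only the per-group min and max match the output above.
records = [
    ("A", 27), ("A", 40), ("A", 57),
    ("B", 89), ("B", 20),
    ("C", 58), ("C", 97), ("C", 60),
]

# Collect the values for each Name.
groups = defaultdict(list)
for name, value in records:
    groups[name].append(value)

# One row for the min and one for the max of each group, like range() in R.
result = []
for name in sorted(groups):
    result.append((name, min(groups[name])))
    result.append((name, max(groups[name])))

for row in result:
    print(row)
```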

Extract MIN and MAX values related to datetime values on Postgres with a condition

This is a gaps-and-islands problem. The traditional solution uses LAG() or LEAD() to flag the rows where a new island starts, then a running sum of those flags to number the islands.

For example:

select min(date), max(date), max(value)
from (
  select *, sum(i) over(order by date) as g
  from (
    select *,
      case when (lag(value) over(order by date) > 170) <> (value > 170)
           then 1 else 0
      end as i
    from mytable
  ) x
) y
group by g
having max(value) > 170
order by g

Result:

min                  max                  max
-------------------- -------------------- -----
2022-02-07 15:30:55  2022-02-07 15:32:00  172.0
2022-02-07 15:34:10  2022-02-07 15:35:20  173.7

See running example at db<>fiddle.
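
To illustrate what the query computes, here is a small Python sketch of the same gaps-and-islands idea: consecutive readings are grouped by whether they exceed the threshold, and only the runs above it are kept. The readings are invented sample data (the original table is not reproduced here); the 170 threshold comes from the query.

```python
from itertools import groupby

THRESHOLD = 170  # same cutoff as in the SQL above

# Hypothetical readings: (timestamp, value). ISO-style timestamps sort
# chronologically as plain strings.
readings = [
    ("2022-02-07 15:30:00", 169.5),
    ("2022-02-07 15:30:55", 171.2),
    ("2022-02-07 15:32:00", 172.0),
    ("2022-02-07 15:33:05", 168.9),
    ("2022-02-07 15:34:10", 173.7),
    ("2022-02-07 15:35:20", 170.4),
    ("2022-02-07 15:36:25", 165.0),
]

islands = []
# groupby splits the ordered readings into consecutive runs that share the
# same above/below flag, which is what the lag() comparison detects in SQL.
for above, run in groupby(sorted(readings), key=lambda r: r[1] > THRESHOLD):
    if above:
        run = list(run)
        first_ts, last_ts = run[0][0], run[-1][0]
        islands.append((first_ts, last_ts, max(value for _, value in run)))

for island in islands:
    print(island)
```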


