How to extract the row with min or max values?
You can pass your which.max call as the row index in your subsetting call:
df[which.max(df$Temp),]
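For readers more at home in Python, the same idea (pick the whole row whose value is largest, not just the value) can be sketched with the built-in max and a key function. The rows and the Day column below are made-up illustration data, not the asker's data frame.

```python
# Hypothetical rows standing in for a data frame with a Temp column.
rows = [
    {"Day": 1, "Temp": 67},
    {"Day": 2, "Temp": 72},
    {"Day": 3, "Temp": 74},
    {"Day": 4, "Temp": 62},
]

# max()/min() with a key return the first row holding the extreme Temp,
# much like which.max()/which.min() return the first matching index.
hottest = max(rows, key=lambda r: r["Temp"])
coldest = min(rows, key=lambda r: r["Temp"])

print(hottest)
print(coldest)
```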
How to extract minimum and maximum values based on conditions in R
I used data.table for this.
My approach was to first convert start and end to integers; they are stored as character strings, so comparisons would otherwise be lexicographic and the ordering would be wrong.
Next, flag the rows where start > max(all prior ends), then apply cumsum to those flags to get an increasing sub-group number.
After that it's just a simple min and max by sub-group.
No loops are used, which keeps this as fast as possible.
library(data.table)
df <- data.frame(group = c("1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1"),
class = c("2", "2", "2", "2", "2", "2", "2", "3", "3", "3", "3", "3", "3", "3", "3", "3", "3", "3", "3", "3"),
start = c("23477018","23535465","23567386","24708741","24708741","24708741","48339885","87274","87274","127819","1832772","1832772","1832772","6733569","7005524","7005524","7644572","8095433","8095433","8095433"),
end = c("47341413", "47341413", "47909872","42247834","47776347","47909872","53818713","3161655","3479466","3503792","3503792","4916249","5329014","8089225","12037894","13934484","12037894","12037894","13626119","13934484"))
setDT(df)
df[, c('start', 'end') := lapply(.SD, as.integer), .SDcols = c('start', 'end')]
df[, subgrp := cumsum(start > shift(cummax(.SD$end), fill = 0)), keyby = c('group', 'class')]
ans <- df[, .(start = min(start), end = max(end)), keyby = c('group', 'class', 'subgrp')]
ans[, subgrp := NULL][]
group class start end
1: 1 2 23477018 47909872
2: 1 2 48339885 53818713
3: 1 3 87274 5329014
4: 1 3 6733569 13934484
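To make the sub-grouping rule easier to follow, here is a rough plain-Python sketch of the same logic for a single (group, class) block. It deliberately uses a loop for readability, unlike the vectorised data.table code, and it assumes the rows are already ordered by start, as they are in the sample data. The function name merge_ranges is made up for illustration.

```python
def merge_ranges(rows):
    """Merge (start, end) pairs that overlap any earlier range.

    Assumes rows are ordered by start, as in the sample data.
    """
    merged = []
    running_max_end = 0  # plays the role of shift(cummax(end), fill = 0)
    for start, end in rows:
        if not merged or start > running_max_end:
            # start lies beyond every earlier end: open a new sub-group
            merged.append([start, end])
        else:
            # still inside the current sub-group: widen it
            merged[-1][0] = min(merged[-1][0], start)
            merged[-1][1] = max(merged[-1][1], end)
        running_max_end = max(running_max_end, end)
    return [tuple(r) for r in merged]


# The class "2" rows from the example data, converted to integers.
class_2 = [
    (23477018, 47341413), (23535465, 47341413), (23567386, 47909872),
    (24708741, 42247834), (24708741, 47776347), (24708741, 47909872),
    (48339885, 53818713),
]
result = merge_ranges(class_2)
print(result)
```

The two ranges it prints correspond to rows 1 and 2 of the data.table result above.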
Extract Min and Max Value from the row value delimited by #
You can loop over the columns in a list, use Series.str.extractall
to get negative and positive integers, reshape with Series.unstack
and convert to floats so the values are numeric. Then get the minimum and maximum values, using Series.mask
to set the maximum to a missing value when it equals the minimum:
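cols = ['Col_1', 'Col_2']
for c in cols:
    # pull every signed integer out of the string, one column per match
    df1 = df[c].str.extractall(r'([-]?\d+)')[0].unstack().astype(float)
    min1 = df1.min(axis=1)
    max1 = df1.max(axis=1)
    df[f'{c}_min'] = min1
    # blank out the max when it is the same as the min
    df[f'{c}_max'] = max1.mask(max1==min1)
print (df)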
Col_1 Col_2 Col_1_min Col_1_max Col_2_min Col_2_max
0 '0' '-33#90#' 0 NaN -33 90.0
1 '-1#65#' '0' -1 65.0 0 NaN
2 '90' '-22#-44#90#250' 90 NaN -44 250.0
If you need to remove the original columns:
cols = ['Col_1', 'Col_2']
for c in cols:
    # df.pop returns the column and drops it from df
    df1 = df.pop(c).str.extractall(r'([-]?\d+)')[0].unstack().astype(float)
    min1 = df1.min(axis=1)
    max1 = df1.max(axis=1)
    df[f'{c}_min'] = min1
    df[f'{c}_max'] = max1.mask(max1==min1)
print (df)
Col_1_min Col_1_max Col_2_min Col_2_max
0 0 NaN -33 90.0
1 -1 65.0 0 NaN
2 90 NaN -44 250.0
EDIT:
Another solution with split:
cols = ['Col_1', 'Col_2']
for c in cols:
    # strip the surrounding quotes, then split on '#' into separate columns
    df1 = df.pop(c).str.strip("'").str.split('#', expand=True)
    # empty pieces (e.g. after a trailing '#') become NaN
    df1 = df1.apply(pd.to_numeric, errors='coerce')
    min1 = df1.min(axis=1)
    max1 = df1.max(axis=1)
    df[f'{c}_min'] = min1
    df[f'{c}_max'] = max1.mask(max1==min1)
print (df)
Col_1_min Col_1_max Col_2_min Col_2_max
0 0.0 0.0 -33.0 NaN
1 -1.0 NaN 0.0 0.0
2 90.0 90.0 -44.0 NaN
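The extraction rule itself can be checked on single strings without pandas. The following is only a sketch: the helper name min_max is made up, it uses the same kind of signed-integer regex as the extractall version, and it assumes every cell contains at least one integer.

```python
import re


def min_max(cell):
    """Return (min, max) of the integers in a '#'-delimited string.

    max is None when it would equal min, mirroring the mask step.
    Assumes the cell contains at least one integer.
    """
    nums = [int(m) for m in re.findall(r"-?\d+", cell)]
    lo, hi = min(nums), max(nums)
    return lo, (hi if hi != lo else None)


for cell in ["0", "-33#90#", "-1#65#", "-22#-44#90#250"]:
    print(cell, min_max(cell))
```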
how to extract min and max rows of data frame and draw multiple graph in a lay out using R
Get the data in long format for the 'frac' columns, then get the min and max value for every X1.time value.
library(dplyr)
library(tidyr)
df %>%
  pivot_longer(cols = contains('frac')) %>%
  group_by(X1.time) %>%
  summarise(min_value = min(value),
            max_value = max(value))
# X1.time min_value max_value
# <dbl> <dbl> <dbl>
#1 945 0.904 0.965
#2 955 0.920 0.959
#3 965 0.921 0.982
#4 975 0.925 0.973
Another option is to use rowwise:
df %>%
  rowwise() %>%
  transmute(X1.time,
            min_value = min(c_across(contains('frac'))),
            max_value = max(c_across(contains('frac')))) %>%
  ungroup()
data
It is helpful if you share data in a reproducible format, which is easier to copy.
df <- structure(list(X1.time = c(945, 955, 965, 975), X1.frac = c(0.937752593,
0.959463167, 0.982386049, 0.973241841), X1.time.1 = c(945, 955,
965, 975), X1.frac.1 = c(0.965208348, 0.954415107, 0.959723958,
0.925369792), X1.time.2 = c(945, 955, 965, 975), X1.frac.2 = c(0.904265228,
0.919962471, 0.920854173, 0.928773106)), row.names = c(NA, -4L),
class = "data.frame")
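As a cross-check of the per-row summary, the same min and max can be computed in plain Python from the values in the structure() call above. This is a sketch only; the variable names are made up, and the three inner lists correspond to X1.frac, X1.frac.1 and X1.frac.2.

```python
times = [945, 955, 965, 975]
fracs = [
    [0.937752593, 0.959463167, 0.982386049, 0.973241841],  # X1.frac
    [0.965208348, 0.954415107, 0.959723958, 0.925369792],  # X1.frac.1
    [0.904265228, 0.919962471, 0.920854173, 0.928773106],  # X1.frac.2
]

# zip(times, *fracs) walks the data row by row: (time, frac, frac.1, frac.2)
summary = [(t, min(vals), max(vals)) for t, *vals in zip(times, *fracs)]

for t, lo, hi in summary:
    print(t, round(lo, 3), round(hi, 3))
```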
Finding the maximum value for each row and extract column names
You can use apply like this:
maxColumnNames <- apply(x,1,function(row) colnames(x)[which.max(row)])
Since you have a numeric matrix, you can't add the names as an extra column (the matrix would be converted to a character matrix).
You can use a data.frame instead and do
resDf <- cbind(data.frame(x),data.frame(maxColumnNames = maxColumnNames))
resulting in
resDf
A B C maxColumnNames
X 1 4 7 C
Y 2 5 8 C
Z 3 6 9 C
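The apply call boils down to "for each row, find the position of the largest value and look up the column name at that position". A small Python sketch of that step, using the matrix values printed above (the dictionary layout and the name max_col are just for illustration):

```python
colnames = ["A", "B", "C"]
x = {"X": [1, 4, 7], "Y": [2, 5, 8], "Z": [3, 6, 9]}

# row.index(max(row)) gives the position of the first maximum,
# comparable to which.max(row) in R (which is 1-based).
max_col = {name: colnames[row.index(max(row))] for name, row in x.items()}

print(max_col)
```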
R output BOTH maximum and minimum value by group in dataframe
You can use range to get the min and max values, and use it in summarise to get separate rows for each Name.
library(dplyr)
df %>%
  group_by(Name) %>%
  summarise(Value = range(Value), .groups = "drop")
# Name Value
# <chr> <int>
#1 A 27
#2 A 57
#3 B 20
#4 B 89
#5 C 58
#6 C 97
If you have a large dataset, using data.table might be faster.
library(data.table)
setDT(df)[, .(Value = range(Value)), Name]
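The shape of the result (two rows per Name, min first and max second) can be illustrated with a short Python sketch. The input records are invented for this example; only their per-group extremes were chosen to match the output shown above.

```python
from collections import defaultdict

# Hypothetical input; the original data frame is not shown in the answer.
records = [("A", 57), ("A", 27), ("A", 40),
           ("B", 89), ("B", 20),
           ("C", 58), ("C", 97)]

groups = defaultdict(list)
for name, value in records:
    groups[name].append(value)

# Emit the minimum and then the maximum for each name, like range() does.
result = []
for name in sorted(groups):
    result.append((name, min(groups[name])))
    result.append((name, max(groups[name])))

print(result)
```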
Extract MIN and MAX values related to datetime values on Postgres with a condition
This is a Gaps & Islands problem. You can use the traditional solution, using LAG() or LEAD().
For example:
select min(date), max(date), max(value)
from (
  select *, sum(i) over(order by date) as g
  from (
    select *,
      case when (lag(value) over(order by date) > 170) <> (value > 170)
           then 1 else 0
      end as i
    from mytable
  ) x
) y
group by g
having max(value) > 170
order by g
Result:
min max max
-------------------- -------------------- -----
2022-02-07 15:30:55 2022-02-07 15:32:00 172.0
2022-02-07 15:34:10 2022-02-07 15:35:20 173.7
See running example at db<>fiddle.
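The query marks a row with 1 whenever the "above 170" state changes, and the running sum of those marks labels each consecutive run (island). The same idea can be sketched in Python with itertools.groupby, which groups consecutive items that share a key. The readings below are invented for illustration and are assumed to be sorted by time; only the threshold of 170 comes from the query.

```python
from itertools import groupby

# Hypothetical (time, value) readings, already sorted by time.
readings = [
    ("15:30:00", 168.0), ("15:30:55", 171.2), ("15:32:00", 172.0),
    ("15:33:00", 169.5), ("15:34:10", 173.7), ("15:35:20", 170.4),
    ("15:36:00", 165.0),
]
THRESHOLD = 170

islands = []
# groupby starts a new run each time the above/below state flips.
for above, run in groupby(readings, key=lambda r: r[1] > THRESHOLD):
    if above:
        run = list(run)
        # first time, last time and peak value of the run
        islands.append((run[0][0], run[-1][0], max(v for _, v in run)))

print(islands)
```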