Select Rows with min value for each group
But the best approach would be to store the dates in a date column. Then you can use all the date functions, and MIN compares dates rather than strings.
CREATE TABLE table1 (
[Date] varchar(10),
[Container ID] INTEGER
);
INSERT INTO table1
([Date], [Container ID])
VALUES
('1/1', '1'),
('2/2', '1'),
('3/3', '1'),
('4/4', '2'),
('5/5', '2'),
('6/6', '3'),
('7/7', '3');
GO
SELECT MIN([Date]), [Container ID] FROM table1 GROUP BY [Container ID]
GO
(No column name) | Container ID
:--------------- | -----------:
1/1 | 1
4/4 | 2
6/6 | 3
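To illustrate why the column type matters, here is a minimal sketch using Python's built-in sqlite3 module rather than the SQL Server setup above. The column names date_text and date_iso and the year 2023 are made up for this example. SQLite has no dedicated date type, so ISO-8601 strings stand in for a date column; they sort chronologically, while '10/1' sorts before '2/2' when compared as text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table1 (date_text TEXT, date_iso TEXT, container_id INTEGER)"
)
# Hypothetical rows: the same dates stored as 'M/D' text and as ISO-8601 text.
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?)",
    [
        ("2/2", "2023-02-02", 1),
        ("10/1", "2023-10-01", 1),
        ("4/4", "2023-04-04", 2),
    ],
)

query = (
    "SELECT container_id, MIN({col}) FROM table1 "
    "GROUP BY container_id ORDER BY container_id"
)
# MIN over 'M/D' text compares character by character, so '10/1' < '2/2'.
text_min = conn.execute(query.format(col="date_text")).fetchall()
# MIN over ISO-8601 text matches chronological order.
iso_min = conn.execute(query.format(col="date_iso")).fetchall()

print(text_min)
print(iso_min)
```

For container 1, the text column reports October as the earliest date, while the ISO column reports February.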
Select rows with min value by group
Using DWin's solution, tapply can be avoided using ave.
df[ df$v1 == ave(df$v1, df$f, FUN=min), ]
This gives another speed-up, as shown below. Mind you, this also depends on the number of levels. I mention it because I notice that ave is far too often forgotten about, although it is one of the more powerful functions in R.
f <- rep(letters[1:20],10000)
v1 <- rnorm(20*10000)
v2 <- 1:(20*10000)
df <- data.frame(f,v1,v2)
> system.time(df[ df$v1 == ave(df$v1, df$f, FUN=min), ])
user system elapsed
0.05 0.00 0.05
> system.time(df[ df$v1 %in% tapply(df$v1, df$f, min), ])
user system elapsed
0.25 0.03 0.29
> system.time(lapply(split(df, df$f), FUN = function(x) {
+ vec <- which(x[3] == min(x[3]))
+ return(x[vec, ])
+ })
+ .... [TRUNCATED]
user system elapsed
0.56 0.00 0.58
> system.time(df[tapply(1:nrow(df),df$f,function(i) i[which.min(df$v1[i])]),]
+ )
user system elapsed
0.17 0.00 0.19
> system.time( ddply(df, .var = "f", .fun = function(x) {
+ return(subset(x, v1 %in% min(v1)))
+ }
+ )
+ )
user system elapsed
0.28 0.00 0.28
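For readers outside R, the idea behind the ave version can be sketched in plain Python: compute each group's minimum, look it up again for every row, and keep the rows whose value equals it. The function name rows_at_group_min and the sample rows are hypothetical. Because each row is compared only with its own group's minimum, a value that happens to equal another group's minimum is not picked up, which is something the %in% tapply variant does not guard against.

```python
def rows_at_group_min(rows, key, value):
    """Return every row whose value equals the minimum of its group (ties kept)."""
    group_min = {}
    for row in rows:
        k = row[key]
        if k not in group_min or row[value] < group_min[k]:
            group_min[k] = row[value]
    # Compare each row with its own group's minimum, preserving input order.
    return [row for row in rows if row[value] == group_min[row[key]]]


# Hypothetical sample data using the same column names as the benchmark.
rows = [
    {"f": "a", "v1": 0.5, "v2": 1},
    {"f": "b", "v1": -1.2, "v2": 2},
    {"f": "a", "v1": -0.3, "v2": 3},
    {"f": "b", "v1": -1.2, "v2": 4},
]
result = rows_at_group_min(rows, "f", "v1")
print([r["v2"] for r in result])
```

Both rows of group b are kept here because they tie for the minimum.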
Pandas GroupBy and select rows with the minimum value in a specific column
I feel like you're overthinking this. Just use groupby and idxmin:
df.loc[df.groupby('A').B.idxmin()]
A B C
2 1 2 10
4 2 4 4
df.loc[df.groupby('A').B.idxmin()].reset_index(drop=True)
A B C
0 1 2 10
1 2 4 4
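What groupby plus idxmin selects can be spelled out without pandas. This is only a plain-Python illustration of the behaviour, not how pandas implements it. The helper name first_min_index_per_group and the sample rows are invented, chosen so that the selected positions match the output above. On ties, the first row with the minimum wins, so exactly one row per group is returned.

```python
def first_min_index_per_group(rows, key, value):
    """Return, for each group in sorted key order, the index of its first minimal row."""
    best = {}  # group key -> index of the first row holding the group's minimum
    for i, row in enumerate(rows):
        k = row[key]
        # Strict '<' keeps the earliest row when several rows tie.
        if k not in best or row[value] < rows[best[k]][value]:
            best[k] = i
    return [best[k] for k in sorted(best)]


# Hypothetical data; row 6 ties with row 4 on B but comes later.
rows = [
    {"A": 1, "B": 4, "C": 3},
    {"A": 1, "B": 5, "C": 6},
    {"A": 1, "B": 2, "C": 10},
    {"A": 2, "B": 7, "C": 2},
    {"A": 2, "B": 4, "C": 4},
    {"A": 2, "B": 6, "C": 6},
    {"A": 2, "B": 4, "C": 9},
]
idx = first_min_index_per_group(rows, "A", "B")
selected = [rows[i] for i in idx]
print(idx)
```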
Group by minimum value in one field while selecting distinct rows
How about something like:
SELECT mt.*
FROM MyTable mt INNER JOIN
(
SELECT id, MIN(record_date) AS MinDate
FROM MyTable
GROUP BY id
) t ON mt.id = t.id AND mt.record_date = t.MinDate
This gets the minimum date per ID, and then gets the values based on those values. The only time you would have duplicates is if there are duplicate minimum record_dates for the same ID.
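The tie behaviour described above can be checked with a small sketch using Python's built-in sqlite3 module. The other_cols column and the sample rows are assumptions made for illustration, and the ORDER BY is added only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE MyTable (id INTEGER, record_date TEXT, other_cols TEXT)"
)
# Hypothetical data: id 2 has two rows sharing its earliest record_date.
conn.executemany(
    "INSERT INTO MyTable VALUES (?, ?, ?)",
    [
        (1, "2023-01-05", "a"),
        (1, "2023-01-02", "b"),
        (2, "2023-03-01", "c"),
        (2, "2023-03-01", "d"),
        (2, "2023-04-09", "e"),
    ],
)

rows = conn.execute(
    """
    SELECT mt.*
    FROM MyTable mt
    INNER JOIN (
        SELECT id, MIN(record_date) AS MinDate
        FROM MyTable
        GROUP BY id
    ) t ON mt.id = t.id AND mt.record_date = t.MinDate
    ORDER BY mt.id, mt.other_cols
    """
).fetchall()

for row in rows:
    print(row)
```

id 1 yields a single row, while id 2 yields both rows that share the minimum date.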
pandas groupby ID and select row with minimal value of specific columns
Bkeesey's answer looks like it almost got you to your solution. I added one more step to get the overall minimum for each group.
import pandas as pd
# create sample df
df = pd.DataFrame({'ID': [1, 1, 2, 2, 3, 3],
'A': [30, 14, 100, 67, 1, 20],
'B': [10, 1, 2, 5, 100, 3],
'C': [1, 2, 3, 4, 5, 6],
})
# set "ID" as the index
df = df.set_index('ID')
# get the min for each column
mindf = df[['A','B']].groupby('ID').transform('min')
# get the min between columns and add it to df
df['min'] = mindf.apply(min, axis=1)
# filter df for when A or B matches the min
df2 = df.loc[(df['A'] == df['min']) | (df['B'] == df['min'])]
print(df2)
In my simplified example, I'm just finding the minimum between columns A and B. Here's the output:
A B C min
ID
1 14 1 2 1
2 100 2 3 2
3 1 100 5 1
How to select rows, using group by with minimum field values?
SELECT min(m.id) AS id, m.item_id, m.user_id, m.bid_price
FROM my_table m
INNER JOIN (
SELECT item_id, min(bid_price) AS min_price
FROM my_table
GROUP BY item_id
) t ON t.item_id = m.item_id
AND t.min_price= m.bid_price
GROUP BY m.item_id
Output
id item_id user_id bid_price
1 1 11 1
7 2 17 1
8 3 18 2
Live Demo
http://sqlfiddle.com/#!9/a52dc6/13
SQL query to select distinct row with minimum value
Use:
SELECT tbl.*
FROM TableName tbl
INNER JOIN
(
SELECT Id, MIN(Point) MinPoint
FROM TableName
GROUP BY Id
) tbl1
ON tbl1.id = tbl.id
WHERE tbl1.MinPoint = tbl.Point
Select rows with min value based on fourth column and group by first column in linux
You should just sort by column 4. You need to store the entire line in the array, not just $4. And then print the entire array at the end.
To keep the heading from getting mixed in, I print that separately and then process the rest of the file.
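# write the heading to the output file first
head -n 1 original_file > out
# sort numerically on column 4 only (no -u, which would drop rows from
# other groups that happen to share the same column-4 value)
tail -n +2 original_file | sort -t, -k 4,4n | awk -F, '
!($1 in a) { a[$1] = $0 }
END { for (k in a) print a[k] }' | sort -t, -k 1,1n >> out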
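The same result can be reached in a single pass without sorting on column 4. The sketch below uses Python's csv module and assumes a comma-separated file with one heading line, a numeric first column, and a numeric fourth column. The function name min_rows_by_group and the sample data are hypothetical.

```python
import csv
import io


def min_rows_by_group(lines, group_col=0, value_col=3):
    """Keep, for each value of group_col, the full row with the smallest value_col."""
    reader = csv.reader(lines)
    header = next(reader)  # handle the heading separately from the data
    best = {}
    for row in reader:
        key = row[group_col]
        # Store the whole row, not just the compared column.
        if key not in best or float(row[value_col]) < float(best[key][value_col]):
            best[key] = row
    # Return the groups in numeric order of the grouping column.
    return header, [best[k] for k in sorted(best, key=float)]


# Hypothetical input standing in for original_file.
sample = io.StringIO(
    "id,name,city,score\n"
    "2,x,p,7\n"
    "1,y,q,5\n"
    "1,z,r,3\n"
    "2,w,s,9\n"
)
header, rows = min_rows_by_group(sample)
print(",".join(header))
for row in rows:
    print(",".join(row))
```

When two rows in a group tie on the fourth column, this version keeps whichever appears first in the file.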
Get row with highest or lowest value from a GROUP BY
I think this is what you are trying to achieve:
SELECT t.* FROM test t
JOIN
( SELECT Name, MIN(Value) minVal
FROM test GROUP BY Name
) t2
ON t.Value = t2.minVal AND t.Name = t2.Name;
Output:
ID | VALUE | NAME |
---|---|---|
1 | 10 | row1 |
4 | 5 | row2 |
Select all rows of dataframe that have a minimum value for a group
Use DataFrame.sort_values + DataFrame.drop_duplicates.
df.sort_values(['date','time']).drop_duplicates(subset ='date')[['date','value']]
# date value
#1 1/12 13
#2 1/13 8
or
df.sort_values(['date','time']).groupby('date',as_index=False).first()[['date','value']]
# date value
# 0 1/12 13
# 1 1/13 8
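The sort-then-keep-first idea behind both variants can be written out in plain Python as a rough sketch. The helper name first_per_key_after_sort and the sample rows are made up, chosen to agree with the output shown above, and the times are assumed to be zero-padded strings so that they sort correctly as text.

```python
def first_per_key_after_sort(rows, sort_keys, dedupe_key):
    """Sort rows by sort_keys, then keep only the first row seen for each dedupe_key."""
    seen = set()
    kept = []
    for row in sorted(rows, key=lambda r: tuple(r[k] for k in sort_keys)):
        if row[dedupe_key] not in seen:
            seen.add(row[dedupe_key])
            kept.append(row)
    return kept


# Hypothetical data: the earliest time per date carries the value to keep.
rows = [
    {"date": "1/12", "time": "10:00", "value": 20},
    {"date": "1/12", "time": "09:00", "value": 13},
    {"date": "1/13", "time": "08:30", "value": 8},
    {"date": "1/13", "time": "11:15", "value": 5},
]
kept = first_per_key_after_sort(rows, ["date", "time"], "date")
result = [(r["date"], r["value"]) for r in kept]
print(result)
```

Note that this keeps the row with the smallest time within each date, so the value returned is that row's value rather than the smallest value in the group.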