Sql: Filter Rows with Max Value

SQL: Filter rows with max value

An efficient way to do this is often to use not exists:

select t.*
from table t
where not exists (select 1
from table t2
where t2.file = t.file and t2.Version > t.version
);

This query can take advantage of an index on table(file, version).

This rephrases the query to be: "Get me all rows from the table where the corresponding file has no larger version."

SQL select only rows with max value on a column

At first glance...

All you need is a GROUP BY clause with the MAX aggregate function:

SELECT id, MAX(rev)
FROM YourTable
GROUP BY id

It's never that simple, is it?

I just noticed you need the content column as well.

This is a very common question in SQL: find the whole data for the row with some max value in a column per some group identifier. I heard that a lot during my career. Actually, it was one the questions I answered in my current job's technical interview.

It is, actually, so common that Stack Overflow community has created a single tag just to deal with questions like that: greatest-n-per-group.

Basically, you have two approaches to solve that problem:

Joining with simple group-identifier, max-value-in-group Sub-query

In this approach, you first find the group-identifier, max-value-in-group (already solved above) in a sub-query. Then you join your table to the sub-query with equality on both group-identifier and max-value-in-group:

SELECT a.id, a.rev, a.contents
FROM YourTable a
INNER JOIN (
SELECT id, MAX(rev) rev
FROM YourTable
GROUP BY id
) b ON a.id = b.id AND a.rev = b.rev

Left Joining with self, tweaking join conditions and filters

In this approach, you left join the table with itself. Equality goes in the group-identifier. Then, 2 smart moves:

  1. The second join condition is having left side value less than right value
  2. When you do step 1, the row(s) that actually have the max value will have NULL in the right side (it's a LEFT JOIN, remember?). Then, we filter the joined result, showing only the rows where the right side is NULL.

So you end up with:

SELECT a.*
FROM YourTable a
LEFT OUTER JOIN YourTable b
ON a.id = b.id AND a.rev < b.rev
WHERE b.id IS NULL;

Conclusion

Both approaches bring the exact same result.

If you have two rows with max-value-in-group for group-identifier, both rows will be in the result in both approaches.

Both approaches are SQL ANSI compatible, thus, will work with your favorite RDBMS, regardless of its "flavor".

Both approaches are also performance friendly, however your mileage may vary (RDBMS, DB Structure, Indexes, etc.). So when you pick one approach over the other, benchmark. And make sure you pick the one which make most of sense to you.

how to use SQL group to filter rows with maximum date value

If you are using a DBMS that has analytical functions you can use ROW_NUMBER:

SELECT  Id, Value, ADate
FROM ( SELECT ID,
Value,
ADate,
ROW_NUMBER() OVER(PARTITION BY ID ORDER BY Adate DESC) AS RowNum
FROM Test
) AS T
WHERE RowNum = 1;

Otherwise you will need to use a join to the aggregated max date by Id to filter the results from Test to only those where the date matches the maximum date for that Id

SELECT  Test.Id, Test.Value, Test.ADate
FROM Test
INNER JOIN
( SELECT ID, MAX(ADate) AS ADate
FROM Test
GROUP BY ID
) AS MaxT
ON MaxT.ID = Test.ID
AND MaxT.ADate = Test.ADate;

How can I SELECT rows with MAX(Column value), PARTITION by another column in MYSQL?

You are so close! All you need to do is select BOTH the home and its max date time, then join back to the topten table on BOTH fields:

SELECT tt.*
FROM topten tt
INNER JOIN
(SELECT home, MAX(datetime) AS MaxDateTime
FROM topten
GROUP BY home) groupedtt
ON tt.home = groupedtt.home
AND tt.datetime = groupedtt.MaxDateTime

Filter rows from oracle sql table using max value of a column

Use keep analytic keyword:

select name, min(nation) keep (dense_rank last order by cnt)
from (select name, nation, count(*) as cnt
from /* your data source */
group by name, nation)
group by name
  1. min(nation) - min is meaningless in this case but you must keep it
    (doesn't work without)
  2. keep - keeps only one result of nation
  3. dense_rank last says to pick up the last element
  4. order by cnt says how to define the order of elements

In the end it will make for every name the nation with the biggest count. The same result can be achieved with

select name, min(nation) keep (dense_rank first order by cnt desc)

Spark.sql Filter rows by MAX

You can have your result using a SQL window in your request, as follows:

SELECT
cityname,
postcode,
date,
total
FROM
(SELECT
cityname,
postcode,
date,
total,
MAX(total) OVER (PARTITION BY cityname ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS max_total
FROM tablecases)
WHERE max_total = total
ORDER BY max_total DESC, date, cityname

How to select rows with max values in categories?

The only solution that comes to my mind is to :

  • Get the highest day for each ID (using groupBy)
  • Append the value of the highest day to each line (with matching ID) using join
  • Then a simple filter where the value of the two lines match
# select the max value for each of the ID
maxDayForIDs = df.groupBy("ID").max("day").withColumnRenamed("max(day)", "maxDay")

# now add the max value of the day for each line (with matching ID)
df = df.join(maxDayForIDs, "ID")

# keep only the lines where it matches "day" equals "maxDay"
df = df.filter(df.day == df.maxDay)

sql/mysql filter including only the max value

SELECT * FROM t WHERE myValue IN (SELECT max(myValue) From t);

###See this SQLFiddle

Edit:

As per discussion with OP.
OP wants to use alias in WHERE clause. But you can only use column aliases in GROUP BY, ORDER BY, or HAVING clauses.

Look at this answer.



Related Topics



Leave a reply



Submit