SQL: Filter Rows with Max Value

SQL: Filter rows with max value

An efficient way to do this is often to use not exists:

select t.*
from table t
where not exists (select 1
                  from table t2
                  where t2.file = t.file and t2.version > t.version
                 );

This query can take advantage of an index on table(file, version).

This rephrases the query to be: "Get me all rows from the table where the corresponding file has no larger version."
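To try the not exists query without a database server, here is a minimal sketch using Python's built-in sqlite3 module. The table name `files`, the `note` column, and the sample rows are made up for illustration; only the shape of the query comes from the answer above.

```python
import sqlite3

# Hypothetical table and data, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE files (file TEXT, version INTEGER, note TEXT)")
conn.executemany(
    "INSERT INTO files VALUES (?, ?, ?)",
    [("a.txt", 1, "first"), ("a.txt", 2, "second"), ("b.txt", 5, "only")],
)
# The composite index mentioned above, on (file, version).
conn.execute("CREATE INDEX idx_files_file_version ON files (file, version)")

# Keep a row only if no other row for the same file has a larger version.
latest = conn.execute(
    """
    SELECT t.file, t.version, t.note
    FROM files t
    WHERE NOT EXISTS (SELECT 1
                      FROM files t2
                      WHERE t2.file = t.file AND t2.version > t.version)
    ORDER BY t.file
    """
).fetchall()
print(latest)
```

Whether the index is actually used depends on the database and its planner, so treat the index line as something to verify with your own query plan.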

SQL select only rows with max value on a column

At first glance...

All you need is a GROUP BY clause with the MAX aggregate function:

SELECT id, MAX(rev)
FROM YourTable
GROUP BY id

It's never that simple, is it?

I just noticed you need the content column as well.

This is a very common question in SQL: find the whole data for the row with some max value in a column per some group identifier. I heard that a lot during my career. Actually, it was one of the questions I answered in my current job's technical interview.

It is, actually, so common that the Stack Overflow community has created a single tag just to deal with questions like that: greatest-n-per-group.

Basically, you have two approaches to solve that problem:

Joining with simple `group-identifier, max-value-in-group` Sub-query

In this approach, you first find the group-identifier, max-value-in-group (already solved above) in a sub-query. Then you join your table to the sub-query with equality on both group-identifier and max-value-in-group:

SELECT a.id, a.rev, a.contents
FROM YourTable a
INNER JOIN (
    SELECT id, MAX(rev) rev
    FROM YourTable
    GROUP BY id
) b ON a.id = b.id AND a.rev = b.rev
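Here is a small runnable sketch of this first approach, using Python's built-in sqlite3 module. The sample rows are invented for illustration; the table and column names follow the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE YourTable (id INTEGER, rev INTEGER, contents TEXT)")
# Hypothetical sample data: id 1 has three revisions, id 2 has one.
conn.executemany(
    "INSERT INTO YourTable VALUES (?, ?, ?)",
    [(1, 1, "a"), (1, 2, "b"), (1, 3, "c"), (2, 1, "d")],
)

# Step 1: the plain GROUP BY gives the max rev per id, but no contents.
max_per_id = conn.execute(
    "SELECT id, MAX(rev) FROM YourTable GROUP BY id ORDER BY id"
).fetchall()

# Step 2: join back on both id and rev to recover the full rows.
full_rows = conn.execute(
    """
    SELECT a.id, a.rev, a.contents
    FROM YourTable a
    INNER JOIN (
        SELECT id, MAX(rev) rev
        FROM YourTable
        GROUP BY id
    ) b ON a.id = b.id AND a.rev = b.rev
    ORDER BY a.id
    """
).fetchall()
print(max_per_id)
print(full_rows)
```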

Left Joining with self, tweaking join conditions and filters

In this approach, you left join the table with itself. Equality goes in the group-identifier. Then, 2 smart moves:

1. The second join condition is having the left side value less than the right side value.
2. When you do step 1, the row(s) that actually have the max value will have NULL in the right side (it's a LEFT JOIN, remember?). Then, we filter the joined result, showing only the rows where the right side is NULL.

So you end up with:

SELECT a.*
FROM YourTable a
LEFT OUTER JOIN YourTable b
    ON a.id = b.id AND a.rev < b.rev
WHERE b.id IS NULL;
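To see why the NULL filter works, it helps to look at the joined rows before filtering. The sketch below uses Python's built-in sqlite3 module with invented sample data (the `contents` column is left out to keep the output short).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE YourTable (id INTEGER, rev INTEGER)")
# Hypothetical sample data.
conn.executemany(
    "INSERT INTO YourTable VALUES (?, ?)",
    [(1, 1), (1, 2), (1, 3), (2, 1)],
)

# The unfiltered self join: each row is paired with every larger rev
# in the same id, or with NULL when no larger rev exists.
joined = conn.execute(
    """
    SELECT a.id, a.rev, b.rev
    FROM YourTable a
    LEFT OUTER JOIN YourTable b
        ON a.id = b.id AND a.rev < b.rev
    ORDER BY a.id, a.rev, b.rev
    """
).fetchall()

# Keeping only the rows with NULL on the right side leaves the max rows.
max_rows = [(a_id, a_rev) for a_id, a_rev, b_rev in joined if b_rev is None]
print(joined)
print(max_rows)
```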

Conclusion

Both approaches return exactly the same result, as long as the compared column contains no NULLs.

If you have two rows with max-value-in-group for group-identifier, both rows will be in the result in both approaches.
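The tie behaviour is easy to check. This is a sketch with Python's built-in sqlite3 module and invented data in which id 1 has two rows sharing the max rev; both queries are the ones shown earlier, with an ORDER BY added only to make the comparison deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE YourTable (id INTEGER, rev INTEGER, contents TEXT)")
# Hypothetical data: id 1 has a tie on rev = 3.
conn.executemany(
    "INSERT INTO YourTable VALUES (?, ?, ?)",
    [(1, 3, "x"), (1, 3, "y"), (1, 1, "z"), (2, 2, "w")],
)

via_subquery = conn.execute(
    """
    SELECT a.id, a.rev, a.contents
    FROM YourTable a
    INNER JOIN (SELECT id, MAX(rev) rev FROM YourTable GROUP BY id) b
        ON a.id = b.id AND a.rev = b.rev
    ORDER BY a.id, a.contents
    """
).fetchall()

via_left_join = conn.execute(
    """
    SELECT a.id, a.rev, a.contents
    FROM YourTable a
    LEFT OUTER JOIN YourTable b
        ON a.id = b.id AND a.rev < b.rev
    WHERE b.id IS NULL
    ORDER BY a.id, a.contents
    """
).fetchall()
print(via_subquery)
print(via_left_join)
```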

Both approaches are ANSI SQL compatible and will therefore work with your favorite RDBMS, regardless of its "flavor".

Both approaches are also performance friendly, although your mileage may vary (RDBMS, DB structure, indexes, etc.). So when you pick one approach over the other, benchmark. And make sure you pick the one that makes the most sense to you.

how to use SQL group to filter rows with maximum date value

If you are using a DBMS that has analytical functions you can use ROW_NUMBER:

SELECT  Id, Value, ADate
FROM    (   SELECT  ID,
                    Value,
                    ADate,
                    ROW_NUMBER() OVER(PARTITION BY ID ORDER BY Adate DESC) AS RowNum
            FROM    Test
        ) AS T
WHERE   RowNum = 1;

Otherwise you will need to join to the aggregated max date per Id, filtering the results from Test to only those rows where the date matches the maximum date for that Id:

SELECT  Test.Id, Test.Value, Test.ADate
FROM    Test
        INNER JOIN
        (   SELECT  ID, MAX(ADate) AS ADate
            FROM    Test
            GROUP BY ID
        ) AS MaxT
            ON MaxT.ID = Test.ID
            AND MaxT.ADate = Test.ADate;
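The two queries differ when several rows share the maximum date: ROW_NUMBER keeps exactly one row per Id (which one is arbitrary unless you add a tie-breaker to the ORDER BY), while the join keeps every tied row. Here is a sketch with Python's built-in sqlite3 module; it assumes the bundled SQLite is version 3.25 or later (needed for window functions), and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Test (Id INTEGER, Value TEXT, ADate TEXT)")
# Hypothetical data: Id 1 has two rows on its latest date.
conn.executemany(
    "INSERT INTO Test VALUES (?, ?, ?)",
    [
        (1, "a", "2020-01-01"),
        (1, "b", "2020-03-01"),
        (1, "c", "2020-03-01"),
        (2, "d", "2020-02-01"),
    ],
)

with_row_number = conn.execute(
    """
    SELECT Id, Value, ADate
    FROM (SELECT Id, Value, ADate,
                 ROW_NUMBER() OVER (PARTITION BY Id ORDER BY ADate DESC) AS RowNum
          FROM Test) AS T
    WHERE RowNum = 1
    ORDER BY Id
    """
).fetchall()

with_join = conn.execute(
    """
    SELECT Test.Id, Test.Value, Test.ADate
    FROM Test
    INNER JOIN (SELECT Id, MAX(ADate) AS ADate FROM Test GROUP BY Id) AS MaxT
        ON MaxT.Id = Test.Id AND MaxT.ADate = Test.ADate
    ORDER BY Test.Id, Test.Value
    """
).fetchall()
print(len(with_row_number), len(with_join))
```

If you want all tied rows from the window-function version, RANK() in place of ROW_NUMBER() is the usual substitute.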

How can I SELECT rows with MAX(Column value), PARTITION by another column in MYSQL?

You are so close! All you need to do is select BOTH the home and its max date time, then join back to the topten table on BOTH fields:

SELECT tt.*
FROM topten tt
INNER JOIN
    (SELECT home, MAX(datetime) AS MaxDateTime
    FROM topten
    GROUP BY home) groupedtt 
ON tt.home = groupedtt.home 
AND tt.datetime = groupedtt.MaxDateTime

Filter rows from oracle sql table using max value of a column

Use the keep keyword:

select name, min(nation) keep (dense_rank last order by cnt)
from (select name, nation, count(*) as cnt
      from /* your data source */
      group by name, nation)
group by name

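min(nation) - an aggregate is required here (the query doesn't work without one); min only matters when several nations tie for the last rank, in which case it picks the smallest of them
keep - restricts the aggregate to the rows that rank first or last in the given order
dense_rank last - says to pick the last rank in that order
order by cnt - says how to define the order of elements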

The result is, for every name, the nation with the biggest count. The same result can be achieved by changing the select list to:

select name, min(nation) keep (dense_rank first order by cnt desc)
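For readers without an Oracle instance, here is a plain Python sketch of what `min(nation) keep (dense_rank last order by cnt)` computes per name. This is only an emulation of the semantics, not Oracle code; the function name and the sample rows are made up.

```python
from collections import defaultdict

# Hypothetical pre-aggregated rows, as produced by the inner query:
# (name, nation, cnt)
rows = [
    ("Ann", "FR", 3),
    ("Ann", "DE", 5),
    ("Ann", "BE", 5),
    ("Bob", "US", 1),
]


def keep_dense_rank_last_min(rows):
    """Per name: restrict to the rows with the highest cnt, then take min(nation)."""
    groups = defaultdict(list)
    for name, nation, cnt in rows:
        groups[name].append((nation, cnt))

    result = {}
    for name, members in groups.items():
        top_cnt = max(cnt for _, cnt in members)  # "dense_rank last order by cnt"
        tied = [nation for nation, cnt in members if cnt == top_cnt]
        result[name] = min(tied)  # min only matters when there is a tie
    return result


top_nation = keep_dense_rank_last_min(rows)
print(top_nation)
```

For Ann, DE and BE tie on the biggest count, so min picks BE; with no tie, min simply returns the single remaining nation.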

Spark.sql Filter rows by MAX

You can get this result with a SQL window function in your query, as follows:

SELECT
  cityname, 
  postcode, 
  date, 
  total
FROM
 (SELECT 
    cityname, 
    postcode, 
    date, 
    total, 
    MAX(total) OVER (PARTITION BY cityname ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS max_total 
  FROM tablecases)
WHERE max_total = total
ORDER BY max_total DESC, date, cityname
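The same window-function pattern can be tried locally without Spark. The sketch below runs it on SQLite through Python's built-in sqlite3 module (assuming SQLite 3.25 or later); the sample rows are invented. The explicit ROWS BETWEEN clause is dropped here because, without an ORDER BY in the window, the default frame already covers the whole partition.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tablecases (cityname TEXT, postcode TEXT, date TEXT, total INTEGER)"
)
# Hypothetical sample data.
conn.executemany(
    "INSERT INTO tablecases VALUES (?, ?, ?, ?)",
    [
        ("Paris", "75001", "2020-01-01", 10),
        ("Paris", "75001", "2020-01-02", 30),
        ("Lyon", "69001", "2020-01-01", 20),
        ("Lyon", "69001", "2020-01-02", 5),
    ],
)

# Attach the per-city max to every row, then keep the rows that equal it.
max_rows = conn.execute(
    """
    SELECT cityname, postcode, date, total
    FROM (SELECT cityname, postcode, date, total,
                 MAX(total) OVER (PARTITION BY cityname) AS max_total
          FROM tablecases) AS w
    WHERE max_total = total
    ORDER BY max_total DESC, date, cityname
    """
).fetchall()
print(max_rows)
```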

How to select rows with max values in categories?

The only solution that comes to my mind is to:

1. Get the highest day for each ID (using groupBy)
2. Append the value of the highest day to each line (with matching ID) using join
3. Then apply a simple filter that keeps the lines where the two values match

# select the max value for each of the ID
maxDayForIDs = df.groupBy("ID").max("day").withColumnRenamed("max(day)", "maxDay")

# now add the max value of the day for each line (with matching ID)
df = df.join(maxDayForIDs, "ID")

# keep only the lines where it matches "day" equals "maxDay"
df = df.filter(df.day == df.maxDay)
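If you want to check the logic of the three steps without a Spark session, the same idea can be sketched in plain Python. This is not PySpark code; the list-of-dicts layout and the sample values are assumptions made for illustration.

```python
# Hypothetical rows standing in for the DataFrame.
rows = [
    {"ID": "a", "day": 1, "value": 10},
    {"ID": "a", "day": 3, "value": 30},
    {"ID": "b", "day": 2, "value": 20},
    {"ID": "b", "day": 1, "value": 15},
]

# Step 1: highest day for each ID (the groupBy + max).
max_day_for_ids = {}
for row in rows:
    current = max_day_for_ids.get(row["ID"])
    if current is None or row["day"] > current:
        max_day_for_ids[row["ID"]] = row["day"]

# Step 2: append the max day to each line with the matching ID (the join).
joined = [dict(row, maxDay=max_day_for_ids[row["ID"]]) for row in rows]

# Step 3: keep only the lines where "day" equals "maxDay" (the filter).
latest = [row for row in joined if row["day"] == row["maxDay"]]
print(latest)
```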

sql/mysql filter including only the max value

SELECT * FROM t WHERE myValue IN (SELECT MAX(myValue) FROM t);
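Note that this query filters on the maximum of the whole table, not per group, and it returns every row that ties for that maximum. A quick sketch with Python's built-in sqlite3 module (the `id` column and the sample rows are invented for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, myValue INTEGER)")
# Hypothetical data: two rows share the overall maximum of 30.
conn.executemany("INSERT INTO t VALUES (?, ?)", [(1, 10), (2, 30), (3, 30)])

max_rows = conn.execute(
    "SELECT * FROM t WHERE myValue IN (SELECT MAX(myValue) FROM t) ORDER BY id"
).fetchall()
print(max_rows)
```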

See this SQLFiddle

Edit:

As per the discussion with the OP: the OP wants to use a column alias in the WHERE clause, but in MySQL you can only use column aliases in GROUP BY, ORDER BY, or HAVING clauses.

Look at this answer.
