Groupwise Maximum

You can use this query. In my tests on a larger data set it returned results in about 75% less time than the subquery-based versions.

SELECT p1.id,
       p1.security,
       p1.buy_date
  FROM positions p1
  LEFT JOIN positions p2
         ON p1.security = p2.security
        AND p1.buy_date < p2.buy_date
 WHERE p2.id IS NULL;
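
To see why the self-join works, here is a small sketch using Python's built-in sqlite3 module. The rows are made up for illustration, and SQLite stands in for MySQL (an assumption; the query itself is the one above). A row survives only when no other row for the same security has a later buy_date.

```python
import sqlite3

# Hypothetical sample data; SQLite is used here only as a stand-in engine.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE positions (id INTEGER PRIMARY KEY, security TEXT, buy_date TEXT);
    INSERT INTO positions VALUES
        (1, 'AAA', '2020-01-01'),
        (2, 'AAA', '2020-03-01'),
        (3, 'BBB', '2020-02-01'),
        (4, 'BBB', '2020-01-15');
""")

# p2 matches every row of the same security with a later buy_date.
# Rows with no such match (p2.id IS NULL) are the latest per security.
latest = conn.execute("""
    SELECT p1.id, p1.security, p1.buy_date
      FROM positions p1
      LEFT JOIN positions p2
             ON p1.security = p2.security
            AND p1.buy_date < p2.buy_date
     WHERE p2.id IS NULL
     ORDER BY p1.security
""").fetchall()

print(latest)  # [(2, 'AAA', '2020-03-01'), (3, 'BBB', '2020-02-01')]
```

If two rows of a security share the same latest buy_date, both come back, because neither one is strictly earlier than the other.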

MySQL groupwise minimum and maximum

Subqueries are not required. This query gets you the minimum and maximum price for each article and dealer:

SELECT article, dealer, max(price) as max_price, min(price) as min_price
FROM   shop s1
group by 1,2
order by 1,2

If the dealers differ for the same article and you want one minimum and maximum per article, you can use this:

SELECT article, max(price) as max_price, min(price) as min_price
FROM   shop s1
group by 1
order by 1
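
As a rough check of both queries, the sketch below runs them with Python's sqlite3 against a few invented shop rows (the data and the use of SQLite instead of MySQL are assumptions). GROUP BY 1,2 refers to the first and second columns of the select list.

```python
import sqlite3

# Hypothetical rows: (article, dealer, price).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE shop (article INTEGER, dealer TEXT, price REAL)")
conn.executemany(
    "INSERT INTO shop VALUES (?, ?, ?)",
    [(1, "A", 3.45), (1, "A", 3.10), (1, "B", 3.99), (2, "A", 10.99)],
)

# Minimum and maximum per (article, dealer) pair.
per_dealer = conn.execute("""
    SELECT article, dealer, max(price) AS max_price, min(price) AS min_price
    FROM   shop s1
    GROUP  BY 1, 2
    ORDER  BY 1, 2
""").fetchall()

# Minimum and maximum per article, across all dealers.
per_article = conn.execute("""
    SELECT article, max(price) AS max_price, min(price) AS min_price
    FROM   shop s1
    GROUP  BY 1
    ORDER  BY 1
""").fetchall()

print(per_dealer)   # [(1, 'A', 3.45, 3.1), (1, 'B', 3.99, 3.99), (2, 'A', 10.99, 10.99)]
print(per_article)  # [(1, 3.99, 3.1), (2, 10.99, 10.99)]
```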

Group-wise Maximum of a Certain Column

Standard SQL would reject your query because you cannot SELECT non-aggregate fields that are not part of the GROUP BY clause in an aggregate query.

You're using a MySQL extension to SQL, described in the MySQL documentation:

MySQL extends the use of GROUP BY so that the select list can refer to
nonaggregated columns not named in the GROUP BY clause. This means
that the preceding query is legal in MySQL. You can use this feature
to get better performance by avoiding unnecessary column sorting and
grouping. However, this is useful primarily when all values in each
nonaggregated column not named in the GROUP BY are the same for each
group. The server is free to choose any value from each group, so
unless they are the same, the values chosen are indeterminate.

Groupwise maximum in larger query

You're looking for the latest row in your Email table for each distinct application_id.

Your subquery to get that isn't quite right. Here's how you get that.

SELECT s.application_id, e.student_email_id
  FROM email e
  JOIN (
         SELECT MAX(tstamp) tstamp, application_id
           FROM email
          GROUP BY application_id
       ) s ON e.application_id = s.application_id AND e.tstamp = s.tstamp

There's another way to do this, that might be more efficient. It will work if the id column is an autoincrement column.

SELECT e.application_id, e.student_email_id
  FROM email e
  JOIN (
         SELECT MAX(id) id
           FROM email
          GROUP BY application_id
       ) s ON e.id = s.id

Either of these queries gets the latest student_email_id for each application_id. The second one uses the subquery to extract only the highest id number for each application_id, and then joins on that id to find the latest student_email_id.
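
Here is a minimal sketch of both approaches, run through Python's sqlite3 on invented email rows (the data and the SQLite engine are assumptions). In the MAX(id) variant the outer SELECT takes application_id from e, because the derived table s only exposes id.

```python
import sqlite3

# Hypothetical rows where id and tstamp increase together.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE email (
        id               INTEGER PRIMARY KEY,
        application_id   INTEGER,
        student_email_id INTEGER,
        tstamp           TEXT
    );
    INSERT INTO email VALUES
        (1, 10, 100, '2021-01-01'),
        (2, 10, 101, '2021-02-01'),
        (3, 20, 200, '2021-01-05');
""")

# Variant 1: join back on the latest tstamp per application_id.
by_tstamp = conn.execute("""
    SELECT s.application_id, e.student_email_id
      FROM email e
      JOIN (
             SELECT MAX(tstamp) tstamp, application_id
               FROM email
              GROUP BY application_id
           ) s ON e.application_id = s.application_id AND e.tstamp = s.tstamp
     ORDER BY s.application_id
""").fetchall()

# Variant 2: join back on the highest id per application_id.
by_id = conn.execute("""
    SELECT e.application_id, e.student_email_id
      FROM email e
      JOIN (
             SELECT MAX(id) id
               FROM email
              GROUP BY application_id
           ) s ON e.id = s.id
     ORDER BY e.application_id
""").fetchall()

print(by_tstamp)  # [(10, 101), (20, 200)]
print(by_id)      # [(10, 101), (20, 200)]
```

The two agree here only because the sample ids grow in the same order as the timestamps.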

Your subquery was this. It doesn't get what you hoped for.

 SELECT MAX( tstamp ) AS tstamp, id, student_email_id, application_id /*wrong*/
   FROM email
  GROUP BY id, student_email_id, application_id

You grouped this by id. That means you're going to get all the detail rows. That's not what you want. Even this

 SELECT MAX( tstamp ) AS tstamp, student_email_id, application_id  /*wrong*/
   FROM email
  GROUP BY student_email_id, application_id

will give you more than one record for each application_id value.
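
A quick way to see the extra rows, sketched with Python's sqlite3 and made-up data (an assumption, as is using SQLite): application 10 has two different student_email_id values, so grouping by both columns keeps both rows.

```python
import sqlite3

# Hypothetical rows: application 10 has two emails, application 20 has one.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE email (
        id               INTEGER PRIMARY KEY,
        application_id   INTEGER,
        student_email_id INTEGER,
        tstamp           TEXT
    );
    INSERT INTO email VALUES
        (1, 10, 100, '2021-01-01'),
        (2, 10, 101, '2021-02-01'),
        (3, 20, 200, '2021-01-05');
""")

# Grouping by student_email_id as well yields one row per distinct pair.
rows = conn.execute("""
    SELECT MAX(tstamp) AS tstamp, student_email_id, application_id
      FROM email
     GROUP BY student_email_id, application_id
     ORDER BY application_id, student_email_id
""").fetchall()

rows_for_app_10 = [r for r in rows if r[2] == 10]
print(len(rows), len(rows_for_app_10))  # 3 2
```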

So the query you need is:

SELECT application.*, email1.student_email_id AS email_student_email_id
  FROM application
  LEFT JOIN (
              SELECT e.application_id, e.student_email_id
                FROM email e
                JOIN (
                       SELECT MAX(id) id
                         FROM email
                        GROUP BY application_id
                     ) s ON e.id = s.id
           ) AS email1 ON email1.application_id = application.id
 WHERE application.status = 'returned'

When you're designing queries like this, it's smart to test from the inside out, starting with the innermost subquery.

Optimize groupwise maximum query

Assuming relatively few rows in options for many rows in records.

Typically, you would have a look-up table options that is referenced from records.option_id, ideally with a foreign key constraint. If you don't, I suggest creating one to enforce referential integrity:

CREATE TABLE options (
  option_id int  PRIMARY KEY
, option    text UNIQUE NOT NULL
);

INSERT INTO options
SELECT DISTINCT option_id, 'option' || option_id -- dummy option names
FROM   records;

Then there is no need to emulate a loose index scan any more and this becomes very simple and fast. Correlated subqueries can use a plain index on (option_id, id).

SELECT option_id, (SELECT max(id)
                   FROM   records
                   WHERE  option_id = o.option_id) AS max_id
FROM   options o
ORDER  BY 1;

This includes options with no match in table records. You get NULL for max_id and you can easily remove such rows in an outer SELECT if needed.
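
The following sketch shows the NULL row and one way to drop it in an outer SELECT. It uses Python's sqlite3 with invented rows instead of Postgres (both are assumptions), so it illustrates the result shape only, not the query plan.

```python
import sqlite3

# Hypothetical data: option 3 exists in options but has no rows in records.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE options (option_id INTEGER PRIMARY KEY, "option" TEXT UNIQUE NOT NULL);
    CREATE TABLE records (id INTEGER PRIMARY KEY, option_id INTEGER);
    INSERT INTO options VALUES (1, 'option1'), (2, 'option2'), (3, 'option3');
    INSERT INTO records VALUES (1, 1), (2, 1), (3, 2), (4, 1);
""")

correlated = """
    SELECT option_id, (SELECT max(id)
                       FROM   records
                       WHERE  option_id = o.option_id) AS max_id
    FROM   options o
"""

# One row per option; unmatched options get NULL (None in Python).
all_options = conn.execute(correlated + " ORDER BY 1").fetchall()

# Wrapping the query in an outer SELECT removes the unmatched options.
matched_only = conn.execute(
    "SELECT * FROM (" + correlated + ") sub WHERE max_id IS NOT NULL ORDER BY 1"
).fetchall()

print(all_options)   # [(1, 4), (2, 3), (3, None)]
print(matched_only)  # [(1, 4), (2, 3)]
```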

Or (same result):

SELECT option_id, (SELECT id
                   FROM   records
                   WHERE  option_id = o.option_id
                   ORDER  BY id DESC NULLS LAST
                   LIMIT  1) AS max_id
FROM   options o
ORDER  BY 1;

May be slightly faster. The subquery uses the sort order DESC NULLS LAST - same as the aggregate function max() which ignores NULL values. Sorting just DESC would have NULL first:

Why do NULL values come first when ordering DESC in a PostgreSQL query?

The perfect index for this:

CREATE INDEX ON records (option_id, id DESC NULLS LAST);

Index sort order doesn't matter much while columns are defined NOT NULL.

There can still be a sequential scan on the small table options; that's just the fastest way to fetch all rows. The ORDER BY may bring in an index (only) scan to fetch pre-sorted rows.

The big table records is only accessed via (bitmap) index scan or, if possible, index-only scan.

db<>fiddle here - showing two index-only scans for the simple case

Old sqlfiddle

Or use LATERAL joins for a similar effect in Postgres 9.3+:

Optimize GROUP BY query to retrieve latest row per user

Groupwise maximum record lookup for contracts and latest status

This is called a groupwise-maximum problem.

It looks like your locks table gets updated sometimes, and those updates change the stamp timestamp column. So your problem is to report out the latest -- most recent in time -- locks record for each contractID. Start with a subquery to determine the latest stamp for each contract.

                 SELECT MAX(stamp) stamp, contractID
                   FROM locks
                  GROUP BY contractID

Then use that subquery in your main query to choose the appropriate row of locks.

SELECT c.id ,c.partner ,l.stamp ,l.`type`
  FROM contracts c
  LEFT JOIN (
                 SELECT MAX(stamp) stamp, contractID
                   FROM locks
                  GROUP BY contractID
       ) latest ON c.contractID=latest.contractID  
  LEFT JOIN locks l   ON c.contractID = l.contractID
                     AND latest.stamp = l.stamp
 WHERE c.partner='2000000301'
 ORDER BY c.id ASC

Notice that the latest locks record is not necessarily the one with the largest id value.
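
To illustrate that last point, here is a sketch in Python's sqlite3 with invented contracts and locks rows. The table layouts are guesses based on the column names in the query, and SQLite stands in for MySQL. The lock with id 1 was updated after the lock with id 2 was created, so it carries the later stamp.

```python
import sqlite3

# Hypothetical schema and rows; column names follow the query above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE contracts (id INTEGER PRIMARY KEY, contractID TEXT, partner TEXT);
    CREATE TABLE locks (id INTEGER PRIMARY KEY, contractID TEXT, stamp TEXT, type TEXT);
    INSERT INTO contracts VALUES
        (1, 'C1', '2000000301'),
        (2, 'C2', '2000000301'),
        (3, 'C3', 'other');
    INSERT INTO locks VALUES
        (1, 'C1', '2021-05-01', 'hard'),
        (2, 'C1', '2021-01-01', 'soft');
""")

# Latest lock per contract by stamp; contracts without locks get NULLs.
latest_by_stamp = conn.execute("""
    SELECT c.id, c.partner, l.stamp, l.type
      FROM contracts c
      LEFT JOIN (
                 SELECT MAX(stamp) stamp, contractID
                   FROM locks
                  GROUP BY contractID
           ) latest ON c.contractID = latest.contractID
      LEFT JOIN locks l ON c.contractID = l.contractID
                       AND latest.stamp = l.stamp
     WHERE c.partner = ?
     ORDER BY c.id ASC
""", ("2000000301",)).fetchall()

# For comparison: the lock with the largest id for contract C1.
largest_id_lock = conn.execute("""
    SELECT id, stamp, type FROM locks
     WHERE id = (SELECT MAX(id) FROM locks WHERE contractID = 'C1')
""").fetchone()

print(latest_by_stamp)  # [(1, '2000000301', '2021-05-01', 'hard'), (2, '2000000301', None, None)]
print(largest_id_lock)  # (2, '2021-01-01', 'soft')
```

Picking by MAX(id) would have reported the older 'soft' lock for contract C1.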

This index will help the query's performance when your locks table is large, by enabling the subquery to do a loose index scan.

ALTER TABLE locks ADD INDEX contractid_stamp (contractID, stamp);

And, you don't need both a PRIMARY KEY and a UNIQUE KEY on the same column. The PRIMARY KEY serves the purpose of guaranteeing uniqueness. Putting both keys on the table slows down INSERTs for no good reason.

MySQL groupwise max as second WHERE condition

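Avoiding the inner join can improve the query. Be aware that it relies on MySQL's nonstandard GROUP BY handling: t1.id is neither aggregated nor part of the GROUP BY, so the id returned for each master_id is not guaranteed, and the query is rejected when ONLY_FULL_GROUP_BY is enabled.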

SELECT *
FROM `test`
WHERE `master_id` =0
OR `id` IN (
    SELECT t1.id 
    FROM (SELECT * 
        FROM test t2 
        WHERE t2.master_id!=0   
        ORDER BY t2.date ASC) t1
    GROUP BY t1.master_id
)
ORDER BY `date`;

How to select the max aliased value in MySQL

You are getting an error because you cannot reference a column alias in the WHERE clause. Your query references recent_meeting_date in the WHERE clause. To fix it, move that condition to a HAVING clause. For more information about WHERE vs. HAVING, see the related Stack Overflow question.

Here is the full query with the having clause:

    SELECT * , MAX(meeting_date) AS recent_meeting_date
    FROM driver
    INNER JOIN  meeting_attendee ON meeting_attendee.attendee_email = driver.driver_email
    INNER JOIN  meeting ON meeting.meeting_id = meeting_attendee.meeting_id
    GROUP BY driver_id
    HAVING recent_meeting_date < UTC_TIMESTAMP
    ORDER BY driver_id;
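
A small sketch of the difference, using Python's sqlite3 with invented rows. SQLite replaces MySQL here, and a bound cutoff value replaces UTC_TIMESTAMP (both are assumptions). HAVING filters on each driver's latest meeting, while a WHERE on meeting_date filters individual meetings before grouping.

```python
import sqlite3

# Hypothetical data: driver 2 has one meeting after the cutoff.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE driver (driver_id INTEGER PRIMARY KEY, driver_email TEXT);
    CREATE TABLE meeting (meeting_id INTEGER PRIMARY KEY, meeting_date TEXT);
    CREATE TABLE meeting_attendee (attendee_email TEXT, meeting_id INTEGER);
    INSERT INTO driver VALUES (1, 'a@example.com'), (2, 'b@example.com');
    INSERT INTO meeting VALUES (1, '2020-01-01'), (2, '2020-06-01'), (3, '2030-01-01');
    INSERT INTO meeting_attendee VALUES
        ('a@example.com', 1), ('a@example.com', 2),
        ('b@example.com', 1), ('b@example.com', 3);
""")

cutoff = "2025-01-01"  # stand-in for UTC_TIMESTAMP

joins = """
      FROM driver
     INNER JOIN meeting_attendee ON meeting_attendee.attendee_email = driver.driver_email
     INNER JOIN meeting ON meeting.meeting_id = meeting_attendee.meeting_id
"""

# HAVING: keep drivers whose most recent meeting is before the cutoff.
with_having = conn.execute(
    "SELECT driver_id, MAX(meeting_date) AS recent_meeting_date" + joins +
    " GROUP BY driver_id HAVING recent_meeting_date < ? ORDER BY driver_id",
    (cutoff,),
).fetchall()

# WHERE: drops late meetings first, so driver 2 still appears.
with_where = conn.execute(
    "SELECT driver_id, MAX(meeting_date) AS recent_meeting_date" + joins +
    " WHERE meeting_date < ? GROUP BY driver_id ORDER BY driver_id",
    (cutoff,),
).fetchall()

print(with_having)  # [(1, '2020-06-01')]
print(with_where)   # [(1, '2020-06-01'), (2, '2020-01-01')]
```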