Advertisement
Advertisement


How can I SELECT rows with MAX(Column value), DISTINCT by another column in SQL?


Question

My table is:

id  home  datetime     player   resource
---|-----|------------|--------|---------
1  | 10  | 04/03/2009 | john   | 399 
2  | 11  | 04/03/2009 | juliet | 244
5  | 12  | 04/03/2009 | borat  | 555
3  | 10  | 03/03/2009 | john   | 300
4  | 11  | 03/03/2009 | juliet | 200
6  | 12  | 03/03/2009 | borat  | 500
7  | 13  | 24/12/2008 | borat  | 600
8  | 13  | 01/01/2009 | borat  | 700

I need to select each distinct home holding the maximum value of datetime.

Result would be:

id  home  datetime     player   resource 
---|-----|------------|--------|---------
1  | 10  | 04/03/2009 | john   | 399
2  | 11  | 04/03/2009 | juliet | 244
5  | 12  | 04/03/2009 | borat  | 555
8  | 13  | 01/01/2009 | borat  | 700

I have tried:

-- 1 ..by the MySQL manual: 

SELECT DISTINCT
  home,
  id,
  datetime AS dt,
  player,
  resource
FROM topten t1
WHERE datetime = (SELECT
  MAX(t2.datetime)
FROM topten t2
GROUP BY home)
GROUP BY datetime
ORDER BY datetime DESC

Doesn't work. Result-set has 130 rows although database holds 187. Result includes some duplicates of home.

-- 2 ..join

SELECT
  s1.id,
  s1.home,
  s1.datetime,
  s1.player,
  s1.resource
FROM topten s1
JOIN (SELECT
  id,
  MAX(datetime) AS dt
FROM topten
GROUP BY id) AS s2
  ON s1.id = s2.id
ORDER BY datetime 

Nope. Gives all the records.

-- 3 ..something exotic: 

With various results.

2018/03/26
1
789
3/26/2018 9:58:42 AM

Accepted Answer

You are so close! All you need to do is select BOTH the home and its max date time, then join back to the topten table on BOTH fields:

SELECT tt.*
FROM topten tt
INNER JOIN
    (SELECT home, MAX(datetime) AS MaxDateTime
    FROM topten
    GROUP BY home) groupedtt 
ON tt.home = groupedtt.home 
AND tt.datetime = groupedtt.MaxDateTime
2018/02/14
958
2/14/2018 12:06:06 PM

The fastest MySQL solution, without inner queries and without GROUP BY:

SELECT m.*                    -- get the row that contains the max value
FROM topten m                 -- "m" from "max"
    LEFT JOIN topten b        -- "b" from "bigger"
        ON m.home = b.home    -- match "max" row with "bigger" row by `home`
        AND m.datetime < b.datetime           -- want "bigger" than "max"
WHERE b.datetime IS NULL      -- keep only if there is no bigger than max

Explanation:

Join the table with itself using the home column. The use of LEFT JOIN ensures all the rows from table m appear in the result set. Those that don't have a match in table b will have NULLs for the columns of b.

The other condition on the JOIN asks to match only the rows from b that have bigger value on the datetime column than the row from m.

Using the data posted in the question, the LEFT JOIN will produce this pairs:

+------------------------------------------+--------------------------------+
|              the row from `m`            |    the matching row from `b`   |
|------------------------------------------|--------------------------------|
| id  home  datetime     player   resource | id    home   datetime      ... |
|----|-----|------------|--------|---------|------|------|------------|-----|
| 1  | 10  | 04/03/2009 | john   | 399     | NULL | NULL | NULL       | ... | *
| 2  | 11  | 04/03/2009 | juliet | 244     | NULL | NULL | NULL       | ... | *
| 5  | 12  | 04/03/2009 | borat  | 555     | NULL | NULL | NULL       | ... | *
| 3  | 10  | 03/03/2009 | john   | 300     | 1    | 10   | 04/03/2009 | ... |
| 4  | 11  | 03/03/2009 | juliet | 200     | 2    | 11   | 04/03/2009 | ... |
| 6  | 12  | 03/03/2009 | borat  | 500     | 5    | 12   | 04/03/2009 | ... |
| 7  | 13  | 24/12/2008 | borat  | 600     | 8    | 13   | 01/01/2009 | ... |
| 8  | 13  | 01/01/2009 | borat  | 700     | NULL | NULL | NULL       | ... | *
+------------------------------------------+--------------------------------+

Finally, the WHERE clause keeps only the pairs that have NULLs in the columns of b (they are marked with * in the table above); this means, due to the second condition from the JOIN clause, the row selected from m has the biggest value in column datetime.

Read the SQL Antipatterns: Avoiding the Pitfalls of Database Programming book for other SQL tips.

2018/03/09

Here goes T-SQL version:

-- Test data
DECLARE @TestTable TABLE (id INT, home INT, date DATETIME, 
  player VARCHAR(20), resource INT)
INSERT INTO @TestTable
SELECT 1, 10, '2009-03-04', 'john', 399 UNION
SELECT 2, 11, '2009-03-04', 'juliet', 244 UNION
SELECT 5, 12, '2009-03-04', 'borat', 555 UNION
SELECT 3, 10, '2009-03-03', 'john', 300 UNION
SELECT 4, 11, '2009-03-03', 'juliet', 200 UNION
SELECT 6, 12, '2009-03-03', 'borat', 500 UNION
SELECT 7, 13, '2008-12-24', 'borat', 600 UNION
SELECT 8, 13, '2009-01-01', 'borat', 700

-- Answer
SELECT id, home, date, player, resource 
FROM (SELECT id, home, date, player, resource, 
    RANK() OVER (PARTITION BY home ORDER BY date DESC) N
    FROM @TestTable
)M WHERE N = 1

-- and if you really want only home with max date
SELECT T.id, T.home, T.date, T.player, T.resource 
    FROM @TestTable T
INNER JOIN 
(   SELECT TI.id, TI.home, TI.date, 
        RANK() OVER (PARTITION BY TI.home ORDER BY TI.date) N
    FROM @TestTable TI
    WHERE TI.date IN (SELECT MAX(TM.date) FROM @TestTable TM)
)TJ ON TJ.N = 1 AND T.id = TJ.id

EDIT
Unfortunately, there are no RANK() OVER function in MySQL.
But it can be emulated, see Emulating Analytic (AKA Ranking) Functions with MySQL.
So this is MySQL version:

SELECT id, home, date, player, resource 
FROM TestTable AS t1 
WHERE 
    (SELECT COUNT(*) 
            FROM TestTable AS t2 
            WHERE t2.home = t1.home AND t2.date > t1.date
    ) = 0
2009/03/06

This will work even if you have two or more rows for each home with equal DATETIME's:

SELECT id, home, datetime, player, resource
FROM   (
       SELECT (
              SELECT  id
              FROM    topten ti
              WHERE   ti.home = t1.home
              ORDER BY
                      ti.datetime DESC
              LIMIT 1
              ) lid
       FROM   (
              SELECT  DISTINCT home
              FROM    topten
              ) t1
       ) ro, topten t2
WHERE  t2.id = ro.lid
2009/03/06

I think this will give you the desired result:

SELECT   home, MAX(datetime)
FROM     my_table
GROUP BY home

BUT if you need other columns as well, just make a join with the original table (check Michael La Voie answer)

Best regards.

2015/11/26

Since people seem to keep running into this thread (comment date ranges from 1.5 year) isn't this much simpler:

SELECT * FROM (SELECT * FROM topten ORDER BY datetime DESC) tmp GROUP BY home

No aggregation functions needed...

Cheers.

2010/12/05

Source: https://stackoverflow.com/questions/612231
Licensed under: CC-BY-SA with attribution
Not affiliated with: Stack Overflow
Email: [email protected]