Group by Without Aggregate Function

Any reason for GROUP BY clause without aggregation function?

Is the GROUP BY statement in any way useful without an accompanying aggregate function?

Without an aggregate function, GROUP BY behaves like DISTINCT: both return one row per unique combination of the listed columns. The reason you'd want (or have) to write GROUP BY instead is to be able to add a HAVING clause.

If you need a HAVING clause, you have to define a GROUP BY. DISTINCT on its own gives HAVING no groups to filter.
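A minimal sketch of that point, using Python's built-in sqlite3 module so it can be run anywhere. The `visits` table and its rows are made up for illustration. The aggregate appears only inside HAVING, not in the SELECT list:

```python
import sqlite3

# Hypothetical sample data: one column with some repeated values.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE visits (city TEXT)")
conn.executemany(
    "INSERT INTO visits (city) VALUES (?)",
    [("Oslo",), ("Lima",), ("Oslo",), ("Kyiv",), ("Lima",), ("Oslo",)],
)

# DISTINCT returns every unique city; there is no per-group condition to apply.
all_cities = sorted(conn.execute("SELECT DISTINCT city FROM visits"))

# GROUP BY with no aggregate in the SELECT list, filtered by HAVING:
# keep only the cities that occur at least three times.
repeated = sorted(
    conn.execute("SELECT city FROM visits GROUP BY city HAVING COUNT(*) >= 3")
)

print(all_cities)
print(repeated)
```

Here `all_cities` holds all three cities, while `repeated` keeps only the one that passes the HAVING condition.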

PostgreSQL group by without aggregate function. Why does it work?

This is covered, but not especially obvious, in the docs:

When GROUP BY is present, or any aggregate functions are present, it is not valid for the SELECT list expressions to refer to ungrouped columns except within aggregate functions or when the ungrouped column is functionally dependent on the grouped columns, since there would otherwise be more than one possible value to return for an ungrouped column. A functional dependency exists if the grouped columns (or a subset thereof) are the primary key of the table containing the ungrouped column.

In this case, I'm guessing that id is the primary key of the user table, which would make name functionally dependent on id.
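To make "functionally dependent" concrete, here is a small Python sketch. Note the assumption: PostgreSQL decides this from the primary key constraint, not by looking at the data, so the data-based check below only illustrates the idea. The function name and the sample rows are hypothetical.

```python
from collections import defaultdict


def is_functionally_dependent(rows, key, column):
    """Return True if every value of `key` maps to exactly one value of `column`."""
    seen = defaultdict(set)
    for row in rows:
        seen[row[key]].add(row[column])
    return all(len(values) == 1 for values in seen.values())


# Hypothetical rows, shaped like the output of a user/post join.
rows = [
    {"id": 1, "name": "ann", "title": "first post"},
    {"id": 1, "name": "ann", "title": "second post"},
    {"id": 2, "name": "bob", "title": "hello"},
]

name_ok = is_functionally_dependent(rows, "id", "name")    # one name per id
title_ok = is_functionally_dependent(rows, "id", "title")  # id 1 has two titles
print(name_ok, title_ok)
```

Selecting name while grouping by id is unambiguous, whereas selecting title would leave more than one candidate value for id 1.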

Why does MySQL allow group by queries WITHOUT aggregate functions?

I believe that it was to handle the case where grouping by one field would imply other fields are also being grouped:

SELECT user.id, user.name, COUNT(post.owner_id) AS posts
FROM user 
  LEFT OUTER JOIN post ON post.owner_id=user.id 
GROUP BY user.id

In this case user.name will always be unique per user.id, so it is convenient not to have to list user.name in the GROUP BY clause (although, as you say, there is definite scope for problems).

GROUP BY without aggregate function in SparkSQL

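You can use a window function, row_number(), and keep only the first row of each partition.

import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.{col, row_number}

// Partition by every column so that identical rows land in the same partition
val columns = input.columns.map(col(_))

input.withColumn("rn", row_number().over(Window.partitionBy(columns: _*).orderBy(columns: _*)))
  .where("rn = 1")  // keep one row per partition
  .drop("rn")
  .show()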

Why columns in selection without aggregate function needs to be part of Group by clause in MySQL?

When GROUP BY is present, or any aggregate functions are present, the SELECT list cannot refer to ungrouped columns except inside aggregate functions (such as SUM, MAX, or MIN, which return a single value for each group). Otherwise there would be more than one possible value to return for an ungrouped column, and with ONLY_FULL_GROUP_BY enabled (the default since MySQL 5.7.5) MySQL rejects the query rather than pick an arbitrary value.

However, there are multiple workarounds to this.

Option 1. Add the other column to the GROUP BY, as you did yourself:

    SELECT
          matchid
        , mdate
        , COUNT(player)
    FROM game
      JOIN goal
         ON id = matchid
    WHERE (team1 = 'POL' OR team2 = 'POL')
    GROUP BY matchid, mdate;

Option 2. Alternatively, wrap the other column in an aggregate function, as below. Since mdate is functionally dependent on matchid, every row in a group has the same mdate, so any aggregate that picks a value (MAX, MIN) returns it.

    SELECT
          matchid
        , MAX(mdate) AS mdate
        , COUNT(player)
    FROM game
      JOIN goal
         ON id = matchid
    WHERE (team1 = 'POL' OR team2 = 'POL')
    GROUP BY matchid;

Option 3. You can calculate the aggregate in a subquery and then join the result back to the game table to get the additional columns you need, as below:

    SELECT
          t1.matchid
        , t2.mdate
        , t1.count_player
    FROM
    (SELECT
          matchid
        , COUNT(player) AS count_player
    FROM game
      JOIN goal
         ON id = matchid
    WHERE (team1 = 'POL' OR team2 = 'POL')
    GROUP BY matchid) t1
    JOIN game t2 ON t1.matchid = t2.id;

Option 4. You can also use a window function and take the distinct tuples:

    SELECT DISTINCT
          matchid
        , mdate
        , COUNT(player) OVER (PARTITION BY matchid) AS count_player
    FROM game
      JOIN goal
         ON id = matchid
    WHERE (team1 = 'POL' OR team2 = 'POL');
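As a quick sanity check that the four options agree, here is a sketch that runs them against an in-memory SQLite database from Python. The table definitions contain only the columns used in the queries, and the rows are made up for illustration; SQLite is used only because it ships with Python, and it is more permissive than MySQL about ungrouped columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE game (id INTEGER PRIMARY KEY, mdate TEXT, team1 TEXT, team2 TEXT);
    CREATE TABLE goal (matchid INTEGER, player TEXT);
    -- Made-up sample rows, for illustration only.
    INSERT INTO game VALUES
        (1, '2012-06-08', 'POL', 'GRE'),
        (2, '2012-06-08', 'RUS', 'CZE'),
        (3, '2012-06-12', 'POL', 'RUS');
    INSERT INTO goal VALUES
        (1, 'p1'), (1, 'p2'), (2, 'p3'), (3, 'p1'), (3, 'p4'), (3, 'p5');
""")

# Shared FROM/WHERE part of the four queries.
base = "FROM game JOIN goal ON id = matchid WHERE (team1 = 'POL' OR team2 = 'POL')"

queries = {
    "option 1": f"SELECT matchid, mdate, COUNT(player) {base} GROUP BY matchid, mdate",
    "option 2": f"SELECT matchid, MAX(mdate), COUNT(player) {base} GROUP BY matchid",
    "option 3": (
        "SELECT t1.matchid, t2.mdate, t1.count_player "
        f"FROM (SELECT matchid, COUNT(player) AS count_player {base} GROUP BY matchid) t1 "
        "JOIN game t2 ON t1.matchid = t2.id"
    ),
}
# Window functions need SQLite 3.25 or newer, so option 4 is added conditionally.
if sqlite3.sqlite_version_info >= (3, 25, 0):
    queries["option 4"] = (
        "SELECT DISTINCT matchid, mdate, "
        f"COUNT(player) OVER (PARTITION BY matchid) {base}"
    )

# Sort each result so the comparison does not depend on row order.
results = {name: sorted(conn.execute(sql).fetchall()) for name, sql in queries.items()}
for name, rows in results.items():
    print(name, rows)
```

Each option should print the same two rows: one per match involving 'POL', with its date and goal count.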

What is difference between distinct and group by (without aggregate function)

GROUP BY lets you use aggregate functions such as AVG, MAX, MIN, SUM, and COUNT. DISTINCT, on the other hand, just removes duplicates.

You can read this answer too: https://stackoverflow.com/a/164544/4227703
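A small sketch of the difference, again with Python's sqlite3 module and a made-up `orders` table: DISTINCT and a bare GROUP BY return the same rows, but only the GROUP BY form can be extended with an aggregate per group.

```python
import sqlite3

# Hypothetical sample data with one duplicated (customer, product) pair.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, product TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("ann", "pen"), ("ann", "pen"), ("ann", "ink"), ("bob", "pen")],
)

# Both forms collapse duplicate (customer, product) pairs.
distinct_rows = sorted(
    conn.execute("SELECT DISTINCT customer, product FROM orders")
)
grouped_rows = sorted(
    conn.execute("SELECT customer, product FROM orders GROUP BY customer, product")
)

# Only GROUP BY can also report something about each group, here its size.
counted_rows = sorted(
    conn.execute(
        "SELECT customer, product, COUNT(*) FROM orders GROUP BY customer, product"
    )
)

print(distinct_rows)
print(grouped_rows)
print(counted_rows)
```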