Postgresql - Group by Clause

PostgreSQL - GROUP BY clause

Postgres 9.1 or later, quoting the release notes of 9.1 ...

Allow non-GROUP BY columns in the query target list when the primary
key is specified in the GROUP BY clause (Peter Eisentraut)
The SQL standard allows this behavior, and because of the primary key,
the result is unambiguous.

Return a grouped list with occurrences using Rails and PostgreSQL

The queries in the question and in @Michael's answer have the logic backwards. We want to count how many tags match per article, not how many articles have a certain tag. So we need to GROUP BY w_article.id, not by a_tags.id.

list all articles with that tag, and also how many of given tags they match

To fix this:

SELECT count(t.tag) AS ct, a.*  -- any column from table a allowed ...
FROM   a_tags         t
JOIN   w_articles2tag a2t ON a2t.tag = t.id
JOIN   w_article      a   ON a.id = a2t.article
WHERE  t.tag IN ('css', 'php')
GROUP  BY a.id                  -- ... since PK is in GROUP BY
LIMIT  9;

Assuming id is the primary key of w_article.

However, this form will be faster while doing the same:

SELECT a.*, ct
FROM  (
   SELECT a2t.article AS id, count(*) AS ct
   FROM   a_tags         t
   JOIN   w_articles2tag a2t ON a2t.tag = t.id 
   GROUP  BY 1
   LIMIT  9      -- LIMIT early - cheaper
   ) sub
JOIN   w_article a USING (id);  -- attached alias to article in the sub

Closely related answer from just yesterday:

Why does the following join increase the query time significantly?

How do I port query with GROUP BY clause to PostgreSQL?

Like @Andomar already provided: Most RDBMS require to group by every column that appears unaggregated - anywhere else in the query (including the SELECT list, but also in the WHERE clause etc.)

PGError: ERROR: aggregates not allowed in WHERE clause on a AR query of an object and its has_many objects

The SQL standard also defines that expressions in the GROUP BY clause shall also cover functionally dependent expressions. Postgres implemented that the PK column covers all columns of the same table.

PostgreSQL - GROUP BY clause

So op.id covers the whole table and this should work for your current query:

GROUP BY op.id, c.name, 5, t.name, p.name

5 being a positional reference to the SELECT list, which is also allowed in Postgres. It's just notational shorthand for repeating the long expression:

CASE
   WHEN op.receiving_account_id IS NOT NULL THEN
      CASE
         WHEN op.account_id IS NULL THEN ac2.name
         ELSE ac.name || ' -> ' || ac2.name
      END
   ELSE ac.name
END

Concatenate multiple result rows of one column into one, group by another column
Select first row in each GROUP BY group?

I derive from your names that you have a n:m relationship between operation and tag, implemented with operation_tag. All other joins don't seem to multiply rows, so it would be more efficient to aggregate tags separately - like @Andomar hinted, just get the logic right.

This should work:

SELECT op.id
     , op.name
     , c.name
     , CASE  -- amountsign
          WHEN op.receiving_account_id IS NOT NULL THEN
             CASE WHEN op.account_id IS NULL THEN '+' ELSE '=' END
          ELSE '-' 
       END || ' ' || op.amount || ' zł' AS amount
     , CASE  -- account
          WHEN op.receiving_account_id IS NOT NULL THEN
             CASE
                WHEN op.account_id IS NULL THEN ac2.name
                ELSE ac.name || ' -> ' || ac2.name
             END
          ELSE ac.name
       END AS account
     , t.name AS type
     , to_char(op.date, 'DD.MM.YY') || ' ' || op.time AS date  -- see below
     , p.name AS place
     , ot.tags
FROM   operation op
LEFT   JOIN category c   ON op.category_id = c.id
LEFT   JOIN type     t   ON op.type_id = t.id
LEFT   JOIN account  ac  ON op.account_id = ac.id
LEFT   JOIN account  ac2 ON op.receiving_account_id = ac2.id
LEFT   JOIN place    p   ON op.place_id = p.id
LEFT   JOIN (
   SELECT operation_id, string_agg(t.name, ', ') AS tags
   FROM   operation_tag ot
   LEFT   JOIN tag      t  ON t.id = ot.tag_id
   GROUP  BY 1
   ) ot ON op.id = ot.operation_id
ORDER  BY op.date DESC, op.time DESC;

Asides

You can replace:

CASE --date
   WHEN op.time IS NOT NULL THEN to_char(op.date, 'DD.MM.YY') || ' ' || op.time
   ELSE to_char(op.date, 'DD.MM.YY')
END AS date

with this shorter equivalent:

concat_ws(' ', to_char(op.date, 'DD.MM.YY'), op.time) AS date

But since both columns are defined NOT NULL, you can furher simplify to:

to_char(op.date, 'DD.MM.YY') || ' ' || op.time AS date

Careful with your ORDER BY you have at least one input column also named date. If you use the unqualified name, it will refer to the output column - which is what you want (as clarified in the comment). Details:

PostgreSQL: How to return rows with respect to a found row (relative results)?

However, sorting by the text representation would not sort according to your timeline correctly. Sort by original values instead as suggested in my query above.

PostgreSQL - GROUP BY clause or be used in an aggregate function

I think you are trying to aggregate and group by on the same column. It depends on what data you want. Ether do this:

SELECT 
 cars.name, 
 cars.created_at, 
 cars.updated_at, 
 COUNT(cars.id) AS counter 
FROM cars 
LEFT JOIN users 
  ON cars.id=users.car_id 
GROUP BY cars.name, cars.created_at, cars.updated_at 
ORDER BY counter DESC

Or you want to count all maybe? Then like this:

SELECT
 cars.id,
 cars.name, 
 cars.created_at, 
 cars.updated_at, 
 COUNT(*) AS counter 
FROM cars 
LEFT JOIN users 
  ON cars.id=users.car_id 
GROUP BY cars.id, cars.name, cars.created_at, cars.updated_at 
ORDER BY counter DESC

PostgreSQL problem with the Group By Clause

You can use LAG() window function to check the previous value of Letter for each row and SUM() window function to create the groups of rows of consecutive occurrences.

Then aggregate in each group:

SELECT Letter, MAX(R_Value) Max_RValue
FROM (
  SELECT *, SUM(flag::int) OVER (ORDER BY R_Value DESC) grp
  FROM (
    SELECT *, Letter <> LAG(Letter, 1, '') OVER (ORDER BY R_Value DESC) flag
    FROM tablename
  ) t  
) t  
GROUP BY grp, Letter;

Or, simpler just select the rows where the Letter changes:

SELECT Letter, R_Value
FROM (
  SELECT *, LAG(Letter, 1, '') OVER (ORDER BY R_Value DESC) prev_Letter
  FROM tablename
) t
WHERE Letter <> prev_Letter;

See the demo.

PostgreSQL GROUP BY clause

You could use the rank window function to rank the employees per city, and then wrap that query with another query to get only the highest one per city:

SELECT ename, city, salary
FROM   (SELECT ename, city, salary,
               RANK() OVER (PARTITION BY city ORDER BY salary DESC) AS rk
        FROM   employee) t
WHERE  rk = 1

avoiding group by clause while using aggregate function in sql query

Your main problem is that th.created_date is a timestamp which you are using as as one of your GROUP BY columns. The time is different on each record so there is no grouping on that column.

I'm not certain why you need to cast dates as Character Varying, but in any case if you cast to date first (e.g. th.created_date::DATE) as both the column you select and in GROUP BY. Then it will give you the sum for each date (rather than for each microsecond).

Optimized Postgresql like and group by clause

It sounds like you need an index skip-scan. PostgreSQL currently doesn't implement those automatically but you can emulate it with a recursive CTE. People are working on adding this to the planner so it will be chosen automatically, but even if they succeed it would probably not work with your case-folding LIKE condition. I couldn't see how to integrate the case-folding LIKE condition into the recursive CTE, but if you return all distinct genders preserving case, you can then filter that small list quickly without needing to use an index.

WITH RECURSIVE t AS (
   SELECT min(gender) AS gender FROM employees
   UNION ALL
   SELECT (SELECT min(gender) FROM employees WHERE gender > t.gender)
   FROM t WHERE t.gender IS NOT NULL
   )
SELECT gender FROM t WHERE gender IS NOT NULL and lower(gender) like 'f%';

This took less than 2 ms for me, but it does require you add a plain index on gender, which you don't seem to have already.

Error: Id must appear in group by clause, Postgresql?

SELECT * means "give me all columns" from table(s) involved in the FROM clause. I presume that it is not just UserId, but bunch of other columns as well. Which ones? Can't tell, as you SELECT * FROM "Deals". Consider avoiding SELECT * anywhere but for quick & dirty testing purposes.

Therefore, either enumerate all of them in the GROUP BY clause (which you probably don't want), or SELECT only UserId along with aggregated column; for example,

select okd."UserId", min(okd.position) 
FROM (your current FROM clause)
group by okd."UserId"

[EDIT, based on Oracle (as I have it), but applies to your database too]

Have a look at the following examples:

This works OK - I'm selecting department number and sum salaries of all employees who work in those departments:

SQL> select deptno, sum(sal)
  2  from emp
  3  group by deptno
  4  order by deptno;

    DEPTNO   SUM(SAL)
---------- ----------
        10       8750
        20       6775
        30       9400

I'd like to include job as well, i.e. sum salaries per department and job. If I include a new column into the SELECT but don't have it in GROUP BY, it'll fail:

SQL> select deptno, job, sum(sal)
  2  from emp
  3  group by deptno
  4  order by deptno;
select deptno, job, sum(sal)
               *
ERROR at line 1:
ORA-00979: not a GROUP BY expression

Therefore, you have two options:

one is to revert back to the first query (i.e. remove JOB and have DEPTNO only), or
include additional column into the GROUP BY clause

SQL> select deptno, job, sum(sal)
  2  from emp
  3  group by deptno, job
  4  order by deptno, job;

    DEPTNO JOB         SUM(SAL)
---------- --------- ----------
        10 CLERK           1300
        10 MANAGER         2450
        10 PRESIDENT       5000
        20 ANALYST         3000
        20 CLERK            800
        20 MANAGER         2975
        30 CLERK            950
        30 MANAGER         2850
        30 SALESMAN        5600

[A LITTLE BIT MORE]

Yet another thing: there's a way to aggregate values without using the GROUP BY clause; in Oracle, that's what analytic functions do. I don't know whether there's something like that in the database system you use, but you might check it. Here's an example:

SQL> select deptno, ename, job, sum(sal) over (partition by deptno) sum_sal_dept
  2  from emp
  3  order by deptno, job;

    DEPTNO ENAME      JOB       SUM_SAL_DEPT
---------- ---------- --------- ------------
        10 MILLER     CLERK             8750
        10 CLARK      MANAGER           8750
        10 KING       PRESIDENT         8750
        20 FORD       ANALYST           6775
        20 SMITH      CLERK             6775
        20 JONES      MANAGER           6775
        30 JAMES      CLERK             9400
        30 BLAKE      MANAGER           9400
        30 TURNER     SALESMAN          9400
        30 WARD       SALESMAN          9400
        30 ALLEN      SALESMAN          9400
        30 MARTIN     SALESMAN          9400

See? Without the GROUP BY clause, I've calculated sum of salaries per departments.