SQL Select Elements Where Sum of Field Is Less Than N

SELECT m.id, sum(m1.verbosity) AS total
FROM messages m
JOIN messages m1 ON m1.id <= m.id
WHERE m.verbosity < 70 -- optional, to avoid pointless evaluation
GROUP BY m.id
HAVING SUM(m1.verbosity) < 70
ORDER BY total DESC
LIMIT 1;

This assumes a unique, ascending id like you have in your example.
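
To try the self-join on a small data set, here is a minimal sketch using Python's built-in sqlite3 module. The four sample rows and the `:n` parameter are assumptions for illustration; they are not from the original question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE messages (id INTEGER PRIMARY KEY, verbosity INTEGER)")
# Hypothetical rows; the running totals by id are 20, 50, 65, 105.
conn.executemany("INSERT INTO messages VALUES (?, ?)",
                 [(1, 20), (2, 30), (3, 15), (4, 40)])

SELF_JOIN = """
SELECT m.id, sum(m1.verbosity) AS total
FROM   messages m
JOIN   messages m1 ON m1.id <= m.id
WHERE  m.verbosity < :n
GROUP  BY m.id
HAVING sum(m1.verbosity) < :n
ORDER  BY total DESC
LIMIT  1
"""

# The last row whose running total is still below the limit.
last_row = conn.execute(SELF_JOIN, {"n": 70}).fetchone()
print(last_row)  # (3, 65)
```

Every row joins to all rows with a smaller or equal id, so the cost grows quadratically with the table size.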


In modern Postgres, or generally with modern standard SQL (SQLite gained window functions only in version 3.25 and has no LATERAL):

Simple CTE

WITH cte AS (
SELECT *, sum(verbosity) OVER (ORDER BY id) AS total
FROM messages
)
SELECT *
FROM cte
WHERE total < 70
ORDER BY id;
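
SQLite 3.25 or later also accepts this window function, so the simple CTE can be checked with Python's sqlite3 module. This is only a sketch: the sample rows are made up, and the bundled SQLite library is assumed to be recent enough.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE messages (id INTEGER PRIMARY KEY, verbosity INTEGER)")
# Hypothetical rows; the running totals by id are 20, 50, 65, 105.
conn.executemany("INSERT INTO messages VALUES (?, ?)",
                 [(1, 20), (2, 30), (3, 15), (4, 40)])

WINDOW_QUERY = """
WITH cte AS (
   SELECT id, verbosity, sum(verbosity) OVER (ORDER BY id) AS total
   FROM   messages
)
SELECT id, verbosity, total
FROM   cte
WHERE  total < ?
ORDER  BY id
"""

# Requires window-function support (SQLite >= 3.25).
rows = conn.execute(WINDOW_QUERY, (70,)).fetchall()
print(rows)  # [(1, 20, 20), (2, 30, 50), (3, 15, 65)]
```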

Recursive CTE

Should be faster for big tables where you only retrieve a small set.

WITH RECURSIVE cte AS (
   (  -- parentheses required
   SELECT id, verbosity, verbosity AS total
   FROM   messages
   ORDER  BY id
   LIMIT  1
   )

   UNION ALL
   SELECT c1.id, c1.verbosity, c.total + c1.verbosity
   FROM   cte c
   JOIN   LATERAL (
      SELECT *
      FROM   messages
      WHERE  id > c.id
      ORDER  BY id
      LIMIT  1
      ) c1 ON c1.verbosity < 70 - c.total
   WHERE  c.total < 70
   )
SELECT *
FROM   cte
ORDER  BY id;
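
The reason this can be faster is that it walks the table one row at a time in id order and stops as soon as the limit is reached. The same idea can be sketched on the client side in Python. This is not the recursive CTE itself, the sample rows and the helper name `rows_below` are assumptions, and unlike the CTE's anchor it also applies the limit to the very first row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE messages (id INTEGER PRIMARY KEY, verbosity INTEGER)")
# Hypothetical rows; the running totals by id are 20, 50, 65, 105.
conn.executemany("INSERT INTO messages VALUES (?, ?)",
                 [(1, 20), (2, 30), (3, 15), (4, 40)])


def rows_below(conn, limit):
    """Collect rows in id order while the running total stays below `limit`."""
    picked, total = [], 0
    cursor = conn.execute("SELECT id, verbosity FROM messages ORDER BY id")
    for row_id, verbosity in cursor:
        if total + verbosity >= limit:
            break  # stop iterating; the remaining rows are not needed
        total += verbosity
        picked.append((row_id, verbosity, total))
    return picked


print(rows_below(conn, 70))  # [(1, 20, 20), (2, 30, 50), (3, 15, 65)]
```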

All standard SQL, except for LIMIT.

Strictly speaking, there is no such thing as "database-independent". There are various SQL standards, but no RDBMS complies completely. LIMIT works for PostgreSQL and SQLite (and some others). Use TOP 1 for SQL Server, ROWNUM for Oracle. Wikipedia has a comprehensive list.

The SQL:2008 standard would be:

...
FETCH FIRST 1 ROWS ONLY

... which PostgreSQL supports, as do Oracle (12c and later) and Db2, but not MySQL or SQLite.

The pure alternative that works with more systems would be to wrap it in a subquery and

SELECT max(total) FROM <subquery>

But that is slow and unwieldy.
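
For illustration, the subquery variant could look like the sketch below, again run through Python's sqlite3 module with made-up sample rows. Note that it returns only the total, not the id of the row where it is reached.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE messages (id INTEGER PRIMARY KEY, verbosity INTEGER)")
# Hypothetical rows; the running totals by id are 20, 50, 65, 105.
conn.executemany("INSERT INTO messages VALUES (?, ?)",
                 [(1, 20), (2, 30), (3, 15), (4, 40)])

# No LIMIT / TOP / FETCH FIRST: take the largest qualifying running total.
PORTABLE = """
SELECT max(total)
FROM  (
   SELECT sum(m1.verbosity) AS total
   FROM   messages m
   JOIN   messages m1 ON m1.id <= m.id
   GROUP  BY m.id
   HAVING sum(m1.verbosity) < 70
) sub
"""

best_total = conn.execute(PORTABLE).fetchone()[0]
print(best_total)  # 65
```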


Select where cumulative sum is less than a number (in order of priority)

Here is a way to do it in pure SQL. I won't swear there isn't a better way.

Basically, it uses a recursive common table expression (i.e., WITH costed...) to compute every possible combination of elements totaling at most 20,000,000.

Then it gets the first full path from that result.

Then, it gets all the rows in that path.

NOTE: the logic assumes that no id is longer than 5 digits. That's the LPAD(id,5,'0') stuff.

WITH costed (id, cost, priority, running_cost, path) as
( SELECT id, cost, priority, cost running_cost, lpad(id,5,'0') path
  FROM   a_test_table
  WHERE  cost <= 20000000
  UNION ALL
  SELECT a.id, a.cost, a.priority, a.cost + costed.running_Cost, costed.path || '|' || lpad(a.id,5,'0')
  FROM   costed, a_test_table a
  WHERE  a.priority < costed.priority
  AND    a.cost + costed.running_cost <= 20000000),
best_path as (
  SELECT *
  FROM   costed c
  where  not exists ( SELECT 'longer path' FROM costed c2 WHERE c2.path like c.path || '|%' )
  order by path
  fetch first 1 row only )
SELECT att.*
FROM   best_path cross join a_test_table att
WHERE  best_path.path like '%' || lpad(att.id,5,'0') || '%'
order by att.priority desc;

+----+----------+----------+
| ID | COST     | PRIORITY |
+----+----------+----------+
|  1 |  1000000 |       10 |
|  2 | 10000000 |        9 |
|  3 |  5000000 |        8 |
|  7 |  2000000 |        4 |
+----+----------+----------+

UPDATE - Shorter version

This version uses MATCH_RECOGNIZE to find all the rows in the best group following the recursive CTE:

WITH costed (id, cost, priority, running_cost, path) as
( SELECT id, cost, priority, cost running_cost, lpad(id,5,'0') path
  FROM   a_test_table
  WHERE  cost <= 20000000
  UNION ALL
  SELECT a.id, a.cost, a.priority, a.cost + costed.running_Cost, costed.path || '|' || lpad(a.id,5,'0')
  FROM   costed, a_test_table a
  WHERE  a.priority < costed.priority
  AND    a.cost + costed.running_cost <= 20000000)
search depth first by priority desc set ord
SELECT id, cost, priority
FROM   costed c
MATCH_RECOGNIZE (
  ORDER BY path
  MEASURES
    MATCH_NUMBER() AS mno
  ALL ROWS PER MATCH
  PATTERN (STRT ADDON*)
  DEFINE
    ADDON AS ADDON.PATH = PREV(ADDON.PATH) || '|' || LPAD(ADDON.ID,5,'0')
)
WHERE mno = 1
ORDER BY priority DESC;

UPDATE - Even shorter version, using a clever idea from the SQL Server link the OP posted

Edit: removed use of ROWNUM=1 in anchor part of recursive CTE, since it depended on the arbitrary order in which rows would be returned. I'm surprised no one dinged me on that.

WITH costed (id, cost, priority, running_cost) as
( SELECT id, cost, priority, cost running_cost
  FROM   ( SELECT * FROM a_test_table
           WHERE cost <= 20000000
           ORDER BY priority desc
           FETCH FIRST 1 ROW ONLY )
  UNION ALL
  SELECT a.id, a.cost, a.priority, a.cost + costed.running_Cost
  FROM   costed CROSS APPLY ( SELECT b.*
                              FROM   a_test_table b
                              WHERE  b.priority < costed.priority
                              AND    b.cost + costed.running_cost <= 20000000
                              ORDER BY b.priority desc -- take the next-highest priority that still fits
                              FETCH FIRST 1 ROW ONLY
                            ) a
)
CYCLE id SET is_cycle TO 'Y' DEFAULT 'N'
select id, cost, priority from costed
order by priority desc
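
As I read it, this last version boils down to a greedy walk: go through the rows from highest to lowest priority and keep each one that still fits in the budget. A minimal Python sketch of that idea follows. Rows 1, 2, 3 and 7 are taken from the result table above; rows 4, 5, 6 and 8, the helper name `pick_by_priority`, and the assumption that priorities are distinct are mine.

```python
BUDGET = 20_000_000

# (id, cost, priority). Rows 1, 2, 3 and 7 match the result table;
# the other rows are hypothetical candidates that do not fit.
ROWS = [
    (1, 1_000_000, 10),
    (2, 10_000_000, 9),
    (3, 5_000_000, 8),
    (4, 19_000_000, 7),
    (5, 10_000_000, 6),
    (6, 5_000_000, 5),
    (7, 2_000_000, 4),
    (8, 5_000_000, 3),
]


def pick_by_priority(rows, budget):
    """Keep rows in descending priority as long as the running cost fits."""
    chosen, running_cost = [], 0
    for row in sorted(rows, key=lambda r: r[2], reverse=True):
        cost = row[1]
        if running_cost + cost <= budget:
            chosen.append(row)
            running_cost += cost
    return chosen


chosen = pick_by_priority(ROWS, BUDGET)
print([row[0] for row in chosen])  # [1, 2, 3, 7]
```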

SQL select elements where sum per element is greater than N

Use a cumulative SUM, and then filter on that value:

WITH CTE AS(
    SELECT id,
           employee,
           [date],
           hours_worked,
           SUM(hours_worked) OVER (PARTITION BY employee ORDER BY [date]
                                   ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS total_hours_worked
    FROM dbo.YourTable)
SELECT id,
       employee,
       [date],
       hours_worked
FROM CTE
WHERE total_hours_worked > 24;
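
To make clear what the window frame computes, here is a plain-Python sketch of the same logic: a running total per employee in date order, keeping only the rows where that total has passed the threshold. The sample rows and the function name are hypothetical, the id column is left out for brevity, and dates are assumed to be unique per employee.

```python
from collections import defaultdict


def rows_past_threshold(rows, threshold):
    """Return rows whose per-employee running total exceeds `threshold`."""
    running = defaultdict(int)  # employee -> hours accumulated so far
    result = []
    # Sort by employee, then date: PARTITION BY employee ORDER BY date.
    for employee, date, hours in sorted(rows, key=lambda r: (r[0], r[1])):
        running[employee] += hours
        if running[employee] > threshold:
            result.append((employee, date, hours))
    return result


# Hypothetical sample data: (employee, date, hours_worked).
sample = [
    ("ann", "2024-01-01", 10),
    ("ann", "2024-01-02", 10),
    ("ann", "2024-01-03", 8),
    ("bob", "2024-01-01", 12),
    ("bob", "2024-01-02", 12),
    ("bob", "2024-01-03", 1),
]

over_24 = rows_past_threshold(sample, 24)
print(over_24)  # [('ann', '2024-01-03', 8), ('bob', '2024-01-03', 1)]
```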

Only return rows if sum is greater than a value

Whenever you need to do a "WHERE" clause on an aggregate (which SUM is), you need to use the HAVING clause.

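SELECT PPOLNO, SUM(PPRMPD) AS SUM
FROM PFNTLPYMTH
-- WHERE filters individual rows before grouping
WHERE ((PYEAR = 2012 AND PMONTH >= 3 AND PDAY >= 27)
    OR (PYEAR = 2013 AND PMONTH <= 3 AND PDAY <= 27))
GROUP BY PPOLNO
-- HAVING filters the groups on the aggregate
HAVING SUM(PPRMPD) >= 5000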
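
The difference between the two clauses is easy to see on a toy table. The sketch below uses Python's sqlite3 module with made-up payment rows (the date columns are left out): the HAVING query returns the qualifying policies, while putting the aggregate in WHERE is rejected by the database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE PFNTLPYMTH (PPOLNO TEXT, PPRMPD INTEGER)")
# Hypothetical payments: P1 totals 5500, P2 totals 4000, P3 totals 5000.
conn.executemany("INSERT INTO PFNTLPYMTH VALUES (?, ?)",
                 [("P1", 3000), ("P1", 2500), ("P2", 4000), ("P3", 5000)])

# Aggregate condition in HAVING: evaluated once per group.
having_rows = conn.execute("""
    SELECT PPOLNO, SUM(PPRMPD) AS total_paid
    FROM PFNTLPYMTH
    GROUP BY PPOLNO
    HAVING SUM(PPRMPD) >= 5000
    ORDER BY PPOLNO
""").fetchall()
print(having_rows)  # [('P1', 5500), ('P3', 5000)]

# Aggregate condition in WHERE: SQLite refuses to prepare the statement.
try:
    conn.execute("""
        SELECT PPOLNO FROM PFNTLPYMTH
        WHERE SUM(PPRMPD) >= 5000
        GROUP BY PPOLNO
    """)
    where_error = None
except sqlite3.OperationalError as exc:
    where_error = str(exc)
print(where_error)
```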

Select minimal count of rows with total sum greater than or equal to a given threshold

select id from
  (select id, if(not(@sum > 0.9), 1, 0) mark, (@sum:=@sum+value) as sum
   from trade cross join (select @sum:=0) s
   where price=2 order by value asc) t
where mark = 1

The inner query computes the cumulative sum and an additional field, mark, which equals 1 while the sum has not yet exceeded 0.9 and turns to 0 once it has. Because mark is evaluated before the current row's value is added, it lags one step behind, so the first row that pushes the sum above the limit is still included.

The result of the inner select:

id   mark   sum
4    1      0.30000001192092896
2    1      0.800000011920929
3    1      1.699999988079071

In the outer query you then just select the rows with mark equal to 1, which yields ids 4, 2, 3.
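
The "one step later" behaviour may be easier to follow in procedural form. Below is a small Python sketch of the same logic. The values 0.3, 0.5 and 0.9 are inferred from the sums shown above, while the extra row with id 1 and the function name are assumptions added to show where the selection stops.

```python
def rows_until_threshold(rows, threshold):
    """Pick ids in ascending value order, including the row that crosses."""
    picked, running = [], 0.0
    for row_id, value in sorted(rows, key=lambda r: r[1]):
        # Same test as `mark`: look at the sum BEFORE adding this row.
        if running > threshold:
            break
        picked.append(row_id)
        running += value
    return picked


# (id, value); id 1 is a hypothetical extra row that must not be selected.
trade = [(1, 1.0), (2, 0.5), (3, 0.9), (4, 0.3)]

picked_ids = rows_until_threshold(trade, 0.9)
print(picked_ids)  # [4, 2, 3]
```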

How to select data where sum is greater than x

Try

select uniqID, sum(qty) from salesdata group by uniqID having sum(qty) > 1

"where" cannot be used on aggregate functions - you can only use where on uniqId, in this case.

Limiting the rows to where the sum of a column equals a certain value in MySQL

Here's a way that should work in MySQL:

SELECT
  O.Id,
  O.Type,
  O.MyAmountCol,
  (SELECT sum(MyAmountCol)
   FROM Table1
   WHERE Id <= O.Id) AS RunningTotal
FROM Table1 O
HAVING RunningTotal <= 7

It involves calculating a running total and selecting records while the running total is less than or equal to the given number, in this case 7.

SQL How to sum values of the same column until a threshold/condition is met?

You can use running sums and select the row that matches your criterion (<= 100):

select top 1 * from
(
    select CustomerID,
           (
               SELECT SUM(b.payment)
               FROM #temp b
               WHERE a.customerid = b.customerID and b.[order] <= a.[order]
           ) as FirstFullPayment
    from #temp a
    --where customerid=yourCustomerId
) runningsums
where runningsums.FirstFullPayment <= 100
order by runningsums.FirstFullPayment desc

MySQL query: Get sum of N rows where N is defined in another column value

You can simulate ROW_NUMBER and PARTITION BY in MySQL by dynamically assigning variables:

SELECT rate, SUM(total)
FROM
(
    SELECT total, rate, category_count,
           -- Row num counter, resets at each different rate
           @row_num := IF(@prev_rate=rate, @row_num+1, 1) AS RowNum,
           @prev_rate := rate -- Track to see if we are in the same rate
    FROM tableName,
         (SELECT @row_num := 1) x, -- Set initial value for @row_num and @prev_rate
         (SELECT @prev_rate := '') y
    -- Important, must keep rates together, then order by your requirement
    ORDER BY rate ASC, total DESC
) ranked
WHERE ranked.RowNum <= ranked.category_count -- Your requirement of TOP N
GROUP BY rate;

The above returns

Rate   Total
3      240
5      480

If you select only SUM(total) and drop the GROUP BY, you get 720, as you want.
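
On engines that have real window functions (MySQL 8.0 and later, or SQLite 3.25 and later), the variable trick can be replaced by ROW_NUMBER() itself. The sketch below runs that variant through Python's sqlite3 module. The sample rows are invented so that the per-rate sums come out as 240 and 480, and a sufficiently recent SQLite library is assumed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tableName (rate INTEGER, category_count INTEGER, total INTEGER)")
# Hypothetical rows: rate 3 keeps its top 2 totals, rate 5 keeps its top 3.
conn.executemany("INSERT INTO tableName VALUES (?, ?, ?)", [
    (3, 2, 140), (3, 2, 100), (3, 2, 50),
    (5, 3, 200), (5, 3, 160), (5, 3, 120), (5, 3, 90),
])

TOP_N_PER_RATE = """
SELECT rate, SUM(total)
FROM (
    SELECT rate, total, category_count,
           ROW_NUMBER() OVER (PARTITION BY rate ORDER BY total DESC) AS row_num
    FROM tableName
) ranked
WHERE row_num <= category_count  -- top N, with N taken from the row itself
GROUP BY rate
ORDER BY rate
"""

per_rate = conn.execute(TOP_N_PER_RATE).fetchall()
print(per_rate)  # [(3, 240), (5, 480)]

grand_total = sum(total for _, total in per_rate)
print(grand_total)  # 720
```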

Edit

It seems rate is defined as follows (I'd assumed an integer from your sample data):

`rate` varchar(X) COLLATE utf8_unicode_ci

Change the one line:

(SELECT @prev_rate := '' COLLATE utf8_unicode_ci) y

This sets the temporary tracking variable to the same collation as your column.


