Limit on the WHERE col IN (...) condition
Depending on the database engine you are using, there can be limits on the length of a statement.
SQL Server has a very large limit:
http://msdn.microsoft.com/en-us/library/ms143432.aspx
Oracle, on the other hand, has a limit that is very easy to reach: a single IN list accepts at most 1,000 expressions (error ORA-01795).
So, for large IN clauses, it's better to create a temp table, insert the values, and do a JOIN. That is often faster as well.
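The temp-table approach can be sketched as follows. This is a minimal illustration that uses SQLite through Python's sqlite3 module rather than SQL Server or Oracle; the `orders` and `wanted` tables and their columns are hypothetical names chosen for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table standing in for the real data.
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER)")
conn.executemany(
    "INSERT INTO orders (id, customer_id) VALUES (?, ?)",
    [(i, i % 50) for i in range(1, 1001)],
)

# The values that would otherwise go into a very long IN (...) list.
wanted_ids = list(range(10, 1000, 10))

# Load them into a temporary table instead of inlining them in the statement.
conn.execute("CREATE TEMP TABLE wanted (id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO wanted (id) VALUES (?)", [(i,) for i in wanted_ids])

# A JOIN against the temp table replaces WHERE id IN (...).
rows = conn.execute(
    """
    SELECT o.id, o.customer_id
    FROM orders o
    JOIN wanted w ON w.id = o.id
    ORDER BY o.id
    """
).fetchall()
```

The statement text stays the same size no matter how many values are loaded, so statement-length and list-size limits no longer apply.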
IN clause limitation in Sql Server
Yes, there is a limit, but Microsoft only specifies that it lies "in the thousands":
Explicitly including an extremely large number of values (many thousands of values separated by commas) within the parentheses, in an IN clause can consume resources and return errors 8623 or 8632. To work around this problem, store the items in the IN list in a table, and use a SELECT subquery within an IN clause.
Looking at those errors in detail, we see that this limit is not specific to IN but applies to query complexity in general:
Error 8623:
The query processor ran out of internal resources and could not produce a query plan. This is a rare event and only expected for extremely complex queries or queries that reference a very large number of tables or partitions. Please simplify the query. If you believe you have received this message in error, contact Customer Support Services for more information.
Error 8632:
Internal error: An expression services limit has been reached. Please look for potentially complex expressions in your query, and try to simplify them.
LIMIT SQL within WHERE clause?
No, it's not possible to add a LIMIT clause within a WHERE clause.
It is possible to achieve the result set you want, but the SQL to do that isn't pretty. It's going to require either a JOIN, a correlated subquery, or an inline view.
If there's an "order" to the rows in _mct_dot, you could use a correlated subquery to count the rows "before" the current row, and keep only the rows that have fewer than five rows before them.
SELECT d.*
  FROM _mct_dot d
  JOIN ( SELECT n.number
              , q.qty
           FROM (SELECT 4 AS `number` UNION ALL SELECT 7 UNION ALL SELECT 13) n
          CROSS
           JOIN (SELECT 3 AS `qty` UNION ALL SELECT 5 UNION ALL SELECT 7) q
       ) p
    ON p.number = d.number
   AND p.qty = d.qty
   -- COUNT(*) yields 0 for the first row of a group; SUM(1) would yield NULL and drop that row
   AND 5 > ( SELECT COUNT(*)
               FROM _mct_dot c
              WHERE c.number = d.number
                AND c.qty = d.qty
                AND c.a_ID < d.a_ID
           )
 ORDER BY ...
The correlated subquery could wind up being executed a LOT of times, so for best performance, you are going to want an index with leading columns of number and qty, and including the a_ID column.
Either:
... ON `_mct_dot` (`number`, `qty`, `a_ID`)
or
... ON `_mct_dot` (`qty`, `number`, `a_ID`)
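Another option is to use MySQL user variables to emulate a row_number() analytic function, something like the query below. (MySQL 8.0 and later provide ROW_NUMBER() as a window function, which makes this emulation unnecessary there.)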
SELECT t.*
  FROM ( SELECT d.a_ID
              , IF(d.number = @prev_number AND d.qty = @prev_qty
                  , @rn := @rn + 1
                  , @rn := 1
                ) AS rn
              , @prev_number := d.number
              , @prev_qty := d.qty
           FROM (SELECT @prev_number := NULL, @prev_qty := NULL, @rn := 0 ) i
          CROSS
           JOIN ( SELECT n.number
                       , q.qty
                    FROM (SELECT 4 AS `number` UNION ALL SELECT 7 UNION ALL SELECT 13) n
                   CROSS
                    JOIN (SELECT 3 AS `qty` UNION ALL SELECT 5 UNION ALL SELECT 7) q
                ) p
           JOIN _mct_dot d
             ON d.number = p.number
            AND d.qty = p.qty
          ORDER BY d.number, d.qty
       ) s
  JOIN _mct_dot t
    ON t.a_ID = s.a_ID
 WHERE s.rn <= 5
 ORDER BY t.number ASC, t.qty ASC
(These queries are desk-checked only; I haven't set up a SQL Fiddle demo.)
FOLLOWUP
For the first query, I've just used an inline view (aliased as "p") that generates the set of all pairs of number and qty values that are being requested.
And we can use a JOIN operation to locate all the rows that match each pair from _mct_dot table.
The tricky part is the correlated subquery. (There are a couple of approaches we could use.) The approach in the query above is to get a "count" of the rows with a matching "number" and "qty", but with an id value less than the id value of the current row, basically finding out how many rows are "before" the current row. And we're comparing that to a literal 5, because we want to return only the first 5 rows in each group.
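To see the "rows before the current row" count in action, here is a small sketch. It runs against SQLite through Python's sqlite3 module rather than MySQL, uses a hypothetical `mct_dot` table with made-up rows, and keeps two rows per group instead of five so the output stays short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mct_dot (a_ID INTEGER PRIMARY KEY, number INTEGER, qty INTEGER)"
)
# Made-up sample rows: group (4, 3) has four rows, (7, 5) has two, (13, 7) has one.
conn.executemany(
    "INSERT INTO mct_dot VALUES (?, ?, ?)",
    [(1, 4, 3), (2, 4, 3), (3, 4, 3), (4, 7, 5), (5, 7, 5), (6, 4, 3), (7, 13, 7)],
)

LIMIT_PER_GROUP = 2  # the original query uses 5

# Keep a row only when fewer than LIMIT_PER_GROUP rows of the same group
# have a smaller a_ID, i.e. when it is among the first rows of its group.
top_rows = conn.execute(
    """
    SELECT d.a_ID, d.number, d.qty
    FROM mct_dot d
    WHERE ? > (SELECT COUNT(*)
               FROM mct_dot c
               WHERE c.number = d.number
                 AND c.qty = d.qty
                 AND c.a_ID < d.a_ID)
    ORDER BY d.number, d.qty, d.a_ID
    """,
    (LIMIT_PER_GROUP,),
).fetchall()
```

Rows 3 and 6 are dropped because two rows of the (4, 3) group already precede them.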
For the second query, the inline view aliased as i is initializing some MySQL user variables. We don't really care what's returned by the query, except that it returns exactly one row (because we're referencing it in a JOIN operation)... what we're really interested in is getting the variables initialized at the start of the execution. And that happens because MySQL materializes the inline view (derived table) before the outer query that references the view is executed.
The inline view aliased as p gets us the pairs of number,qty that we want to retrieve, and we use a JOIN operation against _mct_dot to get the matching rows.
The "trick" in the inline view aliased as s is the use of the MySQL user variables. We're doing a check of the current values against the values from the previous row... if the number and qty match, then we're in the same "group", so we can increment the row number counter by 1. If either of the values changes, then it's a new group, so we reset the row number counter to 1, since the current row is the "first" row in the new group.
We can run the query for the inline view s on its own, and see that we're getting row numbers (rn col) 1, 2, 3, etc. for each group.
Then the outermost query just filters out all the rows that have an rn row number greater than five. Actually, from s, we're returning just the unique identifier for the row; that outermost query is also doing a JOIN operation to retrieve the entire row, based on the unique id.
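The counter logic behind the user variables can be traced in plain Python. This is only an illustration of the bookkeeping, not how MySQL executes the query; the function name and the sample rows are made up, and the cutoff is two rows per group instead of five.

```python
def number_rows_per_group(rows):
    """Assign 1, 2, 3, ... within each (number, qty) group.

    `rows` must already be sorted by (number, qty), mirroring the
    ORDER BY inside the inline view s.
    """
    prev_key = None  # plays the role of @prev_number and @prev_qty
    rn = 0           # plays the role of @rn
    numbered = []
    for a_id, number, qty in rows:
        key = (number, qty)
        # Same group as the previous row: increment. New group: restart at 1.
        rn = rn + 1 if key == prev_key else 1
        prev_key = key
        numbered.append((a_id, rn))
    return numbered


# Made-up (a_ID, number, qty) rows, sorted the way the inline view sorts them.
sample = sorted(
    [(1, 4, 3), (2, 4, 3), (3, 7, 5), (4, 4, 3), (5, 7, 5)],
    key=lambda r: (r[1], r[2]),
)
numbered = number_rows_per_group(sample)

# The outermost query's filter: keep ids whose row number is within the cutoff.
kept_ids = sorted(a_id for a_id, rn in numbered if rn <= 2)
```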
As I mentioned at the top of my answer, the SQL to do this is not pretty. (It does take a bit of work to unwind what those queries are doing.)
Query to select limit in specific condition
A simple UNION ALL should be enough. However, to make sure that you get exactly 1000 rows (in case there are more than 1000 rows in total but fewer than 100 of them are @gmail addresses), you can do this:
WITH u AS (
  SELECT email FROM my_table WHERE email LIKE '%@gmail.%' LIMIT 100
)
SELECT * FROM u
UNION ALL
(SELECT email
   FROM my_table
  WHERE email NOT LIKE '%@gmail.%'
  LIMIT 1000 - (SELECT count(*) FROM u));
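Not every engine accepts an expression or subquery in LIMIT. Where it does not, the same "fill up the remainder" idea can be done in two steps from application code. The sketch below assumes SQLite through Python's sqlite3 module, made-up addresses, and a total of 10 rows with a cap of 3 @gmail rows instead of 1000 and 100.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (email TEXT)")
# Made-up data: only 2 gmail addresses, fewer than the cap of 3.
emails = [f"user{i}@gmail.com" for i in range(2)]
emails += [f"user{i}@example.org" for i in range(20)]
conn.executemany("INSERT INTO my_table VALUES (?)", [(e,) for e in emails])

TOTAL, GMAIL_CAP = 10, 3  # the original uses 1000 and 100

# Step 1: take at most GMAIL_CAP gmail rows.
gmail = conn.execute(
    "SELECT email FROM my_table WHERE email LIKE '%@gmail.%' LIMIT ?",
    (GMAIL_CAP,),
).fetchall()

# Step 2: fill the remainder with non-gmail rows.
others = conn.execute(
    "SELECT email FROM my_table WHERE email NOT LIKE '%@gmail.%' LIMIT ?",
    (TOTAL - len(gmail),),
).fetchall()

result = gmail + others
```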
MySQL IN condition limit
No, there isn't. Check the manual's entry on the IN function:
The number of values in the IN list is only limited by the max_allowed_packet value.
Is there a limit on the number of WHERE conditions in a SELECT statement?
Consider using an IN clause for a query like that - it's more compact and signals your intent better.
SELECT * FROM table WHERE column NOT IN('asd', 'bsd', 'csd', ...);
Another alternative would be to create a table to do a left join against to filter out the rows you don't want.
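The left-join alternative might look like the sketch below. It uses SQLite through Python's sqlite3 module, and the `items` and `excluded` tables are hypothetical names for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (name TEXT)")
conn.execute("CREATE TABLE excluded (name TEXT PRIMARY KEY)")
conn.executemany(
    "INSERT INTO items VALUES (?)",
    [("asd",), ("bsd",), ("csd",), ("dsd",), ("esd",)],
)
# The values that would otherwise appear in NOT IN (...).
conn.executemany(
    "INSERT INTO excluded VALUES (?)", [("asd",), ("bsd",), ("csd",)]
)

# Rows without a match in `excluded` come back with e.name IS NULL,
# so filtering on that keeps exactly the rows we did not exclude.
kept = conn.execute(
    """
    SELECT i.name
    FROM items i
    LEFT JOIN excluded e ON e.name = i.name
    WHERE e.name IS NULL
    ORDER BY i.name
    """
).fetchall()
```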
How to limit to just one result per condition when looking through multiple OR/IN conditions in the WHERE clause (Postgresql)
Normally, a simple GROUP BY would suffice for this type of problem. However, as you have specified that you want to include ALL of the columns in the result, we can use the ROW_NUMBER() window function to provide a value to filter on.
As a general rule, it is important to specify the column to sort on (ORDER BY) for all windowing or paged queries to make the result repeatable.
As no schema has been supplied, I have used Name as the field to sort on for the window. Please update that (or the question) with any other field you would like; the PK is a good candidate if you have nothing else to go on.
SELECT *
FROM (
    SELECT *
         , ROW_NUMBER() OVER (PARTITION BY Country ORDER BY Name) AS _rn
    FROM Customers
    WHERE Country IN ('Germany', 'France', 'UK')
) AS ranked  -- a subquery in FROM needs an alias in PostgreSQL versions before 16
WHERE _rn = 1
The PARTITION BY forces the ROW_NUMBER to be counted across all records with the same Country value, starting at 1, so in this case we only select the rows that get a row number (aliased as _rn) of 1.
The WHERE clause on Country could have been in the outer query if you really wanted, but ROW_NUMBER() can only be specified in the SELECT or ORDER BY clauses of the query, so to use it as a filter criterion we are forced to wrap the results in some way.
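For a quick check of the pattern, the sketch below runs the same shape of query against SQLite (version 3.25 or later, which added window functions) through Python's sqlite3 module instead of PostgreSQL. The `Customers` rows are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customers (Name TEXT, Country TEXT)")
# Made-up sample data; Sweden is not in the IN list and is filtered out.
conn.executemany(
    "INSERT INTO Customers VALUES (?, ?)",
    [
        ("Bruno", "Germany"),
        ("Anna", "Germany"),
        ("Claire", "France"),
        ("Dave", "UK"),
        ("Alice", "UK"),
        ("Erik", "Sweden"),
    ],
)

# Number the rows within each country by Name, then keep row 1 of each.
first_per_country = conn.execute(
    """
    SELECT Name, Country
    FROM (
        SELECT Name, Country,
               ROW_NUMBER() OVER (PARTITION BY Country ORDER BY Name) AS _rn
        FROM Customers
        WHERE Country IN ('Germany', 'France', 'UK')
    ) AS ranked
    WHERE _rn = 1
    ORDER BY Country
    """
).fetchall()
```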
Conditionally LIMIT in BigQuery
The LIMIT clause works differently within BigQuery. It specifies the maximum number of rows in the result. The LIMIT count n must be a constant INT64.
To stay within the limit on cached result size, you can:
- Use filters to limit the result set.
- Use a LIMIT clause to reduce the result set, especially if you are using an ORDER BY clause.
You can see this example:
SELECT title
FROM `my-project.mydataset.mytable`
ORDER BY title DESC
LIMIT 100
This will only return 100 rows.
The best practice is to use it if you are sorting a very large number of values. You can see this document with examples.
If you want to return all rows from a table, you need to omit the LIMIT clause.
SELECT title
FROM `my-project.mydataset.mytable`
ORDER BY title DESC
This example will return all the rows from a table. Omitting LIMIT is not recommended if your tables are very large, as the query will consume a lot of resources.
One way to optimize resource use is to use clustered tables, which reduces both cost and query time. You can see this document with a detailed explanation of how it works.
Limit to number of Items in list for WHERE clause SQL query
Explicitly including an extremely large number of values (many thousands of values separated by commas) within the parentheses, in an IN clause can consume resources and return errors 8623 or 8632. To work around this problem, store the items in the IN list in a table, and use a SELECT subquery within an IN clause.
Error 8623:
The query processor ran out of internal resources and could not produce a query plan. This is a rare event and only expected for extremely complex queries or queries that reference a very large number of tables or partitions. Please simplify the query. If you believe you have received this message in error, contact Customer Support Services for more information.
Error 8632:
Internal error: An expression services limit has been reached. Please look for potentially complex expressions in your query, and try to simplify them.
Source: Microsoft docs