Why Use Select Top 100 Percent

Why use Select Top 100 Percent?

It was used for "intermediate materialization".

Good article: Adam Machanic: Exploring the secrets of intermediate materialization

He even raised a Microsoft Connect item so that it could be done in a cleaner fashion.

My view is "not inherently bad", but don't use it unless you are 100% sure. The problem is that it works only at the time you write it and may well stop working later, when the patch level, schema, indexes, or row counts change.

Worked example

This may fail because you don't know the order in which the predicates are evaluated:

SELECT foo From MyTable WHERE ISNUMERIC (foo) = 1 AND CAST(foo AS int) > 100

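And this may also fail, because the optimizer is free to merge the derived table into the outer query and evaluate the CAST before the ISNUMERIC filter: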

SELECT foo
FROM
(SELECT foo From MyTable WHERE ISNUMERIC (foo) = 1) bar
WHERE
CAST(foo AS int) > 100

However, this did not fail in SQL Server 2000, where the inner query is evaluated and spooled first:

SELECT foo
FROM
(SELECT TOP 100 PERCENT foo From MyTable WHERE ISNUMERIC (foo) = 1 ORDER BY foo) bar
WHERE
CAST(foo AS int) > 100

Note that this form still works in SQL Server 2005:

SELECT TOP 2000000000 ... ORDER BY...
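Where the goal is only to avoid the conversion error, the filter can be written so that it does not depend on evaluation order at all. The following is a minimal sketch, not part of the original answer. It reuses MyTable and foo from the queries above and assumes SQL Server 2012 or later for TRY_CAST. The digits-only CASE variant for older versions is an illustrative assumption and only accepts unsigned integers of up to nine digits.

```sql
-- Assumption: SQL Server 2012+, where TRY_CAST returns NULL instead of raising an error.
SELECT foo
FROM MyTable
WHERE TRY_CAST(foo AS int) > 100;

-- Assumption: older versions without TRY_CAST.
-- The CAST sits inside the CASE, so it is only reached for rows that pass the check.
-- The check accepts 1 to 9 digits only, so the CAST cannot overflow an int.
SELECT foo
FROM MyTable
WHERE CASE
          WHEN foo NOT LIKE '%[^0-9]%' AND LEN(foo) BETWEEN 1 AND 9
              THEN CAST(foo AS int)
      END > 100;
```

The CASE variant uses a stricter test than ISNUMERIC because ISNUMERIC also returns 1 for strings such as '1e5' or '$', which cannot be cast to int.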

Why is this stored procedure using TOP 100 PERCENT?

The author of that procedure did not understand that tables have no order. They tried to insert rows into the temp tables in an ordered way, which is not possible. The TOP 100 PERCENT trick shuts up the warning about that but does nothing to ensure order.

In earlier SQL Server versions this code might well have worked by coincidence. Since then more optimizations have been added and this code is extremely brittle. Rewrite this if you get the chance. It's a latent time bomb.
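A rewrite usually moves the ORDER BY to the statement that finally reads the rows. The sketch below shows the idea. All object names (#Staging, dbo.SourceTable, SortKey, Payload) are hypothetical, since the original procedure is not shown here.

```sql
-- Hypothetical names throughout; this only illustrates where the ORDER BY belongs.
CREATE TABLE #Staging
(
    SortKey int          NOT NULL,
    Payload varchar(100) NULL
);

-- Load the temp table without TOP 100 PERCENT or ORDER BY:
-- the insert order is not something a later SELECT can rely on.
INSERT INTO #Staging (SortKey, Payload)
SELECT SortKey, Payload
FROM dbo.SourceTable;

-- The order is guaranteed only by an ORDER BY on the query that returns the rows.
SELECT SortKey, Payload
FROM #Staging
ORDER BY SortKey;

DROP TABLE #Staging;
```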

sql server order by TOP 100 PERCENT in SELECT query

You would have wanted the ORDER BY in the outer query, e.g.

select name, empID, salary, [deducted salary]
from (select name, empID, salary, [deducted salary] = salary-7000, Joined_Date
      from tblEmpDetails
     ) TmpTbl
where [deducted salary] > 50000
order by Joined_Date

EDIT - Yes you need to include Joined_Date in the inner query to sort by it on the outer query, as well as explicitly listing only the 4 columns desired instead of *.

But you could also have written the query in one level

  select name,empID,salary,[deducted salary] = salary-7000
from tblEmpDetails
where salary-7000 > 50000
order by Joined_Date

Note that salary-7000, although it appears twice in the query, is evaluated only once: SQL Server is smart enough to reuse the result.

Dynamic TOP N / TOP 100 PERCENT in a single query based on condition

A better solution would be to not use TOP at all - but ROWCOUNT instead:

SET ROWCOUNT stops processing after the specified number of rows.

...

To return all rows, set ROWCOUNT to 0.

Please note that ROWCOUNT is recommended for use only with SELECT statements, as the documentation warns:

Important

Using SET ROWCOUNT will not affect DELETE, INSERT, and UPDATE statements in a future release of SQL Server. Avoid using SET ROWCOUNT with DELETE, INSERT, and UPDATE statements in new development work, and plan to modify applications that currently use it. For a similar behavior, use the TOP syntax.

DECLARE @V_COUNT INT = 0

SET ROWCOUNT @V_COUNT -- 0 means return all rows...

SELECT *
FROM MY_TABLE
ORDER BY COL1

SET ROWCOUNT 0 -- Avoid side effects...

This eliminates the need to know how many rows there are in the table.

Be sure to reset ROWCOUNT to 0 after the query to avoid side effects (good point by Shnugo in the comments).

Is there a way to replace TOP 100 PERCENT from this Rank() OVER SQL?

You can remove both the TOP and ORDER BY clauses from the subquery. Since this is a subquery, the ORDER BY clause serves no purpose, and I suspect the TOP clause is only there to prevent the error you are seeing about using ORDER BY in a subquery.
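The question's own query is not reproduced here, so the following is only a sketch of the shape such a rewrite might take. The names dbo.Scores, PlayerId, and Score are hypothetical.

```sql
-- Hypothetical table and columns; illustrates the pattern only.
-- The ordering that RANK() needs lives in its OVER clause, so the
-- subquery itself needs neither TOP 100 PERCENT nor ORDER BY.
SELECT s.PlayerId, s.Score, s.ScoreRank
FROM (
    SELECT PlayerId,
           Score,
           RANK() OVER (ORDER BY Score DESC) AS ScoreRank
    FROM dbo.Scores
) AS s
WHERE s.ScoreRank <= 10
ORDER BY s.ScoreRank;   -- presentation order belongs on the outermost query
```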

SQL Select Top n percent rows


select user, phrase, tfw
from key_phrases t1
join (
select count(*) total_rows_per_user, user
from key_phrases
group by user
) t2 on t1.user = t2.user
where (
select count(*) from key_phrases t3
where t3.user = t1.user
and t3.tfw >= t1.tfw
) / total_rows_per_user <= .1

Another query, using variables, which should be faster:

select user, phrase, tfw,
if(@prev_user = user, @user_count := @user_count + 1, @user_count := 1),
@prev_user := user
from key_phrases t1
join (
select count(*) total_rows_per_user, user
from key_phrases
group by user
) t2 on t1.user = t2.user
cross join (select @user_count := 1, @prev_user := null) t3
where @user_count / total_rows_per_user >= .9
order by user, tfw
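On MySQL 8.0 or later, window functions may be a simpler way to express the same idea. This is a sketch under that version assumption, not part of the original answer. It reuses key_phrases, user, phrase, and tfw from the queries above; the aliases rn and ranked are introduced here for illustration.

```sql
-- Assumption: MySQL 8.0+ (window functions are not available in earlier versions).
SELECT `user`, phrase, tfw
FROM (
    SELECT `user`,
           phrase,
           tfw,
           -- position of the row within its user, highest tfw first
           ROW_NUMBER() OVER (PARTITION BY `user` ORDER BY tfw DESC) AS rn,
           -- number of rows for that user
           COUNT(*) OVER (PARTITION BY `user`) AS total_rows_per_user
    FROM key_phrases
) AS ranked
WHERE rn <= total_rows_per_user * 0.1
ORDER BY `user`, tfw DESC;
```

Ties are treated differently from the first query: ROW_NUMBER gives tied rows distinct positions, so one of two rows with equal tfw can fall inside the 10% while the other falls outside.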

select top 30 percent of the entries for each day

You can use row_number() with partition by date and check against the 30% number of total count of each day.

select date, receipt, total
from (select *,
             ceiling(tc * 30.00 / 100.00) as under30
      from (select date,
                   receipt,
                   total,
                   row_number() over(partition by date order by (select null)) rn,
                   count(*) over(partition by date order by (select null)) tc
            from sales) t
     ) t1
where rn <= under30


Output:

+------------+---------+-------+
| date       | receipt | total |
+------------+---------+-------+
| 2018-04-21 | 325     | 600   |
| 2018-04-21 | 326     | 800   |
| 2018-04-26 | 330     | 600   |
| 2018-04-26 | 331     | 1080  |
| 2018-04-29 | 334     | 600   |
| 2018-05-01 | 336     | 1500  |
+------------+---------+-------+

Note: If you want 30% of the total row count rather than 30% of each day's rows, change the count calculation in the query above as follows:

  count(*) over(order by (select null)) tc 

