Row Offset in SQL Server

I would avoid using SELECT *. Specify the columns you actually want, even if that turns out to be all of them.

SQL Server 2005+

SELECT col1, col2
FROM (
    SELECT col1, col2, ROW_NUMBER() OVER (ORDER BY ID) AS RowNum
    FROM MyTable
) AS MyDerivedTable
WHERE MyDerivedTable.RowNum BETWEEN @startRow AND @endRow
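
If you drive this from page-number parameters, the boundary rows are easy to derive. A minimal sketch, assuming hypothetical @PageNumber and @PageSize parameters:

-- Hypothetical one-based page parameters; ROW_NUMBER() is 1-based,
-- so page N covers rows (N-1)*size+1 through N*size.
DECLARE @PageNumber int = 3,
        @PageSize   int = 25;

DECLARE @startRow int = (@PageNumber - 1) * @PageSize + 1,
        @endRow   int = @PageNumber * @PageSize;

SELECT col1, col2
FROM (
    SELECT col1, col2, ROW_NUMBER() OVER (ORDER BY ID) AS RowNum
    FROM MyTable
) AS MyDerivedTable
WHERE MyDerivedTable.RowNum BETWEEN @startRow AND @endRow;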

SQL Server 2000

Efficiently Paging Through Large Result Sets in SQL Server 2000

A More Efficient Method for Paging Through Large Result Sets
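
Both articles build on the same pre-2005 pattern: materialize the ordered keys into a table with an IDENTITY column, then join back for just the requested range. A rough sketch of that idea (table and column names are placeholders, not taken from either article):

-- Pre-2005 key-table paging sketch; @startRow/@endRow are assumed
-- parameters, and MyTable/ID are placeholder names.
DECLARE @startRow int, @endRow int;
SELECT @startRow = 51, @endRow = 75;

DECLARE @keys TABLE (RowNum int IDENTITY(1,1), ID int);

-- The ORDER BY guarantees the IDENTITY values follow the sort order.
INSERT INTO @keys (ID)
SELECT ID FROM MyTable ORDER BY ID;

SELECT t.col1, t.col2
FROM MyTable t
JOIN @keys k ON k.ID = t.ID
WHERE k.RowNum BETWEEN @startRow AND @endRow;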

An SQL query with OFFSET/FETCH is returning unexpected results

The problem is that you expect the offset to be applied before the filter, but in fact it is applied after the filter. Think about a simpler example where you want all the people named 'sam', and there are more people named 'sam' than your offset:

CREATE TABLE dbo.foo(id int, name varchar(32));

INSERT dbo.foo(id, name) VALUES
(1, 'sam'),
(2, 'sam'),
(3, 'bob'),
(4, 'sam'),
(5, 'sam'),
(6, 'sam');

If you just say:

SELECT id FROM dbo.foo WHERE name = 'sam';

You get:

1
2
4
5
6

If you then add an offset of 3:

-- this offsets 3 rows _from the filtered result_,
-- not the full table

SELECT id FROM dbo.foo
WHERE name = 'sam'
ORDER BY id
OFFSET 3 ROWS FETCH NEXT 2 ROWS ONLY;

You get:

5
6

It takes all the rows that match the filter, then skips the first three of those filtered rows (1, 2, 4) - not 1, 2, 3, as your question implies you expected.

  • Example db<>fiddle

Going back to your case in the question, you are filtering out rows like 77 and 89 because they don't contain a 1 or a 2. So the offset you asked for is 200, but in terms of which rows that means, the offset is actually more like:

200 PLUS the number of rows that *don't* match your filter
until you hit the 200th row that *does*

You could try to force the filter to happen after, e.g.:

;WITH u AS
(
    SELECT *
    FROM [User]
    ORDER BY [Id]
    OFFSET 200 ROWS FETCH NEXT 100 ROWS ONLY
)
SELECT * FROM u
WHERE (([NameGiven] LIKE '%1%')
    OR ([NameFamily] LIKE '%2%'))
ORDER BY [Id]; -- yes, you still need this one

...but then you would almost certainly never get 100 rows in each page because some of those 100 rows would then be removed by the filter. I don't think this is what you're after.

Equivalent of LIMIT and OFFSET for SQL Server?

The equivalent of LIMIT is SET ROWCOUNT, but if you want generic pagination it's better to write a query like this:

;WITH Results_CTE AS
(
    SELECT
        Col1, Col2, ...,
        ROW_NUMBER() OVER (ORDER BY SortCol1, SortCol2, ...) AS RowNum
    FROM Table
    WHERE <whatever>
)
SELECT *
FROM Results_CTE
WHERE RowNum >= @Offset
  AND RowNum < @Offset + @Limit

The advantage here is the parameterization of the offset and limit in case you decide to change your paging options (or allow the user to do so).

Note: the @Offset parameter should use one-based indexing for this rather than the normal zero-based indexing.
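
For example, with a one-based @Offset, page 3 at 20 rows per page starts at row 41 (a sketch; @Page is a hypothetical parameter):

-- One-based offset: page N starts at (N-1)*limit + 1.
DECLARE @Page int = 3, @Limit int = 20;
DECLARE @Offset int = (@Page - 1) * @Limit + 1; -- = 41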

Offset and fetch in a delete statement

Use the primary key or ROWID to access the rows:

DELETE FROM mytable
WHERE rowid IN
      (SELECT rowid
       FROM mytable
       WHERE (conditions)
       OFFSET p1 ROWS FETCH NEXT p2 ROWS ONLY);

When you run the query repeatedly, you will eventually end up with a table consisting of p1 rows. But as sticky bit mentioned in the comments on the question: without an ORDER BY, which rows get skipped and deleted is arbitrary.
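
To make the deletions deterministic, add an ORDER BY on the key inside the subquery (a sketch built on the statement above):

-- Same statement, but the ORDER BY pins down which p1 rows survive.
DELETE FROM mytable
WHERE rowid IN
      (SELECT rowid
       FROM mytable
       WHERE (conditions)
       ORDER BY rowid
       OFFSET p1 ROWS FETCH NEXT p2 ROWS ONLY);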

SQL Offset total row count slow with IN Clause

Step one for performance related questions is going to be to analyze your table/index structure, and to review the query plans. You haven't provided that information, so I'm going to make up my own, and go from there.

I'm going to assume that you have a heap, with ~10M rows (12,872,738 for me):

DECLARE @MaxRowCount bigint = 10000000,
        @Offset bigint = 0;

DROP TABLE IF EXISTS #ExampleTable;
CREATE TABLE #ExampleTable
(
    ID bigint NOT NULL,
    Name varchar(50) COLLATE DATABASE_DEFAULT NOT NULL
);

WHILE @Offset < @MaxRowCount
BEGIN
    INSERT INTO #ExampleTable
        ( ID, Name )
    SELECT ROW_NUMBER() OVER ( ORDER BY ( SELECT NULL )),
           ROW_NUMBER() OVER ( ORDER BY ( SELECT NULL ))
    FROM master.dbo.spt_values SV
        CROSS APPLY master.dbo.spt_values SV2;
    SET @Offset = @Offset + ROWCOUNT_BIG();
END;

If I run the query provided over #ExampleTable, it takes about 4 seconds and gives me this query plan:

Baseline query plan

This isn't a great query plan by any means, but it is hardly awful. Running with live query stats shows that the cardinality estimates were at most off by one, which is fine.

Let's give it a massive number of items in our IN list (5,000 items, from 1 to 5000). Compiling the plan took 4 seconds:

Large IN list query plan

I can get my number up to 15000 items before the query processor stops being able to handle it, with no change in query plan (it does take a total of 6 seconds to compile). Running both queries takes about 5 seconds a pop on my machine.

This is probably fine for analytical workloads or for data warehousing, but for OLTP-like queries we've definitely exceeded our ideal time limit.

Let's look at some alternatives. We can probably do some of these in combination.

  1. We could cache off the IN list in a temp table or table variable (see the sketch after this list).
  2. We could use a window function to calculate the count.
  3. We could cache off our CTE in a temp table or table variable.
  4. If on a sufficiently high SQL Server version, we could use batch mode.
  5. We could change the indices on your table to make this faster.
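
A minimal sketch of option 1, materializing the IN list once (the #FilterIds name is made up) and joining against it instead of using a giant IN:

-- Sketch of option 1: cache the IN list, then join against it.
-- The PRIMARY KEY gives the optimizer a unique, ordered structure.
DECLARE @PageNum int = 1, @PageSize int = 100;

CREATE TABLE #FilterIds ( ID bigint NOT NULL PRIMARY KEY );

INSERT INTO #FilterIds ( ID )
VALUES ( 1 ), ( 2 ), ( 3 ); -- ...rest of the former IN list

SELECT E.ID,
       E.Name,
       COUNT( * ) OVER ( ) AS MaxRows
FROM #ExampleTable E
    INNER JOIN #FilterIds F ON F.ID = E.ID
ORDER BY E.Name
    OFFSET ( @PageNum - 1 ) * @PageSize ROWS FETCH NEXT @PageSize ROWS ONLY;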

Workflow considerations

If this is for an OLTP workflow, then we need something that is fast regardless of how many users we have. As such, we want to minimize recompiles and we want index seeks wherever possible. If this is analytic or warehousing, then recompiles and scans are probably fine.

If we want OLTP, then the caching options are probably off the table. Temp tables will always force recompiles, and table variables in queries that rely on a good estimate require you to force a recompile. The alternative would be to have some other part of your application maintain a persistent table that has paginated counts or filters (or both), and then have this query join against that.
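
For instance, such a persisted helper table might look something like this (an entirely hypothetical schema, maintained by some other process):

-- Hypothetical persisted cache: a background process keeps one row
-- per (filter set, matching row), with precomputed positions.
CREATE TABLE dbo.PaginatedFilterCache
(
    FilterHash binary(32) NOT NULL, -- identifies a saved filter set
    RowNum bigint NOT NULL,         -- precomputed position for paging
    ID bigint NOT NULL,             -- the matching row's key
    CONSTRAINT PK_PaginatedFilterCache PRIMARY KEY ( FilterHash, RowNum )
);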

If the same user would look at many pages, then caching off part of it is probably still worth it even in OLTP, but make sure you measure the impact of many concurrent users.

Regardless of workflow, updating indices is probably okay (unless your workflows are going to really mess with your index maintenance).

Regardless of workflow, batch mode will be your friend.

Regardless of workflow, window functions (especially with either indices and/or batch mode) will probably be better.

Batch mode and the default cardinality estimator

We pretty consistently get poor cardinality estimates (and resulting plans) with the legacy cardinality estimator and row-mode executions. Forcing the default cardinality estimator helps with the first, and batch-mode helps with the second.

If you can't update your database to use the new cardinality estimator wholesale, then you'll want to enable it for your specific query. For the first, use the query hint OPTION( USE HINT( 'FORCE_DEFAULT_CARDINALITY_ESTIMATION' ) ). For the second, add a join to a CCI that doesn't need to return data: LEFT OUTER JOIN dbo.EmptyCciForRowstoreBatchmode ON 1 = 0. This enables SQL Server to pick batch-mode optimizations. These recommendations assume a sufficiently new SQL Server version.

What the CCI is doesn't matter; we like to keep an empty one around for consistency. It looks like this:

CREATE TABLE dbo.EmptyCciForRowstoreBatchmode
(
    __zzDoNotUse int NULL,
    INDEX CCI CLUSTERED COLUMNSTORE
);

The best plan I could get without modifying the table was to use both of them. With the same data as before, this runs in <1s.

Batch Mode and NCE

WITH TempResult AS
(
    SELECT ID,
           Name,
           COUNT( * ) OVER ( ) AS MaxRows
    FROM #ExampleTable
    WHERE ID IN ( <<really long LIST>> )
)
SELECT TempResult.ID,
       TempResult.Name,
       TempResult.MaxRows
FROM TempResult
    LEFT OUTER JOIN dbo.EmptyCciForRowstoreBatchmode ON 1 = 0
ORDER BY TempResult.Name
    OFFSET ( @PageNum - 1 ) * @PageSize ROWS FETCH NEXT @PageSize ROWS ONLY
OPTION( USE HINT( 'FORCE_DEFAULT_CARDINALITY_ESTIMATION' ) );

Use offset and fetch with case in MS SQL server

The syntax you have is all over the place. Your END, for example, comes after the end of your statement (defined by the statement terminator, ;), and you're trying to use a CASE expression as if it were a Case (Switch) statement. Case (Switch) statements don't exist in Transact-SQL.

Considering the simplicity of your query, I would personally do something like this, and use a Dynamic Statement:

DECLARE @Page int,
        @Check int;
SET @Page = 2;
SET @Check = 2;

DECLARE @SQL nvarchar(MAX),
        @CRLF nchar(2) = NCHAR(13) + NCHAR(10);

SET @SQL = N'SELECT *' + @CRLF +
           N'FROM dbo.HolonSsoRequest' + @CRLF +
           N'ORDER BY {Column Name} DESC' + --Don't use ordinal positions: https://sqlblog.org/2009/10/06/bad-habits-to-kick-order-by-ordinal
           CASE @Check WHEN 2 THEN @CRLF + N'OFFSET @Page ROWS FETCH NEXT @Page ROWS ONLY;' ELSE N';' END;

--PRINT @SQL; --Your debugging friend

EXEC sp_executesql @SQL, N'@Page int', @Page;
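
Note that as written this skips @Page rows and returns @Page rows. If the offset and the page size should differ, a sketch of a variant with a separate, hypothetical @PageSize parameter that offsets whole pages:

-- Hypothetical variant: offset whole pages, fetch @PageSize rows.
DECLARE @Page2 int = 2,
        @PageSize int = 20,
        @SQL2 nvarchar(MAX);

SET @SQL2 = N'SELECT * FROM dbo.HolonSsoRequest ORDER BY {Column Name} DESC ' +
            N'OFFSET (@Page - 1) * @PageSize ROWS FETCH NEXT @PageSize ROWS ONLY;';

EXEC sp_executesql @SQL2, N'@Page int, @PageSize int', @Page2, @PageSize;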

Using offset fetch next only on condition

Assuming you can pick a sensible default upper limit for number of rows to return, just use some CASE expressions:

DECLARE @Condition bit;

SELECT canvas.CanvasName,
       c.CompanyID,
       c.CompanyName
FROM JobCanvas_B2B canvas
    INNER JOIN JobActivity act ON act.CanvasId = canvas.CanvasId
    INNER JOIN [Person_5.4] p ON p.JobCanvasId = canvas.CanvasId
    INNER JOIN Person pers ON pers.PersonId = p.PersonId
    INNER JOIN Company c ON c.CompanyID = pers.CompanyId
WHERE act.NextDateD BETWEEN @StartDate AND @EndDate
ORDER BY act.InteractionDateD DESC
OFFSET CASE WHEN @Condition = 1 THEN @SkipRows ELSE 0 END ROWS
FETCH NEXT CASE WHEN @Condition = 1 THEN @PageSize ELSE 1000000 END ROWS ONLY;

You can't "stop" in the middle of defining a query and start writing control flow statements (like IF) to decide how a part of your query should be structured.
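
If you'd rather not pick a sentinel row count, the alternative is two separate statements behind control flow. A sketch, reusing the dbo.foo table from earlier rather than the full query:

-- Sketch: branch the whole statement with IF/ELSE instead of
-- branching inside the paging clause.
DECLARE @Condition bit = 1,
        @SkipRows int = 2,
        @PageSize int = 2;

IF @Condition = 1
    SELECT id FROM dbo.foo ORDER BY id
    OFFSET @SkipRows ROWS FETCH NEXT @PageSize ROWS ONLY;
ELSE
    SELECT id FROM dbo.foo ORDER BY id;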


