SQL Performance: Where VS Where(Row_Number)

SQL performance: WHERE vs WHERE(ROW_NUMBER)

The 2nd answer is your best choice. It takes into account the fact that you could have holes in your ID column. I'd rewrite it as a CTE though instead of a subquery...

;WITH MyCTE AS
(SELECT  *,  
         ROW_NUMBER() OVER (ORDER BY ID) AS row  
FROM     Table)
SELECT   *
FROM     MyCTE
WHERE    row >= @start 
         AND row <= @end

Is there a performance difference in using a GROUP BY with MAX() as the aggregate vs ROW_NUMBER over partition by?

The group by should be faster. The row number has to assign a row to all rows in the table. It does this before filtering out the ones it doesn't want.

The second query is, by far, the better construct. In the first, you have to be sure that the columns in the partition clause match the columns that you want. More importantly, "group by" is a well-understood construct in SQL. I would also speculate that the group by might make better use of indexes, but that is speculation.

ROW_NUMBER() Performance optimization

You could try creating an Indexed View on the two tables:

CREATE VIEW dbo.YourIndexedView
WITH SCHEMABINDING 
AS
    SELECT  az.ArticleID,
            az.ChannnelID,
            az.ZoneID,
            a.LastEditDate,
            a.LastEditDateTime,
            az.ArticleOrder
    FROM    dbo.Article_tbl a
            INNER JOIN dbo.ArticleZone_tbl az
                ON a.ArticleID = az.AtricleID;

GO
CREATE UNIQUE CLUSTERED INDEX UQ_YourIndexView_ArticleID_ChannelID_ZoneID 
    ON dbo.YourIndexedView (ArticleID, ChannelID, ZoneID);

Once you have your clustered index in place you can create a nonclustered index that would assist in the sorting:

CREATE NONCLUSTERED INDEX IX_YourIndexedView_LastEditDate_ArticleOrder_LastEditDateTime
    ON dbo.YourIndexedView (LastEditDate DESC, ArticleOrder ASC, LastEditDateTime DESC);

You can then reference this in your query:

WITH OrderedOrders AS
(   SELECT  RowNum = ROW_NUMBER() OVER(ORDER BY LastEditDate DESC, ArticleOrder ASC, LastEditDateTime DESC),
            ArticleID,
            ChannelID,
            ZoneID,
            LastEditDateTime,
            ArticleOrder
    FROM    dbo.YourIndexedView WITH (NOEXPAND)
    WHERE   ChannelID = 1 
    AND     ZoneID = 0
)
SELECT  *
FROM    OrderedOrders
WHERE   RowNum BETWEEN 1 AND 10;

N.B. I may have missed some columns from your article table, but I couldn't infer them from the question

Furthermore, if your query is always going to have the same zone and channel, you could filter the view, then your clustered index column simply becomes ArticleID:

CREATE VIEW dbo.YourIndexedView
WITH SCHEMABINDING 
AS
    SELECT  az.ArticleID,
            az.ChannnelID,
            az.ZoneID,
            a.LastEditDate,
            a.LastEditDateTime,
            az.ArticleOrder
    FROM    Article_tbl a
            INNER JOIN ArticleZone_tbl az
                ON a.ArticleID = az.AtricleID
    WHERE   az.ChannelID = 1
    AND     Az.ZoneID = 1;

GO
CREATE UNIQUE CLUSTERED INDEX UQ_YourIndexView_ArticleID 
    ON dbo.YourIndexedView (ArticleID);

Which means your indexes will be smaller, and faster to use.

Improve the performance with cte ROW_NUMBER()

One core principle when doing SQL queries with mutliple steps needs to be: eliminate as much data as possible as early as possible. Your CTE loads all rows from TransactionSendQueue, when you only want the latest transaction per Tran_ID. The more data that's being handled, the higher risk of data loaded being written to disk, which is extremely detrimental to performance. The more data that's written to disk, the worse the impact is. You can view your execution plan to check if this is the case but I'd say it's likely considering the execution time.

The CTE should only return one row per row that could possibly be updated in your #TempTran table. You can use an additional CTE to retrieve the latest update first, and then use that information in your ctetran to reduce the amount of data (rows) being search ed through in the update statement.

WITH LatestTran AS --the lastest transaction
(
    SELECT 
        Tran_ID, 
        MAX(LastUpdate) AS LastUpdate
    FROM 
        TransactionSendQueue
    WHERE 
        STATUS = '1' --where 1 mean complete
    GROUP BY
        Tran_ID
), ctetran AS
(
    SELECT 
        Tran_ID, 
        Field2, 
        Field3, 
        Field4, 
        Field5, 
        Field6, 
        Field7, 
        Field8, 
        Field9
    FROM 
        TransactionSendQueue TSQ
    INNER JOIN LatestTran LT ON 
        TSQ.Tran_ID = LT.Tran_ID AND 
        TSQ.LastUpdate = LT.LastUpdate
)


UPDATE temp
SET STATUS = CASE
                WHEN temp.f2 = cte.Field2
                     AND temp.f3 = cte.Field3
                     AND temp.f4 = cte.Field4
                     AND temp.f5 = cte.Field5
                     AND temp.f6 = cte.Field6
                     AND temp.f7 = cte.Field7
                     AND temp.f8 = cte.Field8
                     AND temp.f9 = cte.Field9
                   THEN '2' -- where 2 mean skip
                ELSE '3' --where 3 mean ready to execute
             END
FROM #TempTran temp
INNER JOIN ctetran cte ON temp.Tran_ID = cte.Tran_ID

How big performance increase this will be is dependent on how many Batch_ID you have per Tran_ID, the more the bigger perfomance boost.

If the query is still running slow, you could also look into using an index for the LastUpdate column in the TransactionSendQueue table, since the query is now using that in a join statement.

Please let me know how much the query time is reduced, would be interesting to know.

ROW_NUMBER Performance

Your problem might be parameter sniffing.
SQL doesn't know what @Skip and @Take are going to be and this could be throwing the plan.

Its worth comparing the data in Prod & Dev - Often Dev data can be a lot more "skewy" especially if its being inserted manually.
Look at see what are the Avg, min and Max

Try sticking a with RECOMPILE on the stored proc and see if this helps.
Alternately - DBCC FREEPROCCACHE will empty the proc cache and create a new plan.

I just realized you are not partitioning your Row_Number() query - this means that in effect you are just running a TOP N query order by Fullname- seems strange as it will give random results based on the number of Extracts and employees

What are the differences between the older row_number() and the newer OFFSET + FETCH based pagination in SQL Server?

Using ROW_NUMBER() works fine - it's just more work than necessary; you need to write a "skeleton" CTE around your actual query, add the ROW_NUMBER() column to your output set, and then filter on that.

Using the new OFFSET / FETCH is simpler - and yes, it's also better for performance, as these two links can show you:

New T-SQL features in SQL Server 2012
Comparing performance for different SQL Server paging

So overall: if you're using SQL Server 2012 - then you should definitely use OFFSET/FETCH rather than ROW_NUMBER() for paging

SQL Performance: Where VS Where(Row_Number)