How to Get Latest Records for Each Ledger Account Based on Variationid

PostgreSQL: How to select last balance for each account on each day in a given date range?

You can use ROW_NUMBER with PARTITION BY:

SELECT entry_date, account_id, balance
FROM (
    SELECT entry_date, account_id, balance,
           ROW_NUMBER() OVER (PARTITION BY account_id, entry_date::date
                              ORDER BY entry_date DESC) AS rn
    FROM ledger
    WHERE entry_date BETWEEN '2016-02-01'::timestamp AND '2016-02-02'::timestamp
) AS t
WHERE t.rn = 1

PARTITION BY slices the rows by account_id and by day, since entry_date is cast to a date in the same clause. Each slice is ordered by entry_date in descending order, so ROW_NUMBER() = 1 corresponds to the last record of that day.

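If you prefer to skip the subquery entirely, PostgreSQL also offers DISTINCT ON. The following is only a sketch, assuming the same ledger table and columns as the query above:

-- Sketch: assumes the same ledger(entry_date, account_id, balance) table.
-- DISTINCT ON keeps the first row of each (account_id, day) group in sort order,
-- so ordering by entry_date DESC within each group keeps the last balance of the day.
SELECT DISTINCT ON (account_id, entry_date::date)
       entry_date, account_id, balance
FROM ledger
WHERE entry_date BETWEEN '2016-02-01'::timestamp AND '2016-02-02'::timestamp
ORDER BY account_id, entry_date::date, entry_date DESC;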

SQL Query for Getting total amount in each row

One way is to use a correlated subquery:

SELECT t.Date, l.Name, t.Amount,
       (SELECT SUM(t2.Amount)
        FROM transactions t2
        WHERE t2.ledgerID = t.ledgerID
          AND t2.date <= t.date) AS TotalAmountToDate
FROM ledger l
JOIN transactions t ON l.ledgerID = t.ledgerID

This will work in SQL Server; I'm not totally sure about Access.
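
If you are on SQL Server 2012 or later (rather than Access), a window function avoids the correlated subquery entirely. This is only a sketch using the table and column names assumed above; the default RANGE frame includes ties on date, which matches the <= comparison in the subquery:

SELECT t.Date, l.Name, t.Amount,
       -- default frame (RANGE UNBOUNDED PRECEDING) sums all earlier and tied dates
       SUM(t.Amount) OVER (PARTITION BY t.ledgerID
                           ORDER BY t.Date) AS TotalAmountToDate
FROM ledger l
JOIN transactions t ON l.ledgerID = t.ledgerID;

The running-total answer below discusses this construct (and its ROWS vs. RANGE framing) in more depth.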

Calculate running total / running balance

For those not using SQL Server 2012 or above, a cursor is likely the most efficient supported and guaranteed method outside of CLR. There are other approaches: the "quirky update", which can be marginally faster but is not guaranteed to keep working in the future; set-based approaches whose performance degrades drastically as the table gets larger; and recursive CTE methods that often require direct #tempdb I/O or result in spills with roughly the same impact.



INNER JOIN - do not do this:

The slow, set-based approach is of the form:

SELECT t1.TID, t1.amt, RunningTotal = SUM(t2.amt)
FROM dbo.Transactions AS t1
INNER JOIN dbo.Transactions AS t2
ON t1.TID >= t2.TID
GROUP BY t1.TID, t1.amt
ORDER BY t1.TID;

The reason this is slow? As the table gets larger, each incremental row requires reading n-1 rows in the table, so the total work grows quadratically, not linearly (a 10,000-row table means roughly 50 million row reads). This is bound for failures, timeouts, or just angry users.
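
If you want to see the growth for yourself, here is a sketch (assuming the same dbo.Transactions table used throughout these examples) that wraps the triangular join in SET STATISTICS IO so you can watch the logical reads climb as the table grows:

SET STATISTICS IO ON;

-- Same triangular join as above; the logical reads reported for Transactions
-- grow roughly with n*(n+1)/2 rather than with n.
SELECT t1.TID, t1.amt, RunningTotal = SUM(t2.amt)
FROM dbo.Transactions AS t1
INNER JOIN dbo.Transactions AS t2
    ON t1.TID >= t2.TID
GROUP BY t1.TID, t1.amt
ORDER BY t1.TID;

SET STATISTICS IO OFF;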



Correlated subquery - do not do this either:

The subquery form is similarly painful for similarly painful reasons.

SELECT TID, amt,
       RunningTotal = amt + COALESCE(
       (
           SELECT SUM(amt)
           FROM dbo.Transactions AS i
           WHERE i.TID < o.TID
       ), 0)
FROM dbo.Transactions AS o
ORDER BY TID;


Quirky update - do this at your own risk:

The "quirky update" method is more efficient than the above, but the behavior is not documented, there are no guarantees about order, and the behavior might work today but could break in the future. I'm including this because it is a popular method and it is efficient, but that doesn't mean I endorse it. The primary reason I even answered this question instead of closing it as a duplicate is because the other question has a quirky update as the accepted answer.

DECLARE @t TABLE
(
    TID INT PRIMARY KEY,
    amt INT,
    RunningTotal INT
);

DECLARE @RunningTotal INT = 0;

INSERT @t(TID, amt, RunningTotal)
SELECT TID, amt, RunningTotal = 0
FROM dbo.Transactions
ORDER BY TID;

UPDATE @t
SET @RunningTotal = RunningTotal = @RunningTotal + amt
FROM @t;

SELECT TID, amt, RunningTotal
FROM @t
ORDER BY TID;


Recursive CTEs

This first one relies on TID being contiguous, with no gaps:

;WITH x AS
(
    SELECT TID, amt, RunningTotal = amt
    FROM dbo.Transactions
    WHERE TID = 1
    UNION ALL
    SELECT y.TID, y.amt, x.RunningTotal + y.amt
    FROM x
    INNER JOIN dbo.Transactions AS y
        ON y.TID = x.TID + 1
)
SELECT TID, amt, RunningTotal
FROM x
ORDER BY TID
OPTION (MAXRECURSION 10000);

If you can't rely on this, then you can use this variation, which simply builds a contiguous sequence using ROW_NUMBER():

;WITH y AS
(
    SELECT TID, amt, rn = ROW_NUMBER() OVER (ORDER BY TID)
    FROM dbo.Transactions
), x AS
(
    SELECT TID, rn, amt, rt = amt
    FROM y
    WHERE rn = 1
    UNION ALL
    SELECT y.TID, y.rn, y.amt, x.rt + y.amt
    FROM x
    INNER JOIN y
        ON y.rn = x.rn + 1
)
SELECT TID, amt, RunningTotal = rt
FROM x
ORDER BY x.rn
OPTION (MAXRECURSION 10000);

Depending on the size of the data (e.g. columns we don't know about), you may find better overall performance by stuffing only the relevant columns into a #temp table first, and processing against that instead of the base table:

CREATE TABLE #x
(
    rn INT PRIMARY KEY,
    TID INT,
    amt INT
);

INSERT INTO #x (rn, TID, amt)
SELECT ROW_NUMBER() OVER (ORDER BY TID), TID, amt
FROM dbo.Transactions;

;WITH x AS
(
    SELECT TID, rn, amt, rt = amt
    FROM #x
    WHERE rn = 1
    UNION ALL
    SELECT y.TID, y.rn, y.amt, x.rt + y.amt
    FROM x
    INNER JOIN #x AS y
        ON y.rn = x.rn + 1
)
SELECT TID, amt, RunningTotal = rt
FROM x
ORDER BY TID
OPTION (MAXRECURSION 10000);

DROP TABLE #x;

Only the first CTE method will provide performance rivaling the quirky update, but it makes a big assumption about the nature of the data (no gaps). The performance of the other two methods will fall off, and in those cases you may as well use a cursor (if you can't use CLR and you're not yet on SQL Server 2012 or above).



Cursor

Everybody is told that cursors are evil, and that they should be avoided at all costs, but this actually beats the performance of most other supported methods, and is safer than the quirky update. The only ones I prefer over the cursor solution are the 2012 and CLR methods (below):

CREATE TABLE #x
(
    TID INT PRIMARY KEY,
    amt INT,
    rt INT
);

INSERT #x(TID, amt)
SELECT TID, amt
FROM dbo.Transactions
ORDER BY TID;

DECLARE @rt INT, @tid INT, @amt INT;
SET @rt = 0;

DECLARE c CURSOR LOCAL STATIC READ_ONLY FORWARD_ONLY
FOR SELECT TID, amt FROM #x ORDER BY TID;

OPEN c;

FETCH c INTO @tid, @amt;

WHILE @@FETCH_STATUS = 0
BEGIN
    SET @rt = @rt + @amt;
    UPDATE #x SET rt = @rt WHERE TID = @tid;
    FETCH c INTO @tid, @amt;
END

CLOSE c; DEALLOCATE c;

SELECT TID, amt, RunningTotal = rt
FROM #x
ORDER BY TID;

DROP TABLE #x;


SQL Server 2012 or above

New window functions introduced in SQL Server 2012 make this task a lot easier (and it performs better than all of the above methods as well):

SELECT TID, amt, 
RunningTotal = SUM(amt) OVER (ORDER BY TID ROWS UNBOUNDED PRECEDING)
FROM dbo.Transactions
ORDER BY TID;

Note that on larger data sets, you'll find that the above performs much better than either of the following two options, since RANGE uses an on-disk spool (and RANGE is the default frame when you specify only ORDER BY). However, it is also important to note that the behavior and results can differ, so be sure they both return correct results before deciding between them based on this difference.

SELECT TID, amt, 
RunningTotal = SUM(amt) OVER (ORDER BY TID)
FROM dbo.Transactions
ORDER BY TID;

SELECT TID, amt,
RunningTotal = SUM(amt) OVER (ORDER BY TID RANGE UNBOUNDED PRECEDING)
FROM dbo.Transactions
ORDER BY TID;


CLR

For completeness, I'm offering a link to Pavel Pawlowski's CLR method, which is by far the preferable method on versions prior to SQL Server 2012 (though not on SQL Server 2000, which predates CLR integration).

http://www.pawlowski.cz/2010/09/sql-server-and-fastest-running-totals-using-clr/



Conclusion

If you are on SQL Server 2012 or above, the choice is obvious - use the new SUM() OVER() construct (with ROWS rather than RANGE). For earlier versions, you'll want to compare the performance of the alternative approaches on your schema and data and, taking non-performance-related factors into account, determine which approach is right for you. It very well may be the CLR approach. Here are my recommendations, in order of preference:

  1. SUM() OVER() ... ROWS, if on 2012 or above
  2. CLR method, if possible
  3. First recursive CTE method, if possible
  4. Cursor
  5. The other recursive CTE methods
  6. Quirky update
  7. Join and/or correlated subquery

For further information with performance comparisons of these methods, see this question on http://dba.stackexchange.com:

https://dba.stackexchange.com/questions/19507/running-total-with-count


I've also blogged more details about these comparisons here:

http://www.sqlperformance.com/2012/07/t-sql-queries/running-totals


Also for grouped/partitioned running totals, see the following posts:

http://sqlperformance.com/2014/01/t-sql-queries/grouped-running-totals

Partitioning results in a running totals query

Multiple Running Totals with Group By

SQL: How do I INSERT primary key values from two tables INTO a master table

Alright, thank you for your answers and suggestions.

I am working with an Access project connected to an MS SQL database. I tried to solve this with a table trigger, but none of the suggestions did the trick for me, so I decided to solve it on the client side with VBA code instead.

It is probably not the proper way to solve it, but it might be useful for anyone reading.

The table structure is the same, but I have made a corresponding form for both the product and region tables. On each form's AfterInsert event I have the following code:

Region table:

Private Sub Form_AfterInsert()

    Dim varRegion As String
    Dim strSQL As String

    varRegion = Me![code]
    ' Add a master row for the new region combined with every existing product
    strSQL = "INSERT INTO master([region], [product]) " & _
             "SELECT '" & varRegion & "', [code] FROM product;"

    DoCmd.RunSQL strSQL

End Sub

Product table:

Private Sub Form_AfterInsert()

    Dim varProduct As String
    Dim strSQL As String

    varProduct = Me![code]
    ' Add a master row for every existing region combined with the new product
    strSQL = "INSERT INTO master([region], [product]) " & _
             "SELECT [code], '" & varProduct & "' FROM region;"

    DoCmd.RunSQL strSQL

End Sub

EDIT: Having researched the matter a little more, I found the code you need for the trigger in SQL Server, if you don't want to use the client-side setup.

Apparently, in SQL Server you need to reference a special table called "inserted" when you want the values of the inserted rows. See this link for more info: Multirow Considerations for DML Triggers


Product table:

-- Trigger statement
CREATE TRIGGER "name-of-trigger"
ON producttable
FOR INSERT AS

-- Insert statement
INSERT INTO mastertable ([region],[product])
SELECT regiontable.[code], inserted.[code]
FROM regiontable, inserted;

Region table:

-- Trigger statement
CREATE TRIGGER "name-of-trigger"
ON regiontable
FOR INSERT AS

-- Insert statement
INSERT INTO mastertable ([product],[region])
SELECT producttable.[code], inserted.[code]
FROM producttable, inserted;
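
As a quick sanity check that the triggers handle multi-row inserts (which is the point of joining against the inserted pseudo-table), something along these lines should add a master row for every new product crossed with every existing region. The table and column names are the hypothetical ones used in this question:

-- Hypothetical smoke test using the table names assumed above.
INSERT INTO producttable ([code]) VALUES ('P-100'), ('P-200');

-- Expect one new mastertable row per (existing region, new product) pair.
SELECT [region], [product] FROM mastertable;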

Is there a way to pull vendor creation date in Dynamics NAV 2016?

If you do have the change log active, the following is a basic query that will get you all insertions to the vendor table:

SELECT
    cle.[Primary Key] AS Vendor
    , cle.[New Value]
    , ven.Name
    , CAST(cle.[Date and Time] AS DATE) AS LogDate
    , CAST(cle.Time AS TIME(0)) AS LogTime
    , cle.[Field No_]
    , cle.[Type of Change]
    , cle.[User ID]
FROM dbo.[YourCompany$Change Log Entry] cle
LEFT OUTER JOIN dbo.YourCompany$Vendor ven
    ON cle.[Primary Key] = ven.No_
WHERE cle.[Table No_] = 23      -- 23 = Vendor table
  AND cle.[Field No_] = 1       -- field 1 = "No." (the vendor's primary key)
  AND cle.[Type of Change] = 0  -- 0 = Insertion
ORDER BY LogDate, LogTime, Vendor

I'm also preparing a blog post on the change log which should be out next week.


What is the correct way to store payments and refunds in a transaction table?

I went with the second implementation, taking the approach that @Raidex and @indiri suggested. No reason to unnecessarily complicate this record keeping, especially since we're talking just a payments/refunds table.

git log --all doesn't work inside a filter-branch

Essentially what you want to do here is:

  1. Build a map of all commits in the repository, indexed by hash ID.
  2. For each commit, determine the path names you wish to keep / use when running your filter.
  3. Run git filter-branch—or, at this point, just run your own code, since the map you built in step 1, and the stuff you computed in step 2, are a significant part of what filter-branch does—to copy old commits to new commits.
  4. If you are using your own code, create or update branch names for the last copied commits.

You can use git read-tree to copy each commit into an index—you can use the main index, or a temporary one—and then use the Git tools to modify the index so as to arrange in it the names and hash IDs that you wish to keep. Then use git write-tree and git commit-tree to build your new commits, just like filter-branch does.

An easier case

You may be able to simplify this somewhat, if you don't have too many alternative names for files. For instance, suppose that the history—the chains of commits—in the repository looks like this, with two great History Bottlenecks B1 and B2:

  _______________________          ________________          _________
 /                       \        /                \        /         \--bra
< large cloud of commits  >--B1--< cloud of commits >--B2--<    ...    >--nch
 \_______________________/        \________________/        \_________/--es

where the file names that you want to keep are all the same within any one of the three big bubbles, but at commit B2 there is a mass renaming so the names are different in the middle bubble, and likewise at B1 there's a mass renaming so the names are different in the first bubble.

In this case, there is a clear historical test you can perform in any filter (tree filter, index filter, whatever you like, though index filters are far faster than tree filters) to determine which file names to keep. Remember that filter-branch copies commits one by one, in topological order, so that newly copied parents are created before any newly copied children. That is, it works on commits from the first group first, then copies bottleneck commit B1, then works on commits from the second group, and so on.

The hash ID of the commit being copied is available to your filter (regardless of which filter(s) you use): it's $GIT_COMMIT. So you simply need to test:

  • Is $GIT_COMMIT an ancestor of B1? If so, you're in the first set.
  • Is $GIT_COMMIT an ancestor of B2? If so, you're in the first or second set.

Hence an index filter that consists of "preserve names from set of names" can be written as:

if git merge-base --is-ancestor $GIT_COMMIT <hash of B1>; then
    set_of_names=/tmp/list1
elif git merge-base --is-ancestor $GIT_COMMIT <hash of B2>; then
    set_of_names=/tmp/list2
else
    set_of_names=/tmp/list3
fi
...

where files /tmp/list1, /tmp/list2, and /tmp/list3 contain the names of the files to keep. You now need only write the ... code that implements the "keep fixed set of file names during index filter operation". This is actually already done, mostly anyway, in this answer to extract multiple directories using git-filter-branch (as you found earlier today).


