Calculating running totals is a common task when reporting warehouse or financial balance. The theory behind running totals is quite easy as it is simply sum of values on all previous rows.
RowID | ProductID | Date | Operation | Quantity | Total Quantity |
1 | 1 | 2010/09/01 | Buy | 1000 | 1000 |
2 | 1 | 2010/09/02 | Buy | 500 | 1500 |
3 | 1 | 2010/09/03 | Sell | 200 | 1300 |
4 | 1 | 2010/09/04 | Buy | 500 | 1800 |
5 | 1 | 2010/09/05 | Sell | 1000 | 800 |
Some reporting tools provide native support for running total, but some tools not. The real problems comes when we need to calculate running totals inside query. The main problems are with performance and sometimes complexity of such solution.
Jeff Moden has a great article about calculating running totals using T-SQL on SQL Server Central: Solving the “Running Total” & “Ordinal Rank” Problems.
For relatively small record sets the T-SQL solution is quite usable, but for larger record sets the real performance problems comes (a lot of reads and processing).
Although there are many articles related to running totals, nearly non of them describes the fastest possible method for calculating running totals – CLR. On the other side the CLR solution brings some issues which we have to take in mind.
I have updated the article after some comments and further email communication with Paul White MVP (blog). Paul has mentioned issues with rows ordering for calculation of the running totals so this updated version reflect this and also I’ve added a security check to the functions so those functions throws exceptions when and issue with row ordering is detected.
CLR solution for running totals
As I mentioned above, the CLR solution is the simplest and quickest solution for the running totals. Before we start with the CLR solution we will prepare a testing database and test data.
--====================== -- Create test database --====================== CREATE DATABASE RunningTotalsTest GO USE RunningTotalsTest GO --=========================== -- Create small test table --=========================== CREATE TABLE SmallTable ( RowID int NOT NULL IDENTITY(1,1), Value int, RandomValue int ) GO --Add Clustered primary key to the table ALTER TABLE SmallTable ADD CONSTRAINT PK_SmallTable PRIMARY KEY CLUSTERED (RowID) WITH FILLFACTOR = 100 GO --=========================== -- Create big test table --=========================== --The big table will be heap, becase we will want to test paralel query plans on it CREATE TABLE BigTable ( RowID int NOT NULL IDENTITY(1,1), Value int, RandomValue int, BigData binary(3000) ) GO --============================== -- Fill tables with test data --============================== --fill small table with data 1000 records WITH TallyBase AS ( SELECT 1 AS N UNION ALL SELECT 1 AS N UNION ALL SELECT 1 AS N UNION ALL SELECT 1 AS N UNION ALL SELECT 1 AS N UNION ALL SELECT 1 AS N UNION ALL SELECT 1 AS N UNION ALL SELECT 1 AS N UNION ALL SELECT 1 AS N UNION ALL SELECT 1 AS N ), Tally AS ( SELECT ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) As Value FROM TallyBase t1, TallyBase t2, TallyBase t3 ) INSERT INTO SmallTable ( Value, RandomValue ) SELECT Value, ABS(CHECKSUM(NEWID())) % 100000 FROM Tally GO --fill BigTable with 100000 records WITH TallyBase AS ( SELECT 1 AS N UNION ALL SELECT 1 AS N UNION ALL SELECT 1 AS N UNION ALL SELECT 1 AS N UNION ALL SELECT 1 AS N UNION ALL SELECT 1 AS N UNION ALL SELECT 1 AS N UNION ALL SELECT 1 AS N UNION ALL SELECT 1 AS N UNION ALL SELECT 1 AS N ), Tally AS ( SELECT ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) As Value FROM TallyBase t1, TallyBase t2, TallyBase t3, TallyBase t4, TallyBase t5 ) INSERT INTO BigTable ( Value, RandomValue ) SELECT Value, ABS(CHECKSUM(NEWID())) % 100000 FROM Tally GO
Simple scalar CLR running totals function
In our implementation we will use a simple scalar CLR function. This function will store current value for next row and in that row it will retrieve the value from previous row and calculate total. For this purpose we will use CallContext
class (See MSDN for details).
Here is testing class and function for BigInt data type.
public class RunningTotals { /// <summary> /// Storage Structure for holding actual Total and row number for security check. /// </summary> /// <typeparam name="T">Totals Data Type</typeparam> private struct RtStorage<T> where T : struct { public T Total; public int RowNo; } /// <summary> /// Calculates a running totals on BigInt (Int64) data type /// </summary> /// <param name="val">Value of current row</param> /// <param name="id">ID of the function in single query</param> /// <param name="rowNo">Specifies expecter rowNo. It is for security check to ensure correctness of running totals</param> /// <param name="nullValue">Value to be used for NULL values</param> /// <returns>SqlInt64 representing running total</returns> [SqlFunction(IsDeterministic = true)] public static SqlInt64 RunningTotalBigInt(SqlInt64 val, SqlByte id, int rowNo, SqlInt64 nullValue) { string dataName = string.Format("MultiSqlRt_{0}", id.IsNull ? 0 : id.Value); object lastSum = CallContext.GetData(dataName); var storage = lastSum != null ? (RtStorage<SqlInt64>)lastSum : new RtStorage<SqlInt64>(); storage.RowNo++; if (storage.RowNo != rowNo) throw new System.InvalidOperationException(string.Format("Rows were processed out of expected order. Expected RowNo: {0}, received RowNo: {1}", storage.RowNo, rowNo)); if (!val.IsNull) storage.Total = storage.Total.IsNull ? val : storage.Total + val; else storage.Total = storage.Total.IsNull ? nullValue : (nullValue.IsNull ? storage.Total : storage.Total + nullValue); CallContext.SetData(dataName, storage); return storage.Total; } }
The function has several input parameters:
val
parameter represents the value to be accumulated into the running total.id
parameter is identification of the function in a select statement if there are more functions used in single select (eg. accumulating costs and incomes separately) – it is necessary to have separate storage for each call of the function.rowNo
is parameter for the security check to ensure, that rows are accumulated in expected order and correct results are provided. In case an issue with accumulation order is detected, an exception is thrown so user immediately knows the calculation is wrong. The security check is based on communication with Paul White (MVP) as mentioned above and the value for this parameter will be provided byROW_NUMBER()
windowing function with expected order.nullValue
allows specify value for null handling (eg. you may want to accumulate some particular value in case the val is null.
Once we compile the assembly with the class, we can register it in our test database. Because we use the CallContext
class, the assembly will require PERMISSION_SET = UNSAFE
and because of this we have to set also the database TRUSTWORTHY
to ON
;
--================================================================= -- Set Testing database TRUSTWORTHY ON so we can use UNSAFE code --================================================================= ALTER DATABASE [RunningTotalsTest] SET TRUSTWORTHY ON WITH ROLLBACK IMMEDIATE GO --========================================== -- Create assembly --========================================== CREATE ASSEMBLY [SqlClrTotals] AUTHORIZATION [dbo] FROM 'C:SQLCLRSqlClrTotals.dll' WITH PERMISSION_SET = UNSAFE GO --========================================== -- Create testing function --========================================== CREATE FUNCTION [dbo].[fn_RunningTotalBigInt]( @val [bigint], --value to be added to running total (a fiedl in the query) @id [tinyint], --id of the running total within a single query @rowNo [int], --RowNumber of processed records. This is compared to espected rowNo and in case out of synchronization an exceiption is fired @nullValue [bigint] = NULL --representation of the NULL value when adding to running totals ) RETURNS [bigint] WITH EXECUTE AS CALLER AS EXTERNAL NAME [SqlClrTotals].[RunningTotals].[RunningTotalBigInt] GO
Once the assembly and function is created, we can start testing the function on our test tables.
Testing running total function on queries with serial plan
Fist test will be done on simple queries which will have a simple serial plan.
Simple query with one running total
SET STATISTICS IO ON SET STATISTICS TIME ON GO SELECT RowID, Value, dbo.fn_RunningTotalBigInt(Value, 0, ROW_NUMBER() OVER(ORDER BY RowID), null) as RunningTotal FROM dbo.SmallTable ORDER BY RowID SET STATISTICS TIME OFF SET STATISTICS IO OFF GO
Please note the ROW_NUMBER() OVER(ORDER BY RowID)
. This is the security check by which we provide RowNumber in the order of RowID
. In case of some inconsistencies (rows arrive in different order as required. This can happen because the query engine doesn’t guarantee the rows order until final result set is received) an exception will be thrown.
As we can see, the syntax for this query is very simple and it also give as a very good and correct results:
RowID Value RunningTotal ----------- ----------- -------------------- 1 1 1 2 2 3 3 3 6 4 4 10 5 5 15 6 6 21 . . . . . . . . . 998 998 498501 999 999 499500 1000 1000 500500 (1000 row(s) affected) SQL Server parse and compile time: CPU time = 0 ms, elapsed time = 0 ms. (1000 row(s) affected) Table 'SmallTable'. Scan count 1, logical reads 5, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. (1 row(s) affected) SQL Server Execution Times: CPU time = 0 ms, elapsed time = 3 ms.
And also a very simple query plan:
Non of the T-SQL methods will produce such simple query plan. The Sequence Project is calculation of the ROW_NUMBER()
for security check.
Multiple running totals in single query
Now let test the situation if we use the function two times in the same query.
SELECT RowID, Value, [dbo].fn_RunningTotalBigInt(Value, 0, ROW_NUMBER() OVER(ORDER BY RowID), null) as RunningTotal, RandomValue, [dbo].fn_RunningTotalBigInt(RandomValue, 1, ROW_NUMBER() OVER(ORDER BY RowID), null) as RandomRunningTotal FROM dbo.SmallTable ORDER BY RowID
Result:
RowID Value RunningTotal RandomValue RandomRunningTotal ----------- ----------- -------------------- ----------- -------------------- 1 1 1 15137 15137 2 2 3 97303 112440 3 3 6 14688 127128 4 4 10 64512 191640 5 5 15 20463 212103 . . . . . . . . . . 996 996 496506 64071 50040800 997 997 497503 58488 50099288 998 998 498501 65331 50164619 999 999 499500 97196 50261815 1000 1000 500500 65433 50327248
As we can see, we have again correct results. Note, that each call to the fn_RunningTotalInt
has different ID.
In case we use same ID the execution will fail with detection of rows being processed in unexpected order.
SELECT RowID, Value, [dbo].fn_RunningTotalBigInt(Value, 0, ROW_NUMBER() OVER(ORDER BY RowID), null) as RunningTotal, RandomValue, [dbo].fn_RunningTotalBigInt(RandomValue, 0, ROW_NUMBER() OVER(ORDER BY RowID), null) as RandomRunningTotal FROM dbo.SmallTable ORDER BY RowID
Msg 6522, Level 16, State 2, Line 1 A .NET Framework error occurred during execution of user-defined routine or aggregate "fn_RunningTotalInt": System.InvalidOperationException: Rows were processed out of expected order. Expected RowNo: 2, received RowNo: 1 System.InvalidOperationException: in RunningTotals.RunningTotalInt(SqlInt32 val, SqlByte id, Int32 rowNo, SqlInt32 nullValue)
This is caused by the fact that in case of using identical id, the internal counters are used by both functions instead of single counter for each function call.
Detection of processing row order changes
As I have mentioned at the beginning fo the article, this update was done upon Paul White comments, so now the function detect such issues if they occurs. Let’s simulate such issue.
SELECT RowID, Value, [ClrUnsafe].fn_RunningTotalBigInt(Value, 0, ROW_NUMBER() OVER(ORDER BY RowID), null) as RunningTotal, ROW_NUMBER() OVER(ORDER BY RowID DESC) DescendingRowOrder FROM dbo.SmallTable ORDER BY RowID
Msg 6522, Level 16, State 2, Line 1 A .NET Framework error occurred during execution of user-defined routine or aggregate "fn_RunningTotalInt": System.InvalidOperationException: Rows were processed out of expected order. Expected RowNo: 1, received RowNo: 1000 System.InvalidOperationException: in RunningTotals.RunningTotalInt(SqlInt32 val, SqlByte id, Int32 rowNo, SqlInt32 nullValue)
This query produces below plan:
DescendingRowOrder
and prior the final sort. However if such issue occurs, it is correctly detected and an exception raised. This issue can occur even the scalar operator was the last operator as the order of rows is not guaranteed by the query processor until final result is received, but in normal cases in single threaded processing it generally works and in case of some issues, we have the check implemented.Null values handling
Now we will focus on the NULL values handling. Let’s run another simple query with some NULL values.
WITH Data (RowID, Value) AS ( SELECT 1 AS RowId, NULL As Value UNION ALL SELECT 2 AS RowId, 1 As Value UNION ALL SELECT 3 AS RowId, NULL As Value UNION ALL SELECT 4 AS RowId, 1 As Value UNION ALL SELECT 5 AS RowId, 1 As Value ) SELECT RowId ,Value ,[dbo].fn_RunningTotalBigInt(Value, 0, ROW_NUMBER() OVER(ORDER BY RowID), null) as TotalNull ,[dbo].fn_RunningTotalBigInt(Value, 1, ROW_NUMBER() OVER(ORDER BY RowID), 0) as TotalZero ,[dbo].fn_RunningTotalBigInt(Value, 2, ROW_NUMBER() OVER(ORDER BY RowID), 10) as TotalTen FROM Data ORDER BY RowID
And of course result of such solution:
RowId Value TotalNull TotalZero TotalTen ----------- ----------- -------------------- -------------------- -------------------- 1 NULL NULL 0 10 2 1 1 1 11 3 NULL 1 1 21 4 1 2 2 22 5 1 3 3 23
Here we can see, that if we pass NULL, then the NULL values are ignored and in the other two cases the NULL values are handled as 0 respectively 10.
Parallel plans and not ordered result set testing
Till now we had tested only simple queries, which produce serial plan. But what if we will use the function in much more complicated queries, or queries processing high amount of data and which produce parallel query plan on multi processor (core) systems.
Let’s simulate such scenario on our big table, which should produce a parallel plan on multi core system and default settings of SQL Server.
SELECT RowID ,Value ,[dbo].fn_RunningTotalBigInt(Value, 0, ROW_NUMBER() OVER(ORDER BY RowID), null) as RunningTotal FROM dbo.BigTable ORDER BY RowID
Here is query plan for above query:
And here results:
RowID Value RunningTotal ----------- ----------- -------------------- 1 1 1 2 2 3 3 3 6 4 4 10 99994 99994 4999450015 99995 99995 4999550010 99996 99996 4999650006 99997 99997 4999750003 99998 99998 4999850001 99999 99999 4999950000 100000 100000 5000050000 (100000 row(s) affected)
In this case the results are correct as Compute Scalar was performed after the parallel streams were garthered.
In case the computation of scalar running total value occured prior the parallel streams are garthered, then again an issue that rows are not processed in corect order may occur.
To avoid this kind of problems we can force serial query plan by setting max degree of parallelism for the query using the OPTION(MAXDOP 1)
SELECT RowID ,Value ,[dbo].fn_RunningTotalBigInt(Value, 0, ROW_NUMBER() OVER(ORDER BY RowID), null) as RunningTotal FROM dbo.BigTable ORDER BY RowID OPTION(MAXDOP 1)
This query will always produce a serial plan. In this case following:
Final form of Running Totals
From the above we can construct a final form of the Running Totals functions for each numeric data type. The final form can look like following class;
using System; using Microsoft.SqlServer.Server; using System.Runtime.Remoting.Messaging; using System.Data.SqlTypes; using System.Collections.Generic; /// <summary> /// Class contains CLR scalar functions for calculation of running totals /// </summary> public class RunningTotals { /// <summary> /// Storage Structure for holding actual Total and row number for security check. /// </summary> /// <typeparam name="T">Totals Data Type</typeparam> private struct RtStorage<T> where T : struct { public T Total; public int RowNo; } /// <summary> /// Calculates a running totals on TinyInt (byte) data type /// </summary> /// <param name="val">Value of current row</param> /// <param name="id">ID of the function in single query</param> /// <param name="rowNo">Specifies expecter rowNo. It is for security check to ensure correctness of running totals</param> /// <param name="nullValue">Value to be used for NULL values</param> /// <returns>SqlByte representing running total</returns> [SqlFunction(IsDeterministic = true)] public static SqlByte RunningTotalTinyInt(SqlByte val, SqlByte id, int rowNo, SqlByte nullValue) { string dataName = string.Format("MultiSqlRt_{0}", id.IsNull ? 0 : id.Value); object lastSum = CallContext.GetData(dataName); var storage = lastSum != null ? (RtStorage<SqlByte>)lastSum : new RtStorage<SqlByte>(); storage.RowNo++; if (storage.RowNo != rowNo) throw new System.InvalidOperationException(string.Format("Rows were processed out of expected order. Expected RowNo: {0}, received RowNo: {1}", storage.RowNo, rowNo)); if (!val.IsNull) storage.Total = storage.Total.IsNull ? val : storage.Total + val; else storage.Total = storage.Total.IsNull ? nullValue : (nullValue.IsNull ? storage.Total : storage.Total + nullValue); CallContext.SetData(dataName, storage); return storage.Total; } /// <summary> /// Calculates a running totals on SmallInt (Int) data type /// </summary> /// <param name="val">Value of current row</param> /// <param name="id">ID of the function in single query</param> /// <param name="rowNo">Specifies expecter rowNo. It is for security check to ensure correctness of running totals</param> /// <param name="nullValue">Value to be used for NULL values</param> /// <returns>SqlInt16 representing running total</returns> [SqlFunction(IsDeterministic = true)] public static SqlInt16 RunningTotalSmallInt(SqlInt16 val, SqlByte id, int rowNo, SqlInt16 nullValue) { string dataName = string.Format("MultiSqlRt_{0}", id.IsNull ? 0 : id.Value); object lastSum = CallContext.GetData(dataName); var storage = lastSum != null ? (RtStorage<SqlInt16>)lastSum : new RtStorage<SqlInt16>(); storage.RowNo++; if (storage.RowNo != rowNo) throw new System.InvalidOperationException(string.Format("Rows were processed out of expected order. Expected RowNo: {0}, received RowNo: {1}", storage.RowNo, rowNo)); if (!val.IsNull) storage.Total = storage.Total.IsNull ? val : storage.Total + val; else storage.Total = storage.Total.IsNull ? nullValue : (nullValue.IsNull ? storage.Total : storage.Total + nullValue); CallContext.SetData(dataName, storage); return storage.Total; } /// <summary> /// Calculates a running totals on Int (Int32) data type /// </summary> /// <param name="val">Value of current row</param> /// <param name="id">ID of the function in single query</param> /// <param name="rowNo">Specifies expecter rowNo. It is for security check to ensure correctness of running totals</param> /// <param name="nullValue">Value to be used for NULL values</param> /// <returns>SqlInt32 representing running total</returns> [SqlFunction(IsDeterministic = true)] public static SqlInt32 RunningTotalInt(SqlInt32 val, SqlByte id, int rowNo, SqlInt32 nullValue) { string dataName = string.Format("MultiSqlRt_{0}", id.IsNull ? 0 : id.Value); object lastSum = CallContext.GetData(dataName); var storage = lastSum != null ? (RtStorage<SqlInt32>)lastSum : new RtStorage<SqlInt32>(); storage.RowNo++; if (storage.RowNo != rowNo) throw new System.InvalidOperationException(string.Format("Rows were processed out of expected order. Expected RowNo: {0}, received RowNo: {1}", storage.RowNo, rowNo)); if (!val.IsNull) storage.Total = storage.Total.IsNull ? val : storage.Total + val; else storage.Total = storage.Total.IsNull ? nullValue : (nullValue.IsNull ? storage.Total : storage.Total + nullValue); CallContext.SetData(dataName, storage); return storage.Total; } /// <summary> /// Calculates a running totals on BigInt (Int64) data type /// </summary> /// <param name="val">Value of current row</param> /// <param name="id">ID of the function in single query</param> /// <param name="rowNo">Specifies expecter rowNo. It is for security check to ensure correctness of running totals</param> /// <param name="nullValue">Value to be used for NULL values</param> /// <returns>SqlInt64 representing running total</returns> [SqlFunction(IsDeterministic = true)] public static SqlInt64 RunningTotalBigInt(SqlInt64 val, SqlByte id, int rowNo, SqlInt64 nullValue) { string dataName = string.Format("MultiSqlRt_{0}", id.IsNull ? 0 : id.Value); object lastSum = CallContext.GetData(dataName); var storage = lastSum != null ? (RtStorage<SqlInt64>)lastSum : new RtStorage<SqlInt64>(); storage.RowNo++; if (storage.RowNo != rowNo) throw new System.InvalidOperationException(string.Format("Rows were processed out of expected order. Expected RowNo: {0}, received RowNo: {1}", storage.RowNo, rowNo)); if (!val.IsNull) storage.Total = storage.Total.IsNull ? val : storage.Total + val; else storage.Total = storage.Total.IsNull ? nullValue : (nullValue.IsNull ? storage.Total : storage.Total + nullValue); CallContext.SetData(dataName, storage); return storage.Total; } /// <summary> /// Calculates a running totals on Float (Double) data type /// </summary> /// <param name="val">Value of current row</param> /// <param name="id">ID of the function in single query</param> /// <param name="rowNo">Specifies expecter rowNo. It is for security check to ensure correctness of running totals</param> /// <param name="nullValue">Value to be used for NULL values</param> /// <returns>SqlDouble representing running total</returns> [SqlFunction(IsDeterministic = true)] public static SqlDouble RunningTotalFloat(SqlDouble val, SqlByte id, int rowNo, SqlDouble nullValue) { string dataName = string.Format("MultiSqlRt_{0}", id.IsNull ? 0 : id.Value); object lastSum = CallContext.GetData(dataName); var storage = lastSum != null ? (RtStorage<SqlDouble>)lastSum : new RtStorage<SqlDouble>(); storage.RowNo++; if (storage.RowNo != rowNo) throw new System.InvalidOperationException(string.Format("Rows were processed out of expected order. Expected RowNo: {0}, received RowNo: {1}", storage.RowNo, rowNo)); if (!val.IsNull) storage.Total = storage.Total.IsNull ? val : storage.Total + val; else storage.Total = storage.Total.IsNull ? nullValue : (nullValue.IsNull ? storage.Total : storage.Total + nullValue); CallContext.SetData(dataName, storage); return storage.Total; } /// <summary> /// Calculates a running totals on Real (Single) data type /// </summary> /// <param name="val">Value of current row</param> /// <param name="id">ID of the function in single query</param> /// <param name="rowNo">Specifies expecter rowNo. It is for security check to ensure correctness of running totals</param> /// <param name="nullValue">Value to be used for NULL values</param> /// <returns>SqlSingle representing running total</returns> [SqlFunction(IsDeterministic = true)] public static SqlSingle RunningTotalReal(SqlSingle val, SqlByte id, int rowNo, SqlSingle nullValue) { string dataName = string.Format("MultiSqlRt_{0}", id.IsNull ? 0 : id.Value); object lastSum = CallContext.GetData(dataName); var storage = lastSum != null ? (RtStorage<SqlSingle>)lastSum : new RtStorage<SqlSingle>(); storage.RowNo++; if (storage.RowNo != rowNo) throw new System.InvalidOperationException(string.Format("Rows were processed out of expected order. Expected RowNo: {0}, received RowNo: {1}", storage.RowNo, rowNo)); if (!val.IsNull) storage.Total = storage.Total.IsNull ? val : storage.Total + val; else storage.Total = storage.Total.IsNull ? nullValue : (nullValue.IsNull ? storage.Total : storage.Total + nullValue); CallContext.SetData(dataName, storage); return storage.Total; } /// <summary> /// Calculates a running totals on Money data type /// </summary> /// <param name="val">Value of current row</param> /// <param name="id">ID of the function in single query</param> /// <param name="rowNo">Specifies expecter rowNo. It is for security check to ensure correctness of running totals</param> /// <param name="nullValue">Value to be used for NULL values</param> /// <returns>SqlMoney representing running total</returns> [SqlFunction(IsDeterministic = true)] public static SqlMoney RunningTotalMoney(SqlMoney val, SqlByte id, int rowNo, SqlMoney nullValue) { string dataName = string.Format("MultiSqlRt_{0}", id.IsNull ? 0 : id.Value); object lastSum = CallContext.GetData(dataName); var storage = lastSum != null ? (RtStorage<SqlMoney>)lastSum : new RtStorage<SqlMoney>(); storage.RowNo++; if (storage.RowNo != rowNo) throw new System.InvalidOperationException(string.Format("Rows were processed out of expected order. Expected RowNo: {0}, received RowNo: {1}", storage.RowNo, rowNo)); if (!val.IsNull) storage.Total = storage.Total.IsNull ? val : storage.Total + val; else storage.Total = storage.Total.IsNull ? nullValue : (nullValue.IsNull ? storage.Total : storage.Total + nullValue); CallContext.SetData(dataName, storage); return storage.Total; } /// <summary> /// Calculates a running totals on Decimal data type /// </summary> /// <param name="val">Value of current row</param> /// <param name="id">ID of the function in single query</param> /// <param name="rowNo">Specifies expecter rowNo. It is for security check to ensure correctness of running totals</param> /// <param name="nullValue">Value to be used for NULL values</param> /// <returns>SqlDecimal representing running total</returns> [SqlFunction(IsDeterministic = true)] public static SqlDecimal RunningTotalDecimal(SqlDecimal val, SqlByte id, int rowNo, SqlDecimal nullValue) { string dataName = string.Format("MultiSqlRt_{0}", id.IsNull ? 0 : id.Value); object lastSum = CallContext.GetData(dataName); var storage = lastSum != null ? (RtStorage<SqlDecimal>)lastSum : new RtStorage<SqlDecimal>(); storage.RowNo++; if (storage.RowNo != rowNo) throw new System.InvalidOperationException(string.Format("Rows were processed out of expected order. Expected RowNo: {0}, received RowNo: {1}", storage.RowNo, rowNo)); if (!val.IsNull) storage.Total = storage.Total.IsNull ? val : storage.Total + val; else storage.Total = storage.Total.IsNull ? nullValue : (nullValue.IsNull ? storage.Total : storage.Total + nullValue); CallContext.SetData(dataName, storage); return storage.Total; } }
Conclusion
As we have tested and seen, the CLR version of calculation running totals produces very simple query plans and is one of the fastest methods to calculate running totals.
The CLR functions are really simple and fast.
On one side the CLR solution for running totals is one of the fastest, but on the other side we have to remember that this solution requires, that records are processed in order in which the running totals should be calculated. However the Query Optimizer and Query Processor doesn’t guarantee, that the records will be processed in particular order. It only guarantees, that the final output will be sorced according the ORDER BY clause.
To ensure, that those function to not produce incorrect results a security check in the form of rowNo generated by ROW_NUMBER() function needs to be part of the solution an in case the solution could produce incorrect results, it raises an exception.
So I would like to thank gain to Paul White MVP (blog) for his notes.
Remarks
You can download a complete Visual C# project + SQL Scripts (script for generating sample database, script for creating CLR objects + script with test queries) from my SkyDrive here: SqlClrTotals_Updated.zip