SQL Server and fastest running totals using CLR – Updated

Calculating running totals is a common task when reporting warehouse or financial balance. The theory behind running totals is quite easy as it is simply sum of values on all previous rows.

 RowID ProductID Date Operation Quantity Total Quantity 1 1 2010/09/01 Buy 1000 1000 2 1 2010/09/02 Buy 500 1500 3 1 2010/09/03 Sell 200 1300 4 1 2010/09/04 Buy 500 1800 5 1 2010/09/05 Sell 1000 800

Some reporting tools provide native support for running total, but some tools not. The real problems comes when we need to calculate running totals inside query. The main problems are with performance and sometimes complexity of such solution.

Jeff Moden has a great article about calculating running totals using T-SQL on SQL Server Central: Solving the “Running Total” & “Ordinal Rank” Problems.

For relatively small record sets the T-SQL solution is quite usable, but for larger record sets the real performance problems comes (a lot of reads and processing).

Although there are many articles related to running totals, nearly non of them describes the fastest possible method for calculating running totals – CLR. On the other side the CLR solution brings some issues which we have to take in mind.

I have updated the article after some comments and further email communication with Paul White MVP (blog). Paul has mentioned issues with rows ordering for calculation of the running totals so this updated version reflect this and also I’ve added a security check to the functions so those functions throws exceptions when and issue with row ordering is detected.

CLR solution for running totals

As I mentioned above, the CLR solution is the simplest and quickest solution for the running totals. Before we start with the CLR solution we will prepare a testing database and test data.

```--======================
-- Create test database
--======================
CREATE DATABASE RunningTotalsTest
GO
USE RunningTotalsTest
GO

--===========================
-- Create small test table
--===========================
CREATE TABLE SmallTable (
RowID int NOT NULL IDENTITY(1,1),
Value int, RandomValue int
)
GO

--Add Clustered primary key to the table
ALTER TABLE SmallTable
PRIMARY KEY CLUSTERED (RowID)
WITH FILLFACTOR = 100
GO

--===========================
-- Create big test table
--===========================

--The big table will be heap, becase we will want to test paralel query plans on it
CREATE TABLE BigTable (
RowID int NOT NULL IDENTITY(1,1),
Value int,
RandomValue int,
BigData binary(3000)
)
GO

--==============================
-- Fill tables with test data
--==============================

--fill small table with data 1000 records
WITH TallyBase AS (
SELECT 1 AS N UNION ALL
SELECT 1 AS N UNION ALL
SELECT 1 AS N UNION ALL
SELECT 1 AS N UNION ALL
SELECT 1 AS N UNION ALL
SELECT 1 AS N UNION ALL
SELECT 1 AS N UNION ALL
SELECT 1 AS N UNION ALL
SELECT 1 AS N UNION ALL
SELECT 1 AS N
),
Tally AS (
SELECT
ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) As Value
FROM TallyBase t1, TallyBase t2, TallyBase t3
)
INSERT INTO SmallTable (
Value,
RandomValue
)
SELECT
Value,
ABS(CHECKSUM(NEWID())) % 100000
FROM Tally
GO

--fill BigTable with 100000 records
WITH TallyBase AS (
SELECT 1 AS N UNION ALL
SELECT 1 AS N UNION ALL
SELECT 1 AS N UNION ALL
SELECT 1 AS N UNION ALL
SELECT 1 AS N UNION ALL
SELECT 1 AS N UNION ALL
SELECT 1 AS N UNION ALL
SELECT 1 AS N UNION ALL
SELECT 1 AS N UNION ALL
SELECT 1 AS N
),
Tally AS (
SELECT
ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) As Value
FROM TallyBase t1, TallyBase t2, TallyBase t3, TallyBase t4, TallyBase t5
)
INSERT INTO BigTable (
Value,
RandomValue
)
SELECT
Value,
ABS(CHECKSUM(NEWID())) % 100000
FROM Tally
GO```

Simple scalar CLR running totals function

In our implementation we will use a simple scalar CLR function. This function will store current value for next row and in that row it will retrieve the value from previous row and calculate total. For this purpose we will use `CallContext` class (See MSDN for details).

Here is testing class and function for BigInt data type.

```public class RunningTotals
{
/// <summary>
/// Storage Structure for holding actual Total and row number for security check.
/// </summary>
/// <typeparam name="T">Totals Data Type</typeparam>
private struct RtStorage<T> where T : struct
{
public T Total;
public int RowNo;
}

/// <summary>
/// Calculates a running totals on BigInt (Int64) data type
/// </summary>
/// <param name="val">Value of current row</param>
/// <param name="id">ID of the function in single query</param>
/// <param name="rowNo">Specifies expecter rowNo. It is for security check to ensure correctness of running totals</param>
/// <param name="nullValue">Value to be used for NULL values</param>
/// <returns>SqlInt64 representing running total</returns>
[SqlFunction(IsDeterministic = true)]
public static SqlInt64 RunningTotalBigInt(SqlInt64 val, SqlByte id, int rowNo, SqlInt64 nullValue)
{
string dataName = string.Format("MultiSqlRt_{0}", id.IsNull ? 0 : id.Value);

object lastSum = CallContext.GetData(dataName);

var storage = lastSum != null ? (RtStorage<SqlInt64>)lastSum : new RtStorage<SqlInt64>();
storage.RowNo++;

if (storage.RowNo != rowNo)
throw new System.InvalidOperationException(string.Format("Rows were processed out of expected order. Expected RowNo: {0}, received RowNo: {1}", storage.RowNo, rowNo));

if (!val.IsNull)
storage.Total = storage.Total.IsNull ? val : storage.Total + val;
else
storage.Total = storage.Total.IsNull ? nullValue : (nullValue.IsNull ? storage.Total : storage.Total + nullValue);

CallContext.SetData(dataName, storage);

return storage.Total;
}
}```

The function has several input parameters:

• `val` parameter represents the value to be accumulated into the running total.
• `id` parameter is identification of the function in a select statement if there are more functions used in single select (eg. accumulating costs and incomes separately) – it is necessary to have separate storage for each call of the function.
• `rowNo` is parameter for the security check to ensure, that rows are accumulated in expected order and correct results are provided. In case an issue with accumulation order is detected, an exception is thrown so user immediately knows the calculation is wrong.  The security check is based on communication with Paul White (MVP) as mentioned above and the value for this parameter will be provided by `ROW_NUMBER()` windowing function with expected order.
• `nullValue` allows specify value for null handling (eg. you may want to accumulate some particular value in case the val is null.

Once we compile the assembly with the class, we can register it in our test database. Because we use the `CallContext` class, the assembly will require `PERMISSION_SET = UNSAFE` and because of this we have to set also the database `TRUSTWORTHY` to `ON`;

```--=================================================================
-- Set Testing database TRUSTWORTHY ON so we can use UNSAFE code
--=================================================================
ALTER DATABASE [RunningTotalsTest]
SET TRUSTWORTHY ON WITH ROLLBACK IMMEDIATE
GO

--==========================================
-- Create assembly
--==========================================
CREATE ASSEMBLY [SqlClrTotals]
AUTHORIZATION [dbo]
FROM 'C:SQLCLRSqlClrTotals.dll'
WITH PERMISSION_SET = UNSAFE
GO

--==========================================
-- Create testing function
--==========================================
CREATE FUNCTION [dbo].[fn_RunningTotalBigInt](
@val [bigint],					--value to be added to running total (a fiedl in the query)
@id [tinyint],					--id of the running total within a single query
@rowNo [int],					--RowNumber of processed records. This is compared to espected rowNo and in case out of synchronization an exceiption is fired
@nullValue [bigint] = NULL		--representation of the NULL value when adding to running totals
)
RETURNS [bigint]
WITH EXECUTE AS CALLER
AS
EXTERNAL NAME [SqlClrTotals].[RunningTotals].[RunningTotalBigInt]
GO```

Once the assembly and function is created, we can start testing the function on our test tables.

Testing running total function on queries with serial plan

Fist test will be done on simple queries which will have a simple serial plan.

Simple query with one running total

```SET STATISTICS IO ON
SET STATISTICS TIME ON
GO

SELECT
RowID,
Value,
dbo.fn_RunningTotalBigInt(Value, 0, ROW_NUMBER() OVER(ORDER BY RowID), null) as RunningTotal
FROM dbo.SmallTable
ORDER BY RowID

SET STATISTICS TIME OFF
SET STATISTICS IO OFF
GO```

Please note the `ROW_NUMBER() OVER(ORDER BY RowID)`. This is the security check by which we provide RowNumber in the order of `RowID`. In case of some inconsistencies (rows arrive in different order as required. This can happen because the query engine doesn’t guarantee the rows order until final result set is received) an exception will be thrown.

As we can see, the syntax for this query is very simple and it also give as a very good and correct results:

```RowID       Value       RunningTotal
----------- ----------- --------------------
1           1           1
2           2           3
3           3           6
4           4           10
5           5           15
6           6           21
.           .           .
.           .           .
.           .           .
998         998         498501
999         999         499500
1000        1000        500500

(1000 row(s) affected)

SQL Server parse and compile time: CPU time = 0 ms, elapsed time = 0 ms.

(1000 row(s) affected)

(1 row(s) affected)
SQL Server Execution Times:
CPU time = 0 ms,  elapsed time = 3 ms.```

And also a very simple query plan:

Non of the T-SQL methods will produce such simple query plan. The Sequence Project is calculation of the `ROW_NUMBER()` for security check.

Multiple running totals in single query

Now let test the situation if we use the function two times in the same query.

```SELECT
RowID,
Value,
[dbo].fn_RunningTotalBigInt(Value, 0, ROW_NUMBER() OVER(ORDER BY RowID), null) as RunningTotal,
RandomValue,
[dbo].fn_RunningTotalBigInt(RandomValue, 1, ROW_NUMBER() OVER(ORDER BY RowID), null) as RandomRunningTotal
FROM dbo.SmallTable
ORDER BY RowID```

Result:

```RowID       Value       RunningTotal         RandomValue RandomRunningTotal
----------- ----------- -------------------- ----------- --------------------
1           1           1                    15137       15137
2           2           3                    97303       112440
3           3           6                    14688       127128
4           4           10                   64512       191640
5           5           15                   20463       212103
.           .           .                    .           .
.           .           .                    .           .
996         996         496506               64071       50040800
997         997         497503               58488       50099288
998         998         498501               65331       50164619
999         999         499500               97196       50261815
1000        1000        500500               65433       50327248```

As we can see, we have again correct results. Note, that each call to the `fn_RunningTotalInt` has different ID.

In case we use same ID the execution will fail with detection of rows being processed in unexpected order.

```SELECT
RowID,
Value,
[dbo].fn_RunningTotalBigInt(Value, 0, ROW_NUMBER() OVER(ORDER BY RowID), null) as RunningTotal,
RandomValue,
[dbo].fn_RunningTotalBigInt(RandomValue, 0, ROW_NUMBER() OVER(ORDER BY RowID), null) as RandomRunningTotal
FROM dbo.SmallTable
ORDER BY RowID```
```Msg 6522, Level 16, State 2, Line 1
A .NET Framework error occurred during execution of user-defined routine or aggregate "fn_RunningTotalInt":
System.InvalidOperationException: Rows were processed out of expected order. Expected RowNo: 2, received RowNo: 1
System.InvalidOperationException:
in RunningTotals.RunningTotalInt(SqlInt32 val, SqlByte id, Int32 rowNo, SqlInt32 nullValue)```

This is caused by the fact that in case of using identical id, the internal counters are used by both functions instead of single counter for each function call.

Detection of processing row order changes

As I  have mentioned at the beginning fo the article, this update was done upon Paul White comments, so now the function detect such issues if they occurs. Let’s simulate such issue.

```SELECT
RowID,
Value,
[ClrUnsafe].fn_RunningTotalBigInt(Value, 0, ROW_NUMBER() OVER(ORDER BY RowID), null) as RunningTotal,
ROW_NUMBER() OVER(ORDER BY RowID DESC) DescendingRowOrder
FROM dbo.SmallTable
ORDER BY RowID```
```Msg 6522, Level 16, State 2, Line 1
A .NET Framework error occurred during execution of user-defined routine or aggregate "fn_RunningTotalInt":
System.InvalidOperationException: Rows were processed out of expected order. Expected RowNo: 1, received RowNo: 1000
System.InvalidOperationException:
in RunningTotals.RunningTotalInt(SqlInt32 val, SqlByte id, Int32 rowNo, SqlInt32 nullValue)```

This query produces below plan:

The issue is, that the the scalar operator is after a sort in descending order which is being done for the `DescendingRowOrder` and prior the final sort. However if such issue occurs, it is correctly detected and an exception raised. This issue can occur even the scalar operator was the last operator as the order of rows is not guaranteed by the query processor until final result is received, but in normal cases in single threaded processing it generally works and in case of some issues, we have the check implemented.

Null values handling

Now we will focus on the NULL values handling. Let’s run another simple query with some NULL values.

```WITH Data (RowID, Value) AS (
SELECT 1 AS RowId, NULL As Value
UNION ALL SELECT 2 AS RowId, 1 As Value
UNION ALL SELECT 3 AS RowId, NULL As Value
UNION ALL SELECT 4 AS RowId, 1 As Value UNION ALL
SELECT 5 AS RowId, 1 As Value
)
SELECT
RowId
,Value
,[dbo].fn_RunningTotalBigInt(Value, 0, ROW_NUMBER() OVER(ORDER BY RowID), null) as TotalNull
,[dbo].fn_RunningTotalBigInt(Value, 1, ROW_NUMBER() OVER(ORDER BY RowID), 0) as TotalZero
,[dbo].fn_RunningTotalBigInt(Value, 2, ROW_NUMBER() OVER(ORDER BY RowID), 10) as TotalTen
FROM Data
ORDER BY RowID```

And of course result of such solution:

```RowId       Value       TotalNull            TotalZero            TotalTen
----------- ----------- -------------------- -------------------- --------------------
1           NULL        NULL                 0                    10
2           1           1                    1                    11
3           NULL        1                    1                    21
4           1           2                    2                    22
5           1           3                    3                    23```

Here we can see, that if we pass NULL, then the NULL values are ignored and in the other two cases the NULL values are handled as 0 respectively 10.

Parallel plans and not ordered result set testing

Till now we had tested only simple queries, which produce serial plan. But what if we will use the function in much more complicated queries, or queries processing high amount of data and which produce parallel query plan on multi processor (core) systems.

Let’s simulate such scenario on our big table, which should produce a parallel plan on multi core system and default settings of SQL Server.

```SELECT
RowID
,Value
,[dbo].fn_RunningTotalBigInt(Value, 0, ROW_NUMBER() OVER(ORDER BY RowID), null) as RunningTotal
FROM dbo.BigTable
ORDER BY RowID```

Here is query plan for above query:

And here results:

```RowID       Value       RunningTotal
----------- ----------- --------------------
1           1           1
2           2           3
3           3           6
4           4           10
99994       99994       4999450015
99995       99995       4999550010
99996       99996       4999650006
99997       99997       4999750003
99998       99998       4999850001
99999       99999       4999950000
100000      100000      5000050000

(100000 row(s) affected)```

In this case the results are correct as Compute Scalar was performed after the parallel streams were garthered.
In case the computation of scalar running total value occured prior the parallel streams are garthered, then again an issue that rows are not processed in corect order may occur.
To avoid this kind of problems we can force serial query plan by setting max degree of parallelism for the query using the `OPTION(MAXDOP 1)`

```SELECT
RowID
,Value
,[dbo].fn_RunningTotalBigInt(Value, 0, ROW_NUMBER() OVER(ORDER BY RowID), null) as RunningTotal
FROM dbo.BigTable
ORDER BY RowID
OPTION(MAXDOP 1)```

This query will always produce a serial plan. In this case following:

Final form of Running Totals

From the above we can construct a final form of the Running Totals functions for each numeric data type. The final form can look like following class;

```using System;
using Microsoft.SqlServer.Server;
using System.Runtime.Remoting.Messaging;
using System.Data.SqlTypes;
using System.Collections.Generic;

/// <summary>
/// Class contains CLR scalar functions for calculation of running totals
/// </summary>
public class RunningTotals
{
/// <summary>
/// Storage Structure for holding actual Total and row number for security check.
/// </summary>
/// <typeparam name="T">Totals Data Type</typeparam>
private struct RtStorage<T> where T : struct
{
public T Total;
public int RowNo;
}

/// <summary>
/// Calculates a running totals on TinyInt (byte) data type
/// </summary>
/// <param name="val">Value of current row</param>
/// <param name="id">ID of the function in single query</param>
/// <param name="rowNo">Specifies expecter rowNo. It is for security check to ensure correctness of running totals</param>
/// <param name="nullValue">Value to be used for NULL values</param>
/// <returns>SqlByte representing running total</returns>
[SqlFunction(IsDeterministic = true)]
public static SqlByte RunningTotalTinyInt(SqlByte val, SqlByte id, int rowNo, SqlByte nullValue)
{
string dataName = string.Format("MultiSqlRt_{0}", id.IsNull ? 0 : id.Value);

object lastSum = CallContext.GetData(dataName);

var storage = lastSum != null ? (RtStorage<SqlByte>)lastSum : new RtStorage<SqlByte>();
storage.RowNo++;

if (storage.RowNo != rowNo)
throw new System.InvalidOperationException(string.Format("Rows were processed out of expected order. Expected RowNo: {0}, received RowNo: {1}", storage.RowNo, rowNo));

if (!val.IsNull)
storage.Total = storage.Total.IsNull ? val : storage.Total + val;
else
storage.Total = storage.Total.IsNull ? nullValue : (nullValue.IsNull ? storage.Total : storage.Total + nullValue);

CallContext.SetData(dataName, storage);

return storage.Total;
}

/// <summary>
/// Calculates a running totals on SmallInt (Int) data type
/// </summary>
/// <param name="val">Value of current row</param>
/// <param name="id">ID of the function in single query</param>
/// <param name="rowNo">Specifies expecter rowNo. It is for security check to ensure correctness of running totals</param>
/// <param name="nullValue">Value to be used for NULL values</param>
/// <returns>SqlInt16 representing running total</returns>
[SqlFunction(IsDeterministic = true)]
public static SqlInt16 RunningTotalSmallInt(SqlInt16 val, SqlByte id, int rowNo, SqlInt16 nullValue)
{
string dataName = string.Format("MultiSqlRt_{0}", id.IsNull ? 0 : id.Value);

object lastSum = CallContext.GetData(dataName);

var storage = lastSum != null ? (RtStorage<SqlInt16>)lastSum : new RtStorage<SqlInt16>();
storage.RowNo++;

if (storage.RowNo != rowNo)
throw new System.InvalidOperationException(string.Format("Rows were processed out of expected order. Expected RowNo: {0}, received RowNo: {1}", storage.RowNo, rowNo));

if (!val.IsNull)
storage.Total = storage.Total.IsNull ? val : storage.Total + val;
else
storage.Total = storage.Total.IsNull ? nullValue : (nullValue.IsNull ? storage.Total : storage.Total + nullValue);

CallContext.SetData(dataName, storage);

return storage.Total;
}

/// <summary>
/// Calculates a running totals on Int (Int32) data type
/// </summary>
/// <param name="val">Value of current row</param>
/// <param name="id">ID of the function in single query</param>
/// <param name="rowNo">Specifies expecter rowNo. It is for security check to ensure correctness of running totals</param>
/// <param name="nullValue">Value to be used for NULL values</param>
/// <returns>SqlInt32 representing running total</returns>
[SqlFunction(IsDeterministic = true)]
public static SqlInt32 RunningTotalInt(SqlInt32 val, SqlByte id, int rowNo, SqlInt32 nullValue)
{
string dataName = string.Format("MultiSqlRt_{0}", id.IsNull ? 0 : id.Value);

object lastSum = CallContext.GetData(dataName);

var storage = lastSum != null ? (RtStorage<SqlInt32>)lastSum : new RtStorage<SqlInt32>();
storage.RowNo++;

if (storage.RowNo != rowNo)
throw new System.InvalidOperationException(string.Format("Rows were processed out of expected order. Expected RowNo: {0}, received RowNo: {1}", storage.RowNo, rowNo));

if (!val.IsNull)
storage.Total = storage.Total.IsNull ? val : storage.Total + val;
else
storage.Total = storage.Total.IsNull ? nullValue : (nullValue.IsNull ? storage.Total : storage.Total + nullValue);

CallContext.SetData(dataName, storage);

return storage.Total;
}

/// <summary>
/// Calculates a running totals on BigInt (Int64) data type
/// </summary>
/// <param name="val">Value of current row</param>
/// <param name="id">ID of the function in single query</param>
/// <param name="rowNo">Specifies expecter rowNo. It is for security check to ensure correctness of running totals</param>
/// <param name="nullValue">Value to be used for NULL values</param>
/// <returns>SqlInt64 representing running total</returns>
[SqlFunction(IsDeterministic = true)]
public static SqlInt64 RunningTotalBigInt(SqlInt64 val, SqlByte id, int rowNo, SqlInt64 nullValue)
{
string dataName = string.Format("MultiSqlRt_{0}", id.IsNull ? 0 : id.Value);

object lastSum = CallContext.GetData(dataName);

var storage = lastSum != null ? (RtStorage<SqlInt64>)lastSum : new RtStorage<SqlInt64>();
storage.RowNo++;

if (storage.RowNo != rowNo)
throw new System.InvalidOperationException(string.Format("Rows were processed out of expected order. Expected RowNo: {0}, received RowNo: {1}", storage.RowNo, rowNo));

if (!val.IsNull)
storage.Total = storage.Total.IsNull ? val : storage.Total + val;
else
storage.Total = storage.Total.IsNull ? nullValue : (nullValue.IsNull ? storage.Total : storage.Total + nullValue);

CallContext.SetData(dataName, storage);

return storage.Total;
}

/// <summary>
/// Calculates a running totals on Float (Double) data type
/// </summary>
/// <param name="val">Value of current row</param>
/// <param name="id">ID of the function in single query</param>
/// <param name="rowNo">Specifies expecter rowNo. It is for security check to ensure correctness of running totals</param>
/// <param name="nullValue">Value to be used for NULL values</param>
/// <returns>SqlDouble representing running total</returns>
[SqlFunction(IsDeterministic = true)]
public static SqlDouble RunningTotalFloat(SqlDouble val, SqlByte id, int rowNo, SqlDouble nullValue)
{
string dataName = string.Format("MultiSqlRt_{0}", id.IsNull ? 0 : id.Value);

object lastSum = CallContext.GetData(dataName);

var storage = lastSum != null ? (RtStorage<SqlDouble>)lastSum : new RtStorage<SqlDouble>();
storage.RowNo++;

if (storage.RowNo != rowNo)
throw new System.InvalidOperationException(string.Format("Rows were processed out of expected order. Expected RowNo: {0}, received RowNo: {1}", storage.RowNo, rowNo));

if (!val.IsNull)
storage.Total = storage.Total.IsNull ? val : storage.Total + val;
else
storage.Total = storage.Total.IsNull ? nullValue : (nullValue.IsNull ? storage.Total : storage.Total + nullValue);

CallContext.SetData(dataName, storage);

return storage.Total;
}

/// <summary>
/// Calculates a running totals on Real (Single) data type
/// </summary>
/// <param name="val">Value of current row</param>
/// <param name="id">ID of the function in single query</param>
/// <param name="rowNo">Specifies expecter rowNo. It is for security check to ensure correctness of running totals</param>
/// <param name="nullValue">Value to be used for NULL values</param>
/// <returns>SqlSingle representing running total</returns>
[SqlFunction(IsDeterministic = true)]
public static SqlSingle RunningTotalReal(SqlSingle val, SqlByte id, int rowNo, SqlSingle nullValue)
{
string dataName = string.Format("MultiSqlRt_{0}", id.IsNull ? 0 : id.Value);

object lastSum = CallContext.GetData(dataName);

var storage = lastSum != null ? (RtStorage<SqlSingle>)lastSum : new RtStorage<SqlSingle>();
storage.RowNo++;

if (storage.RowNo != rowNo)
throw new System.InvalidOperationException(string.Format("Rows were processed out of expected order. Expected RowNo: {0}, received RowNo: {1}", storage.RowNo, rowNo));

if (!val.IsNull)
storage.Total = storage.Total.IsNull ? val : storage.Total + val;
else
storage.Total = storage.Total.IsNull ? nullValue : (nullValue.IsNull ? storage.Total : storage.Total + nullValue);

CallContext.SetData(dataName, storage);

return storage.Total;
}

/// <summary>
/// Calculates a running totals on Money data type
/// </summary>
/// <param name="val">Value of current row</param>
/// <param name="id">ID of the function in single query</param>
/// <param name="rowNo">Specifies expecter rowNo. It is for security check to ensure correctness of running totals</param>
/// <param name="nullValue">Value to be used for NULL values</param>
/// <returns>SqlMoney representing running total</returns>
[SqlFunction(IsDeterministic = true)]
public static SqlMoney RunningTotalMoney(SqlMoney val, SqlByte id, int rowNo, SqlMoney nullValue)
{
string dataName = string.Format("MultiSqlRt_{0}", id.IsNull ? 0 : id.Value);

object lastSum = CallContext.GetData(dataName);

var storage = lastSum != null ? (RtStorage<SqlMoney>)lastSum : new RtStorage<SqlMoney>();
storage.RowNo++;

if (storage.RowNo != rowNo)
throw new System.InvalidOperationException(string.Format("Rows were processed out of expected order. Expected RowNo: {0}, received RowNo: {1}", storage.RowNo, rowNo));

if (!val.IsNull)
storage.Total = storage.Total.IsNull ? val : storage.Total + val;
else
storage.Total = storage.Total.IsNull ? nullValue : (nullValue.IsNull ? storage.Total : storage.Total + nullValue);

CallContext.SetData(dataName, storage);

return storage.Total;
}

/// <summary>
/// Calculates a running totals on Decimal data type
/// </summary>
/// <param name="val">Value of current row</param>
/// <param name="id">ID of the function in single query</param>
/// <param name="rowNo">Specifies expecter rowNo. It is for security check to ensure correctness of running totals</param>
/// <param name="nullValue">Value to be used for NULL values</param>
/// <returns>SqlDecimal representing running total</returns>
[SqlFunction(IsDeterministic = true)]
public static SqlDecimal RunningTotalDecimal(SqlDecimal val, SqlByte id, int rowNo, SqlDecimal nullValue)
{
string dataName = string.Format("MultiSqlRt_{0}", id.IsNull ? 0 : id.Value);

object lastSum = CallContext.GetData(dataName);

var storage = lastSum != null ? (RtStorage<SqlDecimal>)lastSum : new RtStorage<SqlDecimal>();
storage.RowNo++;

if (storage.RowNo != rowNo)
throw new System.InvalidOperationException(string.Format("Rows were processed out of expected order. Expected RowNo: {0}, received RowNo: {1}", storage.RowNo, rowNo));

if (!val.IsNull)
storage.Total = storage.Total.IsNull ? val : storage.Total + val;
else
storage.Total = storage.Total.IsNull ? nullValue : (nullValue.IsNull ? storage.Total : storage.Total + nullValue);

CallContext.SetData(dataName, storage);

return storage.Total;
}
}```

Conclusion

As we have tested and seen, the CLR version of calculation running totals produces very simple query plans and is one of the fastest methods to calculate running totals.

The CLR functions are really simple and fast.

On one side the CLR solution for running totals is one of the fastest, but on the other side we have to remember that this solution requires, that records are processed in order in which the running totals should be calculated. However the Query Optimizer and Query Processor doesn’t guarantee, that the records will be processed in particular order. It only guarantees, that the final output will be sorced according the ORDER BY clause.

To ensure, that those function to not produce incorrect results a security check in the form of rowNo generated by ROW_NUMBER() function needs to be part of the solution an in case the solution could produce incorrect results, it raises an exception.

So I would like to thank gain to Paul White MVP (blog) for his notes.

Remarks

You can download a complete Visual C# project + SQL Scripts (script for generating sample database, script for creating CLR objects + script with test queries) from my SkyDrive here: SqlClrTotals_Updated.zip

30 thoughts on “SQL Server and fastest running totals using CLR – Updated”

1. lordsandwich says:

Excellent article and code, thanks for the tips!

2. Dariusz says:

Hi there. A nice article, indeed. I have implemented your functions on my server. One question, though. The function that deals with decimals… it only works for decimal(18,0), that is, anything that has precision > 0 gets truncated. How do I allow for decimal(18,2)? I’d be more than happy if I could get an answer quickly… please?

3. Dariusz says:

Ok, I have found the solution, and that was not easy at all. To get decimal(18,2) as a parameter into the function you have to decorate it like this:

[Microsoft.SqlServer.Server.SqlFacet(Precision = 18, Scale = 2)]SqlDecimal val

and to be able to obtain this type from the function you have to decorate the function like this:

[SqlFunction(IsDeterministic = true, Name = “ufn_RunningTotalForDecimal”)]
[return: Microsoft.SqlServer.Server.SqlFacet(Precision = 18, Scale = 2)]
public static SqlDecimal RunningTotalsDecimal(…

It was not easy to find, phew… Hope it’ll help ppl.

Best,
Darek

• Hi,

you do not need to define this in the CLR code. Since the parameters of the CLR function in the C# code is decimal, you can create several SQL Server CLR functions to point into a single CLR method in assembly.

So you have a single CLR method in assembly

```    [SqlFunction(IsDeterministic = true)]
public static SqlDecimal RunningTotalDecimal(SqlDecimal val, SqlByte id, SqlDecimal nullValue)
{
...
...
```

and several SQL Server CLR functions pointing to it as any SQL decimal declaration is represented by a C# decimal and/or SqlDecimal data type

```CREATE FUNCTION [ClrUnsafe].[fn_RunningTotalDecimal_18_0](
@val [decimal](18, 0),
@id [tinyint],
@nullValue [decimal](18, 0) = NULL
)
RETURNS [decimal](18, 0)
WITH EXECUTE AS CALLER
AS
EXTERNAL NAME [SQLCLR].[RunningTotals].[RunningTotalDecimal]
GO

CREATE FUNCTION [ClrUnsafe].[fn_RunningTotalDecimal_18_4](
@val [decimal](18, 4),
@id [tinyint],
@nullValue [decimal](18, 4) = NULL
)
RETURNS [decimal](18, 4)
WITH EXECUTE AS CALLER
AS
EXTERNAL NAME [SQLCLR].[RunningTotals].[RunningTotalDecimal]
GO
```
• Paul McCoy says:

Thanks for not being unsatisfied/curious enough to investigate improving running totals calculations for all of us pre-2012 admins!!

For anyone interested, I was able to quickly modify your assembly/function id parameter in order to allow for a partitioned running totals calculation. For example, SqlInt32 ([int]) instead of SqlByte ([tinyint]) to accommodate a potentially large number of partitions. After that, just modify the call to the function in the query in a way such as this:

SELECT
RowID,
PartitionID,
Value,
dbo.fn_RunningTotalBigInt(
Value,
PartitionID,
ROW_NUMBER() OVER(PARTITION BY PartitionID ORDER BY RowID),
null) as RunningTotal
FROM dbo.SmallTable
ORDER BY PartitionID, RowID

4. Dan Martini says:

This is amazing work Pavel, thank you for sharing it.

I have a specific use for these functions in an accounting application I am writing and I have been working with a much simpler version I believe you wrote some time ago. I will update my code with your new functions and will feel more secure now.

But I have a problem (maybe more than one). I am using Entity Framework and .NET 4.0 and once I registered the function with the data model, I was able to call it with simple ESQL SELECT statements and it worked just fine.

The main problem is that while I can for example get the running total for an entire bank account, if I try to only select a subset of the account like all records between a date range (ex BETWEEN ‘2/1/2011’ AND ‘3/1/2011’), obviously it will give me a subtotal only for that subset and not include the records prior to 2/1/2011 to give me the running total accurate for that window of time.

I tried things like putting your function inside of a CTE so that the CTE was essentially ‘SELECT Col1, col2, RunningTotal(Amount) FROM Mytable and then query the outer CTE with the date range, but the result was the same. It started the total from 2/1/2011 and accumulated from there. I have tried all sorts of different things and nothing I am doing is working. I do not have nearly your familiarity or skill with SQL Server as you do and most solutions I am trying are just groping from Google searches,

In essence, people will be auditing bank accounts with thousands of entries from the opening balance of the account to the present day. To find a problem entry, the system they have been using allowed them to enter a date range and it would return the results within that date range with the running total accurately represented. However, the previous developers of the existing system achieved this by querying the table into a temporary table and iterating over it with a cursor to create the running total. Now you understand why I am rewriting it: it is incredible slow and resource intensive because there are hundreds or perhaps thousands of accounts in the DB,

Any clue how I might make this work much faster and give the user the ‘slice’ of the account data with correct running balance total on those records?

The other problem is that, as I mentioned, I am using Entity Framework. So while I can make nice clean ESQL DML calls, I don’t know how I will execute the equivalent of OPTION(MAXDOP 1, FORCE ORDER) since I don’t think OPTION() is a valid ESQL function.

I am desperate to avoid cursors and temp tables and would appreciate any guidance.

Thanks,

Dan Martini

• Hi, to achieve what you want, you can simply add the accumulated amount value from the period prior your BETWEEN, as in the sample here:

```CREATE TABLE #test (
ID int NOT NULL IDENTITY(1,1),
TranTMS datetime NOT NULL,
Amount money
)
GO
INSERT INTO #test(TranTMS, Amount)
SELECT '2011/01/01 12:10:30', 10 UNION ALL
SELECT '2011/01/01 12:20:00', -5 UNION ALL
SELECT '2011/01/02 1:13:00', 100 UNION ALL
SELECT '2011/01/02 14:25:00', 500 UNION ALL
SELECT '2011/01/03 05:47:00', 10 UNION ALL
SELECT '2011/01/04 09:50:00', -160 UNION ALL
SELECT '2011/01/05 16:39:00', 70 UNION ALL
SELECT '2011/01/05 18:07:00', 30 UNION ALL
SELECT '2011/01/05 09:06:00', -75 UNION ALL
SELECT '2011/01/09 10:23:00', 1000
GO

SELECT
ID,
TranTMS,
Amount,
ClrUnsafe.fn_RunningTotalMoney(Amount, 1, NULL) AS RunningTotal
FROM #test
ORDER BY TranTMS
OPTION(MAXDOP 1,FORCE ORDER)
GO

--Accumulated value from time period before the records we want
SELECT
ID,
TranTMS,
Amount,
(SELECT SUM(Amount) FROM #test WHERE TranTMS < '2011/01/02') +ClrUnsafe.fn_RunningTotalMoney(Amount, 1, NULL) AS RunningTotal
FROM #test
WHERE TranTMS BETWEEN '2011/01/02' AND '2011/01/05'
ORDER BY TranTMS
OPTION(MAXDOP 1,FORCE ORDER)
GO

DROP TABLE #test
GO
```

Related to the OPTION(MAXDOP 1, FORCE ORDER) and Entity Framework. I didn’t worked much with the EntityFramework, and don’t now whether it is possible to pass this option using the Entity Framework, but as you could see, this option are essential for correct work.</p
I can only suggest to encapsulate the T-SQL code in the a stored procedure with parameters for the interval, and call this stored proc. Then the OPTION will be inside the StoredProc and everything should execute correctly.

5. Dan Martini says:

While I am thinking of it, and this is probably an ignorant question: Using your final version of the CLR code, I am assuming that this will work safely in an environment where many users (up to a few hundred concurrent) will be continuously hitting the bank account register, paging back and forth, performing a lot of data entry etc. I ask because I am not not completely comfortable with CLR SQL integration and don’t really know all that WITH PERMISSION_SET = UNSAFE implies.

Being a financial application, you can understand that accuracy is paramount and speed is a close second. Thanks again for your help and the amazing code. No other solution I have tried in the past 6+ months comes even close to the stunning performance of your code.

• Hi,

related to the WITH PERMISSION_SET=UNSAFE you can reed on MSDN – CREATE ASSEMBLY. If the assembly is only your code and you know what is inside, you do not have to bother about the UNSAFE. The UNSAFE is necessary because of use of the CallContext class.

Related to many users usage, this depends on your severs, but the CallContext used to store intermediate values will not cause problems and each user request is processed in separate thread and the CallContext doesn’t interfere with other threads. You can read more again on MSDN – CallContext Class.

• Dan Martini says:

Hi Pavel,

Thank you for both answers and the guidance. I will try working with the code that way and see if I can achieve what I need, and your example looks very promising. I just need to set up a realistic size set of data and also compile and install this latest code of yours. I will also look further into EF material and see if I can find any way to call with OPTION(MAXDOP 1, FORCE ORDER), but it is good to have the stored procedure idea as a backup if I can’t do it in EF.

I don’t know how, but the following code worked even on a table with a half million rows when i tried it a couple of months ago as a test. It was a simple SELECT against a table with a UNIQUEIDENTIFIER key (no date filtering like my last example). It may have broken down somewhere along the way and I didn’t notice it, but it appeared correct.

```using System;
using System.Runtime.Remoting.Messaging;

namespace RunningTotal {
public static class RunningTotalUtils {
public static decimal RunningTotal(decimal currentValue) {
object _lastTotal = CallContext.GetData("runningTotal");
decimal lastTotal = _lastTotal == null ?
0 : Convert.ToDecimal(_lastTotal);
lastTotal += currentValue;
CallContext.SetData("runningTotal", lastTotal);
return lastTotal;
}
}
}
```

In any case, I completely trust what you are saying that it requires OPTION(MAXDOP 1, FORCE ORDER) to be completely safe and I will be using the new assembly of functions you wrote and try to work through the problems. Thanks again.

• I only suggest to use `SqlDecimal currentValue` instead of `decimal currentValue`

You can check the `currentValue.IsNull` as in my article above to process accordingly eventual `NULL` values.

The `OPTION(MAXDOP 1, FORCE ORDER)` is essential for correct working. As if the the query is processed by paralled plan, you will receive wrong results, as the query will be processed by parallel threads. Therefore the MAXDOP (maximum degree of parallelism) must be set to 1 to force serial plan. The same is for the FORCE ORDER to preserve the correct order in query plan. Anyway you should inspect your query plan, that the compute scalar for the running total occurs after the sort operation.

6. Dan Martini says:

Wow. Seriously, wow. I used your new code and the examples, particularly the filtered ones and while I have only run a couple of moderately complex queries against a large dataset, I have to say this is a beautiful thing. It works better than I could have hoped. I am very grateful Pavel.

I haven’t tried working with EF yet, and I don’t have high hopes for that, but that doesn’t bother me because I can get something working with either stored procedures or dynamic SQL and that is what counts.

Thank you very much.

7. Anthony Ackah says:

Thanks very much for such a wonderful code.
I’ve been trying to implement it in my database but seem to be having a teething problem:

The code below:

```--==========================================
-- Create assembly
--==========================================
CREATE ASSEMBLY [SqlClrTotals]
AUTHORIZATION [digitel]
FROM 'C:Program FilesHelpclsRunningTotalclsRunningTotalbinReleaseclsRunningTotal.dll'
WITH PERMISSION_SET = UNSAFE
GO

--==========================================
-- Create testing function
--==========================================
CREATE FUNCTION [dbo].[TestRunningTotalsMoney](
@value Money
)
RETURNS Money
AS
EXTERNAL NAME [SqlClrTotals].RunningTotalsTest.RunningTotalsMoney
GO
```

gives me sql error message “CREATE FUNCTION failed because parameter counts do not match.”

• Hi, if you use the final form of the CLR function, then i has 3 parameters.

1. Value to be added to the running total
2. ID of the function within a statement to allow multiple running totals in single select. If only one running total is calculated, you can put whatever value for that param
3. Value representing NULL. If NULL is passed, then the NULL values are ignored and counted as 0. Othervice if value is provided, then NULL values will be counted as the value provided

So you have modify the CREATE FUNCTION statement to be corect:

```--==========================================
-- Create testing function
--==========================================
CREATE FUNCTION [dbo].[TestRunningTotalsMoney](
@value Money,
@id tinyint,
@nullValue Money
)
RETURNS Money
AS
EXTERNAL NAME [SqlClrTotals].RunningTotalsTest.RunningTotalsMoney
GO
```
8. Anthony Ackah says:

Thanks a million. It works like a miracle from heaven. I used it to create a view and it worked perfectly.

• Hi, when using it in a view, remember on the MAXDOP and the ORDER of the operations in final query plan. As when you use the view subsequently in joins or something, the plan can change drastically.

9. Anthony Ackah says:

Thanks. I am producing a statement of account for different accounts in a transaction table. When I use the statement to select transactions for a specific account, I get the desired results. I’m also able to calculate previous transactions and add it as brought forward, as you advised Dan Martini. But I need to print statements for multiple accounts at a time, using crystal report to group the report by account code.

When I execute the code below in a query, the runningTotal continues through out and does not restart when the account code changes:

```SELECT AccountCode, AccountName, mDebit, mCredit, TransDate, NetBalance,
dbo.TestRunningTotalsMoney(NetBalance, 1, NULL) AS RunningTotal
FROM dbo.vwAccountStatement
Where LedgerID  = 19 AND TransDate Between '2010/09/01' AND '2011/07/20'
Order By AccountCode, TransDate, TransID
OPTION(MAXDOP 1,FORCE ORDER)
```

Result is:

```AccountCode    mDebit    mCredit    TransDate     NetBalance    RunningTotal
16001         7653.00    0.00       2010-09-23      -7653.00       -7653.00
16002          -55.50    0.00       2010-09-23         55.50       -7597.50
16002          100.00    0.00       2010-09-25       -100.00       -7697.50
16003          193.98    0.00       2010-09-23       -193.98       -7891.48
16004            0.00    7.50       2011-02-01          7.50       -7883.98
16004         4000.00    0.00       2011-02-05      -4000.00      -11883.98
16004            0.00    4.30       2011-07-17          4.30      -11879.68
16004            0.00    3.50       2011-07-17          3.50      -11876.18
16004            0.00    9.00       2011-07-17          9.00      -11867.18
16006        97397.98    0.00       2010-09-23     -97397.98     -109265.16
```

How can I achieve this? Is it also possible to add up previous balances to the running totals as the Account code change in the listing, as I’m able to do for a single account?

• You can add a parameter AccountCode parameter to the CLR function for calculating running total.

Then using the `CallContext.SetData` and `CallContext.GetData` you can store the Account Code value between calls to the Running Total function. Inside the function you will detect the change of the AccountCode and once the account code changes, you simply reset the running total to 0 prior adding the current amount value to the Running Total.

But generally if you want the calculation of running total on report, most reporting systems support direct running totals calculation including restarting on beginning/end of groups

• Jacob Howarth says:

I’ve been trying to do something similar, but for some reason, my code doesn’t seem to store the field changes using the `CallContext` class. I know this post is old, but I can’t figure out what I’m doing wrong and I don’t have a functioning SQL Server debugger, nor do I have an effective way to debug this.

My code is below, any insight into what I’m doing wrong is appreciated:
``` /// /// Calculates a running totals on Decimal data type based on transistion between a string and numeric field. /// /// Value of current row /// ID of the function in single query /// Specifies expecter rowNo. It is for security check to ensure correctness of running totals /// Value to be used for NULL values /// String field /// Numeric field /// SqlDecimal representing running total [SqlFunction(IsDeterministic = true)] public static SqlDecimal RunningTotalDecimalResetStringNumeric(SqlDecimal val, SqlByte id, int rowNo, SqlDecimal nullValue, SqlString field1, SqlDecimal field2) { string dataName = string.Format("MultiSqlRt_{0}", id.IsNull ? 0 : id.Value); string field1Name = string.Format("MultiSqlField1_{0}", id.IsNull ? 0 : id.Value); string field2Name = string.Format("MultiSqlField2_{0}", id.IsNull ? 0 : id.Value);```

``` object lastSum = CallContext.GetData(dataName); object field1Value = CallContext.GetData(field1Name); object field2Value = CallContext.GetData(field2Name); var storage = lastSum != null ? (RtStorage)lastSum : new RtStorage(); storage.RowNo++; var stringFieldStorage = field1Value != null ? (StringFieldStorage)field1Value : new StringFieldStorage(); var numericFieldStorage = field2Value != null ? (NumericFieldStorage)field2Value : new NumericFieldStorage(); if (storage.RowNo != rowNo) throw new System.InvalidOperationException(string.Format("Rows were processed out of expected order. Expected RowNo: {0}, received RowNo: {1}", storage.RowNo, rowNo)); if (stringFieldStorage.stringField != field1 || (stringFieldStorage.stringField == field1 && numericFieldStorage.numericField != field2)) { storage.Total = new SqlDecimal(0); stringFieldStorage.stringField = field1; numericFieldStorage.numericField = field2; } if (!val.IsNull) storage.Total = storage.Total.IsNull ? val : storage.Total + val; else storage.Total = storage.Total.IsNull ? nullValue : (nullValue.IsNull ? storage.Total : storage.Total + nullValue); CallContext.SetData(dataName, storage); CallContext.SetData(field1Name, stringFieldStorage); CallContext.SetData(field2Name, numericFieldStorage); ```

``` return storage.Total; } ```

• Hi don’t know what exactly is wrong with your code as I do not have a complete code available. By first look it should be working.

But if you simply take my solution which is working and add your own field and store the values in the existing `RTStorage` struct as additional string fields and then doing the comparison against those, it should work.

• Jacob Howarth says:

For whatever reason, it’s not working. Here’s how I have the `RTStorage` struct.

``` private struct RtStorage where T : struct { public T Total; public int RowNo; public SqlString strField; public SqlDecimal decField; } ```

The following changes were made to the `RunningTotalDecimal` function:

1) Copied your `RunningTotalDecimal` function and added the following parameters `SqlString field1, SqlDecimal field2`

2) Added this code after the exception throwing code for out of order row processing and before adding `SqlDecimal val` to the total

``` if (storage.strField != field1 || (storage.strField == field1 && storage.decField != field2)) { storage.Total = 0; storage.strField = field1; storage.decField = field2; } ```

I then reinstalled the assembly and added the following SQL scalar function

``` CREATE FUNCTION [dbo].[fn_RunningTotalDecimal_15_2_ResetStringNumeric]( @val [decimal](15, 2), @id [tinyint], @rowNo [int], @nullValue [decimal](15, 2) = NULL, @field1 [nvarchar](15), @field2 [numeric](18,0) ) RETURNS [decimal](15, 2) WITH EXECUTE AS CALLER AS EXTERNAL NAME [SqlClrRunningTotals].[RDSAPI.SQLClrRunningTotals.RunningTotals].[RunningTotalDecimalResetStringNumeric] GO ```

I then call it in my select statement like this
``` [dbo].fn_RunningTotalDecimal_15_2_ResetStringNumeric([USER_TB_TIME_TICKETS].[HRS_WORKED], 0, ROW_NUMBER() OVER(ORDER BY [USER_TB_TIME_TICKETS].[CUST_NO], [USER_TB_TIME_TICKETS].[JOB_NO], [USER_TB_TIME_TICKETS].[ACT_DAT], [USER_TB_TIME_TICKETS].[REF_NO]), 0, [USER_TB_TIME_TICKETS].[CUST_NO], [USER_TB_TIME_TICKETS].[JOB_NO]) ```

Please let me know if there’s anything else I can give or if you can point me to a resource that can help me solve this. I’ve posted it on stackoverflow and spiceworks with no luck.

• Jacob Howarth says:

The running total works exactly like its described in your post, I just don’t know why the use of `CallContext` can keep track of the previous sum, but not a string or decimal value?

• Roland says:

I was able to get it to partition by an additional field of type SqlString. Were you able to get your’s working, Jacob?

• Hi,

I was looking on your post on StackOverflow and I think, you can have a solution for your problem even without the CLR using the `ROW_NUMBER()` function.

When using the ROW_NUMBER() function, you have to specify at least the ORDER BY in the OVER clause of that function. What causes, that the table is being sorted, which can be an expensive operation.

However there is an easy workaround by using a SELECT in the ORDER BY of the OVER clause. You cannot use expressions in the ORDER BY, however a SELECT statement with constant can be part of the ORDER BY

```SELECT
ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS RowNumber,
*
FROM aTable
```

In this case no Sort operation will be performed on the table and if you are selecting records from the heap, then the records will come in the sequence as they were read from the table.

10. […] an addendum, I did test the CLR function offered by Pavel Pawlowski, and the performance characteristics were nearly identical to the SQL Server 2012 solution using […]

11. Joseph N. Musser II says:

This is almost perfect for pre-2012 servers. One huge problem with this approach is that filtering rows causes the rows out of order exception because SQL server simply doesn’t hand the filtered-out rows to the CLR function.

Is there any way to force SQL server to run the CLR function on all rows from a table before filtering so that the results are the same as sum() over (order by) on 2012?

• This is the pitfall of this approach. It is not always possible to guarantee the operators order in the query plan.

A workaround would be to do all the necessary filtering and store the requested rows into a temp table. Than as final step calculate the running totals by doing the select against the temp table.

12. […] I’m trying to create a SQL CLR function that performs a running total similar to what’s explained here: SQL Server and fastest running totals using CLR – Updated […]

13. Roland says:

So elegant.