Cleanup of Retention Window in Partioned Tables – sp_tblCleanupRetentionWindow

In my GitHub collection of SQL Scripts is now available a new stored procedure sp_tblCleanupRetentionWindow under the TablesManagement/Partitioning folder. Help for the function is available in the repository Wiki.

Scenario Overview

You have huge tables in the MS SQL Database and to be able to efectively manage the amount of data, you have partitioned the table.

As the amount of the data grows over the time, you start thinking about regular maintenance of such huge table and drop old not needed data.

You need to clean the oldest partitions and MERGE the partitions in the partitoin fuction to keep the number of partitions reasonable (especially if new partitions are created as new data are arriving).

To cleanup partitions prior SQL Server 2016 you have to perform partition switching to cleanup the partitions and then you could MERGE the partitions. On SQL Server 2016 and above the situation is simplified as you can use the TUNRCATE TABLE WITH PARTITIONS. But again to merge partition, you must put exact partition boundary values etc.

To automate regular maintenance you need to write sripts to handle all the needed tasks.

sp_tblCleanupRetentionWindow

As mentioned in the scenario above, the sp_tblCleanupRetentionWindow is the procedure which takes care about all the operations needed to cleanup the table prior specified retention window.

Parameters

ParameterNameData TypeDefaultDescription
@pfNamenvarchar(128)NULLName of the partition function. All associated partition schemes and tables will be cleaned according specified @retentionWindow
@retentionWindowsql_variantNULL
  • Specifies retention window
  • @retentionWindow must be of the dame exact data type as the partition fuction
  • All partitions prior the partition containging @retentionWindow value wil be cleared.If partitions contain data, partitions are first truncated.
  • All partitions prior partition containing the @retentionWindow will be merged into the first (leftmost) partition.
  • If @retentionWindow is part of the first (leftmost) partition, no cleanup happens.
@infoOnlybit1When 1 prints only information about affected tables and partitions and does not perform any cleanup. When 0 does the actual cleanup

How It Operates

Procedure takes @pfName parameter, which identifies partition function used by partitioned table(s). Then based on the @retentionWindow parameter it idetnfifies partition containing the @retentionWindow value.

The @retentionWindow parameter identifies the lowest value which must be kept in the table. All records prior that one can be cleared.

If the @retentionWindow value is part of the first (leftmost) partition, the procedure ends and does not perform any cleanup as there is nothing to cleanup. It cannot clean the partition containing the @retentionWindow value.

If the @retentionWindow is part of second and higher partitions, it scan for all Partitions Shemes using the @pfName partition function.

After identification of the partition schemes it identifies all the tables using the identified partitions shemes.

Once the tables are identified, it loops through all the tables and TRUNCATES all the partitions prior the partition containing the @retentionWindow value.

After the truncation the procedure ALTERS the parition function and merges all the partitions prior the partition containing the @retentionWindow value to the first (leftmost) partition.

It keeps always the first leftmost partition empty.

By the scrip the procedure is deployed into the [master] database and marked as system stored procedure. This allows simple call of the procedure from within all databases on the SQL Server instance and also thanks to that the procedure is running in the context of the database from within it is being called.

Sample

The sample below does cleanup fo all tables associated with the pf_partDate partition function. Cleanup and merge all partitions prior partition containign value of @retentionWindow = '2020-12-15'. pf_partDate is using the date data type.

If we take a closer look on the pf_partDate partition function by using the sp_HelpPartitionFunction we will find, it has multiple partitions and is used by multipel partition schemes and tables.

sp_HelpPartitionFunction 'pf_partDate', 1

Info Only

If we execute the procedure with @infoOnly=1 then only information about affected partitions and tables will be provided.

DECLARE @retentionWindow date = '2020-12-05'

EXEC sp_tblCleanupRetentionWindow
    @pfName             = 'pf_partDate'
    ,@retentionWindow   = @retentionWindow
    ,@infoOnly          = 1

Info Only Output

Below is sample output of the procedure which informs about actions which will be performed if the @infoOnly would be set to 0.

sp_tblCleanupRetentionWindow v0.10 (2021-05-07) (C) 2017-2021 Pavel Pawlowski
=============================================================================
Cleans retention window for all tables associated with partition function

Feedback mail to: pavel.pawlowski@hotmail.cz
Repository:       https://github.com/PavelPawlowski/SQL-Scripts
-----------------------------------------------------------------------------
Cleaning retention window for partition function [pf_partDate]
 
Partitions to Cleanup:
----------------------
[          ]    [x] <  [2020-11-01]
[2020-11-01] <= [x] <  [2020-12-01]
 
Affected Partition Schemes And Tables:
--------------------------------------
[ps_date1]
  - [dbo].[tblDate1]
  - [dbo].[tblDate11]
[ps_date2]
  - [dbo].[tblDate2]

From the output we can see, that there are two partitions which would be affected. First partition containing data prior 2020-10-01 and second partition containing data between the 2020-10-01 and 2020-11-01.

Also it informs, that there are two partitions schemes. Partition scheme [ps_date1] which is used by tables [dbo].[tblDate1] and [dbo].[tblDate11] and partition scheme [ps_date2] which is used by table [dbo].[tblDate2].

Cleanup

If we execute the procedure with @infoOnly=0 then information about affected partitions and tables will be provided as well as actual cleanup will be done.

DECLARE @retentionWindow date = '2020-12-05'

EXEC sp_tblCleanupRetentionWindow
    @pfName             = 'pf_partDate'
    ,@retentionWindow   = @retentionWindow
    ,@infoOnly          = 0

Cleanup output

sp_tblCleanupRetentionWindow v0.10 (2021-05-07) (C) 2017-2021 Pavel Pawlowski
=============================================================================
Cleans retention window for all tables associated with partition function

Feedback mail to: pavel.pawlowski@hotmail.cz
Repository:       https://github.com/PavelPawlowski/SQL-Scripts
-----------------------------------------------------------------------------
Cleaning retention window for partition function [pf_partDate]
 
Partitions to Cleanup:
----------------------
[          ]    [x] <  [2020-11-01]
[2020-11-01] <= [x] <  [2020-12-01]
 
Affected Partition Schemes And Tables:
--------------------------------------
[ps_date1]
  - [dbo].[tblDate1]
  - [dbo].[tblDate11]
[ps_date2]
  - [dbo].[tblDate2]
 
CLEANUP PROCESS
---------------------------------------------------------------
2021-05-18 23:54:11.4600180 +02:00 - Starting CLEANUP Process
2021-05-18 23:54:11.4610182 +02:00 - [dbo].[tblDate2] PARTITION 1 not empty. Starting TRUNCATE
2021-05-18 23:54:11.4620186 +02:00 - [dbo].[tblDate2] PARTITION 1 TRUNCATE completed
2021-05-18 23:54:11.4710215 +02:00 - [dbo].[tblDate11] PARTITION for range [2020-11-01] <= [x] <  [2020-12-01] not empty. Starting TRUNCATE
2021-05-18 23:54:11.4760227 +02:00 - [dbo].[tblDate11] PARTITION for range [2020-11-01] <= [x] <  [2020-12-01] TRUNCATE completed
2021-05-18 23:54:11.4770230 +02:00 - [pf_partDate] range [2020-11-01] <= [x] <  [2020-12-01] : Start MERGE into PARTITION 1
2021-05-18 23:54:11.4790237 +02:00 - [pf_partDate] range [2020-11-01] <= [x] <  [2020-12-01] : MERGE into PARTITION 1 completed. New range of PARTITION 1: [x] <  [2020-12-01]
2021-05-18 23:54:11.4790237 +02:00 - CLEANUP Process COMPLETED

Compared to the @infoOnly=1 output we see additional section “CLEANUP PROCESS’, which informs about the actual cleanup steps performed.

Procedure identified, that partition 1 of the [dbo].[tblDate2] was not empty and therefore performed TRUNCATE of the partition 1. Then it identified that the table [dbo].[tblDate11] has data in partition covering range between dates 2020-11-01 and 2020-12-01, therefore performed also truncate of that partitions. As there is no information about any truncate of the table [dbo].[tblDate1] it means the table had both affected partitions empty. The same is related to the other afected partitions of the [dbo].[tblDate2] and [dbo].[tblDate11].

After the partition cleanup it started the multi-step partition merge process. It merged each affected partition into the left most partition and informs about the new range of the leftmost partition 1.

If we again utilize the sp_HelpPartitionFunction we can see, that the partitions were properly cleared.

Azure support

The procedure supports also execution in Azure SQL Databases as well as Instances or Azure Synapse Analytics.

When deploying to Azure SQL Database or Synapse Analytics, comment-out the unsuported USE statement or ignore the eventual error as stated in the header of the procedure source file.

/* *****************************************************************************************
                                      AZURE SQL DB Notice

   Comment-out the unsupported USE [master] when running in Azure SQL DB/Synapse Analytics
   or ignore error caused by unsupported USE statement
******************************************************************************************** */

USE [master]
GO

SQL Server Version Limitation

Because the procedure internally is using TRUNCATE TABLE WITH PARTITION , procedure is limited only to the SQL Server 2016 and above or Azure SQL Database, Azure Managed Instance and Azure Synapse Analytics.

Summary

The sp_tblCleanupRetentionWindow heavily simplifies the management of huge partitioned table and moves the complete maintenance of retention window into a single call of a stored procedure. Hopefully you will find the procedure usefull.

Advertisement

SQL Server 2016 Mobile Reports – Issue with Time Navigator for dynamic query parameters

SQL Server 2016 brings new kind of reports – Mobile Reports which are DataZen product integrated into SQL Server 2016.

It brings many new possibilities but also some pitfalls a user must be aware when developing reports.

Time Navigator Parameters Problem

One of those pitfalls is a Time Navigator if you want to use it for passing dynamic query parameters. It works correctly if you have a local time zone set to UTC (but nobody is using that time zone). The problem appears if you start to use it with different local time zones and need to pass the selected date ranges to Shared Dataset for dynamic query processing.

The problem is that Time Navigator is providing the SelectedStartTime, SelectedEndTime, ViewportStartTime and ViewportEndTime as UTC Date/Time Strings. This means in the yyyy-MM-ddThh:mm:ss.fffZ format eg. 2016-01-01T00:00:00.000Z. This itself would not be a problem but problem is that the report client is automatically shifting the selected range by the current user Time Zone offset including the daylight saving time at the selected date.

If your current time zone would be UTC where everything works properly then For example if you select in the Time Navigator a date 2016-01-01, the SelectedStartTime will provide 2016-01-01T00:00:0Z and SelectedEndTime will provide 2016-01-02T00:00:00.000Z. This is OK as you proces that information on the Server as interval <2016-01-0100:00:00Z; 2016-01-02T00:00:00.000Z).

However if you are in different time zone, as it was mentioned above, the provided stamps are shifted by the current time zone including a daylight saving. This means if you would be in UTC+1 the SelectedStartTime will provide 2015-12-31T23:00:00Z and SelectedEndTime will provide 2016-01-01T23:00:00Z.

In case of UTC-6 SelectedStartTime will provide 2016-01-01T06:00:00Z and SelectedEndTime will provide 2016-01-02T06:00:00Z.

As you can see, in case you are developing a report for users within single time zone, this would not be a big problem as you can deal with that. However if you are developing reports for enterprise users and the requests are arriving from different time zones, you always receive different time stamps. What more, in the time stamp there is no information about the originating time zone as the parameter is sent as plain string. The biggest issue of that is that because of that time shift handled on the user side, you may receive different start and end dates from what user selected.

Problem Demo

Let’s create an easy demonstration of that problem. Create a new Mobile Report, place a Time Navigator on the surface and in the Time intervals include Years, Months, Days.

mobilereport_timenavigator1

Then we need to create a testing Data Set on the Server. Let’s create a very simple Data Set, which will simply return the passed Start Date and End Date back to the client including the information about the parameter.

DECLARE @type sql_variant = @StartDate;
SELECT
    @StartDate AS StartDate
   ,@EndDate AS EndDate
   ,@Unit AS TimeUnit
   ,SQL_VARIANT_PROPERTY(@type, 'BaseType') AS StartDateDataType

We are adding the @type sql_variant to see what kind of data type is arriving to SQL Server. As we are creating a parameterized Data Set which will be used by Mobile Report, you have to set a default values for the parameters.

mobilereport_timenavigatorissue_datasetmobilereport_timenavigatorissue_datasetproperties

Once we have the Data Set ready, add it into the Mobile Report and Set the Parameters Binding.

mobilereport_timenavigatorissue_dataset2

MobileReport_TimeNavigatorIssue_DataSet3.png

After that we can add a Data Grid to the design surface which will show us the passed parameters.

MobileReport_TimeNavigatorIssue_DataGrid1.png

And Bind it to our parameterized Data Set

mobilereport_timenavigatorissue_datagrid2

Testing the Parameters

Once we have the testing report ready, we can start playing with it to demonstrate the problem.

UTC

First let’s check how it looks when the report is being run in the (UTC) Coordinated Universal Time.

Head to the Settings and se the UTC Time Zone. Also note that the UTC does not allow adjusting for daylight saving time.

mobilereport_timenavigatorissue_timezoneutc

And the Result is here:

MobileReport_TimeNavigatorIssue_TimeZoneUTCTest.png

We can clearly see, that for the whole year 2015 period the Start Date was passed as 2015-01-01T00:00:00.000Z and the end date was passed as 2016-01-01T00:00:00.000Z. This is correct and expected values which should be received. We also see, that we have received the parameter as nvarchar, this means plain string.

UTC+1

Now head again to settings and switch the Time Zone to some of the UTC+1 zones. And Ensure, that you also select one with automatic daylight saving time

mobilereport_timenavigatorissue_timezoneutc1

And Let’s make some tests.

MobileReport_TimeNavigatorIssue_TimeZoneUTC+1Test1.png

Here we can see, that for the whole year 2015 both the Start Date and End Date was shifted by the zone offset. So now we received the Start Date as 2014-12-31T23:00:00.000Z and End Date as 2015-12-31T23:00:00.000Z.

Now Select some month where the daylight saving is in effect. For example June.

MobileReport_TimeNavigatorIssue_TimeZoneUTC+1Test2.png

Here we can see, that the dates were shifted by the time zone offset as well the daylight time saving amount, this means shifted by 2 hours. The result is that Date Start was passed as 2015-05-31T22:00:00.000Z and End Date was passed as 2015-06-30T22:00:00.000Z.

Let’s take a look on different month during which the daylight saving change occurs, for Example March.

mobilereport_timenavigatorissue_timezoneutc1test3

Here we can clearly see, that the Start Date is shifted by one hour as the daylight saving was not in effect however the End Date is shifted by 2 hour as the daylight saving was on during that time. So Results are for Start Date 2015-02-28T23:00:00.000Z and End Date 2015-03-31T22:00:00.000Z.

mobilereport_timenavigatorissue_timezoneutc1test4

The same behavior we can see during the day to which the daylight saving went into effect. In case of year 2015 it was March 29. In that case we receive Start Date 2015-03-28T23:00:00.000Z and End Date 2015-03-29T22:00:00.000Z

UTC-6

Again head to the settings and adjust the time zone to TUC-6, e.g. Central Time.

mobilereport_timenavigatorissue_timezoneutc-6

And repeat some tests

mobilereport_timenavigatorissue_timezoneutc-6test1

Here we can see, that the dates were again shifted. This time the oposit way compared to the “plus” time zones. So for the whole year 2015 the Date Start we have 215-01-01T06:00:00.000Z and End Date as 2016-01-01T06:00:00.000Z.

As mentioned the same behavior we would see for all the tests.

Dealing with the issue

Ad mentioned at the beginning of that post. In case you develop a report which will use only users in one time zone and you know that those users are not travelling to different time zones when consuming the report the problem is not so big. as you adjust your eventual parameters processing for your time zone.

However if you develop an enterprise solution when the reports are using users around the globe from different time zones, you have to count with that. As it is clearly visible, you can receive different dates for different Time Zones.

As we saw from the test. in case for the Whole year of 2015 we receive

From that if you focus on whole dates, you can for example shift the dates by +1 in case eg. the Time Part of the received Start/End date string is for example greater or equal to 12:00:00 pm and do nothing with that parameter is the Start/End Date is till 12:00:00 pm.

No 100 % bullet proof solution

This will work for most common scenarios, but is not 100 % bullet proof if you have report users in time zones in more that +12. In case you would have users in +13 or +14 time zone and at the same time users in the -11 time zone, then there is no solution for such situation in case you go to the day level.

Let’s take a look a on a situation when use from UTC+14 (Kiritimati Island) time zone selects date 2015-01-05:

mobilereport_timenavigatorissue_timezoneutc14test1

Date Start is 2015-01-04T10:00:00.00Z and Date End is 2015-01-05T10:00:00.00Z

And now another users in the UTC-10 (Hawaii) selects a date 2015-01-04:

mobilereport_timenavigatorissue_timezoneutc-10test1

Here we again receive Date Start 2015-01-04T10:00:00.000Z and Date End 2015-01-05T10:00:00.000Z.

Because no information about the originating time zone of the user running the report is being passed to the date set on the report server and underlying database, we are not able to distinguish between those two different cases.

I can imagine a scenario in which this behavior would be useful, but in most cases this is not a wanted functionality. There is no setting in the report or report server to enable/disable this behavior and the report developer has to be very careful about that.