Automating CRUD procedures generation using T-SQL

2011-01-12 Pavel Pawlowski SQL Server, T-SQL automation, CRUD, SQL Server, Stored procedure, T-SQL

CRUD procedures represent a good way how to secure the database. When using CRUD procedures, no user has direct access to the database tables and instead of this is forced to use parameterized procedures for all data manipulations called CRUD procedures. CRUD is acronym for Create Read Update Delete.

CRUD procedures have a lot of advantages from which I can name several:

You do not need to grant rights directly to tables and in this way you hide the physical tables structure to users.
You can encapsulate some logic into the procedures and then when something needs to be changed, you simply modify a procedure and do not need to rebuild the application.
You hide the SQL logic encapsulated in stored procedures from end user.
You protect your SQL server against SQL injection as all data using the CRUD procedures has to be passed as parameters and in this way you drastically decrease a chance for the SQL injection.

The main disadvantage of CRUD procedures is the overhead which SQL developer has to do as for each table or group of several tables it is necessary separate stored procedures for the data manipulation.

There are several tools and plugins into visual studio, which simplifies creation of the CRUD procedures, but in this article I would like to show a possible way how to automate creation of the CRUD procedures using pure T-SQL.

The automation will be handled by a custom system stored procedure, which generate all the necessary code or directly create all the CRUD procedures for a database table.

This article describes how to crate a system stored procedure sp_makeCRUD, which will take 4 parameters @objectName, @executionMode, @dropExistingProcedures and @outputIdentityCalcField. The @objectName represents a table for which the CRUD procedures will be generated, @executionMode will represents execution mode of the stored procedure and will control whether the code of CRUD procedures will be only printed on scree or whether the generated code will be automatically executed and the CRUD procedures will created. @dropExistingProcedures parameter will specify whether include DROP statement for all procedures to drop eventual existing procedures and @outputIdentityCalcField will control, whether Identity and Calculated fields will be automatically returned as record set for ISERT and UPDATE operations.

CRUD procedures construction

To create a CRUD procedures we need to know all the fields in the table, whether the fields are Identity or calculated fields and also we need to know what fields the primary key is composed from. Also table name and it’s schema will be needed as user can specify the object name without schema or with schema. The CRUD procedures will be created in the same schema as the original table. To retrieve such information various system tables will be used.

Retrieving Object ID, Table Name, Schema Name and DB Name

To retrieve Object ID (ID of the table) and DB Name we will use the OBJECT_ID() and DB_NAME() functions.

SELECT
    @objID = OBJECT_ID(@objectName),
    @dbName = DB_NAME()

Table name and schema name, could be retrieved using system tables sys.objects and sys.schemas.

SELECT
    @schemaName = s.name,
    @tableName = o.name
FROM sys.objects o
INNER JOIN sys.schemas s ON o.schema_id = s.schema_id
WHERE o.object_id = @objId AND o.type = 'U'

Retrieving Table Field and Primary Key

After the object ID is retrieved, it is necessary to retrieve all the table fields, their data types and information whether the field is an IDENTITY field or whether it is Calculated Field. This information is necessary for generating INSERT and UPDATE statements as such fields are not updatable. Also it is necessary to identify fields which are part of the primary key. This fields are necessary for the UPDATE, DELETE and also for SELECT One statements. Fields will be temporarily stored in table variables as they will be accessed several times to construct different fields combinations for each statement and also for creation of procedure parameters.

The retrieval of all fields is quite easy as only access to sys.columns table for field names and sys.types for type names is necessary. The field type is written to the table variable as SQL statement which will be used when generating the procedure parameters.

DECLARE @allFields TABLE (        --Table variable for storing all the table fields
    name sysname,                --field name
    isIdentity bit,                --specifies whether field is INDENTITY
    isCalculated bit,            --specifies whether filed is Calculated field
    fieldType sysname            --Specified data type of the field
)

--Get all table fields and store them in the @allFields table variable for construction of CRUD procedures
INSERT INTO @allFields (
    name,
    isIdentity,
    isCalculated,
    fieldType
)
SELECT
    c.name,
    is_identity,
    is_computed,
    CASE
        WHEN t.name IN (N'char', N'nchar', N'varchar', N'nvarchar') THEN QUOTENAME(t.name) + N'(' + CASE WHEN c.max_length = -1 THEN N'max' ELSE CAST(c.max_length AS sysname) END + N')'
        WHEN t.name IN (N'decimal', N'numeric') THEN QUOTENAME(t.name) + N'(' + CAST(c.precision AS sysname) + N', ' + CAST(c.scale AS sysname) + N')'
        ELSE QUOTENAME(t.name)
    END
FROM sys.columns c
INNER JOIN sys.types t ON c.user_type_id = t.user_type_id
WHERE object_id = @objId

The query to retrieve the fields form which the Primary Key is composed is more complex and to retrieve all the information we need it is necessary to access several system tables. As the Primary Key is enforced by a unique index, the field names can be found using the ID of the columns in unique index by which the primary key is enforced.

DECLARE @pkFields TABLE (        --Table variable for storing fields which are part of primary key
    name sysname,                --field Name
    fieldType sysname            --Specified data type of the field
)

--Get list of Primary Key Fields and store them in @pkFields table variable for construction of CRUD procedures
INSERT INTO @pkFields(
    name,
    fieldType
)
SELECT
    c.name,
    CASE
        WHEN t.name IN (N'char', N'nchar', N'varchar', N'nvarchar') THEN QUOTENAME(t.name) + N'(' + CASE WHEN c.max_length = -1 THEN N'max' ELSE CAST(c.max_length AS sysname) END + N')'
        WHEN t.name IN (N'decimal', N'numeric') THEN QUOTENAME(t.name) + N'(' + CAST(c.precision AS sysname) + N', ' + CAST(c.scale AS sysname) + N')'
        ELSE QUOTENAME(t.name)
    END
FROM sys.key_constraints kc
INNER JOIN sys.indexes i ON kc.parent_object_id = i.object_id AND kc.unique_index_id = i.index_id
INNER JOIN sys.index_columns ic ON i.object_id = ic.object_id AND i.index_id = ic.index_id
INNER JOIN sys.columns c ON ic.object_id = c.object_id AND ic.column_id = c.column_id
INNER JOIN sys.types t ON c.user_type_id = t.user_type_id
WHERE kc.parent_object_id = @objId and kc.type = 'PK'

Generating Columns and Variables Lists

Once the list of all columns is retrieved, the lists of different columns combinations could be generated. For this purpose a SELECT statement with FOR XML PATH clause will be used as this clause greatly simplifies a CSV list generation on SQL 2005 and above. Below is a code for the All Columns

SELECT @allColumns =
    STUFF(REPLACE((SELECT
                        N'            ' + @crlf + N'            ,' + QUOTENAME(c.name)
                    FROM @allFields c
                    FOR XML PATH(N'')
                    ),
            @crlfXML, @crlf),
        1, 27, N'             ')

The spaces are included in the generated string for later final query formatting as it allows the query to be printed in readable form and also when a stored procedure will be generated, it can be later easily modified as the query will contain correct line breaks and indentation. The REPLACE function is used to replace the XML representation of CRLF back to real CRLF characters as the FOR XML PATH in reality produces XML output and translates all the CRLF characters to their XML compliant equivalents.

Generating code of CRUD Procedures

Once all the the combination of columns are generated, the variables with such fields and parameters list can be used to generate the final CRUD procedure. Below is a snippet representing a SELECT procedure.

SET @sql = N'-- =======================================================
-- Author:        ' + QUOTENAME(SUSER_SNAME()) + N'
-- Create date: ' + CONVERT(nvarchar(10), GETDATE(), 111) + N'
-- Description:    Selects records from table ' + QUOTENAME(@schemaName) + '.' + QUOTENAME(@tableName) + N'
-- =======================================================
CREATE PROCEDURE ' + QUOTENAME(@schemaName) + N'.[' + @procPrefix + @tableName + @suffixSelect + N']
' + @selectParams + N'
AS
BEGIN
    SET NOCOUNT ON;

    IF (' + @selectAllIfNullCondition + N') THEN
    BEGIN
        SELECT
' + @allColumns + N'
        FROM ' + QUOTENAME(@schemaName) + N'.' + QUOTENAME(@tableName) + N'
    END
    ELSE
    BEGIN
        SELECT
' + @allColumns + N'
        FROM ' + QUOTENAME(@schemaName) + N'.' + QUOTENAME(@tableName) + N'
        WHERE
' + @selectCondition + N'
    END
END
GO

';

sp_makeCRUD Final Form

Once all the parts of the procedure are prepared it’s possible to construct a final form of the sp_makeCRUD procedure. Below is a complete code for the sp_makeCRUD procedure.

USE [master]
GO
--==========================================================
-- Author: (c) 2011 Pavel Pawlowski
-- Description: Generates CRUD procedures for a table
--
--@objectName = table name with or without schema for which the CRUD procedures should be generated
--
--@executionMode
--  1 = Print the script Only
--  2 = Output the script as recordset using SELECT for longer procedures which is not possible output using PRINT
--  4 = Execute and Commit
--  8 = Execute and Rollback - testing mode
--
--@dropExistingProcedures = 1 | 0
-- specifies whether generate DROP commands for existing objects
--
--@outputIndentityCalcFields = 1 | 0
-- specifies whether Identity and Calculated fields should be OUTPUTed in INSERT and UPDATE
--==========================================================
CREATE PROCEDURE [dbo].[sp_makeCRUD]
    @objectName sysname,
    @executionMode tinyint = 1,
    @dropExistingProcedures bit = 1,
    @outputIndentityCalcFields bit = 1
AS
BEGIN
    SET NOCOUNT ON;
    SET XACT_ABORT ON;

    --variables declaration
    DECLARE
        @objId int,                                 --ID of the Table
        @schemaName sysname,                        --schema Name of the Table
        @tableName sysname,                         --TableName
        @dbName sysname,                            --Database name in which we are creating the procedures
        @crlfXML nchar(7),                          --XML representation of the CLRF
        @crlf nchar(2),                             --CLRF characters
        @procPrefix sysname,                        --CRUD procedures Prefix
        @sql nvarchar(max),                         --SQL code for particular steps
        @msg nvarchar(max),                         --A message
        @suffixSelect sysname,                      --Suffix for SELECT procedure
        @suffixUpdate sysname,                      --Suffix for UPDATE procedure
        @suffixDelete sysname,                      --Suffix for Delete procedure
        @suffixInsert sysname,                      --Suffix for INSERT procedure
        @selectParams nvarchar(max),                --Parameters for SELECT procedure
        @allColumns nvarchar(max),                  --List of All columns in a table for SELECT statement
        @selectAllIfNullCondition nvarchar(max),    --Condition for checking if all parameters in SELECT procedure are NULL
        @selectCondition nvarchar(max),             --SELECT statement condition
        @updateParams nvarchar(max),                --Parameters for UPDATE procedure
        @updateColumns nvarchar(max),               --List of columns for UPDATE statement
        @updateDeleteCondition nvarchar(max),       --Condition for UPDATE and DELETE statement
        @updateOutputCols nvarchar(max),            --List of UPDATE statement output columns to output calculated columns
        @deleteParams nvarchar(max),                --Parameters for DELETE procedure
        @insertParams nvarchar(max),                --Parameters for INSERT procedure
        @insertColumns nvarchar(max),               --List of COLUMNS for INSERT statement
        @insertOutputCols nvarchar(max),            --List of INSERT statement ouptup columns to output IDENTITY and calculated fields
        @insertParamNames nvarchar(max),            --List of parameter names in Insert procedure
        @isTooLongForPrint bit                      --Sores info whether some of the procs is too long for PRINT

    --Declaration of fields Table Variables
    DECLARE @pkFields TABLE (        --Table variable for storing fields which are part of primary key
        name sysname,                --field Name
        fieldType sysname            --Specified data type of the field
    )
    DECLARE @allFields TABLE (
        name sysname,                --field name
        isIdentity bit,              --specifies whether field is INDENTITY
        isCalculated bit,            --specifies whether filed is Calculated field
        fieldType sysname            --Specified data type of the field
    )

    --Table variable for storing scripts for execution
    DECLARE @scripts TABLE (
        id int NOT NULL IDENTITY,
        script nvarchar(max)
    )

    --Check if an execution mode is selected
    IF ((@executionMode & 7) = 0)
    BEGIN
        SET @msg = N'You have to select at at leas one possible execution Mode (@executionMode)
    1 = Print the script Only
    2 = Output the script as SELECT resordset for longer procedures which is not possible output using PRINT
    4 = Execute and Commit
    8 = Execute and Rollback - testing mode

You can also combine the Print and Execute Modes, but you cannot combine both execution modes'
        RAISERROR (@msg, 11, 1)
        RETURN
    END

    IF ((@executionMode & 6) = 6)
    BEGIN
        SET @msg = N'You cannot specify Execute and Commit with Execute and Rollback Together'
        RAISERROR (@msg, 11, 1)
        RETURN
    END

    --populate parameters and constants
    SELECT
        @objID = OBJECT_ID(@objectName),
        @dbName = DB_NAME(),
        @crlfXML = N'
' + NCHAR(10),    --XML Representation of the CR+LF delimiter as we use FOR XML PATH ant this translates the original CR+LF to XML Equivalent. We need it to change it back
        @crlf = NCHAR(13) + NCHAR(10),        --CR+LF delimiter used in script
        @procPrefix = 'usp_CRUD_',            --Specifies prefix to be added to all CRUD procedures
        @suffixSelect = '_S',                --Specifies suffix to be added to the Select Procedure
        @suffixUpdate = '_U',                --Specifies suffix to be added to the Update Procedure
        @suffixDelete = '_D',                --Specifies suffix to be added to the Delete Procedure
        @suffixInsert = '_I'                --Specifies suffix to be added to the Inser Procedure

    --Check whether object exists
    IF @objId IS NULL
    BEGIN
        SET @msg = N'Object "' + @objectName + '" doesnt'' exist in database ' + QUOTENAME(@dbName)
        RAISERROR (@msg, 11, 1)
        RETURN
    END

    --Populate table name and schema name
    SELECT
        @schemaName = s.name,
        @tableName = o.name
    FROM sys.objects o
    INNER JOIN sys.schemas s ON o.schema_id = s.schema_id
    WHERE o.object_id = @objId AND o.type = 'U'

    --check whether object is table
    IF (@tableName IS NULL)
    BEGIN
        SET @msg = N'Object "' + @objectName + '" is not User Table. Creating CRUD procedures is possible only on User Tables.'
        RAISERROR (@msg, 11, 1)
        RETURN
    END

    --Get all table fields and store them in the @allFields table variable for construction of CRUD procedures
    INSERT INTO @allFields (
        name,
        isIdentity,
        isCalculated,
        fieldType
    )
    SELECT
        c.name,
        is_identity,
        is_computed,
        CASE
            WHEN t.name IN (N'char', N'nchar', N'varchar', N'nvarchar') THEN QUOTENAME(t.name) + N'(' + CASE WHEN c.max_length = -1 THEN N'max' ELSE CAST(c.max_length AS sysname) END + N')'
            WHEN t.name IN (N'decimal', N'numeric') THEN QUOTENAME(t.name) + N'(' + CAST(c.precision AS sysname) + N', ' + CAST(c.scale AS sysname) + N')'
            ELSE QUOTENAME(t.name)
        END
    FROM sys.columns c
    INNER JOIN sys.types t ON c.user_type_id = t.user_type_id
    WHERE object_id = @objId

    --Get list of Primary Key Fields and store them in @pkFields table variable for construction of CRUD procedures
    INSERT INTO @pkFields(
        name,
        fieldType
    )
    SELECT
        c.name,
        CASE
            WHEN t.name IN (N'char', N'nchar', N'varchar', N'nvarchar') THEN QUOTENAME(t.name) + N'(' + CASE WHEN c.max_length = -1 THEN N'max' ELSE CAST(c.max_length AS sysname) END + N')'
            WHEN t.name IN (N'decimal', N'numeric') THEN QUOTENAME(t.name) + N'(' + CAST(c.precision AS sysname) + N', ' + CAST(c.scale AS sysname) + N')'
            ELSE QUOTENAME(t.name)
        END
    FROM sys.key_constraints kc
    INNER JOIN sys.indexes i ON kc.parent_object_id = i.object_id AND kc.unique_index_id = i.index_id
    INNER JOIN sys.index_columns ic ON i.object_id = ic.object_id AND i.index_id = ic.index_id
    INNER JOIN sys.columns c ON ic.object_id = c.object_id AND ic.column_id = c.column_id
    INNER JOIN sys.types t ON c.user_type_id = t.user_type_id
    WHERE kc.parent_object_id = @objId and kc.type = 'PK'

    --Check Whether there is primary Key the CRUD works only if there is primary key in the table
    IF (NOT EXISTS(SELECT 1 FROM @pkFields))
    BEGIN
        SET @msg = N'Table "' + @objectName + '" does not have a Primary Key. There must exists a primary key prior generating CRUD procedures.'
        RAISERROR (@msg, 11, 1)
    END

    --list of output columns for INSERT statement (ouptup of Identity and calculated fields)
    SELECT @insertOutputCols =
        STUFF(REPLACE((SELECT
                            N'        ' + @crlf + N'        ,inserted.' + QUOTENAME(c.name)
                        FROM @allFields c
                        WHERE isIdentity = 1 OR isCalculated = 1
                        FOR XML PATH(N'')
                        ),
                @crlfXML, @crlf),
            1, 19, N'         ')

    --list of output columns for UPDATE statement (Calculated fields only)
    SELECT @updateOutputCols =
        STUFF(REPLACE((SELECT
                            N'        ' + @crlf + N'        ,inserted.' + QUOTENAME(c.name)
                        FROM @allFields c
                        WHERE isCalculated = 1
                        FOR XML PATH(N'')
                        ),
                @crlfXML, @crlf),
            1, 19, N'         ')        

    --ist of all columns used in the SELECT Statement
    SELECT @allColumns =
        STUFF(REPLACE((SELECT
                            N'            ' + @crlf + N'            ,' + QUOTENAME(c.name)
                        FROM @allFields c
                        FOR XML PATH(N'')
                        ),
                @crlfXML, @crlf),
            1, 27, N'             ')

    --list of columns for UPDATE statement including the equal sign and variable (all columns except indentity and calculated ones)
    SELECT @updateColumns =
        STUFF(REPLACE((SELECT
                            N'        ' + @crlf + N'        ,' + QUOTENAME(c.name) + N' = @' + c.name
                        FROM @allFields c
                        WHERE isIdentity = 0 AND isCalculated = 0
                        FOR XML PATH(N'')
                        ),
                @crlfXML, @crlf),
            1, 19, N'         ')

    --list of columns for INSERT statement (all columns except identity and calculated ones)
    SELECT @insertColumns =
        STUFF(REPLACE((SELECT
                            N'        ' + @crlf + N'        ,' + QUOTENAME(c.name)
                        FROM @allFields c
                        WHERE isIdentity = 0 AND isCalculated = 0
                        FOR XML PATH(N'')
                        ),
                @crlfXML, @crlf),
            1, 19, N'         ')

    --condition for UPDATE and DELETE statement
    SET @updateDeleteCondition =
            STUFF(REPLACE((SELECT
                            N'        ' + @crlf + N'        AND' + @crlf + '        ' + QUOTENAME(c.name) + N' = @' + c.name
                        FROM @pkFields c
                        FOR XML PATH(N'')), @crlfXML, @crlf), 1, 23, N'')

    --IF condition for SELECT statement if all params will be NULL to do not use condition to receive better plans
    SET @selectAllIfNullCondition =
                STUFF((SELECT
                            N' AND @' + c.name + N' IS NULL'
                        FROM @pkFields c
                        FOR XML PATH(N'')), 1, 5, N'')

    --Select condition (for SELECT ONE)
    SET @selectCondition =
        STUFF(REPLACE((SELECT
                            N'            ' + @crlf + N'            AND' + @crlf + N'            (@' + c.name + N' IS NULL OR ' + QUOTENAME(c.name) + N' = @' + c.name + N')'
                        FROM @pkFields c
                        FOR XML PATH(N'')), @crlfXML, @crlf), 1, 31, N'')

    --parameters list for SELECT CRUD procedure
    SELECT @selectParams =
        STUFF(REPLACE((SELECT
                            N',    ' + @crlf + N'    @' + c.name + N' ' + c.fieldType + N' = NULL'
                        FROM @pkFields c
                        FOR XML PATH(N'')
                        ),
                @crlfXML, @crlf),
            1, 11, N'    ')

    --parameters list for DELETE CRUD procedure
    SELECT @deleteParams =
        STUFF(REPLACE((SELECT
                            N',    ' + @crlf + N'    @' + c.name + N' ' + c.fieldType
                        FROM @pkFields c
                        FOR XML PATH(N'')
                        ),
                @crlfXML, @crlf),
            1, 11, N'    ')

    --parameters list for UPDATE CRUD procedure
    SELECT @updateParams =
        STUFF(REPLACE((SELECT
                            N',    ' + @crlf + N'    @' + c.name + N' ' + c.fieldType
                        FROM @allFields c
                        LEFT JOIN @pkFields pk ON c.name = pk.name
                        WHERE (c.isIdentity = 0 AND c.isCalculated = 0) OR pk.name IS NOT NULL
                        FOR XML PATH(N'')
                        ),
                @crlfXML, @crlf),
            1, 11, N'    ')

    --parameters list for INSERT CRUD procedure
    SELECT @insertParams =
        STUFF(REPLACE((SELECT
                            N',    ' + @crlf + N'    @' + c.name + N' ' + c.fieldType
                        FROM @allFields c
                        LEFT JOIN @pkFields pk ON c.name = pk.name
                        WHERE c.isIdentity = 0 AND c.isCalculated = 0
                        FOR XML PATH(N'')
                        ),
                @crlfXML, @crlf),
            1, 11, N'    ')

    --parameter names list for INSERT command in the INSERT CRUD procedure
    SELECT @insertParamNames =
        STUFF(REPLACE((SELECT
                            N'        ' + @crlf + N'        ,@' + c.name
                        FROM @allFields c
                        LEFT JOIN @pkFields pk ON c.name = pk.name
                        WHERE c.isIdentity = 0 AND c.isCalculated = 0
                        FOR XML PATH(N'')
                        ),
                @crlfXML, @crlf),
            1, 19, N'         ')

--USE DB
SET @sql = N'USE ' + QUOTENAME(@dbName) + N'
'
INSERT INTO @scripts(script) VALUES(@sql)

--SELECT PROCEDURE
IF (@dropExistingProcedures = 1)
BEGIN
SET @sql = N'--Drop existing SELECT CRUD Procedure
IF (EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(''' + QUOTENAME(@schemaName) + N'.[' + @procPrefix + @tableName + @suffixSelect + N']'') AND type = ''P''))
    DROP PROCEDURE ' + QUOTENAME(@schemaName) + N'.[' + @procPrefix + @tableName + @suffixSelect +N']
'
INSERT INTO @scripts(script) VALUES(@sql)
END

SET @sql = N'-- =======================================================
-- Author:      ' + QUOTENAME(SUSER_SNAME()) + N'
-- Create date: ' + CONVERT(nvarchar(10), GETDATE(), 111) + N'
-- Description: Selects records from table ' + QUOTENAME(@schemaName) + '.' + QUOTENAME(@tableName) + N'
-- =======================================================
CREATE PROCEDURE ' + QUOTENAME(@schemaName) + N'.[' + @procPrefix + @tableName + @suffixSelect + N']
' + @selectParams + N'
AS
BEGIN
    SET NOCOUNT ON;

    IF (' + @selectAllIfNullCondition + N')
    BEGIN
        SELECT
' + @allColumns + N'
        FROM ' + QUOTENAME(@schemaName) + N'.' + QUOTENAME(@tableName) + N'
    END
    ELSE
    BEGIN
        SELECT
' + @allColumns + N'
        FROM ' + QUOTENAME(@schemaName) + N'.' + QUOTENAME(@tableName) + N'
        WHERE
' + @selectCondition + N'
    END
END
';
INSERT INTO @scripts(script) VALUES(@sql)

--UPDATE PROCEDURE
IF (@dropExistingProcedures = 1)
BEGIN
SET @sql = N'--Drop existing UPDATE CRUD Procedure
IF (EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(''' + QUOTENAME(@schemaName) + N'.[' + @procPrefix + @tableName + @suffixUpdate + N']'') AND type = ''P''))
    DROP PROCEDURE ' + QUOTENAME(@schemaName) + N'.[' + @procPrefix + @tableName + @suffixUpdate + N']
'
INSERT INTO @scripts(script) VALUES(@sql)
END

SET @sql = N'-- =======================================================
-- Author:      ' + QUOTENAME(SUSER_SNAME()) + N'
-- Create date: ' + CONVERT(nvarchar(10), GETDATE(), 111) + N'
-- Description: Updates record in table ' + QUOTENAME(@schemaName) + '.' + QUOTENAME(@tableName) + N'
-- =======================================================
CREATE PROCEDURE ' + QUOTENAME(@schemaName) + N'.[' + @procPrefix + @tableName + @suffixUpdate + N']
' + @updateParams + N'
AS
BEGIN
    SET NOCOUNT ON;

    UPDATE ' + QUOTENAME(@schemaName) + N'.' + QUOTENAME(@tableName) + N' SET
' + @updateColumns + CASE WHEN @outputIndentityCalcFields = 1 AND @updateOutputCols <> N'' AND @updateOutputCols IS NOT NULL THEN N'
    OUTPUT
' + @updateOutputCols
    ELSE N'' END + N'
    WHERE
' + @updateDeleteCondition + N'
END
';
INSERT INTO @scripts(script) VALUES(@sql)

--DELETE PROCEDURE
IF (@dropExistingProcedures = 1)
BEGIN
SET @sql = N'--Drop existing DELETE CRUD Procedure
IF (EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(''' + QUOTENAME(@schemaName) + N'.[' + @procPrefix + @tableName + @suffixDelete + N']'') AND type = ''P''))
    DROP PROCEDURE ' + QUOTENAME(@schemaName) + N'.[' + @procPrefix + @tableName +  @suffixDelete + N']
'
INSERT INTO @scripts(script) VALUES(@sql)
END

SET @sql = N'-- =======================================================
-- Author:      ' + QUOTENAME(SUSER_SNAME()) + N'
-- Create date: ' + CONVERT(nvarchar(10), GETDATE(), 111) + N'
-- Description: Deletes recors from table ' + QUOTENAME(@schemaName) + '.' + QUOTENAME(@tableName) + N'
-- =======================================================
CREATE PROCEDURE ' + QUOTENAME(@schemaName) + N'.[' + @procPrefix + @tableName + @suffixDelete + N']
' + @deleteParams + N'
AS
BEGIN
    SET NOCOUNT ON;

    DELETE FROM ' + QUOTENAME(@schemaName) + N'.' + QUOTENAME(@tableName) + N'
    WHERE
' + @updateDeleteCondition + N'
END
';
INSERT INTO @scripts(script) VALUES(@sql)

--INSERT PROCEDURE
IF (@dropExistingProcedures = 1)
BEGIN
SET @sql = N'--Drop existing INSERT CRUD Procedure
IF (EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(''' + QUOTENAME(@schemaName) + N'.[' + @procPrefix + @tableName + @suffixInsert + N']'') AND type = ''P''))
    DROP PROCEDURE ' + QUOTENAME(@schemaName) + N'.[' + @procPrefix + @tableName + @suffixInsert + N']
'
INSERT INTO @scripts(script) VALUES(@sql)
END

SET @sql = N'-- =======================================================
-- Author:      ' + QUOTENAME(SUSER_SNAME()) + N'
-- Create date: ' + CONVERT(nvarchar(10), GETDATE(), 111) + N'
-- Description: Deletes recors from table ' + QUOTENAME(@schemaName) + '.' + QUOTENAME(@tableName) + N'
-- =======================================================
CREATE PROCEDURE ' + QUOTENAME(@schemaName) + N'.[' + @procPrefix + @tableName + @suffixInsert + N']
' + @insertParams + N'
AS
BEGIN
    SET NOCOUNT ON;

    INSERT INTO ' + QUOTENAME(@schemaName) + N'.' + QUOTENAME(@tableName) + N' (
' + @insertColumns + N'
    )' + CASE WHEN @outputIndentityCalcFields = 1 AND @insertOutputCols <> N'' AND @insertOutputCols IS NOT NULL THEN N'
    OUTPUT
' + @insertOutputCols
    ELSE N'' END + N'
    SELECT
' + @insertParamNames + N'
END
';
INSERT INTO @scripts(script) VALUES(@sql)

DECLARE cr CURSOR FAST_FORWARD FOR
SELECT script FROM @scripts ORDER BY id

--if EXECUTION mode contains 2 we should output the code using SELECT
--Script generate using the SELECT can be saved by right lick on the result and
--select Save Result AS and storing it as CSV
IF ((@executionMode & 2) = 2 OR (@isTooLongForPrint = 1 AND (@executionMode & 1) = 1))
BEGIN
    SELECT script + N'GO' FROM @scripts
END

SET @isTooLongForPrint = ISNULL((SELECT 1 FROM @scripts WHERE LEN(script) > 4000), 0)

--if Execution mode contains 1 we should PRINT the statements
IF ((@executionMode & 1) = 1 AND @isTooLongForPrint = 0)
BEGIN
    OPEN cr
    FETCH NEXT FROM cr INTO @sql
    WHILE (@@FETCH_STATUS = 0)
    BEGIN
        PRINT @sql
        PRINT 'GO'
        FETCH NEXT FROM cr INTO @sql
    END
    CLOSE cr
END

--Execute the statement if it should be executed
IF (( @executionMode & 4 ) = 4 OR ( @executionMode & 8 ) = 8 )
BEGIN

    OPEN cr

    BEGIN TRY
        BEGIN TRANSACTION

        FETCH NEXT FROM cr INTO @sql
        WHILE (@@FETCH_STATUS = 0)
        BEGIN
            EXEC (@sql)
            FETCH NEXT FROM cr INTO @sql
        END

        IF ((@executionMode & 4) = 4)
        BEGIN
            IF (@@TRANCOUNT > 0)
                COMMIT TRANSACTION
        END
        ELSE
        BEGIN
            IF (@@TRANCOUNT > 0)
                ROLLBACK TRANSACTION
        END
    END TRY
    BEGIN CATCH
        IF (@@TRANCOUNT > 0)
            ROLLBACK TRANSACTION
    END CATCH
END

--if cursor is open, close it
IF (cursor_status('global', 'cr') = 1)
    CLOSE cr

DEALLOCATE cr
END
GO

EXECUTE sp_ms_marksystemobject 'dbo.sp_makeCRUD'
Go

The each of the SQL statement in the procedure is first stored in a table variable called @scripts for later printing and/or execution. At the end of the procedure the code is printed using a PRINT statement or using a SELECT statement which is automatically used in case the code for some CRUD procedure is longer than 4000 characters and therefore not possible to PRINT.

If Execute and Commit or Execute And Rollback was selected then the code for the CRUD procedures is also executed. The Execute and Rollback is useful for testing, that the code for the procedures is correct and working.

Procedure is created in the [master] database so it is accessible from all databases and we do not need to create it in every database.

To be fully functional, it is necessary to mark the procedure as system procedure using below command.

EXECUTE sp_ms_marksystemobject 'dbo.sp_makeCRUD'

Marking the procedure as system object is necessary as if we not mark the object, then the procedure will run in context of the database in which it was created – in this case [system] and will fail as there are not the tables for which we want to generate the CRUD procedures.

Once the procedure is marked as system object, then it executes in the context of the current database, even it is located in the [master] database.

After we create the proc, we can test it on a small example:

USE [tempdb]
GO
CREATE TABLE dbo.TestTable (
    ID int NOT NULL IDENTITY(1,1) PRIMARY KEY,
    Name varchar(10),
    Price money,
    Quantity decimal(5,2),
    TotalPrice AS (Price * Quantity)
)
GO
sp_makeCRUD 'dbo.TestTable'

This will produce below code for the CRUD procedures

USE [tempdb]
GO
--Drop existing SELECT CRUD Procedure
IF (EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID('[dbo].[usp_CRUD_TestTable_S]') AND type = 'P'))
    DROP PROCEDURE [dbo].[usp_CRUD_TestTable_S]
GO
-- =======================================================
-- Author:      [PAVEL-PCPavel]
-- Create date: 2011/02/10
-- Description: Selects records from table [dbo].[TestTable]
-- =======================================================
CREATE PROCEDURE [dbo].[usp_CRUD_TestTable_S]
    @ID [int] = NULL
AS
BEGIN
    SET NOCOUNT ON;

    IF (@ID IS NULL)
    BEGIN
        SELECT
             [ID]
            ,[Name]
            ,[Price]
            ,[Quantity]
            ,[TotalPrice]
        FROM [dbo].[TestTable]
    END
    ELSE
    BEGIN
        SELECT
             [ID]
            ,[Name]
            ,[Price]
            ,[Quantity]
            ,[TotalPrice]
        FROM [dbo].[TestTable]
        WHERE
            (@ID IS NULL OR [ID] = @ID)
    END
END
GO
--Drop existing UPDATE CRUD Procedure
IF (EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID('[dbo].[usp_CRUD_TestTable_U]') AND type = 'P'))
    DROP PROCEDURE [dbo].[usp_CRUD_TestTable_U]
GO
-- =======================================================
-- Author:      [PAVEL-PCPavel]
-- Create date: 2011/02/10
-- Description: Updates record in table [dbo].[TestTable]
-- =======================================================
CREATE PROCEDURE [dbo].[usp_CRUD_TestTable_U]
    @ID [int],
    @Name [varchar](10),
    @Price [money],
    @Quantity [decimal](5, 2)
AS
BEGIN
    SET NOCOUNT ON;

    UPDATE [dbo].[TestTable] SET
         [Name] = @Name
        ,[Price] = @Price
        ,[Quantity] = @Quantity
    OUTPUT
         inserted.[TotalPrice]
    WHERE
        [ID] = @ID
END
GO
--Drop existing DELETE CRUD Procedure
IF (EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID('[dbo].[usp_CRUD_TestTable_D]') AND type = 'P'))
    DROP PROCEDURE [dbo].[usp_CRUD_TestTable_D]
GO
-- =======================================================
-- Author:      [PAVEL-PCPavel]
-- Create date: 2011/02/10
-- Description: Deletes recors from table [dbo].[TestTable]
-- =======================================================
CREATE PROCEDURE [dbo].[usp_CRUD_TestTable_D]
    @ID [int]
AS
BEGIN
    SET NOCOUNT ON;

    DELETE FROM [dbo].[TestTable]
    WHERE
        [ID] = @ID
END
GO
--Drop existing INSERT CRUD Procedure
IF (EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID('[dbo].[usp_CRUD_TestTable_I]') AND type = 'P'))
    DROP PROCEDURE [dbo].[usp_CRUD_TestTable_I]
GO
-- =======================================================
-- Author:      [PAVEL-PCPavel]
-- Create date: 2011/02/10
-- Description: Deletes recors from table [dbo].[TestTable]
-- =======================================================
CREATE PROCEDURE [dbo].[usp_CRUD_TestTable_I]
    @Name [varchar](10),
    @Price [money],
    @Quantity [decimal](5, 2)
AS
BEGIN
    SET NOCOUNT ON;

    INSERT INTO [dbo].[TestTable] (
         [Name]
        ,[Price]
        ,[Quantity]
    )
    OUTPUT
         inserted.[ID]
        ,inserted.[TotalPrice]
    SELECT
         @Name
        ,@Price
        ,@Quantity
END
GO

Selecting random records from a table

2011-01-07 Pavel Pawlowski SQL Server, T-SQL Random, Select, SQL Server, T-SQL

Some time you may come to a situation, when you have a table and need to get only a few random records from it. You can come with a question how to achieve this on SQL Server.

You may come with incorporating the RAND() function, but as I mentioned in my previous post Random numbers in query, the RAND() function is a constant function executed once per whole query and does not it’s work.

One of the solution could be using the TABLESAMPLE clause. All the samples here use the AdventureWorks2008R2 sample database.

SELECT
    *
FROM Sales.SalesOrderDetail
TABLESAMPLE (100 ROWS);

As this for some tables could work, probably it will not give you exact results you probably expect as the current implementation of the TABLESAMPLE clause build for purpose of statistics retrieving.

Selecting random rows using NEWID()

As working workaround for the purpose of random records selection we can use again the NEWID() function mentioned in my previous post. Each call to the function creates new unique identifier. There are several possibilities to return random rows using the function.

Select an approximate percentage sample of random rows
Select exact number of random rows
Select an exact percentage sample of random rows
Select random number of random rows

Selecting approximate percentage sample of random rows

To select an approximate percentage sample of random rows from a table you can you a below query. The query will select around 10 % of records from a table.

SELECT
    *
FROM Sales.SalesOrderDetail
WHERE 0.1 >= CAST(CHECKSUM(NEWID(), SalesOrderID) & 0x7fffffff AS float)
    / CAST (0x7fffffff AS int)

Each execution of the above query returns a different count of records, which will be close to the 10 percent of total records in the table.

To select a different percentage simply change the 0.1 to and appropriate percentage. In the condition there is added a SalesOrderID field in the calculation to force generation of the NEWID() for each row, otherwise the NEWID() would be generated only once as a constant in this query. The 0x7fffffff represents maximum positive integer value, so when using it in bit AND operation it limits the original number to that value.

The above query produces following plan:

And trace from the profiler:

The table Sales.SalesOrderDetail has 121317 rows and from the profiler results we can see, that the query returned exactly 10 % of records (without 1.7). With multiple runs the row count produced will be different, but very close to 10 %.

Selecting exact number of random rows

Selecting exact number of random rows is very easy, as it is only necessary to sort the query by the NEWID() result and limit the number of returned rows. For returning 100 random rows it’s possible to use below query:

SELECT TOP 100
    *
FROM Sales.SalesOrderDetail
ORDER BY NEWID()

In this example it is not necessary to include any random number generation and checksum as the NEWID() is in the ORDER BY clause and therefore it will be evaluated once for each row so the SQL Server can sort the record on it’s value.

Here is a plan for the query:

And trace from profiler:

As we can see, even the query looks much simpler than the previous for approximate percentage, this query produces much more complicated parallel plan and from the profiler we can see, that even returning les rows, it produces more reads and has much higher CPU cost. This is caused by the ORDER BY operator as the database engine has to sort all the records prior returning the TOP 100 rows. The duration of the first query is longer, but this is caused by returning 12 times more rows to client.

On small tables this doesn’t matter, but on large one it can has high impact. To solve the issue, we can use the approximate count query and limit the records returned by this query. Only we need to choose appropriate count records in the WHERE clause. If we know, that in the table is more than 120 000 records, we can first limit the results to about 0.15 % so we will be sure, the base query returns more than 100 rows and then simply limit them to 100.

SELECT TOP 100
    *
FROM Sales.SalesOrderDetail
WHERE 0.0015 >= CAST(CHECKSUM(NEWID(), SalesOrderID) & 0x7fffffff AS float)
    / CAST (0x7fffffff AS int)

This query produces below plan and trace from profiler.

As we can see, the query plan is again much simpler and the Reads count is more than half less then previous and the CPU cost is much lower and duration is a little bit lower, but generally equal because of returning rows to client.

Selecting exact percentage sample of random rows

To select an exact percentage sample of random rows it’s very simple and consist of simple modification of the TOP clause to a percentage value. So to select 10 percent of random records we can use below query.

SELECT TOP 10 PERCENT
    *
FROM Sales.SalesOrderDetail
ORDER BY NEWID()

Query produces below and again more complex plan then the solution with WHERE clause. It’s even more complicated than the one for the TOP without PERCENT as a table spool operator is added to the plan.

From the profiles we again see, tan it produces enormous amount of reads comparing the to the approximate percentage sample even the amount of returned records is similar. The duration and CPU cost is incomparable and even writes are produced because of the writes to Table Spool.

To optimize this, we cannot use the approximate percentage count query and limit it by percent, as the TOP 10 PERCENT clause will return only 10 percent of records from the one limited by the WHERE clause.

We can optimize the reads, execution time and resources cost by below query.

SELECT TOP ((SELECT COUNT(*) FROM Sales.SalesOrderDetail) / 10)
    *
FROM Sales.SalesOrderDetail
WHERE 0.11 >= CAST(CHECKSUM(NEWID(), SalesOrderID) & 0x7fffffff AS float)
    / CAST (0x7fffffff AS int)

The (SELECT COUNT(*) FROM Sales.SalesOrderDetail) / 10 query in TOP clause calculates 10 percent of records and whole query produces below plan and profiler trace.

Even the plan may look more complex than the previous one, from the profiler trace we can see, that for the same Row Count this plan produces 256 time less reads, no writes and 3 time lower duration and 6.5 time less CPU cost.

Selecting random number of random records

To select a random number of random records we can again use the above query and slightly modify the top clause. This modification consist of adding a random generator into the TOP clause. To select random number of records (or percent) in particular interval – in this example 10-20 records or 10 – 20 percent a below queries can be used:

--Select 10 - 20 random records
SELECT TOP (ABS(CHECKSUM(NEWID())) % 10 + 10)
    *
FROM Sales.SalesOrderDetail
ORDER BY NEWID()

--Select 10 - 20 random records
SELECT TOP (ABS(CHECKSUM(NEWID())) % 10 + 10)
    *
FROM Sales.SalesOrderDetail
WHERE 0.0022 >= CAST(CHECKSUM(NEWID(), SalesOrderID) & 0x7fffffff AS float)
    / CAST (0x7fffffff AS int)

The plans for the queries are the same as the plan mentioned above and from the profiler trace we can again see, the incomparable lower costs of the second query compared to the first one.

--Select 10 - 20 % of random records
SELECT TOP (ABS(CHECKSUM(NEWID())) % 10 + 10) PERCENT
    *
FROM Sales.SalesOrderDetail
ORDER BY NEWID()

--Select 10 - 20 % of random records
SELECT TOP ((SELECT COUNT(*) FROM Sales.SalesOrderDetail) * (ABS(CHECKSUM(NEWID())) % 10 + 10) / 100)
    *
FROM Sales.SalesOrderDetail
WHERE 0.21 >= CAST(CHECKSUM(NEWID(), SalesOrderID) & 0x7fffffff AS float)
    / CAST (0x7fffffff AS int)

Again the plan are the same as the plan for each version showed above and again from the profiler trace we can again see, that the performance of the second query is incomparable better, than the first one.

Complete comparison

To make a complete comparison of performance of all the queries mentioned in this article we can use below query batch. All the queries are written as sub queries of SELECT COUNT(*) to avoid wrong duration calculation which is affected by transferring the records from server to client.

--Approximate 10 percent =======================
SELECT COUNT(*) FROM (
    SELECT
        *
    FROM Sales.SalesOrderDetail
    WHERE 0.1 >= CAST(CHECKSUM(NEWID(), SalesOrderID) & 0x7fffffff AS float)
        / CAST (0x7fffffff AS int)
) q
GO
--TOP 100, ORDER BY =============================
SELECT COUNT(*) FROM (
    SELECT TOP 100
        *
    FROM Sales.SalesOrderDetail
    ORDER BY NEWID()
) q
GO
--TOP 100, Approximate =============================
SELECT COUNT(*) FROM (
    SELECT TOP 100
        *
    FROM Sales.SalesOrderDetail
    WHERE 0.0015 >= CAST(CHECKSUM(NEWID(), SalesOrderID) & 0x7fffffff AS float)
        / CAST (0x7fffffff AS int)
) q
GO
--TOP 10 PERCENT, ORDER BY =============================
SELECT COUNT(*) FROM (
    SELECT TOP 10 PERCENT
        *
    FROM Sales.SalesOrderDetail
    ORDER BY NEWID()
) q
GO
--TOP 10 PERCENT, Approximate =============================
SELECT COUNT(*) FROM (
    SELECT TOP ((SELECT COUNT(*) FROM Sales.SalesOrderDetail) / 10)
        *
    FROM Sales.SalesOrderDetail
    WHERE 0.11 >= CAST(CHECKSUM(NEWID(), SalesOrderID) & 0x7fffffff AS float)
        / CAST (0x7fffffff AS int)
) q
GO
--Random 10 - 20 Records, ORDER BY  =============================
SELECT COUNT(*) FROM (
    SELECT TOP (ABS(CHECKSUM(NEWID())) % 10 + 10)
        *
    FROM Sales.SalesOrderDetail
    ORDER BY NEWID()
) q
GO
--Random 10 - 20 Records, Approximate =============================
SELECT COUNT(*) FROM (
    SELECT TOP (ABS(CHECKSUM(NEWID())) % 10 + 10)
        *
    FROM Sales.SalesOrderDetail
    WHERE 0.0022 >= CAST(CHECKSUM(NEWID(), SalesOrderID) & 0x7fffffff AS float)
        / CAST (0x7fffffff AS int)
) q
GO
--Random 10 - 20 PERCENT, ORDER BY =============================
SELECT COUNT(*) FROM (
    SELECT TOP (ABS(CHECKSUM(NEWID())) % 10 + 10) PERCENT
        *
    FROM Sales.SalesOrderDetail
    ORDER BY NEWID()
) q
GO
--Random 10 - 20 PERCENT, Approximate =============================
SELECT COUNT(*) FROM (
    SELECT TOP ((SELECT COUNT(*) FROM Sales.SalesOrderDetail) * (ABS(CHECKSUM(NEWID())) % 10 + 10) / 100)
        *
    FROM Sales.SalesOrderDetail
    WHERE 0.21 >= CAST(CHECKSUM(NEWID(), SalesOrderID) & 0x7fffffff AS float)
        / CAST (0x7fffffff AS int)
) q
GO

And profiler output for above queries.

Conclusion

As we can see from the queries, their query plans and profiler traces, we have several possibilities to select random records from table. Some possibilities have a very simpler query, but produces more complex plan and have a higher cost and can be suitable for relatively small tables.

Other versions producing the same results are more complex to write the query itself, but finally the query produces a much simpler plan and have much better performance comparing to the simple looking one. Such queries are then suitable even fro very large tables to produce results in acceptable time with low performance impact on the server.

Random numbers in query

2011-01-07 Pavel Pawlowski SQL Server, T-SQL Function, Random, SQL Server, T-SQL

Some times you may want to select rows from some table and in each row you would like to have some random number generated. You may want to try the RAND() function. However, it will generate a random number, all the returned rows will have the same random number and this is mostly not what you intended.

SELECT TOP 10
    RAND()
FROM sys.all_columns

Running the above query you will receive.

----------------------
0,173895240378828
0,173895240378828
0,173895240378828
0,173895240378828
0,173895240378828
0,173895240378828
0,173895240378828
0,173895240378828
0,173895240378828
0,173895240378828

(10 row(s) affected)

This is caused because the RAND() function is a constant function and the constant function is executed once on the query start, and then the value is used everywhere in the query. There are more functions in SQL Server, that behave as Constant Expressions and which are executed once per the whole Query e.g. the GETDATE() function.

Whether the function is a executed as constant expression or whether it is executed once per row, you can find from the XML plan for the query. If we take the a look on the XML plan for the above query we will see:

<?xml version="1.0" encoding="utf-16"?>
<ShowPlanXML xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" Version="1.1" Build="10.50.1600.1" xmlns="http://schemas.microsoft.com/sqlserver/2004/07/showplan">
  <BatchSequence>
    <Batch>
      <Statements>
        <StmtSimple StatementCompId="1" StatementEstRows="10" StatementId="1" StatementOptmLevel="FULL" StatementOptmEarlyAbortReason="GoodEnoughPlanFound" StatementSubTreeCost="0.00346253" StatementText="SELECT TOP 10
    RAND()
FROM sys.all_columns" StatementType="SELECT" QueryHash="0x479AE6BF06344217" QueryPlanHash="0x1960EA8D4F47E514">
          <StatementSetOptions ANSI_NULLS="true" ANSI_PADDING="true" ANSI_WARNINGS="true" ARITHABORT="true" CONCAT_NULL_YIELDS_NULL="true" NUMERIC_ROUNDABORT="false" QUOTED_IDENTIFIER="true" />
          <QueryPlan CachedPlanSize="32" CompileTime="3" CompileCPU="3" CompileMemory="544">
            <RelOp AvgRowSize="15" EstimateCPU="1E-06" EstimateIO="0" EstimateRebinds="0" EstimateRewinds="0" EstimateRows="10" LogicalOp="Compute Scalar" NodeId="0" Parallel="false" PhysicalOp="Compute Scalar" EstimatedTotalSubtreeCost="0.00346253">
              <OutputList>
                <ColumnReference Column="Expr1063" />
              </OutputList>
              <ComputeScalar>
                <DefinedValues>
                  <DefinedValue>
                    <ColumnReference Column="Expr1063" />
                    <ScalarOperator ScalarString="rand()">
                      <Identifier>
                        <ColumnReference Column="ConstExpr1065">
                          <ScalarOperator>
                            <Intrinsic FunctionName="rand">
                              <ScalarOperator>
                                <Const ConstValue="" />
                              </ScalarOperator>
                            </Intrinsic>
                          </ScalarOperator>
                        </ColumnReference>
                      </Identifier>
                    </ScalarOperator>
                  </DefinedValue>
                </DefinedValues>
                <RelOp AvgRowSize="9" EstimateCPU="1E-06" EstimateIO="0" EstimateRebinds="0" EstimateRewinds="0" EstimateRows="10" LogicalOp="Top" NodeId="1" Parallel="false" PhysicalOp="Top" EstimatedTotalSubtreeCost="0.00346153">
                  <OutputList />
                  <Top RowCount="false" IsPercent="false" WithTies="false">
                    <TopExpression>
                      <ScalarOperator ScalarString="(10)">
                        <Const ConstValue="(10)" />
                      </ScalarOperator>
                    </TopExpression>
                    <RelOp AvgRowSize="9" EstimateCPU="0.000515516" EstimateIO="0" EstimateRebinds="0" EstimateRewinds="0" EstimateRows="10" LogicalOp="Concatenation" NodeId="2" Parallel="false" PhysicalOp="Concatenation" EstimatedTotalSubtreeCost="0.00346053">
                      <OutputList />
                      <Concat>
                        <DefinedValues />
                        <RelOp AvgRowSize="9" EstimateCPU="0.00083584" EstimateIO="0" EstimateRebinds="0" EstimateRewinds="0" EstimateRows="10" LogicalOp="Filter" NodeId="3" Parallel="false" PhysicalOp="Filter" EstimatedTotalSubtreeCost="0.00337579">
                          <OutputList />
                          <Filter StartupExpression="false">
                            <RelOp AvgRowSize="13" EstimateCPU="0.0008753" EstimateIO="0.0068287" EstimateRebinds="0" EstimateRewinds="0" EstimateRows="10" LogicalOp="Index Scan" NodeId="4" Parallel="false" PhysicalOp="Index Scan" EstimatedTotalSubtreeCost="0.00335999" TableCardinality="653">
                              <OutputList>
                                <ColumnReference Database="[TestDB]" Schema="[sys]" Table="[syscolpars]" Column="id" />
                              </OutputList>
                              <IndexScan Ordered="false" ForcedIndex="false" ForceSeek="false" NoExpandHint="false">
                                <DefinedValues>
                                  <DefinedValue>
                                    <ColumnReference Database="[TestDB]" Schema="[sys]" Table="[syscolpars]" Column="id" />
                                  </DefinedValue>
                                </DefinedValues>
                                <Object Database="[TestDB]" Schema="[sys]" Table="[syscolpars]" Index="[nc]" IndexKind="NonClustered" />
                                <Predicate>
                                  <ScalarOperator ScalarString="[TestDB].[sys].[syscolpars].[number]=(0)">
                                    <Compare CompareOp="EQ">
                                      <ScalarOperator>
                                        <Identifier>
                                          <ColumnReference Database="[TestDB]" Schema="[sys]" Table="[syscolpars]" Column="number" />
                                        </Identifier>
                                      </ScalarOperator>
                                      <ScalarOperator>
                                        <Const ConstValue="(0)" />
                                      </ScalarOperator>
                                    </Compare>
                                  </ScalarOperator>
                                </Predicate>
                              </IndexScan>
                            </RelOp>
                            <Predicate>
                              <ScalarOperator ScalarString="has_access('CO',[TestDB].[sys].[syscolpars].[id])=(1)">
                                <Compare CompareOp="EQ">
                                  <ScalarOperator>
                                    <Intrinsic FunctionName="has_access">
                                      <ScalarOperator>
                                        <Const ConstValue="'CO'" />
                                      </ScalarOperator>
                                      <ScalarOperator>
                                        <Identifier>
                                          <ColumnReference Database="[TestDB]" Schema="[sys]" Table="[syscolpars]" Column="id" />
                                        </Identifier>
                                      </ScalarOperator>
                                      <ScalarOperator>
                                        <Const ConstValue="" />
                                      </ScalarOperator>
                                      <ScalarOperator>
                                        <Const ConstValue="" />
                                      </ScalarOperator>
                                    </Intrinsic>
                                  </ScalarOperator>
                                  <ScalarOperator>
                                    <Const ConstValue="(1)" />
                                  </ScalarOperator>
                                </Compare>
                              </ScalarOperator>
                            </Predicate>
                          </Filter>
                        </RelOp>
                        <RelOp AvgRowSize="9" EstimateCPU="0.0132712" EstimateIO="0.0749769" EstimateRebinds="0" EstimateRewinds="0" EstimateRows="1" LogicalOp="Index Scan" NodeId="6" Parallel="false" PhysicalOp="Index Scan" EstimatedTotalSubtreeCost="0.00329434" TableCardinality="11922">
                          <OutputList />
                          <IndexScan Ordered="false" ForcedIndex="false" ForceSeek="false" NoExpandHint="false">
                            <DefinedValues />
                            <Object Database="[mssqlsystemresource]" Schema="[sys]" Table="[syscolrdb]" Index="[ncl]" IndexKind="NonClustered" />
                            <Predicate>
                              <ScalarOperator ScalarString="[mssqlsystemresource].[sys].[syscolrdb].[number]=(0)">
                                <Compare CompareOp="EQ">
                                  <ScalarOperator>
                                    <Identifier>
                                      <ColumnReference Database="[mssqlsystemresource]" Schema="[sys]" Table="[syscolrdb]" Column="number" />
                                    </Identifier>
                                  </ScalarOperator>
                                  <ScalarOperator>
                                    <Const ConstValue="(0)" />
                                  </ScalarOperator>
                                </Compare>
                              </ScalarOperator>
                            </Predicate>
                          </IndexScan>
                        </RelOp>
                      </Concat>
                    </RelOp>
                  </Top>
                </RelOp>
              </ComputeScalar>
            </RelOp>
          </QueryPlan>
        </StmtSimple>
      </Statements>
    </Batch>
  </BatchSequence>
</ShowPlanXML>

On the line 19 or the Query you can see ConstExpr1065 under the node of <ScalarOperator ScalarString="rand()">. Each function which is in the XML plan expressed in the ColumnReference as ConstExpr is executed as constant function once per whole query.

Workaround for RAND function

As a workaround for generating the random numbers in a query, it is possible to use the NEWID() function. Although it the return value of the function is uniqueidentifier, every call to the function return a new unique value, which we can use a CHECKSUM() function over it which calculates checksum of an expression the checksum is an integer value. Because the value can be negative or positive it is necessary to use ABS() over it to receive consistent values.

So finally we can write following query:

SELECT TOP 10
    ABS(CHECKSUM(NEWID())) AS Random,
    ABS(CHECKSUM(NEWID())) % 100 AS Random_0_100,
    CAST(CHECKSUM(NEWID()) & 0x7fffffff AS float) / CAST(0x7fffffff as int)  AS RandomFloatValue
FROM sys.all_objects

And the result will be similar to the one below where each value is unique random number.

Random      Random_0_100 RandomFloatValue
----------- ------------ ----------------------
1060745673  73           0,290147927724825
1053300815  99           0,835112006792385
1233927720  13           0,551734537608798
1059863691  70           0,316186188401741
495435569   98           0,770220997636309
1364620416  82           0,404617617095177
783618590   58           0,161057420615599
1176654027  15           0,522251577825403
1871862272  0            0,295878297321442
1870125345  10           0,0176546154625968

(10 row(s) affected)

Remote Desktop Connection Manager

2011-01-05 Pavel Pawlowski Tools Connection Manager, Management, Remote Management, Server, Tools

If you are managing or monitoring a larger amount of servers, you can try Microsoft’s Remote Desktop Connection Manager. It’s a free tool and it’s size is only 789 kB. it does perfectly it’s job. You can group yours server, specify a default connection parameter for each group, and those override in sub groups and/or server level.

Once you connect to a sever, you can switch to another. If you select a group, you see a thumbnails of all the servers in the groups. it’s possible to set the thumbnail size and what more, the Remote Desktop Connection Manager allows live updates of the thumbnails, and you can manage the server even through the thumbnail as it’s really live.

This tool can very simplify your life of administrator as you see all the servers in one place, and you do not need to remember all the servers or need to save .rdp files for each server in some folder. You have a complete overview of your machines.

The Remote Desktop Connection Manager is supported on Windows 7, Windows Server 2003, Windows Server 2008, Windows Server 2008 R2 and Windows Vista. If you are using Windows XP or Windows Server 2003 you have to update to version 6 of Remote Desktop Connection client.

You can download it freely from Microsoft Download Center.

SQL Script Manager by RedGate

2011-01-04 Pavel Pawlowski SQL Server, Tools Manager, Script, SQL Server, T-SQL, Tools

On December 16. 2010 RedGate released a free tool for DBAs for managing and and running T-SQL or IronPython scripts. You can download the script manager directly from RedGate SQL Script Manager site.

The script manager allows you to manage collects scripts, and simplify their execution using Presets. Scripts are organized by Tags and Authors.

Script manager comes pre-bundled with 25 administration scripts and other scripts can be downloaded from SqlCentral.com scripts repository.

User can even create it’s own scripts. The scripts are encapsulated inside the .rgtool file, which is an XML file which structure is described on the RedGate support pages for the SQL Script Manager.

However it is interesting tool for collecting and managing scripts, I think, it brings nothing special what cannot be handled by the Templates and Templates Manager integrated in Management Studio (if talking about T-SQL Scripts). What more, SQL Management Studio Denali brings a new Code Snippets known from C# and VB .NET development in Visual Studio, which brings another power to the SSMS and from my point of view, writing the .rgtool is more complicated than writing a T-SQL template. On the other site it could be a useful tool for beginners DBAs.

Using Stored Procedure as Data Source in Excel

2011-01-04 Pavel Pawlowski Excel, SQL Server Connection, Excel, SQL Server, Stored procedure

When you want to use a SQL Server as external data source in excel, you can find, that it is not possible to select a stored procedure as a source of data in the Data Connection Wizard. This can lead to a conclusion, that it’s not possible to use a Stored Procedure as a source of data in Excel.

Although, you cannot select a stored procedure, in the Data Connection Wizard, there is a simple workaround who to select a stored procedure as a source of data in Excel.

First, create a connection using the wizard and select any table as a source for the newly created connection. Once the connection is created, choose properties for the connection and switch to the Definition Tab.

In the Definition type, change the Command Type to SQL and as Command text put the EXEC statement to execute the stored procedure. Once you enter, the Exec command, and confirm the dialog, you have fully configured connection to SQL Server which source is a Stored Procedure.

On the dialog confirmation a warning message appears telling you, that the connection in Excel and the external connection files doesn’t match and that the link will be removed. Simply confirm the message by Yes as the connection will be stored in the excel workbook.

Such modified connection you can use for importing data to Excel sheet, pivot table, pivot graph etc..

First report and report after specific time interval starts a long time on MS SQL 2008 Reporting Services

2011-01-03 Pavel Pawlowski SQL Server, SSRS Configuration, SQL Server, SSRS

Today an interesting article was published on MS SQL Tips: How to Speed up MS SQL Reporting Services on First Run.

It’s about solving a problem with slow loading of first report (or even accessing the report manager) in MS SQL 2005 Reporting Services. On SSRS 2005 the problem by default occurs every 20 minutes of idle time.

The same problem occurs also on MS SQL 2008 and MS SQL 2008R2 Reporting Services, but in 12 hours intervals. As Reporting Services 2008+ doesn’t use the IIS server and use it’s own integrated web service, solving the problem is quite different.

On SQL Server Reporting Services 2008 the problem occurs when an Application Domain is recycled and by default it’s 12 hours (720 minutes). After this time a new application domain is created an all new requests are redirected to the new application domain. Once the application domain is recycled, the new domain has to authenticate to the database server, load all necessary structures and this prolongs the first report startup after the domain is recycled.

If this default behavior (recycling every 12 hours) doesn’t meet your business needs as, you can modify the Report Server configuration file and specify a value which will be closer to your needs.

The configuration is stored in the rsreportserver.config file which is located in the installation directory of the Report Server instance. E.g. for default instance of SQL Server 2008 R2 Reporting Services it is located in ”C:Program FilesMicrosoft SQL ServerMSRS10_50.MSSQLSERVERReporting ServicesReportServer” directory.

The part which interests us is in the Service Node and it’s RecycleTime parameter:

  <Service>
    <IsSchedulingService>True</IsSchedulingService>
    <IsNotificationService>True</IsNotificationService>
    <IsEventService>True</IsEventService>
    <PollingInterval>10</PollingInterval>
    <WindowsServiceUseFileShareStorage>False</WindowsServiceUseFileShareStorage>
    <MemorySafetyMargin>80</MemorySafetyMargin>
    <MemoryThreshold>90</MemoryThreshold>
    <RecycleTime>720</RecycleTime>
    <MaxAppDomainUnloadTime>30</MaxAppDomainUnloadTime>
    <MaxQueueThreads>0</MaxQueueThreads>
    <UrlRoot>
    </UrlRoot>
    <UnattendedExecutionAccount>
      <UserName>
      </UserName>
      <Password>
      </Password>
      <Domain>
      </Domain>
    </UnattendedExecutionAccount>
    <PolicyLevel>rssrvpolicy.config</PolicyLevel>
    <IsWebServiceEnabled>True</IsWebServiceEnabled>
    <IsReportManagerEnabled>True</IsReportManagerEnabled>
    <FileShareStorageLocation>
      <Path>
      </Path>
    </FileShareStorageLocation>
  </Service>

By modifying the RecycleTime, you can shorten or prolong the recycling interval. The interval is specified in minutes and as mentioned above the default value is 720 minutes (12 hours).

Detail on the Application Domains for Report Server Applications can be found on MSDN. Details about the RSReportServer configuration file is also available on MSDN.

As mentioned in the documentations the MaxAppDomainUnloadTime is a time interval in minutes for which the Reporting Services are waiting for any current operation to finish during recycling the Application Domain. As It was mentioned during recycling Application Domain a new Domain is created and all new requests are directed to the new Application Domain and the current one is leave for the MaxAppDomainUnloadTime if there are currently some jobs running. If the jobs are not finished during the MaxAppDomainUnloadTime, the Application Domain is restarted and all jobs terminated.

If you have some some reports, especially scheduled one, which could take longer than the default 30 minutes to process, you can also consider to alter this parameter according to your needs. On the other side, if there is some job running for such a long time, it should be analyzed why it takes so long and whether it could be optimized in some way.

There is a lot of other interesting parameters in this configuration file, which could useful especially in the Scale-out SSRS Deployment scenarios. I will focus on some of them in one of the future posts.

For the overall workaround of this problem using PowerShell, see my later post Solving issue with first long starting report on SSRS 2008.

Pavel's SQL Pages

SQL Server & other stuff related

Month: January 2011

Automating CRUD procedures generation using T-SQL

CRUD procedures construction

Retrieving Object ID, Table Name, Schema Name and DB Name

Retrieving Table Field and Primary Key

Generating Columns and Variables Lists

Generating code of CRUD Procedures

sp_makeCRUD Final Form

Selecting random records from a table

Selecting random rows using NEWID()

Selecting approximate percentage sample of random rows

Selecting exact number of random rows

Selecting exact percentage sample of random rows

Selecting random number of random records

Complete comparison

Conclusion

Random numbers in query

Workaround for RAND function

Remote Desktop Connection Manager

SQL Script Manager by RedGate

Using Stored Procedure as Data Source in Excel

First report and report after specific time interval starts a long time on MS SQL 2008 Reporting Services

CRUD procedures construction

Retrieving Object ID, Table Name, Schema Name and DB Name

Retrieving Table Field and Primary Key

Generating Columns and Variables Lists

Generating code of CRUD Procedures

sp_makeCRUD Final Form

Share this:

Selecting random rows using NEWID()

Selecting approximate percentage sample of random rows

Selecting exact number of random rows

Selecting exact percentage sample of random rows

Selecting random number of random records

Complete comparison

Conclusion

Share this:

Workaround for RAND function

Share this:

Share this:

Share this:

Share this:

Share this: