Recently I have discovered a bug in Snowlake Core Cloud Sevices query parser, which causes wrong
NULL handling in recursive CTE queriers.
Snowflake is already aware about that and confirmed that bug, but currently there is no ETA for this bug resolution. Problem should be addressed in some of future releases.
The problem is, that in case you have an
NULL constant in the anchor part of resursive CTE, then such field when used in an expression is always evaluated as
WITH RECURSE_CTE AS ( SELECT T.TABLE_COLUMN ,T.PARENT_TABLE_COLUMN ,NULL AS NULL_PARENT_TABLE_COLUMN ,1 AS LEVEL FROM A_TABLE T WHERE T.PARENT_NODE IS NULL UNION ALL SELECT T.TABLE_COLUMN ,T.PARENT_TABLE_COLUMN ,PT.TABLE_COLUMN AS NULL_PARENT_TABLE_COLUMN ,PT.LEVEL + 1 AS LEVEL FROM A_TABLE T INNER JOIN RECURSE_CTE PT ON PT.TABLE_NAME = TH.PARENT_TABLE_COLUMN ) SELECT TABLE_COLUMN ,PARENT_TABLE_COLUMN ,NULL_PARENT_TABLE_COLUMN ,LEVEL ,IFNULL(NULL_PARENT_TABLE_COLUMN, 'This is always evaluated as NULL') AS ALWAYS_NULL_IN_EXPRESSION FROM RECURSE_CTE
In the example above there is
NULL AS NUL_PARENT_TABLE_COLUMN in the anchor expression of the CTE. It is a
NULL constant in the the anchor part, but in the recursive part it is defined as
PT.TABLE_COLUMN AS NULL_PARENT_TABLE_COLUMN, thus for child records it is
NOT NULL. In final select, the
NULL_PARENT_TABLE_COLUMN field produces correct results when used as standalone field. But if the field is used in an expression like
IFNULL(NULL_PARENT_TABLE_COLUMN, 'This is always evaluated as NULL'), then such expression always treats the
Here is a complete sample with some demo data:
CREATE OR REPLACE TABLE TABLE_HIERARCHY ( TABLE_NAME varchar ,TABLE_LOAD_TYPE varchar ,PARENT_TABLE_NAME varchar ); INSERT INTO TABLE_HIERARCHY ( TABLE_NAME ,TABLE_LOAD_TYPE ,PARENT_TABLE_NAME ) VALUES ('A', 'MERGE', NULL) ,('B', 'MERGE', NULL) ,('C', 'INSERT', 'A') ,('D', 'MERGE', 'B') ,('E', 'INSERT', 'D'); --SELECT * FROM TABLE_HIERARCHY; WITH RECURSE_CTE AS ( SELECT TH.TABLE_NAME ,TH.TABLE_LOAD_TYPE ,TH.PARENT_TABLE_NAME ,NULL AS PARENT_TABLE_LOAD_TYPE -- NULL CONSTANT in anchor part of recursive CTE causes further ,CAST('' as varchar) AS PARENT_TABLE_LOAD_TYPE_NOT_NULL -- ,1 AS LEVEL FROM TABLE_HIERARCHY TH WHERE PARENT_TABLE_NAME IS NULL UNION ALL SELECT TH.TABLE_NAME ,TH.TABLE_LOAD_TYPE ,TH.PARENT_TABLE_NAME ,PT.TABLE_LOAD_TYPE AS PARENT_TABLE_LOAD_TYPE ,PT.TABLE_LOAD_TYPE AS PARENT_TABLE_LOAD_TYPE_NOT_NULL ,PT.LEVEL + 1 AS LEVEL FROM TABLE_HIERARCHY TH INNER JOIN RECURSE_CTE PT ON PT.TABLE_NAME = TH.PARENT_TABLE_NAME ) SELECT C.TABLE_NAME ,C.TABLE_LOAD_TYPE ,C.PARENT_TABLE_NAME ,C.PARENT_TABLE_LOAD_TYPE -- For Anchor members it returns null, for non-anchor it returns correct values ,C.PARENT_TABLE_LOAD_TYPE_NOT_NULL ,C.LEVEL ,IFNULL(C.PARENT_TABLE_LOAD_TYPE, 'ISNULL') AS WORNG_NULL_HANDLING -- if there is in the anchor part of the recuse CTE a NULL constant, all rows evaluate to NULL ,COALESCE(C.PARENT_TABLE_LOAD_TYPE, 'ISNULL') AS WRONG_NULL_HANDLING2 -- if there is in the anchor part of the recuse CTE a NULL constant, all rows evaluate to NULL ,LENGTH(C.PARENT_TABLE_LOAD_TYPE) AS WRONG_NULL_HANDLING3 -- if there is in the anchor part of the recuse CTE a NULL constant, all rows evaluate to NULL ,C.PARENT_TABLE_LOAD_TYPE || '_CONCAT' AS WRONG_NULL_HANDLING4 -- if there is in the anchor part of the recuse CTE a NULL constant, all rows evaluate to NULL ,IFNULL(C.PARENT_TABLE_NAME, 'ISNULL') AS CORRECT_HANDLING -- Non-null constants work fine ,C.PARENT_TABLE_LOAD_TYPE_NOT_NULL || '_CONCAT' AS CORRECT_HANDLING2 -- nulls retrieved from table work fine FROM RECURSE_CTE C;
Te wrong handling is visible in the below image which shows results of the above demo:
If the field which has
NULL constant in anchor part of recursive CTE is used in resultset as it is (out of expression), then its value is evaluated properly and proper result is returned. Problem is only, if such field is used in expression. Then it looks like Snowflake query parser looks on the CTE and if there is
NULL constant, it automatically returns
NULL for each row (to avoid per-row evaluation). This migh work for normal CTE, but does not work for recursive CTE as the child records might produce
Also there are no problem, if a physical table field is used in expression and that field contains
NULL. Problem is only with those
Unless the problem is addressed by Snowflake, the only workaround is to use a
non NULL constant in the anchor part of recursive CTE, but this might be problematic in some cases.