I have to read data from AS400 machine using an ODBC driver under SQL Server 2005. The data on the AS400 machine are encoded using EBCDIC 870 – Multilingual/ROECE (Latin-2) code page and contains Czech diacritics.
Unfortunately,SQL Server 2005 have no native support for EBCDIC 870 code page and when you read the data, you receive wrong characters.
There are some EBCDIC character sets supported by SQL server, but not the 870 one.
select * from fn_helpcollations()
I was googling and binging for some time, but didn’t find any suitable solution, if you need to select live data from the AS400 machine.
Finally I was looking deeper into to the problem and found an acceptable work around.
The ODBC driver and AS400 support HEX() function, which returns sequence of hexadecimal codes representing the value.
SELECT * FROM OPENQUERY(AS400, 'SELECT HEX(FieldName) AS FieldName FROM AS400Table')
You can crate an CLR function which will allow you to convert the source CBCDIC 870 code page to the destination code page of you choice.
[SqlFunction()] public static SqlString fnEBCDIC870ToSql(SqlString hexBytes, int destinationCodePage) { if (hexBytes.IsNull) return hexBytes; string str = hexBytes.ToString(); byte[] buffer = new byte[str.Length / 2]; Encoding destinationEncoding = Encoding.GetEncoding(destinationCodePage); for (int i = 0; i < buffer.Length; i++) { buffer[i] = byte.Parse(str.Substring(i * 2, 2), NumberStyles.HexNumber); } buffer = Encoding.Convert(Encoding.GetEncoding(870), destinationEncoding, buffer); str = destinationEncoding.GetString(buffer); return new SqlString(str); }
If you cannot use the CLR function on the SQL server or Database, you can simply create an Conversion table and conversion scalar function for this conversion.
for conversion to code page 1250:
CREATE FUNCTION [dbo].[fnEBCDIC870ToWin1250] ( @pHex varchar(8000) ) RETURNS varchar(4000) AS BEGIN DECLARE @result varchar(4000) SET @result = ''; WITH Tally AS ( SELECT ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) As RowNo FROM sys.all_columns ) SELECT @result = @result + ISNULL(C.Character, '?') FROM Tally LEFT JOIN CodePage870 C ON UPPER(SUBSTRING(@pHex, (RowNo - 1) * 2 + 1, 2)) = C.HexByte WHERE RowNo <= LEN(@pHex) / 2 ORDER BY RowNo RETURN @result END
The function uses a conversion table CodePage870.
CREATE TABLE CodePage870 ( HexByte char(2) NOT NULL, Character char(1) NOT NULL, CONSTRAINT PK_CodePage870 PRIMARY KEY (HexByte) ) INSERT CodePage870 ( HexByte, Character ) SELECT '40' As HexByte, ' ' As Character UNION ALL SELECT '41' As HexByte, ' ' As Character UNION ALL SELECT '42' As HexByte, 'â' As Character UNION ALL SELECT '43' As HexByte, 'ä' As Character UNION ALL SELECT '44' As HexByte, 'ţ' As Character UNION ALL SELECT '45' As HexByte, 'á' As Character UNION ALL SELECT '46' As HexByte, 'ă' As Character UNION ALL SELECT '47' As HexByte, 'č' As Character UNION ALL SELECT '48' As HexByte, 'ç' As Character UNION ALL SELECT '49' As HexByte, 'ć' As Character UNION ALL SELECT '4A' As HexByte, '[' As Character UNION ALL SELECT '4B' As HexByte, '.' As Character UNION ALL SELECT '4C' As HexByte, '<' As Character UNION ALL SELECT '4D' As HexByte, '(' As Character UNION ALL SELECT '4E' As HexByte, '+' As Character UNION ALL SELECT '4F' As HexByte, '!' As Character UNION ALL SELECT '50' As HexByte, '&' As Character UNION ALL SELECT '51' As HexByte, 'é' As Character UNION ALL SELECT '52' As HexByte, 'ę' As Character UNION ALL SELECT '53' As HexByte, 'ë' As Character UNION ALL SELECT '54' As HexByte, 'ů' As Character UNION ALL SELECT '55' As HexByte, 'í' As Character UNION ALL SELECT '56' As HexByte, 'î' As Character UNION ALL SELECT '57' As HexByte, 'ľ' As Character UNION ALL SELECT '58' As HexByte, 'ĺ' As Character UNION ALL SELECT '59' As HexByte, 'ß' As Character UNION ALL SELECT '5A' As HexByte, ']' As Character UNION ALL SELECT '5B' As HexByte, '$' As Character UNION ALL SELECT '5C' As HexByte, '*' As Character UNION ALL SELECT '5D' As HexByte, ')' As Character UNION ALL SELECT '5E' As HexByte, ';' As Character UNION ALL SELECT '5F' As HexByte, '^' As Character UNION ALL SELECT '60' As HexByte, '-' As Character UNION ALL SELECT '61' As HexByte, '/' As Character UNION ALL SELECT '62' As HexByte, 'Â' As Character UNION ALL SELECT '63' As HexByte, 'Ä' As Character UNION ALL SELECT '64' As HexByte, '˝' As Character UNION ALL SELECT '65' As HexByte, 'Á' As Character UNION ALL SELECT '66' As HexByte, 'Ă' As Character UNION ALL SELECT '67' As HexByte, 'Č' As Character UNION ALL SELECT '68' As HexByte, 'Ç' As Character UNION ALL SELECT '69' As HexByte, 'Ć' As Character UNION ALL SELECT '6A' As HexByte, '|' As Character UNION ALL SELECT '6B' As HexByte, ',' As Character UNION ALL SELECT '6C' As HexByte, '%' As Character UNION ALL SELECT '6D' As HexByte, '_' As Character UNION ALL SELECT '6E' As HexByte, '>' As Character UNION ALL SELECT '6F' As HexByte, '?' As Character UNION ALL SELECT '70' As HexByte, 'ˇ' As Character UNION ALL SELECT '71' As HexByte, 'É' As Character UNION ALL SELECT '72' As HexByte, 'Ę' As Character UNION ALL SELECT '73' As HexByte, 'Ë' As Character UNION ALL SELECT '74' As HexByte, 'Ů' As Character UNION ALL SELECT '75' As HexByte, 'Í' As Character UNION ALL SELECT '76' As HexByte, 'Î' As Character UNION ALL SELECT '77' As HexByte, 'Ľ' As Character UNION ALL SELECT '78' As HexByte, 'Ĺ' As Character UNION ALL SELECT '79' As HexByte, '`' As Character UNION ALL SELECT '7A' As HexByte, ':' As Character UNION ALL SELECT '7B' As HexByte, '#' As Character UNION ALL SELECT '7C' As HexByte, '@' As Character UNION ALL SELECT '7D' As HexByte, '''' As Character UNION ALL SELECT '7E' As HexByte, '=' As Character UNION ALL SELECT '7F' As HexByte, '"' As Character UNION ALL SELECT '80' As HexByte, '˘' As Character UNION ALL SELECT '81' As HexByte, 'a' As Character UNION ALL SELECT '82' As HexByte, 'b' As Character UNION ALL SELECT '83' As HexByte, 'c' As Character UNION ALL SELECT '84' As HexByte, 'd' As Character UNION ALL SELECT '85' As HexByte, 'e' As Character UNION ALL SELECT '86' As HexByte, 'f' As Character UNION ALL SELECT '87' As HexByte, 'g' As Character UNION ALL SELECT '88' As HexByte, 'h' As Character UNION ALL SELECT '89' As HexByte, 'i' As Character UNION ALL SELECT '8A' As HexByte, 'ś' As Character UNION ALL SELECT '8B' As HexByte, 'ň' As Character UNION ALL SELECT '8C' As HexByte, 'đ' As Character UNION ALL SELECT '8D' As HexByte, 'ý' As Character UNION ALL SELECT '8E' As HexByte, 'ř' As Character UNION ALL SELECT '8F' As HexByte, 'ş' As Character UNION ALL SELECT '90' As HexByte, '°' As Character UNION ALL SELECT '91' As HexByte, 'j' As Character UNION ALL SELECT '92' As HexByte, 'k' As Character UNION ALL SELECT '93' As HexByte, 'l' As Character UNION ALL SELECT '94' As HexByte, 'm' As Character UNION ALL SELECT '95' As HexByte, 'n' As Character UNION ALL SELECT '96' As HexByte, 'o' As Character UNION ALL SELECT '97' As HexByte, 'p' As Character UNION ALL SELECT '98' As HexByte, 'q' As Character UNION ALL SELECT '99' As HexByte, 'r' As Character UNION ALL SELECT '9A' As HexByte, 'ł' As Character UNION ALL SELECT '9B' As HexByte, 'ń' As Character UNION ALL SELECT '9C' As HexByte, 'š' As Character UNION ALL SELECT '9D' As HexByte, '¸' As Character UNION ALL SELECT '9E' As HexByte, '˛' As Character UNION ALL SELECT '9F' As HexByte, '¤' As Character UNION ALL SELECT 'A0' As HexByte, 'ą' As Character UNION ALL SELECT 'A1' As HexByte, '~' As Character UNION ALL SELECT 'A2' As HexByte, 's' As Character UNION ALL SELECT 'A3' As HexByte, 't' As Character UNION ALL SELECT 'A4' As HexByte, 'u' As Character UNION ALL SELECT 'A5' As HexByte, 'v' As Character UNION ALL SELECT 'A6' As HexByte, 'w' As Character UNION ALL SELECT 'A7' As HexByte, 'x' As Character UNION ALL SELECT 'A8' As HexByte, 'y' As Character UNION ALL SELECT 'A9' As HexByte, 'z' As Character UNION ALL SELECT 'AA' As HexByte, 'Ś' As Character UNION ALL SELECT 'AB' As HexByte, 'Ň' As Character UNION ALL SELECT 'AC' As HexByte, 'Đ' As Character UNION ALL SELECT 'AD' As HexByte, 'Ý' As Character UNION ALL SELECT 'AE' As HexByte, 'Ř' As Character UNION ALL SELECT 'AF' As HexByte, 'Ş' As Character UNION ALL SELECT 'B0' As HexByte, '˙' As Character UNION ALL SELECT 'B1' As HexByte, 'Ą' As Character UNION ALL SELECT 'B2' As HexByte, 'ż' As Character UNION ALL SELECT 'B3' As HexByte, 'Ţ' As Character UNION ALL SELECT 'B4' As HexByte, 'Ż' As Character UNION ALL SELECT 'B5' As HexByte, '§' As Character UNION ALL SELECT 'B6' As HexByte, 'ž' As Character UNION ALL SELECT 'B7' As HexByte, 'ź' As Character UNION ALL SELECT 'B8' As HexByte, 'Ž' As Character UNION ALL SELECT 'B9' As HexByte, 'Ź' As Character UNION ALL SELECT 'BA' As HexByte, 'Ł' As Character UNION ALL SELECT 'BB' As HexByte, 'Ń' As Character UNION ALL SELECT 'BC' As HexByte, 'Š' As Character UNION ALL SELECT 'BD' As HexByte, '¨' As Character UNION ALL SELECT 'BE' As HexByte, '´' As Character UNION ALL SELECT 'BF' As HexByte, '×' As Character UNION ALL SELECT 'C0' As HexByte, '{' As Character UNION ALL SELECT 'C1' As HexByte, 'A' As Character UNION ALL SELECT 'C2' As HexByte, 'B' As Character UNION ALL SELECT 'C3' As HexByte, 'C' As Character UNION ALL SELECT 'C4' As HexByte, 'D' As Character UNION ALL SELECT 'C5' As HexByte, 'E' As Character UNION ALL SELECT 'C6' As HexByte, 'F' As Character UNION ALL SELECT 'C7' As HexByte, 'G' As Character UNION ALL SELECT 'C8' As HexByte, 'H' As Character UNION ALL SELECT 'C9' As HexByte, 'I' As Character UNION ALL SELECT 'CA' As HexByte, '' As Character UNION ALL SELECT 'CB' As HexByte, 'ô' As Character UNION ALL SELECT 'CC' As HexByte, 'ö' As Character UNION ALL SELECT 'CD' As HexByte, 'ŕ' As Character UNION ALL SELECT 'CE' As HexByte, 'ó' As Character UNION ALL SELECT 'CF' As HexByte, 'ő' As Character UNION ALL SELECT 'D0' As HexByte, '}' As Character UNION ALL SELECT 'D1' As HexByte, 'J' As Character UNION ALL SELECT 'D2' As HexByte, 'K' As Character UNION ALL SELECT 'D3' As HexByte, 'L' As Character UNION ALL SELECT 'D4' As HexByte, 'M' As Character UNION ALL SELECT 'D5' As HexByte, 'N' As Character UNION ALL SELECT 'D6' As HexByte, 'O' As Character UNION ALL SELECT 'D7' As HexByte, 'P' As Character UNION ALL SELECT 'D8' As HexByte, 'Q' As Character UNION ALL SELECT 'D9' As HexByte, 'R' As Character UNION ALL SELECT 'DA' As HexByte, 'Ě' As Character UNION ALL SELECT 'DB' As HexByte, 'ű' As Character UNION ALL SELECT 'DC' As HexByte, 'ü' As Character UNION ALL SELECT 'DD' As HexByte, 'ť' As Character UNION ALL SELECT 'DE' As HexByte, 'ú' As Character UNION ALL SELECT 'DF' As HexByte, 'ě' As Character UNION ALL SELECT 'E0' As HexByte, '' As Character UNION ALL SELECT 'E1' As HexByte, '÷' As Character UNION ALL SELECT 'E2' As HexByte, 'S' As Character UNION ALL SELECT 'E3' As HexByte, 'T' As Character UNION ALL SELECT 'E4' As HexByte, 'U' As Character UNION ALL SELECT 'E5' As HexByte, 'V' As Character UNION ALL SELECT 'E6' As HexByte, 'W' As Character UNION ALL SELECT 'E7' As HexByte, 'X' As Character UNION ALL SELECT 'E8' As HexByte, 'Y' As Character UNION ALL SELECT 'E9' As HexByte, 'Z' As Character UNION ALL SELECT 'EA' As HexByte, 'ď' As Character UNION ALL SELECT 'EB' As HexByte, 'Ô' As Character UNION ALL SELECT 'EC' As HexByte, 'Ö' As Character UNION ALL SELECT 'ED' As HexByte, 'Ŕ' As Character UNION ALL SELECT 'EE' As HexByte, 'Ó' As Character UNION ALL SELECT 'EF' As HexByte, 'Ő' As Character UNION ALL SELECT 'F0' As HexByte, '0' As Character UNION ALL SELECT 'F1' As HexByte, '1' As Character UNION ALL SELECT 'F2' As HexByte, '2' As Character UNION ALL SELECT 'F3' As HexByte, '3' As Character UNION ALL SELECT 'F4' As HexByte, '4' As Character UNION ALL SELECT 'F5' As HexByte, '5' As Character UNION ALL SELECT 'F6' As HexByte, '6' As Character UNION ALL SELECT 'F7' As HexByte, '7' As Character UNION ALL SELECT 'F8' As HexByte, '8' As Character UNION ALL SELECT 'F9' As HexByte, '9' As Character UNION ALL SELECT 'FA' As HexByte, 'Ď' As Character UNION ALL SELECT 'FB' As HexByte, 'Ű' As Character UNION ALL SELECT 'FC' As HexByte, 'Ü' As Character UNION ALL SELECT 'FD' As HexByte, 'Ť' As Character UNION ALL SELECT 'FE' As HexByte, 'Ú' As Character UNION ALL SELECT 'FF' As HexByte, '?' As Character
The Script was generated by simple C# method.
public static void CreateEBCDICScript(string destinationFileName, int destinationCodePage) { StringBuilder sb = new StringBuilder(); byte[] buffer = new byte[1]; Encoding ebcdic = Encoding.GetEncoding(870); Encoding destinationEncoding = Encoding.GetEncoding(destinationCodePage); for (int i = 64; i < 256; i++) { buffer[0] = (byte)i; buffer = Encoding.Convert(ebcdic, destinationEncoding, buffer); string target = destinationEncoding.GetString(buffer); sb.AppendFormat("SELECT '{0:X}' AS HexByte, '{1}' AS Character UNION ALL", i, target.Equals("'") ? "''" : target); sb.AppendLine(); } File.WriteAllText(destinationFileName, sb.ToString(); }
Finally as you have the conversion functions in database you can use below selects to receive correctly encoded strings from within the AS400 machine.
--CLR Version SELECT fnECDIC870ToSql(FieldName, 1250) As FieldName FROM OPENQUERY(AS400, 'SELECT HEX(FieldName) AS FieldName FROM AS400Table') --T-SQL Version SELECT fnECDIC870ToWin1250(FieldName) As FieldName FROM OPENQUERY(AS400, 'SELECT HEX(FieldName) AS FieldName FROM AS400Table')