Conversion from EBCDIC to UTF-8 in Linux

It's simple with iconv.

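iconv -f ISO8859-1 -t UTF-8 result.csv -o new_result.csv

The -f option names the source encoding; ISO8859-1 is the Latin-1 encoding. If the file is still in EBCDIC, pass an EBCDIC code page instead, for example IBM037 or IBM1047 (run iconv -l to see the names your system supports). For a list of encodings, refer to this table from the official IBM documentation: https://www.ibm.com/support/knowledgecenter/ssw_aix_53/com.ibm.aix.nls/doc/nlsgdrf/iconv.htm%23d722e3a267mela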
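
For data that really is EBCDIC, the same kind of conversion can be sketched in Python. This is only an illustration: the function name ebcdic_to_utf8 is made up here, and the code page cp037 (US/Canada EBCDIC) is an assumption, so substitute whichever EBCDIC code page your data actually uses.

```python
# Assumption: the input is EBCDIC code page 037. Python also ships other
# EBCDIC codecs (for example cp500 and cp1140); pick the one matching your data.
def ebcdic_to_utf8(data: bytes, codepage: str = "cp037") -> bytes:
    """Decode EBCDIC bytes and re-encode the text as UTF-8."""
    return data.decode(codepage).encode("utf-8")


sample = bytes.fromhex("C885939396")  # the word "Hello" in cp037
print(ebcdic_to_utf8(sample))         # b'Hello'
```

Decoding with the wrong code page usually does not raise an error, it just yields wrong characters, so it is worth checking a few known values after the conversion.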

Note that the conversion may carry over characters from the EBCDIC data that are unwanted in the UTF-8 output. One example is NULL characters inside strings. To get rid of them, use a hex editor to replace each 00 byte with 20 (the space character).
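
If a hex editor is not convenient, the same replacement can be scripted. The sketch below assumes the data has already been converted to UTF-8; the function name nul_to_space is hypothetical.

```python
def nul_to_space(utf8_bytes: bytes) -> bytes:
    """Replace every 0x00 byte with 0x20 (space).

    This is safe on UTF-8 data because a 0x00 byte only ever encodes
    U+0000; it never appears inside a multi-byte sequence.
    """
    return utf8_bytes.replace(b"\x00", b"\x20")


print(nul_to_space(b"A\x00B"))  # b'A B'
```

Do not apply this to the raw EBCDIC input and expect the same result: there the space character is 0x40, not 0x20.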

Re: Data Conversion from EBCDIC to UTF-8

Thank you for your responses.

I tried a different approach: I generated the XML document with code page 1208 (UTF-8) specified, using WITH ENCODING 1208, and that resolved the issue.

With that encoding I was able to insert my data into the pureXML table.

Is UTF to EBCDIC Conversion lossless?

If an MQ message has varying sub-message fields that require different encodings, then that's how you should handle those messages, i.e., as separate message pieces.

But as you describe this, the entire message needs to be received without conversion. The first eight bytes need to be extracted and held separately. The remainder of the message can then have its encoding converted (unless other sub-fields also need to be extracted as binary, unconverted bytes).

For any return message, the opposite conversion must be done. The text portion of the message can be converted, and then that sub-string can have the original eight bytes prepended to it. The newly reconstructed message then can be sent back through the queue, again without automatic conversion.
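
The split-and-reassemble handling described above could look roughly like this. It is a sketch only: it assumes the message was retrieved without automatic conversion (the MQ call itself is not shown), that the text portion is in EBCDIC code page cp037, and the function names are invented for illustration.

```python
from typing import Tuple

HEADER_LEN = 8  # the first eight bytes are binary and must not be converted


def split_message(raw: bytes, codepage: str = "cp037") -> Tuple[bytes, str]:
    """Hold the binary header aside and decode only the text portion."""
    header, body = raw[:HEADER_LEN], raw[HEADER_LEN:]
    return header, body.decode(codepage)


def build_reply(header: bytes, text: str, codepage: str = "cp037") -> bytes:
    """Encode the text portion and prepend the untouched binary header."""
    if len(header) != HEADER_LEN:
        raise ValueError("header must be exactly 8 bytes")
    return header + text.encode(codepage)


raw = bytes(range(8)) + "HELLO".encode("cp037")
header, text = split_message(raw)
print(text)                              # HELLO
print(build_reply(header, text) == raw)  # True
```

The reply built this way would then be put on the queue with conversion turned off, as described above.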

Your partner on the other end is not using the messaging product correctly. (Of course, you probably shouldn't say that out loud.) There should be no part of such a message that cannot automatically survive intact across both directions. Instead of an 8-byte binary field, the value could be carried as, for example, a 16-character hex representation of the 8 bytes. As hex text, it would have no conversion problem in either direction across the route.
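
To illustrate why a hex representation survives conversion, here is a small sketch. The helper names are hypothetical, and cp037 again stands in for whichever EBCDIC code page is actually in use.

```python
def to_hex_field(value: bytes) -> str:
    """Render binary bytes as hex text (two characters per byte)."""
    return value.hex().upper()


def from_hex_field(field: str) -> bytes:
    """Recover the original bytes from the hex text."""
    return bytes.fromhex(field)


token = bytes([0x00, 0x1F, 0x40, 0x7F, 0x80, 0xC1, 0xFE, 0xFF])
field = to_hex_field(token)
print(field)  # 001F407F80C1FEFF

# Simulate the route: the text is converted to EBCDIC and back again.
as_ebcdic = field.encode("cp037")
round_tripped = as_ebcdic.decode("cp037")
print(from_hex_field(round_tripped) == token)  # True
```

Because the field contains only the characters 0-9 and A-F, which every common code page can represent, automatic conversion cannot damage it.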


