iconv any encoding to UTF-8
Maybe you are looking for enca:
Enca is an Extremely Naive Charset Analyser. It detects character set and encoding of text files and can also convert them to other encodings using either a built-in converter or external libraries and tools like libiconv, librecode, or cstocs.
Currently it supports Belarusian, Bulgarian, Croatian, Czech, Estonian, Hungarian, Latvian, Lithuanian, Polish, Russian, Slovak, Slovene, Ukrainian, Chinese, and some multibyte encodings independently of language.
Note that in general, autodetection of the current encoding is a difficult process (the same byte sequence can be correct text in multiple encodings). enca uses heuristics based on the language you tell it to detect (to limit the number of encodings). You can use enconv to convert text files to a single encoding.
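To see why detection needs a language hint, here is a small Python sketch. Python is used only for illustration and is not part of enca, and the sample word is an arbitrary choice. The same bytes decode without error under several single-byte encodings, but only one result is the intended text:

```python
# Encode a Russian word as KOI8-R, then decode the same bytes three ways.
original = "Привет"
data = original.encode("koi8-r")

decoded = {}
for name in ("koi8-r", "cp1251", "iso-8859-1"):
    # None of these decodes raises: every byte in `data` is valid in each encoding.
    decoded[name] = data.decode(name)
    # ascii() escapes non-ASCII characters so the output prints on any terminal.
    print(f"{name:11} {ascii(decoded[name])}")

# Only the KOI8-R interpretation round-trips to the original word.
print(decoded["koi8-r"] == original)
```

Nothing in the bytes themselves says which decoding is right, which is why a tool has to fall back on language-specific heuristics.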
Force encode from US-ASCII to UTF-8 (iconv)
ASCII is a subset of UTF-8, so all ASCII files are already UTF-8 encoded. The bytes in the ASCII file and the bytes that would result from "encoding it to UTF-8" would be exactly the same bytes. There's no difference between them, so there's no need to do anything.
It looks like your problem is that the files are not actually ASCII. You need to determine what encoding they are using, and transcode them properly.
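A quick way to check both claims, sketched in Python (the helper name first_non_ascii and the sample strings are made up for this example):

```python
def first_non_ascii(data: bytes):
    """Return (offset, byte) of the first byte >= 0x80, or None if data is pure ASCII."""
    for offset, byte in enumerate(data):
        if byte >= 0x80:
            return offset, byte
    return None

ascii_text = "plain text, digits 123"
# For pure ASCII input, the ASCII and UTF-8 encoders produce identical bytes.
same_bytes = ascii_text.encode("ascii") == ascii_text.encode("utf-8")
print(same_bytes)

# A file that "should be ASCII" but contains a high byte is really something else.
print(first_non_ascii(b"plain"))
print(first_non_ascii(b"caf\xe9"))
```

If the helper reports an offset, the file is not ASCII, and the byte value at that offset is a first clue about the real encoding.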
php iconv character encoding from Thai ISO-IR-166 to utf-8 doesn't work
After all your help, I arrived at this result:
$convertedChar = iconv('ISO-IR-166', "UTF-8", utf8_decode('¤'));
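What that line does can be cross-checked outside PHP. The following is a Python sketch, not the PHP code itself: utf8_decode() turns the UTF-8 form of ¤ into the single ISO-8859-1 byte 0xA4, and that byte is then reinterpreted as Thai. It assumes Python's tis_620 codec, which is Python's name for the TIS-620 character set that ISO-IR-166 registers.

```python
char = "\u00a4"                    # the currency sign passed in the PHP line
raw = char.encode("iso-8859-1")    # what utf8_decode() produces: the single byte 0xA4
thai = raw.decode("tis_620")       # reinterpret that byte as TIS-620 (ISO-IR-166)
utf8_bytes = thai.encode("utf-8")  # the final UTF-8 representation

# ascii() escapes the Thai character so the output prints on any terminal.
print(ascii(thai), utf8_bytes)
```

The result is U+0E04 (THAI CHARACTER KHO KHWAI), which is what byte 0xA4 means in TIS-620.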
How Can I Convert From US-ASCII to UTF-8 with iconv?
So I figured it out. ColdFusion does need the BOM to work correctly, unless you want to put a <cfprocessingdirective pageencoding="utf-8"> tag at the top of each and every CFM file you may have non-ASCII characters in. Reference:
https://forums.adobe.com/thread/930550
https://www.adobe.com/support/coldfusion/internationalization/internationalization_cfmx/internationalization_cfmx3.html
I'm a Sublime user, so I simply went to File -> Save With Encoding, UTF-8 with BOM, and it works without the tag. I then became quite happy that I spend most of my days in Python 3!
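If there are many files to fix, the same thing can be scripted instead of re-saving each one in an editor. A minimal Python sketch, assuming the files are already valid UTF-8; the helper name ensure_utf8_bom and the sample page content are made up for this example:

```python
import codecs

def ensure_utf8_bom(data: bytes) -> bytes:
    """Prepend the UTF-8 BOM (bytes EF BB BF) unless the data already starts with it."""
    if data.startswith(codecs.BOM_UTF8):
        return data
    return codecs.BOM_UTF8 + data

# Hypothetical page content, encoded as UTF-8 without a BOM.
page = "<cfoutput>caf\u00e9</cfoutput>".encode("utf-8")
with_bom = ensure_utf8_bom(page)
print(with_bom[:3])

# Python's "utf-8-sig" codec writes the same BOM when encoding text.
print("x".encode("utf-8-sig") == codecs.BOM_UTF8 + b"x")
```

To apply it to a file, read the file in binary mode, pass the bytes through the helper, and write them back in binary mode.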
C: convert ISO-8859-1 to UTF-8 with libiconv
Please read the iconv_open(3) manual page carefully:
iconv_t iconv_open(const char *tocode, const char *fromcode);
The target encoding comes first and the source encoding second. If you're converting to UTF-8 from ISO 8859-1, then this call has the arguments the wrong way round:
iconv_t iconvDesc = iconv_open ("ISO-8859-1", "UTF-8//TRANSLIT//IGNORE");
It should say
iconv_t iconvDesc = iconv_open ("UTF-8//TRANSLIT//IGNORE", "ISO-8859-1");
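For comparison, here is the same conversion as a Python sketch. It uses Python's built-in codecs rather than libiconv, and the function name latin1_to_utf8 is made up for this example. The source and target encodings cannot be swapped by accident here because they go to different calls:

```python
def latin1_to_utf8(data: bytes) -> bytes:
    """Transcode ISO-8859-1 bytes to UTF-8 bytes."""
    # Source encoding goes to decode(), target encoding goes to encode().
    text = data.decode("iso-8859-1")
    return text.encode("utf-8")

latin1_bytes = b"caf\xe9"          # "café" in ISO-8859-1: one byte for é
utf8_bytes = latin1_to_utf8(latin1_bytes)
print(utf8_bytes)                  # é becomes the two bytes C3 A9
```

Every byte value is defined in Python's ISO-8859-1 codec, so this direction never fails; it is the reverse direction (UTF-8 to ISO-8859-1) that can hit characters with no equivalent.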
Convert files between UTF-8 and ISO-8859 on Linux
The ISO-8859-x encodings (ISO-8859-1 is Latin-1) each cover only a very limited set of characters, so you should prefer UTF-8 wherever you can to make life easier.
UTF-8 can encode all of Unicode, which is a superset of every ISO 8859 character set, so it is not surprising that a UTF-8 file cannot always be converted to ISO 8859.
It seems the file command gives only very limited information about a file's encoding.
You could try guessing the source encoding: ISO-8859-1, ISO-8859-15, or one of the others from 2 to 14, as suggested in the comment by @hobbs.
You can list the encodings iconv supports with iconv -l.
If guessing the real file encoding is giving you a hard time, this silly script might help you out :D
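The script itself is not reproduced here. As a rough stand-in, and not the author's script, the Python sketch below decodes the same bytes with several candidate encodings so a human can pick the result that reads correctly. The function name preview_decodings, the candidate list, and the sample bytes are all assumptions made for this example:

```python
def preview_decodings(data: bytes, candidates):
    """Decode data with each candidate; None marks an encoding that rejects the bytes."""
    results = {}
    for name in candidates:
        try:
            results[name] = data.decode(name)
        except UnicodeDecodeError:
            results[name] = None
    return results

# Hypothetical file content: byte 0xA4 means different things in different encodings.
sample = b"price: 5 \xa4"
previews = preview_decodings(sample, ["utf-8", "iso-8859-1", "iso-8859-15"])
for name, text in previews.items():
    # !a applies ascii(), so non-ASCII characters print as escapes on any terminal.
    print(f"{name:12} {text!a}")
```

Here UTF-8 rejects the bytes, ISO-8859-1 gives a generic currency sign, and ISO-8859-15 gives a euro sign. A decode that succeeds is not proof of the right encoding, so the final choice still needs a human look at the text.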