dos2unix: Binary symbol found, skipping binary file
The ^@ is Vim's representation of a null byte; cf. :help <Nul>.
Ordinary text files do not contain null characters. Binary files typically contain many of them and would be corrupted if converted as a whole; that is why dos2unix refuses to convert the file.
You have several options:
- That null character may have been inserted by accident or is garbage. Edit the file (in Vim) or recreate it. If you're using Vim, you can do the conversion in it as well (via :help ++ff, e.g. :w ++ff=unix). Command-line tools like dos2unix still have their use for non-interactive invocations.
- That null character belongs there. The dos2unix command has a -f|--force option to enforce conversion.
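To decide between the two options, it can help to know how many null bytes the file actually contains. The following is a minimal sketch using only tr and wc; the function name count_nul is made up for this illustration and is not part of dos2unix or Vim.

```shell
# count_nul FILE: print the number of NUL (0x00) bytes in FILE.
# (count_nul is a hypothetical helper name, not a standard tool.)
count_nul() {
    # tr -cd '\000' deletes every byte except NUL; wc -c counts what is left.
    n=$(LC_ALL=C tr -cd '\000' < "$1" | wc -c)
    # Arithmetic expansion strips any padding some wc implementations add.
    echo $((n))
}
```

A single stray null byte in an otherwise readable file points to the first option (fix the file); a large count suggests the file really is binary, or is text in an encoding such as UTF-16.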
dos2unix: Binary symbol 0x04 found at line 1703
That 0x0004 character you are seeing in your file has nothing at all to do with the BOM (which is fine, by the way) -- it's an EOT (End of Transmission) character from the C0 control set, and has been at that codepoint since 7-bit ASCII was the new hotness. (It's also the familiar Control-D Unix EOF sequence.)
Unfortunately, the pre-dos2unix way of applying tr to the file to strip the carriage returns won't work directly, since the file is UTF-16. Since iconv works for you, though, you can use it to convert the file to UTF-8 (which tr will work on), and then run this tr command:
tr -d '\r' < crs_2013_data_temp.txt > crs_2013_data_unix.txt
in order to get the text file into the Unix line ending convention. You will have to keep an eye on whatever tools you're feeding the file to, though, to make sure that they don't choke on the Ctrl-D/EOT character; if they do, you can use
tr -d '\004' < crs_2013_data_unix.txt > crs_2013_data_clean.txt
to get rid of it.
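Put together, the steps above could be sketched as two small shell functions. The function names and the file names in the usage note are placeholders invented for this sketch; the encoding name UTF-16 is taken from the answer and may need adjusting (for example to UTF-16LE) depending on your file and your iconv.

```shell
# utf16_to_unix IN OUT: convert a UTF-16 file with CRLF line endings
# into a UTF-8 file with LF line endings.
# (Hypothetical helper; assumes IN really is UTF-16.)
utf16_to_unix() {
    # In UTF-8 the byte 0x0D only ever encodes CR, so deleting it
    # byte-wise with tr is safe once iconv has done the conversion.
    iconv -f UTF-16 -t UTF-8 "$1" | tr -d '\r' > "$2"
}

# strip_eot IN OUT: drop every EOT (Ctrl-D, octal 004) byte.
strip_eot() {
    tr -d '\004' < "$1" > "$2"
}
```

Usage would look like utf16_to_unix input_utf16.txt data_unix.txt, followed by strip_eot data_unix.txt data_clean.txt only if a downstream tool chokes on the EOT character.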
As to how it got there in the first place? I blame the Belgians for letting it sneak into the data they gave the OECD, which they probably keyed in with cat - > file or some other similarly underwhelming means. Also, some text editors try to be a bit too helpful by hiding control characters, even though other tools will bail out when they see them, as they think you just stuffed in a binary file that was pretending to be text for a while.
dos2unix modifies binary files - why
This is the relevant part of the dos2unix source code:
    if ((ipFlag->Force == 0) &&
        (TempChar < 32) &&
        (TempChar != 0x0a) &&  /* Not an LF */
        (TempChar != 0x0d) &&  /* Not a CR */
        (TempChar != 0x09) &&  /* Not a TAB */
        (TempChar != 0x0c)) {  /* Not a form feed */
      RetVal = -1;
      ipFlag->status |= BINARY_FILE ;
      if (ipFlag->verbose) {
        if ((ipFlag->stdio_mode) && (!ipFlag->error)) ipFlag->error = 1;
        d2u_fprintf(stderr, "%s: ", progname);
        d2u_fprintf(stderr, _("Binary symbol 0x00%02X found at line %u\n"),TempChar, line_nr);
      }
      break;
    }
It seems that if the file contains any other control character (a byte below 32 that is not LF, CR, TAB, or form feed), it is considered a binary file and is skipped; otherwise it is processed as a text file. So a binary file (e.g. an image) that happens not to contain any of these characters will be converted, and therefore corrupted.
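The same rule can be approximated outside dos2unix to predict whether a file would be skipped. This is only a sketch of the condition quoted above, not dos2unix's own code path; the function name count_binary_symbols is invented for the example.

```shell
# count_binary_symbols FILE: print how many bytes in FILE are below 32
# and are not TAB (011), LF (012), form feed (014), or CR (015).
# (Hypothetical helper that mirrors the quoted condition.)
count_binary_symbols() {
    # Delete the four allowed control bytes and every byte >= 32 (040-377);
    # whatever survives is what the quoted check calls a "binary symbol".
    n=$(LC_ALL=C tr -d '\011\012\014\015\040-\377' < "$1" | wc -c)
    echo $((n))
}
```

A result of 0 means the quoted check would not flag the file, even if it is not text at all: a file made only of bytes at or above 128 plus CR and LF passes, which is exactly the corruption risk described above.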
How to strip binary characters from a file?
There's something called ansifilter which does exactly this. I tested it out on my file and it works.
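If installing another tool is not an option, a rough alternative is to delete control bytes with tr. This is not how ansifilter works and it is not equivalent to it; it is a minimal sketch that assumes "binary characters" means the control bytes below 32 other than TAB, LF, form feed, and CR. The function name strip_control is made up for the example.

```shell
# strip_control: filter stdin to stdout, deleting control bytes below 32
# except TAB (011), LF (012), form feed (014), and CR (015).
# (Hypothetical helper; an assumption about what should be stripped.)
strip_control() {
    LC_ALL=C tr -d '\000-\010\013\016-\037'
}
```

It is used as a filter, e.g. strip_control < input.txt > output.txt, where both file names are placeholders. Bytes at or above 32 are left untouched, so UTF-8 text passes through unchanged.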
dos2unix doesn't convert the env file even with -f option
Put the -f option first, then the file name:
sudo dos2unix -f env
Unusual ./configure error when building GDAL 2.0.0 from source
./configure seems sensitive to the source being in a sub-directory inside a VM shared folder (vmhgfs):
.host:/adam 500105212 141512588 358592624 29% /mnt/adam
In /mnt/adam/gdal-2.0.0, ./configure works correctly.
In ~/adam/gdal/gdal-2.0.0, ./configure works correctly.
However, in /mnt/adam/gdal/gdal-2.0.0, ./configure fails with the error in question.
I can only assume this is some Unix permissions issue or similar.