How to grep for presence of specific hex bytes in files?
Check the post again. FrOsT is not including the '<' and '>' in his actual grep command. He only used the angle brackets to enclose an example statement. His actual statement looks like this:
"\x01\x02"
not:
"<\x01\x02>"
I have a C source file on my computer that begins with the line:
#include <stdio.h>
When I run
grep -obUaP '\x69\x6E\x63\x6C\x75\x64\x65' io.c
I get
1:include
That is, the byte offset of the match (because of -b) followed by only the string matching the pattern (because of -o).
You may want to run
man grep
and find out what all those options mean.
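To make the example above reproducible, here is a minimal sketch that builds its own sample file. The file name io_demo.c and the temporary directory are assumptions for illustration, and -P requires a GNU grep built with PCRE support.

```shell
# Hypothetical sample file; io_demo.c is an illustrative name, not from the post.
tmpdir=$(mktemp -d)
printf '#include <stdio.h>\nint main(void) { return 0; }\n' > "$tmpdir/io_demo.c"

# -o: print only the matching part      -b: print the byte offset of the match
# -U: treat the file as binary          -a: process binary data as if it were text
# -P: Perl-compatible regex, which understands \xNN escapes (GNU grep)
# LC_ALL=C makes grep work on raw bytes rather than multibyte characters.
result=$(LC_ALL=C grep -obUaP '\x69\x6E\x63\x6C\x75\x64\x65' "$tmpdir/io_demo.c")
echo "$result"

rm -rf "$tmpdir"
```

The word include starts right after the '#', so the reported offset is 1.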
Using grep to search for hex strings in a file
We tried several things before arriving at an acceptable solution:
xxd -u /usr/bin/xxd | grep 'DF'
00017b0: 4010 8D05 0DFF FF0A 0300 53E3 0610 A003 @.........S.....
root# grep -ibH "df" /usr/bin/xxd
Binary file /usr/bin/xxd matches
xxd -u /usr/bin/xxd | grep -H 'DF'
(standard input):00017b0: 4010 8D05 0DFF FF0A 0300 53E3 0610 A003 @.........S.....
Then found we could get usable results with
xxd -u /usr/bin/xxd > /tmp/xxd.hex ; grep -H 'DF' /tmp/xxd.hex
Note that using a simple search target like 'DF' will incorrectly match characters that span across byte boundaries, i.e.
xxd -u /usr/bin/xxd | grep 'DF'
00017b0: 4010 8D05 0DFF FF0A 0300 53E3 0610 A003 @.........S.....
--------------------^^
So we use an ORed regexp to search for ' DF' OR 'DF ' (the searchTarget preceded or followed by a space char).
The final result seems to be
xxd -u DumpFile > DumpFile.hex
grep -E ' DF|DF ' DumpFile.hex
0001020: 0089 0424 8D95 D8F5 FFFF 89F0 E8DF F6FF ...$............
-----------------------------------------^^
0001220: 0C24 E871 0B00 0083 F8FF 89C3 0F84 DF03 .$.q............
--------------------------------------------^^
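Another way to avoid matches that straddle byte boundaries is to print one byte per line and match whole lines only. The sketch below swaps xxd for the POSIX od utility, and the function name find_byte_offsets is made up for this example. It prints the zero-based offset of every occurrence of a given byte.

```shell
# find_byte_offsets FILE HEXBYTE
# Prints the zero-based offset of each byte equal to HEXBYTE (two lowercase hex digits).
find_byte_offsets() {
    # od -An -v -tx1: no address column, no '*' line squeezing, one hex pair per byte
    od -An -v -tx1 "$1" |
        tr -s ' ' '\n' |      # put every hex pair on its own line
        grep -v '^$' |        # drop the empty line caused by od's leading space
        awk -v b="$2" '$0 == b { print NR - 1 }'
}

# Demo: bytes 0D FF DF 10 DF. The text "0DFF" contains "DF" across a byte
# boundary, which a plain grep on the hex dump would wrongly report.
demo=$(mktemp)
printf '\015\377\337\020\337' > "$demo"
find_byte_offsets "$demo" df
rm -f "$demo"
```

Because each line holds exactly one byte, the comparison can never span two neighbouring bytes.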
How do I grep for special character(control characters) using hex representation
byte
What you need to do first is to create inside a variable the exact byte that you want to search.
Something like any of this:
a=$(echo -e '\xc0')
a=$'\xc0'
a=$(printf '\xc0')
a=$(echo -e '\300') # 300 is 0xC0 in octal
a=$'\300'
a=$(printf '\300')
a=$(echo "c0" | xxd -r -p)
I could try to come up with some other ways, but I hope you get the idea.
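Whichever form you pick, it is worth checking that the variable really holds the single byte you expect. A small sketch, using the octal printf form (which also works in shells whose printf lacks \x) and od to dump the content; the variable name byte_hex is just illustrative:

```shell
a=$(printf '\300')    # octal 300 is hex C0

# Dump the variable as hex: od prints one two-digit pair per byte.
byte_hex=$(printf '%s' "$a" | od -An -tx1 | tr -d ' \n')
echo "$byte_hex"
```

If this prints anything other than a single pair of hex digits, the shell or the locale changed the value on the way in.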
Then, you could try to search for the byte with grep:
echo $'Testing this: \xC0 byte' | grep "$a"
But if you use a UTF-8 locale (the most common case), that will fail, since a lone 0xC0 byte is not a valid UTF-8 sequence.
If you change to an ISO-8859-1 locale, that will work:
LC_ALL=en_US.iso88591 echo $'Testing this: \xC0 byte' |
LC_ALL=en_US.iso88591 grep -P "$a"
Or, if you don't mind starting a new bash instance:
$ bash
$ export LC_ALL=en_US.iso88591
$ echo $'Testing this: \xC0 byte' | grep -P "$a"
And just return to the old bash environment by executing exit.
This might work or not depending on your system (for example, on whether the en_US.iso88591 locale is installed).
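If the ISO-8859-1 locale is not installed, the C locale is a possible substitute, since it is always present and recent GNU grep versions treat every byte as a character there. This is an assumption about your grep, not something from the steps above. A minimal sketch:

```shell
a=$(printf '\300')    # the byte 0xC0

# -a: treat the input as text even if it looks binary
# -F: take the pattern as a fixed string, not a regex
# -q: no output, only the exit status
if printf 'Testing this: \300 byte\n' | LC_ALL=C grep -a -F -q -- "$a"; then
    found=yes
else
    found=no
fi
echo "$found"
```

Setting LC_ALL only on the grep command keeps the rest of the session in its usual locale.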
Let's explore the other side: characters.
character
There is a very very important twist that you should understand.
A byte is not a character. Well, sometimes, by sheer luck, it is.
But apart from the 128 ASCII characters, for which one byte is one character (not in UTF-16 or UTF-32, and let's also forget about EBCDIC), all of the 1,114,112 (17 × 65,536) Unicode code points need more than one byte in UTF-8.
In that case, you should ask for the Unicode code point of hex 0xC0.
In modern bash, like this:
$ printf '\U00C0'
À
Which is this character: LATIN CAPITAL LETTER A WITH GRAVE
That will be encoded as one byte if the locale is ISO-8859-1 (and ISO-8859-15, at least) and as two bytes if the locale is utf-8.
$ a=$(printf '\UC0')
$ printf 'Testing \U00C0 character' | grep -P "$a"
Testing À character
It will also work if you change the LC_ALL variable. Well, I mean that grep will detect the character, but the printed line may fail to render the character correctly due to the changed locale.
If the file contains this character and the encoding of the file matches the locale, grep will work with the value of the character in a variable.
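The difference between the two encodings can be seen by counting bytes. This sketch writes the UTF-8 bytes of À explicitly (C3 80) so it does not depend on the current locale, and assumes an iconv that knows ISO-8859-1; the variable names are illustrative.

```shell
utf8_char=$(printf '\303\200')    # bytes C3 80: À (U+00C0) encoded as UTF-8

# Number of bytes the character occupies in UTF-8.
utf8_len=$(printf '%s' "$utf8_char" | wc -c | tr -d ' ')

# The same character converted to ISO-8859-1, dumped as hex.
latin1_hex=$(printf '%s' "$utf8_char" | iconv -f UTF-8 -t ISO-8859-1 | od -An -tx1 | tr -d ' \n')

echo "UTF-8 bytes: $utf8_len, ISO-8859-1 byte: $latin1_hex"
```

So a search for the byte 0xC0 and a search for the character À only coincide when the file is in a single-byte encoding such as ISO-8859-1.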
How to grep a text file which contains some binary data?
You could run the data file through cat -v, e.g.
$ cat -v tmp/test.log | grep re
line1 re ^@^M
line3 re^M
which could then be further post-processed to remove the junk; this is most analogous to your query about using tr for the task. -v simply tells cat to display non-printing characters.
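Here is a small self-contained sketch of both routes, with made-up input that mimics the log above: cat -v to make the control characters visible, and tr to delete them before grep sees the data. Deleting only NUL and CR is an assumption about what counts as junk in your file.

```shell
# Sample data: a NUL and CRLF line endings mixed into text lines.
sample() { printf 'line1 re \000\r\nline2\nline3 re\r\n'; }

# Route 1: show non-printing characters as ^@ (NUL) and ^M (CR), then grep.
visible=$(sample | cat -v | grep re)

# Route 2: strip NUL and CR bytes first, so grep gets clean text.
cleaned=$(sample | tr -d '\000\r' | grep re)

printf '%s\n' "$visible"
printf '%s\n' "$cleaned"
```

The second route changes the data, so use it only when the removed bytes really are noise.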
Adding Bytes to file using Hex Editor
It likely has to be the same length or shorter (e.g. padded with nulls) because of pointers within the file itself. If a game file is expecting a structure or function at index XXXX, and you shift everything by five bytes, then it's not going to work. How to fix it? You would need intimate knowledge of the game file format. Then you could go about revising what else needs to be revised.
As an aside, Windows DLLs keep their strings and dialogs in a separate resource area, and are surprisingly easy to revise using a resource editor!
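For the same-length case, a replacement can be written in place without shifting anything. A rough sketch with dd, using an invented file layout (a 4-byte header, an 11-byte name field, a 4-byte trailer); real game files will differ, and you should work on a copy.

```shell
f=$(mktemp)
printf 'HEADplayer_nameTAIL' > "$f"    # hypothetical layout: 4 + 11 + 4 bytes
size_before=$(wc -c < "$f" | tr -d ' ')

# Overwrite the 11-byte field at offset 4 with "hero" padded with 7 NUL bytes.
# conv=notrunc keeps the rest of the file; seek=4 with bs=1 skips the header.
printf 'hero\000\000\000\000\000\000\000' |
    dd of="$f" bs=1 seek=4 conv=notrunc 2>/dev/null

size_after=$(wc -c < "$f" | tr -d ' ')
patched_field=$(dd if="$f" bs=1 skip=4 count=4 2>/dev/null)
tail_part=$(tail -c 4 "$f")
echo "$size_before $size_after $patched_field $tail_part"
rm -f "$f"
```

The file size and everything after the field stay where they were, which is what offsets stored elsewhere in the file rely on.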
Match two strings in one line with grep
You can use
grep 'string1' filename | grep 'string2'
Or
grep 'string1.*string2\|string2.*string1' filename
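A quick check that both forms select the same lines, using a throwaway file with invented content. The single-command form is written here with grep -E, because \| inside a basic regex is a GNU extension and may not work with other greps.

```shell
sample=$(mktemp)
printf 'alpha then beta\nbeta only\nbeta before alpha\nalpha only\n' > "$sample"

# Two greps chained: lines with 'alpha', then of those, lines with 'beta'.
piped=$(grep 'alpha' "$sample" | grep 'beta')

# One grep with alternation covering both orders.
single=$(grep -E 'alpha.*beta|beta.*alpha' "$sample")

printf '%s\n' "$piped"
rm -f "$sample"
```

The chained form is easier to extend to three or more strings, since the alternation grows with every possible ordering.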
Portable way to get file size (in bytes) in the shell
wc -c < filename (short for word count; -c prints the byte count) is a portable, POSIX solution. Only the output format might not be uniform across platforms, as some spaces may be prepended (which is the case for Solaris).
Do not omit the input redirection. When the file is passed as an argument, the file name is printed after the byte count.
I was worried it wouldn't work for binary files, but it works OK on both Linux and Solaris. You can try it with wc -c < /usr/bin/wc. Moreover, POSIX utilities are guaranteed to handle binary files, unless specified otherwise explicitly.
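To get a clean number regardless of the padding, the output can be passed through tr. A minimal sketch; the function name file_size is made up for this example.

```shell
# file_size FILE: print the size of FILE in bytes, with no surrounding whitespace.
file_size() {
    wc -c < "$1" | tr -d '[:space:]'
}

# Demo with a small file that contains binary bytes (a NUL and 0xFF).
demo=$(mktemp)
printf 'a\000b\377' > "$demo"
size=$(file_size "$demo")
echo "$size"
rm -f "$demo"
```

The result can then be used directly in arithmetic or string comparisons.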