Mailing List Archive


Re: [tlug] binary search of binary data



On Wed, Dec 29, 2004 at 02:04:33PM +0900, Stephen J. Turnbull wrote:
> >>>>> "Edward" == Edward Wright <edw@example.com> writes:
> 
>     Edward> I use grep -ab for searching thru the file for text
>     Edward> strings, but I couldn't see how to pass a binary search
>     Edward> pattern to grep. Am I missing something simple here?
> 
> That's not a grep problem, that's a shell problem!<0.9wink>

Ah...  too true, too true

> 
> I can think of a number of ways to deal with this.  First, almost all
> shells/terminal drivers allow entry of binary characters in some way,
> for example use ^V<control character> for the low range and ALT-###
> for the high range.

While trying to figure this out from the bash man page (no luck <sigh>)
I came across $'<hex here>', which I didn't know about. It expands to the
byte value of the <hex here> - e.g. echo $'\x50\x51' gives "PQ".

>  Some shells will interpret octal escapes for you,
> although I believe bash doesn't.  Recent GNU greps support a
> --perl-regexp (-P) option; 

mine doesn't.... mebbe I should get a new one... :)

> I don't know if that implies processing
> Perl string escapes, it may not.  There's always the -f option to take
> the search pattern from a file.

This worked, although creating the file was a hassle. 

> 
> Finally, you can use 'grep -e `printf ...`'.

I don't know how to make printf output the byte values. It'll output the
hex representation no problem....

I tried using it to create the above mentioned file like so:

printf $'\x26\x9d\x00\x00\x28\x9d\x00\x00' >testfile

but I ended up with a 2 byte file.... apparently it quit when it hit
the null byte.... which kinda makes sense.

(I tried the same thing as an argument to grep, but it also failed -
probably for the same reason)
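
A minimal sketch of what seems to be going on, in Python rather than the
shell. It assumes the cause is that command arguments are NUL-terminated C
strings, so everything after the first \x00 is lost before printf or grep
ever sees it. The temporary "testfile" path is only for illustration.

```python
import os
import tempfile

# The 8 bytes from the printf attempt above.
pattern = bytes.fromhex("269d0000289d0000")

# What a NUL-terminated argument keeps: everything before the first NUL.
as_c_string = pattern.split(b"\x00", 1)[0]
print(len(as_c_string))  # 2, matching the 2-byte file

# Writing the bytes from a script avoids argument passing altogether,
# so the NUL bytes survive.
path = os.path.join(tempfile.mkdtemp(), "testfile")
with open(path, "wb") as f:
    f.write(pattern)

size = os.path.getsize(path)
print(size)  # 8
```

Written this way the pattern file comes out at the full 8 bytes.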

> 
> Unix rules, even when it sucks!

Agreed!

> 
> None of these are particularly satisfactory, which is why I suggested
> doing the whole thing in <scripted-language-with-initial-character-p-
> or-r-at-your-option>.  (You could even use awk, etc.)

which proved to be the best solution - at least it worked for me with
the least hassle - but I did have to add a bit to escape metacharacters
(duh)
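
For the archive, here is a rough sketch of that kind of search in Python.
It is not the script actually used; find_pattern and the sample data are
made up for illustration. The byte pattern is the one from the printf
attempt earlier in this message, and it happens to contain 0x28, which is
ASCII "(", so it has to be escaped before being used as a regex.

```python
import re


def find_pattern(data, pattern):
    """Return the byte offset of every occurrence of pattern in data."""
    # re.escape neutralises regex metacharacters such as "(" in the raw bytes.
    regex = re.compile(re.escape(pattern))
    return [m.start() for m in regex.finditer(data)]


pattern = bytes.fromhex("269d0000289d0000")

# Stand-in for the contents of the damaged file (hypothetical sample data).
data = b"\xff" * 16 + pattern + b"junk" + pattern

offsets = find_pattern(data, pattern)
print(offsets)  # [16, 28]
```

For a real file, data would come from open(path, "rb").read(). For a plain
fixed byte string like this one, data.find(pattern) would also do and needs
no escaping at all.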

Anyway, thanks again for the assist! I got most of my data back.

Ed

