* `strings` mis-behavior
@ 2015-01-01 18:45 Alexander Cherepanov
From: Alexander Cherepanov @ 2015-01-01 18:45 UTC (permalink / raw)
To: elfutils-devel
Hi!
I thought about reading strings.c, but the code turned out to be
surprisingly complex. I understand that it's tuned for speed, but from a
security POV I doubt that I would want my default `strings` to be that
complex. I ended up looking mostly into the process_chunk and
read_block_no_mmap functions (probably not the most tested code path).
Some notes follow. I don't think any of them is a security issue.
1. A later -e option does not override an earlier one. E.g.
`strings -e b -e l` leaves "big_endian = true".
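One way to avoid this is to have every -e selector assign both settings, so that the last option on the command line always wins. The following is only a sketch: `struct encoding` and `apply_encoding` are hypothetical names, not taken from strings.c, and the selector letters are assumed to follow the usual `strings -e` convention (s/S single byte, b/l 16-bit, B/L 32-bit).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical container for the two encoding settings.  */
struct encoding
{
  bool big_endian;
  size_t bytes_per_char;
};

/* Apply one -e argument.  Every case assigns BOTH fields, so a later
   -e fully replaces an earlier one.  The 7-bit vs. 8-bit distinction
   between 's' and 'S' is not modelled here.  Returns false for an
   unknown selector.  */
bool
apply_encoding (struct encoding *enc, char selector)
{
  assert (enc != NULL);

  switch (selector)
    {
    case 's':
    case 'S':
      enc->bytes_per_char = 1;
      enc->big_endian = false;
      return true;
    case 'b':
      enc->bytes_per_char = 2;
      enc->big_endian = true;
      return true;
    case 'l':
      enc->bytes_per_char = 2;
      enc->big_endian = false;
      return true;
    case 'B':
      enc->bytes_per_char = 4;
      enc->big_endian = true;
      return true;
    case 'L':
      enc->bytes_per_char = 4;
      enc->big_endian = false;
      return true;
    default:
      return false;
    }
}
```

With this shape, applying 'b' and then 'l' ends with big_endian cleared again.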
2. A string at the very end of the file (with nothing after it) is not
printed:
$ printf abcd | ./strings
(no output)
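A scanner can handle this case by treating the end of input like a non-printable byte, so that a run touching EOF is flushed as well. This is a minimal sketch over a single in-memory buffer; `print_runs` is a hypothetical helper, not a function from strings.c, and "printable" is simply assumed to mean `isprint` in the current locale.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>

/* Print every run of at least min_len printable bytes found in
   buf[0..len) to out, one per line.  Returns the number of runs
   printed.  */
size_t
print_runs (const unsigned char *buf, size_t len, size_t min_len, FILE *out)
{
  assert (buf != NULL || len == 0);
  assert (min_len >= 1);

  size_t count = 0;
  size_t start = 0;		/* Index where the current run began.  */

  /* The loop runs one step past the last byte: i == len stands for
     EOF and terminates the current run like any other separator.  */
  for (size_t i = 0; i <= len; ++i)
    {
      if (i < len && isprint (buf[i]))
	continue;

      if (i - start >= min_len)
	{
	  fwrite (buf + start, 1, i - start, out);
	  fputc ('\n', out);
	  ++count;
	}
      start = i + 1;
    }

  return count;
}
```

Called on the four bytes "abcd" with min_len 4, this prints the string even though no terminator follows it.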
3. Things generally don't work well across block boundaries. E.g.
strings about CHUNKSIZE long trigger an assert:
$ printf '%65536s\n' x | ./strings
strings: strings.c:530: read_block_no_mmap: Assertion `unprinted == ((void *)0) || ntrailer == 0' failed.
Aborted
Or with a delay between blocks:
$ (printf abcd; sleep 1; echo e) | ./strings
strings: strings.c:530: read_block_no_mmap: Assertion `unprinted == ((void *)0) || ntrailer == 0' failed.
Aborted
4. Strings longer than about 2 * CHUNKSIZE are not handled correctly
(only the tail is printed):
$ printf '%132000s\n' x > test
$ cat test | ./strings | wc -L
66464
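Items 3 and 4 both come down to the state of the current string not surviving a block boundary. A simple (if not the fastest) alternative is to keep the pending run in a growable buffer that lives across reads, so that neither the block size nor the timing of the reads matters. This is a sketch under assumptions: `struct run_state`, `feed_block` and `finish_input` are hypothetical names, and nothing here reflects how strings.c is actually organized.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical scanner state: the printable run collected so far.
   Initialize with { NULL, 0, 0 }.  */
struct run_state
{
  char *data;
  size_t len;
  size_t cap;
};

/* Print the pending run if it is long enough, then reset it.
   Returns 1 if something was printed, 0 otherwise.  */
static int
flush_run (struct run_state *st, size_t min_len, FILE *out)
{
  int printed = 0;
  if (st->len > 0 && st->len >= min_len)
    {
      fwrite (st->data, 1, st->len, out);
      fputc ('\n', out);
      printed = 1;
    }
  st->len = 0;
  return printed;
}

/* Feed one block of input.  A run that is still open at the end of
   the block simply stays in *st until the next call.  Returns the
   number of strings printed, or -1 if memory ran out.  */
int
feed_block (struct run_state *st, const unsigned char *buf, size_t n,
	    size_t min_len, FILE *out)
{
  assert (st != NULL);
  int count = 0;

  for (size_t i = 0; i < n; ++i)
    {
      if (!isprint (buf[i]))
	{
	  count += flush_run (st, min_len, out);
	  continue;
	}
      if (st->len == st->cap)
	{
	  /* Grow geometrically so long strings stay cheap to append.  */
	  size_t new_cap = st->cap != 0 ? st->cap * 2 : 64;
	  char *p = realloc (st->data, new_cap);
	  if (p == NULL)
	    return -1;
	  st->data = p;
	  st->cap = new_cap;
	}
      st->data[st->len++] = (char) buf[i];
    }

  return count;
}

/* Call once at EOF: flushes a trailing run and releases the buffer.  */
int
finish_input (struct run_state *st, size_t min_len, FILE *out)
{
  assert (st != NULL);
  int printed = flush_run (st, min_len, out);
  free (st->data);
  st->data = NULL;
  st->cap = 0;
  return printed;
}
```

Feeding "abcd" and then "e\n" as two separate blocks prints "abcde" once. The cost is that memory use grows with the longest string; a variant that starts printing as soon as min_len characters have been seen would avoid that.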
5. Multibyte support seems to be completely broken:
$ printf abcdef | iconv -t ucs-4le | ./strings -n 1 -e L
a
b
c
d
e
(it should print "abcdef").
A first step could probably be to replace "++buf" with "buf +=
bytes_per_char" in process_chunk_mb, which should make it work in easy
cases, assuming that NULs are ignored in the output.
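To illustrate what stepping by whole characters looks like, here is a standalone sketch. It is not a patch for process_chunk_mb: `load_char` and `print_wide_runs` are hypothetical helpers, and the sketch assumes that only characters in the ASCII printable range count as printable and that they are written out as single bytes (i.e. the NUL padding is dropped).

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Decode one fixed-width character of bytes_per_char bytes.  */
static uint32_t
load_char (const unsigned char *p, size_t bytes_per_char, bool big_endian)
{
  uint32_t v = 0;
  for (size_t i = 0; i < bytes_per_char; ++i)
    {
      /* Visit the bytes from most to least significant.  */
      size_t idx = big_endian ? i : bytes_per_char - 1 - i;
      v = (v << 8) | p[idx];
    }
  return v;
}

/* Print every run of at least min_chars printable characters, one per
   line.  The scan advances by bytes_per_char, never by single bytes.
   Returns the number of runs printed.  */
size_t
print_wide_runs (const unsigned char *buf, size_t len, size_t bytes_per_char,
		 bool big_endian, size_t min_chars, FILE *out)
{
  assert (bytes_per_char == 2 || bytes_per_char == 4);
  assert (min_chars >= 1);

  size_t nchars = len / bytes_per_char;	/* A partial tail is ignored.  */
  size_t count = 0;
  size_t start = 0;			/* First character of the run.  */

  for (size_t c = 0; c <= nchars; ++c)
    {
      bool printable = false;
      if (c < nchars)
	{
	  uint32_t v = load_char (buf + c * bytes_per_char,
				  bytes_per_char, big_endian);
	  printable = v < 0x80 && isprint ((int) v);
	}
      if (printable)
	continue;

      if (c - start >= min_chars)
	{
	  for (size_t k = start; k < c; ++k)
	    fputc ((int) load_char (buf + k * bytes_per_char,
				    bytes_per_char, big_endian), out);
	  fputc ('\n', out);
	  ++count;
	}
      start = c + 1;
    }

  return count;
}
```

For the UCS-4LE encoding of "abcdef" with min_chars 1, this finds a single run of six characters rather than one run per character.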
And I don't understand this comment:
378           /* There is no sane way of printing the string.  If we
379              assume the file data is encoded in UCS-2/UTF-16 or
380              UCS-4/UTF-32 respectively we could covert the string.
381              But there is no such guarantee.  */
382           fwrite_unlocked (start, 1, buf - start, stdout);
6. There is the following code in read_block_no_mmap:
548 /* We only use complete characters. */
549 nb &= ~(bytes_per_char - 1);
but there seems to be no compensation for the dropped bytes. If we split
the input from the example in item 5 into chunks of 6 bytes, we can see
the loss:
$ printf abcdef | iconv -t ucs-4le | \
    perl -e '$|=1; while (read STDIN, $s, 6) { print $s; select undef, undef, undef, .1; }' | \
    ./strings -n 1 -e L
a
d
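In the run above each 6-byte read is cut down to 4 bytes, so the 2 trailing bytes vanish and the next read starts in the middle of a character; only "a" and "d" happen to line up. One possible compensation is to move the incomplete tail to the front of the buffer and let the next read append after it. The helper names below (`whole_char_bytes`, `carry_tail`) are hypothetical, and the sketch assumes, like the original mask, that bytes_per_char is a power of two.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Number of leading bytes in a buffer of `have` bytes that form
   complete characters (same mask as in read_block_no_mmap).  */
size_t
whole_char_bytes (size_t have, size_t bytes_per_char)
{
  assert (bytes_per_char != 0
	  && (bytes_per_char & (bytes_per_char - 1)) == 0);
  return have & ~(bytes_per_char - 1);
}

/* After the complete characters in buf[0..have) have been processed,
   move the incomplete tail to the start of buf.  Returns the tail
   length, i.e. the offset at which the next read should store data.

   Sketch of the intended loop (process () is a hypothetical consumer):

     size_t kept = 0;
     while ((n = read (fd, buf + kept, sizeof buf - kept)) > 0)
       {
	 size_t have = kept + n;
	 process (buf, whole_char_bytes (have, bytes_per_char));
	 kept = carry_tail (buf, have, bytes_per_char);
       }
*/
size_t
carry_tail (unsigned char *buf, size_t have, size_t bytes_per_char)
{
  size_t usable = whole_char_bytes (have, bytes_per_char);
  size_t tail = have - usable;
  /* Source and destination may overlap, hence memmove.  */
  memmove (buf, buf + usable, tail);
  return tail;
}
```

With 6 bytes in the buffer and 4-byte characters, 4 bytes are usable and 2 are carried over, so the following read completes the second character instead of dropping it.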
--
Alexander Cherepanov