public inbox for glibc-bugs@sourceware.org
* [Bug libc/17499] New: wcslen() returns wrong result on x86_64
From: digitalfreak at lingonborough dot com @ 2014-10-21  8:34 UTC
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=17499

            Bug ID: 17499
           Summary: wcslen() returns wrong result on x86_64
           Product: glibc
           Version: unspecified
            Status: NEW
          Severity: normal
          Priority: P2
         Component: libc
          Assignee: unassigned at sourceware dot org
          Reporter: digitalfreak at lingonborough dot com
                CC: drepper.fsp at gmail dot com

Created attachment 7839
  --> https://sourceware.org/bugzilla/attachment.cgi?id=7839&action=edit
Testcase, compile as gcc wcslen-bug.c ; run as ./a.out

wcslen() always returns wrong results if all of the following conditions are met:

- operating system is Linux on x86_64;
- the string being tested is longer than 8 characters;
- the string is placed at the memory address which is not a multiple of 4.

Compile and run the testcase and you will see that not only wcslen() gives
wrong results but also printf(), which probably calls wcslen(). In real life,
if you are lucky (or you clear the memory with memset()), you will get an
incorrect result. If you are unlucky, you will get a core dump, because
wcslen() skips the terminating zero character and reads an illegal memory
address.

I have not tried to patch it, but it seems to me that the problematic place is
https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/x86_64/wcslen.S;h=366016cf638bdb713818c0e2b86af44c0d8e6874;hb=HEAD#l45
This instruction clears the 4 least significant bits of (source address + 32
bytes); bits 2-3 are then restored, but bits 0-1 are apparently assumed to be
always 0. As long as it is possible and legal to put a wchar_t array at any
address, it should not be assumed that the address is a multiple of 4.

I have not tested them, but I can see similar algorithms in wcschr.S and
wcsrchr.S.

-- 
You are receiving this mail because:
You are on the CC list for the bug.


* [Bug libc/17499] wcslen() returns wrong result on x86_64
From: joseph at codesourcery dot com @ 2014-10-21 15:37 UTC
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=17499

--- Comment #1 from joseph at codesourcery dot com <joseph at codesourcery dot com> ---
On Tue, 21 Oct 2014, digitalfreak at lingonborough dot com wrote:

> - the string is placed at the memory address which is not a multiple of 4.

Why do you think this is a valid use of wcslen?  The x86_64 ABI says that 
int (= wchar_t) has 4-byte alignment.  (Indeed, there's a special m68k 
version of wcpcpy to deal with m68k having lower alignment requirements 
than generic code expects.)

* [Bug libc/17499] wcslen() returns wrong result on x86_64
From: digitalfreak at lingonborough dot com @ 2014-10-22 23:00 UTC
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=17499

--- Comment #2 from Rafal Luzynski <digitalfreak at lingonborough dot com> ---
More info:

- the 32-bit architecture is also affected if it uses SSE2;
- wcschr() and wcsrchr() have the same problem.

(In reply to joseph@codesourcery.com from comment #1)
> On Tue, 21 Oct 2014, digitalfreak at lingonborough dot com wrote:
> 
> > - the string is placed at the memory address which is not a multiple of 4.
> 
> Why do you think this is a valid use of wcslen?

Both compilers and the architecture allow this operation.  At the moment the
only reason why it should be invalid is the way that wcslen() and similar
functions have been implemented.

> The x86_64 ABI says that 
> int (= wchar_t) has 4-byte alignment.

Yes, so unaligned data should be discouraged for performance reasons, but it
should not be invalid.  And if it were invalid, it should raise an error rather
than just return an incorrect result.

> (Indeed, there's a special m68k 
> version of wcpcpy to deal with m68k having lower alignment requirements 
> than generic code expects.)

This is the solution I would like to have for x86, too.

* [Bug libc/17499] wcslen() returns wrong result on x86_64
From: jsm28 at gcc dot gnu.org @ 2014-10-22 23:19 UTC
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=17499

Joseph Myers <jsm28 at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |INVALID

--- Comment #3 from Joseph Myers <jsm28 at gcc dot gnu.org> ---
No, if the ABI specifies an alignment for that type then C code attempting to
use that type other than at that alignment has undefined behavior (meaning no
requirements for any error and no requirements on how the library behaves if
you do it).  That the architecture allows something simply means that you could
do it from assembly code - not that it is ever valid for the C type to be
unaligned.
