public inbox for cygwin@cygwin.com
 help / color / mirror / Atom feed
* Bug? wcsxfrm causing memory corruption
@ 2017-05-10  9:31 Erik Bray
  2017-05-21 17:18 ` Duncan Roe
  0 siblings, 1 reply; 6+ messages in thread
From: Erik Bray @ 2017-05-10  9:31 UTC (permalink / raw)
  To: cygwin

[-- Attachment #1: Type: text/plain, Size: 1472 bytes --]

Greetings--

In the process of fixing the Python test suite on Cygwin I ran across
one test that was consistently causing segfaults later on, not
directly local to that test.  The test involves wcsxfrm so that's
where I focused my attention.

The attached test demonstrates the bug.  Given an output buffer of N
wide characters, wcsxfrm will cause bytes beyond the destination size
to be reversed. I believe it might actually be a bug in the underlying
LCMapStringW workhorse (this is on Windows 10; have not tested other
versions).

According to its docs [1], the cchDest argument (size of the
destination buffer) is treated as a *byte* count when using
LCMAP_SORTKEY.  However, for the purposes of applying the
LCMAP_BYTEREV transformation it seems to be treating the output size
(in bytes) as character count.  So in the example I give, where the
output sort key is 7 bytes (including the null terminator), it swaps
*14* bytes--the bytes including the sort key as well as the next 7
adjacent bytes.  This is obviously a problem if the destination buffer
is allocated out of some larger memory pool.

This definitely has to be a bug, right?  Or at least very poorly
documented on MS's part.  A workaround would either be to not use
LCMAP_BYTEREV and just swap the bytes manually, or in a second call to
LCMapStringW with LCMAP_BYTEREV and the correct character count...

Thanks,
Erik


[1] https://msdn.microsoft.com/en-us/library/windows/desktop/dd318700(v=vs.85).aspx

[-- Attachment #2: LCMapStringW.c --]
[-- Type: text/x-csrc, Size: 1439 bytes --]

#include <stdlib.h>
#include <stdio.h>
#include <locale.h>
#include <wchar.h>
#include <string.h>
#include <windows.h>

#define SIZE 32


void fill_bytes(uint8_t *a, int n) {
    int idx;
    for (idx=0; idx<n; idx++) {
        a[idx] = idx;
    }
}


void print_bytes(uint8_t *a, int n) {
    int idx;
    for (idx=0; idx<n; idx++) {
        printf("0x%02x ", ((uint8_t*)a)[idx]);
        if ((idx + 1) % 8 == 0) printf("\n");
    }
}

int main(void) {
    wchar_t *a, *b;
    uint8_t *aa;
    size_t ret;
    LCID collate_lcid;
    int idx;
    collate_lcid = 1033;
    b = L"b";
    a = (wchar_t*) malloc(SIZE);
    aa = (uint8_t*) a;

    setlocale(LC_ALL, "en_US.UTF-8");

    printf("using wcsxfrm:\n");
    fill_bytes(aa, SIZE);
    printf("before:\n");
    print_bytes(aa, SIZE);
    ret = wcsxfrm(a, b, 4);
    printf("after (%d):\n", ret);
    print_bytes(aa, SIZE);

    printf("\nusing LCMapStringW directly:\n");
    fill_bytes(aa, SIZE);
    printf("before:\n");
    print_bytes(aa, SIZE);
    
    ret = LCMapStringW(collate_lcid, LCMAP_SORTKEY | LCMAP_BYTEREV, b, -1, a, 8);
    printf("after (%d):\n", ret);
    print_bytes(aa, SIZE);

    printf("\nwithout LCMAP_BYTEREV:\n");
    fill_bytes(aa, SIZE);
    printf("before:\n");
    print_bytes(aa, SIZE);
    
    ret = LCMapStringW(collate_lcid, LCMAP_SORTKEY, b, -1, a, 8);
    printf("after (%d):\n", ret);
    print_bytes(aa, SIZE);
    free(a);
    
    return 0;
}

[-- Attachment #3: Type: text/plain, Size: 219 bytes --]


--
Problem reports:       http://cygwin.com/problems.html
FAQ:                   http://cygwin.com/faq/
Documentation:         http://cygwin.com/docs.html
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Bug? wcsxfrm causing memory corruption
  2017-05-10  9:31 Bug? wcsxfrm causing memory corruption Erik Bray
@ 2017-05-21 17:18 ` Duncan Roe
  2017-05-22 13:39   ` Erik Bray
  0 siblings, 1 reply; 6+ messages in thread
From: Duncan Roe @ 2017-05-21 17:18 UTC (permalink / raw)
  To: cygwin

On Wed, May 10, 2017 at 11:30:46AM +0200, Erik Bray wrote:
> Greetings--
>
> In the process of fixing the Python test suite on Cygwin I ran across
> one test that was consistently causing segfaults later on, not
> directly local to that test.  The test involves wcsxfrm so that's
> where I focused my attention.
>
> The attached test demonstrates the bug.  Given an output buffer of N
> wide characters, wcsxfrm will cause bytes beyond the destination size
> to be reversed. I believe it might actually be a bug in the underlying
> LCMapStringW workhorse (this is on Windows 10; have not tested other
> versions).
>
> According to its docs [1], the cchDest argument (size of the
> destination buffer) is treated as a *byte* count when using
> LCMAP_SORTKEY.  However, for the purposes of applying the
> LCMAP_BYTEREV transformation it seems to be treating the output size
> (in bytes) as character count.  So in the example I give, where the
> output sort key is 7 bytes (including the null terminator), it swaps
> *14* bytes--the bytes including the sort key as well as the next 7
> adjacent bytes.  This is obviously a problem if the destination buffer
> is allocated out of some larger memory pool.
>
> This definitely has to be a bug, right?  Or at least very poorly
> documented on MS's part.  A workaround would either be to not use
> LCMAP_BYTEREV and just swap the bytes manually, or in a second call to
> LCMapStringW with LCMAP_BYTEREV and the correct character count...
>
> Thanks,
> Erik
>
>
> [1] https://msdn.microsoft.com/en-us/library/windows/desktop/dd318700(v=vs.85).aspx

> #include <stdlib.h>
> #include <stdio.h>
> #include <locale.h>
> #include <wchar.h>
> #include <string.h>
> #include <windows.h>
>
> #define SIZE 32
>
>
> void fill_bytes(uint8_t *a, int n) {
>     int idx;
>     for (idx=0; idx<n; idx++) {
>         a[idx] = idx;
>     }
> }
>
>
> void print_bytes(uint8_t *a, int n) {
>     int idx;
>     for (idx=0; idx<n; idx++) {
>         printf("0x%02x ", ((uint8_t*)a)[idx]);
>         if ((idx + 1) % 8 == 0) printf("\n");
>     }
> }
>
> int main(void) {
>     wchar_t *a, *b;
>     uint8_t *aa;
>     size_t ret;
>     LCID collate_lcid;
>     int idx;
>     collate_lcid = 1033;
>     b = L"b";
>     a = (wchar_t*) malloc(SIZE);
>     aa = (uint8_t*) a;
>
>     setlocale(LC_ALL, "en_US.UTF-8");
>
>     printf("using wcsxfrm:\n");
>     fill_bytes(aa, SIZE);
>     printf("before:\n");
>     print_bytes(aa, SIZE);
>     ret = wcsxfrm(a, b, 4);
>     printf("after (%d):\n", ret);
>     print_bytes(aa, SIZE);
>
>     printf("\nusing LCMapStringW directly:\n");
>     fill_bytes(aa, SIZE);
>     printf("before:\n");
>     print_bytes(aa, SIZE);
>
>     ret = LCMapStringW(collate_lcid, LCMAP_SORTKEY | LCMAP_BYTEREV, b, -1, a, 8);
>     printf("after (%d):\n", ret);
>     print_bytes(aa, SIZE);
>
>     printf("\nwithout LCMAP_BYTEREV:\n");
>     fill_bytes(aa, SIZE);
>     printf("before:\n");
>     print_bytes(aa, SIZE);
>
>     ret = LCMapStringW(collate_lcid, LCMAP_SORTKEY, b, -1, a, 8);
>     printf("after (%d):\n", ret);
>     print_bytes(aa, SIZE);
>     free(a);
>
>     return 0;
> }

Hi Erik,

I get

using wcsxfrm:
before:
0x00 0x01 0x02 0x03 0x04 0x05 0x06 0x07
0x08 0x09 0x0a 0x0b 0x0c 0x0d 0x0e 0x0f
0x10 0x11 0x12 0x13 0x14 0x15 0x16 0x17
0x18 0x19 0x1a 0x1b 0x1c 0x1d 0x1e 0x1f
after (3):
0x09 0x0e 0x01 0x01 0x01 0x01 0x00 0x00
0x09 0x08 0x0b 0x0a 0x0d 0x0c 0x0e 0x0f
0x10 0x11 0x12 0x13 0x14 0x15 0x16 0x17
0x18 0x19 0x1a 0x1b 0x1c 0x1d 0x1e 0x1f

using LCMapStringW directly:
before:
0x00 0x01 0x02 0x03 0x04 0x05 0x06 0x07
0x08 0x09 0x0a 0x0b 0x0c 0x0d 0x0e 0x0f
0x10 0x11 0x12 0x13 0x14 0x15 0x16 0x17
0x18 0x19 0x1a 0x1b 0x1c 0x1d 0x1e 0x1f
after (7):
0x09 0x0e 0x01 0x01 0x01 0x01 0x07 0x00
0x09 0x08 0x0b 0x0a 0x0d 0x0c 0x0e 0x0f
0x10 0x11 0x12 0x13 0x14 0x15 0x16 0x17
0x18 0x19 0x1a 0x1b 0x1c 0x1d 0x1e 0x1f

without LCMAP_BYTEREV:
before:
0x00 0x01 0x02 0x03 0x04 0x05 0x06 0x07
0x08 0x09 0x0a 0x0b 0x0c 0x0d 0x0e 0x0f
0x10 0x11 0x12 0x13 0x14 0x15 0x16 0x17
0x18 0x19 0x1a 0x1b 0x1c 0x1d 0x1e 0x1f
after (7):
0x0e 0x09 0x01 0x01 0x01 0x01 0x00 0x07
0x08 0x09 0x0a 0x0b 0x0c 0x0d 0x0e 0x0f
0x10 0x11 0x12 0x13 0x14 0x15 0x16 0x17
0x18 0x19 0x1a 0x1b 0x1c 0x1d 0x1e 0x1f

Is that the same as you see?

Cheers ... Duncan.

--
Problem reports:       http://cygwin.com/problems.html
FAQ:                   http://cygwin.com/faq/
Documentation:         http://cygwin.com/docs.html
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Bug? wcsxfrm causing memory corruption
  2017-05-21 17:18 ` Duncan Roe
@ 2017-05-22 13:39   ` Erik Bray
  2017-06-06 15:57     ` Corinna Vinschen
  0 siblings, 1 reply; 6+ messages in thread
From: Erik Bray @ 2017-05-22 13:39 UTC (permalink / raw)
  To: cygwin

On Sun, May 21, 2017 at 6:23 AM, Duncan Roe wrote:
> On Wed, May 10, 2017 at 11:30:46AM +0200, Erik Bray wrote:
>> Greetings--
>>
>> In the process of fixing the Python test suite on Cygwin I ran across
>> one test that was consistently causing segfaults later on, not
>> directly local to that test.  The test involves wcsxfrm so that's
>> where I focused my attention.
>>
>> The attached test demonstrates the bug.  Given an output buffer of N
>> wide characters, wcsxfrm will cause bytes beyond the destination size
>> to be reversed. I believe it might actually be a bug in the underlying
>> LCMapStringW workhorse (this is on Windows 10; have not tested other
>> versions).
>>
>> According to its docs [1], the cchDest argument (size of the
>> destination buffer) is treated as a *byte* count when using
>> LCMAP_SORTKEY.  However, for the purposes of applying the
>> LCMAP_BYTEREV transformation it seems to be treating the output size
>> (in bytes) as character count.  So in the example I give, where the
>> output sort key is 7 bytes (including the null terminator), it swaps
>> *14* bytes--the bytes including the sort key as well as the next 7
>> adjacent bytes.  This is obviously a problem if the destination buffer
>> is allocated out of some larger memory pool.
>>
>> This definitely has to be a bug, right?  Or at least very poorly
>> documented on MS's part.  A workaround would either be to not use
>> LCMAP_BYTEREV and just swap the bytes manually, or in a second call to
>> LCMapStringW with LCMAP_BYTEREV and the correct character count...
>>
>> Thanks,
>> Erik
>>
>>
>> [1] https://msdn.microsoft.com/en-us/library/windows/desktop/dd318700(v=vs.85).aspx
>
>> #include <stdlib.h>
>> #include <stdio.h>
>> #include <locale.h>
>> #include <wchar.h>
>> #include <string.h>
>> #include <windows.h>
>>
>> #define SIZE 32
>>
>>
>> void fill_bytes(uint8_t *a, int n) {
>>     int idx;
>>     for (idx=0; idx<n; idx++) {
>>         a[idx] = idx;
>>     }
>> }
>>
>>
>> void print_bytes(uint8_t *a, int n) {
>>     int idx;
>>     for (idx=0; idx<n; idx++) {
>>         printf("0x%02x ", ((uint8_t*)a)[idx]);
>>         if ((idx + 1) % 8 == 0) printf("\n");
>>     }
>> }
>>
>> int main(void) {
>>     wchar_t *a, *b;
>>     uint8_t *aa;
>>     size_t ret;
>>     LCID collate_lcid;
>>     int idx;
>>     collate_lcid = 1033;
>>     b = L"b";
>>     a = (wchar_t*) malloc(SIZE);
>>     aa = (uint8_t*) a;
>>
>>     setlocale(LC_ALL, "en_US.UTF-8");
>>
>>     printf("using wcsxfrm:\n");
>>     fill_bytes(aa, SIZE);
>>     printf("before:\n");
>>     print_bytes(aa, SIZE);
>>     ret = wcsxfrm(a, b, 4);
>>     printf("after (%d):\n", ret);
>>     print_bytes(aa, SIZE);
>>
>>     printf("\nusing LCMapStringW directly:\n");
>>     fill_bytes(aa, SIZE);
>>     printf("before:\n");
>>     print_bytes(aa, SIZE);
>>
>>     ret = LCMapStringW(collate_lcid, LCMAP_SORTKEY | LCMAP_BYTEREV, b, -1, a, 8);
>>     printf("after (%d):\n", ret);
>>     print_bytes(aa, SIZE);
>>
>>     printf("\nwithout LCMAP_BYTEREV:\n");
>>     fill_bytes(aa, SIZE);
>>     printf("before:\n");
>>     print_bytes(aa, SIZE);
>>
>>     ret = LCMapStringW(collate_lcid, LCMAP_SORTKEY, b, -1, a, 8);
>>     printf("after (%d):\n", ret);
>>     print_bytes(aa, SIZE);
>>     free(a);
>>
>>     return 0;
>> }
>
> Hi Erik,
>
> I get
>
> using wcsxfrm:
> before:
> 0x00 0x01 0x02 0x03 0x04 0x05 0x06 0x07
> 0x08 0x09 0x0a 0x0b 0x0c 0x0d 0x0e 0x0f
> 0x10 0x11 0x12 0x13 0x14 0x15 0x16 0x17
> 0x18 0x19 0x1a 0x1b 0x1c 0x1d 0x1e 0x1f
> after (3):
> 0x09 0x0e 0x01 0x01 0x01 0x01 0x00 0x00
> 0x09 0x08 0x0b 0x0a 0x0d 0x0c 0x0e 0x0f
> 0x10 0x11 0x12 0x13 0x14 0x15 0x16 0x17
> 0x18 0x19 0x1a 0x1b 0x1c 0x1d 0x1e 0x1f
>
> using LCMapStringW directly:
> before:
> 0x00 0x01 0x02 0x03 0x04 0x05 0x06 0x07
> 0x08 0x09 0x0a 0x0b 0x0c 0x0d 0x0e 0x0f
> 0x10 0x11 0x12 0x13 0x14 0x15 0x16 0x17
> 0x18 0x19 0x1a 0x1b 0x1c 0x1d 0x1e 0x1f
> after (7):
> 0x09 0x0e 0x01 0x01 0x01 0x01 0x07 0x00
> 0x09 0x08 0x0b 0x0a 0x0d 0x0c 0x0e 0x0f
> 0x10 0x11 0x12 0x13 0x14 0x15 0x16 0x17
> 0x18 0x19 0x1a 0x1b 0x1c 0x1d 0x1e 0x1f
>
> without LCMAP_BYTEREV:
> before:
> 0x00 0x01 0x02 0x03 0x04 0x05 0x06 0x07
> 0x08 0x09 0x0a 0x0b 0x0c 0x0d 0x0e 0x0f
> 0x10 0x11 0x12 0x13 0x14 0x15 0x16 0x17
> 0x18 0x19 0x1a 0x1b 0x1c 0x1d 0x1e 0x1f
> after (7):
> 0x0e 0x09 0x01 0x01 0x01 0x01 0x00 0x07
> 0x08 0x09 0x0a 0x0b 0x0c 0x0d 0x0e 0x0f
> 0x10 0x11 0x12 0x13 0x14 0x15 0x16 0x17
> 0x18 0x19 0x1a 0x1b 0x1c 0x1d 0x1e 0x1f

Yes, that's the same.  Thanks for giving it a try--I should have
included example output in my original message.

You can see in the last case that without LCMAP_BYTEREV it writes the sequence

0x0e 0x09 0x01 0x01 0x01 0x01 0x00

with a terminating 0x00.  Bytes after that remain unchanged.  In the
other two examples *with* LCMAP_BYTEREV, the terminating 0x00 gets
swapped with the 0x07 after it, but this documented and expected
behavior of LCMapStringW, and is already accounted for in Cygwin's
wcsxfrm.  What is undocumented, and unexpected, is that it also byte
swaps 3 more byte pairs after the actual sort key, which can corrupt
memory unexpectedly.

Thanks,
Erik

--
Problem reports:       http://cygwin.com/problems.html
FAQ:                   http://cygwin.com/faq/
Documentation:         http://cygwin.com/docs.html
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Bug? wcsxfrm causing memory corruption
  2017-05-22 13:39   ` Erik Bray
@ 2017-06-06 15:57     ` Corinna Vinschen
  2017-06-06 19:22       ` Corinna Vinschen
  0 siblings, 1 reply; 6+ messages in thread
From: Corinna Vinschen @ 2017-06-06 15:57 UTC (permalink / raw)
  To: cygwin

[-- Attachment #1: Type: text/plain, Size: 2976 bytes --]

Hi Erik,

On May 22 14:50, Erik Bray wrote:
> On Sun, May 21, 2017 at 6:23 AM, Duncan Roe wrote:
> > On Wed, May 10, 2017 at 11:30:46AM +0200, Erik Bray wrote:
> >> [...]
> >> The attached test demonstrates the bug.  Given an output buffer of N
> >> wide characters, wcsxfrm will cause bytes beyond the destination size
> >> to be reversed. I believe it might actually be a bug in the underlying
> >> LCMapStringW workhorse (this is on Windows 10; have not tested other
> >> versions).
> >>
> >> According to its docs [1], the cchDest argument (size of the
> >> destination buffer) is treated as a *byte* count when using
> >> LCMAP_SORTKEY.  However, for the purposes of applying the
> >> LCMAP_BYTEREV transformation it seems to be treating the output size
> >> (in bytes) as character count.  So in the example I give, where the
> >> output sort key is 7 bytes (including the null terminator), it swaps
> >> *14* bytes--the bytes including the sort key as well as the next 7
> >> adjacent bytes.  This is obviously a problem if the destination buffer
> >> is allocated out of some larger memory pool.
> >>
> >> This definitely has to be a bug, right?  Or at least very poorly
> >> documented on MS's part.  A workaround would either be to not use
> >> LCMAP_BYTEREV and just swap the bytes manually, or in a second call to
> >> LCMapStringW with LCMAP_BYTEREV and the correct character count...
> >> [...]
> > Hi Erik,
> >
> > I get
> > [...]
> Yes, that's the same.  Thanks for giving it a try--I should have
> included example output in my original message.
> 
> You can see in the last case that without LCMAP_BYTEREV it writes the sequence
> 
> 0x0e 0x09 0x01 0x01 0x01 0x01 0x00
> 
> with a terminating 0x00.  Bytes after that remain unchanged.  In the
> other two examples *with* LCMAP_BYTEREV, the terminating 0x00 gets
> swapped with the 0x07 after it, but this documented and expected
> behavior of LCMapStringW, and is already accounted for in Cygwin's
> wcsxfrm.  What is undocumented, and unexpected, is that it also byte
> swaps 3 more byte pairs after the actual sort key, which can corrupt
> memory unexpectedly.

Incredible piece of detective work.  Yeah, this looks very much like a
bug in LCMapStringW, embarrassingly one I didn't notice at the time of
writing the code.  I just tested this on W7 and the problem was already
present.

I don't know if this could be fixed by documentation.  There's just no
safe way to combine LCMAP_SORTKEY with LCMAP_BYTEREV it seems, unless
you always allocate a buffer at least twice as big as actually required
for the sort key.  

So, yeah, I think manual swaping after calling LCMapStringW without
LCMAP_BYTEREV is the way to go.  I'll prepare a fix.


Thanks for tracking down and the testcase,
Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Maintainer                 cygwin AT cygwin DOT com
Red Hat

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Bug? wcsxfrm causing memory corruption
  2017-06-06 15:57     ` Corinna Vinschen
@ 2017-06-06 19:22       ` Corinna Vinschen
  2017-06-09 13:03         ` Erik Bray
  0 siblings, 1 reply; 6+ messages in thread
From: Corinna Vinschen @ 2017-06-06 19:22 UTC (permalink / raw)
  To: cygwin

[-- Attachment #1: Type: text/plain, Size: 2365 bytes --]

On Jun  6 17:57, Corinna Vinschen wrote:
> > > On Wed, May 10, 2017 at 11:30:46AM +0200, Erik Bray wrote:
> > >> [...]
> > >> The attached test demonstrates the bug.  Given an output buffer of N
> > >> wide characters, wcsxfrm will cause bytes beyond the destination size
> > >> to be reversed. I believe it might actually be a bug in the underlying
> > >> LCMapStringW workhorse (this is on Windows 10; have not tested other
> > >> versions).
> > >>
> > >> According to its docs [1], the cchDest argument (size of the
> > >> destination buffer) is treated as a *byte* count when using
> > >> LCMAP_SORTKEY.  However, for the purposes of applying the
> > >> LCMAP_BYTEREV transformation it seems to be treating the output size
> > >> (in bytes) as character count.  So in the example I give, where the
> > >> output sort key is 7 bytes (including the null terminator), it swaps
> > >> *14* bytes--the bytes including the sort key as well as the next 7
> > >> adjacent bytes.  This is obviously a problem if the destination buffer
> > >> is allocated out of some larger memory pool.
> > >>
> > >> This definitely has to be a bug, right?  Or at least very poorly
> > >> documented on MS's part.  A workaround would either be to not use
> > >> LCMAP_BYTEREV and just swap the bytes manually, or in a second call to
> > >> LCMapStringW with LCMAP_BYTEREV and the correct character count...
> > >> [...]
> Incredible piece of detective work.  Yeah, this looks very much like a
> bug in LCMapStringW, embarrassingly one I didn't notice at the time of
> writing the code.  I just tested this on W7 and the problem was already
> present.
> 
> I don't know if this could be fixed by documentation.  There's just no
> safe way to combine LCMAP_SORTKEY with LCMAP_BYTEREV it seems, unless
> you always allocate a buffer at least twice as big as actually required
> for the sort key.  
> 
> So, yeah, I think manual swaping after calling LCMapStringW without
> LCMAP_BYTEREV is the way to go.  I'll prepare a fix.
> 
> 
> Thanks for tracking down and the testcase,
> Corinna

Latest snapshot from https://cygwin.com/snapshots/ has the patch applied.
Please try.


Thanks,
Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Maintainer                 cygwin AT cygwin DOT com
Red Hat

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Bug? wcsxfrm causing memory corruption
  2017-06-06 19:22       ` Corinna Vinschen
@ 2017-06-09 13:03         ` Erik Bray
  0 siblings, 0 replies; 6+ messages in thread
From: Erik Bray @ 2017-06-09 13:03 UTC (permalink / raw)
  To: cygwin

On Tue, Jun 6, 2017 at 9:22 PM, Corinna Vinschen wrote:
> On Jun  6 17:57, Corinna Vinschen wrote:
>> > > On Wed, May 10, 2017 at 11:30:46AM +0200, Erik Bray wrote:
>> > >> [...]
>> > >> The attached test demonstrates the bug.  Given an output buffer of N
>> > >> wide characters, wcsxfrm will cause bytes beyond the destination size
>> > >> to be reversed. I believe it might actually be a bug in the underlying
>> > >> LCMapStringW workhorse (this is on Windows 10; have not tested other
>> > >> versions).
>> > >>
>> > >> According to its docs [1], the cchDest argument (size of the
>> > >> destination buffer) is treated as a *byte* count when using
>> > >> LCMAP_SORTKEY.  However, for the purposes of applying the
>> > >> LCMAP_BYTEREV transformation it seems to be treating the output size
>> > >> (in bytes) as character count.  So in the example I give, where the
>> > >> output sort key is 7 bytes (including the null terminator), it swaps
>> > >> *14* bytes--the bytes including the sort key as well as the next 7
>> > >> adjacent bytes.  This is obviously a problem if the destination buffer
>> > >> is allocated out of some larger memory pool.
>> > >>
>> > >> This definitely has to be a bug, right?  Or at least very poorly
>> > >> documented on MS's part.  A workaround would either be to not use
>> > >> LCMAP_BYTEREV and just swap the bytes manually, or in a second call to
>> > >> LCMapStringW with LCMAP_BYTEREV and the correct character count...
>> > >> [...]
>> Incredible piece of detective work.  Yeah, this looks very much like a
>> bug in LCMapStringW, embarrassingly one I didn't notice at the time of
>> writing the code.  I just tested this on W7 and the problem was already
>> present.
>>
>> I don't know if this could be fixed by documentation.  There's just no
>> safe way to combine LCMAP_SORTKEY with LCMAP_BYTEREV it seems, unless
>> you always allocate a buffer at least twice as big as actually required
>> for the sort key.
>>
>> So, yeah, I think manual swaping after calling LCMapStringW without
>> LCMAP_BYTEREV is the way to go.  I'll prepare a fix.
>>
>>
>> Thanks for tracking down and the testcase,
>> Corinna
>
> Latest snapshot from https://cygwin.com/snapshots/ has the patch applied.
> Please try.

I had a look at the fix and it looks good to me; thanks!

Erik

--
Problem reports:       http://cygwin.com/problems.html
FAQ:                   http://cygwin.com/faq/
Documentation:         http://cygwin.com/docs.html
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2017-06-09 13:03 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-05-10  9:31 Bug? wcsxfrm causing memory corruption Erik Bray
2017-05-21 17:18 ` Duncan Roe
2017-05-22 13:39   ` Erik Bray
2017-06-06 15:57     ` Corinna Vinschen
2017-06-06 19:22       ` Corinna Vinschen
2017-06-09 13:03         ` Erik Bray

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).