public inbox for cygwin@cygwin.com
 help / color / mirror / Atom feed
* Possibly buggy use of ctype.h macros.
@ 2024-01-02 22:18 Kaz Kylheku
  2024-01-02 23:25 ` Takashi Yano
  0 siblings, 1 reply; 6+ messages in thread
From: Kaz Kylheku @ 2024-01-02 22:18 UTC (permalink / raw)
  To: cygwin

[-- Attachment #1: Type: text/plain, Size: 327 bytes --]

Hi All, 

I noticed that this macro, defined in winsup/cygwin/local_includes/path.h: 

  #define isdrive(s) (isalpha (*(s)) && (s)[1] == ':') 

is being used with arguments of type char, like dereferenced "char *" pointers. 

Unless the isalpha implementation is robust against this, it should be isalpha((unsigned char) *(s)).

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Possibly buggy use of ctype.h macros.
  2024-01-02 22:18 Possibly buggy use of ctype.h macros Kaz Kylheku
@ 2024-01-02 23:25 ` Takashi Yano
  2024-01-02 23:56   ` Kaz Kylheku
  0 siblings, 1 reply; 6+ messages in thread
From: Takashi Yano @ 2024-01-02 23:25 UTC (permalink / raw)
  To: cygwin; +Cc: Kaz Kylheku

On Tue, 02 Jan 2024 14:18:15 -0800
Kaz Kylheku via Cygwin <cygwin@cygwin.com> wrote:
> I noticed that this macro, defined in winsup/cygwin/local_includes/path.h: 
> 
>   #define isdrive(s) (isalpha (*(s)) && (s)[1] == ':') 
> 
> is being used with arguments of type char, like dereferenced "char *" pointers. 
> 
> Unless the isalpha implementation is robust against this, it should be isalpha((unsigned char) *(s)).


https://cygwin.com/git/?p=newlib-cygwin.git;a=blob;f=newlib/libc/ctype/ctype_.h;h=a73870b3e4ade5545fd41bcc6f10446a22b4b46e;hb=HEAD

https://cygwin.com/git/?p=newlib-cygwin.git;a=blob;f=newlib/libc/ctype/ctype_.c;h=32ce4f3187dc528c3103d6d884708d7f364a6698;hb=HEAD#l82

-- 
Takashi Yano <takashi.yano@nifty.ne.jp>

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Possibly buggy use of ctype.h macros.
  2024-01-02 23:25 ` Takashi Yano
@ 2024-01-02 23:56   ` Kaz Kylheku
  2024-01-03  0:11     ` Takashi Yano
  0 siblings, 1 reply; 6+ messages in thread
From: Kaz Kylheku @ 2024-01-02 23:56 UTC (permalink / raw)
  To: Takashi Yano; +Cc: cygwin

On 2024-01-02 15:25, Takashi Yano via Cygwin wrote:
> On Tue, 02 Jan 2024 14:18:15 -0800
> Kaz Kylheku via Cygwin <cygwin@cygwin.com> wrote:
>> I noticed that this macro, defined in winsup/cygwin/local_includes/path.h:
>>
>>   #define isdrive(s) (isalpha (*(s)) && (s)[1] == ':')
>>
>> is being used with arguments of type char, like dereferenced "char *" pointers.
>>
>> Unless the isalpha implementation is robust against this, it should be isalpha((unsigned char) *(s)).
> 
> 
> https://cygwin.com/git/?p=newlib-cygwin.git;a=blob;f=newlib/libc/ctype/ctype_.h;h=a73870b3e4ade5545fd41bcc6f10446a22b4b46e;hb=HEAD

I see, so on Cygwin, the negative index is allowed, thanks to a displacement.

# define DEFAULT_CTYPE_PTR      ((char *) _ctype_b + 127)

But, 127? A (signed) char can go to -128. '\x80' will access _ctype_b[-1]. Is that also somehow OK?

(Tangential remark: all ctype implementations have to handle at least one negative value, namely EOF.)


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Possibly buggy use of ctype.h macros.
  2024-01-02 23:56   ` Kaz Kylheku
@ 2024-01-03  0:11     ` Takashi Yano
  2024-01-03  3:59       ` Kaz Kylheku
  0 siblings, 1 reply; 6+ messages in thread
From: Takashi Yano @ 2024-01-03  0:11 UTC (permalink / raw)
  To: cygwin; +Cc: Kaz Kylheku

On Tue, 02 Jan 2024 15:56:00 -0800
Kaz Kylheku <kaz@kylheku.com> wrote:

> On 2024-01-02 15:25, Takashi Yano via Cygwin wrote:
> > On Tue, 02 Jan 2024 14:18:15 -0800
> > Kaz Kylheku via Cygwin <cygwin@cygwin.com> wrote:
> >> I noticed that this macro, defined in winsup/cygwin/local_includes/path.h:
> >>
> >>   #define isdrive(s) (isalpha (*(s)) && (s)[1] == ':')
> >>
> >> is being used with arguments of type char, like dereferenced "char *" pointers.
> >>
> >> Unless the isalpha implementation is robust against this, it should be isalpha((unsigned char) *(s)).
> > 
> > 
> > https://cygwin.com/git/?p=newlib-cygwin.git;a=blob;f=newlib/libc/ctype/ctype_.h;h=a73870b3e4ade5545fd41bcc6f10446a22b4b46e;hb=HEAD
> 
> I see, so on Cygwin, the negative index is allowed, thanks to a displacement.
> 
> # define DEFAULT_CTYPE_PTR      ((char *) _ctype_b + 127)
> 
> But, 127? A (signed) char can go to -128. '\x80' will access _ctype_b[-1]. Is that also somehow OK?
> 
> (Tangential remark: all ctype implementations have to handle at least one negative value, namely EOF.)

https://cygwin.com/git/?p=newlib-cygwin.git;a=blob;f=newlib/libc/ctype/isalpha.c;h=8b8e78a296359a4a8b3687fd76b1b73cd707a6da;hb=HEAD

Perhaps, the off-by-one is for EOF as you guess.

-- 
Takashi Yano <takashi.yano@nifty.ne.jp>

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Possibly buggy use of ctype.h macros.
  2024-01-03  0:11     ` Takashi Yano
@ 2024-01-03  3:59       ` Kaz Kylheku
  2024-01-03  6:58         ` Takashi Yano
  0 siblings, 1 reply; 6+ messages in thread
From: Kaz Kylheku @ 2024-01-03  3:59 UTC (permalink / raw)
  To: Takashi Yano; +Cc: cygwin

On 2024-01-02 16:11, Takashi Yano via Cygwin wrote:
> Perhaps, the off-by-one is for EOF as you guess.

I doubt it. If EOF were out of range of char, it would have to be -129 or less,
so that -127 would look even more wrong.

I see EOF is just -1. That value will also be produced by '\xFF', or "\xff"[0], etc.

This -127 displacement is worth looking into; there is a smell
emanating from it.


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Possibly buggy use of ctype.h macros.
  2024-01-03  3:59       ` Kaz Kylheku
@ 2024-01-03  6:58         ` Takashi Yano
  0 siblings, 0 replies; 6+ messages in thread
From: Takashi Yano @ 2024-01-03  6:58 UTC (permalink / raw)
  To: cygwin

On Tue, 02 Jan 2024 19:59:57 -0800
Kaz Kylheku <kaz@kylheku.com> wrote:
> On 2024-01-02 16:11, Takashi Yano via Cygwin wrote:
> > Perhaps, the off-by-one is for EOF as you guess.
> 
> I doubt it. If EOF were out of range of char, it would have to be -129 or less,
> so that -127 would look even more wrong.
> 
> I see EOF is just -1. That value will also be produced by '\xFF', or "\xff"[0], etc.

In systems other than cygwin, ALLOW_NEGATIVE_CTYPE_INDEX might not
be set. In that case, 

 124 const char _ctype_[1 + 256] = {
 125         0,
 126         _CTYPE_DATA_0_127,
 127         _CTYPE_DATA_128_255
 128 };

is used for __CTYPE_PTR.

So, isalpha(EOF) reffers to 0 in line 125 via this trick.

  45 int
  46 isalpha (int c)
  47 {
  48         return(__CTYPE_PTR[c+1] & (_U|_L));
  49 }

In cygwin, both isalpha((char*)0xff) and isalpha(EOF) reffers to
_CTYPE_DATA_128_255[127] in line 89, while isalpha((unsigned char*)0xff)
reffers to _CTYPE_DATA_128_255[127] in line 91. 

  88 char _ctype_b[128 + 256] = {
  89         _CTYPE_DATA_128_255,
  90         _CTYPE_DATA_0_127,
  91         _CTYPE_DATA_128_255
  92 };

-- 
Takashi Yano <takashi.yano@nifty.ne.jp>

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2024-01-03  6:58 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-01-02 22:18 Possibly buggy use of ctype.h macros Kaz Kylheku
2024-01-02 23:25 ` Takashi Yano
2024-01-02 23:56   ` Kaz Kylheku
2024-01-03  0:11     ` Takashi Yano
2024-01-03  3:59       ` Kaz Kylheku
2024-01-03  6:58         ` Takashi Yano

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).