public inbox for glibc-bugs-regex@sourceware.org
From: "eggert at gnu dot org" <sourceware-bugzilla@sources.redhat.com>
To: glibc-bugs-regex@sources.redhat.com
Subject: [Bug regex/1278] regex undefined behavior with shifting past word length
Date: Fri, 02 Sep 2005 23:17:00 -0000
Message-ID: <20050902231722.28797.qmail@sourceware.org>
In-Reply-To: <20050831193645.1278.eggert@gnu.org>


------- Additional Comments From eggert at gnu dot org  2005-09-02 23:17 -------
Andreas is right.  For example, "unsigned long int x = ~0u;" will not
have an all-1s value on most 64-bit hosts.
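
To make the width problem concrete, here is a minimal sketch in C.  The
helper names are hypothetical and not part of regex; they only show what
an unsigned long int ends up holding after each initializer.

```c
#include <limits.h>

/* Hypothetical helpers, not part of regex: each returns the value an
   unsigned long int holds after being initialized from the constant
   under discussion.  */
static unsigned long int
init_from_not_zero_u (void)
{
  /* ~0u has type unsigned int, so its value is UINT_MAX.  Converting
     it to a wider unsigned long int zero-extends, which leaves the
     high-order bits clear.  */
  unsigned long int x = ~0u;
  return x;
}

static unsigned long int
init_from_minus_one (void)
{
  /* -1 is converted to the unsigned destination type modulo 2**N,
     which yields the maximum value (all bits 1) for any width N.  */
  unsigned long int x = -1;
  return x;
}
```

On a host where unsigned long int is wider than unsigned int,
init_from_not_zero_u () returns UINT_MAX rather than ULONG_MAX, whereas
init_from_minus_one () returns ULONG_MAX regardless of the width.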

In this particular hunk, ~0u would also work since the destination
type is unsigned short int.  So if you'd really rather use ~0u I
guess that would be OK.  However, as a style matter, it is confusing
to use ~0u in some unsigned contexts, while using -1 in other unsigned
contexts.  Since -1 always works, it's more consistent to use it in
all unsigned contexts.

For example, suppose someone later changes eps_reachable_subexps_map
from unsigned short int to unsigned long int, for performance reasons.
If the code used ~0u here, it would have to be changed to ~ (unsigned
long int) 0, and it's quite possible that people would forget to make
that change.  Whereas if we simply change it to -1 now, it will work
regardless of later changes like this.
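
The maintenance scenario above can be sketched as follows.  Here
subexp_map_t is a hypothetical stand-in for the type of
eps_reachable_subexps_map, and all_subexps is an illustrative helper,
not a function from regex.

```c
#include <limits.h>

/* Hypothetical stand-in for the type of eps_reachable_subexps_map.
   Changing this typedef to unsigned long int models the later change
   discussed above; the function body below needs no edit.  */
typedef unsigned short int subexp_map_t;

static subexp_map_t
all_subexps (void)
{
  /* -1 is converted to subexp_map_t modulo 2**N, so every bit is set
     for any unsigned width N.  */
  subexp_map_t map = -1;
  return map;
}
```

Had the initializer been ~0u, widening the typedef to unsigned long int
would silently leave the high-order bits clear on hosts where unsigned
long int is wider than unsigned int.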

I should mention that the situation is different in signed contexts.
In general one must use ~ (SIGNED_TYPE) 0 in that case to get an
all-1s pattern.  But signed bit-twiddling is trickier (since one must
in general worry about ~0 == 0 and overflow issues), and I'd rather
that the regex code stuck with unsigned bit-twiddling.
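
As a rough illustration of the signed idiom, and assuming a host where
long int has no padding bits (an assumption of this sketch, not
something the C standard guarantees), the following hypothetical helper
checks that ~ (long int) 0 has every bit set in its object
representation.

```c
#include <limits.h>
#include <string.h>

/* Hypothetical helper, not part of regex.  Returns nonzero if every
   byte of ~ (long int) 0 has all of its bits set.  Assumes long int
   has no padding bits.  */
static int
signed_all_ones_p (void)
{
  /* The cast makes ~ operate at the width of the signed type itself
     rather than at the width of int.  */
  long int x = ~ (long int) 0;
  unsigned char ones[sizeof x];

  /* UCHAR_MAX is a byte with all CHAR_BIT bits set.  */
  memset (ones, UCHAR_MAX, sizeof ones);
  return memcmp (&x, ones, sizeof x) == 0;
}
```

The sketch compares object representations rather than values, since
the value of an all-1s signed pattern depends on how the host
represents negative numbers.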


-- 


http://sources.redhat.com/bugzilla/show_bug.cgi?id=1278



