From: Paul Eggert <eggert@cs.ucla.edu>
To: Alejandro Colomar <alx.manpages@gmail.com>,
    A <amit234234234234@gmail.com>,
    libc-alpha@sourceware.org
Subject: Re: size_t vs long.
Date: Thu, 17 Nov 2022 13:39:08 -0800
Message-ID: <27229b18-673b-d038-9a4c-c32c50ca547c@cs.ucla.edu>
In-Reply-To: <380b196e-b78e-3b0e-7399-ee106b0e716c@gmail.com>
>> Second and more important, that code is bogus. Nobody should ever write code like that. If I wrote code like that, I'd *want* a trap.
>
> for (size_t i = 41; i < sizeof A / sizeof A[0]; --i) {
>     A[i] = something_nice;
> }
>
> The code above looks like a bug if you're not used to it. Once you get used to it, it can become natural, but let's go for the more natural:
>
>
> for (size_t i = 0; i < sizeof A / sizeof A[0]; ++i) {
>     A[i] = something_nice;
> }
Those loops do not mean the same thing. The first is bogus; the second
one is OK (notice, the bogus loop has a "41", the OK loop doesn't).
I'm not surprised you didn't notice how bogus the first loop was - most
people wouldn't notice it either. And it's Gustedt's main point! I don't
know why he went off the rails with that overly-clever code, but he did.
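To make the difference between the two loops concrete, here is a minimal sketch that counts how many indexes each loop form visits. The helper names count_down_from and count_up are hypothetical, and the array is stood in for by a plain element count n. The down-counting form stops only because --i wraps 0 around to SIZE_MAX, which is not less than n.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: mimics "for (size_t i = start; i < n; --i)".
   Returns how many indexes the loop body would visit. */
size_t count_down_from(size_t start, size_t n)
{
    size_t visited = 0;
    for (size_t i = start; i < n; --i) {
        assert(i < n);   /* every visited index is in bounds */
        ++visited;
    }
    /* The loop ends when --i wraps 0 to SIZE_MAX, which fails i < n. */
    return visited;
}

/* Hypothetical helper: mimics "for (size_t i = 0; i < n; ++i)". */
size_t count_up(size_t n)
{
    size_t visited = 0;
    for (size_t i = 0; i < n; ++i)
        ++visited;
    return visited;
}
```

With a start of 41, a 100-element array has only indexes 0 through 41 visited, and a 10-element array has none visited at all, whereas the up-counting form always visits all n.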
> The main advantage of this code compared to the equivalent ssize_t or ptrdiff_t or idx_t code is that if you somehow write an off-by-one error, and manage to access the array at [-1], if i is unsigned you'll access [SIZE_MAX], which will definitely crash your program.
That's not true on the vast majority of today's platforms, which don't
have subscript checking, and for which a[-1] is treated the same way
a[SIZE_MAX] is. On my platform (Fedora 36 x86-64) the same machine code
is generated for 'a' and 'b' for the following C code.
    #include <stdint.h>

    int a(int *p) { return p[-1]; }
    int b(int *p) { return p[SIZE_MAX]; }
Yes, debugging implementations might catch p[SIZE_MAX], but the ones
that do will likely catch p[-1] as well.
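A rough way to see why the two subscripts collapse to the same address is to do the arithmetic on integers instead of pointers. The sketch below is only that: it assumes a flat address space where uintptr_t and size_t have the same width (it refuses to compile otherwise), and the function names are hypothetical. It sidesteps the out-of-bounds pointer arithmetic itself, which ISO C leaves undefined.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: uintptr_t and size_t have the same width, as on
   x86-64 GNU/Linux.  Refuse to compile otherwise. */
static_assert(sizeof(uintptr_t) == sizeof(size_t),
              "sketch assumes uintptr_t and size_t have equal width");

/* Integer address that p[-1] would name. */
uintptr_t addr_of_minus_one(const int *p)
{
    return (uintptr_t)p - sizeof(int);
}

/* Integer address that p[SIZE_MAX] would name: the byte offset
   SIZE_MAX * sizeof(int) wraps modulo SIZE_MAX + 1, leaving an
   offset equivalent to -sizeof(int). */
uintptr_t addr_of_size_max(const int *p)
{
    size_t byte_offset = SIZE_MAX * sizeof(int);
    return (uintptr_t)p + byte_offset;
}
```

Under those assumptions the two functions return the same value for any p, which matches the identical machine code for 'a' and 'b'.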
In short, there's little advantage to using size_t for indexes, and
there are real disadvantages due to comparison confusion and lack of
signed integer overflow checking.
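As one illustration of the comparison confusion, here is a small sketch with two hypothetical helpers. It assumes ptrdiff_t is no wider than size_t, which holds on mainstream platforms and is checked at compile time. In the naive version the signed operand is silently converted to size_t, so -1 compares as a huge value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumption: ptrdiff_t is no wider than size_t, so a mixed comparison
   converts the signed operand to size_t. */
static_assert(sizeof(ptrdiff_t) <= sizeof(size_t),
              "sketch assumes ptrdiff_t is no wider than size_t");

/* Naive mixed comparison: i is implicitly converted to size_t, so a
   negative i turns into a huge unsigned value before the test. */
bool less_naive(ptrdiff_t i, size_t n)
{
    return i < n;
}

/* Careful version: handle the negative case before converting. */
bool less_careful(ptrdiff_t i, size_t n)
{
    return i < 0 || (size_t)i < n;
}
```

Here less_naive(-1, 1) is false even though -1 is mathematically less than 1, while less_careful(-1, 1) is true. GCC and Clang can flag the naive form with -Wsign-compare.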
>> First, Gustedt is technically incorrect, because the code *can* trap on
>> platforms where SIZE_MAX <= INT_MAX,
> I honestly don't know of any existing platforms where that is true
They're a dying breed. The main problem from my point of view is that C
and POSIX allow these oddballs, so if you want to write really portable
code you have to worry about them - and this understandably discourages
people from writing really portable code. (What's the point of coding to
the standards if it's just a bunch of make-work?)
Anyway, one example is Unisys Clearpath C, in which INT_MAX and SIZE_MAX
both equal 2**39 - 1. This is allowed by the current POSIX and C
standards, and this compiler is still for sale and supported. (I doubt
whether they'll port it to C23, so there's that....)
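For anyone who wants to check which kind of platform they are on, here is a minimal sketch. Both function names are hypothetical. The first simply tests the SIZE_MAX <= INT_MAX condition at preprocessing time; the second checks at run time whether subtracting 1 from a size_t zero wraps to a positive value, which is what happens when the subtraction is carried out in an unsigned type.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True on the oddball platforms discussed above, where every size_t
   value also fits in an int. */
bool size_max_fits_in_int(void)
{
#if SIZE_MAX <= INT_MAX
    return true;
#else
    return false;
#endif
}

/* True when 0 - 1 computed on a size_t operand yields a positive
   value, i.e. the subtraction was done in an unsigned type and
   wrapped around instead of producing a negative int. */
bool size_t_subtraction_wraps(void)
{
    size_t zero = 0;
    return zero - 1 > 0;
}
```

On mainstream platforms such as x86-64 GNU/Linux the first function returns false and the second returns true.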
> C23 will require that signed integers are 2's complement, which I guess
> removes the possibility of a trap
It doesn't remove the possibility, since signed integers can have trap
representations. But we are straying from the more important point.