public inbox for newlib@sourceware.org
From: "panda.trooper" <panda.trooper@protonmail.com>
To: "newlib@sourceware.org" <newlib@sourceware.org>
Subject: Re: [EXTERNAL]: Re: Why int32_t is long int on 32 Bit Intel?
Date: Fri, 28 Jul 2023 18:27:20 +0000
Message-ID: <5R8qxH6_HYtlUHJwlmty3kXezOsTViDs7SS5LKuaYjuv0deTlTbaqrPDHb9jMAlRkhEPjUvslRMThonV8TOEnmMvOXKFEAk5DZAGXvfmubA=@protonmail.com>
In-Reply-To: <MN2PR13MB4510E6C22E4E3DA58A189829C406A@MN2PR13MB4510.namprd13.prod.outlook.com>

Thanks, Mike, for pointing back to the original question. I would like to describe the problem with another example to get across what I do not understand.

I would also like to note that, as far as the C and C++ standards are concerned, there is no issue. The standards only specify the sizes of the fixed-width types; they do not specify which underlying types those names alias. This issue is just about (my) expectations and curiosity.

It is also important that the issue only exists when using C++. There is no issue when using C, because C does not support overloading.

Let's assume we have a library that implements the following functions:

void foo(int n) {
// go left n steps
}

void foo(long n) {
// go right n steps
}

Who would write such silly code, you may ask. Believe me, there is such code out there. Sometimes it is legacy code that you cannot change.

When using this library, you may have code like this somewhere:

#include <cstdint>

void bar(int32_t n) {
foo(n);
}

Where are we going now? Left or right? The answer is, as often in C++: it depends. It depends on the compiler you are using and what your target architecture is.

Does it also depend on the standard C library we are using? As I recently learned, yes. But I did not expect it.

When I compile the example code for x86-32 with GCC and glibc, I go left. The same happens with 32-bit MinGW. Only with newlib do I go right.
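To make the difference visible, here is a minimal sketch that reports which overload is picked. The overloads return a character instead of moving, and the names int32_is_int and int32_is_long are made up for this illustration. It assumes a toolchain where int32_t aliases either int or long, which covers the three setups mentioned above.

```cpp
#include <cstdint>
#include <type_traits>

// Stand-ins for the library's overloads; they return a label instead of
// moving so that the selected overload can be observed.
inline char foo(int)  { return 'L'; }  // "go left"
inline char foo(long) { return 'R'; }  // "go right"

// Same shape as bar() in the example above.
inline char bar(std::int32_t n) { return foo(n); }

// Illustrative helpers (not part of any library): which built-in type
// does the C library's int32_t alias on this toolchain?
constexpr bool int32_is_int  = std::is_same<std::int32_t, int>::value;
constexpr bool int32_is_long = std::is_same<std::int32_t, long>::value;
```

Calling bar(1) yields 'L' when int32_is_int is true and 'R' when int32_is_long is true, so the result follows the typedef chosen by the C library headers rather than anything in the calling code.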

In my real-life project I have a similar situation. I have a single codebase for an embedded system that I compile with an i686-elf compiler and newlib, and there is a simulation of that embedded system that I compile with a native x86_64-unknown-linux-gnu compiler (-m32).

I realized that I may get different behavior in my embedded setup and in my simulation because of this. This issue affects my type system. I cannot use int32_t, because I am not in control of the underlying type, even on the very same CPU architecture! I find this amazing.
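One possible way to take back control, sketched under the assumption that int is 32 bits wide on every target the codebase is built for: use a project-owned alias instead of int32_t at the call sites that reach overloaded functions. The namespaces legacy and app and the alias app::i32 are hypothetical names for this illustration only.

```cpp
#include <climits>
#include <cstdint>

namespace legacy {
// Stand-ins for the overloaded library functions that cannot be changed.
inline char foo(int)  { return 'L'; }  // "go left"
inline char foo(long) { return 'R'; }  // "go right"
}

namespace app {
// Hypothetical project-owned alias: always int, regardless of which
// built-in type the C library picked for int32_t.
using i32 = int;

// Fail the build on any target where the assumption does not hold.
static_assert(sizeof(i32) * CHAR_BIT == 32, "app::i32 must be 32 bits wide");
static_assert(sizeof(i32) == sizeof(std::int32_t),
              "app::i32 and int32_t must have the same size");

// Overload resolution now sees int on every toolchain.
inline char bar(i32 n) { return legacy::foo(n); }
}
```

With this, app::bar(3) selects foo(int) in both the newlib build and the glibc build. The price is that the alias is no longer guaranteed by the standard headers, which is why the static_assert lines are there.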

So again, just out of curiosity: what is the reason for using long as int32_t on an architecture where int would be suitable, too?
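As a side note on the printf mismatch warnings mentioned in the quoted message below: the PRId32 macro from <cinttypes> expands to the conversion specifier that matches whatever type int32_t aliases. A minimal sketch follows; format_i32 is a made-up helper name for this illustration.

```cpp
#include <cinttypes>
#include <cstddef>
#include <cstdint>
#include <cstdio>

// Illustrative helper: formats n as decimal text into buf.
// PRId32 expands to "d" or "ld" (or similar) depending on the
// underlying type of int32_t, so the format string always matches.
inline int format_i32(char* buf, std::size_t size, std::int32_t n) {
    return std::snprintf(buf, size, "%" PRId32, n);
}
```

The space between the string literal and PRId32 is needed in C++11 and later, so that the macro name is not parsed as a literal suffix.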

------- Original Message -------
On Friday, July 28th, 2023 at 17:49, Mike Burgess <Mike.Burgess@coherent.com> wrote:

> I think the original question was, “Why, on a 32-bit architecture, does int32_t (and by extension, size_t) become `long` instead of just `int`?” This becomes an immediate portability headache; I have several compilers—for the _same_ architecture—that make different choices, which means I need to use conditional compilation, casts, or other ugly work-arounds or suffer misguided warnings (i.e., printf argument type mismatch). Printf has a pretty good solution for size_t nowadays, but not so much for int32_t.
>
> Mike
> ---------------------------------------------------------------
>
> From: Newlib <newlib-bounces+mburgess=ii-vi.com@sourceware.org> on behalf of Joel Sherrill <joel@rtems.org>
> Sent: Friday, July 28, 2023 10:15 AM
> To: Anders Montonen <Anders.Montonen@iki.fi>
> Cc: panda.trooper <panda.trooper@protonmail.com>; newlib@sourceware.org <newlib@sourceware.org>
> Subject: [EXTERNAL]: Re: Why int32_t is long int on 32 Bit Intel?
>
> In my experience which dates back to the 80s including 80186 development
> and decades with RTEMS, an int matches the native register size.
>
> As a general rule, 16-bit CPUs have 16 bit int, 32-bit CPUs have 32-bit
> int, and 64-bit CPUs have 64-bit ints.
>
> There may be compiler options to change the register model but this means
> all source must be compiled with this option. The aarch64 has LP64 (native
> 64-bit) and ILP32 (like 32-bit ARM) and this is the option description from
> GCC:
>
> -mabi=name
> <https://gcc.gnu.org/onlinedocs/gcc/AArch64-Options.html#index-mabi>
>
> Generate code for the specified data model. Permissible values are ‘ilp32’
> for SysV-like data model where int, long int and pointers are 32 bits, and
> ‘lp64’ for SysV-like data model where int is 32 bits, but long int and
> pointers are 64 bits.
>
> The default depends on the specific target configuration. Note that the
> LP64 and ILP32 ABIs are not link-compatible; you must compile your entire
> program with the same ABI, and link with a compatible set of libraries.
> If you look at the C standard, you want to look at "5.2.4.2.1 Sizes of
> integer types <limits.h>" in C99. This defines the minimum ranges of each
> integer type. Picking one of the values at random, this is a typical
> description:
>
> — maximum value for an object of type int
> INT_MAX +32767 // 2^15 − 1
>
> If you want another esoteric area, char may be signed or unsigned and it
> varies based on architecture even with GCC. I don't remember the exact
> distribution but RTEMS supports 18 processor architectures and I think the
> split is about 1/3 one way.
>
> --joel sherrill
> RTEMS
>
> On Fri, Jul 28, 2023 at 8:23 AM Anders Montonen <Anders.Montonen@iki.fi>
> wrote:
>
>> Hi,
>>
>> > On 28 Jul 2023, at 11:06, panda.trooper <panda.trooper@protonmail.com>
>> wrote:
>> >
>> >> On 2023-07-27 05:55, panda.trooper wrote:
>> >>
>> >>> Hi, can somebody explain what is the reason behind the architectural
>> decision that on x86 the type of int32_t is long int by default and not int
>> when using newlib?
>> >>
>> >>
>> >> Lots of embedded processors have 16 bit int and 32 bit long, and 80186
>> >> compatibles are still being produced and sold, although gcc -m16 now has
>> >> limitations.
>> >>
>> >> [The ancient PDP-11 is still supported by gcc 13:
>> >>
>> >>
>> https://gcc.gnu.org/onlinedocs/gcc/gcc-command-options/machine-dependent-options/pdp-11-options.html
>> >>
>> >> probably because it may still be exemplary CISC ISA in comp arch
>> courses using
>> >> simulators like SimH et al.]
>> >>
>> >> --
>> >> Take care. Thanks, Brian Inglis Calgary, Alberta, Canada
>> >>
>> >> La perfection est atteinte Perfection is achieved
>> >> non pas lorsqu'il n'y a plus rien à ajouter not when there is no more
>> to add
>> >> mais lorsqu'il n'y a plus rien à retirer but when there is no more to
>> cut
>> >> -- Antoine de Saint-Exupéry
>> >
>> > Ok, I understand, some embedded systems have 16 bit int. But why not
>> looking first if int is 32 bit and if yes, selecting that type as int32_t,
>> and if the size doesn't fit, look for other types?
>> >
>> > I am on x86 (32 bit) and have C++ code like this:
>> >
>> > void foo(long) {}
>> > void foo(int) {}
>> >
>> > Now this compiles with both my native Linux GCC and with my newlib based
>> i686-elf cross compiler. If I change this to this:
>> >
>> > void foo(long) {}
>> > void foo(int32_t) {}
>> >
>> > then it will still compile with native Linux GCC (int32_t is int) but
>> will fail with newlib i686-elf cross GCC, because both types are the same.
>> The newlib behavior is kind of unintuitive to me. It is correct, because
>> the standard only defines the size of the type, not the exact type. But I
>> would not expect to get different types on the same CPU architecture with
>> the same compiler just because I am using a different standard C library.
>> >
>> > Is this expectation wrong? I am unsure.
>>
>> The representation of data types is determined by the ABI. Most, if not
>> all, x86-32 ABIs use 4-byte longs. These things would probably have been
>> decided in the 80s, when the i386 was introduced.
>>
>> http://agner.org/optimize/calling_conventions.pdf
>> http://www.sco.com/developers/devspecs/abi386-4.pdf
>>
>> -a
