From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <alves.ped@gmail.com>
Received: from mail-wm1-f46.google.com (mail-wm1-f46.google.com
 [209.85.128.46])
 by sourceware.org (Postfix) with ESMTPS id A866D394D896
 for <binutils@sourceware.org>; Fri,  6 May 2022 10:43:30 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org A866D394D896
Authentication-Results: sourceware.org;
 dmarc=none (p=none dis=none) header.from=palves.net
Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=gmail.com
Received: by mail-wm1-f46.google.com with SMTP id
 a14-20020a7bc1ce000000b00393fb52a386so6622323wmj.1
 for <binutils@sourceware.org>; Fri, 06 May 2022 03:43:30 -0700 (PDT)
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20210112;
 h=x-gm-message-state:message-id:date:mime-version:user-agent:subject
 :content-language:to:cc:references:from:in-reply-to
 :content-transfer-encoding;
 bh=441i+PQ1FSNN6PiN90soICOvWjrCyRjmxzxIoT2vzo4=;
 b=7gf/8wVQeeE3nX2eSRl1/V0KQiAB0Oift2DVRROKntGQwoBdSVoQEmUt9nVQDkP1E3
 PT0Uz0O4c4Y9V9cP/selyjOMfG2wDcwOh9OdsjBZxyWoiN3jlcSCc+68+4ix9BZSzk0n
 KW6NBp6DYTU8zaJl+qBmjhB6ug/Ps5EC159F1pyQMK03pU9+E/3Duy5I4fSsH7PhH2Gu
 r8arVnA4Gdaq+UvbH+P9HzD+T1EaSWbvYd1gII326LnE7qbsz7Vu+iqtDNgx6UOJNop8
 iLv1eN8MP/yW9Z3jL9IoYtLG/m8OJWCYxuqGosJlIwj9nct1wd4fzW4dsPOPfSncSJJ/
 7rzA==
X-Gm-Message-State: AOAM530KU5kdLTIF69xlcfyntbqoQKYuP94vDy5I3m/4bJHjCG7ekXsc
 ACm0ZYT7JbdAVldoLPnLn9U=
X-Google-Smtp-Source: ABdhPJx5iUvmliB53D9YPExb/geE+pwVAHy3+O8RGXR0GDlTPRESW3wML86tI/B7861BvvJER9cJDA==
X-Received: by 2002:a05:600c:4f8d:b0:394:663d:3fff with SMTP id
 n13-20020a05600c4f8d00b00394663d3fffmr2784203wmq.24.1651833809529; 
 Fri, 06 May 2022 03:43:29 -0700 (PDT)
Received: from ?IPV6:2001:8a0:f924:2600:209d:85e2:409e:8726?
 ([2001:8a0:f924:2600:209d:85e2:409e:8726])
 by smtp.gmail.com with ESMTPSA id
 p11-20020a1c544b000000b003942a244f30sm9609206wmi.9.2022.05.06.03.43.28
 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128);
 Fri, 06 May 2022 03:43:28 -0700 (PDT)
Message-ID: <8d88e540-d3b2-8a4c-05b0-a4020bb57816@palves.net>
Date: Fri, 6 May 2022 11:43:27 +0100
MIME-Version: 1.0
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101
 Thunderbird/91.8.1
Subject: Re: 32-bit archs, want64=true, and gas integers
Content-Language: en-US
To: Jan Beulich <jbeulich@suse.com>
Cc: Hans-Peter Nilsson <hp@axis.com>, binutils@sourceware.org,
 Alan Modra <amodra@gmail.com>
References: <20220504075628.81292-1-luis.machado@arm.com>
 <YnI0kxET/7rV9g37@squeak.grove.modra.org>
 <88ada15d-c371-df10-368e-f1c9fb91c289@arm.com>
 <YnI73gXIaDL3pMwh@squeak.grove.modra.org>
 <20220504143703.58AD620462@pchp3.se.axis.com>
 <YnMAEzJFbSnDmJIV@squeak.grove.modra.org>
 <ccfa44e7-7435-4e80-1054-cec678431a16@palves.net>
 <YnRyTm3jJDf0dBN4@squeak.grove.modra.org>
 <7fc3846e-6498-ee3b-efce-a28efe9ce519@palves.net>
 <0c7d04f6-06fe-3f8e-93ea-f33418213be4@suse.com>
 <ffcc3952-67e2-86d2-852c-368955d9ebb8@palves.net>
 <36427241-7f48-6729-1dbd-c575d794ad17@suse.com>
From: Pedro Alves <pedro@palves.net>
In-Reply-To: <36427241-7f48-6729-1dbd-c575d794ad17@suse.com>
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
X-Spam-Status: No, score=-11.3 required=5.0 tests=BAYES_00,
 FREEMAIL_FORGED_FROMDOMAIN, FREEMAIL_FROM, GIT_PATCH_0,
 HEADER_FROM_DIFFERENT_DOMAINS, KAM_DMARC_STATUS, NICE_REPLY_A,
 RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2, SPF_HELO_NONE, SPF_PASS, TXREP,
 T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.4
X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on
 server2.sourceware.org
X-BeenThere: binutils@sourceware.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Binutils mailing list <binutils.sourceware.org>
List-Unsubscribe: <https://sourceware.org/mailman/options/binutils>,
 <mailto:binutils-request@sourceware.org?subject=unsubscribe>
List-Archive: <https://sourceware.org/pipermail/binutils/>
List-Post: <mailto:binutils@sourceware.org>
List-Help: <mailto:binutils-request@sourceware.org?subject=help>
List-Subscribe: <https://sourceware.org/mailman/listinfo/binutils>,
 <mailto:binutils-request@sourceware.org?subject=subscribe>
X-List-Received-Date: Fri, 06 May 2022 10:43:33 -0000

On 2022-05-06 11:17, Jan Beulich wrote:
> On 06.05.2022 12:01, Pedro Alves wrote:
>> On 2022-05-06 10:55, Jan Beulich wrote:
>>> On 06.05.2022 11:00, Pedro Alves wrote:
>>>> Looking at the "Numbers" chapter in the manual:
>>>>
>>>>  https://sourceware.org/binutils/docs/as/Numbers.html
>>>>
>>>> we have:
>>>>
>>>>  "Integers are numbers that would fit into an int in the C language. Bignums are integers, but they are stored in more than 32 bits."
>>>>
>>>> I note it says "more than 32 bits", not "32 bits or more".
>>>>
>>>> By my reading of the "Integers" and "Bignums" subsections, these integers are always signed.  Maybe
>>>> that should be explicitly said in the manual, instead of introducing signed vs unsigned numbers?
>>>>
>>>> And then, couldn't we make gas use int32_t for integers, and int64_t for Bignums (and clarify in the manual
>>>> that "more than 32 bits" is exactly "64 bits"?  IOW, use int64_t instead of bfd_vma in the
>>>> expression evaluation stuff.  
>>>
>>> Bignums are quite a bit wider than 64 bits, and on 64-bit architectures
>>> using just int32_t for integers is definitely insufficient (there are
>>> pretty limited operations one can do on bignums). I'm afraid the doc is
>>> simply outdated in talking about "int in C language"; it's more like
>>> "long", and even then only in Unix-like environments, so it would likely
>>> be even better to talk about {,u}intptr_t.
>>
>> But why {,u}intptr_t?
> 
> Because it's used for address calculations.
> 
>>  Why does it have to be the size of a pointer, which 
>> raises the question of -- which pointer, host or target?
> 
> Target.
> 
>>  And if you support
>> multiple targets, which target?
> 
> That of the biggest target. But that's no an issue in e.g. gas (and
> hence with expression evaluation), which only supports a single target
> (but perhaps multiple sub-targets) at a time. Sub-targets with more
> narrow address size indeed aren't always properly handled.

Sure, but a gas implementation detail is leaking into bfd, and contributing to
a 32-bit vs 64-bit mess in multiple tools that do support multiple targets properly,
which is what I'm trying to see if we could decouple.

> 
>> Couldn't gas always use int64_t for integers instead?
> 
> This might be possible. Once a 128-bit arch appears it'll end up
> outdated again, though.

I've never looked at gas code before, but I found the the code that parses integers,
and it seems to automatically handle integers larger than 64-bit, by promoting to bignum.
Maybe the ideal or the original intent was for integers to behave as as they had unlimited
precision, with non-bignum integers being an optimization?  And then make sure that all
operations, including division would handle bignums too?

Anyhow, here's a tentative patch of what I was imagining.  I'm not set up for testing
this properly.  I was hoping one of you guys would like the idea so much that
you'd run with it.  :-)

If something like this went in, then I assume that cris could go back to not setting
want64=yes.

>From 7fbbecc9eacd7e5dac076bfc25b099b9ff0ca1ac Mon Sep 17 00:00:00 2001
From: Pedro Alves <pedro@palves.net>
Date: Fri, 6 May 2022 11:15:11 +0100
Subject: [PATCH] gas: make non-bignum integers be always 64-bit

Currently, on 32-bit hosts, gas integers are either 32-bit or 64-bit,
depending on --enable-64-bit-bfd.  Make valueT be always uint64_t to
make gas behave the same on all hosts.

Change-Id: I6258dfcd975ea9abb9cc4eb1d4e604a077df3033
---
 gas/as.h   |  2 +-
 gas/expr.c | 65 ++++++++++++++----------------------------------------
 gas/expr.h |  1 -
 3 files changed, 18 insertions(+), 50 deletions(-)

diff --git a/gas/as.h b/gas/as.h
index 135abc8f23d..5b636cfd59a 100644
--- a/gas/as.h
+++ b/gas/as.h
@@ -139,7 +139,7 @@ typedef bfd_vma addressT;
 typedef bfd_signed_vma offsetT;
 
 /* Type of symbol value, etc.  For use in prototypes.  */
-typedef addressT valueT;
+typedef uint64_t valueT;
 
 #ifndef COMMON
 #ifdef TEST
diff --git a/gas/expr.c b/gas/expr.c
index 6ad8bee2733..5dce12fe99d 100644
--- a/gas/expr.c
+++ b/gas/expr.c
@@ -255,14 +255,6 @@ floating_constant (expressionS *expressionP)
   expressionP->X_add_number = -1;
 }
 
-uint32_t
-generic_bignum_to_int32 (void)
-{
-  return ((((uint32_t) generic_bignum[1] & LITTLENUM_MASK)
-	   << LITTLENUM_NUMBER_OF_BITS)
-	  | ((uint32_t) generic_bignum[0] & LITTLENUM_MASK));
-}
-
 uint64_t
 generic_bignum_to_int64 (void)
 {
@@ -288,31 +280,24 @@ integer_constant (int radix, expressionS *expressionP)
   char *name;		/* Points to name of symbol.  */
   symbolS *symbolP;	/* Points to symbol.  */
 
-  int small;			/* True if fits in 32 bits.  */
+  bool small;			/* True if fits in 64 bits.  */
 
-  /* May be bignum, or may fit in 32 bits.  */
-  /* Most numbers fit into 32 bits, and we want this case to be fast.
-     so we pretend it will fit into 32 bits.  If, after making up a 32
+  /* May be bignum, or may fit in 64 bits.  */
+  /* Most numbers fit into 64 bits, and we want this case to be fast.
+     so we pretend it will fit into 64 bits.  If, after making up a 64
      bit number, we realise that we have scanned more digits than
-     comfortably fit into 32 bits, we re-scan the digits coding them
+     comfortably fit into 64 bits, we re-scan the digits coding them
      into a bignum.  For decimal and octal numbers we are
      conservative: Some numbers may be assumed bignums when in fact
-     they do fit into 32 bits.  Numbers of any radix can have excess
+     they do fit into 64 bits.  Numbers of any radix can have excess
      leading zeros: We strive to recognise this and cast them back
      into 32 bits.  We must check that the bignum really is more than
-     32 bits, and change it back to a 32-bit number if it fits.  The
+     64 bits, and change it back to a 64-bit number if it fits.  The
      number we are looking for is expected to be positive, but if it
-     fits into 32 bits as an unsigned number, we let it be a 32-bit
+     fits into 64 bits as an unsigned number, we let it be a 64-bit
      number.  The cavalier approach is for speed in ordinary cases.  */
-  /* This has been extended for 64 bits.  We blindly assume that if
-     you're compiling in 64-bit mode, the target is a 64-bit machine.
-     This should be cleaned up.  */
-
-#ifdef BFD64
-#define valuesize 64
-#else /* includes non-bfd case, mostly */
-#define valuesize 32
-#endif
+
+#define valuesize (sizeof (valueT) * CHAR_BIT)
 
   if (is_end_of_line[(unsigned char) *input_line_pointer])
     {
@@ -384,7 +369,6 @@ integer_constant (int radix, expressionS *expressionP)
       maxdig = radix = 10;
       too_many_digits = (valuesize + 11) / 4; /* Very rough.  */
     }
-#undef valuesize
   start = input_line_pointer;
   c = *input_line_pointer++;
   for (number = 0;
@@ -455,23 +439,15 @@ integer_constant (int radix, expressionS *expressionP)
 	     && num_little_digits > 1)
 	num_little_digits--;
 
-      if (num_little_digits <= 2)
-	{
-	  /* will fit into 32 bits.  */
-	  number = generic_bignum_to_int32 ();
-	  small = 1;
-	}
-#ifdef BFD64
-      else if (num_little_digits <= 4)
+      if (num_little_digits <= 4)
 	{
 	  /* Will fit into 64 bits.  */
 	  number = generic_bignum_to_int64 ();
-	  small = 1;
+	  small = true;
 	}
-#endif
       else
 	{
-	  small = 0;
+	  small = false;
 
 	  /* Number of littlenums in the bignum.  */
 	  number = num_little_digits;
@@ -513,20 +489,12 @@ integer_constant (int radix, expressionS *expressionP)
       /* Again, c is char after number.  */
       /* input_line_pointer -> after c.  */
       know (LITTLENUM_NUMBER_OF_BITS == 16);
-      if (leader < generic_bignum + 2)
-	{
-	  /* Will fit into 32 bits.  */
-	  number = generic_bignum_to_int32 ();
-	  small = 1;
-	}
-#ifdef BFD64
-      else if (leader < generic_bignum + 4)
+      if (leader < generic_bignum + 4)
 	{
 	  /* Will fit into 64 bits.  */
 	  number = generic_bignum_to_int64 ();
 	  small = 1;
 	}
-#endif
       else
 	{
 	  /* Number of littlenums in the bignum.  */
@@ -1947,14 +1915,15 @@ expr (int rankarg,		/* Larger # is higher rank.  */
 	      as_warn (_("division by zero"));
 	      v = 1;
 	    }
-	  if ((valueT) v >= sizeof(valueT) * CHAR_BIT
+	  if ((valueT) v >= valuesize
 	      && (op_left == O_left_shift || op_left == O_right_shift))
 	    {
 	      as_warn_value_out_of_range (_("shift count"), v, 0,
-					  sizeof(valueT) * CHAR_BIT - 1,
+					  valuesize - 1,
 					  NULL, 0);
 	      resultP->X_add_number = v = 0;
 	    }
+#undef valuesize
 	  switch (op_left)
 	    {
 	    default:			goto general;
diff --git a/gas/expr.h b/gas/expr.h
index dff40857427..766c70146e3 100644
--- a/gas/expr.h
+++ b/gas/expr.h
@@ -186,7 +186,6 @@ extern int expr_symbol_where (symbolS *, const char **, unsigned int *);
 extern void current_location (expressionS *);
 extern symbolS *expr_build_uconstant (offsetT);
 extern symbolS *expr_build_dot (void);
-extern uint32_t generic_bignum_to_int32 (void);
 extern uint64_t generic_bignum_to_int64 (void);
 extern int resolve_expression (expressionS *);
 

base-commit: 0ee8858e7aeca5ba5f702204daad2ddd290ef229
-- 
2.36.0