From: Stefan Franke
To: gcc-help@gcc.gnu.org
Date: Wed, 10 Apr 2024 16:25:03 +0200
Subject: Re: AW: optimizer discards sign information
References: <016501da8b24$735d5170$5a17f450$@franke.ms> <018301da8b28$e7a601e0$b6f205a0$@franke.ms>
Message-ID: <51BFC92A-35A4-4287-8FB0-F8258DAF0D9D@franke.ms>

On 10 April 2024 13:52:39 CEST, David Brown via Gcc-help wrote:
>On 10/04/2024 11:24, stefan@franke.ms wrote:
>>> -----Original Message-----
>>> From: Alexander Monakov
>>> Sent: Wednesday, 10 April 2024 11:17
>>> To: stefan@franke.ms
>>> Cc: gcc-help@gcc.gnu.org
>>> Subject: Re: optimizer discards sign information
>>>
>>>
>>> On Wed, 10 Apr 2024, stefan@franke.ms wrote:
>>>
>>>> Hi all,
>>>>
>>>> I just stumbled over an issue, which is present in almost all gcc versions.
>>>> I worked around it using inline assembly… Maybe gcc behaves correctly and I
>>>> am wrong? Here is the code:
>>>>
>>>> https://godbolt.org/z/cW8jcdh56
>>>>
>>>> typedef unsigned long long int u64;
>>>> typedef unsigned int u32;
>>>> typedef unsigned short u16;
>>>>
>>>> u64 foo(u16 a, u16 b) {
>>>>     u32 x = a * b;
>>>>     u64 r = x;
>>>>     return r;
>>>> }
>>>>
>>>> And on gcc 13.2 x86-64 you get
>>>>
>>>> foo:
>>>>         movzx esi, si
>>>>         movzx edi, di
>>>>         imul edi, esi
>>>>         movsx rax, edi
>>>>         ret
>>>>
>>>>
>>>> There is a sign extension! The optimizer step discards the information
>>>>
>>>> x_6 = (u32) _3;
>>>>
>>>> and uses _3 directly instead, which is signed.
>>>>
>>>> Am I wrong or is it gcc?
>>>
>>> GCC is not wrong. When your code computes x:
>>>
>>> u32 x = a * b;
>>>
>>> 'a' and 'b' are first promoted to int according to C language rules, and the
>>> multiplication happens in the signed int type, with UB on overflow.
>>> The compiler deduces that the range of the signed int temporary holding the result of
>>> the multiplication is [0, 0x7fffffff], which allows it to propagate that temporary to the
>>> assignment of 'r' (which in the end produces a sign extension, as you
>>> observed, so the propagation did not turn out to be useful).
>>>
>>> u16 * u16 is a famous footgun for sure. I'd suggest 'x = 1u * a * b'
>>> as a fix for the code.
>>>
>>> Alexander
>>
>> Thank you for the fix 😊
>> I considered
>>
>> u32 x = a * b;
>>
>> as good enough, since from my understanding, x *is* unsigned.
>>
>> Adding the multiplication with 1u resolves this.
>>
>
>From the wording you use, I think perhaps you have (or had) two misunderstandings about the way C works here. First, when you have an expression "x = y", the type of "x" is irrelevant to the evaluation of the expression "y", its value, and the validity of the value.
>
>Secondly, if you hit undefined behaviour somewhere, /everything/ afterwards is undefined behaviour. You cannot "correct" it, or force it in some ways to establish properties about it. And the compiler can assume it does not happen (or you don't care about anything that might happen). This lets it optimise based on these assumptions.
>
>So the fact that "x" is declared as an unsigned type is basically irrelevant. The semantics of the code "u32 x = a * b;", for the type sizes of x86-64, are that the compiler must take the u16 value in "a" and convert it into a 32-bit "int" preserving its value. It does the same with "b". It knows these are two 32-bit signed ints with values between 0 and 0xffff.
>
>Then it is asked to multiply them. It knows the result does not overflow - because overflow in the operation would be UB. Thus it knows the result is between 0 and 0x7fff'ffff.
>
>This is then converted to a u32. The compiler can do this any way it likes, as long as the value is conserved for all possible values (0 to 0x7fff'ffff). Typically it will be a no-op.
>
>Then this is converted to a u64, and again the compiler can do so in any manner that is value conserving for all possible values. Since it knows the value is in the range 0 to 0x7fff'ffff, either a sign extension from 32-bit to 64-bit, or a zero extension, will work fine. (It could also explicitly set the upper 32 bits to 0, or bitwise-and the register with 0x0000'0000'7fff'ffff, or shift it 33 bits left then 33 bits right, or any other operation that gives the same result in the end. That's a question of optimisation and efficiency, not correctness.)
>
>If it were to matter whether the compiler used sign extension or zero extension, you would already have executed undefined behaviour - and all bets are off. Thus it /doesn't/ matter which is used - the compiler can use either method, whichever it thinks is most efficient. (And if it picked the less efficient one, that's a missed optimisation bug, not a code correctness bug.)
>
>And if you try to rely on the effects of UB in your code, then your code has a bug. Different compilers and different options may give you different end results - undefined behaviour means /anything/ can happen, including the thing you wanted to happen!
>
>
>This particular case is a bit subtle, and it's easy to get caught out - after all, it's using unsigned types everywhere, and the common myth is that unsigned types don't have undefined behaviour. The reality is that it is /operations/ that have defined or undefined behaviour, not the types, and there are no arithmetic operations on types smaller than "int". "uint8_t" and "uint16_t" silently promote to "int" (on 32-bit or 64-bit systems), and operations on /int/ can have undefined behaviour.
>
>
>I hope that clears up any remaining misunderstandings - and I hope it didn't sound patronising about things that you already knew.
>
>David
>
>

Thank you for the detailed explanation.

It still doesn't fit together: if the compiler knows that it's unsigned, why does it emit code for sign extension on the m68k target?

Well. Since there are enough explanations and my expectations are simply wrong, consider this as closed.

Kind regards

Stefan
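
For reference, a minimal sketch of the '1u * a * b' fix suggested in the quoted text, assuming a target where int is 32 bits and unsigned short is 16 bits (as on x86-64). The name foo_fixed is hypothetical and does not appear in the thread; the static_assert is only there to illustrate that u16 * u16 is evaluated in int.

```c
#include <assert.h>   /* static_assert macro (C11) */

typedef unsigned long long int u64;
typedef unsigned int u32;
typedef unsigned short u16;

/* Assumption: int is wider than unsigned short (true on x86-64).
   Both u16 operands are then promoted to int, so the product has
   type int, regardless of the type it is later assigned to. */
static_assert(_Generic((u16)1 * (u16)1, int: 1, default: 0),
              "u16 * u16 is computed in int on this target");

/* Hypothetical fixed variant of foo() from the thread.
   1u * a converts a to unsigned int; multiplying that by b converts
   b to unsigned int as well. The product therefore wraps modulo
   2^32 instead of overflowing a signed int, so there is no UB. */
u64 foo_fixed(u16 a, u16 b) {
    u32 x = 1u * a * b;
    u64 r = x;
    return r;
}
```

With this form, foo_fixed(0xffff, 0xffff) is well defined and yields 0xfffe0001, because the multiplication is carried out in unsigned int. The conversion to u64 then has to preserve the full unsigned 32-bit range, so the compiler can no longer assume the top bit is clear.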