* [Bug target/97312] New: [11 regression] aarch64/pr90838.c fails
@ 2020-10-07  7:06 clyon at gcc dot gnu.org
  2020-10-07  8:47 ` [Bug target/97312] " rguenth at gcc dot gnu.org
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: clyon at gcc dot gnu.org @ 2020-10-07  7:06 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97312

            Bug ID: 97312
           Summary: [11 regression] aarch64/pr90838.c fails
           Product: gcc
           Version: 11.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: clyon at gcc dot gnu.org
  Target Milestone: ---

I've noticed the following new failure:
FAIL: gcc.target/aarch64/pr90838.c scan-assembler-times and\t 2

This appeared between r11-3681 (g:29c650cd899496c4f9bc069d03d0d7ecfb632176)
and r11-3685 (g:fcae5121154d1c3382b056bcc2c563cedac28e74)
but r11-3684 broke the toolchain build, so the offending commit may be either
of those.


* [Bug target/97312] [11 regression] aarch64/pr90838.c fails
  2020-10-07  7:06 [Bug target/97312] New: [11 regression] aarch64/pr90838.c fails clyon at gcc dot gnu.org
@ 2020-10-07  8:47 ` rguenth at gcc dot gnu.org
  2020-10-07 13:10 ` aldyh at gcc dot gnu.org
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 5+ messages in thread
From: rguenth at gcc dot gnu.org @ 2020-10-07  8:47 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97312

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |11.0


* [Bug target/97312] [11 regression] aarch64/pr90838.c fails
  2020-10-07  7:06 [Bug target/97312] New: [11 regression] aarch64/pr90838.c fails clyon at gcc dot gnu.org
  2020-10-07  8:47 ` [Bug target/97312] " rguenth at gcc dot gnu.org
@ 2020-10-07 13:10 ` aldyh at gcc dot gnu.org
  2020-10-09  8:20 ` cvs-commit at gcc dot gnu.org
  2020-10-12  9:31 ` aldyh at gcc dot gnu.org
  3 siblings, 0 replies; 5+ messages in thread
From: aldyh at gcc dot gnu.org @ 2020-10-07 13:10 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97312

Aldy Hernandez <aldyh at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|                            |2020-10-07
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |WAITING

--- Comment #1 from Aldy Hernandez <aldyh at gcc dot gnu.org> ---
Confirmed.

This test is checking the final assembly for a specific sequence.  I don't
speak aarch64 assembly, but the IL is different coming out of evrp.

The first culprit is this difference in the mergephi1 dump:

   _9 = .CTZ (x_6(D));
-  _10 = _9 & 31;
+  _10 = _9;

The argument is an unsigned int, so assuming ints are 32 bits on aarch64, the
result of __builtin_ctz is always less than 32: the only input that could
yield 32 is 0, and a CTZ of 0 is undefined according to the GCC manual:
the GCC manual:

Built-in Function: int __builtin_ctz (unsigned int x)

    Returns the number of trailing 0-bits in x, starting at the least
significant bit position. If x is 0, the result is undefined. 

So a bitwise AND of anything less than 32 with 0x1f (31) is a no-op.
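
To make the no-op claim above concrete, here is a minimal sketch, assuming a
32-bit unsigned int and a GCC-compatible compiler. The table values and the
multiplier are taken from the ctz1 dump in this comment; the helper names
ctz_table and mask_is_noop are made up for illustration and are not part of
the testcase.

```c
#include <assert.h>
#include <stdint.h>

/* De Bruijn lookup table; same values as `table` in the ctz1 dump.  */
static const unsigned char table[32] = {
  0, 1, 28, 2, 29, 14, 24, 3, 30, 22, 20, 15, 25, 17, 4, 8,
  31, 27, 13, 23, 21, 19, 16, 7, 26, 12, 18, 6, 11, 5, 10, 9
};

/* Table-based count of trailing zeros (hypothetical helper name).
   Isolates the lowest set bit, multiplies by the De Bruijn constant
   125613361 and uses the top five bits as the table index.  */
static int
ctz_table (uint32_t x)
{
  return table[((x & -x) * 125613361u) >> 27];
}

/* Returns 1 when masking the builtin's result with 31 changes nothing
   and the builtin agrees with the table lookup.  X must be nonzero,
   because __builtin_ctz (0) is undefined.  */
static int
mask_is_noop (uint32_t x)
{
  int n = __builtin_ctz (x);
  return n == ctz_table (x) && (n & 31) == n;
}
```

For any nonzero input the result lies in [0, 31], so the mask has nothing to
clear. The table lookup itself returns table[0], i.e. 0, for a zero input,
which is the one case where it and the builtin are not required to agree.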

Are aarch64 ints 32 bits?

Here are the full IL differences:

--- legacy-evrp/pr90838.c.038t.mergephi1        2020-10-07 08:44:12.152358885 -0400
+++ ranger/pr90838.c.038t.mergephi1     2020-10-07 08:39:12.339296502 -0400
@@ -1,41 +1,41 @@

 ;; Function ctz1 (ctz1, funcdef_no=0, decl_uid=3587, cgraph_uid=1, symbol_order=0)

 ctz1 (unsigned int x)
 {
   static const char table[32] =
"\x00\x01\x1c\x02\x1d\x0e\x18\x03\x1e\x16\x14\x0f\x19\x11\x04\b\x1f\x1b\r\x17\x15\x13\x10\x07\x1a\f\x12\x06\v\x05\n\t";
   unsigned int _1;
   unsigned int _2;
   unsigned int _3;
   unsigned int _4;
   char _5;
   int _9;
   int _10;

   <bb 2> :
   _1 = -x_6(D);
   _2 = _1 & x_6(D);
   _3 = _2 * 125613361;
   _4 = _3 >> 27;
   _9 = .CTZ (x_6(D));
-  _10 = _9 & 31;
+  _10 = _9;
   _5 = (char) _10;
   return _10;

 }



 ;; Function ctz2 (ctz2, funcdef_no=1, decl_uid=3591, cgraph_uid=2, symbol_order=1)

 ctz2 (unsigned int x)
 {
   static short int table[64] = {32, 0, 1, 12, 2, 6, 0, 13, 3, 0, 7, 0, 0, 0, 0, 14, 10, 4, 0, 0, 8, 0, 0, 25, 0, 0, 0, 0, 0, 21, 27, 15, 31, 11, 5, 0, 0, 0, 0, 0, 9, 0, 0, 24, 0, 0, 20, 26, 30, 0, 0, 0, 0, 23, 0, 19, 29, 0, 22, 18, 28, 17, 16, 0};
   unsigned int _1;
   unsigned int _2;
   unsigned int _3;
   short int _4;
   int _8;

   <bb 2> :
   _1 = -x_5(D);
@@ -87,27 +87,27 @@


 ;; Function ctz4 (ctz4, funcdef_no=3, decl_uid=3601, cgraph_uid=4, symbol_order=5)

 ctz4 (long unsigned int x)
 {
   long unsigned int lsb;
   long unsigned int _1;
   long long unsigned int _2;
   long long unsigned int _3;
   char _4;
   int _9;
   int _10;

   <bb 2> :
   _1 = -x_5(D);
   lsb_6 = _1 & x_5(D);
   _2 = lsb_6 * 283881067100198605;
   _3 = _2 >> 58;
   _9 = .CTZ (x_5(D));
-  _10 = _9 & 63;
+  _10 = _9;
   _4 = (char) _10;
   return _10;

 }

The difference in assembly matches.  We have two fewer ANDs in the final output:

$ diff -u legacy.s ranger.s
--- legacy.s    2020-10-07 09:06:13.420446783 -0400
+++ ranger.s    2020-10-07 09:06:42.646646949 -0400
@@ -8,7 +8,6 @@
 ctz1:
        rbit    w0, w0
        clz     w0, w0
-       and     w0, w0, 31
        ret
        .size   ctz1, .-ctz1
        .align  2
@@ -36,7 +35,6 @@
 ctz4:
        rbit    x0, x0
        clz     x0, x0
-       and     w0, w0, 63
        ret
        .size   ctz4, .-ctz4

If my analysis is correct, someone aarch64-savvy should adjust this:

/* { dg-final { scan-assembler-times "and\t" 2 } } */


* [Bug target/97312] [11 regression] aarch64/pr90838.c fails
  2020-10-07  7:06 [Bug target/97312] New: [11 regression] aarch64/pr90838.c fails clyon at gcc dot gnu.org
  2020-10-07  8:47 ` [Bug target/97312] " rguenth at gcc dot gnu.org
  2020-10-07 13:10 ` aldyh at gcc dot gnu.org
@ 2020-10-09  8:20 ` cvs-commit at gcc dot gnu.org
  2020-10-12  9:31 ` aldyh at gcc dot gnu.org
  3 siblings, 0 replies; 5+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2020-10-09  8:20 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97312

--- Comment #2 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:781634daea8cb788efb33994f4a19df76598542e

commit r11-3744-g781634daea8cb788efb33994f4a19df76598542e
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Fri Oct 9 10:19:16 2020 +0200

    vrp: Fix up gcc.target/aarch64/pr90838.c [PR97312, PR94801]

    > Perhaps another way out of this would be document and enforce that
    > __builtin_c[lt]z{,l,ll} etc calls are undefined at zero, but C[TL]Z ifn
    > calls are defined there based on *_DEFINED_VALUE_AT_ZERO (*) == 2

    The following patch implements that, i.e. __builtin_c?z* now take full
    advantage of them being UB at zero, while the ifns are well defined at zero
    if *_DEFINED_VALUE_AT_ZERO (*) == 2.  That is what fixes PR94801.

    Furthermore, to fix PR97312, if it is well defined at zero and the value at
    zero is prec, we don't lower the maximum unless the argument is known to be
    non-zero.
    For gimple-range.cc I guess we could improve it if needed e.g. by returning
    a [0,7][32,32] range for .CTZ of e.g. [0,137], but for now it (roughly)
    matches what vr-values.c does.

    2020-10-09  Jakub Jelinek  <jakub@redhat.com>

            PR tree-optimization/94801
            PR target/97312
            * vr-values.c (vr_values::extract_range_basic) <CASE_CFN_CLZ,
            CASE_CFN_CTZ>: When stmt is not an internal-fn call or
            C?Z_DEFINED_VALUE_AT_ZERO is not 2, assume argument is not zero
            and thus use [0, prec-1] range unless it can be further improved.
            For CTZ, don't update maxi from upper bound if it was previously
            prec.
            * gimple-range.cc (gimple_ranger::range_of_builtin_call)
            <CASE_CFN_CLZ, CASE_CFN_CTZ>: Likewise.

            * gcc.dg/tree-ssa/pr94801.c: New test.
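
The range rule described in the commit message can be sketched roughly as
follows. This is an illustration only, not the code in vr-values.c or
gimple-range.cc: the struct, the function names and the two
defined_at_zero / value_at_zero parameters are hypothetical stand-ins for the
"is this an internal-fn call with C?Z_DEFINED_VALUE_AT_ZERO == 2" check, and
the argument is assumed to be an unsigned value with a single [lo, hi] range.

```c
#include <assert.h>

/* Hypothetical single-interval range for the result of a CTZ.  */
struct ctz_range { int lo; int hi; };

/* floor(log2(x)) for nonzero x; returns -1 for x == 0.  */
static int
floor_log2_u (unsigned int x)
{
  int n = -1;
  while (x != 0)
    {
      x >>= 1;
      n++;
    }
  return n;
}

/* Result range of CTZ on a PREC-bit argument known to lie in
   [ARG_LO, ARG_HI].  DEFINED_AT_ZERO is nonzero when the operation
   yields VALUE_AT_ZERO for a zero argument; otherwise the argument
   is assumed to be nonzero.  */
static struct ctz_range
ctz_result_range (unsigned int arg_lo, unsigned int arg_hi, int prec,
                  int defined_at_zero, int value_at_zero)
{
  /* A nonzero argument always gives a result in [0, prec-1].  */
  struct ctz_range r = { 0, prec - 1 };

  /* For nonzero x, ctz (x) <= floor_log2 (x) <= floor_log2 (arg_hi).  */
  if (arg_hi != 0)
    {
      int top = floor_log2_u (arg_hi);
      if (top < r.hi)
        r.hi = top;
    }

  /* If zero is possible and well defined, the range must still cover
     the value at zero, so the maximum is not lowered below it.  */
  if (defined_at_zero && arg_lo == 0)
    {
      if (value_at_zero < r.lo)
        r.lo = value_at_zero;
      if (value_at_zero > r.hi)
        r.hi = value_at_zero;
    }
  return r;
}
```

With the [0,137] example from the message and a value at zero of 32, this
single-interval sketch gives [0, 32] rather than the more precise
[0,7][32,32]; once the argument is known to be nonzero it gives [0, 7].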


* [Bug target/97312] [11 regression] aarch64/pr90838.c fails
  2020-10-07  7:06 [Bug target/97312] New: [11 regression] aarch64/pr90838.c fails clyon at gcc dot gnu.org
                   ` (2 preceding siblings ...)
  2020-10-09  8:20 ` cvs-commit at gcc dot gnu.org
@ 2020-10-12  9:31 ` aldyh at gcc dot gnu.org
  3 siblings, 0 replies; 5+ messages in thread
From: aldyh at gcc dot gnu.org @ 2020-10-12  9:31 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97312

Aldy Hernandez <aldyh at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|WAITING                     |RESOLVED

--- Comment #3 from Aldy Hernandez <aldyh at gcc dot gnu.org> ---
Fixed in trunk.
