From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: by sourceware.org (Postfix, from userid 48)
    id 191E5385E836; Fri, 3 May 2024 08:38:38 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 191E5385E836
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org;
    s=default; t=1714725518;
    bh=myrQmmLMNrUPOIWWHOdG4f5bFNKs+wOtzw4uIosAWlU=;
    h=From:To:Subject:Date:In-Reply-To:References:From;
    b=w9AFs/QG2NfFxn9e2tZhFAto0496hcghtpowGowAa1+ujXAkKC/DZvXibbHCdtdrN
    hvfcFt78ruqDfMm+PX2adlSPm57UT9x5E8E6l2jl2D04JvdJnZsMzRLtKaOmmyghx9
    PfX4h51y/XV7/gMlU5rQeFQZbHeE5hyGp3ecpl74=
From: "pinskia at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug rtl-optimization/114902] [14/15 Regression] wrong code at -O3 with "-fno-tree-vrp -fno-expensive-optimizations -fno-tree-dominator-opts" on x86_64-linux-gnu
Date: Fri, 03 May 2024 08:38:37 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: rtl-optimization
X-Bugzilla-Version: 14.0
X-Bugzilla-Keywords: wrong-code
X-Bugzilla-Severity: normal
X-Bugzilla-Who: pinskia at gcc dot gnu.org
X-Bugzilla-Status: NEW
X-Bugzilla-Resolution:
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: 14.0
X-Bugzilla-Flags:
X-Bugzilla-Changed-Fields:
Message-ID:
In-Reply-To:
References:
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
List-Id:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114902

--- Comment #7 from Andrew Pinski ---
(In reply to Segher Boessenkool from comment #6)
> (In reply to Andrew Pinski from comment #2)
> > Looks like the issue is during combine.
> >
> > We go from CCGC with a sign_extend to a zero_extend with CCZ. that can't be
> > right.
>
> Why is that not correct?
> zero_extend is preferred over sign_extend, and both
> are equivalent when only checking for zero.

For equality they are equivalent, yes. But when doing `a >=s 0`, a sign
extend/extract will give a different result from a zero extend/extract.

> Is there something wrong in target code here, perhaps?

For arm, x86 and mips?

For the testcase in comment #4 on x86_64:
Before combine we start with:
```
(insn 16 15 17 2 (parallel [
            (set (reg:SI 106 [ t_4 ])
                (and:SI (reg:SI 105 [ tt1_3 ])
                    (const_int 1 [0x1])))
            (clobber (reg:CC 17 flags))
        ]) "/app/example.cpp":6:9 617 {*andsi_1}
     (expr_list:REG_DEAD (reg:SI 105 [ tt1_3 ])
        (expr_list:REG_UNUSED (reg:CC 17 flags)
            (nil))))
(insn 17 16 20 2 (parallel [
            (set (reg:SI 107 [ e_5 ])
                (neg:SI (reg:SI 106 [ t_4 ])))
            (clobber (reg:CC 17 flags))
        ]) "/app/example.cpp":7:9 804 {*negsi_1}
     (expr_list:REG_DEAD (reg:SI 106 [ t_4 ])
        (expr_list:REG_UNUSED (reg:CC 17 flags)
            (nil))))
(insn 20 17 21 2 (set (reg:CCGC 17 flags)
        (compare:CCGC (reg:SI 107 [ e_5 ])
            (const_int -1 [0xffffffffffffffff]))) "/app/example.cpp":8:16 11 {*cmpsi_1}
     (expr_list:REG_DEAD (reg:SI 107 [ e_5 ])
        (nil)))
(insn 21 20 22 2 (set (reg:QI 109)
        (ge:QI (reg:CCGC 17 flags)
            (const_int 0 [0]))) "/app/example.cpp":8:16 1125 {*setcc_qi}
     (expr_list:REG_DEAD (reg:CCGC 17 flags)
        (nil)))
(insn 22 21 23 2 (set (reg:SI 108 [ _1 ])
        (zero_extend:SI (reg:QI 109))) "/app/example.cpp":8:16 169 {*zero_extendqisi2}
     (expr_list:REG_DEAD (reg:QI 109)
        (nil)))
(insn 23 22 24 2 (set (reg:CCZ 17 flags)
        (compare:CCZ (reg:SI 108 [ _1 ])
            (const_int 0 [0]))) "/app/example.cpp":9:8 7 {*cmpsi_ccno_1}
     (expr_list:REG_DEAD (reg:SI 108 [ _1 ])
        (nil)))
(jump_insn 24 23 30 2 (set (pc)
        (if_then_else (eq (reg:CCZ 17 flags)
                (const_int 0 [0]))
            (label_ref 30)
            (pc))) "/app/example.cpp":9:8 1130 {*jcc}
     (expr_list:REG_DEAD (reg:CCZ 17 flags)
        (int_list:REG_BR_PROB 7 (nil)))
 -> 30)
```

We first combine 16->17 into:
```
(parallel [
        (set (reg:SI 107 [ e_5 ])
            (sign_extract:SI (reg:SI 105 [ tt1_3 ])
                (const_int 1 [0x1])
                (const_int 0 [0])))
        (clobber (reg:CC 17 flags))
    ])
```
which is correct and good.

And then when combining 17 -> 20, combine does:
Trying 17 -> 20:
   17: {r107:SI=sign_extract(r105:SI,0x1,0);clobber flags:CC;}
      REG_DEAD r105:SI
      REG_UNUSED flags:CC
   20: flags:CCGC=cmp(r107:SI,0xffffffffffffffff)
      REG_DEAD r107:SI
Successfully matched this instruction:
(set (reg:CCZ 17 flags)
    (compare:CCZ (zero_extract:SI (reg:SI 105 [ tt1_3 ])
            (const_int 1 [0x1])
            (const_int 0 [0]))
        (const_int 0 [0])))
Successfully matched this instruction:
(set (reg:QI 109)
    (ne:QI (reg:CCZ 17 flags)
        (const_int 0 [0])))

Which is also replacing insn 21 incorrectly.
We go from `-(a&1) >= -1` (which is always true) to `(a&1) != 0`.
Maybe we go to `(a&1) <= 1` (still always true) and we mess up somehow to
`(a & 1) != 0`.
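
For reference, a small C sketch of the comparisons discussed in this comment. It is not the testcase from comment #4. The function names are made up for illustration, and the 1-bit extracts are modeled with plain integer arithmetic (assuming bit position 0 and width 1, as in the RTL quoted in this comment) rather than anything GCC-specific.

```c
#include <stdint.h>

/* Hypothetical model of a 1-bit sign_extract at bit 0:
   yields 0 or -1, i.e. the same value as -(a & 1). */
int32_t sign_extract1(int32_t a) { return -(a & 1); }

/* Hypothetical model of a 1-bit zero_extract at bit 0: yields 0 or 1. */
int32_t zero_extract1(int32_t a) { return a & 1; }

/* The comparison before combine: -(a & 1) >= -1, signed.
   The left side is only ever 0 or -1, so this holds for every a. */
int before_combine(int32_t a) { return sign_extract1(a) >= -1; }

/* The comparison combine ended up with: (a & 1) != 0.
   This depends on the low bit of a. */
int after_combine(int32_t a) { return zero_extract1(a) != 0; }

/* Negating both sides of the original keeps the result: (a & 1) <= 1. */
int negated_form(int32_t a) { return zero_extract1(a) <= 1; }
```

Calling these with an even and an odd value of `a` shows `before_combine` and `negated_form` agreeing on both, while `after_combine` disagrees for even `a`. It also shows that the two extracts agree on `== 0` but not on a signed `>= 0`.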