From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-bugzilla@gcc.gnu.org>
Received: by sourceware.org (Postfix, from userid 48)
 id A3A5F385DC2C; Fri,  3 Apr 2020 08:12:10 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org A3A5F385DC2C
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org;
 s=default; t=1585901530;
 bh=lBU7KZyt8eMDFG7c0LuLz61xIUCM15nXXVXjlAiqMq4=;
 h=From:To:Subject:Date:In-Reply-To:References:From;
 b=n/d5GJ4KaStpsID2/a5f5rBW22s75ZJXz+HIn9GO5h/QTcjZ3SiuFaaSN3ozA5E9B
 XRFCbWQRGO7IUPy2K0qHeK37PBVkkyU/qAuRdK0w5Y0Zhi8NPNKPdCW5CAdYGNLxfG
 PrVOJ3E2Vn8FtdiMAiSv10L77bpUC+ejFjRjlgMk=
From: "rguenth at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug rtl-optimization/93264] [10 Regression] ICE in
 cfg_layout_redirect_edge_and_branch_force, at cfgrtl.c:4522
Date: Fri, 03 Apr 2020 08:12:10 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: rtl-optimization
X-Bugzilla-Version: 10.0
X-Bugzilla-Keywords: 
X-Bugzilla-Severity: normal
X-Bugzilla-Who: rguenth at gcc dot gnu.org
X-Bugzilla-Status: NEW
X-Bugzilla-Resolution: 
X-Bugzilla-Priority: P1
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: 10.0
X-Bugzilla-Flags: 
X-Bugzilla-Changed-Fields: 
Message-ID: <bug-93264-4-hpxLyKhK6A@http.gcc.gnu.org/bugzilla/>
In-Reply-To: <bug-93264-4@http.gcc.gnu.org/bugzilla/>
References: <bug-93264-4@http.gcc.gnu.org/bugzilla/>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
X-BeenThere: gcc-bugs@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gcc-bugs mailing list <gcc-bugs.gcc.gnu.org>
List-Unsubscribe: <http://gcc.gnu.org/mailman/options/gcc-bugs>,
 <mailto:gcc-bugs-request@gcc.gnu.org?subject=unsubscribe>
List-Archive: <https://gcc.gnu.org/pipermail/gcc-bugs/>
List-Post: <mailto:gcc-bugs@gcc.gnu.org>
List-Help: <mailto:gcc-bugs-request@gcc.gnu.org?subject=help>
List-Subscribe: <http://gcc.gnu.org/mailman/listinfo/gcc-bugs>,
 <mailto:gcc-bugs-request@gcc.gnu.org?subject=subscribe>
X-List-Received-Date: Fri, 03 Apr 2020 08:12:10 -0000

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=3D93264
--- Comment #12 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Roman Zhuykov from comment #6)
> First, I want here to mention that Richard have recently discussed
> partitioning in mailing list with Segher, starting from
> https://gcc.gnu.org/ml/gcc-patches/2020-02/msg00666.html
>=20
> > I'll run some cross-testing to check how it works for now.
> Second, those tests have finished and there is nothing to say about the
> results. From correctness aspect everything looks fine. =C2=A0But I haven=
't tried
> to look at each example where it prevents/allows loops to be scheduled, a=
nd
> know nothing about performance impact.
>=20
> (In reply to rsandifo@gcc.gnu.org from comment #4)
> > Yeah, it ought to be better to do mode switching first. =C2=A0But I thi=
nk
> > the more important ones are:
> >
> > =C2=A0 =C2=A0 =C2=A0 =C2=A0NEXT_PASS (pass_split_all_insns);
> > =C2=A0 =C2=A0 =C2=A0 =C2=A0NEXT_PASS (pass_lower_subreg3);
> >
> > Scheduling should happen on the split form of insns rather than the
> > unsplit form. =C2=A0lower_subreg should also improve "schedulability".
> Agreed, so it would be much better to fix the issue conservatively without
> moving the pass.

As of pass ordering partitioning needs to run before RA but I don't see
why it necessarily needs to run that early.  Well, it works on CFG layout
mode and it's the "last" pass before we go out of CFG layout - but obviously
SMS goes back into so we could do the very same dance for partitioning
and only schedule it after SMS (or even later).

The whole complication arises from the direct CFG edges that are
implemented with indirect jumps and which we cannot easily redirect.
A more high-level representation of those "direct jumps" would make
redirection possible again (at the expense of micro-optimizing the
reg set + indirect jump between partitioning and the point we
expose the operation).  The only thing we need is the RA giving us
a scratch register we can use.  Maybe we can have a special named
pattern for those jumps allocating an extra scratch and have targets
either split them after reload into true indirect jumps or leave them
in place all the way to asm output.  So they'd appear as simple_jump_p
for a much longer time?

Currently we have

(insn 577 350 578 27 (set (reg:DI 309)
        (high:DI (label_ref:DI 220))) "pr71550.c":19:20 -1
     (insn_list:REG_LABEL_OPERAND 220 (nil)))
(insn 578 577 579 27 (set (reg:DI 308)
        (lo_sum:DI (reg:DI 309)
            (label_ref:DI 220))) "pr71550.c":19:20 -1
     (insn_list:REG_LABEL_OPERAND 220 (expr_list:REG_EQUAL (label_ref:DI 22=
0)
            (nil))))
(jump_insn/j 579 578 220 27 (set (pc)
        (reg:DI 308)) "pr71550.c":19:20 -1
     (nil)

not sure if jump_insns can have extra operands for the scratch but
we want part of the jump to be like

(jump_insn/j 579 578 220 27 (set (pc)
        (label_ref:DI 220)) "pr71550.c":19:20 -1
     (nil)

but also mention (reg:DI 309) as scratch.

Ideas?  I guess doing

(jump_insn/j 579 578 220 27 [
  (set (pc) (label_ref:DI 220))
  (clobber (reg:DI 309))])

might work?  The pattern of the jump insn woudl be a parallel though
so not match simplejump_p.  There's also JUMP_LABEL, not sure if that's
usable.  There's condjump_in_parallel_p so we could add
simplejump_in_parallel_p as well and use that in CFG manipulation.

This woudl of course need support from targets but where partitioning
decides it needs to emit indirect jumps it could say that it needs
such target support or otherwise refuse to operate.  Currently
we look at HAS_LONG_COND_BRANCH / HAS_LONG_UNCOND_BRANCH where
only failure in the latter needs indirect jumps using "indirect_jump".
So we'd add sth like

(define_insn "crossing_jump"
  [(set (pc) (label_ref (match_operand 0 "" ""))
   (clobber (match_scratch:DI 1))]
  ""
  "br\\t%1"
  [(set_attr "type" "branch")]
)

with the assembly part adjusted accordingly (the move is missing).  As said
a late splitter may be the best approach (but that's up to the target).

And partitioning would require either HAS_LONG_UNCOND_BRANCH or
targetm.have_crossing_jump and we'd arrange for CFG manipulation to
handle this form.
Oh, and while these issues can trigger in a way appearing as P1 regression
they of course really are not new issues.  So yeah, not really P1.=