public inbox for gcc-bugs@sourceware.org
From: "tnfchris at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug rtl-optimization/103350] [12 Regression] wrong code with -Os -fno-tree-ter on aarch64-unknown-linux-gnu since r12-2288-g8695bf78dad1a42636775843ca832a2f4dba4da3
Date: Mon, 13 Dec 2021 09:02:49 +0000
Message-ID: <bug-103350-4-I9JiSgmLZv@http.gcc.gnu.org/bugzilla/>
In-Reply-To: <bug-103350-4@http.gcc.gnu.org/bugzilla/>

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103350

Tamar Christina <tnfchris at gcc dot gnu.org> changed:

           What    |Removed                       |Added
----------------------------------------------------------------------------
                 CC|                              |tnfchris at gcc dot gnu.org
           Assignee|unassigned at gcc dot gnu.org |tnfchris at gcc dot gnu.org

--- Comment #6 from Tamar Christina <tnfchris at gcc dot gnu.org> ---
This and the report in PR103632 are caused by a bug in REE where it generates
incorrect code.

It's trying to eliminate the following zero extension

(insn 54 90 102 2 (set (reg:V4SI 33 v1 [orig:94 _5 ] [94])
        (zero_extend:V4SI (reg/v:V4HI 40 v8 [orig:112 v64u16_0D.3917 ] [112]))) "jon-inc.c":21:30 4106 {zero_extendv4hiv4si2}
     (nil))

by folding it in the definition of `v8`:

(insn 2 5 104 2 (set (reg/v:V4HI 40 v8 [orig:112 v64u16_0D.3917 ] [112])
        (reg:V4HI 32 v0 [156])) "jon-inc.c":15:1 1160 {*aarch64_simd_movv4hi}
     (nil))

which is fine, except that `v8` is also used by the extracts, e.g.:

(insn 11 10 12 2 (set (reg:SI 1 x1 [orig:103 _17 ] [103])
        (zero_extend:SI (vec_select:HI (reg/v:V4HI 40 v8 [orig:112 v64u16_0D.3917 ] [112])
                (parallel [
                        (const_int 3 [0x3])
                    ])))) 2480 {*aarch64_get_lane_zero_extendsiv4hi}
     (nil))

REE replaces insn 2 by folding insn 54 and placing it at the definition site of
insn 2, so before insn 11.

Trying to eliminate extension:
(insn 54 90 102 2 (set (reg:V4SI 33 v1 [orig:94 _5 ] [94])
        (zero_extend:V4SI (reg/v:V4HI 40 v8 [orig:112 v64u16_0D.3917 ] [112]))) "jon-inc.c":21:30 4106 {zero_extendv4hiv4si2}
     (nil))
Tentatively merged extension with definition (copy needed):
(insn 2 5 104 2 (set (reg:V4SI 33 v1)
        (zero_extend:V4SI (reg:V4HI 32 v0 [156]))) "jon-inc.c":15:1 -1
     (nil))

to produce

(insn 2 5 110 2 (set (reg:V4SI 33 v1)
        (zero_extend:V4SI (reg:V4HI 32 v0 [156]))) "jon-inc.c":15:1 4106 {zero_extendv4hiv4si2}
     (nil))
(insn 110 2 104 2 (set (reg:V4SI 40 v8)
        (reg:V4SI 33 v1)) "jon-inc.c":15:1 -1
     (nil))

The new insn 2 using v0 directly is correct, but the insn 110 it creates is
wrong: `v8` should still be V4HI. Otherwise REE also needs to eliminate the zero
extension from the extracts, so instead of

(insn 11 10 12 2 (set (reg:SI 1 x1 [orig:103 _17 ] [103])
        (zero_extend:SI (vec_select:HI (reg/v:V4HI 40 v8 [orig:112 v64u16_0D.3917 ] [112])
                (parallel [
                        (const_int 3 [0x3])
                    ])))) 2480 {*aarch64_get_lane_zero_extendsiv4hi}
     (nil))

it should be

(insn 11 10 12 2 (set (reg:SI 1 x1 [orig:103 _17 ] [103])
        (vec_select:SI (reg/v:V4SI 40 v8 [orig:112 v64u16_0D.3917 ] [112])
            (parallel [
                    (const_int 3 [0x3])
                ]))) 2480 {*aarch64_get_lane_zero_extendsiv4hi}
     (nil))

Without doing so, the indices have been remapped by the extension, and so we
extract the wrong elements.

At any other optimization level but -Os, REE seems to abort, so this doesn't
trigger:

Trying to eliminate extension:
(insn 54 90 101 2 (set (reg:V4SI 32 v0 [orig:94 _5 ] [94])
        (zero_extend:V4SI (reg/v:V4HI 40 v8 [orig:112 v64u16_0D.3917 ] [112]))) "jon-inc.c":21:30 4106 {zero_extendv4hiv4si2}
     (nil))
Elimination opportunities = 2 realized = 0

purely due to the ordering of instructions.

REE doesn't check uses of `v8` because it assumes that with a zero extended
value, you still have access to the lower bits by using the bottom part of the
register. This is true for scalar but not for vector.
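
To make the scalar/vector difference concrete, here is a small Python model of the lane layout. It is only an illustrative sketch: the function names are made up for this example, the input values are arbitrary, and it assumes a little-endian lane order in which each widened 32-bit element occupies two consecutive 16-bit slots.

```python
def zero_extend_v4hi_to_v4si(lanes16):
    """Model of zero_extend:V4SI: each 16-bit lane becomes one 32-bit lane."""
    return [x & 0xFFFF for x in lanes16]


def view_as_hi_lanes(lanes32):
    """Reinterpret 32-bit lanes as 16-bit lanes.

    Assumption for this sketch: little-endian lane order, so the low half
    of each 32-bit element comes first, followed by its high half.
    """
    out = []
    for x in lanes32:
        out.append(x & 0xFFFF)          # low 16 bits of the widened element
        out.append((x >> 16) & 0xFFFF)  # high 16 bits (zero after zero-extension)
    return out


def scalar_low_part(value16):
    """Scalar case: zero-extend to 32 bits, then read the low 16 bits back."""
    widened = value16 & 0xFFFF
    return widened & 0xFFFF


# Arbitrary example data, not taken from the bug's test case.
v4hi = [0x1111, 0x2222, 0x3333, 0x4444]
v4si = zero_extend_v4hi_to_v4si(v4hi)

# What a V4HI-style read of the low 64 bits of the widened register sees.
low_v4hi_view = view_as_hi_lanes(v4si)[:4]

print([hex(x) for x in v4hi])           # original lanes
print([hex(x) for x in low_v4hi_view])  # lanes after widening, read as V4HI
```

In this model lane 3 of the original vector is 0x4444, while lane 3 of the widened register read as 16-bit lanes is the zero upper half of element 1. The scalar helper, by contrast, always returns its input unchanged, which is the assumption REE relies on.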

This would have been fine as well if REE had eliminated the zero_extend on
insn 11 and the rest, but it doesn't do so since REE can only handle cases
where the SRC values are REG_P.

It does try to do this in add_removable_extension:

          /* For vector mode extensions, ensure that all uses of the
             XEXP (src, 0) register are in insn or debug insns, as unlike
             integral extensions lowpart subreg of the sign/zero extended
             register are not equal to the original register, so we have
             to change all uses or none and the current code isn't able
             to change them all at once in one transaction.  */

However, this code doesn't trigger for the example because REE doesn't check
the uses if the defining instruction doesn't feed into another extension,
which is bogus. For vectors it should always check usages.

r12-2288-g8695bf78dad1a42636775843ca832a2f4dba4da3 simply exposed this as it
now lowers VEC_SELECT 0 into the RTL canonical form subreg 0, which causes REE
to run more often.

Mine.
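
The rule stated in the quoted comment (all uses of the source register must be the extension itself or debug insns) can be sketched as a toy predicate. This is not GCC's code: `Use` and `vector_extension_removable` are hypothetical names invented for this illustration, and the use list below models only the two uses of `v8` shown in this comment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Use:
    """Hypothetical record of one use of a register (not a GCC type)."""
    insn: int               # uid of the insn that reads the register
    is_debug: bool = False  # True for debug insns, which may be ignored


def vector_extension_removable(ext_insn, uses_of_src):
    """Toy version of the vector-mode rule quoted above.

    The extension may only be merged into the definition if every use of
    the source register is the extension itself or a debug insn; a single
    other use keeps the narrow register alive, so nothing may be changed.
    """
    return all(u.is_debug or u.insn == ext_insn for u in uses_of_src)


# Uses of v8 as described in this comment: the extension (insn 54)
# and one of the lane extracts (insn 11).
uses_of_v8 = [Use(insn=54), Use(insn=11)]
print(vector_extension_removable(54, uses_of_v8))  # the extract blocks the merge
```

Under this model the merge is rejected as soon as insn 11 appears among the uses, regardless of whether the defining instruction feeds another extension.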

Thread overview: 9+ messages
2021-11-21 19:44 [Bug rtl-optimization/103350] New: wrong code with -Os -fno-tree-ter on aarch64-unknown-linux-gnu zsojka at seznam dot cz
2021-11-21 22:22 ` [Bug rtl-optimization/103350] " pinskia at gcc dot gnu.org
2021-11-22  9:23 ` marxin at gcc dot gnu.org
2021-11-22 10:42 ` [Bug rtl-optimization/103350] [12 Regression] wrong code with -Os -fno-tree-ter on aarch64-unknown-linux-gnu since r12-2288-g8695bf78dad1a42636775843ca832a2f4dba4da3 marxin at gcc dot gnu.org
2021-11-22 10:43 ` marxin at gcc dot gnu.org
2021-12-13  8:36 ` tnfchris at gcc dot gnu.org
2021-12-13  9:02 ` tnfchris at gcc dot gnu.org [this message]
2021-12-15 10:26 ` cvs-commit at gcc dot gnu.org
2021-12-15 10:29 ` tnfchris at gcc dot gnu.org