Date: Wed, 25 Mar 2020 09:04:53 +0100
From: Jakub Jelinek
To: Uros Bizjak
Cc: gcc-patches@gcc.gnu.org
Subject: [PATCH] i386: Fix ix86_add_reg_usage_to_vzeroupper [PR94308]
Message-ID: <20200325080453.GZ2156@tucnak>

Hi!

The following testcase ICEs due to my recent change r10-6451-gb7b3378f91c.
Since that patch, for explicit vzeroupper in the sources (when an intrinsic
is used), we start with the *avx_vzeroupper_1 pattern, which contains just
the UNSPECV_VZEROUPPER and no sets/clobbers.  The vzeroupper pass then adds
some sets to those, but doesn't add clobbers, and finally there is an
&& epilogue_completed splitter that splits this into the *avx_vzeroupper
pattern, which has the right number of sets/clobbers (16 on 64-bit, 8 on
32-bit) plus the UNSPECV_VZEROUPPER first.

The problem with this testcase on !TARGET_64BIT is that the vzeroupper pass
adds 8 sets to the pattern, i.e. the maximum number, but INSN_CODE remains
that of the *avx_vzeroupper_1 pattern.  The splitter doesn't do anything
here, because it sees that the number of rtxes in the PARALLEL is already
the right count, but during final we see that the *avx_vzeroupper_1 pattern
has a "#" output template and ICE because we forgot to split it.

The following patch fixes it by forcing re-recognition of the insn after we
make the changes to it in ix86_add_reg_usage_to_vzeroupper.  Anything that
calls recog_memoized later on will recog it and find out that in this case
it is already *avx_vzeroupper rather than *avx_vzeroupper_1.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2020-03-25  Jakub Jelinek

	PR target/94308
	* config/i386/i386-features.c (ix86_add_reg_usage_to_vzeroupper): Set
	INSN_CODE (insn) to -1 when changing the pattern.

	* gcc.target/i386/pr94308.c: New test.

--- gcc/config/i386/i386-features.c.jj	2020-03-17 13:50:52.955933209 +0100
+++ gcc/config/i386/i386-features.c	2020-03-24 19:19:17.801609289 +0100
@@ -1792,6 +1792,7 @@ ix86_add_reg_usage_to_vzeroupper (rtx_in
 	  RTVEC_ELT (vec, j) = gen_rtx_SET (reg, reg);
 	}
   XVEC (pattern, 0) = vec;
+  INSN_CODE (insn) = -1;
   df_insn_rescan (insn);
 }
 
--- gcc/testsuite/gcc.target/i386/pr94308.c.jj	2020-03-24 19:32:51.964436310 +0100
+++ gcc/testsuite/gcc.target/i386/pr94308.c	2020-03-24 19:32:39.848617482 +0100
@@ -0,0 +1,31 @@
+/* PR target/94308 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -mfpmath=sse -mavx2 -mfma" } */
+
+#include <x86intrin.h>
+
+void
+foo (float *x, const float *y, const float *z, unsigned int w)
+{
+  unsigned int a;
+  const unsigned int b = w / 8;
+  const float *c = y;
+  const float *d = z;
+  __m256 e = _mm256_setzero_ps ();
+  __m256 f, g;
+  for (a = 0; a < b; a++)
+    {
+      f = _mm256_loadu_ps (c);
+      g = _mm256_loadu_ps (d);
+      c += 8;
+      d += 8;
+      e = _mm256_fmadd_ps (f, g, e);
+    }
+  __attribute__ ((aligned (32))) float h[8];
+  _mm256_storeu_ps (h, e);
+  _mm256_zeroupper ();
+  float i = h[0] + h[1] + h[2] + h[3] + h[4] + h[5] + h[6] + h[7];
+  for (a = b * 8; a < w; a++)
+    i += (*c++) * (*d++);
+  *x = i;
+}

	Jakub
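
For readers not familiar with how the cached insn code goes stale, here is a
toy model of the problem in plain C.  It is only a sketch under simplifying
assumptions, not GCC code: toy_insn, toy_recog, toy_recog_memoized and
toy_add_reg_usage are hypothetical names, and the toy recognizer tells the two
patterns apart purely by the element count of the PARALLEL (1 UNSPECV plus
8 sets on a 32-bit target).

```c
#include <assert.h>

/* Hypothetical stand-ins for insn codes; these are not GCC identifiers.  */
enum toy_code
{
  TOY_UNRECOGNIZED = -1,	/* Plays the role of INSN_CODE == -1.  */
  TOY_VZEROUPPER_1,		/* Pattern that still has to be split.  */
  TOY_VZEROUPPER		/* Final pattern with all the sets.  */
};

/* Element count of the full pattern: the UNSPECV plus 8 sets (32-bit).  */
#define TOY_FULL_NELTS 9

struct toy_insn
{
  int nelts;	/* Number of elements in the toy PARALLEL.  */
  int code;	/* Cached recognition result, or TOY_UNRECOGNIZED.  */
};

/* Recognize from scratch, looking only at the current pattern.  */
static int
toy_recog (const struct toy_insn *insn)
{
  return insn->nelts == TOY_FULL_NELTS ? TOY_VZEROUPPER : TOY_VZEROUPPER_1;
}

/* Return the cached code, recognizing only if nothing is cached yet.  */
static int
toy_recog_memoized (struct toy_insn *insn)
{
  if (insn->code < 0)
    insn->code = toy_recog (insn);
  return insn->code;
}

/* Replace the pattern with one that has NSETS sets.  If RESET_CODE is
   nonzero, also drop the cached code, which is what the patch does.  */
static void
toy_add_reg_usage (struct toy_insn *insn, int nsets, int reset_code)
{
  assert (nsets >= 0 && nsets <= TOY_FULL_NELTS - 1);
  insn->nelts = 1 + nsets;
  if (reset_code)
    insn->code = TOY_UNRECOGNIZED;
}
```

In this model, an insn first recognized as TOY_VZEROUPPER_1 and then extended
to 8 sets without resetting the code keeps reporting TOY_VZEROUPPER_1 from
toy_recog_memoized, while the same sequence with the reset reports
TOY_VZEROUPPER.  That mirrors the stale INSN_CODE described above.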