From: Jakub Jelinek
To: Uros Bizjak
Cc: gcc-patches@gcc.gnu.org
Date: Sat, 4 Apr 2020 00:41:32 +0200
Subject: [PATCH] i386: Simplify {,v}ph{add,sub{,s}{w,d} insn patterns [PR94460]
Message-ID: <20200403224132.GI2212@tucnak>

Hi!

As mentioned in the previous PR94460 patch, the RTL patterns look too
large/complicated; we can simplify them by just performing two 2-arg
permutations to move the arguments into the right spots and then just
doing the plus/minus (or signed saturation version thereof).

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for stage1?

2020-04-04  Jakub Jelinek

	PR target/94460
	* config/i386/sse.md (avx2_ph<plusminus_mnemonic>wv16hi3,
	ssse3_ph<plusminus_mnemonic>wv8hi3, ssse3_ph<plusminus_mnemonic>wv4hi3,
	avx2_ph<plusminus_mnemonic>dv8si3, ssse3_ph<plusminus_mnemonic>dv4si3,
	ssse3_ph<plusminus_mnemonic>dv2si3): Simplify RTL patterns.

--- gcc/config/i386/sse.md.jj	2020-04-03 10:21:51.110564277 +0200
+++ gcc/config/i386/sse.md	2020-04-03 11:55:04.455963720 +0200
@@ -16038,73 +16038,23 @@ (define_code_iterator ssse3_plusminus [p
 
 (define_insn "avx2_ph<plusminus_mnemonic>wv16hi3"
   [(set (match_operand:V16HI 0 "register_operand" "=x")
-	(vec_concat:V16HI
-	  (vec_concat:V8HI
-	    (vec_concat:V4HI
-	      (vec_concat:V2HI
-		(ssse3_plusminus:HI
-		  (vec_select:HI
-		    (match_operand:V16HI 1 "register_operand" "x")
-		    (parallel [(const_int 0)]))
-		  (vec_select:HI (match_dup 1) (parallel [(const_int 1)])))
-		(ssse3_plusminus:HI
-		  (vec_select:HI (match_dup 1) (parallel [(const_int 2)]))
-		  (vec_select:HI (match_dup 1) (parallel [(const_int 3)]))))
-	      (vec_concat:V2HI
-		(ssse3_plusminus:HI
-		  (vec_select:HI (match_dup 1) (parallel [(const_int 4)]))
-		  (vec_select:HI (match_dup 1) (parallel [(const_int 5)])))
-		(ssse3_plusminus:HI
-		  (vec_select:HI (match_dup 1) (parallel [(const_int 6)]))
-		  (vec_select:HI (match_dup 1) (parallel [(const_int 7)])))))
-	    (vec_concat:V4HI
-	      (vec_concat:V2HI
-		(ssse3_plusminus:HI
-		  (vec_select:HI
-		    (match_operand:V16HI 2 "nonimmediate_operand" "xm")
-		    (parallel [(const_int 0)]))
-		  (vec_select:HI (match_dup 2) (parallel [(const_int 1)])))
-		(ssse3_plusminus:HI
-		  (vec_select:HI (match_dup 2) (parallel [(const_int 2)]))
-		  (vec_select:HI (match_dup 2) (parallel [(const_int 3)]))))
-	      (vec_concat:V2HI
-		(ssse3_plusminus:HI
-		  (vec_select:HI (match_dup 2) (parallel [(const_int 4)]))
-		  (vec_select:HI (match_dup 2) (parallel [(const_int 5)])))
-		(ssse3_plusminus:HI
-		  (vec_select:HI (match_dup 2) (parallel [(const_int 6)]))
-		  (vec_select:HI (match_dup 2) (parallel [(const_int 7)]))))))
-	  (vec_concat:V8HI
-	    (vec_concat:V4HI
-	      (vec_concat:V2HI
-		(ssse3_plusminus:HI
-		  (vec_select:HI (match_dup 1) (parallel [(const_int 8)]))
-		  (vec_select:HI (match_dup 1) (parallel [(const_int 9)])))
-		(ssse3_plusminus:HI
-		  (vec_select:HI (match_dup 1) (parallel [(const_int 10)]))
-		  (vec_select:HI (match_dup 1) (parallel [(const_int 11)]))))
-	      (vec_concat:V2HI
-		(ssse3_plusminus:HI
-		  (vec_select:HI (match_dup 1) (parallel [(const_int 12)]))
-		  (vec_select:HI (match_dup 1) (parallel [(const_int 13)])))
-		(ssse3_plusminus:HI
-		  (vec_select:HI (match_dup 1) (parallel [(const_int 14)]))
-		  (vec_select:HI (match_dup 1) (parallel [(const_int 15)])))))
-	    (vec_concat:V4HI
-	      (vec_concat:V2HI
-		(ssse3_plusminus:HI
-		  (vec_select:HI (match_dup 2) (parallel [(const_int 8)]))
-		  (vec_select:HI (match_dup 2) (parallel [(const_int 9)])))
-		(ssse3_plusminus:HI
-		  (vec_select:HI (match_dup 2) (parallel [(const_int 10)]))
-		  (vec_select:HI (match_dup 2) (parallel [(const_int 11)]))))
-	      (vec_concat:V2HI
-		(ssse3_plusminus:HI
-		  (vec_select:HI (match_dup 2) (parallel [(const_int 12)]))
-		  (vec_select:HI (match_dup 2) (parallel [(const_int 13)])))
-		(ssse3_plusminus:HI
-		  (vec_select:HI (match_dup 2) (parallel [(const_int 14)]))
-		  (vec_select:HI (match_dup 2) (parallel [(const_int 15)]))))))))]
+	(ssse3_plusminus:V16HI
+	  (vec_select:V16HI
+	    (vec_concat:V32HI
+	      (match_operand:V16HI 1 "register_operand" "x")
+	      (match_operand:V16HI 2 "nonimmediate_operand" "xm"))
+	    (parallel
+	      [(const_int 0) (const_int 2) (const_int 4) (const_int 6)
+	       (const_int 16) (const_int 18) (const_int 20) (const_int 22)
+	       (const_int 8) (const_int 10) (const_int 12) (const_int 14)
+	       (const_int 24) (const_int 26) (const_int 28) (const_int 30)]))
+	  (vec_select:V16HI
+	    (vec_concat:V32HI (match_dup 1) (match_dup 2))
+	    (parallel
+	      [(const_int 1) (const_int 3) (const_int 5) (const_int 7)
+	       (const_int 17) (const_int 19) (const_int 21) (const_int 23)
+	       (const_int 9) (const_int 11) (const_int 13) (const_int 15)
+	       (const_int 25) (const_int 27) (const_int 29) (const_int 31)]))))]
   "TARGET_AVX2"
   "vph<plusminus_mnemonic>w\t{%2, %1, %0|%0, %1, %2}"
   [(set_attr "type" "sseiadd")
@@ -16114,41 +16064,19 @@ (define_insn "avx2_ph<plusminus_mnemonic
 
 (define_insn "ssse3_ph<plusminus_mnemonic>wv8hi3"
   [(set (match_operand:V8HI 0 "register_operand" "=x,x")
-	(vec_concat:V8HI
-	  (vec_concat:V4HI
-	    (vec_concat:V2HI
-	      (ssse3_plusminus:HI
-		(vec_select:HI
-		  (match_operand:V8HI 1 "register_operand" "0,x")
-		  (parallel [(const_int 0)]))
-		(vec_select:HI (match_dup 1) (parallel [(const_int 1)])))
-	      (ssse3_plusminus:HI
-		(vec_select:HI (match_dup 1) (parallel [(const_int 2)]))
-		(vec_select:HI (match_dup 1) (parallel [(const_int 3)]))))
-	    (vec_concat:V2HI
-	      (ssse3_plusminus:HI
-		(vec_select:HI (match_dup 1) (parallel [(const_int 4)]))
-		(vec_select:HI (match_dup 1) (parallel [(const_int 5)])))
-	      (ssse3_plusminus:HI
-		(vec_select:HI (match_dup 1) (parallel [(const_int 6)]))
-		(vec_select:HI (match_dup 1) (parallel [(const_int 7)])))))
-	  (vec_concat:V4HI
-	    (vec_concat:V2HI
-	      (ssse3_plusminus:HI
-		(vec_select:HI
-		  (match_operand:V8HI 2 "vector_operand" "xBm,xm")
-		  (parallel [(const_int 0)]))
-		(vec_select:HI (match_dup 2) (parallel [(const_int 1)])))
-	      (ssse3_plusminus:HI
-		(vec_select:HI (match_dup 2) (parallel [(const_int 2)]))
-		(vec_select:HI (match_dup 2) (parallel [(const_int 3)]))))
-	    (vec_concat:V2HI
-	      (ssse3_plusminus:HI
-		(vec_select:HI (match_dup 2) (parallel [(const_int 4)]))
-		(vec_select:HI (match_dup 2) (parallel [(const_int 5)])))
-	      (ssse3_plusminus:HI
-		(vec_select:HI (match_dup 2) (parallel [(const_int 6)]))
-		(vec_select:HI (match_dup 2) (parallel [(const_int 7)])))))))]
+	(ssse3_plusminus:V8HI
+	  (vec_select:V8HI
+	    (vec_concat:V16HI
+	      (match_operand:V8HI 1 "register_operand" "0,x")
+	      (match_operand:V8HI 2 "vector_operand" "xBm,xm"))
+	    (parallel
+	      [(const_int 0) (const_int 2) (const_int 4) (const_int 6)
+	       (const_int 8) (const_int 10) (const_int 12) (const_int 14)]))
+	  (vec_select:V8HI
+	    (vec_concat:V16HI (match_dup 1) (match_dup 2))
+	    (parallel
+	      [(const_int 1) (const_int 3) (const_int 5) (const_int 7)
+	       (const_int 9) (const_int 11) (const_int 13) (const_int 15)]))))]
   "TARGET_SSSE3"
   "@
    ph<plusminus_mnemonic>w\t{%2, %0|%0, %2}
@@ -16163,25 +16091,17 @@ (define_insn "ssse3_ph<plusminus_mnemoni
 
 (define_insn_and_split "ssse3_ph<plusminus_mnemonic>wv4hi3"
   [(set (match_operand:V4HI 0 "register_operand" "=y,x,Yv")
-	(vec_concat:V4HI
-	  (vec_concat:V2HI
-	    (ssse3_plusminus:HI
-	      (vec_select:HI
-		(match_operand:V4HI 1 "register_operand" "0,0,Yv")
-		(parallel [(const_int 0)]))
-	      (vec_select:HI (match_dup 1) (parallel [(const_int 1)])))
-	    (ssse3_plusminus:HI
-	      (vec_select:HI (match_dup 1) (parallel [(const_int 2)]))
-	      (vec_select:HI (match_dup 1) (parallel [(const_int 3)]))))
-	  (vec_concat:V2HI
-	    (ssse3_plusminus:HI
-	      (vec_select:HI
-		(match_operand:V4HI 2 "register_mmxmem_operand" "ym,x,Yv")
-		(parallel [(const_int 0)]))
-	      (vec_select:HI (match_dup 2) (parallel [(const_int 1)])))
-	    (ssse3_plusminus:HI
-	      (vec_select:HI (match_dup 2) (parallel [(const_int 2)]))
-	      (vec_select:HI (match_dup 2) (parallel [(const_int 3)]))))))]
+	(ssse3_plusminus:V4HI
+	  (vec_select:V4HI
+	    (vec_concat:V8HI
+	      (match_operand:V4HI 1 "register_operand" "0,0,Yv")
+	      (match_operand:V4HI 2 "register_mmxmem_operand" "ym,x,Yv"))
+	    (parallel
+	      [(const_int 0) (const_int 2) (const_int 4) (const_int 6)]))
+	  (vec_select:V4HI
+	    (vec_concat:V8HI (match_dup 1) (match_dup 2))
+	    (parallel
+	      [(const_int 1) (const_int 3) (const_int 5) (const_int 7)]))))]
   "(TARGET_MMX || TARGET_MMX_WITH_SSE) && TARGET_SSSE3"
   "@
    ph<plusminus_mnemonic>w\t{%2, %0|%0, %2}
@@ -16211,41 +16131,19 @@ (define_insn_and_split "ssse3_ph<plusmin
 
 (define_insn "avx2_ph<plusminus_mnemonic>dv8si3"
   [(set (match_operand:V8SI 0 "register_operand" "=x")
-	(vec_concat:V8SI
-	  (vec_concat:V4SI
-	    (vec_concat:V2SI
-	      (plusminus:SI
-		(vec_select:SI
-		  (match_operand:V8SI 1 "register_operand" "x")
-		  (parallel [(const_int 0)]))
-		(vec_select:SI (match_dup 1) (parallel [(const_int 1)])))
-	      (plusminus:SI
-		(vec_select:SI (match_dup 1) (parallel [(const_int 2)]))
-		(vec_select:SI (match_dup 1) (parallel [(const_int 3)]))))
-	    (vec_concat:V2SI
-	      (plusminus:SI
-		(vec_select:SI
-		  (match_operand:V8SI 2 "nonimmediate_operand" "xm")
-		  (parallel [(const_int 0)]))
-		(vec_select:SI (match_dup 2) (parallel [(const_int 1)])))
-	      (plusminus:SI
-		(vec_select:SI (match_dup 2) (parallel [(const_int 2)]))
-		(vec_select:SI (match_dup 2) (parallel [(const_int 3)])))))
-	  (vec_concat:V4SI
-	    (vec_concat:V2SI
-	      (plusminus:SI
-		(vec_select:SI (match_dup 1) (parallel [(const_int 4)]))
-		(vec_select:SI (match_dup 1) (parallel [(const_int 5)])))
-	      (plusminus:SI
-		(vec_select:SI (match_dup 1) (parallel [(const_int 6)]))
-		(vec_select:SI (match_dup 1) (parallel [(const_int 7)]))))
-	    (vec_concat:V2SI
-	      (plusminus:SI
-		(vec_select:SI (match_dup 2) (parallel [(const_int 4)]))
-		(vec_select:SI (match_dup 2) (parallel [(const_int 5)])))
-	      (plusminus:SI
-		(vec_select:SI (match_dup 2) (parallel [(const_int 6)]))
-		(vec_select:SI (match_dup 2) (parallel [(const_int 7)])))))))]
+	(plusminus:V8SI
+	  (vec_select:V8SI
+	    (vec_concat:V16SI
+	      (match_operand:V8SI 1 "register_operand" "x")
+	      (match_operand:V8SI 2 "nonimmediate_operand" "xm"))
+	    (parallel
+	      [(const_int 0) (const_int 2) (const_int 8) (const_int 10)
+	       (const_int 4) (const_int 6) (const_int 12) (const_int 14)]))
+	  (vec_select:V8SI
+	    (vec_concat:V16SI (match_dup 1) (match_dup 2))
+	    (parallel
+	      [(const_int 1) (const_int 3) (const_int 9) (const_int 11)
+	       (const_int 5) (const_int 7) (const_int 13) (const_int 15)]))))]
   "TARGET_AVX2"
   "vph<plusminus_mnemonic>d\t{%2, %1, %0|%0, %1, %2}"
   [(set_attr "type" "sseiadd")
@@ -16255,25 +16153,17 @@ (define_insn "avx2_ph<plusminus_mnemonic
 
 (define_insn "ssse3_ph<plusminus_mnemonic>dv4si3"
   [(set (match_operand:V4SI 0 "register_operand" "=x,x")
-	(vec_concat:V4SI
-	  (vec_concat:V2SI
-	    (plusminus:SI
-	      (vec_select:SI
-		(match_operand:V4SI 1 "register_operand" "0,x")
-		(parallel [(const_int 0)]))
-	      (vec_select:SI (match_dup 1) (parallel [(const_int 1)])))
-	    (plusminus:SI
-	      (vec_select:SI (match_dup 1) (parallel [(const_int 2)]))
-	      (vec_select:SI (match_dup 1) (parallel [(const_int 3)]))))
-	  (vec_concat:V2SI
-	    (plusminus:SI
-	      (vec_select:SI
-		(match_operand:V4SI 2 "vector_operand" "xBm,xm")
-		(parallel [(const_int 0)]))
-	      (vec_select:SI (match_dup 2) (parallel [(const_int 1)])))
-	    (plusminus:SI
-	      (vec_select:SI (match_dup 2) (parallel [(const_int 2)]))
-	      (vec_select:SI (match_dup 2) (parallel [(const_int 3)]))))))]
+	(plusminus:V4SI
+	  (vec_select:V4SI
+	    (vec_concat:V8SI
+	      (match_operand:V4SI 1 "register_operand" "0,x")
+	      (match_operand:V4SI 2 "vector_operand" "xBm,xm"))
+	    (parallel
+	      [(const_int 0) (const_int 2) (const_int 4) (const_int 6)]))
+	  (vec_select:V4SI
+	    (vec_concat:V8SI (match_dup 1) (match_dup 2))
+	    (parallel
+	      [(const_int 1) (const_int 3) (const_int 5) (const_int 7)]))))]
   "TARGET_SSSE3"
   "@
    ph<plusminus_mnemonic>d\t{%2, %0|%0, %2}
@@ -16288,17 +16178,15 @@ (define_insn "ssse3_ph<plusminus_mnemoni
 
 (define_insn_and_split "ssse3_ph<plusminus_mnemonic>dv2si3"
   [(set (match_operand:V2SI 0 "register_operand" "=y,x,Yv")
-	(vec_concat:V2SI
-	  (plusminus:SI
-	    (vec_select:SI
+	(plusminus:V2SI
+	  (vec_select:V2SI
+	    (vec_concat:V4SI
 	      (match_operand:V2SI 1 "register_operand" "0,0,Yv")
-	      (parallel [(const_int 0)]))
-	    (vec_select:SI (match_dup 1) (parallel [(const_int 1)])))
-	  (plusminus:SI
-	    (vec_select:SI
-	      (match_operand:V2SI 2 "register_mmxmem_operand" "ym,x,Yv")
-	      (parallel [(const_int 0)]))
-	    (vec_select:SI (match_dup 2) (parallel [(const_int 1)])))))]
+	      (match_operand:V2SI 2 "register_mmxmem_operand" "ym,x,Yv"))
+	    (parallel [(const_int 0) (const_int 2)]))
+	  (vec_select:V2SI
+	    (vec_concat:V4SI (match_dup 1) (match_dup 2))
+	    (parallel [(const_int 1) (const_int 3)]))))]
   "(TARGET_MMX || TARGET_MMX_WITH_SSE) && TARGET_SSSE3"
   "@
    ph<plusminus_mnemonic>d\t{%2, %0|%0, %2}

	Jakub
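
Note for readers checking the new selectors: the rewrite relies on the
horizontal op being equivalent to two permutations of the concatenated
operands followed by one element-wise plus/minus. The following is a minimal
Python sketch of that equivalence, not part of the patch. The function and
variable names (hop_lanewise, hop_permuted, SEL_*) are made up for
illustration, vectors are modelled as plain lists, and ordinary integer
add/sub stands in for the RTL plus/minus codes (saturation is not modelled).
The selector lists are copied from the patch; the 256-bit forms pair
elements within each 128-bit lane, which is why their selectors interleave
the two operands.

```python
import operator


def hop_lanewise(a, b, lane, op=operator.add):
    """Reference model: pairwise op within each `lane`-element chunk,
    taking a's chunk first and then b's, as the old nested patterns did."""
    out = []
    for start in range(0, len(a), lane):
        for vec in (a, b):
            chunk = vec[start:start + lane]
            out += [op(chunk[i], chunk[i + 1]) for i in range(0, lane, 2)]
    return out


def hop_permuted(a, b, even_sel, op=operator.add):
    """Shape of the simplified patterns: two selects from concat(a, b),
    then a single element-wise op."""
    cat = a + b                            # models vec_concat
    lhs = [cat[i] for i in even_sel]       # models the first vec_select
    rhs = [cat[i + 1] for i in even_sel]   # second selector is each index + 1
    return [op(x, y) for x, y in zip(lhs, rhs)]


# First-operand selectors copied from the patch.
SEL_V16HI = [0, 2, 4, 6, 16, 18, 20, 22, 8, 10, 12, 14, 24, 26, 28, 30]
SEL_V8SI = [0, 2, 8, 10, 4, 6, 12, 14]
SEL_V8HI = [0, 2, 4, 6, 8, 10, 12, 14]

if __name__ == "__main__":
    a16 = [i * i for i in range(16)]
    b16 = [1000 + 3 * i * i for i in range(16)]
    for op in (operator.add, operator.sub):
        # 256-bit HI form: 8 elements per 128-bit lane.
        assert hop_permuted(a16, b16, SEL_V16HI, op) == hop_lanewise(a16, b16, 8, op)
        # 256-bit SI form: 4 elements per 128-bit lane.
        assert hop_permuted(a16[:8], b16[:8], SEL_V8SI, op) == hop_lanewise(a16[:8], b16[:8], 4, op)
        # 128-bit HI form: a single lane.
        assert hop_permuted(a16[:8], b16[:8], SEL_V8HI, op) == hop_lanewise(a16[:8], b16[:8], 8, op)
    print("selectors agree with the pairwise reference")
```

The checks only cover the index bookkeeping; operand constraints and the
saturating variants are outside what this sketch models.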