Date: Tue, 17 May 2022 09:00:04 +0200
From: Jakub Jelinek
To: Uros Bizjak
Cc: gcc-patches@gcc.gnu.org
Subject: [PATCH] i386: Fix up V2DI and V1TI inequality comparisons [PR105613]

Hi!

The recent r13-458 change to introduce vec_cmpeqv1tiv1ti and add TARGET_SSE2
support to vec_cmpeqv2div2di works nicely for equality comparisons, but as
the testcase shows, it doesn't work for inequality comparisons.
For EQ, if we perform the comparison with twice as many half-sized elements,
the result should be ~0 only when both halves are ~0 (both halves need to be
equal for the whole to be equal), otherwise 0, so AND is the correct
operation for it.  But for NE, the result should be ~0 when either of the
halves is ~0 (if either half is not equal, the whole is not equal), and so
the right operation for NE is IOR, not AND.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2022-05-17  Jakub Jelinek

	PR target/105613
	* config/i386/sse.md (vec_cmpeqv2div2di, vec_cmpeqv1tiv1ti): Use
	andv4si3 only for EQ, for NE use iorv4si3 instead.

	* gcc.c-torture/execute/pr105613.c: New test.
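To illustrate the AND vs. IOR reasoning above, here is a minimal scalar
sketch in C.  It models a single 64-bit lane as two 32-bit halves and is not
the expander code itself; all function names in it (half_mask, widen,
eq64_from_halves, ne64_from_halves, ne64_with_and, self_check) are
hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Per-half comparison result: all ones if COND holds, else 0.  */
static uint32_t half_mask (int cond) { return cond ? UINT32_MAX : 0; }

/* Broadcast a combined half mask to the whole 64-bit lane.  */
static uint64_t widen (uint32_t m) { return m ? UINT64_MAX : 0; }

/* EQ: the whole is equal only if both halves are equal, so AND.  */
uint64_t
eq64_from_halves (uint64_t a, uint64_t b)
{
  uint32_t lo = half_mask ((uint32_t) a == (uint32_t) b);
  uint32_t hi = half_mask ((uint32_t) (a >> 32) == (uint32_t) (b >> 32));
  return widen (lo & hi);
}

/* NE: the whole differs if either half differs, so IOR.  */
uint64_t
ne64_from_halves (uint64_t a, uint64_t b)
{
  uint32_t lo = half_mask ((uint32_t) a != (uint32_t) b);
  uint32_t hi = half_mask ((uint32_t) (a >> 32) != (uint32_t) (b >> 32));
  return widen (lo | hi);
}

/* Model of the combination described as wrong above: NE halves
   combined with AND.  Shown only for contrast.  */
uint64_t
ne64_with_and (uint64_t a, uint64_t b)
{
  uint32_t lo = half_mask ((uint32_t) a != (uint32_t) b);
  uint32_t hi = half_mask ((uint32_t) (a >> 32) != (uint32_t) (b >> 32));
  return widen (lo & hi);
}

/* Checks using the same input values as the testcase.  */
void
self_check (void)
{
  assert (eq64_from_halves (5, 5) == UINT64_MAX);
  assert (eq64_from_halves (0x500000005ULL, 5) == 0);
  assert (ne64_from_halves (5, 0) == UINT64_MAX);
  assert (ne64_from_halves (0x500000005ULL, 0) == UINT64_MAX);
  assert (ne64_from_halves (0, 0) == 0);
  /* With AND, 5 != 0 wrongly yields 0: only the low half differs.  */
  assert (ne64_with_and (5, 0) == 0);
}
```

In this model, ne64_with_and gives the right answer only when both halves
differ (e.g. 0x500000005 vs. 0), which is why a value like 5 is enough to
expose the problem.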
--- gcc/config/i386/sse.md.jj 2022-05-16 09:46:01.962065216 +0200
+++ gcc/config/i386/sse.md 2022-05-16 10:48:45.698038881 +0200
@@ -4407,7 +4407,10 @@ (define_expand "vec_cmpeqv2div2di"
       emit_insn (gen_sse2_pshufd (tmp1, ops[0], GEN_INT (0xb1)));
 
       rtx tmp2 = gen_reg_rtx (V4SImode);
-      emit_insn (gen_andv4si3 (tmp2, tmp1, ops[0]));
+      if (GET_CODE (operands[1]) == EQ)
+        emit_insn (gen_andv4si3 (tmp2, tmp1, ops[0]));
+      else
+        emit_insn (gen_iorv4si3 (tmp2, tmp1, ops[0]));
 
       emit_move_insn (operands[0], gen_lowpart (V2DImode, tmp2));
     }
@@ -4435,7 +4438,10 @@ (define_expand "vec_cmpeqv1tiv1ti"
   emit_insn (gen_sse2_pshufd (tmp1, tmp2, GEN_INT (0x4e)));
 
   rtx tmp3 = gen_reg_rtx (V4SImode);
-  emit_insn (gen_andv4si3 (tmp3, tmp2, tmp1));
+  if (GET_CODE (operands[1]) == EQ)
+    emit_insn (gen_andv4si3 (tmp3, tmp2, tmp1));
+  else
+    emit_insn (gen_iorv4si3 (tmp3, tmp2, tmp1));
 
   emit_move_insn (operands[0], gen_lowpart (V1TImode, tmp3));
   DONE;
--- gcc/testsuite/gcc.c-torture/execute/pr105613.c.jj 2022-05-16 10:42:34.286151601 +0200
+++ gcc/testsuite/gcc.c-torture/execute/pr105613.c 2022-05-16 10:48:07.687562119 +0200
@@ -0,0 +1,26 @@
+/* PR target/105613 */
+/* { dg-do run { target int128 } } */
+
+typedef unsigned __int128 __attribute__((__vector_size__ (16))) V;
+
+void
+foo (V v, V *r)
+{
+  *r = v != 0;
+}
+
+int
+main ()
+{
+  V r;
+  foo ((V) {5}, &r);
+  if (r[0] != ~(unsigned __int128) 0)
+    __builtin_abort ();
+  foo ((V) {0x500000005ULL}, &r);
+  if (r[0] != ~(unsigned __int128) 0)
+    __builtin_abort ();
+  foo ((V) {0}, &r);
+  if (r[0] != 0)
+    __builtin_abort ();
+  return 0;
+}

	Jakub