From mboxrd@z Thu Jan  1 00:00:00 1970
Date: Tue, 27 Jul 2021 12:39:19 +0200
From: Jakub Jelinek
To: Hongtao Liu
Cc: Uros Bizjak, gcc-patches@gcc.gnu.org
Subject: Re: [PATCH] i386: Improve AVX2 expansion of vector >> vector DImode arithm. shifts [PR101611]
Message-ID: <20210727103919.GH2380545@tucnak>
Reply-To: Jakub Jelinek
References: <20210727081138.GG2380545@tucnak>
List-Id: Gcc-patches mailing list

On Tue, Jul 27, 2021 at 06:33:24PM +0800, Hongtao Liu wrote:
> > AVX2 introduced vector >> vector shifts, but unfortunately for V{2,4}DImode
> > it only supports logical and not arithmetic shifts, only AVX512F for
> > V8DImode or AVX512VL for V{2,4}DImode fixed that omission.
> > Earlier in GCC12 cycle I've committed vector >> scalar arithmetic shift
> > emulation using various sequences, this patch handles the vector >> vector
> > case.  No need to adjust costs, the previous cost adjustment actually
> > covers even the vector by vector shifts.
> > The patch emits the right arithmetic V{2,4}DImode shifts using 2 logical right
> > V{2,4}DImode shifts (once of the original operands, once of sign mask
> > constant by the vector shift count), xor and subtraction, on each element
> > (long long) x >> y is done as
> > (((unsigned long long) x >> y) ^ (0x8000000000000000ULL >> y))
> > - (0x8000000000000000ULL >> y)
> I'm wondering when y > 64, would the transformation still be proper.
> Guess since it's UD, compiler can do anything.

The patch is changing optabs, not something from target builtins where the
intrinsics might make it well defined.
In the optabs, out-of-bounds shifts (including y == 64) are UB - i386.h
doesn't define SHIFT_COUNTS_TRUNCATED.

	Jakub
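
For anyone who wants to convince themselves of the per-element identity
quoted above, here is a minimal scalar sketch in C.  It is not the code from
the patch, which emits the equivalent vector instructions; the names
ashr_emul and check_all are made up for illustration.  It assumes the host
compiler implements >> on negative signed values as an arithmetic shift and
wraps on the unsigned to signed conversion, as GCC does, and it only uses
shift counts 0 through 63, since anything larger is out of bounds as
discussed above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: emulate (long long) x >> y with two logical
   right shifts, an xor and a subtraction.  Only valid for 0 <= y <= 63.  */
static int64_t
ashr_emul (int64_t x, unsigned y)
{
  /* Sign mask constant shifted by the same count as the operand.  */
  uint64_t sign = UINT64_C (0x8000000000000000) >> y;
  /* Xor followed by subtraction sign-extends from the shifted sign bit.
     The conversion back to int64_t is assumed to wrap.  */
  return (int64_t) ((((uint64_t) x >> y) ^ sign) - sign);
}

/* Hypothetical helper: compare against the native shift for a few
   interesting values and every in-range shift count.  */
static void
check_all (void)
{
  static const int64_t vals[] = { 0, 1, -1, 42, -42, INT64_MAX, INT64_MIN };
  for (unsigned i = 0; i < sizeof vals / sizeof vals[0]; i++)
    for (unsigned y = 0; y < 64; y++)
      assert (ashr_emul (vals[i], y) == (vals[i] >> y));
}
```

Calling check_all () from a test program should pass silently under the
assumptions above; it deliberately stops at y == 63.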