Message-ID: <23b0bd81-fd43-4b44-91b4-871d681c11a3@gmail.com>
Date: Fri, 19 Jan 2024 09:50:11 -0700
Subject: Re: [middle-end PATCH] Prefer PLUS over IOR in RTL expansion of multi-word shifts/rotates.
To: Georg-Johann Lay, Roger Sayle, gcc-patches@gcc.gnu.org
References: <023501da4a48$320e7540$962b5fc0$@nextmovesoftware.com> <71f8f116-e3b8-4e70-b30a-a4bc042466a2@gjlay.de>
From: Jeff Law
In-Reply-To: <71f8f116-e3b8-4e70-b30a-a4bc042466a2@gjlay.de>

On 1/19/24 09:05, Georg-Johann Lay wrote:
>
>
> On 18.01.24 at 20:54, Roger Sayle wrote:
>>
>> This patch tweaks RTL expansion of multi-word shifts and rotates to use
>> PLUS rather than IOR for disjunctive operations.  During expansion of
>> these operations, the middle-end creates RTL like (X<<C1) | (X>>C2)
>> where the constants C1 and C2 guarantee that bits don't overlap.
>> Hence the IOR can be performed by any any_or_plus operation, such as
>> IOR, XOR or PLUS; for word-size operations where carry chains aren't
>> an issue these should all be equally fast (single-cycle) instructions.
>> The benefit of this change is that targets with shift-and-add insns,
>> like x86's lea, can benefit from the LSHIFT-ADD form.
>>
>> An example of a backend that benefits is ARC, which is demonstrated
>> by these two simple functions:
>
> But there are also back-ends where this is bad.
>
> The reason is that with ORI, the back-end needs only to operate on
> those sub-words where the sub-mask is non-zero.  But for PLUS this
> is not the case because the back-end does not know that the intermediate
> carry will be zero.  Hence, with PLUS, more instructions are needed.
> An example is AVR, but maybe many more targets with multi-word operations
> are affected in a bad way.
>
> Take for example the case with 2 words and a value of 1.
>
> LO |= 1
> HI |= 0
>
> can be optimized to
>
> LO |= 1
>
> but for addition this is not the case:
>
> LO += 1
> HI +=c 0 ;; Does not know that always carry = 0.

I think it's clear that the decision is target-specific, and possibly
uarch-specific within a target.  Which means that expmed is probably the
right place and that we're going to need to look for a good way for the
target to control it.  I suspect rtx_cost isn't likely a good fit.

Jeff
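
For reference, here is a rough C sketch of the two points in the thread. It is not GCC code and not how expmed does it. The helper names (rotl_ior, rotl_xor, rotl_plus, ior_const, plus_const), the word2 type and the "ops" instruction count are all made up for illustration, and the 8-bit word size is only an assumption meant to resemble a small target such as AVR. The first part shows that IOR, XOR and PLUS agree when the two shifted halves share no bits; the second part models the LO/HI example, where the IOR can skip a zero sub-mask but the addition has to propagate a carry it cannot prove is zero.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate left by n bits (0 < n < 32), joining the halves three ways.
   The two shifted halves never share a set bit, so all three agree. */
static uint32_t rotl_ior(uint32_t x, unsigned n)
{
    assert(n > 0 && n < 32);
    return (x << n) | (x >> (32 - n));
}

static uint32_t rotl_xor(uint32_t x, unsigned n)
{
    assert(n > 0 && n < 32);
    return (x << n) ^ (x >> (32 - n));
}

static uint32_t rotl_plus(uint32_t x, unsigned n)
{
    assert(n > 0 && n < 32);
    return (x << n) + (x >> (32 - n));
}

/* Hypothetical two-word value built from 8-bit words (an assumption,
   chosen to resemble a small 8-bit target). */
typedef struct { uint8_t lo, hi; } word2;

/* OR in a two-word constant.  Words whose sub-mask is zero are skipped;
   *ops counts the word-sized operations actually performed. */
static word2 ior_const(word2 v, uint8_t c_lo, uint8_t c_hi, int *ops)
{
    *ops = 0;
    if (c_lo) { v.lo |= c_lo; ++*ops; }
    if (c_hi) { v.hi |= c_hi; ++*ops; }
    return v;
}

/* Add a two-word constant without any knowledge about the carry.
   Once the low word is touched, the high word needs an add-with-carry
   even if c_hi is zero, so two operations are counted. */
static word2 plus_const(word2 v, uint8_t c_lo, uint8_t c_hi, int *ops)
{
    unsigned sum_lo = (unsigned)v.lo + c_lo;
    unsigned carry = sum_lo >> 8;

    v.lo = (uint8_t)sum_lo;
    v.hi = (uint8_t)(v.hi + c_hi + carry);
    *ops = c_lo ? 2 : (c_hi ? 1 : 0);
    return v;
}
```

With LO = 0xFE, HI = 0x12 and the constant 1, both helpers produce the same value, but the sketch counts one operation for the IOR form and two for the PLUS form, which is the cost difference described in the quoted example.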