From: Uros Bizjak
Date: Thu, 5 Oct 2023 13:35:07 +0200
Subject: Re: [X86 PATCH] Split lea into shorter left shift by 2 or 3 bits with -Oz.
To: Roger Sayle
Cc: gcc-patches@gcc.gnu.org
In-Reply-To: <00cd01d9f76b$3db62990$b9227cb0$@nextmovesoftware.com>

On Thu, Oct 5, 2023 at 11:06 AM Roger Sayle wrote:
>
>
> This patch avoids long lea instructions for performing x<<2 and x<<3
> by splitting them into shorter sal and move (or xchg) instructions.
> Because this increases the number of instructions, but reduces the
> total size, it's suitable for -Oz (but not -Os).
>
> The impact can be seen in the new test case:
>
> int foo(int x) { return x<<2; }
> int bar(int x) { return x<<3; }
> long long fool(long long x) { return x<<2; }
> long long barl(long long x) { return x<<3; }
>
> where with -O2 we generate:
>
> foo:    lea    0x0(,%rdi,4),%eax    // 7 bytes
>         retq
> bar:    lea    0x0(,%rdi,8),%eax    // 7 bytes
>         retq
> fool:   lea    0x0(,%rdi,4),%rax    // 8 bytes
>         retq
> barl:   lea    0x0(,%rdi,8),%rax    // 8 bytes
>         retq
>
> and with -Oz we now generate:
>
> foo:    xchg   %eax,%edi            // 1 byte
>         shl    $0x2,%eax            // 3 bytes
>         retq
> bar:    xchg   %eax,%edi            // 1 byte
>         shl    $0x3,%eax            // 3 bytes
>         retq
> fool:   xchg   %rax,%rdi            // 2 bytes
>         shl    $0x2,%rax            // 4 bytes
>         retq
> barl:   xchg   %rax,%rdi            // 2 bytes
>         shl    $0x3,%rax            // 4 bytes
>         retq
>
> Over the entirety of the CSiBE code size benchmark this saves 1347
> bytes (0.037%) for x86_64, and 1312 bytes (0.036%) with -m32.
> Conveniently, there's already a backend function in i386.cc for
> deciding whether to split an lea into its component instructions,
> ix86_avoid_lea_for_addr; all that's required is an additional clause
> checking for -Oz (i.e. optimize_size > 1).
>
> This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> and make -k check, both with and without --target_board='unix{-m32}'
> with no new failures. Additional testing was performed by repeating
> these steps after removing the "optimize_size > 1" condition, so that
> suitable lea instructions were always split [-Oz is not heavily
> tested, so this invoked the new code during the bootstrap and
> regression testing], again with no regressions. Ok for mainline?
>
>
> 2023-10-05 Roger Sayle
>
> gcc/ChangeLog
> * config/i386/i386.cc (ix86_avoid_lea_for_addr): Split LEAs used
> to perform left shifts into shorter instructions with -Oz.
>
> gcc/testsuite/ChangeLog
> * gcc.target/i386/lea-2.c: New test case.
>

OK, but ...

@@ -0,0 +1,7 @@
+/* { dg-do compile { target { ! ia32 } } } */

Is there a reason to avoid 32-bit targets?
I'd expect that the optimization also triggers on x86_32 for 32-bit
integers.

+/* { dg-options "-Oz" } */
+int foo(int x) { return x<<2; }
+int bar(int x) { return x<<3; }
+long long fool(long long x) { return x<<2; }
+long long barl(long long x) { return x<<3; }
+/* { dg-final { scan-assembler-not "lea\[lq\]" } } */

Uros.
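
To make the question about 32-bit targets concrete, here is a minimal sketch of a variant of the test without the ia32 restriction. It assumes the 32-bit shifts are moved into a file of their own; that split and the comment wording are hypothetical and not part of the submitted patch. Only directives already used in the posted test appear here.

```c
/* Hypothetical 32-bit-only variant of lea-2.c (an assumption, not the
   submitted test): it keeps just the int shifts, so the
   { target { ! ia32 } } selector on dg-do is dropped.  */
/* { dg-do compile } */
/* { dg-options "-Oz" } */

/* Left shift by 2, i.e. multiply by 4.  */
int foo(int x) { return x<<2; }

/* Left shift by 3, i.e. multiply by 8.  */
int bar(int x) { return x<<3; }

/* Same scan as in the posted test: no leal/leaq should be emitted.  */
/* { dg-final { scan-assembler-not "lea\[lq\]" } } */
```

Whether the scan is actually meaningful on ia32 depends on how the argument reaches a register there, so this only illustrates the shape such a test could take.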